
Conversation

@ArrogantGao

Description

This PR adds support for multiplying a BlockSparseArray matrix by a dense vector, producing a dense vector result.

Previously, attempting BlockSparseMatrix * Vector fell back to an inefficient element-wise double loop. This change implements matrix-vector multiplication by iterating over the stored blocks and accumulating each block's contribution to the result vector.

Implementation details:

  • Added mul!_blocksparse method for matrix-vector multiplication in linearalgebra.jl
  • Added muladd!_blocksparse overload for the matrix-vector case in arraylayouts.jl
  • Added ArrayLayouts.materialize! for MatMulVecAdd with BlockLayout{SparseLayout} matrix and DenseColumnMajor vector

Note: GPU arrays (JLArrays) require @allowscalar because block iteration involves scalar indexing.
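A rough sketch of the accumulation strategy (simplified, assumed code: stored_block_indices is a placeholder for the package's internal iterator over nonzero blocks; the real entry points are the mul!_blocksparse/muladd!_blocksparse overloads listed above):

using BlockArrays: Block
using LinearAlgebra: mul!

# Accumulate y[rows] += A[Block(i, j)] * x[cols], visiting stored blocks only.
function blocksparse_matvec!(y, A, x, stored_block_indices)
    fill!(y, zero(eltype(y)))
    for (i, j) in stored_block_indices  # placeholder iterator, see lead-in
        rows = axes(A, 1)[Block(i)]  # row range covered by block (i, j)
        cols = axes(A, 2)[Block(j)]  # column range covered by block (i, j)
        # 5-arg mul!: view(y, rows) .+= A[Block(i, j)] * view(x, cols)
        mul!(view(y, rows), A[Block(i, j)], view(x, cols), true, true)
    end
    return y
end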

Minimal demonstration of previous behavior

using BlockArrays: Block
using BlockSparseArrays

cnt = fill(16, 128);
bsa = BlockSparseArray{Float64}(undef, cnt, cnt);
o = ones(2048);

@time bsa * o;
  1.917505 seconds (32.90 M allocations: 17.235 GiB, 29.10% gc time, 99.34% compilation time)

Minimal demonstration of new behavior

using BlockArrays: Block
using BlockSparseArrays

cnt = fill(16, 128);
bsa = BlockSparseArray{Float64}(undef, cnt, cnt);
o = ones(2048);

@time bsa * o;
  0.000019 seconds (183 allocations: 25.672 KiB, 0.44% compilation time: 100% of which was recompilation)

@time bsa' * o;
  0.000021 seconds (71 allocations: 20.969 KiB, 0.19% compilation time: 100% of which was recompilation)

@time transpose(bsa) * o;
  0.000023 seconds (69 allocations: 20.875 KiB, 0.19% compilation time: 100% of which was recompilation)

How Has This Been Tested?

Added a new test section "Matrix-vector multiplication" in test/test_basics.jl that runs across the supported element types (Float32, Float64, ComplexF32) and array types (Array, JLArray); a condensed sketch follows the list below.

  • Test basic BlockSparseMatrix * dense Vector multiplication against dense reference
  • Test result element type and dimensions
  • Test adjoint(BlockSparseMatrix) * dense Vector
  • Test transpose(BlockSparseMatrix) * dense Vector
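A condensed sketch of the kind of checks added (illustrative block sizes, not the exact test code):

using Test
using BlockArrays: Block
using BlockSparseArrays

a = BlockSparseArray{Float64}(undef, [2, 3], [2, 3])
a[Block(1, 1)] = randn(2, 2)  # store only the diagonal blocks
a[Block(2, 2)] = randn(3, 3)
x = randn(5)
y = a * x
@test y isa Vector{Float64} && length(y) == 5
@test y ≈ Array(a) * x                            # dense reference
@test a' * x ≈ Array(a)' * x                      # adjoint
@test transpose(a) * x ≈ transpose(Array(a)) * x  # transpose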

Checklist:

  • My code follows the style guidelines of this project. Please run using JuliaFormatter; format(".") in the base directory of the repository (~/.julia/dev/BlockSparseArrays) to format your code according to our style guidelines.
  • I have performed a self-review of my own code.
  • I have commented my code, particularly in hard-to-understand areas.
  • I have added tests that verify the behavior of the changes I made.
  • I have made corresponding changes to the documentation.
  • My changes generate no new warnings.
  • Any dependent changes have been merged and published in downstream modules.

@codecov

codecov bot commented Jan 16, 2026

Codecov Report

❌ Patch coverage is 88.00000% with 3 lines in your changes missing coverage. Please review.
✅ Project coverage is 72.18%. Comparing base (10e69a2) to head (c5e1e02).

Files with missing lines Patch % Lines
src/blocksparsearrayinterface/linearalgebra.jl 86.66% 2 Missing ⚠️
src/blocksparsearrayinterface/arraylayouts.jl 85.71% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #198      +/-   ##
==========================================
+ Coverage   71.91%   72.18%   +0.26%     
==========================================
  Files          36       36              
  Lines        2026     2049      +23     
==========================================
+ Hits         1457     1479      +22     
- Misses        569      570       +1     
Flag Coverage Δ
docs 5.78% <0.00%> (-0.07%) ⬇️

@mtfishman
Member

Thanks, though I think the output should be a dense array, not a block sparse array, since subsequent operations on a block sparse array with all blocks filled would be slow.

Also, where does the scalar indexing occur? I would think that you are just taking contiguous slices of blocks from the dense array, so that should be fine to then multiply with a corresponding block of the block sparse array without scalar indexing.

@ArrogantGao
Author

ArrogantGao commented Jan 16, 2026

Thanks, though I think the output should be a dense array, not a block sparse array, since subsequent operations on a block sparse array with all blocks filled would be slow.

I agree that returning a dense array would be preferable, but the previous implementation returned a block sparse array, and I'm concerned about breaking existing code.

Also, where does the scalar indexing occur? I would think that you are just taking contiguous slices of blocks from the dense array, so that should be fine to then multiply with a corresponding block of the block sparse array without scalar indexing.

Yes, but it seems this operation is not allowed for JLArray. I'll double-check that.

@mtfishman
Member

mtfishman commented Jan 16, 2026

I agree that returning a dense array would be preferable, but the previous implementation returned a block sparse array, and I'm concerned about breaking existing code.

That is fine, we don't have code that relies on that operation and I'm not aware of external users of these packages (you are the first I'm aware of). Anyway, as you saw, that operation was very slow so wouldn't have been useful in practice.

Yes, but it seems this operation is not allowed for JLArray. I can have a double check on that.

That would be strange to me, yes please look into that.

@ArrogantGao
Author

That is fine, we don't have code that relies on that operation and I'm not aware of external users of these packages (you are the first I'm aware of). Anyway, as you saw, that operation was very slow so wouldn't have been useful in practice.

I've added a new similar method for the block sparse matrix * vector case, so the product now returns a dense vector.

julia> using BenchmarkTools

julia> using BlockArrays, BlockSparseArrays

julia> cnt = fill(16, 128);

julia> bsa = BlockSparseArray{Float64}(undef, cnt, cnt);

julia> o = ones(2048);

julia> y = bsa * o;

julia> typeof(y)
Vector{Float64} (alias for Array{Float64, 1})

julia> @benchmark $(bsa) * $(o)
BenchmarkTools.Trial: 10000 samples with 10 evaluations per sample.
 Range (min … max):  1.717 μs … 405.992 μs  ┊ GC (min … max):  0.00% … 98.59%
 Time  (median):     2.000 μs               ┊ GC (median):     0.00%
 Time  (mean ± σ):   2.473 μs ±  12.398 μs  ┊ GC (mean ± σ):  17.89% ±  3.55%

     ▁  ▁▁ ▁▃▅▆█████▆▅▄▃▁                                     ▂
  ▃▃▆█████████████████████▇▅▅▄▆▅▅▆▅▆▆▆▅▇▇▇██████▇▆▇▆▆▅▆▅▆▄▅▅▅ █
  1.72 μs      Histogram: log(frequency) by time      2.77 μs <

 Memory estimate: 19.02 KiB, allocs estimate: 40.

That would be strange to me, yes please look into that.

You're right, @allowscalar isn't necessary because @view isn't scalar indexing. I've removed it from the tests.
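For context, a minimal sketch of the distinction (assuming scalar indexing is disabled globally, as it effectively is on real GPU backends):

using GPUArraysCore: allowscalar
using JLArrays: JLArray

allowscalar(false)             # make scalar indexing an error
x = JLArray(ones(Float64, 8))
v = @view x[3:6]               # contiguous view: not scalar indexing
2 .* v                         # broadcasting over the view works fine
# x[3]                         # this, by contrast, would throw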

Inline review on the new output-allocation code:

  elt::Type,
  axes::Tuple{<:AbstractUnitRange},
)
  return Vector{elt}(undef, length(only(axes)))
Member


This wouldn't be generic to GPU. I think it would be better to do something like this:

Suggested change
return Vector{elt}(undef, length(only(axes)))
return similar(mul.B, elt, axes)

so it bases the output type on the dense input.
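For instance, similar keeps the result on the same device as the dense input, whereas Vector{elt}(undef, n) always allocates a CPU array (a quick sketch):

using JLArrays: JLArray

b = JLArray(ones(Float64, 4))
similar(b, Float64, (Base.OneTo(4),))  # another JLArray
Vector{Float64}(undef, 4)              # plain CPU Array, regardless of b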

Author


I have revised that.
