
Conversation

@ArrogantGao

Description

This PR adds support for multiplying a BlockSparseArray matrix by a dense vector, producing a dense vector result.

Previously, attempting BlockSparseMatrix * Vector fell back to an inefficient element-wise double loop. This change implements matrix-vector multiplication by iterating over the stored blocks and accumulating each block's contribution to the result vector.

Implementation details:

  • Added mul!_blocksparse method for matrix-vector multiplication in linearalgebra.jl
  • Added muladd!_blocksparse overload for the matrix-vector case in arraylayouts.jl
  • Added ArrayLayouts.materialize! for MatMulVecAdd with BlockLayout{SparseLayout} matrix and DenseColumnMajor vector

Note: GPU arrays (JLArrays) require @allowscalar because block iteration involves scalar indexing.
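A rough sketch of the accumulation strategy (simplified, assumed code: stored_block_indices is a placeholder for the package's internal iterator over nonzero blocks; the real entry points are the mul!_blocksparse/muladd!_blocksparse overloads listed above):

using BlockArrays: Block
using LinearAlgebra: mul!

# Accumulate y[rows] += A[Block(i, j)] * x[cols], visiting stored blocks only.
function blocksparse_matvec!(y, A, x, stored_block_indices)
    fill!(y, zero(eltype(y)))
    for (i, j) in stored_block_indices  # placeholder iterator, see lead-in
        rows = axes(A, 1)[Block(i)]  # row range covered by block (i, j)
        cols = axes(A, 2)[Block(j)]  # column range covered by block (i, j)
        # 5-arg mul!: view(y, rows) .+= A[Block(i, j)] * view(x, cols)
        mul!(view(y, rows), A[Block(i, j)], view(x, cols), true, true)
    end
    return y
end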

Minimal demonstration of previous behavior

using BlockArrays: Block
using BlockSparseArrays

cnt = fill(16, 128);
bsa = BlockSparseArray{Float64}(undef, cnt, cnt);
o = ones(2048);

@time bsa * o;
  1.917505 seconds (32.90 M allocations: 17.235 GiB, 29.10% gc time, 99.34% compilation time)

Minimal demonstration of new behavior

using BlockArrays: Block
using BlockSparseArrays

cnt = fill(16, 128);
bsa = BlockSparseArray{Float64}(undef, cnt, cnt);
o = ones(2048);

@time bsa * o;
  0.000019 seconds (183 allocations: 25.672 KiB, 0.44% compilation time: 100% of which was recompilation)

@time bsa' * o;
  0.000021 seconds (71 allocations: 20.969 KiB, 0.19% compilation time: 100% of which was recompilation)

@time transpose(bsa) * o;
  0.000023 seconds (69 allocations: 20.875 KiB, 0.19% compilation time: 100% of which was recompilation)

How Has This Been Tested?

Added a new test section "Matrix-vector multiplication" in test/test_basics.jl that runs across the supported element types (Float32, Float64, ComplexF32) and array types (Array, JLArray); a condensed sketch follows the list below.

  • Test basic BlockSparseMatrix * dense Vector multiplication against dense reference
  • Test result element type and dimensions
  • Test adjoint(BlockSparseMatrix) * dense Vector
  • Test transpose(BlockSparseMatrix) * dense Vector
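A condensed sketch of the kind of checks added (illustrative block sizes, not the exact test code):

using Test
using BlockArrays: Block
using BlockSparseArrays

a = BlockSparseArray{Float64}(undef, [2, 3], [2, 3])
a[Block(1, 1)] = randn(2, 2)  # store only the diagonal blocks
a[Block(2, 2)] = randn(3, 3)
x = randn(5)
y = a * x
@test y isa Vector{Float64} && length(y) == 5
@test y ≈ Array(a) * x                            # dense reference
@test a' * x ≈ Array(a)' * x                      # adjoint
@test transpose(a) * x ≈ transpose(Array(a)) * x  # transpose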

Checklist:

  • My code follows the style guidelines of this project. Please run using JuliaFormatter; format(".") in the base directory of the repository (~/.julia/dev/BlockSparseArrays) to format your code according to our style guidelines.
  • I have performed a self-review of my own code.
  • I have commented my code, particularly in hard-to-understand areas.
  • I have added tests that verify the behavior of the changes I made.
  • I have made corresponding changes to the documentation.
  • My changes generate no new warnings.
  • Any dependent changes have been merged and published in downstream modules.

@codecov

codecov bot commented Jan 16, 2026

Codecov Report

❌ Patch coverage is 88.00000% with 3 lines in your changes missing coverage. Please review.
✅ Project coverage is 72.18%. Comparing base (10e69a2) to head (c5e1e02).

Files with missing lines Patch % Lines
src/blocksparsearrayinterface/linearalgebra.jl 86.66% 2 Missing ⚠️
src/blocksparsearrayinterface/arraylayouts.jl 85.71% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #198      +/-   ##
==========================================
+ Coverage   71.91%   72.18%   +0.26%     
==========================================
  Files          36       36              
  Lines        2026     2049      +23     
==========================================
+ Hits         1457     1479      +22     
- Misses        569      570       +1     
Flag Coverage Δ
docs 5.78% <0.00%> (-0.07%) ⬇️

@mtfishman
Member

Thanks, though I think the output should be a dense array, not a block sparse array, since subsequent operations on a block sparse array with all blocks filled would be slow.

Also, where does the scalar indexing occur? I would think that you are just taking contiguous slices of blocks from the dense array, so that should be fine to then multiply with a corresponding block of the block sparse array without scalar indexing.

@ArrogantGao
Author

ArrogantGao commented Jan 16, 2026

Thanks, though I think the output should be a dense array, not a block sparse array, since subsequent operations on a block sparse array with all blocks filled would be slow.

I agree that returning a dense array would be preferable, but the previous implementation returned a block sparse array, and I'm concerned about breaking existing code.

Also, where does the scalar indexing occur? I would think that you are just taking contiguous slices of blocks from the dense array, so that should be fine to then multiply with a corresponding block of the block sparse array without scalar indexing.

Yes, but it seems this operation is not allowed for JLArray. I'll double-check that.

@mtfishman
Member

mtfishman commented Jan 16, 2026

I agree that returning a dense array would be preferable, but the previous implementation returned a block sparse array, and I'm concerned about breaking existing code.

That is fine, we don't have code that relies on that operation and I'm not aware of external users of these packages (you are the first I'm aware of). Anyway, as you saw, that operation was very slow so wouldn't have been useful in practice.

Yes, but it seems this operation is not allowed for JLArray. I can have a double check on that.

That would be strange to me, yes please look into that.

@ArrogantGao
Author

That is fine, we don't have code that relies on that operation and I'm not aware of external users of these packages (you are the first I'm aware of). Anyway, as you saw, that operation was very slow so wouldn't have been useful in practice.

I've added a new similar method for the block sparse matrix * vector case, so the product now returns a dense vector.

julia> using BenchmarkTools

julia> using BlockArrays, BlockSparseArrays

julia> cnt = fill(16, 128);

julia> bsa = BlockSparseArray{Float64}(undef, cnt, cnt);

julia> o = ones(2048);

julia> y = bsa * o;

julia> typeof(y)
Vector{Float64} (alias for Array{Float64, 1})

julia> @benchmark $(bsa) * $(o)
BenchmarkTools.Trial: 10000 samples with 10 evaluations per sample.
 Range (min … max):  1.717 μs … 405.992 μs  ┊ GC (min … max):  0.00% … 98.59%
 Time  (median):     2.000 μs               ┊ GC (median):     0.00%
 Time  (mean ± σ):   2.473 μs ±  12.398 μs  ┊ GC (mean ± σ):  17.89% ±  3.55%

     ▁  ▁▁ ▁▃▅▆█████▆▅▄▃▁                                     ▂
  ▃▃▆█████████████████████▇▅▅▄▆▅▅▆▅▆▆▆▅▇▇▇██████▇▆▇▆▆▅▆▅▆▄▅▅▅ █
  1.72 μs      Histogram: log(frequency) by time      2.77 μs <

 Memory estimate: 19.02 KiB, allocs estimate: 40.

That would be strange to me, yes please look into that.

You're right, @allowscalar isn't necessary because @view isn't scalar indexing. I've removed it from the tests.
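For context, a minimal sketch of the distinction (assuming scalar indexing is disabled globally, as it effectively is on real GPU backends):

using GPUArraysCore: allowscalar
using JLArrays: JLArray

allowscalar(false)             # make scalar indexing an error
x = JLArray(ones(Float64, 8))
v = @view x[3:6]               # contiguous view: not scalar indexing
2 .* v                         # broadcasting over the view works fine
# x[3]                         # this, by contrast, would throw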

Inline review on the new output-allocation code:

  elt::Type,
  axes::Tuple{<:AbstractUnitRange},
)
  return Vector{elt}(undef, length(only(axes)))
Member


This wouldn't be generic to GPU. I think it would be better to do something like this:

Suggested change
return Vector{elt}(undef, length(only(axes)))
return similar(mul.B, elt, axes)

so it bases the output type on the dense input.
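For instance, similar keeps the result on the same device as the dense input, whereas Vector{elt}(undef, n) always allocates a CPU array (a quick sketch):

using JLArrays: JLArray

b = JLArray(ones(Float64, 4))
similar(b, Float64, (Base.OneTo(4),))  # another JLArray
Vector{Float64}(undef, 4)              # plain CPU Array, regardless of b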

Author


I have revised that.
