add mat * dense_vec operation #198
base: main
Conversation
Codecov Report

❌ Patch coverage is …

Additional details and impacted files

```
@@            Coverage Diff             @@
##             main     #198      +/-   ##
==========================================
+ Coverage   71.91%   72.18%   +0.26%
==========================================
  Files          36       36
  Lines        2026     2049      +23
==========================================
+ Hits         1457     1479      +22
- Misses        569      570       +1
```
Thanks, though I think the output should be a dense array, not a block sparse array, since subsequent operations on a block sparse array with all blocks filled would be slow. Also, where does the scalar indexing occur? I would think that you are just taking contiguous slices of blocks from the dense array, so that should be fine to then multiply with a corresponding block of the block sparse array without scalar indexing.
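To illustrate the point about contiguous slices, here is a minimal sketch using BlockArrays' blocked axes (the sizes and names are illustrative, not the PR's code):

```julia
using BlockArrays

x = ones(2048)                     # the dense input vector
ax = blockedrange(fill(16, 128))   # blocked column axis of the matrix
cols = ax[Block(3)]                # contiguous index range covered by block 3
xs = view(x, cols)                 # contiguous view: no scalar indexing needed
A3 = rand(16, 16)                  # stands in for a stored 16x16 block
y3 = A3 * xs                       # plain dense mat-vec on the slice
```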
I agree that returning a dense array would be preferable, but the previous implementation returned a block sparse array, and I'm concerned about breaking existing code.
Yes, but it seems this operation is not allowed for JLArray. I will double-check that.
That is fine, we don't have code that relies on that operation and I'm not aware of external users of these packages (you are the first I'm aware of). Anyway, as you saw, that operation was very slow so wouldn't have been useful in practice.
That would be strange to me; yes, please look into that.
I’ve added a new benchmark:

```julia
julia> using BenchmarkTools

julia> using BlockArrays, BlockSparseArrays

julia> cnt = fill(16, 128);

julia> bsa = BlockSparseArray{Float64}(undef, cnt, cnt);

julia> o = ones(2048);

julia> y = bsa * o;

julia> typeof(y)
Vector{Float64} (alias for Array{Float64, 1})

julia> @benchmark $(bsa) * $(o)
BenchmarkTools.Trial: 10000 samples with 10 evaluations per sample.
 Range (min … max):  1.717 μs … 405.992 μs  ┊ GC (min … max):  0.00% … 98.59%
 Time  (median):     2.000 μs               ┊ GC (median):     0.00%
 Time  (mean ± σ):   2.473 μs ±  12.398 μs  ┊ GC (mean ± σ):  17.89% ±  3.55%

            ▁ ▁▁ ▁▃▅▆█████▆▅▄▃▁                                ▂
  ▃▃▆█████████████████████▇▅▅▄▆▅▅▆▅▆▆▆▅▇▇▇██████▇▆▇▆▆▅▆▅▆▄▅▅▅ █
  1.72 μs      Histogram: log(frequency) by time       2.77 μs <

 Memory estimate: 19.02 KiB, allocs estimate: 40.
```
You’re right.
```julia
  elt::Type,
  axes::Tuple{<:AbstractUnitRange},
)
  return Vector{elt}(undef, length(only(axes)))
```
This wouldn't be generic to GPU. I think it would be better to do something like this:

```diff
- return Vector{elt}(undef, length(only(axes)))
+ return similar(mul.B, elt, axes)
```
so it bases the output type on the dense input.
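For illustration, `similar` propagates the array type of the dense input, so a GPU-like input yields a GPU-like output. A minimal check, assuming JLArrays is available:

```julia
using JLArrays

B = JLArray(ones(Float64, 8))  # GPU-like dense input
y = similar(B, Float64, 8)     # allocated in the same array family as B
typeof(y)                      # JLArray{Float64, 1}
```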
I have revised that.
Description

This PR adds support for multiplying a `BlockSparseArray` matrix by a dense vector, producing a dense vector result. Previously, attempting `BlockSparseMatrix * Vector` would fall back to an inefficient double loop. This change implements the matrix-vector multiplication by iterating over the stored blocks and accumulating their contributions to the result vector.

Implementation details:

- a `mul!_blocksparse` method for matrix-vector multiplication in `linearalgebra.jl`
- a `muladd!_blocksparse` overload for the matrix-vector case in `arraylayouts.jl`
- an `ArrayLayouts.materialize!` method for `MatMulVecAdd` with a `BlockLayout{SparseLayout}` matrix and a `DenseColumnMajor` vector

Note: GPU arrays (JLArrays) require `@allowscalar` because block iteration involves scalar indexing. A sketch of the core block-wise loop is shown below.
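A minimal sketch of that block-wise accumulation (the function name and the `stored_blocks` iterator are illustrative assumptions, not the PR's exact code):

```julia
using LinearAlgebra, BlockArrays

# Illustrative sketch only: `stored_blocks` is a hypothetical iterator over
# the (I, J) indices of the stored blocks; the PR's actual methods live in
# linearalgebra.jl and arraylayouts.jl under different names and dispatch.
function blocksparse_matvec!(y::AbstractVector, A::AbstractMatrix,
                             x::AbstractVector, stored_blocks)
    fill!(y, zero(eltype(y)))
    for (I, J) in stored_blocks
        rows = axes(A, 1)[Block(I)]  # contiguous row range of block (I, J)
        cols = axes(A, 2)[Block(J)]  # contiguous column range of block (I, J)
        # accumulate y[rows] += A[Block(I, J)] * x[cols] in place
        mul!(view(y, rows), A[Block(I, J)], view(x, cols), true, true)
    end
    return y
end
```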
Minimal demonstration of previous behavior
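A minimal sketch based on the discussion above (the old return type is an assumption taken from the conversation, not a verified output):

```julia
using BlockArrays, BlockSparseArrays

cnt = fill(16, 128)
bsa = BlockSparseArray{Float64}(undef, cnt, cnt)
o = ones(2048)

# Before this PR, `bsa * o` hit a generic double-loop fallback with
# elementwise (scalar) indexing, which was very slow, and returned a
# block sparse vector rather than a dense Vector{Float64}.
y = bsa * o
```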
Minimal demonstration of new behavior
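As in the benchmark session above:

```julia
julia> using BlockArrays, BlockSparseArrays

julia> bsa = BlockSparseArray{Float64}(undef, fill(16, 128), fill(16, 128));

julia> y = bsa * ones(2048);

julia> typeof(y)
Vector{Float64} (alias for Array{Float64, 1})
```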
How Has This Been Tested?

Added a new test section "Matrix-vector multiplication" in `test/test_basics.jl` that runs across all element types (Float32, Float64, ComplexF32) and array types (Array, JLArray), checking:

- `BlockSparseMatrix * dense Vector` multiplication against a dense reference
- `adjoint(BlockSparseMatrix) * dense Vector`
- `transpose(BlockSparseMatrix) * dense Vector`

A rough sketch of the shape of these tests follows this list.
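A sketch for the plain Array backend (block sizes, values, and the single stored block are illustrative; the JLArray variants follow the same pattern with `@allowscalar` where needed):

```julia
using Test, BlockArrays, BlockSparseArrays

@testset "Matrix-vector multiplication" begin
    for elt in (Float32, Float64, ComplexF32)
        a = BlockSparseArray{elt}(undef, [2, 3], [2, 3])
        a[Block(1, 1)] = randn(elt, 2, 2)  # store a single block
        v = randn(elt, 5)
        @test a * v ≈ Array(a) * v
        @test a' * v ≈ Array(a)' * v
        @test transpose(a) * v ≈ transpose(Array(a)) * v
    end
end
```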
Checklist:

- Run `using JuliaFormatter; format(".")` in the base directory of the repository (`~/.julia/dev/BlockSparseArrays`) to format your code according to our style guidelines.