Improve stability of velocity fits in template metrics #4342

ecobost · 2026-01-24T16:35:12Z

In template metrics, velocity_above and velocity_below estimate how fast an spike moves along the axon/dendrites by fitting a line where x is distance (template_channel_location - location of soma) and y is time (position of the peak at each template_channel compared to the soma expressed in msecs). A small slope means the spike takes a long time to move through the probe (so mm/msecs), a higher slope means the spike moves fast. Currently when a peak position is the same (or very close) along the template channels, VelocityFits() tries to fit a straight vertical line; which gives an infinite slope (in practice, the fitting fails with NaN because X is ill-conditioned). This is pretty common.

This PR switches the fitting to regress peak_ms (in y) onto channel_distances (in x) and then take 1/slope to obtain the same velocities as before. This is more stable and also justified in the original source "Because the time difference between the trough of adjacent sites could be 0, to avoid inﬁnite numbers, we calculated the inverse of velocity (ms/mm) instead by ﬁtting a regression line to the time of waveform trough at different sites against the distance of the sites relative to soma" (Jia et al, 2020). I also centered the data before fitting the regressor. Centering helps stability and avoids having to learn an intercept which gives the model a bit more robustness.

Here's the velocities calculated with the current method and the new method (proposed in this PR):

Predicted velocities are consistent for lower velocities (center of the plot) and more stable at higher velocities. It is also able to make predictions for a lot more cases (this is a 3h neuropixels recording and the proportions of NaNs drops from 0.37 to 0.26). I also manually checked some results (where there were nans before) and they look sensible.

Overall, I think it's just a sensible change for added stability and better predictions.

for more information, see https://pre-commit.ci

chrishalcrow · 2026-01-26T16:29:53Z

This looks great, I'll have a close look tomorrow :)

chrishalcrow · 2026-01-27T09:49:20Z

src/spikeinterface/metrics/template/metrics.py

        distances_um_above = np.array([np.linalg.norm(cl - max_channel_location) for cl in channel_locations_above])
-        velocity_above, _, score = fit_velocity(peak_times_ms_above, distances_um_above)
+        inv_velocity_above, score = fit_line_robust(distances_um_above, peak_times_ms_above)
+        velocity_above = 1 / inv_velocity_above


Something to improve while we're at it: I think it would be nice to catch the division by zero before it happens, since we're expecting this to fail for numerous units. If we do this, the user doesn't get annoying / uninterruptible division by zero errors

Suggested change

velocity_above = 1 / inv_velocity_above

if inv_velocity_above != 0:

velocity_above = 1 / inv_velocity_above

else:

velocity_above = np.nan

chrishalcrow · 2026-01-27T09:49:37Z

src/spikeinterface/metrics/template/metrics.py

        distances_um_below = np.array([np.linalg.norm(cl - max_channel_location) for cl in channel_locations_below])
-        velocity_below, _, score = fit_velocity(peak_times_ms_below, distances_um_below)
+        inv_velocity_below, score = fit_line_robust(distances_um_below, peak_times_ms_below)
+        velocity_below = 1 / inv_velocity_below


Suggested change

velocity_below = 1 / inv_velocity_below

if inv_velocity_below != 0:

velocity_below = 1 / inv_velocity_below

else:

velocity_below = np.nan

chrishalcrow

Looks great!

Confirmed that it gives similar results on real data, is fast, and gives fewer nan results. Win, win win. Made a tiny comment which would marginally make the error messages less annoying.

@ecobost out of interest, since you've been looking at this score: I've only really got NP2 data. Do you know if this algorithm generalizes to large square array probes? Or is it designed only for long ones?

for more information, see https://pre-commit.ci

ecobost · 2026-01-28T00:33:44Z

Hi @chrishalcrow, thanks for having a look at this

I found out that sklearn's TheilSenRegressor requires fit_intercept=True (otherwise it fits lines without intercept y=wx with a single datapoint rather than two datapoints, pretty odd behavior and not what we want as the entire dataset might have intercept 0 but any particular pair of datapoints does not) so I changed that back in this commit 0630977

I then realized that because this regressor is designed to work on multi-dimensional input, it actually returns an estimate of the median (the spatial median across the parameter space) rather than the median slope (closed-form solution when num_features=1, our case); that is inexact and more time-consuming (as it has to solve a small optimization problem). So I just implemented the algorithm myself (it is pretty straightforward and hopefully easy to understand), it returns almost the same results as sklearn but it's 5-7x faster (and exact).

Overall, this should produce better more stable fits. Here's the same plot as above (with velocity computed as in the current code in x axis vs proposed 1/inv_velocity), kind of just looks the same

re: your question. I work with NHP 1.0 probes so a unit template is essentially a single column with channels 20-microns apart, which is the canonical use case for this method. Assumptions are that the neuron runs parallel to your depth dimension (y by default, i.e., along the length of the probe for neuropixels) and that channels are nearby in x (code has a column range to limit the x breadth if need be). One could compute a 2-d propagation speed tracking how the spike moves in x and y (think of having a 2-d heatmap of spike delays) but it will require some extra thought

chrishalcrow · 2026-01-28T12:56:15Z

src/spikeinterface/metrics/template/metrics.py

+        if np.abs(x1 - x0) > eps:
+            slopes.append((y1 - y0) / (x1 - x0))


Suggested change

if np.abs(x1 - x0) > eps:

slopes.append((y1 - y0) / (x1 - x0))

slopes.append((y1 - y0) / (x1 - x0))

I'm pretty sure this is impossible: we don't allow probes with equal channel locations. (And it never happens for my data!)

chrishalcrow · 2026-01-28T12:56:30Z

src/spikeinterface/metrics/template/metrics.py

-    return slope, intercept, score
+    # Calculate R2 score
+    y_pred = slope * x + bias
+    r2_score = 1 - ((y - y_pred) ** 2).sum() / (((y - y.mean()) ** 2).sum() + eps)


Suggested change

r2_score = 1 - ((y - y_pred) ** 2).sum() / (((y - y.mean()) ** 2).sum() + eps)

r2_score = 1 - ((y - y_pred) ** 2).sum() / (((y - y.mean()) ** 2).sum())

I think the eps shouldn't be here

chrishalcrow · 2026-01-28T12:57:55Z

src/spikeinterface/metrics/template/metrics.py

-        if score < min_r2:
+        inv_velocity_above, score = fit_line_robust(distances_um_above, peak_times_ms_above)
+        if score > min_r2:
+            velocity_above = 1 / inv_velocity_above if np.abs(inv_velocity_above) > 1e-9 else np.inf


Suggested change

velocity_above = 1 / inv_velocity_above if np.abs(inv_velocity_above) > 1e-9 else np.inf

velocity_above = 1 / inv_velocity_above if np.abs(inv_velocity_above) > 1e-9 else np.nan

np.inf is more correct, but it causes some problems downstream with pandas. To allow np.inf we'd need to update some other code and add some tests to make sure everything was robust to infs.

chrishalcrow · 2026-01-28T12:58:02Z

src/spikeinterface/metrics/template/metrics.py

-        if score < min_r2:
+        inv_velocity_below, score = fit_line_robust(distances_um_below, peak_times_ms_below)
+        if score > min_r2:
+            velocity_below = 1 / inv_velocity_below if np.abs(inv_velocity_below) > 1e-9 else np.inf


Suggested change

velocity_below = 1 / inv_velocity_below if np.abs(inv_velocity_below) > 1e-9 else np.inf

velocity_below = 1 / inv_velocity_below if np.abs(inv_velocity_below) > 1e-9 else np.nan

chrishalcrow · 2026-01-28T13:01:09Z

Hello, my results are fairly different for the new implementation (sometimes a x2 difference near velocity of 1000: figure below is velocity_above), and I went down a rabbit hole this morning...

I think your PR fixes something more fundamental than just stability. The Theil-Sen regressor is robust to outliers in y, not in x. For the velocity fits, the distances come from channel locations and are very easy to compute while the peak times are more noisy (since they are to do with peak locations on multiple templates: these can be hard to find when you're further from the signal). Since the peak times are noisier, they should be treated like the independent variable in the regression. This PR makes this change.

So: I would say this PR makes quite a big change in the metric computation but it fixes an error in the previous implementation.

ecobost and others added 2 commits January 24, 2026 16:32

change direction of velocity fit in template metrics

365fe04

[pre-commit.ci] auto fixes from pre-commit.com hooks

362a87b

for more information, see https://pre-commit.ci

alejoe91 added the metrics Related to metrics module label Jan 26, 2026

chrishalcrow self-assigned this Jan 26, 2026

chrishalcrow reviewed Jan 27, 2026

View reviewed changes

chrishalcrow approved these changes Jan 27, 2026

View reviewed changes

ecobost and others added 4 commits January 27, 2026 23:27

set fit_intercept=True in TheilSen and check for possible infinites

0630977

compute theil-sen estimator manually (exact and faster)

1f6e8f5

[pre-commit.ci] auto fixes from pre-commit.com hooks

48cbf20

for more information, see https://pre-commit.ci

minor: code formatting

0ef443f

chrishalcrow reviewed Jan 28, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Improve stability of velocity fits in template metrics #4342

Improve stability of velocity fits in template metrics #4342

ecobost commented Jan 24, 2026

Uh oh!

chrishalcrow commented Jan 26, 2026

Uh oh!

chrishalcrow Jan 27, 2026

Uh oh!

chrishalcrow Jan 27, 2026

Uh oh!

chrishalcrow left a comment

Uh oh!

ecobost commented Jan 28, 2026

Uh oh!

chrishalcrow Jan 28, 2026

Uh oh!

chrishalcrow Jan 28, 2026

Uh oh!

chrishalcrow Jan 28, 2026

Uh oh!

chrishalcrow Jan 28, 2026

Uh oh!

chrishalcrow commented Jan 28, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

		if np.abs(x1 - x0) > eps:
		slopes.append((y1 - y0) / (x1 - x0))

	r2_score = 1 - ((y - y_pred) 2).sum() / (((y - y.mean()) 2).sum() + eps)
	r2_score = 1 - ((y - y_pred) 2).sum() / (((y - y.mean()) 2).sum())

	velocity_above = 1 / inv_velocity_above if np.abs(inv_velocity_above) > 1e-9 else np.inf
	velocity_above = 1 / inv_velocity_above if np.abs(inv_velocity_above) > 1e-9 else np.nan

	velocity_below = 1 / inv_velocity_below if np.abs(inv_velocity_below) > 1e-9 else np.inf
	velocity_below = 1 / inv_velocity_below if np.abs(inv_velocity_below) > 1e-9 else np.nan

Improve stability of velocity fits in template metrics #4342

Are you sure you want to change the base?

Improve stability of velocity fits in template metrics #4342

Conversation

ecobost commented Jan 24, 2026

Uh oh!

chrishalcrow commented Jan 26, 2026

Uh oh!

chrishalcrow Jan 27, 2026

Choose a reason for hiding this comment

Uh oh!

chrishalcrow Jan 27, 2026

Choose a reason for hiding this comment

Uh oh!

chrishalcrow left a comment

Choose a reason for hiding this comment

Uh oh!

ecobost commented Jan 28, 2026

Uh oh!

chrishalcrow Jan 28, 2026

Choose a reason for hiding this comment

Uh oh!

chrishalcrow Jan 28, 2026

Choose a reason for hiding this comment

Uh oh!

chrishalcrow Jan 28, 2026

Choose a reason for hiding this comment

Uh oh!

chrishalcrow Jan 28, 2026

Choose a reason for hiding this comment

Uh oh!

chrishalcrow commented Jan 28, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants