SenseVoice: enabling output_timestamp=True (it must be turned on explicitly) and then using a VAD model raises an error

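Specifically, in funasr's inference_with_vad logic, the program tries to add the VAD (voice activity detection) segment timestamps to the audio offset, but one operand is a string (str) while the other is an integer (int).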
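The type mismatch can be reproduced outside funasr with a few lines of plain Python. This is only a sketch: `shift_in_place` and the sample values are hypothetical stand-ins, not funasr code or real model output. It shows that the exact message "can only concatenate str (not "int") to str" appears when the left-hand operand (here `t[0]`) is the string and the VAD offset is the integer.

```python
# Hypothetical stand-in data; the real values are produced inside funasr.
vadsegments = [[1500, 4200]]   # one VAD segment as [start, end], both int
timestamps = [["0", "320"]]    # a timestamp entry whose fields are str (assumed)


def shift_in_place(timestamps, segment_start):
    """Mimic the failing pattern: add the segment start to each timestamp."""
    for t in timestamps:
        t[0] += segment_start  # str += int raises TypeError
        t[1] += segment_start


try:
    shift_in_place(timestamps, vadsegments[0][0])
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)
```

If the operands were the other way round (int on the left, str on the right), Python would report an "unsupported operand type(s)" error instead, so the message itself indicates which side is the string.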

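This is usually caused by the SenseVoice model and certain versions of the funasr library handling the internal format of returned results inconsistently in "with VAD" mode. With timestamps enabled, SenseVoice fails when used with a VAD model:

Traceback (most recent call last):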
  File "F:\workspace\SenseVoice\demo1.py", line 22, in <module>
    res = model.generate(
  File "F:\workspace\SenseVoice\.venv\lib\site-packages\funasr\auto\auto_model.py", line 311, in generate
    return self.inference_with_vad(
  File "F:\workspace\SenseVoice\.venv\lib\site-packages\funasr\auto\auto_model.py", line 537, in inference_with_vad
    t[0] += vadsegments[j][0]
TypeError: can only concatenate str (not "int") to str
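One defensive way to avoid this class of error is to coerce both operands to int before adding. The sketch below is not a funasr patch: `offset_timestamps` is a hypothetical helper, and it assumes each timestamp entry is a `[start, end]` pair whose string values are numeric. If the string field is actually non-numeric (for example a token), `int()` would raise ValueError and a different fix would be needed.

```python
def offset_timestamps(timestamps, segment_start):
    """Return new [start, end] pairs shifted by segment_start.

    Hypothetical helper: assumes every value is an int or a numeric string.
    """
    offset = int(segment_start)
    shifted = []
    for start, end in timestamps:
        # Coerce each field so that str and int inputs are handled alike.
        shifted.append([int(start) + offset, int(end) + offset])
    return shifted


# Mixed str/int input, shifted by the start of a VAD segment.
result = offset_timestamps([["0", "320"], [400, 900]], 1500)
print(result)
```

Returning a new list rather than mutating in place keeps the original model output available for inspection while debugging.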

Issue #2782

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

SenseVoice output_timestamp=True, # 必须显式打开再用vad模型报错 #2782

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions