Description
🐛 Bug
If 'decibel_thres' of the 'FsmnVADStreaming' class is set to a value higher than the default (for example, -50), the model may crash at this line of 'funasr/models/fsmn_vad_streaming/model.py':
cur_decibel = cache["stats"].decibel[t]
The reported error is:
IndexError: list index out of range
This problem persists in versions 1.1.12, 1.1.14, and 1.3.0 (possibly in every version from 1.1.12 through 1.3.0).
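The failing line indexes a per-frame decibel list with a frame index `t`. The sketch below only illustrates how such an index can run past the end of a list and how a bounds check would surface it. The names `safe_decibel` and `decibels` and the sample values are hypothetical; this is not the FunASR implementation, and it is not a confirmed root cause.

```python
from typing import List, Optional


def safe_decibel(decibels: List[float], t: int) -> Optional[float]:
    """Return decibels[t], or None when frame t has no stored value."""
    if 0 <= t < len(decibels):
        return decibels[t]
    return None


# Hypothetical per-frame decibel values for three frames.
decibels = [-62.0, -48.5, -51.2]

in_range = safe_decibel(decibels, 1)      # frame 1 exists
out_of_range = safe_decibel(decibels, 5)  # frame 5 does not exist

# Unguarded access to the same missing frame raises IndexError.
try:
    decibels[5]
    raised = False
except IndexError:
    raised = True
```

A guard like this would only hide the symptom; if the frame index really does outrun the decibel list, the mismatch between the two is what needs to be explained.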
To Reproduce
1. Set 'decibel_thres' of FsmnVADStreaming to -50.
2. Use the VAD pipeline to segment audio.
Code sample
import soundfile as sf
import torch
from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks

# Assumption: run on the first GPU when available, otherwise on CPU.
device = "cuda:0" if torch.cuda.is_available() else "cpu"

vad_inference_pipeline = pipeline(
    task=Tasks.voice_activity_detection,
    model='iic/speech_fsmn_vad_zh-cn-16k-common-pytorch',
    model_revision="v2.0.4",
    disable_update=True,
    device=device,
    decibel_thres=-50,  # value that reproduces the crash
)

# Read the 16 kHz audio as 16-bit PCM and pass the raw bytes to the pipeline.
wav, _ = sf.read("video_clip2_16k.wav", dtype="int16")
segments_result = vad_inference_pipeline(input=wav.tobytes(), fs=16000)
Expected behavior
The pipeline returns the correct audio segments without crashing.
Environment
- OS: Linux
- FunASR version: 1.1.12, 1.1.14, 1.3.0
- ModelScope version: 1.19.0
- PyTorch version: 2.0.1+cu118
- How you installed funasr: pip
- Python version: 3.8.20
- GPU: 4090
- CUDA/cuDNN version: CUDA 12.1