Finer-grained streaming inference modes #2671
Merged
modified: GPT_SoVITS/TTS_infer_pack/TTS.py modified: GPT_SoVITS/module/models.py
modified: GPT_SoVITS/AR/models/t2s_model.py modified: GPT_SoVITS/TTS_infer_pack/TTS.py modified: GPT_SoVITS/module/models.py
modified: GPT_SoVITS/TTS_infer_pack/TTS.py modified: GPT_SoVITS/module/models.py modified: api_v2.py
Is this the latest release?
Owner
For normal inference (streaming set to False?), the output quality is unaffected, right? @ChasonJiang
Contributor
Author
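@RVC-Boss Normal inference does not use streaming at all (return_fragment is also False), so quality is unaffected. Once streaming is enabled, behavior follows the three modes described in this PR.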
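Adds support for finer-grained streaming inference modes.
Three modes:
1. return_fragment: the streaming_mode of previous versions. Each chunk is as long as one split sentence, measured in tokens. Best quality (identical to the baseline), but the slowest response.
2. streaming_mode: finer chunks than return_fragment, but the first-packet length is not fixed (dynamic chunk length). Good quality (articulation may be slightly weaker than the baseline), moderate response speed.
3. streaming_mode+fixed_length_chunk: enables fixed_length_chunk on top of streaming_mode. Chunk length is roughly fixed (a chunk may be shorter than the configured length), and the first-packet length is fixed. Fair quality (articulation may be slightly weaker than the baseline, and if the chunk length is set too short the pitch may break and sound discontinuous), fastest response.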
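To build intuition for why chunk granularity affects time to first audio, here is a toy sketch that contrasts sentence-length chunks with fixed-length chunks. It is not the PR's implementation: the helper names `sentence_chunks` and `fixed_length_chunks`, the token lists, and the rule that a sentence's tail chunk is simply shorter are all assumptions made for illustration. The dynamic chunking used by plain streaming_mode is not modeled.

```python
from typing import Iterator, List


def sentence_chunks(sentences: List[List[int]]) -> Iterator[List[int]]:
    """Yield one chunk per sentence (hypothetical analogue of return_fragment)."""
    for tokens in sentences:
        yield list(tokens)


def fixed_length_chunks(sentences: List[List[int]], chunk_len: int) -> Iterator[List[int]]:
    """Yield chunks of at most chunk_len tokens per sentence.

    Assumption for illustration: the last chunk of a sentence is emitted
    as-is, so it may be shorter than chunk_len.
    """
    if chunk_len <= 0:
        raise ValueError("chunk_len must be positive")
    for tokens in sentences:
        for start in range(0, len(tokens), chunk_len):
            yield list(tokens[start:start + chunk_len])


# Two made-up "sentences" of 10 and 7 semantic tokens.
sentences = [list(range(10)), list(range(10, 17))]

per_sentence = [len(c) for c in sentence_chunks(sentences)]
per_fixed = [len(c) for c in fixed_length_chunks(sentences, chunk_len=4)]

print(per_sentence)  # [10, 7]
print(per_fixed)     # [4, 4, 2, 4, 3]
```

In this toy setup the first chunk shrinks from 10 tokens to 4, which is the kind of reduction that lets the first audio packet be produced sooner. The cost is more chunk boundaries, which matches the quality caveats listed for the finer modes.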
Summary:
1. Quality: return_fragment > streaming_mode > streaming_mode+fixed_length_chunk
2. Response speed: streaming_mode+fixed_length_chunk >> streaming_mode > return_fragment
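The flag combinations can be summarized in a small lookup helper. This is a minimal sketch, not code from the PR: `resolve_mode` is a hypothetical function, and two behaviors are assumptions, namely that streaming_mode takes precedence when return_fragment is also set, and that fixed_length_chunk is ignored unless streaming_mode is on. Only the three flag names and the two orderings come from the description.

```python
# Orderings taken from the PR summary (best/fastest first).
QUALITY_ORDER = [
    "return_fragment",
    "streaming_mode",
    "streaming_mode+fixed_length_chunk",
]
SPEED_ORDER = [
    "streaming_mode+fixed_length_chunk",
    "streaming_mode",
    "return_fragment",
]


def resolve_mode(return_fragment: bool = False,
                 streaming_mode: bool = False,
                 fixed_length_chunk: bool = False) -> str:
    """Map the three flags to a mode name (hypothetical helper).

    Assumptions: streaming_mode wins over return_fragment, and
    fixed_length_chunk has no effect without streaming_mode.
    """
    if streaming_mode:
        if fixed_length_chunk:
            return "streaming_mode+fixed_length_chunk"
        return "streaming_mode"
    if return_fragment:
        return "return_fragment"
    return "non-streaming"


mode = resolve_mode(streaming_mode=True, fixed_length_chunk=True)
print(mode)                       # streaming_mode+fixed_length_chunk
print(SPEED_ORDER.index(mode))    # 0 (fastest of the three)
print(QUALITY_ORDER.index(mode))  # 2 (lowest quality of the three)
```

With all flags left at False the helper returns "non-streaming", which corresponds to the normal inference path that the author says is unaffected.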