Qifu Technology at the Global INTERSPEECH Conference: 360 on Qifu's Speech Recognition
2024-09-29 16:34
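Qifu Technology recently attended INTERSPEECH 2024, the international conference on speech communication and signal processing, held in Greece. There it delivered a keynote speech on its paper "Qifusion-Net: Layer-adapted Stream/Non-stream Model for End-to-End Multi-Accent Speech Recognition", presenting its results in dialect speech recognition and setting a new benchmark for the field in China and worldwide.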
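INTERSPEECH is one of the major global events in speech science and technology. It brings together leading scholars, researchers, and industry representatives from around the world to discuss the latest progress, challenges, and future trends in speech technology. The conference represents a high academic standard and is also an important venue for presenting new technology, and its authority and influence are widely recognized in the industry.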
Figure 1: Qifu Technology delivering its keynote speech at INTERSPEECH 2024
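In the speech, Qifu Technology introduced "QiFree", a one-stop dialect speech recognition system that supports more than 20 dialect accents at the same time. According to the company, it is the dialect recognition system with the lowest character error rate in China's financial industry. In a comparison on KeSpeech, an authoritative dataset for dialect recognition, Qifu Technology's self-developed Automatic Speech Recognition (ASR) technology reached a dialect accent recognition accuracy of 79.10%, well above the KeSpeech baseline of 61.13%. On CER (Character Error Rate), a key metric of speech recognition performance, Qifu Technology scored 8.08%, well below the KeSpeech baseline of 10.38%, which reflects both the efficiency and the precision of its dialect recognition.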
Table 1: Qifu Technology "QiFree" results compared with the KeSpeech baseline
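To make the CER figures concrete, the following is a minimal sketch of how a character error rate is commonly computed: the character-level edit distance between a reference transcript and the recognizer output, divided by the reference length. The function names `edit_distance`, `cer`, and `relative_reduction` are illustrative and do not come from the article, and this is not Qifu Technology's evaluation code. The sketch also shows how the gap between two CER values can be expressed as a relative reduction.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings, counted in characters."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference character
                curr[j - 1] + 1,     # insertion of a hypothesis character
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate: edit distance divided by reference length."""
    if not ref:
        raise ValueError("reference transcript must not be empty")
    return edit_distance(ref, hyp) / len(ref)


def relative_reduction(baseline: float, system: float) -> float:
    """Fraction by which `system` lowers an error rate relative to `baseline`."""
    return (baseline - system) / baseline


if __name__ == "__main__":
    # One substituted character out of four gives a CER of 0.25.
    print(cer("abcd", "abxd"))
    # CER values quoted in the article: 10.38% (baseline) vs 8.08% (QiFree).
    print(f"{relative_reduction(10.38, 8.08):.1%}")
```

Under this definition, moving from a CER of 10.38% to 8.08% corresponds to a relative reduction of roughly 22%. Real evaluations usually normalize the text (punctuation, spacing, numerals) before scoring, which this sketch leaves out.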
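Qifu Technology's self-developed dialect recognition system "QiFree" removes the limitation that a single model can recognize only one specific dialect. Through a new layer-adapted fusion structure and an accent information encoding module, it extracts accent information effectively during decoding, which further strengthens real-time performance. Notably, "QiFree" performs at a leading level in CER on Mandarin recognition, and on dialects from several regions it achieved strong results, with an improvement of more than 15%. This result was highly regarded by the INTERSPEECH reviewers, who unanimously endorsed Qifu Technology's "Qifusion-Net", a layer-adapted fusion framework for streaming and non-streaming end-to-end multi-accent speech recognition. On the strength of the recognition performance the system demonstrated, the paper received a unanimous "ACCEPT".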
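Notably, Qifu Technology also showed a clear advantage in a comparison with influential open-source speech recognition models from other technology companies, including an industry giant. Against competitors with larger models and richer training data, Qifu Technology still achieved a lower CER (8.08% vs 15.61% vs 26.55%), which supports the strength of its technical architecture and the efficiency of its algorithm optimization. In addition, compared with the globally leading speech recognition system Openai-whisper v2, the latter holds an advantage in Mandarin recognition, but in the dialect recognition segment Qifu Technology remains ahead, further confirming its leading position in dialect recognition.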
Table 2: Key metrics of Qifu Technology "QiFree" compared with other technology companies
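Qifu Technology's appearance at INTERSPEECH 2024 both affirms its results in dialect speech recognition and shows the strong potential of Chinese companies in this field. Qifu Technology will continue to draw on its technical strength and spirit of innovation to advance dialect speech recognition, with 360 and Qifu contributing Chinese wisdom and Chinese strength to the progress of global speech communication and signal processing.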