Interspeech 2025
Samir Sadok1,*, Julien Hauret2,3,*, Éric Bavu2
1 Inria, Université Grenoble Alpes, CNRS, LJK, France
2 LMSSC, Conservatoire national des arts et métiers (Cnam), Paris, France
3 APC, French-German Research Institute of Saint-Louis, France
* Equal contribution
Codec | Sampling rate (kHz) | Frame rate (Hz) | Codebook size | Number of codebooks |
DAC | 16, 24 or 44.1 | 86 | 1024 | 9 |
SpeechTokenizer | 16 | 50 | 1024 | 8 |
BigCodec | 16 | 80 | 8192 | 1 |
Mimi | 24 | 12.5 | 2048 | 1 + 31 |
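
The codec parameters listed above determine a nominal bitrate, which helps when comparing the four codecs. The sketch below assumes the numeric columns are, in order, sampling rate (kHz), frame rate (Hz), codebook size, and number of codebooks, and that every codebook index is stored with log2(codebook size) bits. It also assumes that Mimi's "1 + 31" means 32 codebooks in total. The function name `nominal_bitrate_bps` is illustrative and does not come from the poster.

```python
import math

# (frame rate in Hz, codebook size, number of codebooks), copied from the rows above.
# Assumption: Mimi's "1 + 31" is counted as 32 codebooks in total.
CODECS = {
    "DAC": (86.0, 1024, 9),
    "SpeechTokenizer": (50.0, 1024, 8),
    "BigCodec": (80.0, 8192, 1),
    "Mimi": (12.5, 2048, 1 + 31),
}


def nominal_bitrate_bps(frame_rate_hz, codebook_size, n_codebooks):
    """Return frames per second x codebooks per frame x bits per codebook index."""
    bits_per_token = math.log2(codebook_size)
    return frame_rate_hz * n_codebooks * bits_per_token


for name, params in CODECS.items():
    print(f"{name}: {nominal_bitrate_bps(*params) / 1000:.2f} kbps")
```

These figures are upper bounds for the listed configurations. Using fewer codebooks at inference time lowers the bitrate proportionally.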
Where are speech attributes encoded in neural audio codecs?
Is there a deterministic mapping between HuBERT tokens and codec tokens?
librispeech-test-clean
How to analyze and control audio from codec tokens with AnCoGen?
LibriSpeech