Automatic Speech Recognition for Children's Speech

Research topic: Automatic Speech Recognition
Supervisor: Zhaolin Li
Description:

In recent years, Automatic Speech Recognition (ASR) technology has advanced significantly, especially in recognizing adult speech. However, ASR systems continue to face challenges when processing children's speech, which differs markedly from adult speech in pronunciation, vocabulary, and speech patterns. This thesis will address these challenges, particularly focusing on issues such as invented words and speech disfluencies. By leveraging state-of-the-art speech foundation models, the research aims to enhance ASR performance for children's speech and contribute to the creation of more accurate and child-friendly ASR systems.

Requirements:

Knowledge of Python and PyTorch

Knowledge of machine learning

Interest in speech technologies

Literature:

1. Yu, Fan, et al. "The SLT 2021 children speech recognition challenge: Open datasets, rules and baselines." 2021 IEEE Spoken Language Technology Workshop (SLT). IEEE, 2021.

2. Jain, Rishabh, et al. "Adaptation of Whisper models to child speech recognition." arXiv preprint arXiv:2307.13008, 2023.

3. Olstad, Anne Marte Haug, et al. "Collecting Linguistic Resources for Assessing Children’s Pronunciation of Nordic Languages." 2024.

4. W. Liu, Y. Qin, Z. Peng and T. Lee, "Sparsely Shared LoRA on Whisper for Child Speech Recognition," ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2024.