Automatic Speech Recognition for Children's Speech

  • Subject: Automatic Speech Recognition
  • Supervisor: Zhaolin Li

  • Add on:

    Description:

    In recent years, Automatic Speech Recognition (ASR) technology has advanced significantly, especially in recognizing adult speech. However, ASR systems continue to face challenges when processing children's speech, which differs markedly from adult speech in pronunciation, vocabulary, and speech patterns. This thesis will address these challenges, particularly focusing on issues such as invented words and speech disfluencies. By leveraging state-of-the-art speech foundation models, the research aims to enhance ASR performance for children's speech and contribute to the creation of more accurate and child-friendly ASR systems.

     

    Requirements:

    Knowledge of Python and PyTorch

    Knowledge of machine learning

    Interest in speech technologies

     

    Literature:

    1. Yu, Fan, et al. "The SLT 2021 children speech recognition challenge: Open datasets, rules and baselines." 2021 IEEE Spoken Language Technology Workshop (SLT). IEEE, 2021.

    2. Jain, Rishabh, et al. "Adaptation of Whisper models to child speech recognition." arXiv preprint arXiv:2307.13008, 2023.

    3. Olstad, Anne Marte Haug, et al. "Collecting Linguistic Resources for Assessing Children’s Pronunciation of Nordic Languages." 2024.

    4. Liu, W., Qin, Y., Peng, Z., and Lee, T. "Sparsely Shared LoRA on Whisper for Child Speech Recognition." ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2024.