113 Research Dr, Bethlehem, PA 18015

Abstract: In recent years, the field of speech and language processing has made significant strides, yet challenges such as speech noise, limited high-quality data, and the lack of robustness in speech generation systems persist. Furthermore, evaluating speech at scale remains a considerable obstacle to comprehensive assessment. Concurrently, recent breakthroughs in Large Language Models (LLMs) have revolutionized text generation and natural language processing. However, the complexity of spoken language introduces unique hurdles, including managing long speech waveform sequences. In this presentation, I will explore recent innovations in multimodal spoken language modeling, generative speech system evaluation, and high-fidelity speech enhancement. I will also discuss potential avenues for future research toward more robust and effective conversational AI.


Biography: Soumi Maiti is a postdoctoral researcher at the Language Technologies Institute, Carnegie Mellon University, specializing in speech and language processing. Her research broadly focuses on building intelligent systems that can communicate with humans naturally. Soumi holds a Ph.D. from the Graduate Center, City University of New York (CUNY), supported by the Graduate Center Fellowship under the guidance of Prof. Michael Mandel. She earned her B.Tech. in Computer Science from the Indian Institute of Engineering Science and Technology, Shibpur. Previously, she contributed to the Text-To-Speech team at Apple and gained experience at Google as a student researcher and at Interactions LLC as a research intern. Additionally, she served as an adjunct lecturer at Brooklyn College, CUNY, for three years and as a Math Fellow at Hunter College. She has served as session chair at ICASSP 2024, ICASSP 2023, SLT 2023, and others, and as area chair at EMNLP 2023.

