Archive for September, 2024

Models for low-latency, streaming applications could benefit from the knowledge capacity of larger models, but edge devices cannot run these models due to resource constraints. A possible solution is to transfer hints during inference from a large model running remotely to a small model running on-device. However, this incurs a   Read More ...

Despite broad interest in modeling spoken dialogue agents, most approaches are inherently "half-duplex" -- restricted to turn-based interaction with responses requiring explicit prompting by the user or implicit tracking of interruption or silence events. Human dialogue, by contrast, is "full-duplex", allowing for rich synchronicity in the form of quick and dynamic   Read More ...

Extracting the speech of participants in a conversation amidst interfering speakers and noise presents a challenging problem. In this paper, we introduce the novel task of target conversation extraction, where the goal is to extract the audio of a target conversation based on the speaker embedding of one of its   Read More ...

Check out some media coverage of our work:
https://www.popsci.com/technology/ai-headphones-noise-cancelling
https://www.technologyreview.com/2024/05/23/1092832/noise-canceling-headphones-use-ai-to-let-a-single-voice-through

Check out the project: https://tsh.cs.washington.edu/
In crowded settings, the human brain can focus on speech from a target speaker, given prior knowledge of how they sound. We introduce a novel intelligent hearable system that achieves this capability, enabling target speech hearing to ignore all interfering speech and noise, but the   Read More ...