Uncategorized : Mobile Intelligence Lab

Bandhav Veluri gets his Ph.D. and joins the Meta Llama team!

Congratulations to Bandhav on his impactful Ph.D. work. We will miss you!

Knowledge boosting (Interspeech’24)

Models for low-latency, streaming applications could benefit from the knowledge capacity of larger models, but edge devices cannot run these models due to resource constraints. A possible solution is to transfer hints during inference from a large model running remotely to a small model running on-device. However, this incurs a ...

Our work on full-duplex dialogue agents to appear at EMNLP 2024 main conference

Despite broad interest in modeling spoken dialogue agents, most approaches are inherently “half-duplex”: restricted to turn-based interaction, with responses requiring explicit prompting by the user or implicit tracking of interruption or silence events. Human dialogue, by contrast, is “full-duplex”, allowing for rich synchronicity in the form of quick and dynamic ...

Target conversation extraction: Source separation using turn-taking dynamics (Interspeech’24)

Extracting the speech of participants in a conversation amidst interfering speakers and noise presents a challenging problem. In this paper, we introduce the novel task of target conversation extraction, where the goal is to extract the audio of a target conversation based on the speaker embedding of one of its ...

AI-powered headphones can tune into a single voice in a crowd

Check out some media coverage of our work: https://www.popsci.com/technology/ai-headphones-noise-cancelling https://www.technologyreview.com/2024/05/23/1092832/noise-canceling-headphones-use-ai-to-let-a-single-voice-through

“Look Once to Hear” received a CHI honorable mention award!

Check out the project: https://tsh.cs.washington.edu/

In crowded settings, the human brain can focus on speech from a target speaker, given prior knowledge of how they sound. We introduce a novel intelligent hearable system that achieves this capability, enabling target speech hearing to ignore all interfering speech and noise, but the ...

MIT Technology Review covers our work on AI headphones

Check out the article: https://www.technologyreview.com/2023/11/09/1083145/noise-canceling-headphones-could-let-you-pick-and-choose-the-sounds-you-want-to-hear/

Semantic Hearing: Programming Acoustic Scenes in Real-Time (UIST’23)

Imagine being able to listen to the birds chirping in a park without hearing the chatter from other hikers, or being able to block out traffic noise on a busy street while still being able to hear emergency sirens and car honks. We introduce semantic hearing, a novel capability for ...

Acoustic swarms and speech zones (Nature Communications’23)

Imagine being in a crowded room with a cacophony of speakers and having the ability to focus on or remove speech from a specific 2D region. This would require understanding and manipulating an acoustic scene, isolating each speaker, and associating a 2D spatial context with each constituent speech. However, separating ...

Shape-changing origami microfliers (Science Robotics’23)

Using wind to disperse microfliers that fall like seeds and leaves can help automate large-scale sensor deployments. Here, we present battery-free microfliers that can change shape in mid-air to vary their dispersal distance. We design origami microfliers using bi-stable leaf-out structures and uncover an important property: a simple change in ...