Is active music cancelation possible?

Question

Any audio engineers out there? I don't know enough about waveforms, but I was wondering whether it might be possible to combine active noise cancelation techniques (as in AirPods or other headphones) with music fingerprinting and waveform inversion in order to make headphones that can cancel out music.

For example, let's say you want to go to a coffee shop, but don't like the music that they play. Regular active noise cancelation headphones can already filter out some of the background noise, but what if they could also recognize the song that's playing (using existing fingerprinting techniques), download it, invert the waveform, and then use the microphone to measure delay and frequency shifts in real time and try to destructively cancel it out? (Only for the headphone wearer, not at the actual source of the music.)

My hope is that while regular noise cancelation works best on repetitive waveforms (like an engine hum or an electrical whine) because it's limited to what the mic hears in real time, having the exact song downloaded ahead of time would allow you to more easily apply the corrections in sync with the waveform.

Is that conceivable?

mitchellpkt · Accepted Answer

Back in 2018 I spent a week or two messing around with this idea, and produced a hacky proof-of-concept.[1] It's not intended for real-time or production use; I just made a prototype to see if it could be done.
The README explains the method: once the contaminating song is identified, it syncs up the recordings in time with a correlation analysis, adjusts for frequency-dependent gain effects, then subtracts the undesired content.
Warning: I'm not an audio engineer, and the output sound quality is NOT good! This was just a toy project in my early days of learning to code. I assume there are much better ways to approach this that would yield significantly better results.
[1] https://github.com/mitchellpkt/tracksubtract
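
For readers who want to see the three steps above (time alignment by correlation, frequency-dependent gain, subtraction) in code, here is a minimal offline sketch in Python with NumPy. It is not taken from the tracksubtract repository. The function names (estimate_delay, shift, subtract_track) and the n_fft frame size are hypothetical choices made for illustration, and it assumes mono audio at a shared sample rate with a path from speaker to mic that does not change.

```python
import numpy as np

def estimate_delay(recording, reference):
    """Lag (in samples) at which the reference best lines up with the recording."""
    # Direct cross-correlation is O(N^2); an FFT-based version is faster for long audio.
    corr = np.correlate(recording, reference, mode="full")
    return int(np.argmax(np.abs(corr))) - (len(reference) - 1)

def shift(signal, delay, length):
    """Copy of signal delayed by `delay` samples, zero-padded or cut to `length`."""
    out = np.zeros(length)
    src_start = max(0, -delay)
    dst_start = max(0, delay)
    n = min(len(signal) - src_start, length - dst_start)
    if n > 0:
        out[dst_start:dst_start + n] = signal[src_start:src_start + n]
    return out

def subtract_track(recording, reference, n_fft=256):
    """Align the reference, fit one complex gain per frequency bin, then subtract."""
    recording = np.asarray(recording, dtype=float)
    reference = np.asarray(reference, dtype=float)
    delay = estimate_delay(recording, reference)
    aligned = shift(reference, delay, len(recording))

    # Split both signals into non-overlapping frames (zero-padding the last one).
    n_frames = -(-len(recording) // n_fft)
    pad = n_frames * n_fft - len(recording)
    rec = np.pad(recording, (0, pad)).reshape(n_frames, n_fft)
    ref = np.pad(aligned, (0, pad)).reshape(n_frames, n_fft)
    R = np.fft.rfft(rec, axis=1)
    A = np.fft.rfft(ref, axis=1)

    # Least-squares gain per bin, pooled over all frames (assumption: static path).
    gain = (R * np.conj(A)).sum(axis=0) / ((np.abs(A) ** 2).sum(axis=0) + 1e-12)
    cleaned = np.fft.irfft(R - gain * A, n=n_fft, axis=1).reshape(-1)
    return cleaned[:len(recording)], delay
```

Using non-overlapping rectangular frames keeps the sketch short, but it only approximates a real room response; overlapping windowed frames would likely sound better.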

al2o3cr · Answer

The tricky part would be maintaining the equalization + delay matching in a changing environment - imagine a person suddenly walks in between you and the speaker that's playing the music. Unless your correction system can respond in real time, you've accidentally created an audio-frequency bistatic radar :P
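
One standard way to keep the gain and delay matched as the path changes is an adaptive filter. Below is a minimal sketch of a normalized LMS (NLMS) canceller in Python with NumPy. This is a swapped-in textbook technique, not something proposed in this thread. The name nlms_cancel and the parameters n_taps, mu and eps are illustrative assumptions, and the sketch ignores the processing latency a real headphone would have to beat.

```python
import numpy as np

def nlms_cancel(recording, reference, n_taps=32, mu=0.5, eps=1e-8):
    """Subtract an adaptively filtered copy of `reference` from `recording`.

    The weights are re-estimated at every sample, so the filter can follow a
    changing acoustic path as long as the delay stays within n_taps samples.
    """
    weights = np.zeros(n_taps)
    history = np.zeros(n_taps)          # most recent reference samples, newest first
    residual = np.empty(len(recording))
    for n in range(len(recording)):
        history = np.roll(history, 1)
        history[0] = reference[n]
        estimate = weights @ history    # current guess at the music reaching the mic
        error = recording[n] - estimate
        # Normalized LMS update: step size scaled by the input energy.
        weights += mu * error * history / (history @ history + eps)
        residual[n] = error
    return residual
```

When the path changes abruptly (the person walking in front of the speaker), the residual jumps and then decays again as the weights re-converge. How fast that happens depends on mu and n_taps.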

gus_massa · Answer

It's a very interesting problem. I'm worried about echoes and distortion from crappy speakers, but it looks totally possible.

jonahbenton · Answer

I have wanted this at the speaker level, to be able to cancel environmental noise coming through, e.g., a window. A library of cancellation patterns, delivered through an aptly named sound bar.

jononor · Answer

Here is one approach that works: block/reduce the majority of sound physically. And then use a microphone with an adaptive filterbank to let in only the sounds you want to have. This is how hearing protection with speech passthrough works. Masking with a static noise source can also help.

brudgers · Answer

A. Time Domain
Fingerprinting a song introduces latency because the song has to play long enough to get an adequate sample for the fingerprint. What would the headphones do before a match is established?

B. Space Domain
Canceling requires a waveform 180 degrees out of phase. It needs to be about the same size as the song. If you stream that waveform, there are latency and mechanical rights issues. If it is stored locally, you need space for all-the-songs.

C. Music is bigger than Texas
Remixes, covers, and recordings of live performances add complexity to fingerprinting and increase the size of the phased wave database.

Good luck.
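
To put a number on how exact the "180 degrees out of phase" requirement is: for a pure tone of frequency f, adding an inverted copy that is late by a timing error tau leaves a residual of amplitude 2*|sin(pi*f*tau)| relative to the original tone. The Python sketch below just evaluates that identity. The function name residual_db is a hypothetical helper, and real music and rooms are messier than a single sine.

```python
import numpy as np

def residual_db(freq_hz, timing_error_s):
    """Level of the leftover tone, in dB relative to the original, after adding
    an inverted copy that is misaligned by timing_error_s seconds."""
    residual = 2.0 * np.abs(np.sin(np.pi * freq_hz * timing_error_s))
    # Floor the value so perfect alignment does not produce log10(0).
    return 20.0 * np.log10(np.maximum(residual, 1e-12))
```

A negative result means the tone got quieter; a positive one means the "cancellation" made it louder, which happens once the timing error approaches half a period of the tone. High frequencies are therefore much less forgiving of a given timing error than low ones.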