HACKER Q&A
📣 desertraven

Best method to classify short audio events in real time?


I don't have much experience with statistics (or ML), and a lot of the articles I've found are quite complicated for something I expected to be simple.

There are four distinct sounds I need to detect in real time with an embedded device. Think a clap sensor, but with 4 different sounding claps.

How might I go about this? How much training data (if any) do I need to collect? Is there an off-the-shelf method to classify a few different audio events with high accuracy, and then deploy that on a microcontroller (or even a computer at this point)?

Thanks!


👤 tkanarsky Accepted Answer ✓
Edge Impulse does everything you described. It has a really nice web UI that lets you collect and annotate data, extract features, train a model, and bake it into a microcontroller image for inference. It supports a good chunk of microcontrollers and SBCs out there.

https://docs.edgeimpulse.com/docs/development-platforms/full...


👤 t0mas88
This book has a good example of using TensorFlow Lite on a microcontroller for speech recognition of a few commands. That would probably work for different sounds too: https://www.amazon.com/TinyML-Learning-TensorFlow-Ultra-Low-...

(And it's overall a nice book, very easy to read and follow along the examples)


👤 simne
I think this task is a lot more about DSP and very little about ML, just Bayes classification (I will write more on that later).

Best book I know on DSP:

The Scientist and Engineer's Guide to Digital Signal Processing, by Steven W. Smith, Ph.D.
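
To make the "mostly DSP, a little Bayes" idea concrete, here is a rough sketch, not a tested design. It assumes each of the four sounds has most of its energy near a different frequency, measures that energy with the Goertzel recurrence (a cheap single-frequency DFT that suits microcontrollers), and feeds the log powers to a Gaussian naive Bayes classifier. The sample rate, frame length, probe frequencies, and the `synth_event` stand-in for real recordings are all made up for illustration; real claps would need probe frequencies chosen from actual recordings.

```python
import math
import random

SAMPLE_RATE = 8000                      # Hz (assumed)
FRAME_LEN = 256                         # samples per analysed frame (assumed)
PROBE_FREQS = [500, 1000, 2000, 3000]   # Hz, hypothetical probe frequencies


def goertzel_power(frame, freq, sample_rate=SAMPLE_RATE):
    """Squared DFT magnitude of `frame` at `freq`, via the Goertzel recurrence."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq / sample_rate)
    s_prev, s_prev2 = 0.0, 0.0
    for x in frame:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    power = s_prev * s_prev + s_prev2 * s_prev2 - coeff * s_prev * s_prev2
    return max(power, 0.0)  # guard against tiny negative rounding errors


def features(frame):
    """Log power at each probe frequency (small offset avoids log(0))."""
    return [math.log(goertzel_power(frame, f) + 1e-9) for f in PROBE_FREQS]


def train(examples):
    """examples: {label: [feature vectors]} -> {label: (means, variances)}."""
    model = {}
    for label, vecs in examples.items():
        n, dims = len(vecs), len(vecs[0])
        means = [sum(v[d] for v in vecs) / n for d in range(dims)]
        variances = [sum((v[d] - means[d]) ** 2 for v in vecs) / n + 1e-6
                     for d in range(dims)]
        model[label] = (means, variances)
    return model


def classify(model, vec):
    """Return the label with the highest Gaussian log-likelihood (equal priors)."""
    def log_lik(means, variances):
        return sum(-0.5 * math.log(2.0 * math.pi * var) - (x - m) ** 2 / (2.0 * var)
                   for x, m, var in zip(vec, means, variances))
    return max(model, key=lambda label: log_lik(*model[label]))


def synth_event(freq, rng):
    """Synthetic stand-in for a recording: a decaying tone plus Gaussian noise."""
    return [math.exp(-n / 80.0) * math.sin(2.0 * math.pi * freq * n / SAMPLE_RATE)
            + rng.gauss(0.0, 0.05) for n in range(FRAME_LEN)]


def demo(seed=0):
    """Train on 20 synthetic examples per class, then classify one new example each."""
    rng = random.Random(seed)
    classes = {"A": 500, "B": 1000, "C": 2000, "D": 3000}
    model = train({label: [features(synth_event(f, rng)) for _ in range(20)]
                   for label, f in classes.items()})
    return {label: classify(model, features(synth_event(f, rng)))
            for label, f in classes.items()}


if __name__ == "__main__":
    print(demo())
```

The inner loop of `goertzel_power` is one multiply and two adds per sample per probe frequency, which is why this kind of approach can fit on small parts. Real sounds are messier than decaying tones, so treat the features here as a starting point and expect to add an onset detector and more (or different) bands.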