Best method to classify short audio events in real time?

Question

I don't have too much experience with statistics (or ML), but a lot of the articles I've found are quite complicated for something I expected to be simple.

There are four distinct sounds I need to detect in real time with an embedded device. Think a clap sensor, but with 4 different sounding claps.

How might I go about this? How much training data (if any) do I need to collect? Is there an off-the-shelf method to just classify a few different audio events to a high degree of accuracy, and then embed that to a microcontroller (even a computer at this point)?

Thanks!

tkanarsky · Accepted Answer

Edge Impulse does everything you described. It has a really nice web UI that lets you collect and annotate data, extract features, train a model, and bake it into a microcontroller image for inference. It supports a good chunk of microcontrollers and SBCs out there.

https://docs.edgeimpulse.com/docs/development-platforms/full...

t0mas88 · Answer

This book has a good example using TensorFlow Lite on a microcontroller for speech recognition on a few commands, which would probably work for different sounds too: https://www.amazon.com/TinyML-Learning-TensorFlow-Ultra-Low-...

(And it's overall a nice book, very easy to read, and the examples are easy to follow along.)

simne · Answer

I think this task is a lot more about DSP and very little about ML, just Bayes classification (I will write on it later).
Best book I know on DSP:
The Scientist and Engineer's Guide to Digital Signal Processing By Steven W. Smith, Ph.D.
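
To make the "DSP plus Bayes classification" idea a bit more concrete, here is a minimal sketch of one way it could look. This is not simne's method (that follow-up was never posted), just an illustration under assumptions: the sample rate, frame length, probe frequencies, and the four synthetic "events" (decaying tones with noise standing in for real recordings) are all made up. The DSP part measures power at a few frequencies with the Goertzel recurrence, and the classifier is a Gaussian naive Bayes with equal priors. It uses only the Python standard library.

```python
import math
import random

SAMPLE_RATE = 8000   # Hz (assumed)
FRAME_LEN = 256      # samples per analysis frame (assumed)
PROBE_FREQS = [500, 1000, 1500, 2000, 2500, 3000]  # Hz, hypothetical feature bands


def goertzel_power(frame, freq):
    """Signal power near `freq`, computed with the Goertzel recurrence."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq / SAMPLE_RATE)
    s1 = s2 = 0.0
    for x in frame:
        s1, s2 = x + coeff * s1 - s2, s1
    return max(s1 * s1 + s2 * s2 - coeff * s1 * s2, 0.0)


def features(frame):
    """Feature vector: log power at each probe frequency."""
    return [math.log(goertzel_power(frame, f) + 1e-9) for f in PROBE_FREQS]


def synth_event(freq, rng):
    """Hypothetical stand-in for a recorded event: decaying tone plus noise."""
    return [
        math.exp(-4.0 * n / FRAME_LEN)
        * math.sin(2.0 * math.pi * freq * n / SAMPLE_RATE)
        + rng.gauss(0.0, 0.05)
        for n in range(FRAME_LEN)
    ]


def fit(examples):
    """examples: {label: [feature_vector, ...]} -> {label: (means, variances)}."""
    model = {}
    for label, vectors in examples.items():
        cols = list(zip(*vectors))
        means = [sum(c) / len(c) for c in cols]
        variances = [
            sum((v - m) ** 2 for v in c) / len(c) + 1e-3  # floor keeps variance > 0
            for c, m in zip(cols, means)
        ]
        model[label] = (means, variances)
    return model


def classify(model, vector):
    """Return the label with the highest Gaussian log-likelihood (equal priors)."""
    def log_likelihood(params):
        means, variances = params
        return sum(
            -0.5 * math.log(2.0 * math.pi * var) - (x - m) ** 2 / (2.0 * var)
            for x, m, var in zip(vector, means, variances)
        )
    return max(model, key=lambda label: log_likelihood(model[label]))


rng = random.Random(0)
CLASS_FREQS = {"A": 500, "B": 1200, "C": 2000, "D": 2900}  # four made-up event types
train = {
    label: [features(synth_event(f, rng)) for _ in range(20)]
    for label, f in CLASS_FREQS.items()
}
model = fit(train)
predictions = {
    label: classify(model, features(synth_event(f, rng)))
    for label, f in CLASS_FREQS.items()
}
print(predictions)
```

On a real device you would replace `synth_event` with frames captured from the microphone (probably gated by a simple loudness threshold), and pick probe frequencies by looking at spectra of your actual four sounds. The per-class means and variances are just a small table of numbers, so the classify step ports to C on a microcontroller fairly easily.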
