HACKER Q&A
📣 urupvog

What to do with 500M call recordings?


We have around 500 million call recordings, averaging about 1 minute each, in English, Hindi, and other languages spoken in India. What can we do with a dataset this large? What types of models could we build?


👤 muzani Accepted Answer ✓
I'd probably run it through some kind of sentiment analysis and try to model happiness/satisfaction on the y-axis against time on the x-axis.

Then you can map that to the conversations and see which words increase or decrease satisfaction. Look for sudden changes in the happiness contour.
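As a rough illustration of the "sudden changes" idea, here is a minimal sketch. It assumes you already have one sentiment score per utterance from whatever sentiment model you pick; the function name `sudden_changes`, the `threshold` value, and the sample scores are all made up for the example.

```python
from typing import List, Tuple

def sudden_changes(
    contour: List[Tuple[float, float]], threshold: float = 0.5
) -> List[Tuple[float, float]]:
    """Return (time, delta) for each point where the sentiment score
    moves by at least `threshold` relative to the previous point."""
    changes = []
    # Walk consecutive pairs of (time, score) points along the call.
    for (_, s_prev), (t_cur, s_cur) in zip(contour, contour[1:]):
        delta = s_cur - s_prev
        if abs(delta) >= threshold:
            changes.append((t_cur, delta))
    return changes

# Hypothetical contour: (seconds into the call, sentiment in [-1, 1]).
contour = [(0, 0.25), (10, 0.25), (20, -0.5), (30, -0.5), (40, 0.5)]
for t, delta in sudden_changes(contour):
    direction = "rose" if delta > 0 else "dropped"
    print(f"at {t}s satisfaction {direction} by {abs(delta):.2f}")
```

The timestamps it flags are where you would pull the transcript and look at what was just said. On real data you would probably want to smooth the scores first so a single noisy utterance does not register as a swing.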

You can also try to map that to customer surveys at the end of the call and see if you can improve perceived quality by, say, greeting callers cheerfully early on, or whether different phrases defuse anger better. Maybe even see if you can spot weird patterns, like whether certain accents trigger anger or contempt.


👤 sharemywin
Can they be transcribed with a free speech-to-text library?

Can you find an existing transcription service that you can upsell to clients?

Use that service to refine your dataset.

Then you can build your own model and cut out the service.

Assuming you have it organized by client, you could map it to industry and then build an industry classifier.
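To make the industry classifier idea concrete, here is a toy sketch using a multinomial Naive Bayes over transcript words, which is just one simple choice and not necessarily what you would ship. It assumes the calls are already transcribed and labeled with an industry via the client mapping; the `train`/`predict` names, the labels, and the sample transcripts are invented for illustration.

```python
import math
from collections import Counter, defaultdict

def train(examples):
    """examples: list of (transcript, industry_label) pairs."""
    word_counts = defaultdict(Counter)  # label -> word frequencies
    class_counts = Counter()            # label -> number of transcripts
    vocab = set()
    for text, label in examples:
        tokens = text.lower().split()
        word_counts[label].update(tokens)
        class_counts[label] += 1
        vocab.update(tokens)
    return word_counts, class_counts, vocab

def predict(model, text):
    """Return the label with the highest log-probability for `text`."""
    word_counts, class_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in class_counts:
        score = math.log(class_counts[label] / total_docs)  # class prior
        total_words = sum(word_counts[label].values())
        for token in text.lower().split():
            if token not in vocab:
                continue  # ignore words never seen in training
            # Add-one (Laplace) smoothing so unseen words do not zero out a class.
            count = word_counts[label][token]
            score += math.log((count + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical labeled transcripts, one per call.
examples = [
    ("my card payment was declined", "banking"),
    ("please check my account balance", "banking"),
    ("my flight was delayed", "travel"),
    ("i want to cancel my hotel booking", "travel"),
]
model = train(examples)
print(predict(model, "the payment from my account failed"))
print(predict(model, "cancel my flight booking"))
```

With multilingual calls, whitespace tokenization like this will be too crude, so treat it as a baseline to beat.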

Run it through sentiment analysis and translation.


👤 Spooky23
Analyze them for quality purposes to improve whatever it is that you're supposed to be doing.

👤 mtmail
Did the people consent to the recordings?