However, no one has tried making an AI wearable wrist device that isn't a smartwatch. I'm envisioning a screenless wristband that passively listens to your conversations and gathers context (but doesn't store your recordings). I imagine it being paired with your phone so that the AI can proactively send you notifications about what you've talked about, almost like having a companion always there, ready to help. This would be super cool to have as a student in a classroom because then my personal AI could help me study afterwards!
This sounds cool to me, but I'm not an engineer, so I don't know why no one has attempted this. What are the hardware or software blockers preventing this from happening? Has it not been done due to engineering constraints or is it because it's a stupid idea?
Yikes...
> but doesn't store your recordings
Still, yikes!
1. Reliable speech to text from multiple possibly identifiable speakers
2. Long-term knowledge storage and retrieval
Speech to text is a solved problem AFAIK, with one caveat: single speaker. You'd need to train a local model to reliably tell all these different voices apart (speaker diarization). No easy feat.
Assuming you have done that, you have to feed that data into a vector database so you can retrieve it when you're talking to the AI. You can't use it to train the AI because that would be too expensive. But then you hit another roadblock: either you have very good querying capabilities for that database, so you can retrieve what matters and feed it into the prompt, or your context window is huge. The latter is expensive.
Some commercial LLM products already do some form of learning from previous chats, so it might be doable from a cost perspective.
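To make the "retrieve what matters and feed it into the prompt" step concrete, here is a minimal sketch. Big assumption: `embed` is a toy bag-of-words stand-in for a real embedding model, and `top_k`, `snippets`, and the prompt layout are names I made up for illustration, not any particular vector database's API.

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z']+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def top_k(query, snippets, k=2):
    """Return the k stored snippets most similar to the query."""
    q = embed(query)
    ranked = sorted(snippets, key=lambda s: cosine(q, embed(s)), reverse=True)
    return ranked[:k]

# Hypothetical transcript snippets captured during the day.
snippets = [
    "The midterm covers chapters three and four on thermodynamics",
    "Lunch is at noon in the cafeteria",
    "Entropy never decreases in an isolated system",
]

question = "what chapters are on the midterm"
context = top_k(question, snippets, k=1)

# Only the retrieved snippets go into the prompt, not the whole history.
prompt = "Context:\n" + "\n".join(context) + "\n\nQuestion: " + question
```

The point is that only the top few matches reach the prompt, which is what keeps the context window small. How well this works depends on the embedding quality, which is exactly the "very good querying capabilities" problem.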
I think you can't fit the necessary computing power into a wristband today. It needs to take care of speech to text (again, multiple speakers), upload all of that to some cloud, and do it for hours and hours non-stop.
Maybe it could just be a smart microphone that uploads a constant stream of audio to the cloud for processing? A privacy nightmare no one is willing to touch, most likely. Would you have to ask permission from everyone in the room before you enter with your microphone?
How do you make it reliably listen to your conversations given that you may move your hands around, wear a shirt or jacket covering it, have it below a table, etc.?
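For a rough sense of scale on the "hours and hours non-stop" part, here is some back-of-the-envelope arithmetic. Assumption: uncompressed 16 kHz, 16-bit, mono PCM, a common input format for speech recognition. A real device would almost certainly compress, so treat this as an upper bound.

```python
# Assumed recording format (not from any real product spec).
SAMPLE_RATE_HZ = 16_000   # samples per second
BYTES_PER_SAMPLE = 2      # 16-bit audio
CHANNELS = 1              # mono

def raw_audio_bytes(hours):
    """Bytes of uncompressed PCM audio produced in the given number of hours."""
    return int(SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * CHANNELS * 3600 * hours)

print(raw_audio_bytes(1) / 1e6)  # 115.2 (MB per hour)
print(raw_audio_bytes(8) / 1e6)  # 921.6 (MB per 8-hour day)
```

So the raw data volume is manageable. The harder parts are keeping a microphone and radio powered all day on a wristband-sized battery, and the processing itself.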