Is there no company smaller than Google or Amazon or Apple willing or able to make a better smart home device? Especially now with AI solving the language-proficiency problem in talking to these devices, it seems like the rest of the product is just API integrations with a few hundred smart home hardware vendors, and some easy lookups like weather and calendar.
Is there something I'm missing about why the current generation of these still suck so bad? As a weekend project I bolted speech-synthesis and speech-recognition to ChatGPT (and have seen similar "Show HN" posts) and immediately had a more interesting conversational partner than Alexa or, god help them, Siri has ever been.
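(For the curious: that glue is not much more than a loop like the sketch below. It's a rough illustration rather than my exact code, assuming the SpeechRecognition, openai, and pyttsx3 Python packages and an OpenAI API key; the model name and prompt are placeholders.)

    # Listen -> transcribe -> ask the LLM -> speak, in a loop.
    import os
    import speech_recognition as sr
    import pyttsx3
    import openai

    openai.api_key = os.environ["OPENAI_API_KEY"]     # legacy openai<1.0 client style

    recognizer = sr.Recognizer()
    tts = pyttsx3.init()
    history = [{"role": "system", "content": "You are a helpful home assistant."}]

    while True:
        with sr.Microphone() as mic:                   # default input device
            audio = recognizer.listen(mic)             # blocks until a phrase is captured
        try:
            text = recognizer.recognize_google(audio)  # any STT backend would do here
        except sr.UnknownValueError:
            continue                                   # nothing intelligible; keep listening
        history.append({"role": "user", "content": text})
        reply = openai.ChatCompletion.create(
            model="gpt-3.5-turbo", messages=history
        ).choices[0].message.content
        history.append({"role": "assistant", "content": reply})
        tts.say(reply)                                 # speak the answer
        tts.runAndWait()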
Why are three of the most profitable technology companies in the history of mankind unable to come up with anything better than Google Home, Alexa, and Siri? Is there some unusual challenge here that I fail to understand?
Plenty of people I know would spend $500-1000 (not to mention the lightbulbs, appliances, or various “smart” gadgets) on just the smart home “brain” itself, and not even for something HN folks might consider “ideal” (privacy respecting, no ads etc), but just for a device that actually, usefully, and consistently does most of the things these devices purport to do. Why can’t three different trillion-dollar companies even manage that, and more to the point, why aren’t there more billion-dollar companies even trying?
The whole IoT era is plagued by draconian, old-world corporate power-control games, crude stuff compared to where PCs had been headed. Some healthy standards & protocols that actually leave users in power would start to clean up the toxic-waste-dump operating regime these products live in now. The usual offenders:
1. Requiring an always-on connection to the Internet, regardless of whether a local wifi peer network would suffice.
2. Requiring a proprietary hub in the cloud that serves as a bidirectional relay. Extra points if you can charge a monthly fee for this service.
3. Requiring anytime remote firmware updates that the user cannot veto.
All these things enable the home-automation company to snoop on everything that goes on in your house and sell this data to advertisers. They make a lot more money from this than from the sale of the devices.
And that's why I only build my own HA devices.
If you want something that's not cloud-based you can get it, but I think it costs more than that. I haven't gotten quotes myself, but per the article here it's in the thousands (tens of thousands if you go wired): https://www.electronichouse.com/home-lighting/weighing-wired...
Business-model-wise, it's probably either "pay $$$$ up front" or "pay a subscription", and how many people would really pay a subscription vs. just using the Google/Amazon/Apple systems?
Because none of those companies are in the business of providing a useful service to end users. Google sells ads, Amazon sells marketplaces, Apple sells a consumer entertainment ecosystem. None of them went into the smart home business to address the needs or solve the problems of the people using the devices. They all intended to use the devices to expand their core businesses and make them more profitable.
Matter allows different brands of smart devices to talk to each other without having to buy a separate smart hub for each brand.
"Matter products run locally and do not rely on an internet connection"
If you want to use non-HomeKit items, you can use a Raspberry Pi and run Homebridge. Or you can just stick to HomeKit items (which are unfortunately more expensive and limited in selection).
I see you’re mostly unsatisfied with voice controls, but Siri is pretty good with HomeKit controls. Or you could just use your phone.
2. OpenAI's APIs are expensive, and building an alternative datacenter for running the open-source LLMs at scale is even more expensive (and probably not allowed under their licensing). And even bigco private LLMs like Bard are presumably pretty expensive for their creators to run. So, "the AI is free" model of Siri/Alexa/Google Assistant doesn't easily port itself to the newer tech.
3. If you try to avoid that by running the models locally... Getting reasonable performance requires expensive hardware, so something like a $49 Google Home Mini is out of the question for now.
I think it's super likely that at some point, the current generation of home assistants are gonna get walloped by the better AIs we've seen developed. It just hasn't happened yet; someone needs to a) do it, and b) figure out the pricing model — either offer enough value that subscriptions become worth it vs worse-but-free alternatives, or sell some kind of larger-use-case at-home-AI box running local models that can also do home assistant stuff.
Or just wait for the cost of running the models to come down, at which point you can just 1:1 port the current experience over, except you end up with much smarter assistants.
Many people look at large corporations and, because it's the "21st century", assume that everything they make must be the best or somehow the most advanced thing, and therefore far superior. Unfortunately that couldn't be further from the truth in many ways.
Products are often just pushed out the door because companies want to be first in a growing market, without really caring about what's best for the consumer. They want what's best for their pocketbooks, what turns a profit and positions them as the leader in the minds of the masses. Security is usually an afterthought. Actual need, and what's best for humanity, is even further down that list.
Feeding big-data machines will never be what's best for individuals' needs.
ZigBee just seems way too unreliable. WiFi devices have, for me at least, been rock solid, while ZigBee stuff randomly disappears or needs to be re-linked all the time (and it is hard to debug when this happens, unlike just pinging an IP on your LAN).
Home Assistant et al. are typically overkill and another point of unreliability. Just run Mosquitto (an MQTT broker) locally, plus a local Node-RED instance if you need anything sophisticated in terms of logic.
Since you are using MQTT, it is trivial to write your own app for controlling everything from your phone. Use Tailscale if you need to do things when off-LAN.
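As a rough sketch of how little code that takes (paho-mqtt 1.x call style; the broker address and topic names are just examples, use whatever your devices publish to):

    # Minimal local control via Mosquitto: publish a command, watch state updates.
    import json
    import paho.mqtt.client as mqtt

    BROKER = "192.168.1.10"                       # your Mosquitto host (example)
    LIGHT_SET = "home/living_room/light/set"      # command topic (example)
    LIGHT_STATE = "home/living_room/light/state"  # state topic (example)

    def on_message(client, userdata, msg):
        print(msg.topic, msg.payload.decode())    # e.g. {"state": "ON"}

    client = mqtt.Client()
    client.on_message = on_message
    client.connect(BROKER, 1883)
    client.subscribe(LIGHT_STATE)
    client.publish(LIGHT_SET, json.dumps({"state": "ON", "brightness": 128}))
    client.loop_forever()                         # keep listening for state changes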
Door & window sensors, temp/humidity sensors, and locks are 'endpoints' and use low power; light bulbs, strips, and switches act as mesh nodes.
Highly responsive and seems reliable. I use Apple gear.
There's just no market. It looked snazzy when it was coming out, there was good hype, and then people actually experienced them. It's just not that cool and less useful than a clapper.
To make money you need to keep selling devices, and that is hard. Having someone first pay for devices and then a monthly fee is much more profitable. This model applies to many industries.
I need a smart home where the built-in devices communicate and change behaviour based on sensor input, with a controller where I'm in control of the software and where open standards are used. This also means that everything should work without internet access. Needing internet to turn on a light is pretty stupid. I'm thinking of you, IKEA.
I'm not going to waste my money on any smart devices before this happens.
It's definitely doable, and I still have my own custom assistants in the house. However, I had to make do with a Snowboy model for hotword detection (and Snowboy is now basically abandoned), a Mozilla DeepSpeech model for speech-to-text (and that's quite heavy), and Mycroft's mimic3 text-to-speech model (and Mycroft is now basically bankrupt). Then writing the integration is relatively easy - I used Platypush, but it can definitely be done with Home Assistant and openHAB too.
Compared to 3-4 years ago, I think we're now in a state where the content is no longer the issue (just plug into an LLM, and all of your text requests will get an answer), nor are integrations a problem (just write a Platypush event hook on speech detected, and you can connect it to everything, no need for "Works with Google/Alexa" labels). Text-to-speech synthesis has also become cheap and ubiquitous.
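The hook itself can be as simple as the sketch below - a generic illustration of the idea rather than Platypush's actual API; the handler names and the regex are placeholders:

    # Generic shape of a "speech detected" hook: route known commands locally,
    # fall back to an LLM for everything else. Names here are illustrative only.
    import re

    def turn_on_light(room):            # placeholder for your MQTT/Zigbee call
        print(f"light on in {room}")

    def ask_llm(text):                  # placeholder for an LLM API or local model
        return f"(answer to: {text})"

    def on_speech_detected(phrase: str) -> str:
        match = re.match(r"turn on the (\w+) light", phrase.lower())
        if match:
            turn_on_light(match.group(1))
            return "Done."
        return ask_llm(phrase)          # anything unrecognized goes to the LLM

    # wired up as: hotword -> speech-to-text -> on_speech_detected -> text-to-speech
    print(on_speech_detected("Turn on the kitchen light"))
    print(on_speech_detected("What's the boiling point of water?"))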
But the hotword detection and speech-to-text models are still IMHO the bottleneck. Hotword detection is a field where you need a very small and lightweight model that detects only a specific word or phrase in a very reliable way. Snowboy was an amazing FOSS project - it also came with the cool idea of "crowd-sourced models", where in order to download a model for a certain hotword you were first supposed to provide three audio samples of yourself saying that word, to improve the model. But it's now discontinued because it cost the volunteers too much to run the infra.
And Mozilla DeepSpeech is a relatively good choice for general-purpose speech-to-text, but it's heavy (it takes 100% of the CPU when it runs on a Raspberry Pi) and it's mostly optimized for English - even support for other Western languages is patchy.
If there are other open-source alternatives that solve these problems, I'd be very happy to learn about them. Once these blockers are removed, there should be really no reason for anyone to feed their audio streams to Google or Amazon.