HACKER Q&A
📣 muedzi

I built a pipeline for in-video search at a hackathon, what're the uses?


So I attended a hackathon recently and I wanted to play around with the GCP stack. I had a lot of fun with it, and ended up using their Vision API and a number of smaller services to build a pipeline where you could upload a video and index a whole bunch of the objects in it so that you could search them. I even built a 'voice skill' where you could say something like 'find all occurrences of people' or 'start playing when a baboon appears'.
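For the curious, the core of the pipeline was roughly this shape (a minimal sketch, not the actual hackathon code; it assumes the google-cloud-vision and opencv-python packages, credentials set via GOOGLE_APPLICATION_CREDENTIALS, and an arbitrary one-frame-per-second sampling rate):

    # Sketch: sample frames from a video, label each frame with the GCP
    # Vision API, and build a {label -> [timestamps]} index for search.
    from collections import defaultdict

    import cv2  # opencv-python
    from google.cloud import vision


    def index_video(path, frames_per_second=1.0, min_confidence=0.7):
        client = vision.ImageAnnotatorClient()
        cap = cv2.VideoCapture(path)
        fps = cap.get(cv2.CAP_PROP_FPS) or 30.0
        step = max(int(fps / frames_per_second), 1)

        index = defaultdict(list)  # label -> list of timestamps (seconds)
        frame_no = 0
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            if frame_no % step == 0:
                ok, jpeg = cv2.imencode(".jpg", frame)
                if ok:
                    response = client.label_detection(
                        image=vision.Image(content=jpeg.tobytes())
                    )
                    timestamp = frame_no / fps
                    for label in response.label_annotations:
                        if label.score >= min_confidence:
                            index[label.description.lower()].append(timestamp)
            frame_no += 1
        cap.release()
        return index

The 'voice skill' then mostly reduces to lookups over that index: 'find all occurrences of people' is index["person"], and 'start playing when a baboon appears' is just seeking to min(index["baboon"]).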

It was really fun, but I suspect there might actually be a use for it somewhere out there.

I'm not a business or startup guy, but am willing to explore it as a side project. Any ideas where this might be useful?

The one thing that came to mind was analyzing security footage, but beyond that I'm blank.


  👤 hos234 Accepted Answer ✓
Probably in sports broadcasting/sports analytics. They do a lot of manual lookups for replays, identifying important moments, filtering out wrong camera angles, etc. There may be some simple cases that can be automated. Looking at what Hawkeye, Playsight SmartCourt, etc. do might give you some ideas.

👤 throw394812
Hey, send me an email, I'm interested.

👤 verdverm
Sales call analytics

👤 aisafetyceo
I went down that path as conceptual learning for building a voice-controlled website-building system.

- you can't count on making much money just reselling software that runs on Larry Page's computers, because he is counting on making that money himself by reselling software on his computers ... https://teachablemachine.withgoogle.com/

Also, more seriously: the latency of a round trip to Google's servers and back puts serious applications out of reach.

For example, combine the concept with this https://comma.ai/shop and there's a genuine billion dollars waiting to be thought out.

Another example: https://www.indus.ai, but powered by many cheap https://amzn.to/2Xg4JkB

The path forward is to write a learning algorithm and to see your work as that of someone who creates virtual computers. Screenless computing is a big deal - Microsoft Surface Earbuds will have basic Office integration - but imagine if your pipeline could communicate the contents of PDFs, websites, etc. without a screen around.

There are the obvious consumer products, but also imagine being able to work with a computer in new environments. There's also the non-obvious: imagine one smartphone camera pointed at a room, where every person in that room can access their own virtual computer using the camera or their voice. You could sell kits that transform Raspberry Pis into useful robots.

- 3D reconstruction is cleaner and more powerful than deep learning
- stitching together screenshots of websites
- learning as environment mirroring

It's also useful to think about the practical economics of distributing the technology ... Android phones go for anywhere from $70 to $1000, but they essentially all run the same free, open-source software.

So there are not many dollars that any individual can extract from the distribution of "AI" software.

What I would do is delve into the technical software reasons behind Google's acquisition of Fitbit.

Spoiler: how does Fitbit allow developers to build with JavaScript and CSS without running a "browser"?

The point here is that Larry Page is eyeing the virtual-software-developer opportunity, and so are you! In order to extract value from the vision-recognition opportunity, you must spend a lot of energy writing rules.

It would be efficient if the end user could program their camera on the fly, using natural language, without an internet connection, on any device.

This north star is the maker's definition of an AGI ... and as an independent business, you can't afford to work on a subset of the problem.