State of the art in CV remains video prediction: given N frames of patch input, generate the next frame.
If you are into space exploration, there are a lot of cool datasets, like the "Spot the GEO" challenge:
https://aiforspace.github.io/2021/
And if you get access to NVIDIA GPUs in the cluster, there's plenty of envelope-pushing stuff you can do with Omniverse: AI for rendering, light transport, physics simulations, etc.
Examples:
https://english.kyodonews.net/news/2021/07/6d6979264de4-japa...
https://www.latimes.com/local/lanow/la-me-ln-settlement-auti...
There are a lot of cool pretraining tasks in vision/multimodal. They are largely techniques introduced or refined by OpenAI and re-implemented as open-source PyTorch codebases, with varying degrees of success:
- Finetune your own CLIP https://github.com/mlfoundations/open_clip
- Train a (much smaller) DALLE https://github.com/lucidrains/DALLE-pytorch
- Train your own guided diffusion https://colab.research.google.com/drive/1javQRTkALBWLFWnx1K4... (pretty tough, may only be feasible on domain-specific data)
- Train a variational autoencoder (VAE)
- "VQGAN" from Heidelberg https://github.com/CompVis/taming-transformers
- "Discrete VAE", used as the backbone for OpenAI's DALL-E, reimplemented here (and other places) https://github.com/lucidrains/DALLE-pytorch
Write an aimbot for CSGO based entirely on machine vision. Perhaps you could feed the mouse and keyboard input into the tracking algorithm to improve accuracy. You need to intercept the mouse input anyway to tamper with the game state.
Predict a coin flip in real time.
Write a program that retroactively looks for a certain cat in security cam footage (I miss my cat). This is the one I actually attempted a while ago, using the dumbest method known to man: since it was an orange cat on mostly grey/green footage, I just defined a color range from dark brownish orange to light brownish orange and scanned each frame of the recording for it. It didn't work that well without defining a lot of threshold rules.
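For anyone curious what that color-range approach looks like, here is a minimal sketch in plain Python. It is not the original code: the HSV bounds, the `min_fraction` cutoff, and all function names are assumptions made up for illustration, and frames are assumed to be already decoded into rows of 8-bit (r, g, b) tuples.

```python
import colorsys

# Assumed bounds for "brownish orange"; these are guesses to tune per camera.
ORANGE_HUE = (0.03, 0.12)  # hue as a fraction of the colour wheel (about 11 to 43 degrees)
MIN_SATURATION = 0.45      # rejects grey-ish pixels
MIN_VALUE = 0.25           # rejects near-black pixels


def is_orange(r, g, b):
    """Return True if an 8-bit RGB pixel falls inside the assumed orange range."""
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    return ORANGE_HUE[0] <= h <= ORANGE_HUE[1] and s >= MIN_SATURATION and v >= MIN_VALUE


def orange_fraction(frame):
    """Fraction of pixels in a frame (rows of (r, g, b) tuples) that look orange."""
    total = sum(len(row) for row in frame)
    if total == 0:
        return 0.0
    hits = sum(1 for row in frame for (r, g, b) in row if is_orange(r, g, b))
    return hits / total


def frames_with_cat(frames, min_fraction=0.02):
    """Indices of frames where at least min_fraction of the pixels are orange."""
    return [i for i, frame in enumerate(frames) if orange_fraction(frame) >= min_fraction]
```

Working in HSV rather than raw RGB keeps "dark" and "light" orange in the same hue band, which is roughly what a dark-to-light color range is trying to express. It still fires on anything else orange in the scene, which is presumably where all the extra threshold rules come from.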
There are quite a few deterministic carnival/arcade games you could cheat with a bit of machine vision magic :^) Stacker comes to mind for example
There's also the idea, which many have shared, of using CV to detect insects (say, a cockroach) and then attacking them with some sort of weapon (everyone loves lasers, but a laser strong enough to kill an insect like that seems like it would introduce significant risk of collateral damage, so I wonder if a jet of household pesticide could be used instead). I wondered a while back whether those little Hexbug toys could be used for development.
I created a bot for the card game Set years ago using classic computer vision. Should revisit that when I get my OAK-D Lite camera.
I play cash games every week and it takes the host forever to count people's chips when they want to cash out.
Read video from my dash cam and classify the vehicles around me as police cars or not.