Any advice on where to start? Any resources (paid or free) that you can recommend? I am not looking to become a Machine Learning engineer or anything similar; rather, I want to effectively understand the deluge of tools and language models released every day, along with their capabilities and uses.
I've been using Perplexity Pro, which has several engines available. It's been extremely frustrating with vast oscillations in quality; many prompts are completely ignored; the output has seemingly degraded in intelligence and quality; any work with files leads to loops of stupidity; and customer support ignores most support requests.
I tried Gemini (free version) through an API and it's terrible so far.
On a positive note, Perplexity has, when not malfunctioning, been surprisingly helpful in writing Python code for various tasks. But all the frustrations, canned ethics, and inane "safety" features have me wanting a more tunable and effective option, and this is where everything gets pretty confusing and convoluted for me.
Google's AI offerings are clearly priced, with token caching, cloud tiers, and other options, but I still find it confusing, and I expect the cost to exceed the value I'd get compared to Perplexity or OpenAI.
I'm in my third week of exploring all this and still quite ignorant and dumbfounded.
Edit: I should add that, making this endeavor maximally difficult, I am doing my best not to install anything from beyond the repositories for my Linux distro. Otherwise I would require open source, a PGP signature, and an established user base.
If you are just looking to use LLMs, you are basically either going to run smaller models locally with ollama, or use an online API to run inference in the cloud. All of this is pretty easy.
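To make the API route concrete, here is a minimal sketch of calling an OpenAI-compatible chat endpoint with only the standard library. The base URL, model name, and key are placeholders (ollama also exposes this same request shape locally, typically at `http://localhost:11434/v1`, though check your version's docs):

```python
import json
import urllib.request

def build_chat_request(base_url: str, api_key: str, model: str, prompt: str):
    """Build an HTTP request for an OpenAI-compatible /chat/completions endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )

def ask(base_url: str, api_key: str, model: str, prompt: str) -> str:
    """Send the prompt and pull the assistant's reply out of the JSON response."""
    req = build_chat_request(base_url, api_key, model, prompt)
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]

# Usage (needs a real endpoint and key):
# print(ask("http://localhost:11434/v1", "dummy", "llama3", "Hello"))
```

The point is that "using an API" is just one POST request and a bit of JSON; every provider's SDK is a wrapper around roughly this.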
If you are trying to learn ML in depth, then you just have to basically start from scratch.
Start with Karpathy's micrograd project. It basically shows you how stuff works under the hood. Get the repo, run the examples, play around with it. Try to figure out how to approximate a mathematical function, i.e. you give it one input and the output is a single value.
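The core idea micrograd demonstrates fits in a few dozen lines: a scalar `Value` that records the graph of operations that produced it, then backpropagates gradients with the chain rule. This is a pared-down illustration of the concept, not micrograd's actual code:

```python
class Value:
    """A scalar that remembers how it was computed, so gradients can flow back."""
    def __init__(self, data, _children=()):
        self.data = data
        self.grad = 0.0
        self._backward = lambda: None
        self._prev = set(_children)

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data + other.data, (self, other))
        def _backward():  # d(a+b)/da = 1, d(a+b)/db = 1
            self.grad += out.grad
            other.grad += out.grad
        out._backward = _backward
        return out

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data * other.data, (self, other))
        def _backward():  # d(a*b)/da = b, d(a*b)/db = a
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad
        out._backward = _backward
        return out

    def backward(self):
        # Topologically sort the graph, then apply the chain rule in reverse.
        topo, visited = [], set()
        def build(v):
            if v not in visited:
                visited.add(v)
                for child in v._prev:
                    build(child)
                topo.append(v)
        build(self)
        self.grad = 1.0
        for v in reversed(topo):
            v._backward()

# z = x*y + x, so dz/dx = y + 1 and dz/dy = x
x, y = Value(3.0), Value(4.0)
z = x * y + x
z.backward()
print(x.grad, y.grad)  # 5.0 3.0
```

Once this clicks, PyTorch's autograd is the same machinery generalized to tensors.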
Then familiarize yourself with PyTorch. Start with the official tutorials, and try stuff.
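A good first PyTorch exercise (assuming `torch` is installed) is exactly the function-approximation toy described above: a tiny MLP fitted to sin(x). The layer sizes and hyperparameters here are arbitrary choices, not a recommendation:

```python
import torch
import torch.nn as nn

# A tiny MLP that learns to approximate sin(x) on [-pi, pi].
torch.manual_seed(0)
xs = torch.linspace(-torch.pi, torch.pi, 256).unsqueeze(1)  # shape (256, 1)
ys = torch.sin(xs)

model = nn.Sequential(nn.Linear(1, 32), nn.Tanh(), nn.Linear(32, 1))
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.MSELoss()

for step in range(500):
    opt.zero_grad()
    loss = loss_fn(model(xs), ys)
    loss.backward()
    opt.step()

print(f"final loss: {loss.item():.4f}")
```

This one loop (forward, loss, backward, step) is the skeleton of essentially every training script you will write later.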
This is also a good read: https://pytorch.org/blog/inside-the-matrix/
Then you basically have to grind out the following steps:
1. Read a paper about some model. Start with small papers like digit recognition with MNIST. Then move on to things like image recognition, etc. Generally you want to cover basic classification models, object detection models that draw bounding boxes around stuff, and things like autoencoders.
2. Implement the model in PyTorch (or a model application, like recreating the deepfake face swap with autoencoders). This is going to be the hardest part to grind out, but it gets exponentially easier after you start because you will realize that a good portion of the stuff is boilerplate code.
3. Load the weights downloaded from the internet.
4. Run the model and verify it works.
5. Retrain the model. If you don't want to generate a dataset, you can basically just shuffle the labels. But different datasets are available for different models.
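Steps 2–4 above follow a pattern that looks roughly like this sketch (assuming `torch` is installed; the architecture and file name are illustrative, and the real architecture must match whatever checkpoint you downloaded):

```python
import torch
import torch.nn as nn

# Step 2: implement the architecture.
class DigitNet(nn.Module):
    """A toy MNIST classifier: 28x28 image in, 10 class logits out."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Flatten(),
            nn.Linear(28 * 28, 128), nn.ReLU(),
            nn.Linear(128, 10),
        )

    def forward(self, x):
        return self.net(x)

model = DigitNet()
# Step 3: load downloaded weights (file name is a placeholder).
# state = torch.load("mnist_weights.pt", map_location="cpu")
# model.load_state_dict(state)
model.eval()  # inference mode: disables dropout / batchnorm updates

# Step 4: run it and verify shapes/outputs make sense.
with torch.no_grad():
    fake_batch = torch.randn(8, 1, 28, 28)  # stands in for real MNIST images
    logits = model(fake_batch)
    preds = logits.argmax(dim=1)
print(preds.shape)  # torch.Size([8])
```

`load_state_dict` will loudly refuse mismatched weights, which is itself a useful check that your reimplementation matches the paper.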
As a cheat, the tinygrad repo has some of these implemented, so you can look at the code and reimplement it in PyTorch. https://github.com/tinygrad/tinygrad/tree/master/examples
Then familiarize yourself with LLMs. Karpathy's nanoGPT is a good start. Then same idea: read -> recreate. For example, read about flash attention and then try to recreate it.
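For the flash attention exercise, a useful first step is the naive baseline it must match: flash attention computes exactly scaled dot-product attention, just tiled so the full attention matrix is never materialized. A sketch (assuming torch >= 2.0 for the built-in reference):

```python
import math
import torch
import torch.nn.functional as F

def naive_attention(q, k, v):
    """Plain scaled dot-product attention: softmax(Q K^T / sqrt(d)) V.
    Flash attention produces the same output, but computes it block by block."""
    d = q.shape[-1]
    scores = q @ k.transpose(-2, -1) / math.sqrt(d)  # (..., seq, seq) matrix
    return scores.softmax(dim=-1) @ v

q = torch.randn(2, 4, 16, 32)  # (batch, heads, seq_len, head_dim)
k = torch.randn(2, 4, 16, 32)
v = torch.randn(2, 4, 16, 32)

out = naive_attention(q, k, v)
ref = F.scaled_dot_product_attention(q, k, v)  # may dispatch to a fused kernel
print(torch.allclose(out, ref, atol=1e-5))  # True
```

Checking your recreation against a trusted reference like this is the same "implement, then verify" loop as the steps above.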
Also, separately, you want to look at the Hugging Face Accelerate library and learn how to work with the available hardware for both inference and training, i.e. try to use all the compute resources (CPU/GPU/disk/RAM) on your box to run stuff.
If you can do all of this, you will be significantly more skilled than a lot of ML people in Big Tech.
As a bonus, you can also explore tinygrad and see how things actually happen on the GPU when you run a model.