Code completion was fine without LLMs, and solving problems myself usually ends up being quicker than coercing an LLM into doing it and then verifying that its output is actually correct.
The one time I used an LLM in my workflow with real success was having ChatGPT generate an enum of every EU/EEA country and a switch statement over that enum. That sort of "grunt work" task that requires no thinking, but a lot of typing, seems to be where LLMs shine.
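For illustration, here is a minimal sketch of what that kind of generated output looks like (the language choice, the truncated member list and the `vat_prefix` helper are all hypothetical here):

    from enum import Enum

    class EEACountry(Enum):
        AUSTRIA = "AT"
        BELGIUM = "BE"
        # ... remaining EU/EEA members, generated the same way
        NORWAY = "NO"

    def vat_prefix(country: EEACountry) -> str:
        # the generated "switch": one branch per member
        match country:
            case EEACountry.AUSTRIA:
                return "ATU"
            case EEACountry.NORWAY:
                return "NO"
            case _:
                return country.value

Tedious to type by hand for ~30 members, trivial to verify once generated.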
I have a few observations:
- I vastly prefer Cursor's Copilot++ UX for autocomplete to GitHub Copilot's in VS Code, which I used until a few months ago.
- The Composer multi-file editor (cmd+i) is easily its most powerful feature and what I use most often, even when I'm working on single files. It just works better for some reason.
- It's far more effective in popular stacks, e.g. TypeScript/Next.js. It's rarely a time-saver when working in Elixir, for example.
- In a similar vein, the less 'conventional' your task or code is, the less useful it becomes.
- As the context increases, it gets noticeably less useful. I often find myself having to plan what context I want to feed it and resetting context often.
- It's very effective at 'translation' tasks, e.g. converting a schema from one format to another (see the sketch after this list). It's much less effective at generating complex business logic.
- I only find it useful to generate code I confidently know how to write myself already. Otherwise, it doesn't save me time. The times I've been tempted, it's almost always bitten me.
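To make the 'translation' bullet concrete, here is a rough sketch of the kind of mechanical schema conversion that works well; the simplified JSON-Schema-to-SQL mapping is invented for illustration:

    import json

    # hypothetical, heavily simplified type mapping
    TYPE_MAP = {"string": "TEXT", "integer": "INTEGER", "number": "REAL", "boolean": "BOOLEAN"}

    schema = json.loads("""
    {
      "title": "users",
      "properties": {
        "id": {"type": "integer"},
        "email": {"type": "string"},
        "active": {"type": "boolean"}
      }
    }
    """)

    columns = ",\n  ".join(
        f"{name} {TYPE_MAP[spec['type']]}" for name, spec in schema["properties"].items()
    )
    print(f"CREATE TABLE {schema['title']} (\n  {columns}\n);")

No business logic, just a format-to-format mapping, which is exactly where the LLM rarely goes wrong.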
They may be somewhat useful for some small function/lookup, but:
- they will always, without fail, hallucinate non-existent functions, methods, types and libraries. Your job becomes a constant code review, and I'm not here to babysit an eager junior
- Claude will always, without fail, rewrite and break your code, and any fixes you've made to Claude's code. Even if you tell it not to. Again, you end up babysitting an eager junior
- for any question outside of Stack Overflow's top-3 languages and top-10 libraries, they are next to useless, as they know nothing about anything
These are not "AIs" or code assistants or autocompletes. These are generic token prediction machines which don't care if you write code or Harry Potter fanfics.
If you train them specifically on vast amounts of code and documentation, then maybe, just maybe, they may actually be useful.
It's really useful for basic stuff: scripting, boilerplate, repetitive things, SQL. For example, it's great at converting types between languages, or generating types from API reference docs.
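As a concrete (made-up) example of the 'types from API reference docs' case: paste in a REST endpoint's response description, and it comes back with something like the following. All field names here are invented:

    from typing import TypedDict

    class User(TypedDict):
        id: int
        email: str
        created_at: str  # ISO 8601 timestamp, per the (assumed) docs

    class UserListResponse(TypedDict):
        users: list[User]
        next_cursor: str | None  # None on the last page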
You've got to be careful, though, as it is not perfect, and if you do something "not typical" it will not work well. Sometimes it is also a bit outdated, produces code in a different "style", or produces just plain wrong stuff, so you have to tweak it a bit.
It's not going to code for you yet (unless you do very basic stuff) but it's a great tool to increase productivity. I do believe I move faster.
I am integrating AI translations into my custom static site generator. I will test the outcome heavily before putting my name (and a big warning) on my translated content, but the early results look good. Getting it right is a lot harder than piping the page through ChatGPT. Everything needs to be translated from the UI strings to the URL slugs.
My work will no longer just benefit English-speaking immigrants, but also locals and immigrants who Google things in other languages. I am very excited about it.
I also use ChatGPT heavily for “what’s the word/expression for” questions and other fuzzy language queries. As a non-native speaker, I want to know if a given expression is appropriate for the context, if it’s dated, if it’s too informal and if it’s dialect-appropriate.
I also use it for coding, but mostly because it’s faster than reading Python’s docs. I ask it questions I know the answer to, hoping to find better approaches. So much has happened since Python 2.7, and I don’t always know what to ask for.
On occasion I treat it like I treated my mom as a child. I ask it all sorts of questions about the world I observe around me. It’s amazing to get a short, clear answer to work from, instead of sifting through barely relevant, spammy search results. This is super helpful when getting to know new tech and figuring out what it actually is.
It’s just so damn cool to have a sort of real-life Hitchhiker’s Guide slash Pokédex in my pocket. These things appeared in the span of a year, and nobody seems impressed. Well, I am mad impressed.
- add files, tabs, terminal output and IDE diagnostics into the context via slash commands
- feed in documentation or other remote web content, also via simple slash commands
- activate workflow mode, which will help you edit the files instead of having to copy things around
- then ask questions, ask for refactoring, ask for new features
Often I like to ask for the high-level approach first, and if I agree with it, let it guide the implementation. It makes mistakes, so I always have to validate and test what it creates, but as you add more information into the context and the LLM has a good amount of material to work with, the output quality improves significantly.
It's a process, though, and it takes time to get familiar with the workflow and build intuition for when the LLM falls on its face, when you should try a different approach, etc.
Generally I can echo what others have said: it works best if you already roughly know what you want to achieve and just use the LLM as an assistant that does the grunt work for you, documents the process, acts as a rubber duck, etc.
Generally, I would not want to work without an integrated LLM anymore; it provides that much value to my workflow. No panacea, no silver bullet, but when used right in the right circumstances it can be incredibly useful.
A secondary use case for me is working on repositories where tasks and todos are structured in markdown files (even things like travel planning). Letting the LLM guide you through todos, create a documentation trail through the process, identify important todos, and carry information along as you go is wonderful. I would absolutely give that a try as well.
Whenever I use it with Rust, Golang, Scala, etc., it's not worth the effort.
* code translation - e.g. convert a self-contained implementation of a numerical algorithm from one language to another and generate test cases and property tests which make sure the implementations are equivalent (a sketch follows this list). The goal is to avoid having to proofread the generated code.
* one-off scripts - any task where code design doesn't matter, the amount of code is limited to a couple hundred lines (GPT-4o), and the result will be thrown away after use.
* API exploration - producing examples for APIs and languages I'm not fluent in. Reading reference documentation gives a better understanding; LLMs get results out faster.
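A minimal sketch of the equivalence check from the first bullet, using the Hypothesis library; `gcd_translated` stands in for the LLM's port and is compared against a trusted reference:

    from math import gcd as gcd_reference  # the trusted original
    from hypothesis import given, strategies as st

    def gcd_translated(a: int, b: int) -> int:
        # stand-in for the LLM-translated implementation
        while b:
            a, b = b, a % b
        return a

    @given(st.integers(min_value=0, max_value=10**9),
           st.integers(min_value=0, max_value=10**9))
    def test_implementations_agree(a: int, b: int) -> None:
        assert gcd_translated(a, b) == gcd_reference(a, b)

If the property test passes over a large random sample, the generated code never has to be proofread line by line.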
I do have OpenAI keys and Claude, and I use them both (just as the mood fits, to see which works best).
I've been coding for decades, so I'm quite experienced. I find that an LLM is no substitute for experience, but it definitely helps with progress. I work regularly in a range of languages: Java, JavaScript with TypeScript, pure-ESM JavaScript, Python, SQL. It's great to have a quick prototyping tool.
One key takeaway: learning to "drive the LLM" is a skill by itself. I find that some people are "hesitant" to learn this, and they usually complain about how bad the LLM is at generating code, but, in reality, they are bad at "driving" the LLM.
If I put you in an F1 car, the car would perform perfectly, but unless you had the skills to handle it, you would not win any races; you might not even get around the track once.
Also, I'm in my 60s, so this is all "new" tech. I've just never been afraid of "new" tech. I'd hate for some 30-year-old hot-shot to show me up because they learned to master the LLM tool and I just blew it off as "new tech".
Anyway, my $0.02
Working with multiple products, Python, and many different libraries and frameworks, I have to constantly look up commands and usage. With the LLM, I select a line or two and ask 'sort by region, exclude pre 2023', and it writes (or adds) the necessary Pandas calls on the dataframe. The LLM has my code, the tools' documentation and more as context, so I don't have to say much to get the right code, but the questions are important; you still have to converse with a programmer's mindset.
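For instance, on a toy dataframe (column names invented here), the prompt 'sort by region, exclude pre 2023' should come back as roughly:

    import pandas as pd

    df = pd.DataFrame({
        "region": ["EMEA", "APAC", "AMER"],
        "year": [2022, 2023, 2024],
        "revenue": [100, 200, 150],
    })

    # "sort by region, exclude pre 2023"
    result = df[df["year"] >= 2023].sort_values("region")
    print(result)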
It has almost completely replaced using Google for help with code. I often waste too much time on Slashdot looking at a dozen possibly-similar situations, but not what I need. The LLM immediately gives me something right in my own code. I usually have to give it some follow-up prompts to tweak the code or take a different approach.
My money is on Cursor [1], which never ceases to amaze me and seems to be getting a lot of traction. The integration is very clever, and it is scary how well it figures out what I intend to do. Then again, I'm probably doing mundane tasks most of the time. For the few bright moments in my day, I tend to use ChatGPT, because most of my real problems are in application domains, not in code.
I am not a firm believer in forking large open-source projects, though, as it takes a lot of effort to keep up as the codebases diverge. This makes me a bit wary of projects such as Cursor and Void [2]. Somebody needs deep pockets to sustainably surpass the popularity of VS Code. To point out just one problem with forking: VS Code works fine on Ubuntu, but Cursor does not work out of the box there. Having to disable the sandbox is a show-stopper for most.
In that respect, the extensions might be a safer bet, and I think Sourcegraph's Cody and Continue are making the largest waves there. Hard to tell with so many waves.
A couple of weeks ago I had to create some classes (TypeScript) implementing a quite big spec for a file format used to transfer quizzes between applications. I decided to try some more advanced tools, ending up with Cursor and continue.dev. I copied the spec (which is public on the web) into a text file and used it as context, together with a skeleton start for the main class I needed, and experimented with different engines and different prompts.
The end result is that I got a very good starting point for my code, saving me many hours. Surprisingly, for this task, the best result was generated by Gemini 1.5 pro.
Since then I've been trying to integrate these tools more into my day-to-day programming, with varying results, but I'm now convinced that, even with the limits of current LLMs, this technology can have a much higher impact on programming with better harnessing, e.g. by integrating it with compiler/code-analysis output and automated testing.
Two examples...
1. I recently wanted to build a Chrome plugin and had never built one before. I used o1-preview to build it all for me.
2. Wanted to build a visualization of the world with color-coded maps using D3. Again, hadn't used D3 much in the past... Claude basically wrote all the code for me and then I just had to make edits to fit my site/template.
The downside of LLMs is that they don't remove your need to know your code. And writing code yourself is a very good way to get to know it.
We also use Cursor, Copilot, Claude Dev, Aider and Plandex to see if anything is beating anything else and to get ideas for our toolkit.
Currently I estimate it writes about 70% to 80% of the code (including adding features to the tool itself), and it saves me hours of work per week.
Lately I've been using the API exclusively, since it is cheaper than paying the $20 monthly subscription.
On top of that, I have configured a ChatGPT code reviewer on my GitHub repos. Most of the comments I get from it on PRs are useless, but from time to time it does spot a problem or suggest an improvement.
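A rough sketch of how such a reviewer can be wired together; the GitHub REST endpoints are standard, but the repo, model and prompt are placeholders, and the commenter's actual setup may differ:

    import os
    import requests
    from openai import OpenAI

    REPO = "owner/repo"  # placeholder
    pr = int(os.environ["PR_NUMBER"])
    gh_auth = {"Authorization": f"Bearer {os.environ['GITHUB_TOKEN']}"}

    # fetch the PR as a raw diff
    diff = requests.get(
        f"https://api.github.com/repos/{REPO}/pulls/{pr}",
        headers={**gh_auth, "Accept": "application/vnd.github.v3.diff"},
    ).text

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    review = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": f"Review this diff for bugs:\n\n{diff}"}],
    ).choices[0].message.content

    # post the review as a PR comment
    requests.post(
        f"https://api.github.com/repos/{REPO}/issues/{pr}/comments",
        headers=gh_auth,
        json={"body": review},
    )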
I use it to understand new codebases quickly, create the documentation boilerplate for the code I'm working on that needs better docs, or update/rewrite outdated ones.
When the codebase fits in the context window, it's simple. But even when I'm working on a larger thing, it takes only a bit of RAG-alike effort to build a knowledge topology, and then it's super easy to get docs on (the actual!) architecture, on specific components, and all the way down to the atomic function level.
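A minimal sketch of that RAG-alike step, with plain TF-IDF standing in for a proper embedding model so the example stays self-contained; the paths and query are invented:

    from pathlib import Path
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    # index: naive fixed-size chunks of every source file
    chunks, labels = [], []
    for path in Path("src").rglob("*.py"):
        text = path.read_text()
        for i in range(0, len(text), 2000):
            chunks.append(text[i:i + 2000])
            labels.append(f"{path}:{i}")

    vectorizer = TfidfVectorizer()
    matrix = vectorizer.fit_transform(chunks)

    # retrieve: pull only the chunks relevant to the documentation question
    query = "how does the auth middleware validate tokens?"
    scores = cosine_similarity(vectorizer.transform([query]), matrix)[0]
    top = scores.argsort()[-5:][::-1]
    context = "\n\n".join(f"# {labels[i]}\n{chunks[i]}" for i in top)
    # `context` then goes into the prompt alongside the question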
It’s 10x easier and more direct to just do it myself, distilling my mental model into precise instructions.
Maybe it works better for languages with more boilerplate; I use Ruby, which is quite terse.
I do really like to use the plain web browser tools, though (currently claude), for generating boilerplate code that I then review and integrate carefully into my code. It has sped up my workflow in that way.
This experience of developing such a tool is invaluable; I doubt I would have learned as much about LLMs and AI service providers if I had been using some off-the-shelf software. It also allows me to adjust it to my own needs, while the "mainstream" tools usually try to please "large enterprise customers", so as a regular user you end up living with annoying bugs/quirks that will never be fixed (because no large customer asks for it).
In my opinion, the future of such tools is multimodal (both input and output). For my specific use case (hobby gamedev), the goal is to eventually become "the idea guy", who occasionally jumps into the code to perform some fun programming task, but who is not able to create assets (GFX/SFX) themselves.
For example: when I'm using my tool and I want to fix a problem like misaligned buttons, instead of typing an exhaustive prompt, I prefer to paste a screenshot and formulate the task in just a few words ("align those buttons").
Autocompletions: GitHub Copilot; it's still better than Mistral's small code models, in my opinion.
But, like everything, you have to spec it well.
Then basically any top model for duplicating work.
“Here is component B, here is component A and test A. Produce a test for B following the same pattern”
But you need to use Retrieval-Augmented Generation (or bare LLMs will either invent the nonexistent or present solutions to problems you never posed).
I’m a huge fan of keeping things simple by being verbose, and Copilot takes all the drudgery away by filling out all the repeating patterns.
Once in a while, when I can’t be bothered to figure out a TypeScript type, I ask ChatGPT to write it for me.
On a personal-assistant level, it's been useful to have it remind me of words I have forgotten, or to have it rephrase a sentence in a different way to suit some mood, etc.
Occasionally I have fun with it by having it answer in rhymes, or theatrically like a fortune teller, or like lyrics of some gangsta rap.
I don't know about performance increases. What I notice most is that I'm less annoyed with it than with the general state of the internet. The web pages and I have different goals: I want a specific piece of information, and they want me to load all their ads, affiliates and such in return, and to spend time on the page reading their drivel. The web is also horrible for version-specific information: search brings up the Ruby answer for the latest version, but I'm stuck on 2.5.0 and need that specifically. LLMs are usually great at this. Orders of magnitude less fluff; a rich, text-only answer.
I use Claude now that they've successfully made ChatGPT insufferable to use.
I haven't tried any of the special AI editors, since I don't know if I would survive the traumatic injury of having Emacs taken away from me.