previous thread: https://news.ycombinator.com/item?id=39263664
I also built this incremental clicker game where you split words ad infinitum (like Infinite Craft but in reverse):
The next steps are to adapt the content based on realtime user feedback or queries like "What happened on this street during the French Revolution?" or "Tell me more about this building’s history." as well as offering step by step itinerary suggestions.
I believe it's now possible to build an Augmented Reality solution, where you leave your phone in your pocket, put some airpods and you listen or converse with your personal tour guide and fully immerse yourself in history.
[2] https://apps.apple.com/us/app/id6446038181?uo=4
[3] https://play.google.com/store/apps/details?id=com.audiala.au...
I suck at web design, so I built a novel tool [0] that streamlines this flow:
webpage —> screen capture —> LLM prompt —> design feedback
It’s a browser extension that lets you capture a portion of the webpage —> the image gets sent to an LLM with a prompt —> the LLM gives design feedback inside the extension
As far as I know, this is the only tool I’ve seen that does this, and it’s completely free. It’s called Design CoPilot, and I haven’t done any real marketing yet as it’s in Beta.
Before I built this, I would manually screen capture components, then go to ChatGPT and drop the image, write a prompt telling ChatGPT it’s a design expert and I want design feedback, and then read its reply and implement the feedback.
I’d greatly appreciate if you try the tool and leave a review!
[0] - https://chromewebstore.google.com/detail/design-copilot/hgal...
So I built a tool called Uiino. It lets you make story maps with plain text. Later, I added AI. Now it can generate maps instantly.
It's helped tons of people map out their apps. Gives them a clear bird eye view of their apps/flows.
Try → https://www.uiino.com (No signup required)
P.S. We just hit 10,000 website visitors. And 300+ monthly active users. It's all free. I'm pretty stoked about it.
Currently it supports English, German and Spanish, but I plan to add more languages in the future.
As a bonus you can also add words from yt videos and websites.
Now my product manager and customer support teams use it for most small questions / charts needed.
It’s one of the first AI clients I think. I’ve been working on it for 16 months and shipped more than 100 releases.
It started with the idea of invoking an LLM within the Apple Notes app. I hate going back-and-forth between the Notes app and ChatGPT web app.
So I built the “AI inline” feature where I can trigger a keyboard shortcut and it would send the prompt to OpenAI and get the answer back.
But this is quite advanced, not many users can use it.
So later I added the full Chat UI.
I don’t want to charge a subscription and manage AI tokens so I sell perpetual licenses with 1 year of update. Users bring their own API keys.
As an all-in-one client, it unlocks some interesting features that I didn’t expect in the beginning: the ability to switch between multiple AI services and models.
This allows me to use Web Browsing with Claude, for example.
I later added support for Function Calling and this has unlocked many capabilities: edit video with ffmpeg plugin (give instructions to the LLM and it would execute the ffmpeg command locally), search the web using advanced search engine such as Kagi, WolframAlpha, Analyze documents and source code similarly to Claude Project…
And finally I added the AI Command feature where you can prompt directly with the highlighted text. Imagine you’re reading an article and wanted to list key takeaways, you can press a keyboard shortcut, choose the command and it would show you a quick answer. You can build any command you want: from summarization to translation or grammar fix…
I also built another free app called ShotSolve[1]. It allows you to take a screenshot with a keyboard shortcut and quickly analyze it with an LLM.
During building BoltAI, a customer asked for the ability to analyze PDF documents so I decided to build another app for that use case called PDF Pals[2]
These apps allow me to quit my job and pursue solopreneurship full time.
[0]: https://boltai.com
[2]: https://pdfpals.com
Unlike previous efforts in this space, the technique I am using consumes very little context, and I'm hoping to get it running on consumer GPUs.
Also made https://resgen.app to help candidates tailor resumes
It uses WebLLM so I don't have to pay/send data to an AI provider.
You enter a starting URL, describe the data you want in a prompt, the AI suggests columns for the output spreadsheet which you can customize, and then goes off and turns the website into structured data into a CSV file.
It also supports limits, you can say for example "visit at most 100 pages" and it will stop after 100 pages.
It was easier said than done to get prompts working as intended and the crawler to focus on the most relevant URLs first. As always, the final 20% end up taking up 80% of the development time.
It's similar to this AI-driven chat assistant for ECE 120 course at UIUC:
I have been working on it on the side for the last 17 months.
- Quick project structure and dependency analysis - Smart file exclusion (respects .gitignore) - Tailored analysis based on user expertise - README generation from codebase insights
Recent updates include enhanced CLI aesthetics and multi-LLM provider support. Docs and additional tests are in progress.
Curious to hear your thoughts or feature suggestions! (And if you visit, I'd love to know if you came from this HN thread )
[1]:https://www.npmjs.com/package/sourcesailor?activeTab=readme
I found in a recent project that it was interesting to compare the answers between GPT, Gemini, Claude, and Mistral. I have better results by combining the parts I like in each answer. It's also easier to ask follow up questions directly from my email client regardless of which device I'm currently using.
I built the app in a weekend with Ruby on Rails and AWS SES. I'm now wondering if it could be useful to other people? It's already open source and I could probably make the hosted version free by allowing users to provide their own API keys.
The goal is to take all of the busy work out of task management so you can always know what your focus should be on and not waste brain power worrying about missing something important.
You can see the earnings report and calls side by side and ask questions about them.
I'm also working on adding a live calls feature that can trigger workflows. So you'd pose a question before the call, define some answer structure (using structured outputs) and then connect it to some kinda of integration (e.g. invoke a lambda, slack/discord message, etc.)
As the call happens in real time I transcribe it and run the output by the llm to answer your question.
I see it being useful for fed talk, earnings calls, and open to ideas for other data sources!
- tool to manage examples dataset for another ai application
- macos transcribe app that allows me to hotkey to speak to notion page, with a little status bar menu to select pages
All relatively trivial but I'd never bother to create them manually
Right now it’s just what i got up in a hurry last year, covers Seattle area only. Next year will cover SF and LA at least (and have a lot more features).
The data is quite granular (each row is one batch, say x age group at y location with z theme) and I do a lot of structuring data with prompts and now OpenAI’s Structured Outputs. I think the data gathering would have made this cost prohibitive otherwise.
I’m not engineer by training or work experience, so this also would have been impossible for me to build without ChatGPT’s help coding.
I couldn't keep up with my news so I made the perfect summarizer that goes through the thought process of the author : https://github.com/thiswillbeyourgithub/WDoc
I needed an AI based system that go through my anki cards, but might as well make it able to read dozens of file formats. Now I can put entire medical youtube playlists, conferences, anki databases, hundreds of PDFs and ask a single question across all of them at once .
It's both the same project
https://incoherency.co.uk/encmech/
https://incoherency.co.uk/blog/stories/encyclopedia-mechanic...
You'll have to create an account to see it, but I've posted a preview on Twitter: https://x.com/BonfireVTT/status/1833988532978823259
It works both on regular software and in complex games like RDR2. And it doesn't cheat by using any game-/software-specific API, nor accessibility calls, nor DOM trees. :)
https://baai-agents.github.io/Cradle/
And we're still evolving it.
It currently doesn't work on some sites.
[1] https://chromewebstore.google.com/detail/tc-summarizer/kjdfi...
Sounds like job search in future is going to be wild(er)
https://github.com/impredicative/newssurvey
It is to be extended to other sites as well. If you read the generated sample reports, you will see how it has potential to change lives.
Unlike much of the commercial stuff that people make, which almost nobody ends up using, the best part is that it is open source.
Im a little jelly of the artifacts feature in claude. Been wanting something like that but their design is better than any of my ideas.
I don't know any webdev so please forgive the jank.
You type in a class idea, then you get themed skills back in a skill tree-like diagram, with flavor text for each ability.
Still many things that could be improved, just thought it was fun.
- 10 boxes of Clothes, 100x140x170 cm, 5kg each, non stackable.
- 8 boxes of Shoes, 80x120x170 cm, 3kg each, keep boxes upright.
- Try to fit everything into a single container.
Interview Copilot for helping you ace your live coding interviews. Desktop app + companion web mode so it's truly undetectable by interview platforms, controlled by global hotkeys.
https://github.com/Merkoba/Meltdown
100% made in Python (and tk). I made my own markdown parser.
Hundreds of commands and command line arguments.
I use it everyday myself.
We tried different LLM models but always returned back to Open AI.
Link: https://acciomatrix.com