HACKER Q&A
📣 _false

Who has deployed commercial features using GPT4?


I'd be curious to learn how effective GPT4 has been at enabling product features and what it means for the things we might see in the future.

In particular, I have the following questions:

1. What was the product you were working on?

2. Were there any new software engineering challenges that came from working with GPT4 (e.g. output quality, testing, monitoring, etc.)?


  👤 abdullin Accepted Answer ✓
Great question!

1. I've observed multiple products across customers.

1.1 Correcting or filling in missing information in structured data. For example, a system to suggest corrections to products in a company catalogue (each product category has a different schema). Unstructured data is pulled from various websites, and optionally from categories retrieved from images. It is then compared against the catalogue data and the most probable fixes are reported.

Most of the work is done by a few polite prompts to GPT3.5/4 (~5 English sentences in total).

1.2 Better search over company data. E.g. a chatbot for internal documentation that can also access internal services in order to answer a question. The same ~5 English sentences do the bulk of the work.

1.3 (non-commercial) Endangered language preservation. Building a smart agent that is accessible via chat/hardware (like Alexa/HomePod), talks in the native language, can understand it, and helps to preserve the culture. This is a complex one.

2. The tech stack itself is rather simple. Mostly GPT, LangChain/LlamaIndex, a vector database with embeddings for memory, plugins for external services, and potentially agents to drive workflows (a rough sketch of this pattern is at the end of this comment).

Output quality, testing, monitoring, scalability, etc. also don't differ much from operating normal "old-school" ML models. If anything, it feels simpler.

The tricky part is that the entire notion of LLM-driven micro-services is new. The quality of the resulting product largely depends on knowing prompting tricks and following the latest news in the area.

Plus the biggest challenge that customers want solved: "How can I run it on my hardware?"
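
For the curious, here is roughly what that stack boils down to. A minimal sketch only, assuming the pre-1.0 `openai` Python SDK; the documents, the in-memory list standing in for a vector database, and the prompt wording are invented for illustration (a real setup would use LangChain/LlamaIndex and a proper vector store):

    # Retrieval-augmented answering: embed chunks, find the closest ones,
    # stuff them into the prompt. Illustrative sketch, not production code.
    import os
    import numpy as np
    import openai

    openai.api_key = os.environ["OPENAI_API_KEY"]

    def embed(text):
        resp = openai.Embedding.create(model="text-embedding-ada-002", input=text)
        return np.array(resp["data"][0]["embedding"])

    # Stand-in "vector database": a list of (chunk, embedding) pairs in memory.
    docs = ["VPN setup is documented on the intranet wiki under IT/Remote.",
            "Expense reports are filed through the internal Concur portal."]
    index = [(chunk, embed(chunk)) for chunk in docs]

    def answer(question, k=1):
        q = embed(question)
        # Rank stored chunks by cosine similarity to the question.
        scored = sorted(index, key=lambda item: -np.dot(q, item[1]) /
                        (np.linalg.norm(q) * np.linalg.norm(item[1])))
        context = "\n".join(chunk for chunk, _ in scored[:k])
        resp = openai.ChatCompletion.create(
            model="gpt-3.5-turbo",
            messages=[
                {"role": "system", "content": "Answer using only the provided context."},
                {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
            ],
            temperature=0,
        )
        return resp["choices"][0]["message"]["content"]

    print(answer("How do I set up the VPN?"))

Plugins and agents would layer on top of the same loop, extending the retrieved context with tool outputs before the final call.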


👤 lukasfischer
We are building LandHive AI, which enables you to generate websites by simply prompting the system with a website title and a content briefing. See a demo: https://youtu.be/0S5rU0odTOk

My biggest challenge using the API so far has been that the output is not reliable. From time to time it randomly outputs notes and comments even though I asked it to only reply in a code block. Also, if I rerun exactly the same prompt, it can output something completely different (different content is fine, but I teach ChatGPT to follow a structure - it works just fine in 90% of the cases). I'm using 3.5, not 4.
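
A rough sketch of one way to handle that structural drift (the prompt, model choice, and structure check are just placeholders, assuming the pre-1.0 `openai` SDK): validate that the reply really is a single fenced code block and re-ask when it isn't.

    # Validate that the model replied with only a fenced code block and
    # re-ask a few times when it drifted and added extra commentary.
    import os
    import re
    import openai

    openai.api_key = os.environ["OPENAI_API_KEY"]

    CODE_BLOCK = re.compile(r"^\s*```[\w-]*\n(.*?)\n```\s*$", re.DOTALL)

    def generate_page(title, briefing, attempts=3):
        messages = [
            {"role": "system",
             "content": "Reply with a single fenced code block containing HTML. "
                        "Do not add any notes or comments outside the code block."},
            {"role": "user", "content": f"Title: {title}\nBriefing: {briefing}"},
        ]
        for _ in range(attempts):
            resp = openai.ChatCompletion.create(
                model="gpt-3.5-turbo", messages=messages, temperature=0.3)
            reply = resp["choices"][0]["message"]["content"]
            match = CODE_BLOCK.match(reply)
            if match:
                return match.group(1)  # the HTML inside the block
            # Structure was not respected: simply try again.
        raise RuntimeError("No clean code block after %d attempts" % attempts)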

The API goes down regularly. This is especially annoying if you have longer conversations. I had a hard time resuming a conversation, so I usually restart the whole process.

However, the overall capabilities are mind blowing. The system surprises me very often.

https://landhive-ai.netnode.ch/home


👤 onurgenes
We were already using LLMs at https://nureply.com to create personalization for cold emails, but GPT-4 enables us to create more specific and engaging icebreakers for our users.

The challenge comes in pricing and getting good results. Generally, the longer the prompt, the better the results, but you have to adjust accordingly.

Also, generally using only GPT-4 doesn't make sense. Mixing and matching between two different models makes sense (e.g. data extraction can be done with GPT-3.5, but writing a good email should be done with GPT-4).
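
A rough sketch of that mix-and-match idea (the prompts, fields, and example profile are invented; pre-1.0 `openai` SDK assumed):

    # Cheap structured extraction with gpt-3.5-turbo, final copywriting with gpt-4.
    import os
    import openai

    openai.api_key = os.environ["OPENAI_API_KEY"]

    def chat(model, prompt):
        resp = openai.ChatCompletion.create(
            model=model,
            messages=[{"role": "user", "content": prompt}],
            temperature=0,
        )
        return resp["choices"][0]["message"]["content"]

    profile = "Jane Doe, Head of Growth at Acme, recently wrote about onboarding flows."

    # Step 1: extraction is simple, so the cheaper model is good enough.
    facts = chat("gpt-3.5-turbo",
                 f"Extract name, role, company and one recent topic as JSON:\n{profile}")

    # Step 2: the actual writing is where the stronger model earns its price.
    icebreaker = chat("gpt-4",
                      f"Using these facts, write a one-sentence personalized cold email opener:\n{facts}")
    print(icebreaker)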


👤 hubraumhugo
IMO one of the killer use cases of GPT is reformatting information from any format X to any other format Y, and we're using this superpower in the relatively "boring" space of data extraction: https://kadoa.com can turn any website into an API.
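
As an illustration of that reformatting idea (not how kadoa actually works; the HTML snippet and output schema are made up, pre-1.0 `openai` SDK assumed):

    # "Format X to format Y": turn an HTML snippet into JSON records.
    import json
    import os
    import openai

    openai.api_key = os.environ["OPENAI_API_KEY"]

    html = """
    <div class="job"><h2>Backend Engineer</h2><span>Berlin</span><em>Full-time</em></div>
    <div class="job"><h2>Data Analyst</h2><span>Remote</span><em>Contract</em></div>
    """

    resp = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{
            "role": "user",
            "content": "Extract every job as JSON with keys title, location, type. "
                       "Reply with a JSON array only.\n\n" + html,
        }],
        temperature=0,
    )
    jobs = json.loads(resp["choices"][0]["message"]["content"])
    print(jobs[0]["title"])  # "Backend Engineer"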

👤 bluelightning2k
We have it as an option in DemoTime

For those unaware, DT produces a highlight-reel video after every software sales meeting.

Not sure if the LLM version will be on by default. The algorithmic version of DT is super strong, so just generating the scripts with GPT is MUCH worse.

For us the correct usage is to sprinkle in GPT, e.g. to also add a section to the output video which summarizes the user's goals.


👤 cloudking
I'm also curious to know: what business problems have you solved with GPT that you couldn't solve previously, or couldn't solve as effectively?

So far I've seen a ton of cool demos, but not many real-life business use cases.


👤 gimili
We're integrating GPT3/4 functionality into our hardware engineering SaaS: https://www.valispace.com/ai/

It mainly helps with 2 things:

- allowing engineers to develop their products much faster (especially doing good requirements engineering for now)

- allowing us to demo to/onboard users with data from their specific use case (prepopulating their trial account)

At first, hardware engineering does not seem like an obvious choice for LLMs, but I think it will be these vertical solutions that end up surprising us all the most.

Here are some more details on how hardware design gets concretely aided by LLMs: https://assistedeverything.substack.com/p/todays-ai-sucks-at...


👤 ShaneMcGowan
I work for Intercom.com. We are currently adding GPT-4 to the support bot.

https://www.intercom.com/ai-bot

Looks pretty cool from what I’ve seen


👤 williamstein
1. I'm integrating ChatGPT extensively into https://CoCalc.com. This integration makes a lot of sense, because CoCalc is a platform in which relatively inexperienced students use Jupyter notebooks, Linux terminals, and LaTeX. So far, the most popular feature by far is a "Help me fix this" button that appears above stack traces in Jupyter notebooks.

2. One software engineering challenge is that ChatGPT often outputs code in markdown blocks. I've had to emphasize in prompts that it should explicitly mark the language (a rough sketch of extracting these blocks is at the end of this comment). I then got inspired to make it possible to evaluate the code that appears in these blocks in place using a Jupyter kernel, and spent a week making that work (so, e.g., if you type a question into the ChatGPT box on the landing page at https://cocalc.com and code appears in the output, you can often just evaluate it right there). There seem to be endless surprises and challenges, though. For example, a few minutes ago I realized that the giant tracebacks one gets when using Python in Jupyter notebooks are sometimes so big (even doing simple things with matplotlib) that they end up resulting in too much truncation: https://github.com/sagemathinc/cocalc/issues/6634

3. I'm mostly using GPT-3.5-turbo rather than GPT4, even though I have a GPT4 API key. Aside from cost, GPT4 takes about 4x as long, which often just feels too long for my use case. The average time for a complete response from GPT-3.5 for my application is about 8 seconds, versus over 30s for GPT4.
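
Re the markdown blocks in (2), here is a rough sketch of pulling language-tagged blocks out of a reply (illustrative only, not CoCalc's actual code):

    # Find fenced code blocks in a markdown reply and record their language tag,
    # so each block can be routed to an appropriate Jupyter kernel.
    import re

    FENCE = re.compile(r"```(\w*)\n(.*?)```", re.DOTALL)

    def extract_code_blocks(markdown_reply):
        blocks = []
        for language, code in FENCE.findall(markdown_reply):
            # If the model ignored the "mark the language" instruction, fall back.
            blocks.append({"language": language or "unknown", "code": code.rstrip()})
        return blocks

    reply = "Try this:\n```python\nimport math\nprint(math.pi)\n```"
    print(extract_code_blocks(reply))
    # [{'language': 'python', 'code': 'import math\nprint(math.pi)'}]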


👤 barrrrald
We built Hex’s Magic features using GPT-4. You can generate, edit, debug, and explain SQL and Python. We have a few hundred people using it every day, and are opening it more broadly soon.

https://hex.tech/magic


👤 taf2
We have integrated it into our AskAi feature, which lets customers ask natural language questions about the outcome of a phone call. So for example, “was there an appointment scheduled in the phone call, and what was the time scheduled?” or “what was the final quoted value in the following conversation?” We can then take the structured outputs and use them for conversation tracking with Google Ads. This is a game changer when so many in the industry still rely on call length to measure a positive customer / lead interaction.
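
In outline, the question-to-structured-output step can look something like this (a sketch with an invented transcript, prompt, and schema; pre-1.0 `openai` SDK assumed):

    # Ask a structured question about a call transcript and get JSON back
    # that downstream tracking can consume.
    import json
    import os
    import openai

    openai.api_key = os.environ["OPENAI_API_KEY"]

    transcript = (
        "Agent: Thanks for calling. Caller: I'd like to book a consultation. "
        "Agent: How about Tuesday at 3pm? Caller: Tuesday at 3pm works."
    )

    resp = openai.ChatCompletion.create(
        model="gpt-4",
        messages=[{
            "role": "user",
            "content": "Was an appointment scheduled in this call, and if so when? "
                       'Reply as JSON: {"appointment_scheduled": true/false, "time": string or null}\n\n'
                       + transcript,
        }],
        temperature=0,
    )
    outcome = json.loads(resp["choices"][0]["message"]["content"])
    # e.g. {"appointment_scheduled": true, "time": "Tuesday at 3pm"}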

We are ctm.app


👤 gavinjoyce
We're using GPT 4 to create personalised content for product demo videos for sales teams. Demo here: https://www.linkedin.com/feed/update/urn:li:activity:7049764...

Even though it's much slower, GPT-4 is way more consistent than 3.5. The OpenAI APIs have had a lot of flakiness in the past couple of weeks; we retry requests up to 10 times to work around this.
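
Concretely, the retry loop amounts to something like this (a sketch, assuming the pre-1.0 `openai` SDK and its error classes; the exponential backoff is just one reasonable choice):

    # Retry transient OpenAI errors up to 10 times with exponential backoff.
    import os
    import time
    import openai

    openai.api_key = os.environ["OPENAI_API_KEY"]

    TRANSIENT = (openai.error.RateLimitError, openai.error.APIError,
                 openai.error.ServiceUnavailableError, openai.error.Timeout)

    def chat_with_retries(messages, model="gpt-4", max_attempts=10):
        for attempt in range(max_attempts):
            try:
                resp = openai.ChatCompletion.create(model=model, messages=messages)
                return resp["choices"][0]["message"]["content"]
            except TRANSIENT:
                # Back off 1s, 2s, 4s, ... before the next attempt.
                time.sleep(2 ** attempt)
        raise RuntimeError("OpenAI API kept failing after %d attempts" % max_attempts)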


👤 roseway4
Re (2): GPT4 requires engineering for resiliency. The API (currently) has availability issues and high variance in latency. It’s not (yet) ready for interactive use. For other use cases, a good queuing and retry strategy is necessary.

👤 88stacks
We have deployed GPT-4 in multiple layers of our stack at https://88stacks.com, using it for tooling, marketing, and in other places.

👤 faangiq
Dude. You gotta stay pre-product. Pre-revenue? No no no. If you’re pre-product your valuation is only capped by their imaginations.

👤 kbrackson
1) I have used it to rewrite a couple of prod functions. Can't/won't go into details. 2) It cannot write advanced logic, so use with care.

👤 dustymcp
Yes, I used it to generate multiple text variants for a marketing tool. It was very simple since it doesn't have direct user input: just do an HTTP call, get the text from the database, use a solid prompt to ask for alternatives, and off we went.
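
That flow, sketched out (the marketing line, prompt, and stubbed database lookup are invented; pre-1.0 `openai` SDK assumed):

    # Fetch the text (stubbed here instead of a real database call),
    # prompt for alternatives, return them.
    import os
    import openai

    openai.api_key = os.environ["OPENAI_API_KEY"]

    def fetch_marketing_text():
        # Stand-in for the database lookup.
        return "Try our new analytics dashboard and see your numbers in real time."

    resp = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{
            "role": "user",
            "content": "Rewrite this marketing line in 3 different tones "
                       "(playful, formal, urgent), one per line:\n"
                       + fetch_marketing_text(),
        }],
        temperature=0.8,
    )
    variants = resp["choices"][0]["message"]["content"].splitlines()
    print(variants)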