HACKER Q&A
📣 williamstein

What is going on regarding quality of service for API access to LLMs?


I saw in the latest ChatGPT Plus announcements that you get better speed if you pay them $20/month. This made me wonder how the speed of the Plus version of ChatGPT compares to the API that we pay for (to integrate ChatGPT into https://cocalc.com). We have had solid usage over the last 2 months, and I keep track of exactly how long the complete response takes for each API request. I just checked the stats, and the average API response times for gpt-3.5-turbo and gpt-4 have both gotten MASSIVELY WORSE for us over time:

    smc=# select model, sum(total_time_s)/count(*) from openai_chatgpt_log where time >= now() - interval '1 weeks' group by model;
         model     |      ?column?      
    ---------------+--------------------
     gpt-4         |  64.17583870967742
     gpt-3.5-turbo | 22.513887411945003
    (2 rows)

    smc=# select model, sum(total_time_s)/count(*) from openai_chatgpt_log where time >= now() - interval '8 weeks' and time <= now() - interval '7 weeks' group by model;
         model     |      ?column?      
    ---------------+--------------------
     gpt-4         |  30.74102777777778
     gpt-3.5-turbo | 10.309548475729441
    (2 rows)

The times have more than doubled on average! (I checked, and the average total tokens per request hasn't changed at all.) Does OpenAI publish any stats about API response times?
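For anyone who wants to track this themselves, here is a minimal sketch of the kind of instrumentation described above: time each API call, then summarize. All names here are illustrative, not CoCalc's actual code, and the sample durations are made up to mimic the two windows in the query output. One caveat worth noting is that an average like `sum(total_time_s)/count(*)` can hide tail latency, so a percentile is worth reporting alongside it.

```python
import statistics
import time


def timed_call(fn, *args, **kwargs):
    """Run fn and return (result, elapsed_seconds).

    time.monotonic() is immune to wall-clock adjustments,
    so it's the right clock for measuring durations.
    """
    start = time.monotonic()
    result = fn(*args, **kwargs)
    return result, time.monotonic() - start


def latency_summary(durations_s):
    """Mean plus 95th percentile of per-request response times (seconds)."""
    cuts = statistics.quantiles(durations_s, n=20)  # 19 cut points at 5% steps
    return {
        "mean": statistics.fmean(durations_s),  # same as sum(x)/count(x) in SQL
        "p95": cuts[18],  # the 19th of 20 cut points = 95th percentile
    }


# Hypothetical samples resembling the two windows in the post:
last_week = [60.0, 62.0, 66.0, 70.0, 64.0]
eight_weeks_ago = [28.0, 30.0, 33.0, 31.0, 32.0]
print(latency_summary(last_week))
print(latency_summary(eight_weeks_ago))
```

If the p95 drifts up while the mean stays flat, the slowdown is concentrated in a subset of requests (e.g. throttled ones) rather than uniform.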

I also subscribed to ChatGPT Plus, and anecdotally it does seem much faster than the API for us. So maybe OpenAI is increasingly throttling API access for customers who are not marked as special? I wonder if some API users get much faster response times than others.

Given how valuable LLMs are for products like ours, what does this mean for us? Does it mean that relying on API access isn't a viable long-term way to stay competitive? There are other LLM API providers, like Anthropic (and potentially Google), but so far they are vaporware for us, since it's just waitlists forever.

This gives me new appreciation for the approach repl.it is taking of building their own open source models.


  👤 williamstein Accepted Answer ✓
There is an official answer at the end of this thread from somebody at OpenAI claiming that they do not intentionally slow down the API: https://community.openai.com/t/we-proved-the-api-is-intentio...

It sounds like they are just swamped with usage and are trying to keep it working at all…