To fix this, I had to write the response as a stream to a JSON file and use ijson (https://pypi.org/project/ijson/) to read the content incrementally, which reduced memory utilisation dramatically. While looking around, I found that many people had run into the same issue, which got me thinking: how would I have sent this content if I had built the third-party API myself?
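Here is a minimal sketch of that approach. The endpoint URL is a placeholder, and the `'item'` prefix assumes the response body is a top-level JSON array:

```python
import ijson
import requests

# Hypothetical endpoint; stream the body to disk so the full response
# never has to fit in memory at once.
url = "https://api.example.com/large-data"
with requests.get(url, stream=True) as resp:
    resp.raise_for_status()
    with open("response.json", "wb") as f:
        for chunk in resp.iter_content(chunk_size=64 * 1024):
            f.write(chunk)

# ijson parses the file incrementally: with a top-level JSON array,
# the 'item' prefix yields one element at a time, so memory stays flat
# regardless of how large the file is.
with open("response.json", "rb") as f:
    for record in ijson.items(f, "item"):
        print(record)  # handle each record here
```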
Some previous questions asked on Stack Overflow:
https://stackoverflow.com/questions/2400643/is-there-a-memory-efficient-and-fast-way-to-load-big-json-files-in-python
https://stackoverflow.com/questions/11057712/huge-memory-usage-of-pythons-json-module
How do you send a response via an API for very large content, and what are the advantages and disadvantages of each approach?
The server can stream the response incrementally, in chunks, and the client can consume each chunk as it arrives, keeping memory usage flat on both sides.
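One common way to do this over HTTP is newline-delimited JSON (NDJSON): the server emits one complete JSON object per line, so the client can parse each line independently as it arrives. A sketch of the server side using Flask (the route, row count, and field names are made up for illustration):

```python
import json

from flask import Flask, Response

app = Flask(__name__)

def generate_rows():
    # Hypothetical data source; in practice this would be a database
    # cursor or similar iterator that never materialises the full
    # result set in memory.
    for i in range(1_000_000):
        yield {"id": i, "value": f"row-{i}"}

@app.route("/large-data")
def large_data():
    def stream():
        # One complete JSON object per line; Flask sends each yielded
        # piece as it is produced instead of buffering the whole body.
        for row in generate_rows():
            yield json.dumps(row) + "\n"
    return Response(stream(), mimetype="application/x-ndjson")
```

On the client side, `requests` can consume this with `resp.iter_lines()` and `json.loads()` on each line, so neither side ever holds the full payload.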
I think gRPC also supports streaming, but I don't know much about it.
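For the curious, gRPC supports this natively via server-streaming RPCs. Below is a rough sketch, assuming a proto definition like `rpc ListRecords(ListRequest) returns (stream Record);` and that `records_pb2` / `records_pb2_grpc` are the generated modules (all of these names are hypothetical):

```python
from concurrent import futures

import grpc
import records_pb2
import records_pb2_grpc

class RecordService(records_pb2_grpc.RecordServiceServicer):
    def ListRecords(self, request, context):
        # Yielding from the handler streams one message at a time;
        # neither side ever holds the full result set in memory.
        for i in range(1_000_000):
            yield records_pb2.Record(id=i, value=f"row-{i}")

server = grpc.server(futures.ThreadPoolExecutor(max_workers=4))
records_pb2_grpc.add_RecordServiceServicer_to_server(RecordService(), server)
server.add_insecure_port("[::]:50051")
server.start()
server.wait_for_termination()
```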
JSON is a bad format for large files: as you observed, you generally need to read the entire file into memory before you can use it.