HACKER Q&A
📣 chirau

What is the fastest way to load .csv file into pandas?


I have a CSV file with millions of rows, and pd.read_csv() takes a noticeable amount of time. I have to search through this CSV each time a query comes in, so the load time must be minimized.

Is there any faster way of loading it into pandas?


  👤 CyberBull Accepted Answer ✓
Load the CSV through read_csv, then query it as a list of dicts.

So read_csv(file).to_dict(orient='records')

That converts the CSV data into a list of dicts. I had this problem in a project I was working on, and this approach sped up the processing.

You could then also write the data back to CSV through DataFrame(records).to_csv(path_to_save_the_file).
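
A minimal sketch of this approach, assuming a file named data.csv and an id column to search on (both placeholders):

    import pandas as pd

    # One-time load: parse the CSV, then convert rows to a list of dicts
    # ("data.csv" and the "id" column are assumptions for illustration).
    df = pd.read_csv("data.csv")
    records = df.to_dict(orient="records")   # [{column: value, ...}, ...]

    # Per-query lookup over the list of dicts.
    matches = [row for row in records if row["id"] == 42]

    # Writing the records back out to CSV if needed.
    pd.DataFrame(records).to_csv("data_out.csv", index=False)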


👤 periheli0n
Parsing ASCII simply takes time. What you describe sounds like a use case for SQLite. Parse once when building the database. When indexed properly, searching should be much faster.
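
A rough sketch of the SQLite route, assuming the same hypothetical data.csv with an id column; the database path, table name, and index are also illustrative choices:

    import sqlite3
    import pandas as pd

    # One-time setup: parse the CSV once and store it in SQLite.
    df = pd.read_csv("data.csv")
    con = sqlite3.connect("data.db")
    df.to_sql("records", con, if_exists="replace", index=False)
    con.execute("CREATE INDEX IF NOT EXISTS idx_records_id ON records (id)")
    con.commit()

    # Per-query lookup: an indexed SELECT instead of re-parsing the CSV.
    result = pd.read_sql_query(
        "SELECT * FROM records WHERE id = ?", con, params=(42,)
    )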

👤 uberman
Perhaps loading the file from disk into pandas on every request is not the right strategy to begin with.
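
One possible shape of that, as a sketch: parse the CSV once per process and reuse the in-memory DataFrame for every query (file name and column are again placeholders):

    from functools import lru_cache
    import pandas as pd

    # Cache the parsed DataFrame so the CSV is read only once per process.
    @lru_cache(maxsize=1)
    def get_data() -> pd.DataFrame:
        return pd.read_csv("data.csv")

    def handle_query(key: int) -> pd.DataFrame:
        df = get_data()              # parsed on the first call, cached afterwards
        return df[df["id"] == key]   # filter in memory instead of reloading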