HACKER Q&A
📣 didgetmaster

How do you configure Postgres for parallel queries?


I am trying to get my Postgres server to run queries as fast as possible. I have a table with about 7.5M rows and 22 columns (Chicago crime data) and have created appropriate (I think) indexes on the columns I want to query against.

I have created my own DB engine and try to benchmark it against other RDBMS offerings like Postgres. Unlike some benchmarks, I don't try to cherry-pick queries that make my system seem faster. I want the other DBs to run as fast as possible.

In this video: https://www.youtube.com/watch?v=OVICKCkWMZE I ran three typical queries and timed them using default settings for Postgres. After I recorded that video, I adjusted the following config parameters for parallel queries:

max_worker_processes = 32 max_parallel_maintenance_workers = 32 max_parallel_workers_per_gather = 32 parallel_leader_participation = on max_parallel_workers = 32

This sped up the queries on my machine (Ryzen 5950x with 16 cores/32 threads) but not as much as I hoped.

Query 1 went from 685 ms to 370 ms. Query 2 went from 1259 ms to 933 ms. Query 3 went from 5011 ms to 4718 ms.

They are still significantly slower than my system. I want the benchmark to be as fair as possible.

Does anyone know of other parameters which will increase the speed even more?


  👤 marek_leisk2 Accepted Answer ✓
Since the data won't change frequently it might be worth trying to load it into ClickHouse. I've had needle-in-haystack searches on multi-million row tables speed up by 2 orders of magnitude over PG.