HACKER Q&A
📣 optimalsolver

Largest speedup you ever achieved by only changing a few lines of code?


👤 chrisutz
Started a new job on a company's API team. The API had out-of-memory issues, with processes crashing all the time.

The code was PHP. All API calls ended like this:

echo json_encode($data) . "\n";

Changed just one character, the period to a comma (echo json_encode($data), "\n";), so echo received two separate arguments instead of first building a concatenated copy of the whole string in memory. Problem solved. Felt like a hero.


👤 muzani
Gallery code that queried all photos and showed them to the user. It had a 13-second delay with 10k photos and above.

I thought it was a Big O problem at first, because the code was hacky and used more arrays than it needed. But it was because it was fetching all the images and then sorting them by time in application code.

I sped it up to milliseconds by having the query itself sort by time.
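
A minimal Python/sqlite3 sketch of that change (the photos table, columns, and index are hypothetical): fetch-then-sort in application code versus letting an indexed query return rows already ordered.

import sqlite3

conn = sqlite3.connect("gallery.db")
conn.execute("CREATE TABLE IF NOT EXISTS photos (id INTEGER, taken_at INTEGER)")

# Slow: pull every row, then sort 10k+ photos in application code.
photos = conn.execute("SELECT id, taken_at FROM photos").fetchall()
photos.sort(key=lambda row: row[1])

# Fast: index the sort column and let the database return rows in order.
conn.execute("CREATE INDEX IF NOT EXISTS idx_taken_at ON photos (taken_at)")
photos = conn.execute("SELECT id, taken_at FROM photos ORDER BY taken_at").fetchall()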


👤 Arcten
Iterating by reference instead of by copy in C++ is a great way to get a speedup with a one-character change.

I.e. changing for (const MyType v : collection) to for (const MyType& v : collection). The first form copies every element on each iteration; the second doesn't.


👤 yen223
If you're working on Django codebases, `select_related` and `prefetch_related` are going to be your friends. Maybe 70-80% of Django slowness I've encountered (back when I did Django) was because of something issuing hundreds of db calls unnecessarily.
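
A minimal sketch of the difference, assuming a configured Django project with hypothetical Author and Book models (`prefetch_related` is the analogue for many-to-many and reverse relations):

from django.db import models

class Author(models.Model):
    name = models.CharField(max_length=100)

class Book(models.Model):
    title = models.CharField(max_length=200)
    author = models.ForeignKey(Author, on_delete=models.CASCADE)

# N+1: one query for the books, plus one query per book for its author.
for book in Book.objects.all():
    print(book.title, book.author.name)

# One query with a JOIN: the authors come back with the books.
for book in Book.objects.select_related("author"):
    print(book.title, book.author.name)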

👤 yongjik
Almost twenty years ago, I found a debug logging function using O_SYNC. Took it out. The test then ran so fast that a teammate said they thought it hadn't run.

Now, of course, there's zero reason to write every debug output with O_SYNC. Classic case of cargo-cult programming. I'd like to say I've never seen something like that again, but then I'd be lying.
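
A rough sketch of what that costs, in Python on Linux (file name hypothetical). O_SYNC forces every write() to wait until the data has actually reached the disk:

import os

# Synchronous: each debug line pays a full disk flush.
fd = os.open("debug.log", os.O_WRONLY | os.O_CREAT | os.O_SYNC, 0o644)
for i in range(10_000):
    os.write(fd, f"debug line {i}\n".encode())
os.close(fd)

# Buffered: the kernel batches the writes; typically orders of magnitude faster.
with open("debug.log", "w") as f:
    for i in range(10_000):
        f.write(f"debug line {i}\n")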


👤 max_hammer
Moved data aggregation from Oracle to `awk`

The process loaded 40 GB files into the database, and the aggregation took more than 5 hours.

Wrote a simple awk one-liner with an associative array and the process completed in minutes.
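
The same one-pass, associative-array aggregation sketched in Python rather than awk, assuming a hypothetical "key value" layout per line:

import sys
from collections import defaultdict

totals = defaultdict(float)
for line in sys.stdin:         # stream the file; nothing is loaded up front
    key, value = line.split()  # hypothetical two-column layout
    totals[key] += float(value)

for key, total in totals.items():
    print(key, total)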


👤 maxrev17
Dropped a request by 40 seconds by preventing Entity Framework from round-tripping the DB 3000+ times for a property that was already held in memory! Pretty much a one-liner change. Took some time to find, with the help of MiniProfiler (thx Stack Overflow!).

👤 pknerd
The company I used to work for was a B2B portal. Their Chinese customers were having issues submitting a form due to the firewall; it was taking 22 seconds. What I did was send a close header from the server, which made the client browser shut down the request and release the connection. Behind the scenes it still took the same time, but at least customers weren't left waiting. It was kind of like Ajax, but done with Apache flags.

👤 bradknowles
My most recent example is some SQL commands that another team executed nightly against our database. Their code sometimes took more than eight hours to run, which could cause timing problems: it ran so long that processes further down the pipeline failed.

I took their exact SQL commands and wrapped them in my own scripting, and the simplified single-threaded version was executing in about fifteen minutes.

I went back through and added some explicit parallelization combined with wait commands to ensure that everything in that stage was complete before going to the next stage. That improved version now executes in around 600 seconds.
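
That stage-by-stage pattern (run a batch of independent commands in parallel, then wait for all of them before starting the next batch) translates roughly to this Python sketch; run_sql and the statements are stand-ins:

from concurrent.futures import ThreadPoolExecutor

def run_sql(statement):
    print("running:", statement)  # stand-in for a real database call

stages = [
    ["UPDATE table_a SET ...", "UPDATE table_b SET ..."],  # independent, can overlap
    ["INSERT INTO summary SELECT ..."],                    # needs stage 1 finished
]

with ThreadPoolExecutor() as pool:
    for stage in stages:
        futures = [pool.submit(run_sql, s) for s in stage]
        for f in futures:
            f.result()  # the "wait": block until the whole stage completes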


👤 lakkal
Not really lines of code as such, but in Visual FoxPro: opening a DBF (database table file) on a file server, from an application running on that same file server, by its local path rather than a mapped-network-drive path resulted in a big speedup. (The application typically runs on RDP servers, but can also be run directly on the file server, which we do for some heavy-duty processes.)

👤 speedgoose
Wrapping multiple SQL mutations into a single transaction (sketched below).

Adding the right indexes in a relational database.

Converting from CSV to Parquet before querying large datasets on Apache Spark.
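
A minimal sketch of the first point, using Python's sqlite3 (the same principle applies to any relational database):

import sqlite3

conn = sqlite3.connect("example.db")
conn.execute("CREATE TABLE IF NOT EXISTS t (id INTEGER, val TEXT)")

# Slow: each INSERT commits on its own, paying a journal flush every time.
for i in range(1000):
    with conn:
        conn.execute("INSERT INTO t VALUES (?, ?)", (i, "x"))

# Fast: one transaction around all the mutations, one commit at the end.
with conn:
    for i in range(1000):
        conn.execute("INSERT INT" "O t VALUES (?, ?)", (i, "x"))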


👤 dsgrillo
Sending a campaign flow would create tens of thousands of records in the DB. After each record insertion there was a sleep(0.1), put in place to work around master/slave replication problems in other flows. Just conditionally disabling that sleep was enough to take the procedure from ~5 min to ~30 sec.
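
A minimal sketch of the shape of the fix (all names hypothetical); the sleep becomes opt-in instead of unconditional:

import time

def db_insert(record):
    pass  # stand-in for the real insert

def insert_records(records, throttle_replication=False):
    for record in records:
        db_insert(record)
        if throttle_replication:  # only flows that read a lagging replica need this
            time.sleep(0.1)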

👤 LarryMade2
Revised a complex SQL query so it didn't do everything in one fell swoop, which had resulted in 1000x more records to process than necessary. Using subselects/UNIONs took the query from half a minute down to a couple of seconds.

👤 iujjkfjdkkdkf
Some basic word counting on text files in bash; I was tokenizing to one token per line and then counting:

tr -cs '[:alnum:]' '[\n*]' | sort | uniq -c

The sort takes a long time (probably just n log n I guess) on a big text. Swapping for

awk '{k[$0]++} END {for (token in k) print token, k[token];}'

and then sorting on the numbers does the same thing faster.


👤 sharmi
import gc
gc.disable()

in Python. The script previously took 4 hours to run. There were lots of small functions, and these were called in a loop. I made sure there were no circular references within any of the functions, then disabled the GC. (Reference cycles are the only thing Python's garbage collector has to look for; objects without them are freed automatically by reference counting when they go out of scope.)

The script ran in 20 mins.

The hue and cry people raised over gc.disable, though, convinced me never to do any unconventional optimizations again.
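
A minimal sketch of the pattern, with a guard so the collector always comes back on (the original simply disabled it at the top of the script); process is a hypothetical stand-in:

import gc

def process(data):
    return sum(data["k"])  # stand-in for the real per-record work

def hot_loop():
    total = 0
    for _ in range(1_000_000):
        data = {"k": list(range(10))}  # short-lived, cycle-free objects
        total += process(data)
    return total

gc.disable()  # refcounting still frees cycle-free objects immediately
try:
    hot_loop()
finally:
    gc.enable()  # always restore the collector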


👤 amir734jj
The MongoDB driver in C# was failing to convert a LINQ expression to a Mongo query and was silently filtering in memory instead. I noticed it and added a unit test to make sure it would never silently do an in-memory filter again. Night and day difference.

👤 Black101
Modified an SQL query... went from a few minutes to a few seconds.

👤 surds
Took a massive data migration on MongoDB from many hours (over a day) down to a few minutes (less than 10) with the right indices.
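
A minimal sketch of adding an index with pymongo (database, collection, and field names are hypothetical):

from pymongo import ASCENDING, MongoClient

client = MongoClient("mongodb://localhost:27017")
users = client["appdb"]["users"]

# Without an index, this lookup scans the whole collection.
users.find_one({"email": "a@example.com"})

# With an index, the same lookup becomes a B-tree walk.
users.create_index([("email", ASCENDING)], unique=True)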

👤 Leparamour
Does somebody have any stories for Python codebases?

👤 billconan
Changed from single-threaded to multi-threaded.

Changed std::map to std::unordered_map.