Why is this Racket code so fast?

Question

I was completing some of the Project Euler problems in Racket and usually see the Fibonacci sequence take a couple seconds to generate. While showing my wife I tried a number greater than 4,000,000 and it was instant, so I tried again and again and again, up to a 1,000,000 digit number that took 70 seconds. 10^10000 is still around 100ms.Can anyone explain how this code is so quick? Running it on a Mac M1 Pro (2021).https://github.com/Mithaecus/cp-project-euler/blob/main/answers/0002.rkthttps://github.com/Mithaecus/cp-project-euler/blob/main/answers/_solution.rkt

km6GEwiQqsyEKQH · Accepted Answer

lim = 10**1000000 a, b = 1, 2 acc = 0 while a < lim: if not a & 1: acc += a a, b = b, (a + b) print("done", flush=True) print(acc) For a baseline this trivial python implementation does it in 34 ms for 10^10000 and 2m55 for 10^1,000,000 on a ryzen 5. Because the fibonacci sequence is increasing roughly exponentially, the iteration count scales roughly linearly to the number of digits. Then it just goes down to the number crunching speed of the bignum implementation? (Calculating the exponent also takes multiple seconds, as well formatting the number to a string takes around 10 seconds as well.)

Jtsummers · Answer

racket/stream is lazy, so you only ever generate values when they're directly asked for, and in this case they get discarded almost immediately. This means that the two sequences (fibonacci-sequence and even-fibonacci-sequence) use very little memory at any one point in time. If you switched from a lazy stream approach to a naive approach generating every single Fibonacci number below the threshold and then filtering and then summing, you'd see much worse performance. By using a lazy stream approach you end up with something that should be roughly comparable to this C-ish pseudocode, but still has the appearance of the functional map/filter/reduce style: threshold = whatever; accum = 0; for (a = 0, b = 0; a

psyklic · Answer

10^10000 is a large number, but it is surpassed only by the 47,852th Fibonacci number -- a small number of iterations for a computer.
I often suppose as a rule of thumb that a C program can perform ~10 million for-loop additions per second (assuming the accumulator fits in a register). So to do 500,000 bignum additions/second, each bignum addition would only require on average the time of 20 normal additions+iterations.
Also note that the functions provided are all tail recursive. Racket performs tail-call optimizations which would make these similarly performant to a for loop.

whateveracct · Answer

Doesn't Racket run on Chez Scheme now? I'd read up on that. It's a famously performance-oriented Scheme implementation.

pubby · Answer

I believe racket/stream uses memoization. So it should be as fast as any memoized fibbonaci function.