When Iterators Aren't Zero Cost

by Xavier Denis

Here’s something that surprises many new Rust developers: iterators can be faster than hand-written loops. The compiler sees the pattern, LLVM works its magic, and the result outperforms manual code.

Now here’s something that surprises even experienced Rust developers: in certain cases, iterators make code 3-4x slower. We’re taught to believe that iterators are zero cost. But what happens when they aren’t?

On modern CPUs, performance comes from pipelining and vectorization, and rustc does a lot of work to transform our beautiful chains of iterators into highly optimized machine code. But sometimes we just ask too much. An innocuous definition of next can lead to a 100x slowdown, and adding seemingly unnecessary batching can recover the missing performance.

This talk covers the diagnosis (how to spot iterator overhead in profiles), the theory (why batching helps at the CPU level), and the implementation (a production batched iterator design using columnar storage, compile-time batch sizes, and skip-ahead semantics).
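
To make the idea of a batched iterator concrete, here is a minimal sketch. It is not the design presented in the talk: the names BatchIter, Column, next_batch, skip_ahead, and sum_batched are all hypothetical, and a plain Vec<u64> stands in for columnar storage. It only illustrates the general shape, assuming a batch size fixed at compile time through a const generic and a skip operation that advances a cursor without producing items.

```rust
/// Hypothetical trait: instead of yielding one item per call, an
/// implementation fills a buffer whose length `N` is a compile-time constant.
trait BatchIter<const N: usize> {
    /// Writes up to `N` values to the front of `out` and returns how many
    /// were written. A return value of 0 means the source is exhausted.
    fn next_batch(&mut self, out: &mut [u64; N]) -> usize;

    /// Advances past `n` items without producing them.
    fn skip_ahead(&mut self, n: usize);
}

/// One column of values stored contiguously (a stand-in for columnar
/// storage; this layout is an assumption, not taken from the talk).
struct Column<const N: usize> {
    values: Vec<u64>,
    pos: usize,
}

impl<const N: usize> Column<N> {
    fn new(values: Vec<u64>) -> Self {
        Column { values, pos: 0 }
    }
}

impl<const N: usize> BatchIter<N> for Column<N> {
    fn next_batch(&mut self, out: &mut [u64; N]) -> usize {
        let remaining = &self.values[self.pos..];
        let n = remaining.len().min(N);
        // One bulk copy per batch rather than one call per element.
        out[..n].copy_from_slice(&remaining[..n]);
        self.pos += n;
        n
    }

    fn skip_ahead(&mut self, n: usize) {
        // Only the cursor moves; skipped values are never read.
        self.pos = self.pos.saturating_add(n).min(self.values.len());
    }
}

/// Consumes a batched source. The inner loop runs over a plain slice,
/// which is the kind of loop the compiler has a good chance of vectorizing.
fn sum_batched<const N: usize, I: BatchIter<N>>(mut it: I) -> u64 {
    let mut buf = [0u64; N];
    let mut total = 0u64;
    loop {
        let n = it.next_batch(&mut buf);
        if n == 0 {
            break;
        }
        total += buf[..n].iter().sum::<u64>();
    }
    total
}
```

As a usage example, building a Column<4> over the values 1 through 10, calling skip_ahead(3), and passing it to sum_batched adds up the remaining values 4 through 10. Whether a loop like this actually beats a per-item next depends on the workload and on what the optimizer does with it, so it should be measured rather than assumed.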


Xavier Denis

he/him
engineer @ turbopuffer

Xavier first started using Rust back in 2014, when he failed to write a kernel in it. Since then, he’s earned his PhD in the verification of Rust programs, developing the Creusot verifier.

He’s always got a hundred projects going on, but those are just as often developing new recipes in the kitchen as they are reviving dead programming languages.