*Dr. Dobb's*currently. The subject of the articles is code optimization, but the thing I found most interesting is the code being optimized. It is a software-only rasterizer. I wouldn't have thought it possible (even for Michael Abrash) to get DirectX7 performance in software. But most of all, it's noteworthy that state-of-the-art graphics hardware and drivers still have so many bugs and portability problems that Abrash finds it worthwhile to spend a lot of time and sacrifice two generations off the state of the art in order to avoid them. I certainly feel his pain, and I'm jealous that he is able to focus on a single type of processor and thus achieve this level of optimization.

The other thing that caught my eye was an illustration of how on modern processors, optimization can be extremely nonobvious. The example was a subproblem I've solved before: Given a value *x* between 0 and 1 and an array of ascending values in the same range, figure out which pair of values *x* falls between. I used a binary search. It turns out that on the x86, binary search is predicted poorly by the processor's branch prediction logic, while in a linear search every compare but the last one is predicted correctly. A mispredicted branch is very expensive. So, for a small number of bins a linear search is 30-50% faster (!), and the problem size has to reach 64 bins before the improved asymptotic efficiency of the binary search actually beats the linear search in real life. My problem had at most a few tens of bins, and the search was deep in an inner loop, so this would have been useful information to have known.

Labels: graphics, programming