Joe Ganley

I make software and sometimes other things.

 

Due to Blogger's termination of support for FTP, this blog is no longer active.

It is possible that some links from here, particularly those within the site, are now broken. If you encounter one of those, your best bet is to go to the new front page and hunt for it from there.

Most, but not all, of the blog's posts are on this page; the archives are here.

 
Michael Abrash has a series of articles appearing in Dr. Dobb's currently. The subject of the articles is code optimization, but the thing I found most interesting is the code being optimized. It is a software-only rasterizer. I wouldn't have thought it possible (even for Michael Abrash) to get DirectX7 performance in software. But most of all, it's noteworthy that state-of-the-art graphics hardware and drivers still have so many bugs and portability problems that Abrash finds it worthwhile to spend a lot of time and sacrifice two generations off the state of the art in order to avoid them. I certainly feel his pain, and I'm jealous that he is able to focus on a single type of processor and thus achieve this level of optimization.

The other thing that caught my eye was an illustration of how on modern processors, optimization can be extremely nonobvious. The example was a subproblem I've solved before: Given a value x between 0 and 1 and an array of ascending values in the same range, figure out which pair of values x falls between. I used a binary search. It turns out that on the x86, binary search is predicted poorly by the processor's branch prediction logic, while in a linear search every compare but the last one is predicted correctly. A mispredicted branch is very expensive. So, for a small number of bins a linear search is 30-50% faster (!), and the problem size has to reach 64 bins before the improved asymptotic efficiency of the binary search actually beats the linear search in real life. My problem had at most a few tens of bins, and the search was deep in an inner loop, so this would have been useful information to have known.

Labels: ,

0 Comments: