As an applications developer, I find catastrophic bugs fascinating. I'm particularly interested in concurrent programming errors, which is why I work at the lab where the Intel Threading Tools are developed. I'm also a big fan of NASA and the programmers at JPL.
Anyway, back in 1997 I'd go to the JPL web site every day for updates on the Mars Pathfinder, one of the most successful space exploration missions of all time. I didn't know it back then, but the Mars Pathfinder was nearly crippled by a classic threading error known as priority inversion.
Mike Jones of Microsoft Research posted a couple of articles on his web site that describe the error in the Mars Pathfinder software and how it was fixed. The first is Mike's article "What Really Happened on Mars?"; the second is a follow-up from Glenn Reeves, the head of the software team for the Mars Pathfinder spacecraft.
I recommend these articles to anyone interested in threading. If you're adjusting thread priorities and you've never heard of priority inversion, stop what you're doing and read these articles ;-).
If you have any other bug stories related to threading, please post them.