Bluescreen Debugging for Dummies: Prologue
That could probably help 90% of the developers at Microsoft, to be honest. Kernel-mode debugging is sometimes equated with black magic by devs who spend most of their time in the highly friendly (and deterministic) world of user mode.
An analogy I like is to compare kernel mode to cutthroat corporate life, and user mode to bucolic academic life. In user mode, there are rules that you can’t break, and they are enforced from the outside by the faculty (the OS). Breaking the rules doesn’t take down the whole university, only one member. You only hurt yourself.
On the other hand, kernel mode has a set of agreed-upon rules, but they’re not nearly as strongly enforced… results are more important, after all! You’re expected to follow them, but you can cheat if you want (and even get away with it for a while). When you screw up, though, chances are you’ll take down the whole enterprise.
You can’t take anything for granted in kernel mode, because there can be any number of kids playing in the same sandbox. Someone can take your CPU away, take your memory away, dump garbage on your data, and not even call you in the morning. Paradoxically, while it’s more critical here than anywhere else that everyone follows the rules, it’s not in our interest to strongly enforce them. Too much error and behavior checking here could slow the system to an unusable crawl. So we let driver writers have the power to destroy worlds, and hope they use it for good and not evil.
As anyone who has used Windows NT and up knows, this doesn’t always go well.
In coming entries, I’ll cover some of the basics of how to open and analyze memory dump files, so you can at least feel like you have a starting place when you get a blue screen on one of your systems. I’ll move on to more advanced topics if there’s an interest.
Comments
- Anonymous
August 05, 2004
Looking forward to some more articles. Here's a recent interesting stack trace from a dump that happened on shutdown; who's to blame: McAfee (naiavf5x) or Sysinternals (FILEM)? :)
naiavf5x+0x1cde
FILEM+0x6ff4
nt!IopCompleteUnloadOrDelete+0xc3
nt!IoDeleteDevice+0x71
naiavf5x+0x2643
nt!IopCompleteUnloadOrDelete+0xc3
nt!IopDecrementDeviceObjectRef+0x35
nt!IopDeleteFile+0x1e5
nt!ObpRemoveObjectRoutine+0xdd
nt!ObfDereferenceObject+0x5d
nt!ObpCloseHandleTableEntry+0x153
nt!ObpCloseHandle+0x85
nt!NtClose+0x19
nt!KiSystemService+0xc4
SharedUserData!SystemCallStub+0x4
- Anonymous
August 06, 2004
Ah, see, that's one of those dumps right there that automatically falls out of the "Dummies" category. You really need to dig into the assembly to get an idea of who did what to whom.
My first suggestion would be to see if there are updates available for either or both drivers. :)
- Anonymous
August 29, 2004
The comment has been removed