Everything is perfect: you've upgraded to Windows 7. It's fully patched, all drivers are updated, security is tight, maybe you even have new hardware...yet the old Blue Screen of Death (BSOD) taunts you from your new high-definition screen.
The good news is that you can quickly solve the problem in most cases by using the Windows debugger tool. It's simple and free.
Back in the Windows XP era (2005), we wrote a tutorial on solving Windows crashes (How to solve Windows system crashes in minutes). This is an updated version that will make you the master of system crash resolution in your home or office.
Is crash resolution different for different versions of Windows?
The same approach to resolving system crashes applies to the many variants of Windows, says Andre Vachon, principal development lead at Microsoft. "The latest releases of Microsoft Windows use the same operating system kernel, the same primary interfaces, drivers work on both server and client, and the debugger uses the same debug files. Further, we used the same code base and source tree to compile both 32- and 64-bit versions."
With that in mind and for simplicity I will refer to Windows 7. However, not only will the information apply to other current releases, much of it will apply to legacy versions back to Windows 2000.
Why Windows 7 crashes
Windows has become more stable as it has matured. Although the operating system has gone from 16-bit to 32-bit and now 64-bit, with more extravagant features and a much larger footprint, it is actually harder to bring down.
Still, it does fall over. However, the reasons for such system failures have not changed from the XP days.
Windows takes advantage of a protection mechanism that lets multiple applications run at the same time without stepping all over each other. Known now as User Mode and Kernel Mode, it was originally known as the Ring Protection scheme.
Kernel Mode (Ring 0) software has complete and unfettered access to the hardware. Software operating here is normally the most trusted because it can execute any instruction and reference any address in the system. Crashes in Kernel Mode are complete system failures requiring a reboot. This is where you find the operating system kernel code and most drivers.
User Mode (Ring 3) software cannot directly access the hardware or reference any address freely. It must pass instructions - perhaps more accurately requests - through calls to APIs. This feature enables protection for the overall operation of the system, regardless of whether an application makes an erroneous call or accesses an inappropriate address. Crashes in User Mode are generally recoverable, requiring a restart of the application but not the entire system. This is where you find most of the code running on your computer ranging from Word to Solitaire and some drivers.
So with much of the software running in User Mode these days, there is simply less opportunity for applications to corrupt system-level software and, for that matter, each other. However, kernel-mode software is not protected from other kernel-mode software. For example, if a video driver erroneously accesses a portion of memory assigned to another program (or memory not marked as accessible to drivers), Windows will stop the entire system. This is known as a Bug Check, and the familiar Blue Screen of Death is displayed.
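The containment that User Mode provides is easy to observe. The following Python sketch is only an illustration and is not tied to any Windows API: it starts a child process that deliberately aborts itself and shows that the parent process carries on. The function name is made up for this example. A fault in Kernel Mode offers no such containment, which is why it ends in a Bug Check.

```python
import subprocess
import sys

def run_crashing_child():
    """Start a child process that aborts itself and return its exit code."""
    # os.abort() ends the child abnormally, standing in for an
    # application crash in User Mode.
    result = subprocess.run(
        [sys.executable, "-c", "import os; os.abort()"],
        stderr=subprocess.DEVNULL,
    )
    return result.returncode

if __name__ == "__main__":
    code = run_crashing_child()
    # The parent is still alive and can report what happened.
    print("child ended abnormally with code", code, "- parent still running")
```

The exact exit code differs between operating systems, so the sketch only reports it rather than relying on a particular value.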
Crash causes by the numbers
While the numbers vary, they do not vary much. When data reported by several sources is combined with my own 20 years of experience in crash prevention and resolution, a trend becomes clear: about 70% of Windows system crashes are caused by third-party drivers operating in Kernel Mode, 15% have unknown causes, 10% are caused by faulty hardware (more than half of those by bad memory), and only about 5% are caused by faulty Microsoft code.
An important point that is not well known is that most crashes are repeat crashes. This is so because most admins are not able to resolve system crashes immediately. As a result those crashes tend, unfortunately, to occur again...and again. More often than not, these events recur over weeks and in many cases over months before being resolved. By using the information in this article to solve crashes when they first occur, you will prevent many subsequent crashes.
Getting Started: System Requirements
To prepare to solve Windows 7 system crashes using WinDbg you will need a PC with the following:
• 32-bit or 64-bit Windows 7/Vista/XP or Windows Server 2008/2003
• Approximately 25MB of hard disk space (this does not include storage for dump files or for symbol files)
• Live Internet connection
• Microsoft Internet Explorer 5.0 or later
• The latest version of WinDbg comes as an option in the Windows SDK. The SDK download file is called winsdk_web.exe, is 498KB in size, and can be downloaded for free. (Note that after installing the debugger you can delete the large download file thus freeing up lots of space.)
• A memory dump (the page file must be on C: for Windows to save the memory dump file)
After downloading the Windows SDK and running the Setup wizard, select the Debugging Tools for Windows option under Common Utilities.
Configure Startup and Recovery
This is annoying. Someone made it very non-intuitive to locate the dialog box needed to check that your system is set to take the appropriate actions during a Bug Check, including whether to automatically restart and what size dump files to save.
Find the Startup and Recovery dialog box:
1. Select the Start button at the bottom left of your screen
2. Select Control Panel
3. Select System and Security
4. From the options in the right column, select System
5. From the left column select Advanced system settings to display the System Properties box
6. In the System Properties box select the Advanced tab
7. In the Startup and Recovery area select the Settings button
See the Startup and Recovery dialog box below:
Ensure Startup and Recovery settings are correct
Under System failure
1. Check Write an event to the system log
2. Check Automatically restart
3. Select Kernel memory dump
4. Ensure the dump file is written to %SystemRoot%\MEMORY.DMP
5. Check Overwrite any existing file to save hard drive space
Note that this will mean that your system will save both a kernel dump file and a minidump file. However, while you will have a minidump for every event, only the last kernel dump will be saved.
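If you would rather check these settings from a script than from the dialog box, they are stored in the registry under HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\CrashControl. The Python sketch below compares a set of values against the recommendations above. The value names (LogEvent, AutoReboot, CrashDumpEnabled, Overwrite) and the meaning of CrashDumpEnabled = 2 (kernel dump) reflect my understanding of that key rather than anything the dialog shows, so verify them on your own system. The function names are hypothetical.

```python
import sys

CRASH_CONTROL_KEY = r"SYSTEM\CurrentControlSet\Control\CrashControl"

# Recommended values (assumed registry names for the dialog options).
RECOMMENDED = {
    "LogEvent": 1,          # Write an event to the system log
    "AutoReboot": 1,        # Automatically restart
    "CrashDumpEnabled": 2,  # assumed: 2 = kernel memory dump
    "Overwrite": 1,         # Overwrite any existing file
}

def find_mismatches(current, recommended=RECOMMENDED):
    """Return {name: (actual, wanted)} for every setting that differs."""
    return {
        name: (current.get(name), wanted)
        for name, wanted in recommended.items()
        if current.get(name) != wanted
    }

def read_crash_control():
    """Read the current values from the registry (Windows only)."""
    import winreg  # this module exists only on Windows
    values = {}
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, CRASH_CONTROL_KEY) as key:
        for name in RECOMMENDED:
            try:
                values[name], _ = winreg.QueryValueEx(key, name)
            except FileNotFoundError:
                values[name] = None  # value not present in the key
    return values

if __name__ == "__main__" and sys.platform == "win32":
    print(find_mismatches(read_crash_control()) or "settings match")
```

The comparison is kept separate from the registry read so that it can be tried with hand-typed values on any machine.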
Launching the debugger: To launch WinDbg select the following:
Start | All Programs | Debugging Tools for Windows | WinDbg
If you are going to use it with any frequency, simplify launching the program by pinning it to the Start menu or sending a shortcut to the desktop.
What's the big deal about symbols?
Before you jump in to save the day by finding the miscreant module in a dump file you have to be sure the debugger is ready. Most importantly you have to be sure it will locate the symbol files for the precise version of the operating system that you are troubleshooting.
Symbol tables are a byproduct of compilation. When a program is compiled, the source code is translated from a high-level language into machine code. At the same time, the compiler creates a symbol file with a list of identifiers, their locations in the program, and their attributes. These identifiers include global and local variables and function names. A program doesn't require this information to execute. Therefore, it can be taken out and stored in another file, reducing the size of the final executable.
Smaller executables take up less disk space and load into memory faster than large ones. But there is a flip side: When a program causes a problem, the operating system knows only the hex address at which the problem occurred. You need something more than that to determine which program was using that memory space and what it was trying to do. Windows symbol tables hold the answer and having access to symbols specific to your system's memory is like putting place names on a map. Conversely, analyzing a dump file with the wrong symbol tables would be like finding your way through San Francisco with a map of Boston.
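To make the map analogy concrete, here is a minimal Python sketch of what a symbol lookup amounts to: given a bare address, find the nearest symbol that starts at or before it and report the offset. The symbol names and addresses are invented for illustration, and a real debugger does considerably more (it also tracks modules, sizes, and versions).

```python
import bisect

# Made-up symbol table: (start address, name), sorted by address.
SYMBOLS = [
    (0x1000, "nt!ExampleAllocate"),
    (0x1400, "nt!ExampleFree"),
    (0x2000, "mydriver!ExampleDispatch"),
]

def resolve(address, symbols=SYMBOLS):
    """Map a raw address to 'name+0xoffset' using the nearest preceding symbol."""
    starts = [start for start, _ in symbols]
    index = bisect.bisect_right(starts, address) - 1
    if index < 0:
        return hex(address)  # below the first known symbol: no name available
    start, name = symbols[index]
    return f"{name}+{address - start:#x}"

print(resolve(0x1410))  # nt!ExampleFree+0x10
```

With the wrong table loaded, the same address would resolve to an unrelated name, which is the Boston-map problem described above.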
Configure WinDbg to locate symbols
There are an amazing number of symbol table files for Windows. This is so because every build of the operating system, even one-off variants, results in a new file. Fortunately, WinDbg can handle it for you but you must configure it with the correct search path. To do this, launch WinDbg and select the following:
File | Symbol file path
Then enter the following path: (Make sure that your firewall allows access to msdl.microsoft.com)
Note that the address between the asterisks is where you want the symbols stored for future reference. For example, I store the symbols in a folder called symbols at the root of my c: drive, thus:
When opening a memory dump, WinDbg will look at the executable files (.exe, .dll, etc.) and extract version information. It then creates a request to the symbol server at Microsoft, which includes this version information and locates the precise symbol tables to draw information from. It won't download all symbols for the specific operating system you are troubleshooting; it will download what it needs. Alternatively, you can opt to download and store the complete symbol file from Microsoft. This, however, will run from about 600MB to near 800MB for each version of the operating system you analyze. In contrast WinDbg downloaded less than 100MB to analyze several versions of the operating system on my test machine. Even with the low cost of hard drives these days, the space savings is significant.
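A symbol server path has three parts separated by asterisks: the keyword srv, the local folder used as a cache, and the address of the server. The Python sketch below simply assembles such a string. The server URL is the public Microsoft symbol server as I understand it, and the function name is hypothetical, so treat both as assumptions to verify.

```python
# Assumed address of the public Microsoft symbol server.
MS_SYMBOL_SERVER = "http://msdl.microsoft.com/download/symbols"

def build_symbol_path(cache_dir, server=MS_SYMBOL_SERVER):
    """Return a symbol path of the form srv*<local cache>*<server URL>."""
    return f"srv*{cache_dir}*{server}"

# Cache the downloaded symbols in a folder called symbols at the root of C:
print(build_symbol_path(r"c:\symbols"))
```

Only the middle part changes from machine to machine, so pick a cache folder on a drive with room to grow.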
About dump files
A memory dump file is a snapshot of what the system had in memory when it crashed. Though perhaps the least attractive and correspondingly least intuitive thing you are likely ever to look at, it is your best friend when the operating system falls over. Windows creates three different sizes of memory dumps: minidumps, kernel dumps, and full dumps.
1. Small or minidump
Windows 7 minidumps are 256KB, which is tiny by any standard, although they have grown since the Windows 2000/XP days when they were only 64KB. One of the reasons they are so small is that they do not contain any of the binary or executable files that were in memory at the time of the failure. However, those files are critically important for subsequent analysis by the debugger. As long as you are debugging on the machine that created the dump file, WinDbg can find them in the System Root folders (unless the binaries were changed by a system update after the dump file was created). Alternatively, the debugger should be able to locate them through SymServ. Properly configured, Windows 7 creates and saves a minidump for every crash event as well as a kernel dump (described below).
2. Kernel dump
Kernel dumps are roughly equal in size to the RAM occupied by the Windows 7 kernel. On my notebook a kernel dump runs about 344MB, and compressed it is just over 100MB. One advantage of a kernel dump is that it contains the binaries. As a default I would always have the system save the latest kernel dump. Remember that while saving it, the system will also save a minidump.
3. Complete or full dump
A full memory dump is about equal to the amount of installed RAM. With many systems having multiple GBs, this can quickly become a storage issue, especially if you are having more than the occasional crash. Normally I do not advise saving a full memory dump because they take so much space and are generally unneeded. However, Microsoft's Vachon advises that "if you are trying to debug a very complex problem, such as an RPC issue between multiple services in the box and you want to see what the services are doing in User Mode, the full memory dump can be very helpful." Therefore, stick to the kernel dump but be prepared to switch the setting to generate a full dump on occasion.
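The storage trade-off is simple arithmetic, sketched below in Python. It assumes the settings recommended earlier: one 256KB minidump is kept per crash, while the large dump (kernel or full) is overwritten each time. The function name and the 8GB machine are hypothetical.

```python
MINIDUMP_KB = 256  # Windows 7 minidump size cited above

def dump_storage_mb(crashes, big_dump_mb):
    """Estimate disk use in MB: one minidump per crash plus one large dump.

    big_dump_mb is the size of the kernel or full dump; only the latest
    copy is kept because the overwrite option is enabled.
    """
    if crashes == 0:
        return 0.0
    return crashes * MINIDUMP_KB / 1024 + big_dump_mb

# Ten crashes with the 344MB kernel dump mentioned above.
print(dump_storage_mb(10, 344))       # 346.5
# Ten crashes with a full dump on a hypothetical machine with 8GB of RAM.
print(dump_storage_mb(10, 8 * 1024))  # 8194.5
```

The minidumps barely register in either case; it is the size of the single large dump that decides how much space to set aside.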
What if you do not have a memory dump to work with?
If you do not have a memory dump to look at, do not worry: you can make the system crash! The simplest way (without having to change Registry settings) is to run a cool tool called NotMyFault (thank you, Mark Russinovich and the team at Sysinternals). It provides a selection of options to load a misbehaving driver (which requires administrative privileges).
But remember...it WILL CREATE A SYSTEM CRASH! So prepare your system and be sure that anyone else who is using it logs off for a few minutes. Save any files that contain information you might otherwise lose and close applications. If you have configured your system as described above, it should work fine. The machine should go down, reboot, and you will have both a minidump and a kernel dump to look at. I've used it plenty of times and had no problems.
Download NotMyFault and force a system crash
1. Download the NotMyFault tool from the following Microsoft Web site and extract the files to a folder: http://download.sysinternals.com/Files/Notmyfault.zip
2. Double-click NotMyFault.exe or, at the Command Prompt, type NotMyFault. If you get the message "You don't have permission to open this file", try again, this time right-clicking the file and selecting "Run as administrator".
3. From the menu select "High IRQL fault (kernelmode)" and the Do Bug button. This will generate a memory dump file and a "Stop D1" error.
4. Sit back...your system will be back momentarily and you will have both a minidump and a kernel dump to view.
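Once the machine is back up, you can confirm that the dump files were written. The Python sketch below looks for MEMORY.DMP in the system root and for minidumps in a Minidump subfolder. That subfolder is the default location as I understand it, so treat it as an assumption; the function name is hypothetical.

```python
import os
from pathlib import Path

def find_dumps(system_root):
    """Return the dump files found under system_root, newest first."""
    root = Path(system_root)
    candidates = [root / "MEMORY.DMP"]
    # Assumed default minidump folder: <SystemRoot>\Minidump
    candidates += list((root / "Minidump").glob("*.dmp"))
    existing = [p for p in candidates if p.is_file()]
    return sorted(existing, key=lambda p: p.stat().st_mtime, reverse=True)

if __name__ == "__main__":
    for dump in find_dumps(os.environ.get("SystemRoot", r"C:\Windows")):
        print(dump, dump.stat().st_size // 1024, "KB")
```

Reading these folders may require an elevated prompt, just as launching NotMyFault did.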
Load a dump file
If you get the message "You don't have permission to open this file", re-launch WinDbg by right-clicking on it and selecting Run as administrator.
Once the debugger is running, select the menu option File | Open crash dump and point it to open the memory dump you want to analyze. When offered to Save information for workspace select Yes if you want it to remember where the dump file is.
WinDbg looks for the Windows symbol files for that precise build of Windows. It references the symbol file path, accesses microsoft.com, and displays the results.
NOTE: If the debugger seems busy, it is probably the first time a dump file for a specific machine has been opened, so WinDbg is downloading symbols from SymServ. The next time a dump is opened for the same machine, the debugger will likely seem much faster since the symbol files will be available locally.
A Command window will appear. This is where the crash analysis will be displayed. At the lower left will be a KD> prompt. To the right of the prompt is a single-line window where you will enter commands.
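If you open dump files often, the menu steps can be replaced by a command line. The Python sketch below only builds and prints that command line. It assumes that WinDbg accepts -y for the symbol path and -z for the dump file, which is my reading of the debugger's command-line options, and that windbg is on the PATH. The dump location, symbol path, and function name are examples, not requirements.

```python
import subprocess

def windbg_command(dump_path, symbol_path, exe="windbg"):
    """Build a WinDbg command line: -y sets the symbol path, -z the dump file."""
    return [exe, "-y", symbol_path, "-z", dump_path]

cmd = windbg_command(
    r"C:\Windows\MEMORY.DMP",
    r"srv*c:\symbols*http://msdl.microsoft.com/download/symbols",
)
# Show the command as it would be typed at a Command Prompt.
print(subprocess.list2cmdline(cmd))
# On a machine with the debugger installed: subprocess.Popen(cmd)
```

Printing the command first makes it easy to check the paths before anything is launched.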