The NT Insider

Bagging Bugs - Avoidance and Detection Tips to Consider
Guest article contributed by Nathan Bushman (By: The NT Insider, Vol 9, Issue 5, Sep-Oct 2002 | Published: 15-Oct-02| Modified: 10-Oct-02)

Every time I come across a handy driver development tip that will help me to avoid bugs or detect them earlier, I jot it down. Some of these tips are obvious, but I write them down anyway because I need the reminder. Others are less obvious and perhaps they'll help you to discover or avoid a bug.

Catch 'em Automatically

Use the Driver Verifier. On your test machine just run "verifier" from the Start | Run dialog. Driver Verifier will let you perform a number of tests on the driver of your choice. Click on the Settings tab and select the drivers that you want Verifier to test. Then, in the same tab, select which tests you want Verifier to perform. You should at a minimum select the Special Pool, Force IRQL Checking, Pool Tracking and I/O Verification Level 1 tests.

Test your driver on both the free and checked builds of the OS. The checked OS is full of assertions that will help you to find bugs. If you're testing with a hybrid OS (a free OS with checked versions of the HAL and kernel), you should also add checked versions of the other drivers that your driver interacts with. For example, if you're testing a storage filter driver on a hybrid system, then in addition to the checked HAL and kernel, you should also turn off SFP (download OSR's free SFPControl at www.osr.com) and install the checked versions of ntfs.sys, fastfat.sys, diskperf.sys, ftdisk.sys, disk.sys, classpnp.sys, scsiport.sys, etc.

If Driver Verifier finds a problem, or when the checked build of the OS asserts, you'll end up with a blue screen. It'll save you quite a bit of time if you have a kernel debugger such as WinDbg attached to your test systems so that these bug checks will simply trap in the debugger, allowing you to analyze the problem immediately. Be sure to generate symbol files for your driver so that you can perform source-level debugging. You do this in your SOURCES file, which I'll discuss later.

Enable pool tagging on your test system, and be sure to use the "WithTag" variations of any function calls that allocate memory. Pool tagging will help you to discover memory leaks because it will associate a tag (which you provide in the allocation call) with each memory allocation. Download PoolTag (free at www.osr.com). It's a nice GUI app that will both enable pool tagging and display the amount of memory allocated to each tag.
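To see why per-tag accounting makes leaks easy to localize, here is a minimal user-mode sketch of the idea. It is only an analogue: in a driver the kernel keeps these counts for you when you allocate with ExAllocatePoolWithTag(). The names TagAlloc, TagFree, TagBytesInUse and the two tag values are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_TAGS 16

/* Hypothetical tags; each reads as four characters in a little-endian dump. */
#define TAG_READ 0x64616552u   /* "Read" */
#define TAG_CTXT 0x74787443u   /* "Ctxt" */

typedef struct _TAG_STATS {
    uint32_t Tag;          /* 0 marks an unused slot */
    size_t   BytesInUse;   /* bytes currently allocated under this tag */
} TAG_STATS;

typedef struct _TAG_HEADER {   /* hidden header placed before each block */
    uint32_t Tag;
    size_t   Size;
} TAG_HEADER;

static TAG_STATS g_stats[MAX_TAGS];

/* Find the slot for a tag, claiming the first empty one if needed. */
static TAG_STATS *FindSlot(uint32_t tag)
{
    int i;

    for (i = 0; i < MAX_TAGS; i++) {
        if (g_stats[i].Tag == tag || g_stats[i].Tag == 0) {
            g_stats[i].Tag = tag;
            return &g_stats[i];
        }
    }
    return NULL;   /* table full */
}

/* Allocate a block and charge its size to the given tag. */
void *TagAlloc(size_t size, uint32_t tag)
{
    TAG_STATS  *slot = FindSlot(tag);
    TAG_HEADER *hdr;

    if (slot == NULL) {
        return NULL;
    }
    hdr = (TAG_HEADER *)malloc(sizeof(*hdr) + size);
    if (hdr == NULL) {
        return NULL;
    }
    hdr->Tag = tag;
    hdr->Size = size;
    slot->BytesInUse += size;
    return hdr + 1;
}

/* Free a block from TagAlloc and credit its size back to its tag. */
void TagFree(void *block)
{
    TAG_HEADER *hdr = (TAG_HEADER *)block - 1;

    FindSlot(hdr->Tag)->BytesInUse -= hdr->Size;
    free(hdr);
}

/* Bytes still outstanding under a tag; nonzero at unload suggests a leak. */
size_t TagBytesInUse(uint32_t tag)
{
    TAG_STATS *slot = FindSlot(tag);

    return slot ? slot->BytesInUse : 0;
}
```

If every subsystem of the driver allocates under its own tag, a count that keeps growing under one tag points straight at the code that owns that tag.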

Test your driver on multiprocessor systems. The more processors the better. Multiprocessor systems are more likely to reveal synchronization bugs in your code, such as missing synchronization where it's needed, or deadlocks.

Run your driver against test suites including any applicable HCTs and third-party programs (e.g., IOMeter for storage drivers). Build in unit tests that fully test the features of your driver. Make it possible for these tests to be run by a script. This will enable you to run a daily smoke test to see if any of the code you've added that day has broken pre-existing code.

Catch 'em While Coding

Use ASSERT() or ASSERTMSG() for all assumptions in your code. Every function should have an ASSERT() that checks the current IRQL to see if it's within the assumed range. Every function that's internal to the driver (that can't be called by code outside of the driver) should have ASSERTs that validate parameter values. If a function assumes that it's running in the system process context, it should perform an ASSERT to see if its process is the system process. By using ASSERTs to check IRQL and process context, you'll self-document these requirements for every function. Keep in mind that ASSERT() and ASSERTMSG() are macros that expand to nothing in free (release) builds, so never perform any actual work within the expressions that you pass to an ASSERT().
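The warning about doing real work inside an assertion can be demonstrated in user mode. The sketch below is not DDK code: DRV_DBG, DRV_ASSERT and TakeReference() are hypothetical stand-ins for the DBG build flag, ASSERT() and a driver-internal helper. It is compiled as if for a free build, so the assertion expands to nothing.

```c
#include <assert.h>

/* Hypothetical stand-in for the DDK's DBG flag: 0 models a free build. */
#define DRV_DBG 0

/* Hypothetical stand-in for ASSERT(): vanishes when DRV_DBG is 0. */
#if DRV_DBG
#define DRV_ASSERT(expr) assert(expr)
#else
#define DRV_ASSERT(expr) ((void)0)
#endif

static int g_refCount = 0;

/* Hypothetical helper that does real work and reports success. */
static int TakeReference(void)
{
    g_refCount++;
    return 1;
}

/* WRONG: the call lives inside the assertion, so a free build never makes it. */
int AcquireWrong(void)
{
    g_refCount = 0;
    DRV_ASSERT(TakeReference());
    return g_refCount;
}

/* RIGHT: do the work unconditionally, then assert on the saved result. */
int AcquireRight(void)
{
    int ok;

    g_refCount = 0;
    ok = TakeReference();
    DRV_ASSERT(ok);
    (void)ok;   /* avoids an "unused variable" warning in the free build */
    return g_refCount;
}
```

In this free-build configuration AcquireWrong() leaves the reference count at zero while AcquireRight() takes the reference. Setting DRV_DBG to 1 makes both behave the same, which is exactly why this kind of bug tends to show up only after release.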

Functions in a driver that may be the first functions entered from user mode must validate their parameters. This check should always occur, regardless of whether your driver is built in the free or checked environment. It should not be done using ASSERT(). Such functions include your driver's dispatch, AddDevice, DriverEntry, and Unload routines.
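As a rough illustration of validation that survives in every build, the following user-mode sketch checks a caller-supplied buffer with ordinary code and returns a status instead of asserting. The request layout, the limits and the DRV_STATUS_* codes are all hypothetical; a real dispatch routine would return NTSTATUS values and take its buffer and length from the IRP.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical status codes standing in for NTSTATUS values. */
#define DRV_STATUS_SUCCESS           0
#define DRV_STATUS_INVALID_PARAMETER 1
#define DRV_STATUS_BUFFER_TOO_SMALL  2

typedef struct _SET_RATE_REQUEST {   /* hypothetical IOCTL input layout */
    unsigned int Version;
    unsigned int RateHz;
} SET_RATE_REQUEST;

/* Always-on validation: no ASSERT(), so free builds are protected too. */
int ValidateSetRateRequest(const void *buffer, size_t length)
{
    const SET_RATE_REQUEST *req = (const SET_RATE_REQUEST *)buffer;

    if (buffer == NULL) {
        return DRV_STATUS_INVALID_PARAMETER;
    }
    if (length < sizeof(*req)) {
        return DRV_STATUS_BUFFER_TOO_SMALL;   /* check size before reading */
    }
    if (req->Version != 1) {
        return DRV_STATUS_INVALID_PARAMETER;
    }
    if (req->RateHz == 0 || req->RateHz > 48000) {
        return DRV_STATUS_INVALID_PARAMETER;
    }
    return DRV_STATUS_SUCCESS;
}
```

Internal helpers that receive an already validated request can then use ASSERTs to document what they expect.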

Beware of macros. Many of the APIs in the DDK are actually implemented as macros, and while they may be well written, it's best to be cautious when using any macro. Never pass arguments to a macro that are expressions with side effects because the macro might just evaluate the expression multiple times. Use block notation for conditional expressions to avoid problems with poorly-written multi-statement macros. For example:

if (flag) DoSomething();

If DoSomething() is actually a multi-statement macro, then only the first statement in the macro would be conditionally executed. Be safe and use block notation for your conditional expressions, like:

if (flag) { DoSomething(); }
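Both macro hazards are easy to reproduce. In this small sketch the macros BAD_DO_SOMETHING, GOOD_DO_SOMETHING and UNSAFE_MAX are hypothetical and were written for this illustration; they are not DDK macros.

```c
#include <assert.h>

static int g_logged = 0;   /* bumped by the macro's first statement */
static int g_calls  = 0;   /* bumped by the macro's second statement */

/* Poorly written: two statements with nothing holding them together. */
#define BAD_DO_SOMETHING()   g_logged++; g_calls++

/* Safer: do { ... } while (0) makes the macro behave as one statement. */
#define GOOD_DO_SOMETHING()  do { g_logged++; g_calls++; } while (0)

/* Evaluates its winning argument twice, like many MAX-style macros. */
#define UNSAFE_MAX(a, b)     ((a) > (b) ? (a) : (b))

int CallsWithoutBraces(int flag)
{
    g_logged = g_calls = 0;
    if (flag) BAD_DO_SOMETHING();   /* only g_logged++ is guarded by the if */
    return g_calls;
}

int CallsWithBraces(int flag)
{
    g_logged = g_calls = 0;
    if (flag) { BAD_DO_SOMETHING(); }   /* braces guard both statements */
    return g_calls;
}

int CallsWithGoodMacro(int flag)
{
    g_logged = g_calls = 0;
    if (flag) GOOD_DO_SOMETHING();   /* safe even without braces */
    return g_calls;
}

/* Returns how many times the i++ argument was actually evaluated. */
int EvaluationsOfSideEffect(void)
{
    int i = 5;
    int start = i;
    int m = UNSAFE_MAX(i++, 3);

    (void)m;
    return i - start;
}
```

With flag set to 0, CallsWithoutBraces() still runs the second statement, while the braced and do/while versions run nothing. EvaluationsOfSideEffect() shows the increment being applied twice for a single apparent call.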

Remember to synchronize access to any resources that can be touched by more than one thread at the same time. It's far better to experience a performance hit due to synchronization code and to have your code properly synchronized than to have code that's a little faster but full of bugs.

Don't touch paged memory at IRQL >= DISPATCH_LEVEL. Also, keep in mind that the UNICODE character tables are in paged memory, so if you attempt to use a DbgPrint() statement to print out a WCHAR/UNICODE string, the system will attempt to access paged memory in order to convert that string to text that can be sent to the attached kernel debugger. At DISPATCH_LEVEL, the statement DbgPrint("%S", unicodeString.Buffer); will cause a 0x0A (IRQL_NOT_LESS_OR_EQUAL) bug check.

Some entry points to your driver such as AddDevice() are documented to be called at PASSIVE_LEVEL. For this reason these functions are often placed in a pageable code section. Be aware that if you use a spinlock in such a function, you must not place the code for the function in a pageable code section. The reason for this is that the uniprocessor kernel will raise IRQL to DISPATCH_LEVEL for the duration of ownership of the spinlock.

Avoid stack overflows. Every NT thread has two stacks, a user-mode and a kernel-mode stack. The user-mode stack is quite large, but the kernel-mode stack is limited to 12K. To avoid overflowing the kernel stack, avoid deeply nested function calls, large local variables, and recursive calls. Also, some system calls (e.g., registry I/O calls) consume quite a bit of stack. You may find it necessary to farm off some of your API calls to worker threads whose stacks are nearly empty. You can use ASSERTs with IoGetRemainingStackSize() to quickly locate areas in your code where you might overflow the stack.

Avoid undocumented system calls. The behavior of these calls can change from one implementation to the next, and even between service packs. If you do use undocumented calls, document exactly which ones you used and the reasons you were forced to use them.

Be very careful to avoid deadlocks if your driver initiates new I/O that will pass through a device stack in which your driver participates. This is particularly true if your driver blocks on that new I/O.

Keep in mind that the data that an IRP's MDL refers to can be changing while your driver owns the IRP. An MDL's data is not static.

All completion routines that return STATUS_SUCCESS must include this code:

if (Irp->PendingReturned) { IoMarkIrpPending(Irp); }

This is necessary in order to propagate the PendingReturned flag up to the top of the stack so that the IRP can be properly freed upon completion.

Zero out (or fill with some other recognizable, non-random value) structs that you are about to free. This will help you to identify when you are referencing memory that has already been freed. If you leave the old data in the struct, code that uses a stale pointer to the freed memory will probably find plausible values there and carry on without complaint. This technique is easily accomplished by #defining a new version of the free-memory function that, if DBG is defined, will zero out the memory before freeing it.
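One possible shape for such a wrapper is sketched below in user-mode C, with free() standing in for the kernel's free routine. DRV_DBG, DRV_FREE, DrvScrub, DrvLooksFreed, the 0xDD fill value and REQUEST_CONTEXT are all hypothetical names chosen for this illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the DDK's DBG flag: 1 models a checked build. */
#define DRV_DBG 1

/* Recognizable fill pattern; zero works too, as the text suggests. */
#define DRV_FREED_FILL 0xDD

/* Overwrite a block so stale pointers see an obvious pattern, not old data. */
void DrvScrub(void *block, size_t size)
{
    if (block != NULL) {
        memset(block, DRV_FREED_FILL, size);
    }
}

/* Hypothetical free wrapper: scrub first in checked builds, then free. */
#if DRV_DBG
#define DRV_FREE(ptr, size) do { DrvScrub((ptr), (size)); free(ptr); } while (0)
#else
#define DRV_FREE(ptr, size) free(ptr)
#endif

typedef struct _REQUEST_CONTEXT {   /* hypothetical driver structure */
    int   State;
    void *Owner;
} REQUEST_CONTEXT;

/* Returns 1 if every byte of the block carries the fill pattern. */
int DrvLooksFreed(const void *block, size_t size)
{
    const unsigned char *p = (const unsigned char *)block;
    size_t i;

    for (i = 0; i < size; i++) {
        if (p[i] != DRV_FREED_FILL) {
            return 0;
        }
    }
    return 1;
}
```

The wrapper takes the size explicitly because it has to know how many bytes to overwrite. Be aware that an optimizing compiler may discard a fill that is immediately followed by a free, which is one more reason to enable this only in unoptimized checked builds.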

Use structured exception handling whenever you access user-mode virtual memory, such as in calls to MmMapLockedPagesSpecifyCache() or MmMapLockedPages() when mapping pages into user address space, or calls to MmProbeAndLockPages(), ProbeForRead() and ProbeForWrite().

Discover 'em in the Debugger

Step through every line of your code in a symbolic debugger like WinDbg before you release your driver. Stepping through your code is the best way to see if you actually implemented what you intended to implement. You will need the symbol files (.PDB files) for your test system's OS and for the driver that you're testing. WinDbg and its documentation have improved tremendously over the past couple of years. You'll find that the documentation contains detailed instructions for discovering the causes of the vast majority of the bug checks that you'll encounter.

Use cookies in your structs so that you can identify them when you're poking around in memory or in a crash dump file.
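A cookie is simply a fixed signature stored in a known field of the struct. The sketch below shows one way to do it; DEVICE_CONTEXT, its fields and the "DevX" value are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical cookie: the bytes read "DevX" in a little-endian memory dump. */
#define DEVICE_CONTEXT_COOKIE 0x58766544u

typedef struct _DEVICE_CONTEXT {   /* hypothetical driver structure */
    uint32_t Cookie;               /* first field, so it leads any dump */
    int      OpenCount;
} DEVICE_CONTEXT;

/* Stamp the cookie when the structure is initialized. */
void InitDeviceContext(DEVICE_CONTEXT *ctx)
{
    memset(ctx, 0, sizeof(*ctx));
    ctx->Cookie = DEVICE_CONTEXT_COOKIE;
}

/* Returns 1 when the pointer refers to an initialized DEVICE_CONTEXT. */
int IsDeviceContext(const DEVICE_CONTEXT *ctx)
{
    return ctx != NULL && ctx->Cookie == DEVICE_CONTEXT_COOKIE;
}
```

Besides making the struct easy to spot when you search memory for the signature, the same check can back an assertion in any internal function that receives the struct by pointer.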

If a driver has trampled over critical data structures in memory, the crash dump file may not be usable. Before using a kernel debugger to analyze a crash dump file, run dumpchk.exe on the file to see whether it's corrupt.

Get the Bugs that Got Away

It's highly likely that there will be bugs in the drivers that you ship. It's therefore critical that you have a support plan in place to deal with crash dump files that are sent to you by your customers. There are several types of files that you should generate at build time and preserve for future use that can help you to locate bugs in your shipping drivers. You should alter your SOURCES file to instruct the build process to generate these useful files. Figure 1 is a sample SOURCES file for the W2K DDK build environment that will generate all of the files described below.

# This sample SOURCES file is for use with the W2K DDK
# Specify the target file/type of the build (MyDriver.sys)
TARGETNAME=MyDriver
TARGETPATH=obj
TARGETTYPE=DRIVER

# Produce the same symbolic information for both free & checked builds.
# This will allow us to perform full source-level debugging on both
# builds without affecting the free build's performance
!IF "$(DDKBUILDENV)" != "checked"
NTDEBUG=ntsdnodbg
NTDEBUGTYPE=both
USE_PDB=1
!ELSE
NTDEBUG=ntsd
NTDEBUGTYPE=both
USE_PDB=1
!ENDIF

# Set compiler optimizations:
# /Ox - Full optimization enabled
# /Os - Favor small code (size over speed) when optimizing
# /Od - Disable all optimizations
# /Oi - Enable optimization for intrinsic functions
# /Fc$*.cod - Generate mixed assembler/source code files
#

# For both checked and free builds, make sure that any intrinsic
# functions are compiled correctly. To do this, ensure that /Oi
# is selected for both free and checked builds. There is a bug in
# VC++ 6.0 (at least through SP4) where, if you specify any
# intrinsic functions in your code with "#pragma intrinsic" but
# you don't have the /Oi optimization enabled, neither a call
# to the function, nor the intrinsic inline version of the function
# will end up in your object code. This bug only applies to free
# builds, but just to be safe we'll make sure that the flag is
# enabled for all builds.
!IF "$(DDKBUILDENV)" != "checked"
MSC_OPTIMIZATION=/Ox /Os /Oi /Fc$*.cod
!ELSE
MSC_OPTIMIZATION=/Od /Oi /Fc$*.cod
!ENDIF

# Generate a linker map file just in case we need one for debugging
!IF "$(DDKBUILDENV)" != "checked"
LINKER_FLAGS=$(LINKER_FLAGS) -MAP:.\objfre\i386\$(TARGETNAME).map \
-MAPINFO:EXPORTS -MAPINFO:LINES -MAPINFO:FIXUPS
!ELSE
LINKER_FLAGS=$(LINKER_FLAGS) -MAP:.\objchk\i386\$(TARGETNAME).map \
-MAPINFO:EXPORTS -MAPINFO:LINES -MAPINFO:FIXUPS
!ENDIF

# Generate a browser information file for use in IDE development
BROWSER_INFO=1
BROWSERFILE=$(TARGETNAME).BSC -n

# Set the compiler's warning level
MSC_WARNING_LEVEL=-W3 -WX

# Specify the files to be used in the build
INCLUDES=$(BASEDIR)\inc;.

SOURCES=MyDriver.c

Figure 1 - Sample SOURCES File

First, and most importantly, you should generate .PDB symbol files. While the XP DDK will generate .PDB files automatically, the NT4 and W2K DDKs will need explicit instructions in the SOURCES file. These .PDB files will enable you to perform source-level (rather than assembly) debugging. Do not release your .PDB files to the public, as the function names, types and line information they contain make it far easier to reverse engineer your driver.

Some kernel debuggers have the ability to show your source code mixed with assembly language while you are debugging. WinDbg cannot (to my knowledge) do this, but you can generate mixed source/assembly code output at build time in the form of .COD files.

Another useful file to generate is a .MAP file which is a table of the mappings of symbols to addresses that were assigned by the linker.

You may also find it helpful to create a debugger extension DLL that will give you extra commands in WinDbg specifically tailored to debugging your driver. WinDbg comes with sample code for a simple debugger extension DLL. They're very easy to write.

Conclusion

Drivers seem to beat the bad programming practices out of you because it's simply too painful to reboot all the time. These techniques are the lessons learned from many beatings. Hopefully they'll help you to avoid a bruise or two.

OSR would like to thank Nathan Bushman of PowerQuest Corporation for his guest article contribution. He can be reached at: nate.bushman@powerquest.com

This article was printed from OSR Online http://www.osronline.com