
I've Got Work To Do - Worker Threads & Work Queues

 

There are times in your device driver when you may want to perform some sort of work “out of line” from the current processing.  Some common examples of these cases are:

  

  • You want to return control to the calling thread, and continue some sort of lengthy processing in a different context.
  • There’s some processing that you need to perform at IRQL PASSIVE_LEVEL, but the thread of execution you’re in is running at IRQL DISPATCH_LEVEL.
  • You want to allow a single requesting thread to make multiple calls into your driver, with each of those calls continuing to execute in parallel.
  • You want to limit the amount of parallel execution in your driver code but you don’t want to block the requestor threads’ execution to achieve this limitation on parallelism.

In each of these cases, what you need is one or more worker threads.  Worker threads are just what the name implies: Kernel mode threads that are used to perform a particular work assignment on behalf of a driver.  The NT Executive provides a built-in worker thread pool that can be used for many purposes.  Alternatively, you can create your own group of worker threads to meet special needs.

 

This article explains how to use the NT Executive’s worker thread routines, as well as how to build your own.  As you’ll see, either one is a very simple process.

 

Using NT's Executive Work Items

NT’s Executive work items make using worker threads really easy.  First, you create a worker routine that does whatever work you need done.  This is the function that the worker thread package will call.  The routine returns no value, and takes a single PVOID as its parameter.

 

Next, you allocate and initialize a work item.  This work item contains a pointer to your worker routine, and the value for the parameter to pass to your worker routine as an argument.

 

Finally, when you want your work routine to be executed, you call a function to request that the Executive queue the work item on one of its internal work queues.  Your choices of work queue are the DelayedWorkQueue and the CriticalWorkQueue.  We’ll discuss the differences between these two queues later.

 

After you queue the work item, as soon as an Executive worker thread is free, it will remove the work item from the queue, and call the indicated worker routine.  The rest, as they say, is up to you.  The actual code needed to do the above is extremely simple.  Consider the following code (Figure 1) to allocate, initialize, and queue a work item.

 

PWORK_QUEUE_ITEM WorkItem;

WorkItem = ExAllocatePool(NonPagedPool, sizeof(WORK_QUEUE_ITEM));

// NB: production code must check WorkItem for NULL before using it

ExInitializeWorkItem(WorkItem, WorkRoutine, WorkItem);

ExQueueWorkItem(WorkItem, DelayedWorkQueue);

 

Figure 1

 

Pretty simple, huh?

 

This code would work with the trivial work routine shown in Figure 2.

 

VOID
WorkRoutine(PVOID Parameter)
{
    DbgPrint("In the work routine.  Parameter = %p\n",
             Parameter);

    ExFreePool(Parameter);
}

 

Figure 2

 

The worker routine will always be called by the Executive at IRQL PASSIVE_LEVEL, in the context of the system process.

 

Note that while the Executive is responsible for removing the work item from its work queue, the worker routine is responsible for returning the work item to pool.  This is why we pass the address of the work item to the worker routine as its argument.

 

The only problem with this approach is that it doesn’t really allow us to pass any parameters to the work routine: the single available parameter is already used to pass a pointer to the work item.  We can fix this, obviously, by creating our own private data structure that contains any parameters we want to pass to our worker routine, plus a pointer to the work item.  We would then pass the worker routine a pointer to our data structure as the parameter.  The worker routine would get the parameters out of this data structure, and would also get from it the pointer to the work item to return to pool.

 

You can simplify this approach even further by embedding the WORK_QUEUE_ITEM structure right in your private work queue structure.  That way, only one structure needs to be returned to pool.  The example in Figure 3 illustrates this approach.

 

//
// Definition for OSR Work Queue Item
//
typedef struct _OSR_WORK_ITEM {

    WORK_QUEUE_ITEM    WorkItem;
    PVOID              Param1;
    PVOID              Param2;
    PVOID              Param3;

} OSR_WORK_ITEM, *POSR_WORK_ITEM;


//
// Queue an OSR work item for execution
//
POSR_WORK_ITEM OsrWorkItem;

    //
    // Allocate space (production code must check for NULL)
    //
    OsrWorkItem = ExAllocatePool(NonPagedPool, sizeof(OSR_WORK_ITEM));

    //
    // Set the parameters to pass
    //
    OsrWorkItem->Param1 = (PVOID)0x11111;
    OsrWorkItem->Param2 = (PVOID)0x22222;
    OsrWorkItem->Param3 = (PVOID)0x33333;

    //
    // Init the work item embedded in the private structure
    //
    ExInitializeWorkItem(&OsrWorkItem->WorkItem,  // Item to initialize
                         WorkRoutine,             // Function to call
                         OsrWorkItem);            // Parameter to pass

    //
    // Queue it for execution
    //
    ExQueueWorkItem(&OsrWorkItem->WorkItem,  // Item to queue
                    DelayedWorkQueue);       // Queue to put it on

//
// Worker Routine - Uses OSR Work Item
//
VOID
WorkRoutine(PVOID Parameter)
{
    POSR_WORK_ITEM OsrWorkItem = (POSR_WORK_ITEM)Parameter;

    DbgPrint("In the work routine.  Parameter = %p\n",
             Parameter);

    //
    // Next, we could use the parameters.  Here we just display them
    // to prove we got them
    //
    DbgPrint("Function params 1 = %p, 2 = %p, 3 = %p\n",
             OsrWorkItem->Param1,
             OsrWorkItem->Param2,
             OsrWorkItem->Param3);

    //
    // Free the pool for our private work item
    //
    ExFreePool(OsrWorkItem);
}

 

Figure 3

 

Here, you can see the definition of the OSR_WORK_ITEM.  This structure includes in it a standard NT Executive WORK_QUEUE_ITEM, as well as storage for various parameters to be passed to the worker routine.  The code allocates space for the OSR work item, and fills in the parameters to be passed.  It then initializes the embedded Executive work item within that structure.  Finally, it queues the Executive work item on the DelayedWorkQueue by calling ExQueueWorkItem().

 

As a result of the work item being queued, WorkRoutine() executes as soon as there is a free thread for the delayed work queue.  It does whatever work needs to be done, and then returns the entire OSR work item structure to pool and returns.

 

NT Work Queue Types

So, you can see from the above that NT’s Executive work queues are very easy to access and use.  But how does NT manage its work queues?

 

The NT Executive initializes pools of threads to service three work queues when the system is started.  These three work queues are defined by the WORK_QUEUE_TYPE enum in ntddk.h.  The three work queue types are:

 

  • DelayedWorkQueue
  • CriticalWorkQueue
  • HyperCriticalWorkQueue

 

The DelayedWorkQueue is the one most drivers use for ordinary operations.  The NT Executive apparently dedicates something like three threads to servicing requests from this queue.  These threads run in the upper half of the dynamic priority range.

 

The CriticalWorkQueue is often used by file systems.  Our experience is that the NT Executive dedicates between 3 and 10 threads to this queue.  These threads run in the low part of the real time range.

 

The HyperCriticalWorkQueue is not documented.  We assume the threads servicing this queue are the highest priority of all.

 

It’s important to understand that the work queue threads are dedicated to their particular thread pool.  If there are, for example, three threads for the DelayedWorkQueue, and they are all busy, additional work items queued to this queue will wait until one of the threads servicing this queue is free.  This is true even if there are free threads available from the pool that services one or both of the other work queues.

 

Make It Yours

You might think we were a bit skimpy on the information in that previous section.  “Apparently … three threads” running in “the upper half of the dynamic priority range”.  Not the typical stuff you read in The NT Insider, is it?  What’s the deal?  The deal is that we really don’t care about the details of the NT Executive’s work queues.  Except for very simple, one-time uses, we typically create our own work queue package.

 

Creating your own work queue package is extremely simple.  The advantage of this is that you can create your thread pool exactly the way you want it.  You can define the exact number of threads, their priorities, and how the queue will work.  You also get to define your own work item structure, so it can contain precisely what is necessary for your use.

 

While creating your own worker thread package might seem like a difficult task, it’s really very simple.  Create an initialization routine that initializes a list head for the work items and a synchronization event for the threads to wait on, and then creates the threads by calling PsCreateSystemThread().  Note that the type of event that you use is important: A synchronization event automatically resets after a single blocked thread is signaled.  Code like that in Figure 4 might be used for this purpose.

  

    //
    // Init the list head for the queue of work items
    //
    InitializeListHead(&workerQueue->WorkQueue);

    //
    // Initialize the event
    //
    KeInitializeEvent(&workerQueue->WorkQueueEvent, SynchronizationEvent, FALSE);

    //
    // Start the worker threads.
    //
    for (index = 0; index < workerQueue->NumberOfWorkerThreads; index++) {

        code = PsCreateSystemThread(&workerQueue->WorkQueueThreads[index],
                                    (ACCESS_MASK)0L,
                                    NULL,
                                    NULL,
                                    NULL,
                                    OsrWorkerStart,   // Initial function to call
                                    workerQueue);     // Parameter to pass

        if (!NT_SUCCESS(code)) {

            OsrBugCheck();
        }
    }

 

Figure 4

 

In the code in Figure 4, we create a set of worker threads by calling PsCreateSystemThread(). We store the thread handle returned to us in the structure workerQueue->WorkQueueThreads, indexed by the thread index.  When each of the threads is initially started, it calls OsrWorkerStart().  A sample definition for this function appears in Figure 5.

 

static VOID OsrWorkerStart(PVOID Context)
{
    POSR_WORKERQUEUE workerQueue = (POSR_WORKERQUEUE)Context;
    PWORK_QUEUE_ITEM workItem;
    KIRQL oldIrql;    // local: sharing saved IRQL among threads is unsafe

    while (TRUE) {

        workItem = NULL;  // assume there's nothing to do

        //
        // Lock the list of work items
        //
        KeAcquireSpinLock(&workerQueue->WorkQueueSpinLock, &oldIrql);

        if (!IsListEmpty(&workerQueue->WorkQueue)) {

            //
            // The List entry is the first field of the WORK_QUEUE_ITEM,
            // so this cast works; CONTAINING_RECORD would be more general
            //
            workItem = (PWORK_QUEUE_ITEM)RemoveHeadList(&workerQueue->WorkQueue);
        }

        //
        // Drop the lock
        //
        KeReleaseSpinLock(&workerQueue->WorkQueueSpinLock, oldIrql);

        //
        // Do we have something to do?  If not, we go to sleep and wait.
        //
        if (!workItem) {

            //
            // Wait until we're awakened
            //
            KeWaitForSingleObject(&workerQueue->WorkQueueEvent,
                                  Executive,
                                  KernelMode,
                                  FALSE,
                                  NULL);
            continue;
        }

        //
        // Call the entry indicated in the work item
        //
        (*workItem->WorkerRoutine)(workItem->Parameter);
    }
}

 

Figure 5

 

As you can see from Figure 5, all this function does is attempt to remove a work item from a linked list that is protected with a spin lock.  If it is successful in removing the work item, it calls the indicated work routine passing the parameter from the work item.  If there is nothing in the work item list, the function waits for an event to be set.

 

It is in this worker thread start routine where you would alter the default priority of each worker thread, by calling KeSetBasePriorityThread(), for example.

 

Queuing a work item is simply a matter of placing an entry on a list, and setting the appropriate event (See Figure 6).

 

VOID OsrQueueWorkItem(POSR_WORKERQUEUE workerQueue, PWORK_QUEUE_ITEM WorkItem)
{
    KIRQL oldIrql;

    //
    // Add the work item to the list of things to do.
    //
    KeAcquireSpinLock(&workerQueue->WorkQueueSpinLock, &oldIrql);

    InsertTailList(&workerQueue->WorkQueue, &WorkItem->List);

    KeReleaseSpinLock(&workerQueue->WorkQueueSpinLock, oldIrql);

    //
    // Wake one of the worker threads
    //
    KeSetEvent(&workerQueue->WorkQueueEvent, 0, FALSE);
}

 

Figure 6

Variations

The custom work queue package illustrated in this article doesn’t do anything particularly different from the standard NT Executive work queue package.  Of course, you could modify the number of available threads and their priorities extremely easily, which would certainly be useful.

 

But there are other variations on the theme of work queues as well.  The number of available threads in the work queue could grow dynamically, so that there is always a thread available to process any queued request.  Or only a single thread could be created.  This serializes the work performed on the queue (eliminating problems of shared data structure access).

 

Downsides

If work queues are so cool, why don’t device drivers make more use of them?  Basically, because they don’t typically need to.

 

The standard device driver architecture requires us to queue requests that arrive at our driver, and return STATUS_PENDING to the requestor, if we can’t immediately complete the request.  Due to the way devices work, we’ll typically be awakened by our ISR and DpcForIsr when the request that’s pending on the device completes.  When that request completes, we’ll typically tell the operating system it is done and try to dequeue and start another.  There just isn’t much call for worker threads.

 

Also, using worker threads does have its downside.  Every time your worker thread needs to be started, you’ll need to context switch into that thread.  Some device operations take less time to complete than a context switch!  So, farming every device request off for processing by a worker thread pool just isn’t cool.

 

Summing Up

Work queues can be very handy.  We’ve really only scratched the surface of how they can be used.  The built-in NT Executive work queues are easy to use.  Alternatively, you can design your own thread package.  This lets you customize the attributes of the thread pool.  If you’ve got an operation to perform, and can accept the overhead of a context switch to get that operation performed, consider worker threads.

 

User Comments

"RE: HyperCriticalWorkQueue"
This article is REALLY old (1998!) and the meaning of these values has changed over time. The current meaning of these values can be found at http://msdn.microsoft.com/en-us/library/windows/hardware/ff566382(v=vs.85).aspx

03-Nov-14, Scott Noone


"HyperCriticalWorkQueue"
From http://www.decuslib.com/decus/vmslt00a/nt/ntpscod.txt "Note that the term 'hypercritical' used for one of the worker queues is actually a misnomer since its thread runs at a priority lower than critical worker threads. Its purpose is process cleanup when processes exit."

16-Oct-14, Roland Pihlakas


"Really helpful~~"

12-Feb-04, Fred Liu


"API Obsolete, new one is IoQueueWorkItem"
A very good article.

It is worth adding that ExQueueWorkItem is "obsolete". IoQueueWorkItem is the replacement. The latter is supported on Win Me, Win 2000 and later. Is ExQueueWorkItem supported on Win98 SE?

15-Jul-03, Joe Nord

