Baffling KeReleaseMutex() and STATUS_MUTANT_NOT_OWNED Exceptions

OSR_Community_User · December 31, 2012, 4:56pm

I’m sporadically getting these inexplicable blue screens on calling the KeReleaseMutex() API.

EXCEPTION_CODE: (NTSTATUS) 0xc0000046 - An attempt to release a mutant object was made by a thread that was not the owner of the mutant object.

The code looks like this:

KeWaitForMutexObject(
    &sendingState->mutexForSendingState,
    Executive,
    KernelMode,
    TRUE,
    NULL);
if(sendingState->SendingCancelled == TRUE)
{
    KeReleaseMutex(&sendingState->mutexForSendingState, FALSE);
    return;
}

It works well almost all the time, but on occasion I get the above blue screen.

Should I be checking the return value of KeWaitForMutexObject() and calling KeReleaseMutex() only in case of success?

From the documentation it looks like all possible return values of KeWaitForMutexObject() are NT_SUCCESS, so what am I doing wrong here?

Should I be using a try - catch block to trap the exception, in order to prevent the blue screen?

Ken_Johnson · December 31, 2012, 5:29pm

Yes, you need to check the return value of KeWaitForMutexObject in the wait configuration that you supplied, unless you are guaranteed to only execute this code on a system thread and never permit the mutex to become abandoned. (And not checking the return value in that situation is bad practice as it results in the code only working due to subtle and easily overlooked circumstances.)

Consider the case where this code is running on a user mode thread, and thread termination is requested for this thread while your code is waiting in KeWaitForMutexObject(). Because you have declared the wait to be kernel mode alertable, the routine will return with STATUS_ALERTED, but without granting ownership of the mutex. Only STATUS_SUCCESS (or STATUS_WAIT_0 + n, in the case of a wait multiple service request where ‘n’ is the index of the signaled object), or STATUS_ABANDONED_WAIT_0 (+ n in the case of a wait multiple service request) indicate that mutex ownership has been granted.

Subsequently, your logic may attempt to then release the mutex without having ever gained ownership of it, which is illegal.
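
To make the point about return values concrete, here is a minimal sketch; it is not code from the driver in question. It is plain C so that it compiles outside the WDK: the NTSTATUS typedef, the status constants, and NT_SUCCESS are local stand-ins (in a driver they come from the WDK headers), and WaitGrantedMutexOwnership is a hypothetical helper name, not a kernel routine. It shows why testing NT_SUCCESS() is not enough: STATUS_ALERTED is a success-severity code, yet it does not grant ownership.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins so the sketch builds without the WDK. In a driver these
   definitions come from the WDK headers; the numeric values below match
   the published NTSTATUS definitions. */
typedef int32_t NTSTATUS;
#define STATUS_SUCCESS           ((NTSTATUS)0x00000000)
#define STATUS_ABANDONED_WAIT_0  ((NTSTATUS)0x00000080)
#define STATUS_USER_APC          ((NTSTATUS)0x000000C0)
#define STATUS_ALERTED           ((NTSTATUS)0x00000101)
#define STATUS_TIMEOUT           ((NTSTATUS)0x00000102)
#define NT_SUCCESS(s)            (((NTSTATUS)(s)) >= 0)

/* Hypothetical helper (an assumption of this sketch): returns true only
   for the single-object wait results that mean the calling thread now
   owns the mutex. Every other result, including the "successful" ones
   such as STATUS_ALERTED, means the mutex must NOT be released. */
bool WaitGrantedMutexOwnership(NTSTATUS status)
{
    return status == STATUS_SUCCESS || status == STATUS_ABANDONED_WAIT_0;
}
```

In the driver, the value returned by KeWaitForMutexObject() would be passed to a check like this, and KeReleaseMutex() would be called only when the check returns true.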

S (Msft)

NTDEV is sponsored by OSR

For our schedule of WDF, WDM, debugging and other seminars visit:
http://www.osr.com/seminars

To unsubscribe, visit the List Server section of OSR Online at http://www.osronline.com/page.cfm?name=ListServer

Doron_Holan · December 31, 2012, 5:30pm

Where else do you release the mutex? Do you hold the mutex while sending I/O? If yes, is there a code path on the I/O completion path that releases the mutex?

d

Ken_Johnson · December 31, 2012, 5:58pm

I should mention that the right choice of action here is likely to switch to a guarded mutex or fast mutex instead of a KMUTEX, provided that you need a schedulable synchronization mechanism. Otherwise, use a spin lock instead.

If you really must use a KMUTEX, you probably really want to use a nonalertable kernel mode wait here, so that you do not have the wait unexpectedly interrupted during process termination (you shouldn’t typically design the duration of mutex lock hold times to be long, so this is probably the right choice if you can’t switch to a guarded or fast mutex).

If you are allowing the mutex to become abandoned, don’t do that. Abandoned mutexes are typically only legitimately used to synchronize with user mode processes that might become terminated at any time, and even then, the actually valuable use cases are few and far between. Allowing a user mode program to synchronize your internal kernel mode data structures is a very bad idea, so you are (hopefully) not doing this.

S (Msft)

OSR_Community_User · December 31, 2012, 7:48pm

Thank you. I’m trying to maintain an old driver written by someone else, and the challenge with switching to fast / guarded mutexes and spin locks across the board is that the code is highly recursive, and I’m pretty sure a lot of deadlocks will result.

I’m trying to understand what is happening, and have some more questions for you:

In the call to KeWaitForMutexObject(), the driver currently specifies “Executive” as the “Wait reason” everywhere. Is there a reasonable way to determine the cases where specifying “UserRequest” is more appropriate?

Or should I just specify False for alertable, and that will take care of the alerting issue in all these cases?

The DDK docs suggest that KeReleaseMutex() can raise STATUS_ABANDONED and STATUS_MUTANT_NOT_OWNED exceptions. Does this mean that I can use a try - catch block to trap these exceptions, and avert a blue screen?

Ken_Johnson · December 31, 2012, 11:30pm

There is a blurb with a general recommendation on choosing between those two wait reasons in the MSDN documentation for KeWaitForSingleObject (and similar kernel wait services), but there isn’t a hard and fast rule. Generally, UserRequest is intended to indicate a wait on behalf of some operation by user mode. There isn’t a behavioral difference presently built in to the kernel between these two wait reasons; the only difference is what tools that query the wait reason (e.g. the debugger) display.

If you recursively take the lock and can’t easily move from a KMUTEX, then I would recommend setting Alertable to FALSE in most circumstances. This assumes that code running with the mutex held doesn’t perform long waits or other operations that could result in indefinitely holding up other code attempting to obtain synchronization on the lock. It is generally atypical to specify an alertable kernel mode wait unless you want to explicitly take action to break out of a long wait to allow thread termination to complete in a timely manner. If you make the wait non-alertable, then a thread termination request won’t interrupt the wait on the mutex object.

It is possible to catch these exceptions from KeReleaseMutex, but I would not recommend doing so in your situation. It is better to fix the code so that it keeps track of the ownership itself (otherwise you may perform an operation that you believe to be guarded by the lock, but which is not actually guarded because the code failed to obtain it, and that may result in corruption that is more difficult to diagnose).
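
One way to picture "keeping track of the ownership itself" is a small per-acquisition guard. The sketch below is only an illustration under stated assumptions, not code from the driver or from the WDK: LOCK_GUARD, GuardAcquire, GuardRelease, and the function-pointer types are hypothetical names, the status constants are local stand-ins for the WDK definitions, and the Fake* routines are test doubles standing in for thin wrappers around KeWaitForMutexObject() and KeReleaseMutex(). Because a KMUTEX may be acquired recursively, the guard is meant to live on the stack, one per acquisition.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins for WDK definitions so the sketch builds anywhere. */
typedef int32_t NTSTATUS;
#define STATUS_SUCCESS           ((NTSTATUS)0x00000000)
#define STATUS_ABANDONED_WAIT_0  ((NTSTATUS)0x00000080)
#define STATUS_ALERTED           ((NTSTATUS)0x00000101)

/* Hypothetical callback types: in a driver these would wrap
   KeWaitForMutexObject() and KeReleaseMutex() respectively. */
typedef NTSTATUS (*WAIT_FN)(void *mutex);
typedef void (*RELEASE_FN)(void *mutex);

/* Hypothetical guard recording whether THIS acquisition succeeded. */
typedef struct LOCK_GUARD {
    void *mutex;   /* would be a PKMUTEX in a driver */
    bool  owned;   /* true only if the wait granted ownership */
} LOCK_GUARD;

/* Waits on the mutex and remembers whether ownership was granted.
   Callers must not touch the protected data when this returns false. */
bool GuardAcquire(LOCK_GUARD *guard, void *mutex, WAIT_FN wait)
{
    NTSTATUS status = wait(mutex);
    guard->mutex = mutex;
    guard->owned = (status == STATUS_SUCCESS ||
                    status == STATUS_ABANDONED_WAIT_0);
    return guard->owned;
}

/* Releases the mutex only if this guard actually acquired it, so a
   failed or interrupted wait can never lead to an unowned release. */
void GuardRelease(LOCK_GUARD *guard, RELEASE_FN release)
{
    if (guard->owned) {
        guard->owned = false;
        release(guard->mutex);
    }
}

/* Test doubles, for illustration only. */
int g_releaseCalls = 0;
NTSTATUS FakeWaitSuccess(void *mutex) { (void)mutex; return STATUS_SUCCESS; }
NTSTATUS FakeWaitAlerted(void *mutex) { (void)mutex; return STATUS_ALERTED; }
void FakeRelease(void *mutex) { (void)mutex; g_releaseCalls++; }
```

With this shape, an early-return path such as the SendingCancelled check calls GuardRelease() unconditionally, and the release simply does nothing when the wait was interrupted.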

S (Msft)

OSR_Community_User · January 1, 2013, 5:14am

Thank you very much, you have been very helpful.

I will change Alertable to FALSE, and then see if the sporadic blue screens are resolved over the next several days.

OSR_Community_User · January 1, 2013, 8:26am

Big red flag: “highly recursive” is a scary concept in the kernel, where
stacks are small.

There remains the possibility that a stack overrun could corrupt a data
structure, leading to this confusion.

Yes, you should check the return value; your code is making the
assumption that the acquisition has succeeded; if it did not, then
releasing it would be an error.

And, make sure that the KeInitializeMutex has completed before you try to
use it! I’ve seen some nasty code where the KeInitializeMutex might or
might not be called in a PASSIVE_LEVEL thread before a dedicated driver
thread tried to block on the mutex. Make sure the data structure is not
“put into play” before it is fully initialized.

And an attempt to free an unowned mutex is indicative of a much deeper
bug. In the code you showed here, because you don’t know if you acquired
the mutex, you don’t know if you can release it.

One of the exercises in my Systems Programming course (app level) used
mutexes. One of the interesting bugs that nailed most of my students was
the assumption that WaitFor… calls would work. Sometimes, due to a
programming error, they would pass a NULL handle in, then wonder why they
got data corruption. The answer was that they had never acquired the
lock. I forced them to put a test in, and when they saw WAIT_FAILED
coming back, it forced them to look for the reason for this. The kernel
is far less forgiving.
joe

Thank you very much, you have been very helpful.

I will change Alertable to FALSE, and then see if the sporadic blue
screens are resolved over the next several days.Â


OSR_Community_User · January 1, 2013, 8:40am

I forgot to add: yes, you could do __try/__except, but this is only going to
hide the fact that the code is wrong. The consequences of the incorrect
code could be seriously fatal to overall system integrity. Generally,
intercepting exceptions ranks high in the list of Things To Not Even
Consider Except In Rare And Exotic Circumstances (for example
MmProbeAndLockPages, which is rarely done and only in drivers that can be
considered somewhat exotic). Your situation is that your code is wrong,
so hiding the problem does not solve the problem.

I wonder…I’ve seen a number of questions here, whose nature I think
results in “Potemkin programming” (see “Potemkin village” in wikipedia).
Remember: I named it here!
joe

OSR_Community_User · January 1, 2013, 8:45am

This would be a bug. Thread A acquires the mutex, and it is held until
thread B can release it. Oops!

In cases like this, a semaphore would be a better choice, since it can be
released by any thread. You do, however, lose recursive acquisition
semantics, which may or may not be a problem. It’s hard to say, because
we only see a few lines of seriously buggy code, and aren’t seeing the
overall design.
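The difference in semantics can be sketched with a toy model in plain C. This is not kernel code and none of these names exist in the WDK; "thread" is just a caller-chosen integer id. It only illustrates the rule being discussed: an owner-tracked, recursive mutex rejects a release from any thread but the owner (the situation in which KeReleaseMutex raises STATUS_MUTANT_NOT_OWNED), while a counting semaphore has no owner and no recursion.

```c
#include <stdbool.h>

/* Toy model of an owner-tracked, recursive lock. */
typedef struct {
    int owner;   /* 0 = unowned, otherwise the owning thread's id */
    int depth;   /* recursive acquisition count */
} ToyMutex;

/* Succeeds if unowned or already owned by the same thread. */
static bool ToyMutexAcquire(ToyMutex *m, int thread)
{
    if (m->owner != 0 && m->owner != thread)
        return false;            /* a real wait would block here */
    m->owner = thread;
    m->depth++;
    return true;
}

/* A release by anyone other than the owner is rejected. */
static bool ToyMutexRelease(ToyMutex *m, int thread)
{
    if (m->depth == 0 || m->owner != thread)
        return false;
    if (--m->depth == 0)
        m->owner = 0;
    return true;
}

/* Toy counting semaphore: no owner, so any thread may release it,
 * but a second acquire by the same thread is not recursive. */
typedef struct {
    int count;
    int limit;
} ToySemaphore;

static bool ToySemaphoreAcquire(ToySemaphore *s)
{
    if (s->count == 0)
        return false;            /* a real wait would block here */
    s->count--;
    return true;
}

static bool ToySemaphoreRelease(ToySemaphore *s)
{
    if (s->count == s->limit)
        return false;            /* would exceed the limit */
    s->count++;
    return true;
}
```

With a semaphore initialized to a count and limit of 1, thread A can acquire and thread B can release, but a nested acquire on the same thread would simply block, which is the loss of recursive semantics mentioned above.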
joe

Where else do you release the mutex? Do you hold the mutex while sending
I/O? If yes, is there a code path on the I/O completion path that releases
the mutex?

d

David_R_Cattley · January 1, 2013, 10:16am

> There isn't a behavioral difference presently built-in to the kernel between
> these two waits reasons, just what tools that can query the wait reason
> display (e.g. the debugger).

Was there in prior kernels?

I have been (perhaps mistakenly) under the impression that the wait reason
could have some impact on whether or not the Kernel Stack could be unlocked
and allowed to page out. I see that it is of no consequence to correct
operation if the Kernel does take that liberty but I was just trying to keep
my understanding up to date.

Thanks,

Dave Cattley

anton_bassov · January 1, 2013, 11:04am

> I have been (perhaps mistakenly) under the impression that the wait reason could have some
> impact on whether or not the Kernel Stack could be unlocked and allowed to page out.

I was about to say exactly the same thing that you did, but then I looked at the target function’s declaration:

NTSTATUS KeWaitForMutexObject(
    _In_ PVOID Mutex,
    _In_ KWAIT_REASON WaitReason,
    _In_ KPROCESSOR_MODE WaitMode,
    _In_ BOOLEAN Alertable,
    _In_opt_ PLARGE_INTEGER Timeout
);

As you can see, Ken is speaking about a different parameter: we were both thinking about the WaitMode parameter, while Ken is speaking about the WaitReason one.

Anton Bassov

Ken_Johnson · January 1, 2013, 12:59pm

Yes, it is the WaitMode that determines pageability of the kernel stack across the wait, not the WaitReason. The MSDN docs should describe this behavior.

S (Msft)

David_R_Cattley · January 1, 2013, 4:03pm

Thanks. My mistake. I should have gone and read the docs and seen that
it was a different parameter.

Regards,

Dave Cattley

Peter_Viscarola_OSR · January 1, 2013, 5:42pm

I realize you’re referencing two specific wait reasons… but for the archives and completeness: Until a few years back (errrr, Vista maybe?) there was never any behavioral difference in the OS regarding wait reason EVER. In past code, I had code which took advantage of this fact by using an unusual wait reason for things like waiting threads in worker thread pools – this made finding those threads in a crash dump fast and easy.

However, IIRC, the wait reason is now significant in some cases… so, best to follow the guidance which has always existed and use “Executive” or whatever.

Peter
OSR