Seems the new kernel bug - race between NtLockFile() and NtWriteFile() could hang any multi-threaded

Hi,

It seems I have discovered a new kernel bug (Windows 8…10, other versions not checked).
(Unfortunately, I have not found another acceptable way to submit a bug report to MS.)

In general, a race between NtLockFile() and NtWriteFile() can hang any
multi-threaded application.

Test case to reproduce:
https://gist.github.com/leo-yuriev/b25697898a754dbaa0dac89abf824a72

Example of kernel stack traces:

Thread #1:
nt!KiSwapContext+0x7a
nt!KiCommitThreadWait+0x1d2
nt!KeWaitForSingleObject+0x19f
nt!IopSynchronousServiceTail+0x2a9
nt!NtLockFile+0x514
nt!KiSystemServiceCopyEnd+0x13

Thread #2:
nt!KiSwapContext+0x7a
nt!KiCommitThreadWait+0x1d2
nt!KeWaitForSingleObject+0x19f
nt!IopAcquireFileObjectLock+0x84
nt! ?? ::NNGAKEGL::`string'+0x491d5
nt!KiSystemServiceCopyEnd+0x13

Regards,
Leonid.

P.S.
A bug was found when testing https://github.com/leo-yuriev/libmdbx and
https://github.com/leo-yuriev/libfpta

No, this is not a Windows bug. This is a bug in your logic. Two threads call, in a loop:
LockFileEx - WriteFile - UnlockFile
Now consider the following execution sequence:

    thread_1        thread_2

    LockFileEx
                    LockFileEx
    WriteFile

so

  1. thread_1:LockFileEx - ok
  2. thread_1:SwitchToThread();
  3. thread_2:LockFileEx

All synchronous operations on a file object are serialized.
Here, because the file object was opened in synchronous mode (FileObject->Flags & FO_SYNCHRONOUS_IO),
IopAcquireFastLock or IopAcquireFileObjectLock is called for the FileObject, which sets
FileObject->Busy = TRUE;
The call then proceeds through:
IopSynchronousServiceTail
IoCallDriver
FsRtlProcessFileLock
FsRtlPrivateCheckForExclusiveLockAccess returns false:
the lock is not granted because it is already held by thread_1.
FsRtlProcessFileLock returns STATUS_PENDING because the lock was not granted.
We return to IopSynchronousServiceTail.
Because SynchronousIo == true and STATUS_PENDING was returned,
IopSynchronousServiceTail calls KeWaitForSingleObject( &FileObject->Event…)
to wait until the lock IRP is completed.
But it will not be completed until thread_1 calls UnlockFile,
so thread_2 starts waiting for thread_1 to call UnlockFile.

  4. thread_1:WriteFile
    Here again, because the file object was opened in synchronous mode (FileObject->Flags & FO_SYNCHRONOUS_IO),
    IopAcquireFastLock or IopAcquireFileObjectLock is called for the FileObject.
    Because FileObject->Busy == TRUE (it was set by thread_2 in NtLockFile),
    we wait until the previous synchronous I/O request (thread_2:LockFileEx) has finished,
    so we wait in KeWaitForSingleObject( &FileObject->Lock).

What do we have?
thread_1 waits until the previous I/O operation on the file is completed,
but that operation completes only when the lock is granted to thread_2,
which happens only after thread_1 calls UnlockFile,
which happens only after thread_1 finishes WriteFile.

So thread_1 cannot finish WriteFile until it calls UnlockFile,
but it cannot call UnlockFile until it finishes WriteFile.

Here thread_1 deadlocks itself.
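The sequence above can be replayed with a small toy model. This is only a sketch of the behaviour described in this post, not kernel code: `FileObjectModel`, `RangeLockModel`, `Outcome`, `lock_file` and `write_file` are hypothetical names, and the model assumes a single `busy` flag per synchronous file object that stays set while a lock request is pending.

```cpp
// Toy model of a synchronous file object. All names are hypothetical.
enum class Outcome { Completed, PendingOnRangeLock, WaitingOnFileObject };

struct FileObjectModel {
    bool busy = false;  // stands in for FileObject->Busy
};

struct RangeLockModel {
    bool held = false;  // stands in for the exclusive byte-range lock
};

// Models LockFileEx on a synchronous handle.
Outcome lock_file(FileObjectModel& fo, RangeLockModel& range) {
    if (fo.busy) return Outcome::WaitingOnFileObject;
    fo.busy = true;                          // synchronous call enters the file object
    if (range.held) {
        // The request pends. The caller waits inside the call,
        // so busy stays set until the lock is granted.
        return Outcome::PendingOnRangeLock;
    }
    range.held = true;
    fo.busy = false;                         // call finished, file object is free again
    return Outcome::Completed;
}

// Models WriteFile on a synchronous handle.
Outcome write_file(FileObjectModel& fo) {
    if (fo.busy) return Outcome::WaitingOnFileObject;
    return Outcome::Completed;               // the write itself is not modeled
}
```

With one shared `FileObjectModel`, calling `lock_file` twice and then `write_file` gives Completed, PendingOnRangeLock, WaitingOnFileObject, which is the hang. With two separate `FileObjectModel` instances and one shared `RangeLockModel`, the final `write_file` on the first object completes.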

And the demo code for this can be much simpler and clearer:

    #include <windows.h>

    ULONG WINAPI thread_proc(HANDLE hFile)
    {
        OVERLAPPED ov = {};
        // Pends: the main thread already holds this byte-range lock.
        if (LockFileEx(hFile, LOCKFILE_EXCLUSIVE_LOCK, 0, 1, 0, &ov))
        {
            UnlockFile(hFile, 0, 0, 1, 0);
        }

        return GetLastError();
    }

    // In the main thread:
    HANDLE hFile = CreateFileW(L"\\\\?\\D:\\tmp.txt", FILE_GENERIC_WRITE, 0, 0, OPEN_EXISTING, 0, 0);
    if (hFile != INVALID_HANDLE_VALUE)
    {
        OVERLAPPED ov = {};
        if (LockFileEx(hFile, LOCKFILE_EXCLUSIVE_LOCK, 0, 1, 0, &ov))
        {
            // The same handle (same file object) is passed to the second thread.
            if (HANDLE hThread = CreateThread(0, 0, thread_proc, hFile, 0, 0))
            {
                // give time to thread_proc call LockFileEx
                MessageBoxW(0, 0, L"Now thread Hung!", MB_ICONWARNING);
                ULONG NumberOfBytesWritten;
                // Never returns: the file object is busy with the pending lock request.
                WriteFile(hFile, "*", 1, &NumberOfBytesWritten, 0);
                CloseHandle(hThread);
            }
            UnlockFile(hFile, 0, 0, 1, 0);
        }

        CloseHandle(hFile);
    }

Note: the hang occurs only because thread_1 and thread_2 use the same synchronous file handle, since synchronous I/O is serialized per file object. If the threads use separate file handles (not duplicated ones), there is no hang, because each handle has its own file object lock. We can change the demo code as follows: do not pass the handle to the second thread, but let it open the file itself.

    ULONG WINAPI thread_proc(HANDLE hFile)
    {
        // Open a separate handle, and therefore a separate file object.
        hFile = CreateFileW(L"*", FILE_GENERIC_WRITE,
            FILE_SHARE_WRITE|FILE_SHARE_READ, 0, OPEN_EXISTING, 0, 0);

        if (hFile != INVALID_HANDLE_VALUE)
        {
            OVERLAPPED ov = {};
            if (LockFileEx(hFile, LOCKFILE_EXCLUSIVE_LOCK, 0, 1, 0, &ov))
            {
                MessageBoxW(0, L"inside lock", L"thread_proc", MB_ICONINFORMATION);

                UnlockFile(hFile, 0, 0, 1, 0);
            }

            CloseHandle(hFile);
        }

        return GetLastError();
    }

    HANDLE hFile = CreateFileW(L"*", FILE_GENERIC_WRITE,
        FILE_SHARE_WRITE|FILE_SHARE_READ, 0, OPEN_EXISTING, 0, 0);

    if (hFile != INVALID_HANDLE_VALUE)
    {
        HANDLE hThread = 0;
        OVERLAPPED ov = {};
        if (LockFileEx(hFile, LOCKFILE_EXCLUSIVE_LOCK, 0, 1, 0, &ov))
        {
            if (hThread = CreateThread(0, 0, thread_proc, 0, 0, 0))
            {
                // give time to thread_proc call LockFileEx
                MessageBoxW(0, 0, L"Now thread Not Hung!", MB_ICONINFORMATION);

                ULONG NumberOfBytesWritten;

                // Completes: this file object is not busy.
                if (!WriteFile(hFile, "*", 1, &NumberOfBytesWritten, 0))
                {
                    GetLastError();
                }
            }
            // Lets the pending LockFileEx in thread_proc complete.
            UnlockFile(hFile, 0, 0, 1, 0);
        }

        if (hThread)
        {
            WaitForSingleObject(hThread, INFINITE);
            CloseHandle(hThread);
        }

        CloseHandle(hFile);
    }

This shows that there is no problem in the case of two separate applications, which use separate file handles.

The following sequence:

  1. thread_1:LockFileEx(h1)
  2. thread_2:LockFileEx(h2)
  3. thread_1:WriteFile(h1)
  4. thread_1:UnlockFile(h1)

assuming that h1 -> FileObject1 != FileObject2 <- h2,
executes without problems. thread_2 waits until thread_1 calls UnlockFile(h1), but this does not prevent thread_1 from completing WriteFile(h1), because FileObject1->Busy == FALSE.
The LockFileEx(h2) call sets FileObject2->Busy = TRUE, not FileObject1->Busy, so we do not hang in IopAcquireFileObjectLock:
step 2 does not hold up step 3.

But the following sequence always hangs (in all Windows versions):

  1. thread_1:LockFileEx(h1)
  2. thread_2:LockFileEx(h1)
  3. thread_1:WriteFile(h1)
  4. thread_1:UnlockFile(h1)

thread_1 waits inside WriteFile, in IopAcquireFileObjectLock, on FileObject->Lock.
So:
step 3 finishes only after step 2,
step 2 finishes only after step 4,
step 4 finishes (or even begins) only after step 3.
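The circular dependency between the steps can be checked mechanically. Below is a minimal sketch that treats "step a finishes only after step b" as an edge in a wait-for graph and looks for a cycle. The function name `has_wait_cycle` and the map representation are my own assumptions for illustration, and each step is assumed to wait on at most one other step.

```cpp
#include <map>
#include <set>

// waits_for[a] == b means "step a finishes only after step b".
// Returns true if following these edges ever revisits a step.
bool has_wait_cycle(const std::map<int, int>& waits_for) {
    for (const auto& entry : waits_for) {
        std::set<int> seen;
        int current = entry.first;
        while (true) {
            if (!seen.insert(current).second) {
                return true;            // step reached twice: circular wait
            }
            auto next = waits_for.find(current);
            if (next == waits_for.end()) {
                break;                  // this step waits on nothing
            }
            current = next->second;
        }
    }
    return false;
}
```

For the shared-handle case the edges are 3 -> 2, 2 -> 4, 4 -> 3, which form a cycle. With separate handles the 3 -> 2 edge disappears, leaving 2 -> 4 and 4 -> 3, and no cycle remains.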

Thank you for explanation.

Not a bug, but a feature :wink:

It seems mad to me in comparison with Linux.

But this is the well-known deadlock with two critical sections.
Every synchronous operation on a FileObject (even querying its name) is executed inside a critical section.

WriteFile(FileObject)
is
EnterCriticalSection(&FileObject->Lock);
WriteFileNoLock(FileObject);
LeaveCriticalSection(&FileObject->Lock);

Taking and releasing an exclusive file lock is, in effect, acquiring and releasing another critical section.

So when you execute
LockFileEx - WriteFile - UnlockFile
you effectively execute the following code:

// LockFileEx
EnterCriticalSection(&FileObject->Lock);
EnterCriticalSection(&cs);
LeaveCriticalSection(&FileObject->Lock);

//SwitchToThread()

// WriteFile
EnterCriticalSection(&FileObject->Lock);
LeaveCriticalSection(&FileObject->Lock);

//SwitchToThread()

// UnlockFile
EnterCriticalSection(&FileObject->Lock);
LeaveCriticalSection(&cs);
LeaveCriticalSection(&FileObject->Lock);

Try executing it concurrently in two threads:

    thread_1                                    thread_2

    EnterCriticalSection(&FileObject->Lock);
    EnterCriticalSection(&cs);
    LeaveCriticalSection(&FileObject->Lock);

                                                EnterCriticalSection(&FileObject->Lock);
                                                EnterCriticalSection(&cs); // waits for thread_1

    EnterCriticalSection(&FileObject->Lock); // waits for thread_2

Also note that if thread_1 uses FileObject1->Lock and thread_2 uses FileObject2->Lock, and only cs is shared, the deadlock is gone.
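The two-critical-section analogy can be tried out in portable C++. The following is a sketch under assumptions, not Windows code: `std::timed_mutex` stands in for both FileObject->Lock and cs, `write_succeeds` is a hypothetical helper, and a 200 ms timeout is used to report the deadlock instead of hanging. Releasing `cs` directly at the end is a model-only escape hatch, since the real UnlockFile would also need the file object lock.

```cpp
#include <chrono>
#include <future>
#include <mutex>
#include <thread>

// Returns true if thread_1's modeled WriteFile can enter its file object lock
// while thread_2 is stuck inside its modeled LockFileEx.
bool write_succeeds(bool shared_file_object) {
    std::timed_mutex file_object_1, file_object_2, cs;
    std::timed_mutex& thread_2_file =
        shared_file_object ? file_object_1 : file_object_2;

    // thread_1: LockFileEx
    file_object_1.lock();
    cs.lock();
    file_object_1.unlock();

    std::promise<void> thread_2_entered;
    std::thread thread_2([&] {
        // thread_2: LockFileEx
        std::lock_guard<std::timed_mutex> io(thread_2_file);
        thread_2_entered.set_value();
        // Blocks here, still holding its file object lock, until cs is free.
        std::lock_guard<std::timed_mutex> granted(cs);
    });
    thread_2_entered.get_future().wait();

    // thread_1: WriteFile (only the entry into the file object lock is modeled)
    bool ok = file_object_1.try_lock_for(std::chrono::milliseconds(200));
    if (ok) {
        file_object_1.unlock();
    }

    cs.unlock();        // model-only escape so thread_2 can finish
    thread_2.join();
    return ok;
}
```

Calling `write_succeeds(true)` models the shared handle and is expected to time out, while `write_succeeds(false)` models separate file objects with only cs in common and is expected to succeed.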

This is not even a feature. You simply need to be very careful when acquiring two critical sections in sequence.