WDS hangs in IRP_MJ_LOCK_CONTROL of legacy filter driver

Hi,

I maintain a legacy file system filter driver.
I installed it on W2K8 R2 with WDS (Windows Deployment Services). When I try to create an install image in WDS, the application hangs forever.
Please note that this driver is deployed on every Windows OS from Windows NT to W2K8 R2. It runs only on servers (Windows 2000 Server, 2003 Server, 2008 Server, etc.).
I debugged it and found that the hang is in the filter driver's dispatch routine for IRP_MJ_LOCK_CONTROL. The algorithm of that dispatch routine is as follows:
A. Check the minor code to find out whether it is a lock or an unlock.
B. If it is a lock, pass the IRP to the lower driver in the stack, with a completion routine.
C. In the completion routine, on success, create a new IRP to unlock the byte range that was just locked.
D. Pass it to the lower driver with a completion routine.
E. If the unlock succeeds, signal completion of the lock, which in turn wakes the main dispatch routine waiting on an event.
F. In the main dispatch routine, regardless of the error condition, the filter driver creates a lock structure and uses FltProcessLock to implement the lock on the file itself.
G. Complete the request.
H. If the minor code is unlock, process it with FltProcessLock without passing it down to the lower driver stack.
I. Complete the IRP.
So, when I try to create an install image using WDS, the following steps occur:

  1. The IRP_MJ_LOCK_CONTROL dispatch routine is called with lock as the minor code.
  2. As described above, it passes the same IRP down the stack and, on successful completion, sends a new IRP with the unlock minor code to the lower driver stack.
  3. The lower driver stack fails the unlock with STATUS_RANGE_NOT_LOCKED (0xC000007EL).
  4. The dispatch routine ignores the error and implements the locking itself.
  5. The dispatch routine gets unlock as the minor code and processes it successfully.
  6. The dispatch routine gets lock as the minor code again. This time, when it sends the same IRP to the lower driver with a completion routine, the completion routine never gets called.
  7. !irp shows that the IRP is being processed by Ntfs (the current stack location is Ntfs).

The IRP_MJ_LOCK_CONTROL code path works fine in all other scenarios.
I have found two workarounds:
a) Comment out steps A-E in the dispatch routine, and it works like a charm.
b) Change the flag in the next IO_STACK_LOCATION so that the request fails immediately. This also works perfectly fine.
I cannot use option a) because it would affect the operation of the filter driver, and I am not sure whether option b) is the right way to solve the problem.

If any of you have encountered a similar situation, I would like to hear your expert comments on the problem. If you think an alternative solution is possible, please let me know.

Thanks
Manish

> 1. Dispatcher of IRP_MJ_LOCK_CONTROL is called with lock as minor code.
> 2. As described above, it passes the same irp to lower stack and on
>    successful completion of it, it passes the new irp with the minor code
>    unlock to lower driver stack.
> 3. The lower driver stack fails unlock with STATUS_RANGE_NOT_LOCKED
>    (0xC000007EL).

Something is wrong with your IRPs: if you just locked the region, then the
unlock shouldn’t be failing.

The locking package keeps track of the process performing the lock. If you
call IoGetRequestorProcess on the lock and unlock IRPs do you get the same
process back?
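
To illustrate why the owner matters, here is a minimal user-mode sketch in C. It is a toy model, not the FsRtl implementation: the names toy_lock_table, toy_lock_range and toy_unlock_range are hypothetical, conflict checking between overlapping locks is left out, an integer stands in for the process pointer, and the table models a single file. It only shows that an unlock is matched against the recorded owner and key as well as the byte range.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* NTSTATUS values as defined in ntstatus.h. */
#define STATUS_SUCCESS                0x00000000u
#define STATUS_RANGE_NOT_LOCKED       0xC000007Eu
#define STATUS_INSUFFICIENT_RESOURCES 0xC000009Au

#define TOY_MAX_LOCKS 16

/* Hypothetical toy type: one granted byte-range lock and its owner. */
typedef struct {
    bool     in_use;
    uint64_t offset;
    uint64_t length;
    int      owner_process;  /* stands in for the requestor's process */
    uint32_t key;
} toy_lock;

/* Hypothetical toy type: all locks currently held on one file. */
typedef struct {
    toy_lock locks[TOY_MAX_LOCKS];
} toy_lock_table;

/* Record a lock in the first free slot (no overlap checking in this sketch). */
uint32_t toy_lock_range(toy_lock_table *t, int process, uint32_t key,
                        uint64_t offset, uint64_t length)
{
    for (size_t i = 0; i < TOY_MAX_LOCKS; i++) {
        if (!t->locks[i].in_use) {
            t->locks[i] = (toy_lock){ true, offset, length, process, key };
            return STATUS_SUCCESS;
        }
    }
    return STATUS_INSUFFICIENT_RESOURCES;
}

/* Release a lock only if range, owner process and key all match exactly. */
uint32_t toy_unlock_range(toy_lock_table *t, int process, uint32_t key,
                          uint64_t offset, uint64_t length)
{
    for (size_t i = 0; i < TOY_MAX_LOCKS; i++) {
        toy_lock *l = &t->locks[i];
        if (l->in_use && l->offset == offset && l->length == length &&
            l->owner_process == process && l->key == key) {
            l->in_use = false;
            return STATUS_SUCCESS;
        }
    }
    return STATUS_RANGE_NOT_LOCKED;
}
```

In this model, locking bytes 0-99 as process 4 and then unlocking the same range as process 1234 returns STATUS_RANGE_NOT_LOCKED, while the same unlock issued as process 4 succeeds.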

-scott


Scott Noone
Consulting Associate
OSR Open Systems Resources, Inc.
http://www.osronline.com


Thanks Scott,

I didn’t try IoGetRequestorProcess, but to make sure the lock and unlock are issued from the same process, I posted both requests from the same thread. Also, I copied the parameters (file object, byte range, offset, key) from the lock IRP to the unlock IRP.

I will try your recommendation and post the result.

Thanks
Manish

> C. In the completion routine, if success, create new irp to unlock, previously
> allocated byte range.

If the lock request is pended, the completion routine is called in another thread's context.
To ensure the same process context, wait in the dispatch routine for completion if pending was returned, and create the unlock IRP in the dispatch routine.

> b) Change the flag in the IO_STACK_LOCATION of next IO_STACK_LOCATION to fail
> request immediately. This also works perfectly fine.

It doesn’t work fine. It simply doesn’t lock.
See comment for SL_FAIL_IMMEDIATELY.
http://msdn.microsoft.com/en-us/library/ff549251(VS.85).aspx

Bronislav Gabrhelik

Hi all,

Thanks for your response.

Looks like I found the problem.
As Scott suggested, I checked the requestor process of the lock and unlock IRPs. I found that the lock IRP was created in the System process context, but the unlock IRP, since it was created in the thread context of WDS, had WDS as its process.

On further analysis I found that although the stack trace showed one of the WDS threads calling my filter’s dispatch routine, the IRP had been created by a system thread. That system thread, which created the IRP, had executed a function from RDBSS (RxSpinUpRequestsDispatcher) and gone into a wait.

So the whole story is this: WDS creates a folder and shares it, then tries to access the folder’s content through the share name. The request eventually goes to MRxSmb and RDBSS via MUP, and after RDBSS processes the IRP, it creates a new IRP, identical to the original, and submits it to the local file system.

I changed the unlock IRP’s Irp->Tail.Overlay.Thread from PsGetCurrentThread() to the lock IRP’s Irp->Tail.Overlay.Thread, and everything works as expected.
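
For anyone reading this later, here is a small user-mode sketch in C of what that change does. It is only a model under the assumption, implied by the fix, that the requestor process is derived from the thread recorded in the IRP. All the toy_ names are hypothetical, and toy_irp.originating_thread merely stands in for Irp->Tail.Overlay.Thread. It is not kernel code.

```c
#include <stddef.h>

/* Hypothetical toy types standing in for a process, a thread and an IRP. */
typedef struct { int id; } toy_process;
typedef struct { toy_process *process; } toy_thread;
typedef struct {
    toy_thread *originating_thread;  /* stands in for Irp->Tail.Overlay.Thread */
} toy_irp;

/* Assumed behaviour: the requestor process is taken from the IRP's thread. */
toy_process *toy_requestor_process(const toy_irp *irp)
{
    return irp->originating_thread ? irp->originating_thread->process : NULL;
}

/* Build a new IRP stamped with whichever thread happens to be running,
 * which is what the filter did for its unlock IRP before the fix. */
toy_irp toy_build_irp(toy_thread *current_thread)
{
    toy_irp irp = { current_thread };
    return irp;
}

/* The fix described above: the unlock IRP inherits the lock IRP's thread,
 * so both IRPs resolve to the same requestor process. */
void toy_inherit_requestor(toy_irp *unlock_irp, const toy_irp *lock_irp)
{
    unlock_irp->originating_thread = lock_irp->originating_thread;
}
```

In this model, a lock IRP built on a System thread and an unlock IRP built on a WDS thread report different requestor processes until toy_inherit_requestor is applied to the unlock IRP.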

Thanks to Bronislav Gabrhelik for pointing out that changing the flag to fail immediately will not grant the lock. I had actually seen that, but it did not affect the behavior of WDS, so I said everything worked fine. In light of the new solution, I have discarded the flag change.

Thanks
Manish