Knowing op-lock break is complete in a filter driver

Hello,

We have a filter manager based file system filter driver which needs to process the file in the IRP_MJ_CREATE and IRP_MJ_CLEANUP paths. While processing the file, our filter opens the file using the FltCreateFile API.

We noticed that when we copy a file from the network and call FltCreateFile (with the FILE_COMPLETE_IF_OPLOCKED flag) in the IRP_MJ_CLEANUP path, we get the STATUS_OPLOCK_BREAK_IN_PROGRESS status code. If we don’t specify FILE_COMPLETE_IF_OPLOCKED, we deadlock, apparently because the op-lock break notification needs to be delivered on the same server thread.

So basically, we want to defer our processing until the op-lock break completes. I was wondering what’s the best way to achieve this?

One thought was to track it (failure to open due to op-lock) in the stream context and when we receive IRP_MJ_FILE_SYSTEM_CONTROL with fltParams->FileSystemControl.Common.FsControlCode = FSCTL_OPLOCK_BREAK_ACKNOWLEDGE, initiate the file processing?

Does anybody see any issues with it OR is there any other better way of solving this issue?

Thanks.
-Prasad

Handling it in the oplock break notify is a reasonable solution, since you know that it is safe to proceed after the oplock break notification has been completed by the FSD.

Tony
OSR

Well, if I remember correctly, you might not always see an explicit oplock acknowledgement or release; IRP_MJ_CLEANUP should perform those roles as well. So if you’re actually blocking the IRP_MJ_CLEANUP for the FILE_OBJECT that has the oplock, then I don’t think you can expect an oplock ACK, since as far as the caller is concerned, they’ve released the oplock. They don’t even have a handle anymore (hence the IRP_MJ_CLEANUP)…

Perhaps you could try something where you track the owner of the oplock and when you see an IRP_MJ_CLEANUP on the owning FILE_OBJECT you could either break the oplock yourself or use the same FILE_OBJECT to copy the file. I’ve not tested this works, it’s just an idea…

Thanks,
Alex.

Alex is correct. Closing the handle (i.e. IRP_MJ_CLEANUP) also acknowledges breaks.

The recommended way to do this (and what srv, the file server, itself does when it gets STATUS_OPLOCK_BREAK_IN_PROGRESS) is to get a thread that can be blocked and send an FSCTL_OPLOCK_BREAK_NOTIFY on the file object. When that IRP completes with STATUS_SUCCESS it means that any oplock breaks that were in progress have completed.

Note that this FSCTL will return STATUS_SUCCESS if issued on a file that

  • has no oplocks granted on it
  • has oplocks granted on it, none of which is currently broken and awaiting acknowledgement

This FSCTL pends only when the file has an oplock that has broken and is awaiting acknowledgement.
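
The completion rule above can be restated as a small truth table. What follows is only a user-mode C model for illustration, not driver code: the `oplock_state` enum and `break_notify_pends` are hypothetical names, and the real decision is made inside the file system.

```c
#include <stdbool.h>

/* Hypothetical model of a file's oplock state; not a Windows type. */
typedef enum {
    OPLOCK_NONE,              /* no oplock granted on the file */
    OPLOCK_HELD,              /* oplock granted, no break in progress */
    OPLOCK_BREAK_PENDING_ACK  /* break sent, owner has not acknowledged yet */
} oplock_state;

/* Returns true if a break-notify request would pend in this state,
 * false if it would complete immediately with success. */
bool break_notify_pends(oplock_state state)
{
    return state == OPLOCK_BREAK_PENDING_ACK;
}
```

Read this way, a caller can send the FSCTL without first knowing whether an oplock exists: in the first two states it simply completes right away.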

Christian [MSFT]
This posting is provided “AS IS” with no warranties, and confers no rights.

Hello,

Thank you all for your responses.

Just to elaborate further: we are doing file processing in PostOpCreate and PreOpCleanup. In both paths, we need to re-open the file since the original file object may not have the desired access, e.g. read. In PreOpCleanup we do the processing only when the last handle (system wide) is being closed. We are getting STATUS_OPLOCK_BREAK_IN_PROGRESS in the PreOpCleanup path.

Now, in this context

@Tony, I presume we need to check for FSCTL_OPLOCK_BREAK_ACKNOWLEDGE in PreOpFileSystemControl and then in the PostOpFileSystemControl, we can assume that the op-lock is broken?

@Alex, we are not blocking IRP_MJ_CLEANUP since that causes deadlock. Hence, we are using FILE_COMPLETE_IF_OPLOCKED when we open the file in this path. After that we do see FSCTL_OPLOCK_BREAK_ACKNOWLEDGE in PreOpFileSystemControl. How do I break the oplock myself without causing a deadlock?

@Christian, when we open the file with FILE_COMPLETE_IF_OPLOCKED, it initiates an op-lock break. However, we want to know when the op-lock break completes. Are you suggesting that we should spawn a separate thread that opens the file and then check for FSCTL_OPLOCK_BREAK_ACKNOWLEDGE in PreOpFileSystemControl?

Thanks.
-Prasad

I think you might have missed my point. If you are in the preCleanup
callback and you issue your own IRP_MJ_CREATE with FILE_COMPLETE_IF_OPLOCKED
and you get a STATUS_OPLOCK_BREAK_IN_PROGRESS then it’s possible that the
IRP_MJ_CLEANUP you are currently processing would be the acknowledgement for
that oplock and you are blocking that. In that case, as long as you keep the
IRP_MJ_CLEANUP from reaching the file system you will deadlock. You cannot
expect the application to issue a separate FSCTL_OPLOCK_BREAK_ACKNOWLEDGE
for the oplock you just broke with your IRP_MJ_CREATE because it has closed
its handle already (that’s why you’re seeing the IRP_MJ_CLEANUP). Now,
please note that it’s possible that occasionally you might see an
FSCTL_OPLOCK_BREAK_ACKNOWLEDGE, depending on whether the oplock is
associated with the FILE_OBJECT for which you’re seeing the cleanup or not.

This is how I read Christian’s suggestion:
When the IRP_MJ_CREATE you issue completes with
STATUS_OPLOCK_BREAK_IN_PROGRESS just create a separate thread that issues
the FSCTL_OPLOCK_BREAK_NOTIFY. After the thread is set up you must let the
IRP_MJ_CLEANUP continue to the file system. In your other thread, when the
FSCTL_OPLOCK_BREAK_NOTIFY completes with STATUS_SUCCESS you can use your
handle you got from FltCreateFile (the one that caused the oplock break in
the first place) to process the file.
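
One detail worth making explicit in this flow: STATUS_OPLOCK_BREAK_IN_PROGRESS (0x00000108) is a success-class status, so an NT_SUCCESS-style check alone does not distinguish it from STATUS_SUCCESS. Below is a minimal user-mode sketch of the dispatch decision. The names `status_t`, `next_step`, `classify_open_status` and the `ST_*` constants are hypothetical stand-ins for the driver's own definitions.

```c
#include <stdint.h>

typedef int32_t status_t;  /* stand-in for NTSTATUS in this user-mode model */

#define ST_SUCCESS                  ((status_t)0x00000000)
#define ST_OPLOCK_BREAK_IN_PROGRESS ((status_t)0x00000108)
#define ST_ACCESS_DENIED            ((status_t)0xC0000022) /* example failure */

typedef enum {
    PROCESS_NOW,                  /* handle is usable immediately */
    DEFER_UNTIL_BREAK_COMPLETES,  /* hand off to a worker, let cleanup proceed */
    SKIP_FILE                     /* the open failed; nothing to process */
} next_step;

/* Decide what to do with the result of the filter's own open. */
next_step classify_open_status(status_t status)
{
    if (status == ST_OPLOCK_BREAK_IN_PROGRESS) {
        /* Must be tested before the generic success check below. */
        return DEFER_UNTIL_BREAK_COMPLETES;
    }
    if (status >= 0) {  /* same test that NT_SUCCESS performs */
        return PROCESS_NOW;
    }
    return SKIP_FILE;
}
```

In the deferred case the callback would return and let the IRP_MJ_CLEANUP continue down the stack, as described above.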

Thanks,
Alex.

Thanks Alex for your response. Yes, you are correct: sometimes I do not see FSCTL_OPLOCK_BREAK_ACKNOWLEDGE in PreOpFileSystemControl. For example, on Windows 2003 I get STATUS_OPLOCK_BREAK_IN_PROGRESS when opening the file in the IRP_MJ_CLEANUP path and I do see FSCTL_OPLOCK_BREAK_ACKNOWLEDGE in PreOpFileSystemControl. On Windows 2008 R2, however, I get STATUS_OPLOCK_BREAK_IN_PROGRESS when opening the file in the IRP_MJ_CREATE path, but I do not see FSCTL_OPLOCK_BREAK_ACKNOWLEDGE in PreOpFileSystemControl.

So apparently, waiting for FSCTL_OPLOCK_BREAK_ACKNOWLEDGE in PreOpFileSystemControl won’t always work for me.

Regarding Christian’s suggestion as you interpreted it: spawning a separate thread for each op-lock failure may be expensive. Would creating a work queue item be good enough here? Can I be sure that, when I issue FSCTL_OPLOCK_BREAK_NOTIFY, it will return in a finite amount of time and not block forever? Should I issue this FSCTL_OPLOCK_BREAK_NOTIFY using ZwFsControlFile?

Thanks.
-Prasad

“spawning a separate thread may be expensive for each op-lock failure. Would creating a work queue item be good enough here?” - Really? Have you measured it, or are you just estimating? Also, what kind of processing do you plan to do with the file? Creating a thread might not matter if your processing is complex enough…

“Can I be sure, that, when I issue FSCTL_OPLOCK_BREAK_NOTIFY, it will definitely return in finite amount of time and not block forever?” - I don’t think so. However, I expect you should be able to wait with a timeout if you get STATUS_PENDING.

“Should I issue this FSCTL_OPLOCK_BREAK_NOTIFY using ZwFsControlFile?” - ZwFsControlFile requires a handle which you can’t create since you just saw the IRP_MJ_CLEANUP for that FILE_OBJECT. So you might need to use FltFsControlFile.

Now that I think about it you could try a different approach. If my understanding is correct you don’t really care about the oplock notification, you just want to be able to “process” the file. As such you could try this:

When the FltCreateFile() you call completes with STATUS_OPLOCK_BREAK_IN_PROGRESS just create a separate thread that starts processing the file using the handle you got from your FltCreateFile(). After the thread is set up you must let the IRP_MJ_CLEANUP continue to the file system. Through the magic of oplocks the actual processing of the file will be suspended until the oplock is acknowledged anyway so you don’t have to worry about it (so you don’t need to issue any oplock FSCTLs or anything). As I said before, I’ve not done this so it might not work but it might be worth trying.

Thanks,
Alex.

In Windows 2008 R2 you will not see FSCTL_OPLOCK_BREAK_ACKNOWLEDGE nearly as often as in previous releases. The reason is that SRV uses the new-in-Win7 granular oplocks, which use FSCTL_REQUEST_OPLOCK as both the request and acknowledgement control. Whether it is a request or acknowledgement depends on the contents of the REQUEST_OPLOCK_INPUT_BUFFER that accompanies that FSCTL.
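
To make the request-versus-acknowledgement distinction concrete, here is a user-mode sketch that classifies an input buffer by its flags. The struct layout and flag values are meant to mirror REQUEST_OPLOCK_INPUT_BUFFER and its REQUEST_OPLOCK_INPUT_FLAG_REQUEST (0x1) and REQUEST_OPLOCK_INPUT_FLAG_ACK (0x2) flags from the Windows headers. Treat them as assumptions and check ntifs.h before relying on them. The names `oplock_input` and `classify_oplock_fsctl` are hypothetical.

```c
#include <stdint.h>

/* Assumed to match REQUEST_OPLOCK_INPUT_FLAG_REQUEST / _ACK in ntifs.h. */
#define OPLOCK_INPUT_FLAG_REQUEST 0x00000001u
#define OPLOCK_INPUT_FLAG_ACK     0x00000002u

/* User-mode stand-in for REQUEST_OPLOCK_INPUT_BUFFER. */
typedef struct {
    uint16_t structure_version;
    uint16_t structure_length;
    uint32_t requested_oplock_level;
    uint32_t flags;
} oplock_input;

typedef enum {
    OPLOCK_FSCTL_REQUEST,  /* caller is asking for an oplock */
    OPLOCK_FSCTL_ACK,      /* caller is acknowledging a break */
    OPLOCK_FSCTL_UNKNOWN   /* neither flag set */
} oplock_fsctl_kind;

/* Classify an FSCTL_REQUEST_OPLOCK input buffer by its flags. */
oplock_fsctl_kind classify_oplock_fsctl(const oplock_input *in)
{
    if (in->flags & OPLOCK_INPUT_FLAG_ACK) {
        return OPLOCK_FSCTL_ACK;
    }
    if (in->flags & OPLOCK_INPUT_FLAG_REQUEST) {
        return OPLOCK_FSCTL_REQUEST;
    }
    return OPLOCK_FSCTL_UNKNOWN;
}
```

A filter that only watches for FSCTL_OPLOCK_BREAK_ACKNOWLEDGE would miss every acknowledgement that arrives in this form.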

I think Alex’s approach of using a separate processing thread is better. It ensures that you work properly with the oplock protocol, since your thread will block waiting for the oplock break to be acknowledged. It also frees you from having to be as intimately familiar with the ins and outs of how oplocks are processed.

One issue that has not been addressed here is a bit more fundamental. You are breaking oplocks. Ideally a filter should not do that. The overarching goal of any filter is that the user should not notice its presence outside of the work it is specifically intended to do. Now, as long as you are only breaking an oplock when the last handle is being closed, that’s okay. If you’re breaking oplocks in the post-create, that’s probably not okay.

Christian [MSFT]

Hi Alex,

Thanks again for your response.

I haven’t measured, but I thought that creating a work queue item might be cheaper than creating a separate thread. But maybe I am wrong here?

The processing needs to happen in PostOpCreate. After I am 100% sure that the op-lock break is complete, I intend to call ZwOpenFile which will hit into my filter’s PostOpCreate and do the processing.

In PostOpCreate/PreOpCleanup, if I get a failure while opening the file, I cannot wait there for op-lock break to complete without causing a deadlock. Hence, I fail to process the file at that point. However, I would like to process the file in PostOpCreate as soon as op-lock break completes.

Given this, I would go with the following approach based on your inputs.

If I get a failure while opening the file in PostOpCreate/PreOpCleanup, create a thread/work queue item that will do the following:

  1. Call FltFsControlFile with FSCTL_OPLOCK_BREAK_NOTIFY and wait for it to complete.
  2. Issue ZwOpenFile on the file.

Do you see any issues with it OR do you have any other suggestions?

Thanks.
-Prasad

@Christian, yes, I understand your concern. By breaking oplocks, the filter negates the benefits for which oplocks were introduced. However, we ensure that we do not process the file on every PostOpCreate and on every last handle close. Basically, once the file is processed, it is not processed again until it’s modified.

Thanks.
-Prasad

That approach looks like it might work, assuming you don’t hold off the
original operation. Please note however that I’ve never implemented
anything similar so that’s not guaranteed to work.

I’m not sure why you wouldn’t directly queue the thread without attempting
to open the file. Per Christian’s earlier email, FSCTL_OPLOCK_BREAK_NOTIFY
would return STATUS_SUCCESS if there is no oplock at all so it wouldn’t
matter if there was an oplock or not…

Thanks,
Alex.

From: xxxxx@vmware.com

> We are doing file processing in PostOpCreate and PreOpCleanup. In both the paths, we need to re-open the file since the original file object may not have desired access e.g. say read.

What APIs are you using to process the file? Since you are a file system filter, the desired access of the original file object should be no issue because the only access checks are done on IRP_MJ_CREATE.

Regards,
Razvan

Hello,

I pursued the work queue item approach and it seems to be working fine. Here is what I do:

In PostOpCreate/PreOpCleanup, if my FltCreateFile (with the FILE_COMPLETE_IF_OPLOCKED flag) returns STATUS_OPLOCK_BREAK_IN_PROGRESS, I schedule a work queue item using FltQueueGenericWorkItem and pass the file handle (returned by FltCreateFile) as a parameter to the work queue item function. Then, my PostOpCreate/PreOpCleanup returns.

The work queue item function does the following

  1. Gets the PFILE_OBJECT from the handle using ObReferenceObjectByHandle.
         ns = ObReferenceObjectByHandle(fileHandle,
                                        SYNCHRONIZE,
                                        *IoFileObjectType,
                                        KernelMode,
                                        (PVOID *)&fileObj,
                                        NULL);
  2. Gets the PFLT_INSTANCE from the PFILE_OBJECT.
  3. Issues FltFsControlFile using the PFILE_OBJECT and PFLT_INSTANCE with FSCTL_OPLOCK_BREAK_NOTIFY.
         ns = FltFsControlFile(fltInstance,
                               fileObj,
                               FSCTL_OPLOCK_BREAK_NOTIFY,
                               NULL,
                               0,
                               NULL,
                               0,
                               NULL);
  4. Gets the filename from the FILE_OBJECT.
  5. Calls ZwOpenFile on the filename. This call hits the filter’s PostOpCreate and we do the desired processing there.

This seems to be working fine. Does anybody see any issues with this?

I have a couple of concerns about the approach though.

  1. In step 5, when I issue ZwOpenFile, it hits into my PostOpCreate. Is it possible that, when I attempt to call FltCreateFile, it initiates an op-lock break again? In this case, my worker queue item may get scheduled in a loop. I can fix this issue by tracking it in the stream context.
  2. There is a potential race: if somebody renames the file after I get the filename from the file object in step 4 but before I execute ZwOpenFile, my open may fail.
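
The loop concern in item 1 can be handled with a single "already queued" flag kept per stream. Below is a minimal user-mode sketch using C11 atomics. In a minifilter the flag would presumably live in the stream context and be updated with something like InterlockedCompareExchange. The names `stream_ctx`, `try_claim_deferred_open` and `release_deferred_open` are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical per-stream state; stands in for a minifilter stream context. */
typedef struct {
    atomic_bool deferred_open_queued;  /* true while a work item is outstanding */
} stream_ctx;

/* Returns true only for the first caller; later callers see the flag
 * already set and must not queue another work item for this stream. */
bool try_claim_deferred_open(stream_ctx *ctx)
{
    bool expected = false;
    return atomic_compare_exchange_strong(&ctx->deferred_open_queued,
                                          &expected, true);
}

/* Called by the work item once processing is finished (or abandoned). */
void release_deferred_open(stream_ctx *ctx)
{
    atomic_store(&ctx->deferred_open_queued, false);
}
```

With this guard, a second STATUS_OPLOCK_BREAK_IN_PROGRESS seen in PostOpCreate for the work item's own open would not schedule another work item.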

Thanks.
-Prasad

You shouldn’t call Zw APIs in a minifilter for which equivalent Flt APIs exist. ZwOpenFile will send the create to the top of the stack. Besides, you’ve already issued a FltCreateFile and gotten a file object back. Why incur the expense and complexity of opening the file yet again?

And like Razvan said, you shouldn’t need to reopen the file in order to read it, thereby breaking oplocks that wouldn’t have broken had your filter not been there. You’re in the kernel. No access checking happens on reads from a kernel-mode caller.

Christian [MSFT]

Hello Christian and Razvan

We have seen that stack based FILE_OBJECTs are sometimes passed to our PostOpCreate function. This typically happens during network I/O. The passed FILE_OBJECT does not have read, write, or any other access, i.e. fltObjects->FileObject->ReadAccess/WriteAccess = FALSE.

If we use the same file object to perform I/O, the cache manager maintains a reference to it, and when this stack based FILE_OBJECT goes out of scope, these references become invalid, leading to BSODs at random places. That’s the reason we re-open the file to perform I/O.

The reason we call ZwOpenFile instead of FltCreateFile is that, in order to do the processing, we need sufficient context, which is available to us only in the PostOpCreate path. We don’t have this context available in the worker queue item.

Thanks.
-Prasad

> We don’t have this context available in worker queue item.

Why not perform the processing inline (in PreCleanup on the same thread) instead of using a work item?

This would guarantee that the stack-based file object stays valid while you do the processing and that you’re in the right context (stack) to reference it.

Razvan

@Razvan, well, this won’t address the use case for us.

If we fail to process the file due to an op-lock, the goal is to be able to process the file as soon as possible. This should happen even if nobody issues a subsequent IRP_MJ_CREATE or IRP_MJ_CLEANUP on it.

That’s the reason I am explicitly initiating an open using ZwCreateFile as soon as the op-lock is broken.

Thanks
-Prasad

> If we fail to process the file due to op-lock, the goal is to be able to process the file as soon as possible. This should happen even if someone doesn’t issue subsequent IRP_MJ_CREATE or IRP_MJ_CLEANUP on it.

> That’s the reason I am explicitly initiating an open using ZwCreateFile as soon as the op-lock is broken.

By using the original file object you wouldn’t fail to process the file due to op-lock because you wouldn’t try to open the file.

I’m obviously missing something here because you seem to insist on opening the file instead of using the existing file object.

What is your use case? What kind of processing do you do?

Regards,
Razvan.

Ok, I am revisiting this topic in the context of using the original file object instead of reopening using FltCreateFile.

The reason I originally started opening the file using FltCreateFile was due to the stack based file objects. It was causing random crashes due to cache manager holding references to stale stack based file objects. This typically used to happen for network based file access.

I came across http://www.osronline.com/article.cfm?article=219 which talks about similar issues.

Do you foresee any issues if I do the following:

In both PostOpCreate and PreOpCleanup, I will use the original file object to perform the processing if PFILE_OBJECT that is passed in is within the current thread stack limits. Otherwise, I will open the file using FltCreateFile. This will minimize the use of FltCreateFile and subsequent op-lock breaks.
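
The "within the current thread stack limits" test amounts to a range check on the FILE_OBJECT's address. Below is a minimal user-mode sketch of that check. In a driver the bounds would presumably come from IoGetStackLimits, but here they are passed in as parameters so the logic can be exercised on its own. `object_within_stack` is a hypothetical helper name.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Returns true if the object occupying [addr, addr + size) lies entirely
 * inside the stack range [low, high). The subtraction form avoids
 * overflow when addr + size would wrap. */
bool object_within_stack(uintptr_t addr, size_t size,
                         uintptr_t low, uintptr_t high)
{
    if (addr < low || addr > high) {
        return false;
    }
    return size <= (size_t)(high - addr);
}
```

Note that this only tells you whether the object lives on the current thread's stack. A FILE_OBJECT allocated on some other thread's stack would fail the check.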

I have a few more questions in the context of http://www.osronline.com/article.cfm?article=219

  1. If I always perform non-cached I/O, can I still run into crashes due to stack based file objects? I believe yes, because, the article says that NTFS doesn’t obey non-cached flag for compressed files and some other file systems may just opt to ignore it.
  2. Are stack based file objects used only in Windows XP/2003, or are they used in later versions of Windows as well?
  3. The article is probably written for legacy filter drivers and not for filter manager based filter drivers?
  4. The pseudo code at the end of the article seems a bit hacky. Is it safe to use it?

Thanks.
-Prasad