FltReleaseContext causes crash

Hi All,

We are seeing a system crash (Bug Check 0xF5: FLTMGR_FILE_SYSTEM, parameter 0x6D) after calling FltReleaseContext() in our minifilter driver. We based the driver on the WDK 7 sample (miniFilter\cancelSafe) and followed this page to release the instance context properly (https://msdn.microsoft.com/en-us/library/windows/hardware/ff552001(v=vs.85).aspx).

Sample Code:
typedef struct _INSTANCE_CONTEXT {
    PFLT_INSTANCE Instance; // Instance for this context.
    FLT_CALLBACK_DATA_QUEUE Cbdq;
    KEVENT TeardownEvent;
} INSTANCE_CONTEXT, *PINSTANCE_CONTEXT;

NTSTATUS InstanceSetup (
    __in PCFLT_RELATED_OBJECTS FltObjects,
    __in FLT_INSTANCE_SETUP_FLAGS Flags,
    __in DEVICE_TYPE VolumeDeviceType,
    __in FLT_FILESYSTEM_TYPE VolumeFilesystemType
    )
{
    PINSTANCE_CONTEXT instanceContex; // pointer to instance context
    // FltAllocateContext writes the new context through the last argument, so pass its address.
    NTSTATUS rc = FltAllocateContext( FltObjects->Filter, FLT_INSTANCE_CONTEXT, INSTANCE_CONTEXT_SIZE, NonPagedPool, (PFLT_CONTEXT *)&instanceContex);
    instanceContex->Instance = FltObjects->Instance;

    // some code like FltCbdqInitialize, InitializeListHead etc.

    // Associate the context with this instance. Reference count increases to 2.
    rc = FltSetInstanceContext(instanceContex->Instance, FLT_SET_CONTEXT_REPLACE_IF_EXISTS, instanceContex, NULL);
    // Drop the reference added by FltSetInstanceContext. Reference count decreases to 1.
    FltReleaseContext(Instance);

    // some processing

    if (Error) // Error occurred during processing.
    {
        // Drop the reference added by FltAllocateContext and delete the context. Reference count should decrease to 0.
        // BSOD point. Bug check F5, 6D
        FltReleaseContext(Instance);
        return STATUS_FLT_DO_NOT_ATTACH;
    }
}

If we replace the code at the BSOD point with the following, the crash goes away, and the debugger shows the context being released and deleted properly:

// Drop the reference added by FltAllocateContext and delete the context.
PFLT_CONTEXT oldContext = NULL;
status = FltDeleteInstanceContext( ip->instancep,  // Instance
                                   &oldContext );  // OldContext
if (oldContext != NULL)
{
    FltReleaseContext(oldContext);
}

We referred to the following MSDN page, which says that to release the initial reference we need to remove the context from the object by calling “FltDeleteContext”.
Link: https://msdn.microsoft.com/en-us/library/windows/hardware/ff551957(v=vs.85).aspx

We can see two different approaches in the WDK samples. One uses “FltDeleteContext” or “FltDeleteInstanceContext” to handle an error (as shown above); this is what the WDK sample “miniFilter\minispy\filter” does.
The second is to not call “FltReleaseContext” at all in the error path.

Queries:

  1. Is calling FltDeleteContext/FltDeleteInstanceContext and then FltReleaseContext the proper way to release the last reference on the instance context when an error occurs in instance setup?

  2. If we do not call “FltReleaseContext” in the error path (to release the last reference, the one added by FltAllocateContext), will that cause a memory leak?

Thanks,
Devashree

Well…
FltReleaseContext(Instance); really, Instance? :)

On Fri, May 29, 2015 at 3:26 PM, wrote:

> —
> NTFSD is sponsored by OSR
>
> OSR is hiring!! Info at http://www.osr.com/careers
>
> For our schedule of debugging and file system seminars visit:
> http://www.osr.com/seminars
>
> To unsubscribe, visit the List Server section of OSR Online at
> http://www.osronline.com/page.cfm?name=ListServer
>


Bercea. G.

sorry for the typo!

It is FltReleaseContext(instanceContex);

So, to simplify the above code:

// Create a context
PFLT_CONTEXT context;

// Allocate it. Reference count of “context” becomes 1.
FltAllocateContext(Filter, FLT_INSTANCE_CONTEXT, ContextSize, NonPagedPool, &context);

// Associate the context with the instance (the instance we get as a parameter of instance setup).
// This increases the reference count of “context” by 1,
// so the reference count of the context is now 2.
FltSetInstanceContext(instance, FLT_SET_CONTEXT_REPLACE_IF_EXISTS, context, NULL);
// Decrement the reference count (incremented by the line above).
// Reference count goes back to 1.
FltReleaseContext(context);

// if (some error in processing)
{
    // Cleanup code:

    // Here we need to release the context created by FltAllocateContext.
    // Calling ‘FltReleaseContext’ should make the reference count of the context 0
    // and release it from the associated object.
    // But it causes Bugcheck F5{6D…}
    FltReleaseContext(context);
}

Fix:
Adding FltDeleteContext(context); right before FltReleaseContext(context) fixes the issue.

Query:
Is this the right way to fix this crash?
Or should I remove the call to FltReleaseContext(…) from my cleanup code?

Well, if the Set failed then its increment was never added.
There would be one reference from the Allocate and one from the Set, but if the Set
failed there is only the one from the Allocate, and you have already dereferenced
that one, so all you should expect after that is the context cleanup
callback.
FltDeleteContext removes the context from Filter Manager’s internal list,
so you can free it manually, so to speak.
Is your context cleanup callback not being called?
So the fix is just to remove the release from the error path
altogether; there are 0 references left there if the Set fails.

Gabriel.

On Fri, May 29, 2015 at 4:22 PM, wrote:



Bercea. G.

We implemented the call to “FltSetInstanceContext” as per its MSDN documentation: https://msdn.microsoft.com/en-us/library/windows/hardware/ff544521(v=vs.85).aspx

It states that: A successful call to FltSetInstanceContext increments the reference count on NewContext. If FltSetInstanceContext fails, the reference count remains unchanged. In either case, the filter calling FltSetInstanceContext must call FltReleaseContext to decrement the NewContext object.

Hence the back-to-back calls:
first to “FltSetInstanceContext(…)”
and then to “FltReleaseContext(context)”.

Also, in our case the call to “FltSetInstanceContext(…)” succeeds, and it increments the reference count of the context by 1.
So on the very next line we call “FltReleaseContext(context)” to drop the reference added by the set.

A few lines later we hit an error condition and want to clean up and return.
As part of this cleanup we want to free the context allocated by FltAllocateContext(…context).
Example:
// Let's say we are trying to get the volume properties, and this fails for some reason.
// Now we want to clean up and return.
rc = FltGetVolumeProperties(…);
if (!NT_SUCCESS(rc))
{
    // At this stage the reference count of the context is 1.
    // So release the last reference, the one taken by “FltAllocateContext”.
    FltReleaseContext(context);
    // Ideally this would drop the reference count from 1 to 0 and then delete the context.
    // However, this call causes a crash.
}

Fixes:

  1. Adding FltDeleteContext(context) right before FltReleaseContext fixes the crash.
  2. Removing the call to FltReleaseContext(…) in the if block also fixes the crash. However, in this case the reference count won’t drop from 1 to 0.

Query: Which would you suggest as the proper fix for this issue, fix 1 or fix 2?

Thanks,
Devashree Ganguly

So I think the problem is that you’re trying to “make the context go away”
by calling FltReleaseContext() twice… This isn’t how it’s supposed to
work. Once FltSetInstanceContext() has returned success, the instance
internally holds a pointer to the context, so by calling FltReleaseContext()
twice you end up freeing the context while the instance still has the
pointer, which now points to freed memory. FltMgr detects this and
bugchecks.

FltDeleteContext() actually removes the instance’s pointer to the
context, so that’s why it works.

But just calling FltDeleteContext() is still not right. Your flow should be
something like this:

  1. FltAllocateContext() (refcount is 1).
  2. FltSetInstanceContext() (on success refcount is 2; on failure it’s still 1,
    so on failure you just need to call FltReleaseContext() and you’re done
    with cleanup).
    N.B. -> If you call FltReleaseContext() at this point (like you do), then
    you’re no longer allowed to touch the context at all. You can’t call
    FltRelease or FltDelete on it, because you gave up YOUR reference when you
    called FltReleaseContext().
  3. Do stuff that might fail.
  4. If anything fails in such a way that you want to remove the context,
    call FltDeleteContext() (refcount is back to 1).
    … Do more stuff.
  5. Finally call FltReleaseContext() -> at this point the refcount drops to
    either 0 (if there was a problem) or 1 (if there was no problem).
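
As a rough illustration of the counts in the steps above, here is a toy user-mode model in C. It is not Filter Manager code: ToyContext, ToyInstance and the toy_* functions are made up for this sketch, and it simply assumes the behaviour listed above (Alloc gives 1, a successful Set adds 1 and stores the pointer, Delete clears the pointer and drops that reference). "Freeing" is modelled as a flag so the sketch never touches freed memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only; hypothetical names, not the Filter Manager API. */
typedef struct ToyContext {
    int  refs;        /* reference count                         */
    bool cleaned_up;  /* set when refs reaches 0 ("cleanup ran") */
} ToyContext;

typedef struct ToyInstance {
    ToyContext *ctx;  /* the instance's pointer to its context */
} ToyInstance;

/* Models allocation: the caller gets the first reference. */
static void toy_alloc(ToyContext *c) { c->refs = 1; c->cleaned_up = false; }

/* Models a successful set: the instance stores the pointer and takes a reference. */
static void toy_set(ToyInstance *i, ToyContext *c) { i->ctx = c; c->refs++; }

/* Models release: drop one reference; cleanup runs at zero. */
static void toy_release(ToyContext *c) {
    assert(c->refs > 0);
    if (--c->refs == 0) c->cleaned_up = true;
}

/* Models delete: clear the instance's pointer and drop the instance's reference. */
static void toy_delete(ToyInstance *i) {
    ToyContext *c = i->ctx;
    if (c != NULL) { i->ctx = NULL; toy_release(c); }
}

/* True if the instance still points at a context whose cleanup already ran. */
static bool toy_dangling(const ToyInstance *i) {
    return i->ctx != NULL && i->ctx->cleaned_up;
}

/* Flow from the original post: set, release, then release again on error. */
bool double_release_leaves_dangling(void) {
    ToyContext c;
    ToyInstance inst = { NULL };
    toy_alloc(&c);       /* refs = 1 */
    toy_set(&inst, &c);  /* refs = 2 */
    toy_release(&c);     /* refs = 1: the caller's reference is gone */
    toy_release(&c);     /* refs = 0: cleanup runs, instance still points at it */
    return toy_dangling(&inst);
}

/* Flow from the steps above: keep your reference, delete on error, release last. */
bool delete_then_release_is_clean(void) {
    ToyContext c;
    ToyInstance inst = { NULL };
    toy_alloc(&c);       /* refs = 1 */
    toy_set(&inst, &c);  /* refs = 2 */
    toy_delete(&inst);   /* refs = 1, instance pointer cleared */
    toy_release(&c);     /* refs = 0, cleanup runs */
    return c.cleaned_up && inst.ctx == NULL;
}
```

In this model the first function returns true (the instance is left pointing at a cleaned-up context, which is the situation described above as the cause of the bugcheck) and the second returns true because nothing is left pointing at the context when it goes away.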

See my blog post here:
http://fsfilters.blogspot.com/2010/02/context-usage-in-minifilters.html for
more information on how contexts work

Thanks,
Alex.

On Tue, Jun 2, 2015 at 8:20 AM, wrote:


Thanks for your review, Alex.

As per your suggestion I changed the code. I also live-debugged it to check the values of the instance and the context.
Additionally, I added a breakpoint on my “instance context cleanup callback” function to confirm how and when instances are cleared by the OS.

Following are the results I got on implementing your suggestion.

  1. rc = FltAllocateContext(…, context); -> ( refCount is 1)

At this stage if we look at the instance we see that no context is associated with the instance

0: kd> !fltkd.instance 8b8ba568
FLT_INSTANCE: 8b8ba568 “issfltr Instance” “320900”
FLT_OBJECT: 8b8ba568 [01000000] Instance
RundownRef : 0x00000002 (1)
PointerCount : 0x00000001
PrimaryLink : [8abd0a5c-8b05c228]
OperationRundownRef : 8beef960
Could not read field “Number” of fltmgr!_EX_RUNDOWN_REF_CACHE_AWARE from address: 8beef960
Flags : [00000004] Initing
Volume : 8b05c1a0 “\Device\HarddiskVolume1”
Filter : 8b8b51e8 “issfltr”
TrackCompletionNodes : 8c040488
ContextLock : (8b8ba5a4)
Context : 00000000
CallbackNodes : (8b8ba5b4)
VolumeLink : [8abd0a5c-8b05c228]
FilterLink : [8b8b5250-8b8b5250]

Note that Context is NULL

  2. rc = FltSetInstanceContext(instance, FLT_SET_CONTEXT_REPLACE_IF_EXISTS, context, NULL);
    if (!NT_SUCCESS(rc)) {
        // If FltSetInstanceContext fails, perform cleanup
        Some_cleanup_tasks;

        // Drop the reference count
        FltReleaseContext(ip); -> (refcount can be 0 or 1 after this release. No problems here)
    }

  3. If “FltSetInstanceContext” succeeds, the context gets associated with the instance.

!fltkd.instance 8b8ba568
FLT_INSTANCE: 8b8ba568 “issfltr Instance” “320900”
FLT_OBJECT: 8b8ba568 [01000000] Instance
RundownRef : 0x00000002 (1)
PointerCount : 0x00000002
PrimaryLink : [8abd0a5c-8b05c228]
OperationRundownRef : 8beef960
Could not read field “Number” of fltmgr!_EX_RUNDOWN_REF_CACHE_AWARE from address: 8beef960
Flags : [00000004] Initing
Volume : 8b05c1a0 “\Device\HarddiskVolume1”
Filter : 8b8b51e8 “issfltr”
TrackCompletionNodes : 8c040488
ContextLock : (8b8ba5a4)
Context : 8b8ba450
CallbackNodes : (8b8ba5b4)
VolumeLink : [8abd0a5c-8b05c228]
FilterLink : [8b8b5250-8b8b5250]

Note: ‘Context’ has value now
Also, Reference count increases by 1. ‘Usecount’ now becomes 2


1: kd> !fltkd.ctx 8b8ba450
CONTEXT_NODE: 8b8ba450 [0002] InstanceContext NonPagedPool
ALLOCATE_CONTEXT_NODE: 8b8b7008 “issfltr” [01] LookasideList
Could not read field “NonPaged.L.Size” of FltMgr!_ALLOCATE_CONTEXT_LOOKASIDE from address: 8b8b7008
AttachedObject : 8b8ba568
UseCount : 2
TREE_NODE: 8b8ba45c (k1=00000000, k2=00000000) [00010000] InTree
UserData : 8b8ba480

  4. Skip the call to FltReleaseContext() right after FltSetInstanceContext(); -> So the refcount remains 2

  5. Do stuff that might fail.

  6. If (anything_fails)
    a) Call FltDeleteContext() -> This simply disassociates the context from the instance.
    On viewing the instance we see that this call sets Context back to NULL.


1: kd> !fltkd.instance 8b8ba568
FLT_INSTANCE: 8b8ba568 “issfltr Instance” “320900”
FLT_OBJECT: 8b8ba568 [01000000] Instance
RundownRef : 0x00000002 (1)
PointerCount : 0x00000001
PrimaryLink : [8abd0a5c-8b05c228]
OperationRundownRef : 8beef960
Could not read field “Number” of fltmgr!_EX_RUNDOWN_REF_CACHE_AWARE from address: 8beef960
Flags : [00000004] Initing
Volume : 8b05c1a0 “\Device\HarddiskVolume1”
Filter : 8b8b51e8 “issfltr”
TrackCompletionNodes : 8c040488
ContextLock : (8b8ba5a4)
Context : 00000000
CallbackNodes : (8b8ba5b4)
VolumeLink : [8abd0a5c-8b05c228]
FilterLink : [8b8b5250-8b8b5250]

b) Check what happened to the context after the call to FltDeleteContext:

CONTEXT_NODE: 8b8ba450 [0002] InstanceContext NonPagedPool
ALLOCATE_CONTEXT_NODE: 8b8b7008 “issfltr” [01] LookasideList
Could not read field “NonPaged.L.Size” of FltMgr!_ALLOCATE_CONTEXT_LOOKASIDE from address: 8b8b7008
AttachedObject : 8b8ba568
UseCount : 2
TREE_NODE: 8b8ba45c (k1=00000000, k2=00000000) [00000000]
UserData : 8b8ba480

This shows that the reference count on context is still 2

c) Call FltReleaseContext(context). -> This reduces the reference count of the context from 2 to 1


1: kd> !fltkd.ctx 8b8ba450
CONTEXT_NODE: 8b8ba450 [0002] InstanceContext NonPagedPool
ALLOCATE_CONTEXT_NODE: 8b8b7008 “issfltr” [01] LookasideList
Could not read field “NonPaged.L.Size” of FltMgr!_ALLOCATE_CONTEXT_LOOKASIDE from address: 8b8b7008
AttachedObject : 8b8ba568
UseCount : 1
TREE_NODE: 8b8ba45c (k1=00000000, k2=00000000) [00000000]
UserData : 8b8ba480

d) Cleanup finished now return.

Issue:

  1. During this entire process (and for some time after it) the breakpoint in the “instance context cleanup callback” was never hit, meaning that the system has not yet freed the context mentioned above. Its reference count is still 1.
  2. So I assumed that the callback might be called from the driver unload function.
  3. So I stopped my driver using “net stop MyDriver”.
  4. To my surprise, after these changes my driver failed to stop.
  5. I waited half an hour for my driver to stop, but it seems to be stuck somewhere in the driver unload.

Conclusion: Commenting out the first call to “FltReleaseContext” didn’t cause a crash, but it also didn’t free the context (for some reason). This seems to be the reason why the driver fails to unload.

Note: Adding the first call to FltReleaseContext back (along with the proper call to FltDeleteContext) fixes the issue.

Query: Have I implemented your suggestions correctly? Should I change some part of the code?

Thanks,
Devashree.

It pains me to see this.
It is really not that hard; the Windows kernel API for ref-counted objects is really not that difficult.
If a call succeeds and returns an object, it usually boosts the ref count by 1. There are exceptions, such as PsGetCurrentProcess(), which does not boost the ref count of the returned EPROCESS, but most calls that give you back a pointer give it to you meaning you own it, and to prove it they give you that ref count.

In this case the Alloc gives you a pointer = ref count 1
The Set gives you one more if it succeeds = ref count 2

The cleanup of the context occurs when the ref count reaches 0.

So your code should basically look like this:

myCtx = AllocTheCtx();
if (!myCtx) // nothing was allocated
{
    return ERROR;
}

// at this stage the ctx's ref count is 1

rc = SetInstanceCtx(myCtx, Instance); // try to set the context

// Release one reference to the context, regardless of whether the set succeeded.
// If the set succeeded, the ref count went to 2 and you make it 1 again. If it did not,
// the ref count stayed at 1, you make it 0, and the context gets cleaned up by your cleanup function.
FltReleaseContext(myCtx);
if (!NT_SUCCESS(rc))
{
    // the set did not succeed
    return rc; // just return; the ref count is already 0 and your cleanup function
               // will be invoked
}

//
// do other stuff
//

return; // you have 1 reference from the alloc.
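
To make the two outcomes of that pattern concrete, here is a small user-mode sketch in C. It is only a model under stated assumptions: ToyContext, ToyInstance, the accepts_context flag and the toy_* functions are hypothetical stand-ins, not Filter Manager calls. The set either stores the pointer and adds a reference or fails and leaves the count alone, and the caller releases its own reference in both cases.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only; hypothetical names, not the Filter Manager API. */
typedef struct ToyContext {
    int  refs;        /* reference count                         */
    bool cleaned_up;  /* set when refs reaches 0 ("cleanup ran") */
} ToyContext;

typedef struct ToyInstance {
    ToyContext *ctx;              /* the instance's pointer to its context     */
    bool        accepts_context;  /* lets the sketch force the set to fail     */
} ToyInstance;

/* Models allocation: the caller gets the first reference. */
static void toy_alloc(ToyContext *c) { c->refs = 1; c->cleaned_up = false; }

/* Models the set: on success the instance takes a reference,
 * on failure the reference count is left unchanged. */
static bool toy_set(ToyInstance *i, ToyContext *c) {
    if (!i->accepts_context) return false;
    i->ctx = c;
    c->refs++;
    return true;
}

/* Models release: drop one reference; cleanup runs at zero. */
static void toy_release(ToyContext *c) {
    assert(c->refs > 0);
    if (--c->refs == 0) c->cleaned_up = true;
}

/* Alloc, try to set, then release the caller's reference unconditionally.
 * Returns the remaining count and reports whether cleanup ran. */
int refs_after_setup(bool set_succeeds, bool *cleaned_up) {
    ToyContext c;
    ToyInstance inst = { NULL, set_succeeds };
    toy_alloc(&c);              /* refs = 1                          */
    (void)toy_set(&inst, &c);   /* refs = 2 on success, 1 on failure */
    toy_release(&c);            /* refs = 1 on success, 0 on failure */
    *cleaned_up = c.cleaned_up;
    return c.refs;
}
```

With a successful set the function returns 1 and cleanup has not run, because the instance still holds its reference; with a failed set it returns 0 and cleanup has run. Either way the caller has nothing left to release.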

When you want to get rid of the instance context and have your cleanup called, it is easy now, because you get called when the instance is torn down.
So when your instance is torn down, it depends on what you do in TeardownStart or TeardownComplete:

NTSTATUS Teardown()
{
    instCtx = GetInstanceCtx();
    FltReleaseContext(instCtx); // still 1
    // but you still have the pointer,
    // and then you take it out completely and your cleanup will be invoked, assuming you
    // have no reference leaks
    FltDeleteInstanceContext( FltObjects->Instance, NULL );

    return;
}

PS: your driver not unloading suggests leaked references.
The filter manager probably waits for the ref counts to drop so it can call the cleanups, but that never happens.
Run with Driver Verifier, because it will stop and blue-screen and you can investigate the leaks.
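
For what a leaked reference looks like in terms of counts, here is a toy user-mode sketch in C. Everything in it is hypothetical (ToyContext, ToyInstance, the toy_* functions, the live_contexts counter), and it rests on the assumption stated above that unload cannot finish while some context has not been cleaned up. It shows one possible leak, a lookup reference that is never released; it is not a diagnosis of any particular driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only; hypothetical names, not the Filter Manager API. */
typedef struct ToyContext  { int refs; } ToyContext;
typedef struct ToyInstance { ToyContext *ctx; } ToyInstance;

static int live_contexts = 0;  /* contexts whose cleanup has not run yet */

/* Models allocation: the caller gets the first reference. */
static void toy_alloc(ToyContext *c) { c->refs = 1; live_contexts++; }

/* Models release: drop one reference; "cleanup" runs at zero. */
static void toy_release(ToyContext *c) {
    assert(c->refs > 0);
    if (--c->refs == 0) live_contexts--;
}

/* Models a successful set: the instance takes its own reference. */
static void toy_set(ToyInstance *i, ToyContext *c) { i->ctx = c; c->refs++; }

/* Models a lookup: returns the pointer together with a new reference. */
static ToyContext *toy_get(ToyInstance *i) {
    if (i->ctx != NULL) i->ctx->refs++;
    return i->ctx;
}

/* Models delete: clear the instance's pointer and drop its reference. */
static void toy_delete(ToyInstance *i) {
    ToyContext *c = i->ctx;
    if (c != NULL) { i->ctx = NULL; toy_release(c); }
}

/* Assumption: unload can only finish once every context was cleaned up. */
static bool toy_can_unload(void) { return live_contexts == 0; }

/* Setup followed by a teardown that looks the context up first. */
bool unload_after_teardown(bool release_lookup_ref) {
    ToyContext c;
    ToyInstance inst = { NULL };
    live_contexts = 0;

    toy_alloc(&c);       /* refs = 1 */
    toy_set(&inst, &c);  /* refs = 2 */
    toy_release(&c);     /* refs = 1, held by the instance */

    ToyContext *found = toy_get(&inst);           /* refs = 2 */
    if (release_lookup_ref) toy_release(found);   /* refs = 1 */
    toy_delete(&inst);   /* 1 -> 0, or 2 -> 1 if the lookup reference leaked */
    return toy_can_unload();
}
```

When the lookup reference is released, the count reaches 0 at teardown and the model can unload; when it is not, one reference remains forever, cleanup never runs, and the model's unload check keeps failing.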

Good luck