Cache synchronization between shadowFileObject and realFileObject

Hi, I’m developing an encryption filter based on the minifilter model, and I’ve run into a problem.
My filter uses shadow file objects, so it maintains its own SCB for each file object it has shadowed. The problem is this: sometimes a program tries to read the contents of a file that I have just written to through the shadow file object. At that point the data is still sitting in the cache backed by my filter’s SCB and has not been flushed to disk, so the program does not read the correct contents of the file.
Is there a way to solve this? Or should I be synchronizing the real file object’s cache with the cache my filter maintains?

Right. When you see an I/O operation on the shadow file object, you’ll need to ensure you either flush (for a read) or purge (for a write) in your own cache to maintain coherency.
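
A minimal sketch of that flush/purge decision, assuming the filter keeps its own SECTION_OBJECT_POINTERS for the shadow cache in a per-stream context (the MY_SCB layout and the MySyncOtherView helper are hypothetical names, not from any real filter):

    #include <fltKernel.h>

    //
    // Hypothetical per-stream context: ShadowSectionObjectPointers backs the
    // shadow (upper) cache map; LowerFileObject->SectionObjectPointer is the
    // real (lower) cache map owned by the underlying file system.
    //
    typedef struct _MY_SCB {
        SECTION_OBJECT_POINTERS ShadowSectionObjectPointers;
        PFILE_OBJECT LowerFileObject;
        // ... resources, file sizes, encryption state, etc. ...
    } MY_SCB, *PMY_SCB;

    //
    // Called when I/O is seen on one view; OtherView is the section object
    // pointers of the view that is NOT being accessed.
    //
    //   - Before a read: flush the other view, so any dirty data it holds
    //     reaches the backing store and this read sees current contents.
    //   - On a write: purge the other view, since the pages it has cached
    //     are now stale.
    //
    // Proper locking (acquiring the stream's resources appropriately) is
    // assumed to be handled by the caller.
    //
    VOID
    MySyncOtherView(
        _In_ PSECTION_OBJECT_POINTERS OtherView,
        _In_opt_ PLARGE_INTEGER FileOffset,
        _In_ ULONG Length,
        _In_ BOOLEAN Writing
        )
    {
        IO_STATUS_BLOCK iosb;

        if (Writing) {
            (VOID) CcPurgeCacheSection( OtherView, FileOffset, Length, FALSE );
        } else {
            CcFlushCache( OtherView, FileOffset, Length, &iosb );
        }
    }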

These situations should be very rare, since access should still be constrained by the sharing mode between the opens. To get into this situation you’d need the file to be opened for shared write, and in that case there will probably be byte range locks. Thus, you’ll have to ensure byte range locks are properly handled in your filter as well.
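
If you do end up supporting shared-write opens, the byte-range lock handling in the filter might look roughly like this sketch (assuming the filter keeps an FSRTL FILE_LOCK in its per-stream context; MyPreReadLockCheck is a hypothetical helper called from the pre-read callback for the shadow file object):

    //
    // Sketch: mirror what a file system does for a non-cached read - a read
    // that conflicts with another caller's byte-range lock must fail.
    // FltCheckLockForWriteAccess is the analogue for the write path.
    //
    FLT_PREOP_CALLBACK_STATUS
    MyPreReadLockCheck(
        _Inout_ PFLT_CALLBACK_DATA Data,
        _In_ PFILE_LOCK FileLock        // kept in the per-stream context
        )
    {
        if (!FltCheckLockForReadAccess( FileLock, Data )) {
            Data->IoStatus.Status = STATUS_FILE_LOCK_CONFLICT;
            Data->IoStatus.Information = 0;
            return FLT_PREOP_COMPLETE;
        }

        return FLT_PREOP_SUCCESS_NO_CALLBACK;
    }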

If an application does shared writes and doesn’t synchronize between the paths, you’re actually OK - the application is already broken, and the fact that you break it differently shouldn’t really matter. This is a good demonstration of what I call the “Heisenberg uncertainty principle of file systems”: asynchronous, overlapping write operations from multiple threads.

Tony
OSR

Thanks Tony, your advice is very useful.
But I’m a little confused about “flush (for a read) or purge (for a write)”. Shouldn’t I call flush first and then purge? I think calling only purge (for a write) may lead to data loss, because the cache maintained by my filter is discarded even though it may still contain data waiting to be flushed to disk.
Am I right?

When data is written to one cached view, it invalidates the data in the other cached view. A *purge* will cause the invalid data to be discarded from the cache, which is what you want. If you do wish to flush the data first (from one cached view) and then purge it, you can do so - it won’t cause an issue - but you’re doing unnecessary I/O (you’re writing data, then writing it again).

Note: I’m assuming that everything here is on PAGE_SIZE boundaries. Clearly when you have things that are not on aligned boundaries you will need to flush the partial pages and hope for the best.
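
To make the partial-page case concrete, here is one possible sketch (the MyPurgeAligned name and the flush-before-purge policy for unaligned edges are just one way of doing it; ROUND_TO_PAGES is the standard WDK macro):

    //
    // Sketch: purge a byte range from the other cached view, rounded out to
    // page boundaries. If the range starts or ends mid-page, flush first so
    // the dirty portions of those partial pages are not simply discarded.
    //
    VOID
    MyPurgeAligned(
        _In_ PSECTION_OBJECT_POINTERS OtherView,
        _In_ PLARGE_INTEGER FileOffset,
        _In_ ULONG Length
        )
    {
        IO_STATUS_BLOCK iosb;
        LARGE_INTEGER purgeOffset;
        ULONG purgeLength;

        purgeOffset.QuadPart = FileOffset->QuadPart & ~(LONGLONG)(PAGE_SIZE - 1);
        purgeLength = (ULONG) ROUND_TO_PAGES(
                          (FileOffset->QuadPart - purgeOffset.QuadPart) + Length );

        if ((purgeOffset.QuadPart != FileOffset->QuadPart) ||
            (purgeLength != Length)) {
            CcFlushCache( OtherView, &purgeOffset, purgeLength, &iosb );
        }

        (VOID) CcPurgeCacheSection( OtherView, &purgeOffset, purgeLength, FALSE );
    }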

When data is read into one cached view and there are dirty contents in the other cached view, you just need to flush the dirty cached view first; the view being read into will then pick up the correct data from the backing store.

Tony
OSR

Thanks Tony, I have the cached I/O part working now.
I’m still confused about memory-mapped operations, though, and again it is a data coherence question. From the archives of this forum I learned that it is impossible to detect memory-mapped I/O, so I’d like to know whether there is any way to detect that there are dirty pages which need to be flushed. I think that should be easier than detecting the memory-mapped I/O itself.

MmDoesFileHaveUserWritableReferences can be used to determine whether there are any writable mappings of the file in a user-mode address space. FastFat uses this function.

The easiest thing to do here is just flush - if there are dirty pages, they get written. If not, this becomes a no-op (because Mm won’t write clean pages!)
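
As a sketch, that check-then-flush might look like this (SectionObjectPointer is assumed to be the section object pointers of the real/lower file; an unconditional flush would work just as well, since clean pages are never written back):

    VOID
    MyFlushIfUserMapped(
        _In_ PSECTION_OBJECT_POINTERS SectionObjectPointer
        )
    {
        IO_STATUS_BLOCK iosb;

        //
        // If the file has ever been mapped writable from user mode, flush the
        // entire file (FileOffset == NULL). If nothing is dirty this is cheap,
        // because clean pages are skipped.
        //
        if (MmDoesFileHaveUserWritableReferences( SectionObjectPointer ) != 0) {
            CcFlushCache( SectionObjectPointer, NULL, 0, &iosb );
        }
    }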

Tony
OSR

“The easiest thing to do here is just flush” - yes, I agree that is the easiest way to get the dirty data onto the disk, but I wonder how another memory-mapped section would know that the data has been modified (which means the data in its mapped view is now out of date) and that it has to fetch the new data. After all, there is no way to inform Mm that the data has been modified.