Content Filtering

Hello,

I want to do content filtering in a minifilter, so I have been working from the
scanner example. For .txt files it works well, but for .doc files it still opens
a document that contains the “foul” string. I know that reading a .doc file
requires an Office reader, but I don’t know how to implement that in a
minifilter. What can I do about this?

Thanks

(This is what I best recall about scanner.)

Scanner works on file extensions, correct? It checks for foul language on create and on write. There are a couple of likely reasons why it is not working with Word:

  1. It reads only the initial few bytes of the file. .doc is a binary format in which even an empty file is approximately 11 KB, so your filter will not find the string there.

  2. When writing, Word creates temp files and later renames them, so if you have not included .tmp (or whatever extension Word uses) you’ll keep missing things.
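
As a rough illustration of the extension check described above, here is a minimal user-mode C sketch. The extension list and the function names are hypothetical, not taken from the scanner sample. A real minifilter would instead compare the parsed UNICODE_STRING extension it gets from the file name information, so treat this only as a picture of the logic:

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical list of extensions to scan; "tmp" is included so that
   Word's temporary files are not skipped on write. */
static const char *const kScannedExtensions[] = { "txt", "inf", "doc", "tmp" };

/* Case-insensitive comparison of two NUL-terminated ASCII strings. */
static int equals_ignore_case(const char *a, const char *b)
{
    while (*a != '\0' && *b != '\0') {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
        ++a;
        ++b;
    }
    return *a == '\0' && *b == '\0';
}

/* Returns 1 if file_name ends in one of the scanned extensions, else 0. */
int has_scanned_extension(const char *file_name)
{
    assert(file_name != NULL);
    const char *dot = strrchr(file_name, '.');
    if (dot == NULL || dot[1] == '\0')
        return 0;
    size_t count = sizeof kScannedExtensions / sizeof kScannedExtensions[0];
    for (size_t i = 0; i < count; ++i) {
        if (equals_ignore_case(dot + 1, kScannedExtensions[i]))
            return 1;
    }
    return 0;
}
```

Matching on the extension only decides which files get scanned. It says nothing about whether the bytes that are read actually contain the readable text.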

Thanks
Aditya

Hey, thanks Aditya.

I tried adding the .tmp extension for writes and it works,
so the write operation is no longer a problem.

But I want to read the content of a .doc file (i.e. binary format)
in the minifilter. Is there any other way to read a .doc file?

Please help.

>>but i want to read the content of doc file (ie binary format) in mini filter,so is there any other way to read .doc file?

Whichever way you read it, the data is the same; you cannot get at the actual text unless you know how to parse a .doc file. The same is true for other binary formats, so there is no straightforward way as far as I can think.

Do you want to do content filtering for binary files? If you can explain a bit more, I think I can visualize it better. Are you checking mail attachments?

Thanks
Aditya

I want content filtering for all file types (e.g. txt, inf, doc, xls, docx, xlsx, etc.).
When I print the buffer returned by FltReadFile(), for txt and inf files it shows the content of the file,
but for doc it just shows the file header (D0 CF 11 E0 A1 B1 1A E1 00). So can I read the content of the document?
I know how to read Office files in .NET (C#, VC++), but is it possible in a minifilter?
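
The bytes D0 CF 11 E0 A1 B1 1A E1 quoted above are the signature of an OLE compound file, the container used by legacy .doc and .xls files. The newer .docx and .xlsx files are ZIP packages, which start with "PK" 03 04. A minimal sketch of telling the containers apart from the first bytes of a read buffer follows. The enum and function names are made up for illustration, and it is written as portable user-mode C rather than driver code:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum container_kind {
    CONTAINER_PLAIN,   /* no known container signature; may be plain text */
    CONTAINER_OLE_CFB, /* OLE compound file: legacy .doc, .xls */
    CONTAINER_ZIP      /* ZIP package: .docx, .xlsx, and ordinary .zip files */
};

/* Classifies a buffer by its leading signature bytes. */
enum container_kind classify_container(const unsigned char *buf, size_t len)
{
    static const unsigned char cfb_magic[8] =
        { 0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1 };
    static const unsigned char zip_magic[4] = { 'P', 'K', 0x03, 0x04 };

    assert(buf != NULL || len == 0);
    if (len >= sizeof cfb_magic && memcmp(buf, cfb_magic, sizeof cfb_magic) == 0)
        return CONTAINER_OLE_CFB;
    if (len >= sizeof zip_magic && memcmp(buf, zip_magic, sizeof zip_magic) == 0)
        return CONTAINER_ZIP;
    return CONTAINER_PLAIN;
}
```

For anything other than CONTAINER_PLAIN, a plain byte search over the start of the file is unlikely to see the document text: the compound file begins with container metadata, and the XML inside a ZIP package is normally compressed.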

Are you opening the file in binary mode, seeking to the correct offset
within the file?
Have you obtained the specification of the doc format?
You should then be able to parse the structure of the file for the start
of the “real” data and then perform the dirty word check.
Do you need to look at headers, tables etc? What about embedded objects
within the document?
I suspect the C# code is a wrapper for unmanaged code, so it would be
possible.
Do you really need to filter “All” file types?
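
To give an idea of what "seeking to the correct offset" involves for a .doc file, here is a hedged sketch of the very first step only: reading the sector size and the location of the first directory sector from the 512-byte compound file header. The field offsets (sector shift at byte 30, first directory sector at byte 48) and the rule that sector N starts at (N + 1) * sector size follow my reading of the published compound file format. The struct and function names are hypothetical:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CFB_HEADER_SIZE 512u

struct cfb_layout {
    uint32_t sector_size;      /* bytes per sector: 512 or 4096 */
    uint32_t first_dir_sector; /* sector number of the first directory sector */
    uint64_t first_dir_offset; /* byte offset of that sector within the file */
};

/* Little-endian field readers, so the code does not depend on host byte order. */
static uint16_t read_le16(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t read_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 1 and fills *out if hdr looks like a compound file header, else 0. */
int cfb_parse_header(const unsigned char *hdr, size_t len, struct cfb_layout *out)
{
    static const unsigned char magic[8] =
        { 0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1 };

    assert(out != NULL);
    if (hdr == NULL || len < CFB_HEADER_SIZE || memcmp(hdr, magic, sizeof magic) != 0)
        return 0;

    uint16_t sector_shift = read_le16(hdr + 30);
    if (sector_shift != 9 && sector_shift != 12) /* only 512 and 4096 are expected */
        return 0;

    out->sector_size = (uint32_t)1 << sector_shift;
    out->first_dir_sector = read_le32(hdr + 48);
    /* The header occupies the first sector-sized slot, hence the +1. */
    out->first_dir_offset = ((uint64_t)out->first_dir_sector + 1) << sector_shift;
    return 1;
}
```

This is nowhere near a full parser. To reach the text you would still have to walk the directory to find the WordDocument stream, follow its sector chain, and then interpret the Word-specific structures inside it, which is a large part of why the replies suggest questioning whether this belongs in a driver.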

-----Original Message-----
From: xxxxx@lists.osr.com
[mailto:xxxxx@lists.osr.com] On Behalf Of
xxxxx@gmail.com
Sent: 16 September 2009 11:09
To: Windows File Systems Devs Interest List
Subject: RE:[ntfsd] Content Filtering

I want content filtering for all file types (e.g. txt, inf, doc, xls,
docx, xlsx, etc.).
When I print the buffer returned by FltReadFile(), for txt and inf files
it shows the content of the file, but for doc it just shows the file
header (D0 CF 11 E0 A1 B1 1A E1 00). So can I read the content of the
document?
I know how to read Office files in .NET (C#, VC++), but is it possible
in a minifilter?


> i know how to read office files in .net(C#,vc++) but in mini filter is it possible?

I would offload all of this to a user-mode app.


Maxim S. Shatskih
Windows DDK MVP
xxxxx@storagecraft.com
http://www.storagecraft.com

>>i know how to read office files in .net(C#,vc++) but in mini filter is it possible?

How about doing all the scanning in user mode and passing the decision to your driver?
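
A minimal sketch of what the user-mode side of such a split might look like, assuming the driver hands over a buffer of file data and waits for a yes/no verdict. The scan_request and scan_reply layouts and the function names are hypothetical. The transport is left out entirely; with a minifilter it would typically be a communication port, with FltSendMessage on the driver side and FilterGetMessage / FilterReplyMessage in the service. The search also looks for the word as UTF-16LE, on the assumption that binary Office formats may store text as 16-bit characters:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical message layouts; a real driver defines its own. */
struct scan_request {
    unsigned long bytes_to_scan;  /* number of valid bytes in contents */
    unsigned char contents[1024]; /* data read from the file by the driver */
};

struct scan_reply {
    int safe_to_open;             /* 1 = allow the open, 0 = block it */
};

/* Naive search for an ASCII needle. If wide is nonzero, the needle is matched
   as UTF-16LE, i.e. each character must be followed by a 0x00 byte. */
static int contains_word(const unsigned char *buf, size_t len,
                         const char *needle, int wide)
{
    size_t n = strlen(needle);
    size_t step = wide ? 2 : 1;
    if (n == 0 || len < n * step)
        return 0;
    for (size_t i = 0; i + n * step <= len; ++i) {
        size_t j = 0;
        while (j < n &&
               buf[i + j * step] == (unsigned char)needle[j] &&
               (!wide || buf[i + j * step + 1] == 0))
            ++j;
        if (j == n)
            return 1;
    }
    return 0;
}

/* Produces the verdict that would be sent back to the driver. */
void handle_scan_request(const struct scan_request *req, struct scan_reply *reply)
{
    assert(req != NULL && reply != NULL);
    size_t len = req->bytes_to_scan;
    if (len > sizeof req->contents) /* never trust the length field blindly */
        len = sizeof req->contents;
    int found = contains_word(req->contents, len, "foul", 0) ||
                contains_word(req->contents, len, "foul", 1);
    reply->safe_to_open = !found;
}
```

The point of doing this in user mode is that handle_scan_request can later be replaced by a proper document parser, using whatever Office-reading libraries are available there, without touching the driver. A raw scan like this one will still miss text that is compressed (docx, xlsx) or that lies outside the bytes the driver passed up.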