Ballpark estimate on time to develop filesystem

OSR_Community_User · March 9, 2007, 9:18am

All,

I’ve seen some discussion on this across several different threads,
and estimates have varied widely, so I thought I’d just come out and
ask the question. Given an expert developer, but a cursory level of
NT filesystem internals knowledge, what kind of ballpark time
estimate are we looking at for developing a new filesystem? In my
specific case, you can assume the need is a virtual drive, which sits
on top of an existing NTFS file system – that may affect the
estimate significantly, so I thought I’d mention it.

I’ve heard estimates of everything from several months to two years,
so I’d like to get a better idea of what we are looking at.

Thanks,

Brad

OSR_Community_User · March 9, 2007, 10:05am

Brad O’Hearne wrote:

> I’ve heard estimates of everything from several months to two years, so
> I’d like to get a better idea of what we are looking at.

Ballpark for a filesystem with no previous fsd experience? 12-18 months
to get something production ready.

I really wouldn’t take something that complex on as the first project -
try a few simple filters first to get a feel for the development
process, how the IRP system works, etc.

Tony

OSR_Community_User · March 9, 2007, 10:07am

To develop a file system from scratch? If you’re well versed in data structures,
you may pull together something stable within a few months, but it will not
be a good file system. Experience is definitely required to make a robust file
system (i.e. recovery, safety, optimizations, etc.).
If you have a budget, do consider hiring someone who has already written a file system.
At least ask them for a quote. I know some members have made excellent FSes within
three months (but with a lot of previous experience). So for your file system, it
might take them several weeks.

BTW, you are looking at two drivers: a file system (for whatever reason you may
need it; I didn’t follow the previous discussion) and a virtual drive. The latter is
fairly easy to accomplish with RamDisk, as Dan suggested, but it is still a separate task.

Dejan.

–
Kind regards, Dejan
http://www.alfasp.com
File system audit, security and encryption kits.

OSR_Community_User · March 9, 2007, 10:29am

Not to disagree with Dejan entirely, but I suspect the number of people
capable of doing something in a few weeks is vanishingly small - and
none of those who might have a prototype/demo up in a few weeks would be
willing to ship such a product.

In my experience, the complications that arise are:

Versions: each additional OS version that must be supported increases
the development/test overhead. We routinely do 6-8 week intense test
cycles prior to any release (so even if development were to be an
astonishingly fast 3 weeks, a real release would be several months.)
Development is best targeted to the downlevel platform and then ported
to newer versions (but this does mean you must restrict yourself to
using only downlevel features, or encoding equivalents for uplevel
features.) One example of this is IoCreateFileSpecifyDeviceObjectHint.
It is not present in all versions of Windows 2000 (it can be called on
SOME versions, but not on all of them.)

File Systems: each additional FS with which you interact increases the
cost associated with the implementation. If you restrict it to just one
(NTFS or FAT) things are easier (and this is a point because you will be
interacting with one, or the other, or both.) If you include any of the
network file systems your development and test time will easily double
(the redirector implementation for SMB exhibits numerous genuinely
Byzantine behaviors.)

Interop: for some reason, everyone seems to think that they will “just
work” with every AV product on the market. Our experience is exactly
the opposite - that you have to test and resolve issues with every AV
product (and honestly, every VERSION of every AV product that you expect
to support.) Some products disable themselves when you attach a debugger
to the system (making debugging a tad bit more challenging.)

Application specific failures: The moment someone comes to you with a
case like “program X fails on your file system” be prepared for some
long, challenging debugging situations. I have seen failures that are
incredibly subtle and can take days or weeks of intense effort to
resolve.

Edge conditions: read-only files/devices, ACLs, removable media,
removable devices, error paths and failure conditions ALL end up
creating considerable grief.

These issues are the difference between a “prototype” and a “product” and
are what make the development process take so long.

Regards,

Tony

Tony Mason
Consulting Partner
OSR Open Systems Resources, Inc.
http://www.osr.com

Nikolay_Zelinski · March 9, 2007, 11:10am

EldoS Corporation has developed a file system driver
(http://www.eldos.com/solfs/driver.php) and will release more FS driver
products in the near future. It's very likely that we can offer a solution
for you. Please send a letter to xxxxx@eldos.com or use our HelpDesk
(http://www.eldos.com/support/ticket_list.php) to contact us for details.

Nikolay

OSR_Community_User · March 9, 2007, 11:12am

Thank you Tony Hoyle, Dejan, and Tony Mason, for the replies. I find the
information very useful. I know it is hard to ballpark something when
you don’t have the exact details. So in a sense, I’m getting a general
ballpark with all possible obstacles considered (which is good). Without
getting too detailed, or convoluting the matter, my specific requirement
is this:

- Develop a drive which sits atop an existing mounted drive (a local hard
  drive, essentially; ignore network redirection for now), for which all
  reads and writes are intercepted and handled. In other words, there is
  no separate, unique device associated with this filesystem. It leverages
  an existing device for its actual reading and writing. Call this a
  virtual drive, or something else; I’m not sure how you would term it.
- The drive must be addressable by all apps, scripts, and the command
  prompt, and visible in Windows Explorer.
- While the hook can be at the kernel level, I want the bulk of my code to
  run in user mode.
- It works on Windows XP and Windows 2000 (ignore Vista and older versions
  of Windows).

From various questions I’ve posted in other threads, it seems there
isn’t a general consensus on whether I need to develop a file system
driver, a system filter driver, or something else (I’ve heard the word
“virtual drive” used, though I am not totally clear how this maps to a
file system driver and/or filter driver, as I think we are still talking
one of those – if a “virtual drive” is something different entirely,
please expound – my Googling didn’t turn up anything extremely
enlightening in this respect.) So I am still trying to determine exactly
what needs to be developed. It seems a mini-filter might do the job, but
even with a mini-filter, I have a question as to how to construct the
mountable entity – the user-addressable thing that is seen in Explorer,
can be referenced by name from the command prompt, scripts, etc. So that
tends to lean me back in the file system direction – but that seems
entirely too much – as I don’t need to replace or create any
fundamental filesystem-like behavior at all – I just need to intercept
it and redirect my own reads and writes to an existing filesystem.

You guys have been amazingly helpful, and I’m coming to the conclusion
of my information gathering for a direction – but if you could indulge
me one last time, two final questions:

1) Can you be more certain as to whether you see my need as a file
   system driver, or filter driver (or something else), and if a filter
   driver, what the mechanism is for presenting this virtual drive to the
   user/system in Explorer, the command prompt, scripts, etc.?
2) Based on your answer to 1), can you give a tighter estimate? (For
   example, if I need merely a filter driver, I’d expect something far
   below 18 months.)

Thanks so much – this has been extremely helpful. Our course is going
to be decided today, and this last piece of information will really help
that congeal.

Brad

Questions? First check the IFS FAQ at https://www.osronline.com/article.cfm?id=17

OSR_Community_User · March 9, 2007, 11:38am

Brad,

First, let me point out that filter drivers are often not easier than
file systems to implement. The lower edge of a file system is (usually)
a block or network interface, while the lower edge of a filter driver is
typically a file system. The latter interface is more complicated.

A “filter driver” attempts to modify the behavior of an existing file
system. It seems from your description that you are trying to create
the appearance of a separate volume where the actual implementation of
storage is not tied to that appearance. To me that suggests a virtual
file system (essentially one that redirects the actual storage to the
real location.)

But the range of options isn’t exactly black and white. You could, for
example, create an NTFS volume and have it store your logical
presentation, with the redirection information encoded as reparse
points. In that case you’d be implementing a filter driver that catches
the reparse operations and interprets them so the request can be sent to
the original location. This would in fact be a filter driver and you’d
be exploiting this extra NTFS volume as your storage of the naming
information.

Or you might choose instead to implement this as a logical file system,
where you store the name space structure in some proprietary format
(hey, store it in SQL if you want, it really doesn’t matter.) Then when
someone queries a directory you construct the format of the directory
and ship it to them. When someone opens a file or stream you use the
name information to figure out where the REAL object is located and send
the request to that alternate location. You might do this via the
STATUS_REPARSE mechanism (“get out of the way”) or you might do this via
a mapping mechanism (“don’t bypass me”) depending upon your specific
requirements (one “issue” with the get out of the way approach is the
relative open, which can complicate things.)
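
The logical file system described above can be modeled in user mode as a lookup table from virtual names to real locations. The following Python sketch is only a model, not driver code: `LogicalNamespace`, `ReparseResult`, and `ProxyHandle` are hypothetical names, a dict stands in for the private name store, and the paths are invented. It shows the two ways of handling an open: handing back the real location so the caller reissues the request (analogous to the STATUS_REPARSE, "get out of the way" approach), or keeping a handle that stays in the I/O path ("don't bypass me").

```python
from dataclasses import dataclass, field

SEP = "\\"


@dataclass
class ReparseResult:
    """'Get out of the way': the caller should reissue the open at target."""
    target: str


@dataclass
class ProxyHandle:
    """'Don't bypass me': the virtual layer keeps the handle and forwards I/O."""
    virtual_path: str
    target: str


@dataclass
class LogicalNamespace:
    # virtual path -> real path; a dict stands in for the proprietary store
    table: dict = field(default_factory=dict)

    def add(self, virtual_path, real_path):
        # Case-insensitive names are approximated by lowercasing the key
        self.table[virtual_path.lower()] = real_path

    def list_directory(self, virtual_dir):
        """Construct a directory listing on demand from the name store."""
        prefix = virtual_dir.lower().rstrip(SEP) + SEP
        names = {
            vpath[len(prefix):].split(SEP, 1)[0]
            for vpath in self.table
            if vpath.startswith(prefix)
        }
        return sorted(names)

    def open(self, virtual_path, bypass):
        """Resolve a virtual name and pick one of the two redirection styles."""
        target = self.table.get(virtual_path.lower())
        if target is None:
            raise FileNotFoundError(virtual_path)
        if bypass:
            return ReparseResult(target)
        return ProxyHandle(virtual_path, target)


ns = LogicalNamespace()
ns.add(r"V:\docs\a.txt", r"C:\store\0001.dat")
ns.add(r"V:\readme.txt", r"C:\store\0002.dat")

print(ns.list_directory("V:" + SEP))
print(ns.open(r"V:\docs\a.txt", bypass=True))
print(ns.open(r"V:\docs\a.txt", bypass=False))
```

In this model the bypass case loses sight of the file once the open completes, which is the trade-off the post describes; the proxy case keeps the virtual layer involved in every later operation.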

Based upon what you’ve described thus far, my personal inclination would
be to create a logical structure via private data representation,
retrieve the information dynamically as I needed it and redirect the
requests to the “real target” wherever it might be located (I keep
thinking it might be worthwhile trying to use the “open by ID”
optimization, but that might just be unduly complicated.) As long as
you control directories, the only relative opens would be on the
file/stream (so someone could open another stream of the same file) and
that might be sufficient. But if you are sensitive to renames, for
example, then the “get out of the way” approach probably doesn’t work as
well for you, and you may want to watch for the renames on those files.
Then you either go to the double mapping technique (first file object
points to you, second file object points to the NTFS instance of the
file) or you use a hybrid technique (so you have a filter to watch for
meta-data operations, and a namespace driver to present the logical
format.)
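
The double mapping technique can be modeled in user mode as well. In this hypothetical Python sketch (`UpperFile` and `LowerFile` are invented names, and an in-memory buffer stands in for the file on the NTFS volume), the caller only ever holds the upper object, so every read, write, and rename passes through the virtual layer. It is a conceptual model under those assumptions, not a description of how a real driver manages file objects.

```python
import io


class LowerFile:
    """Stands in for the file object opened on the underlying volume."""

    def __init__(self, data=b""):
        self._buf = io.BytesIO(data)

    def read(self, offset, length):
        self._buf.seek(offset)
        return self._buf.read(length)

    def write(self, offset, data):
        self._buf.seek(offset)
        return self._buf.write(data)


class UpperFile:
    """The object callers see; it forwards data I/O to its lower twin."""

    def __init__(self, name, lower):
        self.name = name
        self._lower = lower

    def read(self, offset, length):
        return self._lower.read(offset, length)

    def write(self, offset, data):
        return self._lower.write(offset, data)

    def rename(self, new_name):
        # The virtual layer sees the rename because the caller never
        # holds the lower object; only the logical name changes here.
        self.name = new_name


lower = LowerFile(b"hello world")
upper = UpperFile(r"V:\docs\a.txt", lower)

upper.write(0, b"J")
upper.rename(r"V:\docs\b.txt")
print(upper.name, upper.read(0, 11))
```

Because the rename is handled by the upper object, the logical name can change without disturbing the mapping to the underlying data.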

The beauty of this space is that there are so many different ways of
achieving the specific functionality. None of them are perfect and each
of them has a range of trade-offs involved (and the BEST part is they
are short-term/long-term trade-offs that you seldom understand until you
are done with the project and realize you’ve chosen a less-than-optimal
solution.)

Job security in this business is very good.

Regards,

Tony

Tony Mason
Consulting Partner
OSR Open Systems Resources, Inc.
http://www.osr.com

OSR_Community_User · March 9, 2007, 12:46pm

Nikolay,

Thank you very much for your reply. I have sent an email to your
HelpDesk for more info – the sooner they can turn that request around
the better.

Brad

Don_Burn_1 · March 9, 2007, 12:56pm

Brad,

I believe it has been mentioned, but if not: consider getting the
OSR file system development kit, http://www.osr.com/toolkits_fsdk.shtml.
This will take much of the complexity out of developing a file
system and, while pricey, is well worth the cost. The product is
very solid and has great documentation and support.

–
Don Burn (MVP, Windows DDK)
Windows 2k/XP/2k3 Filesystem and Driver Consulting
Website: http://www.windrvr.com
Blog: http://msmvps.com/blogs/WinDrvr
Remove StopSpam to reply

OSR_Community_User · March 9, 2007, 1:16pm

Don,

Thanks for the reply. I’m compiling all of our options right now, and I
don’t doubt the OSR FSDK may be one of the best (if not the Cadillac) of
the options out there. It is one of the options we are considering.

Brad

OSR_Community_User · March 11, 2007, 7:59am

On Fri, March 9, 2007 4:29 pm, Tony Mason wrote:

> Not to disagree with Dejan entirely, but I suspect the number of people
> capable of doing something in a few weeks is vanishingly small - and
> none of those who might have a prototype/demo up in a few weeks would be
> willing to ship such a product.

From scratch, yes, but I am confident some have a framework for such jobs,
which would make things easier (I’m going on the premise that the
file system here, if needed, need not be complex at all, so no new
tricks need to be implemented).
I may be wrong, of course.

–
Alfa File System Filtering components. Security, monitoring and encryption.