File Virtualization Methods

Hi,

I want to ask if anyone knows whether it is possible to redirect I/O operations to another, ‘fake’ file that does not need to exist.

To be more precise: suppose an application uses 10 different files, and before it creates/opens a specific file we redirect the I/O to some ‘virtual’ file. The virtual file’s size and data can be supplied by intercepting the read/write operations to that file.

I saw that there is a possibility to ‘map’ files through GenerateFileNameCallback and related functionality, but is it possible to redirect to a ‘fake’ file that does not actually need to be created?

I understand that this is not an easy (if at all possible) task, but if someone can point out some guidelines, that would be great.

Thanks in advance!

…and the purpose of this is…?

Note that an application can create a “temporary” file by specifying a
flag in the CreateFile call, which is a hint to Windows that it may not
need to be committed to the disk. A file system is free to honor this or
ignore it as it sees fit.
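
For illustration, the hint Joe mentions is passed from user mode like this (a minimal sketch; the path and data are made up):

#include <windows.h>

int main(void)
{
    /* FILE_ATTRIBUTE_TEMPORARY hints that the system may avoid flushing
       the data to disk if memory permits; FILE_FLAG_DELETE_ON_CLOSE
       removes the file when the last handle is closed. */
    HANDLE h = CreateFileW(L"C:\\Temp\\scratch.tmp",
                           GENERIC_READ | GENERIC_WRITE,
                           0,                /* no sharing */
                           NULL,             /* default security */
                           CREATE_ALWAYS,
                           FILE_ATTRIBUTE_TEMPORARY | FILE_FLAG_DELETE_ON_CLOSE,
                           NULL);
    if (h == INVALID_HANDLE_VALUE)
        return 1;

    DWORD written;
    WriteFile(h, "scratch data", 12, &written, NULL);

    CloseHandle(h);  /* the file disappears here */
    return 0;
}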

So if your goal is to somehow improve performance, do you have any idea if
this will? It sounds like the RAMdisk question in sheep’s clothing.
joe


Joseph, thanks for the reply.

I need to intercept the I/O of SQL Server for certain operations and redirect it to a ‘virtual’ (fake) file. Performance is not an issue.

Thanks,
Adrijan

Adrijan,

If the ‘virtual’ files are going to be backed by some real file on the file system, you could just return STATUS_REPARSE on the IRP_MJ_CREATE operation for the files that don’t exist. If there is no real file backing the virtualized files, then I’d look into something that leverages the methods described here: https://www.osronline.com/article.cfm?article=560.

Short answer: Yes, it’s possible.
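
To give the shape of the reparse approach, a minifilter pre-create callback would look roughly like this (a sketch only: the target path and the name-matching logic are placeholders, and IoReplaceFileObjectName requires Vista SP1 or later):

#include <fltKernel.h>

FLT_PREOP_CALLBACK_STATUS
PreCreate(
    _Inout_ PFLT_CALLBACK_DATA Data,
    _In_ PCFLT_RELATED_OBJECTS FltObjects,
    _Outptr_result_maybenull_ PVOID *CompletionContext
    )
{
    /* Placeholder target; a real filter decides per file whether and
       where to redirect. */
    static const WCHAR target[] = L"\\??\\C:\\Redirected\\ActiveDb.mdf";
    NTSTATUS status;

    UNREFERENCED_PARAMETER(FltObjects);
    *CompletionContext = NULL;

    /* ... name-matching logic goes here; pass through if no match ... */

    /* Swap the name on the file object, then have the I/O manager
       re-issue the create against the new path. */
    status = IoReplaceFileObjectName(Data->Iopb->TargetFileObject,
                                     (PWSTR)target,
                                     (USHORT)(sizeof(target) - sizeof(WCHAR)));
    if (!NT_SUCCESS(status)) {
        Data->IoStatus.Status = status;
        Data->IoStatus.Information = 0;
        return FLT_PREOP_COMPLETE;
    }

    Data->IoStatus.Status = STATUS_REPARSE;
    Data->IoStatus.Information = IO_REPARSE;
    return FLT_PREOP_COMPLETE;
}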

> I need to intercept the I/O of SQL Server for certain operations …

You are aware that SQL Server (like nearly every transactional DBMS) has very strict I/O requirements? File integrity, ordering of I/O operations, flushes, non-cached access, …

[SQL Server I/O Basics, Chapter 2]
http://technet.microsoft.com/en-us/library/cc917726.aspx

You should at least run the MS-provided tests for I/O systems:

[How to use the SQLIOSim utility to simulate SQL Server activity on a disk subsystem]
http://support.microsoft.com/kb/231619/en-us

GP

Also, how is it that you expect to know that the file request is coming
from SQL server? And how are you going to deal with requests from
industrial-strength backup utilities that are backing up live SQL
databases? And what about disk defragmenters?

A p-baked idea is a generalization; half-baked ideas have p==0.5. A lot
of the ideas I hear in this newsgroup seem to have p < 0.1. When you
think about mucking in file systems, you have to understand fourth-order
effects if you want any hope of success. I’ve been writing device drivers
on a variety of platforms for over 40 years. I’ve heard Tony Mason’s
talks at conferences, and they are one reason I would never attempt to
write a file system. His knowledge of nth-order effects is amazing. I am
not sure how I could reliably tell that requests were coming from SQL
server, and I don’t know all the things it does, but my past experience
suggests that it might have background threads doing file optimizations,
record compaction, transaction management, etc., and which of these can be
safely “virtualized”? The “Hey, fellas, let’s just redirect the IRPs to
a virtual file” approach is a bit scary. And even if you slog through
millions of IRP trace lines, you cannot be sure you have seen all the
patterns that *might* exist, and therefore might have no idea if your
solution is complete.

If I had a client that wanted something like this, the first thing I’d
suggest is a solid requirements document. Then I’d suggest they pay
someone at OSR to do a sanity check on it, and perhaps even give a quote
on doing it. File systems are not for beginners, or even people like me
who have only done simple PCI devices. And there is essentially no room
for errors.

Note that a requirements document states the problem that needs to be
solved, fairly precisely. “I need to intercept the IO for SQL server for
certain operations” is not a requirement; it is a suggestion for an
implementation. There may be much easier ways to achieve this, perhaps
involving redirection using existing mechanisms already implemented and
tested for years. Or, maybe you really do have to track every IRP. But
I’m not sure from your questions that you have the in-depth understanding
of the complexities of the file system to avoid making some grievous
error. And while my success rate is high, my failures were nearly always
from some subtlety I was not aware of. And I saw one project nearly crash
and burn because the disk did not handle power failure well, and this
hardware failure compromised the transaction management. It was a
fourth-order effect.

For example, the statement about a “fake” file that doesn’t need to exist
at all is confusing. If I write to it, is the data simply discarded
(because the file doesn’t exist at all) and if so, what happens when I try
to read that data back? What do I get? If I have a terabyte of database,
what does the fake file do if I write to byte offset 760,238,124? Does
the fake file have to handle sparse record updates? And where does the
“fake” file store all the information? Obviously, for realistic-sized
databases it can’t store it in main memory, so in fact the “fake” file has
to have a place to put this, which would be on the disk, making it a
“real” file. How is this going to live in the SQL ecosystem, which as I
indicated might have lots of pathways in SQL server itself and more
components than SQL server in the ecosystem (backup and defragmentation
being the two most immediate ones that came to mind)? And don’t forget
that SQL is transacted, so how does the fake file live in harmony with the
notion of transactioning? And what about record locking?

A long time ago, in a different lifetime, I worried about these things,
and although I’ve not used SQL server or dealt with the Windows file
system, I do know that the actions of both are very complex, and the two
together add composite complexity to the problem.
joe


Guys, thanks for the replies,

Joseph, I understand your comments and suggestions, and I agree that I was not precise about the task that I need to accomplish. I also agree that I am new at this and don’t understand the complexity of the problem, and because of that I am looking for the easiest way to accomplish things.
I understand that this is not an ‘I will do it in 3 days’ task; it can take some serious time. But this is the task I was assigned, and I will do my best to achieve it.
Also, English is not my native language, so I apologize if I write something that I shouldn’t.

In general, the task I want to achieve is something similar to Virtual Database; here is a link to an example:
http://www.idera.com/Help/SQLvdb/1-5/Web/
So in general this is possible. They also used the driver/service approach.

My idea was the following (I will try to explain):

  1. The user specifies the .bak file for restore.
  2. Instead of restoring the .bak file (as SQL Server normally would),
  3. I create a ‘virtual’ .mdf file whose I/O is intercepted (it can be a real file on the FS).
  4. Read/write data are populated from the .bak file via the ‘virtual’ .mdf.

The point of this process is saving space.

Now the main issue is the file size of the .mdf file. The question is whether it is possible to manipulate the file size without actually changing the physical size of the file. I read on this forum that this is never a good idea, and maybe my approach is wrong (but currently it is the only one I have). I didn’t find any document about this issue. The SQL read/write data is not an issue here.

If someone can suggest another approach, or point to some examples of file size manipulation, that would be great.

Thanks again for the replies.

I have gone to the site, read the descriptions, and looked at the prices.
It looks like a sophisticated, complex product which has been tested and
has support. How expensive is your time? Given that it will probably take
on the order of a year to achieve a subset of their product, that it will
be a constant maintenance problem which will continually distract you from
more important work, and that I would not begin to estimate the cost of a
bug that caused a business-critical database to get scrambled, how,
exactly, is writing your own going to be cost-effective in any way? If you start now,
you might have something in a year that mostly works, sort of. Or, you
can call them, have it shipped fastest way, and be up and running in less
than a week (if you are in the US, or have a local distributor, then
potentially running the next day). I don’t know how you price your time,
but if I had to write a requirements document (no implementation) I would
already be at a higher price point than their 10-node license. The
implementation cost would be higher than buying one of every one of their
SQL products, and I wouldn’t guarantee that what I ended up with was as
robust and reliable as their product. I’ve seen too many spectacular
“NIH” disasters (“Not Invented Here”) including one case where several
weeks of senior developers’ time was diverted into writing a spreadsheet
because the mainframe spreadsheet cost $6,000. And buying everyone who
needed it a desktop PC with Lotus-1-2-3 would have cost about $15,000.
The result was inevitable; after wasting 12 or more man-weeks (nominally,
about $60,000) the project was abandoned. Meanwhile, every other
deliverable to the revenue stream slipped a month. The spreadsheet was a
purely internal tool, and was pure expense. So, unless you are planning
to compete with them, building your own is unlikely to be cost effective.

joe


> industrial-strength backup utilities that are backing up live SQL databases?

They either use VSS or the SQL’s BACKUP DATABASE statement.


Maxim S. Shatskih
Windows DDK MVP
xxxxx@storagecraft.com
http://www.storagecraft.com

You do realize that a $20,000/mo salary is hardly typical even in the driver dev field in the USA, right? And that
a $1,000/mo gross salary is considered high in third-world countries, and even in the BRIC countries it is not common either?
When you started talking, I agreed, but when you introduced some numbers, I wasn’t sure whether you were arguing
for or against your previous case :)

xxxxx@flounder.com wrote:

> The result was inevitable; after wasting 12 or more man-weeks (nominally,
> about $60,000) the project was abandoned.


Kind regards, Dejan (MSN support: xxxxx@alfasp.com)
http://www.alfasp.com
File system audit, security and encryption kits.

Ok Adrijan,

so when mounting a backup file, you are speaking about a read-only database. You want to “virtualize” the .mdf file, based on the .bak file, which is a *full* database backup file.

Generally you would need to perform the following actions:

  1. have some user-mode app which takes a .bak file and controls the overall process
  2. create a virtual file, as “read-only”; it may live in an existing namespace (like temp), or it may be a filter
  3. issue an sp_attach_single_file_db SQL statement to the instance from the UM app
  4. intercept opens and reads, and construct the data from the .bak file

Because I have no experience with FS filters, I can’t tell you about that part. You should consider buying a license for an existing virtualization filter (like Callback File System or another).

Reversing the format of .bak files should be quite doable. Once you understand the headers, the data pages (SQL Server uses 8 KB pages in data files) are dumped directly to the .bak file. Only used pages are dumped. Unused pages don’t need to be initialized (they are fully initialized on allocation, which won’t happen in a read-only DB).

One hard point will be the fact that, if you want to provide a transactionally consistent view of the data, you also have to understand the transaction log records (which are written sequentially, with variable-sized records, and are not organized in pages) appended to the data pages in a full backup file, in order to perform a virtual UNDO/REDO phase against the .bak. There is some documentation about the binary layout of data pages, but the binary encoding of the transaction log is completely undocumented and may change (and has changed) between SQL Server versions.

I remember a quite successful tool which implemented TX-log reading (it was an “undo” tool for SQL, all user mode) for SQL 2000, where the (quite big) company never managed to upgrade it to SQL 2005.

GP

Thanks for the replies,

Examine the restore command from SQL:
RESTORE DATABASE [ActiveDb] FROM
DISK = N'F:\BASE\TestDB.bak' WITH FILE = 1,
MOVE N'TestDB' TO N'C:\Temp\ActiveDb.mdf',
MOVE N'TestDB_log' TO N'C:\Temp\ActiveDb_1.LDF',
NOUNLOAD, STATS = 10
GO

1. When you start the restore process, SQL will create the .mdf and .ldf files (others as well, but I am simplifying the process).
2. Then SQL will start reading from/writing to these files.
3. Reads/writes will occur on the .mdf and .ldf files.
4. My driver is supposed to intercept these calls for *.mdf and *.ldf and supply the data from the *.bak (reading); see the sketch just below.
5. Now writing is problematic: when SQL tries to write this data, I need to discard it (this information is in the .bak), but I need to change the file size, allocation size, EndOfFile… without changing the actual size. (Writing)
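
For (4), reading, what I have in mind is roughly this pre-read shape (an untested sketch: BuildPageFromBak is a helper I would still have to write, and cached vs. non-cached reads, paging I/O and ranges beyond EOF all need more thought):

#include <fltKernel.h>

/* Hypothetical helper: materialize the requested byte range from the
   .bak file into the caller's buffer. */
VOID BuildPageFromBak(PVOID Buffer, LONGLONG Offset, ULONG Length);

FLT_PREOP_CALLBACK_STATUS
PreRead(
    _Inout_ PFLT_CALLBACK_DATA Data,
    _In_ PCFLT_RELATED_OBJECTS FltObjects,
    _Outptr_result_maybenull_ PVOID *CompletionContext
    )
{
    PFLT_PARAMETERS p = &Data->Iopb->Parameters;
    PVOID buffer = p->Read.ReadBuffer;

    UNREFERENCED_PARAMETER(FltObjects);
    *CompletionContext = NULL;

    /* Prefer a system address if an MDL is present; a real filter must
       also handle user-mode buffers safely (probe/except). */
    if (p->Read.MdlAddress != NULL) {
        buffer = MmGetSystemAddressForMdlSafe(p->Read.MdlAddress,
                                              NormalPagePriority);
    }
    if (buffer == NULL) {
        Data->IoStatus.Status = STATUS_INSUFFICIENT_RESOURCES;
        Data->IoStatus.Information = 0;
        return FLT_PREOP_COMPLETE;
    }

    /* Fill the buffer from the .bak-backed store and complete the read
       here; the file system below never sees it. */
    BuildPageFromBak(buffer, p->Read.ByteOffset.QuadPart, p->Read.Length);

    Data->IoStatus.Status = STATUS_SUCCESS;
    Data->IoStatus.Information = p->Read.Length;
    return FLT_PREOP_COMPLETE;
}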

My question is not related to any SQL part of the problem (this is a driver forum, not a SQL one). The structures of the .mdf, .bak and .ldf files are not an issue here, and an implementation for this already exists and works without a driver.

The main issue is number (5), writing.

  1. Is it possible to ‘fake’ the file size? (Documents, examples, posts… please!)
  2. I read on this forum about layered FSDs (is there some related documentation, examples, …?)
  3. Is it possible to override IRP_MJ_QUERY_INFORMATION, IRP_MJ_NETWORK_QUERY_OPEN and IRP_MJ_DIRECTORY_CONTROL to achieve this, or is it a bad idea?
  4. Any other, better approach?

The question is whether I can change the file size according to the write buffer size while actually canceling/ignoring the write operation. What I have in mind for (1) and (3) is roughly the sketch below.
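
Untested sketch of the size faking in a post-query-information callback (VirtualEof stands for per-file bookkeeping my filter would keep, e.g. in a stream context):

#include <fltKernel.h>

/* Hypothetical bookkeeping; a real filter would track this per file in
   a stream context rather than in a global. */
static LONGLONG VirtualEof = 10LL * 1024 * 1024 * 1024;  /* fake 10 GB */

FLT_POSTOP_CALLBACK_STATUS
PostQueryInformation(
    _Inout_ PFLT_CALLBACK_DATA Data,
    _In_ PCFLT_RELATED_OBJECTS FltObjects,
    _In_opt_ PVOID CompletionContext,
    _In_ FLT_POST_OPERATION_FLAGS Flags
    )
{
    PVOID info = Data->Iopb->Parameters.QueryFileInformation.InfoBuffer;

    UNREFERENCED_PARAMETER(FltObjects);
    UNREFERENCED_PARAMETER(CompletionContext);

    if (!NT_SUCCESS(Data->IoStatus.Status) ||
        FlagOn(Flags, FLTFL_POST_OPERATION_DRAINING)) {
        return FLT_POSTOP_FINISHED_PROCESSING;
    }

    switch (Data->Iopb->Parameters.QueryFileInformation.FileInformationClass) {
    case FileStandardInformation: {
        PFILE_STANDARD_INFORMATION si = info;
        si->EndOfFile.QuadPart = VirtualEof;
        si->AllocationSize.QuadPart = VirtualEof;
        break;
    }
    case FileAllInformation: {
        PFILE_ALL_INFORMATION ai = info;
        ai->StandardInformation.EndOfFile.QuadPart = VirtualEof;
        ai->StandardInformation.AllocationSize.QuadPart = VirtualEof;
        break;
    }
    default:
        break;
    }
    return FLT_POSTOP_FINISHED_PROCESSING;
}

IRP_MJ_DIRECTORY_CONTROL would need the same treatment for the directory-enumeration information classes, and I assume the fast path for IRP_MJ_NETWORK_QUERY_OPEN has to be disallowed in the pre-callback (FLT_PREOP_DISALLOW_FASTIO) so it comes back as a regular open/query that the filter can see.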

Thanks,

Hello again,

so you want to go the RESTORE DATABASE path, and not the ATTACH DATABASE (as read-only) path? In this case you can just let SQL Server perform the REDO/UNDO, and you should not need to understand the log records, which is good.

I don’t think you have to virtualize the .LDF transaction log, because SQL will initialize it after the restore, so there should be no duplication of data from the .BAK.

> Now writing is problematic: when SQL tries to write this data, I need to
> discard it (this information is in the .bak), but I need to change the file size,
> allocation size, EndOfFile… without changing the actual size. (Writing)

I don’t think you can ignore all writes to the .MDF, because SQL Server will modify it in the RECOVERY phase. You need to persist these changes, otherwise the database can’t be accessed.

You only want to discard writes to the MDF/NDF if the page is *exactly* like the page in the .BAK. Otherwise you have to perform a “copy on write”.
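
In minifilter terms, that decision would sit in the pre-write callback, roughly like this (a conceptual sketch; PageMatchesBak and PersistToDiffStore are hypothetical helpers for the comparison and for a side “diff” store):

#include <fltKernel.h>

/* Hypothetical helpers: compare a written range against the .bak copy,
   or persist a modified range to a side "diff" store. */
BOOLEAN PageMatchesBak(LONGLONG Offset, const VOID *Buffer, ULONG Length);
NTSTATUS PersistToDiffStore(LONGLONG Offset, const VOID *Buffer, ULONG Length);

FLT_PREOP_CALLBACK_STATUS
PreWrite(
    _Inout_ PFLT_CALLBACK_DATA Data,
    _In_ PCFLT_RELATED_OBJECTS FltObjects,
    _Outptr_result_maybenull_ PVOID *CompletionContext
    )
{
    PFLT_PARAMETERS p = &Data->Iopb->Parameters;
    PVOID buffer = p->Write.WriteBuffer;

    UNREFERENCED_PARAMETER(FltObjects);
    *CompletionContext = NULL;

    /* Prefer a system address if an MDL is present; user-mode buffers
       would need probing in real code. */
    if (p->Write.MdlAddress != NULL) {
        buffer = MmGetSystemAddressForMdlSafe(p->Write.MdlAddress,
                                              NormalPagePriority);
    }
    if (buffer == NULL) {
        Data->IoStatus.Status = STATUS_INSUFFICIENT_RESOURCES;
        Data->IoStatus.Information = 0;
        return FLT_PREOP_COMPLETE;
    }

    if (PageMatchesBak(p->Write.ByteOffset.QuadPart, buffer, p->Write.Length)) {
        /* Identical to the backup: pretend the write succeeded. */
        Data->IoStatus.Status = STATUS_SUCCESS;
        Data->IoStatus.Information = p->Write.Length;
        return FLT_PREOP_COMPLETE;
    }

    /* Changed data: copy-on-write into the diff store, then complete. */
    Data->IoStatus.Status = PersistToDiffStore(p->Write.ByteOffset.QuadPart,
                                               buffer, p->Write.Length);
    Data->IoStatus.Information =
        NT_SUCCESS(Data->IoStatus.Status) ? p->Write.Length : 0;
    return FLT_PREOP_COMPLETE;
}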

> My question is not related to any SQL part of the problem (this is a driver
> forum, not a SQL one). The structures of the .mdf, .bak and .ldf files are not
> an issue here, and an implementation for this already exists and works without a driver.

I would take a look at the Callback File System, which implements all the driver parts, so you can reuse your existing UM code (APIs exist for .NET, managed and unmanaged C++, Java and Delphi). It allows you to virtualize the file size and EndOfFile. You get callbacks as events in your UM component.

[OnSetEndOfFile event/delegate/callback]
http://www.eldos.com/documentation/cbfs/ref_evt_setendoffile.html

[OnWriteFile event/delegate/callback]
http://www.eldos.com/documentation/cbfs/ref_evt_writefile.html

[OnSetAllocationSize event/delegate/callback]
http://www.eldos.com/documentation/cbfs/ref_evt_setallocationsize.html

Prices start at US$4000, depending on the license. And you get a fully developed, tested and signed driver. And you can try it with an eval license.

Günter, thanks for the reply,

Yes, I already tried their product, and the problem is that not all calls for the file size are covered. They also confirmed that CallbackFilter cannot be used for what I need.

But thanks again for the reply.

Hello Adrijan,

OK, I was able to get good results with CBFS, but in a different scenario.

Maybe you could avoid going the kernel way by implementing the file virtualizer in the form of an SMB/CIFS server. Like:

RESTORE DATABASE [ActiveDb] FROM
DISK = N'F:\BASE\TestDB.bak' WITH FILE = 1,
MOVE N'TestDB' TO N'\\mytaggedinterface\ActiveDB.mdf',
MOVE N'TestDB_log' TO N'\\mytaggedinterface\ActiveDB.ldf'

but you’ll need trace flag 1807 to allow UNC paths for databases.

GP