Filter Driver to speed up access to files?

Folks,

We have a gaming machine which basically loads a lot of image files when the game starts up, and we want to speed up this process. One of my colleagues suggested writing a filter driver to intercept the calls to fopen/fread for a particular list of files (set via the registry) and use part of the physical RAM to hold them. I haven’t written a filter driver or done anything with file systems before. I was thinking of the RAMDISK driver… maybe write something similar and copy all the game’s files to that drive before the game starts.

These game files are only loaded and read. I don’t think they are written to.
Any ideas/suggestions are welcome.

Thanks in Advance.
Shan

The FIRST thing you should do to get better performance is use the native
file I/O APIs like ReadFile, and forget about fopen/fread. Or even better,
use memory-mapped files instead of file I/O APIs of any kind.

The OS will already cache open files, so your app might start up a thread
that sweeps through and reads all the files you will need. If there is
sufficient RAM, they will all then be cached. The OS is MUCH more efficient
at making best use of RAM than your filter driver could be. If these files
are memory-mapped files, the end result will just be a memory address with
the file contents cached in RAM. The OS memory-mapped file prefetch might
just do this for you when you go to touch the files.

Both these techniques are standard app optimizations.

If it’s a server, you might also put these files on super-fast media, like
a local high-performance flash disk. I believe these can achieve transfer
rates in excess of a gigabyte/sec, although you will pay for the performance.
It sounds like you may want high sequential access performance, to preload
RAM, so you might also just use striped rotating disks.

Writing any kind of driver is not what you should be exploring.

Jan

-----Original Message-----
From: xxxxx@lists.osr.com [mailto:bounce-395770-xxxxx@lists.osr.com] On Behalf Of xxxxx@gmail.com
Sent: Sunday, January 10, 2010 11:28 AM
To: Windows System Software Devs Interest List
Subject: [ntdev] Filter Driver to speed up access to files?

NTDEV is sponsored by OSR

> load them there. I haven’t done any filter driver or anything to do with File System stuff. I was thinking of the RAMDISK driver

A RAMDISK will hardly be faster than the Windows file cache, especially if it uses pageable memory.

Write a small app in any language which will list all these files, and read each of them to nowhere - just read() or fread() to a scratch buffer.

Run this app just before the game.
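
A minimal sketch of such a preloader in portable C, offered as an illustration rather than a tuned implementation. The function names `preload_file` and `preload_files` and the 64 KB scratch-buffer size are assumptions made up for this example; nothing here comes from the game itself.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: read one file from start to end into a scratch
 * buffer and throw the data away. The only purpose is to make the OS
 * pull the file through its file cache. Returns the number of bytes
 * read, or -1 if the file cannot be opened or a read error occurs. */
long long preload_file(const char *path)
{
    enum { SCRATCH_SIZE = 64 * 1024 };  /* assumed size; tune by measuring */
    static char scratch[SCRATCH_SIZE];  /* shared buffer, not thread-safe */
    long long total = 0;
    size_t n;
    FILE *f;

    assert(path != NULL);
    f = fopen(path, "rb");
    if (f == NULL)
        return -1;

    /* The contents are ignored; only the side effect of reading matters. */
    while ((n = fread(scratch, 1, sizeof scratch, f)) > 0)
        total += (long long)n;

    if (ferror(f)) {
        fclose(f);
        return -1;
    }
    fclose(f);
    return total;
}

/* Preload every file in a list; returns how many were read completely. */
int preload_files(const char *const *paths, int count)
{
    int ok = 0;
    for (int i = 0; i < count; i++) {
        if (preload_file(paths[i]) >= 0)
            ok++;
    }
    return ok;
}
```

Called once with the list of image files before the game is launched, this leaves the data in the file cache only for as long as the system is not under memory pressure.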


Maxim S. Shatskih
Windows DDK MVP
xxxxx@storagecraft.com
http://www.storagecraft.com

> file I/O APIs like ReadFile, and forget about fopen/fread. Or even better, use memory mapped files instead of file I/O APIs of any kind.

Not better at all; this pollutes the physical pages a lot.

Explicit noncached I/O with a not-so-large buffer will be far faster for some scenarios, like a sequential read of a multi-GB file.

But: this assumes that they have control over the syscalls used by the game, which is probably not true.


Maxim S. Shatskih
Windows DDK MVP
xxxxx@storagecraft.com
http://www.storagecraft.com

Defragmenting can do wonders.

–pa

Thanks guys for the responses.

> Write a small app in any language which will list all these files, and read each of them to nowhere - just read() or fread() to a scratch buffer.

I will try this: write an app to read the image files. When the app closes, are the files still in memory?

The system is not a server. It’s a standard gaming machine which runs Windows XP Embedded. I wish I could put in a solid-state drive, but that’s going to increase the cost of the terminal. The system has 2GB of DDR3 RAM.

> Defragmenting can do wonders.

I am running the contig.exe tool from Sysinternals. I’ll have to wait and see if there is any performance improvement.

I think the game uses Win32 APIs to open files, not fopen/fread.

xxxxx@gmail.com wrote:

> We have a gaming machine which basically loads a lot of image files when the game starts up and we want to speed up this process. One of my colleagues suggested writing a filter driver to intercept the calls to fopen/fread to a particular list of files (set via registry) and use part of the physical RAM to load them there.

It’s interesting to me that you would propose this. How do you think
the file APIs currently work? At their core, they all read from disk
into physical RAM. Anything you do is going to slow that down.

It’s certainly true that you can reduce the latency by reducing the
number of layers, which means using ReadFile directly instead of fopen,
but if you’re reading blocks of a reasonable size, fread vectors almost
directly to ReadFile. You might be able to do a bit better by mapping
the files into memory, but that depends on how much data there is
and how you use the files.

The key to increasing your performance is to use the tools you have to
their best advantage before you go off to write another tool.


Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.

xxxxx@gmail.com wrote:

> I am running the contig.exe tool from sysinternals. Have to wait and see if there is any performance improvement.

Have you actually measured the performance, to see how many bytes per
second you are actually achieving? If not, then you are just guessing.
You need to know where you are, and what is achievable, before you
embark on an improvement plan.
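
One rough way to get that number, sketched in C under stated assumptions: the helper names `time_file_read` and `bytes_per_second` are invented for this example, the 64 KB buffer is arbitrary, and `timespec_get` needs a C11 runtime (an older toolchain would have to substitute another wall-clock timer).

```c
#include <assert.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical helper: read a whole file with fread() and report the
 * byte count and the elapsed wall-clock time in seconds.
 * Returns 0 on success, -1 if the file cannot be opened or read. */
int time_file_read(const char *path, long long *bytes, double *seconds)
{
    static char buf[64 * 1024];   /* assumed buffer size, not thread-safe */
    struct timespec t0, t1;
    size_t n;
    FILE *f;

    assert(path != NULL && bytes != NULL && seconds != NULL);
    f = fopen(path, "rb");
    if (f == NULL)
        return -1;

    *bytes = 0;
    timespec_get(&t0, TIME_UTC);              /* C11 wall-clock timestamp */
    while ((n = fread(buf, 1, sizeof buf, f)) > 0)
        *bytes += (long long)n;
    timespec_get(&t1, TIME_UTC);

    int failed = ferror(f);
    fclose(f);
    if (failed)
        return -1;

    *seconds = (double)(t1.tv_sec - t0.tv_sec)
             + (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
    return 0;
}

/* Bytes per second; returns 0.0 when the interval is too short to measure. */
double bytes_per_second(long long bytes, double seconds)
{
    return seconds > 0.0 ? (double)bytes / seconds : 0.0;
}
```

Timing the same file right after a reboot and then a second time gives a cold (disk) figure and a warm (file cache) figure, which is the baseline any improvement has to beat.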


Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.

> It’s interesting to me that you would propose this. How do you think the file APIs currently work? At their core, they all read from disk into physical RAM. Anything you do is going to slow that down.

Agreed. I would like to preload the files into memory and let them stay there until shutdown (kind of locking, if that term is right). The game process starts different games based on user input; when the actual game app starts, it would not have to load the image files from disk, since they would already be in physical RAM.

One other thing: for the physical pages allocated for the file, does it make any difference in performance whether they are contiguous or discontiguous in physical memory?

Well, as previous replies have said, given lack of memory pressure, the Cache Manager already DOES load them (albeit on first access) and leave them in memory.

If this is a “file by file” thing you’re talking about, you’re really not talking about a per-device block-level cache, right? You’re talking about a file-level, file-system type cache.

> One other thing: for the physical pages allocated for the file, does it make any
> difference in performance whether they are contiguous or discontiguous in physical
> memory?

DUDE! You’re worrying about some detail of implementation before you’ve figured out what you’re implementing!!

Go back and read some of the replies you’ve already received:

a) RAM disk
b) Simple app to open all the files you care about and read them

Those are the same as far as Windows is concerned. If you don’t know this detail, you really, really don’t want to even THINK about writing any sort of driver. No disrespect intended, but seriously.

Peter
OSR

xxxxx@gmail.com wrote:

>> It’s interesting to me that you would propose this. How do you think the file APIs currently work? At their core, they all read from disk into physical RAM. Anything you do is going to slow that down.
>
> Agreed. I would like to preload the files into the memory and let them stay there in the memory until shutdown (kind of locking, if that term is right). When the game process starts, it’s going to start different games based on user input; when the actual game app starts, it does not have to load the image files from the disk since they are already in physical RAM.

The Windows paging system is the result of 20 years of testing,
development, and extensive performance tuning. It is extremely unlikely
that your interference will do anything other than decrease the
overall performance. Windows is very good at noticing which pages are
being used and which are not. If your image files are not being used,
then it will be better for the overall performance of your game to let
those pages get used for other things.

> One other thing: for the physical pages allocated for the file, does it make any difference in performance whether they are contiguous or discontiguous in physical memory?

You should be able to figure this out. It makes no difference, which is
good, because this isn’t under your control anyway.


Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.

> One other thing: for the physical pages allocated for the file, does it make any difference in performance whether they are contiguous or discontiguous in physical memory?

No.


Maxim S. Shatskih
Windows DDK MVP
xxxxx@storagecraft.com
http://www.storagecraft.com

Oh, I don’t agree with THAT at ALL.

Well, that’s assuming (and the OP hasn’t told us a whole hell of a lot, really): He’s got a specific target workload in mind (his gaming machine that accesses some files), and he wants to optimize THAT target workload at the expense of the performance of other things on the machine.

While Windows is pretty well tuned for the general case, it’s usually ALWAYS possible to optimize a specific case – sometimes very significantly.

So, for example… he could pin all his images in memory, causing the working sets of everything else running in the system to be trimmed to their minimum. So, the performance of the system doing anything OTHER than running his game program or accessing these images might suck. But he might not care, right??

Peter
OSR

Thanks folks for the responses. It looks like writing a File System Filter driver is a bad idea in this case.

> While Windows is pretty well tuned for the general case, it’s usually ALWAYS
> possible to optimize a specific case – sometimes very significantly

Are there any specific optimizations that you have in mind?
The system just runs the game engine and the games (based on user input) and no other applications.
It runs on Windows XP Embedded SP3.

Folks,
For some reasons, I need to write this file system filter driver. The reasons are not technical, so I’d rather not go into them. As some of you have already mentioned, it may very well degrade system performance or, worse, create hard-to-track bugs, bug checks, etc. On the bright side, at least I am learning something new here.

So I have started with a simple driver based on the examples in the WDK (the examples are easy to understand, thanks to Microsoft). It just looks for a particular list of files, and if it finds one and the request is a read, it returns some dummy data for now.

I would like to reserve 256MB (initially) of physical RAM pages (total RAM is 2GB) and store all the file contents in them, using MmAllocatePagesForMdl or another API to allocate the pages. Whenever a file read request comes through, the driver would map the physical pages where the file contents are located to a kernel virtual address, copy to the user buffer, and then unmap them. Does that make sense?

I think getting 256MB of contiguous kernel virtual address range is going to be difficult, considering that the Gfx wants 128MB or more. So we may need to map/unmap the physical pages whenever we want to do the copy. I assume there will be a performance penalty because of this.

Any ideas/suggestions (flames as well) regarding this are welcome!

> I like to reserve 256MB (initially) of physical RAM (total RAM is 2GB) pages and store all the file contents in them. Use MmAllocatePagesForMdl/other API to allocate the pages. And whenever a file read request comes through, map those physical pages (where the file contents are located) to kernel virtual address and do the copy to the user-buffer and then unmap it. Does it makes sense?

I don’t think it is possible. On Windows XP the maximum size of nonpaged pool is 256MB. You are going to use all of the nonpaged memory.

Igor Sharovar

> I don’t think it is possible. For Windows XP the maximum limit of non page pool is 256M. You are going to use all non page memory.

Yes, you are right. But I need to map to kernel space only when I need to do a copy to the user buffer, and then I can unmap the pages. There might be some performance penalty, I guess.

> Yes you are right. But I need to map to kernel space only when I need to do a copy to user buffer and then I can unmap the pages.

That would not guarantee that the driver gets the memory it needs. Nonpaged pool is a resource shared among many components.

Igor Sharovar

>> I like to reserve 256MB ( initially) of physical RAM ( total RAM is
>> 2GB ) pages and store all the file contents in them. Use
>> MmAllocatePagesForMdl/other API to allocate the pages.
>
> I don’t think it is possible. For Windows XP the maximum limit of
> non page pool is 256M. You are going to use all non page memory.

MmAllocatePagesForMdl doesn’t use nonpaged pool. It obtains
pages from the standby/free/zeroed lists, so you can technically allocate
as much memory as is currently available in the system (if it’s more than
4 GB then you’ll need to do multiple allocations, because an MDL can
only describe up to 4 GB worth of pages).

Note that available (= standby + free + zeroed) memory can fluctuate
a lot, so even on a system with lots of RAM you can’t assume that a
256 MB allocation will always succeed…


Pavel Lebedinsky/Windows Fundamentals Test
This posting is provided “AS IS” with no warranties, and confers no rights.

> MmAllocatePagesForMdl doesn’t use nonpaged pool. It obtains pages from the standby/free/zeroed lists, so you can technically allocate as much memory as is currently available in the system (if it’s more than 4 GB then you’ll need to do multiple allocations, because an MDL can only describe up to 4 GB worth of pages).

If I allocate 256MB of pages (assuming they’re available and the call succeeds) using MmAllocatePagesForMdl, what API should I use to map just, say, 16MB of those pages into kernel space, sort of like a window? Then, after doing the copy to the user buffer, I can unmap the pages again.