RE: maximum IRP write length

How big is “Big”? This definition changes dynamically and has to be
interpreted in the context of its time and its environment.

In the MFC newsgroup, I regularly got questions like this one:

OP: I have this HUGE file, and I have to do a lot of forward and backward
scanning of the file to do searches. How can I handle this when the
string I’m searching for crosses a buffer boundary? I find it very hard
to get my head around how to do this optimally when doing backward scans.

Me: “HUGE” is not a number. File sizes are expressed in number of bytes.
Please explain what “HUGE” is.

OP: 10MB max, and most files will be somewhat smaller, but no smaller than
about 5MB

Me: Oh, you said HUGE when you meant TINY. The algorithm is:
Ask how big the file is
Get a buffer of that size
Read the entire file into it
Write your code, knowing everything is in memory
OR
Memory-map the file into your address space
Write your code, knowing everything is, or will be as needed, in memory
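
For anyone who wants to see how little code that is, here is a minimal Win32
sketch of both options for a 10MB-class file (the function names are mine, and
error handling and size-overflow checks are trimmed to the essentials):

#include <windows.h>
#include <stdlib.h>

// Option 1: ask how big the file is, get a buffer that size, read it all.
char *ReadWholeFile(const wchar_t *path, DWORD *bytesRead)
{
    HANDLE h = CreateFileW(path, GENERIC_READ, FILE_SHARE_READ, NULL,
                           OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
    LARGE_INTEGER size;
    GetFileSizeEx(h, &size);                                 // how big is it?
    char *buf = (char *)malloc((size_t)size.QuadPart);       // buffer that size
    ReadFile(h, buf, (DWORD)size.QuadPart, bytesRead, NULL); // read it all in
    CloseHandle(h);
    return buf;       // everything is now in memory; scan it any way you like
}

// Option 2: memory-map the file; pages are faulted in as they are touched.
const char *MapWholeFile(const wchar_t *path)
{
    HANDLE hFile = CreateFileW(path, GENERIC_READ, FILE_SHARE_READ, NULL,
                               OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
    HANDLE hMap  = CreateFileMappingW(hFile, NULL, PAGE_READONLY, 0, 0, NULL);
    const char *view = (const char *)MapViewOfFile(hMap, FILE_MAP_READ, 0, 0, 0);
    CloseHandle(hMap);    // the mapped view keeps the mapping object alive
    CloseHandle(hFile);
    return view;          // UnmapViewOfFile when you are done with it
}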

OP: But that means I would be using up 10MB of memory!

Me: How much memory do you have on your machine?

OP: 4GB

Me: 10MB/4GB = ? Why are you concerned with using a fraction of a percent
of your memory?

OP: But the file is still huge

Me: And how big is your virtual address space? 2GB? 3GB? Do the same
arithmetic. You’re still using a fraction of a percent of your address
space.
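
Spelled out, that arithmetic is:

10 MB / 4 GB of physical memory    = 10 / 4096 ≈ 0.24%
10 MB / 2 GB user address space    = 10 / 2048 ≈ 0.49%
10 MB / 3 GB user address space    = 10 / 3072 ≈ 0.33%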

OP: So why isn’t a 10MB file considered huge?

Me: HUGE files are expressed in GB to TB. BIG files are expressed in
multiples of 100MB. SMALL files are under 100 MB. 10MB is TINY.
Anything under 1MB is INFINITESIMAL. This is not a PDP-11 in 1975; this is a
Windows machine in (year > 2000). The tradeoffs are different. Why
write and debug complex code that does not need to exist?

I did not track how many times I had conversations like this on the
newsgroup, but it happened with alarming frequency.
joe

PRECISELY!

The data being written is ALREADY in memory. Unless you’re advocating
writing the data serially (in which case you just opted for a *much*
slower set of operations), you’ll have precisely the same amount of memory
pinned and in use simultaneously regardless of the number of discrete
transfers involved, right?

So, by requiring smaller transfers, all you get is MORE transfers that are
smaller. MORE calls to IoCallDriver (for each device object in the
stack). MORE IRPs. More… everything.
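
As a user-mode sketch of the point (purely to show the call count; the
kernel side multiplies the same way, one IRP per call down every device
object in the stack, and the chunking policy here is just illustrative):

#include <windows.h>

// The data is already sitting in memory either way; the only thing that
// changes is how many trips it takes through the I/O stack.

void WriteInOneGo(HANDLE h, const char *buf, DWORD total)
{
    DWORD written = 0;
    WriteFile(h, buf, total, &written, NULL);         // one call, one IRP chain
}

void WriteInChunks(HANDLE h, const char *buf, DWORD total, DWORD chunk)
{
    DWORD written = 0;
    for (DWORD off = 0; off < total; off += chunk) {  // same bytes, same buffer...
        DWORD n = (total - off < chunk) ? (total - off) : chunk;
        WriteFile(h, buf + off, n, &written, NULL);   // ...but total/chunk calls,
    }                                                 // each one a fresh IRP down
}                                                     // the whole device stack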

People are all hung up on 2GB because it sounds so big… but as Dr.
Newcomer said, “if there is 2GB free in the 64GB memory, why not use it?”
If the machine has 64GB of memory, pinning just over 3% of that memory is
actually pretty reasonable, right?

I *hear* Mr. Grig’s argument that cache thrashing is not good… and I
agree that it would be *highly* advantageous to have a CopyFile API that
did something other than read followed by write. But condemning large
I/Os because they can be used for copy file operations seems to me to be
“throwing the baby out with the bath water” (does that English-language
saying make sense to everyone?). S’pose I’m capturing data from a
collection of high-speed satellite links. I want to be able to collect
and write that data as quickly as possible. If I can get 2GB writes, that
can ONLY be a good thing in my book.

Peter
OSR



xxxxx@flounder.com wrote:

OP: But that means I would be using up 10MB of memory!

Me: How much memory do you have on your machine?

OP: 4GB

Yes, this is a sentiment I hear over and over on the forums. “Oh, my
gosh, Task Manager shows memory usage at 85%! What’s wrong with my
system? Windows is such a memory hog!”

The fact of the matter is that unused memory equals wasted money. If
the memory is there, keep it busy.


Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.

>The fact of the matter is that unused memory equals wasted money. If the memory is there, keep it busy.

The Great Windows Mystery, then, is why it has to page in so often if memory is so plentiful. Ever noticed that when you switch to another Visual Studio instance, you have to wait a few seconds while the disk thrashes? Even when you just switch to another file in VS.

>thrashes? Even when you just switch to another file in VS.

I disagree with this; this operation works very quickly for me and does not require disk accesses.


Maxim S. Shatskih
Microsoft MVP on File System And Storage
xxxxx@storagecraft.com
http://www.storagecraft.com

It seems to be common wisdom on Windows not to be a good citizen, but to grab as much as you can. For example, by default SQL Server will use all the available memory, so we see memory allocation failures more often when our driver coexists with SQL Server. It is like people in some countries who would rather run a deficit than a balanced budget.

But it also makes no sense not to use resources which are available and
will make the task run faster. Consider: on a 64GB system, as Peter points
out, 2GB is about 3% of the available memory. It is not 50%. We have no
data on what happens on a heavily-loaded 4GB machine, and until the OP or
someone else can provide this data, this is a debate with no basis. It
makes no sense to claim that 3% memory usage is going to have even a
measurable impact on overall system performance, let alone a “significant”
one, and that is the only data point we have.

And how is it that SQL, which is a user app that would be paged, could
generate serious memory pressure, unless it is doing Direct I/O of large
buffers? Even that does not necessarily mean it is a Bad Citizen, just an
app that needs lots of resources. Note that on Win32, someone running SQL
Server will probably boot with a 3GB user partition, which significantly
increases kernel pressure. So you can’t make a blanket statement about
“SQL Server” unless you qualify it with the size of the user partition.
Remember that someone who boots 3GB has made a choice to favor SQL over
other uses.

And that leads to a question about the size of the allocations that are
failing. You cannot simply say you have allocation problems unless you
specify both the size of the allocations and the total usage. So in
addition to the lack of facts regarding the rationale for the 2GB buffer
size, you are claiming failures without any supporting data. For example,
are the driver failures due to an actual lack of memory, or are they due
to kernel address space fragmentation?

Note that if I were called in as a consultant on this problem, I would
need all of the above questions answered before I even started to work on
a solution. Either the client would have to supply the data, or pay me to
discover it (which would also mean I would find the answers more
trustworthy than “Before he left, our programmer told us…”. Wait a minute.
This has already happened. But it was in app space, although the problem
was pretty much the same.)
joe




Back to the 2GB size topic: what performance improvement are you expecting from using this big 2GB size? Let’s say we run it on a 4GB system and on a 64GB system, copying the same file; the 4GB system is supposed to use a small IRP size, right? I can run some tests if you are interested.

ren.j@263.net wrote:

It seems to be common wisdom on Windows not to be a good citizen, but to grab as much as you can. For example, by default SQL Server will use all the available memory, so we see memory allocation failures more often when our driver coexists with SQL Server.

That’s just silly. Of COURSE you should be a good citizen. SQL Server
is, in many circumstances, a mission-critical application that is the
only process on the machine. It should use as much memory as it needs
to do its job.

Adobe Photoshop (in its earlier days) is another program that
arbitrarily snatches up a percentage of the available memory for its own
use. Neither one of those is a “good citizen”.


Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.

>Let’s say we run it on a 4G system and a
64G system and copying the same file, so the 4G system is supposed to use small
IRP size, right?

More than that, it just makes sense not to try to use as big an IRP as possible, but to settle on a reasonable (performance-wise) limit, without trying to make it dependent on the system size. There is an advantage in using 64K transfers instead of 4K ones, and in keeping around 1MB worth of pending transfers, but larger transfers (and a larger amount of pending transfers) give very marginal improvements, well below the noise.
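
For what it’s worth, here is roughly what that looks like from user mode: 64K
transfers with about 1MB of them in flight at any time. The batch structure and
helper name are purely illustrative, and error handling is trimmed; the only
numbers taken from the discussion are the 64K transfer size and the ~1MB
pending amount.

#include <windows.h>

#define CHUNK     (64 * 1024)   // 64K per transfer
#define IN_FLIGHT 16            // 16 x 64K = about 1MB pending at a time

// 'h' must be opened with FILE_FLAG_OVERLAPPED; 'buf' holds data already in memory.
void WritePipelined(HANDLE h, const char *buf, ULONGLONG total)
{
    OVERLAPPED ov[IN_FLIGHT];
    ULONGLONG  offset = 0;

    while (offset < total) {
        int issued = 0;
        // Queue up to ~1MB worth of 64K writes.
        for (; issued < IN_FLIGHT && offset < total; issued++) {
            DWORD n = (DWORD)((total - offset < CHUNK) ? (total - offset) : CHUNK);
            ZeroMemory(&ov[issued], sizeof(OVERLAPPED));
            ov[issued].hEvent     = CreateEventW(NULL, TRUE, FALSE, NULL);
            ov[issued].Offset     = (DWORD)(offset & 0xFFFFFFFF);
            ov[issued].OffsetHigh = (DWORD)(offset >> 32);
            WriteFile(h, buf + offset, n, NULL, &ov[issued]);
            offset += n;
        }
        // Let the batch drain before queuing the next one.
        for (int i = 0; i < issued; i++) {
            DWORD written = 0;
            GetOverlappedResult(h, &ov[i], &written, TRUE);
            CloseHandle(ov[i].hEvent);
        }
    }
}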