Wonders of paged code sections.

Alex_Grig Member Posts: 3,238
Just got a crash stack for a bugcheck DRIVER_IRQL_NOT_LESS_OR_EQUAL:

FAILED_INSTRUCTION_ADDRESS:
NDIS!Rtl::KNeutralLock<enum NDIS_MINIPORT_POLICY_OWNER>::Release+0
fffff800`00704494 ?? ???

STACK_TEXT:
nt!KeBugCheckEx
nt!KiBugCheckDispatch+0x69
nt!KiPageFault+0x23a
NDIS!Rtl::KNeutralLock<enum NDIS_MINIPORT_POLICY_OWNER>::Release
NDIS!ndisSetDevicePowerOnComplete+0x11ea8
nt!IovpLocalCompletionRoutine+0x174
nt!IopfCompleteRequest+0x438
nt!IovCompleteRequest+0x1d7
pci!PciPowerUpDeviceTimerDpc+0x14d
nt!KiRetireDpcList+0x6b2
nt!KiIdleLoop+0x5a

OF COURSE, "NDIS!Rtl::KNeutralLock<enum NDIS_MINIPORT_POLICY_OWNER>::Release" is in a paged code section.

THANKS, MICROSOFT, for caring about a (systemwide total of a) couple of megabytes of working set, at the mere cost of a few bugchecks. A cute sad face every now and then doesn't make anyone off themselves, does it?
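
For anyone not used to reading these stacks: pci!PciPowerUpDeviceTimerDpc runs at DISPATCH_LEVEL, and code running at DISPATCH_LEVEL or above must not touch pageable memory, because the page fault can't be serviced there. Below is a minimal sketch of how driver code ends up in a paged section, and the guard that is supposed to protect it. The routine name is mine, purely illustrative, not the actual NDIS internal:

    #include <wdm.h>

    VOID ExampleReleasePolicyLock(VOID);   /* hypothetical routine, standing in for the NDIS helper */

    /*
     * Place the routine in the PAGE section of the driver image.  Pages in
     * this section can be trimmed by MM, so the code may literally not be
     * resident when it is called.
     */
    #pragma alloc_text(PAGE, ExampleReleasePolicyLock)

    VOID
    ExampleReleasePolicyLock(VOID)
    {
        /*
         * PAGED_CODE() asserts IRQL <= APC_LEVEL on checked builds; it is
         * the conventional guard for anything that lives in a paged section.
         * If a DPC (running at DISPATCH_LEVEL) reaches this routine while
         * the page happens to be resident, nothing visibly goes wrong; if
         * the page has been trimmed, the fault cannot be satisfied at
         * DISPATCH_LEVEL and the system bugchecks with
         * DRIVER_IRQL_NOT_LESS_OR_EQUAL -- exactly the crash in the stack
         * above.
         */
        PAGED_CODE();

        /* ... release whatever lock or resource lives here ... */
    }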

Comments

  • Jeffrey_Tippet_[MSFT] Member - All Emails Posts: 524
    Thanks for the bug report - I can confirm this is an issue in the current implementation of NDIS.SYS.

    Memory pressure is the gating factor in phones, embedded devices, and highly-dense VM deployments. Footprint matters.
  • Alex_Grig Member Posts: 3,238
    >Memory pressure is the gating factor in phones, embedded devices, and highly-dense VM deployments.

    Having observed obscene page-in effects on a 4 GB Windows 7 laptop that never commits more than 2-2.5 GB, I'd say that memory pressure should not be the first concern of the Windows MM. Bad page retention policies should concern MS more.
  • OSR_Community_User Member Posts: 110,217
    A page should never be paged out and its physical reality removed. My
    64-bit win7 8-core w/8 GB is slower than its 32-bit, 4-core, 2GB, XP
    predecessor. If I let a program sit idle for a couple hours I can click
    on its window, then go out and prepare a 7-course dinner for 6 people
    before the app is willing to be active. The disk light would induce
    seizures if I kept staring at it. There's Something Wrong With This
    Picture.

    (OK, I'm exaggerating a bit; it's probably more like nuking a slice of
    pizza).

    And are you suggesting that (a) One Size^H^H^H^HPolicy Fits All and (b)
    the OS can't tell the difference between a 64GB desktop and a 2GB phone
    environment?
    joe

  • Alex_Grig Member Posts: 3,238
    >Memory pressure is the gating factor in phones, embedded devices, and
    highly-dense VM deployments

    Hyper-V needs to learn to de-dupe read-only executable pages. That would relieve much more memory pressure than paging kernel code does.
  • OSR_Community_User Member Posts: 110,217
    FWIW Linux suffers from the same problem: the default "swappiness" value
    is 60 and Ubuntu recommend setting it to 10
    (https://help.ubuntu.com/community/SwapFaq#What_is_swappiness_and_how_do_I_change_it.3F)
    to reduce how readily swap will be used.

    --
    Bruce

  • Peter_Viscarola_(OSR) Administrator Posts: 7,262
    We're conflating multiple issues here.

    Pageable kernel mode code... Not bad conceptually, but routinely trying to figure out how to make small parts of your driver pageable is spending time looking for trouble, in my opinion. The bigger question is how the bug in NDIS.SYS wasn't discovered by Verifier.
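
    Part of the "looking for trouble" is the bookkeeping: if a pageable routine can be reached from a path that may run at elevated IRQL (a power-down sequence, say), the driver has to pin that section resident across the window. A rough sketch of what that looks like -- the callback names here are made up, not from any real driver:

        #include <wdm.h>

        VOID ExampleHelperInPagedSection(VOID);       /* illustrative name */

        #pragma alloc_text(PAGE, ExampleHelperInPagedSection)

        static PVOID g_PagedSectionHandle;            /* handle returned by MmLockPagableCodeSection */

        VOID
        ExampleHelperInPagedSection(VOID)
        {
            PAGED_CODE();
            /* ... work that is normally safe to page out ... */
        }

        VOID
        ExampleBeginPowerDown(VOID)                   /* made-up callback name */
        {
            /*
             * Pin the paged section containing the helper into RAM for the
             * duration of the power-down sequence, so it is safe to reach
             * even if a later step runs at DISPATCH_LEVEL (a completion
             * routine or a DPC, as in the crash above).
             */
            g_PagedSectionHandle = MmLockPagableCodeSection((PVOID)ExampleHelperInPagedSection);
        }

        VOID
        ExamplePowerDownComplete(VOID)                /* made-up callback name */
        {
            /* Let MM page the section out again once the risky window has closed. */
            MmUnlockPagableImageSection(g_PagedSectionHandle);
        }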

    On the other point: Windows' policies for trimming process working sets have been controversial for more than a decade. It's a hard problem to solve.

    Peter
    OSR
    @OSRDrivers

  • Alex_Grig Member Posts: 3,238
    >On the other point: Windows policy for trimming a process working set have been controversial for more than a decade. It's a hard problem to solve.

    In the meantime, RAM sizes have made working-set trimming something that does NOT need to be done routinely... How long did it take Microsoft to stop trimming a process's WS on minimize (and on TS session disconnect, which used to minimize all windows)?

    The Windows file cache also seems to try to keep files cached indefinitely, without ever trying to limit the cache size (in the good old days, Win9x had a registry-specified cache limit, though). When some crappy file scanner/indexer opens lots of files in cached mode (or, equally bad, opens them as file sections), the executable pages seem to fall victim to that quickly.
  • Peter_Viscarola_(OSR) Administrator Posts: 7,262
    <quote>
    Windows file cache, also, seems to try to keep files cached indefinitely
    </quote>

    In fact, it does exactly this. Files are not kicked out of cache until there's memory pressure.

    I don't know how the memory used for file cache is balanced against process memory. I should look that up...

    Peter
    OSR
    @OSRDrivers

  • Jeffrey_Tippet_[MSFT] Member - All Emails Posts: 524
    > The big question should be how the bug in NDIS.SYS wasn't discovered by Verifier.

    That is a good question. I added the code in question late in the release, in an error path in a power transition. Apparently my testing did not cover this path while DV was enabled. The static analysis toolset doesn't seem to catch this either, so I'll have to investigate that too.
  • Peter_Viscarola_(OSR) Administrator Posts: 7,262
    Thanks Jeffrey.

    Have I ever told you how great it is to have you contribute here? Your comments are always frank, helpful, cordial, and greatly appreciated.

    Peter
    OSR
    @OSRDrivers

  • Alex_Grig Member Posts: 3,238
    @M M:

    >As all of the veterans here know, page trimming and replacement as well as cache
    balancing algorithm design makes NP hard issues seem trivial.

    This is one of those problems that doesn't have to have the "best" solution. It needs to be good enough and, most importantly, avoid pathological behaviors. Pathological cases usually happen when the programmer tries to make the behavior too smart for its own good, or overly aggressive.

    Avoiding pathological cases is more important than getting the best performance. You only see the "best performance" in artificial benchmarks anyway, and when the gain is a few single percent, users won't notice it, because it's below the noise.

    I've read about a typical case of misguided optimization. A programmer's blog on MSDN had an article about converting the PDB-writing component (in LINK) to be multi-threaded, because measurements showed that it takes significant time. Not once did the programmer ask themselves: "Why would a glorified binary formatting routine be noticeably slow? Could it be because LINK writes the PDB as a compressed file? Could its write pattern cause excessive compression overhead?"
  • David_R._Cattley Member - All Emails Posts: 2,112
    > Have I ever told you how great it is to have you contribute here? Your
    comments are always frank, helpful, cordial, and greatly appreciated.

    +1 (well, way more than that actually)

    Dave Cattley
  • OSR_Community_User Member Posts: 110,217
    Why is LRU used for page replacement? Because at IBM, in the early 1970s,
    one researcher asked "What if we knew the future?" and to that end
    instrumented the OS to track page references. Then he took the trace (to
    give you an idea of when this was done, the trace was on a magtape) and
    built a simulator that could look into the "future" to decide which page
    was furthest in the future from being referenced, and simulated paging it
    out. His overall performance was only about 5% better than LRU. Since the
    problem of tracking past behavior was easy to solve, and predicting the
    future was somewhat more difficult, LRU became the algorithm of choice.
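
    For the curious, the comparison is easy to reproduce at toy scale. A small user-mode sketch (mine, not from any of the studies mentioned) that replays one reference string through both LRU and the clairvoyant OPT policy with the same number of frames and prints the fault counts:

        #include <stdio.h>

        #define FRAMES 3
        #define EMPTY  -1

        /* LRU: on a fault, evict the frame whose last use is oldest. */
        static int lru_faults(const int *trace, int n)
        {
            int frame[FRAMES], last_use[FRAMES], faults = 0;
            for (int i = 0; i < FRAMES; i++) frame[i] = EMPTY;

            for (int t = 0; t < n; t++) {
                int hit = -1, victim = 0;
                for (int i = 0; i < FRAMES; i++)
                    if (frame[i] == trace[t]) hit = i;
                if (hit >= 0) { last_use[hit] = t; continue; }

                faults++;
                for (int i = 1; i < FRAMES; i++)    /* prefer an empty frame, else the oldest use */
                    if (frame[i] == EMPTY ||
                        (frame[victim] != EMPTY && last_use[i] < last_use[victim]))
                        victim = i;
                frame[victim] = trace[t];
                last_use[victim] = t;
            }
            return faults;
        }

        /* OPT (Belady): evict the frame whose next reference lies furthest in
         * the future -- which requires knowing the whole trace in advance. */
        static int opt_faults(const int *trace, int n)
        {
            int frame[FRAMES], faults = 0;
            for (int i = 0; i < FRAMES; i++) frame[i] = EMPTY;

            for (int t = 0; t < n; t++) {
                int hit = 0, victim = 0, victim_next = -1;
                for (int i = 0; i < FRAMES; i++)
                    if (frame[i] == trace[t]) { hit = 1; break; }
                if (hit) continue;

                faults++;
                for (int i = 0; i < FRAMES; i++) {
                    int next = n;                   /* n means "never referenced again" */
                    if (frame[i] == EMPTY) { victim = i; break; }
                    for (int u = t + 1; u < n; u++)
                        if (trace[u] == frame[i]) { next = u; break; }
                    if (next > victim_next) { victim = i; victim_next = next; }
                }
                frame[victim] = trace[t];
            }
            return faults;
        }

        int main(void)
        {
            /* An arbitrary reference string; substitute a recorded trace to taste. */
            const int trace[] = { 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5 };
            int n = (int)(sizeof trace / sizeof trace[0]);

            printf("LRU faults: %d\n", lru_faults(trace, n));
            printf("OPT faults: %d\n", opt_faults(trace, n));
            return 0;
        }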

    A friend of mine did his PhD dissertation on optimal register allocation
    in a compiler. There is a known best algorithm for doing this that
    involves arithmetic on gigantic binary matrices at each value creation
    point. I don't recall the complexity, but it was certainly exponential;
    he used to run overnight batch jobs to compile tiny subroutines. The
    compiler we used for production, however, had a heuristic algorithm whose
    complexity was linear in the number of value creations. Like LRU, his
    algorithm produced code only about 5% better than the production
    compiler. So the linear heuristic was "good enough".

    Do we have any similar studies showing aggressive page-out to give
    comparable performance to lazy page-out? The experience of many of us
    suggests that aggressive page-out produces user-perceptible and serious
    degradation of response, particularly when memory is abundant. We already
    know that techniques which optimize server performance are different than
    techniques that optimize highly-interactive environments; an algorithm
    that improves performance under high memory pressure appears to be less
    successful when memory pressure is low, and the suggestion seems to be
    that the system should be adaptive to its environment rather than trying
    to use a single algorithm for 2GB phones running a small number of apps
    and 16GB workstations running 50-100 apps.

    I am not optimistic; look how many years it took to get GDI to stop
    failing when lots of apps needed lots of GDI space, and the same limits
    were used for multicore 4GB Win32 machines as had been used for 2GB Win16
    machines.
    joe

  • Alex_Grig Member Posts: 3,238
    >I am not optimistic; look how many years it took to get GDI to stop failing when lots of apps needed lots of GDI space

    I wonder if, many years from now, when MS finally meets the same fate as Nokia, Windows will still be resetting the video driver because "it took too much time", making your screen go blank at random times.
  • Peter_Viscarola_(OSR) Administrator Posts: 7,262
    I WILL tell you that I know the guy who owns the Windows Memory Manager, and he's very, very clever. In his first few years on the job, he made more improvements to MM than had been made in the entire time prior to his arrival. Unlike some other subsystems, over the years the MM has dramatically evolved, improved, and gotten dramatically more sophisticated... all while staying uber-reliable.

    So moan about it if you must, and I agree it ain't perfect, but this guy's done a far better job than I could have done.

    Peter
    OSR
    @OSRDrivers

  • Peter_Viscarola_(OSR) Administrator Posts: 7,262
    <quote>
    I WILL tell you that I know the guy who owns the Windows Memory Manager, and
    he's very, very clever. In his first few years on the job, he made more
    improvements to MM than had been made in the entire time prior to his
    arrival. Unlike some other subsystems, over the years the MM has dramatically
    evolved, improved, and gotten dramatically more sophisticated... all while
    staying uber-reliable.
    </quote>

    It occurred to me that some folks might misinterpret who I was referring to here...

    So to be perfectly clear: I am referring to Landy Wang, who has artfully evolved the memory manager from its humble origins to the NUMA-aware beast it is today. From the days when paging writes were a maximum of 64K, to today, when you routinely see multi-megabyte writes.

    So while you may differ with how a particular policy is implemented, think for a minute: When's the last time you had a beef with the reliability of MM?

    Peter
    OSR
    @OSRDrivers

  • Don_Burn Member - All Emails Posts: 1,653
    Landy's talks on the memory manager are one of the things I most miss about
    the demise of WinHEC. They were one of the talks that disproved the "DDC
    lite" label for WinHEC. Of course now we have BUILD, with no content
    announcements, and the general description from many folks I know who have
    been there is either "PDC lite" or "WinHEC ultra lite".


    Don Burn
    Windows Filesystem and Driver Consulting
    Website: http://www.windrvr.com
    Blog: http://msmvps.com/blogs/WinDrvr
  • Prokash_Sinha Member - All Emails Posts: 197
    Joe,

    Things change very rapidly, and as stated here, the owner of the MM is
    one of a kind, no doubt about that... Here the problem is correctness
    (a corner-case bug). We don't know how many adaptive approaches were
    used, but I suspect quite a lot already.

    In the past, analysis of algorithms was mainly based on the offline
    approach. The MM, the scheduler -- all of these are non-deterministic and
    try to handle hard problems. But in a lot of cases a newer approach
    (which sat gathering dust for about 20 years), called competitive cost
    analysis, is being used. It is aimed specifically at online algorithms.
    It is a powerful form of analysis, but when it comes down to implementing
    it with real knowledge of the depth of the problems, it is pure art, and
    the MM owner is one such artist, really ...

    -pro
  • Alex_Grig Member Posts: 3,238
    >So while you may differ with how a particular policy is implemented, think for a minute: When's the last time you had a beef with the reliability of MM?

    The MM can be perfectly robust and reliable, but at the same time suffer from obsessive-compulsive file data retention. It's only relatively recently (Vista+) that the cache policy for USB drives started flushing dirty pages promptly. In Win2003 (God forbid I mention it) I was very surprised to be getting "data cannot be flushed" warnings on "unsafe" unplug of a USB stick, EVEN IF NO FILES WERE MODIFIED.
  • Peter_Viscarola_(OSR) Administrator Posts: 7,262
    <quote>
    The MM can be perfectly robust and reliable, but at the same time suffer
    from obsessive-compulsive file data retention.
    </quote>

    Not to split hairs, but the file cache is managed by the cache manager (CC) not the memory manager (MM). Two intertwined but very different modules, owned by two different people in two entirely separate groups (well, when last I checked they were in different groups anyways -- unsurprisingly, CC was owned by the file systems people).

    Peter
    OSR
    @OSRDrivers

  • Chris_Aseltine Member Posts: 1,228
    Peter Viscarola (OSR) wrote:

    > So while you may differ with how a particular policy is
    > implemented, think for a minute: When's the last time
    > you had a beef with the reliability of MM?

    Well, there was that crash in MmGetSystemRoutineAddress...
  • Phil_Barila Member - All Emails Posts: 148
    > Peter Viscarola (OSR) wrote:
    >
    > > So while you may differ with how a particular policy is implemented,
    > > think for a minute: When's the last time you had a beef with the
    > > reliability of MM?
    >
    > Well, there was that crash in MmGetSystemRoutineAddress...

    Thanks for illustrating Peter's point. That was a lot of years ago.

    Phil

    Not speaking for LogRhythm
  • Alex_Grig Member Posts: 3,238
    >Well, there was that crash in MmGetSystemRoutineAddress...

    In all fairness, the only thing that links it to MM is the Mm* prefix.
  • Pavel_Lebedinsky Member - All Emails Posts: 435
    Cc will unmap the file (putting pages on the standby list) when the app closes its handle. If the app is reading a large file sequentially Cc will also unmap each region after it's been accessed.

    So a typical file indexer kind of app should not be able to inflate the system cache to the point where the OS has to trim processes. What it can do is dump a lot of pages onto the standby list, potentially pushing other more useful pages from memory.

    To avoid that, an app that wants to be a good citizen can lower its page priority via SetThreadInformation, or by entering background mode (THREAD_MODE_BACKGROUND etc). In some cases the system can do this automatically, but if you want to reduce the impact on the rest of the system it's better to be explicit.

    Using non-cached IO in scenarios like file indexing can actually make things worse because NTFS will attempt to purge cached files from memory when they are opened for non-cached access.

    TL;DR:

    If you need to scan a lot of files and you don't know who else might be using them:

    1. Open files for cached access (or memory map them).
    2. Supply appropriate hints (e.g. FILE_FLAG_SEQUENTIAL_SCAN if reading sequentially).
    3. Lower your page priority via SetThreadInformation or the background mode API (a sketch follows below).
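
    A user-mode sketch of those three points together (SetThreadInformation with ThreadMemoryPriority needs Windows 8 or later; the path and the "scan" loop are placeholders, not a real indexer):

        #define _WIN32_WINNT 0x0602                       /* Windows 8, for ThreadMemoryPriority */
        #include <windows.h>
        #include <stdio.h>

        int main(void)
        {
            /* (3) Lower memory priority so whatever we drag onto the standby
             *     list is first in line to be repurposed; background mode also
             *     drops CPU and I/O priority for this thread. */
            MEMORY_PRIORITY_INFORMATION memPrio = { MEMORY_PRIORITY_LOW };
            SetThreadInformation(GetCurrentThread(), ThreadMemoryPriority,
                                 &memPrio, sizeof(memPrio));
            SetThreadPriority(GetCurrentThread(), THREAD_MODE_BACKGROUND_BEGIN);

            /* (1) + (2) Cached access with a sequential-scan hint: Cc reads
             *     ahead and unmaps regions behind us as we go. */
            HANDLE h = CreateFileW(L"C:\\some\\large\\file.dat",   /* placeholder path */
                                   GENERIC_READ, FILE_SHARE_READ, NULL,
                                   OPEN_EXISTING, FILE_FLAG_SEQUENTIAL_SCAN, NULL);
            if (h == INVALID_HANDLE_VALUE) {
                printf("open failed: %lu\n", GetLastError());
                return 1;
            }

            static BYTE buf[64 * 1024];
            DWORD got;
            unsigned long long total = 0;
            while (ReadFile(h, buf, sizeof(buf), &got, NULL) && got != 0)
                total += got;                             /* "index" the data here */

            CloseHandle(h);
            SetThreadPriority(GetCurrentThread(), THREAD_MODE_BACKGROUND_END);
            printf("scanned %llu bytes\n", total);
            return 0;
        }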

  • Pavel_Lebedinsky Member - All Emails Posts: 435
    That sounds very much like the driver bug that was just discussed here (the one that triggers bugcheck 1A/0x61946 in win8+):

    http://www.osronline.com/showThread.cfm?link=255401

    <...>
    In Win2003 (God forbid I mention it) I was very surprised to be getting "data cannot be flushed" warnings on "unsafe" unplug of a USB stick, EVEN IF NO FILES WERE MODIFIED.
  • Alex_Grig Member Posts: 3,238
    >Cc will unmap the file (putting pages on the standby list) when the app closes its handle.

    What if a file mapping object doesn't have open handles to its FILE_OBJECT, like all loaded EXEs and DLLs? Does that put it at a disadvantage against currently open files?
  • Pavel_Lebedinsky Member - All Emails Posts: 435
    Cc doesn't know anything about loaded EXEs and DLLs.

    Cc maps views of files accessed via ReadFile/WriteFile. These views are in the system VA space, and Cc decides when to unmap them.

    User programs can also map views, via MapViewOfFile/NtMapViewOfSection (LoadLibrary internally calls NtMapViewOfSection). These views are in the VA space of the process, and the process decides when to unmap them.

    Orthogonally to all that, Mm can decide to trim active pages from a mapped view. When it decides which pages to trim first it looks at various things, but whether the file has any open handles is not one of them.
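
    To make the distinction concrete, a minimal user-mode sketch of the second kind of view (the path is a placeholder): the view lives in this process's VA space until the process unmaps it, and in the meantime Mm is free to trim its active pages regardless of whether anyone still holds a handle to the file.

        #include <windows.h>
        #include <stdio.h>

        int main(void)
        {
            /* Open the file and create a read-only section object over it. */
            HANDLE file = CreateFileW(L"C:\\some\\file.dat",      /* placeholder path */
                                      GENERIC_READ, FILE_SHARE_READ, NULL,
                                      OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
            if (file == INVALID_HANDLE_VALUE) return 1;

            HANDLE section = CreateFileMappingW(file, NULL, PAGE_READONLY, 0, 0, NULL);
            if (section == NULL) { CloseHandle(file); return 1; }

            /* Map a view into this process's VA space.  From here on the
             * pages are backed by the file; touching them faults them in,
             * and Mm may trim them whenever it likes -- but the VA stays
             * valid until we unmap it. */
            const unsigned char *view = MapViewOfFile(section, FILE_MAP_READ, 0, 0, 0);
            if (view != NULL) {
                LARGE_INTEGER size;
                GetFileSizeEx(file, &size);
                unsigned long long sum = 0;
                for (LONGLONG i = 0; i < size.QuadPart; i++)
                    sum += view[i];                      /* arbitrary "use" of the data */
                printf("checksum %llu\n", sum);
                UnmapViewOfFile(view);                   /* the process decides when the view goes away */
            }

            CloseHandle(section);
            CloseHandle(file);
            return 0;
        }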
