The curse of PAGE strikes another victim

Microsoft, in their never-ending, over-eager quest to save a few KB of non-paged kernel memory, has produced another case of penny-wise, pound-foolish:

*** Fatal System Error: 0x000000d1
(0x85ED6902,0x00000002,0x00000008,0x85ED6902)

Connected to Windows 8 9600 x86 compatible target at (Mon Nov 30 13:05:28.973 2015 (UTC + 5:30)), ptr64 FALSE
Kernel Debugger connection established.

************* Symbol Path validation summary **************
Response Time (ms) Location
Deferred srv*http://msdl.microsoft.com/download/symbols
OK C:\Temp\17.2.0.2\x86
Symbol search path is: srv*http://msdl.microsoft.com/download/symbols;C:\Temp\17.2.0.2\x86
Executable search path is:
Windows 8 Kernel Version 9600 MP (16 procs) Free x86 compatible
Product: WinNt, suite: TerminalServer SingleUserTS
Built by: 9600.17031.x86fre.winblue_gdr.140221-1952
Machine Name:
Kernel base = 0x8125e000 PsLoadedModuleList = 0x8145d438
Debug session time: Mon Nov 30 13:05:14.789 2015 (UTC + 5:30)
System Uptime: 0 days 0:19:27.078
Break instruction exception - code 80000003 (first chance)

A fatal system error has occurred.
Debugger entered on first try; Bugcheck callbacks have not been invoked.

A fatal system error has occurred.

Connected to Windows 8 9600 x86 compatible target at (Mon Nov 30 13:05:33.270 2015 (UTC + 5:30)), ptr64 FALSE
Loading Kernel Symbols



Loading User Symbols

Loading unloaded module list

*******************************************************************************
* *
* Bugcheck Analysis *
* *
*******************************************************************************

Use !analyze -v to get detailed debugging information.

BugCheck D1, {85ed6902, 2, 8, 85ed6902}

Probably caused by : pci.sys ( pci!PciPowerUpDeviceTimerDpc+d3 )

Followup: MachineOwner

nt!RtlpBreakWithStatusInstruction:
81366754 cc int 3
5: kd> !analyze -v
*******************************************************************************
* *
* Bugcheck Analysis *
* *
*******************************************************************************

DRIVER_IRQL_NOT_LESS_OR_EQUAL (d1)
An attempt was made to access a pageable (or completely invalid) address at an
interrupt request level (IRQL) that is too high. This is usually
caused by drivers using improper addresses.
If kernel debugger is available get stack backtrace.
Arguments:
Arg1: 85ed6902, memory referenced
Arg2: 00000002, IRQL
Arg3: 00000008, value 0 = read operation, 1 = write operation
Arg4: 85ed6902, address which referenced memory

Debugging Details:

READ_ADDRESS: 85ed6902

CURRENT_IRQL: 2

FAULTING_IP:
ndis!Rtl::KNeutralLock::Release+0
85ed6902 8bff mov edi,edi

IP_IN_PAGED_CODE:
ndis!Rtl::KNeutralLock::Release+0
85ed6902 8bff mov edi,edi

DEFAULT_BUCKET_ID: WIN8_DRIVER_FAULT

BUGCHECK_STR: AV

PROCESS_NAME: System

ANALYSIS_VERSION: 6.3.9600.16384 (debuggers(dbg).130821-1623) amd64fre

DPC_STACK_BASE: FFFFFFFF87460000

TRAP_FRAME: 8745b85c -- (.trap 0xffffffff8745b85c)
ErrCode = 00000010
eax=00000004 ebx=9d302e70 ecx=a4475edc edx=00000000 esi=a44751d8 edi=9d302e70
eip=85ed6902 esp=8745b8d0 ebp=8745b8e0 iopl=0 nv up ei pl zr na pe nc
cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000 efl=00010246
ndis!Rtl::KNeutralLock::Release:
85ed6902 8bff mov edi,edi
Resetting default scope

LAST_CONTROL_TRANSFER: from 813e36d9 to 81366754

FAILED_INSTRUCTION_ADDRESS:
ndis!Rtl::KNeutralLock::Release+0
85ed6902 8bff mov edi,edi

STACK_TEXT:
8745b364 813e36d9 00000003 79b3d568 00000065 nt!RtlpBreakWithStatusInstruction
8745b3b8 813e31f3 87473340 8745b7b8 8745b85c nt!KiBugCheckDebugBreak+0x1f
8745b78c 81365326 0000000a 85ed6902 00000002 nt!KeBugCheck2+0x676
8745b7b0 81379923 0000000a 85ed6902 00000002 nt!KiBugCheck2+0xc6
8745b7b0 85ed6902 0000000a 85ed6902 00000002 nt!KiTrap0E+0x1cf
8745b8cc 85e9508e 00000002 85e7ffbc 9d302f04 ndis!Rtl::KNeutralLock::Release
8745b8e0 817166ca a4475120 9d302e70 a44751d8 ndis!ndisSetDevicePowerOnComplete+0x150d2
8745b910 812cb56a a4475120 9d302e70 8745b9b8 nt!IovpLocalCompletionRoutine+0x136
8745b98c 81715c8f 00000000 94f51448 00000004 nt!IopfCompleteRequest+0x4ea
8745b9f0 860a298d 94f51820 8744f300 00000001 nt!IovCompleteRequest+0x123
8745ba28 812db456 94f51820 94f51448 a72a6d55 pci!PciPowerUpDeviceTimerDpc+0xd3
8745bae0 812db053 8745bb28 00000000 a7dbb040 nt!KiExecuteAllDpcs+0x216
8745bc04 8137aae0 00000000 00000000 00000000 nt!KiRetireDpcList+0xf3
8745bc08 00000000 00000000 00000000 00000000 nt!KiIdleLoop+0x38

STACK_COMMAND: kb

FOLLOWUP_IP:
pci!PciPowerUpDeviceTimerDpc+d3
860a298d b9fffffffe mov ecx,0FEFFFFFFh

SYMBOL_STACK_INDEX: a

SYMBOL_NAME: pci!PciPowerUpDeviceTimerDpc+d3

FOLLOWUP_NAME: MachineOwner

MODULE_NAME: pci

IMAGE_NAME: pci.sys

DEBUG_FLR_IMAGE_TIMESTAMP: 53088818

IMAGE_VERSION: 6.3.9600.17031

BUCKET_ID_FUNC_OFFSET: d3

FAILURE_BUCKET_ID: AV_VRF_CODE_AV_PAGED_IP_pci!PciPowerUpDeviceTimerDpc

BUCKET_ID: AV_VRF_CODE_AV_PAGED_IP_pci!PciPowerUpDeviceTimerDpc

ANALYSIS_SOURCE: KM

FAILURE_ID_HASH_STRING: km:av_vrf_code_av_paged_ip_pci!pcipowerupdevicetimerdpc

FAILURE_ID_HASH: {2f2c6833-e611-db3d-954c-fb0313bc7b49}

Followup: MachineOwner
---------

5: kd> lmvm pci
start end module name
8609e000 860d5000 pci (pdb symbols) C:\Program Files\Windows Kits\8.1\Debuggers\x64\sym\pci.pdb\3FE60E9E1BA34EA3BCD0C5A75BEAFCEA2\pci.pdb
Loaded symbol image file: pci.sys
Image path: \SystemRoot\System32\drivers\pci.sys
Image name: pci.sys
Timestamp: Sat Feb 22 16:50:56 2014 (53088818)
CheckSum: 00038385
ImageSize: 00037000
File version: 6.3.9600.17031
Product version: 6.3.9600.17031
File flags: 0 (Mask 3F)
File OS: 40004 NT Win32
File type: 2.0 Dll
File date: 00000000.00000000
Translations: 0409.04b0
CompanyName: Microsoft Corporation
ProductName: Microsoft® Windows® Operating System
InternalName: pci.sys
OriginalFilename: pci.sys
ProductVersion: 6.3.9600.17031
FileVersion: 6.3.9600.17031 (winblue_gdr.140221-1952)
FileDescription: NT Plug and Play PCI Enumerator
LegalCopyright: © Microsoft Corporation. All rights reserved.
5: kd> .crash
Break instruction exception - code 80000003 (first chance)

A fatal system error has occurred.

nt!RtlpBreakWithStatusInstruction:
81366754 cc int 3
5: kd> .crash

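Needless to say, ndis!Rtl::KNeutralLock::Release is placed in the PAGE section, and here it is being called from a completion routine running in a DPC at IRQL 2 (DISPATCH_LEVEL).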
But somebody must have gotten a bonus for meeting their target metric of saving 4K of memory.
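For anyone reading the dump cold, here is a minimal sketch of how the four D1 arguments can be interpreted. The struct and helper names are hypothetical and exist only for illustration; the access-type values follow the documented meaning of parameter 3 for bug check 0xD1 (0 = read, 1 = write, 2 or 8 = execute), which the abbreviated text in the !analyze output does not spell out.

```cpp
#include <cstdint>
#include <string>

// Hypothetical container for the four bug check 0xD1 parameters.
struct BugcheckD1 {
    std::uint32_t referenced;  // Arg1: memory referenced
    std::uint32_t irql;        // Arg2: IRQL at the time of the fault
    std::uint32_t access;      // Arg3: access type
    std::uint32_t ip;          // Arg4: address that referenced the memory
};

// Map Arg3 to a readable name (2 and 8 both mean an instruction fetch).
inline std::string AccessName(std::uint32_t access) {
    switch (access) {
        case 0: return "read";
        case 1: return "write";
        case 2:
        case 8: return "execute";
        default: return "unknown";
    }
}

// True when the fault was taken while fetching the instruction itself,
// i.e. the code page, not some data it touched, was unavailable.
inline bool FaultedFetchingCode(const BugcheckD1& b) {
    return (b.access == 2 || b.access == 8) && b.referenced == b.ip;
}
```

With the values from this crash (85ed6902, 2, 8, 85ed6902), FaultedFetchingCode returns true: Arg1 equals Arg4 and the access type is execute, which matches the IP_IN_PAGED_CODE line in the analysis.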

Yeah, but think of what that 4K saves when multiplied by 100 virtual machines!

:wink:

To be fair: EVERYone has bugs. And not everyone’s code gets to run on 110 million systems to see how well it works.

Now that I’ve said that, let me hasten to add "FUCK using pageable memory in kernel-mode code, except where it’s effectively required and guaranteed to be safe." Like for large data structures that are only ever accessed at IRQL PASSIVE_LEVEL… which in your WDF driver is, basically, in a work item and in no other I/O processing routines (unless you’ve taken the time to establish a PassiveLevel Execution Constraint… something most WDF devs have never even heard of).

But I digress…

Peter
OSR
@OSRDrivers



It’s also possible it’s just a bug in the template. Note that, unless otherwise instructed, the C++ compiler will stick vtables into pageable memory. This is fixed by using a declspec: __declspec(code_seg("$kerneltext$")). So, whoever defined the KNeutralLock template may not have put the declaration there and nobody noticed it previously (though it should show up very quickly with Driver Verifier enabled, as it will make any pageable code look paged out).

This is a known risk (see the ancient document from Microsoft on kernel-mode driver development with C++: http://download.microsoft.com/download/5/b/5/5b5bec17-ea71-4653-9539-204a672f11cf/kmcode.doc). What’s changed in recent versions is that developers now have the ability to control where (at least some of) these things are placed. But it is easy to get it wrong, particularly by missing this specific declspec.

Another useful reference is for the “/kernel” switch to the compiler: https://msdn.microsoft.com/library/jj620896.aspx

It forcibly disables C++ structured exception handling, the use of global new/delete, and dynamic_cast or typeid. But it doesn’t fix your class declarations (including the templated class declarations).
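As a rough illustration of the code_seg declspec mentioned above, the sketch below pins every member function of a class template to a named code section. Everything in it is an assumption made for illustration: DemoLock is a hypothetical stand-in and not the real KNeutralLock, DEMO_NONPAGED is a made-up macro, and the section name ".text" is used only because it is the default code section, so the sketch also builds in user mode. A driver would use whatever non-paged section name its build actually expects.

```cpp
#include <atomic>

// On MSVC, place the annotated type's member functions in a named section.
// ".text" is an assumption for this sketch; other compilers get a no-op.
#if defined(_MSC_VER)
#define DEMO_NONPAGED __declspec(code_seg(".text"))
#else
#define DEMO_NONPAGED
#endif

// Hypothetical lock template. Applying the attribute at class scope means
// each instantiation's member functions inherit the section placement.
template <typename Tag>
class DEMO_NONPAGED DemoLock {
public:
    // Spin until the lock is obtained.
    void Acquire() noexcept {
        while (locked_.exchange(true, std::memory_order_acquire)) {
        }
    }

    // Returns true if the lock was free and is now held by the caller.
    bool TryAcquire() noexcept {
        return !locked_.exchange(true, std::memory_order_acquire);
    }

    // Must be safe to call from any context that can hold the lock.
    void Release() noexcept {
        locked_.store(false, std::memory_order_release);
    }

private:
    std::atomic<bool> locked_{false};
};
```

The point of putting the attribute on the class rather than on individual functions is that a later-added member such as Release cannot silently land in a different section from Acquire.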

Tony
OSR

In the public symbols, it’s the only instantiation of the KNeutralLock template.

The goal behind /kernel was that the vtable and other compiler-generated code would never end up in a pageable section by default, and that they would match the section of the containing type if one was specified.


I was thinking that the rather nice win10-IoT running on my rpi2 needs this
*blessing*, but no, really, it needs to never page anything at all, ever.

Mark Roddy


> Yeah, but think of what that 4K saves when multiplied by 100 virtual machines!

This is, indeed, a truly wonderful suggestion. Let’s simply THINK, instead of dumbly and thoughtlessly repeating someone else’s propaganda. As in the old joke about programmers, for the sake of simplicity let’s assume we are speaking about 128 VMs rather than just 100.

4K * 128 = 512K, i.e. half of just 1M! This is all that we save. Now let’s recall that the absolute minimum RAM requirement for a VM guest is 1G these days. Therefore, in order to be able to run 128 VM guests, the target physical machine must have AT LEAST 128G of RAM. In other words, we are saving 512K on a 128G machine, which is 0.000003815 of the machine’s total RAM capacity. Just THINK about it carefully: a laughable less than HALF OF ONE-THOUSANDTH OF A PERCENTAGE POINT (!!!) is all that we save…
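The arithmetic above can be checked with a few lines of code. This is only a sketch of the calculation as stated in the post (4K saved per guest, 128 guests, 1G minimum per guest); the function and constant names are hypothetical.

```cpp
#include <cstdint>

constexpr std::uint64_t kKiB = 1024ULL;
constexpr std::uint64_t kGiB = 1024ULL * 1024ULL * 1024ULL;

// Total bytes saved if each of `vm_count` guests saves `per_vm_bytes`.
constexpr std::uint64_t SavedBytes(std::uint64_t per_vm_bytes,
                                   std::uint64_t vm_count) {
    return per_vm_bytes * vm_count;
}

// Saved bytes as a fraction of host RAM (multiply by 100 for a percentage).
constexpr double FractionOfRam(std::uint64_t saved_bytes,
                               std::uint64_t host_ram_bytes) {
    return static_cast<double>(saved_bytes) /
           static_cast<double>(host_ram_bytes);
}

// 4K per guest across 128 guests is exactly 512K.
static_assert(SavedBytes(4 * kKiB, 128) == 512 * kKiB,
              "4K x 128 guests should be 512K");
```

FractionOfRam(SavedBytes(4 * kKiB, 128), 128 * kGiB) works out to 2^19 / 2^37 = 2^-18, roughly 0.0000038 of host RAM, or about 0.00038 percent.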

Anton Bassov


I believe you missed the tag in Peter’s post.


Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.

> I believe you missed the tag in Peter’s post.

Indeed, I did not see any tag in Peter’s post. Furthermore, I don’t see anything that may even remotely suggest any sarcasm behind his statement. After all, the perceived “RAM savings” of paged code in virtualised environments happens to be one of the key arguments of those folks who tell you about the practical usefulness of the “typedef void VOID” declaration, and Peter happens to be among the last people in the observable universe (at least “in a public setting”) whom I would expect to be openly sarcastic about anything that these folks say…

Anton Bassov



You’re just ASKING to go back on moderation, aren’t you? By TRYING to piss me off??

READ the post, Mr. Bassov. The WHOLE post. See the little Winky emoticon? See the statement about pageable memory later on in the post?

Get a clue, or stop posting here. Or at least stop annoying me when you DO post here.

Peter
OSR
@OSRDrivers

Peter,

> You’re just ASKING to go back on moderation, aren’t you?

Hold on, do you mean I am not moderated any more? In fact, I expected my post to go through your review before making its way to the public, so I thought you would simply discard it if you found it somehow inappropriate…

> Get a clue, or stop posting here. Or at least stop annoying me when you DO post here.

Well, now that I am not moderated any more, I promise to try my best to behave. Sorry for any inconvenience caused…

Anton Bassov

Call it an early Christmas miracle. Or a late Diwali present. Or a slightly early Chanukah gift. Or a small early celebration of Milad un Nabi, for those who approve of its celebration. Or a slightly early celebration of Bodhi Day.

Now go forth and sin no more…

Peter
OSR
@OSRDrivers

On Thu, Dec 3, 2015 at 7:25 PM, wrote:

> Call it an early Christmas miracle. Or a late Diwali present. Or a
> slightly early Chanukah gift. Or a small early celebration of Milad un
> Nabi, for those who approve of its celebration. Or a slightly early
> celebration of Bodhi Day

Festivus? How dare you leave out the restofus!

Mark Roddy

I left out Kwanzaa from my post, too. Clearly, this puts me in at LEAST the same category as Woodrow Wilson.

I am deeply ashamed of both omissions. I will atone by taking the second half of December off to contemplate my sins.

Peter
OSR
@OSRDrivers

>Festivus? How dare you leave out the restofus!

What about “Holiday”?

https://en.wikipedia.org/wiki/Flying_Spaghetti_Monster

Anton Bassov

> Call it an early Christmas miracle. Or a late Diwali present. Or a slightly early Chanukah gift. Or a
> small early celebration of Milad un Nabi, for those who approve of its celebration. Or a slightly early
> celebration of Bodhi Day.

Is this list related in some way to the American sci-fi story about the “Bokonon” religion, where the states are called “granfalloons” and the story ends with some morons accidentally turning all the water in all the oceans into a special form of ice?


Maxim S. Shatskih
Microsoft MVP on File System And Storage
xxxxx@storagecraft.com
http://www.storagecraft.com