How to debug this analyze -v?

Microsoft (R) Windows Debugger Version 6.12.0002.633 AMD64
Copyright (c) Microsoft Corporation. All rights reserved.

Loading Dump File [C:\Users\xxxxx\Desktop\20140107JAEMEMORY.DMP]
Kernel Summary Dump File: Only kernel address space is available

Symbol search path is: srv*c:\users\xxxxx\downloads\symbols*http://msdl.microsoft.com/download/symbols
Executable search path is:
Windows 7 Kernel Version 7601 (Service Pack 1) MP (8 procs) Free x64
Product: WinNt, suite: TerminalServer SingleUserTS
Built by: 7601.18247.amd64fre.win7sp1_gdr.130828-1532
Machine Name:
Kernel base = 0xfffff80002c1e000 PsLoadedModuleList = 0xfffff80002e616d0
Debug session time: Tue Jan 7 15:37:08.839 2014 (UTC - 6:00)
System Uptime: 0 days 23:11:08.667
Loading Kernel Symbols



Loading User Symbols
PEB is paged out (Peb.Ldr = 000007ff`fffdf018). Type ".hh dbgerr001" for details
Loading unloaded module list

*******************************************************************************
* *
* Bugcheck Analysis *
* *
*******************************************************************************

Use !analyze -v to get detailed debugging information.

BugCheck 1A, {41790, fffffa80081e0970, ffff, 0}

Page 3ce4e0 not present in the dump file. Type ".hh dbgerr004" for details
Probably caused by : win32k.sys ( win32k!SURFACE::bDeleteSurface+3c8 )

Followup: MachineOwner

2: kd> !analyze -v
*******************************************************************************
* *
* Bugcheck Analysis *
* *
*******************************************************************************

MEMORY_MANAGEMENT (1a)

Any other values for parameter 1 must be individually examined.

Arguments:
Arg1: 0000000000041790, The subtype of the bugcheck.
Arg2: fffffa80081e0970
Arg3: 000000000000ffff
Arg4: 0000000000000000

Debugging Details:

BUGCHECK_STR: 0x1a_41790

DEFAULT_BUCKET_ID: VISTA_DRIVER_FAULT

PROCESS_NAME: xxxxxxx.exe

CURRENT_IRQL: 0

LAST_CONTROL_TRANSFER: from fffff80002d04d50 to fffff80002c93bc0

STACK_TEXT:
fffff88008d6cf28 fffff80002d04d50 : 000000000000001a 0000000000041790 fffffa80081e0970 000000000000ffff : nt!KeBugCheckEx
fffff88008d6cf30 fffff80002c803df : fffffa8000000000 000000002ed70fff 0000000000000000 0000000000000000 : nt! ?? ::FNODOBFM::string'+0x35084
fffff88008d6d0f0 fffff80002c92e53 : ffffffffffffffff fffff88008d6d3c0 fffff88008d6d428 0000000000008000 : nt!NtFreeVirtualMemory+0x61f
fffff88008d6d1f0 fffff80002c8f410 : fffff9600011745c 0000000000000001 fffff900c01e9010 fffff900c5387cc0 : nt!KiSystemServiceCopyEnd+0x13
fffff88008d6d388 fffff9600011745c : 0000000000000001 fffff900c01e9010 fffff900c5387cc0 0000000000000000 : nt!KiServiceLinkage
fffff88008d6d390 fffff960001177ac : 0000000000000000 fffff88000000000 fffff900c5387cc0 0000000000000000 : win32k!SURFACE::bDeleteSurface+0x3c8
fffff88008d6d4e0 fffff960000d82a5 : 0000000016050f80 fffff900c5387cc0 0000000016050f80 0000000000000001 : win32k!bDeleteSurface+0x34
fffff88008d6d510 fffff80002c92e53 : fffffa800f06ab50 fffff88008d6d5c0 000000000185000f 0000000025b78380 : win32k!NtGdiDeleteObjectApp+0xd5
fffff88008d6d540 000007fefdc1108a : 0000000000000000 0000000000000000 0000000000000000 0000000000000000 : nt!KiSystemServiceCopyEnd+0x13
000000000016dba8 fffff80002c8b210 : 0000000000000000 0000000000000000 0000000000000000 0000000000000000 : 0x7fefdc1108a
fffff88008d6d720 fffff88008d6da28 : fffff80002c96e53 fffff88008d6d780 0000000000000000 fffff900c066a5c0 : nt!KiCallUserMode
fffff88008d6d728 fffff80002c96e53 : fffff88008d6d780 0000000000000000 fffff900c066a5c0 fffff88008d6dc70 : 0xfffff88008d6da28
fffff88008d6d730 0000000000000000 : 0000000000000000 0000000000000000 0000000000000000 00000000`00000000 : nt!SwapContext_PatchXSave+0xa3

STACK_COMMAND: kb

FOLLOWUP_IP:
win32k!SURFACE::bDeleteSurface+3c8
fffff9600011745c e9b0010000 jmp win32k!SURFACE::bDeleteSurface+0x57d (fffff96000117611)

SYMBOL_STACK_INDEX: 5

SYMBOL_NAME: win32k!SURFACE::bDeleteSurface+3c8

FOLLOWUP_NAME: MachineOwner

MODULE_NAME: win32k

IMAGE_NAME: win32k.sys

DEBUG_FLR_IMAGE_TIMESTAMP: 52705fba

FAILURE_BUCKET_ID: X64_0x1a_41790_win32k!SURFACE::bDeleteSurface+3c8

BUCKET_ID: X64_0x1a_41790_win32k!SURFACE::bDeleteSurface+3c8

Followup: MachineOwner

Is this the dump from a virtual machine? Is it a full memory dump? How much memory does the machine have? The page that isn’t in the dump file (3ce4e0) starts at byte 16,346,120,192. My guess is that it’s memory mapped onto some PCI card.

When I used to debug memory that was resident on the BARs of the Xen virtual PCI card, I had debug code that copied this memory into system memory so that I could still access it after a crash.

No, I’m afraid this is just from a normal physical machine. Could it be caused by a transient hardware error?

Yes, 1A/41790 bugchecks are often caused by hardware memory corruption, such as single-bit errors.

If you can reproduce this on Windows 8 or 8.1, there is a good chance that you will get a different bugcheck (1A/41792) that includes the corrupted PTE value in one of the bugcheck parameters. On Windows 7 the corrupted value has already been overwritten by the time the bugcheck occurs, so there is no way to tell whether this was a single-bit error or something else.
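
To make the single-bit-error idea concrete: if you had the corrupted PTE value and could also infer what it should have been, a one-bit corruption would show up as an XOR with exactly one bit set. Here is a minimal Python sketch of that check. The function names and both sample values are made up for illustration; nothing here is taken from this dump.

```python
def flipped_bits(expected: int, observed: int) -> list:
    """Return the bit positions (0-63) at which two 64-bit values differ."""
    diff = (expected ^ observed) & 0xFFFFFFFFFFFFFFFF
    return [bit for bit in range(64) if (diff >> bit) & 1]


def is_single_bit_error(expected: int, observed: int) -> bool:
    """True when the two values differ in exactly one bit."""
    return len(flipped_bits(expected, observed)) == 1


# Hypothetical values, NOT from this dump: the second is the first with bit 21 flipped.
expected_pte = 0x8000000123456867
observed_pte = expected_pte ^ (1 << 21)

print(flipped_bits(expected_pte, observed_pte))         # [21]
print(is_single_bit_error(expected_pte, observed_pte))  # True
```

A difference of several scattered bits would point away from a simple bit flip and toward something like a stray write.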

How much physical memory is in the system? Just a hunch, but I suspect that the important data is in the video memory, which wouldn’t be resident in the crash dump.

I found two things suspect: the address, and the fact that it occurred in
NtFreeVirtualMemory. It smells a bit like corruption [of memory…sorry], at which
point discussing BARs and hardware could be meaningless. Or meaningful,
but for different reasons. I immediately become suspicious of memory damage
when I see any kind of access fault in a “free” routine, only because it
could mean either memory-header damage, or that earlier damage caused a bad
address to be passed in.

I can’t rule out any of the suggestions already made, but if we want to
bandy the phrase “root cause” around, the particular error may be a
second-order effect of the root cause. Since this is a display driver, it
may be impossible to locate and identify the operation that caused the
damage. I offer up the usual advice: Driver Verifier with Special Pool
enabled. It’s worth a try.
joe

NTDEV is sponsored by OSR

Visit the list at: http://www.osronline.com/showlists.cfm?list=ntdev

The machine in question has somewhere between 4GB and 16GB installed. I will check. Also, I will run memtest86+ as well if it is believed that the memory could be bad.

I could be way off, but I was speculating that the page that WinDBG couldn’t find in the dump file was beyond the bounds of the physical RAM in the system (page 3ce4e0 is byte 16,346,120,192). Last I knew from debugging in my Xen days, when a dump is created, the contents of memory that resides on PCI boards aren’t included in the dump file, and thus any data that lives in that memory is MIA in a crash dump. I don’t imagine that the system memory is bad, but instead that the memory allocation that is failing to be freed exists in the onboard RAM of the video card.

Feel free to blast my logic, guys - I can handle it.
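
For anyone checking the arithmetic, the byte figure quoted above follows from the page number if you assume the usual 4 KB (4096-byte) page size on x64. A small Python sketch; the helper name is made up for illustration:

```python
PAGE_SIZE = 4096  # assumption: standard 4 KB pages on x64


def page_to_byte_address(page_frame_number: int) -> int:
    """Convert a physical page number to the byte address where that page starts."""
    return page_frame_number * PAGE_SIZE


pfn = 0x3CE4E0  # the page WinDbg reported as missing from the dump
addr = page_to_byte_address(pfn)

print(f"{addr:,}")                # 16,346,120,192
print(f"{addr / 2**30:.2f} GiB")  # 15.22 GiB
```

Whether an address in that range is backed by RAM or by a device depends on how much memory is installed and how the firmware maps it, so the number alone does not settle the question.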

Hi all, testing with memtest86+ showed that one of the two 8GB sticks in the machine was bad. After removing it, there have been no more crashes.