Test Machine unable to save memory dump files

Hi all,

I have a test machine running Windows 7 x64 (Enterprise) that refuses to save the memory.dmp file. I have the usual things set up:

I set up the keyboard shortcut to force a dump file (per https://msdn.microsoft.com/en-us/library/windows/hardware/ff545499(v=vs.85).aspx), which does work (I’ve used it on other machines with no issues).
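For anyone reproducing the setup, here is a minimal sketch of the registry change behind that keyboard shortcut, written as a small Python helper that only builds .reg file text (it does not touch the registry). The key paths and the CrashOnCtrlScroll value name are taken from my reading of the Microsoft page linked above; the function name and the "ps2"/"usb" labels are hypothetical, so check the values against that page before importing anything.

```python
# Sketch: build .reg file text that enables the keyboard-initiated crash.
# Assumption: CrashOnCtrlScroll lives under the i8042prt (PS/2) and
# kbdhid (USB) service keys, as described in the Microsoft documentation.
# The helper name and the "ps2"/"usb" labels are made up for illustration.

KEYBOARD_DRIVER_KEYS = {
    "ps2": r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\i8042prt\Parameters",
    "usb": r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\kbdhid\Parameters",
}


def crash_on_ctrl_scroll_reg(keyboards=("ps2", "usb")):
    """Return .reg text setting CrashOnCtrlScroll=1 for each keyboard type."""
    lines = ["Windows Registry Editor Version 5.00", ""]
    for kind in keyboards:
        lines.append(f"[{KEYBOARD_DRIVER_KEYS[kind]}]")
        lines.append('"CrashOnCtrlScroll"=dword:00000001')
        lines.append("")  # blank line between sections
    return "\r\n".join(lines)
```

Writing the returned text to a .reg file, importing it, and rebooting should be equivalent to setting the values by hand in regedit.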

So my problem is this:

  1. I press CTRL + ScrlLock + ScrlLock
  2. Windows throws a BSOD
  3. I see the proper bugcheck in the BSOD (Bug Check 0xE2)
  4. I see the ‘writing physical memory to disk: 100’
  5. No memory.dmp file (or minidump file) is saved where they are set (C:\ in this case).

Checking the Event Log, I don’t see a bugcheck error either. I only see a Kernel-Power (event ID 41) event logged.

Things I’ve tried:

  1. Made sure there is > 15 GB of free space on the HDD (the HDD has 300+ GB)
  2. Made sure the page file is larger than physical memory (4 GB RAM, page file = 8 GB)
  3. Made sure the .dmp file is saved in a location that exists (C:\dump.dmp)
  4. Turned off ‘auto restart after BSOD’ in case the system isn’t given enough time to write to disk
  5. Turned off Debugging mode for that OS
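The size-related items in that checklist can be written down as a small sanity check. This is only a sketch: the class, the function, and the exact thresholds (page file and free space both strictly larger than RAM) are assumptions modeled on the list above, not documented Windows requirements.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DumpSetup:
    """Hypothetical summary of the settings listed in the checklist."""
    ram_gb: float
    pagefile_gb: float
    free_disk_gb: float
    dump_dir_exists: bool


def dump_setup_problems(setup: DumpSetup) -> List[str]:
    """Return the checklist items that the given setup fails."""
    problems = []
    if setup.pagefile_gb <= setup.ram_gb:
        problems.append("page file is not larger than physical memory")
    if setup.free_disk_gb <= setup.ram_gb:
        problems.append("free disk space does not exceed physical memory")
    if not setup.dump_dir_exists:
        problems.append("dump target directory does not exist")
    return problems
```

With the numbers from this machine (4 GB RAM, 8 GB page file, 300 GB free, C:\ exists), `dump_setup_problems(DumpSetup(4, 8, 300, True))` returns an empty list, which matches the observation that the usual size prerequisites are not the cause here.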

Are there any known issues where the symptom is “a Kernel-Power event is recorded when a bugcheck occurs”?

Thanks,
Tim

We have also seen the same sort of thing on some of our Win2008 machines. On some machines it works fine, and on others it does not. If we move the hard drive around, the problem follows the drive, so it’s not something broken with the hardware.

We even worked with Microsoft Support on the problem and didn’t really get anywhere.

One key discovery is that writing the dump itself works fine for us. If you power off the machine at the time of the reboot following the bluescreen, you can load pagefile.sys into WinDbg and it reads it as a memory dump just fine.

That means that the problem is not with creating the dump, but in copying it to memory.dmp. One of my co-workers managed to follow that process a bit further, and when it doesn’t work, he saw that there was some sort of access error when it tried to open the pagefile.sys while the system was booting up.

Here are his notes which may be of use:

Experimentation with running ProcMon in boot logging mode revealed some useful information about what works and doesn’t work. Taed’s earlier analysis that the dump is written correctly to the pagefile but never moved to MEMORY.DMP on the reboot is confirmed. The crash drivers for Windows 2008 are different from those in 2003, but that doesn’t really matter here. (This gentleman has done some interesting research into how it works now: https://crashdmp.wordpress.com/)

smss.exe (Windows Session Manager) is responsible for determining whether a crash dump exists and for properly disposing of it when found. smss.exe scans the following registry entries:

  • HKLM\System\CurrentControlSet\Control\Session Manager\Memory Management\PagingFiles
  • HKLM\System\CurrentControlSet\Control\Session Manager\Memory Management\PagefileOnOsVolume (only used for BitLocker apps)
  • HKLM\System\CurrentControlSet\Control\Session Manager\Memory Management\ExistingPageFiles
  • HKLM\System\CurrentControlSet\Control\CrashControl\DedicatedDumpFile
  • HKLM\System\CurrentControlSet\Control\CrashControl\AutoReboot

smss.exe then attempts to open the pagefile named in “ExistingPageFiles” to read the first 4K and determine whether a crash dump lives there. In every failure case I have recorded, the CreateFile call fails with a SHARING VIOLATION error. Because it cannot read the pagefile, smss.exe skips the effort to save the crash dump. smss.exe attempts to open the pagefile with the following access mode: Generic Read/Write, Delete, Write DAC. What I cannot figure out is why a sharing violation occurs. As far back as the point when ProcMon started collecting data, no other process has an open handle to the pagefile. The sharing violation error is the only one I ever see, and it always occurs at the same point when the system is coming up.

Now, if new NTFS partitions are created on the system or secondary drive, and the pagefile is located somewhere other than the system volume, then smss.exe has no problem accessing the pagefile. Through various manipulations of the paging file I have managed to get smss.exe to occasionally open the pagefile on the system drive, but the problem eventually comes back within one or two subsequent reboots.

One thing of note is that a system partition which has been upgraded to Windows 2008 has a different owner and set of security attributes on the C:\ directory than a partition natively created on Windows 2008. Natively-created partitions have SYSTEM as the owner and a security attribute for the “Everyone” object that allows read access just on C:\. System partitions upgraded from Windows 2003 to Windows 2008 (and also created by the deployment disk?) have “TrustedInstaller” as the owner and no security attribute for “Everyone”. I see no reason for a sharing violation due to this unless there is an odd conflict with the ACL between smss.exe and the pagefile that results in a sharing violation as opposed to an access denied error.
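The "read the first 4K" step from the notes above can be approximated offline, for example on a copy of pagefile.sys taken after powering the machine off as described earlier. This is a sketch, not what smss.exe actually runs: it assumes the kernel dump header begins with the ASCII signature "PAGE" followed by "DUMP" (32-bit) or "DU64" (64-bit), and the function names are hypothetical.

```python
PAGE_SIZE = 4096  # the notes describe smss.exe reading the first 4K


def dump_kind(first_page: bytes):
    """Classify the start of a pagefile or dump file.

    Returns "32-bit" or "64-bit" if the assumed dump signature is
    present, otherwise None.
    """
    if first_page[:4] != b"PAGE":
        return None
    marker = first_page[4:8]
    if marker == b"DUMP":
        return "32-bit"
    if marker == b"DU64":
        return "64-bit"
    return None


def pagefile_holds_dump(path):
    """Read the first page of an offline pagefile copy and classify it."""
    with open(path, "rb") as f:
        return dump_kind(f.read(PAGE_SIZE))
```

On a running system the live pagefile is locked, so `pagefile_holds_dump` is only useful against a copy pulled from the disk while it is attached to another machine. A result of "64-bit" on the affected drive would be consistent with the finding that the dump is written correctly and only the copy to MEMORY.DMP fails.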

Thanks twynnell, there is more information here than MS Support (and their Pro Support) has been able to give me. I will be giving it a second go on Monday.

On Feb 25, 2017 7:39 AM, wrote:

[quoted reply snipped]



As a workaround I connected a separate physical hard drive to the machine, and set both the memory dump location and the page file to that new hard drive. The memory dump can now be saved successfully on that drive.

This isn’t the best long-term solution, as the page file is now sitting on a separate drive from the OS, but it’ll get me what I need to push on the actual issue.

Very strange indeed.

Note that you can use a file other than the paging file for crash dump
purposes. See DedicatedDumpFile:

https://blogs.msdn.microsoft.com/ntdebugging/2010/04/02/how-to-use-the-dedicateddumpfile-registry-value-to-overcome-space-limitations-on-the-system-drive-when-capturing-a-system-memory-dump/

-scott
OSR
@OSRDrivers
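To make the DedicatedDumpFile suggestion concrete, here is a sketch of a helper that builds the corresponding .reg text. It assumes, per my reading of the linked blog post, that DedicatedDumpFile is a string value and DumpFileSize is a DWORD in megabytes, both under the CrashControl key listed earlier in the thread. The function name and the example path are hypothetical.

```python
def dedicated_dump_reg(path, size_mb=None):
    """Return .reg text pointing crash dumps at a dedicated file.

    path    -- full path of the dedicated dump file (example value only)
    size_mb -- optional DumpFileSize in megabytes (assumed unit)
    """
    key = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\CrashControl"
    escaped = path.replace("\\", "\\\\")  # .reg strings need doubled backslashes
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        f"[{key}]",
        f'"DedicatedDumpFile"="{escaped}"',
    ]
    if size_mb is not None:
        lines.append(f'"DumpFileSize"=dword:{size_mb:08x}')
    lines.append("")
    return "\r\n".join(lines)
```

For example, `dedicated_dump_reg(r"D:\DedicatedDumpFile.sys", 8192)` produces text that sets the dump file on D: with a size of 8192 MB. Whether this sidesteps the sharing violation on the system-volume pagefile is untested here; it only avoids using the pagefile as the dump target.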
