delete DeviceExtension in PnpHandler causes Bugcheck 50 during uninstall

Hi all,

I am seeing a behavior that I do not understand.

I have a modified SysVAD audio driver as a child device of my USB driver, and I use a DeviceExtension more or less the same way the sample driver does.
When my PnP handler receives IRP_MN_REMOVE_DEVICE, I delete the memory allocated in my DeviceExtension:

case IRP_MN_QUERY_REMOVE_DEVICE:
case IRP_MN_REMOVE_DEVICE:
case IRP_MN_SURPRISE_REMOVAL:
case IRP_MN_REMOVE_DEVICE:	
    pExt = static_cast<MyInterFace*>(_DeviceObject->DeviceExtension);
    if (pExt->m_pCommonAdapter != NULL)
    {
        RemoveAllCaptureFilters(pExt->m_pCommonAdapter);
        RemoveAllRenderFilters(pExt->m_pCommonAdapter);

        pExt->m_pCommonAdapter->Cleanup();
        pExt->m_pCommonAdapter->Release();

        if (stack->MinorFunction == IRP_MN_REMOVE_DEVICE)
        {
            if (pExt->m_pCommonAdapter)
            {
                delete pExt->m_pCommonAdapter;
                pExt->m_pCommonAdapter = NULL;
            }
        }
    }

    break;

During unplug everything is fine and the code works.
If I uninstall the driver, the code above bugchecks right after I delete pExt->m_pCommonAdapter and pass the IRP on to

ntStatus = PcDispatchIrp(_DeviceObject, _Irp);

with this:

PAGE_FAULT_IN_NONPAGED_AREA (50)
Invalid system memory was referenced.  This cannot be protected by try-except.
Typically the address is just plain bad or it is pointing at freed memory.
Arguments:
Arg1: ffffd80755a7cf88, memory referenced.
Arg2: 0000000000000000, value 0 = read operation, 1 = write operation.
Arg3: fffff80422a3ce4f, If non-zero, the instruction address which referenced the bad memory
	address.
Arg4: 0000000000000002, (reserved)

Debugging Details:
------------------

KEY_VALUES_STRING: 1

STACKHASH_ANALYSIS: 1

TIMELINE_ANALYSIS: 1

DUMP_CLASS: 1

DUMP_QUALIFIER: 0

BUILD_VERSION_STRING:  19041.1.amd64fre.vb_release.191206-1406

DUMP_TYPE:  0

BUGCHECK_P1: ffffd80755a7cf88

BUGCHECK_P2: 0

BUGCHECK_P3: fffff80422a3ce4f

BUGCHECK_P4: 2

READ_ADDRESS: Unable to get offset of nt!_MI_VISIBLE_STATE.SpecialPool
Unable to get value of nt!_MI_VISIBLE_STATE.SessionSpecialPool
 ffffd80755a7cf88 Nonpaged pool

FAULTING_IP: 
portcls!PnpStopDevice+223
fffff804`22a3ce4f 488b01          mov     rax,qword ptr [rcx]

This only happens if I enable Driver Verifier with the standard settings for my driver, and only when I uninstall the driver.
It does not happen if I unplug the device.

Any idea what is happening?

Thanks!

K_W

Maybe the call stack is interesting too:

STACK_TEXT:  
ffff8085`1337d9c8 fffff803`4c12e032 : ffff8085`1337db30 fffff803`4bf98480 fffff804`22a00000 00000000`00000000 : nt!DbgBreakPointWithStatus
ffff8085`1337d9d0 fffff803`4c12d616 : fffff804`00000003 ffff8085`1337db30 fffff803`4c027d10 ffff8085`1337e080 : nt!KiBugCheckDebugBreak+0x12
ffff8085`1337da30 fffff803`4c012ed7 : 00000000`00000000 00000000`00000000 ffffd807`55a7cf88 ffffd807`55a7cf88 : nt!KeBugCheck2+0x946
ffff8085`1337e140 fffff803`4c06927f : 00000000`00000050 ffffd807`55a7cf88 00000000`00000000 ffff8085`1337e420 : nt!KeBugCheckEx+0x107
ffff8085`1337e180 fffff803`4bec1960 : 00000000`00000000 00000000`00000000 ffff8085`1337e4a0 00000000`00000000 : nt!MiSystemFault+0x1898cf
ffff8085`1337e280 fffff803`4c020f5e : fffff804`22a00000 ffffd807`5b3a2040 00000000`00000000 ffff9981`cc1e9180 : nt!MmAccessFault+0x400
ffff8085`1337e420 fffff804`22a3ce4f : ffffd807`59068510 ffffd807`00000000 ffffd807`59068300 ffffd807`59068400 : nt!KiPageFault+0x35e
ffff8085`1337e5b0 fffff804`22a3925a : ffffd807`59068490 ffffd807`5ac3ef28 ffffd807`5ac3ecf0 00000000`00000000 : portcls!PnpStopDevice+0x223
ffff8085`1337e5f0 fffff804`22a35412 : ffffd807`59068340 ffffd807`5ac3ecf0 00000000`00000000 ffffd807`558ee301 : portcls!DispatchPnp+0x47ca
ffff8085`1337e660 fffff803`4bf88fe7 : 00000000`00000000 00000000`00000000 ffffd807`59068340 ffffd807`00000001 : portcls!PcDispatchIrp+0x202
ffff8085`1337e6d0 fffff803`4c5ddf0a : ffffd807`5ac3ecf0 ffffd807`59068340 ffffd807`558ee3a8 ffffd807`558ee3a8 : nt!IopfCallDriver+0x53
ffff8085`1337e710 fffff803`4c049d9b : ffffd807`5ac3ecf0 ffff8085`1337e7e0 ffffd807`5cf47210 ffffd807`5cf46100 : nt!IovCallDriver+0x266
ffff8085`1337e750 fffff804`22bd15d9 : ffff8085`1337e7e8 ffffd807`5ac3ecf0 ffffd807`558ee3a8 fffff803`4c2aa518 : nt!IofCallDriver+0x1e088b
ffff8085`1337e790 fffff804`22bd1023 : ffffd807`5ac3ecf0 ffffd807`5b8a2af0 ffffc183`00000000 ffffd807`5cf47210 : ksthunk!CKernelFilterDevice::DispatchIrp+0x155
ffff8085`1337e7f0 fffff803`4bf88fe7 : ffffd807`5ac3efb8 fffff803`4c5ea19e ffffd807`00000001 ffffd807`00000001 : ksthunk!CKernelFilterDevice::DispatchIrpBridge+0x13
ffff8085`1337e820 fffff803`4c5ddf0a : ffffd807`5ac3ecf0 ffffd807`5b8a2af0 ffffd807`5ac3ecf0 fffff803`4c5ea6b9 : nt!IopfCallDriver+0x53
ffff8085`1337e860 fffff803`4c049d9b : ffffd807`5b8a2af0 00000000`00000000 ffffd807`5b8a2af0 ffffd807`5cf47210 : nt!IovCallDriver+0x266
ffff8085`1337e8a0 fffff803`4c2aa518 : 00000000`00000000 ffffd807`5b8a2af0 ffff8085`1337e990 ffffc183`b78df0b0 : nt!IofCallDriver+0x1e088b
ffff8085`1337e8e0 fffff803`4c34ff4e : 00000000`00000002 ffffd807`5abb36e0 ffffd807`5abb36e0 ffffd807`5a57ac80 : nt!IopSynchronousCall+0xf8
ffff8085`1337e950 fffff803`4bf886dc : ffffc183`c102cb90 ffffd807`5a57ac80 00000000`00000001 00000000`0000000a : nt!IopRemoveDevice+0x126
ffff8085`1337ea00 fffff803`4c34faf2 : ffffd807`5a57ac80 00000000`00000015 00000000`00000000 cb3a4008`00200001 : nt!PnpRemoveLockedDeviceNode+0x1ac
ffff8085`1337ea60 fffff803`4c34f827 : ffffd807`5a57ac80 ffff8085`1337eae0 00000000`00000015 ffffd807`5a57ac80 : nt!PnpDeleteLockedDeviceNode+0x4e
ffff8085`1337eaa0 fffff803`4c34e133 : ffffd807`5abb36e0 ffffc183`00000002 ffffd807`5abb36e0 00000000`00000001 : nt!PnpDeleteLockedDeviceNodes+0xf7
ffff8085`1337eb20 fffff803`4c3486bb : ffff8085`1337ec60 ffffd807`5a57ac00 ffff8085`1337ec00 ffffc183`00000003 : nt!PnpProcessQueryRemoveAndEject+0x39b
ffff8085`1337ec00 fffff803`4c27096e : ffffc183`c102cb90 ffffc183`bf0fc510 ffffd807`4e87ab00 00000000`00000000 : nt!PnpProcessTargetDeviceEvent+0xeb
ffff8085`1337ec30 fffff803`4bedaae5 : ffffd807`5b3a2040 ffffd807`5b3a2040 ffffd807`4e87ab90 ffffd807`51df3350 : nt!PnpDeviceEventWorker+0x2ce
ffff8085`1337ecb0 fffff803`4bf09a75 : ffffd807`5b3a2040 00000000`00000080 ffffd807`4e875200 00000000`00000080 : nt!ExpWorkerThread+0x105
ffff8085`1337ed50 fffff803`4c01a428 : ffff9981`cc1e9180 ffffd807`5b3a2040 fffff803`4bf09a20 00000000`00000002 : nt!PspSystemThreadStartup+0x55
ffff8085`1337eda0 00000000`00000000 : ffff8085`1337f000 ffff8085`13379000 00000000`00000000 00000000`00000000 : nt!KiStartSystemThread+0x28

Your code is handling IRP_MN_QUERY_REMOVE_DEVICE the same as stop, remove, and surprise remove, and that is wrong. Just pass the query down the stack and do nothing about it. You also have IRP_MN_REMOVE_DEVICE twice in your switch. A C++ compiler rejects duplicate case labels, so I would not have expected this to build as posted.

Also:

The query remove is going to call cleanup and release, and then stop is likely trying to use something cleanup and release deleted?
The original code in sysvad is this:

```
case IRP_MN_REMOVE_DEVICE:
case IRP_MN_SURPRISE_REMOVAL:
case IRP_MN_STOP_DEVICE:
    ext = static_cast<PortClassDeviceContext*>(_DeviceObject->DeviceExtension);

    if (ext->m_pCommon != NULL)
    {
        ext->m_pCommon->Cleanup();

        ext->m_pCommon->Release();
        ext->m_pCommon = NULL;
    }
    break;
```

Note that cleanup and release are protected by a conditional expression that tests the state of m_pCommon, and that IRP_MN_QUERY_REMOVE is not in the switch condition, and that any of the request types in the condition go through the same path.
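To make the point about the switch concrete, here is a minimal user-mode sketch of that pattern. It is not kernel code: `PnpMinor`, `MockCommon`, `MockContext`, `HandlePnp`, and `ReleaseCountFor` are hypothetical names invented for this illustration, and the mock object only counts calls. It shows that a query-remove does no teardown, and that whichever of stop, surprise remove, or remove arrives first releases the object exactly once.

```cpp
#include <cassert>
#include <initializer_list>

// Hypothetical stand-ins for the PnP minor function codes discussed above.
enum class PnpMinor { QueryRemove, Stop, SurpriseRemoval, Remove };

// Mock adapter-common object (assumption): it only counts calls.
struct MockCommon {
    int cleanupCalls = 0;
    int releaseCalls = 0;
    void Cleanup() { ++cleanupCalls; }
    void Release() { ++releaseCalls; }
};

// Mock of the driver's context holding the pointer.
struct MockContext {
    MockCommon* m_pCommon = nullptr;
};

// Same shape as the sample's switch: tear down once, then forget the pointer.
void HandlePnp(MockContext& ctx, PnpMinor minor) {
    switch (minor) {
    case PnpMinor::Remove:
    case PnpMinor::SurpriseRemoval:
    case PnpMinor::Stop:
        if (ctx.m_pCommon != nullptr) {
            ctx.m_pCommon->Cleanup();
            ctx.m_pCommon->Release();
            ctx.m_pCommon = nullptr;  // later requests skip the teardown
        }
        break;
    default:
        // Query-remove and everything else: no teardown here.
        break;
    }
    // A real handler would now forward the request down the stack.
}

// Replays a sequence of requests and returns how often Release was called.
int ReleaseCountFor(std::initializer_list<PnpMinor> sequence) {
    MockCommon common;
    MockContext ctx;
    ctx.m_pCommon = &common;
    for (PnpMinor minor : sequence) {
        HandlePnp(ctx, minor);
    }
    assert(common.cleanupCalls == common.releaseCalls);
    return common.releaseCalls;
}
```

For example, `ReleaseCountFor({PnpMinor::QueryRemove, PnpMinor::Remove})` and `ReleaseCountFor({PnpMinor::SurpriseRemoval, PnpMinor::Remove})` both give one release, because the pointer is cleared after the first teardown.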

Hi Mark,

thanks a lot for your answer.

Well, of course you are right. My code is not exactly the same. The second IRP_MN_REMOVE_DEVICE was a copy-paste error. My original code is a bit cluttered with comments, debug prints, etc.

Let me start a bit closer to the beginning:

If I use the same sequence as you showed above, the memory of m_pCommon does not get freed.
In that case this happens:

DRIVER_VERIFIER_DETECTED_VIOLATION (c4)
A device driver attempting to corrupt the system has been caught.  This is
because the driver was specified in the registry as being suspect (by the
administrator) and the kernel has enabled substantial checking of this driver.
If the driver attempts to corrupt the system, bugchecks 0xC4, 0xC1 and 0xA will
be among the most commonly seen crashes.
Arguments:
Arg1: 0000000000000062, A driver has forgotten to free its pool allocations prior to unloading.
Arg2: ffffae076275bf30, name of the driver having the issue.
Arg3: ffffae0765857cd0, verifier internal structure with driver information.
Arg4: 0000000000000001, total # of (paged+nonpaged) allocations that weren't freed.
	Type !verifier 3 drivername.sys for info on the allocations
	that were leaked that caused the bugcheck.

So I decided to free the memory myself.

Why I check for IRP_MN_REMOVE_DEVICE before freeing it:

If I delete m_pCommon in one of the earlier calls, some other destructors bugcheck because they still use m_pCommon.
That is why my condition deletes the allocated memory only in the last call, which is IRP_MN_REMOVE_DEVICE.

Now everything seems to work as expected until I try to remove and uninstall the driver.
In this case:

Neither IRP_MN_SURPRISE_REMOVAL nor IRP_MN_STOP_DEVICE is sent, so my cleanup never starts here.
So I added IRP_MN_QUERY_REMOVE_DEVICE, which is sent before IRP_MN_REMOVE_DEVICE.

But still, if I delete the memory in the last step, I get the bugcheck shown above after I pass the request down the stack.

So I do not understand why, because I clean up everything I have used so far.

One possible issue I can think of:

The PnP handler is in:

#pragma code_seg("PAGE")

But m_pCommon is allocated with PoolType NonPagedPool.

Is this an issue?

If so, my question would be: at which point do I need to free the memory, and where is this done in the original sample?
Have I messed something up here?

This is NOT the way things are done in an audio driver. The Port Class driver owns the DeviceExtension. It allocates the extension for you in PcAddAdapterDevice, and takes care of releasing it later. It ASSUMES that extension contains ITS data. You can’t allocate your own. If you want a custom extension, you specify the size you want (as PORT_CLASS_DEVICE_EXTENSION_SIZE + sizeof(MY_DEVICE_EXTENSION)) as the last parameter to PcAddAdapterDevice. Then, you create a simple function to find your part of the extension when you need it:

inline DEVICE_CONTEXT * GetDeviceContext( const DEVICE_OBJECT * fdo )
{
    // Our context region follows the port class's region.
    return (DEVICE_CONTEXT*)((PCHAR)fdo->DeviceExtension+PORT_CLASS_DEVICE_EXTENSION_SIZE);
}
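The layout behind that helper can be tried out in user mode. The following is only a sketch under stated assumptions: `kPortClsExtensionSize` is a hypothetical stand-in for PORT_CLASS_DEVICE_EXTENSION_SIZE, and `MockDeviceObject`, `MyDeviceContext`, `GetMyContext`, and `DemoLayout` are names invented for this illustration. It shows one allocation holding the Port Class region first and the driver's region after it, with the driver touching only its own part.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <vector>

// Hypothetical stand-in for PORT_CLASS_DEVICE_EXTENSION_SIZE (assumption).
constexpr std::size_t kPortClsExtensionSize = 64 * sizeof(void*);

// Hypothetical driver-private context placed after the Port Class region.
struct MyDeviceContext {
    void* common;       // would hold the adapter-common pointer
    int   filterCount;  // example driver state
};

// Mock of the one DEVICE_OBJECT field this sketch needs.
struct MockDeviceObject {
    void* DeviceExtension;
};

// Size a driver would request so that both regions fit in one allocation.
constexpr std::size_t TotalExtensionSize() {
    return kPortClsExtensionSize + sizeof(MyDeviceContext);
}

// Skip the Port Class region to reach the driver's own region.
inline MyDeviceContext* GetMyContext(const MockDeviceObject* fdo) {
    return reinterpret_cast<MyDeviceContext*>(
        static_cast<char*>(fdo->DeviceExtension) + kPortClsExtensionSize);
}

// Writes to the driver region and checks the Port Class region is untouched.
bool DemoLayout() {
    // Fill the whole extension with a marker byte so stray writes are visible.
    std::vector<char> storage(TotalExtensionSize(), 0x5A);
    MockDeviceObject fdo{storage.data()};

    // Construct the driver context in its slot and use it.
    MyDeviceContext* ctx = new (GetMyContext(&fdo)) MyDeviceContext{};
    ctx->filterCount = 2;

    for (std::size_t i = 0; i < kPortClsExtensionSize; ++i) {
        if (storage[i] != 0x5A) {
            return false;  // the Port Class region was modified
        }
    }
    assert(static_cast<void*>(ctx) == storage.data() + kPortClsExtensionSize);
    return ctx->filterCount == 2;
}
```

The design point is that the driver never allocates or frees this block itself. It only computes an offset into memory that the framework owns.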

Thanks Tim,

case IRP_MN_REMOVE_DEVICE:
case IRP_MN_SURPRISE_REMOVAL:
case IRP_MN_STOP_DEVICE:
    pExt = (MyInterFace*)((PBYTE)(_DeviceObject->DeviceExtension) + PORT_CLASS_DEVICE_EXTENSION_SIZE);
    if (pExt->m_pCommon != NULL)
    {
        // I added this to make sure everything is destroyed
        RemoveAllCaptureFilters(pExt->m_pCommon);
        RemoveAllRenderFilters(pExt->m_pCommon);

        pExt->m_pCommon->Cleanup();
        pExt->m_pCommon->Release();

        // #1: if deleted here, it bugchecks with 0x50
        delete pExt->m_pCommon;
        // #2: if not deleted, it bugchecks because the memory is not freed

        pExt->m_pCommon = NULL;
    }
    break;

It is actually already implemented the way you described.

I do not see the memory being freed correctly if I leave the responsibility to the Port Class driver.

One difference from the original sample code is my own extension: I place the m_pCommon pointer in my own area.
So I guess I am doing something wrong somewhere else.

Is there any hint as to where and what I can double-check?

Thanks!

What’s the point of this? Is that your CAdapterCommon object? If so, that should be cleaned up for you. Port Class is handed an instance of that object as part of StartDevice, and it takes care of cleanup through the normal COM Release process. If not, then why isn’t everything you need inside your device extension extension? You ought to be able to use one of the methods that are automatically cleaned up. Why add something else to screw up?

Thanks again Tim,

It seems I have found the reason why the CAdapterCommon was not released properly by Port Class.
I accidentally incremented the ref count in one place in my code and did not call Release. So I guess Port Class never triggered its deletion.

Thanks for helping me with this!

Best regards

Good find. Every COM programmer has encountered this issue.
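For anyone who lands here later, the failure mode is easy to reproduce outside the kernel. This is a minimal user-mode sketch, not the SysVAD code: `MockAdapterCommon` and `RunLifetime` are hypothetical names, and the "framework" reference is simply simulated by an extra AddRef/Release pair. One unmatched AddRef is enough to keep the count above zero, so the destructor never runs and the allocation is leaked.

```cpp
#include <cassert>

// Hypothetical stand-in for a COM-style object such as CAdapterCommon.
class MockAdapterCommon {
public:
    explicit MockAdapterCommon(bool* destroyedFlag)
        : refs_(1), destroyed_(destroyedFlag) {}

    unsigned long AddRef() { return ++refs_; }

    unsigned long Release() {
        assert(refs_ > 0);
        unsigned long remaining = --refs_;
        if (remaining == 0) {
            delete this;  // the last reference frees the object
        }
        return remaining;
    }

private:
    // Private so the object can only die through Release.
    ~MockAdapterCommon() { *destroyed_ = true; }

    unsigned long refs_;
    bool* destroyed_;
};

// Returns true if the object was destroyed once all owners released it.
// With leakOneReference set, the object is intentionally left alive.
bool RunLifetime(bool leakOneReference) {
    bool destroyed = false;
    MockAdapterCommon* common = new MockAdapterCommon(&destroyed);  // creator's reference

    common->AddRef();  // reference held by the framework (simulated)
    common->AddRef();  // extra reference taken somewhere in driver code

    if (!leakOneReference) {
        common->Release();  // the matching Release for the extra AddRef
    }
    common->Release();  // framework drops its reference
    common->Release();  // creator drops its reference
    return destroyed;
}
```

`RunLifetime(false)` ends with the destructor having run, while `RunLifetime(true)` leaves the count at one, which is the kind of leftover allocation Driver Verifier reports at unload. Calling `delete` on the object by hand instead of fixing the count leaves the other reference holders with a dangling pointer.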