CSQ on a multi-processor machine

Hi,

I’m experiencing a live-lock situation when using CSQ on a multi-processor
machine. The issue doesn’t happen on a single-processor machine. First off,
my driver is a top-level driver that creates a single worker thread that
handles queued IOCTLs for user requests in system context. I use an ERESOURCE
in my CSQ callback routines as below:

// CSQ acquire-lock callback: takes the queue's ERESOURCE exclusively.
// The Irql out-parameter is not used by this version.
VOID MyDrvAcquireLock(
    IN PIO_CSQ Csq,
    OUT PKIRQL Irql
    )
{
    PMyDrv_DEVICE_EXTENSION devExtension;

    devExtension = CONTAINING_RECORD(Csq,
                       MyDrv_DEVICE_EXTENSION, CancelSafeQueue);
    KeEnterCriticalRegion();
    ExAcquireResourceExclusiveLite( &devExtension->IrpQueueLock, TRUE );
}

// CSQ release-lock callback: releases the ERESOURCE taken above.
VOID MyDrvReleaseLock(
    IN PIO_CSQ Csq,
    IN KIRQL Irql
    )
{
    PMyDrv_DEVICE_EXTENSION devExtension;

    devExtension = CONTAINING_RECORD(Csq,
                       MyDrv_DEVICE_EXTENSION, CancelSafeQueue);

    ExReleaseResourceLite( &devExtension->IrpQueueLock );
    KeLeaveCriticalRegion();
}

Here is what my cleanup dispatch routine looks like:

NTSTATUS
MyDrvCleanup(
    IN PDEVICE_OBJECT DeviceObject,
    IN PIRP Irp
    )
{
    PMyDrv_DEVICE_EXTENSION devExtension;
    LIST_ENTRY tempQueue;
    PLIST_ENTRY thisEntry;
    PIRP pendingIrp;
    PIO_STACK_LOCATION pendingIrpStack, irpStack;

    devExtension = DeviceObject->DeviceExtension;

    irpStack = IoGetCurrentIrpStackLocation(Irp);

    while ((pendingIrp = IoCsqRemoveNextIrp(&devExtension->CancelSafeQueue,
                                            irpStack->FileObject)) != NULL)
    {
        // Cancel the IRP
        pendingIrp->IoStatus.Information = 0;
        pendingIrp->IoStatus.Status = STATUS_CANCELLED;
        MyDrv_KDPRINT(("Cleanup cancelled irp %p\n", pendingIrp));
        IoCompleteRequest(pendingIrp, IO_NO_INCREMENT);
    }

    // Finally complete the cleanup IRP
    Irp->IoStatus.Information = 0;
    Irp->IoStatus.Status = STATUS_SUCCESS;
    IoCompleteRequest(Irp, IO_NO_INCREMENT);

    return STATUS_SUCCESS;
}

After the worker thread processes a large volume of requests, the machine
hangs. Looking at the stack trace of my process, which is sending IOCTLs to
the driver, I see the following trace every time:

THREAD 81557da8 Cid 0708.01b0 Teb: 7ffde000 Win32Thread: e15d9798 RUNNING on processor 1
IRP List:
    816da360: (0006,0094) Flags: 00000404 Mdl: 00000000
Not impersonating
DeviceMap e15531d0
Owning Process 8166f1f8 Image: MyUserProg.exe
Wait Start TickCount 35456 Ticks: 16146 (0:00:04:12.281)
Context Switch Count 89534 LargeStack
UserTime 00:00:00.0546
KernelTime 00:04:17.0562
Win32 Start Address MyUserProg. (0x0043d73a)
Start Address kernel32!BaseProcessStartThunk (0x77e813f2)
Stack Init f7b23000 Current f7b22bec Base f7b23000 Limit f7b1e000 Call 0
Priority 8 BasePriority 8 PriorityDecrement 0 DecrementCount 16

ChildEBP RetAddr
f7b22bf4 80527571 hal!KeReleaseQueuedSpinLock+0x51 (FPO: [0,1,0])
f7b22c14 f7ab5ab4 nt!ExAcquireResourceExclusiveLite+0x65 (FPO: [Non-Fpo])
f7b22c28 f7ac2934 MyDrvdrv!MyDrvAcquireLock+0x24 (FPO: [Non-Fpo]) (CONV: stdcall)
f7b22c40 f7ab5ded MyDrvdrv!WdmlibIoCsqRemoveNextIrp+0x16 (FPO: [Non-Fpo])
f7b22c6c 804eb605 MyDrvdrv!MyDrvCleanup+0x2d (FPO: [Non-Fpo]) (CONV: stdcall)
f7b22c7c 8056a12c nt!IopfCallDriver+0x31 (FPO: [0,0,1])
f7b22ca8 805a0a63 nt!IopCloseFile+0x23a (FPO: [Non-Fpo])
f7b22cd8 805a041b nt!ObpDecrementHandleCount+0x119 (FPO: [Non-Fpo])
f7b22d00 805a04b1 nt!ObpCloseHandleTableEntry+0x14b (FPO: [Non-Fpo])
f7b22d48 805a05d7 nt!ObpCloseHandle+0x85 (FPO: [Non-Fpo])
f7b22d58 80531814 nt!NtClose+0x19 (FPO: [1,0,0])
f7b22d58 7ffe0304 nt!KiSystemService+0xc9 (FPO: [0,0] TrapFrame @ f7b22d64)
0012e894 77f5b5d4 SharedUserData!SystemCallStub+0x4 (FPO: [0,0,0])
0012e898 77e7a683 ntdll!ZwClose+0xc (FPO: [1,0,0])
0012e8a0 10091f0b kernel32!CloseHandle+0x4d (FPO: [1,0,0])

As can be seen, I run into this situation when the user app calls
CloseHandle. Here is a snippet from the !stacks command showing that this
thread is running on processor 1, so it’s a live lock:

[8166f1f8 MyUserProg.exe]
708.0001b0 81557da8 0003f12 RUNNING hal!KeReleaseQueuedSpinLock+0x51

Like I said earlier, this *only* happens when I run this code on a Pentium
dual-core machine; it doesn’t happen on a single-processor machine.

My questions are:

  1. Are there any issues with using CSQ on a multi-processor machine?
  2. Or is this due to using an ERESOURCE in a multi-processor environment? If
    so, which locking primitive is suggested so that this works in both uni-
    and multi-processor environments?

Thanks in advance,
Chandra

Am I wrong that CSQ callbacks can be called at DISPATCH_LEVEL?


Maxim Shatskih, Windows DDK MVP
StorageCraft Corporation
xxxxx@storagecraft.com
http://www.storagecraft.com
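
On the IRQL question: ExAcquireResourceExclusiveLite must be called at IRQL <= APC_LEVEL, so an ERESOURCE is only a valid CSQ lock if every path into the callbacks (including cancellation) stays at or below that level. A minimal sketch of a debug check is below. The helper name MyDrvIrqlAllowsEresource is hypothetical, and the typedef and enum are user-mode stand-ins for the WDK definitions so that the fragment compiles on its own; in a driver the argument would come from KeGetCurrentIrql().

```c
#include <assert.h>

/* User-mode stand-ins for WDK definitions (assumption: in a driver,
   these come from <wdm.h> instead). */
typedef unsigned char KIRQL;
enum { PASSIVE_LEVEL = 0, APC_LEVEL = 1, DISPATCH_LEVEL = 2 };

/* Hypothetical helper: returns nonzero when the given IRQL is low enough
   to acquire an ERESOURCE (documented limit: IRQL <= APC_LEVEL). */
static int MyDrvIrqlAllowsEresource(KIRQL CurrentIrql)
{
    return CurrentIrql <= APC_LEVEL;
}

/* In the driver's CsqAcquireLock callback, the check would read:
       ASSERT(MyDrvIrqlAllowsEresource(KeGetCurrentIrql()));          */
```

If such an assertion ever fires on a checked build, the callbacks are being entered above APC_LEVEL and the ERESOURCE cannot be used for this queue.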


I allocate the ERESOURCE in the device extension, which is stored in
non-paged pool. As per the CSQ documentation, the locks used in the
callbacks need not necessarily be spin locks.

In this case, as seen in the stack trace listed earlier, CsqAcquireLock
gets called in user thread context, which is at PASSIVE_LEVEL. To emphasize
my earlier observation: this works on a single-processor machine.

Chandra

On 6/19/07, Maxim S. Shatskih wrote:
>
> Am I wrong that CSQ callback can be called on DISPATCH_LEVEL?

Technically you are correct that you can use resource locks for your
cancel locks. However, a brief search of the DDK samples and elsewhere
indicates that the common practice is to use a spin lock. Just as an
experiment, why not replace the resource locks with spin locks and see if
that fixes your problem?

Resources are used extensively in the file system space and are
certainly MP safe. The debugger !locks command ought to show you who is
holding your resource.
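
A minimal sketch of the spin lock experiment suggested above, reusing the callback names from the original post. The field IrpQueueSpinLock is hypothetical (it replaces the ERESOURCE IrpQueueLock), and everything between the stand-in markers is a user-mode placeholder so that the fragment compiles outside the WDK. In a driver, drop that section, include <wdm.h>, and call KeInitializeSpinLock on the field before the queue is first used.

```c
#include <assert.h>
#include <stddef.h>

/* ---- user-mode stand-ins, NOT the real WDK definitions ---- */
typedef unsigned char KIRQL, *PKIRQL;
typedef int KSPIN_LOCK, *PKSPIN_LOCK;
typedef struct _IO_CSQ { int Reserved; } IO_CSQ, *PIO_CSQ;
enum { PASSIVE_LEVEL = 0, DISPATCH_LEVEL = 2 };
static KIRQL g_CurrentIrql = PASSIVE_LEVEL;   /* models one CPU's IRQL */
#define CONTAINING_RECORD(address, type, field) \
    ((type *)((char *)(address) - offsetof(type, field)))

static void KeAcquireSpinLock(PKSPIN_LOCK Lock, PKIRQL OldIrql)
{
    *OldIrql = g_CurrentIrql;         /* hand the previous IRQL to the caller */
    g_CurrentIrql = DISPATCH_LEVEL;   /* spin locks are held at DISPATCH_LEVEL */
    assert(*Lock == 0);               /* single-threaded model: never contended */
    *Lock = 1;
}

static void KeReleaseSpinLock(PKSPIN_LOCK Lock, KIRQL NewIrql)
{
    *Lock = 0;
    g_CurrentIrql = NewIrql;          /* restore the IRQL saved at acquire */
}
/* ---- end of stand-ins ---- */

typedef struct _MyDrv_DEVICE_EXTENSION {
    IO_CSQ     CancelSafeQueue;
    KSPIN_LOCK IrpQueueSpinLock;      /* hypothetical: replaces IrpQueueLock */
} MyDrv_DEVICE_EXTENSION, *PMyDrv_DEVICE_EXTENSION;

/* CsqAcquireLock callback: the CSQ library supplies Irql as storage for
   the previous IRQL and passes the value back to the release callback. */
void MyDrvAcquireLock(PIO_CSQ Csq, PKIRQL Irql)
{
    PMyDrv_DEVICE_EXTENSION devExtension =
        CONTAINING_RECORD(Csq, MyDrv_DEVICE_EXTENSION, CancelSafeQueue);

    KeAcquireSpinLock(&devExtension->IrpQueueSpinLock, Irql);
}

/* CsqReleaseLock callback: drops the lock and restores the saved IRQL. */
void MyDrvReleaseLock(PIO_CSQ Csq, KIRQL Irql)
{
    PMyDrv_DEVICE_EXTENSION devExtension =
        CONTAINING_RECORD(Csq, MyDrv_DEVICE_EXTENSION, CancelSafeQueue);

    KeReleaseSpinLock(&devExtension->IrpQueueSpinLock, Irql);
}
```

The Irql parameter, unused in the ERESOURCE version, carries the previous IRQL from acquire to release. Because the queue's insert, remove, and peek callbacks then run at DISPATCH_LEVEL while the lock is held, they must not touch pageable memory or wait.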


From: xxxxx@lists.osr.com
[mailto:xxxxx@lists.osr.com] On Behalf Of chandra97 97
Sent: Tuesday, June 19, 2007 2:47 PM
To: Windows System Software Devs Interest List
Subject: Re: [ntdev] CSQ on a multi-processor machine

I allocate ERESOURCE in device extension which is stored in non-paged
pool. As per documentation of Csq the locks used in callback need not
necessarily be spin locks.

In this case as seen in the stack trace listed earlier, the
CsqAcquireLock gets called in user thread context which is at
PASSIVE_LEVEL. To emphasize my earlier observation- this works on a
single-processor machine.

Chandra


Questions? First check the Kernel Driver FAQ at
http://www.osronline.com/article.cfm?id=256

To unsubscribe, visit the List Server section of OSR Online at
http://www.osronline.com/page.cfm?name=ListServer


ERESOURCE can be reacquired by the current owner … I don’t know why that would cause a problem unless you’re actually being called in a DPC and have just been lucky enough that you haven’t hit a non-resident page yet.

I like Mark’s suggestion. If you think it may be the ERESOURCE, then switch to SPINLOCK and see if the problem goes away.

-p

From: xxxxx@lists.osr.com [mailto:xxxxx@lists.osr.com] On Behalf Of Roddy, Mark
Sent: Tuesday, June 19, 2007 12:32 PM
To: Windows System Software Devs Interest List
Subject: RE: [ntdev] CSQ on a multi-processor machine

Technically you are correct that you can use resource locks for your cancel locks. However, a brief search of the ddk samples and elsewhere indicates that the common practice is to use a spinlock. Just as an experiment, why not replace the resource locks with spinlocks and see if that fixes your problem?

Resources are used extensively in the file system space and are certainly mP safe. The debugger !locks command ought to show you who is holding your resource.



Yes, using SPINLOCK solves the problem.

Thanks
Chandra

On 6/19/07, Peter Wieland wrote:
>
> ERESOURCE can be reacquired by the current owner … I don’t know why that
> would cause a problem unless you’re actually being called in a DPC and have
> just been lucky enough that you haven’t hit a non-resident page yet.
>
> I like Mark’s suggestion. If you think it may be the ERESOURCE, then
> switch to SPINLOCK and see if the problem goes away.
>
> -p
>
>
>
> From: xxxxx@lists.osr.com [mailto:
> xxxxx@lists.osr.com] *On Behalf Of *Roddy, Mark
> Sent: Tuesday, June 19, 2007 12:32 PM
> To: Windows System Software Devs Interest List
> Subject: RE: [ntdev] CSQ on a multi-processor machine
>
>
>
> Technically you are correct that you can use resource locks for your
> cancel locks. However, a brief search of the ddk samples and elsewhere
> indicates that the common practice is to use a spinlock. Just as an
> experiment, why not replace the resource locks with spinlocks and see if
> that fixes your problem?
>
>
>
> Resources are used extensively in the file system space and are certainly
> mP safe. The debugger !locks command ought to show you who is holding your
> resource.
>
>
> ------------------------------
>
> From: xxxxx@lists.osr.com [mailto:
> xxxxx@lists.osr.com] *On Behalf Of *chandra97 97
> Sent: Tuesday, June 19, 2007 2:47 PM
> To: Windows System Software Devs Interest List
> Subject: Re: [ntdev] CSQ on a multi-processor machine
>
>
>
> I allocate ERESOURCE in device extension which is stored in non-paged
> pool. As per documentation of Csq the locks used in callback need not
> necessarily be spin locks.
>
> In this case as seen in the stack trace listed earlier, the CsqAcquireLock
> gets called in user thread context which is at PASSIVE_LEVEL. To emphasize
> my earlier observation- this works on a single-processor machine.
>
> Chandra
>
> On 6/19/07, Maxim S. Shatskih wrote:
>
> Am I wrong that CSQ callback can be called on DISPATCH_LEVEL?
>
> –
> Maxim Shatskih, Windows DDK MVP
> StorageCraft Corporation
> xxxxx@storagecraft.com
> http://www.storagecraft.com
>

Which raises the age-old question: “Sure, it works in practice, but in
theory?”

From: xxxxx@lists.osr.com
[mailto:xxxxx@lists.osr.com] On Behalf Of chandra97 97
Sent: Tuesday, June 19, 2007 5:52 PM
To: Windows System Software Devs Interest List
Subject: Re: [ntdev] CSQ on a multi-processor machine

Yes, using SPINLOCK solves the problem.

Thanks
Chandra

On 6/19/07, Peter Wieland wrote:

ERESOURCE can be reacquired by the current owner … I don’t know why that
would cause a problem unless you’re actually being called in a DPC and have
just been lucky enough that you haven’t found a non-resident page yet.

I like Mark’s suggestion. If you think it may be the ERESOURCE, then switch
to SPINLOCK and see if the problem goes away.

-p

From: xxxxx@lists.osr.com [mailto:xxxxx@lists.osr.com]
On Behalf Of Roddy, Mark
Sent: Tuesday, June 19, 2007 12:32 PM
To: Windows System Software Devs Interest List
Subject: RE: [ntdev] CSQ on a multi-processor machine

Technically you are correct that you can use resource locks for your cancel
locks. However, a brief search of the DDK samples and elsewhere indicates
that the common practice is to use a spinlock. Just as an experiment, why
not replace the resource locks with spinlocks and see if that fixes your
problem?

Resources are used extensively in the file system space and are certainly
MP-safe. The debugger !locks command ought to show you who is holding your
resource.

_____

From: xxxxx@lists.osr.com [mailto:xxxxx@lists.osr.com]
On Behalf Of chandra97 97
Sent: Tuesday, June 19, 2007 2:47 PM
To: Windows System Software Devs Interest List
Subject: Re: [ntdev] CSQ on a multi-processor machine

I allocate the ERESOURCE in the device extension, which is stored in
non-paged pool. Per the CSQ documentation, the locks used in the callbacks
need not be spin locks.

In this case, as the stack trace listed earlier shows, CsqAcquireLock is
called in the user thread’s context at PASSIVE_LEVEL. To emphasize my
earlier observation: this works on a single-processor machine.

Chandra

On 6/19/07, Maxim S. Shatskih wrote:

Am I wrong that a CSQ callback can be called at DISPATCH_LEVEL?


Maxim Shatskih, Windows DDK MVP
StorageCraft Corporation
xxxxx@storagecraft.com
http://www.storagecraft.com


While you can choose any lock you want to implement locking in a CSQ,
that lock must be acquirable at DISPATCH_LEVEL to be robust against all
callers. An ERESOURCE does not meet that requirement.
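
For anyone finding this thread later, here is a minimal sketch of what the spin-lock version of the two CSQ lock callbacks from the original post might look like. Only the part below the "Driver part" marker belongs in a driver. Everything above it is a user-mode stand-in, assumed purely for illustration so the sketch compiles without the WDK; a real driver would include <ntddk.h> instead and call KeInitializeSpinLock on the lock before initializing the queue. MYDRV_DEVICE_EXTENSION is a cut-down, hypothetical version of the poster's structure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* ---- User-mode stand-ins (assumption: illustration only). ----
 * In a driver, delete this part and #include <ntddk.h> instead. */
typedef unsigned char KIRQL, *PKIRQL;
typedef uintptr_t KSPIN_LOCK, *PKSPIN_LOCK;
typedef struct _IO_CSQ { int Placeholder; } IO_CSQ, *PIO_CSQ;

#define PASSIVE_LEVEL  0
#define DISPATCH_LEVEL 2
#define CONTAINING_RECORD(address, type, field) \
    ((type *)((char *)(address) - offsetof(type, field)))

static KIRQL g_CurrentIrql = PASSIVE_LEVEL;  /* models this CPU's IRQL */

static void KeAcquireSpinLock(PKSPIN_LOCK SpinLock, PKIRQL OldIrql)
{
    assert(g_CurrentIrql <= DISPATCH_LEVEL); /* legal up to DISPATCH_LEVEL */
    *OldIrql = g_CurrentIrql;                /* hand back the caller's IRQL */
    g_CurrentIrql = DISPATCH_LEVEL;          /* holder runs at DISPATCH_LEVEL */
    *SpinLock = 1;
}

static void KeReleaseSpinLock(PKSPIN_LOCK SpinLock, KIRQL NewIrql)
{
    *SpinLock = 0;
    g_CurrentIrql = NewIrql;                 /* drop back to the saved IRQL */
}

/* ---- Driver part: spin lock instead of ERESOURCE. ---- */
typedef struct _MYDRV_DEVICE_EXTENSION {
    KSPIN_LOCK IrpQueueLock;    /* was an ERESOURCE in the original post */
    IO_CSQ     CancelSafeQueue;
} MYDRV_DEVICE_EXTENSION, *PMYDRV_DEVICE_EXTENSION;

void MyDrvAcquireLock(PIO_CSQ Csq, PKIRQL Irql)
{
    PMYDRV_DEVICE_EXTENSION devExtension =
        CONTAINING_RECORD(Csq, MYDRV_DEVICE_EXTENSION, CancelSafeQueue);

    /* The previous IRQL is stored through Irql for the release callback. */
    KeAcquireSpinLock(&devExtension->IrpQueueLock, Irql);
}

void MyDrvReleaseLock(PIO_CSQ Csq, KIRQL Irql)
{
    PMYDRV_DEVICE_EXTENSION devExtension =
        CONTAINING_RECORD(Csq, MYDRV_DEVICE_EXTENSION, CancelSafeQueue);

    /* Irql is the value the acquire callback saved. */
    KeReleaseSpinLock(&devExtension->IrpQueueLock, Irql);
}
```

Unlike the ERESOURCE version, these callbacks use the Irql parameter, and no KeEnterCriticalRegion/KeLeaveCriticalRegion pair is needed around the lock.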

d


Doron, the documentation, to my surprise, explicitly says just the
opposite. You should go talk to the doc people.

“Drivers can use any locking mechanism to lock the queue, such as
mutexes. For more information about mutexes, see Mutex Objects.”

- the WDK 6000 docs.
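
One way to read the two statements together: the docs let you pick any lock, but every lock has a highest IRQL at which a blocking acquire is allowed, so the choice is only safe if no caller of the CSQ routines runs above that level. The toy below is a user-mode model of that rule, not driver code. LockKind, MaxAcquireIrql and CsqLockIsSafe are names made up for this sketch; the ceilings follow the IRQL requirements documented for each acquire routine.

```c
#include <assert.h>
#include <stdbool.h>

/* IRQL values as the WDK headers define them on x86/x64. */
enum { PASSIVE_LEVEL = 0, APC_LEVEL = 1, DISPATCH_LEVEL = 2 };

/* Hypothetical tags for the lock types discussed in this thread. */
typedef enum {
    LOCK_SPIN,        /* KeAcquireSpinLock */
    LOCK_FAST_MUTEX,  /* ExAcquireFastMutex */
    LOCK_ERESOURCE,   /* ExAcquireResourceExclusiveLite */
    LOCK_KMUTEX       /* KeWaitForSingleObject with a non-zero timeout */
} LockKind;

/* Highest IRQL at which a blocking acquire of the lock is documented
 * to be legal. */
static int MaxAcquireIrql(LockKind kind)
{
    switch (kind) {
    case LOCK_SPIN:       return DISPATCH_LEVEL;
    case LOCK_FAST_MUTEX: return APC_LEVEL;
    case LOCK_ERESOURCE:  return APC_LEVEL;
    case LOCK_KMUTEX:     return APC_LEVEL;
    }
    return PASSIVE_LEVEL; /* unknown kind: assume the strictest limit */
}

/* A lock is a safe choice for the CSQ callbacks only if the highest IRQL
 * at which any CSQ routine may be called does not exceed its ceiling. */
bool CsqLockIsSafe(LockKind kind, int highestCallerIrql)
{
    assert(highestCallerIrql >= PASSIVE_LEVEL);
    return highestCallerIrql <= MaxAcquireIrql(kind);
}
```

With this model, CsqLockIsSafe(LOCK_ERESOURCE, PASSIVE_LEVEL) holds but CsqLockIsSafe(LOCK_ERESOURCE, DISPATCH_LEVEL) does not, while a spin lock passes in both cases.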

From: xxxxx@lists.osr.com
[mailto:xxxxx@lists.osr.com] On Behalf Of Doron Holan
Sent: Wednesday, June 20, 2007 12:38 PM
To: Windows System Software Devs Interest List
Subject: RE: [ntdev] CSQ on a multi-processor machine

While you can choose any lock you want to implement locking in a CSQ,
that lock must be acquirable at DISPATCH_LEVEL to be robust against all
callers. An ERESOURCE does not meet that requirement.

d
