Too many Irp stacks to be believed (>30)!!

Hi, I have a crash in the 1394 isoch callback function where I reuse a work
IRP each time to call down the 1394 stack to detach the buffer (completion
routine loads more data and calls down to reattach etc).

The analyse output is below, but the interesting report is this:

0: kd> !irp 833cd008
Irp is active with 255 stacks 255 is current (= 0x833cd054)
No Mdl: No System Buffer: Thread 00000000: Too many Irp stacks to be believed (>30)!!
0: kd> !irp 833cd008 1
Irp is active with 255 stacks 255 is current (= 0x833cd054)
No Mdl: No System Buffer: Thread 00000000: Too many Irp stacks to be believed (>30)!!
0: kd>

I’d like to find out more about the sordid history of this IRP but it
doesn’t want to show me any.

Any suggestions? Thanks, Mike

The debugger shows:

NO_MORE_IRP_STACK_LOCATIONS (35)
A higher level driver has attempted to call a lower level driver through
the IoCallDriver() interface, but there are no more stack locations in the
packet, hence, the lower level driver would not be able to access its
parameters, as there are no parameters for it. This is a disasterous
situation, since the higher level driver “thinks” it has filled in the
parameters for the lower level driver (something it MUST do before it calls
it), but since there is no stack location for the latter driver, the former
has written off of the end of the packet. This means that some other memory
has probably been trashed at this point.
Arguments:
Arg1: 833cd008, Address of the IRP
Arg2: 00000000
Arg3: 00000000
Arg4: 00000000
Debugging Details:

DEFAULT_BUCKET_ID: DRIVER_FAULT
BUGCHECK_STR: 0x35
PROCESS_NAME: Idle
LAST_CONTROL_TRANSFER: from 8052015d to 805371aa
STACK_TEXT:
80556388 8052015d 00000035 833cd008 00000000 nt!KeBugCheckEx+0x1b
805563a0 f7bba5f1 83478350 833c4540 0000d868 nt!IopfCallDriver+0x17
805563b4 f780c5f9 83728258 833bab18 804db68a lm1394!IsochTxCallback+0x97 [c:\development\lm1394\driver\lm1394\isochapi.c @ 2970]
805563ec f780d209 8475a0e0 00000000 00000001 ohci1394!OhciHandleIsochInt+0x21d
80556428 804dcd22 8475b624 8475a0e0 00000000 ohci1394!OhciIsochDpc+0x57
80556450 804dcc07 00000000 0000000e 00000000 nt!KiRetireDpcList+0x61
80556454 00000000 0000000e 00000000 00000000 nt!KiIdleLoop+0x28

STACK_COMMAND: kb
FOLLOWUP_IP:
lm1394!IsochTxCallback+97 [c:\development\lm1394\driver\lm1394\isochapi.c @ 2970]
f7bba5f1 ?? ???
FAULTING_SOURCE_CODE:

IoCallDriver(pDeviceExtension->StackDeviceObject,
pCallbackControl->workIrp);

SYMBOL_STACK_INDEX: 2
FOLLOWUP_NAME: MachineOwner
MODULE_NAME: lm1394
IMAGE_NAME: lm1394.sys
DEBUG_FLR_IMAGE_TIMESTAMP: 45081b76
SYMBOL_NAME: lm1394!IsochTxCallback+97
FAILURE_BUCKET_ID: 0x35_lm1394!IsochTxCallback+97
BUCKET_ID: 0x35_lm1394!IsochTxCallback+97
Followup: MachineOwner
---------
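
The stack-location accounting behind this bugcheck can be modelled in a few lines of user-mode C. This is only a sketch: ModelIrp and the model_* functions are made-up names, not kernel APIs, and the decrement-and-check is a simplification of what IoCallDriver does. It also shows one possible reading of the "255 stacks" line: StackCount and CurrentLocation are signed 8-bit fields in the IRP, so a value of -1 shows up as 255 when viewed as unsigned. Whether that is what happened here, or whether the IRP memory was simply overwritten, cannot be told from the output alone.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified user-mode model of the two IRP fields involved.
   In the real IRP both are signed 8-bit values (CCHAR). */
typedef struct ModelIrp {
    signed char StackCount;       /* locations allocated with the IRP */
    signed char CurrentLocation;  /* StackCount + 1 when freshly initialised */
} ModelIrp;

/* Models the counter setup done when an IRP is initialised. */
static void model_init_irp(ModelIrp *irp, signed char stack_size)
{
    irp->StackCount = stack_size;
    irp->CurrentLocation = (signed char)(stack_size + 1);
}

/* Models the check in IoCallDriver: step to the next location and
   report failure (bugcheck 0x35 in the kernel) if none is left. */
static bool model_call_driver(ModelIrp *irp)
{
    irp->CurrentLocation--;
    return irp->CurrentLocation > 0;
}

/* How a signed 8-bit counter looks when it is printed as unsigned. */
static unsigned model_as_unsigned(signed char value)
{
    return (unsigned char)value;
}
```

With a stack size of 2, the first two model_call_driver() calls succeed and the third reports that no location is left, which is the condition the bugcheck text describes.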

How are you reinitializing the PIRP in the completion routine? By
calling IoReuseIrp?

d

– I can spell, I just can’t type.

-----Original Message-----
From: xxxxx@lists.osr.com
[mailto:xxxxx@lists.osr.com] On Behalf Of Mike Kemp
Sent: Wednesday, September 20, 2006 9:51 AM
To: Windows System Software Devs Interest List
Subject: [ntdev] Too many Irp stacks to be believed (>30)!!

Questions? First check the Kernel Driver FAQ at
http://www.osronline.com/article.cfm?id=256

To unsubscribe, visit the List Server section of OSR Online at
http://www.osronline.com/page.cfm?name=ListServer

Hi Doron

Before I (re)use the IRP in the callback I do this:

IoReuseIrp(pCallbackControl->workIrp, STATUS_SUCCESS);

IoSetCompletionRoutine(pCallbackControl->workIrp,
                       IsochTXCallbackDetachCompletionRoutine,
                       pCallbackControl, TRUE, TRUE, TRUE);
IoCallDriver(pDeviceExtension->StackDeviceObject, pCallbackControl->workIrp);

Then in the detach completion routine I reuse it again to reattach:

IoReuseIrp(pCallbackControl->workIrp, STATUS_SUCCESS);
IoSetCompletionRoutine(pCallbackControl->workIrp,
                       IsochTXCallbackReattachCompletionRoutine,
                       pCallbackControl, TRUE, TRUE, TRUE);
// and now re-attach
IoCallDriver(pDeviceExtension->StackDeviceObject, pCallbackControl->workIrp);

Maybe I should not be using it IN the completion routine. Is it reusable yet?

BTW these are IRPs I created at the start with:

StackSize = pDeviceExtension->StackDeviceObject->StackSize;
pIrp = IoAllocateIrp(StackSize, FALSE);

Like most of these things, it works most of the time!

Thanks, Mike
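
As a rough illustration of why reusing the IRP from inside its own completion routine is expected to be safe, here is a user-mode model of the counters during a detach/reattach cycle. All names (ModelIrp, model_*) are hypothetical, and the model assumes well-behaved lower drivers: each call down consumes one stack location and each completion step on the way back up releases one, so by the time the owner's completion routine runs the counter is back where it started.

```c
#include <assert.h>
#include <stdbool.h>

typedef struct ModelIrp {
    signed char StackCount;
    signed char CurrentLocation;
} ModelIrp;

/* Counter reset modelled on what re-initialising an IRP does. */
static void model_reuse_irp(ModelIrp *irp)
{
    irp->CurrentLocation = (signed char)(irp->StackCount + 1);
}

/* Going down: every call to the next driver consumes one stack location. */
static bool model_call_down(ModelIrp *irp, int drivers_below)
{
    for (int i = 0; i < drivers_below; i++) {
        if (--irp->CurrentLocation <= 0)
            return false;           /* the kernel would bugcheck 0x35 here */
    }
    return true;
}

/* Coming back up: completion releases one location per driver. */
static void model_complete_up(ModelIrp *irp, int drivers_below)
{
    for (int i = 0; i < drivers_below; i++)
        irp->CurrentLocation++;
}

/* Detach then reattach with the same IRP, `rounds` times. Returns true if
   the counter is back at its initial value at every completion. */
static bool model_detach_reattach(ModelIrp *irp, int drivers_below, int rounds)
{
    const signed char expected = (signed char)(irp->StackCount + 1);
    for (int i = 0; i < rounds * 2; i++) {
        model_reuse_irp(irp);
        if (!model_call_down(irp, drivers_below))
            return false;
        model_complete_up(irp, drivers_below);
        if (irp->CurrentLocation != expected)
            return false;
    }
    return true;
}
```

In this model the only way to run out of locations is for the stack below to be deeper than the StackSize used at allocation, or for some driver to break the one-down, one-up pairing.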

From: Doron Holan
To: Windows System Software Devs Interest List
Sent: Wednesday, September 20, 2006 7:38 PM
Subject: RE: [ntdev] Too many Irp stacks to be believed (>30)!!

How are you reinitializing the PIRP in the completion routine? By
calling IoReuseIrp?

d

– I can spell, I just can’t type.

That looks OK. You can reuse the IRP in the completion routine. You are returning STATUS_MORE_PROCESSING_REQUIRED from the completion routines, right? One thing you could do is check the current stack location pointer value each time the IRP completes. It should always be the same value. If it changes linearly over time, someone is not playing nice with the stack locations and eventually you run out.

d
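
The check described above could be kept as a small piece of bookkeeping next to each work IRP. The sketch below is user-mode C with invented names (StackLocationWatch, watch_observe); in a driver the value passed in would presumably be the IRP's current stack location pointer as seen at the top of the completion routine, and a mismatch would be the point to break into the debugger.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-IRP bookkeeping kept by the driver. */
typedef struct StackLocationWatch {
    uintptr_t first_seen;   /* pointer value observed on the first completion */
    bool      armed;        /* false until the first observation */
    unsigned  mismatches;   /* completions where the pointer differed */
} StackLocationWatch;

/* Call once per completion with the observed pointer.
   Returns false when the value differs from the first one seen. */
static bool watch_observe(StackLocationWatch *w, const void *current_location)
{
    uintptr_t value = (uintptr_t)current_location;

    if (!w->armed) {
        w->first_seen = value;
        w->armed = true;
        return true;
    }
    if (value != w->first_seen) {
        w->mismatches++;
        return false;
    }
    return true;
}
```

Logging the observed value on every mismatch would also show whether the drift is linear, as suggested, or a one-off.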

Hi Doron

Thanks, I spent the morning re-reading Oney and it still looks okay to me.
I do return STATUS_MORE_PROCESSING_REQUIRED.

I don’t do anything with the IRP in the completion routine, just re-use it.
I’m also assuming it is okay to IoReuseIrp() the first time when it has not
been used before?

I’ve found out that this probably occurred when the device was unplugged so
I’m considering changing to IoSetCompletionRoutineEx(). I’m assuming this
will guarantee that I get to the completion routine and release its lock
then?

I’ll try the test you recommend.

Mike

From: xxxxx@Microsoft.com
To: Windows System Software Devs Interest List
Sent: Thursday, September 21, 2006 9:11 AM
Subject: RE:[ntdev] Too many Irp stacks to be believed (>30)!!

that looks ok. you can reuse in the completion routine. you are returning
status_more_processing_required from the completion routines right? one
thing you could do is see what the current stack location pointer value is
when the irp completes. it should always be the same value. if it changes
linearly over time, someone is not playing nice with the stack locations and
eventually you run out.

d



IoSetCompletionRoutineEx will keep your driver’s image in memory; it has
nothing to do with this problem though.

d

-----Original Message-----
From: xxxxx@lists.osr.com
[mailto:xxxxx@lists.osr.com] On Behalf Of Mike Kemp
Sent: Thursday, September 21, 2006 3:42 AM
To: Windows System Software Devs Interest List
Subject: Re: RE:[ntdev] Too many Irp stacks to be believed (>30)!!


Mike Kemp wrote:

Thanks, I spent the morning re-reading Oney and it still looks okay
to me. I do return STATUS_MORE_PROCESSING_REQUIRED.

I don’t do anything with the IRP in the completion routine, just
re-use it. I’m also assuming it is okay to IoReuseIrp() the first time
when it has not been used before?

Did you read the doc page? It depends on how you initially allocated
the IRP. Are you calling IoAllocateIrp first?


Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.

Yes, I allocate 4 IRPs when I initially allocate all the isoch comms stuff,
2 for TX, 2 for RX, and they each handle half the attached buffers.

e.g. 4 of these…

StackSize = pDeviceExtension->StackDeviceObject->StackSize;
pIrp = IoAllocateIrp(StackSize,FALSE);

Then when the callback comes I IoReuseIrp() the IRP for the detach operation,
with a detach completion routine.
In the detach completion routine I IoReuseIrp() the IRP for the reattach
operation, with a reattach completion routine.
In the reattach completion routine I don’t touch the IRP as I don’t need it
at this stage; it is reused on the next callback for the same set of buffers.

  • All completion routines return STATUS_MORE_PROCESSING_REQUIRED to make
    sure the IRPs are never touched again.

You get an IoReuseIrp() the first time round even though the IRP has not been
used before. It is difficult to see why this would be a problem, but I can flag
it to prevent this if it is.

Which doc page might I be missing?

Any ideas?

Thanks, Mike.

----- Original Message -----
From: Tim Roberts
To: Windows System Software Devs Interest List
Sent: Thursday, September 21, 2006 5:30 PM
Subject: Re: [ntdev] Too many Irp stacks to be believed (>30)!!

Mike Kemp wrote:

Thanks, I spent the morning re-reading Oney and it still looks okay
to me. I do return STATUS_MORE_PROCESSING_REQUIRED.

I don’t do anything with the IRP in the completion routine, just
re-use it. I’m also assuming it is okay to IoReuseIrp() the first time
when it has not been used before?

Did you read the doc page? It depends on how you initially allocated
the IRP. Are you calling IoAllocateIrp first?


Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.



A sneakier variation of one of Doron’s earlier suggestions might help you catch the offending culprit [if it is somehow traversing a stack deeper than it is supposed to be]:

(1) Add two to the stack size when you allocate the IRP. [I picked two since you are seeing -1 and the first location is 1; you can add more if you care to.]
(2) In each completion routine, before you reuse the IRP, check that the first two [or more if you added more] stack locations [beginning at (PIO_STACK_LOCATION)(pIrp + 1)] are all 0. If they’re not, breakpoint and take a look at which devices the IRP just went through. IRP stacks are consumed from the end forward, so these entries should never get used.
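
The guard-location check in step (2) amounts to scanning the extra locations for any non-zero byte. Below is a user-mode sketch of just that scan; ModelStackLocation is a small stand-in for IO_STACK_LOCATION, and GUARD_SLOTS and guard_slots_untouched are names invented for the example. It assumes the extra locations were zero to begin with.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for IO_STACK_LOCATION; the real structure is larger. */
typedef struct ModelStackLocation {
    unsigned char major_function;
    void *device_object;
    void *completion_routine;
    void *context;
} ModelStackLocation;

#define GUARD_SLOTS 2   /* extra locations requested at allocation time */

/* Returns true if every byte of the first GUARD_SLOTS locations is zero.
   first_location models the address (PIO_STACK_LOCATION)(pIrp + 1). */
static bool guard_slots_untouched(const ModelStackLocation *first_location)
{
    const unsigned char *bytes = (const unsigned char *)first_location;

    for (size_t i = 0; i < GUARD_SLOTS * sizeof(ModelStackLocation); i++) {
        if (bytes[i] != 0)
            return false;
    }
    return true;
}
```

Locations above the guard slots are expected to be written by the drivers below, so only the first GUARD_SLOTS entries are inspected.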

This won’t work if your problem is someone endlessly recursing [endlessly forwarding the IRP to its own device object, for instance]. But I think if that happened, you’d have already noticed it on the stack trace when it bugchecked. There could be memory corruptors, etc., too.

I’m not saying it’s a surefire remedy [I tried it experimentally, but I’m sure it wasn’t an exhaustive test], but it MIGHT catch something, and it’s not a ton of extra code.

Another thought: if it’s always the same IRP that bugchecks, adding one location and setting a debugger break on a pointer-sized write access to the device object pointer in the extra location should catch the perpetrator in the act. [I think it will probably also trigger every time you reuse the IRP. You could use breakpoint commands to just continue each time this happens, but that could leave you with a really long, eye-glazing history to wade through when you are finished.] As I recall it, the Intel architecture used to allow only one such breakpoint; if that’s no longer true, you could perhaps do this for all 4 of the IRPs.

Please ignore my last suggestion; having made it was embarrassment enough.

Doron’s initial suggestion is much better. Among other possibilities, it occurs to me that someone downstream may have improperly copied your completion routine into their next stack location, causing it to be invoked prematurely. In addition to the location, you could also check at the start of the completion routine whether the stack size in the IRP has been corrupted [it should never change].
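
A sanity check along those lines might look like the following user-mode sketch. ModelIrp, IrpCheck and check_irp_at_completion are hypothetical names, and the expectation that the current location equals the stack size plus one when the owner's completion routine runs is an assumption of this model. A lower-than-expected location would be consistent with the completion routine being invoked early from someone else's stack location.

```c
#include <assert.h>
#include <stdbool.h>

typedef struct ModelIrp {
    signed char StackCount;
    signed char CurrentLocation;
} ModelIrp;

typedef enum IrpCheck {
    IRP_CHECK_OK,
    IRP_CHECK_STACK_COUNT_CHANGED,   /* the stack size field was overwritten */
    IRP_CHECK_LOCATION_UNEXPECTED    /* completion ran at an unexpected level */
} IrpCheck;

/* Compare the IRP's counters with the stack size used at allocation. */
static IrpCheck check_irp_at_completion(const ModelIrp *irp,
                                        signed char allocated_stack_size)
{
    if (irp->StackCount != allocated_stack_size)
        return IRP_CHECK_STACK_COUNT_CHANGED;
    if (irp->CurrentLocation != (signed char)(allocated_stack_size + 1))
        return IRP_CHECK_LOCATION_UNEXPECTED;
    return IRP_CHECK_OK;
}
```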