INTERRUPT_EXCEPTION_NOT_HANDLED Following Assertion failure - code c0000420

I’ve been working on a driver for some time, now, and I just recently started running into crashes that occur a few seconds after installing the driver. I’m not really sure what I might have changed to cause the issue, and the debug output isn’t particularly helpful.

I’m seeing that there’s an assertion failure - code c0000420 that repeatedly comes up. Some searching on this forum has led me to believe that this usually means that a DPC is running too long. However, in those posts, that issue usually seems to be followed by a DPC_QUEUE_EXECUTION_TIMEOUT_EXCEEDED bugcheck. In my case, though, it is followed by an INTERRUPT_EXCEPTION_NOT_HANDLED bugcheck. I don’t believe that I’m doing anything too time-intensive in either my ISR or DPC, but I could be wrong.

I should note that I’m using a KMDF bus driver that handles interrupts and that calls into an AVStream child device driver (through a function pointer retrieved via a query interface) during the DPC. I’m not even sure which of these layered drivers is responsible for the crash.

Any ideas of what could be causing this issue, or how to further home in on and debug it?

Assertion failure - code c0000420 (first chance)
nt!KeAccumulateTicks+0x575:
fffff803`ba0ec2e5 cd2c int 2Ch
0: kd> gn
Assertion failure - code c0000420 (first chance)
nt!KeAccumulateTicks+0x575:
fffff803`ba0ec2e5 cd2c int 2Ch
1: kd> g
Continuing an assertion failure can result in the debuggee
being terminated (bugchecking for kernel debuggees).
If you want to ignore this assertion, use ‘ahi’.
If you want to force continuation, use ‘gh’ or ‘gn’.
1: kd> gn

*** Fatal System Error: 0x0000003d
(0xFFFFF803B95355D0,0x0000000000000000,0x0000000000000000,0xFFFFF803BA0EC2E5)

Break instruction exception - code 80000003 (first chance)

A fatal system error has occurred.
Debugger entered on first try; Bugcheck callbacks have not been invoked.

A fatal system error has occurred.

Connected to Windows 7 9200 x64 target at (Mon Jul 22 08:52:52.418 2013 (UTC - 4:00)), ptr64 TRUE
Loading Kernel Symbols



Loading User Symbols

Loading unloaded module list

*******************************************************************************
* *
* Bugcheck Analysis *
* *
*******************************************************************************

Use !analyze -v to get detailed debugging information.

BugCheck 3D, {fffff803b95355d0, 0, 0, fffff803ba0ec2e5}

Probably caused by : ntkrnlmp.exe ( nt!KeAccumulateTicks+575 )

Followup: MachineOwner

nt!RtlpBreakWithStatusInstruction:
fffff803`ba0f0930 cc int 3
0: kd> !analyze -v
*******************************************************************************
* *
* Bugcheck Analysis *
* *
*******************************************************************************

INTERRUPT_EXCEPTION_NOT_HANDLED (3d)
Arguments:
Arg1: fffff803b95355d0
Arg2: 0000000000000000
Arg3: 0000000000000000
Arg4: fffff803ba0ec2e5

Debugging Details:

CONTEXT: fffff803b95355d0 – (.cxr 0xfffff803b95355d0)
rax=0000000000000000 rbx=fffff803ba371180 rcx=0000000000000003
rdx=0000000000000000 rsi=0000000044944ec6 rdi=0000000000000001
rip=fffff803ba0ec2e5 rsp=fffff803b9535fd0 rbp=0000000000001cc2
r8=0000000000000000 r9=fffff803ba3cb880 r10=0000000000001125
r11=fffffa8006874810 r12=ffffffffc0000120 r13=0000000000000000
r14=0000000000000002 r15=0000000000000000
iopl=0 nv up ei pl nz na pe nc
cs=0010 ss=0000 ds=002b es=002b fs=0053 gs=002b efl=00000202
nt!KeAccumulateTicks+0x575:
fffff803`ba0ec2e5 cd2c int 2Ch
Resetting default scope

DEFAULT_BUCKET_ID: VISTA_DRIVER_FAULT

BUGCHECK_STR: 0x3D

PROCESS_NAME: System

CURRENT_IRQL: d

LAST_CONTROL_TRANSFER: from 0000000000000000 to fffff803ba0ec2e5

STACK_TEXT:
fffff803`b9535fd0 00000000`00000000 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : nt!KeAccumulateTicks+0x575

FOLLOWUP_IP:
nt!KeAccumulateTicks+575
fffff803`ba0ec2e5 cd2c int 2Ch

SYMBOL_STACK_INDEX: 0

SYMBOL_NAME: nt!KeAccumulateTicks+575

FOLLOWUP_NAME: MachineOwner

MODULE_NAME: nt

IMAGE_NAME: ntkrnlmp.exe

DEBUG_FLR_IMAGE_TIMESTAMP: 5010ac4b

STACK_COMMAND: .cxr 0xfffff803b95355d0 ; kb

FAILURE_BUCKET_ID: X64_0x3D_VRF_nt!KeAccumulateTicks+575

BUCKET_ID: X64_0x3D_VRF_nt!KeAccumulateTicks+575

Followup: MachineOwner

Another reason for this could be that you’re holding a spinlock for a long time, passing an incorrect IRQL to KeReleaseSpinLock, or releasing spinlocks in the wrong order, which would cause the IRQL to stay at DISPATCH_LEVEL.

STACK_COMMAND: .cxr 0xfffff803b95355d0 ; kb

What is the output from that?

Runtime tracing is generally helpful for debugging this sort of failure.

Mark Roddy

> —
> NTDEV is sponsored by OSR
>
> Visit the list at: http://www.osronline.com/showlists.cfm?list=ntdev
>
> OSR is HIRING!! See http://www.osr.com/careers
>
> For our schedule of WDF, WDM, debugging and other seminars visit:
> http://www.osr.com/seminars
>
> To unsubscribe, visit the List Server section of OSR Online at
> http://www.osronline.com/page.cfm?name=ListServer
>

Two problems:

  1. You’re staying at elevated IRQL for too long, which is causing the
    NT_ASSERT to fire. See other responses for where to go with that one.

  2. You’re continuing the NT_ASSERT generated exception with “Go Not Handled”
    (gn) versus “Go Handled” (gh). This tells the O/S that the exception is not
    handled and therefore the O/S is crashing with
    INTERRUPT_EXCEPTION_NOT_HANDLED. Using gh instead should avoid the crash,
    though that doesn’t resolve the underlying issue of course.

-scott
OSR


When running the .cxr/kb command, I received the following output (note that this is from a different instance of the crash, so the addresses/values may have changed from my previous post):

rax=0000000000000000 rbx=fffff880009e7180 rcx=0000000000000003
rdx=0000000000000000 rsi=0000001300075d69 rdi=0000000000000001
rip=fffff8029f4772e5 rsp=fffff880009fafd0 rbp=000000000007f81e
r8=0000000000000000 r9=fffff880009f2e40 r10=000000000004c001
r11=fffffa80062632f0 r12=ffffffffc0000120 r13=0000000000000000
r14=0000000000000002 r15=0000000000000000
iopl=0 nv up ei pl nz na pe nc
cs=0010 ss=0018 ds=002b es=002b fs=0053 gs=002b efl=00000202
nt!KeAccumulateTicks+0x575:
fffff802`9f4772e5 cd2c int 2Ch
*** Stack trace for last set context - .thread/.cxr resets it
RetAddr : Args to Child : Call Site
fffff802`9f4abf11 : fffffa80`00000001 fffff880`009e7180 fffff880`009fb130 fffff780`00000320 : nt!KeAccumulateTicks+0x575
00000000`00000000 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : nt!KeUpdateRunTime+0x51

I’m not sure that that debug output is particularly useful to me, since the stack does not contain any frames in our driver(s).

Fortunately, thanks to the advice to check the spinlocks, I think that I *may* have found the cause of the issue (or if not, it’s definitely still something that requires fixing). It looks like we’re trying to use a standard spinlock in order to synchronize certain operations between the ISR and DPC. This not only violates the IRQL <= DISPATCH_LEVEL restriction on WdfSpinLockAcquire, but also introduces opportunity to deadlock. I think that the correct alternative would be to get rid of our spinlock and use WdfInterruptAcquire/ReleaseLock in place of the WdfSpinLockAcquire/Release in the DPC.

(The spinlock was always there and hadn’t been causing issues previously, but we have changed the operation of the DPC to do a lot more now, so I believe that probably contributed to the problem starting to manifest itself.)

Also, just to gain some insight, how long is inappropriately long to run at DISPATCH_LEVEL? Right now, during our DPC, we copy a 1080p video frame from a DMA common buffer into a user buffer from AVStream and submit it. Is this appropriate to do at DISPATCH_LEVEL?

xxxxx@cspeed.com wrote:

> Fortunately, thanks to the advice to check the spinlocks, I think that I *may* have found the cause of the issue (or if not, it’s definitely still something that requires fixing). It looks like we’re trying to use a standard spinlock in order to synchronize certain operations between the ISR and DPC. This not only violates the IRQL <= DISPATCH_LEVEL restriction on WdfSpinLockAcquire, but also introduces opportunity to deadlock. I think that the correct alternative would be to get rid of our spinlock and use WdfInterruptAcquire/ReleaseLock in place of the WdfSpinLockAcquire/Release in the DPC.

OK, but remember that using the interrupt spinlock means you will be
raised to the device IRQL, which is even higher than dispatch. That
means there are other APIs you can’t call.

> Also, just to gain some insight, how long is inappropriately long to run at DISPATCH_LEVEL? Right now, during our DPC, we copy a 1080p video frame from a DMA common buffer into a user buffer from AVStream and submit it. Is this appropriate to do at DISPATCH_LEVEL?

If that’s 3 bytes per pixel, you’re talking 6 MB, which is 1.5 million
cycles. On a 3 GHz CPU, that’s 500 microseconds. That shouldn’t be too
long for dispatch, but it may be too long for DIRQL. Do you really need
to lock out your interrupt while you’re doing the copy? Remember that
locking out the interrupt doesn’t stop the hardware. If the hardware is
continuing to write your common buffer, there’s still the opportunity
for an overlap.


Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.
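
The estimate above can be reproduced with a few lines of arithmetic. This is only a back-of-the-envelope sketch: the copy rate of 4 bytes per cycle is an assumption inferred from the "6 MB is 1.5 million cycles" figures, and the function names are made up for illustration. Real throughput out of a common buffer depends on its caching attributes and on memory bandwidth, so treat the result as a rough lower bound and measure on the target.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes in one uncompressed video frame. */
uint64_t video_frame_bytes(uint32_t width, uint32_t height, uint32_t bytes_per_pixel)
{
    return (uint64_t)width * height * bytes_per_pixel;
}

/* Estimated copy time in microseconds.
 * bytes_per_cycle is an ASSUMED average copy rate, not a measured one.
 * cpu_mhz is the clock frequency in MHz, i.e. cycles per microsecond. */
double copy_time_us(uint64_t bytes, double bytes_per_cycle, double cpu_mhz)
{
    assert(bytes_per_cycle > 0.0 && cpu_mhz > 0.0);
    double cycles = (double)bytes / bytes_per_cycle;
    return cycles / cpu_mhz;
}
```

For a 1920x1080 frame at 3 bytes per pixel, `video_frame_bytes(1920, 1080, 3)` is 6,220,800 bytes, and `copy_time_us` with 4 bytes per cycle at 3000 MHz lands a little above 500 microseconds, which matches the figure quoted in the reply.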

> Break instruction exception - code 80000003 (first chance)

This says your code hit a breakpoint instruction. There are a few of
these buried in places where they usually have a comment of the form
// This case can never happen
or some equivalent comment near that place. Which means you have somehow
forced the kernel into an impossible state, most often caused by
accidentally overwriting some critical piece of storage that causes an
inconsistent data structure. So that’s sort of where I’d start looking:
for memory damage issues.

> nt!RtlpBreakWithStatusInstruction:
> fffff803`ba0f0930 cc int 3

Yep, here it is. Definitely bad. You should not be at this instruction.
Now, the question is, how did you get there?

> LAST_CONTROL_TRANSFER: from 0000000000000000 to fffff803ba0ec2e5

OK, this is suspicious. It suggests that you tried to call a function at
location 0, and the transfer is probably to the memory-access-denied
handler.

There are lots of possible causes, but far and away the most common cause
is having a buffer overrun on the stack that clobbers the return address.
I’d look for that first. Since this is Win64, the other most common cause,
mixing __cdecl and __stdcall, would not apply (there is only one calling
convention for Win64, and __cdecl and __stdcall are “noise words” the
compiler ignores).

> STACK_TEXT:
> fffff803`b9535fd0 00000000`00000000 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : nt!KeAccumulateTicks+0x575

Alternatively, it could have been an attempt to call something at
location 0. There are many possible causes of this; the most frequent one
in C++ is to have a pure virtual method, downcast the object to a
subclass, and call the virtual method. It takes some work to manage this,
but it /is/ doable. Clobbering the vtable is another, and is easy to do if
you ZeroMemory or RtlZeroMemory a C++ object for sizeof() bytes; the
vtable pointer is in the first sizeof(void*) bytes of the object, and gets
overwritten (I’ve seen this far too often). Or you implement your own idea
of a vtable but fail to populate one of the slots. There are lots of ways
to end up trying to call location 0.

I’m not at all sure what is specifically wrong here, but it smells like
memory damage.
joe
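
To make the last two failure modes concrete, here is a small user-mode C sketch of a hand-rolled dispatch table. Everything in it (`frame_ops`, `capture_object`, the reset helpers) is hypothetical and not taken from the driver under discussion. It shows a slot that was never populated, a whole-object zeroing that wipes the table pointer, and a guard that turns a jump to location 0 into an error return.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical hand-rolled "vtable": a table of function pointers. */
typedef struct frame_ops {
    int (*start)(void);
    int (*stop)(void);
} frame_ops;

/* Hypothetical object. The table pointer is the first member, mirroring
 * where a C++ compiler typically stores the vtable pointer. */
typedef struct capture_object {
    const frame_ops *ops;
    int frames_seen;
} capture_object;

int start_impl(void) { return 1; }

/* The .stop slot is never populated, so it is a null pointer. */
const frame_ops g_ops = { .start = start_impl };

/* Guarded calls: return 0 on success, -1 if the call would have gone
 * through a null pointer. */
int call_start(const capture_object *obj)
{
    assert(obj != NULL);
    if (obj->ops == NULL || obj->ops->start == NULL) return -1;
    return obj->ops->start() == 1 ? 0 : -1;
}

int call_stop(const capture_object *obj)
{
    assert(obj != NULL);
    if (obj->ops == NULL || obj->ops->stop == NULL) return -1;
    return obj->ops->stop();
}

/* Wrong: zeroing the whole object also wipes the table pointer
 * (on mainstream platforms all-bits-zero is the null pointer). */
void reset_wrong(capture_object *obj) { memset(obj, 0, sizeof(*obj)); }

/* Right: reset only the data members. */
void reset_right(capture_object *obj) { obj->frames_seen = 0; }
```

Without the guards, `call_stop` on a freshly built object, or any call after `reset_wrong`, would transfer control to address 0.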


A couple of things:

Unfortunately, I was a bit hasty with my last post: although the ISR was indeed calling WdfSpinLockAcquire (and should not have), I was mistaken in saying that the DPC was also attempting to use that lock. In fact, aside from its creation, the spinlock was only ever referenced in the ISR (it appeared to have no discernible purpose in our code and I have since removed it). Also, the DPC was already using WdfInterruptAcquire/ReleaseLock as it should.

Anyway, I’m afraid that even after removing the WdfSpinLockAcquire/Release from the ISR, I’m still running into the original issue.

I should clarify that, in the DPC, I’m NOT doing the video frame copy/submit with the interrupt lock held. The interrupt lock is only held to grab a few status register values that were saved in the interrupt’s context during the ISR. Once we release the interrupt lock, which I believe should bring us back to DISPATCH_LEVEL, we do the copy.

Is there any way to determine which routine(s) within the driver is running too long and causing the assertion failure?

Interesting. This is certainly a much different trail than I’ve been following, but perhaps it makes more sense given that I’m not seeing much in the way of opportunities for excessive DPC times.

Even though you mentioned that they are irrelevant to Win64, could you explain the significance of those “noise words” that you mentioned?

Your advice has reminded me that I had recently removed the “purecall.c” file that was part of the original AVShws sample driver upon which our AVStream driver is based. The limited documentation in that file made it sound as though it was only necessary for backwards compatibility for Windows 98 Gold, and so I was happy to remove it. Was this a mistake, and could this be part of the problem?

For easy reference, the “purecall.c” file is as follows:

/**************************************************************************

AVStream Simulated Hardware Sample

Copyright (c) 2001, Microsoft Corporation.

File:

purecall.c

Abstract:

This file contains the _purecall stub necessary for virtual function
usage in drivers on 98 gold.

History:

created 9/16/02

**************************************************************************/

/*************************************************

Function:

_purecall

Description:

_purecall stub for virtual function usage

Arguments:

None

Return Value:

0

*************************************************/
#pragma warning (disable : 4100 4131)
int __cdecl
_purecall (
    VOID
    )

{
    return 0;
}

“Is there any way to determine which routine(s) within the driver is
running too long and causing the assertion failure?”

What is on the stack when you hit the assertion? Don’t continue, analyze
the failure at that point.

Logging with timestamps can help to analyze where you are spending too much
time.

Mark Roddy
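
As a rough illustration of the timestamp logging suggested above, the bookkeeping could look like the sketch below. It is portable C rather than driver code, and the names (`span_stat`, `span_record`, `ticks_to_us`) are invented for this example. In a driver, the start and end ticks and the tick frequency would typically come from KeQueryPerformanceCounter, and the accumulated numbers would be emitted through whatever tracing the driver already uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Running statistics for one instrumented region (hypothetical helper). */
typedef struct span_stat {
    const char *name;      /* label used when the numbers are logged */
    uint64_t count;        /* number of recorded passes */
    uint64_t total_ticks;  /* sum of all durations, in counter ticks */
    uint64_t max_ticks;    /* longest single pass, in counter ticks */
} span_stat;

/* Record one pass through the region from its start and end timestamps. */
void span_record(span_stat *s, uint64_t start_ticks, uint64_t end_ticks)
{
    assert(s != NULL && end_ticks >= start_ticks);
    uint64_t elapsed = end_ticks - start_ticks;
    s->count += 1;
    s->total_ticks += elapsed;
    if (elapsed > s->max_ticks) s->max_ticks = elapsed;
}

/* Convert counter ticks to microseconds. Splitting into whole seconds and
 * a remainder avoids overflow for typical counter frequencies. */
uint64_t ticks_to_us(uint64_t ticks, uint64_t ticks_per_second)
{
    assert(ticks_per_second != 0);
    uint64_t whole = ticks / ticks_per_second;
    uint64_t rem = ticks % ticks_per_second;
    return whole * 1000000u + (rem * 1000000u) / ticks_per_second;
}
```

Keeping one `span_stat` per region (ISR, DPC entry to lock release, frame copy, submit) and logging `max_ticks` converted to microseconds is usually enough to see which region is the outlier.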


I thought that the limit for an ISR was 10 µs, and for a DPC was 50 µs.
500 µs is an order of magnitude larger. It sounds like the
failure mode here is the use of a common buffer. There are several
reasons a common buffer might be used:

  • it seemed like a good idea at the time
  • a mind-bogglingly poor design of the DMA interface requires
    contiguous physical pages
  • the destination of the video is not known until the data is examined
  • an internal staging buffer is required to keep up with the data rate

There are several alternatives that should be considered

  • rethink why it is being done. It wasn’t such a good idea after all
  • redesign the DMA interface. Any device that can’t do scatter/gather
    is out of touch with reality
  • rethink how the destination is determined
  • use async I/O and preload the device queue with a lot of fetch-frame
    requests

Synchronous I/O is often inappropriate for high-performance devices
because of the long app-kernel-app-kernel cycle time. If you have
performance problems with synchronous I/O, the first and best solution is
to abandon it as a concept. I have some experience with this. So you need
to consider the use of the common buffer, and whether or not it makes
sense to spend another 500 µs doing what your DMA device should have done
in the first place: transfer the data to the user space buffer directly.
joe


> Even though you mentioned that they are irrelevant to Win64, could you
> explain the significance of those “noise words” that you mentioned?

__cdecl is the “standard C interface” for calls. On the x86, it is
characterized by pushing the parameters right-to-left, with a potentially
variable number of arguments, issuing a CALL instruction, then upon
return, the parameters are stripped from the stack by adding an
appropriately-determined constant to ESP.

This sequence is particularly nasty for instruction prefetch, pipelining,
and the generally opportunistic non-sequential execution engine that
underlies the x86. It also produces larger code and executes more
instructions, but the primary failure is on overall instruction flow.
This linkage is required for functions like printf and any parameter list
that takes “…” in its parameter specification. <varargs.h> will not
work on other than cdecl.

__stdcall is the “platform-specific best linkage”. On the x86, parameters
are pushed right-to-left, then the CALL instruction is executed. Stack cleanup
is accomplished by a RET n instruction in the called function, which does
the return to the caller and at the same time strips n bytes of parameters
from the stack. There is no ADD instruction following the CALL.

__fastcall is sort of like __stdcall, except that the first two parameters
are passed in ECX and EDX. It is particularly useful for one-and-two
parameter functions.

Add to this mix the notion of the “frame pointer”. EBP is a pointer into
the stack such that all parameters are addressed as positive offsets from
EBP, and local variables as negative offsets. This means that [EBP+8]
always represents the same parameter, and [EBP-16] always represents the
same local variable, while ESP is bouncing up and down as other functions
are called. That ESP bouncing around really stresses the asynchronous
execution unit. The maintenance of EBP adds instructions to function
prologs and epilogs, and further stresses the asynchronous execution unit.
So there is a feature called “Frame Pointer Optimization” (or “FPO”) in
the recent compilers. What it does is, on an instruction-by-instruction
basis, simulate the frame pointer by knowing what its “instantaneous
offset” would be relative to the current ESP, and therefore knows that at
THIS value of EIP, the offset of [EBP+4] is really the same as [ESP+1C],
but at THAT value of EIP, the value that had been [EBP+4] is the same as
[ESP+24]. Now there is no need to manage EBP, which saves the
instructions that would have dealt with it, saves the pressure on the
asynchronous execution unit, frees up another computational register (thus
reducing pressure on the register allocation, and reducing register
spills).

Note that all x86 calling sequences still require a lot of ESP bouncing,
which is not good for the asynchronous execution unit. The x64 uses a
fixed RSP for functions. Parameters are not pushed onto the stack, but
MOVed into locations relative to RSP; in effect, the top elements of the
stack are the parameter positions for the next call, and the amount RSP is
extended on function prolog is sizeof(locals) + sizeof(max(parameters)),
the second term being the maximum size of the parameter lists of all the
functions that are called. Then, since many functions take 4 or fewer
parameters, it works like a hybrid of the x86 __fastcall; the first four
parameters are actually placed in registers (RCX, RDX, R8 and R9, if I
recall correctly), and in general never appear on the stack. There are
several exceptions to this, which I will not go into at this point, but
one of them is the “…” case.
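
The frame-size formula can be sketched as follows. The function names are hypothetical, and the sketch adds one assumption not stated above: the Microsoft x64 convention reserves four 8-byte stack slots of "home space" for every call, even when the arguments travel in registers. The 16-byte stack alignment is ignored to keep the arithmetic visible.

```c
#include <assert.h>
#include <stddef.h>

#define SLOT_BYTES    8u  /* one pointer-sized argument slot on x64 */
#define REGISTER_ARGS 4u  /* RCX, RDX, R8, R9 */

/* Bytes of outgoing-argument area needed to call a function that takes
   argc pointer-sized arguments (assumption: at least four slots are
   always reserved as home space). */
size_t outgoing_area(size_t argc)
{
    size_t slots = argc < REGISTER_ARGS ? REGISTER_ARGS : argc;
    return slots * SLOT_BYTES;
}

/* sizeof(locals) + the largest outgoing area among all called functions.
   A function that calls nothing needs no outgoing area at all. */
size_t frame_bytes(size_t local_bytes, const size_t *callee_argc,
                   size_t n_callees)
{
    size_t max_area = 0;
    assert(callee_argc != NULL || n_callees == 0);
    for (size_t i = 0; i < n_callees; ++i) {
        size_t area = outgoing_area(callee_argc[i]);
        if (area > max_area) {
            max_area = area;
        }
    }
    return local_bytes + max_area;
}
```

For a function with 24 bytes of locals that calls functions taking 1, 6 and 3 arguments, the six-argument callee dominates, so the result is 24 + 48 bytes, reserved once in the prolog.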

There is one more linkage type in the x86, __thiscall, which is generated
by the C++ compiler. Although you see it in all kinds of debugger and
linker contexts, you can’t write it in C or C++ source. It is used
for C++ method calls, and uses ECX as the repository of the ‘this’
pointer.
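
Seen from C, a method call is an ordinary function call with the object pointer as an extra argument. The sketch below uses a hypothetical `counter` type to show that hidden argument written out explicitly; under __thiscall on x86 the compiler would pass it in ECX rather than on the stack.

```c
#include <assert.h>

typedef struct {
    int count;
} counter;

/* C++ would write this as a member function, counter::add(int n).
   Here the object pointer is an explicit first parameter. */
int counter_add(counter *self, int n)
{
    assert(self != 0);
    self->count += n;
    return self->count;
}
```

Calling `counter_add(&c, 5)` on a zero-initialized `c` returns 5 and leaves `c.count` at 5.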

The function call type also determines the function naming for the linker:

__cdecl func => _func
__stdcall func => _func@n where n is the number of bytes of parameters
__fastcall func => @func@n as above, including the parameters in ECX and EDX
__thiscall xxx::func => you would not believe the mess
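
For the three C conventions, the decoration used by the Microsoft 32-bit x86 toolchain amounts to string formatting: a leading underscore for __cdecl, a leading underscore plus `@n` for __stdcall, and a leading `@` plus `@n` for __fastcall. A minimal sketch, with `convention` and `decorate` as hypothetical names (C++ mangling is deliberately left out):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

typedef enum { CONV_CDECL, CONV_STDCALL, CONV_FASTCALL } convention;

/* Writes the decorated form of name into out and returns the length that
   snprintf reports. param_bytes is ignored for CONV_CDECL. */
int decorate(char *out, size_t out_size, convention conv,
             const char *name, unsigned param_bytes)
{
    assert(out != NULL && name != NULL && strlen(name) > 0);
    switch (conv) {
    case CONV_STDCALL:
        return snprintf(out, out_size, "_%s@%u", name, param_bytes);
    case CONV_FASTCALL:
        return snprintf(out, out_size, "@%s@%u", name, param_bytes);
    case CONV_CDECL:
    default:
        return snprintf(out, out_size, "_%s", name);
    }
}
```

A function `func` taking two 4-byte parameters comes out as `_func`, `_func@8` or `@func@8`. This is why a caller and callee that disagree about the convention usually fail at link time with an unresolved external, unless the call goes through a function pointer.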

__thiscall does what is called “name mangling”, a topic far OT for this
group.
joe

> Your advice has reminded me that I had recently removed the “purecall.c”
> file that was part of the original AVShws sample driver upon which our
> AVStream driver is based. The limited documentation in that file made it
> sound as though it was only necessary for backwards compatibility for
> Windows 98 Gold, and so I was happy to remove it. Was this a mistake, and
> could this be part of the problem?
>
> For easy reference, the “purecall.c” file is as follows:
>
> /*
>
> AVStream Simulated Hardware Sample
>
> Copyright (c) 2001, Microsoft Corporation.
>
> File:
>
> purecall.c
>
> Abstract:
>
> This file contains the _purecall stub necessary for virtual function
> usage in drivers on 98 gold.
>
> History:
>
> created 9/16/02
>
> */
>
> /*
>
> Function:
>
> _purecall
>
> Description:
>
> _purecall stub for virtual function usage
>
> Arguments:
>
> None
>
> Return Value:
>
> 0
>
> */
> #pragma warning (disable : 4100 4131)
> int __cdecl
> _purecall (
>     VOID
>     )
>
> {
>     return 0;
> }
>
>
> —
> NTDEV is sponsored by OSR
>
> Visit the list at: http://www.osronline.com/showlists.cfm?list=ntdev
>
> OSR is HIRING!! See http://www.osr.com/careers
>
> For our schedule of WDF, WDM, debugging and other seminars visit:
> http://www.osr.com/seminars
>
> To unsubscribe, visit the List Server section of OSR Online at
> http://www.osronline.com/page.cfm?name=ListServer

For the record, FPO is pretty evil when it comes to debugging and has been off by default since XP. The insanity and pain that came with getting one extra register wasn’t worth it.

d

Bent from my phone


From: xxxxx@flounder.com
Sent: 7/22/2013 7:20 PM
To: Windows System Software Devs Interest List
Subject: RE: [ntdev] INTERRUPT_EXCEPTION_NOT_HANDLED Following Assertion failure - code c0000420


> for the record, FPO is pretty evil when it comes to debugging and has been
> off as a default since XP. The insanity and pain that came with getting
> one extra register wasn’t worth it

I agree completely. FPO makes for some of the most amazingly unreadable
code, ever. FPO was invented by the BLISS-11 compiler group at CMU in the
early 1970s. I spent many years debugging the resulting programs, and we
had no source debugger. Alternatives to using FPO abound, such as being
coated with honey and staked out on a nest of fire ants, being on the
receiving side of Death By A Thousand Cuts, and being forced to read the
drivel I write.
joe


> being forced to read the drivel I write.

Aw, c’mon Dr. Newcomer. I read every word. Maybe not for immediate
illumination, but at least because I learn something (or am reminded of
something forgotten) in every passage.

And in this case in unearthing BLISS-11 you made me think of another torture
to insert in that list: writing & debugging TECO macros …

Cheers,
Dave Cattley

> __thiscall does what is called “name mangling”, a topic far OT for this
> group.

No, this applies only to C++ methods, and it means: pass “this” in ECX.

All COM methods are such.


Maxim S. Shatskih
Microsoft MVP on File System And Storage
xxxxx@storagecraft.com
http://www.storagecraft.com

I believe I said it was for C++ methods…
joe


…(background music: Tico-Tico)…

I deliberately chose to avoid that time-sink. I knew where it would lead,
to some form of rehab where they reduce the complexity until the last week
before release you are down to programming in Dartmouth BASIC…
joe


David R. Cattley wrote:

> And in this case in unearthing BLISS-11 you made me think of another torture
> to insert in that list: writing & debugging TECO macros …

My business partner was a TECO hacker many years ago. One of their
“rite of passage” questions was to try to predict the effect of typing
your name as a TECO macro.

He implemented a (finite) Turing machine in TECO, proving the language
to be Turing-complete.


Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.

Unfortunately, I’m no longer able to reproduce the issue to debug it further. After backing out a chunk of my changes and re-applying them incrementally to deduce where the problem was introduced, I’m back to the code I had when the issue appeared, but the issue no longer shows up.

I’m not sure why I didn’t think to do this in the first place. That might have made short work of this.

Thanks for all of the advice from everyone, though. The idea that there might be stack corruption from mismatched calling conventions, etc. is a bit scary, as I’m not sure where I’d begin to debug that.

I hate knowing that there’s likely a nasty bug somewhere, and that I can’t seem to bring it out at present to catch it. Perhaps I’ll be back on this thread if/when it does come back.