NDIS IM Driver bugcheck

Hello all -

I am having a problem that I was hoping someone on this list may be able to
point me in the right direction of solving. I have an NDIS v4.0 deserialized
driver that has been in use on many servers for many years, running mostly
Win 2000 Advanced Server. The driver also appears to run correctly on Win XP
(although the Win 2000 servers are heavily loaded, whereas XP testing has been
done only on client PCs, not server-class PCs). I have been running it on a
quad-processor Windows 2003 server (enterprise edition) and it has been
bug-checking. The driver is a heavily modified version of the ImSamp sample.
What seems to be happening is that when my SendPacketsHandler is called, and
I internally queue the supplied packet for later processing (the driver
performs NAT, firewall, compression, and encryption functions), I call
NdisMSendComplete() on the original packet after setting the packet status
(via NDIS_SET_PACKET_STATUS()) to NDIS_STATUS_PENDING. This winds up
bugchecking in ExFreePoolWithTag().

I have copied a stack trace from the system that exhibits this behavior. I
cannot use a live debugger on this system, as it is several states away from
me, so I only have crash dumps to work with. I am assuming that I am missing
a subtle difference in the way NDIS operates on Win 2003, as this does not
seem to expose itself on any other systems. I realize that updating the
driver to an NDIS 5.x driver would be a good idea, but time constraints
don’t allow me to do that at this point in time.

Any hints on what direction to take in debugging this would be appreciated.
The stack trace follows. The name of the driver that is causing problems is
MPNAT2K.SYS (which is in the stack trace).

Thanks for any tips in advance,

Ed Lau

MidCore Software, Inc.
900 Straits Tpke.
Middlebury, CT 06762

*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************
Use !analyze -v to get detailed debugging information.
BugCheck C5, {10fe, 2, 0, 8056726b}
*** ERROR: Module load completed but symbols could not be loaded for e1000325.sys
Probably caused by : mpnat2k.sys ( mpnat2k!MPSendPackets+6d6 )
Followup: MachineOwner

0: kd> !analyze -v
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************
DRIVER_CORRUPTED_EXPOOL (c5)
An attempt was made to access a pageable (or completely invalid) address at an
interrupt request level (IRQL) that is too high. This is
caused by drivers that have corrupted the system pool. Run the driver
verifier against any new (or suspect) drivers, and if that doesn’t turn up
the culprit, then use gflags to enable special pool.
Arguments:
Arg1: 000010fe, memory referenced
Arg2: 00000002, IRQL
Arg3: 00000000, value 0 = read operation, 1 = write operation
Arg4: 8056726b, address which referenced memory
Debugging Details:

BUGCHECK_STR: 0xC5_2
CURRENT_IRQL: 2
FAULTING_IP:
nt!ExFreePoolWithTag+27b
8056726b 668b4602 mov ax,[esi+0x2]
DEFAULT_BUCKET_ID: DRIVER_FAULT
LAST_CONTROL_TRANSFER: from f6db52e0 to 8056726b
TRAP_FRAME: f78a2330 -- (.trap fffffffff78a2330)
ErrCode = 00000000
eax=00000000 ebx=00001104 ecx=00320001 edx=00310000 esi=000010fc edi=85231038
eip=8056726b esp=f78a23a4 ebp=f78a23e8 iopl=0 nv up ei pl nz na pe nc
cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000 efl=00010202
nt!ExFreePoolWithTag+0x27b:
8056726b 668b4602 mov ax,[esi+0x2] ds:0023:000010fe=???
Resetting default scope
STACK_TEXT:
f78a23e8 f6db52e0 00001104 00000000 00000000 nt!ExFreePoolWithTag+0x27b
f78a2404 f6d9f8b9 85d52128 00001104 00000000 tcpip!ICMPSendComplete+0x30
f78a243c f6d9fa83 859b5460 006b9da8 00000000 tcpip!IPSendComplete+0x124
f78a2460 f724f06e 85a79008 856b9da8 00000000 tcpip!ARPSendComplete+0xf4
f78a2484 f7044abb 859d6940 856b9da8 00000000 NDIS!ndisMSendCompleteX+0x6e
f78a25b8 f723604c 851d5008 f78a25e4 00000001 mpnat2k!MPSendPackets+0x6d6 [send_NT.c @ 665]
f78a25d8 f6d9fb9c 85b62f40 856b9da8 85a79008 NDIS!ndisMSendX+0x115
f78a2600 f6da3485 85a79008 856b9da8 855beb98 tcpip!ARPSendData+0x196
f78a262c f6da35de 855beb02 f78a2602 00000001 tcpip!ARPTransmit+0x7a
f78a2748 f6db56e3 f6de1140 02d52128 85295020 tcpip!IPTransmit+0x71f
f78a27cc f6db0228 8168923f 9b68923f 00000000 tcpip!SendEcho+0x325
f78a2828 f6da063f 859b5460 9b68923f 8168923f tcpip!ICMPRcv+0x173
f78a2888 f6da08dd 00000020 859b5460 00000000 tcpip!DeliverToUser+0x17b
f78a293c f6d9ef0f 859b5460 855ca99a 0000001a tcpip!IPRcvPacket+0x66c
f78a297c f6dac81c 00000000 851bf688 855ca978 tcpip!ARPRcvIndicationNew+0x147
f78a29ac f726381f 85a79008 851bf688 855ca978 tcpip!ARPRcv+0x40
f78a2a14 f7041c45 859d6940 f78a2d4c 00000001 NDIS!ethFilterDprIndicateReceivePacket+0x352
f78a2d6c f72636bf 851d5008 8570bf10 8591910a mpnat2k!CLReceiveIndication+0x1cc4 [recv_NT.c @ 880]
f78a2dd4 f7051a09 85ab6ad0 f78a2e4c 00000002 NDIS!ethFilterDprIndicateReceivePacket+0x209
WARNING: Stack unwind information not available. Following frames may be wrong.
f78a2df4 f7051f3e f78a2e14 f78a2e4c 00000002 e1000325+0xa09
f78a2e0c f705881a 85b08348 f78a2e4c 00000002 e1000325+0xf3e
f78a2f58 f70562f2 85ac4160 f78a2f8b 85ab6ad0 e1000325+0x781a
f78a2f80 f70514f4 00ac4160 f7254025 85b08348 e1000325+0x52f2
f78a2ff4 804e5ea6 f694b674 00000000 00000000 e1000325+0x4f4

FOLLOWUP_IP:
mpnat2k!MPSendPackets+6d6 [send_NT.c @ 665]
f7044abb e96f060000 jmp mpnat2k!MPSendPackets+0xd4a (f704512f)
SYMBOL_STACK_INDEX: 5
FOLLOWUP_NAME: MachineOwner
SYMBOL_NAME: mpnat2k!MPSendPackets+6d6
MODULE_NAME: mpnat2k
IMAGE_NAME: mpnat2k.sys
DEBUG_FLR_IMAGE_TIMESTAMP: 41f15553
STACK_COMMAND: .trap fffffffff78a2330 ; kb
FAILURE_BUCKET_ID: 0xC5_2_mpnat2k!MPSendPackets+6d6
BUCKET_ID: 0xC5_2_mpnat2k!MPSendPackets+6d6
Followup: MachineOwner

Well, your driver is overwriting memory somewhere, or corrupting data in
a way that makes tcpip corrupt itself.

I would take WinDbg’s advice and “Run the driver verifier against any
new (or suspect) drivers, and if that doesn’t turn up the culprit, then
use gflags to enable special pool.”

So the driver only crashes on quad-processor machines and not on
single-processor machines?

Check your synchronization code and make sure you’re MP safe.

Ivan Bohannon

-----Original Message-----
From: xxxxx@lists.osr.com
[mailto:xxxxx@lists.osr.com] On Behalf Of Ed Lau
Sent: Wednesday, January 26, 2005 12:28 PM
To: Windows System Software Devs Interest List
Subject: [ntdev] NDIS IM Driver bugcheck


Questions? First check the Kernel Driver FAQ at
http://www.osronline.com/article.cfm?id=256

You are currently subscribed to ntdev as: xxxxx@nsisoftware.com
To unsubscribe send a blank email to xxxxx@lists.osr.com

Thanks for the reply. The driver has been run extensively under Driver
Verifier (and Bounds Checker), with the special pool option enabled.
Verifier did not report anything (i.e., no verifier specific bugchecks),
even on the PC where it is exhibiting this problem.

I don’t think that the number of processors is actually relevant. It runs
fine on quad processor Win 2000 PCs, and hyperthreaded Win XP PCs (I have
never tried a quad processor Win XP machine). Setting the number of
processors to one via BOOT.INI on the problematic Win 2003 machine did not
alleviate the problem.

I have run the same driver on Win 2003 machines that are locally available
to me and could not cause the driver to break in a debugger at a point where
the same stack is present, so I suppose one of my questions is what is
causing Windows to go down the code path that it is, and what are the
particular functions in the stack trace used for? I would think that based
on the function names, an ARP packet has been received and for some reason,
an ICMP Echo request is then being generated locally, which doesn’t make
much sense to me.

Thanks again,

Ed Lau

MidCore Software, Inc.
900 Straits Tpke.
Middlebury, CT 06762

----- Original Message -----
From: “Bohannon, Ivan”
To: “Windows System Software Devs Interest List”
Sent: Friday, January 28, 2005 2:34 PM
Subject: RE: [ntdev] NDIS IM Driver bugcheck


----------

From: xxxxx@lists.osr.com[SMTP:xxxxx@lists.osr.com] on behalf of Ed Lau[SMTP:xxxxx@midcore.com]
Reply To: Windows System Software Devs Interest List
Sent: Friday, January 28, 2005 9:22 PM
To: Windows System Software Devs Interest List
Subject: Re: [ntdev] NDIS IM Driver bugcheck

I have run the same driver on Win 2003 machines that are locally available
to me and could not cause the driver to break in a debugger at a point where
the same stack is present, so I suppose one of my questions is what is
causing Windows to go down the code path that it is, and what are the
particular functions in the stack trace used for? I would think that based
on the function names, an ARP packet has been received and for some reason,
an ICMP Echo request is then being generated locally, which doesn’t make
much sense to me.

Nor to me. That comes from a mistaken assumption :-) If you examine how tcpip.sys registers its NDIS protocol, you’ll notice that all of the protocol handlers are named ArpXxx, so every send and receive is eventually handled by a function named ArpSomething. At least that was the case when I wrote NDIS drivers several years back. I’d interpret the stack trace as an ICMP echo packet being received, followed by an immediate reply, which causes the crash. I guess it isn’t very important, as the memory corruption is probably caused by your driver and not by tcpip itself.

Now, what to do. You’re very lucky because you have a crash dump; it should contain everything necessary to find the problem. Live debugging doesn’t give any advantage in such cases. Try to analyze it carefully, beginning with the stack trace. It seems you’re completing packet 0x856b9da8, which is then passed to the tcpip SendComplete handler. Later, address 0x006b9da8 is passed to the next function, which seems suspicious to me, as it is your packet address with the high byte zeroed. Examine the relevant tcpip code to see how that could occur. Finally, the invalid address 0x1104 is passed to ExFreePoolWithTag, which causes the crash. That can be a consequence of the previous problem or the result of another memory corruption. Try to find out where it could have come from.
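
To make the arithmetic above concrete, here is a minimal C sketch that re-derives the two suspicious values from the posted dump. It assumes that on 32-bit x86 ExFreePoolWithTag steps back 8 bytes from the pointer it is given to reach the pool header, which matches esi = ebx - 8 in the trap frame. The function names are made up for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: size of the pool block header on 32-bit x86. */
#define POOL_HEADER_SIZE_X86 8u

/* Address of the pool header that precedes an allocation at `block`. */
uint32_t pool_header_addr(uint32_t block)
{
    return block - POOL_HEADER_SIZE_X86;
}

/* Nonzero if `suspect` is `ptr` with its top byte cleared. */
int is_high_byte_zeroed(uint32_t ptr, uint32_t suspect)
{
    return (ptr & 0xFF000000u) != 0 && suspect == (ptr & 0x00FFFFFFu);
}

/* Re-derive the values that appear in the posted trap frame and stack. */
void check_posted_dump(void)
{
    /* ebx = 0x1104 is the pointer handed to ExFreePoolWithTag. */
    assert(pool_header_addr(0x00001104u) == 0x000010FCu);      /* esi  */
    /* The faulting instruction reads [esi+0x2]. */
    assert(pool_header_addr(0x00001104u) + 2u == 0x000010FEu); /* Arg1 */
    /* Packet 0x856b9da8 shows up one frame later as 0x006b9da8. */
    assert(is_high_byte_zeroed(0x856B9DA8u, 0x006B9DA8u));
}
```

So the faulting address in Arg1 follows directly from the bad pointer 0x1104; the open question is where 0x1104 itself came from.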

BTW, setting the packet status to NDIS_STATUS_PENDING just before completion seems strange to me, but maybe I haven’t understood your description correctly.

Best regards,

Michal Vodicka
UPEK, Inc.
[xxxxx@upek.com, http://www.upek.com]

Michal -

Thanks for the informed response. The reason that I had wanted to know what
was happening was that I had wanted to try to reproduce the problem locally
(rather than on a remote PC), which I have not had any luck doing, or at
least getting the OS to break in the same code path so that I could try to
see what those functions are doing (the xxxSendComplete functions). I
suppose I can just unassemble these functions in the debugger without
actually breaking into them.

I agree with your stack analysis, but I hadn’t noticed the 0x006b9da8
address - not knowing exactly what IPSendComplete has as parameters, I sort
of ignored them other than the packet address (which appears valid), and I
had noticed that 0x00001104 was being passed to ExFreePoolWithTag. The
tricky part is determining how that came to be.

The purpose of setting NDIS_STATUS_PENDING was to prevent NDIS from
completing the packet if processing is performed asynchronously. A “filter”
function is called between setting the packet status and the completion call,
and the completion call is not always made; that depends on the result of the
“filter” function. If the packet has not been copied and queued for later
processing, it is completed in my SendComplete handler; otherwise, the original
packet is completed but not transmitted, and a copy of the packet is
transmitted, dropped, modified, etc.

Thanks again,

Ed Lau

MidCore Software, Inc.
900 Straits Tpke.
Middlebury, CT 06762

----- Original Message -----
From: “Michal Vodicka”
To: “Windows System Software Devs Interest List”
Sent: Friday, January 28, 2005 3:45 PM
Subject: RE: [ntdev] NDIS IM Driver bugcheck


Ed,

I am not sure based on your original post if you did this but…

Make sure you enable VERIFIER for NDIS.SYS. It also helps to have the
CHECKED NDIS.SYS handy, since it contains a number of useful
diagnostic and verification capabilities.

Ed,

In your case I would recommend that you enable VERIFIER for NDIS.SYS,
TCPIP.SYS, your driver (of course), and the NIC drivers. Your driver and
the NIC drivers will most likely show almost no VERIFIER activity since
almost all of the OS specific interface used by them will be handled by
NDIS. If you don’t enable VERIFIER on NDIS.SYS, VERIFIER does not intercept
and process memory allocations, IRQL changes, SPINLOCK operations, etc. for
*your* driver since your driver calls NDIS.SYS to do these things typically.

In particular turn on Special Pool, Pool Tracking, IRQL Checking, and
Deadlock Detection. You might find I/O validation useful but in this case I
doubt it will matter.

With respect to NDIS_STATUS_PENDING, I may not have read your original post
correctly so set me straight if I got it wrong. I think you said that you
are queuing the transmit packet internally by setting the packet status to
NDIS_STATUS_PENDING and then calling NdisMSendComplete().

NdisMSendComplete() means that you are done with the packet. Setting the
status to NDIS_STATUS_PENDING and then calling it does not make sense. If
you want to hold onto the packet, fine: set the status to
NDIS_STATUS_PENDING while queueing the packets in your MiniportSendPackets()
routine, but don’t call NdisMSendComplete() until you are good and done with
the packet.

Because yours is a deserialized miniport, NDIS expects you to call
NdisMSendComplete() on every packet anyway. It is not going to look at the
status value in the OOB data to determine whether your driver queued the packet
or not. NDIS will assume that all packets have been ‘queued’ and will be
completed by NdisMSendComplete() at some later time.

When your driver is good and done processing the packet, it should *then*
call NdisMSendComplete() and specify a *final* status for the packet. A
final status cannot ever be NDIS_STATUS_PENDING. Whatever you do, don’t
access the packet after you call NdisMSendComplete(). You have given it
back. No fair changing your mind.
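
The ownership rule described above can be sketched as a small user-mode model in C. This is not NDIS code: model_send_packets() and model_send_complete() are hypothetical stand-ins for MiniportSendPackets() and NdisMSendComplete(), and the MODEL_ names are invented for this sketch. Only the two numeric status values are taken from the NDIS headers.

```c
#include <assert.h>
#include <stddef.h>

/* Numeric values match ndis.h; the MODEL_ names are made up for this sketch. */
#define MODEL_STATUS_SUCCESS 0x00000000u
#define MODEL_STATUS_PENDING 0x00000103u

typedef enum { OWNER_PROTOCOL = 0, OWNER_MINIPORT } Owner;

typedef struct Packet {
    Owner          owner;        /* who may touch the packet right now */
    unsigned       final_status; /* meaningful only after completion */
    int            completions;  /* must end up exactly 1 per send */
    struct Packet *next;         /* link for the miniport's internal queue */
} Packet;

typedef struct { Packet *head; Packet *tail; } SendQueue;

/* Stand-in for MiniportSendPackets(): take ownership and queue, nothing else. */
void model_send_packets(SendQueue *q, Packet **pkts, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        Packet *p = pkts[i];
        p->owner = OWNER_MINIPORT;
        p->next = NULL;
        if (q->tail != NULL) q->tail->next = p; else q->head = p;
        q->tail = p;
    }
}

/* Stand-in for NdisMSendComplete(): hand the packet back with a final status.
 * Returns 0 on success, -1 if the call would break the rules described above. */
int model_send_complete(Packet *p, unsigned status)
{
    if (status == MODEL_STATUS_PENDING) return -1; /* PENDING is never final */
    if (p->owner != OWNER_MINIPORT)     return -1; /* already given back */
    p->final_status = status;
    p->completions++;
    p->owner = OWNER_PROTOCOL;  /* the caller must not touch p after this */
    return 0;
}

/* Later processing: unlink each queued packet first, then complete it.
 * Returns the number of packets completed. */
size_t model_drain(SendQueue *q, unsigned status)
{
    size_t done = 0;
    if (status == MODEL_STATUS_PENDING) return 0;  /* leave the queue alone */
    while (q->head != NULL) {
        Packet *p = q->head;
        q->head = p->next;          /* read the link before giving p back */
        if (q->head == NULL) q->tail = NULL;
        p->next = NULL;
        assert(p->owner == OWNER_MINIPORT); /* queued packets are ours */
        if (model_send_complete(p, status) == 0) done++;
    }
    return done;
}
```

In the model, completing with MODEL_STATUS_PENDING or completing the same packet twice is simply rejected. The real calls have no such safety net, so the driver has to keep track of this itself.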

Only a serialized miniport really needs to set NDIS_STATUS_PENDING in the
OOB status of a packet in MiniportSendPackets(). That is how NDIS knows
whether or not the packet will be completed asynchronously. Notice that
neither in this case nor in any other do you set the OOB status when the
packet is completed. I suppose it could complete before NDIS reads the OOB
data to know it was pending in the first place, but I really don’t know if
this is possible given the locking around serialized miniports. The exact
purpose of the OOB data status value is a bit confusing. The point is
that the final status value is specified in the call to NdisMSendComplete(),
period.

If I got all of this wrong, I apologize in advance. However, turn on
verifier with NDIS.

Good Luck,

Dave Cattley
Consulting Engineer
Systems Software Development


Ed,

> Thanks for the informed response. The reason that I had wanted to know what
> was happening was that I had wanted to try to reproduce the problem locally
> (rather than on a remote PC), which I have not had any luck doing, or at
> least getting the OS to break in the same code path so that I could try to
> see what those functions are doing (the xxxSendComplete functions). I
> suppose I can just unassemble these functions in the debugger without
> actually breaking into them.

You can even disassemble them when analyzing a crash dump. When you examine the stack, the debugger can show you the relevant code for each entry, provided you have at least a kernel memory dump. Minidumps are almost useless, and it’s a pity that they are the default setting.

> I agree with your stack analysis, but I hadn’t noticed the 0x006b9da8
> address - not knowing exactly what IPSendComplete has as parameters, I sort
> of ignored them other than the packet address (which appears valid), and I
> had noticed that 0x00001104 was being passed to ExFreePoolWithTag. The
> tricky part is determining how that came to be.

The IDA disassembler can help in complicated cases. However, WinDbg is usually sufficient.

> The purpose of setting NDIS_STATUS_PENDING was to prevent NDIS from
> completing the packet if processing is performed asynchronously - a “filter”
> function is called between setting the packet status and the completion
> call, and the completion call is not always performed, depending on the
> results of the “filter” function.
> If the packet has not been copied and queued for later processing, it is
> completed in my SendComplete handler, otherwise, the original packet is
> completed but not transmitted, and a copy of the packet is either
> transmitted, dropped, modified, etc.

Are you always setting the final packet status just before completion?

Just a wild guess: couldn’t the packet be completed twice under some circumstances? Some kind of race condition would explain why it isn’t reproducible on other machines; such races usually depend on timing, i.e. machine speed.
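
One way to test the double-completion guess is a per-packet "already completed" flag that is claimed atomically. The following is only a user-mode sketch using C11 atomics; send_ctx_t and try_complete() are hypothetical names, and in a real driver the same idea would more likely use something like InterlockedExchange() on a field in the driver's own per-packet context.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-packet context kept by the driver. */
typedef struct {
    atomic_bool completed;      /* true once the packet has been handed back */
    int         complete_calls; /* counts real completions, for the demo     */
} send_ctx_t;

static void send_ctx_init(send_ctx_t *ctx)
{
    assert(ctx != NULL);
    atomic_init(&ctx->completed, false);
    ctx->complete_calls = 0;
}

/* Returns true only for the first caller; later callers must not complete. */
static bool try_complete(send_ctx_t *ctx)
{
    assert(ctx != NULL);
    if (atomic_exchange(&ctx->completed, true)) {
        return false;           /* another path already completed it */
    }
    ctx->complete_calls++;      /* real driver: complete the packet here */
    return true;
}
```

If a second path ever gets false back from try_complete(), that is the place to log or break, since it means two code paths both believed they owned the packet.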

Best regards,

Michal Vodicka
UPEK, Inc.
[xxxxx@upek.com, http://www.upek.com]

David -

Thanks for the reply. No, I hadn’t thought to enable Verifier for ndis.sys,
tcpip.sys, etc. - that’s a great idea. I will definitely do that.

I’ll revisit the setting of the packet status. The driver was created using
the ImSamp sample as a template, and I think that the original had comments
regarding setting the packet status to pending intentionally to prevent
completion by NDIS. I am also under the impression that the final status
parameter is used when calling NdisMSendComplete(), so setting the packet
status in the OOB data to pending shouldn’t cause problems - it’s set and
used when returning from the MiniportSendPackets() handler after queueing
the packet if desired, otherwise the packet has been completed - and once
that’s done, the packet is never touched (by my driver). I think that you
stated (probably more clearly than I) the same thing in your response. I
think that you might be correct in that my post may not have been clear on
the whole pending status subject, as well. Obviously, setting the packet
status after determining what I am going to do with it seems like a logical
change that can be made in my driver in this regard (i.e., only set the
status when I’m planning on holding on to it for a while).

Thanks again,

Ed Lau


Ed,

Yes, IMSAMP would have had such comments. I believe that it may have
originally been a serialized miniport model.

I’m sorry, but in my last post I forgot to mention that in addition to
running VERIFIER with NDIS.SYS, you can turn on some very useful features of
the CHECKED NDIS.SYS (MSFT refers to this as NDIS VERIFIER in the DDK). In
particular, you might try turning on packet tracking. See the following
on MSDN, or find the same material in the DDK docs; look for NDIS Verifier
under “Network Devices and Protocols, Design Guide, Testing NDIS Drivers”.

http://msdn.microsoft.com/library/default.asp?url=/library/en-us/network/hh/network/ndistest_61fc3759-e857-4b5f-ac68-dbd96e129ddc.xml.asp

Good Luck,
Dave Cattley
Consulting Engineer
Systems Software Development

Thanks again to all who have offered good suggestions for this problem so
far. So far, the result of these suggestions is that Driver Verifier,
when enabled for NDIS.SYS, TCPIP.SYS, and my driver, complains only at the
point where an invalid address is about to be freed. I haven’t yet been able
to try NDIS Verifier, as that involves rebooting that machine in Safe Mode,
which I haven’t yet figured out how to do remotely (I use VNC to gain access
to this machine; it’s several states away).

After reviewing the stack traces of the machine that fails and other Win
2003 machines where the code does not fail, several things appear
interesting to me that I don’t have answers for, and I was wondering if
anyone can explain some of the things I am seeing.

  1. The same code path in my SendPackets handler works if NDIS calls that
    handler while I am not in the middle of a call into my ProtocolReceive
    handler, whereas in the failing case I am in the middle of my
    ProtocolReceive handler. Note that my call to NdisMIndicateReceivePacket()
    is the last code executed in the receive handler. Is it usual for NDIS to
    call into the SendPacketsHandler while still executing the ProtocolReceive
    handler? I personally have never seen this before, but I have never looked,
    either. I have not been able to replicate in house the same (failing)
    scenario, where my SendPacketsHandler is called in this way. Can this be
    influenced by the driver for the NIC itself?

  2. The nature of the failure is that the packet pointer passed to
    NdisMSendComplete is always modified in the eventual call to IPSendComplete
    (via ARPSendComplete) in that the top byte has been zeroed (in the
    ARPSendComplete code, possibly in the call to NdisUnchainBufferAtFront in
    this code), i.e., if 0x12345678 was passed to NdisMSendComplete, 0x00345678
    is passed to IPSendComplete. My code does not execute between my call to
    NdisMSendComplete and this “change of address” (which is on the stack),
    which I verified by heavy tracing and post crash dump analysis of the output
    using DebugView. I have yet to be successful in getting a remote debugging
    connection established to this remote server, so I can’t (yet) determine
    where or why this address is being modified. My current operating theory is
    that I am setting or neglecting to set an attribute of the offending packet
    that is causing this to occur in this particular situation. However, I
    don’t see how this could occur either, as the packet being completed is
    neither my packet being handled in ProtocolReceive nor a packet I have
    allocated in the SendPackets handler; it is the packet passed
    to my handler.
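
The symptom in item 2 matches the stack trace in the original post, where ARPSendComplete receives 0x856b9da8 and IPSendComplete receives 0x006b9da8. On little-endian x86, that is what a single zero byte written at offset 3 of a saved 32-bit pointer would produce. The sketch below only demonstrates that arithmetic in user-mode C; the helper names are hypothetical, and it says nothing about which code actually performs the write.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store a 32-bit value in little-endian byte order, as x86 does. */
static void store_le32(uint8_t out[4], uint32_t v)
{
    out[0] = (uint8_t)(v);
    out[1] = (uint8_t)(v >> 8);
    out[2] = (uint8_t)(v >> 16);
    out[3] = (uint8_t)(v >> 24);
}

/* Read a 32-bit value back from little-endian byte order. */
static uint32_t load_le32(const uint8_t in[4])
{
    return (uint32_t)in[0] | ((uint32_t)in[1] << 8) |
           ((uint32_t)in[2] << 16) | ((uint32_t)in[3] << 24);
}

/* True if 'seen' is 'expected' with only its most significant byte zeroed. */
static int top_byte_cleared(uint32_t expected, uint32_t seen)
{
    return seen != expected && seen == (expected & 0x00FFFFFFu);
}

/* Simulate a stray one-byte write of zero at 'offset' into a saved pointer. */
static uint32_t clobber_byte(uint32_t saved, size_t offset)
{
    uint8_t slot[4];
    assert(offset < 4);
    store_le32(slot, saved);
    slot[offset] = 0;
    return load_le32(slot);
}
```

A check like top_byte_cleared() could also be applied to values logged by the driver's own tracing, to spot the same pattern in other saved pointers.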

Any light on these two somewhat rambling questions would be appreciated.

Thanks again,

Ed Lau

MidCore Software, Inc.
900 Straits Tpke.
Middlebury, CT 06762


Questions? First check the Kernel Driver FAQ at
http://www.osronline.com/article.cfm?id=256

> is the last code executed in the receive handler. Is it usual for NDIS to
> call into the SendPacketsHandler while still executing the ProtocolReceive
> handler?

Yes. You’re deserialized, so any code path can be called at any moment.

Maxim Shatskih, Windows DDK MVP
StorageCraft Corporation
xxxxx@storagecraft.com
http://www.storagecraft.com

> ----------

From: xxxxx@lists.osr.com[SMTP:xxxxx@lists.osr.com] on behalf of Ed Lau[SMTP:xxxxx@midcore.com]
Reply To: Windows System Software Devs Interest List
Sent: Monday, January 31, 2005 11:12 PM
To: Windows System Software Devs Interest List
Subject: Re: Re:[ntdev] Re:NDIS IM Driver bugcheck

> 1. The same code path in my SendPackets handler works if NDIS calls that
> handler while I am not in the middle of a call into my ProtocolReceive
> handler, whereas in the failing case I am in the middle of my
> ProtocolReceive handler. Note that my call to NdisMIndicateReceivePacket()
> is the last code executed in the receive handler. Is it usual for NDIS to
> call into the SendPacketsHandler while still executing the ProtocolReceive
> handler?

It was usual, or at least not uncommon, when I developed NDIS IM drivers for NT4 and W2K.

> I personally have never seen this before, but I have never looked,
> either. I have not been able to replicate in house the same (failing)
> scenario, where my SendPacketsHandler is called in this way. Can this be
> influenced by the driver for the NIC itself?

Maybe. Machine speed, number of CPUs, and other things could also influence it. I guess that when the TcpIp driver has something to send while processing a received packet, it sends it. Take this as pure speculation, please. Anyway, it may be the key difference between your system and the failing one.

> 2. The nature of the failure is that the packet pointer passed to
> NdisMSendComplete is always modified in the eventual call to IPSendComplete
> (via ARPSendComplete) in that the top byte has been NULLed (in the
> ARPSendComplete code, possibly in the call to NdisUnchainBufferAtFront in
> this code). I.E., if 0x12345678 was passed to NdisMSendComplete, 0x00345678
> is passed to IPSendComplete. My code does not execute between my call to
> NdisMSendComplete and this “change of address” (which is on the stack),

Does this mean the corrupted address is stored on the stack and later read and passed to the next function? If so, you have encountered stack corruption. You could place a memory breakpoint on the location where the corruption occurs to catch the code that overwrites it. Yes, live debugging is necessary, and the breakpoint would have to be set just before NdisMSendComplete is called so that it catches only the corruption of the current packet address. I’d know how to do it with SoftICE, but I’m not sure whether WinDbg allows it.

BTW, you didn’t say how easily it can be reproduced, i.e. how often it occurs.

> which I verified by heavy tracing and post crash dump analysis of the output
> using DebugView. I have yet to be successful in getting a remote debugging
> connection established to this remote server, so I can’t (yet) determine
> where or why this address is being modified. My current operating theory is
> that I am setting or neglecting to set an attribute of the offending packet
> that is causing this to occurr in this particular situation, however, I
> don’t see how this could occur either as the packet that is being completed
> is not the address of my packet being handled in ProtocolReceive nor is it a
> packet I have allocated in the SendPackets handler - it is the packet passed
> to my handler.

You’ll see when you find it :) Such things sometimes have very surprising causes. What about a forgotten pointer somewhere that points to the “right” place on the stack? I’d examine the OOB info in the packet and the miniport and protocol private areas; maybe your code (queuing?) corrupts something that TcpIp later uses as a pointer. Try dumping the protocol private area when you first see the packet and again just before completion, then compare the two to see whether the wrong address with the zeroed high byte shows up.
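
The dump-and-compare idea can be sketched as a small snapshot helper. This is a user-mode illustration only: snapshot_t, snapshot_take(), snapshot_first_diff(), and the 64-byte limit are hypothetical, and in the driver the watched area would be whichever reserved region of the packet is under suspicion.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SNAP_MAX 64  /* hypothetical upper bound on the watched area */

typedef struct {
    unsigned char bytes[SNAP_MAX];
    size_t        len;
} snapshot_t;

/* Copy 'len' bytes of the watched area when the packet is first seen. */
static void snapshot_take(snapshot_t *s, const void *area, size_t len)
{
    assert(len <= SNAP_MAX);
    memcpy(s->bytes, area, len);
    s->len = len;
}

/* Return the offset of the first byte that changed, or -1 if unchanged. */
static long snapshot_first_diff(const snapshot_t *s, const void *area)
{
    const unsigned char *now = area;
    for (size_t i = 0; i < s->len; i++) {
        if (s->bytes[i] != now[i]) {
            return (long)i;
        }
    }
    return -1;
}
```

Taking the snapshot on entry to the send handler and checking it just before completion would show whether the area changed while the driver held the packet, and at which offset.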

BTW, which ndis.h version are you using? I vaguely remember there was a faulty one with one of the private areas shifted by 8 bytes, which led to corruption.

Best regards,

Michal Vodicka
UPEK, Inc.
[xxxxx@upek.com, http://www.upek.com]

> > Is it usual for NDIS to call into the SendPacketsHandler while still
> > executing the ProtocolReceive handler?
>
> It was usual or at least nothing uncommon when I developed NDIS IM drivers
> at NT4 and w2k.

Maxim and Michal - thanks for the clarification.

> > I personally have never seen this before, but I have never looked,
> > either. I have not been able to replicate in house the same (failing)
> > scenario, where my SendPacketsHandler is called in this way. Can this be
> > influenced by the driver for the NIC itself?
>
> Maybe. Also, machine speed, number of CPUs and other things could
> influence it. I guess when TcpIp driver has something to send when
> processing received packet, it does. Take it as pure speculation, please.
> Anyway, it can be the key difference between yours and failing system.

I also was thinking that perhaps machine speed was influencing this, as it
is a very fast machine.

> > 2. The nature of the failure is that the packet pointer passed to
> > NdisMSendComplete is always modified in the eventual call to
> > IPSendComplete (via ARPSendComplete) in that the top byte has been NULLed
> > (in the ARPSendComplete code, possibly in the call to
> > NdisUnchainBufferAtFront in this code).
>
> Does it mean corrupted address is stored on the stack and later read and
> passed to the next function? If so, you encountered stack corruption.

Yes, this appears to be what is happening after looking at ARPSendComplete
in WinDbg - the correct value is passed (according to WinDbg) to
ARPSendComplete, but the next call on the stack trace - to IPSendComplete -
has the incorrect value passed. The value on the stack is read into ESI and
then passed to IPSendComplete. So, yes, it appears that perhaps the stack is
being corrupted while in ARPSendComplete, or that the value in ESI is being
modified during one of the calls inside of ARPSendComplete - such as
NdisUnchainBufferAtFront.

> You could place clever memory breakpoint at the place where corruption
> occur to catch code which replaced it. Yes, live debugging is necessary and
> breakpoint would have to be set just before NdisMSendComplete is called to
> catch only current packet address corruption. I’d know how to do it with
> SoftICE and I’m not sure if WinDbg allows it.

Yes, I’d love to be able to do this. I’m aware of how (I’m an avid SoftICE
user, but use both SoftICE and WinDbg when needed). I am currently trying to
get Visual SoftICE installed correctly in order to do this.

> BTW, you didn’t say how easily it is reproducible i.e. how often it
> occurs.

I have only seen it on this one PC, but it is very reproducible: the PC is
a server behind an F5 leveller which pings it every 5 seconds or so (if my
driver is running), and it appears that every ping causes this to happen.

> You’ll see when find it :)

I agree - that always seems to be the case :)

> BTW, which ndis.h version are you using? I vaguelly remember there was a
> wrong one with one of private areas shifted by 8 bytes which lead to
> corruptions.

The driver is written as an NDIS 4.0 driver, and uses the version of NDIS.H
from the Windows 2000 DDK. At some point, I’d like to upgrade to the latest
DDK - perhaps now might be a good time :)

I have a feeling that at this point I’ll have to either travel out of state
to get to this PC in person or get Visual SoftICE working …

Thanks again,

Ed Lau

MidCore Software, Inc.
900 Straits Tpke.
Middlebury, CT 06762


> ----------

From: xxxxx@lists.osr.com[SMTP:xxxxx@lists.osr.com] on behalf of Ed Lau[SMTP:xxxxx@midcore.com]
Reply To: Windows System Software Devs Interest List
Sent: Tuesday, February 01, 2005 1:21 AM
To: Windows System Software Devs Interest List
Subject: Re: Re:[ntdev] Re:NDIS IM Driver bugcheck

> Yes, this appears to be what is happening after looking at ARPSendComplete
> in WinDbg - the correct value is passed (according to WinDbg) to
> ARPSendComplete, but the next call on the stack trace - to IP SendComplete -
> has the incorrect value passed. The value on the stack is read into ESI and
> then passed to IPSendComplete. So, yes, it appears that perhaps the stack is
> being corrupted while in ARPSendComplete, or that the value in ESI is being
> modified during on of the calls inside of ARPSendComplete - such as
> NdisUnchainBufferAtFront.

Try tracing through ARPSendComplete locally to see what is really passed to IPSendComplete. You don’t need the error scenario; the goal is just to see what is expected there. If it is the original packet address, the cause is probably stack corruption. If it is something else (a pointer to a context), look at where this pointer is stored, because that location is being corrupted and replaced by the packet address with the high byte cleared. It would be better to have the same OS and service pack, but for a quick examination any live system you have would help. I guess such things don’t change very often.

What I mean is that seeing something that looks like a corrupted packet address on the stack may lead to false assumptions, and the only way to verify or refute them is to understand how the relevant code works.

Best regards,

Michal Vodicka
UPEK, Inc.
[xxxxx@upek.com, http://www.upek.com]