A BSOD problem with tdifw.

I'm using tdifw (sourceforge.net) on a Windows Server 2003 machine (a web server), and it occasionally crashes with a BSOD like this:

*******************************************************************************
* *
* Bugcheck Analysis *
* *
*******************************************************************************

Use !analyze -v to get detailed debugging information.

BugCheck D1, {0, d0000002, 0, 0}

Probably caused by : tdifw.sys ( tdifw!tdi_event_connect+135 )

Followup: MachineOwner

1: kd> !analyze -v
*******************************************************************************
* *
* Bugcheck Analysis *
* *
*******************************************************************************

DRIVER_IRQL_NOT_LESS_OR_EQUAL (d1)
An attempt was made to access a pageable (or completely invalid) address at an
interrupt request level (IRQL) that is too high. This is usually
caused by drivers using improper addresses.
If kernel debugger is available get stack backtrace.
Arguments:
Arg1: 00000000, memory referenced
Arg2: d0000002, IRQL
Arg3: 00000000, value 0 = read operation, 1 = write operation
Arg4: 00000000, address which referenced memory

Debugging Details:

READ_ADDRESS: 00000000

CURRENT_IRQL: 2

FAULTING_IP:
+0
00000000 ?? ???

PROCESS_NAME: System

DEFAULT_BUCKET_ID: DRIVER_FAULT

BUGCHECK_STR: 0xD1

TRAP_FRAME: f78f693c -- (.trap 0xfffffffff78f693c)
ErrCode = 00000000
eax=00000000 ebx=87ad6c04 ecx=ba17b202 edx=00000000 esi=f78f6ab0 edi=00000002
eip=00000000 esp=f78f69b0 ebp=f78f6a70 iopl=0 nv up ei pl zr na pe nc
cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000 efl=00010246
00000000 ?? ???
Resetting default scope

LAST_CONTROL_TRANSFER: from 00000000 to 80836df5

FAILED_INSTRUCTION_ADDRESS:
+0
00000000 ?? ???

STACK_TEXT:
f78f693c 00000000 badb0d00 00000000 00000000 nt!KiTrap0E+0x2a7
WARNING: Frame IP not in any known module. Following frames may be wrong.
f78f69ac ba177ce9 00000000 00000016 f78f6ad8 0x0
f78f6a70 b9b1a9a9 02ad6c04 00000016 f78f6ad8 tdifw!tdi_event_connect+0x135 [e:\tdifw\ev_conn.c @ 88]
f78f6af4 b9b1abaa 87aa0560 0100007f 0000fa0a tcpip!DelayedAcceptConn+0xbe
f78f6bc8 b9b0e239 893f3990 0100007f 0100007f tcpip!TCPRcv+0xddb
f78f6c28 b9b0c45e 00000024 893f3990 b9b119a4 tcpip!DeliverToUser+0x189
f78f6cb8 b9b1821e 893f3990 f78f6d30 00000014 tcpip!IPRcvPacket+0x686
f78f6d64 f7598064 b9b4ce60 893f3990 8aa32db0 tcpip!LoopXmitRtn+0x195
f78f6d80 8082db10 893f3990 00000000 8aa32db0 TDI!CTEpEventHandler+0x32
f78f6dac 80920833 b9b4ce60 00000000 00000000 nt!ExpWorkerThread+0xeb
f78f6ddc 8083fe9f 8082da53 00000001 00000000 nt!PspSystemThreadStartup+0x2e
00000000 00000000 00000000 00000000 00000000 nt!KiThreadStartup+0x16

STACK_COMMAND: kb

FOLLOWUP_IP:
tdifw!tdi_event_connect+135 [e:\tdifw\ev_conn.c @ 88]
ba177ce9 3d160000c0 cmp eax,0C0000016h

FAULTING_SOURCE_CODE:
84: ((PTDI_IND_CONNECT)(ctx->old_handler)) (ctx->old_context, RemoteAddressLength, RemoteAddress,
85: UserDataLength, UserData, OptionsLength, Options, ConnectionContext,
86: AcceptIrp);
87:

88: if (status != STATUS_MORE_PROCESSING_REQUIRED || !*AcceptIrp) {
89: result = FILTER_DENY;
90: goto done;
91: }
92:
93: irps = IoGetCurrentIrpStackLocation(*AcceptIrp);

SYMBOL_STACK_INDEX: 2

SYMBOL_NAME: tdifw!tdi_event_connect+135

FOLLOWUP_NAME: MachineOwner

MODULE_NAME: tdifw

IMAGE_NAME: tdifw.sys

DEBUG_FLR_IMAGE_TIMESTAMP: 4aa75af8

FAILURE_BUCKET_ID: 0xD1_VRF_CODE_AV_NULL_IP_tdifw!tdi_event_connect+135

BUCKET_ID: 0xD1_VRF_CODE_AV_NULL_IP_tdifw!tdi_event_connect+135

Followup: MachineOwner

According to !analyze, the BSOD occurred because the tdifw driver accessed an illegal memory address.
I think the BSOD occurred while the original connect event handler (PTDI_IND_CONNECT) was being called.
Given all the information above, the tdifw driver seems to be misbehaving.

How should I solve this problem?
Thanks in advance!

> According to analyze, BSOD has occurred because tdifw driver access illegal memory address.
> I think that BSOD has occurred when the original event connect(PTDI_IND_CONNECT) is being called.

Maybe ctx->old_handler is NULL?


Maxim S. Shatskih
Windows DDK MVP
xxxxx@storagecraft.com
http://www.storagecraft.com

The error is caused by dereferencing a NULL pointer - more precisely, trying
to 'call' a NULL pointer.

So it seems likely that ctx->old_handler is NULL.
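
For illustration only, here is a rough user-mode sketch of that failure mode and a guard against it. It is not the tdifw source: the types are stand-ins so the code can be compiled outside the kernel, the handler signature is heavily simplified, and the names event_ctx, connect_handler_fn, call_old_connect_handler and demo_client_handler are hypothetical. Only the status values are the real NTSTATUS codes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for kernel definitions; the numeric values match ntstatus.h. */
typedef int32_t NTSTATUS;
#define STATUS_MORE_PROCESSING_REQUIRED ((NTSTATUS)0xC0000016)
#define STATUS_CONNECTION_REFUSED       ((NTSTATUS)0xC0000236)

/* Hypothetical, simplified handler type. The real ClientEventConnect
 * callback takes many more parameters. */
typedef NTSTATUS (*connect_handler_fn)(void *context, void **accept_irp);

/* Hypothetical record in which a filter saves the client's handler. */
struct event_ctx {
    connect_handler_fn old_handler;  /* client's handler, may be NULL */
    void *old_context;               /* client's context for that handler */
};

/* Forward a connect indication to the saved handler, if there is one. */
NTSTATUS call_old_connect_handler(const struct event_ctx *ctx, void **accept_irp)
{
    assert(ctx != NULL && accept_irp != NULL);

    if (ctx->old_handler == NULL) {
        /* No handler saved: calling it would transfer control to address 0. */
        return STATUS_CONNECTION_REFUSED;
    }
    return ctx->old_handler(ctx->old_context, accept_irp);
}

/* Demo client handler: pretends its context is the accept IRP. */
NTSTATUS demo_client_handler(void *context, void **accept_irp)
{
    *accept_irp = context;
    return STATUS_MORE_PROCESSING_REQUIRED;
}
```

A guard like this only hides the symptom. It does not explain why the saved handler was NULL in the first place, which is the part that still needs debugging.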

TDIFW is a sample. A bit of free code that sort-of works. I have only a
passing familiarity with its internals having reviewed code based on it in
the past. As TDI filters go, it was a good illustration but not necessarily
a robust implementation. A rock-solid TDI filter is not a trivial
undertaking.

So if the 'problem' you want to solve is to improve the sample, then, just
debug it. Figure out in Windbg when it crashes what is NULL and why. Then
go fix it.

If you are running on NT6 then be aware that the behavior of the TCPIP
device 'stacks' (\Device\TCP, \Device\UDP, \Device\IP, \Device\RawIP)
changed subtly and in ways that broke quite a few TDI filters. In
particular, querying the local address of an endpoint will no longer
complete the IRP synchronously, which broke a number of TDI filters that
assumed that particular I/O could be performed 'synchronously' at
DISPATCH_LEVEL. I recall noting that TDIFW did not handle that situation
correctly in the code I looked at. I cannot say if that is the case in the
original or the currently available code.

If your 'problem' is to build a commercial-grade, robust TDI filter, then
consider TDIFW a good illustration of some of the issues you need to deal
with and *one* designer's choice for dealing with them. There are other
sources as well and I encourage you to look at www.pcausa.com.

There is no avoiding, however, becoming quite knowledgeable in the 'black
art' of TDI filtering if you plan to get anywhere with this. Neither TDIFW
nor any other sample is going to just 'work' for your application
without you understanding quite a bit about TDI, the TCPIP.SYS
implementations (yes, multiple) and how other filters can interact. It
really is all a big messy situation and there are already plenty of bad TDI
filters in the wild!

Good Luck,
Dave Cattley

-----Original Message-----
From: xxxxx@lists.osr.com
[mailto:xxxxx@lists.osr.com] On Behalf Of xxxxx@korea.com
Sent: Wednesday, September 16, 2009 12:13 AM
To: Windows System Software Devs Interest List
Subject: [ntdev] A BSOD problem of tdifw.


NTDEV is sponsored by OSR

For our schedule of WDF, WDM, debugging and other seminars visit:
http://www.osr.com/seminars

To unsubscribe, visit the List Server section of OSR Online at
http://www.osronline.com/page.cfm?name=ListServer

First of all, thank you very much for your advice.

I’ve already checked whether ctx->old_handler is NULL.
If it is NULL, I return STATUS_CONNECTION_REFUSED without calling PTDI_IND_CONNECT.
That is: if(ctx->old_handler == NULL) return STATUS_CONNECTION_REFUSED;

I think there are structural problems in tdifw.
Whenever a problem occurs, it is always related to the DelayedAcceptConn operation (tcpip!DelayedAcceptConn calling into tdifw).
According to an analysis (http://pennerslife.blogspot.com/2009/04/drvier-crash-dump-analysis.html) of a problem similar to mine, AcceptIrp had already been freed even though the ClientEventConnect event handler returned STATUS_MORE_PROCESSING_REQUIRED.
AcceptIrp is the last output parameter of the PTDI_IND_CONNECT function.

I wonder whether tdifw is handling STATUS_MORE_PROCESSING_REQUIRED incorrectly.
If so, how can I fix the problem?

Next, would the following approach work?
Before calling the PTDI_IND_CONNECT function, I would check the validity of AcceptIrp by calling MmIsAddressValid.
If it is not valid, I would just return STATUS_CONNECTION_REFUSED without calling PTDI_IND_CONNECT.
That is: if(!MmIsAddressValid(*AcceptIrp)) return STATUS_CONNECTION_REFUSED;
I wonder whether this could be a solution to the problem.

Please advise!

Do you perform this check (for NULL ctx->old_handler) under the protection
of some lock? For instance, how do you synchronize with the situation where
this callback value might change? I suspect (based on experience) that this
is not your particular problem but I simply raise the issue as an
illustration that interposing a driver between a TDI Client and TDI Protocol
is much more complex than simply filtering the I/O traffic.
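
To make the synchronization point concrete, here is a user-mode sketch rather than driver code. An atomic_flag spin lock stands in for a kernel spin lock, and handler_slot, slot_set, slot_call and demo_handler are hypothetical names, not anything from tdifw. The idea is to copy the handler and its context as a pair while holding the lock, then check and call the private copy, so a concurrent change of the event handler cannot turn the pointer to NULL between the check and the call.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

typedef int (*event_handler_fn)(void *context);

/* Hypothetical slot holding a client's handler and its context. */
struct handler_slot {
    atomic_flag lock;          /* stand-in for a kernel spin lock */
    event_handler_fn handler;  /* may be NULL */
    void *context;
};

#define HANDLER_SLOT_INIT { ATOMIC_FLAG_INIT, NULL, NULL }

static void slot_lock(struct handler_slot *s)
{
    while (atomic_flag_test_and_set_explicit(&s->lock, memory_order_acquire)) {
        /* spin until the flag is released */
    }
}

static void slot_unlock(struct handler_slot *s)
{
    atomic_flag_clear_explicit(&s->lock, memory_order_release);
}

/* Replace handler and context together, under the lock. */
void slot_set(struct handler_slot *s, event_handler_fn handler, void *context)
{
    assert(s != NULL);
    slot_lock(s);
    s->handler = handler;
    s->context = context;
    slot_unlock(s);
}

/* Snapshot the pair under the lock, then call the snapshot outside it.
 * Returns 1 and stores the handler's result if a handler was present,
 * 0 if there was none. */
int slot_call(struct handler_slot *s, int *result)
{
    event_handler_fn handler;
    void *context;

    assert(s != NULL && result != NULL);
    slot_lock(s);
    handler = s->handler;
    context = s->context;
    slot_unlock(s);

    if (handler == NULL)
        return 0;
    *result = handler(context);
    return 1;
}

/* Demo handler: returns the pointed-to int plus one. */
int demo_handler(void *context)
{
    return *(int *)context + 1;
}
```

This only closes the check-then-call window. It does not by itself keep the client's handler or context alive while the call is in progress, which is a separate lifetime problem.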

Regarding your specific concern about STATUS_MORE_PROCESSING_REQUIRED:

AcceptIrp is a pointer to a pointer - a place to ‘return’ an IRP for the
condition of STATUS_MORE_PROCESSING_REQUIRED. It will *always* be valid (no
need to check it). By valid I mean that the value of AcceptIrp can be
dereferenced as *AcceptIrp and used as an “l-value”.

However, it is only valid during the callback. Once the callback returns,
the value is no longer valid, so if it is ‘captured’ and stored in a
context to be used later (say, in a DPC or work-item), using the value of
AcceptIrp or the IRP that it may point to is completely unsafe.

The only exception is when the filter either replaces the IRP with its own
or can take a stack location in the IRP and set a completion handler. In
this case, the completion handler can synchronize with the filter logic to
ensure that the IRP is not allowed to be ‘completed’ back to the client
until the filter has finished its operation(s) that might wish to reference
the IRP or modify its completion behavior.

In the context of the callback the only thing that determines if the
*AcceptIrp value is valid is the return status. If it is
STATUS_MORE_PROCESSING_REQUIRED then *AcceptIrp *MUST* point to an IRP. If
the return status is any other value, *AcceptIrp is ignored and should be
ignored by the TDI filter as well.
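
The rule in the paragraph above can be written down as a small decision function. This is a user-mode sketch with stand-in types, not tdifw code: classify_connect_result, the connect_action enum and the void * stand-in for an IRP pointer are hypothetical, while the two status values are the real NTSTATUS codes. It looks at the returned status first and only then at *AcceptIrp, which is the same order as the check at ev_conn.c line 88 in the dump.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for kernel definitions; the numeric values match ntstatus.h. */
typedef int32_t NTSTATUS;
#define STATUS_SUCCESS                  ((NTSTATUS)0x00000000)
#define STATUS_MORE_PROCESSING_REQUIRED ((NTSTATUS)0xC0000016)

/* Hypothetical outcome of inspecting the client's connect handler result. */
enum connect_action {
    CONNECT_NO_IRP,        /* status says: ignore *AcceptIrp entirely */
    CONNECT_USE_IRP,       /* client supplied an accept IRP */
    CONNECT_BROKEN_CLIENT  /* client promised an IRP but left it NULL */
};

enum connect_action classify_connect_result(NTSTATUS status, void **accept_irp)
{
    /* The out-parameter slot itself is always valid during the callback. */
    assert(accept_irp != NULL);

    if (status != STATUS_MORE_PROCESSING_REQUIRED)
        return CONNECT_NO_IRP;  /* *accept_irp is never read on this path */

    return (*accept_irp != NULL) ? CONNECT_USE_IRP : CONNECT_BROKEN_CLIENT;
}
```

Note that nothing here calls MmIsAddressValid: whether *AcceptIrp means anything is decided by the status alone.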

Good Luck,
Dave Cattley

-----Original Message-----
From: xxxxx@lists.osr.com
[mailto:xxxxx@lists.osr.com] On Behalf Of xxxxx@korea.com
Sent: Wednesday, September 16, 2009 9:27 PM
To: Windows System Software Devs Interest List
Subject: RE:[ntdev] A BSOD problem of tdifw.

