TDI Event handler problem

Hi everyone,

I’m getting a DRIVER_IRQL_NOT_LESS_OR_EQUAL (0xD1) bug check in my NDIS
protocol driver’s ProtocolReceivePacket handler. The received packet
is a new connection request, and I call the TDI_EVENT_CONNECT handler
registered on the recipient TDI Address object.

The Address object is valid, and has a registered TDI_EVENT_CONNECT
handler. But it turns out that somewhere between gaining a pointer to
the connect event function handler, and actually calling the function,
the function pointer ends up as a NULL pointer.

It goes something like this:

    if (CheckSyn()) {
        PTDI_IND_CONNECT pFunc =
            (PTDI_IND_CONNECT)(events.Handler(TDI_EVENT_CONNECT));
                                                // pFunc valid
        if (pFunc && events.IsEnabled()) {      // pFunc valid
            TDI_STATUS Status = (*pFunc) (…);   // pFunc invalid: pFunc == NULL
        }
    }

The DDK documentation states that ClientEventConnect must be capable of
carrying out its operations at IRQL = DISPATCH_LEVEL. How is it then
that the function pointer ends up NULL?

Can one call TDI event handlers from ProtocolReceivePacket, or will I
have to pass this off to a worker thread or something?

Kind Regards

Beyers

Beyers,

Is it possible that the ‘events’ structure (part of your internal
Address object, I assume) is being updated by an IRP to disable this
particular event on another processor?

Is it also possible that the compiler has detected and optimized away the
‘alias’ pFunc entirely? The code shows the function pointer being returned
by a (method?) call to events.Handler(TDI_EVENT_CONNECT). Is that, by any
chance, an inline (or trivial) method call on a simple C++ object that might
be turned into a simple access to a storage location by an optimizer?

Are you analyzing a crash dump or a live system? In either case, look
carefully at the compiled code for the entire sequence. Also look at what
might be going on in other threads (and on other CPUs) in the system to
detect whether the ‘events.xxxx’ structure/class is being updated.

Lastly, the ‘value’ of the TDI event handler function and the state of
‘.IsEnabled’ are most likely assumed to be ‘coherent’ and might need to be
protected by a spinlock when tested together. At the very least, the value
of the handler and the value of the context need to be ‘acquired’ as a pair,
together, under a lock, to ensure that the event handler and context are not
in the middle of an update (on another processor).
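
To illustrate the point about acquiring the handler and its context as a pair, here is a minimal user-mode sketch. Everything in it is hypothetical: the thread never shows the real ‘events’ class, so EventTable, HandlerSnapshot, Snapshot() and IndicateConnect() are invented names, the handler signature is simplified, and std::mutex stands in for the spin lock a driver would use at DISPATCH_LEVEL.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical, simplified handler type; the real PTDI_IND_CONNECT takes
// many more parameters.
using ConnectHandler = int (*)(void* context);

// Private copy of everything needed to make one indication.
struct HandlerSnapshot {
    ConnectHandler handler;
    void* context;
    bool enabled;
};

class EventTable {
public:
    // Register (or clear, with nullptr) the connect handler and its context.
    void Set(ConnectHandler handler, void* context) {
        std::lock_guard<std::mutex> guard(lock_);
        handler_ = handler;
        context_ = context;
        enabled_ = (handler != nullptr);
    }

    // Copy handler, context and enabled flag together under one lock, so
    // the caller never sees a half-updated combination.
    HandlerSnapshot Snapshot() {
        std::lock_guard<std::mutex> guard(lock_);
        return HandlerSnapshot{handler_, context_, enabled_};
    }

private:
    std::mutex lock_;  // stand-in for a kernel spin lock
    ConnectHandler handler_ = nullptr;
    void* context_ = nullptr;
    bool enabled_ = false;
};

// Indicate a connect using only the private snapshot. The lock is not held
// while the client's handler runs.
int IndicateConnect(EventTable& events) {
    HandlerSnapshot snap = events.Snapshot();
    if (snap.handler == nullptr || !snap.enabled) {
        return -1;  // nothing registered
    }
    return snap.handler(snap.context);
}

// Sample handler used only to exercise the sketch.
static int SampleHandler(void* context) {
    return *static_cast<int*>(context);
}
```

With no handler registered, IndicateConnect() returns -1. After events.Set(SampleHandler, &value) it returns whatever value holds. The design choice is to test and call through the local snapshot only, so a concurrent Set() cannot change the pointer between the test and the call.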

By the way, I think you can still download the source for the MSFT Research
IP6 protocol for NT4/2K (http://research.microsoft.com/msripv6). It is
very illustrative as a TDI transport ‘study case’, perhaps more so than the
old ‘ST’ TDI sample from the NT4 DDK.

Good Luck,
Dave Cattley
Consulting Engineer
Systems Software Development

-----Original Message-----
From: xxxxx@lists.osr.com
[mailto:xxxxx@lists.osr.com] On Behalf Of Beyers Cronje
Sent: Thursday, July 28, 2005 12:40 AM
To: Windows System Software Devs Interest List
Subject: [ntdev] TDI Event handler problem

Questions? First check the Kernel Driver FAQ at
http://www.osronline.com/article.cfm?id=256

You are currently subscribed to ntdev as: unknown lmsubst tag argument: ‘’
To unsubscribe send a blank email to xxxxx@lists.osr.com

Hi David,

Thank you for your detailed response and for the excellent pointer to
the IPv6 code.

> Is it possible that the ‘events’ structure (part of your internal
> Address object, I assume) is being updated by an IRP to disable this
> particular event on another processor?

The ‘events’ structure is definitely not being updated. For testing
I’m using only one socket, which at the time of the bug is in ‘listen’
mode. No other IRPs are issued during this time for the Address
object.

> Is it also possible that the compiler has detected and optimized away the
> ‘alias’ pFunc entirely? The code shows the function pointer being returned
> by a (method?) call to events.Handler(TDI_EVENT_CONNECT). Is that, by any
> chance, an inline (or trivial) method call on a simple C++ object that might
> be turned into a simple access to a storage location by an optimizer?

The events.Handler method is not an inline method. It only returns the
handler saved by a previous events.Set call.
Any idea how I can check what the C++ optimizer is doing? Is it possible for
the optimizer to invalidate the storage location at any time after a
pointer to this location has been obtained?

It seems that the pFunc pointer is valid until I call memcpy() on a
TRANSPORT_ADDRESS structure defined locally in the function, just before
I call the event handler, so it does seem that ‘something’ is clearing the
storage location. Any ideas?

> Are you analyzing a crash-dump or a live system? In either case, look
> carefully at the compiled code for the entire sequence. Also look at what
> might be going on other threads (and CPUs) in the system to detect if the
> ‘events.xxxx’ structure/class is being updated.

I’m analyzing a live system via WinDbg. No updates are performed on
the events class on any processor.

> Lastly, the ‘value’ of the TDI event handler function and the state of
> ‘.IsEnabled’ are most likely assumed to be ‘coherent’ and might need to be
> protected by a spinlock when tested together. At the very least, the value
> of the handler and the value of the context need to be ‘acquired’ as a pair,
> together, under a lock, to ensure that the event handler and context are not
> in the middle of an update (on another processor).

True, currently I’m working on adding the necessary locks. At this
stage only the ‘naked’ framework is in place with interfaces to NDIS
and TDI. The fact that no updates are performed on the events class
indicates that locking is not the issue.

> By the way, I think you can still download the source for the MSFT Research
> IP6 protocol for NT4/2K (http://research.microsoft.com/msripv6). It is
> very illustrative as a TDI transport ‘study case’, perhaps more so than the
> old ‘ST’ TDI sample from the NT4 DDK.

Great stuff!! I wasn’t aware of this, I’m sure this will be a great
sample and learning experience.

Thanks again David.

Regards

Beyers Cronje

Beyers,

Your statement

> It seems the pFunc pointer is valid until I call memcpy() …

makes me wonder if the stack is being overwritten by a simple bug. I am not
sure what the source and target of the memcpy() are, but if the target is a
stack variable somewhere in the vicinity of the pFunc local, you might have
mis-coded the length parameter calculation to memcpy(). Copying
TRANSPORT_ADDRESS structures is a rather tricky bit of business (just
accessing them is a trick!). I strongly suggest that you go and look over
that bit of code again. Be mindful of the alignment requirements of the
various fields, or else you may have trouble in non-x86 builds.
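
As a rough illustration of why the length is easy to get wrong: a TRANSPORT_ADDRESS is variable-length, so sizeof() on the structure does not give the number of bytes it actually occupies. The user-mode sketch below walks a byte buffer laid out like a TRANSPORT_ADDRESS and only copies when the whole thing fits. The layout constants are assumptions modeled on tdi.h (a 4-byte TAAddressCount, then entries of 2-byte AddressLength, 2-byte AddressType, and AddressLength bytes of data) and should be checked against the real header. TransportAddressSize, CopyTransportAddress and MakeSingleAddress are hypothetical helper names, not part of any DDK.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Assumed layout, modeled on TRANSPORT_ADDRESS / TA_ADDRESS; verify against
// the real tdi.h before relying on it.
constexpr std::size_t kCountSize = sizeof(std::int32_t);             // TAAddressCount
constexpr std::size_t kEntryHeaderSize = 2 * sizeof(std::uint16_t);  // length + type

// Walk the entries and return the number of bytes actually occupied, or 0
// if the buffer is malformed or the entries run past 'available'.
std::size_t TransportAddressSize(const unsigned char* buf, std::size_t available) {
    if (available < kCountSize) return 0;
    std::int32_t count = 0;
    std::memcpy(&count, buf, sizeof(count));  // memcpy avoids unaligned reads
    if (count < 0) return 0;
    std::size_t offset = kCountSize;
    for (std::int32_t i = 0; i < count; ++i) {
        if (available - offset < kEntryHeaderSize) return 0;
        std::uint16_t length = 0;
        std::memcpy(&length, buf + offset, sizeof(length));
        offset += kEntryHeaderSize;
        if (available - offset < length) return 0;
        offset += length;
    }
    return offset;
}

// Copy only when the whole variable-length address fits in the destination.
bool CopyTransportAddress(unsigned char* dst, std::size_t dstSize,
                          const unsigned char* src, std::size_t srcSize) {
    std::size_t needed = TransportAddressSize(src, srcSize);
    if (needed == 0 || needed > dstSize) return false;
    std::memcpy(dst, src, needed);
    return true;
}

// Build a zero-filled, one-entry address for demonstration purposes.
std::vector<unsigned char> MakeSingleAddress(std::uint16_t type, std::uint16_t length) {
    std::vector<unsigned char> buf(kCountSize + kEntryHeaderSize + length, 0);
    std::int32_t count = 1;
    std::memcpy(buf.data(), &count, sizeof(count));
    std::memcpy(buf.data() + kCountSize, &length, sizeof(length));
    std::memcpy(buf.data() + kCountSize + sizeof(length), &type, sizeof(type));
    return buf;
}
```

For a single entry with 14 bytes of address data (the length I would expect for an IP address, though that value is also an assumption), the walk yields 4 + 4 + 14 = 22 bytes. A copy into a 12-byte destination is refused rather than allowed to run over whatever sits next to it on the stack.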

As for how to know what the optimizer is doing I suggest the following (in
order of usefulness):

  1. Use a checked build of your driver. No optimizer, no problem.
  2. Use WinDbg to examine the generated code.

Since you are on a live system, it should be a simple matter to set a
breakpoint and step through the operation of this function, noting when (in
the Locals window of WinDbg) pFunc changes from valid to NULL. Whatever
happened just before that ‘F8-step’ is your problem.

I don’t think you have an issue with the optimizer; such problems are pretty
rare, but I wanted to be complete in my first response. If, as you say, the
value of pFunc is fine until you call memcpy(), well, I think you need to
look at the memcpy() as the principal clue to what is going wrong.

Good Luck,
Dave Cattley
Consulting Engineer
Systems Software Development

Hi David,

Who da man! It turns out I was indeed using the incorrect length of
the TRANSPORT_ADDRESS in memcpy().

Thanks for your assistance.

Kind regards

Beyers

On 7/30/05, David R. Cattley wrote:
> Beyers,
>
> Your statement
>
> >> It seems the pFunc pointer is valid until I call memcpy() …
>
> makes me wonder if the stack is being overwritten by a simple bug. I am not
> sure what the source and target of the memcpy() are, but if the target is a
> stack variable somewhere in the vicinity of the pFunc local, you might have
> mis-coded the length parameter calculation to memcpy(). Copying
> TRANSPORT_ADDRESS structures is a rather tricky bit of business (just
> accessing them is a trick!). I strongly suggest that you go and look over
> that bit of code again. Be mindful of the alignment requirements of the
> various fields, or else you may have trouble in non-x86 builds.