Issue with IRQL

Hi , i have written a driver which uses firewall hook to hook a filter function to IP driver. My requirement is i need to log all the incoming requests from remote computers in a log file. But the problem comes with the IRQL, my filter function runs at DISPATCH_LEVEL, due to this restriction i cannot use ZwCreateFile API inside my filter function. As a workaround i’m running the logging routine in a seperate worker thread which keeps waiting on a notification event signalled by my filter function when the IP information is ready to be logged. This works fine most of the times as my thread runs at PASSIVE_LEVEL . But sometimes it does run in DISPATCH_LEVEL maybe this is due to context switching where one processor running at higher irql gets hold of the thread . I tried lowering the IRQL to PASSIVE_LEVEL and restored the old IRQL when done with logging , but that Bug checks sometimes but very rarely with “worker thread returned at bad IRQL” or IRQL_NOT_LESS_OR_EQUAL . Please let me know how should i handle this kind of scenario. Thanks

Easy part first:

You can NEVER do this. You must never call KeLowerIrql if you’re not undoing something you yourself have done with KeRaiseIrql.

And the similarly simple:

You’re NOT using a system work item for this, right? You’re using your OWN dedicated system thread that you created by calling PsCreateSystemThread, right? Because ideally you don’t want to be using threads from the system worker thread pool for long-running activities like this.

Now, on to the more difficult:

Are you trying to tell us that sometimes you just DISCOVER your worker thread running at IRQL DISPATCH_LEVEL, without EVER having done ANYthing in your worker thread that raises the IRQL? No acquiring a spin lock… nothing? Your thread just wakes up as a result of an event being signaled and you’re running at IRQL DISPATCH_LEVEL??

Because if that’s what you’re saying, this is simply not possible on a correctly running system.

Peter
OSR

Hi Prashant,
It is highly unusual for a thread running at PASSIVE level to spontaneously change to dispatch level for no good reason. Trying to lower the IRQL if you didn’t raise it is not the right approach, as it will violate some basic assumptions OS makes and will inevitably lead to a crash. If you share relevant sections of your code perhaps people will stop the problem.
Thanks,
Alex

-----Original Message-----
From: xxxxx@lists.osr.com [mailto:xxxxx@lists.osr.com] On Behalf Of xxxxx@gmail.com
Sent: Tuesday, August 25, 2009 10:30 AM
To: Windows System Software Devs Interest List
Subject: [ntdev] Issue with IRQL

Hi , i have written a driver which uses firewall hook to hook a filter function to IP driver. My requirement is i need to log all the incoming requests from remote computers in a log file. But the problem comes with the IRQL, my filter function runs at DISPATCH_LEVEL, due to this restriction i cannot use ZwCreateFile API inside my filter function. As a workaround i’m running the logging routine in a seperate worker thread which keeps waiting on a notification event signalled by my filter function when the IP information is ready to be logged. This works fine most of the times as my thread runs at PASSIVE_LEVEL . But sometimes it does run in DISPATCH_LEVEL maybe this is due to context switching where one processor running at higher irql gets hold of the thread . I tried lowering the IRQL to PASSIVE_LEVEL and restored the old IRQL when done with logging , but that Bug checks sometimes but very rarely with “worker thread returned at bad IRQL” or IRQL_NOT_LESS_OR_EQUAL . Please let me know how should i handle this kind of scenario. Thanks


NTDEV is sponsored by OSR

For our schedule of WDF, WDM, debugging and other seminars visit:
http://www.osr.com/seminars

To unsubscribe, visit the List Server section of OSR Online at http://www.osronline.com/page.cfm?name=ListServer

Thanks peter for your reply.Yes thats what is happening. It just wakes up on a signal and to my surprise it’s in DISPATCH_LEVEL sometimes and i’m not even doing anything like holding a spin lock or anyhing else which could raise IRQL. I’m testing it on Vmware ,maybe i should check it on a machine if this cannot happen in a correctly running system. I will update you after checking it on a real machine.

Hi Alex,
Thanks for your input.Here is the code which resembles our actual code.

DriverEntry()
{
//Does all the iniitialization stuff here
PsCreateSystemThread ( &ThreadHandle, 0, NULL, NULL, NULL,(PKSTART_ROUTINE) WriteLogToFile, NULL ) ;

}

/* Filter Function which gets called by IP driver for incoming or outgoing traffic
FilterFunction()
{

GetRemoteIpInfo(); //This will get us the all the info about the remote system in a structure.
/* Data is ready to be written so i fire up an event*/

KeSetEvent ( &WriteLogEvent, IO_NO_INCREMENT, TRUE ) ;

}

void WriteLogToFile(void)
{
while(TRUE)
{
KeWaitForSingleObject ( &WriteLog,Executive,KernelMode,FALSE,NULL ) ;
RtlInitAnsiString ( &FileNameString, IntrPath ) ;

RtlAnsiStringToUnicodeString(&uniName,&FileNameString , TRUE);

InitializeObjectAttributes(&objAttr, &uniName,OBJ_CASE_INSENSITIVE | OBJ_KERNEL_HANDLE,
NULL, NULL);

if(KeGetCurrentIrql() != PASSIVE_LEVEL)
{
goto cleanup; //Modified after peter and alex suggested not to lower IRQL
}

ntstatus = ZwCreateFile(&handle,
FILE_APPEND_DATA,
&objAttr, &ioStatusBlock, NULL,
FILE_ATTRIBUTE_NORMAL,
0,
FILE_OPEN_IF,
FILE_SYNCHRONOUS_IO_NONALERT,
NULL, 0);

if(!NT_SUCCESS(ntstatus)) {
DbgPrint(“\n\tWarning: Could not open the file (%s)\n”, uniName);
goto cleanup;
}

ntstatus = ZwWriteFile (handle,NULL,NULL, NULL, &ioStatusBlock,&write_data , sizeof(IntrInfo) ,NULL , NULL);

if ( !NT_SUCCESS ( ntstatus ) )
{
DbgPrint(“ZwWriteFile FAILED”);
goto cleanup;
}

Cleanup:
I here do all the cleaup stuff
KeClearEvent(&WriteLogEvent);

}
}

Alex, with all due respect: “highly unusual”?? Are you implying that there’s even a single, rare, edge-condition, case when this can happen on a properly functioning Windows system? Because if that’s the case, I’d suggest that lots of things in Windows will be well and truly broken. Threads running on Windows simply do not expect to spontaneously change IRQL.

Peter
OSR

As many have said, this really can’t happen the way you’ve shown it.
Without your exact code for the while loop, it’s really hard to say what
seemingly innocent thing might be causing the issue.

E.g. if something in your loop is occasionally leaving you in
DISPATCH_LEVEL I’m not sure whether KeWaitForSingleObject will always
blue-screen if the event happens to already be set when it’s called…
in which case you could come out of it still at DISPATCH_LEVEL. Note: I
very well might be wrong about that. Anyway, putting a check for IRQL
above the wait would tell you if something like that’s happening.

One other thing I’d do is check the return value of
KeWaitForSingleObject. I don’t see any particular *reason* it would be
erroring out, nor how that could cause this symptom, nor do I think it
should be alerting or exiting for an APC given the way you called it,
but it’s the only other thing I can think of to check unless you want to
post the actual code. And it’s a good idea anyway.

And of course, whenever anything truly bizarre happens, there’s always
the chance that a heap corruption has occurred somewhere in the system.

xxxxx@gmail.com wrote:

Hi Alex,
Thanks for your input.Here is the code which resembles our actual code.

DriverEntry()
{
//Does all the iniitialization stuff here
PsCreateSystemThread ( &ThreadHandle, 0, NULL, NULL, NULL,(PKSTART_ROUTINE) WriteLogToFile, NULL ) ;

}

/* Filter Function which gets called by IP driver for incoming or outgoing traffic
FilterFunction()
{

GetRemoteIpInfo(); //This will get us the all the info about the remote system in a structure.
/* Data is ready to be written so i fire up an event*/

KeSetEvent ( &WriteLogEvent, IO_NO_INCREMENT, TRUE ) ;

}

void WriteLogToFile(void)
{
while(TRUE)
{
KeWaitForSingleObject ( &WriteLog,Executive,KernelMode,FALSE,NULL ) ;
RtlInitAnsiString ( &FileNameString, IntrPath ) ;

RtlAnsiStringToUnicodeString(&uniName,&FileNameString , TRUE);

InitializeObjectAttributes(&objAttr, &uniName,OBJ_CASE_INSENSITIVE | OBJ_KERNEL_HANDLE,
NULL, NULL);

if(KeGetCurrentIrql() != PASSIVE_LEVEL)
{
goto cleanup; //Modified after peter and alex suggested not to lower IRQL
}

ntstatus = ZwCreateFile(&handle,
FILE_APPEND_DATA,
&objAttr, &ioStatusBlock, NULL,
FILE_ATTRIBUTE_NORMAL,
0,
FILE_OPEN_IF,
FILE_SYNCHRONOUS_IO_NONALERT,
NULL, 0);

if(!NT_SUCCESS(ntstatus)) {
DbgPrint(“\n\tWarning: Could not open the file (%s)\n”, uniName);
goto cleanup;
}

ntstatus = ZwWriteFile (handle,NULL,NULL, NULL, &ioStatusBlock,&write_data , sizeof(IntrInfo) ,NULL , NULL);

if ( !NT_SUCCESS ( ntstatus ) )
{
DbgPrint(“ZwWriteFile FAILED”);
goto cleanup;
}

Cleanup:
I here do all the cleaup stuff
KeClearEvent(&WriteLogEvent);

}
}

> KeSetEvent ( &WriteLogEvent, IO_NO_INCREMENT, TRUE ) ;

Are you following this set with an immediate call to KeWaitForSingleObject?
Passing TRUE as the last parameter causes it to return the dispatcher
database lock held, not immediately doing a wait could cause you strange
issues. See KeSetEvent docs for more details.

-scott


Scott Noone
Consulting Associate
OSR Open Systems Resources, Inc.
http://www.osronline.com

wrote in message news:xxxxx@ntdev…
> Hi Alex,
> Thanks for your input.Here is the code which resembles our actual code.
>
> DriverEntry()
> {
> //Does all the iniitialization stuff here
> PsCreateSystemThread ( &ThreadHandle, 0, NULL, NULL,
> NULL,(PKSTART_ROUTINE) WriteLogToFile, NULL ) ;
>
> }
>
> /* Filter Function which gets called by IP driver for incoming or outgoing
> traffic
> FilterFunction()
> {
>
> GetRemoteIpInfo(); //This will get us the all the info about the remote
> system in a structure.
> /* Data is ready to be written so i fire up an event*/
>
> KeSetEvent ( &WriteLogEvent, IO_NO_INCREMENT, TRUE ) ;
>
> }
>
> void WriteLogToFile(void)
> {
> while(TRUE)
> {
> KeWaitForSingleObject ( &WriteLog,Executive,KernelMode,FALSE,NULL ) ;
> RtlInitAnsiString ( &FileNameString, IntrPath ) ;
>
> RtlAnsiStringToUnicodeString(&uniName,&FileNameString , TRUE);
>
> InitializeObjectAttributes(&objAttr, &uniName,OBJ_CASE_INSENSITIVE |
> OBJ_KERNEL_HANDLE,
> NULL, NULL);
>
> if(KeGetCurrentIrql() != PASSIVE_LEVEL)
> {
> goto cleanup; //Modified after peter and alex suggested not to lower IRQL
> }
>
> ntstatus = ZwCreateFile(&handle,
> FILE_APPEND_DATA,
> &objAttr, &ioStatusBlock, NULL,
> FILE_ATTRIBUTE_NORMAL,
> 0,
> FILE_OPEN_IF,
> FILE_SYNCHRONOUS_IO_NONALERT,
> NULL, 0);
>
> if(!NT_SUCCESS(ntstatus)) {
> DbgPrint(“\n\tWarning: Could not open the file (%s)\n”,
> uniName);
> goto cleanup;
> }
>
> ntstatus = ZwWriteFile (handle,NULL,NULL, NULL, &ioStatusBlock,&write_data
> , sizeof(IntrInfo) ,NULL , NULL);
>
> if ( !NT_SUCCESS ( ntstatus ) )
> {
> DbgPrint(“ZwWriteFile FAILED”);
> goto cleanup;
> }
>
> Cleanup:
> I here do all the cleaup stuff
> KeClearEvent(&WriteLogEvent);
>
> }
> }
>
>
>

Ironically, the OP explained himself the reason for elevated IRQL - indeed, it is dispatcher-related.
Look below:

KeSetEvent ( &WriteLogEvent, IO_NO_INCREMENT, TRUE ) ;

The last parameter is TRUE, which means atomic signal-and-wait, i.e. KeSetEvent() returns while holding a dispatcher spinlock. This is the last line in the function, so that there is no wait operation in sight.

The truly amazing thing here is that, according to the OP, it works once in a while - objectively, the whole thing should deadlock the system on the very first run …

Anton Bassov.

Please refer to SniffUSB , an open source project.

In it , you can focus on the log part

It’s fit for your scenario.

Kudos Anton Bassov!!! Thanks very much for pointing that out. It was indeed the KeSetEvent which was elevating IRQL because of it’s Wait parameter being set TRUE. Thanks very much. Also thanks to everyone for sparing your precious time.

Could we continue to dig into this scenario where KeSetEvent is not followed immediately by a call to a KeWaitXxx routine ? and what causes worker thread to sometimes work in DISPATCH_LEVEL ?

From the description of MSDN , “If the Wait parameter is TRUE, the return to the caller is executed without lowering IRQL or releasing the dispatcher database spin lock” ,

If so, then worker thread has no chance to be scheduled because “dispatcher database spin lock” is not released .

But the actual situation is not like this. Can someone disclose this underlying execution ? Thanks

Hi,
In addition to already pointed out problem with KeSetEvent I think there might be another problem with your implementation unless the sample bellow over simplifies things:

It seems that in your FilterFunction you populate some global structure with the data to be written and notify your “writer” to go ahead and pick it up. What happens if FilterFunction is called again before your writer is done with the data. Given that you can’t wait in your FilterFunction (it might be called at DISPATCH) it seems like you actually need to capture the data some place…

Also are you using Notification or Synchronization event object? For synchronization I don’t believe you need to clear the event (wait will consume the signal) and for notification it seems like event clearing can potentially result in lost signals.

Alex

-----Original Message-----
From: xxxxx@lists.osr.com [mailto:xxxxx@lists.osr.com] On Behalf Of xxxxx@gmail.com
Sent: Tuesday, August 25, 2009 11:37 AM
To: Windows System Software Devs Interest List
Subject: RE:[ntdev] Issue with IRQL

Hi Alex,
Thanks for your input.Here is the code which resembles our actual code.

DriverEntry()
{
//Does all the iniitialization stuff here
PsCreateSystemThread ( &ThreadHandle, 0, NULL, NULL, NULL,(PKSTART_ROUTINE) WriteLogToFile, NULL ) ;

}

/* Filter Function which gets called by IP driver for incoming or outgoing traffic
FilterFunction()
{

GetRemoteIpInfo(); //This will get us the all the info about the remote system in a structure.
/* Data is ready to be written so i fire up an event*/

KeSetEvent ( &WriteLogEvent, IO_NO_INCREMENT, TRUE ) ;

}

void WriteLogToFile(void)
{
while(TRUE)
{
KeWaitForSingleObject ( &WriteLog,Executive,KernelMode,FALSE,NULL ) ;
RtlInitAnsiString ( &FileNameString, IntrPath ) ;

RtlAnsiStringToUnicodeString(&uniName,&FileNameString , TRUE);

InitializeObjectAttributes(&objAttr, &uniName,OBJ_CASE_INSENSITIVE | OBJ_KERNEL_HANDLE,
NULL, NULL);

if(KeGetCurrentIrql() != PASSIVE_LEVEL)
{
goto cleanup; //Modified after peter and alex suggested not to lower IRQL
}

ntstatus = ZwCreateFile(&handle,
FILE_APPEND_DATA,
&objAttr, &ioStatusBlock, NULL,
FILE_ATTRIBUTE_NORMAL,
0,
FILE_OPEN_IF,
FILE_SYNCHRONOUS_IO_NONALERT,
NULL, 0);

if(!NT_SUCCESS(ntstatus)) {
DbgPrint(“\n\tWarning: Could not open the file (%s)\n”, uniName);
goto cleanup;
}

ntstatus = ZwWriteFile (handle,NULL,NULL, NULL, &ioStatusBlock,&write_data , sizeof(IntrInfo) ,NULL , NULL);

if ( !NT_SUCCESS ( ntstatus ) )
{
DbgPrint(“ZwWriteFile FAILED”);
goto cleanup;
}

Cleanup:
I here do all the cleaup stuff
KeClearEvent(&WriteLogEvent);

}
}


NTDEV is sponsored by OSR

For our schedule of WDF, WDM, debugging and other seminars visit:
http://www.osr.com/seminars

To unsubscribe, visit the List Server section of OSR Online at http://www.osronline.com/page.cfm?name=ListServer

I meant to say that it does not happen, unless there is a bug.
Sorry,
Alex

-----Original Message-----
From: xxxxx@lists.osr.com [mailto:xxxxx@lists.osr.com] On Behalf Of xxxxx@osr.com
Sent: Tuesday, August 25, 2009 12:10 PM
To: Windows System Software Devs Interest List
Subject: RE:[ntdev] Issue with IRQL

Alex, with all due respect: “highly unusual”?? Are you implying that there’s even a single, rare, edge-condition, case when this can happen on a properly functioning Windows system? Because if that’s the case, I’d suggest that lots of things in Windows will be well and truly broken. Threads running on Windows simply do not expect to spontaneously change IRQL.

Peter
OSR


NTDEV is sponsored by OSR

For our schedule of WDF, WDM, debugging and other seminars visit:
http://www.osr.com/seminars

To unsubscribe, visit the List Server section of OSR Online at http://www.osronline.com/page.cfm?name=ListServer

I believe the dispatcher lock is nothing more than a IRQL raise, so that
when the lock is acquired multiple times it will not deadlock and after it’s
released the situation is back to normal despite un unmatching number of
acquire/releases. Obviously if you say “I am going to wait” and don’t do it
you can expect problems. If this were another type of lock such as a
spinlock it would not deadlock only the current CPU but all other CPUs in
the system that would eventually make use of the dispatcher lock.

The Wait option in KeSetEvent is more than just an optimization as it allows
a driver to set an event and wait as an atomic operation so that the order
of execution is guaranteed. A situation where this is useful is an event
pair which protects a shared buffer with a reader and a writer thread. The
read thread signals the write thread that it’s done it’s reading, but before
the writer thread can do it’s work, the read thread must already be waiting
for the writer thread’s event otherwise there is a race in which a read and
write can occur at the same time.

//Daniel

<xiedong_sl> wrote in message news:xxxxx@ntdev…
> Could we continue to dig into this scenario where KeSetEvent is not
> followed immediately by a call to a KeWaitXxx routine ? and what causes
> worker thread to sometimes work in DISPATCH_LEVEL ?
>
> From the description of MSDN , “If the Wait parameter is TRUE, the return
> to the caller is executed without lowering IRQL or releasing the
> dispatcher database spin lock” ,
>
> If so, then worker thread has no chance to be scheduled because
> “dispatcher database spin lock” is not released .
>
> But the actual situation is not like this. Can someone disclose this
> underlying execution ? Thanks
></xiedong_sl>

Hi alex, thanks for pointing that out. Yes the example above is just a simplified version of my actual code. In my actual code i do maintain a queue on which i do some processing based on which i write the log. The event i used is a notification event.

xiedong_sl@126.com wrote:

Could we continue to dig into this scenario where KeSetEvent is not followed immediately by a call to a KeWaitXxx routine ? and what causes worker thread to sometimes work in DISPATCH_LEVEL ?

From the description of MSDN , “If the Wait parameter is TRUE, the return to the caller is executed without lowering IRQL or releasing the dispatcher database spin lock” ,

If so, then worker thread has no chance to be scheduled because “dispatcher database spin lock” is not released.

Your conclusion is based on the assumption that the dispatcher must be
invoked in order to continue the calling thread. That’s not true. The
thread doesn’t need to be scheduled, because it’s already running.
KeSetEvent just has to return normally.


Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.

>I believe the dispatcher lock is nothing more than a IRQL raise,

…but in such case the whole operation is not really atomic from dispatcher’s perspective, although it is still atomic from the caller’s one. Let’s say you want to use this construct as a barrier for 2 threads.
Therefore, thread X sets event A and waits on event B, while thread Y sets event B and waits on A. After they break out each thread resets an event it had signaled back to non-signaled state.

If dispatcher spinlock is held at the time KeSEtEvent() returns control, everything will always work the same way. However, if it is not held, the whole thing may be producing random results once in a while on MP system…

Therefore, I think your statement holds true only for UP system…

Anton Bassov

From what I have seen in your code and your explanation, I believe you cause
memory corruption at some point, and thus the system might try to page
fault, or something and raise irql.

With respect,
Gabriel Bercea

GaMiTech Software Development
Mobile contact: ?(+40)0740049634
eMail: xxxxx@gmail.com
Blog: http://gamitech.blogspot.com/
Linkedin: http://www.linkedin.com/in/gamitech
Twitter: http://www.twitter.com/gamitech

-----Original Message-----
From: xxxxx@lists.osr.com
[mailto:xxxxx@lists.osr.com] On Behalf Of
xxxxx@gmail.com
Sent: Tuesday, August 25, 2009 8:30 PM
To: Windows System Software Devs Interest List
Subject: [ntdev] Issue with IRQL

Hi , i have written a driver which uses firewall hook to hook a filter
function to IP driver. My requirement is i need to log all the incoming
requests from remote computers in a log file. But the problem comes with the
IRQL, my filter function runs at DISPATCH_LEVEL, due to this restriction i
cannot use ZwCreateFile API inside my filter function. As a workaround i’m
running the logging routine in a seperate worker thread which keeps waiting
on a notification event signalled by my filter function when the IP
information is ready to be logged. This works fine most of the times as my
thread runs at PASSIVE_LEVEL . But sometimes it does run in DISPATCH_LEVEL
maybe this is due to context switching where one processor running at higher
irql gets hold of the thread . I tried lowering the IRQL to PASSIVE_LEVEL
and restored the old IRQL when done with logging , but that Bug checks
sometimes but very rarely with “worker thread returned at bad IRQL” or
IRQL_NOT_LESS_OR_EQUAL . Please let me know how should i handle this kind of
scenario. Thanks


NTDEV is sponsored by OSR

For our schedule of WDF, WDM, debugging and other seminars visit:
http://www.osr.com/seminars

To unsubscribe, visit the List Server section of OSR Online at
http://www.osronline.com/page.cfm?name=ListServer

Interesting, I wasn’t paying attention to that line.
“If Wait = TRUE, the routine returns without lowering the IRQL. In this
case, the KeSetEvent call must be immediately followed by a KeWaitXxx call.
By setting Wait = TRUE, the caller can prevent an unnecessary context switch
from occurring between the KeSetEvent call and the KeWaitXxx call. The
KeWaitXxx routine, before it returns, restores the IRQL to its original
value at the start of the KeSetEvent call.”

This actually I believe solves the problem.

With respect,
Gabriel Bercea

GaMiTech Software Development
Mobile contact: ?(+40)0740049634
eMail: xxxxx@gmail.com
Blog: http://gamitech.blogspot.com/
Linkedin: http://www.linkedin.com/in/gamitech
Twitter: http://www.twitter.com/gamitech

-----Original Message-----
From: xxxxx@lists.osr.com
[mailto:xxxxx@lists.osr.com] On Behalf Of
xxxxx@hotmail.com
Sent: Wednesday, August 26, 2009 4:25 AM
To: Windows System Software Devs Interest List
Subject: RE:[ntdev] Issue with IRQL

Ironically, the OP explained himself the reason for elevated IRQL - indeed,
it is dispatcher-related.
Look below:

KeSetEvent ( &WriteLogEvent, IO_NO_INCREMENT, TRUE ) ;

The last parameter is TRUE, which means atomic signal-and-wait, i.e.
KeSetEvent() returns while holding a dispatcher spinlock. This is the last
line in the function, so that there is no wait operation in sight.

The truly amazing thing here is that, according to the OP, it works once in
a while - objectively, the whole thing should deadlock the system on the
very first run …

Anton Bassov.


NTDEV is sponsored by OSR

For our schedule of WDF, WDM, debugging and other seminars visit:
http://www.osr.com/seminars

To unsubscribe, visit the List Server section of OSR Online at
http://www.osronline.com/page.cfm?name=ListServer