Control Device Inhibits Unload

Doron_Holan · April 8, 2008, 2:00am

It defers to another thread and synchronously waits for the system thread to signal that it is done. In other words, ZwUnloadDriver handles DriverUnload synchronously, whether it calls it directly or defers to a system thread to call it and waits for it to return. This means that if DriverUnload blocks indefinitely in the driver, bad mojo will occur b/c others will be blocked on it.
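The behavior described above can be illustrated with a small user-mode model. This is only a sketch, not kernel code: `model_unload_driver`, `system_worker`, `sample_unload` and the `caller_in_system_process` flag are hypothetical names invented for the illustration, and a POSIX thread stands in for the system thread.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a driver's DriverUnload routine. */
typedef void (*unload_fn)(void);

struct unload_request {
    unload_fn unload;
};

static bool g_unload_ran = false;

/* Sample unload routine that returns promptly. */
static void sample_unload(void)
{
    g_unload_ran = true;
}

/* Entry point of the stand-in "system thread": run the routine, then exit. */
static void *system_worker(void *arg)
{
    struct unload_request *req = arg;
    req->unload();
    return NULL;
}

/*
 * Model of the described behavior: the unload routine is either called
 * directly or handed to another thread, but in both cases the caller does
 * not get control back until the routine has returned.
 * Returns 0 on success, or a pthread error code.
 */
static int model_unload_driver(unload_fn unload, bool caller_in_system_process)
{
    if (caller_in_system_process) {
        unload();                      /* direct call */
        return 0;
    }

    struct unload_request req = { unload };
    pthread_t worker;
    int rc = pthread_create(&worker, NULL, system_worker, &req);
    if (rc != 0)
        return rc;

    /* Synchronous wait: if unload() never returns, neither does this call. */
    return pthread_join(worker, NULL);
}
```

In this model, an unload routine that waits for something that never happens leaves the caller stuck in `pthread_join`, which is the blocking scenario discussed in this thread.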

d

-----Original Message-----
From: xxxxx@lists.osr.com [mailto:xxxxx@lists.osr.com] On Behalf Of xxxxx@hotmail.com
Sent: Monday, April 07, 2008 10:52 PM
To: Windows System Software Devs Interest List
Subject: RE:[ntdev] Control Device Inhibits Unload

Doron,

> If not, ZwUnloadDriver will synchronously defer the call to a thread in the system process.
> Either way, to the caller of ZwUnloadDriver the behavior is synchronous and DriverUnload()
> has been called by the time it returns. The question is now…returning to what?

Actually, in our case the question is not “returns where” but “does it return at all”…

Consider the scenario when DrvUnload() waits for something that never happens. What is going to happen from the ZwUnloadDriver() caller’s perspective??? Will the thread that calls ZwUnloadDriver() go dead for good??? In other words, does ZwUnloadDriver() defer the call to another thread and wait for it to complete, or does it return to the caller straight away? Judging from the fact that DrvUnload() is a void function that does not return any value, the status that ZwUnloadDriver() returns to its caller should not depend on the DrvUnload() call in any possible way, so I would rather expect the latter scenario. Could you please check it…

Anton Bassov

NTDEV is sponsored by OSR

For our schedule of WDF, WDM, debugging and other seminars visit:
http://www.osr.com/seminars

To unsubscribe, visit the List Server section of OSR Online at http://www.osronline.com/page.cfm?name=ListServer

anton_bassov · April 8, 2008, 2:52am

> It defers to another thread and synchronously waits for the system thread to signal that it is done. In other words, ZwUnloadDriver handles DriverUnload synchronously, whether it calls it directly or defers to a system thread to call it and wait for it to return. This means that if DriverUnload blocks infinitely in the driver, bad mojo will occur b/c others will be blocked on it.

Well, in such a case waiting is, indeed, “not the best option”, so to say…

You can still make it work - instead of waiting, just manipulate the stack in such a way that execution goes directly to PsTerminateSystemThread() after ZwUnloadDriver() returns. However, it will require assembly - you cannot do it in C…

Anton Bassov

Scott_Noone_OSR · April 8, 2008, 10:44am

> instead of waiting, just manipulate the stack in such way that execution
> goes directly to PsTerminateSystemThread() after ZwUnloadDriver() returns.

Too complicated. I’d just remove the driver object from the object manager
namespace and then modify the page table to remove the image from memory.

(Cause I’ve learned that I have to: That was sarcastic. Please don’t try what
you just saw at home. Ever.)

-scott

Scott Noone
Software Engineer
OSR Open Systems Resources, Inc.
http://www.osronline.com


OSR_Community_User · April 8, 2008, 1:28pm

wrote in message news:xxxxx@ntdev…
>> Then how do they do this in Linux? IIRC they run a task periodically to
>> detect unowned/unreferenced modules and unload them. So a driver may
>> remain loaded for a few seconds after all its references go away.
>
> I am afraid you are much too optimistic…

Hmm usually I’m rather on the opposite side

> For the fun of doing it, I ran lsmod (64-bit Fedora 8), and got around 15
> loaded modules that are not used by anyone. I repeated the experiment half
> an hour later…but still I see exactly the same unused modules
> loaded…
>

See the notes on module autoclean:
http://tldp.org/HOWTO/html_single/Module-HOWTO/#AUTOLOAD
It’s optional and set per module; unloading is not triggered by removal of
devices; you can call it once or periodically,
until it succeeds.

Back to Windows - my “design pattern” for control devices is to put them
in a separate non-pnp driver, and load it manually.
Obviously, management hates this idea but now I have more arguments for it

Regards,
–PA

anton_bassov · April 8, 2008, 3:59pm

>> instead of waiting, just manipulate the stack in such way that execution
>> goes directly to PsTerminateSystemThread() after ZwUnloadDriver() returns.

> Too complicated.

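PUSH 0                                    ; remains on the stack after ZwUnloadDriver returns
PUSH argument_to_ZwUnloadDriver           ; the single argument of ZwUnloadDriver
PUSH address_of_PsTerminateSystemThread   ; takes the place of the return address
JMP address_of_ZwUnloadDriver             ; JMP rather than CALL, so nothing else is pushed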

Where is the complication??? If the Windows kernel relied upon the CDECL calling convention it would be more complex, but since it relies upon STDCALL, everything happens by itself…
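To see what the four instructions above leave on the stack, here is a toy model of a 32-bit x86 stack in C. It is a sketch under stated assumptions: the addresses and the argument value are made-up labels, `struct cpu`, `push`, `ret_n` and `run_sequence` are names invented for the illustration, and ZwUnloadDriver is assumed to return with `RET 4` because it takes one pointer-sized stdcall argument.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { STACK_WORDS = 16 };

/* Made-up values used only as labels in this model. */
#define ADDR_ZW_UNLOAD_DRIVER 0x1000u
#define ADDR_PS_TERMINATE     0x2000u
#define ARG_SERVICE_NAME      0xABCDu

struct cpu {
    uint32_t stack[STACK_WORDS];
    int      sp;        /* index of the top-of-stack word; grows toward 0 */
    uint32_t eip;       /* where execution continues */
    uint32_t arg_seen;  /* what the callee finds at [esp+4] on entry */
};

static void cpu_init(struct cpu *c)
{
    memset(c, 0, sizeof *c);
    c->sp = STACK_WORDS;               /* empty stack */
}

static void push(struct cpu *c, uint32_t value)
{
    assert(c->sp > 0);
    c->stack[--c->sp] = value;
}

/* stdcall-style return: pop the return address, then drop the arguments. */
static void ret_n(struct cpu *c, uint32_t arg_bytes)
{
    c->eip = c->stack[c->sp++];
    c->sp += (int)(arg_bytes / 4u);
}

static void run_sequence(struct cpu *c)
{
    push(c, 0);                         /* PUSH 0 */
    push(c, ARG_SERVICE_NAME);          /* PUSH argument */
    push(c, ADDR_PS_TERMINATE);         /* PUSH fake return address */
    c->eip = ADDR_ZW_UNLOAD_DRIVER;     /* JMP: no return address pushed */
    c->arg_seen = c->stack[c->sp + 1];  /* callee reads its argument */
    ret_n(c, 4);                        /* callee's RET 4 */
}
```

In this model, execution continues at the pushed address with the argument already removed and the 0 on top of the stack. Under stdcall that top slot is where the next function looks for its return address, which PsTerminateSystemThread never uses when it terminates the calling thread. What sits above the 0 depends on the thread's earlier stack contents, which the model does not cover.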

> I’d just remove the driver object from the object manager namespace and then modify the page table to remove the image from memory. (Cause I’ve learned that I have to: That was sarcastic. Please don’t try what you just saw at home. Ever.)

Just compare the above lines with what you have described, and you will see that your sarcasm is totally baseless (please note that if you put it into a separate .asm file, it will work for 64-bit Windows as well, and your driver remains fully “supported”)…

Anton Bassov

Doron_Holan · April 8, 2008, 4:12pm

This obviously only works from a thread that the driver created and started. You could not do this in a work item b/c you would be killing a thread that is not yours. Again, why all the pain to unload the driver from w/in the driver itself?

d


Tim_Roberts · April 8, 2008, 4:22pm

xxxxx@hotmail.com wrote:

>>> instead of waiting, just manipulate the stack in such way that execution
>>> goes directly to PsTerminateSystemThread() after ZwUnloadDriver() returns.
>>
>> Too complicated.
>
> PUSH 0
> PUSH argument_to_ZwUnloadDriver
> PUSH address_of_PsTerminateSystemThread
> JMP address_of_ZwUnloadDriver
>
> Where is complication???

Of course that’s complicated. It’s complicated because I have to spin
off a separate thread to do it. It’s complicated because I will have to
document the heck out of it so that future generations can figure out
what the hell I’m trying to do. It’s complicated because I will have to
perspire at the front of the room in a code review to defend this kind
of hackery. It’s complicated because it relies on undocumented behavior.

I’m not saying it doesn’t work, but don’t try to pretend that this is
intuitively obvious to the casual observer. Any assembler code today is
an invitation to extra scrutiny.

The better solution is to live with the fact that my driver doesn’t
unload in certain unusual circumstances.

–
Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.

Peter_Viscarola_OSR · April 8, 2008, 4:29pm

Hear, hear!! I couldn’t agree more. Tim’s entirely right.

I’d *FIRE* people for writing code like that and not having a satisfactory explanation as to why they did it.

“Putting it in a separate .asm file” doesn’t make it less complicated. In fact, the code you wrote above won’t work on x64, even in a “separate .asm file”. Consider the calling convention differences.

Peter
OSR

anton_bassov · April 8, 2008, 4:34pm

> This obviously only works from a thread that the driver created and started.

Of course - from the very beginning I was speaking about doing it from a thread that you have created and dedicated solely to the purpose of unloading a driver…

> Again, why all the pain to unload driver from w/in the driver itself?

This is a very good question that has no *objective* answer - indeed, objectively it is more reasonable to write a helper app/service/driver. However, in my experience, many clients would prefer to do everything in a single driver, and, since you are not their architect, you are going to have a hard time explaining to them that it is better to split the task into separate components. Believe it or not, but I know of someone who does even *floating-point* calculations in a driver - after all, not all architects are reasonable, so sometimes you have to fulfill rather stupid requests and think “unconventionally”…

Anton Bassov

anton_bassov · April 8, 2008, 4:43pm

> It’s complicated because I will have to perspire at the front of the room in a code review to defend this kind of hackery.

It may be useful when you otherwise have to perspire defending your request to introduce an additional component - in my experience, not everyone likes this idea…

> … it relies on undocumented behavior.

In which way??? What is “undocumented” here???

Anton Bassov

anton_bassov · April 8, 2008, 4:48pm

> In fact, the code you wrote above won’t work on an x64, even in a “separate .asm file”

Well, this is not surprising, taking into account that it is written for x86. This is what separate .asm files are for - the asm code that you write has to be specific to a given architecture. Otherwise, you are better off with inline assembly…

Anton Bassov

Peter_Viscarola_OSR · April 8, 2008, 4:54pm

So, doesn’t that inherently make it “complicated”?

I mean, I hear what you’re saying regarding creatively solving problems and all, and I certainly agree that if there was some client DEMAND – a demand you couldn’t talk them out of – I’d resort to this type of tactic (I’ve done waaaaay worse than that).

But it’s not a “reasonable” workaround for the general case, right? It’s complicated (er, or is that still subject to debate) and it’s hard to explain (in the comments, to your code review colleagues, etc.), and non-trivial to maintain.

Peter
OSR

Scott_Noone_OSR · April 8, 2008, 4:59pm

Re-read as:

“That’s insane. Here’s my more insane idea which I will try to pass off as
less insane.”

And maybe it will make sense.

I give up. Next time I’ll just do a spit-take.

-scott

Scott Noone
Software Engineer
OSR Open Systems Resources, Inc.
http://www.osronline.com


David_R_Cattley · April 8, 2008, 5:33pm

Well, it’s a good thing you didn’t actually write it Scott. The next sound
you would have heard was the other shoe dropping off of Peter’s foot…
(based on his comment, that is. Maybe you would merit just a warning!)

-dc


OSR_Community_User · April 8, 2008, 6:03pm

He did consider it. Actually, he spelled it out. He just got it wrong.

> Where is complication??? If Windows kernel relied upon CDECL calling
> convention it would be more complex, but once it relies upon STDCALL,
> everything happens in itself…

> Just compare the above lines with what you have described, and you
> will see that your sarcasm is totally baseless (please note that if you
> put it into a separate .asm file, it will work for 64-bit Windows as
> well, and your driver remains fully “supported”)…

…and we’re back, because all of this started with:

> What is so complex here???

Where he was also wrong. I must, however, admit to not knowing the
difference between ‘???’ and ‘???’ Most of what has followed has been
about this:

‘Therefore, the “maximum” that one can expect in the solution that I
have described is a deadlock’

It should have read “solution” instead of “maximum,” because normally
things that deadlock aren’t considered solutions.

Incidentally, you might never have the chance to fire him, because it
would only take a single exposure to ‘???’ before your clients would
fire you.

mm


anton_bassov · April 8, 2008, 6:47pm

Martin,

>> Therefore, the “maximum” that one can expect in the solution that I have described is a deadlock

> It should have read “solution” instead of “maximum,” because normally things that deadlock aren’t considered solutions.

I am afraid you just missed the irony of my statement…

For practical purposes, deadlock==BSOD - they both mean that your driver just does not work. This is why I put “maximum” in quotation marks - otherwise, my statement would imply that deadlock is somehow better than BSOD…

Anton Bassov

Matthew_Carter · April 8, 2008, 6:47pm

The solutions I’ve thought of and heard of to work around the problem are:

1. Implement a PnP device to use for receiving control messages instead of a
   control device, so all devices are PnP.
2. Implement a dummy PnP device to go along with your control device, just so
   that you’ll have a PnP device that can be shut down last and the PnP
   manager will properly unload the driver.
3. Stall the last filtered device from being removed right away when PnP
   requests it, send an order to your control application to immediately
   release the handle it holds, wait for the handles to the control device to
   close, then remove it before the last PnP device.
4. Make separate drivers to control the PnP filtered devices and non-PnP
   control devices, so that you don’t end up with one driver with both PnP
   and non-PnP devices.

I think pretty much all of those solutions have been suggested by people in
this thread. Implementing any one of those solutions adds quite a bit of
complexity to what otherwise might have been a very simple and
straightforward filter driver. All of them require PnP code that normally
may not be needed except #4, which has its own extra requirements of
inter-driver communication and needing one driver to load the other.
Instead of any of those I just kinda learn to live with it. If the
situation does come up where the control device is the last to shut down
then the driver gets stuck in memory, but if one of the devices starts again
the same driver stuck in memory will handle it, so it’s not a situation where
you have a memory leak that takes away more and more memory each time the
driver is run.

The only real problem with having the driver stuck there in memory is that
if the user tries to update the driver, he will have to reboot the whole
computer. A driver package that can unload and replace itself in all
situations without requiring the user to reboot feels much more professional
than one that can’t so I can understand the desire to achieve that, and I
share that desire, but in my own simple filter driver designs I just
couldn’t justify the additional complexity required to make it happen.
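The bookkeeping behind the third workaround listed earlier in this post (stall the last removal until the control application has released its handles) can be sketched as a small state model in plain C. This is only an illustration of the ordering, not driver code: `struct filter_state` and every function name here are hypothetical, and real PnP and IRP handling is left out entirely.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bookkeeping for one filter driver. */
struct filter_state {
    int  pnp_devices;      /* filtered PnP devices currently present */
    int  control_handles;  /* open handles to the control device */
    bool control_device;   /* does the control device exist? */
    bool removal_stalled;  /* last PnP removal waiting on handle close */
    bool app_notified;     /* control application was asked to let go */
};

/* A PnP device arrives; the control device exists while any is present. */
static void add_device(struct filter_state *s)
{
    s->pnp_devices++;
    s->control_device = true;
}

/* The control application opens the control device. */
static bool open_control(struct filter_state *s)
{
    if (!s->control_device || s->removal_stalled)
        return false;
    s->control_handles++;
    return true;
}

/* Delete the control device first, then let the last PnP device go. */
static void finish_last_removal(struct filter_state *s)
{
    s->control_device = false;
    s->pnp_devices--;
    s->removal_stalled = false;
}

/* Returns true if the removal completed, false if it was stalled or invalid. */
static bool remove_device(struct filter_state *s)
{
    if (s->pnp_devices == 0 || s->removal_stalled)
        return false;
    if (s->pnp_devices > 1) {
        s->pnp_devices--;
        return true;
    }
    if (s->control_handles > 0) {
        s->removal_stalled = true;   /* hold the last device back */
        s->app_notified = true;      /* ask the application to close */
        return false;
    }
    finish_last_removal(s);
    return true;
}

/* The control application closes a handle. */
static void close_control(struct filter_state *s)
{
    if (s->control_handles == 0)
        return;
    s->control_handles--;
    if (s->control_handles == 0 && s->removal_stalled)
        finish_last_removal(s);
}

/* The driver can go away only when nothing it created is left. */
static bool can_unload(const struct filter_state *s)
{
    return s->pnp_devices == 0 && !s->control_device;
}
```

The model only shows why the ordering matters. The extra state and the notification path to the application are exactly the additional complexity described above.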

“Pavel A.” wrote in message news:xxxxx@ntdev…
> […]
> Back to Windows - my “design pattern” for control devices is to put them
> in a separate non-pnp driver, and load it manually.
> Obviously, management hates this idea but now I have more arguments for it

anton_bassov · April 8, 2008, 6:59pm

> The better solution is to live with the fact that my driver doesn’t unload in certain unusual circumstances.

Earlier on this thread Pavel provided a link to a document which, although not related to Windows, says quite an interesting thing that, I believe, directly applies to this thread (and not only to it):

Indeed, quite often we go to great lengths trying to design a “perfect” solution, but is it always worth the effort, especially taking into consideration the computing power and amount of available resources of an average modern machine???

Anton Bassov

Matthew_Carter · April 8, 2008, 8:02pm

I think it’s just the opposite. It’s very seldom that anyone bothers to try
to design anything like a perfect solution to anything anymore *because*
they take into account the computing power and available resources of a
modern machine. I mean geez I compiled a .NET application and the
executable was over 1 MB and the memory footprint was something like 20 MB
for a ‘Hello World’ application that was a 4k executable that could be
loaded in maybe 300k memory 10 years ago with Win32 and yet someone thinks
.NET is a good thing. We develop slower drivers on purpose with WDF. Time
to market is by far a greater motivation than perfection in the PC world.


OSR_Community_User · April 8, 2008, 11:10pm

wrote in message news:xxxxx@ntdev…
>
> I’d FIRE people for writing code like that and not having a satisfactory
> explanation as to why they did it.

If memory serves, this technique became well known after the publications
of Gary Nebbett (the self-deleting exe, for one).

Regards,
–PA