KeDelayExecutionThread at DISPATCH_LEVEL

Chris_Troester · March 1, 2024, 5:13pm

I have a WDM driver. Its DeviceIoControl dispatch routine calls IoStartPacket. The StartIo routine is then called at DISPATCH_LEVEL, and that is where I handle the IOCTL.

In several IOCTL functions I use KeDelayExecutionThread with relative wait times of e.g. 100us:

KeDelayExecutionThread(KernelMode, FALSE, &liWaitTime);
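For context, the interval passed to KeDelayExecutionThread is expressed in 100-nanosecond units, and a negative value means "relative to now". A minimal sketch of how a value like liWaitTime might be computed follows. The helper names are hypothetical, and a plain int64_t stands in for LARGE_INTEGER.QuadPart so that the snippet compiles outside the WDK:

```c
#include <assert.h>
#include <stdint.h>

/* KeDelayExecutionThread intervals are counted in 100-nanosecond ticks. */
#define TICKS_PER_US 10
#define TICKS_PER_MS 10000

/* Hypothetical helper: relative interval for a delay given in microseconds.
 * Negative means "relative to now"; a positive value would be an absolute time. */
int64_t relative_interval_us(int64_t microseconds)
{
    assert(microseconds >= 0);
    return -(microseconds * TICKS_PER_US);
}

/* Hypothetical helper: the same for a delay given in milliseconds. */
int64_t relative_interval_ms(int64_t milliseconds)
{
    assert(milliseconds >= 0);
    return -(milliseconds * TICKS_PER_MS);
}
```

In a driver the result would be assigned to liWaitTime.QuadPart before the call, so a 100us wait corresponds to a QuadPart of -1000.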

I read that KeDelayExecutionThread must be called at IRQL <= APC_LEVEL; however, in years of use I have never had problems with the driver. What are the disadvantages of this approach?

Mark_Roddy · March 2, 2024, 1:36pm

I’m surprised that neither Driver Verifier nor SDV gets upset about this. You should just convert it to KeStallExecutionProcessor; that’s effectively what you are doing anyway.

Tim_Roberts · March 2, 2024, 2:53pm

100us is a long time for KeStallExecutionProcessor. Won’t that trigger the “DPC overtime” blue screen?

craig_howard · March 2, 2024, 2:53pm

When your thread is running at dispatch level (or ISR level) you have to watch out for watchdog timeout BSODs [https://learn.microsoft.com/en-us/windows-hardware/drivers/debugger/bug-check-0x133-dpc-watchdog-violation]. What specific problem are you trying to solve with a thread stall at that point?

Mark_Roddy · March 2, 2024, 4:38pm

I agree with Mr. Roberts that 100us is too long. So shorten it. What is the stall supposed to accomplish? However, calling KeDelayExecutionThread at DISPATCH_LEVEL is effectively stalling the processor anyway.

Mark_Roddy · March 2, 2024, 4:42pm

And the other approach is to stop using StartIo. It is a horrible legacy NT3.x feature. Use a spinlock for synchronization and do your sleep at passive level.

Chris_Troester · March 4, 2024, 2:35pm

There are several places in the code where I wait. Here are the longest waits:

Writing EEPROM data:

Write unlock sequence
Write data DWORDs 0 … 7. After each write, wait 20ms
Write lock sequence

The EEPROM access is handled in the FPGA logic.

Erasing Flash data:

Write the address of the flash block that should be erased to the BAR
Write the control bit to the BAR to start erasing the flash block
Poll the control bit until the erase is finished. The poll interval is 100ms
Repeat this for all flash blocks in the flash segment
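The erase steps above can be sketched as a bounded polling loop. This is only an illustration under assumptions: the register layout, the ERASE_BUSY_BIT name, and the fake_flash type are hypothetical and simulate the device in plain C. In a real driver the accesses would go through READ_REGISTER_ULONG / WRITE_REGISTER_ULONG on the mapped BAR, and the 100ms wait would be a KeDelayExecutionThread call made at PASSIVE_LEVEL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ERASE_BUSY_BIT 0x1u   /* hypothetical control bit: set while erasing */

/* Simulated device, for illustration only. */
typedef struct {
    uint32_t address_reg;       /* flash block address */
    uint32_t control_reg;       /* control/status bits */
    int      polls_until_done;  /* simulation: polls before the erase ends */
} fake_flash;

/* Simulated register read; the busy bit clears after enough polls. */
static uint32_t read_control(fake_flash *dev)
{
    if (dev->control_reg & ERASE_BUSY_BIT) {
        if (dev->polls_until_done > 0)
            dev->polls_until_done--;
        if (dev->polls_until_done == 0)
            dev->control_reg &= ~ERASE_BUSY_BIT;
    }
    return dev->control_reg;
}

/* Placeholder for the 100ms wait between polls. */
static void sleep_ms(unsigned ms) { (void)ms; }

/* Erase one block; give up after max_polls so a dead device cannot hang us. */
bool erase_block(fake_flash *dev, uint32_t block_addr, unsigned max_polls)
{
    assert(dev != NULL);
    dev->address_reg = block_addr;       /* step 1: block address */
    dev->control_reg |= ERASE_BUSY_BIT;  /* step 2: start the erase */
    for (unsigned i = 0; i < max_polls; i++) {
        sleep_ms(100);                   /* step 3: poll interval */
        if ((read_control(dev) & ERASE_BUSY_BIT) == 0)
            return true;
    }
    return false;                        /* timed out */
}
```

The poll limit is an addition that is not in the original description; without it, a device that never clears the bit would keep the loop spinning forever.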

In the book “Programming the Windows Driver Model” by Walter Oney (1999), there is a chapter “The Standard model for IRP processing”. It contains the following diagram:

I/O Manager -> Dispatch routine -> StartIo routine -> ISR -> DPC routine -> I/O Manager.

My driver uses the StartIo routine, although we don’t have an interrupt that signals that the device has completed the request. Oney says that one advantage is that only one IRP is sent to the StartIo routine at a time.

Mark_Roddy · March 4, 2024, 2:54pm

Sure, Walter wrote that book more than 20 years ago. Even then the StartIo model was dubious. StartIo serializes your IRP processing, but it does that by acquiring a spinlock that puts your StartIo callback at DISPATCH_LEVEL, restricting what you can do in that routine.

There are lots of ways to serialize IRP processing without using StartIo. (And this would be utterly trivial to do with KMDF, but that is likely not feasible for you.) Given your rather large delays, you really should be performing these operations at passive level.

craig_howard · March 4, 2024, 5:24pm

Flash operations should all be handled at PASSIVE_LEVEL, ideally with a worker thread. Your usermode code sends an IOCTL with the flash instructions (such as a list of addresses to erase, or the data to write) and the request pends. Kernel mode hands the operation to a worker thread, the worker thread does the flash work and, once it has finished, completes the IRP with a status. Usermode gets the result and decides what to do next.

There are just so many things that can go wrong with, or delay, working with Flash. Take those writes: if the Flash is getting close to its wear limit or the voltage drops, your write won’t “write”. So you should be doing a read, write, read operation and comparing the written value with the value read back to make sure the data “stuck”. The erases can be problematic too, since Flash erases are 1MB, which can take a while, and if there is a voltage drop then that erase won’t really happen.

Then there’s a concurrency problem (which might be what you’re trying to solve by going to Dispatch) as well as a reentrancy problem. All of those can be handled very effectively with a state engine in the kernel being updated by a worker thread.
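The read, write, read check described above might be sketched as follows. Everything here is a simulation under assumptions: fake_eeprom, dev_read, dev_write, and the worn_out flag are hypothetical stand-ins for the real device accessors, and skipping the write when the cell already holds the value is one possible use of the first read, not something stated in the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CELL_COUNT 8

/* Simulated storage; worn_out makes writes silently fail. */
typedef struct {
    uint32_t cells[CELL_COUNT];
    bool     worn_out;
} fake_eeprom;

typedef enum { WV_OK, WV_UNCHANGED, WV_MISMATCH } wv_result;

/* Hypothetical accessors; a driver would touch device registers instead. */
static void dev_write(fake_eeprom *d, unsigned i, uint32_t v)
{
    if (!d->worn_out)
        d->cells[i] = v;
}

static uint32_t dev_read(const fake_eeprom *d, unsigned i)
{
    return d->cells[i];
}

/* Read, write, read back, and compare to check that the data "stuck". */
wv_result write_verified(fake_eeprom *d, unsigned i, uint32_t value)
{
    assert(i < CELL_COUNT);
    if (dev_read(d, i) == value)
        return WV_UNCHANGED;   /* already correct: avoid an extra wear cycle */
    dev_write(d, i, value);
    return (dev_read(d, i) == value) ? WV_OK : WV_MISMATCH;
}
```

A caller could treat WV_MISMATCH as a reason to fail the IRP with an error status so that usermode can decide whether to retry.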

Tim_Roberts · March 4, 2024, 10:43pm

Right. There is NO WAY you can have a dispatch-level thread wait for milliseconds. You would need to use a timer-based state machine. That’s not impossible, but a passive thread would be more robust.

Or, let the driver handle the lowest level operations and manage it all from user-mode.
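For the timer-based state machine option, a rough sketch using the EEPROM write sequence from earlier in the thread could look like this. It is plain C so that it runs outside the kernel. All names are hypothetical and the device is simulated by an array. In a driver, each step would presumably run from a timer DPC, with the returned delay used to re-arm the timer (for example via KeSetTimer) instead of waiting in place.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EE_DWORD_COUNT    8    /* DWORDs 0..7 */
#define EE_WRITE_DELAY_MS 20   /* wait after each DWORD write */

typedef enum { EE_IDLE, EE_UNLOCK, EE_WRITE, EE_LOCK, EE_DONE } ee_state;

typedef struct {
    ee_state state;
    unsigned next_dword;
    uint32_t data[EE_DWORD_COUNT];    /* payload to write */
    uint32_t device[EE_DWORD_COUNT];  /* simulated EEPROM contents */
    int      unlocked;                /* simulated lock state */
} eeprom_job;

void eeprom_job_start(eeprom_job *job, const uint32_t *data)
{
    assert(job != NULL && data != NULL);
    for (unsigned i = 0; i < EE_DWORD_COUNT; i++)
        job->data[i] = data[i];
    job->next_dword = 0;
    job->unlocked = 0;
    job->state = EE_UNLOCK;
}

/* Runs one short, non-blocking step. Returns the delay in milliseconds
 * before the next step should run, or -1 once the job has finished. */
int eeprom_job_step(eeprom_job *job)
{
    assert(job != NULL);
    switch (job->state) {
    case EE_UNLOCK:
        job->unlocked = 1;               /* write unlock sequence */
        job->state = EE_WRITE;
        return 0;
    case EE_WRITE:
        job->device[job->next_dword] = job->data[job->next_dword];
        job->next_dword++;
        if (job->next_dword == EE_DWORD_COUNT)
            job->state = EE_LOCK;
        return EE_WRITE_DELAY_MS;        /* come back after 20ms */
    case EE_LOCK:
        job->unlocked = 0;               /* write lock sequence */
        job->state = EE_DONE;
        return -1;
    default:
        return -1;                       /* idle or already done */
    }
}
```

No step ever blocks, which is the property that matters at DISPATCH_LEVEL. The cost is that the IRP has to stay pending across many timer callbacks, which is part of why a passive-level thread is the more robust choice.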
