First, let me describe the platform where the problem occurs.
There is a custom PCIe endpoint device on a fairly typical Core i5 based mainboard. Because the endpoint is custom, it has some limitations and needs its own self-diagnostics. To provide this, its device driver runs a timer-based self-check that reads several device registers and validates their contents. Some of the registers to check are located in PCIe configuration space.
The periodically called function reads these registers by issuing a read to the parent driver, that is, the PCIe bus driver, using GetBusData(bus_ctx, PCI_WHICHSPACE_CONFIG, ...) calls.
If the registers are read and their contents show the device is healthy, no further action is needed. If the registers are read and their contents show an internal problem in the PCIe device, or the registers could not be read, the device is declared faulty and the driver is unloaded soon afterwards.
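The decision logic described above can be sketched in plain user-mode C. This is only an illustration, not the actual driver code: `get_bus_data_fn` is a stand-in with the same shape as the `GetBusData` routine of the bus interface (it returns the number of bytes actually read), and `reg_check_t`, `health_t`, `check_device_health`, `fake_config_t` and `fake_get_bus_data` are hypothetical names invented for this example. The real register offsets and expected values are not given in this post, so the ones used when calling it are placeholders.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Same numeric value as PCI_WHICHSPACE_CONFIG in the WDK headers. */
#define WHICHSPACE_CONFIG 0u

/* Stand-in with the shape of the bus interface's GetBusData routine:
   it returns the number of bytes actually read. */
typedef uint32_t (*get_bus_data_fn)(void *ctx, uint32_t data_type,
                                    void *buffer, uint32_t offset,
                                    uint32_t length);

typedef enum { DEVICE_HEALTHY, DEVICE_FAULTY, DEVICE_READ_FAILED } health_t;

/* Hypothetical rule: a 16-bit config register must hold an expected value. */
typedef struct {
    uint32_t offset;
    uint16_t expected;
} reg_check_t;

/* Reads every listed register and stops at the first problem found. */
health_t check_device_health(get_bus_data_fn get_bus_data, void *bus_ctx,
                             const reg_check_t *checks, size_t count)
{
    assert(get_bus_data != NULL);
    for (size_t i = 0; i < count; ++i) {
        uint16_t value = 0;
        uint32_t got = get_bus_data(bus_ctx, WHICHSPACE_CONFIG, &value,
                                    checks[i].offset, sizeof value);
        if (got != sizeof value)
            return DEVICE_READ_FAILED;   /* register could not be read */
        if (value != checks[i].expected)
            return DEVICE_FAULTY;        /* content shows a problem */
    }
    return DEVICE_HEALTHY;
}

/* Fake 256-byte configuration space, used only to exercise the logic. */
typedef struct {
    uint8_t bytes[256];
    int fail_reads;                      /* non-zero simulates a failed read */
} fake_config_t;

uint32_t fake_get_bus_data(void *ctx, uint32_t data_type, void *buffer,
                           uint32_t offset, uint32_t length)
{
    fake_config_t *cfg = ctx;
    if (cfg->fail_reads || data_type != WHICHSPACE_CONFIG ||
        offset > sizeof cfg->bytes || length > sizeof cfg->bytes - offset)
        return 0;                        /* nothing read */
    memcpy(buffer, cfg->bytes + offset, length);
    return length;
}
```

In the driver, both `DEVICE_FAULTY` and `DEVICE_READ_FAILED` lead to the same action (the device is declared faulty). They are kept separate here only so that a log can tell a bad register value apart from a read that returned too few bytes.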
This built-in health check has been working well for several years, since at least Windows 10 RS1. It recently came to my attention that a driver crash has been reported in rare cases (approximately 1 in 1650). Investigation showed that the crash is caused by the GetBusData() call.
All variables have been inspected and appear to be valid at the moment the crash occurs.
My feeling is that in normal cases the call to GetBusData() goes through without issues. In the crash case something goes wrong inside GetBusData() in the parent device driver: the output of the analyze command claims that a device object has been blocking an IRP for too long.
My question to the experts is: under which circumstances can the PCIe bus driver behave this way, i.e. when would a call to GetBusData() take too long?
Here is the call stack from the crash case:
65baeab0 fffff80261d94f47 : ffff830f c2e49130 ffff830fbff0e080 00000000 00000000 0000000000000000 : nt!KiSwapContext+0x76
65baebf0 fffff80261d94ab9 : 00000000 00000000 ffff830fc24dd2a0 00000000 00000000 ffff830fc0ee6b00 : nt!KiSwapThread+0x297
65baecb0 fffff80261d93840 : ffff830f c0ee6b00 ffff830f00000000 00000000 00000000 ffffd18065baedc1 : nt!KiCommitThreadWait+0x549
65baed50 fffff80275a0cae6 : ffffd180 65baee60 ffffd18000000000 00000000 00000000 0000000000000000 : nt!KeWaitForSingleObject+0x520
65baee20 fffff80275a64c90 : 00000000 00000000 ffffd18000000000 00000000 00000000 0000000000000000 : Wdf01000!FxIoTarget::SubmitSync+0x192 [minkernel\wdf\framework\shared\targets\general\fxiotarget.cpp @ 1839]
65baeee0 fffff80275a66158 : ffff830f c24dd2a0 0000000000000000 00000000 00000001 fffff802780ea078 : Wdf01000!FxIoTargetSendIo+0x290 [minkernel\wdf\framework\shared\targets\general\fxiotargetapi.cpp @ 812]
65baf150 fffff802780c9d3f : ffff830f 00000002 303578303d746573 ffffd180 65baf340 fffff802780e54fd : Wdf01000!imp_WdfIoTargetSendWriteSynchronously+0x38 [minkernel\wdf\framework\shared\targets\general\fxiotargetapi.cpp @ 1035]
65baf1a0 0000000000000000 : 00000002 0000ff12 000002130000fe12 00000112 0000ff13 000000000200002c : nnnn!UpdateDeviceStatusInfo+0x42f [L:\drivers\nnnn\driver\device.cpp @ 2494]