BugCheck in SCSIWmi

I have implemented very basic WMI support in my storport driver, based on the sample code in the DDK iSCSI driver. The very first request causes a panic, and it looks like a typical NULL pointer problem to me. Most of the parameters are extracted from the SRB, so it seems unlikely that they are the culprit.

My code (almost identical to the iSCSI sample):

//
// Process the incoming WMI request.
//
pending = ScsiPortWmiDispatchFunction(
    adapterRequest ? &pCard->WmiLibContext : &pCard->PDOWmiLibContext,
    srb->WMISubFunction,
    pCard,
    requestContext,
    srb->DataPath,
    srb->DataTransferLength,
    srb->DataBuffer);

This call is using pCard->WmiLibContext and that looks correct to me. I only implemented the minimum number of functions in my structure:

pWmiLibContext->GuidCount = gNumGuids;
pWmiLibContext->GuidList = gGuidList;

pWmiLibContext->QueryWmiRegInfo = WmiEntryQueryRegInfo;
pWmiLibContext->QueryWmiDataBlock = WmiEntryQueryDataBlock;
pWmiLibContext->SetWmiDataBlock = NULL; //WmiEntrySetDataBlock;
pWmiLibContext->SetWmiDataItem = NULL;
pWmiLibContext->ExecuteWmiMethod = NULL; //WmiEntryExecuteMethod;
pWmiLibContext->WmiFunctionControl = NULL; //WmiEntryFunctionControl;

The WMISubFunction is 0. I assume that this corresponds to HwScsiWmiQueryRegInfo. I set a breakpoint in my function and, as I expected, the panic occurs before that breakpoint is hit.
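As a cross-check on that assumption: the WMI minor function codes in wdm.h number IRP_MN_QUERY_ALL_DATA as 0x00 and IRP_MN_REGINFO as 0x08, so a value of 0 may not be the registration request. Below is a minimal sketch for decoding the value in trace output. It assumes WMISubFunction carries the minor code of the WMI IRP. The helper name WmiSubFunctionName is hypothetical, and the constants are copied from wdm.h only so that the sketch builds outside the WDK.

```c
#include <stdio.h>

/* WMI minor function codes, copied from wdm.h so this builds without the WDK. */
#define IRP_MN_QUERY_ALL_DATA        0x00
#define IRP_MN_QUERY_SINGLE_INSTANCE 0x01
#define IRP_MN_CHANGE_SINGLE_INSTANCE 0x02
#define IRP_MN_CHANGE_SINGLE_ITEM    0x03
#define IRP_MN_ENABLE_EVENTS         0x04
#define IRP_MN_DISABLE_EVENTS        0x05
#define IRP_MN_ENABLE_COLLECTION     0x06
#define IRP_MN_DISABLE_COLLECTION    0x07
#define IRP_MN_REGINFO               0x08
#define IRP_MN_EXECUTE_METHOD        0x09
#define IRP_MN_REGINFO_EX            0x0b

/* Hypothetical helper: map a sub-function code to its symbolic name. */
static const char *WmiSubFunctionName(unsigned char subFunction)
{
    switch (subFunction) {
    case IRP_MN_QUERY_ALL_DATA:         return "IRP_MN_QUERY_ALL_DATA";
    case IRP_MN_QUERY_SINGLE_INSTANCE:  return "IRP_MN_QUERY_SINGLE_INSTANCE";
    case IRP_MN_CHANGE_SINGLE_INSTANCE: return "IRP_MN_CHANGE_SINGLE_INSTANCE";
    case IRP_MN_CHANGE_SINGLE_ITEM:     return "IRP_MN_CHANGE_SINGLE_ITEM";
    case IRP_MN_ENABLE_EVENTS:          return "IRP_MN_ENABLE_EVENTS";
    case IRP_MN_DISABLE_EVENTS:         return "IRP_MN_DISABLE_EVENTS";
    case IRP_MN_ENABLE_COLLECTION:      return "IRP_MN_ENABLE_COLLECTION";
    case IRP_MN_DISABLE_COLLECTION:     return "IRP_MN_DISABLE_COLLECTION";
    case IRP_MN_REGINFO:                return "IRP_MN_REGINFO";
    case IRP_MN_EXECUTE_METHOD:         return "IRP_MN_EXECUTE_METHOD";
    case IRP_MN_REGINFO_EX:             return "IRP_MN_REGINFO_EX";
    default:                            return "unknown";
    }
}
```

In a miniport, the returned string could be passed to whatever debug print routine the driver already uses, just before the dispatch call.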

Now here is the output from !analyze -v

*******************************************************************************
* *
* Bugcheck Analysis *
* *
*******************************************************************************

DRIVER_IRQL_NOT_LESS_OR_EQUAL (d1)
An attempt was made to access a pageable (or completely invalid) address at an
interrupt request level (IRQL) that is too high. This is usually
caused by drivers using improper addresses.
If kernel debugger is available get stack backtrace.
Arguments:
Arg1: 0000000000000000, memory referenced
Arg2: 0000000000000002, IRQL
Arg3: 0000000000000000, value 0 = read operation, 1 = write operation
Arg4: fffffa6005d7a517, address which referenced memory

Debugging Details:

READ_ADDRESS: 0000000000000000

CURRENT_IRQL: 2

FAULTING_IP:
PMC80XX!ScsiWmipFindGuid+1f [d:\longhorn\drivers\storage\scsiwmi\wmilib.c @ 93]
fffffa60`05d7a517 498b08 mov rcx,qword ptr [r8]

DEFAULT_BUCKET_ID: VISTA_DRIVER_FAULT

BUGCHECK_STR: 0xD1

PROCESS_NAME: System

TRAP_FRAME: fffffa600218c430 -- (.trap 0xfffffa600218c430)
NOTE: The trap frame does not contain all registers.
Some register values may be zeroed or incorrect.
rax=0000000000000000 rbx=0000000000000003 rcx=fffffa6005e57610
rdx=fffffa6005e57590 rsi=0000000000000001 rdi=0000000000000000
rip=fffffa6005d7a517 rsp=fffffa600218c5c8 rbp=fffffa600218c6b8
r8=0000000000000000 r9=0000000000000008 r10=fffffa6005e57610
r11=fffffa6005e57610 r12=0000000000000000 r13=0000000000000000
r14=0000000000000000 r15=0000000000000000
iopl=0 nv up ei pl nz na pe nc
PMC80XX!ScsiWmipFindGuid+0x1f:
fffffa6005d7a517 498b08 mov rcx,qword ptr [r8] ds:0000000000000000=????????????????
Resetting default scope

LAST_CONTROL_TRANSFER: from fffff80001907562 to fffff80001857770

STACK_TEXT:
fffffa600218bbd8 fffff80001907562 : fffffa8007620bb0 0000000000000065 0000000000000000 fffff8000189a25c : nt!RtlpBreakWithStatusInstruction
fffffa600218bbe0 fffff8000190831b : 0000000000000003 0000000000000000 fffff80001897af0 00000000000000d1 : nt!KiBugCheckDebugBreak+0x12
fffffa600218bc40 fffff8000185d5d4 : fffffa8008e884b0 fffff880062a0b40 fffffa80099b7e30 ffffffffffffffff : nt!KeBugCheck2+0x6eb
fffffa600218c2b0 fffff8000185d26e : 000000000000000a 0000000000000000 0000000000000002 0000000000000000 : nt!KeBugCheckEx+0x104
fffffa600218c2f0 fffff8000185c14b : 0000000000000000 0000000000000000 fffffa800a3b7640 fffffa600218c648 : nt!KiBugCheckDispatch+0x6e
fffffa600218c430 fffffa6005d7a517 : fffffa6005d7a5d1 fffffa800a3b92a8 0000000000000000 0000000000000000 : nt!KiPageFault+0x20b
fffffa600218c5c8 fffffa6005d7a5d1 : fffffa800a3b92a8 0000000000000000 0000000000000000 0000000000000000 : PMC80XX!ScsiWmipFindGuid+0x1f [d:\longhorn\drivers\storage\scsiwmi\wmilib.c @ 93]
fffffa600218c5d0 fffffa6005d7aa2b : 0000000000000000 0000000000000000 0000000000000000 fffffa800a28efa0 : PMC80XX!ScsiWmipProcessRequest+0x65 [d:\longhorn\drivers\storage\scsiwmi\wmilib.c @ 310]
fffffa600218c640 fffffa6005c19c83 : fffffa800a3b7640 fffffa8000000000 0000000000000001 0000000000000003 : PMC80XX!ScsiPortWmiDispatchFunction+0x73 [d:\longhorn\drivers\storage\scsiwmi\wmilib.c @ 794]
fffffa600218c6b0 fffffa6005c12171 : fffffa800a3b8008 fffffa800a28efa0 0000000000000000 fffffa600218ca60 : PMC80XX!WmiOsProcessSrb+0x183 [d:\svn\spctrunk\tisa\drivers\windows\stor\startio.c @ 2469]
fffffa600218c740 fffffa600107b930 : fffffa800a3b8008 fffffa800a28efa0 0000000000000000 fffffa800a3b74f0 : PMC80XX!HWStartIo+0x5f1 [d:\svn\spctrunk\tisa\drivers\windows\stor\startio.c @ 492]
fffffa600218c890 fffffa600107da00 : 0000000000000000 fffffa600791d010 fffffa600791d0b8 0000000000001000 : storport!RaidAdapterPostScatterGatherExecute+0x150
fffffa600218c8f0 fffff80001d1e4ef : 0000000000001000 0000000000000000 fffffa8000011fd0 fffffa600791d000 : storport!RaidpAdapterContinueScatterGather+0x50
fffffa600218c920 fffffa600107dfe0 : 0000000000001f90 fffffa800a3b7640 00000000000005ff fffffa800a28efa0 : hal!HalBuildScatterGatherList+0x203
fffffa600218c990 fffffa600107e7e8 : 0000000000000000 0000000000001f90 fffffa800a3b7640 0000000000000000 : storport!RaidAdapterScatterGatherExecute+0x90
fffffa600218c9f0 fffffa60010bfe3e : 0000001400000014 0000000000000000 fffffa6001b66180 fffff6fb7ea00290 : storport!RaidAdapterRaiseIrqlAndExecuteXrb+0x18
fffffa600218ca20 fffffa60010c0025 : fffffa600791d000 fffffa600218cb00 0000000000000070 fffffa800a414000 : storport!RaWmiPassToMiniPort+0x23e
fffffa600218ca80 fffffa60010c031e : fffffa800a3b7640 fffffa800a3b7640 fffffa800a4017a0 fffff80001867ce0 : storport!RaWmiIrpRegisterRequest+0xe5
fffffa600218cad0 fffffa60010c05c7 : fffffa6001b667f0 fffffa6001087110 fffffa800a4017a0 fffffa800a3b74f0 : storport!RaWmiDispatchIrp+0x17e
fffffa600218cb40 fffff80001a6fd3a : fffffa800a401708 fffffa800a4017a0 fffffa800a3b74f0 fffff80001868600 : storport!RaDriverSystemControlIrp+0x67
fffffa600218cb80 fffff80001af7640 : 0000000000000000 fffffa800a4017a0 0000000000002000 fffffa800a414000 : nt!WmipForwardWmiIrp+0x1a6
fffffa600218cc00 fffff80001c14404 : 00000000c0000295 0000000000002000 0000000000000000 0000000000000000 : nt!WmipSendWmiIrp+0xa4
fffffa600218cc60 fffff80001c34aec : fffffa800a28b510 fffff800019948f8 fffffa8007620bb0 fffff880066c5ec0 : nt!WmipRegisterOrUpdateDS+0xb4
fffffa600218ccc0 fffff800018648c3 : fffff80001c34a90 fffff80001994801 fffffa8007620b00 0000000000000000 : nt!WmipRegistrationWorker+0x5c
fffffa600218ccf0 fffff80001a67f87 : fffff8000199ce00 0000000000000000 fffffa8007620bb0 0000000000000080 : nt!ExpWorkerThread+0xfb
fffffa600218cd50 fffff8000189a656 : fffffa6001b66180 fffffa8007620bb0 fffffa6001b6fd40 fffffa80076339d8 : nt!PspSystemThreadStartup+0x57
fffffa600218cd80 0000000000000000 : 0000000000000000 0000000000000000 0000000000000000 0000000000000000 : nt!KiStartSystemThread+0x16

STACK_COMMAND: kb

FOLLOWUP_IP:
PMC80XX!ScsiWmipFindGuid+1f [d:\longhorn\drivers\storage\scsiwmi\wmilib.c @ 93]
fffffa60`05d7a517 498b08 mov rcx,qword ptr [r8]

FAULTING_SOURCE_CODE:
No source found for 'd:\longhorn\drivers\storage\scsiwmi\wmilib.c'

SYMBOL_STACK_INDEX: 6

SYMBOL_NAME: PMC80XX!ScsiWmipFindGuid+1f

FOLLOWUP_NAME: MachineOwner

MODULE_NAME: PMC80XX

IMAGE_NAME: PMC80XX.SYS

DEBUG_FLR_IMAGE_TIMESTAMP: 4c17ab04

FAILURE_BUCKET_ID: X64_0xD1_PMC80XX!ScsiWmipFindGuid+1f

BUCKET_ID: X64_0xD1_PMC80XX!ScsiWmipFindGuid+1f

Without source code for scsiwmi.lib, it is difficult to know which parameter is causing this.

Check the requestContext parameter. It should be a pointer.

Igor Sharovar

The request context looks valid. It is defined as part of the SRB extension.

requestContext = &(srbExt->WmiRequestContext);

You could install the checked build of scsiwmi.lib and set a breakpoint in PMC80XX!ScsiWmipFindGuid.
Step through it and find the instruction where the crash happens. You will be reading assembler code, but you should be able to figure out which parameter is wrong. This may be the fastest way to find the problem.

Igor Sharovar

Where would I find the checked build of scsiwmi.lib? I see only one build available in the DDK.

Try setting bp ScsiWmipFindGuid in WinDbg. Execution should stop at the beginning of the function.

Igor Sharovar

Your NumGuids doesn’t match the actual size of GUID list, and one of the GUID pointers in the array is NULL.

I think you misinterpreted the information I provided at the top. I did not include my GUID list.

I was showing the SCSI_WMILIB_CONTEXT. I am pretty sure it is acceptable to have NULL pointers for functions that are not provided by my driver.

On 6/16/2010 9:07 AM, xxxxx@pmc-sierra.com wrote:

I think you misinterpreted the information I provided at the top. I did not include my GUID list.

I was showing the SCSI_WMILIB_CONTEXT. I am pretty sure it is acceptable to have NULL pointers for functions that are not provided by my driver.

The crash dump seemed to show a system GUID processor handling a NULL
pointer. He was suggesting that the problem could lie in your GUID
array. As you said, you did not include that, so we can’t know for sure.


Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.

>I did not include my GUID list.
You must initialize PSCSI_WMILIB_CONTEXT->GuidCount and PSCSI_WMILIB_CONTEXT->GuidList.
You could set GuidCount = 0 and GuidList = NULL and you probably would not crash. But in that case StorPort would likely send your driver a StopAdapter request.

Igor Sharovar

Here is my GuidList. This was based on the sample code I found.

// WMI Definitions
// Set if the guid is only used for firing events. Guids that can be queried
// and that fire events should not have this bit set.
#define WMIREG_FLAG_EVENT_ONLY_GUID 0x00000040

// Create structures with GUID so that pointers can be loaded in the gGuidList.
GUID MS_SM_AdapterInformationQueryGUID = MS_SM_AdapterInformationQueryGuid;
GUID MS_SM_EventControlWmiGUID = MS_SM_EventControlGuid;
//GUID MS_SM_HBAApiVersionGUID = MS_SM_HBAApiVersionGuid;
GUID MS_SM_PortInformationMethodsGUID = MS_SM_PortInformationMethodsGuid;
GUID MS_SM_ScsiInformationMethodsGUID = MS_SM_ScsiInformationMethodsGuid;
GUID MS_SM_TargetInformationMethodsGUID = MS_SM_TargetInformationMethodsGuid;
GUID MS_SMHBA_BindingEntryGUID = MS_SMHBA_BINDINGENTRYGuid;
GUID MS_SMHBA_PortAttributesGUID = MS_SMHBA_PORTATTRIBUTESGuid;
GUID MS_SMHBA_ProtocolStatisticsGUID = MS_SMHBA_PROTOCOLSTATISTICSGuid;
GUID MS_SMHBA_SasPhyGUID = MS_SMHBA_SAS_PHYGuid;
GUID MS_SMHBA_SasPortGUID = MS_SMHBA_SAS_PortGuid;
GUID MS_SMHBA_SasPhyStatisticsGUID = MS_SMHBA_SASPHYSTATISTICSGuid;
GUID MS_SMHBA_ScsiEntryGUID = MS_SMHBA_SCSIENTRYGuid;

SCSIWMIGUIDREGINFO gGuidList[] = {
{&MS_SM_AdapterInformationQueryGUID, 1, 0},
{&MS_SM_PortInformationMethodsGUID, 1, 0},
{&MS_SMHBA_BindingEntryGUID, 1, 0},
{&MS_SMHBA_PortAttributesGUID, 1, 0},
{&MS_SMHBA_ProtocolStatisticsGUID, 1, 0},
{&MS_SMHBA_SasPhyGUID, 1, 0},
{&MS_SMHBA_SasPortGUID, 1, 0},
{&MS_SMHBA_SasPhyStatisticsGUID, 1, 0}
};

Each entry in gGuidList is a GUID (provided by Microsoft in predefined MOF) along with the instance count and flags.

On 6/16/2010 9:36 AM, xxxxx@pmc-sierra.com wrote:

Here is my GuidList. This was based on the sample code I found.

That’s reasonable. Then, do you have this?
ULONG gNumGuids = sizeof(gGuidList)/sizeof(gGuidList[0]);


Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.

Yes. I somehow missed that in my cut/paste.

#define MSFC_NUM_GUIDS (sizeof(gGuidList) / sizeof(SCSIWMIGUIDREGINFO))

ULONG gNumGuids = MSFC_NUM_GUIDS;
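To rule out the count mismatch or NULL entry suggested earlier, the list can be checked once at initialization. The following is a minimal sketch that uses stand-in types so it builds without scsiwmi.h. MOCK_GUID, MOCK_GUIDREGINFO, gMockGuidList and FindNullGuidEntry are all hypothetical names; the field order (GUID pointer, instance count, flags) is assumed to mirror SCSIWMIGUIDREGINFO.

```c
#include <stddef.h>

/* Stand-ins for GUID and SCSIWMIGUIDREGINFO (hypothetical, user-mode only). */
typedef struct {
    unsigned long  Data1;
    unsigned short Data2;
    unsigned short Data3;
    unsigned char  Data4[8];
} MOCK_GUID;

typedef struct {
    const MOCK_GUID *Guid;          /* pointer to the data block GUID */
    unsigned long    InstanceCount; /* number of instances            */
    unsigned long    Flags;         /* registration flags             */
} MOCK_GUIDREGINFO;

static const MOCK_GUID gGuidA = { 0x1, 0, 0, { 0 } };
static const MOCK_GUID gGuidB = { 0x2, 0, 0, { 0 } };

/* Declared as an array ([]), so sizeof covers every entry. */
static const MOCK_GUIDREGINFO gMockGuidList[] = {
    { &gGuidA, 1, 0 },
    { &gGuidB, 1, 0 },
};

/* Element count derived from the array itself, so it cannot drift. */
#define MOCK_NUM_GUIDS (sizeof(gMockGuidList) / sizeof(gMockGuidList[0]))

/* Returns the index of the first entry whose Guid pointer is NULL,
   or -1 if every entry has a non-NULL pointer. */
static int FindNullGuidEntry(const MOCK_GUIDREGINFO *list, size_t count)
{
    size_t i;

    for (i = 0; i < count; i++) {
        if (list[i].Guid == NULL) {
            return (int)i;
        }
    }
    return -1;
}
```

Dividing by sizeof(gMockGuidList[0]) rather than by the type name keeps the count correct even if the element type is later changed.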

John,
What do you return in your HwScsiWmiQueryRegInfo? What do you assign to MofResourceName?

I set a breakpoint in HwScsiWmiQueryRegInfo and the BSOD occurs before that breakpoint triggers.

I am setting MofResourceName to NULL since I did not create my own MOF.

After stepping through the assembly language code a few times it looks like the problem is srb->DataPath. This is supposed to be a pointer to a GUID associated with the data block being retrieved by the WMI request (QueryRegInfo). Since the SRB is built outside my driver and a lot of drivers must be handling this call successfully, this all seems very mysterious.
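One way to catch this before the library dereferences the pointer is a guard in front of the dispatch call. This is a minimal sketch, not a fix, and it rests on an assumption that should be checked against the WDK headers: that for IRP_MN_REGINFO the DataPath field carries a registration action code (WMIREGISTER is 0, WMIUPDATE is 1 in wmistr.h) rather than a GUID pointer, so a NULL value is legitimate only for the registration requests. The helper names are hypothetical.

```c
#include <stddef.h>

/* WMI minor function codes, copied from wdm.h so this builds without the WDK. */
#define IRP_MN_QUERY_ALL_DATA        0x00
#define IRP_MN_QUERY_SINGLE_INSTANCE 0x01
#define IRP_MN_REGINFO               0x08
#define IRP_MN_EXECUTE_METHOD        0x09
#define IRP_MN_REGINFO_EX            0x0b

/* Hypothetical helper: nonzero when the request names a data block by GUID,
   meaning DataPath is expected to be a valid GUID pointer. */
static int WmiSubFunctionNeedsGuid(unsigned char subFunction)
{
    switch (subFunction) {
    case IRP_MN_REGINFO:
    case IRP_MN_REGINFO_EX:
        /* Assumption: DataPath is an action code here, not a pointer. */
        return 0;
    default:
        return 1;
    }
}

/* Hypothetical helper: returns 0 when a GUID is required but DataPath is
   NULL, so the caller can fail the SRB instead of calling the library. */
static int WmiRequestLooksValid(unsigned char subFunction, const void *dataPath)
{
    if (WmiSubFunctionNeedsGuid(subFunction) && dataPath == NULL) {
        return 0;
    }
    return 1;
}
```

If the guard fires for a request whose sub-function reads as 0, that suggests the sub-function field itself deserves a second look, for example how the SRB is being cast.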

But it is also possible I am misinterpreting the flow of the disassembled code.

>But it is also possible I am misinterpreting the flow of the disassembled code.
I think that is likely, because this parameter comes from the SRB.

Igor Sharovar

I replaced the NULL pointer in the SRB with a pointer to one of my GUIDs.

This eliminated the BSOD and caused my WmiDataQuery function to be called. I completed the operation for the GUID that I had placed in the SRB, and control eventually returned to my storport driver.

Unfortunately the request was considered “incomplete” based on the return value from ScsiPortWmiDispatchFunction and this obviously causes a driver deadlock.
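For reference, the completion logic after the dispatch call can be modeled as below. This is a user-mode sketch with stand-in types only. It assumes, per my reading of the WDK documentation, that ScsiPortWmiDispatchFunction returns TRUE when the request is still pending and FALSE when it has completed, and that on FALSE the miniport copies the status and size out of the request context (ScsiPortWmiGetReturnStatus and ScsiPortWmiGetReturnSize in a real driver) before completing the SRB. MOCK_WMI_REQUEST_CONTEXT, MOCK_SRB and MockFinishWmiSrb are hypothetical names.

```c
#include <stddef.h>

/* Stand-in for the fields a real driver reads from the request context. */
typedef struct {
    unsigned char ReturnStatus; /* SRB status recorded by the callback */
    unsigned long ReturnSize;   /* bytes written to the data buffer    */
} MOCK_WMI_REQUEST_CONTEXT;

/* Stand-in for the SRB fields touched on completion. */
typedef struct {
    unsigned char SrbStatus;
    unsigned long DataTransferLength;
    int           Completed;    /* models the RequestComplete notification */
} MOCK_SRB;

/* Call right after the dispatch function returns.
   pending != 0: leave the SRB alone; it is finished later.
   pending == 0: copy status and size, then complete the SRB. */
static void MockFinishWmiSrb(int pending,
                             const MOCK_WMI_REQUEST_CONTEXT *ctx,
                             MOCK_SRB *srb)
{
    if (pending) {
        return;
    }
    srb->SrbStatus = ctx->ReturnStatus;
    srb->DataTransferLength = ctx->ReturnSize;
    srb->Completed = 1;
}
```

Under the same assumption, a callback that returns without recording a result in the request context (ScsiPortWmiPostProcess in a real driver) would leave the request looking pending, which matches the symptom described above.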

>I replaced the NULL pointer in the SRB with a pointer to one of my GUIDs.
The WDK does not say that you need to supply the address of one of your own GUIDs.
Do you have to compile the hbaapi.mof file and attach it to a resource file?

Igor Sharovar

I thought that a MOF file was only required for new data blocks that I created specifically for my driver. The sample code I have indicates that it is possible to build a driver without creating any data blocks, simply returning NULL for QueryRegInfo.

There was no mention of the need to attach hbaapi.mof as a resource. That would imply that I would normally expect to attach two MOF files if I also had private data.

I would also have expected the first WMI request to be QueryRegInfo with a NULL pointer in the DataPath. If that is not the case, what information is being requested from WMI before the data blocks are even defined? Perhaps my live modification of the pointer has altered the expected behavior in some subtle way, masking the real problem here.