KiPageFault into BSOD when stepping over

Andrii_Chabykin Member Posts: 30
Good time, gentlemen.

I'm constantly running into Bug Check with the following stack:
00 ffffd000`20463d78 fffff801`0fa520ea nt!DbgBreakPointWithStatus
01 ffffd000`20463d80 fffff801`0fa519fb nt!KiBugCheckDebugBreak+0x12
02 ffffd000`20463de0 fffff801`0f9c9da4 nt!KeBugCheck2+0x8ab
03 ffffd000`204644f0 fffff801`0f9f1b1f nt!KeBugCheckEx+0x104
04 ffffd000`20464530 fffff801`0f8b85ad nt! ?? ::FNODOBFM::`string'+0x1797f
05 ffffd000`204645d0 fffff801`0f9d3f2f nt!MmAccessFault+0x7ed
06 ffffd000`20464710 fffff800`0034a2e3 nt!KiPageFault+0x12f
07 ffffd000`204648a0 fffff800`00e9441f Wdf01000!imp_WdfFdoInitQueryProperty+0x28
08 ffffd000`204648f0 fffff800`00e9a17f MyVolFlt!WdfFdoInitQueryProperty+0x5f
09 ffffd000`20464940 fffff800`0031055b MyVolFlt!MyVolFltEvtDeviceAdd+0x9f
0a ffffd000`20464bd0 fffff801`0f9449d9 Wdf01000!FxDriver::AddDevice+0xab


This happens ONLY when I step into/over WdfFdoInitQueryProperty. Breaking into debugger after this invocation produces no bug checks.

I've run into this problem multiple times in different places (in this module and in other modules). I can't figure out what's wrong.
1: kd> !irql
Debugger saved IRQL for processor 0x1 -- 0 (LOW_LEVEL)
1: kd> !analyze -v
*******************************************************************************
* *
* Bugcheck Analysis *
* *
*******************************************************************************

PAGE_FAULT_IN_NONPAGED_AREA (50)
Invalid system memory was referenced. This cannot be protected by try-except,
it must be protected by a Probe. Typically the address is just plain bad or it
is pointing at freed memory.
Arguments:
Arg1: ffffe00020464c10, memory referenced.
Arg2: 0000000000000000, value 0 = read operation, 1 = write operation.
Arg3: fffff8000034a2e3, If non-zero, the instruction address which referenced the bad memory
address.
Arg4: 0000000000000002, (reserved)

1: kd> !pool ffffe00020464c10
Pool page ffffe00020464c10 region is Nonpaged pool
ffffe00020464000 is not a valid large pool allocation, checking large session pool...
Unable to read large session pool table (Session data is not present in mini and kernel-only dumps)
ffffe00020464000 is not valid pool. Checking for freed (or corrupt) pool
Address ffffe00020464000 could not be read. It may be a freed, invalid or paged out page

1: kd> ? poi(DeviceInit)
Evaluate expression: -35183830610928 = ffffe000`20464c10
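
The bug check arguments printed above are easier to compare between runs when they are condensed into one line. The following is a minimal user-mode sketch, not part of any debugger API: `describe_page_fault` is a hypothetical helper, and it interprets Arg2 only in the way the !analyze output above describes it (0 = read, 1 = write).

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: summarizes the first three parameters of
   bug check 0x50 (PAGE_FAULT_IN_NONPAGED_AREA) as one line of text.
   arg1 = memory referenced, arg2 = 0 for read / 1 for write,
   arg3 = address of the instruction that touched the memory. */
int describe_page_fault(char *out, size_t outSize,
                        uint64_t arg1, uint64_t arg2, uint64_t arg3)
{
    const char *access = (arg2 == 0) ? "read"
                       : (arg2 == 1) ? "write"
                       : "unknown access";

    assert(out != NULL && outSize > 0);

    /* snprintf always terminates the string when outSize > 0. */
    return snprintf(out, outSize,
                    "%s of %016" PRIx64 " by instruction at %016" PRIx64,
                    access, arg1, arg3);
}
```

Fed with Arg1, Arg2 and Arg3 from this dump, the summary shows a read of the same address that `? poi(DeviceInit)` evaluates to.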


Is this somehow connected with kd? How can I avoid this bugcheck?

Thanks.

Comments

  • Alex_Grig Member Posts: 3,238
    This is an access to missing or paged-out memory at high IRQL. It cannot be stepped over.

    Find out from the crash dump or from the live debugger why the memory is missing or is being accessed at the wrong IRQL.
  • OSR_Community_User Member Posts: 110,217
    Plus the usual advice on using Driver Verifier and Special Pool.

    The question is why there is a pointer to the invalid page. It could be a
    dangling pointer following a release of storage, or bad pointer
    arithmetic. Some more context, such as the source code near the point of
    failure, would help; if you show variable names, you need to show the code
    that sets their values.
    joe


    > THis is an access to missing or paged memory at high IRQL. It cannot be
    > stepped over.
    >
    > Find our from the crashdump or from live debugger why the memory is
    > missing or accessed on wrong IRQL.
    >
    > ---
    > NTDEV is sponsored by OSR
    >
    > Visit the list at: http://www.osronline.com/showlists.cfm?list=ntdev
    >
    > OSR is HIRING!! See http://www.osr.com/careers
    >
    > For our schedule of WDF, WDM, debugging and other seminars visit:
    > http://www.osr.com/seminars
    >
    > To unsubscribe, visit the List Server section of OSR Online at
    > http://www.osronline.com/page.cfm?name=ListServer
    >
  • Andrii_Chabykin Member Posts: 30
    OK. Here's the failing code:

    NTSTATUS
    MyVolFltEvtDeviceAdd(_In_ WDFDRIVER Driver, _Inout_ PWDFDEVICE_INIT DeviceInit)
    /*++
    Routine Description:

        EvtDeviceAdd is called by the framework in response to AddDevice
        call from the PnP manager. We create and initialize a device object to
        represent a new instance of the device.

    Arguments:
        Driver - Handle to a framework driver object created in DriverEntry
        DeviceInit - Pointer to a framework-allocated WDFDEVICE_INIT structure.

    Return Value:
        NTSTATUS
    --*/
    {
        NTSTATUS status;

        PAGED_CODE();

        UNREFERENCED_PARAMETER(Driver);

        //PDEVICE_OBJECT Pdo = WdfFdoInitWdmGetPhysicalDevice(DeviceInit);

        DECLARE_UNICODE_STRING_SIZE(name, 256);
        ULONG retLen;
        status = WdfFdoInitQueryProperty(DeviceInit, DevicePropertyClassGuid, sizeof(name_buffer), name.Buffer, &retLen);

    I'm sure you understand it's running at passive IRQL:
    1: kd> !irql
    Debugger saved IRQL for processor 0x1 -- 0 (LOW_LEVEL)

    FAULTING_SOURCE_FILE: c:\program files (x86)\windows kits\8.1\include\wdf\kmdf\1.11\wdffdo.h

    FAULTING_SOURCE_LINE_NUMBER: 202

    FAULTING_SOURCE_CODE:
    198: PULONG ResultLength
    199: )
    200: {
    201: return ((PFN_WDFFDOINITQUERYPROPERTY) WdfFunctions[WdfFdoInitQueryPropertyTableIndex])(WdfDriverGlobals, DeviceInit, DeviceProperty, BufferLength, PropertyBuffer, ResultLength);
    > 202: }

    PAGE_FAULT_IN_NONPAGED_AREA (50)
    Invalid system memory was referenced. This cannot be protected by try-except,
    it must be protected by a Probe. Typically the address is just plain bad or it
    is pointing at freed memory.
    Arguments:
    Arg1: ffffe00020464c10, memory referenced.

    Arg1 is DeviceInit.



    Is this a debugger side effect? It doesn't happen when no debugger is attached, or when I simply don't step over that point.
  • OSR_Community_User Member Posts: 110,217
    > OK. Heres failing code:
    >
    > NTSTATUS
    > MyVolFltEvtDeviceAdd(_In_ WDFDRIVER Driver, _Inout_ PWDFDEVICE_INIT
    > DeviceInit)
    > /*++
    > Routine Description:
    >
    > EvtDeviceAdd is called by the framework in response to AddDevice
    > call from the PnP manager. We create and initialize a device object to
    > represent a new instance of the device.
    >
    > Arguments:
    > Driver - Handle to a framework driver object created in DriverEntry
    > DeviceInit - Pointer to a framework-allocated WDFDEVICE_INIT structure.
    > Return Value:
    > NTSTATUS
    > --*/
    > {
    > NTSTATUS status;
    >
    > PAGED_CODE();
    >
    > UNREFERENCED_PARAMETER(Driver);
    >
    > //PDEVICE_OBJECT Pdo = WdfFdoInitWdmGetPhysicalDevice(DeviceInit);
    >
    > DECLARE_UNICODE_STRING_SIZE(name, 256);

    I am sort of curious how you determined that 256 is a valid value here.
    In app space, it is 261 characters, and there is a manifest constant that
    is used to get this value, MAX_PATH. I presume there is a similar
    constant defined in ntddk.h, and that's what you should use here.


    > ULONG retLen;
    > status = WdfFdoInitQueryProperty(DeviceInit, DevicePropertyClassGuid,
    > sizeof(name_buffer), name.Buffer, &retLen);

    I don't know the spec for this, but in app space it is a common error to
    use sizeof() on a Unicode string but the APIs want character count, not
    byte count. In the kernel, string counts in UNICODE_STRINGs are in bytes,
    but be sure you double-check this. And it would probably make sense that
    you store the result in the name.Length field. But what is name_buffer?
    I don't see it declared anywhere, and the UNICODE_STRING would not have a
    meaningful sizeof(). The value is likely to be name.MaxLength or whatever
    the field is (I can't check it right now, and I'm trusting a very rusty
    memory that has not had to look at a UNICODE_STRING in several years). I
    note that retLen is not declared anywhere, either. It should be a local
    variable. You can't post code that has undefined variables; we don't know
    where they are declared, or even their types.
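
On the question of what name_buffer is and what sizeof() yields for it: to the best of my knowledge, DECLARE_UNICODE_STRING_SIZE declares a WCHAR array named <var>_buffer next to the UNICODE_STRING itself, so sizeof(name_buffer) is a byte count. The sketch below imitates that in user mode under that assumption. WCHAR16, USTRING and DECLARE_USTRING_SIZE are made-up stand-ins, and the real macro in the WDK headers may differ in detail.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in types, assumed to mirror the kernel ones closely enough
   for a size comparison. These are NOT the WDK definitions. */
typedef unsigned short WCHAR16;      /* 16-bit character */
typedef struct {
    unsigned short Length;           /* bytes currently in use */
    unsigned short MaximumLength;    /* bytes available in Buffer */
    WCHAR16 *Buffer;
} USTRING;

/* Simplified imitation of DECLARE_UNICODE_STRING_SIZE: declares the
   array <var>_buffer and a string <var> that points at it. */
#define DECLARE_USTRING_SIZE(var, count)                               \
    WCHAR16 var##_buffer[count];                                       \
    USTRING var = { 0, (unsigned short)((count) * sizeof(WCHAR16)),    \
                    var##_buffer }

/* Size of the backing array in bytes. */
size_t buffer_bytes(void)
{
    DECLARE_USTRING_SIZE(name, 256);
    (void)name;
    assert(sizeof(name_buffer) == name.MaximumLength);
    return sizeof(name_buffer);
}

/* Size of the backing array in characters. */
size_t buffer_chars(void)
{
    DECLARE_USTRING_SIZE(name, 256);
    (void)name;
    return sizeof(name_buffer) / sizeof(name_buffer[0]);
}
```

The two helpers differ by a factor of sizeof(WCHAR16), which is the byte-count versus character-count distinction raised above. Which of the two a given API expects still has to be checked in that API's documentation.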

    >
    > I'm sure you understand its running on passive IRQL:
    > 1: kd> !irql
    > Debugger saved IRQL for processor 0x1 -- 0 (LOW_LEVEL)
    >
    > FAULTING_SOURCE_FILE: c:\program files (x86)\windows
    > kits\8.1\include\wdf\kmdf\1.11\wdffdo.h
    >
    > FAULTING_SOURCE_LINE_NUMBER: 202
    >
    > FAULTING_SOURCE_CODE:
    > 198: PULONG ResultLength
    > 199: )
    > 200: {
    > 201: return ((PFN_WDFFDOINITQUERYPROPERTY)
    > WdfFunctions[WdfFdoInitQueryPropertyTableIndex])(WdfDriverGlobals,
    > DeviceInit, DeviceProperty, BufferLength, PropertyBuffer,
    > ResultLength);

    While it is nice that C and C++ let you compose long and complex
    expressions like this, you will find it a LOT easier to debug if you break
    this into about four lines of code. Even in C, you can make this more
    readable by declaring any variables you need in a local scope, e.g.,
    {
    SOMETYPE st;
    ANOTHERVAR v;
    st = ...some computation...;
    v = ...some computation based on st...;
    return v;
    }

    In C++, you don't need to declare the variables until they are used, but I
    often use this technique to give very limited scope to temporary variables
    I might need.
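
As a concrete illustration of splitting such an expression, here is a minimal sketch of a wrapper that looks up a function in a table, casts it, calls it, and returns the result, with one statement per step. Every name in it (FunctionTable, QueryFn, query_wrapper and so on) is hypothetical and merely imitates the shape of the quoted line 201. It is not the WDF header code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these names come from the WDF headers. */
typedef void (*GenericFn)(void);
typedef int (*QueryFn)(void *globals, const void *init,
                       unsigned long *resultLength);

enum { QueryTableIndex = 0, TableSize = 1 };

static GenericFn FunctionTable[TableSize];
static int DriverGlobals;

/* Sample implementation that the table entry points at. */
static int query_impl(void *globals, const void *init,
                      unsigned long *resultLength)
{
    (void)globals;
    (void)init;
    *resultLength = 78;              /* arbitrary value for the demo */
    return 0;
}

void install_table(void)
{
    FunctionTable[QueryTableIndex] = (GenericFn)query_impl;
}

/* Does the same work as a one-line
   "return ((QueryFn)FunctionTable[Index])(...)" wrapper, but each step
   has its own statement, so a debugger can stop between them. */
int query_wrapper(const void *init, unsigned long *resultLength)
{
    GenericFn raw;
    QueryFn fn;
    int status;

    raw = FunctionTable[QueryTableIndex];             /* 1: table lookup */
    assert(raw != NULL);
    fn = (QueryFn)raw;                                /* 2: cast */
    status = fn(&DriverGlobals, init, resultLength);  /* 3: call */
    return status;                                    /* 4: return */
}
```

Note that resultLength here is a parameter of the wrapper. The storage it points to belongs to whoever calls query_wrapper, for example `unsigned long len; query_wrapper(init, &len);`.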

    There are many things that can go wrong here; WdfFunctions at that index
    may have an invalid address or have been damaged by some bad pointer work.

    But it looks like you made one of the silliest possible errors. You read
    the documentation of the function, and it said "PULONG ResultLength". So
    you assumed you had to have a variable of type PULONG that you would pass
    in, which means you have no idea how C works. So you declared a variable
    of type PULONG. Did you initialize it? If you did not initialize it, it
    holds garbage. If your luck is good, the value that is in this
    uninitialized variable will cause a BSOD; if your luck is bad, it will be
    a pointer to something important and that something important will be
    clobbered.

    The correct way, which you would know if you understood C/C++, would be

    ULONG ResultLength;

    ...(WdfDriverGlobals, ..., &ResultLength)

    I suggest reading about pointers in C, and fully understand what a pointer
    is, and does, and how they are created. &ResultLength creates a PULONG
    referencing the ULONG ResultLength. This is beginner's C knowledge.
    Learn the language you are programming in. The specification of a type of
    an argument to a function DOES NOT MEAN YOU NEED A VARIABLE OF THAT TYPE.
    It means you need an /expression/ of that type. So, for example, if
    ResultLength is a ULONG, the expression &ResultLength is a PULONG. And
    that satisfies the requirement of the function prototype. You simply
    passed an uninitialized variable in; if you had set your warning level to
    4, I think it would have caught this, and certainly if you used the
    /ANALYZE option which runs The Program Formerly Known As Prefast, that
    would definitely have caught it. So you need to understand how to
    properly use the tool chain that creates a driver. /W3 is simply
    inadequate for most serious programming.
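
The out-parameter pattern described above, in a form that compiles and runs in user mode. This is only a sketch: query_value and required_size are hypothetical functions invented for the illustration, and plain unsigned long stands in for ULONG.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical query routine: always reports the required size through
   resultLength, and copies the value only if the buffer is big enough. */
int query_value(void *buffer, unsigned long bufferLength,
                unsigned long *resultLength)
{
    static const char value[] = "sample";

    *resultLength = (unsigned long)sizeof(value);
    if (bufferLength < sizeof(value)) {
        return -1;                   /* buffer too small, nothing copied */
    }
    memcpy(buffer, value, sizeof(value));
    return 0;
}

/* The caller declares an ordinary variable, which owns the storage,
   and passes its address. No pointer variable is needed. */
unsigned long required_size(void)
{
    unsigned long resultLength = 0;

    (void)query_value(NULL, 0, &resultLength);
    return resultLength;
}
```

The expression &resultLength has pointer type and satisfies the prototype, while resultLength itself stays a plain local variable with defined storage.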

    >> 202: }
    >
    > PAGE_FAULT_IN_NONPAGED_AREA (50)

    Yep, you have been lucky. You got a BSOD instead of clobbering something
    important.

    You need a good remedial course in the C language.
    joe


  • Daniel_Terhell Member Posts: 1,357
    Joe, you are commenting on wdffdo.h here. Not on his code, which entirely
    invalidates the harangue.

    //Daniel




    >Joe wrote:
    >
    >But it looks like you made one of the silliest possible errors. You read
    >the documentation of the function, and it said "PULONG ResultLength". So
    >you assumed you had to have a variable of type PULONG that you would
    >pass in, which means you have no idea how C works.
    >
    >The correct way, which you would know if you understood C/C++, would be
    >ULONG ResultLength;
    >...(WdfDriverGlobals, ..., &ResultLength)
    >I suggest reading about pointers in C, and fully understand what a pointer
    >is, and does, and how they are created. &ResultLength creates a PULONG
    >referencing the ULONG ResultLength. This is beginner's C knowledge.
    >Learn the language you are programming in. The specification of a type of
    >an argument to a function DOES NOT MEAN YOU NEED A VARIABLE OF THAT TYPE.
    >It means you need an /expression/ of that type. So, for example, if
    >ResultLength is a ULONG, the expression &ResultLength is a PULONG. And
  • Alexander_Krol Member - All Emails Posts: 50
    >
    > > OK. Heres failing code:
    > >
    > > NTSTATUS
    > > MyVolFltEvtDeviceAdd(_In_ WDFDRIVER Driver, _Inout_ PWDFDEVICE_INIT
    > > DeviceInit)
    > > /*++
    > > Routine Description:
    > >
    > > EvtDeviceAdd is called by the framework in response to AddDevice
    > > call from the PnP manager. We create and initialize a device object
    > to
    > > represent a new instance of the device.
    > >
    > > Arguments:
    > > Driver - Handle to a framework driver object created in DriverEntry
    > > DeviceInit - Pointer to a framework-allocated WDFDEVICE_INIT
    > structure.
    > > Return Value:
    > > NTSTATUS
    > > --*/
    > > {
    > > NTSTATUS status;
    > >
    > > PAGED_CODE();
    > >
    > > UNREFERENCED_PARAMETER(Driver);
    > >
    > > //PDEVICE_OBJECT Pdo = WdfFdoInitWdmGetPhysicalDevice(DeviceInit);
    > >
    > > DECLARE_UNICODE_STRING_SIZE(name, 256);
    ......
    >
    > > ULONG retLen;
    > > status = WdfFdoInitQueryProperty(DeviceInit,
    > DevicePropertyClassGuid,
    > > sizeof(name_buffer), name.Buffer, &retLen);
    >
    > >
    > > I'm sure you understand its running on passive IRQL:
    > > 1: kd> !irql
    > > Debugger saved IRQL for processor 0x1 -- 0 (LOW_LEVEL)
    > >
    > > FAULTING_SOURCE_FILE: c:\program files (x86)\windows
    > > kits\8.1\include\wdf\kmdf\1.11\wdffdo.h
    > >
    > > FAULTING_SOURCE_LINE_NUMBER: 202
    > >
    > > FAULTING_SOURCE_CODE:
    > > 198: PULONG ResultLength
    > > 199: )
    > > 200: {
    > > 201: return ((PFN_WDFFDOINITQUERYPROPERTY)
    > > WdfFunctions[WdfFdoInitQueryPropertyTableIndex])(WdfDriverGlobals,
    > > DeviceInit, DeviceProperty, BufferLength, PropertyBuffer,
    > > ResultLength);
    >
    > While it is nice that C and C++ let you compose long and complex
    > expressions like this, you will find it a LOT easier to debug if you
    > break
    > this into about four lines of code
    .........

    > But it looks like you made one of the silliest possible errors. You
    > read
    > the documentation of the function, and it said "PULONG ResultLength".
    > So
    > you assumed you had to have a variable of type PULONG that you would
    > pass
    > in, which means you have no idea how C works. So you declared a
    > variable
    > of type PULONG. Did you initialize it
    >
    > >> 202: }
    >
    > You need a good remedial course in the C language.
    > joe
    Sorry, Dr. Newcomer, but the crash is on return from the inlined KMDF
    function WdfFdoInitQueryProperty.
    So a) your stylistic criticism would be better addressed to Microsoft and,
    more importantly,
    b) PULONG ResultLength is simply the last of the function's parameters.
    So there is actually no C++ problem here.

    What OP states is that he gets this bugcheck _only if he breaks into
    debugger somewhere before this function invocation and then steps over
    (or into) this line of code_ and he has _no_ bugcheck if there is no
    debugger attached or if he breaks into debugger _after_ this line.

    To OP: Andrii, what happens if you break into the debugger before this
    line but just run the code instead of stepping? No crash? Or the same one?

    Best regards,
    Alex Krol
  • OSR_Community_User Member Posts: 110,217
    > Joe, you are commenting on wdffdo.h here. Not on his code, which entirely
    > invalidates the harangue.
    >
    > //Daniel
    >
    >
    That was not apparent to me. You would not believe how many times I see
    this error. Every programmer with marginal C knowledge makes it. After I
    explain it to them, I get some of the following

    ULONG result;
    PULONG presult = &result;

    and when I ask why, they tell me the function prototype requires a PULONG
    variable!

    Certainly the second-worst example was

    PULONG result = new ULONG;
    ...function using result...
    ...do things with *result
    delete result;

    and the worst (which was also in C++) was

    PULONG result = (PULONG)malloc(sizeof(ULONG));
    ...function using result...
    ...do things with *result...
    free(result);

    I would have at least three students per class who did some variant of
    these; they did not comprehend the concept of pointers, initialized
    variables, or how to read function prototypes. Many of these had more
    than a decade of C programming experience, and one complained that he had
    found pointers so esoteric that he never understood why anyone would want
    to use them [let us not digress into a discussion of
    pointers-vs-references, e.g., a Java/C# vs C/C++ discussion...]. With ten
    years' experience, he was also struct-challenged, and the notion of union
    was just so much noise. If he wanted an array of multivalued objects, he
    would create an array of int, an array of bool, an array of... instead of
    declaring a struct and making it an array.

    Maybe I'm just frustrated because I have lost six of the last ten days to
    illnesses of various sorts, including two hospital stays. But I think
    that code as we saw it shows a serious defect in thinking. And the OP
    should have spotted that error.

    I didn't think that any of the WDF source was available, so seeing a
    newbie mistake like this caused me to think it was the OP's code. If this
    is Microsoft code, some manager somewhere should catch pluperfect hell for
    either (a) not catching this or (b) not realizing his programmers were so
    undertrained that they could make this kind of error.

    In my Advanced Systems Programming course, I even devoted six slides to
    this problem, only to have students make the same error on their very
    first lab. Some of them just don't get pointers at all!
    joe
  • Andrii_Chabykin Member Posts: 30
    Hello, Alex.

    kd> bp MyVolFlt!MyVolFltEvtDeviceAdd
    kd> g
    Break instruction exception - code 80000003 (first chance)
    MyVolFlt!MyVolFltEvtDeviceAdd:
    fffff800`011be0e0 cc int 3
    1: kd> g
    KDTARGET: Refreshing KD connection

    *** Fatal System Error: 0x00000050
    (0xFFFFE00020464C10,0x0000000000000000,0xFFFFF800002692E3,0x0000000000000002)

    So the same :( And it's not only this code. I get similar bugchecks while debugging other drivers (this one is just a skeleton project that does nearly no real work).

    If I set the breakpoint right after the WdfFdoInitQueryProperty call, it runs like a charm.

    I should probably mention my environment:
    1) Host - Windows 8.1 Enterprise x64 with Hyper-V
    2) Target - Windows 8.1 Enterprise x64 under Hyper-V
    3) WinDbg 6.3.9600.16384 (WDK 8.1)
    4) The VM has COM1 configured as a pipe for KD
    5) bootdebug is enabled
    6) Target is a FRE build

    My only guess was Dynamic Memory in the VM configuration, so I disabled it. Still no luck.
  • OSR_Community_User Member Posts: 110,217
    See below...

    >>
    >> > OK. Heres failing code:
    >> >
    >> > NTSTATUS
    >> > MyVolFltEvtDeviceAdd(_In_ WDFDRIVER Driver, _Inout_ PWDFDEVICE_INIT
    >> > DeviceInit)
    >> > /*++
    >> > Routine Description:
    >> >
    >> > EvtDeviceAdd is called by the framework in response to AddDevice
    >> > call from the PnP manager. We create and initialize a device object
    >> to
    >> > represent a new instance of the device.
    >> >
    >> > Arguments:
    >> > Driver - Handle to a framework driver object created in DriverEntry
    >> > DeviceInit - Pointer to a framework-allocated WDFDEVICE_INIT
    >> structure.
    >> > Return Value:
    >> > NTSTATUS
    >> > --*/
    >> > {
    >> > NTSTATUS status;
    >> >
    >> > PAGED_CODE();
    >> >
    >> > UNREFERENCED_PARAMETER(Driver);
    >> >
    >> > //PDEVICE_OBJECT Pdo = WdfFdoInitWdmGetPhysicalDevice(DeviceInit);
    >> >
    >> > DECLARE_UNICODE_STRING_SIZE(name, 256);
    > ......
    >>
    >> > ULONG retLen;
    >> > status = WdfFdoInitQueryProperty(DeviceInit,
    >> DevicePropertyClassGuid,
    >> > sizeof(name_buffer), name.Buffer, &retLen);
    >>
    >> >
    >> > I'm sure you understand its running on passive IRQL:
    >> > 1: kd> !irql
    >> > Debugger saved IRQL for processor 0x1 -- 0 (LOW_LEVEL)
    >> >
    >> > FAULTING_SOURCE_FILE: c:\program files (x86)\windows
    >> > kits\8.1\include\wdf\kmdf\1.11\wdffdo.h
    >> >
    >> > FAULTING_SOURCE_LINE_NUMBER: 202
    >> >
    >> > FAULTING_SOURCE_CODE:
    >> > 198: PULONG ResultLength
    >> > 199: )
    >> > 200: {
    >> > 201: return ((PFN_WDFFDOINITQUERYPROPERTY)
    >> > WdfFunctions[WdfFdoInitQueryPropertyTableIndex])(WdfDriverGlobals,
    >> > DeviceInit, DeviceProperty, BufferLength, PropertyBuffer,
    >> > ResultLength);
    >>
    >> While it is nice that C and C++ let you compose long and complex
    >> expressions like this, you will find it a LOT easier to debug if you
    >> break
    >> this into about four lines of code
    > .........
    >
    >> But it looks like you made one of the silliest possible errors. You
    >> read
    >> the documentation of the function, and it said "PULONG ResultLength".
    >> So
    >> you assumed you had to have a variable of type PULONG that you would
    >> pass
    >> in, which means you have no idea how C works. So you declared a
    >> variable
    >> of type PULONG. Did you initialize it
    >>
    >> >> 202: }
    >>
    >> You need a good remedial course in the C language.
    >> joe
    > Sorry, Dr. Newcomer - but the crash is on return from inlined KMDF
    > function WdfFdoInitQueryProperty.
    > So - a) your stylistic criticism better be addressed to Microsoft and,
    > more importantly,
    > b) PULONG ResultLength is simply last one of function parameters.
    > So - C++ problems are actually absent.

    I have no idea what that assertion means. The code is simply wrong,
    W-R-O-N-G, big-time. The reason it is a Heisenbug is that single-stepping
    can alter the state of the stack, so that when single-stepping, the stack
    has a different garbage value for the uninitialized variable than if the
    programmer lets it run.

    The error is true whether the source code is C or C++, and is independent
    of the number of parameters or the position of the variable in the
    parameter list. The code is garbage. It has to be fixed. What is
    amazing is that it has been out there for so long with this deep and
    fundamental error in it, and nobody noticed!

    >
    > What OP states is that he gets this bugcheck _only if he breaks into
    > debugger somewhere before this function invocation and then steps over
    > (or into) this line of code_ and he has _no_ bugcheck if there is no
    > debugger attached or if he breaks into debugger _after_ this line.

    See above explanation. Recall the rule that using a variable whose value
    has not been established produces "undefined" results. The observed
    behavior is an example of one of the possible outcomes of code this bad.
    It could be much worse.

    >
    > To OP: Andrii, and if you break into debugger before this line but just
    > run code instead of stepping? No crash? Or the same one?

    Once the code is this broken, it doesn't matter what the OP does.
    Undefined behavior is the only possible outcome. The problem is not in
    the debugger, it is in the fact that the code is deeply and irrecoverably
    erroneous as written, and while the fix is trivial (remove the P from the
    declaration and add & to the parameter name), until the code IS fixed, it
    is simply not functional. The fact that it has not failed earlier is
    nothing short of miraculous. Or maybe it was failing, leading to
    unaccountable BSODs as the store through whatever pointer value was left
    on the stack overwrote some random important piece of data.

    There is no real choice here: if Microsoft wrote the code, Microsoft has
    to fix the code. If I had realized this was Microsoft code and I was
    writing a driver, I would exercise one of two choices: (a) stop all
    development until the error was corrected or (b) avoid calling the broken
    function while development continued. Simply calling this function opens
    you up to random memory damage and undefined behavior. It is unusable as
    written, assuming that the display we saw is the actual code.

    I once worked with a developing compiler, that had the property that it
    would frequently use 17 of the 16 available registers. When I saw this in
    my code, I would simply put my development on hold until the compiler team
    fixed the problem. When my boss's boss took me to task because I'd
    promised a port to the VAX "in a month", and it was now six weeks in, I
    pulled up my time sheets (I learned to keep careful time sheets) and
    pointed out that I was still on time; I had thus far expended fewer than
    five days on the project. When he demanded to know why, I pointedly said
    that the use-17-of-the-16-register bug was what killed it every time, and
    I could not debug my code when the compiler compiled it incorrectly. I
    was heavy on the sarcasm because I knew that he was the person who was
    writing the register allocator, and it was his bug. I ended the meeting
    by saying "When I get a working compiler, I expect it will take fewer than
    5 more days to port it, which means my estimate was off by two weeks." I
    then added, "Please let me know when we have a working compiler and I will
    try it again". It took two more weeks to fix that bug, and it took me
    three days to finish the port.

    This bug is a fatal bug. No progress can be made until it is fixed,
    unless progress can be made without calling that function. Somebody at
    Microsoft had better get a serious fire lit under them to get a fix out
    for this no later than last Tuesday.
    joe
  • OSR_Community_User Member Posts: 110,217
    > Hello, Alex.
    >
    > kd> bp MyVolFlt!MyVolFltEvtDeviceAdd
    > kd> g
    > Break instruction exception - code 80000003 (first chance)
    > MyVolFlt!MyVolFltEvtDeviceAdd:
    > fffff800`011be0e0 cc int 3
    > 1: kd> g
    > KDTARGET: Refreshing KD connection
    >
    > *** Fatal System Error: 0x00000050
    > (0xFFFFE00020464C10,0x0000000000000000,0xFFFFF800002692E3,0x0000000000000002)
    >
    > So the same :( And its not about only this code. I get similar bugchecks
    > while debugging other drivers (this one is just a skeleton project with
    > nearly no real work performed).
    >
    > If I set bp right after WdfFdoInitQueryProperty call - it runs like a
    > charm.

    No, it most definitely does not "run like a charm", unless you want to
    believe in charms, in which case it runs only because the random garbage
    left on the stack causes it to merely corrupt some random memory location,
    rather than try to access a nonexistent memory location. It is truly the
    "luck of the draw" that leaves an address on the stack that does not cause
    a BSOD. Now, if the spec of that function says that parameter can be
    NULL, then through only the most amazing coincidences, the value on the
    stack just happens to be zero. This is not "running", this is "not
    failing in spite of a deep and fundamental bug in the code". And it is
    only random luck that would leave this particular stack location set to
    NULL. When you start single-stepping, the stack gets a different pattern
    of garbage, and that particular garbage is fatal.

    I repeat: the code cannot be trusted. It does not "run" except by luck.
    It is entirely an accident that the value left on the stack without
    single-stepping does not cause a BSOD. Officially, the meaning of that
    code is undefined, and it is entitled to do anything at all, including
    causing your computer to vanish from Earth and take up orbit around
    Jupiter. However, the most likely "undefined" behavior is to damage some
    random piece of memory somewhere.

    >
    > Probably I must mention my environment:
    > 1) Host - Windows 8.1 Enterprise x64 with Hyper-V
    > 2) Target - Windows 8.1 Enterprise x64 under Hyper-V
    > 3) WinDbg 6.3.9600.16384 (WDK 8.1)
    > 4) VM got COM1 configured as a pipe for KD
    > 5) bootdebug is enabled
    > 6) Target is FRE build

    None of the above matters. The code is wrong. Nothing can save it,
    except fixing the bug. On the other hand, if you used to be in ordnance
    disposal and don't mind playing with armed explosives, you may be
    perfectly comfortable continuing to use this code. Just don't be
    surprised if it blows up in your face.

    >
    > My only guess was Dynamic Memory in VM configuration. So I disabled it.
    > Still no luck.

    No. The code is wrong. It cannot possibly work correctly, ever. If it
    has been giving the illusion of working, that is just the most amazing
    luck in the known universe. Until that bug is fixed, be afraid. Be very
    afraid.

  • Alexander_KrolAlexander_Krol Member - All Emails Posts: 50
    Ha! Curiouser and curiouser!
    Actually, I think the only person on this list who can shed some light
    on it is Doron Holan - he, after all is _the_ KMDF man at Microsoft.
    BTW, does this happen when debugging target is physical Windows 8.1
    machine and not a VM?

    Best regards,
    Alex Krol
  • Alexander_KrolAlexander_Krol Member - All Emails Posts: 50
    Dr. Newcomer wrote:
    >
    > No. The code is wrong

    The code in question is

    __checkReturn
    __drv_maxIRQL(PASSIVE_LEVEL)
    NTSTATUS
    FORCEINLINE
    WdfFdoInitQueryProperty(
        __in
        PWDFDEVICE_INIT DeviceInit,
        __in
        DEVICE_REGISTRY_PROPERTY DeviceProperty,
        __in
        ULONG BufferLength,
        __out_bcount_full_opt(BufferLength)
        PVOID PropertyBuffer,
        __out
        PULONG ResultLength
        )
    {
        return ((PFN_WDFFDOINITQUERYPROPERTY) WdfFunctions[WdfFdoInitQueryPropertyTableIndex])(
            WdfDriverGlobals, DeviceInit, DeviceProperty,
            BufferLength, PropertyBuffer, ResultLength);
    }

    (It is copy-pasted from the old KMDF 1.9 headers, but this inline function
    has not changed in later versions.)
    The debugger output shows only the last lines, starting from
    PULONG ResultLength
    and is missing the closing parenthesis.

    Best regards,
    Alex Krol
  • Andrii_ChabykinAndrii_Chabykin Member Posts: 30
    Alex, I don't have a physical machine available for debugging here :(

    And a small update. Once the first call to WdfFdoInitQueryProperty has dereferenced rdx (DeviceInit) at:
    fffff800`003702e3 488b1a mov rbx,qword ptr [rdx]

    subsequent step-overs don't trigger bug checks. The WDFDEVICE_INIT structure is allocated by the framework and must be valid at that point.
  • Andrii_ChabykinAndrii_Chabykin Member Posts: 30
    Second run with step into WdfFdoInitQueryProperty:
    1: kd> !pool ffffe000017d8e20 <- rdx = DeviceInit
    Pool page ffffe000017d8e20 region is Nonpaged pool
    ...
    *ffffe000017d8e10 size: 1f0 previous size: 1c0 (Allocated) *FxDr
    Pooltag FxDr : KMDF driver globals/generic pool allocation tag. Fallback tag in case driver tag is unusable., Binary : wdf01000.sys
  • Krzysztof_UchronskiKrzysztof_Uchronski Member - All Emails Posts: 165
    What does '!pte ffffe000`20464c10' say (run on dump from your first post)?

    Kris

  • Krzysztof_UchronskiKrzysztof_Uchronski Member - All Emails Posts: 165
    I haven't done anything in KMDF for a couple of years now, so this might be
    completely unrelated, but did you check what's going on on the other CPUs?
    I have seen race conditions become more "visible" when stepping in the debugger.

    Kris

  • Andrii_ChabykinAndrii_Chabykin Member Posts: 30
    Krzysztof, I was digging this right before you wrote :)

    Before unsuccessful call to WdfFdoInitQueryProperty (bp before call is made)
    1: kd> !pte poi(DeviceInit)
                                               VA ffffe00020464c10
    PXE at FFFFF6FB7DBEDE00    PPE at FFFFF6FB7DBC0000    PDE at FFFFF6FB78000810    PTE at FFFFF6F000102320
    contains 0000000000381863  contains 0000000000382863  contains 0000000000000000
    pfn 381       ---DA--KWEV  pfn 382       ---DA--KWEV  not valid

    After successful call to WdfFdoInitQueryProperty (bp after the call is made):

    1: kd> !pte poi(DeviceInit)
                                               VA ffffd00020464c10
    PXE at FFFFF6FB7DBEDD00    PPE at FFFFF6FB7DBA0000    PDE at FFFFF6FB74000810    PTE at FFFFF6E800102320
    contains 00000000002A4863  contains 00000000002A3863  contains 0000000000541863  contains 8000000002820963
    pfn 2a4       ---DA--KWEV  pfn 2a3       ---DA--KWEV  pfn 541       ---DA--KWEV  pfn 2820      -G-DA--KW-V

    Does this mean Windows fixes kernel PTEs on the fly? OK, but why does it bugcheck at that point?
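
    For anyone who wants to check the "PDE at" / "PTE at" columns by hand, here is a minimal sketch of the arithmetic. It assumes the fixed page-table self-map bases that x64 Windows of this era appears to use (0xFFFFF68000000000 for PTEs, 0xFFFFF6FB40000000 for PDEs) and 4 KB pages; the function names are made up for illustration and are not part of any Windows API.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed fixed self-map bases (x64 Windows before page-table base
   randomization). They reproduce the addresses in the !pte output above. */
#define PTE_BASE 0xFFFFF68000000000ULL
#define PDE_BASE 0xFFFFF6FB40000000ULL

/* Address of the PTE that maps va: one 8-byte entry per 4 KB page. */
static uint64_t pte_address(uint64_t va)
{
    /* Only canonical addresses make sense here (bits 63..47 all equal). */
    assert((va >> 47) == 0x1FFFFULL || (va >> 47) == 0);
    uint64_t vpn = (va >> 12) & 0xFFFFFFFFFULL;   /* 36-bit virtual page number */
    return PTE_BASE + vpn * 8;
}

/* Address of the PDE that maps va: one 8-byte entry per 2 MB region. */
static uint64_t pde_address(uint64_t va)
{
    assert((va >> 47) == 0x1FFFFULL || (va >> 47) == 0);
    uint64_t index = (va >> 21) & 0x7FFFFFFULL;   /* 27-bit index */
    return PDE_BASE + index * 8;
}

/* Bit 0 of a hardware page-table entry is the Valid (present) bit. */
static int entry_is_valid(uint64_t entry)
{
    return (int)(entry & 1);
}
```

    Feeding in ffffe00020464c10 gives the PDE and PTE addresses of the first output, and ffffd00020464c10 gives those of the second. Note that the two outputs are for different virtual addresses, so they describe different page-table entries; the first one stops at a PDE whose contents are zero, which is why !pte reports "not valid".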
  • Andrii_ChabykinAndrii_Chabykin Member Posts: 30
    PS: Other cores are idle.
  • Alex_GrigAlex_Grig Member Posts: 3,238
    1. Do you ever map memory as non-cached, by any chance?

    2. List the breakpoints: bl

    Does it show any stray breakpoints you forgot about?
  • Doron_HolanDoron_Holan Member - All Emails Posts: 10,645
    IIRC, DeviceInit is stack-based in the call to AddDevice, so I am not sure how !pte handles stack addresses. As for the break, this has nothing to do with KMDF; rather, it is the debugger interacting with the OS.
    d
    Bent from my phone
  • Andrii_ChabykinAndrii_Chabykin Member Posts: 30
    I'm sure it's not KMDF's fault, since I get similar behavior without KMDF. As I said, it looks like a kd side effect.
    The biggest issue is that the debugging experience suffers because of the restarts that the bug checks force on me. There were no problems without the debugger.
  • OSR_Community_UserOSR_Community_User Member Posts: 110,217
    Since this is breaking in at DeviceAdd() of a volume filter, is this the boot disk? Are you halting the OS during boot, so that when you start stepping you hit page fault timeouts?

    Just a common occurrence that should be checked first ...

  • Scott_Noone_(OSR)Scott_Noone_(OSR) Administrator Posts: 3,452
    I've seen this problem before when doing boot debugging. I suspect that a
    breakpoint has been set in the boot debugger, but that information has not
    been properly handed off to the normal kernel debugger.

    As we know, a software breakpoint is set by replacing a byte in the
    instruction stream with a 0xCC ("int 3"). It's the debugger's job to handle
    this breakpoint and swap the original byte back in before you hit "go". Note
    from the debugger output the O/S thinks that the first instruction in your
    code is "int 3"!

    kd> bp MyVolFlt!MyVolFltEvtDeviceAdd
    kd> g
    Break instruction exception - code 80000003 (first chance)
    MyVolFlt!MyVolFltEvtDeviceAdd:
    fffff800`011be0e0 cc int 3

    This is seriously bad business: if the debugger were working properly, this
    would be hidden from you and you would see the first compiler-generated
    instruction. If the debugger does not put the original byte back, the
    instruction stream is hosed and you'll get a bugcheck 50.

    I have yet to nail this down further, but I've definitely seen it
    frequently in VMware environments. Options:

    1. Stop enabling boot debugging if you don't need it

    2. Add a temporary __debugbreak() call in your code (don't forget to remove
    it!)

    3. Do your own breakpoints by swapping the byte yourself ("db" to find out
    what it is, "eb" to modify it). This sounds stupid and annoying, which it
    is, but it worked for me once in a pinch.
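
    To make the byte swapping concrete, here is a minimal user-mode sketch of the bookkeeping a debugger has to do for one software breakpoint. It is a toy model over an ordinary byte buffer, not how KD is actually implemented; the type and function names (SoftBreakpoint, bp_set, bp_clear) are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define INT3 0xCC  /* x86/x64 opcode for "int 3" */

/* What a debugger has to remember for each software breakpoint. */
typedef struct {
    size_t  offset;    /* where in the code buffer the breakpoint was planted */
    uint8_t original;  /* the instruction byte that 0xCC replaced */
    int     active;    /* nonzero while 0xCC is in the buffer */
} SoftBreakpoint;

/* Plant a breakpoint: save the original byte, then write 0xCC over it. */
static SoftBreakpoint bp_set(uint8_t *code, size_t size, size_t offset)
{
    assert(offset < size);
    SoftBreakpoint bp = { offset, code[offset], 1 };
    code[offset] = INT3;
    return bp;
}

/* Restore the original byte so the real instruction can execute again. */
static void bp_clear(uint8_t *code, SoftBreakpoint *bp)
{
    if (bp->active) {
        code[bp->offset] = bp->original;
        bp->active = 0;
    }
}
```

    If the saved byte is ever lost, or the record no longer matches where the 0xCC actually lives, nothing restores the original instruction, which is the situation option 3 works around by hand with "db" and "eb".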

    -scott
    OSR

  • Krzysztof_UchronskiKrzysztof_Uchronski Member - All Emails Posts: 165
    So I guess hardware breakpoints would work around the problem as well.

    Kris

  • Andrii_ChabykinAndrii_Chabykin Member Posts: 30
    Scott, the bp is set during a normal debugging session. I use the boot debugger only for .kdfiles, to make WinDbg upload a fresh .sys file. int 3 works fine, don't blame a proven technique :).

    I was sure this problem is well known. Should I change my configuration?
  • OSR_Community_UserOSR_Community_User Member Posts: 110,217
    I would suggest letting your system boot, then setting your breakpoint and adding another disk (either iSCSI/FC* or, if virtualized, a VMDK or VHDX); your EvtDeviceAdd() routine will then be called again.

  • Scott_Noone_(OSR)Scott_Noone_(OSR) Administrator Posts: 3,452
    Hardware breakpoints should indeed avoid the crash. OP: does setting .allow_bp_ba_convert 1 make the problem go away?

    I'm not sure what exactly causes the issue, so I don't know what to suggest in terms of changing your configuration. Every time I have hit it I've used one of the workarounds and made a note to isolate a repro and investigate in my copious free time :)

    -scott
    OSR

  • Andrii_ChabykinAndrii_Chabykin Member Posts: 30
    Now I finally managed to figure out what was wrong. I would normally set my bp during initial break-in sequence:

    Connected to Windows 8 9600 x64 target at (Thu Jan 16 00:54:33.435 2014 (UTC + 2:00)), ptr64 TRUE
    Kernel Debugger connection established.
    ************* Symbol Path validation summary **************
    Response Time (ms) Location
    Deferred cache*C:\Development\Tools\Symbols
    Deferred srv*http://msdl.microsoft.com/download/symbols
    Symbol search path is: cache*C:\Development\Tools\Symbols;srv*http://msdl.microsoft.com/download/symbols
    Executable search path is:
    Windows 8 Kernel Version 9600 MP (1 procs) Free x64
    Built by: 9600.16452.amd64fre.winblue_gdr.131030-1505
    Machine Name:
    Kernel base = 0xfffff800`5547e000 PsLoadedModuleList = 0xfffff800`55742990
    System Uptime: 0 days 0:00:00.102
    nt!DebugService2+0x5:
    fffff800`555d28e5 cc int 3
    kd> bp MyVolFltEvtDeviceAdd
    kd> g

    And here is what happens afterwards:

    Unload module \SystemRoot\system32\mcupdate_GenuineIntel.dll at fffff800`1b200000
    Unload module \SystemRoot\System32\drivers\werkernel.sys at fffff800`19ed5000
    ...
    Unload module \SystemRoot\system32\DRIVERS\MyVolFlt.sys at fffff800`1b9ed000
    nt!DebugService2+0x5:
    fffff800`555d28e5 cc int 3
    kd> k
    # Child-SP RetAddr Call Site
    00 fffff800`573991a8 fffff800`55544361 nt!DebugService2+0x5
    01 fffff800`573991b0 fffff800`555442ff nt!DbgLoadImageSymbols+0x45
    02 fffff800`57399200 fffff800`55b76fc4 nt!DbgLoadImageSymbolsUnicode+0x2b
    03 fffff800`57399240 fffff800`55b7684b nt!MiReloadBootLoadedDrivers+0x300
    04 fffff800`573993c0 fffff800`55b6c091 nt!MiInitializeDriverImages+0x163
    05 fffff800`57399470 fffff800`55b67299 nt!MiInitSystem+0x3d9
    06 fffff800`57399500 fffff800`557e84ea nt!InitBootProcessor+0x301
    07 fffff800`57399740 fffff800`557de1a3 nt!KiInitializeKernel+0x5a2
    08 fffff800`57399ad0 00000000`00000000 nt!KiSystemStartup+0x193

    It is unloading the boot-time drivers and reloading them at different start addresses! So when I set my breakpoint at MyVolFltEvtDeviceAdd, WinDbg inserts an int 3 instruction, and during module relocation that instruction is copied as is. So my breakpoint actually hits, despite the code relocation. But this is where Windows and the debugger fall apart: they don't know about this breakpoint at its new location.
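
    The failure mode can be sketched in a few lines of user-mode C. This is only a toy model of what I believe is happening, with a plain memcpy standing in for the loader moving the image; BpRecord, plant, relocate and debugger_knows are hypothetical names, not real KD or Mm interfaces.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define INT3 0xCC  /* x86/x64 opcode for "int 3" */

/* Hypothetical debugger-side record: an absolute address plus the saved byte. */
typedef struct {
    const uint8_t *address;   /* where the debugger believes the 0xCC lives */
    uint8_t        original;  /* the byte it must put back before resuming */
} BpRecord;

/* Plant a breakpoint at image[offset] and return the debugger's record. */
static BpRecord plant(uint8_t *image, size_t size, size_t offset)
{
    assert(offset < size);
    BpRecord rec = { &image[offset], image[offset] };
    image[offset] = INT3;
    return rec;
}

/* Stand-in for the image being moved: bytes are copied verbatim,
   including any 0xCC the debugger patched in. */
static void relocate(uint8_t *new_base, const uint8_t *old_base, size_t size)
{
    memcpy(new_base, old_base, size);
}

/* Does the debugger have a record for a breakpoint at this address? */
static int debugger_knows(const BpRecord *rec, const uint8_t *address)
{
    return rec->address == address;
}
```

    After plant() on the old image followed by relocate(), the first byte of the new image is 0xCC, but debugger_knows() is only true for the old address. The int 3 still fires at the new location, and nobody holds the original byte for it.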
  • raj_rraj_r Member - All Emails Posts: 987
    Can you try setting a bu myvolXXXX instead of a bp myvolXXXX during the
    initial boot sequence and see whether you can reproduce the behaviour?

    bp is documented to track an address, while bu tracks a symbol.

    Sorry if I am way off; I haven't followed the trail of this thread and
    haven't visited the forum to look it up while answering.
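
    A rough way to picture the difference, as a toy model only (this is not WinDbg's implementation, and ModelBp and its functions are invented names): an address-based breakpoint keeps pointing at the same absolute address after a module moves, while a symbol-based one is re-resolved against the new module base.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model: a "bp"-style breakpoint remembers an absolute address,
   a "bu"-style breakpoint remembers a symbol (here: an offset in the module). */
typedef struct {
    int      by_symbol;      /* 0 = address-based, 1 = symbol-based */
    uint64_t address;        /* current absolute target */
    uint64_t symbol_offset;  /* offset of the symbol from the module base */
} ModelBp;

static ModelBp model_bp_create(int by_symbol, uint64_t module_base,
                               uint64_t symbol_offset)
{
    ModelBp bp = { by_symbol, module_base + symbol_offset, symbol_offset };
    return bp;
}

/* Called when the module is loaded again at new_base. */
static void model_bp_on_module_load(ModelBp *bp, uint64_t new_base)
{
    assert(bp != 0);
    if (bp->by_symbol) {
        /* Symbol-based: resolve the symbol again in the reloaded module. */
        bp->address = new_base + bp->symbol_offset;
    }
    /* Address-based: nothing changes, the old absolute address is kept. */
}
```

    With a module that moves from one base to another, the address-based entry still targets the old location, whereas the symbol-based entry follows the function to its new address.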

  • Scott_Noone_(OSR)Scott_Noone_(OSR) Administrator Posts: 3,452
    Excellent find, another mystery solved. Certainly seems like there's some
    missing coordination here between the KD module and the Mm module.

    -scott
    OSR

