Sleep/hibernate kills network connection with NDIS filter loaded

MDH Member Posts: 13

For a KMDF-based NDIS LWF, is there anything special that needs to happen during the sleep/hibernate cycle? I notice that when systems (both Win7 and Win10) with my filter (modifying, optional) loaded resume from a suspended state, the network connection is dead. Not until I unload the driver does it pop back to life. My current workaround is to unload the driver when the service is notified that the system is being suspended and reload it when the system resumes, but that seems clunky. I've been trying to reproduce this in a VM, but everything works fine there, so it's difficult to figure out what the holdup is. Is there something I need to handle specifically for suspend/resume, or is it more likely that suspend/resume is just causing a bug in my code to manifest itself?

Comments

  • Jeffrey_Tippet_[MSFT] Member - All Emails Posts: 544

    This should work seamlessly. And breaking this is not a common "gotcha" of LWFs, so I don't have a psychic debugging answer for you. Instead, some generic avenues for investigation:

    • Make sure your filter isn't interfering with any OIDs (especially OID_PNP_SET_POWER, I suppose).
    • If the miniport driver is a little older, it might not implement the NDIS_MINIPORT_ATTRIBUTES_NO_PAUSE_ON_SUSPEND flag, in which case NDIS will pause the datapath when going to low power, and restart the datapath when coming out of low power. Make sure your filter driver doesn't lose NBLs or otherwise cause a hang during pause. This is easy to see from !ndiskd.miniport, since you'll see evidence that the miniport is in the middle of a pause transition. Within a few hundred milliseconds of resuming from low power, the miniport should have come out of pause already.
    • In general, filter drivers are unaware of system power states, and don't really need any special code to deal with power transitions. Check if your filter driver has any such code (aside, I suppose, from the workaround you mentioned), and code review it again. If the filter is doing something strange in response to a power IRP or OID_PNP_SET_POWER / OID_PM_PARAMETERS, that could be the source of the problem.
    • "the network connection is dead" is generic. If you can narrow that down to a specific problem, that might give more hints on the symptoms. Is the miniport in a bad state? Is there an NDIS thread stuck somewhere? Are OIDs getting failed? Are NBLs getting dropped? Does this repro with different miniport drivers from different IHVs, or just a particular one?
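The pause point in the list above can be illustrated with a toy model. This is a sketch in Python rather than driver code, and every name in it (FilterPauseModel, take_nbl, give_back_nbl) is hypothetical. It assumes the usual NDIS pattern in which FilterPause returns NDIS_STATUS_PENDING while the filter still holds NBLs, and the filter calls NdisFPauseComplete once the last one has been handed back.

```python
class FilterPauseModel:
    """Toy bookkeeping for the NBLs a filter is holding. Not NDIS code."""

    def __init__(self):
        self.held = 0             # NBLs taken by the filter and not yet handed back
        self.state = "Running"

    def take_nbl(self):
        if self.state != "Running":
            raise RuntimeError("cannot take NBLs while " + self.state)
        self.held += 1

    def give_back_nbl(self):
        if self.held == 0:
            raise RuntimeError("no NBL to give back")
        self.held -= 1
        # Handing back the last NBL is what lets a pending pause complete.
        if self.state == "Pausing" and self.held == 0:
            self.state = "Paused"

    def pause(self):
        # Pause completes immediately only if nothing is outstanding.
        self.state = "Paused" if self.held == 0 else "Pausing"
        return self.state

    def restart(self):
        if self.state != "Paused":
            raise RuntimeError("restart requires Paused, not " + self.state)
        self.state = "Running"


lwf = FilterPauseModel()
lwf.take_nbl()
lwf.take_nbl()
lwf.give_back_nbl()      # one NBL is still held
print(lwf.pause())       # Pausing
lwf.give_back_nbl()      # the last NBL comes back
print(lwf.state)         # Paused
lwf.restart()
print(lwf.state)         # Running
```

If the second give_back_nbl() never happened (a lost NBL), the model would stay in Pausing and restart() would raise. That is the kind of stuck pause transition that !ndiskd.miniport would be expected to reveal.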
  • MDH Member Posts: 13

    Thanks Jeff,

    Not doing anything crazy with OIDs other than PROMISCUOUS, but after your last answer to my question that all works fine. That's what's so strange about this, because I didn't think I needed to handle any power issues. When I say "the connection is dead" I mean that ping.exe returns "general failure," the adapter IP is 169.xxx, and Wireshark shows a lot of attempts looking for the gateway with no response. These are test boxes that are not configured for kernel debugging, so I don't have the answer about the miniport or NBLs. Like I said, the problem doesn't appear in the VM, so it's a bit harder to debug. I may "NotMyFault" the box so I at least have a dump of the bad state to investigate.

  • MDH Member Posts: 13

    I believe I have identified the problem but I'm not sure how it got this way. When running !ndiskd.filter on MyLwf1 I get the following output.

    kd> !ndiskd.filter fffff6026ed3cc60
    State Running
    Datapath Receive only
    References 1
    Flags RUNNING

    The datapath value is set to 'Receive only.' What does that refer to and how does it get into that state? My other loaded LWF (MyLwf2) has the datapath set to 'Normal.' If that's not the problem, there is nothing else I'm seeing in the !ndiskd output that looks unusual.

    6: kd> !ndiskd.netadapter ffffa38fe192c1a0

    MINIPORT

    Realtek PCIe GBE Family Controller
    
    Ndis handle        ffffa38fe192c1a0
    Ndis API version   v6.40
    Adapter context    ffffa38fe1b3a000
    Driver             ffffa38fe7e65ae0 - rt640x64  v9.1
    Network interface  fffff6025f54ea20
    
    Media type         802.3
    

    STATE

    Miniport           Running
    Device PnP         Started             Show state history
    Datapath           Normal
    Interface          Up
    Media              Connected
    Power              D0
    References         0n15                Show detail
    Total resets       0
    Pending OID        None
    Flags              BUS_MASTER, DEFAULT_PORT_ACTIVATED,
                       SUPPORTS_MEDIA_SENSE, DOES_NOT_DO_LOOPBACK,
                       MEDIA_CONNECTED
    PnP flags          RECEIVED_START, HARDWARE_DEVICE
    

    6: kd> !ndiskd.pendingnbls ffffa38fe192c1a0

    PHASE 1/3: Found 51 NBL pool(s).
    PHASE 2/3: Found 0 freed NBL(s).

    Pending Nbl        Currently held by                                        
    No pending NBLs were found.                                              
    

    PHASE 3/3: Found 0 pending NBL(s) of 1029 total NBL(s).
    Search complete.
    6: kd> !ndiskd.oid -miniport ffffa38fe192c1a0

    ALL PENDING OIDs

    [Showing all OIDs on the stack for miniport ffffa38fe192c1a0]
    

    No pending or queued OIDs were found.

    6: kd> !ndiskd.filter fffff6026e78ac60

    FILTER

    Realtek PCIe GBE Family Controller-MyLwf2-0000
    
    Ndis handle        fffff6026e78ac60
    Filter driver      fffff6026162ed60 - MyLwf2
    Module context     fffff6027111cdb0
    Miniport           ffffa38fe192c1a0 - Realtek PCIe GBE Family Controller
    Network interface  fffff60271d66a20
    
    State              Running
    Datapath           Normal
    References         1
    Flags              RUNNING
    
    Higher filter      fffff60272938c60 - Realtek PCIe GBE Family Controller-VirtualBox NDIS Light-Weight Filter-0000
    Lower filter       fffff6026ed3cc60 - Realtek PCIe GBE Family Controller-MyLwf1-0000
    
    Driver handlers
    

    6: kd> !ndiskd.filter fffff6026ed3cc60

    FILTER

    Realtek PCIe GBE Family Controller-MyLwf1-0000
    
    Ndis handle        fffff6026ed3cc60
    Filter driver      fffff6025fe5ad60 - MyLwf1
    Module context     fffff6026e0ccdd0
    Miniport           ffffa38fe192c1a0 - Realtek PCIe GBE Family Controller
    Network interface  fffff6026f54ca20
    
    State              Running
    Datapath           Receive only
    References         1
    Flags              RUNNING
    
    Higher filter      fffff6026e78ac60 - Realtek PCIe GBE Family Controller-MyLwf2-0000
    Lower filter       fffff6026f04ac60 - Realtek PCIe GBE Family Controller-WFP Native MAC Layer LightWeight Filter-0000
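When several filters are bound to one miniport, comparing the State and Datapath fields by eye gets tedious. The following is a minimal sketch, assuming the !ndiskd.filter output has been saved as text; the helper name filter_summary and the sample strings (trimmed from the dumps above) are illustrative only.

```python
import re

# Matches lines such as "    Datapath           Receive only".
_FIELD = re.compile(r"^\s*(State|Datapath)\s+(\S.*?)\s*$")


def filter_summary(dump: str) -> dict:
    """Pull the State and Datapath fields out of one !ndiskd.filter dump."""
    summary = {}
    for line in dump.splitlines():
        match = _FIELD.match(line)
        if match:
            summary[match.group(1)] = match.group(2)
    return summary


MYLWF1 = """\
    Realtek PCIe GBE Family Controller-MyLwf1-0000

    State              Running
    Datapath           Receive only
    References         1
    Flags              RUNNING
"""

MYLWF2 = """\
    Realtek PCIe GBE Family Controller-MyLwf2-0000

    State              Running
    Datapath           Normal
    References         1
    Flags              RUNNING
"""

for name, dump in (("MyLwf1", MYLWF1), ("MyLwf2", MYLWF2)):
    fields = filter_summary(dump)
    note = "" if fields.get("Datapath") == "Normal" else "  <-- differs from Normal"
    print(f"{name}: state={fields.get('State')}, datapath={fields.get('Datapath')}{note}")
```

The regular expression is tuned to the column layout shown in this thread; other ndiskd versions may format the fields differently.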
    
  • Jeffrey_Tippet_[MSFT] Member - All Emails Posts: 544

    "Receive only" happens when your filter only registers FilterReceiveNetBufferLists and FilterReturnNetBufferLists, and it does not register FilterSendNetBufferLists or FilterSendNetBufferListsComplete.

    It's perfectly fine for a filter to do this -- it just means that the Tx path will skip over the LWF, and the LWF only participates in the Rx path. But if you didn't do it intentionally, obviously that's something to look into.
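To picture what "the Tx path will skip over the LWF" means for this particular stack, here is a small Python model. It is only a sketch: the Lwf class and both helper functions are hypothetical, the list covers just the four filters visible in the dumps earlier in the thread, and the handler sets for the VirtualBox and WFP filters are assumed, since their Datapath values were not posted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Lwf:
    name: str
    handles_send: bool = True       # registered FilterSendNetBufferLists
    handles_receive: bool = True    # registered FilterReceiveNetBufferLists


# Highest filter first, following the Higher/Lower filter fields in the dumps.
STACK = [
    Lwf("VirtualBox NDIS Light-Weight Filter"),        # assumed: both paths
    Lwf("MyLwf2"),                                     # dump shows "Normal"
    Lwf("MyLwf1", handles_send=False),                 # dump shows "Receive only"
    Lwf("WFP Native MAC Layer LightWeight Filter"),    # assumed: both paths
]


def send_path(stack):
    """Filters a transmitted NBL passes through, from protocol toward miniport."""
    return [f.name for f in stack if f.handles_send]


def receive_path(stack):
    """Filters a received NBL passes through, from miniport toward protocol."""
    return [f.name for f in reversed(stack) if f.handles_receive]


print("Tx:", " -> ".join(send_path(STACK)))
print("Rx:", " -> ".join(receive_path(STACK)))
```

In this model MyLwf1 appears only on the Rx line. If that filter was meant to modify outgoing traffic as well, its absence from the Tx line is the thing to investigate.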

    It looks like there's another 3rd party LWF in there: the VirtualBox one. Try removing it if you can, and see if that improves things. There might be an interop issue.

    Everything else looks pretty good. It seems like everything is in a good state, but for some reason, someone's not delivering packets. You can try capturing traffic off the box to see if packets are getting out. (Running Wireshark locally doesn't tell you much about any bugs in the host driver stack, because Wireshark interfaces with the driver stack in a clumsy and weird way.) The built-in netmon driver can capture between each Modifying LWF, so you can see whether packets are getting through any particular layer. (Use netsh trace start capture=yes CaptureMultiLayer=yes.)

    You can also try setting a breakpoint on NdisMIndicateReceiveNetBufferLists, and just following the packets up the stack; or NdisSendNetBufferLists down the stack.
