I am developing an NDIS miniport driver (NDIS 6.4) for a 10G PCIe card. When the driver called NdisMAllocateNetBufferSGList() to populate the scatter/gather list for a given NET_BUFFER, the system crashed with IRQL_NOT_LESS_OR_EQUAL. The crash was observed 12 hours after I started a stress test using NTttcp.
HOW TO REPRODUCE THE ISSUE:
- Take two Windows 10 RS5 systems (T1 and T2).
- T1 contains my 10G card and T2 contains another reputable company's 10G card.
- Both cards have two ports.
- Assign static IPs to each port on T1 and T2 (Ip1 and Ip2).
T1
Port 1: 192.168.0.1
Port 2: 172.168.0.1
T2
Port 1: 192.168.0.2
Port 2: 172.168.0.2
- Connect Port 1 of T1 and T2 using a CAT 7 cable (or optical).
- Connect Port 2 of T1 and T2 using a CAT 7 cable (or optical).
- Run the following commands.
On T1 (system under test):
NTttcps.exe -s -m 8,,192.168.0.2 -a 2 -t 50400 -wu 10 -cd 10
NTttcps.exe -s -m 8,,172.168.0.2 -a 2 -t 50400 -wu 10 -cd 10
NTttcpr.exe -r -m 8,,192.168.0.1 -a 16 -t 50400 -wu 10 -cd 10
NTttcpr.exe -r -m 8,,172.168.0.1 -a 16 -t 50400 -wu 10 -cd 10
On T2 (support machine):
NTttcps.exe -s -m 8,,192.168.0.1 -a 2 -t 50400 -wu 10 -cd 10
NTttcps.exe -s -m 8,,172.168.0.1 -a 2 -t 50400 -wu 10 -cd 10
NTttcpr.exe -r -m 8,,192.168.0.2 -a 16 -t 50400 -wu 10 -cd 10
NTttcpr.exe -r -m 8,,172.168.0.2 -a 16 -t 50400 -wu 10 -cd 10
Below are the stack trace and other details:
IRQL_NOT_LESS_OR_EQUAL (a)
An attempt was made to access a pageable (or completely invalid) address at an
interrupt request level (IRQL) that is too high. This is usually
caused by drivers using improper addresses.
If a kernel debugger is available get the stack backtrace.
Arguments:
Arg1: 0000000000000008, memory referenced
Arg2: 0000000000000002, IRQL
Arg3: 0000000000000000, bitfield :
bit 0 : value 0 = read operation, 1 = write operation
bit 3 : value 0 = not an execute operation, 1 = execute operation (only on chips which support this level of status)
Arg4: fffff80312425f40, address which referenced memory
nt!KeBugCheckEx
nt!KiBugCheckDispatch+0x69
nt!KiPageFault+0x454
hal!HalpDmaAllocateScatterPagesFromScatterPoolV2+0x90
hal!HalpDmaAllocateScatterPagesFromScatterPool+0x4a
hal!HalpDmaAllocateMapRegisters+0xc7
hal!HalAllocateAdapterChannelV2+0xd2
hal!HalAllocateAdapterChannel+0x45
hal!HalBuildScatterGatherListV2+0x1850b
NDIS!NdisMAllocateNetBufferSGList+0x1f9 <--------------------------------
MyNetworkCard!MPSendNetBufferLists+0x2fb <-------------------------------
NDIS!ndisMSendNBLToMiniportInternal+0x11a
NDIS!ndisMSendNBLToMiniport+0xe
NDIS!ndisCallSendHandler+0xb8
NDIS!NdisSendNetBufferLists+0x2de
tcpip!FlFastSendPackets+0x93
tcpip!IpNlpFastContinueSendDatagrams+0x4f3
tcpip!IpNlpFastSendDatagram+0x3f1
tcpip!TcpTcbSend+0x572
tcpip!TcpEnqueueTcbSend+0x462
tcpip!TcpTlConnectionSendCalloutRoutine+0x24
nt!KeExpandKernelStackAndCalloutInternal+0x78
nt!KeExpandKernelStackAndCalloutEx+0x1d
tcpip!TcpTlConnectionSend+0x77
afd!AfdSend+0x5cf
afd!AfdDispatch+0x154
nt!IofCallDriver+0x59
nt!IopSynchronousServiceTail+0x1b1
nt!NtWriteFile+0x8bd
nt!KiSystemServiceCopyEnd+0x25
ntdll!NtWriteFile+0x14
KERNELBASE!WriteFile+0xfd
NTttcps!PostAsynchBuffer+0xc1
NTttcps!DoAsynchSendsReceives+0x278
NTttcps!StartSenderReceiver+0x45f
KERNEL32!BaseThreadInitThunk+0x14
ntdll!RtlUserThreadStart+0x21
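For reference, this is how the bugcheck parameters above can be read. It is only a minimal sketch in plain user-mode C: the type and function names (AccessKind, decode_access, looks_like_null_offset) are made up for illustration, and the "below one page" test is a common heuristic I am assuming, not something the debugger reports.

```c
#include <stdint.h>

/* Hypothetical helper type: how the faulting access is described by Arg3. */
typedef struct {
    int is_write;    /* bit 0: 0 = read, 1 = write */
    int is_execute;  /* bit 3: 0 = not an execute operation, 1 = execute */
} AccessKind;

/* Decode the Arg3 bitfield of bugcheck 0xA as described in the output above. */
static AccessKind decode_access(uint64_t arg3)
{
    AccessKind k;
    k.is_write   = (int)(arg3 & 0x1u);
    k.is_execute = (int)((arg3 >> 3) & 0x1u);
    return k;
}

/* Assumption (heuristic only): an address inside the first 4 KB page usually
 * means a field was read through a NULL base pointer, e.g. NULL + 0x8. */
static int looks_like_null_offset(uint64_t referenced_address)
{
    return referenced_address < 0x1000u;
}
```

With the values from this dump, decode_access(0) reports a read, and looks_like_null_offset(0x8) returns 1, which would be consistent with something inside hal reading a field at offset 8 of a NULL pointer at IRQL 2.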
Initial Diagnosis:
- This crash happened at an early stage of the data transfer. Up to this point, the driver does nothing apart from extracting the NET_BUFFERs from the NET_BUFFER_LISTs and queueing them in order. From the transmit handler, the driver dequeues each NET_BUFFER and then calls NdisMAllocateNetBufferSGList(). It does not do anything else. So could this be an OS/NDIS issue?
- I analyzed the NET_BUFFER being processed. Its data offset was 0xCA and its data length was 0x546. It had two MDLs: the first had a ByteCount of 0x100 and the second 0x510 (0x546 = 0x100 + 0x510 - 0xCA), so it passes MDL data length validation.
- The first MDL's flags value is 4 and the second MDL's is 18.
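The two checks above can be sketched in plain user-mode C as follows. This is an illustration only, not driver code: SimMdl is a hypothetical stand-in for the few MDL fields involved, and chain_covers and decode_mdl_flags are made-up helper names. The flag bit values are the low-byte MDL_* definitions from wdm.h. Since I did not note whether the flag values 4 and 18 were printed in hex or decimal, the decoder can simply be run on both readings.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the MDL fields used here (not the real wdm.h type). */
typedef struct SimMdl {
    struct SimMdl *next;   /* next MDL in the chain */
    uint32_t byte_count;   /* ByteCount of this MDL */
    uint16_t flags;        /* MdlFlags of this MDL */
} SimMdl;

/* Returns 1 if the chain holds at least data_offset + data_length bytes. */
static int chain_covers(const SimMdl *first, uint32_t data_offset, uint32_t data_length)
{
    uint64_t total = 0;
    for (const SimMdl *m = first; m != NULL; m = m->next)
        total += m->byte_count;
    return total >= (uint64_t)data_offset + data_length;
}

/* Low-byte MDL flag bits as defined in wdm.h. */
static const struct { uint16_t bit; const char *name; } k_mdl_flags[] = {
    { 0x0001, "MDL_MAPPED_TO_SYSTEM_VA" },
    { 0x0002, "MDL_PAGES_LOCKED" },
    { 0x0004, "MDL_SOURCE_IS_NONPAGED_POOL" },
    { 0x0008, "MDL_ALLOCATED_FIXED_SIZE" },
    { 0x0010, "MDL_PARTIAL" },
    { 0x0020, "MDL_PARTIAL_HAS_BEEN_MAPPED" },
    { 0x0040, "MDL_IO_PAGE_READ" },
    { 0x0080, "MDL_WRITE_OPERATION" },
};

/* Writes the names of the set bits into out, separated by '|', and returns out.
 * Stops early (without overflowing) if out is too small for the next name. */
static const char *decode_mdl_flags(uint16_t flags, char *out, size_t out_size)
{
    size_t used = 0;
    assert(out != NULL && out_size > 0);
    out[0] = '\0';
    for (size_t i = 0; i < sizeof k_mdl_flags / sizeof k_mdl_flags[0]; ++i) {
        const char *name = k_mdl_flags[i].name;
        size_t len = strlen(name);
        if (!(flags & k_mdl_flags[i].bit))
            continue;
        if (used + len + (used ? 1 : 0) >= out_size)
            break;
        if (used)
            out[used++] = '|';
        memcpy(out + used, name, len + 1);  /* copies the terminating NUL too */
        used += len;
    }
    return out;
}
```

With the values from this NET_BUFFER (ByteCounts 0x100 and 0x510, offset 0xCA, length 0x546), chain_covers returns 1 because 0xCA + 0x546 is exactly 0x610. For the flags, 0x4 decodes to MDL_SOURCE_IS_NONPAGED_POOL; 18 read as hex (0x18) gives MDL_ALLOCATED_FIXED_SIZE|MDL_PARTIAL, while 18 read as decimal (0x12) gives MDL_PAGES_LOCKED|MDL_PARTIAL.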
Please help me with your thoughts.
I can provide additional info on request.
Thanks
Sam