Strange Windows Storport port driver crash

I enabled the STOR_PERF_CONCURRENT_CHANNELS optimization in our Storport miniport driver. While running heavy I/O traffic with background hot-plugging of disk drives, Windows sometimes bugchecks (BSOD) after the test has run for more than 3 hours. We have hit two different crashes.

The first call stack is as follows:
Child-SP RetAddr Call Site
fffff80006d19ff8 fffff80006fce6d2 nt!DbgBreakPointWithStatus
fffff80006d1a000 fffff80006fcf4be nt!KiBugCheckDebugBreak+0x12
fffff80006d1a060 fffff80006ed9004 nt!KeBugCheck2+0x71e
fffff80006d1a730 fffff80006ed8469 nt!KeBugCheckEx+0x104
fffff80006d1a770 fffff80006ed70e0 nt!KiBugCheckDispatch+0x69
fffff80006d1a8b0 fffff80006edc182 nt!KiPageFault+0x260
fffff80006d1aa40 fffff880012b4179 nt!RtlpInterlockedPushEntrySList+0x22
fffff80006d1aa50 fffff80006e2262f storport! ?? ::FNODOBFM::`string'+0xd3d
fffff80006d1aac0 fffff880012b33a1 hal!HalBuildScatterGatherList+0x203
fffff80006d1ab30 fffff880012b4c81 storport!RaUnitStartIo+0x2e1
fffff80006d1abb0 fffff880012ba2ce storport! ?? ::FNODOBFM::`string'+0x18d2
fffff80006d1ac90 fffff80006ee45dc storport!RaidpAdapterRedirectDpcRoutine+0x4e
fffff80006d1acd0 fffff80006ee16fa nt!KiRetireDpcList+0x1bc
fffff80006d1ad80 0000000000000000 nt!KiIdleLoop+0x5a

Judging from the call stack, Storport somehow invoked its internal RaUnitStartIo and crashed inside the StorportBuildScatterGatherList function while executing a DPC routine, which is very strange.

The second call stack is as follows:

ChildEBP RetAddr
8039d8e4 8390ccb3 nt!RtlpBreakWithStatusInstruction
8039d934 8390d799 nt!KiBugCheckDebugBreak+0x1c
8039dd00 8390cb3f nt!KeBugCheck2+0x66d
8039dd24 838ea8e5 nt!KeBugCheckEx+0x1e
8039dd58 838ec67d nt!KeUpdateRunTime+0xd5
8039dd58 8381000f nt!KeUpdateSystemTime+0xed
8039ddd8 84bbcc3c hal!KeAcquireInStackQueuedSpinLockRaiseToSynch+0x3f
8039dde0 84bbdb8a storport!RaidAdapterAcquireStartIoLock+0x2d
8039de08 838e932b storport!RaidpAdapterTimerDpcRoutine+0x92
8039df28 838e8eeb nt!KiTimerListExpire+0x367
8039df88 838e9655 nt!KiTimerExpiration+0x22a
8039dff4 838e72f5 nt!KiRetireDpcList+0xba
8039dff8 8a86eacc nt!KiDispatchInterrupt+0x45

From the call stack, one of the system timers had expired.

After enabling the concurrent channels option, our host driver has to hold an internal spinlock to protect the IBQ PI and some other global data structures, because the HwStartIo routine now executes concurrently. Although the StartIo spinlock is not held while HwStartIo executes, it is still held when Storport invokes our driver's HwTimer and HwResetBus routines.
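
To make the locking requirement above concrete, here is a minimal user-mode sketch. It is not Storport code and not our driver's code: it assumes "IBQ PI" means an inbound queue with a shared producer index, it uses POSIX threads to stand in for concurrent HwStartIo calls, and a pthread mutex stands in for the internal spinlock. All names and sizes (inbound_queue, ibq_post, run_concurrent_submit, NUM_CHANNELS, IBQ_DEPTH) are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

#define NUM_CHANNELS    4       /* hypothetical number of concurrent submitters */
#define IOS_PER_CHANNEL 50000   /* hypothetical I/O count per submitter */
#define IBQ_DEPTH       64      /* hypothetical inbound queue depth */

/* User-mode stand-in for an inbound queue (IBQ) and its producer index (PI). */
typedef struct {
    pthread_mutex_t lock;       /* plays the role of the internal spinlock */
    uint32_t producer_index;    /* next free slot, wraps at IBQ_DEPTH */
    uint64_t posted;            /* total number of requests posted */
} inbound_queue;

/* Models the part of a start-I/O path that claims a slot and advances the PI. */
static void ibq_post(inbound_queue *q)
{
    pthread_mutex_lock(&q->lock);
    assert(q->producer_index < IBQ_DEPTH);
    q->producer_index = (q->producer_index + 1) % IBQ_DEPTH;
    q->posted++;
    pthread_mutex_unlock(&q->lock);
}

/* One thread per "channel", each posting IOS_PER_CHANNEL requests. */
static void *channel_thread(void *arg)
{
    inbound_queue *q = arg;
    for (int i = 0; i < IOS_PER_CHANNEL; i++)
        ibq_post(q);
    return NULL;
}

/* Runs NUM_CHANNELS submitters concurrently. Returns the total number of
 * posted requests and stores the final producer index in *final_pi. */
uint64_t run_concurrent_submit(uint32_t *final_pi)
{
    inbound_queue q = { .producer_index = 0, .posted = 0 };
    pthread_t threads[NUM_CHANNELS];

    pthread_mutex_init(&q.lock, NULL);
    for (int i = 0; i < NUM_CHANNELS; i++) {
        int rc = pthread_create(&threads[i], NULL, channel_thread, &q);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < NUM_CHANNELS; i++)
        pthread_join(threads[i], NULL);
    pthread_mutex_destroy(&q.lock);

    *final_pi = q.producer_index;
    return q.posted;
}
```

Calling run_concurrent_submit() should report NUM_CHANNELS * IOS_PER_CHANNEL posted requests. If the lock and unlock calls in ibq_post are removed, the updates from different threads can interleave and the count can come up short, which is the kind of corruption the internal spinlock is meant to prevent once HwStartIo is no longer serialized.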

If the concurrent channels option is NOT enabled, the stress test passes without any crash.

Please help analyze the cause of the BSOD.

Thanks
Regards
Leon

Leon,
Not sure if you noticed, MSDN (http://msdn.microsoft.com/en-us/library/windows/hardware/ff563845(v=vs.85).aspx) states not to use the concurrent channels flag.

Regards,
Girish.

Girish,
Someone from Microsoft said that Storport on Windows Server 2008 supports the concurrent channels option.
I tried it and confirmed that the flag can indeed be set; possibly the MSDN description is wrong.
The concurrent channels option helps a great deal with I/O performance on the host side.
With concurrent channels enabled, an overnight I/O stress test without background hot-plugging of disk drives ran fine. With background hot-plugging, however, we got the following BSOD after 3 or 4 hours.
The call stack is as follows:
Child-SP RetAddr Call Site
fffff80006d19ff8 fffff80006fce6d2 nt!DbgBreakPointWithStatus
fffff80006d1a000 fffff80006fcf4be nt!KiBugCheckDebugBreak+0x12
fffff80006d1a060 fffff80006ed9004 nt!KeBugCheck2+0x71e
fffff80006d1a730 fffff80006ed8469 nt!KeBugCheckEx+0x104
fffff80006d1a770 fffff80006ed70e0 nt!KiBugCheckDispatch+0x69
fffff80006d1a8b0 fffff80006edc182 nt!KiPageFault+0x260
fffff80006d1aa40 fffff880012b4179 nt!RtlpInterlockedPushEntrySList+0x22
fffff80006d1aa50 fffff80006e2262f storport! ?? ::FNODOBFM::`string'+0xd3d
fffff80006d1aac0 fffff880012b33a1 hal!HalBuildScatterGatherList+0x203
fffff80006d1ab30 fffff880012b4c81 storport!RaUnitStartIo+0x2e1
fffff80006d1abb0 fffff880012ba2ce storport! ?? ::FNODOBFM::`string'+0x18d2
fffff80006d1ac90 fffff80006ee45dc storport!RaidpAdapterRedirectDpcRoutine+0x4e
fffff80006d1acd0 fffff80006ee16fa nt!KiRetireDpcList+0x1bc
fffff80006d1ad80 0000000000000000 nt!KiIdleLoop+0x5a
Judging from the call stack, Storport somehow invoked its internal RaUnitStartIo and crashed in the RaidpAdapterContinueScatterGather function while executing a DPC routine, which is very strange.

Hi all,
Can anybody help look at this problem?

Thanks
Regards
Leon

You have not even mentioned the bugcheck code. Post the output of
!analyze -v

Regards,

Daniel Terhell
Resplendence Software Projects Sp
xxxxx@resplendence.com
http://www.resplendence.com


Hi Leon,

This is a known issue in Storport. Please contact your Microsoft Support channel if you want to pursue a hotfix.

Michael Xing [MSFT]