READ_REGISTER_ULONG randomly gets stuck during a DMA acquisition

Dear OSR Online users, I found this great list/forum while trying to debug the WDF driver I’m writing.
The device is an amcc5335-based board. I’m rewriting the XP WDM driver as a Win 7 32-bit WDF driver, developed with WDK 8.1 and Visual Studio 2013.

My problem is that sometimes (<1% of requests), during a DMA acquisition, a DeviceIoControl that performs a READ_REGISTER_ULONG with a valid BAR and offset (4-byte aligned) gets stuck until I stop the DMA bus mastering.
The system keeps running, and as soon as I stop the DMA transaction sequence the call unblocks.

The driver is synchronized at device level:
fdoAttributes.SynchronizationScope = WdfSynchronizationScopeDevice;

I map the memory area with this code:

devExt->barsInfo[barCounter].baseAddress = MmMapIoSpace(
    desc->u.Memory.Start,
    devExt->barsInfo[barCounter].length,
    //MmNonCached);
    MmCached);

I tried both the cached and the non-cached variants.

I read the mapped area with this code:
lBarAddr = devExt->barsInfo[lBar].baseAddress;
lBarLength = devExt->barsInfo[lBar].length;
lOffset = lInputBuffer->offset;

if ((lBarAddr == NULL) || (lBarLength <= lOffset)) {
    //DbgPrint("(lBarAddr == NULL) || (lBarLength <= lOffset)");
    WdfRequestComplete(Request, STATUS_INVALID_PARAMETER);
    return;
}
lAddress = (PCHAR)lBarAddr + lOffset;
lOutputBuffer->data = READ_REGISTER_ULONG((PULONG)lAddress);
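
A side note on the bounds check in the snippet above: it rejects only offsets at or beyond the end of the BAR and leaves alignment to the caller, so an unaligned offset in the last three bytes would still pass. A minimal sketch of a stricter check follows, written as plain user-mode C so it can be compiled anywhere. The helper name register_offset_is_valid is hypothetical and is not part of the driver posted here, and this is unlikely to be the cause of the hang described, since the offsets in question are reported as aligned.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* A 32-bit register read touches exactly four bytes. */
static_assert(sizeof(uint32_t) == 4, "register access width is 4 bytes");

/*
 * Hypothetical helper (not from the original driver): returns true when a
 * 4-byte read at `offset` lies entirely inside a BAR of `bar_length` bytes
 * and is naturally aligned.
 */
bool register_offset_is_valid(size_t bar_length, size_t offset)
{
    const size_t access_size = sizeof(uint32_t);

    if (bar_length < access_size)
        return false;               /* BAR too small for any 32-bit read */
    if (offset % access_size != 0)
        return false;               /* unaligned register access */

    /* Written as a subtraction so that offset + access_size cannot overflow. */
    return offset <= bar_length - access_size;
}
```

With a 256-byte BAR, offset 252 is the last one accepted; 254 and 256 are both rejected.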

I’m really new to debugging kernel-mode drivers. From what I was able to find out, the BAR and offset combination was a valid one.

The same snippet of code works properly in XP.
The hardware is a tested one and works properly.
The target pc is provisioned correctly.

Has anyone faced a similar issue? How can I find the point in the code where it is stuck, using WinDbg (embedded in Visual Studio)?

Thanks in advance.

xxxxx@hotmail.com wrote:

> My problem is that sometimes (<1% of requests), during a DMA acquisition, a DeviceIoControl that performs a READ_REGISTER_ULONG with a valid BAR and offset (4-byte aligned) gets stuck until I stop the DMA bus mastering.
> The system keeps running, and as soon as I stop the DMA transaction sequence the call unblocks.
>
> The driver is synchronized at device level:
> fdoAttributes.SynchronizationScope = WdfSynchronizationScopeDevice;

Why? Did you just choose that because it was easy, or did you actually
prove to yourself that you need that much locking?

This is getting to be one of my most repeated lines. KMDF has a bunch
of nifty auto-synchronization options, but because you don’t necessarily
see what’s going on underneath, it is very, very easy to
over-synchronize yourself, which means you end up blocking transactions
that you otherwise could have handled. With sync scope set to device,
KMDF will basically only allow your driver to handle one callback at a
time. (That’s not exactly true, but you get the point.) If the
handling of your DMA transaction means that you spend a long time in a
callback, then it will not present a new ioctl to you.

Allow me to quote from the MSDN page on “Using Automatic Synchronization”:

In general, we do not recommend using device-level synchronization.

And I don’t either. As a first strike, change the sync to
WdfSynchronizationScopeNone. That might reveal other synchronization
problems that you need to handle, but it will at least let you know
whether your 1% of requests are now able to proceed.


Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.

Thanks for the suggestion. I’m trying it as I write this.
It will take a while to trigger the issue.

I left this line in mainly because the WDF example for the amcc5933 had it.
I also tried WdfSynchronizationScopeQueue, and the driver has the same problem.

I enabled the debug messages in the target’s registry, and now, as soon as requests start stalling, I see a lot of:
CORE EH: req 858E7290 [0xec] status 0x4
CORE EH: req 858BD290 [0xec] status 0x4
CORE EH: req 858D7290 [0xec] status 0x4
CORE EH: req 858E8290 [0xec] status 0x4
CORE EH: req 858EA290 [0xec] status 0x4
CORE EH: req 858EC290 [0xec] status 0x4
CORE EH: req 858F7290 [0xec] status 0x4

But I have no idea what that means.

I forgot to mention an important thing: I have two separate I/O queues, one for IOCTLs and one for read/write requests. Both have to work at the same time.

I have the results: the same behaviour happens, but this time, as soon as I CancelIo the last pending DMA request (I have a cancellation callback set through WdfRequestMarkCancelable, because the request may never complete), I get a bug check, probably because the cancellation and ISR handlers are not in sync.

xxxxx@hotmail.com wrote:

> I have the results: the same behaviour happens, but this time, as soon as I CancelIo the last pending DMA request (I have a cancellation callback set through WdfRequestMarkCancelable, because the request may never complete), I get a bug check, probably because the cancellation and ISR handlers are not in sync.

Please don’t guess. This is a complicated environment, and you need to
understand why you get a bug check. Why would there be a conflict? Are
you updating the same data structures at the same time? If so, then you
need some finer-grained locking.

When you start a DMA transaction, what do you do with the request? Do
you tuck it away in a manual queue? If so, then you don’t need to mark
it as cancelable. You get that automatically from being in a queue.

Are you using the KMDF DMA APIs?

> I forgot to mention an important thing: I have two separate I/O queues, one for IOCTLs and one for read/write requests. Both have to work at the same time.

That by itself disqualifies device sync scope. With device scope, all
callbacks through ALL queues are serialized.


Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.

Well, it actually depends on what the OP means by “work at the same time”. If the Requests simply need to be IN PROGRESS at the same time, then he *can* use Sync Scope Device. If he actually needs the Requests to be active in the EvtIoXxx callbacks simultaneously, then obviously Sync Scope Device will prevent this.

There’s a pretty good article on Sync Scope (if I don’t say so myself) in the recent issue of The NT Insider.

While I’m generally not a fan of Sync Scope Device, if all you do is take a Request, program your device, and then park that Request for later processing it CAN be convenient. It saves you from having to think about serialization in your EvtIoXxx routines, and doesn’t really add that much overhead.

But I *certainly* agree with Tim (and MSDN) that Sync Scope Device is not usually the BEST solution to any problem.

Peter
OSR
@OSRDrivers

The BSOD is due to the DMA completion DPC trying to unmark as cancelable a request that was already cancelled.
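
The usual way to avoid this kind of double handling is to let exactly one path, either the cancel routine or the completion DPC, win the right to complete the request. In KMDF that decision normally comes from the return value of WdfRequestUnmarkCancelable: when it returns STATUS_CANCELLED, the cancel callback owns the request and the DPC should leave it alone. The sketch below models the same idea in portable C11 with an atomic ownership field. All type and function names are hypothetical, and this is a user-mode illustration of the pattern, not driver code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a pending DMA request; not a KMDF type. */
enum request_owner { OWNER_NONE = 0, OWNER_CANCEL = 1, OWNER_DPC = 2 };

struct pending_request {
    atomic_int owner;   /* which path claimed the right to complete */
    int completions;    /* how many times the request was completed */
};

void request_init(struct pending_request *req)
{
    assert(req != NULL);
    atomic_init(&req->owner, OWNER_NONE);
    req->completions = 0;
}

/* Returns true only for the first caller; the loser must not touch the request. */
bool request_try_claim(struct pending_request *req, enum request_owner who)
{
    int expected = OWNER_NONE;
    return atomic_compare_exchange_strong(&req->owner, &expected, (int)who);
}

/* Called from both the modelled cancel routine and the modelled completion DPC. */
bool request_complete_once(struct pending_request *req, enum request_owner who)
{
    if (!request_try_claim(req, who))
        return false;       /* the other path already owns completion */
    req->completions++;     /* stands in for completing the request */
    return true;
}
```

If the cancel path runs first, a later call from the DPC path returns false and the request is completed exactly once.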

I use packet-mode DMA through KMDF. Performance is good enough for my needs, so there is no need to switch back to WDM.

The two queues are auto serial.

My only need is that IOCTLs are not blocked during a long sequence of DMA transactions. I saw that KMDF splits my 4M DMA read into tons of tiny 20K transactions. That works for 99% of IOCTLs.
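
To put a rough number on "tons": the count of pieces is the total length divided by the per-piece limit, rounded up. The sketch below assumes "4M" means 4 MiB and "20K" means 20 KiB, which the post does not state exactly; dma_transfer_count is a hypothetical helper, not a KMDF function.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: number of pieces needed to move total_len bytes
 * when each piece can carry at most max_transfer_len bytes.
 */
size_t dma_transfer_count(size_t total_len, size_t max_transfer_len)
{
    assert(max_transfer_len > 0);

    /* Ceiling division without the overflow risk of (a + b - 1) / b. */
    return total_len / max_transfer_len
         + (total_len % max_transfer_len != 0 ? 1 : 0);
}
```

Under those assumptions, dma_transfer_count(4 * 1024 * 1024, 20 * 1024) gives 205, so each 4M read means a couple of hundred completions during which an IOCTL has to get its turn.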

Thanks for your help.

Could you point me to how to get a proper stack trace? I’m not able to get a useful one at the moment, because as soon as I break in to read it (reacting manually when I see the register update stop on screen), the call stack in the Visual Studio kernel debugger is not deep enough.

Sorry for the wrong words in the previous message; I was typing from my phone.

If I debugged the issue correctly, the request was completed with:
WdfRequestCompleteWithInformation(Request, STATUS_SUCCESS, sizeof(AMCCIODATA));

but the associated DeviceIoControl never returned.

I will create a new discussion, since this one is not about the real issue.