strange behavior in DiskDrive upper filter driver



Hi, all
I'm writting a driver which works as a DiskDrive upper filter ( ClassGUID: {4D36E967-E325-11CE-BFC1-08002BE10318}).

The code of IRP_MJ_READ Dispatch is very very simple :
NTSTATUS DiskReadDispatch(   IN  PDEVICE_OBJECT  DeviceObject,   IN  PIRP            Irp)
{
        PDISK_DEVICE_EXTENSION    deviceExtension;
        NTSTATUS  status = STATUS_INVALID_DEVICE_STATE;
        deviceExtension = (PDISK_DEVICE_EXTENSION )DeviceObject->DeviceExtension;
        if(deviceExtension == NULL)
        {
            KdBreakPoint();     // had never execute in here
            Irp->IoStatus.Status = status;
            Irp->IoStatus.Information = 0;
            IoCompleteRequest(Irp, IO_DISK_INCREMENT);
            return status;
        }
        PIO_STACK_LOCATION  irpStack = NULL;

        irpStack = IoGetCurrentIrpStackLocation(Irp);

        do 
        {
            if( irpStack->MajorFunction != IRP_MJ_READ )
            {
                  KdBreakPoint();  // had never execute in here
                  Irp->IoStatus.Status = status;
                  Irp->IoStatus.Information = 0;
                  return status;
            }

            ULONG   ulRequestReadLen = irpStack->Parameters.Read.Length;

            if (Irp->MdlAddress == NULL || ulRequestReadLen == 0)
            {
                break;
            }

            if (MmGetMdlByteCount(Irp->MdlAddress) < ulRequestReadLen)
            {
                break;
            }

            PVOID   pOutputBuffer = MmGetSystemAddressForMdlSafe(Irp->MdlAddress,NormalPagePriority);

            if (!pOutputBuffer)
            {
                break;
            }

            PVOID   pJustTest = ExAllocatePoolWithTag(NonPagedPool,ulRequestReadLen,'tseT');

            if (pJustTest)
            {
                memset(pJustTest,0x18,ulRequestReadLen);
                
                 if(irpStack->Parameters.Read.ByteOffset.QuadPart == 0xF8CE0000)
                {
                     KdBreakPoint();  // just for debug
                }

                memcpy(pOutputBuffer, pJustTest, ulRequestReadLen);

                if ( memcmp( pOutputBuffer,  pJustTest, ulRequestReadLen)  !=  0 )
                {
                    KdPrint(("data diff!\r\n"));   // !!!!!!!!!! Problems here:  I don't understand why the code can excute to here
                    KdBreakPoint();
                }

                ExFreePool(pJustTest);
            }
        } while (FALSE);

        // send to lower and return
        IoSkipCurrentIrpStackLocation(Irp);
        return  IoCallDriver(deviceExtension->LowerDeviceObject,Irp);
}

As we can see, the code basically did nothing, just malloc a buffer , set every byte to 0x18 , copy to OutputBuffer of current IRP , then compare the data in pOutputBuffer with pJustTest immediately , and skip current irp stack , then pass irp to lower device.


But the code execute result was strange :


sometimes , after copy data to pOutputBuffer, the data in pOutputbuffer is diffrent from pJustTest. some range in the pOutputBuffer becames random values (not 0x18).
It doesn’t happen most of the time, only sometimes, and every times it happened, the readOffset in the irpStack->Parameters.Read.ByteOffset is same: 0xF8CE0000.
When it happened , I checked data in pJustTest , every byte in the buffer is correct. Just data in pOutputBuffer has been OVERWRITE.


In order to know the buffer was OVERWRITE by who, I had add some code to check whether the read offset is 0xF8CE0000, if is , then break to windbg, and set a memory write breakpoint with ba command in windbg : ba w1 poi(pOutputBuffer) . (and also tried ba w2 , ba w3, ba w4, ba w8)


But when the memcmp return a non-zero value , the break point only hit one times ( this time is caused by my own code, because I set break point before my memcpy)…


At first , I think maybe this is a hardware problem, caused by memory or disk error? So I had install the driver on my notebook, the issue still exists.
This is really unbeliveable, I had debug it for a week, I had no idea now.


Any one can help? Thanks very much!


More infomation:

  1. test on : vmware win7 64 sp1, with full patch. physical memory : 2GB. ( also test on 4GB 8GB and 16GB, always same )
  2. The DiskDrive health status is good, and I also test on my notebook(not virtual machine), result is almost same. (just read offset is not 0xF8CE0000)
  3. The flag of every deviceObjcect in the device stack has DO_DIRECT_IO set.
  4. I do nothing with the irp , just skip and send to lower device.
  5. I didn’t modify Irp->MDLAddress, just use MmGetSystemAddressForMdlSafe to get ouput buffer address.
  6. I tried install the driver as a Volume Filter ( ClassGUID: {71A27CDD-812A-11D0-BEC7-08002BE2092F}) , all code is totally same, the issue never appeared.
  7. When the available physical memory of the OS is insufficient, it happens more frequently. Maybe a same physical page been mapped to more than one virtual address? Or MmGetSystemAddressForMdlSafe returned wrong value?
  8. except IRP_MJ_READ, All other type of irp had send to lower device, I didn’t touch them.
  9. the code list here is meaningless, I just use the code to reproduce the issue. (My purpose is to write a disk cache driver)
  10. had test on other PC, result is same. (these PC has diffrent chipset, intel… amd…)
  11. ha test driver with microsoft driver verifier and run on checked build win7 , verifier detect nothing, no help.


    If you need more infomation , please tell me, thanks !


call stack here:


2: kd> k


# Child-SP RetAddr Call Site


00 fffff88002beeef0 fffff88000e0118c diskcachetest!DiskReadDispatch+0x16a


01 fffff88002beef60 fffff880012f92bf volmgr!VmReadWrite+0x11c


02 fffff88002beefa0 fffff880012f953c fvevol!FveReadWrite+0x47


03 fffff88002beefe0 fffff880012aedf4 fvevol!FveFilterRundownReadWrite+0x1dc


04 fffff88002bef040 fffff8800181539a volsnap! ?? ::FNODOBFM::`string’+0x57b


05 fffff88002bef070 fffff80003ea7078 Ntfs!NtfsStorageDriverCallout+0x16


06 fffff88002bef0a0 fffff88001814f09 nt!KeExpandKernelStackAndCalloutEx+0xd8


07 fffff88002bef180 fffff880018143d6 Ntfs!NtfsMultipleAsync+0xa9


08 fffff88002bef1f0 fffff8800180cfc6 Ntfs!NtfsNonCachedIo+0x216


09 fffff88002bef3c0 fffff8800180ea68 Ntfs!NtfsCommonRead+0x7a6


0a fffff88002bef560 fffff880016f0bcf Ntfs!NtfsFsdRead+0x1b8


0b fffff88002bef770 fffff880016ef6df FLTMGR!FltpLegacyProcessingAfterPreCallbacksCompleted+0x24f


0c fffff88002bef800 fffff80003ec2635 FLTMGR!FltpDispatch+0xcf


0d fffff88002bef860 fffff80003ec2109 nt!IoPageRead+0x255


0e fffff88002bef8f0 fffff80003ea8a8a nt!MiIssueHardFault+0x255


0f fffff88002bef980 fffff80003e9876e nt!MmAccessFault+0x146a


10 fffff88002befae0 000007fefecd117a nt!KiPageFault+0x16e


11 000000000166e1d8 000007fefe49c07f msvcrt!memcpy+0x250


12 000000000166e1e0 000007fefe49bfff OLEAUT32!FileLockBytesMemory::ReadAt+0x3f


13 000000000166e210 000007fefe49df7c OLEAUT32!CTypeLib2::ReadAt+0x3f


14 000000000166e250 000007fefe49bf95 OLEAUT32!HEAP::ReadHeap+0x5c


15 000000000166e280 000007fefe49e4ab OLEAUT32!CTypeLib2::ReadGuids+0x15


16 000000000166e2b0 000007fefe49d7ed OLEAUT32!CTypeLib2::ReadExisting+0x4eb


17 000000000166e300 000007fefe49d69c OLEAUT32!CTypeLib2::Create+0x12d


18 000000000166e340 000007fefe49e9df OLEAUT32!LoadTypeLib2LockBytes+0x3c


19 000000000166e390 000007fefab6f3b8 OLEAUT32!LoadTypeLibEx+0x28b


1a 000000000166ece0 000007fefab661f1 es!FiringAgent::FiringAgent+0x51d


1b 000000000166f120 000007fefab6615d es!FiringAgent::Create+0xab


1c 000000000166f180 000007fefeae32a4 es!CFiringAgentFactory::CreateInstance+0x69


1d 000000000166f1d0 000007fefeae760b ole32!ICoCreateInstanceEx+0x3db


1e 000000000166f440 000007fefab6f104 ole32!CoCreateInstance+0x17b


1f 000000000166f530 000007fefab6f06f es!Notifier::NotifyThread::CreateEventClasses+0xd5


20 000000000166f590 000007fefab6bf55 es!Notifier::NotifyThread::ProcessNotifyArgs+0x9a


21 000000000166f7e0 000007fefab6c07b es!Notifier::NotifyThread::NotifyLoop+0x154


22 000000000166f880 000007fefeac7512 es!Notifier::NotifyThread::ContextCallBack+0x17


23 000000000166f8b0 000007fefeac7237 ole32!EnterForCallback+0x192


24 000000000166f9f0 000007fefeaddafc ole32!SwitchForCallback+0x387


25 000000000166fc50 000007fefeadd62f ole32!PerformCallback+0x47c


26 000000000166fd20 000007fefeb149d1 ole32!CObjectContext::InternalContextCallback+0x15b


27 000000000166fe40 000007fefeb14a6d ole32!CObjectContext::ContextCallback+0x81


28 000000000166fed0 000007fefab6c039 ole32!CContextSwitcher::ContextCallback+0x6d


29 000000000166ff10 0000000076ba652d es!Notifier::NotifyThread::ThreadMain+0x75


2a 000000000166ff60 0000000076cdc521 kernel32!BaseThreadInitThunk+0xd


2b 000000000166ff90 0000000000000000 ntdll!RtlUserThreadStart+0x1d

Perhaps one of the pages in your read range is a dummy page?

Peter

@“Peter_Viscarola_(OSR)” said:
Perhaps one of the pages in your read range is a dummy page?

Peter

Many thanks!!
I will research it continue!

@“Peter_Viscarola_(OSR)” said:
Perhaps one of the pages in your read range is a dummy page?

Peter

Problem solved! It’s caused by dummy page. Thanks very much!

My pleasure.

Another happy customer!! :smiley:

Peter