strange behavior in DiskDrive upper filter driver



Hi, all
I'm writting a driver which works as a DiskDrive upper filter ( ClassGUID: {4D36E967-E325-11CE-BFC1-08002BE10318}).

The code of IRP_MJ_READ Dispatch is very very simple :
NTSTATUS DiskReadDispatch(   IN  PDEVICE_OBJECT  DeviceObject,   IN  PIRP            Irp)
{
        PDISK_DEVICE_EXTENSION    deviceExtension;
        NTSTATUS  status = STATUS_INVALID_DEVICE_STATE;
        deviceExtension = (PDISK_DEVICE_EXTENSION )DeviceObject->DeviceExtension;
        if(deviceExtension == NULL)
        {
            KdBreakPoint();     // had never execute in here
            Irp->IoStatus.Status = status;
            Irp->IoStatus.Information = 0;
            IoCompleteRequest(Irp, IO_DISK_INCREMENT);
            return status;
        }
        PIO_STACK_LOCATION  irpStack = NULL;

        irpStack = IoGetCurrentIrpStackLocation(Irp);

        do 
        {
            if( irpStack->MajorFunction != IRP_MJ_READ )
            {
                  KdBreakPoint();  // had never execute in here
                  Irp->IoStatus.Status = status;
                  Irp->IoStatus.Information = 0;
                  return status;
            }

            ULONG   ulRequestReadLen = irpStack->Parameters.Read.Length;

            if (Irp->MdlAddress == NULL || ulRequestReadLen == 0)
            {
                break;
            }

            if (MmGetMdlByteCount(Irp->MdlAddress) < ulRequestReadLen)
            {
                break;
            }

            PVOID   pOutputBuffer = MmGetSystemAddressForMdlSafe(Irp->MdlAddress,NormalPagePriority);

            if (!pOutputBuffer)
            {
                break;
            }

            PVOID   pJustTest = ExAllocatePoolWithTag(NonPagedPool,ulRequestReadLen,'tseT');

            if (pJustTest)
            {
                memset(pJustTest,0x18,ulRequestReadLen);
                
                 if(irpStack->Parameters.Read.ByteOffset.QuadPart == 0xF8CE0000)
                {
                     KdBreakPoint();  // just for debug
                }

                memcpy(pOutputBuffer, pJustTest, ulRequestReadLen);

                if ( memcmp( pOutputBuffer,  pJustTest, ulRequestReadLen)  !=  0 )
                {
                    KdPrint(("data diff!\r\n"));   // !!!!!!!!!! Problems here:  I don't understand why the code can excute to here
                    KdBreakPoint();
                }

                ExFreePool(pJustTest);
            }
        } while (FALSE);

        // send to lower and return
        IoSkipCurrentIrpStackLocation(Irp);
        return  IoCallDriver(deviceExtension->LowerDeviceObject,Irp);
}

As we can see, the code basically did nothing, just malloc a buffer , set every byte to 0x18 , copy to OutputBuffer of current IRP , then compare the data in pOutputBuffer with pJustTest immediately , and skip current irp stack , then pass irp to lower device.


But the code execute result was strange :


sometimes , after copy data to pOutputBuffer, the data in pOutputbuffer is diffrent from pJustTest. some range in the pOutputBuffer becames random values (not 0x18).
It doesnā€™t happen most of the time, only sometimes, and every times it happened, the readOffset in the irpStack->Parameters.Read.ByteOffset is same: 0xF8CE0000.
When it happened , I checked data in pJustTest , every byte in the buffer is correct. Just data in pOutputBuffer has been OVERWRITE.


In order to know the buffer was OVERWRITE by who, I had add some code to check whether the read offset is 0xF8CE0000, if is , then break to windbg, and set a memory write breakpoint with ba command in windbg : ba w1 poi(pOutputBuffer) . (and also tried ba w2 , ba w3, ba w4, ba w8)


But when the memcmp return a non-zero value , the break point only hit one times ( this time is caused by my own code, because I set break point before my memcpy)ā€¦


At first , I think maybe this is a hardware problem, caused by memory or disk error? So I had install the driver on my notebook, the issue still exists.
This is really unbeliveable, I had debug it for a week, I had no idea now.


Any one can help? Thanks very much!


More infomation:

  1. test on : vmware win7 64 sp1, with full patch. physical memory : 2GB. ( also test on 4GB 8GB and 16GB, always same )
  2. The DiskDrive health status is good, and I also test on my notebook(not virtual machine), result is almost same. (just read offset is not 0xF8CE0000)
  3. The flag of every deviceObjcect in the device stack has DO_DIRECT_IO set.
  4. I do nothing with the irp , just skip and send to lower device.
  5. I didnā€™t modify Irp->MDLAddress, just use MmGetSystemAddressForMdlSafe to get ouput buffer address.
  6. I tried install the driver as a Volume Filter ( ClassGUID: {71A27CDD-812A-11D0-BEC7-08002BE2092F}) , all code is totally same, the issue never appeared.
  7. When the available physical memory of the OS is insufficient, it happens more frequently. Maybe a same physical page been mapped to more than one virtual address? Or MmGetSystemAddressForMdlSafe returned wrong value?
  8. except IRP_MJ_READ, All other type of irp had send to lower device, I didnā€™t touch them.
  9. the code list here is meaningless, I just use the code to reproduce the issue. (My purpose is to write a disk cache driver)
  10. had test on other PC, result is same. (these PC has diffrent chipset, intelā€¦ amdā€¦)
  11. ha test driver with microsoft driver verifier and run on checked build win7 , verifier detect nothing, no help.


    If you need more infomation , please tell me, thanks !


call stack here:


2: kd> k


# Child-SP RetAddr Call Site


00 fffff88002beeef0 fffff88000e0118c diskcachetest!DiskReadDispatch+0x16a


01 fffff88002beef60 fffff880012f92bf volmgr!VmReadWrite+0x11c


02 fffff88002beefa0 fffff880012f953c fvevol!FveReadWrite+0x47


03 fffff88002beefe0 fffff880012aedf4 fvevol!FveFilterRundownReadWrite+0x1dc


04 fffff88002bef040 fffff8800181539a volsnap! ?? ::FNODOBFM::`stringā€™+0x57b


05 fffff88002bef070 fffff80003ea7078 Ntfs!NtfsStorageDriverCallout+0x16


06 fffff88002bef0a0 fffff88001814f09 nt!KeExpandKernelStackAndCalloutEx+0xd8


07 fffff88002bef180 fffff880018143d6 Ntfs!NtfsMultipleAsync+0xa9


08 fffff88002bef1f0 fffff8800180cfc6 Ntfs!NtfsNonCachedIo+0x216


09 fffff88002bef3c0 fffff8800180ea68 Ntfs!NtfsCommonRead+0x7a6


0a fffff88002bef560 fffff880016f0bcf Ntfs!NtfsFsdRead+0x1b8


0b fffff88002bef770 fffff880016ef6df FLTMGR!FltpLegacyProcessingAfterPreCallbacksCompleted+0x24f


0c fffff88002bef800 fffff80003ec2635 FLTMGR!FltpDispatch+0xcf


0d fffff88002bef860 fffff80003ec2109 nt!IoPageRead+0x255


0e fffff88002bef8f0 fffff80003ea8a8a nt!MiIssueHardFault+0x255


0f fffff88002bef980 fffff80003e9876e nt!MmAccessFault+0x146a


10 fffff88002befae0 000007fefecd117a nt!KiPageFault+0x16e


11 000000000166e1d8 000007fefe49c07f msvcrt!memcpy+0x250


12 000000000166e1e0 000007fefe49bfff OLEAUT32!FileLockBytesMemory::ReadAt+0x3f


13 000000000166e210 000007fefe49df7c OLEAUT32!CTypeLib2::ReadAt+0x3f


14 000000000166e250 000007fefe49bf95 OLEAUT32!HEAP::ReadHeap+0x5c


15 000000000166e280 000007fefe49e4ab OLEAUT32!CTypeLib2::ReadGuids+0x15


16 000000000166e2b0 000007fefe49d7ed OLEAUT32!CTypeLib2::ReadExisting+0x4eb


17 000000000166e300 000007fefe49d69c OLEAUT32!CTypeLib2::Create+0x12d


18 000000000166e340 000007fefe49e9df OLEAUT32!LoadTypeLib2LockBytes+0x3c


19 000000000166e390 000007fefab6f3b8 OLEAUT32!LoadTypeLibEx+0x28b


1a 000000000166ece0 000007fefab661f1 es!FiringAgent::FiringAgent+0x51d


1b 000000000166f120 000007fefab6615d es!FiringAgent::Create+0xab


1c 000000000166f180 000007fefeae32a4 es!CFiringAgentFactory::CreateInstance+0x69


1d 000000000166f1d0 000007fefeae760b ole32!ICoCreateInstanceEx+0x3db


1e 000000000166f440 000007fefab6f104 ole32!CoCreateInstance+0x17b


1f 000000000166f530 000007fefab6f06f es!Notifier::NotifyThread::CreateEventClasses+0xd5


20 000000000166f590 000007fefab6bf55 es!Notifier::NotifyThread::ProcessNotifyArgs+0x9a


21 000000000166f7e0 000007fefab6c07b es!Notifier::NotifyThread::NotifyLoop+0x154


22 000000000166f880 000007fefeac7512 es!Notifier::NotifyThread::ContextCallBack+0x17


23 000000000166f8b0 000007fefeac7237 ole32!EnterForCallback+0x192


24 000000000166f9f0 000007fefeaddafc ole32!SwitchForCallback+0x387


25 000000000166fc50 000007fefeadd62f ole32!PerformCallback+0x47c


26 000000000166fd20 000007fefeb149d1 ole32!CObjectContext::InternalContextCallback+0x15b


27 000000000166fe40 000007fefeb14a6d ole32!CObjectContext::ContextCallback+0x81


28 000000000166fed0 000007fefab6c039 ole32!CContextSwitcher::ContextCallback+0x6d


29 000000000166ff10 0000000076ba652d es!Notifier::NotifyThread::ThreadMain+0x75


2a 000000000166ff60 0000000076cdc521 kernel32!BaseThreadInitThunk+0xd


2b 000000000166ff90 0000000000000000 ntdll!RtlUserThreadStart+0x1d

Perhaps one of the pages in your read range is a dummy page?

Peter

@ā€œPeter_Viscarola_(OSR)ā€ said:
Perhaps one of the pages in your read range is a dummy page?

Peter

Many thanks!!
I will research it continue!

@ā€œPeter_Viscarola_(OSR)ā€ said:
Perhaps one of the pages in your read range is a dummy page?

Peter

Problem solved! Itā€™s caused by dummy page. Thanks very much!

My pleasure.

Another happy customer!! :smiley:

Peter