Disk filter driver hangs at startup

Hi,

I am working on a WDM disk filter driver. We have an old driver which only compiles with windows 7 WDK written in driverworks framework that we are trying to port to Windows 10 WDK.

So far, I have written a passthrough upper filter driver and it works properly. Now I am trying to create new IRP for read/write requests instead of forwarding original IRP. I am using partial MDL having same address range as original MDL for now.

With this code, the machine hangs just after startup. I can see from traces that write request does not complete, although Read requests work properly.

Can someone help me find what is wrong in my code?

These are the traces of write requests. These are the only two write IRPs after startup :-

Handle_Write: Irp = 0xFFFFFA80045A0C60, subIrp = 0xFFFFFA8004577CC0, offset = 0x6500000, len = 0x2000
Handle_Write: CallDriver returned = 0x103
Handle_Write: Irp = 0xFFFFFA80045ADC10, subIrp = 0xFFFFFA80045B39F0, offset = 0xbfaef000, len = 0x1000
Handle_Write: CallDriver returned = 0x103

The irp detail is :-

3: kd> !irp 0xFFFFFA80045ADC10
Irp is active with 11 stacks 6 is current (= 0xfffffa80045ade48)
 Mdl=fffffa80045aa8f0: No System Buffer: Thread fffffa80037001a0:  Irp stack trace.  
     cmd  flg cl Device   File     Completion-Context
 [N/A(0), N/A(0)]
            0  0 00000000 00000000 00000000-00000000    
			Args: 00000000 00000000 00000000 00000000
 [N/A(0), N/A(0)]
            0  0 00000000 00000000 00000000-00000000    
			Args: 00000000 00000000 00000000 00000000
 [N/A(0), N/A(0)]
            0  0 00000000 00000000 00000000-00000000    
			Args: 00000000 00000000 00000000 00000000
 [N/A(0), N/A(0)]
            0  0 00000000 00000000 00000000-00000000    
			Args: 00000000 00000000 00000000 00000000
 [N/A(0), N/A(0)]
            0  0 00000000 00000000 00000000-00000000    
			Args: 00000000 00000000 00000000 00000000
.>[IRP_MJ_WRITE(4), N/A(0)]
           1c e1 fffffa80046e9d20 00000000 fffff88000e0d010-fffffa8004579ba0 Success Error Cancel pending
	       \Driver\MyDiskFilter	volmgr!VmpReadWriteCompletionRoutine
			Args: 00001000 00000000 bfaef000 00000000
 [IRP_MJ_WRITE(4), N/A(0)]
            c e0 fffffa8004579a50 00000000 fffff88001b7a0a8-fffffa8004583b90 Success Error Cancel 
	       \Driver\volmgr	fvevol!FvePassThroughCompletion
			Args: 00001000 00000000 b95ef000 00000000
 [IRP_MJ_WRITE(4), N/A(0)]
            c e0 fffffa8004583a40 00000000 fffff88001b2e0f8-fffffa80037c77c0 Success Error Cancel 
	       \Driver\fvevol	rdyboost!SmdReadWriteCompletion
			Args: 00001000 00000000 b95ef000 00000000
 [IRP_MJ_WRITE(4), N/A(0)]
            c e1 fffffa8004584c40 00000000 fffff88001aeb1b0-fffffa8004586c10 Success Error Cancel pending
	       \Driver\rdyboost	volsnap!VspWriteContextCompletionRoutine
			Args: 00001000 7fffffff b95ef000 00000000
 [IRP_MJ_WRITE(4), N/A(0)]
            c e1 fffffa80046ed040 00000000 fffff8800124c344-fffff880031f5b98 Success Error Cancel pending
	       \Driver\volsnap	Ntfs!NtfsMasterIrpSyncCompletionRoutine
			Args: 00001000 00000000 b95ef000 00000000
 [IRP_MJ_WRITE(4), N/A(0)]
            0  0 fffffa800474f030 fffffa80045b1e80 00000000-00000000    
	       \FileSystem\Ntfs
			Args: 00001000 00000000 016f1000 00000000

The thread processing IRP is :-

3: kd> .thread fffffa80037001a0
Implicit thread is now fffffa80`037001a0
3: kd> k
  *** Stack trace for last set context - .thread/.cxr resets it
 # Child-SP          RetAddr               Call Site
00 fffff880`031f5580 fffff800`0269c992     nt!KiSwapContext+0x7a
01 fffff880`031f56c0 fffff800`0269f1af     nt!KiCommitThreadWait+0x1d2
02 fffff880`031f5750 fffff880`0124b3ff     nt!KeWaitForSingleObject+0x19f
03 fffff880`031f57f0 fffff880`01254224     Ntfs!NtfsNonCachedIo+0x23f
04 fffff880`031f59c0 fffff880`0124f507     Ntfs!NtfsNonCachedUsaWrite+0x64
05 fffff880`031f5a50 fffff880`012501a3     Ntfs!NtfsCommonWrite+0x2ca4
06 fffff880`031f5c00 fffff800`0267d56f     Ntfs!NtfsFsdWrite+0x1c3
07 fffff880`031f5e80 fffff800`026db660     nt!IoSynchronousPageWrite+0x24f
08 fffff880`031f5f00 fffff800`026d9d38     nt!MiFlushSectionInternal+0xb30
09 fffff880`031f6130 fffff800`026d90c9     nt!MmFlushSection+0x1f4
0a fffff880`031f61f0 fffff880`0133ea13     nt!CcFlushCache+0x5e9
0b fffff880`031f62f0 fffff880`01343f13     Ntfs!LfsFlushLogPage+0x63
0c fffff880`031f6320 fffff880`01309fa6     Ntfs!LfsFindLastLsn+0xb43
0d fffff880`031f6460 fffff880`0130f5b6     Ntfs!LfsRestartLogFile+0x812
0e fffff880`031f65c0 fffff880`01310189     Ntfs!LfsOpenLogFile+0xe2
0f fffff880`031f6650 fffff880`013197d4     Ntfs!NtfsStartLogFile+0xed
10 fffff880`031f6710 fffff880`012b580d     Ntfs!NtfsMountVolume+0xa6c
11 fffff880`031f6a50 fffff880`0123c985     Ntfs!NtfsCommonFileSystemControl+0x59
12 fffff880`031f6a90 fffff800`026a1a21     Ntfs!NtfsFspDispatch+0x2ad
13 fffff880`031f6b70 fffff800`02934cce     nt!ExpWorkerThread+0x111
14 fffff880`031f6c00 fffff800`02688fe6     nt!PspSystemThreadStartup+0x5a
15 fffff880`031f6c40 00000000`00000000     nt!KiStartSystemThread+0x16

Write request is handled like this :-

NTSTATUS MyDiskDevice::Handle_Write(PIRP Irp)
{
   bool bForward = true;
   IrpHandler irph{ Irp, false };
   PIO_STACK_LOCATION stack = irph.GetCurrentStackLoc();

   if (m_pDevObj->Flags & DO_DIRECT_IO)
   {
      if (Irp->MdlAddress)
      {
         void * pOrigVA        = MmGetMdlVirtualAddress(Irp->MdlAddress);
         ULONG ulOrigByteCount = MmGetMdlByteCount(Irp->MdlAddress);

         size_t mdlSize = MmSizeOfMdl( pOrigVA, ulOrigByteCount);
         MDL*   pMdl    = (MDL*) new(NonPagedPool, MY_MEM_TAG) unsigned char[mdlSize];
         memset(pMdl, 0, mdlSize);

         MmInitializeMdl(pMdl, pOrigVA, ulOrigByteCount);

         IoBuildPartialMdl(Irp->MdlAddress, pMdl, pOrigVA, ulOrigByteCount);

         DFCRWPacket* pSubirp = new (NonPagedPool, MY_MEM_TAG) RWPacket{ m_pLowerDevObj, m_pDevObj};
         if (pSubirp && pSubirp->m_pIrp)
         {
            TRACE("Handle_Write: Irp = 0x%p, subIrp = 0x%p, offset = 0x%I64x, len = 0x%x\n", Irp, pSubirp->m_pIrp, stack->Parameters.Write.ByteOffset.QuadPart, stack->Parameters.Write.Length);
            irph.MarkPending();
            pSubirp->SetWriteMdl(stack->Parameters.Write.ByteOffset.QuadPart, stack->Parameters.Write.Length, pMdl);
            pSubirp->m_pOrigIrp = Irp;
            pSubirp->bWrite = true;
            NTSTATUS sSubIrp = pSubirp->Call();
            
            TRACE("Handle_Write: CallDriver returned = 0x%x\n", sSubIrp);
            bForward = false;
         }
      }
   }
   if(bForward)
   {
      irph.Forward(m_pLowerDevObj);
   }
   return irph.m_Status;
}

class RWPacket
{
public:
   RWPacket(PDEVICE_OBJECT pLowerDev, PDEVICE_OBJECT pThisDev)
      :m_pLowerDev(pLowerDev)
   {
      KeInitializeEvent(&m_event, NotificationEvent, FALSE);
      m_pIrp = IoAllocateIrp(pThisDev->StackSize, FALSE);
      if (m_pIrp)
      {
         IoSetCompletionRoutine(m_pIrp, IrpRequestCompletion, this, TRUE, TRUE, TRUE);
      }
   }

   void SetReadMdl(LONGLONG llOffset, ULONG ulLength, PMDL pMdl)
   {
   ...
   ...
   }

   void SetWriteMdl(LONGLONG llOffset, ULONG ulLength, PMDL pMdl)
   {
      auto psl = IoGetNextIrpStackLocation(m_pIrp);
      psl->MajorFunction = IRP_MJ_WRITE;
      psl->Parameters.Write.ByteOffset.QuadPart = llOffset;
      psl->Parameters.Write.Length = ulLength;
      m_pIrp->MdlAddress = pMdl;
      m_pIrp->AssociatedIrp.SystemBuffer = nullptr;
      m_pIrp->UserBuffer = MmGetSystemAddressForMdlSafe(pMdl, NormalPagePriority);
   }

   NTSTATUS Call()
   {
      KeClearEvent(&m_event);
      return IoCallDriver(m_pLowerDev, m_pIrp);
   }

   NTSTATUS Wait()
   {
      return KeWaitForSingleObject( &m_event, Executive, KernelMode, FALSE, NULL);
   }

   static NTSTATUS IrpRequestCompletion(PDEVICE_OBJECT DeviceObject, PIRP Irp, PVOID Context)
   {
      UNREFERENCED_PARAMETER(DeviceObject);
      UNREFERENCED_PARAMETER(Irp);

      RWPacket* pReq = (RWPacket*)Context;
      return pReq->OnComplete();
   }

   virtual NTSTATUS OnComplete()
   {
      TRACE("OnComplete: m_pOrigIrp = 0x%p, Irp = 0x%p\n", m_pOrigIrp, m_pIrp);
      if (m_pIrp->MdlAddress)
      {
         ::operator delete[](m_pIrp->MdlAddress, NonPagedPool, MY_MEM_TAG);
         m_pIrp->MdlAddress = nullptr;
      }
   
      m_StatusBlock = m_pIrp->IoStatus;
      IoFreeIrp(m_pIrp);
      KeSetEvent(&m_event, IO_NO_INCREMENT, FALSE);
      
      if(m_pOrigIrp)
      {
         DFCIrpHandler h(m_pOrigIrp);
         h.Complete(m_StatusBlock.Status, m_StatusBlock.Information);
      }

      ::operator delete(this, NonPagedPool, MY_MEM_TAG);

      return STATUS_MORE_PROCESSING_REQUIRED;
   }

   PIRP              m_pIrp {0};
   PDEVICE_OBJECT    m_pLowerDev;
   KEVENT            m_event {0};
   IO_STATUS_BLOCK   m_StatusBlock {0};
   
   PIRP              m_pOrigIrp {0};
}

I see you are returning STATUS_MORE_PROCESSING_REQUIRED in OnComplete.
Shouldn’t you be returning STATUS_CONTINUE_COMPLETION?

If I return STATUS_CONTINUE_COMPLETION, it crashes with IRQL_NOT_LESS_OR_EQUAL.
Msdn reference for IO_COMPLETION_ROUTINE says that “completion routine of all higher level drivers are called until one routine returns STATUS_MORE_PROCESSING_REQUIRED”.
Maybe I am giving wrong stack size to IoAllocateIrp.
!irp command for subIrp is like this (when returning STATUS_MORE_PROCESSING_REQUIRED):

1: kd> !irp 0xFFFFFA8004597930
Irp is active with 4 stacks 4 is current (= 0xfffffa8004597ad8)
 Mdl=fffffa8004593820: No System Buffer: Thread 00000000:  Irp stack trace.  
     cmd  flg cl Device   File     Completion-Context
 [N/A(0), N/A(0)]
            0  0 00000000 00000000 00000000-00000000    
			Args: 00000000 00000000 00000000 00000000
 [N/A(0), N/A(0)]
            0  0 00000000 00000000 00000000-00000000    
			Args: 00000000 00000000 00000000 00000000
 [N/A(0), N/A(0)]
            0  0 00000000 00000000 00000000-00000000    
			Args: 00000000 00000000 00000000 00000000
.>[IRP_MJ_WRITE(4), N/A(0)]
            0 e1 fffffa80046e9b90 00000000 fffff880019bbbd0-fffffa8003706170 Success Error Cancel pending
	       \Driver\partmgr	MyDiskFilter!RWPacket::IrpRequestCompletion
			Args: 00001000 00000000 bfab3000 00000000

When it is an IRP that you created, you never ever EVER return anything other than STATUS_MORE_PROCESSING_REQUIRED. There is nobody above you who knows anything about your IRP, so they can’t possibly do anything useful with it.

Oh yeah, OnComplete is for his IRP, not the IRP he is filtering. My bad.

Here is how I have changed an r/w IPR MDL in the past:

PIO_STACK_LOCATION c_stack = IoGetCurrentIrpStackLocation(Irp);
ULONG length = c_stack->Parameters.Write.Length;
// Save original MDL. We will pass it on to the completion handler.
PMDL w_mdl = Irp->MdlAddress;
PVOID ptr = ExAllocatePool(NonPagedPool, length < PAGE_SIZE ? PAGE_SIZE :
length);
PMDL r_mdl = IoAllocateMdl(ptr, length, FALSE, FALSE, NULL);
MmBuildMdlForNonPagedPool(r_mdl);
Irp->MdlAddress = r_mdl;

And to free the MDL, use this sequence:

IoFreeMdl(r_mdl);
ExFreePool(ptr);

I’m not sure I’d map the MDLs to the same address range either. Maybe do a
copy since you are just testing.

I tried using IoAllocateMdl with original buffer and also with new allocated buffer. In both cases it hangs just like before. The traces show lot of READ IRPs that are completed and two WRITE IRPs that are not completed.

Did you do this to the MDL for the newly allocated NPP?:

MmBuildMdlForNonPagedPool(r_mdl);

Yes. Otherwise it crashes with BAD_POOL_HEADER.