Data flow in an AVStream minidriver and DMA implementation for a PCIe device

Hi, guys
I’m a newbie in Windows driver development, and I still find it very hard after more than a month of reading docs and debugging driver samples. I’m working on Win10 using VS2019, SDK10 and DDK10, and the driver is x64. DebugView is the only debugging tool I know how to use.
The driver I’m developing is for a PCIe x4 video/audio capture device. I based it on the Windows AVSHwS sample because I need to do DMA.
Configuring the DMA engine, a soft IP provided by Xilinx (XDMA), from the AVStream minidriver is the biggest problem blocking me right now. To the best of my understanding I have tried three approaches, and unfortunately none of them works. I really hope someone can help me.
Here is what I added to AVSHwS:
1. Starting from the KSPIN_DESCRIPTOR_EX, I added KSPIN_FLAG_GENERATE_MAPPINGS to the flags, which are now KSPIN_FLAG_PROCESS_IN_RUN_STATE_ONLY | KSPIN_FLAG_GENERATE_MAPPINGS.
2. I changed the PnpStart(, TranslatedResourceList) routine (called by DispatchPnpStart(, Irp, TranslatedResourceList, )) to PnpStart(Irp), because the Count of the TranslatedResourceList passed in is always 1 and I can only get part of the resources from it [Q1]. When I take the translated resource list from the Irp instead and parse it, I get the BAR and interrupt information successfully. I then create a bus-master DmaAdaptor using IoGetDmaAdapter() and register it with the AVStream class driver using KsDeviceRegisterAdapterObject(). But I can’t connect the interrupt object using IoConnectInterrupt() [Q2]; every time I try, I get a blue screen, so I have left that part commented out.
3. Then comes the DMA part. I tried 3 different ways to do this.
3.0. Before I describe them, I’d like to show you my understanding of the frame data flow:
3.0.1 In the pin creation routine, the frame buffers are either allocated locally or passed in as a pointer from the AVStream class driver [Q3], using the VideoInfo configured by DispatchSetFormat(); then the frame info is set.
3.0.2 In the pin processing routine, the leading edge stream pointer is acquired from the created pin. I get a valid pointer when using 1280*720 YUV2 (1280*720*2 bytes), but when I change the video size to 1280*720 RGB24 (1280*720*3 bytes) or bigger, I get a null leading edge pointer, even though the pin is created successfully (according to the return value of the pin creation routine) [Q4]. If I remove the KSPIN_FLAG_GENERATE_MAPPINGS flag, both sizes work, and so does 1920*1080 RGB24. Next, the leading edge stream pointer is cloned and the clone is passed to ProgramScatterGatherMappings(), which is implemented in device.cpp and hwsim.cpp. There the clone is locked and added to an SG list, and the return value gives the number of mappings added to the list. Processing then advances the leading edge by the number of mappings returned.
3.0.3 On the other hand, a simulated hardware object is created in device.cpp, and an interrupt DPC is initialized in its creation routine in hwsim.cpp. In the start routine of hwsim.cpp, a timer is set up to trigger the interrupt DPC. In the interrupt DPC, the FakeHardware() routine generates one frame of video data, and the FillScatterGatherBuffers() routine takes a cloned stream buffer out of the SG list built by ProgramScatterGatherMappings() and copies the generated video data into it. Then a fake hardware interrupt is triggered; this interrupt routine lives in the hardware sink. In the interrupt, the total number of completed mappings is tracked by calling the CompleteMappings() routine of the capture sink (the pin), which is located in the pin class. In the same routine, the newly filled clone is time-stamped and sent on by calling KsStreamPointerDelete(). (“If the frame to which StreamPointer points has no other references on it after deletion, it is completed. When the last frame in a given IRP is completed, the IRP is completed.” from docs.microsoft.com) [Q5]

3.1 Based on that understanding, I added the DMA operation code in FillScatterGatherBuffers(), and this is where I got stuck [Q6]. I want to use the operations of the DmaAdaptor created by IoGetDmaAdapter() earlier in PnpStart(), such as DmaAdaptor->DmaOperations->GetScatterGatherList(), but I can’t find an Irp there, so there is no corresponding Irp->MdlAddress and no related DeviceObject, which are all needed when using the DmaOperations [Q7]. As far as I know, doing DMA this way is common in WDM drivers, provided there is an IRP_MJ_WRITE or IRP_MJ_READ request, but in the AVStream framework things may not work like this, so I tried another way.
3.2 Remember the mappings in the leading edge stream pointer! When the KSPIN_FLAG_GENERATE_MAPPINGS flag is on and the video image size is 1280*720*2 = 450*PAGE_SIZE bytes, I can get a valid leading edge and clone it, but the value of “OffsetOut.Remaining” in the leading edge stream pointer or its clone is very small (like 5), while I’m expecting 450 mappings of PAGE_SIZE each [Q8]. Apart from this, how can I get the logical address that the device’s DMA bus master can use? There is only a PhysicalAddress in the mapping structure, and no operation in DmaOperations takes the mappings from a stream pointer’s OffsetOut. The DMA bus master’s registers need to be filled with a list of logical addresses (an SG list) and a start flag. I think this should be done in a callback routine written by me; GetScatterGatherList() has such an interface, but I can’t use it, so again I got stuck.
3.3 With the problems and questions from 3.2 still open, I came to my 3rd attempt: using common-buffer DMA. This time the DmaAdaptor is not a , the KSPIN_FLAG_GENERATE_MAPPINGS flag is off, and I don’t need to worry about the video image size. Indeed, the original simulated video works fine in this configuration (I can display it in GraphEdit at 720p or 1080p). All I need to do is change the generated image source buffer to the common buffer. In PnpStart, right after I register the DmaAdaptor with the AVStream class driver, I call DmaVAddr=DmaAdaptor->DmaOperations->AllocateCommonBuffer (DmaAdaptor, imagesize, DmaLAddr, false);, but I had no luck: the driver crashed and I got a blue screen. [Q9]
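
To make the sizes I mention in 3.0.2 and 3.2 concrete, here is a small sketch of the arithmetic behind my expectation of 450 mappings. It is plain C++, not driver code. The names kPageSize, FrameBytes and PagesForFrame are only for illustration, and I am assuming 4 KiB pages and a frame buffer that starts on a page boundary.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: 4 KiB pages (PAGE_SIZE on x64 Windows).
constexpr std::uint64_t kPageSize = 4096;

// Size in bytes of one uncompressed frame.
constexpr std::uint64_t FrameBytes(std::uint64_t width,
                                   std::uint64_t height,
                                   std::uint64_t bytesPerPixel) {
    return width * height * bytesPerPixel;
}

// Number of pages the frame spans, assuming the buffer is page aligned.
// This is the mapping count I expected if every mapping were one page.
constexpr std::uint64_t PagesForFrame(std::uint64_t frameBytes) {
    return (frameBytes + kPageSize - 1) / kPageSize;
}

// 1280*720 at 2 bytes per pixel is exactly 450 pages.
static_assert(PagesForFrame(FrameBytes(1280, 720, 2)) == 450, "720p, 2 bytes/pixel");
// 1280*720 RGB24 is 675 pages.
static_assert(PagesForFrame(FrameBytes(1280, 720, 3)) == 675, "720p, RGB24");
```

By the same arithmetic, 1920*1080 RGB24 does not fill a whole number of pages and rounds up to 1519.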

My questions are:
[Q1]: Why is the count of the TranslatedResourceList passed in directly by the dispatch routine one, while the count of the list taken from the Irp, which is also passed in by the dispatch routine, is not?
[Q2]: Why can’t I connect the interrupt object successfully? Are the parameters I got from the partial translated resources incorrect, or did I miss something? (In fact, I’ve checked the BAR parameters from the partial translated resources and they are correct.)
[Q3]: I really don’t know who is responsible for allocating the buffers that flow into the pin. I read some posts on OSR which said they can be allocated locally by the minidriver or provided by a renderer (I don’t know if I understood that correctly).
[Q4]: As the Microsoft docs say, a null pointer is returned when there are not enough resources for mapping a large image, but why doesn’t this happen when the KSPIN_FLAG_GENERATE_MAPPINGS flag is off?
[Q5]: This is where the IRP comes in, but it appears only in the Microsoft docs, not in the CompleteMappings() routine itself. The AVStream class driver must be taking care of it. I can’t use it anyway, can I?
[Q6]: Should I add the DMA operation code in FillScatterGatherBuffers(), or where should it go?
[Q7]: If there is no IRP available, I can’t do the DMA the way we do it in a WDM driver: no MdlAddress, no AllocateAdapterChannel(), no MapTransfer(), and no GetScatterGatherList(). How can I do it?
[Q8]: Each mapping in the leading edge’s OffsetOut is 4 KB long, isn’t it? What is the relationship between these mappings and the DmaAdaptor returned by IoGetDmaAdapter()? If these mappings are meant to be used with the DmaAdaptor operations, how can I use them, since there is only a PhysicalAddress in each mapping? In fact, from the DMA device’s point of view, I only need to feed it an SG list (like LogicAddr.High; LogicAddr.Low; isTheLastMapping; pNextSGEntry) and write a start flag into the SG_configReg, which is mapped in PnpStart() using MmMapIoSpace(); then the DMA starts to work. I really don’t need a DmaAdaptor.
[Q9]: If I do the DMA using a common buffer, where is the right place to allocate the common buffer and to do the subsequent processing that completes the DMA?
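
To show what I mean in [Q8] by feeding the device an SG list, here is a rough sketch in plain C++ (again, not driver code). Mapping and SgDescriptor are hypothetical types I made up for this post: Mapping stands in for one address/byte-count entry from the stream pointer, and SgDescriptor only mirrors the fields I listed in [Q8], not the real XDMA descriptor layout. tableBase is assumed to be the device-visible address of the memory holding the descriptor table.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for one mapping entry (address plus byte count).
struct Mapping {
    std::uint64_t address;
    std::uint32_t byteCount;
};

// Hypothetical descriptor with the fields named in [Q8].
struct SgDescriptor {
    std::uint32_t addrHigh;        // upper 32 bits of the buffer address
    std::uint32_t addrLow;         // lower 32 bits of the buffer address
    std::uint32_t byteCount;       // bytes to transfer for this entry
    bool isLast;                   // true only on the final entry
    std::uint64_t nextDescriptor;  // device-visible address of next entry, 0 if last
};

// Turn a list of mappings into a chained descriptor table.
std::vector<SgDescriptor> BuildDescriptors(const std::vector<Mapping>& mappings,
                                           std::uint64_t tableBase) {
    std::vector<SgDescriptor> table;
    table.reserve(mappings.size());
    for (std::size_t i = 0; i < mappings.size(); ++i) {
        const bool last = (i + 1 == mappings.size());
        SgDescriptor d{};
        d.addrHigh = static_cast<std::uint32_t>(mappings[i].address >> 32);
        d.addrLow = static_cast<std::uint32_t>(mappings[i].address & 0xFFFFFFFFu);
        d.byteCount = mappings[i].byteCount;
        d.isLast = last;
        // Each entry points at the slot that follows it in the table.
        d.nextDescriptor = last ? 0 : tableBase + (i + 1) * sizeof(SgDescriptor);
        table.push_back(d);
    }
    return table;
}
```

In the real driver I would then write the address of the first descriptor and the start flag into the registers mapped with MmMapIoSpace(). What I don’t know is which address is the right one to put into each entry.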

Thank you for your patience. I’m not living in an English-speaking country, so I apologize if some of the wording is weird.
Any comments are appreciated. Thank you in advance.