Techniques for making a copy of a camera frame in an upper filter driver

I have a very rudimentary (aka buggy) upper filter driver sitting on top of usbvideo.sys.

In my driver, I just want to make a copy of the captured frames on completion of IOCTL_KS_READ_STREAM.

I set a completion routine for this IRP, and on completion I can get at the KSSTREAM_HEADER.

I am running a DirectShow app in user mode which is able to capture frames using DirectShow filters (one frame at a time). The DirectShow call GetStreamBuffer shows a size of 9216xx bytes for the buffer, but when I look at the corresponding frame upon completion of IOCTL_KS_READ_STREAM, the DataUsed field shows a value that is much smaller.

What field in the Irp corresponds to the size the DirectShow API is returning?
What buffer (kernel VA) and size can I access to make a copy of the frame buffer?

I am just curious because I would like to make a copy of the frame bits in my upper filter driver for some analysis. Of course, it does not make sense if my app is the one capturing, because I can do the analysis in user mode, but I am curious about what happens when a random app (the built-in camera app, for example) is running.

Something tells me that it is not straightforward to make a copy of the captured frame in the kernel mode driver (so that the frame matches what my app sees in DirectShow).

Thanks,
AneeshS

No, it is not trivial to filter a KS connection under the best of conditions. Those METHOD_NEITHER ioctls provide some interesting surprises – buffers are not where you expect them to be, and the tweaks are not documented.

921,600 happens to be 640x480x3, which is a very common size. Do you believe you are capturing RGB24? Is that what you asked for?

The DataUsed member of the KSSTREAM_HEADER should give you the number of bytes actually being returned, and the Data member has the address.
Whether that address is a user or kernel address depends on a number of conditions. Are you sure you are looking at the right buffer? Do the other stream header fields seem reasonable? It’s POSSIBLE your camera delivers a compressed image, and there’s an automatic decompressor being inserted between the camera and you.

@Tim_Roberts If my app is the one capturing the frames, I am using RGB24, so yes, 921600 bytes is the right size. When it is the camera app capturing the frames, I am not sure, although I set the capture resolution to 1920x1080 in the camera settings.

I set my app to capture exactly one frame, and I do see one break in my completion routine for IOCTL_KS_READ_STREAM. The KSSTREAM_HEADER does have the bytes, and its Size member has the correct value (which gives me an indication I am looking at the correct structure). Following that, I see a KS_FRAME_INFO structure which also seems to be valid. The Information field of the Irp is set to 128 bytes, which is the size of all the headers.

When you say a decompressor is inserted between the camera and me, does "me" refer to my upper filter driver or to the camera app? If it is between usbvideo.sys and my upper filter driver, I should be seeing uncompressed frames. I still need to know how to read them using the Data field.

Thanks,
AS

If there is a decompressor, then it would be in user mode. You said the header “does have the bytes and the Size parameter of this field is correct”, so please tell us again, what is the problem?

@Tim_Roberts thanks for the reply.

Let me see if I can explain what I am doing in my filter driver.

If it is my app that is running DirectShow (DirectShow does not use svchost.exe to proxy commands to KS), I see a bunch of completions for IOCTL_KS_READ_STREAM for each capture. I decode the KSSTREAM_HEADER as (PKSSTREAM_HEADER)Irp->AssociatedIrp.SystemBuffer. If the flags field in this header says a KS_FRAME_INFO follows, I decode that as well.
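
Simplified, the decode looks roughly like this (a sketch of what I do, not the exact code; here I key off the header's Size for the trailing KS_FRAME_INFO):

    #include <ntddk.h>
    #include <ks.h>
    #include <ksmedia.h>   /* KS_FRAME_INFO */

    /* Simplified from my completion routine: SystemBuffer holds the
       KSSTREAM_HEADER; if the Size covers it, a KS_FRAME_INFO trails
       the header. */
    static VOID DecodeStreamHeader(PIRP Irp)
    {
        PKSSTREAM_HEADER hdr =
            (PKSSTREAM_HEADER)Irp->AssociatedIrp.SystemBuffer;

        if (hdr == NULL || hdr->Size < sizeof(KSSTREAM_HEADER)) {
            return;
        }

        DbgPrint("DataUsed=%lu FrameExtent=%lu\n",
                 hdr->DataUsed, hdr->FrameExtent);

        if (hdr->Size >= sizeof(KSSTREAM_HEADER) + sizeof(KS_FRAME_INFO)) {
            PKS_FRAME_INFO fi = (PKS_FRAME_INFO)(hdr + 1);
            DbgPrint("PictureNumber=%I64d dwFrameFlags=0x%lx\n",
                     fi->PictureNumber, fi->dwFrameFlags);
        }
    }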

For each capture that I do from my app, I get multiple completions, each with a non-zero DataUsed. I am assuming one read (of 921600 bytes) translates into multiple kernel-side reads, which is fine. I think I can detect the start and end so I can accumulate the bytes read, but I am wondering where the captured bits reside.

Is that in Irp->MdlAddress or in the Data field of each KSSTREAM_HEADER? If the latter, I assume it is a process VA, so I will have to do ProbeForRead and handle it that way? But how do I guarantee that the IRP completion routine will run in the requesting process's context?

I guess one way is to preprocess the IRP on the way down, map the user buffer to a system address, and use that on completion to read the bytes; something like the sketch below.
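
A sketch of that idea (UserVa and Length are placeholders for the caller's buffer address and size):

    /* In the dispatch routine, while we are still in the requesting
       process's context, lock the pages and get a system-space alias
       that stays valid in the completion routine. */
    PMDL  mdl   = IoAllocateMdl(UserVa, Length, FALSE, FALSE, NULL);
    PVOID sysVa = NULL;

    if (mdl != NULL) {
        __try {
            MmProbeAndLockPages(mdl, UserMode, IoReadAccess);
            sysVa = MmGetSystemAddressForMdlSafe(mdl, NormalPagePriority);
            /* stash mdl/sysVa in a context block for the completion
               routine; MmUnlockPages + IoFreeMdl when done */
        } __except (EXCEPTION_EXECUTE_HANDLER) {
            IoFreeMdl(mdl);
            mdl = NULL;
        }
    }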

The above will work when it is my own app capturing frames, but like I said, I do not need to do this when my app is the one running.

What I really need is to capture a complete frame when an app like the built-in Camera app is streaming, and save it away.

Thanks,
RK

Video streams always pass a complete frame as a unit. Audio streams do not, but video streams do. You should never see a partial completion.

By the time you get to the completion routine, the fields in the IRP should already have been translated to kernel mode. That's done by ksthunk. The IRP UserBuffer points to the kernel address of the KSSTREAM_HEADER. The MdlAddress should describe the frame buffer, from the Data field of the KSSTREAM_HEADER. They should point to the same memory.
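
In code, roughly (a sketch, untested; error handling and the copy itself omitted):

    #include <ntddk.h>
    #include <ks.h>

    /* By the time this completion routine runs, ksthunk has done the
       translation: Irp->UserBuffer is a kernel-mode pointer to the
       KSSTREAM_HEADER, and Irp->MdlAddress describes the frame buffer. */
    NTSTATUS
    ReadStreamComplete(
        PDEVICE_OBJECT DeviceObject,
        PIRP Irp,
        PVOID Context
        )
    {
        UNREFERENCED_PARAMETER(DeviceObject);
        UNREFERENCED_PARAMETER(Context);

        if (NT_SUCCESS(Irp->IoStatus.Status) && Irp->MdlAddress != NULL) {
            PKSSTREAM_HEADER hdr = (PKSSTREAM_HEADER)Irp->UserBuffer;
            PVOID frame = MmGetSystemAddressForMdlSafe(Irp->MdlAddress,
                                                       NormalPagePriority);
            if (hdr != NULL && frame != NULL && hdr->DataUsed != 0) {
                /* hdr->DataUsed bytes starting at 'frame' are the
                   payload; copy them out here, before the IRP goes
                   away underneath you. */
            }
        }

        if (Irp->PendingReturned) {
            IoMarkIrpPending(Irp);
        }
        return STATUS_CONTINUE_COMPLETION;
    }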

@Tim_Roberts thanks a lot. I wish this were documented somewhere, but it is clear that you have already debugged and discovered this information and are willing to share it here. For that, I thank you.

When it is my app that is running, for example, I am able to capture the frame bits and save them to a BMP file.

When I run the camera app, for example, the app seems to be setting MPEG (a compression format) when it does CreateFile() on the pin.
So when I capture the frame, I assume it will still be compressed in MPEG format.

I do not suppose it is as simple as writing the captured frames to an MPEG file (all in user space, of course) and opening the MPEG file in Windows Media Player.
I am just wondering how to interpret the frame so I can write it to disk and open it as a photo.

Thanks,
RK

You’re sure it’s MPEG, and not MJPG? MJPG is easy – it’s just a sequence of JPEG images.

@Tim_Roberts thanks. I stand corrected. I am capturing the format in IRP_MJ_CREATE on the pin, and it is MJPG.

The KS_DATAFORMAT_VIDEOINFOHEADER shows biCompression as 0x47504a4d [MJPG]

So if I understand you correctly, each completion of IOCTL_KS_READ_STREAM should contain one complete JPEG image.
All I have to do is make a copy of the buffer described by Irp->MdlAddress, with size equal to DataUsed, and that should be a complete JPEG image.
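
In code, I am planning something like this (just a sketch; MyQueueFrameCopy is a hypothetical helper of mine that copies the bytes to nonpaged pool and defers the file write to a work item, since this runs in a completion routine, possibly at DISPATCH_LEVEL):

    /* Quick sanity check on each completion: a JPEG image starts with
       the SOI marker FF D8, so each MJPG frame should too. */
    PKSSTREAM_HEADER hdr = (PKSSTREAM_HEADER)Irp->UserBuffer;
    PUCHAR frame = (PUCHAR)MmGetSystemAddressForMdlSafe(Irp->MdlAddress,
                                                        NormalPagePriority);

    if (hdr != NULL && frame != NULL && hdr->DataUsed >= 2 &&
        frame[0] == 0xFF && frame[1] == 0xD8) {
        MyQueueFrameCopy(frame, hdr->DataUsed);   /* hypothetical helper */
    }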

Please correct me if I am wrong in my understanding of your statements above.
Thanks,
RK

Yes, that’s how it is supposed to work. :wink: That’s a common format for web cams, because JPEG is easy to compress in a microprocessor (MPEG is definitely not), and each frame stands alone.

@Tim_Roberts thanks a lot for the reply. I will try that and see if it works. I will update here with what I find. Thanks again.