avshws' CPU loading

Hi all,

I found a problem with the WDK sample avshws: if I connect the capture pin of the avshws source directly to the VMR, the CPU load is about 10%, but if I insert a YUV transform filter between the source and the VMR, the CPU load drops to about 1%~2%.

In the YUV transform filter, I just copy the data from the input media sample to the output media sample, and its output pin implements the QueryAccept virtual function.

I wonder why the CPU load goes down when an extra filter is added (obviously, the CPU does more work when the transform filter is in the graph).

And how can I modify avshws to reduce the CPU load when it is connected directly to the VMR?

thanks!

What are the pin formats that get chosen in each case? What does your
YUV transform filter actually do?

Avshws can expose RGB24 or YUY2. If your graphics card doesn’t happen
to support overlays in either of those formats, then it will have to
draw frames into the frame buffer directly for each frame. If the YUV
transform filter happens to convert YUY2 to UYVY, and your graphics card
supports UYVY overlays, then one would certainly expect that to use less
CPU.
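
To make the YUY2 versus UYVY distinction concrete: both are packed 4:2:2 formats at 2 bytes per pixel, and they differ only in byte order (YUY2 is Y0 U Y1 V, UYVY is U Y0 V Y1). A minimal sketch of such a conversion is below. The function name Yuy2ToUyvy is made up for illustration and is not part of avshws or of the filter discussed in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: convert packed YUY2 (Y0 U Y1 V) to packed UYVY (U Y0 V Y1).
// Both formats use 2 bytes per pixel, so the conversion is a byte swap
// within each 2-byte pair.
std::vector<std::uint8_t> Yuy2ToUyvy(const std::vector<std::uint8_t>& yuy2)
{
    // One macropixel covers 2 pixels and occupies 4 bytes.
    assert(yuy2.size() % 4 == 0);

    std::vector<std::uint8_t> uyvy(yuy2.size());
    for (std::size_t i = 0; i + 1 < yuy2.size(); i += 2) {
        uyvy[i] = yuy2[i + 1];      // chroma byte moves to the front
        uyvy[i + 1] = yuy2[i];      // luma byte moves to the back
    }
    return uyvy;
}
```

A filter doing this touches every byte once per frame, so any saving would have to come from the renderer being able to use an overlay, not from the conversion itself.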


Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.

What are the pin formats that get chosen in each case? What does your YUV transform filter actually do?
=> In both cases (connected directly to the VMR, and with the transform filter inserted in between), the avshws output pin format is YUY2, 320*240, 16 bits, rcSrc = {0, 0, 320, 240}, rcDst = {0, 0, 0, 0}, and the VMR input pin format is YUY2, 320*-240, 16 bits, rcSrc = {0, 0, 320, 240}, rcDst = {0, 0, 0, 0}. Both the input pin and the output pin of the transform filter use the same format as the avshws output pin. The transform filter does no format conversion; it just allocates a buffer for the output sample and copies the data from the input sample's buffer to the output sample's buffer.

Notice that in both cases the VMR's input pin has a negative height, -240.
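
For reference, the sign of the height does not change the amount of data per frame. A rough sketch of the size arithmetic for the format above follows. It assumes rows are padded to a 4-byte boundary as in an ordinary DIB, and the helper names RowStrideBytes and FrameSizeBytes are made up for illustration. Whether m_lFrameSize in the filter is computed this way is an assumption.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>

// Hypothetical helper: bytes per row, rounded up to a 4-byte (DWORD) boundary.
std::int32_t RowStrideBytes(std::int32_t width, std::int32_t bitsPerPixel)
{
    assert(width > 0 && bitsPerPixel > 0);
    return ((width * bitsPerPixel + 31) & ~31) / 8;
}

// Hypothetical helper: bytes per frame. The height may be negative in the
// format block, so only its magnitude is used here.
std::int32_t FrameSizeBytes(std::int32_t width, std::int32_t height,
                            std::int32_t bitsPerPixel)
{
    return RowStrideBytes(width, bitsPerPixel) * std::abs(height);
}
```

For YUY2 at 320 wide and 16 bits per pixel the stride is 640 bytes, so a frame is 640 * 240 = 153600 bytes whether the height is given as 240 or -240.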

The key code of the transform filter is as follows:

HRESULT CUYVYTransfer::DecideBufferSize(IMemAllocator *pAlloc, ALLOCATOR_PROPERTIES *ppropInputRequest)
{
    HRESULT result;
    ALLOCATOR_PROPERTIES ppropActual;

    // The output buffer size depends on the input format,
    // so the input pin must be connected first.
    if (m_pInput->IsConnected() == FALSE)
    {
        return E_UNEXPECTED;
    }

    // Request a single buffer large enough for one frame.
    ppropInputRequest->cBuffers = 1;
    ppropInputRequest->cbBuffer = m_lFrameSize;
    ppropInputRequest->cbPrefix = 0;

    result = pAlloc->SetProperties(ppropInputRequest, &ppropActual);
    if (FAILED(result))
    {
        return result;
    }

    // The allocator may grant less than was requested; reject that case.
    if (ppropActual.cbBuffer < ppropInputRequest->cbBuffer)
    {
        return E_FAIL;
    }

    return S_OK;
}

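HRESULT CUYVYTransfer::Transform(IMediaSample *pIn, IMediaSample *pOut)
{
    BYTE *src_buffer = NULL;
    BYTE *dest_buffer = NULL;
    HRESULT hr;

    // Propagate real failures instead of hiding them behind S_FALSE.
    hr = pIn->GetPointer(&src_buffer);
    if (FAILED(hr))
        return hr;
    hr = pOut->GetPointer(&dest_buffer);
    if (FAILED(hr))
        return hr;

    // S_FALSE tells the base class not to deliver this sample;
    // use it only for an empty input sample.
    long cbData = pIn->GetActualDataLength();
    if (cbData <= 0)
        return S_FALSE;

    // Never copy more than the output buffer can hold.
    if (cbData > pOut->GetSize())
        return E_FAIL;

    memcpy(dest_buffer, src_buffer, cbData);
    pOut->SetActualDataLength(cbData);

    return NOERROR;
}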
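
One thing the single memcpy in Transform relies on is that the input and output buffers have the same row stride. If the downstream renderer ever proposed a format with a wider stride than the image (nothing in this thread shows that it does here, so treat this as a hypothetical case), the copy would have to go row by row. A minimal sketch, where CopyRows and its parameters are made-up names and not part of the filter above:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: copy an image row by row between buffers whose
// row strides may differ. rowBytes is the number of meaningful bytes
// in each row (for YUY2, width * 2).
void CopyRows(std::uint8_t* dst, std::size_t dstStride,
              const std::uint8_t* src, std::size_t srcStride,
              std::size_t rowBytes, std::size_t rows)
{
    // A row must fit inside both strides, otherwise rows would overlap.
    assert(rowBytes <= dstStride && rowBytes <= srcStride);

    for (std::size_t y = 0; y < rows; ++y) {
        std::memcpy(dst + y * dstStride, src + y * srcStride, rowBytes);
    }
}
```

When both strides equal rowBytes this does the same work as one large memcpy, so it costs nothing extra in the common case.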