Adding IOCTL to Simple Audio Sample

Hello everyone, I have been trying to add an IOCTL backdoor handler to SimpleAudioSample to create a virtual mic driver that lets me stream audio into the virtual microphone device from a user-mode app (Unity), instead of the default sine tone the sample plays.

Due to my lack of experience with Windows drivers, I haven't fully grasped the architecture or the best place to add my IOCTL handler without intercepting PortCls and causing a BSOD.

I tried installing my IOCTL handler in AddDevice and even in DriverEntry. In either case I AM able to receive the input audio buffers, BUT it breaks device enumeration in system settings, so I can no longer see the device.

The way I was attempting to add the IOCTL handler in AddDevice was this -

if (NT_SUCCESS(status)) {
    DriverObject->MajorFunction[IRP_MJ_DEVICE_CONTROL] = CMiniportWaveRTStream::HandleCustomIOCTL;
    PhysicalDeviceObject->Flags |= DO_BUFFERED_IO;
    PhysicalDeviceObject->Flags &= ~DO_DEVICE_INITIALIZING;
}

NTSTATUS
CMiniportWaveRTStream::HandleCustomIOCTL
(
    _In_ PDEVICE_OBJECT DeviceObject,
    _In_ PIRP Irp
)
/*++

Routine Description:

Add a custom IOCTL handler that works with the Ring buffer.

--*/
{
    PIO_STACK_LOCATION irpStack = IoGetCurrentIrpStackLocation(Irp);
    NTSTATUS status = STATUS_SUCCESS;
    ULONG_PTR bytesTransferred = 0;

    DbgPrint("SIMPLEAUDIOSAMPLE: HandleCustomIOCTL Entry\n");
    DbgPrint("SIMPLEAUDIOSAMPLE: Received IOCTL Code: 0x%x\n",
        irpStack->Parameters.DeviceIoControl.IoControlCode);

    // Log all buffer parameters
    DbgPrint("SIMPLEAUDIOSAMPLE: Input Buffer Length: %d\n",
        irpStack->Parameters.DeviceIoControl.InputBufferLength);
    DbgPrint("SIMPLEAUDIOSAMPLE: Output Buffer Length: %d\n",
        irpStack->Parameters.DeviceIoControl.OutputBufferLength);

    DbgPrint("SIMPLEAUDIOSAMPLE: HandleCustomIOCTL - Code: 0x%x, Method: %d\n",
        irpStack->Parameters.DeviceIoControl.IoControlCode,
        irpStack->Parameters.DeviceIoControl.IoControlCode & 3);

    // Get the buffered data
    PVOID buffer = Irp->AssociatedIrp.SystemBuffer;
    ULONG inLength = irpStack->Parameters.DeviceIoControl.InputBufferLength;

    DbgPrint("SIMPLEAUDIOSAMPLE: Buffer: %p, Length: %d\n", buffer, inLength);

    if (buffer && inLength > 0) {
        SHORT* samples = (SHORT*)buffer;
        DbgPrint("SIMPLEAUDIOSAMPLE: First sample: %d\n", samples[0]);
    }

    switch (irpStack->Parameters.DeviceIoControl.IoControlCode)
    {
    case 0x2f0007:  // Audio write attempt
    {
        PVOID inputBuffer = Irp->AssociatedIrp.SystemBuffer;
        ULONG inputBufferLength = irpStack->Parameters.DeviceIoControl.InputBufferLength;

        DbgPrint("SIMPLEAUDIOSAMPLE: Write request - Buffer: %p, Length: %d\n",
            inputBuffer, inputBufferLength);

        if (!inputBuffer || inputBufferLength == 0) {
            DbgPrint("SIMPLEAUDIOSAMPLE: Empty buffer received\n");
            status = STATUS_INVALID_PARAMETER;
            break;
        }

        // Log first few samples
        SHORT* samples = (SHORT*)inputBuffer;
        DbgPrint("SIMPLEAUDIOSAMPLE: First 5 samples: %d %d %d %d %d\n",
            samples[0], samples[1], samples[2], samples[3], samples[4]);

        PortClassDeviceContext* deviceContext = (PortClassDeviceContext*)DeviceObject->DeviceExtension;
        if (!deviceContext || !deviceContext->m_pCommon) {
            DbgPrint("SIMPLEAUDIOSAMPLE: Invalid device context\n");
            status = STATUS_INVALID_DEVICE_STATE;
            break;
        }

        CMiniportWaveRT* miniport = (CMiniportWaveRT*)deviceContext->m_pCommon;
        PCMiniportWaveRTStream stream = miniport->GetActiveCaptureStream();

        if (!stream) {
            DbgPrint("SIMPLEAUDIOSAMPLE: No active stream\n");
            status = STATUS_INVALID_DEVICE_STATE;
            break;
        }

        if (!stream->m_RingBuffer) {
            DbgPrint("SIMPLEAUDIOSAMPLE: No ring buffer\n");
            status = STATUS_INVALID_DEVICE_STATE;
            break;
        }

        status = stream->m_RingBuffer->Write(
            (PBYTE)inputBuffer,
            inputBufferLength);

        if (NT_SUCCESS(status)) {
            bytesTransferred = inputBufferLength;
            DbgPrint("SIMPLEAUDIOSAMPLE: Successfully wrote %d bytes\n", inputBufferLength);
        }
        else {
            DbgPrint("SIMPLEAUDIOSAMPLE: Write failed, status = 0x%x\n", status);
        }

        stream->Release();
        break;
    }

    case 0x2f0003:  // Standard audio query
        DbgPrint("SIMPLEAUDIOSAMPLE: Standard query received\n");
        status = STATUS_SUCCESS;
        break;

    default:
        DbgPrint("SIMPLEAUDIOSAMPLE: Unknown IOCTL code: 0x%x\n",
            irpStack->Parameters.DeviceIoControl.IoControlCode);
        status = STATUS_INVALID_DEVICE_REQUEST;
        break;
    }

    Irp->IoStatus.Status = status;
    Irp->IoStatus.Information = bytesTransferred;
    IoCompleteRequest(Irp, IO_NO_INCREMENT);

    return status;
}

This runs after the PcAddAdapterDevice call succeeds. My best guess is that it rerouted all of PortCls's IOCTLs into this handler and caused device enumeration to fail. If I do this in DriverEntry instead of AddDevice, device listing still fails.

Interestingly, I see continuous logs printed from HandleCustomIOCTL even if device enumeration fails, in either case.

I'm really lost on where to actually add the IOCTL backdoor in question. A similar post describes the same problem, but the IOCTL part is skimmed over, so I'm not sure what was done there.

I just need directions on where to go from here to add the IOCTL handler without disturbing PortCls's enumeration flow.

I do have a ring buffer class ready for IOCTL. Any help will be greatly appreciated! Again, sorry if the question is very basic.

Did you do any searching at all before posting? We have been having an extensive discussion on this exact topic.

Here's the problem. Audio Engine uses Kernel Streaming to talk to the audio drivers. That communication uses thousands of IOCTL calls to exchange properties, determine capabilities, and establish the audio format. Your code is intercepting all of those and failing them with STATUS_INVALID_DEVICE_REQUEST. No wonder it fails.

ANY TIME you handle an IRP that isn't specifically for you, you must call PcDispatchIrp to allow PortClass to do the dispatching it would ordinarily have done if you weren't there.

Please go read our other thread. You will benefit from that exchange.

Well, that's solved. Appreciate the direction. I did go through that thread, but most of it seemed to center on getting communication working, which I think I'm past now. My handler now processes only my own IOCTL code and passes everything else on to PcDispatchIrp. I install the handler inside AddDevice, and my device listing is now unaffected. I plan on attacking the circular ring buffer once I get valid data into the driver.

What's stalling me now is invalid data between my user-mode DLL (carrying the actual audio bytes) and the driver. Yes, I should probably be building the IOCTL code with METHOD_IN_DIRECT instead of hardcoding it on both sides, but somehow, even with the same includes, the IOCTL code computes differently in the two VS projects (DLL and virtual mic driver), so for now I hardcoded the value ONLY to test. That got communication working. MAYBE this is what gives me null pointers on the driver side when receiving data?

This is the HandleCustomIOCTL method I have in my driver (Added in AddDevice) -

NTSTATUS
CMiniportWaveRTStream::HandleCustomIOCTL
(
    _In_ PDEVICE_OBJECT DeviceObject,
    _In_ PIRP Irp
)
/*++

Routine Description:

Add a custom IOCTL handler that works with the Ring buffer.

--*/
{
    PIO_STACK_LOCATION irpStack = IoGetCurrentIrpStackLocation(Irp);
    NTSTATUS status = STATUS_SUCCESS;
    ULONG_PTR bytesTransferred = 0;
    PCMiniportWaveRTStream stream = NULL;

    if (irpStack->Parameters.DeviceIoControl.IoControlCode == 0x2f0007)
    {
        __try
        {
            // Get direct buffer access
            PVOID inputBuffer = NULL;
            ULONG inputBufferLength = irpStack->Parameters.DeviceIoControl.InputBufferLength;

            // Try getting buffer through different methods
            if (Irp->AssociatedIrp.SystemBuffer) {
                inputBuffer = Irp->AssociatedIrp.SystemBuffer;
                DbgPrint("SIMPLEAUDIOSAMPLE: Using SystemBuffer: %p\n", inputBuffer);
            }
            else if (Irp->MdlAddress) {
                inputBuffer = MmGetSystemAddressForMdlSafe(Irp->MdlAddress, NormalPagePriority);
                DbgPrint("SIMPLEAUDIOSAMPLE: Using MDL Buffer: %p\n", inputBuffer);
            }
            else {
                inputBuffer = Irp->UserBuffer;
                DbgPrint("SIMPLEAUDIOSAMPLE: Using UserBuffer: %p\n", inputBuffer);
            }

            DbgPrint("SIMPLEAUDIOSAMPLE: Write request - Buffer: %p, Length: %d\n",
                inputBuffer, inputBufferLength);

            // Validate buffer
            if (!inputBuffer || inputBufferLength != 32) {
                DbgPrint("SIMPLEAUDIOSAMPLE: Invalid buffer params - Buffer: %p, Length: %d\n",
                    inputBuffer, inputBufferLength);
                status = STATUS_INVALID_PARAMETER;
                goto End;
            }

            // Access and log the actual data
            SHORT* samples = (SHORT*)inputBuffer;
            PBYTE bytes = (PBYTE)inputBuffer;

            DbgPrint("SIMPLEAUDIOSAMPLE: First 8 bytes: %02X %02X %02X %02X %02X %02X %02X %02X\n",
                bytes[0], bytes[1], bytes[2], bytes[3],
                bytes[4], bytes[5], bytes[6], bytes[7]);

            DbgPrint("SIMPLEAUDIOSAMPLE: First 5 samples: %d %d %d %d %d\n",
                samples[0], samples[1], samples[2], samples[3], samples[4]);

            // Get device context
            PortClassDeviceContext* deviceContext = (PortClassDeviceContext*)DeviceObject->DeviceExtension;
            if (!deviceContext || !deviceContext->m_pCommon) {
                DbgPrint("SIMPLEAUDIOSAMPLE: Invalid device context\n");
                status = STATUS_INVALID_DEVICE_STATE;
                goto End;
            }

            CMiniportWaveRT* miniport = (CMiniportWaveRT*)deviceContext->m_pCommon;
            stream = miniport->GetActiveCaptureStream();

            if (!stream) {
                DbgPrint("SIMPLEAUDIOSAMPLE: No active stream\n");
                status = STATUS_INVALID_DEVICE_STATE;
                goto End;
            }

            if (!stream->m_RingBuffer) {
                DbgPrint("SIMPLEAUDIOSAMPLE: No ring buffer\n");
                status = STATUS_INVALID_DEVICE_STATE;
                goto End;
            }

            status = stream->m_RingBuffer->Write(
                (PBYTE)inputBuffer,
                inputBufferLength);

            if (NT_SUCCESS(status)) {
                bytesTransferred = inputBufferLength;
                DbgPrint("SIMPLEAUDIOSAMPLE: Successfully wrote %d bytes\n",
                    inputBufferLength);
            }
            else {
                DbgPrint("SIMPLEAUDIOSAMPLE: Write failed with status 0x%x\n", status);
            }
        }
        __except (EXCEPTION_EXECUTE_HANDLER)
        {
            DbgPrint("SIMPLEAUDIOSAMPLE: Exception in IOCTL handler\n");
            status = STATUS_INVALID_DEVICE_REQUEST;
        }
    }
    else
    {
        return PcDispatchIrp(DeviceObject, Irp);
    }

End:
    if (stream) {
        stream->Release();
    }

    Irp->IoStatus.Status = status;
    Irp->IoStatus.Information = bytesTransferred;
    IoCompleteRequest(Irp, IO_NO_INCREMENT);

    return status;
}

And this is the send code from my C++ DLL (I verified that valid packets are being sent from my user-mode app through this DLL) -

bool SendAudioData(const float* data, int lengthInSamples, int channels)
{
    WriteLog(L"=== Starting Audio Send ===");

    // Use fixed size matching driver expectation
    const int SHORTS_PER_WRITE = 16;  // 32 bytes
    const SIZE_T bufferSize = SHORTS_PER_WRITE * sizeof(SHORT);

    // Create buffer for hardware access
    SHORT* buffer = (SHORT*)VirtualAlloc(NULL, bufferSize,
        MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE);

    if (!buffer) {
        WriteLog(L"Failed to allocate buffer");
        return false;
    }

    bool hasNonZeroData = false;
    for (int i = 0; i < min(100, lengthInSamples); i++) {
        if (data[i] != 0.0f) {
            hasNonZeroData = true;
            break;
        }
    }
    WriteLog(std::wstring(L"Has non-zero data: ") + (hasNonZeroData ? L"Yes" : L"No"));

    int processedSamples = 0;
    bool success = true;

    while (processedSamples < lengthInSamples && success) {
        RtlZeroMemory(buffer, bufferSize);

        int samplesToProcess = min(SHORTS_PER_WRITE, lengthInSamples - processedSamples);

        // Convert float samples to SHORT
        for (int i = 0; i < samplesToProcess; i++) {
            float sample = data[processedSamples + i];
            float scaledSample = sample * 32767.0f;
            scaledSample = max(-32768.0f, min(32767.0f, scaledSample));
            buffer[i] = static_cast<SHORT>(scaledSample);
        }

        // Debug logging
        if (processedSamples == 0) {
            std::wstringstream ss;
            ss << L"Buffer ptr: 0x" << std::hex << (void*)buffer;
            WriteLog(ss.str());

            ss.str(L"");
            ss << L"First 5 samples (short): ";
            for (int i = 0; i < min(5, samplesToProcess); i++) {
                ss << std::dec << buffer[i] << L" ";
            }
            WriteLog(ss.str());

            BYTE* bytes = (BYTE*)buffer;
            ss.str(L"");
            ss << L"First 8 bytes: ";
            for (int i = 0; i < 8; i++) {
                ss << std::hex << std::setw(2) << std::setfill(L'0')
                    << (int)bytes[i] << L" ";
            }
            WriteLog(ss.str());
        }

        DWORD bytesWritten = 0;
        WriteLog(L"Sending IOCTL 0x2f0007");
        WriteLog(std::wstring(L"Buffer size: ") + std::to_wstring(bufferSize));

        // Create direct buffer IOCTL using memory mapped file
        HANDLE section = CreateFileMapping(INVALID_HANDLE_VALUE, NULL,
            PAGE_READWRITE, 0, bufferSize, NULL);

        if (!section) {
            WriteLog(L"Failed to create file mapping");
            success = false;
            break;
        }

        void* mappedBuffer = MapViewOfFile(section, FILE_MAP_WRITE, 0, 0, bufferSize);
        if (!mappedBuffer) {
            WriteLog(L"Failed to map view of file");
            CloseHandle(section);
            success = false;
            break;
        }

        // Copy our data to shared buffer
        memcpy(mappedBuffer, buffer, bufferSize);

        // Log mapped buffer contents
        std::wstringstream ss;
        ss << L"Mapped buffer ptr: 0x" << std::hex << mappedBuffer;
        WriteLog(ss.str());

        BYTE* mappedBytes = (BYTE*)mappedBuffer;
        ss.str(L"");
        ss << L"Mapped buffer first 8 bytes: ";
        for (int i = 0; i < 8; i++) {
            ss << std::hex << std::setw(2) << std::setfill(L'0')
                << (int)mappedBytes[i] << L" ";
        }
        WriteLog(ss.str());

        // Send to driver
        if (!DeviceIoControl(
            g_deviceHandle,
            0x2f0007,
            mappedBuffer,
            (DWORD)bufferSize,
            NULL,
            0,
            &bytesWritten,
            NULL))
        {
            g_lastError = GetLastError();
            std::wstringstream errSs;
            errSs << L"Write failed, Error: " << std::dec << g_lastError;
            WriteLog(errSs.str());
            success = false;
        }
        else {
            WriteLog(std::wstring(L"Write successful, bytes: ") +
                std::to_wstring(bytesWritten));
        }

        UnmapViewOfFile(mappedBuffer);
        CloseHandle(section);
        processedSamples += samplesToProcess;
    }

    VirtualFree(buffer, 0, MEM_RELEASE);
    return success;
}

Now, the logs I get from DebugView for the driver -

SIMPLEAUDIOSAMPLE: Write request - Buffer: 0000000000000000, Length: 32
SIMPLEAUDIOSAMPLE: Invalid buffer params - Buffer: 0000000000000000, Length: 32
SIMPLEAUDIOSAMPLE: Using UserBuffer: 0000000000000000
SIMPLEAUDIOSAMPLE: Write request - Buffer: 0000000000000000, Length: 32
SIMPLEAUDIOSAMPLE: Invalid buffer params - Buffer: 0000000000000000, Length: 32
SIMPLEAUDIOSAMPLE: Using UserBuffer: 0000000000000000
SIMPLEAUDIOSAMPLE: Write request - Buffer: 0000000000000000, Length: 32
SIMPLEAUDIOSAMPLE: Invalid buffer params - Buffer: 0000000000000000, Length: 32
SIMPLEAUDIOSAMPLE: Using UserBuffer: 0000000000000000
SIMPLEAUDIOSAMPLE: Write request - Buffer: 0000000000000000, Length: 32
SIMPLEAUDIOSAMPLE: Invalid buffer params - Buffer: 0000000000000000, Length: 32
SIMPLEAUDIOSAMPLE: Using UserBuffer: 0000000000000000
SIMPLEAUDIOSAMPLE: Write request - Buffer: 0000000000000000, Length: 32
SIMPLEAUDIOSAMPLE: Invalid buffer params - Buffer: 0000000000000000, Length: 32
SIMPLEAUDIOSAMPLE: Using UserBuffer: 0000000000000000

Logs from the C++ DLL -

2024-11-26 19:04:00 - Buffer size: 32
2024-11-26 19:04:00 - Mapped buffer ptr: 0x000001D552E40000
2024-11-26 19:04:00 - Mapped buffer first 8 bytes: 08 de f9 de e5 de a1 de
2024-11-26 19:04:00 - Write failed, Error: 87
2024-11-26 19:04:00 - Last error code: 87
2024-11-26 19:04:00 - === Starting Audio Send ===
2024-11-26 19:04:00 - Has non-zero data: Yes
2024-11-26 19:04:00 - Buffer ptr: 0x000001D552E30000
2024-11-26 19:04:00 - First 5 samples (short): 4991 4252 4670 3252 4786
2024-11-26 19:04:00 - First 8 bytes: 7f 13 9c 10 3e 12 b4 0c
2024-11-26 19:04:00 - Sending IOCTL 0x2f0007
2024-11-26 19:04:00 - Buffer size: 32
2024-11-26 19:04:00 - Mapped buffer ptr: 0x000001D552E40000
2024-11-26 19:04:00 - Mapped buffer first 8 bytes: 7f 13 9c 10 3e 12 b4 0c
2024-11-26 19:04:00 - Write failed, Error: 87
2024-11-26 19:04:00 - Last error code: 87
2024-11-26 19:04:00 - === Starting Audio Send ===
2024-11-26 19:04:00 - Has non-zero data: Yes
2024-11-26 19:04:00 - Buffer ptr: 0x000001D552E30000
2024-11-26 19:04:00 - First 5 samples (short): -497 -2830 -782 -3365 -3235
2024-11-26 19:04:00 - First 8 bytes: 0f fe f2 f4 f2 fc db f2
2024-11-26 19:04:00 - Sending IOCTL 0x2f0007
2024-11-26 19:04:00 - Buffer size: 32
2024-11-26 19:04:00 - Mapped buffer ptr: 0x000001D552E40000
2024-11-26 19:04:00 - Mapped buffer first 8 bytes: 0f fe f2 f4 f2 fc db f2
2024-11-26 19:04:00 - Write failed, Error: 87
2024-11-26 19:04:00 - Last error code: 87

I'm confused about the right way to parse the received data, and whether hardcoding the IOCTL code for testing is itself a big mistake that's throwing me off.
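On the hardcoding question, one thing worth checking is what the hardcoded value actually encodes. An I/O control code packs the device type, required access, function number, and transfer method into one 32-bit value, and the low two bits select the method. The sketch below decodes a code in plain C++ so it can be run outside the driver. The struct and function names are made up for illustration; the bit layout is the one documented for the CTL_CODE macro.

```cpp
#include <cassert>
#include <cstdint>

// Fields packed into a 32-bit I/O control code (layout used by CTL_CODE).
struct IoctlFields {
    uint32_t deviceType;  // bits 31..16
    uint32_t access;      // bits 15..14
    uint32_t function;    // bits 13..2
    uint32_t method;      // bits 1..0
};

// Hypothetical helper: split a control code into its fields.
inline IoctlFields DecodeIoctl(uint32_t code) {
    IoctlFields f;
    f.deviceType = (code >> 16) & 0xFFFF;
    f.access     = (code >> 14) & 0x3;
    f.function   = (code >> 2) & 0xFFF;
    f.method     = code & 0x3;
    return f;
}

// Hypothetical helper: name of the transfer method selected by the low bits.
inline const char* MethodName(uint32_t method) {
    assert(method <= 3);  // only two bits are available
    static const char* const names[] = {
        "METHOD_BUFFERED", "METHOD_IN_DIRECT",
        "METHOD_OUT_DIRECT", "METHOD_NEITHER"
    };
    return names[method];
}
```

DecodeIoctl(0x2f0007) gives device type 0x2f, function 1, and method 3, which is METHOD_NEITHER. With that method the I/O manager sets up neither a system buffer nor an MDL, which would be consistent with the null pointers in the driver log above. Device type 0x2f is also, to my knowledge, the one Kernel Streaming uses, so a hardcoded value in that range may collide with a code PortCls expects to handle itself.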

@Tim_Roberts Hey Tim, I just need a direction so I can proceed with the right approach. Is it necessary to stick with METHOD_IN_DIRECT instead of METHOD_BUFFERED? MDLs seem complicated to implement. Also, can I reuse the DMA ring buffer from the sample code, or do I need to implement my own circular buffer?

METHOD_BUFFERED does a copy into kernel memory. METHOD_IN_DIRECT simply maps the user buffer. If the buffers are short, that extra copy isn't very expensive. If the buffers are long (say more than a page) then mapping is better. MDLs are easy; MmGetSystemAddressForMdlSafe does all the work and returns a kernel address.
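Since the method is baked into the control code, both projects have to compute exactly the same value, which is easiest when the definition lives in one header included by both. The following is only a sketch of how such a code is composed: it restates the documented CTL_CODE layout in portable C++ so it compiles without the Windows headers, and the function number 0x800, the device type, and the name kIoctlWriteAudioExample are assumptions for illustration, not the definitions used in this thread. In a real project the CTL_CODE macro from the SDK/WDK headers would be used instead.

```cpp
#include <cassert>
#include <cstdint>

// Values as documented for the Windows headers, restated so the sketch
// builds anywhere.
constexpr uint32_t kFileDeviceUnknown = 0x22;  // FILE_DEVICE_UNKNOWN
constexpr uint32_t kMethodBuffered    = 0;     // METHOD_BUFFERED
constexpr uint32_t kMethodInDirect    = 1;     // METHOD_IN_DIRECT
constexpr uint32_t kFileAnyAccess     = 0;     // FILE_ANY_ACCESS

// Same bit layout as CTL_CODE(DeviceType, Function, Method, Access).
constexpr uint32_t MakeIoctl(uint32_t deviceType, uint32_t function,
                             uint32_t method, uint32_t access) {
    return (deviceType << 16) | (access << 14) | (function << 2) | method;
}

// Hypothetical private code: function numbers from 0x800 up are the range
// left to vendors.
constexpr uint32_t kIoctlWriteAudioExample =
    MakeIoctl(kFileDeviceUnknown, 0x800, kMethodInDirect, kFileAnyAccess);

// Compile-time check that the method bits are what the driver expects.
static_assert((kIoctlWriteAudioExample & 0x3) == kMethodInDirect,
              "control code must select METHOD_IN_DIRECT");
```

With these inputs the code comes out as 0x222001. Logging the value from both the DLL and the driver, as the handler in this thread already does, is a cheap way to confirm the two builds agree.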

The SimpleAudioSample doesn't have any ring buffers. For microphone endpoints, it generates a sine wave as needed. For speaker endpoints, it saves to a WAV file. You will need to add your own.
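As a starting point for "your own", here is a minimal sketch of a byte ring buffer. It is written in plain C++ so the wrap-around logic can be tested in user mode first; the class and method names are hypothetical and not part of SimpleAudioSample. It is an assumption-laden model rather than driver code: in a driver the storage would have to come from nonpaged memory, and Write and Read would need a lock around them because the IOCTL path and the stream run in different contexts.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical fixed-capacity byte ring buffer (not thread-safe as written).
class ByteRingBuffer {
public:
    explicit ByteRingBuffer(size_t capacity)
        : data_(capacity), head_(0), tail_(0), size_(0) {
        assert(capacity > 0);
    }

    // Stores up to len bytes; returns how many fit (excess is dropped).
    size_t Write(const uint8_t* src, size_t len) {
        size_t n = std::min(len, data_.size() - size_);
        size_t first = std::min(n, data_.size() - tail_);  // up to the end
        std::memcpy(data_.data() + tail_, src, first);
        std::memcpy(data_.data(), src + first, n - first);  // wrapped part
        tail_ = (tail_ + n) % data_.size();
        size_ += n;
        return n;
    }

    // Removes up to len bytes; returns how many were available.
    size_t Read(uint8_t* dst, size_t len) {
        size_t n = std::min(len, size_);
        size_t first = std::min(n, data_.size() - head_);
        std::memcpy(dst, data_.data() + head_, first);
        std::memcpy(dst + first, data_.data(), n - first);
        head_ = (head_ + n) % data_.size();
        size_ -= n;
        return n;
    }

    size_t Size() const { return size_; }

private:
    std::vector<uint8_t> data_;
    size_t head_;  // next byte to read
    size_t tail_;  // next byte to write
    size_t size_;  // bytes currently stored
};
```

Returning the number of bytes actually transferred leaves the policy to the caller: the writer can decide whether to drop or retry on overflow, and the reader can pad with silence when fewer bytes come back than the audio engine asked for.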

Thanks for that, Tim. I tried implementing METHOD_IN_DIRECT in my user-mode app's DLL like this -

bool SendAudioData(const float* data, int lengthInSamples, int channels)
{
    WriteLog(L"=== Starting Audio Send ===");
    const int SHORTS_PER_WRITE = 16;
    const SIZE_T bufferSize = SHORTS_PER_WRITE * sizeof(SHORT);

    // Create buffer suitable for MDL
    LPVOID buffer = VirtualAlloc(NULL, bufferSize,
        MEM_COMMIT | MEM_RESERVE,
        PAGE_READWRITE);
    if (!buffer) {
        WriteLog(L"Failed to allocate buffer");
        return false;
    }

    // Lock pages in memory for MDL
    if (!VirtualLock(buffer, bufferSize)) {
        WriteLog(L"Failed to lock buffer pages");
        VirtualFree(buffer, 0, MEM_RELEASE);
        return false;
    }

    std::wstringstream ss;
    ss << L"Using IOCTL code: 0x" << std::hex << IOCTL_WRITE_AUDIO;
    WriteLog(ss.str());

    bool hasNonZeroData = false;
    for (int i = 0; i < min(100, lengthInSamples); i++) {
        if (data[i] != 0.0f) {
            hasNonZeroData = true;
            break;
        }
    }
    WriteLog(std::wstring(L"Has non-zero data: ") + (hasNonZeroData ? L"Yes" : L"No"));

    int processedSamples = 0;
    bool success = true;

    while (processedSamples < lengthInSamples && success) {
        ZeroMemory(buffer, bufferSize);

        int samplesToProcess = min(SHORTS_PER_WRITE, lengthInSamples - processedSamples);

        // Convert float samples to SHORT
        SHORT* shortBuffer = (SHORT*)buffer;
        for (int i = 0; i < samplesToProcess; i++) {
            float sample = data[processedSamples + i];
            float scaledSample = sample * 32767.0f;
            scaledSample = max(-32768.0f, min(32767.0f, scaledSample));
            shortBuffer[i] = static_cast<SHORT>(scaledSample);
        }

        // Debug logging
        if (processedSamples == 0) {
            ss.str(L"");
            ss << L"Buffer ptr: 0x" << std::hex << buffer;
            WriteLog(ss.str());

            ss.str(L"");
            ss << L"First 5 samples (short): ";
            for (int i = 0; i < min(5, samplesToProcess); i++) {
                ss << std::dec << shortBuffer[i] << L" ";
            }
            WriteLog(ss.str());

            BYTE* bytes = (BYTE*)buffer;
            ss.str(L"");
            ss << L"First 8 bytes: ";
            for (int i = 0; i < 8; i++) {
                ss << std::hex << std::setw(2) << std::setfill(L'0')
                    << (int)bytes[i] << L" ";
            }
            WriteLog(ss.str());
        }

        OVERLAPPED overlapped = { 0 };
        overlapped.hEvent = CreateEvent(NULL, TRUE, FALSE, NULL);
        if (!overlapped.hEvent) {
            WriteLog(L"Failed to create event");
            success = false;
            break;
        }

        WriteLog(L"Sending METHOD_IN_DIRECT IOCTL...");

        // Send using METHOD_IN_DIRECT which will create MDL
        if (!DeviceIoControl(
            g_deviceHandle,
            IOCTL_WRITE_AUDIO,
            buffer,
            (DWORD)bufferSize,
            NULL,
            0,
            NULL,
            &overlapped))
        {
            if (GetLastError() != ERROR_IO_PENDING) {
                g_lastError = GetLastError();
                ss.str(L"");
                ss << L"DeviceIoControl failed, Error: " << std::dec << g_lastError;
                WriteLog(ss.str());
                CloseHandle(overlapped.hEvent);
                success = false;
                break;
            }

            // Wait for completion
            DWORD bytesReturned;
            if (!GetOverlappedResult(g_deviceHandle, &overlapped, &bytesReturned, TRUE)) {
                g_lastError = GetLastError();
                ss.str(L"");
                ss << L"GetOverlappedResult failed, Error: " << std::dec << g_lastError;
                WriteLog(ss.str());
                CloseHandle(overlapped.hEvent);
                success = false;
                break;
            }

            ss.str(L"");
            ss << L"Write successful, bytes: " << bytesReturned;
            WriteLog(ss.str());
        }
        else {
            WriteLog(L"Write completed immediately");
        }

        CloseHandle(overlapped.hEvent);
        processedSamples += samplesToProcess;
    }

    VirtualUnlock(buffer, bufferSize);
    VirtualFree(buffer, 0, MEM_RELEASE);
    return success;
}

And this is my HandleCustomIOCTL method being injected from DriverEntry, used in minwavertstream.cpp -

NTSTATUS
CMiniportWaveRTStream::HandleCustomIOCTL
(
    _In_ PDEVICE_OBJECT DeviceObject,
    _In_ PIRP Irp
)
/*++

Routine Description:

Add a custom IOCTL handler that works with the Ring buffer.

--*/
{
    PIO_STACK_LOCATION irpStack = IoGetCurrentIrpStackLocation(Irp);
    NTSTATUS status = STATUS_SUCCESS;
    ULONG_PTR bytesTransferred = 0;

    DbgPrint("SIMPLEAUDIOSAMPLE: Starting IOCTL handler\n");

    ULONG controlCode = irpStack->Parameters.DeviceIoControl.IoControlCode;
    DbgPrint("SIMPLEAUDIOSAMPLE: Received IOCTL 0x%x, Expecting 0x%x\n",
        controlCode, IOCTL_WRITE_AUDIO);

    DbgPrint("SIMPLEAUDIOSAMPLE: IOCTL Method: %d\n", controlCode & 3);

    if (controlCode == IOCTL_WRITE_AUDIO)
    {
        __try
        {
            DbgPrint("SIMPLEAUDIOSAMPLE: Checking request structure...\n");

            DbgPrint("SIMPLEAUDIOSAMPLE: SystemBuffer: %p\n", Irp->AssociatedIrp.SystemBuffer);
            DbgPrint("SIMPLEAUDIOSAMPLE: MDL: %p\n", Irp->MdlAddress);
            DbgPrint("SIMPLEAUDIOSAMPLE: UserBuffer: %p\n", Irp->UserBuffer);
            DbgPrint("SIMPLEAUDIOSAMPLE: InputBufferLength: %d\n",
                irpStack->Parameters.DeviceIoControl.InputBufferLength);

            if (!Irp->MdlAddress) {
                DbgPrint("SIMPLEAUDIOSAMPLE: No MDL present\n");
                status = STATUS_INVALID_PARAMETER;
                goto End;
            }

            PVOID buffer = MmGetSystemAddressForMdlSafe(Irp->MdlAddress,
                NormalPagePriority | MdlMappingNoExecute);

            if (!buffer) {
                DbgPrint("SIMPLEAUDIOSAMPLE: Failed to map MDL\n");
                status = STATUS_INSUFFICIENT_RESOURCES;
                goto End;
            }

            // Log received samples
            SHORT* samples = (SHORT*)buffer;
            DbgPrint("SIMPLEAUDIOSAMPLE: First 5 samples: %d %d %d %d %d\n",
                samples[0], samples[1], samples[2], samples[3], samples[4]);

            status = STATUS_SUCCESS;

            //// Write to ring buffer
            //if (m_RingBuffer) {
            //    status = m_RingBuffer->Write(
            //        (PBYTE)buffer,
            //        irpStack->Parameters.DeviceIoControl.InputBufferLength);

            //    if (NT_SUCCESS(status)) {
            //        bytesTransferred = irpStack->Parameters.DeviceIoControl.InputBufferLength;
            //        DbgPrint("SIMPLEAUDIOSAMPLE: Successfully wrote %d bytes to ring buffer\n",
            //            bytesTransferred);
            //    }
            //    else {
            //        DbgPrint("SIMPLEAUDIOSAMPLE: Failed to write to ring buffer, status = 0x%x\n",
            //            status);
            //    }
            //}
            //else {
            //    DbgPrint("SIMPLEAUDIOSAMPLE: No ring buffer available\n");
            //    status = STATUS_DEVICE_NOT_READY;
            //}
        }
        __except (EXCEPTION_EXECUTE_HANDLER)
        {
            DbgPrint("SIMPLEAUDIOSAMPLE: Exception in IOCTL handler\n");
            status = STATUS_INVALID_DEVICE_REQUEST;
        }
    }
    else
    {
        return PcDispatchIrp(DeviceObject, Irp);
    }

End:
    Irp->IoStatus.Status = status;
    Irp->IoStatus.Information = bytesTransferred;
    IoCompleteRequest(Irp, IO_NO_INCREMENT);
    return status;
}

Even though I use MmGetSystemAddressForMdlSafe, I still keep getting "SIMPLEAUDIOSAMPLE: No MDL present" in my DebugView logs from the driver's IOCTL handler when sending audio to it. I'm lost here; what am I doing wrong?

The names are slightly misleading. For both IN and OUT_DIRECT, the first buffer in the DeviceIoControl call is always buffered. It is the SECOND buffer that gets mapped. So, you need:

        if (!DeviceIoControl(
            g_deviceHandle,
            IOCTL_WRITE_AUDIO,
            NULL,
            0,
            buffer,
            (DWORD)bufferSize,
            NULL,
            &overlapped))

Thank you! The name is misleading indeed! After changing the code on both sides, I have MDL working and packets coming in successfully, with no issues opening the device from the user-mode DLL (which sends the audio packets). I also have a circular buffer implemented with proper spinlock releases (previously I had IRQL BSODs).

The only thing keeping me from sending audio data to the ring buffer is that I'm unable to get the active stream, which I need in order to write to the buffer through the stream instance.

The handle code I have now -

NTSTATUS
CMiniportWaveRTStream::HandleCustomIOCTL
(
    _In_ PDEVICE_OBJECT DeviceObject,
    _In_ PIRP Irp
)
/*++

Routine Description:

Add a custom IOCTL handler that works with the Ring buffer.

--*/
{
    PIO_STACK_LOCATION irpStack = IoGetCurrentIrpStackLocation(Irp);
    NTSTATUS status = STATUS_SUCCESS;
    ULONG_PTR bytesTransferred = 0;
    PCMiniportWaveRTStream stream = NULL;

    DbgPrint("SIMPLEAUDIOSAMPLE: Starting IOCTL handler\n");

    ULONG controlCode = irpStack->Parameters.DeviceIoControl.IoControlCode;
    DbgPrint("SIMPLEAUDIOSAMPLE: Received IOCTL 0x%x, Expecting 0x%x\n",
        controlCode, IOCTL_WRITE_AUDIO);

    DbgPrint("SIMPLEAUDIOSAMPLE: IOCTL Method: %d\n", controlCode & 3);

    if (controlCode == IOCTL_WRITE_AUDIO)
    {
        __try
        {
            DbgPrint("SIMPLEAUDIOSAMPLE: Checking request structure...\n");

            if (!Irp->MdlAddress) {
                DbgPrint("SIMPLEAUDIOSAMPLE: No MDL present\n");
                status = STATUS_INVALID_PARAMETER;
                goto End;
            }

            PVOID buffer = MmGetSystemAddressForMdlSafe(Irp->MdlAddress,
                NormalPagePriority | MdlMappingNoExecute);

            if (!buffer) {
                DbgPrint("SIMPLEAUDIOSAMPLE: Failed to map MDL\n");
                status = STATUS_INSUFFICIENT_RESOURCES;
                goto End;
            }

            // Log received samples before getting stream
            SHORT* samples = (SHORT*)buffer;
            DbgPrint("SIMPLEAUDIOSAMPLE: First 5 samples: %d %d %d %d %d\n",
                samples[0], samples[1], samples[2], samples[3], samples[4]);

            // Get the miniport instance
            PortClassDeviceContext* deviceContext = (PortClassDeviceContext*)DeviceObject->DeviceExtension;
            if (!deviceContext) {
                DbgPrint("SIMPLEAUDIOSAMPLE: No device context\n");
                status = STATUS_INVALID_DEVICE_STATE;
                goto End;
            }

            CMiniportWaveRT* miniport = (CMiniportWaveRT*)deviceContext->m_pCommon;
            if (!miniport) {
                DbgPrint("SIMPLEAUDIOSAMPLE: No miniport\n");
                status = STATUS_INVALID_DEVICE_STATE;
                goto End;
            }

            // Get active stream
            stream = miniport->GetActiveCaptureStream();
            if (!stream) {
                DbgPrint("SIMPLEAUDIOSAMPLE: No active stream available\n");
                status = STATUS_INVALID_DEVICE_STATE;
                goto End;
            }

            // Write to ring buffer through stream instance
            if (stream->m_RingBuffer) {
                status = stream->m_RingBuffer->Write(
                    (PBYTE)buffer,
                    irpStack->Parameters.DeviceIoControl.OutputBufferLength);

                if (NT_SUCCESS(status)) {
                    bytesTransferred = irpStack->Parameters.DeviceIoControl.OutputBufferLength;
                    DbgPrint("SIMPLEAUDIOSAMPLE: Wrote %lu bytes to ring buffer\n", (ULONG)bytesTransferred);
                }
                else {
                    DbgPrint("SIMPLEAUDIOSAMPLE: Ring buffer write failed with status 0x%x\n", status);
                }
            }
            else {
                DbgPrint("SIMPLEAUDIOSAMPLE: Stream has no ring buffer\n");
                status = STATUS_DEVICE_NOT_READY;
            }
        }
        __except (EXCEPTION_EXECUTE_HANDLER)
        {
            DbgPrint("SIMPLEAUDIOSAMPLE: Exception in IOCTL handler\n");
            status = STATUS_INVALID_DEVICE_REQUEST;
        }
    }
    else
    {
        return PcDispatchIrp(DeviceObject, Irp);
    }

End:
    if (stream) {
        stream->Release();
    }

    Irp->IoStatus.Status = status;
    Irp->IoStatus.Information = bytesTransferred;
    IoCompleteRequest(Irp, IO_NO_INCREMENT);
    return status;
}

This keeps printing "No active stream available" in the logs. I suspect the active stream is never actually being set in minwavert.cpp, so my lookup finds nothing. Is my approach to getting the active stream even correct? I'm not sure where else to plug the ring buffer in.

I feel like I'm otherwise quite close to rendering the actual audio to the virtual mic, and it feels great to at least have the MDL working, one baby step at a time!

Is there an application using the microphone when you run this? There won't be an "active" capture stream unless an app is reading from the microphone.

Interestingly, you've implemented this upside-down from what I did. You're trying to push from the ioctl handler into the stream, whereas I had the ioctl copy into ring buffers that were part of the CAdapterCommon, and let the individual streams pull from there. The advantage is that no one tries to access anything before it exists. I've found, in general, that it's better to have lower-level objects pull from above, rather than have higher-level objects try to dig deep into the hierarchy.

That doesn't mean it couldn't be done your way; it just means you need to make sure the stream exists first, and you need to decide what to do if it doesn't exist yet.
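
The pull design described above can be sketched roughly as follows. This is a user-mode C++ illustration, not driver code: SharedCaptureRing, Push, Pull, and Fill are hypothetical names, and a real version living in CAdapterCommon would need nonpaged memory and proper synchronization between the IOCTL path and the streams.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical byte ring buffer owned by the adapter-level object.
// Sketch only: no locking, and std::vector stands in for nonpaged memory.
class SharedCaptureRing {
public:
    explicit SharedCaptureRing(std::size_t capacity)
        : data_(capacity), read_(0), fill_(0) {
        assert(capacity > 0);
    }

    // Producer side (the IOCTL handler): copy in as much as fits and
    // return the number of bytes accepted. Anything that does not fit is dropped.
    std::size_t Push(const std::uint8_t* src, std::size_t len) {
        std::size_t n = std::min(len, data_.size() - fill_);
        std::size_t write = (read_ + fill_) % data_.size();
        std::size_t first = std::min(n, data_.size() - write);
        std::memcpy(data_.data() + write, src, first);
        std::memcpy(data_.data(), src + first, n - first);  // wrapped part
        fill_ += n;
        return n;
    }

    // Consumer side (a capture stream): always fills len bytes of dst,
    // padding with zeros (silence for signed PCM) when the ring runs dry.
    // Returns the number of real, non-silence bytes delivered.
    std::size_t Pull(std::uint8_t* dst, std::size_t len) {
        std::size_t n = std::min(len, fill_);
        std::size_t first = std::min(n, data_.size() - read_);
        std::memcpy(dst, data_.data() + read_, first);
        std::memcpy(dst + first, data_.data(), n - first);  // wrapped part
        std::memset(dst + n, 0, len - n);
        read_ = (read_ + n) % data_.size();
        fill_ -= n;
        return n;
    }

    std::size_t Fill() const { return fill_; }

private:
    std::vector<std::uint8_t> data_;
    std::size_t read_;  // index of the oldest unread byte
    std::size_t fill_;  // number of unread bytes
};
```

With this shape, the IOCTL handler only ever calls Push, and each capture stream calls Pull when it needs data, so the handler never has to know whether a stream exists yet.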

I got the audio working! I can hear my audio buffer from the user-mode Unity app in my virtual mic device!

I took into account what you said and ditched the get-active-stream approach. Yeah, that felt unnecessary to me too, not to mention how difficult it would be to get the right stream in real-life scenarios on a crowded OS. I eventually went with setting a global pointer to my circular buffer, and that did the trick! I've now slimmed down the project too, removing the speaker endpoints and renaming the sample methods to my own names.

Now to fix the crackling and buffer sizes. I know a VM has latency issues with this. There's another thread with detailed input on buffer running dry or overflowing, so I think I can get by.

Thanks for your valuable input, Cheers!

Hey Tim, been struggling quite a bit with buffer running dry, quite consistently. I've read through a similar thread on this (Saliom's thread) and he seemed to eventually move to MSVAD MicArray and got better results with buffer management, but I feel that's overkill and there should be a better way in theory of solving this within Simple Audio Sample itself.

Firstly, I test repeated driver installs on a VM so that I can simply revert to a snapshot instead of uninstalling the driver and restarting every time I make a change, and I sync code from the host via a shared folder. But VMs are known to have poor timing, so I'm afraid I'm chasing my own tail and probably seeing more underruns than I would on a real machine. Of course, I can't test on the host or a real machine, as that would mean tediously restarting to reinstall the driver after each change. Is there a better way to test here?

Now for the main part: I am struggling to get the timing right between the driver and my user-mode app (Unity), even though I've matched the sample rate, channel count, and audio buffer size on both sides. Even with quite a big buffer (8k), zeros are ALWAYS being written every (x) milliseconds, at a constant interval. From the waveform Audacity records from the virtual mic, this looks like the buffer running dry. I expected matching buffer sizes to prevent that: Unity is using a test audio clip as the output source, so unlike with UDP or other network-based sources, I always know how large my next block of buffered data will be. Yet the DMA still doesn't seem to be timed right against the user's buffered data.

What's an actual solution to the timing issues/underruns with Simple Audio Sample that doesn't rely on switching to MSVAD or anything else? I like how clean and slim this project is for the purpose.

I can't seem to wrap my head around what needs to be done conceptually to handle such unpredictable timing issues, as I want this to be reliable on most user machines with varying hardware. Even increasing the driver's audio buffer to a super large (512*1024) does nothing.

Also, the pitch of the received audio seems to be an entire octave higher than the actual clip, even though the sample rates are matched and the speed/tempo of the received audio seems okay.

Any input on this?

You should be able to use devcon disable, copy the new binary in, and devcon enable, without having to restart the VM. As long as the device is disabled, you can replace the binary.

If the sample rates match and the data is an octave higher, then that PROBABLY means you are supplying mono data when your stream is expecting stereo. If you offer stereo, Audio Engine will choose it. Are you matching the number of channels?
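
The arithmetic behind that octave can be worked through in a few lines. This is only an illustrative sketch; PlaybackSeconds and SpeedFactor are hypothetical helper names, not anything from the sample.

```cpp
#include <cassert>
#include <cstddef>

// Seconds of audio that a block of samples represents when the consumer
// treats it as interleaved PCM with the given channel count and sample rate.
double PlaybackSeconds(std::size_t totalSamples, unsigned channels, unsigned sampleRate) {
    std::size_t frames = totalSamples / channels;  // one frame = one sample per channel
    return static_cast<double>(frames) / sampleRate;
}

// How much faster the audio plays when the producer wrote producerChannels
// samples per frame but the consumer reads consumerChannels per frame.
// A factor of 2 corresponds to a pitch one octave higher.
double SpeedFactor(unsigned producerChannels, unsigned consumerChannels) {
    return static_cast<double>(consumerChannels) / producerChannels;
}
```

For example, 48000 mono samples are one second of audio at 48000 Hz, but read as interleaved stereo they are only 24000 frames, so they are consumed in half a second. Each channel effectively sees every other sample, which doubles the frequency.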

Preventing underflow and overflow certainly is a huge challenge. The Audio Engine process runs in real time and expects the hardware to follow suit. I don't usually like this solution, but have you tried bumping up your thread priority? You just need to stay a little bit ahead of the engine.
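
One way to picture "staying a little bit ahead" is a toy simulation in which the producer delivers the right amount of data on average but with jitter. This is an assumption-laden sketch, not a model of the real Audio Engine: CountUnderruns, the tick model, and the prefill idea are all hypothetical illustrations.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Counts how many consumer ticks find too little data. On each tick the
// producer first delivers produced[i] frames, then the consumer takes
// framesPerTick. prefillFrames is the lead built up before the consumer starts.
std::size_t CountUnderruns(const std::vector<std::size_t>& produced,
                           std::size_t framesPerTick,
                           std::size_t prefillFrames) {
    std::size_t fill = prefillFrames;
    std::size_t underruns = 0;
    for (std::size_t p : produced) {
        fill += p;
        if (fill < framesPerTick) {
            ++underruns;  // the consumer would pad the rest with silence
            fill = 0;     // whatever was there has been used up
        } else {
            fill -= framesPerTick;
        }
    }
    return underruns;
}
```

With a delivery pattern of 0, 960, 0, 960 frames against a consumer that takes 480 per tick, starting empty produces an underrun on the first tick, while starting with a 480-frame lead produces none, even though the average rates are identical in both cases.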

Hey @Tim_Roberts, I solved both the wrong pitch and the underruns by simply setting the Unity app's output sample rate to double what the driver and DLL bridge code expected, and the buffer size to the same as what the driver expected. So I learned that for stereo audio I need to process at double the sample rate, since Unity sends out samples in L R L R fashion within the output stream. Audio seems great now: no lags, no crackles, and no underruns. I'm just not clear on why doubling the sample rate like this worked, though.

For reference, the test audio track is a 10-second looping saxophone clip at 48000 Hz, stereo. With the DSP buffer size at 4096 on the Unity side, keeping the sample rate the same as the rate the driver and DLL expected (48000) chops up the track and raises the pitch. At 96000, the track sounds proper, without underruns. By this logic, 1920000 should lower the pitch, yet at that sample rate the audio also sounds proper. This is puzzling me.

I did solve my problem, but I do want to know conceptually why this worked.