First post: Student WDM drivers - intern-ready?

Hi all,
Long-time lurker, first-time poster. Sophomore comp eng student here. Built these after studying Windows Internals Ch1-4 + the thread.sys example.

kbd_latency.sys - Keyboard input optimizer (thread priority/affinity)

mouse_multiplier.sys - Left-click doubler via IRP injection. It uses "actual" IRPs as templates to generate "synthetic" IRPs, which I fill with LEFT_MOUSE_BUTTON_DOWN and LEFT_MOUSE_BUTTON_UP events (basically, a left mouse button click) in the synthetic IRP's buffer (there's a payload sketch after the code snippets below).

Code snippets/Highlights:

// kbd_latency.sys - Completion context
PMY_COMPLETION_CONTEXT ctx = ExAllocatePoolWithTag(NonPagedPool, sizeof(...), 'CTXT');
if (ctx) {
    ctx->thread = PsGetCurrentThread();
    ObReferenceObject(ctx->thread);
    KeSetPriorityThread(ctx->thread, HIGH_PRIORITY);
    // [...]
} 


// mouse_multiplier.sys - IRP cloning
IoQueueWorkItem(workItem, [](PDEVICE_OBJECT, PVOID Context) {
    PWORKITEM_CONTEXT wctx = (PWORKITEM_CONTEXT)Context;
    PIRP newIrp = IoAllocateIrpEx(/*...*/);
    // Inject synthetic input
    // [...]
}, /* queue type and context args omitted */); [code link]
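
For concreteness, here's roughly what the "synthetic input" payload looks like, assuming the standard MOUSE_INPUT_DATA packets from ntddmou.h (MOUSE_LEFT_BUTTON_DOWN / MOUSE_LEFT_BUTTON_UP are the stock button flags; everything else here is illustrative):

// Two MOUSE_INPUT_DATA packets = one synthetic left click (down + up).
MOUSE_INPUT_DATA click[2];
RtlZeroMemory(click, sizeof(click));
click[0].ButtonFlags = MOUSE_LEFT_BUTTON_DOWN;
click[1].ButtonFlags = MOUSE_LEFT_BUTTON_UP;
// These packets are what get copied into the synthetic IRP's buffer before it is completed.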

Asking because:
Applying for kernel internships (or any low-level internship) soon. Would appreciate honest takes:

Does this show base-level readiness for junior kernel work?

What's the biggest "oh hell no" in the approach?

If you were hiring, would this code get me screened in or out?

Be gentle - first driver project and new to forums. Just trying to gauge if I'm on the right track.

User-mode threads can set their own priority. No driver is required. As a general rule, however, screwing with priorities is almost always counterproductive. After 35 years of tuning, the operating system scheduler is very, very good at its job.
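
(For reference, the user-mode version of this is a couple of documented Win32 calls; a minimal sketch, no driver involved:)

#include <windows.h>

int main(void)
{
    // Boost the calling thread's priority from user mode.
    if (!SetThreadPriority(GetCurrentThread(), THREAD_PRIORITY_HIGHEST)) {
        return 1; // GetLastError() says why
    }

    // Affinity is likewise settable from user mode:
    SetThreadAffinityMask(GetCurrentThread(), 0x1);

    // ... latency-sensitive work here ...

    SetThreadPriority(GetCurrentThread(), THREAD_PRIORITY_NORMAL);
    return 0;
}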

Do you ever release your ObReferenceObject? What do you do if the allocation fails (which is, of course, very unlikely)? Do you return an error, or do you just ignore it and go on?


Hi Tim,

Thank you for the critique. I'll break it into three points and reply to each.

  1. You're absolutely right about the scheduler's maturity; in fact, Windows' keyboard input latency is already well optimized. My goal was kernel-level latency work, not just priority tweaks: the driver also handles IRP pass-through, core affinity, and context preservation, which user mode can't do (there's a rough pass-through sketch further down).

  2. I always dereference the thread object whose reference count I bumped; that happens in my completion routine. Here's a code snippet:

NTSTATUS CompletionRoutine(PDEVICE_OBJECT DeviceObject, PIRP Irp, PVOID Context) {
    PMY_COMPLETION_CONTEXT ctx = (PMY_COMPLETION_CONTEXT)Context;
    // ....
    ObDereferenceObject(ctx->thread); // always dereferenced, no matter what.
    ExFreePoolWithTag(ctx, 'CTXT');
    return STATUS_SUCCESS;
}
  3. Good eye! My original snippet was written wrong. I do check whether ctx exists, and if it doesn't I return STATUS_INSUFFICIENT_RESOURCES. (I wasn't sure which NTSTATUS value to pick, but this seemed the most appropriate one.) Here's the actual code from my driver:
// Allocates a contextinfo struct holding oldaffinity, oldpriority and the targeted thread.
// contextinfo is later passed to IoSetCompletionRoutine as the "Context" argument.
PMY_COMPLETION_CONTEXT contextinfo = (PMY_COMPLETION_CONTEXT)ExAllocatePoolWithTag(NonPagedPool, sizeof(MY_COMPLETION_CONTEXT), 'CTXT'); // non-paged pool, sized for the struct
if (!contextinfo) { // allocation failed: report it and bail out
    KdPrint(("Error: Failed to allocate memory\n"));
    return STATUS_INSUFFICIENT_RESOURCES;
}

Sorry again about that.
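
For point 1, here's a rough sketch of the pass-through pattern I mean, reusing the MY_COMPLETION_CONTEXT / CompletionRoutine names from above (FilterDispatchRead and NextLowerDevice are placeholders, and error handling is trimmed):

// Filter dispatch (sketch): pass the IRP down with a completion routine that
// receives contextinfo, so saved state can be restored on the way back up.
NTSTATUS FilterDispatchRead(PDEVICE_OBJECT DeviceObject, PIRP Irp)
{
    UNREFERENCED_PARAMETER(DeviceObject);

    PMY_COMPLETION_CONTEXT contextinfo = (PMY_COMPLETION_CONTEXT)
        ExAllocatePoolWithTag(NonPagedPool, sizeof(MY_COMPLETION_CONTEXT), 'CTXT');
    if (!contextinfo) {
        // Don't just return: complete the IRP so it isn't left dangling.
        Irp->IoStatus.Status = STATUS_INSUFFICIENT_RESOURCES;
        IoCompleteRequest(Irp, IO_NO_INCREMENT);
        return STATUS_INSUFFICIENT_RESOURCES;
    }

    // Copy our stack location down and hook completion with our context.
    IoCopyCurrentIrpStackLocationToNext(Irp);
    IoSetCompletionRoutine(Irp, CompletionRoutine, contextinfo, TRUE, TRUE, TRUE);

    // NextLowerDevice = the device object returned by IoAttachDeviceToDeviceStack.
    return IoCallDriver(NextLowerDevice, Irp);
}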

Also, quick question: how would you approach reducing context switches for latency-sensitive I/O without priority tweaks? I was considering implementing IRP batching. Would that work?