Injecting dlls across sessions

Since I can no longer use CreateRemoteThread across sessions, I’ve
been looking at other ways of injecting dlls into running processes.

I’ve realised that RtlCreateUserThread allows thread creation across
sessions at the expense of not informing CSRSS.

As I just want my thread to run a DllMain routine and then exit, I figured
the CSRSS thing wasn’t an issue.

The call to RtlCreateUserThread succeeds; however, the created thread grabs
a lot of CPU time and never returns.

The thread’s stack is not what I expected, and the thread raises an
exception after the context switch.

Does anyone have any ideas?

0: kd> !thread
THREAD 81cd55b0 Cid 0274.0db8 Teb: 7ffae000 Win32Thread: 00000000 RUNNING on processor 0
Waiting for reply to LPC MessageId 001e1ff8:
Current LPC port e14c1610
Not impersonating
DeviceMap e10019a8
Owning Process 0 Image:
Attached Process 81dfada0 Image: winlogon.exe
Wait Start TickCount 31171 Ticks: 1 (0:00:00:00.015)
Context Switch Count 1935856
UserTime 00:00:00.250
KernelTime 00:01:37.796
Start Address kernel32!LoadLibraryW (0x7c80aeeb)
Stack Init f519d000 Current f519c348 Base f519d000 Limit f519a000 Call 0
Priority 14 BasePriority 13 PriorityDecrement 0 DecrementCount 16
ChildEBP RetAddr Args to Child
f519c2f4 80545161 00000001 ffdff902 000000d1 nt!RtlpBreakWithStatusInstruction (FPO: [1,0,0])
f519c2f4 806e6a9b 00000001 ffdff902 000000d1 nt!KeUpdateSystemTime+0x175 (FPO: [0,2] TrapFrame @ f519c308)
f519c380 804fcf1b 81cd5778 81cd55b0 8055d0c0 hal!KeAcquireQueuedSpinLock+0x4f (FPO: [0,0,0])
f519c398 805a3af3 81ffbcc8 00000001 00000001 nt!KeReleaseSemaphore+0x11 (FPO: [Non-Fpo])
f519c3cc 805a3c7d e14c1610 81ffbcc8 f519c400 nt!LpcpRequestWaitReplyPort+0x3ff (FPO: [Non-Fpo])
f519c3e4 80643557 e14c1610 f519c578 f519c400 nt!LpcRequestWaitReplyPortEx+0x21 (FPO: [Non-Fpo])
f519c560 80643612 f519c578 e14c1610 00000000 nt!DbgkpSendApiMessageLpc+0x49 (FPO: [Non-Fpo])
f519c5f0 804fe805 f519c9d8 00000000 00000001 nt!DbgkForwardException+0x84 (FPO: [Non-Fpo])
f519c9b0 805028d9 f519c9d8 00000000 f519cd64 nt!KiDispatchException+0x38f (FPO: [Non-Fpo])
f519cd34 80544f2f 010efd18 010efd34 00000000 nt!KiRaiseException+0x175 (FPO: [Non-Fpo])
f519cd50 8054164c 010efd18 010efd34 00000000 nt!NtRaiseException+0x33
f519cd50 00000000 010efd18 010efd34 00000000 nt!KiFastCallEntry+0xfc (FPO: [0,0] TrapFrame @ f519cd64)
WARNING: Frame IP not in any known module. Following frames may be wrong.
00000000 00000000 00000000 00000000 00000000 0x0

Why do you need that?

In most cases you can accomplish the task without injecting into a process.

As for your problem, you can disassemble the system code to see the exact parameters it passes to RtlCreateUserThread and work out what’s wrong.

One of the most common mistakes is to load a dll whose DllMain uses APIs not exported from kernel32. Another mistake is to inject too early, before the loader has finished resolving the executable’s imports.

Another mistake is to inject into DRM processes, which are protected by the OS (well, this protection is easily removed :) though).

xxxxx@broadcom.com wrote:

> Why do you need that?

I need to detour ExitWindowsEx calls in winlogon.

xxxxx@shcherbyna.com wrote:

> In most cases you can accomplish the task without injecting into a
> process.

I wish there was another way of doing what I need in usermode, which is to
detour ExitWindowsEx in winlogon.
I realise I could try to inject using my driver, but that opens up a whole
new problem related to threads in kmode.

> One of the most common mistakes is to load a dll whose DllMain uses APIs
> not exported from kernel32. Another mistake is to inject too early,
> before the loader has finished resolving the executable’s imports.

This is indeed a scenario I’m in. Using detours relies on detoured.dll.
However, I was of the understanding that detours.lib adds some clever
trickery to get around this problem, as there was no problem when using
CreateRemoteThread.

> Another mistake is to inject into DRM processes, which are protected by
> the OS (well, this protection is easily removed :) though).

I assume you refer to switching the flag in the object’s EPROCESS struct?
Winlogon isn’t one of these, so I’m safe here.

Ged.

>>This is indeed a scenario I’m in. Using detours relies on detoured.dll. However, I was of the understanding that detours.lib adds some clever trickery to get around this problem, as there was no problem when using CreateRemoteThread.<<

(Disclaimer: I am not going to dwell on the ethical side of hooking ExitWindowsEx and will skip straight to the technical part.)

Well, to confirm this theory, try injecting an empty stub dll, linked only with kernel32.dll, into winlogon. If that works, the dependencies of your library are most likely causing the problem.

If the stub dll works, then as a test try loading the detour library via LoadLibrary/GetProcAddress on a separate thread. Create that thread in DllMain on the process attach event, and have it Sleep for, say, 10 seconds before it calls LoadLibrary and does the rest of the work.

xxxxx@shcherbyna.com wrote:

> (Disclaimer: I am not going to dwell on the ethical side of hooking
> ExitWindowsEx and will skip straight to the technical part.)

I’m fully aware of the ethical side of hooking this API, especially in a
process like winlogon.
I’d be happy to discuss any suggested alternatives if you have any ideas?

I need to take a snapshot of processes at logoff and do some synchronization
work.
This needs to be done before CSRSS knows about the logoff and starts firing
off WM_QUERYENDSESSION to running processes and killing them off.

So my current solution is to detour ExitWindowsEx in winlogon and stall the
logoff process while I do my sync work.
I then call the real API allowing CSRSS to know about the call and logoff to
run.

Thanks,
Ged.

Depends on what you mean by “and do some synchronization work.”

Depending on the task, I would choose a less harmful approach. What quickly comes to mind is the following:

  1. I would track the list of processes in real time, i.e., by registering a callback using PsSetCreateProcessNotifyRoutine and storing all active processes in my tree. I would do some portion of the “synchronization work” in the callbacks. I assume you need to hash the files and most likely parse the PE header?

  2. I would register a session change notification handler and wait until WTS_SESSION_LOGOFF arrives. Once the system is going down, I already have the processes preprocessed and can store the results or do something else. (Well, you can also register an Unload handler in your driver and finish the job there.)
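
To make the bookkeeping in step 1 a bit more concrete, here is a minimal Python sketch. It is only a user-mode model, not driver code: `on_process_event` takes its arguments in the same order as a process-notify callback (parent ID, process ID, create flag), while `ProcessTracker`, its fields, and the PIDs are hypothetical names and values made up for illustration. A real driver would keep this state in a locked kernel data structure.

```python
from dataclasses import dataclass, field


@dataclass
class ProcessTracker:
    """Hypothetical user-mode model of the driver's process table."""
    live: dict = field(default_factory=dict)      # pid -> parent pid
    finished: list = field(default_factory=list)  # pids seen exiting

    def on_process_event(self, parent_pid: int, pid: int, create: bool) -> None:
        # Same argument order as a process-notify callback:
        # parent ID, process ID, create flag.
        if create:
            self.live[pid] = parent_pid
        elif pid in self.live:
            # A real driver would do its per-process work here,
            # before the entry is dropped.
            del self.live[pid]
            self.finished.append(pid)


tracker = ProcessTracker()
tracker.on_process_event(4, 628, True)       # made-up PIDs
tracker.on_process_event(628, 3512, True)
tracker.on_process_event(628, 3512, False)   # 3512 exits
print(sorted(tracker.live))                  # PIDs still being tracked
```

The point of the model is that the table is always current, so nothing has to be enumerated from scratch when the logoff notification arrives.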

The problem with this approach is that I need to sync exact process details
at the time of logoff. Running periodic syncs (which I also do) isn’t enough
to be sure I don’t miss any data.
The details gathered of each process can be quite extensive and require time
to process.

Additionally the software I work on allows users to define triggers which
may be running at the time of logoff. It’s imperative that any scripts /
processes on these triggers are allowed to run until completion.

If I allowed logoff to continue as normal, it’s highly likely that CSRSS
will have started its loop of running processes and some of them will have
been terminated before I have the chance to collect data.

Therefore the main problem (in my opinion) is how best to stall logoff in
the least intrusive way.
My only solutions are to detour ExitWindowsEx, or to intercept the message
sent from CSRSS to winlogon (a window message in NT5 or a pipe message in NT6).

The detour method seemed like the lesser of two evils.

Ged.

-----Original Message-----
From: xxxxx@lists.osr.com
[mailto:xxxxx@lists.osr.com] On Behalf Of
xxxxx@shcherbyna.com
Sent: 19 November 2010 10:07
To: Windows System Software Devs Interest List
Subject: RE:[ntdev] Injecting dlls across sessions

NTDEV is sponsored by OSR

For our schedule of WDF, WDM, debugging and other seminars visit:
http://www.osr.com/seminars

To unsubscribe, visit the List Server section of OSR Online at
http://www.osronline.com/page.cfm?name=ListServer

Collect process info periodically, but for each process: every N seconds
during the process’s lifetime, or when it exits. You know when it exits, as you
can register an image load/unload callback and track process exit events.

Now you don’t have a problem when the OS reboots: on reboot you get
callbacks for each dying process, and you process them synchronously (your
code runs in the context of the process; if you don’t return, ExitProcess does
not return either), thus delaying the whole OS shutdown sequence. You would not
miss even a single process here, and everything is accomplished using
documented functions.

As for “triggers on shutdown” - well, this is an interesting task. This
should be done in a u.m.a. (user mode application), not in a k.m.d., but I guess
you can accomplish it with the approach above. These are just thoughts.

Basically, you have several time points here: t1 is when a process exit
callback fires, t2 is when all N processes have signalled exit (because you store
them in your tree, you can monitor the progress of processes exiting), and t3 is
your driver’s unload callback. Playing with these can help you, I
guess.
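
The three time points can be modelled roughly as follows. This is a hedged Python sketch of the idea only, not kernel code: `Phase`, `ShutdownTimeline`, the `collect` callable, and the PIDs are all hypothetical. It assumes, as described above, that the exit of a process does not complete until the per-process work has returned.

```python
from enum import Enum


class Phase(Enum):
    RUNNING = "running"
    EXITS_PENDING = "t1: exit callbacks arriving"
    ALL_EXITED = "t2: every tracked process has exited"
    UNLOADED = "t3: driver unload"


class ShutdownTimeline:
    """Hypothetical model of the t1/t2/t3 points; not driver code."""

    def __init__(self, tracked_pids):
        self.pending = set(tracked_pids)
        self.phase = Phase.RUNNING

    def on_process_exit(self, pid, collect):
        # t1: called once per exiting process; the exit is held up
        # until collect() returns.
        if pid not in self.pending:
            return
        collect(pid)
        self.pending.discard(pid)
        # t2 is reached when nothing is left in the pending set.
        self.phase = Phase.ALL_EXITED if not self.pending else Phase.EXITS_PENDING

    def on_unload(self):
        # t3: last chance to flush whatever was gathered.
        self.phase = Phase.UNLOADED


collected = []
timeline = ShutdownTimeline([628, 3512])     # made-up PIDs
timeline.on_process_exit(3512, collected.append)
timeline.on_process_exit(628, collected.append)
print(timeline.phase.name, collected)
```

Any work that must see every process would hang off the transition to t2, and anything that only needs to be written out can wait for t3.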

Personally, I would not make triggers on shutdown. I want my machine to turn
off quickly, as soon as possible after I press shutdown …


“I need to detour ExitWindowsEx calls in winlogon.”

“I need to take a snapshot of processes at logoff and do some synchronization
work. This needs to be done before CSRSS knows about the logoff and starts to fire
off WM_QUERYENDSESSION to running processes, and killing them off”

It’s not clear what problem you have. Do you need to commit some large state to disk on shutdown/restart? WM_QUERYENDSESSION doesn’t kill a process off right away; you can spend quite a while in the message handler. Why doesn’t that work for you?

Is it a client machine? Server machine? Does it have a console? Are you concerned about user session applications or services?