Wrong address of EntryPoint value in LDR_DATA_TABLE_ENTRY

Hi,

i’m developing an anti-malware driver as a project for school, and part of the code is looking up the entry point of the loaded module (the exe) in a CreateThreadNotifyRoutine callback.

I can’t manage to get the correct address, it’s always ‘close’, but i don’t understand what i’m doing wrong here. If I inspect the structures in windbg, it shows me the right adresses, by issuing !peb and and then using this command:

(!list -t ntdll!_LIST_ENTRY.Flink -x "dt ntdll!_LDR_DATA_TABLE_ENTRY @$extret" <ldr.inloadordermodulelist from peb>

I try to do the same thing in my code, but as i said, for some reason the addresses are never correct, and sometimes the DbgPrint() message in my code doesn’t show?! For example: when i start notepad it never shows.

Also if someone could give me a bit of a advice of how to patch the entry point when i got it, it would also be very much appreciated.

Here is my code from the CreateThreadNotifyRoutine:

VOID createThreadNotify(
IN HANDLE ProcessId,
IN HANDLE ThreadId,
IN BOOLEAN Create) {

NTSTATUS status;
PEPROCESS peProcess;
PPEB peb;
PPEB_LDR_DATA LdrData = NULL;
PLDR_DATA_TABLE_ENTRY LdrTableEntry = NULL;
LIST_ENTRY *Flink, *Blink;

if(Create) {
status = PsLookupProcessByProcessId(ProcessId,&peProcess);

if(status == STATUS_SUCCESS) {
peb = PsGetProcessPeb(peProcess);

if(peb) {
LdrData = peb->Ldr;

if(LdrData) {
Flink = LdrData->InMemoryOrderModuleList.Flink;
Blink = LdrData->InMemoryOrderModuleList.Blink;

while(Flink != Blink) {
LdrTableEntry = (PLDR_DATA_TABLE_ENTRY)Flink;
if(LdrTableEntry) {
DbgPrint(“Module %wZ has EntryPoint: %08x\n”,&LdrTableEntry->FullModuleName,LdrTableEntry->EntryPoint);
}
Flink = Flink->Flink;
}
}
}
}
}
}</ldr.inloadordermodulelist>

> Hi,

i’m developing an anti-malware driver as a project for school, and part of
the code is looking up the entry point of the loaded module (the exe) in a
CreateThreadNotifyRoutine callback.

I can’t manage to get the correct address, it’s always ‘close’, but i
don’t understand what i’m doing wrong here. If I inspect the structures in
windbg, it shows me the right adresses, by issuing !peb and and then using
this command:

(!list -t ntdll!_LIST_ENTRY.Flink -x "dt ntdll!_LDR_DATA_TABLE_ENTRY
@$extret" <ldr.inloadordermodulelist from peb>
>
> I try to do the same thing in my code, but as i said, for some reason the
> addresses are never correct, and sometimes the DbgPrint() message in my
> code doesn’t show?! For example: when i start notepad it never shows.
>
> Also if someone could give me a bit of a advice of how to patch the entry
> point when i got it, it would also be very much appreciated.
>
> Here is my code from the CreateThreadNotifyRoutine:
>
>
> VOID createThreadNotify(
> IN HANDLE ProcessId,
> IN HANDLE ThreadId,

Actually, these are not great choices for names. Process and thread IDs
are not HANDLEs, they are DWORDs. So my bogometer s moving off the green
into the low yellow.

> IN BOOLEAN Create) {
>
> NTSTATUS status;
> PEPROCESS peProcess;
> PPEB peb;
> PPEB_LDR_DATA LdrData = NULL;
> PLDR_DATA_TABLE_ENTRY LdrTableEntry = NULL;
> LIST_ENTRY *Flink, Blink;
>
> if(Create) {
> status = PsLookupProcessByProcessId(ProcessId,&peProcess);
>
> if(status == STATUS_SUCCESS) {
> peb = PsGetProcessPeb(peProcess);
>
> if(peb) {
> LdrData = peb->Ldr;
>
> if(LdrData) {
> Flink = LdrData->InMemoryOrderModuleList.Flink;
> Blink = LdrData->InMemoryOrderModuleList.Blink;
>
> while(Flink != Blink) {
> LdrTableEntry = (PLDR_DATA_TABLE_ENTRY)Flink;
> if(LdrTableEntry) {
> DbgPrint(“Module %wZ has EntryPoint:
> %08x\n”,&LdrTableEntry->FullModuleName,LdrTableEntry->EntryPoint);
> }
> Flink = Flink->Flink;
> }
> }
> }
> }
> }
> }
>
>
What is missing here is a lot of DbgPrint calls in the (non-existent)
else-parts of all those conditionals. You have exactly one DbgPrint that
requires everything be perfect. So if anything whatsoever goes wrong, you
have no idea what failed, or why.

As a style issue, my own taste is to never nest conditionals more than one
deep, which is often impractical, so I am willing to go to two levels.
This probably comes from my academic background of rule-based systems.
The problem is that with deep nesting it is often hard to follow the
logic. Then my OCD kicks in and I “name” the blocks (in fact, my text
editor does this by appropriate magic) so all my nesting looks like
if(something)
{ /
something /
… stuff
} /
something /
else
{ /
not something /
…other stuff
} /
not something */

This model has a deep theoretical basis in cognitive psychology, and a
remembered fondness for a language I used a lot in the 1970s that
supported this as a first-class grammatical concept enforced by the
compiler.

You really, really need to put DbgPrint calls in all places where
something fails. Single-stepping through this code would also give you a
lot of information, but the DbgPrint (or KdPrint) method will record all
the failure modes.

When using DbgPrint, a useful trick is to put some unique string at the
front of every message, e.g. “MYDRIVER:” where you put the actual name of
your driver in. One advantage of this technique is that you can do
searches through the kernel spew and find your messages, or save the debug
log as a file and write an awk, Perl, Python, or other favorite language
program to do data reduction of thousands of lines of kernel spew to the
few you care about.

Note that this callback, and I’m inferring this from the name, is invoked
whenever a thread is created. There are lots of thread creation events,
and, relatively speaking, only a few of them represent module entry
points. And CreateRemoteThread can be the basis of a wonderful number of
near-undetectable exploits. So looking at the module entry point is
largely irrelevant for security. If you fail creation calls because they
don’t reference the module entry you would break every multithreaded app
in existence.

The lack of comments in the code does not suggest what the code is
intended to do or why the fields you are looking at are part of the
solution. So I’m doing a lot of guessing where I should see a spec of the
intention or goal of the code. Just a hint: I spent a lot of time
teaching, and if I assigned a project to implement a security method, and
got code that looked like the above, I would be inclined to give it at
best a B-. Code that “works” is not nearly as useful as a demonstration
of understanding of the problem.
joe
>
> —
> NTDEV is sponsored by OSR
>
> OSR is HIRING!! See http://www.osr.com/careers
>
> For our schedule of WDF, WDM, debugging and other seminars visit:
> http://www.osr.com/seminars
>
> To unsubscribe, visit the List Server section of OSR Online at
> http://www.osronline.com/page.cfm?name=ListServer
></ldr.inloadordermodulelist>

> -----Original Message-----

From: xxxxx@lists.osr.com [mailto:bounce-533807-
xxxxx@lists.osr.com] On Behalf Of xxxxx@flounder.com
Sent: Monday, May 20, 2013 11:13 PM
To: Windows System Software Devs Interest List
Subject: Re: [ntdev] Wrong address of EntryPoint value in
LDR_DATA_TABLE_ENTRY
> Here is my code from the CreateThreadNotifyRoutine:
>
>
> VOID createThreadNotify(
> IN HANDLE ProcessId,
> IN HANDLE ThreadId,
*****
Actually, these are not great choices for names. Process and thread IDs
are not HANDLEs, they are DWORDs. So my bogometer s moving off the green
into the low yellow.
*****

Joe, although the rest of your response was fine, the OP is, indeed, correct on this one, as he has the correct signature for the callback:

VOID
(*PCREATE_THREAD_NOTIFY_ROUTINE) (
IN HANDLE ProcessId,
IN HANDLE ThreadId,
IN BOOLEAN Create
);

Phil B

Not speaking for LogRhythm
Phil Barila | Senior Software Engineer
720.881.5364 (w)
A LEADER 2013 SIEM Magic Quadrant
Perfect 5-Star Rating in SC Magazine for 5 Consecutive Years

My rule is make it two level deep at most. If for any wierd reasons that it
has to go 3~4 depth, it MUST fit in one screen. I guess I’m not a very
contextual person.

Calvin

As a style issue, my own taste is to never nest conditionals more than one
deep, which is often impractical, so I am willing to go to two levels.

> My rule is make it two level deep at most. If for any wierd reasons that

it
has to go 3~4 depth, it MUST fit in one screen. I guess I’m not a very
contextual person.

Calvin

Note to readers: much of this is written for the OP, who is a student. So
don’t complain that I’m covering the obvious. Experienced driver writers
can save time by skipping to the next message…now!

Very few people are good with dealing with nesting. There are three
things that cognitive psychology research has poven humans are not very
good at: managing large amounts of state; handling nested contexts; and
reasoning about concurrency. What do programmers do? Yep. We have
trained ourselves to deal with all three things that cognitive psychology
tells us are hard problems!

Mostly, we handle it by avoidance. Can’t reason well about large amounts
of state? Put the required state in a struct and manage it with a small
number of functions. This leads to languages like C++ where the compiler
rigorously enforces the “rules in our head”. Can’t manage dynamic
nesting? Put the necessary state in a struct and pass it by pointer.
This leads to avoiding global variables and leads to languages like C++.
Can’t handle static nesting? Don’t nest. For example, I do a lot of
“state management” by using decision tables. So my comments might be
something like
Name A B C Action
a. T T T paint bits purple
b. T T F paint bits green

Now, a newbie might be tempted to write this as:

if(A)
if(B)
if(C)
purple();
else
green();
(note the lack of {} which can also lead to problems)

For the eight cases, the process will be “optimized” in a set of nested
statements.

The problem is that the code will be maintained by unskilled labor. This
means either the new hire or the original author six months later. It
involves taking the complex nested contexts and reverse-engineering that
decision table, which may or may not be done correctly. Then, a new set
of consraints may require massive refactoring of the code.

So instead, I write
if(A && B && C)
{ /* a. */
purple();
} /* a. */
else
if(A && B && !C)
{ /* b. */
green();
} /* b. */
else

if(!A && B && C)
{ /* e. */
…report impossible condition
} /* e. */

That is, I make no attempt to “optimize” the decoding. And the code to
take an action is typically a single function call, so I don’t have to
copy-and-paste code if two states require the same response; I just call
the function again. Note that although I show () in the call, in C I’m
usually passing down some kind of context.

Now, an amateur instructor might point out that my code is larger than
required, and “less efficient” because it involves evaluating A, B, and C
multiple times (they might not be simple variables, but more like “p !=
NULL” or something like that. I have several responses to that, which
include, but are not limited to:
function calls are free
optimizing compilers can detect common subexpressions and collapse
similar code sequences
evaluating simple predicates on a modern architecture with caches,
pipelines, and all the other bells & whistles is essentially free
code size is irrelevant unless you can demonstrate that the
“optimization” affects the entire driver size to a significant extent
delivering a buggy product is extremely expensive
maintenance (including feature enhancement) is somewhere between
expensive and extremely expensive
introducing bugs via maintenance which are not caught until
post-deployment is extremely expensive
finding and fixing pre-deployment bugs is expensive

So “optimizing” code “by hand” often wastes a valuable resource
(programmer time), wastes an expensive resource (programmer time), and
often pushes these costs “downstream” to later debugging and enhancement
costs. Such pre-optimizations can significantly add to the development
time, which can affect time-to-market and consequently
return-on-investment or even market share.

Optimizing code in this fashion typically saves a small percentage of time
and, in the big picture of overall driver size, unmeasurably small amounts
of space, but has high costs where it critically matters: development
costs and post-deployment maintenance/enhancement costs. Not worth it.

Finally, concurrency. Even most experts suck at reasoning about
concurrency. The simplest solution is to avoid it entirely when possible.
For example, if two threads act upon disjoint data, there is no need to
reason about their concurrency. Synchronization is where two threads rub,
and like most mechanical systems, all this does is generate heat and waste
energy. It is easier to avoid concurrency in user space; down in the
kernel, it is an always-in-your-face reality. So great discipline is
involved in dealing with concurrency, and the simpler you can keep it the
happier you will be. The
acquire-spinlock-make-a-small-number-of-changes-release-spinlock model
generally works pretty well, but if the spinlock becomes high-conflict in
a multicore environment, queued spinlock may be a better choice. The rule
“lock the smallest amount of data for the shortest possible time” is a
good rule to use as much as possible, along with “never set more than one
lock at a time” as a good rule for deadlock avoidance. The rule “when
using more than one lock at a time, lock them in a canonical order” is
mandatory, but often hard to do and very, very difficult to support in
maintenance. Avoidance is the best strategy here. Don’t do a design that
requires nested locking (which is not always possible when in the kernel).

Also note that in general, single scalar variables are handled atomically
by the hardware. Thus the following code is just silly:

BOOLEAN busy; // in a THING struct, for example, and guaranteed to be
DWORD-aligned

KeAcquireSpinlock(&thing->busylock, &oldirql);
thing->busy = FALSE;
KeReleaseSpinlock(&thing->busylock, odirql);

Also, “monotonically mutable” variables, those that start out in one state
and can be set to only one other state, and once changed are not set back
to the initial state, require no synchronization. A ref-count, for
example, is not monotonically-mutable because it can take on many states.
A BOOLEAN can be monotonically mutable, or you may rely on hardware
atomicity. In fact, the real rule is that if for any set of values it is
impossible to violate the invariants of that set no matter how many
non-atomic actions you take, and in what order they are performed, then no
synchronization is required. The number of times this is actually true is
small enough to be indistinguishable from zero. (There is an unproven
conjecture that if you believe you have such a set of values, you are
mistaken).
joe

>
> As a style issue, my own taste is to never nest conditionals more than
> one
> deep, which is often impractical, so I am willing to go to two levels.
>


NTDEV is sponsored by OSR

OSR is HIRING!! See http://www.osr.com/careers

For our schedule of WDF, WDM, debugging and other seminars visit:
http://www.osr.com/seminars

To unsubscribe, visit the List Server section of OSR Online at
http://www.osronline.com/page.cfm?name=ListServer

I learned the term “monotonically mutable” today.

Peter
OSR

xxxxx@hotmail.com wrote:

I can’t manage to get the correct address, it’s always ‘close’, but i don’t understand what i’m doing wrong here. If I inspect the structures in windbg, it shows me the right adresses, by issuing !peb and and then using this command:

(!list -t ntdll!_LIST_ENTRY.Flink -x "dt ntdll!_LDR_DATA_TABLE_ENTRY @$extret" <ldr.inloadordermodulelist from peb>
>
> I try to do the same thing in my code, but as i said, for some reason the addresses are never correct, and sometimes the DbgPrint() message in my code doesn’t show?! For example: when i start notepad it never shows.
> …
>
> VOID createThreadNotify(
> …
> if(LdrData) {
> Flink = LdrData->InMemoryOrderModuleList.Flink;
> Blink = LdrData->InMemoryOrderModuleList.Blink;
>
> while(Flink != Blink) {
> LdrTableEntry = (PLDR_DATA_TABLE_ENTRY)Flink;

That last line is your problem. You are assuming that the link points
to the start of the LDR_DATA_TABLE_ENTRY structure. Fifteen seconds
with Google showed me that was not the case. The link points to the
LIST_ENTRY structure that is embedded within the structure, NOT at the
start of the structure.

You want:
LdrTableEntry = CONTAINING_RECORD(Flink, LDR_DATA_TABLE_ENTRY,
InMemoryOrderLinks);

(Actually, the assumption you made here is not unreasonable based on the
text of the MSDN page on PEB_LDR_DATA. The text is misleading, and I
only know it’s wrong because I know how LIST_ENTRY works. I’ll submit a
documentation comment.)


Tim Roberts, xxxxx@probo.com
Providenza & Boekelheide, Inc.</ldr.inloadordermodulelist>

It is one of those precepts like Bolzano’s theorem (aka intermediate value
theorem) that is self evident from the definition

Assuming of course that the definition was sufficient understood in the
first place

wrote in message news:xxxxx@ntdev…

I learned the term “monotonically mutable” today.

Peter
OSR

Proving something obvious like this is incredibly hard.

[quote]

I learned the term “monotonically mutable” today.

[quote]

That’s sounds too geeky. How about “Ratchet Value”?

Calvin