MmGetSystemRoutineAddress BugCheck?

Daniel_Terhell · June 1, 2007, 4:47am

The solution you provide here suggests that MmGet… only crashes in case it
cannot find a routine address name, but that’s not the case. It crashes on
perfectly valid exported routine names so you still don’t know if it is
working reliably or not.

/Daniel

“Doron Holan” wrote in message
news:xxxxx@ntdev…
Essentially, you can do
this

If (TheCurrentVersionExportsThisDDI()) { MmGetSystemRoutineAddress() }

Instead of a blind call to MmGetSystemRoutineAddress on all platforms.
Kinda ugly? Yeah, but not that bad compared to other workarounds, esp
b/c the knowledge here is static and you will encounter the issue well
before you ship.

Mark_Roddy · June 1, 2007, 8:56am

So this has been a fascinating discussion. It seems that the correct
approach is something like:

If (OsVersion < Vista)
{
    UseWorkAroundForMissingApiOfInterest();
}
Else
{
    ApiOfInterest = MmCrashMySystemRandomly(…);
    If (!ApiOfInterest)
    {
        UseWorkAroundForMissingApiOfInterest();
    }
}

Questions? First check the Kernel Driver FAQ at
http://www.osronline.com/article.cfm?id=256

To unsubscribe, visit the List Server section of OSR Online at
http://www.osronline.com/page.cfm?name=ListServer

OSR_Community_User · June 1, 2007, 9:06am

Okay, I see that MmGetSystemRoutineAddress acquires an ERESOURCE, which
would remain acquired in the event of a caught exception.

So, I apologize for and take back my “irresponsible” accusation and my
snideness about the contract.

Dan.

-----Original Message-----
From: xxxxx@lists.osr.com
[mailto:xxxxx@lists.osr.com] On Behalf Of Doron Holan
Sent: Thursday, May 31, 2007 10:57 PM
To: Windows System Software Devs Interest List
Subject: RE: [ntdev] MmGetSystemRoutineAddress BugCheck?

There is a workaround that does not involve SEH. Yes, it is a bit painful
and iterative, but like Peter Wieland said, you can use the current OS
version with some built-in knowledge to know whether to make the call or not
until we can provide a real solution that does not require every driver to
hack around the bug. Essentially, you can do this

If (TheCurrentVersionExportsThisDDI()) { MmGetSystemRoutineAddress() }

Instead of a blind call to MmGetSystemRoutineAddress on all platforms. Kinda
ugly? Yeah, but not that bad compared to other workarounds, esp b/c the
knowledge here is static and you will encounter the issue well before you
ship.
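
A minimal sketch of that version gate, written as plain C so it can be tried outside the kernel. Everything here is hypothetical: the table entries, the helper names, and the resolver callback are placeholders, not real exports or a Microsoft-provided mechanism. In a driver, the version numbers would come from a call such as RtlGetVersion, and the callback would wrap MmGetSystemRoutineAddress.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical table entry: the first OS version assumed to export a routine. */
typedef struct {
    const char *name;
    unsigned    min_major;
    unsigned    min_minor;
} ExportKnowledge;

/* Placeholder names and versions, for illustration only. */
static const ExportKnowledge kKnownExports[] = {
    { "HypotheticalRoutineA", 5, 1 },
    { "HypotheticalRoutineB", 6, 0 },
};

/* Resolver callback; in a driver this would wrap MmGetSystemRoutineAddress. */
typedef void *(*ResolverFn)(const char *name);

/* Returns 1 if the static table says this OS version exports the routine. */
static int current_version_exports(unsigned major, unsigned minor,
                                   const char *name)
{
    size_t i;
    for (i = 0; i < sizeof(kKnownExports) / sizeof(kKnownExports[0]); i++) {
        const ExportKnowledge *e = &kKnownExports[i];
        if (strcmp(e->name, name) == 0) {
            return major > e->min_major ||
                   (major == e->min_major && minor >= e->min_minor);
        }
    }
    return 0; /* unknown name: never make the blind call */
}

/* Calls the resolver only when the table says the lookup should succeed. */
static void *resolve_if_exported(unsigned major, unsigned minor,
                                 const char *name, ResolverFn resolve)
{
    if (!current_version_exports(major, minor, name))
        return NULL;
    return resolve(name);
}

/* Stand-in resolver for trying the gate in user mode; it counts its calls. */
static int g_resolver_calls = 0;
static void *counting_resolver(const char *name)
{
    (void)name;
    g_resolver_calls++;
    return (void *)&g_resolver_calls;
}
```

The gate only avoids lookups for names the table says are absent; it does nothing for a lookup that misbehaves on a name that is actually exported.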

No need to be snide and create a straw man about a bug in the API vs SEH and
contracts. My point about contracts is that while it appears to work, you
can very well leave the OS in an unknown state (locks still held or worse, a
random address that was touched and we now have corrupted some other
structure or UM process, etc). You use SEH where it is documented to be
used, not to trap bugs (in your code, some other driver, or the OS).

d

-----Original Message-----
From: xxxxx@lists.osr.com
[mailto:xxxxx@lists.osr.com] On Behalf Of Dan Kyler
Sent: Thursday, May 31, 2007 9:00 PM
To: Windows System Software Devs Interest List
Subject: RE: [ntdev] MmGetSystemRoutineAddress BugCheck?

Bad advice my ass.

Every indication I’ve seen indicates that this would be caught by try/except
(and I expect that several people on this list have already verified this).
And if not, it wouldn’t make it any worse.

I have a huge amount of respect for Doron, his knowledge and abilities, and
his selfless and abundant help on this list. But I think his last post was
irresponsible.

However, it is in keeping with Microsoft’s “response” to this problem.
Microsoft has for years been actively pushing the use of
MmGetSystemRoutineAddress in response to the community’s complaints that
API x is not available on Windows version y. Now we discover that it
crashes the system and blames your driver for 20% of the valid range of
inputs.

What is the response to that? Is it a Windows Update hotfix to EVERY
affected system regardless of whether or not it is still officially
supported? No, it’s don’t worry, it’s been fixed in some random assortment
of operating systems that we can’t really define. And don’t try to protect
yourself against it, because that’s “not recommended”, and “not in the
formal contract”.

If I’m not mistaken, the “formal contract” indicates that
MmGetSystemRoutineAddress will not BSOD if given valid inputs. You can
throw the “formal contract” out the window at this point.

Dan.

-----Original Message-----
From: xxxxx@lists.osr.com
[mailto:xxxxx@lists.osr.com] On Behalf Of
xxxxx@microsoft.com
Sent: Thursday, May 31, 2007 8:49 PM
To: Windows System Software Devs Interest List
Subject: RE:[ntdev] MmGetSystemRoutineAddress BugCheck?

My apology for the bad advice on using SEH. I composed and discarded posts
on the issue this morning, and began with a “maybe you can do this but it
hasn’t been verified” and moved to “it appears you can” as the day wore on,
partly based upon the fact that I’ve done my own code for locating kernel
exports and didn’t remember any unwindable state in that solution, and
partly on an over-optimistic reading of what information I could glean on
the subject myself (not having ready access to or time to sift through all
that legacy source).

Bill_McKenzie-3 · June 1, 2007, 9:08am

Seriously had me LOL!!

Don_Burn_1 · June 1, 2007, 9:09am

This discussion points out the problems of big companies (and some
not-so-big firms): the rules in place to protect the firm can in many cases
interfere with the goals of the firm. This is one of those classics in
business books and classes, showing why firms become less responsive.

In this case Microsoft pushes people hard to use the new features and APIs,
but the function that enables using them is not considered important enough
to fix!

Another one, which I heard about at WinHEC and for which I just got a
Longhorn bug report closed without any intent to fix, is PreFast
annotations. Microsoft is pushing people to use PreFast (and even requiring
it for WHQL), and talking up the annotations. I filed bug reports for Vista
and now Longhorn about the fact that the documentation is still using IN,
OUT and OPTIONAL instead of annotations. I found out at WinHEC that the
documentation group is carefully taking the new function declarations and
removing the annotations for the docs. I just got a close with a “will not
be fixed” on the bug I filed.

Oh well, more of Do What I Say, Not What I Do!

–
Don Burn (MVP, Windows DDK)
Windows 2k/XP/2k3 Filesystem and Driver Consulting
Website: http://www.windrvr.com
Blog: http://msmvps.com/blogs/WinDrvr
Remove StopSpam to reply

Peter_Viscarola_OSR · June 1, 2007, 10:25am

G*d save us all from SAL notations.

In terms of “do as I say not as I do”… I can tell you from experience that Microsoft takes spec strings VERY seriously. Just TRY checking some code into Windows without the appropriate SAL notations. No chance.

In terms of the DDK Docs: Who cares if the spec strings are in the docs? They’re for PreFast to read, not humans. And they can be insanely ugly and complicated.

Me? I’m gonna start writing my drivers in Sing#.

Peter

OSR_Community_User · June 1, 2007, 12:10pm

Some followup on a few of the items from yesterday, in case they are needed…

The DbgPrintEx report was one of the NTFSD links posted early yesterday: http://www.osronline.com/showThread.cfm?link=101536

The x64 stuff bothers me, because we use it, and haven’t seen this behavior- it does not sound like this bug is the cause. If there are details of the sort I can use to reproduce the problem (what OS, which names?), I’d be happy to file a bug on it. I don’t think anyone here really wants this to be broken and untreated.

As for which names are exposed to the bug: any non-existent name that sorts before the first name in the export table of the HAL or the kernel is at risk (the potential is there in the search of either one, since searching for a match in one binary is what this routine does).

Looking at a HAL I had handy (link /dump /exports %windir%\system32\hal.dll) then, [not entirely true because this is a Vista box, so the bug isn’t there], any missing name before HalAcquireDisplayOwnership is at risk. My kernel’s export names begin with AlpcGetHeaderSize. These are both from Vista, as I said, but it should be an easy enough exercise to assess where the problem names are on any platform you are worried about (link /dump is platform-agnostic, as long as it’s PE format).
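
A rough sketch of that assessment in C, for a name already known to be missing from both images. The helper names are hypothetical, and plain strcmp ordering is an assumption; the first-export strings would be read off the link /dump /exports output for the platform in question.

```c
#include <string.h>

/* Returns 1 if `name` sorts strictly before `first_export`.
 * Assumption: the export name table is ordered the way strcmp orders it. */
static int sorts_before_first_export(const char *name, const char *first_export)
{
    return strcmp(name, first_export) < 0;
}

/* For a name absent from both images: it is at risk if it sorts before the
 * first export name of either the kernel or the HAL, since both are searched. */
static int missing_name_at_risk(const char *name,
                                const char *kernel_first,
                                const char *hal_first)
{
    return sorts_before_first_export(name, kernel_first) ||
           sorts_before_first_export(name, hal_first);
}
```

With the two Vista first-export names quoted above, a missing name starting with "Ex" would be flagged because it sorts before HalAcquireDisplayOwnership, while one starting with "Zw" would not.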

I’m thinking if I ever get time, I’ll collect the output from that command, build a table and a small test driver around it to poke at this further. Maybe that can turn up these x64 issues…

Again, I chimed in too early, even if it was with good intentions…

Daniel_Terhell · June 1, 2007, 6:03pm

Considering Konstantin’s report of a variable dropping below zero, there is
a possibility that you are mixing up signed and unsigned types. If the low,
high, or middle values in a binary search decrease and wrap around because
signed and unsigned types are mixed, and the routine just returns NULL based
on tests of those values, as binary searches do, that can perfectly explain
why it returns NULL at random on x64. It is still speculation, but it looks
like this x64 bug is probably just the same bug as on x86, manifesting in a
different way.

/Daniel

Daniel_Terhell · June 1, 2007, 6:03pm

Considering Konstantin’s report of a variable dropping below zero, it looks
like you are mixing up signed and unsigned types in
MiFindExportedRoutineByName. If the low, high, or middle values in a binary
search decrease and wrap around because signed and unsigned types are mixed,
and the routine returns NULL based on comparisons of those values, as binary
searches normally do, that would explain why it returns NULL at random on
x64. It is still speculation, but it looks like this x64 bug is probably
just the same bug as on x86, manifesting in a different way, and you have
some ULONG mixed up with some LONG.

/Daniel

OSR_Community_User · June 1, 2007, 9:44pm

The algorithm works except in the case I mentioned: in that specific case, you underflow 0 because you are using unsigned arithmetic on an algorithm designed for signed values. Because of the table item size, there is more overflow, so even though you aren’t explicitly accessing [-1] the end result is the same. You use whatever ULONG the linker placed BEFORE the table as an offset into memory for a null-terminated string compare.

IF that is a valid address, and IF the comparison fails so that you move Low away from 0 [Middle + 1], you might eventually terminate with a NULL return. But you’ve got 30 shifts left in the algorithm, and every one of them had better not fault. That just seems unlikely. There’s not a lot of reason to believe it works any differently on x86.

Also, IIRC it was said that VALID names were failing. The check order is kernel, then HAL. So to get this result, you need a valid HAL name which happens to precede the first kernel name in sort order. Even for 64-bit, I haven’t checked them all (or even come close), but I’m not finding any such cases.

I’d prefer not to just wave it off as the same problem.

OSR_Community_User · June 2, 2007, 2:35am

Meant to get to this, but had to leave earlier (school play).

Re-reading some of the earlier requests, I’ll be more specific, because I think it helps explain the reason I still think x64 deserves another look based upon the other reports. I also think it might help people like Bill McKenzie who might still be concerned exactly which calls they have to worry about.

The algorithm is:

Low, Middle, High- indices into an ordered array. Target is the value you are trying to find the appropriate index for in the array.

Begin with Low at the first entry, High at the last.

while (High >= Low)
{
    Middle = Low + ((High - Low) >> 1);
    if (Array[Middle] == Target)
        break;
    if (Array[Middle] < Target)
        Low = Middle + 1;
    else
        High = Middle - 1;
}

if (High < Low) on exit, the value isn’t there- otherwise, Middle is your desired index.

In this case, the target is a null-terminated character string, the array elements are offsets from a known base (base of the image) to null-terminated strings, and the comparisons are via strcmpi (I think; it might be strcmp, I’m at home at the moment, and I don’t think that detail affects the discussion significantly).

The specific indices at start will be low = 0, high = number of names exported - 1;

If all the High, Middle, Low are signed, this works. If they are unsigned, then there is a broken edge- normally when a value is not present the final check will have high == middle == low, and fail with low incremented above high, or high decremented below low as the case may be.

But unsigned with Low 0, and the target is still smaller, you get an underflow (the new High is 0 (Middle) - 1, and that value is > 0 (Low)), and this is the bug (easily fixed by using signed indices). While there can be other theoretical complications if you wish to assume mixed signed/unsigned indices [even worse if you assume negative start points], they don’t realistically apply: these tables have dozens or at most hundreds of entries, so the sign bit (which is what kills you) doesn’t come into play UNTIL this case is hit.
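
The edge case can be reproduced with a small user-mode sketch. This is not the kernel's code: the function names and the three-entry sample table are made up for illustration, and the unsigned variant reports the first out-of-range probe index instead of dereferencing it.

```c
#include <stdint.h>
#include <string.h>

/* Made-up sorted name table standing in for an export name table. */
static const char *const kSampleNames[] = { "Bravo", "Charlie", "Delta" };

/* Signed indices: returns the matching index, or -1 if the name is absent. */
static int32_t find_signed(const char *const *names, int32_t count,
                           const char *target)
{
    int32_t low = 0, high = count - 1;
    while (high >= low) {
        int32_t middle = low + ((high - low) >> 1);
        int cmp = strcmp(names[middle], target);
        if (cmp == 0)
            return middle;
        if (cmp < 0)
            low = middle + 1;
        else
            high = middle - 1;   /* may reach -1, which ends the loop */
    }
    return -1;
}

/* Same loop with unsigned indices. Returns 1 and stores the index in
 * *bad_index when a probe would land outside the table; returns 0 if the
 * search ends (found or not found) without ever leaving the table. */
static int first_bad_probe_unsigned(const char *const *names, uint32_t count,
                                    const char *target, uint32_t *bad_index)
{
    uint32_t low = 0, high = count - 1u;
    while (high >= low) {
        uint32_t middle = low + ((high - low) >> 1);
        int cmp;
        if (middle >= count) {   /* the real routine would read memory here */
            *bad_index = middle;
            return 1;
        }
        cmp = strcmp(names[middle], target);
        if (cmp == 0)
            return 0;
        if (cmp < 0)
            low = middle + 1u;
        else
            high = middle - 1u;  /* 0 - 1 wraps to 0xFFFFFFFF */
    }
    return 0;
}
```

Searching the sample table for "Alpha", which sorts before every entry, ends cleanly with signed indices, while the unsigned loop wraps High to 0xFFFFFFFF and produces a next Middle of 0x7FFFFFFF.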

Rehashing the earlier post, let’s say the first bad ULONG [effectively at array[-1], thanks to the overflow occurring when converting the Middle index of 0x7FFFFFFF to a pointer to ULONG] magically gives us an offset within the image, and the data at that location, interpreted as a string, is SMALLER than our target (the case that moves Low to Middle + 1). Still saying it’s all unsigned, then Low becomes 0x80000000 and Middle 0xBFFFFFFF; that resolves to the very same index due to overflow/wrap. The point is that I believe this is your NULL return case when you see it: it continues to compare the same two values until it exits because of the built-in ambiguity of the two high bits. It requires one happy circumstance, but *one* isn’t as hard to swallow. Still, I’m skeptical this magic value is an x86/x64 difference. But hey, I know what to look for and where, so thanks for the tip…

I think I adequately explained on the previous post why I also have a concern with the idea that a valid name is affected. But the above explanation might make it easier to understand why I assert there that only specific cases can trigger these bad behaviors. I don’t see how anything other than a valid HAL name preceding the first kernel name can wander into this, and unless you do, everything else works.

I’m at least going to try to satisfy myself about the false negatives. Ownership or not, *I* need this to work…

raj_r · June 2, 2007, 4:09am

Somewhere in the sequence ecx becomes 0, and then the next iteration
goes haywire.

This snippet is from W2K SP4,

and IIRC the API (sorry, I think the term is DDI) in this sequence was
NtDeviceIoControlFile.

Breakpoint 1 hit
eax=e1291208 ebx=80062000 ecx=00000000 edx=8006e344 esi=8006e5ac edi=00000001
eip=804ef7c7 esp=fb674ae4 ebp=fb674b04 iopl=0 nv up ei pl zr na pe cy
cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000 efl=00000247
nt!MiFindExportedRoutineByName+0x54:
804ef7c7 8b348a mov esi,dword ptr [edx+ecx*4]
ds:0023:8006e344=0000c586
kd>
Breakpoint 1 hit
eax=e1291208 ebx=80062000 ecx=7fffffff edx=8006e344 esi=8006e586 edi=ffffffff
eip=804ef7c7 esp=fb674ae4 ebp=fb674b04 iopl=0 ov up ei pl nz na pe cy
cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000 efl=00000a07
nt!MiFindExportedRoutineByName+0x54:
804ef7c7 8b348a mov esi,dword ptr [edx+ecx*4]
ds:0023:8006e340=00005aa0
kd>
Breakpoint 1 hit
eax=e1291208 ebx=80062000 ecx=3fffffff edx=8006e344 esi=80067aa0 edi=7ffffffe
eip=804ef7c7 esp=fb674ae4 ebp=fb674b04 iopl=0 nv up ei pl nz na pe nc
cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000 efl=00000206
nt!MiFindExportedRoutineByName+0x54:
804ef7c7 8b348a mov esi,dword ptr [edx+ecx*4]
ds:0023:8006e340=00005aa0
kd>
Breakpoint 1 hit
eax=e1291208 ebx=80062000 ecx=1fffffff edx=8006e344 esi=80067aa0 edi=3ffffffe
eip=804ef7c7 esp=fb674ae4 ebp=fb674b04 iopl=0 nv up ei pl nz na pe nc
cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000 efl=00000206
nt!MiFindExportedRoutineByName+0x54:
804ef7c7 8b348a mov esi,dword ptr [edx+ecx*4]
ds:0023:0006e340=???
kd> t
Access violation - code c0000005 (!!! second chance !!!)
eax=e1291208 ebx=80062000 ecx=1fffffff edx=8006e344 esi=80067aa0 edi=3ffffffe
eip=804ef7c7 esp=fb674ae4 ebp=fb674b04 iopl=0 nv up ei pl nz na pe nc
cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000 efl=00000306
nt!MiFindExportedRoutineByName+0x54:
804ef7c7 8b348a mov esi,dword ptr [edx+ecx*4]
ds:0023:0006e340=???
kd> r ecx=2f
kd> r
eax=e1291208 ebx=80062000 ecx=0000002f edx=8006e344 esi=80067aa0 edi=3ffffffe
eip=804ef7c7 esp=fb674ae4 ebp=fb674b04 iopl=0 nv up ei pl nz na pe nc
cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000 efl=00000306
nt!MiFindExportedRoutineByName+0x54:
804ef7c7 8b348a mov esi,dword ptr [edx+ecx*4]
ds:0023:8006e400=0000ca7c
kd> g
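
The addresses in the trace follow from 32-bit wraparound in the [edx+ecx*4] operand. A small sketch, with a hypothetical helper name, that recomputes the effective address the way a 32-bit register would:

```c
#include <stdint.h>

/* Effective address of [table_base + index*4] in 32-bit arithmetic.
 * Unsigned overflow wraps modulo 2^32, as the x86 address calculation does. */
static uint32_t name_slot_address32(uint32_t table_base, uint32_t index)
{
    return table_base + index * 4u;
}
```

With edx=8006e344, an index of 7fffffff or 3fffffff wraps to 8006e340, four bytes before the table, and 1fffffff lands at 0006e340, the address that faults above.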

regards

raj_r

Daniel_Terhell · June 2, 2007, 6:21am

Yes, in the case a name is not present then after an iteration of low=0 high
=1, Mid becomes 0. Then High = Mid - 1 wraps to 0xFFFFFFFF, the next Mid
becomes 0x7FFFFFFF, and the loop still goes on with bogus values.

ecx=00000000
ecx=7fffffff
ecx=3fffffff
ecx=1fffffff

These values confirm my speculation that this function uses unsigned types
for high, mid and low, because if high were -1 then the condition
while (high >= low) would no longer hold and the loop would be
terminated.

/Daniel

“raj_r” wrote in message news:xxxxx@ntdev…
> some where in the sequence ecx becomes 0 and then the next iterartion
> goes haywire
>
> this snippet is from w2k sp4
>
> and iirc the api (sorry i think the term is DDI ) on this sequence was
> NtDeviceIoControlFile
>
> Breakpoint 1 hit
> eax=e1291208 ebx=80062000 ecx=00000000 edx=8006e344 esi=8006e5ac
> edi=00000001
> eip=804ef7c7 esp=fb674ae4 ebp=fb674b04 iopl=0 nv up ei pl zr na pe
> cy
> cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000
> efl=00000247
> nt!MiFindExportedRoutineByName+0x54:
> 804ef7c7 8b348a mov esi,dword ptr [edx+ecx4]
> ds:0023:8006e344=0000c586
> kd>
> Breakpoint 1 hit
> eax=e1291208 ebx=80062000 ecx=7fffffff edx=8006e344 esi=8006e586
> edi=ffffffff
> eip=804ef7c7 esp=fb674ae4 ebp=fb674b04 iopl=0 ov up ei pl nz na pe
> cy
> cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000
> efl=00000a07
> nt!MiFindExportedRoutineByName+0x54:
> 804ef7c7 8b348a mov esi,dword ptr [edx+ecx4]
> ds:0023:8006e340=00005aa0
> kd>
> Breakpoint 1 hit
> eax=e1291208 ebx=80062000 ecx=3fffffff edx=8006e344 esi=80067aa0
> edi=7ffffffe
> eip=804ef7c7 esp=fb674ae4 ebp=fb674b04 iopl=0 nv up ei pl nz na pe
> nc
> cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000
> efl=00000206
> nt!MiFindExportedRoutineByName+0x54:
> 804ef7c7 8b348a mov esi,dword ptr [edx+ecx4]
> ds:0023:8006e340=00005aa0
> kd>
> Breakpoint 1 hit
> eax=e1291208 ebx=80062000 ecx=1fffffff edx=8006e344 esi=80067aa0
> edi=3ffffffe
> eip=804ef7c7 esp=fb674ae4 ebp=fb674b04 iopl=0 nv up ei pl nz na pe
> nc
> cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000
> efl=00000206
> nt!MiFindExportedRoutineByName+0x54:
> 804ef7c7 8b348a mov esi,dword ptr [edx+ecx4]
> ds:0023:0006e340=???
> kd> t
> Access violation - code c0000005 (!!! second chance !!!)
> eax=e1291208 ebx=80062000 ecx=1fffffff edx=8006e344 esi=80067aa0
> edi=3ffffffe
> eip=804ef7c7 esp=fb674ae4 ebp=fb674b04 iopl=0 nv up ei pl nz na pe
> nc
> cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000
> efl=00000306
> nt!MiFindExportedRoutineByName+0x54:
> 804ef7c7 8b348a mov esi,dword ptr [edx+ecx4]
> ds:0023:0006e340=???
> kd> r ecx=2f
> kd> r
> eax=e1291208 ebx=80062000 ecx=0000002f edx=8006e344 esi=80067aa0
> edi=3ffffffe
> eip=804ef7c7 esp=fb674ae4 ebp=fb674b04 iopl=0 nv up ei pl nz na pe
> nc
> cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000
> efl=00000306
> nt!MiFindExportedRoutineByName+0x54:
> 804ef7c7 8b348a mov esi,dword ptr [edx+ecx4]
> ds:0023:8006e400=0000ca7c
> kd> g
>
>
> regards
>
> raj_r
>
>
>
> On 6/2/07, xxxxx@microsoft.com wrote:
>> Meant to get to this, but had to leave earlier (school play).
>>
>> Re-reading some of the earlier requests, I’ll be more specific, because I
>> think it helps explain the reason I still think x64 deserves another look
>> based upon the other reports. I also think it might help people like
>> Bill McKenzie who might still be concerned exactly which calls they have
>> to worry about.
>>
>> The algorithm is:
>>
>> Low, Middle, High- indices into an ordered array. Target is the value
>> you are trying to find the appropriate index for in the array.
>>
>> Begin with Low at the first entry, High at the last.
>>
>> while (High >=Low)
>> {
>> Middle = Low + ((High - Low) >> 1);
>> if (Array[Middle] == Target)
>> break;
>> if (Array[Middle] < Target)
>> Low = Middle + 1;
>> else
>> High = Middle - 1;
>> }
>>
>> if (High < Low) on exit, the value isn’t there- otherwise, Middle is your
>> desired index.
>>
>> In this case, the target is a null-terminated character string, the array
>> elements are offsets from a known base (base of the image) to
>> null-terminated strings, and the comparisons are via strcmpi (I think-
>> might be strcmp, I’m at home at the moment, and I don’t think that detail
>> affects the dsicussion significantly).
>>
>> The specific indices at start will be low = 0, high = number of names
>> exported - 1;
>>
>> If High, Middle and Low are all signed, this works. If they are
>> unsigned, then there is a broken edge. Normally, when a value is not
>> present, the final check will have High == Middle == Low, and fail with
>> Low incremented above High, or High decremented below Low, as the case may
>> be.
>>
>> But when the indices are unsigned, Low is 0, and the target is still
>> smaller, you get an underflow (the new High is 0 (Middle) - 1, and that
>> value is > 0 (Low)), and this is the bug (easily fixed by using signed
>> indices). While there can be other theoretical complications if you wish
>> to assume mixed signed/unsigned indices [even worse if you assume negative
>> start points], they don’t realistically apply: these tables have dozens or
>> at most hundreds of entries, so the sign bit (which is what kills you)
>> doesn’t come into play UNTIL this case is hit.
>>
>> Rehashing the earlier post, let’s say the first bad ULONG [effectively at
>> array[-1], thanks to the overflow occurring when converting the Middle
>> index of 0x7FFFFFFF to a pointer to ULONG] magically gives us an offset
>> within the image, and the data at that location, interpreted as a string,
>> is BIGGER than our target. Still assuming it’s all unsigned, Low then
>> becomes 0x80000000 and Middle 0xBFFFFFFF, which resolves to the very same
>> index due to overflow/wrap. The point is that I believe this is your
>> NULL return case when you see it: it continues to compare the same two
>> values until it exits because of the built-in ambiguity of the two high
>> bits. It requires one happy circumstance, but one isn’t as hard to
>> swallow. Still, I’m skeptical this magic value is the x86/x64 difference.
>> But hey, I know what to look for and where, so thanks for the tip…
>>
>> I think I adequately explained on the previous post why I also have a
>> concern with the idea that a valid name is affected. But the above
>> explanation might make it easier to understand why I assert there that
>> only specific cases can trigger these bad behaviors. I don’t see how
>> anything other than a valid HAL name preceding the first kernel name can
>> wander into this, and unless you do, everything else works.
>>
>> I’m at least going to try to satisfy myself about the false negatives.
>> Ownership or not, I need this to work…
>>
>
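
For reference, the signed-index form of the loop described in the quoted post can be sketched in plain C. This is only an illustration under stated assumptions: `find_name` and its parameters are hypothetical names, the table is an ordinary array of C strings rather than offsets from an image base, and `strcmp` is assumed for the comparison (the post itself is unsure whether it is strcmp or strcmpi).

```c
#include <string.h>

/* Binary search over a sorted table of names using SIGNED indices.
 * Returns the index of target, or -1 if it is not present.
 * Hypothetical helper for illustration; not the kernel routine. */
long find_name(const char *const *names, long count, const char *target)
{
    long low = 0;
    long high = count - 1;

    while (high >= low) {
        long middle = low + ((high - low) >> 1);
        int cmp = strcmp(names[middle], target);

        if (cmp == 0)
            return middle;
        if (cmp < 0)
            low = middle + 1;
        else
            high = middle - 1;  /* may reach -1; the signed test ends the loop */
    }
    return -1;                  /* high < low on exit: name not found */
}
```

With signed indices, a target that sorts before the first entry drives high to -1 and the loop exits cleanly, which is the edge the post identifies as broken in the unsigned case.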

Daniel_Terhell · June 2, 2007, 6:21am

Yes, in the case where a name is not present, after an iteration with low=0 and
high=1, Mid becomes 0. Then High = Mid - 1 wraps to 0xFFFFFFFF, the next Mid
becomes 0x7FFFFFFF, and the loop still goes on with bogus values.

ecx=00000000
ecx=7fffffff
ecx=3fffffff
ecx=1fffffff

These values confirm my speculation that this function uses unsigned types for
high, mid and low, because if high were a signed -1 then the condition
while (high >= low) would no longer hold and the loop would
terminate.

Maybe there is another problem with the tables. What are they supposed to be
filled with, exported function names only?

/Daniel
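
The index sequence is easy to reproduce outside the kernel. The following is a small simulation, not the kernel routine: `trace_midpoints` is a hypothetical helper that keeps low, high and mid as 32-bit unsigned values, assumes the midpoint is computed as (low + high) >> 1, and assumes the target compares below every probed entry. No table memory is read.

```c
#include <stddef.h>
#include <stdint.h>

/* Records the successive midpoints of an all-unsigned binary search for a
 * target that is assumed to sort before every probed entry.
 * The loop never terminates on its own once high wraps, so max_steps
 * bounds the trace. Returns the number of midpoints written to out. */
size_t trace_midpoints(uint32_t name_count, uint32_t *out, size_t max_steps)
{
    uint32_t low = 0;
    uint32_t high = name_count - 1;
    size_t n = 0;

    while (high >= low && n < max_steps) {
        uint32_t mid = (low + high) >> 1;
        out[n++] = mid;
        high = mid - 1;     /* wraps to 0xFFFFFFFF when mid == 0 */
    }
    return n;
}
```

Called with a name count of 2 (low=0, high=1) and room for four steps, this yields 0, 0x7fffffff, 0x3fffffff, 0x1fffffff, matching the ecx values listed above.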

“raj_r” wrote in message news:xxxxx@ntdev…
> somewhere in the sequence ecx becomes 0 and then the next iteration
> goes haywire
>
> this snippet is from w2k sp4
>
> and iirc the api (sorry, i think the term is DDI) in this sequence was
> NtDeviceIoControlFile
>
> Breakpoint 1 hit
> eax=e1291208 ebx=80062000 ecx=00000000 edx=8006e344 esi=8006e5ac edi=00000001
> eip=804ef7c7 esp=fb674ae4 ebp=fb674b04 iopl=0 nv up ei pl zr na pe cy
> cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000 efl=00000247
> nt!MiFindExportedRoutineByName+0x54:
> 804ef7c7 8b348a mov esi,dword ptr [edx+ecx*4] ds:0023:8006e344=0000c586
> kd>
> Breakpoint 1 hit
> eax=e1291208 ebx=80062000 ecx=7fffffff edx=8006e344 esi=8006e586 edi=ffffffff
> eip=804ef7c7 esp=fb674ae4 ebp=fb674b04 iopl=0 ov up ei pl nz na pe cy
> cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000 efl=00000a07
> nt!MiFindExportedRoutineByName+0x54:
> 804ef7c7 8b348a mov esi,dword ptr [edx+ecx*4] ds:0023:8006e340=00005aa0
> kd>
> Breakpoint 1 hit
> eax=e1291208 ebx=80062000 ecx=3fffffff edx=8006e344 esi=80067aa0 edi=7ffffffe
> eip=804ef7c7 esp=fb674ae4 ebp=fb674b04 iopl=0 nv up ei pl nz na pe nc
> cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000 efl=00000206
> nt!MiFindExportedRoutineByName+0x54:
> 804ef7c7 8b348a mov esi,dword ptr [edx+ecx*4] ds:0023:8006e340=00005aa0
> kd>
> Breakpoint 1 hit
> eax=e1291208 ebx=80062000 ecx=1fffffff edx=8006e344 esi=80067aa0 edi=3ffffffe
> eip=804ef7c7 esp=fb674ae4 ebp=fb674b04 iopl=0 nv up ei pl nz na pe nc
> cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000 efl=00000206
> nt!MiFindExportedRoutineByName+0x54:
> 804ef7c7 8b348a mov esi,dword ptr [edx+ecx*4] ds:0023:0006e340=???
> kd> t
> Access violation - code c0000005 (!!! second chance !!!)
> eax=e1291208 ebx=80062000 ecx=1fffffff edx=8006e344 esi=80067aa0 edi=3ffffffe
> eip=804ef7c7 esp=fb674ae4 ebp=fb674b04 iopl=0 nv up ei pl nz na pe nc
> cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000 efl=00000306
> nt!MiFindExportedRoutineByName+0x54:
> 804ef7c7 8b348a mov esi,dword ptr [edx+ecx*4] ds:0023:0006e340=???
> kd> r ecx=2f
> kd> r
> eax=e1291208 ebx=80062000 ecx=0000002f edx=8006e344 esi=80067aa0 edi=3ffffffe
> eip=804ef7c7 esp=fb674ae4 ebp=fb674b04 iopl=0 nv up ei pl nz na pe nc
> cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000 efl=00000306
> nt!MiFindExportedRoutineByName+0x54:
> 804ef7c7 8b348a mov esi,dword ptr [edx+ecx*4] ds:0023:8006e400=0000ca7c
> kd> g
>
>
> regards
>
> raj_r
>
>
>
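
The ds: addresses in the dumps above follow directly from 32-bit wraparound in [edx+ecx*4]. A minimal sketch, assuming only that the effective address is computed modulo 2^32; `entry_address` is a hypothetical helper name used for illustration:

```c
#include <stdint.h>

/* Effective address of entry `index` in a table of 32-bit entries, computed
 * the way a 32-bit `mov esi, [edx+ecx*4]` computes it: modulo 2^32. */
uint32_t entry_address(uint32_t table_base, uint32_t index)
{
    /* Unsigned arithmetic in C wraps, so both the scaling and the add
     * silently discard any bits above bit 31. */
    return table_base + index * 4u;
}
```

With the table base 0x8006e344 from the trace, indices 0x7fffffff and 0x3fffffff both land on 0x8006e340, four bytes below the table (the array[-1] read), while 0x1fffffff lands on 0x0006e340, the address that faulted.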

Daniel_Terhell · June 2, 2007, 6:23am

BTW there must also be a bug in Windows Mail. Every time it tells me ‘could
not send post’. Then I send it again and it appears two posts have arrived.
I am on 56K dialup so this doesn’t help, but apologies for my double posts.

/Daniel


raj_r · June 2, 2007, 11:59am

Which tables? It takes the IMAGE_EXPORT_DIRECTORY number of names
and then does the shift.

At some point the index crosses the threshold, goes below 0, and it
crashes on the third iteration after that. In XP SP1 this number is
<hal.numberofnames> 0000005C

(((5b >> 1) >> 1) >> 1) and so on results in a nice BSOD, KMODE_EXCEPTION_NOT_HANDLED.

I don’t know whether this list will reject this post or not;
I’m posting an annotated disassembly of nt!MiFindExportedRoutineByName.

00535324 >PUSH EBP
00535325 MOV EBP, ESP
00535327 SUB ESP, 14
0053532A LEA EAX, DWORD PTR SS:[EBP-14]
0053532D PUSH EAX ; /Arg4 = 0007E030
0053532E PUSH 0 ; |Arg3 = 00000000
00535330 PUSH 1 ; |Arg2 = 00000001
00535332 PUSH DWORD PTR SS:[EBP+8] ; |MZ header of hal or ntoskrnl.exe
00535335 CALL NTOSKRNL.RtlImageDirectoryEntryT>; \RtlImageDirectoryEntryToData
0053533A TEST EAX, EAX ; 00289B00 <hal.characteristics> 00000000
0053533C MOV DWORD PTR SS:[EBP-10], EAX
0053533F JE NTOSKRNL.005353D8
00535345 MOV EDX, DWORD PTR DS:[EAX+20] ; 00289B20 <hal.addressofnames> 00019C98
00535348 MOV ECX, DWORD PTR DS:[EAX+24] ; 00289B24 <hal.addressofnameordinals> 00019E08
0053534B ADD ECX, DWORD PTR SS:[EBP+8] ;
0053534E ADD EDX, DWORD PTR SS:[EBP+8] ;
00535351 AND DWORD PTR SS:[EBP-4], 0
00535355 PUSH EBX
00535356 PUSH ESI ;
00535357 PUSH EDI
00535358 MOV EDI, DWORD PTR DS:[EAX+18] ; 00289B18 <hal.numberofnames> 0000005C
0053535B MOV EAX, DWORD PTR SS:[EBP+C]
0053535E MOV EAX, DWORD PTR DS:[EAX+4]
00535361 MOV DWORD PTR SS:[EBP-C], ECX
00535364 DEC EDI ; 5b
00535365 MOV DWORD PTR SS:[EBP-8], EAX
00535368 /MOV EAX, DWORD PTR SS:[EBP-4]
0053536B |LEA ECX, DWORD PTR DS:[EAX+EDI]
0053536E |MOV EAX, DWORD PTR SS:[EBP-8]
00535371 |SHR ECX, 1 ; 2d,16,0a,04,01,00,7fffffff,3fffffff,1fffffff
00535373 |MOV ESI, DWORD PTR DS:[EDX+ECX*4]
00535376 |ADD ESI, DWORD PTR SS:[EBP+8] ;
00535379 |MOV DWORD PTR SS:[EBP+C], EAX
0053537C |/MOV EAX, DWORD PTR SS:[EBP+C]
0053537F ||MOV BL, BYTE PTR DS:[EAX]
00535381 ||MOV AL, BL
00535383 ||CMP BL, BYTE PTR DS:[ESI]
00535385 ||JNZ SHORT NTOSKRNL.005353A6
00535387 ||TEST AL, AL
00535389 ||JE SHORT NTOSKRNL.005353A2
0053538B ||MOV EAX, DWORD PTR SS:[EBP+C]
0053538E ||MOV BL, BYTE PTR DS:[EAX+1]
00535391 ||MOV AL, BL
00535393 ||CMP BL, BYTE PTR DS:[ESI+1]
00535396 ||JNZ SHORT NTOSKRNL.005353A6
00535398 ||ADD DWORD PTR SS:[EBP+C], 2
0053539C ||INC ESI ;
0053539D ||INC ESI ;
0053539E ||TEST AL, AL
005353A0 |\JNZ SHORT NTOSKRNL.0053537C
005353A2 |XOR EAX, EAX
005353A4 |JMP SHORT NTOSKRNL.005353AB
005353A6 |SBB EAX, EAX
005353A8 |SBB EAX, -1
005353AB |TEST EAX, EAX
005353AD |JGE SHORT NTOSKRNL.005353B4
005353AF |LEA EDI, DWORD PTR DS:[ECX-1] ; here it becomes -1 when SHR ECX above became 0
005353B2 |JMP SHORT NTOSKRNL.005353BC
005353B4 |JLE SHORT NTOSKRNL.005353C1
005353B6 |LEA EAX, DWORD PTR DS:[ECX+1]
005353B9 |MOV DWORD PTR SS:[EBP-4], EAX
005353BC |CMP EDI, DWORD PTR SS:[EBP-4]
005353BF \JNB SHORT NTOSKRNL.00535368
005353C1 CMP EDI, DWORD PTR SS:[EBP-4]
005353C4 POP EDI
005353C5 POP ESI ;
005353C6 POP EBX
005353C7 JL SHORT NTOSKRNL.005353D8
005353C9 MOV EAX, DWORD PTR SS:[EBP-C] ; HAL.00289E08
005353CC MOVZX EAX, WORD PTR DS:[EAX+ECX*2]
005353D0 MOV ECX, DWORD PTR SS:[EBP-10] ; <hal.characteristics>
005353D3 CMP EAX, DWORD PTR DS:[ECX+14]
005353D6 JB SHORT NTOSKRNL.005353DC
005353D8 XOR EAX, EAX
005353DA JMP SHORT NTOSKRNL.005353EA
005353DC MOV ECX, DWORD PTR DS:[ECX+1C]
005353DF LEA EAX, DWORD PTR DS:[ECX+EAX*4]
005353E2 MOV ECX, DWORD PTR SS:[EBP+8] ; HAL.00270000
005353E5 MOV EAX, DWORD PTR DS:[EAX+ECX]
005353E8 ADD EAX, ECX
005353EA LEAVE
005353EB RETN 8
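
Read as C, the listing does a binary search over the name table, maps the hit through the name-ordinal table, bounds-checks the ordinal against the number of functions, and returns the function address. The sketch below mirrors that flow with signed indices. It is a simplified model, not the kernel source: `struct export_tables` and `find_export_rva` are hypothetical, plain arrays stand in for the RVAs stored in IMAGE_EXPORT_DIRECTORY, and 0 stands in for the NULL return.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical, flattened view of the three export tables. */
struct export_tables {
    const char *const *names;       /* export names, sorted ascending      */
    const uint16_t *name_ordinals;  /* parallel to names                   */
    const uint32_t *functions;      /* function RVAs, indexed by ordinal   */
    uint32_t number_of_names;
    uint32_t number_of_functions;
};

/* Returns the RVA of the named export, or 0 if it is not exported. */
uint32_t find_export_rva(const struct export_tables *t, const char *name)
{
    int32_t low = 0;
    int32_t high = (int32_t)t->number_of_names - 1;

    while (high >= low) {                 /* signed compare: -1 ends the loop */
        int32_t mid = low + ((high - low) >> 1);
        int cmp = strcmp(name, t->names[mid]);

        if (cmp < 0) {
            high = mid - 1;
        } else if (cmp > 0) {
            low = mid + 1;
        } else {
            uint16_t ordinal = t->name_ordinals[mid];
            if (ordinal >= t->number_of_functions)
                return 0;                 /* same bounds check as the CMP/JB */
            return t->functions[ordinal];
        }
    }
    return 0;
}
```

A caller would add the image base to a non-zero result, as the ADD EAX, ECX at the end of the listing does.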


Maxim_S_Shatskih · June 2, 2007, 4:11pm

> On 32 bit XP SP2 it just blue screens. Search this list and you can find
> somebody who has explained exactly why with bug check and everything, there
> is a bug when it does a binary search for a routine address.

I have an implementation of MmGetSystemRoutineAddress, written for NT4
compatibility long ago, around 2000-2001.

I was thinking about throwing it away, since we are dropping NT4 compat.

Now I understand that, sorry, no: the code is a candidate to persist for some
more years.

–
Maxim Shatskih, Windows DDK MVP
StorageCraft Corporation
xxxxx@storagecraft.com
http://www.storagecraft.com

Maxim_S_Shatskih · June 2, 2007, 4:28pm

Sorry, Doron, but what are the other solutions besides using SEH? Writing
our own MmGetSystemRoutineAddress?

–
Maxim Shatskih, Windows DDK MVP
StorageCraft Corporation
xxxxx@storagecraft.com
http://www.storagecraft.com

“Doron Holan” wrote in message news:xxxxx@ntdev…
FYI, using SEH to recover from this bug is NOT recommended. SEH is
not a formal contract for this API and as such, we (MSFT) cannot
guarantee that the OS is still in a stable state after you have caught
the exception. I am working on a better solution, but for now, SEH is
not the answer.

Thx
d

-----Original Message-----
From: xxxxx@lists.osr.com
[mailto:xxxxx@lists.osr.com] On Behalf Of xxxxx@osr.com
Sent: Thursday, May 31, 2007 1:29 PM
To: Windows System Software Devs Interest List
Subject: RE:[ntdev] MmGetSystemRoutineAddress BugCheck?

Thank you, both Bob and Doron, for taking the time to follow-up and let
us know what’s up. I know that it’s not your job to do this, either of
you, and we all certainly appreciate it greatly.

Peter
OSR


Michal_Vodicka-2 · June 2, 2007, 4:32pm

> ----------
> From: xxxxx@lists.osr.com[SMTP:xxxxx@lists.osr.com] on behalf of Maxim S. Shatskih[SMTP:xxxxx@storagecraft.com]
> Reply To: Windows System Software Devs Interest List
> Sent: Saturday, June 02, 2007 10:11 PM
> To: Windows System Software Devs Interest List
> Subject: Re:[ntdev] MmGetSystemRoutineAddress BugCheck?
>
> Now I understand that, sorry, no: the code is a candidate to persist for some
> more years

Maybe you should offer it to MS

Best regards,

Michal Vodicka
UPEK, Inc.
[xxxxx@upek.com, http://www.upek.com]

Maxim_S_Shatskih · June 2, 2007, 4:38pm

> Okay, I see that MmGetSystemRoutineAddress acquires an ERESOURCE,

…and this lock protects against a race with a change to the running kernel’s
export table, in case the kernel wants to change its exports on
the fly???

How intriguing… the need for a lock to bsearch (or linear search) the
constant data is really interesting…

–
Maxim Shatskih, Windows DDK MVP
StorageCraft Corporation
xxxxx@storagecraft.com
http://www.storagecraft.com