Re: Re: MmMapLockedPagesSpecifyCache: Exception under Server 2003 (32-bit)

> Dear Members,

I ran the driver with Driver Verifier enabled. Upon the IOCTL request that
calls MmMapLockedPagesSpecifyCache, I got a “reboot”.
I opened the crash dump with the debugger and got the following lines:

Arg1: 00000077, MmMapLockedPagesSpecifyCache called when not at APC_LEVEL
or below.
Arg2: 00000002, current IRQL
Arg3: 893b2000, MDL address
Arg4: 00000001, access mode

****
You need to give the stack backtrace of where you were when you executed
the call. There are a lot of places where the documentation explicitly
tells you that you are in an unknown context and have to assume
DISPATCH_LEVEL is in effect. Classically, the dequeue handlers called
when you start a queued IRP from your DPC routine are one such instance; I
don’t know how KMDF handles this.

But I agree with Don; the colossal design failure of the FPGA to use
scatter/gather has nothing to do with a desire to map kernel memory to
user memory.
******

Now I have to find out why the IRQL is 2 (DISPATCH_LEVEL) when my IOCTL
handler runs.

Regarding the use of kernel space mapped to user space:
I’m aware this is ugly.
But this is how my hardware works.
The FPGA does not support scatter-gather.

Thanks,
Zvika.


NTDEV is sponsored by OSR

For our schedule of WDF, WDM, debugging and other seminars visit:
http://www.osr.com/seminars

To unsubscribe, visit the List Server section of OSR Online at
http://www.osronline.com/page.cfm?name=ListServer

wrote in message news:xxxxx@ntdev…
> I am reminded of the old saying “if your only tool is a hammer, all your
> problems look like nails”.

Wasn’t it you who said that IOCTLs are GOOD and that shared memory is
B.A.D ?

The problem with shared memory is that you need to be aware of IRQL and
process context and the lifetime of the process in which the memory is
mapped and know it does not do automatic synchronization for you. Hence the
many PROCESS_HAS_LOCKED_PAGES and related blue screens in the world.

There is a lot of misuse, but we aren’t going to blame the TOOLS for that,
are we? Although you may be right in this situation, you won’t hear me
advocate against the technique in general. That’s because there are
situations in which IOCTLs are NOT the preferred way and shared memory the
only option at our disposal.

//Daniel

> wrote in message news:xxxxx@ntdev…
>> I am reminded of the old saying “if your only tool is a hammer, all your
>> problems look like nails”.
>
>
> Wasn’t it you who said that IOCTLs are GOOD and that shared memory is
> B.A.D ?
>
> The problem with shared memory is that you need to be aware of IRQL and
> process context and the lifetime of the process in which the memory is
> mapped and know it does not do automatic synchronization for you. Hence
> the
> many PROCESS_HAS_LOCKED_PAGES and related blue screens in the world.
>
> There is a lot of misuse, but we aren’t going to blame the TOOLS on that,
> are we ? Although you may be right in this situation, you won’t hear me
> advocate against the technique in general. That’s because there are
> situations in which IOCTLs are NOT the preferred way and shared memory the
> only option at our disposal.

I was referring to the programmer’s “mental toolbox”.

IOCTLs are “good” compared to memory sharing. The point is, you have to
think of techniques such as shared memory as the absolutely last resort,
and create such designs only when no alternative is going to work. The
problem I see is that programmers latch onto shared memory as the FIRST
idea, and won’t let go of it no matter what.
joe

>
> //Daniel
>

What bothers me on this whole discussion is that we have never gotten
the constraints of the problem:

  1. How big are the data blocks being transferred? With today’s systems,
    a buffer copy is pretty cheap.

  2. What is the rate of data requests? I can easily get 100K IOCTLs
    through a KMDF driver, double that with WDM preprocessing, and an
    order of magnitude improvement over the original with FastIO.

  3. Can the hardware read partial data sets (be they packets or
    whatever)? I’ve seen a situation where I could read the data in chunks,
    as a software-driver form of scatter/gather.

  4. How CPU-bound are the systems in question? Even with the overhead,
    it may be better, depending on the data size and request rate, to copy
    things.

Basically, this seems to be another “We did it this way in Linux, so we
have to do it this way on Windows” situation. The last time I encountered
this, I was told that the application that accessed the shared memory was
so complex that they had to do it that way. I asked for a day to review
the application. The next day they said “You see, it was terrible”, and
I replied “Here’s a modified source that does it with IOCTLs; once I
analyzed the code it took me an hour to change your application and get
it to run on Windows!” It is disturbing how many times I have
encountered this sort of situation.

Don Burn
Windows Filesystem and Driver Consulting
Website: http://www.windrvr.com
Blog: http://msmvps.com/blogs/WinDrvr

Don,

These fixations are a common problem; I see them in applications all the
time. The horror stories of bad code based on “that’s how we always did
it” are too numerous to go into. Unstated is “On Unix” or “On VMS”.

And they are usually based on specious interpretations of requirements.
For example, “We don’t use short functions, because function calls are too
expensive” (175ps is expensive?) Or “macros are better than inline
functions because…” (some silly reason), or “We don’t use threads
because they’re too expensive” (unstated: “We don’t use threads because
our programmers are clueless” or “fork() was expensive so threads must be
expensive”). I find that the attitudes about programming are those of
people trained on PDP-11s, which is not surprising, because most of them
have either been taught by old PDP-11 programmers, or learned from books
written by PDP-11 programmers.

I get questions all the time of the form of “How do I efficiently do X on
a large file?” So I ask, “How large is a ‘large’ file?” and get answers
like “Ten megabytes”. To which I reply, “Oh, you mean a TINY file. I
thought you meant a LARGE file. Read the whole file into a buffer and
work on it in memory.” To me, large files are measured in gigabytes,
medium files in hundreds of megabytes. Ten megabytes is tiny. There is
simply no understanding of the consequences of
multiple-orders-of-magnitude changes in memory speed, instruction
execution speed, or address space size on solving a problem. This is why
we get “patch the IDT” solutions, because they’re “faster”.

You are precisely right to ask for the specs; many of these designs are
based on solving nonexistent performance problems caused by a failure to
understand either the performance requirements of the problem domain or
the capabilities of the machines. It’s like everyone thinks we’re still
programming PDP-11s. The concepts of 2.8GHz pipelined superscalar
architectures with speculative execution, multilevel caches, and dozens of
other performance enhancements, do not seem to have impacted their
consciousness.

In my teaching, I had one C instructor catch me at a break and ask me to
explain the difference between stack storage, static storage, and heap
storage (scary!). With disconnects like this, it is not surprising many
programmers are confused.

This is one of the reasons I tend to make statements of the form “X is
bad” (for some value of X), because it limits the scope of mistakes. By
the time they figure out that X may be the only solution, they have enough
years of experience to do the job right, and the judgment to have
determined that they need to do X.

I find that the less experience a programmer has, the more likely he/she
is to create a complex solution to a simple problem. Years of untraining
new programmers taught me this part.
joe


I didn’t say it was nonsense, I said it was a Bad Idea. As I have
observed elsewhere, the problem is that newbies to Windows fixate on this
design model, without realizing that “The driver thing complicates it” is
more than just a little glitch; it often represents the difference between
having written a driver that works and having created a software artifact
that will never work correctly.

Shared memory and notifications when the path is kernel-to-user are NOT
simple, no matter how much you want to believe it. Years of experience
from many driver writers have demonstrated that approaches other than the
classic “inverted call” lead to designs in which weird boundary conditions
dominate the problem space, and the shared memory design simply does not
represent Best Practice.

It doesn’t matter if it is “OS agnostic”. It’s a Windows driver and has
to live in the Windows environment.

And, to add to the fun, the people who MOST want to do this are usually
the people LEAST qualified to do so. If I go to OSR, and they propose a
driver that shares memory with user space, I know that they have done all
due diligence and have come to the conclusion that this is the only viable
solution. And they are likely to build something that works, should
they decide this design pattern is absolutely required. When somebody
wants to do this design as their first driver, I am reasonably confident
that they are clueless. And the probability that the design will ever be
made to work is asymptotically close enough to zero to be considered zero
for all practical purposes.

joe

On 23-Aug-2012 13:32, xxxxx@flounder.com wrote:
> This fixation about mapping kernel memory to user space keeps coming
> around. Has no one bothered to explain that this is one of the Truly
> Bad Ideas?

IMHO this is just a normal kind of IPC. If we can forget for a moment
that one side is “driver”, there are just two local
tasks/threads/processes communicating over a shared memory. This concept
is attractive because it is simple and OS agnostic.
The “driver” thing complicates it, and Windows adds its specific quirks,
but the trend is that simple designs win, and users eventually get what
they want (and they want what they can understand).

In the specific case of the OP, they could not realize the advantages of
the simple idea - but this does not mean the idea by itself is nonsense.

Regards,
– pa

