I was looking through the 3718 DDK IOCTL sample, and came across this:
//
// Map the physical pages described by the MDL into system space.
// Note: double mapping the buffer this way causes lot of
// system overhead for large size buffers.
//
buffer = MmGetSystemAddressForMdlSafe(mdl, NormalPagePriority);
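(Presumably the return value gets checked before it's used; sketching the
usual pattern from memory rather than copying the sample:)
//
// MmGetSystemAddressForMdlSafe returns NULL if the mapping cannot be
// created (e.g., no system PTEs available), so fail gracefully instead
// of dereferencing a NULL pointer.
//
buffer = MmGetSystemAddressForMdlSafe(mdl, NormalPagePriority);
if (buffer == NULL) {
    status = STATUS_INSUFFICIENT_RESOURCES;
    // ...complete the IRP with this status and bail out...
}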
I was under the impression that this (along with the probe & lock, etc.)
is what the I/O Manager does to provide access to METHOD_IN/OUT_DIRECT
user-mode buffers. In other words, I thought that this is the lowest-overhead
means of getting to a user-mode buffer from kernel mode, whether I do it
explicitly or the I/O Manager does it for me. Is my understanding flawed
here?
Thanks,
Phil
Philip D. Barila
Seagate Technology, LLC
(720) 684-1842
As if I need to say it: Not speaking for Seagate.
“Phil Barila” wrote in message news:xxxxx@ntdev…
> I was under the impression that this (along with the probe & lock, etc.)
> is what the I/O Manager does to provide access to METHOD_IN/OUT_DIRECT
> user-mode buffers.
No, you’re absolutely correct. This is precisely what the I/O Manager does
for Direct I/O.
There’s a long-standing urban legend that claims mapping a buffer as for
direct I/O (and using MmGetSystemAddressForMdlSafe) causes “a lot of
overhead.” In fact, doing the mapping entails nothing more than allocating
and filling in the necessary PTEs. Certainly not a lot of overhead the way
I look at it.
Now, there IS a hit that the system takes when UNmapping this sort of a
buffer. Any time you invalidate a virtual-to-physical address mapping, the
TLB will need to be flushed. So that’s a “hit.” But everything costs
SOMEthing. And getting data is what a driver is about.
If this were truly “high overhead” then the entire storage path wouldn’t be
based on the use of Direct I/O, right?
I think the origin of this myth is some statements in the DDK years ago.
Old myths die hard,
P
> I was under the impression that this [MmGetSystemAddressForMdl]
> (along with the probe & lock, etc.) is what the I/O Manager does
> to provide access to METHOD_IN/OUT_DIRECT user-mode buffers.
No. When you ask for METHOD_IN/OUT_DIRECT or DO_DIRECT_IO
the IO manager ONLY does the probe-and-lock. This guarantees
you and your (usually DMA) device a nonmoving set of physical
addresses for your buffer, but it creates no kernel-space
addresses. For that you need to call MmGetSystemAddressForMdl.
N.B.: It is highly recommended to call MmGetSystemAddressForMdlSafe
instead.
See, most drivers of DMA devices don’t actually need to “see”
the data being moved. They just point their devices at the
RAM addresses (which might not be the actual RAM addresses, but
that’s another discussion) and say “go”. So there’s no need for
the normal DMA setup to go through the page-table allocation,
page-table setup, and translation-buffer invalidation steps that
are done by MmGetSystemAddressForMdl[Safe].
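In case a concrete picture helps, here is a rough sketch (a hypothetical
driver with made-up names, not code from any DDK sample) of a
METHOD_OUT_DIRECT IOCTL handler that does want CPU access to the buffer.
The I/O manager has already probed and locked the pages and attached the
MDL to the IRP; the mapping into system space is the one step the driver
must add:
#include <ntddk.h>
//
// Hypothetical handler for a METHOD_OUT_DIRECT IOCTL (sketch only).
// The I/O manager has already probed and locked the user buffer and
// hung the resulting MDL off Irp->MdlAddress; it has NOT created a
// kernel-space mapping for it.
//
NTSTATUS
HandleDirectOutIoctl(PDEVICE_OBJECT DeviceObject, PIRP Irp)
{
    PIO_STACK_LOCATION irpSp = IoGetCurrentIrpStackLocation(Irp);
    ULONG length = irpSp->Parameters.DeviceIoControl.OutputBufferLength;
    PVOID systemVa;
    NTSTATUS status = STATUS_SUCCESS;

    UNREFERENCED_PARAMETER(DeviceObject);
    Irp->IoStatus.Information = 0;

    if (Irp->MdlAddress == NULL || length == 0) {
        status = STATUS_INVALID_PARAMETER;
    } else {
        //
        // This is the step the I/O manager does NOT do for direct I/O:
        // build a system-space virtual address for the locked pages.
        //
        systemVa = MmGetSystemAddressForMdlSafe(Irp->MdlAddress,
                                                NormalPagePriority);
        if (systemVa == NULL) {
            status = STATUS_INSUFFICIENT_RESOURCES;
        } else {
            //
            // The buffer can now be touched from any thread context;
            // a PIO device would copy its data here.  For the sketch,
            // just zero it.
            //
            RtlZeroMemory(systemVa, length);
            Irp->IoStatus.Information = length;
        }
    }

    Irp->IoStatus.Status = status;
    IoCompleteRequest(Irp, IO_NO_INCREMENT);
    return status;
}
A DMA driver, by contrast, skips the MmGetSystemAddressForMdlSafe call
entirely and just hands Irp->MdlAddress to the DMA adapter routines.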
— Jamie Hanrahan
Azius Developer Training http://www.azius.com/
Kernel Mode Systems http://www.cmkrnl.com/
Windows Driver Consulting and Training
Actually, Jamie’s characterization (just sent) about what the I/O manager
does is much clearer than mine… Sorry, too much multi-tasking.
My point stands about MmGetSystemAddressForMdlSafe. There’s no extraordinary
overhead here. And it must be used any time access to the buffer is required
outside the context of the requesting thread, as with PIO devices in the
storage stack.
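To illustrate that last point (hypothetical names, a sketch rather than
anyone’s real driver): the address MmGetSystemAddressForMdlSafe returns is
in system space, so it can be captured at dispatch time and used later from
an arbitrary thread context, where the original user-mode address would be
meaningless:
//
// At dispatch time, in the requesting thread's context: map the locked
// pages and remember the system-space address (NULL check omitted here;
// see the earlier sketch).  The device-extension fields are hypothetical.
//
devExt->CurrentIrp  = Irp;
devExt->TransferVa  = MmGetSystemAddressForMdlSafe(Irp->MdlAddress,
                                                   NormalPagePriority);
devExt->TransferLen = MmGetMdlByteCount(Irp->MdlAddress);
//
// Later, perhaps in a DPC running in some arbitrary thread's context:
// the system-space address is still valid, whereas Irp->UserBuffer
// could not safely be dereferenced here.
//
RtlCopyMemory(devExt->TransferVa, devExt->DeviceDataBuffer,
              devExt->TransferLen);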
Sorry for any confusion,
p
I think for user space, flushing of the TLB is done at teardown.
Dan
----- Original Message -----
From: “Jamie Hanrahan”
To: “NT Developers Interest List”
Sent: Tuesday, March 11, 2003 7:47 PM
Subject: [ntdev] Re: More questions about mapping user & kernel mode memory
> Peter Viscarola wrote:
> > “Phil Barila” wrote in message news:xxxxx@ntdev…
> > > I was under the impression that this [MmGetSystemAddressForMdl]
> > > (along with the probe & lock, etc.) is what the I/O Manager does
> > > to provide access to METHOD_IN/OUT_DIRECT user-mode buffers.
> >
> > No, you’re absolutely correct. This is precisely what the
> > I/O Manager does for Direct I/O.
>
> No, it doesn’t. No way does every setup for direct IO include
> mapping the pages into system address space.
>
> > There’s a long-standing urban legend that claims mapping a
> > buffer as for direct I/O (and using
> > MmGetSystemAddressForMdlSafe) causes “a lot of overhead.”
>
> I remember. I believe at one point it was claimed that this
> caused loss of memory cache, which seemed ridiculous to all
> of us.
>
> > In fact, doing the mapping entails nothing more than allocating
> > and filling in the necessary PTEs. Certainly not a lot of
> > overhead the way I look at it.
> >
> > Now, there IS a hit that the system takes when UNmapping this
> > sort of a buffer. Any time you invalidate a virtual-to-physical
> > address mapping, the TLB will need to be flushed. So that’s a
> > “hit.” But everything costs SOMEthing.
>
> Aside: It isn’t clear to me whether the invalidate is done upon
> setup or teardown - for system space it really wouldn’t matter,
> as long as it does it in one place or the other. Nor does it
> matter for “overhead” considerations.
>
> You need to remember though that all of this allocating of PTEs,
> filling in PTEs, AND invalidating the TLB entries for all of the
> page addresses so generated… it’s all being done with the MM
> spinlock held. Invalidating TLB entries is furthermore one of
> those nasty “serializing” operations in the CPU.
>
> There’s also a nasty note in the IA-32 Instruction Set Reference:
>
> “The INVLPG instruction normally flushes the TLB entry only for
> the specified page; however, in some cases, it flushes the entire
> TLB.”
>
> Ow! Now as the folks at Mindshare say in Protected Mode System
> Architecture, I can’t imagine what cases would require flushing
> the entire TLB, but there it is.
>
> > And getting data is what a driver is about.
> >
> > If this were truly “high overhead” then the entire storage
> > path wouldn’t be based on the use of Direct I/O, right?
>
> But the entire storage path ISN’T based on mapping user addresses
> into kernel address space, is it? No allocating and filling in of
> SPTEs, no INVLPGs.
>
> — Jamie Hanrahan
> Azius Developer Training http://www.azius.com/
> Kernel Mode Systems http://www.cmkrnl.com/
> Windows Driver Consulting and Training
>
“Jamie Hanrahan” wrote in message news:xxxxx@ntdev…
>
> You need to remember though that all of this allocating of PTEs,
> filling in PTEs, AND invalidating the TLB entries for all of the
> page addresses so generated… it’s all being done with the MM
> spinlock held. Invalidating TLB entries is furthermore one of
> those nasty “serializing” operations in the CPU.
>
Weeelll… actually, that’s only SORT of true.
NT has gotten way smarter about serialization over the years. Many things
that USED to be serialized by “big locks” no longer are. The memory
manager, especially, is far smarter than it used to be in this regard.
In fact, taking this to the next level, even the number of TLB flushes that
are performed has been significantly reduced over the years, with the aim of
performing only those flushes that are actually necessary.
>
> But the entire storage path ISN’T based on mapping user addresses
> into kernel address space, is it? No allocating and filling in of
> SPTEs, no INVLPGs.
>
Yes, well, I did say earlier that I misspoke… As I said, consider PIO-type
storage peripherals. Less common these days, yes. Unheard of, no.
P
Jamie & Peter, thanks for the responses. Between the two of you, you got me
past a dumb mistake I made in the mapping (or not, as the case should have
been) between UM and KM in my custom class driver.
Thanks!
Phil
–
Philip D. Barila
Seagate Technology, LLC
(720) 684-1842
As if I need to say it: Not speaking for Seagate.
I think that even in current incarnations like XP, reserving system PTEs
requires acquiring the system space lock, while flushing TLBs and
invalidating caches should require holding some dispatcher lock at synch
level, and IPIs must be involved. Still, big locks are involved.