Chuck,
You offer some solutions to some of the problems posed, and I agree that
with the right structure for things, some of the potential problems you
suggest are avoidable.
As to the performance advantages of having a smaller miniport:
- The miniport does some performance critical work, but the actual drawing
  of things is done in the Display Driver proper (in our case 3DLDD.DLL as
  opposed to 3DLMP.SYS), so this is the place where you make sure that
  everything is cache-aligned, compact etc. (Oh, and this file is probably
  a few megabytes for most manufacturers…). So from a “Half-Life frames
  per second” perspective, the miniport is essentially unimportant, or at
  least the performance of the code in the miniport is. Of course, if the
  miniport decided to “not enable AGP8X, use PCI instead”, you’d surely
  see a BIG difference in performance. But this is a once-and-for-all
  type of setup, so how it’s performed is not particularly critical.
- In most cases, runtime checks can be avoided with clever tricks. This
  applies to miniport and display driver both. Just set things up at the
  start of day, and then you haven’t got too many runtime checks after
  that. Just as an example, if you need to know how much memory a
  particular board has, you store that as a member of the device
  structure, so you don’t need to check every time “is it card X, then
  video memory = 128M”.
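
As a minimal sketch of the point about setting things up at the start of
day, the following C fragment caches the video memory size in a device
structure once, so later code never has to ask “is it card X?” again. The
names (board_id, device_info, device_init, fits_in_video_memory) and the
64M figure for the second board are hypothetical, made up purely for
illustration; a real miniport would presumably identify the board from its
PCI IDs or similar.

```c
#include <stdint.h>

/* Hypothetical board identifiers, for illustration only. */
typedef enum { BOARD_X, BOARD_Y, BOARD_UNKNOWN } board_id;

/* Hypothetical per-device structure, filled in once at start of day. */
typedef struct {
    board_id board;
    uint32_t video_memory_bytes;
} device_info;

/* The "is it card X?" decision is made here, exactly once. */
static void device_init(device_info *dev, board_id board)
{
    dev->board = board;
    switch (board) {
    case BOARD_X: dev->video_memory_bytes = 128u * 1024u * 1024u; break;
    case BOARD_Y: dev->video_memory_bytes = 64u * 1024u * 1024u;  break; /* made-up size */
    default:      dev->video_memory_bytes = 0;                    break;
    }
}

/* Later code reads the cached member; there is no per-call board check. */
static int fits_in_video_memory(const device_info *dev, uint32_t bytes)
{
    return bytes <= dev->video_memory_bytes;
}
```

After device_init() has run, every caller just reads video_memory_bytes,
so the board-specific branch is paid for once rather than on every call.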
The biggest factor, as Calvin explained, is that there is a need to keep a
large number of fair-sized tables to keep track of (for instance) video
modes for different display settings. This can expand quite quickly, and
depending on how you do things, you may end up with quite a lot more data
than the obvious X, Y, BPP and VFreq. For instance, you may want to encode
in this table some of the hard to calculate parameters to the video unit,
rather than calculating them at the time you need them. I don’t know how
much of our driver is tables, but certainly there are some pretty large
tables in quite a few places. Mostly, these are “use once” tables, so they
are only of interest to the setup code when the graphics mode is
initialized. Once a mode has been set up, the tables are not used until the
next time the screen mode is changed (for instance when you start Half-Life
in full-screen mode). Of course, having a quarter meg of tables will not
make much of a difference to the performance in Half-Life, as the table(s)
are all out of the cache by the time you start playing, and the code in the
miniport may or may not be efficient, but very little of it will actually
be used while playing anyway.
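
To make the table idea concrete, here is a rough sketch in C of a mode
table that stores precomputed parameters next to X, Y, BPP and VFreq. The
structure, the field names (pll_setting, timing_setting) and all the
values are hypothetical placeholders, not real register settings from any
driver.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mode table entry. The last two fields stand in for the
   "hard to calculate" video unit parameters; the values are made up. */
typedef struct {
    uint16_t width, height;
    uint8_t  bpp;
    uint8_t  vfreq_hz;
    uint32_t pll_setting;     /* placeholder, not a real register value */
    uint32_t timing_setting;  /* placeholder, not a real register value */
} mode_entry;

static const mode_entry mode_table[] = {
    {  640, 480, 16, 60, 0x0001u, 0x0010u },
    {  800, 600, 32, 75, 0x0002u, 0x0020u },
    { 1024, 768, 32, 85, 0x0003u, 0x0030u },
};

/* A linear search is fine here: this runs only on a mode change. */
static const mode_entry *find_mode(uint16_t w, uint16_t h,
                                   uint8_t bpp, uint8_t vfreq_hz)
{
    size_t i;
    for (i = 0; i < sizeof mode_table / sizeof mode_table[0]; i++) {
        const mode_entry *m = &mode_table[i];
        if (m->width == w && m->height == h &&
            m->bpp == bpp && m->vfreq_hz == vfreq_hz)
            return m;
    }
    return NULL; /* mode not in the table */
}
```

Because the lookup only happens when the mode is set, neither the size of
the table nor the speed of the search matters once the game is running.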
Finally, if you have a reasonably fast processor, the limiting factor for
graphics speed is most likely the graphics processor itself. Of course, for
really complex games, this may not be true. But a significant amount of
time can easily be spent on “waiting for the graphics processor” if you
don’t have lots of math to do before drawing the next frame. This is why,
beyond a certain level of processor performance, some benchmarks give
exactly the same score whatever the processor. Obviously, optimising the
code in the driver (whether miniport or display driver) in this case will
gain nothing in performance.
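
A deliberately simplified model of that effect, assuming (and it is only
an assumption) that CPU and graphics processor work overlap completely, so
the slower of the two sets the frame time. The function name and the
millisecond figures in the note below are invented for illustration.

```c
/* Simplified model, not a measurement: with full overlap, the frame
   time is whichever of the CPU or GPU time per frame is larger. */
static double frames_per_second(double cpu_ms, double gpu_ms)
{
    double frame_ms = cpu_ms > gpu_ms ? cpu_ms : gpu_ms;
    return 1000.0 / frame_ms;
}
```

Under this model, with 10 ms of graphics processor work per frame, cutting
the CPU side from 5 ms to 2.5 ms leaves the result at 100 frames per
second; only when the CPU side exceeds 10 ms does it start to matter.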
–
Mats
-----Original Message-----
From: Chuck Batson [mailto:xxxxx@cbatson.com]
Sent: Thursday, November 20, 2003 3:45 PM
To: Windows System Software Devs Interest List
Subject: [ntdev] Re: Flushing DMA Buffer Allocated with
AllocateCommonBuffer

Hi Mats, thanks for the thorough answer. I know absolutely nothing
about Windows video drivers, so I’m mostly just curious what the reasons
are for doing this.

> The other reason would be maintaining the different builds in some
> sensible way, and avoiding errors. If you split the source up into
> different source files, some update somewhere will be missed in a
> different version.

Presumably any code that needed to be shared among the different
versions would be in a common place (perhaps a library) to avoid “copy &
paste” updates and other modification propagation errors.

> If you have lots of #if in the code, it becomes hairy for other
> reasons. Having one large source file that contains all different
> variations without any conditional compile makes it relatively easy
> to maintain.

This is true, although in those places where you need different code for
different hardware, it has to be done somehow – be it a compile-time
#if or a run-time if (meaning a conditional of some sort).

> There’s also the fact that for each package you have, you need
> separate WHQL certification, which means that if you do a different
> build for a particular board/SKU etc, you need to run the same tests
> on this build as you did on the main build, so for each variation you
> add one lot of WHQL runs. A run of WHQL for Display driver takes
> around a day to run, assuming all goes well… Add to this that you have
> to send the logs to MS, get them to certify you, and wait for the
> results from MS.

This is probably the most compelling of all the reasons you mention.
=^)

> If the driver is updated often (during Beta stage when you’re
> developing a new board + ASIC for instance), you also get problems
> with tracking which versions of which variation of the driver has
> which fixes included, as someone may have updated something during the
> daily build stage, and the second of the builds for that day has a
> different set of fixes than the first build.

These are of course real issues, but there are procedures and processes
you can use to alleviate them. For example, having a dedicated build
engineer; building from a source “snapshot” (so you don’t get builds
when the source code is in an “intermediate” state); and building all
binaries during a single build from the source snapshot.

> Now multiply all this by the number of OS’s that you support (WinNT,
> 2K, XP, 9X etc) and you start seeing why having one driver is a real
> nice thing.

This is also true. I have written application code that handles the
various OS flavors at run-time as opposed to compile-time. I’m just
surprised that something somewhat performance critical, such as a video
driver, would do this too. Having a single binary implies at least three
potentially performance-sapping side-effects: (1) run-time conditionals
(extra instructions for tests, comparisons, branches; CPU performance
with regard to branches, such as incorrectly predicted branches,
instruction fetch queue and pipeline flushes, etc.); (2) increased cache
misses due to lower spatial proximity as well as larger footprint; and
(3) swapping or memory resource consumption due to a larger binary
footprint. Do you have any thoughts on how, in a practical “real world”
situation, having a unified driver binary affects performance? For
example, if I had a driver compiled specifically for my hardware, how
many more frames per second would I see playing Half Life? =^)

Chuck

> > -----Original Message-----
> > From: Chuck Batson [mailto:xxxxx@cbatson.com]
> > Sent: Thursday, November 20, 2003 2:48 PM
> > To: Windows System Software Devs Interest List
> > Subject: [ntdev] Re: Flushing DMA Buffer Allocated with
> > AllocateCommonBuffer
> >
> >
> > Perhaps this is a dumb question, but is there any particular reason
> > why you don’t segregate into different builds? Why is it necessary
> > to cram everything into a single driver binary?
> >
> > Chuck
> >
> > ----- Original Message -----
> > From: “Calvin Guan”
> > > To: “Windows System Software Devs Interest List”
> > > > Sent: Wednesday, November 19, 2003 2:26 AM
> > > > Subject: [ntdev] Re: Flushing DMA Buffer Allocated with
> > > > AllocateCommonBuffer
> > > >
> > > >
> > > > > To add to what Alberto said, our miniport has to support a huge
> > > > > list of desktop and mobile ASICs. Each ASIC has a different video
> > > > > BIOS to handle. The biggest headache for me is the mobile ASICs on
> > > > > notebooks. Different OEMs have different LCD panels, and different
> > > > > OEMs require different features. Also, there are many awesome
> > > > > features implemented in the miniport.
> > > > >
> > > > > Instead of “miniport”, I would call it a giantport. It’s even
> > > > > larger than ntfs.sys in size. I really miss the days when I was
> > > > > working on an NDIS miniport and wrote every single line of code
> > > > > for my driver.
> > > >
> > > > -----Original Message-----
> > > > From: Moreira, Alberto [mailto:xxxxx@compuware.com]
> > > > Sent: Tuesday, November 18, 2003 10:40 AM
> > > > To: Windows System Software Devs Interest List
> > > > > Subject: [ntdev] Re: Flushing DMA Buffer Allocated with
> > > > > AllocateCommonBuffer
> > > > >
> > > > >
> > > > > There’s a lot of functionality in a Miniport; it handles most of
> > > > > the non-time-critical functions of driving a graphics subsystem.
> > > > > Some people put support for several different chips in the same
> > > > > piece of code, but even if you only have one chip, your Miniport
> > > > > may end up being pretty big. Some of the actual space is taken by
> > > > > tables. For example, every graphics driver supports several
> > > > > resolutions and bit depths, and one must keep tables of register
> > > > > settings that set up your chip for the corresponding video mode.
> > > > > There are also tables with configuration and capability settings,
> > > > > and they take space. You must handle initialization, capabilities,
> > > > > mode changes, power management, multiple screens, resource
> > > > > management, you name it. You must also manage the retrace
> > > > > interrupt. In WinXP there’s even new support for DMA. BTW, Calvin,
> > > > > do you guys implement and use the new DMA calls that WinXP added
> > > > > to the Miniport?
> > > >
> > > > Alberto.
> > > >
> > > >
> > > > -----Original Message-----
> > > > From: xxxxx@lists.osr.com
> > > > > [mailto:xxxxx@lists.osr.com] On Behalf Of Maxim S. Shatskih
> > > > > Sent: Monday, November 17, 2003 10:14 PM
> > > > > To: Windows System Software Devs Interest List
> > > > > Subject: [ntdev] Re: Flushing DMA Buffer Allocated with
> > > > > AllocateCommonBuffer
> > > > >
> > > > >
> > > > > Wow! Am I right that this huge amount of code is due to supporting
> > > > > all videocard hardware models and maintaining backward
> > > > > compatibility, so that the newest binary can work with even the
> > > > > old hardware?
> > > > >
> > > > > Maxim Shatskih, Windows DDK MVP
> > > > > StorageCraft Corporation
> > > > > xxxxx@storagecraft.com
> > > > > http://www.storagecraft.com
> > > >
> > > >
> > > > ----- Original Message -----
> > > > > From: Calvin Guan
> > > > > To: Windows System Software Devs Interest List
> > > > > Sent: Tuesday, November 18, 2003 4:02 AM
> > > > > Subject: [ntdev] Re: Flushing DMA Buffer Allocated with
> > > > > AllocateCommonBuffer
> > > > >
> > > > >
> > > > > Well, a video miniport is a lot of code-:).
> > > > > Our Radeon x86 free build miniport (ati2mtag.sys) is more than
> > > > > 600k. The chk build doesn’t fit on a floppy…
> > > > >
> > > > > Calvin Guan, Software Developer xxxxx@nospam.ati.com
> > > > > SW2D-Radeon NT Core Drivers
> > > > > ATI Technologies Inc.
> > > > > 1 Commerce Valley Drive East
> > > > > Markham, Ontario, Canada L3T 7X6
> > > > > Tel: (905) 882-2600 Ext. 8654
> > > > > Find a driver: http://www.ati.com/support/driver.html
> > > >
> > > >
> > > > > -----Original Message-----
> > > > > > From: Maxim S. Shatskih [mailto:xxxxx@storagecraft.com]
> > > > > > Sent: Monday, November 17, 2003 7:20 PM
> > > > > > To: Windows System Software Devs Interest List
> > > > > > Subject: [ntdev] Re: Flushing DMA Buffer Allocated with
> > > > > > AllocateCommonBuffer
> > > > > >
> > > > > >
> > > > > > > Miniport. For example, look at the Permedia P3 sample in the
> > > > > > > DDK, the DMA rendering is handled in the driver and not in
> > > > > > > the Miniport. There’s not
> > > > > >
> > > > > > Then why is nVidia’s miniport THIS huge (500KB or such)?
> > > > > >
> > > > > > Maxim Shatskih, Windows DDK MVP
> > > > > > StorageCraft Corporation
> > > > > > xxxxx@storagecraft.com
> > > > > > http://www.storagecraft.com
> > > > >
> > > > >
> > > > > —
> > > > > Questions? First check the Kernel Driver FAQ at
> > > > > http://www.osronline.com/article.cfm?id=256
> > > > >
> > > > > You are currently subscribed to ntdev as: xxxxx@ati.com
> > > > > To unsubscribe send a blank email to
> > > xxxxx@lists.osr.com
> > > > >
> > > >
> > > > —
> > > > Questions? First check the Kernel Driver FAQ at
> > > > http://www.osronline.com/article.cfm?id=256
> > > >
> > > > You are currently subscribed to ntdev as: xxxxx@storagecraft.com
> > > > To unsubscribe send a blank email to
> > > xxxxx@lists.osr.com
> > > >
> > > > —
> > > > Questions? First check the Kernel Driver FAQ at
> > > > http://www.osronline.com/article.cfm?id=256
> > > >
> > > > You are currently subscribed to ntdev as:
> > > xxxxx@compuware.com
> > > > To unsubscribe send a blank email to
> > > xxxxx@lists.osr.com
> > > >
> > > > —
> > > > Questions? First check the Kernel Driver FAQ at
> > > > http://www.osronline.com/article.cfm?id=256
> > > >
> > > > You are currently subscribed to ntdev as: xxxxx@ati.com
> > > > To unsubscribe send a blank email to
> > > xxxxx@lists.osr.com
> > > >
> > > >
> > > >
> > > >
> > > > The contents of this e-mail are intended for the named
> > > addressee only.
> > > It
> > > > contains information that may be confidential. Unless
> you are the
> > > named
> > > > addressee or an authorized designee, you may not copy or use it,
> or
> > > disclose
> > > > it to anyone else. If you received it in error please notify us
> > > immediately
> > > > and then destroy it.
> > > >
> > > >
> > > >
> > > >
> > > > —
> > > > Questions? First check the Kernel Driver FAQ at
> > > http://www.osronline.com/article.cfm?id=256
> > > >
> > > > You are currently subscribed to ntdev as: xxxxx@cbatson.com
> > > > To unsubscribe send a blank email to
> > > xxxxx@lists.osr.com
> > > >
> > >
> > >
> > > —
> > > Questions? First check the Kernel Driver FAQ at
> > > http://www.osronline.com/article.cfm?id=256
> > >
> > > You are currently subscribed to ntdev as:
> xxxxx@3dlabs.com
> > > To unsubscribe send a blank email to
> xxxxx@lists.osr.com
> > >
> >
> > —
> > Questions? First check the Kernel Driver FAQ at
> http://www.osronline.com/article.cfm?id=256
> >
> > You are currently subscribed to ntdev as: xxxxx@cbatson.com
> > To unsubscribe send a blank email to
> xxxxx@lists.osr.com
> >
>
>
> —
> Questions? First check the Kernel Driver FAQ at
> http://www.osronline.com/article.cfm?id=256
>
> You are currently subscribed to ntdev as: xxxxx@3dlabs.com
> To unsubscribe send a blank email to xxxxx@lists.osr.com