Why is LRU used for page replacement? Because at IBM, in the early 1970s,
one researcher asked “What if we knew the future?” and to that end
instrumented the OS to track page references. He then took the trace (to
give you an idea of when this was done, the trace was on a magtape) and
built a simulator that could look into the “future”, decide which resident
page’s next reference was furthest away, and simulate paging that one out.
The optimal policy performed only about 5% better than LRU. Since tracking
past behavior was easy, and predicting the future was somewhat harder, LRU
became the algorithm of choice.
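For the flavor of that experiment (this is not the IBM study, just a toy),
here is a minimal sketch in C that replays a made-up reference trace
through both LRU and the look-ahead “evict the page whose next reference
is furthest away” policy and counts the faults each one takes. The frame
count and trace are invented, so don’t expect the ~5% gap from the real
trace; the point is only what the look-ahead policy does differently.

/* lru_vs_opt.c - compare LRU and "furthest next reference" paging
 * on a tiny, made-up trace. Build with: cc lru_vs_opt.c */
#include <stdio.h>

#define FRAMES    3
#define TRACE_LEN 12

static const int trace[TRACE_LEN] = { 1,2,3,4,1,2,5,1,2,3,4,5 };

static int simulate(int use_opt)
{
    int frames[FRAMES];         /* pages currently resident */
    int last_used[FRAMES];      /* for LRU: time of most recent reference */
    int used = 0, faults = 0;

    for (int t = 0; t < TRACE_LEN; t++) {
        int page = trace[t], hit = -1;

        for (int i = 0; i < used; i++)
            if (frames[i] == page) { hit = i; break; }

        if (hit >= 0) {                  /* page already resident */
            last_used[hit] = t;
            continue;
        }

        faults++;
        if (used < FRAMES) {             /* a free frame is available */
            frames[used] = page;
            last_used[used] = t;
            used++;
            continue;
        }

        int victim = 0;
        if (use_opt) {
            /* look-ahead: evict the resident page whose next reference
             * is furthest in the future (or never happens again) */
            int worst = -1;
            for (int i = 0; i < FRAMES; i++) {
                int next = TRACE_LEN + 1;       /* "never used again" */
                for (int j = t + 1; j < TRACE_LEN; j++)
                    if (trace[j] == frames[i]) { next = j; break; }
                if (next > worst) { worst = next; victim = i; }
            }
        } else {
            /* LRU: evict the resident page referenced longest ago */
            for (int i = 1; i < FRAMES; i++)
                if (last_used[i] < last_used[victim]) victim = i;
        }
        frames[victim] = page;
        last_used[victim] = t;
    }
    return faults;
}

int main(void)
{
    printf("LRU faults: %d\n", simulate(0));
    printf("OPT faults: %d\n", simulate(1));
    return 0;
}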
A friend of mine did his PhD dissertation on optimal register allocation
in a compiler. There is a known-best algorithm for doing this that
involves arithmetic on gigantic binary matrices at each value creation
point. I don’t recall the complexity, but it was certainly exponential;
he used to run overnight batch jobs to compile tiny subroutines. The
compiler we used for production, however, had a heuristic algorithm whose
complexity was linear in the number of value creations. As with LRU versus
the optimal pager, his optimal allocator produced code only about 5% better
than the production compiler’s. So the linear heuristic was “good enough”.
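The post doesn’t say what the production heuristic actually was; purely as
an illustration of what a “linear in the number of value creations” pass
can look like, here is a toy linear-scan style allocator over live
intervals. The interval data and register count are invented.

/* linear_scan.c - toy linear-time allocator: walk values in creation
 * order, hand out a free register when one exists, otherwise spill
 * whichever live value's interval ends last. */
#include <stdio.h>

#define NUM_REGS 2

struct interval {
    const char *name;
    int start, end;             /* creation point and last use */
    int reg;                    /* assigned register, or -1 = spilled */
};

int main(void)
{
    /* invented live intervals, already sorted by creation point */
    struct interval vals[] = {
        { "a", 0, 6, -1 }, { "b", 1, 3, -1 },
        { "c", 2, 8, -1 }, { "d", 4, 7, -1 },
    };
    int n = (int)(sizeof vals / sizeof vals[0]);
    struct interval *active[NUM_REGS] = { 0 };

    for (int i = 0; i < n; i++) {
        int free_reg = -1;

        /* expire any value whose last use is before this creation point */
        for (int r = 0; r < NUM_REGS; r++) {
            if (active[r] && active[r]->end < vals[i].start)
                active[r] = NULL;
            if (!active[r] && free_reg < 0)
                free_reg = r;
        }

        if (free_reg >= 0) {
            vals[i].reg = free_reg;
            active[free_reg] = &vals[i];
        } else {
            /* all registers busy: spill the value that lives longest */
            int worst = 0;
            for (int r = 1; r < NUM_REGS; r++)
                if (active[r]->end > active[worst]->end) worst = r;
            if (active[worst]->end > vals[i].end) {
                vals[i].reg = active[worst]->reg;   /* steal its register */
                active[worst]->reg = -1;            /* old holder spilled */
                active[worst] = &vals[i];
            }
            /* otherwise the new value itself stays spilled (reg == -1) */
        }
    }

    for (int i = 0; i < n; i++) {
        if (vals[i].reg >= 0)
            printf("%s -> r%d\n", vals[i].name, vals[i].reg);
        else
            printf("%s -> spill\n", vals[i].name);
    }
    return 0;
}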
Do we have any similar studies showing that aggressive page-out gives
performance comparable to lazy page-out? The experience of many of us
suggests that aggressive page-out produces serious, user-perceptible
degradation of response, particularly when memory is abundant. We already
know that techniques that optimize server performance differ from
techniques that optimize highly interactive environments. An algorithm
that improves performance under high memory pressure appears to be less
successful when memory pressure is low, which suggests the system should
adapt to its environment rather than using a single algorithm for both
2GB phones running a handful of apps and 16GB workstations running
50-100 apps.
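To make “adapt to its environment” concrete, here is a minimal sketch of
the idea. This is not any real Windows policy, and the thresholds and
numbers are invented: the only point is that the trimming aggressiveness
is chosen from the observed memory pressure rather than fixed.

/* adaptive_trim.c - pick page-out aggressiveness from current memory
 * pressure instead of using one fixed policy everywhere. */
#include <stdio.h>

enum trim_policy { TRIM_NONE, TRIM_LAZY, TRIM_AGGRESSIVE };

static enum trim_policy choose_trim_policy(unsigned long long total_mb,
                                           unsigned long long avail_mb)
{
    unsigned pct_free = (unsigned)(avail_mb * 100 / total_mb);

    if (pct_free > 50)          /* memory abundant: leave working sets alone */
        return TRIM_NONE;
    if (pct_free > 15)          /* moderate pressure: trim lazily */
        return TRIM_LAZY;
    return TRIM_AGGRESSIVE;     /* real pressure: reclaim hard */
}

static const char *name(enum trim_policy p)
{
    return p == TRIM_NONE ? "none" : p == TRIM_LAZY ? "lazy" : "aggressive";
}

int main(void)
{
    /* a 2GB phone under load vs a 16GB workstation with plenty free */
    printf("2GB phone, 256MB free:      %s\n",
           name(choose_trim_policy(2048, 256)));
    printf("16GB workstation, 9GB free: %s\n",
           name(choose_trim_policy(16384, 9216)));
    return 0;
}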
I am not optimistic; look how many years it took to get GDI to stop
failing when lots of apps needed lots of GDI space, and the same limits
were used for multicore 4GB Win32 machines as had been used for 2GB Win16
machines.
joe
@M M:
> As all of the veterans here know, page trimming and replacement as well
> as cache balancing algorithm design makes NP hard issues seem trivial.
This is one of those problems that don’t have to have the “best” solution.
They need to be good enough and, most important, avoid pathological
behaviors. Pathological cases usually happen when the programmer tries to
make the behavior too smart for its own good, or too aggressive.
Avoiding pathological cases matters more than getting the best
performance. The “best” performance shows up only in artificial benchmarks
anyway, and a gain of a few percent is below the noise; users will never
notice it.
I’ve read about a typical case of misguided optimization. A programmer’s
blog on MSDN had an article about converting the PDB write component (in
LINK) to be multi-threaded, because measurements showed that it took
significant time. Not once did the programmer ask themselves: “Why would a
glorified binary formatting routine be noticeably slow? Could it be
because LINK writes a PDB as a compressed file? Could its write pattern
cause excessive compression overhead?”