Curious whether anyone else has run into this problem before. It’s not a
kernel-level issue, but the answer probably involves some low-level system
details:
I’ve got a chunk of code that, on Unix, fseeks around a large file,
fread’ing lots of little pieces of it. In this case, a ‘large file’ is
anywhere from 180MB to 350MB and ‘little pieces’ means about 125 bytes. On
Windows, I instead memory-map the file and then just use memcpy instead of
fseek/fread. The Unix approach was chosen because it is simple to code and
maintain and is plenty fast. The Windows approach was chosen after seeing
that fseek/fread was dog-meat slow. The memory-mapped file approach brings
performance on Windows in line with Unix.
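For concreteness, the two code paths look roughly like this (a minimal
sketch; read_record_stdio and read_record_mapped are placeholder names of
mine, and the ~125-byte record length is assumed):

#include <stdio.h>
#include <string.h>

/* Unix path: seek to each record, then fread ~125 bytes of it. */
static int read_record_stdio(FILE *fp, long offset, void *buf, size_t len)
{
    if (fseek(fp, offset, SEEK_SET) != 0)
        return -1;
    return fread(buf, 1, len, fp) == len ? 0 : -1;
}

/* Windows path: with the whole file mapped at 'base', the same read
 * collapses to a memcpy straight out of the view. */
static void read_record_mapped(const char *base, size_t offset,
                               void *buf, size_t len)
{
    memcpy(buf, base + offset, len);
}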
A problem arises when I use a really large file, around 1.3GB. What fails
is the MapViewOfFile call, because there isn’t a contiguous 1.3GB run of
free virtual address space left in the process. I could get fancy and start
mapping several smaller views of the file (sketched below), but I’d like to
investigate simpler things first.
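For what it’s worth, the fancy version would look something like this
(a sketch; map_window is a placeholder name of mine — the real constraint
is just that view offsets be multiples of the allocation granularity):

#include <windows.h>

/* Sketch of the "several smaller views" idea: instead of mapping all
 * 1.3GB, map a window around the wanted offset.  View offsets must be
 * multiples of the system allocation granularity (64KB on x86), so
 * round the offset down and hand back a pointer adjusted for the slop. */
static void *map_window(HANDLE hMapping, unsigned __int64 offset,
                        DWORD window, char **record /* out */)
{
    SYSTEM_INFO si;
    GetSystemInfo(&si);

    unsigned __int64 base = offset - (offset % si.dwAllocationGranularity);
    char *view = MapViewOfFile(hMapping, FILE_MAP_READ,
                               (DWORD)(base >> 32), (DWORD)base,
                               window + (SIZE_T)(offset - base));
    if (view != NULL)
        *record = view + (offset - base);
    return view;    /* caller unmaps with UnmapViewOfFile() when done */
}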
I went back and tried the fseek/fread approach on Windows again and got the
same lousy performance. Wondering if the CRT was to blame, I redid it with
SetFilePointer/ReadFile, and that was no better.
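The raw Win32 version was essentially the stdio one transliterated
(a sketch, with the file handle and record length assumed):

#include <windows.h>

/* Win32 variant: SetFilePointer to the record, then ReadFile it. */
static BOOL read_record_win32(HANDLE hFile, LONG offset,
                              void *buf, DWORD len)
{
    DWORD got;
    if (SetFilePointer(hFile, offset, NULL, FILE_BEGIN)
            == INVALID_SET_FILE_POINTER)
        return FALSE;
    return ReadFile(hFile, buf, len, &got, NULL) && got == len;
}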
I’m curious why, in this case, there is such a discrepancy. In all three
cases, shouldn’t the benefits of the file-system cache take over? At some
abstract level, all three approaches are doing the same seeking and reading
into a memory buffer. I know that the memory-mapped file mechanism is more
primitive, but I can’t see where the gains come from in practice.
Can anyone enlighten me? Any suggestions for getting the fast I/O I need
without resorting to memory-mapping the big file?
db