Re: [ntdev] Short reads on FSDs and disks

It is obvious that any user-mode (UM) handle that relates to a stream can complete short reads. Sockets and pipes are the obvious stream handles; files are a less obvious case (think POSIX).

Block-level devices ordinarily will not complete a short read; they fail the request instead. They often fail any unaligned request as well.
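
To make the "fail rather than return short" policy concrete, here is a minimal sketch in Python. The function name, parameters and reason strings are all made up for illustration; this is not how any particular Windows driver does it, and which layer actually performs such a check is exactly the open question in this thread.

```python
def check_block_request(offset, length, sector_size, capacity):
    """Return None if the request would be accepted, else a reason string.

    Hypothetical helper: all sizes are in bytes, and `capacity` is the
    total size of the device or partition.
    """
    if offset < 0 or length <= 0:
        return "invalid parameter"
    # A block device typically wants whole sectors, starting on a sector boundary.
    if offset % sector_size != 0 or length % sector_size != 0:
        return "unaligned request"
    # No partial transfer: a request that crosses the end is failed outright.
    if offset + length > capacity:
        return "beyond end of device"
    return None
```

The point of the sketch is the last check: the request is rejected as a whole instead of being trimmed to whatever fits, which is the opposite of stream behaviour.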

Sent from Surface Pro

From: xxxxx@osr.com
Sent: Monday, January 12, 2015 9:12 PM
To: Windows System Software Devs Interest List

(thanks to Max for the most interesting question in weeks)

You’d think all of us big experts would know the answer to this simple question right off the top of our heads, wouldn’t you? LOL…

Suppose you try to read beyond the disk’s capacity? I mean, who checks that? Does the request get to the controller?

I just don’t remember. Back in the day, I *seem* to remember that disk or partition checked to see if you attempted to read past the end of the current partition. But that code has definitely changed since the last time I paid any attention to it.

Peter
OSR
@OSRDrivers


NTDEV is sponsored by OSR

Visit the list at: http://www.osronline.com/showlists.cfm?list=ntdev

OSR is HIRING!! See http://www.osr.com/careers

For our schedule of WDF, WDM, debugging and other seminars visit:
http://www.osr.com/seminars

To unsubscribe, visit the List Server section of OSR Online at http://www.osronline.com/page.cfm?name=ListServer

In POSIX a file is a stream, and any stream can legally give a short read. Correctly written code usually does not care whether the stream is a file, a pipe or paper tape.

The Win32 API supports this construct as well, but certain design patterns rely on the assumption that short reads do not occur. IIRC, since Vista the MSDN documentation has changed to state this behaviour explicitly, along with IOCP completion notifications and other details that previously had to be assumed in order to write high-performance applications (think SQL Server).

Sent from Surface Pro

From: Maxim S. Shatskih
Sent: Tuesday, January 13, 2015 9:55 AM
To: Windows System Software Devs Interest List

So, if you’re sequentially reading through a file and you are returned less data than you asked for,
but you don’t get an “end of file” error, you just keep reading until you get zero bytes and an end of
file error??

Yes. On POSIX, yes.

There are web resources where developers are warned about this.

My Linux bug was: I have some “chunk headers” inside the file.

If EOF hits in the middle of the Nth chunk header, thus making the header truncated - then the file is corrupt.

And my code just read ChunkHeaderSize bytes and failed on a short read, reporting the file as corrupt.

This seems (I’m now not sure even about this!) to be correct on Windows.

But, on Linux, the OS can return a short read on my chunk header read, and then a valid file is considered to be broken. What’s more, this occurs only sometimes :-)

All of this is related to Linux signals in some way. A signal can cause a short read.
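
For what it’s worth, the usual fix for this kind of bug is a loop that treats only a zero-byte read as end of file. A minimal sketch in Python follows; `read_exact`, `read_chunk_header` and `CHUNK_HEADER_SIZE` are names invented here for illustration, not taken from my actual code.

```python
import os

CHUNK_HEADER_SIZE = 16  # hypothetical size, for illustration only


def read_exact(fd, size):
    """Keep reading until `size` bytes arrive or a zero-byte read signals EOF."""
    chunks = []
    remaining = size
    while remaining > 0:
        data = os.read(fd, remaining)  # may legally return fewer bytes than asked
        if not data:                   # zero bytes: real end of file
            break
        chunks.append(data)
        remaining -= len(data)
    return b"".join(chunks)


def read_chunk_header(fd):
    """Return the next header, None at a clean EOF, or raise if it is truncated."""
    header = read_exact(fd, CHUNK_HEADER_SIZE)
    if len(header) == 0:
        return None  # EOF exactly on a chunk boundary
    if len(header) < CHUNK_HEADER_SIZE:
        raise ValueError("truncated chunk header: file is corrupt")
    return header
```

With this shape, a short read in the middle of a header just causes another read, and the file is reported corrupt only when EOF really does fall inside the header.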


Maxim S. Shatskih
Microsoft MVP on File System And Storage
xxxxx@storagecraft.com
http://www.storagecraft.com



In addition to the effects observed here with respect to single-disk perf, SAN and iSCSI topologies can have thousands of disks, tiered storage and asymmetric I/O paths, all of which can have radical effects on the performance of specific transfers. Virtualization can also add another layer of complexity.

Sent from Surface Pro

From: xxxxx@osr.com
Sent: Tuesday, January 13, 2015 5:49 PM
To: Windows System Software Devs Interest List

[Oh, I feel it coming]

That should have read “sophisticated”… with the quotes. I would have said “outdated and meddlesome” myself.

And I’m not trying to be cute, or take a random dump on Linux here.

But we’ve worked very closely with a *lot* of storage vendors and they all, unanimously, want the OS to do as little pre-write “optimization” as possible.

The type of I/O scheduling that I understand Linux does is based upon some really ancient assumptions. *I* did that sort of coalescing, next sector first, elevator service, nearest sector first with a fairness count… heck, back in the days of the PDP-11. I believe that, and ST-506 disks on PCs, was the last time this type of optimization made real sense.
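
For anyone who hasn’t seen it, “elevator service” looks roughly like the sketch below. This is an illustrative toy in Python, not code from Linux or from any driver I wrote; `elevator_order` and its parameters are made-up names.

```python
def elevator_order(pending, head, ascending=True):
    """Order pending sector numbers as one elevator-style sweep.

    Requests in the current direction of travel are served first, in
    order; the sweep then reverses and picks up the rest on the way back.
    Assumes the sector number reflects physical position on the media.
    """
    ahead = sorted(s for s in pending if s >= head)                  # at or above the head
    behind = sorted((s for s in pending if s < head), reverse=True)  # below the head
    return ahead + behind if ascending else behind + ahead


# Head at sector 53, sweeping upward first.
print(elevator_order([98, 183, 37, 122, 14, 124, 65, 67], head=53))
# prints [65, 67, 98, 122, 124, 183, 37, 14]
```

Note the assumption in the docstring: the ordering only helps if adjacent sector numbers are physically adjacent.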

These days, there’s darn little that you can count on in terms of disk layout. It’s better to just jam as many requests down to the disk’s control logic as possible (hundreds of simultaneous operations is great) and let the disk figure out what’s best for it based on what it knows about the media.

If you haven’t read it (it’s several years old) and didn’t see it in our pre-Christmas Tweet, anyone interested in this topic should check out the paper entitled “Disks Are Like Snowflakes: No Two Are Alike” (http://www.pdl.cmu.edu/PDL-FTP/Storage/CMU-PDL-11-102.pdf).

I’d be curious if anybody knows why this type of optimization remains in Linux. I know they’re not reluctant to change stuff that’s outdated, and that probably means they think this type of optimization is “worth it”… But I’d like to hear what the current argument is.

Peter
OSR
@OSRDrivers

