storport miniport & protection information

OSR_Community_User · May 6, 2014, 9:50am

Does Windows have any infrastructure that lets an application, file system, or other storage client access protection information above the storport layer?

There are various adapters that support DIF/DIX, including NVMe, and it’s possible to implement some rudimentary checking in a storport miniport, i.e. generate information on write, verify on read, return generic I/O error status on mismatch.

However that’s not end-to-end protection: I’ve failed to find any way for the miniport to notify the client that the reason the I/O failed was due to invalid protection info, or for the client even to request/be aware this is going on.

Is any of this on Microsoft’s roadmap somewhere?

Peter_Viscarola_OSR · May 6, 2014, 10:08am

??

I don’t have my NVMe spec to hand, but I find it difficult to believe there’s not a way for a client to know about an end-to-end protection failure.

I’m sorry to doubt you, Mr. Green, and when I get into the office I’ll check my NVMe spec. But it would be *very* surprising to me if NVMe doesn’t describe a specific return for this.

Of course, the OTHER issue would be “Who’s making NVMe devices that support end-to-end data protection”… but that’s a different discussion entirely.

Peter
OSR
@OSRDrivers

OSR_Community_User · May 6, 2014, 10:39am

NVMe does indeed specify status codes for errors in the protection information.
That wasn’t my question: I want to know whether Windows has any infrastructure/API *above* the port driver to communicate such errors to the application, file system, etc.
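
To make the gap concrete, here is a rough sketch in Python (illustration only, not driver code) of the classification a miniport can do internally. The three status code values are the end-to-end check errors as I recall them from the NVMe specification's Media and Data Integrity Errors status code type, so verify them against your copy of the spec. The function name `classify_pi_error` is made up for this example.

```python
# Status Code Type 2h: Media and Data Integrity Errors (per the NVMe spec as
# recalled here; check against the specification before relying on it).
SCT_MEDIA_AND_DATA_INTEGRITY = 0x2

# Status codes within that type that report a protection information failure.
PI_STATUS_CODES = {
    0x82: "End-to-end Guard Check Error",
    0x83: "End-to-end Application Tag Check Error",
    0x84: "End-to-end Reference Tag Check Error",
}


def classify_pi_error(sct, sc):
    """Return the name of the PI failure, or None if the status is not one.

    classify_pi_error is a hypothetical helper, not part of any Windows API.
    """
    if sct == SCT_MEDIA_AND_DATA_INTEGRITY:
        return PI_STATUS_CODES.get(sc)
    return None
```

The miniport can tell these three cases apart; what I can't find is a way to pass that distinction to whoever issued the I/O.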

OSR_Community_User · May 13, 2014, 9:45am

I guess I might be ploughing a new furrow here :-/

I’d be interested in any suggestions anyone might have for how to deal with protection information failures.

[Quick summary: NVMe specifies a means by which the driver can supply additional metadata with every block written. This metadata is often 8 bytes in the form specified by T10-DIF (Type 1 is basically a CRC plus the LBA). On readback, the controller can verify this metadata and return specific errors on a mismatch. The idea is to catch data corruption that occurs between a write and a subsequent read.]
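
For anyone who hasn't met the format, a minimal Python sketch of the 8-byte Type 1 tuple follows. It assumes the usual T10-DIF layout (2-byte guard CRC, 2-byte application tag, 4-byte reference tag holding the low 32 bits of the LBA, all big-endian) and the CRC-16/T10-DIF parameters (polynomial 0x8BB7, initial value 0). The helper names `make_pi` and `check_pi` are invented for illustration; a real controller does this in hardware.

```python
import struct


def crc16_t10dif(data):
    """Bitwise CRC-16/T10-DIF: poly 0x8BB7, init 0, no reflection, no final XOR."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x8BB7) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def make_pi(block, lba, app_tag=0):
    """Build the 8-byte tuple on write: guard, application tag, reference tag."""
    return struct.pack(">HHI", crc16_t10dif(block), app_tag, lba & 0xFFFFFFFF)


def check_pi(block, lba, pi):
    """Verify on read. Returns None if OK, else the name of the failed check."""
    guard, _app_tag, ref_tag = struct.unpack(">HHI", pi)
    if guard != crc16_t10dif(block):
        return "guard"
    if ref_tag != (lba & 0xFFFFFFFF):
        return "reference tag"
    return None
```

A guard failure means the data changed after the CRC was generated; a reference tag failure means the block came back from (or was written to) the wrong LBA.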

My take is that the miniport is not the place to decide what to do on a mismatch, the error should be kicked upstairs to the client, which should be in a better position to decide how to handle it. However I don’t see any framework in storport or disk.sys for doing so. Unless I’m mistaken, my options are:-

1. Return an I/O error. This seems a bit brutal: there may be cases where the client wants the data anyway, since perhaps only one bit is corrupt. It might also prevent reads of LBAs that haven’t yet been written.
2. Log an event, so the user is at least aware that this is happening.
3. Just don’t bother. Hmmm… tempting…
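
Purely to pin down what I mean by the three options, here they are as a toy Python model. Everything in it (the `Policy` enum, `PiHandler`, the status strings) is hypothetical and stands in for whatever a real miniport would do at completion time; none of it is a storport interface.

```python
from collections import Counter
from enum import Enum


class Policy(Enum):
    FAIL_IO = 1      # option 1: return an I/O error
    LOG_ONLY = 2     # option 2: log an event, still return the data
    IGNORE = 3       # option 3: don't bother


class PiHandler:
    """Toy model of a read completion path (hypothetical, for illustration)."""

    def __init__(self, policy):
        self.policy = policy
        self.mismatches = Counter()  # per-check counters
        self.events = []             # stand-in for an event log

    def complete_read(self, lba, data, mismatch):
        """mismatch is None, or a string naming the check that failed."""
        if mismatch is None or self.policy is Policy.IGNORE:
            return ("SUCCESS", data)
        self.mismatches[mismatch] += 1
        if self.policy is Policy.FAIL_IO:
            # The client sees only a generic error and never gets the data.
            return ("IO_ERROR", None)
        self.events.append("PI %s mismatch at LBA %d" % (mismatch, lba))
        return ("SUCCESS", data)
```

Note that under option 1 the client cannot distinguish a protection failure from any other I/O error, which is exactly my complaint.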

Your thoughts & suggestions, as always, much appreciated.

Peter_Viscarola_OSR · May 13, 2014, 10:03am

This is an error in which I’m quite interested as well, Mr. Green.

I *suspect* that these types of issues need to be handled in a vendor-specific manner. In my view of the world, this category of issue is similar to other vendor-specific activities such as Namespace creation. You’ll need a vendor-specific utility to *create* a Namespace with end-to-end protection… perhaps you’ll have a vendor-specific service/policy that works in coordination with a vendor-supplied NVMe driver to handle end-to-end protection errors the way you want (log them, or whatever).

I can’t say I’ve explored this issue specifically with the MSFT people responsible (I probably should, I know)… but when I *have* discussed the general category of issue, what I’ve heard amounts to what I said above.

We’re seeing the industry repeat exactly what happened with SATA once again with NVMe: We have an agreed specification and a reasonably capable MSFT-supplied driver that works out of the box… and there’ll be a whole raft of vendor-supplied drivers with unique features each seeking to add value and create a competitive advantage. Some will be faster than the MSFT-supplied driver, some slower, some more capable, probably many less reliable.

Interesting issue,

Peter
OSR
@OSRDrivers

Don_Burn · May 13, 2014, 10:32am

As Peter points out, this should probably be a namespace policy. Note, there
are precedents for this: the first RAID controllers from Compaq years ago
used larger-than-normal blocks for the same purpose, and at least one of the
fault-tolerant companies was doing this back in the 1980s. At a minimum I
would use WMI to report the number of “errors”.

Don Burn
Windows Filesystem and Driver Consulting
Website: http://www.windrvr.com
Blog: http://msmvps.com/blogs/WinDrvr

Peter_Viscarola_OSR · May 13, 2014, 10:51am

Well, sure. As Mr. Green pointed out, the NVMe work stems from the work in T10 (the SCSI committee).

This is still *very* common in enterprise-level RAID arrays. You can order SAS and SCSI drives that are factory formatted with 520-byte blocks. They’re commodity items but, given that they’re exclusively for the enterprise market, they’re quite expensive.

The extra 8 bytes are for the checksum data Mr. Green is referring to.
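
As a quick illustration of the arithmetic (520 = 512 data bytes + 8 metadata bytes), here is a small Python sketch that splits one such sector. It assumes the 8 bytes use the T10-DIF layout of a 2-byte guard, 2-byte application tag and 4-byte reference tag, big-endian; arrays that use the extra bytes for their own proprietary checksum will differ. `split_sector` is just a name I made up.

```python
import struct

DATA_BYTES = 512
PI_BYTES = 8


def split_sector(raw):
    """Split a 520-byte sector into its data and T10-DIF style fields."""
    if len(raw) != DATA_BYTES + PI_BYTES:
        raise ValueError("expected a %d-byte sector" % (DATA_BYTES + PI_BYTES))
    data, pi = raw[:DATA_BYTES], raw[DATA_BYTES:]
    guard, app_tag, ref_tag = struct.unpack(">HHI", pi)
    return data, guard, app_tag, ref_tag
```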

Peter
OSR
@OSRDrivers