How can FILE_FS_VOLUME_INFORMATION's VolumeLabelLength be incorrect?

Hi,

I have a bit of userland code that’s been running OK for a few hundred thousand customers for the past year. Last week I had a user who has a SAN from DataPlow mounted on two Windows XP PCs, causing my code to fail. In particular, the VolumeLabelLength value of FILE_FS_VOLUME_INFORMATION returned is less than the actual length required, causing the code to go into an infinite loop.

The code that failed:

while (STATUS_BUFFER_OVERFLOW == (result = ddk.ZwQueryVolumeInformationFile(
           hPart.Handle, &statusBlock, volumeInformation.get(), size, informationClass)))
{
    // Need to allocate more space
    size = volumeInformation->VolumeLabelLength + sizeof(FILE_FS_VOLUME_INFORMATION);
    volumeInformation.reset((FILE_FS_VOLUME_INFORMATION*) new unsigned char[size]);
}

TCHAR *s = p->Label.GetBuffer(volumeInformation->VolumeLabelLength/sizeof(TCHAR) + 1);
memcpy(s, volumeInformation->VolumeLabel, volumeInformation->VolumeLabelLength);
s[volumeInformation->VolumeLabelLength/sizeof(TCHAR)] = _T('\0');
p->Label.ReleaseBuffer();
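(For reference, the NUL-termination step above boils down to the following portable sketch. std::u16string stands in for the CString wrapper here; that substitution is mine, not the original code.)

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Copy a counted (not NUL-terminated) UTF-16 label of `lengthBytes` bytes
// into a NUL-terminated string, as the CString snippet above does.
std::u16string labelFromCounted(const char16_t* label, std::size_t lengthBytes) {
    std::u16string out(lengthBytes / sizeof(char16_t), u'\0');
    std::memcpy(&out[0], label, lengthBytes);
    return out;  // std::u16string keeps its internal buffer NUL-terminated
}
```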

This results in an infinite loop: sizeof(FILE_FS_VOLUME_INFORMATION) plus the VolumeLabelLength returned after the failure is never >= the size actually required.

To work around this, I was forced to always allocate sizeof(FILE_FS_VOLUME_INFORMATION) + (MAX_PATH + 1) * sizeof(TCHAR).
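The sizing arithmetic of that workaround, as a portable sketch. The struct layout here is my reconstruction of FILE_FS_VOLUME_INFORMATION, not the DDK header verbatim, and MAX_PATH_LEN stands in for MAX_PATH:

```cpp
#include <cstddef>
#include <cstdint>

// Reconstruction of FILE_FS_VOLUME_INFORMATION (assumed layout, not the
// DDK header verbatim).
struct FileFsVolumeInformation {
    std::uint64_t VolumeCreationTime;
    std::uint32_t VolumeSerialNumber;
    std::uint32_t VolumeLabelLength;   // bytes, no terminating NUL
    std::uint8_t  SupportsObjects;
    char16_t      VolumeLabel[1];      // counted string continues in-place
};

constexpr std::size_t MAX_PATH_LEN = 260;  // stand-in for MAX_PATH

// Worst-case allocation: header plus MAX_PATH characters for the label,
// plus one more character for the NUL we append ourselves.
constexpr std::size_t worstCaseSize() {
    return sizeof(FileFsVolumeInformation)
         + (MAX_PATH_LEN + 1) * sizeof(char16_t);
}
```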

I wasn’t aware that the filesystem driver is the one to provide the value of VolumeLabelLength, and as such FILE_FS_VOLUME_INFORMATION is susceptible to returning incorrect lengths when used with buggy FS drivers…

If VolumeLabelLength is zero, your code is incorrect.

I don’t think so. Why would you say that?

What length label is getting returned with your “fixed” code? Remember
the label’s length will not include a NULL since this is a counted
string. Also, with your C++ wrappers around the calls, it is impossible
to tell what the code is really doing, so if you want help the wrapper
functions need to be provided.

I’ve gotten the FILE_FS_VOLUME_INFORMATION many times and the volume
label length has never been incorrect. I’m betting on your code being
wrong.

Don Burn (MVP, Windows DDK)
Windows Filesystem and Driver Consulting
Website: http://www.windrvr.com
Blog: http://msmvps.com/blogs/WinDrvr

“xxxxx@NeoSmart.net” wrote in message
news:xxxxx@ntdev:

> I don’t think so. Why would you say that?

That’s not a C++ function - just a function pointer to the dynamically loaded counterpart declared in the DDK, so the code works in userland and compiles in VS with MSVCRT.

Two things: the code above fully takes into account that VolumeLabelLength doesn’t include the terminating NULL, and I don’t think FILE_FS_VOLUME_INFORMATION even *has* terminating NULLs in the response. I’ve always had to terminate the strings manually using VolumeLabelLength.

I too have never had a problem with FSVINFO until now, and even then, only with this particular device (AFAICT). Did you ever use it on 3rd party filesystems? This isn’t anything standard, the FS name I get back is “DataPlowSFSZ”.

I don’t have the returned length (it wasn’t in the logs), though I can send the client another build that’ll print that info, if you guys think it’s absolutely necessary.

Sorry, my bad. Anyway, I suspect the buffer size should be rounded up to a multiple of 8.

I don’t think so, Alex. As a counted string, VolumeLabelLength is the only way (I think!) to know where to stick the NULL. So it needs to be precise with a resolution of no more than 1 character.

I’m afraid that particular custom FS driver may be misinterpreting VolumeLabelLength as number of 16-bit characters. Do some experiments. Change the volume label to different lengths and see in the debugger what you get.

Alex - I don’t care about this particular SAN or filesystem. I want to know why my code is failing.

I’m trying to see the bigger picture - let’s assume the developer thought it was the number of 16-bit characters and not the number of bytes required.

Assume the label is “SAN” - it’s 6 bytes or 3 16-bit characters. No NULL.

Now I’m asking the OS to give me the length. If the developer does it right, VolumeLabelLength should be 6, and I’ll stick the NULL at the (6/sizeof(TCHAR))th index, i.e. the 3rd. If the developer was wrong, it’ll be trimmed short - I’ll place the NULL at the (3/sizeof(TCHAR))th index - or 1.5 (1 or 2).

In the worst case scenario, I’ll end up with a partial volume label. But that doesn’t explain what’s happening. Basically, this is the scenario:

Me: How many bytes should I reserve for the label?
OS: Give me X.
Me: OK, here’s an X-sized buffer.
OS: No, that’s not enough. Give me an X-sized buffer.
Me: Umm… I already did.

I don’t care if I get a buffer bigger than needed or smaller than needed. I’m just wondering why the OS gives me a number and then decides it’s not enough. I, as a developer, don’t see how this can happen!
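Actually, working the numbers: if the driver reports characters where bytes are expected, the retry arithmetic never converges. A portable sketch (the struct layout is my reconstruction of FILE_FS_VOLUME_INFORMATION, not the DDK header verbatim):

```cpp
#include <cstddef>
#include <cstdint>

// Reconstruction of FILE_FS_VOLUME_INFORMATION, close enough for the math.
struct FileFsVolumeInformation {
    std::uint64_t VolumeCreationTime;
    std::uint32_t VolumeSerialNumber;
    std::uint32_t VolumeLabelLength;   // supposed to be BYTES
    std::uint8_t  SupportsObjects;
    char16_t      VolumeLabel[1];
};

// Size the retry loop allocates next, from the driver-reported length.
std::size_t nextBufferSize(std::size_t reportedBytes) {
    return sizeof(FileFsVolumeInformation) + reportedBytes;
}

// Size the filesystem actually needs before it stops returning
// STATUS_BUFFER_OVERFLOW: the header up to the label, plus the label.
std::size_t requiredSize(std::size_t actualLabelBytes) {
    return offsetof(FileFsVolumeInformation, VolumeLabel) + actualLabelBytes;
}
```

With the usual layout (sizeof = 24, offsetof(VolumeLabel) = 18) and a 12-character label like “DataPlowSFSZ” (24 bytes), a chars-for-bytes driver reports 12: the next buffer is 24 + 12 = 36, the requirement is 18 + 24 = 42, and the gap never closes on any iteration. Labels of 6 characters or fewer are absorbed by the struct’s own slack, which could be why only some volumes trigger the bug.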

The file system fills in this value, so you’re at the mercy of the file
system. See FatQueryFsVolumeInfo in the FASTFAT source for an example of the
FS completing this IRP. Pretty bad for it to have this bug though because
you’re right, it’s not clear what size buffer to send when you get the
error.

It would be interesting to know what length was returned by the driver.
Also, does Explorer show the correct label?

-scott


Scott Noone
Consulting Associate and Chief System Problem Analyst
OSR Open Systems Resources, Inc.
http://www.osronline.com


The OS doesn’t give you the length. It’s the custom filesystem driver that gives you a hint (VolumeLabelLength), which you’re supposed to use for the buffer length calculation. If the hint is incorrect, you’re screwed.

I don’t know about Explorer, but I have a screenshot of Disk Management (I presume they’d be the same). It looks right to me… which is odd.

Disk Management: http://grab.by/grabs/e53e522f78779b768d4a25e4e7f7a421.png

My problem was that I assumed the OS uses the same method as my own code to calculate the required size (sizeof(FSVINFO) + length of the string), so a bug in the FS would affect the OS and me equally. Apparently not, which means either the driver exposes a separate path for the OS, or it stores the string with a terminating NULL that the OS uses to size its buffer (I wish I knew which!). I’ll have to look at the code example you cited to figure that out.

I think the diskmgr doesn’t bother with a two-stage scheme, but simply allocates a big enough buffer up front (possibly on the stack).

Or else it calls in a loop with an expanding buffer.

IMHO this is slightly safer code because it does not assume that the data
you are trying to retrieve is non-volatile. If one assumes that it is
volatile, then the returned size becomes a hint only, and good code will
apply an expansion factor (constrained to a max size) to allow for the
probability that the value is now longer than it once was, so it can be
read without quiescence. Of course none of this should apply to this case,
but it is just good defensive programming.

The big buffer approach is less elegant, but probably more effective in the
real world of x64 and practically unlimited UM address space


NTDEV: The original code sample I posted would have worked even if the data I was trying to retrieve had changed. I was issuing the query in a while loop - as long as the result is an overflow, I reallocate and try again. This is *the* recommended approach when reading anything that might change between the time you query the size and the time you retrieve the data (values of registry keys, etc.). Except the “expanding buffer” relied solely on the returned size, which in this buggy case is just plain wrong.

I’ve used naive resizing in the past (grow 2x the buffer size each loop until 4096), but only in cases where the OS doesn’t return a hint as to what size you should be using. However, when the OS does return such a value, logically, one should never have to guess any longer.
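A hedged sketch of combining the two - trust the hint, but never shrink and always grow by at least 2x up to a hard cap, so a lying driver can’t spin the loop forever. The mockQuery function below is a stand-in of my own for ZwQueryVolumeInformationFile (which isn’t portable), rigged to under-report like the buggy FS:

```cpp
#include <algorithm>
#include <cstddef>

// Mock of the two-phase query: the real data needs 42 bytes, but - like
// the buggy driver - the reported hint is always smaller than required.
// Returns true on success, false for "buffer overflow" with *hint set.
bool mockQuery(std::size_t bufSize, std::size_t* hint) {
    const std::size_t required = 42;
    if (bufSize >= required) return true;
    *hint = 30;  // lies: never enough
    return false;
}

// Retry loop: start small, take the hint into account, but grow at least
// 2x per iteration and bail out at a hard cap instead of looping forever.
bool queryWithCappedGrowth(std::size_t* finalSize) {
    const std::size_t cap = 4096;
    std::size_t size = 16;
    while (size <= cap) {
        std::size_t hint = 0;
        if (mockQuery(size, &hint)) { *finalSize = size; return true; }
        size = std::max(hint, size * 2);  // hint may lie; doubling saves us
    }
    return false;  // give up rather than spin forever
}
```

Even though the mock never reports the true requirement, the loop terminates: 16 → 32 (hint 30 is ignored as smaller) → 64 ≥ 42, success.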