Crash Help

Hello everyone. I'm getting a bizarre, random crash when I transfer files into my virtual file system driver over the network. The crash dump is at the end of the post.

This does not happen at all if I just generate files locally. Until recently, we did not support security. Now we store security for each file in self-relative format (once per unique SD). What I ultimately don't get is that, for now, all the files have the same default SD (DACLs for Local Administrator, Local Users and SYSTEM), and we return the same value hundreds to hundreds of thousands of times without crashing.

Between the calls to srv!SrvRetrieveMaximalAccessRightsForUser and SeAccessCheck+0xc5, the stack flows into my FSD twice for IRP_MJ_QUERY_SECURITY calls.

We take the cached SD and pass it through SeQuerySecurityDescriptorInfo. The buffer is almost always too small, so the call returns STATUS_BUFFER_TOO_SMALL. We then change the error code to STATUS_BUFFER_OVERFLOW and put the correct size returned from the call in Irp->IoStatus.Information. (We also reset the size in IrpSp->Parameters.QuerySecurity.Length to the same value.)
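The size negotiation described above can be modeled in user mode. This is only a sketch under stated assumptions: `model_query_sd` is a hypothetical stand-in for SeQuerySecurityDescriptorInfo, and `query_result` stands in for the Irp->IoStatus fields. The real dispatch routine works on the IRP and is not shown here. The NTSTATUS values are the ones from ntstatus.h.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* NTSTATUS values as defined in ntstatus.h. */
#define STATUS_SUCCESS          0x00000000u
#define STATUS_BUFFER_OVERFLOW  0x80000005u
#define STATUS_BUFFER_TOO_SMALL 0xC0000023u

/* Hypothetical model of what the dispatch routine reports back. */
typedef struct {
    uint32_t status;      /* stands in for Irp->IoStatus.Status */
    uint32_t information; /* stands in for Irp->IoStatus.Information */
} query_result;

/* Hypothetical stand-in for SeQuerySecurityDescriptorInfo: copies the
   descriptor if it fits, otherwise reports the required size in *length. */
static uint32_t model_query_sd(const uint8_t *sd, uint32_t sd_len,
                               uint8_t *buf, uint32_t *length)
{
    if (*length < sd_len) {
        *length = sd_len;
        return STATUS_BUFFER_TOO_SMALL;
    }
    memcpy(buf, sd, sd_len);
    *length = sd_len;
    return STATUS_SUCCESS;
}

/* Models the translation described in the post: TOO_SMALL becomes
   OVERFLOW, and the required size is reported to the caller. */
query_result model_query_security(const uint8_t *sd, uint32_t sd_len,
                                  uint8_t *buf, uint32_t buf_len)
{
    query_result r;
    uint32_t length = buf_len;

    assert(sd != NULL);
    r.status = model_query_sd(sd, sd_len, buf, &length);
    if (r.status == STATUS_BUFFER_TOO_SMALL)
        r.status = STATUS_BUFFER_OVERFLOW;
    r.information = length;
    return r;
}
```

In this model a caller is expected to retry with a buffer of at least `information` bytes after seeing STATUS_BUFFER_OVERFLOW. On a first pass that fails this way, nothing has been copied into the caller's buffer.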

I did a trace using DbgPrint, and the most bizarre thing is that when the crash happens, the crashing thread never re-entered our driver after the first IRP_MJ_QUERY_SECURITY call!

Any help would be appreciated! It's sort of tough to debug assembly. It seems to me that the register might be zero since we obviously did not fill it in with the PSECURITY_DESCRIPTOR.


SYSTEM_THREAD_EXCEPTION_NOT_HANDLED (7e)
This is a very common bugcheck. Usually the exception address pinpoints
the driver/function that caused the problem. Always note this address
as well as the link date of the driver/image that contains this address.
Arguments:
Arg1: c0000005, The exception code that was not handled
Arg2: 8093a39f, The address that the exception occurred at
Arg3: b9219af0, Exception Record Address
Arg4: b92197ec, Context Record Address

Debugging Details:

EXCEPTION_CODE: (NTSTATUS) 0xc0000005 - The instruction at "0x%08lx" referenced memory at "0x%08lx". The memory could not be "%s".

FAULTING_IP:
nt!SepSidInToken+24
8093a39f 0fb64301 movzx eax,byte ptr [ebx+1]

EXCEPTION_RECORD: b9219af0 -- (.exr 0xffffffffb9219af0)
ExceptionAddress: 8093a39f (nt!SepSidInToken+0x00000024)
ExceptionCode: c0000005 (Access violation)
ExceptionFlags: 00000000
NumberParameters: 2
Parameter[0]: 00000000
Parameter[1]: 00000001
Attempt to read from address 00000001

CONTEXT: b92197ec -- (.cxr 0xffffffffb92197ec)
eax=898ebc34 ebx=00000000 ecx=899a3000 edx=00000000 esi=e11c13c0 edi=00000000
eip=8093a39f esp=b9219bb8 ebp=b9219bc4 iopl=0 nv up ei pl zr na pe nc
cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000 efl=00010246
nt!SepSidInToken+0x24:
8093a39f 0fb64301 movzx eax,byte ptr [ebx+1] ds:0023:00000001=??
Resetting default scope

PROCESS_NAME: System

CURRENT_IRQL: 0

ERROR_CODE: (NTSTATUS) 0xc0000005 - The instruction at "0x%08lx" referenced memory at "0x%08lx". The memory could not be "%s".

READ_ADDRESS: 00000001

BUGCHECK_STR: 0x7E

DEFAULT_BUCKET_ID: NULL_CLASS_PTR_DEREFERENCE

LAST_CONTROL_TRANSFER: from 8093a35e to 8093a39f

STACK_TEXT:
b9219bc4 8093a35e e11c13c0 00000000 00000000 nt!SepSidInToken+0x24
b9219be8 8092b179 e11c13c0 898ebc34 00000001 nt!SepTokenIsOwner+0x4b
b9219c08 b9dd3770 898ebc34 b9219c44 00000000 nt!SeAccessCheck+0xc5
b9219c58 b9dd3625 00000000 898ebc34 b9219c7c srv!SrvRetrieveMaximalAccessRightsForUser+0x68
b9219c98 b9dd36bc 8996a440 b9219cb0 b9219cac srv!SrvRetrieveMaximalAccessRights+0xc0
b9219cb4 b9dd19cf 8996a440 898e7bad 898e7bb1 srv!SrvUpdateMaximalAccessRightsInResponse+0x1f
b9219d38 b9dd1bff b9dc380c 8996a440 89e56a98 srv!GenerateNtCreateAndXResponse+0x198
b9219d78 b9db3e87 8996a448 89e56a60 b9dc86c7 srv!SrvSmbNtCreateAndX+0x1c9
b9219d84 b9dc86c7 00000000 8942b020 00000000 srv!SrvProcessSmb+0xb7
b9219dac 80920833 00e56a60 00000000 00000000 srv!WorkerThread+0x138
b9219ddc 8083fe9f b9dc8602 89e56a60 00000000 nt!PspSystemThreadStartup+0x2e
00000000 00000000 00000000 00000000 00000000 nt!KiThreadStartup+0x16

FOLLOWUP_IP:
srv!SrvRetrieveMaximalAccessRightsForUser+68
b9dd3770 84c0 test al,al

SYMBOL_STACK_INDEX: 3

SYMBOL_NAME: srv!SrvRetrieveMaximalAccessRightsForUser+68

FOLLOWUP_NAME: MachineOwner

MODULE_NAME: srv

IMAGE_NAME: srv.sys

DEBUG_FLR_IMAGE_TIMESTAMP: 4940fb1f

STACK_COMMAND: .cxr 0xffffffffb92197ec ; kb

FAILURE_BUCKET_ID: 0x7E_srv!SrvRetrieveMaximalAccessRightsForUser+68

BUCKET_ID: 0x7E_srv!SrvRetrieveMaximalAccessRightsForUser+68

Followup: MachineOwner

The assembly for SepSidInToken is below. The memory at ebp+0Ch (and ebp+10h) is indeed zero. Wouldn’t this mean that the comparison at 8096a2ad should fault? If not, is something in between corrupting the stack? The memory there *should* be an offset into the user buffer (I think buffer + 14h), which I’m assuming should be the PSID Owner of the PSECURITY_DESCRIPTOR.

I can send someone a mini/memory dump if it will help. SepTokenIsOwner and SepSidInToken aren’t documented. It’s completely random; I can generate a hundred or a million files before it happens.
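For reference, the "buffer + 14h" guess matches the size of the self-relative header: it is 20 (14h) bytes long, and its Owner field holds a byte offset from the start of the descriptor, with 0 meaning "no owner". Below is a minimal user-mode sketch of resolving the owner from such a buffer. The struct mirrors the documented SECURITY_DESCRIPTOR_RELATIVE layout, but the names `sd_relative` and `sd_owner` are hypothetical, and the bounds checks are an assumption about what a cautious reader of the buffer might do, not what the kernel does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SE_SELF_RELATIVE 0x8000u

/* Mirrors the SECURITY_DESCRIPTOR_RELATIVE layout (20-byte header). */
typedef struct {
    uint8_t  revision;
    uint8_t  sbz1;
    uint16_t control;
    uint32_t owner; /* byte offset from start of descriptor; 0 = none */
    uint32_t group;
    uint32_t sacl;
    uint32_t dacl;
} sd_relative;

/* Hypothetical helper: returns a pointer to the owner SID inside a
   self-relative descriptor, or NULL if there is no usable owner. */
const uint8_t *sd_owner(const void *buf, size_t len)
{
    sd_relative hdr;

    assert(buf != NULL);
    if (len < sizeof hdr)
        return NULL;
    memcpy(&hdr, buf, sizeof hdr); /* avoids alignment assumptions */
    if (!(hdr.control & SE_SELF_RELATIVE))
        return NULL;
    if (hdr.owner == 0)
        return NULL; /* a zero offset means the owner is absent */
    /* The owner must lie past the header and leave room for the
       8-byte fixed part of a SID. */
    if (hdr.owner < sizeof hdr || hdr.owner > len - 8)
        return NULL;
    return (const uint8_t *)buf + hdr.owner;
}
```

Under this reading, a buffer that was never filled in (all zeros) yields a NULL owner rather than a pointer at buffer + 14h.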

nt!SepSidInToken:
8096a2a8 8bff mov edi,edi
8096a2aa 55 push ebp
8096a2ab 8bec mov ebp,esp
8096a2ad 837d0c00 cmp dword ptr [ebp+0Ch],0
8096a2b1 53 push ebx
8096a2b2 8b5d10 mov ebx,dword ptr [ebp+10h]
8096a2b5 56 push esi
8096a2b6 57 push edi
8096a2b7 7413 je nt!SepSidInToken+0x24 (8096a2cc)
8096a2b9 53 push ebx
8096a2ba ff357c119e80 push dword ptr [nt!SePrincipalSelfSid (809e117c)]
8096a2c0 e83336ffff call nt!RtlEqualSid (8095d8f8)
8096a2c5 84c0 test al,al
8096a2c7 7403 je nt!SepSidInToken+0x24 (8096a2cc)
8096a2c9 8b5d0c mov ebx,dword ptr [ebp+0Ch]
8096a2cc 0fb64301 movzx eax,byte ptr [ebx+1]
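Read line by line, the listing above only compares the stack slot at ebp+0Ch against zero, which cannot fault by itself. When that slot is zero, the je at 8096a2b7 jumps straight to the movzx with ebx still holding the value loaded from ebp+10h. A user-mode C model of that control flow is sketched below. The parameter names are guesses based on the symbol names (SepSidInToken is undocumented), `model_equal_sid` is a hypothetical stand-in for RtlEqualSid that compares only the fixed SID header, and `sid_selected` returns the pointer instead of dereferencing it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fixed part of a SID: Revision, SubAuthorityCount, 6-byte authority. */
typedef struct {
    uint8_t revision;
    uint8_t sub_authority_count; /* offset 1: the byte read at 8096a2cc */
    uint8_t identifier_authority[6];
} sid_header;

/* Hypothetical stand-in for RtlEqualSid. Unlike the real routine it
   tolerates NULL and compares only the fixed header (model only). */
static bool model_equal_sid(const sid_header *a, const sid_header *b)
{
    return a != NULL && b != NULL && memcmp(a, b, sizeof *a) == 0;
}

/* Returns the pointer that ends up in ebx when execution reaches
   the movzx at 8096a2cc. Parameter names are assumptions. */
const sid_header *sid_selected(const sid_header *principal_self_sid, /* [ebp+0Ch] */
                               const sid_header *sid,                /* [ebp+10h] */
                               const sid_header *se_principal_self)  /* nt!SePrincipalSelfSid */
{
    const sid_header *ebx = sid;                 /* mov ebx,[ebp+10h] */

    assert(se_principal_self != NULL);
    if (principal_self_sid != NULL &&            /* cmp [ebp+0Ch],0 / je */
        model_equal_sid(se_principal_self, ebx)) /* call RtlEqualSid / test al,al / je */
        ebx = principal_self_sid;                /* mov ebx,[ebp+0Ch] */
    return ebx;                                  /* next: movzx eax,byte ptr [ebx+1] */
}
```

With both slots zero, as in the stack trace row for nt!SepSidInToken (e11c13c0 00000000 00000000), the model returns NULL, which is consistent with the reported read from address 00000001.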

Q: Is DbgPrint executed inline, or is there some sort of lazy writer out to the log? If it's truly inline, then the thread is certainly doing something wrong, since it's NOT entering back into my FSD with a buffer for me to fill in before going off and analyzing it.

> Q: Is DbgPrint executed inline or is there some sort of lazy writer out to the log?

Inline.


Maxim S. Shatskih
Windows DDK MVP
xxxxx@storagecraft.com
http://www.storagecraft.com

I am sure this is not your case, but the input security descriptor parameter of SeQuerySecurityDescriptorInfo is a pointer to a pointer, which is very unusual for an input parameter. The descriptor should be self-relative. It seems you are doing everything correctly.

Bronislav Gabrhelik

I appreciate the responses.

I don’t see why my IRP_MJ_QUERY_SECURITY handling should be a problem. My test app that reads data issues that request as an option, and it will run for millions of files without issue, both locally and remotely. My test app that generates the data has no issues generating it locally. Remotely, however, the srv service executes a security check after it generates the file with an IRP_MJ_CREATE, and then, for some reason, at the point it crashes it does NOT reissue an IRP_MJ_QUERY_SECURITY after receiving the STATUS_BUFFER_OVERFLOW code. By the way, this same sequence of calls by srv is also generated by the read app on every file, but it never crashes.

WinDbg stops outputting my DbgPrint messages after a while if the crash doesn’t occur right away. I read somewhere that this is protection against drivers that keep DbgPrint in production code, but in this case I need it. Is there any way to keep it printing?

I’m going to search the lists for help with stack corruption since dword ptr [ebp+0Ch] ought to be a parameter to the function. But I’ve never debugged that sort of problem at the kernel level at all, let alone without source code for the offending function.