Do you have any evidence that would suggest that this will improve your
performance? The RAMdisk approach made sense in an era when a
single-cylinder seek took 25ms, drives did not have caches, the
underlying MS-DOS file system did not do page management, and the
processor clock was 4.7MHz. Today, gigabytes of RAM are available for
the file system cache and are automatically tuned for dynamic
performance, processors are much faster than even their clock speeds
suggest, and disk drives have onboard caches. Is it really going to
help to hand-build something when what already exists is, on the whole,
better than the classic RAMdisk approach?
Have you measured the performance? Have you analyzed the application to
see if it is using the best algorithms? (Seriously. I’ve seen
better-than-order-of-magnitude performance changes accomplished by trivial
changes in the app. I once got a 30% performance improvement with a
one-line change.) Before heading off to write a very expensive and complex
driver, do you have the performance numbers to justify this effort? And
how robust is this solution?
I once had an MS-DOS app that took five hours to run. I copied all the
input files to a RAMdisk and it ran in 40 minutes. Some years later I had
a much faster Windows machine. The script involved a lot of complexity,
but it started off with a copy to a RAMdisk. I mapped the
logical RAMdisk letter to a directory of the local disk and started it
running. I went to fix something to eat, planning to come back and watch
it run. By the time I came back, it had finished. It ran in 5 minutes,
without the RAMdisk! So before you rush off to reinvent a very complex
wheel, can you present any evidence that suggests that drive delays are
your performance bottleneck? And, when is the last time you defragmented
your disk? Don’t rush off to spend lots of money, time, and people until
you know where the problem is.
And, if you really need a RAMdisk, I’m sure there are tested and debugged
RAMdisk packages already in existence. It would be really inexpensive to
buy one, install it, and see if you got measurable improvement. If you
don’t, the project is not worth doing. If you do, but you need the
mirroring capability, write a service that can handle the updating of the
hard-drive files using lazy writeback. This would give you what I think
you are asking for, but at ***vastly*** lower cost.
I’ve heard Tony Mason give talks on the issues of building file systems.
He knows all the weird corner cases you haven’t thought of (yet). How
will your file system handle memory-mapped files? How will it work when
other drivers exist in the stack? Will it be FAT or NTFS? Hmmm.
There’s a FAT driver example, and NTFS is still proprietary. And what
about ReFS? How will you deal with protection issues if it's FAT? (You
can't put ACLs on FAT.)
Or, do you want to build a virtual volume whose implementation is RAM?
Then you can support NTFS trivially because that is managed far above your
pay grade…uh…I mean, far above you in the stack. Well, it's the same
thing, actually. Or is this idea of RAMdisk a design thought up by
someone in management or marketing who once used one under MS-DOS and
“knows” this is the “right solution”?
One thing I learned in 15 years of performance measurement: never, ever
undertake a problem in “optimization” until you know that one exists,
where it exists, and have analyzed the application to determine who is at
fault. In 1982-1983 I implemented what was probably the fastest storage
allocator in existence. I spent at least a month doing nothing but
performance analysis and performance enhancement. A few weeks after I
released it, one of the product groups came to me and said, “We’ve
identified the performance bottleneck in our product. It’s your
allocator.” Proof of this was the program-counter-sampling report that
showed a huge spike at the storage allocator module. Well, having just
spent a month getting the “typical” allocation path down to under 50
instructions, and the “best” allocation down to about 20 instructions, I
had a little trouble believing this. So, since they were running the
“debug” version, I reached in with the debugger and turned on the
performance counters. Turns out they had a tight loop that called a
function to do something. Each time this function was called, it
allocated a small buffer to work in, and before it left, it deallocated
this buffer. The result was over 4,000,000 unnecessary calls on the
allocator. I changed it to put the buffer as a local variable on the
stack, and got a noticeable performance improvement, at least a factor of
4 or 5. The fastest known (at that time) allocator was not fast enough to
handle several million unnecessary calls. So you have to not only
determine what is the per-disk-transaction cost, but then take the next
step: if this cost was 0, what would happen to overall performance?

One researcher came to me because he’d been told I was the performance guru.
He lamented, “I’ve spent a week optimizing this subroutine. It’s at least
twice as fast as it used to be. And my program still takes forever!” So
I ran my performance analysis tool. His subroutine, on which he’d labored
long and hard, was not called very often, and accounted for 0.25% of the
total execution time. Before, it had accounted for 0.5% of the time. The
analysis identified several “hotspots” he had not even suspected.
Local optimization in the absence of real performance data is usually a
waste of time. Buy a RAMdisk package and measure your performance using
it. Suppose DiskPerf tells you it is 10 times faster. Does the
application run ten times faster? If it runs at the same speed, or
perhaps 2x faster, disk I/O is not your problem. If your app also runs
10x faster, then you know disk I/O /is/ the bottleneck. Then say, “Well,
half the solution cost $N. Is there a way to get the rest of the solution
at a reasonable price?”
I did a Google search, and the first hit gave a RAMdisk that runs up
through Win8. It costs $18.99, or you could spring for the deluxe version
at $22.99 that comes with a T-shirt. I didn’t look further, although the
entire first page of the Google search appeared to be products. I would
not pursue your project any further until I had purchased this package and
done measurements with it. And done a thorough analysis of the app.
joe
Dear List Members,
Thanks for your input up to this point. Below is the main idea on which we
are working.
Basically we are developing a caching system for Windows desktop and
server platforms. To give the user an access point for the cache, we
are creating a virtual device, e.g. X: in My Computer. The geometry of
X: will be the same as the geometry of the disk partition for which X:
is providing caching services. For example, if X: is providing caching
for D: and the size of D: is 40GB, then the size of X: will also be
40GB. In other words, all IOCTL requests for X: will be directed to D:.
All write requests for X: will be written to Z:, which is basically a
RAMDISK. When a certain amount of the RAMDISK is full, or when the
system is about to shut down, the dirty data of X: should be synced to
D:. I think this syncing is what is causing a file system check on the
next reboot.
So for X:, three types of requests should be sent to D::
Read Requests [Cache Miss Case]
Write Requests [Dirty Data Synch Case]
IOCTL Requests [Device I/O Control Requests except I think
IOCTL_MOUNTDEV_QUERY_DEVICE_NAME]
Hopefully this clarifies the problem I am facing.
Thanks,
Uzair Lakhani
NTFSD is sponsored by OSR
OSR is hiring!! Info at http://www.osr.com/careers
For our schedule of debugging and file system seminars visit:
http://www.osr.com/seminars
To unsubscribe, visit the List Server section of OSR Online at
http://www.osronline.com/page.cfm?name=ListServer