When will a packet be queued to the I/O completion port?

Assume that:

  • The file is opened for asynchronous I/O.
  • An I/O completion port is associated with the file.
  • We start an asynchronous I/O operation on the file (ApcContext != 0).

How do we determine whether a packet will be queued to the IOCP or not?

We always need to know this exactly, because we cannot free or dereference the IO_STATUS_BLOCK or OVERLAPPED (and possibly other resources) until the I/O has finished.
Usually this is done in the callback that runs when the packet is queued to the IOCP (we can service the IOCP ourselves, or use CreateThreadpoolIo or BindIoCompletionCallback).
If we know that no packet will be queued to the IOCP, we need to call this callback manually with the error status.
If we use a CreateThreadpoolIo callback, we must call CancelThreadpoolIo when there will be no IOCP notification.

It looks like the documentation for StartThreadpoolIo and CancelThreadpoolIo gives the exact answer for when there will be no notification to the IOCP (and we must call CancelThreadpoolIo):

  • An overlapped (asynchronous) I/O operation fails (that is, the asynchronous I/O function call returns failure with an error code other than ERROR_IO_PENDING).

  • An asynchronous I/O operation returns immediately with success and the file handle associated with the I/O completion object has the notification mode FILE_SKIP_COMPLETION_PORT_ON_SUCCESS. The file handle will not notify the I/O completion port and the associated I/O callback function will not be called.

Because the win32 layer usually sets an error code and returns FALSE when a Zw* API returns NT_ERROR(status) (except for the special case of STATUS_PENDING, which is converted to ERROR_IO_PENDING), for the NT layer this means:

  • if the API returns NT_ERROR(status) (status in the range [0xC0000000, 0xFFFFFFFF]) - no notification
  • otherwise, if the returned status is in the range [0, 0xC0000000) - there will be a notification
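
To make the ranges concrete, here is a small portable sketch of the severity test behind NT_ERROR. It is only an illustration: `Status`, the `k...` constants and the helper names are stand-ins invented for this example (NTSTATUS is modeled as an unsigned 32-bit value so the code builds without Windows headers); the numeric values are the ones quoted in this thread.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for NTSTATUS (assumption: modeled as unsigned 32-bit for portability).
using Status = std::uint32_t;

constexpr Status kStatusSuccess              = 0x00000000;
constexpr Status kStatusPending              = 0x00000103;
constexpr Status kStatusDatatypeMisalignment = 0x80000002;
constexpr Status kStatusNoMoreFiles          = 0x80000006;
constexpr Status kStatusInvalidParameter     = 0xC000000D;

// The top two bits of an NTSTATUS encode the severity:
// 0 = success, 1 = informational, 2 = warning, 3 = error.
constexpr unsigned Severity(Status s) { return s >> 30; }

constexpr bool IsNtError(Status s)   { return Severity(s) == 3; } // same test as NT_ERROR
constexpr bool IsNtWarning(Status s) { return Severity(s) == 2; } // same test as NT_WARNING

// Naive reading of the documented rule (notification modes ignored):
// a packet is expected for every status outside the error range.
constexpr bool DocumentedRuleSaysPacket(Status returned)
{
    return !IsNtError(returned);
}
```

With these helpers, STATUS_DATATYPE_MISALIGNMENT and STATUS_NO_MORE_FILES both classify as warnings, so the naive rule predicts a packet for each of them.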

The range [0x80000000, 0xC0000000) is really a bit problematic.
If such a status is returned by the I/O manager because of invalid parameters, before the driver is actually called, there will be no notification.
The only case I know of here is STATUS_DATATYPE_MISALIGNMENT (0x80000002), returned for example from ZwNotifyChangeDirectoryFile (ReadDirectoryChangesW) if lpBuffer is not DWORD-aligned.
In this case there will be no completion.

It is interesting that the win32 call

	OVERLAPPED ov{};
	ReadDirectoryChangesW(0, (void*)1, 1, TRUE, FILE_NOTIFY_VALID_MASK, 0, &ov, 0);

returns TRUE, because !NT_ERROR(STATUS_DATATYPE_MISALIGNMENT) - the win32 layer simply loses the STATUS_DATATYPE_MISALIGNMENT error.

On the other hand, if, say, NtQueryDirectoryFile returns STATUS_NO_MORE_FILES (0x80000006), there will be a notification.

But this is wrong in the general case: there really can be a notification to the IOCP even when NT_ERROR(status) is returned!
This is because Fast I/O, which can be invoked before an IRP is created, can return TRUE with an error as the final status.
Say, FastIoDeviceControl or FastIoLock can return TRUE (I/O finished) while the final status is an error.

For example, we can call LockFileEx or ZwLockFile on a directory: the I/O manager calls the file system's FastIoLock implementation (say, NtfsFastLock), and it just returns TRUE with STATUS_INVALID_PARAMETER.

So after such a call we get STATUS_INVALID_PARAMETER. By the documented rules there must be no notification in this case, but in the real world there will be one!
Of course this is a rare situation, but it breaks the general rule.
Also, regarding SetFileCompletionNotificationModes: if we test FILE_SKIP_COMPLETION_PORT_ON_SUCCESS with LockFileEx, there will be no notification in this case.
So FILE_SKIP_COMPLETION_PORT_ON_SUCCESS prevents the notification not only on a successful return, but on any synchronous return.
A more correct name would therefore be FILE_SKIP_COMPLETION_PORT_ON_SYNCHRONOUS, and instead of
A request returns success immediately without returning ERROR_PENDING
the documentation should say
A request returns immediately without returning ERROR_PENDING

A PoC with LockFileEx is at https://pastebin.com/qWrMYvy4 (unfortunately it is too long to post here).

So how can we detect, 100% reliably in the general case, whether there will be a notification or not?

I see the following solution (but maybe I am mistaken, or a better one exists, for this and for the main question):

Set IO_STATUS_BLOCK.Status = STATUS_PENDING before the API call (as far as I know, the win32 layer always does this).
Check whether IO_STATUS_BLOCK.Status == STATUS_PENDING after the API returns a code other than STATUS_PENDING (ERROR_IO_PENDING).
The idea is that there will be a notification to the IOCP when and only when the I/O manager writes the status back to the user-mode IO_STATUS_BLOCK (if FILE_SKIP_COMPLETION_PORT_ON_SYNCHRONOUS is not set).
But we cannot simply access the IO_STATUS_BLOCK (or OVERLAPPED) after the API call, because another thread can concurrently execute the callback with this IO_STATUS_BLOCK and may already have freed it.
(This is like the I/O manager, which can no longer access an IRP after calling the driver unless IRP_DEFER_IO_COMPLETION is set in its flags: the IRP may already be completed and freed or reused.)
The solution is to use reference counting on the structure that encapsulates the IO_STATUS_BLOCK (or OVERLAPPED).
We always create this structure (let's call it a user-mode IRP) with 2 references.
One reference is released after checking the result/IO_STATUS_BLOCK of the API call (to detect whether we need to invoke the callback manually and call CancelThreadpoolIo);
the other is released in the callback.

class NT_IRP : public IO_STATUS_BLOCK
{
	//...
	LONG m_dwRefCount; // starts at 2: one reference for the caller, one for the completion callback

public:
	NT_IRP() : m_dwRefCount(2)
	{
		// pre-set, so that we can later see whether the I/O manager wrote the IOSB back
		Status = STATUS_PENDING, Information = 0;
	}

	void Release()
	{
		if (!InterlockedDecrement(&m_dwRefCount)) delete this;
	}

	VOID IOCompletionRoutine(NTSTATUS status, ULONG_PTR dwNumberOfBytesTransfered)
	{
		//...
		Release();
	}

	static VOID CALLBACK _IOCompletionRoutine(NTSTATUS status, ULONG_PTR dwNumberOfBytesTransfered, PVOID ApcContext)
	{
		// the NT_IRP pointer must be passed as ApcContext in the I/O call
		reinterpret_cast<NT_IRP*>(ApcContext)->IOCompletionRoutine(status, dwNumberOfBytesTransfered);
	}

	void CheckNtStatus(NTSTATUS status, BOOL bSkippedOnSynchronous = FALSE)
	{
		// the API completed synchronously (status != STATUS_PENDING)
		// and
		// bSkippedOnSynchronous is set or the IOSB was not modified (Status == STATUS_PENDING)
		if (status != STATUS_PENDING && (bSkippedOnSynchronous || Status == STATUS_PENDING))
		{
			IOCompletionRoutine(status, Information);
		}

		Release();
	}
};

or, for the win32 case:

class WIN32_IRP : public OVERLAPPED
{
	//...
	LONG m_dwRefCount; // starts at 2: one reference for the caller, one for the completion callback

public:
	WIN32_IRP() : m_dwRefCount(2)
	{
		// pre-set, so that we can later see whether the I/O manager wrote the status back
		Internal = STATUS_PENDING, InternalHigh = 0;
	}

	void Release()
	{
		if (!InterlockedDecrement(&m_dwRefCount)) delete this;
	}

	VOID IOCompletionRoutine(ULONG dwErrorCode, ULONG_PTR dwNumberOfBytesTransfered)
	{
		//...
		Release();
	}

	static VOID CALLBACK _IOCompletionRoutine(DWORD dwErrorCode, DWORD dwNumberOfBytesTransfered, LPOVERLAPPED lpOverlapped)
	{
		// the WIN32_IRP pointer must be passed as lpOverlapped in the I/O call
		static_cast<WIN32_IRP*>(lpOverlapped)->IOCompletionRoutine(dwErrorCode, dwNumberOfBytesTransfered);
	}

	void CheckErrorCode(ULONG dwErrorCode, BOOL bSkippedOnSynchronous = FALSE)
	{
		// the API completed synchronously (dwErrorCode != ERROR_IO_PENDING)
		// and
		// bSkippedOnSynchronous is set or the status was not modified (Internal == STATUS_PENDING)
		if (dwErrorCode != ERROR_IO_PENDING && (bSkippedOnSynchronous || Internal == STATUS_PENDING))
		{
			IOCompletionRoutine(dwErrorCode, InternalHigh);
		}

		Release();
	}
};
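
The two-reference scheme above can be exercised without Windows headers. The following is only a portable model of it, not the real classes: `ModelIrp`, `Outcome` and `RunScenario` are hypothetical names, the "I/O manager" and the "pool thread" are simulated by plain function calls on one thread, and `std::atomic` stands in for InterlockedDecrement.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

using Status = std::uint32_t;                      // stand-in for NTSTATUS
constexpr Status kStatusPending        = 0x00000103;
constexpr Status kStatusLockNotGranted = 0xC0000055;

struct Outcome { int callbacks = 0; int frees = 0; };

class ModelIrp {
public:
    explicit ModelIrp(Outcome& out) : out_(out) {}

    Status iosbStatus = kStatusPending;            // pre-set, like IO_STATUS_BLOCK.Status

    void Release() {
        // the last reference frees the block exactly once
        if (refs_.fetch_sub(1) == 1) { ++out_.frees; delete this; }
    }

    void OnComplete(Status) { ++out_.callbacks; Release(); }

    // Called by the issuing thread right after the API returns.
    void CheckStatus(Status returned, bool skippedOnSynchronous = false) {
        // synchronous return and the status block untouched: no packet will come,
        // so the callback is invoked manually
        if (returned != kStatusPending &&
            (skippedOnSynchronous || iosbStatus == kStatusPending)) {
            OnComplete(returned);
        }
        Release();
    }

private:
    ~ModelIrp() = default;                         // only Release() may delete
    Outcome& out_;
    std::atomic<int> refs_{2};                     // caller + completion callback
};

Outcome RunScenario(Status returned, bool iosbWritten, bool packetQueued) {
    Outcome out;
    ModelIrp* irp = new ModelIrp(out);
    if (iosbWritten) irp->iosbStatus = returned;   // simulated write-back of the final status
    irp->CheckStatus(returned);                    // issuing thread
    if (packetQueued) irp->OnComplete(returned);   // simulated dequeue of the IOCP packet
    return out;
}
```

In this model, an error with no write-back and no packet, an error with write-back and a packet (the FastIoLock case), and a pending request all end with exactly one callback and one free.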

But this solution looks too complex… did Microsoft forget about the Fast I/O case?

Also, why are, say, FastIoRead and FastIoWrite called only for synchronous I/O (so they cause no problems here), while FastIoLock and FastIoDetachDevice are called for any I/O?
In my view it would be more logical either to call FastIoLock and FastIoDetachDevice only for synchronous files as well, or to call FastIoRead and FastIoWrite for any I/O too.

I couldn’t follow your rather long and rambling post, but I believe you are making this more complicated than it needs to be. It’s certainly unclear when you’re talking from user mode and when you’re talking from kernel.

If the original call fails immediately, so that the request never makes it into a driver, then there’s no completion, and hence no IOCP packet. That’s almost always a programming error, and thus should not be much of a concern in production. Once the request makes it into the driver, there will eventually be an IOCP entry, when the highest-level driver completes the request. After the completion is removed from the queue by GetQueuedCompletionStatus, you can free or reuse the OVERLAPPED structure.

As long as you’re using an IOCP, then why do you need to dink with STATUS_PENDING at all?

Unfortunately you did not understand me or my concrete example at all. Very strange.

“If the original call fails immediately, so that the request never makes it into a driver”

But how can we know whether the error came from the I/O manager or from the driver? I showed a concrete example: the driver's FastIoLock routine is called and returns an error, STATUS_INVALID_PARAMETER, but there will still be a packet queued to the IOCP in this case. It is exactly how the I/O manager handles FastIoLock and FastIoDeviceControl (when they return TRUE with an error status) that breaks the general rule.

“Once the request makes it into the driver, there will eventually be an IOCP entry, when the highest-level driver completes the request.”

But this is of course false!! If the driver returns an error immediately (status in the range C0000000-FFFFFFFF), there will be no IOCP entry (or APC, or event set), and this is not necessarily a programming error (invalid parameters, etc.). For example, FSCTL_PIPE_LISTEN can return STATUS_PIPE_CONNECTED (the client is already connected) or even STATUS_PIPE_CLOSING (the client has already connected and disconnected), and there will be no IOCP entry because this is an NT_ERROR status.

So the general rule (implicitly documented) is:

  • if the API call returns NT_ERROR - there will be no IOCP entry
  • if the API call returns STATUS_PENDING - there will be an IOCP entry
  • otherwise it depends on FILE_SKIP_COMPLETION_PORT_ON_SUCCESS - there will be an entry if this mode is not set

But unfortunately this is not always true (as I showed in the example with LockFileEx - https://pastebin.com/qWrMYvy4 ): if some driver implements FastIoDeviceControl and the call to it returns TRUE, there will be an IOCP entry even if the final operation status is in the NT_ERROR range.

And the overall question is: why does the I/O manager not call FastIoRead and FastIoWrite for asynchronous requests, but call FastIoLock and FastIoDeviceControl? (Sorry, in the original post I wrote “FastIoDetachDevice” by mistake instead of FastIoDeviceControl.)

And yes, we can assume that the error is not from the I/O manager, because that is always the result of invalid parameters in the call (invalid buffers, handles, or granted access on handles). So let's assume that the request will be sent to the driver. If an IRP is created and sent to the driver, the rule for when there will be an IOCP entry is:

if the API call returns NT_ERROR(status) - NO
if the API call returns STATUS_PENDING - YES
otherwise, if FILE_SKIP_COMPLETION_PORT_ON_SUCCESS mode is set - NO
otherwise - YES

or in code:

bool IsWillBeIOCPNotification(NTSTATUS status /*returned from api call*/, bool bSkipOnSuccessMode)
{
	if (status == STATUS_PENDING) return true;

	if (NT_ERROR(status)) return false;

	return !bSkipOnSuccessMode;
}

or, for win32 case

bool IsWillBeIOCPNotification(ULONG dwError, bool bSkipOnSuccessMode)
{
	if (dwError == ERROR_IO_PENDING) return true;

	if (dwError != NOERROR) return false;

	return !bSkipOnSuccessMode;
}

Again, note that “Once the request makes it into the driver, there will eventually be an IOCP entry” is false if the driver returns NT_ERROR.

But before creating and sending an IRP to the driver, the I/O manager can, for device I/O control and lock file requests, try Fast I/O if the driver implements it.

And here there will be an IOCP entry if Fast I/O returns TRUE, regardless of the returned status. As a result, there can be an IOCP entry even with an error status.

Hmmmm… I’m pretty sure you ARE making this harder than it needs to be, but OTOH I think you might be right about the ambiguity involved.

First, ignore whether the I/O Manager returns the error, or whether it’s from Fast I/O or IRP-based processing. Sure, it’s true that some errors are returned synchronously and others asynchronously… but WHO returns the error isn’t relevant (beyond the fact that, obviously, errors from I/O Manager will always be returned synchronously).

So… I think what you’re saying is that it’s not clear if warnings (NTSTATUS values in the range of 0x80000000 − 0xBFFFFFFF) will generate completion callbacks. Is that what you’re asking? If that’s the point of your post, then you certainly have the record for the most words to ask the simplest question.

I wouldn’t be surprised if there were some weird edge-conditions here that result in unexpected behavior in terms of completion being called or not. Is that the issue you’re asking about?

Peter

Unfortunately, if you did not understand me, I apparently expressed my thought poorly, and my English is not the best.

“First, ignore whether the I/O Manager returns the error, or whether it’s from Fast I/O or IRP-based processing…but WHO returns the error isn’t relevant”

I am trying to say something completely different (I only described the situation with this). OK, let me try once more:

  • I open a file for asynchronous I/O,
  • bind an IOCP to it (say via CreateThreadpoolIo or BindIoCompletionCallback),
  • and call some asynchronous API (Zw* or its win32 wrapper, it does not matter: a Zw* function is called in the end anyway).

(Tim Roberts said it is unclear when I am talking from user mode and when from kernel. Here it does not matter: the Zw API is called, directly or indirectly, either way. I am talking about both.)

So now I, as a developer, need to know: will there be an IOCP entry as a result of my I/O request or not?
I am not asking whether the API completed synchronously or not. That is elementary: it depends on whether STATUS_PENDING (ERROR_IO_PENDING) was returned.
I am asking: will an IOCP entry be queued as a result of the I/O call or not?

(Really I am not even asking this, I know the answer, but I am trying to draw your attention to what seems to me a very interesting question.)

  • If there will be an IOCP notification, my registered callback (with CreateThreadpoolIo or BindIoCompletionCallback) will be called by the system (from the thread pool that listens on the IOCP).
  • Otherwise I need to call the callback manually myself.

This is needed to free the resources allocated for the API call: the IO_STATUS_BLOCK (or OVERLAPPED), and to dereference the object that encapsulates the file handle. We also need to call CancelThreadpoolIo if we use TP_IO.

It looks like the CancelThreadpoolIo documentation gives the answer (really, we must call CancelThreadpoolIo EXACTLY when there will be no IOCP notification):

To prevent memory leaks, you must call the CancelThreadpoolIo function for either of the following scenarios:

  • An overlapped (asynchronous) I/O operation fails (that is, the asynchronous I/O function call returns failure with an error code other than ERROR_IO_PENDING).
  • An asynchronous I/O operation returns immediately with success and the file handle associated with the I/O completion object has the notification mode FILE_SKIP_COMPLETION_PORT_ON_SUCCESS. The file handle will not notify the I/O completion port and the associated I/O callback function will not be called.

But as I discovered, this is incorrect.
An I/O request really can synchronously return an NT_ERROR status and there will still be an IOCP notification!!
And there is an even worse case:
the asynchronous I/O request returns STATUS_SUCCESS but there is no IOCP notification (even though we must wait for it in this case).

The last case is, I believe, a direct bug in Windows, in the NtUnlockFile API. Look at the win2003 source code (although it is already very old, nothing has changed at these points, as tests on Windows 10 show).

Look: if the driver implements FastIoUnlockSingle and it returns TRUE (say with STATUS_SUCCESS), the I/O manager just completes the request, without the IOCP.

So an asynchronous API call returns STATUS_SUCCESS but there will be no completion. How do we detect this?

Compare with the NtLockFile implementation:

here the I/O manager posts an IOCP entry, even if the I/O completed with an error status!

The same situation exists with NtDeviceIoControlFile/IopXxxControlFile: if FastIoDeviceControl is present and returns TRUE, IoSetIoCompletion is called even if an NT_ERROR status is returned
(in contrast, FastIoWrite and FastIoRead are called only for FO_SYNCHRONOUS_IO files).

OK, I agree that locking a directory (which synchronously returns STATUS_INVALID_PARAMETER) is wrong by design.
But we can call LockFileEx with LOCKFILE_FAIL_IMMEDIATELY, and then we can simply get STATUS_LOCK_NOT_GRANTED (C0000055). By the documentation there must be no IOCP completion in this case, so we can just release the resources (IO_STATUS_BLOCK/OVERLAPPED), must call CancelThreadpoolIo, etc. But the callback will be called as well, because an entry is queued to the IOCP. The result is a double free of the IO_STATUS_BLOCK and the rest. I did find a way to detect and correctly handle this case.

But then, when we call UnlockFileEx (if we acquired the lock), it synchronously returns STATUS_SUCCESS and we must expect an IOCP notification, so we cannot release the resources. But there will be no notification and no callback, and I do not see any way to detect this case at all (OK, lock/unlock file is not used very frequently, but still; the FastIoDeviceControl case can be more frequent).

So I designed a special example, in case somebody is interested in testing it and seeing the result for themselves. There I try to describe the case in code as fully as possible: I create 2 threads which concurrently try to lock the first byte of the same file.

Or, even better, 2 absolutely concrete questions:

1.)

Suppose we call

LockFileEx(hFile, LOCKFILE_EXCLUSIVE_LOCK|LOCKFILE_FAIL_IMMEDIATELY, 0, *, *, lpOverlapped);

It returns FALSE and GetLastError() returns ERROR_LOCK_VIOLATION:

  • Will there be an IOCP notification in this case?
  • Do we need to call CancelThreadpoolIo (if I use the new thread pool callback here)?
  • Can we just free lpOverlapped?

By the documentation there must be no IOCP notification and we need to call CancelThreadpoolIo, but in practice this is wrong.

Or, if the native API is closer to someone, suppose we call

ZwLockFile(_hFile, 0, 0, # , #, &ByteOffset, &Length, 'key1', TRUE, TRUE);

and it returns STATUS_LOCK_NOT_GRANTED:

  • Will there be an IOCP notification in this case?
  • Do we need to call CancelThreadpoolIo (if I use the new thread pool callback here)?
  • Can we just free lpOverlapped/iosb?

2.)

We call

UnlockFileEx(_hFile, 0, 1, 0, lpOverlapped) (or ZwUnlockFile)

and it returns TRUE (or STATUS_SUCCESS).

  • Will there be an IOCP notification in this case?
  • Do we need to call CancelThreadpoolIo (if I use the new thread pool callback here)?
  • Can we, or must we, free lpOverlapped now? Or when?

By the documentation there must be an IOCP notification in this case, and we cannot free lpOverlapped/iosb until then,
but in the real world there will be no further notification here.

i try say absolute another

… Hmmmm… well, you say wrong. I am not guessing.

i not ask are api completed synchronous or not. this is elementary - based are STATUS_PENDING

No. You are missing the point… and I would recommend you approach asking questions of folks here on the forum with a JUST A BIT more humility. We are working hard to help you, despite it being our holiday and you being a non-native speaker of English.

As Mr. Roberts said, you are confusing the issue by switching your analysis, viewpoint, and questions back and forth from Win32 to the NT Native API. You wrote “here this no matter, Zw api called direct or indirect anyway” – indeed it DOES matter. While it is true that the Native API gets called eventually, your entire issue (it seems to me) is with how Win32 handles the edge cases. If you limit your work to the Native NT API (in user mode or kernel mode, “no matter”) then I think you’ll find things much more clear.

You’re also confusing the case by using what is perhaps one of THE most unusual file system functions in Windows: Directory Change Notification. If you get back STATUS_PENDING from a directory change notification, that effectively grants the request to notify you of the pending change, right? It’s only AFTER there’s a directory change that the request completes (assuming you got back STATUS_PENDING initially).

Byte range locks are another odd case, and I suspect how they behave will vary from file system to file system.

And now I will return to my holiday,

Peter

Unfortunately I did not expect such a result… maybe I explained too badly.

indeed it DOES matter. While it is true that the Native API gets called eventually, your entire issue (it seems to me) is with how Win32 handles the edge cases. If you limit your work to the Native NT API (in user mode or kernel mode, “no matter”) then I think you’ll find things much more clear.

NO. You are mistaken here. This is not a win32 issue. Nothing changes if we call the native API directly and handle the NTSTATUS directly instead of the win32 error (which is wrong, and not rarely). Maybe my mistake was posting both versions. I agree that the native NT API is much clearer, but the problem here is not in the win32 layer.

And the problem here is not in file system implementations either. The problem is in the I/O manager kernel code, as I have been saying all along: in how it handles the Fast I/O case. At one point (LockFile) the I/O manager posts a completion to the IOCP even if an error status is returned. At another point (Unlock) the I/O manager completes the request without the IOCP (even if STATUS_SUCCESS is returned). Not the driver, but the I/O manager itself! Simply look at WRK-v1.2 (I pasted links) to understand the source of the problems.

Byte range locks are another odd case, and I suspect how they behave will vary from file system to file system.

Again, this is not a file system problem (how it handles the request) but an I/O manager problem (how it completes the request).

And directory change notification is not related here at all. I wrote nothing about it; I know perfectly well how it works and have used it in async mode more than once, and there are no problems there.

simply look win

ok. sorry for all

your posts are certainly too long to follow, but let me say what i think your problem is

you need to know when an IOCP notification will be queued and when it will not. This is very important

there is an apparent ambiguity in the documentation re the use of FILE_SKIP_COMPLETION_PORT_ON_SUCCESS. In fact this feature was specifically added to Windows to remove the ambiguity you are worried about

The name FILE_SKIP_COMPLETION_PORT_ON_SUCCESS includes the word SUCCESS. That word has nothing to do with the real meaning of this option. What it means is that if the operation never goes pending and the final state of the call is known directly, then do not queue a completion on the port. This is true regardless of whether the final result of the call was a failure, warning or success

Yes, unfortunately I am at fault for not explaining the essence clearly. But this was not a question; rather, as I already said, I am trying to draw your attention to what seems to me a very interesting question.

you need to know when an IOCP notification will be queued and when it will not. This is very important

Absolutely agree. And not only I need this, but everybody who does asynchronous I/O programming.

FILE_SKIP_COMPLETION_PORT_ON_SUCCESS is not directly related to the problem, and my post is not about it. I know how it works, and that its name does not reflect the essence (really it prevents the IOCP entry if the operation completed without pending).

But I am trying to say something completely different: that the I/O manager (not drivers or the win32 subsystem) wrongly (?) handles the Fast I/O path, which leads to problems.

certainly too long

Look at the concrete comment with the 2 absolutely concrete questions…

If it is still unclear what I have been trying to say all this time, OK, let the moderator delete my post altogether.

so what about FAST IO bothers you? The IRP can complete immediately instead of being pended. It can either fail or succeed.

I realize that I have been too sloppy in my terminology. Language barriers increase the need to be technically precise and i have not been

But the point remains the same - what about FAST IO bothers you?

The IRP can complete immediately instead of being pended. It can either fail or succeed.

True… regardless of whether Turbo I/O is used or an IRP is sent.

I don’t get what Mr. @nektar80 wants…

Peter

But I did explain!! Is it still not clear?! This breaks the general rule for when there will and will not be an IOCP entry as a result of I/O (if Fast I/O is used no IRP is created, but that is unrelated). So again: how, based on the status returned from the I/O call, do we detect whether there will be an IOCP notification/callback or not? We must know this exactly to manage resources correctly.

The general, semi-documented rule is as follows:

bool IsWillBeIOCPNotification(NTSTATUS status /*returned from api call*/, bool bSkipOnSuccessMode)
{
    if (status == STATUS_PENDING) return true;

    if (NT_ERROR(status)) return false;

    return !bSkipOnSuccessMode;
}

1.) If STATUS_PENDING is returned, there will be a notification. No questions here.

2.) If NT_ERROR(status), there must be no IOCP entry.

But I found a case where this rule is broken. I showed a concrete example with NtLockFile: it can return STATUS_INVALID_PARAMETER or STATUS_LOCK_NOT_GRANTED, but there will be an IOCP entry!!

And this is not related to the win32 layer or the file system driver. It is related to the I/O manager only: look at the WRK source code to see how it completes the request after a fast I/O lock. This part is still relevant for Windows 10. The same goes for FastIoDeviceControl: see how the I/O manager handles a possible NT_ERROR(status) at this point, and the IOCP notification.

3.) Otherwise, whether there will be a completion depends on FILE_SKIP_COMPLETION_PORT_ON_SUCCESS.

**But** look at the NtUnlockFile source code (again from WRK, and still relevant in the latest win10 in this part): if the driver implements FastIoUnlockSingle and it returns TRUE, the I/O manager completes the request without posting an IOCP entry (or setting an event, or queuing an APC).

So the I/O call can return STATUS_SUCCESS but without an IOCP notification (let's assume FILE_SKIP_COMPLETION_PORT_ON_SUCCESS is not set, to avoid confusion).

Look at the I/O manager code to understand!

So I showed concrete examples, and all this time I have been “asking” how to know exactly whether there will be an IOCP notification or not.

@“Peter_Viscarola_(OSR)” - sorry for this whole topic. I did not expect this, and as I said, I do not need help at all. I was trying to draw your attention to this point. But if it is still unclear… I am sorry. I do not want to spam you. Simply delete my post altogether.

OK, forget about the win32 layer and FILE_SKIP_COMPLETION_PORT_ON_SUCCESS - it is not set.

The simplest possible example:

struct MY_IRP : public IO_STATUS_BLOCK
{
	//...
};

VOID NTAPI IoCallback(NTSTATUS status, ULONG_PTR Information, PVOID Context)
{
	MY_IRP* Irp = reinterpret_cast<MY_IRP*>(Context);
	//...
	delete Irp;
};

bool WillBeNoCallback(NTSTATUS status); 

	HANDLE hFile;
	RtlSetIoCompletionCallback(hFile, IoCallback, 0);

	if (MY_IRP* Irp = new MY_IRP(..))
	{
		LARGE_INTEGER ByteOffset{x}, Length{y};
		NTSTATUS status = NtLockFile(hFile, 0, 0, Irp, Irp, &ByteOffset, &Length, 'keyX', TRUE, TRUE);
		if (WillBeNoCallback(status)) // if (NT_ERROR(status)) 
		{
			IoCallback(status, 0, Irp);
		}
	}

  • We bind IoCallback to hFile, so the system will call it when there is an IOCP entry.
  • Every I/O operation requires a unique IO_STATUS_BLOCK, so I allocate one: MY_IRP* Irp = new MY_IRP(…)
  • Now I call the asynchronous I/O API (NtLockFile in this case) and pass MY_IRP* Irp to it as ApcContext (to get it back in IoCallback).
  • But MY_IRP* Irp needs to be freed when the I/O has finished. So how and when do we do this?
  • The natural solution (as it seems to me) is to do it inside IoCallback.
  • But if IoCallback is not going to be called (due to an I/O error), we need to just free MY_IRP* Irp, or better, call IoCallback(status, 0, Irp) manually.
  • So the QUESTION is how to detect, based on the returned status (and maybe the IO_STATUS_BLOCK), whether IoCallback will be called (as a result of an IOCP notification) or not.
  • So the question is about the implementation of bool WillBeNoCallback(NTSTATUS status);
  • Is if (NT_ERROR(status)) always correct (forget about FILE_SKIP_COMPLETION_PORT_ON_SUCCESS!! it is not set!)?
  • I showed that it is not always!

Note that whether there will be an IOCP notification or not depends only on the I/O manager. It does not depend on the driver that handles the request! The driver can have any implementation of lock/unlock/ioctl, but it is not the driver that posts the IOCP entry (or queues the APC, or sets the user event). This is done exclusively by the I/O manager, based on the final status the driver returns.

Sorry, I was really mistaken about NtUnlockFile: it is a synchronous API, so there will never be an IOCP notification here.

But the case still exists in the device control and lock paths (FastIoDeviceControl and FastIoLock) where the API returns NT_ERROR(status) but a packet is queued to the IOCP.

Usually (as documented) we have to assume that there will be no IOCP notification if the API returns a status for which NT_ERROR(status) is true.

As a result we can just free the IO_STATUS_BLOCK and the other resources allocated for the call if we get NT_ERROR(status) (status in [0xC0000000, 0xFFFFFFFF]).

But because there can be an IOCP notification even in this case, the callback bound to this IOCP will be called too, and there we again try to free the IO_STATUS_BLOCK…

The result is a double free.
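
The double free can be shown with simple bookkeeping. This is just a counting sketch of the naive caller described above, not real I/O code; `NaiveFreeCount` and the constants are hypothetical names for illustration, and the `packetQueued` flag stands for what the I/O manager actually did.

```cpp
#include <cassert>
#include <cstdint>

using Status = std::uint32_t;                      // stand-in for NTSTATUS
constexpr Status kStatusSuccess        = 0x00000000;
constexpr Status kStatusLockNotGranted = 0xC0000055;

constexpr bool IsNtError(Status s) { return (s >> 30) == 3; } // same test as NT_ERROR

// Counts how many times a naive caller ends up freeing the per-request block.
int NaiveFreeCount(Status returned, bool packetQueued)
{
    int frees = 0;
    if (IsNtError(returned)) ++frees;  // caller frees at once: "an error means no packet"
    if (packetQueued) ++frees;         // the completion callback frees the same block
    return frees;
}
```

For STATUS_LOCK_NOT_GRANTED with a queued packet the count is 2, which is the double free; every other combination shown in the thread gives 1.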


PoC in native API code:
https://github.com/rbmm/LockFile-Poc/blob/master/NT_Api_poc.cpp
https://github.com/rbmm/LockFile-Poc/blob/master/NT_poc.log

What I said about where the error came from (I/O manager, Fast I/O, IRP-based I/O) of course really does not matter.
But I was simply trying to explain why/how/where the situation arises in which NT_ERROR(status) is returned but a packet is queued to the IOCP:
it happens when FastIoDeviceControl or FastIoLock returns TRUE with NT_ERROR(status).
In this case (Fast I/O returns TRUE) the I/O manager really does queue a packet to the IOCP and does not look at the final operation status.

And the result does not depend on how drivers handle lock or ioctl, only on the I/O manager, because it is its duty to decide how to complete the user request, in particular whether or not to post an entry to the IOCP.

After more research into the kernel code, I found that this is really a bug only inside the NtLockFile API:
after FastIoLock returns TRUE it does not check the final status and posts an entry to the IOCP for any status. Pseudocode from win10:

NTSTATUS NtLockFile(... PVOID ApcContext ...)
{
	PFILE_OBJECT FileObject;

	// code after FastIoLock return TRUE -

	PIO_COMPLETION_CONTEXT CompletionContext = FileObject->CompletionContext;

	if (CompletionContext &&
		ApcContext &&
		!(FileObject->Flags & FO_SKIP_COMPLETION_PORT))
	{
		IoSetIoCompletionEx2( CompletionContext->Port,
			CompletionContext->Key,
			ApcContext,
			status, // not checked !!!
			Information,
			...);
	}
}

look NtLockFile.png

For comparison, here is the code from IopXxxControlFile (the error is fixed in win8 but still exists in win7):

NTSTATUS IopXxxControlFile(... PVOID ApcContext ...)
{
	PFILE_OBJECT FileObject;

	NTSTATUS status;
	PVOID Port = 0, Key = 0;

	// code after FastIoDeviceControl 

	PIO_COMPLETION_CONTEXT CompletionContext = FileObject->CompletionContext;

	if (
		CompletionContext && 
		!(FileObject->Flags & FO_SKIP_COMPLETION_PORT) &&
		!NT_ERROR(status) // checked !!!
		)
	{
		Port = CompletionContext->Port, Key = CompletionContext->Key;
	}

	if (Port && ApcContext)
	{
		IoSetIoCompletionEx2( Port, Key, ApcContext,status, Information,...);
	}
}

Here the check !NT_ERROR(status) was added.

So if the error (I believe it is a bug) in ntoskrnl!NtLockFile is fixed:

bool IsWillBeIOCPNotification(NTSTATUS status /*returned from api call*/, bool FoSkipCompletionPort )
{
    if (status == STATUS_PENDING) return true;

    if (NT_ERROR(status)) return false;

    return !FoSkipCompletionPort;
}
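
To see where the rule and the NtLockFile fast path disagree, the two conditions can be put side by side. This is only a user-mode model of the pseudocode quoted above, not kernel code: `ExpectedPacket`, `LockFastPathPacket` and `RuleMismatch` are hypothetical helper names, and the fast-path condition is copied from the pseudocode, in which the final status is not checked.

```cpp
#include <cassert>
#include <cstdint>

using Status = std::uint32_t;                      // stand-in for NTSTATUS
constexpr Status kStatusSuccess        = 0x00000000;
constexpr Status kStatusPending        = 0x00000103;
constexpr Status kStatusLockNotGranted = 0xC0000055;

constexpr bool IsNtError(Status s) { return (s >> 30) == 3; } // same test as NT_ERROR

// The rule stated above: what a caller expects from the returned status.
bool ExpectedPacket(Status returned, bool foSkipCompletionPort)
{
    if (returned == kStatusPending) return true;
    if (IsNtError(returned)) return false;
    return !foSkipCompletionPort;
}

// The condition from the NtLockFile pseudocode after FastIoLock returns TRUE:
// the status does not take part in the decision at all.
bool LockFastPathPacket(bool hasCompletionContext, bool hasApcContext, bool foSkipCompletionPort)
{
    return hasCompletionContext && hasApcContext && !foSkipCompletionPort;
}

// True when the rule and the fast path disagree for a synchronous return
// on a file that has an IOCP bound and a non-null ApcContext.
bool RuleMismatch(Status returned, bool foSkipCompletionPort)
{
    return ExpectedPacket(returned, foSkipCompletionPort) !=
           LockFastPathPacket(true, true, foSkipCompletionPort);
}
```

In this model the only disagreement is an error status with the skip flag clear, for example STATUS_LOCK_NOT_GRANTED: the rule says no packet, the fast path queues one.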

I am curious to know what led you to tumble down this rabbit hole. You started off asking a very general design question, but was this all triggered because your application encountered a problem with the LockFile API? If not, what led you into this investigation?

@Tim_Roberts I am sorry for my bad English and for my inability to express my thoughts clearly.

I am developing a general class library for asynchronous I/O and have always been interested in this topic. But several days ago, when I was playing with asynchronous file locking, I stumbled by chance on a design bug: NtLockFile returned STATUS_LOCK_NOT_GRANTED, but despite this error the I/O completion port was notified and the associated I/O callback function was called, which must not happen.

My mistake was that I initially looked only at the WRK-v1.2 source code (very old already, of course), and I understood why this happens. But when I looked at NtDeviceIoControlFile → IopXxxControlFile, I saw the same mistake in the code there. So I decided that the same problem must exist with IOCTLs if we use an asynchronous file and the driver implements FastIoDeviceControl and returns TRUE from it with an NT_ERROR status. That would be more serious. But when I tested IOCTLs on win10 there was no error. So I looked at the binary code (win7, 8.1, 10) and understood that the error in IopXxxControlFile was fixed by 8.1 at the latest, but MS forgot to fix NtLockFile. So the bug exists only there now. It is easy to fix (only one !NT_ERROR(status) check needs to be added), but it still exists. Maybe MS will fix this bug if it is reported.

As for what I said about NtUnlockFile, I was of course hasty and mistaken: it is a synchronous API and has no ApcContext parameter, so there must be no IOCP notification here at all. Sorry for this mistake.

The bug in NtLockFile really exists, but only there. It was earlier in the ioctl path too, but was fixed long ago.