WFP driver. PAGE_FAULT_IN_NONPAGED_AREA bugcheck on RtlCopyMemory()

Hello everyone,

I’m working on a network driver based on the Windows Filtering Platform that intercepts network packets. The intercepted packets are then pushed to a user-mode system service, which analyzes them.

The driver is quite stable for me, but according to user feedback it triggers a PAGE_FAULT_IN_NONPAGED_AREA bugcheck when OpenVPN starts (on desktops) and on Windows Surface Pro devices. I tried to reproduce the issue with OpenVPN, but failed.

Here is the bugcheck info:
PAGE_FAULT_IN_NONPAGED_AREA (50)
Invalid system memory was referenced. This cannot be protected by try-except,
it must be protected by a Probe. Typically the address is just plain bad or it
is pointing at freed memory.
Arguments:
Arg1: ffffe000815550f4, memory referenced.
Arg2: 0000000000000000, value 0 = read operation, 1 = write operation.
Arg3: fffff80083c54450, If non-zero, the instruction address which referenced the bad memory address.
Arg4: 0000000000000000, (reserved)

Analyzing the crash dump, I found that the problem is in the following code:

void *pbuf_data_v = NULL;
NET_BUFFER *pbuf_v = NULL;
ulong pkt_size_v = 0;

bb_byte_blob_t pkt_v; // the source code of the bb_byte_blob_t is below

pbuf_v = NET_BUFFER_LIST_FIRST_NB(pdata->_ppkt); // pdata->_ppkt is a pointer to the NET_BUFFER_LIST structure
CHECK_NULL(pbuf_v, status_v, exit_l); // just checks whether the pbuf_v is NULL

pbuf_data_v = NdisGetDataBuffer(pbuf_v, pkt_size_v, NULL, 1, 0); // getting the pointer to the buffer
CHECK_NULL(pbuf_data_v, status_v, exit_l);

status_v = bb_assign(&pkt_v, pbuf_data_v, pkt_size_v); // makes a copy of the packet (source is below). BSOD happens here
CHECK_ERROR(status_v, exit_l);

Here is the source code of the bb_byte_blob_t:

#pragma pack(push, 1)
struct bb_byte_blob_t
{
    void *_buffer;
    uint _size;
};
#pragma pack(pop)

Here is the source of the bb_assign():

NTSTATUS bb_assign(bb_byte_blob_t *pbb, void *pdata, uint size)
{
    NTSTATUS status_v = STATUS_UNSUCCESSFUL;

    NULL_ASSERT("pbb == NULL", pbb);
    NULL_ASSERT("pdata == NULL", pdata);

    status_v = bb_allocate(pbb, size); // allocates the buffer (source is below)
    CHECK_ERROR(status_v, exit_l);

    RtlCopyMemory(pbb->_buffer, pdata, size); // the BSOD is here

exit_l:
    return status_v;
}

NTSTATUS bb_allocate(bb_byte_blob_t *pbb, uint size)
{
    NTSTATUS status_v = STATUS_MEMORY_NOT_ALLOCATED;

    NULL_ASSERT("pbb == NULL", pbb);

    pbb->_buffer = NULL;
    pbb->_size = 0;

    if (size > 0)
    {
        M_ALLOC(pbb->_buffer, size); // macro for ExAllocatePoolWithTag()
        if (NULL != pbb->_buffer)
        {
            pbb->_size = size;
            status_v = STATUS_SUCCESS;
        }
    }
    return status_v;
}

Since Arg2 in the crash dump is 0 (a read operation), I assume the fault is on the source side of the copy, i.e. that NdisGetDataBuffer() returns a corrupted pointer, but I have no idea why this would happen.

Any help will be appreciated!

Thanks in advance,
Richard

Consider reviewing documentation for NdisGetDataBuffer. In particular learn
the cases where NdisGetDataBuffer is designed to return NULL and revise your
code.
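
To make the NULL cases concrete, here is a minimal user-mode sketch of the return rules as the NdisGetDataBuffer documentation describes them. It is a toy model, not NDIS code: model_fragment, model_net_buffer and model_get_data_buffer are hypothetical names, and the fragment chain only stands in for the MDL chain of a real NET_BUFFER.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one piece of packet data (one MDL in NDIS). */
typedef struct model_fragment {
    const unsigned char *bytes;
    size_t length;
    const struct model_fragment *next;
} model_fragment;

/* Hypothetical stand-in for a NET_BUFFER. */
typedef struct model_net_buffer {
    const model_fragment *first;  /* chain of fragments */
    size_t data_length;           /* total bytes of packet data */
} model_net_buffer;

/* Models only the documented outcomes of the call; assumption-laden toy. */
static const void *model_get_data_buffer(const model_net_buffer *nb,
                                         size_t bytes_needed,
                                         void *storage)
{
    assert(nb != NULL);

    if (bytes_needed > nb->data_length)
        return NULL;                      /* more requested than the packet holds */

    if (nb->first != NULL && bytes_needed <= nb->first->length)
        return nb->first->bytes;          /* contiguous: pointer into the buffer */

    if (storage == NULL)
        return NULL;                      /* fragmented and no Storage supplied */

    /* Fragmented: gather the requested bytes into the caller's storage. */
    unsigned char *out = storage;
    size_t remaining = bytes_needed;
    for (const model_fragment *f = nb->first; f != NULL && remaining > 0; f = f->next) {
        size_t n = f->length < remaining ? f->length : remaining;
        memcpy(out, f->bytes, n);
        out += n;
        remaining -= n;
    }
    return remaining == 0 ? storage : NULL;
}
```

The posted code passes NULL for Storage, so under these rules a packet whose requested bytes span more than one fragment yields NULL rather than a pointer. The model says nothing about how long a returned pointer stays valid; that depends on who owns the NET_BUFFER_LIST at the time of the copy.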

Also, there are two versions of the OpenVPN virtual miniport: An older NDIS
5 version and a more recent NDIS 6 version. Make sure that you are testing
(in the OpenVPN case…) with the same platform and the same OpenVPN version
as the bug report.

Good luck!

Thomas F. Divine
http://www.pcausa.com

NTDEV is sponsored by OSR

Visit the list at: http://www.osronline.com/showlists.cfm?list=ntdev

OSR is HIRING!! See http://www.osr.com/careers

For our schedule of WDF, WDM, debugging and other seminars visit:
http://www.osr.com/seminars

To unsubscribe, visit the List Server section of OSR Online at
http://www.osronline.com/page.cfm?name=ListServer

On 28-Oct-2014 14:04, Thomas F. Divine wrote:

Consider reviewing documentation for NdisGetDataBuffer. In particular learn
the cases where NdisGetDataBuffer is designed to return NULL and revise your
code.

Richard, are all these CHECK_NULL and so on macros #defined to do
anything in release (a.k.a. free) mode?

Excuse me for asking but things happen sometimes.
– pa

The caller provides pdata which points to invalid or paged memory. I’d suspect it’s where some MDL used to be mapped.

Thanks everyone for replies!

Thomas,
There is a CHECK_NULL() macro, which checks the pointer for NULL (unfortunately, the source code formatting got corrupted, so it may not be clear):

pbuf_v = NET_BUFFER_LIST_FIRST_NB(pdata->_ppkt);
CHECK_NULL(pbuf_v, status_v, exit_l);

Here is the CHECK_NULL definition:
#define CHECK_ERROR(_res, _err) \
    if (!NT_SUCCESS(_res)) \
        goto _err

#define CHECK_NULL(_ptr, _res, _err) \
    _res = (NULL != _ptr) ? \
        STATUS_SUCCESS : STATUS_UNSUCCESSFUL; \
    CHECK_ERROR(_res, _err)
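
A side note on the macros themselves, which nothing in this thread points to as the cause of the crash: CHECK_NULL as posted expands to two statements, so under an unbraced if only the assignment would be conditional. A common remedy is to wrap each macro in do { ... } while (0). The sketch below assumes user-mode stand-ins for NTSTATUS, STATUS_SUCCESS, STATUS_UNSUCCESSFUL and NT_SUCCESS so that it compiles outside the WDK (a driver takes them from the kernel headers), and demo_check is a hypothetical caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* User-mode stand-ins (assumptions) for the kernel definitions. */
typedef int32_t NTSTATUS;
#define STATUS_SUCCESS       ((NTSTATUS)0)
#define STATUS_UNSUCCESSFUL  ((NTSTATUS)(-1073741823)) /* 0xC0000001 */
#define NT_SUCCESS(s)        (((NTSTATUS)(s)) >= 0)

/* Each macro is a single statement, so it is safe under if/else. */
#define CHECK_ERROR(_res, _err)      \
    do {                             \
        if (!NT_SUCCESS(_res))       \
            goto _err;               \
    } while (0)

#define CHECK_NULL(_ptr, _res, _err)                                      \
    do {                                                                  \
        (_res) = (NULL != (_ptr)) ? STATUS_SUCCESS : STATUS_UNSUCCESSFUL; \
        CHECK_ERROR(_res, _err);                                          \
    } while (0)

/* Hypothetical caller: the check sits in an unbraced if/else. */
static NTSTATUS demo_check(const void *ptr, int enabled)
{
    NTSTATUS status_v = STATUS_SUCCESS;

    if (enabled)
        CHECK_NULL(ptr, status_v, exit_l);
    else
        status_v = STATUS_SUCCESS;

exit_l:
    return status_v;
}
```

With the original two-statement form, the if/else in demo_check would not compile, and without the else the CHECK_ERROR part would run regardless of the condition.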

Pavel,
Yes, they are. All these macros are defined in both debug and release builds.

Alex,
As far as I understand, the NET_BUFFER_LIST contains the MDLs, but when I call NdisGetDataBuffer() it should return either NULL (if something goes wrong) or a pointer to contiguous memory that can be copied. It looks like NdisGetDataBuffer() returns a non-NULL pointer (because the CHECK_NULL() check passes), but the memory is then freed somehow before the copy, so RtlCopyMemory() faults.

Any possibility that there is a problem in fetching the actual frame size to
copy?

Thomas F. Divine

Run the box with Driver Verifier enabled for the VPN driver, NDIS, and your driver.

It’s possible that the VPN driver frees the buffer prematurely.

Hello there!

The problem is resolved.

After deeper investigation, we found that the real problem was the interaction between our driver and Avast’s drivers. As Alex suspected, the buffer was freed prematurely, which caused the bugcheck.

Thanks for your assistance,
Richard