Problem handling connected UDP sockets with a WFP local proxy

Hi there,
I want to develop a network global proxy project. Using the WFP framework, I successfully implemented the TCP proxy, but I ran into problems when handling UDP. I heard from the community that it might be a bug in Microsoft's framework (Source port 0 for UDP connect data while doing connect redirect FWPM_LAYER_ALE_CONNECT_REDIRECT_V4 - #7 by sagi_zar). Here is my problem:

Some udp socket applications call the Connect function, so they will only receive data packets from the ip specified in the Connect function, and will not receive data packets from the local proxy, so my local proxy cannot reply to udp traffic

My attempts:

I have not solved it yet. I tried multiple methods:

  1. Forging datagrams through raw sockets, to benevolently deceive applications into thinking that the packets sent by the local proxy come from their target server. This failed because of Microsoft's restrictions on raw sockets ("UDP datagrams with an invalid source address cannot be sent over raw sockets."). I also posted on Stack Overflow to ask how to bypass the restriction (How to break out of Windows raw sockets to send UDP datagrams where the source ip does not exist on the network interface - Stack Overflow), but I didn't get any response.
  2. At the FWPM_LAYER_DATAGRAM_DATA_V4 layer, I tried to capture the packets from the local proxy and modify their source IP, but I didn't capture any data, probably because the source address of the application redirected by WFP is a loopback address.
  3. Using an LSP. However, Microsoft's documentation says that LSPs are deprecated since Windows 8, and it seems an LSP has to interfere directly with the application. I didn't find relevant code or examples, so I couldn't learn how to do it.
  4. Creating a TAP network adapter. I didn't find any relevant documentation. I only found wintun, and I would rather not use third-party libraries.

If anyone can help me, thank you very much.

In the Windows samples: msdn-code-gallery-microsoft/Official Windows Driver Kit Sample/Windows Driver Kit (WDK) 8.1 Samples/[C++]-windows-driver-kit-81-cpp/WDK 8.1 C++ Samples/Windows Filtering Platform Packet Modification Sample/C++/sys/DD_proxy.c at master · microsoftarchive/msdn-code-gallery-microsoft · GitHub
It first calls the new-flow callback, and then the UDP classify is triggered, because the flow callback changed the packet layer to the UDP callback layer (with a specific callback ID).
Perhaps this is something you could use.

Thanks for your reply, and sorry for the two-month delay. I have been busy and only recently found time to run some experiments.

  1. Either I didn't understand what you meant, or I expressed myself badly. What I mean is that at the DATAGRAM_DATA layer I can't capture any of the loopback packets that the local proxy sends in reply (I suspect that the DATAGRAM_DATA layer cannot capture loopback packets). The loopback packets do not pass through my layer. The premise of ddproxy's redirection logic is that it can capture packets at DATAGRAM_DATA before receiving the context of FlowEstablished, but I capture nothing at this layer. ddproxy should only be used to redirect to other proxy servers, not to a local proxy.

  2. I tried to explore the lower flow layer. I successfully captured the data packet that the local proxy replied to the application in IPPACKET_V4 (INBOUND). However, after I modified the source address and called FwpsInjectNetworkSendAsync0, NetBufferList->Status was always 0xc000021b. Then I changed to IPPACKET_V4 (OUTBOUND). After modifying the source address, NetBufferList->Status was always 0xc000000d. I guess Microsoft thinks that the data packet in this case is abnormal, so it blocks or discards the sending

What should I do?

@Jason_Stephenson Sorry for disturbing you suddenly. I have been stuck on this problem for a long time. I see that you have a lot of knowledge in this area in the community. Maybe you can give some insight here. Thank you very much for your time !

It has been a few years since I've played with this stuff in earnest, so take my response with a pinch of salt.

I want to develop a network global proxy project.

What goal do you have in mind? Is there a subset of protocols you are interested in? Understanding this will make your life easier, i.e. "Do I really need to look at all UDP data?"

Some udp socket applications call the Connect function, so they will only receive data packets from the ip specified in the Connect function, and will not receive data packets from the local proxy, so my local proxy cannot reply to udp traffic

I do not recall this being true for all connected UDP sockets. The only circumstance in which I've seen redirected packets rejected was with the Windows DNS client (DNS requests). In that case I intercepted the downstream packet (the DNS response) again at DATAGRAM_DATA and made it look like it came from the server the Windows DNS client wanted to talk to. How are you redirecting your UDP packets? ALE_CONNECT_REDIRECT?

However, after I modified the source address and called FwpsInjectNetworkSendAsync0, NetBufferList->Status was always 0xc000021b

This is STATUS_DATA_NOT_ACCEPTED. Are you sure you altered the packet correctly? Did you advance and/or retreat the NetBufferList to the correct point before calling inject?
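For reference, the two status values quoted in this thread can be decoded with a small lookup. This is only a sketch: `nt_status_name` and `nt_success` are made-up helper names, the constant names are the ones listed in ntstatus.h for these values, and the success test mirrors what the NT_SUCCESS macro checks (the sign bit).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: name the two NTSTATUS values seen in this thread. */
const char *nt_status_name(uint32_t status)
{
    switch (status) {
    case 0xC000000Du: return "STATUS_INVALID_PARAMETER";
    case 0xC000021Bu: return "STATUS_DATA_NOT_ACCEPTED";
    default:          return "unknown";
    }
}

/* Hypothetical helper: success and informational codes have the top bit clear. */
int nt_success(uint32_t status)
{
    return (status & 0x80000000u) == 0;
}
```

So the inbound injection was rejected as data the stack would not accept, while the outbound attempt failed parameter validation.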

As a general rule of thumb you want to do as much as possible at the ALE layers on a per "flow" basis. This has the best interoperability with VPNs (IPsec) and other ISVs.

Jason

@Jason_Stephenson Thanks for your reply and your time. I have made improvements based on your suggestions, but the problem still exists.

Oops... let me correct an error in my previous description: with "IPPACKET_V4 (INBOUND)" I called "FwpsInjectNetworkReceiveAsync", and it was with "IPPACKET_V4 (OUTBOUND)" that I called FwpsInjectNetworkSendAsync0.
I probably garbled that part of the message while editing my reply.

In general operating systems, setting the system proxy is just adding the proxy server address to the environment variable. The program itself must read this environment variable if it wants to use the proxy, but most programs often do not read this environment variable. I want all network data of applications on the computer to go through my proxy, but most proxy servers only support TCP and UDP. Therefore, I currently want to use WFP to proxy TCP and UDP traffic of all applications.

I tested it. I wrote a udp socket client program. When the connect function was not called and the data was sent through the sendto function, the program successfully went through my proxy and got the correct reply. I modified the code and called the connect function and sent the data through send. The program did not get any reply. I found in the debugging analysis that the data replied by the proxy did not reach the application correctly. It seemed to be intercepted and discarded by Windows.
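The behaviour described above can be reproduced without any driver. The sketch below is an assumption-laden stand-in for my test program: it uses POSIX sockets on loopback rather than Winsock, and the names `bind_loopback` and `reply_reaches_connected_client` are invented for illustration. A client calls connect() toward one peer, then a reply is sent either from that peer or from an unrelated socket, and the function reports whether the client's recv() sees it.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

/* Create a UDP socket bound to 127.0.0.1 on an ephemeral port. */
static int bind_loopback(struct sockaddr_in *addr)
{
    socklen_t len = sizeof *addr;
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0) return -1;
    memset(addr, 0, sizeof *addr);
    addr->sin_family = AF_INET;
    addr->sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    if (bind(fd, (struct sockaddr *)addr, sizeof *addr) < 0 ||
        getsockname(fd, (struct sockaddr *)addr, &len) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Returns 1 if the connected client received the reply, 0 if it timed
 * out, -1 on setup failure. The reply comes from the connected peer when
 * reply_from_peer is nonzero, otherwise from an unrelated socket. */
int reply_reaches_connected_client(int reply_from_peer)
{
    struct sockaddr_in peer_addr, stranger_addr, client_addr;
    socklen_t len = sizeof client_addr;
    struct timeval timeout = { 0, 300 * 1000 };   /* 300 ms */
    char buf[16];
    int result = -1;
    int peer = bind_loopback(&peer_addr);
    int stranger = bind_loopback(&stranger_addr);
    int client = socket(AF_INET, SOCK_DGRAM, 0);

    if (peer >= 0 && stranger >= 0 && client >= 0 &&
        connect(client, (struct sockaddr *)&peer_addr, sizeof peer_addr) == 0 &&
        getsockname(client, (struct sockaddr *)&client_addr, &len) == 0 &&
        setsockopt(client, SOL_SOCKET, SO_RCVTIMEO, &timeout, sizeof timeout) == 0 &&
        send(client, "ping", 4, 0) == 4) {
        int sender = reply_from_peer ? peer : stranger;
        if (sendto(sender, "pong", 4, 0, (struct sockaddr *)&client_addr,
                   sizeof client_addr) == 4)
            result = recv(client, buf, sizeof buf, 0) == 4 ? 1 : 0;
    }
    if (peer >= 0) close(peer);
    if (stranger >= 0) close(stranger);
    if (client >= 0) close(client);
    return result;
}
```

Here the two senders differ only in port, but the filter applies to the whole (address, port) pair named in connect(), which is why a reply arriving from the local proxy's address is not delivered to the application.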

Yes, I redirect the UDP packets with ALE_CONNECT_REDIRECT.

I am not familiar with packet injection. This is my code. It does not work. DbgView prints "WfpProxyCore Error in InBoundDataInject:c000021b". From my understanding, it should be fine. Can you help me debug it? I will be very grateful.
I have filtered out the code that I think is not interesting for this problem. If necessary, I will provide more detailed code. This is a minimized example:

VOID INBoundUDPConnectFilterCallBack(const FWPS_INCOMING_VALUES0* inFixedValues, const FWPS_INCOMING_METADATA_VALUES0* inMetaValues, void* layerData, const void* classifyContext, const FWPS_FILTER1* filter, UINT64 flowContext, FWPS_CLASSIFY_OUT0* classifyOut) 
{	
	classifyOut->actionType = FWP_ACTION_PERMIT;
	classifyOut->rights |= FWPS_RIGHT_ACTION_WRITE;
	
	
	UINT32 RemoteAddress = inFixedValues->incomingValue[FWPS_FIELD_INBOUND_IPPACKET_V4_IP_REMOTE_ADDRESS].value.uint32;
	UINT32 LocalAddress = inFixedValues->incomingValue[FWPS_FIELD_INBOUND_IPPACKET_V4_IP_LOCAL_ADDRESS].value.uint32;	
	IF_INDEX InterfaceIndex = inFixedValues->incomingValue[FWPS_FIELD_INBOUND_IPPACKET_V4_INTERFACE_INDEX].value.uint32;
	IF_INDEX SubInterfaceIndex = inFixedValues->incomingValue[FWPS_FIELD_INBOUND_IPPACKET_V4_SUB_INTERFACE_INDEX].value.uint32;
	FWPS_PACKET_INJECTION_STATE PacketInjectionState = FwpsQueryPacketInjectionState(InjectHandle, (NET_BUFFER_LIST*)layerData, NULL);

	if ((PacketInjectionState == FWPS_PACKET_NOT_INJECTED)|| RemoteAddress == IP_TO_HEX(127, 0, 0, 1))
	{
		UINT32 IpHeaderSize = inMetaValues->ipHeaderSize;
		UINT32 TransportHeaderSize = inMetaValues->transportHeaderSize;
		PNET_BUFFER_LIST CloneNetBufferList = NULL;
		PNET_BUFFER NetBuffer = NET_BUFFER_LIST_FIRST_NB((PNET_BUFFER_LIST)layerData);
		PNET_BUFFER CloneNetBuffer;

		NdisRetreatNetBufferDataStart(NetBuffer, IpHeaderSize, 0, 0);
		NTSTATUS Status = FwpsAllocateCloneNetBufferList((PNET_BUFFER_LIST)layerData, NULL, NULL, 0, &CloneNetBufferList);
		NdisAdvanceNetBufferDataStart(NetBuffer, IpHeaderSize, false, NULL);
		if (!NT_SUCCESS(Status)|| !CloneNetBufferList)
		{
			return;
		}
		CloneNetBuffer = NET_BUFFER_LIST_FIRST_NB(CloneNetBufferList);
		PIPV4_HEADER pIPHeader = (PIPV4_HEADER)NdisGetDataBuffer(CloneNetBuffer, IpHeaderSize, NULL, 1, 0);

		if (!pIPHeader)
		{
			DbgPrint("Null pIPHeader\n");
			FwpsFreeCloneNetBufferList(CloneNetBufferList, 0);
			return;
		}else if (pIPHeader->Protocol != IPPROTO_UDP) 
		{
			DbgPrint("Not UDP,Protocol:%x\n", pIPHeader->Protocol);
			FwpsFreeCloneNetBufferList(CloneNetBufferList, 0);
			return;
		}
		NdisAdvanceNetBufferDataStart(CloneNetBuffer,IpHeaderSize,false,NULL);

		PUDP_HDR pUDPHeader = (PUDP_HDR)NdisGetDataBuffer(CloneNetBuffer, sizeof(UDP_HDR), NULL, 1, 0);

		if (!pUDPHeader)
		{
			DbgPrint("Null pUDPHeader\n");
			FwpsFreeCloneNetBufferList(CloneNetBufferList, 0);
			return;
		}

		
		if (pUDPHeader->src_port != SWITCH_PORT_ENDIANNESS( 9870)) //This is the port that the local proxy receives or replies to the program
		{
			DbgPrint("No hit,Src Port:%x\n", pUDPHeader->src_port);
			FwpsFreeCloneNetBufferList(CloneNetBufferList, 0);
			return;
		}
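		pIPHeader->SourceAddress = SWITCH_IPADDRSS_ENDIANNESS(IP_TO_HEX(8, 8, 8, 8));
		NdisRetreatNetBufferDataStart(CloneNetBuffer, IpHeaderSize, 0, NULL); //Undo the advance above: the injected NET_BUFFER_LIST must start at the IP header
		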
		Status=FwpsInjectNetworkReceiveAsync(InjectHandle,NULL,0,(COMPARTMENT_ID)inMetaValues->compartmentId,InterfaceIndex,SubInterfaceIndex,CloneNetBufferList, DriverIPPACKETDataInjectComplete,0);
		if (!NT_SUCCESS(Status)) 
		{
			DbgPrint("WfpProxyCore Error in FwpsInjectNetworkReceiveAsync:%x\n",Status);
			FwpsFreeCloneNetBufferList(CloneNetBufferList, 0); //The completion routine is not called on failure, so release the clone here
			return;
		}
		DbgPrint("Redirected DNS Reply Modify!");
		classifyOut->actionType = FWP_ACTION_BLOCK;
		classifyOut->rights &= ~FWPS_RIGHT_ACTION_WRITE;
		classifyOut->flags |= FWPS_CLASSIFY_OUT_FLAG_ABSORB;
	}
}
void NTAPI DriverIPPACKETDataInjectComplete(_In_ void* context,_Inout_ NET_BUFFER_LIST* netBufferList,_In_ BOOLEAN dispatchLevel) //Deprecated
{
	if (!NT_SUCCESS(netBufferList->Status))
	{
		DbgPrint("WfpProxyCore Error in InBoundDataInject:%x\n", netBufferList->Status);
	}
	FwpsFreeCloneNetBufferList(netBufferList, 0);
}

Are you trying to create a VPN of some kind?

Normally, proxying of network connections has to be either application aware (classic HTTP proxy) or network transparent (performed by network devices, so no software of any kind on Windows). But both require the host that will proxy your traffic to be somewhere on the network.

With a VPN, the application remains unaware that anything except normal network communications are happening, but the OS knows to do something special for certain traffic - encapsulate it and send it somewhere else.

It makes no sense to try and run a local proxy server, so maybe you mean to implement some kind of VPN scheme?

Not entirely correct. I am not implementing a VPN. I am only proxying traffic, not encrypting it.
The reason for using a local proxy is that if the traffic were forwarded directly to the destination proxy server, that server would not know the source and destination addresses of the traffic.
I first redirect the traffic to my local proxy through WFP, then my local proxy tells the proxy server the source and destination addresses of the traffic and forwards the traffic to my proxy server, and finally my proxy server accesses the destination on the client's behalf.

I think you are making this more complicated for yourself than it needs to be.

VPN encapsulates network traffic on a per-packet basis. It can also encrypt that traffic, and usually does, but that's not an essential feature. Usually the tunnel is also authenticated, and it can best be thought of as an overlay network - a virtual network built on top of a physical one that has a different logical layout.

Connection redirection works differently. Instead of encapsulating each packet and adding a new IP header, the existing IP header is rewritten. Return traffic has to have that header restored.
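As a rough illustration of the bookkeeping this implies, the sketch below remembers the original destination when a flow is redirected and looks it up again when return traffic has to be restored. Everything in it is hypothetical: the names `flow_table`, `flow_remember` and `flow_original_dst`, the fixed table size, and the choice of the client's source port as the key are assumptions for the example, not anything WFP provides.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FLOW_SLOTS 64   /* arbitrary size for the sketch */

typedef struct { uint32_t ip; uint16_t port; } endpoint;   /* host byte order */
typedef struct { int used; uint16_t client_port; endpoint original_dst; } flow_entry;
typedef struct { flow_entry slots[FLOW_SLOTS]; } flow_table;

/* Record (or update) the destination the client originally asked for.
 * Returns 0 on success, -1 if the table is full. */
int flow_remember(flow_table *t, uint16_t client_port, endpoint dst)
{
    flow_entry *free_slot = NULL;
    assert(t != NULL);
    for (size_t i = 0; i < FLOW_SLOTS; i++) {
        flow_entry *e = &t->slots[i];
        if (e->used && e->client_port == client_port) {
            e->original_dst = dst;
            return 0;
        }
        if (!e->used && free_slot == NULL)
            free_slot = e;
    }
    if (free_slot == NULL)
        return -1;
    free_slot->used = 1;
    free_slot->client_port = client_port;
    free_slot->original_dst = dst;
    return 0;
}

/* Look up what the source of a reply must be rewritten to, or NULL. */
const endpoint *flow_original_dst(const flow_table *t, uint16_t client_port)
{
    assert(t != NULL);
    for (size_t i = 0; i < FLOW_SLOTS; i++)
        if (t->slots[i].used && t->slots[i].client_port == client_port)
            return &t->slots[i].original_dst;
    return NULL;
}
```

A real driver would also need locking and expiry of idle entries, which this sketch leaves out.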

In terms of your actual problem, how do you tell the remote server what the destination should be? If it isn't encapsulation, how do you tell it about each different conversation and then associate packet data with those instructions? You need some kind of control channel between the two servers, which you then need to use while holding (or discarding?) the first few packets of a new conversation. You also need to maintain some kind of list or table (memory and speed). If you are also performing stateful packet inspection (firewall protocol analysis) this won't be anything extra, but for simply proxying a connection, all of this extra overhead is why encapsulation is the standard technique.

Thank you for your time

I only access the destination server on behalf of the client and do not handle local and LAN data.

I am currently stuck at the step of restoring the return traffic header. After the WFP ALE_CONNECT_REDIRECT redirection, the application and the redirected proxy (local proxy) become loopback sockets. The loopback packets cannot be captured in the DATAGRAM_DATA layer. I captured the loopback packets in IPPACKET_V4 (INBOUND), but when the header is reinjected, netbufferlist->status is 0xc000021b.

Yes, I use my own way to handle client connections:
For TCP connections, after the connection is established, the client sends the information of this proxy connection, and the server creates a TCP connection to the destination server according to the information provided. When one end closes the connection, the sockets of the client and the destination server will be closed together.
For UDP connections, each time the client connects, it inserts a piece of information before the data and sends it to the proxy server. The proxy server releases the connection on a timer (if the destination server does not respond within 15 seconds, the connection is released).
But this is not important to the topic of this post. The problem I want to solve is that the IP header of the reply data cannot be restored, so the client does not receive the reply from the proxy service.

pIPHeader->SourceAddress

You need to update the source port as well. The point of this redirection is to make it seamless to the source application. I vaguely recall having to deal with the UDP checksum here. You may need to calculate the checksums yourself.
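For what it's worth, a minimal user-mode sketch of the two checksums involved follows. The function names are invented for the example, and the buffers are assumed to be plain byte arrays in network order with the checksum field zeroed before the call. The UDP checksum covers an IPv4 pseudo-header (source address, destination address, protocol 17, UDP length), so rewriting the source address changes it even when the payload is untouched.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Add len bytes to a running one's-complement sum as big-endian 16-bit words. */
static uint32_t sum16(const uint8_t *p, size_t len, uint32_t sum)
{
    while (len > 1) {
        sum += ((uint32_t)p[0] << 8) | p[1];
        p += 2;
        len -= 2;
    }
    if (len)                       /* odd trailing byte is padded with zero */
        sum += (uint32_t)p[0] << 8;
    return sum;
}

/* Fold the carries back in and take the one's complement. */
static uint16_t fold(uint32_t sum)
{
    while (sum >> 16)
        sum = (sum & 0xFFFFu) + (sum >> 16);
    return (uint16_t)~sum;
}

/* IPv4 header checksum; bytes 10-11 of hdr must be zero on input. */
uint16_t ipv4_header_checksum(const uint8_t *hdr, size_t hdr_len)
{
    assert(hdr_len >= 20);
    return fold(sum16(hdr, hdr_len, 0));
}

/* UDP-over-IPv4 checksum. udp points at the UDP header (checksum bytes
 * 6-7 zeroed) and udp_len is header plus payload. */
uint16_t udp4_checksum(const uint8_t src[4], const uint8_t dst[4],
                       const uint8_t *udp, size_t udp_len)
{
    uint32_t sum = 0;
    uint16_t result;
    assert(udp_len >= 8 && udp_len <= 0xFFFFu);
    sum = sum16(src, 4, sum);      /* pseudo-header: source address */
    sum = sum16(dst, 4, sum);      /* pseudo-header: destination address */
    sum += 17;                     /* pseudo-header: protocol (UDP) */
    sum += (uint32_t)udp_len;      /* pseudo-header: UDP length */
    result = fold(sum16(udp, udp_len, sum));
    return result ? result : 0xFFFFu;   /* 0 on the wire means "no checksum" */
}
```

The results are in host order and must be stored big-endian (high byte first) in the header.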

With regards to the rest, this is far too fiddly for me to deep dive into when I haven't got an environment. My advice to you would be to print the bytes of every packet sent and received. Also, make sure you understand how you are advancing / retreating the NBLs.
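Something along these lines would do for the byte printing. It is a user-mode sketch with an invented name, `hex_dump`, that formats into a caller-supplied string, 16 bytes per row with an offset prefix. In a driver the resulting string would presumably go to DbgPrint instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Format data as rows of "offset: b0 b1 ..." into out (NUL-terminated).
 * Returns the number of characters written; stops early if out is full. */
size_t hex_dump(const uint8_t *data, size_t len, char *out, size_t out_size)
{
    size_t used = 0;
    assert(out != NULL && out_size > 0);
    out[0] = '\0';
    for (size_t i = 0; i < len; i++) {
        char piece[32];
        int n = 0;
        if (i % 16 == 0)                       /* row start: print the offset */
            n += snprintf(piece + n, sizeof piece - (size_t)n, "%04zx:", i);
        n += snprintf(piece + n, sizeof piece - (size_t)n, " %02x",
                      (unsigned)data[i]);
        if (i % 16 == 15 || i + 1 == len)      /* row end or last byte */
            n += snprintf(piece + n, sizeof piece - (size_t)n, "\n");
        if (used + (size_t)n >= out_size)      /* no room left, stop */
            break;
        memcpy(out + used, piece, (size_t)n + 1);
        used += (size_t)n;
    }
    return used;
}
```

Dumping the same packet before the rewrite and just before injection makes a wrong offset or a stale checksum easy to spot.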


Thank you for your reply. Now I can't seem to get the data part of the packet at the IPPACKET_INBOUND layer, so I can't calculate the checksum. At the same time, it seems that the value of the "length" member of the udp header is also wrong (my test reply data is "Message received!", but its value is 70???)

This is my attempt to get the data part of the packet.
The output is "Data:" (yes, with no content, although the packet I captured in Wireshark has the correct content).
Sorry to take up your time again. This is my experimental code:

		...
		NdisAdvanceNetBufferDataStart(pNetBuffer, sizeof(UDP_HDR), false, NULL); //As I understand it, the data offset of IPPACKET_INBOUND is at the beginning of the transport header (Microsoft documentation), so the advance data with an offset size equal to the size of the UDP header is required.
		char* pDataBuffer = (char*)ExAllocatePool(NonPagedPool, SWITCH_SHORT_ENDIANNESS(pUDPHeader->dgram_len)-sizeof(UDP_HDR));
		if (!pDataBuffer)
		{
			DbgPrint("pDataDataBuffer is NullPtr\n");
			FwpsFreeCloneNetBufferList(pCloneNetBufferList, 0);
			return;
		}
		char* pResultDataBuffer = (char*)NdisGetDataBuffer(pNetBuffer, SWITCH_SHORT_ENDIANNESS(pUDPHeader->dgram_len)-sizeof(UDP_HDR),pDataBuffer,1,0);
		if(pResultDataBuffer)
		{
			DbgPrint("Data:%.*s\n", (int)(SWITCH_SHORT_ENDIANNESS(pUDPHeader->dgram_len) - sizeof(UDP_HDR)),pResultDataBuffer); //The precision argument of %.*s must be an int
		}
		NdisRetreatNetBufferDataStart(pNetBuffer, sizeof(UDP_HDR), 0, 0);
		...
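As a cross-check outside the driver, the same parsing can be done on a flat copy of the packet bytes (for example, bytes exported from Wireshark starting at the IP header). The sketch below is an assumption-based illustration with invented names (`udp_view`, `parse_ipv4_udp`): it finds the UDP header from the IHL field instead of assuming 20 bytes and reads the length field as big-endian. For a 17-byte payload such as "Message received!" the UDP length field should be 25, the 8-byte header plus the data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint16_t src_port;        /* host byte order */
    uint16_t dst_port;        /* host byte order */
    const uint8_t *payload;   /* points into the packet buffer */
    size_t payload_len;       /* UDP length field minus the 8-byte header */
} udp_view;

/* Parse a buffer that starts at the IPv4 header.
 * Returns 0 on success, -1 if it is not a well-formed IPv4/UDP packet. */
int parse_ipv4_udp(const uint8_t *pkt, size_t len, udp_view *out)
{
    size_t ihl, udp_len;
    const uint8_t *udp;
    assert(pkt != NULL && out != NULL);
    if (len < 20 || (pkt[0] >> 4) != 4)
        return -1;                              /* not IPv4 */
    ihl = (size_t)(pkt[0] & 0x0F) * 4;          /* header length in bytes */
    if (ihl < 20 || len < ihl + 8 || pkt[9] != 17)
        return -1;                              /* bad IHL, truncated, or not UDP */
    udp = pkt + ihl;
    udp_len = ((size_t)udp[4] << 8) | udp[5];   /* big-endian length field */
    if (udp_len < 8 || ihl + udp_len > len)
        return -1;                              /* length field is inconsistent */
    out->src_port = (uint16_t)((udp[0] << 8) | udp[1]);
    out->dst_port = (uint16_t)((udp[2] << 8) | udp[3]);
    out->payload = udp + 8;
    out->payload_len = udp_len - 8;
    return 0;
}
```

If the length this reports for the captured bytes is 25 while the driver prints something else, the driver is most likely reading the header at the wrong offset.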

Sorry, as mentioned before, I can't dig into this in any more detail without an environment. Some parting thoughts:

  • Print everything until you understand what it's doing
  • This should be doable (and preferable) at DATAGRAM_DATA
  • Good luck

Jason