NVMeOF TCP Initiator for Windows

Hi,

Is there anything available for NVMe like the MS-iSCSI initiator?
I have looked around and couldn’t find anything.
The intent is to discover and connect to an NVMe target over TCP from a Windows machine, similar to what Linux offers with the nvme and nvme-tcp modules and the nvme tool.

Hence, I am looking to implement an NVMeOF (over TCP) initiator, perhaps similar to the MS-iSCSI initiator, unless Microsoft is expected to release one for NVMe.
I have searched for MS-iSCSI initiator sample code for reference but couldn’t find any.
Any help in this regard would be appreciated.

Thanks,
Sandilya

NVMeOF is a protocol that is very different from iSCSI. I do not believe that there is anything from Microsoft, and I would be surprised if they have plans for anything anytime soon. The major storage vendors have too much invested in their proprietary versions.

Usually, NVMeOF is used with Fibre Channel. It can be used with Ethernet too, but it is not TCP.

Thank you for your input. However, in my Google search I came across the following, which says it can work over TCP.
https://www.starwindsoftware.com/starwind-nvme-of-initiator

I want to develop a similar one, probably tightly coupled with our NIC. Any references would be helpful.

I know nothing about this product, or what you want to do. With respect, if you are asking advice on how to approach this from an Internet forum, you have a long way to go on this project.

I apologize if it sounded that way. I am not asking how to approach the project; I have two questions. First, does MSFT have plans to provide this? You answered that in your first reply. Second, is there any MS-iSCSI initiator sample code available, only so that I can understand the TCP part? I have learned that WSK is one way to communicate via sockets from the kernel. I just want to evaluate any other options for socket communication in kernel code, if they exist.
Thank you again.

Well, I guess you are right.

https://nvmexpress.org/wp-content/uploads/NVMe-over-Fabrics-1.1a-2021.07.12-Ratified.pdf

The specification does include explicit TCP support. Typically this is deployed using lossless Ethernet or Fibre Channel, but I suppose TCP would qualify as a similar lossless transport. The performance would be abysmal, and it would be much more fragile if you intend to rely on the Windows TCP stack. I think that you would need a TCP stack on your hardware if you intend to support bootable volumes, but I have not looked into the details enough to know.

Anyway, I would expect that you would plan to be enumerated as an additional device on your NIC, and that your upper edge would be a storage controller and your lower edge would be your NIC - the NVMe part being implemented entirely between your driver and firmware. But that’s just a guess.
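
On the TCP part specifically: TCP hands the initiator a reliable byte stream with no message boundaries, so whatever sits above the socket has to cut that stream back into NVMe/TCP PDUs. As I read the TCP binding, every PDU starts with an 8-byte common header (type, flags, HLEN, PDO, then a 4-byte little-endian PLEN giving the total PDU length). Below is a minimal user-mode sketch of that framing step, under those assumptions. The function name nvme_tcp_next_pdu and the max_pdu limit are hypothetical and not taken from any driver; check the offsets against the specification before relying on them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NVME_TCP_CH_LEN 8u /* common header: type, flags, hlen, pdo, plen (4 bytes, LE) */

/* Read a 32-bit little-endian value regardless of host byte order. */
static uint32_t get_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Inspect the bytes received so far (hypothetical helper).
 * Returns  >0 : total length of the complete PDU at the start of buf
 *           0 : not enough bytes yet, keep receiving
 *          -1 : header is inconsistent, the connection should be torn down
 */
static int64_t nvme_tcp_next_pdu(const uint8_t *buf, size_t avail, uint32_t max_pdu)
{
    assert(buf != NULL || avail == 0);

    if (avail < NVME_TCP_CH_LEN)
        return 0;                      /* common header not complete yet */

    uint8_t  hlen = buf[2];            /* PDU header length */
    uint32_t plen = get_le32(&buf[4]); /* total PDU length, header included */

    if (hlen < NVME_TCP_CH_LEN || plen < hlen || plen > max_pdu)
        return -1;                     /* malformed or larger than we accept */

    if (avail < plen)
        return 0;                      /* header seen, body still in flight */

    return (int64_t)plen;
}
```

A receive loop would append each chunk from the socket to a buffer, call this until it returns 0, and consume the returned number of bytes each time. The max_pdu bound is there so that a corrupt PLEN cannot make the initiator wait for gigabytes that will never arrive.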

Apologies again for the delay in responding; I have not been well recently.
Yes, that’s correct. One alternative is for it to be enumerated as an additional device on the NIC, with storage as the upper edge, the NIC as the lower edge, and a private implementation in the firmware/driver.

But not all vendors/models have the support required to implement things in the driver/firmware, hence a second alternative.
It would be enumerated as a root device like the MS-iSCSI initiator, with storage as the upper edge and any NIC as the lower edge (it could be extended to RDMA-capable NICs).
So for this, I am looking for the right interface to communicate with the NICs from the driver.

If you are not in control of the NIC hardware, then your task is immensely harder and is probably not worth the effort.

Also, for this project, RDMA is an irrelevant feature. RDMA is a technology to accelerate the transfer of data from the NIC into RAM. But you are primarily concerned with lossless transfer over the fabric - or emulated lossless transfer over TCP over the fabric. And it is likely that whatever mechanism the NIC driver uses, your throughput will be much lower - and if it relies on TCP, and the hardware does not actually implement lossless Ethernet, then it is likely to be much lower still.

Your question is whether you can or should use the in-box TCP stack. If you have specific hardware that can implement the stack on your device, that would be the standard approach. If not, then whatever you implement will suck and there isn’t much you can do about it.

Completely agree with all your points. But the problem at hand is having something working (acknowledging lower throughput) vs. not having anything at all, especially when Linux has a standard tool that does this seamlessly for any NIC.

I have understood that it is possible to implement this using WSK, and I’ll summarize my study. I have noted that it is not easy, and is immensely harder without control of the NIC hardware.
Thank you very much again for your insight.
In my opinion this should ideally come from MSFT, like the MS-iSCSI initiator, if it is worth having, but I am talking without complete knowledge here.

I implemented NVMe over TCP a couple of years back. It works fantastically, similar to Linux, and on some workloads better than Linux.

One more point: I have written a library for Winsock Kernel to be used in kernel applications, which made the whole effort much easier.

@Yatindra_Vaishnav Awesome… are you able to share any of your work with the community? That would, I’m sure, be most appreciated.

@“Peter_Viscarola_(OSR)” I would love to, and I am trying to get something out for the community that will not conflict with the organization.

Meanwhile, I would be happy to help if someone has an issue getting this piece of the puzzle working.

@Yatindra_Vaishnav Great to hear about your product. I was about to start on this before being pulled off to another priority project, but I will start soon. I was planning to use WSK for the underlying TCP communication, and it looks like that is the right choice. I understand if you cannot share your work, but any pointers you can share would be of great help.

Start by looking at https://docs.microsoft.com/en-us/windows-hardware/drivers/ddi/wsk/ns-wsk-_wsk_provider_connection_dispatch. This is the first thing you will need for doing anything with Winsock Kernel (WSK) sockets.
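
Once the socket is connected, the first thing an NVMe/TCP initiator sends is an Initialize Connection Request (ICReq) PDU. Here is a minimal user-mode sketch of building one, so the layout can be checked outside the kernel first. The offsets are my reading of the TCP transport binding (128-byte PDU: 8-byte common header, PFV, HPDA, DGST, MAXR2T, rest reserved), and the helper names put_le32 and nvme_tcp_build_icreq are made up for illustration. In a driver, the filled buffer would then be handed to the send routine of the connection dispatch table linked above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NVME_TCP_PDU_ICREQ 0x00u /* PDU type: Initialize Connection Request */
#define NVME_TCP_ICREQ_LEN 128u  /* ICReq is a fixed-size 128-byte PDU */

/* Store a 32-bit value little-endian regardless of host byte order. */
static void put_le32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v & 0xFFu);
    p[1] = (uint8_t)((v >> 8) & 0xFFu);
    p[2] = (uint8_t)((v >> 16) & 0xFFu);
    p[3] = (uint8_t)((v >> 24) & 0xFFu);
}

/*
 * Fill buf (at least 128 bytes) with an ICReq that asks for no digests.
 * maxr2t is the 0's based maximum number of outstanding R2Ts.
 * Returns the number of bytes to send. Hypothetical helper.
 */
static size_t nvme_tcp_build_icreq(uint8_t *buf, uint32_t maxr2t)
{
    assert(buf != NULL);
    memset(buf, 0, NVME_TCP_ICREQ_LEN);       /* reserved bytes must be zero */

    buf[0] = NVME_TCP_PDU_ICREQ;              /* PDU-Type */
    buf[1] = 0;                               /* FLAGS */
    buf[2] = (uint8_t)NVME_TCP_ICREQ_LEN;     /* HLEN: header length */
    buf[3] = 0;                               /* PDO: no data in this PDU */
    put_le32(&buf[4], NVME_TCP_ICREQ_LEN);    /* PLEN: total PDU length */
    /* bytes 8-9: PFV, PDU format version 0 (already zeroed) */
    buf[10] = 0;                              /* HPDA: no extra data alignment */
    buf[11] = 0;                              /* DGST: header/data digest off */
    put_le32(&buf[12], maxr2t);               /* MAXR2T */
    return NVME_TCP_ICREQ_LEN;
}
```

The target should answer with an ICResp PDU before any command capsule is sent. Writing the fields byte by byte avoids depending on structure packing or host endianness, which is handy when the same code is later moved into a driver.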

As promised, please look at my GitHub page for the RDMA and TCP kernel socket library and the NVMe-oF driver, which I’m building as open source to help the community (in response to the comment by @“Peter_Viscarola_(OSR)”). It is not yet complete, but I want to share it so that people can start following the overall development.

https://github.com/yatindrav/NVMe-oF

This is a great effort. Thank you for making your work publicly available.

Let me know if you need any guidance.

Thank you so much for publishing NVMe-oF. May I ask a question about compiling the driver? I use VS2022 but I get some compile errors. Could you tell me whether there is an additional library I’m missing? It seems some definitions cannot be found. Thanks again.

Build started...
1>------ Build started: Project: NVMeoF, Configuration: Debug x64 ------
1>Building 'NVMeoF' with toolset 'WindowsKernelModeDriver10.0' and the 'Universal' target platform.
1>Stamping x64\Debug\NVMeoF.inf
1>Stamping [Version] section with DriverVer=08/04/2023,10.41.31.774
1>fab_rdma.c
. . . . . .
. . . . . .
1>Z:\OneDrive\CZ_Dev\NVMe-oF-WinDrv\workplace\fab_rdma.c(1074,1): error C2084: function 'NTSTATUS NdkCreateRdmaSocket(IF_INDEX,PNDK_COMPLETION_CBS,PNDK_SOCKET *)' already has a body
1>Z:\OneDrive\CZ_Dev\NVMe-oF-WinDrv\workplace\fab_rdma.c(1006,1): message : see previous definition of 'NdkCreateRdmaSocket'
1>Z:\OneDrive\CZ_Dev\NVMe-oF-WinDrv\workplace\fab_rdma.c(1590,2): error C2065: 'NDK_WREQ_SQ_GET_SEND_RESULTS': undeclared identifier
1>Z:\OneDrive\CZ_Dev\NVMe-oF-WinDrv\workplace\fab_rdma.c(1695,2): error C2051: case expression not constant
1>fab_tcp.c
1>nvmefrdma.c
1>Z:\OneDrive\CZ_Dev\NVMe-oF-WinDrv\workplace\nvmef.h(106,2): error C2061: syntax error: identifier 'PNVMEOF_TCP_WORK_REQUEST'
1>Z:\OneDrive\CZ_Dev\NVMe-oF-WinDrv\workplace\nvmef.h(107,1): error C2059: syntax error: '}'
. . . . . .
. . . . . .
1>Z:\OneDrive\CZ_Dev\NVMe-oF-WinDrv\workplace\nvmefrdma.c(49,56): error C2146: syntax error: missing ')' before identifier 'psDispatcher'
1>Z:\OneDrive\CZ_Dev\NVMe-oF-WinDrv\workplace\nvmefrdma.c(49,56): error C2081: 'PNDK_CALLBACKS': name in formal parameter list illegal
. . . . . .
1>Z:\OneDrive\CZ_Dev\NVMe-oF-WinDrv\workplace\nvmefrdma.c(209,63): error C2065: 'NVMEOF_RESPONSE_STATE_SUBMITTED': undeclared identifier
1>Z:\OneDrive\CZ_Dev\NVMe-oF-WinDrv\workplace\nvmefrdma.c(210,58): error C2065: 'NVMEOF_RESPONSE_STATE_FREE': undeclared identifier
1>Z:\OneDrive\CZ_Dev\NVMe-oF-WinDrv\workplace\nvmefrdma.c(210,89): error C2065: 'NVMEOF_RESPONSE_STATE_FREE': undeclared identifier
. . . . . .
. . . . . .
1>Z:\OneDrive\CZ_Dev\NVMe-oF-WinDrv\workplace\nvmefrdma.c(653,126): fatal  error C1003: error count exceeds 100; stopping compilation
1>nvmeftcp.c
. . . . . .
. . . . . .
1>Generating Code...
1>Done building project "NVMeoF.vcxproj" -- FAILED.
========== Build: 0 succeeded, 1 failed, 0 up-to-date, 0 skipped ==========
========== Build started at 10:41 AM and took 04.276 seconds ==========