From mboxrd@z Thu Jan 1 00:00:00 1970
Content-Type: multipart/mixed; boundary="===============3351855897106064939=="
MIME-Version: 1.0
From: Harris, James R
Subject: Re: [SPDK] NBD with SPDK
Date: Mon, 12 Aug 2019 18:41:42 +0000
Message-ID: <2473EE9E-0598-4C2F-A9B7-580774566AF0@intel.com>
In-Reply-To: 40047C26-7BD2-40F3-BBD9-621C93F9971A@intel.com
List-ID:
To: spdk@lists.01.org

--===============3351855897106064939==
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable

On 8/12/19, 11:20 AM, "SPDK on behalf of Harris, James R" wrote:

    On 8/12/19, 9:20 AM, "SPDK on behalf of Mittal, Rishabh via SPDK" wrote:

        <>

        I am thinking of passing the physical address of the buffers in bio to spdk. I don’t know if it is already pinned by the kernel or do we need to explicitly do it. And also, spdk has some requirements on the alignment of physical address. I don’t know if address in bio conforms to those requirements.

        SPDK won’t be running in VM.

    Hi Rishabh,

    SPDK relies on data buffers being mapped into the SPDK application's address space, and are passed as virtual addresses throughout the SPDK stack. Once the buffer reaches a module that requires a physical address (such as the NVMe driver for a PCIe-attached device), SPDK translates the virtual address to a physical address. Note that the NVMe fabrics transports (RDMA and TCP) both deal with virtual addresses, not physical addresses. The RDMA transport is built on top of ibverbs, where we register virtual address areas as memory regions for describing data transfers.

    So for nbd, pinning the buffers and getting the physical address(es) to SPDK wouldn't be enough. Those physical address regions would also need to get dynamically mapped into the SPDK address space.

    Do you have any profiling data that shows the relative cost of the data copy v. the system calls themselves on your system? There may be some optimization opportunities on the system calls to look at as well.

    Regards,

    -Jim

Hi Rishabh,

Could you also clarify what the 50us is measuring? For example, does this include the NVMe-oF round trip? And if so, what is the backing device for the namespace on the target side?

Thanks,

-Jim

        From: "Luse, Paul E"
        Date: Sunday, August 11, 2019 at 12:53 PM
        To: "Mittal, Rishabh", "spdk(a)lists.01.org"
        Cc: "Kadayam, Hari", "Chen, Xiaoxi", "Szmyd, Brian"
        Subject: RE: NBD with SPDK

        Hi Rishabh,

        Thanks for the question. I was talking to Jim and Ben about this a bit, one of them may want to elaborate but we’re thinking the cost of mmap and also making sure the memory is pinned is probably prohibitive. As I’m sure you’re aware, SPDK apps use spdk_alloc() with the SPDK_MALLOC_DMA which is backed by huge pages that are effectively pinned already. SPDK does virt to phy transition on memory allocated this way very efficiently using spdk_vtophys(). It would be an interesting experiment though. Your app is not in a VM right?

        Thx
        Paul

        From: Mittal, Rishabh [mailto:rimittal(a)ebay.com]
        Sent: Saturday, August 10, 2019 6:09 PM
        To: spdk(a)lists.01.org
        Cc: Luse, Paul E; Kadayam, Hari; Chen, Xiaoxi; Szmyd, Brian
        Subject: NBD with SPDK

        Hi,

        We are trying to use NBD and SPDK on client side. Data path looks like this

        File System ----> NBD client ------>SPDK------->NVMEoF

        Currently we are seeing a high latency in the order of 50 us by using this path. It seems like there is data buffer copy happening for write commands from kernel to user space when spdk nbd read data from the nbd socket.

        I think that there could be two ways to prevent data copy.

        1. Memory mapped the kernel buffers to spdk virtual space. I am not sure if it is possible to mmap a buffer. And what is the impact to call mmap for each IO.
        2. If NBD kernel give the physical address of a buffer and SPDK use that to DMA it to NVMEoF. I think spdk must also be changing a virtual address to physical address before sending it to nvmeof.

        Option 2 makes more sense to me. Please let me know if option 2 is feasible in spdk

        Thanks
        Rishabh Mittal

        _______________________________________________
        SPDK mailing list
        SPDK(a)lists.01.org
        https://lists.01.org/mailman/listinfo/spdk

--===============3351855897106064939==--