From: Harris, James R
Subject: Re: [SPDK] NBD with SPDK
Date: Tue, 13 Aug 2019 21:45:21 +0000
To: spdk(a)lists.01.org

Hi Rishabh,

The idea is technically feasible, but I think you would find the cost of pinning the pages plus mapping them into the SPDK process would far exceed the cost of the kernel/user copy.

From your original e-mail - could you clarify what the 50us is measuring? For example, does this include the NVMe-oF round trip? And if so, what is the backing device for the namespace on the target side?

Thanks,

-Jim

On 8/13/19, 12:55 PM, "Mittal, Rishabh" <rimittal(a)ebay.com> wrote:

    I don't have any profiling data. I am not really worried about the system calls, because I think we could find a way to optimize them. I am really worried about the bcopy: how can we avoid copying from kernel to user space?

    The other idea we have is to map the physical address of a buffer in a bio into SPDK virtual memory. We would have to modify the nbd driver, or write a new lightweight driver, for this. Do you think that is feasible to do in SPDK?

    Thanks
    Rishabh Mittal

    On 8/12/19, 11:42 AM, "Harris, James R" wrote:

        On 8/12/19, 11:20 AM, "SPDK on behalf of Harris, James R" wrote:

            On 8/12/19, 9:20 AM, "SPDK on behalf of Mittal, Rishabh via SPDK" wrote:

                I am thinking of passing the physical addresses of the buffers in the bio to SPDK. I don't know whether they are already pinned by the kernel or whether we need to pin them explicitly. Also, SPDK has some alignment requirements on physical addresses; I don't know whether the addresses in a bio conform to those requirements.

                SPDK won't be running in a VM.
        Hi Rishabh,

        SPDK relies on data buffers being mapped into the SPDK application's address space, and they are passed as virtual addresses throughout the SPDK stack. Once a buffer reaches a module that requires a physical address (such as the NVMe driver for a PCIe-attached device), SPDK translates the virtual address to a physical address. Note that the NVMe fabrics transports (RDMA and TCP) both deal with virtual addresses, not physical addresses. The RDMA transport is built on top of ibverbs, where we register virtual address ranges as memory regions for describing data transfers.

        So for nbd, pinning the buffers and getting the physical address(es) to SPDK wouldn't be enough. Those physical address regions would also need to be dynamically mapped into the SPDK address space.

        Do you have any profiling data that shows the relative cost of the data copy vs. the system calls themselves on your system? There may be some optimization opportunities on the system calls to look at as well.

        Regards,

        -Jim

        Hi Rishabh,

        Could you also clarify what the 50us is measuring? For example, does this include the NVMe-oF round trip? And if so, what is the backing device for the namespace on the target side?

        Thanks,

        -Jim

        From: "Luse, Paul E"
        Date: Sunday, August 11, 2019 at 12:53 PM
        To: "Mittal, Rishabh" <rimittal(a)ebay.com>, "spdk(a)lists.01.org"
        Cc: "Kadayam, Hari", "Chen, Xiaoxi" <xiaoxchen(a)ebay.com>, "Szmyd, Brian"
        Subject: RE: NBD with SPDK

        Hi Rishabh,

        Thanks for the question. I was talking to Jim and Ben about this a bit; one of them may want to elaborate, but we're thinking the cost of mmap, and of making sure the memory is pinned, is probably prohibitive. As I'm sure you're aware, SPDK apps use spdk_malloc() with the SPDK_MALLOC_DMA flag, which is backed by huge pages that are effectively pinned already. SPDK does the virtual-to-physical translation on memory allocated this way very efficiently using spdk_vtophys().
        It would be an interesting experiment though. Your app is not in a VM, right?

        Thx
        Paul

        From: Mittal, Rishabh [mailto:rimittal(a)ebay.com]
        Sent: Saturday, August 10, 2019 6:09 PM
        To: spdk(a)lists.01.org
        Cc: Luse, Paul E; Kadayam, Hari <hkadayam(a)ebay.com>; Chen, Xiaoxi; Szmyd, Brian
        Subject: NBD with SPDK

        Hi,

        We are trying to use NBD and SPDK on the client side. The data path looks like this:

        File System ----> NBD client ------> SPDK -------> NVMe-oF

        Currently we are seeing high latency, on the order of 50 us, on this path. There seems to be a data buffer copy from kernel to user space for write commands when SPDK's nbd backend reads the data from the nbd socket.

        I think there could be two ways to prevent the data copy:

        1. Memory-map the kernel buffers into SPDK virtual address space. I am not sure whether it is possible to mmap such a buffer, or what the impact of calling mmap for each I/O would be.

        2. Have the NBD kernel driver hand over the physical address of a buffer, and have SPDK use that address to DMA the data out to NVMe-oF. I think SPDK must already be translating virtual addresses to physical addresses before sending data over NVMe-oF.

        Option 2 makes more sense to me.
        Please let me know whether option 2 is feasible in SPDK.

        Thanks
        Rishabh Mittal

_______________________________________________
SPDK mailing list
SPDK(a)lists.01.org
https://lists.01.org/mailman/listinfo/spdk