From: Harris, James R <james.r.harris@intel.com>
To: spdk@lists.01.org
Subject: Re: [SPDK] NBD with SPDK
Date: Tue, 13 Aug 2019 21:45:21 +0000
Message-ID: <DFD1CFC1-7CD5-4ABA-862A-DE86670DBFD9@intel.com>
In-Reply-To: <70E41F7C-8F6E-468B-AEAD-F9909E661987@ebay.com>


Hi Rishabh,

The idea is technically feasible, but I think you would find that the cost of pinning the pages and mapping them into the SPDK process would far exceed the cost of the kernel/user copy.
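
For illustration only, here is the kind of comparison I mean, as a rough userspace sketch (hypothetical code, not SPDK's): the per-IO cost of a plain buffer copy versus the per-IO cost of just creating and tearing down a mapping, before you even account for pinning (get_user_pages) or IOMMU programming.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <time.h>

#define IO_SIZE    (128 * 1024)
#define ITERATIONS 10000

static double elapsed_us(struct timespec a, struct timespec b)
{
    return (b.tv_sec - a.tv_sec) * 1e6 + (b.tv_nsec - a.tv_nsec) / 1e3;
}

int main(void)
{
    char *src = malloc(IO_SIZE), *dst = malloc(IO_SIZE);
    struct timespec t0, t1;
    int i;

    memset(src, 0xa5, IO_SIZE);

    /* Cost of the kernel/user style copy, approximated by memcpy. */
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (i = 0; i < ITERATIONS; i++) {
        memcpy(dst, src, IO_SIZE);
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);
    printf("copy:      %.2f us per IO\n", elapsed_us(t0, t1) / ITERATIONS);

    /* Cost of creating/destroying a mapping per IO (no pinning yet). */
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (i = 0; i < ITERATIONS; i++) {
        void *p = mmap(NULL, IO_SIZE, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        for (size_t off = 0; off < IO_SIZE; off += 4096) {
            ((volatile char *)p)[off] = 0;  /* fault each page in */
        }
        munmap(p, IO_SIZE);
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);
    printf("map/unmap: %.2f us per IO\n", elapsed_us(t0, t1) / ITERATIONS);

    free(src);
    free(dst);
    return 0;
}

Even without any pinning in the picture, the map/unmap path is typically not cheaper than the copy at these sizes, which is the concern.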

From your original e-mail - could you clarify what the 50us is measuring?  For example, does this include the NVMe-oF round trip?  And if so, what is the backing device for the namespace on the target side?

Thanks,

-Jim


On 8/13/19, 12:55 PM, "Mittal, Rishabh" <rimittal@ebay.com> wrote:

    I don't have any profiling data. I am not really worried about the system calls, because I think we could find a way to optimize them. I am really worried about the bcopy. How can we avoid copying data from kernel to user space?
    
    The other idea we have is to map the physical address of a buffer in the bio into SPDK virtual memory. We would have to modify the nbd driver or write a new lightweight driver for this. Do you think something like that is feasible to do in SPDK?
    
    
    Thanks
    Rishabh Mittal
    
    On 8/12/19, 11:42 AM, "Harris, James R" <james.r.harris@intel.com> wrote:
    
        
        
        On 8/12/19, 11:20 AM, "SPDK on behalf of Harris, James R" <spdk-bounces@lists.01.org on behalf of james.r.harris@intel.com> wrote:
        
            
            
            On 8/12/19, 9:20 AM, "SPDK on behalf of Mittal, Rishabh via SPDK" <spdk-bounces@lists.01.org on behalf of spdk@lists.01.org> wrote:
            
                <<As I'm sure you're aware, SPDK apps use spdk_alloc() with the SPDK_MALLOC_DMA which is backed by huge pages that are effectively pinned already. SPDK does virt to phy transition on memory allocated this way very efficiently using spdk_vtophys().  It would be an interesting experiment though.  Your app is not in a VM right?>>
                
                I am thinking of passing the physical addresses of the buffers in the bio to SPDK. I don't know whether they are already pinned by the kernel or whether we need to pin them explicitly. Also, SPDK has some requirements on the alignment of physical addresses, and I don't know if the addresses in a bio conform to those requirements.
                
                SPDK won't be running in a VM.
            
            
            Hi Rishabh,
            
            SPDK relies on data buffers being mapped into the SPDK application's address space; they are passed as virtual addresses throughout the SPDK stack.  Once a buffer reaches a module that requires a physical address (such as the NVMe driver for a PCIe-attached device), SPDK translates the virtual address to a physical address.  Note that the NVMe fabrics transports (RDMA and TCP) both deal with virtual addresses, not physical addresses.  The RDMA transport is built on top of ibverbs, where we register virtual address areas as memory regions for describing data transfers.
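
            To illustrate that last point (a sketch only - the helper name is made up, and this is not the actual SPDK RDMA transport code), ibverbs registration is expressed entirely in terms of a virtual address range, and the returned keys are what the data transfer descriptors reference:

            #include <infiniband/verbs.h>

            /* Register a data buffer with an existing protection domain "pd".
             * The region is described by a *virtual* address; the verbs layer
             * pins the pages and the NIC handles the physical addressing. */
            static struct ibv_mr *
            register_data_buffer(struct ibv_pd *pd, void *buf, size_t len)
            {
                return ibv_reg_mr(pd, buf, len,
                                  IBV_ACCESS_LOCAL_WRITE |
                                  IBV_ACCESS_REMOTE_READ |
                                  IBV_ACCESS_REMOTE_WRITE);
            }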
            
            So for nbd, pinning the buffers and getting the physical address(es) to SPDK wouldn't be enough.  Those physical address regions would also need to get dynamically mapped into the SPDK address space.
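
            And even if such a mapping were arranged somehow, the region would still need to be registered with SPDK's memory map before spdk_vtophys() and the transports could translate it. A minimal sketch, assuming a region that is already pinned, already mapped into this process, and 2 MB aligned (the helper name is made up; check the exact requirements against your SPDK version):

            #include "spdk/env.h"

            /* vaddr/len are assumed to be 2 MB aligned and already mapped. */
            static int
            expose_region_to_spdk(void *vaddr, size_t len)
            {
                /* Adds the region to SPDK's memory map so that address
                 * translation and transport registration can see it. */
                return spdk_mem_register(vaddr, len);
            }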
            
            Do you have any profiling data that shows the relative cost of the data copy v. the system calls themselves on your system?  There may be some optimization opportunities on the system calls to look at as well.
            
            Regards,
            
            -Jim
        
        Hi Rishabh,
        
        Could you also clarify what the 50us is measuring?  For example, does this include the NVMe-oF round trip?  And if so, what is the backing device for the namespace on the target side?
        
        Thanks,
        
        -Jim
        
            
            
            
            
                From: "Luse, Paul E" <paul.e.luse(a)intel.com>
                Date: Sunday, August 11, 2019 at 12:53 PM
                To: "Mittal, Rishabh" <rimittal(a)ebay.com>, "spdk(a)lists.01.org" <spdk(a)lists.01.org>
                Cc: "Kadayam, Hari" <hkadayam(a)ebay.com>, "Chen, Xiaoxi" <xiaoxchen(a)ebay.com>, "Szmyd, Brian" <bszmyd(a)ebay.com>
                Subject: RE: NBD with SPDK
                
                Hi Rishabh,
                
                Thanks for the question. I was talking to Jim and Ben about this a bit; one of them may want to elaborate, but we're thinking the cost of mmap, plus making sure the memory is pinned, is probably prohibitive. As I'm sure you're aware, SPDK apps use spdk_alloc() with the SPDK_MALLOC_DMA flag, which is backed by huge pages that are effectively pinned already. SPDK does the virtual-to-physical translation on memory allocated this way very efficiently using spdk_vtophys(). It would be an interesting experiment, though. Your app is not in a VM, right?
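
                For reference, a minimal sketch of that allocation pattern using spdk_malloc()/spdk_vtophys() from spdk/env.h (the helper name is made up, and the exact signatures vary a bit across SPDK releases, so treat this as pseudocode to check against your headers):

                #include "spdk/env.h"

                static void *
                alloc_dma_buf(size_t len)
                {
                    /* Hugepage-backed, effectively pinned for the life of the app. */
                    void *buf = spdk_malloc(len, 0x1000, NULL,
                                            SPDK_ENV_SOCKET_ID_ANY, SPDK_MALLOC_DMA);
                    if (buf == NULL) {
                        return NULL;
                    }

                    /* Cheap lookup out of SPDK's pre-built virt-to-phys map. */
                    if (spdk_vtophys(buf, NULL) == SPDK_VTOPHYS_ERROR) {
                        spdk_free(buf);
                        return NULL;
                    }
                    return buf;
                }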
                
                Thx
                Paul
                
                From: Mittal, Rishabh [mailto:rimittal@ebay.com]
                Sent: Saturday, August 10, 2019 6:09 PM
                To: spdk@lists.01.org
                Cc: Luse, Paul E <paul.e.luse@intel.com>; Kadayam, Hari <hkadayam@ebay.com>; Chen, Xiaoxi <xiaoxchen@ebay.com>; Szmyd, Brian <bszmyd@ebay.com>
                Subject: NBD with SPDK
                
                Hi,
                
                We are trying to use NBD and SPDK on the client side. The data path looks like this:
                
                File System ----> NBD client ----> SPDK ----> NVMe-oF
                
                
                Currently we are seeing high latency, on the order of 50 us, on this path. It seems like there is a data buffer copy happening for write commands from kernel to user space when the SPDK nbd module reads data from the nbd socket.
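
                To make the copy concrete, below is a simplified sketch (not the actual SPDK lib/nbd code; the helper name is made up) of what any userspace nbd backend has to do on the write path. The second read() is where the payload gets copied out of kernel socket buffers into the userspace buffer:

                #include <linux/nbd.h>
                #include <unistd.h>
                #include <arpa/inet.h>

                static int
                recv_one_request(int nbd_sock, void *payload, size_t payload_len)
                {
                    struct nbd_request req;

                    if (read(nbd_sock, &req, sizeof(req)) != (ssize_t)sizeof(req))
                        return -1;

                    if (ntohl(req.type) == NBD_CMD_WRITE) {
                        size_t len = ntohl(req.len);

                        if (len > payload_len)
                            return -1;
                        /* This is the kernel-to-user data copy in question. */
                        if (read(nbd_sock, payload, len) != (ssize_t)len)
                            return -1;
                    }
                    return 0;
                }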
                
                I think there could be two ways to prevent the data copy:
                
                
                  1.  Memory-map the kernel buffers into SPDK virtual address space. I am not sure if it is possible to mmap such a buffer, or what the impact of calling mmap for each IO would be.
                  2.  Have the NBD kernel driver give the physical address of a buffer, and have SPDK use that to DMA it to NVMe-oF. I think SPDK must already be translating virtual addresses to physical addresses before sending them to NVMe-oF.
                
                Option 2 makes more sense to me. Please let me know if option 2 is feasible in SPDK.
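
                For option 2, the kernel side might look roughly like the sketch below (hypothetical code, not the existing nbd driver): walk the bio and collect the physical address and length of each segment. The pages would still need to be pinned for the duration of the IO, and SPDK would still need some way to reach that memory.

                #include <linux/bio.h>
                #include <linux/io.h>
                #include <linux/printk.h>

                static void
                describe_bio_segments(struct bio *bio)
                {
                    struct bio_vec bvec;
                    struct bvec_iter iter;

                    bio_for_each_segment(bvec, bio, iter) {
                        phys_addr_t phys = page_to_phys(bvec.bv_page) + bvec.bv_offset;

                        /* These (phys, bvec.bv_len) pairs are what would be
                         * handed to the userspace target. */
                        pr_debug("segment phys=%pa len=%u\n", &phys, bvec.bv_len);
                    }
                }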
                
                Thanks
                Rishabh Mittal
                
                _______________________________________________
                SPDK mailing list
                SPDK@lists.01.org
                https://lists.01.org/mailman/listinfo/spdk
                
            
            _______________________________________________
            SPDK mailing list
            SPDK@lists.01.org
            https://lists.01.org/mailman/listinfo/spdk
            
        
        
    
    

