Re: [SPDK] NBD with SPDK

* Re: [SPDK] NBD with SPDK
@ 2019-08-30 22:28 Mittal, Rishabh
  0 siblings, 0 replies; 32+ messages in thread
From: Mittal, Rishabh @ 2019-08-30 22:28 UTC (permalink / raw)
  To: spdk

[-- Attachment #1: Type: text/plain, Size: 11551 bytes --]

I got the run again. It is with 4k write.

13.16%  vhost                       [.] spdk_ring_dequeue                                                                          
   6.08%  vhost                       [.] rte_rdtsc                                                                                  
   4.77%  vhost                       [.] spdk_thread_poll                                                                           
   2.85%  vhost                       [.] _spdk_reactor_run                                                                          
   2.43%  [kernel]                    [k] syscall_return_via_sysret                                                                  
   2.17%  [kernel]                    [k] copy_user_enhanced_fast_string                                                             
   2.05%  [kernel]                    [k] _raw_spin_lock                                                                             
   1.83%  vhost                       [.] _spdk_msg_queue_run_batch                                                                  
   1.56%  vhost                       [.] _spdk_event_queue_run_batch                                                                
   1.56%  [kernel]                    [k] memcpy_erms                                                                                
   1.39%  [kernel]                    [k] switch_mm_irqs_off                                                                         
   1.33%  [kernel]                    [k] radix_tree_next_chunk                                                                      
   1.17%  [kernel]                    [k] native_queued_spin_lock_slowpath                                                           
   1.13%  [unknown]                   [k] 0xfffffe000000601b                                                                         
   1.02%  [kernel]                    [k] _raw_spin_lock_irqsave                                                                     
   0.94%  [kernel]                    [k] unix_stream_read_generic                                                                   
   0.92%  [kernel]                    [k] load_new_mm_cr3                                                                            
   0.87%  [kernel]                    [k] _raw_spin_lock_irq                                                                         
   0.83%  [kernel]                    [k] cmpxchg_double_slab.isra.61                                                                
   0.78%  [kernel]                    [k] mutex_lock                                                                                 
   0.78%  [kernel]                    [k] unix_stream_sendmsg                                                                        
   0.77%  [kernel]                    [k] sock_wfree                                                                                 
   0.74%  [kernel]                    [k] __schedule                                                                                 

On 8/29/19, 6:05 PM, "Mittal, Rishabh" <rimittal(a)ebay.com> wrote:

    I got the profile with first run. 

      27.91%  vhost                       [.] spdk_ring_dequeue                                                                          
      12.94%  vhost                       [.] rte_rdtsc                                                                                  
      11.00%  vhost                       [.] spdk_thread_poll                                                                           
       6.15%  vhost                       [.] _spdk_reactor_run                                                                          
       4.35%  [kernel]                    [k] syscall_return_via_sysret                                                                  
       3.91%  vhost                       [.] _spdk_msg_queue_run_batch                                                                  
       3.38%  vhost                       [.] _spdk_event_queue_run_batch                                                                
       2.83%  [unknown]                   [k] 0xfffffe000000601b                                                                         
       1.45%  vhost                       [.] spdk_thread_get_from_ctx                                                                   
       1.20%  [kernel]                    [k] __fget                                                                                     
       1.14%  libpthread-2.27.so          [.] __libc_read                                                                                
       1.00%  libc-2.27.so                [.] 0x000000000018ef76                                                                         
       0.99%  libc-2.27.so                [.] 0x000000000018ef79          

    Thanks
    Rishabh Mittal                         

    On 8/19/19, 7:42 AM, "Luse, Paul E" <paul.e.luse(a)intel.com> wrote:

        That's great.  Keep any eye out for the items Ben mentions below - at least the first one should be quick to implement and compare both profile data and measured performance.

        Don’t' forget about the community meetings either, great place to chat about these kinds of things.  https://nam01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fspdk.io%2Fcommunity%2F&amp;data=02%7C01%7Crimittal%40ebay.com%7Cd5c75891ea414963501c08d724b36248%7C46326bff992841a0baca17c16c94ea99%7C0%7C0%7C637018225183900855&amp;sdata=wEMi40AMPeGVt3XX3bHfneHqM0LFEB8Jt%2F9dQl6cIBE%3D&amp;reserved=0  Next one is tomorrow morn US time.

        Thx
        Paul

        -----Original Message-----
        From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Mittal, Rishabh via SPDK
        Sent: Thursday, August 15, 2019 6:50 PM
        To: Harris, James R <james.r.harris(a)intel.com>; Walker, Benjamin <benjamin.walker(a)intel.com>; spdk(a)lists.01.org
        Cc: Mittal, Rishabh <rimittal(a)ebay.com>; Chen, Xiaoxi <xiaoxchen(a)ebay.com>; Szmyd, Brian <bszmyd(a)ebay.com>; Kadayam, Hari <hkadayam(a)ebay.com>
        Subject: Re: [SPDK] NBD with SPDK

        Thanks. I will get the profiling by next week. 

        On 8/15/19, 6:26 PM, "Harris, James R" <james.r.harris(a)intel.com> wrote:

            On 8/15/19, 4:34 PM, "Mittal, Rishabh" <rimittal(a)ebay.com> wrote:

                Hi Jim

                What tool you use to take profiling. 

            Hi Rishabh,

            Mostly I just use "perf top".

            -Jim

                Thanks
                Rishabh Mittal

                On 8/14/19, 9:54 AM, "Harris, James R" <james.r.harris(a)intel.com> wrote:

                    On 8/14/19, 9:18 AM, "Walker, Benjamin" <benjamin.walker(a)intel.com> wrote:

                    <trim>

                        When an I/O is performed in the process initiating the I/O to a file, the data
                        goes into the OS page cache buffers at a layer far above the bio stack
                        (somewhere up in VFS). If SPDK were to reserve some memory and hand it off to
                        your kernel driver, your kernel driver would still need to copy it to that
                        location out of the page cache buffers. We can't safely share the page cache
                        buffers with a user space process.

                    I think Rishabh was suggesting the SPDK reserve the virtual address space only.
                    Then the kernel could map the page cache buffers into that virtual address space.
                    That would not require a data copy, but would require the mapping operations.

                    I think the profiling data would be really helpful - to quantify how much of the 50us
                    Is due to copying the 4KB of data.  That can help drive next steps on how to optimize
                    the SPDK NBD module.

                    Thanks,

                    -Jim

                        As Paul said, I'm skeptical that the memcpy is significant in the overall
                        performance you're measuring. I encourage you to go look at some profiling data
                        and confirm that the memcpy is really showing up. I suspect the overhead is
                        instead primarily in these spots:

                        1) Dynamic buffer allocation in the SPDK NBD backend.

                        As Paul indicated, the NBD target is dynamically allocating memory for each I/O.
                        The NBD backend wasn't designed to be fast - it was designed to be simple.
                        Pooling would be a lot faster and is something fairly easy to implement.

                        2) The way SPDK does the syscalls when it implements the NBD backend.

                        Again, the code was designed to be simple, not high performance. It simply calls
                        read() and write() on the socket for each command. There are much higher
                        performance ways of doing this, they're just more complex to implement.

                        3) The lack of multi-queue support in NBD

                        Every I/O is funneled through a single sockpair up to user space. That means
                        there is locking going on. I believe this is just a limitation of NBD today - it
                        doesn't plug into the block-mq stuff in the kernel and expose multiple
                        sockpairs. But someone more knowledgeable on the kernel stack would need to take
                        a look.

                        Thanks,
                        Ben

                        > 
                        > Couple of things that I am not really sure in this flow is :- 1. How memory
                        > registration is going to work with RDMA driver.
                        > 2. What changes are required in spdk memory management
                        > 
                        > Thanks
                        > Rishabh Mittal

        _______________________________________________
        SPDK mailing list
        SPDK(a)lists.01.org
        https://nam01.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.01.org%2Fmailman%2Flistinfo%2Fspdk&amp;data=02%7C01%7Crimittal%40ebay.com%7Cd5c75891ea414963501c08d724b36248%7C46326bff992841a0baca17c16c94ea99%7C0%7C0%7C637018225183900855&amp;sdata=9QDXP2O4MWvrQmKitBJONSkZZHXrRqfFXPrDqltPYjM%3D&amp;reserved=0

^ permalink raw reply	[flat|nested] 32+ messages in thread