From: Luse, Paul E <paul.e.luse at intel.com>
To: spdk@lists.01.org
Subject: Re: [SPDK] NBD with SPDK
Date: Wed, 04 Sep 2019 23:27:00 +0000
Message-ID: <82C9F782B054C94B9FC04A331649C77ABFEEC0C1@FMSMSX125.amr.corp.intel.com>
In-Reply-To: 5E8CA222-30BC-412E-ABB5-25BF4DA55F73@ebay.com


Cool, thanks for sending this.  I will try to repro here tomorrow and see what kind of results I get.

Thx
Paul

-----Original Message-----
From: Mittal, Rishabh [mailto:rimittal(a)ebay.com] 
Sent: Wednesday, September 4, 2019 4:23 PM
To: Luse, Paul E <paul.e.luse(a)intel.com>; Walker, Benjamin <benjamin.walker(a)intel.com>; Harris, James R <james.r.harris(a)intel.com>; spdk(a)lists.01.org
Cc: Chen, Xiaoxi <xiaoxchen(a)ebay.com>; Kadayam, Hari <hkadayam(a)ebay.com>; Szmyd, Brian <bszmyd(a)ebay.com>
Subject: Re: [SPDK] NBD with SPDK

Avg CPU utilization is very low when I am running this.

09/04/2019 04:21:40 PM
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           2.59    0.00    2.57    0.00    0.00   94.84

Device            r/s     w/s     rkB/s     wkB/s   rrqm/s   wrqm/s  %rrqm  %wrqm r_await w_await aqu-sz rareq-sz wareq-sz  svctm  %util
sda              0.00    0.20      0.00      0.80     0.00     0.00   0.00   0.00    0.00    0.00   0.00     0.00     4.00   0.00   0.00
sdb              0.00    0.00      0.00      0.00     0.00     0.00   0.00   0.00    0.00    0.00   0.00     0.00     0.00   0.00   0.00
sdc              0.00 28846.80      0.00 191555.20     0.00 18211.00   0.00  38.70    0.00    1.03  29.64     0.00     6.64   0.03 100.00
nb0              0.00 47297.00      0.00 191562.40     0.00   593.60   0.00   1.24    0.00    1.32  61.83     0.00     4.05   0



On 9/4/19, 4:19 PM, "Mittal, Rishabh" <rimittal(a)ebay.com> wrote:

    I am using this command
    
    fio --name=randwrite --ioengine=libaio --iodepth=8 --rw=write --rwmixread=0  --bsrange=4k-4k --direct=1 -filename=/dev/nbd0 --numjobs=8 --runtime 120 --time_based --group_reporting
    
    I have created the device by using these commands
    	1.  /root/spdk/app/vhost
    	2.  ./rpc.py bdev_aio_create /dev/sdc aio0
    	3.  ./rpc.py start_nbd_disk aio0 /dev/nbd0
    
    I am using "perf top" to get the performance data.
    
    On 9/4/19, 4:03 PM, "Luse, Paul E" <paul.e.luse(a)intel.com> wrote:
    
        Hi Rishabh,
        
        Maybe it would help (me at least) if you described the complete & exact steps for your test - both the setup of the env & the test, and the command used to profile.  Can you send that out?
        
        Thx
        Paul
        
        -----Original Message-----
        From: Mittal, Rishabh [mailto:rimittal(a)ebay.com] 
        Sent: Wednesday, September 4, 2019 2:45 PM
        To: Walker, Benjamin <benjamin.walker(a)intel.com>; Harris, James R <james.r.harris(a)intel.com>; spdk(a)lists.01.org; Luse, Paul E <paul.e.luse(a)intel.com>
        Cc: Chen, Xiaoxi <xiaoxchen(a)ebay.com>; Kadayam, Hari <hkadayam(a)ebay.com>; Szmyd, Brian <bszmyd(a)ebay.com>
        Subject: Re: [SPDK] NBD with SPDK
        
        Yes, I am using a queue depth of 64 with one thread in fio. I am using AIO. This profiling is for the entire system. I don't know why the SPDK threads are idle.
        
        On 9/4/19, 11:08 AM, "Walker, Benjamin" <benjamin.walker(a)intel.com> wrote:
        
            On Fri, 2019-08-30 at 22:28 +0000, Mittal, Rishabh wrote:
            > I got the run again. It is with 4k write.
            > 
            >   13.16%  vhost                       [.] spdk_ring_dequeue
            >    6.08%  vhost                       [.] rte_rdtsc
            >    4.77%  vhost                       [.] spdk_thread_poll
            >    2.85%  vhost                       [.] _spdk_reactor_run
            > 
            
            You're doing high queue depth for at least 30 seconds while the trace runs,
            right? Using fio with the libaio engine on the NBD device is probably the way to
            go. Are you limiting the profiling to just the core where the main SPDK process
            is pinned? I'm asking because SPDK still appears to be mostly idle, and I
            suspect the time is being spent in some other thread (in the kernel). Consider
            capturing a profile for the entire system. It will have fio stuff in it, but the
            expensive stuff still should generally bubble up to the top.
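            
            For example, a minimal sketch of both approaches (core 0 is just an assumed
            pinning here - substitute whichever core the SPDK app is actually running on):
            
                # system-wide profile with call graphs, captured while the fio run is active
                perf record -a -g -- sleep 30
                perf report
            
                # or a live view limited to the reactor core
                perf top -C 0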
            
            Thanks,
            Ben
            
            
            > 
            > On 8/29/19, 6:05 PM, "Mittal, Rishabh" <rimittal(a)ebay.com> wrote:
            > 
            >     I got the profile with first run. 
            >     
            >      27.91%  vhost                       [.] spdk_ring_dequeue
            >      12.94%  vhost                       [.] rte_rdtsc
            >      11.00%  vhost                       [.] spdk_thread_poll
            >       6.15%  vhost                       [.] _spdk_reactor_run
            >       4.35%  [kernel]                    [k] syscall_return_via_sysret
            >       3.91%  vhost                       [.] _spdk_msg_queue_run_batch
            >       3.38%  vhost                       [.] _spdk_event_queue_run_batch
            >       2.83%  [unknown]                   [k] 0xfffffe000000601b
            >       1.45%  vhost                       [.] spdk_thread_get_from_ctx
            >       1.20%  [kernel]                    [k] __fget
            >       1.14%  libpthread-2.27.so          [.] __libc_read
            >       1.00%  libc-2.27.so                [.] 0x000000000018ef76
            >       0.99%  libc-2.27.so                [.] 0x000000000018ef79
            >     
            >     Thanks
            >     Rishabh Mittal                         
            >     
            >     On 8/19/19, 7:42 AM, "Luse, Paul E" <paul.e.luse(a)intel.com> wrote:
            >     
            >         That's great.  Keep an eye out for the items Ben mentions below - at
            > least the first one should be quick to implement and compare both profile data
            > and measured performance.
            >         
            >         Don't forget about the community meetings either, great place to chat
            > about these kinds of things.  
            > https://spdk.io/community/  Next one is tomorrow morn US time.
            >         
            >         Thx
            >         Paul
            >         
            >         -----Original Message-----
            >         From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Mittal,
            > Rishabh via SPDK
            >         Sent: Thursday, August 15, 2019 6:50 PM
            >         To: Harris, James R <james.r.harris(a)intel.com>; Walker, Benjamin <
            > benjamin.walker(a)intel.com>; spdk(a)lists.01.org
            >         Cc: Mittal, Rishabh <rimittal(a)ebay.com>; Chen, Xiaoxi <
            > xiaoxchen(a)ebay.com>; Szmyd, Brian <bszmyd(a)ebay.com>; Kadayam, Hari <
            > hkadayam(a)ebay.com>
            >         Subject: Re: [SPDK] NBD with SPDK
            >         
            >         Thanks. I will get the profiling data by next week.
            >         
            >         On 8/15/19, 6:26 PM, "Harris, James R" <james.r.harris(a)intel.com>
            > wrote:
            >         
            >             
            >             
            >             On 8/15/19, 4:34 PM, "Mittal, Rishabh" <rimittal(a)ebay.com> wrote:
            >             
            >                 Hi Jim
            >                 
            >                 What tool do you use for profiling?
            >             
            >             Hi Rishabh,
            >             
            >             Mostly I just use "perf top".
            >             
            >             -Jim
            >             
            >                 
            >                 Thanks
            >                 Rishabh Mittal
            >                 
            >                 On 8/14/19, 9:54 AM, "Harris, James R" <
            > james.r.harris(a)intel.com> wrote:
            >                 
            >                     
            >                     
            >                     On 8/14/19, 9:18 AM, "Walker, Benjamin" <
            > benjamin.walker(a)intel.com> wrote:
            >                     
            >                     <trim>
            >                         
            >                         When an I/O is performed in the process initiating the
            > I/O to a file, the data
            >                         goes into the OS page cache buffers at a layer far
            > above the bio stack
            >                         (somewhere up in VFS). If SPDK were to reserve some
            > memory and hand it off to
            >                         your kernel driver, your kernel driver would still
            > need to copy it to that
            >                         location out of the page cache buffers. We can't
            > safely share the page cache
            >                         buffers with a user space process.
            >                        
            >                     I think Rishabh was suggesting the SPDK reserve the
            > virtual address space only.
            >                     Then the kernel could map the page cache buffers into that
            > virtual address space.
            >                     That would not require a data copy, but would require the
            > mapping operations.
            >                     
            >                     I think the profiling data would be really helpful - to
            >                     quantify how much of the 50us is due to copying the 4KB of
            >                     data.  That can help drive next steps on how to optimize the
            >                     SPDK NBD module.
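            >                     
            >                     (Rough sanity check, assuming ~10 GB/s of effective memcpy
            >                     bandwidth: copying 4KB takes well under a microsecond, so the
            >                     copy by itself should be a small fraction of the 50us.)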
            >                     
            >                     Thanks,
            >                     
            >                     -Jim
            >                     
            >                     
            >                         As Paul said, I'm skeptical that the memcpy is
            > significant in the overall
            >                         performance you're measuring. I encourage you to go
            > look at some profiling data
            >                         and confirm that the memcpy is really showing up. I
            > suspect the overhead is
            >                         instead primarily in these spots:
            >                         
            >                         1) Dynamic buffer allocation in the SPDK NBD backend.
            >                         
            >                         As Paul indicated, the NBD target is dynamically
            > allocating memory for each I/O.
            >                         The NBD backend wasn't designed to be fast - it was
            > designed to be simple.
            >                         Pooling would be a lot faster and is something fairly
            > easy to implement.
            >                         
            >                         2) The way SPDK does the syscalls when it implements
            > the NBD backend.
            >                         
            >                         Again, the code was designed to be simple, not high
            > performance. It simply calls
            >                         read() and write() on the socket for each command.
            > There are much higher
            >                         performance ways of doing this, they're just more
            > complex to implement.
            >                         
            >                         3) The lack of multi-queue support in NBD
            >                         
            >                         Every I/O is funneled through a single sockpair up to
            > user space. That means
            >                         there is locking going on. I believe this is just a
            > limitation of NBD today - it
            >                         doesn't plug into the block-mq stuff in the kernel and
            > expose multiple
            >                         sockpairs. But someone more knowledgeable on the
            > kernel stack would need to take
            >                         a look.
            >                         
            >                         Thanks,
            >                         Ben
            >                         
            >                         > 
            >                         > A couple of things that I am not really sure about in
            >                         > this flow: 1. How memory registration is going to
            >                         > work with the RDMA driver. 2. What changes are
            >                         > required in SPDK memory management.
            >                         > 
            >                         > Thanks
            >                         > Rishabh Mittal
            >                         
            >                     
            >                     
            >                 
            >                 
            >             
            >             
            >         
            >         _______________________________________________
            >         SPDK mailing list
            >         SPDK(a)lists.01.org
            >         
            > https://lists.01.org/mailman/listinfo/spdk
            >         
            >     
            >     
            > 
            
            
        
        
    
    


