From: Mittal, Rishabh
Subject: Re: [SPDK] NBD with SPDK
Date: Fri, 30 Aug 2019 01:05:34 +0000
To: spdk@lists.01.org

I got the profile with the first run.

    27.91%  vhost               [.] spdk_ring_dequeue
    12.94%  vhost               [.] rte_rdtsc
    11.00%  vhost               [.] spdk_thread_poll
     6.15%  vhost               [.] _spdk_reactor_run
     4.35%  [kernel]            [k] syscall_return_via_sysret
     3.91%  vhost               [.] _spdk_msg_queue_run_batch
     3.38%  vhost               [.] _spdk_event_queue_run_batch
     2.83%  [unknown]           [k] 0xfffffe000000601b
     1.45%  vhost               [.] spdk_thread_get_from_ctx
     1.20%  [kernel]            [k] __fget
     1.14%  libpthread-2.27.so  [.] __libc_read
     1.00%  libc-2.27.so        [.] 0x000000000018ef76
     0.99%  libc-2.27.so        [.] 0x000000000018ef79

Thanks
Rishabh Mittal

On 8/19/19, 7:42 AM, "Luse, Paul E" wrote:

    That's great. Keep an eye out for the items Ben mentions below - at least
    the first one should be quick to implement, and you can compare both the
    profile data and the measured performance.

    Don't forget about the community meetings either - a great place to chat
    about these kinds of things: https://spdk.io/community/  The next one is
    tomorrow morning US time.

    Thx
    Paul

    -----Original Message-----
    From: SPDK [mailto:spdk-bounces(a)lists.01.org] On Behalf Of Mittal, Rishabh via SPDK
    Sent: Thursday, August 15, 2019 6:50 PM
    To: Harris, James R; Walker, Benjamin; spdk(a)lists.01.org
    Cc: Mittal, Rishabh; Chen, Xiaoxi; Szmyd, Brian; Kadayam, Hari
    Subject: Re: [SPDK] NBD with SPDK

    Thanks. I will get the profiling done by next week.

    On 8/15/19, 6:26 PM, "Harris, James R" wrote:

        On 8/15/19, 4:34 PM, "Mittal, Rishabh" wrote:

            Hi Jim

            What tool do you use for profiling?

        Hi Rishabh,

        Mostly I just use "perf top".

        -Jim

            Thanks
            Rishabh Mittal

            On 8/14/19, 9:54 AM, "Harris, James R" wrote:

                On 8/14/19, 9:18 AM, "Walker, Benjamin" wrote:

                    When an I/O is performed in the process initiating the I/O
                    to a file, the data goes into the OS page cache buffers at
                    a layer far above the bio stack (somewhere up in VFS). If
                    SPDK were to reserve some memory and hand it off to your
                    kernel driver, your kernel driver would still need to copy
                    it to that location out of the page cache buffers. We can't
                    safely share the page cache buffers with a user space
                    process.

                I think Rishabh was suggesting that SPDK reserve the virtual
                address space only. Then the kernel could map the page cache
                buffers into that virtual address space. That would not require
                a data copy, but would require the mapping operations.

                I think the profiling data would be really helpful - to
                quantify how much of the 50us is due to copying the 4KB of
                data. That can help drive next steps on how to optimize the
                SPDK NBD module.

                Thanks,

                -Jim
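[To put a rough bound on the 4KB copy cost Jim wants quantified, a standalone
microbenchmark is enough. The sketch below is plain C, not SPDK or NBD code;
the buffer size and iteration count are arbitrary choices, and it measures
the hot-cache case:]

    /* memcpy_bench.c - estimate the cost of one 4KB copy (illustrative).
     * Build: gcc -O2 memcpy_bench.c -o memcpy_bench */
    #include <stdio.h>
    #include <string.h>
    #include <time.h>

    #define BUF_SIZE   4096
    #define ITERATIONS 1000000L

    int main(void)
    {
        static char src[BUF_SIZE], dst[BUF_SIZE];
        struct timespec start, end;

        /* Touch both buffers so page faults don't pollute the measurement. */
        memset(src, 0xab, BUF_SIZE);
        memset(dst, 0x00, BUF_SIZE);

        clock_gettime(CLOCK_MONOTONIC, &start);
        for (long i = 0; i < ITERATIONS; i++) {
            memcpy(dst, src, BUF_SIZE);
            /* Keep the optimizer from eliding the unused copies. */
            __asm__ volatile("" ::: "memory");
        }
        clock_gettime(CLOCK_MONOTONIC, &end);

        double ns = (end.tv_sec - start.tv_sec) * 1e9
                  + (end.tv_nsec - start.tv_nsec);
        printf("avg 4KB memcpy: %.1f ns\n", ns / ITERATIONS);
        return 0;
    }

[On typical recent x86 servers a hot 4KB copy measures in the low hundreds of
nanoseconds, which is one way to sanity-check how much of the 50us it could
plausibly explain.]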
                    As Paul said, I'm skeptical that the memcpy is significant
                    in the overall performance you're measuring. I encourage
                    you to go look at some profiling data and confirm that the
                    memcpy is really showing up. I suspect the overhead is
                    instead primarily in these spots:

                    1) Dynamic buffer allocation in the SPDK NBD backend.

                    As Paul indicated, the NBD target is dynamically allocating
                    memory for each I/O. The NBD backend wasn't designed to be
                    fast - it was designed to be simple. Pooling would be a lot
                    faster and is something fairly easy to implement. [A sketch
                    of this pooling approach follows the thread.]

                    2) The way SPDK does the syscalls when it implements the
                    NBD backend.

                    Again, the code was designed to be simple, not high
                    performance. It simply calls read() and write() on the
                    socket for each command. There are much higher performance
                    ways of doing this, they're just more complex to implement.

                    3) The lack of multi-queue support in NBD.

                    Every I/O is funneled through a single sockpair up to user
                    space. That means there is locking going on. I believe this
                    is just a limitation of NBD today - it doesn't plug into
                    the block-mq stuff in the kernel and expose multiple
                    sockpairs. But someone more knowledgeable on the kernel
                    stack would need to take a look.

                    Thanks,
                    Ben

                    > Couple of things that I am not really sure about in this
                    > flow:
                    > 1. How memory registration is going to work with the RDMA
                    >    driver.
                    > 2. What changes are required in SPDK memory management.
                    >
                    > Thanks
                    > Rishabh Mittal
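[For Ben's point (1), a minimal sketch of what pooling could look like using
SPDK's env-layer mempool API. The pool name, sizing, and helper functions are
illustrative assumptions, not the actual lib/nbd code:]

    /* Pre-allocate I/O buffers once instead of malloc()/free() per command.
     * Names and sizing below are illustrative, not from lib/nbd. */
    #include "spdk/env.h"

    #define NBD_IO_POOL_SIZE 512            /* max outstanding I/Os to cover */
    #define NBD_MAX_IO_SIZE  (128 * 1024)   /* largest buffer handed out */

    static struct spdk_mempool *g_nbd_buf_pool;

    static int
    nbd_buf_pool_init(void)
    {
        g_nbd_buf_pool = spdk_mempool_create("nbd_io_bufs",
                                             NBD_IO_POOL_SIZE,
                                             NBD_MAX_IO_SIZE,
                                             SPDK_MEMPOOL_DEFAULT_CACHE_SIZE,
                                             SPDK_ENV_SOCKET_ID_ANY);
        return g_nbd_buf_pool != NULL ? 0 : -1;
    }

    /* On request arrival: take a pre-allocated buffer. */
    static void *
    nbd_buf_get(void)
    {
        return spdk_mempool_get(g_nbd_buf_pool);
    }

    /* On completion: return the buffer for reuse. */
    static void
    nbd_buf_put(void *buf)
    {
        spdk_mempool_put(g_nbd_buf_pool, buf);
    }

[spdk_mempool_create() does one up-front allocation and keeps a per-core
object cache, so the per-I/O get/put path avoids both the allocator and
cross-core contention - the trade Ben describes as "fairly easy to
implement".]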