From: Harris, James R
Subject: Re: [SPDK] NBD with SPDK
Date: Wed, 14 Aug 2019 16:54:36 +0000
Message-ID: <0A4299FB-9034-4530-A633-D4DB7EC1269A@intel.com>
In-Reply-To: 742b180c299ad6847e6efa33bad8c62e3dd2c7ac.camel@intel.com
To: spdk@lists.01.org

On 8/14/19, 9:18 AM, "Walker, Benjamin" wrote:

    When an I/O is performed in the process initiating the I/O to a file, the
    data goes into the OS page cache buffers at a layer far above the bio stack
    (somewhere up in VFS). If SPDK were to reserve some memory and hand it off
    to your kernel driver, your kernel driver would still need to copy it to
    that location out of the page cache buffers. We can't safely share the page
    cache buffers with a user space process.

I think Rishabh was suggesting that SPDK reserve the virtual address space
only. Then the kernel could map the page cache buffers into that virtual
address space. That would not require a data copy, but it would require the
mapping operations.

I think the profiling data would be really helpful - to quantify how much of
the 50us is due to copying the 4KB of data. That can help drive next steps on
how to optimize the SPDK NBD module.

Thanks,

-Jim

    As Paul said, I'm skeptical that the memcpy is significant in the overall
    performance you're measuring. I encourage you to go look at some profiling
    data and confirm that the memcpy is really showing up. I suspect the
    overhead is instead primarily in these spots:

    1) Dynamic buffer allocation in the SPDK NBD backend.

    As Paul indicated, the NBD target is dynamically allocating memory for
    each I/O. The NBD backend wasn't designed to be fast - it was designed to
    be simple. Pooling would be a lot faster and is something fairly easy to
    implement.

    2) The way SPDK does the syscalls when it implements the NBD backend.

    Again, the code was designed to be simple, not high performance. It simply
    calls read() and write() on the socket for each command. There are much
    higher performance ways of doing this, they're just more complex to
    implement.

    3) The lack of multi-queue support in NBD

    Every I/O is funneled through a single sockpair up to user space. That
    means there is locking going on. I believe this is just a limitation of
    NBD today - it doesn't plug into the block-mq stuff in the kernel and
    expose multiple sockpairs. But someone more knowledgeable on the kernel
    stack would need to take a look.

    Thanks,
    Ben

    >
    > A couple of things that I am not really sure about in this flow are:
    > 1. How memory registration is going to work with the RDMA driver.
    > 2. What changes are required in SPDK memory management.
    >
    > Thanks
    > Rishabh Mittal
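
A minimal sketch of what "reserving the virtual address space only" could look
like on the user-space side, using plain POSIX mmap(). The kernel side -
remapping page cache pages into that reserved range - would require new driver
support and is not shown; the reservation size and the names below are
illustrative only, not part of the actual NBD module.

#include <stdio.h>
#include <sys/mman.h>

/* Arbitrary 1 GiB reservation, for illustration only. */
#define NBD_VA_RESERVE_SIZE (1ULL << 30)

int main(void)
{
        /* PROT_NONE + MAP_NORESERVE reserves address space without committing
         * any physical pages or swap until something is mapped into it. */
        void *base = mmap(NULL, NBD_VA_RESERVE_SIZE, PROT_NONE,
                          MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
        if (base == MAP_FAILED) {
                perror("mmap");
                return 1;
        }

        printf("reserved %llu bytes of address space at %p\n",
               (unsigned long long)NBD_VA_RESERVE_SIZE, base);

        /* This is where the range would be handed to the kernel driver. */
        munmap(base, NBD_VA_RESERVE_SIZE);
        return 0;
}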
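
On point 1), a rough sketch of what pre-allocated, pooled I/O buffers could
look like using SPDK's mempool API, assuming the SPDK environment has already
been initialized. The pool name, element count, and 64 KiB buffer size are
made-up values for illustration, not the NBD module's actual parameters.

#include "spdk/env.h"

#define NBD_IO_BUF_SIZE   (64 * 1024)
#define NBD_IO_BUF_COUNT  256

static struct spdk_mempool *g_nbd_buf_pool;

/* Create the pool once at start-up instead of allocating per I/O. */
static int nbd_buf_pool_init(void)
{
        g_nbd_buf_pool = spdk_mempool_create("nbd_io_bufs", NBD_IO_BUF_COUNT,
                                             NBD_IO_BUF_SIZE,
                                             SPDK_MEMPOOL_DEFAULT_CACHE_SIZE,
                                             SPDK_ENV_SOCKET_ID_ANY);
        return g_nbd_buf_pool != NULL ? 0 : -1;
}

/* Hot path: take a buffer from the pool for each command ... */
static void *nbd_buf_get(void)
{
        return spdk_mempool_get(g_nbd_buf_pool);
}

/* ... and return it on completion instead of freeing it. */
static void nbd_buf_put(void *buf)
{
        spdk_mempool_put(g_nbd_buf_pool, buf);
}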
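
On point 2), one of the "higher performance ways" could be to drain several
requests from the socket in a single syscall instead of issuing one read() per
command. The sketch below is hypothetical - handle_request() is a placeholder,
and it ignores partial headers and the data payload that follows each WRITE
request - but it shows the batching idea.

#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/nbd.h>

#define RX_BUF_SIZE (64 * 1024)

/* Placeholder: a real implementation would dispatch to the bdev layer. */
static void handle_request(const struct nbd_request *req)
{
        (void)req;
}

/* Pull as many request headers as possible out of one recv() call.
 * Returns the number of whole headers consumed. */
static int nbd_drain_requests(int sock)
{
        static uint8_t buf[RX_BUF_SIZE];
        ssize_t n = recv(sock, buf, sizeof(buf), MSG_DONTWAIT);
        int count = 0;
        size_t off;

        if (n <= 0)
                return 0;

        /* A real implementation must also carry leftover bytes over to the
         * next call and consume the payload after each WRITE header. */
        for (off = 0; off + sizeof(struct nbd_request) <= (size_t)n;
             off += sizeof(struct nbd_request)) {
                struct nbd_request req;

                memcpy(&req, buf + off, sizeof(req));
                handle_request(&req);
                count++;
        }
        return count;
}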