linux-block.vger.kernel.org archive mirror
* [LSF/MM TOPIC] NVMe Performance: Userspace vs Kernel
@ 2019-02-15 21:19 Felipe Franciosi
  2019-02-15 21:41 ` Bart Van Assche
  2019-02-15 21:47 ` Keith Busch
  0 siblings, 2 replies; 6+ messages in thread
From: Felipe Franciosi @ 2019-02-15 21:19 UTC (permalink / raw)
  To: lsf-pc; +Cc: linux-block, linux-nvme

Hi All,

I'd like to attend LSF/MM this year and discuss kernel performance when accessing NVMe devices, specifically (but not limited to) Intel Optane Memory (which boasts very low latency and high IOPS/throughput per NVMe controller).

Over the last year or two, I have done extensive experimentation comparing applications using libaio to those using SPDK. For hypervisors, where storage devices can be exclusively accessed with userspace drivers (given the device can be dedicated to a single process), using SPDK has proven to be significantly faster and more efficient. That remains true even in the latest versions of the kernel.
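
For context, a minimal sketch of the libaio submission path used on the kernel side of these comparisons (illustrative only: the device path, queue depth and block size are placeholders, and error handling is omitted):

/* build: gcc -o aio_read aio_read.c -laio */
#define _GNU_SOURCE
#include <libaio.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    io_context_t ctx;
    struct iocb cb, *cbs[1] = { &cb };
    struct io_event ev;
    void *buf;
    int fd;

    fd = open("/dev/nvme0n1", O_RDONLY | O_DIRECT); /* placeholder device */
    posix_memalign(&buf, 4096, 4096);               /* O_DIRECT wants aligned buffers */

    memset(&ctx, 0, sizeof(ctx));
    io_setup(128, &ctx);                            /* arbitrary queue depth */

    io_prep_pread(&cb, fd, buf, 4096, 0);           /* one 4K read at offset 0 */
    io_submit(ctx, 1, cbs);
    io_getevents(ctx, 1, 1, &ev, NULL);             /* reap the completion */

    io_destroy(ctx);
    close(fd);
    return 0;
}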

I have presented work focusing on hypervisors at several conferences during this time. Although I appreciate that LSF/MM is more discussion-oriented, I am linking a couple of these presentations for reference:

Flash Memory Summit 2018
https://www.flashmemorysummit.com/English/Collaterals/Proceedings/2018/20180808_SOFT-202-1_Franciosi.pdf

Linux Piter 2018
https://linuxpiter.com/system/attachments/files/000/001/558/original/20181103_-_AHV_and_SPDK.pdf

For LSF/MM, instead of focusing on hypervisors, I would like to discuss what can be done to achieve better efficiency and performance when using the kernel. My data include detailed results covering various scenarios such as different NUMA configurations, IRQ affinity settings, and polling modes.
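
As a concrete illustration of one of the polling modes mentioned above, here is a hedged sketch of synchronous completion polling via preadv2() with RWF_HIPRI; the device path is a placeholder and it assumes the underlying queue has I/O polling enabled:

#define _GNU_SOURCE
#include <fcntl.h>
#include <stdlib.h>
#include <sys/uio.h>
#include <unistd.h>

int main(void)
{
    void *buf;
    struct iovec iov;
    ssize_t ret;
    int fd;

    fd = open("/dev/nvme0n1", O_RDONLY | O_DIRECT); /* placeholder device */
    posix_memalign(&buf, 4096, 4096);
    iov.iov_base = buf;
    iov.iov_len = 4096;

    /* RWF_HIPRI asks the block layer to busy-poll for the completion
     * instead of sleeping on an interrupt; it only has an effect on
     * queues that were set up for polling. */
    ret = preadv2(fd, &iov, 1, 0, RWF_HIPRI);

    close(fd);
    return ret == 4096 ? 0 : 1;
}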

Thanks,
Felipe

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [LSF/MM TOPIC] NVMe Performance: Userspace vs Kernel
  2019-02-15 21:19 [LSF/MM TOPIC] NVMe Performance: Userspace vs Kernel Felipe Franciosi
@ 2019-02-15 21:41 ` Bart Van Assche
       [not found]   ` <11A6C7D0-A26D-410F-8EE3-9AF524DF2050@nutanix.com>
  2019-02-15 21:47 ` Keith Busch
  1 sibling, 1 reply; 6+ messages in thread
From: Bart Van Assche @ 2019-02-15 21:41 UTC (permalink / raw)
  To: Felipe Franciosi, lsf-pc; +Cc: linux-block, linux-nvme

On Fri, 2019-02-15 at 21:19 +0000, Felipe Franciosi wrote:
> Hi All,
> 
> I'd like to attend LSF/MM this year and discuss kernel performance when accessing NVMe devices, specifically (but not limited to) Intel Optane Memory (which boasts very low latency and high
> IOPS/throughput per NVMe controller).
> 
> Over the last year or two, I have done extensive experimentation comparing applications using libaio to those using SPDK. For hypervisors, where storage devices can be exclusively accessed with
> userspace drivers (given the device can be dedicated to a single process), using SPDK has proven to be significantly faster and more efficient. That remains true even in the latest versions of the
> kernel.
> 
> I have presented work focusing on hypervisors at several conferences during this time. Although I appreciate that LSF/MM is more discussion-oriented, I am linking a couple of these presentations
> for reference:
> 
> Flash Memory Summit 2018
> https://www.flashmemorysummit.com/English/Collaterals/Proceedings/2018/20180808_SOFT-202-1_Franciosi.pdf
> 
> Linux Piter 2018
> https://linuxpiter.com/system/attachments/files/000/001/558/original/20181103_-_AHV_and_SPDK.pdf
> 
> For LSF/MM, instead of focusing on hypervisors, I would like to discuss what can be done to achieve better efficiency and performance when using the kernel. My data include detailed results
> covering various scenarios such as different NUMA configurations, IRQ affinity settings, and polling modes.

Hi Felipe,

It seems like you missed the performance comparison between SPDK and io_uring
that Jens posted recently.

Bart.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [LSF/MM TOPIC] NVMe Performance: Userspace vs Kernel
  2019-02-15 21:19 [LSF/MM TOPIC] NVMe Performance: Userspace vs Kernel Felipe Franciosi
  2019-02-15 21:41 ` Bart Van Assche
@ 2019-02-15 21:47 ` Keith Busch
  2019-02-15 22:14   ` Felipe Franciosi
  1 sibling, 1 reply; 6+ messages in thread
From: Keith Busch @ 2019-02-15 21:47 UTC (permalink / raw)
  To: Felipe Franciosi; +Cc: lsf-pc, linux-block, linux-nvme

On Fri, Feb 15, 2019 at 09:19:02PM +0000, Felipe Franciosi wrote:
> Over the last year or two, I have done extensive experimentation comparing applications using libaio to those using SPDK.

Try the io_uring interface instead. It's queued up in the linux-block
for-next tree.
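
For what it's worth, a minimal sketch of a single read through that interface using the liburing helpers (a sketch only; the device path is a placeholder and error handling is omitted):

/* build: gcc -o uring_read uring_read.c -luring */
#define _GNU_SOURCE
#include <fcntl.h>
#include <liburing.h>
#include <stdlib.h>
#include <sys/uio.h>
#include <unistd.h>

int main(void)
{
    struct io_uring ring;
    struct io_uring_sqe *sqe;
    struct io_uring_cqe *cqe;
    struct iovec iov;
    void *buf;
    int fd;

    io_uring_queue_init(32, &ring, 0);              /* 32-entry rings */

    fd = open("/dev/nvme0n1", O_RDONLY | O_DIRECT); /* placeholder device */
    posix_memalign(&buf, 4096, 4096);
    iov.iov_base = buf;
    iov.iov_len = 4096;

    sqe = io_uring_get_sqe(&ring);
    io_uring_prep_readv(sqe, fd, &iov, 1, 0);       /* one 4K read at offset 0 */
    io_uring_submit(&ring);

    io_uring_wait_cqe(&ring, &cqe);                 /* block until it completes */
    io_uring_cqe_seen(&ring, cqe);

    io_uring_queue_exit(&ring);
    close(fd);
    return 0;
}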

> For hypervisors, where storage devices can be exclusively accessed with userspace drivers (given the device can be dedicated to a single process), using SPDK has proven to be significantly faster and more efficient.

It doesn't work so well for file-based or multi-device backing
storage. But if you are sequestering an entire controller over to a VM,
direct-assign/device-passthrough is usually also an option, and that
ought to be even faster.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [LSF/MM TOPIC] NVMe Performance: Userspace vs Kernel
  2019-02-15 21:47 ` Keith Busch
@ 2019-02-15 22:14   ` Felipe Franciosi
  0 siblings, 0 replies; 6+ messages in thread
From: Felipe Franciosi @ 2019-02-15 22:14 UTC (permalink / raw)
  To: Keith Busch; +Cc: lsf-pc, linux-block, linux-nvme

Hi Keith,

> On Feb 15, 2019, at 9:47 PM, Keith Busch <keith.busch@intel.com> wrote:
> 
> On Fri, Feb 15, 2019 at 09:19:02PM +0000, Felipe Franciosi wrote:
>> Over the last year or two, I have done extensive experimentation comparing applications using libaio to those using SPDK.
> 
> Try the io_uring interface instead. It's queued up in the linux-block
> for-next tree.

I just read about that based on the other response from Bart. Thanks for pointing it out.

> 
>> For hypervisors, where storage devices can be exclusively accessed with userspace drivers (given the device can be dedicated to a single process), using SPDK has proven to be significantly faster and more efficient.
> 
> It doesn't work so well for file-based or multi-device backing
> storage. But if you are sequestering an entire controller over to a VM,
> direct-assign/device-passthrough is usually also an option, and that
> ought to be even faster.

The advantage is to dedicate a controller to "the hypervisor" (i.e. one userspace process responsible for mediating access between multiple VMs). Some VMs may choose to use userspace drivers, too. Others can use traditional kernel datapaths. The average overhead we have measured from virtual machines in this setup is negligible.

I did not experience problems with multiple devices, but careful thought is certainly required around the data format.

Cheers,
Felipe

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [LSF/MM TOPIC] NVMe Performance: Userspace vs Kernel
       [not found]   ` <11A6C7D0-A26D-410F-8EE3-9AF524DF2050@nutanix.com>
@ 2019-02-16  1:01     ` Bart Van Assche
  2019-02-16  1:54     ` Jens Axboe
  1 sibling, 0 replies; 6+ messages in thread
From: Bart Van Assche @ 2019-02-16  1:01 UTC (permalink / raw)
  To: Felipe Franciosi; +Cc: lsf-pc, linux-block, linux-nvme, Jens Axboe

On Sat, 2019-02-16 at 00:53 +0000, Felipe Franciosi wrote:
> On Feb 15, 2019, at 9:41 PM, Bart Van Assche <bvanassche@acm.org> wrote:
> > On Fri, 2019-02-15 at 21:19 +0000, Felipe Franciosi wrote:
> > > Hi All,
> > > 
> > > I'd like to attend LSF/MM this year and discuss kernel performance when accessing NVMe devices, specifically (but not limited to) Intel Optane Memory (which boasts very low latency and high
> > > IOPS/throughput per NVMe controller).
> > > 
> > > Over the last year or two, I have done extensive experimentation comparing applications using libaio to those using SPDK. For hypervisors, where storage devices can be exclusively accessed with
> > > userspace drivers (given the device can be dedicated to a single process), using SPDK has proven to be significantly faster and more efficient. That remains true even in the latest versions of
> > > the kernel.
> > > 
> > > I have presented work focusing on hypervisors at several conferences during this time. Although I appreciate that LSF/MM is more discussion-oriented, I am linking a couple of these presentations
> > > for reference:
> > > 
> > > Flash Memory Summit 2018
> > > https://www.flashmemorysummit.com/English/Collaterals/Proceedings/2018/20180808_SOFT-202-1_Franciosi.pdf
> > > 
> > > Linux Piter 2018
> > > https://linuxpiter.com/system/attachments/files/000/001/558/original/20181103_-_AHV_and_SPDK.pdf
> > > 
> > > For LSF/MM, instead of focusing on hypervisors, I would like to discuss what can be done to achieve better efficiency and performance when using the kernel. My data include detailed results
> > > covering various scenarios such as different NUMA configurations, IRQ affinity settings, and polling modes.
> >  
> > Hi Felipe,
> > 
> > It seems like you missed the performance comparison between SPDK and io_uring
> > that Jens posted recently.
> 
> I configured 5.0-rc6 and had a look at the io_uring code. Finally worked out how to use FIO's t/io_uring to submit IO and poll completions without system calls. My _initial_ numbers still show SPDK
> being faster and more efficient.
> 
> Searching the lists, I found a few mentions that Jens published a comparison stating otherwise, but I can't find it. Could you please give me some pointers?

Hi Felipe,

This is probably what you are looking for:

https://lore.kernel.org/linux-block/20190116175003.17880-1-axboe@kernel.dk/

Bart.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [LSF/MM TOPIC] NVMe Performance: Userspace vs Kernel
       [not found]   ` <11A6C7D0-A26D-410F-8EE3-9AF524DF2050@nutanix.com>
  2019-02-16  1:01     ` Bart Van Assche
@ 2019-02-16  1:54     ` Jens Axboe
  1 sibling, 0 replies; 6+ messages in thread
From: Jens Axboe @ 2019-02-16  1:54 UTC (permalink / raw)
  To: Felipe Franciosi; +Cc: Bart Van Assche, lsf-pc, linux-block, linux-nvme

(resending, as my phone email client turned this into html crap)

> Hi Bart,
>
> I configured 5.0-rc6 and had a look at the io_uring code. Finally worked out how to use FIO's t/io_uring to submit IO and poll completions without system calls. My _initial_ numbers still show SPDK being faster and more efficient.
>
> Searching the lists, I found a few mentions that Jens published a comparison stating otherwise, but I can't find it. Could you please give me some pointers?

I posted some numbers with v5:

https://lore.kernel.org/linux-block/20190116175003.17880-1-axboe@kernel.dk/

Sounds like you are using sqpoll; you probably don't want to do that for
peak performance. And you will probably want to experiment a bit to reach
the best performance, both in terms of setup/placement and in terms of
kernel configuration and runtime options (like iostats), etc.
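
To illustrate the distinction (a sketch, assuming only the flag names from the io_uring series): IORING_SETUP_IOPOLL enables completion-side device polling, while IORING_SETUP_SQPOLL adds a kernel-side submission thread that trades a busy core for fewer system calls.

#include <liburing.h>
#include <stdio.h>

int main(void)
{
    struct io_uring ring;
    int ret;

    /* Completion-side polling only: the application busy-polls the device
     * for completions (requires O_DIRECT I/O on a queue set up for
     * polling). No IORING_SETUP_SQPOLL, so no kernel submission thread. */
    ret = io_uring_queue_init(64, &ring, IORING_SETUP_IOPOLL);
    if (ret < 0) {
        fprintf(stderr, "io_uring_queue_init: %d\n", ret);
        return 1;
    }

    /* submit/reap loop as usual; with IOPOLL, waiting for a CQE polls the
     * device rather than sleeping on an interrupt. */

    io_uring_queue_exit(&ring);
    return 0;
}

For peak per-core numbers, the IOPOLL-only setup is typically the one to benchmark first.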

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2019-02-16  1:54 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-02-15 21:19 [LSF/MM TOPIC] NVMe Performance: Userspace vs Kernel Felipe Franciosi
2019-02-15 21:41 ` Bart Van Assche
     [not found]   ` <11A6C7D0-A26D-410F-8EE3-9AF524DF2050@nutanix.com>
2019-02-16  1:01     ` Bart Van Assche
2019-02-16  1:54     ` Jens Axboe
2019-02-15 21:47 ` Keith Busch
2019-02-15 22:14   ` Felipe Franciosi
