From: "Ober, Frank" <frank.ober@intel.com>
To: Keith Busch <kbusch@kernel.org>
Cc: "linux-block@vger.kernel.org" <linux-block@vger.kernel.org>,
	"Rajendiran, Swetha" <swetha.rajendiran@intel.com>,
	"Liang, Mark" <mark.liang@intel.com>,
	"Derrick, Jonathan" <jonathan.derrick@intel.com>,
	"linux-nvme@lists.infradead.org" <linux-nvme@lists.infradead.org>
Subject: RE: Polled io for Linux kernel 5.x
Date: Tue, 31 Dec 2019 19:06:01 +0000	[thread overview]
Message-ID: <SN6PR11MB26691B36D7AEF22393CC04F38B260@SN6PR11MB2669.namprd11.prod.outlook.com> (raw)
In-Reply-To: <20191220212049.GA5582@redsun51.ssa.fujisawa.hgst.com>

Hi Keith, the performance results I see are very close between poll_queues and io_uring; I have posted them below, since I think this topic is still pretty new to people.

Is there anything we need to tell the reader/user about poll_queues? What is important for usage?
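For the blog I was planning to show readers a quick way to confirm that polling is actually available; does something along these lines look right to you? (I'm assuming the standard module and queue sysfs attributes here.)

  # how many poll queues the nvme driver was loaded with (0 = no polled queues)
  cat /sys/module/nvme/parameters/poll_queues
  # whether the block queue advertises polled completions
  cat /sys/block/nvme3n1/queue/io_poll
  # polling mode: -1 = classic busy-poll, 0 = adaptive hybrid, >0 = fixed sleep in usec
  cat /sys/block/nvme3n1/queue/io_poll_delay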

And can poll_queues be changed dynamically, or can it only be set when the module is loaded?
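For completeness, this is how I have been setting it so far (just a sketch, and it assumes a full module reload is the only way, which is really what I am asking about):

  # persistent: picked up the next time the nvme module is loaded
  echo "options nvme poll_queues=4" | sudo tee /etc/modprobe.d/nvme-poll.conf
  # or, if the driver is built in, on the kernel command line: nvme.poll_queues=4
  # one-off: reload the driver with the parameter (no NVMe filesystems can be mounted)
  sudo modprobe -r nvme && sudo modprobe nvme poll_queues=4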

My goal is to update the blog we built around testing Optane SSDs. Is there any possibility of an LWN article that goes deeper into this poll_queues change?

What's interesting in the data below is that the completion latency (clat) for io_uring is lower, i.e. better, but the IOPS are not higher. pvsync2 is the most efficient, by a small margin, on the newer 3D XPoint device.
Thanks
Frank

Results:
kernel (elrepo) - 5.4.1-1.el8.elrepo.x86_64
cpu - Intel(R) Xeon(R) Gold 6254 CPU @ 3.10GHz, pinned to run at 3.1 GHz
fio - fio-3.16-64-gfd988
Results for a Gen2 Optane SSD with poll_queues (pvsync2) vs io_uring/hipri
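The two runs below use essentially the same fio job and only swap the ioengine; I've reconstructed the invocations here from the output, so treat the exact option spelling as my notes rather than the literal job file:

  # pvsync2 polled run (preadv2 with RWF_HIPRI under the hood)
  fio --name=rand-read-4k-qd1 --filename=/dev/nvme3n1 --ioengine=pvsync2 --hipri \
      --rw=randread --bs=4k --iodepth=1 --direct=1 --time_based --runtime=120
  # io_uring polled run: identical job, io_uring engine (hipri maps to IORING_SETUP_IOPOLL)
  fio --name=rand-read-4k-qd1 --filename=/dev/nvme3n1 --ioengine=io_uring --hipri \
      --rw=randread --bs=4k --iodepth=1 --direct=1 --time_based --runtime=120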
pvsync2 (poll queues)
fio-3.16-64-gfd988
Starting 1 process
Jobs: 1 (f=1): [r(1)][100.0%][r=552MiB/s][r=141k IOPS][eta 00m:00s]
rand-read-4k-qd1: (groupid=0, jobs=1): err= 0: pid=10309: Tue Dec 31 10:49:33 2019
  read: IOPS=141k, BW=552MiB/s (579MB/s)(64.7GiB/120001msec)
    clat (nsec): min=6548, max=186309, avg=6809.48, stdev=497.58
     lat (nsec): min=6572, max=186333, avg=6834.24, stdev=499.28
    clat percentiles (usec):
     |  1.0000th=[    7],  5.0000th=[    7], 10.0000th=[    7],
     | 20.0000th=[    7], 30.0000th=[    7], 40.0000th=[    7],
     | 50.0000th=[    7], 60.0000th=[    7], 70.0000th=[    7],
     | 80.0000th=[    7], 90.0000th=[    7], 95.0000th=[    8],
     | 99.0000th=[    8], 99.5000th=[    8], 99.9000th=[    9],
     | 99.9500th=[   10], 99.9900th=[   18], 99.9990th=[  117],
     | 99.9999th=[  163]
   bw (  KiB/s): min=563512, max=567392, per=100.00%, avg=565635.38, stdev=846.99, samples=239
   iops        : min=140878, max=141848, avg=141408.82, stdev=211.76, samples=239
  lat (usec)   : 10=99.97%, 20=0.03%, 50=0.01%, 100=0.01%, 250=0.01%
  cpu          : usr=6.28%, sys=93.55%, ctx=408, majf=0, minf=96
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=16969949,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
   READ: bw=552MiB/s (579MB/s), 552MiB/s-552MiB/s (579MB/s-579MB/s), io=64.7GiB (69.5GB), run=120001-120001msec

Disk stats (read/write):
  nvme3n1: ios=16955008/0, merge=0/0, ticks=101477/0, in_queue=0, util=99.95%

io_uring:
fio-3.16-64-gfd988
Starting 1 process
Jobs: 1 (f=1): [r(1)][100.0%][r=538MiB/s][r=138k IOPS][eta 00m:00s]
rand-read-4k-qd1: (groupid=0, jobs=1): err= 0: pid=10797: Tue Dec 31 10:53:29 2019
  read: IOPS=138k, BW=539MiB/s (565MB/s)(63.1GiB/120001msec)
    slat (nsec): min=1029, max=161248, avg=1204.69, stdev=219.02
    clat (nsec): min=262, max=208952, avg=5735.42, stdev=469.73
     lat (nsec): min=6691, max=210136, avg=7008.54, stdev=516.99
    clat percentiles (usec):
     |  1.0000th=[    6],  5.0000th=[    6], 10.0000th=[    6],
     | 20.0000th=[    6], 30.0000th=[    6], 40.0000th=[    6],
     | 50.0000th=[    6], 60.0000th=[    6], 70.0000th=[    6],
     | 80.0000th=[    6], 90.0000th=[    6], 95.0000th=[    6],
     | 99.0000th=[    7], 99.5000th=[    7], 99.9000th=[    8],
     | 99.9500th=[    9], 99.9900th=[   10], 99.9990th=[   52],
     | 99.9999th=[  161]
   bw (  KiB/s): min=548208, max=554504, per=100.00%, avg=551620.30, stdev=984.77, samples=239
   iops        : min=137052, max=138626, avg=137905.07, stdev=246.17, samples=239
  lat (nsec)   : 500=0.01%, 750=0.01%, 1000=0.01%
  lat (usec)   : 2=0.01%, 4=0.01%, 10=99.98%, 20=0.01%, 50=0.01%
  lat (usec)   : 100=0.01%, 250=0.01%
  cpu          : usr=7.39%, sys=92.44%, ctx=408, majf=0, minf=93
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=16548899,0,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
   READ: bw=539MiB/s (565MB/s), 539MiB/s-539MiB/s (565MB/s-565MB/s), io=63.1GiB (67.8GB), run=120001-120001msec

Disk stats (read/write):
  nvme3n1: ios=16534429/0, merge=0/0, ticks=100320/0, in_queue=0, util=99.95%

Happy New Year Keith!

-----Original Message-----
From: Keith Busch <kbusch@kernel.org> 
Sent: Friday, December 20, 2019 1:21 PM
To: Ober, Frank <frank.ober@intel.com>
Cc: linux-block@vger.kernel.org; linux-nvme@lists.infradead.org; Derrick, Jonathan <jonathan.derrick@intel.com>; Rajendiran, Swetha <swetha.rajendiran@intel.com>; Liang, Mark <mark.liang@intel.com>
Subject: Re: Polled io for Linux kernel 5.x

On Thu, Dec 19, 2019 at 09:59:14PM +0000, Ober, Frank wrote:
> Thanks Keith, it makes sense to reserve and set it up uniquely if you 
> can save hw interrupts. But why would io_uring then not need these 
> queues, because a stack trace I ran shows without the special queues I 
> am still entering bio_poll. With pvsync2 I can only do polled io with 
> the poll_queues?

Polling can happen only if you have polled queues, so io_uring is not accomplishing anything by calling iopoll. I don't see an immediately good way to pass that information up to io_uring, though.

