* polled IO and 5.x kernels
@ 2019-12-18 19:02 Ober, Frank
  2019-12-19  2:40 ` Sitsofe Wheeler
  0 siblings, 1 reply; 5+ messages in thread
From: Ober, Frank @ 2019-12-18 19:02 UTC (permalink / raw)
  To: fio
  Cc: Rajendiran, Swetha, Liang, Mark, Derrick, Jonathan, Vyas,
	Satvik M, Knapp, Anthony J

Hi fio community, 
On 4.x kernels we used to be able to do:
# echo 1 > /sys/block/nvme0n1/queue/io_poll

And then run a polled_io job in fio with pvsync2 as our ioengine, with the hipri flag set.

On 5.x kernels we see the following error when trying to write the device setting:
-bash: echo: write error: Invalid argument

This is reproducible on 5.3 and 5.4 kernels with fio 3.16 builds.

What is the background on what has changed? Jens wrote this note back in 2015, and the approach it describes did work at the time.
Things have since changed, but none of us here in the Intel SSD group and OSS Driver team really know why.

https://lwn.net/Articles/663543/

More documentation can be found here: https://stackoverflow.com/questions/55223883/echo-write-error-invalid-argument-while-setting-io-poll-for-nvme-ssd/

Here is a good sample A / B test:
[global]
direct=1
filename=/dev/nvme1n1
log_avg_msec=500
time_based
percentile_list=1:5:10:20:30:40:50:60:70:80:90:95:99:99.5:99.9:99.95:99.99:99.999:99.9999


[rand-read-4k-qd1]
runtime=120
bs=4K
iodepth=1
numjobs=1
cpus_allowed=0
ioengine=io_uring
hipri
rw=randread
Works!

[global]
direct=1
filename=/dev/nvme1n1
log_avg_msec=500
time_based
percentile_list=1:5:10:20:30:40:50:60:70:80:90:95:99:99.5:99.9:99.95:99.99:99.999:99.9999


[rand-read-4k-qd1]
runtime=120
bs=4K
iodepth=1
numjobs=1
cpus_allowed=0
ioengine=pvsync2
hipri
rw=randread
Does not work: with hipri and pvsync2 on a 5.x kernel, you do not see the CPU spin up to 100% (kernel/sys usage).

Why not?

And what changed here?

Is it possible to get a new LWN article?
Thank you!
Frank Ober




* Re: polled IO and 5.x kernels
  2019-12-18 19:02 polled IO and 5.x kernels Ober, Frank
@ 2019-12-19  2:40 ` Sitsofe Wheeler
  2019-12-19  3:04   ` Sitsofe Wheeler
  0 siblings, 1 reply; 5+ messages in thread
From: Sitsofe Wheeler @ 2019-12-19  2:40 UTC (permalink / raw)
  To: Ober, Frank
  Cc: fio, Rajendiran, Swetha, Liang, Mark, Derrick, Jonathan, Vyas,
	Satvik M, Knapp, Anthony J

Hi,

On Wed, 18 Dec 2019 at 19:02, Ober, Frank <frank.ober@intel.com> wrote:
>
> Hi fio community,
> On 4.x kernels we used to be able to do:
> # echo 1 > /sys/block/nvme0n1/queue/io_poll
>
> And then run a polled_io job in fio with pvsync2 as our ioengine, with the hipri flag set.
>
> On 5.x kernels we see the following error when trying to write the device setting:
> -bash: echo: write error: Invalid argument

Wouldn't this be better asked on the linux-block mailing list
(http://vger.kernel.org/vger-lists.html#linux-block )?

If you're saying "This worked in 4.20 but not in 5.0" (after all, since
you can reproduce this clearly, bisection is always an option for you) then a
quick search (https://github.com/torvalds/linux/search?o=desc&p=2&q=%22io_poll%22&s=committer-date&type=Commits
) finds this commit -
https://github.com/torvalds/linux/commit/cd19181bf9ad4b7f40f2a4e0355d052109c76529
which might have something to do with the change of behaviour...

> This is reproducible on 5.3 and 5.4 kernels with fio 3.16 builds.
>
> What is the background on what has changed? Jens wrote this note back in 2015, and the approach it describes did work at the time.
> Things have since changed, but none of us here in the Intel SSD group and OSS Driver team really know why.
>
> https://lwn.net/Articles/663543/

Note: this is just an excerpt from a mailing list post that LWN
happened to highlight. See
https://lore.kernel.org/lkml/1446830423-25027-1-git-send-email-axboe@fb.com/
for the post and its thread...

> More documentation can be found here: https://stackoverflow.com/questions/55223883/echo-write-error-invalid-argument-while-setting-io-poll-for-nvme-ssd/

Isn't Stack Overflow for programming questions rather than "where's a
kernel option gone?" style questions? Maybe
https://unix.stackexchange.com/ would have been a better home?

> Here is a good sample A / B test:
> [global]
> direct=1
> filename=/dev/nvme1n1
> log_avg_msec=500
> time_based
> percentile_list=1:5:10:20:30:40:50:60:70:80:90:95:99:99.5:99.9:99.95:99.99:99.999:99.9999
>
>
> [rand-read-4k-qd1]
> runtime=120
> bs=4K
> iodepth=1
> numjobs=1
> cpus_allowed=0
> ioengine=io_uring
> hipri
> rw=randread
> Works!
>
> [global]
> direct=1
> filename=/dev/nvme1n1
> log_avg_msec=500
> time_based
> percentile_list=1:5:10:20:30:40:50:60:70:80:90:95:99:99.5:99.9:99.95:99.99:99.999:99.9999
>
>
> [rand-read-4k-qd1]
> runtime=120
> bs=4K
> iodepth=1
> numjobs=1
> cpus_allowed=0
> ioengine=pvsync2
> hipri
> rw=randread
> Does not work: with hipri and pvsync2 on a 5.x kernel, you do not see the CPU spin up to 100% (kernel/sys usage).
>
> Why not?

Glancing at https://fio.readthedocs.io/en/latest/fio_doc.html#i-o-engine-specific-parameters
, hipri on the pvsync2 ioengine means something different to hipri on
io_uring ioengine...
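
Since the two quoted jobs differ only in their ioengine line, one way to keep
the A/B comparison honest is to generate both from a single template. The
following is only a sketch: write_job is a hypothetical helper name, the
option values are copied from the quoted job files, and percentile_list is
left out for brevity.

```shell
#!/bin/sh
# Sketch: emit the quoted A/B job for a given ioengine.
# write_job is a hypothetical helper; options come from the quoted jobs.
write_job() {
    engine=$1
    cat <<EOF
[global]
direct=1
filename=/dev/nvme1n1
log_avg_msec=500
time_based

[rand-read-4k-qd1]
runtime=120
bs=4K
iodepth=1
numjobs=1
cpus_allowed=0
ioengine=$engine
hipri
rw=randread
EOF
}

# Usage (not run here):
#   write_job io_uring > a.fio && write_job pvsync2 > b.fio
#   diff a.fio b.fio    # should differ only in the ioengine line
```

Keeping everything but the engine identical means any difference in CPU usage
or latency can be attributed to how each engine interprets hipri.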

> And what changed here?

Just to check, do you get the same "full CPU" utilisation when you
remove hipri from your pvsync job on a 4.x kernel?
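
To make that comparison less of an eyeball exercise, the kernel-mode share of
a CPU can be estimated from two samples of its line in /proc/stat. This is a
rough sketch, not a replacement for a proper profiler: sys_pct is a
hypothetical helper name, it counts only the "system" field (irq and softirq
time are treated as non-system), and it assumes the usual field order of
user, nice, system, idle, iowait, irq, softirq, steal.

```shell
#!/bin/sh
# Sketch: percentage of time a CPU spent in kernel ("system") mode between
# two samples of its /proc/stat line. sys_pct is a hypothetical helper.
# Fields after the label: user nice system idle iowait irq softirq steal ...
sys_pct() {
    # $1 and $2 are two "cpuN ..." lines taken some time apart
    printf '%s\n%s\n' "$1" "$2" | awk '
        { s[NR] = $4; total[NR] = 0
          for (i = 2; i <= 9 && i <= NF; i++) total[NR] += $i }
        END { d = total[2] - total[1]
              if (d > 0) printf "%d\n", 100 * (s[2] - s[1]) / d
              else print 0 }'
}

# Usage (not run here), sampling CPU 0 for one second while fio is running
# with cpus_allowed=0:
#   a=$(grep '^cpu0 ' /proc/stat); sleep 1; b=$(grep '^cpu0 ' /proc/stat)
#   sys_pct "$a" "$b"
```

Running this once with hipri and once without, on the same kernel, gives two
numbers to compare rather than an impression from top.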

> Is it possible to get a new LWN article?

That sounds like one for Jon Corbet (LWN.net editor) and the block
maintainers but as mentioned above what you linked to was just a
mailing list excerpt...

> Thank you!
> Frank Ober

Good luck!

-- 
Sitsofe | http://sucs.org/~sits/



* Re: polled IO and 5.x kernels
  2019-12-19  2:40 ` Sitsofe Wheeler
@ 2019-12-19  3:04   ` Sitsofe Wheeler
  2019-12-19 17:22     ` Ober, Frank
  0 siblings, 1 reply; 5+ messages in thread
From: Sitsofe Wheeler @ 2019-12-19  3:04 UTC (permalink / raw)
  To: Ober, Frank
  Cc: fio, Rajendiran, Swetha, Liang, Mark, Derrick, Jonathan, Vyas,
	Satvik M, Knapp, Anthony J

On Thu, 19 Dec 2019 at 02:40, Sitsofe Wheeler <sitsofe@gmail.com> wrote:
>
> On Wed, 18 Dec 2019 at 19:02, Ober, Frank <frank.ober@intel.com> wrote:
> >
> > On 4.x kernels we used to be able to do:
> > # echo 1 > /sys/block/nvme0n1/queue/io_poll
> >
> > And then run a polled_io job in fio with pvsync2 as our ioengine, with the hipri flag set.
> >
> > On 5.x kernels we see the following error when trying to write the device setting:
> > -bash: echo: write error: Invalid argument
>
> Wouldn't this be better asked on the linux-block mailing list
> (http://vger.kernel.org/vger-lists.html#linux-block )?

In fact, doing another search on that list
(https://lore.kernel.org/linux-block/?q=io_poll ) or a more general
Google search (https://www.google.com/search?q=%22echo%3A+write+error%3A+Invalid+argument%22+io_poll
) turns up a post titled "Error while enabling io_poll for NVMe SSD"
(https://lore.kernel.org/linux-block/CAFQ9A4Zc8Fc4bDyAsiduTw4kTpxR=dVZr7LUdqUkvQSu1CaGpg@mail.gmail.com/
) where Keith Busch (who has an Intel address but I'm guessing isn't
in your group?) asks: "did the user turn on polling queues in the nvme
driver"? Another quick hunt
(https://github.com/torvalds/linux/search?q=%22poll_queues%22&type=Code
and following some blame lines) finds
https://github.com/torvalds/linux/commit/4b04cc6a8f86c4842314def22332de1f15de8523
which indicates an NVMe option was added in 5.0 related to polling.
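
As a quick way to check both pieces on a given machine, something like the
sketch below could be used. It rests on assumptions that should be verified
against your kernel: that the option is exposed as the nvme module parameter
poll_queues under /sys/module/nvme/parameters/, and that it can be set with
nvme.poll_queues=N on the kernel command line or with
"modprobe nvme poll_queues=N". poll_status and SYSFS_ROOT are hypothetical
names; SYSFS_ROOT exists only so the function can be tried against a fake
directory tree.

```shell
#!/bin/sh
# Sketch: report the nvme poll_queues setting and a device's io_poll value.
# poll_status and SYSFS_ROOT are hypothetical names; SYSFS_ROOT defaults
# to /sys and is only overridden for experiments.
poll_status() {
    dev=$1
    root=${SYSFS_ROOT:-/sys}
    pq_file=$root/module/nvme/parameters/poll_queues   # assumed location
    io_poll_file=$root/block/$dev/queue/io_poll

    if [ -r "$pq_file" ]; then
        echo "nvme poll_queues=$(cat "$pq_file")"
    else
        echo "nvme poll_queues=unknown"
    fi
    if [ -r "$io_poll_file" ]; then
        echo "$dev io_poll=$(cat "$io_poll_file")"
    else
        echo "$dev io_poll=unknown"
    fi
}

# Usage (not run here):
#   poll_status nvme0n1
```

If poll_queues reports 0, that would be consistent with Keith Busch's
question about whether polling queues were turned on in the driver.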

(PS: If you're using fio to publish benchmarking results don't forget
about https://github.com/axboe/fio/blob/master/MORAL-LICENSE )

-- 
Sitsofe | http://sucs.org/~sits/



* RE: polled IO and 5.x kernels
  2019-12-19  3:04   ` Sitsofe Wheeler
@ 2019-12-19 17:22     ` Ober, Frank
  2019-12-20  8:01       ` Sitsofe Wheeler
  0 siblings, 1 reply; 5+ messages in thread
From: Ober, Frank @ 2019-12-19 17:22 UTC (permalink / raw)
  To: Sitsofe Wheeler, fio
  Cc: Rajendiran, Swetha, Liang, Mark, Derrick, Jonathan, Vyas,
	Satvik M, Knapp, Anthony J

Sitsofe, 
Ok I will do that. 
Frank



* Re: polled IO and 5.x kernels
  2019-12-19 17:22     ` Ober, Frank
@ 2019-12-20  8:01       ` Sitsofe Wheeler
  0 siblings, 0 replies; 5+ messages in thread
From: Sitsofe Wheeler @ 2019-12-20  8:01 UTC (permalink / raw)
  To: Ober, Frank
  Cc: fio, Rajendiran, Swetha, Liang, Mark, Derrick, Jonathan, Vyas,
	Satvik M, Knapp, Anthony J

> -----Original Message-----
> From: Sitsofe Wheeler <sitsofe@gmail.com>
> Sent: Wednesday, December 18, 2019 7:05 PM
>
> > Wouldn't this be better asked on the linux-block mailing list
> > (http://vger.kernel.org/vger-lists.html#linux-block )?

On Thu, 19 Dec 2019 at 17:22, Ober, Frank <frank.ober@intel.com> wrote:
>
> Sitsofe,
> Ok I will do that.

For anyone who might be following along (e.g. by finding the post
you're reading via Google some time in the future), Frank did indeed
do this and you can see the outcome in the "Polled io for Linux kernel
5.x" thread on the Linux block mailing list (e.g.
https://lore.kernel.org/linux-block/SN6PR11MB2669E7A65DD0AD9DC65A67C58B520@SN6PR11MB2669.namprd11.prod.outlook.com/
).

-- 
Sitsofe | http://sucs.org/~sits/


