* polled IO and 5.x kernels
From: Ober, Frank @ 2019-12-18 19:02 UTC
To: fio
Cc: Rajendiran, Swetha, Liang, Mark, Derrick, Jonathan, Vyas, Satvik M, Knapp, Anthony J

Hi fio community,
On 4.x kernels we used to be able to do:
# echo 1 > /sys/block/nvme0n1/queue/io_poll

And then run a polled_io job in fio with pvsync2 as our ioengine, with the hipri flag set.

On 5.x kernels we see the following error trying to write the device settings>>>
-bash: echo: write error: Invalid argument

This is verifiable on 5.3, 5.4 kernels with fio 3.16 builds.

What is the background on what has changed because Jens wrote this note back in 2015, which did work once upon a time.
But now things have changed, but none of us here in the Intel SSD group and OSS Driver team really know why.

https://lwn.net/Articles/663543/

More documentation can be found here: https://stackoverflow.com/questions/55223883/echo-write-error-invalid-argument-while-setting-io-poll-for-nvme-ssd/

Here is a good sample A / B test:
[global]
direct=1
filename=/dev/nvme1n1
log_avg_msec=500
time_based
percentile_list=1:5:10:20:30:40:50:60:70:80:90:95:99:99.5:99.9:99.95:99.99:99.999:99.9999

[rand-read-4k-qd1]
runtime=120
bs=4K
iodepth=1
numjobs=1
cpus_allowed=0
ioengine=io_uring
hipri
rw=randread

Works!

[global]
direct=1
filename=/dev/nvme1n1
log_avg_msec=500
time_based
percentile_list=1:5:10:20:30:40:50:60:70:80:90:95:99:99.5:99.9:99.95:99.99:99.999:99.9999

[rand-read-4k-qd1]
runtime=120
bs=4K
iodepth=1
numjobs=1
cpus_allowed=0
ioengine=pvsync2
hipri
rw=randread

Does not work... you do not see the CPU spin up to 100% (kernel/sys usage) on hipri with pvsync2 on a 5.x kernel.

Why not?
And what changed here?
Is it possible to get a new LWN article?

Thank you!
Frank Ober

* Re: polled IO and 5.x kernels
From: Sitsofe Wheeler @ 2019-12-19 2:40 UTC
To: Ober, Frank
Cc: fio, Rajendiran, Swetha, Liang, Mark, Derrick, Jonathan, Vyas, Satvik M, Knapp, Anthony J

Hi,

On Wed, 18 Dec 2019 at 19:02, Ober, Frank <frank.ober@intel.com> wrote:
>
> Hi fio community,
> On 4.x kernels we used to be able to do:
> # echo 1 > /sys/block/nvme0n1/queue/io_poll
>
> And then run a polled_io job in fio with pvsync2 as our ioengine, with the hipri flag set.
>
> On 5.x kernels we see the following error trying to write the device settings>>>
> -bash: echo: write error: Invalid argument

Wouldn't this be better asked on the linux-block mailing list (http://vger.kernel.org/vger-lists.html#linux-block )? If you're saying "This worked in 4.20 but not in 5.0" (after all since you can see this clearly bisection is always an option for you) then a quick search (https://github.com/torvalds/linux/search?o=desc&p=2&q=%22io_poll%22&s=committer-date&type=Commits ) finds this commit - https://github.com/torvalds/linux/commit/cd19181bf9ad4b7f40f2a4e0355d052109c76529 which might have something to do with the change of behaviour...

> This is verifiable on 5.3, 5.4 kernels with fio 3.16 builds.
>
> What is the background on what has changed because Jens wrote this note back in 2015, which did work once upon a time.
> But now things have changed, but none of us here in the Intel SSD group and OSS Driver team really know why.
>
> https://lwn.net/Articles/663543/

Note: this is just an excerpt from a mailing list post that LWN happened to highlight. See https://lore.kernel.org/lkml/1446830423-25027-1-git-send-email-axboe@fb.com/ for the post and its thread...

> More documentation can be found here: https://stackoverflow.com/questions/55223883/echo-write-error-invalid-argument-while-setting-io-poll-for-nvme-ssd/

Isn't stackoverflow for programming questions rather than "where's a kernel option gone?" style question? Maybe https://unix.stackexchange.com/ would have been a better home?

> Here is a good sample A / B test:
> [global]
> direct=1
> filename=/dev/nvme1n1
> log_avg_msec=500
> time_based
> percentile_list=1:5:10:20:30:40:50:60:70:80:90:95:99:99.5:99.9:99.95:99.99:99.999:99.9999
>
>
> [rand-read-4k-qd1]
> runtime=120
> bs=4K
> iodepth=1
> numjobs=1
> cpus_allowed=0
> ioengine=io_uring
> hipri
> rw=randread
>
> Works!
>
> [global]
> direct=1
> filename=/dev/nvme1n1
> log_avg_msec=500
> time_based
> percentile_list=1:5:10:20:30:40:50:60:70:80:90:95:99:99.5:99.9:99.95:99.99:99.999:99.9999
>
>
> [rand-read-4k-qd1]
> runtime=120
> bs=4K
> iodepth=1
> numjobs=1
> cpus_allowed=0
> ioengine=pvsync2
> hipri
> rw=randread
>
> Does not work... you do not see the CPU spin up to 100% (kernel/sys usage) on hipri with pvsync2 on a 5.x kernel.
>
> Why not?

Glancing at https://fio.readthedocs.io/en/latest/fio_doc.html#i-o-engine-specific-parameters , hipri on the pvsync2 ioengine means something different to hipri on io_uring ioengine...

> And what changed here?

Just to check, do you get the same "full CPU" utilisation when you remove hipri from your pvsync job on a 4.x kernel?

> Is it possible to get a new LWN article?

That sounds like one for Jon Corbet (LWN.net editor) and the block maintainers but as mentioned above what you linked to was just a mailing list excerpt...

> Thank you!
> Frank Ober

Good luck!

--
Sitsofe | http://sucs.org/~sits/

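One way to run the check asked about above is to issue the same pvsync2 job twice, once with hipri and once without, and compare the sys_cpu figure that fio reports for each run. The following is a minimal sketch under stated assumptions: the function name build_fio_cmd, the 30 second runtime and the choice of JSON output are illustrative and do not come from the thread; the remaining options mirror Frank's job file. It only prints the command line, so nothing touches the device until you run the result yourself.

```shell
# Sketch: print the fio command line for one arm of a hipri on/off comparison.
# build_fio_cmd is a hypothetical helper; runtime=30 is a placeholder value.
build_fio_cmd() {
    local dev="$1"        # block device to read from, e.g. /dev/nvme1n1
    local use_hipri="$2"  # "yes" to append the pvsync2 hipri option

    # Options taken from the job file in the thread, expressed as CLI flags.
    local cmd="fio --output-format=json --name=rand-read-4k-qd1 --filename=$dev"
    cmd="$cmd --direct=1 --rw=randread --bs=4k --iodepth=1 --numjobs=1"
    cmd="$cmd --cpus_allowed=0 --time_based --runtime=30 --ioengine=pvsync2"

    # Engine-specific options such as hipri have to follow --ioengine.
    if [ "$use_hipri" = "yes" ]; then
        cmd="$cmd --hipri"
    fi
    echo "$cmd"
}
```

Running the output of `build_fio_cmd /dev/nvme1n1 yes` and of `build_fio_cmd /dev/nvme1n1 no` in turn, then comparing the sys_cpu value under jobs[0] in each JSON report, should show whether hipri changes kernel CPU usage at all on a given kernel.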
* Re: polled IO and 5.x kernels
From: Sitsofe Wheeler @ 2019-12-19 3:04 UTC
To: Ober, Frank
Cc: fio, Rajendiran, Swetha, Liang, Mark, Derrick, Jonathan, Vyas, Satvik M, Knapp, Anthony J

On Thu, 19 Dec 2019 at 02:40, Sitsofe Wheeler <sitsofe@gmail.com> wrote:
>
> On Wed, 18 Dec 2019 at 19:02, Ober, Frank <frank.ober@intel.com> wrote:
> >
> > On 4.x kernels we used to be able to do:
> > # echo 1 > /sys/block/nvme0n1/queue/io_poll
> >
> > And then run a polled_io job in fio with pvsync2 as our ioengine, with the hipri flag set.
> >
> > On 5.x kernels we see the following error trying to write the device settings>>>
> > -bash: echo: write error: Invalid argument
>
> Wouldn't this be better asked on the linux-block mailing list
> (http://vger.kernel.org/vger-lists.html#linux-block )?

In fact, doing another search on that list (https://lore.kernel.org/linux-block/?q=io_poll ) or a more general Google search (https://www.google.com/search?q=%22echo%3A+write+error%3A+Invalid+argument%22+io_poll ) turns up a post titled "Error while enabling io_poll for NVMe SSD" (https://lore.kernel.org/linux-block/CAFQ9A4Zc8Fc4bDyAsiduTw4kTpxR=dVZr7LUdqUkvQSu1CaGpg@mail.gmail.com/ ) where Keith Busch (who has an Intel address but I'm guessing isn't in your group?) asks: "did the user turn on polling queues in the nvme driver"?

Another quick hunt (https://github.com/torvalds/linux/search?q=%22poll_queues%22&type=Code and following some blame lines) finds https://github.com/torvalds/linux/commit/4b04cc6a8f86c4842314def22332de1f15de8523 which indicates an NVMe option was added in 5.0 related to polling.

(PS: If you're using fio to publish benchmarking results don't forget about https://github.com/axboe/fio/blob/master/MORAL-LICENSE )

--
Sitsofe | http://sucs.org/~sits/

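The poll_queues option mentioned above can be inspected through sysfs before attempting the io_poll write. Below is a minimal sketch, assuming the nvme driver exposes the parameter at /sys/module/nvme/parameters/poll_queues (general knowledge of 5.x kernels, not something stated in this thread). The function name poll_status and the SYSFS_ROOT override are hypothetical and exist only so the logic can be tried against a fake directory tree.

```shell
# Sketch: report the nvme poll_queues setting and a device's io_poll flag.
# poll_status and SYSFS_ROOT are illustrative names, not part of any tool.
poll_status() {
    local dev="$1"                  # block device name, e.g. nvme0n1
    local root="${SYSFS_ROOT:-/sys}"
    local pq_file="$root/module/nvme/parameters/poll_queues"
    local io_poll_file="$root/block/$dev/queue/io_poll"
    local pq="missing"
    local io_poll="missing"

    # Number of polled queues the nvme driver was asked to allocate.
    if [ -r "$pq_file" ]; then
        pq="$(cat "$pq_file")"
    fi
    # Current per-device polling flag (the file written in the thread).
    if [ -r "$io_poll_file" ]; then
        io_poll="$(cat "$io_poll_file")"
    fi
    echo "poll_queues=$pq io_poll=$io_poll"
}
```

Calling `poll_status nvme0n1` prints both values on one line. If poll_queues reads 0, then to the best of my knowledge the driver has no polled queues allocated, which would be consistent with the "Invalid argument" error on the io_poll write. Loading the driver with something like `modprobe nvme poll_queues=4`, or booting with `nvme.poll_queues=4`, is the usual way to request them; the value 4 is only an example.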
* RE: polled IO and 5.x kernels
From: Ober, Frank @ 2019-12-19 17:22 UTC
To: Sitsofe Wheeler, fio
Cc: Rajendiran, Swetha, Liang, Mark, Derrick, Jonathan, Vyas, Satvik M, Knapp, Anthony J

Sitsofe,
Ok I will do that.
Frank

* Re: polled IO and 5.x kernels
From: Sitsofe Wheeler @ 2019-12-20 8:01 UTC
To: Ober, Frank
Cc: fio, Rajendiran, Swetha, Liang, Mark, Derrick, Jonathan, Vyas, Satvik M, Knapp, Anthony J

> -----Original Message-----
> From: Sitsofe Wheeler <sitsofe@gmail.com>
> Sent: Wednesday, December 18, 2019 7:05 PM
>
> > Wouldn't this be better asked on the linux-block mailing list
> > (http://vger.kernel.org/vger-lists.html#linux-block )?

On Thu, 19 Dec 2019 at 17:22, Ober, Frank <frank.ober@intel.com> wrote:
>
> Sitsofe,
> Ok I will do that.

For anyone who might be following along (e.g. by finding the post you're reading via Google some time in the future), Frank did indeed do this and you can see the outcome in the "Polled io for Linux kernel 5.x" thread on the Linux block mailing list (e.g. https://lore.kernel.org/linux-block/SN6PR11MB2669E7A65DD0AD9DC65A67C58B520@SN6PR11MB2669.namprd11.prod.outlook.com/ ).

--
Sitsofe | http://sucs.org/~sits/