* [Ksummit-discuss] [TECH TOPIC] Addressing long-standing high-latency problems related to I/O
@ 2016-09-16  7:55 Paolo Valente
  2016-09-16  8:24 ` Greg KH
  2016-09-21 14:30 ` Bart Van Assche
  0 siblings, 2 replies; 20+ messages in thread
From: Paolo Valente @ 2016-09-16  7:55 UTC (permalink / raw)
  To: ksummit-discuss; +Cc: b.zolnierkie, Jens Axboe, hare, Tejun Heo, osandov, hch

Linux systems suffer from long-standing high-latency problems, at
system and application level, related to I/O.  For example, they
usually suffer from poor responsiveness--or even starvation, depending
on the workload--while, e.g., one or more files are being
read/written/copied.  On a similar note, background workloads may
cause audio/video playback/streaming to stutter, even with long gaps.
A lot of test results on this problem can be found here [1] (I'm
citing only this resource just because I'm familiar with it, but
evidence can be found in countless technical reports, scientific
papers, forum discussions, and so on).

These problems are caused mainly by poor I/O scheduling, although I/O
schedulers are not the only culprit.  To address these issues, eight
years ago I started to work on a new I/O scheduler, named BFQ [1], with
other researchers and developers.  Since then, we have improved
and fine-tuned BFQ rather steadily.  In particular, we have
tested it extensively, especially on desktop systems.  In our
easily repeatable experiments, BFQ proves able to solve latency
issues in many, if not most, use cases [2].  For example, regardless of
the background workload considered in [2], application start-up times
are about the same as when the storage device is idle.  Similarly,
audio/video playback is always perfectly smooth.  The feedback
received so far confirms our results.  Accordingly, BFQ is, e.g.,
the default I/O scheduler in a few distributions, including Sabayon
and Arch Linux ARM, as well as in CyanogenMod for several devices.

BFQ has been submitted several times on lkml over the last eight
years, most recently by me.  But it has not made it in, for reasons
(apparently) other than how serious the latency problem is, or how
effectively BFQ solves it.  In short, the problem with the first
patchsets was that they added a new scheduler, whereas it had been
decided that they should replace CFQ instead [3].  Then time passed in
various submit-and-revise rounds.  Meanwhile blk-mq entered mainline,
and a new objection was raised: it is not sensible to touch code
(blk) that will eventually be deprecated [4].

In view of these facts, I would like to propose a discussion on this
topic, and, in particular, on the following points:

1) If blk will still be used in a considerable number of systems for
at least one or two more years, as many think is the case, is it
sensible to prevent a lot of users from enjoying a responsive and
smooth system?  It does not seem a good idea, also because having
BFQ, or an even better variant of it, in blk would provide a
strong reference benchmark to drive the development of effective I/O
scheduling in blk-mq as well.

2) Work is going on in blk-mq to add I/O scheduling, but IMHO current
approaches and ideas may not be sufficient to solve the above latency
problems.  So, again IMO, latency issues may get even worse for low-
and medium-speed single-queue devices in the transition to blk-mq, as
there is no I/O scheduling yet in blk-mq (see the sketch after these
points), and these issues may remain if no accurate-enough scheduler
is added.  Conversely, solving latency issues, and not only improving
throughput, is probably quite important to speed up the transition to
blk-mq for these devices.

3) Can we join forces to solve latency problems in blk-mq?  Through my
work on BFQ, I have gained some experience with I/O scheduling
and with providing strong service guarantees (low latency, accurate
bandwidth distribution, ...), yet I'm anything but an expert on blk-mq's
inner workings and issues.  I'm willing to help in all areas in this
regard, including tasks related to the previous point.
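
As a minimal illustration of point 2 (device names here are just
hypothetical examples): a legacy blk device exposes a set of
selectable I/O schedulers in sysfs, while a blk-mq device currently
exposes none at all:

cat /sys/block/sda/queue/scheduler
noop deadline [cfq]

cat /sys/block/nvme0n1/queue/scheduler
none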

Thanks,
Paolo

[1] http://algogroup.unimore.it/people/paolo/disk_sched/
[2] http://algogroup.unimore.it/people/paolo/disk_sched/results.php
[3] https://lists.linux-foundation.org/pipermail/containers/2014-June/034704.html
[4] https://lkml.org/lkml/2016/8/8/207


* Re: [Ksummit-discuss] [TECH TOPIC] Addressing long-standing high-latency problems related to I/O
  2016-09-16  7:55 [Ksummit-discuss] [TECH TOPIC] Addressing long-standing high-latency problems related to I/O Paolo Valente
@ 2016-09-16  8:24 ` Greg KH
  2016-09-16  8:59   ` Linus Walleij
  2016-09-16 15:15   ` James Bottomley
  2016-09-21 14:30 ` Bart Van Assche
  1 sibling, 2 replies; 20+ messages in thread
From: Greg KH @ 2016-09-16  8:24 UTC (permalink / raw)
  To: Paolo Valente
  Cc: b.zolnierkie, ksummit-discuss, Jens Axboe, hare, Tejun Heo, osandov, hch

On Fri, Sep 16, 2016 at 09:55:45AM +0200, Paolo Valente wrote:
> Linux systems suffer from long-standing high-latency problems, at
> system and application level, related to I/O.  For example, they
> usually suffer from poor responsiveness--or even starvation, depending
> on the workload--while, e.g., one or more files are being
> read/written/copied.  On a similar note, background workloads may
> cause audio/video playback/streaming to stutter, even with long gaps.
> A lot of test results on this problem can be found here [1] (I'm
> citing only this resource just because I'm familiar with it, but
> evidence can be found in countless technical reports, scientific
> papers, forum discussions, and so on).

<snip>

Isn't this a better topic for the Vault conference, or the storage mini
conference?

thanks,

greg k-h


* Re: [Ksummit-discuss] [TECH TOPIC] Addressing long-standing high-latency problems related to I/O
  2016-09-16  8:24 ` Greg KH
@ 2016-09-16  8:59   ` Linus Walleij
  2016-09-16  9:10     ` Bart Van Assche
  2016-09-22  9:18     ` Ulf Hansson
  2016-09-16 15:15   ` James Bottomley
  1 sibling, 2 replies; 20+ messages in thread
From: Linus Walleij @ 2016-09-16  8:59 UTC (permalink / raw)
  To: Greg KH
  Cc: Bartlomiej Zolnierkiewicz, ksummit-discuss, Jens Axboe, hare,
	Tejun Heo, osandov, Christoph Hellwig

On Fri, Sep 16, 2016 at 10:24 AM, Greg KH <gregkh@linuxfoundation.org> wrote:
> On Fri, Sep 16, 2016 at 09:55:45AM +0200, Paolo Valente wrote:
>> Linux systems suffer from long-standing high-latency problems, at
>> system and application level, related to I/O.  For example, they
>> usually suffer from poor responsiveness--or even starvation, depending
>> on the workload--while, e.g., one or more files are being
>> read/written/copied.  On a similar note, background workloads may
>> cause audio/video playback/streaming to stutter, even with long gaps.
>> A lot of test results on this problem can be found here [1] (I'm
>> citing only this resource just because I'm familiar with it, but
>> evidence can be found in countless technical reports, scientific
>> papers, forum discussions, and so on).
>
> <snip>
>
> Isn't this a better topic for the Vault conference, or the storage mini
> conference?

Paolo was invited to the kernel summit and I guess so are the
core block maintainers: Jens, Tejun, Christoph. The right people are
there so why not take the opportunity.

If for nothing else just have a formal chat.

Overall I personally think the most KS-related discussion would be
to address the problems Paolo has had breaking into the block layer
development community and the conflicting responses to the patch
sets, which generated a few flak comments under the last LWN
article:
http://lwn.net/Articles/674308/

The main problem is that, unlike some random driver, this cannot
be put into staging, and adding it as a secondary (or tertiary or
whatever) scheduling policy in block/* was explicitly nixed.

AFAICT there is no clear answer from the block maintainers
regarding:

- Is the old blk layer deprecated or not? Christoph seems to
  say "yes, forget it, work on mq", but I am still unsure about Jens
  and Tejuns positions here. Would be nice with some consensus.
  If it is deprecated it would make sense not to merge any new
  code using it, right?

- When is an all-out transition to mq really going to happen?
  "When it's ready and all blk consumers are migrated" is a good
  answer, but pretty unhelpful for developers like Paolo.
  Can we get a clearer picture?

- What will subsystems (especially my pet peeve about MMC/SD
  which is single-queue by nature) that experience a performance
  regression with a switch to mq do? Not switch until mq has a
  scheduling policy? Switch and suck up the performance regression,
  multiplied by the number of Android handheld devices on the
  planet?

I only have handwavy arguments about the latter being the
case, which is why I'm working on a patch to MMC/SD to
switch to mq as an RFT. It's taking some time though; alas,
I'm not very smart.

Yours,
Linus Walleij


* Re: [Ksummit-discuss] [TECH TOPIC] Addressing long-standing high-latency problems related to I/O
  2016-09-16  8:59   ` Linus Walleij
@ 2016-09-16  9:10     ` Bart Van Assche
  2016-09-16 11:24       ` Linus Walleij
  2016-09-22  9:18     ` Ulf Hansson
  1 sibling, 1 reply; 20+ messages in thread
From: Bart Van Assche @ 2016-09-16  9:10 UTC (permalink / raw)
  To: Linus Walleij, Greg KH
  Cc: Bartlomiej Zolnierkiewicz, ksummit-discuss, Jens Axboe, hare,
	Tejun Heo, osandov, Christoph Hellwig

On 09/16/2016 10:59 AM, Linus Walleij wrote:
> - What will subsystems (especially my pet peeve about MMC/SD
>   which is single-queue by nature) that experience a performance
>   regression with a switch to mq do? Not switch until mq has a
>   scheduling policy? Switch and suck up the performance regression,
>   multiplied by the number of Android handheld devices on the
>   planet?
>
> I only have handwavy arguments about the latter being the
> case, which is why I'm working on a patch to MMC/SD to
> switch to mq as an RFT. It's taking some time though; alas,
> I'm not very smart.

Hello Linus,

What was your reference when comparing blk-mq MMC/SD performance with 
the current implementation? Which I/O scheduler was used when measuring 
performance with the traditional block layer? If it was not noop, how 
does blk-mq performance of MMC/SD compare to the performance of the 
current implementation with the noop scheduler?

Thanks,

Bart.


* Re: [Ksummit-discuss] [TECH TOPIC] Addressing long-standing high-latency problems related to I/O
  2016-09-16  9:10     ` Bart Van Assche
@ 2016-09-16 11:24       ` Linus Walleij
  2016-09-16 11:46         ` Arnd Bergmann
  2016-09-16 11:53         ` Bart Van Assche
  0 siblings, 2 replies; 20+ messages in thread
From: Linus Walleij @ 2016-09-16 11:24 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Bartlomiej Zolnierkiewicz, ksummit-discuss, Greg KH, Jens Axboe,
	hare, Tejun Heo, osandov, Christoph Hellwig

On Fri, Sep 16, 2016 at 11:10 AM, Bart Van Assche
<bart.vanassche@sandisk.com> wrote:

> What was your reference when comparing blk-mq MMC/SD performance with the
> current implementation?

I have *NOT* compared the performance, since I have not yet
managed to replace blk with blk-mq in MMC/SD.

If someone else has more experience and can do this in
5 minutes to get a rough measure, I would appreciate
seeing it.

I am working on it from the bottom up, trying to make
a not-too-stupid search-and-substitute replacement. As MMC
does a lot of request stacking and looking ahead and behind
and whatnot, this needs to be done thoroughly.

But these are the reference tests I have used for CFQ vs BFQ
comparisons so far:

Hardware:
- ARM Integrator/AP IM-PD1 SD-card at 300kHz (!)
- Ux500 with 7.18GiB eMMC
- Ux500 with SanDisk 4GiB uSD card
- ARM Juno with 2GiB Kingston uSD card
- ARM Juno with SanDisk 4GiB uSD card
- Marvell Kirkwood Feroceon ARM with 2GiB SD card

First, the standard dd write/read test, of course, because if you
have performance issues there you can just forget about everything
else.  It looks something like:
time dd if=/dev/mmcblk0 of=/dev/null bs=1M count=1024 iflag=direct

That is with busybox dd/time.

Then I used iozone, which the mobile industry has traditionally
used to provide some figures on storage throughput; many just want
a figure to put on their whitepaper.  iozone will read and write a
number of blocks of varying size, re-read and re-write them, and
also perform reads and writes at random offsets:
http://www.iozone.org/

I usually use it like so:
mount /dev/mmcblk0p1 /mnt
iozone -az -i0 -i1 -i2 -s 20m -I -f /mnt/foo.test

Both of these are simple to cross-compile and run from an
initramfs on ARM targets.

Then I use Jens Axboe's fio. This is a more complicated beast
intended to generate real-world workloads to emulate the load
on your random Google or Facebook database server or image
cluster or I-don't-know-what.
https://github.com/axboe/fio

It is not super-useful on MMC/SD cards, because the load
will simply bog down everything and your typical embedded
system will start to behave like an updating Android phone
"optimizing applications", which is a known issue caused
by the slowness of eMMC.  It also eats memory quickly and
that way just kills any embedded system with OOM before you
can run any meaningful tests.  But it can spawn any number
of readers & writers and stress out your device very
efficiently if you have enough memory and CPU.  (It is
apparently designed to test systems with lots of memory and
CPU power.)

I mainly used fio on NAS-type devices.
For example, on a Marvell Kirkwood Pogoplug 4 with SATA, I
can do a test like this to test a dm-crypt device-mapper target:

fio --filename=/dev/dm-0 --direct=1 --iodepth=1 --rw=read --bs=64K \
--size=1G --group_reporting --numjobs=1 --name=test_read

> Which I/O scheduler was used when measuring
> performance with the traditional block layer?

I used CFQ, deadline, noop, and of course the BFQ patches.
With BFQ I reproduced the figures reported by Paolo on a
laptop, but since his test cases use fio to stress the system
and eMMC/SD are so slow, I couldn't come up with any good
use case using fio.
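
For reference, this is how I switch legacy-blk schedulers between
runs via sysfs (device name is just an example; "bfq" only shows up
on a BFQ-patched kernel):

cat /sys/block/mmcblk0/queue/scheduler
noop deadline [cfq]
echo bfq > /sys/block/mmcblk0/queue/scheduler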

Any hints on better tests are welcome!
In the kernel logs I only see people doing a lot of dd
tests, which I think is silly; you need more serious
test cases, so it's good if we can build some consensus
there.

What do you guys at SanDisk use?

Yours,
Linus Walleij


* Re: [Ksummit-discuss] [TECH TOPIC] Addressing long-standing high-latency problems related to I/O
  2016-09-16 11:24       ` Linus Walleij
@ 2016-09-16 11:46         ` Arnd Bergmann
  2016-09-16 13:10           ` Paolo Valente
  2016-09-16 13:36           ` Linus Walleij
  2016-09-16 11:53         ` Bart Van Assche
  1 sibling, 2 replies; 20+ messages in thread
From: Arnd Bergmann @ 2016-09-16 11:46 UTC (permalink / raw)
  To: ksummit-discuss
  Cc: Bartlomiej Zolnierkiewicz, ksummit-discuss, Greg KH, Jens Axboe,
	hare, Tejun Heo, Bart Van Assche, osandov, Christoph Hellwig

On Friday, September 16, 2016 1:24:07 PM CEST Linus Walleij wrote:

> It is not super-useful on MMC/SD cards, because the load
> will simply bog down everything and your typical embedded
> system will start to behave like an updating Android phone
> "optimizing applications", which is a known issue caused
> by the slowness of eMMC. It also eats memory quickly and
> that way just kills any embedded system with OOM before you
> can run any meaningful tests. But it can spawn any number
> of readers & writers and stress out your device very
> efficiently if you have enough memory and CPU. (It is
> apparently designed to test systems with lots of memory and
> CPU power.)

I think it's more complex than "the slowness of eMMC": I would
expect that in a read-only scenario, eMMC (or SD cards and
most USB sticks) doesn't do that badly; it may be one order of
magnitude slower than a hard drive, but it doesn't suffer nearly
as much from seeks during reads.

For writes, the situation is completely different on these,
as you can just hit extremely long delays (up to a second) on
a single write whenever the device goes into garbage collection
mode, during which no other I/O is done, and that ends up
stalling any process that is waiting for a read request.
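
(A hedged way to observe those stalls, in case it is useful: log
per-write completion latency with fio and look for the outliers.
The device path below is a hypothetical scratch partition; don't
point this at data you care about.

fio --name=gc-stall-probe --filename=/dev/mmcblk0p2 --direct=1 \
    --ioengine=psync --rw=randwrite --bs=4k --size=64m \
    --write_lat_log=gc-stall

The resulting gc-stall*_clat.log should show mostly short writes,
with occasional spikes of hundreds of milliseconds whenever the
device pauses for garbage collection.)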

> I mainly used fio on NAS-type devices.
> For example, on a Marvell Kirkwood Pogoplug 4 with SATA, I
> can do a test like this to test a dm-crypt device-mapper target:
> 
> fio --filename=/dev/dm-0 --direct=1 --iodepth=1 --rw=read --bs=64K \
> --size=1G --group_reporting --numjobs=1 --name=test_read
> 
> > Which I/O scheduler was used when measuring
> > performance with the traditional block layer?
> 
> I used CFQ, deadline, noop, and of course the BFQ patches.
> With BFQ I reproduced the figures reported by Paolo on a
> laptop, but since his test cases use fio to stress the system
> and eMMC/SD are so slow, I couldn't come up with any good
> use case using fio.
> 
> Any hints on better tests are welcome!
> In the kernel logs I only see people doing a lot of dd
> tests, which I think is silly; you need more serious
> test cases, so it's good if we can build some consensus
> there.

My guess is that the impact of the file system is much greater
than that of the I/O scheduler.  If the file system is well tuned
to the storage device (e.g. f2fs should be near ideal),
you can avoid most of the stalls regardless of the scheduler,
while with file systems that are not aware of flash geometry
at all (e.g. the now-removed ext3 code, especially with
journaling), the scheduler won't be able to help that much
either.

What file system did you use for testing, and which tuning
did you do for your storage devices?

Maybe a better long-term strategy is to improve the important
file systems (ext4, xfs, btrfs) further to work well with
flash storage through blk-mq.
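
For concreteness, a flash-aware setup of the kind I mean could be as
simple as this sketch (the partition is a hypothetical example, and
background_gc=on just spells out f2fs's default garbage-collection
behaviour):

mkfs.f2fs /dev/mmcblk0p1
mount -t f2fs -o background_gc=on /dev/mmcblk0p1 /mnt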

	Arnd


* Re: [Ksummit-discuss] [TECH TOPIC] Addressing long-standing high-latency problems related to I/O
  2016-09-16 11:24       ` Linus Walleij
  2016-09-16 11:46         ` Arnd Bergmann
@ 2016-09-16 11:53         ` Bart Van Assche
  1 sibling, 0 replies; 20+ messages in thread
From: Bart Van Assche @ 2016-09-16 11:53 UTC (permalink / raw)
  To: Linus Walleij
  Cc: Bartlomiej Zolnierkiewicz, ksummit-discuss, Greg KH, Jens Axboe,
	hare, Tejun Heo, osandov, Christoph Hellwig

On 09/16/2016 01:24 PM, Linus Walleij wrote:
> What do you guys at SanDisk use?

Hello Linus,

We use fio for block device performance measurements. Before we run fio 
we disable C-state and P-state transitions to make sure that the results 
will not depend on any frequency scaling algorithm. Furthermore, we 
install a udev rule that sets the following block layer parameters for 
non-rotational devices: add_random=0 and rq_affinity=2.
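
A sketch of what that setup could look like (the rule file name is a
hypothetical example; the sysfs attributes are the standard block
queue ones, and the cpupower commands are one common way to pin
C-states and P-states):

# /etc/udev/rules.d/60-block-tuning.rules
ACTION=="add|change", SUBSYSTEM=="block", ATTR{queue/rotational}=="0", ATTR{queue/add_random}="0", ATTR{queue/rq_affinity}="2"

# before running fio
cpupower frequency-set -g performance
cpupower idle-set -D 0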

Bart.


* Re: [Ksummit-discuss] [TECH TOPIC] Addressing long-standing high-latency problems related to I/O
  2016-09-16 11:46         ` Arnd Bergmann
@ 2016-09-16 13:10           ` Paolo Valente
  2016-09-16 13:36           ` Linus Walleij
  1 sibling, 0 replies; 20+ messages in thread
From: Paolo Valente @ 2016-09-16 13:10 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: ksummit-discuss, Bartlomiej Zolnierkiewicz, ksummit-discuss,
	Greg KH, Jens Axboe, hare, Tejun Heo, Bart Van Assche, osandov,
	Christoph Hellwig


> On 16 Sep 2016, at 13:46, Arnd Bergmann <arnd@arndb.de> wrote:
> 

<snip>

> My guess is that the impact of the file system is much greater
> than that of the I/O scheduler. If the file system is well tuned
> to the storage device (e.g. f2fs should be near ideal),
> you can avoid most of the stalls regardless of the scheduler,
> while with file systems that are not aware of flash geometry
> at all (e.g. the now-removed ext3 code, especially with
> journaling), the scheduler won't be able to help that much
> either.
> 

If I have not misunderstood your guess, then it actually does not
match our results for any test case [1].  More precisely, certain
filesystems do improve performance for certain, or sometimes most,
workloads, but responsiveness, starvation and frame-drop issues remain
basically unchanged.  Per-filesystem results are not reported in [1],
but, if you want, I can reproduce them for the filesystems you
suggest.

In our experience, the fundamental problem is that either

1) The I/O scheduler goes on choosing the wrong I/O requests to
dispatch for a very long time: seconds, minutes, or forever, depending
on the workload and the scheduler.  For example, one tries to start a
new application while one or more files are being copied, and the I/O
requests of the starting application are served very rarely, or not
served at all until the copy is finished.  Then the application takes
a very long time to start, or simply does not start until the copy is
finished.  (The sketch after case 2 below reproduces this scenario.)

or

2) The I/O scheduler does nothing (noop), or does not exist (blk-mq),
so the service order is FIFO plus internal reordering in the storage
device, where internal reordering is most often aimed at maximizing
throughput.  In this case, the problem described for the previous case
usually gets much worse, because any Linux scheduler apart from noop
tends, at least to some extent, to achieve fairness and reduce latency.
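
Here is a minimal sketch of the kind of responsiveness test meant in
case 1 (paths and the timed application are hypothetical stand-ins,
not our actual benchmark scripts from [1]):

# background writer that keeps the device busy
dd if=/dev/zero of=/mnt/test/bigfile bs=1M count=4096 conv=fsync &

# make the application start-up actually hit the disk
sync
echo 3 > /proc/sys/vm/drop_caches

# time a cold application start-up while the write is in flight
time xterm -e /bin/true

wait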

Thanks, Paolo

[1] http://algogroup.unimore.it/people/paolo/disk_sched/results.php

> What file system did you use for testing, and which tuning
> did you do for your storage devices?
> 
> Maybe a better long-term strategy is to improve the important
> file systems (ext4, xfs, btrfs) further to work well with
> flash storage through blk-mq.
> 
> 	Arnd


* Re: [Ksummit-discuss] [TECH TOPIC] Addressing long-standing high-latency problems related to I/O
  2016-09-16 11:46         ` Arnd Bergmann
  2016-09-16 13:10           ` Paolo Valente
@ 2016-09-16 13:36           ` Linus Walleij
  1 sibling, 0 replies; 20+ messages in thread
From: Linus Walleij @ 2016-09-16 13:36 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: ksummit-discuss, Bartlomiej Zolnierkiewicz, ksummit-discuss,
	Greg KH, Jens Axboe, hare, Tejun Heo, Bart Van Assche,
	Omar Sandoval, Christoph Hellwig

On Fri, Sep 16, 2016 at 1:46 PM, Arnd Bergmann <arnd@arndb.de> wrote:

> What file system did you use for testing, and which tuning
> did you do for your storage devices?

These were all with ext4.

Yours,
Linus Walleij


* Re: [Ksummit-discuss] [TECH TOPIC] Addressing long-standing high-latency problems related to I/O
  2016-09-16  8:24 ` Greg KH
  2016-09-16  8:59   ` Linus Walleij
@ 2016-09-16 15:15   ` James Bottomley
  2016-09-16 18:48     ` Paolo Valente
  1 sibling, 1 reply; 20+ messages in thread
From: James Bottomley @ 2016-09-16 15:15 UTC (permalink / raw)
  To: Greg KH, Paolo Valente
  Cc: b.zolnierkie, ksummit-discuss, Jens Axboe, hare, Tejun Heo, osandov, hch

On Fri, 2016-09-16 at 10:24 +0200, Greg KH wrote:
> On Fri, Sep 16, 2016 at 09:55:45AM +0200, Paolo Valente wrote:
> > Linux systems suffer from long-standing high-latency problems, at
> > system and application level, related to I/O.  For example, they
> > usually suffer from poor responsiveness--or even starvation, 
> > depending on the workload--while, e.g., one or more files are being
> > read/written/copied.  On a similar note, background workloads may
> > cause audio/video playback/streaming to stutter, even with long 
> > gaps. A lot of test results on this problem can be found here [1] 
> > (I'm citing only this resource just because I'm familiar with it, 
> > but evidence can be found in countless technical reports, 
> > scientific papers, forum discussions, and so on).
> 
> <snip>
> 
> Isn't this a better topic for the Vault conference, or the storage 
> mini conference?

LSF/MM would be the place to have the technical discussion, yes.  It
will be in Cambridge (MA,USA not the real one) in the Feb/March time
frame in 2017.  Far more of the storage experts (who likely want to
weigh in) will be present.

My understanding of the patch set is that you've only sent it as an RFC
and the main criticism was that it only applied to our legacy
interface, not the new mq one.  You sent out an RFD for ideas around mq
in August, but the main criticism was that your ideas would introduce a
contention point.  Omar Sandoval is also working on something similar
in mq; are you actually talking to him?

James


* Re: [Ksummit-discuss] [TECH TOPIC] Addressing long-standing high-latency problems related to I/O
  2016-09-16 15:15   ` James Bottomley
@ 2016-09-16 18:48     ` Paolo Valente
  2016-09-16 19:36       ` James Bottomley
  0 siblings, 1 reply; 20+ messages in thread
From: Paolo Valente @ 2016-09-16 18:48 UTC (permalink / raw)
  To: James Bottomley
  Cc: Bartlomiej Zolnierkiewicz, ksummit-discuss, Greg KH, Jens Axboe,
	hare, Tejun Heo, osandov, hch


> On 16 Sep 2016, at 17:15, James Bottomley <James.Bottomley@HansenPartnership.com> wrote:
> 
> On Fri, 2016-09-16 at 10:24 +0200, Greg KH wrote:
>> On Fri, Sep 16, 2016 at 09:55:45AM +0200, Paolo Valente wrote:
>>> Linux systems suffer from long-standing high-latency problems, at
>>> system and application level, related to I/O.  For example, they
>>> usually suffer from poor responsiveness--or even starvation, 
>>> depending on the workload--while, e.g., one or more files are being
>>> read/written/copied.  On a similar note, background workloads may
>>> cause audio/video playback/streaming to stutter, even with long 
>>> gaps. A lot of test results on this problem can be found here [1] 
>>> (I'm citing only this resource just because I'm familiar with it, 
>>> but evidence can be found in countless technical reports, 
>>> scientific papers, forum discussions, and so on).
>> 
>> <snip>
>> 
>> Isn't this a better topic for the Vault conference, or the storage 
>> mini conference?
> 
> LSF/MM would be the place to have the technical discussion, yes.  It
> will be in Cambridge (MA,USA not the real one) in the Feb/March time
> frame in 2017.  Far more of the storage experts (who likely want to
> weigh in) will be present.
> 

Perfect venue.  Just it would be a pity IMO to waste the opportunity
of my being at KS with other people working on the components involved
in high-latency issues, and to delay by more months a discussion on
possible solutions.

> My understanding of the patch set is that you've only sent it as an RFC

Actually, in the last submission the RFC tag was gone.

> and the main criticism was that it only applied to our legacy
> interface, not the new mq one.

Yes.  What puzzles me a little bit is that, over these years,
virtually no ack or objection concerned how relevant/irrelevant the
addressed latency problems are, or how effective/ineffective BFQ is in
solving them.

>  You sent out an RFD for ideas around mq
> in August, but the main criticism was that your ideas would introduce a
> contention point.

Yes, that criticism concerned one of my questions: I asked whether io
contexts or something like that could be used for I/O scheduling in
blk-mq.  Since I have just started thinking about possible solutions
to solve effectively latency issues in blk-mq, I'm trying to
understand on what ground they could be based.  Naively, I didn't
realize that io contexts, in their current incarnation, are just
unfeasible in a parallel framework.

>  Omar Sandoval is also working on something similar
> in mq; are you actually talking to him?
> 

One of the purposes of my RFD was exactly to talk with somebody like
Omar.  He did reply providing very useful information.  As of now, my
interaction with Omar consists just in the exchange of emails in that
thread.  That exchange is currently stuck at my last email, sent about
three weeks ago, and containing some considerations and questions
about the information Omar provided me in his email.

Thanks,
Paolo

> James
> 
> 


* Re: [Ksummit-discuss] [TECH TOPIC] Addressing long-standing high-latency problems related to I/O
  2016-09-16 18:48     ` Paolo Valente
@ 2016-09-16 19:36       ` James Bottomley
  2016-09-16 20:13         ` Paolo Valente
                           ` (2 more replies)
  0 siblings, 3 replies; 20+ messages in thread
From: James Bottomley @ 2016-09-16 19:36 UTC (permalink / raw)
  To: Paolo Valente
  Cc: Bartlomiej Zolnierkiewicz, ksummit-discuss, Greg KH, Jens Axboe,
	hare, Tejun Heo, osandov, hch

On Fri, 2016-09-16 at 20:48 +0200, Paolo Valente wrote:
> > On 16 Sep 2016, at 17:15, James Bottomley <
> > James.Bottomley@HansenPartnership.com> wrote:
> > 
> > On Fri, 2016-09-16 at 10:24 +0200, Greg KH wrote:
> > > On Fri, Sep 16, 2016 at 09:55:45AM +0200, Paolo Valente wrote:
> > > > Linux systems suffer from long-standing high-latency problems, 
> > > > at system and application level, related to I/O.  For example,
> > > > they usually suffer from poor responsiveness--or even 
> > > > starvation, depending on the workload--while, e.g., one or more 
> > > > files are being read/written/copied.  On a similar note, 
> > > > background workloads may cause audio/video playback/streaming 
> > > > to stutter, even with long gaps. A lot of test results on this 
> > > > problem can be found here [1] (I'm citing only this resource 
> > > > just because I'm familiar with it, but evidence can be found in 
> > > > countless technical reports, scientific papers, forum
> > > > discussions, and so on).
> > > 
> > > <snip>
> > > 
> > > Isn't this a better topic for the Vault conference, or the 
> > > storage mini conference?
> > 
> > LSF/MM would be the place to have the technical discussion, yes. 
> >  It will be in Cambridge (MA,USA not the real one) in the Feb/March
> > time frame in 2017.  Far more of the storage experts (who likely 
> > want to weigh in) will be present.
> > 
> 
> Perfect venue.  Just it would be a pity IMO to waste the opportunity
> of my being at KS with other people working on the components 
> involved in high-latency issues, and to delay by more months a 
> discussion on possible solutions.

OK, so the problem with a formal discussion of something like this at
KS is that of the 80 or so people in the room, likely only 10 have any
interest whatsoever, leading to intense boredom for the remaining 70.
And for those 10, there were likely another 10 who didn't get invited
who wanted the chance to express an opinion.  Realistically, this is
why we no longer do technical discussions at KS: audience too broad and
not enough specific subject matter experts.

However, nothing says you can't have a discussion in the hallway if
you're already going.

> > My understanding of the patch set is that you've only sent it as an
> > RFC
> 
> Actually, in the last submission the RFC tag was gone.
> 
> > and the main criticism was that it only applied to our legacy
> > interface, not the new mq one.
> 
> Yes.  What puzzles me a little bit is that, over these years,
> virtually no ack or objection concerned how relevant/irrelevant the
> addressed latency problems are, or how effective/ineffective BFQ is 
> in solving them.

Where have you been posting them for years?  I stay pretty close to
block issues, but the first time I actually noticed was when you posted
to linux-block on 1 Feb this year.

> >  You sent out an RFD for ideas around mq in August, but the main 
> > criticism was that your ideas would introduce a contention point.
> 
> Yes, that criticism concerned one of my questions: I asked whether io
> contexts or something like that could be used for I/O scheduling in
> blk-mq.  Since I have just started thinking about possible solutions
> to solve effectively latency issues in blk-mq, I'm trying to
> understand on what ground they could be based.  Naively, I didn't
> realize that io contexts, in their current incarnation, are just
> unfeasible in a parallel framework.

Well, I understand, but you're trying to get the attention of people
who believe nothing now is important except blk-mq ...  I'm afraid it
means you do need to understand and adapt to the new toy.

> >  Omar Sandoval is also working on something similar
>> in mq; are you actually talking to him?
> > 
> 
> One of the purposes of my RFD was exactly to talk with somebody like
> Omar.  He did reply providing very useful information.  As of now, my
> interaction with Omar consists just in the exchange of emails in that
> thread.  That exchange is currently stuck at my last email, sent 
> about three weeks ago, and containing some considerations and 
> questions about the information Omar provided me in his email.

My hazy recollection of Omar from the last LSF/MM is that he's quite a
recent FB developer and he's got quite a lot to do ... he may just need
reminding.

James


* Re: [Ksummit-discuss] [TECH TOPIC] Addressing long-standing high-latency problems related to I/O
  2016-09-16 19:36       ` James Bottomley
@ 2016-09-16 20:13         ` Paolo Valente
  2016-09-19  8:17           ` Jan Kara
  2016-09-17 10:31         ` Linus Walleij
  2016-09-21 13:51         ` Grant Likely
  2 siblings, 1 reply; 20+ messages in thread
From: Paolo Valente @ 2016-09-16 20:13 UTC (permalink / raw)
  To: James Bottomley
  Cc: Bartlomiej Zolnierkiewicz, ksummit-discuss, Greg KH, Jens Axboe,
	hare, Tejun Heo, osandov, Christoph Hellwig


> On 16 Sep 2016, at 21:36, James Bottomley <James.Bottomley@HansenPartnership.com> wrote:
> 
> On Fri, 2016-09-16 at 20:48 +0200, Paolo Valente wrote:
>>> On 16 Sep 2016, at 17:15, James Bottomley <
>>> James.Bottomley@HansenPartnership.com> wrote:
>>> 
>>> On Fri, 2016-09-16 at 10:24 +0200, Greg KH wrote:
>>>> On Fri, Sep 16, 2016 at 09:55:45AM +0200, Paolo Valente wrote:
>>>>> Linux systems suffer from long-standing high-latency problems, 
>>>>> at system and application level, related to I/O.  For example,
>>>>> they usually suffer from poor responsiveness--or even 
>>>>> starvation, depending on the workload--while, e.g., one or more 
>>>>> files are being read/written/copied.  On a similar note, 
>>>>> background workloads may cause audio/video playback/streaming 
>>>>> to stutter, even with long gaps. A lot of test results on this 
>>>>> problem can be found here [1] (I'm citing only this resource 
>>>>> just because I'm familiar with it, but evidence can be found in 
>>>>> countless technical reports, scientific papers, forum
>>>>> discussions, and so on).
>>>> 
>>>> <snip>
>>>> 
>>>> Isn't this a better topic for the Vault conference, or the 
>>>> storage mini conference?
>>> 
>>> LSF/MM would be the place to have the technical discussion, yes. 
>>> It will be in Cambridge (MA,USA not the real one) in the Feb/March
>>> time frame in 2017.  Far more of the storage experts (who likely 
>>> want to weigh in) will be present.
>>> 
>> 
>> Perfect venue.  Just it would be a pity IMO to waste the opportunity
>> of my being at KS with other people working on the components 
>> involved in high-latency issues, and to delay by more months a 
>> discussion on possible solutions.
> 
> OK, so the problem with a formal discussion of something like this at
> KS is that of the 80 or so people in the room, likely only 10 have any
> interest whatsoever, leading to intense boredom for the remaining 70.

No no, that would be scary to me, given the level of the audience!  I
thought it would have been possible to arrange some sort of
sub-discussions with limited groups (although maybe the fact that Linux
still suffers from high latencies might somehow worry all people who
care about the kernel).  I'm sorry, but this will be my first time at KS.

>  
> And for those 10, there were likely another 10 who didn't get invited
> who wanted the chance to express an opinion.  Realistically, this is
> why we no longer do technical discussions at KS: audience too broad and
> not enough specific subject matter experts.
> 
> However, nothing says you can't have a discussion in the hallway if
> you're already going.
> 

Which may be enough to raise more awareness.

>>> My understanding of the patch set is that you've only sent it as an
>>> RFC
>> 
>> Actually, in the last submission the RFC tag was gone.
>> 
>>> and the main criticism was that it only applied to our legacy
>>> interface, not the new mq one.
>> 
>> Yes.  What puzzles me a little bit is that, over these years,
>> virtually no ack or objection concerned how relevant/irrelevant the
>> addressed latency problems are, or how effective/ineffective BFQ is 
>> in solving them.
> 
> Where have you been posting them for years?  I stay pretty close to
> block issues, but the first time I actually noticed was when you posted
> to linux-block on 1 Feb this year.
> 

I had forgotten all the BFQ submissions too :)
(please have a look at the last link in the following list)

After a little search, here are the very first ones:
https://lkml.org/lkml/2008/4/1/234
https://lkml.org/lkml/2008/11/11/148

Then the first one with the new version of BFQ:
https://lkml.org/lkml/2014/5/27/314

After a few other rounds in the last two years, the last one
(which you already saw according to your summary):
https://lkml.org/lkml/2016/8/8/207

And, maybe even more relevant, a very short piece of feedback
highlighting that the problem seems to be still alive, well and serious:
https://lkml.org/lkml/2016/9/9/154

>>> You sent out an RFD for ideas around mq in August, but the main 
>>> criticism was that your ideas would introduce a contention point.
>> 
>> Yes, that criticism concerned one of my questions: I asked whether io
>> contexts or something like that could be used for I/O scheduling in
>> blk-mq.  Since I have just started thinking about possible solutions
>> to solve effectively latency issues in blk-mq, I'm trying to
>> understand on what ground they could be based.  Naively, I didn't
>> realize that io contexts, in their current incarnation, are just
>> unfeasible in a parallel framework.
> 
> Well, I understand, but you're trying to get the attention of people
> who believe nothing now is important except blk-mq ...  I'm afraid it
> means you do need to understand and adapt to the new toy.
> 

I did notice it ...  And we are trying to find a good way to retrofit
strong low-latency guarantees into blk-mq too.

Anyway, a concrete problem is that, if I'm not completely mistaken,
there is a huge number of users and sysadmins who could already enjoy
much better Linux systems while waiting for the transition to blk-mq
to be completed.

>>> Omar Sandoval is also working on something similar
>>> in mq; are you actually talking to him?
>>> 
>> 
>> One of the purposes of my RFD was exactly to talk with somebody like
>> Omar.  He did reply providing very useful information.  As of now, my
>> interaction with Omar consists just in the exchange of emails in that
>> thread.  That exchange is currently stuck at my last email, sent 
>> about three weeks ago, and containing some considerations and 
>> questions about the information Omar provided me in his email.
> 
> My hazy recollection of Omar from the last LSF/MM is that he's quite a
> recent FB developer and he's got quite a lot to do ... he may just need
> reminding.
> 

Then I will follow your advice.

Thank you very much,
Paolo

> James


* Re: [Ksummit-discuss] [TECH TOPIC] Addressing long-standing high-latency problems related to I/O
  2016-09-16 19:36       ` James Bottomley
  2016-09-16 20:13         ` Paolo Valente
@ 2016-09-17 10:31         ` Linus Walleij
  2016-09-21 13:51         ` Grant Likely
  2 siblings, 0 replies; 20+ messages in thread
From: Linus Walleij @ 2016-09-17 10:31 UTC (permalink / raw)
  To: James Bottomley
  Cc: Bartlomiej Zolnierkiewicz, ksummit-discuss, Greg KH, Jens Axboe,
	hare, Tejun Heo, Omar Sandoval, Christoph Hellwig

On Fri, Sep 16, 2016 at 9:36 PM, James Bottomley
<James.Bottomley@hansenpartnership.com> wrote:

> OK, so the problem with a formal discussion of something like this at
> KS is that of the 80 or so people in the room, likely only 10 have any
> interest whatsoever, leading to intense boredom for the remaining 70.

If it is about the semantics of CFQ, BFQ and MQ, yes.  But what is lurking
here is also a social problem, and that needs to be addressed at KS IMO.

Following this from a laid-back bystander position, I recognize patterns
that we saw earlier in the CPU scheduler community, when the fuss was all
about the interactive qualities of the O(1) scheduler vs rotating staircase
and the eventual merge of the (awesome) CFS scheduler.  And the kind of
attention that brought about the (cool) CPU deadline scheduler.

I think the problem is in the original Thomas Kuhnian sense paradigmatic.

That is, loosely defined, what kind of questions should be addressed and
what type of answers could be expected: a set of working assumptions
for the community so that it can make steady progress and not be
disturbed by irrelevant noise.

The block layer people are working inside a paradigm, that is, something
like "use the hardware optimally to maximize throughput".  It is obvious
from things like mq and the fio tool that this is what is perceived as the
problem space it sets out to maneuver in.

Now comes along this Italian who says something totally perpendicular,
like "but I care much more about latency", i.e. how interactive the system
is for a user: start-up time of applications under load, no skipping in the
media players under system load, and things like that.

Well that is no storage cluster use case...

(Modified truth: Paolo actually consulted for a storage provider that did
not provide a certain average throughput, but instead as exact a throughput
rate as possible, which made BFQ fit their use case better.  But you get
the point.)

The usual reaction from people working inside the paradigm to concepts
alien to them will be a series of shrugs and yawns. And that is human.
Don't rock the boat. Sit down. Or even "can't you just take a bigger and
faster nvram disk? Well, I think you will be able to in two years so give
up right now."

It is even more intimidating when he's producing research reports and
measurements and developing repeatable test cases to prove the point.

In the past CPU scheduler debate some people made lame
handwavy arguments as to why this or that scheduler is so much better,
but that is not the case here.  Paolo's tests are very real: scientific,
repeatable, hard measures.

The point is, I suspect that the block layer community is all about
throughput and the talk about latency and interactivity is seen as an
annoying distraction.

Like the kids in the back seat of the car making noise about doing
detours to catch Pokémons while you're in the driving seat, driving to
some perceived important destination.  If you see what I mean.  Their
problem is not really your problem, so you don't care much.  It will be
more "yeah yeah, we'll see about your Pokémons. Someday."

But as in the case with the CPU schedulers, what we risk getting out
there amongst the comments in LWN and Phoronix and sites like that is a
conspiracy theory: that the block layer devs are living in their ivory tower
and not caring about interactivity of Linux and the desktop user experience
and all that old yada-yada we've heard a million times by now.

The point is not even about the Linux desktop, if you ask me.  The point
for me is that for everyone using an Android phone, Linux block layer
interactivity matters: every time an application lags at start-up on a
stressed Android device, and for spurious writes like "optimizing
applications" it's even worse.  (Disclaimer: I represent the embedded,
tablet and handset industry.  I might be tainted.)

The people who think interactivity of the block layer is important to
them want a voice.  And Paolo is there for them, at the KS.  I would take
this opportunity to listen to him, whether formally or informally.

Also, to get the vibe from the kernel developer community at large,
hands down: what matters to us?  Data storage clusters of nvrams, or
embedded eMMC cards in Android phones?  Or both?  Is that even a
question so silly that it should not be asked?  Don't ask me, ask the KS
attendees.  I think it's relevant.

(If nothing else, we already do a good job of kicking up dust on this
mailing list, and we've been told it is actually more important than the
KS itself.)

Yours,
Linus Walleij


* Re: [Ksummit-discuss] [TECH TOPIC] Addressing long-standing high-latency problems related to I/O
  2016-09-16 20:13         ` Paolo Valente
@ 2016-09-19  8:17           ` Jan Kara
  0 siblings, 0 replies; 20+ messages in thread
From: Jan Kara @ 2016-09-19  8:17 UTC (permalink / raw)
  To: Paolo Valente
  Cc: Jens Axboe, Bartlomiej Zolnierkiewicz, ksummit-discuss, Greg KH,
	James Bottomley, hare, Tejun Heo, osandov, Christoph Hellwig

On Fri 16-09-16 22:13:44, Paolo Valente wrote:
> > On 16 Sep 2016, at 21:36, James Bottomley <James.Bottomley@HansenPartnership.com> wrote:
> > 
> > On Fri, 2016-09-16 at 20:48 +0200, Paolo Valente wrote:
> >>> On 16 Sep 2016, at 17:15, James Bottomley <
> >>> James.Bottomley@HansenPartnership.com> wrote:
> >>> 
> >>> On Fri, 2016-09-16 at 10:24 +0200, Greg KH wrote:
> >>>> On Fri, Sep 16, 2016 at 09:55:45AM +0200, Paolo Valente wrote:
> >>>>> Linux systems suffer from long-standing high-latency problems, 
> >>>>> at system and application level, related to I/O.  For example,
> >>>>> they usually suffer from poor responsiveness--or even 
> >>>>> starvation, depending on the workload--while, e.g., one or more 
> >>>>> files are being read/written/copied.  On a similar note, 
> >>>>> background workloads may cause audio/video playback/streaming 
> >>>>> to stutter, even with long gaps. A lot of test results on this 
> >>>>> problem can be found here [1] (I'm citing only this resource 
> >>>>> just because I'm familiar with it, but evidence can be found in 
> >>>>> countless technical reports, scientific papers, forum
> >>>>> discussions, and so on).
> >>>> 
> >>>> <snip>
> >>>> 
> >>>> Isn't this a better topic for the Vault conference, or the 
> >>>> storage mini conference?
> >>> 
> >>> LSF/MM would be the place to have the technical discussion, yes. 
> >>> It will be in Cambridge (MA,USA not the real one) in the Feb/March
> >>> time frame in 2017.  Far more of the storage experts (who likely 
> >>> want to weigh in) will be present.
> >>> 
> >> 
> >> Perfect venue.  Just it would be a pity IMO to waste the opportunity
> >> of my being at KS with other people working on the components 
> >> involved in high-latency issues, and to delay by more months a 
> >> discussion on possible solutions.
> > 
> > OK, so the problem with a formal discussion of something like this at
> > KS is that of the 80 or so people in the room, likely only 10 have any
> > interest whatsoever, leading to intense boredom for the remaining 70.
> 
> No no, that would be scary to me, given the level of the audience!  I
> thought it would have been possible to arrange some sort of
> sub-discussions with limited groups (although maybe the fact that Linux
> still suffers from high latencies might somehow worry all people who
> care about the kernel).  I'm sorry, but this will be my first time at KS.

Yeah, so I'll be at KS and I'd be interested in this discussion.  Actually
I expect to have Jens Axboe and Christoph Hellwig around as well, who are
the biggest blk-mq proponents, so I think the most important people for the
discussion about what the blockers for merging are will be there.

I agree that for a discussion about the details of the scheduling algorithm
LSF/MM is a better venue, but at least for a process discussion about the
conditions under which BFQ is mergeable, KS is OK.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR


* Re: [Ksummit-discuss] [TECH TOPIC] Addressing long-standing high-latency problems related to I/O
  2016-09-16 19:36       ` James Bottomley
  2016-09-16 20:13         ` Paolo Valente
  2016-09-17 10:31         ` Linus Walleij
@ 2016-09-21 13:51         ` Grant Likely
  2 siblings, 0 replies; 20+ messages in thread
From: Grant Likely @ 2016-09-21 13:51 UTC (permalink / raw)
  To: James Bottomley
  Cc: Bartlomiej Zolnierkiewicz, ksummit-discuss, Greg KH, Jens Axboe,
	Hannes Reinecke, Tejun Heo, osandov, Christoph Hellwig

On Fri, Sep 16, 2016 at 8:36 PM, James Bottomley
<James.Bottomley@hansenpartnership.com> wrote:
> On Fri, 2016-09-16 at 20:48 +0200, Paolo Valente wrote:
>> > On 16 Sep 2016, at 17:15, James Bottomley <
>> > James.Bottomley@HansenPartnership.com> wrote:
>> >
>> > On Fri, 2016-09-16 at 10:24 +0200, Greg KH wrote:
>> > > On Fri, Sep 16, 2016 at 09:55:45AM +0200, Paolo Valente wrote:
>> > > > Linux systems suffer from long-standing high-latency problems,
>> > > > at system and application level, related to I/O.  For example,
>> > > > they usually suffer from poor responsiveness--or even
>> > > > starvation, depending on the workload--while, e.g., one or more
>> > > > files are being read/written/copied.  On a similar note,
>> > > > background workloads may cause audio/video playback/streaming
>> > > > to stutter, even with long gaps. A lot of test results on this
>> > > > problem can be found here [1] (I'm citing only this resource
>> > > > just because I'm familiar with it, but evidence can be found in
>> > > > countless technical reports, scientific papers, forum
>> > > > discussions, and so on).
>> > >
>> > > <snip>
>> > >
>> > > Isn't this a better topic for the Vault conference, or the
>> > > storage mini conference?
>> >
>> > LSF/MM would be the place to have the technical discussion, yes.
>> >  It will be in Cambridge (MA,USA not the real one) in the Feb/March
>> > time frame in 2017.  Far more of the storage experts (who likely
>> > want to weigh in) will be present.
>> >
>>
>> Perfect venue.  Just it would be a pity IMO to waste the opportunity
>> of my being at KS with other people working on the components
>> involved in high-latency issues, and to delay by more months a
>> discussion on possible solutions.
>
> OK, so the problem with a formal discussion of something like this at
> KS is that of the 80 or so people in the room, likely only 10 have any
> interest whatsoever, leading to intense boredom for the remaining 70.
>  And for those 10, there were likely another 10 who didn't get invited
> who wanted the chance to express an opinion.  Realistically, this is
> why we no longer do technical discussions at KS: audience too broad and
> not enough specific subject matter experts.
>
> However, nothing says you can't have a discussion in the hallway if
> you're already going.

Maybe we can set aside a slot or two for smaller-scale BoF sessions?
If there are other topics in this vein, it would be good to have a
list of them ahead of time.  I've been considering a BoF related to our
device model, for example.

g.


* Re: [Ksummit-discuss] [TECH TOPIC] Addressing long-standing high-latency problems related to I/O
  2016-09-16  7:55 [Ksummit-discuss] [TECH TOPIC] Addressing long-standing high-latency problems related to I/O Paolo Valente
  2016-09-16  8:24 ` Greg KH
@ 2016-09-21 14:30 ` Bart Van Assche
  2016-09-21 14:37   ` Paolo Valente
  1 sibling, 1 reply; 20+ messages in thread
From: Bart Van Assche @ 2016-09-21 14:30 UTC (permalink / raw)
  To: Paolo Valente, ksummit-discuss
  Cc: b.zolnierkie, Jens Axboe, hare, Tejun Heo, osandov, hch

On 09/16/16 00:55, Paolo Valente wrote:
> Linux systems suffer from long-standing high-latency problems, at
> system and application level, related to I/O.

Hello Paolo,

Are you aware of Jens' throttled background buffered writeback work? If 
not, can you repeat your measurements against a kernel on which these 
patches have been applied? See also 
http://www.spinics.net/lists/linux-fsdevel/msg101391.html.

Thanks,

Bart.


* Re: [Ksummit-discuss] [TECH TOPIC] Addressing long-standing high-latency problems related to I/O
  2016-09-21 14:30 ` Bart Van Assche
@ 2016-09-21 14:37   ` Paolo Valente
  0 siblings, 0 replies; 20+ messages in thread
From: Paolo Valente @ 2016-09-21 14:37 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Bartlomiej Zolnierkiewicz, ksummit-discuss, Jens Axboe, hare,
	Tejun Heo, osandov, hch


> On 21 Sep 2016, at 16:30, Bart Van Assche <bart.vanassche@sandisk.com> wrote:
> 
> On 09/16/16 00:55, Paolo Valente wrote:
>> Linux systems suffer from long-standing high-latency problems, at
>> system and application level, related to I/O.
> 
> Hello Paolo,
> 

Hi

> Are you aware of Jens' throttled background buffered writeback work? If not, can you repeat your measurements against a kernel on which these patches have been applied?

Already done (see below).

> See also http://www.spinics.net/lists/linux-fsdevel/msg101391.html.
> 

A brief report of the outcome of my measurements is in this email of mine on the same thread:
http://www.spinics.net/lists/linux-fsdevel/msg101430.html

In short, application start-up time happened to be about the same with and without writeback throttling (actually slightly higher with throttling, for, e.g., gnome-terminal).

Thanks,
Paolo

> Thanks,
> 
> Bart.
> 


* Re: [Ksummit-discuss] [TECH TOPIC] Addressing long-standing high-latency problems related to I/O
  2016-09-16  8:59   ` Linus Walleij
  2016-09-16  9:10     ` Bart Van Assche
@ 2016-09-22  9:18     ` Ulf Hansson
  2016-09-22 11:06       ` Linus Walleij
  1 sibling, 1 reply; 20+ messages in thread
From: Ulf Hansson @ 2016-09-22  9:18 UTC (permalink / raw)
  To: Linus Walleij
  Cc: Bartlomiej Zolnierkiewicz, ksummit-discuss, Greg KH, Jens Axboe,
	hare, Tejun Heo, osandov, Christoph Hellwig

On 16 September 2016 at 10:59, Linus Walleij <linus.walleij@linaro.org> wrote:
> On Fri, Sep 16, 2016 at 10:24 AM, Greg KH <gregkh@linuxfoundation.org> wrote:
>> On Fri, Sep 16, 2016 at 09:55:45AM +0200, Paolo Valente wrote:
>>> Linux systems suffer from long-standing high-latency problems, at
>>> system and application level, related to I/O.  For example, they
>>> usually suffer from poor responsiveness--or even starvation, depending
>>> on the workload--while, e.g., one or more files are being
>>> read/written/copied.  On a similar note, background workloads may
>>> cause audio/video playback/streaming to stutter, even with long gaps.
>>> A lot of test results on this problem can be found here [1] (I'm
>>> citing only this resource just because I'm familiar with it, but
>>> evidence can be found in countless technical reports, scientific
>>> papers, forum discussions, and so on).
>>
>> <snip>
>>
>> Isn't this a better topic for the Vault conference, or the storage mini
>> conference?
>
> Paolo was invited to the kernel summit and I guess so are the
> core block maintainers: Jens, Tejun, Christoph. The right people are
> there so why not take the opportunity.
>
> If for nothing else just have a formal chat.

Whatever form works for me!  Although I may only join from Tuesday, as I
will be at LPC.

>
> Overall I personally think the most KS-related discussion would be
> to address the problems Paolo has had to break into the block layer
> development community and the conflicting responses to the patch
> sets, which generated a few flak comments under the last LWN
> article:
> http://lwn.net/Articles/674308/
>
> The main problem is that, unlike some random driver, this cannot
> be put into staging, and adding it as a secondary (or tertiary or
> whatever) scheduling policy in block/* was explicitly nixed.
>
> AFAICT there is no clear answer from the block maintainers
> regarding:
>
> - Is the old blk layer deprecated or not? Christoph seems to
>   say "yes, forget it, work on mq", but I am still unsure about Jens
>   and Tejuns positions here. Would be nice with some consensus.
>   If it is deprecated it would make sense not to merge any new
>   code using it, right?
>
> - When is an all-out transition to mq really going to happen?
>   "When it's ready and all blk consumers are migrated" is a good
>   answer, but pretty unhelpful for developers like Paolo.
>   Can we get a clearer picture?
>
> - What will subsystems (especially my pet peeve about MMC/SD
>   which is single-queue by nature) that experience a performance
>   regression with a switch to mq do? Not switch until mq has a
>   scheduling policy? Switch and suck up the performance regression,
>   multiplied by the number of Android handheld devices on the
>   planet?

With my MMC hat on, I would of course appreciate reaching a consensus
about the three topics above.
To me, the KS seems like a very good opportunity to meet and discuss
this, especially since it seems like many important stakeholders will
be there.

>
> I only have handwavy arguments about the latter being the
> case, which is why I'm working on a patch to MMC/SD to
> switch to mq as an RFT. It's taking some time though; alas,
> I'm not very smart.

I appreciate this! I don't expect it to be easy, as you would probably
have to rip out most of the mmc block/core code related to request
management.

For example, I guess the asynchronous request mechanism doesn't really
fit into blk-mq, does it?

Kind regards
Ulf Hansson


* Re: [Ksummit-discuss] [TECH TOPIC] Addressing long-standing high-latency problems related to I/O
  2016-09-22  9:18     ` Ulf Hansson
@ 2016-09-22 11:06       ` Linus Walleij
  0 siblings, 0 replies; 20+ messages in thread
From: Linus Walleij @ 2016-09-22 11:06 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: Bartlomiej Zolnierkiewicz, ksummit-discuss, Greg KH, Jens Axboe,
	hare, Tejun Heo, Omar Sandoval, Christoph Hellwig

On Thu, Sep 22, 2016 at 11:18 AM, Ulf Hansson <ulf.hansson@linaro.org> wrote:
> On 16 September 2016 at 10:59, Linus Walleij <linus.walleij@linaro.org> wrote:

>> I only have handwavy arguments about the latter being the
>> case, which is why I'm working on a patch to MMC/SD to
>> switch to mq as an RFT. It's taking some time though; alas,
>> I'm not very smart.
>
> I appreciate this! I don't expect it to be easy, as you would probably
> have to rip out most of the mmc block/core code related to request
> management.
>
> For example, I guess the asynchronous request mechanism doesn't really
> fit into blk-mq, does it?

Nopes. I have no idea how to make that work.

I got blk-mq running for MMC/SD today and I see a gross performance
regression, from 37 MB/s to 27 MB/s on Ux500 7.38 GB eMMC
with a simple dd test:

BEFORE switching to MQ:

time dd if=/dev/mmcblk3 of=/dev/null bs=1M count=1024
1073741824 bytes (1.0GB) copied, 27.530335 seconds, 37.2MB/s
real    0m 27.54s
user    0m 0.02s
sys     0m 7.56s

AFTER switching to MQ:

time dd if=/dev/mmcblk3 of=/dev/null bs=1M count=1024
1073741824 bytes (1.0GB) copied, 37.170990 seconds, 27.5MB/s
real    0m 37.18s
user    0m 0.02s
sys     0m 7.32s

I will however post my hacky patch as an RFD to the blockdevs and
the block maintainers, along with the numbers and some speculation
about what may be causing it.  Asynchronous requests (request
pipelining) are one thing; another is front/back merging in
the block layer, I guess.

I think I should give the blkdevs the opportunity to tell me
off for all the stupid ways in which I should *not* be using MQ
before we draw any conclusions from this...

Yours,
Linus Walleij

