[PATCH 0/5] cfq-iosched: improve latency for no-idle queues (v3)


* [PATCH 0/5] cfq-iosched: improve latency for no-idle queues (v3)
@ 2009-10-26 21:43 Corrado Zoccolo
  2009-10-28  8:27 ` Jens Axboe
       [not found] ` <x49zl7c268s.fsf@segfault.boston.devel.redhat.com>
  0 siblings, 2 replies; 8+ messages in thread
From: Corrado Zoccolo @ 2009-10-26 21:43 UTC (permalink / raw)
  To: Linux-Kernel, Jens Axboe, Jeff Moyer

[rebased on top of Jeff's latest changes for 2.6.33. Various code style improvements over v1 & v2]

This patch series is intended to improve I/O latency, addressing an often
neglected, important subset of workloads: the ones for which cfq currently
prefers not to do any idling.

Those are the ones that would benefit most from low latency. They are any of:
* processes with large think times (e.g. interactive ones like file managers)
* seeky processes (e.g. programs faulting in their code at startup)
* processes marked as no-idle from upper levels.

The patch series addresses this by:
* reducing queues' timeslice when many queues have pending I/O
* separating queues with different priorities and different characteristics in
different service trees, each with an allocated time slice
* enabling idling when switching between service trees, even for queues that
would not have idling enabled otherwise.

This provides various benefits:
* service tree insertion code is simplified, since it doesn't need to cope with
priorities any more.
* high priority no_idle queues are no longer penalized when competing with
lower priority, idling queues
* seeky and no_idle queues get their fair share of disk time without
penalizing NCQ drives' performance, since they can all dispatch together,
filling up the available NCQ slots.

On a non-NCQ capable drive, with a workload of 4 random readers competing with
a sequential writer, the maximum latency experienced by readers decreased from
> 500ms to about 160ms.

^ permalink raw reply	[flat|nested] 8+ messages in thread
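
The two main ideas in the cover letter above (one service tree per priority and
workload type, and a shorter time slice when many queues are busy) can be
sketched in userspace C. This is only an illustration under stated assumptions,
not the code from the patch series: every `demo_*` name, the tree layout, and
the scaling rule are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical names throughout: a userspace sketch, not the patch's code. */
enum demo_prio { DEMO_PRIO_IDLE, DEMO_PRIO_BE, DEMO_PRIO_RT, DEMO_PRIO_COUNT };
enum demo_type { DEMO_TYPE_ASYNC, DEMO_TYPE_SYNC_NOIDLE, DEMO_TYPE_SYNC,
		 DEMO_TYPE_COUNT };

struct demo_queue {
	enum demo_prio prio;
	bool sync;	/* queue issues synchronous I/O */
	bool idle_ok;	/* scheduler would idle waiting for this queue */
};

/*
 * Sync queues that would not be idled on (seeky, large think time, or
 * marked no-idle) are grouped together so they can dispatch together.
 */
static enum demo_type demo_queue_type(const struct demo_queue *q)
{
	if (!q->sync)
		return DEMO_TYPE_ASYNC;
	return q->idle_ok ? DEMO_TYPE_SYNC : DEMO_TYPE_SYNC_NOIDLE;
}

/* One tree per (priority, type) pair, so insertion ignores priorities. */
static int demo_tree_index(const struct demo_queue *q)
{
	return (int)q->prio * DEMO_TYPE_COUNT + (int)demo_queue_type(q);
}

/*
 * Shrink the per-queue slice when serving every busy queue for a full
 * slice would exceed target_latency. Assumes min_slice <= base_slice.
 */
static unsigned demo_scaled_slice(unsigned base_slice, unsigned busy_queues,
				  unsigned target_latency, unsigned min_slice)
{
	unsigned slice = base_slice;

	if (busy_queues > 0 && base_slice * busy_queues > target_latency) {
		slice = target_latency / busy_queues;
		if (slice < min_slice)
			slice = min_slice;
	}
	return slice;
}
```

With a base slice of 100 and a target latency of 300 (arbitrary units), two
busy queues keep the full slice while six busy queues get 50 each, which is the
kind of bound on worst-case wait the series is after.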

* Re: [PATCH 0/5] cfq-iosched: improve latency for no-idle queues (v3)
  2009-10-26 21:43 [PATCH 0/5] cfq-iosched: improve latency for no-idle queues (v3) Corrado Zoccolo
@ 2009-10-28  8:27 ` Jens Axboe
       [not found] ` <x49zl7c268s.fsf@segfault.boston.devel.redhat.com>
  1 sibling, 0 replies; 8+ messages in thread
From: Jens Axboe @ 2009-10-28  8:27 UTC (permalink / raw)
  To: Corrado Zoccolo; +Cc: Linux-Kernel, Jeff Moyer

On Mon, Oct 26 2009, Corrado Zoccolo wrote:
> [rebased on top of Jeff's latest changes for 2.6.33. Various code style improvements over v1 & v2]
> 
> This patch series is intended to improve I/O latency, addressing an often
> neglected, important subset of workloads: the ones for which cfq currently
> prefers not to do any idling.
> 
> Those are the ones that would benefit most from having low latency, in fact
> they are any of:
> * processes with large think times (e.g. interactive ones like file managers)
> * seeky (e.g. programs faulting in their code at startup)
> * or marked as no-idle from upper levels.
> 
> The patch series addresses this by:
> * reducing queues' timeslice when many queues have pending I/O
> * separating queues with different priorities and different characteristics in
> different service trees, each with an allocated time slice
> * enable idling when switching between service trees, even for queues that
> would not have idling enabled otherwise.
> 
> This provides various benefits:
> * service tree insertion code is simplified, since it doesn't need to cope with
> priorities any more.
> * high priority no_idle queues are no longer penalized when competing with
> lower priority, idling queues
> * seeky and no_idle queues have their fair share of disk time, without
> penalizing NCQ drives' performances, since they can all dispatch together,
> filling up the available NCQ slots.
> 
> On a non-NCQ capable drive, a workload of 4 random readers competing with
> sequential writer, the maximum latency experienced by readers decreased from >
> 500ms to about 160ms.

Thanks Corrado, this is indeed good stuff. Only style issue left was the
one in cfq_get_avg_queues(), I just corrected that manually.

I have committed this in a test branch based off for-2.6.33 and will do
some testing with it, then merge it into for-2.6.33 if it looks good.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Fwd: [PATCH 0/5] cfq-iosched: improve latency for no-idle queues (v3)
       [not found]             ` <4e5e476b0911030719m425c208cg311f44a91fad8342@mail.gmail.com>
@ 2009-11-03 18:35               ` Corrado Zoccolo
  2009-11-03 20:18                 ` Jens Axboe
  0 siblings, 1 reply; 8+ messages in thread
From: Corrado Zoccolo @ 2009-11-03 18:35 UTC (permalink / raw)
  To: Jens Axboe, Linux-Kernel, Jeff Moyer

Hi Jens,
Jeff did some testing of this patchset on his NCQ-enabled SSD (the
30GB OCZ Vertex).
The test suite contained various scenarios with multiple competing
workloads, and was run on the for-2.6.33 and cfq-2.6.33 branches.

Max latencies were reduced in most cases, and we also saw bandwidth
improvements in some scenarios, especially for multiple random readers,
either alone or competing with writes.
For 2 random readers, aggregate bw increased from 48356 to 74205.
For 4 random readers vs 1 seq writer:
* aggregate reader bw increased from 35242 to 56400
* writer bandwidth increased from 33269 to 55127
* maximum latency on read decreased from 535 to 324
* maximum latency on writes decreased from 22243 to 1153
It's a win on all measures.
The effect of increasing the number of readers to 32 (latency_test_2.fio)
is even more visible (max read latency reduced from 3305 to 268,
aggregated read BW increased from 32894 to 164571).

The only case where I see an increased max latency is for 2 random
readers vs 1 seq reader:

for-2.6.33:
randomread.0: read_bw = 15,418K
randomread.1: read_bw = 15,399K
seqread: read_bw = 409K
0: read_bw = 31226
0: read_lat_max = 11.589
0: read_lat_avg = 3.22366666666667

cfq-2.6.33:
randomread.0: read_bw = 10,065K
randomread.1: read_bw = 10,067K
seqread: read_bw = 101M
0: read_bw = 121132
0: read_lat_max = 303
0: read_lat_avg = 0.282333333333333

but here the increased latency is paid back by a large increase in
sequential read BW (the max latency is, btw, experienced by the seq
reader, so I think it is a fair behaviour).

Jeff observed that the for-2.6.33 numbers were worse than his baseline
runs, probably due to changed hw_tag detection.
My patchset is much less sensitive to hw_tag on SSDs (since there are
far fewer situations in which it would idle), so my numbers are
unaffected.

Corrado

^ permalink raw reply	[flat|nested] 8+ messages in thread
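
To put the figures reported above on a common scale, the relative change of
each before/after pair can be computed with a small helper. This is a sketch
for the reader, not part of the test suite used in the thread; the function
names are hypothetical, and the numbers are copied from the message as posted,
without assuming units.

```c
#include <stdio.h>

/* Relative change from 'before' to 'after', in percent. */
static double percent_change(double before, double after)
{
	return (after - before) / before * 100.0;
}

/* Print one labelled before/after pair with its relative change. */
static void print_change(const char *label, double before, double after)
{
	printf("%-36s %8.0f -> %8.0f  (%+.1f%%)\n",
	       label, before, after, percent_change(before, after));
}

/* Pairs taken from the message above (for-2.6.33 vs cfq-2.6.33). */
static void print_reported_changes(void)
{
	print_change("2 rand readers, aggregate bw", 48356, 74205);
	print_change("4 rand vs 1 seq wr, reader bw", 35242, 56400);
	print_change("4 rand vs 1 seq wr, writer bw", 33269, 55127);
	print_change("4 rand vs 1 seq wr, max read lat", 535, 324);
	print_change("4 rand vs 1 seq wr, max write lat", 22243, 1153);
	print_change("32 readers, max read lat", 3305, 268);
	print_change("32 readers, aggregate read bw", 32894, 164571);
}
```

Calling `print_reported_changes()` from a `main()` prints one line per pair;
positive percentages are increases and negative ones are reductions.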

* Re: Fwd: [PATCH 0/5] cfq-iosched: improve latency for no-idle queues (v3)
  2009-11-03 18:35               ` Fwd: " Corrado Zoccolo
@ 2009-11-03 20:18                 ` Jens Axboe
  2009-11-03 20:26                   ` Jeff Moyer
  0 siblings, 1 reply; 8+ messages in thread
From: Jens Axboe @ 2009-11-03 20:18 UTC (permalink / raw)
  To: Corrado Zoccolo; +Cc: Linux-Kernel, Jeff Moyer

On Tue, Nov 03 2009, Corrado Zoccolo wrote:
> Hi Jens,
> Jeff did some testing of this patchset on his NCQ-enabled SSD (the
> 30GB OCZ Vertex).
> The test suite contained various multiple competing workloads
> scenarios, and was run on for-2.6.33 and cfq-2.6.33 branches.
> 
> Max latencies were reduced in most cases, and we had also improvements
> on bandwidth side in some scenarios, especially
> for multiple random readers, either alone or competing with writes.
> 2 random readers aggregate bw increased from 48356 to 74205
> and 4 random readers vs 1 seq writer:
> * aggregate reader bw increased from 35242 to 56400
> * writer bandwidth increased from 33269 to 55127
> * maximum latency on read decreased from 535 to 324
> * maximum latency on writes decreased from 22243 to 1153
> It's a win on all measures.
> The effect increasing the number of readers to 32 (latency_test_2.fio)
> is even more visible (max read latency reduced from 3305 to 268,
> aggregated read BW increased from 32894 to 164571).
> 
> The only case where I see an increased max latency is for 2 random
> readers vs 1 seq reader:
> 
> for-2.6.33:
> randomread.0: read_bw = 15,418K
> randomread.1: read_bw = 15,399K
> seqread: read_bw = 409K
> 0: read_bw = 31226
> 0: read_lat_max = 11.589
> 0: read_lat_avg = 3.22366666666667
> 
> cfq-2.6.33:
> randomread.0: read_bw = 10,065K
> randomread.1: read_bw = 10,067K
> seqread: read_bw = 101M
> 0: read_bw = 121132
> 0: read_lat_max = 303
> 0: read_lat_avg = 0.282333333333333
> 
> but here the increased latency is paid back by a large increase in
> sequential read BW (the max latency is, btw, experienced by the seq
> reader, so I think it is a fair behaviour).
> 
> Jeff observed that the for-2.6.33 numbers were worse than his baseline
> runs, probably due to changed hw_tag detection.
> My patchset is much less sensible to hw_tag on SSDs (since there are
> much less situations in which it would idle), so my numbers are
> unaffected.

Thanks a lot for your testing. My testing on cfq-2.6.33 looks good too,
so I pulled it into for-2.6.33 today.

Since for-linus contains conflicting changes, can you and Jeff please
double check that everything is still in order? The interesting bit here
is the merge with for-2.6.33 and the coop limit from Shaohua Li. I did
the straightforward merge, but we likely just need to drop that logic
since the coop concept is radically different given that we merge and
break queues in for-2.6.33.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Fwd: [PATCH 0/5] cfq-iosched: improve latency for no-idle queues (v3)
  2009-11-03 20:18                 ` Jens Axboe
@ 2009-11-03 20:26                   ` Jeff Moyer
  2009-11-03 20:28                     ` Jens Axboe
  0 siblings, 1 reply; 8+ messages in thread
From: Jeff Moyer @ 2009-11-03 20:26 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Corrado Zoccolo, Linux-Kernel

Jens Axboe <jens.axboe@oracle.com> writes:

> Since for-linus contains conflicting changes, can you and Jeff please
> double check that everything is still in order? The interesting bit here
> is the merge with for-2.6.33 and the coop limit from Shaohua Li. I did
> the straight forward merge, but we likely just need to drop that logic
> since the coop concept is radically different given that we merge and
> break queues in for-2.6.33.

Yeah, since I changed the meaning of the cfqq_coop flag, a lot of those
tests are just plain wrong.  Let me play with it and I'll send you an
incremental patch in a bit.

Cheers,
Jeff

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Fwd: [PATCH 0/5] cfq-iosched: improve latency for no-idle queues (v3)
  2009-11-03 20:26                   ` Jeff Moyer
@ 2009-11-03 20:28                     ` Jens Axboe
  2009-11-03 22:00                       ` Jeff Moyer
  0 siblings, 1 reply; 8+ messages in thread
From: Jens Axboe @ 2009-11-03 20:28 UTC (permalink / raw)
  To: Jeff Moyer; +Cc: Corrado Zoccolo, Linux-Kernel

On Tue, Nov 03 2009, Jeff Moyer wrote:
> Jens Axboe <jens.axboe@oracle.com> writes:
> 
> > Since for-linus contains conflicting changes, can you and Jeff please
> > double check that everything is still in order? The interesting bit here
> > is the merge with for-2.6.33 and the coop limit from Shaohua Li. I did
> > the straight forward merge, but we likely just need to drop that logic
> > since the coop concept is radically different given that we merge and
> > break queues in for-2.6.33.
> 
> Yeah, since I changed the meaning of the cfqq_coop flag, a lot of those
> tests are just plain wrong.  Let me play with it and I'll send you an
> incremental patch in a bit.

Thanks, here's what I have. It's basically a revert of the commit in
question.

diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
index b700f41..4ab240c 100644
--- a/block/cfq-iosched.c
+++ b/block/cfq-iosched.c
@@ -253,7 +253,6 @@ enum cfqq_state_flags {
 	CFQ_CFQQ_FLAG_slice_new,	/* no requests dispatched in slice */
 	CFQ_CFQQ_FLAG_sync,		/* synchronous queue */
 	CFQ_CFQQ_FLAG_coop,		/* cfqq is shared */
-	CFQ_CFQQ_FLAG_coop_preempt,	/* coop preempt */
 };
 
 #define CFQ_CFQQ_FNS(name)						\
@@ -280,7 +279,6 @@ CFQ_CFQQ_FNS(prio_changed);
 CFQ_CFQQ_FNS(slice_new);
 CFQ_CFQQ_FNS(sync);
 CFQ_CFQQ_FNS(coop);
-CFQ_CFQQ_FNS(coop_preempt);
 #undef CFQ_CFQQ_FNS
 
 #define cfq_log_cfqq(cfqd, cfqq, fmt, args...)	\
@@ -1070,16 +1068,9 @@ static struct cfq_queue *cfq_get_next_queue(struct cfq_data *cfqd)
 static struct cfq_queue *cfq_set_active_queue(struct cfq_data *cfqd,
 					      struct cfq_queue *cfqq)
 {
-	if (!cfqq) {
+	if (!cfqq)
 		cfqq = cfq_get_next_queue(cfqd);
 
-		if (cfqq && !cfq_cfqq_coop_preempt(cfqq))
-			cfq_clear_cfqq_coop(cfqq);
-	}
-
-	if (cfqq)
-		cfq_clear_cfqq_coop_preempt(cfqq);
-
 	__cfq_set_active_queue(cfqd, cfqq);
 	return cfqq;
 }
@@ -2433,16 +2424,8 @@ cfq_should_preempt(struct cfq_data *cfqd, struct cfq_queue *new_cfqq,
 	 * if this request is as-good as one we would expect from the
 	 * current cfqq, let it preempt
 	 */
-	if (cfq_rq_close(cfqd, cfqq, rq) && (!cfq_cfqq_coop(new_cfqq) ||
-	    cfqd->busy_queues == 1)) {
-		/*
-		 * Mark new queue coop_preempt, so its coop flag will not be
-		 * cleared when new queue gets scheduled at the very first time
-		 */
-		cfq_mark_cfqq_coop_preempt(new_cfqq);
-		cfq_mark_cfqq_coop(new_cfqq);
+	if (cfq_rq_close(cfqd, cfqq, rq))
 		return true;
-	}
 
 	return false;
 }

-- 
Jens Axboe


^ permalink raw reply related	[flat|nested] 8+ messages in thread
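
The diff above removes both the `CFQ_CFQQ_FLAG_coop_preempt` enum entry and the
matching `CFQ_CFQQ_FNS(coop_preempt)` line, because each invocation of that
macro generates the mark/clear/test helpers for one flag (the removed code
calls `cfq_mark_cfqq_coop_preempt()`, `cfq_clear_cfqq_coop_preempt()` and
`cfq_cfqq_coop_preempt()`). The pattern can be sketched in standalone C as
follows; the `demo_*` names and the struct are hypothetical stand-ins, not the
kernel's definitions.

```c
/* Hypothetical userspace sketch of a per-flag helper generator. */
enum demo_state_flags {
	DEMO_FLAG_sync,		/* synchronous queue */
	DEMO_FLAG_coop,		/* queue is shared */
};

struct demo_fq {
	unsigned int flags;
};

/*
 * For a flag named X this expands to three helpers:
 * demo_mark_X(), demo_clear_X() and demo_X() (the test).
 */
#define DEMO_FNS(name)							\
static inline void demo_mark_##name(struct demo_fq *q)			\
{									\
	q->flags |= (1U << DEMO_FLAG_##name);				\
}									\
static inline void demo_clear_##name(struct demo_fq *q)			\
{									\
	q->flags &= ~(1U << DEMO_FLAG_##name);				\
}									\
static inline int demo_##name(const struct demo_fq *q)			\
{									\
	return (q->flags & (1U << DEMO_FLAG_##name)) != 0;		\
}

DEMO_FNS(sync)
DEMO_FNS(coop)
#undef DEMO_FNS
```

Dropping a flag therefore means deleting its enum entry, its generator line,
and every caller of the three generated helpers, which is the shape of the
revert shown in the diff.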

* Re: Fwd: [PATCH 0/5] cfq-iosched: improve latency for no-idle queues (v3)
  2009-11-03 20:28                     ` Jens Axboe
@ 2009-11-03 22:00                       ` Jeff Moyer
  2009-11-04  7:51                         ` Jens Axboe
  0 siblings, 1 reply; 8+ messages in thread
From: Jeff Moyer @ 2009-11-03 22:00 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Corrado Zoccolo, Linux-Kernel

Jens Axboe <jens.axboe@oracle.com> writes:

> On Tue, Nov 03 2009, Jeff Moyer wrote:
>> Jens Axboe <jens.axboe@oracle.com> writes:
>> 
>> > Since for-linus contains conflicting changes, can you and Jeff please
>> > double check that everything is still in order? The interesting bit here
>> > is the merge with for-2.6.33 and the coop limit from Shaohua Li. I did
>> > the straight forward merge, but we likely just need to drop that logic
>> > since the coop concept is radically different given that we merge and
>> > break queues in for-2.6.33.
>> 
>> Yeah, since I changed the meaning of the cfqq_coop flag, a lot of those
>> tests are just plain wrong.  Let me play with it and I'll send you an
>> incremental patch in a bit.
>
> Thanks, here's what I have. It's basically a revert of the commit in
> question.

Your patch looks like a straightforward revert.  I still think we need
some guards in place, though.  For now, I think we can go with what you
have, and I'll come up with some other mechanism to deal with this case.

Cheers,
Jeff

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Fwd: [PATCH 0/5] cfq-iosched: improve latency for no-idle queues (v3)
  2009-11-03 22:00                       ` Jeff Moyer
@ 2009-11-04  7:51                         ` Jens Axboe
  0 siblings, 0 replies; 8+ messages in thread
From: Jens Axboe @ 2009-11-04  7:51 UTC (permalink / raw)
  To: Jeff Moyer; +Cc: Corrado Zoccolo, Linux-Kernel

On Tue, Nov 03 2009, Jeff Moyer wrote:
> Jens Axboe <jens.axboe@oracle.com> writes:
> 
> > On Tue, Nov 03 2009, Jeff Moyer wrote:
> >> Jens Axboe <jens.axboe@oracle.com> writes:
> >> 
> >> > Since for-linus contains conflicting changes, can you and Jeff please
> >> > double check that everything is still in order? The interesting bit here
> >> > is the merge with for-2.6.33 and the coop limit from Shaohua Li. I did
> >> > the straight forward merge, but we likely just need to drop that logic
> >> > since the coop concept is radically different given that we merge and
> >> > break queues in for-2.6.33.
> >> 
> >> Yeah, since I changed the meaning of the cfqq_coop flag, a lot of those
> >> tests are just plain wrong.  Let me play with it and I'll send you an
> >> incremental patch in a bit.
> >
> > Thanks, here's what I have. It's basically a revert of the commit in
> > question.
> 
> Your patch looks like a straight-forward revert.  I still think we need
> some guards in place, though.  For now, I think we can go with what you
> have, and I'll come up with some other mechanism to deal with this case.

Thanks Jeff, I'll merge it up and we can get things straightened out in
due time.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 8+ messages in thread

Thread overview: 8+ messages
2009-10-26 21:43 [PATCH 0/5] cfq-iosched: improve latency for no-idle queues (v3) Corrado Zoccolo
2009-10-28  8:27 ` Jens Axboe
     [not found] ` <x49zl7c268s.fsf@segfault.boston.devel.redhat.com>
     [not found]   ` <4e5e476b0910271124r2cf9f9c0l83fdc59b50619202@mail.gmail.com>
     [not found]     ` <x493a4wsn5c.fsf@segfault.boston.devel.redhat.com>
     [not found]       ` <x49fx8wbqd0.fsf@segfault.boston.devel.redhat.com>
     [not found]         ` <4e5e476b0911030042q5963718aj5875c542e6f6cc40@mail.gmail.com>
     [not found]           ` <x49ocnju35d.fsf@segfault.boston.devel.redhat.com>
     [not found]             ` <4e5e476b0911030719m425c208cg311f44a91fad8342@mail.gmail.com>
2009-11-03 18:35               ` Fwd: " Corrado Zoccolo
2009-11-03 20:18                 ` Jens Axboe
2009-11-03 20:26                   ` Jeff Moyer
2009-11-03 20:28                     ` Jens Axboe
2009-11-03 22:00                       ` Jeff Moyer
2009-11-04  7:51                         ` Jens Axboe
