* [PATCH 3/3]Subject: CFQ: add think time check for group
@ 2011-07-04 5:36 Shaohua Li
2011-07-05 14:31 ` Vivek Goyal
0 siblings, 1 reply; 5+ messages in thread
From: Shaohua Li @ 2011-07-04 5:36 UTC (permalink / raw)
To: lkml; +Cc: Jens Axboe, Vivek Goyal
Subject: CFQ: add think time check for group
Currently, when the last queue of a group has no requests, we don't expire
the queue, hoping that a request from the group arrives soon so the group
doesn't miss its share. But if the group's think time is big, that assumption
doesn't hold and we just waste bandwidth. In that case, we don't idle.
[global]
runtime=30
direct=1
[test1]
cgroup=test1
cgroup_weight=1000
rw=randread
ioengine=libaio
size=500m
runtime=30
directory=/mnt
filename=file1
thinktime=9000
[test2]
cgroup=test2
cgroup_weight=1000
rw=randread
ioengine=libaio
size=500m
runtime=30
directory=/mnt
filename=file2
        patched   base
test1   64k       39k
test2   540k      540k
total   604k      578k
The test1 group gets much better throughput because it waits less time.
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
---
block/cfq-iosched.c | 19 ++++++++++++++++---
1 file changed, 16 insertions(+), 3 deletions(-)
Index: linux/block/cfq-iosched.c
===================================================================
--- linux.orig/block/cfq-iosched.c 2011-07-01 13:45:24.000000000 +0800
+++ linux/block/cfq-iosched.c 2011-07-01 13:48:18.000000000 +0800
@@ -215,6 +215,7 @@ struct cfq_group {
#endif
/* number of requests that are on the dispatch list or inside driver */
int dispatched;
+ struct cfq_ttime ttime;
};
/*
@@ -1062,6 +1063,8 @@ static struct cfq_group * cfq_alloc_cfqg
*st = CFQ_RB_ROOT;
RB_CLEAR_NODE(&cfqg->rb_node);
+ cfqg->ttime.last_end_request = jiffies;
+
/*
* Take the initial reference that will be released on destroy
* This can be thought of a joint reference by cgroup and
@@ -2385,8 +2388,9 @@ static struct cfq_queue *cfq_select_queu
* this group, wait for requests to complete.
*/
check_group_idle:
- if (cfqd->cfq_group_idle && cfqq->cfqg->nr_cfqq == 1
- && cfqq->cfqg->dispatched) {
+ if (cfqd->cfq_group_idle && cfqq->cfqg->nr_cfqq == 1 &&
+ cfqq->cfqg->dispatched && !cfq_io_thinktime_big(cfqq->cfqg->ttime,
+ cfqd->cfq_group_idle)) {
cfqq = NULL;
goto keep_queue;
}
@@ -3245,6 +3249,9 @@ cfq_update_io_thinktime(struct cfq_data
__cfq_update_io_thinktime(&service_tree->ttime,
cfqd->cfq_slice_idle);
}
+#ifdef CONFIG_CFQ_GROUP_IOSCHED
+ __cfq_update_io_thinktime(&cfqg->ttime, cfqd->cfq_group_idle);
+#endif
}
static void
@@ -3536,7 +3543,9 @@ static bool cfq_should_wait_busy(struct
if (cfqq->cfqg->nr_cfqq > 1)
return false;
- if (cfq_slice_used(cfqq))
+ /* we are the only queue in the group */
+ if (cfq_slice_used(cfqq) &&
+ !cfq_io_thinktime_big(cfqq->cfqg->ttime, cfqd->cfq_group_idle))
return true;
/* if slice left is less than think time, wait busy */
@@ -3593,6 +3602,10 @@ static void cfq_completed_request(struct
cfqd->last_delayed_sync = now;
}
+#ifdef CONFIG_CFQ_GROUP_IOSCHED
+ cfqq->cfqg->ttime.last_end_request = now;
+#endif
+
/*
* If this is the active queue, check if it needs to be expired,
* or if we want to idle in case it has no pending requests.
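
A rough userspace sketch of the idea behind this patch, for readers who do not have the rest of the series at hand: keep a decaying average of the gap between a group's last completion and its next request, and stop idling for the group once that average exceeds the idle window. The struct, the function names, the 7/8 decay and the sample threshold of 80 are illustrative assumptions, not the kernel's cfq_ttime code, and time is in arbitrary ticks (think milliseconds).

```c
#include <stdbool.h>

/* Hypothetical userspace model; names do not match the kernel's. */
struct think_time {
	unsigned long last_end;	/* time the last request completed */
	unsigned long samples;	/* decayed sample weight (fixed point, x256) */
	unsigned long total;	/* decayed sum of gaps (fixed point, x256) */
	unsigned long mean;	/* decayed mean gap */
};

/* Record the gap between the last completion and a new arrival at 'now'. */
static void think_time_update(struct think_time *tt, unsigned long now,
			      unsigned long idle_window)
{
	unsigned long gap = now - tt->last_end;

	/* Clamp so a single long pause cannot dominate the average. */
	if (gap > 2 * idle_window)
		gap = 2 * idle_window;

	tt->samples = (7 * tt->samples + 256) / 8;
	tt->total = (7 * tt->total + 256 * gap) / 8;
	tt->mean = (tt->total + 128) / tt->samples;
}

/* "Big" means: enough samples seen, and the mean gap exceeds the window. */
static bool think_time_big(const struct think_time *tt,
			   unsigned long idle_window)
{
	return tt->samples > 80 && tt->mean > idle_window;
}

/* Feed n requests, each arriving 'gap' ticks after the previous completion. */
static bool simulate(unsigned long gap, unsigned long idle_window, int n)
{
	struct think_time tt = { 0 };
	unsigned long now = 0;
	int i;

	for (i = 0; i < n; i++) {
		now += gap;		/* next request arrives after 'gap' */
		think_time_update(&tt, now, idle_window);
		tt.last_end = now;	/* assume it completes immediately */
	}
	return think_time_big(&tt, idle_window);
}
```

In this model, calling simulate() with a gap larger than the idle window reports a big think time after a handful of requests, while a gap well below the window never does, which is the distinction the patch uses to decide whether idling on the group is worthwhile.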
* Re: [PATCH 3/3]Subject: CFQ: add think time check for group
2011-07-04 5:36 [PATCH 3/3]Subject: CFQ: add think time check for group Shaohua Li
@ 2011-07-05 14:31 ` Vivek Goyal
2011-07-06 1:58 ` Shaohua Li
0 siblings, 1 reply; 5+ messages in thread
From: Vivek Goyal @ 2011-07-05 14:31 UTC (permalink / raw)
To: Shaohua Li; +Cc: lkml, Jens Axboe
On Mon, Jul 04, 2011 at 01:36:36PM +0800, Shaohua Li wrote:
> Subject: CFQ: add think time check for group
>
> Currently, when the last queue of a group has no requests, we don't expire
> the queue, hoping that a request from the group arrives soon so the group
> doesn't miss its share. But if the group's think time is big, that assumption
> doesn't hold and we just waste bandwidth. In that case, we don't idle.
>
> [global]
> runtime=30
> direct=1
>
> [test1]
> cgroup=test1
> cgroup_weight=1000
> rw=randread
> ioengine=libaio
> size=500m
> runtime=30
> directory=/mnt
> filename=file1
> thinktime=9000
>
> [test2]
> cgroup=test2
> cgroup_weight=1000
> rw=randread
> ioengine=libaio
> size=500m
> runtime=30
> directory=/mnt
> filename=file2
>
>         patched   base
> test1   64k       39k
> test2   540k      540k
> total   604k      578k
>
> The test1 group gets much better throughput because it waits less time.
>
> Signed-off-by: Shaohua Li <shaohua.li@intel.com>
> ---
> block/cfq-iosched.c | 19 ++++++++++++++++---
> 1 file changed, 16 insertions(+), 3 deletions(-)
>
> Index: linux/block/cfq-iosched.c
> ===================================================================
> --- linux.orig/block/cfq-iosched.c 2011-07-01 13:45:24.000000000 +0800
> +++ linux/block/cfq-iosched.c 2011-07-01 13:48:18.000000000 +0800
> @@ -215,6 +215,7 @@ struct cfq_group {
> #endif
> /* number of requests that are on the dispatch list or inside driver */
> int dispatched;
> + struct cfq_ttime ttime;
> };
>
> /*
> @@ -1062,6 +1063,8 @@ static struct cfq_group * cfq_alloc_cfqg
> *st = CFQ_RB_ROOT;
> RB_CLEAR_NODE(&cfqg->rb_node);
>
> + cfqg->ttime.last_end_request = jiffies;
> +
> /*
> * Take the initial reference that will be released on destroy
> * This can be thought of a joint reference by cgroup and
> @@ -2385,8 +2388,9 @@ static struct cfq_queue *cfq_select_queu
> * this group, wait for requests to complete.
> */
> check_group_idle:
> - if (cfqd->cfq_group_idle && cfqq->cfqg->nr_cfqq == 1
> - && cfqq->cfqg->dispatched) {
> + if (cfqd->cfq_group_idle && cfqq->cfqg->nr_cfqq == 1 &&
> + cfqq->cfqg->dispatched && !cfq_io_thinktime_big(cfqq->cfqg->ttime,
> + cfqd->cfq_group_idle)) {
Let's put the group think time check on a new line to avoid splitting
the function arguments across two lines.
> cfqq = NULL;
> goto keep_queue;
> }
> @@ -3245,6 +3249,9 @@ cfq_update_io_thinktime(struct cfq_data
> __cfq_update_io_thinktime(&service_tree->ttime,
> cfqd->cfq_slice_idle);
> }
> +#ifdef CONFIG_CFQ_GROUP_IOSCHED
> + __cfq_update_io_thinktime(&cfqg->ttime, cfqd->cfq_group_idle);
> +#endif
> }
>
> static void
> @@ -3536,7 +3543,9 @@ static bool cfq_should_wait_busy(struct
> if (cfqq->cfqg->nr_cfqq > 1)
> return false;
>
> - if (cfq_slice_used(cfqq))
> + /* we are the only queue in the group */
> + if (cfq_slice_used(cfqq) &&
> + !cfq_io_thinktime_big(cfqq->cfqg->ttime, cfqd->cfq_group_idle))
> return true;
I think the think time check should not be ANDed with the slice_used() check.
The slice used check covers the case where we have been idling on the queue
all along and the slice is now used up, so we would like to wait a bit longer
for the queue to get busy so that the group does not lose its share. The fact
that the slice is used means we have been idling on the queue and have been
getting requests regularly in the queue.
But the group think time check should apply irrespective of whether we have
used the slice or not. If the think time of a group is big, then we should
just not wait for it to get busy, because the group idle timer will fire and
expire the queue/group before that anyway.
Thanks
Vivek
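
To make the difference between the two orderings concrete, here is a small stand-alone model of the decision in cfq_should_wait_busy(). It is a sketch under stated assumptions: struct group_state and both function names are hypothetical, each kernel condition is reduced to a boolean, and slice_nearly_over stands in for the later "slice left is less than think time" check.

```c
#include <stdbool.h>

/* Hypothetical snapshot of the state the wait-busy decision looks at. */
struct group_state {
	int nr_queues;		/* queues in the group (nr_cfqq) */
	bool slice_used;	/* result of the slice-used check */
	bool think_time_big;	/* group think time exceeds group_idle */
	bool slice_nearly_over;	/* stands in for the "slice left" check */
};

/* As first posted: think time only vetoes the slice-used shortcut. */
static bool wait_busy_anded(const struct group_state *g)
{
	if (g->nr_queues > 1)
		return false;
	if (g->slice_used && !g->think_time_big)
		return true;
	return g->slice_nearly_over;
}

/* As suggested in review: a big think time rules out waiting altogether. */
static bool wait_busy_separate(const struct group_state *g)
{
	if (g->nr_queues > 1)
		return false;
	if (g->think_time_big)
		return false;
	if (g->slice_used)
		return true;
	return g->slice_nearly_over;
}
```

The two variants disagree only when the group's think time is big and the later slice check still says to wait: the ANDed form falls through and waits, while the separate early return does not.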
* Re: [PATCH 3/3]Subject: CFQ: add think time check for group
2011-07-05 14:31 ` Vivek Goyal
@ 2011-07-06 1:58 ` Shaohua Li
2011-07-06 15:06 ` Vivek Goyal
0 siblings, 1 reply; 5+ messages in thread
From: Shaohua Li @ 2011-07-06 1:58 UTC (permalink / raw)
To: Vivek Goyal; +Cc: lkml, Jens Axboe
On Tue, 2011-07-05 at 22:31 +0800, Vivek Goyal wrote:
> On Mon, Jul 04, 2011 at 01:36:36PM +0800, Shaohua Li wrote:
> > Subject: CFQ: add think time check for group
> >
> > Currently, when the last queue of a group has no requests, we don't expire
> > the queue, hoping that a request from the group arrives soon so the group
> > doesn't miss its share. But if the group's think time is big, that assumption
> > doesn't hold and we just waste bandwidth. In that case, we don't idle.
> >
> > [global]
> > runtime=30
> > direct=1
> >
> > [test1]
> > cgroup=test1
> > cgroup_weight=1000
> > rw=randread
> > ioengine=libaio
> > size=500m
> > runtime=30
> > directory=/mnt
> > filename=file1
> > thinktime=9000
> >
> > [test2]
> > cgroup=test2
> > cgroup_weight=1000
> > rw=randread
> > ioengine=libaio
> > size=500m
> > runtime=30
> > directory=/mnt
> > filename=file2
> >
> >         patched   base
> > test1   64k       39k
> > test2   540k      540k
> > total   604k      578k
> >
> > The test1 group gets much better throughput because it waits less time.
> >
> > Signed-off-by: Shaohua Li <shaohua.li@intel.com>
> > ---
> > block/cfq-iosched.c | 19 ++++++++++++++++---
> > 1 file changed, 16 insertions(+), 3 deletions(-)
> >
> > Index: linux/block/cfq-iosched.c
> > ===================================================================
> > --- linux.orig/block/cfq-iosched.c 2011-07-01 13:45:24.000000000 +0800
> > +++ linux/block/cfq-iosched.c 2011-07-01 13:48:18.000000000 +0800
> > @@ -215,6 +215,7 @@ struct cfq_group {
> > #endif
> > /* number of requests that are on the dispatch list or inside driver */
> > int dispatched;
> > + struct cfq_ttime ttime;
> > };
> >
> > /*
> > @@ -1062,6 +1063,8 @@ static struct cfq_group * cfq_alloc_cfqg
> > *st = CFQ_RB_ROOT;
> > RB_CLEAR_NODE(&cfqg->rb_node);
> >
> > + cfqg->ttime.last_end_request = jiffies;
> > +
> > /*
> > * Take the initial reference that will be released on destroy
> > * This can be thought of a joint reference by cgroup and
> > @@ -2385,8 +2388,9 @@ static struct cfq_queue *cfq_select_queu
> > * this group, wait for requests to complete.
> > */
> > check_group_idle:
> > - if (cfqd->cfq_group_idle && cfqq->cfqg->nr_cfqq == 1
> > - && cfqq->cfqg->dispatched) {
> > + if (cfqd->cfq_group_idle && cfqq->cfqg->nr_cfqq == 1 &&
> > + cfqq->cfqg->dispatched && !cfq_io_thinktime_big(cfqq->cfqg->ttime,
> > + cfqd->cfq_group_idle)) {
>
> Let's put the group think time check on a new line to avoid splitting
> the function arguments across two lines.
ok
> > cfqq = NULL;
> > goto keep_queue;
> > }
> > @@ -3245,6 +3249,9 @@ cfq_update_io_thinktime(struct cfq_data
> > __cfq_update_io_thinktime(&service_tree->ttime,
> > cfqd->cfq_slice_idle);
> > }
> > +#ifdef CONFIG_CFQ_GROUP_IOSCHED
> > + __cfq_update_io_thinktime(&cfqg->ttime, cfqd->cfq_group_idle);
> > +#endif
> > }
> >
> > static void
> > @@ -3536,7 +3543,9 @@ static bool cfq_should_wait_busy(struct
> > if (cfqq->cfqg->nr_cfqq > 1)
> > return false;
> >
> > - if (cfq_slice_used(cfqq))
> > + /* we are the only queue in the group */
> > + if (cfq_slice_used(cfqq) &&
> > + !cfq_io_thinktime_big(cfqq->cfqg->ttime, cfqd->cfq_group_idle))
> > return true;
>
> I think the think time check should not be ANDed with the slice_used() check.
> The slice used check covers the case where we have been idling on the queue
> all along and the slice is now used up, so we would like to wait a bit longer
> for the queue to get busy so that the group does not lose its share. The fact
> that the slice is used means we have been idling on the queue and have been
> getting requests regularly in the queue.
>
> But the group think time check should apply irrespective of whether we have
> used the slice or not. If the think time of a group is big, then we should
> just not wait for it to get busy, because the group idle timer will fire and
> expire the queue/group before that anyway.
Makes sense. Updated patch below.
Subject: CFQ: add think time check for group
Currently, when the last queue of a group has no requests, we don't expire
the queue, hoping that a request from the group arrives soon so the group
doesn't miss its share. But if the group's think time is big, that assumption
doesn't hold and we just waste bandwidth. In that case, we don't idle.
[global]
runtime=30
direct=1
[test1]
cgroup=test1
cgroup_weight=1000
rw=randread
ioengine=libaio
size=500m
runtime=30
directory=/mnt
filename=file1
thinktime=9000
[test2]
cgroup=test2
cgroup_weight=1000
rw=randread
ioengine=libaio
size=500m
runtime=30
directory=/mnt
filename=file2
        patched   base
test1   64k       39k
test2   540k      540k
total   604k      578k
The test1 group gets much better throughput because it waits less time.
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
---
block/cfq-iosched.c | 19 +++++++++++++++++--
1 file changed, 17 insertions(+), 2 deletions(-)
Index: linux/block/cfq-iosched.c
===================================================================
--- linux.orig/block/cfq-iosched.c 2011-07-06 09:39:52.000000000 +0800
+++ linux/block/cfq-iosched.c 2011-07-06 09:48:48.000000000 +0800
@@ -213,6 +213,7 @@ struct cfq_group {
#endif
/* number of requests that are on the dispatch list or inside driver */
int dispatched;
+ struct cfq_ttime ttime;
};
/*
@@ -1072,6 +1073,8 @@ static struct cfq_group * cfq_alloc_cfqg
*st = CFQ_RB_ROOT;
RB_CLEAR_NODE(&cfqg->rb_node);
+ cfqg->ttime.last_end_request = jiffies;
+
/*
* Take the initial reference that will be released on destroy
* This can be thought of a joint reference by cgroup and
@@ -2395,8 +2398,9 @@ static struct cfq_queue *cfq_select_queu
* this group, wait for requests to complete.
*/
check_group_idle:
- if (cfqd->cfq_group_idle && cfqq->cfqg->nr_cfqq == 1
- && cfqq->cfqg->dispatched) {
+ if (cfqd->cfq_group_idle && cfqq->cfqg->nr_cfqq == 1 &&
+ cfqq->cfqg->dispatched &&
+ !cfq_io_thinktime_big(cfqd, &cfqq->cfqg->ttime, true)) {
cfqq = NULL;
goto keep_queue;
}
@@ -3250,6 +3254,9 @@ cfq_update_io_thinktime(struct cfq_data
__cfq_update_io_thinktime(&cfqq->service_tree->ttime,
cfqd->cfq_slice_idle);
}
+#ifdef CONFIG_CFQ_GROUP_IOSCHED
+ __cfq_update_io_thinktime(&cfqq->cfqg->ttime, cfqd->cfq_group_idle);
+#endif
}
static void
@@ -3541,6 +3548,10 @@ static bool cfq_should_wait_busy(struct
if (cfqq->cfqg->nr_cfqq > 1)
return false;
+ /* the only queue in the group, but think time is big */
+ if (cfq_io_thinktime_big(cfqd, &cfqq->cfqg->ttime, true))
+ return false;
+
if (cfq_slice_used(cfqq))
return true;
@@ -3598,6 +3609,10 @@ static void cfq_completed_request(struct
cfqd->last_delayed_sync = now;
}
+#ifdef CONFIG_CFQ_GROUP_IOSCHED
+ cfqq->cfqg->ttime.last_end_request = now;
+#endif
+
/*
* If this is the active queue, check if it needs to be expired,
* or if we want to idle in case it has no pending requests.
* Re: [PATCH 3/3]Subject: CFQ: add think time check for group
2011-07-06 1:58 ` Shaohua Li
@ 2011-07-06 15:06 ` Vivek Goyal
2011-07-07 6:08 ` Shaohua Li
0 siblings, 1 reply; 5+ messages in thread
From: Vivek Goyal @ 2011-07-06 15:06 UTC (permalink / raw)
To: Shaohua Li; +Cc: lkml, Jens Axboe
On Wed, Jul 06, 2011 at 09:58:40AM +0800, Shaohua Li wrote:
[..]
> > > [global]
> > > runtime=30
> > > direct=1
> > >
> > > [test1]
> > > cgroup=test1
> > > cgroup_weight=1000
> > > rw=randread
> > > ioengine=libaio
> > > size=500m
> > > runtime=30
> > > directory=/mnt
> > > filename=file1
> > > thinktime=9000
> > >
> > > [test2]
> > > cgroup=test2
> > > cgroup_weight=1000
> > > rw=randread
> > > ioengine=libaio
> > > size=500m
> > > runtime=30
> > > directory=/mnt
> > > filename=file2
> > >
> > >         patched   base
> > > test1   64k       39k
> > > test2   540k      540k
> > > total   604k      578k
> > >
> > > The test1 group gets much better throughput because it waits less time.
I don't understand it. Thinktime of group test1 is more than 8ms. So now
we should not be idling on test1. Hence test1 should lose some share and
test2 should gain disk share and overall throughput should go up.
I am wondering why throughput of test2 did not go up?
Also, can you run some tests to make sure that the disk shares of regular
workloads (think time less than 8ms) are not impacted?
Thanks
Vivek
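
The 8ms figure lines up with the job file as follows: fio's thinktime option is given in microseconds, so thinktime=9000 is 9ms, just above the 8ms idle window referred to above. A minimal sketch of that comparison, where the function name is hypothetical and the 8ms window is taken from this message rather than read from a live system:

```c
#include <stdbool.h>

/*
 * Hypothetical helper: does a fio thinktime (microseconds) exceed an
 * idle window given in milliseconds?
 */
static bool think_time_exceeds_idle(unsigned int thinktime_us,
				    unsigned int idle_window_ms)
{
	return thinktime_us > idle_window_ms * 1000u;
}
```

With an 8ms window, the test1 job (9000us) is over the threshold, while a 2000us job or a job without think time is not. The think time the scheduler actually observes also includes submission overhead, so this is only an approximation of what the scheduler measures.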
* Re: [PATCH 3/3]Subject: CFQ: add think time check for group
2011-07-06 15:06 ` Vivek Goyal
@ 2011-07-07 6:08 ` Shaohua Li
0 siblings, 0 replies; 5+ messages in thread
From: Shaohua Li @ 2011-07-07 6:08 UTC (permalink / raw)
To: Vivek Goyal; +Cc: lkml, Jens Axboe
On Wed, 2011-07-06 at 23:06 +0800, Vivek Goyal wrote:
> On Wed, Jul 06, 2011 at 09:58:40AM +0800, Shaohua Li wrote:
> [..]
> > > > [global]
> > > > runtime=30
> > > > direct=1
> > > >
> > > > [test1]
> > > > cgroup=test1
> > > > cgroup_weight=1000
> > > > rw=randread
> > > > ioengine=libaio
> > > > size=500m
> > > > runtime=30
> > > > directory=/mnt
> > > > filename=file1
> > > > thinktime=9000
> > > >
> > > > [test2]
> > > > cgroup=test2
> > > > cgroup_weight=1000
> > > > rw=randread
> > > > ioengine=libaio
> > > > size=500m
> > > > runtime=30
> > > > directory=/mnt
> > > > filename=file2
> > > >
> > > >         patched   base
> > > > test1   64k       39k
> > > > test2   540k      540k
> > > > total   604k      578k
> > > >
> > > > The test1 group gets much better throughput because it waits less time.
>
> I don't understand it. Thinktime of group test1 is more than 8ms. So now
> we should not be idling on test1. Hence test1 should lose some share and
> test2 should gain disk share and overall throughput should go up.
>
> I am wondering why throughput of test2 did not go up?
Hmm, actually the throughput of test2 is better. Maybe I wrote it down wrong.
The test2 throughput is about 548k/s. Sorry.
> Also, can you run some tests to make sure that the disk shares of regular
> workloads (think time less than 8ms) are not impacted?
I tried a think time of 2ms and no think time. There is no difference, and
the results are quite stable.
Thanks,
Shaohua
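
For reference, the relative changes implied by the numbers in this thread can be worked out with a few lines of integer arithmetic. This assumes (the message does not say so explicitly) that the corrected 548k figure belongs in the patched column for test2, which would make the patched total 64k + 548k = 612k. The function name is hypothetical.

```c
/*
 * Relative change from base to patched, in tenths of a percent,
 * truncated toward zero. Inputs are throughput figures in the same unit.
 */
static long gain_permille(long base, long patched)
{
	return (patched - base) * 1000 / base;
}
```

Under that assumption, gain_permille(39, 64) gives the change for test1, gain_permille(540, 548) the change for test2, and gain_permille(578, 612) the change in the total.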