linux-kernel.vger.kernel.org archive mirror
* [PATCH 3/3] CFQ: add think time check for group
From: Shaohua Li @ 2011-07-04  5:36 UTC
  To: lkml; +Cc: Jens Axboe, Vivek Goyal

Subject: CFQ: add think time check for group

Currently, when the last queue of a group has no request, we do not expire
the queue, in the hope that a request from the group will arrive soon and
the group will not miss its share. But if the think time is big, that
assumption does not hold and we just waste bandwidth. In such a case, do
not idle.
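
The check relies on per-group think time statistics and a
cfq_io_thinktime_big() helper added earlier in this series (not shown
here). A minimal sketch of the idea, with the signature assumed from the
call sites in the diff below:

	/* sketch only: the real helper comes from an earlier patch */
	static inline bool cfq_io_thinktime_big(struct cfq_ttime ttime,
						unsigned long idle)
	{
		/* trust the decaying mean only once enough samples exist */
		if (!sample_valid(ttime.ttime_samples))
			return false;
		/* "big" = mean gap between requests exceeds the idle window */
		return ttime.ttime_mean > idle;
	}

The fio job below demonstrates the effect. Note that fio's thinktime is
given in microseconds, so test1 issues a request roughly every 9ms,
which is larger than the 8ms default group idle window: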

[global]
runtime=30
direct=1

[test1]
cgroup=test1
cgroup_weight=1000
rw=randread
ioengine=libaio
size=500m
runtime=30
directory=/mnt
filename=file1
thinktime=9000

[test2]
cgroup=test2
cgroup_weight=1000
rw=randread
ioengine=libaio
size=500m
runtime=30
directory=/mnt
filename=file2

	patched		base
test1	64k		39k
test2	540k		540k
total	604k		578k

Group test1 gets much better throughput because it spends less time waiting.

Signed-off-by: Shaohua Li <shaohua.li@intel.com>
---
 block/cfq-iosched.c |   19 ++++++++++++++++---
 1 file changed, 16 insertions(+), 3 deletions(-)

Index: linux/block/cfq-iosched.c
===================================================================
--- linux.orig/block/cfq-iosched.c	2011-07-01 13:45:24.000000000 +0800
+++ linux/block/cfq-iosched.c	2011-07-01 13:48:18.000000000 +0800
@@ -215,6 +215,7 @@ struct cfq_group {
 #endif
 	/* number of requests that are on the dispatch list or inside driver */
 	int dispatched;
+	struct cfq_ttime ttime;
 };
 
 /*
@@ -1062,6 +1063,8 @@ static struct cfq_group * cfq_alloc_cfqg
 		*st = CFQ_RB_ROOT;
 	RB_CLEAR_NODE(&cfqg->rb_node);
 
+	cfqg->ttime.last_end_request = jiffies;
+
 	/*
 	 * Take the initial reference that will be released on destroy
 	 * This can be thought of a joint reference by cgroup and
@@ -2385,8 +2388,9 @@ static struct cfq_queue *cfq_select_queu
 	 * this group, wait for requests to complete.
 	 */
 check_group_idle:
-	if (cfqd->cfq_group_idle && cfqq->cfqg->nr_cfqq == 1
-	    && cfqq->cfqg->dispatched) {
+	if (cfqd->cfq_group_idle && cfqq->cfqg->nr_cfqq == 1 &&
+	    cfqq->cfqg->dispatched && !cfq_io_thinktime_big(cfqq->cfqg->ttime,
+	    cfqd->cfq_group_idle)) {
 		cfqq = NULL;
 		goto keep_queue;
 	}
@@ -3245,6 +3249,9 @@ cfq_update_io_thinktime(struct cfq_data
 		__cfq_update_io_thinktime(&service_tree->ttime,
 			cfqd->cfq_slice_idle);
 	}
+#ifdef CONFIG_CFQ_GROUP_IOSCHED
+	__cfq_update_io_thinktime(&cfqg->ttime, cfqd->cfq_group_idle);
+#endif
 }
 
 static void
@@ -3536,7 +3543,9 @@ static bool cfq_should_wait_busy(struct
 	if (cfqq->cfqg->nr_cfqq > 1)
 		return false;
 
-	if (cfq_slice_used(cfqq))
+	/* we are the only queue in the group */
+	if (cfq_slice_used(cfqq) &&
+	    !cfq_io_thinktime_big(cfqq->cfqg->ttime, cfqd->cfq_group_idle))
 		return true;
 
 	/* if slice left is less than think time, wait busy */
@@ -3593,6 +3602,10 @@ static void cfq_completed_request(struct
 			cfqd->last_delayed_sync = now;
 	}
 
+#ifdef CONFIG_CFQ_GROUP_IOSCHED
+	cfqq->cfqg->ttime.last_end_request = now;
+#endif
+
 	/*
 	 * If this is the active queue, check if it needs to be expired,
 	 * or if we want to idle in case it has no pending requests.
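
For reference, the cfq_ttime bookkeeping the hunks above hook into is
the same decaying average CFQ already keeps per queue and per service
tree. A sketch of the state and the update path, assuming the helpers
introduced earlier in this series:

	struct cfq_ttime {
		unsigned long last_end_request;	/* completion time of last request */

		unsigned long ttime_total;	/* decayed sum of request gaps */
		unsigned long ttime_samples;	/* decayed sample count */
		unsigned long ttime_mean;	/* mean think time, in jiffies */
	};

	static void __cfq_update_io_thinktime(struct cfq_ttime *ttime,
					      unsigned long slice_idle)
	{
		unsigned long elapsed = jiffies - ttime->last_end_request;

		/* cap a single gap so one long pause cannot skew the mean */
		elapsed = min(elapsed, 2UL * slice_idle);

		/* exponentially decayed average: 7/8 old weight, 1/8 new */
		ttime->ttime_samples = (7*ttime->ttime_samples + 256) / 8;
		ttime->ttime_total = (7*ttime->ttime_total + 256*elapsed) / 8;
		ttime->ttime_mean = (ttime->ttime_total + 128) / ttime->ttime_samples;
	}

The patch feeds the group's cfq_ttime through this update and stamps
last_end_request at request completion, so the group's mean tracks the
gap between a completion and the next request from the same group.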




* Re: [PATCH 3/3] CFQ: add think time check for group
From: Vivek Goyal @ 2011-07-05 14:31 UTC
  To: Shaohua Li; +Cc: lkml, Jens Axboe

On Mon, Jul 04, 2011 at 01:36:36PM +0800, Shaohua Li wrote:
> Subject: CFQ: add think time check for group
> 
> Currently, when the last queue of a group has no request, we do not expire
> the queue, in the hope that a request from the group will arrive soon and
> the group will not miss its share. But if the think time is big, that
> assumption does not hold and we just waste bandwidth. In such a case, do
> not idle.
> 
> [global]
> runtime=30
> direct=1
> 
> [test1]
> cgroup=test1
> cgroup_weight=1000
> rw=randread
> ioengine=libaio
> size=500m
> runtime=30
> directory=/mnt
> filename=file1
> thinktime=9000
> 
> [test2]
> cgroup=test2
> cgroup_weight=1000
> rw=randread
> ioengine=libaio
> size=500m
> runtime=30
> directory=/mnt
> filename=file2
> 
> 	patched		base
> test1	64k		39k
> test2	540k		540k
> total	604k		578k
> 
> Group test1 gets much better throughput because it spends less time waiting.
> 
> Signed-off-by: Shaohua Li <shaohua.li@intel.com>
> ---
>  block/cfq-iosched.c |   19 ++++++++++++++++---
>  1 file changed, 16 insertions(+), 3 deletions(-)
> 
> Index: linux/block/cfq-iosched.c
> ===================================================================
> --- linux.orig/block/cfq-iosched.c	2011-07-01 13:45:24.000000000 +0800
> +++ linux/block/cfq-iosched.c	2011-07-01 13:48:18.000000000 +0800
> @@ -215,6 +215,7 @@ struct cfq_group {
>  #endif
>  	/* number of requests that are on the dispatch list or inside driver */
>  	int dispatched;
> +	struct cfq_ttime ttime;
>  };
>  
>  /*
> @@ -1062,6 +1063,8 @@ static struct cfq_group * cfq_alloc_cfqg
>  		*st = CFQ_RB_ROOT;
>  	RB_CLEAR_NODE(&cfqg->rb_node);
>  
> +	cfqg->ttime.last_end_request = jiffies;
> +
>  	/*
>  	 * Take the initial reference that will be released on destroy
>  	 * This can be thought of a joint reference by cgroup and
> @@ -2385,8 +2388,9 @@ static struct cfq_queue *cfq_select_queu
>  	 * this group, wait for requests to complete.
>  	 */
>  check_group_idle:
> -	if (cfqd->cfq_group_idle && cfqq->cfqg->nr_cfqq == 1
> -	    && cfqq->cfqg->dispatched) {
> +	if (cfqd->cfq_group_idle && cfqq->cfqg->nr_cfqq == 1 &&
> +	    cfqq->cfqg->dispatched && !cfq_io_thinktime_big(cfqq->cfqg->ttime,
> +	    cfqd->cfq_group_idle)) {

Let's put the group think time check on a new line, to avoid splitting a
function's arguments across two lines.

>  		cfqq = NULL;
>  		goto keep_queue;
>  	}
> @@ -3245,6 +3249,9 @@ cfq_update_io_thinktime(struct cfq_data
>  		__cfq_update_io_thinktime(&service_tree->ttime,
>  			cfqd->cfq_slice_idle);
>  	}
> +#ifdef CONFIG_CFQ_GROUP_IOSCHED
> +	__cfq_update_io_thinktime(&cfqg->ttime, cfqd->cfq_group_idle);
> +#endif
>  }
>  
>  static void
> @@ -3536,7 +3543,9 @@ static bool cfq_should_wait_busy(struct
>  	if (cfqq->cfqg->nr_cfqq > 1)
>  		return false;
>  
> -	if (cfq_slice_used(cfqq))
> +	/* we are the only queue in the group */
> +	if (cfq_slice_used(cfqq) &&
> +	    !cfq_io_thinktime_big(cfqq->cfqg->ttime, cfqd->cfq_group_idle))
>  		return true;

I think the think time check should not be ANDed with the slice_used()
check. The slice-used check is about the fact that we have been idling
on the queue all along and the slice is now used up, so we would like to
wait a bit more for the queue to get busy so that the group does not
lose its share. Given that the slice is used, we have been idling on the
queue and getting requests regularly.

But the group think time check should apply irrespective of whether we
have used the slice or not. If the think time of a group is big, then we
should simply not wait for it to get busy, because the group idle timer
will fire and expire the queue/group before that anyway.
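
IOW, make the group think time check an independent early exit instead
of a qualifier on the slice check, roughly (untested sketch, using the
helper as it is called in this revision of the patch):

	/* the only queue in the group, but think time is big: don't wait */
	if (cfq_io_thinktime_big(cfqq->cfqg->ttime, cfqd->cfq_group_idle))
		return false;

	/* slice used up while requests kept coming: wait a bit more */
	if (cfq_slice_used(cfqq))
		return true;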

Thanks
Vivek


* Re: [PATCH 3/3] CFQ: add think time check for group
From: Shaohua Li @ 2011-07-06  1:58 UTC
  To: Vivek Goyal; +Cc: lkml, Jens Axboe

On Tue, 2011-07-05 at 22:31 +0800, Vivek Goyal wrote:
> On Mon, Jul 04, 2011 at 01:36:36PM +0800, Shaohua Li wrote:
> > Subject: CFQ: add think time check for group
> > 
> > Currently, when the last queue of a group has no request, we do not expire
> > the queue, in the hope that a request from the group will arrive soon and
> > the group will not miss its share. But if the think time is big, that
> > assumption does not hold and we just waste bandwidth. In such a case, do
> > not idle.
> > 
> > [global]
> > runtime=30
> > direct=1
> > 
> > [test1]
> > cgroup=test1
> > cgroup_weight=1000
> > rw=randread
> > ioengine=libaio
> > size=500m
> > runtime=30
> > directory=/mnt
> > filename=file1
> > thinktime=9000
> > 
> > [test2]
> > cgroup=test2
> > cgroup_weight=1000
> > rw=randread
> > ioengine=libaio
> > size=500m
> > runtime=30
> > directory=/mnt
> > filename=file2
> > 
> > 	patched		base
> > test1	64k		39k
> > test2	540k		540k
> > total	604k		578k
> > 
> > Group test1 gets much better throughput because it spends less time waiting.
> > 
> > Signed-off-by: Shaohua Li <shaohua.li@intel.com>
> > ---
> >  block/cfq-iosched.c |   19 ++++++++++++++++---
> >  1 file changed, 16 insertions(+), 3 deletions(-)
> > 
> > Index: linux/block/cfq-iosched.c
> > ===================================================================
> > --- linux.orig/block/cfq-iosched.c	2011-07-01 13:45:24.000000000 +0800
> > +++ linux/block/cfq-iosched.c	2011-07-01 13:48:18.000000000 +0800
> > @@ -215,6 +215,7 @@ struct cfq_group {
> >  #endif
> >  	/* number of requests that are on the dispatch list or inside driver */
> >  	int dispatched;
> > +	struct cfq_ttime ttime;
> >  };
> >  
> >  /*
> > @@ -1062,6 +1063,8 @@ static struct cfq_group * cfq_alloc_cfqg
> >  		*st = CFQ_RB_ROOT;
> >  	RB_CLEAR_NODE(&cfqg->rb_node);
> >  
> > +	cfqg->ttime.last_end_request = jiffies;
> > +
> >  	/*
> >  	 * Take the initial reference that will be released on destroy
> >  	 * This can be thought of a joint reference by cgroup and
> > @@ -2385,8 +2388,9 @@ static struct cfq_queue *cfq_select_queu
> >  	 * this group, wait for requests to complete.
> >  	 */
> >  check_group_idle:
> > -	if (cfqd->cfq_group_idle && cfqq->cfqg->nr_cfqq == 1
> > -	    && cfqq->cfqg->dispatched) {
> > +	if (cfqd->cfq_group_idle && cfqq->cfqg->nr_cfqq == 1 &&
> > +	    cfqq->cfqg->dispatched && !cfq_io_thinktime_big(cfqq->cfqg->ttime,
> > +	    cfqd->cfq_group_idle)) {
> 
> Let's put the group think time check on a new line, to avoid splitting a
> function's arguments across two lines.
ok

> >  		cfqq = NULL;
> >  		goto keep_queue;
> >  	}
> > @@ -3245,6 +3249,9 @@ cfq_update_io_thinktime(struct cfq_data
> >  		__cfq_update_io_thinktime(&service_tree->ttime,
> >  			cfqd->cfq_slice_idle);
> >  	}
> > +#ifdef CONFIG_CFQ_GROUP_IOSCHED
> > +	__cfq_update_io_thinktime(&cfqg->ttime, cfqd->cfq_group_idle);
> > +#endif
> >  }
> >  
> >  static void
> > @@ -3536,7 +3543,9 @@ static bool cfq_should_wait_busy(struct
> >  	if (cfqq->cfqg->nr_cfqq > 1)
> >  		return false;
> >  
> > -	if (cfq_slice_used(cfqq))
> > +	/* we are the only queue in the group */
> > +	if (cfq_slice_used(cfqq) &&
> > +	    !cfq_io_thinktime_big(cfqq->cfqg->ttime, cfqd->cfq_group_idle))
> >  		return true;
> 
> I think the think time check should not be ANDed with the slice_used()
> check. The slice-used check is about the fact that we have been idling
> on the queue all along and the slice is now used up, so we would like to
> wait a bit more for the queue to get busy so that the group does not
> lose its share. Given that the slice is used, we have been idling on the
> queue and getting requests regularly.
> 
> But the group think time check should apply irrespective of whether we
> have used the slice or not. If the think time of a group is big, then we
> should simply not wait for it to get busy, because the group idle timer
> will fire and expire the queue/group before that anyway.
Makes sense. Updated patch below.

Subject: CFQ: add think time check for group

Currently, when the last queue of a group has no request, we do not expire
the queue, in the hope that a request from the group will arrive soon and
the group will not miss its share. But if the think time is big, that
assumption does not hold and we just waste bandwidth. In such a case, do
not idle.

[global]
runtime=30
direct=1

[test1]
cgroup=test1
cgroup_weight=1000
rw=randread
ioengine=libaio
size=500m
runtime=30
directory=/mnt
filename=file1
thinktime=9000

[test2]
cgroup=test2
cgroup_weight=1000
rw=randread
ioengine=libaio
size=500m
runtime=30
directory=/mnt
filename=file2

	patched		base
test1	64k		39k
test2	540k		540k
total	604k		578k

Group test1 gets much better throughput because it spends less time waiting.

Signed-off-by: Shaohua Li <shaohua.li@intel.com>
---
 block/cfq-iosched.c |   19 +++++++++++++++++--
 1 file changed, 17 insertions(+), 2 deletions(-)

Index: linux/block/cfq-iosched.c
===================================================================
--- linux.orig/block/cfq-iosched.c	2011-07-06 09:39:52.000000000 +0800
+++ linux/block/cfq-iosched.c	2011-07-06 09:48:48.000000000 +0800
@@ -213,6 +213,7 @@ struct cfq_group {
 #endif
 	/* number of requests that are on the dispatch list or inside driver */
 	int dispatched;
+	struct cfq_ttime ttime;
 };
 
 /*
@@ -1072,6 +1073,8 @@ static struct cfq_group * cfq_alloc_cfqg
 		*st = CFQ_RB_ROOT;
 	RB_CLEAR_NODE(&cfqg->rb_node);
 
+	cfqg->ttime.last_end_request = jiffies;
+
 	/*
 	 * Take the initial reference that will be released on destroy
 	 * This can be thought of a joint reference by cgroup and
@@ -2395,8 +2398,9 @@ static struct cfq_queue *cfq_select_queu
 	 * this group, wait for requests to complete.
 	 */
 check_group_idle:
-	if (cfqd->cfq_group_idle && cfqq->cfqg->nr_cfqq == 1
-	    && cfqq->cfqg->dispatched) {
+	if (cfqd->cfq_group_idle && cfqq->cfqg->nr_cfqq == 1 &&
+	    cfqq->cfqg->dispatched &&
+	    !cfq_io_thinktime_big(cfqd, &cfqq->cfqg->ttime, true)) {
 		cfqq = NULL;
 		goto keep_queue;
 	}
@@ -3250,6 +3254,9 @@ cfq_update_io_thinktime(struct cfq_data
 		__cfq_update_io_thinktime(&cfqq->service_tree->ttime,
 			cfqd->cfq_slice_idle);
 	}
+#ifdef CONFIG_CFQ_GROUP_IOSCHED
+	__cfq_update_io_thinktime(&cfqq->cfqg->ttime, cfqd->cfq_group_idle);
+#endif
 }
 
 static void
@@ -3541,6 +3548,10 @@ static bool cfq_should_wait_busy(struct
 	if (cfqq->cfqg->nr_cfqq > 1)
 		return false;
 
+	/* the only queue in the group, but think time is big */
+	if (cfq_io_thinktime_big(cfqd, &cfqq->cfqg->ttime, true))
+		return false;
+
 	if (cfq_slice_used(cfqq))
 		return true;
 
@@ -3598,6 +3609,10 @@ static void cfq_completed_request(struct
 			cfqd->last_delayed_sync = now;
 	}
 
+#ifdef CONFIG_CFQ_GROUP_IOSCHED
+	cfqq->cfqg->ttime.last_end_request = now;
+#endif
+
 	/*
 	 * If this is the active queue, check if it needs to be expired,
 	 * or if we want to idle in case it has no pending requests.





* Re: [PATCH 3/3] CFQ: add think time check for group
From: Vivek Goyal @ 2011-07-06 15:06 UTC
  To: Shaohua Li; +Cc: lkml, Jens Axboe

On Wed, Jul 06, 2011 at 09:58:40AM +0800, Shaohua Li wrote:
[..]
> > > [global]
> > > runtime=30
> > > direct=1
> > > 
> > > [test1]
> > > cgroup=test1
> > > cgroup_weight=1000
> > > rw=randread
> > > ioengine=libaio
> > > size=500m
> > > runtime=30
> > > directory=/mnt
> > > filename=file1
> > > thinktime=9000
> > > 
> > > [test2]
> > > cgroup=test2
> > > cgroup_weight=1000
> > > rw=randread
> > > ioengine=libaio
> > > size=500m
> > > runtime=30
> > > directory=/mnt
> > > filename=file2
> > > 
> > > 	patched		base
> > > test1	64k		39k
> > > test2	540k		540k
> > > total	604k		578k
> > > 
> > > Group test1 gets much better throughput because it spends less time waiting.

I don't understand this. The think time of group test1 is more than 8ms,
so now we should not be idling on test1. Hence test1 should lose some
share, test2 should gain disk share, and overall throughput should go up.

I am wondering why the throughput of test2 did not go up.

Also, can you run some tests to make sure that the disk shares of regular
workloads (think time less than 8ms) are not impacted?

Thanks
Vivek


* Re: [PATCH 3/3] CFQ: add think time check for group
From: Shaohua Li @ 2011-07-07  6:08 UTC
  To: Vivek Goyal; +Cc: lkml, Jens Axboe

On Wed, 2011-07-06 at 23:06 +0800, Vivek Goyal wrote:
> On Wed, Jul 06, 2011 at 09:58:40AM +0800, Shaohua Li wrote:
> [..]
> > > > [global]
> > > > runtime=30
> > > > direct=1
> > > > 
> > > > [test1]
> > > > cgroup=test1
> > > > cgroup_weight=1000
> > > > rw=randread
> > > > ioengine=libaio
> > > > size=500m
> > > > runtime=30
> > > > directory=/mnt
> > > > filename=file1
> > > > thinktime=9000
> > > > 
> > > > [test2]
> > > > cgroup=test2
> > > > cgroup_weight=1000
> > > > rw=randread
> > > > ioengine=libaio
> > > > size=500m
> > > > runtime=30
> > > > directory=/mnt
> > > > filename=file2
> > > > 
> > > > 	patched		base
> > > > test1	64k		39k
> > > > test2	540k		540k
> > > > total	604k		578k
> > > > 
> > > > Group test1 gets much better throughput because it spends less time waiting.
> 
> I don't understand this. The think time of group test1 is more than 8ms,
> so now we should not be idling on test1. Hence test1 should lose some
> share, test2 should gain disk share, and overall throughput should go up.
> 
> I am wondering why the throughput of test2 did not go up.
Hmm, actually the throughput of test2 is better; maybe I wrote it down
wrong. The test2 throughput is about 548k/s. Sorry.

> Also, can you run some tests to make sure that the disk shares of regular
> workloads (think time less than 8ms) are not impacted?
I tried a think time of 2ms and no think time; there is no difference,
and the result is quite stable.
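
The 2ms run used essentially the same job file as before, with only the
thinktime line of test1 changed (fio takes microseconds), along the
lines of:

	[test1]
	cgroup=test1
	cgroup_weight=1000
	rw=randread
	ioengine=libaio
	size=500m
	runtime=30
	directory=/mnt
	filename=file1
	thinktime=2000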

Thanks,
Shaohua




Thread overview: 5 messages
2011-07-04  5:36 [PATCH 3/3] CFQ: add think time check for group Shaohua Li
2011-07-05 14:31 ` Vivek Goyal
2011-07-06  1:58   ` Shaohua Li
2011-07-06 15:06     ` Vivek Goyal
2011-07-07  6:08       ` Shaohua Li
