linux-kernel.vger.kernel.org archive mirror
* [PATCH 0/2] blk-throttle: Couple of more fixes
@ 2010-12-15 21:07 Vivek Goyal
  2010-12-15 21:07 ` [PATCH 1/2] blk-throttle: process limit change only through one function Vivek Goyal
                   ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: Vivek Goyal @ 2010-12-15 21:07 UTC (permalink / raw)
  To: linux-kernel, jaxboe, paulmck; +Cc: vgoyal, oleg

Hi Jens,

Please find attached a couple more fixes for the blk-throttle code. These are
based on top of the "for-linus" branch of your block tree.

Oleg had pointed out a couple of race conditions in the cgroup weight update
code. I think these races are hard to hit and not disastrous, so I am not too
concerned about pushing these patches into 2.6.37; they can be queued up for
2.6.38.

Paul,

Based on the discussion in the other mail thread, I have used an xchg()-based
implementation for updating and processing limits. Can you please have a look
at whether it is a correct implementation, and whether I need any ACCESS_ONCE()
or barriers somewhere? If this implementation is not correct, I can go back to
the atomic-variable-based implementation you suggested in the other mail
thread. I appreciate the help.
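
For reference, here is a condensed userspace model of the pattern in question
(flag handling only; xchg() is modeled with a GCC builtin, and all names here
are made up for illustration, so treat it as a sketch rather than kernel code):

#include <stdbool.h>
#include <stdio.h>

static bool tg_limits_changed;          /* per-group "limits changed" flag */
static bool td_limits_changed;          /* per-device "limits changed" flag */
static unsigned long long tg_read_bps;  /* the limit being updated */

/* Model of kernel xchg(): atomically swap in a value, return the old one. */
#define model_xchg(ptr, val) __atomic_exchange_n((ptr), (val), __ATOMIC_SEQ_CST)

/* Writer side: a cgroup write updates the limit and marks group/device dirty. */
static void update_read_bps(unsigned long long read_bps)
{
        tg_read_bps = read_bps;                 /* plain store of the new value */
        model_xchg(&tg_limits_changed, true);
        model_xchg(&td_limits_changed, true);
        /* the real code schedules delayed work here */
}

/* Reader side: the worker consumes the flags and picks up the new limit. */
static void process_limit_change(void)
{
        if (!td_limits_changed)
                return;
        model_xchg(&td_limits_changed, false);

        if (!model_xchg(&tg_limits_changed, false))
                return;
        /* model_xchg() is fully ordered here, so the new limit is visible */
        printf("requeue group, read_bps=%llu\n", tg_read_bps);
}

int main(void)
{
        update_read_bps(1024 * 1024);
        process_limit_change();
        return 0;
}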

Thanks
Vivek

Vivek Goyal (2):
  blk-throttle: process limit change only through one function
  blk-throttle: Some cleanups and race fixes in limit update code

 block/blk-throttle.c |  104 ++++++++++++++++++++------------------------------
 1 files changed, 41 insertions(+), 63 deletions(-)


* [PATCH 1/2] blk-throttle: process limit change only through one function
  2010-12-15 21:07 [PATCH 0/2] blk-throttle: Couple of more fixes Vivek Goyal
@ 2010-12-15 21:07 ` Vivek Goyal
  2010-12-15 21:07 ` [PATCH 2/2] blk-throttle: Some cleanups and race fixes in limit update code Vivek Goyal
  2010-12-21 16:05 ` [PATCH 0/2] blk-throttle: Couple of more fixes Vivek Goyal
  2 siblings, 0 replies; 13+ messages in thread
From: Vivek Goyal @ 2010-12-15 21:07 UTC (permalink / raw)
  To: linux-kernel, jaxboe, paulmck; +Cc: vgoyal, oleg

o With the help of the cgroup interface one can update the bps/iops limits
  of an existing group. Once the limits are updated, a thread is woken up to
  check whether some blocked group needs recalculation based on the new limits
  and needs to be requeued.

o There was also a piece of code that checked for a group limit update when a
  fresh bio came in. This patch gets rid of that code and keeps limit-change
  processing in one place, throtl_process_limit_change(). That keeps the code
  simple and easy to understand.

Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
---
 block/blk-throttle.c |    8 +-------
 1 files changed, 1 insertions(+), 7 deletions(-)

diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index 381b09b..91bd444 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -998,14 +998,8 @@ int blk_throtl_bio(struct request_queue *q, struct bio **biop)
 		/*
 		 * There is already another bio queued in same dir. No
 		 * need to update dispatch time.
-		 * Still update the disptime if rate limits on this group
-		 * were changed.
 		 */
-		if (!tg->limits_changed)
-			update_disptime = false;
-		else
-			tg->limits_changed = false;
-
+		update_disptime = false;
 		goto queue_bio;
 	}
 
-- 
1.7.1


* [PATCH 2/2] blk-throttle: Some cleanups and race fixes in limit update code
  2010-12-15 21:07 [PATCH 0/2] blk-throttle: Couple of more fixes Vivek Goyal
  2010-12-15 21:07 ` [PATCH 1/2] blk-throttle: process limit change only through one function Vivek Goyal
@ 2010-12-15 21:07 ` Vivek Goyal
  2010-12-16 14:49   ` Oleg Nesterov
                     ` (2 more replies)
  2010-12-21 16:05 ` [PATCH 0/2] blk-throttle: Couple of more fixes Vivek Goyal
  2 siblings, 3 replies; 13+ messages in thread
From: Vivek Goyal @ 2010-12-15 21:07 UTC (permalink / raw)
  To: linux-kernel, jaxboe, paulmck; +Cc: vgoyal, oleg

o When throttle group limits are updated through cgroups, a thread is woken
  up to process these updates. While reviewing that code, Oleg noted that a
  couple of race conditions existed and also suggested that the code could
  be simplified.

o This patch fixes the races and simplifies the code based on Oleg's
  suggestions.
        - Use xchg().
        - Introduced a common function throtl_update_blkio_group_common()
          which is shared now by all iops/bps update functions.

Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
---
 block/blk-throttle.c |   96 +++++++++++++++++++++-----------------------------
 1 files changed, 40 insertions(+), 56 deletions(-)

diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index 91bd444..549bc8e 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -97,7 +97,7 @@ struct throtl_data
 	/* Work for dispatching throttled bios */
 	struct delayed_work throtl_work;
 
-	atomic_t limits_changed;
+	bool limits_changed;
 };
 
 enum tg_state_flags {
@@ -188,6 +188,7 @@ static struct throtl_grp * throtl_find_alloc_tg(struct throtl_data *td,
 	RB_CLEAR_NODE(&tg->rb_node);
 	bio_list_init(&tg->bio_lists[0]);
 	bio_list_init(&tg->bio_lists[1]);
+	td->limits_changed = false;
 
 	/*
 	 * Take the initial reference that will be released on destroy
@@ -725,34 +726,27 @@ static void throtl_process_limit_change(struct throtl_data *td)
 	struct throtl_grp *tg;
 	struct hlist_node *pos, *n;
 
-	if (!atomic_read(&td->limits_changed))
+	if (!td->limits_changed)
 		return;
 
-	throtl_log(td, "limit changed =%d", atomic_read(&td->limits_changed));
+	xchg(&td->limits_changed, false);
 
-	/*
-	 * Make sure updates from throtl_update_blkio_group_read_bps() group
-	 * of functions to tg->limits_changed are visible. We do not
-	 * want update td->limits_changed to be visible but update to
-	 * tg->limits_changed not being visible yet on this cpu. Hence
-	 * the read barrier.
-	 */
-	smp_rmb();
+	throtl_log(td, "limits changed");
 
 	hlist_for_each_entry_safe(tg, pos, n, &td->tg_list, tg_node) {
-		if (throtl_tg_on_rr(tg) && tg->limits_changed) {
-			throtl_log_tg(td, tg, "limit change rbps=%llu wbps=%llu"
-				" riops=%u wiops=%u", tg->bps[READ],
-				tg->bps[WRITE], tg->iops[READ],
-				tg->iops[WRITE]);
+		if (!tg->limits_changed)
+			continue;
+
+		if (!xchg(&tg->limits_changed, false))
+			continue;
+
+		throtl_log_tg(td, tg, "limit change rbps=%llu wbps=%llu"
+			" riops=%u wiops=%u", tg->bps[READ], tg->bps[WRITE],
+			tg->iops[READ], tg->iops[WRITE]);
+
+		if (throtl_tg_on_rr(tg))
 			tg_update_disptime(td, tg);
-			tg->limits_changed = false;
-		}
 	}
-
-	smp_mb__before_atomic_dec();
-	atomic_dec(&td->limits_changed);
-	smp_mb__after_atomic_dec();
 }
 
 /* Dispatch throttled bios. Should be called without queue lock held. */
@@ -887,6 +881,15 @@ void throtl_unlink_blkio_group(void *key, struct blkio_group *blkg)
 	spin_unlock_irqrestore(td->queue->queue_lock, flags);
 }
 
+static void throtl_update_blkio_group_common(struct throtl_data *td,
+				struct throtl_grp *tg)
+{
+	xchg(&tg->limits_changed, true);
+	xchg(&td->limits_changed, true);
+	/* Schedule a work now to process the limit change */
+	throtl_schedule_delayed_work(td->queue, 0);
+}
+
 /*
  * For all update functions, key should be a valid pointer because these
  * update functions are called under blkcg_lock, that means, blkg is
@@ -900,61 +903,40 @@ static void throtl_update_blkio_group_read_bps(void *key,
 				struct blkio_group *blkg, u64 read_bps)
 {
 	struct throtl_data *td = key;
+	struct throtl_grp *tg = tg_of_blkg(blkg);
 
-	tg_of_blkg(blkg)->bps[READ] = read_bps;
-	/* Make sure read_bps is updated before setting limits_changed */
-	smp_wmb();
-	tg_of_blkg(blkg)->limits_changed = true;
-
-	/* Make sure tg->limits_changed is updated before td->limits_changed */
-	smp_mb__before_atomic_inc();
-	atomic_inc(&td->limits_changed);
-	smp_mb__after_atomic_inc();
-
-	/* Schedule a work now to process the limit change */
-	throtl_schedule_delayed_work(td->queue, 0);
+	tg->bps[READ] = read_bps;
+	throtl_update_blkio_group_common(td, tg);
 }
 
 static void throtl_update_blkio_group_write_bps(void *key,
 				struct blkio_group *blkg, u64 write_bps)
 {
 	struct throtl_data *td = key;
+	struct throtl_grp *tg = tg_of_blkg(blkg);
 
-	tg_of_blkg(blkg)->bps[WRITE] = write_bps;
-	smp_wmb();
-	tg_of_blkg(blkg)->limits_changed = true;
-	smp_mb__before_atomic_inc();
-	atomic_inc(&td->limits_changed);
-	smp_mb__after_atomic_inc();
-	throtl_schedule_delayed_work(td->queue, 0);
+	tg->bps[WRITE] = write_bps;
+	throtl_update_blkio_group_common(td, tg);
 }
 
 static void throtl_update_blkio_group_read_iops(void *key,
 			struct blkio_group *blkg, unsigned int read_iops)
 {
 	struct throtl_data *td = key;
+	struct throtl_grp *tg = tg_of_blkg(blkg);
 
-	tg_of_blkg(blkg)->iops[READ] = read_iops;
-	smp_wmb();
-	tg_of_blkg(blkg)->limits_changed = true;
-	smp_mb__before_atomic_inc();
-	atomic_inc(&td->limits_changed);
-	smp_mb__after_atomic_inc();
-	throtl_schedule_delayed_work(td->queue, 0);
+	tg->iops[READ] = read_iops;
+	throtl_update_blkio_group_common(td, tg);
 }
 
 static void throtl_update_blkio_group_write_iops(void *key,
 			struct blkio_group *blkg, unsigned int write_iops)
 {
 	struct throtl_data *td = key;
+	struct throtl_grp *tg = tg_of_blkg(blkg);
 
-	tg_of_blkg(blkg)->iops[WRITE] = write_iops;
-	smp_wmb();
-	tg_of_blkg(blkg)->limits_changed = true;
-	smp_mb__before_atomic_inc();
-	atomic_inc(&td->limits_changed);
-	smp_mb__after_atomic_inc();
-	throtl_schedule_delayed_work(td->queue, 0);
+	tg->iops[WRITE] = write_iops;
+	throtl_update_blkio_group_common(td, tg);
 }
 
 void throtl_shutdown_timer_wq(struct request_queue *q)
@@ -1001,6 +983,7 @@ int blk_throtl_bio(struct request_queue *q, struct bio **biop)
 		 */
 		update_disptime = false;
 		goto queue_bio;
+
 	}
 
 	/* Bio is with-in rate limit of group */
@@ -1041,7 +1024,7 @@ int blk_throtl_init(struct request_queue *q)
 
 	INIT_HLIST_HEAD(&td->tg_list);
 	td->tg_service_tree = THROTL_RB_ROOT;
-	atomic_set(&td->limits_changed, 0);
+	td->limits_changed = false;
 
 	/* Init root group */
 	tg = &td->root_tg;
@@ -1053,6 +1036,7 @@ int blk_throtl_init(struct request_queue *q)
 	/* Practically unlimited BW */
 	tg->bps[0] = tg->bps[1] = -1;
 	tg->iops[0] = tg->iops[1] = -1;
+	td->limits_changed = false;
 
 	/*
 	 * Set root group reference to 2. One reference will be dropped when
-- 
1.7.1


* Re: [PATCH 2/2] blk-throttle: Some cleanups and race fixes in limit update code
  2010-12-15 21:07 ` [PATCH 2/2] blk-throttle: Some cleanups and race fixes in limit update code Vivek Goyal
@ 2010-12-16 14:49   ` Oleg Nesterov
  2010-12-17 22:28   ` Paul E. McKenney
  2011-03-29 23:14   ` [PATCH] blk-throttle: don't call xchg on bool Andreas Schwab
  2 siblings, 0 replies; 13+ messages in thread
From: Oleg Nesterov @ 2010-12-16 14:49 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: linux-kernel, jaxboe, paulmck

On 12/15, Vivek Goyal wrote:
>
>         - Use xchg().
>         - Introduced a common function throtl_update_blkio_group_common()
>           which is shared now by all iops/bps update functions.

ACK


* Re: [PATCH 2/2] blk-throttle: Some cleanups and race fixes in limit update code
  2010-12-15 21:07 ` [PATCH 2/2] blk-throttle: Some cleanups and race fixes in limit update code Vivek Goyal
  2010-12-16 14:49   ` Oleg Nesterov
@ 2010-12-17 22:28   ` Paul E. McKenney
  2010-12-17 22:34     ` Vivek Goyal
  2011-03-29 23:14   ` [PATCH] blk-throttle: don't call xchg on bool Andreas Schwab
  2 siblings, 1 reply; 13+ messages in thread
From: Paul E. McKenney @ 2010-12-17 22:28 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: linux-kernel, jaxboe, oleg

On Wed, Dec 15, 2010 at 04:07:35PM -0500, Vivek Goyal wrote:
> o When throttle group limits are updated through cgroups, a thread is woken
>   up to process these updates. While reviewing that code, Oleg noted that a
>   couple of race conditions existed and also suggested that the code could
>   be simplified.
> 
> o This patch fixes the races and simplifies the code based on Oleg's
>   suggestions.
>         - Use xchg().
>         - Introduced a common function throtl_update_blkio_group_common()
>           which is shared now by all iops/bps update functions.

OK, this looks good at the moment.  The HW guys are still tearing their
hair out, but this approach does appear to have an excellent chance of
turning out to be safe.  ;-)

A few questions below, but assuming the answers turn out to be what
I believe them to be:

Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

> Reviewed-by: Oleg Nesterov <oleg@redhat.com>
> Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
> ---
>  block/blk-throttle.c |   96 +++++++++++++++++++++-----------------------------
>  1 files changed, 40 insertions(+), 56 deletions(-)
> 
> diff --git a/block/blk-throttle.c b/block/blk-throttle.c
> index 91bd444..549bc8e 100644
> --- a/block/blk-throttle.c
> +++ b/block/blk-throttle.c
> @@ -97,7 +97,7 @@ struct throtl_data
>  	/* Work for dispatching throttled bios */
>  	struct delayed_work throtl_work;
> 
> -	atomic_t limits_changed;
> +	bool limits_changed;
>  };
> 
>  enum tg_state_flags {
> @@ -188,6 +188,7 @@ static struct throtl_grp * throtl_find_alloc_tg(struct throtl_data *td,
>  	RB_CLEAR_NODE(&tg->rb_node);
>  	bio_list_init(&tg->bio_lists[0]);
>  	bio_list_init(&tg->bio_lists[1]);
> +	td->limits_changed = false;

This is at initialization time, correct?  No other CPUs have access
at this point, right?

>  	/*
>  	 * Take the initial reference that will be released on destroy
> @@ -725,34 +726,27 @@ static void throtl_process_limit_change(struct throtl_data *td)
>  	struct throtl_grp *tg;
>  	struct hlist_node *pos, *n;
> 
> -	if (!atomic_read(&td->limits_changed))
> +	if (!td->limits_changed)
>  		return;
> 
> -	throtl_log(td, "limit changed =%d", atomic_read(&td->limits_changed));
> +	xchg(&td->limits_changed, false);
> 
> -	/*
> -	 * Make sure updates from throtl_update_blkio_group_read_bps() group
> -	 * of functions to tg->limits_changed are visible. We do not
> -	 * want update td->limits_changed to be visible but update to
> -	 * tg->limits_changed not being visible yet on this cpu. Hence
> -	 * the read barrier.
> -	 */
> -	smp_rmb();
> +	throtl_log(td, "limits changed");
> 
>  	hlist_for_each_entry_safe(tg, pos, n, &td->tg_list, tg_node) {
> -		if (throtl_tg_on_rr(tg) && tg->limits_changed) {
> -			throtl_log_tg(td, tg, "limit change rbps=%llu wbps=%llu"
> -				" riops=%u wiops=%u", tg->bps[READ],
> -				tg->bps[WRITE], tg->iops[READ],
> -				tg->iops[WRITE]);
> +		if (!tg->limits_changed)
> +			continue;
> +
> +		if (!xchg(&tg->limits_changed, false))
> +			continue;
> +
> +		throtl_log_tg(td, tg, "limit change rbps=%llu wbps=%llu"
> +			" riops=%u wiops=%u", tg->bps[READ], tg->bps[WRITE],
> +			tg->iops[READ], tg->iops[WRITE]);
> +
> +		if (throtl_tg_on_rr(tg))
>  			tg_update_disptime(td, tg);
> -			tg->limits_changed = false;
> -		}
>  	}
> -
> -	smp_mb__before_atomic_dec();
> -	atomic_dec(&td->limits_changed);
> -	smp_mb__after_atomic_dec();
>  }
> 
>  /* Dispatch throttled bios. Should be called without queue lock held. */
> @@ -887,6 +881,15 @@ void throtl_unlink_blkio_group(void *key, struct blkio_group *blkg)
>  	spin_unlock_irqrestore(td->queue->queue_lock, flags);
>  }
> 
> +static void throtl_update_blkio_group_common(struct throtl_data *td,
> +				struct throtl_grp *tg)
> +{
> +	xchg(&tg->limits_changed, true);
> +	xchg(&td->limits_changed, true);
> +	/* Schedule a work now to process the limit change */
> +	throtl_schedule_delayed_work(td->queue, 0);
> +}
> +
>  /*
>   * For all update functions, key should be a valid pointer because these
>   * update functions are called under blkcg_lock, that means, blkg is
> @@ -900,61 +903,40 @@ static void throtl_update_blkio_group_read_bps(void *key,
>  				struct blkio_group *blkg, u64 read_bps)
>  {
>  	struct throtl_data *td = key;
> +	struct throtl_grp *tg = tg_of_blkg(blkg);
> 
> -	tg_of_blkg(blkg)->bps[READ] = read_bps;
> -	/* Make sure read_bps is updated before setting limits_changed */
> -	smp_wmb();
> -	tg_of_blkg(blkg)->limits_changed = true;
> -
> -	/* Make sure tg->limits_changed is updated before td->limits_changed */
> -	smp_mb__before_atomic_inc();
> -	atomic_inc(&td->limits_changed);
> -	smp_mb__after_atomic_inc();
> -
> -	/* Schedule a work now to process the limit change */
> -	throtl_schedule_delayed_work(td->queue, 0);
> +	tg->bps[READ] = read_bps;
> +	throtl_update_blkio_group_common(td, tg);
>  }
> 
>  static void throtl_update_blkio_group_write_bps(void *key,
>  				struct blkio_group *blkg, u64 write_bps)
>  {
>  	struct throtl_data *td = key;
> +	struct throtl_grp *tg = tg_of_blkg(blkg);
> 
> -	tg_of_blkg(blkg)->bps[WRITE] = write_bps;
> -	smp_wmb();
> -	tg_of_blkg(blkg)->limits_changed = true;
> -	smp_mb__before_atomic_inc();
> -	atomic_inc(&td->limits_changed);
> -	smp_mb__after_atomic_inc();
> -	throtl_schedule_delayed_work(td->queue, 0);
> +	tg->bps[WRITE] = write_bps;
> +	throtl_update_blkio_group_common(td, tg);
>  }
> 
>  static void throtl_update_blkio_group_read_iops(void *key,
>  			struct blkio_group *blkg, unsigned int read_iops)
>  {
>  	struct throtl_data *td = key;
> +	struct throtl_grp *tg = tg_of_blkg(blkg);
> 
> -	tg_of_blkg(blkg)->iops[READ] = read_iops;
> -	smp_wmb();
> -	tg_of_blkg(blkg)->limits_changed = true;
> -	smp_mb__before_atomic_inc();
> -	atomic_inc(&td->limits_changed);
> -	smp_mb__after_atomic_inc();
> -	throtl_schedule_delayed_work(td->queue, 0);
> +	tg->iops[READ] = read_iops;
> +	throtl_update_blkio_group_common(td, tg);
>  }
> 
>  static void throtl_update_blkio_group_write_iops(void *key,
>  			struct blkio_group *blkg, unsigned int write_iops)
>  {
>  	struct throtl_data *td = key;
> +	struct throtl_grp *tg = tg_of_blkg(blkg);
> 
> -	tg_of_blkg(blkg)->iops[WRITE] = write_iops;
> -	smp_wmb();
> -	tg_of_blkg(blkg)->limits_changed = true;
> -	smp_mb__before_atomic_inc();
> -	atomic_inc(&td->limits_changed);
> -	smp_mb__after_atomic_inc();
> -	throtl_schedule_delayed_work(td->queue, 0);
> +	tg->iops[WRITE] = write_iops;
> +	throtl_update_blkio_group_common(td, tg);
>  }
> 
>  void throtl_shutdown_timer_wq(struct request_queue *q)
> @@ -1001,6 +983,7 @@ int blk_throtl_bio(struct request_queue *q, struct bio **biop)
>  		 */
>  		update_disptime = false;
>  		goto queue_bio;
> +
>  	}
> 
>  	/* Bio is with-in rate limit of group */
> @@ -1041,7 +1024,7 @@ int blk_throtl_init(struct request_queue *q)
> 
>  	INIT_HLIST_HEAD(&td->tg_list);
>  	td->tg_service_tree = THROTL_RB_ROOT;
> -	atomic_set(&td->limits_changed, 0);
> +	td->limits_changed = false;

And the name leads me to believe that there are no other CPUs accessing
things at this point as well.  Correct?

>  	/* Init root group */
>  	tg = &td->root_tg;
> @@ -1053,6 +1036,7 @@ int blk_throtl_init(struct request_queue *q)
>  	/* Practically unlimited BW */
>  	tg->bps[0] = tg->bps[1] = -1;
>  	tg->iops[0] = tg->iops[1] = -1;
> +	td->limits_changed = false;

Ditto?

>  	/*
>  	 * Set root group reference to 2. One reference will be dropped when
> -- 
> 1.7.1
> 

* Re: [PATCH 2/2] blk-throttle: Some cleanups and race fixes in limit update code
  2010-12-17 22:28   ` Paul E. McKenney
@ 2010-12-17 22:34     ` Vivek Goyal
  0 siblings, 0 replies; 13+ messages in thread
From: Vivek Goyal @ 2010-12-17 22:34 UTC (permalink / raw)
  To: Paul E. McKenney; +Cc: linux-kernel, jaxboe, oleg

On Fri, Dec 17, 2010 at 02:28:05PM -0800, Paul E. McKenney wrote:
> On Wed, Dec 15, 2010 at 04:07:35PM -0500, Vivek Goyal wrote:
> > o When throttle group limits are updated through cgroups, a thread is woken
> >   up to process these updates. While reviewing that code, Oleg noted that a
> >   couple of race conditions existed and also suggested that the code could
> >   be simplified.
> > 
> > o This patch fixes the races and simplifies the code based on Oleg's
> >   suggestions.
> >         - Use xchg().
> >         - Introduced a common function throtl_update_blkio_group_common()
> >           which is shared now by all iops/bps update functions.
> 
> OK, this looks good at the moment.  The HW guys are still tearing their
> hair out, but this approach does appear to have an excellent chance of
> turning out to be safe.  ;-)
> 
> A few questions below, but assuming the answers turn out to be what
> I believe them to be:
> 
> Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

Thanks a lot for the review, Paul. Yes, all the assignments below without
xchg() are at initialization time, and no other CPU is accessing them.

Thanks
Vivek


> 
> > Reviewed-by: Oleg Nesterov <oleg@redhat.com>
> > Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
> > ---
> >  block/blk-throttle.c |   96 +++++++++++++++++++++-----------------------------
> >  1 files changed, 40 insertions(+), 56 deletions(-)
> > 
> > diff --git a/block/blk-throttle.c b/block/blk-throttle.c
> > index 91bd444..549bc8e 100644
> > --- a/block/blk-throttle.c
> > +++ b/block/blk-throttle.c
> > @@ -97,7 +97,7 @@ struct throtl_data
> >  	/* Work for dispatching throttled bios */
> >  	struct delayed_work throtl_work;
> > 
> > -	atomic_t limits_changed;
> > +	bool limits_changed;
> >  };
> > 
> >  enum tg_state_flags {
> > @@ -188,6 +188,7 @@ static struct throtl_grp * throtl_find_alloc_tg(struct throtl_data *td,
> >  	RB_CLEAR_NODE(&tg->rb_node);
> >  	bio_list_init(&tg->bio_lists[0]);
> >  	bio_list_init(&tg->bio_lists[1]);
> > +	td->limits_changed = false;
> 
> This is at initialization time, correct?  No other CPUs have access
> at this point, right?
> 
> >  	/*
> >  	 * Take the initial reference that will be released on destroy
> > @@ -725,34 +726,27 @@ static void throtl_process_limit_change(struct throtl_data *td)
> >  	struct throtl_grp *tg;
> >  	struct hlist_node *pos, *n;
> > 
> > -	if (!atomic_read(&td->limits_changed))
> > +	if (!td->limits_changed)
> >  		return;
> > 
> > -	throtl_log(td, "limit changed =%d", atomic_read(&td->limits_changed));
> > +	xchg(&td->limits_changed, false);
> > 
> > -	/*
> > -	 * Make sure updates from throtl_update_blkio_group_read_bps() group
> > -	 * of functions to tg->limits_changed are visible. We do not
> > -	 * want update td->limits_changed to be visible but update to
> > -	 * tg->limits_changed not being visible yet on this cpu. Hence
> > -	 * the read barrier.
> > -	 */
> > -	smp_rmb();
> > +	throtl_log(td, "limits changed");
> > 
> >  	hlist_for_each_entry_safe(tg, pos, n, &td->tg_list, tg_node) {
> > -		if (throtl_tg_on_rr(tg) && tg->limits_changed) {
> > -			throtl_log_tg(td, tg, "limit change rbps=%llu wbps=%llu"
> > -				" riops=%u wiops=%u", tg->bps[READ],
> > -				tg->bps[WRITE], tg->iops[READ],
> > -				tg->iops[WRITE]);
> > +		if (!tg->limits_changed)
> > +			continue;
> > +
> > +		if (!xchg(&tg->limits_changed, false))
> > +			continue;
> > +
> > +		throtl_log_tg(td, tg, "limit change rbps=%llu wbps=%llu"
> > +			" riops=%u wiops=%u", tg->bps[READ], tg->bps[WRITE],
> > +			tg->iops[READ], tg->iops[WRITE]);
> > +
> > +		if (throtl_tg_on_rr(tg))
> >  			tg_update_disptime(td, tg);
> > -			tg->limits_changed = false;
> > -		}
> >  	}
> > -
> > -	smp_mb__before_atomic_dec();
> > -	atomic_dec(&td->limits_changed);
> > -	smp_mb__after_atomic_dec();
> >  }
> > 
> >  /* Dispatch throttled bios. Should be called without queue lock held. */
> > @@ -887,6 +881,15 @@ void throtl_unlink_blkio_group(void *key, struct blkio_group *blkg)
> >  	spin_unlock_irqrestore(td->queue->queue_lock, flags);
> >  }
> > 
> > +static void throtl_update_blkio_group_common(struct throtl_data *td,
> > +				struct throtl_grp *tg)
> > +{
> > +	xchg(&tg->limits_changed, true);
> > +	xchg(&td->limits_changed, true);
> > +	/* Schedule a work now to process the limit change */
> > +	throtl_schedule_delayed_work(td->queue, 0);
> > +}
> > +
> >  /*
> >   * For all update functions, key should be a valid pointer because these
> >   * update functions are called under blkcg_lock, that means, blkg is
> > @@ -900,61 +903,40 @@ static void throtl_update_blkio_group_read_bps(void *key,
> >  				struct blkio_group *blkg, u64 read_bps)
> >  {
> >  	struct throtl_data *td = key;
> > +	struct throtl_grp *tg = tg_of_blkg(blkg);
> > 
> > -	tg_of_blkg(blkg)->bps[READ] = read_bps;
> > -	/* Make sure read_bps is updated before setting limits_changed */
> > -	smp_wmb();
> > -	tg_of_blkg(blkg)->limits_changed = true;
> > -
> > -	/* Make sure tg->limits_changed is updated before td->limits_changed */
> > -	smp_mb__before_atomic_inc();
> > -	atomic_inc(&td->limits_changed);
> > -	smp_mb__after_atomic_inc();
> > -
> > -	/* Schedule a work now to process the limit change */
> > -	throtl_schedule_delayed_work(td->queue, 0);
> > +	tg->bps[READ] = read_bps;
> > +	throtl_update_blkio_group_common(td, tg);
> >  }
> > 
> >  static void throtl_update_blkio_group_write_bps(void *key,
> >  				struct blkio_group *blkg, u64 write_bps)
> >  {
> >  	struct throtl_data *td = key;
> > +	struct throtl_grp *tg = tg_of_blkg(blkg);
> > 
> > -	tg_of_blkg(blkg)->bps[WRITE] = write_bps;
> > -	smp_wmb();
> > -	tg_of_blkg(blkg)->limits_changed = true;
> > -	smp_mb__before_atomic_inc();
> > -	atomic_inc(&td->limits_changed);
> > -	smp_mb__after_atomic_inc();
> > -	throtl_schedule_delayed_work(td->queue, 0);
> > +	tg->bps[WRITE] = write_bps;
> > +	throtl_update_blkio_group_common(td, tg);
> >  }
> > 
> >  static void throtl_update_blkio_group_read_iops(void *key,
> >  			struct blkio_group *blkg, unsigned int read_iops)
> >  {
> >  	struct throtl_data *td = key;
> > +	struct throtl_grp *tg = tg_of_blkg(blkg);
> > 
> > -	tg_of_blkg(blkg)->iops[READ] = read_iops;
> > -	smp_wmb();
> > -	tg_of_blkg(blkg)->limits_changed = true;
> > -	smp_mb__before_atomic_inc();
> > -	atomic_inc(&td->limits_changed);
> > -	smp_mb__after_atomic_inc();
> > -	throtl_schedule_delayed_work(td->queue, 0);
> > +	tg->iops[READ] = read_iops;
> > +	throtl_update_blkio_group_common(td, tg);
> >  }
> > 
> >  static void throtl_update_blkio_group_write_iops(void *key,
> >  			struct blkio_group *blkg, unsigned int write_iops)
> >  {
> >  	struct throtl_data *td = key;
> > +	struct throtl_grp *tg = tg_of_blkg(blkg);
> > 
> > -	tg_of_blkg(blkg)->iops[WRITE] = write_iops;
> > -	smp_wmb();
> > -	tg_of_blkg(blkg)->limits_changed = true;
> > -	smp_mb__before_atomic_inc();
> > -	atomic_inc(&td->limits_changed);
> > -	smp_mb__after_atomic_inc();
> > -	throtl_schedule_delayed_work(td->queue, 0);
> > +	tg->iops[WRITE] = write_iops;
> > +	throtl_update_blkio_group_common(td, tg);
> >  }
> > 
> >  void throtl_shutdown_timer_wq(struct request_queue *q)
> > @@ -1001,6 +983,7 @@ int blk_throtl_bio(struct request_queue *q, struct bio **biop)
> >  		 */
> >  		update_disptime = false;
> >  		goto queue_bio;
> > +
> >  	}
> > 
> >  	/* Bio is with-in rate limit of group */
> > @@ -1041,7 +1024,7 @@ int blk_throtl_init(struct request_queue *q)
> > 
> >  	INIT_HLIST_HEAD(&td->tg_list);
> >  	td->tg_service_tree = THROTL_RB_ROOT;
> > -	atomic_set(&td->limits_changed, 0);
> > +	td->limits_changed = false;
> 
> And the name leads me to believe that there are no other CPUs accessing
> things at this point as well.  Correct?
> 
> >  	/* Init root group */
> >  	tg = &td->root_tg;
> > @@ -1053,6 +1036,7 @@ int blk_throtl_init(struct request_queue *q)
> >  	/* Practically unlimited BW */
> >  	tg->bps[0] = tg->bps[1] = -1;
> >  	tg->iops[0] = tg->iops[1] = -1;
> > +	td->limits_changed = false;
> 
> Ditto?
> 
> >  	/*
> >  	 * Set root group reference to 2. One reference will be dropped when
> > -- 
> > 1.7.1
> > 

* Re: [PATCH 0/2] blk-throttle: Couple of more fixes
  2010-12-15 21:07 [PATCH 0/2] blk-throttle: Couple of more fixes Vivek Goyal
  2010-12-15 21:07 ` [PATCH 1/2] blk-throttle: process limit change only through one function Vivek Goyal
  2010-12-15 21:07 ` [PATCH 2/2] blk-throttle: Some cleanups and race fixes in limit update code Vivek Goyal
@ 2010-12-21 16:05 ` Vivek Goyal
  2 siblings, 0 replies; 13+ messages in thread
From: Vivek Goyal @ 2010-12-21 16:05 UTC (permalink / raw)
  To: jaxboe; +Cc: oleg, paulmck, linux-kernel

On Wed, Dec 15, 2010 at 04:07:33PM -0500, Vivek Goyal wrote:
> Hi Jens,
> 
> Please find attached a couple more fixes for the blk-throttle code. These are
> based on top of the "for-linus" branch of your block tree.
> 

Hi Jens,

Do you have any concerns with these two fixes? Oleg and Paul have now acked
them too. Could you please apply these for 2.6.38?

Thanks
Vivek

> Oleg had pointed out a couple of race conditions in the cgroup weight update
> code. I think these races are hard to hit and not disastrous, so I am not too
> concerned about pushing these patches into 2.6.37; they can be queued up for
> 2.6.38.
> 
> Paul,
> 
> Based on the discussion in the other mail thread, I have used an xchg()-based
> implementation for updating and processing limits. Can you please have a look
> at whether it is a correct implementation, and whether I need any ACCESS_ONCE()
> or barriers somewhere? If this implementation is not correct, I can go back to
> the atomic-variable-based implementation you suggested in the other mail
> thread. I appreciate the help.
> 
> Thanks
> Vivek
> 
> Vivek Goyal (2):
>   blk-throttle: process limit change only through one function
>   blk-throttle: Some cleanups and race fixes in limit update code
> 
>  block/blk-throttle.c |  104 ++++++++++++++++++++------------------------------
>  1 files changed, 41 insertions(+), 63 deletions(-)

* [PATCH] blk-throttle: don't call xchg on bool
  2010-12-15 21:07 ` [PATCH 2/2] blk-throttle: Some cleanups and race fixes in limit update code Vivek Goyal
  2010-12-16 14:49   ` Oleg Nesterov
  2010-12-17 22:28   ` Paul E. McKenney
@ 2011-03-29 23:14   ` Andreas Schwab
  2011-03-30 10:21     ` Jens Axboe
  2011-03-30 13:19     ` Vivek Goyal
  2 siblings, 2 replies; 13+ messages in thread
From: Andreas Schwab @ 2011-03-29 23:14 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: linux-kernel, jaxboe, paulmck, oleg

xchg does not work portably with smaller than 32bit types.

Signed-off-by: Andreas Schwab <schwab@linux-m68k.org>
---
 block/blk-throttle.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index 5352bda..6c98cfe 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -77,7 +77,7 @@ struct throtl_grp {
 	unsigned long slice_end[2];
 
 	/* Some throttle limits got updated for the group */
-	bool limits_changed;
+	int limits_changed;
 };
 
 struct throtl_data
@@ -102,7 +102,7 @@ struct throtl_data
 	/* Work for dispatching throttled bios */
 	struct delayed_work throtl_work;
 
-	bool limits_changed;
+	int limits_changed;
 };
 
 enum tg_state_flags {
-- 
1.7.4.2


-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."

* Re: [PATCH] blk-throttle: don't call xchg on bool
  2011-03-29 23:14   ` [PATCH] blk-throttle: don't call xchg on bool Andreas Schwab
@ 2011-03-30 10:21     ` Jens Axboe
  2011-03-30 13:19     ` Vivek Goyal
  1 sibling, 0 replies; 13+ messages in thread
From: Jens Axboe @ 2011-03-30 10:21 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: Vivek Goyal, linux-kernel, paulmck, oleg

On 2011-03-30 01:14, Andreas Schwab wrote:
> xchg does not work portably with smaller than 32bit types.

Thanks Andreas, I guess if people look at the x86 implementation then
that's a worry.
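
For context, x86's xchg() switches on the operand size and includes a byte
case, so xchg() on a bool builds and behaves correctly there, which is what
hides the portability problem. A rough, illustrative stand-in (not the actual
arch header; only the 1- and 4-byte cases are shown):

#define model_x86_xchg(ptr, val)                                        \
({                                                                      \
        typeof(*(ptr)) __x = (val);                                     \
        switch (sizeof(*(ptr))) {                                       \
        case 1: /* byte case: what a bool ends up using */              \
                asm volatile("xchgb %b0, %1"                            \
                             : "+q" (__x), "+m" (*(ptr)) : : "memory"); \
                break;                                                  \
        case 4: /* 32-bit case: the only one some arches provide */     \
                asm volatile("xchgl %0, %1"                             \
                             : "+r" (__x), "+m" (*(ptr)) : : "memory"); \
                break;                                                  \
        }                                                               \
        __x; /* old value */                                            \
})

So "bool b = true; model_x86_xchg(&b, false);" is fine on x86, while an arch
whose xchg() only implements the 4-byte case cannot handle the bool at all.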

-- 
Jens Axboe


* Re: [PATCH] blk-throttle: don't call xchg on bool
  2011-03-29 23:14   ` [PATCH] blk-throttle: don't call xchg on bool Andreas Schwab
  2011-03-30 10:21     ` Jens Axboe
@ 2011-03-30 13:19     ` Vivek Goyal
  2011-03-30 13:51       ` Andreas Schwab
  2011-03-30 16:06       ` Andreas Schwab
  1 sibling, 2 replies; 13+ messages in thread
From: Vivek Goyal @ 2011-03-30 13:19 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: linux-kernel, jaxboe, paulmck, oleg

On Wed, Mar 30, 2011 at 01:14:50AM +0200, Andreas Schwab wrote:
> xchg does not work portably with smaller than 32bit types.
> 

Hi Andreas,

Which arch creates the problem and what's the issue here if data type
is smaller than 32bit? Is it a compile time warning?

Also, you probably also need to change the ->limits_changed assignments?
Instead of using true/false use 1/0.

Thanks
Vivek

> Signed-off-by: Andreas Schwab <schwab@linux-m68k.org>
> ---
>  block/blk-throttle.c |    4 ++--
>  1 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/block/blk-throttle.c b/block/blk-throttle.c
> index 5352bda..6c98cfe 100644
> --- a/block/blk-throttle.c
> +++ b/block/blk-throttle.c
> @@ -77,7 +77,7 @@ struct throtl_grp {
>  	unsigned long slice_end[2];
>  
>  	/* Some throttle limits got updated for the group */
> -	bool limits_changed;
> +	int limits_changed;
>  };
>  
>  struct throtl_data
> @@ -102,7 +102,7 @@ struct throtl_data
>  	/* Work for dispatching throttled bios */
>  	struct delayed_work throtl_work;
>  
> -	bool limits_changed;
> +	int limits_changed;
>  };
>  
>  enum tg_state_flags {
> -- 
> 1.7.4.2
> 
> 
> -- 
> Andreas Schwab, schwab@linux-m68k.org
> GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
> "And now for something completely different."

* Re: [PATCH] blk-throttle: don't call xchg on bool
  2011-03-30 13:19     ` Vivek Goyal
@ 2011-03-30 13:51       ` Andreas Schwab
  2011-03-30 16:53         ` Vivek Goyal
  2011-03-30 16:06       ` Andreas Schwab
  1 sibling, 1 reply; 13+ messages in thread
From: Andreas Schwab @ 2011-03-30 13:51 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: linux-kernel, jaxboe, paulmck, oleg

Vivek Goyal <vgoyal@redhat.com> writes:

> Which arch creates the problem and what's the issue here if data type
> is smaller than 32bit?

Just look at the various implementations in arch/*/include/asm/system.h:

- arm: supports only one and 4 byte xchg
- avr32: supports only 4 byte xchg
- frv: supports only 4 byte xchg
- m32r: SMP supports only 4 byte xchg
- microblaze: supports only one and 4 byte xchg

etc.

> Is it a compile time warning?

Most implementations turn the use of xchg with a bad pointer into a link
error.
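
A minimal model of that pattern, with made-up names (no single arch is quoted
verbatim): sizes the arch cannot exchange atomically fall through to a helper
that is declared but intentionally never defined, so with optimization enabled
a bad xchg() gets past the compiler and only fails when the linker cannot
resolve the helper:

/* Stand-in for the arch's real atomic exchange; a GCC builtin is used for
 * the supported 4-byte case, purely for illustration. */
extern unsigned long __model_bad_xchg(volatile void *ptr, int size);

static inline unsigned long __model_xchg(unsigned long x, volatile void *ptr,
                                         int size)
{
        switch (size) {
        case 4:
                return __atomic_exchange_n((volatile unsigned int *)ptr,
                                           (unsigned int)x, __ATOMIC_SEQ_CST);
        default:
                /* e.g. a 1-byte bool on an arch with only 4-byte xchg */
                return __model_bad_xchg(ptr, size);     /* never defined */
        }
}

#define model_xchg(ptr, v)                                              \
        ((typeof(*(ptr)))__model_xchg((unsigned long)(v), (ptr),        \
                                      sizeof(*(ptr))))

With -O2, model_xchg() on an int folds away the default branch and links fine;
on a bool the reference to __model_bad_xchg() survives and the link fails,
which is the behaviour described above.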

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."

* Re: [PATCH] blk-throttle: don't call xchg on bool
  2011-03-30 13:19     ` Vivek Goyal
  2011-03-30 13:51       ` Andreas Schwab
@ 2011-03-30 16:06       ` Andreas Schwab
  1 sibling, 0 replies; 13+ messages in thread
From: Andreas Schwab @ 2011-03-30 16:06 UTC (permalink / raw)
  To: Vivek Goyal; +Cc: linux-kernel, jaxboe, paulmck, oleg

Vivek Goyal <vgoyal@redhat.com> writes:

> Also, you probably also need to change the ->limits_changed assignments?
> Instead of using true/false use 1/0.

That's rather a cosmetic change that can be added on top if it is deemed
useful.

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."

* Re: [PATCH] blk-throttle: don't call xchg on bool
  2011-03-30 13:51       ` Andreas Schwab
@ 2011-03-30 16:53         ` Vivek Goyal
  0 siblings, 0 replies; 13+ messages in thread
From: Vivek Goyal @ 2011-03-30 16:53 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: linux-kernel, jaxboe, paulmck, oleg

On Wed, Mar 30, 2011 at 03:51:34PM +0200, Andreas Schwab wrote:
> Vivek Goyal <vgoyal@redhat.com> writes:
> 
> > Which arch creates the problem and what's the issue here if data type
> > is smaller than 32bit?
> 
> Just look at the various implementations in arch/*/include/asm/system.h:
> 
> - arm: supports only one and 4 byte xchg
> - avr32: supports only 4 byte xchg
> - frv: supports only 4 byte xchg
> - m32r: SMP supports only 4 byte xchg
> - microblaze: supports only one and 4 byte xchg
> 
> etc.
> 
> > Is it a compile time warning?
> 
> Most implementations turn the use of xchg with a bad pointer into a link
> error.

Thanks Andreas. I get it. I did check arm and it supports only one and
4 byte xchg.

Vivek

Thread overview: 13+ messages
2010-12-15 21:07 [PATCH 0/2] blk-throttle: Couple of more fixes Vivek Goyal
2010-12-15 21:07 ` [PATCH 1/2] blk-throttle: process limit change only through one function Vivek Goyal
2010-12-15 21:07 ` [PATCH 2/2] blk-throttle: Some cleanups and race fixes in limit update code Vivek Goyal
2010-12-16 14:49   ` Oleg Nesterov
2010-12-17 22:28   ` Paul E. McKenney
2010-12-17 22:34     ` Vivek Goyal
2011-03-29 23:14   ` [PATCH] blk-throttle: don't call xchg on bool Andreas Schwab
2011-03-30 10:21     ` Jens Axboe
2011-03-30 13:19     ` Vivek Goyal
2011-03-30 13:51       ` Andreas Schwab
2011-03-30 16:53         ` Vivek Goyal
2011-03-30 16:06       ` Andreas Schwab
2010-12-21 16:05 ` [PATCH 0/2] blk-throttle: Couple of more fixes Vivek Goyal
