linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2] bcache: fix deadlock in bcache_allocator
@ 2019-08-06  9:18 Andrea Righi
  2019-08-06 12:59 ` Coly Li
  2019-08-06 17:36 ` Andrea Righi
  0 siblings, 2 replies; 5+ messages in thread
From: Andrea Righi @ 2019-08-06  9:18 UTC (permalink / raw)
  To: Coly Li, Kent Overstreet; +Cc: linux-bcache, linux-kernel

bcache_allocator() can call the following:

 bch_allocator_thread()
  -> bch_prio_write()
     -> bch_bucket_alloc()
        -> wait on &ca->set->bucket_wait

But the wake up event on bucket_wait is supposed to come from
bch_allocator_thread() itself => deadlock:

[ 1158.490744] INFO: task bcache_allocato:15861 blocked for more than 10 seconds.
[ 1158.495929]       Not tainted 5.3.0-050300rc3-generic #201908042232
[ 1158.500653] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1158.504413] bcache_allocato D    0 15861      2 0x80004000
[ 1158.504419] Call Trace:
[ 1158.504429]  __schedule+0x2a8/0x670
[ 1158.504432]  schedule+0x2d/0x90
[ 1158.504448]  bch_bucket_alloc+0xe5/0x370 [bcache]
[ 1158.504453]  ? wait_woken+0x80/0x80
[ 1158.504466]  bch_prio_write+0x1dc/0x390 [bcache]
[ 1158.504476]  bch_allocator_thread+0x233/0x490 [bcache]
[ 1158.504491]  kthread+0x121/0x140
[ 1158.504503]  ? invalidate_buckets+0x890/0x890 [bcache]
[ 1158.504506]  ? kthread_park+0xb0/0xb0
[ 1158.504510]  ret_from_fork+0x35/0x40

Fix by making the call to bch_prio_write() non-blocking, so that
bch_allocator_thread() never waits on itself.

Moreover, make sure to wake up the garbage collector thread when
bch_prio_write() is failing to allocate buckets.

BugLink: https://bugs.launchpad.net/bugs/1784665
BugLink: https://bugs.launchpad.net/bugs/1796292
Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
---
Changes in v2:
 - prevent retry_invalidate busy loop in bch_allocator_thread()

 drivers/md/bcache/alloc.c  |  5 ++++-
 drivers/md/bcache/bcache.h |  2 +-
 drivers/md/bcache/super.c  | 13 +++++++++----
 3 files changed, 14 insertions(+), 6 deletions(-)

diff --git a/drivers/md/bcache/alloc.c b/drivers/md/bcache/alloc.c
index 6f776823b9ba..a1df0d95151c 100644
--- a/drivers/md/bcache/alloc.c
+++ b/drivers/md/bcache/alloc.c
@@ -377,7 +377,10 @@ static int bch_allocator_thread(void *arg)
 			if (!fifo_full(&ca->free_inc))
 				goto retry_invalidate;
 
-			bch_prio_write(ca);
+			if (bch_prio_write(ca, false) < 0) {
+				ca->invalidate_needs_gc = 1;
+				wake_up_gc(ca->set);
+			}
 		}
 	}
 out:
diff --git a/drivers/md/bcache/bcache.h b/drivers/md/bcache/bcache.h
index 013e35a9e317..deb924e1d790 100644
--- a/drivers/md/bcache/bcache.h
+++ b/drivers/md/bcache/bcache.h
@@ -977,7 +977,7 @@ bool bch_cached_dev_error(struct cached_dev *dc);
 __printf(2, 3)
 bool bch_cache_set_error(struct cache_set *c, const char *fmt, ...);
 
-void bch_prio_write(struct cache *ca);
+int bch_prio_write(struct cache *ca, bool wait);
 void bch_write_bdev_super(struct cached_dev *dc, struct closure *parent);
 
 extern struct workqueue_struct *bcache_wq;
diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
index 20ed838e9413..716ea272fb55 100644
--- a/drivers/md/bcache/super.c
+++ b/drivers/md/bcache/super.c
@@ -529,7 +529,7 @@ static void prio_io(struct cache *ca, uint64_t bucket, int op,
 	closure_sync(cl);
 }
 
-void bch_prio_write(struct cache *ca)
+int bch_prio_write(struct cache *ca, bool wait)
 {
 	int i;
 	struct bucket *b;
@@ -564,8 +564,12 @@ void bch_prio_write(struct cache *ca)
 		p->magic	= pset_magic(&ca->sb);
 		p->csum		= bch_crc64(&p->magic, bucket_bytes(ca) - 8);
 
-		bucket = bch_bucket_alloc(ca, RESERVE_PRIO, true);
-		BUG_ON(bucket == -1);
+		bucket = bch_bucket_alloc(ca, RESERVE_PRIO, wait);
+		if (bucket == -1) {
+			if (!wait)
+				return -ENOMEM;
+			BUG_ON(1);
+		}
 
 		mutex_unlock(&ca->set->bucket_lock);
 		prio_io(ca, bucket, REQ_OP_WRITE, 0);
@@ -593,6 +597,7 @@ void bch_prio_write(struct cache *ca)
 
 		ca->prio_last_buckets[i] = ca->prio_buckets[i];
 	}
+	return 0;
 }
 
 static void prio_read(struct cache *ca, uint64_t bucket)
@@ -1954,7 +1959,7 @@ static int run_cache_set(struct cache_set *c)
 
 		mutex_lock(&c->bucket_lock);
 		for_each_cache(ca, c, i)
-			bch_prio_write(ca);
+			bch_prio_write(ca, true);
 		mutex_unlock(&c->bucket_lock);
 
 		err = "cannot allocate new UUID bucket";
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH v2] bcache: fix deadlock in bcache_allocator
  2019-08-06  9:18 [PATCH v2] bcache: fix deadlock in bcache_allocator Andrea Righi
@ 2019-08-06 12:59 ` Coly Li
  2019-08-06 17:36 ` Andrea Righi
  1 sibling, 0 replies; 5+ messages in thread
From: Coly Li @ 2019-08-06 12:59 UTC (permalink / raw)
  To: Andrea Righi; +Cc: Kent Overstreet, linux-bcache, linux-kernel

On 2019/8/6 5:18 下午, Andrea Righi wrote:
> bcache_allocator() can call the following:
> 
>  bch_allocator_thread()
>   -> bch_prio_write()
>      -> bch_bucket_alloc()
>         -> wait on &ca->set->bucket_wait
> 
> But the wake up event on bucket_wait is supposed to come from
> bch_allocator_thread() itself => deadlock:
> 
> [ 1158.490744] INFO: task bcache_allocato:15861 blocked for more than 10 seconds.
> [ 1158.495929]       Not tainted 5.3.0-050300rc3-generic #201908042232
> [ 1158.500653] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [ 1158.504413] bcache_allocato D    0 15861      2 0x80004000
> [ 1158.504419] Call Trace:
> [ 1158.504429]  __schedule+0x2a8/0x670
> [ 1158.504432]  schedule+0x2d/0x90
> [ 1158.504448]  bch_bucket_alloc+0xe5/0x370 [bcache]
> [ 1158.504453]  ? wait_woken+0x80/0x80
> [ 1158.504466]  bch_prio_write+0x1dc/0x390 [bcache]
> [ 1158.504476]  bch_allocator_thread+0x233/0x490 [bcache]
> [ 1158.504491]  kthread+0x121/0x140
> [ 1158.504503]  ? invalidate_buckets+0x890/0x890 [bcache]
> [ 1158.504506]  ? kthread_park+0xb0/0xb0
> [ 1158.504510]  ret_from_fork+0x35/0x40
> 
> Fix by making the call to bch_prio_write() non-blocking, so that
> bch_allocator_thread() never waits on itself.
> 
> Moreover, make sure to wake up the garbage collector thread when
> bch_prio_write() is failing to allocate buckets.
> 
> BugLink: https://bugs.launchpad.net/bugs/1784665
> BugLink: https://bugs.launchpad.net/bugs/1796292
> Signed-off-by: Andrea Righi <andrea.righi@canonical.com>

At this moment I am not able to find a better solution, so I take this
patch in my for-test.

Thank you. And I hope you may continue to maintain this change if people
report problem (I mean if) in future.


Coly Li


> ---
> Changes in v2:
>  - prevent retry_invalidate busy loop in bch_allocator_thread()
> 
>  drivers/md/bcache/alloc.c  |  5 ++++-
>  drivers/md/bcache/bcache.h |  2 +-
>  drivers/md/bcache/super.c  | 13 +++++++++----
>  3 files changed, 14 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/md/bcache/alloc.c b/drivers/md/bcache/alloc.c
> index 6f776823b9ba..a1df0d95151c 100644
> --- a/drivers/md/bcache/alloc.c
> +++ b/drivers/md/bcache/alloc.c
> @@ -377,7 +377,10 @@ static int bch_allocator_thread(void *arg)
>  			if (!fifo_full(&ca->free_inc))
>  				goto retry_invalidate;
>  
> -			bch_prio_write(ca);
> +			if (bch_prio_write(ca, false) < 0) {
> +				ca->invalidate_needs_gc = 1;
> +				wake_up_gc(ca->set);
> +			}
>  		}
>  	}
>  out:
> diff --git a/drivers/md/bcache/bcache.h b/drivers/md/bcache/bcache.h
> index 013e35a9e317..deb924e1d790 100644
> --- a/drivers/md/bcache/bcache.h
> +++ b/drivers/md/bcache/bcache.h
> @@ -977,7 +977,7 @@ bool bch_cached_dev_error(struct cached_dev *dc);
>  __printf(2, 3)
>  bool bch_cache_set_error(struct cache_set *c, const char *fmt, ...);
>  
> -void bch_prio_write(struct cache *ca);
> +int bch_prio_write(struct cache *ca, bool wait);
>  void bch_write_bdev_super(struct cached_dev *dc, struct closure *parent);
>  
>  extern struct workqueue_struct *bcache_wq;
> diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
> index 20ed838e9413..716ea272fb55 100644
> --- a/drivers/md/bcache/super.c
> +++ b/drivers/md/bcache/super.c
> @@ -529,7 +529,7 @@ static void prio_io(struct cache *ca, uint64_t bucket, int op,
>  	closure_sync(cl);
>  }
>  
> -void bch_prio_write(struct cache *ca)
> +int bch_prio_write(struct cache *ca, bool wait)
>  {
>  	int i;
>  	struct bucket *b;
> @@ -564,8 +564,12 @@ void bch_prio_write(struct cache *ca)
>  		p->magic	= pset_magic(&ca->sb);
>  		p->csum		= bch_crc64(&p->magic, bucket_bytes(ca) - 8);
>  
> -		bucket = bch_bucket_alloc(ca, RESERVE_PRIO, true);
> -		BUG_ON(bucket == -1);
> +		bucket = bch_bucket_alloc(ca, RESERVE_PRIO, wait);
> +		if (bucket == -1) {
> +			if (!wait)
> +				return -ENOMEM;
> +			BUG_ON(1);
> +		}
>  
>  		mutex_unlock(&ca->set->bucket_lock);
>  		prio_io(ca, bucket, REQ_OP_WRITE, 0);
> @@ -593,6 +597,7 @@ void bch_prio_write(struct cache *ca)
>  
>  		ca->prio_last_buckets[i] = ca->prio_buckets[i];
>  	}
> +	return 0;
>  }
>  
>  static void prio_read(struct cache *ca, uint64_t bucket)
> @@ -1954,7 +1959,7 @@ static int run_cache_set(struct cache_set *c)
>  
>  		mutex_lock(&c->bucket_lock);
>  		for_each_cache(ca, c, i)
> -			bch_prio_write(ca);
> +			bch_prio_write(ca, true);
>  		mutex_unlock(&c->bucket_lock);
>  
>  		err = "cannot allocate new UUID bucket";
> 


-- 

Coly Li

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH v2] bcache: fix deadlock in bcache_allocator
  2019-08-06  9:18 [PATCH v2] bcache: fix deadlock in bcache_allocator Andrea Righi
  2019-08-06 12:59 ` Coly Li
@ 2019-08-06 17:36 ` Andrea Righi
  2019-08-07  9:25   ` Andrea Righi
  1 sibling, 1 reply; 5+ messages in thread
From: Andrea Righi @ 2019-08-06 17:36 UTC (permalink / raw)
  To: Coly Li; +Cc: Kent Overstreet, linux-bcache, linux-kernel

On Tue, Aug 06, 2019 at 11:18:01AM +0200, Andrea Righi wrote:
> bcache_allocator() can call the following:
> 
>  bch_allocator_thread()
>   -> bch_prio_write()
>      -> bch_bucket_alloc()
>         -> wait on &ca->set->bucket_wait
> 
> But the wake up event on bucket_wait is supposed to come from
> bch_allocator_thread() itself => deadlock:
> 
> [ 1158.490744] INFO: task bcache_allocato:15861 blocked for more than 10 seconds.
> [ 1158.495929]       Not tainted 5.3.0-050300rc3-generic #201908042232
> [ 1158.500653] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [ 1158.504413] bcache_allocato D    0 15861      2 0x80004000
> [ 1158.504419] Call Trace:
> [ 1158.504429]  __schedule+0x2a8/0x670
> [ 1158.504432]  schedule+0x2d/0x90
> [ 1158.504448]  bch_bucket_alloc+0xe5/0x370 [bcache]
> [ 1158.504453]  ? wait_woken+0x80/0x80
> [ 1158.504466]  bch_prio_write+0x1dc/0x390 [bcache]
> [ 1158.504476]  bch_allocator_thread+0x233/0x490 [bcache]
> [ 1158.504491]  kthread+0x121/0x140
> [ 1158.504503]  ? invalidate_buckets+0x890/0x890 [bcache]
> [ 1158.504506]  ? kthread_park+0xb0/0xb0
> [ 1158.504510]  ret_from_fork+0x35/0x40
> 
> Fix by making the call to bch_prio_write() non-blocking, so that
> bch_allocator_thread() never waits on itself.
> 
> Moreover, make sure to wake up the garbage collector thread when
> bch_prio_write() is failing to allocate buckets.
> 
> BugLink: https://bugs.launchpad.net/bugs/1784665
> BugLink: https://bugs.launchpad.net/bugs/1796292
> Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
> ---
> Changes in v2:
>  - prevent retry_invalidate busy loop in bch_allocator_thread()
> 
>  drivers/md/bcache/alloc.c  |  5 ++++-
>  drivers/md/bcache/bcache.h |  2 +-
>  drivers/md/bcache/super.c  | 13 +++++++++----
>  3 files changed, 14 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/md/bcache/alloc.c b/drivers/md/bcache/alloc.c
> index 6f776823b9ba..a1df0d95151c 100644
> --- a/drivers/md/bcache/alloc.c
> +++ b/drivers/md/bcache/alloc.c
> @@ -377,7 +377,10 @@ static int bch_allocator_thread(void *arg)
>  			if (!fifo_full(&ca->free_inc))
>  				goto retry_invalidate;
>  
> -			bch_prio_write(ca);
> +			if (bch_prio_write(ca, false) < 0) {
> +				ca->invalidate_needs_gc = 1;
> +				wake_up_gc(ca->set);
> +			}
>  		}
>  	}
>  out:
> diff --git a/drivers/md/bcache/bcache.h b/drivers/md/bcache/bcache.h
> index 013e35a9e317..deb924e1d790 100644
> --- a/drivers/md/bcache/bcache.h
> +++ b/drivers/md/bcache/bcache.h
> @@ -977,7 +977,7 @@ bool bch_cached_dev_error(struct cached_dev *dc);
>  __printf(2, 3)
>  bool bch_cache_set_error(struct cache_set *c, const char *fmt, ...);
>  
> -void bch_prio_write(struct cache *ca);
> +int bch_prio_write(struct cache *ca, bool wait);
>  void bch_write_bdev_super(struct cached_dev *dc, struct closure *parent);
>  
>  extern struct workqueue_struct *bcache_wq;
> diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
> index 20ed838e9413..716ea272fb55 100644
> --- a/drivers/md/bcache/super.c
> +++ b/drivers/md/bcache/super.c
> @@ -529,7 +529,7 @@ static void prio_io(struct cache *ca, uint64_t bucket, int op,
>  	closure_sync(cl);
>  }
>  
> -void bch_prio_write(struct cache *ca)
> +int bch_prio_write(struct cache *ca, bool wait)
>  {
>  	int i;
>  	struct bucket *b;
> @@ -564,8 +564,12 @@ void bch_prio_write(struct cache *ca)
>  		p->magic	= pset_magic(&ca->sb);
>  		p->csum		= bch_crc64(&p->magic, bucket_bytes(ca) - 8);
>  
> -		bucket = bch_bucket_alloc(ca, RESERVE_PRIO, true);
> -		BUG_ON(bucket == -1);
> +		bucket = bch_bucket_alloc(ca, RESERVE_PRIO, wait);
> +		if (bucket == -1) {
> +			if (!wait)
> +				return -ENOMEM;
> +			BUG_ON(1);
> +		}

Coly,

looking more at this change, I think we should handle the failure path
properly or we may leak buckets, am I right? (sorry for not realizing
this before). Maybe we need something like the following on top of my
previous patch.

I'm going to run more stress tests with this patch applied and will try
to figure out if we're actually leaking buckets without it.

---
Subject: bcache: prevent leaking buckets in bch_prio_write()

Handle the allocation failure path properly in bch_prio_write() to avoid
leaking buckets from the previous successful iterations.

Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
---
 drivers/md/bcache/super.c | 20 +++++++++++---------
 1 file changed, 11 insertions(+), 9 deletions(-)

diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
index 716ea27..727266f 100644
--- a/drivers/md/bcache/super.c
+++ b/drivers/md/bcache/super.c
@@ -531,7 +531,7 @@ static void prio_io(struct cache *ca, uint64_t bucket, int op,
 
 int bch_prio_write(struct cache *ca, bool wait)
 {
-	int i;
+	int i, j, ret = 0;
 	struct bucket *b;
 	struct closure cl;
 
@@ -566,9 +566,11 @@ int bch_prio_write(struct cache *ca, bool wait)
 
 		bucket = bch_bucket_alloc(ca, RESERVE_PRIO, wait);
 		if (bucket == -1) {
-			if (!wait)
-				return -ENOMEM;
-			BUG_ON(1);
+			if (!wait) {
+				ret = -ENOMEM;
+				break;
+			}
+			BUG();
 		}
 
 		mutex_unlock(&ca->set->bucket_lock);
@@ -590,14 +592,14 @@ int bch_prio_write(struct cache *ca, bool wait)
 	 * Don't want the old priorities to get garbage collected until after we
 	 * finish writing the new ones, and they're journalled
 	 */
-	for (i = 0; i < prio_buckets(ca); i++) {
-		if (ca->prio_last_buckets[i])
+	for (j = prio_buckets(ca) - 1; j > i; --j) {
+		if (ca->prio_last_buckets[j])
 			__bch_bucket_free(ca,
-				&ca->buckets[ca->prio_last_buckets[i]]);
+				&ca->buckets[ca->prio_last_buckets[j]]);
 
-		ca->prio_last_buckets[i] = ca->prio_buckets[i];
+		ca->prio_last_buckets[j] = ca->prio_buckets[j];
 	}
-	return 0;
+	return ret;
 }
 
 static void prio_read(struct cache *ca, uint64_t bucket)
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH v2] bcache: fix deadlock in bcache_allocator
  2019-08-06 17:36 ` Andrea Righi
@ 2019-08-07  9:25   ` Andrea Righi
  2019-08-07 10:18     ` Coly Li
  0 siblings, 1 reply; 5+ messages in thread
From: Andrea Righi @ 2019-08-07  9:25 UTC (permalink / raw)
  To: Coly Li; +Cc: Kent Overstreet, linux-bcache, linux-kernel

On Tue, Aug 06, 2019 at 07:36:48PM +0200, Andrea Righi wrote:
> On Tue, Aug 06, 2019 at 11:18:01AM +0200, Andrea Righi wrote:
> > bcache_allocator() can call the following:
> > 
> >  bch_allocator_thread()
> >   -> bch_prio_write()
> >      -> bch_bucket_alloc()
> >         -> wait on &ca->set->bucket_wait
> > 
> > But the wake up event on bucket_wait is supposed to come from
> > bch_allocator_thread() itself => deadlock:
> > 
> > [ 1158.490744] INFO: task bcache_allocato:15861 blocked for more than 10 seconds.
> > [ 1158.495929]       Not tainted 5.3.0-050300rc3-generic #201908042232
> > [ 1158.500653] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > [ 1158.504413] bcache_allocato D    0 15861      2 0x80004000
> > [ 1158.504419] Call Trace:
> > [ 1158.504429]  __schedule+0x2a8/0x670
> > [ 1158.504432]  schedule+0x2d/0x90
> > [ 1158.504448]  bch_bucket_alloc+0xe5/0x370 [bcache]
> > [ 1158.504453]  ? wait_woken+0x80/0x80
> > [ 1158.504466]  bch_prio_write+0x1dc/0x390 [bcache]
> > [ 1158.504476]  bch_allocator_thread+0x233/0x490 [bcache]
> > [ 1158.504491]  kthread+0x121/0x140
> > [ 1158.504503]  ? invalidate_buckets+0x890/0x890 [bcache]
> > [ 1158.504506]  ? kthread_park+0xb0/0xb0
> > [ 1158.504510]  ret_from_fork+0x35/0x40
> > 
> > Fix by making the call to bch_prio_write() non-blocking, so that
> > bch_allocator_thread() never waits on itself.
> > 
> > Moreover, make sure to wake up the garbage collector thread when
> > bch_prio_write() is failing to allocate buckets.
> > 
> > BugLink: https://bugs.launchpad.net/bugs/1784665
> > BugLink: https://bugs.launchpad.net/bugs/1796292
> > Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
> > ---
> > Changes in v2:
> >  - prevent retry_invalidate busy loop in bch_allocator_thread()
> > 
> >  drivers/md/bcache/alloc.c  |  5 ++++-
> >  drivers/md/bcache/bcache.h |  2 +-
> >  drivers/md/bcache/super.c  | 13 +++++++++----
> >  3 files changed, 14 insertions(+), 6 deletions(-)
> > 
> > diff --git a/drivers/md/bcache/alloc.c b/drivers/md/bcache/alloc.c
> > index 6f776823b9ba..a1df0d95151c 100644
> > --- a/drivers/md/bcache/alloc.c
> > +++ b/drivers/md/bcache/alloc.c
> > @@ -377,7 +377,10 @@ static int bch_allocator_thread(void *arg)
> >  			if (!fifo_full(&ca->free_inc))
> >  				goto retry_invalidate;
> >  
> > -			bch_prio_write(ca);
> > +			if (bch_prio_write(ca, false) < 0) {
> > +				ca->invalidate_needs_gc = 1;
> > +				wake_up_gc(ca->set);
> > +			}
> >  		}
> >  	}
> >  out:
> > diff --git a/drivers/md/bcache/bcache.h b/drivers/md/bcache/bcache.h
> > index 013e35a9e317..deb924e1d790 100644
> > --- a/drivers/md/bcache/bcache.h
> > +++ b/drivers/md/bcache/bcache.h
> > @@ -977,7 +977,7 @@ bool bch_cached_dev_error(struct cached_dev *dc);
> >  __printf(2, 3)
> >  bool bch_cache_set_error(struct cache_set *c, const char *fmt, ...);
> >  
> > -void bch_prio_write(struct cache *ca);
> > +int bch_prio_write(struct cache *ca, bool wait);
> >  void bch_write_bdev_super(struct cached_dev *dc, struct closure *parent);
> >  
> >  extern struct workqueue_struct *bcache_wq;
> > diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
> > index 20ed838e9413..716ea272fb55 100644
> > --- a/drivers/md/bcache/super.c
> > +++ b/drivers/md/bcache/super.c
> > @@ -529,7 +529,7 @@ static void prio_io(struct cache *ca, uint64_t bucket, int op,
> >  	closure_sync(cl);
> >  }
> >  
> > -void bch_prio_write(struct cache *ca)
> > +int bch_prio_write(struct cache *ca, bool wait)
> >  {
> >  	int i;
> >  	struct bucket *b;
> > @@ -564,8 +564,12 @@ void bch_prio_write(struct cache *ca)
> >  		p->magic	= pset_magic(&ca->sb);
> >  		p->csum		= bch_crc64(&p->magic, bucket_bytes(ca) - 8);
> >  
> > -		bucket = bch_bucket_alloc(ca, RESERVE_PRIO, true);
> > -		BUG_ON(bucket == -1);
> > +		bucket = bch_bucket_alloc(ca, RESERVE_PRIO, wait);
> > +		if (bucket == -1) {
> > +			if (!wait)
> > +				return -ENOMEM;
> > +			BUG_ON(1);
> > +		}
> 
> Coly,
> 
> looking more at this change, I think we should handle the failure path
> properly or we may leak buckets, am I right? (sorry for not realizing
> this before). Maybe we need something like the following on top of my
> previous patch.
> 
> I'm going to run more stress tests with this patch applied and will try
> to figure out if we're actually leaking buckets without it.
> 
> ---
> Subject: bcache: prevent leaking buckets in bch_prio_write()
> 
> Handle the allocation failure path properly in bch_prio_write() to avoid
> leaking buckets from the previous successful iterations.
> 
> Signed-off-by: Andrea Righi <andrea.righi@canonical.com>

Coly, ignore this one please. A v3 of the previous patch with a better
fix for this potential buckets leak is on the way.

Thanks,
-Andrea

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH v2] bcache: fix deadlock in bcache_allocator
  2019-08-07  9:25   ` Andrea Righi
@ 2019-08-07 10:18     ` Coly Li
  0 siblings, 0 replies; 5+ messages in thread
From: Coly Li @ 2019-08-07 10:18 UTC (permalink / raw)
  To: Andrea Righi; +Cc: Kent Overstreet, linux-bcache, linux-kernel

On 2019/8/7 5:25 下午, Andrea Righi wrote:
> On Tue, Aug 06, 2019 at 07:36:48PM +0200, Andrea Righi wrote:
>> On Tue, Aug 06, 2019 at 11:18:01AM +0200, Andrea Righi wrote:
>>> bcache_allocator() can call the following:
>>>
>>>  bch_allocator_thread()
>>>   -> bch_prio_write()
>>>      -> bch_bucket_alloc()
>>>         -> wait on &ca->set->bucket_wait
>>>
>>> But the wake up event on bucket_wait is supposed to come from
>>> bch_allocator_thread() itself => deadlock:
>>>
>>> [ 1158.490744] INFO: task bcache_allocato:15861 blocked for more than 10 seconds.
>>> [ 1158.495929]       Not tainted 5.3.0-050300rc3-generic #201908042232
>>> [ 1158.500653] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>> [ 1158.504413] bcache_allocato D    0 15861      2 0x80004000
>>> [ 1158.504419] Call Trace:
>>> [ 1158.504429]  __schedule+0x2a8/0x670
>>> [ 1158.504432]  schedule+0x2d/0x90
>>> [ 1158.504448]  bch_bucket_alloc+0xe5/0x370 [bcache]
>>> [ 1158.504453]  ? wait_woken+0x80/0x80
>>> [ 1158.504466]  bch_prio_write+0x1dc/0x390 [bcache]
>>> [ 1158.504476]  bch_allocator_thread+0x233/0x490 [bcache]
>>> [ 1158.504491]  kthread+0x121/0x140
>>> [ 1158.504503]  ? invalidate_buckets+0x890/0x890 [bcache]
>>> [ 1158.504506]  ? kthread_park+0xb0/0xb0
>>> [ 1158.504510]  ret_from_fork+0x35/0x40
>>>
>>> Fix by making the call to bch_prio_write() non-blocking, so that
>>> bch_allocator_thread() never waits on itself.
>>>
>>> Moreover, make sure to wake up the garbage collector thread when
>>> bch_prio_write() is failing to allocate buckets.
>>>
>>> BugLink: https://bugs.launchpad.net/bugs/1784665
>>> BugLink: https://bugs.launchpad.net/bugs/1796292
>>> Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
>>> ---
>>> Changes in v2:
>>>  - prevent retry_invalidate busy loop in bch_allocator_thread()
>>>
>>>  drivers/md/bcache/alloc.c  |  5 ++++-
>>>  drivers/md/bcache/bcache.h |  2 +-
>>>  drivers/md/bcache/super.c  | 13 +++++++++----
>>>  3 files changed, 14 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/drivers/md/bcache/alloc.c b/drivers/md/bcache/alloc.c
>>> index 6f776823b9ba..a1df0d95151c 100644
>>> --- a/drivers/md/bcache/alloc.c
>>> +++ b/drivers/md/bcache/alloc.c
>>> @@ -377,7 +377,10 @@ static int bch_allocator_thread(void *arg)
>>>  			if (!fifo_full(&ca->free_inc))
>>>  				goto retry_invalidate;
>>>  
>>> -			bch_prio_write(ca);
>>> +			if (bch_prio_write(ca, false) < 0) {
>>> +				ca->invalidate_needs_gc = 1;
>>> +				wake_up_gc(ca->set);
>>> +			}
>>>  		}
>>>  	}
>>>  out:
>>> diff --git a/drivers/md/bcache/bcache.h b/drivers/md/bcache/bcache.h
>>> index 013e35a9e317..deb924e1d790 100644
>>> --- a/drivers/md/bcache/bcache.h
>>> +++ b/drivers/md/bcache/bcache.h
>>> @@ -977,7 +977,7 @@ bool bch_cached_dev_error(struct cached_dev *dc);
>>>  __printf(2, 3)
>>>  bool bch_cache_set_error(struct cache_set *c, const char *fmt, ...);
>>>  
>>> -void bch_prio_write(struct cache *ca);
>>> +int bch_prio_write(struct cache *ca, bool wait);
>>>  void bch_write_bdev_super(struct cached_dev *dc, struct closure *parent);
>>>  
>>>  extern struct workqueue_struct *bcache_wq;
>>> diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
>>> index 20ed838e9413..716ea272fb55 100644
>>> --- a/drivers/md/bcache/super.c
>>> +++ b/drivers/md/bcache/super.c
>>> @@ -529,7 +529,7 @@ static void prio_io(struct cache *ca, uint64_t bucket, int op,
>>>  	closure_sync(cl);
>>>  }
>>>  
>>> -void bch_prio_write(struct cache *ca)
>>> +int bch_prio_write(struct cache *ca, bool wait)
>>>  {
>>>  	int i;
>>>  	struct bucket *b;
>>> @@ -564,8 +564,12 @@ void bch_prio_write(struct cache *ca)
>>>  		p->magic	= pset_magic(&ca->sb);
>>>  		p->csum		= bch_crc64(&p->magic, bucket_bytes(ca) - 8);
>>>  
>>> -		bucket = bch_bucket_alloc(ca, RESERVE_PRIO, true);
>>> -		BUG_ON(bucket == -1);
>>> +		bucket = bch_bucket_alloc(ca, RESERVE_PRIO, wait);
>>> +		if (bucket == -1) {
>>> +			if (!wait)
>>> +				return -ENOMEM;
>>> +			BUG_ON(1);
>>> +		}
>>
>> Coly,
>>
>> looking more at this change, I think we should handle the failure path
>> properly or we may leak buckets, am I right? (sorry for not realizing
>> this before). Maybe we need something like the following on top of my
>> previous patch.
>>
>> I'm going to run more stress tests with this patch applied and will try
>> to figure out if we're actually leaking buckets without it.
>>
>> ---
>> Subject: bcache: prevent leaking buckets in bch_prio_write()
>>
>> Handle the allocation failure path properly in bch_prio_write() to avoid
>> leaking buckets from the previous successful iterations.
>>
>> Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
> 
> Coly, ignore this one please. A v3 of the previous patch with a better
> fix for this potential buckets leak is on the way.

Sure, waiting for next version :-)


-- 

Coly Li

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2019-08-07 10:18 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-08-06  9:18 [PATCH v2] bcache: fix deadlock in bcache_allocator Andrea Righi
2019-08-06 12:59 ` Coly Li
2019-08-06 17:36 ` Andrea Righi
2019-08-07  9:25   ` Andrea Righi
2019-08-07 10:18     ` Coly Li

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).