All of lore.kernel.org
 help / color / mirror / Atom feed
From: Coly Li <colyli@suse.de>
To: axboe@kernel.dk
Cc: linux-bcache@vger.kernel.org, linux-block@vger.kernel.org,
	Coly Li <colyli@suse.de>
Subject: [PATCH 25/37] bcache: acquire bch_register_lock later in cached_dev_free()
Date: Fri, 28 Jun 2019 19:59:48 +0800	[thread overview]
Message-ID: <20190628120000.40753-26-colyli@suse.de> (raw)
In-Reply-To: <20190628120000.40753-1-colyli@suse.de>

When enable lockdep engine, a lockdep warning can be observed when
reboot or shutdown system,

[ 3142.764557][    T1] bcache: bcache_reboot() Stopping all devices:
[ 3142.776265][ T2649]
[ 3142.777159][ T2649] ======================================================
[ 3142.780039][ T2649] WARNING: possible circular locking dependency detected
[ 3142.782869][ T2649] 5.2.0-rc4-lp151.20-default+ #1 Tainted: G        W
[ 3142.785684][ T2649] ------------------------------------------------------
[ 3142.788479][ T2649] kworker/3:67/2649 is trying to acquire lock:
[ 3142.790738][ T2649] 00000000aaf02291 ((wq_completion)bcache_writeback_wq){+.+.}, at: flush_workqueue+0x87/0x4c0
[ 3142.794678][ T2649]
[ 3142.794678][ T2649] but task is already holding lock:
[ 3142.797402][ T2649] 000000004fcf89c5 (&bch_register_lock){+.+.}, at: cached_dev_free+0x17/0x120 [bcache]
[ 3142.801462][ T2649]
[ 3142.801462][ T2649] which lock already depends on the new lock.
[ 3142.801462][ T2649]
[ 3142.805277][ T2649]
[ 3142.805277][ T2649] the existing dependency chain (in reverse order) is:
[ 3142.808902][ T2649]
[ 3142.808902][ T2649] -> #2 (&bch_register_lock){+.+.}:
[ 3142.812396][ T2649]        __mutex_lock+0x7a/0x9d0
[ 3142.814184][ T2649]        cached_dev_free+0x17/0x120 [bcache]
[ 3142.816415][ T2649]        process_one_work+0x2a4/0x640
[ 3142.818413][ T2649]        worker_thread+0x39/0x3f0
[ 3142.820276][ T2649]        kthread+0x125/0x140
[ 3142.822061][ T2649]        ret_from_fork+0x3a/0x50
[ 3142.823965][ T2649]
[ 3142.823965][ T2649] -> #1 ((work_completion)(&cl->work)#2){+.+.}:
[ 3142.827244][ T2649]        process_one_work+0x277/0x640
[ 3142.829160][ T2649]        worker_thread+0x39/0x3f0
[ 3142.830958][ T2649]        kthread+0x125/0x140
[ 3142.832674][ T2649]        ret_from_fork+0x3a/0x50
[ 3142.834915][ T2649]
[ 3142.834915][ T2649] -> #0 ((wq_completion)bcache_writeback_wq){+.+.}:
[ 3142.838121][ T2649]        lock_acquire+0xb4/0x1c0
[ 3142.840025][ T2649]        flush_workqueue+0xae/0x4c0
[ 3142.842035][ T2649]        drain_workqueue+0xa9/0x180
[ 3142.844042][ T2649]        destroy_workqueue+0x17/0x250
[ 3142.846142][ T2649]        cached_dev_free+0x52/0x120 [bcache]
[ 3142.848530][ T2649]        process_one_work+0x2a4/0x640
[ 3142.850663][ T2649]        worker_thread+0x39/0x3f0
[ 3142.852464][ T2649]        kthread+0x125/0x140
[ 3142.854106][ T2649]        ret_from_fork+0x3a/0x50
[ 3142.855880][ T2649]
[ 3142.855880][ T2649] other info that might help us debug this:
[ 3142.855880][ T2649]
[ 3142.859663][ T2649] Chain exists of:
[ 3142.859663][ T2649]   (wq_completion)bcache_writeback_wq --> (work_completion)(&cl->work)#2 --> &bch_register_lock
[ 3142.859663][ T2649]
[ 3142.865424][ T2649]  Possible unsafe locking scenario:
[ 3142.865424][ T2649]
[ 3142.868022][ T2649]        CPU0                    CPU1
[ 3142.869885][ T2649]        ----                    ----
[ 3142.871751][ T2649]   lock(&bch_register_lock);
[ 3142.873379][ T2649]                                lock((work_completion)(&cl->work)#2);
[ 3142.876399][ T2649]                                lock(&bch_register_lock);
[ 3142.879727][ T2649]   lock((wq_completion)bcache_writeback_wq);
[ 3142.882064][ T2649]
[ 3142.882064][ T2649]  *** DEADLOCK ***
[ 3142.882064][ T2649]
[ 3142.885060][ T2649] 3 locks held by kworker/3:67/2649:
[ 3142.887245][ T2649]  #0: 00000000e774cdd0 ((wq_completion)events){+.+.}, at: process_one_work+0x21e/0x640
[ 3142.890815][ T2649]  #1: 00000000f7df89da ((work_completion)(&cl->work)#2){+.+.}, at: process_one_work+0x21e/0x640
[ 3142.894884][ T2649]  #2: 000000004fcf89c5 (&bch_register_lock){+.+.}, at: cached_dev_free+0x17/0x120 [bcache]
[ 3142.898797][ T2649]
[ 3142.898797][ T2649] stack backtrace:
[ 3142.900961][ T2649] CPU: 3 PID: 2649 Comm: kworker/3:67 Tainted: G        W         5.2.0-rc4-lp151.20-default+ #1
[ 3142.904789][ T2649] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 04/13/2018
[ 3142.909168][ T2649] Workqueue: events cached_dev_free [bcache]
[ 3142.911422][ T2649] Call Trace:
[ 3142.912656][ T2649]  dump_stack+0x85/0xcb
[ 3142.914181][ T2649]  print_circular_bug+0x19a/0x1f0
[ 3142.916193][ T2649]  __lock_acquire+0x16cd/0x1850
[ 3142.917936][ T2649]  ? __lock_acquire+0x6a8/0x1850
[ 3142.919704][ T2649]  ? lock_acquire+0xb4/0x1c0
[ 3142.921335][ T2649]  ? find_held_lock+0x34/0xa0
[ 3142.923052][ T2649]  lock_acquire+0xb4/0x1c0
[ 3142.924635][ T2649]  ? flush_workqueue+0x87/0x4c0
[ 3142.926375][ T2649]  flush_workqueue+0xae/0x4c0
[ 3142.928047][ T2649]  ? flush_workqueue+0x87/0x4c0
[ 3142.929824][ T2649]  ? drain_workqueue+0xa9/0x180
[ 3142.931686][ T2649]  drain_workqueue+0xa9/0x180
[ 3142.933534][ T2649]  destroy_workqueue+0x17/0x250
[ 3142.935787][ T2649]  cached_dev_free+0x52/0x120 [bcache]
[ 3142.937795][ T2649]  process_one_work+0x2a4/0x640
[ 3142.939803][ T2649]  worker_thread+0x39/0x3f0
[ 3142.941487][ T2649]  ? process_one_work+0x640/0x640
[ 3142.943389][ T2649]  kthread+0x125/0x140
[ 3142.944894][ T2649]  ? kthread_create_worker_on_cpu+0x70/0x70
[ 3142.947744][ T2649]  ret_from_fork+0x3a/0x50
[ 3142.970358][ T2649] bcache: bcache_device_free() bcache0 stopped

Here is how the deadlock happens.
1) bcache_reboot() calls bcache_device_stop(), then inside
   bcache_device_stop() BCACHE_DEV_CLOSING bit is set on d->flags.
   Then closure_queue(&d->cl) is called to invoke cached_dev_flush().
2) In cached_dev_flush(), cached_dev_free() is called by continu_at().
3) In cached_dev_free(), when stopping the writeback kthread of the
   cached device by kthread_stop(), dc->writeback_thread will be waken
   up to quite the kthread while-loop, then cached_dev_put() is called
   in bch_writeback_thread().
4) Calling cached_dev_put() in writeback kthread may drop dc->count to
   0, then dc->detach kworker is scheduled, which is initialized as
   cached_dev_detach_finish().
5) Inside cached_dev_detach_finish(), the last line of code is to call
   closure_put(&dc->disk.cl), which drops the last reference counter of
   closrure dc->disk.cl, then the callback cached_dev_flush() gets
   called.
Now cached_dev_flush() is called for second time in the code path, the
first time is in step 2). And again bch_register_lock will be acquired
again, and a A-A lock (lockdep terminology) is happening.

The root cause of the above A-A lock is in cached_dev_free(), mutex
bch_register_lock is held before stopping writeback kthread and other
kworkers. Fortunately now we have variable 'bcache_is_reboot', which may
prevent device registration or unregistration during reboot/shutdown
time, so it is unncessary to hold bch_register_lock such early now.

This is how this patch fixes the reboot/shutdown time A-A lock issue:
After moving mutex_lock(&bch_register_lock) to a later location where
before atomic_read(&dc->running) in cached_dev_free(), such A-A lock
problem can be solved without any reboot time registration race.

Signed-off-by: Coly Li <colyli@suse.de>
---
 drivers/md/bcache/super.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
index 40d857e690f9..8a12a8313367 100644
--- a/drivers/md/bcache/super.c
+++ b/drivers/md/bcache/super.c
@@ -1231,8 +1231,6 @@ static void cached_dev_free(struct closure *cl)
 {
 	struct cached_dev *dc = container_of(cl, struct cached_dev, disk.cl);
 
-	mutex_lock(&bch_register_lock);
-
 	if (test_and_clear_bit(BCACHE_DEV_WB_RUNNING, &dc->disk.flags))
 		cancel_writeback_rate_update_dwork(dc);
 
@@ -1243,6 +1241,8 @@ static void cached_dev_free(struct closure *cl)
 	if (!IS_ERR_OR_NULL(dc->status_update_thread))
 		kthread_stop(dc->status_update_thread);
 
+	mutex_lock(&bch_register_lock);
+
 	if (atomic_read(&dc->running))
 		bd_unlink_disk_holder(dc->bdev, dc->disk.disk);
 	bcache_device_free(&dc->disk);
-- 
2.16.4


  parent reply	other threads:[~2019-06-28 12:01 UTC|newest]

Thread overview: 39+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-06-28 11:59 [PATCH 00/37] bcache patches for Linux v5.3 Coly Li
2019-06-28 11:59 ` [PATCH 01/37] bcache: don't set max writeback rate if gc is running Coly Li
2019-06-28 11:59 ` [PATCH 02/37] bcache: check c->gc_thread by IS_ERR_OR_NULL in cache_set_flush() Coly Li
2019-06-28 11:59 ` [PATCH 03/37] bcache: fix return value error in bch_journal_read() Coly Li
2019-06-28 11:59 ` [PATCH 04/37] Revert "bcache: set CACHE_SET_IO_DISABLE in bch_cached_dev_error()" Coly Li
2019-06-28 11:59 ` [PATCH 05/37] bcache: avoid flushing btree node in cache_set_flush() if io disabled Coly Li
2019-06-28 11:59 ` [PATCH 06/37] bcache: ignore read-ahead request failure on backing device Coly Li
2019-06-28 11:59 ` [PATCH 07/37] bcache: add io error counting in write_bdev_super_endio() Coly Li
2019-06-28 11:59 ` [PATCH 08/37] bcache: remove unnecessary prefetch() in bset_search_tree() Coly Li
2019-06-28 11:59 ` [PATCH 09/37] bcache: use sysfs_match_string() instead of __sysfs_match_string() Coly Li
2019-06-28 11:59 ` [PATCH 10/37] bcache: add return value check to bch_cached_dev_run() Coly Li
2019-06-28 11:59 ` [PATCH 11/37] bcache: remove unncessary code in bch_btree_keys_init() Coly Li
2019-06-28 11:59 ` [PATCH 12/37] bcache: check CACHE_SET_IO_DISABLE in allocator code Coly Li
2019-06-28 11:59 ` [PATCH 13/37] bcache: check CACHE_SET_IO_DISABLE bit in bch_journal() Coly Li
2019-06-28 11:59 ` [PATCH 14/37] bcache: more detailed error message to bcache_device_link() Coly Li
2019-06-28 11:59 ` [PATCH 15/37] bcache: add more error message in bch_cached_dev_attach() Coly Li
2019-06-28 11:59 ` [PATCH 16/37] bcache: improve error message in bch_cached_dev_run() Coly Li
2019-06-28 11:59 ` [PATCH 17/37] bcache: remove "XXX:" comment line from run_cache_set() Coly Li
2019-06-28 11:59 ` [PATCH 18/37] bcache: make bset_search_tree() be more understandable Coly Li
2019-06-28 11:59 ` [PATCH 19/37] bcache: add pendings_cleanup to stop pending bcache device Coly Li
2019-06-28 11:59 ` [PATCH 20/37] bcache: fix mistaken sysfs entry for io_error counter Coly Li
2019-06-28 11:59 ` [PATCH 21/37] bcache: destroy dc->writeback_write_wq if failed to create dc->writeback_thread Coly Li
2019-06-28 11:59 ` [PATCH 22/37] bcache: stop writeback kthread and kworker when bch_cached_dev_run() failed Coly Li
2019-06-28 11:59 ` [PATCH 23/37] bcache: avoid a deadlock in bcache_reboot() Coly Li
2019-06-28 11:59 ` [PATCH 24/37] bcache: acquire bch_register_lock later in cached_dev_detach_finish() Coly Li
2019-06-28 11:59 ` Coly Li [this message]
2019-06-28 11:59 ` [PATCH 26/37] bcache: fix potential deadlock in cached_def_free() Coly Li
2019-06-28 11:59 ` [PATCH 27/37] bcache: add code comments for journal_read_bucket() Coly Li
2019-06-28 11:59 ` [PATCH 28/37] bcache: set largest seq to ja->seq[bucket_index] in journal_read_bucket() Coly Li
2019-06-28 11:59 ` [PATCH 29/37] bcache: shrink btree node cache after bch_btree_check() Coly Li
2019-06-28 11:59 ` [PATCH 30/37] bcache: Revert "bcache: free heap cache_set->flush_btree in bch_journal_free" Coly Li
2019-06-28 11:59 ` [PATCH 31/37] bcache: Revert "bcache: fix high CPU occupancy during journal" Coly Li
2019-06-28 11:59 ` [PATCH 32/37] bcache: only clear BTREE_NODE_dirty bit when it is set Coly Li
2019-06-28 11:59 ` [PATCH 33/37] bcache: add comments for mutex_lock(&b->write_lock) Coly Li
2019-06-28 11:59 ` [PATCH 34/37] bcache: remove retry_flush_write from struct cache_set Coly Li
2019-06-28 11:59 ` [PATCH 35/37] bcache: fix race in btree_flush_write() Coly Li
2019-06-28 11:59 ` [PATCH 36/37] bcache: performance improvement for btree_flush_write() Coly Li
2019-06-28 12:00 ` [PATCH 37/37] bcache: add reclaimed_journal_buckets to struct cache_set Coly Li
2019-06-28 13:42 ` [PATCH 00/37] bcache patches for Linux v5.3 Jens Axboe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190628120000.40753-26-colyli@suse.de \
    --to=colyli@suse.de \
    --cc=axboe@kernel.dk \
    --cc=linux-bcache@vger.kernel.org \
    --cc=linux-block@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.