linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Sasha Levin <sashal@kernel.org>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: Guoju Fang <fangguoju@gmail.com>, Coly Li <colyli@suse.de>,
	Jens Axboe <axboe@kernel.dk>, Sasha Levin <sashal@kernel.org>,
	linux-bcache@vger.kernel.org, linux-raid@vger.kernel.org
Subject: [PATCH AUTOSEL 4.9 11/90] bcache: fix a lost wake-up problem caused by mca_cannibalize_lock
Date: Thu, 17 Sep 2020 22:13:36 -0400	[thread overview]
Message-ID: <20200918021455.2067301-11-sashal@kernel.org> (raw)
In-Reply-To: <20200918021455.2067301-1-sashal@kernel.org>

From: Guoju Fang <fangguoju@gmail.com>

[ Upstream commit 34cf78bf34d48dddddfeeadb44f9841d7864997a ]

This patch fix a lost wake-up problem caused by the race between
mca_cannibalize_lock and bch_cannibalize_unlock.

Consider two processes, A and B. Process A is executing
mca_cannibalize_lock, while process B takes c->btree_cache_alloc_lock
and is executing bch_cannibalize_unlock. The problem happens that after
process A executes cmpxchg and will execute prepare_to_wait. In this
timeslice process B executes wake_up, but after that process A executes
prepare_to_wait and set the state to TASK_INTERRUPTIBLE. Then process A
goes to sleep but no one will wake up it. This problem may cause bcache
device to dead.

Signed-off-by: Guoju Fang <fangguoju@gmail.com>
Signed-off-by: Coly Li <colyli@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/md/bcache/bcache.h |  1 +
 drivers/md/bcache/btree.c  | 12 ++++++++----
 drivers/md/bcache/super.c  |  1 +
 3 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/drivers/md/bcache/bcache.h b/drivers/md/bcache/bcache.h
index 7fe7df56fa334..f0939fc1cfe55 100644
--- a/drivers/md/bcache/bcache.h
+++ b/drivers/md/bcache/bcache.h
@@ -547,6 +547,7 @@ struct cache_set {
 	 */
 	wait_queue_head_t	btree_cache_wait;
 	struct task_struct	*btree_cache_alloc_lock;
+	spinlock_t		btree_cannibalize_lock;
 
 	/*
 	 * When we free a btree node, we increment the gen of the bucket the
diff --git a/drivers/md/bcache/btree.c b/drivers/md/bcache/btree.c
index 764d519a7f1c6..26e56a9952d09 100644
--- a/drivers/md/bcache/btree.c
+++ b/drivers/md/bcache/btree.c
@@ -836,15 +836,17 @@ out:
 
 static int mca_cannibalize_lock(struct cache_set *c, struct btree_op *op)
 {
-	struct task_struct *old;
-
-	old = cmpxchg(&c->btree_cache_alloc_lock, NULL, current);
-	if (old && old != current) {
+	spin_lock(&c->btree_cannibalize_lock);
+	if (likely(c->btree_cache_alloc_lock == NULL)) {
+		c->btree_cache_alloc_lock = current;
+	} else if (c->btree_cache_alloc_lock != current) {
 		if (op)
 			prepare_to_wait(&c->btree_cache_wait, &op->wait,
 					TASK_UNINTERRUPTIBLE);
+		spin_unlock(&c->btree_cannibalize_lock);
 		return -EINTR;
 	}
+	spin_unlock(&c->btree_cannibalize_lock);
 
 	return 0;
 }
@@ -879,10 +881,12 @@ static struct btree *mca_cannibalize(struct cache_set *c, struct btree_op *op,
  */
 static void bch_cannibalize_unlock(struct cache_set *c)
 {
+	spin_lock(&c->btree_cannibalize_lock);
 	if (c->btree_cache_alloc_lock == current) {
 		c->btree_cache_alloc_lock = NULL;
 		wake_up(&c->btree_cache_wait);
 	}
+	spin_unlock(&c->btree_cannibalize_lock);
 }
 
 static struct btree *mca_alloc(struct cache_set *c, struct btree_op *op,
diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
index 95e9a33de06a2..263c0d987929e 100644
--- a/drivers/md/bcache/super.c
+++ b/drivers/md/bcache/super.c
@@ -1510,6 +1510,7 @@ struct cache_set *bch_cache_set_alloc(struct cache_sb *sb)
 	sema_init(&c->sb_write_mutex, 1);
 	mutex_init(&c->bucket_lock);
 	init_waitqueue_head(&c->btree_cache_wait);
+	spin_lock_init(&c->btree_cannibalize_lock);
 	init_waitqueue_head(&c->bucket_wait);
 	init_waitqueue_head(&c->gc_wait);
 	sema_init(&c->uuid_write_mutex, 1);
-- 
2.25.1


  parent reply	other threads:[~2020-09-18  2:29 UTC|newest]

Thread overview: 95+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-09-18  2:13 [PATCH AUTOSEL 4.9 01/90] scsi: aacraid: fix illegal IO beyond last LBA Sasha Levin
2020-09-18  2:13 ` [PATCH AUTOSEL 4.9 02/90] m68k: q40: Fix info-leak in rtc_ioctl Sasha Levin
2020-09-18  2:13 ` [PATCH AUTOSEL 4.9 03/90] gma/gma500: fix a memory disclosure bug due to uninitialized bytes Sasha Levin
2020-09-18  2:13 ` [PATCH AUTOSEL 4.9 04/90] ASoC: kirkwood: fix IRQ error handling Sasha Levin
2020-09-18  2:13 ` [PATCH AUTOSEL 4.9 05/90] ata: sata_mv, avoid trigerrable BUG_ON Sasha Levin
2020-09-18  2:13 ` [PATCH AUTOSEL 4.9 06/90] PM / devfreq: tegra30: Fix integer overflow on CPU's freq max out Sasha Levin
2020-09-18  2:13 ` [PATCH AUTOSEL 4.9 07/90] clk/ti/adpll: allocate room for terminating null Sasha Levin
2020-09-18  2:13 ` [PATCH AUTOSEL 4.9 08/90] mtd: cfi_cmdset_0002: don't free cfi->cfiq in error path of cfi_amdstd_setup() Sasha Levin
2020-09-18  2:13 ` [PATCH AUTOSEL 4.9 09/90] mfd: mfd-core: Protect against NULL call-back function pointer Sasha Levin
2020-09-18  2:13 ` [PATCH AUTOSEL 4.9 10/90] tracing: Adding NULL checks for trace_array descriptor pointer Sasha Levin
2020-09-18  2:13 ` Sasha Levin [this message]
2020-09-18  2:13 ` [PATCH AUTOSEL 4.9 12/90] RDMA/i40iw: Fix potential use after free Sasha Levin
2020-09-18  2:13 ` [PATCH AUTOSEL 4.9 13/90] xfs: fix attr leaf header freemap.size underflow Sasha Levin
2020-09-18  2:13 ` [PATCH AUTOSEL 4.9 14/90] RDMA/iw_cgxb4: Fix an error handling path in 'c4iw_connect()' Sasha Levin
2020-09-18  2:13 ` [PATCH AUTOSEL 4.9 15/90] debugfs: Fix !DEBUG_FS debugfs_create_automount Sasha Levin
2020-09-18  2:13 ` [PATCH AUTOSEL 4.9 16/90] CIFS: Properly process SMB3 lease breaks Sasha Levin
2020-09-18  2:13 ` [PATCH AUTOSEL 4.9 17/90] kernel/sys.c: avoid copying possible padding bytes in copy_to_user Sasha Levin
2020-09-18  2:13 ` [PATCH AUTOSEL 4.9 18/90] neigh_stat_seq_next() should increase position index Sasha Levin
2020-09-18  2:13 ` [PATCH AUTOSEL 4.9 19/90] rt_cpu_seq_next " Sasha Levin
2020-09-18  2:13 ` [PATCH AUTOSEL 4.9 20/90] seqlock: Require WRITE_ONCE surrounding raw_seqcount_barrier Sasha Levin
2020-09-18  2:13 ` [PATCH AUTOSEL 4.9 21/90] media: ti-vpe: cal: Restrict DMA to avoid memory corruption Sasha Levin
2020-09-18  2:13 ` [PATCH AUTOSEL 4.9 22/90] ACPI: EC: Reference count query handlers under lock Sasha Levin
2020-09-18  2:13 ` [PATCH AUTOSEL 4.9 23/90] efi/arm: Defer probe of PCIe backed efifb on DT systems Sasha Levin
2020-09-18  6:25   ` Ard Biesheuvel
2020-09-18  2:13 ` [PATCH AUTOSEL 4.9 24/90] dmaengine: zynqmp_dma: fix burst length configuration Sasha Levin
2020-09-18  2:13 ` [PATCH AUTOSEL 4.9 25/90] tracing: Set kernel_stack's caller size properly Sasha Levin
2020-09-18  2:13 ` [PATCH AUTOSEL 4.9 26/90] ext4: make dioread_nolock the default Sasha Levin
2020-09-18  2:13 ` [PATCH AUTOSEL 4.9 27/90] ar5523: Add USB ID of SMCWUSBT-G2 wireless adapter Sasha Levin
2020-09-18  2:13 ` [PATCH AUTOSEL 4.9 28/90] Bluetooth: Fix refcount use-after-free issue Sasha Levin
2020-09-18  2:13 ` [PATCH AUTOSEL 4.9 29/90] mm: pagewalk: fix termination condition in walk_pte_range() Sasha Levin
2020-09-18  2:13 ` [PATCH AUTOSEL 4.9 30/90] Bluetooth: prefetch channel before killing sock Sasha Levin
2020-09-18  2:13 ` [PATCH AUTOSEL 4.9 31/90] KVM: fix overflow of zero page refcount with ksm running Sasha Levin
2020-09-18  2:13 ` [PATCH AUTOSEL 4.9 32/90] ALSA: hda: Clear RIRB status before reading WP Sasha Levin
2020-09-18  2:13 ` [PATCH AUTOSEL 4.9 33/90] skbuff: fix a data race in skb_queue_len() Sasha Levin
2020-09-18  2:13 ` [PATCH AUTOSEL 4.9 34/90] audit: CONFIG_CHANGE don't log internal bookkeeping as an event Sasha Levin
2020-09-18  2:14 ` [PATCH AUTOSEL 4.9 35/90] selinux: sel_avc_get_stat_idx should increase position index Sasha Levin
2020-09-18  2:14 ` [PATCH AUTOSEL 4.9 36/90] scsi: lpfc: Fix RQ buffer leakage when no IOCBs available Sasha Levin
2020-09-18  2:14 ` [PATCH AUTOSEL 4.9 37/90] scsi: lpfc: Fix coverity errors in fmdi attribute handling Sasha Levin
2020-09-18  2:14 ` [PATCH AUTOSEL 4.9 38/90] drm/omap: fix possible object reference leak Sasha Levin
2020-09-18  2:14 ` [PATCH AUTOSEL 4.9 39/90] RDMA/rxe: Fix configuration of atomic queue pair attributes Sasha Levin
2020-09-18  2:14 ` [PATCH AUTOSEL 4.9 40/90] KVM: x86: fix incorrect comparison in trace event Sasha Levin
2020-09-18  2:14 ` [PATCH AUTOSEL 4.9 41/90] x86/pkeys: Add check for pkey "overflow" Sasha Levin
2020-09-18  2:14 ` [PATCH AUTOSEL 4.9 42/90] bpf: Remove recursion prevention from rcu free callback Sasha Levin
2020-09-18  2:14 ` [PATCH AUTOSEL 4.9 43/90] dmaengine: tegra-apb: Prevent race conditions on channel's freeing Sasha Levin
2020-09-18  2:14 ` [PATCH AUTOSEL 4.9 44/90] media: go7007: Fix URB type for interrupt handling Sasha Levin
2020-09-18  2:14 ` [PATCH AUTOSEL 4.9 45/90] Bluetooth: guard against controllers sending zero'd events Sasha Levin
2020-09-18  2:14 ` [PATCH AUTOSEL 4.9 46/90] timekeeping: Prevent 32bit truncation in scale64_check_overflow() Sasha Levin
2020-09-18  2:14 ` [PATCH AUTOSEL 4.9 47/90] drm/amdgpu: increase atombios cmd timeout Sasha Levin
2020-09-18  2:14 ` [PATCH AUTOSEL 4.9 48/90] Bluetooth: L2CAP: handle l2cap config request during open state Sasha Levin
2020-09-18  2:14 ` [PATCH AUTOSEL 4.9 49/90] media: tda10071: fix unsigned sign extension overflow Sasha Levin
2020-09-18  2:14 ` [PATCH AUTOSEL 4.9 50/90] xfs: don't ever return a stale pointer from __xfs_dir3_free_read Sasha Levin
2020-09-18  2:14 ` [PATCH AUTOSEL 4.9 51/90] tpm: ibmvtpm: Wait for buffer to be set before proceeding Sasha Levin
2020-09-18  2:14 ` [PATCH AUTOSEL 4.9 52/90] tracing: Use address-of operator on section symbols Sasha Levin
2020-09-18  2:14 ` [PATCH AUTOSEL 4.9 53/90] serial: 8250_port: Don't service RX FIFO if throttled Sasha Levin
2020-09-18  2:14 ` [PATCH AUTOSEL 4.9 54/90] serial: 8250_omap: Fix sleeping function called from invalid context during probe Sasha Levin
2020-09-18  2:14 ` [PATCH AUTOSEL 4.9 55/90] serial: 8250: 8250_omap: Terminate DMA before pushing data on RX timeout Sasha Levin
2020-09-18  2:14 ` [PATCH AUTOSEL 4.9 56/90] cpufreq: powernv: Fix frame-size-overflow in powernv_cpufreq_work_fn Sasha Levin
2020-09-18  2:14 ` [PATCH AUTOSEL 4.9 57/90] tools: gpio-hammer: Avoid potential overflow in main Sasha Levin
2020-09-18  2:14 ` [PATCH AUTOSEL 4.9 58/90] SUNRPC: Fix a potential buffer overflow in 'svc_print_xprts()' Sasha Levin
2020-09-18  2:14 ` [PATCH AUTOSEL 4.9 59/90] svcrdma: Fix leak of transport addresses Sasha Levin
2020-09-18  2:14 ` [PATCH AUTOSEL 4.9 60/90] ubifs: Fix out-of-bounds memory access caused by abnormal value of node_len Sasha Levin
2020-09-18  2:14 ` [PATCH AUTOSEL 4.9 61/90] ALSA: usb-audio: Fix case when USB MIDI interface has more than one extra endpoint descriptor Sasha Levin
2020-09-18  2:14 ` [PATCH AUTOSEL 4.9 62/90] mm/filemap.c: clear page error before actual read Sasha Levin
2020-09-18  2:14 ` [PATCH AUTOSEL 4.9 63/90] mm/mmap.c: initialize align_offset explicitly for vm_unmapped_area Sasha Levin
2020-09-18  2:14 ` [PATCH AUTOSEL 4.9 64/90] serial: uartps: Wait for tx_empty in console setup Sasha Levin
2020-09-28 20:11   ` Naresh Kamboju
2020-09-28 20:13     ` Naresh Kamboju
2020-09-29  6:59       ` Greg Kroah-Hartman
2020-09-29 17:39         ` Sasha Levin
2020-09-18  2:14 ` [PATCH AUTOSEL 4.9 65/90] KVM: Remove CREATE_IRQCHIP/SET_PIT2 race Sasha Levin
2020-09-18  2:14 ` [PATCH AUTOSEL 4.9 66/90] bdev: Reduce time holding bd_mutex in sync in blkdev_close() Sasha Levin
2020-09-18  2:14 ` [PATCH AUTOSEL 4.9 67/90] drivers: char: tlclk.c: Avoid data race between init and interrupt handler Sasha Levin
2020-09-18  2:14 ` [PATCH AUTOSEL 4.9 68/90] dt-bindings: sound: wm8994: Correct required supplies based on actual implementaion Sasha Levin
2020-09-18  2:14 ` [PATCH AUTOSEL 4.9 69/90] atm: fix a memory leak of vcc->user_back Sasha Levin
2020-09-18  2:14 ` [PATCH AUTOSEL 4.9 70/90] phy: samsung: s5pv210-usb2: Add delay after reset Sasha Levin
2020-09-18  2:14 ` [PATCH AUTOSEL 4.9 71/90] Bluetooth: Handle Inquiry Cancel error after Inquiry Complete Sasha Levin
2020-09-18  2:14 ` [PATCH AUTOSEL 4.9 72/90] USB: EHCI: ehci-mv: fix error handling in mv_ehci_probe() Sasha Levin
2020-09-18  2:14 ` [PATCH AUTOSEL 4.9 73/90] tty: serial: samsung: Correct clock selection logic Sasha Levin
2020-09-18  2:14 ` [PATCH AUTOSEL 4.9 74/90] ALSA: hda: Fix potential race in unsol event handler Sasha Levin
2020-09-18  2:14 ` [PATCH AUTOSEL 4.9 75/90] fuse: don't check refcount after stealing page Sasha Levin
2020-09-18  2:14 ` [PATCH AUTOSEL 4.9 76/90] USB: EHCI: ehci-mv: fix less than zero comparison of an unsigned int Sasha Levin
2020-09-18  2:14 ` [PATCH AUTOSEL 4.9 77/90] e1000: Do not perform reset in reset_task if we are already down Sasha Levin
2020-09-18  2:14 ` [PATCH AUTOSEL 4.9 78/90] printk: handle blank console arguments passed in Sasha Levin
2020-09-18  2:14 ` [PATCH AUTOSEL 4.9 79/90] btrfs: don't force read-only after error in drop snapshot Sasha Levin
2020-09-18  2:14 ` [PATCH AUTOSEL 4.9 80/90] vfio/pci: fix memory leaks of eventfd ctx Sasha Levin
2020-09-18  2:14 ` [PATCH AUTOSEL 4.9 81/90] perf util: Fix memory leak of prefix_if_not_in Sasha Levin
2020-09-18  2:14 ` [PATCH AUTOSEL 4.9 82/90] perf kcore_copy: Fix module map when there are no modules loaded Sasha Levin
2020-09-18  2:14 ` [PATCH AUTOSEL 4.9 83/90] mtd: rawnand: omap_elm: Fix runtime PM imbalance on error Sasha Levin
2020-09-18  2:14 ` [PATCH AUTOSEL 4.9 84/90] ceph: fix potential race in ceph_check_caps Sasha Levin
2020-09-18  2:14 ` [PATCH AUTOSEL 4.9 85/90] mtd: parser: cmdline: Support MTD names containing one or more colons Sasha Levin
2020-09-18  2:14 ` [PATCH AUTOSEL 4.9 86/90] x86/speculation/mds: Mark mds_user_clear_cpu_buffers() __always_inline Sasha Levin
2020-09-18  2:14 ` [PATCH AUTOSEL 4.9 87/90] vfio/pci: Clear error and request eventfd ctx after releasing Sasha Levin
2020-09-18  2:14 ` [PATCH AUTOSEL 4.9 88/90] cifs: Fix double add page to memcg when cifs_readpages Sasha Levin
2020-09-18  2:14 ` [PATCH AUTOSEL 4.9 89/90] selftests/x86/syscall_nt: Clear weird flags after each test Sasha Levin
2020-09-18  2:14 ` [PATCH AUTOSEL 4.9 90/90] vfio/pci: fix racy on error and request eventfd ctx Sasha Levin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200918021455.2067301-11-sashal@kernel.org \
    --to=sashal@kernel.org \
    --cc=axboe@kernel.dk \
    --cc=colyli@suse.de \
    --cc=fangguoju@gmail.com \
    --cc=linux-bcache@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-raid@vger.kernel.org \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).