linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Greg KH <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: torvalds@linux-foundation.org, akpm@linux-foundation.org,
	alan@lxorguk.ukuu.org.uk, Ray Morris <support@bettercgi.com>,
	NeilBrown <neilb@suse.de>
Subject: [ 051/108] md/raid1,raid10: avoid deadlock during resync/recovery.
Date: Fri, 30 Mar 2012 12:58:13 -0700	[thread overview]
Message-ID: <20120330195729.446369560@linuxfoundation.org> (raw)
In-Reply-To: <20120330195812.GA31833@kroah.com>

3.0-stable review patch.  If anyone has any objections, please let me know.

------------------

From: NeilBrown <neilb@suse.de>

commit d6b42dcb995e6acd7cc276774e751ffc9f0ef4bf upstream.

If RAID1 or RAID10 is used under LVM or some other stacking
block device, it is possible to enter a deadlock during
resync or recovery.
This can happen if the upper level block device creates
two requests to the RAID1 or RAID10.  The first request gets
processed, blocks recovery and queue requests for underlying
requests in current->bio_list.  A resync request then starts
which will wait for those requests and block new IO.

But then the second request to the RAID1/10 will be attempted
and it cannot progress until the resync request completes,
which cannot progress until the underlying device requests complete,
which are on a queue behind that second request.

So allow that second request to proceed even though there is
a resync request about to start.

This is suitable for any -stable kernel.

Reported-by: Ray Morris <support@bettercgi.com>
Tested-by: Ray Morris <support@bettercgi.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/md/raid1.c  |   17 +++++++++++++++--
 drivers/md/raid10.c |   17 +++++++++++++++--
 2 files changed, 30 insertions(+), 4 deletions(-)

--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -614,9 +614,22 @@ static void wait_barrier(conf_t *conf)
 	spin_lock_irq(&conf->resync_lock);
 	if (conf->barrier) {
 		conf->nr_waiting++;
-		wait_event_lock_irq(conf->wait_barrier, !conf->barrier,
+		/* Wait for the barrier to drop.
+		 * However if there are already pending
+		 * requests (preventing the barrier from
+		 * rising completely), and the
+		 * pre-process bio queue isn't empty,
+		 * then don't wait, as we need to empty
+		 * that queue to get the nr_pending
+		 * count down.
+		 */
+		wait_event_lock_irq(conf->wait_barrier,
+				    !conf->barrier ||
+				    (conf->nr_pending &&
+				     current->bio_list &&
+				     !bio_list_empty(current->bio_list)),
 				    conf->resync_lock,
-				    );
+			);
 		conf->nr_waiting--;
 	}
 	conf->nr_pending++;
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -667,9 +667,22 @@ static void wait_barrier(conf_t *conf)
 	spin_lock_irq(&conf->resync_lock);
 	if (conf->barrier) {
 		conf->nr_waiting++;
-		wait_event_lock_irq(conf->wait_barrier, !conf->barrier,
+		/* Wait for the barrier to drop.
+		 * However if there are already pending
+		 * requests (preventing the barrier from
+		 * rising completely), and the
+		 * pre-process bio queue isn't empty,
+		 * then don't wait, as we need to empty
+		 * that queue to get the nr_pending
+		 * count down.
+		 */
+		wait_event_lock_irq(conf->wait_barrier,
+				    !conf->barrier ||
+				    (conf->nr_pending &&
+				     current->bio_list &&
+				     !bio_list_empty(current->bio_list)),
 				    conf->resync_lock,
-				    );
+			);
 		conf->nr_waiting--;
 	}
 	conf->nr_pending++;



  parent reply	other threads:[~2012-03-30 21:06 UTC|newest]

Thread overview: 116+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-03-30 19:58 [ 000/108] 3.0.27-stable review Greg KH
2012-03-30 19:57 ` [ 001/108] USB: option: Add MediaTek MT6276M modem&app interfaces Greg KH
2012-03-30 19:57 ` [ 002/108] USB: option driver: adding support for Telit CC864-SINGLE, CC864-DUAL and DE910-DUAL modems Greg KH
2012-03-30 19:57 ` [ 003/108] USB: option: make interface blacklist work again Greg KH
2012-03-30 19:57 ` [ 004/108] USB: option: add ZTE MF820D Greg KH
2012-03-30 19:57 ` [ 005/108] USB: ftdi_sio: fix problem when the manufacture is a NULL string Greg KH
2012-03-30 19:57 ` [ 006/108] USB: ftdi_sio: add support for BeagleBone rev A5+ Greg KH
2012-03-30 19:57 ` [ 007/108] USB: Microchip VID mislabeled as Hornby VID in ftdi_sio Greg KH
2012-03-30 19:57 ` [ 008/108] USB: ftdi_sio: new PID: Distortec JTAG-lock-pick Greg KH
2012-03-30 19:57 ` [ 009/108] USB: ftdi_sio: add support for FT-X series devices Greg KH
2012-03-30 19:57 ` [ 010/108] USB: ftdi_sio: new PID: LUMEL PD12 Greg KH
2012-03-30 19:57 ` [ 011/108] powerpc/usb: fix bug of kernel hang when initializing usb Greg KH
2012-04-13  5:21   ` Anthony Foiani
2012-04-13 17:42     ` Greg KH
2012-03-30 19:57 ` [ 012/108] usb: musb: Reselect index reg in interrupt context Greg KH
2012-03-30 19:57 ` [ 013/108] usb: gadgetfs: return number of bytes on ep0 read request Greg KH
2012-03-30 19:57 ` [ 014/108] USB: gadget: Make g_hid device class conform to spec Greg KH
2012-03-30 19:57 ` [ 015/108] futex: Cover all PI opcodes with cmpxchg enabled check Greg KH
2012-03-30 19:57 ` [ 016/108] sysfs: Fix memory leak in sysfs_sd_setsecdata() Greg KH
2012-03-30 19:57 ` [ 017/108] tty: moxa: fix bit test in moxa_start() Greg KH
2012-03-30 19:57 ` [ 018/108] TTY: Wrong unicode value copied in con_set_unimap() Greg KH
2012-03-30 19:57 ` [ 019/108] USB: serial: fix console error reporting Greg KH
2012-03-30 19:57 ` [ 020/108] cdc-wdm: Fix more races on the read path Greg KH
2012-03-30 19:57 ` [ 021/108] cdc-wdm: Dont clear WDM_READ unless entire read buffer is emptied Greg KH
2012-03-30 19:57 ` [ 022/108] usb: fsl_udc_core: Fix scheduling while atomic dump message Greg KH
2012-03-30 19:57 ` [ 023/108] usb: Fix build error due to dma_mask is not at pdev_archdata at ARM Greg KH
2012-03-30 19:57 ` [ 024/108] USB: qcserial: add several new serial devices Greg KH
2012-03-30 19:57 ` [ 025/108] USB: qcserial: dont grab QMI port on Gobi 1000 devices Greg KH
2012-03-30 19:57 ` [ 026/108] usb-serial: Add support for the Sealevel SeaLINK+8 2038-ROHS device Greg KH
2012-03-30 19:57 ` [ 027/108] usb: cp210x: Update to support CP2105 and multiple interface devices Greg KH
2012-03-30 19:57 ` [ 028/108] USB: serial: mos7840: Fixed MCS7820 device attach problem Greg KH
2012-03-30 19:57 ` [ 029/108] rt2x00: Add support for D-Link DWA-127 to rt2800usb Greg KH
2012-03-30 19:57 ` [ 030/108] rtlwifi: Handle previous allocation failures when freeing device memory Greg KH
2012-03-30 19:57 ` [ 031/108] rtlwifi: rtl8192c: Prevent sleeping from invalid context in rtl8192cu Greg KH
2012-03-30 19:57 ` [ 032/108] rtlwifi: rtl8192ce: Fix loss of receive performance Greg KH
2012-03-30 19:57 ` [ 033/108] serial: PL011: clear pending interrupts Greg KH
2012-04-01 11:43   ` Linus Walleij
2012-04-02 16:23     ` Greg KH
2012-04-03  7:46       ` Linus Walleij
2012-03-30 19:57 ` [ 034/108] math: Introduce div64_long Greg KH
2012-03-30 19:57 ` [ 035/108] ntp: Fix integer overflow when setting time Greg KH
2012-03-30 19:57 ` [ 036/108] uevent: send events in correct order according to seqnum (v3) Greg KH
2012-03-30 19:57 ` [ 037/108] genirq: Fix long-term regression in genirq irq_set_irq_type() handling Greg KH
2012-03-30 19:58 ` [ 038/108] genirq: Fix incorrect check for forced IRQ thread handler Greg KH
2012-03-30 19:58 ` [ 039/108] rtc: Disable the alarm in the hardware (v2) Greg KH
2012-03-30 19:58 ` [ 040/108] p54spi: Release GPIO lines and IRQ on error in p54spi_probe Greg KH
2012-03-30 19:58 ` [ 041/108] IB/iser: Post initial receive buffers before sending the final login request Greg KH
2012-03-30 19:58 ` [ 042/108] x86/ioapic: Add register level checks to detect bogus io-apic entries Greg KH
2012-03-30 19:58 ` [ 043/108] mm: thp: fix pmd_bad() triggering in code paths holding mmap_sem read mode Greg KH
2012-03-30 19:58 ` [ 044/108] bootmem/sparsemem: remove limit constraint in alloc_bootmem_section Greg KH
2012-03-30 19:58 ` [ 045/108] hugetlbfs: avoid taking i_mutex from hugetlbfs_read() Greg KH
2012-03-30 19:58 ` [ 046/108] ASoC: pxa-ssp: atomically set stream active masks Greg KH
2012-03-30 19:58 ` [ 047/108] tcm_loop: Set residual field for SCSI commands Greg KH
2012-03-30 19:58 ` [ 048/108] udlfb: remove sysfs framebuffer device with USB .disconnect() Greg KH
2012-03-30 19:58 ` [ 049/108] tcm_fc: Fix fc_exch memory leak in ft_send_resp_status Greg KH
2012-03-30 19:58 ` [ 050/108] md/bitmap: ensure to load bitmap when creating via sysfs Greg KH
2012-03-30 19:58 ` Greg KH [this message]
2012-03-30 19:58 ` [ 052/108] drm/radeon: Restrict offset for legacy hardware cursor Greg KH
2012-03-30 19:58 ` [ 053/108] drm/radeon/kms: fix analog load detection on DVI-I connectors Greg KH
2012-03-30 19:58 ` [ 054/108] drm/radeon/kms: add connector quirk for Fujitsu D3003-S2 board Greg KH
2012-03-30 19:58 ` [ 055/108] target: Dont set WBUS16 or SYNC bits in INQUIRY response Greg KH
2012-03-30 19:58 ` [ 056/108] target: Fix 16-bit target ports for SET TARGET PORT GROUPS emulation Greg KH
2012-03-30 19:58 ` [ 057/108] Bluetooth: Add AR30XX device ID on Asus laptops Greg KH
2012-03-30 19:58 ` [ 058/108] HID: add extra hotkeys in Asus AIO keyboards Greg KH
2012-03-30 19:58 ` [ 059/108] HID: add more " Greg KH
2012-03-30 19:58 ` [ 060/108] pata_legacy: correctly mask recovery field for HT6560B Greg KH
2012-03-30 19:58 ` [ 061/108] firewire: ohci: fix too-early completion of IR multichannel buffers Greg KH
2012-03-30 19:58 ` [ 062/108] video:uvesafb: Fix oops that uvesafb try to execute NX-protected page Greg KH
2012-03-30 21:32   ` Florian Tobias Schandinat
2012-03-31 18:03     ` Greg KH
2012-03-30 19:58 ` [ 063/108] KVM: x86: extend "struct x86_emulate_ops" with "get_cpuid" Greg KH
2012-03-30 19:58 ` [ 064/108] KVM: x86: fix missing checks in syscall emulation Greg KH
2012-03-30 19:58 ` [ 065/108] NFS: Properly handle the case where the delegation is revoked Greg KH
2012-03-30 19:58 ` [ 066/108] NFSv4: Return the delegation if the server returns NFS4ERR_OPENMODE Greg KH
2012-03-30 19:58 ` [ 067/108] xfs: fix inode lookup race Greg KH
2012-03-30 19:58 ` [ 068/108] cifs: fix issue mounting of DFS ROOT when redirecting from one domain controller to the next Greg KH
2012-03-30 19:58 ` [ 069/108] UBI: fix error handling in ubi_scan() Greg KH
2012-03-30 19:58 ` [ 070/108] UBI: fix eraseblock picking criteria Greg KH
2012-03-30 19:58 ` [ 071/108] SUNRPC: We must not use list_for_each_entry_safe() in rpc_wake_up() Greg KH
2012-03-30 19:58 ` [ 072/108] usbnet: increase URB reference count before usb_unlink_urb Greg KH
2012-03-30 19:58 ` [ 073/108] usbnet: dont clear urb->dev in tx_complete Greg KH
2012-03-30 19:58 ` [ 074/108] x86-32: Fix endless loop when processing signals for kernel tasks Greg KH
2012-03-30 19:58 ` [ 075/108] proc-ns: use d_set_d_op() API to set dentry ops in proc_ns_instantiate() Greg KH
2012-03-30 19:58 ` [ 076/108] hwmon: (fam15h_power) Correct sign extension of running_avg_capture Greg KH
2012-03-30 19:58 ` [ 077/108] [media] lgdt330x: fix signedness error in i2c_read_demod_bytes() Greg KH
2012-03-30 19:58 ` [ 078/108] [media] pvrusb2: fix 7MHz & 8MHz DVB-T tuner support for HVR1900 rev D1F5 Greg KH
2012-03-30 19:58 ` [ 079/108] e1000e: Avoid wrong check on TX hang Greg KH
2012-03-30 19:58 ` [ 080/108] PM / Hibernate: Enable usermodehelpers in hibernate() error path Greg KH
2012-03-30 19:58 ` [ 081/108] ext4: flush any pending end_io requests before DIO reads w/dioread_nolock Greg KH
2012-03-30 19:58 ` [ 082/108] jbd2: clear BH_Delay & BH_Unwritten in journal_unmap_buffer Greg KH
2012-03-30 19:58 ` [ 083/108] ext4: ignore EXT4_INODE_JOURNAL_DATA flag with delalloc Greg KH
2012-03-30 19:58 ` [ 084/108] ext4: check for zero length extent Greg KH
2012-03-30 19:58 ` [ 085/108] vfs: fix d_ancestor() case in d_materialize_unique Greg KH
2012-03-30 19:58 ` [ 086/108] udf: Fix deadlock in udf_release_file() Greg KH
2012-03-30 19:58 ` [ 087/108] dm crypt: fix mempool deadlock Greg KH
2012-03-30 19:58 ` [ 088/108] dm crypt: add missing error handling Greg KH
2012-03-30 19:58 ` [ 089/108] dm exception store: fix init error path Greg KH
2012-03-30 19:58 ` [ 090/108] backlight: fix typo in tosa_lcd.c Greg KH
2012-03-30 19:58 ` [ 091/108] xfs: Fix oops on IO error during xlog_recover_process_iunlinks() Greg KH
2012-03-30 19:58 ` [ 092/108] slub: Do not hold slub_lock when calling sysfs_slab_add() Greg KH
2012-03-30 19:58 ` [ 093/108] module: Remove module size limit Greg KH
2012-03-30 19:58 ` [ 094/108] Bluetooth: btusb: fix bInterval for high/super speed isochronous endpoints Greg KH
2012-03-30 19:58 ` [ 095/108] drm/i915: suspend fbdev device around suspend/hibernate Greg KH
2012-03-30 19:58 ` [ 096/108] Fix pppol2tp getsockname() Greg KH
2012-03-30 19:58 ` [ 097/108] net: bpf_jit: fix BPF_S_LDX_B_MSH compilation Greg KH
2012-03-30 19:59 ` [ 098/108] net: fix a potential rcu_read_lock() imbalance in rt6_fill_node() Greg KH
2012-03-30 19:59 ` [ 099/108] net: fix napi_reuse_skb() skb reserve Greg KH
2012-03-30 19:59 ` [ 100/108] Remove printk from rds_sendmsg Greg KH
2012-03-30 19:59 ` [ 101/108] sky2: override for PCI legacy power management Greg KH
2012-03-30 19:59 ` [ 102/108] xfrm: Access the replay notify functions via the registered callbacks Greg KH
2012-03-30 19:59 ` [ 103/108] lockd: fix arg parsing for grace_period and timeout Greg KH
2012-03-30 19:59 ` [ 104/108] x86, tsc: Skip refined tsc calibration on systems with reliable TSC Greg KH
2012-03-30 19:59 ` [ 105/108] x86, tls: Off by one limit check Greg KH
2012-03-30 19:59 ` [ 106/108] compat: use sys_sendfile64() implementation for sendfile syscall Greg KH
2012-03-30 19:59 ` [ 107/108] nfsd: dont allow zero length strings in cache_parse() Greg KH
2012-03-30 19:59 ` [ 108/108] serial: sh-sci: fix a race of DMA submit_tx on transfer Greg KH

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20120330195729.446369560@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=akpm@linux-foundation.org \
    --cc=alan@lxorguk.ukuu.org.uk \
    --cc=linux-kernel@vger.kernel.org \
    --cc=neilb@suse.de \
    --cc=stable@vger.kernel.org \
    --cc=support@bettercgi.com \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).