linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Greg KH <gregkh@suse.de>
To: linux-kernel@vger.kernel.org, stable@kernel.org
Cc: stable-review@kernel.org, torvalds@linux-foundation.org,
	akpm@linux-foundation.org, alan@lxorguk.ukuu.org.uk,
	Kiyoshi Ueda <k-ueda@ct.jp.nec.com>,
	Junichi Nomura <j-nomura@ce.jp.nec.com>,
	Alasdair G Kergon <agk@redhat.com>
Subject: [93/93] dm mpath: fix stall when requeueing io
Date: Fri, 19 Feb 2010 08:30:26 -0800	[thread overview]
Message-ID: <20100219163257.186977438@kvm.kroah.org> (raw)
In-Reply-To: <20100219165717.GA15002@kroah.com>

2.6.32-stable review patch.  If anyone has any objections, please let us know.

------------------

From: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>

commit 9eef87da2a8ea4920e0d913ff977cac064b68ee0 upstream.

This patch fixes the problem that system may stall if target's ->map_rq
returns DM_MAPIO_REQUEUE in map_request().
E.g. stall happens on 1 CPU box when a dm-mpath device with queue_if_no_path
     bounces between all-paths-down and paths-up on I/O load.

When target's ->map_rq returns DM_MAPIO_REQUEUE, map_request() requeues
the request and returns to dm_request_fn().  Then, dm_request_fn()
doesn't exit the I/O dispatching loop and continues processing
the requeued request again.
This map and requeue loop can be done with interrupt disabled,
so 1 CPU system can be stalled if this situation happens.

For example, commands below can stall my 1 CPU box within 1 minute or so:
  # dmsetup table mp
  mp: 0 2097152 multipath 1 queue_if_no_path 0 1 1 service-time 0 1 2 8:144 1 1
  # while true; do dd if=/dev/mapper/mp of=/dev/null bs=1M count=100; done &
  # while true; do \
  > dmsetup message mp 0 "fail_path 8:144" \
  > dmsetup suspend --noflush mp \
  > dmsetup resume mp \
  > dmsetup message mp 0 "reinstate_path 8:144" \
  > done

To fix the problem above, this patch changes dm_request_fn() to exit
the I/O dispatching loop once if a request is requeued in map_request().

Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

---
 drivers/md/dm.c |   21 ++++++++++++++++-----
 1 file changed, 16 insertions(+), 5 deletions(-)

--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -1487,11 +1487,15 @@ static int dm_prep_fn(struct request_que
 	return BLKPREP_OK;
 }
 
-static void map_request(struct dm_target *ti, struct request *rq,
-			struct mapped_device *md)
+/*
+ * Returns:
+ * 0  : the request has been processed (not requeued)
+ * !0 : the request has been requeued
+ */
+static int map_request(struct dm_target *ti, struct request *clone,
+		       struct mapped_device *md)
 {
-	int r;
-	struct request *clone = rq->special;
+	int r, requeued = 0;
 	struct dm_rq_target_io *tio = clone->end_io_data;
 
 	/*
@@ -1516,6 +1520,7 @@ static void map_request(struct dm_target
 	case DM_MAPIO_REQUEUE:
 		/* The target wants to requeue the I/O */
 		dm_requeue_unmapped_request(clone);
+		requeued = 1;
 		break;
 	default:
 		if (r > 0) {
@@ -1527,6 +1532,8 @@ static void map_request(struct dm_target
 		dm_kill_unmapped_request(clone, r);
 		break;
 	}
+
+	return requeued;
 }
 
 /*
@@ -1568,12 +1575,16 @@ static void dm_request_fn(struct request
 
 		blk_start_request(rq);
 		spin_unlock(q->queue_lock);
-		map_request(ti, rq, md);
+		if (map_request(ti, rq, md))
+			goto requeued;
 		spin_lock_irq(q->queue_lock);
 	}
 
 	goto out;
 
+requeued:
+	spin_lock_irq(q->queue_lock);
+
 plug_and_out:
 	if (!elv_queue_empty(q))
 		/* Some requests still remain, retry later */



  parent reply	other threads:[~2010-02-19 17:02 UTC|newest]

Thread overview: 105+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-02-19 16:57 [00/93] 2.6.32.9-stable review Greg KH
2010-02-19 16:28 ` [01/93] Fix potential crash with sys_move_pages Greg KH
2010-02-19 16:28 ` [02/93] futex_lock_pi() key refcnt fix Greg KH
2010-02-19 16:28 ` [03/93] futex: Handle user space corruption gracefully Greg KH
2010-02-19 16:28 ` [04/93] futex: Handle futex value " Greg KH
2010-02-19 16:28 ` [05/93] Fix race in tty_fasync() properly Greg KH
2010-02-19 16:28 ` [06/93] hwmon: (w83781d) Request I/O ports individually for probing Greg KH
2010-02-19 16:29 ` [07/93] hwmon: (lm78) " Greg KH
2010-02-19 16:29 ` [08/93] hwmon: (adt7462) Wrong ADT7462_VOLT_COUNT Greg KH
2010-02-19 16:29 ` [09/93] ALSA: ctxfi - fix PTP address initialization Greg KH
2010-02-19 16:29 ` [10/93] drm/i915: disable hotplug detect before Ironlake CRT detect Greg KH
2010-02-19 16:29 ` [11/93] drm/i915: enable self-refresh on 965 Greg KH
2010-02-19 16:29 ` [12/93] drm/i915: Disable SR when more than one pipe is enabled Greg KH
2010-02-19 16:29 ` [13/93] drm/i915: Fix DDC on some systems by clearing BIOS GMBUS setup Greg KH
2010-02-19 16:29 ` [14/93] drm/i915: Add HP nx9020/SamsungSX20S to ACPI LID quirk list Greg KH
2010-02-19 16:29 ` [15/93] drm/i915: Fix the incorrect DMI string for Samsung SX20S laptop Greg KH
2010-02-19 16:29 ` [16/93] drm/i915: Add MALATA PC-81005 to ACPI LID quirk list Greg KH
2010-02-19 16:29 ` [17/93] usb: r8a66597-hcd: Flush the D-cache for the pipe-in transfer buffers Greg KH
2010-02-19 16:29 ` [18/93] i2c-tiny-usb: Fix on big-endian systems Greg KH
2010-02-19 16:29 ` [19/93] drm/i915: handle FBC and self-refresh better Greg KH
2010-02-19 16:29 ` [20/93] drm/i915: Increase fb alignment to 64k Greg KH
2010-02-19 16:29 ` [21/93] drm/i915: Update write_domains on active list after flush Greg KH
2010-02-19 16:29 ` [22/93] regulator: Fix display of null constraints for regulators Greg KH
2010-02-19 16:29 ` [23/93] ALSA: hda-intel: Avoid divide by zero crash Greg KH
2010-02-19 16:29 ` [24/93] CPUFREQ: Fix use after free of struct powernow_k8_data Greg KH
2010-02-19 16:29 ` [25/93] freeze_bdev: dont deactivate successfully frozen MS_RDONLY sb Greg KH
2010-02-19 16:29 ` [26/93] cciss: Make cciss_seq_show handle holes in the h->drv[] array Greg KH
2010-02-19 16:29 ` [27/93] ioat: fix infinite timeout checking in ioat2_quiesce Greg KH
2010-02-19 16:29 ` [28/93] resource: add helpers for fetching rlimits Greg KH
2010-02-19 16:29 ` [29/93] fs/exec.c: restrict initial stack space expansion to rlimit Greg KH
2010-02-21  6:42   ` Michael Neuling
2010-02-23 15:34     ` [stable] " Greg KH
2010-02-23 20:42       ` Michael Neuling
2010-02-19 16:29 ` [30/93] cifs: fix length calculation for converted unicode readdir names Greg KH
2010-02-19 16:29 ` [31/93] NFS: Fix a reference leak in nfs_wb_cancel_page() Greg KH
2010-02-19 16:29 ` [32/93] NFS: Try to commit unstable writes in nfs_release_page() Greg KH
2010-02-19 16:29 ` [33/93] NFSv4: Dont allow posix locking against servers that dont support it Greg KH
2010-02-19 16:29 ` [34/93] NFSv4: Ensure that the NFSv4 locking can recover from stateid errors Greg KH
2010-02-19 16:29 ` [35/93] NFS: Fix an Oops when truncating a file Greg KH
2010-02-19 16:29 ` [36/93] NFS: Fix a umount race Greg KH
2010-02-19 16:29 ` [37/93] NFS: Fix a bug in nfs_fscache_release_page() Greg KH
2010-02-19 16:29 ` [38/93] NFS: Fix the mapping of the NFSERR_SERVERFAULT error Greg KH
2010-02-19 16:29 ` [39/93] md: fix degraded calculation when starting a reshape Greg KH
2010-02-19 16:29 ` [40/93] V4L/DVB: dvb-core: fix initialization of feeds list in demux filter Greg KH
2010-02-19 16:29 ` [41/93] Export the symbol of getboottime and mmonotonic_to_bootbased Greg KH
2010-02-19 16:29 ` [42/93] kvmclock: count total_sleep_time when updating guest clock Greg KH
2010-02-19 16:29 ` [43/93] KVM: PIT: control word is write-only Greg KH
2010-02-19 16:29 ` [44/93] tpm_infineon: fix suspend/resume handler for pnp_driver Greg KH
2010-02-19 16:29 ` [45/93] amd64_edac: Do not falsely trigger kerneloops Greg KH
2010-02-19 16:29 ` [46/93] netfilter: nf_conntrack: fix memory corruption with multiple namespaces Greg KH
2010-02-19 16:29 ` [47/93] netfilter: nf_conntrack: per netns nf_conntrack_cachep Greg KH
2010-02-19 16:29 ` [48/93] netfilter: nf_conntrack: restrict runtime expect hashsize modifications Greg KH
2010-02-19 16:29 ` [49/93] netfilter: xtables: compat out of scope fix Greg KH
2010-02-19 16:29 ` [50/93] netfilter: nf_conntrack: fix hash resizing with namespaces Greg KH
2010-02-19 16:29 ` [51/93] drm/i915: remove full registers dump debug Greg KH
2010-02-19 16:29 ` [52/93] drm/i915: add i915_lp_ring_sync helper Greg KH
2010-02-19 16:29 ` [53/93] drm/i915: Dont wait interruptible for possible plane buffer flush Greg KH
2010-02-19 16:29 ` [54/93] [S390] dasd: remove strings from s390dbf Greg KH
2010-02-19 16:29 ` [55/93] crypto: padlock-sha - Add import/export support Greg KH
2010-02-19 16:29 ` [56/93] wmi: Free the allocated acpi objects through wmi_get_event_data Greg KH
2010-02-19 16:29 ` [57/93] dell-wmi, hp-wmi, msi-wmi: check wmi_get_event_data() return value Greg KH
2010-02-19 16:29 ` [58/93] /dev/mem: introduce size_inside_page() Greg KH
2010-02-19 16:29 ` [59/93] devmem: check vmalloc address on kmem read/write Greg KH
2010-02-19 16:29 ` [60/93] devmem: fix kmem write bug on memory holes Greg KH
2010-02-19 16:29 ` [61/93] SCSI: mptfusion : mptscsih_abort return value should be SUCCESS instead of value 0 Greg KH
2010-02-19 16:29 ` [62/93] sh: Couple kernel and user write page perm bits for CONFIG_X2TLB Greg KH
2010-02-19 16:29 ` [63/93] ALSA: hda - use WARN_ON_ONCE() for zero-division detection Greg KH
2010-02-19 16:29 ` [64/93] dst: call cond_resched() in dst_gc_task() Greg KH
2010-02-19 16:29 ` [65/93] ALSA: hda - Improved MacBook (Pro) 5,1 / 5,2 support Greg KH
2010-02-19 16:29 ` [66/93] befs: fix leak Greg KH
2010-02-19 16:30 ` [67/93] rtc-fm3130: add missing braces Greg KH
2010-02-19 16:30 ` [68/93] [libata] Call flush_dcache_page after PIO data transfers in libata-sff.c Greg KH
2010-02-19 16:30 ` [69/93] ahci: add Acer G725 to broken suspend list Greg KH
2010-02-19 16:30 ` [70/93] pktgen: Fix freezing problem Greg KH
2010-02-19 16:30 ` [71/93] x86/amd-iommu: Fix IOMMU-API initialization for iommu=pt Greg KH
2010-02-19 16:30 ` [72/93] x86/amd-iommu: Fix deassignment of a device from the pt_domain Greg KH
2010-02-19 16:30 ` [73/93] x86: Re-get cfg_new in case reuse/move irq_desc Greg KH
2010-02-19 16:30 ` [74/93] Staging: fix rtl8187se compilation errors with mac80211 Greg KH
2010-02-19 16:30 ` [75/93] ALSA: usb-audio - Avoid Oops after disconnect Greg KH
2010-02-19 16:30 ` [76/93] serial: 8250: add serial transmitter fully empty test Greg KH
2010-02-19 16:30 ` [77/93] sysfs: sysfs_sd_setattr set iattrs unconditionally Greg KH
2010-02-19 16:30 ` [78/93] class: Free the class private data in class_release Greg KH
2010-02-19 16:30 ` [79/93] USB: usbfs: only copy the actual data received Greg KH
2010-02-19 16:30 ` [80/93] USB: usbfs: properly clean up the as structure on error paths Greg KH
2010-02-19 16:30 ` [81/93] rtl8187: Add new device ID Greg KH
2010-02-19 16:30 ` [82/93] ACPI: Add NULL pointer check in acpi_bus_start Greg KH
2010-02-19 16:30 ` [83/93] ACPI: fix High cpu temperature with 2.6.32 Greg KH
2010-02-19 16:30 ` [84/93] drm/radeon/kms: use udelay for short delays Greg KH
2010-02-19 16:30 ` [85/93] NFS: Too many GETATTR and ACCESS calls after direct I/O Greg KH
2010-02-19 16:30 ` [86/93] eCryptfs: Add getattr function Greg KH
2010-02-19 16:30 ` [87/93] b43: Fix throughput regression Greg KH
2010-02-19 16:30 ` [88/93] ath9k: Fix sequence numbers for PAE frames Greg KH
2010-02-19 16:30 ` [89/93] mac80211: Fix probe request filtering in IBSS mode Greg KH
2010-02-19 16:30 ` [90/93] iwlwifi: Fix to set correct ht configuration Greg KH
2010-02-19 16:30 ` [91/93] dm stripe: avoid divide by zero with invalid stripe count Greg KH
2010-02-19 16:30 ` [92/93] dm log: userspace fix overhead_size calcuations Greg KH
2010-02-19 16:30 ` Greg KH [this message]
2010-02-21 16:07   ` [Stable-review] [93/93] dm mpath: fix stall when requeueing io Stefan Bader
2010-02-22 10:16     ` Kiyoshi Ueda
2010-02-23 18:12       ` Alasdair G Kergon
2010-02-24  1:12         ` Kiyoshi Ueda
2010-02-24 22:30           ` Mikulas Patocka
     [not found]       ` <20100223175331.GE560@agk-dp.fab.redhat.com>
     [not found]         ` <Pine.LNX.4.64.1002231429210.29863@hs20-bc2-1.build.redhat.com>
     [not found]           ` <4B84E05C.4070300@ct.jp.nec.com>
     [not found]             ` <Pine.LNX.4.64.1002241732300.2537@hs20-bc2-1.build.redhat.com>
     [not found]               ` <4B862103.1040802@ct.jp.nec.com>
2010-02-26 20:57                 ` How to unload a module? (Was: [dm-devel] rqdm: bad usage of dm_get/dm_put) Mikulas Patocka
2010-02-23 15:33     ` [Stable-review] [93/93] dm mpath: fix stall when requeueing io Greg KH
2010-02-23 17:00       ` Linus Torvalds

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20100219163257.186977438@kvm.kroah.org \
    --to=gregkh@suse.de \
    --cc=agk@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=alan@lxorguk.ukuu.org.uk \
    --cc=j-nomura@ce.jp.nec.com \
    --cc=k-ueda@ct.jp.nec.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=stable-review@kernel.org \
    --cc=stable@kernel.org \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).