All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Zhuang, Jin Can" <jin.can.zhuang@intel.com>
To: Bart Van Assche <bvanassche@acm.org>,
	linux-scsi <linux-scsi@vger.kernel.org>
Cc: James Bottomley <jbottomley@parallels.com>,
	Mike Christie <michaelc@cs.wisc.edu>,
	Jens Axboe <axboe@kernel.dk>, Tejun Heo <tj@kernel.org>,
	Chanho Min <chanho.min@lge.com>
Subject: RE: [PATCH 6/7] Fix race between starved list processing and device removal
Date: Sun, 28 Oct 2012 18:01:58 +0000	[thread overview]
Message-ID: <267107B7B5D6404FB174AB273F79D8BD1A4C39@SHSMSX101.ccr.corp.intel.com> (raw)
In-Reply-To: <508A7C6D.8070002@acm.org>

I recently ran into the same issue
The test I did is plug/unplug u-disk in an interval of 1 second. And I found when sdev1 is being removed, scsi_run_queue is triggered by sdev2, which then accesses all the starving scsi device including sdev1.

I have adopted the solution below which works fine for me so far.
But there's one thing to fix in the patch below. When it put_device in scsi_run_queue, irq is disabled. As put_device may get into sleep, irq should be enabled before it's called.
So I change it to:
		spin_unlock_irq(sdev->request_queue->queue_lock);

		put_device(&sdev->sdev_gendev);

 		spin_lock_irq(shost->host_lock);

-Jincan

-----Original Message-----
From: linux-scsi-owner@vger.kernel.org [mailto:linux-scsi-owner@vger.kernel.org] On Behalf Of Bart Van Assche
Sent: Friday, October 26, 2012 8:05 PM
To: linux-scsi
Cc: James Bottomley; Mike Christie; Jens Axboe; Tejun Heo; Chanho Min
Subject: [PATCH 6/7] Fix race between starved list processing and device removal

The SCSI core maintains a "starved list" per SCSI host. This is a list of devices for which one or more requests have been queued but that have not yet been passed to the SCSI LLD. The function
scsi_run_queue() examines all SCSI devices on the starved list.
Since scsi_remove_device() can be invoked concurrently with
scsi_run_queue() it is important to avoid that a SCSI device is accessed by that function after it has been freed. Avoid that the sdev reference count can drop to zero before the queue is run by
scsi_run_queue() by inserting a get_device() / put_device() pair in that function. Move the code for removing a device from the starved list from scsi_device_dev_release_usercontext() to
__scsi_remove_device() such that it is guaranteed that the newly added get_device() call succeeds.

Reported-and-tested-by: Chanho Min <chanho.min@lge.com>
Reference: http://lkml.org/lkml/2012/8/2/96
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Tejun Heo <tj@kernel.org>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 drivers/scsi/scsi_lib.c   |    5 +++++
 drivers/scsi/scsi_sysfs.c |    7 ++++++-
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c index f29a1a9..c5d4ec2 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -452,10 +452,15 @@ static void scsi_run_queue(struct request_queue *q)
 			continue;
 		}
 
+		get_device(&sdev->sdev_gendev);
 		spin_unlock(shost->host_lock);
+
 		spin_lock(sdev->request_queue->queue_lock);
 		__blk_run_queue(sdev->request_queue);
 		spin_unlock(sdev->request_queue->queue_lock);
+
+		put_device(&sdev->sdev_gendev);
+
 		spin_lock(shost->host_lock);
 	}
 	/* put any unprocessed entries back */ diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c index ce5224c..2661a957 100644
--- a/drivers/scsi/scsi_sysfs.c
+++ b/drivers/scsi/scsi_sysfs.c
@@ -348,7 +348,6 @@ static void scsi_device_dev_release_usercontext(struct work_struct *work)
 	starget->reap_ref++;
 	list_del(&sdev->siblings);
 	list_del(&sdev->same_target_siblings);
-	list_del(&sdev->starved_entry);
 	spin_unlock_irqrestore(sdev->host->host_lock, flags);
 
 	cancel_work_sync(&sdev->event_work);
@@ -956,6 +955,8 @@ int scsi_sysfs_add_sdev(struct scsi_device *sdev)  void __scsi_remove_device(struct scsi_device *sdev)  {
 	struct device *dev = &sdev->sdev_gendev;
+	struct Scsi_Host *shost = sdev->host;
+	unsigned long flags;
 
 	if (sdev->is_visible) {
 		if (scsi_device_set_state(sdev, SDEV_CANCEL) != 0) @@ -977,6 +978,10 @@ void __scsi_remove_device(struct scsi_device *sdev)
 	blk_cleanup_queue(sdev->request_queue);
 	cancel_work_sync(&sdev->requeue_work);
 
+	spin_lock_irqsave(shost->host_lock, flags);
+	list_del(&sdev->starved_entry);
+	spin_unlock_irqrestore(shost->host_lock, flags);
+
 	if (sdev->host->hostt->slave_destroy)
 		sdev->host->hostt->slave_destroy(sdev);
 	transport_destroy_device(dev);
--
1.7.10.4

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majordomo@vger.kernel.org More majordomo info at  http://vger.kernel.org/majordomo-info.html

  reply	other threads:[~2012-10-28 18:02 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-10-26 12:00 [PATCH 0/7 v5] More device removal fixes Bart Van Assche
2012-10-26 12:01 ` [PATCH 1/7] block: Avoid that blk_drain_queue() finishes early Bart Van Assche
2012-10-29  1:47   ` Tejun Heo
2012-10-29  1:52     ` Tejun Heo
2012-10-29 14:35       ` Bart Van Assche
2012-10-26 12:02 ` [PATCH 2/7] block: Let blk_drain_queue() caller obtain the queue lock Bart Van Assche
2012-10-29  1:55   ` Tejun Heo
2012-10-26 12:02 ` [PATCH 3/7] block: Rename queue dead flag Bart Van Assche
2012-10-26 12:03 ` [PATCH 4/7] block: Avoid that request_fn is invoked on a dead queue Bart Van Assche
2012-10-29  1:59   ` Tejun Heo
2012-10-26 12:04 ` [PATCH 5/7] block: Make blk_cleanup_queue() wait until request_fn finished Bart Van Assche
2012-10-29  2:00   ` Tejun Heo
2012-10-26 12:05 ` [PATCH 6/7] Fix race between starved list processing and device removal Bart Van Assche
2012-10-28 18:01   ` Zhuang, Jin Can [this message]
2012-10-29 14:32     ` Bart Van Assche
2012-10-30  5:40       ` Zhuang, Jin Can
2012-11-02 10:48         ` Bart Van Assche
2012-11-21 11:06           ` Bart Van Assche
     [not found]         ` <026701cdb8c3$d2e3cb50$78ab61f0$@min@lge.com>
2012-11-21 12:10           ` Bart Van Assche
2012-10-29  2:07   ` Tejun Heo
2012-10-26 12:05 ` [PATCH 7/7] Remove get_device() / put_device() pair from scsi_request_fn() Bart Van Assche
2012-10-29  2:08   ` Tejun Heo
2012-11-23 10:37 ` [PATCH 0/7 v5] More device removal fixes Bart Van Assche
2012-11-26 17:19   ` Bart Van Assche

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=267107B7B5D6404FB174AB273F79D8BD1A4C39@SHSMSX101.ccr.corp.intel.com \
    --to=jin.can.zhuang@intel.com \
    --cc=axboe@kernel.dk \
    --cc=bvanassche@acm.org \
    --cc=chanho.min@lge.com \
    --cc=jbottomley@parallels.com \
    --cc=linux-scsi@vger.kernel.org \
    --cc=michaelc@cs.wisc.edu \
    --cc=tj@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.