LKML Archive on lore.kernel.org
From: Dan Williams <djbw@fb.com>
To: John Drescher <drescherjm@gmail.com>
Cc: 王金浦 <jinpuwang@gmail.com>, LKML <linux-kernel@vger.kernel.org>,
	linux-scsi@vger.kernel.org, DL-MPTFusionLinux@lsi.com
Subject: Re: Possible mptsas regression post 3.5.0
Date: Mon, 27 Aug 2012 22:37:33 -0700
Message-ID: <1346132253.12384.6.camel@localhost.localdomain> (raw)
In-Reply-To: <CAEhu1-7zGeZ-A0hkJ0=PqJart8f1zGyapWo-DP0fORuNTpj0sg@mail.gmail.com>


[-- Attachment #1: Type: text/plain, Size: 862 bytes --]

On Mon, 2012-08-27 at 12:13 -0400, John Drescher wrote:
> >> I have bisected it down to the following patch:
> >>
> >> Bisecting: 0 revisions left to test after this (roughly 0 steps)
> >> [10f8d5b86743b33d841a175303e2bf67fd620f42] SCSI: fix hot unplug vs
> >> async scan race
> >>
> >> It appears this patch caused the bad behavior although I have not
> >> tested that yet. I am rebuilding the array (takes ~2 hours) from the
> >> previous good bisect.
> >>
> 
> Confirmed. This patch appears to cause the bug in my test setup.
> 
> [  339.406778] BUG: soft lockup - CPU#2 stuck for 23s! [kworker/u:8:2202]
[..]
> [  339.415268]  [<ffffffff8141782a>] scsi_remove_target+0xda/0x1f0

I wonder whether we are preventing scsi_device_dev_release_usercontext()
from making forward progress.

The attached patch should either confirm this or give more information.
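
For readers who want to see the detection idea in isolation, here is a rough user-space sketch of what the attached patch does on each pass through the restart loop: remember the first few targets picked for removal, and complain if the same one is picked again. It is only an illustration under stated assumptions. `struct target`, `note_target()` and `LOG_SLOTS` are hypothetical names that do not exist in the kernel, and the sketch prints on every repeat, whereas the patch uses pr_err_once().

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define LOG_SLOTS 3	/* mirrors the three-entry found_log[] in the patch */

/* Hypothetical stand-in for struct scsi_target; only the fields printed here. */
struct target {
	unsigned int channel;
	unsigned int id;
	int reap_ref;
};

/*
 * Record 'found' in the first free slot of 'log'.
 *
 * Returns the slot index if 'found' was already recorded, meaning a restart
 * picked the same target again and removal is probably not making progress.
 * Returns -1 if the target was newly recorded or the log is already full.
 */
int note_target(struct target *log[LOG_SLOTS], struct target *found)
{
	size_t i;

	assert(log && found);

	for (i = 0; i < LOG_SLOTS; i++) {
		if (!log[i]) {
			log[i] = found;		/* first time we see this target */
			return -1;
		}
		if (log[i] == found) {
			fprintf(stderr,
				"slot %zu: target %u:%u seen again, reap_ref=%d\n",
				i, found->channel, found->id, found->reap_ref);
			return (int)i;
		}
	}
	return -1;	/* log full and no match: nothing to report */
}
```

A caller would zero the log once before the loop and call note_target() each time the loop restarts with a candidate. A non-negative return value is the hint that the same target keeps coming back.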

--
Dan


[-- Attachment #2: dbg-scsi-remove-target.patch --]
[-- Type: text/x-patch, Size: 1551 bytes --]

scsi_remove_target: debug softlockup

From: Dan Williams <djbw@fb.com>

Dump more info in the case where we get stuck trying to remove a device.
---
 drivers/scsi/scsi_sysfs.c |   19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
index 093d4f6..011f8ee 100644
--- a/drivers/scsi/scsi_sysfs.c
+++ b/drivers/scsi/scsi_sysfs.c
@@ -1032,8 +1032,11 @@ void scsi_remove_target(struct device *dev)
 {
 	struct Scsi_Host *shost = dev_to_shost(dev->parent);
 	struct scsi_target *starget, *found;
+	struct scsi_target *found_log[3];
 	unsigned long flags;
 
+	memset(found_log, 0, sizeof(found_log));
+
  restart:
 	found = NULL;
 	spin_lock_irqsave(shost->host_lock, flags);
@@ -1041,8 +1044,24 @@ void scsi_remove_target(struct device *dev)
 		if (starget->state == STARGET_DEL)
 			continue;
 		if (starget->dev.parent == dev || &starget->dev == dev) {
+			int i;
+
 			found = starget;
 			found->reap_ref++;
+			for (i = 0; i < ARRAY_SIZE(found_log); i++)
+				if (!found_log[i]) {
+					found_log[i] = found;
+					break;
+				} else if (found_log[i] == found) {
+					struct scsi_device *sdev = NULL;
+
+					if (!list_empty(&found->devices))
+						sdev = list_entry(found->devices.next, typeof(*sdev), same_target_siblings);
+					pr_err_once("%s[%d]: reap %d:%d state: %d reap: %d dev_del: %d\n",
+						    __func__, i, found->channel, found->id,
+						    found->state, found->reap_ref,
+						    sdev ? work_busy(&sdev->ew.work) ? 2 : 1 : 0);
+				}
 			break;
 		}
 	}
