From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1758904AbcAUGfW (ORCPT <rfc822;w@1wt.eu>);
	Thu, 21 Jan 2016 01:35:22 -0500
Received: from mail-pa0-f41.google.com ([209.85.220.41]:36602 "EHLO
	mail-pa0-f41.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1758709AbcAUGfS (ORCPT
	<rfc822;linux-kernel@vger.kernel.org>);
	Thu, 21 Jan 2016 01:35:18 -0500
Subject: [PATCH 0/2] scsi: Fix endless loop of ATA hard resets due to VPD
 reads
From: Alexander Duyck <aduyck@mirantis.com>
To: jbottomley@odin.com, hare@suse.de, linux-scsi@vger.kernel.org
Cc: alexander.duyck@gmail.com, martin.petersen@oracle.com,
        linux-kernel@vger.kernel.org, shane.seymour@hpe.com,
        jthumshirn@suse.de
Date: Wed, 20 Jan 2016 22:35:15 -0800
Message-ID: <20160121063039.3803.66.stgit@localhost.localdomain>
User-Agent: StGit/0.17.1-dirty
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

Recent changes to the kernel pulled in during the merge window have
resulted in my system generating an endless loop of the following type of
errors:

[  318.965756] ata14: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[  318.968457] ata14.00: configured for UDMA/66
[  318.970656] ata14: EH complete
[  318.984366] ata14.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6
[  318.986854] ata14.00: irq_stat 0x40000001
[  318.989138] ata14.00: cmd a0/01:00:00:00:01/00:00:00:00:00/a0 tag 22 dma 16640 in
         Inquiry 12 01 00 00 ff 00res 00/00:00:00:00:00/00:00:00:00:00/00 Emask 0x3 (HSM violation)
[  318.995986] ata14: hard resetting link

I bisected the issue and found that the responsible patch was commit
09e2b0b14690 "scsi: rescan VPD attributes".  This commit contained
two separate problems.

First, the commit changed which devices we call scsi_attach_vpd() for.
As a result we were calling it for devices that didn't report a
scsi_level of 6 (SPC-3), so VPD accesses could result in errors.
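
The gate described above can be sketched as a small userspace model.
This is only an illustration, not the code from the patch: the
model_* names are hypothetical, and the numeric levels are assumed to
mirror the scsi_level constants in include/scsi/scsi.h.

```c
#include <stdbool.h>

/* Assumed to mirror the kernel's scsi_level values (hypothetical names). */
enum model_scsi_level {
	MODEL_SCSI_2     = 3,
	MODEL_SCSI_3     = 4,
	MODEL_SCSI_SPC_2 = 5,
	MODEL_SCSI_SPC_3 = 6,
};

/* Hypothetical stand-in for the one field of the device we care about. */
struct model_sdev {
	int scsi_level;
};

/* Only devices reporting a scsi_level of at least 6 get VPD reads. */
static bool model_supports_vpd(const struct model_sdev *sdev)
{
	return sdev->scsi_level >= MODEL_SCSI_SPC_3;
}
```

A caller would check model_supports_vpd() first and skip the VPD
attach step entirely when it returns false.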

Second, the commit as well as a follow-on patch for it contained a number
of RCU errors.  Specifically the code was structured such that we had
accesses outside of RCU locked regions, and repeated use of the RCU
protected pointer without using the proper accessors.  As such it was
possible to get into a serious corruption situation should a pointer be
updated.
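
The read-side discipline at stake can be sketched in userspace.  This
is a model only: the model_* macros are stand-ins built on C11
atomics, not the kernel's rcu_read_lock()/rcu_dereference(), and the
structure and field names are hypothetical.  The point it shows is
that the pointer is fetched once, inside the read-side section, and
only that local copy is used afterwards.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

/* Userspace stand-ins, NOT the kernel primitives (assumption). */
#define model_rcu_read_lock()    ((void)0)
#define model_rcu_read_unlock()  ((void)0)
#define model_rcu_dereference(p) \
	atomic_load_explicit(&(p), memory_order_acquire)

/* Hypothetical cached VPD page. */
struct model_vpd {
	size_t len;
	unsigned char data[8];
};

/* Hypothetical device holding a pointer that a writer may replace. */
struct model_sdev {
	_Atomic(struct model_vpd *) vpd_page;
};

/* Copy the cached page into buf; returns the number of bytes copied. */
static size_t model_copy_vpd(struct model_sdev *sdev,
			     unsigned char *buf, size_t buflen)
{
	struct model_vpd *vpd;
	size_t copied = 0;

	model_rcu_read_lock();
	/* Fetch the pointer exactly once and use only the local copy. */
	vpd = model_rcu_dereference(sdev->vpd_page);
	if (vpd) {
		copied = vpd->len < buflen ? vpd->len : buflen;
		memcpy(buf, vpd->data, copied);
	}
	model_rcu_read_unlock();

	return copied;
}
```

Re-reading sdev->vpd_page a second time, or touching vpd after the
unlock, is the kind of access the paragraph above describes as
unsafe.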

Ultimately, neither of these bugs was my root cause.  It turns out the
Marvell Console SCSI device in my system needed to have a flag set to
disable VPD access in order to keep things from looping through the error
repeatedly.  To resolve it I had to add the kernel parameter
"scsi_mod.dev_flags=Marvell:Console:0x4000000".  This allowed my system to
boot without any errors.  However, the first two issues described above are
still relevant, so I thought I would provide the patches since I had already
written them up.
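
For readers unfamiliar with the parameter, its value is a
vendor:model:flags triple.  The following userspace sketch splits such
a string and tests the bit used above.  It is not the kernel's parser:
the model_* names are hypothetical, the sketch accepts hexadecimal
flags only, and the 8/16 character limits are assumed from the SCSI
INQUIRY vendor and product field widths.

```c
#include <stdio.h>
#include <string.h>

/* The bit passed on the command line above (hypothetical name). */
#define MODEL_SKIP_VPD_PAGES 0x4000000UL

/* One parsed vendor:model:flags entry. */
struct model_dev_flag {
	char vendor[9];		/* up to 8 characters plus NUL */
	char model[17];		/* up to 16 characters plus NUL */
	unsigned long flags;
};

/* Returns 1 when all three fields were parsed, 0 otherwise. */
static int model_parse_dev_flag(const char *s, struct model_dev_flag *out)
{
	memset(out, 0, sizeof(*out));
	return sscanf(s, "%8[^:]:%16[^:]:%lx",
		      out->vendor, out->model, &out->flags) == 3;
}
```

Parsing "Marvell:Console:0x4000000" with this sketch yields vendor
"Marvell", model "Console", and a flags value with
MODEL_SKIP_VPD_PAGES set.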

---

Alexander Duyck (2):
      scsi: Do not attach VPD to devices that don't support it
      scsi: Fix RCU handling for VPD pages


 drivers/scsi/scsi.c        |   55 ++++++++++++++++++++++++--------------------
 drivers/scsi/scsi_lib.c    |   12 +++++-----
 drivers/scsi/scsi_scan.c   |    3 +-
 drivers/scsi/scsi_sysfs.c  |   14 ++++++-----
 include/scsi/scsi_device.h |   14 +++++++----
 5 files changed, 54 insertions(+), 44 deletions(-)

--