From: Ben Hutchings <ben@decadent.org.uk>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: akpm@linux-foundation.org, Denis Kirjanov <kda@linux-powerpc.org>,
	"Jason Yan" <yanaijie@huawei.com>,
	"John Garry" <john.garry@huawei.com>,
	"Gao Chuan" <gaochuan4@huawei.com>,
	"Martin K. Petersen" <martin.petersen@oracle.com>
Subject: [PATCH 3.16 63/63] scsi: libsas: stop discovering if oob mode is disconnected
Date: Wed, 08 Jan 2020 19:44:01 +0000
Message-ID: <lsq.1578512578.643033814@decadent.org.uk>
In-Reply-To: <lsq.1578512578.117275639@decadent.org.uk>

3.16.81-rc1 review patch.  If anyone has any objections, please let me know.

------------------

From: Jason Yan <yanaijie@huawei.com>

commit f70267f379b5e5e11bdc5d72a56bf17e5feed01f upstream.

Discovery of a SAS port is driven by a workqueue in libsas. While libsas
is processing port events or phy events in the workqueue, new events may
arrive and change the state of structures such as asd_sas_phy.  This can
cause problems such as the following:

==>thread 1                       ==>thread 2

                                  ==>phy up
                                  ==>phy_up_v3_hw()
                                    ==>oob_mode = SATA_OOB_MODE;
                                  ==>phy down quickly
                                  ==>hisi_sas_phy_down()
                                    ==>sas_ha->notify_phy_event()
                                    ==>sas_phy_disconnected()
                                      ==>oob_mode = OOB_NOT_CONNECTED
==>workqueue wakeup
==>sas_form_port()
  ==>sas_discover_domain()
    ==>sas_get_port_device()
      ==>oob_mode is OOB_NOT_CONNECTED and device
         is wrongly taken as expander

This eventually leads to a panic when libsas tries to issue a command to
discover the device.

[183047.614035] Unable to handle kernel NULL pointer dereference at
virtual address 0000000000000058
[183047.622896] Mem abort info:
[183047.625762]   ESR = 0x96000004
[183047.628893]   Exception class = DABT (current EL), IL = 32 bits
[183047.634888]   SET = 0, FnV = 0
[183047.638015]   EA = 0, S1PTW = 0
[183047.641232] Data abort info:
[183047.644189]   ISV = 0, ISS = 0x00000004
[183047.648100]   CM = 0, WnR = 0
[183047.651145] user pgtable: 4k pages, 48-bit VAs, pgdp =
00000000b7df67be
[183047.657834] [0000000000000058] pgd=0000000000000000
[183047.662789] Internal error: Oops: 96000004 [#1] SMP
[183047.667740] Process kworker/u16:2 (pid: 31291, stack limit =
0x00000000417c4974)
[183047.675208] CPU: 0 PID: 3291 Comm: kworker/u16:2 Tainted: G
W  OE 4.19.36-vhulk1907.1.0.h410.eulerosv2r8.aarch64 #1
[183047.687015] Hardware name: N/A N/A/Kunpeng Desktop Board D920S10,
BIOS 0.15 10/22/2019
[183047.695007] Workqueue: 0000:74:02.0_disco_q sas_discover_domain
[183047.700999] pstate: 20c00009 (nzCv daif +PAN +UAO)
[183047.705864] pc : prep_ata_v3_hw+0xf8/0x230 [hisi_sas_v3_hw]
[183047.711510] lr : prep_ata_v3_hw+0xb0/0x230 [hisi_sas_v3_hw]
[183047.717153] sp : ffff00000f28ba60
[183047.720541] x29: ffff00000f28ba60 x28: ffff8026852d7228
[183047.725925] x27: ffff8027dba3e0a8 x26: ffff8027c05fc200
[183047.731310] x25: 0000000000000000 x24: ffff8026bafa8dc0
[183047.736695] x23: ffff8027c05fc218 x22: ffff8026852d7228
[183047.742079] x21: ffff80007c2f2940 x20: ffff8027c05fc200
[183047.747464] x19: 0000000000f80800 x18: 0000000000000010
[183047.752848] x17: 0000000000000000 x16: 0000000000000000
[183047.758232] x15: ffff000089a5a4ff x14: 0000000000000005
[183047.763617] x13: ffff000009a5a50e x12: ffff8026bafa1e20
[183047.769001] x11: ffff0000087453b8 x10: ffff00000f28b870
[183047.774385] x9 : 0000000000000000 x8 : ffff80007e58f9b0
[183047.779770] x7 : 0000000000000000 x6 : 000000000000003f
[183047.785154] x5 : 0000000000000040 x4 : ffffffffffffffe0
[183047.790538] x3 : 00000000000000f8 x2 : 0000000002000007
[183047.795922] x1 : 0000000000000008 x0 : 0000000000000000
[183047.801307] Call trace:
[183047.803827]  prep_ata_v3_hw+0xf8/0x230 [hisi_sas_v3_hw]
[183047.809127]  hisi_sas_task_prep+0x750/0x888 [hisi_sas_main]
[183047.814773]  hisi_sas_task_exec.isra.7+0x88/0x1f0 [hisi_sas_main]
[183047.820939]  hisi_sas_queue_command+0x28/0x38 [hisi_sas_main]
[183047.826757]  smp_execute_task_sg+0xec/0x218
[183047.831013]  smp_execute_task+0x74/0xa0
[183047.834921]  sas_discover_expander.part.7+0x9c/0x5f8
[183047.839959]  sas_discover_root_expander+0x90/0x160
[183047.844822]  sas_discover_domain+0x1b8/0x1e8
[183047.849164]  process_one_work+0x1b4/0x3f8
[183047.853246]  worker_thread+0x54/0x470
[183047.856981]  kthread+0x134/0x138
[183047.860283]  ret_from_fork+0x10/0x18
[183047.863931] Code: f9407a80 528000e2 39409281 72a04002 (b9405800)
[183047.870097] kernel fault(0x1) notification starting on CPU 0
[183047.875828] kernel fault(0x1) notification finished on CPU 0
[183047.881559] Modules linked in: unibsp(OE) hns3(OE) hclge(OE)
hnae3(OE) mem_drv(OE) hisi_sas_v3_hw(OE) hisi_sas_main(OE)
[183047.892418] ---[ end trace 4cc26083fc11b783  ]---
[183047.897107] Kernel panic - not syncing: Fatal exception
[183047.902403] kernel fault(0x5) notification starting on CPU 0
[183047.908134] kernel fault(0x5) notification finished on CPU 0
[183047.913865] SMP: stopping secondary CPUs
[183047.917861] Kernel Offset: disabled
[183047.921422] CPU features: 0x2,a2a00a38
[183047.925243] Memory Limit: none
[183047.928372] kernel reboot(0x2) notification starting on CPU 0
[183047.934190] kernel reboot(0x2) notification finished on CPU 0
[183047.940008] ---[ end Kernel panic - not syncing: Fatal exception
]---
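
The interleaving described above can be replayed in a small user-space
model. This is only a sketch, not the libsas code: the model_* types,
classify_before_fix(), classify_after_fix() and replay_race() are
hypothetical names introduced for illustration, and the port is reduced
to the two fields the race touches. It assumes, as the description
states, that the received frame decodes to an expander when read as an
identify frame.

```c
#include <assert.h>
#include <errno.h>

/* Stand-ins for the libsas enums; values are illustrative only. */
enum model_oob_mode { OOB_NOT_CONNECTED, SATA_OOB_MODE, SAS_OOB_MODE };
enum model_dev_type { MODEL_NO_DEVICE, MODEL_SATA_DEV, MODEL_END_DEVICE,
		      MODEL_EXPANDER };

/* Hypothetical reduction of a port to the two fields the race touches. */
struct model_port {
	enum model_oob_mode oob_mode;		/* rewritten by the phy-down path */
	enum model_dev_type frame_dev_type;	/* frame read as an identify frame */
};

/* Pre-patch shape: every mode except SATA falls through to the frame. */
int classify_before_fix(const struct model_port *port,
			enum model_dev_type *out)
{
	if (port->oob_mode == SATA_OOB_MODE)
		*out = MODEL_SATA_DEV;
	else
		*out = port->frame_dev_type;
	return 0;
}

/* Post-patch shape: only SAS_OOB_MODE trusts the frame. */
int classify_after_fix(const struct model_port *port,
		       enum model_dev_type *out)
{
	if (port->oob_mode == SATA_OOB_MODE)
		*out = MODEL_SATA_DEV;
	else if (port->oob_mode == SAS_OOB_MODE)
		*out = port->frame_dev_type;
	else
		return -ENODEV;	/* disconnected: stop discovering */
	return 0;
}

/* Replays the diagram: phy up as SATA, phy down, then discovery runs. */
void replay_race(void)
{
	struct model_port port = { SATA_OOB_MODE, MODEL_EXPANDER };
	enum model_dev_type type = MODEL_NO_DEVICE;

	port.oob_mode = OOB_NOT_CONNECTED;	/* thread 2: phy down */

	/* thread 1, old check: device is wrongly taken as an expander */
	assert(classify_before_fix(&port, &type) == 0);
	assert(type == MODEL_EXPANDER);

	/* thread 1, new check: discovery is refused instead */
	type = MODEL_NO_DEVICE;
	assert(classify_after_fix(&port, &type) == -ENODEV);
	assert(type == MODEL_NO_DEVICE);
}
```

Calling replay_race() exercises both variants on the same port state.
The model has no locking or threads; it only shows why a two-way branch
on oob_mode misclassifies a port whose mode was reset between phy up
and the discovery work running.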

Fixes: 2908d778ab3e ("[SCSI] aic94xx: new driver")
Link: https://lore.kernel.org/r/20191206011118.46909-1-yanaijie@huawei.com
Reported-by: Gao Chuan <gaochuan4@huawei.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
---
 drivers/scsi/libsas/sas_discover.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

--- a/drivers/scsi/libsas/sas_discover.c
+++ b/drivers/scsi/libsas/sas_discover.c
@@ -97,12 +97,21 @@ static int sas_get_port_device(struct as
 		else
 			dev->dev_type = SAS_SATA_DEV;
 		dev->tproto = SAS_PROTOCOL_SATA;
-	} else {
+	} else if (port->oob_mode == SAS_OOB_MODE) {
 		struct sas_identify_frame *id =
 			(struct sas_identify_frame *) dev->frame_rcvd;
 		dev->dev_type = id->dev_type;
 		dev->iproto = id->initiator_bits;
 		dev->tproto = id->target_bits;
+	} else {
+		/* If the oob mode is OOB_NOT_CONNECTED, the port is
+		 * disconnected due to race with PHY down. We cannot
+		 * continue to discover this port
+		 */
+		sas_put_device(dev);
+		pr_warn("Port %016llx is disconnected when discovering\n",
+			SAS_ADDR(port->attached_sas_addr));
+		return -ENODEV;
 	}
 
 	sas_init_dev(dev);

