From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Steffen Maier <maier@linux.ibm.com>,
	Jens Remus <jremus@linux.ibm.com>,
	"Martin K. Petersen" <martin.petersen@oracle.com>
Subject: [PATCH 4.14 055/105] scsi: zfcp: fix posting too many status read buffers leading to adapter shutdown
Date: Fri, 11 Jan 2019 15:14:26 +0100
Message-ID: <20190111131107.640822727@linuxfoundation.org>
In-Reply-To: <20190111131102.899065735@linuxfoundation.org>

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Steffen Maier <maier@linux.ibm.com>

commit 60a161b7e5b2a252ff0d4c622266a7d8da1120ce upstream.

Suppose adapter (open) recovery has opened the QDIO queues but has not yet
finished the initial posting of status read buffers (SRBs). This time
window can be seconds long, because FSF_PROT_HOST_CONNECTION_INITIALIZING
causes, by design, looping with exponentially increasing sleeps in the
function performing exchange config data during recovery
[zfcp_erp_adapter_strat_fsf_xconf()]. The recovery was triggered by a
local link up.

Suppose an event occurs for which the FCP channel would send an unsolicited
notification to zfcp by means of a previously posted SRB.  We saw it with
local cable pull (link down) in multi-initiator zoning with multiple
NPIV-enabled subchannels of the same shared FCP channel.

As soon as zfcp_erp_adapter_strategy_open_fsf() starts posting the initial
status read buffers from within the adapter's ERP thread, the channel does
send an unsolicited notification.

Since v2.6.27 commit d26ab06ede83 ("[SCSI] zfcp: receiving an unsolicted
status can lead to I/O stall"), zfcp_fsf_status_read_handler() schedules
adapter->stat_work to re-fill the just consumed SRB from a work item.

Now the ERP thread and the work item post SRBs in parallel.  Both contexts
call the helper function zfcp_status_read_refill().  The tracking of
missing (to be posted / re-filled) SRBs is not thread-safe: atomic_read()
and atomic_dec() are separate steps so that the decrement can depend on
posting success. Hence, both contexts can see
atomic_read(&adapter->stat_miss) == 1, and one of the two contexts posts
one SRB too many. Zfcp gets QDIO_ERROR_SLSB_STATE on the output queue
(trace tag "qdireq1") leading to zfcp_erp_adapter_shutdown() in
zfcp_qdio_handler_error().
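
The interleaving described above can be replayed deterministically. The
following is a minimal user-space sketch, not driver code: struct srb_model
and replay_racy_refill() are hypothetical names, and plain ints stand in
for the kernel's atomic_t because both contexts are simulated sequentially
in a single thread.

```c
/* Hypothetical model of the pre-fix accounting; not zfcp code. */
struct srb_model {
	int stat_miss;	/* SRBs still to be posted */
	int posted;	/* SRBs actually handed to the channel */
};

/*
 * Replay the bad interleaving in one thread: both contexts evaluate the
 * loop condition before either of them decrements the counter.
 * Returns the number of SRBs posted although only one was missing.
 */
static int replay_racy_refill(void)
{
	struct srb_model m = { .stat_miss = 1, .posted = 0 };

	/* Both contexts do their atomic_read() first ... */
	int erp_sees_missing = m.stat_miss > 0;		/* ERP thread */
	int work_sees_missing = m.stat_miss > 0;	/* stat_work */

	/* ... and only afterwards post and decrement. */
	if (erp_sees_missing) {
		m.posted++;	/* posting succeeds */
		m.stat_miss--;	/* separate atomic_dec() */
	}
	if (work_sees_missing) {
		m.posted++;	/* second post for the same missing SRB */
		m.stat_miss--;
	}
	return m.posted;
}
```

With a single SRB missing, the replay posts two, which corresponds to the
one SRB too many described above.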

An obvious and seemingly clean fix would be to schedule stat_work from the
ERP thread and wait for it to finish. This would serialize all SRB
re-fills. However, we already have another work item that waits on the ERP
thread: adapter->scan_work runs zfcp_fc_scan_ports() which calls
zfcp_fc_eval_gpn_ft(). The latter calls zfcp_erp_wait() to wait for all the
open port recoveries during zfcp auto port scan, but in fact it waits for
any pending recovery including an adapter recovery. This approach would
lead to a deadlock.  [see also v3.19 commit 18f87a67e6d6 ("zfcp: auto port scan
resiliency"); v2.6.37 commit d3e1088d6873
("[SCSI] zfcp: No ERP escalation on gpn_ft eval");
v2.6.28 commit fca55b6fb587
("[SCSI] zfcp: fix deadlock between wq triggered port scan and ERP")
fixing v2.6.27 commit c57a39a45a76
("[SCSI] zfcp: wait until adapter is finished with ERP during auto-port");
v2.6.27 commit cc8c282963bd
("[SCSI] zfcp: Automatically attach remote ports")]

Instead make the accounting of missing SRBs atomic for parallel execution
in both the ERP thread and adapter->stat_work.
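
The claim-then-undo accounting can be sketched in user space as follows.
This is an illustration under stated assumptions, not the driver code: C11
atomics stand in for the kernel's atomic_t, dec_unless_zero() mimics what
atomic_add_unless(v, -1, 0) does, and refill(), post_ok() and post_fail()
are hypothetical helpers.

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * User-space stand-in for atomic_add_unless(v, -1, 0): decrement *v
 * unless it is 0. Returns true if the decrement happened.
 */
static bool dec_unless_zero(atomic_int *v)
{
	int old = atomic_load(v);

	while (old != 0) {
		/* On failure, old is updated to the current value. */
		if (atomic_compare_exchange_weak(v, &old, old - 1))
			return true;
	}
	return false;
}

/* Hypothetical post callbacks; 0 means success, as in the patch. */
static int post_ok(void *ctx)
{
	(*(int *)ctx)++;	/* count successful posts */
	return 0;
}

static int post_fail(void *ctx)
{
	(void)ctx;
	return 1;
}

/*
 * Hypothetical refill loop: claim one missing buffer atomically, then
 * post it; give the claim back if posting fails.
 * Returns the number of buffers posted by this call.
 */
static int refill(atomic_int *miss, int (*post)(void *), void *ctx)
{
	int posted = 0;

	while (dec_unless_zero(miss)) {
		if (post(ctx)) {
			atomic_fetch_add(miss, 1);	/* undo the claim */
			break;
		}
		posted++;
	}
	return posted;
}
```

Because each context claims a missing buffer before posting it, two
callers running refill() in parallel cannot both act on the same counter
value; the total number of posts is bounded by the initial count.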

Signed-off-by: Steffen Maier <maier@linux.ibm.com>
Fixes: d26ab06ede83 ("[SCSI] zfcp: receiving an unsolicted status can lead to I/O stall")
Cc: <stable@vger.kernel.org> #2.6.27+
Reviewed-by: Jens Remus <jremus@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/s390/scsi/zfcp_aux.c |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

--- a/drivers/s390/scsi/zfcp_aux.c
+++ b/drivers/s390/scsi/zfcp_aux.c
@@ -274,16 +274,16 @@ static void zfcp_free_low_mem_buffers(st
  */
 int zfcp_status_read_refill(struct zfcp_adapter *adapter)
 {
-	while (atomic_read(&adapter->stat_miss) > 0)
+	while (atomic_add_unless(&adapter->stat_miss, -1, 0))
 		if (zfcp_fsf_status_read(adapter->qdio)) {
+			atomic_inc(&adapter->stat_miss); /* undo add -1 */
 			if (atomic_read(&adapter->stat_miss) >=
 			    adapter->stat_read_buf_num) {
 				zfcp_erp_adapter_reopen(adapter, 0, "axsref1");
 				return 1;
 			}
 			break;
-		} else
-			atomic_dec(&adapter->stat_miss);
+		}
 	return 0;
 }
 


