stable.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org,
	Mike Marciniszyn <mike.marciniszyn@intel.com>,
	Kaike Wan <kaike.wan@intel.com>,
	Dennis Dalessandro <dennis.dalessandro@intel.com>,
	Jason Gunthorpe <jgg@mellanox.com>
Subject: [PATCH 5.4 31/78] IB/hfi1: Adjust flow PSN with the correct resync_psn
Date: Tue, 14 Jan 2020 11:01:05 +0100	[thread overview]
Message-ID: <20200114094357.850365222@linuxfoundation.org> (raw)
In-Reply-To: <20200114094352.428808181@linuxfoundation.org>

From: Kaike Wan <kaike.wan@intel.com>

commit b2ff0d510182eb5cc05a65d1b2371af62c4b170c upstream.

When a TID RDMA ACK to RESYNC request is received, the flow PSNs for
pending TID RDMA WRITE segments will be adjusted with the next flow
generation number, based on the resync_psn value extracted from the flow
PSN of the TID RDMA ACK packet. The resync_psn value indicates the last
flow PSN for which a TID RDMA WRITE DATA packet has been received by the
responder and the requester should resend TID RDMA WRITE DATA packets,
starting from the next flow PSN.

However, if resync_psn points to the last flow PSN for a segment and the
next segment flow PSN starts with a new generation number, use of the old
resync_psn to adjust the flow PSN for the next segment will lead to
miscalculation, resulting in WARN_ON and sge rewinding errors:

  WARNING: CPU: 4 PID: 146961 at /nfs/site/home/phcvs2/gitrepo/ifs-all/components/Drivers/tmp/rpmbuild/BUILD/ifs-kernel-updates-3.10.0_957.el7.x86_64/hfi1/tid_rdma.c:4764 hfi1_rc_rcv_tid_rdma_ack+0x8f6/0xa90 [hfi1]
  Modules linked in: ib_ipoib(OE) hfi1(OE) rdmavt(OE) rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfsv3 nfs_acl nfs lockd grace fscache iTCO_wdt iTCO_vendor_support skx_edac intel_powerclamp coretemp intel_rapl iosf_mbi kvm irqbypass crc32_pclmul ghash_clmulni_intel ib_isert iscsi_target_mod target_core_mod aesni_intel lrw gf128mul glue_helper ablk_helper cryptd rpcrdma sunrpc opa_vnic ast ttm ib_iser libiscsi drm_kms_helper scsi_transport_iscsi ipmi_ssif syscopyarea sysfillrect sysimgblt fb_sys_fops drm joydev ipmi_si pcspkr sg drm_panel_orientation_quirks ipmi_devintf lpc_ich i2c_i801 ipmi_msghandler wmi rdma_ucm ib_ucm ib_uverbs acpi_cpufreq acpi_power_meter ib_umad rdma_cm ib_cm iw_cm ip_tables ext4 mbcache jbd2 sd_mod crc_t10dif crct10dif_generic crct10dif_pclmul i2c_algo_bit crct10dif_common
   crc32c_intel e1000e ib_core ahci libahci ptp libata pps_core nfit libnvdimm [last unloaded: rdmavt]
  CPU: 4 PID: 146961 Comm: kworker/4:0H Kdump: loaded Tainted: G        W  OE  ------------   3.10.0-957.el7.x86_64 #1
  Hardware name: Intel Corporation S2600WFT/S2600WFT, BIOS SE5C620.86B.0X.02.0117.040420182310 04/04/2018
  Workqueue: hfi0_0 _hfi1_do_tid_send [hfi1]
  Call Trace:
   <IRQ>  [<ffffffff9e361dc1>] dump_stack+0x19/0x1b
   [<ffffffff9dc97648>] __warn+0xd8/0x100
   [<ffffffff9dc9778d>] warn_slowpath_null+0x1d/0x20
   [<ffffffffc05d28c6>] hfi1_rc_rcv_tid_rdma_ack+0x8f6/0xa90 [hfi1]
   [<ffffffffc05c21cc>] hfi1_kdeth_eager_rcv+0x1dc/0x210 [hfi1]
   [<ffffffffc05c23ef>] ? hfi1_kdeth_expected_rcv+0x1ef/0x210 [hfi1]
   [<ffffffffc0574f15>] kdeth_process_eager+0x35/0x90 [hfi1]
   [<ffffffffc0575b5a>] handle_receive_interrupt_nodma_rtail+0x17a/0x2b0 [hfi1]
   [<ffffffffc056a623>] receive_context_interrupt+0x23/0x40 [hfi1]
   [<ffffffff9dd4a294>] __handle_irq_event_percpu+0x44/0x1c0
   [<ffffffff9dd4a442>] handle_irq_event_percpu+0x32/0x80
   [<ffffffff9dd4a4cc>] handle_irq_event+0x3c/0x60
   [<ffffffff9dd4d27f>] handle_edge_irq+0x7f/0x150
   [<ffffffff9dc2e554>] handle_irq+0xe4/0x1a0
   [<ffffffff9e3795dd>] do_IRQ+0x4d/0xf0
   [<ffffffff9e36b362>] common_interrupt+0x162/0x162
   <EOI>  [<ffffffff9dfa0f79>] ? swiotlb_map_page+0x49/0x150
   [<ffffffffc05c2ed1>] hfi1_verbs_send_dma+0x291/0xb70 [hfi1]
   [<ffffffffc05c2c40>] ? hfi1_wait_kmem+0xf0/0xf0 [hfi1]
   [<ffffffffc05c3f26>] hfi1_verbs_send+0x126/0x2b0 [hfi1]
   [<ffffffffc05ce683>] _hfi1_do_tid_send+0x1d3/0x320 [hfi1]
   [<ffffffff9dcb9d4f>] process_one_work+0x17f/0x440
   [<ffffffff9dcbade6>] worker_thread+0x126/0x3c0
   [<ffffffff9dcbacc0>] ? manage_workers.isra.25+0x2a0/0x2a0
   [<ffffffff9dcc1c31>] kthread+0xd1/0xe0
   [<ffffffff9dcc1b60>] ? insert_kthread_work+0x40/0x40
   [<ffffffff9e374c1d>] ret_from_fork_nospec_begin+0x7/0x21
   [<ffffffff9dcc1b60>] ? insert_kthread_work+0x40/0x40

This patch fixes the issue by adjusting the resync_psn first if the flow
generation has been advanced for a pending segment.

Fixes: 9e93e967f7b4 ("IB/hfi1: Add a function to receive TID RDMA ACK packet")
Link: https://lore.kernel.org/r/20191219231920.51069.37147.stgit@awfm-01.aw.intel.com
Cc: <stable@vger.kernel.org>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/infiniband/hw/hfi1/tid_rdma.c |    9 +++++++++
 1 file changed, 9 insertions(+)

--- a/drivers/infiniband/hw/hfi1/tid_rdma.c
+++ b/drivers/infiniband/hw/hfi1/tid_rdma.c
@@ -4633,6 +4633,15 @@ void hfi1_rc_rcv_tid_rdma_ack(struct hfi
 			 */
 			fpsn = full_flow_psn(flow, flow->flow_state.spsn);
 			req->r_ack_psn = psn;
+			/*
+			 * If resync_psn points to the last flow PSN for a
+			 * segment and the new segment (likely from a new
+			 * request) starts with a new generation number, we
+			 * need to adjust resync_psn accordingly.
+			 */
+			if (flow->flow_state.generation !=
+			    (resync_psn >> HFI1_KDETH_BTH_SEQ_SHIFT))
+				resync_psn = mask_psn(fpsn - 1);
 			flow->resync_npkts +=
 				delta_psn(mask_psn(resync_psn + 1), fpsn);
 			/*



  parent reply	other threads:[~2020-01-14 10:20 UTC|newest]

Thread overview: 106+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-01-14 10:00 [PATCH 5.4 00/78] 5.4.12-stable review Greg Kroah-Hartman
2020-01-14 10:00 ` [PATCH 5.4 01/78] chardev: Avoid potential use-after-free in chrdev_open() Greg Kroah-Hartman
2020-01-14 10:00 ` [PATCH 5.4 02/78] i2c: fix bus recovery stop mode timing Greg Kroah-Hartman
2020-01-14 10:00 ` [PATCH 5.4 03/78] powercap: intel_rapl: add NULL pointer check to rapl_mmio_cpu_online() Greg Kroah-Hartman
2020-01-14 10:00 ` [PATCH 5.4 04/78] usb: chipidea: host: Disable port power only if previously enabled Greg Kroah-Hartman
2020-01-14 10:00 ` [PATCH 5.4 05/78] ALSA: usb-audio: Apply the sample rate quirk for Bose Companion 5 Greg Kroah-Hartman
2020-01-14 10:00 ` [PATCH 5.4 06/78] ALSA: hda/realtek - Add new codec supported for ALCS1200A Greg Kroah-Hartman
2020-01-14 10:00 ` [PATCH 5.4 07/78] ALSA: hda/realtek - Set EAPD control to default for ALC222 Greg Kroah-Hartman
2020-01-14 10:00 ` [PATCH 5.4 08/78] ALSA: hda/realtek - Add quirk for the bass speaker on Lenovo Yoga X1 7th gen Greg Kroah-Hartman
2020-01-14 10:00 ` [PATCH 5.4 09/78] tpm: Revert "tpm_tis: reserve chip for duration of tpm_tis_core_init" Greg Kroah-Hartman
2020-01-14 10:00 ` [PATCH 5.4 10/78] tpm: Revert "tpm_tis_core: Set TPM_CHIP_FLAG_IRQ before probing for interrupts" Greg Kroah-Hartman
2020-01-14 10:00 ` [PATCH 5.4 11/78] tpm: Revert "tpm_tis_core: Turn on the TPM before probing IRQs" Greg Kroah-Hartman
2020-01-14 10:00 ` [PATCH 5.4 12/78] tpm: Handle negative priv->response_len in tpm_common_read() Greg Kroah-Hartman
2020-01-14 10:00 ` [PATCH 5.4 13/78] rtc: sun6i: Add support for RTC clocks on R40 Greg Kroah-Hartman
2020-01-14 10:00 ` [PATCH 5.4 14/78] kernel/trace: Fix do not unregister tracepoints when register sched_migrate_task fail Greg Kroah-Hartman
2020-01-14 10:00 ` [PATCH 5.4 15/78] tracing: Have stack tracer compile when MCOUNT_INSN_SIZE is not defined Greg Kroah-Hartman
2020-01-14 10:00 ` [PATCH 5.4 16/78] tracing: Change offset type to s32 in preempt/irq tracepoints Greg Kroah-Hartman
2020-01-14 10:00 ` [PATCH 5.4 17/78] HID: Fix slab-out-of-bounds read in hid_field_extract Greg Kroah-Hartman
2020-02-05  7:12   ` [PATCH 5.4 17/78] HID: Fix slab-out-of-bounds read in hid_field_extract (Broken!) peter enderborg
2020-02-05  9:32     ` Greg Kroah-Hartman
2020-02-05  9:49       ` Enderborg, Peter
2020-02-05  9:54         ` Jiri Kosina
2020-02-05 15:00           ` Alan Stern
2020-02-06  7:00             ` Enderborg, Peter
2020-02-06 15:14               ` Alan Stern
2020-02-07  8:11                 ` Enderborg, Peter
2020-02-07 15:22                   ` Alan Stern
2020-02-10 12:08                     ` [PATCH] HID: Extend report buffer size Peter Enderborg
2020-02-10 12:21                       ` Greg Kroah-Hartman
2020-02-10 12:40                         ` Peter Enderborg
2020-02-10 13:43                           ` Greg Kroah-Hartman
2020-02-10 15:01                       ` Alan Stern
2020-02-11  8:35                         ` peter enderborg
2020-02-11 14:54                           ` Alan Stern
2020-02-11 15:01                             ` Jiri Kosina
2020-01-14 10:00 ` [PATCH 5.4 18/78] HID: uhid: Fix returning EPOLLOUT from uhid_char_poll Greg Kroah-Hartman
2020-01-14 10:00 ` [PATCH 5.4 19/78] HID: hidraw: Fix returning EPOLLOUT from hidraw_poll Greg Kroah-Hartman
2020-01-14 10:00 ` [PATCH 5.4 20/78] HID: hid-input: clear unmapped usages Greg Kroah-Hartman
2020-01-14 10:00 ` [PATCH 5.4 21/78] Input: add safety guards to input_set_keycode() Greg Kroah-Hartman
2020-01-14 10:00 ` [PATCH 5.4 22/78] Input: input_event - fix struct padding on sparc64 Greg Kroah-Hartman
2020-01-14 10:00 ` [PATCH 5.4 23/78] drm/i915: Add Wa_1408615072 and Wa_1407596294 to icl,ehl Greg Kroah-Hartman
2020-01-14 10:00 ` [PATCH 5.4 24/78] drm/amdgpu: add DRIVER_SYNCOBJ_TIMELINE to amdgpu Greg Kroah-Hartman
2020-01-14 14:31   ` Deucher, Alexander
2020-01-14 14:39     ` Greg Kroah-Hartman
2020-01-14 10:00 ` [PATCH 5.4 25/78] Revert "drm/amdgpu: Set no-retry as default." Greg Kroah-Hartman
2020-01-14 10:01 ` [PATCH 5.4 26/78] drm/sun4i: tcon: Set RGB DCLK min. divider based on hardware model Greg Kroah-Hartman
2020-01-14 10:01 ` [PATCH 5.4 27/78] drm/fb-helper: Round up bits_per_pixel if possible Greg Kroah-Hartman
2020-01-14 10:01 ` [PATCH 5.4 28/78] drm/dp_mst: correct the shifting in DP_REMOTE_I2C_READ Greg Kroah-Hartman
2020-01-14 10:01 ` [PATCH 5.4 29/78] drm/i915: Add Wa_1407352427:icl,ehl Greg Kroah-Hartman
2020-01-14 10:01 ` [PATCH 5.4 30/78] drm/i915/gt: Mark up virtual engine uabi_instance Greg Kroah-Hartman
2020-01-14 10:01 ` Greg Kroah-Hartman [this message]
2020-01-14 10:01 ` [PATCH 5.4 32/78] can: kvaser_usb: fix interface sanity check Greg Kroah-Hartman
2020-01-14 10:01 ` [PATCH 5.4 33/78] can: gs_usb: gs_usb_probe(): use descriptors of current altsetting Greg Kroah-Hartman
2020-01-14 10:01 ` [PATCH 5.4 34/78] can: tcan4x5x: tcan4x5x_can_probe(): get the device out of standby before register access Greg Kroah-Hartman
2020-01-14 10:01 ` [PATCH 5.4 35/78] can: mscan: mscan_rx_poll(): fix rx path lockup when returning from polling to irq mode Greg Kroah-Hartman
2020-01-14 10:01 ` [PATCH 5.4 36/78] can: can_dropped_invalid_skb(): ensure an initialized headroom in outgoing CAN sk_buffs Greg Kroah-Hartman
2020-01-14 10:01 ` [PATCH 5.4 37/78] gpiolib: acpi: Turn dmi_system_id table into a generic quirk table Greg Kroah-Hartman
2020-01-14 10:01 ` [PATCH 5.4 38/78] gpiolib: acpi: Add honor_wakeup module-option + quirk mechanism Greg Kroah-Hartman
2020-01-14 10:01 ` [PATCH 5.4 39/78] pstore/ram: Regularize prz label allocation lifetime Greg Kroah-Hartman
2020-01-14 10:01 ` [PATCH 5.4 40/78] staging: vt6656: set usb_set_intfdata on driver fail Greg Kroah-Hartman
2020-01-14 10:01 ` [PATCH 5.4 41/78] staging: vt6656: Fix non zero logical return of, usb_control_msg Greg Kroah-Hartman
2020-01-14 10:01 ` [PATCH 5.4 42/78] usb: cdns3: should not use the same dev_id for shared interrupt handler Greg Kroah-Hartman
2020-01-14 10:01 ` [PATCH 5.4 43/78] usb: ohci-da8xx: ensure error return on variable error is set Greg Kroah-Hartman
2020-01-14 10:01 ` [PATCH 5.4 44/78] USB-PD tcpm: bad warning+size, PPS adapters Greg Kroah-Hartman
2020-01-14 10:01 ` [PATCH 5.4 45/78] USB: serial: option: add ZLP support for 0x1bc7/0x9010 Greg Kroah-Hartman
2020-01-14 10:01 ` [PATCH 5.4 46/78] usb: musb: fix idling for suspend after disconnect interrupt Greg Kroah-Hartman
2020-01-14 10:01 ` [PATCH 5.4 47/78] usb: musb: Disable pullup at init Greg Kroah-Hartman
2020-01-14 10:01 ` [PATCH 5.4 48/78] usb: musb: dma: Correct parameter passed to IRQ handler Greg Kroah-Hartman
2020-01-14 10:01 ` [PATCH 5.4 49/78] staging: comedi: adv_pci1710: fix AI channels 16-31 for PCI-1713 Greg Kroah-Hartman
2020-01-14 10:01 ` [PATCH 5.4 50/78] staging: vt6656: correct return of vnt_init_registers Greg Kroah-Hartman
2020-01-14 10:01 ` [PATCH 5.4 51/78] staging: vt6656: limit reg output to block size Greg Kroah-Hartman
2020-01-14 10:01 ` [PATCH 5.4 52/78] staging: rtl8188eu: Add device code for TP-Link TL-WN727N v5.21 Greg Kroah-Hartman
2020-01-14 10:01 ` [PATCH 5.4 53/78] serdev: Dont claim unsupported ACPI serial devices Greg Kroah-Hartman
2020-01-14 10:01 ` [PATCH 5.4 54/78] iommu/vt-d: Fix adding non-PCI devices to Intel IOMMU Greg Kroah-Hartman
2020-01-14 10:01 ` [PATCH 5.4 55/78] tty: link tty and port before configuring it as console Greg Kroah-Hartman
2020-01-14 10:01 ` [PATCH 5.4 56/78] tty: always relink the port Greg Kroah-Hartman
2020-01-14 10:01 ` [PATCH 5.4 57/78] arm64: Move __ARCH_WANT_SYS_CLONE3 definition to uapi headers Greg Kroah-Hartman
2020-01-14 10:01 ` [PATCH 5.4 58/78] arm64: Implement copy_thread_tls Greg Kroah-Hartman
2020-01-14 10:01 ` [PATCH 5.4 59/78] arm: " Greg Kroah-Hartman
2020-01-14 10:01 ` [PATCH 5.4 60/78] parisc: " Greg Kroah-Hartman
2020-01-14 10:01 ` [PATCH 5.4 61/78] riscv: " Greg Kroah-Hartman
2020-01-14 10:01 ` [PATCH 5.4 62/78] xtensa: " Greg Kroah-Hartman
2020-01-14 10:01 ` [PATCH 5.4 63/78] clone3: ensure copy_thread_tls is implemented Greg Kroah-Hartman
2020-01-14 10:01 ` [PATCH 5.4 64/78] um: Implement copy_thread_tls Greg Kroah-Hartman
2020-01-14 10:01 ` [PATCH 5.4 65/78] staging: vt6656: remove bool from vnt_radio_power_on ret Greg Kroah-Hartman
2020-01-14 10:01 ` [PATCH 5.4 66/78] mwifiex: fix possible heap overflow in mwifiex_process_country_ie() Greg Kroah-Hartman
2020-01-14 10:01 ` [PATCH 5.4 67/78] mwifiex: pcie: Fix memory leak in mwifiex_pcie_alloc_cmdrsp_buf Greg Kroah-Hartman
2020-01-14 10:01 ` [PATCH 5.4 68/78] rpmsg: char: release allocated memory Greg Kroah-Hartman
2020-01-14 10:01 ` [PATCH 5.4 69/78] scsi: bfa: release allocated memory in case of error Greg Kroah-Hartman
2020-01-14 10:01 ` [PATCH 5.4 70/78] rtl8xxxu: prevent leaking urb Greg Kroah-Hartman
2020-01-14 10:01 ` [PATCH 5.4 71/78] ath10k: fix memory leak Greg Kroah-Hartman
2020-01-14 10:01 ` [PATCH 5.4 72/78] HID: hiddev: fix mess in hiddev_open() Greg Kroah-Hartman
2020-01-14 10:01 ` [PATCH 5.4 73/78] USB: Fix: Dont skip endpoint descriptors with maxpacket=0 Greg Kroah-Hartman
2020-01-14 10:01 ` [PATCH 5.4 74/78] phy: cpcap-usb: Fix error path when no host driver is loaded Greg Kroah-Hartman
2020-01-14 10:01 ` [PATCH 5.4 75/78] phy: cpcap-usb: Fix flakey host idling and enumerating of devices Greg Kroah-Hartman
2020-01-14 10:01 ` [PATCH 5.4 76/78] netfilter: arp_tables: init netns pointer in xt_tgchk_param struct Greg Kroah-Hartman
2020-01-14 10:01 ` [PATCH 5.4 77/78] netfilter: conntrack: dccp, sctp: handle null timeout argument Greg Kroah-Hartman
2020-01-14 10:01 ` [PATCH 5.4 78/78] netfilter: ipset: avoid null deref when IPSET_ATTR_LINENO is present Greg Kroah-Hartman
2020-01-14 15:02 ` [PATCH 5.4 00/78] 5.4.12-stable review Jon Hunter
2020-01-14 15:18   ` Greg Kroah-Hartman
2020-01-14 18:17 ` Guenter Roeck
2020-01-14 18:53   ` Greg Kroah-Hartman
2020-01-14 20:19 ` shuah
2020-01-14 21:55   ` Greg Kroah-Hartman
2020-01-15  2:09 ` Daniel Díaz
2020-01-15  8:12   ` Greg Kroah-Hartman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200114094357.850365222@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=dennis.dalessandro@intel.com \
    --cc=jgg@mellanox.com \
    --cc=kaike.wan@intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mike.marciniszyn@intel.com \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).