From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
stable@vger.kernel.org,
Mike Marciniszyn <mike.marciniszyn@intel.com>,
Kaike Wan <kaike.wan@intel.com>,
Dennis Dalessandro <dennis.dalessandro@intel.com>,
Jason Gunthorpe <jgg@mellanox.com>
Subject: [PATCH 5.4 31/78] IB/hfi1: Adjust flow PSN with the correct resync_psn
Date: Tue, 14 Jan 2020 11:01:05 +0100
Message-ID: <20200114094357.850365222@linuxfoundation.org>
In-Reply-To: <20200114094352.428808181@linuxfoundation.org>
From: Kaike Wan <kaike.wan@intel.com>

commit b2ff0d510182eb5cc05a65d1b2371af62c4b170c upstream.

When a TID RDMA ACK to a RESYNC request is received, the flow PSNs for
pending TID RDMA WRITE segments are adjusted with the next flow
generation number, based on the resync_psn value extracted from the flow
PSN of the TID RDMA ACK packet. The resync_psn value indicates the last
flow PSN for which a TID RDMA WRITE DATA packet has been received by the
responder, so the requester should resend TID RDMA WRITE DATA packets
starting from the next flow PSN.

However, if resync_psn points to the last flow PSN for a segment and the
next segment's flow PSN starts with a new generation number, using the old
resync_psn to adjust the flow PSN for the next segment leads to a
miscalculation, resulting in WARN_ON and sge rewinding errors:
WARNING: CPU: 4 PID: 146961 at /nfs/site/home/phcvs2/gitrepo/ifs-all/components/Drivers/tmp/rpmbuild/BUILD/ifs-kernel-updates-3.10.0_957.el7.x86_64/hfi1/tid_rdma.c:4764 hfi1_rc_rcv_tid_rdma_ack+0x8f6/0xa90 [hfi1]
Modules linked in: ib_ipoib(OE) hfi1(OE) rdmavt(OE) rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfsv3 nfs_acl nfs lockd grace fscache iTCO_wdt iTCO_vendor_support skx_edac intel_powerclamp coretemp intel_rapl iosf_mbi kvm irqbypass crc32_pclmul ghash_clmulni_intel ib_isert iscsi_target_mod target_core_mod aesni_intel lrw gf128mul glue_helper ablk_helper cryptd rpcrdma sunrpc opa_vnic ast ttm ib_iser libiscsi drm_kms_helper scsi_transport_iscsi ipmi_ssif syscopyarea sysfillrect sysimgblt fb_sys_fops drm joydev ipmi_si pcspkr sg drm_panel_orientation_quirks ipmi_devintf lpc_ich i2c_i801 ipmi_msghandler wmi rdma_ucm ib_ucm ib_uverbs acpi_cpufreq acpi_power_meter ib_umad rdma_cm ib_cm iw_cm ip_tables ext4 mbcache jbd2 sd_mod crc_t10dif crct10dif_generic crct10dif_pclmul i2c_algo_bit crct10dif_common
crc32c_intel e1000e ib_core ahci libahci ptp libata pps_core nfit libnvdimm [last unloaded: rdmavt]
CPU: 4 PID: 146961 Comm: kworker/4:0H Kdump: loaded Tainted: G W OE ------------ 3.10.0-957.el7.x86_64 #1
Hardware name: Intel Corporation S2600WFT/S2600WFT, BIOS SE5C620.86B.0X.02.0117.040420182310 04/04/2018
Workqueue: hfi0_0 _hfi1_do_tid_send [hfi1]
Call Trace:
<IRQ> [<ffffffff9e361dc1>] dump_stack+0x19/0x1b
[<ffffffff9dc97648>] __warn+0xd8/0x100
[<ffffffff9dc9778d>] warn_slowpath_null+0x1d/0x20
[<ffffffffc05d28c6>] hfi1_rc_rcv_tid_rdma_ack+0x8f6/0xa90 [hfi1]
[<ffffffffc05c21cc>] hfi1_kdeth_eager_rcv+0x1dc/0x210 [hfi1]
[<ffffffffc05c23ef>] ? hfi1_kdeth_expected_rcv+0x1ef/0x210 [hfi1]
[<ffffffffc0574f15>] kdeth_process_eager+0x35/0x90 [hfi1]
[<ffffffffc0575b5a>] handle_receive_interrupt_nodma_rtail+0x17a/0x2b0 [hfi1]
[<ffffffffc056a623>] receive_context_interrupt+0x23/0x40 [hfi1]
[<ffffffff9dd4a294>] __handle_irq_event_percpu+0x44/0x1c0
[<ffffffff9dd4a442>] handle_irq_event_percpu+0x32/0x80
[<ffffffff9dd4a4cc>] handle_irq_event+0x3c/0x60
[<ffffffff9dd4d27f>] handle_edge_irq+0x7f/0x150
[<ffffffff9dc2e554>] handle_irq+0xe4/0x1a0
[<ffffffff9e3795dd>] do_IRQ+0x4d/0xf0
[<ffffffff9e36b362>] common_interrupt+0x162/0x162
<EOI> [<ffffffff9dfa0f79>] ? swiotlb_map_page+0x49/0x150
[<ffffffffc05c2ed1>] hfi1_verbs_send_dma+0x291/0xb70 [hfi1]
[<ffffffffc05c2c40>] ? hfi1_wait_kmem+0xf0/0xf0 [hfi1]
[<ffffffffc05c3f26>] hfi1_verbs_send+0x126/0x2b0 [hfi1]
[<ffffffffc05ce683>] _hfi1_do_tid_send+0x1d3/0x320 [hfi1]
[<ffffffff9dcb9d4f>] process_one_work+0x17f/0x440
[<ffffffff9dcbade6>] worker_thread+0x126/0x3c0
[<ffffffff9dcbacc0>] ? manage_workers.isra.25+0x2a0/0x2a0
[<ffffffff9dcc1c31>] kthread+0xd1/0xe0
[<ffffffff9dcc1b60>] ? insert_kthread_work+0x40/0x40
[<ffffffff9e374c1d>] ret_from_fork_nospec_begin+0x7/0x21
[<ffffffff9dcc1b60>] ? insert_kthread_work+0x40/0x40
This patch fixes the issue by adjusting the resync_psn first if the flow
generation has been advanced for a pending segment.
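
The arithmetic behind the miscalculation can be illustrated with a small
standalone sketch. It is not driver code: the constants and helpers below
are simplified stand-ins, and the values chosen for SEQ_SHIFT and PSN_MASK
as well as the function names resync_npkts_old() and resync_npkts_new() are
assumptions made for illustration only. A flow PSN is modeled as a
generation number in the upper bits and a sequence number in the lower
SEQ_SHIFT bits.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed stand-ins for the driver's definitions; not copied from hfi1. */
#define SEQ_SHIFT 11u                      /* assumed HFI1_KDETH_BTH_SEQ_SHIFT */
#define SEQ_MASK  ((1u << SEQ_SHIFT) - 1u) /* sequence bits of a flow PSN */
#define PSN_MASK  0x7fffffffu              /* assumed 31-bit PSN space */
#define GEN_BITS  20u                      /* 31 - SEQ_SHIFT generation bits */

static uint32_t mask_psn(uint32_t a)
{
	return a & PSN_MASK;
}

/* Signed distance a - b, wrapped into the 31-bit PSN space. */
static int32_t delta_psn(uint32_t a, uint32_t b)
{
	uint32_t d = (a - b) & PSN_MASK;

	/* Bit 30 set means the distance is negative in 31-bit arithmetic. */
	if (d & 0x40000000u)
		return (int32_t)((int64_t)d - 0x80000000LL);
	return (int32_t)d;
}

/* Build a full flow PSN from a generation and a sequence number. */
static uint32_t full_flow_psn(uint32_t generation, uint32_t seq)
{
	assert(generation < (1u << GEN_BITS));
	return mask_psn((generation << SEQ_SHIFT) | (seq & SEQ_MASK));
}

/* Hypothetical helper: pre-patch arithmetic, resync_psn used as received. */
static int32_t resync_npkts_old(uint32_t flow_gen, uint32_t flow_spsn,
				uint32_t resync_psn)
{
	uint32_t fpsn = full_flow_psn(flow_gen, flow_spsn);

	return delta_psn(mask_psn(resync_psn + 1), fpsn);
}

/*
 * Hypothetical helper: patched arithmetic. If the pending flow already
 * carries a different generation than resync_psn, move resync_psn to the
 * PSN just before the flow's start PSN first.
 */
static int32_t resync_npkts_new(uint32_t flow_gen, uint32_t flow_spsn,
				uint32_t resync_psn)
{
	uint32_t fpsn = full_flow_psn(flow_gen, flow_spsn);

	if (flow_gen != (resync_psn >> SEQ_SHIFT))
		resync_psn = mask_psn(fpsn - 1);
	return delta_psn(mask_psn(resync_psn + 1), fpsn);
}
```

Under these assumptions, take a resync_psn of generation 5, sequence 7 and
a pending flow that starts at generation 6, sequence 0. The old arithmetic
yields a packet count of -2040, while the adjusted arithmetic yields 0.
When the flow is still in generation 5 and starts at sequence 0, both
variants yield 8.
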
Fixes: 9e93e967f7b4 ("IB/hfi1: Add a function to receive TID RDMA ACK packet")
Link: https://lore.kernel.org/r/20191219231920.51069.37147.stgit@awfm-01.aw.intel.com
Cc: <stable@vger.kernel.org>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
drivers/infiniband/hw/hfi1/tid_rdma.c | 9 +++++++++
1 file changed, 9 insertions(+)
--- a/drivers/infiniband/hw/hfi1/tid_rdma.c
+++ b/drivers/infiniband/hw/hfi1/tid_rdma.c
@@ -4633,6 +4633,15 @@ void hfi1_rc_rcv_tid_rdma_ack(struct hfi
 			 */
 			fpsn = full_flow_psn(flow, flow->flow_state.spsn);
 			req->r_ack_psn = psn;
+			/*
+			 * If resync_psn points to the last flow PSN for a
+			 * segment and the new segment (likely from a new
+			 * request) starts with a new generation number, we
+			 * need to adjust resync_psn accordingly.
+			 */
+			if (flow->flow_state.generation !=
+			    (resync_psn >> HFI1_KDETH_BTH_SEQ_SHIFT))
+				resync_psn = mask_psn(fpsn - 1);
 			flow->resync_npkts +=
 				delta_psn(mask_psn(resync_psn + 1), fpsn);
 			/*