All of lore.kernel.org
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Chiqijun <chiqijun@huawei.com>,
	Bjorn Helgaas <bhelgaas@google.com>,
	Alex Williamson <alex.williamson@redhat.com>
Subject: [PATCH 5.4 63/90] PCI: Work around Huawei Intelligent NIC VF FLR erratum
Date: Mon, 21 Jun 2021 18:15:38 +0200	[thread overview]
Message-ID: <20210621154906.286261737@linuxfoundation.org> (raw)
In-Reply-To: <20210621154904.159672728@linuxfoundation.org>

From: Chiqijun <chiqijun@huawei.com>

commit ce00322c2365e1f7b0312f2f493539c833465d97 upstream.

pcie_flr() starts a Function Level Reset (FLR), waits 100ms (the maximum
time allowed for FLR completion by PCIe r5.0, sec 6.6.2), and waits for the
FLR to complete.  It assumes the FLR is complete when a config read returns
valid data.

When we do an FLR on several Huawei Intelligent NIC VFs at the same time,
firmware on the NIC processes them serially.  The VF may respond to config
reads before the firmware has completed its reset processing.  If we bind a
driver to the VF (e.g., by assigning the VF to a virtual machine) in the
interval between the successful config read and completion of the firmware
reset processing, the NIC VF driver may fail to load.

Prevent this driver failure by waiting for the NIC firmware to complete its
reset processing.  Not all NIC firmware supports this feature.

[bhelgaas: commit log]
Link: https://support.huawei.com/enterprise/en/doc/EDOC1100063073/87950645/vm-oss-occasionally-fail-to-load-the-in200-driver-when-the-vf-performs-flr
Link: https://lore.kernel.org/r/20210414132301.1793-1-chiqijun@huawei.com
Signed-off-by: Chiqijun <chiqijun@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/pci/quirks.c |   65 +++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 65 insertions(+)

--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -3991,6 +3991,69 @@ static int delay_250ms_after_flr(struct
 	return 0;
 }
 
+#define PCI_DEVICE_ID_HINIC_VF      0x375E
+#define HINIC_VF_FLR_TYPE           0x1000
+#define HINIC_VF_FLR_CAP_BIT        (1UL << 30)
+#define HINIC_VF_OP                 0xE80
+#define HINIC_VF_FLR_PROC_BIT       (1UL << 18)
+#define HINIC_OPERATION_TIMEOUT     15000	/* 15 seconds */
+
+/* Device-specific reset method for Huawei Intelligent NIC virtual functions */
+static int reset_hinic_vf_dev(struct pci_dev *pdev, int probe)
+{
+	unsigned long timeout;
+	void __iomem *bar;
+	u32 val;
+
+	if (probe)
+		return 0;
+
+	bar = pci_iomap(pdev, 0, 0);
+	if (!bar)
+		return -ENOTTY;
+
+	/* Get and check firmware capabilities */
+	val = ioread32be(bar + HINIC_VF_FLR_TYPE);
+	if (!(val & HINIC_VF_FLR_CAP_BIT)) {
+		pci_iounmap(pdev, bar);
+		return -ENOTTY;
+	}
+
+	/* Set HINIC_VF_FLR_PROC_BIT for the start of FLR */
+	val = ioread32be(bar + HINIC_VF_OP);
+	val = val | HINIC_VF_FLR_PROC_BIT;
+	iowrite32be(val, bar + HINIC_VF_OP);
+
+	pcie_flr(pdev);
+
+	/*
+	 * The device must recapture its Bus and Device Numbers after FLR
+	 * in order generate Completions.  Issue a config write to let the
+	 * device capture this information.
+	 */
+	pci_write_config_word(pdev, PCI_VENDOR_ID, 0);
+
+	/* Firmware clears HINIC_VF_FLR_PROC_BIT when reset is complete */
+	timeout = jiffies + msecs_to_jiffies(HINIC_OPERATION_TIMEOUT);
+	do {
+		val = ioread32be(bar + HINIC_VF_OP);
+		if (!(val & HINIC_VF_FLR_PROC_BIT))
+			goto reset_complete;
+		msleep(20);
+	} while (time_before(jiffies, timeout));
+
+	val = ioread32be(bar + HINIC_VF_OP);
+	if (!(val & HINIC_VF_FLR_PROC_BIT))
+		goto reset_complete;
+
+	pci_warn(pdev, "Reset dev timeout, FLR ack reg: %#010x\n", val);
+
+reset_complete:
+	pci_iounmap(pdev, bar);
+
+	return 0;
+}
+
 static const struct pci_dev_reset_methods pci_dev_reset_methods[] = {
 	{ PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82599_SFP_VF,
 		 reset_intel_82599_sfp_virtfn },
@@ -4002,6 +4065,8 @@ static const struct pci_dev_reset_method
 	{ PCI_VENDOR_ID_INTEL, 0x0953, delay_250ms_after_flr },
 	{ PCI_VENDOR_ID_CHELSIO, PCI_ANY_ID,
 		reset_chelsio_generic_dev },
+	{ PCI_VENDOR_ID_HUAWEI, PCI_DEVICE_ID_HINIC_VF,
+		reset_hinic_vf_dev },
 	{ 0 }
 };
 



  parent reply	other threads:[~2021-06-21 16:20 UTC|newest]

Thread overview: 98+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-06-21 16:14 [PATCH 5.4 00/90] 5.4.128-rc1 review Greg Kroah-Hartman
2021-06-21 16:14 ` [PATCH 5.4 01/90] dmaengine: ALTERA_MSGDMA depends on HAS_IOMEM Greg Kroah-Hartman
2021-06-21 16:14 ` [PATCH 5.4 02/90] dmaengine: QCOM_HIDMA_MGMT " Greg Kroah-Hartman
2021-06-21 16:14 ` [PATCH 5.4 03/90] dmaengine: stedma40: add missing iounmap() on error in d40_probe() Greg Kroah-Hartman
2021-06-21 16:14 ` [PATCH 5.4 04/90] afs: Fix an IS_ERR() vs NULL check Greg Kroah-Hartman
2021-06-21 16:14 ` [PATCH 5.4 05/90] mm/memory-failure: make sure wait for page writeback in memory_failure Greg Kroah-Hartman
2021-06-21 16:14 ` [PATCH 5.4 06/90] kvm: LAPIC: Restore guard to prevent illegal APIC register access Greg Kroah-Hartman
2021-06-21 16:14 ` [PATCH 5.4 07/90] batman-adv: Avoid WARN_ON timing related checks Greg Kroah-Hartman
2021-06-21 16:14 ` [PATCH 5.4 08/90] net: ipv4: fix memory leak in netlbl_cipsov4_add_std Greg Kroah-Hartman
2021-06-21 16:14 ` [PATCH 5.4 09/90] vrf: fix maximum MTU Greg Kroah-Hartman
2021-06-21 16:14 ` [PATCH 5.4 10/90] net: rds: fix memory leak in rds_recvmsg Greg Kroah-Hartman
2021-06-21 16:14 ` [PATCH 5.4 11/90] net: lantiq: disable interrupt before sheduling NAPI Greg Kroah-Hartman
2021-06-21 16:14 ` [PATCH 5.4 12/90] udp: fix race between close() and udp_abort() Greg Kroah-Hartman
2021-06-21 16:14 ` [PATCH 5.4 13/90] rtnetlink: Fix regression in bridge VLAN configuration Greg Kroah-Hartman
2021-06-21 16:14 ` [PATCH 5.4 14/90] net/sched: act_ct: handle DNAT tuple collision Greg Kroah-Hartman
2021-06-21 16:14 ` [PATCH 5.4 15/90] net/mlx5e: Remove dependency in IPsec initialization flows Greg Kroah-Hartman
2021-06-21 16:14 ` [PATCH 5.4 16/90] net/mlx5e: Fix page reclaim for dead peer hairpin Greg Kroah-Hartman
2021-06-21 16:14 ` [PATCH 5.4 17/90] net/mlx5: Consider RoCE cap before init RDMA resources Greg Kroah-Hartman
2021-06-21 16:14 ` [PATCH 5.4 18/90] net/mlx5e: allow TSO on VXLAN over VLAN topologies Greg Kroah-Hartman
2021-06-21 16:14 ` [PATCH 5.4 19/90] net/mlx5e: Block offload of outer header csum for UDP tunnels Greg Kroah-Hartman
2021-06-21 16:14 ` [PATCH 5.4 20/90] netfilter: synproxy: Fix out of bounds when parsing TCP options Greg Kroah-Hartman
2021-06-21 16:14 ` [PATCH 5.4 21/90] sch_cake: Fix out of bounds when parsing TCP options and header Greg Kroah-Hartman
2021-06-21 16:14 ` [PATCH 5.4 22/90] alx: Fix an error handling path in alx_probe() Greg Kroah-Hartman
2021-06-21 16:14 ` [PATCH 5.4 23/90] net: stmmac: dwmac1000: Fix extended MAC address registers definition Greg Kroah-Hartman
2021-06-21 16:14 ` [PATCH 5.4 24/90] net: make get_net_ns return error if NET_NS is disabled Greg Kroah-Hartman
2021-06-21 16:15 ` [PATCH 5.4 25/90] qlcnic: Fix an error handling path in qlcnic_probe() Greg Kroah-Hartman
2021-06-21 16:15 ` [PATCH 5.4 26/90] netxen_nic: Fix an error handling path in netxen_nic_probe() Greg Kroah-Hartman
2021-06-21 16:15 ` [PATCH 5.4 27/90] net: qrtr: fix OOB Read in qrtr_endpoint_post Greg Kroah-Hartman
2021-06-21 16:15 ` [PATCH 5.4 28/90] ptp: improve max_adj check against unreasonable values Greg Kroah-Hartman
2021-06-21 16:15 ` [PATCH 5.4 29/90] net: cdc_ncm: switch to eth%d interface naming Greg Kroah-Hartman
2021-06-21 16:15 ` [PATCH 5.4 30/90] lantiq: net: fix duplicated skb in rx descriptor ring Greg Kroah-Hartman
2021-06-21 16:15 ` [PATCH 5.4 31/90] net: usb: fix possible use-after-free in smsc75xx_bind Greg Kroah-Hartman
2021-06-21 16:15 ` [PATCH 5.4 32/90] net: fec_ptp: fix issue caused by refactor the fec_devtype Greg Kroah-Hartman
2021-06-21 16:15 ` [PATCH 5.4 33/90] net: ipv4: fix memory leak in ip_mc_add1_src Greg Kroah-Hartman
2021-06-21 16:15 ` [PATCH 5.4 34/90] net/af_unix: fix a data-race in unix_dgram_sendmsg / unix_release_sock Greg Kroah-Hartman
2021-06-21 16:15 ` [PATCH 5.4 35/90] be2net: Fix an error handling path in be_probe() Greg Kroah-Hartman
2021-06-21 16:15 ` [PATCH 5.4 36/90] net: hamradio: fix memory leak in mkiss_close Greg Kroah-Hartman
2021-06-21 16:15 ` [PATCH 5.4 37/90] net: cdc_eem: fix tx fixup skb leak Greg Kroah-Hartman
2021-06-21 16:15 ` [PATCH 5.4 38/90] cxgb4: fix wrong shift Greg Kroah-Hartman
2021-06-21 16:15 ` [PATCH 5.4 39/90] bnxt_en: Rediscover PHY capabilities after firmware reset Greg Kroah-Hartman
2021-06-21 16:15 ` [PATCH 5.4 40/90] bnxt_en: Call bnxt_ethtool_free() in bnxt_init_one() error path Greg Kroah-Hartman
2021-06-21 16:15 ` [PATCH 5.4 41/90] icmp: dont send out ICMP messages with a source address of 0.0.0.0 Greg Kroah-Hartman
2021-06-21 16:15 ` [PATCH 5.4 42/90] net: ethernet: fix potential use-after-free in ec_bhf_remove Greg Kroah-Hartman
2021-06-21 16:15 ` [PATCH 5.4 43/90] regulator: bd70528: Fix off-by-one for buck123 .n_voltages setting Greg Kroah-Hartman
2021-06-21 16:15 ` [PATCH 5.4 44/90] ASoC: rt5659: Fix the lost powers for the HDA header Greg Kroah-Hartman
2021-06-21 16:15 ` [PATCH 5.4 45/90] spi: stm32-qspi: Always wait BUSY bit to be cleared in stm32_qspi_wait_cmd() Greg Kroah-Hartman
2021-06-21 16:15 ` [PATCH 5.4 46/90] pinctrl: ralink: rt2880: avoid to error in calls is pin is already enabled Greg Kroah-Hartman
2021-06-21 16:15 ` [PATCH 5.4 47/90] radeon: use memcpy_to/fromio for UVD fw upload Greg Kroah-Hartman
2021-06-21 16:15 ` [PATCH 5.4 48/90] hwmon: (scpi-hwmon) shows the negative temperature properly Greg Kroah-Hartman
2021-06-21 16:15 ` [PATCH 5.4 49/90] sched/fair: Correctly insert cfs_rqs to list on unthrottle Greg Kroah-Hartman
2021-06-21 16:15 ` [PATCH 5.4 50/90] can: bcm: fix infoleak in struct bcm_msg_head Greg Kroah-Hartman
2021-06-21 16:15 ` [PATCH 5.4 51/90] can: bcm/raw/isotp: use per module netdevice notifier Greg Kroah-Hartman
2021-06-21 16:15 ` [PATCH 5.4 52/90] can: j1939: fix Use-after-Free, hold skb ref while in use Greg Kroah-Hartman
2021-06-21 16:15 ` [PATCH 5.4 53/90] can: mcba_usb: fix memory leak in mcba_usb Greg Kroah-Hartman
2021-06-21 16:15 ` [PATCH 5.4 54/90] usb: core: hub: Disable autosuspend for Cypress CY7C65632 Greg Kroah-Hartman
2021-06-21 16:15 ` [PATCH 5.4 55/90] tracing: Do not stop recording cmdlines when tracing is off Greg Kroah-Hartman
2021-06-21 16:15 ` [PATCH 5.4 56/90] tracing: Do not stop recording comms if the trace file is being read Greg Kroah-Hartman
2021-06-21 16:15 ` [PATCH 5.4 57/90] tracing: Do no increment trace_clock_global() by one Greg Kroah-Hartman
2021-06-21 16:15 ` [PATCH 5.4 58/90] PCI: Mark TI C667X to avoid bus reset Greg Kroah-Hartman
2021-06-21 16:15 ` [PATCH 5.4 59/90] PCI: Mark some NVIDIA GPUs " Greg Kroah-Hartman
2021-06-21 16:15 ` [PATCH 5.4 60/90] PCI: aardvark: Dont rely on jiffies while holding spinlock Greg Kroah-Hartman
2021-06-21 16:15 ` [PATCH 5.4 61/90] PCI: aardvark: Fix kernel panic during PIO transfer Greg Kroah-Hartman
2021-06-21 16:15 ` [PATCH 5.4 62/90] PCI: Add ACS quirk for Broadcom BCM57414 NIC Greg Kroah-Hartman
2021-06-21 16:15 ` Greg Kroah-Hartman [this message]
2021-06-21 16:15 ` [PATCH 5.4 64/90] KVM: x86: Immediately reset the MMU context when the SMM flag is cleared Greg Kroah-Hartman
2021-06-21 16:15 ` [PATCH 5.4 65/90] ARCv2: save ABI registers across signal handling Greg Kroah-Hartman
2021-06-21 16:15 ` [PATCH 5.4 66/90] x86/process: Check PF_KTHREAD and not current->mm for kernel threads Greg Kroah-Hartman
2021-06-21 16:15 ` [PATCH 5.4 67/90] x86/pkru: Write hardware init value to PKRU when xstate is init Greg Kroah-Hartman
2021-06-21 16:15 ` [PATCH 5.4 68/90] x86/fpu: Reset state for all signal restore failures Greg Kroah-Hartman
2021-06-21 16:15 ` [PATCH 5.4 69/90] dmaengine: pl330: fix wrong usage of spinlock flags in dma_cyclc Greg Kroah-Hartman
2021-06-21 16:15 ` [PATCH 5.4 70/90] cfg80211: make certificate generation more robust Greg Kroah-Hartman
2021-06-21 16:15 ` [PATCH 5.4 71/90] cfg80211: avoid double free of PMSR request Greg Kroah-Hartman
2021-06-21 16:15 ` [PATCH 5.4 72/90] drm/amdgpu/gfx10: enlarge CP_MEC_DOORBELL_RANGE_UPPER to cover full doorbell Greg Kroah-Hartman
2021-06-21 16:15 ` [PATCH 5.4 73/90] drm/amdgpu/gfx9: fix the doorbell missing when in CGPG issue Greg Kroah-Hartman
2021-06-21 16:15 ` [PATCH 5.4 74/90] net: ll_temac: Make sure to free skb when it is completely used Greg Kroah-Hartman
2021-06-21 16:15 ` [PATCH 5.4 75/90] net: ll_temac: Fix TX BD buffer overwrite Greg Kroah-Hartman
2021-06-21 16:15 ` [PATCH 5.4 76/90] net: bridge: fix vlan tunnel dst null pointer dereference Greg Kroah-Hartman
2021-06-21 16:15 ` [PATCH 5.4 77/90] net: bridge: fix vlan tunnel dst refcnt when egressing Greg Kroah-Hartman
2021-06-21 16:15 ` [PATCH 5.4 78/90] mm/slub: clarify verification reporting Greg Kroah-Hartman
2021-06-21 16:15 ` [PATCH 5.4 79/90] mm/slub: fix redzoning for small allocations Greg Kroah-Hartman
2021-06-21 16:15 ` [PATCH 5.4 80/90] mm/slub.c: include swab.h Greg Kroah-Hartman
2021-06-21 16:15 ` [PATCH 5.4 81/90] net: stmmac: disable clocks in stmmac_remove_config_dt() Greg Kroah-Hartman
2021-06-21 16:15 ` [PATCH 5.4 82/90] net: fec_ptp: add clock rate zero check Greg Kroah-Hartman
2021-06-21 16:15 ` [PATCH 5.4 83/90] tools headers UAPI: Sync linux/in.h copy with the kernel sources Greg Kroah-Hartman
2021-06-21 16:15 ` [PATCH 5.4 84/90] KVM: arm/arm64: Fix KVM_VGIC_V3_ADDR_TYPE_REDIST read Greg Kroah-Hartman
2021-06-21 16:16 ` [PATCH 5.4 85/90] ARM: OMAP: replace setup_irq() by request_irq() Greg Kroah-Hartman
2021-06-21 16:16 ` [PATCH 5.4 86/90] clocksource/drivers/timer-ti-dm: Add clockevent and clocksource support Greg Kroah-Hartman
2021-06-21 16:16 ` [PATCH 5.4 87/90] clocksource/drivers/timer-ti-dm: Prepare to handle dra7 timer wrap issue Greg Kroah-Hartman
2021-06-21 16:16 ` [PATCH 5.4 88/90] clocksource/drivers/timer-ti-dm: Handle dra7 timer wrap errata i940 Greg Kroah-Hartman
2021-06-21 16:16 ` [PATCH 5.4 89/90] usb: dwc3: debugfs: Add and remove endpoint dirs dynamically Greg Kroah-Hartman
2021-06-21 16:16 ` [PATCH 5.4 90/90] usb: dwc3: core: fix kernel panic when do reboot Greg Kroah-Hartman
2021-06-21 19:16 ` [PATCH 5.4 00/90] 5.4.128-rc1 review Florian Fainelli
2021-06-22  7:10 ` Naresh Kamboju
2021-06-22  7:31 ` Samuel Zou
2021-06-22  7:58 ` Jon Hunter
2021-06-22 10:45 ` Sudip Mukherjee
2021-06-22 21:34 ` Guenter Roeck
2021-06-22 23:59 ` Shuah Khan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210621154906.286261737@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=alex.williamson@redhat.com \
    --cc=bhelgaas@google.com \
    --cc=chiqijun@huawei.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.