linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Zenghui Yu <yuzenghui@huawei.com>,
	Jakub Kicinski <kuba@kernel.org>
Subject: [PATCH 5.4 25/49] net: hns3: Clear the CMDQ registers before unmapping BAR region
Date: Sat, 31 Oct 2020 12:35:21 +0100	[thread overview]
Message-ID: <20201031113456.656663992@linuxfoundation.org> (raw)
In-Reply-To: <20201031113455.439684970@linuxfoundation.org>

From: Zenghui Yu <yuzenghui@huawei.com>

[ Upstream commit e3364c5ff3ff975b943a7bf47e21a2a4bf20f3fe ]

When unbinding the hns3 driver with the HNS3 VF, I got the following
kernel panic:

[  265.709989] Unable to handle kernel paging request at virtual address ffff800054627000
[  265.717928] Mem abort info:
[  265.720740]   ESR = 0x96000047
[  265.723810]   EC = 0x25: DABT (current EL), IL = 32 bits
[  265.729126]   SET = 0, FnV = 0
[  265.732195]   EA = 0, S1PTW = 0
[  265.735351] Data abort info:
[  265.738227]   ISV = 0, ISS = 0x00000047
[  265.742071]   CM = 0, WnR = 1
[  265.745055] swapper pgtable: 4k pages, 48-bit VAs, pgdp=0000000009b54000
[  265.751753] [ffff800054627000] pgd=0000202ffffff003, p4d=0000202ffffff003, pud=00002020020eb003, pmd=00000020a0dfc003, pte=0000000000000000
[  265.764314] Internal error: Oops: 96000047 [#1] SMP
[  265.830357] CPU: 61 PID: 20319 Comm: bash Not tainted 5.9.0+ #206
[  265.836423] Hardware name: Huawei TaiShan 2280 V2/BC82AMDDA, BIOS 1.05 09/18/2019
[  265.843873] pstate: 80400009 (Nzcv daif +PAN -UAO -TCO BTYPE=--)
[  265.843890] pc : hclgevf_cmd_uninit+0xbc/0x300
[  265.861988] lr : hclgevf_cmd_uninit+0xb0/0x300
[  265.861992] sp : ffff80004c983b50
[  265.881411] pmr_save: 000000e0
[  265.884453] x29: ffff80004c983b50 x28: ffff20280bbce500
[  265.889744] x27: 0000000000000000 x26: 0000000000000000
[  265.895034] x25: ffff800011a1f000 x24: ffff800011a1fe90
[  265.900325] x23: ffff0020ce9b00d8 x22: ffff0020ce9b0150
[  265.905616] x21: ffff800010d70e90 x20: ffff800010d70e90
[  265.910906] x19: ffff0020ce9b0080 x18: 0000000000000004
[  265.916198] x17: 0000000000000000 x16: ffff800011ae32e8
[  265.916201] x15: 0000000000000028 x14: 0000000000000002
[  265.916204] x13: ffff800011ae32e8 x12: 0000000000012ad8
[  265.946619] x11: ffff80004c983b50 x10: 0000000000000000
[  265.951911] x9 : ffff8000115d0888 x8 : 0000000000000000
[  265.951914] x7 : ffff800011890b20 x6 : c0000000ffff7fff
[  265.951917] x5 : ffff80004c983930 x4 : 0000000000000001
[  265.951919] x3 : ffffa027eec1b000 x2 : 2b78ccbbff369100
[  265.964487] x1 : 0000000000000000 x0 : ffff800054627000
[  265.964491] Call trace:
[  265.964494]  hclgevf_cmd_uninit+0xbc/0x300
[  265.964496]  hclgevf_uninit_ae_dev+0x9c/0xe8
[  265.964501]  hnae3_unregister_ae_dev+0xb0/0x130
[  265.964516]  hns3_remove+0x34/0x88 [hns3]
[  266.009683]  pci_device_remove+0x48/0xf0
[  266.009692]  device_release_driver_internal+0x114/0x1e8
[  266.030058]  device_driver_detach+0x28/0x38
[  266.034224]  unbind_store+0xd4/0x108
[  266.037784]  drv_attr_store+0x40/0x58
[  266.041435]  sysfs_kf_write+0x54/0x80
[  266.045081]  kernfs_fop_write+0x12c/0x250
[  266.049076]  vfs_write+0xc4/0x248
[  266.052378]  ksys_write+0x74/0xf8
[  266.055677]  __arm64_sys_write+0x24/0x30
[  266.059584]  el0_svc_common.constprop.3+0x84/0x270
[  266.064354]  do_el0_svc+0x34/0xa0
[  266.067658]  el0_svc+0x38/0x40
[  266.070700]  el0_sync_handler+0x8c/0xb0
[  266.074519]  el0_sync+0x140/0x180

It looks like the BAR memory region had already been unmapped before we
start clearing CMDQ registers in it, which is pretty bad and the kernel
happily kills itself because of a Current EL Data Abort (on arm64).

Moving the CMDQ uninitialization a bit early fixes the issue for me.

Fixes: 862d969a3a4d ("net: hns3: do VF's pci re-initialization while PF doing FLR")
Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
Link: https://lore.kernel.org/r/20201023051550.793-1-yuzenghui@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c
+++ b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c
@@ -2782,8 +2782,8 @@ static void hclgevf_uninit_hdev(struct h
 		hclgevf_uninit_msi(hdev);
 	}
 
-	hclgevf_pci_uninit(hdev);
 	hclgevf_cmd_uninit(hdev);
+	hclgevf_pci_uninit(hdev);
 }
 
 static int hclgevf_init_ae_dev(struct hnae3_ae_dev *ae_dev)



  parent reply	other threads:[~2020-10-31 11:37 UTC|newest]

Thread overview: 51+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-10-31 11:34 [PATCH 5.4 00/49] 5.4.74-rc1 review Greg Kroah-Hartman
2020-10-31 11:34 ` [PATCH 5.4 01/49] netfilter: nftables_offload: KASAN slab-out-of-bounds Read in nft_flow_rule_create Greg Kroah-Hartman
2020-10-31 11:34 ` [PATCH 5.4 02/49] socket: dont clear SOCK_TSTAMP_NEW when SO_TIMESTAMPNS is disabled Greg Kroah-Hartman
2020-10-31 11:34 ` [PATCH 5.4 03/49] objtool: Support Clang non-section symbols in ORC generation Greg Kroah-Hartman
2020-10-31 11:35 ` [PATCH 5.4 04/49] scripts/setlocalversion: make git describe output more reliable Greg Kroah-Hartman
2020-10-31 11:35 ` [PATCH 5.4 05/49] arm64: Run ARCH_WORKAROUND_1 enabling code on all CPUs Greg Kroah-Hartman
2020-10-31 11:35 ` [PATCH 5.4 06/49] arm64: Run ARCH_WORKAROUND_2 " Greg Kroah-Hartman
2020-10-31 11:35 ` [PATCH 5.4 07/49] arm64: link with -z norelro regardless of CONFIG_RELOCATABLE Greg Kroah-Hartman
2020-10-31 11:35 ` [PATCH 5.4 08/49] x86/PCI: Fix intel_mid_pci.c build error when ACPI is not enabled Greg Kroah-Hartman
2020-10-31 11:35 ` [PATCH 5.4 09/49] efivarfs: Replace invalid slashes with exclamation marks in dentries Greg Kroah-Hartman
2020-10-31 11:35 ` [PATCH 5.4 10/49] bnxt_en: Check abort error state in bnxt_open_nic() Greg Kroah-Hartman
2020-10-31 11:35 ` [PATCH 5.4 11/49] bnxt_en: Send HWRM_FUNC_RESET fw command unconditionally Greg Kroah-Hartman
2020-10-31 11:35 ` [PATCH 5.4 12/49] chelsio/chtls: fix deadlock issue Greg Kroah-Hartman
2020-10-31 11:35 ` [PATCH 5.4 13/49] chelsio/chtls: fix memory leaks in CPL handlers Greg Kroah-Hartman
2020-10-31 11:35 ` [PATCH 5.4 14/49] chelsio/chtls: fix tls record info to user Greg Kroah-Hartman
2020-10-31 11:35 ` [PATCH 5.4 15/49] cxgb4: set up filter action after rewrites Greg Kroah-Hartman
2020-10-31 11:35 ` [PATCH 5.4 16/49] gtp: fix an use-before-init in gtp_newlink() Greg Kroah-Hartman
2020-10-31 11:35 ` [PATCH 5.4 17/49] ibmvnic: fix ibmvnic_set_mac Greg Kroah-Hartman
2020-10-31 11:35 ` [PATCH 5.4 18/49] mlxsw: core: Fix memory leak on module removal Greg Kroah-Hartman
2020-10-31 11:35 ` [PATCH 5.4 19/49] netem: fix zero division in tabledist Greg Kroah-Hartman
2020-10-31 11:35 ` [PATCH 5.4 20/49] net/sched: act_mpls: Add softdep on mpls_gso.ko Greg Kroah-Hartman
2020-10-31 11:35 ` [PATCH 5.4 21/49] r8169: fix issue with forced threading in combination with shared interrupts Greg Kroah-Hartman
2020-10-31 11:35 ` [PATCH 5.4 22/49] ravb: Fix bit fields checking in ravb_hwtstamp_get() Greg Kroah-Hartman
2020-10-31 11:35 ` [PATCH 5.4 23/49] tcp: Prevent low rmem stalls with SO_RCVLOWAT Greg Kroah-Hartman
2020-10-31 11:35 ` [PATCH 5.4 24/49] tipc: fix memory leak caused by tipc_buf_append() Greg Kroah-Hartman
2020-10-31 11:35 ` Greg Kroah-Hartman [this message]
2020-10-31 11:35 ` [PATCH 5.4 26/49] bnxt_en: Re-write PCI BARs after PCI fatal error Greg Kroah-Hartman
2020-10-31 11:35 ` [PATCH 5.4 27/49] bnxt_en: Fix regression in workqueue cleanup logic in bnxt_remove_one() Greg Kroah-Hartman
2020-10-31 11:35 ` [PATCH 5.4 28/49] bnxt_en: Invoke cancel_delayed_work_sync() for PFs also Greg Kroah-Hartman
2020-10-31 11:35 ` [PATCH 5.4 29/49] erofs: avoid duplicated permission check for "trusted." xattrs Greg Kroah-Hartman
2020-10-31 11:35 ` [PATCH 5.4 30/49] arch/x86/amd/ibs: Fix re-arming IBS Fetch Greg Kroah-Hartman
2020-10-31 11:35 ` [PATCH 5.4 31/49] x86/xen: disable Firmware First mode for correctable memory errors Greg Kroah-Hartman
2020-10-31 11:35 ` [PATCH 5.4 32/49] ata: ahci: mvebu: Make SATA PHY optional for Armada 3720 Greg Kroah-Hartman
2020-10-31 11:35 ` [PATCH 5.4 33/49] fuse: fix page dereference after free Greg Kroah-Hartman
2020-10-31 11:35 ` [PATCH 5.4 34/49] bpf: Fix comment for helper bpf_current_task_under_cgroup() Greg Kroah-Hartman
2020-10-31 11:35 ` [PATCH 5.4 35/49] evm: Check size of security.evm before using it Greg Kroah-Hartman
2020-10-31 11:35 ` [PATCH 5.4 36/49] p54: avoid accessing the data mapped to streaming DMA Greg Kroah-Hartman
2020-10-31 11:35 ` [PATCH 5.4 37/49] cxl: Rework error message for incompatible slots Greg Kroah-Hartman
2020-10-31 11:35 ` [PATCH 5.4 38/49] RDMA/addr: Fix race with netevent_callback()/rdma_addr_cancel() Greg Kroah-Hartman
2020-10-31 11:35 ` [PATCH 5.4 39/49] mtd: lpddr: Fix bad logic in print_drs_error Greg Kroah-Hartman
2020-10-31 11:35 ` [PATCH 5.4 40/49] drm/i915/gem: Serialise debugfs i915_gem_objects with ctx->mutex Greg Kroah-Hartman
2020-10-31 11:42   ` Greg Kroah-Hartman
2020-10-31 11:35 ` [PATCH 5.4 41/49] serial: qcom_geni_serial: To correct QUP Version detection logic Greg Kroah-Hartman
2020-10-31 11:35 ` [PATCH 5.4 42/49] serial: pl011: Fix lockdep splat when handling magic-sysrq interrupt Greg Kroah-Hartman
2020-10-31 11:35 ` [PATCH 5.4 43/49] PM: runtime: Fix timer_expires data type on 32-bit arches Greg Kroah-Hartman
2020-10-31 11:35 ` [PATCH 5.4 44/49] ata: sata_rcar: Fix DMA boundary mask Greg Kroah-Hartman
2020-10-31 11:35 ` [PATCH 5.4 45/49] xen/gntdev.c: Mark pages as dirty Greg Kroah-Hartman
2020-10-31 11:35 ` [PATCH 5.4 46/49] crypto: x86/crc32c - fix building with clang ias Greg Kroah-Hartman
2020-10-31 11:35 ` [PATCH 5.4 47/49] openrisc: Fix issue with get_user for 64-bit values Greg Kroah-Hartman
2020-10-31 11:35 ` [PATCH 5.4 48/49] misc: rtsx: do not setting OC_POWER_DOWN reg in rtsx_pci_init_ocp() Greg Kroah-Hartman
2020-10-31 11:35 ` [PATCH 5.4 49/49] phy: marvell: comphy: Convert internal SMCC firmware return codes to errno Greg Kroah-Hartman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20201031113456.656663992@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=kuba@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=stable@vger.kernel.org \
    --cc=yuzenghui@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).