From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Aya Levin <ayal@nvidia.com>,
	Amir Tzin <amirtz@nvidia.com>, Saeed Mahameed <saeedm@nvidia.com>,
	Sasha Levin <sashal@kernel.org>
Subject: [PATCH 5.15 21/73] net/mlx5e: Wrap the tx reporter dump callback to extract the sq
Date: Mon,  3 Jan 2022 15:23:42 +0100
Message-ID: <20220103142057.602266857@linuxfoundation.org>
In-Reply-To: <20220103142056.911344037@linuxfoundation.org>

From: Amir Tzin <amirtz@nvidia.com>

[ Upstream commit 918fc3855a6507a200e9cf22c20be852c0982687 ]

Function mlx5e_tx_reporter_dump_sq() casts its void * argument to struct
mlx5e_txqsq *, but in the TX-timeout-recovery flow the argument is
actually of type struct mlx5e_tx_timeout_ctx *.

 mlx5_core 0000:08:00.1 enp8s0f1: TX timeout detected
 mlx5_core 0000:08:00.1 enp8s0f1: TX timeout on queue: 1, SQ: 0x11ec, CQ: 0x146d, SQ Cons: 0x0 SQ Prod: 0x1, usecs since last trans: 21565000
 BUG: stack guard page was hit at 0000000093f1a2de (stack is 00000000b66ea0dc..000000004d932dae)
 kernel stack overflow (page fault): 0000 [#1] SMP NOPTI
 CPU: 5 PID: 95 Comm: kworker/u20:1 Tainted: G W OE 5.13.0_mlnx #1
 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
 Workqueue: mlx5e mlx5e_tx_timeout_work [mlx5_core]
 RIP: 0010:mlx5e_tx_reporter_dump_sq+0xd3/0x180 [mlx5_core]
 Call Trace:
 mlx5e_tx_reporter_dump+0x43/0x1c0 [mlx5_core]
 devlink_health_do_dump.part.91+0x71/0xd0
 devlink_health_report+0x157/0x1b0
 mlx5e_reporter_tx_timeout+0xb9/0xf0 [mlx5_core]
 ? mlx5e_tx_reporter_err_cqe_recover+0x1d0/0x1d0 [mlx5_core]
 ? mlx5e_health_queue_dump+0xd0/0xd0 [mlx5_core]
 ? update_load_avg+0x19b/0x550
 ? set_next_entity+0x72/0x80
 ? pick_next_task_fair+0x227/0x340
 ? finish_task_switch+0xa2/0x280
   mlx5e_tx_timeout_work+0x83/0xb0 [mlx5_core]
   process_one_work+0x1de/0x3a0
   worker_thread+0x2d/0x3c0
 ? process_one_work+0x3a0/0x3a0
   kthread+0x115/0x130
 ? kthread_park+0x90/0x90
   ret_from_fork+0x1f/0x30
 ---[ end trace 51ccabea504edaff ]---
 RIP: 0010:mlx5e_tx_reporter_dump_sq+0xd3/0x180
 PKRU: 55555554
 Kernel panic - not syncing: Fatal exception
 Kernel Offset: disabled
 end Kernel panic - not syncing: Fatal exception

To fix this bug, add a wrapper for mlx5e_tx_reporter_dump_sq() which
extracts the sq from struct mlx5e_tx_timeout_ctx, and set the wrapper as
the TX-timeout-recovery flow dump callback.
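
The pattern can be sketched in userspace C. This is only an
illustration, not driver code: struct txqsq, struct tx_timeout_ctx,
dump_sq(), timeout_dump() and run_dump() are hypothetical stand-ins for
the mlx5e types and callbacks, reduced to the fields needed here.

```c
/* Hypothetical stand-in for struct mlx5e_txqsq (one field only). */
struct txqsq {
	int sqn;
};

/* Hypothetical stand-in for struct mlx5e_tx_timeout_ctx. */
struct tx_timeout_ctx {
	struct txqsq *sq;
	int status;
};

/* Callback type: the caller hands over an opaque context pointer. */
typedef int (*dump_cb)(void *ctx);

/* Assumes ctx points at a struct txqsq, like the original dump callback. */
static int dump_sq(void *ctx)
{
	struct txqsq *sq = ctx;

	return sq->sqn;
}

/* Wrapper for the timeout flow, where ctx is a struct tx_timeout_ctx. */
static int timeout_dump(void *ctx)
{
	struct tx_timeout_ctx *to_ctx = ctx;

	return dump_sq(to_ctx->sq);
}

/* Generic caller: it cannot check what ctx really points to. */
static int run_dump(dump_cb dump, void *ctx)
{
	return dump(ctx);
}
```

With a struct tx_timeout_ctx whose sq member points at a valid queue,
run_dump(timeout_dump, &to_ctx) returns that queue's sqn. Calling
run_dump(dump_sq, &to_ctx) instead would reinterpret the wrapper's
bytes as a queue, which is the mismatch this patch removes.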

Fixes: 5f29458b77d5 ("net/mlx5e: Support dump callback in TX reporter")
Signed-off-by: Aya Levin <ayal@nvidia.com>
Signed-off-by: Amir Tzin <amirtz@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 .../net/ethernet/mellanox/mlx5/core/en/reporter_tx.c   | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_tx.c b/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_tx.c
index bb682fd751c98..8024599994642 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_tx.c
@@ -463,6 +463,14 @@ static int mlx5e_tx_reporter_dump_sq(struct mlx5e_priv *priv, struct devlink_fms
 	return mlx5e_health_fmsg_named_obj_nest_end(fmsg);
 }
 
+static int mlx5e_tx_reporter_timeout_dump(struct mlx5e_priv *priv, struct devlink_fmsg *fmsg,
+					  void *ctx)
+{
+	struct mlx5e_tx_timeout_ctx *to_ctx = ctx;
+
+	return mlx5e_tx_reporter_dump_sq(priv, fmsg, to_ctx->sq);
+}
+
 static int mlx5e_tx_reporter_dump_all_sqs(struct mlx5e_priv *priv,
 					  struct devlink_fmsg *fmsg)
 {
@@ -558,7 +566,7 @@ int mlx5e_reporter_tx_timeout(struct mlx5e_txqsq *sq)
 	to_ctx.sq = sq;
 	err_ctx.ctx = &to_ctx;
 	err_ctx.recover = mlx5e_tx_reporter_timeout_recover;
-	err_ctx.dump = mlx5e_tx_reporter_dump_sq;
+	err_ctx.dump = mlx5e_tx_reporter_timeout_dump;
 	snprintf(err_str, sizeof(err_str),
 		 "TX timeout on queue: %d, SQ: 0x%x, CQ: 0x%x, SQ Cons: 0x%x SQ Prod: 0x%x, usecs since last trans: %u",
 		 sq->ch_ix, sq->sqn, sq->cq.mcq.cqn, sq->cc, sq->pc,
-- 
2.34.1



