All of lore.kernel.org
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Aya Levin <ayal@nvidia.com>,
	Amir Tzin <amirtz@nvidia.com>, Saeed Mahameed <saeedm@nvidia.com>,
	Sasha Levin <sashal@kernel.org>
Subject: [PATCH 5.15 21/73] net/mlx5e: Wrap the tx reporter dump callback to extract the sq
Date: Mon,  3 Jan 2022 15:23:42 +0100	[thread overview]
Message-ID: <20220103142057.602266857@linuxfoundation.org> (raw)
In-Reply-To: <20220103142056.911344037@linuxfoundation.org>

From: Amir Tzin <amirtz@nvidia.com>

[ Upstream commit 918fc3855a6507a200e9cf22c20be852c0982687 ]

Function mlx5e_tx_reporter_dump_sq() casts its void * argument to struct
mlx5e_txqsq *, but in TX-timeout-recovery flow the argument is actually
of type struct mlx5e_tx_timeout_ctx *.

 mlx5_core 0000:08:00.1 enp8s0f1: TX timeout detected
 mlx5_core 0000:08:00.1 enp8s0f1: TX timeout on queue: 1, SQ: 0x11ec, CQ: 0x146d, SQ Cons: 0x0 SQ Prod: 0x1, usecs since last trans: 21565000
 BUG: stack guard page was hit at 0000000093f1a2de (stack is 00000000b66ea0dc..000000004d932dae)
 kernel stack overflow (page fault): 0000 [#1] SMP NOPTI
 CPU: 5 PID: 95 Comm: kworker/u20:1 Tainted: G W OE 5.13.0_mlnx #1
 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
 Workqueue: mlx5e mlx5e_tx_timeout_work [mlx5_core]
 RIP: 0010:mlx5e_tx_reporter_dump_sq+0xd3/0x180
 [mlx5_core]
 Call Trace:
 mlx5e_tx_reporter_dump+0x43/0x1c0 [mlx5_core]
 devlink_health_do_dump.part.91+0x71/0xd0
 devlink_health_report+0x157/0x1b0
 mlx5e_reporter_tx_timeout+0xb9/0xf0 [mlx5_core]
 ? mlx5e_tx_reporter_err_cqe_recover+0x1d0/0x1d0
 [mlx5_core]
 ? mlx5e_health_queue_dump+0xd0/0xd0 [mlx5_core]
 ? update_load_avg+0x19b/0x550
 ? set_next_entity+0x72/0x80
 ? pick_next_task_fair+0x227/0x340
 ? finish_task_switch+0xa2/0x280
   mlx5e_tx_timeout_work+0x83/0xb0 [mlx5_core]
   process_one_work+0x1de/0x3a0
   worker_thread+0x2d/0x3c0
 ? process_one_work+0x3a0/0x3a0
   kthread+0x115/0x130
 ? kthread_park+0x90/0x90
   ret_from_fork+0x1f/0x30
 --[ end trace 51ccabea504edaff ]---
 RIP: 0010:mlx5e_tx_reporter_dump_sq+0xd3/0x180
 PKRU: 55555554
 Kernel panic - not syncing: Fatal exception
 Kernel Offset: disabled
 end Kernel panic - not syncing: Fatal exception

To fix this bug add a wrapper for mlx5e_tx_reporter_dump_sq() which
extracts the sq from struct mlx5e_tx_timeout_ctx and set it as the
TX-timeout-recovery flow dump callback.

Fixes: 5f29458b77d5 ("net/mlx5e: Support dump callback in TX reporter")
Signed-off-by: Aya Levin <ayal@nvidia.com>
Signed-off-by: Amir Tzin <amirtz@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 .../net/ethernet/mellanox/mlx5/core/en/reporter_tx.c   | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_tx.c b/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_tx.c
index bb682fd751c98..8024599994642 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_tx.c
@@ -463,6 +463,14 @@ static int mlx5e_tx_reporter_dump_sq(struct mlx5e_priv *priv, struct devlink_fms
 	return mlx5e_health_fmsg_named_obj_nest_end(fmsg);
 }
 
+static int mlx5e_tx_reporter_timeout_dump(struct mlx5e_priv *priv, struct devlink_fmsg *fmsg,
+					  void *ctx)
+{
+	struct mlx5e_tx_timeout_ctx *to_ctx = ctx;
+
+	return mlx5e_tx_reporter_dump_sq(priv, fmsg, to_ctx->sq);
+}
+
 static int mlx5e_tx_reporter_dump_all_sqs(struct mlx5e_priv *priv,
 					  struct devlink_fmsg *fmsg)
 {
@@ -558,7 +566,7 @@ int mlx5e_reporter_tx_timeout(struct mlx5e_txqsq *sq)
 	to_ctx.sq = sq;
 	err_ctx.ctx = &to_ctx;
 	err_ctx.recover = mlx5e_tx_reporter_timeout_recover;
-	err_ctx.dump = mlx5e_tx_reporter_dump_sq;
+	err_ctx.dump = mlx5e_tx_reporter_timeout_dump;
 	snprintf(err_str, sizeof(err_str),
 		 "TX timeout on queue: %d, SQ: 0x%x, CQ: 0x%x, SQ Cons: 0x%x SQ Prod: 0x%x, usecs since last trans: %u",
 		 sq->ch_ix, sq->sqn, sq->cq.mcq.cqn, sq->cc, sq->pc,
-- 
2.34.1




  parent reply	other threads:[~2022-01-03 14:35 UTC|newest]

Thread overview: 80+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-01-03 14:23 [PATCH 5.15 00/73] 5.15.13-rc1 review Greg Kroah-Hartman
2022-01-03 14:23 ` [PATCH 5.15 01/73] Input: i8042 - add deferred probe support Greg Kroah-Hartman
2022-01-03 14:23 ` [PATCH 5.15 02/73] Input: i8042 - enable deferred probe quirk for ASUS UM325UA Greg Kroah-Hartman
2022-01-03 14:23 ` [PATCH 5.15 03/73] tomoyo: Check exceeded quota early in tomoyo_domain_quota_is_ok() Greg Kroah-Hartman
2022-01-03 14:23 ` [PATCH 5.15 04/73] tomoyo: use hwight16() " Greg Kroah-Hartman
2022-01-03 14:23 ` [PATCH 5.15 05/73] net/sched: Extend qdisc control block with tc control block Greg Kroah-Hartman
2022-01-03 14:23 ` [PATCH 5.15 06/73] parisc: Clear stale IIR value on instruction access rights trap Greg Kroah-Hartman
2022-01-03 14:23 ` [PATCH 5.15 07/73] platform/mellanox: mlxbf-pmc: Fix an IS_ERR() vs NULL bug in mlxbf_pmc_map_counters Greg Kroah-Hartman
2022-01-03 14:23 ` [PATCH 5.15 08/73] platform/x86: apple-gmux: use resource_size() with res Greg Kroah-Hartman
2022-01-03 14:23 ` [PATCH 5.15 09/73] memblock: fix memblock_phys_alloc() section mismatch error Greg Kroah-Hartman
2022-01-03 14:23 ` [PATCH 5.15 10/73] ALSA: hda: intel-sdw-acpi: harden detection of controller Greg Kroah-Hartman
2022-01-03 14:23 ` [PATCH 5.15 11/73] ALSA: hda: intel-sdw-acpi: go through HDAS ACPI at max depth of 2 Greg Kroah-Hartman
2022-01-03 14:23 ` [PATCH 5.15 12/73] recordmcount.pl: fix typo in s390 mcount regex Greg Kroah-Hartman
2022-01-03 14:23 ` [PATCH 5.15 13/73] powerpc/ptdump: Fix DEBUG_WX since generic ptdump conversion Greg Kroah-Hartman
2022-01-03 14:23 ` [PATCH 5.15 14/73] efi: Move efifb_setup_from_dmi() prototype from arch headers Greg Kroah-Hartman
2022-01-03 14:23 ` [PATCH 5.15 15/73] selinux: initialize proto variable in selinux_ip_postroute_compat() Greg Kroah-Hartman
2022-01-03 14:23 ` [PATCH 5.15 16/73] scsi: lpfc: Terminate string in lpfc_debugfs_nvmeio_trc_write() Greg Kroah-Hartman
2022-01-03 14:23 ` [PATCH 5.15 17/73] net/mlx5: DR, Fix NULL vs IS_ERR checking in dr_domain_init_resources Greg Kroah-Hartman
2022-01-03 14:23 ` [PATCH 5.15 18/73] net/mlx5: Fix error print in case of IRQ request failed Greg Kroah-Hartman
2022-01-03 14:23 ` [PATCH 5.15 19/73] net/mlx5: Fix SF health recovery flow Greg Kroah-Hartman
2022-01-03 14:23 ` [PATCH 5.15 20/73] net/mlx5: Fix tc max supported prio for nic mode Greg Kroah-Hartman
2022-01-03 14:23 ` Greg Kroah-Hartman [this message]
2022-01-03 14:23 ` [PATCH 5.15 22/73] net/mlx5e: Fix interoperability between XSK and ICOSQ recovery flow Greg Kroah-Hartman
2022-01-03 14:23 ` [PATCH 5.15 23/73] net/mlx5e: Fix ICOSQ recovery flow for XSK Greg Kroah-Hartman
2022-01-03 14:23 ` [PATCH 5.15 24/73] net/mlx5e: Use tc sample stubs instead of ifdefs in source file Greg Kroah-Hartman
2022-01-03 14:23 ` [PATCH 5.15 25/73] net/mlx5e: Delete forward rule for ct or sample action Greg Kroah-Hartman
2022-01-03 14:23 ` [PATCH 5.15 26/73] udp: using datalen to cap ipv6 udp max gso segments Greg Kroah-Hartman
2022-01-03 14:23 ` [PATCH 5.15 27/73] selftests: Calculate udpgso segment count without header adjustment Greg Kroah-Hartman
2022-01-03 14:23 ` [PATCH 5.15 28/73] net: phy: fixed_phy: Fix NULL vs IS_ERR() checking in __fixed_phy_register Greg Kroah-Hartman
2022-01-03 19:47   ` Florian Fainelli
2022-01-04  7:33     ` Greg Kroah-Hartman
2022-01-03 14:23 ` [PATCH 5.15 29/73] sctp: use call_rcu to free endpoint Greg Kroah-Hartman
2022-01-03 14:23 ` [PATCH 5.15 30/73] net/smc: fix using of uninitialized completions Greg Kroah-Hartman
2022-01-03 14:23 ` [PATCH 5.15 31/73] net: usb: pegasus: Do not drop long Ethernet frames Greg Kroah-Hartman
2022-01-03 14:23 ` [PATCH 5.15 32/73] net: ag71xx: Fix a potential double free in error handling paths Greg Kroah-Hartman
2022-01-03 14:23 ` [PATCH 5.15 33/73] net: lantiq_xrx200: fix statistics of received bytes Greg Kroah-Hartman
2022-01-03 14:23 ` [PATCH 5.15 34/73] NFC: st21nfca: Fix memory leak in device probe and remove Greg Kroah-Hartman
2022-01-03 14:23 ` [PATCH 5.15 35/73] net/smc: dont send CDC/LLC message if link not ready Greg Kroah-Hartman
2022-01-03 14:23 ` [PATCH 5.15 36/73] net/smc: fix kernel panic caused by race of smc_sock Greg Kroah-Hartman
2022-01-03 14:23 ` [PATCH 5.15 37/73] igc: Do not enable crosstimestamping for i225-V models Greg Kroah-Hartman
2022-01-03 14:23 ` [PATCH 5.15 38/73] igc: Fix TX timestamp support for non-MSI-X platforms Greg Kroah-Hartman
2022-01-03 14:24 ` [PATCH 5.15 39/73] drm/amd/display: Send s0i2_rdy in stream_count == 0 optimization Greg Kroah-Hartman
2022-01-03 14:24 ` [PATCH 5.15 40/73] drm/amd/display: Set optimize_pwr_state for DCN31 Greg Kroah-Hartman
2022-01-03 14:24 ` [PATCH 5.15 41/73] ionic: Initialize the lif->dbid_inuse bitmap Greg Kroah-Hartman
2022-01-03 14:24 ` [PATCH 5.15 42/73] net/mlx5e: Fix wrong features assignment in case of error Greg Kroah-Hartman
2022-01-03 14:24 ` [PATCH 5.15 43/73] net: bridge: mcast: add and enforce query interval minimum Greg Kroah-Hartman
2022-01-03 14:24 ` [PATCH 5.15 44/73] net: bridge: mcast: add and enforce startup " Greg Kroah-Hartman
2022-01-03 14:24 ` [PATCH 5.15 45/73] selftests/net: udpgso_bench_tx: fix dst ip argument Greg Kroah-Hartman
2022-01-03 14:24 ` [PATCH 5.15 46/73] selftests: net: Fix a typo in udpgro_fwd.sh Greg Kroah-Hartman
2022-01-03 14:24 ` [PATCH 5.15 47/73] net: bridge: mcast: fix br_multicast_ctx_vlan_global_disabled helper Greg Kroah-Hartman
2022-01-03 14:24 ` [PATCH 5.15 48/73] net/ncsi: check for error return from call to nla_put_u32 Greg Kroah-Hartman
2022-01-03 14:24 ` [PATCH 5.15 49/73] selftests: net: using ping6 for IPv6 in udpgro_fwd.sh Greg Kroah-Hartman
2022-01-03 14:24 ` [PATCH 5.15 50/73] fsl/fman: Fix missing put_device() call in fman_port_probe Greg Kroah-Hartman
2022-01-03 14:24 ` [PATCH 5.15 51/73] i2c: validate user data in compat ioctl Greg Kroah-Hartman
2022-01-03 14:24 ` [PATCH 5.15 52/73] nfc: uapi: use kernel size_t to fix user-space builds Greg Kroah-Hartman
2022-01-03 14:24 ` [PATCH 5.15 53/73] uapi: fix linux/nfc.h userspace compilation errors Greg Kroah-Hartman
2022-01-03 14:24 ` [PATCH 5.15 54/73] drm/nouveau: wait for the exclusive fence after the shared ones v2 Greg Kroah-Hartman
2022-01-03 14:24 ` [PATCH 5.15 55/73] drm/amdgpu: When the VCN(1.0) block is suspended, powergating is explicitly enabled Greg Kroah-Hartman
2022-01-03 14:24 ` [PATCH 5.15 56/73] drm/amdgpu: add support for IP discovery gc_info table v2 Greg Kroah-Hartman
2022-01-03 14:24 ` [PATCH 5.15 57/73] drm/amd/display: Changed pipe split policy to allow for multi-display pipe split Greg Kroah-Hartman
2022-01-03 14:24 ` [PATCH 5.15 58/73] xhci: Fresco FL1100 controller should not have BROKEN_MSI quirk set Greg Kroah-Hartman
2022-01-03 14:24 ` [PATCH 5.15 59/73] usb: gadget: f_fs: Clear ffs_eventfd in ffs_data_clear Greg Kroah-Hartman
2022-01-03 14:24 ` [PATCH 5.15 60/73] usb: mtu3: add memory barrier before set GPDs HWO Greg Kroah-Hartman
2022-01-03 14:24 ` [PATCH 5.15 61/73] usb: mtu3: fix list_head check warning Greg Kroah-Hartman
2022-01-03 14:24 ` [PATCH 5.15 62/73] usb: mtu3: set interval of FS intr and isoc endpoint Greg Kroah-Hartman
2022-01-03 14:24 ` [PATCH 5.15 63/73] nitro_enclaves: Use get_user_pages_unlocked() call to handle mmap assert Greg Kroah-Hartman
2022-01-03 14:24 ` [PATCH 5.15 64/73] binder: fix async_free_space accounting for empty parcels Greg Kroah-Hartman
2022-01-03 14:24 ` [PATCH 5.15 65/73] scsi: vmw_pvscsi: Set residual data length conditionally Greg Kroah-Hartman
2022-01-03 14:24 ` [PATCH 5.15 66/73] Input: appletouch - initialize work before device registration Greg Kroah-Hartman
2022-01-03 14:24 ` [PATCH 5.15 67/73] Input: spaceball - fix parsing of movement data packets Greg Kroah-Hartman
2022-01-03 14:24 ` [PATCH 5.15 68/73] mm/damon/dbgfs: fix struct pid leaks in dbgfs_target_ids_write() Greg Kroah-Hartman
2022-01-03 14:24 ` [PATCH 5.15 69/73] net: fix use-after-free in tw_timer_handler Greg Kroah-Hartman
2022-01-03 14:24 ` [PATCH 5.15 70/73] fs/mount_setattr: always cleanup mount_kattr Greg Kroah-Hartman
2022-01-03 14:24 ` [PATCH 5.15 71/73] perf intel-pt: Fix parsing of VM time correlation arguments Greg Kroah-Hartman
2022-01-03 14:24 ` [PATCH 5.15 72/73] perf script: Fix CPU filtering of a scripts switch events Greg Kroah-Hartman
2022-01-03 14:24 ` [PATCH 5.15 73/73] perf scripts python: intel-pt-events.py: Fix printing of " Greg Kroah-Hartman
2022-01-04  1:28 ` [PATCH 5.15 00/73] 5.15.13-rc1 review Guenter Roeck
2022-01-04  5:21 ` Naresh Kamboju
2022-01-04  6:28 ` Rudi Heitbaum
2022-01-04  9:53 ` Jon Hunter

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220103142057.602266857@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=amirtz@nvidia.com \
    --cc=ayal@nvidia.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=saeedm@nvidia.com \
    --cc=sashal@kernel.org \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.