linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Moshe Shemesh <moshe@mellanox.com>,
	Eran Ben Elisha <eranbe@mellanox.com>,
	Saeed Mahameed <saeedm@mellanox.com>
Subject: [PATCH 4.14 03/77] net/mlx5: Add command entry handling completion
Date: Mon,  1 Jun 2020 19:53:08 +0200	[thread overview]
Message-ID: <20200601174017.045494130@linuxfoundation.org> (raw)
In-Reply-To: <20200601174016.396817032@linuxfoundation.org>

From: Moshe Shemesh <moshe@mellanox.com>

[ Upstream commit 17d00e839d3b592da9659c1977d45f85b77f986a ]

When FW response to commands is very slow and all command entries in
use are waiting for completion we can have a race where commands can get
timeout before they get out of the queue and handled. Timeout
completion on uninitialized command will cause releasing command's
buffers before accessing it for initialization and then we will get NULL
pointer exception while trying access it. It may also cause releasing
buffers of another command since we may have timeout completion before
even allocating entry index for this command.
Add entry handling completion to avoid this race.

Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters")
Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/net/ethernet/mellanox/mlx5/core/cmd.c |   14 ++++++++++++++
 include/linux/mlx5/driver.h                   |    1 +
 2 files changed, 15 insertions(+)

--- a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
@@ -804,6 +804,7 @@ static void cmd_work_handler(struct work
 	int alloc_ret;
 	int cmd_mode;
 
+	complete(&ent->handling);
 	sem = ent->page_queue ? &cmd->pages_sem : &cmd->sem;
 	down(sem);
 	if (!ent->page_queue) {
@@ -922,6 +923,11 @@ static int wait_func(struct mlx5_core_de
 	struct mlx5_cmd *cmd = &dev->cmd;
 	int err;
 
+	if (!wait_for_completion_timeout(&ent->handling, timeout) &&
+	    cancel_work_sync(&ent->work)) {
+		ent->ret = -ECANCELED;
+		goto out_err;
+	}
 	if (cmd->mode == CMD_MODE_POLLING || ent->polling) {
 		wait_for_completion(&ent->done);
 	} else if (!wait_for_completion_timeout(&ent->done, timeout)) {
@@ -929,12 +935,17 @@ static int wait_func(struct mlx5_core_de
 		mlx5_cmd_comp_handler(dev, 1UL << ent->idx, true);
 	}
 
+out_err:
 	err = ent->ret;
 
 	if (err == -ETIMEDOUT) {
 		mlx5_core_warn(dev, "%s(0x%x) timeout. Will cause a leak of a command resource\n",
 			       mlx5_command_str(msg_to_opcode(ent->in)),
 			       msg_to_opcode(ent->in));
+	} else if (err == -ECANCELED) {
+		mlx5_core_warn(dev, "%s(0x%x) canceled on out of queue timeout.\n",
+			       mlx5_command_str(msg_to_opcode(ent->in)),
+			       msg_to_opcode(ent->in));
 	}
 	mlx5_core_dbg(dev, "err %d, delivery status %s(%d)\n",
 		      err, deliv_status_to_str(ent->status), ent->status);
@@ -970,6 +981,7 @@ static int mlx5_cmd_invoke(struct mlx5_c
 	ent->token = token;
 	ent->polling = force_polling;
 
+	init_completion(&ent->handling);
 	if (!callback)
 		init_completion(&ent->done);
 
@@ -989,6 +1001,8 @@ static int mlx5_cmd_invoke(struct mlx5_c
 	err = wait_func(dev, ent);
 	if (err == -ETIMEDOUT)
 		goto out;
+	if (err == -ECANCELED)
+		goto out_free;
 
 	ds = ent->ts2 - ent->ts1;
 	op = MLX5_GET(mbox_in, in->first.data, opcode);
--- a/include/linux/mlx5/driver.h
+++ b/include/linux/mlx5/driver.h
@@ -841,6 +841,7 @@ struct mlx5_cmd_work_ent {
 	struct delayed_work	cb_timeout_work;
 	void		       *context;
 	int			idx;
+	struct completion	handling;
 	struct completion	done;
 	struct mlx5_cmd        *cmd;
 	struct work_struct	work;



  parent reply	other threads:[~2020-06-01 18:57 UTC|newest]

Thread overview: 79+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-06-01 17:53 [PATCH 4.14 00/77] 4.14.183-rc1 review Greg Kroah-Hartman
2020-06-01 17:53 ` [PATCH 4.14 01/77] ax25: fix setsockopt(SO_BINDTODEVICE) Greg Kroah-Hartman
2020-06-01 17:53 ` [PATCH 4.14 02/77] net: ipip: fix wrong address family in init error path Greg Kroah-Hartman
2020-06-01 17:53 ` Greg Kroah-Hartman [this message]
2020-06-01 17:53 ` [PATCH 4.14 04/77] net: revert "net: get rid of an signed integer overflow in ip_idents_reserve()" Greg Kroah-Hartman
2020-06-01 17:53 ` [PATCH 4.14 05/77] net sched: fix reporting the first-time use timestamp Greg Kroah-Hartman
2020-06-01 17:53 ` [PATCH 4.14 06/77] r8152: support additional Microsoft Surface Ethernet Adapter variant Greg Kroah-Hartman
2020-06-01 17:53 ` [PATCH 4.14 07/77] sctp: Start shutdown on association restart if in SHUTDOWN-SENT state and socket is closed Greg Kroah-Hartman
2020-06-01 17:53 ` [PATCH 4.14 08/77] net/mlx5e: Update netdev txq on completions during closure Greg Kroah-Hartman
2020-06-01 17:53 ` [PATCH 4.14 09/77] net: qrtr: Fix passing invalid reference to qrtr_local_enqueue() Greg Kroah-Hartman
2020-06-01 17:53 ` [PATCH 4.14 10/77] net: sun: fix missing release regions in cas_init_one() Greg Kroah-Hartman
2020-06-01 17:53 ` [PATCH 4.14 11/77] net/mlx4_core: fix a memory leak bug Greg Kroah-Hartman
2020-06-01 17:53 ` [PATCH 4.14 12/77] ARM: dts: rockchip: fix phy nodename for rk3228-evb Greg Kroah-Hartman
2020-06-01 17:53 ` [PATCH 4.14 13/77] arm64: dts: rockchip: swap interrupts interrupt-names rk3399 gpu node Greg Kroah-Hartman
2020-06-01 17:53 ` [PATCH 4.14 14/77] ARM: dts: rockchip: fix pinctrl sub nodename for spi in rk322x.dtsi Greg Kroah-Hartman
2020-06-01 17:53 ` [PATCH 4.14 15/77] gpio: tegra: mask GPIO IRQs during IRQ shutdown Greg Kroah-Hartman
2020-06-01 17:53 ` [PATCH 4.14 16/77] net: microchip: encx24j600: add missed kthread_stop Greg Kroah-Hartman
2020-06-01 17:53 ` [PATCH 4.14 17/77] gfs2: move privileged user check to gfs2_quota_lock_check Greg Kroah-Hartman
2020-06-01 17:53 ` [PATCH 4.14 18/77] gfs2: dont call quota_unhold if quotas are not locked Greg Kroah-Hartman
2020-06-01 17:53 ` [PATCH 4.14 19/77] cachefiles: Fix race between read_waiter and read_copier involving op->to_do Greg Kroah-Hartman
2020-06-01 17:53 ` [PATCH 4.14 20/77] usb: gadget: legacy: fix redundant initialization warnings Greg Kroah-Hartman
2020-06-01 17:53 ` [PATCH 4.14 21/77] net: freescale: select CONFIG_FIXED_PHY where needed Greg Kroah-Hartman
2020-06-01 17:53 ` [PATCH 4.14 22/77] cifs: Fix null pointer check in cifs_read Greg Kroah-Hartman
2020-06-01 17:53 ` [PATCH 4.14 23/77] samples: bpf: Fix build error Greg Kroah-Hartman
2020-06-01 17:53 ` [PATCH 4.14 24/77] Input: usbtouchscreen - add support for BonXeon TP Greg Kroah-Hartman
2020-06-01 17:53 ` [PATCH 4.14 25/77] Input: i8042 - add ThinkPad S230u to i8042 nomux list Greg Kroah-Hartman
2020-06-01 17:53 ` [PATCH 4.14 26/77] Input: evdev - call input_flush_device() on release(), not flush() Greg Kroah-Hartman
2020-06-01 17:53 ` [PATCH 4.14 27/77] Input: xpad - add custom init packet for Xbox One S controllers Greg Kroah-Hartman
2020-06-01 17:53 ` [PATCH 4.14 28/77] Input: dlink-dir685-touchkeys - fix a typo in driver name Greg Kroah-Hartman
2020-06-01 17:53 ` [PATCH 4.14 29/77] Input: i8042 - add ThinkPad S230u to i8042 reset list Greg Kroah-Hartman
2020-06-01 17:53 ` [PATCH 4.14 30/77] Input: synaptics-rmi4 - really fix attn_data use-after-free Greg Kroah-Hartman
2020-06-01 17:53 ` [PATCH 4.14 31/77] Input: synaptics-rmi4 - fix error return code in rmi_driver_probe() Greg Kroah-Hartman
2020-06-01 17:53 ` [PATCH 4.14 32/77] ARM: 8843/1: use unified assembler in headers Greg Kroah-Hartman
2020-06-01 17:53 ` [PATCH 4.14 33/77] ARM: uaccess: consolidate uaccess asm to asm/uaccess-asm.h Greg Kroah-Hartman
2020-06-01 17:53 ` [PATCH 4.14 34/77] ARM: uaccess: integrate uaccess_save and uaccess_restore Greg Kroah-Hartman
2020-06-01 17:53 ` [PATCH 4.14 35/77] ARM: uaccess: fix DACR mismatch with nested exceptions Greg Kroah-Hartman
2020-06-01 17:53 ` [PATCH 4.14 36/77] gpio: exar: Fix bad handling for ida_simple_get error path Greg Kroah-Hartman
2020-06-01 17:53 ` [PATCH 4.14 37/77] IB/qib: Call kobject_put() when kobject_init_and_add() fails Greg Kroah-Hartman
2020-06-01 17:53 ` [PATCH 4.14 38/77] ARM: dts: imx6q-bx50v3: Add internal switch Greg Kroah-Hartman
2020-06-01 17:53 ` [PATCH 4.14 39/77] ARM: dts/imx6q-bx50v3: Set display interface clock parents Greg Kroah-Hartman
2020-06-01 17:53 ` [PATCH 4.14 40/77] ARM: dts: bcm2835-rpi-zero-w: Fix led polarity Greg Kroah-Hartman
2020-06-01 17:53 ` [PATCH 4.14 41/77] mmc: block: Fix use-after-free issue for rpmb Greg Kroah-Hartman
2020-06-01 17:53 ` [PATCH 4.14 42/77] RDMA/pvrdma: Fix missing pci disable in pvrdma_pci_probe() Greg Kroah-Hartman
2020-06-01 17:53 ` [PATCH 4.14 43/77] ALSA: hwdep: fix a left shifting 1 by 31 UB bug Greg Kroah-Hartman
2020-06-01 17:53 ` [PATCH 4.14 44/77] ALSA: usb-audio: mixer: volume quirk for ESS Technology Asus USB DAC Greg Kroah-Hartman
2020-06-01 17:53 ` [PATCH 4.14 45/77] exec: Always set cap_ambient in cap_bprm_set_creds Greg Kroah-Hartman
2020-06-01 17:53 ` [PATCH 4.14 46/77] ALSA: hda/realtek - Add new codec supported for ALC287 Greg Kroah-Hartman
2020-06-01 17:53 ` [PATCH 4.14 47/77] libceph: ignore pool overlay and cache logic on redirects Greg Kroah-Hartman
2020-06-01 17:53 ` [PATCH 4.14 48/77] mm: remove VM_BUG_ON(PageSlab()) from page_mapcount() Greg Kroah-Hartman
2020-06-01 17:53 ` [PATCH 4.14 49/77] fs/binfmt_elf.c: allocate initialized memory in fill_thread_core_info() Greg Kroah-Hartman
2020-06-01 17:53 ` [PATCH 4.14 50/77] include/asm-generic/topology.h: guard cpumask_of_node() macro argument Greg Kroah-Hartman
2020-06-01 17:53 ` [PATCH 4.14 51/77] iommu: Fix reference count leak in iommu_group_alloc Greg Kroah-Hartman
2020-06-01 17:53 ` [PATCH 4.14 52/77] parisc: Fix kernel panic in mem_init() Greg Kroah-Hartman
2020-06-01 17:53 ` [PATCH 4.14 53/77] mac80211: mesh: fix discovery timer re-arming issue / crash Greg Kroah-Hartman
2020-06-01 17:53 ` [PATCH 4.14 54/77] x86/dma: Fix max PFN arithmetic overflow on 32 bit systems Greg Kroah-Hartman
2020-06-01 17:54 ` [PATCH 4.14 55/77] copy_xstate_to_kernel(): dont leave parts of destination uninitialized Greg Kroah-Hartman
2020-06-01 17:54 ` [PATCH 4.14 56/77] xfrm: allow to accept packets with ipv6 NEXTHDR_HOP in xfrm_input Greg Kroah-Hartman
2020-06-01 17:54 ` [PATCH 4.14 57/77] xfrm: call xfrm_output_gso when inner_protocol is set in xfrm_output Greg Kroah-Hartman
2020-06-01 17:54 ` [PATCH 4.14 58/77] xfrm: fix a warning in xfrm_policy_insert_list Greg Kroah-Hartman
2020-06-01 17:54 ` [PATCH 4.14 59/77] xfrm: fix a NULL-ptr deref in xfrm_local_error Greg Kroah-Hartman
2020-06-01 17:54 ` [PATCH 4.14 60/77] xfrm: fix error in comment Greg Kroah-Hartman
2020-06-01 17:54 ` [PATCH 4.14 61/77] vti4: eliminated some duplicate code Greg Kroah-Hartman
2020-06-01 17:54 ` [PATCH 4.14 62/77] ip_vti: receive ipip packet by calling ip_tunnel_rcv Greg Kroah-Hartman
2020-06-01 17:54 ` [PATCH 4.14 63/77] netfilter: nft_reject_bridge: enable reject with bridge vlan Greg Kroah-Hartman
2020-06-01 17:54 ` [PATCH 4.14 64/77] netfilter: ipset: Fix subcounter update skip Greg Kroah-Hartman
2020-06-01 17:54 ` [PATCH 4.14 65/77] netfilter: nfnetlink_cthelper: unbreak userspace helper support Greg Kroah-Hartman
2020-06-01 17:54 ` [PATCH 4.14 66/77] netfilter: nf_conntrack_pptp: prevent buffer overflows in debug code Greg Kroah-Hartman
2020-06-01 17:54 ` [PATCH 4.14 67/77] esp6: get the right proto for transport mode in esp6_gso_encap Greg Kroah-Hartman
2020-06-01 17:54 ` [PATCH 4.14 68/77] qlcnic: fix missing release in qlcnic_83xx_interrupt_test Greg Kroah-Hartman
2020-06-01 17:54 ` [PATCH 4.14 69/77] bonding: Fix reference count leak in bond_sysfs_slave_add Greg Kroah-Hartman
2020-06-01 17:54 ` [PATCH 4.14 70/77] Revert "Input: i8042 - add ThinkPad S230u to i8042 nomux list" Greg Kroah-Hartman
2020-06-01 17:54 ` [PATCH 4.14 71/77] netfilter: nf_conntrack_pptp: fix compilation warning with W=1 build Greg Kroah-Hartman
2020-06-01 17:54 ` [PATCH 4.14 72/77] mm/vmalloc.c: dont dereference possible NULL pointer in __vunmap() Greg Kroah-Hartman
2020-06-01 17:54 ` [PATCH 4.14 73/77] sc16is7xx: move label err_spi to correct section Greg Kroah-Hartman
2020-06-01 17:54 ` [PATCH 4.14 74/77] rxrpc: Fix transport sockopts to get IPv4 errors on an IPv6 socket Greg Kroah-Hartman
2020-06-01 17:54 ` [PATCH 4.14 75/77] KVM: VMX: check for existence of secondary exec controls before accessing Greg Kroah-Hartman
2020-06-01 17:54 ` [PATCH 4.14 76/77] net: hns: fix unsigned comparison to less than zero Greg Kroah-Hartman
2020-06-01 17:54 ` [PATCH 4.14 77/77] net: hns: Fixes the missing put_device in positive leg for roce reset Greg Kroah-Hartman
2020-06-02  7:34 ` [PATCH 4.14 00/77] 4.14.183-rc1 review Naresh Kamboju

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200601174017.045494130@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=eranbe@mellanox.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=moshe@mellanox.com \
    --cc=saeedm@mellanox.com \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).