From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Erez Shitrit <erezsh@mellanox.com>,
	Alex Vesker <valex@mellanox.com>,
	Leon Romanovsky <leon@kernel.org>,
	Jason Gunthorpe <jgg@mellanox.com>,
	Sasha Levin <alexander.levin@microsoft.com>
Subject: [PATCH 4.9 35/56] IB/ipoib: Fix race condition in neigh creation
Date: Fri,  2 Mar 2018 09:51:21 +0100
Message-ID: <20180302084451.354951334@linuxfoundation.org>
In-Reply-To: <20180302084449.568562222@linuxfoundation.org>

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Erez Shitrit <erezsh@mellanox.com>


[ Upstream commit 16ba3defb8bd01a9464ba4820a487f5b196b455b ]

When using enhanced mode for IPoIB, two threads may execute xmit in
parallel to two different TX queues while the target is the same.
In this case, both of them will add the same neighbor to the path's
neigh link list and we might see the following message:

  list_add double add: new=ffff88024767a348, prev=ffff88024767a348...
  WARNING: lib/list_debug.c:31 __list_add_valid+0x4e/0x70
  ipoib_start_xmit+0x477/0x680 [ib_ipoib]
  dev_hard_start_xmit+0xb9/0x3e0
  sch_direct_xmit+0xf9/0x250
  __qdisc_run+0x176/0x5d0
  __dev_queue_xmit+0x1f5/0xb10
  __dev_queue_xmit+0x55/0xb10

Analysis:
Two SKBs are scheduled to be transmitted from two cores.
In ipoib_start_xmit, both get NULL when calling ipoib_neigh_get.
Two calls to neigh_add_path are made. One thread takes the spin-lock
and calls ipoib_neigh_alloc, which creates the neigh structure;
then (after the __path_find) the neigh is added to the path's neigh
link list. When the second thread enters the critical section it also
calls ipoib_neigh_alloc, but in this case it gets the already allocated
ipoib_neigh structure, which is already linked to the path's neigh
link list, and adds it again to the list. Besides triggering the
list_add warning shown above, this creates a loop in the linked list.
This loop leads to an endless loop inside path_rec_completion.
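
For illustration only (not part of the patch): a minimal userspace sketch
of why adding the same node twice corrupts a circular doubly linked list.
The list type and helpers below are simplified stand-ins written for this
example, not the kernel's <linux/list.h>, and the names
list_add_tail_unchecked, list_len_bounded and len_after_adds are
hypothetical. The second add leaves the node's next pointer pointing at
itself, so a forward walk from the head never gets back to the head.

```c
#include <stddef.h>

/* Simplified stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

static void init_list_head(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

/* Tail insertion with no "already linked" check (assumed model). */
static void list_add_tail_unchecked(struct list_head *new_node,
				    struct list_head *head)
{
	struct list_head *prev = head->prev;

	head->prev = new_node;
	new_node->next = head;
	new_node->prev = prev;
	prev->next = new_node;	/* on a double add, prev == new_node */
}

/*
 * Count entries by walking forward from head. Returns -1 if the walk
 * has not come back to head after 'limit' steps, i.e. the list loops.
 */
static int list_len_bounded(const struct list_head *head, int limit)
{
	const struct list_head *pos = head->next;
	int n = 0;

	while (pos != head) {
		if (++n > limit)
			return -1;
		pos = pos->next;
	}
	return n;
}

/* Add one node 'times' times to a fresh list and report its length. */
int len_after_adds(int times)
{
	struct list_head neigh_list, neigh;
	int i;

	init_list_head(&neigh_list);
	init_list_head(&neigh);
	for (i = 0; i < times; i++)
		list_add_tail_unchecked(&neigh, &neigh_list);

	return list_len_bounded(&neigh_list, 1000);
}
```

With this model, len_after_adds(1) reports one entry, while
len_after_adds(2) reports -1 because the walk is cut off by the step
limit instead of terminating.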

Solution:
Check list_empty(&neigh->list) before adding the neigh to the list, so
that it is linked only once.
Add a similar check in "ipoib_multicast.c::ipoib_mcast_send".
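
For illustration only (not part of the patch): a minimal userspace sketch
of the guard, assuming the node's list member is initialised to point at
itself so that an "empty" check tells linked and unlinked nodes apart.
The helpers are simplified stand-ins for <linux/list.h>, and the names
add_neigh_once and guarded_len_after_adds are hypothetical. The sketch is
single-threaded; in the driver the check and the add both run under
priv->lock, which is what makes the check reliable.

```c
#include <stddef.h>

/* Simplified stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

static void init_list_head(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

/* A self-pointing node is not linked into any list. */
static int list_is_empty(const struct list_head *h)
{
	return h->next == h;
}

static void list_add_tail_unchecked(struct list_head *new_node,
				    struct list_head *head)
{
	struct list_head *prev = head->prev;

	head->prev = new_node;
	new_node->next = head;
	new_node->prev = prev;
	prev->next = new_node;
}

/* Link 'neigh' only if it is not linked yet; returns 1 if it was added. */
int add_neigh_once(struct list_head *neigh, struct list_head *head)
{
	if (!list_is_empty(neigh))
		return 0;	/* already on a list, do not add again */

	list_add_tail_unchecked(neigh, head);
	return 1;
}

/* Try to add one node 'times' times and count the resulting entries. */
int guarded_len_after_adds(int times)
{
	struct list_head neigh_list, neigh;
	const struct list_head *pos;
	int i, n = 0;

	init_list_head(&neigh_list);
	init_list_head(&neigh);
	for (i = 0; i < times; i++)
		add_neigh_once(&neigh, &neigh_list);

	/* Bounded walk so a corrupted list cannot hang the example. */
	for (pos = neigh_list.next; pos != &neigh_list; pos = pos->next)
		if (++n > 1000)
			return -1;
	return n;
}
```

Here repeated add attempts leave exactly one entry on the list, and the
walk terminates normally.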

Fixes: b63b70d87741 ('IPoIB: Use a private hash table for path lookup in xmit path')
Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Reviewed-by: Alex Vesker <valex@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/infiniband/ulp/ipoib/ipoib_main.c      |   25 ++++++++++++++++++-------
 drivers/infiniband/ulp/ipoib/ipoib_multicast.c |    5 ++++-
 2 files changed, 22 insertions(+), 8 deletions(-)

--- a/drivers/infiniband/ulp/ipoib/ipoib_main.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c
@@ -919,8 +919,8 @@ static int path_rec_start(struct net_dev
 	return 0;
 }
 
-static void neigh_add_path(struct sk_buff *skb, u8 *daddr,
-			   struct net_device *dev)
+static struct ipoib_neigh *neigh_add_path(struct sk_buff *skb, u8 *daddr,
+					  struct net_device *dev)
 {
 	struct ipoib_dev_priv *priv = netdev_priv(dev);
 	struct ipoib_path *path;
@@ -933,7 +933,15 @@ static void neigh_add_path(struct sk_buf
 		spin_unlock_irqrestore(&priv->lock, flags);
 		++dev->stats.tx_dropped;
 		dev_kfree_skb_any(skb);
-		return;
+		return NULL;
+	}
+
+	/* To avoid race condition, make sure that the
+	 * neigh will be added only once.
+	 */
+	if (unlikely(!list_empty(&neigh->list))) {
+		spin_unlock_irqrestore(&priv->lock, flags);
+		return neigh;
 	}
 
 	path = __path_find(dev, daddr + 4);
@@ -971,7 +979,7 @@ static void neigh_add_path(struct sk_buf
 			spin_unlock_irqrestore(&priv->lock, flags);
 			ipoib_send(dev, skb, path->ah, IPOIB_QPN(daddr));
 			ipoib_neigh_put(neigh);
-			return;
+			return NULL;
 		}
 	} else {
 		neigh->ah  = NULL;
@@ -988,7 +996,7 @@ static void neigh_add_path(struct sk_buf
 
 	spin_unlock_irqrestore(&priv->lock, flags);
 	ipoib_neigh_put(neigh);
-	return;
+	return NULL;
 
 err_path:
 	ipoib_neigh_free(neigh);
@@ -998,6 +1006,8 @@ err_drop:
 
 	spin_unlock_irqrestore(&priv->lock, flags);
 	ipoib_neigh_put(neigh);
+
+	return NULL;
 }
 
 static void unicast_arp_send(struct sk_buff *skb, struct net_device *dev,
@@ -1103,8 +1113,9 @@ static int ipoib_start_xmit(struct sk_bu
 	case htons(ETH_P_TIPC):
 		neigh = ipoib_neigh_get(dev, phdr->hwaddr);
 		if (unlikely(!neigh)) {
-			neigh_add_path(skb, phdr->hwaddr, dev);
-			return NETDEV_TX_OK;
+			neigh = neigh_add_path(skb, phdr->hwaddr, dev);
+			if (likely(!neigh))
+				return NETDEV_TX_OK;
 		}
 		break;
 	case htons(ETH_P_ARP):
--- a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
@@ -818,7 +818,10 @@ void ipoib_mcast_send(struct net_device
 		spin_lock_irqsave(&priv->lock, flags);
 		if (!neigh) {
 			neigh = ipoib_neigh_alloc(daddr, dev);
-			if (neigh) {
+			/* Make sure that the neigh will be added only
+			 * once to mcast list.
+			 */
+			if (neigh && list_empty(&neigh->list)) {
 				kref_get(&mcast->ah->ref);
 				neigh->ah	= mcast->ah;
 				list_add_tail(&neigh->list, &mcast->neigh_list);
