From: Sasha Levin <sashal@kernel.org>
To: stable@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: Alex Vesker <valex@mellanox.com>,
	Leon Romanovsky <leon@kernel.org>,
	Jason Gunthorpe <jgg@mellanox.com>,
	Sasha Levin <sashal@kernel.org>
Subject: [PATCH AUTOSEL 4.9 65/98] IB/ipoib: Fix lockdep issue found on ipoib_ib_dev_heavy_flush
Date: Thu, 25 Oct 2018 10:13:50 -0400
Message-ID: <20181025141423.213774-65-sashal@kernel.org>
In-Reply-To: <20181025141423.213774-1-sashal@kernel.org>
From: Alex Vesker <valex@mellanox.com>
[ Upstream commit 1f80bd6a6cc8358b81194e1f5fc16449947396ec ]
Heavy flush takes vlan_rwsem (LOCK A) and then rtnl (LOCK B). This order
contradicts other flows such as ipoib_open, which run with rtnl held and
then take vlan_rwsem, so the two paths can deadlock.
To prevent this deadlock, heavy flush is now called with RTNL locked and
only then tries to acquire vlan_rwsem.
This deadlock is possible only when there are child interfaces.
[ 140.941758] ======================================================
[ 140.946276] WARNING: possible circular locking dependency detected
[ 140.950950] 4.15.0-rc1+ #9 Tainted: G O
[ 140.954797] ------------------------------------------------------
[ 140.959424] kworker/u32:1/146 is trying to acquire lock:
[ 140.963450] (rtnl_mutex){+.+.}, at: [<ffffffffc083516a>] __ipoib_ib_dev_flush+0x2da/0x4e0 [ib_ipoib]
[ 140.970006]
but task is already holding lock:
[ 140.975141] (&priv->vlan_rwsem){++++}, at: [<ffffffffc0834ee1>] __ipoib_ib_dev_flush+0x51/0x4e0 [ib_ipoib]
[ 140.982105]
which lock already depends on the new lock.
[ 140.990023]
the existing dependency chain (in reverse order) is:
[ 140.998650]
-> #1 (&priv->vlan_rwsem){++++}:
[ 141.005276] down_read+0x4d/0xb0
[ 141.009560] ipoib_open+0xad/0x120 [ib_ipoib]
[ 141.014400] __dev_open+0xcb/0x140
[ 141.017919] __dev_change_flags+0x1a4/0x1e0
[ 141.022133] dev_change_flags+0x23/0x60
[ 141.025695] devinet_ioctl+0x704/0x7d0
[ 141.029156] sock_do_ioctl+0x20/0x50
[ 141.032526] sock_ioctl+0x221/0x300
[ 141.036079] do_vfs_ioctl+0xa6/0x6d0
[ 141.039656] SyS_ioctl+0x74/0x80
[ 141.042811] entry_SYSCALL_64_fastpath+0x1f/0x96
[ 141.046891]
-> #0 (rtnl_mutex){+.+.}:
[ 141.051701] lock_acquire+0xd4/0x220
[ 141.055212] __mutex_lock+0x88/0x970
[ 141.058631] __ipoib_ib_dev_flush+0x2da/0x4e0 [ib_ipoib]
[ 141.063160] __ipoib_ib_dev_flush+0x71/0x4e0 [ib_ipoib]
[ 141.067648] process_one_work+0x1f5/0x610
[ 141.071429] worker_thread+0x4a/0x3f0
[ 141.074890] kthread+0x141/0x180
[ 141.078085] ret_from_fork+0x24/0x30
[ 141.081559]
other info that might help us debug this:
[ 141.088967] Possible unsafe locking scenario:
[ 141.094280]        CPU0                    CPU1
[ 141.097953]        ----                    ----
[ 141.101640]   lock(&priv->vlan_rwsem);
[ 141.104771]                                lock(rtnl_mutex);
[ 141.109207]                                lock(&priv->vlan_rwsem);
[ 141.114032]   lock(rtnl_mutex);
[ 141.116800]
*** DEADLOCK ***
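The ordering rule behind the report can be sketched in plain userspace C. This is only an illustrative model, not driver code: the lock_tracker type, the lock_id ranks and both flush_heavy_*_order() helpers are hypothetical names introduced here, and the "locks" are simple flags rather than real kernel primitives. It assumes the rule seen in the ipoib_open trace above, namely rtnl first and vlan_rwsem second, and counts how often a path takes a lock while a lower-priority one is already held.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical lock ranks: a lower value must be acquired first. */
enum lock_id { LOCK_RTNL = 0, LOCK_VLAN_RWSEM = 1, LOCK_COUNT };

/* Hypothetical per-path tracker; a toy stand-in for what lockdep records. */
struct lock_tracker {
	bool held[LOCK_COUNT];
	int violations;
};

static void tracker_acquire(struct lock_tracker *t, enum lock_id id)
{
	/* Acquiring a lock while a later-ranked one is held inverts the order. */
	for (int i = id + 1; i < LOCK_COUNT; i++)
		if (t->held[i])
			t->violations++;
	t->held[id] = true;
}

static void tracker_release(struct lock_tracker *t, enum lock_id id)
{
	assert(t->held[id]);	/* releasing a lock that is not held is a bug */
	t->held[id] = false;
}

/* Order before the patch: vlan_rwsem first, rtnl taken inside the flush. */
static int flush_heavy_old_order(void)
{
	struct lock_tracker t = {0};

	tracker_acquire(&t, LOCK_VLAN_RWSEM);
	tracker_acquire(&t, LOCK_RTNL);
	tracker_release(&t, LOCK_RTNL);
	tracker_release(&t, LOCK_VLAN_RWSEM);
	return t.violations;
}

/* Order after the patch: rtnl wraps the whole flush, vlan_rwsem inside. */
static int flush_heavy_new_order(void)
{
	struct lock_tracker t = {0};

	tracker_acquire(&t, LOCK_RTNL);
	tracker_acquire(&t, LOCK_VLAN_RWSEM);
	tracker_release(&t, LOCK_VLAN_RWSEM);
	tracker_release(&t, LOCK_RTNL);
	return t.violations;
}
```

In this model flush_heavy_old_order() reports one inversion and flush_heavy_new_order() reports none, which mirrors why moving rtnl_lock() out to the work handler removes the circular dependency.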
Fixes: b4b678b06f6e ("IB/ipoib: Grab rtnl lock on heavy flush when calling ndo_open/stop")
Signed-off-by: Alex Vesker <valex@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
drivers/infiniband/ulp/ipoib/ipoib_ib.c | 7 +++----
1 file changed, 3 insertions(+), 4 deletions(-)
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_ib.c b/drivers/infiniband/ulp/ipoib/ipoib_ib.c
index 34122c96522b..3dd5bf6c6c7a 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_ib.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_ib.c
@@ -1190,13 +1190,10 @@ static void __ipoib_ib_dev_flush(struct ipoib_dev_priv *priv,
 	ipoib_ib_dev_down(dev);
 
 	if (level == IPOIB_FLUSH_HEAVY) {
-		rtnl_lock();
 		if (test_bit(IPOIB_FLAG_INITIALIZED, &priv->flags))
 			ipoib_ib_dev_stop(dev);
 
-		result = ipoib_ib_dev_open(dev);
-		rtnl_unlock();
-		if (result)
+		if (ipoib_ib_dev_open(dev))
 			return;
 
 		if (netif_queue_stopped(dev))
@@ -1236,7 +1233,9 @@ void ipoib_ib_dev_flush_heavy(struct work_struct *work)
 	struct ipoib_dev_priv *priv =
 		container_of(work, struct ipoib_dev_priv, flush_heavy);
 
+	rtnl_lock();
 	__ipoib_ib_dev_flush(priv, IPOIB_FLUSH_HEAVY, 0);
+	rtnl_unlock();
 }
 
 void ipoib_ib_dev_cleanup(struct net_device *dev)
--
2.17.1