From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Mahesh Bandewar <maheshb@google.com>,
	"David S. Miller" <davem@davemloft.net>
Subject: [PATCH 4.4 18/48] bonding: avoid possible dead-lock
Date: Thu, 18 Oct 2018 19:54:53 +0200
Message-ID: <20181018175428.997024376@linuxfoundation.org>
In-Reply-To: <20181018175427.133690306@linuxfoundation.org>

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Mahesh Bandewar <maheshb@google.com>

[ Upstream commit d4859d749aa7090ffb743d15648adb962a1baeae ]

Syzkaller reported this on a slightly older kernel, but it is still
applicable to the current kernel:

======================================================
WARNING: possible circular locking dependency detected
4.18.0-next-20180823+ #46 Not tainted
------------------------------------------------------
syz-executor4/26841 is trying to acquire lock:
00000000dd41ef48 ((wq_completion)bond_dev->name){+.+.}, at: flush_workqueue+0x2db/0x1e10 kernel/workqueue.c:2652

but task is already holding lock:
00000000768ab431 (rtnl_mutex){+.+.}, at: rtnl_lock net/core/rtnetlink.c:77 [inline]
00000000768ab431 (rtnl_mutex){+.+.}, at: rtnetlink_rcv_msg+0x412/0xc30 net/core/rtnetlink.c:4708

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #2 (rtnl_mutex){+.+.}:
       __mutex_lock_common kernel/locking/mutex.c:925 [inline]
       __mutex_lock+0x171/0x1700 kernel/locking/mutex.c:1073
       mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:1088
       rtnl_lock+0x17/0x20 net/core/rtnetlink.c:77
       bond_netdev_notify drivers/net/bonding/bond_main.c:1310 [inline]
       bond_netdev_notify_work+0x44/0xd0 drivers/net/bonding/bond_main.c:1320
       process_one_work+0xc73/0x1aa0 kernel/workqueue.c:2153
       worker_thread+0x189/0x13c0 kernel/workqueue.c:2296
       kthread+0x35a/0x420 kernel/kthread.c:246
       ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:415

-> #1 ((work_completion)(&(&nnw->work)->work)){+.+.}:
       process_one_work+0xc0b/0x1aa0 kernel/workqueue.c:2129
       worker_thread+0x189/0x13c0 kernel/workqueue.c:2296
       kthread+0x35a/0x420 kernel/kthread.c:246
       ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:415

-> #0 ((wq_completion)bond_dev->name){+.+.}:
       lock_acquire+0x1e4/0x4f0 kernel/locking/lockdep.c:3901
       flush_workqueue+0x30a/0x1e10 kernel/workqueue.c:2655
       drain_workqueue+0x2a9/0x640 kernel/workqueue.c:2820
       destroy_workqueue+0xc6/0x9d0 kernel/workqueue.c:4155
       __alloc_workqueue_key+0xef9/0x1190 kernel/workqueue.c:4138
       bond_init+0x269/0x940 drivers/net/bonding/bond_main.c:4734
       register_netdevice+0x337/0x1100 net/core/dev.c:8410
       bond_newlink+0x49/0xa0 drivers/net/bonding/bond_netlink.c:453
       rtnl_newlink+0xef4/0x1d50 net/core/rtnetlink.c:3099
       rtnetlink_rcv_msg+0x46e/0xc30 net/core/rtnetlink.c:4711
       netlink_rcv_skb+0x172/0x440 net/netlink/af_netlink.c:2454
       rtnetlink_rcv+0x1c/0x20 net/core/rtnetlink.c:4729
       netlink_unicast_kernel net/netlink/af_netlink.c:1317 [inline]
       netlink_unicast+0x5a0/0x760 net/netlink/af_netlink.c:1343
       netlink_sendmsg+0xa18/0xfc0 net/netlink/af_netlink.c:1908
       sock_sendmsg_nosec net/socket.c:622 [inline]
       sock_sendmsg+0xd5/0x120 net/socket.c:632
       ___sys_sendmsg+0x7fd/0x930 net/socket.c:2115
       __sys_sendmsg+0x11d/0x290 net/socket.c:2153
       __do_sys_sendmsg net/socket.c:2162 [inline]
       __se_sys_sendmsg net/socket.c:2160 [inline]
       __x64_sys_sendmsg+0x78/0xb0 net/socket.c:2160
       do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
       entry_SYSCALL_64_after_hwframe+0x49/0xbe

other info that might help us debug this:

Chain exists of:
  (wq_completion)bond_dev->name --> (work_completion)(&(&nnw->work)->work) --> rtnl_mutex

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(rtnl_mutex);
                               lock((work_completion)(&(&nnw->work)->work));
                               lock(rtnl_mutex);
  lock((wq_completion)bond_dev->name);

 *** DEADLOCK ***

1 lock held by syz-executor4/26841:

stack backtrace:
CPU: 1 PID: 26841 Comm: syz-executor4 Not tainted 4.18.0-next-20180823+ #46
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x1c9/0x2b4 lib/dump_stack.c:113
 print_circular_bug.isra.34.cold.55+0x1bd/0x27d kernel/locking/lockdep.c:1222
 check_prev_add kernel/locking/lockdep.c:1862 [inline]
 check_prevs_add kernel/locking/lockdep.c:1975 [inline]
 validate_chain kernel/locking/lockdep.c:2416 [inline]
 __lock_acquire+0x3449/0x5020 kernel/locking/lockdep.c:3412
 lock_acquire+0x1e4/0x4f0 kernel/locking/lockdep.c:3901
 flush_workqueue+0x30a/0x1e10 kernel/workqueue.c:2655
 drain_workqueue+0x2a9/0x640 kernel/workqueue.c:2820
 destroy_workqueue+0xc6/0x9d0 kernel/workqueue.c:4155
 __alloc_workqueue_key+0xef9/0x1190 kernel/workqueue.c:4138
 bond_init+0x269/0x940 drivers/net/bonding/bond_main.c:4734
 register_netdevice+0x337/0x1100 net/core/dev.c:8410
 bond_newlink+0x49/0xa0 drivers/net/bonding/bond_netlink.c:453
 rtnl_newlink+0xef4/0x1d50 net/core/rtnetlink.c:3099
 rtnetlink_rcv_msg+0x46e/0xc30 net/core/rtnetlink.c:4711
 netlink_rcv_skb+0x172/0x440 net/netlink/af_netlink.c:2454
 rtnetlink_rcv+0x1c/0x20 net/core/rtnetlink.c:4729
 netlink_unicast_kernel net/netlink/af_netlink.c:1317 [inline]
 netlink_unicast+0x5a0/0x760 net/netlink/af_netlink.c:1343
 netlink_sendmsg+0xa18/0xfc0 net/netlink/af_netlink.c:1908
 sock_sendmsg_nosec net/socket.c:622 [inline]
 sock_sendmsg+0xd5/0x120 net/socket.c:632
 ___sys_sendmsg+0x7fd/0x930 net/socket.c:2115
 __sys_sendmsg+0x11d/0x290 net/socket.c:2153
 __do_sys_sendmsg net/socket.c:2162 [inline]
 __se_sys_sendmsg net/socket.c:2160 [inline]
 __x64_sys_sendmsg+0x78/0xb0 net/socket.c:2160
 do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x457089
Code: fd b4 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 cb b4 fb ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007f2df20a5c78 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00007f2df20a66d4 RCX: 0000000000457089
RDX: 0000000000000000 RSI: 0000000020000180 RDI: 0000000000000003
RBP: 0000000000930140 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff
R13: 00000000004d40b8 R14: 00000000004c8ad8 R15: 0000000000000001

Signed-off-by: Mahesh Bandewar <maheshb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/net/bonding/bond_main.c |   43 +++++++++++++++-------------------------
 include/net/bonding.h           |    7 ------
 2 files changed, 18 insertions(+), 32 deletions(-)

--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -216,6 +216,7 @@ static struct rtnl_link_stats64 *bond_ge
 static void bond_slave_arr_handler(struct work_struct *work);
 static bool bond_time_in_interval(struct bonding *bond, unsigned long last_act,
 				  int mod);
+static void bond_netdev_notify_work(struct work_struct *work);
 
 /*---------------------------- General routines -----------------------------*/
 
@@ -1237,6 +1238,8 @@ static struct slave *bond_alloc_slave(st
 			return NULL;
 		}
 	}
+	INIT_DELAYED_WORK(&slave->notify_work, bond_netdev_notify_work);
+
 	return slave;
 }
 
@@ -1244,6 +1247,7 @@ static void bond_free_slave(struct slave
 {
 	struct bonding *bond = bond_get_bond_by_slave(slave);
 
+	cancel_delayed_work_sync(&slave->notify_work);
 	if (BOND_MODE(bond) == BOND_MODE_8023AD)
 		kfree(SLAVE_AD_INFO(slave));
 
@@ -1265,39 +1269,26 @@ static void bond_fill_ifslave(struct sla
 	info->link_failure_count = slave->link_failure_count;
 }
 
-static void bond_netdev_notify(struct net_device *dev,
-			       struct netdev_bonding_info *info)
-{
-	rtnl_lock();
-	netdev_bonding_info_change(dev, info);
-	rtnl_unlock();
-}
-
 static void bond_netdev_notify_work(struct work_struct *_work)
 {
-	struct netdev_notify_work *w =
-		container_of(_work, struct netdev_notify_work, work.work);
+	struct slave *slave = container_of(_work, struct slave,
+					   notify_work.work);
+
+	if (rtnl_trylock()) {
+		struct netdev_bonding_info binfo;
 
-	bond_netdev_notify(w->dev, &w->bonding_info);
-	dev_put(w->dev);
-	kfree(w);
+		bond_fill_ifslave(slave, &binfo.slave);
+		bond_fill_ifbond(slave->bond, &binfo.master);
+		netdev_bonding_info_change(slave->dev, &binfo);
+		rtnl_unlock();
+	} else {
+		queue_delayed_work(slave->bond->wq, &slave->notify_work, 1);
+	}
 }
 
 void bond_queue_slave_event(struct slave *slave)
 {
-	struct bonding *bond = slave->bond;
-	struct netdev_notify_work *nnw = kzalloc(sizeof(*nnw), GFP_ATOMIC);
-
-	if (!nnw)
-		return;
-
-	dev_hold(slave->dev);
-	nnw->dev = slave->dev;
-	bond_fill_ifslave(slave, &nnw->bonding_info.slave);
-	bond_fill_ifbond(bond, &nnw->bonding_info.master);
-	INIT_DELAYED_WORK(&nnw->work, bond_netdev_notify_work);
-
-	queue_delayed_work(slave->bond->wq, &nnw->work, 0);
+	queue_delayed_work(slave->bond->wq, &slave->notify_work, 0);
 }
 
 /* enslave device <slave> to bond device <master> */
--- a/include/net/bonding.h
+++ b/include/net/bonding.h
@@ -146,12 +146,6 @@ struct bond_parm_tbl {
 	int mode;
 };
 
-struct netdev_notify_work {
-	struct delayed_work	work;
-	struct net_device	*dev;
-	struct netdev_bonding_info bonding_info;
-};
-
 struct slave {
 	struct net_device *dev; /* first - useful for panic debug */
 	struct bonding *bond; /* our master */
@@ -177,6 +171,7 @@ struct slave {
 #ifdef CONFIG_NET_POLL_CONTROLLER
 	struct netpoll *np;
 #endif
+	struct delayed_work notify_work;
 	struct kobject kobj;
 	struct rtnl_link_stats64 slave_stats;
 };
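
The core of the change above is that the notify work no longer blocks on
rtnl_lock(); it calls rtnl_trylock() and re-queues itself when the lock is
busy, so a flush of the bond workqueue by an RTNL holder cannot wait forever
on it. What follows is only a userspace sketch of that try-or-retry pattern,
not kernel code: big_lock stands in for rtnl_mutex, and struct notify_state
and notify_once() are hypothetical names invented for the illustration.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for rtnl_mutex; a plain (non-recursive) mutex. */
static pthread_mutex_t big_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical per-slave bookkeeping, used only to observe the outcome. */
struct notify_state {
	int delivered;	/* notifications sent while holding big_lock */
	int requeued;	/* attempts that backed off because the lock was busy */
};

/*
 * Try to deliver one notification without ever blocking on big_lock.
 * Returns true when the notification was delivered, false when the
 * caller should schedule another attempt later (the patch does this
 * with queue_delayed_work() and a delay of one jiffy).
 */
static bool notify_once(struct notify_state *st)
{
	if (pthread_mutex_trylock(&big_lock) != 0) {
		/* Lock is held elsewhere: do not wait, ask for a retry. */
		st->requeued++;
		return false;
	}

	/* Lock acquired: gather fresh state and notify here. */
	st->delivered++;
	pthread_mutex_unlock(&big_lock);
	return true;
}
```

A caller would loop or reschedule until notify_once() returns true. Because
the function never sleeps on the lock, a thread that holds big_lock and waits
for outstanding work to finish is not blocked by it indefinitely.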



