linux-kernel.vger.kernel.org archive mirror
* net/ipv4: deadlock in ip_ra_control
@ 2017-03-01 10:44 Dmitry Vyukov
  2017-03-01 17:18 ` Cong Wang
  0 siblings, 1 reply; 9+ messages in thread
From: Dmitry Vyukov @ 2017-03-01 10:44 UTC (permalink / raw)
  To: David Miller, Alexey Kuznetsov, James Morris, Hideaki YOSHIFUJI,
	Patrick McHardy, Eric Dumazet, Cong Wang, netdev, LKML
  Cc: syzkaller

Hello,

I've got the following deadlock report while running syzkaller fuzzer
on linux-next/51788aebe7cae79cb334ad50641347465fc188fd:

======================================================
[ INFO: possible circular locking dependency detected ]
4.10.0-next-20170301+ #1 Not tainted
-------------------------------------------------------
syz-executor1/3394 is trying to acquire lock:
 (sk_lock-AF_INET){+.+.+.}, at: [<ffffffff838864cc>] lock_sock include/net/sock.h:1460 [inline]
 (sk_lock-AF_INET){+.+.+.}, at: [<ffffffff838864cc>] do_ip_setsockopt.isra.12+0x21c/0x3540 net/ipv4/ip_sockglue.c:652

but task is already holding lock:
 (rtnl_mutex){+.+.+.}, at: [<ffffffff836fbd97>] rtnl_lock+0x17/0x20 net/core/rtnetlink.c:70

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #1 (rtnl_mutex){+.+.+.}:
       validate_chain kernel/locking/lockdep.c:2265 [inline]
       __lock_acquire+0x2149/0x3430 kernel/locking/lockdep.c:3338
       lock_acquire+0x2a1/0x630 kernel/locking/lockdep.c:3753
       __mutex_lock_common kernel/locking/mutex.c:754 [inline]
       __mutex_lock+0x172/0x1730 kernel/locking/mutex.c:891
       mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:906
       rtnl_lock+0x17/0x20 net/core/rtnetlink.c:70
       mrtsock_destruct+0x86/0x2c0 net/ipv4/ipmr.c:1281
       ip_ra_control+0x459/0x600 net/ipv4/ip_sockglue.c:372
       do_ip_setsockopt.isra.12+0x1064/0x3540 net/ipv4/ip_sockglue.c:1161
       ip_setsockopt+0x3a/0xb0 net/ipv4/ip_sockglue.c:1264
       raw_setsockopt+0xb7/0xd0 net/ipv4/raw.c:839
       sock_common_setsockopt+0x95/0xd0 net/core/sock.c:2725
       SYSC_setsockopt net/socket.c:1786 [inline]
       SyS_setsockopt+0x25c/0x390 net/socket.c:1765
       entry_SYSCALL_64_fastpath+0x1f/0xc2

-> #0 (sk_lock-AF_INET){+.+.+.}:
       check_prev_add kernel/locking/lockdep.c:1828 [inline]
       check_prevs_add+0xa8f/0x19f0 kernel/locking/lockdep.c:1938
       validate_chain kernel/locking/lockdep.c:2265 [inline]
       __lock_acquire+0x2149/0x3430 kernel/locking/lockdep.c:3338
       lock_acquire+0x2a1/0x630 kernel/locking/lockdep.c:3753
       lock_sock_nested+0xcb/0x120 net/core/sock.c:2530
       lock_sock include/net/sock.h:1460 [inline]
       do_ip_setsockopt.isra.12+0x21c/0x3540 net/ipv4/ip_sockglue.c:652
       ip_setsockopt+0x3a/0xb0 net/ipv4/ip_sockglue.c:1264
       tcp_setsockopt+0x82/0xd0 net/ipv4/tcp.c:2721
       sock_common_setsockopt+0x95/0xd0 net/core/sock.c:2725
       SYSC_setsockopt net/socket.c:1786 [inline]
       SyS_setsockopt+0x25c/0x390 net/socket.c:1765
       entry_SYSCALL_64_fastpath+0x1f/0xc2

other info that might help us debug this:

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(rtnl_mutex);
                               lock(sk_lock-AF_INET);
                               lock(rtnl_mutex);
  lock(sk_lock-AF_INET);

 *** DEADLOCK ***

1 lock held by syz-executor1/3394:
 #0:  (rtnl_mutex){+.+.+.}, at: [<ffffffff836fbd97>] rtnl_lock+0x17/0x20 net/core/rtnetlink.c:70

stack backtrace:
CPU: 0 PID: 3394 Comm: syz-executor1 Not tainted 4.10.0-next-20170301+ #1
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:15 [inline]
 dump_stack+0x2ee/0x3ef lib/dump_stack.c:51
 print_circular_bug+0x307/0x3b0 kernel/locking/lockdep.c:1202
 check_prev_add kernel/locking/lockdep.c:1828 [inline]
 check_prevs_add+0xa8f/0x19f0 kernel/locking/lockdep.c:1938
 validate_chain kernel/locking/lockdep.c:2265 [inline]
 __lock_acquire+0x2149/0x3430 kernel/locking/lockdep.c:3338
 lock_acquire+0x2a1/0x630 kernel/locking/lockdep.c:3753
 lock_sock_nested+0xcb/0x120 net/core/sock.c:2530
 lock_sock include/net/sock.h:1460 [inline]
 do_ip_setsockopt.isra.12+0x21c/0x3540 net/ipv4/ip_sockglue.c:652
 ip_setsockopt+0x3a/0xb0 net/ipv4/ip_sockglue.c:1264
 tcp_setsockopt+0x82/0xd0 net/ipv4/tcp.c:2721
 sock_common_setsockopt+0x95/0xd0 net/core/sock.c:2725
 SYSC_setsockopt net/socket.c:1786 [inline]
 SyS_setsockopt+0x25c/0x390 net/socket.c:1765
 entry_SYSCALL_64_fastpath+0x1f/0xc2
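
Read together, the two chains above say that one setsockopt path takes rtnl_mutex while already holding the socket lock (#1, via mrtsock_destruct), and another takes the socket lock while already holding rtnl_mutex (#0). The following is a toy userspace model of that kind of dependency tracking, not lockdep itself: the enum values, struct, and function names are all hypothetical, and it only records "B was acquired while A was held" edges and refuses an edge that would close a cycle.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical lock classes standing in for the two locks in the report. */
enum lock_class { LC_RTNL, LC_SK, LC_MAX };

/* dep[a][b] is true once some path has acquired b while holding a. */
struct dep_graph {
	bool dep[LC_MAX][LC_MAX];
};

static void dep_graph_init(struct dep_graph *g)
{
	memset(g, 0, sizeof(*g));
}

/* Depth-first search: is `to` reachable from `from` over recorded edges? */
static bool dep_reachable(const struct dep_graph *g, int from, int to,
			  bool *seen)
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int next = 0; next < LC_MAX; next++) {
		if (g->dep[from][next] && !seen[next] &&
		    dep_reachable(g, next, to, seen))
			return true;
	}
	return false;
}

/*
 * Record "acquire `next` while holding `held`".
 * Returns false, without recording, if `held` is already reachable
 * from `next`, i.e. the new edge would create a circular dependency.
 */
static bool dep_record(struct dep_graph *g, enum lock_class held,
		       enum lock_class next)
{
	bool seen[LC_MAX] = { false };

	assert(held < LC_MAX && next < LC_MAX);
	if (dep_reachable(g, next, held, seen))
		return false;
	g->dep[held][next] = true;
	return true;
}
```

In this model, recording the socket-lock-then-rtnl edge first and then attempting the rtnl-then-socket-lock edge makes the second dep_record() call return false, which corresponds to the point where the report above is printed. No thread has to actually deadlock for the inversion to be noticed.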


* Re: net/ipv4: deadlock in ip_ra_control
  2017-03-01 10:44 net/ipv4: deadlock in ip_ra_control Dmitry Vyukov
@ 2017-03-01 17:18 ` Cong Wang
  2017-03-02  9:40   ` Dmitry Vyukov
  0 siblings, 1 reply; 9+ messages in thread
From: Cong Wang @ 2017-03-01 17:18 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: David Miller, Alexey Kuznetsov, James Morris, Hideaki YOSHIFUJI,
	Patrick McHardy, Eric Dumazet, netdev, LKML, syzkaller

[-- Attachment #1: Type: text/plain, Size: 2869 bytes --]

On Wed, Mar 1, 2017 at 2:44 AM, Dmitry Vyukov <dvyukov@google.com> wrote:
> Hello,
>
> I've got the following deadlock report while running syzkaller fuzzer
> on linux-next/51788aebe7cae79cb334ad50641347465fc188fd:
>

Please try the attached patch (compile only).

Thanks.

[-- Attachment #2: ip-router-alert.diff --]
[-- Type: text/plain, Size: 1627 bytes --]

diff --git a/net/ipv4/ip_sockglue.c b/net/ipv4/ip_sockglue.c
index ebd953b..bda318a 100644
--- a/net/ipv4/ip_sockglue.c
+++ b/net/ipv4/ip_sockglue.c
@@ -591,6 +591,7 @@ static bool setsockopt_needs_rtnl(int optname)
 	case MCAST_LEAVE_GROUP:
 	case MCAST_LEAVE_SOURCE_GROUP:
 	case MCAST_UNBLOCK_SOURCE:
+	case IP_ROUTER_ALERT:
 		return true;
 	}
 	return false;
diff --git a/net/ipv4/ipmr.c b/net/ipv4/ipmr.c
index beacd02..932321b 100644
--- a/net/ipv4/ipmr.c
+++ b/net/ipv4/ipmr.c
@@ -1278,7 +1278,7 @@ static void mrtsock_destruct(struct sock *sk)
 	struct net *net = sock_net(sk);
 	struct mr_table *mrt;
 
-	rtnl_lock();
+	ASSERT_RTNL();
 	ipmr_for_each_table(mrt, net) {
 		if (sk == rtnl_dereference(mrt->mroute_sk)) {
 			IPV4_DEVCONF_ALL(net, MC_FORWARDING)--;
@@ -1289,7 +1289,6 @@ static void mrtsock_destruct(struct sock *sk)
 			mroute_clean_tables(mrt, false);
 		}
 	}
-	rtnl_unlock();
 }
 
 /* Socket options and virtual interface manipulation. The whole
@@ -1353,13 +1352,8 @@ int ip_mroute_setsockopt(struct sock *sk, int optname, char __user *optval,
 		if (sk != rcu_access_pointer(mrt->mroute_sk)) {
 			ret = -EACCES;
 		} else {
-			/* We need to unlock here because mrtsock_destruct takes
-			 * care of rtnl itself and we can't change that due to
-			 * the IP_ROUTER_ALERT setsockopt which runs without it.
-			 */
-			rtnl_unlock();
 			ret = ip_ra_control(sk, 0, NULL);
-			goto out;
+			goto out_unlock;
 		}
 		break;
 	case MRT_ADD_VIF:
@@ -1470,7 +1464,6 @@ int ip_mroute_setsockopt(struct sock *sk, int optname, char __user *optval,
 	}
 out_unlock:
 	rtnl_unlock();
-out:
 	return ret;
 }
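
As far as the #0 trace in the report shows, options flagged by setsockopt_needs_rtnl() have rtnl_mutex taken before the socket lock, so adding IP_ROUTER_ALERT to that list moves it to the same order. A minimal sketch of the resulting acquisition order, with hypothetical option names and values (not the kernel's) and letters standing in for the real locks:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical option numbers for the model; not the kernel's values. */
enum model_opt {
	MODEL_OPT_TTL,
	MODEL_OPT_MCAST_LEAVE_GROUP,
	MODEL_OPT_ROUTER_ALERT,
};

/* Same shape as a "needs the outer lock" predicate. */
static bool model_needs_outer(enum model_opt opt)
{
	switch (opt) {
	case MODEL_OPT_MCAST_LEAVE_GROUP:
	case MODEL_OPT_ROUTER_ALERT:	/* the case the patch adds */
		return true;
	default:
		return false;
	}
}

/*
 * Writes the acquisition order into `trace` as letters:
 * 'R' = outer (rtnl-like) lock, 'S' = socket lock.
 * `trace` must have room for 3 bytes.
 */
static void model_setsockopt_order(enum model_opt opt, char trace[3])
{
	size_t n = 0;

	assert(trace != NULL);
	if (model_needs_outer(opt))
		trace[n++] = 'R';
	trace[n++] = 'S';
	trace[n] = '\0';
}
```

With the router-alert option in the predicate, the model yields "RS" for it, the same order as the multicast options. An option outside the list yields just "S".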
 


* Re: net/ipv4: deadlock in ip_ra_control
  2017-03-01 17:18 ` Cong Wang
@ 2017-03-02  9:40   ` Dmitry Vyukov
  2017-03-03 18:43     ` Dmitry Vyukov
  0 siblings, 1 reply; 9+ messages in thread
From: Dmitry Vyukov @ 2017-03-02  9:40 UTC (permalink / raw)
  To: Cong Wang
  Cc: David Miller, Alexey Kuznetsov, James Morris, Hideaki YOSHIFUJI,
	Patrick McHardy, Eric Dumazet, netdev, LKML, syzkaller

On Wed, Mar 1, 2017 at 6:18 PM, Cong Wang <xiyou.wangcong@gmail.com> wrote:
> On Wed, Mar 1, 2017 at 2:44 AM, Dmitry Vyukov <dvyukov@google.com> wrote:
>> Hello,
>>
>> I've got the following deadlock report while running syzkaller fuzzer
>> on linux-next/51788aebe7cae79cb334ad50641347465fc188fd:
>>
>
> Please try the attached patch (compile only).


Pushed the patch to the bots.
Thanks


* Re: net/ipv4: deadlock in ip_ra_control
  2017-03-02  9:40   ` Dmitry Vyukov
@ 2017-03-03 18:43     ` Dmitry Vyukov
  2017-03-03 18:45       ` Dmitry Vyukov
  2017-03-06  2:04       ` Cong Wang
  0 siblings, 2 replies; 9+ messages in thread
From: Dmitry Vyukov @ 2017-03-03 18:43 UTC (permalink / raw)
  To: Cong Wang
  Cc: David Miller, Alexey Kuznetsov, James Morris, Hideaki YOSHIFUJI,
	Patrick McHardy, Eric Dumazet, netdev, LKML, syzkaller

On Thu, Mar 2, 2017 at 10:40 AM, Dmitry Vyukov <dvyukov@google.com> wrote:
> On Wed, Mar 1, 2017 at 6:18 PM, Cong Wang <xiyou.wangcong@gmail.com> wrote:
>> On Wed, Mar 1, 2017 at 2:44 AM, Dmitry Vyukov <dvyukov@google.com> wrote:
>>> Hello,
>>>
>>> I've got the following deadlock report while running syzkaller fuzzer
>>> on linux-next/51788aebe7cae79cb334ad50641347465fc188fd:
>>>
>>
>> Please try the attached patch (compile only).
>
>
> Pushed the patch to the bots.
> Thanks


This patch triggers:

[   57.748990] RTNL: assertion failed at net/ipv4/ipmr.c (1236)
[   57.749022] CPU: 1 PID: 5301 Comm: syz-executor2 Not tainted 4.10.0+ #15
[   57.749026] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
[   57.749028] Call Trace:
[   57.749042]  dump_stack+0x2ee/0x3ef
[   57.749219]  mrtsock_destruct+0x27e/0x2f0
[   57.749241]  ip_ra_control+0x459/0x600
[   57.749287]  raw_close+0x19/0x30
[   57.749295]  inet_release+0xed/0x1c0
[   57.749303]  sock_release+0x8d/0x1e0
[   57.749316]  sock_close+0x16/0x20
[   57.749323]  __fput+0x332/0x7f0
[   57.749340]  ____fput+0x15/0x20
[   57.749347]  task_work_run+0x18a/0x260
[   57.749372]  do_exit+0x18ef/0x28b0
[   57.749641]  do_group_exit+0x149/0x420
[   57.749656]  get_signal+0x7e0/0x1820
[   57.749697]  do_signal+0xd2/0x2190
[   57.749746]  exit_to_usermode_loop+0x200/0x2a0
[   57.749758]  syscall_return_slowpath+0x4d3/0x570
[   57.749835]  entry_SYSCALL_64_fastpath+0xc0/0xc2
[   57.749840] RIP: 0033:0x44fb79
[   57.749843] RSP: 002b:00007fbba84d9cf8 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
[   57.749850] RAX: fffffffffffffe00 RBX: 0000000000708218 RCX: 000000000044fb79
[   57.749854] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000708218
[   57.749857] RBP: 00000000007081f8 R08: 0000000000000000 R09: 0000000000000000
[   57.749860] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
[   57.749864] R13: 0000000000a5fc57 R14: 00007fbba84da9c0 R15: 000000000000000c
[   57.749964]
[   57.749966] ===============================
[   57.749967] [ INFO: suspicious RCU usage. ]
[   57.749971] 4.10.0+ #15 Not tainted
[   57.749972] -------------------------------
[   57.749975] net/ipv4/ipmr.c:1238 suspicious rcu_dereference_protected() usage!
[   57.749977]
[   57.749977] other info that might help us debug this:
[   57.749977]
[   57.749980]
[   57.749980] rcu_scheduler_active = 2, debug_locks = 0
[   57.749982] no locks held by syz-executor2/5301.
[   57.749984]
[   57.749984] stack backtrace:
[   57.749989] CPU: 1 PID: 5301 Comm: syz-executor2 Not tainted 4.10.0+ #15
[   57.749993] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
[   57.749995] Call Trace:
[   57.750001]  dump_stack+0x2ee/0x3ef
[   57.750117]  lockdep_rcu_suspicious+0x139/0x180
[   57.750122]  mrtsock_destruct+0x167/0x2f0
[   57.750144]  ip_ra_control+0x459/0x600
[   57.750182]  raw_close+0x19/0x30
[   57.750188]  inet_release+0xed/0x1c0
[   57.750194]  sock_release+0x8d/0x1e0
[   57.750208]  sock_close+0x16/0x20
[   57.750213]  __fput+0x332/0x7f0
[   57.750228]  ____fput+0x15/0x20
[   57.750233]  task_work_run+0x18a/0x260
[   57.750256]  do_exit+0x18ef/0x28b0
[   57.750499]  do_group_exit+0x149/0x420
[   57.750515]  get_signal+0x7e0/0x1820
[   57.750556]  do_signal+0xd2/0x2190
[   57.750604]  exit_to_usermode_loop+0x200/0x2a0
[   57.750616]  syscall_return_slowpath+0x4d3/0x570
[   57.750693]  entry_SYSCALL_64_fastpath+0xc0/0xc2
[   57.750698] RIP: 0033:0x44fb79
[   57.750701] RSP: 002b:00007fbba84d9cf8 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
[   57.750708] RAX: fffffffffffffe00 RBX: 0000000000708218 RCX: 000000000044fb79
[   57.750712] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000708218
[   57.750716] RBP: 00000000007081f8 R08: 0000000000000000 R09: 0000000000000000
[   57.750720] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
[   57.750724] R13: 0000000000a5fc57 R14: 00007fbba84da9c0 R15: 000000000000000c
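
Both traces above reach mrtsock_destruct through raw_close and ip_ra_control during process exit, and the RCU report states that no locks are held. A callee that only asserts a lock, rather than taking it, relies on every caller to take it first. The sketch below models that contract in userspace; every name is a hypothetical stand-in, a plain flag replaces the real mutex, and it is not the kernel code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for rtnl_mutex: just a "held" flag, no real locking. */
struct model_lock {
	bool held;
};

static void model_lock_take(struct model_lock *l)
{
	assert(!l->held);	/* the model does not support recursion */
	l->held = true;
}

static void model_lock_release(struct model_lock *l)
{
	assert(l->held);
	l->held = false;
}

/*
 * Models a callee that asserts the lock instead of taking it.
 * Returns true if the caller honoured the contract, false where a
 * real assertion would have fired.
 */
static bool model_destruct(const struct model_lock *rtnl)
{
	return rtnl->held;
}

/* Models a setsockopt-style caller that takes the lock first. */
static bool model_setsockopt_path(struct model_lock *rtnl)
{
	bool ok;

	model_lock_take(rtnl);
	ok = model_destruct(rtnl);
	model_lock_release(rtnl);
	return ok;
}

/* Models a close-style caller that was not converted and takes nothing. */
static bool model_close_path(struct model_lock *rtnl)
{
	return model_destruct(rtnl);
}
```

In the model, model_setsockopt_path() satisfies the check while model_close_path() does not, which is the shape of the failure in the traces: the conversion has to cover every path into the callee, not only the one that produced the original report.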


* Re: net/ipv4: deadlock in ip_ra_control
  2017-03-03 18:43     ` Dmitry Vyukov
@ 2017-03-03 18:45       ` Dmitry Vyukov
  2017-03-06  2:04       ` Cong Wang
  1 sibling, 0 replies; 9+ messages in thread
From: Dmitry Vyukov @ 2017-03-03 18:45 UTC (permalink / raw)
  To: Cong Wang
  Cc: David Miller, Alexey Kuznetsov, James Morris, Hideaki YOSHIFUJI,
	Patrick McHardy, Eric Dumazet, netdev, LKML, syzkaller

On Fri, Mar 3, 2017 at 7:43 PM, Dmitry Vyukov <dvyukov@google.com> wrote:
> On Thu, Mar 2, 2017 at 10:40 AM, Dmitry Vyukov <dvyukov@google.com> wrote:
>> On Wed, Mar 1, 2017 at 6:18 PM, Cong Wang <xiyou.wangcong@gmail.com> wrote:
>>> On Wed, Mar 1, 2017 at 2:44 AM, Dmitry Vyukov <dvyukov@google.com> wrote:
>>>> Hello,
>>>>
>>>> I've got the following deadlock report while running syzkaller fuzzer
>>>> on linux-next/51788aebe7cae79cb334ad50641347465fc188fd:
>>>>
>>>
>>> Please try the attached patch (compile only).
>>
>>
>> Pushed the patch to the bots.
>> Thanks
>
>
> This patch triggers:
>



Hmm... but only on mmotm
(git://git.kernel.org/pub/scm/linux/kernel/git/mhocko/mm.git,
auto-latest branch).
linux-next and upstream seem to be fine.


* Re: net/ipv4: deadlock in ip_ra_control
  2017-03-03 18:43     ` Dmitry Vyukov
  2017-03-03 18:45       ` Dmitry Vyukov
@ 2017-03-06  2:04       ` Cong Wang
  2017-04-12 12:05         ` Andrey Konovalov
  1 sibling, 1 reply; 9+ messages in thread
From: Cong Wang @ 2017-03-06  2:04 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: David Miller, Alexey Kuznetsov, James Morris, Hideaki YOSHIFUJI,
	Patrick McHardy, Eric Dumazet, netdev, LKML, syzkaller

[-- Attachment #1: Type: text/plain, Size: 3387 bytes --]

On Fri, Mar 3, 2017 at 10:43 AM, Dmitry Vyukov <dvyukov@google.com> wrote:
> On Thu, Mar 2, 2017 at 10:40 AM, Dmitry Vyukov <dvyukov@google.com> wrote:
>> On Wed, Mar 1, 2017 at 6:18 PM, Cong Wang <xiyou.wangcong@gmail.com> wrote:
>>> On Wed, Mar 1, 2017 at 2:44 AM, Dmitry Vyukov <dvyukov@google.com> wrote:
>>>> Hello,
>>>>
>>>> I've got the following deadlock report while running syzkaller fuzzer
>>>> on linux-next/51788aebe7cae79cb334ad50641347465fc188fd:
>>>>
>>>
>>> Please try the attached patch (compile only).
>>
>>
>> Pushed the patch to the bots.
>> Thanks
>
>
> This patch triggers:

Ah, I have updated the patch to fix this.

[-- Attachment #2: ip-router-alert.diff --]
[-- Type: text/plain, Size: 1983 bytes --]

diff --git a/net/ipv4/ip_sockglue.c b/net/ipv4/ip_sockglue.c
index ebd953b..bda318a 100644
--- a/net/ipv4/ip_sockglue.c
+++ b/net/ipv4/ip_sockglue.c
@@ -591,6 +591,7 @@ static bool setsockopt_needs_rtnl(int optname)
 	case MCAST_LEAVE_GROUP:
 	case MCAST_LEAVE_SOURCE_GROUP:
 	case MCAST_UNBLOCK_SOURCE:
+	case IP_ROUTER_ALERT:
 		return true;
 	}
 	return false;
diff --git a/net/ipv4/ipmr.c b/net/ipv4/ipmr.c
index c0317c9..b036e85 100644
--- a/net/ipv4/ipmr.c
+++ b/net/ipv4/ipmr.c
@@ -1278,7 +1278,7 @@ static void mrtsock_destruct(struct sock *sk)
 	struct net *net = sock_net(sk);
 	struct mr_table *mrt;
 
-	rtnl_lock();
+	ASSERT_RTNL();
 	ipmr_for_each_table(mrt, net) {
 		if (sk == rtnl_dereference(mrt->mroute_sk)) {
 			IPV4_DEVCONF_ALL(net, MC_FORWARDING)--;
@@ -1289,7 +1289,6 @@ static void mrtsock_destruct(struct sock *sk)
 			mroute_clean_tables(mrt, false);
 		}
 	}
-	rtnl_unlock();
 }
 
 /* Socket options and virtual interface manipulation. The whole
@@ -1353,13 +1352,8 @@ int ip_mroute_setsockopt(struct sock *sk, int optname, char __user *optval,
 		if (sk != rcu_access_pointer(mrt->mroute_sk)) {
 			ret = -EACCES;
 		} else {
-			/* We need to unlock here because mrtsock_destruct takes
-			 * care of rtnl itself and we can't change that due to
-			 * the IP_ROUTER_ALERT setsockopt which runs without it.
-			 */
-			rtnl_unlock();
 			ret = ip_ra_control(sk, 0, NULL);
-			goto out;
+			goto out_unlock;
 		}
 		break;
 	case MRT_ADD_VIF:
@@ -1470,7 +1464,6 @@ int ip_mroute_setsockopt(struct sock *sk, int optname, char __user *optval,
 	}
 out_unlock:
 	rtnl_unlock();
-out:
 	return ret;
 }
 
diff --git a/net/ipv4/raw.c b/net/ipv4/raw.c
index 8119e1f..9d94397 100644
--- a/net/ipv4/raw.c
+++ b/net/ipv4/raw.c
@@ -682,7 +682,9 @@ static void raw_close(struct sock *sk, long timeout)
 	/*
 	 * Raw sockets may have direct kernel references. Kill them.
 	 */
+	rtnl_lock();
 	ip_ra_control(sk, 0, NULL);
+	rtnl_unlock();
 
 	sk_common_release(sk);
 }
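
The rule this patch establishes, rtnl_mutex first and the socket lock second on every path that can reach ip_ra_control(), can be modelled in user space. What follows is only a sketch with pthread mutexes, not kernel code: every model_* name is hypothetical, and the held flag is a stand-in for what ASSERT_RTNL() checks. Both paths take the two locks in the same order, so running them concurrently cannot produce the inversion lockdep reported.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-ins for rtnl_mutex and one socket's lock. */
static pthread_mutex_t model_rtnl = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t model_sk_lock = PTHREAD_MUTEX_INITIALIZER;
static int model_rtnl_held;       /* written only while model_rtnl is held */
static long model_destruct_calls; /* protected by model_rtnl */

static void model_rtnl_lock(void)
{
  pthread_mutex_lock(&model_rtnl);
  model_rtnl_held = 1;
}

static void model_rtnl_unlock(void)
{
  model_rtnl_held = 0;
  pthread_mutex_unlock(&model_rtnl);
}

/* Like mrtsock_destruct() after the patch: asserts instead of locking. */
static void model_destruct(void)
{
  assert(model_rtnl_held); /* plays the role of ASSERT_RTNL() */
  model_destruct_calls++;
}

/* Like setsockopt(IP_ROUTER_ALERT) after the patch: rtnl, then socket. */
static void model_router_alert(void)
{
  model_rtnl_lock();
  pthread_mutex_lock(&model_sk_lock);
  model_destruct();
  pthread_mutex_unlock(&model_sk_lock);
  model_rtnl_unlock();
}

/* Like a multicast join: the same rtnl-then-socket order. */
static void model_add_membership(void)
{
  model_rtnl_lock();
  pthread_mutex_lock(&model_sk_lock);
  pthread_mutex_unlock(&model_sk_lock);
  model_rtnl_unlock();
}

struct model_job {
  void (*path)(void);
  int iterations;
};

static void *model_worker(void *arg)
{
  struct model_job *job = arg;
  for (int i = 0; i < job->iterations; i++)
    job->path();
  return NULL;
}

/* Runs both paths concurrently; returns how often the destructor ran. */
static long model_run(int iterations)
{
  struct model_job a = { model_router_alert, iterations };
  struct model_job b = { model_add_membership, iterations };
  pthread_t ta, tb;

  model_destruct_calls = 0;
  pthread_create(&ta, NULL, model_worker, &a);
  pthread_create(&tb, NULL, model_worker, &b);
  pthread_join(ta, NULL);
  pthread_join(tb, NULL);
  return model_destruct_calls;
}
```

Calling model_run(1000) from a test program built with -pthread should return once both workers finish. Swapping the two lock calls in model_router_alert() recreates the ordering from the report, and the run may then hang.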


* Re: net/ipv4: deadlock in ip_ra_control
  2017-03-06  2:04       ` Cong Wang
@ 2017-04-12 12:05         ` Andrey Konovalov
  2017-04-12 19:41           ` Cong Wang
  0 siblings, 1 reply; 9+ messages in thread
From: Andrey Konovalov @ 2017-04-12 12:05 UTC (permalink / raw)
  To: Cong Wang
  Cc: Dmitry Vyukov, David Miller, Alexey Kuznetsov, James Morris,
	Hideaki YOSHIFUJI, Patrick McHardy, Eric Dumazet, netdev, LKML,
	syzkaller

[-- Attachment #1: Type: text/plain, Size: 3941 bytes --]

On Mon, Mar 6, 2017 at 3:04 AM, Cong Wang <xiyou.wangcong@gmail.com> wrote:
> On Fri, Mar 3, 2017 at 10:43 AM, Dmitry Vyukov <dvyukov@google.com> wrote:
>> On Thu, Mar 2, 2017 at 10:40 AM, Dmitry Vyukov <dvyukov@google.com> wrote:
>>> On Wed, Mar 1, 2017 at 6:18 PM, Cong Wang <xiyou.wangcong@gmail.com> wrote:
>>>> On Wed, Mar 1, 2017 at 2:44 AM, Dmitry Vyukov <dvyukov@google.com> wrote:
>>>>> Hello,
>>>>>
>>>>> I've got the following deadlock report while running syzkaller fuzzer
>>>>> on linux-next/51788aebe7cae79cb334ad50641347465fc188fd:
>>>>>
>>>>> ======================================================
>>>>> [ INFO: possible circular locking dependency detected ]
>>>>> 4.10.0-next-20170301+ #1 Not tainted
>>>>> -------------------------------------------------------
>>>>> syz-executor1/3394 is trying to acquire lock:
>>>>>  (sk_lock-AF_INET){+.+.+.}, at: [<ffffffff838864cc>] lock_sock
>>>>> include/net/sock.h:1460 [inline]
>>>>>  (sk_lock-AF_INET){+.+.+.}, at: [<ffffffff838864cc>]
>>>>> do_ip_setsockopt.isra.12+0x21c/0x3540 net/ipv4/ip_sockglue.c:652
>>>>>
>>>>> but task is already holding lock:
>>>>>  (rtnl_mutex){+.+.+.}, at: [<ffffffff836fbd97>] rtnl_lock+0x17/0x20
>>>>> net/core/rtnetlink.c:70
>>>>>
>>>>> which lock already depends on the new lock.
>>>>>
>>>>>
>>>>> the existing dependency chain (in reverse order) is:
>>>>>
>>>>> -> #1 (rtnl_mutex){+.+.+.}:
>>>>>        validate_chain kernel/locking/lockdep.c:2265 [inline]
>>>>>        __lock_acquire+0x2149/0x3430 kernel/locking/lockdep.c:3338
>>>>>        lock_acquire+0x2a1/0x630 kernel/locking/lockdep.c:3753
>>>>>        __mutex_lock_common kernel/locking/mutex.c:754 [inline]
>>>>>        __mutex_lock+0x172/0x1730 kernel/locking/mutex.c:891
>>>>>        mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:906
>>>>>        rtnl_lock+0x17/0x20 net/core/rtnetlink.c:70
>>>>>        mrtsock_destruct+0x86/0x2c0 net/ipv4/ipmr.c:1281
>>>>>        ip_ra_control+0x459/0x600 net/ipv4/ip_sockglue.c:372
>>>>>        do_ip_setsockopt.isra.12+0x1064/0x3540 net/ipv4/ip_sockglue.c:1161
>>>>>        ip_setsockopt+0x3a/0xb0 net/ipv4/ip_sockglue.c:1264
>>>>>        raw_setsockopt+0xb7/0xd0 net/ipv4/raw.c:839
>>>>>        sock_common_setsockopt+0x95/0xd0 net/core/sock.c:2725
>>>>>        SYSC_setsockopt net/socket.c:1786 [inline]
>>>>>        SyS_setsockopt+0x25c/0x390 net/socket.c:1765
>>>>>        entry_SYSCALL_64_fastpath+0x1f/0xc2
>>>>>
>>>>> -> #0 (sk_lock-AF_INET){+.+.+.}:
>>>>>        check_prev_add kernel/locking/lockdep.c:1828 [inline]
>>>>>        check_prevs_add+0xa8f/0x19f0 kernel/locking/lockdep.c:1938
>>>>>        validate_chain kernel/locking/lockdep.c:2265 [inline]
>>>>>        __lock_acquire+0x2149/0x3430 kernel/locking/lockdep.c:3338
>>>>>        lock_acquire+0x2a1/0x630 kernel/locking/lockdep.c:3753
>>>>>        lock_sock_nested+0xcb/0x120 net/core/sock.c:2530
>>>>>        lock_sock include/net/sock.h:1460 [inline]
>>>>>        do_ip_setsockopt.isra.12+0x21c/0x3540 net/ipv4/ip_sockglue.c:652
>>>>>        ip_setsockopt+0x3a/0xb0 net/ipv4/ip_sockglue.c:1264
>>>>>        tcp_setsockopt+0x82/0xd0 net/ipv4/tcp.c:2721
>>>>>        sock_common_setsockopt+0x95/0xd0 net/core/sock.c:2725
>>>>>        SYSC_setsockopt net/socket.c:1786 [inline]
>>>>>        SyS_setsockopt+0x25c/0x390 net/socket.c:1765
>>>>>        entry_SYSCALL_64_fastpath+0x1f/0xc2
>>>>>
>>>>
>>>> Please try the attached patch (compile only).
>>>
>>>
>>> Pushed the patch to the bots.
>>> Thanks
>>
>>
>> This patch triggers:
>
> Ah, I have updated the patch to fix this.

Hi Cong,

I now have a reproducer for this bug (attached) and your patch fixes it.

Could you send it?

Thanks!

[-- Attachment #2: ipv4-ra-control-deadlock-poc.c --]
[-- Type: text/x-csrc, Size: 6612 bytes --]

// autogenerated by syzkaller (http://github.com/google/syzkaller)

#ifndef __NR_mmap
#define __NR_mmap 9
#endif
#ifndef __NR_socket
#define __NR_socket 41
#endif
#ifndef __NR_setsockopt
#define __NR_setsockopt 54
#endif

#define _GNU_SOURCE

#include <sys/ioctl.h>
#include <sys/mman.h>
#include <sys/mount.h>
#include <sys/prctl.h>
#include <sys/resource.h>
#include <sys/socket.h>
#include <sys/stat.h>
#include <sys/syscall.h>
#include <sys/time.h>
#include <sys/types.h>
#include <sys/wait.h>

#include <linux/capability.h>
#include <linux/if.h>
#include <linux/if_tun.h>
#include <linux/kvm.h>
#include <linux/sched.h>
#include <net/if_arp.h>

#include <assert.h>
#include <dirent.h>
#include <errno.h>
#include <fcntl.h>
#include <grp.h>
#include <pthread.h>
#include <setjmp.h>
#include <signal.h>
#include <stdarg.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

const int kFailStatus = 67;
const int kErrorStatus = 68;
const int kRetryStatus = 69;

__attribute__((noreturn)) void doexit(int status)
{
  volatile unsigned i;
  syscall(__NR_exit_group, status);
  for (i = 0;; i++) {
  }
}

__attribute__((noreturn)) void fail(const char* msg, ...)
{
  int e = errno;
  fflush(stdout);
  va_list args;
  va_start(args, msg);
  vfprintf(stderr, msg, args);
  va_end(args);
  fprintf(stderr, " (errno %d)\n", e);
  doexit((e == ENOMEM || e == EAGAIN) ? kRetryStatus : kFailStatus);
}

__attribute__((noreturn)) void exitf(const char* msg, ...)
{
  int e = errno;
  fflush(stdout);
  va_list args;
  va_start(args, msg);
  vfprintf(stderr, msg, args);
  va_end(args);
  fprintf(stderr, " (errno %d)\n", e);
  doexit(kRetryStatus);
}

static int flag_debug;

void debug(const char* msg, ...)
{
  if (!flag_debug)
    return;
  va_list args;
  va_start(args, msg);
  vfprintf(stdout, msg, args);
  va_end(args);
  fflush(stdout);
}

__thread int skip_segv;
__thread jmp_buf segv_env;

static void segv_handler(int sig, siginfo_t* info, void* uctx)
{
  uintptr_t addr = (uintptr_t)info->si_addr;
  const uintptr_t prog_start = 1 << 20;
  const uintptr_t prog_end = 100 << 20;
  if (__atomic_load_n(&skip_segv, __ATOMIC_RELAXED) &&
      (addr < prog_start || addr > prog_end)) {
    debug("SIGSEGV on %p, skipping\n", (void*)addr);
    _longjmp(segv_env, 1);
  }
  debug("SIGSEGV on %p, exiting\n", (void*)addr);
  doexit(sig);
  for (;;) {
  }
}

static void install_segv_handler()
{
  struct sigaction sa;
  memset(&sa, 0, sizeof(sa));
  sa.sa_sigaction = segv_handler;
  sa.sa_flags = SA_NODEFER | SA_SIGINFO;
  sigaction(SIGSEGV, &sa, NULL);
  sigaction(SIGBUS, &sa, NULL);
}

#define NONFAILING(...)                                                \
  {                                                                    \
    __atomic_fetch_add(&skip_segv, 1, __ATOMIC_SEQ_CST);               \
    if (_setjmp(segv_env) == 0) {                                      \
      __VA_ARGS__;                                                     \
    }                                                                  \
    __atomic_fetch_sub(&skip_segv, 1, __ATOMIC_SEQ_CST);               \
  }

#define BITMASK_LEN(type, bf_len) (type)((1ull << (bf_len)) - 1)

#define BITMASK_LEN_OFF(type, bf_off, bf_len)                          \
  (type)(BITMASK_LEN(type, (bf_len)) << (bf_off))

#define STORE_BY_BITMASK(type, addr, val, bf_off, bf_len)              \
  if ((bf_off) == 0 && (bf_len) == 0) {                                \
    *(type*)(addr) = (type)(val);                                      \
  } else {                                                             \
    type new_val = *(type*)(addr);                                     \
    new_val &= ~BITMASK_LEN_OFF(type, (bf_off), (bf_len));             \
    new_val |= ((type)(val)&BITMASK_LEN(type, (bf_len))) << (bf_off);  \
    *(type*)(addr) = new_val;                                          \
  }

static uintptr_t execute_syscall(int nr, uintptr_t a0, uintptr_t a1,
                                 uintptr_t a2, uintptr_t a3,
                                 uintptr_t a4, uintptr_t a5,
                                 uintptr_t a6, uintptr_t a7,
                                 uintptr_t a8)
{
  switch (nr) {
  default:
    return syscall(nr, a0, a1, a2, a3, a4, a5);
  }
}

static void setup_main_process()
{
  struct sigaction sa;
  memset(&sa, 0, sizeof(sa));
  sa.sa_handler = SIG_IGN;
  syscall(SYS_rt_sigaction, 0x20, &sa, NULL, 8);
  syscall(SYS_rt_sigaction, 0x21, &sa, NULL, 8);
  install_segv_handler();

  char tmpdir_template[] = "./syzkaller.XXXXXX";
  char* tmpdir = mkdtemp(tmpdir_template);
  if (!tmpdir)
    fail("failed to mkdtemp");
  if (chmod(tmpdir, 0777))
    fail("failed to chmod");
  if (chdir(tmpdir))
    fail("failed to chdir");
}

static void loop();

static void sandbox_common()
{
  prctl(PR_SET_PDEATHSIG, SIGKILL, 0, 0, 0);
  setpgrp();
  setsid();

  struct rlimit rlim;
  rlim.rlim_cur = rlim.rlim_max = 128 << 20;
  setrlimit(RLIMIT_AS, &rlim);
  rlim.rlim_cur = rlim.rlim_max = 1 << 20;
  setrlimit(RLIMIT_FSIZE, &rlim);
  rlim.rlim_cur = rlim.rlim_max = 1 << 20;
  setrlimit(RLIMIT_STACK, &rlim);
  rlim.rlim_cur = rlim.rlim_max = 0;
  setrlimit(RLIMIT_CORE, &rlim);

  unshare(CLONE_NEWNS);
  unshare(CLONE_NEWIPC);
  unshare(CLONE_IO);
}

static int do_sandbox_none(int executor_pid, bool enable_tun)
{
  int pid = fork();
  if (pid)
    return pid;

  sandbox_common();

  loop();
  doexit(1);
}

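long r[10];
void loop()
{
  memset(r, -1, sizeof(r));
  // mmap(0x20000000, 0x4000, PROT_READ|PROT_WRITE,
  //      MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0)
  r[0] = execute_syscall(__NR_mmap, 0x20000000ul, 0x4000ul, 0x3ul,
                         0x32ul, 0xfffffffffffffffful, 0x0ul, 0, 0, 0);
  // socket(AF_INET, SOCK_RAW|SOCK_CLOEXEC, IPPROTO_IGMP)
  r[1] = execute_syscall(__NR_socket, 0x2ul, 0x80003ul, 0x2ul, 0, 0, 0,
                         0, 0, 0);
  NONFAILING(*(uint32_t*)0x20f01000 = (uint32_t)0x0);
  // setsockopt(fd, IPPROTO_IP, MRT_INIT (0xc8), optval, 4)
  r[3] = execute_syscall(__NR_setsockopt, r[1], 0x0ul, 0xc8ul,
                         0x20f01000ul, 0x4ul, 0, 0, 0, 0);
  NONFAILING(*(uint32_t*)0x20001ff4 = (uint32_t)0xa2090000);
  NONFAILING(*(uint32_t*)0x20001ff8 = (uint32_t)0x0);
  NONFAILING(*(uint32_t*)0x20001ffc = (uint32_t)0x9);
  // setsockopt(fd, IPPROTO_IP, IP_ADD_MEMBERSHIP (0x23), optval, 12);
  // the 12 bytes written above are read as a struct ip_mreqn.
  r[7] = execute_syscall(__NR_setsockopt, r[1], 0x0ul, 0x23ul,
                         0x20001ff4ul, 0xcul, 0, 0, 0, 0);
  NONFAILING(*(uint32_t*)0x20000000 = (uint32_t)0x0);
  // setsockopt(fd, IPPROTO_IP, IP_ROUTER_ALERT (0x5), &zero, 4);
  // this is the call that reaches ip_ra_control() in the lockdep report.
  r[9] = execute_syscall(__NR_setsockopt, r[1], 0x0ul, 0x5ul,
                         0x20000000ul, 0x4ul, 0, 0, 0, 0);
}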
int main()
{
  setup_main_process();
  int pid = do_sandbox_none(0, false);
  int status = 0;
  while (waitpid(pid, &status, __WALL) != pid) {
  }
  return 0;
}


* Re: net/ipv4: deadlock in ip_ra_control
  2017-04-12 12:05         ` Andrey Konovalov
@ 2017-04-12 19:41           ` Cong Wang
  2017-04-13 11:58             ` Andrey Konovalov
  0 siblings, 1 reply; 9+ messages in thread
From: Cong Wang @ 2017-04-12 19:41 UTC (permalink / raw)
  To: Andrey Konovalov
  Cc: Dmitry Vyukov, David Miller, Alexey Kuznetsov, James Morris,
	Hideaki YOSHIFUJI, Patrick McHardy, Eric Dumazet, netdev, LKML,
	syzkaller

On Wed, Apr 12, 2017 at 5:05 AM, Andrey Konovalov <andreyknvl@google.com> wrote:
> Hi Cong,
>
> I now have a reproducer for this bug (attached) and your patch fixes it.
>
> Could you send it?
>

Done. I verified it with your reproducer too.

Thanks!


* Re: net/ipv4: deadlock in ip_ra_control
  2017-04-12 19:41           ` Cong Wang
@ 2017-04-13 11:58             ` Andrey Konovalov
  0 siblings, 0 replies; 9+ messages in thread
From: Andrey Konovalov @ 2017-04-13 11:58 UTC (permalink / raw)
  To: Cong Wang
  Cc: Dmitry Vyukov, David Miller, Alexey Kuznetsov, James Morris,
	Hideaki YOSHIFUJI, Patrick McHardy, Eric Dumazet, netdev, LKML,
	syzkaller

On Wed, Apr 12, 2017 at 9:41 PM, Cong Wang <xiyou.wangcong@gmail.com> wrote:
> On Wed, Apr 12, 2017 at 5:05 AM, Andrey Konovalov <andreyknvl@google.com> wrote:
>> Hi Cong,
>>
>> I now have a reproducer for this bug (attached) and your patch fixes it.
>>
>> Could you send it?
>>
>
> Done. I verified it with your reproducer too.
>
> Thanks!

Great, thanks!


Thread overview: 9+ messages
2017-03-01 10:44 net/ipv4: deadlock in ip_ra_control Dmitry Vyukov
2017-03-01 17:18 ` Cong Wang
2017-03-02  9:40   ` Dmitry Vyukov
2017-03-03 18:43     ` Dmitry Vyukov
2017-03-03 18:45       ` Dmitry Vyukov
2017-03-06  2:04       ` Cong Wang
2017-04-12 12:05         ` Andrey Konovalov
2017-04-12 19:41           ` Cong Wang
2017-04-13 11:58             ` Andrey Konovalov
