* net/packet: use-after-free in packet_rcv_fanout
@ 2017-02-09 13:14 Dmitry Vyukov
2017-02-09 15:12 ` Sowmini Varadhan
2017-02-10 1:24 ` Cong Wang
0 siblings, 2 replies; 16+ messages in thread
From: Dmitry Vyukov @ 2017-02-09 13:14 UTC
To: David Miller, Willem de Bruijn, Eric Dumazet, Daniel Borkmann,
jarno, sowmini.varadhan, philip.pettersson, weongyo.linux,
netdev, LKML
Cc: syzkaller
Hello,
I've got the following use-after-free report in packet_rcv_fanout
while running syzkaller fuzzer on linux-next
e3e6c5f3544c5d05c6b3b309a34f4f2c3537e993. So far it happened once and
is not reproducible, but maybe the stacks will allow you to figure out
what happens.
BUG: KASAN: use-after-free in __lock_acquire+0x3212/0x3430
kernel/locking/lockdep.c:3224 at addr ffff8801d903d538
Read of size 8 by task syz-executor1/10596
CPU: 1 PID: 10596 Comm: syz-executor1 Not tainted 4.10.0-rc7-next-20170208 #1
Hardware name: Google Google Compute Engine/Google Compute Engine,
BIOS Google 01/01/2011
Call Trace:
__asan_report_load8_noabort+0x29/0x30 mm/kasan/report.c:332
__lock_acquire+0x3212/0x3430 kernel/locking/lockdep.c:3224
lock_acquire+0x2a1/0x630 kernel/locking/lockdep.c:3753
__raw_spin_lock_bh include/linux/spinlock_api_smp.h:135 [inline]
_raw_spin_lock_bh+0x3a/0x50 kernel/locking/spinlock.c:175
spin_lock_bh include/linux/spinlock.h:304 [inline]
packet_rcv_has_room+0x25/0xb0 net/packet/af_packet.c:1308
fanout_demux_rollover+0x3bb/0x6b0 net/packet/af_packet.c:1388
packet_rcv_fanout+0x674/0x800 net/packet/af_packet.c:1490
dev_queue_xmit_nit+0x73a/0xa90 net/core/dev.c:1898
xmit_one net/core/dev.c:2870 [inline]
dev_hard_start_xmit+0x16b/0xab0 net/core/dev.c:2890
__dev_queue_xmit+0x16d1/0x1e60 net/core/dev.c:3355
dev_queue_xmit+0x17/0x20 net/core/dev.c:3388
neigh_hh_output include/net/neighbour.h:468 [inline]
dst_neigh_output include/net/dst.h:452 [inline]
ip6_finish_output2+0x1461/0x2380 net/ipv6/ip6_output.c:123
ip6_finish_output+0x2f9/0x950 net/ipv6/ip6_output.c:149
NF_HOOK_COND include/linux/netfilter.h:246 [inline]
ip6_output+0x1cb/0x8c0 net/ipv6/ip6_output.c:163
ip6_xmit+0xc2f/0x1e80 include/net/dst.h:498
inet6_csk_xmit+0x320/0x5d0 net/ipv6/inet6_connection_sock.c:139
tcp_transmit_skb+0x1ab4/0x3460 net/ipv4/tcp_output.c:1054
tcp_send_syn_data net/ipv4/tcp_output.c:3343 [inline]
tcp_connect+0x11a7/0x2f50 net/ipv4/tcp_output.c:3375
tcp_v6_connect+0x1a6e/0x1f70 net/ipv6/tcp_ipv6.c:295
__inet_stream_connect+0x2d1/0xf80 net/ipv4/af_inet.c:618
tcp_sendmsg_fastopen net/ipv4/tcp.c:1110 [inline]
tcp_sendmsg+0x23ac/0x3bd0 net/ipv4/tcp.c:1133
inet_sendmsg+0x164/0x5b0 net/ipv4/af_inet.c:761
sock_sendmsg_nosec net/socket.c:633 [inline]
sock_sendmsg+0xca/0x110 net/socket.c:643
SYSC_sendto+0x660/0x810 net/socket.c:1685
SyS_sendto+0x40/0x50 net/socket.c:1653
entry_SYSCALL_64_fastpath+0x1f/0xc2
RIP: 0033:0x44fb59
RSP: 002b:00007f4fe6d53b58 EFLAGS: 00000212 ORIG_RAX: 000000000000002c
RAX: ffffffffffffffda RBX: 0000000020475000 RCX: 000000000044fb59
RDX: 0000000000000000 RSI: 0000000020475000 RDI: 0000000000000022
RBP: 0000000000000022 R08: 000000002000afe0 R09: 0000000000000020
R10: c2b66bc2f9e666dc R11: 0000000000000212 R12: 0000000000708000
R13: 0000000000800003 R14: 0000000000181000 R15: 0000000000000000
Object at ffff8801d903d380, in cache kmalloc-2048 size: 2048
Allocated:
PID = 10570
[<ffffffff8362fc61>] kmalloc include/linux/slab.h:497 [inline]
[<ffffffff8362fc61>] sk_prot_alloc+0x101/0x2a0 net/core/sock.c:1338
[<ffffffff8363878c>] sk_alloc+0x8c/0x470 net/core/sock.c:1394
[<ffffffff83c0fb33>] packet_create+0x163/0xb00 net/packet/af_packet.c:3144
[<ffffffff83628264>] __sock_create+0x4e4/0x870 net/socket.c:1197
[<ffffffff83628829>] sock_create net/socket.c:1237 [inline]
[<ffffffff83628829>] SYSC_socket net/socket.c:1267 [inline]
[<ffffffff83628829>] SyS_socket+0xf9/0x230 net/socket.c:1247
Freed:
PID = 10574
[<ffffffff81a36f43>] kfree+0xd3/0x250 mm/slab.c:3827
[<ffffffff83633b6f>] sk_prot_free net/core/sock.c:1377 [inline]
[<ffffffff83633b6f>] __sk_destruct+0x5af/0x6b0 net/core/sock.c:1450
[<ffffffff8363e047>] sk_destruct+0x47/0x80 net/core/sock.c:1458
[<ffffffff8363e0d7>] __sk_free+0x57/0x230 net/core/sock.c:1466
[<ffffffff8363e2d3>] sk_free+0x23/0x30 net/core/sock.c:1477
[<ffffffff83c14bec>] sock_put include/net/sock.h:1644 [inline]
[<ffffffff83c14bec>] packet_release+0x7ac/0x970 net/packet/af_packet.c:2984
[<ffffffff836205fd>] sock_release+0x8d/0x1e0 net/socket.c:597
[<ffffffff83620766>] sock_close+0x16/0x20 net/socket.c:1061
[<ffffffff81a8d412>] __fput+0x332/0x7f0 fs/file_table.c:208
[<ffffffff81a8d955>] ____fput+0x15/0x20 fs/file_table.c:244
[<ffffffff814aae3a>] task_work_run+0x18a/0x260 kernel/task_work.c:116
[<ffffffff814351f6>] exit_task_work include/linux/task_work.h:21 [inline]
[<ffffffff814351f6>] do_exit+0x1956/0x2900 kernel/exit.c:873
[<ffffffff8143ace9>] do_group_exit+0x149/0x420 kernel/exit.c:977
[<ffffffff81469fd0>] get_signal+0x7e0/0x1820 kernel/signal.c:2313
[<ffffffff8126b992>] do_signal+0xd2/0x2190 arch/x86/kernel/signal.c:807
[<ffffffff81007900>] exit_to_usermode_loop+0x200/0x2a0
arch/x86/entry/common.c:156
[<ffffffff81009413>] prepare_exit_to_usermode
arch/x86/entry/common.c:190 [inline]
[<ffffffff81009413>] syscall_return_slowpath+0x4d3/0x570
arch/x86/entry/common.c:259
[<ffffffff844cb522>] entry_SYSCALL_64_fastpath+0xc0/0xc2
* Re: net/packet: use-after-free in packet_rcv_fanout
2017-02-09 13:14 net/packet: use-after-free in packet_rcv_fanout Dmitry Vyukov
@ 2017-02-09 15:12 ` Sowmini Varadhan
2017-02-09 15:17 ` Eric Dumazet
2017-02-10 1:24 ` Cong Wang
1 sibling, 1 reply; 16+ messages in thread
From: Sowmini Varadhan @ 2017-02-09 15:12 UTC
To: Dmitry Vyukov
Cc: David Miller, Willem de Bruijn, Eric Dumazet, Daniel Borkmann,
jarno, philip.pettersson, weongyo.linux, netdev, LKML, syzkaller
On (02/09/17 14:14), Dmitry Vyukov wrote:
>
> Call Trace:
:
> packet_rcv_has_room+0x25/0xb0 net/packet/af_packet.c:1308
> fanout_demux_rollover+0x3bb/0x6b0 net/packet/af_packet.c:1388
> packet_rcv_fanout+0x674/0x800 net/packet/af_packet.c:1490
> dev_queue_xmit_nit+0x73a/0xa90 net/core/dev.c:1898
:
> tcp_sendmsg_fastopen net/ipv4/tcp.c:1110 [inline]
:
looks like a race between a NIT socket (tcpdump, maybe?) that is closing,
and a standard tcp socket.. packet_release() takes the po->bind_lock
to remove the socket from the ptype_all NIT queue. but how does
that sync with the Tx path for other af_inet/af_inet6 sockets?
--Sowmini
* Re: net/packet: use-after-free in packet_rcv_fanout
2017-02-09 15:12 ` Sowmini Varadhan
@ 2017-02-09 15:17 ` Eric Dumazet
0 siblings, 0 replies; 16+ messages in thread
From: Eric Dumazet @ 2017-02-09 15:17 UTC
To: Sowmini Varadhan
Cc: Dmitry Vyukov, David Miller, Willem de Bruijn, Daniel Borkmann,
jarno, Philip Pettersson, weongyo.linux, netdev, LKML, syzkaller
On Thu, Feb 9, 2017 at 7:12 AM, Sowmini Varadhan
<sowmini.varadhan@oracle.com> wrote:
> On (02/09/17 14:14), Dmitry Vyukov wrote:
>>
>> Call Trace:
> :
>> packet_rcv_has_room+0x25/0xb0 net/packet/af_packet.c:1308
>> fanout_demux_rollover+0x3bb/0x6b0 net/packet/af_packet.c:1388
>> packet_rcv_fanout+0x674/0x800 net/packet/af_packet.c:1490
>> dev_queue_xmit_nit+0x73a/0xa90 net/core/dev.c:1898
> :
>> tcp_sendmsg_fastopen net/ipv4/tcp.c:1110 [inline]
> :
>
> looks like a race between a NIT socket (tcpdump, maybe?) that is closing,
> and a standard tcp socket.. packet_release() takes the po->bind_lock
> to remove the socket from the ptype_all NIT queue. but how does
> that sync with the Tx path for other af_inet/af_inet6 sockets?
RCU protection for hooks.
* Re: net/packet: use-after-free in packet_rcv_fanout
2017-02-09 13:14 net/packet: use-after-free in packet_rcv_fanout Dmitry Vyukov
2017-02-09 15:12 ` Sowmini Varadhan
@ 2017-02-10 1:24 ` Cong Wang
2017-02-10 3:19 ` Eric Dumazet
1 sibling, 1 reply; 16+ messages in thread
From: Cong Wang @ 2017-02-10 1:24 UTC
To: Dmitry Vyukov
Cc: David Miller, Willem de Bruijn, Eric Dumazet, Daniel Borkmann,
jarno, Sowmini Varadhan, philip.pettersson, weongyo.linux,
netdev, LKML, syzkaller
On Thu, Feb 9, 2017 at 5:14 AM, Dmitry Vyukov <dvyukov@google.com> wrote:
> Hello,
>
> I've got the following use-after-free report in packet_rcv_fanout
> while running syzkaller fuzzer on linux-next
> e3e6c5f3544c5d05c6b3b309a34f4f2c3537e993. So far it happened once and
> is not reproducible, but maybe the stacks will allow you to figure out
> what happens.
>
> BUG: KASAN: use-after-free in __lock_acquire+0x3212/0x3430
> kernel/locking/lockdep.c:3224 at addr ffff8801d903d538
> Read of size 8 by task syz-executor1/10596
> CPU: 1 PID: 10596 Comm: syz-executor1 Not tainted 4.10.0-rc7-next-20170208 #1
> Hardware name: Google Google Compute Engine/Google Compute Engine,
> BIOS Google 01/01/2011
>
> Call Trace:
> __asan_report_load8_noabort+0x29/0x30 mm/kasan/report.c:332
> __lock_acquire+0x3212/0x3430 kernel/locking/lockdep.c:3224
> lock_acquire+0x2a1/0x630 kernel/locking/lockdep.c:3753
> __raw_spin_lock_bh include/linux/spinlock_api_smp.h:135 [inline]
> _raw_spin_lock_bh+0x3a/0x50 kernel/locking/spinlock.c:175
> spin_lock_bh include/linux/spinlock.h:304 [inline]
> packet_rcv_has_room+0x25/0xb0 net/packet/af_packet.c:1308
> fanout_demux_rollover+0x3bb/0x6b0 net/packet/af_packet.c:1388
> packet_rcv_fanout+0x674/0x800 net/packet/af_packet.c:1490
> dev_queue_xmit_nit+0x73a/0xa90 net/core/dev.c:1898
> xmit_one net/core/dev.c:2870 [inline]
> dev_hard_start_xmit+0x16b/0xab0 net/core/dev.c:2890
> __dev_queue_xmit+0x16d1/0x1e60 net/core/dev.c:3355
> dev_queue_xmit+0x17/0x20 net/core/dev.c:3388
> neigh_hh_output include/net/neighbour.h:468 [inline]
> dst_neigh_output include/net/dst.h:452 [inline]
> ip6_finish_output2+0x1461/0x2380 net/ipv6/ip6_output.c:123
> ip6_finish_output+0x2f9/0x950 net/ipv6/ip6_output.c:149
> NF_HOOK_COND include/linux/netfilter.h:246 [inline]
> ip6_output+0x1cb/0x8c0 net/ipv6/ip6_output.c:163
> ip6_xmit+0xc2f/0x1e80 include/net/dst.h:498
> inet6_csk_xmit+0x320/0x5d0 net/ipv6/inet6_connection_sock.c:139
> tcp_transmit_skb+0x1ab4/0x3460 net/ipv4/tcp_output.c:1054
> tcp_send_syn_data net/ipv4/tcp_output.c:3343 [inline]
> tcp_connect+0x11a7/0x2f50 net/ipv4/tcp_output.c:3375
> tcp_v6_connect+0x1a6e/0x1f70 net/ipv6/tcp_ipv6.c:295
> __inet_stream_connect+0x2d1/0xf80 net/ipv4/af_inet.c:618
> tcp_sendmsg_fastopen net/ipv4/tcp.c:1110 [inline]
> tcp_sendmsg+0x23ac/0x3bd0 net/ipv4/tcp.c:1133
> inet_sendmsg+0x164/0x5b0 net/ipv4/af_inet.c:761
> sock_sendmsg_nosec net/socket.c:633 [inline]
> sock_sendmsg+0xca/0x110 net/socket.c:643
> SYSC_sendto+0x660/0x810 net/socket.c:1685
> SyS_sendto+0x40/0x50 net/socket.c:1653
> entry_SYSCALL_64_fastpath+0x1f/0xc2
It seems in-flight packets could still refer to the struct sock pointer
via f->arr[i]; if so, we need a sync before unlinking it:
diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index d56ee46..8724a98 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -2924,6 +2924,8 @@ static int packet_release(struct socket *sock)
 	sock_prot_inuse_add(net, sk->sk_prot, -1);
 	preempt_enable();
 
+	synchronize_net();
+
 	spin_lock(&po->bind_lock);
 	unregister_prot_hook(sk, false);
 	packet_cached_dev_reset(po);
* Re: net/packet: use-after-free in packet_rcv_fanout
2017-02-10 1:24 ` Cong Wang
@ 2017-02-10 3:19 ` Eric Dumazet
2017-02-10 3:23 ` Eric Dumazet
2017-02-10 3:33 ` Sowmini Varadhan
0 siblings, 2 replies; 16+ messages in thread
From: Eric Dumazet @ 2017-02-10 3:19 UTC
To: Cong Wang
Cc: Dmitry Vyukov, David Miller, Willem de Bruijn, Eric Dumazet,
Daniel Borkmann, jarno, Sowmini Varadhan, philip.pettersson,
weongyo.linux, netdev, LKML, syzkaller
On Thu, 2017-02-09 at 17:24 -0800, Cong Wang wrote:
> On Thu, Feb 9, 2017 at 5:14 AM, Dmitry Vyukov <dvyukov@google.com> wrote:
> > Hello,
> >
> > I've got the following use-after-free report in packet_rcv_fanout
> > while running syzkaller fuzzer on linux-next
> > e3e6c5f3544c5d05c6b3b309a34f4f2c3537e993. So far it happened once and
> > is not reproducible, but maybe the stacks will allow you to figure out
> > what happens.
> >
> > BUG: KASAN: use-after-free in __lock_acquire+0x3212/0x3430
> > kernel/locking/lockdep.c:3224 at addr ffff8801d903d538
> > Read of size 8 by task syz-executor1/10596
> > CPU: 1 PID: 10596 Comm: syz-executor1 Not tainted 4.10.0-rc7-next-20170208 #1
> > Hardware name: Google Google Compute Engine/Google Compute Engine,
> > BIOS Google 01/01/2011
> >
> > Call Trace:
> > __asan_report_load8_noabort+0x29/0x30 mm/kasan/report.c:332
> > __lock_acquire+0x3212/0x3430 kernel/locking/lockdep.c:3224
> > lock_acquire+0x2a1/0x630 kernel/locking/lockdep.c:3753
> > __raw_spin_lock_bh include/linux/spinlock_api_smp.h:135 [inline]
> > _raw_spin_lock_bh+0x3a/0x50 kernel/locking/spinlock.c:175
> > spin_lock_bh include/linux/spinlock.h:304 [inline]
> > packet_rcv_has_room+0x25/0xb0 net/packet/af_packet.c:1308
> > fanout_demux_rollover+0x3bb/0x6b0 net/packet/af_packet.c:1388
> > packet_rcv_fanout+0x674/0x800 net/packet/af_packet.c:1490
> > dev_queue_xmit_nit+0x73a/0xa90 net/core/dev.c:1898
> > xmit_one net/core/dev.c:2870 [inline]
> > dev_hard_start_xmit+0x16b/0xab0 net/core/dev.c:2890
> > __dev_queue_xmit+0x16d1/0x1e60 net/core/dev.c:3355
> > dev_queue_xmit+0x17/0x20 net/core/dev.c:3388
> > neigh_hh_output include/net/neighbour.h:468 [inline]
> > dst_neigh_output include/net/dst.h:452 [inline]
> > ip6_finish_output2+0x1461/0x2380 net/ipv6/ip6_output.c:123
> > ip6_finish_output+0x2f9/0x950 net/ipv6/ip6_output.c:149
> > NF_HOOK_COND include/linux/netfilter.h:246 [inline]
> > ip6_output+0x1cb/0x8c0 net/ipv6/ip6_output.c:163
> > ip6_xmit+0xc2f/0x1e80 include/net/dst.h:498
> > inet6_csk_xmit+0x320/0x5d0 net/ipv6/inet6_connection_sock.c:139
> > tcp_transmit_skb+0x1ab4/0x3460 net/ipv4/tcp_output.c:1054
> > tcp_send_syn_data net/ipv4/tcp_output.c:3343 [inline]
> > tcp_connect+0x11a7/0x2f50 net/ipv4/tcp_output.c:3375
> > tcp_v6_connect+0x1a6e/0x1f70 net/ipv6/tcp_ipv6.c:295
> > __inet_stream_connect+0x2d1/0xf80 net/ipv4/af_inet.c:618
> > tcp_sendmsg_fastopen net/ipv4/tcp.c:1110 [inline]
> > tcp_sendmsg+0x23ac/0x3bd0 net/ipv4/tcp.c:1133
> > inet_sendmsg+0x164/0x5b0 net/ipv4/af_inet.c:761
> > sock_sendmsg_nosec net/socket.c:633 [inline]
> > sock_sendmsg+0xca/0x110 net/socket.c:643
> > SYSC_sendto+0x660/0x810 net/socket.c:1685
> > SyS_sendto+0x40/0x50 net/socket.c:1653
> > entry_SYSCALL_64_fastpath+0x1f/0xc2
>
> It seems in-flight packets could still refer to the struct sock pointer
> via f->arr[i]; if so, we need a sync before unlinking it:
>
> diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
> index d56ee46..8724a98 100644
> --- a/net/packet/af_packet.c
> +++ b/net/packet/af_packet.c
> @@ -2924,6 +2924,8 @@ static int packet_release(struct socket *sock)
> sock_prot_inuse_add(net, sk->sk_prot, -1);
> preempt_enable();
>
> + synchronize_net();
> +
> spin_lock(&po->bind_lock);
> unregister_prot_hook(sk, false);
> packet_cached_dev_reset(po);
More likely the bug is in fanout_add(), with a buggy sequence in error
case, and not correct locking.
kfree(po->rollover);
po->rollover = NULL;
Two cpus entering fanout_add() (using the same af_packet socket,
syzkaller courtesy...) might both see po->fanout being NULL.
Then they grab the mutex. Too late...
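[A minimal userspace sketch of the check-then-lock pattern described above,
not kernel code. It assumes pthreads in place of the kernel mutex;
struct sk_model, fanout_mutex_model, attach_racy() and attach_locked() are
hypothetical names made up for illustration.]

```c
#include <errno.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-socket state touched by fanout_add(). */
struct sk_model {
	void *fanout;   /* non-NULL once the socket has joined a group */
	void *rollover; /* allocated on entry, freed on the error path */
};

static pthread_mutex_t fanout_mutex_model = PTHREAD_MUTEX_INITIALIZER;

/*
 * Racy shape: the "already attached?" test and the allocation run before
 * the mutex is taken, so two callers using the same socket can both pass
 * the test and then both touch sk->rollover.
 */
int attach_racy(struct sk_model *sk, void *group)
{
	if (sk->fanout)
		return -EALREADY;
	sk->rollover = malloc(16);
	if (!sk->rollover)
		return -ENOMEM;

	pthread_mutex_lock(&fanout_mutex_model);
	sk->fanout = group;
	pthread_mutex_unlock(&fanout_mutex_model);
	return 0;
}

/* Safer shape: check, allocation and cleanup all happen under the mutex. */
int attach_locked(struct sk_model *sk, void *group)
{
	int err = 0;

	pthread_mutex_lock(&fanout_mutex_model);
	if (sk->fanout) {
		err = -EALREADY;
		goto out;
	}
	sk->rollover = malloc(16);
	if (!sk->rollover) {
		err = -ENOMEM;
		goto out;
	}
	sk->fanout = group;
out:
	pthread_mutex_unlock(&fanout_mutex_model);
	return err;
}
```

[With attach_locked(), a second call on the same socket always observes the
first call's result and returns -EALREADY instead of racing into the
allocation and error path.]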
* Re: net/packet: use-after-free in packet_rcv_fanout
2017-02-10 3:19 ` Eric Dumazet
@ 2017-02-10 3:23 ` Eric Dumazet
2017-02-10 17:49 ` Cong Wang
2017-02-10 3:33 ` Sowmini Varadhan
1 sibling, 1 reply; 16+ messages in thread
From: Eric Dumazet @ 2017-02-10 3:23 UTC
To: Cong Wang
Cc: Dmitry Vyukov, David Miller, Willem de Bruijn, Eric Dumazet,
Daniel Borkmann, jarno, Sowmini Varadhan, philip.pettersson,
weongyo.linux, netdev, LKML, syzkaller
On Thu, 2017-02-09 at 19:19 -0800, Eric Dumazet wrote:
> More likely the bug is in fanout_add(), with a buggy sequence in error
> case, and not correct locking.
>
> kfree(po->rollover);
> po->rollover = NULL;
>
> Two cpus entering fanout_add() (using the same af_packet socket,
> syzkaller courtesy...) might both see po->fanout being NULL.
>
> Then they grab the mutex. Too late...
Patch could be :
net/packet/af_packet.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index d56ee46b11fc9524e457e5fe8adf10c105a66ab6..11725a350f6953d077f754c10e9f52e48924d780 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -1657,7 +1657,6 @@ static int fanout_add(struct sock *sk, u16 id, u16 type_flags)
 		atomic_long_set(&po->rollover->num_failed, 0);
 	}
 
-	mutex_lock(&fanout_mutex);
 	match = NULL;
 	list_for_each_entry(f, &fanout_list, list) {
 		if (f->id == id &&
@@ -1704,7 +1703,6 @@ static int fanout_add(struct sock *sk, u16 id, u16 type_flags)
 		}
 	}
 out:
-	mutex_unlock(&fanout_mutex);
 	if (err) {
 		kfree(po->rollover);
 		po->rollover = NULL;
@@ -3698,7 +3696,10 @@ packet_setsockopt(struct socket *sock, int level, int optname, char __user *optv
 		if (copy_from_user(&val, optval, sizeof(val)))
 			return -EFAULT;
 
-		return fanout_add(sk, val & 0xffff, val >> 16);
+		mutex_lock(&fanout_mutex);
+		ret = fanout_add(sk, val & 0xffff, val >> 16);
+		mutex_unlock(&fanout_mutex);
+		return ret;
 	}
 	case PACKET_FANOUT_DATA:
 	{
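[For readers unfamiliar with the option value in the last hunk: the int
passed to the fanout setsockopt carries the group id in its low 16 bits and
the type/flags in its high 16 bits, which is what "val & 0xffff" and
"val >> 16" pull apart. A small sketch of that packing follows. The helper
names are hypothetical, and the two constants are assumed to mirror
PACKET_FANOUT_HASH and PACKET_FANOUT_FLAG_ROLLOVER from <linux/if_packet.h>;
they are redefined locally so the sketch builds without kernel headers.]

```c
#include <stdint.h>

/* Assumed to match the uapi values; local copies for illustration only. */
#define FANOUT_HASH_MODEL          0x0000u
#define FANOUT_FLAG_ROLLOVER_MODEL 0x1000u

/* Build the 32-bit option value: id in bits 0..15, type|flags in 16..31. */
uint32_t fanout_arg_pack(uint16_t id, uint16_t type_flags)
{
	return (uint32_t)id | ((uint32_t)type_flags << 16);
}

/* Same split as "val & 0xffff" in the hunk above. */
uint16_t fanout_arg_id(uint32_t val)
{
	return (uint16_t)(val & 0xffff);
}

/* Same split as "val >> 16" in the hunk above. */
uint16_t fanout_arg_type_flags(uint32_t val)
{
	return (uint16_t)(val >> 16);
}
```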
* Re: net/packet: use-after-free in packet_rcv_fanout
2017-02-10 3:19 ` Eric Dumazet
2017-02-10 3:23 ` Eric Dumazet
@ 2017-02-10 3:33 ` Sowmini Varadhan
2017-02-10 4:18 ` Eric Dumazet
2017-02-10 18:00 ` Cong Wang
1 sibling, 2 replies; 16+ messages in thread
From: Sowmini Varadhan @ 2017-02-10 3:33 UTC
To: Eric Dumazet
Cc: Cong Wang, Dmitry Vyukov, David Miller, Willem de Bruijn,
Eric Dumazet, Daniel Borkmann, jarno, philip.pettersson,
weongyo.linux, netdev, LKML, syzkaller
On (02/09/17 19:19), Eric Dumazet wrote:
>
> More likely the bug is in fanout_add(), with a buggy sequence in error
> case, and not correct locking.
>
> kfree(po->rollover);
> po->rollover = NULL;
>
> Two cpus entering fanout_add() (using the same af_packet socket,
> syzkaller courtesy...) might both see po->fanout being NULL.
>
> Then they grab the mutex. Too late...
I'm not sure I follow- aiui the panic was in accessing the
sk_receive_queue.lock in a socket that had been closed earlier. I think
the assumption is that rcu_read_lock_bh in __dev_queue_xmit (and
rcu_read_lock in dev_queue_xmit_nit?) should make sure that the nit
packet delivery can be done safely, and the synchronize_net in
packet_release() makes sure that the Tx paths are quiesced before freeing
the socket. What is the race-hole here? Does it have to do with the
_bh and softirq context, somehow?
--Sowmini
* Re: net/packet: use-after-free in packet_rcv_fanout
2017-02-10 3:33 ` Sowmini Varadhan
@ 2017-02-10 4:18 ` Eric Dumazet
2017-02-10 18:00 ` Cong Wang
1 sibling, 0 replies; 16+ messages in thread
From: Eric Dumazet @ 2017-02-10 4:18 UTC
To: Sowmini Varadhan, Anoob Soman
Cc: Eric Dumazet, Cong Wang, Dmitry Vyukov, David Miller,
Willem de Bruijn, Daniel Borkmann, jarno, Philip Pettersson,
weongyo.linux, netdev, LKML, syzkaller
On Thu, Feb 9, 2017 at 7:33 PM, Sowmini Varadhan
<sowmini.varadhan@oracle.com> wrote:
> On (02/09/17 19:19), Eric Dumazet wrote:
>>
>> More likely the bug is in fanout_add(), with a buggy sequence in error
>> case, and not correct locking.
>>
>> kfree(po->rollover);
>> po->rollover = NULL;
>>
>> Two cpus entering fanout_add() (using the same af_packet socket,
>> syzkaller courtesy...) might both see po->fanout being NULL.
>>
>> Then they grab the mutex. Too late...
>
> I'm not sure I follow- aiui the panic was in accessing the
> sk_receive_queue.lock in a socket that had been closed earlier. I think
> the assumption is that rcu_read_lock_bh in __dev_queue_xmit (and
> rcu_read_lock in dev_queue_xmit_nit?) should make sure that the nit
> packet delivery can be done safely, and the synchronize_net in
> packet_release() makes sure that the Tx paths are quiesced before freeing
> the socket. What is the race-hole here? Does it have to do with the
> _bh and softirq context, somehow?
>
We probably have a dozen bugs to fix in af_packet.c
The race in fanout_add() is one of them.
I do not believe Anoob Soman sent his fixes btw ...
(Look for this thread: http://marc.info/?l=linux-netdev&m=148588680525648&w=2 )
* Re: net/packet: use-after-free in packet_rcv_fanout
2017-02-10 3:23 ` Eric Dumazet
@ 2017-02-10 17:49 ` Cong Wang
2017-02-10 17:59 ` Eric Dumazet
0 siblings, 1 reply; 16+ messages in thread
From: Cong Wang @ 2017-02-10 17:49 UTC
To: Eric Dumazet
Cc: Dmitry Vyukov, David Miller, Willem de Bruijn, Eric Dumazet,
Daniel Borkmann, jarno, Sowmini Varadhan, philip.pettersson,
weongyo.linux, netdev, LKML, syzkaller
On Thu, Feb 9, 2017 at 7:23 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Thu, 2017-02-09 at 19:19 -0800, Eric Dumazet wrote:
>
>> More likely the bug is in fanout_add(), with a buggy sequence in error
>> case, and not correct locking.
>>
>> kfree(po->rollover);
>> po->rollover = NULL;
>>
>> Two cpus entering fanout_add() (using the same af_packet socket,
>> syzkaller courtesy...) might both see po->fanout being NULL.
>>
>> Then they grab the mutex. Too late...
>
> Patch could be :
>
For me, clearly the data structure that use-after-free'd is struct sock
rather than struct packet_rollover.
* Re: net/packet: use-after-free in packet_rcv_fanout
2017-02-10 17:49 ` Cong Wang
@ 2017-02-10 17:59 ` Eric Dumazet
2017-02-10 18:02 ` Cong Wang
2017-02-10 18:02 ` Eric Dumazet
0 siblings, 2 replies; 16+ messages in thread
From: Eric Dumazet @ 2017-02-10 17:59 UTC
To: Cong Wang
Cc: Dmitry Vyukov, David Miller, Willem de Bruijn, Eric Dumazet,
Daniel Borkmann, jarno, Sowmini Varadhan, philip.pettersson,
weongyo.linux, netdev, LKML, syzkaller
On Fri, 2017-02-10 at 09:49 -0800, Cong Wang wrote:
> On Thu, Feb 9, 2017 at 7:23 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> > On Thu, 2017-02-09 at 19:19 -0800, Eric Dumazet wrote:
> >
> >> More likely the bug is in fanout_add(), with a buggy sequence in error
> >> case, and not correct locking.
> >>
> >> kfree(po->rollover);
> >> po->rollover = NULL;
> >>
> >> Two cpus entering fanout_add() (using the same af_packet socket,
> >> syzkaller courtesy...) might both see po->fanout being NULL.
> >>
> >> Then they grab the mutex. Too late...
> >
> > Patch could be :
> >
>
> For me, clearly the data structure that use-after-free'd is struct sock
> rather than struct packet_rollover.
Fine. But your patch makes absolutely no sense.
* Re: net/packet: use-after-free in packet_rcv_fanout
2017-02-10 3:33 ` Sowmini Varadhan
2017-02-10 4:18 ` Eric Dumazet
@ 2017-02-10 18:00 ` Cong Wang
2017-02-10 19:16 ` Sowmini Varadhan
1 sibling, 1 reply; 16+ messages in thread
From: Cong Wang @ 2017-02-10 18:00 UTC
To: Sowmini Varadhan
Cc: Eric Dumazet, Dmitry Vyukov, David Miller, Willem de Bruijn,
Eric Dumazet, Daniel Borkmann, jarno, Philip Pettersson,
weongyo.linux, netdev, LKML, syzkaller
On Thu, Feb 9, 2017 at 7:33 PM, Sowmini Varadhan
<sowmini.varadhan@oracle.com> wrote:
> On (02/09/17 19:19), Eric Dumazet wrote:
>>
>> More likely the bug is in fanout_add(), with a buggy sequence in error
>> case, and not correct locking.
>>
>> kfree(po->rollover);
>> po->rollover = NULL;
>>
>> Two cpus entering fanout_add() (using the same af_packet socket,
>> syzkaller courtesy...) might both see po->fanout being NULL.
>>
>> Then they grab the mutex. Too late...
>
> I'm not sure I follow- aiui the panic was in accessing the
> sk_receive_queue.lock in a socket that had been closed earlier. I think
> the assumption is that rcu_read_lock_bh in __dev_queue_xmit (and
> rcu_read_lock in dev_queue_xmit_nit?) should make sure that the nit
> packet delivery can be done safely, and the synchronize_net in
> packet_release() makes sure that the Tx paths are quiesced before freeing
> the socket. What is the race-hole here? Does it have to do with the
> _bh and softirq context, somehow?
My understanding of the race here is that packet_release() doesn't
wait for in-flight packets correctly, so an in-flight packet can still
refer to the struct sock that is being released.
This could happen because struct packet_fanout is refcounted: it is
still there when this is not the last sock referring to it, and therefore
the callback packet_rcv_fanout() is not removed yet. When packet_release()
tries to remove the pointer to struct sock from f->arr[i] in
__fanout_unlink(), an in-flight packet could race on f->arr[i]:
po = pkt_sk(f->arr[idx]);
Of course, the fix may not be as easy as just adding a synchronize_net();
perhaps we need the spinlock too in fanout_demux_rollover().
At least I believe this explains the crash Dmitry reported.
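[A minimal userspace sketch of the unpublish, wait, free ordering that the
RCU discussion in this thread revolves around. It is not the kernel's RCU:
a plain reader counter stands in for the read-side critical section and for
synchronize_net(), and struct sock_model, slot, reader_peek(),
slot_publish() and slot_release() are hypothetical names.]

```c
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for the socket reachable through f->arr[i]. */
struct sock_model {
	int rx_queue_len;
};

static _Atomic(struct sock_model *) slot; /* models one f->arr[i] entry */
static atomic_int readers;                /* crude read-side marker */

void slot_publish(struct sock_model *sk)
{
	atomic_store(&slot, sk);
}

/* Reader: models the demux path picking a socket out of the array. */
int reader_peek(void)
{
	struct sock_model *sk;
	int ret = -1;

	atomic_fetch_add(&readers, 1);  /* enter read-side section */
	sk = atomic_load(&slot);
	if (sk)
		ret = sk->rx_queue_len; /* safe: the writer waits for us */
	atomic_fetch_sub(&readers, 1);  /* leave read-side section */
	return ret;
}

/* Writer: unpublish first, wait for current readers, free last. */
void slot_release(void)
{
	struct sock_model *sk;

	sk = atomic_exchange(&slot, (struct sock_model *)NULL);
	while (atomic_load(&readers) != 0)
		; /* busy-wait; plays the role of a grace period */
	free(sk);
}
```

[The point of the sketch is the ordering in slot_release(): waiting before
the pointer is cleared would not help, because a reader arriving after the
wait could still load it.]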
* Re: net/packet: use-after-free in packet_rcv_fanout
2017-02-10 17:59 ` Eric Dumazet
@ 2017-02-10 18:02 ` Cong Wang
2017-02-10 18:15 ` Eric Dumazet
2017-02-10 18:02 ` Eric Dumazet
1 sibling, 1 reply; 16+ messages in thread
From: Cong Wang @ 2017-02-10 18:02 UTC
To: Eric Dumazet
Cc: Dmitry Vyukov, David Miller, Willem de Bruijn, Eric Dumazet,
Daniel Borkmann, jarno, Sowmini Varadhan, Philip Pettersson,
weongyo.linux, netdev, LKML, syzkaller
On Fri, Feb 10, 2017 at 9:59 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Fri, 2017-02-10 at 09:49 -0800, Cong Wang wrote:
>> On Thu, Feb 9, 2017 at 7:23 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
>> > On Thu, 2017-02-09 at 19:19 -0800, Eric Dumazet wrote:
>> >
>> >> More likely the bug is in fanout_add(), with a buggy sequence in error
>> >> case, and not correct locking.
>> >>
>> >> kfree(po->rollover);
>> >> po->rollover = NULL;
>> >>
>> >> Two cpus entering fanout_add() (using the same af_packet socket,
>> >> syzkaller courtesy...) might both see po->fanout being NULL.
>> >>
>> >> Then they grab the mutex. Too late...
>> >
>> > Patch could be :
>> >
>>
>> For me, clearly the data structure that use-after-free'd is struct sock
>> rather than struct packet_rollover.
>
> Fine. But your patch makes absolutely no sense.
I don't have to give a 100% correct patch to prove my explanation
of the crash. At least it makes more sense than yours...
* Re: net/packet: use-after-free in packet_rcv_fanout
2017-02-10 17:59 ` Eric Dumazet
2017-02-10 18:02 ` Cong Wang
@ 2017-02-10 18:02 ` Eric Dumazet
2017-02-10 18:34 ` Cong Wang
1 sibling, 1 reply; 16+ messages in thread
From: Eric Dumazet @ 2017-02-10 18:02 UTC
To: Cong Wang, Anoob Soman
Cc: Dmitry Vyukov, David Miller, Willem de Bruijn, Eric Dumazet,
Daniel Borkmann, jarno, Sowmini Varadhan, philip.pettersson,
weongyo.linux, netdev, LKML, syzkaller
On Fri, 2017-02-10 at 09:59 -0800, Eric Dumazet wrote:
> On Fri, 2017-02-10 at 09:49 -0800, Cong Wang wrote:
> > On Thu, Feb 9, 2017 at 7:23 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> > > On Thu, 2017-02-09 at 19:19 -0800, Eric Dumazet wrote:
> > >
> > >> More likely the bug is in fanout_add(), with a buggy sequence in error
> > >> case, and not correct locking.
> > >>
> > >> kfree(po->rollover);
> > >> po->rollover = NULL;
> > >>
> > >> Two cpus entering fanout_add() (using the same af_packet socket,
> > >> syzkaller courtesy...) might both see po->fanout being NULL.
> > >>
> > >> Then they grab the mutex. Too late...
> > >
> > > Patch could be :
> > >
> >
> > For me, clearly the data structure that use-after-free'd is struct sock
> > rather than struct packet_rollover.
>
> Fine. But your patch makes absolutely no sense.
At least Anoob's patch is a step in the right direction ;)
https://patchwork.ozlabs.org/patch/726532/
* Re: net/packet: use-after-free in packet_rcv_fanout
2017-02-10 18:02 ` Cong Wang
@ 2017-02-10 18:15 ` Eric Dumazet
0 siblings, 0 replies; 16+ messages in thread
From: Eric Dumazet @ 2017-02-10 18:15 UTC
To: Cong Wang
Cc: Dmitry Vyukov, David Miller, Willem de Bruijn, Eric Dumazet,
Daniel Borkmann, jarno, Sowmini Varadhan, Philip Pettersson,
weongyo.linux, netdev, LKML, syzkaller
On Fri, 2017-02-10 at 10:02 -0800, Cong Wang wrote:
> I don't have to give a 100% correct patch to prove my explanation
> of the crash. At least it makes more sense than yours...
I will submit it regardless of what you think.
It solves _another_ issue, one of 10 in af_packet.c
* Re: net/packet: use-after-free in packet_rcv_fanout
2017-02-10 18:02 ` Eric Dumazet
@ 2017-02-10 18:34 ` Cong Wang
0 siblings, 0 replies; 16+ messages in thread
From: Cong Wang @ 2017-02-10 18:34 UTC
To: Eric Dumazet
Cc: Anoob Soman, Dmitry Vyukov, David Miller, Willem de Bruijn,
Eric Dumazet, Daniel Borkmann, jarno, Sowmini Varadhan,
Philip Pettersson, weongyo.linux, netdev, LKML, syzkaller
On Fri, Feb 10, 2017 at 10:02 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Fri, 2017-02-10 at 09:59 -0800, Eric Dumazet wrote:
>> On Fri, 2017-02-10 at 09:49 -0800, Cong Wang wrote:
>> > On Thu, Feb 9, 2017 at 7:23 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
>> > > On Thu, 2017-02-09 at 19:19 -0800, Eric Dumazet wrote:
>> > >
>> > >> More likely the bug is in fanout_add(), with a buggy sequence in error
>> > >> case, and not correct locking.
>> > >>
>> > >> kfree(po->rollover);
>> > >> po->rollover = NULL;
>> > >>
>> > >> Two cpus entering fanout_add() (using the same af_packet socket,
>> > >> syzkaller courtesy...) might both see po->fanout being NULL.
>> > >>
>> > >> Then they grab the mutex. Too late...
>> > >
>> > > Patch could be :
>> > >
>> >
>> > For me, clearly the data structure that use-after-free'd is struct sock
>> > rather than struct packet_rollover.
>>
>> Fine. But your patch makes absolutely no sense.
>
> At least Anoob's patch is a step in the right direction ;)
>
> https://patchwork.ozlabs.org/patch/726532/
>
Yeah, but it still looks like a different bug from the one Dmitry reported.
* Re: net/packet: use-after-free in packet_rcv_fanout
2017-02-10 18:00 ` Cong Wang
@ 2017-02-10 19:16 ` Sowmini Varadhan
0 siblings, 0 replies; 16+ messages in thread
From: Sowmini Varadhan @ 2017-02-10 19:16 UTC
To: Cong Wang
Cc: Eric Dumazet, Dmitry Vyukov, David Miller, Willem de Bruijn,
Eric Dumazet, Daniel Borkmann, jarno, Philip Pettersson,
weongyo.linux, netdev, LKML, syzkaller
On (02/10/17 10:00), Cong Wang wrote:
> My understanding of the race here is that packet_release() doesn't
> wait for in-flight packets correctly, so an in-flight packet can still
> refer to the struct sock that is being released.
>
> This could happen because struct packet_fanout is refcounted: it is
:
> At least I believe this explains the crash Dmitry reported.
hmm, the proof of the pudding is in the eating- would be good to
be able to reliably reproduce this somewhere (thus proving that
root-cause analysis is rock-solid), maybe by introducing artificial
delays to slow down paths..
I'm travelling at the moment but may be able to give this a try (i.e. try
to reproduce it reliably) next week.
--Sowmini