* net/ipv4: use-after-free in ipv4_mtu
@ 2017-04-04 14:50 Andrey Konovalov
2017-04-04 18:51 ` Eric Dumazet
0 siblings, 1 reply; 8+ messages in thread
From: Andrey Konovalov @ 2017-04-04 14:50 UTC (permalink / raw)
To: David S. Miller, Alexey Kuznetsov, James Morris,
Hideaki YOSHIFUJI, Patrick McHardy, netdev, LKML
Cc: Dmitry Vyukov, Kostya Serebryany, Eric Dumazet, syzkaller
Hi,
I've got the following error report while fuzzing the kernel with syzkaller.
On commit a71c9a1c779f2499fb2afc0553e543f18aff6edf (4.11-rc5).
Unfortunately it's not reproducible.
==================================================================
BUG: KASAN: use-after-free in dst_metric_raw include/net/dst.h:176
[inline] at addr ffff88003d6a965c
BUG: KASAN: use-after-free in ipv4_mtu+0x3f2/0x4b0
net/ipv4/route.c:1270 at addr ffff88003d6a965c
Read of size 4 by task syz-executor3/20611
CPU: 3 PID: 20611 Comm: syz-executor3 Not tainted 4.11.0-rc5+ #199
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:16 [inline]
dump_stack+0x292/0x398 lib/dump_stack.c:52
kasan_object_err+0x1c/0x70 mm/kasan/report.c:164
print_address_description mm/kasan/report.c:202 [inline]
kasan_report_error mm/kasan/report.c:291 [inline]
kasan_report+0x252/0x510 mm/kasan/report.c:347
__asan_report_load4_noabort+0x14/0x20 mm/kasan/report.c:367
dst_metric_raw include/net/dst.h:176 [inline]
ipv4_mtu+0x3f2/0x4b0 net/ipv4/route.c:1270
dst_mtu include/net/dst.h:221 [inline]
do_ip_getsockopt+0x71d/0x2290 net/ipv4/ip_sockglue.c:1433
ip_getsockopt+0x90/0x230 net/ipv4/ip_sockglue.c:1578
tcp_getsockopt+0x82/0xd0 net/ipv4/tcp.c:3131
sock_common_getsockopt+0x95/0xd0 net/core/sock.c:2709
SYSC_getsockopt net/socket.c:1829 [inline]
SyS_getsockopt+0x252/0x390 net/socket.c:1811
entry_SYSCALL_64_fastpath+0x1f/0xc2
RIP: 0033:0x4458d9
RSP: 002b:00007fe87f452b58 EFLAGS: 00000286 ORIG_RAX: 0000000000000037
RAX: ffffffffffffffda RBX: 0000000000000005 RCX: 00000000004458d9
RDX: 000000000000000e RSI: 0000000000000000 RDI: 0000000000000005
RBP: 00000000006e0020 R08: 0000000020db6000 R09: 0000000000000000
R10: 00000000207e8000 R11: 0000000000000286 R12: 0000000000708150
R13: 0000000020db8000 R14: 0000000000001000 R15: 0000000000000003
Object at ffff88003d6a9658, in cache kmalloc-64 size: 64
Allocated:
PID = 20110
save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:59
save_stack+0x43/0xd0 mm/kasan/kasan.c:513
set_track mm/kasan/kasan.c:525 [inline]
kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:616
kmem_cache_alloc_trace+0x82/0x270 mm/slub.c:2745
kmalloc include/linux/slab.h:490 [inline]
kzalloc include/linux/slab.h:663 [inline]
fib_create_info+0x8e0/0x3a30 net/ipv4/fib_semantics.c:1040
fib_table_insert+0x1a5/0x1550 net/ipv4/fib_trie.c:1221
ip_rt_ioctl+0xddc/0x1590 net/ipv4/fib_frontend.c:597
inet_ioctl+0xf2/0x1c0 net/ipv4/af_inet.c:882
sctp: [Deprecated]: syz-executor0 (pid 20638) Use of int in max_burst
socket option.
Use struct sctp_assoc_value instead
sock_do_ioctl+0x65/0xb0 net/socket.c:906
sock_ioctl+0x28f/0x440 net/socket.c:1004
vfs_ioctl fs/ioctl.c:45 [inline]
do_vfs_ioctl+0x1bf/0x1780 fs/ioctl.c:685
SYSC_ioctl fs/ioctl.c:700 [inline]
SyS_ioctl+0x8f/0xc0 fs/ioctl.c:691
entry_SYSCALL_64_fastpath+0x1f/0xc2
Freed:
PID = 4439
save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:59
save_stack+0x43/0xd0 mm/kasan/kasan.c:513
set_track mm/kasan/kasan.c:525 [inline]
kasan_slab_free+0x73/0xc0 mm/kasan/kasan.c:589
slab_free_hook mm/slub.c:1357 [inline]
slab_free_freelist_hook mm/slub.c:1379 [inline]
slab_free mm/slub.c:2961 [inline]
kfree+0xe8/0x2b0 mm/slub.c:3882
free_fib_info_rcu+0x4ba/0x5e0 net/ipv4/fib_semantics.c:218
__rcu_reclaim kernel/rcu/rcu.h:118 [inline]
rcu_do_batch.isra.64+0x947/0xcc0 kernel/rcu/tree.c:2879
invoke_rcu_callbacks kernel/rcu/tree.c:3142 [inline]
__rcu_process_callbacks kernel/rcu/tree.c:3109 [inline]
rcu_process_callbacks+0x2cc/0xb90 kernel/rcu/tree.c:3126
__do_softirq+0x2fb/0xb7d kernel/softirq.c:284
Memory state around the buggy address:
ffff88003d6a9500: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
ffff88003d6a9580: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>ffff88003d6a9600: fc fc fc fc fc fc fc fc fc fc fc fb fb fb fb fb
^
ffff88003d6a9680: fb fb fb fc fc fc fc fc fc fc fc fc fc fc fc fc
ffff88003d6a9700: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
==================================================================
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: net/ipv4: use-after-free in ipv4_mtu
2017-04-04 14:50 net/ipv4: use-after-free in ipv4_mtu Andrey Konovalov
@ 2017-04-04 18:51 ` Eric Dumazet
2017-04-05 1:11 ` Cong Wang
0 siblings, 1 reply; 8+ messages in thread
From: Eric Dumazet @ 2017-04-04 18:51 UTC (permalink / raw)
To: Andrey Konovalov
Cc: David S. Miller, netdev, LKML, Dmitry Vyukov, Kostya Serebryany, syzkaller
On Tue, Apr 4, 2017 at 7:50 AM, Andrey Konovalov <andreyknvl@google.com> wrote:
>
> Hi,
>
> I've got the following error report while fuzzing the kernel with syzkaller.
>
> On commit a71c9a1c779f2499fb2afc0553e543f18aff6edf (4.11-rc5).
>
> Unfortunately it's not reproducible.
>
> [...]

Thanks for the report Andrey

Looking at fib->fib_metrics, I fail to understand how the following can work:

	dst_init_metrics(&rt->dst, fi->fib_metrics, true);

In the cases fi->fib_metrics is _not_ dst_default_metrics,
fi->fib_metrics can be freed when the fib is deleted,
while dst(s) still have the 'read only pointer'.

RCU grace period before fi->fib_metrics freeing does not help.

Without refcounts, it looks like we need to copy the fib_metrics.
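The "copy the fib_metrics" option mentioned above can be sketched in userspace as follows. This is only a simplified model under stated assumptions, not kernel code: `fib_model`, `dst_model`, the `MODEL_*` constants and every function name are hypothetical stand-ins. It shows that once the dst holds its own copy, freeing the route's metrics array no longer leaves the dst with a dangling pointer.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MODEL_RTAX_MAX 16 /* illustrative array length, not the kernel's value */
#define MODEL_RTAX_MTU 2  /* illustrative 1-based metric index */

/* Hypothetical stand-in for a fib_info that owns a heap-allocated metrics array. */
struct fib_model {
	uint32_t *metrics;
};

/* Hypothetical stand-in for a dst that keeps a private copy of the metrics. */
struct dst_model {
	uint32_t metrics[MODEL_RTAX_MAX];
};

static struct fib_model *fib_model_create(uint32_t mtu)
{
	struct fib_model *fi = malloc(sizeof(*fi));

	if (!fi)
		return NULL;
	fi->metrics = calloc(MODEL_RTAX_MAX, sizeof(uint32_t));
	if (!fi->metrics) {
		free(fi);
		return NULL;
	}
	fi->metrics[MODEL_RTAX_MTU - 1] = mtu;
	return fi;
}

static void fib_model_free(struct fib_model *fi)
{
	free(fi->metrics);
	free(fi);
}

/* Copy instead of sharing the pointer: the dst no longer depends on fi's lifetime. */
static void dst_model_copy_metrics(struct dst_model *dst, const struct fib_model *fi)
{
	memcpy(dst->metrics, fi->metrics, sizeof(dst->metrics));
}

static uint32_t dst_model_mtu(const struct dst_model *dst)
{
	return dst->metrics[MODEL_RTAX_MTU - 1];
}

/* The route is deleted first; the cached dst is read afterwards. */
uint32_t model_read_mtu_after_route_delete(uint32_t mtu)
{
	struct dst_model dst;
	struct fib_model *fi = fib_model_create(mtu);

	assert(fi != NULL);
	dst_model_copy_metrics(&dst, fi);
	fib_model_free(fi);         /* models the route's metrics being freed */
	return dst_model_mtu(&dst); /* reads the private copy, not freed memory */
}
```

The cost of this approach in the model is one array copy per dst, which is the trade-off against sharing a single array with a reference count.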
* Re: net/ipv4: use-after-free in ipv4_mtu
2017-04-04 18:51 ` Eric Dumazet
@ 2017-04-05 1:11 ` Cong Wang
2017-04-05 2:45 ` Eric Dumazet
0 siblings, 1 reply; 8+ messages in thread
From: Cong Wang @ 2017-04-05 1:11 UTC (permalink / raw)
To: Eric Dumazet
Cc: Andrey Konovalov, David S. Miller, netdev, LKML, Dmitry Vyukov, Kostya Serebryany, syzkaller
On Tue, Apr 4, 2017 at 11:51 AM, Eric Dumazet <edumazet@google.com> wrote:
> On Tue, Apr 4, 2017 at 7:50 AM, Andrey Konovalov <andreyknvl@google.com> wrote:
>> [...]
>
> Thanks for the report Andrey
>
> Looking at fib->fib_metrics, I fail to understand how the following can work:
>
> dst_init_metrics(&rt->dst, fi->fib_metrics, true);
>
> In the cases fi->fib_metrics is _not_ dst_default_metrics,
> fi->fib_metrics can be freed when the fib is deleted,
> while dst(s) still have the 'read only pointer'.
>
> RCU grace period before fi->fib_metrics freeing does not help.
>
> Without refcounts, it looks like we need to copy the fib_metrics.

The dst is obtained from sk_dst_cache which is cached for a fast
path where fib_info is obtained in fib_lookup() without refcnt:

	err = fib_table_lookup(tb, flp, res, flags | FIB_LOOKUP_NOREF);

	...
	if (!(fib_flags & FIB_LOOKUP_NOREF))
		atomic_inc(&fi->fib_clntref);

This probably starts from:

commit ebc0ffae5dfb4447e0a431ffe7fe1d467c48bbb9
Author: Eric Dumazet <eric.dumazet@gmail.com>
Date: Tue Oct 5 10:41:36 2010 +0000

    fib: RCU conversion of fib_lookup()
* Re: net/ipv4: use-after-free in ipv4_mtu
2017-04-05 1:11 ` Cong Wang
@ 2017-04-05 2:45 ` Eric Dumazet
2017-04-05 18:59 ` Subash Abhinov Kasiviswanathan
2017-04-05 22:33 ` Cong Wang
0 siblings, 2 replies; 8+ messages in thread
From: Eric Dumazet @ 2017-04-05 2:45 UTC (permalink / raw)
To: Cong Wang
Cc: Eric Dumazet, Andrey Konovalov, David S. Miller, netdev, LKML, Dmitry Vyukov, Kostya Serebryany, syzkaller
On Tue, 2017-04-04 at 18:11 -0700, Cong Wang wrote:
> On Tue, Apr 4, 2017 at 11:51 AM, Eric Dumazet <edumazet@google.com> wrote:
> > On Tue, Apr 4, 2017 at 7:50 AM, Andrey Konovalov <andreyknvl@google.com> wrote:
> >> [...]
> >
> > Thanks for the report Andrey
> >
> > Looking at fib->fib_metrics, I fail to understand how the following can work:
> >
> > dst_init_metrics(&rt->dst, fi->fib_metrics, true);
> >
> > In the cases fi->fib_metrics is _not_ dst_default_metrics,
> > fi->fib_metrics can be freed when the fib is deleted,
> > while dst(s) still have the 'read only pointer'.
> >
> > RCU grace period before fi->fib_metrics freeing does not help.
> >
> > Without refcounts, it looks like we need to copy the fib_metrics.
>
> The dst is obtained from sk_dst_cache which is cached for a fast
> path where fib_info is obtained in fib_lookup() without refcnt:
>
> err = fib_table_lookup(tb, flp, res, flags | FIB_LOOKUP_NOREF);
>
>
> ...
> if (!(fib_flags & FIB_LOOKUP_NOREF))
> atomic_inc(&fi->fib_clntref);
>
>
> This probably starts from:
>
> commit ebc0ffae5dfb4447e0a431ffe7fe1d467c48bbb9
> Author: Eric Dumazet <eric.dumazet@gmail.com>
> Date: Tue Oct 5 10:41:36 2010 +0000
>
> fib: RCU conversion of fib_lookup()

Interesting. I might have had too many beers tonight, but ...

refcount was removed in 2860583fe840 many months later

-static void rt_init_metrics(struct rtable *rt, struct fib_info *fi)
-{
-	if (fi->fib_metrics != (u32 *) dst_default_metrics) {
-		rt->fi = fi;
-		atomic_inc(&fi->fib_clntref);
-	}
-	dst_init_metrics(&rt->dst, fi->fib_metrics, true);
-}
-
 static struct fib_nh_exception *find_exception(struct fib_nh *nh, __be32 daddr)
 {
 	struct fnhe_hash_bucket *hash = nh->nh_exceptions;
@@ -1261,7 +1239,7 @@ static void rt_set_nexthop(struct rtable *rt, __be32 daddr,
 			rt->rt_gateway = nh->nh_gw;
 		if (unlikely(fnhe))
 			rt_bind_exception(rt, fnhe, daddr);
-		rt_init_metrics(rt, fi);
+		dst_init_metrics(&rt->dst, fi->fib_metrics, true);
 #ifdef CONFIG_IP_ROUTE_CLASSID
 		rt->dst.tclassid = nh->nh_tclassid;
 #endif
* Re: net/ipv4: use-after-free in ipv4_mtu
2017-04-05 2:45 ` Eric Dumazet
@ 2017-04-05 18:59 ` Subash Abhinov Kasiviswanathan
2017-04-05 22:33 ` Cong Wang
1 sibling, 0 replies; 8+ messages in thread
From: Subash Abhinov Kasiviswanathan @ 2017-04-05 18:59 UTC (permalink / raw)
To: Eric Dumazet
Cc: Cong Wang, Eric Dumazet, Andrey Konovalov, David S. Miller, netdev, LKML, Dmitry Vyukov, Kostya Serebryany, syzkaller, netdev-owner
>
> Interesting. I might have had too many beers tonight, but ...
>
> refcount was removed in 2860583fe840 many months later
>
> -static void rt_init_metrics(struct rtable *rt, struct fib_info *fi)
> -{
> -	if (fi->fib_metrics != (u32 *) dst_default_metrics) {
> -		rt->fi = fi;
> -		atomic_inc(&fi->fib_clntref);
> -	}
> -	dst_init_metrics(&rt->dst, fi->fib_metrics, true);
> -}
> -
>  static struct fib_nh_exception *find_exception(struct fib_nh *nh,
> __be32 daddr)
>  {
>  	struct fnhe_hash_bucket *hash = nh->nh_exceptions;
> @@ -1261,7 +1239,7 @@ static void rt_set_nexthop(struct rtable *rt,
> __be32 daddr,
>  			rt->rt_gateway = nh->nh_gw;
>  		if (unlikely(fnhe))
>  			rt_bind_exception(rt, fnhe, daddr);
> -		rt_init_metrics(rt, fi);
> +		dst_init_metrics(&rt->dst, fi->fib_metrics, true);
>  #ifdef CONFIG_IP_ROUTE_CLASSID
>  		rt->dst.tclassid = nh->nh_tclassid;
>  #endif

Hi Eric

I encountered a crash on a 4.4 kernel pointing to ipv4_mtu. Is the crash
similar to this one?
(The target is ARM64 Android; it was seen on a stability rack, so
unfortunately there is no reproducer.)

<6> Kernel BUG at 00000000000005dc [verbose debug info unavailable]
<6> Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
<6> CPU: 1 PID: 4649 Comm: iperf Tainted: G W O 4.4.21+ #1
<6> task: ffffffef02242f00 ti: ffffffef021b8000 task.ti: ffffffef021b8000
<2> PC is at 0x5dc
<2> LR is at ipv4_mtu+0x70/0x84
<2> pc : [<00000000000005dc>] lr : [<ffffff9bd2c35ab8>] pstate: a0000145
<2> sp : ffffffef021bb9b0
<2> x29: ffffffef021bb9b0 x28: 0000000000000000
<2> x27: ffffffef318122c0 x26: 00000000000005be
<2> x25: ffffffef31812678 x24: ffffffef31812678
<2> x23: ffffffef8794c000 x22: ffffff9bd43f4380
<2> x21: ffffffef318122c0 x20: ffffffef6aef6ac0
<2> x19: ffffffef05026ac0 x18: 0000000001026749
<2> x17: 0000007fabaf145c x16: ffffff9bd1fe72bc
<2> x15: 00368fbefea52a8e x14: 3736353433323130
<2> x13: 3938373635343332 x12: 0000000000000003
<2> x11: 0000000000000028 x10: 0101010101010101
<2> x9 : 0000000000000001 x8 : 0000000000000098
<2> x7 : ffffff9bd2c8cbc0 x6 : 0000000000000000
<2> x5 : ffffffef68481c00 x4 : 00000000ffffefbf
<2> x3 : 0000000000000000 x2 : 0000000000000000
<2> x1 : 000000000000ef7f x0 : 0000000001280058
<2> LR: 0xffffff9bd2c35a78:
<2> 5a78 b7f80241 f9401661 927ef421 b9400422 2a0203e0 350001a2 f9400e60 b9400021
<2> 5a98 b9422800 361000c1 39428e61 34000081 7109001f 52804801 1a819000 529fffe1
<2> 5ab8 6b01001f 1a819000 f9400bf3 a8c27bfd d65f03c0 a9ba7bfd 910003fd a90153f3
<2> 5ad8 a9025bf5 a90363f7 a9046bf9 aa0003f3 aa1e03e0 f9002fa1 2a0203f8 2a0303f9
<2> SP: 0xffffffef021bb970:
<2> b970 d2c35ab8 ffffff9b 021bb9b0 ffffffef 000005dc 00000000 a0000145 00000000
<2> b990 6aef6ac0 ffffffef 6aef6ac0 ffffffef 00000000 00000080 d2c015b0 ffffff9b
<2> b9b0 021bb9d0 ffffffef d2c3e4d4 ffffff9b 6aef6ac0 ffffffef 021bba18 ffffffef
<2> b9d0 021bba20 ffffffef d2c3f05c ffffff9b d37d9418 ffffff9b 6aef6ac0 ffffffef
<2>
<6> Process iperf (pid: 4649, stack limit = 0xffffffef021b8020)
<2> Call trace:
<2> [<00000000000005dc>] 0x5dc
<2> [<ffffff9bd2c3e4d4>] ip_finish_output+0xbc/0x1dc
<2> [<ffffff9bd2c3f05c>] ip_output+0xe8/0x15c
<2> [<ffffff9bd2c3e78c>] ip_local_out+0x58/0x68
<2> [<ffffff9bd2c3fa88>] ip_send_skb+0x2c/0xa8
<2> [<ffffff9bd2c643d0>] udp_send_skb+0x194/0x29c
<2> [<ffffff9bd2c66584>] udp_sendmsg+0x4e0/0x700
<2> [<ffffff9bd2c70788>] inet_sendmsg+0x98/0xc8
<2> [<ffffff9bd2ba82e8>] sock_sendmsg+0x48/0x60
<2> [<ffffff9bd2ba8394>] sock_write_iter+0x94/0xc0
<2> [<ffffff9bd1fe61c8>] __vfs_write+0xc0/0xf0
<2> [<ffffff9bd1fe6abc>] vfs_write+0xb8/0x150
<2> [<ffffff9bd1fe7314>] SyS_write+0x58/0x94
<2> [<ffffff9bd1e84e30>] el0_svc_naked+0x24/0x28
<6> Code: bad PC value
<6> ---[ end trace debf337ba02da94f ]---
<6> Kernel panic - not syncing: Fatal exception

--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
* Re: net/ipv4: use-after-free in ipv4_mtu
2017-04-05 2:45 ` Eric Dumazet
2017-04-05 18:59 ` Subash Abhinov Kasiviswanathan
@ 2017-04-05 22:33 ` Cong Wang
2017-04-06 10:49 ` Eric Dumazet
1 sibling, 1 reply; 8+ messages in thread
From: Cong Wang @ 2017-04-05 22:33 UTC (permalink / raw)
To: Eric Dumazet
Cc: Eric Dumazet, Andrey Konovalov, David S. Miller, netdev, LKML, Dmitry Vyukov, Kostya Serebryany, syzkaller
[-- Attachment #1: Type: text/plain, Size: 1536 bytes --]
On Tue, Apr 4, 2017 at 7:45 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Tue, 2017-04-04 at 18:11 -0700, Cong Wang wrote:
>> On Tue, Apr 4, 2017 at 11:51 AM, Eric Dumazet <edumazet@google.com> wrote:
>> > Looking at fib->fib_metrics, I fail to understand how the following can work:
>> >
>> > dst_init_metrics(&rt->dst, fi->fib_metrics, true);
>> >
>> > In the cases fi->fib_metrics is _not_ dst_default_metrics,
>> > fi->fib_metrics can be freed when the fib is deleted,
>> > while dst(s) still have the 'read only pointer'.
>> >
>> > RCU grace period before fi->fib_metrics freeing does not help.
>> >
>> > Without refcounts, it looks like we need to copy the fib_metrics.
>>
>> The dst is obtained from sk_dst_cache which is cached for a fast
>> path where fib_info is obtained in fib_lookup() without refcnt:
>>
>> err = fib_table_lookup(tb, flp, res, flags | FIB_LOOKUP_NOREF);
>>
>>
>> ...
>> if (!(fib_flags & FIB_LOOKUP_NOREF))
>> atomic_inc(&fi->fib_clntref);
>>
>>
>> This probably starts from:
>>
>> commit ebc0ffae5dfb4447e0a431ffe7fe1d467c48bbb9
>> Author: Eric Dumazet <eric.dumazet@gmail.com>
>> Date: Tue Oct 5 10:41:36 2010 +0000
>>
>> fib: RCU conversion of fib_lookup()
>
> Interesting. I might have had too many beers tonight, but ...
>
> refcount was removed in 2860583fe840 many months later

Good find! I missed the refcnt in rt_set_nexthop() before that commit.

We need to revert that commit to restore the refcnt for fib_info.

[-- Attachment #2: ipv4-fib-info.diff --]
[-- Type: text/plain, Size: 1798 bytes --]

diff --git a/include/net/route.h b/include/net/route.h
index c0874c8..3917d0a 100644
--- a/include/net/route.h
+++ b/include/net/route.h
@@ -69,6 +69,7 @@ struct rtable {
 
 	struct list_head rt_uncached;
 	struct uncached_list *rt_uncached_list;
+	struct fib_info *fi; /* for refcnt to shared metrics */
 };
 
 static inline bool rt_is_input_route(const struct rtable *rt)
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 8471dd1..514d7e9 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -1391,6 +1391,11 @@ static void ipv4_dst_destroy(struct dst_entry *dst)
 {
 	struct rtable *rt = (struct rtable *) dst;
 
+	if (rt->fi) {
+		fib_info_put(rt->fi);
+		rt->fi = NULL;
+	}
+
 	if (!list_empty(&rt->rt_uncached)) {
 		struct uncached_list *ul = rt->rt_uncached_list;
 
@@ -1428,6 +1433,16 @@ static bool rt_cache_valid(const struct rtable *rt)
 		!rt_is_expired(rt);
 }
 
+static void rt_init_metrics(struct rtable *rt, struct fib_info *fi)
+{
+	if (fi->fib_metrics != (u32 *) dst_default_metrics) {
+		fib_info_hold(fi);
+		rt->fi = fi;
+	}
+
+	dst_init_metrics(&rt->dst, fi->fib_metrics, true);
+}
+
 static void rt_set_nexthop(struct rtable *rt, __be32 daddr,
 			   const struct fib_result *res,
 			   struct fib_nh_exception *fnhe,
@@ -1442,7 +1457,7 @@ static void rt_set_nexthop(struct rtable *rt, __be32 daddr,
 			rt->rt_gateway = nh->nh_gw;
 			rt->rt_uses_gateway = 1;
 		}
-		dst_init_metrics(&rt->dst, fi->fib_metrics, true);
+		rt_init_metrics(rt, fi);
 #ifdef CONFIG_IP_ROUTE_CLASSID
 		rt->dst.tclassid = nh->nh_tclassid;
 #endif
@@ -1494,6 +1509,7 @@ struct rtable *rt_dst_alloc(struct net_device *dev,
 		rt->rt_gateway = 0;
 		rt->rt_uses_gateway = 0;
 		rt->rt_table_id = 0;
+		rt->fi = NULL;
 		INIT_LIST_HEAD(&rt->rt_uncached);
 
 		rt->dst.output = ip_output;
* Re: net/ipv4: use-after-free in ipv4_mtu
2017-04-05 22:33 ` Cong Wang
@ 2017-04-06 10:49 ` Eric Dumazet
2017-04-07 17:10 ` Cong Wang
0 siblings, 1 reply; 8+ messages in thread
From: Eric Dumazet @ 2017-04-06 10:49 UTC (permalink / raw)
To: Cong Wang
Cc: Eric Dumazet, Andrey Konovalov, David S. Miller, netdev, LKML, Dmitry Vyukov, Kostya Serebryany, syzkaller
On Wed, 2017-04-05 at 15:33 -0700, Cong Wang wrote:

> Good find! I missed the refcnt in rt_set_nexthop() before that commit.
>
> We need to revert that commit to restore the refcnt for fib_info.

Well, there are other spots, in decnet and IPv6.

This is why my original mail stated the problem was in the calls to:

	dst_init_metrics(&rt->dst, fi->fib_metrics, true);

Let's not think in a "reverting" spirit, but add the missing bits.

The problem here is that the metrics should not be freed until the last
user is gone.

So maybe a refcount should be added to metrics, and we do not have to
add a fib pointer again in all dsts.

Thanks.
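The idea of putting the refcount on the metrics themselves, so that they are not freed until the last user is gone, can be illustrated with a small userspace model. This is a sketch under assumptions and not a proposed kernel patch: `metrics_model`, the `MODEL_*` constants, the free counter and all function names are hypothetical, and C11 atomics stand in for the kernel's own refcounting primitives.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdlib.h>

#define MODEL_RTAX_MAX 16 /* illustrative array length, not the kernel's value */
#define MODEL_RTAX_MTU 2  /* illustrative 1-based metric index */

/* Hypothetical metrics block that carries its own reference count. */
struct metrics_model {
	atomic_int refcnt;
	uint32_t metrics[MODEL_RTAX_MAX];
};

static int model_metrics_freed; /* counts frees so the demo can observe them */

static struct metrics_model *metrics_model_alloc(uint32_t mtu)
{
	struct metrics_model *m = calloc(1, sizeof(*m));

	if (!m)
		return NULL;
	atomic_init(&m->refcnt, 1); /* the creator (the route) holds one reference */
	m->metrics[MODEL_RTAX_MTU - 1] = mtu;
	return m;
}

static void metrics_model_get(struct metrics_model *m)
{
	atomic_fetch_add(&m->refcnt, 1);
}

static void metrics_model_put(struct metrics_model *m)
{
	/* atomic_fetch_sub returns the old value: 1 means we were the last user. */
	if (atomic_fetch_sub(&m->refcnt, 1) == 1) {
		model_metrics_freed++;
		free(m);
	}
}

/* Returns 1 if the metrics survived route deletion and were freed exactly once. */
int model_refcounted_metrics_demo(uint32_t mtu, uint32_t *mtu_seen)
{
	struct metrics_model *fib_metrics = metrics_model_alloc(mtu);
	struct metrics_model *dst_metrics;
	int freed_after_route_delete;

	assert(fib_metrics != NULL);
	model_metrics_freed = 0;

	metrics_model_get(fib_metrics); /* the dst takes its own reference */
	dst_metrics = fib_metrics;

	metrics_model_put(fib_metrics); /* route deleted: count 2 -> 1, no free */
	freed_after_route_delete = model_metrics_freed;

	*mtu_seen = dst_metrics->metrics[MODEL_RTAX_MTU - 1]; /* still valid */

	metrics_model_put(dst_metrics); /* last user gone: count 1 -> 0, freed */
	return freed_after_route_delete == 0 && model_metrics_freed == 1;
}
```

In this model the dst needs no pointer back to the route; it only holds a reference on the metrics block it reads from.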
* Re: net/ipv4: use-after-free in ipv4_mtu
2017-04-06 10:49 ` Eric Dumazet
@ 2017-04-07 17:10 ` Cong Wang
0 siblings, 0 replies; 8+ messages in thread
From: Cong Wang @ 2017-04-07 17:10 UTC (permalink / raw)
To: Eric Dumazet
Cc: Eric Dumazet, Andrey Konovalov, David S. Miller, netdev, LKML, Dmitry Vyukov, Kostya Serebryany, syzkaller
On Thu, Apr 6, 2017 at 3:49 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Wed, 2017-04-05 at 15:33 -0700, Cong Wang wrote:
>
>> Good find! I missed the refcnt in rt_set_nexthop() before that commit.
>>
>> We need to revert that commit to restore the refcnt for fib_info.
>
> Well, there are other spots, in decnet and IPv6.

IPv6 is very different, it copies or steals the metrics from mx6_config:

static int fib6_commit_metrics(struct dst_entry *dst, struct mx6_config *mxc)
{
	if (!mxc->mx)
		return 0;

	if (dst->flags & DST_HOST) {
		u32 *mp = dst_metrics_write_ptr(dst);

		if (unlikely(!mp))
			return -ENOMEM;

		fib6_copy_metrics(mp, mxc);
	} else {
		dst_init_metrics(dst, mxc->mx, false);

		/* We've stolen mx now. */
		mxc->mx = NULL;
	}

	return 0;
}

so it probably doesn't need a refcnt.

Decnet has already done the refcnt'ing, see dn_fib_semantic_match().

>
> This is why my original mail stated the problem was in the calls to:
>
> dst_init_metrics(&rt->dst, fi->fib_metrics, true);
>
> Let's not think in a "reverting" spirit, but add the missing bits.
>
> The problem here is that the metrics should not be freed until the last
> user is gone.
>
> So maybe a refcount should be added to metrics, and we do not have to
> add a fib pointer again in all dsts.
>

Good point, but it is harder than just a revert, given the fact that dst
metrics is a magic pointer to an array and is COW.
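The "magic pointer to an array and COW" remark can be made concrete with a userspace model of a tagged pointer. This is a sketch under assumptions, not the kernel's definitions: it assumes the low bits of the stored value carry flags (a read-only bit here) and that the first write to a read-only array replaces it with a private copy. All `model_*` and `MODEL_*` names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MODEL_RTAX_MAX 16                       /* illustrative array length */
#define MODEL_METRICS_READ_ONLY ((uintptr_t)0x1) /* assumed flag bit */
#define MODEL_METRICS_FLAGS ((uintptr_t)0x3)     /* assumed mask of flag bits */

/* Hypothetical dst: one word holding an array pointer plus flag bits. */
struct dst_model {
	uintptr_t metrics;
};

static uint32_t *model_metrics_ptr(const struct dst_model *dst)
{
	/* Strip the flag bits to recover the real array address. */
	return (uint32_t *)(dst->metrics & ~MODEL_METRICS_FLAGS);
}

static int model_metrics_read_only(const struct dst_model *dst)
{
	return (dst->metrics & MODEL_METRICS_READ_ONLY) != 0;
}

static void model_init_metrics(struct dst_model *dst, const uint32_t *src,
			       int read_only)
{
	/* uint32_t arrays are at least 4-byte aligned, so the low two bits are free. */
	dst->metrics = (uintptr_t)src | (read_only ? MODEL_METRICS_READ_ONLY : 0);
}

/* Copy on write: the first write swaps the shared array for a private copy. */
static uint32_t *model_metrics_write_ptr(struct dst_model *dst)
{
	uint32_t *copy;

	if (!model_metrics_read_only(dst))
		return model_metrics_ptr(dst);

	copy = malloc(MODEL_RTAX_MAX * sizeof(*copy));
	if (!copy)
		return NULL;
	memcpy(copy, model_metrics_ptr(dst), MODEL_RTAX_MAX * sizeof(*copy));
	dst->metrics = (uintptr_t)copy; /* writable now: no read-only bit set */
	return copy;
}

/* Returns 1 if a write went to a private copy and left the shared array intact. */
int model_cow_demo(void)
{
	static uint32_t shared[MODEL_RTAX_MAX];
	struct dst_model dst;
	uint32_t *w;
	int ok;

	shared[1] = 1500;
	model_init_metrics(&dst, shared, 1);
	ok = model_metrics_read_only(&dst) && model_metrics_ptr(&dst) == shared;

	w = model_metrics_write_ptr(&dst);
	if (!w)
		return 0;
	w[1] = 1400;

	ok = ok && w != shared && shared[1] == 1500;
	ok = ok && !model_metrics_read_only(&dst);
	ok = ok && model_metrics_ptr(&dst)[1] == 1400;
	free(w);
	return ok;
}
```

In this model a read-only dst only borrows the shared array and the single word has no room to say who owns it, which hints at why adding a refcount touches both the pointer encoding and the copy-on-write path.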