linux-kernel.vger.kernel.org archive mirror
* net/ipv4: use-after-free in ipv4_mtu
@ 2017-04-04 14:50 Andrey Konovalov
  2017-04-04 18:51 ` Eric Dumazet
  0 siblings, 1 reply; 8+ messages in thread
From: Andrey Konovalov @ 2017-04-04 14:50 UTC (permalink / raw)
  To: David S. Miller, Alexey Kuznetsov, James Morris,
	Hideaki YOSHIFUJI, Patrick McHardy, netdev, LKML
  Cc: Dmitry Vyukov, Kostya Serebryany, Eric Dumazet, syzkaller

Hi,

I've got the following error report while fuzzing the kernel with syzkaller.

On commit a71c9a1c779f2499fb2afc0553e543f18aff6edf (4.11-rc5).

Unfortunately it's not reproducible.

==================================================================
BUG: KASAN: use-after-free in dst_metric_raw include/net/dst.h:176 [inline] at addr ffff88003d6a965c
BUG: KASAN: use-after-free in ipv4_mtu+0x3f2/0x4b0 net/ipv4/route.c:1270 at addr ffff88003d6a965c
Read of size 4 by task syz-executor3/20611
CPU: 3 PID: 20611 Comm: syz-executor3 Not tainted 4.11.0-rc5+ #199
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:16 [inline]
 dump_stack+0x292/0x398 lib/dump_stack.c:52
 kasan_object_err+0x1c/0x70 mm/kasan/report.c:164
 print_address_description mm/kasan/report.c:202 [inline]
 kasan_report_error mm/kasan/report.c:291 [inline]
 kasan_report+0x252/0x510 mm/kasan/report.c:347
 __asan_report_load4_noabort+0x14/0x20 mm/kasan/report.c:367
 dst_metric_raw include/net/dst.h:176 [inline]
 ipv4_mtu+0x3f2/0x4b0 net/ipv4/route.c:1270
 dst_mtu include/net/dst.h:221 [inline]
 do_ip_getsockopt+0x71d/0x2290 net/ipv4/ip_sockglue.c:1433
 ip_getsockopt+0x90/0x230 net/ipv4/ip_sockglue.c:1578
 tcp_getsockopt+0x82/0xd0 net/ipv4/tcp.c:3131
 sock_common_getsockopt+0x95/0xd0 net/core/sock.c:2709
 SYSC_getsockopt net/socket.c:1829 [inline]
 SyS_getsockopt+0x252/0x390 net/socket.c:1811
 entry_SYSCALL_64_fastpath+0x1f/0xc2
RIP: 0033:0x4458d9
RSP: 002b:00007fe87f452b58 EFLAGS: 00000286 ORIG_RAX: 0000000000000037
RAX: ffffffffffffffda RBX: 0000000000000005 RCX: 00000000004458d9
RDX: 000000000000000e RSI: 0000000000000000 RDI: 0000000000000005
RBP: 00000000006e0020 R08: 0000000020db6000 R09: 0000000000000000
R10: 00000000207e8000 R11: 0000000000000286 R12: 0000000000708150
R13: 0000000020db8000 R14: 0000000000001000 R15: 0000000000000003
Object at ffff88003d6a9658, in cache kmalloc-64 size: 64
Allocated:
PID = 20110
 save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:59
 save_stack+0x43/0xd0 mm/kasan/kasan.c:513
 set_track mm/kasan/kasan.c:525 [inline]
 kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:616
 kmem_cache_alloc_trace+0x82/0x270 mm/slub.c:2745
 kmalloc include/linux/slab.h:490 [inline]
 kzalloc include/linux/slab.h:663 [inline]
 fib_create_info+0x8e0/0x3a30 net/ipv4/fib_semantics.c:1040
 fib_table_insert+0x1a5/0x1550 net/ipv4/fib_trie.c:1221
 ip_rt_ioctl+0xddc/0x1590 net/ipv4/fib_frontend.c:597
 inet_ioctl+0xf2/0x1c0 net/ipv4/af_inet.c:882
sctp: [Deprecated]: syz-executor0 (pid 20638) Use of int in max_burst socket option.
Use struct sctp_assoc_value instead
 sock_do_ioctl+0x65/0xb0 net/socket.c:906
 sock_ioctl+0x28f/0x440 net/socket.c:1004
 vfs_ioctl fs/ioctl.c:45 [inline]
 do_vfs_ioctl+0x1bf/0x1780 fs/ioctl.c:685
 SYSC_ioctl fs/ioctl.c:700 [inline]
 SyS_ioctl+0x8f/0xc0 fs/ioctl.c:691
 entry_SYSCALL_64_fastpath+0x1f/0xc2
Freed:
PID = 4439
 save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:59
 save_stack+0x43/0xd0 mm/kasan/kasan.c:513
 set_track mm/kasan/kasan.c:525 [inline]
 kasan_slab_free+0x73/0xc0 mm/kasan/kasan.c:589
 slab_free_hook mm/slub.c:1357 [inline]
 slab_free_freelist_hook mm/slub.c:1379 [inline]
 slab_free mm/slub.c:2961 [inline]
 kfree+0xe8/0x2b0 mm/slub.c:3882
 free_fib_info_rcu+0x4ba/0x5e0 net/ipv4/fib_semantics.c:218
 __rcu_reclaim kernel/rcu/rcu.h:118 [inline]
 rcu_do_batch.isra.64+0x947/0xcc0 kernel/rcu/tree.c:2879
 invoke_rcu_callbacks kernel/rcu/tree.c:3142 [inline]
 __rcu_process_callbacks kernel/rcu/tree.c:3109 [inline]
 rcu_process_callbacks+0x2cc/0xb90 kernel/rcu/tree.c:3126
 __do_softirq+0x2fb/0xb7d kernel/softirq.c:284
Memory state around the buggy address:
 ffff88003d6a9500: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 ffff88003d6a9580: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>ffff88003d6a9600: fc fc fc fc fc fc fc fc fc fc fc fb fb fb fb fb
                                                    ^
 ffff88003d6a9680: fb fb fb fc fc fc fc fc fc fc fc fc fc fc fc fc
 ffff88003d6a9700: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
==================================================================


* Re: net/ipv4: use-after-free in ipv4_mtu
  2017-04-04 14:50 net/ipv4: use-after-free in ipv4_mtu Andrey Konovalov
@ 2017-04-04 18:51 ` Eric Dumazet
  2017-04-05  1:11   ` Cong Wang
  0 siblings, 1 reply; 8+ messages in thread
From: Eric Dumazet @ 2017-04-04 18:51 UTC (permalink / raw)
  To: Andrey Konovalov
  Cc: David S. Miller, netdev, LKML, Dmitry Vyukov, Kostya Serebryany,
	syzkaller

On Tue, Apr 4, 2017 at 7:50 AM, Andrey Konovalov <andreyknvl@google.com> wrote:
>
> Hi,
>
> I've got the following error report while fuzzing the kernel with syzkaller.
>
> On commit a71c9a1c779f2499fb2afc0553e543f18aff6edf (4.11-rc5).
>
> Unfortunately it's not reproducible.
>

Thanks for the report, Andrey.

Looking at fi->fib_metrics, I fail to understand how the following can work:

dst_init_metrics(&rt->dst, fi->fib_metrics, true);

In the cases where fi->fib_metrics is _not_ dst_default_metrics,
fi->fib_metrics can be freed when the fib is deleted,
while dst(s) still hold the 'read only pointer'.

An RCU grace period before freeing fi->fib_metrics does not help.

Without refcounts, it looks like we need to copy the fib_metrics.
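
To make the copy option concrete, here is a minimal user-space sketch, not kernel code. The types `model_fib` and `model_dst`, the helper names and the array size are all hypothetical; only the idea of copying the metrics instead of borrowing the pointer comes from the text above.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MODEL_METRICS_MAX 16	/* hypothetical size, not the kernel's RTAX_MAX */

/* Hypothetical stand-in for fib_info: owns a heap-allocated metrics array. */
struct model_fib {
	uint32_t *metrics;
};

/* Hypothetical stand-in for a dst entry that keeps a private copy. */
struct model_dst {
	uint32_t metrics[MODEL_METRICS_MAX];
};

static struct model_fib *model_fib_create(uint32_t mtu)
{
	struct model_fib *fi = calloc(1, sizeof(*fi));

	if (!fi)
		return NULL;
	fi->metrics = calloc(MODEL_METRICS_MAX, sizeof(uint32_t));
	if (!fi->metrics) {
		free(fi);
		return NULL;
	}
	fi->metrics[0] = mtu;	/* slot 0 plays the role of the MTU metric */
	return fi;
}

static void model_fib_destroy(struct model_fib *fi)
{
	free(fi->metrics);
	free(fi);
}

/* Copy instead of borrowing the pointer: the dst no longer depends on fi. */
static void model_dst_copy_metrics(struct model_dst *dst,
				   const struct model_fib *fi)
{
	memcpy(dst->metrics, fi->metrics, sizeof(dst->metrics));
}

/* Returns the MTU seen by the dst after the fib entry has been deleted. */
static uint32_t model_copy_demo(void)
{
	struct model_fib *fi = model_fib_create(1500);
	struct model_dst dst;

	if (!fi)
		return 0;
	model_dst_copy_metrics(&dst, fi);
	model_fib_destroy(fi);	/* route deleted; the dst copy stays valid */
	return dst.metrics[0];
}
```

The price of this approach is one array copy per dst, which is presumably why sharing a read-only pointer was attractive in the first place.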


* Re: net/ipv4: use-after-free in ipv4_mtu
  2017-04-04 18:51 ` Eric Dumazet
@ 2017-04-05  1:11   ` Cong Wang
  2017-04-05  2:45     ` Eric Dumazet
  0 siblings, 1 reply; 8+ messages in thread
From: Cong Wang @ 2017-04-05  1:11 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Andrey Konovalov, David S. Miller, netdev, LKML, Dmitry Vyukov,
	Kostya Serebryany, syzkaller

On Tue, Apr 4, 2017 at 11:51 AM, Eric Dumazet <edumazet@google.com> wrote:
> On Tue, Apr 4, 2017 at 7:50 AM, Andrey Konovalov <andreyknvl@google.com> wrote:
>>
>> Hi,
>>
>> I've got the following error report while fuzzing the kernel with syzkaller.
>>
>> On commit a71c9a1c779f2499fb2afc0553e543f18aff6edf (4.11-rc5).
>>
>> Unfortunately it's not reproducible.
>>
>
> Thanks for the report Andrey
>
> Looking at fib->fib_metrics, I fail to understand how the following can work :
>
> dst_init_metrics(&rt->dst, fi->fib_metrics, true);
>
> In the cases fi->fib_metrics is _not_ dst_default_metrics,
> fi->fib_metrics can be freed when the fib is deleted,
> while dst(s) have still the 'read only pointer'.
>
> RCU grace period before fi->fib_metrics freeing does not help.
>
> Without refcounts, it looks like we need to copy the fib_metrics.

The dst is obtained from sk_dst_cache, which is cached for the fast
path; there, fib_info is obtained in fib_lookup() without taking a refcnt:

                err = fib_table_lookup(tb, flp, res, flags | FIB_LOOKUP_NOREF);


...
                        if (!(fib_flags & FIB_LOOKUP_NOREF))
                                atomic_inc(&fi->fib_clntref);
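
A minimal user-space sketch of that behaviour, assuming nothing beyond the two snippets above. `model_table_lookup`, `MODEL_LOOKUP_NOREF` and the struct names are hypothetical stand-ins, not kernel APIs, and the counter is a plain int rather than an atomic.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_LOOKUP_NOREF 0x1	/* hypothetical flag modeled on FIB_LOOKUP_NOREF */

/* Hypothetical stand-in for fib_info. */
struct model_fib_info {
	int clntref;	/* plays the role of fi->fib_clntref */
};

/* Hypothetical stand-in for the lookup result. */
struct model_result {
	struct model_fib_info *fi;
};

/* Fill in the result; take a reference only if the caller did not pass NOREF. */
static int model_table_lookup(struct model_fib_info *fi,
			      struct model_result *res, int flags)
{
	if (!fi)
		return -1;
	res->fi = fi;
	if (!(flags & MODEL_LOOKUP_NOREF))
		fi->clntref++;
	return 0;
}

/* Returns the reference count after one lookup with the given flags. */
static int model_refs_after_lookup(int flags)
{
	struct model_fib_info fi = { .clntref = 1 };
	struct model_result res = { .fi = NULL };

	if (model_table_lookup(&fi, &res, flags) != 0)
		return -1;
	return fi.clntref;
}
```

With the flag set, the caller receives a pointer but no reference, so anything that stores data reachable through that pointer for longer than the lookup itself has to pin it some other way.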


This probably started with:

commit ebc0ffae5dfb4447e0a431ffe7fe1d467c48bbb9
Author: Eric Dumazet <eric.dumazet@gmail.com>
Date:   Tue Oct 5 10:41:36 2010 +0000

    fib: RCU conversion of fib_lookup()


* Re: net/ipv4: use-after-free in ipv4_mtu
  2017-04-05  1:11   ` Cong Wang
@ 2017-04-05  2:45     ` Eric Dumazet
  2017-04-05 18:59       ` Subash Abhinov Kasiviswanathan
  2017-04-05 22:33       ` Cong Wang
  0 siblings, 2 replies; 8+ messages in thread
From: Eric Dumazet @ 2017-04-05  2:45 UTC (permalink / raw)
  To: Cong Wang
  Cc: Eric Dumazet, Andrey Konovalov, David S. Miller, netdev, LKML,
	Dmitry Vyukov, Kostya Serebryany, syzkaller

On Tue, 2017-04-04 at 18:11 -0700, Cong Wang wrote:
> On Tue, Apr 4, 2017 at 11:51 AM, Eric Dumazet <edumazet@google.com> wrote:
> > On Tue, Apr 4, 2017 at 7:50 AM, Andrey Konovalov <andreyknvl@google.com> wrote:
> >>
> >> Hi,
> >>
> >> I've got the following error report while fuzzing the kernel with syzkaller.
> >>
> >> On commit a71c9a1c779f2499fb2afc0553e543f18aff6edf (4.11-rc5).
> >>
> >> Unfortunately it's not reproducible.
> >>
> >
> > Thanks for the report Andrey
> >
> > Looking at fib->fib_metrics, I fail to understand how the following can work :
> >
> > dst_init_metrics(&rt->dst, fi->fib_metrics, true);
> >
> > In the cases fi->fib_metrics is _not_ dst_default_metrics,
> > fi->fib_metrics can be freed when the fib is deleted,
> > while dst(s) have still the 'read only pointer'.
> >
> > RCU grace period before fi->fib_metrics freeing does not help.
> >
> > Without refcounts, it looks like we need to copy the fib_metrics.
> 
> The dst is obtained from sk_dst_cache which is cached for a fast
> path where fib_info is obtained in fib_lookup() without refcnt:
> 
>                 err = fib_table_lookup(tb, flp, res, flags | FIB_LOOKUP_NOREF);
> 
> 
> ...
>                         if (!(fib_flags & FIB_LOOKUP_NOREF))
>                                 atomic_inc(&fi->fib_clntref);
> 
> 
> This probably starts from:
> 
> commit ebc0ffae5dfb4447e0a431ffe7fe1d467c48bbb9
> Author: Eric Dumazet <eric.dumazet@gmail.com>
> Date:   Tue Oct 5 10:41:36 2010 +0000
> 
>     fib: RCU conversion of fib_lookup()

Interesting. I might have had too many beers tonight, but ...

The refcount was removed in 2860583fe840, many months later:

-static void rt_init_metrics(struct rtable *rt, struct fib_info *fi)
-{
-       if (fi->fib_metrics != (u32 *) dst_default_metrics) {
-               rt->fi = fi;
-               atomic_inc(&fi->fib_clntref);
-       }
-       dst_init_metrics(&rt->dst, fi->fib_metrics, true);
-}
-
 static struct fib_nh_exception *find_exception(struct fib_nh *nh, __be32 daddr)
 {
        struct fnhe_hash_bucket *hash = nh->nh_exceptions;
@@ -1261,7 +1239,7 @@ static void rt_set_nexthop(struct rtable *rt, __be32 daddr,
                        rt->rt_gateway = nh->nh_gw;
                if (unlikely(fnhe))
                        rt_bind_exception(rt, fnhe, daddr);
-               rt_init_metrics(rt, fi);
+               dst_init_metrics(&rt->dst, fi->fib_metrics, true);
 #ifdef CONFIG_IP_ROUTE_CLASSID
                rt->dst.tclassid = nh->nh_tclassid;
 #endif


* Re: net/ipv4: use-after-free in ipv4_mtu
  2017-04-05  2:45     ` Eric Dumazet
@ 2017-04-05 18:59       ` Subash Abhinov Kasiviswanathan
  2017-04-05 22:33       ` Cong Wang
  1 sibling, 0 replies; 8+ messages in thread
From: Subash Abhinov Kasiviswanathan @ 2017-04-05 18:59 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Cong Wang, Eric Dumazet, Andrey Konovalov, David S. Miller,
	netdev, LKML, Dmitry Vyukov, Kostya Serebryany, syzkaller,
	netdev-owner

> 
> Interesting. I might have had too many beers tonight, but ...
> 
> refcount was removed in 2860583fe840 many months later
> 
> -static void rt_init_metrics(struct rtable *rt, struct fib_info *fi)
> -{
> -       if (fi->fib_metrics != (u32 *) dst_default_metrics) {
> -               rt->fi = fi;
> -               atomic_inc(&fi->fib_clntref);
> -       }
> -       dst_init_metrics(&rt->dst, fi->fib_metrics, true);
> -}
> -
>  static struct fib_nh_exception *find_exception(struct fib_nh *nh, __be32 daddr)
>  {
>         struct fnhe_hash_bucket *hash = nh->nh_exceptions;
> @@ -1261,7 +1239,7 @@ static void rt_set_nexthop(struct rtable *rt, __be32 daddr,
>                         rt->rt_gateway = nh->nh_gw;
>                 if (unlikely(fnhe))
>                         rt_bind_exception(rt, fnhe, daddr);
> -               rt_init_metrics(rt, fi);
> +               dst_init_metrics(&rt->dst, fi->fib_metrics, true);
>  #ifdef CONFIG_IP_ROUTE_CLASSID
>                 rt->dst.tclassid = nh->nh_tclassid;
>  #endif

Hi Eric

I encountered a crash on a 4.4 kernel pointing to ipv4_mtu.
Is the crash similar to this one?
(The target is ARM64 Android and the crash was seen on a stability rack,
so unfortunately there is no reproducer.)

<6> Kernel BUG at 00000000000005dc [verbose debug info unavailable]
<6> Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
<6> CPU: 1 PID: 4649 Comm: iperf Tainted: G        W  O    4.4.21+ #1
<6> task: ffffffef02242f00 ti: ffffffef021b8000 task.ti: ffffffef021b8000
<2> PC is at 0x5dc
<2> LR is at ipv4_mtu+0x70/0x84
<2> pc : [<00000000000005dc>] lr : [<ffffff9bd2c35ab8>] pstate: a0000145
<2> sp : ffffffef021bb9b0
<2> x29: ffffffef021bb9b0 x28: 0000000000000000
<2> x27: ffffffef318122c0 x26: 00000000000005be
<2> x25: ffffffef31812678 x24: ffffffef31812678
<2> x23: ffffffef8794c000 x22: ffffff9bd43f4380
<2> x21: ffffffef318122c0 x20: ffffffef6aef6ac0
<2> x19: ffffffef05026ac0 x18: 0000000001026749
<2> x17: 0000007fabaf145c x16: ffffff9bd1fe72bc
<2> x15: 00368fbefea52a8e x14: 3736353433323130
<2> x13: 3938373635343332 x12: 0000000000000003
<2> x11: 0000000000000028 x10: 0101010101010101
<2> x9 : 0000000000000001 x8 : 0000000000000098
<2> x7 : ffffff9bd2c8cbc0 x6 : 0000000000000000
<2> x5 : ffffffef68481c00 x4 : 00000000ffffefbf
<2> x3 : 0000000000000000 x2 : 0000000000000000
<2> x1 : 000000000000ef7f x0 : 0000000001280058
<2>
LR: 0xffffff9bd2c35a78:
<2> 5a78  b7f80241 f9401661 927ef421 b9400422 2a0203e0 350001a2 f9400e60 b9400021
<2> 5a98  b9422800 361000c1 39428e61 34000081 7109001f 52804801 1a819000 529fffe1
<2> 5ab8  6b01001f 1a819000 f9400bf3 a8c27bfd d65f03c0 a9ba7bfd 910003fd a90153f3
<2> 5ad8  a9025bf5 a90363f7 a9046bf9 aa0003f3 aa1e03e0 f9002fa1 2a0203f8 2a0303f9
<2>
SP: 0xffffffef021bb970:
<2> b970  d2c35ab8 ffffff9b 021bb9b0 ffffffef 000005dc 00000000 a0000145 00000000
<2> b990  6aef6ac0 ffffffef 6aef6ac0 ffffffef 00000000 00000080 d2c015b0 ffffff9b
<2> b9b0  021bb9d0 ffffffef d2c3e4d4 ffffff9b 6aef6ac0 ffffffef 021bba18 ffffffef
<2> b9d0  021bba20 ffffffef d2c3f05c ffffff9b d37d9418 ffffff9b 6aef6ac0 ffffffef
<2>
<6> Process iperf (pid: 4649, stack limit = 0xffffffef021b8020)
<2> Call trace:
<2> [<00000000000005dc>] 0x5dc
<2> [<ffffff9bd2c3e4d4>] ip_finish_output+0xbc/0x1dc
<2> [<ffffff9bd2c3f05c>] ip_output+0xe8/0x15c
<2> [<ffffff9bd2c3e78c>] ip_local_out+0x58/0x68
<2> [<ffffff9bd2c3fa88>] ip_send_skb+0x2c/0xa8
<2> [<ffffff9bd2c643d0>] udp_send_skb+0x194/0x29c
<2> [<ffffff9bd2c66584>] udp_sendmsg+0x4e0/0x700
<2> [<ffffff9bd2c70788>] inet_sendmsg+0x98/0xc8
<2> [<ffffff9bd2ba82e8>] sock_sendmsg+0x48/0x60
<2> [<ffffff9bd2ba8394>] sock_write_iter+0x94/0xc0
<2> [<ffffff9bd1fe61c8>] __vfs_write+0xc0/0xf0
<2> [<ffffff9bd1fe6abc>] vfs_write+0xb8/0x150
<2> [<ffffff9bd1fe7314>] SyS_write+0x58/0x94
<2> [<ffffff9bd1e84e30>] el0_svc_naked+0x24/0x28
<6> Code: bad PC value
<6> ---[ end trace debf337ba02da94f ]---
<6> Kernel panic - not syncing: Fatal exception

--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project


* Re: net/ipv4: use-after-free in ipv4_mtu
  2017-04-05  2:45     ` Eric Dumazet
  2017-04-05 18:59       ` Subash Abhinov Kasiviswanathan
@ 2017-04-05 22:33       ` Cong Wang
  2017-04-06 10:49         ` Eric Dumazet
  1 sibling, 1 reply; 8+ messages in thread
From: Cong Wang @ 2017-04-05 22:33 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Eric Dumazet, Andrey Konovalov, David S. Miller, netdev, LKML,
	Dmitry Vyukov, Kostya Serebryany, syzkaller

[-- Attachment #1: Type: text/plain, Size: 1536 bytes --]

On Tue, Apr 4, 2017 at 7:45 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Tue, 2017-04-04 at 18:11 -0700, Cong Wang wrote:
>> On Tue, Apr 4, 2017 at 11:51 AM, Eric Dumazet <edumazet@google.com> wrote:
>> > Looking at fib->fib_metrics, I fail to understand how the following can work :
>> >
>> > dst_init_metrics(&rt->dst, fi->fib_metrics, true);
>> >
>> > In the cases fi->fib_metrics is _not_ dst_default_metrics,
>> > fi->fib_metrics can be freed when the fib is deleted,
>> > while dst(s) have still the 'read only pointer'.
>> >
>> > RCU grace period before fi->fib_metrics freeing does not help.
>> >
>> > Without refcounts, it looks like we need to copy the fib_metrics.
>>
>> The dst is obtained from sk_dst_cache which is cached for a fast
>> path where fib_info is obtained in fib_lookup() without refcnt:
>>
>>                 err = fib_table_lookup(tb, flp, res, flags | FIB_LOOKUP_NOREF);
>>
>>
>> ...
>>                         if (!(fib_flags & FIB_LOOKUP_NOREF))
>>                                 atomic_inc(&fi->fib_clntref);
>>
>>
>> This probably starts from:
>>
>> commit ebc0ffae5dfb4447e0a431ffe7fe1d467c48bbb9
>> Author: Eric Dumazet <eric.dumazet@gmail.com>
>> Date:   Tue Oct 5 10:41:36 2010 +0000
>>
>>     fib: RCU conversion of fib_lookup()
>
> Interesting. I might have had too many beers tonight, but ...
>
> refcount was removed in 2860583fe840 many months later

Good find! I missed the refcnt in rt_set_nexthop() before that commit.

We need to revert that commit to restore the refcnt for fib_info.
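
The hold/put pairing that the attached patch restores can be sketched in user space as follows. This is a simplified model, not the patch itself: all `model_*` names are hypothetical, the counter is a plain int instead of an atomic, and the check against dst_default_metrics is left out.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for fib_info with a plain (non-atomic) refcount. */
struct model_fib_info {
	int refcnt;
	int *freed_flag;	/* set to 1 when the object is released */
};

/* Hypothetical stand-in for rtable: remembers the fib_info it pinned. */
struct model_rtable {
	struct model_fib_info *fi;
};

static void model_fib_info_hold(struct model_fib_info *fi)
{
	fi->refcnt++;
}

static void model_fib_info_put(struct model_fib_info *fi)
{
	if (--fi->refcnt == 0) {
		*fi->freed_flag = 1;
		free(fi);
	}
}

/* Mirrors the idea of rt_init_metrics(): pin fi for as long as rt lives. */
static void model_rt_init_metrics(struct model_rtable *rt,
				  struct model_fib_info *fi)
{
	model_fib_info_hold(fi);
	rt->fi = fi;
}

/* Mirrors the idea of the ipv4_dst_destroy() hunk: drop the pin. */
static void model_rt_destroy(struct model_rtable *rt)
{
	if (rt->fi) {
		model_fib_info_put(rt->fi);
		rt->fi = NULL;
	}
}

/*
 * Returns freed_after_route_delete * 10 + freed_after_rt_destroy, so 1
 * means "alive after the route was deleted, freed once the last user left".
 */
static int model_hold_put_demo(void)
{
	int freed = 0, after_delete;
	struct model_rtable rt = { .fi = NULL };
	struct model_fib_info *fi = malloc(sizeof(*fi));

	if (!fi)
		return -1;
	fi->refcnt = 1;		/* reference owned by the routing table */
	fi->freed_flag = &freed;

	model_rt_init_metrics(&rt, fi);
	model_fib_info_put(fi);	/* route deleted: table drops its reference */
	after_delete = freed;
	model_rt_destroy(&rt);	/* last user goes away */
	return after_delete * 10 + freed;
}
```

In this model the metrics stay reachable because the whole fib_info stays alive, at the cost of one extra pointer in every route entry.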

[-- Attachment #2: ipv4-fib-info.diff --]
[-- Type: text/plain, Size: 1798 bytes --]

diff --git a/include/net/route.h b/include/net/route.h
index c0874c8..3917d0a 100644
--- a/include/net/route.h
+++ b/include/net/route.h
@@ -69,6 +69,7 @@ struct rtable {
 
 	struct list_head	rt_uncached;
 	struct uncached_list	*rt_uncached_list;
+	struct fib_info		*fi; /* for refcnt to shared metrics */
 };
 
 static inline bool rt_is_input_route(const struct rtable *rt)
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 8471dd1..514d7e9 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -1391,6 +1391,11 @@ static void ipv4_dst_destroy(struct dst_entry *dst)
 {
 	struct rtable *rt = (struct rtable *) dst;
 
+	if (rt->fi) {
+		fib_info_put(rt->fi);
+		rt->fi = NULL;
+	}
+
 	if (!list_empty(&rt->rt_uncached)) {
 		struct uncached_list *ul = rt->rt_uncached_list;
 
@@ -1428,6 +1433,16 @@ static bool rt_cache_valid(const struct rtable *rt)
 		!rt_is_expired(rt);
 }
 
+static void rt_init_metrics(struct rtable *rt, struct fib_info *fi)
+{
+	if (fi->fib_metrics != (u32 *) dst_default_metrics) {
+		fib_info_hold(fi);
+		rt->fi = fi;
+	}
+
+	dst_init_metrics(&rt->dst, fi->fib_metrics, true);
+}
+
 static void rt_set_nexthop(struct rtable *rt, __be32 daddr,
 			   const struct fib_result *res,
 			   struct fib_nh_exception *fnhe,
@@ -1442,7 +1457,7 @@ static void rt_set_nexthop(struct rtable *rt, __be32 daddr,
 			rt->rt_gateway = nh->nh_gw;
 			rt->rt_uses_gateway = 1;
 		}
-		dst_init_metrics(&rt->dst, fi->fib_metrics, true);
+		rt_init_metrics(rt, fi);
 #ifdef CONFIG_IP_ROUTE_CLASSID
 		rt->dst.tclassid = nh->nh_tclassid;
 #endif
@@ -1494,6 +1509,7 @@ struct rtable *rt_dst_alloc(struct net_device *dev,
 		rt->rt_gateway = 0;
 		rt->rt_uses_gateway = 0;
 		rt->rt_table_id = 0;
+		rt->fi = NULL;
 		INIT_LIST_HEAD(&rt->rt_uncached);
 
 		rt->dst.output = ip_output;


* Re: net/ipv4: use-after-free in ipv4_mtu
  2017-04-05 22:33       ` Cong Wang
@ 2017-04-06 10:49         ` Eric Dumazet
  2017-04-07 17:10           ` Cong Wang
  0 siblings, 1 reply; 8+ messages in thread
From: Eric Dumazet @ 2017-04-06 10:49 UTC (permalink / raw)
  To: Cong Wang
  Cc: Eric Dumazet, Andrey Konovalov, David S. Miller, netdev, LKML,
	Dmitry Vyukov, Kostya Serebryany, syzkaller

On Wed, 2017-04-05 at 15:33 -0700, Cong Wang wrote:

> Good find! I missed the refcnt in rt_set_nexthop() before that commit.
> 
> We need to revert that commit to restore the refcnt for fib_info.

Well, there are other spots, in decnet and IPv6.

This is why my original mail stated the problem was in the calls to:

dst_init_metrics(&rt->dst, fi->fib_metrics, true);

Let's not think in a "reverting" spirit, but rather add the missing bits.

The problem here is that the metrics should not be freed until the last user
is gone.

So maybe a refcount should be added to the metrics, so that we do not have to
add a fib pointer again in all dsts.
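
A rough user-space sketch of what a refcount on the metrics themselves could look like. The struct layout and the `model_*` names are assumptions made for illustration, and a real kernel version would need an atomic counter.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define MODEL_METRICS_MAX 16	/* hypothetical size */

/* Hypothetical metrics block that carries its own reference count. */
struct model_metrics {
	uint32_t values[MODEL_METRICS_MAX];
	int refcnt;
};

static struct model_metrics *model_metrics_alloc(uint32_t mtu)
{
	struct model_metrics *m = calloc(1, sizeof(*m));

	if (!m)
		return NULL;
	m->values[0] = mtu;	/* slot 0 plays the role of the MTU metric */
	m->refcnt = 1;		/* owned by the creator (the fib entry) */
	return m;
}

static struct model_metrics *model_metrics_get(struct model_metrics *m)
{
	m->refcnt++;
	return m;
}

/* Returns 1 when this call released the last reference and freed m. */
static int model_metrics_put(struct model_metrics *m)
{
	if (--m->refcnt > 0)
		return 0;
	free(m);
	return 1;
}

/* Returns the MTU read via the dst after the fib dropped its reference. */
static uint32_t model_metrics_demo(void)
{
	struct model_metrics *fib_ref = model_metrics_alloc(1400);
	struct model_metrics *dst_ref;
	uint32_t mtu;

	if (!fib_ref)
		return 0;
	dst_ref = model_metrics_get(fib_ref);	/* dst shares the block */
	if (model_metrics_put(fib_ref))		/* fib deleted: must not free */
		return 0;
	mtu = dst_ref->values[0];		/* still valid memory */
	if (!model_metrics_put(dst_ref))	/* last user: must free now */
		return 0;
	return mtu;
}
```

Here the dst needs no pointer back to the fib entry; it only holds a reference on the block it actually reads.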

Thanks.


* Re: net/ipv4: use-after-free in ipv4_mtu
  2017-04-06 10:49         ` Eric Dumazet
@ 2017-04-07 17:10           ` Cong Wang
  0 siblings, 0 replies; 8+ messages in thread
From: Cong Wang @ 2017-04-07 17:10 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Eric Dumazet, Andrey Konovalov, David S. Miller, netdev, LKML,
	Dmitry Vyukov, Kostya Serebryany, syzkaller

On Thu, Apr 6, 2017 at 3:49 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Wed, 2017-04-05 at 15:33 -0700, Cong Wang wrote:
>
>> Good find! I missed the refcnt in rt_set_nexthop() before that commit.
>>
>> We need to revert that commit to restore the refcnt for fib_info.
>
> Well, there are other spots, in decnet and IPv6.

IPv6 is very different: it copies or steals the metrics from mx6_config:

static int fib6_commit_metrics(struct dst_entry *dst, struct mx6_config *mxc)
{
        if (!mxc->mx)
                return 0;

        if (dst->flags & DST_HOST) {
                u32 *mp = dst_metrics_write_ptr(dst);

                if (unlikely(!mp))
                        return -ENOMEM;

                fib6_copy_metrics(mp, mxc);
        } else {
                dst_init_metrics(dst, mxc->mx, false);

                /* We've stolen mx now. */
                mxc->mx = NULL;
        }

        return 0;
}

so it probably doesn't need a refcnt.
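
The copy-or-steal pattern can be modeled in user space like this. It is a simplified sketch: the `model_*` names are hypothetical, and the host case uses a plain memcpy where the function quoted above goes through dst_metrics_write_ptr().

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MODEL_METRICS_MAX 16	/* hypothetical size */

/* Hypothetical stand-in for mx6_config: may own a metrics array. */
struct model_config {
	uint32_t *mx;
};

/* Hypothetical stand-in for a dst entry. */
struct model_dst {
	int is_host;
	uint32_t own[MODEL_METRICS_MAX];	/* private storage for copies */
	uint32_t *metrics;			/* own[] or a stolen array */
};

static int model_commit_metrics(struct model_dst *dst,
				struct model_config *cfg)
{
	if (!cfg->mx)
		return 0;

	if (dst->is_host) {
		/* Copy: the config keeps ownership of its array. */
		memcpy(dst->own, cfg->mx, sizeof(dst->own));
		dst->metrics = dst->own;
	} else {
		/* Steal: ownership of the array moves to the dst. */
		dst->metrics = cfg->mx;
		cfg->mx = NULL;
	}
	return 0;
}

/* Returns 1 if the array was stolen, 0 if it was copied, -1 on error. */
static int model_commit_demo(int is_host)
{
	struct model_dst dst = { .is_host = is_host };
	struct model_config cfg;
	int result;

	cfg.mx = calloc(MODEL_METRICS_MAX, sizeof(uint32_t));
	if (!cfg.mx)
		return -1;
	cfg.mx[0] = 1280;

	model_commit_metrics(&dst, &cfg);
	result = (dst.metrics[0] == 1280) ? (cfg.mx == NULL) : -1;

	if (cfg.mx)
		free(cfg.mx);		/* copied: caller still owns the array */
	else
		free(dst.metrics);	/* stolen: the dst owns it now */
	return result;
}
```

Either way, exactly one owner is responsible for the array at any time, so no array is shared between a route entry and a dst.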

Decnet already does the refcnt'ing, see dn_fib_semantic_match().


>
> This is why my original mail stated the problem was in the calls to :
>
> dst_init_metrics(&rt->dst, fi->fib_metrics, true);
>
> Let's not think in a "reverting" spirit, but rather add the missing bits.
>
> The problem here is that the metrics should not be freed until last user
> is gone.
>
> So maybe a refcount should be added to metrics, and we do not have to
> add a fib pointer again in all dsts.
>

Good point, but it is harder than just reverting, given the fact that dst metrics
is a magic pointer to an array and is COW.
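
To show what "magic pointer" and COW mean here, a small user-space model follows. It assumes the read-only state is kept as a flag in the low bit of the pointer value; the `model_*` names and the flag constant are hypothetical, and the sketch is single-threaded, so it ignores the races a kernel version must handle.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MODEL_METRICS_MAX 16	/* hypothetical size */
#define MODEL_READ_ONLY   0x1UL	/* hypothetical flag kept in the low bit */

/* Hypothetical dst: a tagged pointer to a bare uint32_t array. */
struct model_dst {
	uintptr_t metrics;
};

static uint32_t *model_metrics_ptr(const struct model_dst *dst)
{
	return (uint32_t *)(dst->metrics & ~(uintptr_t)MODEL_READ_ONLY);
}

static int model_metrics_read_only(const struct model_dst *dst)
{
	return (dst->metrics & MODEL_READ_ONLY) != 0;
}

static void model_init_metrics(struct model_dst *dst, const uint32_t *src,
			       int read_only)
{
	dst->metrics = (uintptr_t)src | (read_only ? MODEL_READ_ONLY : 0);
}

/* Copy-on-write: return a writable array, duplicating a shared one. */
static uint32_t *model_metrics_write_ptr(struct model_dst *dst)
{
	uint32_t *copy;

	if (!model_metrics_read_only(dst))
		return model_metrics_ptr(dst);

	copy = malloc(MODEL_METRICS_MAX * sizeof(uint32_t));
	if (!copy)
		return NULL;
	memcpy(copy, model_metrics_ptr(dst),
	       MODEL_METRICS_MAX * sizeof(uint32_t));
	dst->metrics = (uintptr_t)copy;	/* private copy, flag cleared */
	return copy;
}

/* Returns shared[0] * 10000 + private[0] after one write via the dst. */
static uint32_t model_cow_demo(void)
{
	static uint32_t shared[MODEL_METRICS_MAX] = { 1500 };
	struct model_dst dst;
	uint32_t *w, result;

	model_init_metrics(&dst, shared, 1);
	w = model_metrics_write_ptr(&dst);
	if (!w)
		return 0;
	w[0] = 1400;		/* must not touch the shared array */
	result = shared[0] * 10000 + model_metrics_ptr(&dst)[0];
	free(w);
	return result;
}
```

Because the shared object is a bare array reached through a tagged pointer, there is no obvious field in which to keep a reference count, which is the difficulty referred to above.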


Thread overview: 8+ messages
2017-04-04 14:50 net/ipv4: use-after-free in ipv4_mtu Andrey Konovalov
2017-04-04 18:51 ` Eric Dumazet
2017-04-05  1:11   ` Cong Wang
2017-04-05  2:45     ` Eric Dumazet
2017-04-05 18:59       ` Subash Abhinov Kasiviswanathan
2017-04-05 22:33       ` Cong Wang
2017-04-06 10:49         ` Eric Dumazet
2017-04-07 17:10           ` Cong Wang
