All of lore.kernel.org
 help / color / mirror / Atom feed
* net/ipv4: use-after-free in add_grec
@ 2017-05-31  9:46 Andrey Konovalov
  2017-05-31 16:12 ` Eric Dumazet
  2017-06-08 13:43 ` [PATCH net] ipv4: igmp: fix a use after free Eric Dumazet
  0 siblings, 2 replies; 18+ messages in thread
From: Andrey Konovalov @ 2017-05-31  9:46 UTC (permalink / raw)
  To: David S. Miller, Alexey Kuznetsov, James Morris,
	Hideaki YOSHIFUJI, Patrick McHardy, netdev, LKML, Eric Dumazet,
	Cong Wang
  Cc: Dmitry Vyukov, Kostya Serebryany, syzkaller

Hi,

I've got the following error report while fuzzing the kernel with syzkaller.

On commit 5ed02dbb497422bf225783f46e6eadd237d23d6b (4.12-rc3).

Unfortunately it's not reproducible.

==================================================================
BUG: KASAN: use-after-free in add_grec+0x101e/0x1090 net/ipv4/igmp.c:473
Read of size 8 at addr ffff88003053c1a0 by task swapper/0/0

CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.12.0-rc3+ #370
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Call Trace:
 <IRQ>
 __dump_stack lib/dump_stack.c:16 [inline]
 dump_stack+0x292/0x395 lib/dump_stack.c:52
 print_address_description+0x73/0x280 mm/kasan/report.c:252
 kasan_report_error mm/kasan/report.c:351 [inline]
 kasan_report+0x22b/0x340 mm/kasan/report.c:408
 __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:429
 add_grec+0x101e/0x1090 net/ipv4/igmp.c:473
 igmpv3_send_cr net/ipv4/igmp.c:663 [inline]
 igmp_ifc_timer_expire+0x46d/0xa80 net/ipv4/igmp.c:768
IPVS: length: 51 != 8
 call_timer_fn+0x23f/0x800 kernel/time/timer.c:1268
 expire_timers kernel/time/timer.c:1307 [inline]
 __run_timers+0x94e/0xcd0 kernel/time/timer.c:1601
 run_timer_softirq+0x21/0x80 kernel/time/timer.c:1614
 __do_softirq+0x2fb/0xb99 kernel/softirq.c:284
 invoke_softirq kernel/softirq.c:364 [inline]
 irq_exit+0x19e/0x1d0 kernel/softirq.c:405
 exiting_irq arch/x86/include/asm/apic.h:652 [inline]
 smp_apic_timer_interrupt+0x76/0xa0 arch/x86/kernel/apic/apic.c:966
 apic_timer_interrupt+0x93/0xa0 arch/x86/entry/entry_64.S:481
RIP: 0010:native_safe_halt+0x6/0x10 arch/x86/include/asm/irqflags.h:53
RSP: 0018:ffffffff85a079a8 EFLAGS: 00000282 ORIG_RAX: ffffffffffffff10
RAX: dffffc0000000020 RBX: 1ffffffff0b40f38 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffffffff85a2a9e4
RBP: ffffffff85a079a8 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: dffffc0000000000
R13: ffffffff85a07a60 R14: ffffffff86171338 R15: 1ffffffff0b40f5b
 </IRQ>
 arch_safe_halt arch/x86/include/asm/paravirt.h:98 [inline]
 default_idle+0x8f/0x440 arch/x86/kernel/process.c:341
 arch_cpu_idle+0xa/0x10 arch/x86/kernel/process.c:332
 default_idle_call+0x36/0x60 kernel/sched/idle.c:98
 cpuidle_idle_call kernel/sched/idle.c:156 [inline]
 do_idle+0x348/0x420 kernel/sched/idle.c:245
 cpu_startup_entry+0x18/0x20 kernel/sched/idle.c:350
 rest_init+0x18d/0x1a0 init/main.c:415
 start_kernel+0x747/0x779 init/main.c:679
 x86_64_start_reservations+0x2a/0x2c arch/x86/kernel/head64.c:196
 x86_64_start_kernel+0x132/0x141 arch/x86/kernel/head64.c:177
 secondary_startup_64+0x9f/0x9f arch/x86/kernel/head_64.S:304

Allocated by task 30543:
 save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:59
 save_stack+0x43/0xd0 mm/kasan/kasan.c:513
 set_track mm/kasan/kasan.c:525 [inline]
 kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:617
 kmem_cache_alloc_trace+0x82/0x270 mm/slub.c:2745
 kmalloc include/linux/slab.h:492 [inline]
 kzalloc include/linux/slab.h:665 [inline]
 ip_mc_add1_src net/ipv4/igmp.c:1909 [inline]
 ip_mc_add_src+0x6cd/0x1020 net/ipv4/igmp.c:2033
 ip_mc_msfilter+0x5e5/0xcf0 net/ipv4/igmp.c:2403
 do_ip_setsockopt.isra.12+0x2d47/0x38c0 net/ipv4/ip_sockglue.c:959
 ip_setsockopt+0x3a/0xb0 net/ipv4/ip_sockglue.c:1256
 tcp_setsockopt+0x82/0xd0 net/ipv4/tcp.c:2740
 sock_common_setsockopt+0x95/0xd0 net/core/sock.c:2844
 SYSC_setsockopt net/socket.c:1798 [inline]
 SyS_setsockopt+0x270/0x3a0 net/socket.c:1777
 entry_SYSCALL_64_fastpath+0x1f/0xbe

Freed by task 30543:
 save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:59
 save_stack+0x43/0xd0 mm/kasan/kasan.c:513
 set_track mm/kasan/kasan.c:525 [inline]
 kasan_slab_free+0x72/0xc0 mm/kasan/kasan.c:590
 slab_free_hook mm/slub.c:1357 [inline]
 slab_free_freelist_hook mm/slub.c:1379 [inline]
 slab_free mm/slub.c:2961 [inline]
 kfree+0xe8/0x2b0 mm/slub.c:3882
 ip_mc_clear_src+0x69/0x1c0 net/ipv4/igmp.c:2078
 ip_mc_dec_group+0x19a/0x470 net/ipv4/igmp.c:1618
 ip_mc_drop_socket+0x145/0x230 net/ipv4/igmp.c:2609
 inet_release+0x4e/0x1c0 net/ipv4/af_inet.c:411
 sock_release+0x8d/0x1e0 net/socket.c:597
 sock_close+0x16/0x20 net/socket.c:1072
 __fput+0x332/0x7f0 fs/file_table.c:209
 ____fput+0x15/0x20 fs/file_table.c:245
 task_work_run+0x19b/0x270 kernel/task_work.c:116
 exit_task_work include/linux/task_work.h:21 [inline]
 do_exit+0x18a3/0x2820 kernel/exit.c:878
 do_group_exit+0x149/0x420 kernel/exit.c:982
 get_signal+0x76d/0x1790 kernel/signal.c:2318
 do_signal+0xd2/0x2130 arch/x86/kernel/signal.c:808
 exit_to_usermode_loop+0x17a/0x210 arch/x86/entry/common.c:157
 prepare_exit_to_usermode arch/x86/entry/common.c:194 [inline]
 syscall_return_slowpath+0x3ba/0x410 arch/x86/entry/common.c:263
 entry_SYSCALL_64_fastpath+0xbc/0xbe

The buggy address belongs to the object at ffff88003053c1a0
 which belongs to the cache kmalloc-64 of size 64
The buggy address is located 0 bytes inside of
 64-byte region [ffff88003053c1a0, ffff88003053c1e0)
The buggy address belongs to the page:
page:ffffea0000c14f00 count:1 mapcount:0 mapping:          (null)
index:0x0 compound_mapcount: 0
flags: 0x100000000008100(slab|head)
raw: 0100000000008100 0000000000000000 0000000000000000 0000000100140014
raw: ffffea0000c2f520 ffffea0000e20aa0 ffff88003e80f740 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 ffff88003053c080: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 ffff88003053c100: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>ffff88003053c180: fc fc fc fc fb fb fb fb fb fb fb fb fc fc fc fc
                               ^
 ffff88003053c200: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 ffff88003053c280: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
==================================================================

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: net/ipv4: use-after-free in add_grec
  2017-05-31  9:46 net/ipv4: use-after-free in add_grec Andrey Konovalov
@ 2017-05-31 16:12 ` Eric Dumazet
  2017-05-31 23:49   ` Cong Wang
  2017-06-08 13:43 ` [PATCH net] ipv4: igmp: fix a use after free Eric Dumazet
  1 sibling, 1 reply; 18+ messages in thread
From: Eric Dumazet @ 2017-05-31 16:12 UTC (permalink / raw)
  To: Andrey Konovalov
  Cc: David S. Miller, Alexey Kuznetsov, James Morris,
	Hideaki YOSHIFUJI, Patrick McHardy, netdev, LKML, Eric Dumazet,
	Cong Wang, Dmitry Vyukov, Kostya Serebryany, syzkaller

On Wed, 2017-05-31 at 11:46 +0200, Andrey Konovalov wrote:
> Hi,
> 
> I've got the following error report while fuzzing the kernel with syzkaller.
> 
> On commit 5ed02dbb497422bf225783f46e6eadd237d23d6b (4.12-rc3).
> 
> Unfortunately it's not reproducible.
> 
> ==================================================================
> BUG: KASAN: use-after-free in add_grec+0x101e/0x1090 net/ipv4/igmp.c:473
> Read of size 8 at addr ffff88003053c1a0 by task swapper/0/0
> 
> CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.12.0-rc3+ #370
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
> Call Trace:
>  <IRQ>
>  __dump_stack lib/dump_stack.c:16 [inline]
>  dump_stack+0x292/0x395 lib/dump_stack.c:52
>  print_address_description+0x73/0x280 mm/kasan/report.c:252
>  kasan_report_error mm/kasan/report.c:351 [inline]
>  kasan_report+0x22b/0x340 mm/kasan/report.c:408
>  __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:429
>  add_grec+0x101e/0x1090 net/ipv4/igmp.c:473
>  igmpv3_send_cr net/ipv4/igmp.c:663 [inline]
>  igmp_ifc_timer_expire+0x46d/0xa80 net/ipv4/igmp.c:768
> IPVS: length: 51 != 8
>  call_timer_fn+0x23f/0x800 kernel/time/timer.c:1268
>  expire_timers kernel/time/timer.c:1307 [inline]
>  __run_timers+0x94e/0xcd0 kernel/time/timer.c:1601
>  run_timer_softirq+0x21/0x80 kernel/time/timer.c:1614
>  __do_softirq+0x2fb/0xb99 kernel/softirq.c:284
>  invoke_softirq kernel/softirq.c:364 [inline]
>  irq_exit+0x19e/0x1d0 kernel/softirq.c:405
>  exiting_irq arch/x86/include/asm/apic.h:652 [inline]
>  smp_apic_timer_interrupt+0x76/0xa0 arch/x86/kernel/apic/apic.c:966
>  apic_timer_interrupt+0x93/0xa0 arch/x86/entry/entry_64.S:481
> RIP: 0010:native_safe_halt+0x6/0x10 arch/x86/include/asm/irqflags.h:53
> RSP: 0018:ffffffff85a079a8 EFLAGS: 00000282 ORIG_RAX: ffffffffffffff10
> RAX: dffffc0000000020 RBX: 1ffffffff0b40f38 RCX: 0000000000000000
> RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffffffff85a2a9e4
> RBP: ffffffff85a079a8 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000000 R12: dffffc0000000000
> R13: ffffffff85a07a60 R14: ffffffff86171338 R15: 1ffffffff0b40f5b
>  </IRQ>
>  arch_safe_halt arch/x86/include/asm/paravirt.h:98 [inline]
>  default_idle+0x8f/0x440 arch/x86/kernel/process.c:341
>  arch_cpu_idle+0xa/0x10 arch/x86/kernel/process.c:332
>  default_idle_call+0x36/0x60 kernel/sched/idle.c:98
>  cpuidle_idle_call kernel/sched/idle.c:156 [inline]
>  do_idle+0x348/0x420 kernel/sched/idle.c:245
>  cpu_startup_entry+0x18/0x20 kernel/sched/idle.c:350
>  rest_init+0x18d/0x1a0 init/main.c:415
>  start_kernel+0x747/0x779 init/main.c:679
>  x86_64_start_reservations+0x2a/0x2c arch/x86/kernel/head64.c:196
>  x86_64_start_kernel+0x132/0x141 arch/x86/kernel/head64.c:177
>  secondary_startup_64+0x9f/0x9f arch/x86/kernel/head_64.S:304
> 
> Allocated by task 30543:
>  save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:59
>  save_stack+0x43/0xd0 mm/kasan/kasan.c:513
>  set_track mm/kasan/kasan.c:525 [inline]
>  kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:617
>  kmem_cache_alloc_trace+0x82/0x270 mm/slub.c:2745
>  kmalloc include/linux/slab.h:492 [inline]
>  kzalloc include/linux/slab.h:665 [inline]
>  ip_mc_add1_src net/ipv4/igmp.c:1909 [inline]
>  ip_mc_add_src+0x6cd/0x1020 net/ipv4/igmp.c:2033
>  ip_mc_msfilter+0x5e5/0xcf0 net/ipv4/igmp.c:2403
>  do_ip_setsockopt.isra.12+0x2d47/0x38c0 net/ipv4/ip_sockglue.c:959
>  ip_setsockopt+0x3a/0xb0 net/ipv4/ip_sockglue.c:1256
>  tcp_setsockopt+0x82/0xd0 net/ipv4/tcp.c:2740
>  sock_common_setsockopt+0x95/0xd0 net/core/sock.c:2844
>  SYSC_setsockopt net/socket.c:1798 [inline]
>  SyS_setsockopt+0x270/0x3a0 net/socket.c:1777
>  entry_SYSCALL_64_fastpath+0x1f/0xbe
> 
> Freed by task 30543:
>  save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:59
>  save_stack+0x43/0xd0 mm/kasan/kasan.c:513
>  set_track mm/kasan/kasan.c:525 [inline]
>  kasan_slab_free+0x72/0xc0 mm/kasan/kasan.c:590
>  slab_free_hook mm/slub.c:1357 [inline]
>  slab_free_freelist_hook mm/slub.c:1379 [inline]
>  slab_free mm/slub.c:2961 [inline]
>  kfree+0xe8/0x2b0 mm/slub.c:3882
>  ip_mc_clear_src+0x69/0x1c0 net/ipv4/igmp.c:2078
>  ip_mc_dec_group+0x19a/0x470 net/ipv4/igmp.c:1618
>  ip_mc_drop_socket+0x145/0x230 net/ipv4/igmp.c:2609
>  inet_release+0x4e/0x1c0 net/ipv4/af_inet.c:411
>  sock_release+0x8d/0x1e0 net/socket.c:597
>  sock_close+0x16/0x20 net/socket.c:1072
>  __fput+0x332/0x7f0 fs/file_table.c:209
>  ____fput+0x15/0x20 fs/file_table.c:245
>  task_work_run+0x19b/0x270 kernel/task_work.c:116
>  exit_task_work include/linux/task_work.h:21 [inline]
>  do_exit+0x18a3/0x2820 kernel/exit.c:878
>  do_group_exit+0x149/0x420 kernel/exit.c:982
>  get_signal+0x76d/0x1790 kernel/signal.c:2318
>  do_signal+0xd2/0x2130 arch/x86/kernel/signal.c:808
>  exit_to_usermode_loop+0x17a/0x210 arch/x86/entry/common.c:157
>  prepare_exit_to_usermode arch/x86/entry/common.c:194 [inline]
>  syscall_return_slowpath+0x3ba/0x410 arch/x86/entry/common.c:263
>  entry_SYSCALL_64_fastpath+0xbc/0xbe
> 
> The buggy address belongs to the object at ffff88003053c1a0
>  which belongs to the cache kmalloc-64 of size 64
> The buggy address is located 0 bytes inside of
>  64-byte region [ffff88003053c1a0, ffff88003053c1e0)
> The buggy address belongs to the page:
> page:ffffea0000c14f00 count:1 mapcount:0 mapping:          (null)
> index:0x0 compound_mapcount: 0
> flags: 0x100000000008100(slab|head)
> raw: 0100000000008100 0000000000000000 0000000000000000 0000000100140014
> raw: ffffea0000c2f520 ffffea0000e20aa0 ffff88003e80f740 0000000000000000
> page dumped because: kasan: bad access detected
> 
> Memory state around the buggy address:
>  ffff88003053c080: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>  ffff88003053c100: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
> >ffff88003053c180: fc fc fc fc fb fb fb fb fb fb fb fb fc fc fc fc
>                                ^
>  ffff88003053c200: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>  ffff88003053c280: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
> ==================================================================

I have the feeling that ip_mc_clear_src() is called too soon.

We should call it once all users have released their reference.

I am not sure we have to call it after the RCU grace period,
more analysis of the code is needed.


diff --git a/net/ipv4/igmp.c b/net/ipv4/igmp.c
index 44fd86de2823dd17de16276a8ec01b190e69b8b4..f172d251c5f2cda8fc8bedd65fa16cdfe33a4d40 100644
--- a/net/ipv4/igmp.c
+++ b/net/ipv4/igmp.c
@@ -174,6 +174,7 @@ static int ip_mc_add_src(struct in_device *in_dev, __be32 *pmca, int sfmode,
 static void ip_ma_put(struct ip_mc_list *im)
 {
 	if (atomic_dec_and_test(&im->refcnt)) {
+		ip_mc_clear_src(im);
 		in_dev_put(im->interface);
 		kfree_rcu(im, rcu);
 	}
@@ -1615,7 +1616,6 @@ void ip_mc_dec_group(struct in_device *in_dev, __be32 addr)
 				*ip = i->next_rcu;
 				in_dev->mc_count--;
 				igmp_group_dropped(i);
-				ip_mc_clear_src(i);
 
 				if (!in_dev->dead)
 					ip_rt_multicast_event(in_dev);

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: net/ipv4: use-after-free in add_grec
  2017-05-31 16:12 ` Eric Dumazet
@ 2017-05-31 23:49   ` Cong Wang
  2017-05-31 23:55     ` Eric Dumazet
  0 siblings, 1 reply; 18+ messages in thread
From: Cong Wang @ 2017-05-31 23:49 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Andrey Konovalov, David S. Miller, Alexey Kuznetsov,
	James Morris, Hideaki YOSHIFUJI, Patrick McHardy, netdev, LKML,
	Eric Dumazet, Dmitry Vyukov, Kostya Serebryany, syzkaller

On Wed, May 31, 2017 at 9:12 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Wed, 2017-05-31 at 11:46 +0200, Andrey Konovalov wrote:
>> Hi,
>>
>> I've got the following error report while fuzzing the kernel with syzkaller.
>>
>> On commit 5ed02dbb497422bf225783f46e6eadd237d23d6b (4.12-rc3).
>>
>> Unfortunately it's not reproducible.
>>
>> ==================================================================
>> BUG: KASAN: use-after-free in add_grec+0x101e/0x1090 net/ipv4/igmp.c:473
>> Read of size 8 at addr ffff88003053c1a0 by task swapper/0/0
>>
>> CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.12.0-rc3+ #370
>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
>> Call Trace:
>>  <IRQ>
>>  __dump_stack lib/dump_stack.c:16 [inline]
>>  dump_stack+0x292/0x395 lib/dump_stack.c:52
>>  print_address_description+0x73/0x280 mm/kasan/report.c:252
>>  kasan_report_error mm/kasan/report.c:351 [inline]
>>  kasan_report+0x22b/0x340 mm/kasan/report.c:408
>>  __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:429
>>  add_grec+0x101e/0x1090 net/ipv4/igmp.c:473
>>  igmpv3_send_cr net/ipv4/igmp.c:663 [inline]
>>  igmp_ifc_timer_expire+0x46d/0xa80 net/ipv4/igmp.c:768
>> IPVS: length: 51 != 8
>>  call_timer_fn+0x23f/0x800 kernel/time/timer.c:1268
>>  expire_timers kernel/time/timer.c:1307 [inline]
>>  __run_timers+0x94e/0xcd0 kernel/time/timer.c:1601
>>  run_timer_softirq+0x21/0x80 kernel/time/timer.c:1614
>>  __do_softirq+0x2fb/0xb99 kernel/softirq.c:284
>>  invoke_softirq kernel/softirq.c:364 [inline]
>>  irq_exit+0x19e/0x1d0 kernel/softirq.c:405
>>  exiting_irq arch/x86/include/asm/apic.h:652 [inline]
>>  smp_apic_timer_interrupt+0x76/0xa0 arch/x86/kernel/apic/apic.c:966
>>  apic_timer_interrupt+0x93/0xa0 arch/x86/entry/entry_64.S:481
>> RIP: 0010:native_safe_halt+0x6/0x10 arch/x86/include/asm/irqflags.h:53
>> RSP: 0018:ffffffff85a079a8 EFLAGS: 00000282 ORIG_RAX: ffffffffffffff10
>> RAX: dffffc0000000020 RBX: 1ffffffff0b40f38 RCX: 0000000000000000
>> RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffffffff85a2a9e4
>> RBP: ffffffff85a079a8 R08: 0000000000000000 R09: 0000000000000000
>> R10: 0000000000000000 R11: 0000000000000000 R12: dffffc0000000000
>> R13: ffffffff85a07a60 R14: ffffffff86171338 R15: 1ffffffff0b40f5b
>>  </IRQ>
>>  arch_safe_halt arch/x86/include/asm/paravirt.h:98 [inline]
>>  default_idle+0x8f/0x440 arch/x86/kernel/process.c:341
>>  arch_cpu_idle+0xa/0x10 arch/x86/kernel/process.c:332
>>  default_idle_call+0x36/0x60 kernel/sched/idle.c:98
>>  cpuidle_idle_call kernel/sched/idle.c:156 [inline]
>>  do_idle+0x348/0x420 kernel/sched/idle.c:245
>>  cpu_startup_entry+0x18/0x20 kernel/sched/idle.c:350
>>  rest_init+0x18d/0x1a0 init/main.c:415
>>  start_kernel+0x747/0x779 init/main.c:679
>>  x86_64_start_reservations+0x2a/0x2c arch/x86/kernel/head64.c:196
>>  x86_64_start_kernel+0x132/0x141 arch/x86/kernel/head64.c:177
>>  secondary_startup_64+0x9f/0x9f arch/x86/kernel/head_64.S:304
>>
>> Allocated by task 30543:
>>  save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:59
>>  save_stack+0x43/0xd0 mm/kasan/kasan.c:513
>>  set_track mm/kasan/kasan.c:525 [inline]
>>  kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:617
>>  kmem_cache_alloc_trace+0x82/0x270 mm/slub.c:2745
>>  kmalloc include/linux/slab.h:492 [inline]
>>  kzalloc include/linux/slab.h:665 [inline]
>>  ip_mc_add1_src net/ipv4/igmp.c:1909 [inline]
>>  ip_mc_add_src+0x6cd/0x1020 net/ipv4/igmp.c:2033
>>  ip_mc_msfilter+0x5e5/0xcf0 net/ipv4/igmp.c:2403
>>  do_ip_setsockopt.isra.12+0x2d47/0x38c0 net/ipv4/ip_sockglue.c:959
>>  ip_setsockopt+0x3a/0xb0 net/ipv4/ip_sockglue.c:1256
>>  tcp_setsockopt+0x82/0xd0 net/ipv4/tcp.c:2740
>>  sock_common_setsockopt+0x95/0xd0 net/core/sock.c:2844
>>  SYSC_setsockopt net/socket.c:1798 [inline]
>>  SyS_setsockopt+0x270/0x3a0 net/socket.c:1777
>>  entry_SYSCALL_64_fastpath+0x1f/0xbe
>>
>> Freed by task 30543:
>>  save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:59
>>  save_stack+0x43/0xd0 mm/kasan/kasan.c:513
>>  set_track mm/kasan/kasan.c:525 [inline]
>>  kasan_slab_free+0x72/0xc0 mm/kasan/kasan.c:590
>>  slab_free_hook mm/slub.c:1357 [inline]
>>  slab_free_freelist_hook mm/slub.c:1379 [inline]
>>  slab_free mm/slub.c:2961 [inline]
>>  kfree+0xe8/0x2b0 mm/slub.c:3882
>>  ip_mc_clear_src+0x69/0x1c0 net/ipv4/igmp.c:2078
>>  ip_mc_dec_group+0x19a/0x470 net/ipv4/igmp.c:1618
>>  ip_mc_drop_socket+0x145/0x230 net/ipv4/igmp.c:2609
>>  inet_release+0x4e/0x1c0 net/ipv4/af_inet.c:411
>>  sock_release+0x8d/0x1e0 net/socket.c:597
>>  sock_close+0x16/0x20 net/socket.c:1072
>>  __fput+0x332/0x7f0 fs/file_table.c:209
>>  ____fput+0x15/0x20 fs/file_table.c:245
>>  task_work_run+0x19b/0x270 kernel/task_work.c:116
>>  exit_task_work include/linux/task_work.h:21 [inline]
>>  do_exit+0x18a3/0x2820 kernel/exit.c:878
>>  do_group_exit+0x149/0x420 kernel/exit.c:982
>>  get_signal+0x76d/0x1790 kernel/signal.c:2318
>>  do_signal+0xd2/0x2130 arch/x86/kernel/signal.c:808
>>  exit_to_usermode_loop+0x17a/0x210 arch/x86/entry/common.c:157
>>  prepare_exit_to_usermode arch/x86/entry/common.c:194 [inline]
>>  syscall_return_slowpath+0x3ba/0x410 arch/x86/entry/common.c:263
>>  entry_SYSCALL_64_fastpath+0xbc/0xbe
>>
>> The buggy address belongs to the object at ffff88003053c1a0
>>  which belongs to the cache kmalloc-64 of size 64
>> The buggy address is located 0 bytes inside of
>>  64-byte region [ffff88003053c1a0, ffff88003053c1e0)
>> The buggy address belongs to the page:
>> page:ffffea0000c14f00 count:1 mapcount:0 mapping:          (null)
>> index:0x0 compound_mapcount: 0
>> flags: 0x100000000008100(slab|head)
>> raw: 0100000000008100 0000000000000000 0000000000000000 0000000100140014
>> raw: ffffea0000c2f520 ffffea0000e20aa0 ffff88003e80f740 0000000000000000
>> page dumped because: kasan: bad access detected
>>
>> Memory state around the buggy address:
>>  ffff88003053c080: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>>  ffff88003053c100: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>> >ffff88003053c180: fc fc fc fc fb fb fb fb fb fb fb fb fc fc fc fc
>>                                ^
>>  ffff88003053c200: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>>  ffff88003053c280: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>> ==================================================================
>
> I have the feeling that ip_mc_clear_src() is called too soon.
>
> We should call it once all users have released their reference.

In this case, ip_mc_clear_src() is called when the ->users hits
zero, therefore I think we possibly miss the refcnt on ->users.

diff --git a/net/ipv4/igmp.c b/net/ipv4/igmp.c
index 44fd86d..c2f3347 100644
--- a/net/ipv4/igmp.c
+++ b/net/ipv4/igmp.c
@@ -1886,6 +1886,7 @@ static int ip_mc_del_src(struct in_device
*in_dev, __be32 *pmca, int sfmode,
                igmp_ifc_event(pmc->interface);
 #endif
        }
+       pmc->users--;
 out_unlock:
        spin_unlock_bh(&pmc->lock);
        return err;
@@ -2025,6 +2026,7 @@ static int ip_mc_add_src(struct in_device
*in_dev, __be32 *pmca, int sfmode,
 #ifdef CONFIG_IP_MULTICAST
        sf_markstate(pmc);
 #endif
+       pmc->users++;
        isexclude = pmc->sfmode == MCAST_EXCLUDE;
        if (!delta)
                pmc->sfcount[sfmode]++;

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: net/ipv4: use-after-free in add_grec
  2017-05-31 23:49   ` Cong Wang
@ 2017-05-31 23:55     ` Eric Dumazet
  2017-06-01  0:13       ` Eric Dumazet
  0 siblings, 1 reply; 18+ messages in thread
From: Eric Dumazet @ 2017-05-31 23:55 UTC (permalink / raw)
  To: Cong Wang
  Cc: Eric Dumazet, Andrey Konovalov, David S. Miller,
	Alexey Kuznetsov, James Morris, Hideaki YOSHIFUJI,
	Patrick McHardy, netdev, LKML, Dmitry Vyukov, Kostya Serebryany,
	syzkaller

On Wed, May 31, 2017 at 4:49 PM, Cong Wang <xiyou.wangcong@gmail.com> wrote:
> On Wed, May 31, 2017 at 9:12 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
>> On Wed, 2017-05-31 at 11:46 +0200, Andrey Konovalov wrote:
>>> Hi,
>>>
>>> I've got the following error report while fuzzing the kernel with syzkaller.
>>>
>>> On commit 5ed02dbb497422bf225783f46e6eadd237d23d6b (4.12-rc3).
>>>
>>> Unfortunately it's not reproducible.
>>>
>>> ==================================================================
>>> BUG: KASAN: use-after-free in add_grec+0x101e/0x1090 net/ipv4/igmp.c:473
>>> Read of size 8 at addr ffff88003053c1a0 by task swapper/0/0
>>>
>>> CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.12.0-rc3+ #370
>>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
>>> Call Trace:
>>>  <IRQ>
>>>  __dump_stack lib/dump_stack.c:16 [inline]
>>>  dump_stack+0x292/0x395 lib/dump_stack.c:52
>>>  print_address_description+0x73/0x280 mm/kasan/report.c:252
>>>  kasan_report_error mm/kasan/report.c:351 [inline]
>>>  kasan_report+0x22b/0x340 mm/kasan/report.c:408
>>>  __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:429
>>>  add_grec+0x101e/0x1090 net/ipv4/igmp.c:473
>>>  igmpv3_send_cr net/ipv4/igmp.c:663 [inline]
>>>  igmp_ifc_timer_expire+0x46d/0xa80 net/ipv4/igmp.c:768
>>> IPVS: length: 51 != 8
>>>  call_timer_fn+0x23f/0x800 kernel/time/timer.c:1268
>>>  expire_timers kernel/time/timer.c:1307 [inline]
>>>  __run_timers+0x94e/0xcd0 kernel/time/timer.c:1601
>>>  run_timer_softirq+0x21/0x80 kernel/time/timer.c:1614
>>>  __do_softirq+0x2fb/0xb99 kernel/softirq.c:284
>>>  invoke_softirq kernel/softirq.c:364 [inline]
>>>  irq_exit+0x19e/0x1d0 kernel/softirq.c:405
>>>  exiting_irq arch/x86/include/asm/apic.h:652 [inline]
>>>  smp_apic_timer_interrupt+0x76/0xa0 arch/x86/kernel/apic/apic.c:966
>>>  apic_timer_interrupt+0x93/0xa0 arch/x86/entry/entry_64.S:481
>>> RIP: 0010:native_safe_halt+0x6/0x10 arch/x86/include/asm/irqflags.h:53
>>> RSP: 0018:ffffffff85a079a8 EFLAGS: 00000282 ORIG_RAX: ffffffffffffff10
>>> RAX: dffffc0000000020 RBX: 1ffffffff0b40f38 RCX: 0000000000000000
>>> RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffffffff85a2a9e4
>>> RBP: ffffffff85a079a8 R08: 0000000000000000 R09: 0000000000000000
>>> R10: 0000000000000000 R11: 0000000000000000 R12: dffffc0000000000
>>> R13: ffffffff85a07a60 R14: ffffffff86171338 R15: 1ffffffff0b40f5b
>>>  </IRQ>
>>>  arch_safe_halt arch/x86/include/asm/paravirt.h:98 [inline]
>>>  default_idle+0x8f/0x440 arch/x86/kernel/process.c:341
>>>  arch_cpu_idle+0xa/0x10 arch/x86/kernel/process.c:332
>>>  default_idle_call+0x36/0x60 kernel/sched/idle.c:98
>>>  cpuidle_idle_call kernel/sched/idle.c:156 [inline]
>>>  do_idle+0x348/0x420 kernel/sched/idle.c:245
>>>  cpu_startup_entry+0x18/0x20 kernel/sched/idle.c:350
>>>  rest_init+0x18d/0x1a0 init/main.c:415
>>>  start_kernel+0x747/0x779 init/main.c:679
>>>  x86_64_start_reservations+0x2a/0x2c arch/x86/kernel/head64.c:196
>>>  x86_64_start_kernel+0x132/0x141 arch/x86/kernel/head64.c:177
>>>  secondary_startup_64+0x9f/0x9f arch/x86/kernel/head_64.S:304
>>>
>>> Allocated by task 30543:
>>>  save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:59
>>>  save_stack+0x43/0xd0 mm/kasan/kasan.c:513
>>>  set_track mm/kasan/kasan.c:525 [inline]
>>>  kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:617
>>>  kmem_cache_alloc_trace+0x82/0x270 mm/slub.c:2745
>>>  kmalloc include/linux/slab.h:492 [inline]
>>>  kzalloc include/linux/slab.h:665 [inline]
>>>  ip_mc_add1_src net/ipv4/igmp.c:1909 [inline]
>>>  ip_mc_add_src+0x6cd/0x1020 net/ipv4/igmp.c:2033
>>>  ip_mc_msfilter+0x5e5/0xcf0 net/ipv4/igmp.c:2403
>>>  do_ip_setsockopt.isra.12+0x2d47/0x38c0 net/ipv4/ip_sockglue.c:959
>>>  ip_setsockopt+0x3a/0xb0 net/ipv4/ip_sockglue.c:1256
>>>  tcp_setsockopt+0x82/0xd0 net/ipv4/tcp.c:2740
>>>  sock_common_setsockopt+0x95/0xd0 net/core/sock.c:2844
>>>  SYSC_setsockopt net/socket.c:1798 [inline]
>>>  SyS_setsockopt+0x270/0x3a0 net/socket.c:1777
>>>  entry_SYSCALL_64_fastpath+0x1f/0xbe
>>>
>>> Freed by task 30543:
>>>  save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:59
>>>  save_stack+0x43/0xd0 mm/kasan/kasan.c:513
>>>  set_track mm/kasan/kasan.c:525 [inline]
>>>  kasan_slab_free+0x72/0xc0 mm/kasan/kasan.c:590
>>>  slab_free_hook mm/slub.c:1357 [inline]
>>>  slab_free_freelist_hook mm/slub.c:1379 [inline]
>>>  slab_free mm/slub.c:2961 [inline]
>>>  kfree+0xe8/0x2b0 mm/slub.c:3882
>>>  ip_mc_clear_src+0x69/0x1c0 net/ipv4/igmp.c:2078
>>>  ip_mc_dec_group+0x19a/0x470 net/ipv4/igmp.c:1618
>>>  ip_mc_drop_socket+0x145/0x230 net/ipv4/igmp.c:2609
>>>  inet_release+0x4e/0x1c0 net/ipv4/af_inet.c:411
>>>  sock_release+0x8d/0x1e0 net/socket.c:597
>>>  sock_close+0x16/0x20 net/socket.c:1072
>>>  __fput+0x332/0x7f0 fs/file_table.c:209
>>>  ____fput+0x15/0x20 fs/file_table.c:245
>>>  task_work_run+0x19b/0x270 kernel/task_work.c:116
>>>  exit_task_work include/linux/task_work.h:21 [inline]
>>>  do_exit+0x18a3/0x2820 kernel/exit.c:878
>>>  do_group_exit+0x149/0x420 kernel/exit.c:982
>>>  get_signal+0x76d/0x1790 kernel/signal.c:2318
>>>  do_signal+0xd2/0x2130 arch/x86/kernel/signal.c:808
>>>  exit_to_usermode_loop+0x17a/0x210 arch/x86/entry/common.c:157
>>>  prepare_exit_to_usermode arch/x86/entry/common.c:194 [inline]
>>>  syscall_return_slowpath+0x3ba/0x410 arch/x86/entry/common.c:263
>>>  entry_SYSCALL_64_fastpath+0xbc/0xbe
>>>
>>> The buggy address belongs to the object at ffff88003053c1a0
>>>  which belongs to the cache kmalloc-64 of size 64
>>> The buggy address is located 0 bytes inside of
>>>  64-byte region [ffff88003053c1a0, ffff88003053c1e0)
>>> The buggy address belongs to the page:
>>> page:ffffea0000c14f00 count:1 mapcount:0 mapping:          (null)
>>> index:0x0 compound_mapcount: 0
>>> flags: 0x100000000008100(slab|head)
>>> raw: 0100000000008100 0000000000000000 0000000000000000 0000000100140014
>>> raw: ffffea0000c2f520 ffffea0000e20aa0 ffff88003e80f740 0000000000000000
>>> page dumped because: kasan: bad access detected
>>>
>>> Memory state around the buggy address:
>>>  ffff88003053c080: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>>>  ffff88003053c100: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>>> >ffff88003053c180: fc fc fc fc fb fb fb fb fb fb fb fb fc fc fc fc
>>>                                ^
>>>  ffff88003053c200: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>>>  ffff88003053c280: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>>> ==================================================================
>>
>> I have the feeling that ip_mc_clear_src() is called too soon.
>>
>> We should call it once all users have released their reference.
>
> In this case, ip_mc_clear_src() is called when the ->users hits
> zero, therefore I think we possibly miss the refcnt on ->users.
>
> diff --git a/net/ipv4/igmp.c b/net/ipv4/igmp.c
> index 44fd86d..c2f3347 100644
> --- a/net/ipv4/igmp.c
> +++ b/net/ipv4/igmp.c
> @@ -1886,6 +1886,7 @@ static int ip_mc_del_src(struct in_device
> *in_dev, __be32 *pmca, int sfmode,
>                 igmp_ifc_event(pmc->interface);
>  #endif
>         }
> +       pmc->users--;
>  out_unlock:
>         spin_unlock_bh(&pmc->lock);
>         return err;
> @@ -2025,6 +2026,7 @@ static int ip_mc_add_src(struct in_device
> *in_dev, __be32 *pmca, int sfmode,
>  #ifdef CONFIG_IP_MULTICAST
>         sf_markstate(pmc);
>  #endif
> +       pmc->users++;
>         isexclude = pmc->sfmode == MCAST_EXCLUDE;
>         if (!delta)
>                 pmc->sfcount[sfmode]++;


The issue here is the timer firing while ip_mc_clear_src() has been
already called.

My patch should fix the problem.

Or another one using del_timer_sync() instead of del_timer() in
igmp_stop_timer(), but such a change would be more invasive,
since the del_timer_sync() would need to happen while im->lock
spinlock is not held.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: net/ipv4: use-after-free in add_grec
  2017-05-31 23:55     ` Eric Dumazet
@ 2017-06-01  0:13       ` Eric Dumazet
  2017-06-01 12:01         ` Andrey Konovalov
  0 siblings, 1 reply; 18+ messages in thread
From: Eric Dumazet @ 2017-06-01  0:13 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Cong Wang, Andrey Konovalov, David S. Miller, Alexey Kuznetsov,
	James Morris, Hideaki YOSHIFUJI, Patrick McHardy, netdev, LKML,
	Dmitry Vyukov, Kostya Serebryany, syzkaller

On Wed, 2017-05-31 at 16:55 -0700, Eric Dumazet wrote:

> The issue here is the timer firing while ip_mc_clear_src() has been
> already called.
> 
> My patch should fix the problem.
> 
> Or another one using del_timer_sync() instead of del_timer() in
> igmp_stop_timer(), but such a change would be more invasive,
> since the del_timer_sync() would need to happen while im->lock
> spinlock is not held.

BTW, I guess that Andrey could try to add a delay to trigger the bug
more often.

diff --git a/net/ipv4/igmp.c b/net/ipv4/igmp.c
index 44fd86de2823dd17de16276a8ec01b190e69b8b4..84fff17daab0832c470a613b29b2aaade07cec0a 100644
--- a/net/ipv4/igmp.c
+++ b/net/ipv4/igmp.c
@@ -798,7 +798,7 @@ static void igmp_timer_expire(unsigned long data)
 	}
 	im->reporter = 1;
 	spin_unlock(&im->lock);
-
+	udelay(10000);
 	if (IGMP_V1_SEEN(in_dev))
 		igmp_send_report(in_dev, im, IGMP_HOST_MEMBERSHIP_REPORT);
 	else if (IGMP_V2_SEEN(in_dev))

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: net/ipv4: use-after-free in add_grec
  2017-06-01  0:13       ` Eric Dumazet
@ 2017-06-01 12:01         ` Andrey Konovalov
  0 siblings, 0 replies; 18+ messages in thread
From: Andrey Konovalov @ 2017-06-01 12:01 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Eric Dumazet, Cong Wang, David S. Miller, Alexey Kuznetsov,
	James Morris, Hideaki YOSHIFUJI, Patrick McHardy, netdev, LKML,
	Dmitry Vyukov, Kostya Serebryany, syzkaller

On Thu, Jun 1, 2017 at 2:13 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Wed, 2017-05-31 at 16:55 -0700, Eric Dumazet wrote:
>
>> The issue here is the timer firing while ip_mc_clear_src() has been
>> already called.
>>
>> My patch should fix the problem.
>>
>> Or another one using del_timer_sync() instead of del_timer() in
>> igmp_stop_timer(), but such a change would be more invasive,
>> since the del_timer_sync() would need to happen while im->lock
>> spinlock is not held.
>
> BTW, I guess that Andrey could try to add a delay to trigger the bug
> more often.

Applied, now testing with your patch.

Thanks!

>
> diff --git a/net/ipv4/igmp.c b/net/ipv4/igmp.c
> index 44fd86de2823dd17de16276a8ec01b190e69b8b4..84fff17daab0832c470a613b29b2aaade07cec0a 100644
> --- a/net/ipv4/igmp.c
> +++ b/net/ipv4/igmp.c
> @@ -798,7 +798,7 @@ static void igmp_timer_expire(unsigned long data)
>         }
>         im->reporter = 1;
>         spin_unlock(&im->lock);
> -
> +       udelay(10000);
>         if (IGMP_V1_SEEN(in_dev))
>                 igmp_send_report(in_dev, im, IGMP_HOST_MEMBERSHIP_REPORT);
>         else if (IGMP_V2_SEEN(in_dev))
>
>
> --
> You received this message because you are subscribed to the Google Groups "syzkaller" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to syzkaller+unsubscribe@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH net] ipv4: igmp: fix a use after free
  2017-05-31  9:46 net/ipv4: use-after-free in add_grec Andrey Konovalov
  2017-05-31 16:12 ` Eric Dumazet
@ 2017-06-08 13:43 ` Eric Dumazet
  2017-06-08 18:22   ` Xin Long
  1 sibling, 1 reply; 18+ messages in thread
From: Eric Dumazet @ 2017-06-08 13:43 UTC (permalink / raw)
  To: Andrey Konovalov; +Cc: David S. Miller, netdev

From: Eric Dumazet <edumazet@google.com>

Andrey reported a use-after-free in add_grec(), courtesy of syzkaller.

Problem here is that igmp_stop_timer() uses a del_timer(), so we can not
guarantee that another cpu is not servicing the timer.

Therefore, if igmp_group_dropped() call from ip_mc_dec_group() is
immediately followed by ip_mc_clear_src(), ip_mc_clear_src() might free
memory that could be used by the other cpu servicing the timer.

To fix this issue, we should defer the memory freeing
(ip_mc_clear_src()) to the point all references to (struct
ip_mc_list)->refcnt have been released.
This happens in ip_ma_put()


==================================================================
BUG: KASAN: use-after-free in add_grec+0x101e/0x1090 net/ipv4/igmp.c:473
Read of size 8 at addr ffff88003053c1a0 by task swapper/0/0

CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.12.0-rc3+ #370
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Call Trace:
 <IRQ>
 __dump_stack lib/dump_stack.c:16 [inline]
 dump_stack+0x292/0x395 lib/dump_stack.c:52
 print_address_description+0x73/0x280 mm/kasan/report.c:252
 kasan_report_error mm/kasan/report.c:351 [inline]
 kasan_report+0x22b/0x340 mm/kasan/report.c:408
 __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:429
 add_grec+0x101e/0x1090 net/ipv4/igmp.c:473
 igmpv3_send_cr net/ipv4/igmp.c:663 [inline]
 igmp_ifc_timer_expire+0x46d/0xa80 net/ipv4/igmp.c:768
IPVS: length: 51 != 8
 call_timer_fn+0x23f/0x800 kernel/time/timer.c:1268
 expire_timers kernel/time/timer.c:1307 [inline]
 __run_timers+0x94e/0xcd0 kernel/time/timer.c:1601
 run_timer_softirq+0x21/0x80 kernel/time/timer.c:1614
 __do_softirq+0x2fb/0xb99 kernel/softirq.c:284
 invoke_softirq kernel/softirq.c:364 [inline]
 irq_exit+0x19e/0x1d0 kernel/softirq.c:405
 exiting_irq arch/x86/include/asm/apic.h:652 [inline]
 smp_apic_timer_interrupt+0x76/0xa0 arch/x86/kernel/apic/apic.c:966
 apic_timer_interrupt+0x93/0xa0 arch/x86/entry/entry_64.S:481
RIP: 0010:native_safe_halt+0x6/0x10 arch/x86/include/asm/irqflags.h:53
RSP: 0018:ffffffff85a079a8 EFLAGS: 00000282 ORIG_RAX: ffffffffffffff10
RAX: dffffc0000000020 RBX: 1ffffffff0b40f38 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffffffff85a2a9e4
RBP: ffffffff85a079a8 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: dffffc0000000000
R13: ffffffff85a07a60 R14: ffffffff86171338 R15: 1ffffffff0b40f5b
 </IRQ>
 arch_safe_halt arch/x86/include/asm/paravirt.h:98 [inline]
 default_idle+0x8f/0x440 arch/x86/kernel/process.c:341
 arch_cpu_idle+0xa/0x10 arch/x86/kernel/process.c:332
 default_idle_call+0x36/0x60 kernel/sched/idle.c:98
 cpuidle_idle_call kernel/sched/idle.c:156 [inline]
 do_idle+0x348/0x420 kernel/sched/idle.c:245
 cpu_startup_entry+0x18/0x20 kernel/sched/idle.c:350
 rest_init+0x18d/0x1a0 init/main.c:415
 start_kernel+0x747/0x779 init/main.c:679
 x86_64_start_reservations+0x2a/0x2c arch/x86/kernel/head64.c:196
 x86_64_start_kernel+0x132/0x141 arch/x86/kernel/head64.c:177
 secondary_startup_64+0x9f/0x9f arch/x86/kernel/head_64.S:304

Allocated by task 30543:
 save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:59
 save_stack+0x43/0xd0 mm/kasan/kasan.c:513
 set_track mm/kasan/kasan.c:525 [inline]
 kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:617
 kmem_cache_alloc_trace+0x82/0x270 mm/slub.c:2745
 kmalloc include/linux/slab.h:492 [inline]
 kzalloc include/linux/slab.h:665 [inline]
 ip_mc_add1_src net/ipv4/igmp.c:1909 [inline]
 ip_mc_add_src+0x6cd/0x1020 net/ipv4/igmp.c:2033
 ip_mc_msfilter+0x5e5/0xcf0 net/ipv4/igmp.c:2403
 do_ip_setsockopt.isra.12+0x2d47/0x38c0 net/ipv4/ip_sockglue.c:959
 ip_setsockopt+0x3a/0xb0 net/ipv4/ip_sockglue.c:1256
 tcp_setsockopt+0x82/0xd0 net/ipv4/tcp.c:2740
 sock_common_setsockopt+0x95/0xd0 net/core/sock.c:2844
 SYSC_setsockopt net/socket.c:1798 [inline]
 SyS_setsockopt+0x270/0x3a0 net/socket.c:1777
 entry_SYSCALL_64_fastpath+0x1f/0xbe

Freed by task 30543:
 save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:59
 save_stack+0x43/0xd0 mm/kasan/kasan.c:513
 set_track mm/kasan/kasan.c:525 [inline]
 kasan_slab_free+0x72/0xc0 mm/kasan/kasan.c:590
 slab_free_hook mm/slub.c:1357 [inline]
 slab_free_freelist_hook mm/slub.c:1379 [inline]
 slab_free mm/slub.c:2961 [inline]
 kfree+0xe8/0x2b0 mm/slub.c:3882
 ip_mc_clear_src+0x69/0x1c0 net/ipv4/igmp.c:2078
 ip_mc_dec_group+0x19a/0x470 net/ipv4/igmp.c:1618
 ip_mc_drop_socket+0x145/0x230 net/ipv4/igmp.c:2609
 inet_release+0x4e/0x1c0 net/ipv4/af_inet.c:411
 sock_release+0x8d/0x1e0 net/socket.c:597
 sock_close+0x16/0x20 net/socket.c:1072
 __fput+0x332/0x7f0 fs/file_table.c:209
 ____fput+0x15/0x20 fs/file_table.c:245
 task_work_run+0x19b/0x270 kernel/task_work.c:116
 exit_task_work include/linux/task_work.h:21 [inline]
 do_exit+0x18a3/0x2820 kernel/exit.c:878
 do_group_exit+0x149/0x420 kernel/exit.c:982
 get_signal+0x76d/0x1790 kernel/signal.c:2318
 do_signal+0xd2/0x2130 arch/x86/kernel/signal.c:808
 exit_to_usermode_loop+0x17a/0x210 arch/x86/entry/common.c:157
 prepare_exit_to_usermode arch/x86/entry/common.c:194 [inline]
 syscall_return_slowpath+0x3ba/0x410 arch/x86/entry/common.c:263
 entry_SYSCALL_64_fastpath+0xbc/0xbe

The buggy address belongs to the object at ffff88003053c1a0
 which belongs to the cache kmalloc-64 of size 64
The buggy address is located 0 bytes inside of
 64-byte region [ffff88003053c1a0, ffff88003053c1e0)
The buggy address belongs to the page:
page:ffffea0000c14f00 count:1 mapcount:0 mapping:          (null)
index:0x0 compound_mapcount: 0
flags: 0x100000000008100(slab|head)
raw: 0100000000008100 0000000000000000 0000000000000000 0000000100140014
raw: ffffea0000c2f520 ffffea0000e20aa0 ffff88003e80f740 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 ffff88003053c080: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 ffff88003053c100: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>ffff88003053c180: fc fc fc fc fb fb fb fb fb fb fb fb fc fc fc fc
                               ^
 ffff88003053c200: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 ffff88003053c280: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
==================================================================

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Andrey Konovalov <andreyknvl@google.com>
---
 net/ipv4/igmp.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/ipv4/igmp.c b/net/ipv4/igmp.c
index 44fd86de2823dd17de16276a8ec01b190e69b8b4..f172d251c5f2cda8fc8bedd65fa16cdfe33a4d40 100644
--- a/net/ipv4/igmp.c
+++ b/net/ipv4/igmp.c
@@ -174,6 +174,7 @@ static int ip_mc_add_src(struct in_device *in_dev, __be32 *pmca, int sfmode,
 static void ip_ma_put(struct ip_mc_list *im)
 {
 	if (atomic_dec_and_test(&im->refcnt)) {
+		ip_mc_clear_src(im);
 		in_dev_put(im->interface);
 		kfree_rcu(im, rcu);
 	}
@@ -1615,7 +1616,6 @@ void ip_mc_dec_group(struct in_device *in_dev, __be32 addr)
 				*ip = i->next_rcu;
 				in_dev->mc_count--;
 				igmp_group_dropped(i);
-				ip_mc_clear_src(i);
 
 				if (!in_dev->dead)
 					ip_rt_multicast_event(in_dev);

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH net] ipv4: igmp: fix a use after free
  2017-06-08 13:43 ` [PATCH net] ipv4: igmp: fix a use after free Eric Dumazet
@ 2017-06-08 18:22   ` Xin Long
  2017-06-08 20:33     ` Eric Dumazet
  0 siblings, 1 reply; 18+ messages in thread
From: Xin Long @ 2017-06-08 18:22 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Andrey Konovalov, David S. Miller, netdev

On Thu, Jun 8, 2017 at 9:43 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> From: Eric Dumazet <edumazet@google.com>
>
> Andrey reported a use-after-free in add_grec(), courtesy of syzkaller.
>
> Problem here is that igmp_stop_timer() uses a del_timer(), so we can not
> guarantee that another cpu is not servicing the timer.
>
> Therefore, if igmp_group_dropped() call from ip_mc_dec_group() is
> immediately followed by ip_mc_clear_src(), ip_mc_clear_src() might free
> memory that could be used by the other cpu servicing the timer.
>
> To fix this issue, we should defer the memory freeing
> (ip_mc_clear_src()) to the point all references to (struct
> ip_mc_list)->refcnt have been released.
> This happens in ip_ma_put()
>
>
> ==================================================================
> BUG: KASAN: use-after-free in add_grec+0x101e/0x1090 net/ipv4/igmp.c:473
> Read of size 8 at addr ffff88003053c1a0 by task swapper/0/0
>
> CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.12.0-rc3+ #370
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
> Call Trace:
>  <IRQ>
>  __dump_stack lib/dump_stack.c:16 [inline]
>  dump_stack+0x292/0x395 lib/dump_stack.c:52
>  print_address_description+0x73/0x280 mm/kasan/report.c:252
>  kasan_report_error mm/kasan/report.c:351 [inline]
>  kasan_report+0x22b/0x340 mm/kasan/report.c:408
>  __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:429
>  add_grec+0x101e/0x1090 net/ipv4/igmp.c:473
>  igmpv3_send_cr net/ipv4/igmp.c:663 [inline]
>  igmp_ifc_timer_expire+0x46d/0xa80 net/ipv4/igmp.c:768
the call trace is igmp_ifc_timer_expire -> igmpv3_send_cr -> add_grec
and the timer should be in_dev->mr_ifc_timer.
but igmp_stop_timer you mentioned is used to stop im->timer

It's possible that ip_mc_clear_src is done in ip_ma_put()
while igmp_ifc_timer_expire is still using ip_mc_list under
rcu_read_lock(). no ?

> IPVS: length: 51 != 8
>  call_timer_fn+0x23f/0x800 kernel/time/timer.c:1268
>  expire_timers kernel/time/timer.c:1307 [inline]
>  __run_timers+0x94e/0xcd0 kernel/time/timer.c:1601
>  run_timer_softirq+0x21/0x80 kernel/time/timer.c:1614
>  __do_softirq+0x2fb/0xb99 kernel/softirq.c:284
>  invoke_softirq kernel/softirq.c:364 [inline]
>  irq_exit+0x19e/0x1d0 kernel/softirq.c:405
>  exiting_irq arch/x86/include/asm/apic.h:652 [inline]
>  smp_apic_timer_interrupt+0x76/0xa0 arch/x86/kernel/apic/apic.c:966
>  apic_timer_interrupt+0x93/0xa0 arch/x86/entry/entry_64.S:481
> RIP: 0010:native_safe_halt+0x6/0x10 arch/x86/include/asm/irqflags.h:53
> RSP: 0018:ffffffff85a079a8 EFLAGS: 00000282 ORIG_RAX: ffffffffffffff10
> RAX: dffffc0000000020 RBX: 1ffffffff0b40f38 RCX: 0000000000000000
> RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffffffff85a2a9e4
> RBP: ffffffff85a079a8 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000000 R12: dffffc0000000000
> R13: ffffffff85a07a60 R14: ffffffff86171338 R15: 1ffffffff0b40f5b
>  </IRQ>
>  arch_safe_halt arch/x86/include/asm/paravirt.h:98 [inline]
>  default_idle+0x8f/0x440 arch/x86/kernel/process.c:341
>  arch_cpu_idle+0xa/0x10 arch/x86/kernel/process.c:332
>  default_idle_call+0x36/0x60 kernel/sched/idle.c:98
>  cpuidle_idle_call kernel/sched/idle.c:156 [inline]
>  do_idle+0x348/0x420 kernel/sched/idle.c:245
>  cpu_startup_entry+0x18/0x20 kernel/sched/idle.c:350
>  rest_init+0x18d/0x1a0 init/main.c:415
>  start_kernel+0x747/0x779 init/main.c:679
>  x86_64_start_reservations+0x2a/0x2c arch/x86/kernel/head64.c:196
>  x86_64_start_kernel+0x132/0x141 arch/x86/kernel/head64.c:177
>  secondary_startup_64+0x9f/0x9f arch/x86/kernel/head_64.S:304
>
> Allocated by task 30543:
>  save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:59
>  save_stack+0x43/0xd0 mm/kasan/kasan.c:513
>  set_track mm/kasan/kasan.c:525 [inline]
>  kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:617
>  kmem_cache_alloc_trace+0x82/0x270 mm/slub.c:2745
>  kmalloc include/linux/slab.h:492 [inline]
>  kzalloc include/linux/slab.h:665 [inline]
>  ip_mc_add1_src net/ipv4/igmp.c:1909 [inline]
>  ip_mc_add_src+0x6cd/0x1020 net/ipv4/igmp.c:2033
>  ip_mc_msfilter+0x5e5/0xcf0 net/ipv4/igmp.c:2403
>  do_ip_setsockopt.isra.12+0x2d47/0x38c0 net/ipv4/ip_sockglue.c:959
>  ip_setsockopt+0x3a/0xb0 net/ipv4/ip_sockglue.c:1256
>  tcp_setsockopt+0x82/0xd0 net/ipv4/tcp.c:2740
>  sock_common_setsockopt+0x95/0xd0 net/core/sock.c:2844
>  SYSC_setsockopt net/socket.c:1798 [inline]
>  SyS_setsockopt+0x270/0x3a0 net/socket.c:1777
>  entry_SYSCALL_64_fastpath+0x1f/0xbe
>
> Freed by task 30543:
>  save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:59
>  save_stack+0x43/0xd0 mm/kasan/kasan.c:513
>  set_track mm/kasan/kasan.c:525 [inline]
>  kasan_slab_free+0x72/0xc0 mm/kasan/kasan.c:590
>  slab_free_hook mm/slub.c:1357 [inline]
>  slab_free_freelist_hook mm/slub.c:1379 [inline]
>  slab_free mm/slub.c:2961 [inline]
>  kfree+0xe8/0x2b0 mm/slub.c:3882
>  ip_mc_clear_src+0x69/0x1c0 net/ipv4/igmp.c:2078
>  ip_mc_dec_group+0x19a/0x470 net/ipv4/igmp.c:1618
>  ip_mc_drop_socket+0x145/0x230 net/ipv4/igmp.c:2609
>  inet_release+0x4e/0x1c0 net/ipv4/af_inet.c:411
>  sock_release+0x8d/0x1e0 net/socket.c:597
>  sock_close+0x16/0x20 net/socket.c:1072
>  __fput+0x332/0x7f0 fs/file_table.c:209
>  ____fput+0x15/0x20 fs/file_table.c:245
>  task_work_run+0x19b/0x270 kernel/task_work.c:116
>  exit_task_work include/linux/task_work.h:21 [inline]
>  do_exit+0x18a3/0x2820 kernel/exit.c:878
>  do_group_exit+0x149/0x420 kernel/exit.c:982
>  get_signal+0x76d/0x1790 kernel/signal.c:2318
>  do_signal+0xd2/0x2130 arch/x86/kernel/signal.c:808
>  exit_to_usermode_loop+0x17a/0x210 arch/x86/entry/common.c:157
>  prepare_exit_to_usermode arch/x86/entry/common.c:194 [inline]
>  syscall_return_slowpath+0x3ba/0x410 arch/x86/entry/common.c:263
>  entry_SYSCALL_64_fastpath+0xbc/0xbe
>
> The buggy address belongs to the object at ffff88003053c1a0
>  which belongs to the cache kmalloc-64 of size 64
> The buggy address is located 0 bytes inside of
>  64-byte region [ffff88003053c1a0, ffff88003053c1e0)
> The buggy address belongs to the page:
> page:ffffea0000c14f00 count:1 mapcount:0 mapping:          (null)
> index:0x0 compound_mapcount: 0
> flags: 0x100000000008100(slab|head)
> raw: 0100000000008100 0000000000000000 0000000000000000 0000000100140014
> raw: ffffea0000c2f520 ffffea0000e20aa0 ffff88003e80f740 0000000000000000
> page dumped because: kasan: bad access detected
>
> Memory state around the buggy address:
>  ffff88003053c080: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>  ffff88003053c100: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>>ffff88003053c180: fc fc fc fc fb fb fb fb fb fb fb fb fc fc fc fc
>                                ^
>  ffff88003053c200: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>  ffff88003053c280: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
> ==================================================================
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Reported-by: Andrey Konovalov <andreyknvl@google.com>
> ---
>  net/ipv4/igmp.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/net/ipv4/igmp.c b/net/ipv4/igmp.c
> index 44fd86de2823dd17de16276a8ec01b190e69b8b4..f172d251c5f2cda8fc8bedd65fa16cdfe33a4d40 100644
> --- a/net/ipv4/igmp.c
> +++ b/net/ipv4/igmp.c
> @@ -174,6 +174,7 @@ static int ip_mc_add_src(struct in_device *in_dev, __be32 *pmca, int sfmode,
>  static void ip_ma_put(struct ip_mc_list *im)
>  {
>         if (atomic_dec_and_test(&im->refcnt)) {
> +               ip_mc_clear_src(im);
>                 in_dev_put(im->interface);
>                 kfree_rcu(im, rcu);
>         }
> @@ -1615,7 +1616,6 @@ void ip_mc_dec_group(struct in_device *in_dev, __be32 addr)
>                                 *ip = i->next_rcu;
>                                 in_dev->mc_count--;
>                                 igmp_group_dropped(i);
> -                               ip_mc_clear_src(i);
>
>                                 if (!in_dev->dead)
>                                         ip_rt_multicast_event(in_dev);
>
>
>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH net] ipv4: igmp: fix a use after free
  2017-06-08 18:22   ` Xin Long
@ 2017-06-08 20:33     ` Eric Dumazet
  2017-06-09  0:59       ` Cong Wang
  0 siblings, 1 reply; 18+ messages in thread
From: Eric Dumazet @ 2017-06-08 20:33 UTC (permalink / raw)
  To: Xin Long; +Cc: Andrey Konovalov, David S. Miller, netdev

On Fri, 2017-06-09 at 02:22 +0800, Xin Long wrote:
> On Thu, Jun 8, 2017 at 9:43 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> > From: Eric Dumazet <edumazet@google.com>
> >
> > Andrey reported a use-after-free in add_grec(), courtesy of syzkaller.
> >
> > Problem here is that igmp_stop_timer() uses a del_timer(), so we can not
> > guarantee that another cpu is not servicing the timer.
> >
> > Therefore, if igmp_group_dropped() call from ip_mc_dec_group() is
> > immediately followed by ip_mc_clear_src(), ip_mc_clear_src() might free
> > memory that could be used by the other cpu servicing the timer.
> >
> > To fix this issue, we should defer the memory freeing
> > (ip_mc_clear_src()) to the point all references to (struct
> > ip_mc_list)->refcnt have been released.
> > This happens in ip_ma_put()
> >
> >
> > ==================================================================
> > BUG: KASAN: use-after-free in add_grec+0x101e/0x1090 net/ipv4/igmp.c:473
> > Read of size 8 at addr ffff88003053c1a0 by task swapper/0/0
> >
> > CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.12.0-rc3+ #370
> > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
> > Call Trace:
> >  <IRQ>
> >  __dump_stack lib/dump_stack.c:16 [inline]
> >  dump_stack+0x292/0x395 lib/dump_stack.c:52
> >  print_address_description+0x73/0x280 mm/kasan/report.c:252
> >  kasan_report_error mm/kasan/report.c:351 [inline]
> >  kasan_report+0x22b/0x340 mm/kasan/report.c:408
> >  __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:429
> >  add_grec+0x101e/0x1090 net/ipv4/igmp.c:473
> >  igmpv3_send_cr net/ipv4/igmp.c:663 [inline]
> >  igmp_ifc_timer_expire+0x46d/0xa80 net/ipv4/igmp.c:768
> the call trace is igmp_ifc_timer_expire -> igmpv3_send_cr -> add_grec
> and the timer should be in_dev->mr_ifc_timer.
> but igmp_stop_timer you mentioned is used to stop im->timer
> 
> It's possible that ip_mc_clear_src is done in ip_ma_put()
> while igmp_ifc_timer_expire is still using ip_mc_list under
> rcu_read_lock(). no ?

You might be right. I looked at the freeing side 

>  kfree+0xe8/0x2b0 mm/slub.c:3882
>  ip_mc_clear_src+0x69/0x1c0 net/ipv4/igmp.c:2078
>  ip_mc_dec_group+0x19a/0x470 net/ipv4/igmp.c:1618
>  ip_mc_drop_socket+0x145/0x230 net/ipv4/igmp.c:2609
>  inet_release+0x4e/0x1c0 net/ipv4/af_inet.c:411
>  sock_release+0x8d/0x1e0 net/socket.c:597
>  sock_close+0x16/0x20 net/socket.c:1072

Then I tried to catch a problem happening on another cpu, and found one.

I mentioned (in https://lkml.org/lkml/2017/5/31/619 ) that we might need
to defer freeing after rcu grace period but for some reason decided it
was not needed.

What about :

diff --git a/net/ipv4/igmp.c b/net/ipv4/igmp.c
index 44fd86de2823dd17de16276a8ec01b190e69b8b4..80932880af861046849d7dbac5f5aa0a1117f166 100644
--- a/net/ipv4/igmp.c
+++ b/net/ipv4/igmp.c
@@ -171,12 +171,20 @@ static void ip_mc_clear_src(struct ip_mc_list *pmc);
 static int ip_mc_add_src(struct in_device *in_dev, __be32 *pmca, int sfmode,
 			 int sfcount, __be32 *psfsrc, int delta);
 
+
+static void ip_mc_list_reclaim(struct rcu_head *head)
+{
+	struct ip_mc_list *im = container_of(head, struct ip_mc_list, rcu);
+
+	ip_mc_clear_src(im);
+	in_dev_put(im->interface);
+	kfree(im);
+}
+
 static void ip_ma_put(struct ip_mc_list *im)
 {
-	if (atomic_dec_and_test(&im->refcnt)) {
-		in_dev_put(im->interface);
-		kfree_rcu(im, rcu);
-	}
+	if (atomic_dec_and_test(&im->refcnt))
+		call_rcu(&im->rcu, ip_mc_list_reclaim);
 }
 
 #define for_each_pmc_rcu(in_dev, pmc)				\
@@ -1615,7 +1623,6 @@ void ip_mc_dec_group(struct in_device *in_dev, __be32 addr)
 				*ip = i->next_rcu;
 				in_dev->mc_count--;
 				igmp_group_dropped(i);
-				ip_mc_clear_src(i);
 
 				if (!in_dev->dead)
 					ip_rt_multicast_event(in_dev);

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH net] ipv4: igmp: fix a use after free
  2017-06-08 20:33     ` Eric Dumazet
@ 2017-06-09  0:59       ` Cong Wang
  2017-06-09  1:37         ` Eric Dumazet
  2017-06-09  6:24         ` Xin Long
  0 siblings, 2 replies; 18+ messages in thread
From: Cong Wang @ 2017-06-09  0:59 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Xin Long, Andrey Konovalov, David S. Miller, netdev

On Thu, Jun 8, 2017 at 1:33 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> I mentioned (in https://lkml.org/lkml/2017/5/31/619 ) that we might need
> to defer freeing after rcu grace period but for some reason decided it
> was not needed.

This one makes sense, it is the second time I saw the use-after-free
in igmp code, both are because we don't respect the RCU rule to free
an element in the list.

>
> What about :

But not sure if all ip_ma_put() callers want ip_mc_clear_src().

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH net] ipv4: igmp: fix a use after free
  2017-06-09  0:59       ` Cong Wang
@ 2017-06-09  1:37         ` Eric Dumazet
  2017-06-09  6:05           ` Cong Wang
  2017-06-09  6:24         ` Xin Long
  1 sibling, 1 reply; 18+ messages in thread
From: Eric Dumazet @ 2017-06-09  1:37 UTC (permalink / raw)
  To: Cong Wang; +Cc: Xin Long, Andrey Konovalov, David S. Miller, netdev

On Thu, 2017-06-08 at 17:59 -0700, Cong Wang wrote:
> On Thu, Jun 8, 2017 at 1:33 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> > I mentioned (in https://lkml.org/lkml/2017/5/31/619 ) that we might need
> > to defer freeing after rcu grace period but for some reason decided it
> > was not needed.
> 
> This one makes sense, it is the second time I saw the use-after-free
> in igmp code, both are because we don't respect the RCU rule to free
> an element in the list.
> 
> >
> > What about :
> 
> But not sure if all ip_ma_put() callers want ip_mc_clear_src().

That would lead to a memory leak if this was the case ?

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH net] ipv4: igmp: fix a use after free
  2017-06-09  1:37         ` Eric Dumazet
@ 2017-06-09  6:05           ` Cong Wang
  2017-06-09  6:27             ` Xin Long
  0 siblings, 1 reply; 18+ messages in thread
From: Cong Wang @ 2017-06-09  6:05 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Xin Long, Andrey Konovalov, David S. Miller, netdev

On Thu, Jun 8, 2017 at 6:37 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Thu, 2017-06-08 at 17:59 -0700, Cong Wang wrote:
>> On Thu, Jun 8, 2017 at 1:33 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
>> > I mentioned (in https://lkml.org/lkml/2017/5/31/619 ) that we might need
>> > to defer freeing after rcu grace period but for some reason decided it
>> > was not needed.
>>
>> This one makes sense, it is the second time I saw the use-after-free
>> in igmp code, both are because we don't respect the RCU rule to free
>> an element in the list.
>>
>> >
>> > What about :
>>
>> But not sure if all ip_ma_put() callers want ip_mc_clear_src().
>
> That would lead to a memory leak if this was the case ?

Maybe, but looking at igmpv3_clear_delrec() again, seems
we can just acquire pmc->lock in ip_mc_clear_src() to serialize
with the readers?

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH net] ipv4: igmp: fix a use after free
  2017-06-09  0:59       ` Cong Wang
  2017-06-09  1:37         ` Eric Dumazet
@ 2017-06-09  6:24         ` Xin Long
  2017-06-09 15:56           ` Eric Dumazet
  1 sibling, 1 reply; 18+ messages in thread
From: Xin Long @ 2017-06-09  6:24 UTC (permalink / raw)
  To: Cong Wang; +Cc: Eric Dumazet, Andrey Konovalov, David S. Miller, netdev

On Fri, Jun 9, 2017 at 8:59 AM, Cong Wang <xiyou.wangcong@gmail.com> wrote:

> On Thu, Jun 8, 2017 at 1:33 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
>> I mentioned (in https://lkml.org/lkml/2017/5/31/619 ) that we might need
>> to defer freeing after rcu grace period but for some reason decided it
>> was not needed.
Yes, this one could fix it.

>
> This one makes sense, it is the second time I saw the use-after-free
> in igmp code, both are because we don't respect the RCU rule to free
> an element in the list.
>
>>
>> What about :
>
> But not sure if all ip_ma_put() callers want ip_mc_clear_src().
If that's problem, there may be another way:

  leave ip_mc_clear_src as it is, just add pmc->lock to protect this call.

this use-after-free was actually caused by using pmc->sources/tomb
in add_grec while ip_mc_clear_src is freeing them. add_grec is already
under pmc->lock, so to add pmc->lock for ip_mc_clear_src should be
enough to protect the list pmc->sources/tomb.

wdyt ?

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH net] ipv4: igmp: fix a use after free
  2017-06-09  6:05           ` Cong Wang
@ 2017-06-09  6:27             ` Xin Long
  0 siblings, 0 replies; 18+ messages in thread
From: Xin Long @ 2017-06-09  6:27 UTC (permalink / raw)
  To: Cong Wang; +Cc: Eric Dumazet, Andrey Konovalov, David S. Miller, netdev

On Fri, Jun 9, 2017 at 2:05 PM, Cong Wang <xiyou.wangcong@gmail.com> wrote:
> On Thu, Jun 8, 2017 at 6:37 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
>> On Thu, 2017-06-08 at 17:59 -0700, Cong Wang wrote:
>>> On Thu, Jun 8, 2017 at 1:33 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
>>> > I mentioned (in https://lkml.org/lkml/2017/5/31/619 ) that we might need
>>> > to defer freeing after rcu grace period but for some reason decided it
>>> > was not needed.
>>>
>>> This one makes sense, it is the second time I saw the use-after-free
>>> in igmp code, both are because we don't respect the RCU rule to free
>>> an element in the list.
>>>
>>> >
>>> > What about :
>>>
>>> But not sure if all ip_ma_put() callers want ip_mc_clear_src().
>>
>> That would lead to a memory leak if this was the case ?
>
> Maybe, but looking at igmpv3_clear_delrec() again, seems
> we can just acquire pmc->lock in ip_mc_clear_src() to serialize
> with the readers?
just refreshed the page and saw your reply, the same way as
I just replied

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH net] ipv4: igmp: fix a use after free
  2017-06-09  6:24         ` Xin Long
@ 2017-06-09 15:56           ` Eric Dumazet
  2017-06-09 17:01             ` Cong Wang
  0 siblings, 1 reply; 18+ messages in thread
From: Eric Dumazet @ 2017-06-09 15:56 UTC (permalink / raw)
  To: Xin Long; +Cc: Cong Wang, Andrey Konovalov, David S. Miller, netdev

On Fri, 2017-06-09 at 14:24 +0800, Xin Long wrote:
> On Fri, Jun 9, 2017 at 8:59 AM, Cong Wang <xiyou.wangcong@gmail.com> wrote:
> 
> > On Thu, Jun 8, 2017 at 1:33 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> >> I mentioned (in https://lkml.org/lkml/2017/5/31/619 ) that we might need
> >> to defer freeing after rcu grace period but for some reason decided it
> >> was not needed.
> Yes, this one could fix it.
> 
> >
> > This one makes sense, it is the second time I saw the use-after-free
> > in igmp code, both are because we don't respect the RCU rule to free
> > an element in the list.
> >
> >>
> >> What about :
> >
> > But not sure if all ip_ma_put() callers want ip_mc_clear_src().
> If that's problem, there may be another way:
> 
>   leave ip_mc_clear_src as it is, just add pmc->lock to protect this call.
> 
> this use-after-free was actually caused by using pmc->sources/tomb
> in add_grec while ip_mc_clear_src is freeing them. add_grec is already
> under pmc->lock, so to add pmc->lock for ip_mc_clear_src should be
> enough to protect the list pmc->sources/tomb.
> 
> wdyt ?

This would we weird.

When we free skb components, we do not grab a spinlock.

When we free something, just make sure we must be the last user of it.

RCU rules -> Must respect RCU grace period before delete.

No need for extra spinlock.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH net] ipv4: igmp: fix a use after free
  2017-06-09 15:56           ` Eric Dumazet
@ 2017-06-09 17:01             ` Cong Wang
  2017-06-09 18:05               ` Xin Long
  0 siblings, 1 reply; 18+ messages in thread
From: Cong Wang @ 2017-06-09 17:01 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Xin Long, Andrey Konovalov, David S. Miller, netdev

On Fri, Jun 9, 2017 at 8:56 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Fri, 2017-06-09 at 14:24 +0800, Xin Long wrote:
>> On Fri, Jun 9, 2017 at 8:59 AM, Cong Wang <xiyou.wangcong@gmail.com> wrote:
>>
>> > On Thu, Jun 8, 2017 at 1:33 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
>> >> I mentioned (in https://lkml.org/lkml/2017/5/31/619 ) that we might need
>> >> to defer freeing after rcu grace period but for some reason decided it
>> >> was not needed.
>> Yes, this one could fix it.
>>
>> >
>> > This one makes sense, it is the second time I saw the use-after-free
>> > in igmp code, both are because we don't respect the RCU rule to free
>> > an element in the list.
>> >
>> >>
>> >> What about :
>> >
>> > But not sure if all ip_ma_put() callers want ip_mc_clear_src().
>> If that's problem, there may be another way:
>>
>>   leave ip_mc_clear_src as it is, just add pmc->lock to protect this call.
>>
>> this use-after-free was actually caused by using pmc->sources/tomb
>> in add_grec while ip_mc_clear_src is freeing them. add_grec is already
>> under pmc->lock, so to add pmc->lock for ip_mc_clear_src should be
>> enough to protect the list pmc->sources/tomb.
>>
>> wdyt ?
>
> This would we weird.
>
> When we free skb components, we do not grab a spinlock.
>
> When we free something, just make sure we must be the last user of it.
>
> RCU rules -> Must respect RCU grace period before delete.
>
> No need for extra spinlock.

This is what I thought in my first response, until I realized
it is not pure RCU, otherwise pmc->lock should not be taken
in igmpv3_send_cr(). It seems the code is mixing the use
of spinlock and RCU.

We need RCU anyway, ip_check_mc_rcu() is the real fast
path where we don't take spinlock. I think we will need more
work.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH net] ipv4: igmp: fix a use after free
  2017-06-09 17:01             ` Cong Wang
@ 2017-06-09 18:05               ` Xin Long
  2017-06-09 21:35                 ` Cong Wang
  0 siblings, 1 reply; 18+ messages in thread
From: Xin Long @ 2017-06-09 18:05 UTC (permalink / raw)
  To: Cong Wang; +Cc: Eric Dumazet, Andrey Konovalov, David S. Miller, netdev

On Sat, Jun 10, 2017 at 1:01 AM, Cong Wang <xiyou.wangcong@gmail.com> wrote:

> On Fri, Jun 9, 2017 at 8:56 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
>> On Fri, 2017-06-09 at 14:24 +0800, Xin Long wrote:
>>> On Fri, Jun 9, 2017 at 8:59 AM, Cong Wang <xiyou.wangcong@gmail.com> wrote:
>>>
>>> > On Thu, Jun 8, 2017 at 1:33 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
>>> >> I mentioned (in https://lkml.org/lkml/2017/5/31/619 ) that we might need
>>> >> to defer freeing after rcu grace period but for some reason decided it
>>> >> was not needed.
>>> Yes, this one could fix it.
>>>
>>> >
>>> > This one makes sense, it is the second time I saw the use-after-free
>>> > in igmp code, both are because we don't respect the RCU rule to free
>>> > an element in the list.
>>> >
>>> >>
>>> >> What about :
>>> >
>>> > But not sure if all ip_ma_put() callers want ip_mc_clear_src().
>>> If that's problem, there may be another way:
>>>
>>>   leave ip_mc_clear_src as it is, just add pmc->lock to protect this call.
>>>
>>> this use-after-free was actually caused by using pmc->sources/tomb
>>> in add_grec while ip_mc_clear_src is freeing them. add_grec is already
>>> under pmc->lock, so to add pmc->lock for ip_mc_clear_src should be
>>> enough to protect the list pmc->sources/tomb.
>>>
>>> wdyt ?
>>
>> This would we weird.
>>
>> When we free skb components, we do not grab a spinlock.
>>
>> When we free something, just make sure we must be the last user of it.
>>
>> RCU rules -> Must respect RCU grace period before delete.
>>
>> No need for extra spinlock.
>
> This is what I thought in my first response, until I realized
> it is not pure RCU, otherwise pmc->lock should not be taken
> in igmpv3_send_cr(). It seems the code is mixing the use
> of spinlock and RCU.
rcu lock is for pmc not being freed, and spinlock is for pmc's
members' modification. is there some rule these two locks
should be mixed?


>
> We need RCU anyway, ip_check_mc_rcu() is the real fast
> path where we don't take spinlock. I think we will need more
> work.
It seems all add_grec() callings needs spinlock, maybe  add_grec
modifies pmc's member. it's hard to drop spinlock.

from ip_check_mc_rcu you mentioned about, it should be right
to call ip_mc_clear_src after rcu grace period, like Eric's patch.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH net] ipv4: igmp: fix a use after free
  2017-06-09 18:05               ` Xin Long
@ 2017-06-09 21:35                 ` Cong Wang
  0 siblings, 0 replies; 18+ messages in thread
From: Cong Wang @ 2017-06-09 21:35 UTC (permalink / raw)
  To: Xin Long; +Cc: Eric Dumazet, Andrey Konovalov, David S. Miller, netdev

On Fri, Jun 9, 2017 at 11:05 AM, Xin Long <lucien.xin@gmail.com> wrote:
> On Sat, Jun 10, 2017 at 1:01 AM, Cong Wang <xiyou.wangcong@gmail.com> wrote:
>> This is what I thought in my first response, until I realized
>> it is not pure RCU, otherwise pmc->lock should not be taken
>> in igmpv3_send_cr(). It seems the code is mixing the use
>> of spinlock and RCU.
> rcu lock is for pmc not being freed, and spinlock is for pmc's
> members' modification. is there some rule these two locks
> should be mixed?
>

This is exactly why I said we are mixing RCU and spinlock.

>
>>
>> We need RCU anyway, ip_check_mc_rcu() is the real fast
>> path where we don't take spinlock. I think we will need more
>> work.
> It seems all add_grec() callings needs spinlock, maybe  add_grec
> modifies pmc's member. it's hard to drop spinlock.
>
> from ip_check_mc_rcu you mentioned about, it should be right
> to call ip_mc_clear_src after rcu grace period, like Eric's patch.

Well, more than just that, we need to use proper RCU API for
pmc->sources and you know it is a singly linked list...

I will work on this.

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2017-06-09 21:35 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-05-31  9:46 net/ipv4: use-after-free in add_grec Andrey Konovalov
2017-05-31 16:12 ` Eric Dumazet
2017-05-31 23:49   ` Cong Wang
2017-05-31 23:55     ` Eric Dumazet
2017-06-01  0:13       ` Eric Dumazet
2017-06-01 12:01         ` Andrey Konovalov
2017-06-08 13:43 ` [PATCH net] ipv4: igmp: fix a use after free Eric Dumazet
2017-06-08 18:22   ` Xin Long
2017-06-08 20:33     ` Eric Dumazet
2017-06-09  0:59       ` Cong Wang
2017-06-09  1:37         ` Eric Dumazet
2017-06-09  6:05           ` Cong Wang
2017-06-09  6:27             ` Xin Long
2017-06-09  6:24         ` Xin Long
2017-06-09 15:56           ` Eric Dumazet
2017-06-09 17:01             ` Cong Wang
2017-06-09 18:05               ` Xin Long
2017-06-09 21:35                 ` Cong Wang

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.