Re: [ovs-discuss] Double free in recent kernels after memleak fix

netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* Re: [ovs-discuss] Double free in recent kernels after memleak fix
       [not found] <CA+Sh73MJhqs7PBk6OV2AhzVjYvE1foUQUnwP5DwWR44LHZRZ9w@mail.gmail.com>
@ 2020-08-04 15:51 ` Gregory Rose
  2020-08-07 15:31   ` Johan Knöös
  0 siblings, 1 reply; 17+ messages in thread
From: Gregory Rose @ 2020-08-04 15:51 UTC (permalink / raw)
  To: Johan Knöös, bugs, Tonghao Zhang, Netdev; +Cc: joel



On 8/3/2020 12:01 PM, Johan Knöös via discuss wrote:
> Hi Open vSwitch contributors,
> 
> We have found openvswitch is causing double-freeing of memory. The
> issue was not present in kernel version 5.5.17 but is present in
> 5.6.14 and newer kernels.
> 
> After reverting the RCU commits below for debugging, enabling
> slub_debug, lockdep, and KASAN, we see the warnings at the end of this
> email in the kernel log (the last one shows the double-free). When I
> revert 50b0e61b32ee890a75b4377d5fbe770a86d6a4c1 ("net: openvswitch:
> fix possible memleak on destroy flow-table"), the symptoms disappear.
> While I have a reliable way to reproduce the issue, I unfortunately
> don't yet have a process that's amenable to sharing. Please take a
> look.
> 
> 189a6883dcf7 rcu: Remove kfree_call_rcu_nobatch()
> 77a40f97030b rcu: Remove kfree_rcu() special casing and lazy-callback handling
> e99637becb2e rcu: Add support for debug_objects debugging for kfree_rcu()
> 0392bebebf26 rcu: Add multiple in-flight batches of kfree_rcu() work
> 569d767087ef rcu: Make kfree_rcu() use a non-atomic ->monitor_todo
> a35d16905efc rcu: Add basic support for kfree_rcu() batching
> 
> Thanks,
> Johan Knöös

Let's add the author of the patch you reverted and the Linux netdev
mailing list.

- Greg

> 
> Traces:
> 
> ------------[ cut here ]------------
> WARNING: CPU: 30 PID: 0 at net/openvswitch/flow_table.c:272
> table_instance_flow_free+0x2fd/0x340 [openvswitch]
> Modules linked in: ...
> CPU: 30 PID: 0 Comm: swapper/30 Tainted: G            E     5.6.14+ #18
> Hardware name: ...
> RIP: 0010:table_instance_flow_free+0x2fd/0x340 [openvswitch]
> Code: c1 fa 1f 48 c1 e8 20 29 d0 41 39 c7 0f 8f 95 fe ff ff 48 83 c4
> 10 48 89 ef d1 fe 5b 5d 41 5c 41 5d 41 5e 41 5f e9 33 fb ff ff <0f> 0b
> e9 59 fe ff ff 0f 0b e8 65 f1 fe ff 85 c0 0f 85 9b fe ff ff
> RSP: 0018:ffff888c3e589da8 EFLAGS: 00010246
> RAX: 0000000000000000 RBX: ffff889f954ee580 RCX: dffffc0000000000
> RDX: 0000000000000007 RSI: 0000000000000003 RDI: 0000000000000246
> RBP: ffff888c295150a0 R08: ffffffff9297f341 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000000 R12: ffff889f1ed55000
> R13: ffff888b72efa020 R14: ffff888c24209480 R15: ffff888b731bb6f8
> FS:  0000000000000000(0000) GS:ffff888c3e580000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00000733feb8a700 CR3: 0000000ba726e004 CR4: 00000000003606e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
> <IRQ>
> table_instance_destroy+0xf9/0x1b0 [openvswitch]
> ? new_vport+0xb0/0xb0 [openvswitch]
> destroy_dp_rcu+0x12/0x50 [openvswitch]
> rcu_core+0x34d/0x9b0
> ? rcu_all_qs+0x90/0x90
> ? rcu_read_lock_sched_held+0xa5/0xc0
> ? rcu_read_lock_bh_held+0xc0/0xc0
> ? run_rebalance_domains+0x11d/0x140
> __do_softirq+0x128/0x55c
> irq_exit+0x101/0x110
> smp_apic_timer_interrupt+0xfd/0x2f0
> apic_timer_interrupt+0xf/0x20
> </IRQ>
> RIP: 0010:cpuidle_enter_state+0xda/0x5d0
> Code: 80 7c 24 10 00 74 17 9c 58 0f 1f 44 00 00 f6 c4 02 0f 85 be 04
> 00 00 31 ff e8 c2 1a 7a ff e8 9d 4d 84 ff fb 66 0f 1f 44 00 00 <45> 85
> ed 0f 88 b6 03 00 00 4d 63 f5 4b 8d 04 76 4e 8d 3c f5 00 00
> RSP: 0018:ffff888103f07d58 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13
> RAX: 0000000000000000 RBX: ffff888c3e5c1800 RCX: dffffc0000000000
> RDX: 0000000000000007 RSI: 0000000000000006 RDI: ffff888103ec88d4
> RBP: ffffffff945a3940 R08: ffffffff92982042 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000002
> R13: 0000000000000002 R14: 00000000000000d0 R15: ffffffff945a3a10
> ? lockdep_hardirqs_on+0x182/0x260
> ? cpuidle_enter_state+0xd3/0x5d0
> cpuidle_enter+0x3c/0x60
> do_idle+0x36a/0x450
> ? arch_cpu_idle_exit+0x40/0x40
> cpu_startup_entry+0x19/0x20
> start_secondary+0x21f/0x290
> ? set_cpu_sibling_map+0xcb0/0xcb0
> secondary_startup_64+0xa4/0xb0
> irq event stamp: 1626911
> hardirqs last  enabled at (1626910): [<ffffffff929c0227>] __call_rcu+0x1b7/0x3b0
> hardirqs last disabled at (1626911): [<ffffffff92804552>]
> trace_hardirqs_off_thunk+0x1a/0x1c
> softirqs last  enabled at (1626882): [<ffffffff928df0d5>] irq_enter+0x75/0x80
> softirqs last disabled at (1626883): [<ffffffff928df1e1>] irq_exit+0x101/0x110
> ---[ end trace 8dc48dec48bb79c0 ]---
> 
> 
> -------------------------------------------------------------------------------
> 
> 
> =============================
> WARNING: suspicious RCU usage
> 5.6.14+ #18 Tainted: G        W   E
> -----------------------------
> net/openvswitch/flow_table.c:239 suspicious rcu_dereference_protected() usage!
> \x0aother info that might help us debug this:\x0a
> \x0arcu_scheduler_active = 2, debug_locks = 1
> 1 lock held by swapper/30/0:
> #0: ffffffff94315e00 (rcu_callback){....}, at: rcu_core+0x395/0x9b0
> \x0astack backtrace:
> CPU: 30 PID: 0 Comm: swapper/30 Tainted: G        W   E     5.6.14+ #18
> Hardware name: ...
> Call Trace:
> <IRQ>
> dump_stack+0xb8/0x110
> table_instance_flow_free+0x332/0x340 [openvswitch]
> table_instance_destroy+0xf9/0x1b0 [openvswitch]
> ? new_vport+0xb0/0xb0 [openvswitch]
> destroy_dp_rcu+0x12/0x50 [openvswitch]
> rcu_core+0x34d/0x9b0
> ? rcu_all_qs+0x90/0x90
> ? rcu_read_lock_sched_held+0xa5/0xc0
> ? rcu_read_lock_bh_held+0xc0/0xc0
> ? run_rebalance_domains+0x11d/0x140
> __do_softirq+0x128/0x55c
> irq_exit+0x101/0x110
> smp_apic_timer_interrupt+0xfd/0x2f0
> apic_timer_interrupt+0xf/0x20
> </IRQ>
> RIP: 0010:cpuidle_enter_state+0xda/0x5d0
> Code: 80 7c 24 10 00 74 17 9c 58 0f 1f 44 00 00 f6 c4 02 0f 85 be 04
> 00 00 31 ff e8 c2 1a 7a ff e8 9d 4d 84 ff fb 66 0f 1f 44 00 00 <45> 85
> ed 0f 88 b6 03 00 00 4d 63 f5 4b 8d 04 76 4e 8d 3c f5 00 00
> RSP: 0018:ffff888103f07d58 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13
> RAX: 0000000000000000 RBX: ffff888c3e5c1800 RCX: dffffc0000000000
> RDX: 0000000000000007 RSI: 0000000000000006 RDI: ffff888103ec88d4
> RBP: ffffffff945a3940 R08: ffffffff92982042 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000002
> R13: 0000000000000002 R14: 00000000000000d0 R15: ffffffff945a3a10
> ? lockdep_hardirqs_on+0x182/0x260
> ? cpuidle_enter_state+0xd3/0x5d0
> cpuidle_enter+0x3c/0x60
> do_idle+0x36a/0x450
> ? arch_cpu_idle_exit+0x40/0x40
> cpu_startup_entry+0x19/0x20
> start_secondary+0x21f/0x290
> ? set_cpu_sibling_map+0xcb0/0xcb0
> secondary_startup_64+0xa4/0xb0
> 
> 
> -------------------------------------------------------------------------------
> 
> 
> ================================
> WARNING: inconsistent lock state
> 5.6.14+ #18 Tainted: G        W   E
> --------------------------------
> inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
> swapper/30/0 [HC0[0]:SC1[1]:HE1:SE0] takes:
> ffffffff943e24c0 (fs_reclaim){+.?.}, at: fs_reclaim_acquire.part.0+0x5/0x30
> {SOFTIRQ-ON-W} state was registered at:
>   lock_acquire+0xe5/0x1f0
>   fs_reclaim_acquire.part.0+0x25/0x30
>   kmem_cache_alloc_trace+0x30/0x2d0
>   alloc_workqueue_attrs+0x29/0xc0
>   workqueue_init+0x4e/0x3c5
>   kernel_init_freeable+0x16a/0x350
>   kernel_init+0xd/0x116
>   ret_from_fork+0x3a/0x50
> irq event stamp: 1629166
> hardirqs last  enabled at (1629166): [<ffffffff929c0227>] __call_rcu+0x1b7/0x3b0
> hardirqs last disabled at (1629165): [<ffffffff929c0129>] __call_rcu+0xb9/0x3b0
> softirqs last  enabled at (1626882): [<ffffffff928df0d5>] irq_enter+0x75/0x80
> softirqs last disabled at (1626883): [<ffffffff928df1e1>] irq_exit+0x101/0x110
> \x0aother info that might help us debug this:
> Possible unsafe locking scenario:\x0a
>        CPU0
>        ----
>   lock(fs_reclaim);
>   <Interrupt>
>     lock(fs_reclaim);
> \x0a *** DEADLOCK ***\x0a
> 1 lock held by swapper/30/0:
> #0: ffffffff94315e00 (rcu_callback){....}, at: rcu_core+0x395/0x9b0
> \x0astack backtrace:
> CPU: 30 PID: 0 Comm: swapper/30 Tainted: G        W   E     5.6.14+ #18
> Hardware name: ...
> Call Trace:
> <IRQ>
> dump_stack+0xb8/0x110
> mark_lock+0x7ee/0x9d0
> ? check_usage_backwards+0x230/0x230
> __lock_acquire+0xb84/0x2650
> ? lockdep_hardirqs_on+0x182/0x260
> ? lockdep_hardirqs_on+0x260/0x260
> ? debug_object_deactivate+0x210/0x210
> ? trace_hardirqs_on_thunk+0x1a/0x1c
> lock_acquire+0xe5/0x1f0
> ? fs_reclaim_acquire.part.0+0x5/0x30
> ? tbl_mask_array_realloc+0x38/0x1d0 [openvswitch]
> fs_reclaim_acquire.part.0+0x25/0x30
> ? fs_reclaim_acquire.part.0+0x5/0x30
> __kmalloc+0x4d/0x320
> tbl_mask_array_realloc+0x38/0x1d0 [openvswitch]
> ? table_instance_flow_free+0x2ba/0x340 [openvswitch]
> table_instance_destroy+0xf9/0x1b0 [openvswitch]
> ? new_vport+0xb0/0xb0 [openvswitch]
> destroy_dp_rcu+0x12/0x50 [openvswitch]
> rcu_core+0x34d/0x9b0
> ? rcu_all_qs+0x90/0x90
> ? rcu_read_lock_sched_held+0xa5/0xc0
> ? rcu_read_lock_bh_held+0xc0/0xc0
> ? run_rebalance_domains+0x11d/0x140
> __do_softirq+0x128/0x55c
> irq_exit+0x101/0x110
> smp_apic_timer_interrupt+0xfd/0x2f0
> apic_timer_interrupt+0xf/0x20
> </IRQ>
> RIP: 0010:cpuidle_enter_state+0xda/0x5d0
> Code: 80 7c 24 10 00 74 17 9c 58 0f 1f 44 00 00 f6 c4 02 0f 85 be 04
> 00 00 31 ff e8 c2 1a 7a ff e8 9d 4d 84 ff fb 66 0f 1f 44 00 00 <45> 85
> ed 0f 88 b6 03 00 00 4d 63 f5 4b 8d 04 76 4e 8d 3c f5 00 00
> RSP: 0018:ffff888103f07d58 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13
> RAX: 0000000000000000 RBX: ffff888c3e5c1800 RCX: dffffc0000000000
> RDX: 0000000000000007 RSI: 0000000000000006 RDI: ffff888103ec88d4
> RBP: ffffffff945a3940 R08: ffffffff92982042 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000002
> R13: 0000000000000002 R14: 00000000000000d0 R15: ffffffff945a3a10
> ? lockdep_hardirqs_on+0x182/0x260
> ? cpuidle_enter_state+0xd3/0x5d0
> cpuidle_enter+0x3c/0x60
> do_idle+0x36a/0x450
> ? arch_cpu_idle_exit+0x40/0x40
> cpu_startup_entry+0x19/0x20
> start_secondary+0x21f/0x290
> ? set_cpu_sibling_map+0xcb0/0xcb0
> secondary_startup_64+0xa4/0xb0
> 
> 
> -------------------------------------------------------------------------------
> 
> 
> ------------[ cut here ]------------
> ODEBUG: activate active (active state 1) object type: rcu_head hint: 0x0
> WARNING: CPU: 30 PID: 0 at lib/debugobjects.c:485 debug_print_object+0xc6/0xe0
> Modules linked in: ...
> CPU: 30 PID: 0 Comm: swapper/30 Tainted: G        W   E     5.6.14+ #18
> Hardware name: ...
> RIP: 0010:debug_print_object+0xc6/0xe0
> Code: dd 20 6c 52 94 e8 3a 0a cf ff 4d 89 f1 49 89 e8 44 89 e1 48 8b
> 14 dd 20 6c 52 94 4c 89 ee 48 c7 c7 40 7f c4 93 e8 84 fd 98 ff <0f> 0b
> 5b 83 05 04 89 86 01 01 5d 41 5c 41 5d 41 5e 41 5f c3 66 0f
> RSP: 0018:ffff888c3e589c38 EFLAGS: 00010286
> RAX: 0000000000000000 RBX: 0000000000000003 RCX: 0000000000000000
> RDX: 1ffff11187cb5415 RSI: 0000000000000008 RDI: ffffed1187cb1379
> RBP: ffffffff93aca380 R08: 0000000000000001 R09: ffffed1187cb6691
> R10: ffffed1187cb6690 R11: ffff888c3e5b3487 R12: 0000000000000001
> R13: ffffffff93c48600 R14: 0000000000000000 R15: 0000000000000000
> FS:  0000000000000000(0000) GS:ffff888c3e580000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00000733feb8a700 CR3: 0000000ba726e004 CR4: 00000000003606e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
> <IRQ>
> debug_object_activate+0x2b8/0x300
> ? debug_object_assert_init+0x220/0x220
> ? kasan_unpoison_shadow+0x33/0x40
> __call_rcu+0x34/0x3b0
> ? __kasan_kmalloc.constprop.0+0xc2/0xd0
> tbl_mask_array_realloc+0x170/0x1d0 [openvswitch]
> table_instance_destroy+0xf9/0x1b0 [openvswitch]
> ? new_vport+0xb0/0xb0 [openvswitch]
> destroy_dp_rcu+0x12/0x50 [openvswitch]
> rcu_core+0x34d/0x9b0
> ? rcu_all_qs+0x90/0x90
> ? rcu_read_lock_sched_held+0xa5/0xc0
> ? rcu_read_lock_bh_held+0xc0/0xc0
> ? run_rebalance_domains+0x11d/0x140
> __do_softirq+0x128/0x55c
> irq_exit+0x101/0x110
> smp_apic_timer_interrupt+0xfd/0x2f0
> apic_timer_interrupt+0xf/0x20
> </IRQ>
> RIP: 0010:cpuidle_enter_state+0xda/0x5d0
> Code: 80 7c 24 10 00 74 17 9c 58 0f 1f 44 00 00 f6 c4 02 0f 85 be 04
> 00 00 31 ff e8 c2 1a 7a ff e8 9d 4d 84 ff fb 66 0f 1f 44 00 00 <45> 85
> ed 0f 88 b6 03 00 00 4d 63 f5 4b 8d 04 76 4e 8d 3c f5 00 00
> RSP: 0018:ffff888103f07d58 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13
> RAX: 0000000000000000 RBX: ffff888c3e5c1800 RCX: dffffc0000000000
> RDX: 0000000000000007 RSI: 0000000000000006 RDI: ffff888103ec88d4
> RBP: ffffffff945a3940 R08: ffffffff92982042 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000002
> R13: 0000000000000002 R14: 00000000000000d0 R15: ffffffff945a3a10
> ? lockdep_hardirqs_on+0x182/0x260
> ? cpuidle_enter_state+0xd3/0x5d0
> cpuidle_enter+0x3c/0x60
> do_idle+0x36a/0x450
> ? arch_cpu_idle_exit+0x40/0x40
> cpu_startup_entry+0x19/0x20
> start_secondary+0x21f/0x290
> ? set_cpu_sibling_map+0xcb0/0xcb0
> secondary_startup_64+0xa4/0xb0
> irq event stamp: 1629166
> hardirqs last  enabled at (1629166): [<ffffffff929c0227>] __call_rcu+0x1b7/0x3b0
> hardirqs last disabled at (1629165): [<ffffffff929c0129>] __call_rcu+0xb9/0x3b0
> softirqs last  enabled at (1626882): [<ffffffff928df0d5>] irq_enter+0x75/0x80
> softirqs last disabled at (1626883): [<ffffffff928df1e1>] irq_exit+0x101/0x110
> ---[ end trace 8dc48dec48bb79f0 ]---
> 
> 
> -------------------------------------------------------------------------------
> 
> 
> ------------[ cut here ]------------
> __call_rcu(): Double-freed CB 00000000111691a8->0x0()!!!
> WARNING: CPU: 30 PID: 0 at kernel/rcu/tree.c:2594 __call_rcu+0x20f/0x3b0
> Modules linked in: ...
> CPU: 30 PID: 0 Comm: swapper/30 Tainted: G        W   E     5.6.14+ #18
> Hardware name: ...
> RIP: 0010:__call_rcu+0x20f/0x3b0
> Code: 5e 41 5f e9 03 59 0d 00 48 89 ef c6 05 5d 2e dc 01 01 e8 94 33
> 27 00 48 8b 53 08 48 89 de 48 c7 c7 20 bf ac 93 e8 eb 26 f1 ff <0f> 0b
> e9 4b fe ff ff e8 05 fd ff ff e9 21 ff ff ff 0f 0b e9 fe fd
> RSP: 0018:ffff888c3e589d58 EFLAGS: 00010282
> RAX: 0000000000000000 RBX: ffff888bd345e200 RCX: 0000000000000000
> RDX: 1ffff11187cb5415 RSI: 0000000000000008 RDI: ffffed1187cb139d
> RBP: ffff888bd345e208 R08: 0000000000000001 R09: ffffed1187cb6691
> R10: ffffed1187cb6690 R11: ffff888c3e5b3487 R12: 0000000000000000
> R13: 0000000000000001 R14: ffff888107e06618 R15: ffff888bd345e2f8
> FS:  0000000000000000(0000) GS:ffff888c3e580000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00000733feb8a700 CR3: 0000000ba726e004 CR4: 00000000003606e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
> <IRQ>
> ? __kasan_kmalloc.constprop.0+0xc2/0xd0
> tbl_mask_array_realloc+0x170/0x1d0 [openvswitch]
> table_instance_destroy+0xf9/0x1b0 [openvswitch]
> ? new_vport+0xb0/0xb0 [openvswitch]
> destroy_dp_rcu+0x12/0x50 [openvswitch]
> rcu_core+0x34d/0x9b0
> ? rcu_all_qs+0x90/0x90
> ? rcu_read_lock_sched_held+0xa5/0xc0
> ? rcu_read_lock_bh_held+0xc0/0xc0
> ? run_rebalance_domains+0x11d/0x140
> __do_softirq+0x128/0x55c
> irq_exit+0x101/0x110
> smp_apic_timer_interrupt+0xfd/0x2f0
> apic_timer_interrupt+0xf/0x20
> </IRQ>
> RIP: 0010:cpuidle_enter_state+0xda/0x5d0
> Code: 80 7c 24 10 00 74 17 9c 58 0f 1f 44 00 00 f6 c4 02 0f 85 be 04
> 00 00 31 ff e8 c2 1a 7a ff e8 9d 4d 84 ff fb 66 0f 1f 44 00 00 <45> 85
> ed 0f 88 b6 03 00 00 4d 63 f5 4b 8d 04 76 4e 8d 3c f5 00 00
> RSP: 0018:ffff888103f07d58 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13
> RAX: 0000000000000000 RBX: ffff888c3e5c1800 RCX: dffffc0000000000
> RDX: 0000000000000007 RSI: 0000000000000006 RDI: ffff888103ec88d4
> RBP: ffffffff945a3940 R08: ffffffff92982042 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000002
> R13: 0000000000000002 R14: 00000000000000d0 R15: ffffffff945a3a10
> ? lockdep_hardirqs_on+0x182/0x260
> ? cpuidle_enter_state+0xd3/0x5d0
> cpuidle_enter+0x3c/0x60
> do_idle+0x36a/0x450
> ? arch_cpu_idle_exit+0x40/0x40
> cpu_startup_entry+0x19/0x20
> start_secondary+0x21f/0x290
> ? set_cpu_sibling_map+0xcb0/0xcb0
> secondary_startup_64+0xa4/0xb0
> irq event stamp: 1629166
> hardirqs last  enabled at (1629166): [<ffffffff929c0227>] __call_rcu+0x1b7/0x3b0
> hardirqs last disabled at (1629165): [<ffffffff929c0129>] __call_rcu+0xb9/0x3b0
> softirqs last  enabled at (1626882): [<ffffffff928df0d5>] irq_enter+0x75/0x80
> softirqs last disabled at (1626883): [<ffffffff928df1e1>] irq_exit+0x101/0x110
> ---[ end trace 8dc48dec48bb79f2 ]---
> _______________________________________________
> discuss mailing list
> discuss@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
> 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [ovs-discuss] Double free in recent kernels after memleak fix
  2020-08-04 15:51 ` [ovs-discuss] Double free in recent kernels after memleak fix Gregory Rose
@ 2020-08-07 15:31   ` Johan Knöös
  2020-08-07 20:47     ` Joel Fernandes
  2020-08-07 21:52     ` Cong Wang
  0 siblings, 2 replies; 17+ messages in thread
From: Johan Knöös @ 2020-08-07 15:31 UTC (permalink / raw)
  To: Gregory Rose; +Cc: bugs, Tonghao Zhang, Netdev, joel

On Tue, Aug 4, 2020 at 8:52 AM Gregory Rose <gvrose8192@gmail.com> wrote:
>
>
>
> On 8/3/2020 12:01 PM, Johan Knöös via discuss wrote:
> > Hi Open vSwitch contributors,
> >
> > We have found openvswitch is causing double-freeing of memory. The
> > issue was not present in kernel version 5.5.17 but is present in
> > 5.6.14 and newer kernels.
> >
> > After reverting the RCU commits below for debugging, enabling
> > slub_debug, lockdep, and KASAN, we see the warnings at the end of this
> > email in the kernel log (the last one shows the double-free). When I
> > revert 50b0e61b32ee890a75b4377d5fbe770a86d6a4c1 ("net: openvswitch:
> > fix possible memleak on destroy flow-table"), the symptoms disappear.
> > While I have a reliable way to reproduce the issue, I unfortunately
> > don't yet have a process that's amenable to sharing. Please take a
> > look.
> >
> > 189a6883dcf7 rcu: Remove kfree_call_rcu_nobatch()
> > 77a40f97030b rcu: Remove kfree_rcu() special casing and lazy-callback handling
> > e99637becb2e rcu: Add support for debug_objects debugging for kfree_rcu()
> > 0392bebebf26 rcu: Add multiple in-flight batches of kfree_rcu() work
> > 569d767087ef rcu: Make kfree_rcu() use a non-atomic ->monitor_todo
> > a35d16905efc rcu: Add basic support for kfree_rcu() batching
> >
> > Thanks,
> > Johan Knöös
>
> Let's add the author of the patch you reverted and the Linux netdev
> mailing list.
>
> - Greg

I found we also sometimes get warnings from
https://elixir.bootlin.com/linux/v5.5.17/source/kernel/rcu/tree.c#L2239
under similar conditions even on kernel 5.5.17, which I believe may be
related. However, it's much rarer and I don't have a reliable way of
reproducing it. Perhaps 50b0e61b32ee890a75b4377d5fbe770a86d6a4c1 only
increases the frequency of a pre-existing bug.

> > Traces:
> >
> > ------------[ cut here ]------------
> > WARNING: CPU: 30 PID: 0 at net/openvswitch/flow_table.c:272
> > table_instance_flow_free+0x2fd/0x340 [openvswitch]
> > Modules linked in: ...
> > CPU: 30 PID: 0 Comm: swapper/30 Tainted: G            E     5.6.14+ #18
> > Hardware name: ...
> > RIP: 0010:table_instance_flow_free+0x2fd/0x340 [openvswitch]
> > Code: c1 fa 1f 48 c1 e8 20 29 d0 41 39 c7 0f 8f 95 fe ff ff 48 83 c4
> > 10 48 89 ef d1 fe 5b 5d 41 5c 41 5d 41 5e 41 5f e9 33 fb ff ff <0f> 0b
> > e9 59 fe ff ff 0f 0b e8 65 f1 fe ff 85 c0 0f 85 9b fe ff ff
> > RSP: 0018:ffff888c3e589da8 EFLAGS: 00010246
> > RAX: 0000000000000000 RBX: ffff889f954ee580 RCX: dffffc0000000000
> > RDX: 0000000000000007 RSI: 0000000000000003 RDI: 0000000000000246
> > RBP: ffff888c295150a0 R08: ffffffff9297f341 R09: 0000000000000000
> > R10: 0000000000000000 R11: 0000000000000000 R12: ffff889f1ed55000
> > R13: ffff888b72efa020 R14: ffff888c24209480 R15: ffff888b731bb6f8
> > FS:  0000000000000000(0000) GS:ffff888c3e580000(0000) knlGS:0000000000000000
> > CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > CR2: 00000733feb8a700 CR3: 0000000ba726e004 CR4: 00000000003606e0
> > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > Call Trace:
> > <IRQ>
> > table_instance_destroy+0xf9/0x1b0 [openvswitch]
> > ? new_vport+0xb0/0xb0 [openvswitch]
> > destroy_dp_rcu+0x12/0x50 [openvswitch]
> > rcu_core+0x34d/0x9b0
> > ? rcu_all_qs+0x90/0x90
> > ? rcu_read_lock_sched_held+0xa5/0xc0
> > ? rcu_read_lock_bh_held+0xc0/0xc0
> > ? run_rebalance_domains+0x11d/0x140
> > __do_softirq+0x128/0x55c
> > irq_exit+0x101/0x110
> > smp_apic_timer_interrupt+0xfd/0x2f0
> > apic_timer_interrupt+0xf/0x20
> > </IRQ>
> > RIP: 0010:cpuidle_enter_state+0xda/0x5d0
> > Code: 80 7c 24 10 00 74 17 9c 58 0f 1f 44 00 00 f6 c4 02 0f 85 be 04
> > 00 00 31 ff e8 c2 1a 7a ff e8 9d 4d 84 ff fb 66 0f 1f 44 00 00 <45> 85
> > ed 0f 88 b6 03 00 00 4d 63 f5 4b 8d 04 76 4e 8d 3c f5 00 00
> > RSP: 0018:ffff888103f07d58 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13
> > RAX: 0000000000000000 RBX: ffff888c3e5c1800 RCX: dffffc0000000000
> > RDX: 0000000000000007 RSI: 0000000000000006 RDI: ffff888103ec88d4
> > RBP: ffffffff945a3940 R08: ffffffff92982042 R09: 0000000000000000
> > R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000002
> > R13: 0000000000000002 R14: 00000000000000d0 R15: ffffffff945a3a10
> > ? lockdep_hardirqs_on+0x182/0x260
> > ? cpuidle_enter_state+0xd3/0x5d0
> > cpuidle_enter+0x3c/0x60
> > do_idle+0x36a/0x450
> > ? arch_cpu_idle_exit+0x40/0x40
> > cpu_startup_entry+0x19/0x20
> > start_secondary+0x21f/0x290
> > ? set_cpu_sibling_map+0xcb0/0xcb0
> > secondary_startup_64+0xa4/0xb0
> > irq event stamp: 1626911
> > hardirqs last  enabled at (1626910): [<ffffffff929c0227>] __call_rcu+0x1b7/0x3b0
> > hardirqs last disabled at (1626911): [<ffffffff92804552>]
> > trace_hardirqs_off_thunk+0x1a/0x1c
> > softirqs last  enabled at (1626882): [<ffffffff928df0d5>] irq_enter+0x75/0x80
> > softirqs last disabled at (1626883): [<ffffffff928df1e1>] irq_exit+0x101/0x110
> > ---[ end trace 8dc48dec48bb79c0 ]---
> >
> >
> > -------------------------------------------------------------------------------
> >
> >
> > =============================
> > WARNING: suspicious RCU usage
> > 5.6.14+ #18 Tainted: G        W   E
> > -----------------------------
> > net/openvswitch/flow_table.c:239 suspicious rcu_dereference_protected() usage!
> > \x0aother info that might help us debug this:\x0a
> > \x0arcu_scheduler_active = 2, debug_locks = 1
> > 1 lock held by swapper/30/0:
> > #0: ffffffff94315e00 (rcu_callback){....}, at: rcu_core+0x395/0x9b0
> > \x0astack backtrace:
> > CPU: 30 PID: 0 Comm: swapper/30 Tainted: G        W   E     5.6.14+ #18
> > Hardware name: ...
> > Call Trace:
> > <IRQ>
> > dump_stack+0xb8/0x110
> > table_instance_flow_free+0x332/0x340 [openvswitch]
> > table_instance_destroy+0xf9/0x1b0 [openvswitch]
> > ? new_vport+0xb0/0xb0 [openvswitch]
> > destroy_dp_rcu+0x12/0x50 [openvswitch]
> > rcu_core+0x34d/0x9b0
> > ? rcu_all_qs+0x90/0x90
> > ? rcu_read_lock_sched_held+0xa5/0xc0
> > ? rcu_read_lock_bh_held+0xc0/0xc0
> > ? run_rebalance_domains+0x11d/0x140
> > __do_softirq+0x128/0x55c
> > irq_exit+0x101/0x110
> > smp_apic_timer_interrupt+0xfd/0x2f0
> > apic_timer_interrupt+0xf/0x20
> > </IRQ>
> > RIP: 0010:cpuidle_enter_state+0xda/0x5d0
> > Code: 80 7c 24 10 00 74 17 9c 58 0f 1f 44 00 00 f6 c4 02 0f 85 be 04
> > 00 00 31 ff e8 c2 1a 7a ff e8 9d 4d 84 ff fb 66 0f 1f 44 00 00 <45> 85
> > ed 0f 88 b6 03 00 00 4d 63 f5 4b 8d 04 76 4e 8d 3c f5 00 00
> > RSP: 0018:ffff888103f07d58 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13
> > RAX: 0000000000000000 RBX: ffff888c3e5c1800 RCX: dffffc0000000000
> > RDX: 0000000000000007 RSI: 0000000000000006 RDI: ffff888103ec88d4
> > RBP: ffffffff945a3940 R08: ffffffff92982042 R09: 0000000000000000
> > R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000002
> > R13: 0000000000000002 R14: 00000000000000d0 R15: ffffffff945a3a10
> > ? lockdep_hardirqs_on+0x182/0x260
> > ? cpuidle_enter_state+0xd3/0x5d0
> > cpuidle_enter+0x3c/0x60
> > do_idle+0x36a/0x450
> > ? arch_cpu_idle_exit+0x40/0x40
> > cpu_startup_entry+0x19/0x20
> > start_secondary+0x21f/0x290
> > ? set_cpu_sibling_map+0xcb0/0xcb0
> > secondary_startup_64+0xa4/0xb0
> >
> >
> > -------------------------------------------------------------------------------
> >
> >
> > ================================
> > WARNING: inconsistent lock state
> > 5.6.14+ #18 Tainted: G        W   E
> > --------------------------------
> > inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
> > swapper/30/0 [HC0[0]:SC1[1]:HE1:SE0] takes:
> > ffffffff943e24c0 (fs_reclaim){+.?.}, at: fs_reclaim_acquire.part.0+0x5/0x30
> > {SOFTIRQ-ON-W} state was registered at:
> >   lock_acquire+0xe5/0x1f0
> >   fs_reclaim_acquire.part.0+0x25/0x30
> >   kmem_cache_alloc_trace+0x30/0x2d0
> >   alloc_workqueue_attrs+0x29/0xc0
> >   workqueue_init+0x4e/0x3c5
> >   kernel_init_freeable+0x16a/0x350
> >   kernel_init+0xd/0x116
> >   ret_from_fork+0x3a/0x50
> > irq event stamp: 1629166
> > hardirqs last  enabled at (1629166): [<ffffffff929c0227>] __call_rcu+0x1b7/0x3b0
> > hardirqs last disabled at (1629165): [<ffffffff929c0129>] __call_rcu+0xb9/0x3b0
> > softirqs last  enabled at (1626882): [<ffffffff928df0d5>] irq_enter+0x75/0x80
> > softirqs last disabled at (1626883): [<ffffffff928df1e1>] irq_exit+0x101/0x110
> > \x0aother info that might help us debug this:
> > Possible unsafe locking scenario:\x0a
> >        CPU0
> >        ----
> >   lock(fs_reclaim);
> >   <Interrupt>
> >     lock(fs_reclaim);
> > \x0a *** DEADLOCK ***\x0a
> > 1 lock held by swapper/30/0:
> > #0: ffffffff94315e00 (rcu_callback){....}, at: rcu_core+0x395/0x9b0
> > \x0astack backtrace:
> > CPU: 30 PID: 0 Comm: swapper/30 Tainted: G        W   E     5.6.14+ #18
> > Hardware name: ...
> > Call Trace:
> > <IRQ>
> > dump_stack+0xb8/0x110
> > mark_lock+0x7ee/0x9d0
> > ? check_usage_backwards+0x230/0x230
> > __lock_acquire+0xb84/0x2650
> > ? lockdep_hardirqs_on+0x182/0x260
> > ? lockdep_hardirqs_on+0x260/0x260
> > ? debug_object_deactivate+0x210/0x210
> > ? trace_hardirqs_on_thunk+0x1a/0x1c
> > lock_acquire+0xe5/0x1f0
> > ? fs_reclaim_acquire.part.0+0x5/0x30
> > ? tbl_mask_array_realloc+0x38/0x1d0 [openvswitch]
> > fs_reclaim_acquire.part.0+0x25/0x30
> > ? fs_reclaim_acquire.part.0+0x5/0x30
> > __kmalloc+0x4d/0x320
> > tbl_mask_array_realloc+0x38/0x1d0 [openvswitch]
> > ? table_instance_flow_free+0x2ba/0x340 [openvswitch]
> > table_instance_destroy+0xf9/0x1b0 [openvswitch]
> > ? new_vport+0xb0/0xb0 [openvswitch]
> > destroy_dp_rcu+0x12/0x50 [openvswitch]
> > rcu_core+0x34d/0x9b0
> > ? rcu_all_qs+0x90/0x90
> > ? rcu_read_lock_sched_held+0xa5/0xc0
> > ? rcu_read_lock_bh_held+0xc0/0xc0
> > ? run_rebalance_domains+0x11d/0x140
> > __do_softirq+0x128/0x55c
> > irq_exit+0x101/0x110
> > smp_apic_timer_interrupt+0xfd/0x2f0
> > apic_timer_interrupt+0xf/0x20
> > </IRQ>
> > RIP: 0010:cpuidle_enter_state+0xda/0x5d0
> > Code: 80 7c 24 10 00 74 17 9c 58 0f 1f 44 00 00 f6 c4 02 0f 85 be 04
> > 00 00 31 ff e8 c2 1a 7a ff e8 9d 4d 84 ff fb 66 0f 1f 44 00 00 <45> 85
> > ed 0f 88 b6 03 00 00 4d 63 f5 4b 8d 04 76 4e 8d 3c f5 00 00
> > RSP: 0018:ffff888103f07d58 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13
> > RAX: 0000000000000000 RBX: ffff888c3e5c1800 RCX: dffffc0000000000
> > RDX: 0000000000000007 RSI: 0000000000000006 RDI: ffff888103ec88d4
> > RBP: ffffffff945a3940 R08: ffffffff92982042 R09: 0000000000000000
> > R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000002
> > R13: 0000000000000002 R14: 00000000000000d0 R15: ffffffff945a3a10
> > ? lockdep_hardirqs_on+0x182/0x260
> > ? cpuidle_enter_state+0xd3/0x5d0
> > cpuidle_enter+0x3c/0x60
> > do_idle+0x36a/0x450
> > ? arch_cpu_idle_exit+0x40/0x40
> > cpu_startup_entry+0x19/0x20
> > start_secondary+0x21f/0x290
> > ? set_cpu_sibling_map+0xcb0/0xcb0
> > secondary_startup_64+0xa4/0xb0
> >
> >
> > -------------------------------------------------------------------------------
> >
> >
> > ------------[ cut here ]------------
> > ODEBUG: activate active (active state 1) object type: rcu_head hint: 0x0
> > WARNING: CPU: 30 PID: 0 at lib/debugobjects.c:485 debug_print_object+0xc6/0xe0
> > Modules linked in: ...
> > CPU: 30 PID: 0 Comm: swapper/30 Tainted: G        W   E     5.6.14+ #18
> > Hardware name: ...
> > RIP: 0010:debug_print_object+0xc6/0xe0
> > Code: dd 20 6c 52 94 e8 3a 0a cf ff 4d 89 f1 49 89 e8 44 89 e1 48 8b
> > 14 dd 20 6c 52 94 4c 89 ee 48 c7 c7 40 7f c4 93 e8 84 fd 98 ff <0f> 0b
> > 5b 83 05 04 89 86 01 01 5d 41 5c 41 5d 41 5e 41 5f c3 66 0f
> > RSP: 0018:ffff888c3e589c38 EFLAGS: 00010286
> > RAX: 0000000000000000 RBX: 0000000000000003 RCX: 0000000000000000
> > RDX: 1ffff11187cb5415 RSI: 0000000000000008 RDI: ffffed1187cb1379
> > RBP: ffffffff93aca380 R08: 0000000000000001 R09: ffffed1187cb6691
> > R10: ffffed1187cb6690 R11: ffff888c3e5b3487 R12: 0000000000000001
> > R13: ffffffff93c48600 R14: 0000000000000000 R15: 0000000000000000
> > FS:  0000000000000000(0000) GS:ffff888c3e580000(0000) knlGS:0000000000000000
> > CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > CR2: 00000733feb8a700 CR3: 0000000ba726e004 CR4: 00000000003606e0
> > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > Call Trace:
> > <IRQ>
> > debug_object_activate+0x2b8/0x300
> > ? debug_object_assert_init+0x220/0x220
> > ? kasan_unpoison_shadow+0x33/0x40
> > __call_rcu+0x34/0x3b0
> > ? __kasan_kmalloc.constprop.0+0xc2/0xd0
> > tbl_mask_array_realloc+0x170/0x1d0 [openvswitch]
> > table_instance_destroy+0xf9/0x1b0 [openvswitch]
> > ? new_vport+0xb0/0xb0 [openvswitch]
> > destroy_dp_rcu+0x12/0x50 [openvswitch]
> > rcu_core+0x34d/0x9b0
> > ? rcu_all_qs+0x90/0x90
> > ? rcu_read_lock_sched_held+0xa5/0xc0
> > ? rcu_read_lock_bh_held+0xc0/0xc0
> > ? run_rebalance_domains+0x11d/0x140
> > __do_softirq+0x128/0x55c
> > irq_exit+0x101/0x110
> > smp_apic_timer_interrupt+0xfd/0x2f0
> > apic_timer_interrupt+0xf/0x20
> > </IRQ>
> > RIP: 0010:cpuidle_enter_state+0xda/0x5d0
> > Code: 80 7c 24 10 00 74 17 9c 58 0f 1f 44 00 00 f6 c4 02 0f 85 be 04
> > 00 00 31 ff e8 c2 1a 7a ff e8 9d 4d 84 ff fb 66 0f 1f 44 00 00 <45> 85
> > ed 0f 88 b6 03 00 00 4d 63 f5 4b 8d 04 76 4e 8d 3c f5 00 00
> > RSP: 0018:ffff888103f07d58 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13
> > RAX: 0000000000000000 RBX: ffff888c3e5c1800 RCX: dffffc0000000000
> > RDX: 0000000000000007 RSI: 0000000000000006 RDI: ffff888103ec88d4
> > RBP: ffffffff945a3940 R08: ffffffff92982042 R09: 0000000000000000
> > R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000002
> > R13: 0000000000000002 R14: 00000000000000d0 R15: ffffffff945a3a10
> > ? lockdep_hardirqs_on+0x182/0x260
> > ? cpuidle_enter_state+0xd3/0x5d0
> > cpuidle_enter+0x3c/0x60
> > do_idle+0x36a/0x450
> > ? arch_cpu_idle_exit+0x40/0x40
> > cpu_startup_entry+0x19/0x20
> > start_secondary+0x21f/0x290
> > ? set_cpu_sibling_map+0xcb0/0xcb0
> > secondary_startup_64+0xa4/0xb0
> > irq event stamp: 1629166
> > hardirqs last  enabled at (1629166): [<ffffffff929c0227>] __call_rcu+0x1b7/0x3b0
> > hardirqs last disabled at (1629165): [<ffffffff929c0129>] __call_rcu+0xb9/0x3b0
> > softirqs last  enabled at (1626882): [<ffffffff928df0d5>] irq_enter+0x75/0x80
> > softirqs last disabled at (1626883): [<ffffffff928df1e1>] irq_exit+0x101/0x110
> > ---[ end trace 8dc48dec48bb79f0 ]---
> >
> >
> > -------------------------------------------------------------------------------
> >
> >
> > ------------[ cut here ]------------
> > __call_rcu(): Double-freed CB 00000000111691a8->0x0()!!!
> > WARNING: CPU: 30 PID: 0 at kernel/rcu/tree.c:2594 __call_rcu+0x20f/0x3b0
> > Modules linked in: ...
> > CPU: 30 PID: 0 Comm: swapper/30 Tainted: G        W   E     5.6.14+ #18
> > Hardware name: ...
> > RIP: 0010:__call_rcu+0x20f/0x3b0
> > Code: 5e 41 5f e9 03 59 0d 00 48 89 ef c6 05 5d 2e dc 01 01 e8 94 33
> > 27 00 48 8b 53 08 48 89 de 48 c7 c7 20 bf ac 93 e8 eb 26 f1 ff <0f> 0b
> > e9 4b fe ff ff e8 05 fd ff ff e9 21 ff ff ff 0f 0b e9 fe fd
> > RSP: 0018:ffff888c3e589d58 EFLAGS: 00010282
> > RAX: 0000000000000000 RBX: ffff888bd345e200 RCX: 0000000000000000
> > RDX: 1ffff11187cb5415 RSI: 0000000000000008 RDI: ffffed1187cb139d
> > RBP: ffff888bd345e208 R08: 0000000000000001 R09: ffffed1187cb6691
> > R10: ffffed1187cb6690 R11: ffff888c3e5b3487 R12: 0000000000000000
> > R13: 0000000000000001 R14: ffff888107e06618 R15: ffff888bd345e2f8
> > FS:  0000000000000000(0000) GS:ffff888c3e580000(0000) knlGS:0000000000000000
> > CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > CR2: 00000733feb8a700 CR3: 0000000ba726e004 CR4: 00000000003606e0
> > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > Call Trace:
> > <IRQ>
> > ? __kasan_kmalloc.constprop.0+0xc2/0xd0
> > tbl_mask_array_realloc+0x170/0x1d0 [openvswitch]
> > table_instance_destroy+0xf9/0x1b0 [openvswitch]
> > ? new_vport+0xb0/0xb0 [openvswitch]
> > destroy_dp_rcu+0x12/0x50 [openvswitch]
> > rcu_core+0x34d/0x9b0
> > ? rcu_all_qs+0x90/0x90
> > ? rcu_read_lock_sched_held+0xa5/0xc0
> > ? rcu_read_lock_bh_held+0xc0/0xc0
> > ? run_rebalance_domains+0x11d/0x140
> > __do_softirq+0x128/0x55c
> > irq_exit+0x101/0x110
> > smp_apic_timer_interrupt+0xfd/0x2f0
> > apic_timer_interrupt+0xf/0x20
> > </IRQ>
> > RIP: 0010:cpuidle_enter_state+0xda/0x5d0
> > Code: 80 7c 24 10 00 74 17 9c 58 0f 1f 44 00 00 f6 c4 02 0f 85 be 04
> > 00 00 31 ff e8 c2 1a 7a ff e8 9d 4d 84 ff fb 66 0f 1f 44 00 00 <45> 85
> > ed 0f 88 b6 03 00 00 4d 63 f5 4b 8d 04 76 4e 8d 3c f5 00 00
> > RSP: 0018:ffff888103f07d58 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13
> > RAX: 0000000000000000 RBX: ffff888c3e5c1800 RCX: dffffc0000000000
> > RDX: 0000000000000007 RSI: 0000000000000006 RDI: ffff888103ec88d4
> > RBP: ffffffff945a3940 R08: ffffffff92982042 R09: 0000000000000000
> > R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000002
> > R13: 0000000000000002 R14: 00000000000000d0 R15: ffffffff945a3a10
> > ? lockdep_hardirqs_on+0x182/0x260
> > ? cpuidle_enter_state+0xd3/0x5d0
> > cpuidle_enter+0x3c/0x60
> > do_idle+0x36a/0x450
> > ? arch_cpu_idle_exit+0x40/0x40
> > cpu_startup_entry+0x19/0x20
> > start_secondary+0x21f/0x290
> > ? set_cpu_sibling_map+0xcb0/0xcb0
> > secondary_startup_64+0xa4/0xb0
> > irq event stamp: 1629166
> > hardirqs last  enabled at (1629166): [<ffffffff929c0227>] __call_rcu+0x1b7/0x3b0
> > hardirqs last disabled at (1629165): [<ffffffff929c0129>] __call_rcu+0xb9/0x3b0
> > softirqs last  enabled at (1626882): [<ffffffff928df0d5>] irq_enter+0x75/0x80
> > softirqs last disabled at (1626883): [<ffffffff928df1e1>] irq_exit+0x101/0x110
> > ---[ end trace 8dc48dec48bb79f2 ]---
> > _______________________________________________
> > discuss mailing list
> > discuss@openvswitch.org
> > https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
> >

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [ovs-discuss] Double free in recent kernels after memleak fix
  2020-08-07 15:31   ` Johan Knöös
@ 2020-08-07 20:47     ` Joel Fernandes
  2020-08-07 20:49       ` Joel Fernandes
  2020-08-07 22:20       ` Paul E. McKenney
  2020-08-07 21:52     ` Cong Wang
  1 sibling, 2 replies; 17+ messages in thread
From: Joel Fernandes @ 2020-08-07 20:47 UTC (permalink / raw)
  To: Johan Knöös
  Cc: Gregory Rose, bugs, Tonghao Zhang, Netdev,
	Uladzislau Rezki (Sony),
	Paul E. McKenney, rcu

Hi,
Adding more of us working on RCU as well. Johan from another team at
Google discovered a likely issue in openswitch, details below:

On Fri, Aug 7, 2020 at 11:32 AM Johan Knöös <jknoos@google.com> wrote:
>
> On Tue, Aug 4, 2020 at 8:52 AM Gregory Rose <gvrose8192@gmail.com> wrote:
> >
> >
> >
> > On 8/3/2020 12:01 PM, Johan Knöös via discuss wrote:
> > > Hi Open vSwitch contributors,
> > >
> > > We have found openvswitch is causing double-freeing of memory. The
> > > issue was not present in kernel version 5.5.17 but is present in
> > > 5.6.14 and newer kernels.
> > >
> > > After reverting the RCU commits below for debugging, enabling
> > > slub_debug, lockdep, and KASAN, we see the warnings at the end of this
> > > email in the kernel log (the last one shows the double-free). When I
> > > revert 50b0e61b32ee890a75b4377d5fbe770a86d6a4c1 ("net: openvswitch:
> > > fix possible memleak on destroy flow-table"), the symptoms disappear.
> > > While I have a reliable way to reproduce the issue, I unfortunately
> > > don't yet have a process that's amenable to sharing. Please take a
> > > look.
> > >
> > > 189a6883dcf7 rcu: Remove kfree_call_rcu_nobatch()
> > > 77a40f97030b rcu: Remove kfree_rcu() special casing and lazy-callback handling
> > > e99637becb2e rcu: Add support for debug_objects debugging for kfree_rcu()
> > > 0392bebebf26 rcu: Add multiple in-flight batches of kfree_rcu() work
> > > 569d767087ef rcu: Make kfree_rcu() use a non-atomic ->monitor_todo
> > > a35d16905efc rcu: Add basic support for kfree_rcu() batching
> > >

Note that these reverts were only for testing the same code, because
he was testing 2 different kernel versions. One of them did not have
this set. So I asked him to revert. There's no known bug in the
reverted code itself. But somehow these patches do make it harder for
him to reproduce the issue.

> > > Thanks,
> > > Johan Knöös
> >
> > Let's add the author of the patch you reverted and the Linux netdev
> > mailing list.
> >
> > - Greg
>
> I found we also sometimes get warnings from
> https://elixir.bootlin.com/linux/v5.5.17/source/kernel/rcu/tree.c#L2239
> under similar conditions even on kernel 5.5.17, which I believe may be
> related. However, it's much rarer and I don't have a reliable way of
> reproducing it. Perhaps 50b0e61b32ee890a75b4377d5fbe770a86d6a4c1 only
> increases the frequency of a pre-existing bug.

This is interesting, because I saw kbuild warn me recently [1] about
it as well. Though, I was actually intentionally messing with the
segcblist. I plan to debug it next week, but the warning itself is
unlikely to be caused by my patch IMHO (since it is slightly
orthogonal to what I changed).

[1] https://lore.kernel.org/lkml/20200720005334.GC19262@shao2-debian/

But then again, I have not heard reports of this warning firing. Paul,
has this come to your radar recently?

Thanks,

 - Joel

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [ovs-discuss] Double free in recent kernels after memleak fix
  2020-08-07 20:47     ` Joel Fernandes
@ 2020-08-07 20:49       ` Joel Fernandes
  2020-08-07 22:20       ` Paul E. McKenney
  1 sibling, 0 replies; 17+ messages in thread
From: Joel Fernandes @ 2020-08-07 20:49 UTC (permalink / raw)
  To: Johan Knöös
  Cc: Gregory Rose, bugs, Tonghao Zhang, Netdev,
	Uladzislau Rezki (Sony),
	Paul E. McKenney, rcu

On Fri, Aug 7, 2020 at 4:47 PM Joel Fernandes <joel@joelfernandes.org> wrote:
>
> Hi,
> Adding more of us working on RCU as well. Johan from another team at
> Google discovered a likely issue in openswitch, details below:
>
> On Fri, Aug 7, 2020 at 11:32 AM Johan Knöös <jknoos@google.com> wrote:
> >
> > On Tue, Aug 4, 2020 at 8:52 AM Gregory Rose <gvrose8192@gmail.com> wrote:
> > >
> > >
> > >
> > > On 8/3/2020 12:01 PM, Johan Knöös via discuss wrote:
> > > > Hi Open vSwitch contributors,
> > > >
> > > > We have found openvswitch is causing double-freeing of memory. The
> > > > issue was not present in kernel version 5.5.17 but is present in
> > > > 5.6.14 and newer kernels.
> > > >
> > > > After reverting the RCU commits below for debugging, enabling
> > > > slub_debug, lockdep, and KASAN, we see the warnings at the end of this
> > > > email in the kernel log (the last one shows the double-free). When I
> > > > revert 50b0e61b32ee890a75b4377d5fbe770a86d6a4c1 ("net: openvswitch:
> > > > fix possible memleak on destroy flow-table"), the symptoms disappear.
> > > > While I have a reliable way to reproduce the issue, I unfortunately
> > > > don't yet have a process that's amenable to sharing. Please take a
> > > > look.
> > > >
> > > > 189a6883dcf7 rcu: Remove kfree_call_rcu_nobatch()
> > > > 77a40f97030b rcu: Remove kfree_rcu() special casing and lazy-callback handling
> > > > e99637becb2e rcu: Add support for debug_objects debugging for kfree_rcu()
> > > > 0392bebebf26 rcu: Add multiple in-flight batches of kfree_rcu() work
> > > > 569d767087ef rcu: Make kfree_rcu() use a non-atomic ->monitor_todo
> > > > a35d16905efc rcu: Add basic support for kfree_rcu() batching
> > > >
>
> Note that these reverts were only for testing the same code, because
> he was testing 2 different kernel versions. One of them did not have
> this set. So I asked him to revert. There's no known bug in the
> reverted code itself. But somehow these patches do make it harder for
> him to reproduce the issue.

And the reason for this is likely the additional kfree batching is
slowing down the occurrence of the crash.

thanks,

 - Joel

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [ovs-discuss] Double free in recent kernels after memleak fix
  2020-08-07 20:47     ` Joel Fernandes
  2020-08-07 20:49       ` Joel Fernandes
@ 2020-08-07 22:20       ` Paul E. McKenney
  2020-08-07 23:05         ` Johan Knöös
  2020-08-10 20:08         ` Joel Fernandes
  1 sibling, 2 replies; 17+ messages in thread
From: Paul E. McKenney @ 2020-08-07 22:20 UTC (permalink / raw)
  To: Joel Fernandes
  Cc: Johan Knöös, Gregory Rose, bugs, Tonghao Zhang, Netdev,
	Uladzislau Rezki (Sony),
	rcu

On Fri, Aug 07, 2020 at 04:47:56PM -0400, Joel Fernandes wrote:
> Hi,
> Adding more of us working on RCU as well. Johan from another team at
> Google discovered a likely issue in openswitch, details below:
> 
> On Fri, Aug 7, 2020 at 11:32 AM Johan Knöös <jknoos@google.com> wrote:
> >
> > On Tue, Aug 4, 2020 at 8:52 AM Gregory Rose <gvrose8192@gmail.com> wrote:
> > >
> > >
> > >
> > > On 8/3/2020 12:01 PM, Johan Knöös via discuss wrote:
> > > > Hi Open vSwitch contributors,
> > > >
> > > > We have found openvswitch is causing double-freeing of memory. The
> > > > issue was not present in kernel version 5.5.17 but is present in
> > > > 5.6.14 and newer kernels.
> > > >
> > > > After reverting the RCU commits below for debugging, enabling
> > > > slub_debug, lockdep, and KASAN, we see the warnings at the end of this
> > > > email in the kernel log (the last one shows the double-free). When I
> > > > revert 50b0e61b32ee890a75b4377d5fbe770a86d6a4c1 ("net: openvswitch:
> > > > fix possible memleak on destroy flow-table"), the symptoms disappear.
> > > > While I have a reliable way to reproduce the issue, I unfortunately
> > > > don't yet have a process that's amenable to sharing. Please take a
> > > > look.
> > > >
> > > > 189a6883dcf7 rcu: Remove kfree_call_rcu_nobatch()
> > > > 77a40f97030b rcu: Remove kfree_rcu() special casing and lazy-callback handling
> > > > e99637becb2e rcu: Add support for debug_objects debugging for kfree_rcu()
> > > > 0392bebebf26 rcu: Add multiple in-flight batches of kfree_rcu() work
> > > > 569d767087ef rcu: Make kfree_rcu() use a non-atomic ->monitor_todo
> > > > a35d16905efc rcu: Add basic support for kfree_rcu() batching
> 
> Note that these reverts were only for testing the same code, because
> he was testing 2 different kernel versions. One of them did not have
> this set. So I asked him to revert. There's no known bug in the
> reverted code itself. But somehow these patches do make it harder for
> him to reproduce the issue.

Perhaps they adjust timing?

> > > > Thanks,
> > > > Johan Knöös
> > >
> > > Let's add the author of the patch you reverted and the Linux netdev
> > > mailing list.
> > >
> > > - Greg
> >
> > I found we also sometimes get warnings from
> > https://elixir.bootlin.com/linux/v5.5.17/source/kernel/rcu/tree.c#L2239
> > under similar conditions even on kernel 5.5.17, which I believe may be
> > related. However, it's much rarer and I don't have a reliable way of
> > reproducing it. Perhaps 50b0e61b32ee890a75b4377d5fbe770a86d6a4c1 only
> > increases the frequency of a pre-existing bug.
> 
> This is interesting, because I saw kbuild warn me recently [1] about
> it as well. Though, I was actually intentionally messing with the
> segcblist. I plan to debug it next week, but the warning itself is
> unlikely to be caused by my patch IMHO (since it is slightly
> orthogonal to what I changed).
> 
> [1] https://lore.kernel.org/lkml/20200720005334.GC19262@shao2-debian/
> 
> But then again, I have not heard reports of this warning firing. Paul,
> has this come to your radar recently?

I have not seen any recent WARNs in rcu_do_batch().  I am guessing that
this is one of the last two in that function?

If so, have you tried using CONFIG_DEBUG_OBJECTS_RCU_HEAD=y?  That Kconfig
option is designed to help locate double frees via RCU.

							Thanx, Paul

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [ovs-discuss] Double free in recent kernels after memleak fix
  2020-08-07 22:20       ` Paul E. McKenney
@ 2020-08-07 23:05         ` Johan Knöös
  2020-08-08 11:44           ` Uladzislau Rezki
  2020-08-10 20:08         ` Joel Fernandes
  1 sibling, 1 reply; 17+ messages in thread
From: Johan Knöös @ 2020-08-07 23:05 UTC (permalink / raw)
  To: paulmck
  Cc: Joel Fernandes, Gregory Rose, bugs, Tonghao Zhang, Netdev,
	Uladzislau Rezki (Sony),
	rcu

On Fri, Aug 7, 2020 at 3:20 PM Paul E. McKenney <paulmck@kernel.org> wrote:
>
> On Fri, Aug 07, 2020 at 04:47:56PM -0400, Joel Fernandes wrote:
> > Hi,
> > Adding more of us working on RCU as well. Johan from another team at
> > Google discovered a likely issue in openswitch, details below:
> >
> > On Fri, Aug 7, 2020 at 11:32 AM Johan Knöös <jknoos@google.com> wrote:
> > >
> > > On Tue, Aug 4, 2020 at 8:52 AM Gregory Rose <gvrose8192@gmail.com> wrote:
> > > >
> > > >
> > > >
> > > > On 8/3/2020 12:01 PM, Johan Knöös via discuss wrote:
> > > > > Hi Open vSwitch contributors,
> > > > >
> > > > > We have found openvswitch is causing double-freeing of memory. The
> > > > > issue was not present in kernel version 5.5.17 but is present in
> > > > > 5.6.14 and newer kernels.
> > > > >
> > > > > After reverting the RCU commits below for debugging, enabling
> > > > > slub_debug, lockdep, and KASAN, we see the warnings at the end of this
> > > > > email in the kernel log (the last one shows the double-free). When I
> > > > > revert 50b0e61b32ee890a75b4377d5fbe770a86d6a4c1 ("net: openvswitch:
> > > > > fix possible memleak on destroy flow-table"), the symptoms disappear.
> > > > > While I have a reliable way to reproduce the issue, I unfortunately
> > > > > don't yet have a process that's amenable to sharing. Please take a
> > > > > look.
> > > > >
> > > > > 189a6883dcf7 rcu: Remove kfree_call_rcu_nobatch()
> > > > > 77a40f97030b rcu: Remove kfree_rcu() special casing and lazy-callback handling
> > > > > e99637becb2e rcu: Add support for debug_objects debugging for kfree_rcu()
> > > > > 0392bebebf26 rcu: Add multiple in-flight batches of kfree_rcu() work
> > > > > 569d767087ef rcu: Make kfree_rcu() use a non-atomic ->monitor_todo
> > > > > a35d16905efc rcu: Add basic support for kfree_rcu() batching
> >
> > Note that these reverts were only for testing the same code, because
> > he was testing 2 different kernel versions. One of them did not have
> > this set. So I asked him to revert. There's no known bug in the
> > reverted code itself. But somehow these patches do make it harder for
> > him to reproduce the issue.

I'm not certain the frequency of the issue changes with and without
these commits on 5.6.14, but at least the symptoms/definition of the
issue changes. To clarify, this is what I've observed with different
kernels:
* 5.6.14:  "kernel BUG at mm/slub.c:304!". Easily reproducible.
* 5.6.14 with the above RCU commits reverted: the warnings reported in
my original email. Easily reproducible.
* 5.6.14 with the above RCU commits reverted and
50b0e61b32ee890a75b4377d5fbe770a86d6a4c1 reverted: no warnings
observed (the frequency might be the same as on 5.5.17).
* 5.5.17: warning at kernel/rcu/tree.c#L2239. Difficult to reproduce.
Maybe a different root cause.

> Perhaps they adjust timing?

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [ovs-discuss] Double free in recent kernels after memleak fix
  2020-08-07 23:05         ` Johan Knöös
@ 2020-08-08 11:44           ` Uladzislau Rezki
  0 siblings, 0 replies; 17+ messages in thread
From: Uladzislau Rezki @ 2020-08-08 11:44 UTC (permalink / raw)
  To: Johan Knöös
  Cc: paulmck, Joel Fernandes, Gregory Rose, bugs, Tonghao Zhang,
	Netdev, Uladzislau Rezki (Sony),
	rcu

> On Fri, Aug 7, 2020 at 3:20 PM Paul E. McKenney <paulmck@kernel.org> wrote:
> >
> > On Fri, Aug 07, 2020 at 04:47:56PM -0400, Joel Fernandes wrote:
> > > Hi,
> > > Adding more of us working on RCU as well. Johan from another team at
> > > Google discovered a likely issue in openswitch, details below:
> > >
> > > On Fri, Aug 7, 2020 at 11:32 AM Johan Knöös <jknoos@google.com> wrote:
> > > >
> > > > On Tue, Aug 4, 2020 at 8:52 AM Gregory Rose <gvrose8192@gmail.com> wrote:
> > > > >
> > > > >
> > > > >
> > > > > On 8/3/2020 12:01 PM, Johan Knöös via discuss wrote:
> > > > > > Hi Open vSwitch contributors,
> > > > > >
> > > > > > We have found openvswitch is causing double-freeing of memory. The
> > > > > > issue was not present in kernel version 5.5.17 but is present in
> > > > > > 5.6.14 and newer kernels.
> > > > > >
> > > > > > After reverting the RCU commits below for debugging, enabling
> > > > > > slub_debug, lockdep, and KASAN, we see the warnings at the end of this
> > > > > > email in the kernel log (the last one shows the double-free). When I
> > > > > > revert 50b0e61b32ee890a75b4377d5fbe770a86d6a4c1 ("net: openvswitch:
> > > > > > fix possible memleak on destroy flow-table"), the symptoms disappear.
> > > > > > While I have a reliable way to reproduce the issue, I unfortunately
> > > > > > don't yet have a process that's amenable to sharing. Please take a
> > > > > > look.
> > > > > >
> > > > > > 189a6883dcf7 rcu: Remove kfree_call_rcu_nobatch()
> > > > > > 77a40f97030b rcu: Remove kfree_rcu() special casing and lazy-callback handling
> > > > > > e99637becb2e rcu: Add support for debug_objects debugging for kfree_rcu()
> > > > > > 0392bebebf26 rcu: Add multiple in-flight batches of kfree_rcu() work
> > > > > > 569d767087ef rcu: Make kfree_rcu() use a non-atomic ->monitor_todo
> > > > > > a35d16905efc rcu: Add basic support for kfree_rcu() batching
> > >
> > > Note that these reverts were only for testing the same code, because
> > > he was testing 2 different kernel versions. One of them did not have
> > > this set. So I asked him to revert. There's no known bug in the
> > > reverted code itself. But somehow these patches do make it harder for
> > > him to reproduce the issue.
> 
> I'm not certain the frequency of the issue changes with and without
> these commits on 5.6.14, but at least the symptoms/definition of the
> issue changes. To clarify, this is what I've observed with different
> kernels:
> * 5.6.14:  "kernel BUG at mm/slub.c:304!". Easily reproducible.
> * 5.6.14 with the above RCU commits reverted: the warnings reported in
> my original email. Easily reproducible.
> * 5.6.14 with the above RCU commits reverted and
> 50b0e61b32ee890a75b4377d5fbe770a86d6a4c1 reverted: no warnings
> observed (the frequency might be the same as on 5.5.17).
> * 5.5.17: warning at kernel/rcu/tree.c#L2239. Difficult to reproduce.
> Maybe a different root cause.
> 
If you can reproduce it, maybe enabling CONFIG_KASAN will detect something?
It can detect out-of-bounds and use after free bugs.

Thanks.

--
Vlad Rezki

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [ovs-discuss] Double free in recent kernels after memleak fix
  2020-08-07 22:20       ` Paul E. McKenney
  2020-08-07 23:05         ` Johan Knöös
@ 2020-08-10 20:08         ` Joel Fernandes
  2020-08-10 20:28           ` Paul E. McKenney
  1 sibling, 1 reply; 17+ messages in thread
From: Joel Fernandes @ 2020-08-10 20:08 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Johan Knöös, Gregory Rose, bugs, Tonghao Zhang, Netdev,
	Uladzislau Rezki (Sony),
	rcu

On Fri, Aug 07, 2020 at 03:20:15PM -0700, Paul E. McKenney wrote:
> On Fri, Aug 07, 2020 at 04:47:56PM -0400, Joel Fernandes wrote:
> > Hi,
> > Adding more of us working on RCU as well. Johan from another team at
> > Google discovered a likely issue in openswitch, details below:
> > 
> > On Fri, Aug 7, 2020 at 11:32 AM Johan Knöös <jknoos@google.com> wrote:
> > >
> > > On Tue, Aug 4, 2020 at 8:52 AM Gregory Rose <gvrose8192@gmail.com> wrote:
> > > >
> > > >
> > > >
> > > > On 8/3/2020 12:01 PM, Johan Knöös via discuss wrote:
> > > > > Hi Open vSwitch contributors,
> > > > >
> > > > > We have found openvswitch is causing double-freeing of memory. The
> > > > > issue was not present in kernel version 5.5.17 but is present in
> > > > > 5.6.14 and newer kernels.
> > > > >
> > > > > After reverting the RCU commits below for debugging, enabling
> > > > > slub_debug, lockdep, and KASAN, we see the warnings at the end of this
> > > > > email in the kernel log (the last one shows the double-free). When I
> > > > > revert 50b0e61b32ee890a75b4377d5fbe770a86d6a4c1 ("net: openvswitch:
> > > > > fix possible memleak on destroy flow-table"), the symptoms disappear.
> > > > > While I have a reliable way to reproduce the issue, I unfortunately
> > > > > don't yet have a process that's amenable to sharing. Please take a
> > > > > look.
> > > > >
> > > > > 189a6883dcf7 rcu: Remove kfree_call_rcu_nobatch()
> > > > > 77a40f97030b rcu: Remove kfree_rcu() special casing and lazy-callback handling
> > > > > e99637becb2e rcu: Add support for debug_objects debugging for kfree_rcu()
> > > > > 0392bebebf26 rcu: Add multiple in-flight batches of kfree_rcu() work
> > > > > 569d767087ef rcu: Make kfree_rcu() use a non-atomic ->monitor_todo
> > > > > a35d16905efc rcu: Add basic support for kfree_rcu() batching
> > 
> > Note that these reverts were only for testing the same code, because
> > he was testing 2 different kernel versions. One of them did not have
> > this set. So I asked him to revert. There's no known bug in the
> > reverted code itself. But somehow these patches do make it harder for
> > him to reproduce the issue.
> 
> Perhaps they adjust timing?

Yes that could be it. In my testing (which is unrelated to OVS), the issue
happens only with TREE02. I can reproduce the issue in [1] on just boot-up of
TREE02.

I could have screwed up something in my segcblist count patch, any hints
would be great. I'll dig more into it as well.

> > 
> > But then again, I have not heard reports of this warning firing. Paul,
> > has this come to your radar recently?
> 
> I have not seen any recent WARNs in rcu_do_batch().  I am guessing that
> this is one of the last two in that function?
> 
> If so, have you tried using CONFIG_DEBUG_OBJECTS_RCU_HEAD=y?  That Kconfig
> option is designed to help locate double frees via RCU.

Yes true, kfree_rcu() also has support for this. Jonathan, did you get a
chance to try this out in your failure scenario?

thanks,

 - Joel

[1] https://lore.kernel.org/lkml/20200720005334.GC19262@shao2-debian/

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [ovs-discuss] Double free in recent kernels after memleak fix
  2020-08-10 20:08         ` Joel Fernandes
@ 2020-08-10 20:28           ` Paul E. McKenney
  2020-08-11  1:14             ` Tonghao Zhang
  0 siblings, 1 reply; 17+ messages in thread
From: Paul E. McKenney @ 2020-08-10 20:28 UTC (permalink / raw)
  To: Joel Fernandes
  Cc: Johan Knöös, Gregory Rose, bugs, Tonghao Zhang, Netdev,
	Uladzislau Rezki (Sony),
	rcu

On Mon, Aug 10, 2020 at 04:08:59PM -0400, Joel Fernandes wrote:
> On Fri, Aug 07, 2020 at 03:20:15PM -0700, Paul E. McKenney wrote:
> > On Fri, Aug 07, 2020 at 04:47:56PM -0400, Joel Fernandes wrote:
> > > Hi,
> > > Adding more of us working on RCU as well. Johan from another team at
> > > Google discovered a likely issue in openswitch, details below:
> > > 
> > > On Fri, Aug 7, 2020 at 11:32 AM Johan Knöös <jknoos@google.com> wrote:
> > > > On Tue, Aug 4, 2020 at 8:52 AM Gregory Rose <gvrose8192@gmail.com> wrote:
> > > > > On 8/3/2020 12:01 PM, Johan Knöös via discuss wrote:
> > > > > > Hi Open vSwitch contributors,
> > > > > >
> > > > > > We have found openvswitch is causing double-freeing of memory. The
> > > > > > issue was not present in kernel version 5.5.17 but is present in
> > > > > > 5.6.14 and newer kernels.
> > > > > >
> > > > > > After reverting the RCU commits below for debugging, enabling
> > > > > > slub_debug, lockdep, and KASAN, we see the warnings at the end of this
> > > > > > email in the kernel log (the last one shows the double-free). When I
> > > > > > revert 50b0e61b32ee890a75b4377d5fbe770a86d6a4c1 ("net: openvswitch:
> > > > > > fix possible memleak on destroy flow-table"), the symptoms disappear.
> > > > > > While I have a reliable way to reproduce the issue, I unfortunately
> > > > > > don't yet have a process that's amenable to sharing. Please take a
> > > > > > look.
> > > > > >
> > > > > > 189a6883dcf7 rcu: Remove kfree_call_rcu_nobatch()
> > > > > > 77a40f97030b rcu: Remove kfree_rcu() special casing and lazy-callback handling
> > > > > > e99637becb2e rcu: Add support for debug_objects debugging for kfree_rcu()
> > > > > > 0392bebebf26 rcu: Add multiple in-flight batches of kfree_rcu() work
> > > > > > 569d767087ef rcu: Make kfree_rcu() use a non-atomic ->monitor_todo
> > > > > > a35d16905efc rcu: Add basic support for kfree_rcu() batching
> > > 
> > > Note that these reverts were only for testing the same code, because
> > > he was testing 2 different kernel versions. One of them did not have
> > > this set. So I asked him to revert. There's no known bug in the
> > > reverted code itself. But somehow these patches do make it harder for
> > > him to reproduce the issue.
> > 
> > Perhaps they adjust timing?
> 
> Yes that could be it. In my testing (which is unrelated to OVS), the issue
> happens only with TREE02. I can reproduce the issue in [1] on just boot-up of
> TREE02.
> 
> I could have screwed up something in my segcblist count patch, any hints
> would be great. I'll dig more into it as well.

Has anyone taken a close look at 50b0e61b32ee ("net: openvswitch: fix
possible memleak on destroy flow-table") commit?  Maybe it avoided the
memleak so thoroughly that it did a double free?

							Thanx, Paul

> > > But then again, I have not heard reports of this warning firing. Paul,
> > > has this come to your radar recently?
> > 
> > I have not seen any recent WARNs in rcu_do_batch().  I am guessing that
> > this is one of the last two in that function?
> > 
> > If so, have you tried using CONFIG_DEBUG_OBJECTS_RCU_HEAD=y?  That Kconfig
> > option is designed to help locate double frees via RCU.
> 
> Yes true, kfree_rcu() also has support for this. Jonathan, did you get a
> chance to try this out in your failure scenario?
> 
> thanks,
> 
>  - Joel
> 
> [1] https://lore.kernel.org/lkml/20200720005334.GC19262@shao2-debian/

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [ovs-discuss] Double free in recent kernels after memleak fix
  2020-08-10 20:28           ` Paul E. McKenney
@ 2020-08-11  1:14             ` Tonghao Zhang
  2020-08-11  2:24               ` Cong Wang
  0 siblings, 1 reply; 17+ messages in thread
From: Tonghao Zhang @ 2020-08-11  1:14 UTC (permalink / raw)
  To: paulmck
  Cc: Joel Fernandes, Johan Knöös, Gregory Rose, bugs,
	Netdev, Uladzislau Rezki (Sony),
	rcu

On Tue, Aug 11, 2020 at 4:28 AM Paul E. McKenney <paulmck@kernel.org> wrote:
>
> On Mon, Aug 10, 2020 at 04:08:59PM -0400, Joel Fernandes wrote:
> > On Fri, Aug 07, 2020 at 03:20:15PM -0700, Paul E. McKenney wrote:
> > > On Fri, Aug 07, 2020 at 04:47:56PM -0400, Joel Fernandes wrote:
> > > > Hi,
> > > > Adding more of us working on RCU as well. Johan from another team at
> > > > Google discovered a likely issue in openswitch, details below:
> > > >
> > > > On Fri, Aug 7, 2020 at 11:32 AM Johan Knöös <jknoos@google.com> wrote:
> > > > > On Tue, Aug 4, 2020 at 8:52 AM Gregory Rose <gvrose8192@gmail.com> wrote:
> > > > > > On 8/3/2020 12:01 PM, Johan Knöös via discuss wrote:
> > > > > > > Hi Open vSwitch contributors,
> > > > > > >
> > > > > > > We have found openvswitch is causing double-freeing of memory. The
> > > > > > > issue was not present in kernel version 5.5.17 but is present in
> > > > > > > 5.6.14 and newer kernels.
> > > > > > >
> > > > > > > After reverting the RCU commits below for debugging, enabling
> > > > > > > slub_debug, lockdep, and KASAN, we see the warnings at the end of this
> > > > > > > email in the kernel log (the last one shows the double-free). When I
> > > > > > > revert 50b0e61b32ee890a75b4377d5fbe770a86d6a4c1 ("net: openvswitch:
> > > > > > > fix possible memleak on destroy flow-table"), the symptoms disappear.
> > > > > > > While I have a reliable way to reproduce the issue, I unfortunately
> > > > > > > don't yet have a process that's amenable to sharing. Please take a
> > > > > > > look.
> > > > > > >
> > > > > > > 189a6883dcf7 rcu: Remove kfree_call_rcu_nobatch()
> > > > > > > 77a40f97030b rcu: Remove kfree_rcu() special casing and lazy-callback handling
> > > > > > > e99637becb2e rcu: Add support for debug_objects debugging for kfree_rcu()
> > > > > > > 0392bebebf26 rcu: Add multiple in-flight batches of kfree_rcu() work
> > > > > > > 569d767087ef rcu: Make kfree_rcu() use a non-atomic ->monitor_todo
> > > > > > > a35d16905efc rcu: Add basic support for kfree_rcu() batching
> > > >
> > > > Note that these reverts were only for testing the same code, because
> > > > he was testing 2 different kernel versions. One of them did not have
> > > > this set. So I asked him to revert. There's no known bug in the
> > > > reverted code itself. But somehow these patches do make it harder for
> > > > him to reproduce the issue.
> > >
> > > Perhaps they adjust timing?
> >
> > Yes that could be it. In my testing (which is unrelated to OVS), the issue
> > happens only with TREE02. I can reproduce the issue in [1] on just boot-up of
> > TREE02.
> >
> > I could have screwed up something in my segcblist count patch, any hints
> > would be great. I'll dig more into it as well.
>
> Has anyone taken a close look at 50b0e61b32ee ("net: openvswitch: fix
> possible memleak on destroy flow-table") commit?  Maybe it avoided the
> memleak so thoroughly that it did a double free?
Hi all, I send a patch to fix this. The rcu warnings disappear. I
don't reproduce the double free issue.
But I guess this patch may address this issue.

http://patchwork.ozlabs.org/project/netdev/patch/20200811011001.75690-1-xiangxia.m.yue@gmail.com/
>                                                         Thanx, Paul
>
> > > > But then again, I have not heard reports of this warning firing. Paul,
> > > > has this come to your radar recently?
> > >
> > > I have not seen any recent WARNs in rcu_do_batch().  I am guessing that
> > > this is one of the last two in that function?
> > >
> > > If so, have you tried using CONFIG_DEBUG_OBJECTS_RCU_HEAD=y?  That Kconfig
> > > option is designed to help locate double frees via RCU.
> >
> > Yes true, kfree_rcu() also has support for this. Jonathan, did you get a
> > chance to try this out in your failure scenario?
> >
> > thanks,
> >
> >  - Joel
> >
> > [1] https://lore.kernel.org/lkml/20200720005334.GC19262@shao2-debian/



-- 
Best regards, Tonghao

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [ovs-discuss] Double free in recent kernels after memleak fix
  2020-08-11  1:14             ` Tonghao Zhang
@ 2020-08-11  2:24               ` Cong Wang
  2020-08-11  3:26                 ` Tonghao Zhang
  0 siblings, 1 reply; 17+ messages in thread
From: Cong Wang @ 2020-08-11  2:24 UTC (permalink / raw)
  To: Tonghao Zhang
  Cc: Paul E . McKenney, Joel Fernandes, Johan Knöös,
	Gregory Rose, bugs, Netdev, Uladzislau Rezki (Sony),
	rcu

On Mon, Aug 10, 2020 at 6:16 PM Tonghao Zhang <xiangxia.m.yue@gmail.com> wrote:
> Hi all, I send a patch to fix this. The rcu warnings disappear. I
> don't reproduce the double free issue.
> But I guess this patch may address this issue.
>
> http://patchwork.ozlabs.org/project/netdev/patch/20200811011001.75690-1-xiangxia.m.yue@gmail.com/

I don't see how your patch address the double-free, as we still
free mask array twice after your patch: once in tbl_mask_array_realloc()
and once in ovs_flow_tbl_destroy().

Have you tried my patch which is supposed to address this double-free?
It simply skips the reallocation as it makes no sense to trigger reallocation
when destroying it.

Thanks.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [ovs-discuss] Double free in recent kernels after memleak fix
  2020-08-11  2:24               ` Cong Wang
@ 2020-08-11  3:26                 ` Tonghao Zhang
  2020-08-11  4:07                   ` Cong Wang
  0 siblings, 1 reply; 17+ messages in thread
From: Tonghao Zhang @ 2020-08-11  3:26 UTC (permalink / raw)
  To: Cong Wang
  Cc: Paul E . McKenney, Joel Fernandes, Johan Knöös,
	Gregory Rose, bugs, Netdev, Uladzislau Rezki (Sony),
	rcu

On Tue, Aug 11, 2020 at 10:24 AM Cong Wang <xiyou.wangcong@gmail.com> wrote:
>
> On Mon, Aug 10, 2020 at 6:16 PM Tonghao Zhang <xiangxia.m.yue@gmail.com> wrote:
> > Hi all, I send a patch to fix this. The rcu warnings disappear. I
> > don't reproduce the double free issue.
> > But I guess this patch may address this issue.
> >
> > http://patchwork.ozlabs.org/project/netdev/patch/20200811011001.75690-1-xiangxia.m.yue@gmail.com/
>
> I don't see how your patch address the double-free, as we still
> free mask array twice after your patch: once in tbl_mask_array_realloc()
> and once in ovs_flow_tbl_destroy().
Hi Cong.
Before my patch, we use the ovsl_dereference
(rcu_dereference_protected) in the rcu callback.
ovs_flow_tbl_destroy
->table_instance_destroy
->table_instance_flow_free
->flow_mask_remove
ASSERT_OVSL(will print warning)
->tbl_mask_array_del_mask
ovsl_dereference(rcu usage warning)

so we should invoke the table_instance_destroy or others under
ovs_lock to avoid (ASSERT_OVSL and rcu usage warning).
with this patch, we reallocate the mask_array under ovs_lock, and free
it in the rcu callback. Without it, we  reallocate and free it in the
rcu callback.
I think we may fix it with this patch.

> Have you tried my patch which is supposed to address this double-free?
I don't reproduce it. but your patch does not avoid ruc usage warning
and ASSERT_OVSL.
> It simply skips the reallocation as it makes no sense to trigger reallocation
> when destroying it.
>
> Thanks.



-- 
Best regards, Tonghao

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [ovs-discuss] Double free in recent kernels after memleak fix
  2020-08-11  3:26                 ` Tonghao Zhang
@ 2020-08-11  4:07                   ` Cong Wang
  2020-08-11  5:58                     ` Tonghao Zhang
  0 siblings, 1 reply; 17+ messages in thread
From: Cong Wang @ 2020-08-11  4:07 UTC (permalink / raw)
  To: Tonghao Zhang
  Cc: Paul E . McKenney, Joel Fernandes, Johan Knöös,
	Gregory Rose, bugs, Netdev, Uladzislau Rezki (Sony),
	rcu

On Mon, Aug 10, 2020 at 8:27 PM Tonghao Zhang <xiangxia.m.yue@gmail.com> wrote:
>
> On Tue, Aug 11, 2020 at 10:24 AM Cong Wang <xiyou.wangcong@gmail.com> wrote:
> >
> > On Mon, Aug 10, 2020 at 6:16 PM Tonghao Zhang <xiangxia.m.yue@gmail.com> wrote:
> > > Hi all, I send a patch to fix this. The rcu warnings disappear. I
> > > don't reproduce the double free issue.
> > > But I guess this patch may address this issue.
> > >
> > > http://patchwork.ozlabs.org/project/netdev/patch/20200811011001.75690-1-xiangxia.m.yue@gmail.com/
> >
> > I don't see how your patch address the double-free, as we still
> > free mask array twice after your patch: once in tbl_mask_array_realloc()
> > and once in ovs_flow_tbl_destroy().
> Hi Cong.
> Before my patch, we use the ovsl_dereference
> (rcu_dereference_protected) in the rcu callback.
> ovs_flow_tbl_destroy
> ->table_instance_destroy
> ->table_instance_flow_free
> ->flow_mask_remove
> ASSERT_OVSL(will print warning)
> ->tbl_mask_array_del_mask
> ovsl_dereference(rcu usage warning)
>

I understand how your patch addresses the RCU annotation issue,
which is different from double-free.


> so we should invoke the table_instance_destroy or others under
> ovs_lock to avoid (ASSERT_OVSL and rcu usage warning).

Of course... I never doubt it.


> with this patch, we reallocate the mask_array under ovs_lock, and free
> it in the rcu callback. Without it, we  reallocate and free it in the
> rcu callback.
> I think we may fix it with this patch.

Does it matter which context tbl_mask_array_realloc() is called?
Even with ovs_lock, we can still double free:

ovs_lock()
tbl_mask_array_realloc()
 => call_rcu(&old->rcu, mask_array_rcu_cb);
ovs_unlock()
...
ovs_flow_tbl_destroy()
 => call_rcu(&old->rcu, mask_array_rcu_cb);

So still twice, right? To fix the double-free, we have to eliminate one
of them, don't we? ;)


>
> > Have you tried my patch which is supposed to address this double-free?
> I don't reproduce it. but your patch does not avoid ruc usage warning
> and ASSERT_OVSL.

Sure, I never intend to fix anything else but double-free. The $subject is
about double free, I double checked. ;)

Thanks.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [ovs-discuss] Double free in recent kernels after memleak fix
  2020-08-11  4:07                   ` Cong Wang
@ 2020-08-11  5:58                     ` Tonghao Zhang
  2020-08-11 18:28                       ` Johan Knöös
  2020-08-12  0:43                       ` Cong Wang
  0 siblings, 2 replies; 17+ messages in thread
From: Tonghao Zhang @ 2020-08-11  5:58 UTC (permalink / raw)
  To: Cong Wang
  Cc: Paul E . McKenney, Joel Fernandes, Johan Knöös,
	Gregory Rose, bugs, Netdev, Uladzislau Rezki (Sony),
	rcu

On Tue, Aug 11, 2020 at 12:08 PM Cong Wang <xiyou.wangcong@gmail.com> wrote:
>
> On Mon, Aug 10, 2020 at 8:27 PM Tonghao Zhang <xiangxia.m.yue@gmail.com> wrote:
> >
> > On Tue, Aug 11, 2020 at 10:24 AM Cong Wang <xiyou.wangcong@gmail.com> wrote:
> > >
> > > On Mon, Aug 10, 2020 at 6:16 PM Tonghao Zhang <xiangxia.m.yue@gmail.com> wrote:
> > > > Hi all, I send a patch to fix this. The rcu warnings disappear. I
> > > > don't reproduce the double free issue.
> > > > But I guess this patch may address this issue.
> > > >
> > > > http://patchwork.ozlabs.org/project/netdev/patch/20200811011001.75690-1-xiangxia.m.yue@gmail.com/
> > >
> > > I don't see how your patch address the double-free, as we still
> > > free mask array twice after your patch: once in tbl_mask_array_realloc()
> > > and once in ovs_flow_tbl_destroy().
> > Hi Cong.
> > Before my patch, we use the ovsl_dereference
> > (rcu_dereference_protected) in the rcu callback.
> > ovs_flow_tbl_destroy
> > ->table_instance_destroy
> > ->table_instance_flow_free
> > ->flow_mask_remove
> > ASSERT_OVSL(will print warning)
> > ->tbl_mask_array_del_mask
> > ovsl_dereference(rcu usage warning)
> >
>
> I understand how your patch addresses the RCU annotation issue,
> which is different from double-free.
>
>
> > so we should invoke the table_instance_destroy or others under
> > ovs_lock to avoid (ASSERT_OVSL and rcu usage warning).
>
> Of course... I never doubt it.
>
>
> > with this patch, we reallocate the mask_array under ovs_lock, and free
> > it in the rcu callback. Without it, we  reallocate and free it in the
> > rcu callback.
> > I think we may fix it with this patch.
>
> Does it matter which context tbl_mask_array_realloc() is called?
> Even with ovs_lock, we can still double free:
>
> ovs_lock()
> tbl_mask_array_realloc()
>  => call_rcu(&old->rcu, mask_array_rcu_cb);
> ovs_unlock()
> ...
> ovs_flow_tbl_destroy()
>  => call_rcu(&old->rcu, mask_array_rcu_cb);
>
> So still twice, right? To fix the double-free, we have to eliminate one
> of them, don't we? ;)
No
Without my patch: in rcu callback:
ovs_flow_tbl_destroy
->call_rcu(&ma->rcu, mask_array_rcu_cb);
->table_instance_destroy
->tbl_mask_array_realloc(Shrink the mask array if necessary)
->call_rcu(&old->rcu, mask_array_rcu_cb);

With the patch:
ovs_lock
table_instance_flow_flush (free the flow)
tbl_mask_array_realloc(shrink the mask array if necessary, will free
mask_array in rcu(mask_array_rcu_cb) and rcu_assign_pointer new
mask_array)
ovs_unlock

in rcu callback:
ovs_flow_tbl_destroy
call_rcu(&ma->rcu, mask_array_rcu_cb);(that is new mask_array)

>
> >
> > > Have you tried my patch which is supposed to address this double-free?
> > I don't reproduce it. but your patch does not avoid ruc usage warning
> > and ASSERT_OVSL.
>
> Sure, I never intend to fix anything else but double-free. The $subject is
> about double free, I double checked. ;)
>
> Thanks.



-- 
Best regards, Tonghao

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [ovs-discuss] Double free in recent kernels after memleak fix
  2020-08-11  5:58                     ` Tonghao Zhang
@ 2020-08-11 18:28                       ` Johan Knöös
  2020-08-12  0:43                       ` Cong Wang
  1 sibling, 0 replies; 17+ messages in thread
From: Johan Knöös @ 2020-08-11 18:28 UTC (permalink / raw)
  To: Tonghao Zhang
  Cc: Cong Wang, Paul E . McKenney, Joel Fernandes, Gregory Rose, bugs,
	Netdev, Uladzislau Rezki (Sony),
	rcu

On Mon, Aug 10, 2020 at 10:59 PM Tonghao Zhang <xiangxia.m.yue@gmail.com> wrote:
>
> On Tue, Aug 11, 2020 at 12:08 PM Cong Wang <xiyou.wangcong@gmail.com> wrote:
> >
> > On Mon, Aug 10, 2020 at 8:27 PM Tonghao Zhang <xiangxia.m.yue@gmail.com> wrote:
> > >
> > > On Tue, Aug 11, 2020 at 10:24 AM Cong Wang <xiyou.wangcong@gmail.com> wrote:
> > > >
> > > > On Mon, Aug 10, 2020 at 6:16 PM Tonghao Zhang <xiangxia.m.yue@gmail.com> wrote:
> > > > > Hi all, I send a patch to fix this. The rcu warnings disappear. I
> > > > > don't reproduce the double free issue.
> > > > > But I guess this patch may address this issue.
> > > > >
> > > > > http://patchwork.ozlabs.org/project/netdev/patch/20200811011001.75690-1-xiangxia.m.yue@gmail.com/
> > > >
> > > > I don't see how your patch address the double-free, as we still
> > > > free mask array twice after your patch: once in tbl_mask_array_realloc()
> > > > and once in ovs_flow_tbl_destroy().
> > > Hi Cong.
> > > Before my patch, we use the ovsl_dereference
> > > (rcu_dereference_protected) in the rcu callback.
> > > ovs_flow_tbl_destroy
> > > ->table_instance_destroy
> > > ->table_instance_flow_free
> > > ->flow_mask_remove
> > > ASSERT_OVSL(will print warning)
> > > ->tbl_mask_array_del_mask
> > > ovsl_dereference(rcu usage warning)
> > >
> >
> > I understand how your patch addresses the RCU annotation issue,
> > which is different from double-free.
> >
> >
> > > so we should invoke the table_instance_destroy or others under
> > > ovs_lock to avoid (ASSERT_OVSL and rcu usage warning).
> >
> > Of course... I never doubt it.
> >
> >
> > > with this patch, we reallocate the mask_array under ovs_lock, and free
> > > it in the rcu callback. Without it, we  reallocate and free it in the
> > > rcu callback.
> > > I think we may fix it with this patch.
> >
> > Does it matter which context tbl_mask_array_realloc() is called?
> > Even with ovs_lock, we can still double free:
> >
> > ovs_lock()
> > tbl_mask_array_realloc()
> >  => call_rcu(&old->rcu, mask_array_rcu_cb);
> > ovs_unlock()
> > ...
> > ovs_flow_tbl_destroy()
> >  => call_rcu(&old->rcu, mask_array_rcu_cb);
> >
> > So still twice, right? To fix the double-free, we have to eliminate one
> > of them, don't we? ;)
> No
> Without my patch: in rcu callback:
> ovs_flow_tbl_destroy
> ->call_rcu(&ma->rcu, mask_array_rcu_cb);
> ->table_instance_destroy
> ->tbl_mask_array_realloc(Shrink the mask array if necessary)
> ->call_rcu(&old->rcu, mask_array_rcu_cb);
>
> With the patch:
> ovs_lock
> table_instance_flow_flush (free the flow)
> tbl_mask_array_realloc(shrink the mask array if necessary, will free
> mask_array in rcu(mask_array_rcu_cb) and rcu_assign_pointer new
> mask_array)
> ovs_unlock
>
> in rcu callback:
> ovs_flow_tbl_destroy
> call_rcu(&ma->rcu, mask_array_rcu_cb);(that is new mask_array)
>
> >
> > >
> > > > Have you tried my patch which is supposed to address this double-free?
> > > I don't reproduce it. but your patch does not avoid ruc usage warning
> > > and ASSERT_OVSL.
> >
> > Sure, I never intend to fix anything else but double-free. The $subject is
> > about double free, I double checked. ;)
> >
> > Thanks.
>
>
>
> --
> Best regards, Tonghao

Cong and Tonghao, thanks for your patches. I cannot repro the double
free with either of them, and the "suspicious RCU usage" and the
ASSERT_OVSL warnings are also gone with Tonghao's patch.

Tonghao, from your sequence above it looks like it should fix the
https://elixir.bootlin.com/linux/v5.5.17/source/kernel/rcu/tree.c#L2239
warning, correct?

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [ovs-discuss] Double free in recent kernels after memleak fix
  2020-08-11  5:58                     ` Tonghao Zhang
  2020-08-11 18:28                       ` Johan Knöös
@ 2020-08-12  0:43                       ` Cong Wang
  1 sibling, 0 replies; 17+ messages in thread
From: Cong Wang @ 2020-08-12  0:43 UTC (permalink / raw)
  To: Tonghao Zhang
  Cc: Paul E . McKenney, Joel Fernandes, Johan Knöös,
	Gregory Rose, bugs, Netdev, Uladzislau Rezki (Sony),
	rcu

On Mon, Aug 10, 2020 at 10:59 PM Tonghao Zhang <xiangxia.m.yue@gmail.com> wrote:
>
> On Tue, Aug 11, 2020 at 12:08 PM Cong Wang <xiyou.wangcong@gmail.com> wrote:
> >
> > On Mon, Aug 10, 2020 at 8:27 PM Tonghao Zhang <xiangxia.m.yue@gmail.com> wrote:
> > >
> > > On Tue, Aug 11, 2020 at 10:24 AM Cong Wang <xiyou.wangcong@gmail.com> wrote:
> > > >
> > > > On Mon, Aug 10, 2020 at 6:16 PM Tonghao Zhang <xiangxia.m.yue@gmail.com> wrote:
> > > > > Hi all, I send a patch to fix this. The rcu warnings disappear. I
> > > > > don't reproduce the double free issue.
> > > > > But I guess this patch may address this issue.
> > > > >
> > > > > http://patchwork.ozlabs.org/project/netdev/patch/20200811011001.75690-1-xiangxia.m.yue@gmail.com/
> > > >
> > > > I don't see how your patch address the double-free, as we still
> > > > free mask array twice after your patch: once in tbl_mask_array_realloc()
> > > > and once in ovs_flow_tbl_destroy().
> > > Hi Cong.
> > > Before my patch, we use the ovsl_dereference
> > > (rcu_dereference_protected) in the rcu callback.
> > > ovs_flow_tbl_destroy
> > > ->table_instance_destroy
> > > ->table_instance_flow_free
> > > ->flow_mask_remove
> > > ASSERT_OVSL(will print warning)
> > > ->tbl_mask_array_del_mask
> > > ovsl_dereference(rcu usage warning)
> > >
> >
> > I understand how your patch addresses the RCU annotation issue,
> > which is different from double-free.
> >
> >
> > > so we should invoke the table_instance_destroy or others under
> > > ovs_lock to avoid (ASSERT_OVSL and rcu usage warning).
> >
> > Of course... I never doubt it.
> >
> >
> > > with this patch, we reallocate the mask_array under ovs_lock, and free
> > > it in the rcu callback. Without it, we  reallocate and free it in the
> > > rcu callback.
> > > I think we may fix it with this patch.
> >
> > Does it matter which context tbl_mask_array_realloc() is called?
> > Even with ovs_lock, we can still double free:
> >
> > ovs_lock()
> > tbl_mask_array_realloc()
> >  => call_rcu(&old->rcu, mask_array_rcu_cb);
> > ovs_unlock()
> > ...
> > ovs_flow_tbl_destroy()
> >  => call_rcu(&old->rcu, mask_array_rcu_cb);
> >
> > So still twice, right? To fix the double-free, we have to eliminate one
> > of them, don't we? ;)
> No
> Without my patch: in rcu callback:
> ovs_flow_tbl_destroy
> ->call_rcu(&ma->rcu, mask_array_rcu_cb);
> ->table_instance_destroy
> ->tbl_mask_array_realloc(Shrink the mask array if necessary)
> ->call_rcu(&old->rcu, mask_array_rcu_cb);
>
> With the patch:
> ovs_lock
> table_instance_flow_flush (free the flow)
> tbl_mask_array_realloc(shrink the mask array if necessary, will free
> mask_array in rcu(mask_array_rcu_cb) and rcu_assign_pointer new
> mask_array)
> ovs_unlock
>
> in rcu callback:
> ovs_flow_tbl_destroy
> call_rcu(&ma->rcu, mask_array_rcu_cb);(that is new mask_array)

Ah, I see, I thought the mask array was cached in caller and passed along,
it is in fact refetched via &dp->table.

Thanks.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [ovs-discuss] Double free in recent kernels after memleak fix
  2020-08-07 15:31   ` Johan Knöös
  2020-08-07 20:47     ` Joel Fernandes
@ 2020-08-07 21:52     ` Cong Wang
  1 sibling, 0 replies; 17+ messages in thread
From: Cong Wang @ 2020-08-07 21:52 UTC (permalink / raw)
  To: Johan Knöös; +Cc: Gregory Rose, bugs, Tonghao Zhang, Netdev, joel

[-- Attachment #1: Type: text/plain, Size: 2309 bytes --]

On Fri, Aug 7, 2020 at 8:33 AM Johan Knöös <jknoos@google.com> wrote:
>
> On Tue, Aug 4, 2020 at 8:52 AM Gregory Rose <gvrose8192@gmail.com> wrote:
> >
> >
> >
> > On 8/3/2020 12:01 PM, Johan Knöös via discuss wrote:
> > > Hi Open vSwitch contributors,
> > >
> > > We have found openvswitch is causing double-freeing of memory. The
> > > issue was not present in kernel version 5.5.17 but is present in
> > > 5.6.14 and newer kernels.
> > >
> > > After reverting the RCU commits below for debugging, enabling
> > > slub_debug, lockdep, and KASAN, we see the warnings at the end of this
> > > email in the kernel log (the last one shows the double-free). When I
> > > revert 50b0e61b32ee890a75b4377d5fbe770a86d6a4c1 ("net: openvswitch:
> > > fix possible memleak on destroy flow-table"), the symptoms disappear.
> > > While I have a reliable way to reproduce the issue, I unfortunately
> > > don't yet have a process that's amenable to sharing. Please take a
> > > look.
> > >
> > > 189a6883dcf7 rcu: Remove kfree_call_rcu_nobatch()
> > > 77a40f97030b rcu: Remove kfree_rcu() special casing and lazy-callback handling
> > > e99637becb2e rcu: Add support for debug_objects debugging for kfree_rcu()
> > > 0392bebebf26 rcu: Add multiple in-flight batches of kfree_rcu() work
> > > 569d767087ef rcu: Make kfree_rcu() use a non-atomic ->monitor_todo
> > > a35d16905efc rcu: Add basic support for kfree_rcu() batching
> > >
> > > Thanks,
> > > Johan Knöös
> >
> > Let's add the author of the patch you reverted and the Linux netdev
> > mailing list.
> >
> > - Greg
>
> I found we also sometimes get warnings from
> https://elixir.bootlin.com/linux/v5.5.17/source/kernel/rcu/tree.c#L2239
> under similar conditions even on kernel 5.5.17, which I believe may be
> related. However, it's much rarer and I don't have a reliable way of
> reproducing it. Perhaps 50b0e61b32ee890a75b4377d5fbe770a86d6a4c1 only
> increases the frequency of a pre-existing bug.

It seems clear we have a double free on table->mask_array when
the reallocation is triggered on the destroy path.

Are you able to test the attached patch (compile tested only)?
Also note: it is generated against the latest net tree, it may not be
applied cleanly to any earlier stable release.

Thanks!

[-- Attachment #2: openvswitch.diff --]
[-- Type: text/x-patch, Size: 2227 bytes --]

diff --git a/net/openvswitch/flow_table.c b/net/openvswitch/flow_table.c
index 8c12675cbb67..cc7859db445a 100644
--- a/net/openvswitch/flow_table.c
+++ b/net/openvswitch/flow_table.c
@@ -294,7 +294,7 @@ static int tbl_mask_array_add_mask(struct flow_table *tbl,
 }
 
 static void tbl_mask_array_del_mask(struct flow_table *tbl,
-				    struct sw_flow_mask *mask)
+				    struct sw_flow_mask *mask, bool destroy)
 {
 	struct mask_array *ma = ovsl_dereference(tbl->mask_array);
 	int i, ma_count = READ_ONCE(ma->count);
@@ -314,6 +314,11 @@ static void tbl_mask_array_del_mask(struct flow_table *tbl,
 	rcu_assign_pointer(ma->masks[i], ma->masks[ma_count -1]);
 	RCU_INIT_POINTER(ma->masks[ma_count -1], NULL);
 
+	if (destroy) {
+		kfree(mask);
+		return;
+	}
+
 	kfree_rcu(mask, rcu);
 
 	/* Shrink the mask array if necessary. */
@@ -326,7 +331,8 @@ static void tbl_mask_array_del_mask(struct flow_table *tbl,
 }
 
 /* Remove 'mask' from the mask list, if it is not needed any more. */
-static void flow_mask_remove(struct flow_table *tbl, struct sw_flow_mask *mask)
+static void flow_mask_remove(struct flow_table *tbl, struct sw_flow_mask *mask,
+			     bool destroy)
 {
 	if (mask) {
 		/* ovs-lock is required to protect mask-refcount and
@@ -337,7 +343,7 @@ static void flow_mask_remove(struct flow_table *tbl, struct sw_flow_mask *mask)
 		mask->ref_count--;
 
 		if (!mask->ref_count)
-			tbl_mask_array_del_mask(tbl, mask);
+			tbl_mask_array_del_mask(tbl, mask, destroy);
 	}
 }
 
@@ -470,7 +476,7 @@ static void table_instance_flow_free(struct flow_table *table,
 			table->ufid_count--;
 	}
 
-	flow_mask_remove(table, flow->mask);
+	flow_mask_remove(table, flow->mask, !count);
 }
 
 static void table_instance_destroy(struct flow_table *table,
@@ -521,9 +527,9 @@ void ovs_flow_tbl_destroy(struct flow_table *table)
 	struct mask_cache *mc = rcu_dereference_raw(table->mask_cache);
 	struct mask_array *ma = rcu_dereference_raw(table->mask_array);
 
+	table_instance_destroy(table, ti, ufid_ti, false);
 	call_rcu(&mc->rcu, mask_cache_rcu_cb);
 	call_rcu(&ma->rcu, mask_array_rcu_cb);
-	table_instance_destroy(table, ti, ufid_ti, false);
 }
 
 struct sw_flow *ovs_flow_tbl_dump_next(struct table_instance *ti,

^ permalink raw reply related	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2020-08-12  0:43 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <CA+Sh73MJhqs7PBk6OV2AhzVjYvE1foUQUnwP5DwWR44LHZRZ9w@mail.gmail.com>
2020-08-04 15:51 ` [ovs-discuss] Double free in recent kernels after memleak fix Gregory Rose
2020-08-07 15:31   ` Johan Knöös
2020-08-07 20:47     ` Joel Fernandes
2020-08-07 20:49       ` Joel Fernandes
2020-08-07 22:20       ` Paul E. McKenney
2020-08-07 23:05         ` Johan Knöös
2020-08-08 11:44           ` Uladzislau Rezki
2020-08-10 20:08         ` Joel Fernandes
2020-08-10 20:28           ` Paul E. McKenney
2020-08-11  1:14             ` Tonghao Zhang
2020-08-11  2:24               ` Cong Wang
2020-08-11  3:26                 ` Tonghao Zhang
2020-08-11  4:07                   ` Cong Wang
2020-08-11  5:58                     ` Tonghao Zhang
2020-08-11 18:28                       ` Johan Knöös
2020-08-12  0:43                       ` Cong Wang
2020-08-07 21:52     ` Cong Wang

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).