* Re: [ovs-discuss] Double free in recent kernels after memleak fix [not found] <CA+Sh73MJhqs7PBk6OV2AhzVjYvE1foUQUnwP5DwWR44LHZRZ9w@mail.gmail.com> @ 2020-08-04 15:51 ` Gregory Rose 2020-08-07 15:31 ` Johan Knöös 0 siblings, 1 reply; 17+ messages in thread From: Gregory Rose @ 2020-08-04 15:51 UTC (permalink / raw) To: Johan Knöös, bugs, Tonghao Zhang, Netdev; +Cc: joel On 8/3/2020 12:01 PM, Johan Knöös via discuss wrote: > Hi Open vSwitch contributors, > > We have found openvswitch is causing double-freeing of memory. The > issue was not present in kernel version 5.5.17 but is present in > 5.6.14 and newer kernels. > > After reverting the RCU commits below for debugging, enabling > slub_debug, lockdep, and KASAN, we see the warnings at the end of this > email in the kernel log (the last one shows the double-free). When I > revert 50b0e61b32ee890a75b4377d5fbe770a86d6a4c1 ("net: openvswitch: > fix possible memleak on destroy flow-table"), the symptoms disappear. > While I have a reliable way to reproduce the issue, I unfortunately > don't yet have a process that's amenable to sharing. Please take a > look. > > 189a6883dcf7 rcu: Remove kfree_call_rcu_nobatch() > 77a40f97030b rcu: Remove kfree_rcu() special casing and lazy-callback handling > e99637becb2e rcu: Add support for debug_objects debugging for kfree_rcu() > 0392bebebf26 rcu: Add multiple in-flight batches of kfree_rcu() work > 569d767087ef rcu: Make kfree_rcu() use a non-atomic ->monitor_todo > a35d16905efc rcu: Add basic support for kfree_rcu() batching > > Thanks, > Johan Knöös Let's add the author of the patch you reverted and the Linux netdev mailing list. - Greg > > Traces: > > ------------[ cut here ]------------ > WARNING: CPU: 30 PID: 0 at net/openvswitch/flow_table.c:272 > table_instance_flow_free+0x2fd/0x340 [openvswitch] > Modules linked in: ... > CPU: 30 PID: 0 Comm: swapper/30 Tainted: G E 5.6.14+ #18 > Hardware name: ... > RIP: 0010:table_instance_flow_free+0x2fd/0x340 [openvswitch] > Code: c1 fa 1f 48 c1 e8 20 29 d0 41 39 c7 0f 8f 95 fe ff ff 48 83 c4 > 10 48 89 ef d1 fe 5b 5d 41 5c 41 5d 41 5e 41 5f e9 33 fb ff ff <0f> 0b > e9 59 fe ff ff 0f 0b e8 65 f1 fe ff 85 c0 0f 85 9b fe ff ff > RSP: 0018:ffff888c3e589da8 EFLAGS: 00010246 > RAX: 0000000000000000 RBX: ffff889f954ee580 RCX: dffffc0000000000 > RDX: 0000000000000007 RSI: 0000000000000003 RDI: 0000000000000246 > RBP: ffff888c295150a0 R08: ffffffff9297f341 R09: 0000000000000000 > R10: 0000000000000000 R11: 0000000000000000 R12: ffff889f1ed55000 > R13: ffff888b72efa020 R14: ffff888c24209480 R15: ffff888b731bb6f8 > FS: 0000000000000000(0000) GS:ffff888c3e580000(0000) knlGS:0000000000000000 > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > CR2: 00000733feb8a700 CR3: 0000000ba726e004 CR4: 00000000003606e0 > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > Call Trace: > <IRQ> > table_instance_destroy+0xf9/0x1b0 [openvswitch] > ? new_vport+0xb0/0xb0 [openvswitch] > destroy_dp_rcu+0x12/0x50 [openvswitch] > rcu_core+0x34d/0x9b0 > ? rcu_all_qs+0x90/0x90 > ? rcu_read_lock_sched_held+0xa5/0xc0 > ? rcu_read_lock_bh_held+0xc0/0xc0 > ? run_rebalance_domains+0x11d/0x140 > __do_softirq+0x128/0x55c > irq_exit+0x101/0x110 > smp_apic_timer_interrupt+0xfd/0x2f0 > apic_timer_interrupt+0xf/0x20 > </IRQ> > RIP: 0010:cpuidle_enter_state+0xda/0x5d0 > Code: 80 7c 24 10 00 74 17 9c 58 0f 1f 44 00 00 f6 c4 02 0f 85 be 04 > 00 00 31 ff e8 c2 1a 7a ff e8 9d 4d 84 ff fb 66 0f 1f 44 00 00 <45> 85 > ed 0f 88 b6 03 00 00 4d 63 f5 4b 8d 04 76 4e 8d 3c f5 00 00 > RSP: 0018:ffff888103f07d58 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13 > RAX: 0000000000000000 RBX: ffff888c3e5c1800 RCX: dffffc0000000000 > RDX: 0000000000000007 RSI: 0000000000000006 RDI: ffff888103ec88d4 > RBP: ffffffff945a3940 R08: ffffffff92982042 R09: 0000000000000000 > R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000002 > R13: 0000000000000002 R14: 00000000000000d0 R15: ffffffff945a3a10 > ? lockdep_hardirqs_on+0x182/0x260 > ? cpuidle_enter_state+0xd3/0x5d0 > cpuidle_enter+0x3c/0x60 > do_idle+0x36a/0x450 > ? arch_cpu_idle_exit+0x40/0x40 > cpu_startup_entry+0x19/0x20 > start_secondary+0x21f/0x290 > ? set_cpu_sibling_map+0xcb0/0xcb0 > secondary_startup_64+0xa4/0xb0 > irq event stamp: 1626911 > hardirqs last enabled at (1626910): [<ffffffff929c0227>] __call_rcu+0x1b7/0x3b0 > hardirqs last disabled at (1626911): [<ffffffff92804552>] > trace_hardirqs_off_thunk+0x1a/0x1c > softirqs last enabled at (1626882): [<ffffffff928df0d5>] irq_enter+0x75/0x80 > softirqs last disabled at (1626883): [<ffffffff928df1e1>] irq_exit+0x101/0x110 > ---[ end trace 8dc48dec48bb79c0 ]--- > > > ------------------------------------------------------------------------------- > > > ============================= > WARNING: suspicious RCU usage > 5.6.14+ #18 Tainted: G W E > ----------------------------- > net/openvswitch/flow_table.c:239 suspicious rcu_dereference_protected() usage! > \x0aother info that might help us debug this:\x0a > \x0arcu_scheduler_active = 2, debug_locks = 1 > 1 lock held by swapper/30/0: > #0: ffffffff94315e00 (rcu_callback){....}, at: rcu_core+0x395/0x9b0 > \x0astack backtrace: > CPU: 30 PID: 0 Comm: swapper/30 Tainted: G W E 5.6.14+ #18 > Hardware name: ... > Call Trace: > <IRQ> > dump_stack+0xb8/0x110 > table_instance_flow_free+0x332/0x340 [openvswitch] > table_instance_destroy+0xf9/0x1b0 [openvswitch] > ? new_vport+0xb0/0xb0 [openvswitch] > destroy_dp_rcu+0x12/0x50 [openvswitch] > rcu_core+0x34d/0x9b0 > ? rcu_all_qs+0x90/0x90 > ? rcu_read_lock_sched_held+0xa5/0xc0 > ? rcu_read_lock_bh_held+0xc0/0xc0 > ? run_rebalance_domains+0x11d/0x140 > __do_softirq+0x128/0x55c > irq_exit+0x101/0x110 > smp_apic_timer_interrupt+0xfd/0x2f0 > apic_timer_interrupt+0xf/0x20 > </IRQ> > RIP: 0010:cpuidle_enter_state+0xda/0x5d0 > Code: 80 7c 24 10 00 74 17 9c 58 0f 1f 44 00 00 f6 c4 02 0f 85 be 04 > 00 00 31 ff e8 c2 1a 7a ff e8 9d 4d 84 ff fb 66 0f 1f 44 00 00 <45> 85 > ed 0f 88 b6 03 00 00 4d 63 f5 4b 8d 04 76 4e 8d 3c f5 00 00 > RSP: 0018:ffff888103f07d58 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13 > RAX: 0000000000000000 RBX: ffff888c3e5c1800 RCX: dffffc0000000000 > RDX: 0000000000000007 RSI: 0000000000000006 RDI: ffff888103ec88d4 > RBP: ffffffff945a3940 R08: ffffffff92982042 R09: 0000000000000000 > R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000002 > R13: 0000000000000002 R14: 00000000000000d0 R15: ffffffff945a3a10 > ? lockdep_hardirqs_on+0x182/0x260 > ? cpuidle_enter_state+0xd3/0x5d0 > cpuidle_enter+0x3c/0x60 > do_idle+0x36a/0x450 > ? arch_cpu_idle_exit+0x40/0x40 > cpu_startup_entry+0x19/0x20 > start_secondary+0x21f/0x290 > ? set_cpu_sibling_map+0xcb0/0xcb0 > secondary_startup_64+0xa4/0xb0 > > > ------------------------------------------------------------------------------- > > > ================================ > WARNING: inconsistent lock state > 5.6.14+ #18 Tainted: G W E > -------------------------------- > inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage. > swapper/30/0 [HC0[0]:SC1[1]:HE1:SE0] takes: > ffffffff943e24c0 (fs_reclaim){+.?.}, at: fs_reclaim_acquire.part.0+0x5/0x30 > {SOFTIRQ-ON-W} state was registered at: > lock_acquire+0xe5/0x1f0 > fs_reclaim_acquire.part.0+0x25/0x30 > kmem_cache_alloc_trace+0x30/0x2d0 > alloc_workqueue_attrs+0x29/0xc0 > workqueue_init+0x4e/0x3c5 > kernel_init_freeable+0x16a/0x350 > kernel_init+0xd/0x116 > ret_from_fork+0x3a/0x50 > irq event stamp: 1629166 > hardirqs last enabled at (1629166): [<ffffffff929c0227>] __call_rcu+0x1b7/0x3b0 > hardirqs last disabled at (1629165): [<ffffffff929c0129>] __call_rcu+0xb9/0x3b0 > softirqs last enabled at (1626882): [<ffffffff928df0d5>] irq_enter+0x75/0x80 > softirqs last disabled at (1626883): [<ffffffff928df1e1>] irq_exit+0x101/0x110 > \x0aother info that might help us debug this: > Possible unsafe locking scenario:\x0a > CPU0 > ---- > lock(fs_reclaim); > <Interrupt> > lock(fs_reclaim); > \x0a *** DEADLOCK ***\x0a > 1 lock held by swapper/30/0: > #0: ffffffff94315e00 (rcu_callback){....}, at: rcu_core+0x395/0x9b0 > \x0astack backtrace: > CPU: 30 PID: 0 Comm: swapper/30 Tainted: G W E 5.6.14+ #18 > Hardware name: ... > Call Trace: > <IRQ> > dump_stack+0xb8/0x110 > mark_lock+0x7ee/0x9d0 > ? check_usage_backwards+0x230/0x230 > __lock_acquire+0xb84/0x2650 > ? lockdep_hardirqs_on+0x182/0x260 > ? lockdep_hardirqs_on+0x260/0x260 > ? debug_object_deactivate+0x210/0x210 > ? trace_hardirqs_on_thunk+0x1a/0x1c > lock_acquire+0xe5/0x1f0 > ? fs_reclaim_acquire.part.0+0x5/0x30 > ? tbl_mask_array_realloc+0x38/0x1d0 [openvswitch] > fs_reclaim_acquire.part.0+0x25/0x30 > ? fs_reclaim_acquire.part.0+0x5/0x30 > __kmalloc+0x4d/0x320 > tbl_mask_array_realloc+0x38/0x1d0 [openvswitch] > ? table_instance_flow_free+0x2ba/0x340 [openvswitch] > table_instance_destroy+0xf9/0x1b0 [openvswitch] > ? new_vport+0xb0/0xb0 [openvswitch] > destroy_dp_rcu+0x12/0x50 [openvswitch] > rcu_core+0x34d/0x9b0 > ? rcu_all_qs+0x90/0x90 > ? rcu_read_lock_sched_held+0xa5/0xc0 > ? rcu_read_lock_bh_held+0xc0/0xc0 > ? run_rebalance_domains+0x11d/0x140 > __do_softirq+0x128/0x55c > irq_exit+0x101/0x110 > smp_apic_timer_interrupt+0xfd/0x2f0 > apic_timer_interrupt+0xf/0x20 > </IRQ> > RIP: 0010:cpuidle_enter_state+0xda/0x5d0 > Code: 80 7c 24 10 00 74 17 9c 58 0f 1f 44 00 00 f6 c4 02 0f 85 be 04 > 00 00 31 ff e8 c2 1a 7a ff e8 9d 4d 84 ff fb 66 0f 1f 44 00 00 <45> 85 > ed 0f 88 b6 03 00 00 4d 63 f5 4b 8d 04 76 4e 8d 3c f5 00 00 > RSP: 0018:ffff888103f07d58 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13 > RAX: 0000000000000000 RBX: ffff888c3e5c1800 RCX: dffffc0000000000 > RDX: 0000000000000007 RSI: 0000000000000006 RDI: ffff888103ec88d4 > RBP: ffffffff945a3940 R08: ffffffff92982042 R09: 0000000000000000 > R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000002 > R13: 0000000000000002 R14: 00000000000000d0 R15: ffffffff945a3a10 > ? lockdep_hardirqs_on+0x182/0x260 > ? cpuidle_enter_state+0xd3/0x5d0 > cpuidle_enter+0x3c/0x60 > do_idle+0x36a/0x450 > ? arch_cpu_idle_exit+0x40/0x40 > cpu_startup_entry+0x19/0x20 > start_secondary+0x21f/0x290 > ? set_cpu_sibling_map+0xcb0/0xcb0 > secondary_startup_64+0xa4/0xb0 > > > ------------------------------------------------------------------------------- > > > ------------[ cut here ]------------ > ODEBUG: activate active (active state 1) object type: rcu_head hint: 0x0 > WARNING: CPU: 30 PID: 0 at lib/debugobjects.c:485 debug_print_object+0xc6/0xe0 > Modules linked in: ... > CPU: 30 PID: 0 Comm: swapper/30 Tainted: G W E 5.6.14+ #18 > Hardware name: ... > RIP: 0010:debug_print_object+0xc6/0xe0 > Code: dd 20 6c 52 94 e8 3a 0a cf ff 4d 89 f1 49 89 e8 44 89 e1 48 8b > 14 dd 20 6c 52 94 4c 89 ee 48 c7 c7 40 7f c4 93 e8 84 fd 98 ff <0f> 0b > 5b 83 05 04 89 86 01 01 5d 41 5c 41 5d 41 5e 41 5f c3 66 0f > RSP: 0018:ffff888c3e589c38 EFLAGS: 00010286 > RAX: 0000000000000000 RBX: 0000000000000003 RCX: 0000000000000000 > RDX: 1ffff11187cb5415 RSI: 0000000000000008 RDI: ffffed1187cb1379 > RBP: ffffffff93aca380 R08: 0000000000000001 R09: ffffed1187cb6691 > R10: ffffed1187cb6690 R11: ffff888c3e5b3487 R12: 0000000000000001 > R13: ffffffff93c48600 R14: 0000000000000000 R15: 0000000000000000 > FS: 0000000000000000(0000) GS:ffff888c3e580000(0000) knlGS:0000000000000000 > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > CR2: 00000733feb8a700 CR3: 0000000ba726e004 CR4: 00000000003606e0 > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > Call Trace: > <IRQ> > debug_object_activate+0x2b8/0x300 > ? debug_object_assert_init+0x220/0x220 > ? kasan_unpoison_shadow+0x33/0x40 > __call_rcu+0x34/0x3b0 > ? __kasan_kmalloc.constprop.0+0xc2/0xd0 > tbl_mask_array_realloc+0x170/0x1d0 [openvswitch] > table_instance_destroy+0xf9/0x1b0 [openvswitch] > ? new_vport+0xb0/0xb0 [openvswitch] > destroy_dp_rcu+0x12/0x50 [openvswitch] > rcu_core+0x34d/0x9b0 > ? rcu_all_qs+0x90/0x90 > ? rcu_read_lock_sched_held+0xa5/0xc0 > ? rcu_read_lock_bh_held+0xc0/0xc0 > ? run_rebalance_domains+0x11d/0x140 > __do_softirq+0x128/0x55c > irq_exit+0x101/0x110 > smp_apic_timer_interrupt+0xfd/0x2f0 > apic_timer_interrupt+0xf/0x20 > </IRQ> > RIP: 0010:cpuidle_enter_state+0xda/0x5d0 > Code: 80 7c 24 10 00 74 17 9c 58 0f 1f 44 00 00 f6 c4 02 0f 85 be 04 > 00 00 31 ff e8 c2 1a 7a ff e8 9d 4d 84 ff fb 66 0f 1f 44 00 00 <45> 85 > ed 0f 88 b6 03 00 00 4d 63 f5 4b 8d 04 76 4e 8d 3c f5 00 00 > RSP: 0018:ffff888103f07d58 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13 > RAX: 0000000000000000 RBX: ffff888c3e5c1800 RCX: dffffc0000000000 > RDX: 0000000000000007 RSI: 0000000000000006 RDI: ffff888103ec88d4 > RBP: ffffffff945a3940 R08: ffffffff92982042 R09: 0000000000000000 > R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000002 > R13: 0000000000000002 R14: 00000000000000d0 R15: ffffffff945a3a10 > ? lockdep_hardirqs_on+0x182/0x260 > ? cpuidle_enter_state+0xd3/0x5d0 > cpuidle_enter+0x3c/0x60 > do_idle+0x36a/0x450 > ? arch_cpu_idle_exit+0x40/0x40 > cpu_startup_entry+0x19/0x20 > start_secondary+0x21f/0x290 > ? set_cpu_sibling_map+0xcb0/0xcb0 > secondary_startup_64+0xa4/0xb0 > irq event stamp: 1629166 > hardirqs last enabled at (1629166): [<ffffffff929c0227>] __call_rcu+0x1b7/0x3b0 > hardirqs last disabled at (1629165): [<ffffffff929c0129>] __call_rcu+0xb9/0x3b0 > softirqs last enabled at (1626882): [<ffffffff928df0d5>] irq_enter+0x75/0x80 > softirqs last disabled at (1626883): [<ffffffff928df1e1>] irq_exit+0x101/0x110 > ---[ end trace 8dc48dec48bb79f0 ]--- > > > ------------------------------------------------------------------------------- > > > ------------[ cut here ]------------ > __call_rcu(): Double-freed CB 00000000111691a8->0x0()!!! > WARNING: CPU: 30 PID: 0 at kernel/rcu/tree.c:2594 __call_rcu+0x20f/0x3b0 > Modules linked in: ... > CPU: 30 PID: 0 Comm: swapper/30 Tainted: G W E 5.6.14+ #18 > Hardware name: ... > RIP: 0010:__call_rcu+0x20f/0x3b0 > Code: 5e 41 5f e9 03 59 0d 00 48 89 ef c6 05 5d 2e dc 01 01 e8 94 33 > 27 00 48 8b 53 08 48 89 de 48 c7 c7 20 bf ac 93 e8 eb 26 f1 ff <0f> 0b > e9 4b fe ff ff e8 05 fd ff ff e9 21 ff ff ff 0f 0b e9 fe fd > RSP: 0018:ffff888c3e589d58 EFLAGS: 00010282 > RAX: 0000000000000000 RBX: ffff888bd345e200 RCX: 0000000000000000 > RDX: 1ffff11187cb5415 RSI: 0000000000000008 RDI: ffffed1187cb139d > RBP: ffff888bd345e208 R08: 0000000000000001 R09: ffffed1187cb6691 > R10: ffffed1187cb6690 R11: ffff888c3e5b3487 R12: 0000000000000000 > R13: 0000000000000001 R14: ffff888107e06618 R15: ffff888bd345e2f8 > FS: 0000000000000000(0000) GS:ffff888c3e580000(0000) knlGS:0000000000000000 > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > CR2: 00000733feb8a700 CR3: 0000000ba726e004 CR4: 00000000003606e0 > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > Call Trace: > <IRQ> > ? __kasan_kmalloc.constprop.0+0xc2/0xd0 > tbl_mask_array_realloc+0x170/0x1d0 [openvswitch] > table_instance_destroy+0xf9/0x1b0 [openvswitch] > ? new_vport+0xb0/0xb0 [openvswitch] > destroy_dp_rcu+0x12/0x50 [openvswitch] > rcu_core+0x34d/0x9b0 > ? rcu_all_qs+0x90/0x90 > ? rcu_read_lock_sched_held+0xa5/0xc0 > ? rcu_read_lock_bh_held+0xc0/0xc0 > ? run_rebalance_domains+0x11d/0x140 > __do_softirq+0x128/0x55c > irq_exit+0x101/0x110 > smp_apic_timer_interrupt+0xfd/0x2f0 > apic_timer_interrupt+0xf/0x20 > </IRQ> > RIP: 0010:cpuidle_enter_state+0xda/0x5d0 > Code: 80 7c 24 10 00 74 17 9c 58 0f 1f 44 00 00 f6 c4 02 0f 85 be 04 > 00 00 31 ff e8 c2 1a 7a ff e8 9d 4d 84 ff fb 66 0f 1f 44 00 00 <45> 85 > ed 0f 88 b6 03 00 00 4d 63 f5 4b 8d 04 76 4e 8d 3c f5 00 00 > RSP: 0018:ffff888103f07d58 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13 > RAX: 0000000000000000 RBX: ffff888c3e5c1800 RCX: dffffc0000000000 > RDX: 0000000000000007 RSI: 0000000000000006 RDI: ffff888103ec88d4 > RBP: ffffffff945a3940 R08: ffffffff92982042 R09: 0000000000000000 > R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000002 > R13: 0000000000000002 R14: 00000000000000d0 R15: ffffffff945a3a10 > ? lockdep_hardirqs_on+0x182/0x260 > ? cpuidle_enter_state+0xd3/0x5d0 > cpuidle_enter+0x3c/0x60 > do_idle+0x36a/0x450 > ? arch_cpu_idle_exit+0x40/0x40 > cpu_startup_entry+0x19/0x20 > start_secondary+0x21f/0x290 > ? set_cpu_sibling_map+0xcb0/0xcb0 > secondary_startup_64+0xa4/0xb0 > irq event stamp: 1629166 > hardirqs last enabled at (1629166): [<ffffffff929c0227>] __call_rcu+0x1b7/0x3b0 > hardirqs last disabled at (1629165): [<ffffffff929c0129>] __call_rcu+0xb9/0x3b0 > softirqs last enabled at (1626882): [<ffffffff928df0d5>] irq_enter+0x75/0x80 > softirqs last disabled at (1626883): [<ffffffff928df1e1>] irq_exit+0x101/0x110 > ---[ end trace 8dc48dec48bb79f2 ]--- > _______________________________________________ > discuss mailing list > discuss@openvswitch.org > https://mail.openvswitch.org/mailman/listinfo/ovs-discuss > ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [ovs-discuss] Double free in recent kernels after memleak fix 2020-08-04 15:51 ` [ovs-discuss] Double free in recent kernels after memleak fix Gregory Rose @ 2020-08-07 15:31 ` Johan Knöös 2020-08-07 20:47 ` Joel Fernandes 2020-08-07 21:52 ` Cong Wang 0 siblings, 2 replies; 17+ messages in thread From: Johan Knöös @ 2020-08-07 15:31 UTC (permalink / raw) To: Gregory Rose; +Cc: bugs, Tonghao Zhang, Netdev, joel On Tue, Aug 4, 2020 at 8:52 AM Gregory Rose <gvrose8192@gmail.com> wrote: > > > > On 8/3/2020 12:01 PM, Johan Knöös via discuss wrote: > > Hi Open vSwitch contributors, > > > > We have found openvswitch is causing double-freeing of memory. The > > issue was not present in kernel version 5.5.17 but is present in > > 5.6.14 and newer kernels. > > > > After reverting the RCU commits below for debugging, enabling > > slub_debug, lockdep, and KASAN, we see the warnings at the end of this > > email in the kernel log (the last one shows the double-free). When I > > revert 50b0e61b32ee890a75b4377d5fbe770a86d6a4c1 ("net: openvswitch: > > fix possible memleak on destroy flow-table"), the symptoms disappear. > > While I have a reliable way to reproduce the issue, I unfortunately > > don't yet have a process that's amenable to sharing. Please take a > > look. > > > > 189a6883dcf7 rcu: Remove kfree_call_rcu_nobatch() > > 77a40f97030b rcu: Remove kfree_rcu() special casing and lazy-callback handling > > e99637becb2e rcu: Add support for debug_objects debugging for kfree_rcu() > > 0392bebebf26 rcu: Add multiple in-flight batches of kfree_rcu() work > > 569d767087ef rcu: Make kfree_rcu() use a non-atomic ->monitor_todo > > a35d16905efc rcu: Add basic support for kfree_rcu() batching > > > > Thanks, > > Johan Knöös > > Let's add the author of the patch you reverted and the Linux netdev > mailing list. > > - Greg I found we also sometimes get warnings from https://elixir.bootlin.com/linux/v5.5.17/source/kernel/rcu/tree.c#L2239 under similar conditions even on kernel 5.5.17, which I believe may be related. However, it's much rarer and I don't have a reliable way of reproducing it. Perhaps 50b0e61b32ee890a75b4377d5fbe770a86d6a4c1 only increases the frequency of a pre-existing bug. > > Traces: > > > > ------------[ cut here ]------------ > > WARNING: CPU: 30 PID: 0 at net/openvswitch/flow_table.c:272 > > table_instance_flow_free+0x2fd/0x340 [openvswitch] > > Modules linked in: ... > > CPU: 30 PID: 0 Comm: swapper/30 Tainted: G E 5.6.14+ #18 > > Hardware name: ... > > RIP: 0010:table_instance_flow_free+0x2fd/0x340 [openvswitch] > > Code: c1 fa 1f 48 c1 e8 20 29 d0 41 39 c7 0f 8f 95 fe ff ff 48 83 c4 > > 10 48 89 ef d1 fe 5b 5d 41 5c 41 5d 41 5e 41 5f e9 33 fb ff ff <0f> 0b > > e9 59 fe ff ff 0f 0b e8 65 f1 fe ff 85 c0 0f 85 9b fe ff ff > > RSP: 0018:ffff888c3e589da8 EFLAGS: 00010246 > > RAX: 0000000000000000 RBX: ffff889f954ee580 RCX: dffffc0000000000 > > RDX: 0000000000000007 RSI: 0000000000000003 RDI: 0000000000000246 > > RBP: ffff888c295150a0 R08: ffffffff9297f341 R09: 0000000000000000 > > R10: 0000000000000000 R11: 0000000000000000 R12: ffff889f1ed55000 > > R13: ffff888b72efa020 R14: ffff888c24209480 R15: ffff888b731bb6f8 > > FS: 0000000000000000(0000) GS:ffff888c3e580000(0000) knlGS:0000000000000000 > > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > > CR2: 00000733feb8a700 CR3: 0000000ba726e004 CR4: 00000000003606e0 > > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > > Call Trace: > > <IRQ> > > table_instance_destroy+0xf9/0x1b0 [openvswitch] > > ? new_vport+0xb0/0xb0 [openvswitch] > > destroy_dp_rcu+0x12/0x50 [openvswitch] > > rcu_core+0x34d/0x9b0 > > ? rcu_all_qs+0x90/0x90 > > ? rcu_read_lock_sched_held+0xa5/0xc0 > > ? rcu_read_lock_bh_held+0xc0/0xc0 > > ? run_rebalance_domains+0x11d/0x140 > > __do_softirq+0x128/0x55c > > irq_exit+0x101/0x110 > > smp_apic_timer_interrupt+0xfd/0x2f0 > > apic_timer_interrupt+0xf/0x20 > > </IRQ> > > RIP: 0010:cpuidle_enter_state+0xda/0x5d0 > > Code: 80 7c 24 10 00 74 17 9c 58 0f 1f 44 00 00 f6 c4 02 0f 85 be 04 > > 00 00 31 ff e8 c2 1a 7a ff e8 9d 4d 84 ff fb 66 0f 1f 44 00 00 <45> 85 > > ed 0f 88 b6 03 00 00 4d 63 f5 4b 8d 04 76 4e 8d 3c f5 00 00 > > RSP: 0018:ffff888103f07d58 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13 > > RAX: 0000000000000000 RBX: ffff888c3e5c1800 RCX: dffffc0000000000 > > RDX: 0000000000000007 RSI: 0000000000000006 RDI: ffff888103ec88d4 > > RBP: ffffffff945a3940 R08: ffffffff92982042 R09: 0000000000000000 > > R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000002 > > R13: 0000000000000002 R14: 00000000000000d0 R15: ffffffff945a3a10 > > ? lockdep_hardirqs_on+0x182/0x260 > > ? cpuidle_enter_state+0xd3/0x5d0 > > cpuidle_enter+0x3c/0x60 > > do_idle+0x36a/0x450 > > ? arch_cpu_idle_exit+0x40/0x40 > > cpu_startup_entry+0x19/0x20 > > start_secondary+0x21f/0x290 > > ? set_cpu_sibling_map+0xcb0/0xcb0 > > secondary_startup_64+0xa4/0xb0 > > irq event stamp: 1626911 > > hardirqs last enabled at (1626910): [<ffffffff929c0227>] __call_rcu+0x1b7/0x3b0 > > hardirqs last disabled at (1626911): [<ffffffff92804552>] > > trace_hardirqs_off_thunk+0x1a/0x1c > > softirqs last enabled at (1626882): [<ffffffff928df0d5>] irq_enter+0x75/0x80 > > softirqs last disabled at (1626883): [<ffffffff928df1e1>] irq_exit+0x101/0x110 > > ---[ end trace 8dc48dec48bb79c0 ]--- > > > > > > ------------------------------------------------------------------------------- > > > > > > ============================= > > WARNING: suspicious RCU usage > > 5.6.14+ #18 Tainted: G W E > > ----------------------------- > > net/openvswitch/flow_table.c:239 suspicious rcu_dereference_protected() usage! > > \x0aother info that might help us debug this:\x0a > > \x0arcu_scheduler_active = 2, debug_locks = 1 > > 1 lock held by swapper/30/0: > > #0: ffffffff94315e00 (rcu_callback){....}, at: rcu_core+0x395/0x9b0 > > \x0astack backtrace: > > CPU: 30 PID: 0 Comm: swapper/30 Tainted: G W E 5.6.14+ #18 > > Hardware name: ... > > Call Trace: > > <IRQ> > > dump_stack+0xb8/0x110 > > table_instance_flow_free+0x332/0x340 [openvswitch] > > table_instance_destroy+0xf9/0x1b0 [openvswitch] > > ? new_vport+0xb0/0xb0 [openvswitch] > > destroy_dp_rcu+0x12/0x50 [openvswitch] > > rcu_core+0x34d/0x9b0 > > ? rcu_all_qs+0x90/0x90 > > ? rcu_read_lock_sched_held+0xa5/0xc0 > > ? rcu_read_lock_bh_held+0xc0/0xc0 > > ? run_rebalance_domains+0x11d/0x140 > > __do_softirq+0x128/0x55c > > irq_exit+0x101/0x110 > > smp_apic_timer_interrupt+0xfd/0x2f0 > > apic_timer_interrupt+0xf/0x20 > > </IRQ> > > RIP: 0010:cpuidle_enter_state+0xda/0x5d0 > > Code: 80 7c 24 10 00 74 17 9c 58 0f 1f 44 00 00 f6 c4 02 0f 85 be 04 > > 00 00 31 ff e8 c2 1a 7a ff e8 9d 4d 84 ff fb 66 0f 1f 44 00 00 <45> 85 > > ed 0f 88 b6 03 00 00 4d 63 f5 4b 8d 04 76 4e 8d 3c f5 00 00 > > RSP: 0018:ffff888103f07d58 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13 > > RAX: 0000000000000000 RBX: ffff888c3e5c1800 RCX: dffffc0000000000 > > RDX: 0000000000000007 RSI: 0000000000000006 RDI: ffff888103ec88d4 > > RBP: ffffffff945a3940 R08: ffffffff92982042 R09: 0000000000000000 > > R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000002 > > R13: 0000000000000002 R14: 00000000000000d0 R15: ffffffff945a3a10 > > ? lockdep_hardirqs_on+0x182/0x260 > > ? cpuidle_enter_state+0xd3/0x5d0 > > cpuidle_enter+0x3c/0x60 > > do_idle+0x36a/0x450 > > ? arch_cpu_idle_exit+0x40/0x40 > > cpu_startup_entry+0x19/0x20 > > start_secondary+0x21f/0x290 > > ? set_cpu_sibling_map+0xcb0/0xcb0 > > secondary_startup_64+0xa4/0xb0 > > > > > > ------------------------------------------------------------------------------- > > > > > > ================================ > > WARNING: inconsistent lock state > > 5.6.14+ #18 Tainted: G W E > > -------------------------------- > > inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage. > > swapper/30/0 [HC0[0]:SC1[1]:HE1:SE0] takes: > > ffffffff943e24c0 (fs_reclaim){+.?.}, at: fs_reclaim_acquire.part.0+0x5/0x30 > > {SOFTIRQ-ON-W} state was registered at: > > lock_acquire+0xe5/0x1f0 > > fs_reclaim_acquire.part.0+0x25/0x30 > > kmem_cache_alloc_trace+0x30/0x2d0 > > alloc_workqueue_attrs+0x29/0xc0 > > workqueue_init+0x4e/0x3c5 > > kernel_init_freeable+0x16a/0x350 > > kernel_init+0xd/0x116 > > ret_from_fork+0x3a/0x50 > > irq event stamp: 1629166 > > hardirqs last enabled at (1629166): [<ffffffff929c0227>] __call_rcu+0x1b7/0x3b0 > > hardirqs last disabled at (1629165): [<ffffffff929c0129>] __call_rcu+0xb9/0x3b0 > > softirqs last enabled at (1626882): [<ffffffff928df0d5>] irq_enter+0x75/0x80 > > softirqs last disabled at (1626883): [<ffffffff928df1e1>] irq_exit+0x101/0x110 > > \x0aother info that might help us debug this: > > Possible unsafe locking scenario:\x0a > > CPU0 > > ---- > > lock(fs_reclaim); > > <Interrupt> > > lock(fs_reclaim); > > \x0a *** DEADLOCK ***\x0a > > 1 lock held by swapper/30/0: > > #0: ffffffff94315e00 (rcu_callback){....}, at: rcu_core+0x395/0x9b0 > > \x0astack backtrace: > > CPU: 30 PID: 0 Comm: swapper/30 Tainted: G W E 5.6.14+ #18 > > Hardware name: ... > > Call Trace: > > <IRQ> > > dump_stack+0xb8/0x110 > > mark_lock+0x7ee/0x9d0 > > ? check_usage_backwards+0x230/0x230 > > __lock_acquire+0xb84/0x2650 > > ? lockdep_hardirqs_on+0x182/0x260 > > ? lockdep_hardirqs_on+0x260/0x260 > > ? debug_object_deactivate+0x210/0x210 > > ? trace_hardirqs_on_thunk+0x1a/0x1c > > lock_acquire+0xe5/0x1f0 > > ? fs_reclaim_acquire.part.0+0x5/0x30 > > ? tbl_mask_array_realloc+0x38/0x1d0 [openvswitch] > > fs_reclaim_acquire.part.0+0x25/0x30 > > ? fs_reclaim_acquire.part.0+0x5/0x30 > > __kmalloc+0x4d/0x320 > > tbl_mask_array_realloc+0x38/0x1d0 [openvswitch] > > ? table_instance_flow_free+0x2ba/0x340 [openvswitch] > > table_instance_destroy+0xf9/0x1b0 [openvswitch] > > ? new_vport+0xb0/0xb0 [openvswitch] > > destroy_dp_rcu+0x12/0x50 [openvswitch] > > rcu_core+0x34d/0x9b0 > > ? rcu_all_qs+0x90/0x90 > > ? rcu_read_lock_sched_held+0xa5/0xc0 > > ? rcu_read_lock_bh_held+0xc0/0xc0 > > ? run_rebalance_domains+0x11d/0x140 > > __do_softirq+0x128/0x55c > > irq_exit+0x101/0x110 > > smp_apic_timer_interrupt+0xfd/0x2f0 > > apic_timer_interrupt+0xf/0x20 > > </IRQ> > > RIP: 0010:cpuidle_enter_state+0xda/0x5d0 > > Code: 80 7c 24 10 00 74 17 9c 58 0f 1f 44 00 00 f6 c4 02 0f 85 be 04 > > 00 00 31 ff e8 c2 1a 7a ff e8 9d 4d 84 ff fb 66 0f 1f 44 00 00 <45> 85 > > ed 0f 88 b6 03 00 00 4d 63 f5 4b 8d 04 76 4e 8d 3c f5 00 00 > > RSP: 0018:ffff888103f07d58 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13 > > RAX: 0000000000000000 RBX: ffff888c3e5c1800 RCX: dffffc0000000000 > > RDX: 0000000000000007 RSI: 0000000000000006 RDI: ffff888103ec88d4 > > RBP: ffffffff945a3940 R08: ffffffff92982042 R09: 0000000000000000 > > R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000002 > > R13: 0000000000000002 R14: 00000000000000d0 R15: ffffffff945a3a10 > > ? lockdep_hardirqs_on+0x182/0x260 > > ? cpuidle_enter_state+0xd3/0x5d0 > > cpuidle_enter+0x3c/0x60 > > do_idle+0x36a/0x450 > > ? arch_cpu_idle_exit+0x40/0x40 > > cpu_startup_entry+0x19/0x20 > > start_secondary+0x21f/0x290 > > ? set_cpu_sibling_map+0xcb0/0xcb0 > > secondary_startup_64+0xa4/0xb0 > > > > > > ------------------------------------------------------------------------------- > > > > > > ------------[ cut here ]------------ > > ODEBUG: activate active (active state 1) object type: rcu_head hint: 0x0 > > WARNING: CPU: 30 PID: 0 at lib/debugobjects.c:485 debug_print_object+0xc6/0xe0 > > Modules linked in: ... > > CPU: 30 PID: 0 Comm: swapper/30 Tainted: G W E 5.6.14+ #18 > > Hardware name: ... > > RIP: 0010:debug_print_object+0xc6/0xe0 > > Code: dd 20 6c 52 94 e8 3a 0a cf ff 4d 89 f1 49 89 e8 44 89 e1 48 8b > > 14 dd 20 6c 52 94 4c 89 ee 48 c7 c7 40 7f c4 93 e8 84 fd 98 ff <0f> 0b > > 5b 83 05 04 89 86 01 01 5d 41 5c 41 5d 41 5e 41 5f c3 66 0f > > RSP: 0018:ffff888c3e589c38 EFLAGS: 00010286 > > RAX: 0000000000000000 RBX: 0000000000000003 RCX: 0000000000000000 > > RDX: 1ffff11187cb5415 RSI: 0000000000000008 RDI: ffffed1187cb1379 > > RBP: ffffffff93aca380 R08: 0000000000000001 R09: ffffed1187cb6691 > > R10: ffffed1187cb6690 R11: ffff888c3e5b3487 R12: 0000000000000001 > > R13: ffffffff93c48600 R14: 0000000000000000 R15: 0000000000000000 > > FS: 0000000000000000(0000) GS:ffff888c3e580000(0000) knlGS:0000000000000000 > > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > > CR2: 00000733feb8a700 CR3: 0000000ba726e004 CR4: 00000000003606e0 > > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > > Call Trace: > > <IRQ> > > debug_object_activate+0x2b8/0x300 > > ? debug_object_assert_init+0x220/0x220 > > ? kasan_unpoison_shadow+0x33/0x40 > > __call_rcu+0x34/0x3b0 > > ? __kasan_kmalloc.constprop.0+0xc2/0xd0 > > tbl_mask_array_realloc+0x170/0x1d0 [openvswitch] > > table_instance_destroy+0xf9/0x1b0 [openvswitch] > > ? new_vport+0xb0/0xb0 [openvswitch] > > destroy_dp_rcu+0x12/0x50 [openvswitch] > > rcu_core+0x34d/0x9b0 > > ? rcu_all_qs+0x90/0x90 > > ? rcu_read_lock_sched_held+0xa5/0xc0 > > ? rcu_read_lock_bh_held+0xc0/0xc0 > > ? run_rebalance_domains+0x11d/0x140 > > __do_softirq+0x128/0x55c > > irq_exit+0x101/0x110 > > smp_apic_timer_interrupt+0xfd/0x2f0 > > apic_timer_interrupt+0xf/0x20 > > </IRQ> > > RIP: 0010:cpuidle_enter_state+0xda/0x5d0 > > Code: 80 7c 24 10 00 74 17 9c 58 0f 1f 44 00 00 f6 c4 02 0f 85 be 04 > > 00 00 31 ff e8 c2 1a 7a ff e8 9d 4d 84 ff fb 66 0f 1f 44 00 00 <45> 85 > > ed 0f 88 b6 03 00 00 4d 63 f5 4b 8d 04 76 4e 8d 3c f5 00 00 > > RSP: 0018:ffff888103f07d58 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13 > > RAX: 0000000000000000 RBX: ffff888c3e5c1800 RCX: dffffc0000000000 > > RDX: 0000000000000007 RSI: 0000000000000006 RDI: ffff888103ec88d4 > > RBP: ffffffff945a3940 R08: ffffffff92982042 R09: 0000000000000000 > > R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000002 > > R13: 0000000000000002 R14: 00000000000000d0 R15: ffffffff945a3a10 > > ? lockdep_hardirqs_on+0x182/0x260 > > ? cpuidle_enter_state+0xd3/0x5d0 > > cpuidle_enter+0x3c/0x60 > > do_idle+0x36a/0x450 > > ? arch_cpu_idle_exit+0x40/0x40 > > cpu_startup_entry+0x19/0x20 > > start_secondary+0x21f/0x290 > > ? set_cpu_sibling_map+0xcb0/0xcb0 > > secondary_startup_64+0xa4/0xb0 > > irq event stamp: 1629166 > > hardirqs last enabled at (1629166): [<ffffffff929c0227>] __call_rcu+0x1b7/0x3b0 > > hardirqs last disabled at (1629165): [<ffffffff929c0129>] __call_rcu+0xb9/0x3b0 > > softirqs last enabled at (1626882): [<ffffffff928df0d5>] irq_enter+0x75/0x80 > > softirqs last disabled at (1626883): [<ffffffff928df1e1>] irq_exit+0x101/0x110 > > ---[ end trace 8dc48dec48bb79f0 ]--- > > > > > > ------------------------------------------------------------------------------- > > > > > > ------------[ cut here ]------------ > > __call_rcu(): Double-freed CB 00000000111691a8->0x0()!!! > > WARNING: CPU: 30 PID: 0 at kernel/rcu/tree.c:2594 __call_rcu+0x20f/0x3b0 > > Modules linked in: ... > > CPU: 30 PID: 0 Comm: swapper/30 Tainted: G W E 5.6.14+ #18 > > Hardware name: ... > > RIP: 0010:__call_rcu+0x20f/0x3b0 > > Code: 5e 41 5f e9 03 59 0d 00 48 89 ef c6 05 5d 2e dc 01 01 e8 94 33 > > 27 00 48 8b 53 08 48 89 de 48 c7 c7 20 bf ac 93 e8 eb 26 f1 ff <0f> 0b > > e9 4b fe ff ff e8 05 fd ff ff e9 21 ff ff ff 0f 0b e9 fe fd > > RSP: 0018:ffff888c3e589d58 EFLAGS: 00010282 > > RAX: 0000000000000000 RBX: ffff888bd345e200 RCX: 0000000000000000 > > RDX: 1ffff11187cb5415 RSI: 0000000000000008 RDI: ffffed1187cb139d > > RBP: ffff888bd345e208 R08: 0000000000000001 R09: ffffed1187cb6691 > > R10: ffffed1187cb6690 R11: ffff888c3e5b3487 R12: 0000000000000000 > > R13: 0000000000000001 R14: ffff888107e06618 R15: ffff888bd345e2f8 > > FS: 0000000000000000(0000) GS:ffff888c3e580000(0000) knlGS:0000000000000000 > > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > > CR2: 00000733feb8a700 CR3: 0000000ba726e004 CR4: 00000000003606e0 > > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > > Call Trace: > > <IRQ> > > ? __kasan_kmalloc.constprop.0+0xc2/0xd0 > > tbl_mask_array_realloc+0x170/0x1d0 [openvswitch] > > table_instance_destroy+0xf9/0x1b0 [openvswitch] > > ? new_vport+0xb0/0xb0 [openvswitch] > > destroy_dp_rcu+0x12/0x50 [openvswitch] > > rcu_core+0x34d/0x9b0 > > ? rcu_all_qs+0x90/0x90 > > ? rcu_read_lock_sched_held+0xa5/0xc0 > > ? rcu_read_lock_bh_held+0xc0/0xc0 > > ? run_rebalance_domains+0x11d/0x140 > > __do_softirq+0x128/0x55c > > irq_exit+0x101/0x110 > > smp_apic_timer_interrupt+0xfd/0x2f0 > > apic_timer_interrupt+0xf/0x20 > > </IRQ> > > RIP: 0010:cpuidle_enter_state+0xda/0x5d0 > > Code: 80 7c 24 10 00 74 17 9c 58 0f 1f 44 00 00 f6 c4 02 0f 85 be 04 > > 00 00 31 ff e8 c2 1a 7a ff e8 9d 4d 84 ff fb 66 0f 1f 44 00 00 <45> 85 > > ed 0f 88 b6 03 00 00 4d 63 f5 4b 8d 04 76 4e 8d 3c f5 00 00 > > RSP: 0018:ffff888103f07d58 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13 > > RAX: 0000000000000000 RBX: ffff888c3e5c1800 RCX: dffffc0000000000 > > RDX: 0000000000000007 RSI: 0000000000000006 RDI: ffff888103ec88d4 > > RBP: ffffffff945a3940 R08: ffffffff92982042 R09: 0000000000000000 > > R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000002 > > R13: 0000000000000002 R14: 00000000000000d0 R15: ffffffff945a3a10 > > ? lockdep_hardirqs_on+0x182/0x260 > > ? cpuidle_enter_state+0xd3/0x5d0 > > cpuidle_enter+0x3c/0x60 > > do_idle+0x36a/0x450 > > ? arch_cpu_idle_exit+0x40/0x40 > > cpu_startup_entry+0x19/0x20 > > start_secondary+0x21f/0x290 > > ? set_cpu_sibling_map+0xcb0/0xcb0 > > secondary_startup_64+0xa4/0xb0 > > irq event stamp: 1629166 > > hardirqs last enabled at (1629166): [<ffffffff929c0227>] __call_rcu+0x1b7/0x3b0 > > hardirqs last disabled at (1629165): [<ffffffff929c0129>] __call_rcu+0xb9/0x3b0 > > softirqs last enabled at (1626882): [<ffffffff928df0d5>] irq_enter+0x75/0x80 > > softirqs last disabled at (1626883): [<ffffffff928df1e1>] irq_exit+0x101/0x110 > > ---[ end trace 8dc48dec48bb79f2 ]--- > > _______________________________________________ > > discuss mailing list > > discuss@openvswitch.org > > https://mail.openvswitch.org/mailman/listinfo/ovs-discuss > > ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [ovs-discuss] Double free in recent kernels after memleak fix 2020-08-07 15:31 ` Johan Knöös @ 2020-08-07 20:47 ` Joel Fernandes 2020-08-07 20:49 ` Joel Fernandes 2020-08-07 22:20 ` Paul E. McKenney 2020-08-07 21:52 ` Cong Wang 1 sibling, 2 replies; 17+ messages in thread From: Joel Fernandes @ 2020-08-07 20:47 UTC (permalink / raw) To: Johan Knöös Cc: Gregory Rose, bugs, Tonghao Zhang, Netdev, Uladzislau Rezki (Sony), Paul E. McKenney, rcu Hi, Adding more of us working on RCU as well. Johan from another team at Google discovered a likely issue in openswitch, details below: On Fri, Aug 7, 2020 at 11:32 AM Johan Knöös <jknoos@google.com> wrote: > > On Tue, Aug 4, 2020 at 8:52 AM Gregory Rose <gvrose8192@gmail.com> wrote: > > > > > > > > On 8/3/2020 12:01 PM, Johan Knöös via discuss wrote: > > > Hi Open vSwitch contributors, > > > > > > We have found openvswitch is causing double-freeing of memory. The > > > issue was not present in kernel version 5.5.17 but is present in > > > 5.6.14 and newer kernels. > > > > > > After reverting the RCU commits below for debugging, enabling > > > slub_debug, lockdep, and KASAN, we see the warnings at the end of this > > > email in the kernel log (the last one shows the double-free). When I > > > revert 50b0e61b32ee890a75b4377d5fbe770a86d6a4c1 ("net: openvswitch: > > > fix possible memleak on destroy flow-table"), the symptoms disappear. > > > While I have a reliable way to reproduce the issue, I unfortunately > > > don't yet have a process that's amenable to sharing. Please take a > > > look. > > > > > > 189a6883dcf7 rcu: Remove kfree_call_rcu_nobatch() > > > 77a40f97030b rcu: Remove kfree_rcu() special casing and lazy-callback handling > > > e99637becb2e rcu: Add support for debug_objects debugging for kfree_rcu() > > > 0392bebebf26 rcu: Add multiple in-flight batches of kfree_rcu() work > > > 569d767087ef rcu: Make kfree_rcu() use a non-atomic ->monitor_todo > > > a35d16905efc rcu: Add basic support for kfree_rcu() batching > > > Note that these reverts were only for testing the same code, because he was testing 2 different kernel versions. One of them did not have this set. So I asked him to revert. There's no known bug in the reverted code itself. But somehow these patches do make it harder for him to reproduce the issue. > > > Thanks, > > > Johan Knöös > > > > Let's add the author of the patch you reverted and the Linux netdev > > mailing list. > > > > - Greg > > I found we also sometimes get warnings from > https://elixir.bootlin.com/linux/v5.5.17/source/kernel/rcu/tree.c#L2239 > under similar conditions even on kernel 5.5.17, which I believe may be > related. However, it's much rarer and I don't have a reliable way of > reproducing it. Perhaps 50b0e61b32ee890a75b4377d5fbe770a86d6a4c1 only > increases the frequency of a pre-existing bug. This is interesting, because I saw kbuild warn me recently [1] about it as well. Though, I was actually intentionally messing with the segcblist. I plan to debug it next week, but the warning itself is unlikely to be caused by my patch IMHO (since it is slightly orthogonal to what I changed). [1] https://lore.kernel.org/lkml/20200720005334.GC19262@shao2-debian/ But then again, I have not heard reports of this warning firing. Paul, has this come to your radar recently? Thanks, - Joel ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [ovs-discuss] Double free in recent kernels after memleak fix 2020-08-07 20:47 ` Joel Fernandes @ 2020-08-07 20:49 ` Joel Fernandes 2020-08-07 22:20 ` Paul E. McKenney 1 sibling, 0 replies; 17+ messages in thread From: Joel Fernandes @ 2020-08-07 20:49 UTC (permalink / raw) To: Johan Knöös Cc: Gregory Rose, bugs, Tonghao Zhang, Netdev, Uladzislau Rezki (Sony), Paul E. McKenney, rcu On Fri, Aug 7, 2020 at 4:47 PM Joel Fernandes <joel@joelfernandes.org> wrote: > > Hi, > Adding more of us working on RCU as well. Johan from another team at > Google discovered a likely issue in openswitch, details below: > > On Fri, Aug 7, 2020 at 11:32 AM Johan Knöös <jknoos@google.com> wrote: > > > > On Tue, Aug 4, 2020 at 8:52 AM Gregory Rose <gvrose8192@gmail.com> wrote: > > > > > > > > > > > > On 8/3/2020 12:01 PM, Johan Knöös via discuss wrote: > > > > Hi Open vSwitch contributors, > > > > > > > > We have found openvswitch is causing double-freeing of memory. The > > > > issue was not present in kernel version 5.5.17 but is present in > > > > 5.6.14 and newer kernels. > > > > > > > > After reverting the RCU commits below for debugging, enabling > > > > slub_debug, lockdep, and KASAN, we see the warnings at the end of this > > > > email in the kernel log (the last one shows the double-free). When I > > > > revert 50b0e61b32ee890a75b4377d5fbe770a86d6a4c1 ("net: openvswitch: > > > > fix possible memleak on destroy flow-table"), the symptoms disappear. > > > > While I have a reliable way to reproduce the issue, I unfortunately > > > > don't yet have a process that's amenable to sharing. Please take a > > > > look. > > > > > > > > 189a6883dcf7 rcu: Remove kfree_call_rcu_nobatch() > > > > 77a40f97030b rcu: Remove kfree_rcu() special casing and lazy-callback handling > > > > e99637becb2e rcu: Add support for debug_objects debugging for kfree_rcu() > > > > 0392bebebf26 rcu: Add multiple in-flight batches of kfree_rcu() work > > > > 569d767087ef rcu: Make kfree_rcu() use a non-atomic ->monitor_todo > > > > a35d16905efc rcu: Add basic support for kfree_rcu() batching > > > > > > Note that these reverts were only for testing the same code, because > he was testing 2 different kernel versions. One of them did not have > this set. So I asked him to revert. There's no known bug in the > reverted code itself. But somehow these patches do make it harder for > him to reproduce the issue. And the reason for this is likely the additional kfree batching is slowing down the occurrence of the crash. thanks, - Joel ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [ovs-discuss] Double free in recent kernels after memleak fix 2020-08-07 20:47 ` Joel Fernandes 2020-08-07 20:49 ` Joel Fernandes @ 2020-08-07 22:20 ` Paul E. McKenney 2020-08-07 23:05 ` Johan Knöös 2020-08-10 20:08 ` Joel Fernandes 1 sibling, 2 replies; 17+ messages in thread From: Paul E. McKenney @ 2020-08-07 22:20 UTC (permalink / raw) To: Joel Fernandes Cc: Johan Knöös, Gregory Rose, bugs, Tonghao Zhang, Netdev, Uladzislau Rezki (Sony), rcu On Fri, Aug 07, 2020 at 04:47:56PM -0400, Joel Fernandes wrote: > Hi, > Adding more of us working on RCU as well. Johan from another team at > Google discovered a likely issue in openswitch, details below: > > On Fri, Aug 7, 2020 at 11:32 AM Johan Knöös <jknoos@google.com> wrote: > > > > On Tue, Aug 4, 2020 at 8:52 AM Gregory Rose <gvrose8192@gmail.com> wrote: > > > > > > > > > > > > On 8/3/2020 12:01 PM, Johan Knöös via discuss wrote: > > > > Hi Open vSwitch contributors, > > > > > > > > We have found openvswitch is causing double-freeing of memory. The > > > > issue was not present in kernel version 5.5.17 but is present in > > > > 5.6.14 and newer kernels. > > > > > > > > After reverting the RCU commits below for debugging, enabling > > > > slub_debug, lockdep, and KASAN, we see the warnings at the end of this > > > > email in the kernel log (the last one shows the double-free). When I > > > > revert 50b0e61b32ee890a75b4377d5fbe770a86d6a4c1 ("net: openvswitch: > > > > fix possible memleak on destroy flow-table"), the symptoms disappear. > > > > While I have a reliable way to reproduce the issue, I unfortunately > > > > don't yet have a process that's amenable to sharing. Please take a > > > > look. > > > > > > > > 189a6883dcf7 rcu: Remove kfree_call_rcu_nobatch() > > > > 77a40f97030b rcu: Remove kfree_rcu() special casing and lazy-callback handling > > > > e99637becb2e rcu: Add support for debug_objects debugging for kfree_rcu() > > > > 0392bebebf26 rcu: Add multiple in-flight batches of kfree_rcu() work > > > > 569d767087ef rcu: Make kfree_rcu() use a non-atomic ->monitor_todo > > > > a35d16905efc rcu: Add basic support for kfree_rcu() batching > > Note that these reverts were only for testing the same code, because > he was testing 2 different kernel versions. One of them did not have > this set. So I asked him to revert. There's no known bug in the > reverted code itself. But somehow these patches do make it harder for > him to reproduce the issue. Perhaps they adjust timing? > > > > Thanks, > > > > Johan Knöös > > > > > > Let's add the author of the patch you reverted and the Linux netdev > > > mailing list. > > > > > > - Greg > > > > I found we also sometimes get warnings from > > https://elixir.bootlin.com/linux/v5.5.17/source/kernel/rcu/tree.c#L2239 > > under similar conditions even on kernel 5.5.17, which I believe may be > > related. However, it's much rarer and I don't have a reliable way of > > reproducing it. Perhaps 50b0e61b32ee890a75b4377d5fbe770a86d6a4c1 only > > increases the frequency of a pre-existing bug. > > This is interesting, because I saw kbuild warn me recently [1] about > it as well. Though, I was actually intentionally messing with the > segcblist. I plan to debug it next week, but the warning itself is > unlikely to be caused by my patch IMHO (since it is slightly > orthogonal to what I changed). > > [1] https://lore.kernel.org/lkml/20200720005334.GC19262@shao2-debian/ > > But then again, I have not heard reports of this warning firing. Paul, > has this come to your radar recently? I have not seen any recent WARNs in rcu_do_batch(). I am guessing that this is one of the last two in that function? If so, have you tried using CONFIG_DEBUG_OBJECTS_RCU_HEAD=y? That Kconfig option is designed to help locate double frees via RCU. Thanx, Paul ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [ovs-discuss] Double free in recent kernels after memleak fix 2020-08-07 22:20 ` Paul E. McKenney @ 2020-08-07 23:05 ` Johan Knöös 2020-08-08 11:44 ` Uladzislau Rezki 2020-08-10 20:08 ` Joel Fernandes 1 sibling, 1 reply; 17+ messages in thread From: Johan Knöös @ 2020-08-07 23:05 UTC (permalink / raw) To: paulmck Cc: Joel Fernandes, Gregory Rose, bugs, Tonghao Zhang, Netdev, Uladzislau Rezki (Sony), rcu On Fri, Aug 7, 2020 at 3:20 PM Paul E. McKenney <paulmck@kernel.org> wrote: > > On Fri, Aug 07, 2020 at 04:47:56PM -0400, Joel Fernandes wrote: > > Hi, > > Adding more of us working on RCU as well. Johan from another team at > > Google discovered a likely issue in openswitch, details below: > > > > On Fri, Aug 7, 2020 at 11:32 AM Johan Knöös <jknoos@google.com> wrote: > > > > > > On Tue, Aug 4, 2020 at 8:52 AM Gregory Rose <gvrose8192@gmail.com> wrote: > > > > > > > > > > > > > > > > On 8/3/2020 12:01 PM, Johan Knöös via discuss wrote: > > > > > Hi Open vSwitch contributors, > > > > > > > > > > We have found openvswitch is causing double-freeing of memory. The > > > > > issue was not present in kernel version 5.5.17 but is present in > > > > > 5.6.14 and newer kernels. > > > > > > > > > > After reverting the RCU commits below for debugging, enabling > > > > > slub_debug, lockdep, and KASAN, we see the warnings at the end of this > > > > > email in the kernel log (the last one shows the double-free). When I > > > > > revert 50b0e61b32ee890a75b4377d5fbe770a86d6a4c1 ("net: openvswitch: > > > > > fix possible memleak on destroy flow-table"), the symptoms disappear. > > > > > While I have a reliable way to reproduce the issue, I unfortunately > > > > > don't yet have a process that's amenable to sharing. Please take a > > > > > look. > > > > > > > > > > 189a6883dcf7 rcu: Remove kfree_call_rcu_nobatch() > > > > > 77a40f97030b rcu: Remove kfree_rcu() special casing and lazy-callback handling > > > > > e99637becb2e rcu: Add support for debug_objects debugging for kfree_rcu() > > > > > 0392bebebf26 rcu: Add multiple in-flight batches of kfree_rcu() work > > > > > 569d767087ef rcu: Make kfree_rcu() use a non-atomic ->monitor_todo > > > > > a35d16905efc rcu: Add basic support for kfree_rcu() batching > > > > Note that these reverts were only for testing the same code, because > > he was testing 2 different kernel versions. One of them did not have > > this set. So I asked him to revert. There's no known bug in the > > reverted code itself. But somehow these patches do make it harder for > > him to reproduce the issue. I'm not certain the frequency of the issue changes with and without these commits on 5.6.14, but at least the symptoms/definition of the issue changes. To clarify, this is what I've observed with different kernels: * 5.6.14: "kernel BUG at mm/slub.c:304!". Easily reproducible. * 5.6.14 with the above RCU commits reverted: the warnings reported in my original email. Easily reproducible. * 5.6.14 with the above RCU commits reverted and 50b0e61b32ee890a75b4377d5fbe770a86d6a4c1 reverted: no warnings observed (the frequency might be the same as on 5.5.17). * 5.5.17: warning at kernel/rcu/tree.c#L2239. Difficult to reproduce. Maybe a different root cause. > Perhaps they adjust timing? ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [ovs-discuss] Double free in recent kernels after memleak fix 2020-08-07 23:05 ` Johan Knöös @ 2020-08-08 11:44 ` Uladzislau Rezki 0 siblings, 0 replies; 17+ messages in thread From: Uladzislau Rezki @ 2020-08-08 11:44 UTC (permalink / raw) To: Johan Knöös Cc: paulmck, Joel Fernandes, Gregory Rose, bugs, Tonghao Zhang, Netdev, Uladzislau Rezki (Sony), rcu > On Fri, Aug 7, 2020 at 3:20 PM Paul E. McKenney <paulmck@kernel.org> wrote: > > > > On Fri, Aug 07, 2020 at 04:47:56PM -0400, Joel Fernandes wrote: > > > Hi, > > > Adding more of us working on RCU as well. Johan from another team at > > > Google discovered a likely issue in openswitch, details below: > > > > > > On Fri, Aug 7, 2020 at 11:32 AM Johan Knöös <jknoos@google.com> wrote: > > > > > > > > On Tue, Aug 4, 2020 at 8:52 AM Gregory Rose <gvrose8192@gmail.com> wrote: > > > > > > > > > > > > > > > > > > > > On 8/3/2020 12:01 PM, Johan Knöös via discuss wrote: > > > > > > Hi Open vSwitch contributors, > > > > > > > > > > > > We have found openvswitch is causing double-freeing of memory. The > > > > > > issue was not present in kernel version 5.5.17 but is present in > > > > > > 5.6.14 and newer kernels. > > > > > > > > > > > > After reverting the RCU commits below for debugging, enabling > > > > > > slub_debug, lockdep, and KASAN, we see the warnings at the end of this > > > > > > email in the kernel log (the last one shows the double-free). When I > > > > > > revert 50b0e61b32ee890a75b4377d5fbe770a86d6a4c1 ("net: openvswitch: > > > > > > fix possible memleak on destroy flow-table"), the symptoms disappear. > > > > > > While I have a reliable way to reproduce the issue, I unfortunately > > > > > > don't yet have a process that's amenable to sharing. Please take a > > > > > > look. > > > > > > > > > > > > 189a6883dcf7 rcu: Remove kfree_call_rcu_nobatch() > > > > > > 77a40f97030b rcu: Remove kfree_rcu() special casing and lazy-callback handling > > > > > > e99637becb2e rcu: Add support for debug_objects debugging for kfree_rcu() > > > > > > 0392bebebf26 rcu: Add multiple in-flight batches of kfree_rcu() work > > > > > > 569d767087ef rcu: Make kfree_rcu() use a non-atomic ->monitor_todo > > > > > > a35d16905efc rcu: Add basic support for kfree_rcu() batching > > > > > > Note that these reverts were only for testing the same code, because > > > he was testing 2 different kernel versions. One of them did not have > > > this set. So I asked him to revert. There's no known bug in the > > > reverted code itself. But somehow these patches do make it harder for > > > him to reproduce the issue. > > I'm not certain the frequency of the issue changes with and without > these commits on 5.6.14, but at least the symptoms/definition of the > issue changes. To clarify, this is what I've observed with different > kernels: > * 5.6.14: "kernel BUG at mm/slub.c:304!". Easily reproducible. > * 5.6.14 with the above RCU commits reverted: the warnings reported in > my original email. Easily reproducible. > * 5.6.14 with the above RCU commits reverted and > 50b0e61b32ee890a75b4377d5fbe770a86d6a4c1 reverted: no warnings > observed (the frequency might be the same as on 5.5.17). > * 5.5.17: warning at kernel/rcu/tree.c#L2239. Difficult to reproduce. > Maybe a different root cause. > If you can reproduce it, maybe enabling CONFIG_KASAN will detect something? It can detect out-of-bounds and use after free bugs. Thanks. -- Vlad Rezki ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [ovs-discuss] Double free in recent kernels after memleak fix 2020-08-07 22:20 ` Paul E. McKenney 2020-08-07 23:05 ` Johan Knöös @ 2020-08-10 20:08 ` Joel Fernandes 2020-08-10 20:28 ` Paul E. McKenney 1 sibling, 1 reply; 17+ messages in thread From: Joel Fernandes @ 2020-08-10 20:08 UTC (permalink / raw) To: Paul E. McKenney Cc: Johan Knöös, Gregory Rose, bugs, Tonghao Zhang, Netdev, Uladzislau Rezki (Sony), rcu On Fri, Aug 07, 2020 at 03:20:15PM -0700, Paul E. McKenney wrote: > On Fri, Aug 07, 2020 at 04:47:56PM -0400, Joel Fernandes wrote: > > Hi, > > Adding more of us working on RCU as well. Johan from another team at > > Google discovered a likely issue in openswitch, details below: > > > > On Fri, Aug 7, 2020 at 11:32 AM Johan Knöös <jknoos@google.com> wrote: > > > > > > On Tue, Aug 4, 2020 at 8:52 AM Gregory Rose <gvrose8192@gmail.com> wrote: > > > > > > > > > > > > > > > > On 8/3/2020 12:01 PM, Johan Knöös via discuss wrote: > > > > > Hi Open vSwitch contributors, > > > > > > > > > > We have found openvswitch is causing double-freeing of memory. The > > > > > issue was not present in kernel version 5.5.17 but is present in > > > > > 5.6.14 and newer kernels. > > > > > > > > > > After reverting the RCU commits below for debugging, enabling > > > > > slub_debug, lockdep, and KASAN, we see the warnings at the end of this > > > > > email in the kernel log (the last one shows the double-free). When I > > > > > revert 50b0e61b32ee890a75b4377d5fbe770a86d6a4c1 ("net: openvswitch: > > > > > fix possible memleak on destroy flow-table"), the symptoms disappear. > > > > > While I have a reliable way to reproduce the issue, I unfortunately > > > > > don't yet have a process that's amenable to sharing. Please take a > > > > > look. > > > > > > > > > > 189a6883dcf7 rcu: Remove kfree_call_rcu_nobatch() > > > > > 77a40f97030b rcu: Remove kfree_rcu() special casing and lazy-callback handling > > > > > e99637becb2e rcu: Add support for debug_objects debugging for kfree_rcu() > > > > > 0392bebebf26 rcu: Add multiple in-flight batches of kfree_rcu() work > > > > > 569d767087ef rcu: Make kfree_rcu() use a non-atomic ->monitor_todo > > > > > a35d16905efc rcu: Add basic support for kfree_rcu() batching > > > > Note that these reverts were only for testing the same code, because > > he was testing 2 different kernel versions. One of them did not have > > this set. So I asked him to revert. There's no known bug in the > > reverted code itself. But somehow these patches do make it harder for > > him to reproduce the issue. > > Perhaps they adjust timing? Yes that could be it. In my testing (which is unrelated to OVS), the issue happens only with TREE02. I can reproduce the issue in [1] on just boot-up of TREE02. I could have screwed up something in my segcblist count patch, any hints would be great. I'll dig more into it as well. > > > > But then again, I have not heard reports of this warning firing. Paul, > > has this come to your radar recently? > > I have not seen any recent WARNs in rcu_do_batch(). I am guessing that > this is one of the last two in that function? > > If so, have you tried using CONFIG_DEBUG_OBJECTS_RCU_HEAD=y? That Kconfig > option is designed to help locate double frees via RCU. Yes true, kfree_rcu() also has support for this. Jonathan, did you get a chance to try this out in your failure scenario? thanks, - Joel [1] https://lore.kernel.org/lkml/20200720005334.GC19262@shao2-debian/ ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [ovs-discuss] Double free in recent kernels after memleak fix 2020-08-10 20:08 ` Joel Fernandes @ 2020-08-10 20:28 ` Paul E. McKenney 2020-08-11 1:14 ` Tonghao Zhang 0 siblings, 1 reply; 17+ messages in thread From: Paul E. McKenney @ 2020-08-10 20:28 UTC (permalink / raw) To: Joel Fernandes Cc: Johan Knöös, Gregory Rose, bugs, Tonghao Zhang, Netdev, Uladzislau Rezki (Sony), rcu On Mon, Aug 10, 2020 at 04:08:59PM -0400, Joel Fernandes wrote: > On Fri, Aug 07, 2020 at 03:20:15PM -0700, Paul E. McKenney wrote: > > On Fri, Aug 07, 2020 at 04:47:56PM -0400, Joel Fernandes wrote: > > > Hi, > > > Adding more of us working on RCU as well. Johan from another team at > > > Google discovered a likely issue in openswitch, details below: > > > > > > On Fri, Aug 7, 2020 at 11:32 AM Johan Knöös <jknoos@google.com> wrote: > > > > On Tue, Aug 4, 2020 at 8:52 AM Gregory Rose <gvrose8192@gmail.com> wrote: > > > > > On 8/3/2020 12:01 PM, Johan Knöös via discuss wrote: > > > > > > Hi Open vSwitch contributors, > > > > > > > > > > > > We have found openvswitch is causing double-freeing of memory. The > > > > > > issue was not present in kernel version 5.5.17 but is present in > > > > > > 5.6.14 and newer kernels. > > > > > > > > > > > > After reverting the RCU commits below for debugging, enabling > > > > > > slub_debug, lockdep, and KASAN, we see the warnings at the end of this > > > > > > email in the kernel log (the last one shows the double-free). When I > > > > > > revert 50b0e61b32ee890a75b4377d5fbe770a86d6a4c1 ("net: openvswitch: > > > > > > fix possible memleak on destroy flow-table"), the symptoms disappear. > > > > > > While I have a reliable way to reproduce the issue, I unfortunately > > > > > > don't yet have a process that's amenable to sharing. Please take a > > > > > > look. > > > > > > > > > > > > 189a6883dcf7 rcu: Remove kfree_call_rcu_nobatch() > > > > > > 77a40f97030b rcu: Remove kfree_rcu() special casing and lazy-callback handling > > > > > > e99637becb2e rcu: Add support for debug_objects debugging for kfree_rcu() > > > > > > 0392bebebf26 rcu: Add multiple in-flight batches of kfree_rcu() work > > > > > > 569d767087ef rcu: Make kfree_rcu() use a non-atomic ->monitor_todo > > > > > > a35d16905efc rcu: Add basic support for kfree_rcu() batching > > > > > > Note that these reverts were only for testing the same code, because > > > he was testing 2 different kernel versions. One of them did not have > > > this set. So I asked him to revert. There's no known bug in the > > > reverted code itself. But somehow these patches do make it harder for > > > him to reproduce the issue. > > > > Perhaps they adjust timing? > > Yes that could be it. In my testing (which is unrelated to OVS), the issue > happens only with TREE02. I can reproduce the issue in [1] on just boot-up of > TREE02. > > I could have screwed up something in my segcblist count patch, any hints > would be great. I'll dig more into it as well. Has anyone taken a close look at 50b0e61b32ee ("net: openvswitch: fix possible memleak on destroy flow-table") commit? Maybe it avoided the memleak so thoroughly that it did a double free? Thanx, Paul > > > But then again, I have not heard reports of this warning firing. Paul, > > > has this come to your radar recently? > > > > I have not seen any recent WARNs in rcu_do_batch(). I am guessing that > > this is one of the last two in that function? > > > > If so, have you tried using CONFIG_DEBUG_OBJECTS_RCU_HEAD=y? That Kconfig > > option is designed to help locate double frees via RCU. > > Yes true, kfree_rcu() also has support for this. Jonathan, did you get a > chance to try this out in your failure scenario? > > thanks, > > - Joel > > [1] https://lore.kernel.org/lkml/20200720005334.GC19262@shao2-debian/ ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [ovs-discuss] Double free in recent kernels after memleak fix 2020-08-10 20:28 ` Paul E. McKenney @ 2020-08-11 1:14 ` Tonghao Zhang 2020-08-11 2:24 ` Cong Wang 0 siblings, 1 reply; 17+ messages in thread From: Tonghao Zhang @ 2020-08-11 1:14 UTC (permalink / raw) To: paulmck Cc: Joel Fernandes, Johan Knöös, Gregory Rose, bugs, Netdev, Uladzislau Rezki (Sony), rcu On Tue, Aug 11, 2020 at 4:28 AM Paul E. McKenney <paulmck@kernel.org> wrote: > > On Mon, Aug 10, 2020 at 04:08:59PM -0400, Joel Fernandes wrote: > > On Fri, Aug 07, 2020 at 03:20:15PM -0700, Paul E. McKenney wrote: > > > On Fri, Aug 07, 2020 at 04:47:56PM -0400, Joel Fernandes wrote: > > > > Hi, > > > > Adding more of us working on RCU as well. Johan from another team at > > > > Google discovered a likely issue in openswitch, details below: > > > > > > > > On Fri, Aug 7, 2020 at 11:32 AM Johan Knöös <jknoos@google.com> wrote: > > > > > On Tue, Aug 4, 2020 at 8:52 AM Gregory Rose <gvrose8192@gmail.com> wrote: > > > > > > On 8/3/2020 12:01 PM, Johan Knöös via discuss wrote: > > > > > > > Hi Open vSwitch contributors, > > > > > > > > > > > > > > We have found openvswitch is causing double-freeing of memory. The > > > > > > > issue was not present in kernel version 5.5.17 but is present in > > > > > > > 5.6.14 and newer kernels. > > > > > > > > > > > > > > After reverting the RCU commits below for debugging, enabling > > > > > > > slub_debug, lockdep, and KASAN, we see the warnings at the end of this > > > > > > > email in the kernel log (the last one shows the double-free). When I > > > > > > > revert 50b0e61b32ee890a75b4377d5fbe770a86d6a4c1 ("net: openvswitch: > > > > > > > fix possible memleak on destroy flow-table"), the symptoms disappear. > > > > > > > While I have a reliable way to reproduce the issue, I unfortunately > > > > > > > don't yet have a process that's amenable to sharing. Please take a > > > > > > > look. > > > > > > > > > > > > > > 189a6883dcf7 rcu: Remove kfree_call_rcu_nobatch() > > > > > > > 77a40f97030b rcu: Remove kfree_rcu() special casing and lazy-callback handling > > > > > > > e99637becb2e rcu: Add support for debug_objects debugging for kfree_rcu() > > > > > > > 0392bebebf26 rcu: Add multiple in-flight batches of kfree_rcu() work > > > > > > > 569d767087ef rcu: Make kfree_rcu() use a non-atomic ->monitor_todo > > > > > > > a35d16905efc rcu: Add basic support for kfree_rcu() batching > > > > > > > > Note that these reverts were only for testing the same code, because > > > > he was testing 2 different kernel versions. One of them did not have > > > > this set. So I asked him to revert. There's no known bug in the > > > > reverted code itself. But somehow these patches do make it harder for > > > > him to reproduce the issue. > > > > > > Perhaps they adjust timing? > > > > Yes that could be it. In my testing (which is unrelated to OVS), the issue > > happens only with TREE02. I can reproduce the issue in [1] on just boot-up of > > TREE02. > > > > I could have screwed up something in my segcblist count patch, any hints > > would be great. I'll dig more into it as well. > > Has anyone taken a close look at 50b0e61b32ee ("net: openvswitch: fix > possible memleak on destroy flow-table") commit? Maybe it avoided the > memleak so thoroughly that it did a double free? Hi all, I send a patch to fix this. The rcu warnings disappear. I don't reproduce the double free issue. But I guess this patch may address this issue. http://patchwork.ozlabs.org/project/netdev/patch/20200811011001.75690-1-xiangxia.m.yue@gmail.com/ > Thanx, Paul > > > > > But then again, I have not heard reports of this warning firing. Paul, > > > > has this come to your radar recently? > > > > > > I have not seen any recent WARNs in rcu_do_batch(). I am guessing that > > > this is one of the last two in that function? > > > > > > If so, have you tried using CONFIG_DEBUG_OBJECTS_RCU_HEAD=y? That Kconfig > > > option is designed to help locate double frees via RCU. > > > > Yes true, kfree_rcu() also has support for this. Jonathan, did you get a > > chance to try this out in your failure scenario? > > > > thanks, > > > > - Joel > > > > [1] https://lore.kernel.org/lkml/20200720005334.GC19262@shao2-debian/ -- Best regards, Tonghao ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [ovs-discuss] Double free in recent kernels after memleak fix 2020-08-11 1:14 ` Tonghao Zhang @ 2020-08-11 2:24 ` Cong Wang 2020-08-11 3:26 ` Tonghao Zhang 0 siblings, 1 reply; 17+ messages in thread From: Cong Wang @ 2020-08-11 2:24 UTC (permalink / raw) To: Tonghao Zhang Cc: Paul E . McKenney, Joel Fernandes, Johan Knöös, Gregory Rose, bugs, Netdev, Uladzislau Rezki (Sony), rcu On Mon, Aug 10, 2020 at 6:16 PM Tonghao Zhang <xiangxia.m.yue@gmail.com> wrote: > Hi all, I send a patch to fix this. The rcu warnings disappear. I > don't reproduce the double free issue. > But I guess this patch may address this issue. > > http://patchwork.ozlabs.org/project/netdev/patch/20200811011001.75690-1-xiangxia.m.yue@gmail.com/ I don't see how your patch address the double-free, as we still free mask array twice after your patch: once in tbl_mask_array_realloc() and once in ovs_flow_tbl_destroy(). Have you tried my patch which is supposed to address this double-free? It simply skips the reallocation as it makes no sense to trigger reallocation when destroying it. Thanks. ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [ovs-discuss] Double free in recent kernels after memleak fix 2020-08-11 2:24 ` Cong Wang @ 2020-08-11 3:26 ` Tonghao Zhang 2020-08-11 4:07 ` Cong Wang 0 siblings, 1 reply; 17+ messages in thread From: Tonghao Zhang @ 2020-08-11 3:26 UTC (permalink / raw) To: Cong Wang Cc: Paul E . McKenney, Joel Fernandes, Johan Knöös, Gregory Rose, bugs, Netdev, Uladzislau Rezki (Sony), rcu On Tue, Aug 11, 2020 at 10:24 AM Cong Wang <xiyou.wangcong@gmail.com> wrote: > > On Mon, Aug 10, 2020 at 6:16 PM Tonghao Zhang <xiangxia.m.yue@gmail.com> wrote: > > Hi all, I send a patch to fix this. The rcu warnings disappear. I > > don't reproduce the double free issue. > > But I guess this patch may address this issue. > > > > http://patchwork.ozlabs.org/project/netdev/patch/20200811011001.75690-1-xiangxia.m.yue@gmail.com/ > > I don't see how your patch address the double-free, as we still > free mask array twice after your patch: once in tbl_mask_array_realloc() > and once in ovs_flow_tbl_destroy(). Hi Cong. Before my patch, we use the ovsl_dereference (rcu_dereference_protected) in the rcu callback. ovs_flow_tbl_destroy ->table_instance_destroy ->table_instance_flow_free ->flow_mask_remove ASSERT_OVSL(will print warning) ->tbl_mask_array_del_mask ovsl_dereference(rcu usage warning) so we should invoke the table_instance_destroy or others under ovs_lock to avoid (ASSERT_OVSL and rcu usage warning). with this patch, we reallocate the mask_array under ovs_lock, and free it in the rcu callback. Without it, we reallocate and free it in the rcu callback. I think we may fix it with this patch. > Have you tried my patch which is supposed to address this double-free? I don't reproduce it. but your patch does not avoid ruc usage warning and ASSERT_OVSL. > It simply skips the reallocation as it makes no sense to trigger reallocation > when destroying it. > > Thanks. -- Best regards, Tonghao ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [ovs-discuss] Double free in recent kernels after memleak fix 2020-08-11 3:26 ` Tonghao Zhang @ 2020-08-11 4:07 ` Cong Wang 2020-08-11 5:58 ` Tonghao Zhang 0 siblings, 1 reply; 17+ messages in thread From: Cong Wang @ 2020-08-11 4:07 UTC (permalink / raw) To: Tonghao Zhang Cc: Paul E . McKenney, Joel Fernandes, Johan Knöös, Gregory Rose, bugs, Netdev, Uladzislau Rezki (Sony), rcu On Mon, Aug 10, 2020 at 8:27 PM Tonghao Zhang <xiangxia.m.yue@gmail.com> wrote: > > On Tue, Aug 11, 2020 at 10:24 AM Cong Wang <xiyou.wangcong@gmail.com> wrote: > > > > On Mon, Aug 10, 2020 at 6:16 PM Tonghao Zhang <xiangxia.m.yue@gmail.com> wrote: > > > Hi all, I send a patch to fix this. The rcu warnings disappear. I > > > don't reproduce the double free issue. > > > But I guess this patch may address this issue. > > > > > > http://patchwork.ozlabs.org/project/netdev/patch/20200811011001.75690-1-xiangxia.m.yue@gmail.com/ > > > > I don't see how your patch address the double-free, as we still > > free mask array twice after your patch: once in tbl_mask_array_realloc() > > and once in ovs_flow_tbl_destroy(). > Hi Cong. > Before my patch, we use the ovsl_dereference > (rcu_dereference_protected) in the rcu callback. > ovs_flow_tbl_destroy > ->table_instance_destroy > ->table_instance_flow_free > ->flow_mask_remove > ASSERT_OVSL(will print warning) > ->tbl_mask_array_del_mask > ovsl_dereference(rcu usage warning) > I understand how your patch addresses the RCU annotation issue, which is different from double-free. > so we should invoke the table_instance_destroy or others under > ovs_lock to avoid (ASSERT_OVSL and rcu usage warning). Of course... I never doubt it. > with this patch, we reallocate the mask_array under ovs_lock, and free > it in the rcu callback. Without it, we reallocate and free it in the > rcu callback. > I think we may fix it with this patch. Does it matter which context tbl_mask_array_realloc() is called? Even with ovs_lock, we can still double free: ovs_lock() tbl_mask_array_realloc() => call_rcu(&old->rcu, mask_array_rcu_cb); ovs_unlock() ... ovs_flow_tbl_destroy() => call_rcu(&old->rcu, mask_array_rcu_cb); So still twice, right? To fix the double-free, we have to eliminate one of them, don't we? ;) > > > Have you tried my patch which is supposed to address this double-free? > I don't reproduce it. but your patch does not avoid ruc usage warning > and ASSERT_OVSL. Sure, I never intend to fix anything else but double-free. The $subject is about double free, I double checked. ;) Thanks. ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [ovs-discuss] Double free in recent kernels after memleak fix 2020-08-11 4:07 ` Cong Wang @ 2020-08-11 5:58 ` Tonghao Zhang 2020-08-11 18:28 ` Johan Knöös 2020-08-12 0:43 ` Cong Wang 0 siblings, 2 replies; 17+ messages in thread From: Tonghao Zhang @ 2020-08-11 5:58 UTC (permalink / raw) To: Cong Wang Cc: Paul E . McKenney, Joel Fernandes, Johan Knöös, Gregory Rose, bugs, Netdev, Uladzislau Rezki (Sony), rcu On Tue, Aug 11, 2020 at 12:08 PM Cong Wang <xiyou.wangcong@gmail.com> wrote: > > On Mon, Aug 10, 2020 at 8:27 PM Tonghao Zhang <xiangxia.m.yue@gmail.com> wrote: > > > > On Tue, Aug 11, 2020 at 10:24 AM Cong Wang <xiyou.wangcong@gmail.com> wrote: > > > > > > On Mon, Aug 10, 2020 at 6:16 PM Tonghao Zhang <xiangxia.m.yue@gmail.com> wrote: > > > > Hi all, I send a patch to fix this. The rcu warnings disappear. I > > > > don't reproduce the double free issue. > > > > But I guess this patch may address this issue. > > > > > > > > http://patchwork.ozlabs.org/project/netdev/patch/20200811011001.75690-1-xiangxia.m.yue@gmail.com/ > > > > > > I don't see how your patch address the double-free, as we still > > > free mask array twice after your patch: once in tbl_mask_array_realloc() > > > and once in ovs_flow_tbl_destroy(). > > Hi Cong. > > Before my patch, we use the ovsl_dereference > > (rcu_dereference_protected) in the rcu callback. > > ovs_flow_tbl_destroy > > ->table_instance_destroy > > ->table_instance_flow_free > > ->flow_mask_remove > > ASSERT_OVSL(will print warning) > > ->tbl_mask_array_del_mask > > ovsl_dereference(rcu usage warning) > > > > I understand how your patch addresses the RCU annotation issue, > which is different from double-free. > > > > so we should invoke the table_instance_destroy or others under > > ovs_lock to avoid (ASSERT_OVSL and rcu usage warning). > > Of course... I never doubt it. > > > > with this patch, we reallocate the mask_array under ovs_lock, and free > > it in the rcu callback. Without it, we reallocate and free it in the > > rcu callback. > > I think we may fix it with this patch. > > Does it matter which context tbl_mask_array_realloc() is called? > Even with ovs_lock, we can still double free: > > ovs_lock() > tbl_mask_array_realloc() > => call_rcu(&old->rcu, mask_array_rcu_cb); > ovs_unlock() > ... > ovs_flow_tbl_destroy() > => call_rcu(&old->rcu, mask_array_rcu_cb); > > So still twice, right? To fix the double-free, we have to eliminate one > of them, don't we? ;) No Without my patch: in rcu callback: ovs_flow_tbl_destroy ->call_rcu(&ma->rcu, mask_array_rcu_cb); ->table_instance_destroy ->tbl_mask_array_realloc(Shrink the mask array if necessary) ->call_rcu(&old->rcu, mask_array_rcu_cb); With the patch: ovs_lock table_instance_flow_flush (free the flow) tbl_mask_array_realloc(shrink the mask array if necessary, will free mask_array in rcu(mask_array_rcu_cb) and rcu_assign_pointer new mask_array) ovs_unlock in rcu callback: ovs_flow_tbl_destroy call_rcu(&ma->rcu, mask_array_rcu_cb);(that is new mask_array) > > > > > > Have you tried my patch which is supposed to address this double-free? > > I don't reproduce it. but your patch does not avoid ruc usage warning > > and ASSERT_OVSL. > > Sure, I never intend to fix anything else but double-free. The $subject is > about double free, I double checked. ;) > > Thanks. -- Best regards, Tonghao ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [ovs-discuss] Double free in recent kernels after memleak fix 2020-08-11 5:58 ` Tonghao Zhang @ 2020-08-11 18:28 ` Johan Knöös 2020-08-12 0:43 ` Cong Wang 1 sibling, 0 replies; 17+ messages in thread From: Johan Knöös @ 2020-08-11 18:28 UTC (permalink / raw) To: Tonghao Zhang Cc: Cong Wang, Paul E . McKenney, Joel Fernandes, Gregory Rose, bugs, Netdev, Uladzislau Rezki (Sony), rcu On Mon, Aug 10, 2020 at 10:59 PM Tonghao Zhang <xiangxia.m.yue@gmail.com> wrote: > > On Tue, Aug 11, 2020 at 12:08 PM Cong Wang <xiyou.wangcong@gmail.com> wrote: > > > > On Mon, Aug 10, 2020 at 8:27 PM Tonghao Zhang <xiangxia.m.yue@gmail.com> wrote: > > > > > > On Tue, Aug 11, 2020 at 10:24 AM Cong Wang <xiyou.wangcong@gmail.com> wrote: > > > > > > > > On Mon, Aug 10, 2020 at 6:16 PM Tonghao Zhang <xiangxia.m.yue@gmail.com> wrote: > > > > > Hi all, I send a patch to fix this. The rcu warnings disappear. I > > > > > don't reproduce the double free issue. > > > > > But I guess this patch may address this issue. > > > > > > > > > > http://patchwork.ozlabs.org/project/netdev/patch/20200811011001.75690-1-xiangxia.m.yue@gmail.com/ > > > > > > > > I don't see how your patch address the double-free, as we still > > > > free mask array twice after your patch: once in tbl_mask_array_realloc() > > > > and once in ovs_flow_tbl_destroy(). > > > Hi Cong. > > > Before my patch, we use the ovsl_dereference > > > (rcu_dereference_protected) in the rcu callback. > > > ovs_flow_tbl_destroy > > > ->table_instance_destroy > > > ->table_instance_flow_free > > > ->flow_mask_remove > > > ASSERT_OVSL(will print warning) > > > ->tbl_mask_array_del_mask > > > ovsl_dereference(rcu usage warning) > > > > > > > I understand how your patch addresses the RCU annotation issue, > > which is different from double-free. > > > > > > > so we should invoke the table_instance_destroy or others under > > > ovs_lock to avoid (ASSERT_OVSL and rcu usage warning). > > > > Of course... I never doubt it. > > > > > > > with this patch, we reallocate the mask_array under ovs_lock, and free > > > it in the rcu callback. Without it, we reallocate and free it in the > > > rcu callback. > > > I think we may fix it with this patch. > > > > Does it matter which context tbl_mask_array_realloc() is called? > > Even with ovs_lock, we can still double free: > > > > ovs_lock() > > tbl_mask_array_realloc() > > => call_rcu(&old->rcu, mask_array_rcu_cb); > > ovs_unlock() > > ... > > ovs_flow_tbl_destroy() > > => call_rcu(&old->rcu, mask_array_rcu_cb); > > > > So still twice, right? To fix the double-free, we have to eliminate one > > of them, don't we? ;) > No > Without my patch: in rcu callback: > ovs_flow_tbl_destroy > ->call_rcu(&ma->rcu, mask_array_rcu_cb); > ->table_instance_destroy > ->tbl_mask_array_realloc(Shrink the mask array if necessary) > ->call_rcu(&old->rcu, mask_array_rcu_cb); > > With the patch: > ovs_lock > table_instance_flow_flush (free the flow) > tbl_mask_array_realloc(shrink the mask array if necessary, will free > mask_array in rcu(mask_array_rcu_cb) and rcu_assign_pointer new > mask_array) > ovs_unlock > > in rcu callback: > ovs_flow_tbl_destroy > call_rcu(&ma->rcu, mask_array_rcu_cb);(that is new mask_array) > > > > > > > > > > Have you tried my patch which is supposed to address this double-free? > > > I don't reproduce it. but your patch does not avoid ruc usage warning > > > and ASSERT_OVSL. > > > > Sure, I never intend to fix anything else but double-free. The $subject is > > about double free, I double checked. ;) > > > > Thanks. > > > > -- > Best regards, Tonghao Cong and Tonghao, thanks for your patches. I cannot repro the double free with either of them, and the "suspicious RCU usage" and the ASSERT_OVSL warnings are also gone with Tonghao's patch. Tonghao, from your sequence above it looks like it should fix the https://elixir.bootlin.com/linux/v5.5.17/source/kernel/rcu/tree.c#L2239 warning, correct? ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [ovs-discuss] Double free in recent kernels after memleak fix 2020-08-11 5:58 ` Tonghao Zhang 2020-08-11 18:28 ` Johan Knöös @ 2020-08-12 0:43 ` Cong Wang 1 sibling, 0 replies; 17+ messages in thread From: Cong Wang @ 2020-08-12 0:43 UTC (permalink / raw) To: Tonghao Zhang Cc: Paul E . McKenney, Joel Fernandes, Johan Knöös, Gregory Rose, bugs, Netdev, Uladzislau Rezki (Sony), rcu On Mon, Aug 10, 2020 at 10:59 PM Tonghao Zhang <xiangxia.m.yue@gmail.com> wrote: > > On Tue, Aug 11, 2020 at 12:08 PM Cong Wang <xiyou.wangcong@gmail.com> wrote: > > > > On Mon, Aug 10, 2020 at 8:27 PM Tonghao Zhang <xiangxia.m.yue@gmail.com> wrote: > > > > > > On Tue, Aug 11, 2020 at 10:24 AM Cong Wang <xiyou.wangcong@gmail.com> wrote: > > > > > > > > On Mon, Aug 10, 2020 at 6:16 PM Tonghao Zhang <xiangxia.m.yue@gmail.com> wrote: > > > > > Hi all, I send a patch to fix this. The rcu warnings disappear. I > > > > > don't reproduce the double free issue. > > > > > But I guess this patch may address this issue. > > > > > > > > > > http://patchwork.ozlabs.org/project/netdev/patch/20200811011001.75690-1-xiangxia.m.yue@gmail.com/ > > > > > > > > I don't see how your patch address the double-free, as we still > > > > free mask array twice after your patch: once in tbl_mask_array_realloc() > > > > and once in ovs_flow_tbl_destroy(). > > > Hi Cong. > > > Before my patch, we use the ovsl_dereference > > > (rcu_dereference_protected) in the rcu callback. > > > ovs_flow_tbl_destroy > > > ->table_instance_destroy > > > ->table_instance_flow_free > > > ->flow_mask_remove > > > ASSERT_OVSL(will print warning) > > > ->tbl_mask_array_del_mask > > > ovsl_dereference(rcu usage warning) > > > > > > > I understand how your patch addresses the RCU annotation issue, > > which is different from double-free. > > > > > > > so we should invoke the table_instance_destroy or others under > > > ovs_lock to avoid (ASSERT_OVSL and rcu usage warning). > > > > Of course... I never doubt it. > > > > > > > with this patch, we reallocate the mask_array under ovs_lock, and free > > > it in the rcu callback. Without it, we reallocate and free it in the > > > rcu callback. > > > I think we may fix it with this patch. > > > > Does it matter which context tbl_mask_array_realloc() is called? > > Even with ovs_lock, we can still double free: > > > > ovs_lock() > > tbl_mask_array_realloc() > > => call_rcu(&old->rcu, mask_array_rcu_cb); > > ovs_unlock() > > ... > > ovs_flow_tbl_destroy() > > => call_rcu(&old->rcu, mask_array_rcu_cb); > > > > So still twice, right? To fix the double-free, we have to eliminate one > > of them, don't we? ;) > No > Without my patch: in rcu callback: > ovs_flow_tbl_destroy > ->call_rcu(&ma->rcu, mask_array_rcu_cb); > ->table_instance_destroy > ->tbl_mask_array_realloc(Shrink the mask array if necessary) > ->call_rcu(&old->rcu, mask_array_rcu_cb); > > With the patch: > ovs_lock > table_instance_flow_flush (free the flow) > tbl_mask_array_realloc(shrink the mask array if necessary, will free > mask_array in rcu(mask_array_rcu_cb) and rcu_assign_pointer new > mask_array) > ovs_unlock > > in rcu callback: > ovs_flow_tbl_destroy > call_rcu(&ma->rcu, mask_array_rcu_cb);(that is new mask_array) Ah, I see, I thought the mask array was cached in caller and passed along, it is in fact refetched via &dp->table. Thanks. ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [ovs-discuss] Double free in recent kernels after memleak fix 2020-08-07 15:31 ` Johan Knöös 2020-08-07 20:47 ` Joel Fernandes @ 2020-08-07 21:52 ` Cong Wang 1 sibling, 0 replies; 17+ messages in thread From: Cong Wang @ 2020-08-07 21:52 UTC (permalink / raw) To: Johan Knöös; +Cc: Gregory Rose, bugs, Tonghao Zhang, Netdev, joel [-- Attachment #1: Type: text/plain, Size: 2309 bytes --] On Fri, Aug 7, 2020 at 8:33 AM Johan Knöös <jknoos@google.com> wrote: > > On Tue, Aug 4, 2020 at 8:52 AM Gregory Rose <gvrose8192@gmail.com> wrote: > > > > > > > > On 8/3/2020 12:01 PM, Johan Knöös via discuss wrote: > > > Hi Open vSwitch contributors, > > > > > > We have found openvswitch is causing double-freeing of memory. The > > > issue was not present in kernel version 5.5.17 but is present in > > > 5.6.14 and newer kernels. > > > > > > After reverting the RCU commits below for debugging, enabling > > > slub_debug, lockdep, and KASAN, we see the warnings at the end of this > > > email in the kernel log (the last one shows the double-free). When I > > > revert 50b0e61b32ee890a75b4377d5fbe770a86d6a4c1 ("net: openvswitch: > > > fix possible memleak on destroy flow-table"), the symptoms disappear. > > > While I have a reliable way to reproduce the issue, I unfortunately > > > don't yet have a process that's amenable to sharing. Please take a > > > look. > > > > > > 189a6883dcf7 rcu: Remove kfree_call_rcu_nobatch() > > > 77a40f97030b rcu: Remove kfree_rcu() special casing and lazy-callback handling > > > e99637becb2e rcu: Add support for debug_objects debugging for kfree_rcu() > > > 0392bebebf26 rcu: Add multiple in-flight batches of kfree_rcu() work > > > 569d767087ef rcu: Make kfree_rcu() use a non-atomic ->monitor_todo > > > a35d16905efc rcu: Add basic support for kfree_rcu() batching > > > > > > Thanks, > > > Johan Knöös > > > > Let's add the author of the patch you reverted and the Linux netdev > > mailing list. > > > > - Greg > > I found we also sometimes get warnings from > https://elixir.bootlin.com/linux/v5.5.17/source/kernel/rcu/tree.c#L2239 > under similar conditions even on kernel 5.5.17, which I believe may be > related. However, it's much rarer and I don't have a reliable way of > reproducing it. Perhaps 50b0e61b32ee890a75b4377d5fbe770a86d6a4c1 only > increases the frequency of a pre-existing bug. It seems clear we have a double free on table->mask_array when the reallocation is triggered on the destroy path. Are you able to test the attached patch (compile tested only)? Also note: it is generated against the latest net tree, it may not be applied cleanly to any earlier stable release. Thanks! [-- Attachment #2: openvswitch.diff --] [-- Type: text/x-patch, Size: 2227 bytes --] diff --git a/net/openvswitch/flow_table.c b/net/openvswitch/flow_table.c index 8c12675cbb67..cc7859db445a 100644 --- a/net/openvswitch/flow_table.c +++ b/net/openvswitch/flow_table.c @@ -294,7 +294,7 @@ static int tbl_mask_array_add_mask(struct flow_table *tbl, } static void tbl_mask_array_del_mask(struct flow_table *tbl, - struct sw_flow_mask *mask) + struct sw_flow_mask *mask, bool destroy) { struct mask_array *ma = ovsl_dereference(tbl->mask_array); int i, ma_count = READ_ONCE(ma->count); @@ -314,6 +314,11 @@ static void tbl_mask_array_del_mask(struct flow_table *tbl, rcu_assign_pointer(ma->masks[i], ma->masks[ma_count -1]); RCU_INIT_POINTER(ma->masks[ma_count -1], NULL); + if (destroy) { + kfree(mask); + return; + } + kfree_rcu(mask, rcu); /* Shrink the mask array if necessary. */ @@ -326,7 +331,8 @@ static void tbl_mask_array_del_mask(struct flow_table *tbl, } /* Remove 'mask' from the mask list, if it is not needed any more. */ -static void flow_mask_remove(struct flow_table *tbl, struct sw_flow_mask *mask) +static void flow_mask_remove(struct flow_table *tbl, struct sw_flow_mask *mask, + bool destroy) { if (mask) { /* ovs-lock is required to protect mask-refcount and @@ -337,7 +343,7 @@ static void flow_mask_remove(struct flow_table *tbl, struct sw_flow_mask *mask) mask->ref_count--; if (!mask->ref_count) - tbl_mask_array_del_mask(tbl, mask); + tbl_mask_array_del_mask(tbl, mask, destroy); } } @@ -470,7 +476,7 @@ static void table_instance_flow_free(struct flow_table *table, table->ufid_count--; } - flow_mask_remove(table, flow->mask); + flow_mask_remove(table, flow->mask, !count); } static void table_instance_destroy(struct flow_table *table, @@ -521,9 +527,9 @@ void ovs_flow_tbl_destroy(struct flow_table *table) struct mask_cache *mc = rcu_dereference_raw(table->mask_cache); struct mask_array *ma = rcu_dereference_raw(table->mask_array); + table_instance_destroy(table, ti, ufid_ti, false); call_rcu(&mc->rcu, mask_cache_rcu_cb); call_rcu(&ma->rcu, mask_array_rcu_cb); - table_instance_destroy(table, ti, ufid_ti, false); } struct sw_flow *ovs_flow_tbl_dump_next(struct table_instance *ti, ^ permalink raw reply related [flat|nested] 17+ messages in thread
end of thread, other threads:[~2020-08-12 0:43 UTC | newest] Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- [not found] <CA+Sh73MJhqs7PBk6OV2AhzVjYvE1foUQUnwP5DwWR44LHZRZ9w@mail.gmail.com> 2020-08-04 15:51 ` [ovs-discuss] Double free in recent kernels after memleak fix Gregory Rose 2020-08-07 15:31 ` Johan Knöös 2020-08-07 20:47 ` Joel Fernandes 2020-08-07 20:49 ` Joel Fernandes 2020-08-07 22:20 ` Paul E. McKenney 2020-08-07 23:05 ` Johan Knöös 2020-08-08 11:44 ` Uladzislau Rezki 2020-08-10 20:08 ` Joel Fernandes 2020-08-10 20:28 ` Paul E. McKenney 2020-08-11 1:14 ` Tonghao Zhang 2020-08-11 2:24 ` Cong Wang 2020-08-11 3:26 ` Tonghao Zhang 2020-08-11 4:07 ` Cong Wang 2020-08-11 5:58 ` Tonghao Zhang 2020-08-11 18:28 ` Johan Knöös 2020-08-12 0:43 ` Cong Wang 2020-08-07 21:52 ` Cong Wang
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).