* bpf/asan related lockup
@ 2015-12-04 18:23 Dave Jones
  2015-12-04 19:06 ` Alexei Starovoitov
  0 siblings, 1 reply; 3+ messages in thread
From: Dave Jones @ 2015-12-04 18:23 UTC (permalink / raw)
  To: Linux Kernel; +Cc: Alexei Starovoitov

Trinity had apparently created a bpf program that upset things greatly.
I guess I need to find a way to make it record those somewhere for replaying later.

Alexei, any ideas ?

	Dave


NMI watchdog: BUG: soft lockup - CPU#0 stuck for 23s! [kworker/0:1:991]
irq event stamp: 153214
hardirqs last  enabled at (153213): [<ffffffffa0cb390c>] _raw_spin_unlock_irq+0x2c/0x50
hardirqs last disabled at (153214): [<ffffffffa0cb35c9>] _raw_spin_lock_irq+0x19/0x80
softirqs last  enabled at (153108): [<ffffffffa00b32f8>] __do_softirq+0x2b8/0x590
softirqs last disabled at (153103): [<ffffffffa00b3805>] irq_exit+0xf5/0x100
CPU: 0 PID: 991 Comm: kworker/0:1 Tainted: G      D W       4.4.0-rc3-think+ #5
Workqueue: events bpf_prog_free_deferred
task: ffff880464dab700 ti: ffff8803041d8000 task.ti: ffff8803041d8000
RIP: 0010:[<ffffffffa02d6950>]  [<ffffffffa02d6950>] __asan_load4+0x0/0x70
RSP: 0018:ffff8803041dfa08  EFLAGS: 00000202
RAX: 0000000000000003 RBX: ffff880468be39a8 RCX: 0000000000000000
RDX: dffffc0000000000 RSI: 0000000000000001 RDI: ffff880468be39c0
RBP: ffff8803041dfa70 R08: 0000000000000000 R09: 0000000000000001
R10: ffff8803041dfb8f R11: 0000000000000000 R12: ffff880468be39c0
R13: 0000000000000001 R14: ffff8804689dff00 R15: 0000000000000001
FS:  0000000000000000(0000) GS:ffff880468800000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007faeedb04000 CR3: 0000000452548000 CR4: 00000000001406f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600
Stack:
 ffffffffa018f2ce 00000000001dfec0 01ff880300000001 ffff8803041dfaf8
 ffffffffa0076610 ffffffffa18895d8 ffff8804689dff08 0000000000000004
 ffffffffa0076610 ffff8803041dfaf8 0000000000000001 ffffc90000171000
Call Trace:
 [<ffffffffa018f2ce>] ? smp_call_function_many+0x32e/0x410
 [<ffffffffa0076610>] ? rbt_memtype_copy_nth_element+0xd0/0xd0
 [<ffffffffa0076610>] ? rbt_memtype_copy_nth_element+0xd0/0xd0
 [<ffffffffa018f3f7>] smp_call_function+0x47/0x80
 [<ffffffffa0076610>] ? rbt_memtype_copy_nth_element+0xd0/0xd0
 [<ffffffffa018f45f>] on_each_cpu+0x2f/0x90
 [<ffffffffa0077a10>] flush_tlb_kernel_range+0xc0/0xd0
 [<ffffffffa0077950>] ? flush_tlb_all+0x20/0x20
 [<ffffffffa02b547f>] remove_vm_area+0xaf/0x100
 [<ffffffffa02b5506>] __vunmap+0x36/0x180
 [<ffffffffa02b56c5>] vfree+0x35/0xa0
 [<ffffffffa0229867>] __bpf_prog_free+0x27/0x30
 [<ffffffffa00a4da9>] bpf_jit_free+0x69/0x6e
 [<ffffffffa022988f>] bpf_prog_free_deferred+0x1f/0x30
 [<ffffffffa00d6d3a>] process_one_work+0x3fa/0xa10
 [<ffffffffa00d6c74>] ? process_one_work+0x334/0xa10
 [<ffffffffa00d6940>] ? pwq_dec_nr_in_flight+0x110/0x110
 [<ffffffffa00d73d8>] worker_thread+0x88/0x6c0
 [<ffffffffa0caa625>] ? __schedule+0x645/0xf80
 [<ffffffffa00d7350>] ? process_one_work+0xa10/0xa10
 [<ffffffffa00e08b9>] kthread+0x199/0x1c0
 [<ffffffffa00e0720>] ? kthread_create_on_node+0x330/0x330
 [<ffffffffa00eaf31>] ? preempt_count_sub+0xc1/0x120
 [<ffffffffa00e0720>] ? kthread_create_on_node+0x330/0x330
 [<ffffffffa0cb46ef>] ret_from_fork+0x3f/0x70
 [<ffffffffa00e0720>] ? kthread_create_on_node+0x330/0x330
Code: 75 e6 80 38 00 74 e1 48 8b 4d 08 ba 01 00 00 00 be 02 00 00 00 e8 31 16 00 00 5d c3 89 ce 83 e6 07 40 38 f2 7e e1 eb d5 0f 1f 00 <48> b8 ff ff ff ff ff 7f ff ff 55 48 39 c7 48 89 e5 76 3d 48 ba 



* Re: bpf/asan related lockup
  2015-12-04 18:23 bpf/asan related lockup Dave Jones
@ 2015-12-04 19:06 ` Alexei Starovoitov
  2015-12-09 22:24   ` Daniel Borkmann
  0 siblings, 1 reply; 3+ messages in thread
From: Alexei Starovoitov @ 2015-12-04 19:06 UTC (permalink / raw)
  To: Dave Jones, Linux Kernel, Alexei Starovoitov; +Cc: netdev

On Fri, Dec 04, 2015 at 01:23:33PM -0500, Dave Jones wrote:
> Trinity had apparently created a bpf program that upset things greatly.
> I guess I need to find a way to make it record those somewhere for replaying later.
> 
> Alexei, any ideas ?
> 
> 	Dave
> 
> 
> NMI watchdog: BUG: soft lockup - CPU#0 stuck for 23s! [kworker/0:1:991]
> [...]
> Call Trace:
>  [<ffffffffa018f2ce>] ? smp_call_function_many+0x32e/0x410
>  [<ffffffffa0076610>] ? rbt_memtype_copy_nth_element+0xd0/0xd0
>  [<ffffffffa0076610>] ? rbt_memtype_copy_nth_element+0xd0/0xd0
>  [<ffffffffa018f3f7>] smp_call_function+0x47/0x80
>  [<ffffffffa0076610>] ? rbt_memtype_copy_nth_element+0xd0/0xd0
>  [<ffffffffa018f45f>] on_each_cpu+0x2f/0x90
>  [<ffffffffa0077a10>] flush_tlb_kernel_range+0xc0/0xd0
>  [<ffffffffa0077950>] ? flush_tlb_all+0x20/0x20
>  [<ffffffffa02b547f>] remove_vm_area+0xaf/0x100
>  [<ffffffffa02b5506>] __vunmap+0x36/0x180
>  [<ffffffffa02b56c5>] vfree+0x35/0xa0
>  [<ffffffffa0229867>] __bpf_prog_free+0x27/0x30
>  [<ffffffffa00a4da9>] bpf_jit_free+0x69/0x6e
>  [<ffffffffa022988f>] bpf_prog_free_deferred+0x1f/0x30
>  [<ffffffffa00d6d3a>] process_one_work+0x3fa/0xa10
>  [<ffffffffa00d6c74>] ? process_one_work+0x334/0xa10
>  [<ffffffffa00d6940>] ? pwq_dec_nr_in_flight+0x110/0x110
>  [<ffffffffa00d73d8>] worker_thread+0x88/0x6c0

hmm, maybe set_memory_rw(ptr) followed by vfree(ptr) has a race
deep inside the mm logic.
Both of them do flush_tlb_kernel_range()...
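Reconstructed from the call trace above, the suspected sequence would be the following (C-style pseudocode sketch; simplified call flow, not the actual kernel source):

```
/* Deferred free path as seen in the trace; both steps flush the TLB. */
bpf_prog_free_deferred()
  bpf_jit_free(fp)
    set_memory_rw(addr, pages);      /* -> flush_tlb_kernel_range(): IPIs */
    ...
    __bpf_prog_free(fp)
      vfree(hdr)
        remove_vm_area(addr)
          flush_tlb_kernel_range()   /* -> on_each_cpu() ->
                                        smp_call_function() */

/* If the two flushes can overlap across CPUs, one CPU may spin in
 * smp_call_function_many() waiting on another CPU that is itself busy
 * flushing -- which would look exactly like this soft lockup. */
```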



* Re: bpf/asan related lockup
  2015-12-04 19:06 ` Alexei Starovoitov
@ 2015-12-09 22:24   ` Daniel Borkmann
  0 siblings, 0 replies; 3+ messages in thread
From: Daniel Borkmann @ 2015-12-09 22:24 UTC (permalink / raw)
  To: Alexei Starovoitov, Dave Jones, Linux Kernel, Alexei Starovoitov
  Cc: netdev, venkatesh.pallipadi, suresh.b.siddha

On 12/04/2015 08:06 PM, Alexei Starovoitov wrote:
> On Fri, Dec 04, 2015 at 01:23:33PM -0500, Dave Jones wrote:
>> Trinity had apparently created a bpf program that upset things greatly.
>> I guess I need to find a way to make it record those somewhere for replaying later.
>>
>> Alexei, any ideas ?
>>
>> 	Dave
>>
>> NMI watchdog: BUG: soft lockup - CPU#0 stuck for 23s! [kworker/0:1:991]
>> [...]
>> Call Trace:
>>   [<ffffffffa018f2ce>] ? smp_call_function_many+0x32e/0x410
>>   [<ffffffffa0076610>] ? rbt_memtype_copy_nth_element+0xd0/0xd0
>>   [<ffffffffa0076610>] ? rbt_memtype_copy_nth_element+0xd0/0xd0
>>   [<ffffffffa018f3f7>] smp_call_function+0x47/0x80
>>   [<ffffffffa0076610>] ? rbt_memtype_copy_nth_element+0xd0/0xd0
>>   [<ffffffffa018f45f>] on_each_cpu+0x2f/0x90
>>   [<ffffffffa0077a10>] flush_tlb_kernel_range+0xc0/0xd0
>>   [<ffffffffa0077950>] ? flush_tlb_all+0x20/0x20
>>   [<ffffffffa02b547f>] remove_vm_area+0xaf/0x100
>>   [<ffffffffa02b5506>] __vunmap+0x36/0x180
>>   [<ffffffffa02b56c5>] vfree+0x35/0xa0
>>   [<ffffffffa0229867>] __bpf_prog_free+0x27/0x30
>>   [<ffffffffa00a4da9>] bpf_jit_free+0x69/0x6e
>>   [<ffffffffa022988f>] bpf_prog_free_deferred+0x1f/0x30
>>   [<ffffffffa00d6d3a>] process_one_work+0x3fa/0xa10
>>   [<ffffffffa00d6c74>] ? process_one_work+0x334/0xa10
>>   [<ffffffffa00d6940>] ? pwq_dec_nr_in_flight+0x110/0x110
>>   [<ffffffffa00d73d8>] worker_thread+0x88/0x6c0
>
> hmm. may be set_memory_rw(ptr) followed by vfree(ptr) have a race
> deep inside mm logic.
> Both of them do flush_tlb_kernel_range()...

Hmm, was the rbt_memtype_copy_nth_element() hit by chance unrelated to this
lockup, or are we somehow stuck there each time? The only place it can be
invoked from is memtype_get_idx(), which runs when a
cat /sys/kernel/debug/x86/pat_memtype_list is done.

