bpf.vger.kernel.org archive mirror
* [syzbot] WARNING in bpf_test_run
From: syzbot @ 2021-04-01 11:29 UTC
  To: akpm, andrii, ast, bp, bpf, daniel, davem, hawk, hpa, jmattson,
	john.fastabend, joro, kafai, kpsingh, kuba, kvm, linux-kernel,
	mark.rutland, masahiroy, mingo, netdev, pbonzini, peterz,
	rafael.j.wysocki, rostedt, seanjc, songliubraving,
	syzkaller-bugs, tglx, vkuznets, wanpengli, will, x86, yhs

Hello,

syzbot found the following issue on:

HEAD commit:    36e79851 libbpf: Preserve empty DATASEC BTFs during static..
git tree:       bpf-next
console output: https://syzkaller.appspot.com/x/log.txt?x=1569bb06d00000
kernel config:  https://syzkaller.appspot.com/x/.config?x=7eff0f22b8563a5f
dashboard link: https://syzkaller.appspot.com/bug?extid=774c590240616eaa3423
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=17556b7cd00000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=1772be26d00000

The issue was bisected to:

commit 997acaf6b4b59c6a9c259740312a69ea549cc684
Author: Mark Rutland <mark.rutland@arm.com>
Date:   Mon Jan 11 15:37:07 2021 +0000

    lockdep: report broken irq restoration

bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=10197016d00000
final oops:     https://syzkaller.appspot.com/x/report.txt?x=12197016d00000
console output: https://syzkaller.appspot.com/x/log.txt?x=14197016d00000

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+774c590240616eaa3423@syzkaller.appspotmail.com
Fixes: 997acaf6b4b5 ("lockdep: report broken irq restoration")

------------[ cut here ]------------
WARNING: CPU: 0 PID: 8725 at include/linux/bpf-cgroup.h:193 bpf_cgroup_storage_set include/linux/bpf-cgroup.h:193 [inline]
WARNING: CPU: 0 PID: 8725 at include/linux/bpf-cgroup.h:193 bpf_test_run+0x65e/0xaa0 net/bpf/test_run.c:109
Modules linked in:
CPU: 0 PID: 8725 Comm: syz-executor927 Not tainted 5.12.0-rc4-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:bpf_cgroup_storage_set include/linux/bpf-cgroup.h:193 [inline]
RIP: 0010:bpf_test_run+0x65e/0xaa0 net/bpf/test_run.c:109
Code: e9 29 fe ff ff e8 b2 9d 3a fa 41 83 c6 01 bf 08 00 00 00 44 89 f6 e8 51 a5 3a fa 41 83 fe 08 0f 85 74 fc ff ff e8 92 9d 3a fa <0f> 0b bd f0 ff ff ff e9 5c fd ff ff e8 81 9d 3a fa 83 c5 01 bf 08
RSP: 0018:ffffc900017bfaf0 EFLAGS: 00010293
RAX: 0000000000000000 RBX: ffffc90000f29000 RCX: 0000000000000000
RDX: ffff88801bc68000 RSI: ffffffff8739543e RDI: 0000000000000003
RBP: 0000000000000007 R08: 0000000000000008 R09: 0000000000000001
R10: ffffffff8739542f R11: 0000000000000000 R12: dffffc0000000000
R13: ffff888021dd54c0 R14: 0000000000000008 R15: 0000000000000000
FS:  00007f00157d7700(0000) GS:ffff8880b9c00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f0015795718 CR3: 00000000157ae000 CR4: 00000000001506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 bpf_prog_test_run_skb+0xabc/0x1c70 net/bpf/test_run.c:628
 bpf_prog_test_run kernel/bpf/syscall.c:3132 [inline]
 __do_sys_bpf+0x218b/0x4f40 kernel/bpf/syscall.c:4411
 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
 entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x446199
Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 11 15 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f00157d72f8 EFLAGS: 00000246 ORIG_RAX: 0000000000000141
RAX: ffffffffffffffda RBX: 00000000004cb440 RCX: 0000000000446199
RDX: 0000000000000028 RSI: 0000000020000080 RDI: 000000000000000a
RBP: 000000000049b074 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: f9abde7200f522cd
R13: 3952ddf3af240c07 R14: 1631e0d82d3fa99d R15: 00000000004cb448


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
For information about bisection process see: https://goo.gl/tpsmEJ#bisection
syzbot can test patches for this issue, for details see:
https://goo.gl/tpsmEJ#testing-patches


* Re: [syzbot] WARNING in bpf_test_run
From: Yonghong Song @ 2021-04-01 22:05 UTC
  To: syzbot, akpm, andrii, ast, bp, bpf, daniel, davem, hawk, hpa,
	jmattson, john.fastabend, joro, kafai, kpsingh, kuba, kvm,
	linux-kernel, mark.rutland, masahiroy, mingo, netdev, pbonzini,
	peterz, rafael.j.wysocki, rostedt, seanjc, songliubraving,
	syzkaller-bugs, tglx, vkuznets, wanpengli, will, x86



On 4/1/21 4:29 AM, syzbot wrote:
> Hello,
> 
> syzbot found the following issue on:
> 
> HEAD commit:    36e79851 libbpf: Preserve empty DATASEC BTFs during static..
> git tree:       bpf-next
> console output: https://syzkaller.appspot.com/x/log.txt?x=1569bb06d00000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=7eff0f22b8563a5f
> dashboard link: https://syzkaller.appspot.com/bug?extid=774c590240616eaa3423
> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=17556b7cd00000
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=1772be26d00000
> 
> The issue was bisected to:
> 
> commit 997acaf6b4b59c6a9c259740312a69ea549cc684
> Author: Mark Rutland <mark.rutland@arm.com>
> Date:   Mon Jan 11 15:37:07 2021 +0000
> 
>      lockdep: report broken irq restoration
> 
> bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=10197016d00000
> final oops:     https://syzkaller.appspot.com/x/report.txt?x=12197016d00000
> console output: https://syzkaller.appspot.com/x/log.txt?x=14197016d00000
> 
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+774c590240616eaa3423@syzkaller.appspotmail.com
> Fixes: 997acaf6b4b5 ("lockdep: report broken irq restoration")
> 
> ------------[ cut here ]------------
> WARNING: CPU: 0 PID: 8725 at include/linux/bpf-cgroup.h:193 bpf_cgroup_storage_set include/linux/bpf-cgroup.h:193 [inline]
> WARNING: CPU: 0 PID: 8725 at include/linux/bpf-cgroup.h:193 bpf_test_run+0x65e/0xaa0 net/bpf/test_run.c:109

I will look at this issue. Thanks!



* Re: [syzbot] WARNING in bpf_test_run
From: Yonghong Song @ 2021-04-02  0:40 UTC
  To: syzbot, akpm, andrii, ast, bp, bpf, daniel, davem, hawk, hpa,
	jmattson, john.fastabend, joro, kafai, kpsingh, kuba, kvm,
	linux-kernel, mark.rutland, masahiroy, mingo, netdev, pbonzini,
	peterz, rafael.j.wysocki, rostedt, seanjc, songliubraving,
	syzkaller-bugs, tglx, vkuznets, wanpengli, will, x86



On 4/1/21 3:05 PM, Yonghong Song wrote:
> 
> 
> On 4/1/21 4:29 AM, syzbot wrote:
>> Hello,
>>
>> syzbot found the following issue on:
>>
>> HEAD commit:    36e79851 libbpf: Preserve empty DATASEC BTFs during 
>> static..
>> git tree:       bpf-next
>> console output: 
>> https://syzkaller.appspot.com/x/log.txt?x=1569bb06d00000 
>> kernel config:  
>> https://syzkaller.appspot.com/x/.config?x=7eff0f22b8563a5f 
>> dashboard link: 
>> https://syzkaller.appspot.com/bug?extid=774c590240616eaa3423 
>> syz repro:      
>> https://syzkaller.appspot.com/x/repro.syz?x=17556b7cd00000 
>> C reproducer:   
>> https://syzkaller.appspot.com/x/repro.c?x=1772be26d00000 
>>
>> The issue was bisected to:
>>
>> commit 997acaf6b4b59c6a9c259740312a69ea549cc684
>> Author: Mark Rutland <mark.rutland@arm.com>
>> Date:   Mon Jan 11 15:37:07 2021 +0000
>>
>>      lockdep: report broken irq restoration
>>
>> bisection log:  
>> https://syzkaller.appspot.com/x/bisect.txt?x=10197016d00000 
>> final oops:     
>> https://syzkaller.appspot.com/x/report.txt?x=12197016d00000 
>> console output: 
>> https://syzkaller.appspot.com/x/log.txt?x=14197016d00000 
>>
>> IMPORTANT: if you fix the issue, please add the following tag to the 
>> commit:
>> Reported-by: syzbot+774c590240616eaa3423@syzkaller.appspotmail.com
>> Fixes: 997acaf6b4b5 ("lockdep: report broken irq restoration")
>>
>> ------------[ cut here ]------------
>> WARNING: CPU: 0 PID: 8725 at include/linux/bpf-cgroup.h:193 
>> bpf_cgroup_storage_set include/linux/bpf-cgroup.h:193 [inline]
>> WARNING: CPU: 0 PID: 8725 at include/linux/bpf-cgroup.h:193 
>> bpf_test_run+0x65e/0xaa0 net/bpf/test_run.c:109
> 
> I will look at this issue. Thanks!
> 
>> Modules linked in:
>> CPU: 0 PID: 8725 Comm: syz-executor927 Not tainted 
>> 5.12.0-rc4-syzkaller #0
>> Hardware name: Google Google Compute Engine/Google Compute Engine, 
>> BIOS Google 01/01/2011
>> RIP: 0010:bpf_cgroup_storage_set include/linux/bpf-cgroup.h:193 [inline]
>> RIP: 0010:bpf_test_run+0x65e/0xaa0 net/bpf/test_run.c:109
>> Code: e9 29 fe ff ff e8 b2 9d 3a fa 41 83 c6 01 bf 08 00 00 00 44 89 
>> f6 e8 51 a5 3a fa 41 83 fe 08 0f 85 74 fc ff ff e8 92 9d 3a fa <0f> 0b 
>> bd f0 ff ff ff e9 5c fd ff ff e8 81 9d 3a fa 83 c5 01 bf 08
>> RSP: 0018:ffffc900017bfaf0 EFLAGS: 00010293
>> RAX: 0000000000000000 RBX: ffffc90000f29000 RCX: 0000000000000000
>> RDX: ffff88801bc68000 RSI: ffffffff8739543e RDI: 0000000000000003
>> RBP: 0000000000000007 R08: 0000000000000008 R09: 0000000000000001
>> R10: ffffffff8739542f R11: 0000000000000000 R12: dffffc0000000000
>> R13: ffff888021dd54c0 R14: 0000000000000008 R15: 0000000000000000
>> FS:  00007f00157d7700(0000) GS:ffff8880b9c00000(0000) 
>> knlGS:0000000000000000
>> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> CR2: 00007f0015795718 CR3: 00000000157ae000 CR4: 00000000001506f0
>> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>> Call Trace:
>>   bpf_prog_test_run_skb+0xabc/0x1c70 net/bpf/test_run.c:628
>>   bpf_prog_test_run kernel/bpf/syscall.c:3132 [inline]
>>   __do_sys_bpf+0x218b/0x4f40 kernel/bpf/syscall.c:4411
>>   do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46

I ran the C reproducer on my qemu (4 CPUs) and cannot reproduce the
result. It has already run for 30 minutes and is still running. I checked
the code; it is just doing a lot of parallel bpf_prog_test_run's.

The failure is in the below WARN_ON_ONCE code:

175 static inline int bpf_cgroup_storage_set(struct bpf_cgroup_storage
176                                          *storage[MAX_BPF_CGROUP_STORAGE_TYPE])
177 {
178         enum bpf_cgroup_storage_type stype;
179         int i, err = 0;
180
181         preempt_disable();
182         for (i = 0; i < BPF_CGROUP_STORAGE_NEST_MAX; i++) {
183                 if (unlikely(this_cpu_read(bpf_cgroup_storage_info[i].task) != NULL))
184                         continue;
185
186                 this_cpu_write(bpf_cgroup_storage_info[i].task, current);
187                 for_each_cgroup_storage_type(stype)
188                         this_cpu_write(bpf_cgroup_storage_info[i].storage[stype],
189                                        storage[stype]);
190                 goto out;
191         }
192         err = -EBUSY;
193         WARN_ON_ONCE(1);
194
195 out:
196         preempt_enable();
197         return err;
198 }

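Basically, it shows that the stress test triggered the warning because a
limited kernel resource was exhausted: every one of the
BPF_CGROUP_STORAGE_NEST_MAX per-CPU slots was already taken.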
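
To make the failure mode concrete, here is a minimal user-space sketch of
the slot logic above. It is only a model, not the kernel code: slot_set(),
slot_unset(), struct slot and the NEST_MAX value are hypothetical names
chosen for illustration, and there is no per-CPU data or preemption
control. It shows that once every slot has an owner, the next caller gets
-EBUSY until some owner releases its slot.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-CPU table; the value 8 is an assumption. */
#define NEST_MAX 8

struct slot {
	const void *task;    /* owner of the slot, NULL when free */
	const void *storage; /* payload recorded for the owner */
};

static struct slot slots[NEST_MAX];

/* Claim the first free slot; return -EBUSY when every slot is taken. */
static int slot_set(const void *task, const void *storage)
{
	assert(task != NULL); /* a NULL owner would look like a free slot */

	for (size_t i = 0; i < NEST_MAX; i++) {
		if (slots[i].task != NULL)
			continue;
		slots[i].task = task;
		slots[i].storage = storage;
		return 0;
	}
	return -EBUSY;
}

/* Hypothetical counterpart: release the slot owned by task, if any. */
static void slot_unset(const void *task)
{
	for (size_t i = 0; i < NEST_MAX; i++) {
		if (slots[i].task == task) {
			slots[i].task = NULL;
			slots[i].storage = NULL;
			return;
		}
	}
}
```

Calling slot_set() with NEST_MAX distinct owners succeeds each time; one
more call returns -EBUSY, and it succeeds again after any owner calls
slot_unset().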



* Re: [syzbot] WARNING in bpf_test_run
From: Dmitry Vyukov @ 2021-04-13  7:56 UTC
  To: Yonghong Song
  Cc: syzbot, Andrew Morton, andrii, Alexei Starovoitov,
	Borislav Petkov, bpf, Daniel Borkmann, David Miller,
	Jesper Dangaard Brouer, H. Peter Anvin, Jim Mattson,
	John Fastabend, Joerg Roedel, Martin KaFai Lau, kpsingh,
	Jakub Kicinski, KVM list, LKML, Mark Rutland, masahiroy,
	Ingo Molnar, netdev, Paolo Bonzini, Peter Zijlstra,
	rafael.j.wysocki, Steven Rostedt, Sean Christopherson, Song Liu,
	syzkaller-bugs, Thomas Gleixner, vkuznets, wanpengli, will, x86

On Fri, Apr 2, 2021 at 2:41 AM 'Yonghong Song' via syzkaller-bugs
<syzkaller-bugs@googlegroups.com> wrote:
> > On 4/1/21 4:29 AM, syzbot wrote:
> >> Hello,
> >>
> >> syzbot found the following issue on:
> >>
> >> HEAD commit:    36e79851 libbpf: Preserve empty DATASEC BTFs during
> >> static..
> >> git tree:       bpf-next
> >> console output:
> >> https://syzkaller.appspot.com/x/log.txt?x=1569bb06d00000
> >> kernel config:
> >> https://syzkaller.appspot.com/x/.config?x=7eff0f22b8563a5f
> >> dashboard link:
> >> https://syzkaller.appspot.com/bug?extid=774c590240616eaa3423
> >> syz repro:
> >> https://syzkaller.appspot.com/x/repro.syz?x=17556b7cd00000
> >> C reproducer:
> >> https://syzkaller.appspot.com/x/repro.c?x=1772be26d00000
> >>
> >> The issue was bisected to:
> >>
> >> commit 997acaf6b4b59c6a9c259740312a69ea549cc684
> >> Author: Mark Rutland <mark.rutland@arm.com>
> >> Date:   Mon Jan 11 15:37:07 2021 +0000
> >>
> >>      lockdep: report broken irq restoration
> >>
> >> bisection log:
> >> https://syzkaller.appspot.com/x/bisect.txt?x=10197016d00000
> >> final oops:
> >> https://syzkaller.appspot.com/x/report.txt?x=12197016d00000
> >> console output:
> >> https://syzkaller.appspot.com/x/log.txt?x=14197016d00000
> >>
> >> IMPORTANT: if you fix the issue, please add the following tag to the
> >> commit:
> >> Reported-by: syzbot+774c590240616eaa3423@syzkaller.appspotmail.com
> >> Fixes: 997acaf6b4b5 ("lockdep: report broken irq restoration")
> >>
> >> ------------[ cut here ]------------
> >> WARNING: CPU: 0 PID: 8725 at include/linux/bpf-cgroup.h:193
> >> bpf_cgroup_storage_set include/linux/bpf-cgroup.h:193 [inline]
> >> WARNING: CPU: 0 PID: 8725 at include/linux/bpf-cgroup.h:193
> >> bpf_test_run+0x65e/0xaa0 net/bpf/test_run.c:109
> >
> > I will look at this issue. Thanks!
> >
> >> Modules linked in:
> >> CPU: 0 PID: 8725 Comm: syz-executor927 Not tainted
> >> 5.12.0-rc4-syzkaller #0
> >> Hardware name: Google Google Compute Engine/Google Compute Engine,
> >> BIOS Google 01/01/2011
> >> RIP: 0010:bpf_cgroup_storage_set include/linux/bpf-cgroup.h:193 [inline]
> >> RIP: 0010:bpf_test_run+0x65e/0xaa0 net/bpf/test_run.c:109
> >> Code: e9 29 fe ff ff e8 b2 9d 3a fa 41 83 c6 01 bf 08 00 00 00 44 89
> >> f6 e8 51 a5 3a fa 41 83 fe 08 0f 85 74 fc ff ff e8 92 9d 3a fa <0f> 0b
> >> bd f0 ff ff ff e9 5c fd ff ff e8 81 9d 3a fa 83 c5 01 bf 08
> >> RSP: 0018:ffffc900017bfaf0 EFLAGS: 00010293
> >> RAX: 0000000000000000 RBX: ffffc90000f29000 RCX: 0000000000000000
> >> RDX: ffff88801bc68000 RSI: ffffffff8739543e RDI: 0000000000000003
> >> RBP: 0000000000000007 R08: 0000000000000008 R09: 0000000000000001
> >> R10: ffffffff8739542f R11: 0000000000000000 R12: dffffc0000000000
> >> R13: ffff888021dd54c0 R14: 0000000000000008 R15: 0000000000000000
> >> FS:  00007f00157d7700(0000) GS:ffff8880b9c00000(0000)
> >> knlGS:0000000000000000
> >> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >> CR2: 00007f0015795718 CR3: 00000000157ae000 CR4: 00000000001506f0
> >> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> >> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> >> Call Trace:
> >>   bpf_prog_test_run_skb+0xabc/0x1c70 net/bpf/test_run.c:628
> >>   bpf_prog_test_run kernel/bpf/syscall.c:3132 [inline]
> >>   __do_sys_bpf+0x218b/0x4f40 kernel/bpf/syscall.c:4411
> >>   do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
>
> I ran the C reproducer on my qemu (4 CPUs) and cannot reproduce the
> result. It has already run for 30 minutes and is still running. I checked
> the code; it is just doing a lot of parallel bpf_prog_test_run's.
>
> The failure is in the below WARN_ON_ONCE code:
>
> 175 static inline int bpf_cgroup_storage_set(struct bpf_cgroup_storage
> 176                                          *storage[MAX_BPF_CGROUP_STORAGE_TYPE])
> 177 {
> 178         enum bpf_cgroup_storage_type stype;
> 179         int i, err = 0;
> 180
> 181         preempt_disable();
> 182         for (i = 0; i < BPF_CGROUP_STORAGE_NEST_MAX; i++) {
> 183                 if (unlikely(this_cpu_read(bpf_cgroup_storage_info[i].task) != NULL))
> 184                         continue;
> 185
> 186                 this_cpu_write(bpf_cgroup_storage_info[i].task, current);
> 187                 for_each_cgroup_storage_type(stype)
> 188                         this_cpu_write(bpf_cgroup_storage_info[i].storage[stype],
> 189                                        storage[stype]);
> 190                 goto out;
> 191         }
> 192         err = -EBUSY;
> 193         WARN_ON_ONCE(1);
> 194
> 195 out:
> 196         preempt_enable();
> 197         return err;
> 198 }
>
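> Basically, it shows that the stress test triggered the warning because a
> limited kernel resource was exhausted: every one of the
> BPF_CGROUP_STORAGE_NEST_MAX per-CPU slots was already taken.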

Hi Yonghong,

Thanks for looking into this.
If this is not a kernel bug, then it must not use WARN_ON[_ONCE]. It
makes the kernel untestable for both automated systems and humans:

https://lwn.net/Articles/769365/

<quote>
Greg Kroah-Hartman raised the problem of core kernel API code that
will use WARN_ON_ONCE() to complain about bad usage; that will not
generate the desired result if WARN_ON_ONCE() is configured to crash
the machine. He was told that the code should just call pr_warn()
instead, and that the called function should return an error in such
situations. It was generally agreed that any WARN_ON() or
WARN_ON_ONCE() calls that can be triggered from user space need to be
fixed.
</quote>





* Re: [syzbot] WARNING in bpf_test_run
From: Steven Rostedt @ 2021-04-13 12:55 UTC
  To: Dmitry Vyukov
  Cc: Yonghong Song, syzbot, Andrew Morton, andrii, Alexei Starovoitov,
	Borislav Petkov, bpf, Daniel Borkmann, David Miller,
	Jesper Dangaard Brouer, H. Peter Anvin, Jim Mattson,
	John Fastabend, Joerg Roedel, Martin KaFai Lau, kpsingh,
	Jakub Kicinski, KVM list, LKML, Mark Rutland, masahiroy,
	Ingo Molnar, netdev, Paolo Bonzini, Peter Zijlstra,
	rafael.j.wysocki, Sean Christopherson, Song Liu, syzkaller-bugs,
	Thomas Gleixner, vkuznets, wanpengli, will, x86

On Tue, 13 Apr 2021 09:56:40 +0200
Dmitry Vyukov <dvyukov@google.com> wrote:

> Thanks for looking into this.
> If this is not a kernel bug, then it must not use WARN_ON[_ONCE]. It
> makes the kernel untestable for both automated systems and humans:
> 
> https://lwn.net/Articles/769365/
> 
> <quote>
> Greg Kroah-Hartman raised the problem of core kernel API code that
> will use WARN_ON_ONCE() to complain about bad usage; that will not
> generate the desired result if WARN_ON_ONCE() is configured to crash
> the machine. He was told that the code should just call pr_warn()
> instead, and that the called function should return an error in such
> situations. It was generally agreed that any WARN_ON() or
> WARN_ON_ONCE() calls that can be triggered from user space need to be
> fixed.
> </quote>

I agree. WARN_ON(_ONCE) should be reserved for anomalies that should never
happen. Anything that the user could trigger should not trigger a
WARN_ON.

A WARN_ON is perfectly fine for detecting an accounting error inside the
kernel. I have them scattered all over my code, but they should never be
hit, even if something in user space tries to hit them. (The one exception
is an interface I want to deprecate, where I want to know if it's still
being used ;-) Of course, that wouldn't help bots testing the code, and I
haven't done that in years.)

Any anomaly that can be triggered by user space doing something it should
not be doing really needs a pr_warn().

Thanks,

-- Steve
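
The distinction drawn in this thread can be sketched in user-space C. This
is a hedged illustration only, not a proposed patch: user_pr_warn() is a
hypothetical stand-in for the kernel's pr_warn(), assert() stands in for a
WARN_ON on an internal accounting error, and request_begin(),
request_end() and MAX_IN_FLIGHT are made-up names. A limit that a caller
can reach is logged and reported as an error; only a broken internal
invariant hits the assertion.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Hypothetical user-space stand-in for the kernel's pr_warn(). */
#define user_pr_warn(...) fprintf(stderr, __VA_ARGS__)

#define MAX_IN_FLIGHT 4 /* assumed limit, for illustration only */

static int in_flight;

/*
 * Caller-reachable condition: running out of slots is expected under
 * load, so log it and return an error instead of asserting.
 */
static int request_begin(void)
{
	if (in_flight >= MAX_IN_FLIGHT) {
		user_pr_warn("request_begin: all %d slots busy\n", MAX_IN_FLIGHT);
		return -EBUSY;
	}
	in_flight++;
	return 0;
}

/*
 * Internal invariant: ending a request that never began is an accounting
 * bug in this code itself, which is the case an assertion is meant for.
 */
static void request_end(void)
{
	assert(in_flight > 0);
	in_flight--;
}
```

With this shape, a stress test that exhausts the limit only sees -EBUSY
and a log line, while a mismatched request_end() still fails loudly.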

