* [syzbot] [bpf?] general protection fault in bpf_prog_offload_verifier_prep @ 2023-09-03 19:55 syzbot 2023-09-06 12:40 ` Eduard Zingerman 2023-09-09 16:12 ` syzbot 0 siblings, 2 replies; 8+ messages in thread From: syzbot @ 2023-09-03 19:55 UTC (permalink / raw) To: andrii, ast, bpf, daniel, davem, haoluo, hawk, john.fastabend, jolsa, kpsingh, kuba, linux-kernel, martin.lau, netdev, sdf, song, syzkaller-bugs, yonghong.song Hello, syzbot found the following issue on: HEAD commit: fa09bc40b21a igb: disable virtualization features on 82580 git tree: net console+strace: https://syzkaller.appspot.com/x/log.txt?x=13382fa8680000 kernel config: https://syzkaller.appspot.com/x/.config?x=634e05b4025da9da dashboard link: https://syzkaller.appspot.com/bug?extid=291100dcb32190ec02a8 compiler: gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40 syz repro: https://syzkaller.appspot.com/x/repro.syz?x=1529c448680000 C reproducer: https://syzkaller.appspot.com/x/repro.c?x=15db0248680000 Downloadable assets: disk image: https://storage.googleapis.com/syzbot-assets/7ab461d84992/disk-fa09bc40.raw.xz vmlinux: https://storage.googleapis.com/syzbot-assets/3ac6d43ab2db/vmlinux-fa09bc40.xz kernel image: https://storage.googleapis.com/syzbot-assets/778d096a134e/bzImage-fa09bc40.xz IMPORTANT: if you fix the issue, please add the following tag to the commit: Reported-by: syzbot+291100dcb32190ec02a8@syzkaller.appspotmail.com general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] PREEMPT SMP KASAN KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007] CPU: 1 PID: 5055 Comm: syz-executor625 Not tainted 6.5.0-syzkaller-04012-gfa09bc40b21a #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 07/26/2023 RIP: 0010:bpf_prog_offload_verifier_prep+0xaa/0x170 kernel/bpf/offload.c:295 Code: 00 fc ff df 48 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 a1 00 00 00 48 b8 00 00 00 00 00 fc ff df 4c 8b 65 10 4c 89 e2 48 c1 ea 03 <80> 3c 02 00 0f 85 93 00 00 00 48 b8 00 00 00 00 00 fc ff df 4d 8b RSP: 0018:ffffc900039ff7f8 EFLAGS: 00010246 RAX: dffffc0000000000 RBX: ffffc9000156e000 RCX: 0000000000000000 RDX: 0000000000000000 RSI: ffffffff81a8cf76 RDI: ffff888021b25f10 RBP: ffff888021b25f00 R08: 0000000000000001 R09: fffffbfff195203d R10: ffffffff8ca901ef R11: 0000000000000000 R12: 0000000000000000 R13: 0000000000000005 R14: 0000000000000003 R15: ffffc9000156e060 FS: 0000555556071380(0000) GS:ffff8880b9900000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000020000100 CR3: 0000000022f6b000 CR4: 00000000003506e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> bpf_check+0x52f3/0xabd0 kernel/bpf/verifier.c:19762 bpf_prog_load+0x153a/0x2270 kernel/bpf/syscall.c:2708 __sys_bpf+0xbb6/0x4e90 kernel/bpf/syscall.c:5335 __do_sys_bpf kernel/bpf/syscall.c:5439 [inline] __se_sys_bpf kernel/bpf/syscall.c:5437 [inline] __x64_sys_bpf+0x78/0xc0 kernel/bpf/syscall.c:5437 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x38/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd RIP: 0033:0x7f7c0df78ea9 Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 d1 19 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007ffde3592128 EFLAGS: 00000246 ORIG_RAX: 0000000000000141 RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007f7c0df78ea9 RDX: 0000000000000090 RSI: 0000000020000940 RDI: 0000000000000005 RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000100000000 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 </TASK> Modules linked in: ---[ end trace 0000000000000000 ]--- RIP: 0010:bpf_prog_offload_verifier_prep+0xaa/0x170 kernel/bpf/offload.c:295 Code: 00 fc ff df 48 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 a1 00 00 00 48 b8 00 00 00 00 00 fc ff df 4c 8b 65 10 4c 89 e2 48 c1 ea 03 <80> 3c 02 00 0f 85 93 00 00 00 48 b8 00 00 00 00 00 fc ff df 4d 8b RSP: 0018:ffffc900039ff7f8 EFLAGS: 00010246 RAX: dffffc0000000000 RBX: ffffc9000156e000 RCX: 0000000000000000 RDX: 0000000000000000 RSI: ffffffff81a8cf76 RDI: ffff888021b25f10 RBP: ffff888021b25f00 R08: 0000000000000001 R09: fffffbfff195203d R10: ffffffff8ca901ef R11: 0000000000000000 R12: 0000000000000000 R13: 0000000000000005 R14: 0000000000000003 R15: ffffc9000156e060 FS: 0000555556071380(0000) GS:ffff8880b9900000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000020000100 CR3: 0000000022f6b000 CR4: 00000000003506e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 ---------------- Code disassembly (best guess), 3 bytes skipped: 0: df 48 89 fisttps -0x77(%rax) 3: fa cli 4: 48 c1 ea 03 shr $0x3,%rdx 8: 80 3c 02 00 cmpb $0x0,(%rdx,%rax,1) c: 0f 85 a1 00 00 00 jne 0xb3 12: 48 b8 00 00 00 00 00 movabs $0xdffffc0000000000,%rax 19: fc ff df 1c: 4c 8b 65 10 mov 0x10(%rbp),%r12 20: 4c 89 e2 mov %r12,%rdx 23: 48 c1 ea 03 shr $0x3,%rdx * 27: 80 3c 02 00 cmpb $0x0,(%rdx,%rax,1) <-- trapping instruction 2b: 0f 85 93 00 00 00 jne 0xc4 31: 48 b8 00 00 00 00 00 movabs $0xdffffc0000000000,%rax 38: fc ff df 3b: 4d rex.WRB 3c: 8b .byte 0x8b --- This report is generated by a bot. It may contain errors. See https://goo.gl/tpsmEJ for more information about syzbot. syzbot engineers can be reached at syzkaller@googlegroups.com. syzbot will keep track of this issue. See: https://goo.gl/tpsmEJ#status for how to communicate with syzbot. If the bug is already fixed, let syzbot know by replying with: #syz fix: exact-commit-title If you want syzbot to run the reproducer, reply with: #syz test: git://repo/address.git branch-or-commit-hash If you attach or paste a git patch, syzbot will apply it before testing. If you want to overwrite bug's subsystems, reply with: #syz set subsystems: new-subsystem (See the list of subsystem names on the web dashboard) If the bug is a duplicate of another bug, reply with: #syz dup: exact-subject-of-another-report If you want to undo deduplication, reply with: #syz undup ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [syzbot] [bpf?] general protection fault in bpf_prog_offload_verifier_prep 2023-09-03 19:55 [syzbot] [bpf?] general protection fault in bpf_prog_offload_verifier_prep syzbot @ 2023-09-06 12:40 ` Eduard Zingerman 2023-09-06 13:50 ` Eduard Zingerman 2023-09-09 16:12 ` syzbot 1 sibling, 1 reply; 8+ messages in thread From: Eduard Zingerman @ 2023-09-06 12:40 UTC (permalink / raw) To: syzbot, andrii, ast, bpf, daniel, davem, haoluo, hawk, john.fastabend, jolsa, kpsingh, kuba, linux-kernel, martin.lau, netdev, sdf, song, syzkaller-bugs, yonghong.song On Sun, 2023-09-03 at 12:55 -0700, syzbot wrote: > Hello, > > syzbot found the following issue on: > > HEAD commit: fa09bc40b21a igb: disable virtualization features on 82580 > git tree: net > console+strace: https://syzkaller.appspot.com/x/log.txt?x=13382fa8680000 > kernel config: https://syzkaller.appspot.com/x/.config?x=634e05b4025da9da > dashboard link: https://syzkaller.appspot.com/bug?extid=291100dcb32190ec02a8 > compiler: gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40 > syz repro: https://syzkaller.appspot.com/x/repro.syz?x=1529c448680000 > C reproducer: https://syzkaller.appspot.com/x/repro.c?x=15db0248680000 > > Downloadable assets: > disk image: https://storage.googleapis.com/syzbot-assets/7ab461d84992/disk-fa09bc40.raw.xz > vmlinux: https://storage.googleapis.com/syzbot-assets/3ac6d43ab2db/vmlinux-fa09bc40.xz > kernel image: https://storage.googleapis.com/syzbot-assets/778d096a134e/bzImage-fa09bc40.xz > > IMPORTANT: if you fix the issue, please add the following tag to the commit: > Reported-by: syzbot+291100dcb32190ec02a8@syzkaller.appspotmail.com > > general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] PREEMPT SMP KASAN > KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007] > CPU: 1 PID: 5055 Comm: syz-executor625 Not tainted 6.5.0-syzkaller-04012-gfa09bc40b21a #0 > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 07/26/2023 > RIP: 0010:bpf_prog_offload_verifier_prep+0xaa/0x170 kernel/bpf/offload.c:295 > Code: 00 fc ff df 48 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 a1 00 00 00 48 b8 00 00 00 00 00 fc ff df 4c 8b 65 10 4c 89 e2 48 c1 ea 03 <80> 3c 02 00 0f 85 93 00 00 00 48 b8 00 00 00 00 00 fc ff df 4d 8b > RSP: 0018:ffffc900039ff7f8 EFLAGS: 00010246 > RAX: dffffc0000000000 RBX: ffffc9000156e000 RCX: 0000000000000000 > RDX: 0000000000000000 RSI: ffffffff81a8cf76 RDI: ffff888021b25f10 > RBP: ffff888021b25f00 R08: 0000000000000001 R09: fffffbfff195203d > R10: ffffffff8ca901ef R11: 0000000000000000 R12: 0000000000000000 > R13: 0000000000000005 R14: 0000000000000003 R15: ffffc9000156e060 > FS: 0000555556071380(0000) GS:ffff8880b9900000(0000) knlGS:0000000000000000 > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > CR2: 0000000020000100 CR3: 0000000022f6b000 CR4: 00000000003506e0 > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > Call Trace: > <TASK> > bpf_check+0x52f3/0xabd0 kernel/bpf/verifier.c:19762 > bpf_prog_load+0x153a/0x2270 kernel/bpf/syscall.c:2708 > __sys_bpf+0xbb6/0x4e90 kernel/bpf/syscall.c:5335 > __do_sys_bpf kernel/bpf/syscall.c:5439 [inline] > __se_sys_bpf kernel/bpf/syscall.c:5437 [inline] > __x64_sys_bpf+0x78/0xc0 kernel/bpf/syscall.c:5437 > do_syscall_x64 arch/x86/entry/common.c:50 [inline] > do_syscall_64+0x38/0xb0 arch/x86/entry/common.c:80 > entry_SYSCALL_64_after_hwframe+0x63/0xcd > RIP: 0033:0x7f7c0df78ea9 > Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 d1 19 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48 > RSP: 002b:00007ffde3592128 EFLAGS: 00000246 ORIG_RAX: 0000000000000141 > RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007f7c0df78ea9 > RDX: 0000000000000090 RSI: 0000000020000940 RDI: 0000000000000005 > RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000100000000 > R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 > R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 > </TASK> > Modules linked in: > ---[ end trace 0000000000000000 ]--- > RIP: 0010:bpf_prog_offload_verifier_prep+0xaa/0x170 kernel/bpf/offload.c:295 > Code: 00 fc ff df 48 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 a1 00 00 00 48 b8 00 00 00 00 00 fc ff df 4c 8b 65 10 4c 89 e2 48 c1 ea 03 <80> 3c 02 00 0f 85 93 00 00 00 48 b8 00 00 00 00 00 fc ff df 4d 8b > RSP: 0018:ffffc900039ff7f8 EFLAGS: 00010246 > RAX: dffffc0000000000 RBX: ffffc9000156e000 RCX: 0000000000000000 > RDX: 0000000000000000 RSI: ffffffff81a8cf76 RDI: ffff888021b25f10 > RBP: ffff888021b25f00 R08: 0000000000000001 R09: fffffbfff195203d > R10: ffffffff8ca901ef R11: 0000000000000000 R12: 0000000000000000 > R13: 0000000000000005 R14: 0000000000000003 R15: ffffc9000156e060 > FS: 0000555556071380(0000) GS:ffff8880b9900000(0000) knlGS:0000000000000000 > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > CR2: 0000000020000100 CR3: 0000000022f6b000 CR4: 00000000003506e0 > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > ---------------- > Code disassembly (best guess), 3 bytes skipped: > 0: df 48 89 fisttps -0x77(%rax) > 3: fa cli > 4: 48 c1 ea 03 shr $0x3,%rdx > 8: 80 3c 02 00 cmpb $0x0,(%rdx,%rax,1) > c: 0f 85 a1 00 00 00 jne 0xb3 > 12: 48 b8 00 00 00 00 00 movabs $0xdffffc0000000000,%rax > 19: fc ff df > 1c: 4c 8b 65 10 mov 0x10(%rbp),%r12 > 20: 4c 89 e2 mov %r12,%rdx > 23: 48 c1 ea 03 shr $0x3,%rdx > * 27: 80 3c 02 00 cmpb $0x0,(%rdx,%rax,1) <-- trapping instruction > 2b: 0f 85 93 00 00 00 jne 0xc4 > 31: 48 b8 00 00 00 00 00 movabs $0xdffffc0000000000,%rax > 38: fc ff df > 3b: 4d rex.WRB > 3c: 8b .byte 0x8b > > > --- > This report is generated by a bot. It may contain errors. > See https://goo.gl/tpsmEJ for more information about syzbot. > syzbot engineers can be reached at syzkaller@googlegroups.com. > > syzbot will keep track of this issue. See: > https://goo.gl/tpsmEJ#status for how to communicate with syzbot. > > If the bug is already fixed, let syzbot know by replying with: > #syz fix: exact-commit-title > > If you want syzbot to run the reproducer, reply with: > #syz test: git://repo/address.git branch-or-commit-hash > If you attach or paste a git patch, syzbot will apply it before testing. > > If you want to overwrite bug's subsystems, reply with: > #syz set subsystems: new-subsystem > (See the list of subsystem names on the web dashboard) > > If the bug is a duplicate of another bug, reply with: > #syz dup: exact-subject-of-another-report > > If you want to undo deduplication, reply with: > #syz undup > I have an explanation of why this error occurs, but I need an advice on how to fix it. Then NULL pointer deference occurs in the following function from offload.c: int bpf_prog_offload_verifier_prep(struct bpf_prog *prog) { struct bpf_prog_offload *offload; int ret = -ENODEV; down_read(&bpf_devs_lock); offload = prog->aux->offload; if (offload) { ret = offload->offdev->ops->prepare(prog); ^^^^^^ this pointer is NULL offload->dev_state = !ret; } up_read(&bpf_devs_lock); return ret; } # Short explanation (a) call chain bpf_prog_load -> bpf_prog_dev_bound_init -> __bpf_prog_dev_bound_init -> __bpf_offload_dev_netdev_register might insert an instance of struct bpf_offload_netdev with {.offdev == NULL} into hash table offload.c:offdevs; (b) call chain bpf_prog_load -> bpf_check -> bpf_prog_offload_verifier_prep assumes that from (prog->aux->offload != NULL) follows (prog->aux->offload->offdev != NULL) which is not the case because of (a). # Long explanation The reproducer generated by testbot has the following structure: - in a loop call function execute_one(), which does the following system calls in sequence: - socket(AF_INET6, SOCK_RAW, IPPROTO_IGMP) = <some fd> - ioctl(3, SIOCGIFINDEX, {ifr_name="batadv_slave_1"}) = 0 - bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_XDP, ... prog_flags=0x40, prog_ifindex=29, ...}) = -1 EINVAL (referred to as program #1 below) - socket(AF_INET6, SOCK_RAW, IPPROTO_IGMP) = <some fd> - ioctl(4, SIOCGIFINDEX, {ifr_name="batadv_slave_1"}) = 0 - bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_XDP, ... prog_flags=0, ... prog_ifindex=29}) = -1 EINVAL (referred to as program #2 below) The error occurs when second bpf call is processed. Interestingly, if sleep(1) is inserted somewhere between first and second bpf calls error does not occur: @@ -1246,6 +1246,7 @@ void execute_one(void) *(uint32_t*)0x200009cc = 4; syscall(__NR_bpf, /*cmd=*/5ul, /*arg=*/0x20000940ul, /*size=*/0x90ul); res = syscall(__NR_socket, /*domain=*/0xaul, /*type=*/3ul, /*proto=*/2); + // sleep(1); /* uncomment to hide the error */ if (res != -1) r[2] = res; memcpy((void*)0x20000100, "batadv_slave_1\000\000", 16); ## Control flow when error occurs For program #1: - bpf_prog_load(): - bpf_prog_is_dev_bound(prog->aux) is true - bpf_prog_dev_bound_init - prog->aux->offload_requested is 0 (because of 0x40 prog_flags) - __bpf_prog_dev_bound_init - netdev is "batadv_slave_1" - bpf_offload_find_netdev(offload->netdev) == NULL, (this is a lookup in hash table offload.c:offdevs) which triggers a call to __bpf_offload_dev_netdev_register - __bpf_offload_dev_netdev_register(NULL, offload->netdev) registers struct bpf_offload_netdev with {.offdev = NULL} for netdev "batadv_slave_1" in offload.c:offdevs hash table. For program #2: - bpf_prog_load(): - bpf_prog_is_dev_bound(prog->aux) is true - bpf_prog_dev_bound_init - prog->aux->offload_requested is 1 (because of 0x0 prog_flags) - __bpf_prog_dev_bound_init - netdev is "batadv_slave_1" - bpf_offload_find_netdev(offload->netdev) != NULL, this is struct bpf_offload_netdev with {.offdev = NULL} created for program #1 - prog->aux->offload = struct bpf_prog_offload {.offload -> {.offdev = NULL}}, The bpf_prog_offload remembered for prog points to bpf_offload_netdev with .offdev == NULL. - ... - bpf_check - bpf_prog_offload_verifier_prep - prog->aux->offload != NULL, but prog->aux->offload->offdev == NULL => null pointer deference. ## Control flow when error does not occur For program #1: - ... all as in the previous case ... Some worker thread: - kernel/bpf/core.c:bpf_prog_free_deferred, registered for program #1: - bpf_prog_is_dev_bound(aux) is true - bpf_prog_dev_bound_destroy - netdev is "batadv_slave_1" - (!ondev->offdev && list_empty(&ondev->progs)) is true - __bpf_offload_dev_netdev_unregister this removes struct bpf_offload_netdev with {.offdev = NULL} from offload.c:offdevs hash table. For program #2: - bpf_prog_load(): - bpf_prog_is_dev_bound(prog->aux) is true - bpf_prog_dev_bound_init - prog->aux->offload_requested is 1 (because of 0x0 prog_flags) - __bpf_prog_dev_bound_init - netdev is "batadv_slave_1" - bpf_offload_find_netdev(offload->netdev) == NULL - bpf_prog_is_offloaded(prog->aux) is true - -EINVAL is returned. ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [syzbot] [bpf?] general protection fault in bpf_prog_offload_verifier_prep 2023-09-06 12:40 ` Eduard Zingerman @ 2023-09-06 13:50 ` Eduard Zingerman 2023-09-06 13:54 ` Daniel Borkmann 2023-09-06 14:57 ` Jakub Kicinski 0 siblings, 2 replies; 8+ messages in thread From: Eduard Zingerman @ 2023-09-06 13:50 UTC (permalink / raw) To: syzbot, andrii, ast, bpf, daniel, davem, haoluo, hawk, john.fastabend, jolsa, kpsingh, kuba, linux-kernel, martin.lau, netdev, sdf, song, syzkaller-bugs, yonghong.song Cc: sdf On Wed, 2023-09-06 at 15:40 +0300, Eduard Zingerman wrote: > On Sun, 2023-09-03 at 12:55 -0700, syzbot wrote: > > Hello, > > > > syzbot found the following issue on: > > > > HEAD commit: fa09bc40b21a igb: disable virtualization features on 82580 > > git tree: net > > console+strace: https://syzkaller.appspot.com/x/log.txt?x=13382fa8680000 > > kernel config: https://syzkaller.appspot.com/x/.config?x=634e05b4025da9da > > dashboard link: https://syzkaller.appspot.com/bug?extid=291100dcb32190ec02a8 > > compiler: gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40 > > syz repro: https://syzkaller.appspot.com/x/repro.syz?x=1529c448680000 > > C reproducer: https://syzkaller.appspot.com/x/repro.c?x=15db0248680000 > > > > Downloadable assets: > > disk image: https://storage.googleapis.com/syzbot-assets/7ab461d84992/disk-fa09bc40.raw.xz > > vmlinux: https://storage.googleapis.com/syzbot-assets/3ac6d43ab2db/vmlinux-fa09bc40.xz > > kernel image: https://storage.googleapis.com/syzbot-assets/778d096a134e/bzImage-fa09bc40.xz > > > > IMPORTANT: if you fix the issue, please add the following tag to the commit: > > Reported-by: syzbot+291100dcb32190ec02a8@syzkaller.appspotmail.com > > > > general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] PREEMPT SMP KASAN > > KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007] > > CPU: 1 PID: 5055 Comm: syz-executor625 Not tainted 6.5.0-syzkaller-04012-gfa09bc40b21a #0 > > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 07/26/2023 > > RIP: 0010:bpf_prog_offload_verifier_prep+0xaa/0x170 kernel/bpf/offload.c:295 > > Code: 00 fc ff df 48 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 a1 00 00 00 48 b8 00 00 00 00 00 fc ff df 4c 8b 65 10 4c 89 e2 48 c1 ea 03 <80> 3c 02 00 0f 85 93 00 00 00 48 b8 00 00 00 00 00 fc ff df 4d 8b > > RSP: 0018:ffffc900039ff7f8 EFLAGS: 00010246 > > RAX: dffffc0000000000 RBX: ffffc9000156e000 RCX: 0000000000000000 > > RDX: 0000000000000000 RSI: ffffffff81a8cf76 RDI: ffff888021b25f10 > > RBP: ffff888021b25f00 R08: 0000000000000001 R09: fffffbfff195203d > > R10: ffffffff8ca901ef R11: 0000000000000000 R12: 0000000000000000 > > R13: 0000000000000005 R14: 0000000000000003 R15: ffffc9000156e060 > > FS: 0000555556071380(0000) GS:ffff8880b9900000(0000) knlGS:0000000000000000 > > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > > CR2: 0000000020000100 CR3: 0000000022f6b000 CR4: 00000000003506e0 > > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > > Call Trace: > > <TASK> > > bpf_check+0x52f3/0xabd0 kernel/bpf/verifier.c:19762 > > bpf_prog_load+0x153a/0x2270 kernel/bpf/syscall.c:2708 > > __sys_bpf+0xbb6/0x4e90 kernel/bpf/syscall.c:5335 > > __do_sys_bpf kernel/bpf/syscall.c:5439 [inline] > > __se_sys_bpf kernel/bpf/syscall.c:5437 [inline] > > __x64_sys_bpf+0x78/0xc0 kernel/bpf/syscall.c:5437 > > do_syscall_x64 arch/x86/entry/common.c:50 [inline] > > do_syscall_64+0x38/0xb0 arch/x86/entry/common.c:80 > > entry_SYSCALL_64_after_hwframe+0x63/0xcd > > RIP: 0033:0x7f7c0df78ea9 > > Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 d1 19 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48 > > RSP: 002b:00007ffde3592128 EFLAGS: 00000246 ORIG_RAX: 0000000000000141 > > RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007f7c0df78ea9 > > RDX: 0000000000000090 RSI: 0000000020000940 RDI: 0000000000000005 > > RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000100000000 > > R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 > > R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 > > </TASK> > > Modules linked in: > > ---[ end trace 0000000000000000 ]--- > > RIP: 0010:bpf_prog_offload_verifier_prep+0xaa/0x170 kernel/bpf/offload.c:295 > > Code: 00 fc ff df 48 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 a1 00 00 00 48 b8 00 00 00 00 00 fc ff df 4c 8b 65 10 4c 89 e2 48 c1 ea 03 <80> 3c 02 00 0f 85 93 00 00 00 48 b8 00 00 00 00 00 fc ff df 4d 8b > > RSP: 0018:ffffc900039ff7f8 EFLAGS: 00010246 > > RAX: dffffc0000000000 RBX: ffffc9000156e000 RCX: 0000000000000000 > > RDX: 0000000000000000 RSI: ffffffff81a8cf76 RDI: ffff888021b25f10 > > RBP: ffff888021b25f00 R08: 0000000000000001 R09: fffffbfff195203d > > R10: ffffffff8ca901ef R11: 0000000000000000 R12: 0000000000000000 > > R13: 0000000000000005 R14: 0000000000000003 R15: ffffc9000156e060 > > FS: 0000555556071380(0000) GS:ffff8880b9900000(0000) knlGS:0000000000000000 > > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > > CR2: 0000000020000100 CR3: 0000000022f6b000 CR4: 00000000003506e0 > > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > > ---------------- > > Code disassembly (best guess), 3 bytes skipped: > > 0: df 48 89 fisttps -0x77(%rax) > > 3: fa cli > > 4: 48 c1 ea 03 shr $0x3,%rdx > > 8: 80 3c 02 00 cmpb $0x0,(%rdx,%rax,1) > > c: 0f 85 a1 00 00 00 jne 0xb3 > > 12: 48 b8 00 00 00 00 00 movabs $0xdffffc0000000000,%rax > > 19: fc ff df > > 1c: 4c 8b 65 10 mov 0x10(%rbp),%r12 > > 20: 4c 89 e2 mov %r12,%rdx > > 23: 48 c1 ea 03 shr $0x3,%rdx > > * 27: 80 3c 02 00 cmpb $0x0,(%rdx,%rax,1) <-- trapping instruction > > 2b: 0f 85 93 00 00 00 jne 0xc4 > > 31: 48 b8 00 00 00 00 00 movabs $0xdffffc0000000000,%rax > > 38: fc ff df > > 3b: 4d rex.WRB > > 3c: 8b .byte 0x8b > > > > > > --- > > This report is generated by a bot. It may contain errors. > > See https://goo.gl/tpsmEJ for more information about syzbot. > > syzbot engineers can be reached at syzkaller@googlegroups.com. > > > > syzbot will keep track of this issue. See: > > https://goo.gl/tpsmEJ#status for how to communicate with syzbot. > > > > If the bug is already fixed, let syzbot know by replying with: > > #syz fix: exact-commit-title > > > > If you want syzbot to run the reproducer, reply with: > > #syz test: git://repo/address.git branch-or-commit-hash > > If you attach or paste a git patch, syzbot will apply it before testing. > > > > If you want to overwrite bug's subsystems, reply with: > > #syz set subsystems: new-subsystem > > (See the list of subsystem names on the web dashboard) > > > > If the bug is a duplicate of another bug, reply with: > > #syz dup: exact-subject-of-another-report > > > > If you want to undo deduplication, reply with: > > #syz undup > > > > I have an explanation of why this error occurs, but I need an advice > on how to fix it. I think the fix should look as follows: diff --git a/kernel/bpf/offload.c b/kernel/bpf/offload.c index 3e4f2ec1af06..302e38bffffa 100644 --- a/kernel/bpf/offload.c +++ b/kernel/bpf/offload.c @@ -199,12 +199,11 @@ static int __bpf_prog_dev_bound_init(struct bpf_prog *prog, struct net_device *n offload->netdev = netdev; ondev = bpf_offload_find_netdev(offload->netdev); + if (bpf_prog_is_offloaded(prog->aux) && (!ondev || !ondev->offdev)) { + err = -EINVAL; + goto err_free; + } if (!ondev) { - if (bpf_prog_is_offloaded(prog->aux)) { - err = -EINVAL; - goto err_free; - } - /* When only binding to the device, explicitly * create an entry in the hashtable. */ With the following reasoning: for offloaded programs offload device should exist and it should not be a fake device create in !ondev branch. Stanislav, could you please take a look? I think this is related to commit: 2b3486bc2d23 ("bpf: Introduce device-bound XDP programs") > Then NULL pointer deference occurs in the following function from offload.c: > > int bpf_prog_offload_verifier_prep(struct bpf_prog *prog) > { > struct bpf_prog_offload *offload; > int ret = -ENODEV; > > down_read(&bpf_devs_lock); > offload = prog->aux->offload; > if (offload) { > ret = offload->offdev->ops->prepare(prog); > ^^^^^^ > this pointer is NULL > offload->dev_state = !ret; > } > up_read(&bpf_devs_lock); > > return ret; > } > > # Short explanation > > (a) call chain bpf_prog_load -> bpf_prog_dev_bound_init -> __bpf_prog_dev_bound_init > -> __bpf_offload_dev_netdev_register > might insert an instance of struct bpf_offload_netdev with {.offdev == NULL} > into hash table offload.c:offdevs; > (b) call chain bpf_prog_load -> bpf_check -> bpf_prog_offload_verifier_prep > assumes that from (prog->aux->offload != NULL) > follows (prog->aux->offload->offdev != NULL) > which is not the case because of (a). > > # Long explanation > > The reproducer generated by testbot has the following structure: > - in a loop call function execute_one(), which does the following > system calls in sequence: > - socket(AF_INET6, SOCK_RAW, IPPROTO_IGMP) = <some fd> > - ioctl(3, SIOCGIFINDEX, {ifr_name="batadv_slave_1"}) = 0 > - bpf(BPF_PROG_LOAD, > {prog_type=BPF_PROG_TYPE_XDP, ... prog_flags=0x40, prog_ifindex=29, ...}) = -1 EINVAL > (referred to as program #1 below) > - socket(AF_INET6, SOCK_RAW, IPPROTO_IGMP) = <some fd> > - ioctl(4, SIOCGIFINDEX, {ifr_name="batadv_slave_1"}) = 0 > - bpf(BPF_PROG_LOAD, > {prog_type=BPF_PROG_TYPE_XDP, ... prog_flags=0, ... prog_ifindex=29}) = -1 EINVAL > (referred to as program #2 below) > > The error occurs when second bpf call is processed. > Interestingly, if sleep(1) is inserted somewhere between first and > second bpf calls error does not occur: > > @@ -1246,6 +1246,7 @@ void execute_one(void) > *(uint32_t*)0x200009cc = 4; > syscall(__NR_bpf, /*cmd=*/5ul, /*arg=*/0x20000940ul, /*size=*/0x90ul); > res = syscall(__NR_socket, /*domain=*/0xaul, /*type=*/3ul, /*proto=*/2); > + // sleep(1); /* uncomment to hide the error */ > if (res != -1) > r[2] = res; > memcpy((void*)0x20000100, "batadv_slave_1\000\000", 16); > > ## Control flow when error occurs > > For program #1: > - bpf_prog_load(): > - bpf_prog_is_dev_bound(prog->aux) is true > - bpf_prog_dev_bound_init > - prog->aux->offload_requested is 0 (because of 0x40 prog_flags) > - __bpf_prog_dev_bound_init > - netdev is "batadv_slave_1" > - bpf_offload_find_netdev(offload->netdev) == NULL, > (this is a lookup in hash table offload.c:offdevs) > which triggers a call to __bpf_offload_dev_netdev_register > - __bpf_offload_dev_netdev_register(NULL, offload->netdev) > registers struct bpf_offload_netdev with {.offdev = NULL} > for netdev "batadv_slave_1" in offload.c:offdevs hash table. > > For program #2: > - bpf_prog_load(): > - bpf_prog_is_dev_bound(prog->aux) is true > - bpf_prog_dev_bound_init > - prog->aux->offload_requested is 1 (because of 0x0 prog_flags) > - __bpf_prog_dev_bound_init > - netdev is "batadv_slave_1" > - bpf_offload_find_netdev(offload->netdev) != NULL, > this is struct bpf_offload_netdev with {.offdev = NULL} > created for program #1 > - prog->aux->offload = struct bpf_prog_offload {.offload -> {.offdev = NULL}}, > The bpf_prog_offload remembered for prog points to bpf_offload_netdev > with .offdev == NULL. > - ... > - bpf_check > - bpf_prog_offload_verifier_prep > - prog->aux->offload != NULL, but prog->aux->offload->offdev == NULL > => null pointer deference. > > ## Control flow when error does not occur > > For program #1: > - ... all as in the previous case ... > > Some worker thread: > - kernel/bpf/core.c:bpf_prog_free_deferred, registered for program #1: > - bpf_prog_is_dev_bound(aux) is true > - bpf_prog_dev_bound_destroy > - netdev is "batadv_slave_1" > - (!ondev->offdev && list_empty(&ondev->progs)) is true > - __bpf_offload_dev_netdev_unregister > this removes struct bpf_offload_netdev with {.offdev = NULL} > from offload.c:offdevs hash table. > > For program #2: > - bpf_prog_load(): > - bpf_prog_is_dev_bound(prog->aux) is true > - bpf_prog_dev_bound_init > - prog->aux->offload_requested is 1 (because of 0x0 prog_flags) > - __bpf_prog_dev_bound_init > - netdev is "batadv_slave_1" > - bpf_offload_find_netdev(offload->netdev) == NULL > - bpf_prog_is_offloaded(prog->aux) is true > - -EINVAL is returned. ^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [syzbot] [bpf?] general protection fault in bpf_prog_offload_verifier_prep 2023-09-06 13:50 ` Eduard Zingerman @ 2023-09-06 13:54 ` Daniel Borkmann 2023-09-06 14:57 ` Jakub Kicinski 1 sibling, 0 replies; 8+ messages in thread From: Daniel Borkmann @ 2023-09-06 13:54 UTC (permalink / raw) To: Eduard Zingerman, syzbot, andrii, ast, bpf, davem, haoluo, hawk, john.fastabend, jolsa, kpsingh, kuba, linux-kernel, martin.lau, netdev, sdf, song, syzkaller-bugs, yonghong.song, horms [ Also adding Simon to Cc ] On 9/6/23 3:50 PM, Eduard Zingerman wrote: > On Wed, 2023-09-06 at 15:40 +0300, Eduard Zingerman wrote: >> On Sun, 2023-09-03 at 12:55 -0700, syzbot wrote: >>> Hello, >>> >>> syzbot found the following issue on: >>> >>> HEAD commit: fa09bc40b21a igb: disable virtualization features on 82580 >>> git tree: net >>> console+strace: https://syzkaller.appspot.com/x/log.txt?x=13382fa8680000 >>> kernel config: https://syzkaller.appspot.com/x/.config?x=634e05b4025da9da >>> dashboard link: https://syzkaller.appspot.com/bug?extid=291100dcb32190ec02a8 >>> compiler: gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40 >>> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=1529c448680000 >>> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=15db0248680000 >>> >>> Downloadable assets: >>> disk image: https://storage.googleapis.com/syzbot-assets/7ab461d84992/disk-fa09bc40.raw.xz >>> vmlinux: https://storage.googleapis.com/syzbot-assets/3ac6d43ab2db/vmlinux-fa09bc40.xz >>> kernel image: https://storage.googleapis.com/syzbot-assets/778d096a134e/bzImage-fa09bc40.xz >>> >>> IMPORTANT: if you fix the issue, please add the following tag to the commit: >>> Reported-by: syzbot+291100dcb32190ec02a8@syzkaller.appspotmail.com >>> >>> general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] PREEMPT SMP KASAN >>> KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007] >>> CPU: 1 PID: 5055 Comm: syz-executor625 Not tainted 6.5.0-syzkaller-04012-gfa09bc40b21a #0 >>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 07/26/2023 >>> RIP: 0010:bpf_prog_offload_verifier_prep+0xaa/0x170 kernel/bpf/offload.c:295 >>> Code: 00 fc ff df 48 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 a1 00 00 00 48 b8 00 00 00 00 00 fc ff df 4c 8b 65 10 4c 89 e2 48 c1 ea 03 <80> 3c 02 00 0f 85 93 00 00 00 48 b8 00 00 00 00 00 fc ff df 4d 8b >>> RSP: 0018:ffffc900039ff7f8 EFLAGS: 00010246 >>> RAX: dffffc0000000000 RBX: ffffc9000156e000 RCX: 0000000000000000 >>> RDX: 0000000000000000 RSI: ffffffff81a8cf76 RDI: ffff888021b25f10 >>> RBP: ffff888021b25f00 R08: 0000000000000001 R09: fffffbfff195203d >>> R10: ffffffff8ca901ef R11: 0000000000000000 R12: 0000000000000000 >>> R13: 0000000000000005 R14: 0000000000000003 R15: ffffc9000156e060 >>> FS: 0000555556071380(0000) GS:ffff8880b9900000(0000) knlGS:0000000000000000 >>> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 >>> CR2: 0000000020000100 CR3: 0000000022f6b000 CR4: 00000000003506e0 >>> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 >>> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 >>> Call Trace: >>> <TASK> >>> bpf_check+0x52f3/0xabd0 kernel/bpf/verifier.c:19762 >>> bpf_prog_load+0x153a/0x2270 kernel/bpf/syscall.c:2708 >>> __sys_bpf+0xbb6/0x4e90 kernel/bpf/syscall.c:5335 >>> __do_sys_bpf kernel/bpf/syscall.c:5439 [inline] >>> __se_sys_bpf kernel/bpf/syscall.c:5437 [inline] >>> __x64_sys_bpf+0x78/0xc0 kernel/bpf/syscall.c:5437 >>> do_syscall_x64 arch/x86/entry/common.c:50 [inline] >>> do_syscall_64+0x38/0xb0 arch/x86/entry/common.c:80 >>> entry_SYSCALL_64_after_hwframe+0x63/0xcd >>> RIP: 0033:0x7f7c0df78ea9 >>> Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 d1 19 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48 >>> RSP: 002b:00007ffde3592128 EFLAGS: 00000246 ORIG_RAX: 0000000000000141 >>> RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007f7c0df78ea9 >>> RDX: 0000000000000090 RSI: 0000000020000940 RDI: 0000000000000005 >>> RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000100000000 >>> R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 >>> R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 >>> </TASK> >>> Modules linked in: >>> ---[ end trace 0000000000000000 ]--- >>> RIP: 0010:bpf_prog_offload_verifier_prep+0xaa/0x170 kernel/bpf/offload.c:295 >>> Code: 00 fc ff df 48 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 a1 00 00 00 48 b8 00 00 00 00 00 fc ff df 4c 8b 65 10 4c 89 e2 48 c1 ea 03 <80> 3c 02 00 0f 85 93 00 00 00 48 b8 00 00 00 00 00 fc ff df 4d 8b >>> RSP: 0018:ffffc900039ff7f8 EFLAGS: 00010246 >>> RAX: dffffc0000000000 RBX: ffffc9000156e000 RCX: 0000000000000000 >>> RDX: 0000000000000000 RSI: ffffffff81a8cf76 RDI: ffff888021b25f10 >>> RBP: ffff888021b25f00 R08: 0000000000000001 R09: fffffbfff195203d >>> R10: ffffffff8ca901ef R11: 0000000000000000 R12: 0000000000000000 >>> R13: 0000000000000005 R14: 0000000000000003 R15: ffffc9000156e060 >>> FS: 0000555556071380(0000) GS:ffff8880b9900000(0000) knlGS:0000000000000000 >>> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 >>> CR2: 0000000020000100 CR3: 0000000022f6b000 CR4: 00000000003506e0 >>> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 >>> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 >>> ---------------- >>> Code disassembly (best guess), 3 bytes skipped: >>> 0: df 48 89 fisttps -0x77(%rax) >>> 3: fa cli >>> 4: 48 c1 ea 03 shr $0x3,%rdx >>> 8: 80 3c 02 00 cmpb $0x0,(%rdx,%rax,1) >>> c: 0f 85 a1 00 00 00 jne 0xb3 >>> 12: 48 b8 00 00 00 00 00 movabs $0xdffffc0000000000,%rax >>> 19: fc ff df >>> 1c: 4c 8b 65 10 mov 0x10(%rbp),%r12 >>> 20: 4c 89 e2 mov %r12,%rdx >>> 23: 48 c1 ea 03 shr $0x3,%rdx >>> * 27: 80 3c 02 00 cmpb $0x0,(%rdx,%rax,1) <-- trapping instruction >>> 2b: 0f 85 93 00 00 00 jne 0xc4 >>> 31: 48 b8 00 00 00 00 00 movabs $0xdffffc0000000000,%rax >>> 38: fc ff df >>> 3b: 4d rex.WRB >>> 3c: 8b .byte 0x8b >>> >>> >>> --- >>> This report is generated by a bot. It may contain errors. >>> See https://goo.gl/tpsmEJ for more information about syzbot. >>> syzbot engineers can be reached at syzkaller@googlegroups.com. >>> >>> syzbot will keep track of this issue. See: >>> https://goo.gl/tpsmEJ#status for how to communicate with syzbot. >>> >>> If the bug is already fixed, let syzbot know by replying with: >>> #syz fix: exact-commit-title >>> >>> If you want syzbot to run the reproducer, reply with: >>> #syz test: git://repo/address.git branch-or-commit-hash >>> If you attach or paste a git patch, syzbot will apply it before testing. >>> >>> If you want to overwrite bug's subsystems, reply with: >>> #syz set subsystems: new-subsystem >>> (See the list of subsystem names on the web dashboard) >>> >>> If the bug is a duplicate of another bug, reply with: >>> #syz dup: exact-subject-of-another-report >>> >>> If you want to undo deduplication, reply with: >>> #syz undup >>> >> >> I have an explanation of why this error occurs, but I need an advice >> on how to fix it. > > I think the fix should look as follows: > > diff --git a/kernel/bpf/offload.c b/kernel/bpf/offload.c > index 3e4f2ec1af06..302e38bffffa 100644 > --- a/kernel/bpf/offload.c > +++ b/kernel/bpf/offload.c > @@ -199,12 +199,11 @@ static int __bpf_prog_dev_bound_init(struct bpf_prog *prog, struct net_device *n > offload->netdev = netdev; > > ondev = bpf_offload_find_netdev(offload->netdev); > + if (bpf_prog_is_offloaded(prog->aux) && (!ondev || !ondev->offdev)) { > + err = -EINVAL; > + goto err_free; > + } > if (!ondev) { > - if (bpf_prog_is_offloaded(prog->aux)) { > - err = -EINVAL; > - goto err_free; > - } > - > /* When only binding to the device, explicitly > * create an entry in the hashtable. > */ > > With the following reasoning: for offloaded programs offload device > should exist and it should not be a fake device create in !ondev branch. > > Stanislav, could you please take a look? I think this is related to commit: > 2b3486bc2d23 ("bpf: Introduce device-bound XDP programs") > >> Then NULL pointer deference occurs in the following function from offload.c: >> >> int bpf_prog_offload_verifier_prep(struct bpf_prog *prog) >> { >> struct bpf_prog_offload *offload; >> int ret = -ENODEV; >> >> down_read(&bpf_devs_lock); >> offload = prog->aux->offload; >> if (offload) { >> ret = offload->offdev->ops->prepare(prog); >> ^^^^^^ >> this pointer is NULL >> offload->dev_state = !ret; >> } >> up_read(&bpf_devs_lock); >> >> return ret; >> } >> >> # Short explanation >> >> (a) call chain bpf_prog_load -> bpf_prog_dev_bound_init -> __bpf_prog_dev_bound_init >> -> __bpf_offload_dev_netdev_register >> might insert an instance of struct bpf_offload_netdev with {.offdev == NULL} >> into hash table offload.c:offdevs; >> (b) call chain bpf_prog_load -> bpf_check -> bpf_prog_offload_verifier_prep >> assumes that from (prog->aux->offload != NULL) >> follows (prog->aux->offload->offdev != NULL) >> which is not the case because of (a). >> >> # Long explanation >> >> The reproducer generated by testbot has the following structure: >> - in a loop call function execute_one(), which does the following >> system calls in sequence: >> - socket(AF_INET6, SOCK_RAW, IPPROTO_IGMP) = <some fd> >> - ioctl(3, SIOCGIFINDEX, {ifr_name="batadv_slave_1"}) = 0 >> - bpf(BPF_PROG_LOAD, >> {prog_type=BPF_PROG_TYPE_XDP, ... prog_flags=0x40, prog_ifindex=29, ...}) = -1 EINVAL >> (referred to as program #1 below) >> - socket(AF_INET6, SOCK_RAW, IPPROTO_IGMP) = <some fd> >> - ioctl(4, SIOCGIFINDEX, {ifr_name="batadv_slave_1"}) = 0 >> - bpf(BPF_PROG_LOAD, >> {prog_type=BPF_PROG_TYPE_XDP, ... prog_flags=0, ... prog_ifindex=29}) = -1 EINVAL >> (referred to as program #2 below) >> >> The error occurs when second bpf call is processed. >> Interestingly, if sleep(1) is inserted somewhere between first and >> second bpf calls error does not occur: >> >> @@ -1246,6 +1246,7 @@ void execute_one(void) >> *(uint32_t*)0x200009cc = 4; >> syscall(__NR_bpf, /*cmd=*/5ul, /*arg=*/0x20000940ul, /*size=*/0x90ul); >> res = syscall(__NR_socket, /*domain=*/0xaul, /*type=*/3ul, /*proto=*/2); >> + // sleep(1); /* uncomment to hide the error */ >> if (res != -1) >> r[2] = res; >> memcpy((void*)0x20000100, "batadv_slave_1\000\000", 16); >> >> ## Control flow when error occurs >> >> For program #1: >> - bpf_prog_load(): >> - bpf_prog_is_dev_bound(prog->aux) is true >> - bpf_prog_dev_bound_init >> - prog->aux->offload_requested is 0 (because of 0x40 prog_flags) >> - __bpf_prog_dev_bound_init >> - netdev is "batadv_slave_1" >> - bpf_offload_find_netdev(offload->netdev) == NULL, >> (this is a lookup in hash table offload.c:offdevs) >> which triggers a call to __bpf_offload_dev_netdev_register >> - __bpf_offload_dev_netdev_register(NULL, offload->netdev) >> registers struct bpf_offload_netdev with {.offdev = NULL} >> for netdev "batadv_slave_1" in offload.c:offdevs hash table. >> >> For program #2: >> - bpf_prog_load(): >> - bpf_prog_is_dev_bound(prog->aux) is true >> - bpf_prog_dev_bound_init >> - prog->aux->offload_requested is 1 (because of 0x0 prog_flags) >> - __bpf_prog_dev_bound_init >> - netdev is "batadv_slave_1" >> - bpf_offload_find_netdev(offload->netdev) != NULL, >> this is struct bpf_offload_netdev with {.offdev = NULL} >> created for program #1 >> - prog->aux->offload = struct bpf_prog_offload {.offload -> {.offdev = NULL}}, >> The bpf_prog_offload remembered for prog points to bpf_offload_netdev >> with .offdev == NULL. >> - ... >> - bpf_check >> - bpf_prog_offload_verifier_prep >> - prog->aux->offload != NULL, but prog->aux->offload->offdev == NULL >> => null pointer deference. >> >> ## Control flow when error does not occur >> >> For program #1: >> - ... all as in the previous case ... >> >> Some worker thread: >> - kernel/bpf/core.c:bpf_prog_free_deferred, registered for program #1: >> - bpf_prog_is_dev_bound(aux) is true >> - bpf_prog_dev_bound_destroy >> - netdev is "batadv_slave_1" >> - (!ondev->offdev && list_empty(&ondev->progs)) is true >> - __bpf_offload_dev_netdev_unregister >> this removes struct bpf_offload_netdev with {.offdev = NULL} >> from offload.c:offdevs hash table. >> >> For program #2: >> - bpf_prog_load(): >> - bpf_prog_is_dev_bound(prog->aux) is true >> - bpf_prog_dev_bound_init >> - prog->aux->offload_requested is 1 (because of 0x0 prog_flags) >> - __bpf_prog_dev_bound_init >> - netdev is "batadv_slave_1" >> - bpf_offload_find_netdev(offload->netdev) == NULL >> - bpf_prog_is_offloaded(prog->aux) is true >> - -EINVAL is returned. ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [syzbot] [bpf?] general protection fault in bpf_prog_offload_verifier_prep 2023-09-06 13:50 ` Eduard Zingerman 2023-09-06 13:54 ` Daniel Borkmann @ 2023-09-06 14:57 ` Jakub Kicinski 2023-09-06 18:00 ` Eduard Zingerman 1 sibling, 1 reply; 8+ messages in thread From: Jakub Kicinski @ 2023-09-06 14:57 UTC (permalink / raw) To: Eduard Zingerman Cc: syzbot, andrii, ast, bpf, daniel, davem, haoluo, hawk, john.fastabend, jolsa, kpsingh, linux-kernel, martin.lau, netdev, sdf, song, syzkaller-bugs, yonghong.song On Wed, 06 Sep 2023 16:50:23 +0300 Eduard Zingerman wrote: > diff --git a/kernel/bpf/offload.c b/kernel/bpf/offload.c > index 3e4f2ec1af06..302e38bffffa 100644 > --- a/kernel/bpf/offload.c > +++ b/kernel/bpf/offload.c > @@ -199,12 +199,11 @@ static int __bpf_prog_dev_bound_init(struct bpf_prog *prog, struct net_device *n > offload->netdev = netdev; > > ondev = bpf_offload_find_netdev(offload->netdev); > + if (bpf_prog_is_offloaded(prog->aux) && (!ondev || !ondev->offdev)) { > + err = -EINVAL; > + goto err_free; > + } > if (!ondev) { > - if (bpf_prog_is_offloaded(prog->aux)) { > - err = -EINVAL; > - goto err_free; > - } > - > /* When only binding to the device, explicitly > * create an entry in the hashtable. > */ LGTM, FWIW. > With the following reasoning: for offloaded programs offload device > should exist and it should not be a fake device create in !ondev branch. > > Stanislav, could you please take a look? I think this is related to commit: > 2b3486bc2d23 ("bpf: Introduce device-bound XDP programs") ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [syzbot] [bpf?] general protection fault in bpf_prog_offload_verifier_prep 2023-09-06 14:57 ` Jakub Kicinski @ 2023-09-06 18:00 ` Eduard Zingerman 2023-09-06 18:34 ` Stanislav Fomichev 0 siblings, 1 reply; 8+ messages in thread From: Eduard Zingerman @ 2023-09-06 18:00 UTC (permalink / raw) To: Jakub Kicinski Cc: syzbot, andrii, ast, bpf, daniel, davem, haoluo, hawk, john.fastabend, jolsa, kpsingh, linux-kernel, martin.lau, netdev, sdf, song, syzkaller-bugs, yonghong.song On Wed, 2023-09-06 at 07:57 -0700, Jakub Kicinski wrote: > On Wed, 06 Sep 2023 16:50:23 +0300 Eduard Zingerman wrote: > > diff --git a/kernel/bpf/offload.c b/kernel/bpf/offload.c > > index 3e4f2ec1af06..302e38bffffa 100644 > > --- a/kernel/bpf/offload.c > > +++ b/kernel/bpf/offload.c > > @@ -199,12 +199,11 @@ static int __bpf_prog_dev_bound_init(struct bpf_prog *prog, struct net_device *n > > offload->netdev = netdev; > > > > ondev = bpf_offload_find_netdev(offload->netdev); > > + if (bpf_prog_is_offloaded(prog->aux) && (!ondev || !ondev->offdev)) { > > + err = -EINVAL; > > + goto err_free; > > + } > > if (!ondev) { > > - if (bpf_prog_is_offloaded(prog->aux)) { > > - err = -EINVAL; > > - goto err_free; > > - } > > - > > /* When only binding to the device, explicitly > > * create an entry in the hashtable. > > */ > > LGTM, FWIW. Thanks, I'll wrap it up as a proper patch with a test. > > > With the following reasoning: for offloaded programs offload device > > should exist and it should not be a fake device create in !ondev branch. > > > > Stanislav, could you please take a look? I think this is related to commit: > > 2b3486bc2d23 ("bpf: Introduce device-bound XDP programs") ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [syzbot] [bpf?] general protection fault in bpf_prog_offload_verifier_prep 2023-09-06 18:00 ` Eduard Zingerman @ 2023-09-06 18:34 ` Stanislav Fomichev 0 siblings, 0 replies; 8+ messages in thread From: Stanislav Fomichev @ 2023-09-06 18:34 UTC (permalink / raw) To: Eduard Zingerman Cc: Jakub Kicinski, syzbot, andrii, ast, bpf, daniel, davem, haoluo, hawk, john.fastabend, jolsa, kpsingh, linux-kernel, martin.lau, netdev, song, syzkaller-bugs, yonghong.song On 09/06, Eduard Zingerman wrote: > On Wed, 2023-09-06 at 07:57 -0700, Jakub Kicinski wrote: > > On Wed, 06 Sep 2023 16:50:23 +0300 Eduard Zingerman wrote: > > > diff --git a/kernel/bpf/offload.c b/kernel/bpf/offload.c > > > index 3e4f2ec1af06..302e38bffffa 100644 > > > --- a/kernel/bpf/offload.c > > > +++ b/kernel/bpf/offload.c > > > @@ -199,12 +199,11 @@ static int __bpf_prog_dev_bound_init(struct bpf_prog *prog, struct net_device *n > > > offload->netdev = netdev; > > > > > > ondev = bpf_offload_find_netdev(offload->netdev); > > > + if (bpf_prog_is_offloaded(prog->aux) && (!ondev || !ondev->offdev)) { > > > + err = -EINVAL; > > > + goto err_free; > > > + } > > > if (!ondev) { > > > - if (bpf_prog_is_offloaded(prog->aux)) { > > > - err = -EINVAL; > > > - goto err_free; > > > - } > > > - > > > /* When only binding to the device, explicitly > > > * create an entry in the hashtable. > > > */ > > > > LGTM, FWIW. > > Thanks, I'll wrap it up as a proper patch with a test. LGTM as well, thanks! ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [syzbot] [bpf?] general protection fault in bpf_prog_offload_verifier_prep 2023-09-03 19:55 [syzbot] [bpf?] general protection fault in bpf_prog_offload_verifier_prep syzbot 2023-09-06 12:40 ` Eduard Zingerman @ 2023-09-09 16:12 ` syzbot 1 sibling, 0 replies; 8+ messages in thread From: syzbot @ 2023-09-09 16:12 UTC (permalink / raw) To: andrii, ast, bpf, daniel, davem, eddyz87, haoluo, hawk, horms, john.fastabend, jolsa, kpsingh, kuba, linux-kernel, lmb, martin.lau, netdev, sdf, song, syzkaller-bugs, yhs, yonghong.song syzbot has bisected this issue to: commit 47a71c1f9af0a334c9dfa97633c41de4feda4287 Author: Andrii Nakryiko <andrii@kernel.org> Date: Thu Apr 6 23:41:58 2023 +0000 bpf: Add log_true_size output field to return necessary log buffer size bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=132a4268680000 start commit: fa09bc40b21a igb: disable virtualization features on 82580 git tree: net final oops: https://syzkaller.appspot.com/x/report.txt?x=10aa4268680000 console output: https://syzkaller.appspot.com/x/log.txt?x=172a4268680000 kernel config: https://syzkaller.appspot.com/x/.config?x=634e05b4025da9da dashboard link: https://syzkaller.appspot.com/bug?extid=291100dcb32190ec02a8 syz repro: https://syzkaller.appspot.com/x/repro.syz?x=1529c448680000 C reproducer: https://syzkaller.appspot.com/x/repro.c?x=15db0248680000 Reported-by: syzbot+291100dcb32190ec02a8@syzkaller.appspotmail.com Fixes: 47a71c1f9af0 ("bpf: Add log_true_size output field to return necessary log buffer size") For information about bisection process see: https://goo.gl/tpsmEJ#bisection ^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2023-09-09 16:25 UTC | newest] Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2023-09-03 19:55 [syzbot] [bpf?] general protection fault in bpf_prog_offload_verifier_prep syzbot 2023-09-06 12:40 ` Eduard Zingerman 2023-09-06 13:50 ` Eduard Zingerman 2023-09-06 13:54 ` Daniel Borkmann 2023-09-06 14:57 ` Jakub Kicinski 2023-09-06 18:00 ` Eduard Zingerman 2023-09-06 18:34 ` Stanislav Fomichev 2023-09-09 16:12 ` syzbot
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.