Greeting,

FYI, we noticed the following commit (built with gcc-9):

commit: 944be1796bc1da08d98ef6f41a9b97e39450f356 ("sched: Use lightweight hazard pointers to grab lazy mms")
https://git.kernel.org/cgit/linux/kernel/git/luto/linux.git sched/lazymm

in testcase: boot

on test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G

caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):

+---------------------------------------------------------------------------------------------+------------+------------+
|                                                                                             | 0e57afe67a | 944be1796b |
+---------------------------------------------------------------------------------------------+------------+------------+
| boot_successes                                                                              | 48         | 2          |
| boot_failures                                                                               | 0          | 68         |
| BUG:Bad_rss-counter_state_mm:#type:MM_FILEPAGES_val                                         | 0          | 68         |
| BUG:Bad_rss-counter_state_mm:#type:MM_ANONPAGES_val                                         | 0          | 68         |
| BUG:non-zero_pgtables_bytes_on_freeing_mm                                                   | 0          | 68         |
| Kernel_panic-not_syncing:Attempted_to_kill_init!exitcode=                                   | 0          | 8          |
| WARNING:at_kernel/sched/core.c:#__schedule                                                  | 0          | 51         |
| RIP:__schedule                                                                              | 0          | 51         |
| WARNING:at_kernel/fork.c:#__mmdrop                                                          | 0          | 43         |
| RIP:__mmdrop                                                                                | 0          | 43         |
| WARNING:at_arch/x86/mm/tlb.c:#switch_mm_irqs_off                                            | 0          | 17         |
| RIP:switch_mm_irqs_off                                                                      | 0          | 17         |
| kernel_BUG_at_include/linux/mm.h                                                            | 0          | 40         |
| invalid_opcode:#[##]                                                                        | 0          | 43         |
| RIP:put_page_testzero                                                                       | 0          | 40         |
| Kernel_panic-not_syncing:Fatal_exception                                                    | 0          | 47         |
| kernel_BUG_at_mm/khugepaged.c                                                               | 0          | 3          |
| RIP:__khugepaged_enter                                                                      | 0          | 3          |
| RIP:__put_user_nocheck_1                                                                    | 0          | 2          |
| BUG:Bad_rss-counter_state_mm:#type:MM_SHMEMPAGES_val                                        | 0          | 7          |
| WARNING:at_kernel/sched/core.c:#mm_unlazy_mm_count                                          | 0          | 3          |
| RIP:mm_unlazy_mm_count                                                                      | 0          | 3          |
| canonical_address#:#[##]                                                                    | 0          | 4          |
| RIP:pgd_free                                                                                | 0          | 3          |
| RIP:__handle_mm_fault                                                                       | 0          | 1          |
| RIP:copy_user_generic_string                                                                | 0          | 1          |
| BUG:kernel_NULL_pointer_dereference,address                                                 | 0          | 1          |
| Oops:#[##]                                                                                  | 0          | 2          |
| WARNING:at_arch/x86/mm/tlb.c:#nmi_uaccess_okay                                              | 0          | 8          |
| RIP:nmi_uaccess_okay                                                                        | 0          | 8          |
| BUG:unable_to_handle_page_fault_for_address                                                 | 0          | 1          |
| RIP:vma_interval_tree_augment_compute_max                                                   | 0          | 1          |
| BUG:Bad_page_map_in_process                                                                 | 0          | 1          |
| BUG:Bad_page_state_in_process                                                               | 0          | 1          |
| RIP:native_machine_emergency_restart                                                        | 0          | 1          |
| BUG:stack_guard_page_was_hit_at(____ptrval____)(stack_is(____ptrval____)..(____ptrval____)) | 0          | 1          |
| RIP:number                                                                                  | 0          | 1          |
| RIP:ia32_setup_frame                                                                        | 0          | 1          |
| WARNING:at_fs/coredump.c:#dump_vma_snapshot                                                 | 0          | 2          |
| RIP:dump_vma_snapshot                                                                       | 0          | 2          |
+---------------------------------------------------------------------------------------------+------------+------------+


If you fix the issue, kindly add following tag
Reported-by: kernel test robot


[    6.304063] BUG: Bad rss-counter state mm:(____ptrval____) type:MM_FILEPAGES val:104
[    6.304586] BUG: Bad rss-counter state mm:(____ptrval____) type:MM_ANONPAGES val:7
[    6.305063] BUG: non-zero pgtables_bytes on freeing mm: 24576
[    6.305415] ------------[ cut here ]------------
[    6.305706] rq->lazy_mm
[    6.305710] WARNING: CPU: 0 PID: 2577 at kernel/sched/core.c:4392 __schedule (kbuild/src/consumer/kernel/sched/core.c:4392 kbuild/src/consumer/kernel/sched/core.c:5234)
[    6.306389] Modules linked in:
[    6.306604] CPU: 0 PID: 2577 Comm: modprobe Not tainted 5.13.0-rc3-00009-g944be1796bc1 #1
[    6.307110] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
[    6.307622] RIP: 0010:__schedule (kbuild/src/consumer/kernel/sched/core.c:4392 kbuild/src/consumer/kernel/sched/core.c:5234)
[    6.307900] Code: 00 00 74 38 49 83 bd 68 09 00 00 00 74 1e 80 3d db 6c 0e 01 00 75 15 48 c7 c7 3f 09 2b 82 c6 05 cb 6c 0e 01 01 e8 41 f0 fd ff <0f> 0b 49 8b 87 f0 03 00 00 49 89 85 68 09 00 00 eb 76 49 c7 84 24
All code
========
   0:   00 00                   add    %al,(%rax)
   2:   74 38                   je     0x3c
   4:   49 83 bd 68 09 00 00    cmpq   $0x0,0x968(%r13)
   b:   00
   c:   74 1e                   je     0x2c
   e:   80 3d db 6c 0e 01 00    cmpb   $0x0,0x10e6cdb(%rip)        # 0x10e6cf0
  15:   75 15                   jne    0x2c
  17:   48 c7 c7 3f 09 2b 82    mov    $0xffffffff822b093f,%rdi
  1e:   c6 05 cb 6c 0e 01 01    movb   $0x1,0x10e6ccb(%rip)        # 0x10e6cf0
  25:   e8 41 f0 fd ff          callq  0xfffffffffffdf06b
  2a:*  0f 0b                   ud2             <-- trapping instruction
  2c:   49 8b 87 f0 03 00 00    mov    0x3f0(%r15),%rax
  33:   49 89 85 68 09 00 00    mov    %rax,0x968(%r13)
  3a:   eb 76                   jmp    0xb2
  3c:   49                      rex.WB
  3d:   c7                      .byte 0xc7
  3e:   84                      .byte 0x84
  3f:   24                      .byte 0x24

Code starting with the faulting instruction
===========================================
   0:   0f 0b                   ud2
   2:   49 8b 87 f0 03 00 00    mov    0x3f0(%r15),%rax
   9:   49 89 85 68 09 00 00    mov    %rax,0x968(%r13)
  10:   eb 76                   jmp    0x88
  12:   49                      rex.WB
  13:   c7                      .byte 0xc7
  14:   84                      .byte 0x84
  15:   24                      .byte 0x24
[    6.308958] RSP: 0000:ffffc900017fbcf8 EFLAGS: 00010082
[    6.309279] RAX: 0000000000000000 RBX: ffff88811a2241c8 RCX: 00000000ffff7fff
[    6.309697] RDX: 0000000000000252 RSI: 0000000000000001 RDI: 0000000000000001
[    6.310114] RBP: ffffc900017fbd40 R08: 0000000000000003 R09: 0000000000000000
[    6.310531] R10: 0000000000000001 R11: 000000002d2d2d2d R12: ffff88811a223c00
[    6.310949] R13: ffff88842fc2a9c0 R14: 0000000000000001 R15: ffff88810caa9e00
[    6.311369] FS:  0000000000000000(0000) GS:ffff88842fc00000(0000) knlGS:0000000000000000
[    6.311861] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    6.312209] CR2: 000000000809d8a0 CR3: 0000000110296000 CR4: 00000000000406f0
[    6.312628] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[    6.313048] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[    6.313466] Call Trace:
[    6.313655] preempt_schedule_common (kbuild/src/consumer/arch/x86/include/asm/preempt.h:85 (discriminator 1) kbuild/src/consumer/kernel/sched/core.c:5396 (discriminator 1))
[    6.313939] __cond_resched (kbuild/src/consumer/kernel/sched/core.c:7076)
[    6.314185] down_write_killable (kbuild/src/consumer/kernel/locking/rwsem.c:1260 kbuild/src/consumer/kernel/locking/rwsem.c:1275 kbuild/src/consumer/kernel/locking/rwsem.c:1419)
[    6.314453] mmap_write_lock_killable (kbuild/src/consumer/include/linux/mmap_lock.h:87)
[    6.314740] setup_arg_pages (kbuild/src/consumer/fs/exec.c:793)
[    6.314992] load_elf_binary (kbuild/src/consumer/fs/binfmt_elf.c:1033)
[    6.315252] ? __kernel_read (kbuild/src/consumer/arch/x86/include/asm/current.h:15 kbuild/src/consumer/fs/read_write.c:459)
[    6.315512] ? __kernel_read (kbuild/src/consumer/arch/x86/include/asm/current.h:15 kbuild/src/consumer/fs/read_write.c:459)
[    6.315767] bprm_execve (kbuild/src/consumer/fs/exec.c:1722 kbuild/src/consumer/fs/exec.c:1761 kbuild/src/consumer/fs/exec.c:1830 kbuild/src/consumer/fs/exec.c:1792)
[    6.316010] kernel_execve (kbuild/src/consumer/fs/exec.c:1975)
[    6.316256] call_usermodehelper_exec_async (kbuild/src/consumer/kernel/umh.c:112)
[    6.316573] ? call_usermodehelper (kbuild/src/consumer/kernel/umh.c:67)
[    6.316846] ret_from_fork (kbuild/src/consumer/arch/x86/entry/entry_64.S:300)
[    6.317085] ---[ end trace 80dc07957d67d052 ]---
[    6.317399] sh[2576]: segfault at 0 ip 0000000000000000 sp 00000000ffb5c300 error 14 in busybox[8048000+54000]
[    6.317993] Code: Unable to access opcode bytes at RIP 0xffffffffffffffd6.

Code starting with the faulting instruction
===========================================


To reproduce:

        # build kernel
        cd linux
        cp config-5.13.0-rc3-00009-g944be1796bc1 .config
        make HOSTCC=gcc-9 CC=gcc-9 ARCH=x86_64 olddefconfig prepare modules_prepare bzImage

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        bin/lkp qemu -k job-script # job-script is attached in this email


---
0DAY/LKP+ Test Infrastructure                   Open Source Technology Center
https://lists.01.org/hyperkitty/list/lkp@lists.01.org       Intel Corporation

Thanks,
Oliver Sang