Greeting, FYI, we noticed the following commit (built with gcc-9): commit: 95e381b6095d0808a64ecbe36515cca2ea2df477 ("[PATCH net-next 12/13] af_unix: Replace the big lock with small locks.") url: https://github.com/0day-ci/linux/commits/Kuniyuki-Iwashima/af_unix-Replace-unix_table_lock-with-per-hash-locks/20211106-172208 base: https://git.kernel.org/cgit/linux/kernel/git/davem/net-next.git 84882cf72cd774cf16fd338bdbf00f69ac9f9194 patch link: https://lore.kernel.org/netdev/20211106091712.15206-13-kuniyu@amazon.co.jp in testcase: kernel-selftests version: kernel-selftests-x86_64-c8c9111a-1_20210929 with following parameters: group: x86 ucode: 0xde test-description: The kernel contains a set of "self tests" under the tools/testing/selftests/ directory. These are intended to be small unit tests to exercise individual code paths in the kernel. test-url: https://www.kernel.org/doc/Documentation/kselftest.txt on test machine: 4 threads 1 sockets Intel(R) Core(TM) i7-7567U CPU @ 3.50GHz with 32G memory caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace): +---------------------------------------------+------------+------------+ | | 5c1456529e | 95e381b609 | +---------------------------------------------+------------+------------+ | boot_successes | 4 | 0 | | boot_failures | 0 | 4 | | WARNING:possible_recursive_locking_detected | 0 | 4 | +---------------------------------------------+------------+------------+ If you fix the issue, kindly add following tag Reported-by: kernel test robot kern :warn : [ 39.215794] WARNING: possible recursive locking detected kern :warn : [ 39.216286] 5.15.0-rc7-02477-g95e381b6095d #1 Not tainted kern :warn : [ 39.216816] -------------------------------------------- kern :warn : [ 39.217306] systemd/1 is trying to acquire lock: kern :warn : [ 39.217783] ffffffff87c53fb8 (&unix_table_locks[i]){+.+.}-{2:2}, at: unix_bind (net/unix/af_unix.c:1176 net/unix/af_unix.c:1253) kern :warn : [ 39.218514] but task is already holding lock: kern :warn : [ 39.219142] ffffffff87c4de38 (&unix_table_locks[i]){+.+.}-{2:2}, at: unix_table_double_lock (net/unix/af_unix.c:170) kern :warn : [ 39.219969] other info that might help us debug this: kern :warn : [ 39.220611] Possible unsafe locking scenario: kern :warn : [ 39.221209] CPU0 kern :warn : [ 39.221499] ---- kern :warn : [ 39.221789] lock(&unix_table_locks[i]); kern :warn : [ 39.222191] lock(&unix_table_locks[i]); kern :warn : [ 39.222592] *** DEADLOCK *** kern :warn : [ 39.223256] May be due to missing lock nesting notation kern :warn : [ 39.223919] 4 locks held by systemd/1: kern :warn : [ 39.224301] #0: ffff8888765ca448 (sb_writers#6){.+.+}-{0:0}, at: filename_create (fs/namei.c:3656) kern :warn : [ 39.225037] #1: ffff888100275280 (&type->i_mutex_dir_key#4/1){+.+.}-{3:3}, at: filename_create (fs/namei.c:3657) kern :warn : [ 39.225863] #2: ffff888874f82e18 (&u->bindlock){+.+.}-{3:3}, at: unix_bind (net/unix/af_unix.c:1168 net/unix/af_unix.c:1253) kern :warn : [ 39.226571] #3: ffffffff87c4de38 (&unix_table_locks[i]){+.+.}-{2:2}, at: unix_table_double_lock (net/unix/af_unix.c:170) kern :warn : [ 39.227394] stack backtrace: kern :warn : [ 39.227885] CPU: 2 PID: 1 Comm: systemd Not tainted 5.15.0-rc7-02477-g95e381b6095d #1 kern :warn : [ 39.228571] Hardware name: Intel Corporation NUC7i7BNH/NUC7i7BNB, BIOS BNKBL357.86A.0067.2018.0814.1500 08/14/2018 kern :warn : [ 39.229434] Call Trace: kern :warn : [ 39.229727] dump_stack_lvl (lib/dump_stack.c:107) kern :warn : [ 39.230107] __lock_acquire.cold (kernel/locking/lockdep.c:2944 kernel/locking/lockdep.c:2987 kernel/locking/lockdep.c:3776 kernel/locking/lockdep.c:5015) kern :warn : [ 39.230530] ? lockdep_hardirqs_on_prepare (kernel/locking/lockdep.c:4873) kern :warn : [ 39.231012] ? lock_is_held_type (kernel/locking/lockdep.c:438 kernel/locking/lockdep.c:5669) kern :warn : [ 39.231426] ? lockdep_hardirqs_on_prepare (kernel/locking/lockdep.c:4873) kern :warn : [ 39.231907] ? lock_is_held_type (kernel/locking/lockdep.c:438 kernel/locking/lockdep.c:5669) kern :warn : [ 39.232321] ? rcu_read_lock_sched_held (include/linux/lockdep.h:283 kernel/rcu/update.c:125) kern :warn : [ 39.232779] lock_acquire (kernel/locking/lockdep.c:438 kernel/locking/lockdep.c:5627 kernel/locking/lockdep.c:5590) kern :warn : [ 39.233155] ? unix_bind (net/unix/af_unix.c:1176 net/unix/af_unix.c:1253) kern :warn : [ 39.233533] ? rcu_read_unlock (include/linux/rcupdate.h:716 (discriminator 5)) kern :warn : [ 39.233932] ? do_raw_spin_lock (arch/x86/include/asm/atomic.h:202 include/linux/atomic/atomic-instrumented.h:513 include/asm-generic/qspinlock.h:82 kernel/locking/spinlock_debug.c:115) kern :warn : [ 39.234346] ? rwlock_bug+0xc0/0xc0 kern :warn : [ 39.234757] _raw_spin_lock (include/linux/spinlock_api_smp.h:143 kernel/locking/spinlock.c:154) kern :warn : [ 39.235135] ? unix_bind (net/unix/af_unix.c:1176 net/unix/af_unix.c:1253) kern :warn : [ 39.235513] unix_bind (net/unix/af_unix.c:1176 net/unix/af_unix.c:1253) kern :warn : [ 39.235880] ? unix_socketpair (net/unix/af_unix.c:1239) kern :warn : [ 39.236289] ? _copy_from_user (arch/x86/include/asm/uaccess_64.h:46 arch/x86/include/asm/uaccess_64.h:52 lib/usercopy.c:16) kern :warn : [ 39.236694] __sys_bind (net/socket.c:1693) kern :warn : [ 39.237061] ? __ia32_sys_socketpair (net/socket.c:1680) kern :warn : [ 39.237504] ? lock_is_held_type (kernel/locking/lockdep.c:438 kernel/locking/lockdep.c:5669) kern :warn : [ 39.237919] ? lock_is_held_type (kernel/locking/lockdep.c:438 kernel/locking/lockdep.c:5669) kern :warn : [ 39.238332] ? syscall_enter_from_user_mode (kernel/entry/common.c:107) kern :warn : [ 39.241493] ? rcu_read_lock_sched_held (include/linux/lockdep.h:283 kernel/rcu/update.c:125) kern :warn : [ 39.241951] ? rcu_read_lock_bh_held (kernel/rcu/update.c:120) kern :warn : [ 39.242382] ? do_syscall_64 (arch/x86/entry/common.c:87) kern :warn : [ 39.242768] __x64_sys_bind (net/socket.c:1702) kern :warn : [ 39.243145] ? syscall_enter_from_user_mode (arch/x86/include/asm/irqflags.h:45 arch/x86/include/asm/irqflags.h:80 kernel/entry/common.c:107) kern :warn : [ 39.243620] do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80) kern :warn : [ 39.243994] ? do_syscall_64 (arch/x86/entry/common.c:87) kern :warn : [ 39.244378] ? lockdep_hardirqs_on (kernel/locking/lockdep.c:4344) kern :warn : [ 39.244803] ? do_syscall_64 (arch/x86/entry/common.c:87) kern :warn : [ 39.245188] ? do_syscall_64 (arch/x86/entry/common.c:87) kern :warn : [ 39.245571] ? lockdep_hardirqs_on (kernel/locking/lockdep.c:4344) kern :warn : [ 39.245997] ? do_syscall_64 (arch/x86/entry/common.c:87) kern :warn : [ 39.246380] ? asm_sysvec_apic_timer_interrupt (arch/x86/include/asm/idtentry.h:638) kern :warn : [ 39.246873] ? asm_sysvec_apic_timer_interrupt (arch/x86/include/asm/idtentry.h:638) kern :warn : [ 39.247359] ? lockdep_hardirqs_on (kernel/locking/lockdep.c:4344) kern :warn : [ 39.247786] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:113) kern :warn : [ 39.248260] RIP: 0033:0x7f84b7e22497 kern :warn : [ 39.248632] Code: ff ff ff ff c3 48 8b 15 f7 09 0c 00 f7 d8 64 89 02 b8 ff ff ff ff eb ba 66 2e 0f 1f 84 00 00 00 00 00 90 b8 31 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d c9 09 0c 00 f7 d8 64 89 01 48 All code ======== 0: ff (bad) 1: ff (bad) 2: ff (bad) 3: ff c3 inc %ebx 5: 48 8b 15 f7 09 0c 00 mov 0xc09f7(%rip),%rdx # 0xc0a03 c: f7 d8 neg %eax e: 64 89 02 mov %eax,%fs:(%rdx) 11: b8 ff ff ff ff mov $0xffffffff,%eax 16: eb ba jmp 0xffffffffffffffd2 18: 66 2e 0f 1f 84 00 00 nopw %cs:0x0(%rax,%rax,1) 1f: 00 00 00 22: 90 nop 23: b8 31 00 00 00 mov $0x31,%eax 28: 0f 05 syscall 2a:* 48 3d 01 f0 ff ff cmp $0xfffffffffffff001,%rax <-- trapping instruction 30: 73 01 jae 0x33 32: c3 retq 33: 48 8b 0d c9 09 0c 00 mov 0xc09c9(%rip),%rcx # 0xc0a03 3a: f7 d8 neg %eax 3c: 64 89 01 mov %eax,%fs:(%rcx) 3f: 48 rex.W Code starting with the faulting instruction =========================================== 0: 48 3d 01 f0 ff ff cmp $0xfffffffffffff001,%rax 6: 73 01 jae 0x9 8: c3 retq 9: 48 8b 0d c9 09 0c 00 mov 0xc09c9(%rip),%rcx # 0xc09d9 10: f7 d8 neg %eax 12: 64 89 01 mov %eax,%fs:(%rcx) 15: 48 rex.W kern :warn : [ 39.250101] RSP: 002b:00007ffd7751fa58 EFLAGS: 00000246 ORIG_RAX: 0000000000000031 kern :warn : [ 39.250770] RAX: ffffffffffffffda RBX: 0000559d4fc70bb0 RCX: 00007f84b7e22497 kern :warn : [ 39.251406] RDX: 0000000000000016 RSI: 00007ffd7751fa70 RDI: 000000000000000f kern :warn : [ 39.252042] RBP: 00007ffd7751fa70 R08: 000000000000000c R09: 0000559d4fcd6670 kern :warn : [ 39.252679] R10: 00007ffd7751fa24 R11: 0000000000000246 R12: 000000000000000f kern :warn : [ 39.253314] R13: 0000000000000016 R14: 00007ffd7751fb50 R15: 00007ffd7751fb48 kern :notice: [ 39.425532] random: systemd-random-: uninitialized urandom read (512 bytes read) kern :info : [ 39.905355] intel_pmc_core INT33A1:00: initialized kern :debug : [ 40.139879] IOAPIC[2]: Set IRTE entry (P:1 FPD:0 Dst_Mode:1 Redir_hint:1 Trig_Mode:0 Dlvry_Mode:0 Avail:0 Vector:EF Dest:00000001 SID:F0F8 SQ:0 SVT:1) kern :debug : [ 40.141118] IOAPIC[0]: Preconfigured routing entry (2-18 -> IRQ 18 Level:1 ActiveLow:1) kern :info : [ 40.186707] mei_me 0000:00:16.0: enabling device (0004 -> 0006) kern :debug : [ 40.196333] libata version 3.00 loaded. kern :notice: [ 40.361505] random: dbus-daemon: uninitialized urandom read (12 bytes read) kern :notice: [ 40.365282] random: dbus-daemon: uninitialized urandom read (12 bytes read) kern :info : [ 40.969470] microcode: updated to revision 0xde, date = 2020-05-27 kern :warn : [ 40.970115] x86/CPU: CPU features have changed after loading microcode, but might not take effect. kern :warn : [ 40.971061] x86/CPU: Please consider either early loading through initrd/built-in or a potential BIOS update. kern :info : [ 40.971992] microcode: Reload completed, microcode revision: 0xde kern :info : [ 41.013694] i801_smbus 0000:00:1f.4: SPD Write Disable is set kern :info : [ 41.014375] i801_smbus 0000:00:1f.4: SMBus using PCI interrupt kern :info : [ 41.015800] IPMI message handler: version 39.2 kern :info : [ 41.022407] iTCO_wdt iTCO_wdt: Found a Intel PCH TCO device (Version=4, TCOBASE=0x0400) kern :info : [ 41.023587] ipmi device interface kern :info : [ 41.033277] iTCO_wdt iTCO_wdt: initialized. heartbeat=30 sec (nowayout=0) kern :notice: [ 41.048766] random: ln: uninitialized urandom read (6 bytes read) kern :debug : [ 41.068893] ahci 0000:00:17.0: version 3.0 kern :info : [ 41.115078] ipmi_si: IPMI System Interface driver kern :warn : [ 41.117158] ipmi_si: Unable to find any System Interface(s) kern :info : [ 41.117819] ahci 0000:00:17.0: AHCI 0001.0301 32 slots 2 ports 6 Gbps 0x5 impl SATA mode kern :info : [ 41.118556] ahci 0000:00:17.0: flags: 64bit ncq pm led clo only pio slum part deso sadm sds apst kern :notice: [ 41.138009] random: ln: uninitialized urandom read (6 bytes read) kern :info : [ 41.170985] RAPL PMU: API unit is 2^-32 Joules, 5 fixed counters, 655360 ms ovfl timer kern :info : [ 41.171706] RAPL PMU: hw unit of domain pp0-core 2^-14 Joules kern :info : [ 41.172240] RAPL PMU: hw unit of domain package 2^-14 Joules kern :info : [ 41.172781] RAPL PMU: hw unit of domain dram 2^-14 Joules kern :info : [ 41.173296] RAPL PMU: hw unit of domain pp1-gpu 2^-14 Joules kern :info : [ 41.174495] RAPL PMU: hw unit of domain psys 2^-14 Joules kern :info : [ 41.220484] scsi host0: ahci kern :info : [ 41.235600] i2c i2c-0: 2/2 memory slots populated (from DMI) kern :info : [ 41.236463] scsi host1: ahci kern :info : [ 41.243332] i2c i2c-0: Successfully instantiated SPD at 0x50 kern :info : [ 41.251989] scsi host2: ahci kern :info : [ 41.254369] ata1: SATA max UDMA/133 abar m2048@0xdc24b000 port 0xdc24b100 irq 129 kern :info : [ 41.255076] ata2: DUMMY kern :info : [ 41.255403] ata3: SATA max UDMA/133 abar m2048@0xdc24b000 port 0xdc24b200 irq 129 kern :info : [ 41.428685] i915 0000:00:02.0: [drm] Found 64MB of eDRAM kern :info : [ 41.429635] i915 0000:00:02.0: vgaarb: deactivate vga console To reproduce: git clone https://github.com/intel/lkp-tests.git cd lkp-tests sudo bin/lkp install job.yaml # job file is attached in this email bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run sudo bin/lkp run generated-yaml-file # if come across any failure that blocks the test, # please remove ~/.lkp and /lkp dir to run from a clean state. --- 0DAY/LKP+ Test Infrastructure Open Source Technology Center https://lists.01.org/hyperkitty/list/lkp@lists.01.org Intel Corporation Thanks, Oliver Sang