Greetings,

FYI, we noticed the following commit (built with gcc-9):

commit: 7540b2861e5292b5993f8e693fc69510b2a7277a ("crypto: aesni - AVX512 version of AESNI-GCM using VPCLMULQDQ")
https://github.com/meghadey/crypto for_crypto_avx512

in testcase: kernel-selftests
version: kernel-selftests-x86_64-2ff4761d-1_20210630
with the following parameters:

        group: net
        ucode: 0xe2

test-description: The kernel contains a set of "self tests" under the tools/testing/selftests/ directory. These are intended to be small unit tests to exercise individual code paths in the kernel.
test-url: https://www.kernel.org/doc/Documentation/kselftest.txt

on test machine: 4 threads Intel(R) Xeon(R) CPU E3-1225 v5 @ 3.30GHz with 16G memory

caused the changes below (please refer to the attached dmesg/kmsg for the entire log/backtrace):

If you fix the issue, kindly add the following tag
Reported-by: kernel test robot

[ 282.813956] WARNING: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected
[ 282.820619] 5.13.0-rc1-00154-g7540b2861e52 #1 Not tainted
[ 282.826047] -----------------------------------------------------
[ 282.832171] ping6/22937 [HC0[0]:SC0[4]:HE1:SE0] is trying to acquire:
[ 282.838650] ffffffff83516520 (fs_reclaim){+.+.}-{0:0}, at: fs_reclaim_acquire (mm/page_alloc.c:4377 mm/page_alloc.c:4391)
[ 282.846795]
[ 282.846795] and this task is already holding:
[ 282.852648] ffff8884359f2498 (_xmit_TUNNEL6#2){+...}-{2:2}, at: __dev_queue_xmit (include/linux/netdevice.h:4380 net/core/dev.c:4241)
[ 282.861088] which would create a new lock dependency:
[ 282.866179]  (_xmit_TUNNEL6#2){+...}-{2:2} -> (fs_reclaim){+.+.}-{0:0}
[ 282.872745]
[ 282.872745] but this new dependency connects a SOFTIRQ-irq-safe lock:
[ 282.880696]  (&dev->tx_global_lock){+.-.}-{2:2}
[ 282.880697]
[ 282.880697] ... which became SOFTIRQ-irq-safe at:
[ 282.891493]   lock_acquire (kernel/locking/lockdep.c:438 kernel/locking/lockdep.c:5514 kernel/locking/lockdep.c:5477)
[ 282.895172]   _raw_spin_lock (include/linux/spinlock_api_smp.h:143 kernel/locking/spinlock.c:151)
[ 282.898953]   dev_watchdog (include/linux/netdevice.h:4447 net/sched/sch_generic.c:417)
[ 282.902657]   call_timer_fn (arch/x86/include/asm/jump_label.h:19 include/linux/jump_label.h:200 include/trace/events/timer.h:125 kernel/time/timer.c:1432)
[ 282.906439]   run_timer_softirq (kernel/time/timer.c:1477 kernel/time/timer.c:1745 kernel/time/timer.c:1721 kernel/time/timer.c:1758)
[ 282.910653]   __do_softirq (arch/x86/include/asm/jump_label.h:19 include/linux/jump_label.h:200 include/trace/events/irq.h:142 kernel/softirq.c:560)
[ 282.914319]   irq_exit_rcu (kernel/softirq.c:433 kernel/softirq.c:637 kernel/softirq.c:649)
[ 282.918082]   sysvec_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:1100 (discriminator 14))
[ 282.923089]   asm_sysvec_apic_timer_interrupt (arch/x86/include/asm/idtentry.h:647)
[ 282.928353]   lock_acquire (kernel/locking/lockdep.c:5516)
[ 282.932044]   __might_fault (mm/memory.c:5070 mm/memory.c:5055)
[ 282.935741]   get_user_arg_ptr+0x23/0x80
[ 282.940309]   count+0x5d/0xc0
[ 282.944946]   do_execveat_common+0x9e/0x200
[ 282.949762]   __x64_sys_execve (fs/exec.c:2058)
[ 282.953707]   do_syscall_64 (arch/x86/entry/common.c:47)
[ 282.957384]   entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:112)
[ 282.962569]
[ 282.962569] to a SOFTIRQ-irq-unsafe lock:
[ 282.968086]  (fs_reclaim){+.+.}-{0:0}
[ 282.968088]
[ 282.968088] ... which became SOFTIRQ-irq-unsafe at:
[ 282.978135] ...
[ 282.978135]   lock_acquire (kernel/locking/lockdep.c:438 kernel/locking/lockdep.c:5514 kernel/locking/lockdep.c:5477)
[ 282.983615]   fs_reclaim_acquire (mm/page_alloc.c:4378 mm/page_alloc.c:4391)
[ 282.987831]   __kmalloc_node (include/linux/sched/mm.h:199 mm/slab.h:497 mm/slub.c:2833 mm/slub.c:4111)
[ 282.991699]   alloc_cpumask_var_node (lib/cpumask.c:124)
[ 282.996175]   native_smp_prepare_cpus (arch/x86/kernel/smpboot.c:1334)
[ 283.000823]   kernel_init_freeable (init/main.c:1544)
[ 283.005308]   kernel_init (init/main.c:1449)
[ 283.008827]   ret_from_fork (arch/x86/entry/entry_64.S:300)
[ 283.012522]
[ 283.012522] other info that might help us debug this:
[ 283.012522]
[ 283.020527] Chain exists of:
[ 283.020527]   &dev->tx_global_lock --> _xmit_TUNNEL6#2 --> fs_reclaim
[ 283.020527]
[ 283.031348]  Possible interrupt unsafe locking scenario:
[ 283.031348]
[ 283.038182]        CPU0                    CPU1
[ 283.042741]        ----                    ----
[ 283.047277]   lock(fs_reclaim);
[ 283.050424]                                local_irq_disable();
[ 283.056364]                                lock(&dev->tx_global_lock);
[ 283.062911]                                lock(_xmit_TUNNEL6#2);
[ 283.069034]   <Interrupt>
[ 283.071676]     lock(&dev->tx_global_lock);
[ 283.075866]
[ 283.075866]  *** DEADLOCK ***
[ 283.075866]
[ 283.081796] 4 locks held by ping6/22937:
[ 283.085734]  #0: ffff8884357a4f20 (sk_lock-AF_INET6){+.+.}-{0:0}, at: rawv6_sendmsg (net/ipv6/raw.c:949)
[ 283.094471]  #1: ffffffff8343c100 (rcu_read_lock_bh){....}-{1:2}, at: ip6_finish_output2 (include/linux/bottom_half.h:19 include/linux/rcupdate.h:728 net/ipv6/ip6_output.c:110)
[ 283.103560]  #2: ffffffff8343c100 (rcu_read_lock_bh){....}-{1:2}, at: __dev_queue_xmit (include/linux/bottom_half.h:19 include/linux/rcupdate.h:728 net/core/dev.c:4185)
[ 283.112460]  #3: ffff8884359f2498 (_xmit_TUNNEL6#2){+...}-{2:2}, at: __dev_queue_xmit (include/linux/netdevice.h:4380 net/core/dev.c:4241)
[ 283.121383]
[ 283.121383] the dependencies between SOFTIRQ-irq-safe lock and the holding lock:
[ 283.130289] -> (&dev->tx_global_lock){+.-.}-{2:2} {
[ 283.135284]    HARDIRQ-ON-W at:
[ 283.138536]      lock_acquire (kernel/locking/lockdep.c:438 kernel/locking/lockdep.c:5514 kernel/locking/lockdep.c:5477)
[ 283.143957]      _raw_spin_lock (include/linux/spinlock_api_smp.h:143 kernel/locking/spinlock.c:151)
[ 283.149461]      dev_deactivate_many (include/linux/netdevice.h:4447 include/linux/netdevice.h:4466 net/sched/sch_generic.c:476 net/sched/sch_generic.c:1210)
[ 283.155491]      __dev_close_many (net/core/dev.c:1682)
[ 283.161250]      __dev_change_flags (include/linux/list.h:132 include/linux/list.h:146 net/core/dev.c:1706 net/core/dev.c:8739)
[ 283.167183]      dev_change_flags (net/core/dev.c:8812)
[ 283.172858]      ic_close_devs (net/ipv4/ipconfig.c:338)
[ 283.178278]      ip_auto_config (net/ipv4/ipconfig.c:1636)
[ 283.183954]      do_one_initcall (init/main.c:1249)
[ 283.189648]      kernel_init_freeable (init/main.c:1321 init/main.c:1338 init/main.c:1358 init/main.c:1560)
[ 283.195868]      kernel_init (init/main.c:1449)
[ 283.201124]      ret_from_fork (arch/x86/entry/entry_64.S:300)
[ 283.206563]    IN-SOFTIRQ-W at:
[ 283.209816]      lock_acquire (kernel/locking/lockdep.c:438 kernel/locking/lockdep.c:5514 kernel/locking/lockdep.c:5477)
[ 283.215236]      _raw_spin_lock (include/linux/spinlock_api_smp.h:143 kernel/locking/spinlock.c:151)
[ 283.220737]      dev_watchdog (include/linux/netdevice.h:4447 net/sched/sch_generic.c:417)
[ 283.226166]      call_timer_fn (arch/x86/include/asm/jump_label.h:19 include/linux/jump_label.h:200 include/trace/events/timer.h:125 kernel/time/timer.c:1432)
[ 283.231674]      run_timer_softirq (kernel/time/timer.c:1477 kernel/time/timer.c:1745 kernel/time/timer.c:1721 kernel/time/timer.c:1758)
[ 283.237606]      __do_softirq (arch/x86/include/asm/jump_label.h:19 include/linux/jump_label.h:200 include/trace/events/irq.h:142 kernel/softirq.c:560)
[ 283.243046]      irq_exit_rcu (kernel/softirq.c:433 kernel/softirq.c:637 kernel/softirq.c:649)
[ 283.248562]      sysvec_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:1100 (discriminator 14))
[ 283.255311]      asm_sysvec_apic_timer_interrupt (arch/x86/include/asm/idtentry.h:647)
[ 283.262294]      lock_acquire (kernel/locking/lockdep.c:5516)
[ 283.267713]      __might_fault (mm/memory.c:5070 mm/memory.c:5055)
[ 283.273125]      get_user_arg_ptr+0x23/0x80
[ 283.279406]      count+0x5d/0xc0
[ 283.285774]      do_execveat_common+0x9e/0x200
[ 283.292305]      __x64_sys_execve (fs/exec.c:2058)
[ 283.297995]      do_syscall_64 (arch/x86/entry/common.c:47)
[ 283.303432]      entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:112)
[ 283.310343]    INITIAL USE at:
[ 283.313492]      lock_acquire (kernel/locking/lockdep.c:438 kernel/locking/lockdep.c:5514 kernel/locking/lockdep.c:5477)
[ 283.318828]      _raw_spin_lock (include/linux/spinlock_api_smp.h:143 kernel/locking/spinlock.c:151)
[ 283.324255]      dev_deactivate_many (include/linux/netdevice.h:4447 include/linux/netdevice.h:4466 net/sched/sch_generic.c:476 net/sched/sch_generic.c:1210)
[ 283.330193]      __dev_close_many (net/core/dev.c:1682)
[ 283.335864]      __dev_change_flags (include/linux/list.h:132 include/linux/list.h:146 net/core/dev.c:1706 net/core/dev.c:8739)
[ 283.341731]      dev_change_flags (net/core/dev.c:8812)
[ 283.347323]      ic_close_devs (net/ipv4/ipconfig.c:338)
[ 283.352661]      ip_auto_config (net/ipv4/ipconfig.c:1636)
[ 283.358253]      do_one_initcall (init/main.c:1249)
[ 283.363851]      kernel_init_freeable (init/main.c:1321 init/main.c:1338 init/main.c:1358 init/main.c:1560)
[ 283.369947]      kernel_init (init/main.c:1449)
[ 283.375097]      ret_from_fork (arch/x86/entry/entry_64.S:300)
[ 283.380430]  }
[ 283.382198]  ... key at: __key.101907+0x0/0x10
[ 283.388999]  ... acquired at:
[ 283.392084]    __lock_acquire (kernel/locking/lockdep.c:4902)
[ 283.396127]    lock_acquire (kernel/locking/lockdep.c:438 kernel/locking/lockdep.c:5514 kernel/locking/lockdep.c:5477)
[ 283.399908]    _raw_spin_lock (include/linux/spinlock_api_smp.h:143 kernel/locking/spinlock.c:151)
[ 283.403759]    dev_deactivate_many (include/linux/netdevice.h:4380 include/linux/netdevice.h:4457 include/linux/netdevice.h:4466 net/sched/sch_generic.c:476 net/sched/sch_generic.c:1210)
[ 283.408129]    __dev_close_many (net/core/dev.c:1682)
[ 283.412239]    dev_close_many (net/core/dev.c:1720)
[ 283.416173]    unregister_netdevice_many (net/core/dev.c:10985)
[ 283.421168]    ip6_tnl_exit_batch_net (net/ipv6/ip6_tunnel.c:2310)
[ 283.425913]    cleanup_net (net/core/net_namespace.c:594 (discriminator 9))
[ 283.429694]    process_one_work (arch/x86/include/asm/jump_label.h:19 include/linux/jump_label.h:200 include/trace/events/workqueue.h:108 kernel/workqueue.c:2280)
[ 283.433912]    worker_thread (include/linux/list.h:282 kernel/workqueue.c:2422)
[ 283.437780]    kthread (kernel/kthread.c:313)
[ 283.441216]    ret_from_fork (arch/x86/entry/entry_64.S:300)
[ 283.444998]
[ 283.446491] -> (_xmit_TUNNEL6#2){+...}-{2:2} {
[ 283.450951]    HARDIRQ-ON-W at:
[ 283.454108]      lock_acquire (kernel/locking/lockdep.c:438 kernel/locking/lockdep.c:5514 kernel/locking/lockdep.c:5477)
[ 283.459344]      _raw_spin_lock (include/linux/spinlock_api_smp.h:143 kernel/locking/spinlock.c:151)
[ 283.464666]      dev_deactivate_many (include/linux/netdevice.h:4380 include/linux/netdevice.h:4457 include/linux/netdevice.h:4466 net/sched/sch_generic.c:476 net/sched/sch_generic.c:1210)
[ 283.470522]      __dev_close_many (net/core/dev.c:1682)
[ 283.476137]      dev_close_many (net/core/dev.c:1720)
[ 283.481569]      unregister_netdevice_many (net/core/dev.c:10985)
[ 283.488027]      ip6_tnl_exit_batch_net (net/ipv6/ip6_tunnel.c:2310)
[ 283.494217]      cleanup_net (net/core/net_namespace.c:594 (discriminator 9))
[ 283.499494]      process_one_work (arch/x86/include/asm/jump_label.h:19 include/linux/jump_label.h:200 include/trace/events/workqueue.h:108 kernel/workqueue.c:2280)
[ 283.505180]      worker_thread (include/linux/list.h:282 kernel/workqueue.c:2422)
[ 283.510531]      kthread (kernel/kthread.c:313)
[ 283.515434]      ret_from_fork (arch/x86/entry/entry_64.S:300)
[ 283.520699]    INITIAL USE at:
[ 283.523768]      lock_acquire (kernel/locking/lockdep.c:438 kernel/locking/lockdep.c:5514 kernel/locking/lockdep.c:5477)
[ 283.528917]      _raw_spin_lock (include/linux/spinlock_api_smp.h:143 kernel/locking/spinlock.c:151)
[ 283.534173]      dev_deactivate_many+0xef/0x3c0
[ 283.539939]      __dev_close_many+0x8c/0x140
[ 283.545437]      dev_close_many+0x8b/0x140
[ 283.550776]      unregister_netdevice_many+0x15c/0x6c0
[ 283.557142]      ip6_tnl_exit_batch_net+0x264/0x3c0
[ 283.563263]      cleanup_net+0x265/0x400
[ 283.568430]      process_one_work+0x274/0x5c0
[ 283.574043]      worker_thread+0x50/0x3c0
[ 283.579306]      kthread+0x133/0x180
[ 283.584129]      ret_from_fork+0x22/0x30
[ 283.589306]  }
[ 283.590978]  ... key at: [] netdev_xmit_lock_key+0x1e0/0x390
[ 283.598657]  ... acquired at:
[ 283.601664]    check_prev_add+0xa2/0xc40
[ 283.605603]    validate_chain+0x8f8/0x1200
[ 283.609738]    __lock_acquire+0x563/0xb00
[ 283.613753]    lock_acquire+0xc8/0x3c0
[ 283.617532]    fs_reclaim_acquire+0xa3/0x100
[ 283.621827]    kmem_cache_alloc_trace+0x35/0x840
[ 283.626487]    gcmaes_crypt_by_sg+0x79/0x480
[ 283.630785]    gcmaes_encrypt+0x4b/0xc0
[ 283.634651]    helper_rfc4106_encrypt+0x8e/0xc0
[ 283.639215]    seqiv_aead_encrypt+0x13a/0x200
[ 283.643603]    esp6_output_tail+0x1e8/0x600 [esp6]
[ 283.648408]    esp6_output+0x117/0x1c0 [esp6]
[ 283.652790]    xfrm_output_resume+0x1c8/0xa40
[ 283.657178]    xfrm6_output+0x51/0x240
[ 283.660961]    vti6_xmit+0x57d/0x640 [ip6_vti]
[ 283.665411]    vti6_tnl_xmit+0x120/0x18b [ip6_vti]
[ 283.670214]    dev_hard_start_xmit+0xf2/0x340
[ 283.674583]    __dev_queue_xmit+0xb3d/0xd40
[ 283.678780]    ip6_finish_output2+0x466/0x9c0
[ 283.683169]    ip6_fragment+0x7c8/0x8c0
[ 283.687012]    ip6_output+0x81/0x2c0
[ 283.690618]    ip6_send_skb+0x28/0xc0
[ 283.694297]    rawv6_sendmsg+0xa3c/0xc00
[ 283.698261]    sock_sendmsg+0x57/0x80
[ 283.701956]    __sys_sendto+0xf1/0x180
[ 283.705737]    __x64_sys_sendto+0x25/0x40
[ 283.709777]    do_syscall_64+0x40/0x80
[ 283.713565]    entry_SYSCALL_64_after_hwframe+0x44/0xae

To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        bin/lkp install                job.yaml  # job file is attached in this email
        bin/lkp split-job --compatible job.yaml  # generate the yaml file for lkp run
        bin/lkp run                    generated-yaml-file

---
0DAY/LKP+ Test Infrastructure                   Open Source Technology Center
https://lists.01.org/hyperkitty/list/lkp(a)lists.01.org       Intel Corporation

Thanks,
Oliver Sang