From: "Lin Ma" <linma@zju.edu.cn>
To: intel-wired-lan@lists.osuosl.org
Cc: Paul Menzel <pmenzel@molgen.mpg.de>,
regressions <regressions@lists.linux.dev>
Subject: Re: [Intel-wired-lan] [REGRESSION] Deadlock since commit 6faee3d4ee8b ("igb: Add lock to avoid data race")
Date: Tue, 7 Mar 2023 20:37:46 +0800 (GMT+08:00)
Message-ID: <4826c9a2.825ce.186bc13e703.Coremail.linma@zju.edu.cn>
In-Reply-To: <ZAcqAgLo5/EMca8e@calimero.vinschen.de>
Hi Corinna,
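Thanks for the crash log. It seems I could not reproduce the bug because I had not actually enabled SR-IOV.
According to the log, the deadlock is obvious: igb_remove() already holds rtnl_lock when the VF teardown reaches unregister_netdev(), which tries to take the same lock again. That was a beginner's mistake on my part.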
> [ 141.423324] unregister_netdev+0xe/0x20 // <===== tries to take rtnl_lock again
> [ 141.423578] igbvf_remove+0x45/0xe0 [igbvf]
> [ 141.423791] pci_device_remove+0x36/0xb0
> [ 141.423990] device_release_driver_internal+0xc1/0x160
> [ 141.424270] pci_stop_bus_device+0x6d/0x90
> [ 141.424507] pci_stop_and_remove_bus_device+0xe/0x20
> [ 141.424789] pci_iov_remove_virtfn+0xba/0x120
> [ 141.425452] sriov_disable+0x2f/0xf0
> [ 141.425679] igb_disable_sriov+0x4e/0x100 [igb]
> [ 141.426353] igb_remove+0xa0/0x130 [igb] // <===== rtnl_lock first taken here
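The pattern in the trace can be modelled in user space. The following is only a rough sketch, not the driver code: every `fake_*` name is hypothetical, and a POSIX error-checking mutex stands in for rtnl_lock so that the second acquisition reports EDEADLK instead of hanging the way the kernel mutex does.

```c
#define _XOPEN_SOURCE 700 /* expose PTHREAD_MUTEX_ERRORCHECK under strict -std= modes */
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical user-space stand-in for the kernel's rtnl mutex. */
static pthread_mutex_t fake_rtnl;

/* Error-checking type: a relock by the owning thread returns EDEADLK. */
void fake_rtnl_init(void)
{
    pthread_mutexattr_t attr;
    int rc;

    rc = pthread_mutexattr_init(&attr);
    assert(rc == 0);
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    assert(rc == 0);
    rc = pthread_mutex_init(&fake_rtnl, &attr);
    assert(rc == 0);
    (void)rc;
    pthread_mutexattr_destroy(&attr);
}

/* Models unregister_netdev(): it takes the lock on its own. */
static int fake_unregister_netdev(void)
{
    int rc = pthread_mutex_lock(&fake_rtnl);

    if (rc != 0)
        return rc; /* EDEADLK here; the real kernel mutex would block forever */
    pthread_mutex_unlock(&fake_rtnl);
    return 0;
}

/* Models the VF remove path that the PF's SR-IOV teardown ends up calling. */
static int fake_vf_remove(void)
{
    return fake_unregister_netdev();
}

/* PF remove with the outer lock held around SR-IOV teardown (the buggy shape). */
int fake_pf_remove_locked(void)
{
    int rc;

    pthread_mutex_lock(&fake_rtnl);
    rc = fake_vf_remove(); /* inner path tries to take the same lock again */
    pthread_mutex_unlock(&fake_rtnl);
    return rc;
}

/* PF remove without the outer lock (the shape a revert would restore). */
int fake_pf_remove_unlocked(void)
{
    return fake_vf_remove();
}
```

After a single call to `fake_rtnl_init()`, `fake_pf_remove_locked()` returns EDEADLK while `fake_pf_remove_unlocked()` returns 0. In the kernel there is no such error return, so the task simply sleeps in state D as in the hung-task report.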
I will prepare a patch to revert the buggy commit and add a Reported-by tag for you.
Regards
Lin
> From: "Corinna Vinschen" <vinschen@redhat.com>
> Sent Time: 2023-03-07 20:11:46 (Tuesday)
> To: "Lin Ma" <linma@zju.edu.cn>
> Cc: "Paul Menzel" <pmenzel@molgen.mpg.de>, intel-wired-lan <intel-wired-lan@lists.osuosl.org>, regressions <regressions@lists.linux.dev>
> Subject: Re: [Intel-wired-lan] [REGRESSION] Deadlock since commit 6faee3d4ee8b ("igb: Add lock to avoid data race")
>
> Hi Lin,
>
> On Mar 7 18:36, Lin Ma wrote:
> > Hello there
> >
> > Yeah I am looking at it. Could you please offer the crash log or
> > locking debug message if accessible? Thanks in advance.
>
> The commands used to reproduce and the resulting log message:
>
> # echo 10 > /proc/sys/kernel/hung_task_timeout_secs
> # echo 2 > /sys/class/net/ens5f2/device/sriov_numvfs
> # modprobe -r igb
> [hang]
>
> console log:
>
> [ 116.914656] pci 0000:84:10.2: [8086:1520] type 00 class 0x020000
> [ 116.915722] pci 0000:84:10.6: [8086:1520] type 00 class 0x020000
> [ 116.917013] igb 0000:84:00.2: 2 VFs allocated
> [ 116.978350] igbvf: Intel(R) Gigabit Virtual Function Network Driver
> [ 116.979072] igbvf: Copyright (c) 2009 - 2012 Intel Corporation.
> [ 116.980253] igbvf 0000:84:10.2: enabling device (0000 -> 0002)
> [ 116.982356] igbvf 0000:84:10.2: PF still in reset state. Is the PF interface up?
> [ 116.983196] igbvf 0000:84:10.2: Assigning random MAC address.
> [ 116.985058] igbvf 0000:84:10.2: PF still resetting
> [ 117.011054] igbvf 0000:84:10.2: Intel(R) I350 Virtual Function
> [ 117.011785] igbvf 0000:84:10.2: Address: c2:e5:c2:a2:75:00
> [ 117.012189] igbvf 0000:84:10.6: enabling device (0000 -> 0002)
> [ 117.023911] igbvf 0000:84:10.6: PF still in reset state. Is the PF interface up?
> [ 117.024724] igbvf 0000:84:10.6: Assigning random MAC address.
> [ 117.036215] igbvf 0000:84:10.6: PF still resetting
> [ 117.037596] igbvf 0000:84:10.6: Intel(R) I350 Virtual Function
> [ 117.038327] igbvf 0000:84:10.6: Address: ea:74:07:f4:28:7c
> [ 117.047970] igbvf 0000:84:10.6 ens5f2v1: renamed from eth1
> [ 117.062847] igbvf 0000:84:10.2 ens5f2v0: renamed from eth0
> [ 117.080725] igb 0000:84:00.2: VF 1 attempted to set invalid MAC filter
> [ 117.106106] igb 0000:84:00.2: VF 1 attempted to set invalid MAC filter
> [ 117.107189] igb 0000:84:00.2: VF 1 attempted to set invalid MAC filter
> [ 127.361316] igb 0000:84:00.3: removed PHC on ens5f3
> [ 127.361975] igb 0000:84:00.3: DCA disabled
> [ 127.483085] igb 0000:84:00.2: removed PHC on ens5f2
> [ 127.483786] igb 0000:84:00.2: DCA disabled
> [ 141.418410] INFO: task modprobe:2078 blocked for more than 10 seconds.
> [ 141.418856] Not tainted 6.2.0-rc8+ #12
> [ 141.419184] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [ 141.419616] task:modprobe state:D stack:0 pid:2078 ppid:2037 flags:0x00004000
> [ 141.420039] Call Trace:
> [ 141.420169] <TASK>
> [ 141.420672] __schedule+0x2dd/0x840
> [ 141.421427] schedule+0x50/0xc0
> [ 141.422041] schedule_preempt_disabled+0x11/0x20
> [ 141.422678] __mutex_lock.isra.13+0x431/0x6b0
> [ 141.423324] unregister_netdev+0xe/0x20
> [ 141.423578] igbvf_remove+0x45/0xe0 [igbvf]
> [ 141.423791] pci_device_remove+0x36/0xb0
> [ 141.423990] device_release_driver_internal+0xc1/0x160
> [ 141.424270] pci_stop_bus_device+0x6d/0x90
> [ 141.424507] pci_stop_and_remove_bus_device+0xe/0x20
> [ 141.424789] pci_iov_remove_virtfn+0xba/0x120
> [ 141.425452] sriov_disable+0x2f/0xf0
> [ 141.425679] igb_disable_sriov+0x4e/0x100 [igb]
> [ 141.426353] igb_remove+0xa0/0x130 [igb]
> [ 141.426599] pci_device_remove+0x36/0xb0
> [ 141.426796] device_release_driver_internal+0xc1/0x160
> [ 141.427060] driver_detach+0x44/0x90
> [ 141.427253] bus_remove_driver+0x55/0xe0
> [ 141.427477] pci_unregister_driver+0x2a/0xa0
> [ 141.428296] __x64_sys_delete_module+0x141/0x2b0
> [ 141.429126] ? mntput_no_expire+0x4a/0x240
> [ 141.429363] ? syscall_trace_enter.isra.19+0x126/0x1a0
> [ 141.429653] do_syscall_64+0x5b/0x80
> [ 141.429847] ? exit_to_user_mode_prepare+0x14d/0x1c0
> [ 141.430109] ? syscall_exit_to_user_mode+0x12/0x30
> [ 141.430849] ? do_syscall_64+0x67/0x80
> [ 141.431083] ? syscall_exit_to_user_mode_prepare+0x183/0x1b0
> [ 141.431770] ? syscall_exit_to_user_mode+0x12/0x30
> [ 141.432482] ? do_syscall_64+0x67/0x80
> [ 141.432714] ? exc_page_fault+0x64/0x140
> [ 141.432911] entry_SYSCALL_64_after_hwframe+0x72/0xdc
> [ 141.433175] RIP: 0033:0x7ff04cc3a05b
> [ 141.433375] RSP: 002b:00007ffd891bcb38 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
> [ 141.434337] RAX: ffffffffffffffda RBX: 000055fde256dd00 RCX: 00007ff04cc3a05b
> [ 141.435171] RDX: 0000000000000000 RSI: 0000000000000800 RDI: 000055fde256dd68
> [ 141.435973] RBP: 000055fde256dd68 R08: 00007ffd891bbae1 R09: 0000000000000000
> [ 141.436852] R10: 00007ff04cd71480 R11: 0000000000000206 R12: 0000000000000000
> [ 141.437636] R13: 0000000000000000 R14: 000055fde256dd68 R15: 0000000000000000
> [ 141.438442] </TASK>
>
>
> Corinna
Thread overview: 16+ messages
2023-03-07 9:54 [Intel-wired-lan] Deadlock since commit 6faee3d4ee8b ("igb: Add lock to avoid data race") Corinna Vinschen
2023-03-07 10:12 ` [Intel-wired-lan] [REGRESSION] " Paul Menzel
2023-03-07 10:19 ` Paul Menzel
2023-03-07 10:36 ` Lin Ma
2023-03-07 12:11 ` Corinna Vinschen
2023-03-07 12:37 ` Lin Ma [this message]
2023-03-07 12:55 ` Corinna Vinschen
2023-03-07 11:48 ` Lin Ma
2023-03-07 13:05 ` [Intel-wired-lan] [PATCH] igb: revert rtnl_lock() that causes deadlock Lin Ma
2023-03-07 13:17 ` Linux regression tracking (Thorsten Leemhuis)
2023-03-07 13:45 ` Corinna Vinschen
2023-03-07 15:29 ` [Intel-wired-lan] [PATCH v2] " Lin Ma
2023-03-07 16:27 ` Corinna Vinschen
2023-03-07 23:22 ` Jacob Keller
2023-03-08 12:04 ` Simon Horman
2023-03-16 9:10 ` Romanowski, Rafal