* [PATCH net v2] iavf: fix hang on reboot with ice
From: Stefan Assmann @ 2023-03-13 16:06 UTC
To: intel-wired-lan
Cc: netdev, anthony.l.nguyen, patryk.piotrowski, slawomirx.laba,
michal.kubiak, sassmann
When a system with an E810 adapter and existing VFs is rebooted, the
following hang may be observed.
PID 1 is hung in iavf_remove(), part of the iavf network driver:
PID: 1 TASK: ffff965400e5a340 CPU: 24 COMMAND: "systemd-shutdow"
#0 [ffffaad04005fa50] __schedule at ffffffff8b3239cb
#1 [ffffaad04005fae8] schedule at ffffffff8b323e2d
#2 [ffffaad04005fb00] schedule_hrtimeout_range_clock at ffffffff8b32cebc
#3 [ffffaad04005fb80] usleep_range_state at ffffffff8b32c930
#4 [ffffaad04005fbb0] iavf_remove at ffffffffc12b9b4c [iavf]
#5 [ffffaad04005fbf0] pci_device_remove at ffffffff8add7513
#6 [ffffaad04005fc10] device_release_driver_internal at ffffffff8af08baa
#7 [ffffaad04005fc40] pci_stop_bus_device at ffffffff8adcc5fc
#8 [ffffaad04005fc60] pci_stop_and_remove_bus_device at ffffffff8adcc81e
#9 [ffffaad04005fc70] pci_iov_remove_virtfn at ffffffff8adf9429
#10 [ffffaad04005fca8] sriov_disable at ffffffff8adf98e4
#11 [ffffaad04005fcc8] ice_free_vfs at ffffffffc04bb2c8 [ice]
#12 [ffffaad04005fd10] ice_remove at ffffffffc04778fe [ice]
#13 [ffffaad04005fd38] ice_shutdown at ffffffffc0477946 [ice]
#14 [ffffaad04005fd50] pci_device_shutdown at ffffffff8add58f1
#15 [ffffaad04005fd70] device_shutdown at ffffffff8af05386
#16 [ffffaad04005fd98] kernel_restart at ffffffff8a92a870
#17 [ffffaad04005fda8] __do_sys_reboot at ffffffff8a92abd6
#18 [ffffaad04005fee0] do_syscall_64 at ffffffff8b317159
#19 [ffffaad04005ff08] __context_tracking_enter at ffffffff8b31b6fc
#20 [ffffaad04005ff18] syscall_exit_to_user_mode at ffffffff8b31b50d
#21 [ffffaad04005ff28] do_syscall_64 at ffffffff8b317169
#22 [ffffaad04005ff50] entry_SYSCALL_64_after_hwframe at ffffffff8b40009b
RIP: 00007f1baa5c13d7 RSP: 00007fffbcc55a98 RFLAGS: 00000202
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f1baa5c13d7
RDX: 0000000001234567 RSI: 0000000028121969 RDI: 00000000fee1dead
RBP: 00007fffbcc55ca0 R8: 0000000000000000 R9: 00007fffbcc54e90
R10: 00007fffbcc55050 R11: 0000000000000202 R12: 0000000000000005
R13: 0000000000000000 R14: 00007fffbcc55af0 R15: 0000000000000000
ORIG_RAX: 00000000000000a9 CS: 0033 SS: 002b
During reboot, the PM shutdown callbacks of all drivers are invoked.
In iavf_shutdown() the adapter state is changed to __IAVF_REMOVE.
In ice_shutdown() the call chain above is executed, which at some point
calls iavf_remove(). However, iavf_remove() expects the VF to be in one
of the states __IAVF_RUNNING, __IAVF_DOWN or __IAVF_INIT_FAILED. If
that's not the case, it sleeps and retries forever.
So if iavf_shutdown() is invoked before iavf_remove(), the system
hangs indefinitely because the adapter is already in state __IAVF_REMOVE.
Fix this by returning from iavf_remove() if the state is __IAVF_REMOVE,
as we already went through iavf_shutdown().
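To make the hang concrete, here is a minimal user-space sketch of the
wait loop in iavf_remove(). It is not the driver code: the vf_state
enum, model_shutdown(), model_remove() and the max_polls bound are
hypothetical names introduced for illustration. The real loop has no
poll limit, and it takes adapter->crit_lock around each state check,
which this model leaves out.

```c
#include <stdbool.h>

/* Simplified stand-ins for the adapter states named in the patch. */
enum vf_state {
	VF_RUNNING,      /* models __IAVF_RUNNING */
	VF_DOWN,         /* models __IAVF_DOWN */
	VF_INIT_FAILED,  /* models __IAVF_INIT_FAILED */
	VF_REMOVE,       /* models __IAVF_REMOVE */
};

enum remove_result {
	REMOVE_TEARDOWN, /* loop left via break; teardown would follow */
	REMOVE_SKIPPED,  /* early return: shutdown already handled the VF */
	REMOVE_HUNG,     /* ran out of polls; the real loop never gives up */
};

/* Models iavf_shutdown(): the adapter ends up in the REMOVE state. */
static void model_shutdown(enum vf_state *state)
{
	*state = VF_REMOVE;
}

/*
 * Models the polling loop in iavf_remove(). Nothing changes the state
 * while we poll, which matches the reboot path described above.
 */
static enum remove_result model_remove(enum vf_state state, bool with_fix,
				       int max_polls)
{
	for (int i = 0; i < max_polls; i++) {
		if (state == VF_RUNNING || state == VF_DOWN ||
		    state == VF_INIT_FAILED)
			return REMOVE_TEARDOWN;

		/* The check added by this patch. */
		if (with_fix && state == VF_REMOVE)
			return REMOVE_SKIPPED;

		/* Real code: drop crit_lock, usleep_range(500, 1000), retry. */
	}
	return REMOVE_HUNG;
}
```

Once model_shutdown() has set the state, model_remove(s, false, 1000)
runs out of polls, which is the stand-in for the hang, while
model_remove(s, true, 1000) returns REMOVE_SKIPPED on the first poll.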
Fixes: 974578017fc1 ("iavf: Add waiting so the port is initialized in remove")
Fixes: a8417330f8a5 ("iavf: Fix race condition between iavf_shutdown and iavf_remove")
Reported-by: Marius Cornea <mcornea@redhat.com>
Signed-off-by: Stefan Assmann <sassmann@kpanic.de>
---
v2: return instead of breaking out of the while (1) loop.
This avoids going through the remove code twice and is how things worked
before a8417330f8a5.
drivers/net/ethernet/intel/iavf/iavf_main.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/drivers/net/ethernet/intel/iavf/iavf_main.c b/drivers/net/ethernet/intel/iavf/iavf_main.c
index 3273aeb8fa67..ce7071e9af15 100644
--- a/drivers/net/ethernet/intel/iavf/iavf_main.c
+++ b/drivers/net/ethernet/intel/iavf/iavf_main.c
@@ -5066,6 +5066,11 @@ static void iavf_remove(struct pci_dev *pdev)
 			mutex_unlock(&adapter->crit_lock);
 			break;
 		}
+		/* Simply return if we already went through iavf_shutdown */
+		if (adapter->state == __IAVF_REMOVE) {
+			mutex_unlock(&adapter->crit_lock);
+			return;
+		}
 
 		mutex_unlock(&adapter->crit_lock);
 		usleep_range(500, 1000);
--
2.39.1
* Re: [PATCH net v2] iavf: fix hang on reboot with ice
From: Michal Kubiak @ 2023-03-14 14:24 UTC
To: Stefan Assmann
Cc: intel-wired-lan, netdev, anthony.l.nguyen, patryk.piotrowski,
slawomirx.laba
On Mon, Mar 13, 2023 at 05:06:45PM +0100, Stefan Assmann wrote:
> When a system with E810 with existing VFs gets rebooted the following
> hang may be observed.
>
> Pid 1 is hung in iavf_remove(), part of a network driver:
> PID: 1 TASK: ffff965400e5a340 CPU: 24 COMMAND: "systemd-shutdow"
> #0 [ffffaad04005fa50] __schedule at ffffffff8b3239cb
> #1 [ffffaad04005fae8] schedule at ffffffff8b323e2d
> #2 [ffffaad04005fb00] schedule_hrtimeout_range_clock at ffffffff8b32cebc
> #3 [ffffaad04005fb80] usleep_range_state at ffffffff8b32c930
> #4 [ffffaad04005fbb0] iavf_remove at ffffffffc12b9b4c [iavf]
> #5 [ffffaad04005fbf0] pci_device_remove at ffffffff8add7513
> #6 [ffffaad04005fc10] device_release_driver_internal at ffffffff8af08baa
> #7 [ffffaad04005fc40] pci_stop_bus_device at ffffffff8adcc5fc
> #8 [ffffaad04005fc60] pci_stop_and_remove_bus_device at ffffffff8adcc81e
> #9 [ffffaad04005fc70] pci_iov_remove_virtfn at ffffffff8adf9429
> #10 [ffffaad04005fca8] sriov_disable at ffffffff8adf98e4
> #11 [ffffaad04005fcc8] ice_free_vfs at ffffffffc04bb2c8 [ice]
> #12 [ffffaad04005fd10] ice_remove at ffffffffc04778fe [ice]
> #13 [ffffaad04005fd38] ice_shutdown at ffffffffc0477946 [ice]
> #14 [ffffaad04005fd50] pci_device_shutdown at ffffffff8add58f1
> #15 [ffffaad04005fd70] device_shutdown at ffffffff8af05386
> #16 [ffffaad04005fd98] kernel_restart at ffffffff8a92a870
> #17 [ffffaad04005fda8] __do_sys_reboot at ffffffff8a92abd6
> #18 [ffffaad04005fee0] do_syscall_64 at ffffffff8b317159
> #19 [ffffaad04005ff08] __context_tracking_enter at ffffffff8b31b6fc
> #20 [ffffaad04005ff18] syscall_exit_to_user_mode at ffffffff8b31b50d
> #21 [ffffaad04005ff28] do_syscall_64 at ffffffff8b317169
> #22 [ffffaad04005ff50] entry_SYSCALL_64_after_hwframe at ffffffff8b40009b
> RIP: 00007f1baa5c13d7 RSP: 00007fffbcc55a98 RFLAGS: 00000202
> RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f1baa5c13d7
> RDX: 0000000001234567 RSI: 0000000028121969 RDI: 00000000fee1dead
> RBP: 00007fffbcc55ca0 R8: 0000000000000000 R9: 00007fffbcc54e90
> R10: 00007fffbcc55050 R11: 0000000000000202 R12: 0000000000000005
> R13: 0000000000000000 R14: 00007fffbcc55af0 R15: 0000000000000000
> ORIG_RAX: 00000000000000a9 CS: 0033 SS: 002b
>
> During reboot all drivers PM shutdown callbacks are invoked.
> In iavf_shutdown() the adapter state is changed to __IAVF_REMOVE.
> In ice_shutdown() the call chain above is executed, which at some point
> calls iavf_remove(). However iavf_remove() expects the VF to be in one
> of the states __IAVF_RUNNING, __IAVF_DOWN or __IAVF_INIT_FAILED. If
> that's not the case it sleeps forever.
> So if iavf_shutdown() gets invoked before iavf_remove() the system will
> hang indefinitely because the adapter is already in state __IAVF_REMOVE.
>
> Fix this by returning from iavf_remove() if the state is __IAVF_REMOVE,
> as we already went through iavf_shutdown().
>
> Fixes: 974578017fc1 ("iavf: Add waiting so the port is initialized in remove")
> Fixes: a8417330f8a5 ("iavf: Fix race condition between iavf_shutdown and iavf_remove")
> Reported-by: Marius Cornea <mcornea@redhat.com>
> Signed-off-by: Stefan Assmann <sassmann@kpanic.de>
> ---
> v2: return instead of breaking the while (1) loop
> This avoids going through remove code twice and is how things worked
> before a8417330f8a5.
Good catch. Indeed, this logic was in place before that patch.
Thanks,
Michal
Reviewed-by: Michal Kubiak <michal.kubiak@intel.com>
>
> drivers/net/ethernet/intel/iavf/iavf_main.c | 5 +++++
> 1 file changed, 5 insertions(+)
>
> diff --git a/drivers/net/ethernet/intel/iavf/iavf_main.c b/drivers/net/ethernet/intel/iavf/iavf_main.c
> index 3273aeb8fa67..ce7071e9af15 100644
> --- a/drivers/net/ethernet/intel/iavf/iavf_main.c
> +++ b/drivers/net/ethernet/intel/iavf/iavf_main.c
> @@ -5066,6 +5066,11 @@ static void iavf_remove(struct pci_dev *pdev)
>  			mutex_unlock(&adapter->crit_lock);
>  			break;
>  		}
> +		/* Simply return if we already went through iavf_shutdown */
> +		if (adapter->state == __IAVF_REMOVE) {
> +			mutex_unlock(&adapter->crit_lock);
> +			return;
> +		}
>
>  		mutex_unlock(&adapter->crit_lock);
>  		usleep_range(500, 1000);
> --
> 2.39.1
>
* Re: [Intel-wired-lan] [PATCH net v2] iavf: fix hang on reboot with ice
From: Romanowski, Rafal @ 2023-03-21 9:43 UTC
To: intel-wired-lan
> -----Original Message-----
> From: Intel-wired-lan <intel-wired-lan-bounces@osuosl.org> On Behalf Of
> Michal Kubiak
> Sent: Tuesday, 14 March 2023 15:24
> To: Stefan Assmann <sassmann@kpanic.de>
> Cc: netdev@vger.kernel.org; Nguyen, Anthony L
> <anthony.l.nguyen@intel.com>; intel-wired-lan@lists.osuosl.org; Piotrowski,
> Patryk <patryk.piotrowski@intel.com>; Laba, SlawomirX
> <slawomirx.laba@intel.com>
> Subject: Re: [Intel-wired-lan] [PATCH net v2] iavf: fix hang on reboot with ice
>
> On Mon, Mar 13, 2023 at 05:06:45PM +0100, Stefan Assmann wrote:
> > When a system with E810 with existing VFs gets rebooted the following
> > hang may be observed.
> >
> > Pid 1 is hung in iavf_remove(), part of a network driver:
> > PID: 1 TASK: ffff965400e5a340 CPU: 24 COMMAND: "systemd-shutdow"
> > #0 [ffffaad04005fa50] __schedule at ffffffff8b3239cb
> > #1 [ffffaad04005fae8] schedule at ffffffff8b323e2d
> > #2 [ffffaad04005fb00] schedule_hrtimeout_range_clock at ffffffff8b32cebc
> > #3 [ffffaad04005fb80] usleep_range_state at ffffffff8b32c930
> > #4 [ffffaad04005fbb0] iavf_remove at ffffffffc12b9b4c [iavf]
> > #5 [ffffaad04005fbf0] pci_device_remove at ffffffff8add7513
> > #6 [ffffaad04005fc10] device_release_driver_internal at ffffffff8af08baa
> > #7 [ffffaad04005fc40] pci_stop_bus_device at ffffffff8adcc5fc
> > #8 [ffffaad04005fc60] pci_stop_and_remove_bus_device at ffffffff8adcc81e
> > #9 [ffffaad04005fc70] pci_iov_remove_virtfn at ffffffff8adf9429
> > #10 [ffffaad04005fca8] sriov_disable at ffffffff8adf98e4
> > #11 [ffffaad04005fcc8] ice_free_vfs at ffffffffc04bb2c8 [ice]
> > #12 [ffffaad04005fd10] ice_remove at ffffffffc04778fe [ice]
> > #13 [ffffaad04005fd38] ice_shutdown at ffffffffc0477946 [ice]
> > #14 [ffffaad04005fd50] pci_device_shutdown at ffffffff8add58f1
> > #15 [ffffaad04005fd70] device_shutdown at ffffffff8af05386
> > #16 [ffffaad04005fd98] kernel_restart at ffffffff8a92a870
> > #17 [ffffaad04005fda8] __do_sys_reboot at ffffffff8a92abd6
> > #18 [ffffaad04005fee0] do_syscall_64 at ffffffff8b317159
> > #19 [ffffaad04005ff08] __context_tracking_enter at ffffffff8b31b6fc
> > #20 [ffffaad04005ff18] syscall_exit_to_user_mode at ffffffff8b31b50d
> > #21 [ffffaad04005ff28] do_syscall_64 at ffffffff8b317169
> > #22 [ffffaad04005ff50] entry_SYSCALL_64_after_hwframe at ffffffff8b40009b
> > RIP: 00007f1baa5c13d7 RSP: 00007fffbcc55a98 RFLAGS: 00000202
> > RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f1baa5c13d7
> > RDX: 0000000001234567 RSI: 0000000028121969 RDI: 00000000fee1dead
> > RBP: 00007fffbcc55ca0 R8: 0000000000000000 R9: 00007fffbcc54e90
> > R10: 00007fffbcc55050 R11: 0000000000000202 R12: 0000000000000005
> > R13: 0000000000000000 R14: 00007fffbcc55af0 R15: 0000000000000000
> > ORIG_RAX: 00000000000000a9 CS: 0033 SS: 002b
> >
> > During reboot all drivers PM shutdown callbacks are invoked.
> > In iavf_shutdown() the adapter state is changed to __IAVF_REMOVE.
> > In ice_shutdown() the call chain above is executed, which at some
> > point calls iavf_remove(). However iavf_remove() expects the VF to be
> > in one of the states __IAVF_RUNNING, __IAVF_DOWN or
> > __IAVF_INIT_FAILED. If that's not the case it sleeps forever.
> > So if iavf_shutdown() gets invoked before iavf_remove() the system will
> > hang indefinitely because the adapter is already in state __IAVF_REMOVE.
> >
> > Fix this by returning from iavf_remove() if the state is
> > __IAVF_REMOVE, as we already went through iavf_shutdown().
> >
> > Fixes: 974578017fc1 ("iavf: Add waiting so the port is initialized in remove")
> > Fixes: a8417330f8a5 ("iavf: Fix race condition between iavf_shutdown and iavf_remove")
> > Reported-by: Marius Cornea <mcornea@redhat.com>
> > Signed-off-by: Stefan Assmann <sassmann@kpanic.de>
> > ---
> > v2: return instead of breaking the while (1) loop
> > This avoids going through remove code twice and is how things worked
> > before a8417330f8a5.
>
> Good catch. Indeed there was such a logic before that patch.
>
> Thanks,
> Michal
>
> Reviewed-by: Michal Kubiak <michal.kubiak@intel.com>
>
> >
> > drivers/net/ethernet/intel/iavf/iavf_main.c | 5 +++++
> > 1 file changed, 5 insertions(+)
> >
> > diff --git a/drivers/net/ethernet/intel/iavf/iavf_main.c b/drivers/net/ethernet/intel/iavf/iavf_main.c
> > index 3273aeb8fa67..ce7071e9af15 100644
> > --- a/drivers/net/ethernet/intel/iavf/iavf_main.c
> > +++ b/drivers/net/ethernet/intel/iavf/iavf_main.c
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
_______________________________________________
Intel-wired-lan mailing list
Intel-wired-lan@osuosl.org
https://lists.osuosl.org/mailman/listinfo/intel-wired-lan