All of lore.kernel.org
 help / color / mirror / Atom feed
* [Intel-wired-lan] [PATCH net v2 0/2] iavf: Fix close path on shutdown and remove in iavf driver.
@ 2022-08-02 11:51 Mateusz Palczewski
  2022-08-02 11:51 ` [Intel-wired-lan] [PATCH net v2 1/2] iavf: Fix shutdown pci callback to match the remove one Mateusz Palczewski
  2022-08-02 11:51 ` [Intel-wired-lan] [PATCH net v2 2/2] iavf: Fix race condition between iavf_shutdown and iavf_remove Mateusz Palczewski
  0 siblings, 2 replies; 9+ messages in thread
From: Mateusz Palczewski @ 2022-08-02 11:51 UTC (permalink / raw)
  To: intel-wired-lan; +Cc: Slawomir Laba

From: Slawomir Laba <slawomirx.laba@intel.com>

iavf_shutdown was implementing an incomplete version
of iavf_remove. It misses several calls to the kernel like
iavf_free_misc_irq, iavf_reset_interrupt_capability, iounmap
that might break the system on reboot or hibernation.

Fix a deadlock introduced by commit
974578017fc1f ("iavf: Add waiting so the port is initialized in remove")
due to race condition between iavf_shutdown and iavf_remove, where
iavf_remove stucks forever in while loop since iavf_shutdown already
set __IAVF_REMOVE adapter state.

---
 v2: Fixed author
---

SlawomirX Laba (2):
  iavf: Fix shutdown pci callback to match the remove one
  iavf: Fix race condition between iavf_shutdown and iavf_remove

 drivers/net/ethernet/intel/iavf/iavf_main.c | 59 ++++++++-------------
 1 file changed, 22 insertions(+), 37 deletions(-)

-- 
2.27.0

_______________________________________________
Intel-wired-lan mailing list
Intel-wired-lan@osuosl.org
https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Intel-wired-lan] [PATCH net v2 1/2] iavf: Fix shutdown pci callback to match the remove one
  2022-08-02 11:51 [Intel-wired-lan] [PATCH net v2 0/2] iavf: Fix close path on shutdown and remove in iavf driver Mateusz Palczewski
@ 2022-08-02 11:51 ` Mateusz Palczewski
  2022-08-04  7:55   ` Szlosek, Marek
  2022-08-02 11:51 ` [Intel-wired-lan] [PATCH net v2 2/2] iavf: Fix race condition between iavf_shutdown and iavf_remove Mateusz Palczewski
  1 sibling, 1 reply; 9+ messages in thread
From: Mateusz Palczewski @ 2022-08-02 11:51 UTC (permalink / raw)
  To: intel-wired-lan; +Cc: Slawomir Laba

From: Slawomir Laba <slawomirx.laba@intel.com>

Make the flow for pci shutdown be the same to the pci remove.

iavf_shutdown was implementing an incomplete version
of iavf_remove. It misses several calls  to the kernel like
iavf_free_misc_irq, iavf_reset_interrupt_capability, iounmap
that might break the system on reboot or hibernation.

Implement the call of iavf_remove directly in iavf_shutdown to
close this gap.

Fixes: 5eae00c57f5e ("i40evf: main driver core")
Signed-off-by: Slawomir Laba <slawomirx.laba@intel.com>
Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
---
 v2: Fixed author
---
 drivers/net/ethernet/intel/iavf/iavf_main.c | 40 +++++++--------------
 1 file changed, 12 insertions(+), 28 deletions(-)

diff --git a/drivers/net/ethernet/intel/iavf/iavf_main.c b/drivers/net/ethernet/intel/iavf/iavf_main.c
index 04fb9fc0de19..6357dea93b99 100644
--- a/drivers/net/ethernet/intel/iavf/iavf_main.c
+++ b/drivers/net/ethernet/intel/iavf/iavf_main.c
@@ -4647,34 +4647,6 @@ int iavf_process_config(struct iavf_adapter *adapter)
 	return 0;
 }
 
-/**
- * iavf_shutdown - Shutdown the device in preparation for a reboot
- * @pdev: pci device structure
- **/
-static void iavf_shutdown(struct pci_dev *pdev)
-{
-	struct iavf_adapter *adapter = iavf_pdev_to_adapter(pdev);
-	struct net_device *netdev = adapter->netdev;
-
-	netif_device_detach(netdev);
-
-	if (netif_running(netdev))
-		iavf_close(netdev);
-
-	if (iavf_lock_timeout(&adapter->crit_lock, 5000))
-		dev_warn(&adapter->pdev->dev, "failed to acquire crit_lock in %s\n", __FUNCTION__);
-	/* Prevent the watchdog from running. */
-	iavf_change_state(adapter, __IAVF_REMOVE);
-	adapter->aq_required = 0;
-	mutex_unlock(&adapter->crit_lock);
-
-#ifdef CONFIG_PM
-	pci_save_state(pdev);
-
-#endif
-	pci_disable_device(pdev);
-}
-
 /**
  * iavf_probe - Device Initialization Routine
  * @pdev: PCI device information struct
@@ -5011,6 +4983,18 @@ static void iavf_remove(struct pci_dev *pdev)
 	pci_disable_device(pdev);
 }
 
+/**
+ * iavf_shutdown - Shutdown the device in preparation for a reboot
+ * @pdev: pci device structure
+ **/
+static void iavf_shutdown(struct pci_dev *pdev)
+{
+	iavf_remove(pdev);
+
+	if (system_state == SYSTEM_POWER_OFF)
+		pci_set_power_state(pdev, PCI_D3hot);
+}
+
 static SIMPLE_DEV_PM_OPS(iavf_pm_ops, iavf_suspend, iavf_resume);
 
 static struct pci_driver iavf_driver = {
-- 
2.27.0

_______________________________________________
Intel-wired-lan mailing list
Intel-wired-lan@osuosl.org
https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [Intel-wired-lan] [PATCH net v2 2/2] iavf: Fix race condition between iavf_shutdown and iavf_remove
  2022-08-02 11:51 [Intel-wired-lan] [PATCH net v2 0/2] iavf: Fix close path on shutdown and remove in iavf driver Mateusz Palczewski
  2022-08-02 11:51 ` [Intel-wired-lan] [PATCH net v2 1/2] iavf: Fix shutdown pci callback to match the remove one Mateusz Palczewski
@ 2022-08-02 11:51 ` Mateusz Palczewski
  2022-08-04  7:56   ` Szlosek, Marek
  2022-08-04  9:14   ` Michal Schmidt
  1 sibling, 2 replies; 9+ messages in thread
From: Mateusz Palczewski @ 2022-08-02 11:51 UTC (permalink / raw)
  To: intel-wired-lan; +Cc: Slawomir Laba

From: Slawomir Laba <slawomirx.laba@intel.com>

Fix a deadlock introduced by commit
974578017fc1 ("iavf: Add waiting so the port is initialized in remove")
due to race condition between iavf_shutdown and iavf_remove, where
iavf_remove stucks forever in while loop since iavf_shutdown already
set __IAVF_REMOVE adapter state.

Fix this by checking if the __IAVF_IN_REMOVE_TASK has already been
set and return if so.

Fixes: 974578017fc1 ("iavf: Add waiting so the port is initialized in remove")
Signed-off-by: Slawomir Laba <slawomirx.laba@intel.com>
Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
---
 v2: Fixed author
---
 drivers/net/ethernet/intel/iavf/iavf_main.c | 19 ++++++++++---------
 1 file changed, 10 insertions(+), 9 deletions(-)

diff --git a/drivers/net/ethernet/intel/iavf/iavf_main.c b/drivers/net/ethernet/intel/iavf/iavf_main.c
index 6357dea93b99..d43e8f12d3ad 100644
--- a/drivers/net/ethernet/intel/iavf/iavf_main.c
+++ b/drivers/net/ethernet/intel/iavf/iavf_main.c
@@ -4846,23 +4846,24 @@ static int __maybe_unused iavf_resume(struct device *dev_d)
 static void iavf_remove(struct pci_dev *pdev)
 {
 	struct iavf_adapter *adapter = iavf_pdev_to_adapter(pdev);
-	struct net_device *netdev = adapter->netdev;
 	struct iavf_fdir_fltr *fdir, *fdirtmp;
 	struct iavf_vlan_filter *vlf, *vlftmp;
+	struct iavf_cloud_filter *cf, *cftmp;
 	struct iavf_adv_rss *rss, *rsstmp;
 	struct iavf_mac_filter *f, *ftmp;
-	struct iavf_cloud_filter *cf, *cftmp;
-	struct iavf_hw *hw = &adapter->hw;
+	struct net_device *netdev;
+	struct iavf_hw *hw;
 	int err;
 
-	/* When reboot/shutdown is in progress no need to do anything
-	 * as the adapter is already REMOVE state that was set during
-	 * iavf_shutdown() callback.
-	 */
-	if (adapter->state == __IAVF_REMOVE)
+	if (!adapter)
+		return;
+
+	netdev = adapter->netdev;
+	hw = &adapter->hw;
+
+	if (test_and_set_bit(__IAVF_IN_REMOVE_TASK, &adapter->crit_section))
 		return;
 
-	set_bit(__IAVF_IN_REMOVE_TASK, &adapter->crit_section);
 	/* Wait until port initialization is complete.
 	 * There are flows where register/unregister netdev may race.
 	 */
-- 
2.27.0

_______________________________________________
Intel-wired-lan mailing list
Intel-wired-lan@osuosl.org
https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [Intel-wired-lan] [PATCH net v2 1/2] iavf: Fix shutdown pci callback to match the remove one
  2022-08-02 11:51 ` [Intel-wired-lan] [PATCH net v2 1/2] iavf: Fix shutdown pci callback to match the remove one Mateusz Palczewski
@ 2022-08-04  7:55   ` Szlosek, Marek
  0 siblings, 0 replies; 9+ messages in thread
From: Szlosek, Marek @ 2022-08-04  7:55 UTC (permalink / raw)
  To: Palczewski, Mateusz, intel-wired-lan; +Cc: Laba, SlawomirX



> -----Original Message-----
> From: Intel-wired-lan <intel-wired-lan-bounces@osuosl.org> On Behalf Of
> Mateusz Palczewski
> Sent: wtorek, 2 sierpnia 2022 13:52
> To: intel-wired-lan@lists.osuosl.org
> Cc: Laba, SlawomirX <slawomirx.laba@intel.com>
> Subject: [Intel-wired-lan] [PATCH net v2 1/2] iavf: Fix shutdown pci callback
> to match the remove one
> 
> From: Slawomir Laba <slawomirx.laba@intel.com>
> 
> Make the flow for pci shutdown be the same to the pci remove.
> 
> iavf_shutdown was implementing an incomplete version of iavf_remove. It
> misses several calls  to the kernel like iavf_free_misc_irq,
> iavf_reset_interrupt_capability, iounmap that might break the system on
> reboot or hibernation.
> 
> Implement the call of iavf_remove directly in iavf_shutdown to close this gap.
> 
> Fixes: 5eae00c57f5e ("i40evf: main driver core")
> Signed-off-by: Slawomir Laba <slawomirx.laba@intel.com>
> Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
> ---
>  v2: Fixed author
> ---
>  drivers/net/ethernet/intel/iavf/iavf_main.c | 40 +++++++--------------
>  1 file changed, 12 insertions(+), 28 deletions(-)
> 
> diff --git a/drivers/net/ethernet/intel/iavf/iavf_main.c
> b/drivers/net/ethernet/intel/iavf/iavf_main.c
> index 04fb9fc0de19..6357dea93b99 100644
> --- a/drivers/net/ethernet/intel/iavf/iavf_main.c
> +++ b/drivers/net/ethernet/intel/iavf/iavf_main.c

Tested-by: Marek Szlosek <marek.szlosek@intel.com>
_______________________________________________
Intel-wired-lan mailing list
Intel-wired-lan@osuosl.org
https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Intel-wired-lan] [PATCH net v2 2/2] iavf: Fix race condition between iavf_shutdown and iavf_remove
  2022-08-02 11:51 ` [Intel-wired-lan] [PATCH net v2 2/2] iavf: Fix race condition between iavf_shutdown and iavf_remove Mateusz Palczewski
@ 2022-08-04  7:56   ` Szlosek, Marek
  2022-08-04  9:14   ` Michal Schmidt
  1 sibling, 0 replies; 9+ messages in thread
From: Szlosek, Marek @ 2022-08-04  7:56 UTC (permalink / raw)
  To: Palczewski, Mateusz, intel-wired-lan; +Cc: Laba, SlawomirX



> -----Original Message-----
> From: Intel-wired-lan <intel-wired-lan-bounces@osuosl.org> On Behalf Of
> Mateusz Palczewski
> Sent: wtorek, 2 sierpnia 2022 13:52
> To: intel-wired-lan@lists.osuosl.org
> Cc: Laba, SlawomirX <slawomirx.laba@intel.com>
> Subject: [Intel-wired-lan] [PATCH net v2 2/2] iavf: Fix race condition between
> iavf_shutdown and iavf_remove
> 
> From: Slawomir Laba <slawomirx.laba@intel.com>
> 
> Fix a deadlock introduced by commit
> 974578017fc1 ("iavf: Add waiting so the port is initialized in remove") due to
> race condition between iavf_shutdown and iavf_remove, where iavf_remove
> stucks forever in while loop since iavf_shutdown already set
> __IAVF_REMOVE adapter state.
> 
> Fix this by checking if the __IAVF_IN_REMOVE_TASK has already been set
> and return if so.
> 
> Fixes: 974578017fc1 ("iavf: Add waiting so the port is initialized in remove")
> Signed-off-by: Slawomir Laba <slawomirx.laba@intel.com>
> Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
> ---
>  v2: Fixed author
> ---
>  drivers/net/ethernet/intel/iavf/iavf_main.c | 19 ++++++++++---------
>  1 file changed, 10 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/net/ethernet/intel/iavf/iavf_main.c
> b/drivers/net/ethernet/intel/iavf/iavf_main.c
> index 6357dea93b99..d43e8f12d3ad 100644
> --- a/drivers/net/ethernet/intel/iavf/iavf_main.c
> +++ b/drivers/net/ethernet/intel/iavf/iavf_main.c

Tested-by: Marek Szlosek <marek.szlosek@intel.com>
_______________________________________________
Intel-wired-lan mailing list
Intel-wired-lan@osuosl.org
https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Intel-wired-lan] [PATCH net v2 2/2] iavf: Fix race condition between iavf_shutdown and iavf_remove
  2022-08-02 11:51 ` [Intel-wired-lan] [PATCH net v2 2/2] iavf: Fix race condition between iavf_shutdown and iavf_remove Mateusz Palczewski
  2022-08-04  7:56   ` Szlosek, Marek
@ 2022-08-04  9:14   ` Michal Schmidt
  2022-08-04  9:37     ` Ivan Vecera
  1 sibling, 1 reply; 9+ messages in thread
From: Michal Schmidt @ 2022-08-04  9:14 UTC (permalink / raw)
  To: Mateusz Palczewski; +Cc: Ivan Vecera, Slawomir Laba, intel-wired-lan


[-- Attachment #1.1: Type: text/plain, Size: 3189 bytes --]

On Tue, Aug 2, 2022 at 1:52 PM Mateusz Palczewski <
mateusz.palczewski@intel.com> wrote:

> From: Slawomir Laba <slawomirx.laba@intel.com>
>
> Fix a deadlock introduced by commit
> 974578017fc1 ("iavf: Add waiting so the port is initialized in remove")
> due to race condition between iavf_shutdown and iavf_remove, where
> iavf_remove stucks forever in while loop since iavf_shutdown already
> set __IAVF_REMOVE adapter state.
>
> Fix this by checking if the __IAVF_IN_REMOVE_TASK has already been
> set and return if so.
>
> Fixes: 974578017fc1 ("iavf: Add waiting so the port is initialized in
> remove")
> Signed-off-by: Slawomir Laba <slawomirx.laba@intel.com>
> Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
> ---
>  v2: Fixed author
> ---
>  drivers/net/ethernet/intel/iavf/iavf_main.c | 19 ++++++++++---------
>  1 file changed, 10 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/net/ethernet/intel/iavf/iavf_main.c
> b/drivers/net/ethernet/intel/iavf/iavf_main.c
> index 6357dea93b99..d43e8f12d3ad 100644
> --- a/drivers/net/ethernet/intel/iavf/iavf_main.c
> +++ b/drivers/net/ethernet/intel/iavf/iavf_main.c
> @@ -4846,23 +4846,24 @@ static int __maybe_unused iavf_resume(struct
> device *dev_d)
>  static void iavf_remove(struct pci_dev *pdev)
>  {
>         struct iavf_adapter *adapter = iavf_pdev_to_adapter(pdev);
> -       struct net_device *netdev = adapter->netdev;
>         struct iavf_fdir_fltr *fdir, *fdirtmp;
>         struct iavf_vlan_filter *vlf, *vlftmp;
> +       struct iavf_cloud_filter *cf, *cftmp;
>         struct iavf_adv_rss *rss, *rsstmp;
>         struct iavf_mac_filter *f, *ftmp;
> -       struct iavf_cloud_filter *cf, *cftmp;
> -       struct iavf_hw *hw = &adapter->hw;
> +       struct net_device *netdev;
> +       struct iavf_hw *hw;
>         int err;
>
> -       /* When reboot/shutdown is in progress no need to do anything
> -        * as the adapter is already REMOVE state that was set during
> -        * iavf_shutdown() callback.
> -        */
> -       if (adapter->state == __IAVF_REMOVE)
> +       if (!adapter)
> +               return;
>

The check for !adapter cannot work. iavf_pdev_to_adapter(pdev) will never
return NULL. It does:
        return netdev_priv(pci_get_drvdata(pdev));
Even if pci_get_drvdata(pdev) somehow returns NULL (which I don't think it
will, because the driver never resets the drvdata before freeing netdev),
netdev_priv() would make a non-NULL value out of it (it adds the aligned
size of struct net_device to the pointer).

CCing Ivan, who will have more comments about the adapter states.
Michal

+
> +       netdev = adapter->netdev;
> +       hw = &adapter->hw;
> +
> +       if (test_and_set_bit(__IAVF_IN_REMOVE_TASK,
> &adapter->crit_section))
>                 return;
>
> -       set_bit(__IAVF_IN_REMOVE_TASK, &adapter->crit_section);
>         /* Wait until port initialization is complete.
>          * There are flows where register/unregister netdev may race.
>          */
> --
> 2.27.0
>
> _______________________________________________
> Intel-wired-lan mailing list
> Intel-wired-lan@osuosl.org
> https://lists.osuosl.org/mailman/listinfo/intel-wired-lan
>
>

[-- Attachment #1.2: Type: text/html, Size: 4514 bytes --]

[-- Attachment #2: Type: text/plain, Size: 162 bytes --]

_______________________________________________
Intel-wired-lan mailing list
Intel-wired-lan@osuosl.org
https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Intel-wired-lan] [PATCH net v2 2/2] iavf: Fix race condition between iavf_shutdown and iavf_remove
  2022-08-04  9:14   ` Michal Schmidt
@ 2022-08-04  9:37     ` Ivan Vecera
  2022-08-09 21:11       ` Laba, SlawomirX
  0 siblings, 1 reply; 9+ messages in thread
From: Ivan Vecera @ 2022-08-04  9:37 UTC (permalink / raw)
  To: Michal Schmidt; +Cc: intel-wired-lan, Slawomir Laba

On Thu, 4 Aug 2022 11:14:58 +0200
Michal Schmidt <mschmidt@redhat.com> wrote:

> On Tue, Aug 2, 2022 at 1:52 PM Mateusz Palczewski <
> mateusz.palczewski@intel.com> wrote:  
> 
> > From: Slawomir Laba <slawomirx.laba@intel.com>
> >
> > Fix a deadlock introduced by commit
> > 974578017fc1 ("iavf: Add waiting so the port is initialized in remove")
> > due to race condition between iavf_shutdown and iavf_remove, where
> > iavf_remove stucks forever in while loop since iavf_shutdown already
> > set __IAVF_REMOVE adapter state.
> >
> > Fix this by checking if the __IAVF_IN_REMOVE_TASK has already been
> > set and return if so.
> >
> > Fixes: 974578017fc1 ("iavf: Add waiting so the port is initialized in
> > remove")
> > Signed-off-by: Slawomir Laba <slawomirx.laba@intel.com>
> > Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
> > ---
> >  v2: Fixed author
> > ---
> >  drivers/net/ethernet/intel/iavf/iavf_main.c | 19 ++++++++++---------
> >  1 file changed, 10 insertions(+), 9 deletions(-)
> >
> > diff --git a/drivers/net/ethernet/intel/iavf/iavf_main.c
> > b/drivers/net/ethernet/intel/iavf/iavf_main.c
> > index 6357dea93b99..d43e8f12d3ad 100644
> > --- a/drivers/net/ethernet/intel/iavf/iavf_main.c
> > +++ b/drivers/net/ethernet/intel/iavf/iavf_main.c
> > @@ -4846,23 +4846,24 @@ static int __maybe_unused iavf_resume(struct
> > device *dev_d)
> >  static void iavf_remove(struct pci_dev *pdev)
> >  {
> >         struct iavf_adapter *adapter = iavf_pdev_to_adapter(pdev);
> > -       struct net_device *netdev = adapter->netdev;
> >         struct iavf_fdir_fltr *fdir, *fdirtmp;
> >         struct iavf_vlan_filter *vlf, *vlftmp;
> > +       struct iavf_cloud_filter *cf, *cftmp;
> >         struct iavf_adv_rss *rss, *rsstmp;
> >         struct iavf_mac_filter *f, *ftmp;
> > -       struct iavf_cloud_filter *cf, *cftmp;
> > -       struct iavf_hw *hw = &adapter->hw;
> > +       struct net_device *netdev;
> > +       struct iavf_hw *hw;
> >         int err;
> >
> > -       /* When reboot/shutdown is in progress no need to do anything
> > -        * as the adapter is already REMOVE state that was set during
> > -        * iavf_shutdown() callback.
> > -        */
> > -       if (adapter->state == __IAVF_REMOVE)
> > +       if (!adapter)
> > +               return;
> >  
> 
> The check for !adapter cannot work. iavf_pdev_to_adapter(pdev) will never
> return NULL. It does:
>         return netdev_priv(pci_get_drvdata(pdev));
> Even if pci_get_drvdata(pdev) somehow returns NULL (which I don't think it
> will, because the driver never resets the drvdata before freeing netdev),
> netdev_priv() would make a non-NULL value out of it (it adds the aligned
> size of struct net_device to the pointer).
> 
> CCing Ivan, who will have more comments about the adapter states.
> Michal

Yes, to make your patch working correctly you need to adjust iavf_pdev_to_adapter()
appropriately and also set pci drvdata to NULL prior free_netdev():

--- a/drivers/net/ethernet/intel/iavf/iavf_main.c
+++ b/drivers/net/ethernet/intel/iavf/iavf_main.c
@@ -164,7 +164,9 @@ int virtchnl_status_to_errno(enum virtchnl_status_code v_status)
  */
 static struct iavf_adapter *iavf_pdev_to_adapter(struct pci_dev *pdev)
 {
-       return netdev_priv(pci_get_drvdata(pdev));
+       struct net_device *netdev = pci_get_drvdata(pdev);
+
+       return netdev ? netdev_priv(netdev) : NULL;
 }
 
 /**
@@ -4899,6 +4901,7 @@ static void iavf_remove(struct pci_dev *pdev)
        }
        spin_unlock_bh(&adapter->adv_rss_lock);
 
+       pci_set_drvdata(pdev, NULL);
        free_netdev(netdev);
 
        pci_disable_pcie_error_reporting(pdev);

Regarding adapter states... __IAVF_REMOVE can be removed as redundant and instead only
use __IAVF_IN_REMOVE_TASK bit.

Ivan

_______________________________________________
Intel-wired-lan mailing list
Intel-wired-lan@osuosl.org
https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Intel-wired-lan] [PATCH net v2 2/2] iavf: Fix race condition between iavf_shutdown and iavf_remove
  2022-08-04  9:37     ` Ivan Vecera
@ 2022-08-09 21:11       ` Laba, SlawomirX
  2022-10-11 13:22         ` Ivan Vecera
  0 siblings, 1 reply; 9+ messages in thread
From: Laba, SlawomirX @ 2022-08-09 21:11 UTC (permalink / raw)
  To: ivecera, mschmidt; +Cc: intel-wired-lan

> -----Original Message-----
> From: Ivan Vecera <ivecera@redhat.com>
> Sent: Thursday, August 4, 2022 11:38 AM
> To: mschmidt <mschmidt@redhat.com>
> Cc: Palczewski, Mateusz <mateusz.palczewski@intel.com>; intel-wired-
> lan@lists.osuosl.org; Laba, SlawomirX <slawomirx.laba@intel.com>
> Subject: Re: [Intel-wired-lan] [PATCH net v2 2/2] iavf: Fix race condition
> between iavf_shutdown and iavf_remove
> 
> On Thu, 4 Aug 2022 11:14:58 +0200
> Michal Schmidt <mschmidt@redhat.com> wrote:
> 
> > On Tue, Aug 2, 2022 at 1:52 PM Mateusz Palczewski <
> > mateusz.palczewski@intel.com> wrote:
> >
> > > From: Slawomir Laba <slawomirx.laba@intel.com>
> > >
> > > Fix a deadlock introduced by commit
> > > 974578017fc1 ("iavf: Add waiting so the port is initialized in
> > > remove") due to race condition between iavf_shutdown and
> > > iavf_remove, where iavf_remove stucks forever in while loop since
> > > iavf_shutdown already set __IAVF_REMOVE adapter state.
> > >
> > > Fix this by checking if the __IAVF_IN_REMOVE_TASK has already been
> > > set and return if so.
> > >
> > > Fixes: 974578017fc1 ("iavf: Add waiting so the port is initialized
> > > in
> > > remove")
> > > Signed-off-by: Slawomir Laba <slawomirx.laba@intel.com>
> > > Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
> > > ---
> > >  v2: Fixed author
> > > ---
> > >  drivers/net/ethernet/intel/iavf/iavf_main.c | 19
> > > ++++++++++---------
> > >  1 file changed, 10 insertions(+), 9 deletions(-)
> > >
> > > diff --git a/drivers/net/ethernet/intel/iavf/iavf_main.c
> > > b/drivers/net/ethernet/intel/iavf/iavf_main.c
> > > index 6357dea93b99..d43e8f12d3ad 100644
> > > --- a/drivers/net/ethernet/intel/iavf/iavf_main.c
> > > +++ b/drivers/net/ethernet/intel/iavf/iavf_main.c
> > > @@ -4846,23 +4846,24 @@ static int __maybe_unused
> iavf_resume(struct
> > > device *dev_d)  static void iavf_remove(struct pci_dev *pdev)  {
> > >         struct iavf_adapter *adapter = iavf_pdev_to_adapter(pdev);
> > > -       struct net_device *netdev = adapter->netdev;
> > >         struct iavf_fdir_fltr *fdir, *fdirtmp;
> > >         struct iavf_vlan_filter *vlf, *vlftmp;
> > > +       struct iavf_cloud_filter *cf, *cftmp;
> > >         struct iavf_adv_rss *rss, *rsstmp;
> > >         struct iavf_mac_filter *f, *ftmp;
> > > -       struct iavf_cloud_filter *cf, *cftmp;
> > > -       struct iavf_hw *hw = &adapter->hw;
> > > +       struct net_device *netdev;
> > > +       struct iavf_hw *hw;
> > >         int err;
> > >
> > > -       /* When reboot/shutdown is in progress no need to do anything
> > > -        * as the adapter is already REMOVE state that was set during
> > > -        * iavf_shutdown() callback.
> > > -        */
> > > -       if (adapter->state == __IAVF_REMOVE)
> > > +       if (!adapter)
> > > +               return;
> > >
> >
> > The check for !adapter cannot work. iavf_pdev_to_adapter(pdev) will
> > never return NULL. It does:
> >         return netdev_priv(pci_get_drvdata(pdev));
> > Even if pci_get_drvdata(pdev) somehow returns NULL (which I don't
> > think it will, because the driver never resets the drvdata before
> > freeing netdev),
> > netdev_priv() would make a non-NULL value out of it (it adds the
> > aligned size of struct net_device to the pointer).
> >
> > CCing Ivan, who will have more comments about the adapter states.
> > Michal
> 
> Yes, to make your patch working correctly you need to adjust
> iavf_pdev_to_adapter() appropriately and also set pci drvdata to NULL prior
> free_netdev():

Thanks Ivan for a nice fix of this problem. The only way that this check would work is when iavf_probe fails with no memory.
We also came to the conclusion that this check is not really necessary and our update on this patch would be to simply
remove the check on the adapter to NULL. What do you think about this?

> --- a/drivers/net/ethernet/intel/iavf/iavf_main.c
> +++ b/drivers/net/ethernet/intel/iavf/iavf_main.c
> @@ -164,7 +164,9 @@ int virtchnl_status_to_errno(enum
> virtchnl_status_code v_status)
>   */
>  static struct iavf_adapter *iavf_pdev_to_adapter(struct pci_dev *pdev)  {
> -       return netdev_priv(pci_get_drvdata(pdev));
> +       struct net_device *netdev = pci_get_drvdata(pdev);
> +
> +       return netdev ? netdev_priv(netdev) : NULL;
>  }
> 
>  /**
> @@ -4899,6 +4901,7 @@ static void iavf_remove(struct pci_dev *pdev)
>         }
>         spin_unlock_bh(&adapter->adv_rss_lock);
> 
> +       pci_set_drvdata(pdev, NULL);
>         free_netdev(netdev);
> 
>         pci_disable_pcie_error_reporting(pdev);
> 
> Regarding adapter states... __IAVF_REMOVE can be removed as redundant
> and instead only use __IAVF_IN_REMOVE_TASK bit.
> 
> Ivan

I divided iavf_remove function into two logical pieces. The first piece helps the driver to survive races of watchdog init states and iavf_remove call.
So when init fails the driver doesn't reinit if remove is triggered. Additionally the __IAVF_IN_REMOVE_TASK was introduced in order to fix race
condition between register_netdevice in init and unregister_netdevice in remove. The second piece is the cleanup of resources after netdev gets 
unregistered. I had no better idea on how to deal with unregister_netdevice call without holding crit_lock. Unregister_netdevice function will call
iavf_close which requires this lock in order to free traffic irqs and cleanup rx and tx resources.

The __IAVF_REMOVE state is used in different purpose, for example when the driver is ready to stop workqueues (after netdev gets unregistered)
and iavf_remove already holds the crit_lock for final cleanups.









_______________________________________________
Intel-wired-lan mailing list
Intel-wired-lan@osuosl.org
https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [Intel-wired-lan] [PATCH net v2 2/2] iavf: Fix race condition between iavf_shutdown and iavf_remove
  2022-08-09 21:11       ` Laba, SlawomirX
@ 2022-10-11 13:22         ` Ivan Vecera
  0 siblings, 0 replies; 9+ messages in thread
From: Ivan Vecera @ 2022-10-11 13:22 UTC (permalink / raw)
  To: Laba, SlawomirX; +Cc: intel-wired-lan

On Tue, 9 Aug 2022 21:11:26 +0000
"Laba, SlawomirX" <slawomirx.laba@intel.com> wrote:

> > -----Original Message-----
> > From: Ivan Vecera <ivecera@redhat.com>
> > Sent: Thursday, August 4, 2022 11:38 AM
> > To: mschmidt <mschmidt@redhat.com>
> > Cc: Palczewski, Mateusz <mateusz.palczewski@intel.com>; intel-wired-
> > lan@lists.osuosl.org; Laba, SlawomirX <slawomirx.laba@intel.com>
> > Subject: Re: [Intel-wired-lan] [PATCH net v2 2/2] iavf: Fix race condition
> > between iavf_shutdown and iavf_remove
> > 
> > On Thu, 4 Aug 2022 11:14:58 +0200
> > Michal Schmidt <mschmidt@redhat.com> wrote:
> >   
> > > On Tue, Aug 2, 2022 at 1:52 PM Mateusz Palczewski <  
> > > mateusz.palczewski@intel.com> wrote:  
> > >  
> > > > From: Slawomir Laba <slawomirx.laba@intel.com>
> > > >
> > > > Fix a deadlock introduced by commit
> > > > 974578017fc1 ("iavf: Add waiting so the port is initialized in
> > > > remove") due to race condition between iavf_shutdown and
> > > > iavf_remove, where iavf_remove stucks forever in while loop since
> > > > iavf_shutdown already set __IAVF_REMOVE adapter state.
> > > >
> > > > Fix this by checking if the __IAVF_IN_REMOVE_TASK has already been
> > > > set and return if so.
> > > >
> > > > Fixes: 974578017fc1 ("iavf: Add waiting so the port is initialized
> > > > in
> > > > remove")
> > > > Signed-off-by: Slawomir Laba <slawomirx.laba@intel.com>
> > > > Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
> > > > ---
> > > >  v2: Fixed author
> > > > ---
> > > >  drivers/net/ethernet/intel/iavf/iavf_main.c | 19
> > > > ++++++++++---------
> > > >  1 file changed, 10 insertions(+), 9 deletions(-)
> > > >
> > > > diff --git a/drivers/net/ethernet/intel/iavf/iavf_main.c
> > > > b/drivers/net/ethernet/intel/iavf/iavf_main.c
> > > > index 6357dea93b99..d43e8f12d3ad 100644
> > > > --- a/drivers/net/ethernet/intel/iavf/iavf_main.c
> > > > +++ b/drivers/net/ethernet/intel/iavf/iavf_main.c
> > > > @@ -4846,23 +4846,24 @@ static int __maybe_unused  
> > iavf_resume(struct  
> > > > device *dev_d)  static void iavf_remove(struct pci_dev *pdev)  {
> > > >         struct iavf_adapter *adapter = iavf_pdev_to_adapter(pdev);
> > > > -       struct net_device *netdev = adapter->netdev;
> > > >         struct iavf_fdir_fltr *fdir, *fdirtmp;
> > > >         struct iavf_vlan_filter *vlf, *vlftmp;
> > > > +       struct iavf_cloud_filter *cf, *cftmp;
> > > >         struct iavf_adv_rss *rss, *rsstmp;
> > > >         struct iavf_mac_filter *f, *ftmp;
> > > > -       struct iavf_cloud_filter *cf, *cftmp;
> > > > -       struct iavf_hw *hw = &adapter->hw;
> > > > +       struct net_device *netdev;
> > > > +       struct iavf_hw *hw;
> > > >         int err;
> > > >
> > > > -       /* When reboot/shutdown is in progress no need to do anything
> > > > -        * as the adapter is already REMOVE state that was set during
> > > > -        * iavf_shutdown() callback.
> > > > -        */
> > > > -       if (adapter->state == __IAVF_REMOVE)
> > > > +       if (!adapter)
> > > > +               return;
> > > >  
> > >
> > > The check for !adapter cannot work. iavf_pdev_to_adapter(pdev) will
> > > never return NULL. It does:
> > >         return netdev_priv(pci_get_drvdata(pdev));
> > > Even if pci_get_drvdata(pdev) somehow returns NULL (which I don't
> > > think it will, because the driver never resets the drvdata before
> > > freeing netdev),
> > > netdev_priv() would make a non-NULL value out of it (it adds the
> > > aligned size of struct net_device to the pointer).
> > >
> > > CCing Ivan, who will have more comments about the adapter states.
> > > Michal  
> > 
> > Yes, to make your patch working correctly you need to adjust
> > iavf_pdev_to_adapter() appropriately and also set pci drvdata to NULL prior
> > free_netdev():  
> 
> Thanks Ivan for a nice fix of this problem. The only way that this check would work is when iavf_probe fails with no memory.
> We also came to the conclusion that this check is not really necessary and our update on this patch would be to simply
> remove the check on the adapter to NULL. What do you think about this?

OK, make sense.

> > --- a/drivers/net/ethernet/intel/iavf/iavf_main.c
> > +++ b/drivers/net/ethernet/intel/iavf/iavf_main.c
> > @@ -164,7 +164,9 @@ int virtchnl_status_to_errno(enum
> > virtchnl_status_code v_status)
> >   */
> >  static struct iavf_adapter *iavf_pdev_to_adapter(struct pci_dev *pdev)  {
> > -       return netdev_priv(pci_get_drvdata(pdev));
> > +       struct net_device *netdev = pci_get_drvdata(pdev);
> > +
> > +       return netdev ? netdev_priv(netdev) : NULL;
> >  }
> > 
> >  /**
> > @@ -4899,6 +4901,7 @@ static void iavf_remove(struct pci_dev *pdev)
> >         }
> >         spin_unlock_bh(&adapter->adv_rss_lock);
> > 
> > +       pci_set_drvdata(pdev, NULL);
> >         free_netdev(netdev);
> > 
> >         pci_disable_pcie_error_reporting(pdev);
> > 
> > Regarding adapter states... __IAVF_REMOVE can be removed as redundant
> > and instead only use __IAVF_IN_REMOVE_TASK bit.
> > 
> > Ivan  
> 
> I divided iavf_remove function into two logical pieces. The first piece helps the driver to survive races of watchdog init states and iavf_remove call.
> So when init fails the driver doesn't reinit if remove is triggered. Additionally the __IAVF_IN_REMOVE_TASK was introduced in order to fix race
> condition between register_netdevice in init and unregister_netdevice in remove. The second piece is the cleanup of resources after netdev gets 
> unregistered. I had no better idea on how to deal with unregister_netdevice call without holding crit_lock. Unregister_netdevice function will call
> iavf_close which requires this lock in order to free traffic irqs and cleanup rx and tx resources.
> 
> The __IAVF_REMOVE state is used in different purpose, for example when the driver is ready to stop workqueues (after netdev gets unregistered)
> and iavf_remove already holds the crit_lock for final cleanups.

OK, got it.

Thanks,
Ivan

_______________________________________________
Intel-wired-lan mailing list
Intel-wired-lan@osuosl.org
https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2022-10-11 13:22 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-08-02 11:51 [Intel-wired-lan] [PATCH net v2 0/2] iavf: Fix close path on shutdown and remove in iavf driver Mateusz Palczewski
2022-08-02 11:51 ` [Intel-wired-lan] [PATCH net v2 1/2] iavf: Fix shutdown pci callback to match the remove one Mateusz Palczewski
2022-08-04  7:55   ` Szlosek, Marek
2022-08-02 11:51 ` [Intel-wired-lan] [PATCH net v2 2/2] iavf: Fix race condition between iavf_shutdown and iavf_remove Mateusz Palczewski
2022-08-04  7:56   ` Szlosek, Marek
2022-08-04  9:14   ` Michal Schmidt
2022-08-04  9:37     ` Ivan Vecera
2022-08-09 21:11       ` Laba, SlawomirX
2022-10-11 13:22         ` Ivan Vecera

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.