All of lore.kernel.org
 help / color / mirror / Atom feed
* [Intel-wired-lan] [PATCH net v2] ice: avoid bonding causing auxiliary plug/unplug under RTNL lock
@ 2023-02-15 19:17 Dave Ertman
  2023-02-16  8:54 ` Petr Oros
                   ` (3 more replies)
  0 siblings, 4 replies; 12+ messages in thread
From: Dave Ertman @ 2023-02-15 19:17 UTC (permalink / raw)
  To: intel-wired-lan; +Cc: Jaroslav Pulchart

RDMA is not supported in ice on a PF that has been added to a bonded
interface. To enforce this, when an interface enters a bond, we unplug
the auxiliary device that supports RDMA functionality.  This unplug
currently happens in the context of handling the netdev bonding event.
This event is sent to the ice driver under RTNL context.  This is causing
a deadlock where the RDMA driver is waiting for the RTNL lock to complete
the removal.

Defer the unplugging/re-plugging of the auxiliary device to the service
task so that it is not performed under the RTNL lock context.

Reported-by: Jaroslav Pulchart <jaroslav.pulchart@gooddata.com>
Link: https://lore.kernel.org/linux-rdma/68b14b11-d0c7-65c9-4eeb-0487c95e395d@leemhuis.info/
Fixes: 5cb1ebdbc434 ("ice: Fix race condition during interface enslave")
Fixes: 425c9bd06b7a ("RDMA/irdma: Report the correct link speed")
Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
---
Changes since v1:
Reversed order of bit processing in ice_service_task for PLUG/UNPLUG

 drivers/net/ethernet/intel/ice/ice.h      | 14 +++++---------
 drivers/net/ethernet/intel/ice/ice_main.c | 19 ++++++++-----------
 2 files changed, 13 insertions(+), 20 deletions(-)

diff --git a/drivers/net/ethernet/intel/ice/ice.h b/drivers/net/ethernet/intel/ice/ice.h
index 2f0b604abc5e..0ad9bab84617 100644
--- a/drivers/net/ethernet/intel/ice/ice.h
+++ b/drivers/net/ethernet/intel/ice/ice.h
@@ -506,6 +506,7 @@ enum ice_pf_flags {
 	ICE_FLAG_VF_VLAN_PRUNING,
 	ICE_FLAG_LINK_LENIENT_MODE_ENA,
 	ICE_FLAG_PLUG_AUX_DEV,
+	ICE_FLAG_UNPLUG_AUX_DEV,
 	ICE_FLAG_MTU_CHANGED,
 	ICE_FLAG_GNSS,			/* GNSS successfully initialized */
 	ICE_PF_FLAGS_NBITS		/* must be last */
@@ -950,16 +951,11 @@ static inline void ice_set_rdma_cap(struct ice_pf *pf)
  */
 static inline void ice_clear_rdma_cap(struct ice_pf *pf)
 {
-	/* We can directly unplug aux device here only if the flag bit
-	 * ICE_FLAG_PLUG_AUX_DEV is not set because ice_unplug_aux_dev()
-	 * could race with ice_plug_aux_dev() called from
-	 * ice_service_task(). In this case we only clear that bit now and
-	 * aux device will be unplugged later once ice_plug_aux_device()
-	 * called from ice_service_task() finishes (see ice_service_task()).
+	/* defer unplug to service task to avoid RTNL lock and
+	 * clear PLUG bit so that pending plugs don't interfere
 	 */
-	if (!test_and_clear_bit(ICE_FLAG_PLUG_AUX_DEV, pf->flags))
-		ice_unplug_aux_dev(pf);
-
+	clear_bit(ICE_FLAG_PLUG_AUX_DEV, pf->flags);
+	set_bit(ICE_FLAG_UNPLUG_AUX_DEV, pf->flags);
 	clear_bit(ICE_FLAG_RDMA_ENA, pf->flags);
 }
 #endif /* _ICE_H_ */
diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
index a9a7f8b52140..b62c2f18eb5f 100644
--- a/drivers/net/ethernet/intel/ice/ice_main.c
+++ b/drivers/net/ethernet/intel/ice/ice_main.c
@@ -2290,18 +2290,15 @@ static void ice_service_task(struct work_struct *work)
 		}
 	}
 
-	if (test_bit(ICE_FLAG_PLUG_AUX_DEV, pf->flags)) {
-		/* Plug aux device per request */
-		ice_plug_aux_dev(pf);
+	/* unplug aux dev per request, if an unplug request came in
+	 * while processing a plug request, this will handle it
+	 */
+	if (test_and_clear_bit(ICE_FLAG_UNPLUG_AUX_DEV, pf->flags))
+		ice_unplug_aux_dev(pf);
 
-		/* Mark plugging as done but check whether unplug was
-		 * requested during ice_plug_aux_dev() call
-		 * (e.g. from ice_clear_rdma_cap()) and if so then
-		 * plug aux device.
-		 */
-		if (!test_and_clear_bit(ICE_FLAG_PLUG_AUX_DEV, pf->flags))
-			ice_unplug_aux_dev(pf);
-	}
+	/* Plug aux device per request */
+	if (test_and_clear_bit(ICE_FLAG_PLUG_AUX_DEV, pf->flags))
+		ice_plug_aux_dev(pf);
 
 	if (test_and_clear_bit(ICE_FLAG_MTU_CHANGED, pf->flags)) {
 		struct iidc_event *event;
-- 
2.37.3

_______________________________________________
Intel-wired-lan mailing list
Intel-wired-lan@osuosl.org
https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [Intel-wired-lan] [PATCH net v2] ice: avoid bonding causing auxiliary plug/unplug under RTNL lock
  2023-02-15 19:17 [Intel-wired-lan] [PATCH net v2] ice: avoid bonding causing auxiliary plug/unplug under RTNL lock Dave Ertman
@ 2023-02-16  8:54 ` Petr Oros
  2023-02-16 10:33   ` Linux regression tracking (Thorsten Leemhuis)
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 12+ messages in thread
From: Petr Oros @ 2023-02-16  8:54 UTC (permalink / raw)
  To: Dave Ertman, intel-wired-lan; +Cc: Jaroslav Pulchart

Hi Dave,

The first version was sent and commented on upstream ml.
Why do you omit cc on upstream (netdev@vger.kernel.org
<netdev@vger.kernel.org>)?
Can you resend this to upstream ml to take valuable feedback for this
critical fix?

Many Thanks,
Petr


Dave Ertman píše v St 15. 02. 2023 v 11:17 -0800:
> RDMA is not supported in ice on a PF that has been added to a bonded
> interface. To enforce this, when an interface enters a bond, we
> unplug
> the auxiliary device that supports RDMA functionality.  This unplug
> currently happens in the context of handling the netdev bonding
> event.
> This event is sent to the ice driver under RTNL context.  This is
> causing
> a deadlock where the RDMA driver is waiting for the RTNL lock to
> complete
> the removal.
> 
> Defer the unplugging/re-plugging of the auxiliary device to the
> service
> task so that it is not performed under the RTNL lock context.
> 
> Reported-by: Jaroslav Pulchart <jaroslav.pulchart@gooddata.com>
> Link:
> https://lore.kernel.org/linux-rdma/68b14b11-d0c7-65c9-4eeb-0487c95e395d@leemhuis.info/
> Fixes: 5cb1ebdbc434 ("ice: Fix race condition during interface
> enslave")
> Fixes: 425c9bd06b7a ("RDMA/irdma: Report the correct link speed")
> Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
> ---
> Changes since v1:
> Reversed order of bit processing in ice_service_task for PLUG/UNPLUG
> 
>  drivers/net/ethernet/intel/ice/ice.h      | 14 +++++---------
>  drivers/net/ethernet/intel/ice/ice_main.c | 19 ++++++++-----------
>  2 files changed, 13 insertions(+), 20 deletions(-)
> 
> diff --git a/drivers/net/ethernet/intel/ice/ice.h
> b/drivers/net/ethernet/intel/ice/ice.h
> index 2f0b604abc5e..0ad9bab84617 100644
> --- a/drivers/net/ethernet/intel/ice/ice.h
> +++ b/drivers/net/ethernet/intel/ice/ice.h
> @@ -506,6 +506,7 @@ enum ice_pf_flags {
>         ICE_FLAG_VF_VLAN_PRUNING,
>         ICE_FLAG_LINK_LENIENT_MODE_ENA,
>         ICE_FLAG_PLUG_AUX_DEV,
> +       ICE_FLAG_UNPLUG_AUX_DEV,
>         ICE_FLAG_MTU_CHANGED,
>         ICE_FLAG_GNSS,                  /* GNSS successfully
> initialized */
>         ICE_PF_FLAGS_NBITS              /* must be last */
> @@ -950,16 +951,11 @@ static inline void ice_set_rdma_cap(struct
> ice_pf *pf)
>   */
>  static inline void ice_clear_rdma_cap(struct ice_pf *pf)
>  {
> -       /* We can directly unplug aux device here only if the flag
> bit
> -        * ICE_FLAG_PLUG_AUX_DEV is not set because
> ice_unplug_aux_dev()
> -        * could race with ice_plug_aux_dev() called from
> -        * ice_service_task(). In this case we only clear that bit
> now and
> -        * aux device will be unplugged later once
> ice_plug_aux_device()
> -        * called from ice_service_task() finishes (see
> ice_service_task()).
> +       /* defer unplug to service task to avoid RTNL lock and
> +        * clear PLUG bit so that pending plugs don't interfere
>          */
> -       if (!test_and_clear_bit(ICE_FLAG_PLUG_AUX_DEV, pf->flags))
> -               ice_unplug_aux_dev(pf);
> -
> +       clear_bit(ICE_FLAG_PLUG_AUX_DEV, pf->flags);
> +       set_bit(ICE_FLAG_UNPLUG_AUX_DEV, pf->flags);
>         clear_bit(ICE_FLAG_RDMA_ENA, pf->flags);
>  }
>  #endif /* _ICE_H_ */
> diff --git a/drivers/net/ethernet/intel/ice/ice_main.c
> b/drivers/net/ethernet/intel/ice/ice_main.c
> index a9a7f8b52140..b62c2f18eb5f 100644
> --- a/drivers/net/ethernet/intel/ice/ice_main.c
> +++ b/drivers/net/ethernet/intel/ice/ice_main.c
> @@ -2290,18 +2290,15 @@ static void ice_service_task(struct
> work_struct *work)
>                 }
>         }
>  
> -       if (test_bit(ICE_FLAG_PLUG_AUX_DEV, pf->flags)) {
> -               /* Plug aux device per request */
> -               ice_plug_aux_dev(pf);
> +       /* unplug aux dev per request, if an unplug request came in
> +        * while processing a plug request, this will handle it
> +        */
> +       if (test_and_clear_bit(ICE_FLAG_UNPLUG_AUX_DEV, pf->flags))
> +               ice_unplug_aux_dev(pf);
>  
> -               /* Mark plugging as done but check whether unplug was
> -                * requested during ice_plug_aux_dev() call
> -                * (e.g. from ice_clear_rdma_cap()) and if so then
> -                * plug aux device.
> -                */
> -               if (!test_and_clear_bit(ICE_FLAG_PLUG_AUX_DEV, pf-
> >flags))
> -                       ice_unplug_aux_dev(pf);
> -       }
> +       /* Plug aux device per request */
> +       if (test_and_clear_bit(ICE_FLAG_PLUG_AUX_DEV, pf->flags))
> +               ice_plug_aux_dev(pf);
>  
>         if (test_and_clear_bit(ICE_FLAG_MTU_CHANGED, pf->flags)) {
>                 struct iidc_event *event;

_______________________________________________
Intel-wired-lan mailing list
Intel-wired-lan@osuosl.org
https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Intel-wired-lan] [PATCH net v2] ice: avoid bonding causing auxiliary plug/unplug under RTNL lock
  2023-02-15 19:17 [Intel-wired-lan] [PATCH net v2] ice: avoid bonding causing auxiliary plug/unplug under RTNL lock Dave Ertman
@ 2023-02-16 10:33   ` Linux regression tracking (Thorsten Leemhuis)
  2023-02-16 10:33   ` Linux regression tracking (Thorsten Leemhuis)
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 12+ messages in thread
From: Linux regression tracking (Thorsten Leemhuis) @ 2023-02-16 10:33 UTC (permalink / raw)
  To: Dave Ertman, intel-wired-lan
  Cc: Jaroslav Pulchart, Linux kernel regressions list

On 15.02.23 20:17, Dave Ertman wrote:
> RDMA is not supported in ice on a PF that has been added to a bonded
> interface. To enforce this, when an interface enters a bond, we unplug
> the auxiliary device that supports RDMA functionality.  This unplug
> currently happens in the context of handling the netdev bonding event.
> This event is sent to the ice driver under RTNL context.  This is causing
> a deadlock where the RDMA driver is waiting for the RTNL lock to complete
> the removal.
> 
> Defer the unplugging/re-plugging of the auxiliary device to the service
> task so that it is not performed under the RTNL lock context.

Thanks for taking care of this.

> Reported-by: Jaroslav Pulchart <jaroslav.pulchart@gooddata.com>
> Link: https://lore.kernel.org/linux-rdma/68b14b11-d0c7-65c9-4eeb-0487c95e395d@leemhuis.info/

FWIW, that should be

Link:
https://lore.kernel.org/netdev/CAK8fFZ6A_Gphw_3-QMGKEFQk=sfCw1Qmq0TVZK3rtAi7vb621A@mail.gmail.com/

instead, as that's Jaroslav's report

> Fixes: 5cb1ebdbc434 ("ice: Fix race condition during interface enslave")
> Fixes: 425c9bd06b7a ("RDMA/irdma: Report the correct link speed")
> Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
> [...]

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.
_______________________________________________
Intel-wired-lan mailing list
Intel-wired-lan@osuosl.org
https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Intel-wired-lan] [PATCH net v2] ice: avoid bonding causing auxiliary plug/unplug under RTNL lock
@ 2023-02-16 10:33   ` Linux regression tracking (Thorsten Leemhuis)
  0 siblings, 0 replies; 12+ messages in thread
From: Linux regression tracking (Thorsten Leemhuis) @ 2023-02-16 10:33 UTC (permalink / raw)
  To: Dave Ertman, intel-wired-lan
  Cc: Jaroslav Pulchart, Linux kernel regressions list

On 15.02.23 20:17, Dave Ertman wrote:
> RDMA is not supported in ice on a PF that has been added to a bonded
> interface. To enforce this, when an interface enters a bond, we unplug
> the auxiliary device that supports RDMA functionality.  This unplug
> currently happens in the context of handling the netdev bonding event.
> This event is sent to the ice driver under RTNL context.  This is causing
> a deadlock where the RDMA driver is waiting for the RTNL lock to complete
> the removal.
> 
> Defer the unplugging/re-plugging of the auxiliary device to the service
> task so that it is not performed under the RTNL lock context.

Thanks for taking care of this.

> Reported-by: Jaroslav Pulchart <jaroslav.pulchart@gooddata.com>
> Link: https://lore.kernel.org/linux-rdma/68b14b11-d0c7-65c9-4eeb-0487c95e395d@leemhuis.info/

FWIW, that should be

Link:
https://lore.kernel.org/netdev/CAK8fFZ6A_Gphw_3-QMGKEFQk=sfCw1Qmq0TVZK3rtAi7vb621A@mail.gmail.com/

instead, as that's Jaroslav's report

> Fixes: 5cb1ebdbc434 ("ice: Fix race condition during interface enslave")
> Fixes: 425c9bd06b7a ("RDMA/irdma: Report the correct link speed")
> Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
> [...]

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Intel-wired-lan] [PATCH net v2] ice: avoid bonding causing auxiliary plug/unplug under RTNL lock
  2023-02-16 10:33   ` Linux regression tracking (Thorsten Leemhuis)
@ 2023-02-16 10:36     ` Linux regression tracking (Thorsten Leemhuis)
  -1 siblings, 0 replies; 12+ messages in thread
From: Linux regression tracking (Thorsten Leemhuis) @ 2023-02-16 10:36 UTC (permalink / raw)
  To: Dave Ertman, intel-wired-lan
  Cc: Jaroslav Pulchart, Linux kernel regressions list

Sorry, forgot something

On 16.02.23 11:33, Linux regression tracking (Thorsten Leemhuis) wrote:
> On 15.02.23 20:17, Dave Ertman wrote:
>> RDMA is not supported in ice on a PF that has been added to a bonded
>> interface. To enforce this, when an interface enters a bond, we unplug
> [...]
>> Reported-by: Jaroslav Pulchart <jaroslav.pulchart@gooddata.com>
>> Link: https://lore.kernel.org/linux-rdma/68b14b11-d0c7-65c9-4eeb-0487c95e395d@leemhuis.info/
> 
> FWIW, that should be
> 
> Link:
> https://lore.kernel.org/netdev/CAK8fFZ6A_Gphw_3-QMGKEFQk=sfCw1Qmq0TVZK3rtAi7vb621A@mail.gmail.com/
> 
> instead, as that's Jaroslav's report
> 
>> Fixes: 5cb1ebdbc434 ("ice: Fix race condition during interface enslave")
>> Fixes: 425c9bd06b7a ("RDMA/irdma: Report the correct link speed")

As this was a regression that was noticed in 6.1.2, this afaics also needs:

Cc: stable@vger.kernel.org # 6.1.x

[no, fixes tags alone do not suffice, see docs]

Ciao, Thorsten
_______________________________________________
Intel-wired-lan mailing list
Intel-wired-lan@osuosl.org
https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Intel-wired-lan] [PATCH net v2] ice: avoid bonding causing auxiliary plug/unplug under RTNL lock
@ 2023-02-16 10:36     ` Linux regression tracking (Thorsten Leemhuis)
  0 siblings, 0 replies; 12+ messages in thread
From: Linux regression tracking (Thorsten Leemhuis) @ 2023-02-16 10:36 UTC (permalink / raw)
  To: Dave Ertman, intel-wired-lan
  Cc: Jaroslav Pulchart, Linux kernel regressions list

Sorry, forgot something

On 16.02.23 11:33, Linux regression tracking (Thorsten Leemhuis) wrote:
> On 15.02.23 20:17, Dave Ertman wrote:
>> RDMA is not supported in ice on a PF that has been added to a bonded
>> interface. To enforce this, when an interface enters a bond, we unplug
> [...]
>> Reported-by: Jaroslav Pulchart <jaroslav.pulchart@gooddata.com>
>> Link: https://lore.kernel.org/linux-rdma/68b14b11-d0c7-65c9-4eeb-0487c95e395d@leemhuis.info/
> 
> FWIW, that should be
> 
> Link:
> https://lore.kernel.org/netdev/CAK8fFZ6A_Gphw_3-QMGKEFQk=sfCw1Qmq0TVZK3rtAi7vb621A@mail.gmail.com/
> 
> instead, as that's Jaroslav's report
> 
>> Fixes: 5cb1ebdbc434 ("ice: Fix race condition during interface enslave")
>> Fixes: 425c9bd06b7a ("RDMA/irdma: Report the correct link speed")

As this was a regression that was noticed in 6.1.2, this afaics also needs:

Cc: stable@vger.kernel.org # 6.1.x

[no, fixes tags alone do not suffice, see docs]

Ciao, Thorsten

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Intel-wired-lan] [PATCH net v2] ice: avoid bonding causing auxiliary plug/unplug under RTNL lock
  2023-02-15 19:17 [Intel-wired-lan] [PATCH net v2] ice: avoid bonding causing auxiliary plug/unplug under RTNL lock Dave Ertman
  2023-02-16  8:54 ` Petr Oros
  2023-02-16 10:33   ` Linux regression tracking (Thorsten Leemhuis)
@ 2023-02-16 17:24 ` Linus Heckemann
  2023-02-24 18:33   ` Ertman, David M
  2023-03-06 11:03 ` Arland, ArpanaX
  3 siblings, 1 reply; 12+ messages in thread
From: Linus Heckemann @ 2023-02-16 17:24 UTC (permalink / raw)
  To: Dave Ertman, intel-wired-lan; +Cc: Jaroslav Pulchart

Dave Ertman <david.m.ertman@intel.com> writes:
> RDMA is not supported in ice on a PF that has been added to a bonded
> interface. To enforce this, when an interface enters a bond, we unplug
> the auxiliary device that supports RDMA functionality.  This unplug
> currently happens in the context of handling the netdev bonding event.
> This event is sent to the ice driver under RTNL context.  This is causing
> a deadlock where the RDMA driver is waiting for the RTNL lock to complete
> the removal.
>
> Defer the unplugging/re-plugging of the auxiliary device to the service
> task so that it is not performed under the RTNL lock context.
>
> Reported-by: Jaroslav Pulchart <jaroslav.pulchart@gooddata.com>
> Link: https://lore.kernel.org/linux-rdma/68b14b11-d0c7-65c9-4eeb-0487c95e395d@leemhuis.info/
> Fixes: 5cb1ebdbc434 ("ice: Fix race condition during interface enslave")
> Fixes: 425c9bd06b7a ("RDMA/irdma: Report the correct link speed")
> Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
> ---
> Changes since v1:
> Reversed order of bit processing in ice_service_task for PLUG/UNPLUG

Hi Dave,

Thanks for your continued work on this! We've tested this on a system
affected by the original issue (with 8086:1593 cards) and, unlike v1 of
the patch, it appears not to resolve it:

[  250.826308] INFO: task kworker/0:0:7 blocked for more than 122 seconds.
[  250.832925]       Not tainted 5.15.86 #1-NixOS
[  250.837370] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  250.845195] task:kworker/0:0     state:D stack:    0 pid:    7 ppid:     2 flags:0x00004000
[  250.845200] Workqueue: infiniband ib_cache_event_task [ib_core]
[  250.845227] Call Trace:
[  250.845228]  <TASK>
[  250.845230]  __schedule+0x2e6/0x13a0
[  250.845238]  ? dev_vprintk_emit+0x171/0x199
[  250.845242]  schedule+0x5b/0xd0
[  250.845244]  schedule_preempt_disabled+0xa/0x10
[  250.845247]  __mutex_lock.constprop.0+0x216/0x400
[  250.845249]  ib_get_eth_speed+0x65/0x190 [ib_core]
[  250.845262]  ? update_load_avg+0x7a/0x5c0
[  250.845265]  ? newidle_balance+0x121/0x400
[  250.845268]  irdma_query_port+0xac/0x110 [irdma]
[  250.845283]  ib_query_port+0x16f/0x1f0 [ib_core]
[  250.845301]  ib_cache_update.part.0+0x5e/0x280 [ib_core]
[  250.845313]  ? __schedule+0x2ee/0x13a0
[  250.845316]  ib_cache_event_task+0x53/0x70 [ib_core]
[  250.845327]  process_one_work+0x1ee/0x390
[  250.845330]  worker_thread+0x53/0x3e0
[  250.845332]  ? process_one_work+0x390/0x390
[  250.845334]  kthread+0x124/0x150
[  250.845336]  ? set_kthread_struct+0x50/0x50
[  250.845339]  ret_from_fork+0x1f/0x30
[  250.845344]  </TASK>
[  250.845418] INFO: task kworker/19:1:750 blocked for more than 122 seconds.
[  250.852294]       Not tainted 5.15.86 #1-NixOS
[  250.856739] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  250.864565] task:kworker/19:1    state:D stack:    0 pid:  750 ppid:     2 flags:0x00004000
[  250.864567] Workqueue: events linkwatch_event
[  250.864571] Call Trace:
[  250.864572]  <TASK>
[  250.864573]  __schedule+0x2e6/0x13a0
[  250.864576]  ? newidle_balance+0x121/0x400
[  250.864578]  schedule+0x5b/0xd0
[  250.864581]  schedule_preempt_disabled+0xa/0x10
[  250.864583]  __mutex_lock.constprop.0+0x216/0x400
[  250.864585]  linkwatch_event+0xa/0x30
[  250.864588]  process_one_work+0x1ee/0x390
[  250.864589]  worker_thread+0x53/0x3e0
[  250.864591]  ? process_one_work+0x390/0x390
[  250.864592]  kthread+0x124/0x150
[  250.864594]  ? set_kthread_struct+0x50/0x50
[  250.864597]  ret_from_fork+0x1f/0x30
[  250.864600]  </TASK>
[  250.864621] INFO: task kworker/119:1:946 blocked for more than 122 seconds.
[  250.871578]       Not tainted 5.15.86 #1-NixOS
[  250.876025] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  250.883848] task:kworker/119:1   state:D stack:    0 pid:  946 ppid:     2 flags:0x00004000
[  250.883851] Workqueue: events_power_efficient reg_check_chans_work [cfg80211]
[  250.883890] Call Trace:
[  250.883891]  <TASK>
[  250.883892]  __schedule+0x2e6/0x13a0
[  250.883895]  ? ttwu_do_wakeup+0x17/0x150
[  250.883898]  ? try_to_wake_up+0x1d1/0x420
[  250.883901]  schedule+0x5b/0xd0
[  250.883903]  schedule_preempt_disabled+0xa/0x10
[  250.883905]  __mutex_lock.constprop.0+0x216/0x400
[  250.883907]  reg_check_chans_work+0x2d/0x390 [cfg80211]
[  250.883925]  ? add_timer_on+0xe9/0x120
[  250.883928]  process_one_work+0x1ee/0x390
[  250.883930]  worker_thread+0x53/0x3e0
[  250.883931]  ? process_one_work+0x390/0x390
[  250.883932]  kthread+0x124/0x150
[  250.883935]  ? set_kthread_struct+0x50/0x50
[  250.883937]  ret_from_fork+0x1f/0x30
[  250.883940]  </TASK>
[  250.883943] INFO: task kworker/0:3:1082 blocked for more than 122 seconds.
[  250.890817]       Not tainted 5.15.86 #1-NixOS
[  250.895262] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  250.903088] task:kworker/0:3     state:D stack:    0 pid: 1082 ppid:     2 flags:0x00004000
[  250.903090] Workqueue: infiniband ib_cache_event_task [ib_core]
[  250.903104] Call Trace:
[  250.903105]  <TASK>
[  250.903106]  __schedule+0x2e6/0x13a0
[  250.903109]  schedule+0x5b/0xd0
[  250.903111]  schedule_preempt_disabled+0xa/0x10
[  250.903114]  __mutex_lock.constprop.0+0x216/0x400
[  250.903115]  ib_get_eth_speed+0x65/0x190 [ib_core]
[  250.903127]  ? raw_spin_rq_lock_nested+0x12/0x70
[  250.903129]  ? newidle_balance+0x2ee/0x400
[  250.903131]  irdma_query_port+0xac/0x110 [irdma]
[  250.903144]  ib_query_port+0x16f/0x1f0 [ib_core]
[  250.903155]  ib_cache_update.part.0+0x5e/0x280 [ib_core]
[  250.903166]  ? __schedule+0x2ee/0x13a0
[  250.903169]  ib_cache_event_task+0x53/0x70 [ib_core]
[  250.903178]  process_one_work+0x1ee/0x390
[  250.903180]  worker_thread+0x53/0x3e0
[  250.903181]  ? process_one_work+0x390/0x390
[  250.903183]  kthread+0x124/0x150
[  250.903185]  ? set_kthread_struct+0x50/0x50
[  250.903187]  ret_from_fork+0x1f/0x30
[  250.903190]  </TASK>
[  250.903195] INFO: task systemd-network:2679 blocked for more than 122 seconds.
[  250.910414]       Not tainted 5.15.86 #1-NixOS
[  250.914858] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  250.922685] task:systemd-network state:D stack:    0 pid: 2679 ppid:     1 flags:0x00004226
[  250.922687] Call Trace:
[  250.922688]  <TASK>
[  250.922688]  ? usleep_range_state+0x90/0x90
[  250.922690]  __schedule+0x2e6/0x13a0
[  250.922693]  ? __wake_up_common+0x7d/0x190
[  250.922695]  ? usleep_range_state+0x90/0x90
[  250.922696]  schedule+0x5b/0xd0
[  250.922699]  schedule_timeout+0x104/0x140
[  250.922700]  ? __prepare_to_swait+0x4f/0x70
[  250.922702]  __wait_for_common+0xac/0x160
[  250.922704]  flush_workqueue+0x13a/0x3f0
[  250.922706]  ib_cache_cleanup_one+0x1c/0xe0 [ib_core]
[  250.922719]  __ib_unregister_device+0x6a/0xb0 [ib_core]
[  250.922730]  ib_unregister_device+0x22/0x30 [ib_core]
[  250.922742]  irdma_remove+0x1a/0x60 [irdma]
[  250.922752]  auxiliary_bus_remove+0x18/0x30
[  250.922754]  __device_release_driver+0x17b/0x250
[  250.922758]  device_release_driver+0x24/0x30
[  250.922760]  bus_remove_device+0xd8/0x150
[  250.922762]  device_del+0x18b/0x400
[  250.922765]  ice_unplug_aux_dev+0x42/0x60 [ice]
[  250.922786]  ice_lag_event_handler+0x4c9/0x580 [ice]
[  250.922800]  raw_notifier_call_chain+0x41/0x60
[  250.922802]  __netdev_upper_dev_link+0x1a0/0x370
[  250.922806]  netdev_master_upper_dev_link+0x3d/0x60
[  250.922809]  bond_enslave+0xbe3/0x1620 [bonding]
[  250.922819]  ? __nla_parse+0x22/0x30
[  250.922822]  ? inet6_set_link_af+0x6a/0x3d0
[  250.922825]  do_setlink+0x28e/0x1050
[  250.922828]  ? select_task_rq_fair+0x131/0x1090
[  250.922830]  ? select_task_rq_fair+0x131/0x1090
[  250.922832]  ? __nla_validate_parse+0x5f/0xc10
[  250.922834]  ? sched_clock_cpu+0x9/0xa0
[  250.922835]  ? __smp_call_single_queue+0x23/0x40
[  250.922837]  ? ttwu_queue_wakelist+0xf4/0x110
[  250.922840]  rtnl_setlink+0xe5/0x180
[  250.922844]  ? security_capable+0x33/0x60
[  250.922846]  rtnetlink_rcv_msg+0x12e/0x380
[  250.922849]  ? rtnl_calcit.isra.0+0x130/0x130
[  250.922851]  netlink_rcv_skb+0x4e/0x100
[  250.922854]  netlink_unicast+0x204/0x2e0
[  250.922855]  netlink_sendmsg+0x24e/0x4b0
[  250.922857]  sock_sendmsg+0x5f/0x70
[  250.922861]  __sys_sendto+0xf0/0x160
[  250.922862]  ? __seccomp_filter+0x389/0x580
[  250.922867]  __x64_sys_sendto+0x20/0x30
[  250.922868]  do_syscall_64+0x38/0x90
[  250.922872]  entry_SYSCALL_64_after_hwframe+0x61/0xcb
[  250.922876] RIP: 0033:0x7f4340b8befa
[  250.922894] RSP: 002b:00007ffd17815b68 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
[  250.922896] RAX: ffffffffffffffda RBX: 00005603231d29a8 RCX: 00007f4340b8befa
[  250.922897] RDX: 0000000000000028 RSI: 00005603231d2930 RDI: 0000000000000003
[  250.922898] RBP: 00005603231aade0 R08: 00007ffd17815b70 R09: 0000000000000080
[  250.922899] R10: 0000000000000000 R11: 0000000000000246 R12: 00005603231d24a0
[  250.922900] R13: 00005603231d2960 R14: 0000000000000000 R15: 0000560323063140
[  250.922902]  </TASK>
_______________________________________________
Intel-wired-lan mailing list
Intel-wired-lan@osuosl.org
https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Intel-wired-lan] [PATCH net v2] ice: avoid bonding causing auxiliary plug/unplug under RTNL lock
  2023-02-16 17:24 ` Linus Heckemann
@ 2023-02-24 18:33   ` Ertman, David M
  2023-03-01 13:12     ` Petr Oros
  2023-03-02 12:38     ` Linus Heckemann
  0 siblings, 2 replies; 12+ messages in thread
From: Ertman, David M @ 2023-02-24 18:33 UTC (permalink / raw)
  To: Linus Heckemann, intel-wired-lan; +Cc: Jaroslav Pulchart

> -----Original Message-----
> From: Linus Heckemann <git@sphalerite.org>
> Sent: Thursday, February 16, 2023 9:24 AM
> To: Ertman, David M <david.m.ertman@intel.com>; intel-wired-
> lan@lists.osuosl.org
> Cc: Jaroslav Pulchart <jaroslav.pulchart@gooddata.com>
> Subject: Re: [Intel-wired-lan] [PATCH net v2] ice: avoid bonding causing
> auxiliary plug/unplug under RTNL lock
> 
> Dave Ertman <david.m.ertman@intel.com> writes:
> > RDMA is not supported in ice on a PF that has been added to a bonded
> > interface. To enforce this, when an interface enters a bond, we unplug
> > the auxiliary device that supports RDMA functionality.  This unplug
> > currently happens in the context of handling the netdev bonding event.
> > This event is sent to the ice driver under RTNL context.  This is causing
> > a deadlock where the RDMA driver is waiting for the RTNL lock to complete
> > the removal.
> >
> > Defer the unplugging/re-plugging of the auxiliary device to the service
> > task so that it is not performed under the RTNL lock context.
> >
> > Reported-by: Jaroslav Pulchart <jaroslav.pulchart@gooddata.com>
> > Link: https://lore.kernel.org/linux-rdma/68b14b11-d0c7-65c9-4eeb-
> 0487c95e395d@leemhuis.info/
> > Fixes: 5cb1ebdbc434 ("ice: Fix race condition during interface enslave")
> > Fixes: 425c9bd06b7a ("RDMA/irdma: Report the correct link speed")
> > Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
> > ---
> > Changes since v1:
> > Reversed order of bit processing in ice_service_task for PLUG/UNPLUG
> 
> Hi Dave,
> 
> Thanks for your continued work on this! We've tested this on a system
> affected by the original issue (with 8086:1593 cards) and, unlike v1 of
> the patch, it appears not to resolve it:

Hi Linus,

This error confuses me.  The only difference between v1 and v2 of this patch
is the order in which we process state bits in the service task thread.  They are
still being processed outside of RTNL context.

Can you provide the steps you used to reproduce this issue? 

Thanks,
DaveE
_______________________________________________
Intel-wired-lan mailing list
Intel-wired-lan@osuosl.org
https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Intel-wired-lan] [PATCH net v2] ice: avoid bonding causing auxiliary plug/unplug under RTNL lock
  2023-02-24 18:33   ` Ertman, David M
@ 2023-03-01 13:12     ` Petr Oros
  2023-03-02 12:38     ` Linus Heckemann
  1 sibling, 0 replies; 12+ messages in thread
From: Petr Oros @ 2023-03-01 13:12 UTC (permalink / raw)
  To: Ertman, David M, Linus Heckemann, intel-wired-lan; +Cc: Jaroslav Pulchart

Ertman, David M píše v Pá 24. 02. 2023 v 18:33 +0000:
> > -----Original Message-----
> > From: Linus Heckemann <git@sphalerite.org>
> > Sent: Thursday, February 16, 2023 9:24 AM
> > To: Ertman, David M <david.m.ertman@intel.com>; intel-wired-
> > lan@lists.osuosl.org
> > Cc: Jaroslav Pulchart <jaroslav.pulchart@gooddata.com>
> > Subject: Re: [Intel-wired-lan] [PATCH net v2] ice: avoid bonding
> > causing
> > auxiliary plug/unplug under RTNL lock
> > 
> > Dave Ertman <david.m.ertman@intel.com> writes:
> > > RDMA is not supported in ice on a PF that has been added to a
> > > bonded
> > > interface. To enforce this, when an interface enters a bond, we
> > > unplug
> > > the auxiliary device that supports RDMA functionality.  This
> > > unplug
> > > currently happens in the context of handling the netdev bonding
> > > event.
> > > This event is sent to the ice driver under RTNL context.  This is
> > > causing
> > > a deadlock where the RDMA driver is waiting for the RTNL lock to
> > > complete
> > > the removal.
> > > 
> > > Defer the unplugging/re-plugging of the auxiliary device to the
> > > service
> > > task so that it is not performed under the RTNL lock context.
> > > 
> > > Reported-by: Jaroslav Pulchart <jaroslav.pulchart@gooddata.com>
> > > Link: https://lore.kernel.org/linux-rdma/68b14b11-d0c7-65c9-4eeb-
> > 0487c95e395d@leemhuis.info/
> > > Fixes: 5cb1ebdbc434 ("ice: Fix race condition during interface
> > > enslave")
> > > Fixes: 425c9bd06b7a ("RDMA/irdma: Report the correct link speed")
> > > Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
> > > ---
> > > Changes since v1:
> > > Reversed order of bit processing in ice_service_task for
> > > PLUG/UNPLUG
> > 
> > Hi Dave,
> > 
> > Thanks for your continued work on this! We've tested this on a
> > system
> > affected by the original issue (with 8086:1593 cards) and, unlike
> > v1 of
> > the patch, it appears not to resolve it:
> 
> Hi Linus,
> 
> This error confuses me.  The only difference between v1 and v2 of
> this patch
> is the order in which we process state bits in the service task
> thread.  They are
> still being processed outside of RTNL context.
> 
> Can you provide the steps you used to reproduce this issue? 
Hi all,

I have tested this fix and i can confirm that the issue is resolved
with v2.

With patch (v1 or v2)
$ modprobe -v bonding mode=1 miimon=100 max_bonds=1
insmod /lib/modules/6.2.0+/kernel/net/tls/tls.ko 
insmod /lib/modules/6.2.0+/kernel/drivers/net/bonding/bonding.ko
max_bonds=0 mode=1 miimon=100 max_bonds=1
$ ip link set up bond0
$ ifenslave bond0 enp65s0f0np0 enp65s0f1np1
$ 

Without patch
$ modprobe -v bonding mode=1 miimon=100 max_bonds=1
insmod /lib/modules/6.2.0+/kernel/net/tls/tls.ko 
insmod /lib/modules/6.2.0+/kernel/drivers/net/bonding/bonding.ko
max_bonds=0 mode=1 miimon=100 max_bonds=1
$ ip link set up bond0
$ ifenslave bond0 enp65s0f0np0 enp65s0f1np1
^^^^^^ HANG

Regards,
Petr

> 
> Thanks,
> DaveE
> _______________________________________________
> Intel-wired-lan mailing list
> Intel-wired-lan@osuosl.org
> https://lists.osuosl.org/mailman/listinfo/intel-wired-lan
> 

_______________________________________________
Intel-wired-lan mailing list
Intel-wired-lan@osuosl.org
https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Intel-wired-lan] [PATCH net v2] ice: avoid bonding causing auxiliary plug/unplug under RTNL lock
  2023-02-24 18:33   ` Ertman, David M
  2023-03-01 13:12     ` Petr Oros
@ 2023-03-02 12:38     ` Linus Heckemann
  2023-03-10 19:03       ` Tony Nguyen
  1 sibling, 1 reply; 12+ messages in thread
From: Linus Heckemann @ 2023-03-02 12:38 UTC (permalink / raw)
  To: Ertman, David M, intel-wired-lan; +Cc: Jaroslav Pulchart

"Ertman, David M" <david.m.ertman@intel.com> writes:

>> -----Original Message-----
>> From: Linus Heckemann <git@sphalerite.org>
>> Sent: Thursday, February 16, 2023 9:24 AM
>> To: Ertman, David M <david.m.ertman@intel.com>; intel-wired-
>> lan@lists.osuosl.org
>> Cc: Jaroslav Pulchart <jaroslav.pulchart@gooddata.com>
>> Subject: Re: [Intel-wired-lan] [PATCH net v2] ice: avoid bonding causing
>> auxiliary plug/unplug under RTNL lock
>> 
>> Dave Ertman <david.m.ertman@intel.com> writes:
>> > RDMA is not supported in ice on a PF that has been added to a bonded
>> > interface. To enforce this, when an interface enters a bond, we unplug
>> > the auxiliary device that supports RDMA functionality.  This unplug
>> > currently happens in the context of handling the netdev bonding event.
>> > This event is sent to the ice driver under RTNL context.  This is causing
>> > a deadlock where the RDMA driver is waiting for the RTNL lock to complete
>> > the removal.
>> >
>> > Defer the unplugging/re-plugging of the auxiliary device to the service
>> > task so that it is not performed under the RTNL lock context.
>> >
>> > Reported-by: Jaroslav Pulchart <jaroslav.pulchart@gooddata.com>
>> > Link: https://lore.kernel.org/linux-rdma/68b14b11-d0c7-65c9-4eeb-
>> 0487c95e395d@leemhuis.info/
>> > Fixes: 5cb1ebdbc434 ("ice: Fix race condition during interface enslave")
>> > Fixes: 425c9bd06b7a ("RDMA/irdma: Report the correct link speed")
>> > Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
>> > ---
>> > Changes since v1:
>> > Reversed order of bit processing in ice_service_task for PLUG/UNPLUG
>> 
>> Hi Dave,
>> 
>> Thanks for your continued work on this! We've tested this on a system
>> affected by the original issue (with 8086:1593 cards) and, unlike v1 of
>> the patch, it appears not to resolve it:
>
> Hi Linus,
>
> This error confuses me.  The only difference between v1 and v2 of this patch
> is the order in which we process state bits in the service task thread.  They are
> still being processed outside of RTNL context.
>
> Can you provide the steps you used to reproduce this issue? 
>
> Thanks,
> DaveE

Hi Dave,

It confuses me as well!

Like before, this was reproduced by booting a system configured to bond
the interfaces provided by two of the cards (using systemd-networkd,
relevant config below). The failure occurred less frequently than prior
to applying the patch, but still enough to be quite an annoyance!

According to the provider, the machine's card was on an older firmware
(3.00 0x8000893f 20.5.13), and upgrading to the latest available version
resolved this issue for our purposes. Nevertheless, I think the kernel
shouldn't be deadlock on the RTNL lock regardless of which firmware
version is running. If there's any more information that would be
helpful for debugging, let us know -- though we can't get at machines
running the old firmware trivially, so it's hard for us to reproduce at
this point.

As mentioned, upgrading the firmware has resolved the problem for us,
though it certainly feels unsatisfying to leave the bug there. I have no
strong opinion on whether the patch should be included as is
anyway. Maybe the firmware version info is enough to help you reproduce
the problem?

Linus


________
/etc/systemd/network/10-bond0.netdev:
[NetDev]
Kind=bond
Name=bond0

[Bond]
DownDelaySec=0.200000
LACPTransmitRate=fast
MIIMonitorSec=0.100000
Mode=802.3ad
TransmitHashPolicy=layer3+4
UpDelaySec=0.200000

________
/etc/systemd/network/40-bond0.network:
[Match]
Name=bond0

[Link]
#MACAddress=<omitted>
RequiredForOnline=carrier

[Network]
LinkLocalAddressing=no

# some Address and Route sections omitted
________
/etc/systemd/network/30-eno12419.network:
[Match]
Name=eno12419

[Network]
Bond=bond0

________
/etc/systemd/network/30-eno12399.network:
[Match]
Name=eno12399

[Network]
Bond=bond0

_______________________________________________
Intel-wired-lan mailing list
Intel-wired-lan@osuosl.org
https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Intel-wired-lan] [PATCH net v2] ice: avoid bonding causing auxiliary plug/unplug under RTNL lock
  2023-02-15 19:17 [Intel-wired-lan] [PATCH net v2] ice: avoid bonding causing auxiliary plug/unplug under RTNL lock Dave Ertman
                   ` (2 preceding siblings ...)
  2023-02-16 17:24 ` Linus Heckemann
@ 2023-03-06 11:03 ` Arland, ArpanaX
  3 siblings, 0 replies; 12+ messages in thread
From: Arland, ArpanaX @ 2023-03-06 11:03 UTC (permalink / raw)
  To: Ertman, David M, intel-wired-lan; +Cc: Jaroslav Pulchart

> -----Original Message-----
> From: Intel-wired-lan <intel-wired-lan-bounces@osuosl.org> On Behalf Of Ertman, David M
> Sent: Thursday, February 16, 2023 12:48 AM
> To: intel-wired-lan@lists.osuosl.org
> Cc: Jaroslav Pulchart <jaroslav.pulchart@gooddata.com>
> Subject: [Intel-wired-lan] [PATCH net v2] ice: avoid bonding causing auxiliary plug/unplug under RTNL lock
>
> RDMA is not supported in ice on a PF that has been added to a bonded interface. To enforce this, when an interface enters a bond, we unplug the auxiliary device that supports RDMA functionality.  This unplug currently happens in the context of handling the netdev bonding event.
> This event is sent to the ice driver under RTNL context.  This is causing a deadlock where the RDMA driver is waiting for the RTNL lock to complete the removal.
>
> Defer the unplugging/re-plugging of the auxiliary device to the service task so that it is not performed under the RTNL lock context.
>
> Reported-by: Jaroslav Pulchart <jaroslav.pulchart@gooddata.com>
> Link: https://lore.kernel.org/linux-rdma/68b14b11-d0c7-65c9-4eeb-0487c95e395d@leemhuis.info/
> Fixes: 5cb1ebdbc434 ("ice: Fix race condition during interface enslave")
> Fixes: 425c9bd06b7a ("RDMA/irdma: Report the correct link speed")
> Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
> ---
> Changes since v1:
> Reversed order of bit processing in ice_service_task for PLUG/UNPLUG
>
>  drivers/net/ethernet/intel/ice/ice.h      | 14 +++++---------
>  drivers/net/ethernet/intel/ice/ice_main.c | 19 ++++++++-----------
>  2 files changed, 13 insertions(+), 20 deletions(-)
>
Tested-by: Arpana Arland < arpanax.arland@intel.com> (A Contingent worker at Intel)

_______________________________________________
Intel-wired-lan mailing list
Intel-wired-lan@osuosl.org
https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [Intel-wired-lan] [PATCH net v2] ice: avoid bonding causing auxiliary plug/unplug under RTNL lock
  2023-03-02 12:38     ` Linus Heckemann
@ 2023-03-10 19:03       ` Tony Nguyen
  0 siblings, 0 replies; 12+ messages in thread
From: Tony Nguyen @ 2023-03-10 19:03 UTC (permalink / raw)
  To: Linus Heckemann, Ertman, David M, intel-wired-lan; +Cc: Jaroslav Pulchart

On 3/2/2023 4:38 AM, Linus Heckemann wrote:
> "Ertman, David M" <david.m.ertman@intel.com> writes:
> 
>>> -----Original Message-----
>>> From: Linus Heckemann <git@sphalerite.org>
>>> Sent: Thursday, February 16, 2023 9:24 AM
>>> To: Ertman, David M <david.m.ertman@intel.com>; intel-wired-
>>> lan@lists.osuosl.org
>>> Cc: Jaroslav Pulchart <jaroslav.pulchart@gooddata.com>
>>> Subject: Re: [Intel-wired-lan] [PATCH net v2] ice: avoid bonding causing
>>> auxiliary plug/unplug under RTNL lock
>>>
>>> Dave Ertman <david.m.ertman@intel.com> writes:
>>>> RDMA is not supported in ice on a PF that has been added to a bonded
>>>> interface. To enforce this, when an interface enters a bond, we unplug
>>>> the auxiliary device that supports RDMA functionality.  This unplug
>>>> currently happens in the context of handling the netdev bonding event.
>>>> This event is sent to the ice driver under RTNL context.  This is causing
>>>> a deadlock where the RDMA driver is waiting for the RTNL lock to complete
>>>> the removal.
>>>>
>>>> Defer the unplugging/re-plugging of the auxiliary device to the service
>>>> task so that it is not performed under the RTNL lock context.
>>>>
>>>> Reported-by: Jaroslav Pulchart <jaroslav.pulchart@gooddata.com>
>>>> Link: https://lore.kernel.org/linux-rdma/68b14b11-d0c7-65c9-4eeb-
>>> 0487c95e395d@leemhuis.info/
>>>> Fixes: 5cb1ebdbc434 ("ice: Fix race condition during interface enslave")
>>>> Fixes: 425c9bd06b7a ("RDMA/irdma: Report the correct link speed")
>>>> Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
>>>> ---
>>>> Changes since v1:
>>>> Reversed order of bit processing in ice_service_task for PLUG/UNPLUG
>>>
>>> Hi Dave,
>>>
>>> Thanks for your continued work on this! We've tested this on a system
>>> affected by the original issue (with 8086:1593 cards) and, unlike v1 of
>>> the patch, it appears not to resolve it:
>>
>> Hi Linus,
>>
>> This error confuses me.  The only difference between v1 and v2 of this patch
>> is the order in which we process state bits in the service task thread.  They are
>> still being processed outside of RTNL context.
>>
>> Can you provide the steps you used to reproduce this issue?
>>
>> Thanks,
>> DaveE
> 
> Hi Dave,
> 
> It confuses me as well!
> 
> Like before, this was reproduced by booting a system configured to bond
> the interfaces provided by two of the cards (using systemd-networkd,
> relevant config below). The failure occurred less frequently than prior
> to applying the patch, but still enough to be quite an annoyance!
> 
> According to the provider, the machine's card was on an older firmware
> (3.00 0x8000893f 20.5.13), and upgrading to the latest available version
> resolved this issue for our purposes. Nevertheless, I think the kernel
> shouldn't be deadlock on the RTNL lock regardless of which firmware
> version is running. If there's any more information that would be
> helpful for debugging, let us know -- though we can't get at machines
> running the old firmware trivially, so it's hard for us to reproduce at
> this point.
> 
> As mentioned, upgrading the firmware has resolved the problem for us,
> though it certainly feels unsatisfying to leave the bug there. I have no
> strong opinion on whether the patch should be included as is
> anyway. Maybe the firmware version info is enough to help you reproduce
> the problem?

Thanks for the additional information Linus. Unfortunately, our 
validation was unable to reproduce the issue with the specified firmware 
and setup. We were able to see failure without patch, but not with the 
patch applied. If you are able to run across the issue again or have any 
other setup info that may help with a reproduction, please let us know.

Thanks,
Tony

> Linus
> 
> 
> ________
> /etc/systemd/network/10-bond0.netdev:
> [NetDev]
> Kind=bond
> Name=bond0
> 
> [Bond]
> DownDelaySec=0.200000
> LACPTransmitRate=fast
> MIIMonitorSec=0.100000
> Mode=802.3ad
> TransmitHashPolicy=layer3+4
> UpDelaySec=0.200000
> 
> ________
> /etc/systemd/network/40-bond0.network:
> [Match]
> Name=bond0
> 
> [Link]
> #MACAddress=<omitted>
> RequiredForOnline=carrier
> 
> [Network]
> LinkLocalAddressing=no
> 
> # some Address and Route sections omitted
> ________
> /etc/systemd/network/30-eno12419.network:
> [Match]
> Name=eno12419
> 
> [Network]
> Bond=bond0
> 
> ________
> /etc/systemd/network/30-eno12399.network:
> [Match]
> Name=eno12399
> 
> [Network]
> Bond=bond0
> 
> _______________________________________________
> Intel-wired-lan mailing list
> Intel-wired-lan@osuosl.org
> https://lists.osuosl.org/mailman/listinfo/intel-wired-lan
_______________________________________________
Intel-wired-lan mailing list
Intel-wired-lan@osuosl.org
https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2023-03-10 19:04 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-02-15 19:17 [Intel-wired-lan] [PATCH net v2] ice: avoid bonding causing auxiliary plug/unplug under RTNL lock Dave Ertman
2023-02-16  8:54 ` Petr Oros
2023-02-16 10:33 ` Linux regression tracking (Thorsten Leemhuis)
2023-02-16 10:33   ` Linux regression tracking (Thorsten Leemhuis)
2023-02-16 10:36   ` Linux regression tracking (Thorsten Leemhuis)
2023-02-16 10:36     ` Linux regression tracking (Thorsten Leemhuis)
2023-02-16 17:24 ` Linus Heckemann
2023-02-24 18:33   ` Ertman, David M
2023-03-01 13:12     ` Petr Oros
2023-03-02 12:38     ` Linus Heckemann
2023-03-10 19:03       ` Tony Nguyen
2023-03-06 11:03 ` Arland, ArpanaX

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.