All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH net] ice: Fix race condition during interface enslave
@ 2022-03-10 17:16 ` Ivan Vecera
  0 siblings, 0 replies; 13+ messages in thread
From: Ivan Vecera @ 2022-03-10 17:16 UTC (permalink / raw)
  To: netdev
  Cc: Petr Oros, Jesse Brandeburg, Tony Nguyen, David S. Miller,
	Jakub Kicinski, moderated list:INTEL ETHERNET DRIVERS, open list

Commit 5dbbbd01cbba83 ("ice: Avoid RTNL lock when re-creating
auxiliary device") changes a process of re-creation of aux device
so ice_plug_aux_dev() is called from ice_service_task() context.
This unfortunately opens a race window that can result in dead-lock
when interface has left LAG and immediately enters LAG again.

Reproducer:
```
#!/bin/sh

ip link add lag0 type bond mode 1 miimon 100
ip link set lag0

for n in {1..10}; do
        echo Cycle: $n
        ip link set ens7f0 master lag0
        sleep 1
        ip link set ens7f0 nomaster
done
```

This results in:
[20976.208697] Workqueue: ice ice_service_task [ice]
[20976.213422] Call Trace:
[20976.215871]  __schedule+0x2d1/0x830
[20976.219364]  schedule+0x35/0xa0
[20976.222510]  schedule_preempt_disabled+0xa/0x10
[20976.227043]  __mutex_lock.isra.7+0x310/0x420
[20976.235071]  enum_all_gids_of_dev_cb+0x1c/0x100 [ib_core]
[20976.251215]  ib_enum_roce_netdev+0xa4/0xe0 [ib_core]
[20976.256192]  ib_cache_setup_one+0x33/0xa0 [ib_core]
[20976.261079]  ib_register_device+0x40d/0x580 [ib_core]
[20976.266139]  irdma_ib_register_device+0x129/0x250 [irdma]
[20976.281409]  irdma_probe+0x2c1/0x360 [irdma]
[20976.285691]  auxiliary_bus_probe+0x45/0x70
[20976.289790]  really_probe+0x1f2/0x480
[20976.298509]  driver_probe_device+0x49/0xc0
[20976.302609]  bus_for_each_drv+0x79/0xc0
[20976.306448]  __device_attach+0xdc/0x160
[20976.310286]  bus_probe_device+0x9d/0xb0
[20976.314128]  device_add+0x43c/0x890
[20976.321287]  __auxiliary_device_add+0x43/0x60
[20976.325644]  ice_plug_aux_dev+0xb2/0x100 [ice]
[20976.330109]  ice_service_task+0xd0c/0xed0 [ice]
[20976.342591]  process_one_work+0x1a7/0x360
[20976.350536]  worker_thread+0x30/0x390
[20976.358128]  kthread+0x10a/0x120
[20976.365547]  ret_from_fork+0x1f/0x40
...
[20976.438030] task:ip              state:D stack:    0 pid:213658 ppid:213627 flags:0x00004084
[20976.446469] Call Trace:
[20976.448921]  __schedule+0x2d1/0x830
[20976.452414]  schedule+0x35/0xa0
[20976.455559]  schedule_preempt_disabled+0xa/0x10
[20976.460090]  __mutex_lock.isra.7+0x310/0x420
[20976.464364]  device_del+0x36/0x3c0
[20976.467772]  ice_unplug_aux_dev+0x1a/0x40 [ice]
[20976.472313]  ice_lag_event_handler+0x2a2/0x520 [ice]
[20976.477288]  notifier_call_chain+0x47/0x70
[20976.481386]  __netdev_upper_dev_link+0x18b/0x280
[20976.489845]  bond_enslave+0xe05/0x1790 [bonding]
[20976.494475]  do_setlink+0x336/0xf50
[20976.502517]  __rtnl_newlink+0x529/0x8b0
[20976.543441]  rtnl_newlink+0x43/0x60
[20976.546934]  rtnetlink_rcv_msg+0x2b1/0x360
[20976.559238]  netlink_rcv_skb+0x4c/0x120
[20976.563079]  netlink_unicast+0x196/0x230
[20976.567005]  netlink_sendmsg+0x204/0x3d0
[20976.570930]  sock_sendmsg+0x4c/0x50
[20976.574423]  ____sys_sendmsg+0x1eb/0x250
[20976.586807]  ___sys_sendmsg+0x7c/0xc0
[20976.606353]  __sys_sendmsg+0x57/0xa0
[20976.609930]  do_syscall_64+0x5b/0x1a0
[20976.613598]  entry_SYSCALL_64_after_hwframe+0x65/0xca

1. Command 'ip link ... set nomaster' causes that ice_plug_aux_dev()
   is called from ice_service_task() context, aux device is created
   and associated device->lock is taken.
2. Command 'ip link ... set master...' calls ice's notifier under
   RTNL lock and that notifier calls ice_unplug_aux_dev(). That
   function tries to take aux device->lock but this is already taken
   by ice_plug_aux_dev() in step 1
3. Later ice_plug_aux_dev() tries to take RTNL lock but this is already
   taken in step 2
4. Dead-lock

The patch fixes this issue by following changes:
- Bit ICE_FLAG_PLUG_AUX_DEV is kept to be set during ice_plug_aux_dev()
  call in ice_service_task()
- The bit is checked in ice_clear_rdma_cap() and only if it is not set
  then ice_unplug_aux_dev() is called. If it is set (in other words
  plugging of aux device was requested and ice_plug_aux_dev() is
  potentially running) then the function only clears the bit
- Once ice_plug_aux_dev() call (in ice_service_task) is finished
  the bit ICE_FLAG_PLUG_AUX_DEV is cleared but it is also checked
  whether it was already cleared by ice_clear_rdma_cap(). If so then
  aux device is unplugged.

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Co-developed-by: Petr Oros <poros@redhat.com>
Signed-off-by: Petr Oros <poros@redhat.com>
---
 drivers/net/ethernet/intel/ice/ice.h      | 11 ++++++++++-
 drivers/net/ethernet/intel/ice/ice_main.c | 12 +++++++++++-
 2 files changed, 21 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/ice/ice.h b/drivers/net/ethernet/intel/ice/ice.h
index 3121f9b04f59..bea1d1e39fa2 100644
--- a/drivers/net/ethernet/intel/ice/ice.h
+++ b/drivers/net/ethernet/intel/ice/ice.h
@@ -898,7 +898,16 @@ static inline void ice_set_rdma_cap(struct ice_pf *pf)
  */
 static inline void ice_clear_rdma_cap(struct ice_pf *pf)
 {
-	ice_unplug_aux_dev(pf);
+	/* We can directly unplug aux device here only if the flag bit
+	 * ICE_FLAG_PLUG_AUX_DEV is not set because ice_unplug_aux_dev()
+	 * could race with ice_plug_aux_dev() called from
+	 * ice_service_task(). In this case we only clear that bit now and
+	 * aux device will be unplugged later once ice_plug_aux_device()
+	 * called from ice_service_task() finishes (see ice_service_task()).
+	 */
+	if (!test_and_clear_bit(ICE_FLAG_PLUG_AUX_DEV, pf->flags))
+		ice_unplug_aux_dev(pf);
+
 	clear_bit(ICE_FLAG_RDMA_ENA, pf->flags);
 	clear_bit(ICE_FLAG_AUX_ENA, pf->flags);
 }
diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
index 83e3e8aae6cf..493942e910be 100644
--- a/drivers/net/ethernet/intel/ice/ice_main.c
+++ b/drivers/net/ethernet/intel/ice/ice_main.c
@@ -2255,9 +2255,19 @@ static void ice_service_task(struct work_struct *work)
 		return;
 	}
 
-	if (test_and_clear_bit(ICE_FLAG_PLUG_AUX_DEV, pf->flags))
+	if (test_bit(ICE_FLAG_PLUG_AUX_DEV, pf->flags)) {
+		/* Plug aux device per request */
 		ice_plug_aux_dev(pf);
 
+		/* Mark plugging as done but check whether unplug was
+		 * requested during ice_plug_aux_dev() call
+		 * (e.g. from ice_clear_rdma_cap()) and if so then
+		 * plug aux device.
+		 */
+		if (!test_and_clear_bit(ICE_FLAG_PLUG_AUX_DEV, pf->flags))
+			ice_unplug_aux_dev(pf);
+	}
+
 	if (test_and_clear_bit(ICE_FLAG_MTU_CHANGED, pf->flags)) {
 		struct iidc_event *event;
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [Intel-wired-lan] [PATCH net] ice: Fix race condition during interface enslave
@ 2022-03-10 17:16 ` Ivan Vecera
  0 siblings, 0 replies; 13+ messages in thread
From: Ivan Vecera @ 2022-03-10 17:16 UTC (permalink / raw)
  To: intel-wired-lan

Commit 5dbbbd01cbba83 ("ice: Avoid RTNL lock when re-creating
auxiliary device") changes a process of re-creation of aux device
so ice_plug_aux_dev() is called from ice_service_task() context.
This unfortunately opens a race window that can result in dead-lock
when interface has left LAG and immediately enters LAG again.

Reproducer:
```
#!/bin/sh

ip link add lag0 type bond mode 1 miimon 100
ip link set lag0

for n in {1..10}; do
        echo Cycle: $n
        ip link set ens7f0 master lag0
        sleep 1
        ip link set ens7f0 nomaster
done
```

This results in:
[20976.208697] Workqueue: ice ice_service_task [ice]
[20976.213422] Call Trace:
[20976.215871]  __schedule+0x2d1/0x830
[20976.219364]  schedule+0x35/0xa0
[20976.222510]  schedule_preempt_disabled+0xa/0x10
[20976.227043]  __mutex_lock.isra.7+0x310/0x420
[20976.235071]  enum_all_gids_of_dev_cb+0x1c/0x100 [ib_core]
[20976.251215]  ib_enum_roce_netdev+0xa4/0xe0 [ib_core]
[20976.256192]  ib_cache_setup_one+0x33/0xa0 [ib_core]
[20976.261079]  ib_register_device+0x40d/0x580 [ib_core]
[20976.266139]  irdma_ib_register_device+0x129/0x250 [irdma]
[20976.281409]  irdma_probe+0x2c1/0x360 [irdma]
[20976.285691]  auxiliary_bus_probe+0x45/0x70
[20976.289790]  really_probe+0x1f2/0x480
[20976.298509]  driver_probe_device+0x49/0xc0
[20976.302609]  bus_for_each_drv+0x79/0xc0
[20976.306448]  __device_attach+0xdc/0x160
[20976.310286]  bus_probe_device+0x9d/0xb0
[20976.314128]  device_add+0x43c/0x890
[20976.321287]  __auxiliary_device_add+0x43/0x60
[20976.325644]  ice_plug_aux_dev+0xb2/0x100 [ice]
[20976.330109]  ice_service_task+0xd0c/0xed0 [ice]
[20976.342591]  process_one_work+0x1a7/0x360
[20976.350536]  worker_thread+0x30/0x390
[20976.358128]  kthread+0x10a/0x120
[20976.365547]  ret_from_fork+0x1f/0x40
...
[20976.438030] task:ip              state:D stack:    0 pid:213658 ppid:213627 flags:0x00004084
[20976.446469] Call Trace:
[20976.448921]  __schedule+0x2d1/0x830
[20976.452414]  schedule+0x35/0xa0
[20976.455559]  schedule_preempt_disabled+0xa/0x10
[20976.460090]  __mutex_lock.isra.7+0x310/0x420
[20976.464364]  device_del+0x36/0x3c0
[20976.467772]  ice_unplug_aux_dev+0x1a/0x40 [ice]
[20976.472313]  ice_lag_event_handler+0x2a2/0x520 [ice]
[20976.477288]  notifier_call_chain+0x47/0x70
[20976.481386]  __netdev_upper_dev_link+0x18b/0x280
[20976.489845]  bond_enslave+0xe05/0x1790 [bonding]
[20976.494475]  do_setlink+0x336/0xf50
[20976.502517]  __rtnl_newlink+0x529/0x8b0
[20976.543441]  rtnl_newlink+0x43/0x60
[20976.546934]  rtnetlink_rcv_msg+0x2b1/0x360
[20976.559238]  netlink_rcv_skb+0x4c/0x120
[20976.563079]  netlink_unicast+0x196/0x230
[20976.567005]  netlink_sendmsg+0x204/0x3d0
[20976.570930]  sock_sendmsg+0x4c/0x50
[20976.574423]  ____sys_sendmsg+0x1eb/0x250
[20976.586807]  ___sys_sendmsg+0x7c/0xc0
[20976.606353]  __sys_sendmsg+0x57/0xa0
[20976.609930]  do_syscall_64+0x5b/0x1a0
[20976.613598]  entry_SYSCALL_64_after_hwframe+0x65/0xca

1. Command 'ip link ... set nomaster' causes that ice_plug_aux_dev()
   is called from ice_service_task() context, aux device is created
   and associated device->lock is taken.
2. Command 'ip link ... set master...' calls ice's notifier under
   RTNL lock and that notifier calls ice_unplug_aux_dev(). That
   function tries to take aux device->lock but this is already taken
   by ice_plug_aux_dev() in step 1
3. Later ice_plug_aux_dev() tries to take RTNL lock but this is already
   taken in step 2
4. Dead-lock

The patch fixes this issue by following changes:
- Bit ICE_FLAG_PLUG_AUX_DEV is kept to be set during ice_plug_aux_dev()
  call in ice_service_task()
- The bit is checked in ice_clear_rdma_cap() and only if it is not set
  then ice_unplug_aux_dev() is called. If it is set (in other words
  plugging of aux device was requested and ice_plug_aux_dev() is
  potentially running) then the function only clears the bit
- Once ice_plug_aux_dev() call (in ice_service_task) is finished
  the bit ICE_FLAG_PLUG_AUX_DEV is cleared but it is also checked
  whether it was already cleared by ice_clear_rdma_cap(). If so then
  aux device is unplugged.

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Co-developed-by: Petr Oros <poros@redhat.com>
Signed-off-by: Petr Oros <poros@redhat.com>
---
 drivers/net/ethernet/intel/ice/ice.h      | 11 ++++++++++-
 drivers/net/ethernet/intel/ice/ice_main.c | 12 +++++++++++-
 2 files changed, 21 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/ice/ice.h b/drivers/net/ethernet/intel/ice/ice.h
index 3121f9b04f59..bea1d1e39fa2 100644
--- a/drivers/net/ethernet/intel/ice/ice.h
+++ b/drivers/net/ethernet/intel/ice/ice.h
@@ -898,7 +898,16 @@ static inline void ice_set_rdma_cap(struct ice_pf *pf)
  */
 static inline void ice_clear_rdma_cap(struct ice_pf *pf)
 {
-	ice_unplug_aux_dev(pf);
+	/* We can directly unplug aux device here only if the flag bit
+	 * ICE_FLAG_PLUG_AUX_DEV is not set because ice_unplug_aux_dev()
+	 * could race with ice_plug_aux_dev() called from
+	 * ice_service_task(). In this case we only clear that bit now and
+	 * aux device will be unplugged later once ice_plug_aux_device()
+	 * called from ice_service_task() finishes (see ice_service_task()).
+	 */
+	if (!test_and_clear_bit(ICE_FLAG_PLUG_AUX_DEV, pf->flags))
+		ice_unplug_aux_dev(pf);
+
 	clear_bit(ICE_FLAG_RDMA_ENA, pf->flags);
 	clear_bit(ICE_FLAG_AUX_ENA, pf->flags);
 }
diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
index 83e3e8aae6cf..493942e910be 100644
--- a/drivers/net/ethernet/intel/ice/ice_main.c
+++ b/drivers/net/ethernet/intel/ice/ice_main.c
@@ -2255,9 +2255,19 @@ static void ice_service_task(struct work_struct *work)
 		return;
 	}
 
-	if (test_and_clear_bit(ICE_FLAG_PLUG_AUX_DEV, pf->flags))
+	if (test_bit(ICE_FLAG_PLUG_AUX_DEV, pf->flags)) {
+		/* Plug aux device per request */
 		ice_plug_aux_dev(pf);
 
+		/* Mark plugging as done but check whether unplug was
+		 * requested during ice_plug_aux_dev() call
+		 * (e.g. from ice_clear_rdma_cap()) and if so then
+		 * plug aux device.
+		 */
+		if (!test_and_clear_bit(ICE_FLAG_PLUG_AUX_DEV, pf->flags))
+			ice_unplug_aux_dev(pf);
+	}
+
 	if (test_and_clear_bit(ICE_FLAG_MTU_CHANGED, pf->flags)) {
 		struct iidc_event *event;
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* RE: [PATCH net] ice: Fix race condition during interface enslave
  2022-03-10 17:16 ` [Intel-wired-lan] " Ivan Vecera
@ 2022-03-10 17:48   ` Ertman, David M
  -1 siblings, 0 replies; 13+ messages in thread
From: Ertman, David M @ 2022-03-10 17:48 UTC (permalink / raw)
  To: Ivan Vecera, netdev
  Cc: Petr Oros, Brandeburg, Jesse, Nguyen, Anthony L, David S. Miller,
	Jakub Kicinski, moderated list:INTEL ETHERNET DRIVERS, open list

> -----Original Message-----
> From: Ivan Vecera <ivecera@redhat.com>
> Sent: Thursday, March 10, 2022 9:17 AM
> To: netdev@vger.kernel.org
> Cc: Petr Oros <poros@redhat.com>; Brandeburg, Jesse
> <jesse.brandeburg@intel.com>; Nguyen, Anthony L
> <anthony.l.nguyen@intel.com>; David S. Miller <davem@davemloft.net>;
> Jakub Kicinski <kuba@kernel.org>; moderated list:INTEL ETHERNET DRIVERS
> <intel-wired-lan@lists.osuosl.org>; open list <linux-kernel@vger.kernel.org>
> Subject: [PATCH net] ice: Fix race condition during interface enslave
> 
> Commit 5dbbbd01cbba83 ("ice: Avoid RTNL lock when re-creating
> auxiliary device") changes a process of re-creation of aux device
> so ice_plug_aux_dev() is called from ice_service_task() context.
> This unfortunately opens a race window that can result in dead-lock
> when interface has left LAG and immediately enters LAG again.
> 
> Reproducer:
> ```
> #!/bin/sh
> 
> ip link add lag0 type bond mode 1 miimon 100
> ip link set lag0
> 
> for n in {1..10}; do
>         echo Cycle: $n
>         ip link set ens7f0 master lag0
>         sleep 1
>         ip link set ens7f0 nomaster
> done
> ```
> 
> This results in:
> [20976.208697] Workqueue: ice ice_service_task [ice]
> [20976.213422] Call Trace:
> [20976.215871]  __schedule+0x2d1/0x830
> [20976.219364]  schedule+0x35/0xa0
> [20976.222510]  schedule_preempt_disabled+0xa/0x10
> [20976.227043]  __mutex_lock.isra.7+0x310/0x420
> [20976.235071]  enum_all_gids_of_dev_cb+0x1c/0x100 [ib_core]
> [20976.251215]  ib_enum_roce_netdev+0xa4/0xe0 [ib_core]
> [20976.256192]  ib_cache_setup_one+0x33/0xa0 [ib_core]
> [20976.261079]  ib_register_device+0x40d/0x580 [ib_core]
> [20976.266139]  irdma_ib_register_device+0x129/0x250 [irdma]
> [20976.281409]  irdma_probe+0x2c1/0x360 [irdma]
> [20976.285691]  auxiliary_bus_probe+0x45/0x70
> [20976.289790]  really_probe+0x1f2/0x480
> [20976.298509]  driver_probe_device+0x49/0xc0
> [20976.302609]  bus_for_each_drv+0x79/0xc0
> [20976.306448]  __device_attach+0xdc/0x160
> [20976.310286]  bus_probe_device+0x9d/0xb0
> [20976.314128]  device_add+0x43c/0x890
> [20976.321287]  __auxiliary_device_add+0x43/0x60
> [20976.325644]  ice_plug_aux_dev+0xb2/0x100 [ice]
> [20976.330109]  ice_service_task+0xd0c/0xed0 [ice]
> [20976.342591]  process_one_work+0x1a7/0x360
> [20976.350536]  worker_thread+0x30/0x390
> [20976.358128]  kthread+0x10a/0x120
> [20976.365547]  ret_from_fork+0x1f/0x40
> ...
> [20976.438030] task:ip              state:D stack:    0 pid:213658 ppid:213627
> flags:0x00004084
> [20976.446469] Call Trace:
> [20976.448921]  __schedule+0x2d1/0x830
> [20976.452414]  schedule+0x35/0xa0
> [20976.455559]  schedule_preempt_disabled+0xa/0x10
> [20976.460090]  __mutex_lock.isra.7+0x310/0x420
> [20976.464364]  device_del+0x36/0x3c0
> [20976.467772]  ice_unplug_aux_dev+0x1a/0x40 [ice]
> [20976.472313]  ice_lag_event_handler+0x2a2/0x520 [ice]
> [20976.477288]  notifier_call_chain+0x47/0x70
> [20976.481386]  __netdev_upper_dev_link+0x18b/0x280
> [20976.489845]  bond_enslave+0xe05/0x1790 [bonding]
> [20976.494475]  do_setlink+0x336/0xf50
> [20976.502517]  __rtnl_newlink+0x529/0x8b0
> [20976.543441]  rtnl_newlink+0x43/0x60
> [20976.546934]  rtnetlink_rcv_msg+0x2b1/0x360
> [20976.559238]  netlink_rcv_skb+0x4c/0x120
> [20976.563079]  netlink_unicast+0x196/0x230
> [20976.567005]  netlink_sendmsg+0x204/0x3d0
> [20976.570930]  sock_sendmsg+0x4c/0x50
> [20976.574423]  ____sys_sendmsg+0x1eb/0x250
> [20976.586807]  ___sys_sendmsg+0x7c/0xc0
> [20976.606353]  __sys_sendmsg+0x57/0xa0
> [20976.609930]  do_syscall_64+0x5b/0x1a0
> [20976.613598]  entry_SYSCALL_64_after_hwframe+0x65/0xca
> 
> 1. Command 'ip link ... set nomaster' causes that ice_plug_aux_dev()
>    is called from ice_service_task() context, aux device is created
>    and associated device->lock is taken.
> 2. Command 'ip link ... set master...' calls ice's notifier under
>    RTNL lock and that notifier calls ice_unplug_aux_dev(). That
>    function tries to take aux device->lock but this is already taken
>    by ice_plug_aux_dev() in step 1
> 3. Later ice_plug_aux_dev() tries to take RTNL lock but this is already
>    taken in step 2
> 4. Dead-lock
> 
> The patch fixes this issue by following changes:
> - Bit ICE_FLAG_PLUG_AUX_DEV is kept to be set during ice_plug_aux_dev()
>   call in ice_service_task()
> - The bit is checked in ice_clear_rdma_cap() and only if it is not set
>   then ice_unplug_aux_dev() is called. If it is set (in other words
>   plugging of aux device was requested and ice_plug_aux_dev() is
>   potentially running) then the function only clears the bit
> - Once ice_plug_aux_dev() call (in ice_service_task) is finished
>   the bit ICE_FLAG_PLUG_AUX_DEV is cleared but it is also checked
>   whether it was already cleared by ice_clear_rdma_cap(). If so then
>   aux device is unplugged.
> 
> Signed-off-by: Ivan Vecera <ivecera@redhat.com>
> Co-developed-by: Petr Oros <poros@redhat.com>
> Signed-off-by: Petr Oros <poros@redhat.com>
> ---
>  drivers/net/ethernet/intel/ice/ice.h      | 11 ++++++++++-
>  drivers/net/ethernet/intel/ice/ice_main.c | 12 +++++++++++-
>  2 files changed, 21 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/ethernet/intel/ice/ice.h
> b/drivers/net/ethernet/intel/ice/ice.h
> index 3121f9b04f59..bea1d1e39fa2 100644
> --- a/drivers/net/ethernet/intel/ice/ice.h
> +++ b/drivers/net/ethernet/intel/ice/ice.h
> @@ -898,7 +898,16 @@ static inline void ice_set_rdma_cap(struct ice_pf
> *pf)
>   */
>  static inline void ice_clear_rdma_cap(struct ice_pf *pf)
>  {
> -	ice_unplug_aux_dev(pf);
> +	/* We can directly unplug aux device here only if the flag bit
> +	 * ICE_FLAG_PLUG_AUX_DEV is not set because
> ice_unplug_aux_dev()
> +	 * could race with ice_plug_aux_dev() called from
> +	 * ice_service_task(). In this case we only clear that bit now and
> +	 * aux device will be unplugged later once ice_plug_aux_device()
> +	 * called from ice_service_task() finishes (see ice_service_task()).
> +	 */
> +	if (!test_and_clear_bit(ICE_FLAG_PLUG_AUX_DEV, pf->flags))
> +		ice_unplug_aux_dev(pf);
> +
>  	clear_bit(ICE_FLAG_RDMA_ENA, pf->flags);
>  	clear_bit(ICE_FLAG_AUX_ENA, pf->flags);
>  }
> diff --git a/drivers/net/ethernet/intel/ice/ice_main.c
> b/drivers/net/ethernet/intel/ice/ice_main.c
> index 83e3e8aae6cf..493942e910be 100644
> --- a/drivers/net/ethernet/intel/ice/ice_main.c
> +++ b/drivers/net/ethernet/intel/ice/ice_main.c
> @@ -2255,9 +2255,19 @@ static void ice_service_task(struct work_struct
> *work)
>  		return;
>  	}
> 
> -	if (test_and_clear_bit(ICE_FLAG_PLUG_AUX_DEV, pf->flags))
> +	if (test_bit(ICE_FLAG_PLUG_AUX_DEV, pf->flags)) {
> +		/* Plug aux device per request */
>  		ice_plug_aux_dev(pf);
> 
> +		/* Mark plugging as done but check whether unplug was
> +		 * requested during ice_plug_aux_dev() call
> +		 * (e.g. from ice_clear_rdma_cap()) and if so then
> +		 * plug aux device.
> +		 */
> +		if (!test_and_clear_bit(ICE_FLAG_PLUG_AUX_DEV, pf-
> >flags))
> +			ice_unplug_aux_dev(pf);
> +	}
> +
>  	if (test_and_clear_bit(ICE_FLAG_MTU_CHANGED, pf->flags)) {
>  		struct iidc_event *event;
> 
> --
> 2.34.1

This only addresses one case of unplugging the auxiliary bus.  Rather than controlling one instance of
calling ice_unplig_aux_dev(), it seems like it would be better to modify ice_unplug_aux_dev so that it
will pause until any plugging is done by the service task (check for the pf->flag bit and wait until it clears
before progressing).

This way resets, devlink calls, etc would also be controlled from runaway plugging / unplugging of the aux
devices.  This would eliminate the second check in the service task also, since unplug would not be allowed
to progress until the bit is cleared by the service task.

In an upcoming patch I am making all plugging of the aux devices done in the service task, this would centralize
it and protect it against looping plug/unplug of the auxiliary devices.


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Intel-wired-lan] [PATCH net] ice: Fix race condition during interface enslave
@ 2022-03-10 17:48   ` Ertman, David M
  0 siblings, 0 replies; 13+ messages in thread
From: Ertman, David M @ 2022-03-10 17:48 UTC (permalink / raw)
  To: intel-wired-lan

> -----Original Message-----
> From: Ivan Vecera <ivecera@redhat.com>
> Sent: Thursday, March 10, 2022 9:17 AM
> To: netdev at vger.kernel.org
> Cc: Petr Oros <poros@redhat.com>; Brandeburg, Jesse
> <jesse.brandeburg@intel.com>; Nguyen, Anthony L
> <anthony.l.nguyen@intel.com>; David S. Miller <davem@davemloft.net>;
> Jakub Kicinski <kuba@kernel.org>; moderated list:INTEL ETHERNET DRIVERS
> <intel-wired-lan@lists.osuosl.org>; open list <linux-kernel@vger.kernel.org>
> Subject: [PATCH net] ice: Fix race condition during interface enslave
> 
> Commit 5dbbbd01cbba83 ("ice: Avoid RTNL lock when re-creating
> auxiliary device") changes a process of re-creation of aux device
> so ice_plug_aux_dev() is called from ice_service_task() context.
> This unfortunately opens a race window that can result in dead-lock
> when interface has left LAG and immediately enters LAG again.
> 
> Reproducer:
> ```
> #!/bin/sh
> 
> ip link add lag0 type bond mode 1 miimon 100
> ip link set lag0
> 
> for n in {1..10}; do
>         echo Cycle: $n
>         ip link set ens7f0 master lag0
>         sleep 1
>         ip link set ens7f0 nomaster
> done
> ```
> 
> This results in:
> [20976.208697] Workqueue: ice ice_service_task [ice]
> [20976.213422] Call Trace:
> [20976.215871]  __schedule+0x2d1/0x830
> [20976.219364]  schedule+0x35/0xa0
> [20976.222510]  schedule_preempt_disabled+0xa/0x10
> [20976.227043]  __mutex_lock.isra.7+0x310/0x420
> [20976.235071]  enum_all_gids_of_dev_cb+0x1c/0x100 [ib_core]
> [20976.251215]  ib_enum_roce_netdev+0xa4/0xe0 [ib_core]
> [20976.256192]  ib_cache_setup_one+0x33/0xa0 [ib_core]
> [20976.261079]  ib_register_device+0x40d/0x580 [ib_core]
> [20976.266139]  irdma_ib_register_device+0x129/0x250 [irdma]
> [20976.281409]  irdma_probe+0x2c1/0x360 [irdma]
> [20976.285691]  auxiliary_bus_probe+0x45/0x70
> [20976.289790]  really_probe+0x1f2/0x480
> [20976.298509]  driver_probe_device+0x49/0xc0
> [20976.302609]  bus_for_each_drv+0x79/0xc0
> [20976.306448]  __device_attach+0xdc/0x160
> [20976.310286]  bus_probe_device+0x9d/0xb0
> [20976.314128]  device_add+0x43c/0x890
> [20976.321287]  __auxiliary_device_add+0x43/0x60
> [20976.325644]  ice_plug_aux_dev+0xb2/0x100 [ice]
> [20976.330109]  ice_service_task+0xd0c/0xed0 [ice]
> [20976.342591]  process_one_work+0x1a7/0x360
> [20976.350536]  worker_thread+0x30/0x390
> [20976.358128]  kthread+0x10a/0x120
> [20976.365547]  ret_from_fork+0x1f/0x40
> ...
> [20976.438030] task:ip              state:D stack:    0 pid:213658 ppid:213627
> flags:0x00004084
> [20976.446469] Call Trace:
> [20976.448921]  __schedule+0x2d1/0x830
> [20976.452414]  schedule+0x35/0xa0
> [20976.455559]  schedule_preempt_disabled+0xa/0x10
> [20976.460090]  __mutex_lock.isra.7+0x310/0x420
> [20976.464364]  device_del+0x36/0x3c0
> [20976.467772]  ice_unplug_aux_dev+0x1a/0x40 [ice]
> [20976.472313]  ice_lag_event_handler+0x2a2/0x520 [ice]
> [20976.477288]  notifier_call_chain+0x47/0x70
> [20976.481386]  __netdev_upper_dev_link+0x18b/0x280
> [20976.489845]  bond_enslave+0xe05/0x1790 [bonding]
> [20976.494475]  do_setlink+0x336/0xf50
> [20976.502517]  __rtnl_newlink+0x529/0x8b0
> [20976.543441]  rtnl_newlink+0x43/0x60
> [20976.546934]  rtnetlink_rcv_msg+0x2b1/0x360
> [20976.559238]  netlink_rcv_skb+0x4c/0x120
> [20976.563079]  netlink_unicast+0x196/0x230
> [20976.567005]  netlink_sendmsg+0x204/0x3d0
> [20976.570930]  sock_sendmsg+0x4c/0x50
> [20976.574423]  ____sys_sendmsg+0x1eb/0x250
> [20976.586807]  ___sys_sendmsg+0x7c/0xc0
> [20976.606353]  __sys_sendmsg+0x57/0xa0
> [20976.609930]  do_syscall_64+0x5b/0x1a0
> [20976.613598]  entry_SYSCALL_64_after_hwframe+0x65/0xca
> 
> 1. Command 'ip link ... set nomaster' causes that ice_plug_aux_dev()
>    is called from ice_service_task() context, aux device is created
>    and associated device->lock is taken.
> 2. Command 'ip link ... set master...' calls ice's notifier under
>    RTNL lock and that notifier calls ice_unplug_aux_dev(). That
>    function tries to take aux device->lock but this is already taken
>    by ice_plug_aux_dev() in step 1
> 3. Later ice_plug_aux_dev() tries to take RTNL lock but this is already
>    taken in step 2
> 4. Dead-lock
> 
> The patch fixes this issue by following changes:
> - Bit ICE_FLAG_PLUG_AUX_DEV is kept to be set during ice_plug_aux_dev()
>   call in ice_service_task()
> - The bit is checked in ice_clear_rdma_cap() and only if it is not set
>   then ice_unplug_aux_dev() is called. If it is set (in other words
>   plugging of aux device was requested and ice_plug_aux_dev() is
>   potentially running) then the function only clears the bit
> - Once ice_plug_aux_dev() call (in ice_service_task) is finished
>   the bit ICE_FLAG_PLUG_AUX_DEV is cleared but it is also checked
>   whether it was already cleared by ice_clear_rdma_cap(). If so then
>   aux device is unplugged.
> 
> Signed-off-by: Ivan Vecera <ivecera@redhat.com>
> Co-developed-by: Petr Oros <poros@redhat.com>
> Signed-off-by: Petr Oros <poros@redhat.com>
> ---
>  drivers/net/ethernet/intel/ice/ice.h      | 11 ++++++++++-
>  drivers/net/ethernet/intel/ice/ice_main.c | 12 +++++++++++-
>  2 files changed, 21 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/ethernet/intel/ice/ice.h
> b/drivers/net/ethernet/intel/ice/ice.h
> index 3121f9b04f59..bea1d1e39fa2 100644
> --- a/drivers/net/ethernet/intel/ice/ice.h
> +++ b/drivers/net/ethernet/intel/ice/ice.h
> @@ -898,7 +898,16 @@ static inline void ice_set_rdma_cap(struct ice_pf
> *pf)
>   */
>  static inline void ice_clear_rdma_cap(struct ice_pf *pf)
>  {
> -	ice_unplug_aux_dev(pf);
> +	/* We can directly unplug aux device here only if the flag bit
> +	 * ICE_FLAG_PLUG_AUX_DEV is not set because
> ice_unplug_aux_dev()
> +	 * could race with ice_plug_aux_dev() called from
> +	 * ice_service_task(). In this case we only clear that bit now and
> +	 * aux device will be unplugged later once ice_plug_aux_device()
> +	 * called from ice_service_task() finishes (see ice_service_task()).
> +	 */
> +	if (!test_and_clear_bit(ICE_FLAG_PLUG_AUX_DEV, pf->flags))
> +		ice_unplug_aux_dev(pf);
> +
>  	clear_bit(ICE_FLAG_RDMA_ENA, pf->flags);
>  	clear_bit(ICE_FLAG_AUX_ENA, pf->flags);
>  }
> diff --git a/drivers/net/ethernet/intel/ice/ice_main.c
> b/drivers/net/ethernet/intel/ice/ice_main.c
> index 83e3e8aae6cf..493942e910be 100644
> --- a/drivers/net/ethernet/intel/ice/ice_main.c
> +++ b/drivers/net/ethernet/intel/ice/ice_main.c
> @@ -2255,9 +2255,19 @@ static void ice_service_task(struct work_struct
> *work)
>  		return;
>  	}
> 
> -	if (test_and_clear_bit(ICE_FLAG_PLUG_AUX_DEV, pf->flags))
> +	if (test_bit(ICE_FLAG_PLUG_AUX_DEV, pf->flags)) {
> +		/* Plug aux device per request */
>  		ice_plug_aux_dev(pf);
> 
> +		/* Mark plugging as done but check whether unplug was
> +		 * requested during ice_plug_aux_dev() call
> +		 * (e.g. from ice_clear_rdma_cap()) and if so then
> +		 * plug aux device.
> +		 */
> +		if (!test_and_clear_bit(ICE_FLAG_PLUG_AUX_DEV, pf-
> >flags))
> +			ice_unplug_aux_dev(pf);
> +	}
> +
>  	if (test_and_clear_bit(ICE_FLAG_MTU_CHANGED, pf->flags)) {
>  		struct iidc_event *event;
> 
> --
> 2.34.1

This only addresses one case of unplugging the auxiliary bus.  Rather than controlling one instance of
calling ice_unplig_aux_dev(), it seems like it would be better to modify ice_unplug_aux_dev so that it
will pause until any plugging is done by the service task (check for the pf->flag bit and wait until it clears
before progressing).

This way resets, devlink calls, etc would also be controlled from runaway plugging / unplugging of the aux
devices.  This would eliminate the second check in the service task also, since unplug would not be allowed
to progress until the bit is cleared by the service task.

In an upcoming patch I am making all plugging of the aux devices done in the service task, this would centralize
it and protect it against looping plug/unplug of the auxiliary devices.


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Intel-wired-lan] [PATCH net] ice: Fix race condition during interface enslave
  2022-03-10 17:48   ` [Intel-wired-lan] " Ertman, David M
  (?)
@ 2022-03-10 18:06   ` Ivan Vecera
  2022-03-10 19:26       ` [Intel-wired-lan] " Ertman, David M
  -1 siblings, 1 reply; 13+ messages in thread
From: Ivan Vecera @ 2022-03-10 18:06 UTC (permalink / raw)
  To: intel-wired-lan

On Thu, 10 Mar 2022 17:48:16 +0000
"Ertman, David M" <david.m.ertman@intel.com> wrote:

> This only addresses one case of unplugging the auxiliary bus.  Rather than controlling one instance of
> calling ice_unplig_aux_dev(), it seems like it would be better to modify ice_unplug_aux_dev so that it
> will pause until any plugging is done by the service task (check for the pf->flag bit and wait until it clears
> before progressing).

You cannot wait in ice_unplug_aux_dev() for ICE_FLAG_PLUG_AUX_DEV to be cleared because
ice_clear_rdma_cap() is called under RTNL.
This patch is a counter part for commit 5dbbbd01cbba83 ("ice: Avoid RTNL lock when
re-creating auxiliary device") that eliminates ice_plug_aux() and fixed first part
of deadlock and this patch fixes a second part and eliminates also ice_unplug_aux_dev()
to be called under RTNL.

Thanks,
Ivan


^ permalink raw reply	[flat|nested] 13+ messages in thread

* RE: [PATCH net] ice: Fix race condition during interface enslave
  2022-03-10 18:06   ` Ivan Vecera
@ 2022-03-10 19:26       ` Ertman, David M
  0 siblings, 0 replies; 13+ messages in thread
From: Ertman, David M @ 2022-03-10 19:26 UTC (permalink / raw)
  To: Ivan Vecera
  Cc: netdev, Petr Oros, Brandeburg, Jesse, Nguyen, Anthony L,
	David S. Miller, Jakub Kicinski, intel-wired-lan, linux-kernel

> -----Original Message-----
> From: Ivan Vecera <ivecera@redhat.com>
> Sent: Thursday, March 10, 2022 10:07 AM
> To: Ertman, David M <david.m.ertman@intel.com>
> Cc: netdev@vger.kernel.org; Petr Oros <poros@redhat.com>; Brandeburg,
> Jesse <jesse.brandeburg@intel.com>; Nguyen, Anthony L
> <anthony.l.nguyen@intel.com>; David S. Miller <davem@davemloft.net>;
> Jakub Kicinski <kuba@kernel.org>; moderated list:INTEL ETHERNET DRIVERS"
> <intel-wired-lan@lists.osuosl.org>, open list <linux-kernel@vger.kernel.org>
> Subject: Re: [PATCH net] ice: Fix race condition during interface enslave
> 
> On Thu, 10 Mar 2022 17:48:16 +0000
> "Ertman, David M" <david.m.ertman@intel.com> wrote:
> 
> > This only addresses one case of unplugging the auxiliary bus.  Rather than
> controlling one instance of
> > calling ice_unplig_aux_dev(), it seems like it would be better to modify
> ice_unplug_aux_dev so that it
> > will pause until any plugging is done by the service task (check for the pf-
> >flag bit and wait until it clears
> > before progressing).
> 
> You cannot wait in ice_unplug_aux_dev() for ICE_FLAG_PLUG_AUX_DEV to
> be cleared because
> ice_clear_rdma_cap() is called under RTNL.
> This patch is a counter part for commit 5dbbbd01cbba83 ("ice: Avoid RTNL
> lock when
> re-creating auxiliary device") that eliminates ice_plug_aux() and fixed first
> part
> of deadlock and this patch fixes a second part and eliminates also
> ice_unplug_aux_dev()
> to be called under RTNL.
> 
> Thanks,
> Ivan

You are correct.  I forgot about the RTNL context!

With this in mind - I agree with your approach.

Reviewed-by: Dave Ertman <david.m.ertman@intel.com>



^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Intel-wired-lan] [PATCH net] ice: Fix race condition during interface enslave
@ 2022-03-10 19:26       ` Ertman, David M
  0 siblings, 0 replies; 13+ messages in thread
From: Ertman, David M @ 2022-03-10 19:26 UTC (permalink / raw)
  To: intel-wired-lan

> -----Original Message-----
> From: Ivan Vecera <ivecera@redhat.com>
> Sent: Thursday, March 10, 2022 10:07 AM
> To: Ertman, David M <david.m.ertman@intel.com>
> Cc: netdev at vger.kernel.org; Petr Oros <poros@redhat.com>; Brandeburg,
> Jesse <jesse.brandeburg@intel.com>; Nguyen, Anthony L
> <anthony.l.nguyen@intel.com>; David S. Miller <davem@davemloft.net>;
> Jakub Kicinski <kuba@kernel.org>; moderated list:INTEL ETHERNET DRIVERS"
> <intel-wired-lan@lists.osuosl.org>, open list <linux-kernel@vger.kernel.org>
> Subject: Re: [PATCH net] ice: Fix race condition during interface enslave
> 
> On Thu, 10 Mar 2022 17:48:16 +0000
> "Ertman, David M" <david.m.ertman@intel.com> wrote:
> 
> > This only addresses one case of unplugging the auxiliary bus.  Rather than
> controlling one instance of
> > calling ice_unplig_aux_dev(), it seems like it would be better to modify
> ice_unplug_aux_dev so that it
> > will pause until any plugging is done by the service task (check for the pf-
> >flag bit and wait until it clears
> > before progressing).
> 
> You cannot wait in ice_unplug_aux_dev() for ICE_FLAG_PLUG_AUX_DEV to
> be cleared because
> ice_clear_rdma_cap() is called under RTNL.
> This patch is a counter part for commit 5dbbbd01cbba83 ("ice: Avoid RTNL
> lock when
> re-creating auxiliary device") that eliminates ice_plug_aux() and fixed first
> part
> of deadlock and this patch fixes a second part and eliminates also
> ice_unplug_aux_dev()
> to be called under RTNL.
> 
> Thanks,
> Ivan

You are correct.  I forgot about the RTNL context!

With this in mind - I agree with your approach.

Reviewed-by: Dave Ertman <david.m.ertman@intel.com>



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH net] ice: Fix race condition during interface enslave
  2022-03-10 17:16 ` [Intel-wired-lan] " Ivan Vecera
@ 2022-03-10 21:59   ` Jakub Kicinski
  -1 siblings, 0 replies; 13+ messages in thread
From: Jakub Kicinski @ 2022-03-10 21:59 UTC (permalink / raw)
  To: Ivan Vecera, Jesse Brandeburg, Tony Nguyen
  Cc: netdev, Petr Oros, David S. Miller,
	moderated list:INTEL ETHERNET DRIVERS, open list

On Thu, 10 Mar 2022 18:16:41 +0100 Ivan Vecera wrote:
> Commit 5dbbbd01cbba83 ("ice: Avoid RTNL lock when re-creating
> auxiliary device") changes a process of re-creation of aux device
> so ice_plug_aux_dev() is called from ice_service_task() context.
> This unfortunately opens a race window that can result in dead-lock
> when interface has left LAG and immediately enters LAG again.
> 
> Reproducer:
> ```
> #!/bin/sh
> 
> ip link add lag0 type bond mode 1 miimon 100
> ip link set lag0
> 
> for n in {1..10}; do
>         echo Cycle: $n
>         ip link set ens7f0 master lag0
>         sleep 1
>         ip link set ens7f0 nomaster
> done

What's the priority on this one? The loop max of 10 seems a little
worrying.

Tony, Jesse, is it important enough to push into 5.17 or do you prefer
to take it via the normal path and do full QA? The blamed patch come
in to 5.17-rc it seems.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Intel-wired-lan] [PATCH net] ice: Fix race condition during interface enslave
@ 2022-03-10 21:59   ` Jakub Kicinski
  0 siblings, 0 replies; 13+ messages in thread
From: Jakub Kicinski @ 2022-03-10 21:59 UTC (permalink / raw)
  To: intel-wired-lan

On Thu, 10 Mar 2022 18:16:41 +0100 Ivan Vecera wrote:
> Commit 5dbbbd01cbba83 ("ice: Avoid RTNL lock when re-creating
> auxiliary device") changes a process of re-creation of aux device
> so ice_plug_aux_dev() is called from ice_service_task() context.
> This unfortunately opens a race window that can result in dead-lock
> when interface has left LAG and immediately enters LAG again.
> 
> Reproducer:
> ```
> #!/bin/sh
> 
> ip link add lag0 type bond mode 1 miimon 100
> ip link set lag0
> 
> for n in {1..10}; do
>         echo Cycle: $n
>         ip link set ens7f0 master lag0
>         sleep 1
>         ip link set ens7f0 nomaster
> done

What's the priority on this one? The loop max of 10 seems a little
worrying.

Tony, Jesse, is it important enough to push into 5.17 or do you prefer
to take it via the normal path and do full QA? The blamed patch come
in to 5.17-rc it seems.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH net] ice: Fix race condition during interface enslave
  2022-03-10 21:59   ` [Intel-wired-lan] " Jakub Kicinski
@ 2022-03-10 22:15     ` Tony Nguyen
  -1 siblings, 0 replies; 13+ messages in thread
From: Tony Nguyen @ 2022-03-10 22:15 UTC (permalink / raw)
  To: Jakub Kicinski, Ivan Vecera, Jesse Brandeburg
  Cc: netdev, Petr Oros, David S. Miller,
	moderated list:INTEL ETHERNET DRIVERS, open list


On 3/10/2022 1:59 PM, Jakub Kicinski wrote:
> On Thu, 10 Mar 2022 18:16:41 +0100 Ivan Vecera wrote:
>> Commit 5dbbbd01cbba83 ("ice: Avoid RTNL lock when re-creating
>> auxiliary device") changes a process of re-creation of aux device
>> so ice_plug_aux_dev() is called from ice_service_task() context.
>> This unfortunately opens a race window that can result in dead-lock
>> when interface has left LAG and immediately enters LAG again.
>>
>> Reproducer:
>> ```
>> #!/bin/sh
>>
>> ip link add lag0 type bond mode 1 miimon 100
>> ip link set lag0
>>
>> for n in {1..10}; do
>>          echo Cycle: $n
>>          ip link set ens7f0 master lag0
>>          sleep 1
>>          ip link set ens7f0 nomaster
>> done
> What's the priority on this one? The loop max of 10 seems a little
> worrying.
>
> Tony, Jesse, is it important enough to push into 5.17 or do you prefer
> to take it via the normal path and do full QA? The blamed patch come
> in to 5.17-rc it seems.

Hi Jakub,

Yea, it'd be preferred to make it into 5.17. Feel free to take it directly.

Thanks,

Tony


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Intel-wired-lan] [PATCH net] ice: Fix race condition during interface enslave
@ 2022-03-10 22:15     ` Tony Nguyen
  0 siblings, 0 replies; 13+ messages in thread
From: Tony Nguyen @ 2022-03-10 22:15 UTC (permalink / raw)
  To: intel-wired-lan


On 3/10/2022 1:59 PM, Jakub Kicinski wrote:
> On Thu, 10 Mar 2022 18:16:41 +0100 Ivan Vecera wrote:
>> Commit 5dbbbd01cbba83 ("ice: Avoid RTNL lock when re-creating
>> auxiliary device") changes a process of re-creation of aux device
>> so ice_plug_aux_dev() is called from ice_service_task() context.
>> This unfortunately opens a race window that can result in dead-lock
>> when interface has left LAG and immediately enters LAG again.
>>
>> Reproducer:
>> ```
>> #!/bin/sh
>>
>> ip link add lag0 type bond mode 1 miimon 100
>> ip link set lag0
>>
>> for n in {1..10}; do
>>          echo Cycle: $n
>>          ip link set ens7f0 master lag0
>>          sleep 1
>>          ip link set ens7f0 nomaster
>> done
> What's the priority on this one? The loop max of 10 seems a little
> worrying.
>
> Tony, Jesse, is it important enough to push into 5.17 or do you prefer
> to take it via the normal path and do full QA? The blamed patch come
> in to 5.17-rc it seems.

Hi Jakub,

Yea, it'd be preferred to make it into 5.17. Feel free to take it directly.

Thanks,

Tony


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH net] ice: Fix race condition during interface enslave
  2022-03-10 17:16 ` [Intel-wired-lan] " Ivan Vecera
@ 2022-03-10 23:30   ` patchwork-bot+netdevbpf
  -1 siblings, 0 replies; 13+ messages in thread
From: patchwork-bot+netdevbpf @ 2022-03-10 23:30 UTC (permalink / raw)
  To: Ivan Vecera
  Cc: netdev, poros, jesse.brandeburg, anthony.l.nguyen, davem, kuba,
	intel-wired-lan, linux-kernel

Hello:

This patch was applied to netdev/net.git (master)
by Jakub Kicinski <kuba@kernel.org>:

On Thu, 10 Mar 2022 18:16:41 +0100 you wrote:
> Commit 5dbbbd01cbba83 ("ice: Avoid RTNL lock when re-creating
> auxiliary device") changes a process of re-creation of aux device
> so ice_plug_aux_dev() is called from ice_service_task() context.
> This unfortunately opens a race window that can result in dead-lock
> when interface has left LAG and immediately enters LAG again.
> 
> Reproducer:
> ```
> #!/bin/sh
> 
> [...]

Here is the summary with links:
  - [net] ice: Fix race condition during interface enslave
    https://git.kernel.org/netdev/net/c/5cb1ebdbc434

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Intel-wired-lan] [PATCH net] ice: Fix race condition during interface enslave
@ 2022-03-10 23:30   ` patchwork-bot+netdevbpf
  0 siblings, 0 replies; 13+ messages in thread
From: patchwork-bot+netdevbpf @ 2022-03-10 23:30 UTC (permalink / raw)
  To: intel-wired-lan

Hello:

This patch was applied to netdev/net.git (master)
by Jakub Kicinski <kuba@kernel.org>:

On Thu, 10 Mar 2022 18:16:41 +0100 you wrote:
> Commit 5dbbbd01cbba83 ("ice: Avoid RTNL lock when re-creating
> auxiliary device") changes a process of re-creation of aux device
> so ice_plug_aux_dev() is called from ice_service_task() context.
> This unfortunately opens a race window that can result in dead-lock
> when interface has left LAG and immediately enters LAG again.
> 
> Reproducer:
> ```
> #!/bin/sh
> 
> [...]

Here is the summary with links:
  - [net] ice: Fix race condition during interface enslave
    https://git.kernel.org/netdev/net/c/5cb1ebdbc434

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2022-03-10 23:30 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-03-10 17:16 [PATCH net] ice: Fix race condition during interface enslave Ivan Vecera
2022-03-10 17:16 ` [Intel-wired-lan] " Ivan Vecera
2022-03-10 17:48 ` Ertman, David M
2022-03-10 17:48   ` [Intel-wired-lan] " Ertman, David M
2022-03-10 18:06   ` Ivan Vecera
2022-03-10 19:26     ` Ertman, David M
2022-03-10 19:26       ` [Intel-wired-lan] " Ertman, David M
2022-03-10 21:59 ` Jakub Kicinski
2022-03-10 21:59   ` [Intel-wired-lan] " Jakub Kicinski
2022-03-10 22:15   ` Tony Nguyen
2022-03-10 22:15     ` [Intel-wired-lan] " Tony Nguyen
2022-03-10 23:30 ` patchwork-bot+netdevbpf
2022-03-10 23:30   ` [Intel-wired-lan] " patchwork-bot+netdevbpf

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.