* [PATCH v2] igb: reinit_locked() should be called with rtnl_lock
From: Francesco Ruggeri @ 2020-07-02 22:39 UTC
To: linux-kernel, netdev, intel-wired-lan, kuba, davem,
jeffrey.t.kirsher, fruggeri
We observed two panics involving races with igb_reset_task.
The first panic is caused by this race condition:

kworker                 reboot -f

igb_reset_task
igb_reinit_locked
igb_down
napi_synchronize
                        __igb_shutdown
                        igb_clear_interrupt_scheme
                        igb_free_q_vectors
                        igb_free_q_vector
                        adapter->q_vector[v_idx] = NULL;
napi_disable
Panics trying to access
adapter->q_vector[v_idx].napi_state

The second panic (a divide error) is caused by this race:

kworker                 reboot -f               tx packet

igb_reset_task
                        __igb_shutdown
                        rtnl_lock()
                        ...
                        igb_clear_interrupt_scheme
                        igb_free_q_vectors
                        adapter->num_tx_queues = 0
                        ...
                        rtnl_unlock()
rtnl_lock()
igb_reinit_locked
igb_down
igb_up
netif_tx_start_all_queues
                                                dev_hard_start_xmit
                                                igb_xmit_frame
                                                igb_tx_queue_mapping
                                                Panics on
                                                r_idx % adapter->num_tx_queues

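The divide error in the second race can be illustrated with a minimal userspace sketch. The transmit path wraps the requested queue index with a modulo, so it faults once the queue count has been zeroed underneath it. The helper name pick_tx_queue() and its return convention are hypothetical; this is not the driver's igb_tx_queue_mapping(), and the patch avoids the condition by not restarting the queues rather than by adding a check like this one.

```c
#include <assert.h>

/*
 * Hypothetical helper, for illustration only.
 * Returns 0 and stores the chosen queue, or -1 if no queues exist.
 */
static int pick_tx_queue(unsigned int r_idx, unsigned int num_tx_queues,
			 unsigned int *queue)
{
	/* Without this check, the modulo below divides by zero. */
	if (num_tx_queues == 0)
		return -1;

	/* Wrap an out-of-range index back into the valid range. */
	if (r_idx >= num_tx_queues)
		r_idx = r_idx % num_tx_queues;

	*queue = r_idx;
	return 0;
}
```

With four queues, index 5 wraps to queue 1. With zero queues, the unguarded modulo is what produces the panic described above.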
This commit applies to igb_reset_task the same changes that
were applied to ixgbe in commit 2f90b8657ec9 ("ixgbe: this patch
adds support for DCB to the kernel and ixgbe driver"),
commit 8f4c5c9fb87a ("ixgbe: reinit_locked() should be called with
rtnl_lock") and commit 88adce4ea8f9 ("ixgbe: fix possible race in
reset subtask").
v2: add fix for second race condition above.
Signed-off-by: Francesco Ruggeri <fruggeri@arista.com>
---
drivers/net/ethernet/intel/igb/igb_main.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
index 8bb3db2cbd41..6e5861bfb0fa 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -6224,9 +6224,18 @@ static void igb_reset_task(struct work_struct *work)
 	struct igb_adapter *adapter;
 	adapter = container_of(work, struct igb_adapter, reset_task);
 
+	rtnl_lock();
+	/* If we're already down or resetting, just bail */
+	if (test_bit(__IGB_DOWN, &adapter->state) ||
+	    test_bit(__IGB_RESETTING, &adapter->state)) {
+		rtnl_unlock();
+		return;
+	}
+
 	igb_dump(adapter);
 	netdev_err(adapter->netdev, "Reset adapter\n");
 	igb_reinit_locked(adapter);
+	rtnl_unlock();
 }
 
 /**
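
For readers who want to try the locking pattern outside the kernel, here is a rough userspace sketch of what the hunk above does: take the lock, check the state flags, and bail out if the device is already down or resetting. Everything in it is a stand-in. The pthread mutex plays the role of rtnl_lock(), and fake_adapter, fake_shutdown() and fake_reset_task() are hypothetical names, not kernel APIs. The sketch also assumes the shutdown path leaves the "down" flag set.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the __IGB_DOWN / __IGB_RESETTING state bits. */
enum { ADAPTER_DOWN = 1u << 0, ADAPTER_RESETTING = 1u << 1 };

struct fake_adapter {
	unsigned int state;         /* plays the role of adapter->state */
	unsigned int num_tx_queues; /* zeroed when the queues are freed */
	int reinit_count;           /* how often the reset really ran */
};

/* A single global mutex standing in for rtnl_lock()/rtnl_unlock(). */
static pthread_mutex_t fake_rtnl = PTHREAD_MUTEX_INITIALIZER;

/* Shutdown path: marks the device down and frees queues under the lock. */
static void fake_shutdown(struct fake_adapter *a)
{
	pthread_mutex_lock(&fake_rtnl);
	a->state |= ADAPTER_DOWN;
	a->num_tx_queues = 0;
	pthread_mutex_unlock(&fake_rtnl);
}

/* Reset worker: returns true if it reinitialized, false if it bailed. */
static bool fake_reset_task(struct fake_adapter *a)
{
	bool ran = false;

	pthread_mutex_lock(&fake_rtnl);
	/* If we're already down or resetting, just bail. */
	if (!(a->state & (ADAPTER_DOWN | ADAPTER_RESETTING))) {
		a->reinit_count++;
		ran = true;
	}
	pthread_mutex_unlock(&fake_rtnl);
	return ran;
}
```

Because the state check happens while the lock is held, a reset that is scheduled after the shutdown path has run sees the down flag and returns without touching the freed queues.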
* RE: [Intel-wired-lan] [PATCH v2] igb: reinit_locked() should be called with rtnl_lock
From: Brown, Aaron F @ 2020-07-28 20:37 UTC
To: Francesco Ruggeri, linux-kernel, netdev, intel-wired-lan, kuba,
davem, Kirsher, Jeffrey T
> From: Intel-wired-lan <intel-wired-lan-bounces@osuosl.org> On Behalf Of
> Francesco Ruggeri
> Sent: Thursday, July 2, 2020 3:39 PM
> To: linux-kernel@vger.kernel.org; netdev@vger.kernel.org; intel-wired-
> lan@lists.osuosl.org; kuba@kernel.org; davem@davemloft.net; Kirsher, Jeffrey
> T <jeffrey.t.kirsher@intel.com>; fruggeri@arista.com
> Subject: [Intel-wired-lan] [PATCH v2] igb: reinit_locked() should be called with
> rtnl_lock
>
> We observed two panics involving races with igb_reset_task.
> The first panic is caused by this race condition:
>
> kworker                 reboot -f
>
> igb_reset_task
> igb_reinit_locked
> igb_down
> napi_synchronize
>                         __igb_shutdown
>                         igb_clear_interrupt_scheme
>                         igb_free_q_vectors
>                         igb_free_q_vector
>                         adapter->q_vector[v_idx] = NULL;
> napi_disable
> Panics trying to access
> adapter->q_vector[v_idx].napi_state
>
> The second panic (a divide error) is caused by this race:
>
> kworker                 reboot -f               tx packet
>
> igb_reset_task
>                         __igb_shutdown
>                         rtnl_lock()
>                         ...
>                         igb_clear_interrupt_scheme
>                         igb_free_q_vectors
>                         adapter->num_tx_queues = 0
>                         ...
>                         rtnl_unlock()
> rtnl_lock()
> igb_reinit_locked
> igb_down
> igb_up
> netif_tx_start_all_queues
>                                                 dev_hard_start_xmit
>                                                 igb_xmit_frame
>                                                 igb_tx_queue_mapping
>                                                 Panics on
>                                                 r_idx % adapter->num_tx_queues
>
> This commit applies to igb_reset_task the same changes that
> were applied to ixgbe in commit 2f90b8657ec9 ("ixgbe: this patch
> adds support for DCB to the kernel and ixgbe driver"),
> commit 8f4c5c9fb87a ("ixgbe: reinit_locked() should be called with
> rtnl_lock") and commit 88adce4ea8f9 ("ixgbe: fix possible race in
> reset subtask").
>
> v2: add fix for second race condition above.
>
> Signed-off-by: Francesco Ruggeri <fruggeri@arista.com>
>
> ---
> drivers/net/ethernet/intel/igb/igb_main.c | 9 +++++++++
> 1 file changed, 9 insertions(+)
Tested-by: Aaron Brown <aaron.f.brown@intel.com>