[PATCH V1 net 0/3] Bug fixes for ENA ethernet driver

netdev.vger.kernel.org archive mirror

* [PATCH V1 net 0/3] Bug fixes for ENA ethernet driver
@ 2020-08-12 10:10 Shay Agroskin
  2020-08-12 10:10 ` [PATCH V1 net 1/3] net: ena: Prevent reset after device destruction Shay Agroskin
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Shay Agroskin @ 2020-08-12 10:10 UTC (permalink / raw)
  To: davem, netdev
  Cc: dwmw, zorik, matua, saeedb, msw, aliguori, nafea, gtzalik,
	netanel, alisaidi, benh, akiyano, sameehj, ndagan, Shay Agroskin

This series adds the following:
- Fix possible NULL dereference after returning from suspend
- Fix condition inside a WARN_ON
- Fix overriding previous value when updating missed_tx statistic

Shay Agroskin (3):
  net: ena: Prevent reset after device destruction
  net: ena: Change WARN_ON expression in ena_del_napi_in_range()
  net: ena: Make missed_tx stat incremental

 drivers/net/ethernet/amazon/ena/ena_netdev.c | 33 +++++++++-----------
 1 file changed, 14 insertions(+), 19 deletions(-)

-- 
2.28.0



* [PATCH V1 net 1/3] net: ena: Prevent reset after device destruction
  2020-08-12 10:10 [PATCH V1 net 0/3] Bug fixes for ENA ethernet driver Shay Agroskin
@ 2020-08-12 10:10 ` Shay Agroskin
  2020-08-12 17:52   ` Jakub Kicinski
  2020-08-12 10:10 ` [PATCH V1 net 2/3] net: ena: Change WARN_ON expression in ena_del_napi_in_range() Shay Agroskin
  2020-08-12 10:10 ` [PATCH V1 net 3/3] net: ena: Make missed_tx stat incremental Shay Agroskin
  2 siblings, 1 reply; 8+ messages in thread
From: Shay Agroskin @ 2020-08-12 10:10 UTC (permalink / raw)
  To: davem, netdev
  Cc: dwmw, zorik, matua, saeedb, msw, aliguori, nafea, gtzalik,
	netanel, alisaidi, benh, akiyano, sameehj, ndagan, Shay Agroskin

The reset work is scheduled by the timer routine whenever it
detects that a device reset is required (e.g. when a keep_alive signal
is missing).
When releasing device resources in ena_destroy_device() the driver
cancels the scheduling of the timer routine without destroying the reset
work explicitly.

This creates the following bug:
    The driver is suspended and the ena_suspend() function is called
	-> This function calls ena_destroy_device() to free the net
	   device resources
	    -> The driver waits for the timer routine to finish
	    its execution and then cancels it, thus preventing it
	    from being called again.

    If, in its final execution, the timer routine schedules a reset,
    the reset routine might be called after the device resources are
    freed, which might cause a kernel panic.

By changing the reset routine so that it cannot run simultaneously with
the destruction routine, we allow the reset routine to read the device's
state accurately.
This is achieved by checking whether the ENA_FLAG_TRIGGER_RESET flag is
set before resetting the device, and by making both the destruction
function and the flag check run under the rtnl lock.
The ENA_FLAG_TRIGGER_RESET flag is cleared at the end of the destruction
routine. Also surround the flag check with 'likely' because
we expect that the reset routine would be called only when the
ENA_FLAG_TRIGGER_RESET flag is set.

This patch also removes the destruction of the timer and reset services
from ena_remove() since the timer is destroyed by the destruction
routine and the reset work is handled by this patch.

Fixes: 8c5c7abdeb2d ("net: ena: add power management ops to the ENA driver")
Signed-off-by: Shay Agroskin <shayagr@amazon.com>
---
 drivers/net/ethernet/amazon/ena/ena_netdev.c | 17 ++++++-----------
 1 file changed, 6 insertions(+), 11 deletions(-)

diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.c b/drivers/net/ethernet/amazon/ena/ena_netdev.c
index 2a6c9725e092..0488fcbf48f7 100644
--- a/drivers/net/ethernet/amazon/ena/ena_netdev.c
+++ b/drivers/net/ethernet/amazon/ena/ena_netdev.c
@@ -3601,16 +3601,14 @@ static void ena_fw_reset_device(struct work_struct *work)
 {
 	struct ena_adapter *adapter =
 		container_of(work, struct ena_adapter, reset_task);
-	struct pci_dev *pdev = adapter->pdev;
 
-	if (unlikely(!test_bit(ENA_FLAG_TRIGGER_RESET, &adapter->flags))) {
-		dev_err(&pdev->dev,
-			"device reset schedule while reset bit is off\n");
-		return;
-	}
 	rtnl_lock();
-	ena_destroy_device(adapter, false);
-	ena_restore_device(adapter);
+
+	if (likely(test_bit(ENA_FLAG_TRIGGER_RESET, &adapter->flags))) {
+		ena_destroy_device(adapter, false);
+		ena_restore_device(adapter);
+	}
+
 	rtnl_unlock();
 }
 
@@ -4389,9 +4387,6 @@ static void __ena_shutoff(struct pci_dev *pdev, bool shutdown)
 		netdev->rx_cpu_rmap = NULL;
 	}
 #endif /* CONFIG_RFS_ACCEL */
-	del_timer_sync(&adapter->timer_service);
-
-	cancel_work_sync(&adapter->reset_task);
 
 	rtnl_lock(); /* lock released inside the below if-else block */
 	adapter->reset_reason = ENA_REGS_RESET_SHUTDOWN;
-- 
2.28.0
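
The locking scheme described in the commit message above can be sketched in userspace C. This is only a single-threaded model under stated assumptions: a plain bool stands in for the rtnl lock, and struct model_adapter and every model_* function are hypothetical names, not the driver's code. It shows why a reset handler that tests the flag under the lock becomes a no-op once the destruction path has cleared the flag.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only: the fields below are hypothetical and do not
 * mirror the real struct ena_adapter.
 */
struct model_adapter {
	bool lock_held;     /* stands in for the rtnl lock */
	bool trigger_reset; /* stands in for ENA_FLAG_TRIGGER_RESET */
	int resets_done;    /* number of resets the handler performed */
};

static void model_lock(struct model_adapter *a)
{
	assert(!a->lock_held); /* single-threaded model: never contended */
	a->lock_held = true;
}

static void model_unlock(struct model_adapter *a)
{
	assert(a->lock_held);
	a->lock_held = false;
}

/* Destruction path: the caller holds the lock, the flag is cleared. */
static void model_destroy_device(struct model_adapter *a)
{
	assert(a->lock_held);
	a->trigger_reset = false;
}

/* Reset handler as changed by the patch: flag tested under the lock. */
static void model_reset_work(struct model_adapter *a)
{
	model_lock(a);
	if (a->trigger_reset) {
		model_destroy_device(a);
		a->resets_done++; /* stands in for ena_restore_device() */
	}
	model_unlock(a);
}

/* Suspend path: destroys the device under the same lock. */
static void model_suspend(struct model_adapter *a)
{
	model_lock(a);
	model_destroy_device(a);
	model_unlock(a);
}
```

In this model, calling model_suspend() and then model_reset_work() leaves resets_done at 0. The model says nothing about the lifetime of the adapter memory itself.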



* [PATCH V1 net 2/3] net: ena: Change WARN_ON expression in ena_del_napi_in_range()
  2020-08-12 10:10 [PATCH V1 net 0/3] Bug fixes for ENA ethernet driver Shay Agroskin
  2020-08-12 10:10 ` [PATCH V1 net 1/3] net: ena: Prevent reset after device destruction Shay Agroskin
@ 2020-08-12 10:10 ` Shay Agroskin
  2020-08-12 10:10 ` [PATCH V1 net 3/3] net: ena: Make missed_tx stat incremental Shay Agroskin
  2 siblings, 0 replies; 8+ messages in thread
From: Shay Agroskin @ 2020-08-12 10:10 UTC (permalink / raw)
  To: davem, netdev
  Cc: dwmw, zorik, matua, saeedb, msw, aliguori, nafea, gtzalik,
	netanel, alisaidi, benh, akiyano, sameehj, ndagan, Shay Agroskin

The ena_del_napi_in_range() function unregisters the napi handler for
rings in a given range.
This function had the following WARN_ON macro:

    WARN_ON(ENA_IS_XDP_INDEX(adapter, i) &&
	    adapter->ena_napi[i].xdp_ring);

This macro prints the call stack if the expression inside of it is
true [1], but the expression inside of it describes the wanted
situation: it checks whether the ring has an XDP queue and its index
corresponds to an XDP one.

This patch changes the expression to
    !ENA_IS_XDP_INDEX(adapter, i) && adapter->ena_napi[i].xdp_ring
which indicates an unwanted situation.

Also, change the structure of the function. The napi handler is
unregistered for all rings, and so there's no need to check whether the
index is an XDP index or not. By removing this check the code becomes
much more readable.

[1]:
https://elixir.bootlin.com/linux/latest/source/arch/parisc/include/asm/bug.h#L79

Fixes: 8c5c7abdeb2d ("net: ena: add power management ops to the ENA driver")
Signed-off-by: Shay Agroskin <shayagr@amazon.com>
---
 drivers/net/ethernet/amazon/ena/ena_netdev.c | 11 ++++-------
 1 file changed, 4 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.c b/drivers/net/ethernet/amazon/ena/ena_netdev.c
index 0488fcbf48f7..3e12065482c2 100644
--- a/drivers/net/ethernet/amazon/ena/ena_netdev.c
+++ b/drivers/net/ethernet/amazon/ena/ena_netdev.c
@@ -2180,13 +2180,10 @@ static void ena_del_napi_in_range(struct ena_adapter *adapter,
 	int i;
 
 	for (i = first_index; i < first_index + count; i++) {
-		/* Check if napi was initialized before */
-		if (!ENA_IS_XDP_INDEX(adapter, i) ||
-		    adapter->ena_napi[i].xdp_ring)
-			netif_napi_del(&adapter->ena_napi[i].napi);
-		else
-			WARN_ON(ENA_IS_XDP_INDEX(adapter, i) &&
-				adapter->ena_napi[i].xdp_ring);
+		netif_napi_del(&adapter->ena_napi[i].napi);
+
+		WARN_ON(!ENA_IS_XDP_INDEX(adapter, i) &&
+			adapter->ena_napi[i].xdp_ring);
 	}
 }
 
-- 
2.28.0
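
To make the change to the WARN_ON condition in the patch above concrete, here is a small userspace sketch that reduces the old and new loop bodies to boolean predicates. The function names are hypothetical; the two parameters stand in for ENA_IS_XDP_INDEX(adapter, i) and adapter->ena_napi[i].xdp_ring being non-NULL.

```c
#include <stdbool.h>

/* Old loop body: returns true if the old WARN_ON would have fired. */
static bool old_warn_fires(bool is_xdp_index, bool has_xdp_ring)
{
	if (!is_xdp_index || has_xdp_ring)
		return false; /* napi is deleted, WARN_ON is not evaluated */

	/* Reached only when is_xdp_index && !has_xdp_ring. */
	return is_xdp_index && has_xdp_ring;
}

/* New loop body: napi is always deleted, then the WARN_ON is evaluated. */
static bool new_warn_fires(bool is_xdp_index, bool has_xdp_ring)
{
	return !is_xdp_index && has_xdp_ring;
}
```

Reading the removed lines this way, the old WARN_ON sat in a branch where its own expression was always false, so it could not fire for any input. The new one fires exactly when a non-XDP index has an XDP ring attached.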



* [PATCH V1 net 3/3] net: ena: Make missed_tx stat incremental
  2020-08-12 10:10 [PATCH V1 net 0/3] Bug fixes for ENA ethernet driver Shay Agroskin
  2020-08-12 10:10 ` [PATCH V1 net 1/3] net: ena: Prevent reset after device destruction Shay Agroskin
  2020-08-12 10:10 ` [PATCH V1 net 2/3] net: ena: Change WARN_ON expression in ena_del_napi_in_range() Shay Agroskin
@ 2020-08-12 10:10 ` Shay Agroskin
  2 siblings, 0 replies; 8+ messages in thread
From: Shay Agroskin @ 2020-08-12 10:10 UTC (permalink / raw)
  To: davem, netdev
  Cc: dwmw, zorik, matua, saeedb, msw, aliguori, nafea, gtzalik,
	netanel, alisaidi, benh, akiyano, sameehj, ndagan, Shay Agroskin

Most statistics in the ena driver are incremental, meaning that a stat's
value is the sum of all increases done to it since driver/queue
initialization.

This patch makes the missed_tx statistic incremental as well, so that
it behaves like the other statistics.
Also add a comment regarding rx_drops and tx_drops to make it
clearer how these counters are calculated.

Fixes: 11095fdb712b ("net: ena: add statistics for missed tx packets")
Signed-off-by: Shay Agroskin <shayagr@amazon.com>
---
 drivers/net/ethernet/amazon/ena/ena_netdev.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.c b/drivers/net/ethernet/amazon/ena/ena_netdev.c
index 3e12065482c2..7a11a759d053 100644
--- a/drivers/net/ethernet/amazon/ena/ena_netdev.c
+++ b/drivers/net/ethernet/amazon/ena/ena_netdev.c
@@ -3687,7 +3687,7 @@ static int check_missing_comp_in_tx_queue(struct ena_adapter *adapter,
 	}
 
 	u64_stats_update_begin(&tx_ring->syncp);
-	tx_ring->tx_stats.missed_tx = missed_tx;
+	tx_ring->tx_stats.missed_tx += missed_tx;
 	u64_stats_update_end(&tx_ring->syncp);
 
 	return rc;
@@ -4550,6 +4550,9 @@ static void ena_keep_alive_wd(void *adapter_data,
 	tx_drops = ((u64)desc->tx_drops_high << 32) | desc->tx_drops_low;
 
 	u64_stats_update_begin(&adapter->syncp);
+	/* These stats are accumulated by the device, so the counters indicate
+	 * all drops since last reset.
+	 */
 	adapter->dev_stats.rx_drops = rx_drops;
 	adapter->dev_stats.tx_drops = tx_drops;
 	u64_stats_update_end(&adapter->syncp);
-- 
2.28.0
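
The difference between the removed and the added line in the first hunk above can be illustrated with a userspace sketch. The struct and function names are hypothetical, and the per-check values used with it are made-up inputs rather than measurements; the u64_stats_update_begin()/end() bracketing of the real code is left out.

```c
#include <stdint.h>

/* Hypothetical stand-in for the per-ring TX statistics. */
struct model_tx_stats {
	uint64_t missed_tx;
};

/* Before the patch: every periodic check overwrites the counter. */
static void record_missed_overwrite(struct model_tx_stats *s,
				    uint32_t missed_in_this_check)
{
	s->missed_tx = missed_in_this_check;
}

/* After the patch: every periodic check adds to the counter. */
static void record_missed_accumulate(struct model_tx_stats *s,
				     uint32_t missed_in_this_check)
{
	s->missed_tx += missed_in_this_check;
}
```

Feeding three consecutive checks that find 3, 0 and 2 missed completions leaves the overwriting counter at 2 and the accumulating counter at 5. Only the latter matches the "sum of all increases" meaning the commit message gives for the other statistics.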



* Re: [PATCH V1 net 1/3] net: ena: Prevent reset after device destruction
  2020-08-12 10:10 ` [PATCH V1 net 1/3] net: ena: Prevent reset after device destruction Shay Agroskin
@ 2020-08-12 17:52   ` Jakub Kicinski
  2020-08-13 12:51     ` Shay Agroskin
  0 siblings, 1 reply; 8+ messages in thread
From: Jakub Kicinski @ 2020-08-12 17:52 UTC (permalink / raw)
  To: Shay Agroskin
  Cc: davem, netdev, dwmw, zorik, matua, saeedb, msw, aliguori, nafea,
	gtzalik, netanel, alisaidi, benh, akiyano, sameehj, ndagan

On Wed, 12 Aug 2020 13:10:57 +0300 Shay Agroskin wrote:
> This patch also removes the destruction of the timer and reset services
> from ena_remove() since the timer is destroyed by the destruction
> routine and the reset work is handled by this patch.

You'd still have a use after free if the work runs after the device is
removed. I think cancel_work_sync() gotta stay.


* Re: [PATCH V1 net 1/3] net: ena: Prevent reset after device destruction
  2020-08-12 17:52   ` Jakub Kicinski
@ 2020-08-13 12:51     ` Shay Agroskin
  2020-08-13 20:41       ` Jakub Kicinski
  0 siblings, 1 reply; 8+ messages in thread
From: Shay Agroskin @ 2020-08-13 12:51 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: davem, netdev, dwmw, zorik, matua, saeedb, msw, aliguori, nafea,
	gtzalik, netanel, alisaidi, benh, akiyano, sameehj, ndagan

Jakub Kicinski <kuba@kernel.org> writes:

> On Wed, 12 Aug 2020 13:10:57 +0300 Shay Agroskin wrote:
>> This patch also removes the destruction of the timer and reset services
>> from ena_remove() since the timer is destroyed by the destruction
>> routine and the reset work is handled by this patch.
>
> You'd still have a use after free if the work runs after the device is
> removed. I think cancel_work_sync() gotta stay.

Hi, thank you for reviewing the patch. Short answer: I verified 
that the ENA_FLAG_TRIGGER_RESET flag cannot be set after 
ena_destroy_device() finishes its execution.

Long answer:
The ena_destroy_device() function is called with rtnl_lock() held, 
so it cannot run in parallel with the reset function. Also the 
destroy function clears the bit ENA_FLAG_TRIGGER_RESET without 
which the reset function just exits without doing anything.

A problem can then only happen when some routine sets the 
ENA_FLAG_TRIGGER_RESET bit before the reset function is executed, 
the following describes all functions from which this bit can be 
set:

- check_* functions: these functions are called from the timer
  routine which is destroyed in ena_destroy_device(), so by the
  time the rtnl_lock() is released, the bit is cleared

- napi related functions (io_poll, xdp_io_poll, validate_rx_req_id 
  etc.): the napi is de-registered in ena_destroy_device(), so 
  none of these functions is called after destroying the device.

- xmit functions (ena_xmit_common, ena_tx_timeout): the device is 
  brought down and all its RX/TX resources are freed before 
  releasing the lock.

These are all the occurrences I found. Without this bit set, the
reset function would fail the 'if' check in this patch, and exit
without doing anything. Destroying the reset work explicitly
won't help since by the time we do it, the function may already be
executing.


* Re: [PATCH V1 net 1/3] net: ena: Prevent reset after device destruction
  2020-08-13 12:51     ` Shay Agroskin
@ 2020-08-13 20:41       ` Jakub Kicinski
  2020-08-16 10:25         ` Shay Agroskin
  0 siblings, 1 reply; 8+ messages in thread
From: Jakub Kicinski @ 2020-08-13 20:41 UTC (permalink / raw)
  To: Shay Agroskin
  Cc: davem, netdev, dwmw, zorik, matua, saeedb, msw, aliguori, nafea,
	gtzalik, netanel, alisaidi, benh, akiyano, sameehj, ndagan

On Thu, 13 Aug 2020 15:51:46 +0300 Shay Agroskin wrote:
> Long answer:
> The ena_destroy_device() function is called with rtnl_lock() held, 
> so it cannot run in parallel with the reset function. Also the 
> destroy function clears the bit ENA_FLAG_TRIGGER_RESET without 
> which the reset function just exits without doing anything.
> 
> A problem can then only happen when some routine sets the 
> ENA_FLAG_TRIGGER_RESET bit before the reset function is executed, 
> the following describes all functions from which this bit can be 
> set:

ena_fw_reset_device() runs from a workqueue, it can be preempted right
before it tries to take the rtnl_lock. Then after arbitrarily long
delay it will start again, take the lock, and dereference
adapter->flags. But adapter could have been long freed at this point.

Unless you flush a workqueue or cancel_work_sync() you can never be
sure it's not scheduled. And I can only see a flush when module is
unloaded now.
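
The lifetime problem raised in the reply above can be modelled in userspace C. This is a deterministic, single-threaded sketch under stated assumptions: all names are hypothetical, a "freed" marker replaces actually releasing memory, and model_cancel_work() only imitates the "no longer pending" guarantee of cancel_work_sync(), not its waiting for a handler that is already running.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the adapter; "freed" marks released memory. */
struct model_adapter {
	bool freed;
};

/* Hypothetical stand-in for a queued reset work item. */
struct model_work {
	bool pending;                  /* queued but has not run yet */
	struct model_adapter *adapter; /* what the handler will dereference */
};

/* Imitates cancel_work_sync(): afterwards the work is not pending.
 * Returns whether the work was pending, as the kernel function does.
 */
static bool model_cancel_work(struct model_work *w)
{
	bool was_pending = w->pending;

	w->pending = false;
	return was_pending;
}

/* Removal path, with or without cancelling the work first. */
static void model_remove(struct model_work *w, bool cancel_first)
{
	if (cancel_first)
		model_cancel_work(w);
	w->adapter->freed = true; /* models the adapter being released */
}

/* True if the still-pending handler would touch a released adapter. */
static bool model_work_hits_freed_adapter(const struct model_work *w)
{
	return w->pending && w->adapter->freed;
}
```

With cancel_first set to false, a work item that was pending at removal time still points at the released adapter, whatever the state of the reset flag. With cancel_first set to true, nothing is left to run.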


* Re: [PATCH V1 net 1/3] net: ena: Prevent reset after device destruction
  2020-08-13 20:41       ` Jakub Kicinski
@ 2020-08-16 10:25         ` Shay Agroskin
  0 siblings, 0 replies; 8+ messages in thread
From: Shay Agroskin @ 2020-08-16 10:25 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: davem, netdev, dwmw, zorik, matua, saeedb, msw, aliguori, nafea,
	gtzalik, netanel, alisaidi, benh, akiyano, sameehj, ndagan


Jakub Kicinski <kuba@kernel.org> writes:

> On Thu, 13 Aug 2020 15:51:46 +0300 Shay Agroskin wrote:
>> Long answer:
>> The ena_destroy_device() function is called with rtnl_lock() held,
>> so it cannot run in parallel with the reset function. Also the
>> destroy function clears the bit ENA_FLAG_TRIGGER_RESET without
>> which the reset function just exits without doing anything.
>>
>> A problem can then only happen when some routine sets the
>> ENA_FLAG_TRIGGER_RESET bit before the reset function is executed,
>> the following describes all functions from which this bit can be
>> set:
>
> ena_fw_reset_device() runs from a workqueue, it can be preempted right
> before it tries to take the rtnl_lock. Then after arbitrarily long
> delay it will start again, take the lock, and dereference
> adapter->flags. But adapter could have been long freed at this point.

I missed that checking the 'flags' field also requires the
netdev_priv area (the adapter variable) to still be allocated. Thank
you for pointing that out, this indeed needs to be fixed. I'll add
reset work destruction in the next patchset.

Thank you for reviewing it.
>
> Unless you flush a workqueue or cancel_work_sync() you can never be
> sure it's not scheduled. And I can only see a flush when module is
> unloaded now.



end of thread, other threads:[~2020-08-16 10:26 UTC | newest]

Thread overview: 8+ messages
2020-08-12 10:10 [PATCH V1 net 0/3] Bug fixes for ENA ethernet driver Shay Agroskin
2020-08-12 10:10 ` [PATCH V1 net 1/3] net: ena: Prevent reset after device destruction Shay Agroskin
2020-08-12 17:52   ` Jakub Kicinski
2020-08-13 12:51     ` Shay Agroskin
2020-08-13 20:41       ` Jakub Kicinski
2020-08-16 10:25         ` Shay Agroskin
2020-08-12 10:10 ` [PATCH V1 net 2/3] net: ena: Change WARN_ON expression in ena_del_napi_in_range() Shay Agroskin
2020-08-12 10:10 ` [PATCH V1 net 3/3] net: ena: Make missed_tx stat incremental Shay Agroskin
