netdev.vger.kernel.org archive mirror
* [PATCH net 0/5] net: xdp: execute xdp_do_flush() before napi_complete_done()
@ 2023-01-17  9:25 Magnus Karlsson
  2023-01-17  9:25 ` [PATCH net 1/5] qede: " Magnus Karlsson
                   ` (6 more replies)
  0 siblings, 7 replies; 11+ messages in thread
From: Magnus Karlsson @ 2023-01-17  9:25 UTC (permalink / raw)
  To: magnus.karlsson, bjorn, ast, daniel, netdev, jonathan.lemon,
	maciej.fijalkowski, kuba, toke, pabeni, davem, aelior, manishc,
	horatiu.vultur, UNGLinuxDriver, mst, jasowang, ioana.ciornei,
	madalin.bucur
  Cc: Magnus Karlsson, bpf

Make sure that xdp_do_flush() is always executed before
napi_complete_done(). This is important for two reasons. First, a
redirect to an XSKMAP assumes that a call to xdp_do_redirect() from
napi context X on CPU Y will be follwed by a xdp_do_flush() from the
same napi context and CPU. This is not guaranteed if the
napi_complete_done() is executed before xdp_do_flush(), as it tells
the napi logic that it is fine to schedule napi context X on another
CPU. Details from a production system triggering this bug using the
veth driver can be found in [1].

The second reason is that the XDP_REDIRECT logic in itself relies on
being inside a single NAPI instance through to the xdp_do_flush() call
for RCU protection of all in-kernel data structures. Details can be
found in [2].
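
To make the ordering concrete, here is a minimal userspace sketch of a
poll routine that flushes before it completes. It is a model only, not
kernel code: every name prefixed with model_ is hypothetical and merely
stands in for the real xdp_do_flush() and napi_complete_done() calls so
that the call order can be recorded and checked.

```c
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only: these are NOT the kernel APIs, just stand-ins
 * that record the order in which a poll routine calls them.
 */
enum model_event { MODEL_EV_FLUSH, MODEL_EV_COMPLETE };

struct model_trace {
	enum model_event ev[4];
	size_t len;
};

static void model_record(struct model_trace *t, enum model_event e)
{
	if (t->len < sizeof(t->ev) / sizeof(t->ev[0]))
		t->ev[t->len++] = e;
}

/* Hypothetical stand-in for xdp_do_flush() */
static void model_xdp_do_flush(struct model_trace *t)
{
	model_record(t, MODEL_EV_FLUSH);
}

/* Hypothetical stand-in for napi_complete_done() */
static void model_napi_complete_done(struct model_trace *t)
{
	model_record(t, MODEL_EV_COMPLETE);
}

/* Poll routine using the ordering this series enforces:
 * flush pending redirects first, only then signal completion.
 */
static int model_poll(struct model_trace *t, int work_done, int budget,
		      bool redirected)
{
	if (redirected)
		model_xdp_do_flush(t);

	if (work_done < budget)
		model_napi_complete_done(t);

	return work_done;
}

/* Returns true if no flush was recorded after a completion. */
static bool model_order_ok(const struct model_trace *t)
{
	bool completed = false;

	for (size_t i = 0; i < t->len; i++) {
		if (t->ev[i] == MODEL_EV_COMPLETE)
			completed = true;
		else if (completed)
			return false;
	}
	return true;
}
```

Calling model_poll() with a zeroed struct model_trace, work_done below
budget and redirected set to true records the flush first and the
completion second, which is the order model_order_ok() accepts.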

The drivers have only been compile-tested since I do not own any of
the HW below. So if you are a manintainer, please make sure I did not
mess something up. This is a lousy excuse for virtio-net, but
it should be much simpler for the virtio-net maintainers to test this
than for me to find test cases and validation suites, instantiate a
good setup, etc. Michael and Jason can likely do this in minutes.

Note that these were the drivers I found that violated the ordering by
running a simple script and manually checking the ones that came up as
potential offenders. The script was far from perfect, though, and it
can generate false negatives, so there might still be offenders out
there.
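
The script itself is not part of this posting. As an assumption about
what such a check could look like, the sketch below flags a function
body when a completion call appears textually before a flush call. The
function name suspect_flush_after_complete() and the plain substring
matching are hypothetical choices made for illustration only.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical heuristic, not the script used for this series.
 * Flags a function body if a NAPI completion call is followed,
 * later in the text, by an XDP flush call.
 */
static bool suspect_flush_after_complete(const char *body)
{
	/* Also matches wrappers such as virtqueue_napi_complete() */
	const char *complete = strstr(body, "napi_complete");

	if (!complete)
		return false;

	/* Matches both xdp_do_flush() and xdp_do_flush_map() */
	return strstr(complete, "xdp_do_flush") != NULL;
}
```

A purely textual check like this misses drivers that complete or flush
through a helper with a different name or in another function, which is
one way false negatives can arise. It can also flag code where the two
calls sit on different branches, so every hit still needs a manual look.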

[1] https://lore.kernel.org/r/20221220185903.1105011-1-sbohrer@cloudflare.com
[2] https://lore.kernel.org/all/20210624160609.292325-1-toke@redhat.com/

Thanks: Magnus

Magnus Karlsson (5):
  qede: execute xdp_do_flush() before napi_complete_done()
  lan966x: execute xdp_do_flush() before napi_complete_done()
  virtio-net: execute xdp_do_flush() before napi_complete_done()
  dpaa_eth: execute xdp_do_flush() before napi_complete_done()
  dpaa2-eth: execute xdp_do_flush() before napi_complete_done()

 drivers/net/ethernet/freescale/dpaa/dpaa_eth.c        | 6 +++---
 drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c      | 9 ++++++---
 drivers/net/ethernet/microchip/lan966x/lan966x_fdma.c | 6 +++---
 drivers/net/ethernet/qlogic/qede/qede_fp.c            | 7 ++++---
 drivers/net/virtio_net.c                              | 6 +++---
 5 files changed, 19 insertions(+), 15 deletions(-)


base-commit: 87b93b678e95c7d93fe6a55b0e0fbda26d8c7760
--
2.34.1


* [PATCH net 1/5] qede: execute xdp_do_flush() before napi_complete_done()
  2023-01-17  9:25 [PATCH net 0/5] net: xdp: execute xdp_do_flush() before napi_complete_done() Magnus Karlsson
@ 2023-01-17  9:25 ` Magnus Karlsson
  2023-01-17  9:25 ` [PATCH net 2/5] lan966x: " Magnus Karlsson
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 11+ messages in thread
From: Magnus Karlsson @ 2023-01-17  9:25 UTC (permalink / raw)
  To: magnus.karlsson, bjorn, ast, daniel, netdev, jonathan.lemon,
	maciej.fijalkowski, kuba, toke, pabeni, davem, aelior, manishc,
	horatiu.vultur, UNGLinuxDriver, mst, jasowang, ioana.ciornei,
	madalin.bucur
  Cc: bpf

From: Magnus Karlsson <magnus.karlsson@intel.com>

Make sure that xdp_do_flush() is always executed before
napi_complete_done(). This is important for two reasons. First, a
redirect to an XSKMAP assumes that a call to xdp_do_redirect() from
napi context X on CPU Y will be follwed by a xdp_do_flush() from the
same napi context and CPU. This is not guaranteed if the
napi_complete_done() is executed before xdp_do_flush(), as it tells
the napi logic that it is fine to schedule napi context X on another
CPU. Details from a production system triggering this bug using the
veth driver can be found following the first link below.

The second reason is that the XDP_REDIRECT logic in itself relies on
being inside a single NAPI instance through to the xdp_do_flush() call
for RCU protection of all in-kernel data structures. Details can be
found in the second link below.

Fixes: d1b25b79e162b ("qede: add .ndo_xdp_xmit() and XDP_REDIRECT support")
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Link: https://lore.kernel.org/r/20221220185903.1105011-1-sbohrer@cloudflare.com
Link: https://lore.kernel.org/all/20210624160609.292325-1-toke@redhat.com/
---
 drivers/net/ethernet/qlogic/qede/qede_fp.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/qlogic/qede/qede_fp.c b/drivers/net/ethernet/qlogic/qede/qede_fp.c
index 7c2af482192d..cb1746bc0e0c 100644
--- a/drivers/net/ethernet/qlogic/qede/qede_fp.c
+++ b/drivers/net/ethernet/qlogic/qede/qede_fp.c
@@ -1438,6 +1438,10 @@ int qede_poll(struct napi_struct *napi, int budget)
 	rx_work_done = (likely(fp->type & QEDE_FASTPATH_RX) &&
 			qede_has_rx_work(fp->rxq)) ?
 			qede_rx_int(fp, budget) : 0;
+
+	if (fp->xdp_xmit & QEDE_XDP_REDIRECT)
+		xdp_do_flush();
+
 	/* Handle case where we are called by netpoll with a budget of 0 */
 	if (rx_work_done < budget || !budget) {
 		if (!qede_poll_is_more_work(fp)) {
@@ -1457,9 +1461,6 @@ int qede_poll(struct napi_struct *napi, int budget)
 		qede_update_tx_producer(fp->xdp_tx);
 	}
 
-	if (fp->xdp_xmit & QEDE_XDP_REDIRECT)
-		xdp_do_flush_map();
-
 	return rx_work_done;
 }
 
-- 
2.34.1



* [PATCH net 2/5] lan966x: execute xdp_do_flush() before napi_complete_done()
  2023-01-17  9:25 [PATCH net 0/5] net: xdp: execute xdp_do_flush() before napi_complete_done() Magnus Karlsson
  2023-01-17  9:25 ` [PATCH net 1/5] qede: " Magnus Karlsson
@ 2023-01-17  9:25 ` Magnus Karlsson
  2023-01-17 11:53   ` Steen Hegelund
  2023-01-17  9:25 ` [PATCH net 3/5] virtio-net: " Magnus Karlsson
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 11+ messages in thread
From: Magnus Karlsson @ 2023-01-17  9:25 UTC (permalink / raw)
  To: magnus.karlsson, bjorn, ast, daniel, netdev, jonathan.lemon,
	maciej.fijalkowski, kuba, toke, pabeni, davem, aelior, manishc,
	horatiu.vultur, UNGLinuxDriver, mst, jasowang, ioana.ciornei,
	madalin.bucur
  Cc: bpf

From: Magnus Karlsson <magnus.karlsson@intel.com>

Make sure that xdp_do_flush() is always executed before
napi_complete_done(). This is important for two reasons. First, a
redirect to an XSKMAP assumes that a call to xdp_do_redirect() from
napi context X on CPU Y will be follwed by a xdp_do_flush() from the
same napi context and CPU. This is not guaranteed if the
napi_complete_done() is executed before xdp_do_flush(), as it tells
the napi logic that it is fine to schedule napi context X on another
CPU. Details from a production system triggering this bug using the
veth driver can be found following the first link below.

The second reason is that the XDP_REDIRECT logic in itself relies on
being inside a single NAPI instance through to the xdp_do_flush() call
for RCU protection of all in-kernel data structures. Details can be
found in the second link below.

Fixes: a825b611c7c1 ("net: lan966x: Add support for XDP_REDIRECT")
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Link: https://lore.kernel.org/r/20221220185903.1105011-1-sbohrer@cloudflare.com
Link: https://lore.kernel.org/all/20210624160609.292325-1-toke@redhat.com/
---
 drivers/net/ethernet/microchip/lan966x/lan966x_fdma.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/microchip/lan966x/lan966x_fdma.c b/drivers/net/ethernet/microchip/lan966x/lan966x_fdma.c
index 5314c064ceae..55b484b10562 100644
--- a/drivers/net/ethernet/microchip/lan966x/lan966x_fdma.c
+++ b/drivers/net/ethernet/microchip/lan966x/lan966x_fdma.c
@@ -608,12 +608,12 @@ static int lan966x_fdma_napi_poll(struct napi_struct *napi, int weight)
 		lan966x_fdma_rx_reload(rx);
 	}
 
-	if (counter < weight && napi_complete_done(napi, counter))
-		lan_wr(0xff, lan966x, FDMA_INTR_DB_ENA);
-
 	if (redirect)
 		xdp_do_flush();
 
+	if (counter < weight && napi_complete_done(napi, counter))
+		lan_wr(0xff, lan966x, FDMA_INTR_DB_ENA);
+
 	return counter;
 }
 
-- 
2.34.1



* [PATCH net 3/5] virtio-net: execute xdp_do_flush() before napi_complete_done()
  2023-01-17  9:25 [PATCH net 0/5] net: xdp: execute xdp_do_flush() before napi_complete_done() Magnus Karlsson
  2023-01-17  9:25 ` [PATCH net 1/5] qede: " Magnus Karlsson
  2023-01-17  9:25 ` [PATCH net 2/5] lan966x: " Magnus Karlsson
@ 2023-01-17  9:25 ` Magnus Karlsson
  2023-01-17  9:25 ` [PATCH net 4/5] dpaa_eth: " Magnus Karlsson
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 11+ messages in thread
From: Magnus Karlsson @ 2023-01-17  9:25 UTC (permalink / raw)
  To: magnus.karlsson, bjorn, ast, daniel, netdev, jonathan.lemon,
	maciej.fijalkowski, kuba, toke, pabeni, davem, aelior, manishc,
	horatiu.vultur, UNGLinuxDriver, mst, jasowang, ioana.ciornei,
	madalin.bucur
  Cc: bpf

From: Magnus Karlsson <magnus.karlsson@intel.com>

Make sure that xdp_do_flush() is always executed before
napi_complete_done(). This is important for two reasons. First, a
redirect to an XSKMAP assumes that a call to xdp_do_redirect() from
napi context X on CPU Y will be follwed by a xdp_do_flush() from the
same napi context and CPU. This is not guaranteed if the
napi_complete_done() is executed before xdp_do_flush(), as it tells
the napi logic that it is fine to schedule napi context X on another
CPU. Details from a production system triggering this bug using the
veth driver can be found following the first link below.

The second reason is that the XDP_REDIRECT logic in itself relies on
being inside a single NAPI instance through to the xdp_do_flush() call
for RCU protection of all in-kernel data structures. Details can be
found in the second link below.

Fixes: 186b3c998c50 ("virtio-net: support XDP_REDIRECT")
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Link: https://lore.kernel.org/r/20221220185903.1105011-1-sbohrer@cloudflare.com
Link: https://lore.kernel.org/all/20210624160609.292325-1-toke@redhat.com/
---
 drivers/net/virtio_net.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 7723b2a49d8e..bc4d79fe3c83 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -1677,13 +1677,13 @@ static int virtnet_poll(struct napi_struct *napi, int budget)
 
 	received = virtnet_receive(rq, budget, &xdp_xmit);
 
+	if (xdp_xmit & VIRTIO_XDP_REDIR)
+		xdp_do_flush();
+
 	/* Out of packets? */
 	if (received < budget)
 		virtqueue_napi_complete(napi, rq->vq, received);
 
-	if (xdp_xmit & VIRTIO_XDP_REDIR)
-		xdp_do_flush();
-
 	if (xdp_xmit & VIRTIO_XDP_TX) {
 		sq = virtnet_xdp_get_sq(vi);
 		if (virtqueue_kick_prepare(sq->vq) && virtqueue_notify(sq->vq)) {
-- 
2.34.1



* [PATCH net 4/5] dpaa_eth: execute xdp_do_flush() before napi_complete_done()
  2023-01-17  9:25 [PATCH net 0/5] net: xdp: execute xdp_do_flush() before napi_complete_done() Magnus Karlsson
                   ` (2 preceding siblings ...)
  2023-01-17  9:25 ` [PATCH net 3/5] virtio-net: " Magnus Karlsson
@ 2023-01-17  9:25 ` Magnus Karlsson
  2023-01-17  9:25 ` [PATCH net 5/5] dpaa2-eth: " Magnus Karlsson
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 11+ messages in thread
From: Magnus Karlsson @ 2023-01-17  9:25 UTC (permalink / raw)
  To: magnus.karlsson, bjorn, ast, daniel, netdev, jonathan.lemon,
	maciej.fijalkowski, kuba, toke, pabeni, davem, aelior, manishc,
	horatiu.vultur, UNGLinuxDriver, mst, jasowang, ioana.ciornei,
	madalin.bucur
  Cc: bpf

From: Magnus Karlsson <magnus.karlsson@intel.com>

Make sure that xdp_do_flush() is always executed before
napi_complete_done(). This is important for two reasons. First, a
redirect to an XSKMAP assumes that a call to xdp_do_redirect() from
napi context X on CPU Y will be follwed by a xdp_do_flush() from the
same napi context and CPU. This is not guaranteed if the
napi_complete_done() is executed before xdp_do_flush(), as it tells
the napi logic that it is fine to schedule napi context X on another
CPU. Details from a production system triggering this bug using the
veth driver can be found following the first link below.

The second reason is that the XDP_REDIRECT logic in itself relies on
being inside a single NAPI instance through to the xdp_do_flush() call
for RCU protection of all in-kernel data structures. Details can be
found in the second link below.

Fixes: a1e031ffb422 ("dpaa_eth: add XDP_REDIRECT support")
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Link: https://lore.kernel.org/r/20221220185903.1105011-1-sbohrer@cloudflare.com
Link: https://lore.kernel.org/all/20210624160609.292325-1-toke@redhat.com/
---
 drivers/net/ethernet/freescale/dpaa/dpaa_eth.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c b/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c
index 3f8032947d86..027fff9f7db0 100644
--- a/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c
+++ b/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c
@@ -2410,6 +2410,9 @@ static int dpaa_eth_poll(struct napi_struct *napi, int budget)
 
 	cleaned = qman_p_poll_dqrr(np->p, budget);
 
+	if (np->xdp_act & XDP_REDIRECT)
+		xdp_do_flush();
+
 	if (cleaned < budget) {
 		napi_complete_done(napi, cleaned);
 		qman_p_irqsource_add(np->p, QM_PIRQ_DQRI);
@@ -2417,9 +2420,6 @@ static int dpaa_eth_poll(struct napi_struct *napi, int budget)
 		qman_p_irqsource_add(np->p, QM_PIRQ_DQRI);
 	}
 
-	if (np->xdp_act & XDP_REDIRECT)
-		xdp_do_flush();
-
 	return cleaned;
 }
 
-- 
2.34.1



* [PATCH net 5/5] dpaa2-eth: execute xdp_do_flush() before napi_complete_done()
  2023-01-17  9:25 [PATCH net 0/5] net: xdp: execute xdp_do_flush() before napi_complete_done() Magnus Karlsson
                   ` (3 preceding siblings ...)
  2023-01-17  9:25 ` [PATCH net 4/5] dpaa_eth: " Magnus Karlsson
@ 2023-01-17  9:25 ` Magnus Karlsson
  2023-01-17 10:12 ` [PATCH net 0/5] net: xdp: " Michael S. Tsirkin
  2023-01-17 11:13 ` Toke Høiland-Jørgensen
  6 siblings, 0 replies; 11+ messages in thread
From: Magnus Karlsson @ 2023-01-17  9:25 UTC (permalink / raw)
  To: magnus.karlsson, bjorn, ast, daniel, netdev, jonathan.lemon,
	maciej.fijalkowski, kuba, toke, pabeni, davem, aelior, manishc,
	horatiu.vultur, UNGLinuxDriver, mst, jasowang, ioana.ciornei,
	madalin.bucur
  Cc: bpf

From: Magnus Karlsson <magnus.karlsson@intel.com>

Make sure that xdp_do_flush() is always executed before
napi_complete_done(). This is important for two reasons. First, a
redirect to an XSKMAP assumes that a call to xdp_do_redirect() from
napi context X on CPU Y will be follwed by a xdp_do_flush() from the
same napi context and CPU. This is not guaranteed if the
napi_complete_done() is executed before xdp_do_flush(), as it tells
the napi logic that it is fine to schedule napi context X on another
CPU. Details from a production system triggering this bug using the
veth driver can be found following the first link below.

The second reason is that the XDP_REDIRECT logic in itself relies on
being inside a single NAPI instance through to the xdp_do_flush() call
for RCU protection of all in-kernel data structures. Details can be
found in the second link below.

Fixes: d678be1dc1ec ("dpaa2-eth: add XDP_REDIRECT support")
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Link: https://lore.kernel.org/r/20221220185903.1105011-1-sbohrer@cloudflare.com
Link: https://lore.kernel.org/all/20210624160609.292325-1-toke@redhat.com/
---
 drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c b/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c
index 0c35abb7d065..2e79d18fc3c7 100644
--- a/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c
+++ b/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c
@@ -1993,10 +1993,15 @@ static int dpaa2_eth_poll(struct napi_struct *napi, int budget)
 		if (rx_cleaned >= budget ||
 		    txconf_cleaned >= DPAA2_ETH_TXCONF_PER_NAPI) {
 			work_done = budget;
+			if (ch->xdp.res & XDP_REDIRECT)
+				xdp_do_flush();
 			goto out;
 		}
 	} while (store_cleaned);
 
+	if (ch->xdp.res & XDP_REDIRECT)
+		xdp_do_flush();
+
 	/* Update NET DIM with the values for this CDAN */
 	dpaa2_io_update_net_dim(ch->dpio, ch->stats.frames_per_cdan,
 				ch->stats.bytes_per_cdan);
@@ -2032,9 +2037,7 @@ static int dpaa2_eth_poll(struct napi_struct *napi, int budget)
 		txc_fq->dq_bytes = 0;
 	}
 
-	if (ch->xdp.res & XDP_REDIRECT)
-		xdp_do_flush_map();
-	else if (rx_cleaned && ch->xdp.res & XDP_TX)
+	if (rx_cleaned && ch->xdp.res & XDP_TX)
 		dpaa2_eth_xdp_tx_flush(priv, ch, &priv->fq[flowid]);
 
 	return work_done;
-- 
2.34.1



* Re: [PATCH net 0/5] net: xdp: execute xdp_do_flush() before napi_complete_done()
  2023-01-17  9:25 [PATCH net 0/5] net: xdp: execute xdp_do_flush() before napi_complete_done() Magnus Karlsson
                   ` (4 preceding siblings ...)
  2023-01-17  9:25 ` [PATCH net 5/5] dpaa2-eth: " Magnus Karlsson
@ 2023-01-17 10:12 ` Michael S. Tsirkin
  2023-01-17 10:40   ` Magnus Karlsson
  2023-01-17 11:13 ` Toke Høiland-Jørgensen
  6 siblings, 1 reply; 11+ messages in thread
From: Michael S. Tsirkin @ 2023-01-17 10:12 UTC (permalink / raw)
  To: Magnus Karlsson
  Cc: magnus.karlsson, bjorn, ast, daniel, netdev, jonathan.lemon,
	maciej.fijalkowski, kuba, toke, pabeni, davem, aelior, manishc,
	horatiu.vultur, UNGLinuxDriver, jasowang, ioana.ciornei,
	madalin.bucur, bpf

On Tue, Jan 17, 2023 at 10:25:28AM +0100, Magnus Karlsson wrote:
> Make sure that xdp_do_flush() is always executed before
> napi_complete_done(). This is important for two reasons. First, a
> redirect to an XSKMAP assumes that a call to xdp_do_redirect() from
> napi context X on CPU Y will be follwed by a xdp_do_flush() from the
> same napi context and CPU. This is not guaranteed if the
> napi_complete_done() is executed before xdp_do_flush(), as it tells
> the napi logic that it is fine to schedule napi context X on another
> CPU. Details from a production system triggering this bug using the
> veth driver can be found in [1].
> 
> The second reason is that the XDP_REDIRECT logic in itself relies on
> being inside a single NAPI instance through to the xdp_do_flush() call
> for RCU protection of all in-kernel data structures. Details can be
> found in [2].
> 
> The drivers have only been compile-tested since I do not own any of
> the HW below. So if you are a manintainer, please make sure I did not
> mess something up. This is a lousy excuse for virtio-net though, but
> it should be much simpler for the vitio-net maintainers to test this,
> than me trying to find test cases, validation suites, instantiating a
> good setup, etc. Michael and Jason can likely do this in minutes.

This kind of thing doesn't scale though. There are more contributors
than maintainers. Also, I am not 100% sure what kind of XDP workload
I would need for a good test.

> 
> Note that these were the drivers I found that violated the ordering by
> running a simple script and manually checking the ones that came up as
> potential offenders. But the script was not perfect in any way. There
> might still be offenders out there, since the script can generate
> false negatives.
> 
> [1] https://lore.kernel.org/r/20221220185903.1105011-1-sbohrer@cloudflare.com
> [2] https://lore.kernel.org/all/20210624160609.292325-1-toke@redhat.com/
> 
> Thanks: Magnus
> 
> Magnus Karlsson (5):
>   qede: execute xdp_do_flush() before napi_complete_done()
>   lan966x: execute xdp_do_flush() before napi_complete_done()
>   virtio-net: execute xdp_do_flush() before napi_complete_done()
>   dpaa_eth: execute xdp_do_flush() before napi_complete_done()
>   dpaa2-eth: execute xdp_do_flush() before napi_complete_done()
> 
>  drivers/net/ethernet/freescale/dpaa/dpaa_eth.c        | 6 +++---
>  drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c      | 9 ++++++---
>  drivers/net/ethernet/microchip/lan966x/lan966x_fdma.c | 6 +++---
>  drivers/net/ethernet/qlogic/qede/qede_fp.c            | 7 ++++---
>  drivers/net/virtio_net.c                              | 6 +++---
>  5 files changed, 19 insertions(+), 15 deletions(-)
> 
> 
> base-commit: 87b93b678e95c7d93fe6a55b0e0fbda26d8c7760
> --
> 2.34.1



* Re: [PATCH net 0/5] net: xdp: execute xdp_do_flush() before napi_complete_done()
  2023-01-17 10:12 ` [PATCH net 0/5] net: xdp: " Michael S. Tsirkin
@ 2023-01-17 10:40   ` Magnus Karlsson
  0 siblings, 0 replies; 11+ messages in thread
From: Magnus Karlsson @ 2023-01-17 10:40 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: magnus.karlsson, bjorn, ast, daniel, netdev, jonathan.lemon,
	maciej.fijalkowski, kuba, toke, pabeni, davem, aelior, manishc,
	horatiu.vultur, UNGLinuxDriver, jasowang, ioana.ciornei,
	madalin.bucur, bpf

On Tue, Jan 17, 2023 at 11:12 AM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Tue, Jan 17, 2023 at 10:25:28AM +0100, Magnus Karlsson wrote:
> > Make sure that xdp_do_flush() is always executed before
> > napi_complete_done(). This is important for two reasons. First, a
> > redirect to an XSKMAP assumes that a call to xdp_do_redirect() from
> > napi context X on CPU Y will be follwed by a xdp_do_flush() from the
> > same napi context and CPU. This is not guaranteed if the
> > napi_complete_done() is executed before xdp_do_flush(), as it tells
> > the napi logic that it is fine to schedule napi context X on another
> > CPU. Details from a production system triggering this bug using the
> > veth driver can be found in [1].
> >
> > The second reason is that the XDP_REDIRECT logic in itself relies on
> > being inside a single NAPI instance through to the xdp_do_flush() call
> > for RCU protection of all in-kernel data structures. Details can be
> > found in [2].
> >
> > The drivers have only been compile-tested since I do not own any of
> > the HW below. So if you are a manintainer, please make sure I did not
> > mess something up. This is a lousy excuse for virtio-net though, but
> > it should be much simpler for the vitio-net maintainers to test this,
> > than me trying to find test cases, validation suites, instantiating a
> > good setup, etc. Michael and Jason can likely do this in minutes.
>
> This kind of thing doesn't scale though. There are more contributors
> than maintainers. Also, I am not 100% sure what kind of XDP workload
> do I need to be a good test.

True. Is there a smoke test that could be run to check that normal
traffic is not affected? Just so we know that it works as expected.
Then we could move on to try out XDP_REDIRECT for virtio. Anyone out
there that knows of something existing that could be used for this?
Just note that reproducing the issue seems to be challenging as 10
systems running a production workload only experienced a single
failure per night due to this [1]. So I suggest we just go with
checking that existing functionality works as expected.

> >
> > Note that these were the drivers I found that violated the ordering by
> > running a simple script and manually checking the ones that came up as
> > potential offenders. But the script was not perfect in any way. There
> > might still be offenders out there, since the script can generate
> > false negatives.
> >
> > [1] https://lore.kernel.org/r/20221220185903.1105011-1-sbohrer@cloudflare.com
> > [2] https://lore.kernel.org/all/20210624160609.292325-1-toke@redhat.com/
> >
> > Thanks: Magnus
> >
> > Magnus Karlsson (5):
> >   qede: execute xdp_do_flush() before napi_complete_done()
> >   lan966x: execute xdp_do_flush() before napi_complete_done()
> >   virtio-net: execute xdp_do_flush() before napi_complete_done()
> >   dpaa_eth: execute xdp_do_flush() before napi_complete_done()
> >   dpaa2-eth: execute xdp_do_flush() before napi_complete_done()
> >
> >  drivers/net/ethernet/freescale/dpaa/dpaa_eth.c        | 6 +++---
> >  drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c      | 9 ++++++---
> >  drivers/net/ethernet/microchip/lan966x/lan966x_fdma.c | 6 +++---
> >  drivers/net/ethernet/qlogic/qede/qede_fp.c            | 7 ++++---
> >  drivers/net/virtio_net.c                              | 6 +++---
> >  5 files changed, 19 insertions(+), 15 deletions(-)
> >
> >
> > base-commit: 87b93b678e95c7d93fe6a55b0e0fbda26d8c7760
> > --
> > 2.34.1
>


* Re: [PATCH net 0/5] net: xdp: execute xdp_do_flush() before napi_complete_done()
  2023-01-17  9:25 [PATCH net 0/5] net: xdp: execute xdp_do_flush() before napi_complete_done() Magnus Karlsson
                   ` (5 preceding siblings ...)
  2023-01-17 10:12 ` [PATCH net 0/5] net: xdp: " Michael S. Tsirkin
@ 2023-01-17 11:13 ` Toke Høiland-Jørgensen
  2023-01-17 11:34   ` Magnus Karlsson
  6 siblings, 1 reply; 11+ messages in thread
From: Toke Høiland-Jørgensen @ 2023-01-17 11:13 UTC (permalink / raw)
  To: Magnus Karlsson, magnus.karlsson, bjorn, ast, daniel, netdev,
	jonathan.lemon, maciej.fijalkowski, kuba, pabeni, davem, aelior,
	manishc, horatiu.vultur, UNGLinuxDriver, mst, jasowang,
	ioana.ciornei, madalin.bucur
  Cc: Magnus Karlsson, bpf

Magnus Karlsson <magnus.karlsson@gmail.com> writes:

> Make sure that xdp_do_flush() is always executed before
> napi_complete_done(). This is important for two reasons. First, a
> redirect to an XSKMAP assumes that a call to xdp_do_redirect() from
> napi context X on CPU Y will be follwed by a xdp_do_flush() from the

Typo in 'followed' here (and in all the copy-pasted commit messages).

> same napi context and CPU. This is not guaranteed if the
> napi_complete_done() is executed before xdp_do_flush(), as it tells
> the napi logic that it is fine to schedule napi context X on another
> CPU. Details from a production system triggering this bug using the
> veth driver can be found in [1].
>
> The second reason is that the XDP_REDIRECT logic in itself relies on
> being inside a single NAPI instance through to the xdp_do_flush() call
> for RCU protection of all in-kernel data structures. Details can be
> found in [2].
>
> The drivers have only been compile-tested since I do not own any of
> the HW below. So if you are a manintainer, please make sure I did not

And another typo in 'maintainer' here.

> mess something up. This is a lousy excuse for virtio-net though, but
> it should be much simpler for the vitio-net maintainers to test this,
> than me trying to find test cases, validation suites, instantiating a
> good setup, etc. Michael and Jason can likely do this in minutes.
>
> Note that these were the drivers I found that violated the ordering by
> running a simple script and manually checking the ones that came up as
> potential offenders. But the script was not perfect in any way. There
> might still be offenders out there, since the script can generate
> false negatives.
>
> [1] https://lore.kernel.org/r/20221220185903.1105011-1-sbohrer@cloudflare.com
> [2] https://lore.kernel.org/all/20210624160609.292325-1-toke@redhat.com/

Otherwise LGTM!

For the series:

Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>



* Re: [PATCH net 0/5] net: xdp: execute xdp_do_flush() before napi_complete_done()
  2023-01-17 11:13 ` Toke Høiland-Jørgensen
@ 2023-01-17 11:34   ` Magnus Karlsson
  0 siblings, 0 replies; 11+ messages in thread
From: Magnus Karlsson @ 2023-01-17 11:34 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen
  Cc: magnus.karlsson, bjorn, ast, daniel, netdev, jonathan.lemon,
	maciej.fijalkowski, kuba, pabeni, davem, aelior, manishc,
	horatiu.vultur, UNGLinuxDriver, mst, jasowang, ioana.ciornei,
	madalin.bucur, bpf

On Tue, Jan 17, 2023 at 12:13 PM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>
> Magnus Karlsson <magnus.karlsson@gmail.com> writes:
>
> > Make sure that xdp_do_flush() is always executed before
> > napi_complete_done(). This is important for two reasons. First, a
> > redirect to an XSKMAP assumes that a call to xdp_do_redirect() from
> > napi context X on CPU Y will be follwed by a xdp_do_flush() from the
>
> Typo in 'followed' here (and in all the copy-pasted commit messages).
>
> > same napi context and CPU. This is not guaranteed if the
> > napi_complete_done() is executed before xdp_do_flush(), as it tells
> > the napi logic that it is fine to schedule napi context X on another
> > CPU. Details from a production system triggering this bug using the
> > veth driver can be found in [1].
> >
> > The second reason is that the XDP_REDIRECT logic in itself relies on
> > being inside a single NAPI instance through to the xdp_do_flush() call
> > for RCU protection of all in-kernel data structures. Details can be
> > found in [2].
> >
> > The drivers have only been compile-tested since I do not own any of
> > the HW below. So if you are a manintainer, please make sure I did not
>
> And another typo in 'maintainer' here.

Thanks for spotting. Will fix these spelling errors in a v2.

> > mess something up. This is a lousy excuse for virtio-net though, but
> > it should be much simpler for the vitio-net maintainers to test this,
> > than me trying to find test cases, validation suites, instantiating a
> > good setup, etc. Michael and Jason can likely do this in minutes.
> >
> > Note that these were the drivers I found that violated the ordering by
> > running a simple script and manually checking the ones that came up as
> > potential offenders. But the script was not perfect in any way. There
> > might still be offenders out there, since the script can generate
> > false negatives.
> >
> > [1] https://lore.kernel.org/r/20221220185903.1105011-1-sbohrer@cloudflare.com
> > [2] https://lore.kernel.org/all/20210624160609.292325-1-toke@redhat.com/
>
> Otherwise LGTM!
>
> For the series:
>
> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
>


* Re: [PATCH net 2/5] lan966x: execute xdp_do_flush() before napi_complete_done()
  2023-01-17  9:25 ` [PATCH net 2/5] lan966x: " Magnus Karlsson
@ 2023-01-17 11:53   ` Steen Hegelund
  0 siblings, 0 replies; 11+ messages in thread
From: Steen Hegelund @ 2023-01-17 11:53 UTC (permalink / raw)
  To: Magnus Karlsson, magnus.karlsson, bjorn, ast, daniel, netdev,
	jonathan.lemon, maciej.fijalkowski, kuba, toke, pabeni, davem,
	aelior, manishc, horatiu.vultur, UNGLinuxDriver, mst, jasowang,
	ioana.ciornei, madalin.bucur
  Cc: bpf

Hi Magnus,

This looks good to me.

Acked-by: Steen Hegelund <Steen.Hegelund@microchip.com>

BR
Steen

On Tue, 2023-01-17 at 10:25 +0100, Magnus Karlsson wrote:
> From: Magnus Karlsson <magnus.karlsson@intel.com>
> 
> Make sure that xdp_do_flush() is always executed before
> napi_complete_done(). This is important for two reasons. First, a
> redirect to an XSKMAP assumes that a call to xdp_do_redirect() from
> napi context X on CPU Y will be follwed by a xdp_do_flush() from the
> same napi context and CPU. This is not guaranteed if the
> napi_complete_done() is executed before xdp_do_flush(), as it tells
> the napi logic that it is fine to schedule napi context X on another
> CPU. Details from a production system triggering this bug using the
> veth driver can be found following the first link below.
> 
> The second reason is that the XDP_REDIRECT logic in itself relies on
> being inside a single NAPI instance through to the xdp_do_flush() call
> for RCU protection of all in-kernel data structures. Details can be
> found in the second link below.
> 
> Fixes: a825b611c7c1 ("net: lan966x: Add support for XDP_REDIRECT")
> Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
> Link: https://lore.kernel.org/r/20221220185903.1105011-1-sbohrer@cloudflare.com
> Link: https://lore.kernel.org/all/20210624160609.292325-1-toke@redhat.com/
> ---
>  drivers/net/ethernet/microchip/lan966x/lan966x_fdma.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/net/ethernet/microchip/lan966x/lan966x_fdma.c
> b/drivers/net/ethernet/microchip/lan966x/lan966x_fdma.c
> index 5314c064ceae..55b484b10562 100644
> --- a/drivers/net/ethernet/microchip/lan966x/lan966x_fdma.c
> +++ b/drivers/net/ethernet/microchip/lan966x/lan966x_fdma.c
> @@ -608,12 +608,12 @@ static int lan966x_fdma_napi_poll(struct napi_struct *napi, int weight)
>                 lan966x_fdma_rx_reload(rx);
>         }
> 
> -       if (counter < weight && napi_complete_done(napi, counter))
> -               lan_wr(0xff, lan966x, FDMA_INTR_DB_ENA);
> -
>         if (redirect)
>                 xdp_do_flush();
> 
> +       if (counter < weight && napi_complete_done(napi, counter))
> +               lan_wr(0xff, lan966x, FDMA_INTR_DB_ENA);
> +
>         return counter;
>  }
> 
> --
> 2.34.1
> 



