All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH bpf 0/4] Consistent sendmsg error reporting in AF_XDP
@ 2018-07-11  8:12 Magnus Karlsson
  2018-07-11  8:12 ` [PATCH bpf 1/4] xsk: do not return ENXIO from TX copy mode Magnus Karlsson
                   ` (5 more replies)
  0 siblings, 6 replies; 7+ messages in thread
From: Magnus Karlsson @ 2018-07-11  8:12 UTC (permalink / raw)
  To: magnus.karlsson, bjorn.topel, ast, daniel, netdev, eric.dumazet
  Cc: qi.z.zhang, pavel

This patch set adjusts the AF_XDP TX error reporting so that it becomes
consistent between copy mode and zero-copy. First some background:

Copy-mode for TX uses the SKB path in which the action of sending the
packet is performed from process context using the sendmsg
syscall. Completions are usually done asynchronously from NAPI mode by
using a TX interrupt. In this mode, send errors can be returned back
through the syscall.

In zero-copy mode both the sending of the packet and the completions
are done asynchronously from NAPI mode for performance reasons. In
this mode, the sendmsg syscall only makes sure that the TX NAPI loop
will be run that performs both the actions of sending and
completing. In this mode it is therefore not possible to return errors
through the sendmsg syscall as the sending is done from the NAPI
loop. Note that it is possible to implement a synchronous send with
our API, but in our benchmarks that made the TX performance drop by
nearly half due to synchronization requirements and cache line
bouncing. But for some netdevs this might be preferable so let us
leave it up to the implementation to decide.

The problem is that the current code base returns some errors in
copy-mode that are not possible to return in zero-copy mode. This
patch set aligns them so that the two modes always return the same
error code. We achieve this by removing some of the errors returned by
sendmsg in copy-mode (and in one case adding an error message for
zero-copy mode) and offering alternative error detection methods that
are consistent between the two modes.

The structure of the patch set is as follows:

Patch 1: removes the ENXIO return code from copy-mode when someone has
forcefully changed the number of queues on the device so that the
queue bound to the socket is no longer available. Just silently stop
sending anything as in zero-copy mode.

Patch 2: stop returning EAGAIN in copy mode when the completion queue
is full as zero-copy does not do this. Instead this situation can be
detected by comparing the head and tail pointers of the completion
queue in both modes. In any case, EAGAIN was not the correct error code
here since no amount of calling sendmsg will solve the problem. Only
consuming one or more messages on the completion queue will fix this.

Patch 3: Always return ENOBUFS from sendmsg if there is no TX queue
configured. This was not the case for zero-copy mode.

Patch 4: stop returning EMSGSIZE when the size of the packet is larger
than the MTU. Just send it to the device so that it will drop it as in
zero-copy mode.

Note that copy-mode can still return EAGAIN in certain circumstances,
but as these conditions cannot occur in zero-copy mode it is fine for
copy-mode to return them.

Question: For patch 4, is it fine to let the device drop a packet
that is greater than its MTU, or should I have a check for this in
both zero-copy and copy-mode and drop the packet up in the AF_XDP
code? The drawback of this is that it will have performance
implications for zero-copy mode as we will touch one more cache line
with dev->mtu.

Thanks: Magnus

Magnus Karlsson (4):
  xsk: do not return ENXIO from TX copy mode
  xsk: do not return EAGAIN from sendmsg when completion queue is full
  xsk: always return ENOBUFS from sendmsg if there is no TX queue
  xsk: do not return EMSGSIZE in copy mode for packets larger than MTU

 net/xdp/xsk.c | 20 +++++---------------
 1 file changed, 5 insertions(+), 15 deletions(-)

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH bpf 1/4] xsk: do not return ENXIO from TX copy mode
  2018-07-11  8:12 [PATCH bpf 0/4] Consistent sendmsg error reporting in AF_XDP Magnus Karlsson
@ 2018-07-11  8:12 ` Magnus Karlsson
  2018-07-11  8:12 ` [PATCH bpf 2/4] xsk: do not return EAGAIN from sendmsg when completion queue is full Magnus Karlsson
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Magnus Karlsson @ 2018-07-11  8:12 UTC (permalink / raw)
  To: magnus.karlsson, bjorn.topel, ast, daniel, netdev, eric.dumazet
  Cc: qi.z.zhang, pavel

This patch removes the ENXIO return code from TX copy-mode when
someone has forcefully changed the number of queues on the device so
that the queue bound to the socket is no longer available. Just
silently stop sending anything as in zero-copy mode so the error
reporting gets consistent between the two modes.

Fixes: 35fcde7f8deb ("xsk: support for Tx")
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
---
 net/xdp/xsk.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
index 7d220cbd09b6..08d09115093e 100644
--- a/net/xdp/xsk.c
+++ b/net/xdp/xsk.c
@@ -244,10 +244,8 @@ static int xsk_generic_xmit(struct sock *sk, struct msghdr *m,
 			goto out;
 		}
 
-		if (xs->queue_id >= xs->dev->real_num_tx_queues) {
-			err = -ENXIO;
+		if (xs->queue_id >= xs->dev->real_num_tx_queues)
 			goto out;
-		}
 
 		skb = sock_alloc_send_skb(sk, len, 1, &err);
 		if (unlikely(!skb)) {
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH bpf 2/4] xsk: do not return EAGAIN from sendmsg when completion queue is full
  2018-07-11  8:12 [PATCH bpf 0/4] Consistent sendmsg error reporting in AF_XDP Magnus Karlsson
  2018-07-11  8:12 ` [PATCH bpf 1/4] xsk: do not return ENXIO from TX copy mode Magnus Karlsson
@ 2018-07-11  8:12 ` Magnus Karlsson
  2018-07-11  8:12 ` [PATCH bpf 3/4] xsk: always return ENOBUFS from sendmsg if there is no TX queue Magnus Karlsson
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Magnus Karlsson @ 2018-07-11  8:12 UTC (permalink / raw)
  To: magnus.karlsson, bjorn.topel, ast, daniel, netdev, eric.dumazet
  Cc: qi.z.zhang, pavel

This patch stops returning EAGAIN in TX copy mode when the completion
queue is full as zero-copy does not do this. Instead this situation
can be detected by comparing the head and tail pointers of the
completion queue in both modes. In any case, EAGAIN was not the
correct error code here since no amount of calling sendmsg will solve
the problem. Only consuming one or more messages on the completion
queue will fix this.

With this patch, the error reporting becomes consistent between copy
mode and zero-copy mode.

Fixes: 35fcde7f8deb ("xsk: support for Tx")
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
---
 net/xdp/xsk.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
index 08d09115093e..87567232d0f8 100644
--- a/net/xdp/xsk.c
+++ b/net/xdp/xsk.c
@@ -233,10 +233,8 @@ static int xsk_generic_xmit(struct sock *sk, struct msghdr *m,
 			goto out;
 		}
 
-		if (xskq_reserve_addr(xs->umem->cq)) {
-			err = -EAGAIN;
+		if (xskq_reserve_addr(xs->umem->cq))
 			goto out;
-		}
 
 		len = desc.len;
 		if (unlikely(len > xs->dev->mtu)) {
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH bpf 3/4] xsk: always return ENOBUFS from sendmsg if there is no TX queue
  2018-07-11  8:12 [PATCH bpf 0/4] Consistent sendmsg error reporting in AF_XDP Magnus Karlsson
  2018-07-11  8:12 ` [PATCH bpf 1/4] xsk: do not return ENXIO from TX copy mode Magnus Karlsson
  2018-07-11  8:12 ` [PATCH bpf 2/4] xsk: do not return EAGAIN from sendmsg when completion queue is full Magnus Karlsson
@ 2018-07-11  8:12 ` Magnus Karlsson
  2018-07-11  8:12 ` [PATCH bpf 4/4] xsk: do not return EMSGSIZE in copy mode for packets larger than MTU Magnus Karlsson
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Magnus Karlsson @ 2018-07-11  8:12 UTC (permalink / raw)
  To: magnus.karlsson, bjorn.topel, ast, daniel, netdev, eric.dumazet
  Cc: qi.z.zhang, pavel

This patch makes sure ENOBUFS is always returned from sendmsg if there
is no TX queue configured. This was not the case for zero-copy
mode. With this patch this error reporting is consistent between copy
mode and zero-copy mode.

Fixes: ac98d8aab61b ("xsk: wire upp Tx zero-copy functions")
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
---
 net/xdp/xsk.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
index 87567232d0f8..9c784307f7b0 100644
--- a/net/xdp/xsk.c
+++ b/net/xdp/xsk.c
@@ -218,9 +218,6 @@ static int xsk_generic_xmit(struct sock *sk, struct msghdr *m,
 	struct sk_buff *skb;
 	int err = 0;
 
-	if (unlikely(!xs->tx))
-		return -ENOBUFS;
-
 	mutex_lock(&xs->mutex);
 
 	while (xskq_peek_desc(xs->tx, &desc)) {
@@ -296,6 +293,8 @@ static int xsk_sendmsg(struct socket *sock, struct msghdr *m, size_t total_len)
 		return -ENXIO;
 	if (unlikely(!(xs->dev->flags & IFF_UP)))
 		return -ENETDOWN;
+	if (unlikely(!xs->tx))
+		return -ENOBUFS;
 	if (need_wait)
 		return -EOPNOTSUPP;
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH bpf 4/4] xsk: do not return EMSGSIZE in copy mode for packets larger than MTU
  2018-07-11  8:12 [PATCH bpf 0/4] Consistent sendmsg error reporting in AF_XDP Magnus Karlsson
                   ` (2 preceding siblings ...)
  2018-07-11  8:12 ` [PATCH bpf 3/4] xsk: always return ENOBUFS from sendmsg if there is no TX queue Magnus Karlsson
@ 2018-07-11  8:12 ` Magnus Karlsson
  2018-07-11 23:48 ` [PATCH bpf 0/4] Consistent sendmsg error reporting in AF_XDP Alexei Starovoitov
  2018-07-13 13:38 ` Daniel Borkmann
  5 siblings, 0 replies; 7+ messages in thread
From: Magnus Karlsson @ 2018-07-11  8:12 UTC (permalink / raw)
  To: magnus.karlsson, bjorn.topel, ast, daniel, netdev, eric.dumazet
  Cc: qi.z.zhang, pavel

This patch stops returning EMSGSIZE from sendmsg in copy mode when the
size of the packet is larger than the MTU. Just send it to the device
so that it will drop it as in zero-copy mode. This makes the error
reporting consistent between copy mode and zero-copy mode.

Fixes: 35fcde7f8deb ("xsk: support for Tx")
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
---
 net/xdp/xsk.c | 7 +------
 1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
index 9c784307f7b0..72335c2e8108 100644
--- a/net/xdp/xsk.c
+++ b/net/xdp/xsk.c
@@ -233,15 +233,10 @@ static int xsk_generic_xmit(struct sock *sk, struct msghdr *m,
 		if (xskq_reserve_addr(xs->umem->cq))
 			goto out;
 
-		len = desc.len;
-		if (unlikely(len > xs->dev->mtu)) {
-			err = -EMSGSIZE;
-			goto out;
-		}
-
 		if (xs->queue_id >= xs->dev->real_num_tx_queues)
 			goto out;
 
+		len = desc.len;
 		skb = sock_alloc_send_skb(sk, len, 1, &err);
 		if (unlikely(!skb)) {
 			err = -EAGAIN;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH bpf 0/4] Consistent sendmsg error reporting in AF_XDP
  2018-07-11  8:12 [PATCH bpf 0/4] Consistent sendmsg error reporting in AF_XDP Magnus Karlsson
                   ` (3 preceding siblings ...)
  2018-07-11  8:12 ` [PATCH bpf 4/4] xsk: do not return EMSGSIZE in copy mode for packets larger than MTU Magnus Karlsson
@ 2018-07-11 23:48 ` Alexei Starovoitov
  2018-07-13 13:38 ` Daniel Borkmann
  5 siblings, 0 replies; 7+ messages in thread
From: Alexei Starovoitov @ 2018-07-11 23:48 UTC (permalink / raw)
  To: Magnus Karlsson
  Cc: bjorn.topel, ast, daniel, netdev, eric.dumazet, qi.z.zhang, pavel

On Wed, Jul 11, 2018 at 10:12:48AM +0200, Magnus Karlsson wrote:
> This patch set adjusts the AF_XDP TX error reporting so that it becomes
> consistent between copy mode and zero-copy. First some background:
> 
> Copy-mode for TX uses the SKB path in which the action of sending the
> packet is performed from process context using the sendmsg
> syscall. Completions are usually done asynchronously from NAPI mode by
> using a TX interrupt. In this mode, send errors can be returned back
> through the syscall.
> 
> In zero-copy mode both the sending of the packet and the completions
> are done asynchronously from NAPI mode for performance reasons. In
> this mode, the sendmsg syscall only makes sure that the TX NAPI loop
> will be run that performs both the actions of sending and
> completing. In this mode it is therefore not possible to return errors
> through the sendmsg syscall as the sending is done from the NAPI
> loop. Note that it is possible to implement a synchronous send with
> our API, but in our benchmarks that made the TX performance drop by
> nearly half due to synchronization requirements and cache line
> bouncing. But for some netdevs this might be preferable so let us
> leave it up to the implementation to decide.
> 
> The problem is that the current code base returns some errors in
> copy-mode that are not possible to return in zero-copy mode. This
> patch set aligns them so that the two modes always return the same
> error code. We achieve this by removing some of the errors returned by
> sendmsg in copy-mode (and in one case adding an error message for
> zero-copy mode) and offering alternative error detection methods that
> are consistent between the two modes.
> 
> The structure of the patch set is as follows:
> 
> Patch 1: removes the ENXIO return code from copy-mode when someone has
> forcefully changed the number of queues on the device so that the
> queue bound to the socket is no longer available. Just silently stop
> sending anything as in zero-copy mode.
> 
> Patch 2: stop returning EAGAIN in copy mode when the completion queue
> is full as zero-copy does not do this. Instead this situation can be
> detected by comparing the head and tail pointers of the completion
> queue in both modes. In any case, EAGAIN was not the correct error code
> here since no amount of calling sendmsg will solve the problem. Only
> consuming one or more messages on the completion queue will fix this.
> 
> Patch 3: Always return ENOBUFS from sendmsg if there is no TX queue
> configured. This was not the case for zero-copy mode.
> 
> Patch 4: stop returning EMSGSIZE when the size of the packet is larger
> than the MTU. Just send it to the device so that it will drop it as in
> zero-copy mode.
> 
> Note that copy-mode can still return EAGAIN in certain circumstances,
> but as these conditions cannot occur in zero-copy mode it is fine for
> copy-mode to return them.
> 
> Question: For patch 4, is it fine to let the device drop a packet
> that is greater than its MTU, or should I have a check for this in
> both zero-copy and copy-mode and drop the packet up in the AF_XDP
> code? The drawback of this is that it will have performance
> implications for zero-copy mode as we will touch one more cache line
> with dev->mtu.
> 
> Thanks: Magnus

for the set:
Acked-by: Alexei Starovoitov <ast@kernel.org>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH bpf 0/4] Consistent sendmsg error reporting in AF_XDP
  2018-07-11  8:12 [PATCH bpf 0/4] Consistent sendmsg error reporting in AF_XDP Magnus Karlsson
                   ` (4 preceding siblings ...)
  2018-07-11 23:48 ` [PATCH bpf 0/4] Consistent sendmsg error reporting in AF_XDP Alexei Starovoitov
@ 2018-07-13 13:38 ` Daniel Borkmann
  5 siblings, 0 replies; 7+ messages in thread
From: Daniel Borkmann @ 2018-07-13 13:38 UTC (permalink / raw)
  To: Magnus Karlsson, bjorn.topel, ast, netdev, eric.dumazet; +Cc: qi.z.zhang, pavel

On 07/11/2018 10:12 AM, Magnus Karlsson wrote:
> This patch set adjusts the AF_XDP TX error reporting so that it becomes
> consistent between copy mode and zero-copy. First some background:
> 
> Copy-mode for TX uses the SKB path in which the action of sending the
> packet is performed from process context using the sendmsg
> syscall. Completions are usually done asynchronously from NAPI mode by
> using a TX interrupt. In this mode, send errors can be returned back
> through the syscall.
> 
> In zero-copy mode both the sending of the packet and the completions
> are done asynchronously from NAPI mode for performance reasons. In
> this mode, the sendmsg syscall only makes sure that the TX NAPI loop
> will be run that performs both the actions of sending and
> completing. In this mode it is therefore not possible to return errors
> through the sendmsg syscall as the sending is done from the NAPI
> loop. Note that it is possible to implement a synchronous send with
> our API, but in our benchmarks that made the TX performance drop by
> nearly half due to synchronization requirements and cache line
> bouncing. But for some netdevs this might be preferable so let us
> leave it up to the implementation to decide.
> 
> The problem is that the current code base returns some errors in
> copy-mode that are not possible to return in zero-copy mode. This
> patch set aligns them so that the two modes always return the same
> error code. We achieve this by removing some of the errors returned by
> sendmsg in copy-mode (and in one case adding an error message for
> zero-copy mode) and offering alternative error detection methods that
> are consistent between the two modes.

Looks good, applied to bpf, thanks Magnus!

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2018-07-13 13:53 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-07-11  8:12 [PATCH bpf 0/4] Consistent sendmsg error reporting in AF_XDP Magnus Karlsson
2018-07-11  8:12 ` [PATCH bpf 1/4] xsk: do not return ENXIO from TX copy mode Magnus Karlsson
2018-07-11  8:12 ` [PATCH bpf 2/4] xsk: do not return EAGAIN from sendmsg when completion queue is full Magnus Karlsson
2018-07-11  8:12 ` [PATCH bpf 3/4] xsk: always return ENOBUFS from sendmsg if there is no TX queue Magnus Karlsson
2018-07-11  8:12 ` [PATCH bpf 4/4] xsk: do not return EMSGSIZE in copy mode for packets larger than MTU Magnus Karlsson
2018-07-11 23:48 ` [PATCH bpf 0/4] Consistent sendmsg error reporting in AF_XDP Alexei Starovoitov
2018-07-13 13:38 ` Daniel Borkmann

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.