netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC] net: add gro frag support to udp tunnel api
@ 2020-01-27 15:24 Jason A. Donenfeld
  2020-01-27 15:40 ` Willem de Bruijn
  2020-01-30 10:16 ` Steffen Klassert
  0 siblings, 2 replies; 4+ messages in thread
From: Jason A. Donenfeld @ 2020-01-27 15:24 UTC (permalink / raw)
  To: steffen.klassert, netdev; +Cc: Jason A. Donenfeld

Hi Steffen,

This is very much a "RFC", in that the code here is fodder for
discussion more than something I'm seriously proposing at the moment.
I'm writing to you specifically because I recall us having discussed
something like this a while ago and you being interested.

WireGuard would benefit from getting lists of SKBs all together in a
bunch on the receive side. At the moment, encap_rcv splits them apart
one by one before giving them to the API. This patch proposes a way to
opt-in to receiving them before they've been split. The solution
involves adding an encap_type flag that enables calling the encap_rcv
function earlier before the skbs have been split apart.

I worry that it's not this straight forward, however, because of this
function below called, "udp_unexpected_gso". It looks like there's a
fast path for the case when it doesn't need to be split apart, and that
if it is already split apart, that's expected, whereas splitting it
apart would be "unexpected." I'm not too familiar with this code. Do you
know off hand why this would be unexpected? And does that imply that a
proper implementation of this might be a bit more involved? Or is the
naming just silly and this actually is _the_ path where this happens?

The other thing I'm wondering with regards to this is how you even hit
this path. So far I've only been able to hit it with the out of tree
Qualcom driver, "rmnet_perf". I saw a mailing list discussion a few
years ago about adding some flags to be able to simulate this with veth,
but I didn't see that go anywhere. Figuring out how I can test this is
probably a good idea before proceeding further.

Finally, and most importantly, is this overlapping with something you're
working on at the moment? Where do you stand with this? Did you wind up
solving your own use cases in a different way, or are you sitting on a
more proper version of this RFC or something else?

Regards,
Jason
---
 include/uapi/linux/udp.h |  1 +
 net/ipv4/udp.c           | 10 ++++++++++
 2 files changed, 11 insertions(+)

diff --git a/include/uapi/linux/udp.h b/include/uapi/linux/udp.h
index 30baccb6c9c4..35b23c8a030f 100644
--- a/include/uapi/linux/udp.h
+++ b/include/uapi/linux/udp.h
@@ -42,5 +42,6 @@ struct udphdr {
 #define UDP_ENCAP_GTP0		4 /* GSM TS 09.60 */
 #define UDP_ENCAP_GTP1U		5 /* 3GPP TS 29.060 */
 #define UDP_ENCAP_RXRPC		6
+#define UDP_ENCAP_SUPPORT_GRO_FRAGS (1UL << 31)
 
 #endif /* _UAPI_LINUX_UDP_H */
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 208da0917469..ee583f20cef5 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -2095,9 +2095,19 @@ static int udp_queue_rcv_one_skb(struct sock *sk, struct sk_buff *skb)
 
 static int udp_queue_rcv_skb(struct sock *sk, struct sk_buff *skb)
 {
+	int (*encap_rcv_gro)(struct sock *sk, struct sk_buff *skb);
 	struct sk_buff *next, *segs;
 	int ret;
 
+	if (static_branch_unlikely(&udp_encap_needed_key) &&
+	    up->encap_type & UDP_ENCAP_SUPPORT_GRO_FRAGS &&
+	    (encap_rcv_gro = READ_ONCE(up->encap_rcv))) {
+		//XXX: deal with checksum?
+		ret = encap_rcv(sk, skb);
+		if (ret <= 0) //XXX: deal with incrementing udp error stats?
+			return -ret;
+	}
+
 	if (likely(!udp_unexpected_gso(sk, skb)))
 		return udp_queue_rcv_one_skb(sk, skb);
 
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [RFC] net: add gro frag support to udp tunnel api
  2020-01-27 15:24 [RFC] net: add gro frag support to udp tunnel api Jason A. Donenfeld
@ 2020-01-27 15:40 ` Willem de Bruijn
  2020-01-30 10:01   ` Steffen Klassert
  2020-01-30 10:16 ` Steffen Klassert
  1 sibling, 1 reply; 4+ messages in thread
From: Willem de Bruijn @ 2020-01-27 15:40 UTC (permalink / raw)
  To: Jason A. Donenfeld, Paolo Abeni; +Cc: Steffen Klassert, Network Development

On Mon, Jan 27, 2020 at 10:25 AM Jason A. Donenfeld <Jason@zx2c4.com> wrote:
>
> Hi Steffen,
>
> This is very much a "RFC", in that the code here is fodder for
> discussion more than something I'm seriously proposing at the moment.
> I'm writing to you specifically because I recall us having discussed
> something like this a while ago and you being interested.
>
> WireGuard would benefit from getting lists of SKBs all together in a
> bunch on the receive side. At the moment, encap_rcv splits them apart
> one by one before giving them to the API. This patch proposes a way to
> opt-in to receiving them before they've been split. The solution
> involves adding an encap_type flag that enables calling the encap_rcv
> function earlier before the skbs have been split apart.
>
> I worry that it's not this straight forward, however, because of this
> function below called, "udp_unexpected_gso". It looks like there's a
> fast path for the case when it doesn't need to be split apart, and that
> if it is already split apart, that's expected, whereas splitting it
> apart would be "unexpected." I'm not too familiar with this code. Do you
> know off hand why this would be unexpected?

This is for delivery to local sockets.

UDP GRO packets need to be split back up before delivery, unless the
socket has set socket option UDP_GRO.

This is checked in the GRO layer by checking udp_sk(sk)->gro_enabled.

There is a race condition between this check and the packet arriving
at the socket. Hence the unexpected.

Note that UDP GSO can take two forms, the fraglist approach by Steffen
or the earlier implementation that builds a single skb with frags.

>  static int udp_queue_rcv_skb(struct sock *sk, struct sk_buff *skb)
>  {
> +       int (*encap_rcv_gro)(struct sock *sk, struct sk_buff *skb);
>         struct sk_buff *next, *segs;
>         int ret;
>
> +       if (static_branch_unlikely(&udp_encap_needed_key) &&
> +           up->encap_type & UDP_ENCAP_SUPPORT_GRO_FRAGS &&
> +           (encap_rcv_gro = READ_ONCE(up->encap_rcv))) {
> +               //XXX: deal with checksum?
> +               ret = encap_rcv(sk, skb);
> +               if (ret <= 0) //XXX: deal with incrementing udp error stats?
> +                       return -ret;
> +       }

I think it'll be sufficient to just set optionally
udp_sk(sk)->gro_enabled on encap sockets and let it takes the default
path, below?

> +
>         if (likely(!udp_unexpected_gso(sk, skb)))
>                 return udp_queue_rcv_one_skb(sk, skb);
>
> --
> 2.24.1
>

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [RFC] net: add gro frag support to udp tunnel api
  2020-01-27 15:40 ` Willem de Bruijn
@ 2020-01-30 10:01   ` Steffen Klassert
  0 siblings, 0 replies; 4+ messages in thread
From: Steffen Klassert @ 2020-01-30 10:01 UTC (permalink / raw)
  To: Willem de Bruijn; +Cc: Jason A. Donenfeld, Paolo Abeni, Network Development

On Mon, Jan 27, 2020 at 10:40:55AM -0500, Willem de Bruijn wrote:
> On Mon, Jan 27, 2020 at 10:25 AM Jason A. Donenfeld <Jason@zx2c4.com> wrote:
> >
> > Hi Steffen,
> >
> > This is very much a "RFC", in that the code here is fodder for
> > discussion more than something I'm seriously proposing at the moment.
> > I'm writing to you specifically because I recall us having discussed
> > something like this a while ago and you being interested.
> >
> > WireGuard would benefit from getting lists of SKBs all together in a
> > bunch on the receive side. At the moment, encap_rcv splits them apart
> > one by one before giving them to the API. This patch proposes a way to
> > opt-in to receiving them before they've been split. The solution
> > involves adding an encap_type flag that enables calling the encap_rcv
> > function earlier before the skbs have been split apart.
> >
> > I worry that it's not this straight forward, however, because of this
> > function below called, "udp_unexpected_gso". It looks like there's a
> > fast path for the case when it doesn't need to be split apart, and that
> > if it is already split apart, that's expected, whereas splitting it
> > apart would be "unexpected." I'm not too familiar with this code. Do you
> > know off hand why this would be unexpected?
> 
> This is for delivery to local sockets.
> 
> UDP GRO packets need to be split back up before delivery, unless the
> socket has set socket option UDP_GRO.
> 
> This is checked in the GRO layer by checking udp_sk(sk)->gro_enabled.
> 
> There is a race condition between this check and the packet arriving
> at the socket. Hence the unexpected.

Actually, if fraglist GRO is enabled this codepath is not that
unexpected anymore. Maybe we should remove the branch predictor
on !udp_unexpected_gso.

> 
> Note that UDP GSO can take two forms, the fraglist approach by Steffen
> or the earlier implementation that builds a single skb with frags.
> 
> >  static int udp_queue_rcv_skb(struct sock *sk, struct sk_buff *skb)
> >  {
> > +       int (*encap_rcv_gro)(struct sock *sk, struct sk_buff *skb);
> >         struct sk_buff *next, *segs;
> >         int ret;
> >
> > +       if (static_branch_unlikely(&udp_encap_needed_key) &&
> > +           up->encap_type & UDP_ENCAP_SUPPORT_GRO_FRAGS &&
> > +           (encap_rcv_gro = READ_ONCE(up->encap_rcv))) {
> > +               //XXX: deal with checksum?
> > +               ret = encap_rcv(sk, skb);
> > +               if (ret <= 0) //XXX: deal with incrementing udp error stats?
> > +                       return -ret;
> > +       }
> 
> I think it'll be sufficient to just set optionally
> udp_sk(sk)->gro_enabled on encap sockets and let it takes the default
> path, below?

I think Jason wants to have the fraglist GRO version.
Just setting udp_sk(sk)->gro_enabled would generate
standard GRO packets. The UDP payload from wireguard
is encrypted, so merging into a single datagram does
not work that well here.

> 
> > +
> >         if (likely(!udp_unexpected_gso(sk, skb)))
> >                 return udp_queue_rcv_one_skb(sk, skb);
> >

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [RFC] net: add gro frag support to udp tunnel api
  2020-01-27 15:24 [RFC] net: add gro frag support to udp tunnel api Jason A. Donenfeld
  2020-01-27 15:40 ` Willem de Bruijn
@ 2020-01-30 10:16 ` Steffen Klassert
  1 sibling, 0 replies; 4+ messages in thread
From: Steffen Klassert @ 2020-01-30 10:16 UTC (permalink / raw)
  To: Jason A. Donenfeld; +Cc: netdev

Hi Jason,

On Mon, Jan 27, 2020 at 04:24:11PM +0100, Jason A. Donenfeld wrote:
> Hi Steffen,
> 
> This is very much a "RFC", in that the code here is fodder for
> discussion more than something I'm seriously proposing at the moment.
> I'm writing to you specifically because I recall us having discussed
> something like this a while ago and you being interested.
> 
> WireGuard would benefit from getting lists of SKBs all together in a
> bunch on the receive side. At the moment, encap_rcv splits them apart
> one by one before giving them to the API. This patch proposes a way to
> opt-in to receiving them before they've been split. The solution
> involves adding an encap_type flag that enables calling the encap_rcv
> function earlier before the skbs have been split apart.
> 
> I worry that it's not this straight forward, however, because of this
> function below called, "udp_unexpected_gso". It looks like there's a
> fast path for the case when it doesn't need to be split apart, and that
> if it is already split apart, that's expected, whereas splitting it
> apart would be "unexpected." I'm not too familiar with this code. Do you
> know off hand why this would be unexpected? And does that imply that a
> proper implementation of this might be a bit more involved? Or is the
> naming just silly and this actually is _the_ path where this happens?
> 
> The other thing I'm wondering with regards to this is how you even hit
> this path. So far I've only been able to hit it with the out of tree
> Qualcom driver, "rmnet_perf". I saw a mailing list discussion a few
> years ago about adding some flags to be able to simulate this with veth,
> but I didn't see that go anywhere. Figuring out how I can test this is
> probably a good idea before proceeding further.
> 
> Finally, and most importantly, is this overlapping with something you're
> working on at the moment? Where do you stand with this? Did you wind up
> solving your own use cases in a different way, or are you sitting on a
> more proper version of this RFC or something else?

I have a patch to enable GRO for ESP in UDP encapsulation.
The patch is not that well tested so far, so I hold it back
for now. But I think it won't overlap with enabling fraglist
GRO for encap sockets. I have not thought about enabling
fraglist GRO for encap sockets, but a flag to signal this
could be a way to go.

Other that that I plan to enable UDP GRO by default
once we are sure that this won't add a regression.

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2020-01-30 10:16 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-01-27 15:24 [RFC] net: add gro frag support to udp tunnel api Jason A. Donenfeld
2020-01-27 15:40 ` Willem de Bruijn
2020-01-30 10:01   ` Steffen Klassert
2020-01-30 10:16 ` Steffen Klassert

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).