From: Peter Oskolkov <posk@google.com>
To: Alexei Starovoitov <ast@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>,
netdev@vger.kernel.org
Cc: Peter Oskolkov <posk@posk.io>, David Ahern <dsahern@gmail.com>,
Willem de Bruijn <willemb@google.com>,
Peter Oskolkov <posk@google.com>
Subject: [PATCH bpf-next v10 3/7] bpf: handle GSO in bpf_lwt_push_encap
Date: Tue, 12 Feb 2019 09:32:43 -0800 [thread overview]
Message-ID: <20190212173247.121342-4-posk@google.com> (raw)
In-Reply-To: <20190212173247.121342-1-posk@google.com>
This patch adds handling of GSO packets in bpf_lwt_push_ip_encap()
(called from bpf_lwt_push_encap):
* IPIP, GRE, and UDP encapsulation types are deduced by looking
into iphdr->protocol or ipv6hdr->next_header;
* SCTP GSO packets are not supported (as bpf_skb_proto_4_to_6
and similar do);
* UDP_L4 GSO packets are also not supported (although they are
not blocked in bpf_skb_proto_4_to_6 and similar), as
skb_decrease_gso_size() will break it;
* SKB_GSO_DODGY bit is set.
Note: it may be possible to support SCTP and UDP_L4 gso packets;
but as these cases seem to be not well handled by other
tunneling/encapping code paths, the solution should
be generic enough to apply to all tunneling/encapping code.
v8 changes:
- make sure that if GRE or UDP encap is detected, there is
enough of pushed bytes to cover both IP[v6] + GRE|UDP headers;
- do not reject double-encapped packets;
- whitelist TCP GSO packets rather than block SCTP GSO and
UDP GSO.
Signed-off-by: Peter Oskolkov <posk@google.com>
---
net/core/lwt_bpf.c | 67 ++++++++++++++++++++++++++++++++++++++++++++--
1 file changed, 65 insertions(+), 2 deletions(-)
diff --git a/net/core/lwt_bpf.c b/net/core/lwt_bpf.c
index e5a9850d9f48..079871fc020f 100644
--- a/net/core/lwt_bpf.c
+++ b/net/core/lwt_bpf.c
@@ -16,6 +16,7 @@
#include <linux/types.h>
#include <linux/bpf.h>
#include <net/lwtunnel.h>
+#include <net/gre.h>
struct bpf_lwt_prog {
struct bpf_prog *prog;
@@ -390,10 +391,72 @@ static const struct lwtunnel_encap_ops bpf_encap_ops = {
.owner = THIS_MODULE,
};
+static int handle_gso_type(struct sk_buff *skb, unsigned int gso_type,
+ int encap_len)
+{
+ struct skb_shared_info *shinfo = skb_shinfo(skb);
+
+ gso_type |= SKB_GSO_DODGY;
+ shinfo->gso_type |= gso_type;
+ skb_decrease_gso_size(shinfo, encap_len);
+ shinfo->gso_segs = 0;
+ return 0;
+}
+
static int handle_gso_encap(struct sk_buff *skb, bool ipv4, int encap_len)
{
- /* Handling of GSO-enabled packets is added in the next patch. */
- return -EOPNOTSUPP;
+ int next_hdr_offset;
+ void *next_hdr;
+ __u8 protocol;
+
+ /* SCTP and UDP_L4 gso need more nuanced handling than what
+ * handle_gso_type() does above: skb_decrease_gso_size() is not enough.
+ * So at the moment only TCP GSO packets are let through.
+ */
+ if (!(skb_shinfo(skb)->gso_type & (SKB_GSO_TCPV4 | SKB_GSO_TCPV6)))
+ return -ENOTSUPP;
+
+ if (ipv4) {
+ protocol = ip_hdr(skb)->protocol;
+ next_hdr_offset = sizeof(struct iphdr);
+ next_hdr = skb_network_header(skb) + next_hdr_offset;
+ } else {
+ protocol = ipv6_hdr(skb)->nexthdr;
+ next_hdr_offset = sizeof(struct ipv6hdr);
+ next_hdr = skb_network_header(skb) + next_hdr_offset;
+ }
+
+ switch (protocol) {
+ case IPPROTO_GRE:
+ next_hdr_offset += sizeof(struct gre_base_hdr);
+ if (next_hdr_offset > encap_len)
+ return -EINVAL;
+
+ if (((struct gre_base_hdr *)next_hdr)->flags & GRE_CSUM)
+ return handle_gso_type(skb, SKB_GSO_GRE_CSUM,
+ encap_len);
+ return handle_gso_type(skb, SKB_GSO_GRE, encap_len);
+
+ case IPPROTO_UDP:
+ next_hdr_offset += sizeof(struct udphdr);
+ if (next_hdr_offset > encap_len)
+ return -EINVAL;
+
+ if (((struct udphdr *)next_hdr)->check)
+ return handle_gso_type(skb, SKB_GSO_UDP_TUNNEL_CSUM,
+ encap_len);
+ return handle_gso_type(skb, SKB_GSO_UDP_TUNNEL, encap_len);
+
+ case IPPROTO_IP:
+ case IPPROTO_IPV6:
+ if (ipv4)
+ return handle_gso_type(skb, SKB_GSO_IPXIP4, encap_len);
+ else
+ return handle_gso_type(skb, SKB_GSO_IPXIP6, encap_len);
+
+ default:
+ return -EPROTONOSUPPORT;
+ }
}
int bpf_lwt_push_ip_encap(struct sk_buff *skb, void *hdr, u32 len, bool ingress)
--
2.20.1.791.gb4d0f1c61a-goog
next prev parent reply other threads:[~2019-02-12 17:33 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-02-12 17:32 [PATCH bpf-next v10 0/7] bpf: add BPF_LWT_ENCAP_IP option to bpf_lwt_push_encap Peter Oskolkov
2019-02-12 17:32 ` [PATCH bpf-next v10 1/7] bpf: add plumbing for BPF_LWT_ENCAP_IP in bpf_lwt_push_encap Peter Oskolkov
2019-02-12 17:32 ` [PATCH bpf-next v10 2/7] bpf: implement BPF_LWT_ENCAP_IP mode " Peter Oskolkov
2019-02-12 17:32 ` Peter Oskolkov [this message]
2019-02-12 17:32 ` [PATCH bpf-next v10 4/7] ipv6_stub: add ipv6_route_input stub/proxy Peter Oskolkov
2019-02-12 17:32 ` [PATCH bpf-next v10 5/7] bpf: add handling of BPF_LWT_REROUTE to lwt_bpf.c Peter Oskolkov
2019-02-13 2:58 ` David Ahern
2019-02-13 19:57 ` Peter Oskolkov
2019-02-13 20:11 ` David Ahern
2019-02-13 20:41 ` Peter Oskolkov
2019-02-12 17:32 ` [PATCH bpf-next v10 6/7] bpf: sync <kdir>/include/.../bpf.h with tools/include/.../bpf.h Peter Oskolkov
2019-02-12 17:32 ` [PATCH bpf-next v10 7/7] selftests: bpf: add test_lwt_ip_encap selftest Peter Oskolkov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20190212173247.121342-4-posk@google.com \
--to=posk@google.com \
--cc=ast@kernel.org \
--cc=daniel@iogearbox.net \
--cc=dsahern@gmail.com \
--cc=netdev@vger.kernel.org \
--cc=posk@posk.io \
--cc=willemb@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).