All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH net-next v2 0/2] BPF update
@ 2016-01-07 14:50 Daniel Borkmann
  2016-01-07 14:50 ` [PATCH net-next v2 1/2] net, sched: add skb_at_tc_ingress helper Daniel Borkmann
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Daniel Borkmann @ 2016-01-07 14:50 UTC (permalink / raw)
  To: davem; +Cc: alexei.starovoitov, hannes, netdev, Daniel Borkmann

Fixes a csum issue on ingress. As mentioned previously, net-next
seems just fine imho. Later on, will follow up with couple of
replacements like ovs_skb_postpush_rcsum() etc.

Thanks!

v1 -> v2:
  - Added patch 1 with helper
  - Implemented Hannes' idea to just use csum_partial, thanks!

Daniel Borkmann (2):
  net, sched: add skb_at_tc_ingress helper
  bpf: add skb_postpush_rcsum and fix dev_forward_skb occasions

 include/linux/skbuff.h    | 17 +++++++++++++++++
 include/net/sch_generic.h |  9 +++++++++
 net/core/filter.c         | 17 +++++++++++++----
 net/sched/cls_bpf.c       |  6 +-----
 4 files changed, 40 insertions(+), 9 deletions(-)

-- 
1.9.3

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH net-next v2 1/2] net, sched: add skb_at_tc_ingress helper
  2016-01-07 14:50 [PATCH net-next v2 0/2] BPF update Daniel Borkmann
@ 2016-01-07 14:50 ` Daniel Borkmann
  2016-01-07 18:12   ` Alexei Starovoitov
  2016-01-07 14:50 ` [PATCH net-next v2 2/2] bpf: add skb_postpush_rcsum and fix dev_forward_skb occasions Daniel Borkmann
  2016-01-10 22:55 ` [PATCH net-next v2 0/2] BPF update David Miller
  2 siblings, 1 reply; 6+ messages in thread
From: Daniel Borkmann @ 2016-01-07 14:50 UTC (permalink / raw)
  To: davem; +Cc: alexei.starovoitov, hannes, netdev, Daniel Borkmann

Add a skb_at_tc_ingress() as this will be needed elsewhere as well and
can hide the ugly ifdef.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
---
 include/net/sch_generic.h | 9 +++++++++
 net/sched/cls_bpf.c       | 6 +-----
 2 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index b2a8e63..636a362 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -407,6 +407,15 @@ bool tcf_destroy(struct tcf_proto *tp, bool force);
 void tcf_destroy_chain(struct tcf_proto __rcu **fl);
 int skb_do_redirect(struct sk_buff *);
 
+static inline bool skb_at_tc_ingress(const struct sk_buff *skb)
+{
+#ifdef CONFIG_NET_CLS_ACT
+	return G_TC_AT(skb->tc_verd) & AT_INGRESS;
+#else
+	return false;
+#endif
+}
+
 /* Reset all TX qdiscs greater then index of a device.  */
 static inline void qdisc_reset_all_tx_gt(struct net_device *dev, unsigned int i)
 {
diff --git a/net/sched/cls_bpf.c b/net/sched/cls_bpf.c
index 99259f5..8dc8430 100644
--- a/net/sched/cls_bpf.c
+++ b/net/sched/cls_bpf.c
@@ -79,12 +79,8 @@ static int cls_bpf_classify(struct sk_buff *skb, const struct tcf_proto *tp,
 			    struct tcf_result *res)
 {
 	struct cls_bpf_head *head = rcu_dereference_bh(tp->root);
+	bool at_ingress = skb_at_tc_ingress(skb);
 	struct cls_bpf_prog *prog;
-#ifdef CONFIG_NET_CLS_ACT
-	bool at_ingress = G_TC_AT(skb->tc_verd) & AT_INGRESS;
-#else
-	bool at_ingress = false;
-#endif
 	int ret = -1;
 
 	if (unlikely(!skb_mac_header_was_set(skb)))
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH net-next v2 2/2] bpf: add skb_postpush_rcsum and fix dev_forward_skb occasions
  2016-01-07 14:50 [PATCH net-next v2 0/2] BPF update Daniel Borkmann
  2016-01-07 14:50 ` [PATCH net-next v2 1/2] net, sched: add skb_at_tc_ingress helper Daniel Borkmann
@ 2016-01-07 14:50 ` Daniel Borkmann
  2016-01-07 18:52   ` Alexei Starovoitov
  2016-01-10 22:55 ` [PATCH net-next v2 0/2] BPF update David Miller
  2 siblings, 1 reply; 6+ messages in thread
From: Daniel Borkmann @ 2016-01-07 14:50 UTC (permalink / raw)
  To: davem; +Cc: alexei.starovoitov, hannes, netdev, Daniel Borkmann

Add a small helper skb_postpush_rcsum() and fix up redirect locations
that need CHECKSUM_COMPLETE fixups on ingress. dev_forward_skb() expects
a proper csum that covers also Ethernet header, f.e. since 2c26d34bbcc0
("net/core: Handle csum for CHECKSUM_COMPLETE VXLAN forwarding"), we
also do skb_postpull_rcsum() after pulling Ethernet header off via
eth_type_trans().

When using eBPF in a netns setup f.e. with vxlan in collect metadata mode,
I can trigger the following csum issue with an IPv6 setup:

  [  505.144065] dummy1: hw csum failure
  [...]
  [  505.144108] Call Trace:
  [  505.144112]  <IRQ>  [<ffffffff81372f08>] dump_stack+0x44/0x5c
  [  505.144134]  [<ffffffff81607cea>] netdev_rx_csum_fault+0x3a/0x40
  [  505.144142]  [<ffffffff815fee3f>] __skb_checksum_complete+0xcf/0xe0
  [  505.144149]  [<ffffffff816f0902>] nf_ip6_checksum+0xb2/0x120
  [  505.144161]  [<ffffffffa08c0e0e>] icmpv6_error+0x17e/0x328 [nf_conntrack_ipv6]
  [  505.144170]  [<ffffffffa0898eca>] ? ip6t_do_table+0x2fa/0x645 [ip6_tables]
  [  505.144177]  [<ffffffffa08c0725>] ? ipv6_get_l4proto+0x65/0xd0 [nf_conntrack_ipv6]
  [  505.144189]  [<ffffffffa06c9a12>] nf_conntrack_in+0xc2/0x5a0 [nf_conntrack]
  [  505.144196]  [<ffffffffa08c039c>] ipv6_conntrack_in+0x1c/0x20 [nf_conntrack_ipv6]
  [  505.144204]  [<ffffffff8164385d>] nf_iterate+0x5d/0x70
  [  505.144210]  [<ffffffff816438d6>] nf_hook_slow+0x66/0xc0
  [  505.144218]  [<ffffffff816bd302>] ipv6_rcv+0x3f2/0x4f0
  [  505.144225]  [<ffffffff816bca40>] ? ip6_make_skb+0x1b0/0x1b0
  [  505.144232]  [<ffffffff8160b77b>] __netif_receive_skb_core+0x36b/0x9a0
  [  505.144239]  [<ffffffff8160bdc8>] ? __netif_receive_skb+0x18/0x60
  [  505.144245]  [<ffffffff8160bdc8>] __netif_receive_skb+0x18/0x60
  [  505.144252]  [<ffffffff8160ccff>] process_backlog+0x9f/0x140
  [  505.144259]  [<ffffffff8160c4a5>] net_rx_action+0x145/0x320
  [...]

What happens is that on ingress, we push Ethernet header back in, either
from cls_bpf or right before skb_do_redirect(), but without updating csum.
The "hw csum failure" can be fixed by using the new skb_postpush_rcsum()
helper for the dev_forward_skb() case to correct the csum diff again.

Thanks to Hannes Frederic Sowa for the csum_partial() idea!

Fixes: 3896d655f4d4 ("bpf: introduce bpf_clone_redirect() helper")
Fixes: 27b29f63058d ("bpf: add bpf_redirect() helper")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
---
 include/linux/skbuff.h | 17 +++++++++++++++++
 net/core/filter.c      | 17 +++++++++++++----
 2 files changed, 30 insertions(+), 4 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 6b6bd42..07f9ccd 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -2805,6 +2805,23 @@ static inline void skb_postpull_rcsum(struct sk_buff *skb,
 
 unsigned char *skb_pull_rcsum(struct sk_buff *skb, unsigned int len);
 
+static inline void skb_postpush_rcsum(struct sk_buff *skb,
+				      const void *start, unsigned int len)
+{
+	/* For performing the reverse operation to skb_postpull_rcsum(),
+	 * we can instead of ...
+	 *
+	 *   skb->csum = csum_add(skb->csum, csum_partial(start, len, 0));
+	 *
+	 * ... just use this equivalent version here to save a few
+	 * instructions. Feeding csum of 0 in csum_partial() and later
+	 * on adding skb->csum is equivalent to feed skb->csum in the
+	 * first place.
+	 */
+	if (skb->ip_summed == CHECKSUM_COMPLETE)
+		skb->csum = csum_partial(start, len, skb->csum);
+}
+
 /**
  *	pskb_trim_rcsum - trim received skb and update checksum
  *	@skb: buffer to trim
diff --git a/net/core/filter.c b/net/core/filter.c
index 35e6fed..0db92b5 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -1368,8 +1368,9 @@ static u64 bpf_skb_store_bytes(u64 r1, u64 r2, u64 r3, u64 r4, u64 flags)
 		/* skb_store_bits cannot return -EFAULT here */
 		skb_store_bits(skb, offset, ptr, len);
 
-	if (BPF_RECOMPUTE_CSUM(flags) && skb->ip_summed == CHECKSUM_COMPLETE)
-		skb->csum = csum_add(skb->csum, csum_partial(ptr, len, 0));
+	if (BPF_RECOMPUTE_CSUM(flags))
+		skb_postpush_rcsum(skb, ptr, len);
+
 	return 0;
 }
 
@@ -1525,8 +1526,12 @@ static u64 bpf_clone_redirect(u64 r1, u64 ifindex, u64 flags, u64 r4, u64 r5)
 	if (unlikely(!skb2))
 		return -ENOMEM;
 
-	if (BPF_IS_REDIRECT_INGRESS(flags))
+	if (BPF_IS_REDIRECT_INGRESS(flags)) {
+		if (skb_at_tc_ingress(skb2))
+			skb_postpush_rcsum(skb2, skb_mac_header(skb2),
+					   skb2->mac_len);
 		return dev_forward_skb(dev, skb2);
+	}
 
 	skb2->dev = dev;
 	skb_sender_cpu_clear(skb2);
@@ -1569,8 +1574,12 @@ int skb_do_redirect(struct sk_buff *skb)
 		return -EINVAL;
 	}
 
-	if (BPF_IS_REDIRECT_INGRESS(ri->flags))
+	if (BPF_IS_REDIRECT_INGRESS(ri->flags)) {
+		if (skb_at_tc_ingress(skb))
+			skb_postpush_rcsum(skb, skb_mac_header(skb),
+					   skb->mac_len);
 		return dev_forward_skb(dev, skb);
+	}
 
 	skb->dev = dev;
 	skb_sender_cpu_clear(skb);
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH net-next v2 1/2] net, sched: add skb_at_tc_ingress helper
  2016-01-07 14:50 ` [PATCH net-next v2 1/2] net, sched: add skb_at_tc_ingress helper Daniel Borkmann
@ 2016-01-07 18:12   ` Alexei Starovoitov
  0 siblings, 0 replies; 6+ messages in thread
From: Alexei Starovoitov @ 2016-01-07 18:12 UTC (permalink / raw)
  To: Daniel Borkmann; +Cc: davem, hannes, netdev

On Thu, Jan 07, 2016 at 03:50:22PM +0100, Daniel Borkmann wrote:
> Add a skb_at_tc_ingress() as this will be needed elsewhere as well and
> can hide the ugly ifdef.
> 
> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

Acked-by: Alexei Starovoitov <ast@kernel.org>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH net-next v2 2/2] bpf: add skb_postpush_rcsum and fix dev_forward_skb occasions
  2016-01-07 14:50 ` [PATCH net-next v2 2/2] bpf: add skb_postpush_rcsum and fix dev_forward_skb occasions Daniel Borkmann
@ 2016-01-07 18:52   ` Alexei Starovoitov
  0 siblings, 0 replies; 6+ messages in thread
From: Alexei Starovoitov @ 2016-01-07 18:52 UTC (permalink / raw)
  To: Daniel Borkmann; +Cc: davem, hannes, netdev

On Thu, Jan 07, 2016 at 03:50:23PM +0100, Daniel Borkmann wrote:
> Add a small helper skb_postpush_rcsum() and fix up redirect locations
> that need CHECKSUM_COMPLETE fixups on ingress. dev_forward_skb() expects
> a proper csum that covers also Ethernet header, f.e. since 2c26d34bbcc0
> ("net/core: Handle csum for CHECKSUM_COMPLETE VXLAN forwarding"), we
> also do skb_postpull_rcsum() after pulling Ethernet header off via
> eth_type_trans().
> 
> When using eBPF in a netns setup f.e. with vxlan in collect metadata mode,
> I can trigger the following csum issue with an IPv6 setup:
> 
>   [  505.144065] dummy1: hw csum failure
>   [...]
>   [  505.144108] Call Trace:
>   [  505.144112]  <IRQ>  [<ffffffff81372f08>] dump_stack+0x44/0x5c
>   [  505.144134]  [<ffffffff81607cea>] netdev_rx_csum_fault+0x3a/0x40
>   [  505.144142]  [<ffffffff815fee3f>] __skb_checksum_complete+0xcf/0xe0
>   [  505.144149]  [<ffffffff816f0902>] nf_ip6_checksum+0xb2/0x120
>   [  505.144161]  [<ffffffffa08c0e0e>] icmpv6_error+0x17e/0x328 [nf_conntrack_ipv6]
>   [  505.144170]  [<ffffffffa0898eca>] ? ip6t_do_table+0x2fa/0x645 [ip6_tables]
>   [  505.144177]  [<ffffffffa08c0725>] ? ipv6_get_l4proto+0x65/0xd0 [nf_conntrack_ipv6]
>   [  505.144189]  [<ffffffffa06c9a12>] nf_conntrack_in+0xc2/0x5a0 [nf_conntrack]
>   [  505.144196]  [<ffffffffa08c039c>] ipv6_conntrack_in+0x1c/0x20 [nf_conntrack_ipv6]
>   [  505.144204]  [<ffffffff8164385d>] nf_iterate+0x5d/0x70
>   [  505.144210]  [<ffffffff816438d6>] nf_hook_slow+0x66/0xc0
>   [  505.144218]  [<ffffffff816bd302>] ipv6_rcv+0x3f2/0x4f0
>   [  505.144225]  [<ffffffff816bca40>] ? ip6_make_skb+0x1b0/0x1b0
>   [  505.144232]  [<ffffffff8160b77b>] __netif_receive_skb_core+0x36b/0x9a0
>   [  505.144239]  [<ffffffff8160bdc8>] ? __netif_receive_skb+0x18/0x60
>   [  505.144245]  [<ffffffff8160bdc8>] __netif_receive_skb+0x18/0x60
>   [  505.144252]  [<ffffffff8160ccff>] process_backlog+0x9f/0x140
>   [  505.144259]  [<ffffffff8160c4a5>] net_rx_action+0x145/0x320
>   [...]
> 
> What happens is that on ingress, we push Ethernet header back in, either
> from cls_bpf or right before skb_do_redirect(), but without updating csum.
> The "hw csum failure" can be fixed by using the new skb_postpush_rcsum()
> helper for the dev_forward_skb() case to correct the csum diff again.
> 
> Thanks to Hannes Frederic Sowa for the csum_partial() idea!
> 
> Fixes: 3896d655f4d4 ("bpf: introduce bpf_clone_redirect() helper")
> Fixes: 27b29f63058d ("bpf: add bpf_redirect() helper")
> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

Acked-by: Alexei Starovoitov <ast@kernel.org>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH net-next v2 0/2] BPF update
  2016-01-07 14:50 [PATCH net-next v2 0/2] BPF update Daniel Borkmann
  2016-01-07 14:50 ` [PATCH net-next v2 1/2] net, sched: add skb_at_tc_ingress helper Daniel Borkmann
  2016-01-07 14:50 ` [PATCH net-next v2 2/2] bpf: add skb_postpush_rcsum and fix dev_forward_skb occasions Daniel Borkmann
@ 2016-01-10 22:55 ` David Miller
  2 siblings, 0 replies; 6+ messages in thread
From: David Miller @ 2016-01-10 22:55 UTC (permalink / raw)
  To: daniel; +Cc: alexei.starovoitov, hannes, netdev

From: Daniel Borkmann <daniel@iogearbox.net>
Date: Thu,  7 Jan 2016 15:50:21 +0100

> Fixes a csum issue on ingress. As mentioned previously, net-next
> seems just fine imho. Later on, will follow up with couple of
> replacements like ovs_skb_postpush_rcsum() etc.
> 
> Thanks!
> 
> v1 -> v2:
>   - Added patch 1 with helper
>   - Implemented Hannes' idea to just use csum_partial, thanks!

Series applied.

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2016-01-10 22:55 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-01-07 14:50 [PATCH net-next v2 0/2] BPF update Daniel Borkmann
2016-01-07 14:50 ` [PATCH net-next v2 1/2] net, sched: add skb_at_tc_ingress helper Daniel Borkmann
2016-01-07 18:12   ` Alexei Starovoitov
2016-01-07 14:50 ` [PATCH net-next v2 2/2] bpf: add skb_postpush_rcsum and fix dev_forward_skb occasions Daniel Borkmann
2016-01-07 18:52   ` Alexei Starovoitov
2016-01-10 22:55 ` [PATCH net-next v2 0/2] BPF update David Miller

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.