All of lore.kernel.org
 help / color / mirror / Atom feed
* xdpgeneric, XDP_PASS, and bpf_xdp_adjust_head decapsulation dropping packets
@ 2019-07-31 21:12 Brandon Cazander
  2019-08-01  8:18 ` Jesper Dangaard Brouer
  2019-08-01 18:00 ` [net v1 PATCH 0/4] net: fix regressions for generic-XDP Jesper Dangaard Brouer
  0 siblings, 2 replies; 16+ messages in thread
From: Brandon Cazander @ 2019-07-31 21:12 UTC (permalink / raw)
  To: xdp-newbies


I am having an issue with xdpgeneric specifically when using XDP_PASS after
bpf_xdp_adjust_head to pop some headers off. My test environment is qemu using
virtio_net specifically, but it also happens with e1000 in qemu/physical devices.

On a real NIC (ixgbe), the same program is successfully passing decapsulated
traffic, but fails in the same way when forcing xdpgeneric mode.

Here is the packet outside of my VM, with the IP+UDP+tun header.

# sudo tcpdump -ni ens5-outside -e udp -vvvXc1
tcpdump: listening on ens5-outside, link-type EN10MB (Ethernet), capture size 262144 bytes
15:54:14.263306 06:54:00:00:00:01 > 06:54:01:00:00:01, ethertype IPv4 (0x0800), length 127: (tos 0x0, ttl 255, id 0, offset 0, flags [DF], proto UDP (17), length 113)
    172.64.0.108.39999 > 172.64.0.101.7803: [no cksum] UDP, length 85
	0x0000:  4500 0071 0000 4000 ff11 222a ac40 006c  E..q..@..."*.@.l
	0x0010:  ac40 0065 9c3f 1e7b 005d 0000 0145 0000  .@.e.?.{.]...E..
	0x0020:  54e3 6b40 003f 01d5 e8c0 a800 02c0 a801  T.k@.?..........
	0x0030:  0208 002c b1b0 5eb8 f196 ca40 5d00 0000  ...,..^....@]...
	0x0040:  00c8 0304 0000 0000 0010 1112 1314 1516  ................
	0x0050:  1718 191a 1b1c 1d1e 1f20 2122 2324 2526  ..........!"#$%&
	0x0060:  2728 292a 2b2c 2d2e 2f30 3132 3334 3536  '()*+,-./0123456
	0x0070:  37                                       7

My XDP program copies the original ethhdr to the new offset and adjusts the head
forward, and I can see the resulting ICMP packet is valid with tcpdump inside
the VM:

# tcpdump -niens5 icmp -c1 -vXe
tcpdump: listening on ens5, link-type EN10MB (Ethernet), capture size 262144 bytes
15:53:25.602109 06:54:00:00:00:01 > 06:54:01:00:00:01, ethertype IPv4 (0x0800), length 98: (tos 0x0, ttl 63, id 35986, offset 0, flags [DF], proto ICMP (1), length 84)
    192.168.0.2 > 192.168.1.2: ICMP echo request, id 45150, seq 43683, length 64
	0x0000:  4500 0054 8c92 4000 3f01 2cc2 c0a8 0002  E..T..@.?.,.....
	0x0010:  c0a8 0102 0800 ed79 b05e aaa3 6dca 405d  .......y.^..m.@]
	0x0020:  0000 0000 3589 0d00 0000 0000 1011 1213  ....5...........
	0x0030:  1415 1617 1819 1a1b 1c1d 1e1f 2021 2223  .............!"#
	0x0040:  2425 2627 2829 2a2b 2c2d 2e2f 3031 3233  $%&'()*+,-./0123
	0x0050:  3435 3637                                4567

Unfortunately, at this point, the packet is dropped in ip_rcv_core. I added a
perf probe on the specific drop line that I'm hitting, and printing the skb->len
and len (from ntohs(iph->tot_len)) variables. You can see the obviously wrong
len value below, while a packet capture in the VM does show the correct value
for tot_len in the IP header.

# perf probe -L ip_rcv_core:64+4 | cat
<ip_rcv_core@/usr/src/debug/kernel-debug-5.2.2-1.2.x86_64/linux-5.2/linux-obj/../net/ipv4/ip_input.c:64>
     64  	if (pskb_trim_rcsum(skb, len)) {
     65  		__IP_INC_STATS(net, IPSTATS_MIB_INDISCARDS);
     66  		goto drop;
         	}

# perf probe -a 'ip_rcv_core:66 skb=skb->len:u32 len=len:u32'
	swapper     0 [001] 84794.954487: probe:ip_rcv_core: (ffffffffbcbd5a7d) skb=84 len=4294931717
	swapper     0 [001] 84794.965473: probe:ip_rcv_core: (ffffffffbcbd5a7d) skb=84 len=4294936833

In contrast, here's what it looks like in XDP native mode (different line number
but looking):

# perf probe -a 'ip_rcv_core:57 skb=skb->len:u32 len=len:u32'
	swapper     0 [000]   353.187439: probe:ip_rcv_core: (ffffffffac9dcca7) skb=84 len=84
	swapper     0 [003]   353.187577: probe:ip_rcv_core: (ffffffffac9dcca7) skb=84 len=84

Here's the relevant portion of my program where I decapsulate:

static __always_inline int handle_peer_data_ipv4(struct xdp_md *ctx)
{
	void *data, *data_end;
	struct ethhdr *eth, *orig_eth;
	__u32 csum = 0;

	data = (void *)(unsigned long)ctx->data;
	data_end = (void *)(unsigned long)ctx->data_end;

	if (data + sizeof(struct ethhdr) + sizeof(struct iphdr) + sizeof(struct udphdr) + sizeof(struct tunnel_header) > data_end) {
		return XDP_DROP;
	}

	orig_eth = data;
	eth = data + sizeof(struct iphdr) + sizeof(struct udphdr) + sizeof(struct tunnel_header);
	memcpy(&eth->h_source, &orig_eth->h_source, ETH_ALEN);
	memcpy(&eth->h_dest, &orig_eth->h_dest, ETH_ALEN);
	eth->h_proto = __constant_htons(ETH_P_IP);

	/* Decapsulate by removing IP + UDP + tunnel headers */
	if (bpf_xdp_adjust_head(ctx, (int)(sizeof(struct iphdr) + sizeof(struct udphdr) +
					   sizeof(struct tunnel_header)))) {
		return XDP_DROP;
	}

	return XDP_PASS;
}

struct tunnel_header {
	__u8 flags;
};

Kernel is 5.2.2-1-debug, OS is openSUSE Tumbleweed 20190724. For qemu I run with
these arguments for the NICs:

# qemu (...) -netdev tap,br=vm-bridge,id=hostnet1,ifname=ens5-outside,queues=4,vhost=on \
-device virtio-net-pci,mq=on,vectors=9,guest_tso4=off,guest_tso6=off,netdev=hostnet1,id=net1,mac=06:54:01:00:00:01

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: xdpgeneric, XDP_PASS, and bpf_xdp_adjust_head decapsulation dropping packets
  2019-07-31 21:12 xdpgeneric, XDP_PASS, and bpf_xdp_adjust_head decapsulation dropping packets Brandon Cazander
@ 2019-08-01  8:18 ` Jesper Dangaard Brouer
  2019-08-01 14:54   ` Jesper Dangaard Brouer
  2019-08-01 17:33   ` Brandon Cazander
  2019-08-01 18:00 ` [net v1 PATCH 0/4] net: fix regressions for generic-XDP Jesper Dangaard Brouer
  1 sibling, 2 replies; 16+ messages in thread
From: Jesper Dangaard Brouer @ 2019-08-01  8:18 UTC (permalink / raw)
  To: Brandon Cazander; +Cc: xdp-newbies, brouer, Stephen Hemminger, David Miller


First of all, thank you for this VERY excellent bug-report!
My comments inlined below.

Cc. Stephen as this might be related to:
  commit 458bf2f224f0 ("net: core: support XDP generic on stacked devices.") (Author: Stephen Hemminger).

On Wed, 31 Jul 2019 21:12:23 +0000 Brandon Cazander <brandon.cazander@multapplied.net> wrote:

> I am having an issue with xdpgeneric specifically when using XDP_PASS after
> bpf_xdp_adjust_head to pop some headers off. My test environment is qemu using
> virtio_net specifically, but it also happens with e1000 in qemu/physical devices.
> 
> On a real NIC (ixgbe), the same program is successfully passing decapsulated
> traffic, but fails in the same way when forcing xdpgeneric mode.

Are you sure you are using "xdpgeneric" mode?
As virtio_net do have "native" XDP mode.
 
> Here is the packet outside of my VM, with the IP+UDP+tun header.
> 
> # sudo tcpdump -ni ens5-outside -e udp -vvvXc1
> tcpdump: listening on ens5-outside, link-type EN10MB (Ethernet), capture size 262144 bytes
> 15:54:14.263306 06:54:00:00:00:01 > 06:54:01:00:00:01, ethertype IPv4 (0x0800), length 127: (tos 0x0, ttl 255, id 0, offset 0, flags [DF], proto UDP (17), length 113)
>     172.64.0.108.39999 > 172.64.0.101.7803: [no cksum] UDP, length 85
> 	0x0000:  4500 0071 0000 4000 ff11 222a ac40 006c  E..q..@..."*.@.l
                      ^^^^
iph->tot_len

$ echo $((0x71))
113

$ echo $((0x71+14))
127

> 	0x0010:  ac40 0065 9c3f 1e7b 005d 0000 0145 0000  .@.e.?.{.]...E..
                                                 ^^ ^^
Encapsulated packet starts at a funny offset, making it harder to follow.

> 	0x0020:  54e3 6b40 003f 01d5 e8c0 a800 02c0 a801  T.k@.?..........
                 ^^             ^^
Encapsulated iph->tot_len and ipproto=ICMP

> 	0x0030:  0208 002c b1b0 5eb8 f196 ca40 5d00 0000  ...,..^....@]...
> 	0x0040:  00c8 0304 0000 0000 0010 1112 1314 1516  ................
> 	0x0050:  1718 191a 1b1c 1d1e 1f20 2122 2324 2526  ..........!"#$%&
> 	0x0060:  2728 292a 2b2c 2d2e 2f30 3132 3334 3536  '()*+,-./0123456
> 	0x0070:  37                                       7
> 
> My XDP program copies the original ethhdr to the new offset and adjusts the head
> forward, and I can see the resulting ICMP packet is valid with tcpdump inside
> the VM:
> 
> # tcpdump -niens5 icmp -c1 -vXe
> tcpdump: listening on ens5, link-type EN10MB (Ethernet), capture size 262144 bytes
> 15:53:25.602109 06:54:00:00:00:01 > 06:54:01:00:00:01, ethertype IPv4 (0x0800), length 98: (tos 0x0, ttl 63, id 35986, offset 0, flags [DF], proto ICMP (1), length 84)
>     192.168.0.2 > 192.168.1.2: ICMP echo request, id 45150, seq 43683, length 64
> 	0x0000:  4500 0054 8c92 4000 3f01 2cc2 c0a8 0002  E..T..@.?.,.....
                        ^^             ^^ 
iph->tot_len and ipproto=ICMP

$ echo $((0x54))
84

$ echo $((0x54+14))
98

> 	0x0010:  c0a8 0102 0800 ed79 b05e aaa3 6dca 405d  .......y.^..m.@]
> 	0x0020:  0000 0000 3589 0d00 0000 0000 1011 1213  ....5...........
> 	0x0030:  1415 1617 1819 1a1b 1c1d 1e1f 2021 2223  .............!"#
> 	0x0040:  2425 2627 2829 2a2b 2c2d 2e2f 3031 3233  $%&'()*+,-./0123
> 	0x0050:  3435 3637                                4567
> 
> Unfortunately, at this point, the packet is dropped in ip_rcv_core. I added a
> perf probe on the specific drop line that I'm hitting, and printing the skb->len
> and len (from ntohs(iph->tot_len)) variables. You can see the obviously wrong
> len value below, while a packet capture in the VM does show the correct value
> for tot_len in the IP header.
 
The len from ntohs(iph->tot_len) is obviously wrong (len=4294931717),
but given it comes the packet data, it can easily be a mistake in your
adjustment of headers.  But from above packet dump, the iph->tot_len is
correct.  Thus, this indicate that how we find the IP-header offset is
likely broken...


> # perf probe -L ip_rcv_core:64+4 | cat
> <ip_rcv_core@/usr/src/debug/kernel-debug-5.2.2-1.2.x86_64/linux-5.2/linux-obj/../net/ipv4/ip_input.c:64>
>      64  	if (pskb_trim_rcsum(skb, len)) {
>      65  		__IP_INC_STATS(net, IPSTATS_MIB_INDISCARDS);
>      66  		goto drop;
>          	}
> 

Looking at the code, I think the packet will be dropped earlier than
the pskb_trim_rcsum() call, in the check if (skb->len < len).

	iph = ip_hdr(skb);
	[...]

	len = ntohs(iph->tot_len);
	if (skb->len < len) {
		__IP_INC_STATS(net, IPSTATS_MIB_INTRUNCATEDPKTS);
		goto drop;
	} else if (len < (iph->ihl*4))
		goto inhdr_error;

	/* Our transport medium may have padded the buffer out. Now we know it
	 * is IP we can trim to the true length of the frame.
	 * Note this now means skb->len holds ntohs(iph->tot_len).
	 */
	if (pskb_trim_rcsum(skb, len)) {
		__IP_INC_STATS(net, IPSTATS_MIB_INDISCARDS);
		goto drop;
	}

The bug might hide in iph=ip_hdr(skb) which is found via skb->network_header
(via helper ip_hdr => skb_network_header(skb)).


> # perf probe -a 'ip_rcv_core:66 skb=skb->len:u32 len=len:u32'
> 	swapper     0 [001] 84794.954487: probe:ip_rcv_core: (ffffffffbcbd5a7d) skb=84 len=4294931717
> 	swapper     0 [001] 84794.965473: probe:ip_rcv_core: (ffffffffbcbd5a7d) skb=84 len=4294936833
> 
> In contrast, here's what it looks like in XDP native mode (different line number
> but looking):
> 
> # perf probe -a 'ip_rcv_core:57 skb=skb->len:u32 len=len:u32'
> 	swapper     0 [000]   353.187439: probe:ip_rcv_core: (ffffffffac9dcca7) skb=84 len=84
> 	swapper     0 [003]   353.187577: probe:ip_rcv_core: (ffffffffac9dcca7) skb=84 len=84
> 
> Here's the relevant portion of my program where I decapsulate:
> 
> static __always_inline int handle_peer_data_ipv4(struct xdp_md *ctx)
> {
> 	void *data, *data_end;
> 	struct ethhdr *eth, *orig_eth;
> 	__u32 csum = 0;
> 
> 	data = (void *)(unsigned long)ctx->data;
> 	data_end = (void *)(unsigned long)ctx->data_end;
> 
> 	if (data + sizeof(struct ethhdr) + sizeof(struct iphdr) + sizeof(struct udphdr) + sizeof(struct tunnel_header) > data_end) {
> 		return XDP_DROP;
> 	}
> 
> 	orig_eth = data;
> 	eth = data + sizeof(struct iphdr) + sizeof(struct udphdr) + sizeof(struct tunnel_header);
> 	memcpy(&eth->h_source, &orig_eth->h_source, ETH_ALEN);
> 	memcpy(&eth->h_dest, &orig_eth->h_dest, ETH_ALEN);
> 	eth->h_proto = __constant_htons(ETH_P_IP);

You only support IPv4 encapsulated packets?

Do you handle ARP requests separately?

> 	/* Decapsulate by removing IP + UDP + tunnel headers */
> 	if (bpf_xdp_adjust_head(ctx, (int)(sizeof(struct iphdr) + sizeof(struct udphdr) +
> 					   sizeof(struct tunnel_header)))) {
> 		return XDP_DROP;
> 	}
> 
> 	return XDP_PASS;
> }
> 
> struct tunnel_header {
> 	__u8 flags;
> };
> 
> Kernel is 5.2.2-1-debug, OS is openSUSE Tumbleweed 20190724.

Can you test an earlier kernel, specifically before: commit
458bf2f224f0 ("net: core: support XDP generic on stacked devices.")
(Author: Stephen Hemminger)

$ git describe --contains  458bf2f224f04
v5.2-rc3~26^2~11^2

I fear that this commit, which moved generic-XDP to a later call point,
might cause this.  Because it could be that the SKB network_header
update, is now done before calling XDP program (... still looking at
code details).

> For qemu I run with these arguments for the NICs:
> 
> # qemu (...) -netdev tap,br=vm-bridge,id=hostnet1,ifname=ens5-outside,queues=4,vhost=on \
> -device virtio-net-pci,mq=on,vectors=9,guest_tso4=off,guest_tso6=off,netdev=hostnet1,id=net1,mac=06:54:01:00:00:01
> 


-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: xdpgeneric, XDP_PASS, and bpf_xdp_adjust_head decapsulation dropping packets
  2019-08-01  8:18 ` Jesper Dangaard Brouer
@ 2019-08-01 14:54   ` Jesper Dangaard Brouer
  2019-08-01 16:05     ` Jesper Dangaard Brouer
  2019-08-01 17:33   ` Brandon Cazander
  1 sibling, 1 reply; 16+ messages in thread
From: Jesper Dangaard Brouer @ 2019-08-01 14:54 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: Brandon Cazander, xdp-newbies, David Miller

On Thu, 1 Aug 2019 10:18:49 +0200
Jesper Dangaard Brouer <brouer@redhat.com> wrote:

[...]
> Cc. Stephen as this might be related to:
>   commit 458bf2f224f0 ("net: core: support XDP generic on stacked devices.") (Author: Stephen Hemminger).
> 
> On Wed, 31 Jul 2019 21:12:23 +0000 Brandon Cazander <brandon.cazander@multapplied.net> wrote:
> 
> > I am having an issue with xdpgeneric specifically when using XDP_PASS after
> > bpf_xdp_adjust_head to pop some headers off. My test environment is qemu using
> > virtio_net specifically, but it also happens with e1000 in qemu/physical devices.
> > 
> > On a real NIC (ixgbe), the same program is successfully passing decapsulated
> > traffic, but fails in the same way when forcing xdpgeneric mode.  
> 
[...]
> > Kernel is 5.2.2-1-debug, OS is openSUSE Tumbleweed 20190724.  
> 
> Can you test an earlier kernel, specifically before: commit
> 458bf2f224f0 ("net: core: support XDP generic on stacked devices.")
> (Author: Stephen Hemminger)
> 
> $ git describe --contains  458bf2f224f04
> v5.2-rc3~26^2~11^2
> 
> I fear that this commit, which moved generic-XDP to a later call point,
> might cause this.  Because it could be that the SKB network_header
> update, is now done before calling XDP program (... still looking at
> code details).

I now have a reproducer myself, and I can confirm that this commit
458bf2f224f0 ("net: core: support XDP generic on stacked devices.") is
the issue... reverting it fixes the problem.  (I don't have a fix yet)

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: xdpgeneric, XDP_PASS, and bpf_xdp_adjust_head decapsulation dropping packets
  2019-08-01 14:54   ` Jesper Dangaard Brouer
@ 2019-08-01 16:05     ` Jesper Dangaard Brouer
  0 siblings, 0 replies; 16+ messages in thread
From: Jesper Dangaard Brouer @ 2019-08-01 16:05 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: Brandon Cazander, xdp-newbies, David Miller, brouer

On Thu, 1 Aug 2019 16:54:33 +0200
Jesper Dangaard Brouer <brouer@redhat.com> wrote:

> On Thu, 1 Aug 2019 10:18:49 +0200
> Jesper Dangaard Brouer <brouer@redhat.com> wrote:
> 
> [...]
> > Cc. Stephen as this might be related to:
> >   commit 458bf2f224f0 ("net: core: support XDP generic on stacked devices.") (Author: Stephen Hemminger).
> > 
> > On Wed, 31 Jul 2019 21:12:23 +0000 Brandon Cazander <brandon.cazander@multapplied.net> wrote:
> >   
> > > I am having an issue with xdpgeneric specifically when using XDP_PASS after
> > > bpf_xdp_adjust_head to pop some headers off. My test environment is qemu using
> > > virtio_net specifically, but it also happens with e1000 in qemu/physical devices.
> > > 
> > > On a real NIC (ixgbe), the same program is successfully passing decapsulated
> > > traffic, but fails in the same way when forcing xdpgeneric mode.    
> >   
> [...]
> > > Kernel is 5.2.2-1-debug, OS is openSUSE Tumbleweed 20190724.    
> > 
> > Can you test an earlier kernel, specifically before: commit
> > 458bf2f224f0 ("net: core: support XDP generic on stacked devices.")
> > (Author: Stephen Hemminger)
> > 
> > $ git describe --contains  458bf2f224f04
> > v5.2-rc3~26^2~11^2
> > 
> > I fear that this commit, which moved generic-XDP to a later call point,
> > might cause this.  Because it could be that the SKB network_header
> > update, is now done before calling XDP program (... still looking at
> > code details).  
> 
> I now have a reproducer myself, and I can confirm that this commit
> 458bf2f224f0 ("net: core: support XDP generic on stacked devices.") is
> the issue... reverting it fixes the problem.  (I don't have a fix yet)

I do have a fix now... It was a matter of calling skb_reset_network_header()
which will fix the issue with ip_hdr(skb) as I suspect was wrong (as
explained earlier).

I wonder if we also need to call skb_reset_transport_header(skb) ?

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: xdpgeneric, XDP_PASS, and bpf_xdp_adjust_head decapsulation dropping packets
  2019-08-01  8:18 ` Jesper Dangaard Brouer
  2019-08-01 14:54   ` Jesper Dangaard Brouer
@ 2019-08-01 17:33   ` Brandon Cazander
  2019-08-01 18:16     ` Jesper Dangaard Brouer
  1 sibling, 1 reply; 16+ messages in thread
From: Brandon Cazander @ 2019-08-01 17:33 UTC (permalink / raw)
  To: Jesper Dangaard Brouer; +Cc: xdp-newbies, Stephen Hemminger, David Miller

Thank you very much for your prompt and detailed review of my report!
Some of these answers may be moot now that you have found the specific
commit that caused the issue.

On Thu, Aug 01, 2019 at 10:18:49AM +0200, Jesper Dangaard Brouer wrote:

> Are you sure you are using "xdpgeneric" mode?
> As virtio_net do have "native" XDP mode.

The program is loaded with XDP_FLAGS_SKB_MODE in this case. I'm having a
separate issue with "native" XDP on virtio_net that I'm working on but
it is more likely an issue with my code.

> Encapsulated packet starts at a funny offset, making it harder to follow.

Sorry about that. I should have made it easier.

> Looking at the code, I think the packet will be dropped earlier than
> the pskb_trim_rcsum() call, in the check if (skb->len < len).

I had the same suspicion but a probe in the corresponding drop showed no
results, and furthermore, the InTruncatedPkts SNMP counter was not
increasing.

> You only support IPv4 encapsulated packets?
> 
> Do you handle ARP requests separately?

Sorry, I wanted a minimal example so this is stripped down.

> Can you test an earlier kernel, specifically before: commit
> 458bf2f224f0 ("net: core: support XDP generic on stacked devices.")
> (Author: Stephen Hemminger)
> 
> $ git describe --contains  458bf2f224f04
> v5.2-rc3~26^2~11^2
> 
> I fear that this commit, which moved generic-XDP to a later call point,
> might cause this.  Because it could be that the SKB network_header
> update, is now done before calling XDP program (... still looking at
> code details).

You have already confirmed this but I can also confirm that a kernel
before commit 458bf2f224f0 works with no changes to my program.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [net v1 PATCH 0/4] net: fix regressions for generic-XDP
  2019-07-31 21:12 xdpgeneric, XDP_PASS, and bpf_xdp_adjust_head decapsulation dropping packets Brandon Cazander
  2019-08-01  8:18 ` Jesper Dangaard Brouer
@ 2019-08-01 18:00 ` Jesper Dangaard Brouer
  2019-08-01 18:00   ` [net v1 PATCH 1/4] bpf: fix XDP vlan selftests test_xdp_vlan.sh Jesper Dangaard Brouer
                     ` (4 more replies)
  1 sibling, 5 replies; 16+ messages in thread
From: Jesper Dangaard Brouer @ 2019-08-01 18:00 UTC (permalink / raw)
  To: netdev, David S. Miller
  Cc: xdp-newbies, Daniel Borkmann, brandon.cazander,
	Alexei Starovoitov, Jesper Dangaard Brouer

Thanks to Brandon Cazander, who wrote a very detailed bug report that
even used perf probe's on xdp-newbies mailing list, we discovered that
generic-XDP contains some regressions when using bpf_xdp_adjust_head().

First issue were that my selftests script, that use bpf_xdp_adjust_head(),
by mistake didn't use generic-XDP any-longer. That selftest should have
caught the real regression introduced in commit 458bf2f224f0 ("net: core:
support XDP generic on stacked devices.").

To verify this patchset fix the regressions, you can invoked manually via:

  cd tools/testing/selftests/bpf/
  sudo ./test_xdp_vlan_mode_generic.sh
  sudo ./test_xdp_vlan_mode_native.sh

Link: https://www.spinics.net/lists/xdp-newbies/msg01231.html
Fixes: 458bf2f224f0 ("net: core: support XDP generic on stacked devices.")
Reported by: Brandon Cazander <brandon.cazander@multapplied.net>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>

---

Jesper Dangaard Brouer (4):
      bpf: fix XDP vlan selftests test_xdp_vlan.sh
      selftests/bpf: add wrapper scripts for test_xdp_vlan.sh
      selftests/bpf: reduce time to execute test_xdp_vlan.sh
      net: fix bpf_xdp_adjust_head regression for generic-XDP


 net/core/dev.c                                     |   15 ++++-
 tools/testing/selftests/bpf/Makefile               |    3 +
 tools/testing/selftests/bpf/test_xdp_vlan.sh       |   57 ++++++++++++++++----
 .../selftests/bpf/test_xdp_vlan_mode_generic.sh    |    9 +++
 .../selftests/bpf/test_xdp_vlan_mode_native.sh     |    9 +++
 5 files changed, 75 insertions(+), 18 deletions(-)
 create mode 100755 tools/testing/selftests/bpf/test_xdp_vlan_mode_generic.sh
 create mode 100755 tools/testing/selftests/bpf/test_xdp_vlan_mode_native.sh

--

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [net v1 PATCH 1/4] bpf: fix XDP vlan selftests test_xdp_vlan.sh
  2019-08-01 18:00 ` [net v1 PATCH 0/4] net: fix regressions for generic-XDP Jesper Dangaard Brouer
@ 2019-08-01 18:00   ` Jesper Dangaard Brouer
  2019-08-01 18:00   ` [net v1 PATCH 2/4] selftests/bpf: add wrapper scripts for test_xdp_vlan.sh Jesper Dangaard Brouer
                     ` (3 subsequent siblings)
  4 siblings, 0 replies; 16+ messages in thread
From: Jesper Dangaard Brouer @ 2019-08-01 18:00 UTC (permalink / raw)
  To: netdev, David S. Miller
  Cc: xdp-newbies, Daniel Borkmann, brandon.cazander,
	Alexei Starovoitov, Jesper Dangaard Brouer

Change BPF selftest test_xdp_vlan.sh to (default) use generic XDP.

This selftest was created together with a fix for generic XDP, in commit
297249569932 ("net: fix generic XDP to handle if eth header was
mangled"). And was suppose to catch if generic XDP was broken again.

The tests are using veth and assumed that veth driver didn't support
native driver XDP, thus it used the (ip link set) 'xdp' attach that fell
back to generic-XDP. But veth gained native-XDP support in 948d4f214fde
("veth: Add driver XDP"), which caused this test script to use
native-XDP.

Fixes: 948d4f214fde ("veth: Add driver XDP")
Fixes: 97396ff0bc2d ("selftests/bpf: add XDP selftests for modifying and popping VLAN headers")
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
---
 tools/testing/selftests/bpf/test_xdp_vlan.sh |   42 ++++++++++++++++++++++----
 1 file changed, 36 insertions(+), 6 deletions(-)

diff --git a/tools/testing/selftests/bpf/test_xdp_vlan.sh b/tools/testing/selftests/bpf/test_xdp_vlan.sh
index 51a3a31d1aac..c8aed63b0ffe 100755
--- a/tools/testing/selftests/bpf/test_xdp_vlan.sh
+++ b/tools/testing/selftests/bpf/test_xdp_vlan.sh
@@ -1,7 +1,12 @@
 #!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+# Author: Jesper Dangaard Brouer <hawk@kernel.org>
 
 TESTNAME=xdp_vlan
 
+# Default XDP mode
+XDP_MODE=xdpgeneric
+
 usage() {
   echo "Testing XDP + TC eBPF VLAN manipulations: $TESTNAME"
   echo ""
@@ -9,9 +14,23 @@ usage() {
   echo "  -v | --verbose : Verbose"
   echo "  --flush        : Flush before starting (e.g. after --interactive)"
   echo "  --interactive  : Keep netns setup running after test-run"
+  echo "  --mode=XXX     : Choose XDP mode (xdp | xdpgeneric | xdpdrv)"
   echo ""
 }
 
+valid_xdp_mode()
+{
+	local mode=$1
+
+	case "$mode" in
+		xdpgeneric | xdpdrv | xdp)
+			return 0
+			;;
+		*)
+			return 1
+	esac
+}
+
 cleanup()
 {
 	local status=$?
@@ -37,7 +56,7 @@ cleanup()
 
 # Using external program "getopt" to get --long-options
 OPTIONS=$(getopt -o hvfi: \
-    --long verbose,flush,help,interactive,debug -- "$@")
+    --long verbose,flush,help,interactive,debug,mode: -- "$@")
 if (( $? != 0 )); then
     usage
     echo "selftests: $TESTNAME [FAILED] Error calling getopt, unknown option?"
@@ -60,6 +79,11 @@ while true; do
 		cleanup
 		shift
 		;;
+	    --mode )
+		shift
+		XDP_MODE=$1
+		shift
+		;;
 	    -- )
 		shift
 		break
@@ -81,8 +105,14 @@ if [ "$EUID" -ne 0 ]; then
 	exit 1
 fi
 
-ip link set dev lo xdp off 2>/dev/null > /dev/null
-if [ $? -ne 0 ];then
+valid_xdp_mode $XDP_MODE
+if [ $? -ne 0 ]; then
+	echo "selftests: $TESTNAME [FAILED] unknown XDP mode ($XDP_MODE)"
+	exit 1
+fi
+
+ip link set dev lo xdpgeneric off 2>/dev/null > /dev/null
+if [ $? -ne 0 ]; then
 	echo "selftests: $TESTNAME [SKIP] need ip xdp support"
 	exit 0
 fi
@@ -166,7 +196,7 @@ export FILE=test_xdp_vlan.o
 
 # First test: Remove VLAN by setting VLAN ID 0, using "xdp_vlan_change"
 export XDP_PROG=xdp_vlan_change
-ip netns exec ns1 ip link set $DEVNS1 xdp object $FILE section $XDP_PROG
+ip netns exec ns1 ip link set $DEVNS1 $XDP_MODE object $FILE section $XDP_PROG
 
 # In ns1: egress use TC to add back VLAN tag 4011
 #  (del cmd)
@@ -187,8 +217,8 @@ ip netns exec ns1 ping -W 2 -c 3 $IPADDR2
 # ETH_P_8021Q indication, and this cause overwriting of our changes.
 #
 export XDP_PROG=xdp_vlan_remove_outer2
-ip netns exec ns1 ip link set $DEVNS1 xdp off
-ip netns exec ns1 ip link set $DEVNS1 xdp object $FILE section $XDP_PROG
+ip netns exec ns1 ip link set $DEVNS1 $XDP_MODE off
+ip netns exec ns1 ip link set $DEVNS1 $XDP_MODE object $FILE section $XDP_PROG
 
 # Now the namespaces should still be able reach each-other, test with ping:
 ip netns exec ns2 ping -W 2 -c 3 $IPADDR1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [net v1 PATCH 2/4] selftests/bpf: add wrapper scripts for test_xdp_vlan.sh
  2019-08-01 18:00 ` [net v1 PATCH 0/4] net: fix regressions for generic-XDP Jesper Dangaard Brouer
  2019-08-01 18:00   ` [net v1 PATCH 1/4] bpf: fix XDP vlan selftests test_xdp_vlan.sh Jesper Dangaard Brouer
@ 2019-08-01 18:00   ` Jesper Dangaard Brouer
  2019-08-01 18:00   ` [net v1 PATCH 3/4] selftests/bpf: reduce time to execute test_xdp_vlan.sh Jesper Dangaard Brouer
                     ` (2 subsequent siblings)
  4 siblings, 0 replies; 16+ messages in thread
From: Jesper Dangaard Brouer @ 2019-08-01 18:00 UTC (permalink / raw)
  To: netdev, David S. Miller
  Cc: xdp-newbies, Daniel Borkmann, brandon.cazander,
	Alexei Starovoitov, Jesper Dangaard Brouer

In-order to test both native-XDP (xdpdrv) and generic-XDP (xdpgeneric)
create two wrapper test scripts, that start the test_xdp_vlan.sh script
with these modes.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
---
 tools/testing/selftests/bpf/Makefile               |    3 ++-
 tools/testing/selftests/bpf/test_xdp_vlan.sh       |    5 ++++-
 .../selftests/bpf/test_xdp_vlan_mode_generic.sh    |    9 +++++++++
 .../selftests/bpf/test_xdp_vlan_mode_native.sh     |    9 +++++++++
 4 files changed, 24 insertions(+), 2 deletions(-)
 create mode 100755 tools/testing/selftests/bpf/test_xdp_vlan_mode_generic.sh
 create mode 100755 tools/testing/selftests/bpf/test_xdp_vlan_mode_native.sh

diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
index 3bd0f4a0336a..29001f944db7 100644
--- a/tools/testing/selftests/bpf/Makefile
+++ b/tools/testing/selftests/bpf/Makefile
@@ -57,7 +57,8 @@ TEST_PROGS := test_kmod.sh \
 	test_lirc_mode2.sh \
 	test_skb_cgroup_id.sh \
 	test_flow_dissector.sh \
-	test_xdp_vlan.sh \
+	test_xdp_vlan_mode_generic.sh \
+	test_xdp_vlan_mode_native.sh \
 	test_lwt_ip_encap.sh \
 	test_tcp_check_syncookie.sh \
 	test_tc_tunnel.sh \
diff --git a/tools/testing/selftests/bpf/test_xdp_vlan.sh b/tools/testing/selftests/bpf/test_xdp_vlan.sh
index c8aed63b0ffe..7348661be815 100755
--- a/tools/testing/selftests/bpf/test_xdp_vlan.sh
+++ b/tools/testing/selftests/bpf/test_xdp_vlan.sh
@@ -2,7 +2,10 @@
 # SPDX-License-Identifier: GPL-2.0
 # Author: Jesper Dangaard Brouer <hawk@kernel.org>
 
-TESTNAME=xdp_vlan
+# Allow wrapper scripts to name test
+if [ -z "$TESTNAME" ]; then
+    TESTNAME=xdp_vlan
+fi
 
 # Default XDP mode
 XDP_MODE=xdpgeneric
diff --git a/tools/testing/selftests/bpf/test_xdp_vlan_mode_generic.sh b/tools/testing/selftests/bpf/test_xdp_vlan_mode_generic.sh
new file mode 100755
index 000000000000..c515326d6d59
--- /dev/null
+++ b/tools/testing/selftests/bpf/test_xdp_vlan_mode_generic.sh
@@ -0,0 +1,9 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+
+# Exit on failure
+set -e
+
+# Wrapper script to test generic-XDP
+export TESTNAME=xdp_vlan_mode_generic
+./test_xdp_vlan.sh --mode=xdpgeneric
diff --git a/tools/testing/selftests/bpf/test_xdp_vlan_mode_native.sh b/tools/testing/selftests/bpf/test_xdp_vlan_mode_native.sh
new file mode 100755
index 000000000000..5cf7ce1f16c1
--- /dev/null
+++ b/tools/testing/selftests/bpf/test_xdp_vlan_mode_native.sh
@@ -0,0 +1,9 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+
+# Exit on failure
+set -e
+
+# Wrapper script to test native-XDP
+export TESTNAME=xdp_vlan_mode_native
+./test_xdp_vlan.sh --mode=xdpdrv


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [net v1 PATCH 3/4] selftests/bpf: reduce time to execute test_xdp_vlan.sh
  2019-08-01 18:00 ` [net v1 PATCH 0/4] net: fix regressions for generic-XDP Jesper Dangaard Brouer
  2019-08-01 18:00   ` [net v1 PATCH 1/4] bpf: fix XDP vlan selftests test_xdp_vlan.sh Jesper Dangaard Brouer
  2019-08-01 18:00   ` [net v1 PATCH 2/4] selftests/bpf: add wrapper scripts for test_xdp_vlan.sh Jesper Dangaard Brouer
@ 2019-08-01 18:00   ` Jesper Dangaard Brouer
  2019-08-01 18:00   ` [net v1 PATCH 4/4] net: fix bpf_xdp_adjust_head regression for generic-XDP Jesper Dangaard Brouer
  2019-08-05 18:19   ` [net v1 PATCH 0/4] net: fix regressions " David Miller
  4 siblings, 0 replies; 16+ messages in thread
From: Jesper Dangaard Brouer @ 2019-08-01 18:00 UTC (permalink / raw)
  To: netdev, David S. Miller
  Cc: xdp-newbies, Daniel Borkmann, brandon.cazander,
	Alexei Starovoitov, Jesper Dangaard Brouer

Given the increasing number of BPF selftests, it makes sense to
reduce the time to execute these tests.  The ping parameters are
adjusted to reduce the time from measures 9 sec to approx 2.8 sec.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
---
 tools/testing/selftests/bpf/test_xdp_vlan.sh |   10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/tools/testing/selftests/bpf/test_xdp_vlan.sh b/tools/testing/selftests/bpf/test_xdp_vlan.sh
index 7348661be815..bb8b0da91686 100755
--- a/tools/testing/selftests/bpf/test_xdp_vlan.sh
+++ b/tools/testing/selftests/bpf/test_xdp_vlan.sh
@@ -188,7 +188,7 @@ ip netns exec ns2 ip link set lo up
 # At this point, the hosts cannot reach each-other,
 # because ns2 are using VLAN tags on the packets.
 
-ip netns exec ns2 sh -c 'ping -W 1 -c 1 100.64.41.1 || echo "Okay ping fails"'
+ip netns exec ns2 sh -c 'ping -W 1 -c 1 100.64.41.1 || echo "Success: First ping must fail"'
 
 
 # Now we can use the test_xdp_vlan.c program to pop/push these VLAN tags
@@ -210,8 +210,8 @@ ip netns exec ns1 tc filter add dev $DEVNS1 egress \
   prio 1 handle 1 bpf da obj $FILE sec tc_vlan_push
 
 # Now the namespaces can reach each-other, test with ping:
-ip netns exec ns2 ping -W 2 -c 3 $IPADDR1
-ip netns exec ns1 ping -W 2 -c 3 $IPADDR2
+ip netns exec ns2 ping -i 0.2 -W 2 -c 2 $IPADDR1
+ip netns exec ns1 ping -i 0.2 -W 2 -c 2 $IPADDR2
 
 # Second test: Replace xdp prog, that fully remove vlan header
 #
@@ -224,5 +224,5 @@ ip netns exec ns1 ip link set $DEVNS1 $XDP_MODE off
 ip netns exec ns1 ip link set $DEVNS1 $XDP_MODE object $FILE section $XDP_PROG
 
 # Now the namespaces should still be able reach each-other, test with ping:
-ip netns exec ns2 ping -W 2 -c 3 $IPADDR1
-ip netns exec ns1 ping -W 2 -c 3 $IPADDR2
+ip netns exec ns2 ping -i 0.2 -W 2 -c 2 $IPADDR1
+ip netns exec ns1 ping -i 0.2 -W 2 -c 2 $IPADDR2


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [net v1 PATCH 4/4] net: fix bpf_xdp_adjust_head regression for generic-XDP
  2019-08-01 18:00 ` [net v1 PATCH 0/4] net: fix regressions for generic-XDP Jesper Dangaard Brouer
                     ` (2 preceding siblings ...)
  2019-08-01 18:00   ` [net v1 PATCH 3/4] selftests/bpf: reduce time to execute test_xdp_vlan.sh Jesper Dangaard Brouer
@ 2019-08-01 18:00   ` Jesper Dangaard Brouer
  2019-08-02  0:44     ` Jakub Kicinski
  2019-08-05 18:19   ` [net v1 PATCH 0/4] net: fix regressions " David Miller
  4 siblings, 1 reply; 16+ messages in thread
From: Jesper Dangaard Brouer @ 2019-08-01 18:00 UTC (permalink / raw)
  To: netdev, David S. Miller
  Cc: xdp-newbies, Daniel Borkmann, brandon.cazander,
	Alexei Starovoitov, Jesper Dangaard Brouer

When generic-XDP was moved to a later processing step by commit
458bf2f224f0 ("net: core: support XDP generic on stacked devices.")
a regression was introduced when using bpf_xdp_adjust_head.

The issue is that after this commit the skb->network_header is now
changed prior to calling generic XDP and not after. Thus, if the header
is changed by XDP (via bpf_xdp_adjust_head), then skb->network_header
also need to be updated again.  Fix by calling skb_reset_network_header().

Fixes: 458bf2f224f0 ("net: core: support XDP generic on stacked devices.")
Reported-by: Brandon Cazander <brandon.cazander@multapplied.net>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
---
 net/core/dev.c |   15 ++++++++++-----
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index fc676b2610e3..b5533795c3c1 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4374,12 +4374,17 @@ static u32 netif_receive_generic_xdp(struct sk_buff *skb,
 
 	act = bpf_prog_run_xdp(xdp_prog, xdp);
 
+	/* check if bpf_xdp_adjust_head was used */
 	off = xdp->data - orig_data;
-	if (off > 0)
-		__skb_pull(skb, off);
-	else if (off < 0)
-		__skb_push(skb, -off);
-	skb->mac_header += off;
+	if (off) {
+		if (off > 0)
+			__skb_pull(skb, off);
+		else if (off < 0)
+			__skb_push(skb, -off);
+
+		skb->mac_header += off;
+		skb_reset_network_header(skb);
+	}
 
 	/* check if bpf_xdp_adjust_tail was used. it can only "shrink"
 	 * pckt.


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: xdpgeneric, XDP_PASS, and bpf_xdp_adjust_head decapsulation dropping packets
  2019-08-01 17:33   ` Brandon Cazander
@ 2019-08-01 18:16     ` Jesper Dangaard Brouer
  2019-08-01 18:53       ` Brandon Cazander
  0 siblings, 1 reply; 16+ messages in thread
From: Jesper Dangaard Brouer @ 2019-08-01 18:16 UTC (permalink / raw)
  To: Brandon Cazander; +Cc: xdp-newbies, Stephen Hemminger, David Miller, brouer

On Thu, 1 Aug 2019 17:33:36 +0000
Brandon Cazander <brandon.cazander@multapplied.net> wrote:

> Thank you very much for your prompt and detailed review of my report!
> Some of these answers may be moot now that you have found the specific
> commit that caused the issue.

True.
 
> On Thu, Aug 01, 2019 at 10:18:49AM +0200, Jesper Dangaard Brouer wrote:
> 
> > Are you sure you are using "xdpgeneric" mode?
> > As virtio_net do have "native" XDP mode.  
> 
> The program is loaded with XDP_FLAGS_SKB_MODE in this case. I'm having a
> separate issue with "native" XDP on virtio_net that I'm working on but
> it is more likely an issue with my code.
> 
> > Encapsulated packet starts at a funny offset, making it harder to follow.  
> 
> Sorry about that. I should have made it easier.

No problem.

> > Looking at the code, I think the packet will be dropped earlier than
> > the pskb_trim_rcsum() call, in the check if (skb->len < len).  
> 
> I had the same suspicion but a probe in the corresponding drop showed no
> results, and furthermore, the InTruncatedPkts SNMP counter was not
> increasing.

Strange.

> > You only support IPv4 encapsulated packets?
> > 
> > Do you handle ARP requests separately?  
> 
> Sorry, I wanted a minimal example so this is stripped down.

I figured, that you already handled that.
 
> > Can you test an earlier kernel, specifically before: commit
> > 458bf2f224f0 ("net: core: support XDP generic on stacked devices.")
> > (Author: Stephen Hemminger)
> > 
> > $ git describe --contains  458bf2f224f04
> > v5.2-rc3~26^2~11^2
> > 
> > I fear that this commit, which moved generic-XDP to a later call point,
> > might cause this.  Because it could be that the SKB network_header
> > update, is now done before calling XDP program (... still looking at
> > code details).  
> 
> You have already confirmed this but I can also confirm that a kernel
> before commit 458bf2f224f0 works with no changes to my program.

Thanks for also confirming. I've just send a patchset with fixes:
 https://patchwork.ozlabs.org/project/netdev/list/?series=122796&state=%2a

I can recommend that you look at the selftest script test_xdp_vlan.sh:
 https://github.com/torvalds/linux/blob/master/tools/testing/selftests/bpf/test_xdp_vlan.sh

Which uses veth and network namespaces for testing.  If you get the
hint, then you can actually create these small scripts, that can
function as unit and regression tests for XDP code snippets.

Our XDP-tutorial also uses veth as a development environment:
 https://github.com/xdp-project/xdp-tutorial/tree/master/testenv
-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: xdpgeneric, XDP_PASS, and bpf_xdp_adjust_head decapsulation dropping packets
  2019-08-01 18:16     ` Jesper Dangaard Brouer
@ 2019-08-01 18:53       ` Brandon Cazander
  0 siblings, 0 replies; 16+ messages in thread
From: Brandon Cazander @ 2019-08-01 18:53 UTC (permalink / raw)
  To: Jesper Dangaard Brouer; +Cc: xdp-newbies, Stephen Hemminger, David Miller

[...]

> Thanks for also confirming. I've just send a patchset with fixes:
>  https://patchwork.ozlabs.org/project/netdev/list/?series=122796&state=%2a

Thank you for this; I tested this against my environment and it works as
expected again.

> I can recommend that you look at the selftest script test_xdp_vlan.sh:
>  https://github.com/torvalds/linux/blob/master/tools/testing/selftests/bpf/test_xdp_vlan.sh
> 
> Which uses veth and network namespaces for testing.  If you get the
> hint, then you can actually create these small scripts, that can
> function as unit and regression tests for XDP code snippets.
> 
> Our XDP-tutorial also uses veth as a development environment:
>  https://github.com/xdp-project/xdp-tutorial/tree/master/testenv

Great hint, I will definitely use these going forward!

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [net v1 PATCH 4/4] net: fix bpf_xdp_adjust_head regression for generic-XDP
  2019-08-01 18:00   ` [net v1 PATCH 4/4] net: fix bpf_xdp_adjust_head regression for generic-XDP Jesper Dangaard Brouer
@ 2019-08-02  0:44     ` Jakub Kicinski
  2019-08-02  7:53       ` Jesper Dangaard Brouer
  0 siblings, 1 reply; 16+ messages in thread
From: Jakub Kicinski @ 2019-08-02  0:44 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: netdev, David S. Miller, xdp-newbies, Daniel Borkmann,
	brandon.cazander, Alexei Starovoitov

On Thu, 01 Aug 2019 20:00:31 +0200, Jesper Dangaard Brouer wrote:
> When generic-XDP was moved to a later processing step by commit
> 458bf2f224f0 ("net: core: support XDP generic on stacked devices.")
> a regression was introduced when using bpf_xdp_adjust_head.
> 
> The issue is that after this commit the skb->network_header is now
> changed prior to calling generic XDP and not after. Thus, if the header
> is changed by XDP (via bpf_xdp_adjust_head), then skb->network_header
> also need to be updated again.  Fix by calling skb_reset_network_header().
> 
> Fixes: 458bf2f224f0 ("net: core: support XDP generic on stacked devices.")
> Reported-by: Brandon Cazander <brandon.cazander@multapplied.net>
> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>

Out of curiosity what was your conclusion regarding resetting the
transport header as well?

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [net v1 PATCH 4/4] net: fix bpf_xdp_adjust_head regression for generic-XDP
  2019-08-02  0:44     ` Jakub Kicinski
@ 2019-08-02  7:53       ` Jesper Dangaard Brouer
  2019-08-02 17:08         ` Jakub Kicinski
  0 siblings, 1 reply; 16+ messages in thread
From: Jesper Dangaard Brouer @ 2019-08-02  7:53 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: netdev, David S. Miller, xdp-newbies, Daniel Borkmann,
	brandon.cazander, Alexei Starovoitov, brouer

On Thu, 1 Aug 2019 17:44:06 -0700
Jakub Kicinski <jakub.kicinski@netronome.com> wrote:

> On Thu, 01 Aug 2019 20:00:31 +0200, Jesper Dangaard Brouer wrote:
> > When generic-XDP was moved to a later processing step by commit
> > 458bf2f224f0 ("net: core: support XDP generic on stacked devices.")
> > a regression was introduced when using bpf_xdp_adjust_head.
> > 
> > The issue is that after this commit the skb->network_header is now
> > changed prior to calling generic XDP and not after. Thus, if the header
> > is changed by XDP (via bpf_xdp_adjust_head), then skb->network_header
> > also need to be updated again.  Fix by calling skb_reset_network_header().
> > 
> > Fixes: 458bf2f224f0 ("net: core: support XDP generic on stacked devices.")
> > Reported-by: Brandon Cazander <brandon.cazander@multapplied.net>
> > Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>  
> 
> Out of curiosity what was your conclusion regarding resetting the
> transport header as well?

Well, I don't know... I need some review, from e.g. Stephen that
changed this... I've added code snippets below signature to helper
reviewers (also helps understand below paragraph).

I think, we perhaps should call skb_reset_transport_header(), as we
change skb->data (via either __skb_pull() or __skb_push()), *BUT* I'm
not sure it is needed/required, as someone/something afterwards still
need to call skb_set_transport_header(), which also calls
skb_reset_transport_header() anyway.

I also want to ask reviewers, if we should move the call to
skb_reset_mac_len() (which Stephen added) in __netif_receive_skb_core()
into netif_receive_generic_xdp() (after the call to
skb_reset_network_header()), (it would be more natural to keep them
together)?

- - 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer

Code snippet
	/* check if bpf_xdp_adjust_head was used */
	off = xdp->data - orig_data;
	if (off) {
		if (off > 0)
			__skb_pull(skb, off);
		else if (off < 0)
			__skb_push(skb, -off);

		skb->mac_header += off;
		skb_reset_network_header(skb);
	}


static inline bool skb_transport_header_was_set(const struct sk_buff *skb)
{
	return skb->transport_header != (typeof(skb->transport_header))~0U;
}

static inline unsigned char *skb_transport_header(const struct sk_buff *skb)
{
	return skb->head + skb->transport_header;
}

static inline void skb_reset_transport_header(struct sk_buff *skb)
{
	skb->transport_header = skb->data - skb->head;
}

static inline void skb_set_transport_header(struct sk_buff *skb,
					    const int offset)
{
	skb_reset_transport_header(skb);
	skb->transport_header += offset;
}


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [net v1 PATCH 4/4] net: fix bpf_xdp_adjust_head regression for generic-XDP
  2019-08-02  7:53       ` Jesper Dangaard Brouer
@ 2019-08-02 17:08         ` Jakub Kicinski
  0 siblings, 0 replies; 16+ messages in thread
From: Jakub Kicinski @ 2019-08-02 17:08 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: netdev, David S. Miller, xdp-newbies, Daniel Borkmann,
	brandon.cazander, Alexei Starovoitov, Willem de Bruijn

On Fri, 2 Aug 2019 09:53:54 +0200, Jesper Dangaard Brouer wrote:
> On Thu, 1 Aug 2019 17:44:06 -0700
> Jakub Kicinski <jakub.kicinski@netronome.com> wrote:
> 
> > On Thu, 01 Aug 2019 20:00:31 +0200, Jesper Dangaard Brouer wrote:  
> > > When generic-XDP was moved to a later processing step by commit
> > > 458bf2f224f0 ("net: core: support XDP generic on stacked devices.")
> > > a regression was introduced when using bpf_xdp_adjust_head.
> > > 
> > > The issue is that after this commit the skb->network_header is now
> > > changed prior to calling generic XDP and not after. Thus, if the header
> > > is changed by XDP (via bpf_xdp_adjust_head), then skb->network_header
> > > also need to be updated again.  Fix by calling skb_reset_network_header().
> > > 
> > > Fixes: 458bf2f224f0 ("net: core: support XDP generic on stacked devices.")
> > > Reported-by: Brandon Cazander <brandon.cazander@multapplied.net>
> > > Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>    
> > 
> > Out of curiosity what was your conclusion regarding resetting the
> > transport header as well?  
> 
> Well, I don't know... I need some review, from e.g. Stephen that
> changed this... I've added code snippets below signature to helper
> reviewers (also helps understand below paragraph).
> 
> I think, we perhaps should call skb_reset_transport_header(), as we
> change skb->data (via either __skb_pull() or __skb_push()), *BUT* I'm
> not sure it is needed/required, as someone/something afterwards still
> need to call skb_set_transport_header(), which also calls
> skb_reset_transport_header() anyway.

Perhaps you've seen this, but just in case - this is the last commit
that touched the transport header setting in __netif_receive_skb(), 
and it sounds like it matters mostly for qdisc accounting?

commit fda55eca5a33f33ffcd4192c6b2d75179714a52c
Author: Eric Dumazet <edumazet@google.com>
Date:   Mon Jan 7 09:28:21 2013 +0000

    net: introduce skb_transport_header_was_set()
    
    We have skb_mac_header_was_set() helper to tell if mac_header
    was set on a skb. We would like the same for transport_header.
    
    __netif_receive_skb() doesn't reset the transport header if already
    set by GRO layer.
    
    Note that network stacks usually reset the transport header anyway,
    after pulling the network header, so this change only allows
    a followup patch to have more precise qdisc pkt_len computation
    for GSO packets at ingress side.
    
    Signed-off-by: Eric Dumazet <edumazet@google.com>
    Cc: Jamal Hadi Salim <jhs@mojatatu.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [net v1 PATCH 0/4] net: fix regressions for generic-XDP
  2019-08-01 18:00 ` [net v1 PATCH 0/4] net: fix regressions for generic-XDP Jesper Dangaard Brouer
                     ` (3 preceding siblings ...)
  2019-08-01 18:00   ` [net v1 PATCH 4/4] net: fix bpf_xdp_adjust_head regression for generic-XDP Jesper Dangaard Brouer
@ 2019-08-05 18:19   ` David Miller
  4 siblings, 0 replies; 16+ messages in thread
From: David Miller @ 2019-08-05 18:19 UTC (permalink / raw)
  To: brouer
  Cc: netdev, xdp-newbies, borkmann, brandon.cazander, alexei.starovoitov

From: Jesper Dangaard Brouer <brouer@redhat.com>
Date: Thu, 01 Aug 2019 20:00:11 +0200

> Thanks to Brandon Cazander, who wrote a very detailed bug report that
> even used perf probe's on xdp-newbies mailing list, we discovered that
> generic-XDP contains some regressions when using bpf_xdp_adjust_head().
> 
> First issue were that my selftests script, that use bpf_xdp_adjust_head(),
> by mistake didn't use generic-XDP any-longer. That selftest should have
> caught the real regression introduced in commit 458bf2f224f0 ("net: core:
> support XDP generic on stacked devices.").
> 
> To verify this patchset fix the regressions, you can invoked manually via:
> 
>   cd tools/testing/selftests/bpf/
>   sudo ./test_xdp_vlan_mode_generic.sh
>   sudo ./test_xdp_vlan_mode_native.sh
> 
> Link: https://www.spinics.net/lists/xdp-newbies/msg01231.html
> Fixes: 458bf2f224f0 ("net: core: support XDP generic on stacked devices.")
> Reported by: Brandon Cazander <brandon.cazander@multapplied.net>
> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>

Series applied and queued up for -stable, thanks.

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2019-08-05 18:19 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-07-31 21:12 xdpgeneric, XDP_PASS, and bpf_xdp_adjust_head decapsulation dropping packets Brandon Cazander
2019-08-01  8:18 ` Jesper Dangaard Brouer
2019-08-01 14:54   ` Jesper Dangaard Brouer
2019-08-01 16:05     ` Jesper Dangaard Brouer
2019-08-01 17:33   ` Brandon Cazander
2019-08-01 18:16     ` Jesper Dangaard Brouer
2019-08-01 18:53       ` Brandon Cazander
2019-08-01 18:00 ` [net v1 PATCH 0/4] net: fix regressions for generic-XDP Jesper Dangaard Brouer
2019-08-01 18:00   ` [net v1 PATCH 1/4] bpf: fix XDP vlan selftests test_xdp_vlan.sh Jesper Dangaard Brouer
2019-08-01 18:00   ` [net v1 PATCH 2/4] selftests/bpf: add wrapper scripts for test_xdp_vlan.sh Jesper Dangaard Brouer
2019-08-01 18:00   ` [net v1 PATCH 3/4] selftests/bpf: reduce time to execute test_xdp_vlan.sh Jesper Dangaard Brouer
2019-08-01 18:00   ` [net v1 PATCH 4/4] net: fix bpf_xdp_adjust_head regression for generic-XDP Jesper Dangaard Brouer
2019-08-02  0:44     ` Jakub Kicinski
2019-08-02  7:53       ` Jesper Dangaard Brouer
2019-08-02 17:08         ` Jakub Kicinski
2019-08-05 18:19   ` [net v1 PATCH 0/4] net: fix regressions " David Miller

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.