All of lore.kernel.org
 help / color / mirror / Atom feed
* Fighting BPF verifier to reach end-of-packet with XDP
@ 2020-05-01 15:41 Jesper Dangaard Brouer
  2020-05-01 18:10 ` David Ahern
  2020-05-06  9:08 ` Jesper Dangaard Brouer
  0 siblings, 2 replies; 4+ messages in thread
From: Jesper Dangaard Brouer @ 2020-05-01 15:41 UTC (permalink / raw)
  To: Daniel Borkmann, Alexei Starovoitov
  Cc: brouer, BPF-dev-list, Toke Høiland-Jørgensen,
	Matteo Croce, Lorenzo Bianconi, Ilias Apalodimas

Hi Daniel,

One use-case for tail grow patchset, is to add a kernel timestamp at
XDP time in the extended tailroom of packet and return XDP_PASS to let
packet travel were it needs to go, and then via tcpdump we can extract
this timestamp. (E.g. this could improve on Ilias TSN measurements[2]).

I have implemented it here[3]. It works, but it is really a hassle to
convince the BPF verifier, that my program was safe.  I use the
IP-headers total length to find the end-of-packet.

Is there an easier BPF way to move a data pointer to data_end?

Any suggestion on how I could extend the kernel (or verifier) to
provide easier access to the tailroom I grow?


[3] https://github.com/xdp-project/xdp-tutorial/blob/tailgrow01.public/packet04-tailgrow/xdp_prog_kern.c#L97-L115
[1] https://patchwork.ozlabs.org/project/netdev/list/?series=173782&state=%2A&archive=both
[2] https://github.com/xdp-project/xdp-project/blob/master/areas/arm64/xdp_for_tsn.org

- - 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer

Relevant code copy-pasted below, to make it easier to email comment:

SEC("xdp_tailgrow_parse")
int grow_parse(struct xdp_md *ctx)
{
	void *data_end;
	void *data;
	int action = XDP_PASS;
	int eth_type, ip_type;
	struct hdr_cursor nh;
	struct iphdr *iphdr;
	struct ethhdr *eth;
	__u16 ip_tot_len;

	struct my_timestamp *ts;

	/* Increase packet size (at tail) and reload data pointers */
	__u8 offset = sizeof(*ts);
	if (bpf_xdp_adjust_tail(ctx, offset))
		goto out;
	data_end = (void *)(long)ctx->data_end;
	data = (void *)(long)ctx->data;

	/* These keep track of the next header type and iterator pointer */
	nh.pos = data;

	eth_type = parse_ethhdr(&nh, data_end, &eth);
	if (eth_type < 0) {
		action = XDP_ABORTED;
		goto out;
	}

	if (eth_type == bpf_htons(ETH_P_IP)) {
		ip_type = parse_iphdr(&nh, data_end, &iphdr);
	} else {
		action = XDP_PASS;
		goto out;
	}

	/* Demo use-case: Add timestamp in extended tailroom to ICMP packets,
	 * before sending to network-stack via XDP_PASS.  This can be
	 * captured via tcpdump, and provide earlier (XDP layer) timestamp.
	 */
	if (ip_type == IPPROTO_ICMP) {

		/* Packet size in bytes, including IP header and data */
		ip_tot_len = bpf_ntohs(iphdr->tot_len);

		/*
		 * Tricks to get pass the verifier. Being allowed to use
		 * packet value iphdr->tot_len, involves bounding possible
		 * values to please verifier.
		 */
		if (ip_tot_len < 2) {
			/* This check seems strange on unsigned ip_tot_len,
			 * but is needed, else verifier complains:
			 * "unbounded min value is not allowed"
			 */
			goto out;
		}
		ip_tot_len &= 0xFFF; /* Max 4095 */

		/* Finding end of packet + offset, and bound access */
		if ((void *)iphdr + ip_tot_len + offset > data_end) {
			action = XDP_ABORTED;
			goto out;
		}

		/* Point ts to end-of-packet, that have been offset extended */
		ts = (void *)iphdr + ip_tot_len;
		ts->magic = 0x5354; /* String "TS" in network-byte-order */
		ts->time  = bpf_ktime_get_ns();
	}
out:
	return xdp_stats_record_action(ctx, action);
}


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: Fighting BPF verifier to reach end-of-packet with XDP
  2020-05-01 15:41 Fighting BPF verifier to reach end-of-packet with XDP Jesper Dangaard Brouer
@ 2020-05-01 18:10 ` David Ahern
  2020-05-06  9:08 ` Jesper Dangaard Brouer
  1 sibling, 0 replies; 4+ messages in thread
From: David Ahern @ 2020-05-01 18:10 UTC (permalink / raw)
  To: Jesper Dangaard Brouer, Daniel Borkmann, Alexei Starovoitov
  Cc: BPF-dev-list, Toke Høiland-Jørgensen, Matteo Croce,
	Lorenzo Bianconi, Ilias Apalodimas

On 5/1/20 9:41 AM, Jesper Dangaard Brouer wrote:
> One use-case for tail grow patchset, is to add a kernel timestamp at
> XDP time in the extended tailroom of packet and return XDP_PASS to let
> packet travel were it needs to go, and then via tcpdump we can extract
> this timestamp. (E.g. this could improve on Ilias TSN measurements[2]).

That's an interesting trick for timestamping xdp frames.

Have you looked at having xdp frames use existing timestamping APIs?
e.g., if PTP is enabled, pull the timestamp as meta data for the frame.

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: Fighting BPF verifier to reach end-of-packet with XDP
  2020-05-01 15:41 Fighting BPF verifier to reach end-of-packet with XDP Jesper Dangaard Brouer
  2020-05-01 18:10 ` David Ahern
@ 2020-05-06  9:08 ` Jesper Dangaard Brouer
  2020-05-06  9:28   ` Daniel Borkmann
  1 sibling, 1 reply; 4+ messages in thread
From: Jesper Dangaard Brouer @ 2020-05-06  9:08 UTC (permalink / raw)
  To: Daniel Borkmann, Alexei Starovoitov
  Cc: BPF-dev-list, Toke Høiland-Jørgensen, Matteo Croce,
	Lorenzo Bianconi, Ilias Apalodimas, brouer, xdp-newbies

On Fri, 1 May 2020 17:41:32 +0200
Jesper Dangaard Brouer <brouer@redhat.com> wrote:

> Hi Daniel,
> 
> One use-case for tail grow patchset, is to add a kernel timestamp at
> XDP time in the extended tailroom of packet and return XDP_PASS to let
> packet travel were it needs to go, and then via tcpdump we can extract
> this timestamp. (E.g. this could improve on Ilias TSN measurements[2]).
>
> I have implemented it here[3]. It works, but it is really a hassle to
> convince the BPF verifier, that my program was safe.  I use the
> IP-headers total length to find the end-of-packet.

I moved the code example here experiment01-tailgrow[4]:
 [4] https://github.com/xdp-project/xdp-tutorial/blob/master/experiment01-tailgrow/xdp_prog_kern.c

People can follow the changes via PR# [123].
 [123] https://github.com/xdp-project/xdp-tutorial/pull/123


> Is there an easier BPF way to move a data pointer to data_end?

I've also added some "fail" examples[5]:
 [5] https://github.com/xdp-project/xdp-tutorial/tree/master/experiment01-tailgrow

That I will appreciate someone to review my explaining comments, on why
verifier chooses to fail these programs... as they might be wrong.

 
> Any suggestion on how I could extend the kernel (or verifier) to
> provide easier access to the tailroom I grow?

Is it possible to use the cls_bpf older style load_byte() helpers?
 
 
> [3] https://github.com/xdp-project/xdp-tutorial/blob/tailgrow01.public/packet04-tailgrow/xdp_prog_kern.c#L97-L115
> [1] https://patchwork.ozlabs.org/project/netdev/list/?series=173782&state=%2A&archive=both
> [2] https://github.com/xdp-project/xdp-project/blob/master/areas/arm64/xdp_for_tsn.org
> 
> 
> Relevant code copy-pasted below, to make it easier to email comment:
> 
> SEC("xdp_tailgrow_parse")
> int grow_parse(struct xdp_md *ctx)
> {
> 	void *data_end;
> 	void *data;
> 	int action = XDP_PASS;
> 	int eth_type, ip_type;
> 	struct hdr_cursor nh;
> 	struct iphdr *iphdr;
> 	struct ethhdr *eth;
> 	__u16 ip_tot_len;
> 
> 	struct my_timestamp *ts;
> 
> 	/* Increase packet size (at tail) and reload data pointers */
> 	__u8 offset = sizeof(*ts);
> 	if (bpf_xdp_adjust_tail(ctx, offset))
> 		goto out;
> 	data_end = (void *)(long)ctx->data_end;
> 	data = (void *)(long)ctx->data;
> 
> 	/* These keep track of the next header type and iterator pointer */
> 	nh.pos = data;
> 
> 	eth_type = parse_ethhdr(&nh, data_end, &eth);
> 	if (eth_type < 0) {
> 		action = XDP_ABORTED;
> 		goto out;
> 	}
> 
> 	if (eth_type == bpf_htons(ETH_P_IP)) {
> 		ip_type = parse_iphdr(&nh, data_end, &iphdr);
> 	} else {
> 		action = XDP_PASS;
> 		goto out;
> 	}
> 
> 	/* Demo use-case: Add timestamp in extended tailroom to ICMP packets,
> 	 * before sending to network-stack via XDP_PASS.  This can be
> 	 * captured via tcpdump, and provide earlier (XDP layer) timestamp.
> 	 */
> 	if (ip_type == IPPROTO_ICMP) {
> 
> 		/* Packet size in bytes, including IP header and data */
> 		ip_tot_len = bpf_ntohs(iphdr->tot_len);
> 
> 		/*
> 		 * Tricks to get pass the verifier. Being allowed to use
> 		 * packet value iphdr->tot_len, involves bounding possible
> 		 * values to please verifier.
> 		 */
> 		if (ip_tot_len < 2) {
> 			/* This check seems strange on unsigned ip_tot_len,
> 			 * but is needed, else verifier complains:
> 			 * "unbounded min value is not allowed"
> 			 */
> 			goto out;
> 		}
> 		ip_tot_len &= 0xFFF; /* Max 4095 */
> 
> 		/* Finding end of packet + offset, and bound access */
> 		if ((void *)iphdr + ip_tot_len + offset > data_end) {
> 			action = XDP_ABORTED;
> 			goto out;
> 		}
> 
> 		/* Point ts to end-of-packet, that have been offset extended */
> 		ts = (void *)iphdr + ip_tot_len;
> 		ts->magic = 0x5354; /* String "TS" in network-byte-order */
> 		ts->time  = bpf_ktime_get_ns();
> 	}
> out:
> 	return xdp_stats_record_action(ctx, action);
> }
> 



-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: Fighting BPF verifier to reach end-of-packet with XDP
  2020-05-06  9:08 ` Jesper Dangaard Brouer
@ 2020-05-06  9:28   ` Daniel Borkmann
  0 siblings, 0 replies; 4+ messages in thread
From: Daniel Borkmann @ 2020-05-06  9:28 UTC (permalink / raw)
  To: Jesper Dangaard Brouer, Alexei Starovoitov
  Cc: BPF-dev-list, Toke Høiland-Jørgensen, Matteo Croce,
	Lorenzo Bianconi, Ilias Apalodimas, xdp-newbies

On 5/6/20 11:08 AM, Jesper Dangaard Brouer wrote:
> On Fri, 1 May 2020 17:41:32 +0200
> Jesper Dangaard Brouer <brouer@redhat.com> wrote:
> 
>> Hi Daniel,
>>
>> One use-case for tail grow patchset, is to add a kernel timestamp at
>> XDP time in the extended tailroom of packet and return XDP_PASS to let
>> packet travel were it needs to go, and then via tcpdump we can extract
>> this timestamp. (E.g. this could improve on Ilias TSN measurements[2]).
>>
>> I have implemented it here[3]. It works, but it is really a hassle to
>> convince the BPF verifier, that my program was safe.  I use the
>> IP-headers total length to find the end-of-packet.
> 
> I moved the code example here experiment01-tailgrow[4]:
>   [4] https://github.com/xdp-project/xdp-tutorial/blob/master/experiment01-tailgrow/xdp_prog_kern.c
> 
> People can follow the changes via PR# [123].
>   [123] https://github.com/xdp-project/xdp-tutorial/pull/123
> 
>> Is there an easier BPF way to move a data pointer to data_end?
> 
> I've also added some "fail" examples[5]:
>   [5] https://github.com/xdp-project/xdp-tutorial/tree/master/experiment01-tailgrow
> 
> That I will appreciate someone to review my explaining comments, on why
> verifier chooses to fail these programs... as they might be wrong.
>   
>> Any suggestion on how I could extend the kernel (or verifier) to
>> provide easier access to the tailroom I grow?
> 
> Is it possible to use the cls_bpf older style load_byte() helpers?

In cls_bpf we use the tailroom grow for slow-path icmp [0], which may be the
primary use-case from my PoV for the tailroom grow. We haven't had a case where
we need it for crafting custom DNS replies though (it looks like you have one in
XDP (?), so may be good to add a sample code w/ your XDP series on this).

The issue from your fail1 example should very likely be that the offset in your
case is unbounded so verifier cannot do anything with this information. You would
need to make the offset bounded, add it to data and then open the range in the
data/data_end test with the constant you're accessing later, see my comment. I
haven't run your example, but that is what I'd probably try first.

Thanks,
Daniel

   [0] https://github.com/cilium/cilium/blob/master/bpf/lib/icmp6.h#L209

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2020-05-06  9:28 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-05-01 15:41 Fighting BPF verifier to reach end-of-packet with XDP Jesper Dangaard Brouer
2020-05-01 18:10 ` David Ahern
2020-05-06  9:08 ` Jesper Dangaard Brouer
2020-05-06  9:28   ` Daniel Borkmann

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.