bpf.vger.kernel.org archive mirror
* Accessing XDP packet memory from the end
@ 2022-04-21 15:56 Larysa Zaremba
  2022-04-21 16:27 ` Jesper Dangaard Brouer
  2022-04-21 17:17 ` Toke Høiland-Jørgensen
  0 siblings, 2 replies; 6+ messages in thread
From: Larysa Zaremba @ 2022-04-21 15:56 UTC
  To: bpf
  Cc: Larysa Zaremba, netdev, Andrii Nakryiko, Alexei Starovoitov,
	Daniel Borkmann, Jesper Dangaard Brouer, Toke Hoiland-Jorgensen,
	Magnus Karlsson, Maciej Fijalkowski, Alexander Lobakin

Dear all,
Our team needs to access data_meta in the following way:

int xdp_meta_prog(struct xdp_md *ctx)
{
	void *data_meta_ptr = (void *)(long)ctx->data_meta;
	void *data_end = (void *)(long)ctx->data_end;
	void *data = (void *)(long)ctx->data;
	u64 data_size = sizeof(u32);
	u32 magic_meta;
	u8 offset;

	offset = (u8)((s64)data - (s64)data_meta_ptr);
	if (offset < data_size) {
		bpf_printk("invalid offset: %ld\n", offset);
		return XDP_DROP;
	}

	data_meta_ptr += offset;
	data_meta_ptr -= data_size;

	if (data_meta_ptr + data_size > data) {
		return XDP_DROP;
	}
		
	magic_meta = *((u32 *)data_meta_ptr);
	bpf_printk("Magic: %d\n", magic_meta);
	return XDP_PASS;
}

Unfortunately, the verifier claims this code attempts to access the packet
with an offset of -2 (a constant part), and negative offsets are generally
forbidden.
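
The check that the patch at the end touches can be pictured with a toy model. This is a simplified sketch, not the kernel's code: the struct and function names below are hypothetical, and the model keeps only the two facts that matter here, namely that the constant part of the offset is tested on its own, and that it must be non-negative and fit within the range proven by an earlier bounds check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, heavily simplified stand-in for a packet-pointer register. */
struct toy_pkt_reg {
	int fixed_off;	/* constant part accumulated from "ptr += const" */
	int range;	/* bytes proven readable by a bounds check; < 0 = unknown */
};

/* Only the fixed offset is tested: it must be non-negative and the access
 * must end inside the proven range.
 */
static bool toy_packet_access_ok(const struct toy_pkt_reg *reg, int size)
{
	assert(reg != NULL);

	if (reg->range < 0 || size <= 0)
		return false;

	return reg->fixed_off >= 0 &&
	       (uint64_t)reg->fixed_off + (uint64_t)size <= (uint64_t)reg->range;
}
```

In this model, a pointer built as "base + variable - constant" carries a negative fixed_off and is rejected no matter how large the variable part is known to be at runtime.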

For now we have two solutions. One is to call bpf_xdp_adjust_meta(),
which works well but is not ideal for the hot path.
The other is the patch at the end.

Do you see any other way of accessing memory from the end of data_meta/data?
What do you think about both suggested solutions?

Best regards,
Larysa Zaremba

---

--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -3576,8 +3576,11 @@ static int check_packet_access(struct bpf_verifier_env *env, u32 regno, int off,
 	}
 
 	err = reg->range < 0 ? -EINVAL :
-	      __check_mem_access(env, regno, off, size, reg->range,
-				 zero_size_allowed);
+	      __check_mem_access(env, regno, off + reg->smin_value, size,
+				 reg->range + reg->smin_value, zero_size_allowed);
+	err = err ? :
+	      __check_mem_access(env, regno, off + reg->umax_value, size,
+				 reg->range + reg->umax_value, zero_size_allowed);
 	if (err) {
 		verbose(env, "R%d offset is outside of the packet\n", regno);
 		return err;


* Re: Accessing XDP packet memory from the end
  2022-04-21 15:56 Accessing XDP packet memory from the end Larysa Zaremba
@ 2022-04-21 16:27 ` Jesper Dangaard Brouer
  2022-04-22 10:06   ` Alexander Lobakin
  2022-04-21 17:17 ` Toke Høiland-Jørgensen
  1 sibling, 1 reply; 6+ messages in thread
From: Jesper Dangaard Brouer @ 2022-04-21 16:27 UTC
  To: Larysa Zaremba, bpf
  Cc: brouer, netdev, Andrii Nakryiko, Alexei Starovoitov,
	Daniel Borkmann, Toke Hoiland-Jorgensen, Magnus Karlsson,
	Maciej Fijalkowski, Alexander Lobakin, xdp-hints



On 21/04/2022 17.56, Larysa Zaremba wrote:
> Dear all,
> Our team has encountered a need of accessing data_meta in a following way:
> 
> int xdp_meta_prog(struct xdp_md *ctx)
> {
> 	void *data_meta_ptr = (void *)(long)ctx->data_meta;
> 	void *data_end = (void *)(long)ctx->data_end;
> 	void *data = (void *)(long)ctx->data;
> 	u64 data_size = sizeof(u32);
> 	u32 magic_meta;
> 	u8 offset;
> 
> 	offset = (u8)((s64)data - (s64)data_meta_ptr);

I'm not sure the verifier can handle this 'offset' calculation, as it
cannot statically know the size based on this statement. Maybe this is
not the issue.

> 	if (offset < data_size) {
> 		bpf_printk("invalid offset: %ld\n", offset);
> 		return XDP_DROP;
> 	}
> 
> 	data_meta_ptr += offset;
> 	data_meta_ptr -= data_size;
> 
> 	if (data_meta_ptr + data_size > data) {
> 		return XDP_DROP;
> 	}
> 		
> 	magic_meta = *((u32 *)data);
> 	bpf_printk("Magic: %d\n", magic_meta);
> 	return XDP_PASS;
> }
> 
> Unfortunately, verifier claims this code attempts to access packet with
> an offset of -2 (a constant part) and negative offset is generally forbidden.

Did you forget to mention whether you have modified the NIC driver to
adjust the data_meta pointer and provide info in this area?

p.s. this is exactly what I'm also working towards[1], so I'll be happy
to collaborate.  I'm missing the driver code, as link[1] is focused on
decoding BTF data_meta area in userspace for AF_XDP.

[1] https://github.com/xdp-project/bpf-examples/tree/master/AF_XDP-interaction

> For now we have 2 solutions, one is using bpf_xdp_adjust_meta(),
> which is pretty good, but not ideal for the hot path.
> The second one is the patch at the end.
> 

Are you saying the verifier cannot handle the driver changing the
data_meta pointer and providing info there (without calling
bpf_xdp_adjust_meta)?


> Do you see any other way of accessing memory from the end of data_meta/data?
> What do you think about both suggested solutions?
> 
> Best regards,
> Larysa Zaremba
> 
> ---
> 
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -3576,8 +3576,11 @@ static int check_packet_access(struct bpf_verifier_env *env, u32 regno, int off,
>   	}
>   
>   	err = reg->range < 0 ? -EINVAL :
> -	      __check_mem_access(env, regno, off, size, reg->range,
> -				 zero_size_allowed);
> +	      __check_mem_access(env, regno, off + reg->smin_value, size,
> +				 reg->range + reg->smin_value, zero_size_allowed);
> +	err = err ? :
> +	      __check_mem_access(env, regno, off + reg->umax_value, size,
> +				 reg->range + reg->umax_value, zero_size_allowed);
>   	if (err) {
>   		verbose(env, "R%d offset is outside of the packet\n", regno);
>   		return err;
> 



* Re: Accessing XDP packet memory from the end
  2022-04-21 15:56 Accessing XDP packet memory from the end Larysa Zaremba
  2022-04-21 16:27 ` Jesper Dangaard Brouer
@ 2022-04-21 17:17 ` Toke Høiland-Jørgensen
  2022-04-22 16:41   ` Alexander Lobakin
  1 sibling, 1 reply; 6+ messages in thread
From: Toke Høiland-Jørgensen @ 2022-04-21 17:17 UTC
  To: Larysa Zaremba, bpf
  Cc: Larysa Zaremba, netdev, Andrii Nakryiko, Alexei Starovoitov,
	Daniel Borkmann, Jesper Dangaard Brouer, Magnus Karlsson,
	Maciej Fijalkowski, Alexander Lobakin

Larysa Zaremba <larysa.zaremba@intel.com> writes:

> Dear all,
> Our team has encountered a need of accessing data_meta in a following way:
>
> int xdp_meta_prog(struct xdp_md *ctx)
> {
> 	void *data_meta_ptr = (void *)(long)ctx->data_meta;
> 	void *data_end = (void *)(long)ctx->data_end;
> 	void *data = (void *)(long)ctx->data;
> 	u64 data_size = sizeof(u32);
> 	u32 magic_meta;
> 	u8 offset;
>
> 	offset = (u8)((s64)data - (s64)data_meta_ptr);
> 	if (offset < data_size) {
> 		bpf_printk("invalid offset: %ld\n", offset);
> 		return XDP_DROP;
> 	}
>
> 	data_meta_ptr += offset;
> 	data_meta_ptr -= data_size;
>
> 	if (data_meta_ptr + data_size > data) {
> 		return XDP_DROP;
> 	}
> 		
> 	magic_meta = *((u32 *)data);
> 	bpf_printk("Magic: %d\n", magic_meta);
> 	return XDP_PASS;
> }
>
> Unfortunately, verifier claims this code attempts to access packet with
> an offset of -2 (a constant part) and negative offset is generally forbidden.
>
> For now we have 2 solutions, one is using bpf_xdp_adjust_meta(),
> which is pretty good, but not ideal for the hot path.
> The second one is the patch at the end.
>
> Do you see any other way of accessing memory from the end of data_meta/data?
> What do you think about both suggested solutions?

The problem is that the compiler is generating code that the verifier
doesn't understand. It's notoriously hard to get LLVM to produce code
that preserves the right bounds checks, which is why projects like
Cilium use helpers with inline ASM to produce the right loads, as in [0].

Adapting that cilium helper to load from the metadata area, your example
can be rewritten as follows (which works just fine with no verifier
changes):

static __always_inline int
xdp_load_meta_bytes(const struct xdp_md *ctx, __u64 off, void *to, const __u64 len)
{
	void *from;
	int ret;
	/* LLVM tends to generate code that verifier doesn't understand,
	 * so force it the way we want it in order to open up a range
	 * on the reg.
	 */
	asm volatile("r1 = *(u32 *)(%[ctx] +8)\n\t"
		     "r2 = *(u32 *)(%[ctx] +0)\n\t"
		     "%[off] &= %[offmax]\n\t"
		     "r1 += %[off]\n\t"
		     "%[from] = r1\n\t"
		     "r1 += %[len]\n\t"
		     "if r1 > r2 goto +2\n\t"
		     "%[ret] = 0\n\t"
		     "goto +1\n\t"
		     "%[ret] = %[errno]\n\t"
		     : [ret]"=r"(ret), [from]"=r"(from)
		     : [ctx]"r"(ctx), [off]"r"(off), [len]"ri"(len),
		       [offmax]"i"(__CTX_OFF_MAX), [errno]"i"(-EINVAL)
		     : "r1", "r2");
	if (!ret)
		__builtin_memcpy(to, from, len);
	return ret;
}


SEC("xdp")
int xdp_meta_prog(struct xdp_md *ctx)
{
	void *data_meta_ptr = (void *)(long)ctx->data_meta;
	void *data = (void *)(long)ctx->data;
	__u32 magic_meta;
	__u8 offset;
	int ret;

	offset = (__u8)((__s64)data - (__s64)data_meta_ptr);
	ret = xdp_load_meta_bytes(ctx, offset - 4, &magic_meta, sizeof(magic_meta));
	if (ret) {
		bpf_printk("load bytes failed: %d\n", ret);
		return XDP_DROP;
	}

	bpf_printk("Magic: %d\n", magic_meta);
	return XDP_PASS;
}
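
For readers who do not read BPF assembly, the logic of the helper above can be approximated in plain user-space C. This is only a sketch under assumptions: struct toy_ctx, TOY_OFF_MAX and toy_load_meta_bytes() are hypothetical names, the 0xff mask is an assumed value playing the role of __CTX_OFF_MAX, and the bounds test is written on lengths rather than on pointers to keep the C well defined. It shows the two steps that matter: masking the offset so it has a known range, then checking the end of the read against data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_OFF_MAX 0xff	/* assumed mask, same role as __CTX_OFF_MAX */

/* Hypothetical user-space stand-in for the two xdp_md fields the asm reads. */
struct toy_ctx {
	const uint8_t *data_meta;
	const uint8_t *data;
};

/* Copy len bytes from data_meta + (off & TOY_OFF_MAX) into to.
 * Returns 0 on success, -1 if the read would run past data.
 */
static int toy_load_meta_bytes(const struct toy_ctx *ctx, uint64_t off,
			       void *to, size_t len)
{
	uint64_t avail;

	assert(ctx != NULL && to != NULL);
	assert(ctx->data_meta <= ctx->data);

	avail = (uint64_t)(ctx->data - ctx->data_meta);
	off &= TOY_OFF_MAX;		/* offset is now within [0, 255] */
	if (off + len > avail)		/* same test as "r1 + len > data" */
		return -1;

	memcpy(to, ctx->data_meta + off, len);
	return 0;
}
```

Note how an underflowed offset (for example 0 - 4 when there is no metadata) is masked to a large positive value and then fails the upper-bound test, so no separate lower-bound check is needed in this model.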

-Toke


[0] https://github.com/cilium/cilium/blob/master/bpf/include/bpf/ctx/xdp.h#L35



* Re: Accessing XDP packet memory from the end
  2022-04-21 16:27 ` Jesper Dangaard Brouer
@ 2022-04-22 10:06   ` Alexander Lobakin
  0 siblings, 0 replies; 6+ messages in thread
From: Alexander Lobakin @ 2022-04-22 10:06 UTC
  To: Jesper Dangaard Brouer
  Cc: Alexander Lobakin, Larysa Zaremba, bpf, brouer, netdev,
	Andrii Nakryiko, Alexei Starovoitov, Daniel Borkmann,
	Toke Hoiland-Jorgensen, Magnus Karlsson, Maciej Fijalkowski,
	xdp-hints

From: Jesper Dangaard Brouer <jbrouer@redhat.com>
Date: Thu, 21 Apr 2022 18:27:47 +0200

> On 21/04/2022 17.56, Larysa Zaremba wrote:
> > Dear all,
> > Our team has encountered a need of accessing data_meta in a following way:
> > 
> > int xdp_meta_prog(struct xdp_md *ctx)
> > {
> > 	void *data_meta_ptr = (void *)(long)ctx->data_meta;
> > 	void *data_end = (void *)(long)ctx->data_end;
> > 	void *data = (void *)(long)ctx->data;
> > 	u64 data_size = sizeof(u32);
> > 	u32 magic_meta;
> > 	u8 offset;
> > 
> > 	offset = (u8)((s64)data - (s64)data_meta_ptr);
> 
> I'm not sure the verifier can handle this 'offset' calc. As it cannot
> statically know the sized based on this statement. Maybe this is not the
> issue.
> 
> > 	if (offset < data_size) {
> > 		bpf_printk("invalid offset: %ld\n", offset);
> > 		return XDP_DROP;
> > 	}
> > 
> > 	data_meta_ptr += offset;
> > 	data_meta_ptr -= data_size;
> > 
> > 	if (data_meta_ptr + data_size > data) {
> > 		return XDP_DROP;
> > 	}
> > 		
> > 	magic_meta = *((u32 *)data);
> > 	bpf_printk("Magic: %d\n", magic_meta);
> > 	return XDP_PASS;
> > }
> > 
> > Unfortunately, verifier claims this code attempts to access packet with
> > an offset of -2 (a constant part) and negative offset is generally forbidden.
> 
> Are you forgetting to mention:
>   - Have you modified the NIC driver to adjust data_meta pointer and 
> provide info in this area?

Exactly. Previously, @data_meta == @data before the BPF program ran
in 100% of cases. Now the driver can provide arbitrary metadata and
set @data_meta to @data - 32, @data - 48, or so.

> 
> p.s. this is exactly what I'm also working towards[1], so I'll be happy
> to collaborate.  I'm missing the driver code, as link[1] is focused on
> decoding BTF data_meta area in userspace for AF_XDP.

Yeah, we're about to post a first RFC to LKML. This issue is the
last one; the rest just needs to be rebased to fix some minor issues
and polish the code.
It will contain the kernel core part and the driver part (only ice
for now). Then we could e.g. fuse it with your changes (we haven't
touched the AF_XDP part) etc.
But for now, until the RFC is posted, you could take a look at the
code in my GH[0] if you wish :) The second half of the ice code
is not committed yet, though.

> 
> [1] 
> https://github.com/xdp-project/bpf-examples/tree/master/AF_XDP-interaction
> 
> > For now we have 2 solutions, one is using bpf_xdp_adjust_meta(),
> > which is pretty good, but not ideal for the hot path.
> > The second one is the patch at the end.
> > 
> 
> Are you saying, verifier cannot handle that driver changed data_meta 
> pointer and provided info there (without calling bpf_xdp_adjust_meta)?

Correct. I suspect the verifier just assumes that @data_meta always
equals @data when executing BPF prog.
Let's assume:

	offset = data - data_meta; // 64 bytes
	data_meta += offset; // equals to data now
	/* Let's say xdp_meta_generic is 48 bytes long, then */
	data_meta -= sizeof(struct xdp_meta_generic);
	/* data_meta is now 16 bytes past the original data_meta,
	 * or data - 48.
	 */
	bpf_printk("magic: 0x%04x\n",
		   ((struct xdp_meta_generic *)data_meta)->magic);

So in fact, this code is absolutely correct, it doesn't go past the
bounds in either direction, but the verifier claims it goes out of
bounds to the left by 48 bytes (not counting the offsetof).
OTOH,

	data_meta = (void *)ctx->data_meta;
	bpf_printk("magic: 0x%04x\n",
		   ((struct xdp_meta_generic *)data_meta)->magic);

works with no issues. The verifier still thinks @data_meta == @data,
but this code effectively accesses the metadata, not the frame
itself.
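
The runtime claim above (the access stays inside the metadata area) can be checked with a small user-space model. This is a sketch under assumptions: toy_in_meta() and toy_meta_tail() are hypothetical names, and the 64-byte area and 48-byte struct are the example numbers from the text, not values taken from any driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True when [ptr, ptr + len) lies inside [data_meta, data). */
static bool toy_in_meta(const uint8_t *data_meta, const uint8_t *data,
			const uint8_t *ptr, size_t len)
{
	assert(data_meta <= data);

	return ptr >= data_meta && ptr <= data &&
	       len <= (size_t)(data - ptr);
}

/* Pointer to the last len bytes of the metadata area, or NULL when the
 * area is shorter than len.
 */
static const uint8_t *toy_meta_tail(const uint8_t *data_meta,
				    const uint8_t *data, size_t len)
{
	size_t avail = (size_t)(data - data_meta);

	return len <= avail ? data - len : NULL;
}
```

With a 64-byte metadata area and a 48-byte struct, toy_meta_tail() yields data - 48, which is data_meta + 16, and toy_in_meta() confirms the whole struct fits before data.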

> 
> 
> > Do you see any other way of accessing memory from the end of data_meta/data?
> > What do you think about both suggested solutions?
> > 
> > Best regards,
> > Larysa Zaremba
> > 
> > ---
> > 
> > --- a/kernel/bpf/verifier.c
> > +++ b/kernel/bpf/verifier.c
> > @@ -3576,8 +3576,11 @@ static int check_packet_access(struct bpf_verifier_env *env, u32 regno, int off,
> >   	}
> >   
> >   	err = reg->range < 0 ? -EINVAL :
> > -	      __check_mem_access(env, regno, off, size, reg->range,
> > -				 zero_size_allowed);
> > +	      __check_mem_access(env, regno, off + reg->smin_value, size,
> > +				 reg->range + reg->smin_value, zero_size_allowed);
> > +	err = err ? :
> > +	      __check_mem_access(env, regno, off + reg->umax_value, size,
> > +				 reg->range + reg->umax_value, zero_size_allowed);
> >   	if (err) {
> >   		verbose(env, "R%d offset is outside of the packet\n", regno);
> >   		return err;

[0] https://github.com/alobakin/linux/commits/xdp_hints

Thanks,
Al


* Re: Accessing XDP packet memory from the end
  2022-04-21 17:17 ` Toke Høiland-Jørgensen
@ 2022-04-22 16:41   ` Alexander Lobakin
  2022-04-23 20:05     ` Toke Høiland-Jørgensen
  0 siblings, 1 reply; 6+ messages in thread
From: Alexander Lobakin @ 2022-04-22 16:41 UTC
  To: Toke Høiland-Jørgensen
  Cc: Larysa Zaremba, bpf, netdev, Andrii Nakryiko, Alexei Starovoitov,
	Daniel Borkmann, Jesper Dangaard Brouer, Magnus Karlsson,
	Maciej Fijalkowski, Alexander Lobakin

From: Toke Høiland-Jørgensen <toke@redhat.com>
Date: Thu, 21 Apr 2022 19:17:11 +0200

> Larysa Zaremba <larysa.zaremba@intel.com> writes:
> 
> > Dear all,
> > Our team has encountered a need of accessing data_meta in a following way:
> >
> > int xdp_meta_prog(struct xdp_md *ctx)
> > {
> > 	void *data_meta_ptr = (void *)(long)ctx->data_meta;
> > 	void *data_end = (void *)(long)ctx->data_end;
> > 	void *data = (void *)(long)ctx->data;
> > 	u64 data_size = sizeof(u32);
> > 	u32 magic_meta;
> > 	u8 offset;
> >
> > 	offset = (u8)((s64)data - (s64)data_meta_ptr);
> > 	if (offset < data_size) {
> > 		bpf_printk("invalid offset: %ld\n", offset);
> > 		return XDP_DROP;
> > 	}
> >
> > 	data_meta_ptr += offset;
> > 	data_meta_ptr -= data_size;
> >
> > 	if (data_meta_ptr + data_size > data) {
> > 		return XDP_DROP;
> > 	}
> > 		
> > 	magic_meta = *((u32 *)data);
> > 	bpf_printk("Magic: %d\n", magic_meta);
> > 	return XDP_PASS;
> > }
> >
> > Unfortunately, verifier claims this code attempts to access packet with
> > an offset of -2 (a constant part) and negative offset is generally forbidden.
> >
> > For now we have 2 solutions, one is using bpf_xdp_adjust_meta(),
> > which is pretty good, but not ideal for the hot path.
> > The second one is the patch at the end.
> >
> > Do you see any other way of accessing memory from the end of data_meta/data?
> > What do you think about both suggested solutions?
> 
> The problem is that the compiler is generating code that the verifier
> doesn't understand. It's notoriously hard to get LLVM to produce code
> that preserves the right bounds checks which is why projects like Cilium
> use helpers with inline ASM to produce the right loads, like in [0].
> 
> Adapting that cilium helper to load from the metadata area, your example
> can be rewritten as follows (which works just fine with no verifier
> changes):
> 
> static __always_inline int
> xdp_load_meta_bytes(const struct xdp_md *ctx, __u64 off, void *to, const __u64 len)
> {
> 	void *from;
> 	int ret;
> 	/* LLVM tends to generate code that verifier doesn't understand,
> 	 * so force it the way we want it in order to open up a range
> 	 * on the reg.
> 	 */
> 	asm volatile("r1 = *(u32 *)(%[ctx] +8)\n\t"
> 		     "r2 = *(u32 *)(%[ctx] +0)\n\t"
> 		     "%[off] &= %[offmax]\n\t"
> 		     "r1 += %[off]\n\t"
> 		     "%[from] = r1\n\t"
> 		     "r1 += %[len]\n\t"
> 		     "if r1 > r2 goto +2\n\t"
> 		     "%[ret] = 0\n\t"
> 		     "goto +1\n\t"
> 		     "%[ret] = %[errno]\n\t"
> 		     : [ret]"=r"(ret), [from]"=r"(from)
> 		     : [ctx]"r"(ctx), [off]"r"(off), [len]"ri"(len),
> 		       [offmax]"i"(__CTX_OFF_MAX), [errno]"i"(-EINVAL)
> 		     : "r1", "r2");
> 	if (!ret)
> 		__builtin_memcpy(to, from, len);
> 	return ret;
> }
> 
> 
> SEC("xdp")
> int xdp_meta_prog(struct xdp_md *ctx)
> {
>         void *data_meta_ptr = (void *)(long)ctx->data_meta;
>         void *data = (void *)(long)ctx->data;
>         __u32 magic_meta;
>         __u8 offset;
> 	int ret;
> 
>         offset = (__u8)((__s64)data - (__s64)data_meta_ptr);
> 	ret = xdp_load_meta_bytes(ctx, offset - 4, &magic_meta, sizeof(magic_meta));
> 	if (ret) {
> 		bpf_printk("load bytes failed: %d\n", ret);
>                 return XDP_DROP;
> 	}
> 
>         bpf_printk("Magic: %d\n", magic_meta);
>         return XDP_PASS;
> }

At the moment, we use this (based on Cilium's and yours); it works
just like we wanted the C code to work previously:

#define __CTX_OFF_MAX 0xff

static __always_inline void *
can_i_access_meta_please(const struct xdp_md *ctx, __u64 off, const __u64 len)
{
	void *ret;

	/* LLVM tends to generate code that verifier doesn't understand,
	 * so force it the way we want it in order to open up a range
	 * on the reg.
	 */
	asm volatile("r1 = *(u32 *)(%[ctx] +8)\n\t"
		     "r2 = *(u32 *)(%[ctx] +0)\n\t"
		     "%[off] &= %[offmax]\n\t"
		     "r1 += %[off]\n\t"
		     "%[ret] = r1\n\t"
		     "r1 += %[len]\n\t"
		     "if r1 > r2 goto +1\n\t"
		     "goto +1\n\t"
		     "%[ret] = %[null]\n\t"
		     : [ret]"=r"(ret)
		     : [ctx]"r"(ctx), [off]"r"(off), [len]"ri"(len),
		       [offmax]"i"(__CTX_OFF_MAX), [null]"i"(NULL)
		     : "r1", "r2");

	return ret;
}

SEC("xdp")
int xdp_prognum_n0_meta(struct xdp_md *ctx)
{
	void *data_meta = (void *)(__s64)ctx->data_meta;
	void *data = (void *)(__s64)ctx->data;
	struct xdp_meta_generic *md;
	__u64 offset;

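	/* Point at the last sizeof(*md) bytes of the metadata area; passing
	 * the full data - data_meta distance would always fail the check.
	 */
	offset = (__u64)((__s64)data - (__s64)data_meta) - sizeof(*md);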

	md = can_i_access_meta_please(ctx, offset, sizeof(*md));
	if (__builtin_expect(!md, 0)) {
		bpf_printk("No you can't\n");
		return XDP_DROP;
	}

	bpf_printk("Magic: 0x%04x\n", md->magic_id);
	return XDP_PASS;
}

Thanks for the help! It's a shame LLVM still sucks at generating
correct object code from C.
I guess we'll define the helper above in one of the headers so we
don't copy-paste it back and forth between each program that wants
to access only the generic part of the metadata (which is always
placed at the end).

> 
> -Toke
> 
> 
> [0] https://github.com/cilium/cilium/blob/master/bpf/include/bpf/ctx/xdp.h#L35

Thanks,
Al


* Re: Accessing XDP packet memory from the end
  2022-04-22 16:41   ` Alexander Lobakin
@ 2022-04-23 20:05     ` Toke Høiland-Jørgensen
  0 siblings, 0 replies; 6+ messages in thread
From: Toke Høiland-Jørgensen @ 2022-04-23 20:05 UTC
  To: Alexander Lobakin
  Cc: Larysa Zaremba, bpf, netdev, Andrii Nakryiko, Alexei Starovoitov,
	Daniel Borkmann, Jesper Dangaard Brouer, Magnus Karlsson,
	Maciej Fijalkowski, Alexander Lobakin

Alexander Lobakin <alexandr.lobakin@intel.com> writes:

> From: Toke Høiland-Jørgensen <toke@redhat.com>
> Date: Thu, 21 Apr 2022 19:17:11 +0200
>
>> Larysa Zaremba <larysa.zaremba@intel.com> writes:
>> 
>> > Dear all,
>> > Our team has encountered a need of accessing data_meta in a following way:
>> >
>> > int xdp_meta_prog(struct xdp_md *ctx)
>> > {
>> > 	void *data_meta_ptr = (void *)(long)ctx->data_meta;
>> > 	void *data_end = (void *)(long)ctx->data_end;
>> > 	void *data = (void *)(long)ctx->data;
>> > 	u64 data_size = sizeof(u32);
>> > 	u32 magic_meta;
>> > 	u8 offset;
>> >
>> > 	offset = (u8)((s64)data - (s64)data_meta_ptr);
>> > 	if (offset < data_size) {
>> > 		bpf_printk("invalid offset: %ld\n", offset);
>> > 		return XDP_DROP;
>> > 	}
>> >
>> > 	data_meta_ptr += offset;
>> > 	data_meta_ptr -= data_size;
>> >
>> > 	if (data_meta_ptr + data_size > data) {
>> > 		return XDP_DROP;
>> > 	}
>> > 		
>> > 	magic_meta = *((u32 *)data);
>> > 	bpf_printk("Magic: %d\n", magic_meta);
>> > 	return XDP_PASS;
>> > }
>> >
>> > Unfortunately, verifier claims this code attempts to access packet with
>> > an offset of -2 (a constant part) and negative offset is generally forbidden.
>> >
>> > For now we have 2 solutions, one is using bpf_xdp_adjust_meta(),
>> > which is pretty good, but not ideal for the hot path.
>> > The second one is the patch at the end.
>> >
>> > Do you see any other way of accessing memory from the end of data_meta/data?
>> > What do you think about both suggested solutions?
>> 
>> The problem is that the compiler is generating code that the verifier
>> doesn't understand. It's notoriously hard to get LLVM to produce code
>> that preserves the right bounds checks which is why projects like Cilium
>> use helpers with inline ASM to produce the right loads, like in [0].
>> 
>> Adapting that cilium helper to load from the metadata area, your example
>> can be rewritten as follows (which works just fine with no verifier
>> changes):
>> 
>> static __always_inline int
>> xdp_load_meta_bytes(const struct xdp_md *ctx, __u64 off, void *to, const __u64 len)
>> {
>> 	void *from;
>> 	int ret;
>> 	/* LLVM tends to generate code that verifier doesn't understand,
>> 	 * so force it the way we want it in order to open up a range
>> 	 * on the reg.
>> 	 */
>> 	asm volatile("r1 = *(u32 *)(%[ctx] +8)\n\t"
>> 		     "r2 = *(u32 *)(%[ctx] +0)\n\t"
>> 		     "%[off] &= %[offmax]\n\t"
>> 		     "r1 += %[off]\n\t"
>> 		     "%[from] = r1\n\t"
>> 		     "r1 += %[len]\n\t"
>> 		     "if r1 > r2 goto +2\n\t"
>> 		     "%[ret] = 0\n\t"
>> 		     "goto +1\n\t"
>> 		     "%[ret] = %[errno]\n\t"
>> 		     : [ret]"=r"(ret), [from]"=r"(from)
>> 		     : [ctx]"r"(ctx), [off]"r"(off), [len]"ri"(len),
>> 		       [offmax]"i"(__CTX_OFF_MAX), [errno]"i"(-EINVAL)
>> 		     : "r1", "r2");
>> 	if (!ret)
>> 		__builtin_memcpy(to, from, len);
>> 	return ret;
>> }
>> 
>> 
>> SEC("xdp")
>> int xdp_meta_prog(struct xdp_md *ctx)
>> {
>>         void *data_meta_ptr = (void *)(long)ctx->data_meta;
>>         void *data = (void *)(long)ctx->data;
>>         __u32 magic_meta;
>>         __u8 offset;
>> 	int ret;
>> 
>>         offset = (__u8)((__s64)data - (__s64)data_meta_ptr);
>> 	ret = xdp_load_meta_bytes(ctx, offset - 4, &magic_meta, sizeof(magic_meta));
>> 	if (ret) {
>> 		bpf_printk("load bytes failed: %d\n", ret);
>>                 return XDP_DROP;
>> 	}
>> 
>>         bpf_printk("Magic: %d\n", magic_meta);
>>         return XDP_PASS;
>> }
>
> At the moment, we use this (based on Cilium's and your), it works
> just like we want C code to work previously:
>
> #define __CTX_OFF_MAX 0xff
>
> static __always_inline void *
> can_i_access_meta_please(const struct xdp_md *ctx, __u64 off, const __u64 len)
> {
> 	void *ret;
>
> 	/* LLVM tends to generate code that verifier doesn't understand,
> 	 * so force it the way we want it in order to open up a range
> 	 * on the reg.
> 	 */
> 	asm volatile("r1 = *(u32 *)(%[ctx] +8)\n\t"
> 		     "r2 = *(u32 *)(%[ctx] +0)\n\t"
> 		     "%[off] &= %[offmax]\n\t"
> 		     "r1 += %[off]\n\t"
> 		     "%[ret] = r1\n\t"
> 		     "r1 += %[len]\n\t"
> 		     "if r1 > r2 goto +1\n\t"
> 		     "goto +1\n\t"
> 		     "%[ret] = %[null]\n\t"
> 		     : [ret]"=r"(ret)
> 		     : [ctx]"r"(ctx), [off]"r"(off), [len]"ri"(len),
> 		       [offmax]"i"(__CTX_OFF_MAX), [null]"i"(NULL)
> 		     : "r1", "r2");
>
> 	return ret;
> }
>
> SEC("xdp")
> int xdp_prognum_n0_meta(struct xdp_md *ctx)
> {
> 	void *data_meta = (void *)(__s64)ctx->data_meta;
> 	void *data = (void *)(__s64)ctx->data;
> 	struct xdp_meta_generic *md;
> 	__u64 offset;
>
> 	offset = (__u64)((__s64)data - (__s64)data_meta);
>
> 	md = can_i_access_meta_please(ctx, offset, sizeof(*md));
> 	if (__builtin_expect(!md, 0)) {
> 		bpf_printk("No you can't\n");
> 		return XDP_DROP;
> 	}
>
> 	bpf_printk("Magic: 0x%04x\n", md->magic_id);
> 	return XDP_PASS;
> }
>
> Thanks for the help!

Great! You're welcome! :)

> It's a shame LLVM still suck on generating correct object code from C.
> I guess we'll define a helper above in one of the headers to not
> copy-paste it back and forth between each program wanting to access
> only the generic part of the metadata (which is always being placed at
> the end).

Yeah, it would be nice if LLVM could just generate code that works, but
in the meantime we'll just have to define a helper. I suspect we'll need
to define some helper functions to work with xdp-hints style metadata
field anyway, so wrapping the reader into that somewhere would probably
make sense, no?

-Toke


