bpf.vger.kernel.org archive mirror
* [PATCH bpf-next v2 0/2] xsk: XSKMAP lookup improvements
@ 2019-10-25  7:18 Björn Töpel
  2019-10-25  7:18 ` [PATCH bpf-next v2 1/2] xsk: store struct xdp_sock as a flexible array member of the XSKMAP Björn Töpel
  2019-10-25  7:18 ` [PATCH bpf-next v2 2/2] bpf: implement map_gen_lookup() callback for XSKMAP Björn Töpel
  0 siblings, 2 replies; 8+ messages in thread
From: Björn Töpel @ 2019-10-25  7:18 UTC (permalink / raw)
  To: netdev, ast, daniel, jakub.kicinski
  Cc: Björn Töpel, bpf, magnus.karlsson, magnus.karlsson,
	maciej.fijalkowski, toke

Hi,

This small set consists of two patches from Maciej and myself that
optimize XSKMAP lookups. In the first patch, the sockets are moved so
that they are stored at the tail of struct xsk_map. In the second
patch, Maciej implements map_gen_lookup() for the XSKMAP.

Based on the XDP program from tools/lib/bpf/xsk.c, where
bpf_map_lookup_elem() is explicitly called, this work yields a 5%
improvement for xdpsock's rxdrop scenario.

Cheers,
Björn and Maciej

v1->v2: * Change size/cost to size_t and use {struct, array}_size
          where appropriate. (Jakub)

Björn Töpel (1):
  xsk: store struct xdp_sock as a flexible array member of the XSKMAP

Maciej Fijalkowski (1):
  bpf: implement map_gen_lookup() callback for XSKMAP

 kernel/bpf/xskmap.c | 72 +++++++++++++++++++++++++--------------------
 1 file changed, 40 insertions(+), 32 deletions(-)

-- 
2.20.1


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH bpf-next v2 1/2] xsk: store struct xdp_sock as a flexible array member of the XSKMAP
  2019-10-25  7:18 [PATCH bpf-next v2 0/2] xsk: XSKMAP lookup improvements Björn Töpel
@ 2019-10-25  7:18 ` Björn Töpel
  2019-10-28 17:55   ` Jakub Kicinski
  2019-10-25  7:18 ` [PATCH bpf-next v2 2/2] bpf: implement map_gen_lookup() callback for XSKMAP Björn Töpel
  1 sibling, 1 reply; 8+ messages in thread
From: Björn Töpel @ 2019-10-25  7:18 UTC (permalink / raw)
  To: netdev, ast, daniel, jakub.kicinski
  Cc: Björn Töpel, bpf, magnus.karlsson, magnus.karlsson,
	maciej.fijalkowski, toke

From: Björn Töpel <bjorn.topel@intel.com>

Prior to this commit, the XDP socket instances were stored in a
separately allocated array of the XSKMAP. Now, we store the sockets
as a flexible array member, in a similar fashion to the arraymap. Doing
so, we do less pointer chasing in the lookup.

Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
---
 kernel/bpf/xskmap.c | 55 +++++++++++++++++++--------------------------
 1 file changed, 23 insertions(+), 32 deletions(-)

diff --git a/kernel/bpf/xskmap.c b/kernel/bpf/xskmap.c
index 82a1ffe15dfa..a83e92fe2971 100644
--- a/kernel/bpf/xskmap.c
+++ b/kernel/bpf/xskmap.c
@@ -11,9 +11,9 @@
 
 struct xsk_map {
 	struct bpf_map map;
-	struct xdp_sock **xsk_map;
 	struct list_head __percpu *flush_list;
 	spinlock_t lock; /* Synchronize map updates */
+	struct xdp_sock *xsk_map[];
 };
 
 int xsk_map_inc(struct xsk_map *map)
@@ -80,9 +80,10 @@ static void xsk_map_sock_delete(struct xdp_sock *xs,
 
 static struct bpf_map *xsk_map_alloc(union bpf_attr *attr)
 {
+	struct bpf_map_memory mem;
+	int cpu, err, numa_node;
 	struct xsk_map *m;
-	int cpu, err;
-	u64 cost;
+	size_t cost, size;
 
 	if (!capable(CAP_NET_ADMIN))
 		return ERR_PTR(-EPERM);
@@ -92,44 +93,35 @@ static struct bpf_map *xsk_map_alloc(union bpf_attr *attr)
 	    attr->map_flags & ~(BPF_F_NUMA_NODE | BPF_F_RDONLY | BPF_F_WRONLY))
 		return ERR_PTR(-EINVAL);
 
-	m = kzalloc(sizeof(*m), GFP_USER);
-	if (!m)
+	numa_node = bpf_map_attr_numa_node(attr);
+	size = struct_size(m, xsk_map, attr->max_entries);
+	cost = size + array_size(sizeof(*m->flush_list), num_possible_cpus());
+
+	err = bpf_map_charge_init(&mem, cost);
+	if (err < 0)
+		return ERR_PTR(err);
+
+	m = bpf_map_area_alloc(size, numa_node);
+	if (!m) {
+		bpf_map_charge_finish(&mem);
 		return ERR_PTR(-ENOMEM);
+	}
 
 	bpf_map_init_from_attr(&m->map, attr);
+	bpf_map_charge_move(&m->map.memory, &mem);
 	spin_lock_init(&m->lock);
 
-	cost = (u64)m->map.max_entries * sizeof(struct xdp_sock *);
-	cost += sizeof(struct list_head) * num_possible_cpus();
-
-	/* Notice returns -EPERM on if map size is larger than memlock limit */
-	err = bpf_map_charge_init(&m->map.memory, cost);
-	if (err)
-		goto free_m;
-
-	err = -ENOMEM;
-
 	m->flush_list = alloc_percpu(struct list_head);
-	if (!m->flush_list)
-		goto free_charge;
+	if (!m->flush_list) {
+		bpf_map_charge_finish(&m->map.memory);
+		bpf_map_area_free(m);
+		return ERR_PTR(-ENOMEM);
+	}
 
 	for_each_possible_cpu(cpu)
 		INIT_LIST_HEAD(per_cpu_ptr(m->flush_list, cpu));
 
-	m->xsk_map = bpf_map_area_alloc(m->map.max_entries *
-					sizeof(struct xdp_sock *),
-					m->map.numa_node);
-	if (!m->xsk_map)
-		goto free_percpu;
 	return &m->map;
-
-free_percpu:
-	free_percpu(m->flush_list);
-free_charge:
-	bpf_map_charge_finish(&m->map.memory);
-free_m:
-	kfree(m);
-	return ERR_PTR(err);
 }
 
 static void xsk_map_free(struct bpf_map *map)
@@ -139,8 +131,7 @@ static void xsk_map_free(struct bpf_map *map)
 	bpf_clear_redirect_map(map);
 	synchronize_net();
 	free_percpu(m->flush_list);
-	bpf_map_area_free(m->xsk_map);
-	kfree(m);
+	bpf_map_area_free(m);
 }
 
 static int xsk_map_get_next_key(struct bpf_map *map, void *key, void *next_key)
-- 
2.20.1



* [PATCH bpf-next v2 2/2] bpf: implement map_gen_lookup() callback for XSKMAP
  2019-10-25  7:18 [PATCH bpf-next v2 0/2] xsk: XSKMAP lookup improvements Björn Töpel
  2019-10-25  7:18 ` [PATCH bpf-next v2 1/2] xsk: store struct xdp_sock as a flexible array member of the XSKMAP Björn Töpel
@ 2019-10-25  7:18 ` Björn Töpel
  1 sibling, 0 replies; 8+ messages in thread
From: Björn Töpel @ 2019-10-25  7:18 UTC (permalink / raw)
  To: netdev, ast, daniel, jakub.kicinski
  Cc: Maciej Fijalkowski, bpf, magnus.karlsson, magnus.karlsson, toke

From: Maciej Fijalkowski <maciej.fijalkowski@intel.com>

I gave implementing map_gen_lookup() for the xskmap a shot, just to wipe
away the tears from the bulk redirect struggle... I'm not sure whether
it's still needed, though, since Björn posted a patch that removes the
bpf_map_lookup_elem() call from the XDP program that libbpf attaches for
AF_XDP sockets.

Let me know what you think; I tested it out and it seems to do the job,
but I didn't check the perf difference. If it's useful, then devmap could
get mostly the same implementation.

Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
---
 kernel/bpf/xskmap.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/kernel/bpf/xskmap.c b/kernel/bpf/xskmap.c
index a83e92fe2971..62ce88e57d98 100644
--- a/kernel/bpf/xskmap.c
+++ b/kernel/bpf/xskmap.c
@@ -163,6 +163,22 @@ struct xdp_sock *__xsk_map_lookup_elem(struct bpf_map *map, u32 key)
 	return xs;
 }
 
+static u32 xsk_map_gen_lookup(struct bpf_map *map, struct bpf_insn *insn_buf)
+{
+	const int ret = BPF_REG_0, mp = BPF_REG_1, index = BPF_REG_2;
+	struct bpf_insn *insn = insn_buf;
+
+	*insn++ = BPF_LDX_MEM(BPF_W, ret, index, 0);
+	*insn++ = BPF_JMP_IMM(BPF_JGE, ret, map->max_entries, 5);
+	*insn++ = BPF_ALU64_IMM(BPF_LSH, ret, ilog2(sizeof(struct xdp_sock *)));
+	*insn++ = BPF_ALU64_IMM(BPF_ADD, mp, offsetof(struct xsk_map, xsk_map));
+	*insn++ = BPF_ALU64_REG(BPF_ADD, ret, mp);
+	*insn++ = BPF_LDX_MEM(BPF_DW, ret, ret, 0);
+	*insn++ = BPF_JMP_IMM(BPF_JA, 0, 0, 1);
+	*insn++ = BPF_MOV64_IMM(ret, 0);
+	return insn - insn_buf;
+}
+
 int __xsk_map_redirect(struct bpf_map *map, struct xdp_buff *xdp,
 		       struct xdp_sock *xs)
 {
@@ -303,6 +319,7 @@ const struct bpf_map_ops xsk_map_ops = {
 	.map_free = xsk_map_free,
 	.map_get_next_key = xsk_map_get_next_key,
 	.map_lookup_elem = xsk_map_lookup_elem,
+	.map_gen_lookup = xsk_map_gen_lookup,
 	.map_lookup_elem_sys_only = xsk_map_lookup_elem_sys_only,
 	.map_update_elem = xsk_map_update_elem,
 	.map_delete_elem = xsk_map_delete_elem,
-- 
2.20.1



* Re: [PATCH bpf-next v2 1/2] xsk: store struct xdp_sock as a flexible array member of the XSKMAP
  2019-10-25  7:18 ` [PATCH bpf-next v2 1/2] xsk: store struct xdp_sock as a flexible array member of the XSKMAP Björn Töpel
@ 2019-10-28 17:55   ` Jakub Kicinski
  2019-10-28 22:11     ` Björn Töpel
  0 siblings, 1 reply; 8+ messages in thread
From: Jakub Kicinski @ 2019-10-28 17:55 UTC (permalink / raw)
  To: Björn Töpel
  Cc: netdev, ast, daniel, Björn Töpel, bpf, magnus.karlsson,
	magnus.karlsson, maciej.fijalkowski, toke

On Fri, 25 Oct 2019 09:18:40 +0200, Björn Töpel wrote:
> From: Björn Töpel <bjorn.topel@intel.com>
> 
> Prior this commit, the array storing XDP socket instances were stored
> in a separate allocated array of the XSKMAP. Now, we store the sockets
> as a flexible array member in a similar fashion as the arraymap. Doing
> so, we do less pointer chasing in the lookup.
> 
> Signed-off-by: Björn Töpel <bjorn.topel@intel.com>

Thanks for the re-spin.

> diff --git a/kernel/bpf/xskmap.c b/kernel/bpf/xskmap.c
> index 82a1ffe15dfa..a83e92fe2971 100644
> --- a/kernel/bpf/xskmap.c
> +++ b/kernel/bpf/xskmap.c

> @@ -92,44 +93,35 @@ static struct bpf_map *xsk_map_alloc(union bpf_attr *attr)
>  	    attr->map_flags & ~(BPF_F_NUMA_NODE | BPF_F_RDONLY | BPF_F_WRONLY))
>  		return ERR_PTR(-EINVAL);
>  
> -	m = kzalloc(sizeof(*m), GFP_USER);
> -	if (!m)
> +	numa_node = bpf_map_attr_numa_node(attr);
> +	size = struct_size(m, xsk_map, attr->max_entries);
> +	cost = size + array_size(sizeof(*m->flush_list), num_possible_cpus());

Now we didn't use array_size() previously because the sum here may
overflow.

We could use __ab_c_size() here, but the name is probably too ugly to use
directly, and IDK what we'd have to name such an accumulation helper...

So maybe just make cost and size a u64 and we should be in the clear.

> +	err = bpf_map_charge_init(&mem, cost);
> +	if (err < 0)
> +		return ERR_PTR(err);
> +
> +	m = bpf_map_area_alloc(size, numa_node);
> +	if (!m) {
> +		bpf_map_charge_finish(&mem);
>  		return ERR_PTR(-ENOMEM);
> +	}
>  
>  	bpf_map_init_from_attr(&m->map, attr);
> +	bpf_map_charge_move(&m->map.memory, &mem);
>  	spin_lock_init(&m->lock);
>  
> -	cost = (u64)m->map.max_entries * sizeof(struct xdp_sock *);
> -	cost += sizeof(struct list_head) * num_possible_cpus();
> -
> -	/* Notice returns -EPERM on if map size is larger than memlock limit */
> -	err = bpf_map_charge_init(&m->map.memory, cost);
> -	if (err)
> -		goto free_m;
> -
> -	err = -ENOMEM;
> -
>  	m->flush_list = alloc_percpu(struct list_head);
> -	if (!m->flush_list)
> -		goto free_charge;
> +	if (!m->flush_list) {
> +		bpf_map_charge_finish(&m->map.memory);
> +		bpf_map_area_free(m);
> +		return ERR_PTR(-ENOMEM);
> +	}
>  
>  	for_each_possible_cpu(cpu)
>  		INIT_LIST_HEAD(per_cpu_ptr(m->flush_list, cpu));
>  
> -	m->xsk_map = bpf_map_area_alloc(m->map.max_entries *
> -					sizeof(struct xdp_sock *),
> -					m->map.numa_node);
> -	if (!m->xsk_map)
> -		goto free_percpu;
>  	return &m->map;
> -
> -free_percpu:
> -	free_percpu(m->flush_list);
> -free_charge:
> -	bpf_map_charge_finish(&m->map.memory);
> -free_m:
> -	kfree(m);
> -	return ERR_PTR(err);
>  }
>  
>  static void xsk_map_free(struct bpf_map *map)


* Re: [PATCH bpf-next v2 1/2] xsk: store struct xdp_sock as a flexible array member of the XSKMAP
  2019-10-28 17:55   ` Jakub Kicinski
@ 2019-10-28 22:11     ` Björn Töpel
  2019-10-28 22:26       ` Jakub Kicinski
  0 siblings, 1 reply; 8+ messages in thread
From: Björn Töpel @ 2019-10-28 22:11 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Netdev, Alexei Starovoitov, Daniel Borkmann,
	Björn Töpel, bpf, Magnus Karlsson, Karlsson, Magnus,
	Fijalkowski, Maciej, Toke Høiland-Jørgensen

On Mon, 28 Oct 2019 at 18:55, Jakub Kicinski
<jakub.kicinski@netronome.com> wrote:
>
> On Fri, 25 Oct 2019 09:18:40 +0200, Björn Töpel wrote:
> > From: Björn Töpel <bjorn.topel@intel.com>
> >
> > Prior this commit, the array storing XDP socket instances were stored
> > in a separate allocated array of the XSKMAP. Now, we store the sockets
> > as a flexible array member in a similar fashion as the arraymap. Doing
> > so, we do less pointer chasing in the lookup.
> >
> > Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
>
> Thanks for the re-spin.
>
> > diff --git a/kernel/bpf/xskmap.c b/kernel/bpf/xskmap.c
> > index 82a1ffe15dfa..a83e92fe2971 100644
> > --- a/kernel/bpf/xskmap.c
> > +++ b/kernel/bpf/xskmap.c
>
> > @@ -92,44 +93,35 @@ static struct bpf_map *xsk_map_alloc(union bpf_attr *attr)
> >           attr->map_flags & ~(BPF_F_NUMA_NODE | BPF_F_RDONLY | BPF_F_WRONLY))
> >               return ERR_PTR(-EINVAL);
> >
> > -     m = kzalloc(sizeof(*m), GFP_USER);
> > -     if (!m)
> > +     numa_node = bpf_map_attr_numa_node(attr);
> > +     size = struct_size(m, xsk_map, attr->max_entries);
> > +     cost = size + array_size(sizeof(*m->flush_list), num_possible_cpus());
>
> Now we didn't use array_size() previously because the sum here may
> overflow.
>
> We could use __ab_c_size() here, the name is probably too ugly to use
> directly and IDK what we'd have to name such a accumulation helper...
>
> So maybe just make cost and size a u64 and we should be in the clear.
>

Hmm, but both:
  int bpf_map_charge_init(struct bpf_map_memory *mem, size_t size);
  void *bpf_map_area_alloc(size_t size, int numa_node);
pass size as size_t, so casting to u64 doesn't really help on 32-bit
systems, right?

Wdyt about simply adding:
  if (cost < size)
    return ERR_PTR(-EINVAL)
after the cost calculation for explicit overflow checking?

So, if size's struct_size overflows, the allocation will fail.
And we'll catch the cost overflow with the if-statement, no?

Another option is changing the size_t in bpf_map_... to u64. Maybe
that's better, since arraymap and devmap use u64 for cost/size.


Björn

> > +     err = bpf_map_charge_init(&mem, cost);
> > +     if (err < 0)
> > +             return ERR_PTR(err);
> > +
> > +     m = bpf_map_area_alloc(size, numa_node);
> > +     if (!m) {
> > +             bpf_map_charge_finish(&mem);
> >               return ERR_PTR(-ENOMEM);
> > +     }
> >
> >       bpf_map_init_from_attr(&m->map, attr);
> > +     bpf_map_charge_move(&m->map.memory, &mem);
> >       spin_lock_init(&m->lock);
> >
> > -     cost = (u64)m->map.max_entries * sizeof(struct xdp_sock *);
> > -     cost += sizeof(struct list_head) * num_possible_cpus();
> > -
> > -     /* Notice returns -EPERM on if map size is larger than memlock limit */
> > -     err = bpf_map_charge_init(&m->map.memory, cost);
> > -     if (err)
> > -             goto free_m;
> > -
> > -     err = -ENOMEM;
> > -
> >       m->flush_list = alloc_percpu(struct list_head);
> > -     if (!m->flush_list)
> > -             goto free_charge;
> > +     if (!m->flush_list) {
> > +             bpf_map_charge_finish(&m->map.memory);
> > +             bpf_map_area_free(m);
> > +             return ERR_PTR(-ENOMEM);
> > +     }
> >
> >       for_each_possible_cpu(cpu)
> >               INIT_LIST_HEAD(per_cpu_ptr(m->flush_list, cpu));
> >
> > -     m->xsk_map = bpf_map_area_alloc(m->map.max_entries *
> > -                                     sizeof(struct xdp_sock *),
> > -                                     m->map.numa_node);
> > -     if (!m->xsk_map)
> > -             goto free_percpu;
> >       return &m->map;
> > -
> > -free_percpu:
> > -     free_percpu(m->flush_list);
> > -free_charge:
> > -     bpf_map_charge_finish(&m->map.memory);
> > -free_m:
> > -     kfree(m);
> > -     return ERR_PTR(err);
> >  }
> >
> >  static void xsk_map_free(struct bpf_map *map)


* Re: [PATCH bpf-next v2 1/2] xsk: store struct xdp_sock as a flexible array member of the XSKMAP
  2019-10-28 22:11     ` Björn Töpel
@ 2019-10-28 22:26       ` Jakub Kicinski
  2019-10-29  6:20         ` Björn Töpel
  0 siblings, 1 reply; 8+ messages in thread
From: Jakub Kicinski @ 2019-10-28 22:26 UTC (permalink / raw)
  To: Björn Töpel
  Cc: Netdev, Alexei Starovoitov, Daniel Borkmann,
	Björn Töpel, bpf, Magnus Karlsson, Karlsson, Magnus,
	Fijalkowski, Maciej, Toke Høiland-Jørgensen

On Mon, 28 Oct 2019 23:11:50 +0100, Björn Töpel wrote:
> On Mon, 28 Oct 2019 at 18:55, Jakub Kicinski
> <jakub.kicinski@netronome.com> wrote:
> >
> > On Fri, 25 Oct 2019 09:18:40 +0200, Björn Töpel wrote:  
> > > From: Björn Töpel <bjorn.topel@intel.com>
> > >
> > > Prior this commit, the array storing XDP socket instances were stored
> > > in a separate allocated array of the XSKMAP. Now, we store the sockets
> > > as a flexible array member in a similar fashion as the arraymap. Doing
> > > so, we do less pointer chasing in the lookup.
> > >
> > > Signed-off-by: Björn Töpel <bjorn.topel@intel.com>  
> >
> > Thanks for the re-spin.
> >  
> > > diff --git a/kernel/bpf/xskmap.c b/kernel/bpf/xskmap.c
> > > index 82a1ffe15dfa..a83e92fe2971 100644
> > > --- a/kernel/bpf/xskmap.c
> > > +++ b/kernel/bpf/xskmap.c  
> >  
> > > @@ -92,44 +93,35 @@ static struct bpf_map *xsk_map_alloc(union bpf_attr *attr)
> > >           attr->map_flags & ~(BPF_F_NUMA_NODE | BPF_F_RDONLY | BPF_F_WRONLY))
> > >               return ERR_PTR(-EINVAL);
> > >
> > > -     m = kzalloc(sizeof(*m), GFP_USER);
> > > -     if (!m)
> > > +     numa_node = bpf_map_attr_numa_node(attr);
> > > +     size = struct_size(m, xsk_map, attr->max_entries);
> > > +     cost = size + array_size(sizeof(*m->flush_list), num_possible_cpus());  
> >
> > Now we didn't use array_size() previously because the sum here may
> > overflow.
> >
> > We could use __ab_c_size() here, the name is probably too ugly to use
> > directly and IDK what we'd have to name such a accumulation helper...
> >
> > So maybe just make cost and size a u64 and we should be in the clear.
> >  
> 
> Hmm, but both:
>   int bpf_map_charge_init(struct bpf_map_memory *mem, size_t size);
>   void *bpf_map_area_alloc(size_t size, int numa_node);
> pass size as size_t, so casting to u64 doesn't really help on 32-bit
> systems, right?

Yup :( IOW looks like the overflows will not be caught on 32bit
machines in all existing code that does the (u64) cast. I hope 
I'm wrong there.

> Wdyt about simply adding:
>   if (cost < size)
>     return ERR_PTR(-EINVAL)
> after the cost calculation for explicit overflow checking?

We'd need that for all users of these helpers. Could it perhaps make
sense to pass "alloc_size" and "extra_cost" as separate size_t arguments
to bpf_map_charge_init(), so that we can do the overflow checking there,
centrally?

We can probably get rid of all the u64 casting too at that point,
and use standard overflow helpers, yuppie :)

> So, if size's struct_size overflows, the allocation will fail.
> And we'll catch the cost overflow with the if-statement, no?
> 
> Another option is changing the size_t in bpf_map_... to u64. Maybe
> that's better, since arraymap and devmap uses u64 for cost/size.


* Re: [PATCH bpf-next v2 1/2] xsk: store struct xdp_sock as a flexible array member of the XSKMAP
  2019-10-28 22:26       ` Jakub Kicinski
@ 2019-10-29  6:20         ` Björn Töpel
  2019-10-29 13:44           ` Alexei Starovoitov
  0 siblings, 1 reply; 8+ messages in thread
From: Björn Töpel @ 2019-10-29  6:20 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Netdev, Alexei Starovoitov, Daniel Borkmann,
	Björn Töpel, bpf, Magnus Karlsson, Karlsson, Magnus,
	Fijalkowski, Maciej, Toke Høiland-Jørgensen

On Mon, 28 Oct 2019 at 23:26, Jakub Kicinski
<jakub.kicinski@netronome.com> wrote:
>
> On Mon, 28 Oct 2019 23:11:50 +0100, Björn Töpel wrote:
> > On Mon, 28 Oct 2019 at 18:55, Jakub Kicinski
> > <jakub.kicinski@netronome.com> wrote:
> > >
> > > On Fri, 25 Oct 2019 09:18:40 +0200, Björn Töpel wrote:
> > > > From: Björn Töpel <bjorn.topel@intel.com>
> > > >
> > > > Prior this commit, the array storing XDP socket instances were stored
> > > > in a separate allocated array of the XSKMAP. Now, we store the sockets
> > > > as a flexible array member in a similar fashion as the arraymap. Doing
> > > > so, we do less pointer chasing in the lookup.
> > > >
> > > > Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
> > >
> > > Thanks for the re-spin.
> > >
> > > > diff --git a/kernel/bpf/xskmap.c b/kernel/bpf/xskmap.c
> > > > index 82a1ffe15dfa..a83e92fe2971 100644
> > > > --- a/kernel/bpf/xskmap.c
> > > > +++ b/kernel/bpf/xskmap.c
> > >
> > > > @@ -92,44 +93,35 @@ static struct bpf_map *xsk_map_alloc(union bpf_attr *attr)
> > > >           attr->map_flags & ~(BPF_F_NUMA_NODE | BPF_F_RDONLY | BPF_F_WRONLY))
> > > >               return ERR_PTR(-EINVAL);
> > > >
> > > > -     m = kzalloc(sizeof(*m), GFP_USER);
> > > > -     if (!m)
> > > > +     numa_node = bpf_map_attr_numa_node(attr);
> > > > +     size = struct_size(m, xsk_map, attr->max_entries);
> > > > +     cost = size + array_size(sizeof(*m->flush_list), num_possible_cpus());
> > >
> > > Now we didn't use array_size() previously because the sum here may
> > > overflow.
> > >
> > > We could use __ab_c_size() here, the name is probably too ugly to use
> > > directly and IDK what we'd have to name such a accumulation helper...
> > >
> > > So maybe just make cost and size a u64 and we should be in the clear.
> > >
> >
> > Hmm, but both:
> >   int bpf_map_charge_init(struct bpf_map_memory *mem, size_t size);
> >   void *bpf_map_area_alloc(size_t size, int numa_node);
> > pass size as size_t, so casting to u64 doesn't really help on 32-bit
> > systems, right?
>
> Yup :( IOW looks like the overflows will not be caught on 32bit
> machines in all existing code that does the (u64) cast. I hope
> I'm wrong there.
>
> > Wdyt about simply adding:
> >   if (cost < size)
> >     return ERR_PTR(-EINVAL)
> > after the cost calculation for explicit overflow checking?
>
> We'd need that for all users of these helpers. Could it perhaps makes
> sense to pass "alloc_size" and "extra_cost" as separate size_t to
> bpf_map_charge_init() and then we can do the overflow checking there,
> centrally?
>

The cost/size calculations seem to vary a bit from map to map, so I
don't know about the extra size_t arguments... but all of them do use
u64 for cost with explicit casting, instead of u32 overflow checks.
Changing the bpf_map_charge_init()/bpf_map_area_alloc() size to u64 would
be the smallest change, together with a 64-to-32 overflow check in
those functions.

> We can probably get rid of all the u64 casting too at that point,
> and use standard overflow helpers, yuppie :)
>

Yeah, that's the other path, but more churn (check_add_overflow() in every map).

Preferred path?

> > So, if size's struct_size overflows, the allocation will fail.
> > And we'll catch the cost overflow with the if-statement, no?
> >
> > Another option is changing the size_t in bpf_map_... to u64. Maybe
> > that's better, since arraymap and devmap uses u64 for cost/size.


* Re: [PATCH bpf-next v2 1/2] xsk: store struct xdp_sock as a flexible array member of the XSKMAP
  2019-10-29  6:20         ` Björn Töpel
@ 2019-10-29 13:44           ` Alexei Starovoitov
  0 siblings, 0 replies; 8+ messages in thread
From: Alexei Starovoitov @ 2019-10-29 13:44 UTC (permalink / raw)
  To: Björn Töpel
  Cc: Jakub Kicinski, Netdev, Alexei Starovoitov, Daniel Borkmann,
	Björn Töpel, bpf, Magnus Karlsson, Karlsson, Magnus,
	Fijalkowski, Maciej, Toke Høiland-Jørgensen

On Mon, Oct 28, 2019 at 11:21 PM Björn Töpel <bjorn.topel@gmail.com> wrote:
> Changing bpf_map_charge_init()/bpf_map_area_alloc() size to u64 would
> be the smallest change, together with a 64-to-32 overflow check in
> those functions.

+1. bpf_map_charge_init() already has such check.
Changing them to u64 is probably good idea.

