All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH bpf] bpf: zero-fill re-used per-cpu map element
@ 2020-10-23 12:37 David Verbeiren
  2020-10-26 22:47 ` Andrii Nakryiko
  2020-10-27 22:13 ` [PATCH bpf v2] " David Verbeiren
  0 siblings, 2 replies; 12+ messages in thread
From: David Verbeiren @ 2020-10-23 12:37 UTC (permalink / raw)
  To: bpf; +Cc: netdev, David Verbeiren, Matthieu Baerts

Zero-fill element values for all cpus, just as when not using
prealloc. This is the only way the bpf program can ensure known
initial values for cpus other than the current one ('onallcpus'
cannot be set when coming from the bpf program).

The scenario is: bpf program inserts some elements in a per-cpu
map, then deletes some (or userspace does). When later adding
new elements using bpf_map_update_elem(), the bpf program can
only set the value of the new elements for the current cpu.
When prealloc is enabled, previously deleted elements are re-used.
Without the fix, values for other cpus remain whatever they were
when the re-used entry was previously freed.

Fixes: 6c9059817432 ("bpf: pre-allocate hash map elements")
Acked-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: David Verbeiren <david.verbeiren@tessares.net>
---
 kernel/bpf/hashtab.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
index 1815e97d4c9c..667553cce65a 100644
--- a/kernel/bpf/hashtab.c
+++ b/kernel/bpf/hashtab.c
@@ -836,6 +836,7 @@ static struct htab_elem *alloc_htab_elem(struct bpf_htab *htab, void *key,
 	bool prealloc = htab_is_prealloc(htab);
 	struct htab_elem *l_new, **pl_new;
 	void __percpu *pptr;
+	int cpu;
 
 	if (prealloc) {
 		if (old_elem) {
@@ -880,6 +881,17 @@ static struct htab_elem *alloc_htab_elem(struct bpf_htab *htab, void *key,
 		size = round_up(size, 8);
 		if (prealloc) {
 			pptr = htab_elem_get_ptr(l_new, key_size);
+
+			/* zero-fill element values for all cpus, just as when
+			 * not using prealloc. Only way for bpf program to
+			 * ensure known initial values for cpus other than
+			 * current one (onallcpus=false when coming from bpf
+			 * prog).
+			 */
+			if (!onallcpus)
+				for_each_possible_cpu(cpu)
+					memset((void *)per_cpu_ptr(pptr, cpu),
+					       0, size);
 		} else {
 			/* alloc_percpu zero-fills */
 			pptr = __alloc_percpu_gfp(size, 8,
-- 
2.29.0


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH bpf] bpf: zero-fill re-used per-cpu map element
  2020-10-23 12:37 [PATCH bpf] bpf: zero-fill re-used per-cpu map element David Verbeiren
@ 2020-10-26 22:47 ` Andrii Nakryiko
  2020-10-27 14:48   ` David Verbeiren
  2020-10-27 22:13 ` [PATCH bpf v2] " David Verbeiren
  1 sibling, 1 reply; 12+ messages in thread
From: Andrii Nakryiko @ 2020-10-26 22:47 UTC (permalink / raw)
  To: David Verbeiren; +Cc: bpf, Networking, Matthieu Baerts

On Fri, Oct 23, 2020 at 8:48 AM David Verbeiren
<david.verbeiren@tessares.net> wrote:
>
> Zero-fill element values for all cpus, just as when not using
> prealloc. This is the only way the bpf program can ensure known
> initial values for cpus other than the current one ('onallcpus'
> cannot be set when coming from the bpf program).
>
> The scenario is: bpf program inserts some elements in a per-cpu
> map, then deletes some (or userspace does). When later adding
> new elements using bpf_map_update_elem(), the bpf program can
> only set the value of the new elements for the current cpu.
> When prealloc is enabled, previously deleted elements are re-used.
> Without the fix, values for other cpus remain whatever they were
> when the re-used entry was previously freed.
>
> Fixes: 6c9059817432 ("bpf: pre-allocate hash map elements")
> Acked-by: Matthieu Baerts <matthieu.baerts@tessares.net>
> Signed-off-by: David Verbeiren <david.verbeiren@tessares.net>
> ---
>  kernel/bpf/hashtab.c | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
>
> diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
> index 1815e97d4c9c..667553cce65a 100644
> --- a/kernel/bpf/hashtab.c
> +++ b/kernel/bpf/hashtab.c
> @@ -836,6 +836,7 @@ static struct htab_elem *alloc_htab_elem(struct bpf_htab *htab, void *key,
>         bool prealloc = htab_is_prealloc(htab);
>         struct htab_elem *l_new, **pl_new;
>         void __percpu *pptr;
> +       int cpu;
>
>         if (prealloc) {
>                 if (old_elem) {
> @@ -880,6 +881,17 @@ static struct htab_elem *alloc_htab_elem(struct bpf_htab *htab, void *key,
>                 size = round_up(size, 8);
>                 if (prealloc) {
>                         pptr = htab_elem_get_ptr(l_new, key_size);
> +
> +                       /* zero-fill element values for all cpus, just as when
> +                        * not using prealloc. Only way for bpf program to
> +                        * ensure known initial values for cpus other than
> +                        * current one (onallcpus=false when coming from bpf
> +                        * prog).
> +                        */
> +                       if (!onallcpus)
> +                               for_each_possible_cpu(cpu)
> +                                       memset((void *)per_cpu_ptr(pptr, cpu),
> +                                              0, size);

Technically, you don't have to memset() for the current CPU, right?
Don't know if extra check is cheaper than avoiding one memset() call,
though.

But regardless, this 6 level nesting looks pretty bad, maybe move the
for_each_possible_cpu() loop into a helper function?

Also, does the per-CPU LRU hashmap need the same treatment?

>                 } else {
>                         /* alloc_percpu zero-fills */
>                         pptr = __alloc_percpu_gfp(size, 8,
> --
> 2.29.0
>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH bpf] bpf: zero-fill re-used per-cpu map element
  2020-10-26 22:47 ` Andrii Nakryiko
@ 2020-10-27 14:48   ` David Verbeiren
  0 siblings, 0 replies; 12+ messages in thread
From: David Verbeiren @ 2020-10-27 14:48 UTC (permalink / raw)
  To: Andrii Nakryiko; +Cc: bpf, Networking, Matthieu Baerts

On Mon, Oct 26, 2020 at 11:48 PM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Fri, Oct 23, 2020 at 8:48 AM David Verbeiren
> <david.verbeiren@tessares.net> wrote:
> > [...]
> > +                       if (!onallcpus)
> > +                               for_each_possible_cpu(cpu)
> > +                                       memset((void *)per_cpu_ptr(pptr, cpu),
> > +                                              0, size);
>
> Technically, you don't have to memset() for the current CPU, right?
> Don't know if extra check is cheaper than avoiding one memset() call,
> though.

I thought about that as well but, because it depends on the 'size',
I decided to keep it simple. However, taking into account your other
comments, I think there is a possibility to combine it all nicely in a
separate function.

> But regardless, this 6 level nesting looks pretty bad, maybe move the
> for_each_possible_cpu() loop into a helper function?
>
> Also, does the per-CPU LRU hashmap need the same treatment?
I think it does. Good catch!

Thanks for your feedback. v2 is coming.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH bpf v2] bpf: zero-fill re-used per-cpu map element
  2020-10-23 12:37 [PATCH bpf] bpf: zero-fill re-used per-cpu map element David Verbeiren
  2020-10-26 22:47 ` Andrii Nakryiko
@ 2020-10-27 22:13 ` David Verbeiren
  2020-10-27 22:55   ` Andrii Nakryiko
  2020-11-03 15:47   ` [PATCH bpf v3] " David Verbeiren
  1 sibling, 2 replies; 12+ messages in thread
From: David Verbeiren @ 2020-10-27 22:13 UTC (permalink / raw)
  To: bpf; +Cc: netdev, Andrii Nakryiko, David Verbeiren, Matthieu Baerts

Zero-fill element values for all other cpus than current, just as
when not using prealloc. This is the only way the bpf program can
ensure known initial values for all cpus ('onallcpus' cannot be
set when coming from the bpf program).

The scenario is: bpf program inserts some elements in a per-cpu
map, then deletes some (or userspace does). When later adding
new elements using bpf_map_update_elem(), the bpf program can
only set the value of the new elements for the current cpu.
When prealloc is enabled, previously deleted elements are re-used.
Without the fix, values for other cpus remain whatever they were
when the re-used entry was previously freed.

Fixes: 6c9059817432 ("bpf: pre-allocate hash map elements")
Acked-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: David Verbeiren <david.verbeiren@tessares.net>
---

Notes:
    v2:
      - Moved memset() to separate pcpu_init_value() function,
        which replaces pcpu_copy_value() but delegates to it
        for the cases where no memset() is needed (Andrii).
      - This function now also avoids doing the memset() for
        the current cpu for which the value must be set
        anyhow (Andrii).
      - Same pcpu_init_value() used for per-cpu LRU map
        (Andrii).
    
      Note that I could not test the per-cpu LRU other than
      by running the bpf selftests. lru_map and maps tests
      passed but for the rest of the test suite, I don't
      think I know how to spot problems...
    
      Question: Is it ok to use raw_smp_processor_id() in
      these contexts? bpf prog context should be fine, I think.
      Is it also ok in the syscall context?

 kernel/bpf/hashtab.c | 30 ++++++++++++++++++++++++++++--
 1 file changed, 28 insertions(+), 2 deletions(-)

diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
index 1815e97d4c9c..1fccba6e88c4 100644
--- a/kernel/bpf/hashtab.c
+++ b/kernel/bpf/hashtab.c
@@ -821,6 +821,32 @@ static void pcpu_copy_value(struct bpf_htab *htab, void __percpu *pptr,
 	}
 }
 
+static void pcpu_init_value(struct bpf_htab *htab, void __percpu *pptr,
+			    void *value, bool onallcpus)
+{
+	/* When using prealloc and not setting the initial value on all cpus,
+	 * zero-fill element values for other cpus (just as what happens when
+	 * not using prealloc). Otherwise, bpf program has no way to ensure
+	 * known initial values for cpus other than current one
+	 * (onallcpus=false always when coming from bpf prog).
+	 */
+	if (htab_is_prealloc(htab) && !onallcpus) {
+		u32 size = round_up(htab->map.value_size, 8);
+		int current_cpu = raw_smp_processor_id();
+		int cpu;
+
+		for_each_possible_cpu(cpu) {
+			if (cpu == current_cpu)
+				bpf_long_memcpy(per_cpu_ptr(pptr, cpu), value,
+						size);
+			else
+				memset(per_cpu_ptr(pptr, cpu), 0, size);
+		}
+	} else {
+		pcpu_copy_value(htab, pptr, value, onallcpus);
+	}
+}
+
 static bool fd_htab_map_needs_adjust(const struct bpf_htab *htab)
 {
 	return htab->map.map_type == BPF_MAP_TYPE_HASH_OF_MAPS &&
@@ -891,7 +917,7 @@ static struct htab_elem *alloc_htab_elem(struct bpf_htab *htab, void *key,
 			}
 		}
 
-		pcpu_copy_value(htab, pptr, value, onallcpus);
+		pcpu_init_value(htab, pptr, value, onallcpus);
 
 		if (!prealloc)
 			htab_elem_set_ptr(l_new, key_size, pptr);
@@ -1183,7 +1209,7 @@ static int __htab_lru_percpu_map_update_elem(struct bpf_map *map, void *key,
 		pcpu_copy_value(htab, htab_elem_get_ptr(l_old, key_size),
 				value, onallcpus);
 	} else {
-		pcpu_copy_value(htab, htab_elem_get_ptr(l_new, key_size),
+		pcpu_init_value(htab, htab_elem_get_ptr(l_new, key_size),
 				value, onallcpus);
 		hlist_nulls_add_head_rcu(&l_new->hash_node, head);
 		l_new = NULL;
-- 
2.29.0


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH bpf v2] bpf: zero-fill re-used per-cpu map element
  2020-10-27 22:13 ` [PATCH bpf v2] " David Verbeiren
@ 2020-10-27 22:55   ` Andrii Nakryiko
  2020-10-29 14:44     ` David Verbeiren
  2020-11-03 15:47   ` [PATCH bpf v3] " David Verbeiren
  1 sibling, 1 reply; 12+ messages in thread
From: Andrii Nakryiko @ 2020-10-27 22:55 UTC (permalink / raw)
  To: David Verbeiren; +Cc: bpf, Networking, Matthieu Baerts

On Tue, Oct 27, 2020 at 3:15 PM David Verbeiren
<david.verbeiren@tessares.net> wrote:
>
> Zero-fill element values for all other cpus than current, just as
> when not using prealloc. This is the only way the bpf program can
> ensure known initial values for all cpus ('onallcpus' cannot be
> set when coming from the bpf program).
>
> The scenario is: bpf program inserts some elements in a per-cpu
> map, then deletes some (or userspace does). When later adding
> new elements using bpf_map_update_elem(), the bpf program can
> only set the value of the new elements for the current cpu.
> When prealloc is enabled, previously deleted elements are re-used.
> Without the fix, values for other cpus remain whatever they were
> when the re-used entry was previously freed.
>
> Fixes: 6c9059817432 ("bpf: pre-allocate hash map elements")
> Acked-by: Matthieu Baerts <matthieu.baerts@tessares.net>
> Signed-off-by: David Verbeiren <david.verbeiren@tessares.net>
> ---
>

Looks good, but would be good to have a unit test (see below). Maybe
in a follow up.

Acked-by: Andrii Nakryiko <andrii@kernel.org>

> Notes:
>     v2:
>       - Moved memset() to separate pcpu_init_value() function,
>         which replaces pcpu_copy_value() but delegates to it
>         for the cases where no memset() is needed (Andrii).
>       - This function now also avoids doing the memset() for
>         the current cpu for which the value must be set
>         anyhow (Andrii).
>       - Same pcpu_init_value() used for per-cpu LRU map
>         (Andrii).
>
>       Note that I could not test the per-cpu LRU other than
>       by running the bpf selftests. lru_map and maps tests
>       passed but for the rest of the test suite, I don't
>       think I know how to spot problems...

It would be good to write a new selftest specifically for this. You
can create a single-element pre-allocated per-CPU hashmap. From
user-space, initialize it to non-zeros on all CPUs. Then delete that
key (it will get put on the free list). Then trigger BPF program, do
an update (that should take an element from the freelist), doesn't
matter which value you set (could be zero). Then from user-space get
all per-CPU values for than new key. It should all be zeroes with your
fix and non-zero without it.

It sounds more complicated than it would look like in practice :)

>
>       Question: Is it ok to use raw_smp_processor_id() in
>       these contexts? bpf prog context should be fine, I think.
>       Is it also ok in the syscall context?

From the BPF program side it's definitely ok, because we disable CPU
migration even for sleepable programs. For syscall context, it always
uses onallcpus=true, so we'll never run this logic from syscall
context. So I think it's fine.

>
>  kernel/bpf/hashtab.c | 30 ++++++++++++++++++++++++++++--
>  1 file changed, 28 insertions(+), 2 deletions(-)

[...]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH bpf v2] bpf: zero-fill re-used per-cpu map element
  2020-10-27 22:55   ` Andrii Nakryiko
@ 2020-10-29 14:44     ` David Verbeiren
  2020-10-29 22:40       ` Andrii Nakryiko
  0 siblings, 1 reply; 12+ messages in thread
From: David Verbeiren @ 2020-10-29 14:44 UTC (permalink / raw)
  To: Andrii Nakryiko; +Cc: bpf, Networking, Matthieu Baerts

On Tue, Oct 27, 2020 at 11:55 PM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
> Looks good, but would be good to have a unit test (see below). Maybe
> in a follow up.

Here is the associated new selftest, implementing the sequence you
proposed (thanks for that!), and also one for LRU case:
https://lore.kernel.org/bpf/20201029111730.6881-1-david.verbeiren@tessares.net/

I hope I did it in the "right" framework of the day.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH bpf v2] bpf: zero-fill re-used per-cpu map element
  2020-10-29 14:44     ` David Verbeiren
@ 2020-10-29 22:40       ` Andrii Nakryiko
  0 siblings, 0 replies; 12+ messages in thread
From: Andrii Nakryiko @ 2020-10-29 22:40 UTC (permalink / raw)
  To: David Verbeiren; +Cc: bpf, Networking, Matthieu Baerts

On Thu, Oct 29, 2020 at 7:44 AM David Verbeiren
<david.verbeiren@tessares.net> wrote:
>
> On Tue, Oct 27, 2020 at 11:55 PM Andrii Nakryiko
> <andrii.nakryiko@gmail.com> wrote:
> > Looks good, but would be good to have a unit test (see below). Maybe
> > in a follow up.
>
> Here is the associated new selftest, implementing the sequence you
> proposed (thanks for that!), and also one for LRU case:
> https://lore.kernel.org/bpf/20201029111730.6881-1-david.verbeiren@tessares.net/
>
> I hope I did it in the "right" framework of the day.

You did it in a very laborious and hard way, which will be harder to
maintain, unfortunately. But it should be trivial to simplify it with
skeleton. It's probably better to combine the selftest patch with the
kernel fix in this patch and send it as v2?

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH bpf v3] bpf: zero-fill re-used per-cpu map element
  2020-10-27 22:13 ` [PATCH bpf v2] " David Verbeiren
  2020-10-27 22:55   ` Andrii Nakryiko
@ 2020-11-03 15:47   ` David Verbeiren
  2020-11-03 18:19     ` Andrii Nakryiko
  2020-11-04 11:23     ` [PATCH bpf v4] " David Verbeiren
  1 sibling, 2 replies; 12+ messages in thread
From: David Verbeiren @ 2020-11-03 15:47 UTC (permalink / raw)
  To: bpf, Andrii Nakryiko; +Cc: netdev, Song Liu, David Verbeiren, Matthieu Baerts

Zero-fill element values for all other cpus than current, just as
when not using prealloc. This is the only way the bpf program can
ensure known initial values for all cpus ('onallcpus' cannot be
set when coming from the bpf program).

The scenario is: bpf program inserts some elements in a per-cpu
map, then deletes some (or userspace does). When later adding
new elements using bpf_map_update_elem(), the bpf program can
only set the value of the new elements for the current cpu.
When prealloc is enabled, previously deleted elements are re-used.
Without the fix, values for other cpus remain whatever they were
when the re-used entry was previously freed.

A selftest is added to validate correct operation in above
scenario as well as in case of LRU per-cpu map element re-use.

Fixes: 6c9059817432 ("bpf: pre-allocate hash map elements")
Acked-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: David Verbeiren <david.verbeiren@tessares.net>
---

Notes:
    v3:
      - Added selftest that was initially provided as separate
        patch, and reworked to
        * use skeleton (Andrii, Song Liu)
        * skip test if <=1 CPU (Song Liu)
    
    v2:
      - Moved memset() to separate pcpu_init_value() function,
        which replaces pcpu_copy_value() but delegates to it
        for the cases where no memset() is needed (Andrii).
      - This function now also avoids doing the memset() for
        the current cpu for which the value must be set
        anyhow (Andrii).
      - Same pcpu_init_value() used for per-cpu LRU map
        (Andrii).

 kernel/bpf/hashtab.c                          |  30 ++-
 .../selftests/bpf/prog_tests/map_init.c       | 213 ++++++++++++++++++
 .../selftests/bpf/progs/test_map_init.c       |  34 +++
 3 files changed, 275 insertions(+), 2 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/map_init.c
 create mode 100644 tools/testing/selftests/bpf/progs/test_map_init.c

diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
index 1815e97d4c9c..1fccba6e88c4 100644
--- a/kernel/bpf/hashtab.c
+++ b/kernel/bpf/hashtab.c
@@ -821,6 +821,32 @@ static void pcpu_copy_value(struct bpf_htab *htab, void __percpu *pptr,
 	}
 }
 
+static void pcpu_init_value(struct bpf_htab *htab, void __percpu *pptr,
+			    void *value, bool onallcpus)
+{
+	/* When using prealloc and not setting the initial value on all cpus,
+	 * zero-fill element values for other cpus (just as what happens when
+	 * not using prealloc). Otherwise, bpf program has no way to ensure
+	 * known initial values for cpus other than current one
+	 * (onallcpus=false always when coming from bpf prog).
+	 */
+	if (htab_is_prealloc(htab) && !onallcpus) {
+		u32 size = round_up(htab->map.value_size, 8);
+		int current_cpu = raw_smp_processor_id();
+		int cpu;
+
+		for_each_possible_cpu(cpu) {
+			if (cpu == current_cpu)
+				bpf_long_memcpy(per_cpu_ptr(pptr, cpu), value,
+						size);
+			else
+				memset(per_cpu_ptr(pptr, cpu), 0, size);
+		}
+	} else {
+		pcpu_copy_value(htab, pptr, value, onallcpus);
+	}
+}
+
 static bool fd_htab_map_needs_adjust(const struct bpf_htab *htab)
 {
 	return htab->map.map_type == BPF_MAP_TYPE_HASH_OF_MAPS &&
@@ -891,7 +917,7 @@ static struct htab_elem *alloc_htab_elem(struct bpf_htab *htab, void *key,
 			}
 		}
 
-		pcpu_copy_value(htab, pptr, value, onallcpus);
+		pcpu_init_value(htab, pptr, value, onallcpus);
 
 		if (!prealloc)
 			htab_elem_set_ptr(l_new, key_size, pptr);
@@ -1183,7 +1209,7 @@ static int __htab_lru_percpu_map_update_elem(struct bpf_map *map, void *key,
 		pcpu_copy_value(htab, htab_elem_get_ptr(l_old, key_size),
 				value, onallcpus);
 	} else {
-		pcpu_copy_value(htab, htab_elem_get_ptr(l_new, key_size),
+		pcpu_init_value(htab, htab_elem_get_ptr(l_new, key_size),
 				value, onallcpus);
 		hlist_nulls_add_head_rcu(&l_new->hash_node, head);
 		l_new = NULL;
diff --git a/tools/testing/selftests/bpf/prog_tests/map_init.c b/tools/testing/selftests/bpf/prog_tests/map_init.c
new file mode 100644
index 000000000000..386d9439bad9
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/map_init.c
@@ -0,0 +1,213 @@
+// SPDX-License-Identifier: GPL-2.0-only
+// Copyright (c) 2020 Tessares SA <http://www.tessares.net>
+
+#include <test_progs.h>
+#include "test_map_init.skel.h"
+
+#define TEST_VALUE 0x1234
+#define FILL_VALUE 0xdeadbeef
+
+static int nr_cpus;
+static int duration;
+
+typedef unsigned long long map_key_t;
+typedef unsigned long long map_value_t;
+typedef struct {
+	map_value_t v; /* padding */
+} __bpf_percpu_val_align pcpu_map_value_t;
+
+
+static int map_populate(int map_fd, int num)
+{
+	pcpu_map_value_t value[nr_cpus];
+	int i, err;
+	map_key_t key;
+
+	for (i = 0; i < nr_cpus; i++)
+		bpf_percpu(value, i) = FILL_VALUE;
+
+	for (key = 1; key <= num; key++) {
+		err = bpf_map_update_elem(map_fd, &key, value, BPF_NOEXIST);
+		if (!ASSERT_OK(err, "bpf_map_update_elem"))
+			return -1;
+	}
+
+	return 0;
+}
+
+static struct test_map_init *setup(enum bpf_map_type map_type, int map_sz,
+			    int *map_fd, int populate)
+{
+	struct test_map_init *skel;
+	int err;
+
+	skel = test_map_init__open();
+	if (!ASSERT_OK_PTR(skel, "skel_open"))
+		return NULL;
+
+	err = bpf_map__set_type(skel->maps.hashmap1, map_type);
+	if (!ASSERT_OK(err, "bpf_map__set_type"))
+		goto error;
+
+	err = bpf_map__set_max_entries(skel->maps.hashmap1, map_sz);
+	if (!ASSERT_OK(err, "bpf_map__set_max_entries"))
+		goto error;
+
+	err = test_map_init__load(skel);
+	if (!ASSERT_OK(err, "skel_load"))
+		goto error;
+
+	*map_fd = bpf_map__fd(skel->maps.hashmap1);
+	if (CHECK(*map_fd < 0, "bpf_map__fd", "failed\n"))
+		goto error;
+
+	err = map_populate(*map_fd, populate);
+	if (!ASSERT_OK(err, "map_populate"))
+		goto error_map;
+
+	return skel;
+
+error_map:
+	close(*map_fd);
+error:
+	test_map_init__destroy(skel);
+	return NULL;
+}
+
+/* executes bpf program that updates map with key, value */
+static int prog_run_insert_elem(struct test_map_init *skel, map_key_t key,
+				map_value_t value)
+{
+	struct test_map_init__bss *bss;
+
+	bss = skel->bss;
+
+	bss->inKey = key;
+	bss->inValue = value;
+
+	if (!ASSERT_OK(test_map_init__attach(skel), "skel_attach"))
+		return -1;
+
+	/* Let tracepoint trigger */
+	usleep(1);
+
+	test_map_init__detach(skel);
+
+	return 0;
+}
+
+static int check_values_one_cpu(pcpu_map_value_t *value, map_value_t expected)
+{
+	int i, nzCnt = 0;
+	map_value_t val;
+
+	for (i = 0; i < nr_cpus; i++) {
+		val = bpf_percpu(value, i);
+		if (val) {
+			if (CHECK(val != expected, "map value",
+				  "unexpected for cpu %d: 0x%llx\n", i, val))
+				return -1;
+			nzCnt++;
+		}
+	}
+
+	if (CHECK(nzCnt != 1, "map value", "set for %d CPUs instead of 1!\n",
+		  nzCnt))
+		return -1;
+
+	return 0;
+}
+
+/* Add key=1 elem with values set for all CPUs
+ * Delete elem key=1
+ * Run bpf prog that inserts new key=1 elem with value=0x1234
+ *   (bpf prog can only set value for current CPU)
+ * Lookup Key=1 and check value is as expected for all CPUs:
+ *   value set by bpf prog for one CPU, 0 for all others
+ */
+static void test_pcpu_map_init(void)
+{
+	pcpu_map_value_t value[nr_cpus];
+	struct test_map_init *skel;
+	int map_fd, err;
+	map_key_t key;
+
+	/* max 1 elem in map so insertion is forced to reuse freed entry */
+	skel = setup(BPF_MAP_TYPE_PERCPU_HASH, 1, &map_fd, 1);
+	if (!ASSERT_OK_PTR(skel, "prog_setup"))
+		return;
+
+	/* delete element so the entry can be re-used*/
+	key = 1;
+	err = bpf_map_delete_elem(map_fd, &key);
+	if (!ASSERT_OK(err, "bpf_map_delete_elem"))
+		goto cleanup;
+
+	/* run bpf prog that inserts new elem, re-using the slot just freed */
+	err = prog_run_insert_elem(skel, key, TEST_VALUE);
+	if (!ASSERT_OK(err, "prog_run_insert_elem"))
+		goto cleanup;
+
+	/* check that key=1 was re-created by bpf prog */
+	err = bpf_map_lookup_elem(map_fd, &key, value);
+	if (!ASSERT_OK(err, "bpf_map_lookup_elem"))
+		goto cleanup;
+
+	/* and has expected values */
+	check_values_one_cpu(value, TEST_VALUE);
+
+cleanup:
+	test_map_init__destroy(skel);
+}
+
+/* Add key=1 and key=2 elems with values set for all CPUs
+ * Run bpf prog that inserts new key=3 elem
+ *   (only for current cpu; other cpus should have initial value = 0)
+ * Lookup Key=1 and check value is as expected for all CPUs
+ */
+static void test_pcpu_lru_map_init(void)
+{
+	pcpu_map_value_t value[nr_cpus];
+	struct test_map_init *skel;
+	int map_fd, err;
+	map_key_t key;
+
+	/* Set up LRU map with 2 elements, values filled for all CPUs.
+	 * With these 2 elements, the LRU map is full
+	 */
+	skel = setup(BPF_MAP_TYPE_LRU_PERCPU_HASH, 2, &map_fd, 2);
+	if (!ASSERT_OK_PTR(skel, "prog_setup"))
+		return;
+
+	/* run bpf prog that inserts new key=3 element, re-using LRU slot */
+	key = 3;
+	err = prog_run_insert_elem(skel, key, TEST_VALUE);
+	if (!ASSERT_OK(err, "prog_run_insert_elem"))
+		goto cleanup;
+
+	/* check that key=3 replaced one of earlier elements */
+	err = bpf_map_lookup_elem(map_fd, &key, value);
+	if (!ASSERT_OK(err, "bpf_map_lookup_elem"))
+		goto cleanup;
+
+	/* and has expected values */
+	check_values_one_cpu(value, TEST_VALUE);
+
+cleanup:
+	test_map_init__destroy(skel);
+}
+
+void test_map_init(void)
+{
+	nr_cpus = bpf_num_possible_cpus();
+	if (nr_cpus <= 1) {
+		printf("%s:SKIP: >1 cpu needed for this test\n", __func__);
+		test__skip();
+		return;
+	}
+
+	if (test__start_subtest("pcpu_map_init"))
+		test_pcpu_map_init();
+	if (test__start_subtest("pcpu_lru_map_init"))
+		test_pcpu_lru_map_init();
+}
diff --git a/tools/testing/selftests/bpf/progs/test_map_init.c b/tools/testing/selftests/bpf/progs/test_map_init.c
new file mode 100644
index 000000000000..280a45e366d6
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/test_map_init.c
@@ -0,0 +1,34 @@
+// SPDX-License-Identifier: GPL-2.0
+// Copyright (c) 2020 Tessares SA <http://www.tessares.net>
+
+#include "vmlinux.h"
+#include <bpf/bpf_helpers.h>
+
+__u64 inKey = 0;
+__u64 inValue = 0;
+__u32 once = 0;
+
+struct {
+	__uint(type, BPF_MAP_TYPE_PERCPU_HASH);
+	__uint(max_entries, 2);
+	__type(key, __u64);
+	__type(value, __u64);
+} hashmap1 SEC(".maps");
+
+
+SEC("raw_tp/sys_enter")
+int sys_enter(const void *ctx)
+{
+	/* Just do it once so the value is only updated for a single CPU.
+	 * Indeed, this tracepoint will quickly be hit from different CPUs.
+	 */
+	if (!once) {
+		__sync_fetch_and_add(&once, 1);
+
+		bpf_map_update_elem(&hashmap1, &inKey, &inValue, BPF_NOEXIST);
+	}
+
+	return 0;
+}
+
+char _license[] SEC("license") = "GPL";
-- 
2.29.0


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH bpf v3] bpf: zero-fill re-used per-cpu map element
  2020-11-03 15:47   ` [PATCH bpf v3] " David Verbeiren
@ 2020-11-03 18:19     ` Andrii Nakryiko
  2020-11-04 11:23     ` [PATCH bpf v4] " David Verbeiren
  1 sibling, 0 replies; 12+ messages in thread
From: Andrii Nakryiko @ 2020-11-03 18:19 UTC (permalink / raw)
  To: David Verbeiren
  Cc: bpf, Andrii Nakryiko, Networking, Song Liu, Matthieu Baerts

On Tue, Nov 3, 2020 at 7:49 AM David Verbeiren
<david.verbeiren@tessares.net> wrote:
>
> Zero-fill element values for all other cpus than current, just as
> when not using prealloc. This is the only way the bpf program can
> ensure known initial values for all cpus ('onallcpus' cannot be
> set when coming from the bpf program).
>
> The scenario is: bpf program inserts some elements in a per-cpu
> map, then deletes some (or userspace does). When later adding
> new elements using bpf_map_update_elem(), the bpf program can
> only set the value of the new elements for the current cpu.
> When prealloc is enabled, previously deleted elements are re-used.
> Without the fix, values for other cpus remain whatever they were
> when the re-used entry was previously freed.
>
> A selftest is added to validate correct operation in above
> scenario as well as in case of LRU per-cpu map element re-use.
>
> Fixes: 6c9059817432 ("bpf: pre-allocate hash map elements")
> Acked-by: Matthieu Baerts <matthieu.baerts@tessares.net>
> Signed-off-by: David Verbeiren <david.verbeiren@tessares.net>
> ---
>

Tests look really nice, thanks! I'm worried about still racy once
check, see suggestions below. Otherwise looks great!

> Notes:
>     v3:
>       - Added selftest that was initially provided as separate
>         patch, and reworked to
>         * use skeleton (Andrii, Song Liu)
>         * skip test if <=1 CPU (Song Liu)
>
>     v2:
>       - Moved memset() to separate pcpu_init_value() function,
>         which replaces pcpu_copy_value() but delegates to it
>         for the cases where no memset() is needed (Andrii).
>       - This function now also avoids doing the memset() for
>         the current cpu for which the value must be set
>         anyhow (Andrii).
>       - Same pcpu_init_value() used for per-cpu LRU map
>         (Andrii).
>
>  kernel/bpf/hashtab.c                          |  30 ++-
>  .../selftests/bpf/prog_tests/map_init.c       | 213 ++++++++++++++++++
>  .../selftests/bpf/progs/test_map_init.c       |  34 +++
>  3 files changed, 275 insertions(+), 2 deletions(-)
>  create mode 100644 tools/testing/selftests/bpf/prog_tests/map_init.c
>  create mode 100644 tools/testing/selftests/bpf/progs/test_map_init.c
>

[...]

> diff --git a/tools/testing/selftests/bpf/prog_tests/map_init.c b/tools/testing/selftests/bpf/prog_tests/map_init.c
> new file mode 100644
> index 000000000000..386d9439bad9
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/prog_tests/map_init.c
> @@ -0,0 +1,213 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +// Copyright (c) 2020 Tessares SA <http://www.tessares.net>
> +

nit: see below, /* */

> +#include <test_progs.h>
> +#include "test_map_init.skel.h"
> +

[...]

> diff --git a/tools/testing/selftests/bpf/progs/test_map_init.c b/tools/testing/selftests/bpf/progs/test_map_init.c
> new file mode 100644
> index 000000000000..280a45e366d6
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/progs/test_map_init.c
> @@ -0,0 +1,34 @@
> +// SPDX-License-Identifier: GPL-2.0
> +// Copyright (c) 2020 Tessares SA <http://www.tessares.net>

nit: I think copyright line has to be in /* */ comment block

> +
> +#include "vmlinux.h"
> +#include <bpf/bpf_helpers.h>
> +
> +__u64 inKey = 0;
> +__u64 inValue = 0;
> +__u32 once = 0;
> +
> +struct {
> +       __uint(type, BPF_MAP_TYPE_PERCPU_HASH);
> +       __uint(max_entries, 2);
> +       __type(key, __u64);
> +       __type(value, __u64);
> +} hashmap1 SEC(".maps");
> +
> +
> +SEC("raw_tp/sys_enter")
> +int sys_enter(const void *ctx)
> +{
> +       /* Just do it once so the value is only updated for a single CPU.
> +        * Indeed, this tracepoint will quickly be hit from different CPUs.
> +        */
> +       if (!once) {
> +               __sync_fetch_and_add(&once, 1);

This is quite racy, actually, especially for the generic sys_enter
tracepoint. The way I did this before (see progs/trigger_bench.c) was
through doing a "tp/syscalls/sys_enter_getpgid" tracepoint program and
checking for thread id. Or you can use bpf_test_run, probably, with a
different type of BPF program. I just find bpf_test_run() too
inconvenient with all the extra setup, so I usually stick to
tracepoints.

> +
> +               bpf_map_update_elem(&hashmap1, &inKey, &inValue, BPF_NOEXIST);
> +       }
> +
> +       return 0;
> +}
> +
> +char _license[] SEC("license") = "GPL";
> --
> 2.29.0
>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH bpf v4] bpf: zero-fill re-used per-cpu map element
  2020-11-03 15:47   ` [PATCH bpf v3] " David Verbeiren
  2020-11-03 18:19     ` Andrii Nakryiko
@ 2020-11-04 11:23     ` David Verbeiren
  2020-11-04 20:45       ` Andrii Nakryiko
  2020-11-06  4:10       ` patchwork-bot+netdevbpf
  1 sibling, 2 replies; 12+ messages in thread
From: David Verbeiren @ 2020-11-04 11:23 UTC (permalink / raw)
  To: bpf, Andrii Nakryiko; +Cc: netdev, David Verbeiren, Matthieu Baerts

Zero-fill element values for all other cpus than current, just as
when not using prealloc. This is the only way the bpf program can
ensure known initial values for all cpus ('onallcpus' cannot be
set when coming from the bpf program).

The scenario is: bpf program inserts some elements in a per-cpu
map, then deletes some (or userspace does). When later adding
new elements using bpf_map_update_elem(), the bpf program can
only set the value of the new elements for the current cpu.
When prealloc is enabled, previously deleted elements are re-used.
Without the fix, values for other cpus remain whatever they were
when the re-used entry was previously freed.

A selftest is added to validate correct operation in above
scenario as well as in case of LRU per-cpu map element re-use.

Fixes: 6c9059817432 ("bpf: pre-allocate hash map elements")
Acked-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: David Verbeiren <david.verbeiren@tessares.net>
---

Notes:
    v4:
      - Replaced racy "once" test by getpgid syscall with
        check on pid. (Andrii)
      - Copyright lines use /* */ (Andrii)
    
    v3:
      - Added selftest that was initially provided as separate
        patch, and reworked to
        * use skeleton (Andrii, Song Liu)
        * skip test if <=1 CPU (Song Liu)
    
    v2:
      - Moved memset() to separate pcpu_init_value() function,
        which replaces pcpu_copy_value() but delegates to it
        for the cases where no memset() is needed (Andrii).
      - This function now also avoids doing the memset() for
        the current cpu for which the value must be set
        anyhow (Andrii).
      - Same pcpu_init_value() used for per-cpu LRU map
        (Andrii).

 kernel/bpf/hashtab.c                          |  30 ++-
 .../selftests/bpf/prog_tests/map_init.c       | 214 ++++++++++++++++++
 .../selftests/bpf/progs/test_map_init.c       |  33 +++
 3 files changed, 275 insertions(+), 2 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/map_init.c
 create mode 100644 tools/testing/selftests/bpf/progs/test_map_init.c

diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
index 1815e97d4c9c..1fccba6e88c4 100644
--- a/kernel/bpf/hashtab.c
+++ b/kernel/bpf/hashtab.c
@@ -821,6 +821,32 @@ static void pcpu_copy_value(struct bpf_htab *htab, void __percpu *pptr,
 	}
 }
 
+static void pcpu_init_value(struct bpf_htab *htab, void __percpu *pptr,
+			    void *value, bool onallcpus)
+{
+	/* When using prealloc and not setting the initial value on all cpus,
+	 * zero-fill element values for other cpus (just as what happens when
+	 * not using prealloc). Otherwise, bpf program has no way to ensure
+	 * known initial values for cpus other than current one
+	 * (onallcpus=false always when coming from bpf prog).
+	 */
+	if (htab_is_prealloc(htab) && !onallcpus) {
+		u32 size = round_up(htab->map.value_size, 8);
+		int current_cpu = raw_smp_processor_id();
+		int cpu;
+
+		for_each_possible_cpu(cpu) {
+			if (cpu == current_cpu)
+				bpf_long_memcpy(per_cpu_ptr(pptr, cpu), value,
+						size);
+			else
+				memset(per_cpu_ptr(pptr, cpu), 0, size);
+		}
+	} else {
+		pcpu_copy_value(htab, pptr, value, onallcpus);
+	}
+}
+
 static bool fd_htab_map_needs_adjust(const struct bpf_htab *htab)
 {
 	return htab->map.map_type == BPF_MAP_TYPE_HASH_OF_MAPS &&
@@ -891,7 +917,7 @@ static struct htab_elem *alloc_htab_elem(struct bpf_htab *htab, void *key,
 			}
 		}
 
-		pcpu_copy_value(htab, pptr, value, onallcpus);
+		pcpu_init_value(htab, pptr, value, onallcpus);
 
 		if (!prealloc)
 			htab_elem_set_ptr(l_new, key_size, pptr);
@@ -1183,7 +1209,7 @@ static int __htab_lru_percpu_map_update_elem(struct bpf_map *map, void *key,
 		pcpu_copy_value(htab, htab_elem_get_ptr(l_old, key_size),
 				value, onallcpus);
 	} else {
-		pcpu_copy_value(htab, htab_elem_get_ptr(l_new, key_size),
+		pcpu_init_value(htab, htab_elem_get_ptr(l_new, key_size),
 				value, onallcpus);
 		hlist_nulls_add_head_rcu(&l_new->hash_node, head);
 		l_new = NULL;
diff --git a/tools/testing/selftests/bpf/prog_tests/map_init.c b/tools/testing/selftests/bpf/prog_tests/map_init.c
new file mode 100644
index 000000000000..14a31109dd0e
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/map_init.c
@@ -0,0 +1,214 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/* Copyright (c) 2020 Tessares SA <http://www.tessares.net> */
+
+#include <test_progs.h>
+#include "test_map_init.skel.h"
+
+#define TEST_VALUE 0x1234
+#define FILL_VALUE 0xdeadbeef
+
+static int nr_cpus;
+static int duration;
+
+typedef unsigned long long map_key_t;
+typedef unsigned long long map_value_t;
+typedef struct {
+	map_value_t v; /* padding */
+} __bpf_percpu_val_align pcpu_map_value_t;
+
+
+static int map_populate(int map_fd, int num)
+{
+	pcpu_map_value_t value[nr_cpus];
+	int i, err;
+	map_key_t key;
+
+	for (i = 0; i < nr_cpus; i++)
+		bpf_percpu(value, i) = FILL_VALUE;
+
+	for (key = 1; key <= num; key++) {
+		err = bpf_map_update_elem(map_fd, &key, value, BPF_NOEXIST);
+		if (!ASSERT_OK(err, "bpf_map_update_elem"))
+			return -1;
+	}
+
+	return 0;
+}
+
+static struct test_map_init *setup(enum bpf_map_type map_type, int map_sz,
+			    int *map_fd, int populate)
+{
+	struct test_map_init *skel;
+	int err;
+
+	skel = test_map_init__open();
+	if (!ASSERT_OK_PTR(skel, "skel_open"))
+		return NULL;
+
+	err = bpf_map__set_type(skel->maps.hashmap1, map_type);
+	if (!ASSERT_OK(err, "bpf_map__set_type"))
+		goto error;
+
+	err = bpf_map__set_max_entries(skel->maps.hashmap1, map_sz);
+	if (!ASSERT_OK(err, "bpf_map__set_max_entries"))
+		goto error;
+
+	err = test_map_init__load(skel);
+	if (!ASSERT_OK(err, "skel_load"))
+		goto error;
+
+	*map_fd = bpf_map__fd(skel->maps.hashmap1);
+	if (CHECK(*map_fd < 0, "bpf_map__fd", "failed\n"))
+		goto error;
+
+	err = map_populate(*map_fd, populate);
+	if (!ASSERT_OK(err, "map_populate"))
+		goto error_map;
+
+	return skel;
+
+error_map:
+	close(*map_fd);
+error:
+	test_map_init__destroy(skel);
+	return NULL;
+}
+
+/* executes bpf program that updates map with key, value */
+static int prog_run_insert_elem(struct test_map_init *skel, map_key_t key,
+				map_value_t value)
+{
+	struct test_map_init__bss *bss;
+
+	bss = skel->bss;
+
+	bss->inKey = key;
+	bss->inValue = value;
+	bss->inPid = getpid();
+
+	if (!ASSERT_OK(test_map_init__attach(skel), "skel_attach"))
+		return -1;
+
+	/* Let tracepoint trigger */
+	syscall(__NR_getpgid);
+
+	test_map_init__detach(skel);
+
+	return 0;
+}
+
+static int check_values_one_cpu(pcpu_map_value_t *value, map_value_t expected)
+{
+	int i, nzCnt = 0;
+	map_value_t val;
+
+	for (i = 0; i < nr_cpus; i++) {
+		val = bpf_percpu(value, i);
+		if (val) {
+			if (CHECK(val != expected, "map value",
+				  "unexpected for cpu %d: 0x%llx\n", i, val))
+				return -1;
+			nzCnt++;
+		}
+	}
+
+	if (CHECK(nzCnt != 1, "map value", "set for %d CPUs instead of 1!\n",
+		  nzCnt))
+		return -1;
+
+	return 0;
+}
+
+/* Add key=1 elem with values set for all CPUs
+ * Delete elem key=1
+ * Run bpf prog that inserts new key=1 elem with value=0x1234
+ *   (bpf prog can only set value for current CPU)
+ * Lookup Key=1 and check value is as expected for all CPUs:
+ *   value set by bpf prog for one CPU, 0 for all others
+ */
+static void test_pcpu_map_init(void)
+{
+	pcpu_map_value_t value[nr_cpus];
+	struct test_map_init *skel;
+	int map_fd, err;
+	map_key_t key;
+
+	/* max 1 elem in map so insertion is forced to reuse freed entry */
+	skel = setup(BPF_MAP_TYPE_PERCPU_HASH, 1, &map_fd, 1);
+	if (!ASSERT_OK_PTR(skel, "prog_setup"))
+		return;
+
+	/* delete element so the entry can be re-used*/
+	key = 1;
+	err = bpf_map_delete_elem(map_fd, &key);
+	if (!ASSERT_OK(err, "bpf_map_delete_elem"))
+		goto cleanup;
+
+	/* run bpf prog that inserts new elem, re-using the slot just freed */
+	err = prog_run_insert_elem(skel, key, TEST_VALUE);
+	if (!ASSERT_OK(err, "prog_run_insert_elem"))
+		goto cleanup;
+
+	/* check that key=1 was re-created by bpf prog */
+	err = bpf_map_lookup_elem(map_fd, &key, value);
+	if (!ASSERT_OK(err, "bpf_map_lookup_elem"))
+		goto cleanup;
+
+	/* and has expected values */
+	check_values_one_cpu(value, TEST_VALUE);
+
+cleanup:
+	test_map_init__destroy(skel);
+}
+
+/* Add key=1 and key=2 elems with values set for all CPUs
+ * Run bpf prog that inserts new key=3 elem
+ *   (only for current cpu; other cpus should have initial value = 0)
+ * Lookup Key=1 and check value is as expected for all CPUs
+ */
+static void test_pcpu_lru_map_init(void)
+{
+	pcpu_map_value_t value[nr_cpus];
+	struct test_map_init *skel;
+	int map_fd, err;
+	map_key_t key;
+
+	/* Set up LRU map with 2 elements, values filled for all CPUs.
+	 * With these 2 elements, the LRU map is full
+	 */
+	skel = setup(BPF_MAP_TYPE_LRU_PERCPU_HASH, 2, &map_fd, 2);
+	if (!ASSERT_OK_PTR(skel, "prog_setup"))
+		return;
+
+	/* run bpf prog that inserts new key=3 element, re-using LRU slot */
+	key = 3;
+	err = prog_run_insert_elem(skel, key, TEST_VALUE);
+	if (!ASSERT_OK(err, "prog_run_insert_elem"))
+		goto cleanup;
+
+	/* check that key=3 replaced one of earlier elements */
+	err = bpf_map_lookup_elem(map_fd, &key, value);
+	if (!ASSERT_OK(err, "bpf_map_lookup_elem"))
+		goto cleanup;
+
+	/* and has expected values */
+	check_values_one_cpu(value, TEST_VALUE);
+
+cleanup:
+	test_map_init__destroy(skel);
+}
+
+void test_map_init(void)
+{
+	nr_cpus = bpf_num_possible_cpus();
+	if (nr_cpus <= 1) {
+		printf("%s:SKIP: >1 cpu needed for this test\n", __func__);
+		test__skip();
+		return;
+	}
+
+	if (test__start_subtest("pcpu_map_init"))
+		test_pcpu_map_init();
+	if (test__start_subtest("pcpu_lru_map_init"))
+		test_pcpu_lru_map_init();
+}
diff --git a/tools/testing/selftests/bpf/progs/test_map_init.c b/tools/testing/selftests/bpf/progs/test_map_init.c
new file mode 100644
index 000000000000..c89d28ead673
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/test_map_init.c
@@ -0,0 +1,33 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2020 Tessares SA <http://www.tessares.net> */
+
+#include "vmlinux.h"
+#include <bpf/bpf_helpers.h>
+
+__u64 inKey = 0;
+__u64 inValue = 0;
+__u32 inPid = 0;
+
+struct {
+	__uint(type, BPF_MAP_TYPE_PERCPU_HASH);
+	__uint(max_entries, 2);
+	__type(key, __u64);
+	__type(value, __u64);
+} hashmap1 SEC(".maps");
+
+
+SEC("tp/syscalls/sys_enter_getpgid")
+int sysenter_getpgid(const void *ctx)
+{
+	/* Just do it for once, when called from our own test prog. This
+	 * ensures the map value is only updated for a single CPU.
+	 */
+	int cur_pid = bpf_get_current_pid_tgid() >> 32;
+
+	if (cur_pid == inPid)
+		bpf_map_update_elem(&hashmap1, &inKey, &inValue, BPF_NOEXIST);
+
+	return 0;
+}
+
+char _license[] SEC("license") = "GPL";
-- 
2.29.0


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH bpf v4] bpf: zero-fill re-used per-cpu map element
  2020-11-04 11:23     ` [PATCH bpf v4] " David Verbeiren
@ 2020-11-04 20:45       ` Andrii Nakryiko
  2020-11-06  4:10       ` patchwork-bot+netdevbpf
  1 sibling, 0 replies; 12+ messages in thread
From: Andrii Nakryiko @ 2020-11-04 20:45 UTC (permalink / raw)
  To: David Verbeiren; +Cc: bpf, Andrii Nakryiko, Networking, Matthieu Baerts

On Wed, Nov 4, 2020 at 3:26 AM David Verbeiren
<david.verbeiren@tessares.net> wrote:
>
> Zero-fill element values for all other cpus than current, just as
> when not using prealloc. This is the only way the bpf program can
> ensure known initial values for all cpus ('onallcpus' cannot be
> set when coming from the bpf program).
>
> The scenario is: bpf program inserts some elements in a per-cpu
> map, then deletes some (or userspace does). When later adding
> new elements using bpf_map_update_elem(), the bpf program can
> only set the value of the new elements for the current cpu.
> When prealloc is enabled, previously deleted elements are re-used.
> Without the fix, values for other cpus remain whatever they were
> when the re-used entry was previously freed.
>
> A selftest is added to validate correct operation in above
> scenario as well as in case of LRU per-cpu map element re-use.
>
> Fixes: 6c9059817432 ("bpf: pre-allocate hash map elements")
> Acked-by: Matthieu Baerts <matthieu.baerts@tessares.net>
> Signed-off-by: David Verbeiren <david.verbeiren@tessares.net>
> ---
>
> Notes:
>     v4:
>       - Replaced racy "once" test by getpgid syscall with
>         check on pid. (Andrii)
>       - Copyright lines use /* */ (Andrii)

Acked-by: Andrii Nakryiko <andrii@kernel.org>

>
>     v3:
>       - Added selftest that was initially provided as separate
>         patch, and reworked to
>         * use skeleton (Andrii, Song Liu)
>         * skip test if <=1 CPU (Song Liu)
>
>     v2:
>       - Moved memset() to separate pcpu_init_value() function,
>         which replaces pcpu_copy_value() but delegates to it
>         for the cases where no memset() is needed (Andrii).
>       - This function now also avoids doing the memset() for
>         the current cpu for which the value must be set
>         anyhow (Andrii).
>       - Same pcpu_init_value() used for per-cpu LRU map
>         (Andrii).
>
>  kernel/bpf/hashtab.c                          |  30 ++-
>  .../selftests/bpf/prog_tests/map_init.c       | 214 ++++++++++++++++++
>  .../selftests/bpf/progs/test_map_init.c       |  33 +++
>  3 files changed, 275 insertions(+), 2 deletions(-)
>  create mode 100644 tools/testing/selftests/bpf/prog_tests/map_init.c
>  create mode 100644 tools/testing/selftests/bpf/progs/test_map_init.c
>

[...]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH bpf v4] bpf: zero-fill re-used per-cpu map element
  2020-11-04 11:23     ` [PATCH bpf v4] " David Verbeiren
  2020-11-04 20:45       ` Andrii Nakryiko
@ 2020-11-06  4:10       ` patchwork-bot+netdevbpf
  1 sibling, 0 replies; 12+ messages in thread
From: patchwork-bot+netdevbpf @ 2020-11-06  4:10 UTC (permalink / raw)
  To: David Verbeiren; +Cc: bpf, andrii, netdev, matthieu.baerts

Hello:

This patch was applied to bpf/bpf.git (refs/heads/master):

On Wed,  4 Nov 2020 12:23:32 +0100 you wrote:
> Zero-fill element values for all other cpus than current, just as
> when not using prealloc. This is the only way the bpf program can
> ensure known initial values for all cpus ('onallcpus' cannot be
> set when coming from the bpf program).
> 
> The scenario is: bpf program inserts some elements in a per-cpu
> map, then deletes some (or userspace does). When later adding
> new elements using bpf_map_update_elem(), the bpf program can
> only set the value of the new elements for the current cpu.
> When prealloc is enabled, previously deleted elements are re-used.
> Without the fix, values for other cpus remain whatever they were
> when the re-used entry was previously freed.
> 
> [...]

Here is the summary with links:
  - [bpf,v4] bpf: zero-fill re-used per-cpu map element
    https://git.kernel.org/bpf/bpf/c/d3bec0138bfb

You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2020-11-06  4:10 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-10-23 12:37 [PATCH bpf] bpf: zero-fill re-used per-cpu map element David Verbeiren
2020-10-26 22:47 ` Andrii Nakryiko
2020-10-27 14:48   ` David Verbeiren
2020-10-27 22:13 ` [PATCH bpf v2] " David Verbeiren
2020-10-27 22:55   ` Andrii Nakryiko
2020-10-29 14:44     ` David Verbeiren
2020-10-29 22:40       ` Andrii Nakryiko
2020-11-03 15:47   ` [PATCH bpf v3] " David Verbeiren
2020-11-03 18:19     ` Andrii Nakryiko
2020-11-04 11:23     ` [PATCH bpf v4] " David Verbeiren
2020-11-04 20:45       ` Andrii Nakryiko
2020-11-06  4:10       ` patchwork-bot+netdevbpf

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.