bpf.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Lorenz Bauer <lmb@cloudflare.com>
To: bpf@vger.kernel.org, ast@kernel.org, daniel@iogearbox.net
Cc: joe@wand.net.nz, Lorenz Bauer <lmb@cloudflare.com>
Subject: [PATCH bpf] bpf: improve htab_map_get_next_key behaviour during races
Date: Fri, 18 Oct 2019 14:43:11 +0100	[thread overview]
Message-ID: <20191018134311.7284-1-lmb@cloudflare.com> (raw)

To iterate a BPF map, userspace must use MAP_GET_NEXT_KEY and provide
the last retrieved key. The code then scans the hash table bucket
for the key and returns the key of the next item.

This presents a problem if the last retrieved key isn't present in the
hash table anymore, e.g. due to concurrent deletion. It's not possible
to ascertain the location of a key in a given bucket, so there isn't
really a correct answer. The implementation currently returns the
first key in the first bucket. This guarantees that we never skip an
existing key. However, it means that a user space program iterating
a heavily modified map may never reach the end of the hash table,
forever restarting at the beginning.

Fixing this outright is rather involved. However, we can improve slightly
by never revisiting earlier buckets. Instead of the first key in the
first bucket we return the first key in the "current" bucket. This
doesn't eliminate the problem, but makes it less likely to occur.

Prior to commit 8fe45924387b ("bpf: map_get_next_key to return first key on NULL")
passing a non-existent key to MAP_GET_NEXT_KEY was the only way to
find the first key. Hence there is a small chance that there is code that
will be broken by this change.

Fixes: 8fe45924387b ("bpf: map_get_next_key to return first key on NULL")
Signed-off-by: Lorenz Bauer <lmb@cloudflare.com>
---
 kernel/bpf/hashtab.c                    | 4 +++-
 tools/testing/selftests/bpf/test_maps.c | 2 +-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
index 22066a62c8c9..30f0dab488f0 100644
--- a/kernel/bpf/hashtab.c
+++ b/kernel/bpf/hashtab.c
@@ -613,6 +613,9 @@ static int htab_map_get_next_key(struct bpf_map *map, void *key, void *next_key)
 
 	head = select_bucket(htab, hash);
 
+	/* don't iterate previous buckets */
+	i = hash & (htab->n_buckets - 1);
+
 	/* lookup the key */
 	l = lookup_nulls_elem_raw(head, hash, key, key_size, htab->n_buckets);
 
@@ -630,7 +633,6 @@ static int htab_map_get_next_key(struct bpf_map *map, void *key, void *next_key)
 	}
 
 	/* no more elements in this hash list, go to the next bucket */
-	i = hash & (htab->n_buckets - 1);
 	i++;
 
 find_first_elem:
diff --git a/tools/testing/selftests/bpf/test_maps.c b/tools/testing/selftests/bpf/test_maps.c
index 806b298397d3..6f351e532ddc 100644
--- a/tools/testing/selftests/bpf/test_maps.c
+++ b/tools/testing/selftests/bpf/test_maps.c
@@ -100,7 +100,7 @@ static void test_hashmap(unsigned int task, void *data)
 	assert(bpf_map_get_next_key(fd, NULL, &first_key) == 0 &&
 	       (first_key == 1 || first_key == 2));
 	assert(bpf_map_get_next_key(fd, &key, &next_key) == 0 &&
-	       (next_key == first_key));
+	       (next_key == 1 || next_key == 2));
 	assert(bpf_map_get_next_key(fd, &next_key, &next_key) == 0 &&
 	       (next_key == 1 || next_key == 2) &&
 	       (next_key != first_key));
-- 
2.20.1


             reply	other threads:[~2019-10-18 13:43 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-10-18 13:43 Lorenz Bauer [this message]
2019-10-18 16:03 ` [PATCH bpf] bpf: improve htab_map_get_next_key behaviour during races Alexei Starovoitov
2019-10-18 20:56   ` Yonghong Song

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20191018134311.7284-1-lmb@cloudflare.com \
    --to=lmb@cloudflare.com \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=joe@wand.net.nz \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).