From: Hou Tao <houtao@huaweicloud.com>
To: bpf@vger.kernel.org
Cc: Martin KaFai Lau <martin.lau@linux.dev>,
Alexei Starovoitov <alexei.starovoitov@gmail.com>,
Andrii Nakryiko <andrii@kernel.org>, Song Liu <song@kernel.org>,
Hao Luo <haoluo@google.com>,
Yonghong Song <yonghong.song@linux.dev>,
Daniel Borkmann <daniel@iogearbox.net>,
KP Singh <kpsingh@kernel.org>,
Stanislav Fomichev <sdf@google.com>, Jiri Olsa <jolsa@kernel.org>,
John Fastabend <john.fastabend@gmail.com>,
Hsin-Wei Hung <hsinweih@uci.edu>,
houtao1@huawei.com
Subject: [PATCH bpf-next v3] bpf: Check map->usercnt after timer->timer is assigned
Date: Mon, 30 Oct 2023 14:36:16 +0800 [thread overview]
Message-ID: <20231030063616.1653024-1-houtao@huaweicloud.com> (raw)
From: Hou Tao <houtao1@huawei.com>
When there are concurrent uref release and bpf timer init operations,
the following sequence diagram is possible. It will break the guarantee
provided by bpf_timer: bpf_timer will still be alive after userspace
application releases or unpins the map. It also will lead to kmemleak
for old kernel version which doesn't release bpf_timer when map is
released.
bpf program X:
bpf_timer_init()
lock timer->lock
read timer->timer as NULL
read map->usercnt != 0
process Y:
close(map_fd)
// put last uref
bpf_map_put_uref()
atomic_dec_and_test(map->usercnt)
array_map_free_timers()
bpf_timer_cancel_and_free()
// just return
read timer->timer is NULL
t = bpf_map_kmalloc_node()
timer->timer = t
unlock timer->lock
Fix the problem by checking map->usercnt after timer->timer is assigned,
so when there are concurrent uref release and bpf timer init, either
bpf_timer_cancel_and_free() from uref release reads a no-NULL timer
or the newly-added atomic64_read() returns a zero usercnt.
Because atomic_dec_and_test(map->usercnt) and READ_ONCE(timer->timer)
in bpf_timer_cancel_and_free() are not protected by a lock, so add
a memory barrier to guarantee the order between map->usercnt and
timer->timer. Also use WRITE_ONCE(timer->timer, x) to match the lockless
read of timer->timer in bpf_timer_cancel_and_free().
Reported-by: Hsin-Wei Hung <hsinweih@uci.edu>
Closes: https://lore.kernel.org/bpf/CABcoxUaT2k9hWsS1tNgXyoU3E-=PuOgMn737qK984fbFmfYixQ@mail.gmail.com
Fixes: b00628b1c7d5 ("bpf: Introduce bpf timers.")
Signed-off-by: Hou Tao <houtao1@huawei.com>
---
v3:
* patch #1: only check map->usercnt once and call kfree() with
spin-lock acquired in error handling path (Alexei)
update the commit messsage to explain that the patch only
fixes the broken-guarantee problem for bpf_timer. The
kmemleak problem will be fixed by another patchset.
* patch #2: remove the selftest patch because it demonstrates the
use-after-free problem for map-in-map and the kmemleak
problem is just the superficial phenomenon. It will be
re-added and refined in another patchset.
v2: https://lore.kernel.org/bpf/20231020014214.2471419-1-houtao@huaweicloud.com
* patch #1: use smp_mb() instead of smp_mb__before_atomic()
* patch #2: use WRITE_ONCE(timer->timer, x) to match the lockless read
of timer->timer
v1: https://lore.kernel.org/bpf/20231017125717.241101-1-houtao@huaweicloud.com
kernel/bpf/helpers.c | 25 ++++++++++++++++---------
1 file changed, 16 insertions(+), 9 deletions(-)
diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index e46ac288a1080..aed93df5c8aa0 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -1177,13 +1177,6 @@ BPF_CALL_3(bpf_timer_init, struct bpf_timer_kern *, timer, struct bpf_map *, map
ret = -EBUSY;
goto out;
}
- if (!atomic64_read(&map->usercnt)) {
- /* maps with timers must be either held by user space
- * or pinned in bpffs.
- */
- ret = -EPERM;
- goto out;
- }
/* allocate hrtimer via map_kmalloc to use memcg accounting */
t = bpf_map_kmalloc_node(map, sizeof(*t), GFP_ATOMIC, map->numa_node);
if (!t) {
@@ -1196,7 +1189,21 @@ BPF_CALL_3(bpf_timer_init, struct bpf_timer_kern *, timer, struct bpf_map *, map
rcu_assign_pointer(t->callback_fn, NULL);
hrtimer_init(&t->timer, clockid, HRTIMER_MODE_REL_SOFT);
t->timer.function = bpf_timer_cb;
- timer->timer = t;
+ WRITE_ONCE(timer->timer, t);
+ /* Guarantee the order between timer->timer and map->usercnt. So
+ * when there are concurrent uref release and bpf timer init, either
+ * bpf_timer_cancel_and_free() called by uref release reads a no-NULL
+ * timer or atomic64_read() below returns a zero usercnt.
+ */
+ smp_mb();
+ if (!atomic64_read(&map->usercnt)) {
+ /* maps with timers must be either held by user space
+ * or pinned in bpffs.
+ */
+ WRITE_ONCE(timer->timer, NULL);
+ kfree(t);
+ ret = -EPERM;
+ }
out:
__bpf_spin_unlock_irqrestore(&timer->lock);
return ret;
@@ -1374,7 +1381,7 @@ void bpf_timer_cancel_and_free(void *val)
/* The subsequent bpf_timer_start/cancel() helpers won't be able to use
* this timer, since it won't be initialized.
*/
- timer->timer = NULL;
+ WRITE_ONCE(timer->timer, NULL);
out:
__bpf_spin_unlock_irqrestore(&timer->lock);
if (!t)
--
2.29.2
next reply other threads:[~2023-10-30 6:35 UTC|newest]
Thread overview: 2+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-10-30 6:36 Hou Tao [this message]
2023-11-02 6:00 ` [PATCH bpf-next v3] bpf: Check map->usercnt after timer->timer is assigned patchwork-bot+netdevbpf
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20231030063616.1653024-1-houtao@huaweicloud.com \
--to=houtao@huaweicloud.com \
--cc=alexei.starovoitov@gmail.com \
--cc=andrii@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=daniel@iogearbox.net \
--cc=haoluo@google.com \
--cc=houtao1@huawei.com \
--cc=hsinweih@uci.edu \
--cc=john.fastabend@gmail.com \
--cc=jolsa@kernel.org \
--cc=kpsingh@kernel.org \
--cc=martin.lau@linux.dev \
--cc=sdf@google.com \
--cc=song@kernel.org \
--cc=yonghong.song@linux.dev \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.