From: Hou Tao <houtao@huaweicloud.com>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Kumar Kartikeya Dwivedi <memxor@gmail.com>,
	Yonghong Song <yhs@meta.com>, bpf <bpf@vger.kernel.org>,
	Martin KaFai Lau <martin.lau@linux.dev>,
	Andrii Nakryiko <andrii@kernel.org>, Song Liu <song@kernel.org>,
	Hao Luo <haoluo@google.com>, Yonghong Song <yhs@fb.com>,
	Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	KP Singh <kpsingh@kernel.org>,
	Stanislav Fomichev <sdf@google.com>, Jiri Olsa <jolsa@kernel.org>,
	John Fastabend <john.fastabend@gmail.com>,
	"Paul E . McKenney" <paulmck@kernel.org>,
	rcu@vger.kernel.org, Hou Tao <houtao1@huawei.com>,
	Martin KaFai Lau <martin.lau@kernel.org>
Subject: Re: [RFC PATCH bpf-next 0/6] bpf: Handle reuse in bpf memory alloc
Date: Fri, 17 Feb 2023 09:19:30 +0800
Message-ID: <2a58c4a8-781f-6d84-e72a-f8b7117762b4@huaweicloud.com>
In-Reply-To: <CAADnVQ+w9h4T6k+F5cLGVVx1jkHvKCF7=ki_Fb1oCp1SF1ZDNA@mail.gmail.com>


On 2/17/2023 12:35 AM, Alexei Starovoitov wrote:
> On Thu, Feb 16, 2023 at 5:55 AM Hou Tao <houtao@huaweicloud.com> wrote:
>> Besides BPF_REUSE_AFTER_RCU_GP, is BPF_FREE_AFTER_RCU_GP a feasible solution?
> The idea is for bpf_mem_free to wait normal RCU GP before adding
> the elements back to the free list and free the elem to global kernel memory
> only after both rcu and rcu_tasks_trace GPs as it's doing now.
>> Its downside is that it will force sleepable programs to use
>> bpf_rcu_read_{lock,unlock}() to access these returned pointers?
> sleepable can access elems without kptrs/spin_locks
> even when not using rcu_read_lock, since it's safe, but there is uaf.
> Some progs might be fine with it.
> When sleepable needs to avoid uaf they will use bpf_rcu_read_lock.
Thanks for the explanation of BPF_REUSE_AFTER_RCU_GP. It seems that
BPF_REUSE_AFTER_RCU_GP may easily lead to OOM. Before one RCU GP expires, the
freed elements are available to neither the bpf ma nor the slab subsystem.
After the RCU GP expires, the freed elements are available only to the one bpf
ma that freed them, and there may be too many of them for that bpf ma to keep,
so part of them will be freed again through call_rcu_tasks_trace(), and these
freed-again elements will not be available to the slab subsystem until the
tasks trace RCU GP expires. In brief, after one RCU GP, part of the freed
elements will be reused, but the majority of them will still be freed through
call_rcu_tasks_trace().

Because of this concern, I proposed BPF_FREE_AFTER_RCU_GP, which frees these
elements directly after one RCU GP and requires sleepable programs to use
bpf_rcu_read_lock() to access them. But that requirement would break existing
sleepable programs, so BPF_FREE_AFTER_RCU_GP is still not a good idea. I will
check whether there is still an OOM risk for BPF_REUSE_AFTER_RCU_GP and try to
mitigate it if possible (e.g., by sharing these freed elements between all bpf
ma instead of keeping them in the one bpf ma that freed them).
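
To make the accounting concern concrete, here is a rough userspace sketch of
where a batch of freed elements sits at each stage under
BPF_REUSE_AFTER_RCU_GP. It is only a toy model, not kernel code: struct
ma_state, the ma_*() helpers and the cache_cap free-list limit are all
hypothetical names made up for illustration, and real bpf_mem_alloc behaviour
(per-CPU caches, watermarks, irq_work refill) is ignored.

```c
#include <assert.h>

/* Toy model only. Every name here is hypothetical, not a kernel API. */
struct ma_state {
	unsigned int reusable;      /* on the owning bpf ma's free list */
	unsigned int waiting_rcu;   /* waiting for a normal RCU GP */
	unsigned int waiting_trace; /* waiting for a tasks trace RCU GP */
	unsigned int in_slab;       /* returned to the slab subsystem */
};

static unsigned int ma_total(const struct ma_state *s)
{
	return s->reusable + s->waiting_rcu + s->waiting_trace + s->in_slab;
}

/* n elements are freed at once: nobody can use them until the RCU GP ends. */
static struct ma_state ma_free_batch(unsigned int n)
{
	struct ma_state s = { 0, n, 0, 0 };

	return s;
}

/*
 * The normal RCU GP has expired. The owning bpf ma keeps at most cache_cap
 * elements for reuse (assumed limit); the rest are freed again and must
 * wait for a tasks trace RCU GP before slab can have them.
 */
static void ma_rcu_gp_done(struct ma_state *s, unsigned int cache_cap)
{
	unsigned int before = ma_total(s);
	unsigned int room = cache_cap > s->reusable ? cache_cap - s->reusable : 0;
	unsigned int keep = s->waiting_rcu < room ? s->waiting_rcu : room;

	s->reusable += keep;
	s->waiting_trace += s->waiting_rcu - keep;
	s->waiting_rcu = 0;
	assert(ma_total(s) == before); /* elements only change state */
}

/* The tasks trace RCU GP has expired: the overflow finally reaches slab. */
static void ma_trace_gp_done(struct ma_state *s)
{
	s->in_slab += s->waiting_trace;
	s->waiting_trace = 0;
}
```

With made-up numbers, say 1000 elements freed and cache_cap = 64, the model
leaves 64 elements reusable and 936 still waiting for the tasks trace RCU GP
after ma_rcu_gp_done(), which is the "majority still goes through
call_rcu_tasks_trace()" case described above.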

Thread overview: 35+ messages
2022-12-30  4:11 [RFC PATCH bpf-next 0/6] bpf: Handle reuse in bpf memory alloc Hou Tao
2022-12-30  4:11 ` [RFC PATCH bpf-next 1/6] bpf: Support ctor in bpf memory allocator Hou Tao
2022-12-30  4:11 ` [RFC PATCH bpf-next 2/6] bpf: Factor out a common helper free_llist() Hou Tao
2022-12-30  4:11 ` [RFC PATCH bpf-next 3/6] bpf: Pass bitwise flags to bpf_mem_alloc_init() Hou Tao
2022-12-30  4:11 ` [RFC PATCH bpf-next 4/6] bpf: Introduce BPF_MA_NO_REUSE for bpf memory allocator Hou Tao
2022-12-30  4:11 ` [RFC PATCH bpf-next 5/6] bpf: Use BPF_MA_NO_REUSE in htab map Hou Tao
2022-12-30  4:11 ` [RFC PATCH bpf-next 6/6] selftests/bpf: Add test case for element reuse " Hou Tao
2023-01-01  1:26 ` [RFC PATCH bpf-next 0/6] bpf: Handle reuse in bpf memory alloc Alexei Starovoitov
2023-01-01 18:48   ` Yonghong Song
2023-01-03 13:47     ` Hou Tao
2023-01-04  6:10       ` Yonghong Song
2023-01-04  6:30         ` Hou Tao
2023-01-04  7:14           ` Yonghong Song
2023-01-04 18:26             ` Alexei Starovoitov
2023-02-10 16:32               ` Kumar Kartikeya Dwivedi
2023-02-10 21:06                 ` Alexei Starovoitov
2023-02-11  1:09                   ` Hou Tao
2023-02-11 16:33                     ` Alexei Starovoitov
2023-02-11 16:34                       ` Alexei Starovoitov
2023-02-15  1:54                         ` Martin KaFai Lau
2023-02-15  4:02                           ` Hou Tao
2023-02-15  7:22                             ` Martin KaFai Lau
2023-02-16  2:11                               ` Hou Tao
2023-02-16  7:47                                 ` Martin KaFai Lau
2023-02-16  8:18                                   ` Hou Tao
2023-02-16 13:55                         ` Hou Tao
2023-02-16 16:35                           ` Alexei Starovoitov
2023-02-17  1:19                             ` Hou Tao [this message]
2023-02-22 19:30                               ` Alexei Starovoitov
2023-02-15  2:35                       ` Hou Tao
2023-02-15  2:42                         ` Alexei Starovoitov
2023-02-15  3:00                           ` Hou Tao
2023-01-03 13:40   ` Hou Tao
2023-01-03 19:38     ` Alexei Starovoitov
2023-01-10  6:26       ` Martin KaFai Lau
