From: Hou Tao <houtao@huaweicloud.com>
To: bpf@vger.kernel.org, Martin KaFai Lau <martin.lau@linux.dev>,
	Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Andrii Nakryiko <andrii@kernel.org>, Song Liu <song@kernel.org>,
	Hao Luo <haoluo@google.com>, Yonghong Song <yhs@fb.com>,
	Daniel Borkmann <daniel@iogearbox.net>,
	KP Singh <kpsingh@kernel.org>,
	Stanislav Fomichev <sdf@google.com>, Jiri Olsa <jolsa@kernel.org>,
	John Fastabend <john.fastabend@gmail.com>,
	"Paul E . McKenney" <paulmck@kernel.org>,
	rcu@vger.kernel.org, houtao1@huawei.com
Subject: [RFC bpf-next v3 0/6] Handle immediate reuse in bpf memory allocator
Date: Sat, 29 Apr 2023 18:12:09 +0800
Message-ID: <20230429101215.111262-1-houtao@huaweicloud.com>

From: Hou Tao <houtao1@huawei.com>

Hi,

As discussed in v1, objects freed to the bpf memory allocator may
currently be reused immediately by a new allocation. This introduces a
use-after-bpf-ma-free problem for non-preallocated hash maps and can
make the lookup procedure return incorrect results. Immediate reuse also
makes it harder to introduce new use cases (e.g. qp-trie).

This patch series tries to solve these problems by introducing
BPF_MA_{REUSE|FREE}_AFTER_RCU_GP in the bpf memory allocator. For
REUSE_AFTER_RCU_GP, freed objects are reused only after one RCU grace
period and may be freed by the bpf memory allocator after another
RCU-tasks-trace grace period. So bpf programs which care about the reuse
problem can use bpf_rcu_read_{lock,unlock}() to access these objects
safely, and for those which don't care, use-after-bpf-ma-free remains
safe because these objects have not been freed by the bpf memory
allocator. FREE_AFTER_RCU_GP behaves differently. Instead of making the
freed elements reusable after one RCU GP, it frees these elements
directly back to slab after one RCU GP, so sleepable bpf programs must
use bpf_rcu_read_{lock,unlock}() to access elements allocated from a
FREE_AFTER_RCU_GP bpf memory allocator.
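
To make the difference between immediate reuse and reuse after a grace
period concrete, here is a toy user-space model. It is only a sketch and
not the code in kernel/bpf/memalloc.c: every name in it (toy_ma,
toy_alloc(), toy_free(), toy_grace_period(), the TOY_* constants) is
hypothetical, and a grace period is reduced to a counter that the caller
bumps by hand. It models only the reuse side of REUSE_AFTER_RCU_GP and
leaves out the later free back to slab.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only; all names are hypothetical, not kernel APIs. */
#define TOY_NOBJ 4

enum toy_state { TOY_FREE, TOY_IN_USE, TOY_WAIT_GP };
enum toy_policy { TOY_IMMEDIATE_REUSE, TOY_REUSE_AFTER_GP };

struct toy_ma {
	enum toy_policy policy;
	unsigned long gp_seq;             /* completed "grace periods" */
	enum toy_state state[TOY_NOBJ];
	unsigned long freed_at[TOY_NOBJ]; /* gp_seq at the time of free */
};

static void toy_init(struct toy_ma *ma, enum toy_policy policy)
{
	ma->policy = policy;
	ma->gp_seq = 0;
	for (int i = 0; i < TOY_NOBJ; i++) {
		ma->state[i] = TOY_FREE;
		ma->freed_at[i] = 0;
	}
}

/* Return the index of the allocated object, or -1 if none is reusable. */
static int toy_alloc(struct toy_ma *ma)
{
	for (int i = 0; i < TOY_NOBJ; i++) {
		/* A waiting object becomes reusable once a GP has passed. */
		if (ma->state[i] == TOY_WAIT_GP && ma->gp_seq > ma->freed_at[i])
			ma->state[i] = TOY_FREE;
		if (ma->state[i] == TOY_FREE) {
			ma->state[i] = TOY_IN_USE;
			return i;
		}
	}
	return -1;
}

static void toy_free(struct toy_ma *ma, int idx)
{
	assert(idx >= 0 && idx < TOY_NOBJ);
	assert(ma->state[idx] == TOY_IN_USE);
	if (ma->policy == TOY_IMMEDIATE_REUSE) {
		/* The very next toy_alloc() may hand this object out again. */
		ma->state[idx] = TOY_FREE;
	} else {
		/* Park the object until a later grace period completes. */
		ma->state[idx] = TOY_WAIT_GP;
		ma->freed_at[idx] = ma->gp_seq;
	}
}

/* Stand-in for one RCU grace period elapsing. */
static void toy_grace_period(struct toy_ma *ma)
{
	ma->gp_seq++;
}
```

With TOY_IMMEDIATE_REUSE, an alloc/free/alloc sequence gets the same
object back, which is the situation a concurrent lookup can trip over.
With TOY_REUSE_AFTER_GP the second allocation picks a different object,
and the freed one only comes back after toy_grace_period() has been
called.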

Personally I prefer FREE_AFTER_RCU_GP because its implementation is much
simpler than REUSE_AFTER_RCU_GP and its memory usage is also better. But
its shortcoming is also obvious, so I want to get some feedback before
putting in more effort. As usual, comments and suggestions are always
welcome.

Change Log:
v3:
 * add BPF_MA_FREE_AFTER_RCU_GP bpf memory allocator
 * Update htab memory benchmark
   * move the benchmark patch to the last patch
   * remove array and useless bpf_map_lookup_elem(&array, ...) in bpf
     programs
   * add synchronization between the addition CPU and the deletion CPU
     for the add_del_on_diff_cpu case to prevent unnecessary looping
   * add the benchmark result for "extra call_rcu + bpf ma"

v2: https://lore.kernel.org/bpf/20230408141846.1878768-1-houtao@huaweicloud.com/
 * add a benchmark for bpf memory allocator to compare different
   flavors of the bpf memory allocator.
 * implement BPF_MA_REUSE_AFTER_RCU_GP for bpf memory allocator.
v1: https://lore.kernel.org/bpf/20221230041151.1231169-1-houtao@huaweicloud.com/

Hou Tao (6):
  bpf: Factor out a common helper free_all()
  bpf: Pass bitwise flags to bpf_mem_alloc_init()
  bpf: Introduce BPF_MA_REUSE_AFTER_RCU_GP
  bpf: Introduce BPF_MA_FREE_AFTER_RCU_GP
  bpf: Add two module parameters in htab for memory benchmark
  selftests/bpf: Add benchmark for bpf memory allocator

 include/linux/bpf_mem_alloc.h                 |  10 +-
 kernel/bpf/core.c                             |   2 +-
 kernel/bpf/cpumask.c                          |   2 +-
 kernel/bpf/hashtab.c                          |  43 +-
 kernel/bpf/memalloc.c                         | 529 ++++++++++++++++--
 tools/testing/selftests/bpf/Makefile          |   3 +
 tools/testing/selftests/bpf/bench.c           |   4 +
 .../selftests/bpf/benchs/bench_htab_mem.c     | 352 ++++++++++++
 .../bpf/benchs/run_bench_htab_mem.sh          |  64 +++
 .../selftests/bpf/progs/htab_mem_bench.c      | 135 +++++
 10 files changed, 1090 insertions(+), 54 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/benchs/bench_htab_mem.c
 create mode 100755 tools/testing/selftests/bpf/benchs/run_bench_htab_mem.sh
 create mode 100644 tools/testing/selftests/bpf/progs/htab_mem_bench.c

-- 
2.29.2


Thread overview: 20+ messages
2023-04-29 10:12 Hou Tao [this message]
2023-04-29 10:12 ` [RFC bpf-next v3 1/6] bpf: Factor out a common helper free_all() Hou Tao
2023-04-29 10:12 ` [RFC bpf-next v3 2/6] bpf: Pass bitwise flags to bpf_mem_alloc_init() Hou Tao
2023-04-29 10:12 ` [RFC bpf-next v3 3/6] bpf: Introduce BPF_MA_REUSE_AFTER_RCU_GP Hou Tao
2023-05-01 23:59   ` Martin KaFai Lau
2023-05-03 18:48   ` Alexei Starovoitov
2023-05-03 21:57     ` Martin KaFai Lau
2023-05-03 23:06       ` Alexei Starovoitov
2023-05-03 23:39         ` Martin KaFai Lau
2023-05-04  1:42           ` Alexei Starovoitov
2023-05-04  2:08           ` Hou Tao
2023-05-04  1:35     ` Hou Tao
2023-05-04  2:00       ` Alexei Starovoitov
2023-05-04  2:30         ` Hou Tao
2023-06-01 17:36           ` Alexei Starovoitov
2023-06-02  2:39             ` Hou Tao
2023-06-02 16:25               ` Alexei Starovoitov
2023-04-29 10:12 ` [RFC bpf-next v3 4/6] bpf: Introduce BPF_MA_FREE_AFTER_RCU_GP Hou Tao
2023-04-29 10:12 ` [RFC bpf-next v3 5/6] bpf: Add two module parameters in htab for memory benchmark Hou Tao
2023-04-29 10:12 ` [RFC bpf-next v3 6/6] selftests/bpf: Add benchmark for bpf memory allocator Hou Tao
