From: Hou Tao <houtao@huaweicloud.com>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Kumar Kartikeya Dwivedi <memxor@gmail.com>,
	Yonghong Song <yhs@meta.com>, bpf <bpf@vger.kernel.org>,
	Martin KaFai Lau <martin.lau@linux.dev>,
	Andrii Nakryiko <andrii@kernel.org>, Song Liu <song@kernel.org>,
	Hao Luo <haoluo@google.com>, Yonghong Song <yhs@fb.com>,
	Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	KP Singh <kpsingh@kernel.org>,
	Stanislav Fomichev <sdf@google.com>, Jiri Olsa <jolsa@kernel.org>,
	John Fastabend <john.fastabend@gmail.com>,
	"Paul E . McKenney" <paulmck@kernel.org>,
	rcu@vger.kernel.org, Hou Tao <houtao1@huawei.com>,
	Martin KaFai Lau <martin.lau@kernel.org>
Subject: Re: [RFC PATCH bpf-next 0/6] bpf: Handle reuse in bpf memory alloc
Date: Wed, 15 Feb 2023 10:35:58 +0800	[thread overview]
Message-ID: <19bf22cd-2344-4029-a2ee-ce4bcc1db048@huaweicloud.com> (raw)
In-Reply-To: <CAADnVQKecUqGF-gLFS5Wiz7_E-cHOkp7NPCUK0woHUmJG6hEuA@mail.gmail.com>

Hi,

On 2/12/2023 12:33 AM, Alexei Starovoitov wrote:
> On Fri, Feb 10, 2023 at 5:10 PM Hou Tao <houtao@huaweicloud.com> wrote:
>>>> Hou, are you planning to resubmit this change? I also hit this while testing my
>>>> changes on bpf-next.
>>> Are you talking about the whole patch set or just GFP_ZERO in mem_alloc?
>>> The former will take a long time to settle.
>>> The latter is trivial.
>>> To unblock yourself just add GFP_ZERO in an extra patch?
>> Sorry for the long delay. I just found time to run some tests comparing the
>> performance of bzero and ctor. Once they are done, I will resubmit next week.
> I still don't like ctor as a concept. In general the callbacks in the critical
> path are guaranteed to be slow due to retpoline overhead.
> Please send a patch to add GFP_ZERO.
I see. Will do. But I think it is better to first get a coarse measure of the
overhead of the two methods, so I hacked map_perf_test to support a
customizable value size for hash_map_alloc and ran some benchmarks comparing
the overhead of ctor and GFP_ZERO. The benchmarks were run in a KVM VM with
8 CPUs. When the number of allocated elements is small, the overheads of ctor
and bzero are basically the same, but when the number of allocated elements
increases (e.g., the table is half full), the overhead of ctor becomes larger.
For big value sizes, the overheads of ctor and bzero are again basically the
same, apparently because the main overhead comes from the slab allocation
itself.
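
For reference, the hack essentially makes the value size a command-line
parameter. Below is a minimal standalone sketch of the same measurement idea;
it times map updates from user space via libbpf's bpf_map_create() and
bpf_map_update_elem(), whereas the real map_perf_test exercises the map from
BPF programs, so treat it only as an illustration:

/* bench.c: illustrative sketch, not the actual samples/bpf change.
 * Times syscall-side updates of a hash map whose value size comes
 * from argv. Needs root and libbpf: gcc bench.c -o bench -lbpf
 */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <bpf/bpf.h>

int main(int argc, char **argv)
{
	__u32 value_size = argc > 1 ? atoi(argv[1]) : 8;
	__u32 max_entries = 8192, i;
	struct timespec t0, t1;
	char *value;
	int fd;

	fd = bpf_map_create(BPF_MAP_TYPE_HASH, "bench", sizeof(__u32),
			    value_size, max_entries, NULL);
	if (fd < 0) {
		perror("bpf_map_create");
		return 1;
	}
	value = calloc(1, value_size);
	clock_gettime(CLOCK_MONOTONIC, &t0);
	/* fill half of the table, as in the half-full runs below */
	for (i = 0; i < max_entries / 2; i++)
		bpf_map_update_elem(fd, &i, value, BPF_ANY);
	clock_gettime(CLOCK_MONOTONIC, &t1);
	printf("value_size=%u: %ld us\n", value_size,
	       (t1.tv_sec - t0.tv_sec) * 1000000 +
	       (t1.tv_nsec - t0.tv_nsec) / 1000);
	free(value);
	return 0;
}

The detailed results follow: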

* ./map_perf_test 4 8 8192 10000 $value_size

The key of the htab is the thread pid, so only 8 elements are allocated.

| value_size | 8      | 256    | 4K     | 16K    | 64K    | 256K   |
| --         | --     | --     | --     | --     | --     | --     |
| base       | 256604 | 261112 | 173646 | 74195  | 23138  | 6275   |
| bzero      | 253362 | 257563 | 171445 | 73303  | 22949  | 6249   |
| ctor       | 264570 | 258246 | 175048 | 72511  | 23004  | 6270   |

* ./map_perf_test 4 8 8192 100 $value_size

The key is still the thread pid, so only 8 elements are allocated. The loop
count is decreased to 100 to show the overhead of the first allocation.

| value_size | 8      | 256    | 4K     | 16K    | 64K    | 256K   |
| --         | --     | --     | --     | --     | --     | --     |
| base       | 135662 | 137742 | 87043  | 36265  | 12501  | 4450   |
| bzero      | 139993 | 134920 | 94570  | 37190  | 12543  | 4131   |
| ctor       | 147949 | 141825 | 94321  | 38240  | 13131  | 4248   |

* ./map_perf_test 4 8 8192 1000 $value_size

Each thread creates 512 different keys, so the hash table is half full. The
loop count is also decreased to 1000.

| value_size | 8      | 256    | 4K     | 16K    | 64K    | 256K   |
| --         | --     | --     | --     | --     | --     | --     |
| base       | 4234   | 4289   | 1478   | 510    | 168    | 46     |
| bzero      | 3792   | 4002   | 1473   | 515    | 161    | 37     |
| ctor       | 3846   | 2198   | 1269   | 499    | 161    | 42     |

* ./map_perf_test 4 8 8192 100 $value_size

Each thread creates 512 different keys, so the hash table is half full. The
loop count is also decreased to 100.

| value_size | 8      | 256    | 4K     | 16K    | 64K    | 256K   |
| --         | --     | --     | --     | --     | --     | --     |
| base       | 3669   | 3419   | 1272   | 476    | 168    | 44     |
| bzero      | 3468   | 3499   | 1274   | 476    | 150    | 36     |
| ctor       | 2235   | 2312   | 1128   | 452    | 145    | 35     |
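
Will send the GFP_ZERO patch separately. The change itself should be tiny;
roughly the following, though the exact call site and flag plumbing in
kernel/bpf/memalloc.c may differ (a sketch of the intent, not the actual
patch):

--- a/kernel/bpf/memalloc.c
+++ b/kernel/bpf/memalloc.c
@@ (illustrative context only) @@
-	return kmalloc_node(c->unit_size, flags, node);
+	/* zero every element on (re)allocation, like kzalloc() */
+	return kmalloc_node(c->unit_size, flags | __GFP_ZERO, node);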
>
> Also I realized that we can make the BPF_REUSE_AFTER_RCU_GP flag usable
> without risking OOM by only waiting for normal rcu GP and not rcu_tasks_trace.
> This approach will work for inner nodes of qptrie, since bpf progs
> never see pointers to them. It will work for local storage
> converted to bpf_mem_alloc too. It wouldn't need to use its own call_rcu.
> It's also safe without uaf caveat in sleepable progs and sleepable progs
> can use explicit bpf_rcu_read_lock() when they want to avoid uaf.
> So please respin the set with rcu gp only and that new flag.
> .
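
For the sleepable case mentioned above, the explicit bpf_rcu_read_lock()
usage would look roughly like the sketch below. It mirrors the pattern used
by the rcu_read_lock selftests; the attach point and the field being
accessed are just examples:

#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>

/* RCU-protection kfuncs added by the bpf_rcu_read_lock() work */
extern void bpf_rcu_read_lock(void) __ksym;
extern void bpf_rcu_read_unlock(void) __ksym;

char _license[] SEC("license") = "GPL";

SEC("fentry.s/inet_listen")	/* .s = sleepable */
int BPF_PROG(sleepable_prog, struct socket *sock)
{
	struct task_struct *task = bpf_get_current_task_btf();
	struct task_struct *real_parent;

	/* bracket the access to the RCU-protected pointer so the
	 * verifier knows it cannot be freed (and reused) under us
	 */
	bpf_rcu_read_lock();
	real_parent = task->real_parent;
	if (real_parent)
		bpf_printk("parent pid %d", real_parent->pid);
	bpf_rcu_read_unlock();
	return 0;
}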


