All of lore.kernel.org
 help / color / mirror / Atom feed
From: Daniel Borkmann <daniel@iogearbox.net>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>, davem@davemloft.net
Cc: andrii@kernel.org, tj@kernel.org, memxor@gmail.com,
	delyank@fb.com, linux-mm@kvack.org, bpf@vger.kernel.org,
	kernel-team@fb.com
Subject: Re: [PATCH v4 bpf-next 01/15] bpf: Introduce any context BPF specific memory allocator.
Date: Mon, 29 Aug 2022 23:59:29 +0200	[thread overview]
Message-ID: <181cb6ae-9d98-8986-4419-5013662b0189@iogearbox.net> (raw)
In-Reply-To: <20220826024430.84565-2-alexei.starovoitov@gmail.com>

On 8/26/22 4:44 AM, Alexei Starovoitov wrote:
> From: Alexei Starovoitov <ast@kernel.org>
> 
> Tracing BPF programs can attach to kprobe and fentry. Hence they
> run in unknown context where calling plain kmalloc() might not be safe.
> 
> Front-end kmalloc() with minimal per-cpu cache of free elements.
> Refill this cache asynchronously from irq_work.
> 
> BPF programs always run with migration disabled.
> It's safe to allocate from cache of the current cpu with irqs disabled.
> Free-ing is always done into bucket of the current cpu as well.
> irq_work trims extra free elements from buckets with kfree
> and refills them with kmalloc, so global kmalloc logic takes care
> of freeing objects allocated by one cpu and freed on another.
> 
> struct bpf_mem_alloc supports two modes:
> - When size != 0 create kmem_cache and bpf_mem_cache for each cpu.
>    This is typical bpf hash map use case when all elements have equal size.
> - When size == 0 allocate 11 bpf_mem_cache-s for each cpu, then rely on
>    kmalloc/kfree. Max allocation size is 4096 in this case.
>    This is bpf_dynptr and bpf_kptr use case.
> 
> bpf_mem_alloc/bpf_mem_free are bpf specific 'wrappers' of kmalloc/kfree.
> bpf_mem_cache_alloc/bpf_mem_cache_free are 'wrappers' of kmem_cache_alloc/kmem_cache_free.
> 
> The allocators are NMI-safe from bpf programs only. They are not NMI-safe in general.
> 
> Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
> ---
>   include/linux/bpf_mem_alloc.h |  26 ++
>   kernel/bpf/Makefile           |   2 +-
>   kernel/bpf/memalloc.c         | 476 ++++++++++++++++++++++++++++++++++
>   3 files changed, 503 insertions(+), 1 deletion(-)
>   create mode 100644 include/linux/bpf_mem_alloc.h
>   create mode 100644 kernel/bpf/memalloc.c
> 
[...]
> +#define NUM_CACHES 11
> +
> +struct bpf_mem_cache {
> +	/* per-cpu list of free objects of size 'unit_size'.
> +	 * All accesses are done with interrupts disabled and 'active' counter
> +	 * protection with __llist_add() and __llist_del_first().
> +	 */
> +	struct llist_head free_llist;
> +	local_t active;
> +
> +	/* Operations on the free_list from unit_alloc/unit_free/bpf_mem_refill
> +	 * are sequenced by per-cpu 'active' counter. But unit_free() cannot
> +	 * fail. When 'active' is busy the unit_free() will add an object to
> +	 * free_llist_extra.
> +	 */
> +	struct llist_head free_llist_extra;
> +
> +	/* kmem_cache != NULL when bpf_mem_alloc was created for specific
> +	 * element size.
> +	 */
> +	struct kmem_cache *kmem_cache;
> +	struct irq_work refill_work;
> +	struct obj_cgroup *objcg;
> +	int unit_size;
> +	/* count of objects in free_llist */
> +	int free_cnt;
> +};
> +
> +struct bpf_mem_caches {
> +	struct bpf_mem_cache cache[NUM_CACHES];
> +};
> +

Could we now also completely get rid of the current map prealloc infra (pcpu_freelist*
I mean), and replace it with above variant altogether? Would be nice to make it work
for this case, too, and then get rid of percpu_freelist.{h,c} .. it's essentially a
superset wrt functionality iiuc?

Thanks,
Daniel

  parent reply	other threads:[~2022-08-29 21:59 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-08-26  2:44 [PATCH v4 bpf-next 00/15] bpf: BPF specific memory allocator Alexei Starovoitov
2022-08-26  2:44 ` [PATCH v4 bpf-next 01/15] bpf: Introduce any context " Alexei Starovoitov
2022-08-29 21:30   ` Daniel Borkmann
2022-08-29 21:45     ` Alexei Starovoitov
2022-08-29 21:59   ` Daniel Borkmann [this message]
2022-08-29 22:04     ` Alexei Starovoitov
2022-08-29 22:39   ` Martin KaFai Lau
2022-08-29 22:42     ` Alexei Starovoitov
2022-08-29 22:59       ` Kumar Kartikeya Dwivedi
2022-08-29 23:13         ` Alexei Starovoitov
2022-08-26  2:44 ` [PATCH v4 bpf-next 02/15] bpf: Convert hash map to bpf_mem_alloc Alexei Starovoitov
2022-08-26  2:44 ` [PATCH v4 bpf-next 03/15] selftests/bpf: Improve test coverage of test_maps Alexei Starovoitov
2022-08-26  2:44 ` [PATCH v4 bpf-next 04/15] samples/bpf: Reduce syscall overhead in map_perf_test Alexei Starovoitov
2022-08-26  2:44 ` [PATCH v4 bpf-next 05/15] bpf: Relax the requirement to use preallocated hash maps in tracing progs Alexei Starovoitov
2022-08-26  2:44 ` [PATCH v4 bpf-next 06/15] bpf: Optimize element count in non-preallocated hash map Alexei Starovoitov
2022-08-29 21:47   ` Daniel Borkmann
2022-08-29 21:57     ` Alexei Starovoitov
2022-08-26  2:44 ` [PATCH v4 bpf-next 07/15] bpf: Optimize call_rcu " Alexei Starovoitov
2022-08-26  2:44 ` [PATCH v4 bpf-next 08/15] bpf: Adjust low/high watermarks in bpf_mem_cache Alexei Starovoitov
2022-08-26  2:44 ` [PATCH v4 bpf-next 09/15] bpf: Batch call_rcu callbacks instead of SLAB_TYPESAFE_BY_RCU Alexei Starovoitov
2022-08-26  2:44 ` [PATCH v4 bpf-next 10/15] bpf: Add percpu allocation support to bpf_mem_alloc Alexei Starovoitov
2022-08-26  2:44 ` [PATCH v4 bpf-next 11/15] bpf: Convert percpu hash map to per-cpu bpf_mem_alloc Alexei Starovoitov
2022-08-26  2:44 ` [PATCH v4 bpf-next 12/15] bpf: Remove tracing program restriction on map types Alexei Starovoitov
2022-08-26  2:44 ` [PATCH v4 bpf-next 13/15] bpf: Prepare bpf_mem_alloc to be used by sleepable bpf programs Alexei Starovoitov
2022-08-26  2:44 ` [PATCH v4 bpf-next 14/15] bpf: Remove prealloc-only restriction for " Alexei Starovoitov
2022-08-26  2:44 ` [PATCH v4 bpf-next 15/15] bpf: Introduce sysctl kernel.bpf_force_dyn_alloc Alexei Starovoitov
2022-08-29 22:02   ` Daniel Borkmann
2022-08-29 22:27     ` Alexei Starovoitov
2022-08-27 16:57 ` [PATCH v4 bpf-next 00/15] bpf: BPF specific memory allocator Andrii Nakryiko
2022-08-27 22:53   ` Kumar Kartikeya Dwivedi
2022-08-29 15:47     ` Alexei Starovoitov
2022-09-09 20:10       ` Andrii Nakryiko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=181cb6ae-9d98-8986-4419-5013662b0189@iogearbox.net \
    --to=daniel@iogearbox.net \
    --cc=alexei.starovoitov@gmail.com \
    --cc=andrii@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=davem@davemloft.net \
    --cc=delyank@fb.com \
    --cc=kernel-team@fb.com \
    --cc=linux-mm@kvack.org \
    --cc=memxor@gmail.com \
    --cc=tj@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.