All of lore.kernel.org
 help / color / mirror / Atom feed
From: Daniel Borkmann <daniel@iogearbox.net>
To: Roman Gushchin <guro@fb.com>, bpf@vger.kernel.org
Cc: ast@kernel.org, netdev@vger.kernel.org, andrii@kernel.org,
	akpm@linux-foundation.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, kernel-team@fb.com
Subject: Re: [PATCH bpf-next v6 06/34] bpf: prepare for memcg-based memory accounting for bpf maps
Date: Wed, 18 Nov 2020 01:06:17 +0100	[thread overview]
Message-ID: <41eb5e5b-e651-4cb3-a6ea-9ff6b8aa41fb@iogearbox.net> (raw)
In-Reply-To: <20201117034108.1186569-7-guro@fb.com>

On 11/17/20 4:40 AM, Roman Gushchin wrote:
> In the absolute majority of cases if a process is making a kernel
> allocation, it's memory cgroup is getting charged.
> 
> Bpf maps can be updated from an interrupt context and in such
> case there is no process which can be charged. It makes the memory
> accounting of bpf maps non-trivial.
> 
> Fortunately, after commit 4127c6504f25 ("mm: kmem: enable kernel
> memcg accounting from interrupt contexts") and b87d8cefe43c
> ("mm, memcg: rework remote charging API to support nesting")
> it's finally possible.
> 
> To do it, a pointer to the memory cgroup of the process which created
> the map is saved, and this cgroup is getting charged for all
> allocations made from an interrupt context.
> 
> Allocations made from a process context will be accounted in a usual way.
> 
> Signed-off-by: Roman Gushchin <guro@fb.com>
> Acked-by: Song Liu <songliubraving@fb.com>
[...]
>   
> +#ifdef CONFIG_MEMCG_KMEM
> +static __always_inline int __bpf_map_update_elem(struct bpf_map *map, void *key,
> +						 void *value, u64 flags)
> +{
> +	struct mem_cgroup *old_memcg;
> +	bool in_interrupt;
> +	int ret;
> +
> +	/*
> +	 * If update from an interrupt context results in a memory allocation,
> +	 * the memory cgroup to charge can't be determined from the context
> +	 * of the current task. Instead, we charge the memory cgroup, which
> +	 * contained a process created the map.
> +	 */
> +	in_interrupt = in_interrupt();
> +	if (in_interrupt)
> +		old_memcg = set_active_memcg(map->memcg);
> +
> +	ret = map->ops->map_update_elem(map, key, value, flags);
> +
> +	if (in_interrupt)
> +		set_active_memcg(old_memcg);
> +
> +	return ret;

Hmm, this approach here won't work, see also commit 09772d92cd5a ("bpf: avoid
retpoline for lookup/update/delete calls on maps") which removes the indirect
call, so the __bpf_map_update_elem() and therefore the set_active_memcg() is
not invoked for the vast majority of cases.

> +}
> +#else
> +static __always_inline int __bpf_map_update_elem(struct bpf_map *map, void *key,
> +						 void *value, u64 flags)
> +{
> +	return map->ops->map_update_elem(map, key, value, flags);
> +}
> +#endif
> +
>   BPF_CALL_4(bpf_map_update_elem, struct bpf_map *, map, void *, key,
>   	   void *, value, u64, flags)
>   {
>   	WARN_ON_ONCE(!rcu_read_lock_held());
> -	return map->ops->map_update_elem(map, key, value, flags);
> +
> +	return __bpf_map_update_elem(map, key, value, flags);
>   }
>   
>   const struct bpf_func_proto bpf_map_update_elem_proto = {
> diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
> index f3fe9f53f93c..2d77fc2496da 100644
> --- a/kernel/bpf/syscall.c
> +++ b/kernel/bpf/syscall.c
> @@ -31,6 +31,7 @@
>   #include <linux/poll.h>
>   #include <linux/bpf-netns.h>
>   #include <linux/rcupdate_trace.h>
> +#include <linux/memcontrol.h>
>   
>   #define IS_FD_ARRAY(map) ((map)->map_type == BPF_MAP_TYPE_PERF_EVENT_ARRAY || \
>   			  (map)->map_type == BPF_MAP_TYPE_CGROUP_ARRAY || \
> @@ -456,6 +457,27 @@ void bpf_map_free_id(struct bpf_map *map, bool do_idr_lock)
>   		__release(&map_idr_lock);
>   }
>   
> +#ifdef CONFIG_MEMCG_KMEM
> +static void bpf_map_save_memcg(struct bpf_map *map)
> +{
> +	map->memcg = get_mem_cgroup_from_mm(current->mm);
> +}
> +
> +static void bpf_map_release_memcg(struct bpf_map *map)
> +{
> +	mem_cgroup_put(map->memcg);
> +}
> +
> +#else
> +static void bpf_map_save_memcg(struct bpf_map *map)
> +{
> +}
> +
> +static void bpf_map_release_memcg(struct bpf_map *map)
> +{
> +}
> +#endif
> +
>   /* called from workqueue */
>   static void bpf_map_free_deferred(struct work_struct *work)
>   {
> @@ -464,6 +486,7 @@ static void bpf_map_free_deferred(struct work_struct *work)
>   
>   	bpf_map_charge_move(&mem, &map->memory);
>   	security_bpf_map_free(map);
> +	bpf_map_release_memcg(map);
>   	/* implementation dependent freeing */
>   	map->ops->map_free(map);
>   	bpf_map_charge_finish(&mem);
> @@ -875,6 +898,8 @@ static int map_create(union bpf_attr *attr)
>   	if (err)
>   		goto free_map_sec;
>   
> +	bpf_map_save_memcg(map);
> +
>   	err = bpf_map_new_fd(map, f_flags);
>   	if (err < 0) {
>   		/* failed to allocate fd.
> 


  reply	other threads:[~2020-11-18  0:06 UTC|newest]

Thread overview: 43+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-11-17  3:40 [PATCH bpf-next v6 00/34] bpf: switch to memcg-based memory accounting Roman Gushchin
2020-11-17  3:40 ` [PATCH bpf-next v6 01/34] mm: memcontrol: use helpers to read page's memcg data Roman Gushchin
2020-11-17  3:40 ` [PATCH bpf-next v6 02/34] mm: memcontrol/slab: use helpers to access slab page's memcg_data Roman Gushchin
2020-11-17  3:40 ` [PATCH bpf-next v6 03/34] mm: introduce page memcg flags Roman Gushchin
2020-11-17  3:40 ` [PATCH bpf-next v6 04/34] mm: convert page kmemcg type to a page memcg flag Roman Gushchin
2020-11-17  3:40 ` [PATCH bpf-next v6 05/34] bpf: memcg-based memory accounting for bpf progs Roman Gushchin
2020-11-17  3:40 ` [PATCH bpf-next v6 06/34] bpf: prepare for memcg-based memory accounting for bpf maps Roman Gushchin
2020-11-18  0:06   ` Daniel Borkmann [this message]
2020-11-18  0:46     ` Roman Gushchin
2020-11-18  1:07       ` Roman Gushchin
2020-11-18  1:11         ` Alexei Starovoitov
2020-11-18  1:11           ` Alexei Starovoitov
2020-11-18  1:28           ` Roman Gushchin
2020-11-18 10:22             ` Daniel Borkmann
2020-11-18 17:15               ` Roman Gushchin
2020-11-17  3:40 ` [PATCH bpf-next v6 07/34] bpf: " Roman Gushchin
2020-11-17  3:40 ` [PATCH bpf-next v6 08/34] bpf: refine memcg-based memory accounting for arraymap maps Roman Gushchin
2020-11-17  3:40 ` [PATCH bpf-next v6 09/34] bpf: refine memcg-based memory accounting for cpumap maps Roman Gushchin
2020-11-17  3:40 ` [PATCH bpf-next v6 10/34] bpf: memcg-based memory accounting for cgroup storage maps Roman Gushchin
2020-11-17  3:40 ` [PATCH bpf-next v6 11/34] bpf: refine memcg-based memory accounting for devmap maps Roman Gushchin
2020-11-17  3:40 ` [PATCH bpf-next v6 12/34] bpf: refine memcg-based memory accounting for hashtab maps Roman Gushchin
2020-11-17  3:40 ` [PATCH bpf-next v6 13/34] bpf: memcg-based memory accounting for lpm_trie maps Roman Gushchin
2020-11-17  3:40 ` [PATCH bpf-next v6 14/34] bpf: memcg-based memory accounting for bpf ringbuffer Roman Gushchin
2020-11-17  3:40 ` [PATCH bpf-next v6 15/34] bpf: memcg-based memory accounting for bpf local storage maps Roman Gushchin
2020-11-17  3:40 ` [PATCH bpf-next v6 16/34] bpf: refine memcg-based memory accounting for sockmap and sockhash maps Roman Gushchin
2020-11-17  3:40 ` [PATCH bpf-next v6 17/34] bpf: refine memcg-based memory accounting for xskmap maps Roman Gushchin
2020-11-17  3:40 ` [PATCH bpf-next v6 18/34] bpf: eliminate rlimit-based memory accounting for arraymap maps Roman Gushchin
2020-11-17  3:40 ` [PATCH bpf-next v6 19/34] bpf: eliminate rlimit-based memory accounting for bpf_struct_ops maps Roman Gushchin
2020-11-17  3:40 ` [PATCH bpf-next v6 20/34] bpf: eliminate rlimit-based memory accounting for cpumap maps Roman Gushchin
2020-11-17  3:40 ` [PATCH bpf-next v6 21/34] bpf: eliminate rlimit-based memory accounting for cgroup storage maps Roman Gushchin
2020-11-17  3:40 ` [PATCH bpf-next v6 22/34] bpf: eliminate rlimit-based memory accounting for devmap maps Roman Gushchin
2020-11-17  3:40 ` [PATCH bpf-next v6 23/34] bpf: eliminate rlimit-based memory accounting for hashtab maps Roman Gushchin
2020-11-17  3:40 ` [PATCH bpf-next v6 24/34] bpf: eliminate rlimit-based memory accounting for lpm_trie maps Roman Gushchin
2020-11-17  3:40 ` [PATCH bpf-next v6 25/34] bpf: eliminate rlimit-based memory accounting for queue_stack_maps maps Roman Gushchin
2020-11-17  3:41 ` [PATCH bpf-next v6 26/34] bpf: eliminate rlimit-based memory accounting for reuseport_array maps Roman Gushchin
2020-11-17  3:41 ` [PATCH bpf-next v6 27/34] bpf: eliminate rlimit-based memory accounting for bpf ringbuffer Roman Gushchin
2020-11-17  3:41 ` [PATCH bpf-next v6 28/34] bpf: eliminate rlimit-based memory accounting for sockmap and sockhash maps Roman Gushchin
2020-11-17  3:41 ` [PATCH bpf-next v6 29/34] bpf: eliminate rlimit-based memory accounting for stackmap maps Roman Gushchin
2020-11-17  3:41 ` [PATCH bpf-next v6 30/34] bpf: eliminate rlimit-based memory accounting for xskmap maps Roman Gushchin
2020-11-17  3:41 ` [PATCH bpf-next v6 31/34] bpf: eliminate rlimit-based memory accounting for bpf local storage maps Roman Gushchin
2020-11-17  3:41 ` [PATCH bpf-next v6 32/34] bpf: eliminate rlimit-based memory accounting infra for bpf maps Roman Gushchin
2020-11-17  3:41 ` [PATCH bpf-next v6 33/34] bpf: eliminate rlimit-based memory accounting for bpf progs Roman Gushchin
2020-11-17  3:41 ` [PATCH bpf-next v6 34/34] bpf: samples: do not touch RLIMIT_MEMLOCK Roman Gushchin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=41eb5e5b-e651-4cb3-a6ea-9ff6b8aa41fb@iogearbox.net \
    --to=daniel@iogearbox.net \
    --cc=akpm@linux-foundation.org \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=guro@fb.com \
    --cc=kernel-team@fb.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=netdev@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.