From: Yafang Shao <laoar.shao@gmail.com>
To: ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org,
	kafai@fb.com, songliubraving@fb.com, yhs@fb.com,
	john.fastabend@gmail.com, kpsingh@kernel.org,
	akpm@linux-foundation.org, cl@linux.com, penberg@kernel.org,
	rientjes@google.com, iamjoonsoo.kim@lge.com, vbabka@suse.cz,
	hannes@cmpxchg.org, mhocko@kernel.org, vdavydov.dev@gmail.com,
	guro@fb.com
Cc: linux-mm@kvack.org, netdev@vger.kernel.org, bpf@vger.kernel.org,
	Yafang Shao <laoar.shao@gmail.com>
Subject: [PATCH RFC 0/9] bpf, mm: recharge bpf memory from offline memcg
Date: Tue,  8 Mar 2022 13:10:47 +0000
Message-ID: <20220308131056.6732-1-laoar.shao@gmail.com>

When we use memcg to limit containers that load bpf progs and maps, we
find that the lifecycles of the container and of the bpf objects are not
always the same, because we may pin the maps and progs and then update
only the container. Once a container that has already pinned progs and
maps is restarted, the pinned progs and maps are no longer charged to
it. In other words, such a container can steal memory from the host,
which is not what we expect. This patchset is meant to resolve this
issue.

After the container is restarted, the old memcg, which the pinned progs
and maps are charged to, goes offline but is not freed until all of the
related maps and progs are freed. To charge this bpf memory to the newly
started memcg, we must first uncharge it from the offline memcg and then
charge it to the new one. Since we already know how the bpf memory is
allocated and freed, we also know how to charge and uncharge it. This
patchset implements charge and uncharge methods for these kinds of
memory.
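
As a rough illustration of the uncharge-then-charge sequence described
above, here is a toy userspace model in C. It is not kernel code: struct
toy_memcg, struct toy_map and the toy_* functions are hypothetical names
invented for this sketch. The rollback on failure and the refusal to move
a map whose memcg is still online are assumptions of the sketch, not
behaviour taken from the patches.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a memory cgroup (hypothetical, not the kernel struct). */
struct toy_memcg {
	size_t usage;	/* bytes currently charged */
	size_t limit;	/* hard limit in bytes */
	bool online;	/* false once the owning container is gone */
};

/* Toy stand-in for a pinned bpf map and the memcg it is charged to. */
struct toy_map {
	size_t bytes;
	struct toy_memcg *memcg;
};

/* Charge bytes to cg; fail if that would exceed the limit. */
static bool toy_charge(struct toy_memcg *cg, size_t bytes)
{
	if (bytes > cg->limit - cg->usage)
		return false;
	cg->usage += bytes;
	return true;
}

static void toy_uncharge(struct toy_memcg *cg, size_t bytes)
{
	assert(cg->usage >= bytes);
	cg->usage -= bytes;
}

/*
 * Move the map's charge from its offline memcg to new_cg: uncharge the
 * old memcg first, then charge the new one. Returns 0 on success and -1
 * if there is nothing to move or new_cg has no room. On failure the
 * charge is put back on the old memcg (an assumption of this sketch).
 */
static int toy_recharge(struct toy_map *map, struct toy_memcg *new_cg)
{
	struct toy_memcg *old = map->memcg;

	if (old == new_cg || old->online)
		return -1;

	toy_uncharge(old, map->bytes);
	if (!toy_charge(new_cg, map->bytes)) {
		old->usage += map->bytes;	/* roll back */
		return -1;
	}
	map->memcg = new_cg;
	return 0;
}
```

In this model, a 4096-byte map charged to an offline memcg ends up fully
accounted to the new memcg after toy_recharge(), and the old memcg's
usage drops to zero, which is the state in which the real offline memcg
could finally be freed.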

As for how to trigger the recharge, we decided to implement a new bpf
syscall command. An agent running in the container can use the new
command to do the recharge. For now it is implemented only for bpf hash
maps. Below is a simple example of how to do the recharge:

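====
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/bpf.h>	/* BPF_MAP_RECHARGE is added by this patchset */

int main(int argc, char *argv[])
{
	union bpf_attr attr;
	int ret;

	if (argc < 2) {
		fprintf(stderr, "Please give a map id\n");
		exit(EXIT_FAILURE);
	}

	/* Zero the whole union so unused fields and padding are clean */
	memset(&attr, 0, sizeof(attr));
	attr.map_id = atoi(argv[1]);
	ret = syscall(SYS_bpf, BPF_MAP_RECHARGE, &attr, sizeof(attr));
	if (ret < 0) {
		perror("BPF_MAP_RECHARGE");
		return EXIT_FAILURE;
	}

	return 0;
}

====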
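
The example above takes a map id on the command line. An agent that only
knows where a map is pinned could look the id up first. The following is
a minimal sketch of one way to do that with the existing BPF_OBJ_GET and
BPF_OBJ_GET_INFO_BY_FD commands; pinned_map_id() is a hypothetical helper
name and is not part of this patchset. It assumes the path refers to a
pinned map rather than a prog or link.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/bpf.h>

/*
 * Return the id of the bpf map pinned at path, or -1 on failure.
 * The caller must make sure the path refers to a map.
 */
static long pinned_map_id(const char *path)
{
	union bpf_attr attr;
	struct bpf_map_info info;
	long fd, ret;

	assert(path != NULL);

	/* Open the pinned object to get a file descriptor for it. */
	memset(&attr, 0, sizeof(attr));
	attr.pathname = (uint64_t)(unsigned long)path;
	fd = syscall(SYS_bpf, BPF_OBJ_GET, &attr, sizeof(attr));
	if (fd < 0)
		return -1;

	/* Ask the kernel to fill in struct bpf_map_info for that fd. */
	memset(&info, 0, sizeof(info));
	memset(&attr, 0, sizeof(attr));
	attr.info.bpf_fd = (uint32_t)fd;
	attr.info.info_len = sizeof(info);
	attr.info.info = (uint64_t)(unsigned long)&info;
	ret = syscall(SYS_bpf, BPF_OBJ_GET_INFO_BY_FD, &attr, sizeof(attr));
	close((int)fd);
	if (ret < 0)
		return -1;

	return info.id;
}
```

The returned id could then be stored in attr.map_id before issuing
BPF_MAP_RECHARGE, in place of the atoi(argv[1]) value used above.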

Patches #1 and #2 are for observability: with them we can easily check
whether a bpf map is charged to a memcg and whether that memcg is offline.
Patches #3, #4 and #5 add the charge and uncharge methods for kmalloc-ed,
vmalloc-ed and percpu memory.
Patches #6~#9 implement the recharge of bpf hash maps, which are the maps
most used by our bpf services. The other map types have not been
implemented yet, and neither have bpf progs.

This patchset is still a POC, with limited testing. Any feedback is
welcome.

Yafang Shao (9):
  bpftool: fix print error when show bpf man
  bpftool: show memcg info of bpf map
  mm: add methord to charge kmalloc-ed address
  mm: add methord to charge vmalloc-ed address
  mm: add methord to charge percpu address
  bpf: add a helper to find map by id
  bpf: add BPF_MAP_RECHARGE syscall
  bpf: make bpf_map_{save, release}_memcg public
  bpf: support recharge for hash map

 include/linux/bpf.h            | 23 +++++++++++++
 include/linux/percpu.h         |  1 +
 include/linux/slab.h           |  2 ++
 include/linux/vmalloc.h        |  1 +
 include/uapi/linux/bpf.h       | 10 ++++++
 kernel/bpf/hashtab.c           | 35 ++++++++++++++++++++
 kernel/bpf/syscall.c           | 73 ++++++++++++++++++++++++++----------------
 mm/percpu.c                    | 50 +++++++++++++++++++++++++++++
 mm/slab.c                      |  6 ++++
 mm/slob.c                      |  6 ++++
 mm/slub.c                      | 32 ++++++++++++++++++
 mm/util.c                      |  9 ++++++
 mm/vmalloc.c                   | 29 +++++++++++++++++
 tools/bpf/bpftool/map.c        |  9 +++---
 tools/include/uapi/linux/bpf.h |  1 +
 15 files changed, 254 insertions(+), 33 deletions(-)

-- 
1.8.3.1


