From: Martin KaFai Lau <kafai@fb.com>
To: YiFei Zhu <zhuyifei1999@gmail.com>
Cc: <bpf@vger.kernel.org>, Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Stanislav Fomichev <sdf@google.com>,
	Mahesh Bandewar <maheshb@google.com>,
	Roman Gushchin <guro@fb.com>,
	Andrii Nakryiko <andrii.nakryiko@gmail.com>,
	YiFei Zhu <zhuyifei@google.com>
Subject: Re: [PATCH v5 bpf-next 0/5] Make BPF CGROUP_STORAGE map usable by different programs at once
Date: Thu, 23 Jul 2020 17:05:51 -0700	[thread overview]
Message-ID: <20200724000551.qi4y2isxtgdtcgzc@kafai-mbp> (raw)
In-Reply-To: <cover.1595489786.git.zhuyifei@google.com>

On Thu, Jul 23, 2020 at 02:40:53AM -0500, YiFei Zhu wrote:
> From: YiFei Zhu <zhuyifei@google.com>
> 
> To access the storage in a CGROUP_STORAGE map, one uses the
> bpf_get_local_storage helper, which is extremely fast due to its
> use of per-CPU variables. However, its implementation is built on
> the assumption that a map can only be used by one program at any
> time, which prohibits any sharing of data between multiple
> programs using these maps. This eliminates a lot of use cases,
> such as per-cgroup configuration storage, written to by a
> setsockopt program and read by a cg_sock_addr program.
> 
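[For context, a minimal kernel-side sketch of how a program reaches this
storage; the map name, value type, and section are illustrative, not
taken from the patch set:]

```c
/* Minimal BPF-side sketch (illustrative names), built against libbpf's
 * bpf_helpers.h.  Each attachment of this program gets its own storage
 * transparently via bpf_get_local_storage(). */
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

struct {
	__uint(type, BPF_MAP_TYPE_CGROUP_STORAGE);
	__type(key, struct bpf_cgroup_storage_key);
	__type(value, __u64);
} cg_storage SEC(".maps");

SEC("cgroup_skb/egress")
int count_egress(struct __sk_buff *skb)
{
	/* Returns a pointer to this attachment's storage; never NULL
	 * for CGROUP_STORAGE maps. */
	__u64 *cnt = bpf_get_local_storage(&cg_storage, 0);

	__sync_fetch_and_add(cnt, 1);
	return 1; /* allow the packet */
}

char _license[] SEC("license") = "GPL";
```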
> Why not use other map types? The great part of the CGROUP_STORAGE
> map is that it is isolated per cgroup it is attached to. When a
> program uses bpf_get_local_storage, even on the same map, it gets
> a different storage if it runs as a result of an attachment to a
> different cgroup. The kernel manages the storages, simplifying
> both the BPF program and userspace. In theory, one could use other
> map types like array or hash to do the same thing, but at a major
> cost in overhead and complexity: userspace would need to know when
> a cgroup is freed in order to free the corresponding entry in the
> replacement map.
> 
> This patch set introduces a significant change to the semantics of
> the CGROUP_STORAGE map type. Instead of each storage being tied to
> a single attachment, it is shared across different attachments to
> the same cgroup, and persists until either the map or the cgroup
> it is attached to is freed.
> 
> Users may use u64 as the key to the map, in which case the attach
> type is ignored during key comparison, and programs of different
> attach types will share the same storage if they are attached to
> the same cgroup.
> 
> How could this break existing users?
> * Users that use detach & reattach / program replacement as a
>   shortcut to zeroing the storage. Since we need sharing between
>   programs, we cannot zero the storage. Users that expect this
>   behavior should either attach a program with a new map, or
>   explicitly zero the map with a syscall.
> This case depends on undocumented implementation details, so the
> impact should be very minimal.
> 
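[A userspace sketch of the "explicitly zero the map with a syscall"
option above, using libbpf's bpf_map_update_elem. The function name,
the fd/cgroup-id plumbing, and the single-__u64 value size are all
illustrative assumptions, not part of the patch set:]

```c
#include <linux/bpf.h>
#include <bpf/bpf.h>

/* Zero one cgroup's storage in a u64-keyed CGROUP_STORAGE map.
 * map_fd and cgroup_id must come from the surrounding application. */
static int zero_cgroup_storage(int map_fd, __u64 cgroup_id)
{
	__u64 zero_value = 0;

	/* BPF_EXIST: the storage already exists for an attached cgroup;
	 * we only overwrite its contents. */
	return bpf_map_update_elem(map_fd, &cgroup_id, &zero_value, BPF_EXIST);
}
```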
> Patch 1 introduces a test on the old expected behavior of the map
> type.
> 
> Patch 2 introduces a test showing how two programs cannot share
> one such map.
> 
> Patch 3 implements the change of semantics to the map.
> 
> Patch 4 amends the new test such that it yields the behavior we
> expect from the change.
> 
> Patch 5 documents the map type.
> 
> Changes since RFC:
> * Clarify commit message in patch 3 such that it says the lifetime
>   of the storage is ended at the freeing of the cgroup_bpf, rather
>   than the cgroup itself.
> * Restored an -ENOMEM check in __cgroup_bpf_attach.
> * Update selftests for recent change in network_helpers API.
> 
> Changes since v1:
> * s/CHECK_FAIL/CHECK/
> * s/bpf_prog_attach/bpf_program__attach_cgroup/
> * Moved test__start_subtest to test_cg_storage_multi.
> * Removed some redundant CHECK_FAIL where they are already CHECK-ed.
> 
> Changes since v2:
> * Lock cgroup_mutex during map_free.
> * Publish new storages only if attach is successful, by tracking
>   exactly which storages are reused in an array of bools.
> * Mention bpftool map dump showing a value of zero for attach_type
>   in patch 3 commit message.
> 
> Changes since v3:
> * Use a much simpler lookup and allocate-if-not-exist from the fact
>   that cgroup_mutex is locked during attach.
> * Removed an unnecessary spinlock hold.
> 
> Changes since v4:
> * Changed semantics so that if the key type is struct
>   bpf_cgroup_storage_key, the map retains isolation between
>   different attach types. Sharing between different attach types
>   only occurs when the key type is u64.
> * Adapted tests and docs for the above change.
Thanks for the v5.  Overall this looks good.  I left some nit
comments on the code changes.  The doc needs some more details.
With that,

Acked-by: Martin KaFai Lau <kafai@fb.com>


Thread overview: 11+ messages
2020-07-23  7:40 [PATCH v5 bpf-next 0/5] Make BPF CGROUP_STORAGE map usable by different programs at once YiFei Zhu
2020-07-23  7:40 ` [PATCH v5 bpf-next 1/5] selftests/bpf: Add test for CGROUP_STORAGE map on multiple attaches YiFei Zhu
2020-07-24  0:03   ` Martin KaFai Lau
2020-07-23  7:40 ` [PATCH v5 bpf-next 2/5] selftests/bpf: Test CGROUP_STORAGE map can't be used by multiple progs YiFei Zhu
2020-07-23 23:59   ` Martin KaFai Lau
2020-07-23  7:40 ` [PATCH v5 bpf-next 3/5] bpf: Make cgroup storages shared between programs on the same cgroup YiFei Zhu
2020-07-24  0:00   ` Martin KaFai Lau
2020-07-23  7:40 ` [PATCH v5 bpf-next 4/5] selftests/bpf: Test CGROUP_STORAGE behavior on shared egress + ingress YiFei Zhu
2020-07-23  7:40 ` [PATCH v5 bpf-next 5/5] Documentation/bpf: Document CGROUP_STORAGE map type YiFei Zhu
2020-07-24  0:02   ` Martin KaFai Lau
2020-07-24  0:05 ` Martin KaFai Lau [this message]
