bpf.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Yafang Shao <laoar.shao@gmail.com>
To: "Michal Koutný" <mkoutny@suse.com>
Cc: ast@kernel.org, daniel@iogearbox.net, john.fastabend@gmail.com,
	 andrii@kernel.org, martin.lau@linux.dev, song@kernel.org,
	 yonghong.song@linux.dev, kpsingh@kernel.org, sdf@google.com,
	 haoluo@google.com, jolsa@kernel.org, tj@kernel.org,
	lizefan.x@bytedance.com,  hannes@cmpxchg.org,
	yosryahmed@google.com, sinquersw@gmail.com,
	 cgroups@vger.kernel.org, bpf@vger.kernel.org
Subject: Re: [RFC PATCH bpf-next v2 1/9] cgroup: Make operations on the cgroup root_list RCU safe
Date: Wed, 18 Oct 2023 10:51:47 +0800	[thread overview]
Message-ID: <CALOAHbCP8HVs2UjuegLHvZSRHbfJ2ONMNC58AQxwVaNGSDYzOg@mail.gmail.com> (raw)
In-Reply-To: <q7yaokzrcg5effyr2j7n6f6ljlez755uunlzfzpjgktfmrvhnb@t44uxkbj7k5k>

On Tue, Oct 17, 2023 at 9:20 PM Michal Koutný <mkoutny@suse.com> wrote:
>
> On Tue, Oct 17, 2023 at 12:45:38PM +0000, Yafang Shao <laoar.shao@gmail.com> wrote:
> > Therefore, making it RCU-safe would be beneficial.
>
> Notice that whole cgroup_destroy_root() is actually an RCU callback (via
> css_free_rwork_fn()). So the list traversal under RCU should alreay be
> OK wrt its stability. Do you see a loophole in this argument?

css_free_rwork_fn() is designed for handling the cgroup's CSS. When
the RCU callback is executed, this specific CSS is not accessible to
other tasks. However, the cgroup root remains accessible to other
tasks. For instance, other tasks can still traverse the cgroup
root_list and access the cgroup root that is currently being
destroyed. Hence, we must take explicit measures to make access to the
cgroup root RCU-safe.

>
>
> >  /* iterate across the hierarchies */
> >  #define for_each_root(root)                                          \
> > -     list_for_each_entry((root), &cgroup_roots, root_list)
> > +     list_for_each_entry_rcu((root), &cgroup_roots, root_list,       \
> > +                             !lockdep_is_held(&cgroup_mutex))
>
> The extra condition should be constant false (regardless of
> cgroup_mutex). IOW, RCU read lock is always required.

IIUC, the RCU read lock is not required, while the cgroup_mutex is
required when we want to traverse the cgroup root_list, such as in the
case of cgroup_attach_task_all.

>
> > @@ -1386,13 +1386,15 @@ static inline struct cgroup *__cset_cgroup_from_root(struct css_set *cset,
> >               }
> >       }
> >
> > -     BUG_ON(!res_cgroup);
> > +     WARN_ON_ONCE(!res_cgroup && lockdep_is_held(&cgroup_mutex));
> >       return res_cgroup;
>
> Hm, this would benefit from a comment why !res_cgroup is conditionally
> acceptable.

The cgrp_cset_link is freed before we remove the cgroup root from the
root_list. Consequently, when accessing a cgroup root, the cset_link
may have already been freed, resulting in a NULL res_cgroup. However,
by holding the cgroup_mutex, we ensure that res_cgroup cannot be NULL.

Will add the comments in the next version.

>
> >  }
> >
> >  /*
> >   * look up cgroup associated with current task's cgroup namespace on the
> > - * specified hierarchy
> > + * specified hierarchy. Umount synchronization is ensured via VFS layer,
> > + * so we don't have to hold cgroup_mutex to prevent the root from being
> > + * destroyed.
>
> I tried the similar via explicit lockdep invocation (in a thread I
> linked to you previously) and VFS folks weren't ethusiastic about it.

Thanks for your information. will take a look at it.

>
> You cannot hide this synchronization assumption in a mere comment.

will think about how to address it.

--
Regards
Yafang

  reply	other threads:[~2023-10-18  2:52 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-10-17 12:45 [RFC PATCH bpf-next v2 0/9] bpf, cgroup: Add BPF support for cgroup1 hierarchy Yafang Shao
2023-10-17 12:45 ` [RFC PATCH bpf-next v2 1/9] cgroup: Make operations on the cgroup root_list RCU safe Yafang Shao
2023-10-17 13:20   ` Michal Koutný
2023-10-18  2:51     ` Yafang Shao [this message]
2023-10-18  9:35   ` Tejun Heo
2023-10-19  6:38     ` Yafang Shao
2023-10-19 19:08       ` Tejun Heo
2023-10-19 19:43         ` Waiman Long
2023-10-20  9:36           ` Yafang Shao
2023-10-20 17:51             ` Tejun Heo
2023-10-22  9:32               ` Yafang Shao
2023-10-25  8:06   ` kernel test robot
2023-10-17 12:45 ` [RFC PATCH bpf-next v2 2/9] cgroup: Eliminate the need for cgroup_mutex in proc_cgroup_show() Yafang Shao
2023-10-17 14:04   ` Michal Koutný
2023-10-18  3:12     ` Yafang Shao
2023-10-18  9:38   ` Tejun Heo
2023-10-19  6:39     ` Yafang Shao
2023-10-17 12:45 ` [RFC PATCH bpf-next v2 3/9] cgroup: Add a new helper for cgroup1 hierarchy Yafang Shao
2023-10-18  9:44   ` Tejun Heo
2023-10-19  6:41     ` Yafang Shao
2023-10-17 12:45 ` [RFC PATCH bpf-next v2 4/9] bpf: Add a new kfunc " Yafang Shao
2023-10-18  9:40   ` Tejun Heo
2023-10-19  6:40     ` Yafang Shao
2023-10-17 12:45 ` [RFC PATCH bpf-next v2 5/9] selftests/bpf: Fix issues in setup_classid_environment() Yafang Shao
2023-10-17 12:45 ` [RFC PATCH bpf-next v2 6/9] selftests/bpf: Add parallel support for classid Yafang Shao
2023-10-17 12:45 ` [RFC PATCH bpf-next v2 7/9] selftests/bpf: Add a new cgroup helper get_classid_cgroup_id() Yafang Shao
2023-10-17 12:45 ` [RFC PATCH bpf-next v2 8/9] selftests/bpf: Add a new cgroup helper get_cgroup_hierarchy_id() Yafang Shao
2023-10-17 12:45 ` [RFC PATCH bpf-next v2 9/9] selftests/bpf: Add selftests for cgroup1 hierarchy Yafang Shao

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CALOAHbCP8HVs2UjuegLHvZSRHbfJ2ONMNC58AQxwVaNGSDYzOg@mail.gmail.com \
    --to=laoar.shao@gmail.com \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=cgroups@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=hannes@cmpxchg.org \
    --cc=haoluo@google.com \
    --cc=john.fastabend@gmail.com \
    --cc=jolsa@kernel.org \
    --cc=kpsingh@kernel.org \
    --cc=lizefan.x@bytedance.com \
    --cc=martin.lau@linux.dev \
    --cc=mkoutny@suse.com \
    --cc=sdf@google.com \
    --cc=sinquersw@gmail.com \
    --cc=song@kernel.org \
    --cc=tj@kernel.org \
    --cc=yonghong.song@linux.dev \
    --cc=yosryahmed@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).