All of lore.kernel.org
 help / color / mirror / Atom feed
From: Tadeusz Struk <tadeusz.struk@linaro.org>
To: Tejun Heo <tj@kernel.org>
Cc: "Michal Koutný" <mkoutny@suse.com>,
	cgroups@vger.kernel.org, linux-kernel@vger.kernel.org,
	"Zefan Li" <lizefan.x@bytedance.com>,
	"Johannes Weiner" <hannes@cmpxchg.org>,
	"Bui Quang Minh" <minhquangbui99@gmail.com>
Subject: Re: [PATCH 2/2] cgroup: Use separate work structs on css release path
Date: Wed, 1 Jun 2022 17:26:34 -0700	[thread overview]
Message-ID: <416dc60a-f0e5-7d05-1613-3cd0ca415768@linaro.org> (raw)
In-Reply-To: <Ypf/MpwzByOrSp6A@slm.duckdns.org>

On 6/1/22 17:07, Tejun Heo wrote:
> Hello,
> 
> On Wed, Jun 01, 2022 at 05:00:44PM -0700, Tadeusz Struk wrote:
>> What I'm trying to say is that it's not really a ref imbalance problem.
>> I think once the kill_css() has been called on a css, and it is enqueued to be
>> killed, we shouldn't call css_put(css) on it anymore outside of the "killed call
>> flow path". It will all be handled by the css_killed_ref_fn() function.
>>
>> The fact the css_release() is called (via cgroup_kn_unlock()) just after
>> kill_css() causes the css->destroy_work to be enqueued twice on the same WQ
>> (cgroup_destroy_wq), just with different function. This results in the
>> BUG: corrupted list in insert_work issue.
> 
> I have a hard time following here. The kill / release paths relationship
> isn't that complicated. The kill path is invoked when the percpu_ref's base
> ref is killed and holds an extra ref so that it's guaranteed that release
> can't happen before the kill path is done with the css. When the final put
> happens - whether that's from the kill path or someone else, which often is
> the case - the release path triggers. If we have release getting scheduled
> while the kill path isn't finished, it is a reference counting problem,
> right?
> 
> Can you elaborate the exact scenario that you think is happening? Please
> feel free to omit the function calls and all that. Just lay out who's doing
> what.

Ok the problem is that

1. kill_css() triggers css_killed_ref_fn(), which enqueues &css->destroy_work on cgroup_destroy_wq
2. Last put_css() calls css_release(), which enqueues &css->destroy_work on cgroup_destroy_wq

We have two instances of the same work struct enqueued on the same WQ (cgroup_destroy_wq),
which causes "BUG: corrupted list in insert_work"

So I think the easiest way to solve this would be to have two separate work_structs,
one for the killed_ref path and css_release path as in:

https://lore.kernel.org/all/20220523212724.233314-1-tadeusz.struk@linaro.org/

-- 
Thanks,
Tadeusz

WARNING: multiple messages have this Message-ID (diff)
From: Tadeusz Struk <tadeusz.struk-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
To: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Cc: "Michal Koutný" <mkoutny-IBi9RG/b67k@public.gmane.org>,
	cgroups-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	"Zefan Li" <lizefan.x-EC8Uxl6Npydl57MIdRCFDg@public.gmane.org>,
	"Johannes Weiner"
	<hannes-druUgvl0LCNAfugRpC6u6w@public.gmane.org>,
	"Bui Quang Minh"
	<minhquangbui99-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Subject: Re: [PATCH 2/2] cgroup: Use separate work structs on css release path
Date: Wed, 1 Jun 2022 17:26:34 -0700	[thread overview]
Message-ID: <416dc60a-f0e5-7d05-1613-3cd0ca415768@linaro.org> (raw)
In-Reply-To: <Ypf/MpwzByOrSp6A-NiLfg/pYEd1N0TnZuCh8vA@public.gmane.org>

On 6/1/22 17:07, Tejun Heo wrote:
> Hello,
> 
> On Wed, Jun 01, 2022 at 05:00:44PM -0700, Tadeusz Struk wrote:
>> What I'm trying to say is that it's not really a ref imbalance problem.
>> I think once the kill_css() has been called on a css, and it is enqueued to be
>> killed, we shouldn't call css_put(css) on it anymore outside of the "killed call
>> flow path". It will all be handled by the css_killed_ref_fn() function.
>>
>> The fact the css_release() is called (via cgroup_kn_unlock()) just after
>> kill_css() causes the css->destroy_work to be enqueued twice on the same WQ
>> (cgroup_destroy_wq), just with different function. This results in the
>> BUG: corrupted list in insert_work issue.
> 
> I have a hard time following here. The kill / release paths relationship
> isn't that complicated. The kill path is invoked when the percpu_ref's base
> ref is killed and holds an extra ref so that it's guaranteed that release
> can't happen before the kill path is done with the css. When the final put
> happens - whether that's from the kill path or someone else, which often is
> the case - the release path triggers. If we have release getting scheduled
> while the kill path isn't finished, it is a reference counting problem,
> right?
> 
> Can you elaborate the exact scenario that you think is happening? Please
> feel free to omit the function calls and all that. Just lay out who's doing
> what.

Ok the problem is that

1. kill_css() triggers css_killed_ref_fn(), which enqueues &css->destroy_work on cgroup_destroy_wq
2. Last put_css() calls css_release(), which enqueues &css->destroy_work on cgroup_destroy_wq

We have two instances of the same work struct enqueued on the same WQ (cgroup_destroy_wq),
which causes "BUG: corrupted list in insert_work"

So I think the easiest way to solve this would be to have two separate work_structs,
one for the killed_ref path and css_release path as in:

https://lore.kernel.org/all/20220523212724.233314-1-tadeusz.struk-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org/

-- 
Thanks,
Tadeusz

  reply	other threads:[~2022-06-02  0:26 UTC|newest]

Thread overview: 40+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-05-25 15:15 [PATCH 0/2] cgroup_subsys_state lifecycle fixups Michal Koutný
2022-05-25 15:15 ` Michal Koutný
2022-05-25 15:15 ` [PATCH 1/2] cgroup: Wait for cgroup_subsys_state offlining on unmount Michal Koutný
2022-05-25 15:15   ` Michal Koutný
2022-05-25 15:15 ` [PATCH 2/2] cgroup: Use separate work structs on css release path Michal Koutný
2022-05-25 15:15   ` Michal Koutný
2022-05-25 16:14   ` Michal Koutný
2022-05-25 16:14     ` Michal Koutný
2022-05-26  9:56     ` Michal Koutný
2022-05-26  9:56       ` Michal Koutný
2022-05-26 18:15       ` Tejun Heo
2022-05-26 18:15         ` Tejun Heo
2022-05-27 16:39         ` Tadeusz Struk
2022-05-27 16:39           ` Tadeusz Struk
2022-05-27 16:54           ` Michal Koutný
2022-05-27 16:54             ` Michal Koutný
2022-05-27 17:23             ` Tejun Heo
2022-05-27 17:23               ` Tejun Heo
2022-06-01 23:13         ` Tadeusz Struk
2022-06-01 23:13           ` Tadeusz Struk
2022-06-01 23:20           ` Tejun Heo
2022-06-01 23:20             ` Tejun Heo
2022-06-01 23:37             ` Tadeusz Struk
2022-06-01 23:37               ` Tadeusz Struk
2022-06-01 23:43               ` Tejun Heo
2022-06-01 23:43                 ` Tejun Heo
2022-06-02  0:00                 ` Tadeusz Struk
2022-06-02  0:00                   ` Tadeusz Struk
2022-06-02  0:07                   ` Tejun Heo
2022-06-02  0:07                     ` Tejun Heo
2022-06-02  0:26                     ` Tadeusz Struk [this message]
2022-06-02  0:26                       ` Tadeusz Struk
2022-06-02  0:29                       ` Tejun Heo
2022-06-02  0:29                         ` Tejun Heo
2022-06-02  0:40                         ` Tadeusz Struk
2022-06-02  0:40                           ` Tadeusz Struk
2022-06-02 11:47                           ` Michal Koutný
2022-06-02 11:47                             ` Michal Koutný
2022-06-02 14:28                             ` Tadeusz Struk
2022-06-02 14:28                               ` Tadeusz Struk

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=416dc60a-f0e5-7d05-1613-3cd0ca415768@linaro.org \
    --to=tadeusz.struk@linaro.org \
    --cc=cgroups@vger.kernel.org \
    --cc=hannes@cmpxchg.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=lizefan.x@bytedance.com \
    --cc=minhquangbui99@gmail.com \
    --cc=mkoutny@suse.com \
    --cc=tj@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.