From: Roman Gushchin <guro@fb.com>
To: Jeremy Linton <jeremy.linton@arm.com>
Cc: <linux-mm@kvack.org>, <cgroups@vger.kernel.org>,
	<linux-kernel@vger.kernel.org>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Michal Hocko <mhocko@kernel.org>,
	Vladimir Davydov <vdavydov.dev@gmail.com>,
	Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [BUG/RFC] mm/memcg: Possible cgroup migrate/signal deadlock
Date: Tue, 1 Feb 2022 14:11:49 -0800
Message-ID: <YfmwJe9cUQnBV311@carbon.dhcp.thefacebook.com>
In-Reply-To: <20220201205623.1325649-1-jeremy.linton@arm.com>

On Tue, Feb 01, 2022 at 02:56:23PM -0600, Jeremy Linton wrote:
> With CONFIG_MEMCG_KMEM and CONFIG_PROVE_LOCKING enabled (fedora
> rawhide kernel), running a simple podman test tosses a circular
> locking dependency warning. The podman container in question simply
> contains the echo command and the libc/ld-linux needed to run it. The
> warning can be duplicated with just a single `podman build --network
> host --layers=false -t localhost/echo .` command, although the exact
> sequence that triggers the warning needs the task state to be changing
> the frozen state as well. So, it's easier to duplicate with a slightly
> longer test case.
> 
> I've attempted to trigger the actual deadlock with some standalone
> code and been unsuccessful, but looking at the code it appears to be a
> legitimate deadlock if a signal is being sent to the process from
> another thread while the task is migrating between cgroups.
> 
> Attached is a fix which I'm confident fixes the problem, but I'm not
> really that confident in the fix since I don't fully understand all
> the possible states in the cgroup code. The fix avoids the deadlock by
> shifting the objcg->list manipulation to another spinlock and then
> using list_del_rcu in obj_cgroup_release.
> 
> There is a bit more information in the actual BZ
> https://bugzilla.redhat.com/show_bug.cgi?id=2033016 including a shell
> script with the podman test/etc.

Hi Jeremy!

Thank you for the report and the patch!

We discussed this issue some time ago, and I posted a very similar patch:
https://marc.info/?l=linux-cgroups&m=164221633621286&w=2 .

I also resent the latest version a few hours ago, but somehow the
mail didn't make it to the mailing lists. Anyway, I've added you
explicitly to Cc and just resent it.

Thanks!

Thread overview:
2022-02-01 20:56 [BUG/RFC] mm/memcg: Possible cgroup migrate/signal deadlock Jeremy Linton
2022-02-01 22:11 ` Roman Gushchin
