From: Vladimir Davydov <vdavydov.dev@gmail.com>
To: Michal Hocko <mhocko@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Christoph Lameter <cl@linux.com>,
	David Rientjes <rientjes@google.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Joonsoo Kim <iamjoonsoo.kim@lge.com>,
	Pekka Enberg <penberg@kernel.org>
Subject: Re: [PATCH 1/2] mm: memcontrol: use special workqueue for creating per-memcg caches
Date: Mon, 3 Oct 2016 15:35:06 +0300	[thread overview]
Message-ID: <20161003123505.GA1862@esperanza> (raw)
In-Reply-To: <20161003120641.GC26768@dhcp22.suse.cz>

On Mon, Oct 03, 2016 at 02:06:42PM +0200, Michal Hocko wrote:
> On Sat 01-10-16 16:56:47, Vladimir Davydov wrote:
> > Creating a lot of cgroups at the same time might stall all worker
> > threads with kmem cache creation works, because kmem cache creation is
> > done with the slab_mutex held. To prevent that from happening, let's use
> > a special workqueue for kmem cache creation with max in-flight work
> > items equal to 1.
> > 
> > Link: https://bugzilla.kernel.org/show_bug.cgi?id=172981
> 
> This looks like a regression but I am not really sure I understand what
> has caused it. We have had WQ-based cache creation since kmem was
> introduced, more or less. So is it 801faf0db894 ("mm/slab: lockless
> decision to grow cache"), which bisection pointed to, that changed the
> timing or, rather, relaxed the cache creation to the point that would
> allow this runaway?

It is, in the case of SLAB. For SLUB, the issue was caused by commit
81ae6d03952c ("mm/slub.c: replace kick_all_cpus_sync() with
synchronize_sched() in kmem_cache_shrink()").
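
Roughly, the picture (simplified, and partly my assumption about the exact
call paths) is that after that commit shrinking a cache may wait for a full
RCU-sched grace period while the global slab_mutex is held, so every
per-memcg cache-creation work queued meanwhile blocks a worker thread on
that same mutex:

	/*
	 * Simplified, assumed sequence of events:
	 *
	 *   shrink/offline path            cache creation works (system_wq)
	 *   -------------------            --------------------------------
	 *   mutex_lock(&slab_mutex);
	 *   kmem_cache_shrink(s);
	 *     synchronize_sched();    <--   long wait, slab_mutex still held
	 *                                   work 1: mutex_lock(&slab_mutex) blocks
	 *                                   work 2: mutex_lock(&slab_mutex) blocks
	 *                                   ...
	 *                                   each blocked work pins a kworker,
	 *                                   eventually exhausting the pool
	 *   mutex_unlock(&slab_mutex);
	 */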

> This would be really useful for the stable backport
> consideration.
> 
> Also, if I understand the fix correctly, now we do limit the number of
> workers to 1 thread. Is this really what we want? Wouldn't it be
> possible that a few memcgs could starve others from having their caches
> created? What would be the result, missed charges?

Now kmem caches are created in FIFO order, i.e. if one memcg calls
kmem_cache_alloc on a non-existent cache before another does, it will be
served first. Since the number of caches that can be created by a single
memcg is obviously limited, I don't see any possibility of starvation.
Actually, this patch doesn't introduce any functional changes regarding
the order in which kmem caches are created, as the work function holds
the global slab_mutex during its whole runtime anyway. We only avoid
spawning a separate thread for each work item by making the queue
single-threaded.
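
For reference, the gist of the change is just a dedicated workqueue with
max_active == 1 that the cache-creation works go to instead of the system
workqueue. A minimal sketch (identifier names are my shorthand here, not
necessarily the ones used in the patch):

	#include <linux/workqueue.h>
	#include <linux/slab.h>
	#include <linux/memcontrol.h>

	/* assumed name for the dedicated single-threaded queue */
	static struct workqueue_struct *memcg_kmem_cache_wq;

	static int __init memcg_kmem_cache_wq_init(void)
	{
		/*
		 * max_active == 1: at most one cache-creation work runs at a
		 * time; the rest wait on the queue in FIFO order instead of
		 * each occupying a kworker that would only block on
		 * slab_mutex anyway.
		 */
		memcg_kmem_cache_wq = alloc_workqueue("memcg_kmem_cache", 0, 1);
		return memcg_kmem_cache_wq ? 0 : -ENOMEM;
	}

	/* assumed shape of the work item wrapper */
	struct memcg_kmem_cache_create_work {
		struct mem_cgroup *memcg;
		struct kmem_cache *cachep;
		struct work_struct work;
	};

	static void memcg_schedule_kmem_cache_create(struct memcg_kmem_cache_create_work *cw)
	{
		/* previously this would go to the shared system_wq */
		queue_work(memcg_kmem_cache_wq, &cw->work);
	}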

Thread overview:
2016-10-01 13:56 [PATCH 1/2] mm: memcontrol: use special workqueue for creating per-memcg caches Vladimir Davydov
2016-10-01 13:56 ` [PATCH 2/2] slub: move synchronize_sched out of slab_mutex on shrink Vladimir Davydov
2016-10-06  6:27   ` Joonsoo Kim
2016-10-03 12:06 ` [PATCH 1/2] mm: memcontrol: use special workqueue for creating per-memcg caches Michal Hocko
2016-10-03 12:35   ` Vladimir Davydov [this message]
2016-10-03 13:19     ` Michal Hocko
2016-10-04 13:14       ` Vladimir Davydov
2016-10-06 12:05         ` Michal Hocko
2016-10-21  3:44         ` Andrew Morton
2016-10-21  6:39           ` Michal Hocko
