From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
To: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	bugzilla-daemon@bugzilla.kernel.org, dsmythies@telus.net,
	linux-mm@kvack.org
Subject: Re: [Bug 172981] New: [bisected] SLAB: extreme load averages and over 2000 kworker threads
Date: Fri, 30 Sep 2016 17:19:41 +0900	[thread overview]
Message-ID: <20160930081940.GA3606@js1304-P5Q-DELUXE> (raw)
In-Reply-To: <20160929134550.GB20312@esperanza>

On Thu, Sep 29, 2016 at 04:45:50PM +0300, Vladimir Davydov wrote:
> On Thu, Sep 29, 2016 at 11:00:50AM +0900, Joonsoo Kim wrote:
> > On Wed, Sep 28, 2016 at 11:09:53AM +0300, Vladimir Davydov wrote:
> > > On Tue, Sep 27, 2016 at 10:03:47PM -0400, Johannes Weiner wrote:
> > > > [CC Vladimir]
> > > > 
> > > > These are the delayed memcg cache allocations, where in a fresh memcg
> > > > that doesn't have per-memcg caches yet, every accounted allocation
> > > > schedules a kmalloc work item in __memcg_schedule_kmem_cache_create()
> > > > until the cache is finally available. It looks like those can be many
> > > > more than the number of slab caches in existence, if there is a storm
> > > > of slab allocations before the workers get a chance to run.
> > > > 
> > > > Vladimir, what do you think of embedding the work item into the
> > > > memcg_cache_array? That way we make sure we have exactly one work per
> > > > cache and not an unbounded number of them. The downside of course is
> > > > that we'd have to keep these things around as long as the memcg is in
> > > > existence, but that's the only place I can think of that allows us to
> > > > serialize this.
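
(For reference, the path Johannes describes above, simplified from the
4.8-era mm/memcontrol.c; exact details may differ from your tree:)

static void __memcg_schedule_kmem_cache_create(struct mem_cgroup *memcg,
					       struct kmem_cache *cachep)
{
	struct memcg_kmem_cache_create_work *cw;

	cw = kmalloc(sizeof(*cw), GFP_NOWAIT);
	if (!cw)
		return;

	css_get(&memcg->css);
	cw->memcg = memcg;
	cw->cachep = cachep;
	INIT_WORK(&cw->work, memcg_kmem_cache_create_func);

	/* One work item per accounted allocation that misses the
	 * per-memcg cache, queued on the global workqueue -- nothing
	 * bounds how many of these are in flight at once. */
	schedule_work(&cw->work);
}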
> > > 
> > > We could set the entry of the root_cache->memcg_params.memcg_caches
> > > array corresponding to the cache being created to a special value, say
> > > (void *)1, and skip scheduling the cache creation work on kmalloc if
> > > the caller sees it. I'm not sure it's really worth it though, because
> > > work_struct isn't so big (at least in comparison with the cache
> > > itself) that we need to avoid embedding it at all costs.
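
(A hypothetical sketch of that sentinel idea -- MEMCG_CACHE_CREATING,
arr and idx below are made-up names for illustration, not actual kernel
code:)

#define MEMCG_CACHE_CREATING	((struct kmem_cache *)1)

	/* In the allocation slow path: arr is the root cache's
	 * memcg_caches array, idx the memcg's cache index. */
	cachep = READ_ONCE(arr->entries[idx]);
	if (cachep == MEMCG_CACHE_CREATING)
		return root_cache;	/* creation already scheduled */
	if (!cachep) {
		/* Claim the slot so concurrent allocations don't
		 * schedule duplicate creation works. */
		if (cmpxchg(&arr->entries[idx], NULL,
			    MEMCG_CACHE_CREATING) == NULL)
			memcg_schedule_kmem_cache_create(memcg, root_cache);
		return root_cache;	/* fall back meanwhile */
	}
	return cachep;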
> > 
> > Hello, Johannes and Vladimir.
> > 
> > I'm not familiar with memcg, so I have a question about this solution.
> > This solution will solve the current issue, but if a burst of memcg
> > creation happens, a similar issue would happen again. Is my
> > understanding correct?
> 
> Yes, I think you're right - embedding the work_struct responsible for
> cache creation in the kmem_cache struct won't help if a thousand
> different cgroups call kmem_cache_alloc() simultaneously for a cache
> they haven't used yet.
> 
> Come to think of it, we could fix the issue by simply introducing a
> special single-threaded workqueue used exclusively for cache creation
> work items - cache creation is done mostly under the slab_mutex,
> anyway. This way, we wouldn't have to keep those used-once
> work_structs around for the whole kmem_cache lifetime.
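
(A minimal sketch of that, assuming a dedicated workqueue -- the name
and initcall below are illustrative:)

static struct workqueue_struct *memcg_kmem_cache_create_wq;

static int __init memcg_kmem_cache_wq_init(void)
{
	/* max_active = 1: creation works are serialized and the number
	 * of kworkers stays bounded regardless of the allocation burst. */
	memcg_kmem_cache_create_wq =
		alloc_workqueue("memcg_kmem_cache_create", 0, 1);
	BUG_ON(!memcg_kmem_cache_create_wq);
	return 0;
}
subsys_initcall(memcg_kmem_cache_wq_init);

/* ...and __memcg_schedule_kmem_cache_create() would queue there: */
	queue_work(memcg_kmem_cache_create_wq, &cw->work);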
> 
> > 
> > I think that the other cause of the problem is that we call
> > synchronize_sched(), which is rather slow, while holding the
> > slab_mutex, and that blocks further kmem_cache creation. Should we
> > fix that, too?
> 
> Well, the patch you posted looks pretty obvious and it helps the
> reporter, so personally I don't see any reason for not applying it.

Oops... I forgot to mention why I asked that.

There is another report that a similar problem also happens with SLUB.
There, synchronize_sched() is called in the cache shrinking path while
holding the slab_mutex, and I guess that blocks further kmem_cache
creation.
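
(The pattern in question, heavily simplified from the 4.8-era
memcg_deactivate_kmem_caches() in mm/slab_common.c; memcg_cache_for()
is an illustrative helper, not a real function:)

	mutex_lock(&slab_mutex);
	list_for_each_entry(s, &slab_caches, list) {
		c = memcg_cache_for(s, idx);
		if (!c)
			continue;
		/* On SLUB, __kmem_cache_shrink(c, true) issues a
		 * synchronize_sched() before discarding per-cpu slabs,
		 * so each dead cache adds a full grace-period wait while
		 * slab_mutex is held -- and kmem_cache_create() needs
		 * that same mutex. */
		__kmem_cache_shrink(c, true);
	}
	mutex_unlock(&slab_mutex);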

If we use a special single-threaded workqueue, the number of kworkers
would be bounded, but kmem_cache creation could still be delayed for a
long time in a burst memcg creation/destruction scenario.

https://bugzilla.kernel.org/show_bug.cgi?id=172991

Do we need to remove the synchronize_sched() call in SLUB and find
another solution?

Thanks.


Thread overview: 16+ messages
     [not found] <bug-172981-27@https.bugzilla.kernel.org/>
2016-09-27 18:10 ` [Bug 172981] New: [bisected] SLAB: extreme load averages and over 2000 kworker threads Andrew Morton
2016-09-28  2:03   ` Johannes Weiner
2016-09-28  8:09     ` Vladimir Davydov
2016-09-29  2:00       ` Joonsoo Kim
2016-09-29 13:45         ` Vladimir Davydov
2016-09-30  8:19           ` Joonsoo Kim [this message]
2016-09-30 19:58             ` Vladimir Davydov
2016-10-06  5:04             ` Doug Smythies
2016-10-06  6:35               ` Joonsoo Kim
2016-10-06 16:02               ` Doug Smythies
2016-10-07 15:55               ` Doug Smythies
2016-09-28  3:13   ` Doug Smythies
2016-09-28  5:18     ` Joonsoo Kim
2016-09-28  6:20       ` Joonsoo Kim
2016-09-28 15:22       ` Doug Smythies
2016-09-29  1:50         ` Joonsoo Kim
