All of lore.kernel.org
 help / color / mirror / Atom feed
From: Johannes Weiner <hannes@cmpxchg.org>
To: Andrew Morton <akpm@linux-foundation.org>,
	Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>,
	bugzilla-daemon@bugzilla.kernel.org, dsmythies@telus.net,
	linux-mm@kvack.org
Subject: Re: [Bug 172981] New: [bisected] SLAB: extreme load averages and over 2000 kworker threads
Date: Tue, 27 Sep 2016 22:03:47 -0400	[thread overview]
Message-ID: <20160928020347.GA21129@cmpxchg.org> (raw)
In-Reply-To: <20160927111059.282a35c89266202d3cb2f953@linux-foundation.org>

[CC Vladimir]

These are the delayed memcg cache allocations, where in a fresh memcg
that doesn't have per-memcg caches yet, every accounted allocation
schedules a kmalloc work item in __memcg_schedule_kmem_cache_create()
until the cache is finally available. It looks like those can be many
more than the number of slab caches in existence, if there is a storm
of slab allocations before the workers get a chance to run.

Vladimir, what do you think of embedding the work item into the
memcg_cache_array? That way we make sure we have exactly one work per
cache and not an unbounded number of them. The downside of course is
that we'd have to keep these things around as long as the memcg is in
existence, but that's the only place I can think of that allows us to
serialize this.

On Tue, Sep 27, 2016 at 11:10:59AM -0700, Andrew Morton wrote:
> 
> (switched to email.  Please respond via emailed reply-to-all, not via the
> bugzilla web interface).
> 
> On Tue, 27 Sep 2016 17:57:08 +0000 bugzilla-daemon@bugzilla.kernel.org wrote:
> 
> > https://bugzilla.kernel.org/show_bug.cgi?id=172981
> > 
> >             Bug ID: 172981
> >            Summary: [bisected] SLAB: extreme load averages and over 2000
> >                     kworker threads
> >            Product: Memory Management
> >            Version: 2.5
> >     Kernel Version: 4.7+
> >           Hardware: All
> >                 OS: Linux
> >               Tree: Mainline
> >             Status: NEW
> >           Severity: normal
> >           Priority: P1
> >          Component: Slab Allocator
> >           Assignee: akpm@linux-foundation.org
> >           Reporter: dsmythies@telus.net
> >         Regression: No
> > 
> > Immediately after boot, extreme load average numbers and over 2000 kworker
> > processes are being observed on my main linux test computer (basically a Ubuntu
> > 16.04 server, no GUI). The worker threads appear to be idle, and do disappear
> > after the nominal 5 minute timeout, depending on whatever other stuff might run
> > in the meantime. However, the number of threads can hugely increase again. The
> > issue occurs with ease for kernels compiled using SLAB.
> > 
> > For SLAB, kernel bisection gave:
> > 801faf0db8947e01877920e848a4d338dd7a99e7
> > "mm/slab: lockless decision to grow cache"
> > 
> > The following monitoring script was used for the below examples:
> > 
> > #!/bin/dash
> > 
> > while [ 1 ];
> > do
> >   echo $(uptime) ::: $(ps -A --no-headers | wc -l) ::: $(ps aux | grep kworker
> > | grep -v u | grep -v H | wc -l)
> >   sleep 10.0
> > done
> > 
> > Example (SLAB):
> > 
> > After boot:
> > 
> > 22:26:21 up 1 min, 2 users, load average: 295.98, 85.67, 29.47 ::: 2240 :::
> > 2074
> > 22:26:31 up 1 min, 2 users, load average: 250.47, 82.85, 29.15 ::: 2240 :::
> > 2074
> > 22:26:41 up 1 min, 2 users, load average: 211.96, 80.12, 28.84 ::: 2240 :::
> > 2074
> > ...
> > 22:52:34 up 27 min, 3 users, load average: 0.00, 0.43, 5.40 ::: 165 ::: 17
> > 22:52:44 up 27 min, 3 users, load average: 0.00, 0.42, 5.34 ::: 165 ::: 17
> > 
> > Now type: sudo echo "bla":
> > 
> > 22:53:14 up 27 min, 3 users, load average: 0.00, 0.38, 5.17 ::: 493 ::: 345
> > 22:53:24 up 28 min, 3 users, load average: 0.00, 0.36, 5.11 ::: 493 ::: 345
> > 
> > Caused 328 new kworker threads.
> > Now queue just a few (8 in this case) very simple jobs.
> > 
> > 22:55:45 up 30 min, 3 users, load average: 0.11, 0.27, 4.38 ::: 493 ::: 345
> > 22:55:55 up 30 min, 3 users, load average: 0.09, 0.26, 4.34 ::: 2207 ::: 2059
> > 22:56:05 up 30 min, 3 users, load average: 0.08, 0.25, 4.29 ::: 2207 ::: 2059
> > 
> > If I look at linux/Documentation/workqueue.txt and do:
> > 
> > echo workqueue:workqueue_queue_work > /sys/kernel/debug/tracing/set_event
> > 
> > and:
> > 
> > cat /sys/kernel/debug/tracing/trace_pipe > out.txt
> > 
> > I get somewhere between 10,000 and 20,000 occurrences of
> > memcg_kmem_cache_create_func in the file (using my simple test method).
> > 
> > Also tested with kernel 4.8-rc7.
> > 
> > -- 
> > You are receiving this mail because:
> > You are the assignee for the bug.
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2016-09-28  2:04 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <bug-172981-27@https.bugzilla.kernel.org/>
2016-09-27 18:10 ` [Bug 172981] New: [bisected] SLAB: extreme load averages and over 2000 kworker threads Andrew Morton
2016-09-28  2:03   ` Johannes Weiner [this message]
2016-09-28  8:09     ` Vladimir Davydov
2016-09-29  2:00       ` Joonsoo Kim
2016-09-29 13:45         ` Vladimir Davydov
2016-09-30  8:19           ` Joonsoo Kim
2016-09-30 19:58             ` Vladimir Davydov
2016-10-06  5:04             ` Doug Smythies
2016-10-06  6:35               ` Joonsoo Kim
2016-10-06 16:02               ` Doug Smythies
2016-10-07 15:55               ` Doug Smythies
2016-09-28  3:13   ` Doug Smythies
2016-09-28  5:18     ` Joonsoo Kim
2016-09-28  6:20       ` Joonsoo Kim
2016-09-28 15:22       ` Doug Smythies
2016-09-29  1:50         ` Joonsoo Kim

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160928020347.GA21129@cmpxchg.org \
    --to=hannes@cmpxchg.org \
    --cc=akpm@linux-foundation.org \
    --cc=bugzilla-daemon@bugzilla.kernel.org \
    --cc=dsmythies@telus.net \
    --cc=iamjoonsoo.kim@lge.com \
    --cc=linux-mm@kvack.org \
    --cc=vdavydov.dev@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.