linux-kernel.vger.kernel.org archive mirror
From: Michal Hocko <mhocko@kernel.org>
To: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Heinrich Schuchardt <xypron.glpk@gmx.de>,
	Andrew Morton <akpm@linux-foundation.org>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: threads-max observe limits
Date: Wed, 18 Sep 2019 09:15:41 +0200
Message-ID: <20190918071541.GB12770@dhcp22.suse.cz>
In-Reply-To: <87ftku96md.fsf@x220.int.ebiederm.org>

On Tue 17-09-19 12:26:18, Eric W. Biederman wrote:
> Michal Hocko <mhocko@kernel.org> writes:
> 
> > On Tue 17-09-19 17:28:02, Heinrich Schuchardt wrote:
> >> 
> >> On 9/17/19 12:03 PM, Michal Hocko wrote:
> >> > Hi,
> >> > I have just stumbled over 16db3d3f1170 ("kernel/sysctl.c: threads-max
> >> > observe limits") and I am really wondering what is the motivation behind
> >> > the patch. We've had a customer noticing the threads_max autoscaling
> >> > differences between 3.12 and 4.4 kernels and wanted to override the auto
> >> > tuning from the userspace, just to find out that this is not possible.
> >> 
> >> set_max_threads() sets the upper limit (max_threads_suggested) for
> >> threads such that at a maximum 1/8th of the total memory can be occupied
> >> by the thread's administrative data (of size THREAD_SIZE). On my 32 GiB
> >> system this results in 254313 threads.
> >
> > This is quite arbitrary, isn't it? What would happen if the limit was
> > twice as large?
> >
> >> With patch 16db3d3f1170 ("kernel/sysctl.c: threads-max observe limits")
> >> a user cannot set an arbitrarily high number for
> >> /proc/sys/kernel/threads-max which could lead to a system stalling
> >> because the thread headers occupy all the memory.
> >
> > This is still a decision of the admin to make.  You can consume the
> > memory by other means and that is why we have measures in place. E.g.
> > memcg accounting.
> >
> >> When developing the patch I remarked that on a system where memory is
> >> installed dynamically it might be a good idea to recalculate this limit.
> >> If you have a system that boots with let's say 8 GiB and then
> >> dynamically installs a few TiB of RAM this might make sense. But such a
> >> dynamic update of max_threads_suggested was left out for the sake of
> >> simplicity.
> >> 
> >> Anyway if more than 100,000 threads are used on a system, I would wonder
> >> if the software should not be changed to use thread-pools instead.
> >
> > You do not change the software to overcome artificial bounds based on
> > guessing.
> >
> > So can we get back to the justification of the patch. What kind of
> > real life problem does it solve and why is it ok to override an admin
> > decision?
> > If there is no strong justification then the patch should be reverted
> > because from what I have heard it has been noticed and it has broken
> > a certain deployment. I am not really clear about technical details yet
> > but it seems that there are workloads that believe they need to touch
> > this tuning and complain if that is not possible.
> 
> Taking a quick look myself.
> 
> I am completely mystified by both sides of this conversation.
> 
> a) The logic to set the default number of threads in a system
>    has not changed since 2.6.12-rc2 (the start of the git history).
> 
> The implementation has changed but we should still get the same
> value.  So anyone seeing threads_max autoscaling differences
> between kernels is either seeing a bug in the rewritten formula
> or something else weird is going on.
> 
> Michal is it a very small effect your customers are seeing?
> Is it another bug somewhere else?

I am still trying to get more information. Reportedly they see a
different auto tuned limit between two kernel versions which results in
an application complaining. As already mentioned this might be a side
effect of something else and this is not yet fully analyzed. My main
point for bringing up this discussion is ...

> b) Not being able to bump threads_max to the physical limit of
>    the machine is very clearly a regression.

... exactly this part. The changelog of the respective patch doesn't
really explain why it is needed except for "it sounds like a good idea
to be consistent".
-- 
Michal Hocko
SUSE Labs
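
For reference, the 1/8th-of-memory rule that Heinrich describes in the
quoted text can be sketched in userspace C as below. This is an
illustration only, not the kernel's set_max_threads(): the function
name suggested_max_threads() and the two clamp bounds are assumptions
introduced for the sketch, and the thread size is passed in rather than
taken from THREAD_SIZE.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed clamp bounds for this sketch; not taken from the thread. */
#define SKETCH_MIN_THREADS 20ULL
#define SKETCH_MAX_THREADS 0x3fffffffULL

/*
 * Default thread limit under the 1/8th rule: the per-thread
 * administrative data (thread_size bytes each) may occupy at most
 * one eighth of total RAM.
 */
static uint64_t suggested_max_threads(uint64_t total_ram_bytes,
				      uint64_t thread_size)
{
	uint64_t threads;

	/* Reject a zero size and a size whose 8x multiple would overflow. */
	assert(thread_size != 0 && thread_size <= UINT64_MAX / 8);
	threads = total_ram_bytes / (thread_size * 8);

	/* Keep the result inside the assumed lower and upper bounds. */
	if (threads < SKETCH_MIN_THREADS)
		threads = SKETCH_MIN_THREADS;
	if (threads > SKETCH_MAX_THREADS)
		threads = SKETCH_MAX_THREADS;
	return threads;
}
```

With exactly 32 GiB and an assumed 16 KiB per thread,
suggested_max_threads(32ULL << 30, 16384) gives 262144. The 254313
quoted above is a little lower, which would be consistent with the
kernel counting somewhat less than the installed 32 GiB as total RAM,
though the thread does not say so.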
