From: Michal Hocko <mhocko@kernel.org>
To: Chris Down <chris@chrisdown.name>
Cc: linux-mm@kvack.org, LKML <linux-kernel@vger.kernel.org>,
Andrew Morton <akpm@linux-foundation.org>,
Linus Torvalds <torvalds@linux-foundation.org>,
Tim Chen <tim.c.chen@linux.intel.com>
Subject: Re: [RFC PATCH] mm: silence soft lockups from unlock_page
Date: Tue, 21 Jul 2020 17:00:24 +0200
Message-ID: <20200721150024.GM4061@dhcp22.suse.cz>
In-Reply-To: <20200721141749.GA742741@chrisdown.name>

On Tue 21-07-20 15:17:49, Chris Down wrote:
> I understand the pragmatic considerations here, but I'm quite concerned
> about the maintainability and long-term ability to reason about a patch like
> this. For example, how do we know when this patch is safe to remove? Also,
> what other precedent does this set for us covering for poor userspace
> behaviour?
>
> Speaking as a systemd maintainer, if udev could be doing something better on
> these machines, we'd be more than receptive to help fix it. In general I am
> against explicit watchdog tweaking here because a.) there's potential to
> mask other problems, and b.) it seems like the kind of one-off trivia nobody
> is going to remember exists when doing complex debugging in future.
>
> Is there anything preventing this being remedied in udev, instead of the
> kernel?

Yes, I believe there is a configuration option to cap the maximum number
of workers. This is not my area, but my understanding is that the maximum
is tuned based on available memory and/or CPUs. We have been hit by
this quite heavily on SLES. Maybe newer versions of systemd have better
tuning.

But it seems that udev is just a messenger here. There is nothing
fundamentally udev-specific in the underlying problem, unless I am
missing something. It is quite possible that this could be triggered by
other userspace which happens to fire many workers at the same time,
all contending on a shared page.

Not that I like this workaround in the first place, but it seems that the
existing code allows very long wait chains, and !PREEMPT kernels simply
may not have any scheduling point for a long time. I believe
we should focus on that even if systemd, as the current trigger, can
be tuned better. I do not insist on this patch, hence the RFC, but I am
simply not seeing a much better, yet not convoluted, solution.
--
Michal Hocko
SUSE Labs