linux-mm.kvack.org archive mirror
From: Michal Hocko <mhocko@kernel.org>
To: Qian Cai <cai@lca.pw>
Cc: linux-mm@kvack.org, LKML <linux-kernel@vger.kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Tim Chen <tim.c.chen@linux.intel.com>
Subject: Re: [RFC PATCH] mm: silence soft lockups from unlock_page
Date: Tue, 21 Jul 2020 15:38:35 +0200	[thread overview]
Message-ID: <20200721133835.GL4061@dhcp22.suse.cz> (raw)
In-Reply-To: <20200721132343.GA4261@lca.pw>

On Tue 21-07-20 09:23:44, Qian Cai wrote:
> On Tue, Jul 21, 2020 at 02:17:52PM +0200, Michal Hocko wrote:
> > On Tue 21-07-20 07:44:07, Qian Cai wrote:
> > > 
> > > 
> > > > On Jul 21, 2020, at 7:25 AM, Michal Hocko <mhocko@kernel.org> wrote:
> > > > 
> > > > Are these really important? I believe I can dig that out from the bug
> > > > report but I didn't really consider that important enough.
> > > 
> > > Please dig them out. We have also been running those things on
> > > “large” powerpc as well and never saw such soft-lockups. Those
> > > details may give us some clues about the actual problem.
> > 
> > I strongly suspect this is not really relevant, but just FYI, this is
> > a 16-node, 11.9TB system with 1536 CPUs.
> 
> Okay, we are now talking about the HPC special case. Just brainstorming some
> ideas here.
> 
> 
> 1) What about increasing the soft-lockup threshold early at boot and restoring
> it afterwards? As far as I can tell, those soft lockups are just a few bursts
> of activity that cure themselves once booting is done.

Is this really a better option than silencing the soft lockup from the code
itself? What if the same access pattern happens later on?

> 2) Reading through the comments above page_waitqueue(), it says rare hash
> collisions could happen, so it sounds like this HPC case makes it rather easy
> to hit those hash collisions. Should we deal with that instead?

As all of those seem to be the same class of process, I suspect it is more
likely that many processes are hitting a page fault on the same file page,
e.g. shared code or a library.

> 3) The commit 62906027091f ("mm: add PageWaiters indicating tasks are waiting
> for a page bit") mentioned that,
> 
> "Putting two bits in the same word opens the opportunity to remove the memory
> barrier between clearing the lock bit and testing the waiters bit, after some
> work on the arch primitives (e.g., ensuring memory operand widths match and
> cover both bits)."
> 
> Do you happen to know if this only happens on powerpc?

I have only seen this single instance, on that machine. I do not think this
is very HW specific, but the ppc platform is likely more prone to it. Just
think of the memory itself: each memory block is notified via udev, and ppc
has very small memblocks (16M to 256M), while x86 will use 2G blocks on
large machines.

> Also, we probably need to dig out whether those memory barriers are still
> there and could be removed to speed things up.

I would be really surprised if memory barriers mattered much. It sounds
much more likely that there is the same underlying problem as in commit
11a19c7b099f: there are just too many waiters on the page. That commit
prevents only the hard-lockup part of the problem, by dropping the lock
and continuing after the bookmark. But, as mentioned in its changelog,
cond_resched() is not really an option because this path is called from
atomic context as well, so !PREEMPT kernels are still in the same boat.

I might have misunderstood something, of course, and would like to hear
where my thinking is wrong.
-- 
Michal Hocko
SUSE Labs


Thread overview: 54+ messages
2020-07-21  6:32 [RFC PATCH] mm: silence soft lockups from unlock_page Michal Hocko
     [not found] ` <FCC3EB2D-9F11-4E9E-88F4-40B2926B35CC@lca.pw>
2020-07-21 11:25   ` Michal Hocko
     [not found]     ` <664A07B6-DBCD-4520-84F1-241A4E7A339F@lca.pw>
2020-07-21 12:17       ` Michal Hocko
     [not found]         ` <20200721132343.GA4261@lca.pw>
2020-07-21 13:38           ` Michal Hocko [this message]
2020-07-21 14:17 ` Chris Down
2020-07-21 15:00   ` Michal Hocko
2020-07-21 15:33 ` Linus Torvalds
2020-07-21 15:49   ` Michal Hocko
2020-07-22 18:29   ` Linus Torvalds
2020-07-22 21:29     ` Hugh Dickins
2020-07-22 22:10       ` Linus Torvalds
2020-07-22 23:42         ` Linus Torvalds
2020-07-23  0:23           ` Linus Torvalds
2020-07-23 12:47           ` Oleg Nesterov
2020-07-23 17:32             ` Linus Torvalds
2020-07-23 18:01               ` Oleg Nesterov
2020-07-23 18:22                 ` Linus Torvalds
2020-07-23 19:03                   ` Linus Torvalds
2020-07-24 14:45                     ` Oleg Nesterov
2020-07-23 20:03               ` Linus Torvalds
2020-07-23 23:11                 ` Hugh Dickins
2020-07-23 23:43                   ` Linus Torvalds
2020-07-24  0:07                     ` Hugh Dickins
2020-07-24  0:46                       ` Linus Torvalds
2020-07-24  3:45                         ` Hugh Dickins
2020-07-24 15:24                     ` Oleg Nesterov
2020-07-24 17:32                       ` Linus Torvalds
2020-07-24 23:25                         ` Linus Torvalds
2020-07-25  2:08                           ` Hugh Dickins
2020-07-25  2:46                             ` Linus Torvalds
2020-07-25 10:14                           ` Oleg Nesterov
2020-07-25 18:48                             ` Linus Torvalds
2020-07-25 19:27                               ` Oleg Nesterov
2020-07-25 19:51                                 ` Linus Torvalds
2020-07-26 13:57                                   ` Oleg Nesterov
2020-07-25 21:19                               ` Hugh Dickins
2020-07-26  4:22                                 ` Hugh Dickins
2020-07-26 20:30                                   ` Hugh Dickins
2020-07-26 20:41                                     ` Linus Torvalds
2020-07-26 22:09                                       ` Hugh Dickins
2020-07-27 19:35                                     ` Greg KH
2020-08-06  5:46                                       ` Hugh Dickins
2020-08-18 13:50                                         ` Greg KH
2020-08-06  5:21                                     ` Hugh Dickins
2020-08-06 17:07                                       ` Linus Torvalds
2020-08-06 18:00                                         ` Matthew Wilcox
2020-08-06 18:32                                           ` Linus Torvalds
2020-08-07 18:41                                             ` Hugh Dickins
2020-08-07 19:07                                               ` Linus Torvalds
2020-08-07 19:35                                               ` Matthew Wilcox
2020-08-03 13:14                           ` Michal Hocko
2020-08-03 17:56                             ` Linus Torvalds
2020-07-25  9:39                         ` Oleg Nesterov
2020-07-23  8:03     ` Michal Hocko
