linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Tim Chen <tim.c.chen@linux.intel.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: "Liang, Kan" <kan.liang@intel.com>,
	Mel Gorman <mgorman@techsingularity.net>,
	Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@elte.hu>, Andi Kleen <ak@linux.intel.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Johannes Weiner <hannes@cmpxchg.org>, Jan Kara <jack@suse.cz>,
	Christopher Lameter <cl@linux.com>,
	"Eric W . Biederman" <ebiederm@xmission.com>,
	Davidlohr Bueso <dave@stgolabs.net>,
	linux-mm <linux-mm@kvack.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 2/2 v2] sched/wait: Introduce lock breaker in wake_up_page_bit
Date: Wed, 13 Sep 2017 19:12:23 -0700	[thread overview]
Message-ID: <a9e74f64-dee6-dc23-128e-8ef8c7383d77@linux.intel.com> (raw)
In-Reply-To: <CA+55aFx3WY00yvEDBg7TagX4h_-QO71=HAq5GAT8awtewRXONQ@mail.gmail.com>

On 08/29/2017 09:24 AM, Linus Torvalds wrote:
> On Tue, Aug 29, 2017 at 9:13 AM, Tim Chen <tim.c.chen@linux.intel.com> wrote:
>>
>> It is affecting not a production use, but the customer's acceptance
>> test for their systems.  So I suspect it is a stress test.
> 
> Can you gently poke them and ask if they might make theie stress test
> code available?
> 
> Tell them that we have a fix, but right now it's delayed into 4.14
> because we have no visibility into what it is that it actually fixes,
> and whether it's all that critical or just some microbenchmark.
> 
>

Linus,

Here's what the customer think happened and is willing to tell us.
They have a parent process that spawns off 10 children per core and
kicked them to run. The child processes all access a common library.
We have 384 cores so 3840 child processes running.  When migration occur on
a page in the common library, the first child that access the page will
page fault and lock the page, with the other children also page faulting
quickly and pile up in the page wait list, till the first child is done.

Probably some kind of access pattern of the common library induces the
page migration to happen.

BTW, will you be merging these 2 patches in 4.14?

Thanks.

Tim

  parent reply	other threads:[~2017-09-14  2:12 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-08-25 16:13 [PATCH 1/2 v2] sched/wait: Break up long wake list walk Tim Chen
2017-08-25 16:13 ` [PATCH 2/2 v2] sched/wait: Introduce lock breaker in wake_up_page_bit Tim Chen
2017-08-25 19:58   ` Linus Torvalds
2017-08-25 22:19     ` Tim Chen
2017-08-25 22:51       ` Linus Torvalds
2017-08-25 23:03         ` Linus Torvalds
2017-08-26  0:31           ` Linus Torvalds
2017-08-26  2:54             ` Linus Torvalds
2017-08-26 18:15               ` Linus Torvalds
2017-08-27 21:40                 ` Linus Torvalds
2017-08-27 21:42                   ` Linus Torvalds
2017-08-27 23:12                   ` Linus Torvalds
2017-08-28  1:16                     ` Nicholas Piggin
2017-08-28  1:29                       ` Nicholas Piggin
2017-08-28  5:17                         ` Linus Torvalds
2017-08-28  7:18                           ` Nicholas Piggin
2017-08-28 14:51                 ` Liang, Kan
2017-08-28 16:48                   ` Linus Torvalds
2017-08-28 20:01                     ` Tim Chen
2017-08-29 12:57                     ` Liang, Kan
2017-08-29 16:01                       ` Linus Torvalds
2017-08-29 16:13                         ` Tim Chen
2017-08-29 16:24                           ` Linus Torvalds
2017-08-29 16:57                             ` Tim Chen
2017-09-14  2:12                             ` Tim Chen [this message]
2017-09-14  2:27                               ` Linus Torvalds
2017-09-14 16:50                                 ` Tim Chen
2017-09-14 17:00                                   ` Linus Torvalds
2017-09-14 16:39                               ` Christopher Lameter
2017-08-29 16:17                     ` Tim Chen
2017-08-29 16:22                       ` Linus Torvalds
2017-08-25 17:46 ` [PATCH 1/2 v2] sched/wait: Break up long wake list walk Christopher Lameter

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=a9e74f64-dee6-dc23-128e-8ef8c7383d77@linux.intel.com \
    --to=tim.c.chen@linux.intel.com \
    --cc=ak@linux.intel.com \
    --cc=akpm@linux-foundation.org \
    --cc=cl@linux.com \
    --cc=dave@stgolabs.net \
    --cc=ebiederm@xmission.com \
    --cc=hannes@cmpxchg.org \
    --cc=jack@suse.cz \
    --cc=kan.liang@intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mgorman@techsingularity.net \
    --cc=mingo@elte.hu \
    --cc=peterz@infradead.org \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).