linux-kernel.vger.kernel.org archive mirror
From: Linus Torvalds <torvalds@linux-foundation.org>
To: Dave Chinner <david@fromorbit.com>
Cc: Mel Gorman <mgorman@suse.de>, Ingo Molnar <mingo@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Aneesh Kumar <aneesh.kumar@linux.vnet.ibm.com>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Linux-MM <linux-mm@kvack.org>,
	xfs@oss.sgi.com, ppc-dev <linuxppc-dev@lists.ozlabs.org>
Subject: Re: [PATCH 4/4] mm: numa: Slow PTE scan rate if migration failures occur
Date: Tue, 17 Mar 2015 14:30:57 -0700
Message-ID: <CA+55aFzSPcNgxw4GC7aAV1r0P5LniyVVC66COz=3cgMcx73Nag@mail.gmail.com>
In-Reply-To: <20150317205104.GA28621@dastard>

On Tue, Mar 17, 2015 at 1:51 PM, Dave Chinner <david@fromorbit.com> wrote:
>
> On the -o ag_stride=-1 -o bhash=101073 config, the 60s perf stat I
> was using during steady state shows:
>
>      471,752      migrate:mm_migrate_pages ( +-  7.38% )
>
> The migrate pages rate is even higher than in 4.0-rc1 (~360,000)
> and 3.19 (~55,000), so that looks like even more of a problem than
> before.

Hmm. How stable are those numbers boot-to-boot?

That kind of extreme spread makes me suspicious. It's also interesting
that if the numbers really go up even more (and by that big amount),
then why does there seem to be almost no correlation with performance
(which apparently went up since rc1, despite migrate_pages getting
even _worse_).
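
For scale, the three migration counts quoted above can be put on a common baseline. This is only a minimal sketch of the arithmetic: the 4.0-rc1 and 3.19 figures are the approximate values from the quoted text, and the helper name `ratio_to_baseline` is made up for illustration.

```python
# mm_migrate_pages counts quoted in the thread (60s perf stat, steady state).
# The 3.19 and 4.0-rc1 values are the approximate (~) figures from the text.
migrate_pages = {
    "3.19": 55_000,
    "4.0-rc1": 360_000,
    "4.0-rc4": 471_752,
}


def ratio_to_baseline(counts, baseline="3.19"):
    """Return each kernel's count divided by the baseline kernel's count."""
    base = counts[baseline]
    return {kernel: count / base for kernel, count in counts.items()}


ratios = ratio_to_baseline(migrate_pages)
for kernel, ratio in ratios.items():
    print(f"{kernel}: {ratio:.2f}x the 3.19 migration count")
```

A ratio on its own says nothing about whether the difference is noise, which is why the boot-to-boot question matters.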

> And the profile looks like:
>
> -   43.73%     0.05%  [kernel]            [k] native_flush_tlb_others

Ok, that's down from rc1 (67%), but still hugely up from 3.19 (13.7%).
And flush_tlb_page() does seem to be called about ten times more
(flush_tlb_mm_range used to be 1.4% of the callers, now it's invisible
at 0.13%).

Damn. From a performance number standpoint, it looked like we zoomed
in on the right thing. But now it's migrating even more pages than
before. Odd.

> And the vmstats are:
>
> 3.19:
>
> numa_hit 5163221
> numa_local 5153127

> 4.0-rc1:
>
> numa_hit 36952043
> numa_local 36927384
>
> 4.0-rc4:
>
> numa_hit 23447345
> numa_local 23438564
>
> Page migrations are still up by a factor of ~20 on 3.19.

The thing is, those "numa_hit" things come from the zone_statistics()
call in buffered_rmqueue(), which in turn is simply from the memory
allocator. That has *nothing* to do with virtual memory, and
everything to do with actual physical memory allocations.  So the load
is simply allocating a lot more pages, presumably for those stupid
migration events.
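
To make the allocation growth concrete, here is a minimal sketch that parses the quoted counters and compares them against 3.19. It assumes the plain "name value" layout that /proc/vmstat uses; `parse_vmstat` and the `summary` dictionary are illustrative names, not anything from the kernel or the thread.

```python
def parse_vmstat(text):
    """Parse 'name value' lines (the /proc/vmstat layout) into a dict of ints."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            stats[parts[0]] = int(parts[1])
    return stats


# Counters exactly as quoted in the thread.
quoted = {
    "3.19": "numa_hit 5163221\nnuma_local 5153127",
    "4.0-rc1": "numa_hit 36952043\nnuma_local 36927384",
    "4.0-rc4": "numa_hit 23447345\nnuma_local 23438564",
}

parsed = {kernel: parse_vmstat(text) for kernel, text in quoted.items()}
base_hit = parsed["3.19"]["numa_hit"]

summary = {}
for kernel, stats in parsed.items():
    summary[kernel] = {
        # How many times more allocator hits than on 3.19.
        "hit_growth": stats["numa_hit"] / base_hit,
        # Share of those hits that were also node-local.
        "local_fraction": stats["numa_local"] / stats["numa_hit"],
    }
    print(f"{kernel}: numa_hit {summary[kernel]['hit_growth']:.2f}x of 3.19, "
          f"{summary[kernel]['local_fraction']:.2%} local")
```

The same parser could be pointed at the contents of /proc/vmstat on a live system, taking a snapshot before and after the run and subtracting.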

But then it doesn't correlate with performance anyway..

Can you do a simple stupid test? Apply that commit 53da3bc2ba9e ("mm:
fix up numa read-only thread grouping logic") to 3.19, so that it uses
the same "pte_dirty()" logic as 4.0-rc4. That *should* make the 3.19
and 4.0-rc4 numbers comparable.

It does make me wonder if your load is "chaotic" wrt scheduling. The
load presumably wants to spread out across all CPUs, but then the
numa code tries to group things together for numa accesses, but
depending on just random allocation patterns and layout in the hash
tables, there either are patterns with page access or there aren't.

Which is kind of why I wonder how stable those numbers are boot to
boot. Maybe this is at least partly about lucky allocation patterns.
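
One way to answer the boot-to-boot question is to collect the same counter once per boot and look at the spread. The sketch below is an assumption about how one might do that, not a procedure from the thread: `relative_spread` and `looks_stable` are hypothetical helpers, and the 10% threshold is an arbitrary cut-off chosen for illustration.

```python
import statistics


def relative_spread(samples):
    """Return (mean, sample stddev as a fraction of the mean) for per-boot counts."""
    if len(samples) < 2:
        raise ValueError("need at least two boots to estimate spread")
    mean = statistics.fmean(samples)
    if mean == 0:
        raise ValueError("mean is zero, relative spread is undefined")
    return mean, statistics.stdev(samples) / mean


def looks_stable(samples, threshold=0.10):
    """Arbitrary cut-off (an assumption): stable if stddev is under 10% of the mean."""
    _, rel = relative_spread(samples)
    return rel < threshold
```

Feed it one migrate:mm_migrate_pages count per boot. If the relative spread across boots is comparable to the difference between kernels, the kernel-to-kernel comparison is not telling you much.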

                              Linus

Thread overview: 49+ messages
2015-03-07 15:20 [RFC PATCH 0/4] Automatic NUMA balancing and PROT_NONE handling followup v2r8 Mel Gorman
2015-03-07 15:20 ` [PATCH 1/4] mm: thp: Return the correct value for change_huge_pmd Mel Gorman
2015-03-07 20:13   ` Linus Torvalds
2015-03-07 20:31   ` Linus Torvalds
2015-03-07 20:56     ` Mel Gorman
2015-03-07 15:20 ` [PATCH 2/4] mm: numa: Remove migrate_ratelimited Mel Gorman
2015-03-07 15:20 ` [PATCH 3/4] mm: numa: Mark huge PTEs young when clearing NUMA hinting faults Mel Gorman
2015-03-07 18:33   ` Linus Torvalds
2015-03-07 18:42     ` Linus Torvalds
2015-03-07 15:20 ` [PATCH 4/4] mm: numa: Slow PTE scan rate if migration failures occur Mel Gorman
2015-03-07 16:36   ` Ingo Molnar
2015-03-07 17:37     ` Mel Gorman
2015-03-08  9:54       ` Ingo Molnar
2015-03-07 19:12     ` Linus Torvalds
2015-03-08 10:02       ` Ingo Molnar
2015-03-08 18:35         ` Linus Torvalds
2015-03-08 18:46           ` Linus Torvalds
2015-03-09 11:29           ` Dave Chinner
2015-03-09 16:52             ` Linus Torvalds
2015-03-09 19:19               ` Dave Chinner
2015-03-10 23:55                 ` Linus Torvalds
2015-03-12 13:10                   ` Mel Gorman
2015-03-12 16:20                     ` Linus Torvalds
2015-03-12 18:49                       ` Mel Gorman
2015-03-17  7:06                         ` Dave Chinner
2015-03-17 16:53                           ` Linus Torvalds
2015-03-17 20:51                             ` Dave Chinner
2015-03-17 21:30                               ` Linus Torvalds [this message]
2015-03-17 22:08                                 ` Dave Chinner
2015-03-18 16:08                                   ` Linus Torvalds
2015-03-18 17:31                                     ` Linus Torvalds
2015-03-18 22:23                                       ` Dave Chinner
2015-03-19 14:10                                       ` Mel Gorman
2015-03-19 18:09                                         ` Linus Torvalds
2015-03-19 21:41                                       ` Linus Torvalds
2015-03-19 22:41                                         ` Dave Chinner
2015-03-19 23:05                                           ` Linus Torvalds
2015-03-19 23:23                                             ` Dave Chinner
2015-03-20  0:23                                             ` Dave Chinner
2015-03-20  1:29                                               ` Linus Torvalds
2015-03-20  4:13                                                 ` Dave Chinner
2015-03-20 17:02                                                   ` Linus Torvalds
2015-03-23 12:01                                                     ` Mel Gorman
2015-03-20 10:12                                                 ` Mel Gorman
2015-03-20  9:56                                             ` Mel Gorman
2015-03-08 20:40         ` Mel Gorman
2015-03-09 21:02           ` Mel Gorman
2015-03-10 13:08             ` Mel Gorman
2015-03-08  9:41   ` Ingo Molnar
