linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Mel Gorman <mgorman@suse.de>
To: Peter Zijlstra <peterz@infradead.org>
Cc: ?????? <yun.wang@linux.alibaba.com>,
	Ingo Molnar <mingo@redhat.com>,
	Juri Lelli <juri.lelli@redhat.com>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Dietmar Eggemann <dietmar.eggemann@arm.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Ben Segall <bsegall@google.com>,
	Luis Chamberlain <mcgrof@kernel.org>,
	Kees Cook <keescook@chromium.org>,
	Iurii Zaikin <yzaikin@google.com>,
	Michal Koutn? <mkoutny@suse.com>,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-doc@vger.kernel.org,
	"Paul E. McKenney" <paulmck@linux.ibm.com>,
	Randy Dunlap <rdunlap@infradead.org>,
	Jonathan Corbet <corbet@lwn.net>
Subject: Re: [PATCH RESEND v8 1/2] sched/numa: introduce per-cgroup NUMA locality info
Date: Mon, 17 Feb 2020 11:58:10 +0000	[thread overview]
Message-ID: <20200217115810.GA3420@suse.de> (raw)
In-Reply-To: <20200214151048.GL14914@hirez.programming.kicks-ass.net>

On Fri, Feb 14, 2020 at 04:10:48PM +0100, Peter Zijlstra wrote:
> On Fri, Feb 07, 2020 at 11:35:30AM +0800, ?????? wrote:
> > By monitoring the increments, we will be able to locate the per-cgroup
> > workload which NUMA Balancing can't helpwith (usually caused by wrong
> > CPU and memory node bindings), then we got chance to fix that in time.
> > 
> > Cc: Mel Gorman <mgorman@suse.de>
> > Cc: Peter Zijlstra <peterz@infradead.org>
> > Cc: Michal Koutný <mkoutny@suse.com>
> > Signed-off-by: Michael Wang <yun.wang@linux.alibaba.com>
> 
> So here:
> 
>   https://lkml.kernel.org/r/20191127101932.GN28938@suse.de
> 
> Mel argues that the information exposed is fairly implementation
> specific and hard to use without understanding how NUMA balancing works.
> 
> By exposing it to userspace, we tie ourselves to these particulars. We
> can no longer change these NUMA balancing details if we wanted to, due
> to UAPI concerns.
> 
> Mel, I suspect you still feel that way, right?
> 

Yes, I still think it would be a struggle to interpret the data
meaningfully without very specific knowledge of the implementation. If
the scan rate was constant, it would be easier but that would make NUMA
balancing worse overall. Similarly, the stat might get very difficult to
interpret when NUMA balancing is failing because of a load imbalance,
pages are shared and being interleaved or NUMA groups span multiple
active nodes.

For example, the series that reconciles NUMA and CPU balancers may look
worse in these stats even though the overall performance may be better.

> In the document (patch 2/2) you write:
> 
> > +However, there are no hardware counters for per-task local/remote accessing
> > +info, we don't know how many remote page accesses have occurred for a
> > +particular task.
> 
> We can of course 'fix' that by adding a tracepoint.
> 
> Mel, would you feel better by having a tracepoint in task_numa_fault() ?
> 

A bit, although interpreting the data would still be difficult and the
tracepoint would have to include information about the cgroup. While
I've never tried, this seems like the type of thing that would be suited
to a BPF script that probes task_numa_fault and extract the information
it needs.

-- 
Mel Gorman
SUSE Labs

  reply	other threads:[~2020-02-17 11:58 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-02-07  3:34 [PATCH RESEND v8 0/2] sched/numa: introduce numa locality 王贇
2020-02-07  3:35 ` [PATCH RESEND v8 1/2] sched/numa: introduce per-cgroup NUMA locality info 王贇
2020-02-07  3:37   ` 王贇
2020-02-13  2:35     ` 王贇
2020-02-14 15:10   ` Peter Zijlstra
2020-02-17 11:58     ` Mel Gorman [this message]
2020-02-17 13:23       ` 王贇
2020-02-17 14:16         ` Mel Gorman
2020-02-18  1:39           ` 王贇
2020-02-21 14:20             ` Mel Gorman
2020-02-21 15:47               ` Peter Zijlstra
2020-02-24  3:13                 ` 王贇
2020-02-24  3:05               ` 王贇
2020-02-21 15:28         ` Peter Zijlstra
2020-02-24  3:09           ` 王贇
2020-02-07  3:35 ` [PATCH RESEND v8 2/2] sched/numa: documentation for per-cgroup numa 王贇

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200217115810.GA3420@suse.de \
    --to=mgorman@suse.de \
    --cc=bsegall@google.com \
    --cc=corbet@lwn.net \
    --cc=dietmar.eggemann@arm.com \
    --cc=juri.lelli@redhat.com \
    --cc=keescook@chromium.org \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mcgrof@kernel.org \
    --cc=mingo@redhat.com \
    --cc=mkoutny@suse.com \
    --cc=paulmck@linux.ibm.com \
    --cc=peterz@infradead.org \
    --cc=rdunlap@infradead.org \
    --cc=rostedt@goodmis.org \
    --cc=vincent.guittot@linaro.org \
    --cc=yun.wang@linux.alibaba.com \
    --cc=yzaikin@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).