linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Mel Gorman <mgorman@suse.de>
To: Bharata B Rao <bharata@amd.com>
Cc: linux-kernel@vger.kernel.org, mingo@redhat.com,
	peterz@infradead.org, juri.lelli@redhat.com,
	vincent.guittot@linaro.org, dietmar.eggemann@arm.com,
	rostedt@goodmis.org, bsegall@google.com, bristot@redhat.com,
	dishaa.talreja@amd.com, Wei Huang <wei.huang2@amd.com>
Subject: Re: [RFC PATCH v0 2/3] sched/numa: Add cumulative history of per-process fault stats
Date: Mon, 31 Jan 2022 12:17:20 +0000	[thread overview]
Message-ID: <20220131121720.GY3301@suse.de> (raw)
In-Reply-To: <20220128052851.17162-3-bharata@amd.com>

On Fri, Jan 28, 2022 at 10:58:50AM +0530, Bharata B Rao wrote:
> From: Disha Talreja <dishaa.talreja@amd.com>
> 
> The cumulative history of local/remote (lr) and private/shared (ps)
> will be used for calculating adaptive scan period.
> 

How it used to calculate adaptive scan period?

As it is likely used in a later patch, note here that the per-thread
stats are simply accumulated in the address space for now.

> Co-developed-by: Wei Huang <wei.huang2@amd.com>
> Signed-off-by: Wei Huang <wei.huang2@amd.com>
> Signed-off-by: Disha Talreja <dishaa.talreja@amd.com>
> Signed-off-by: Bharata B Rao <bharata@amd.com>
> ---
>  include/linux/mm_types.h |  2 ++
>  kernel/sched/fair.c      | 49 +++++++++++++++++++++++++++++++++++++++-
>  2 files changed, 50 insertions(+), 1 deletion(-)
> 
> diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
> index 4f978c09d3db..2c6f119b947f 100644
> --- a/include/linux/mm_types.h
> +++ b/include/linux/mm_types.h
> @@ -614,6 +614,8 @@ struct mm_struct {
>  		/* Process-based Adaptive NUMA */
>  		atomic_long_t faults_locality[2];
>  		atomic_long_t faults_shared[2];
> +		unsigned long faults_locality_history[2];
> +		unsigned long faults_shared_history[2];
>  
>  		spinlock_t pan_numa_lock;
>  		unsigned int numa_scan_period;
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 1d6404b2d42e..4911b3841d00 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -2102,14 +2102,56 @@ static void numa_group_count_active_nodes(struct numa_group *numa_group)
>  /**********************************************/
>  /*  Process-based Adaptive NUMA (PAN) Design  */
>  /**********************************************/
> +/*
> + * Update the cumulative history of local/remote and private/shared
> + * statistics. If the numbers are too small worthy of updating,
> + * return FALSE, otherwise return TRUE.
> + */
> +static bool pan_update_history(struct task_struct *p)
> +{
> +	unsigned long local, remote, shared, private;
> +	long diff;
> +	int i;
> +
> +	remote = atomic_long_read(&p->mm->faults_locality[0]);
> +	local = atomic_long_read(&p->mm->faults_locality[1]);
> +	shared = atomic_long_read(&p->mm->faults_shared[0]);
> +	private = atomic_long_read(&p->mm->faults_shared[1]);
> +
> +	/* skip if the activities in this window are too small */
> +	if (local + remote < 100)
> +		return false;
> +

Why 100?

> +	/* decay over the time window by 1/4 */
> +	diff = local - (long)(p->mm->faults_locality_history[1] / 4);
> +	p->mm->faults_locality_history[1] += diff;
> +	diff = remote - (long)(p->mm->faults_locality_history[0] / 4);
> +	p->mm->faults_locality_history[0] += diff;
> +
> +	/* decay over the time window by 1/2 */
> +	diff = shared - (long)(p->mm->faults_shared_history[0] / 2);
> +	p->mm->faults_shared_history[0] += diff;
> +	diff = private - (long)(p->mm->faults_shared_history[1] / 2);
> +	p->mm->faults_shared_history[1] += diff;
> +

Why are the decay windows different?


> +	/* clear the statistics for the next window */
> +	for (i = 0; i < 2; i++) {
> +		atomic_long_set(&(p->mm->faults_locality[i]), 0);
> +		atomic_long_set(&(p->mm->faults_shared[i]), 0);
> +	}
> +
> +	return true;
> +}
> +
>  /*
>   * Updates mm->numa_scan_period under mm->pan_numa_lock.
> - *
>   * Returns p->numa_scan_period now but updated to return
>   * p->mm->numa_scan_period in a later patch.
>   */

Spurious whitespace change.

>  static unsigned long pan_get_scan_period(struct task_struct *p)
>  {
> +	pan_update_history(p);
> +
>  	return p->numa_scan_period;
>  }
>  

Ok, so the spinlock is protecting the RMW of the PAN history. It still
may be a concern that task_numa_work gets aborted if the spinlock cannot
be acquired.

> @@ -2836,10 +2878,15 @@ static void task_numa_work(struct callback_head *work)
>  static void pan_init_numa(struct task_struct *p)
>  {
>  	struct mm_struct *mm = p->mm;
> +	int i;
>  
>  	spin_lock_init(&mm->pan_numa_lock);
>  	mm->numa_scan_period = sysctl_numa_balancing_scan_delay;
>  
> +	for (i = 0; i < 2; i++) {
> +		mm->faults_locality_history[i] = 0;
> +		mm->faults_shared_history[i] = 0;
> +	}
>  }
>  
>  void init_numa_balancing(unsigned long clone_flags, struct task_struct *p)
> -- 
> 2.25.1
> 

-- 
Mel Gorman
SUSE Labs

  reply	other threads:[~2022-01-31 12:17 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-01-28  5:28 [RFC PATCH v0 0/3] sched/numa: Process Adaptive autoNUMA Bharata B Rao
2022-01-28  5:28 ` [RFC PATCH v0 1/3] sched/numa: Process based autonuma scan period framework Bharata B Rao
2022-01-31 12:17   ` Mel Gorman
2022-02-01 12:22     ` Bharata B Rao
2022-02-01 14:15       ` Mel Gorman
2022-02-04 11:03         ` Bharata B Rao
2022-02-04 14:09           ` Mel Gorman
2023-06-21  5:50         ` Raghavendra K T
2022-01-28  5:28 ` [RFC PATCH v0 2/3] sched/numa: Add cumulative history of per-process fault stats Bharata B Rao
2022-01-31 12:17   ` Mel Gorman [this message]
2022-02-01 12:30     ` Bharata B Rao
2022-01-28  5:28 ` [RFC PATCH v0 3/3] sched/numa: Add adaptive scan period calculation Bharata B Rao
2022-01-31 12:17   ` Mel Gorman
2022-02-01 13:00     ` Bharata B Rao
2022-01-31 12:17 ` [RFC PATCH v0 0/3] sched/numa: Process Adaptive autoNUMA Mel Gorman
2022-02-01 13:07   ` Bharata B Rao

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220131121720.GY3301@suse.de \
    --to=mgorman@suse.de \
    --cc=bharata@amd.com \
    --cc=bristot@redhat.com \
    --cc=bsegall@google.com \
    --cc=dietmar.eggemann@arm.com \
    --cc=dishaa.talreja@amd.com \
    --cc=juri.lelli@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    --cc=rostedt@goodmis.org \
    --cc=vincent.guittot@linaro.org \
    --cc=wei.huang2@amd.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).