From: Reinette Chatre <reinette.chatre@intel.com>
To: James Morse <james.morse@arm.com>, <x86@kernel.org>,
	<linux-kernel@vger.kernel.org>
Cc: Fenghua Yu <fenghua.yu@intel.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	H Peter Anvin <hpa@zytor.com>, Babu Moger <Babu.Moger@amd.com>,
	<shameerali.kolothum.thodi@huawei.com>,
	D Scott Phillips OS <scott@os.amperecomputing.com>,
	<carl@os.amperecomputing.com>, <lcherian@marvell.com>,
	<bobo.shaobowang@huawei.com>, <tan.shaopeng@fujitsu.com>,
	<xingxin.hx@openanolis.org>, <baolin.wang@linux.alibaba.com>,
	Jamie Iles <quic_jiles@quicinc.com>,
	Xin Hao <xhao@linux.alibaba.com>, <peternewman@google.com>
Subject: Re: [PATCH v2 16/18] x86/resctrl: Allow overflow/limbo handlers to be scheduled on any-but cpu
Date: Thu, 2 Feb 2023 15:49:05 -0800
Message-ID: <ab71c33e-f1e3-63be-ac83-685d11c8b4eb@intel.com>
In-Reply-To: <20230113175459.14825-17-james.morse@arm.com>

Hi James,

On 1/13/2023 9:54 AM, James Morse wrote:
> When a cpu is taken offline resctrl may need to move the overflow or
> limbo handlers to run on a different CPU.

cpu -> CPU

> 
> Once the offline callbacks have been split, cqm_setup_limbo_handler()
> will be called while the CPU that is going offline is still present
> in the cpu_mask.
> 
> Pass the CPU to exclude to cqm_setup_limbo_handler() and
> mbm_setup_overflow_handler(). These functions can use cpumask_any_but()
> when selecting the CPU. -1 is used to indicate no CPUs need excluding.
> 
> Tested-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
> Signed-off-by: James Morse <james.morse@arm.com>
> ---
> 
> Both cpumask_any() and cpumask_any_but() return a value >= nr_cpu_ids
> on error. schedule_delayed_work_on() doesn't appear to check. Add the
> error handling to be robust. It doesn't look like it's possible to hit
> this.
> ---
>  arch/x86/kernel/cpu/resctrl/core.c     |  6 ++--
>  arch/x86/kernel/cpu/resctrl/internal.h |  6 ++--
>  arch/x86/kernel/cpu/resctrl/monitor.c  | 39 +++++++++++++++++++++-----
>  arch/x86/kernel/cpu/resctrl/rdtgroup.c |  4 +--
>  4 files changed, 42 insertions(+), 13 deletions(-)
> 
> diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
> index 749d9a450749..a3c171bd2de0 100644
> --- a/arch/x86/kernel/cpu/resctrl/core.c
> +++ b/arch/x86/kernel/cpu/resctrl/core.c
> @@ -557,12 +557,14 @@ static void domain_remove_cpu(int cpu, struct rdt_resource *r)
>  	if (r == &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl) {
>  		if (is_mbm_enabled() && cpu == d->mbm_work_cpu) {
>  			cancel_delayed_work(&d->mbm_over);
> -			mbm_setup_overflow_handler(d, 0);
> +			/* exclude_cpu=-1 as we already cpumask_clear_cpu()d */

Please do not use "we".
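
For example, any phrasing without "we" would work; an untested suggestion:

	/* exclude_cpu=-1 as this CPU was already cleared from cpu_mask */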

> +			mbm_setup_overflow_handler(d, 0, -1);
>  		}
>  		if (is_llc_occupancy_enabled() && cpu == d->cqm_work_cpu &&
>  		    has_busy_rmid(r, d)) {
>  			cancel_delayed_work(&d->cqm_limbo);
> -			cqm_setup_limbo_handler(d, 0);
> +			/* exclude_cpu=-1 as we already cpumask_clear_cpu()d */
> +			cqm_setup_limbo_handler(d, 0, -1);
>  		}
>  	}
>  }
> diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
> index a1bf97adee2e..d8c7a549b43a 100644
> --- a/arch/x86/kernel/cpu/resctrl/internal.h
> +++ b/arch/x86/kernel/cpu/resctrl/internal.h
> @@ -515,11 +515,13 @@ void mon_event_read(struct rmid_read *rr, struct rdt_resource *r,
>  		    struct rdt_domain *d, struct rdtgroup *rdtgrp,
>  		    int evtid, int first);
>  void mbm_setup_overflow_handler(struct rdt_domain *dom,
> -				unsigned long delay_ms);
> +				unsigned long delay_ms,
> +				int exclude_cpu);
>  void mbm_handle_overflow(struct work_struct *work);
>  void __init intel_rdt_mbm_apply_quirk(void);
>  bool is_mba_sc(struct rdt_resource *r);
> -void cqm_setup_limbo_handler(struct rdt_domain *dom, unsigned long delay_ms);
> +void cqm_setup_limbo_handler(struct rdt_domain *dom, unsigned long delay_ms,
> +			     int exclude_cpu);
>  void cqm_handle_limbo(struct work_struct *work);
>  bool has_busy_rmid(struct rdt_resource *r, struct rdt_domain *d);
>  void __check_limbo(struct rdt_domain *d, bool force_free);
> diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
> index 1a214bd32ed4..334fb3f1c6e2 100644
> --- a/arch/x86/kernel/cpu/resctrl/monitor.c
> +++ b/arch/x86/kernel/cpu/resctrl/monitor.c
> @@ -440,7 +440,7 @@ static void add_rmid_to_limbo(struct rmid_entry *entry)
>  		 * setup up the limbo worker.
>  		 */
>  		if (!has_busy_rmid(r, d))
> -			cqm_setup_limbo_handler(d, CQM_LIMBOCHECK_INTERVAL);
> +			cqm_setup_limbo_handler(d, CQM_LIMBOCHECK_INTERVAL, -1);
>  		set_bit(idx, d->rmid_busy_llc);
>  		entry->busy++;
>  	}
> @@ -773,15 +773,27 @@ void cqm_handle_limbo(struct work_struct *work)
>  	mutex_unlock(&rdtgroup_mutex);
>  }
>  
> -void cqm_setup_limbo_handler(struct rdt_domain *dom, unsigned long delay_ms)
> +/**
> + * cqm_setup_limbo_handler() - Schedule the limbo handler to run for this
> + *                             domain.
> + * @delay_ms:      How far in the future the handler should run.
> + * @exclude_cpu:   Which CPU the handler should not run on, -1 to pick any CPU.
> + */
> +void cqm_setup_limbo_handler(struct rdt_domain *dom, unsigned long delay_ms,
> +			     int exclude_cpu)
>  {
>  	unsigned long delay = msecs_to_jiffies(delay_ms);
>  	int cpu;
>  
> -	cpu = cpumask_any(&dom->cpu_mask);
> +	if (exclude_cpu == -1)
> +		cpu = cpumask_any(&dom->cpu_mask);
> +	else
> +		cpu = cpumask_any_but(&dom->cpu_mask, exclude_cpu);
> +
>  	dom->cqm_work_cpu = cpu;
>  

This assignment is unexpected considering the error handling that follows:
cqm_work_cpu can thus end up >= nr_cpu_ids. I assume this is to help during
domain removal, where the CPU being removed is compared against this value?
If this invalid CPU assignment is indeed done in support of a future code
path, could you please add a comment to explain it?
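
To make the question concrete, here is an untested sketch of the comment and
ordering I have in mind, assuming the nr_cpu_ids check this patch adds later
in the function and assuming my guess about domain_remove_cpu() is right:

	if (exclude_cpu == -1)
		cpu = cpumask_any(&dom->cpu_mask);
	else
		cpu = cpumask_any_but(&dom->cpu_mask, exclude_cpu);

	/*
	 * cpu can be >= nr_cpu_ids here. Record it anyway so that
	 * domain_remove_cpu() can compare the CPU going offline against
	 * cqm_work_cpu, but only schedule the work on a valid CPU.
	 */
	dom->cqm_work_cpu = cpu;

	if (cpu < nr_cpu_ids)
		schedule_delayed_work_on(cpu, &dom->cqm_limbo, delay);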

Reinette

Thread overview: 73+ messages
2023-01-13 17:54 [PATCH v2 00/18] x86/resctrl: monitored closid+rmid together, separate arch/fs locking James Morse
2023-01-13 17:54 ` [PATCH v2 01/18] x86/resctrl: Track the closid with the rmid James Morse
2023-01-13 17:54 ` [PATCH v2 02/18] x86/resctrl: Access per-rmid structures by index James Morse
2023-01-17 18:28   ` Yu, Fenghua
2023-03-03 18:33     ` James Morse
2023-01-17 18:29   ` Yu, Fenghua
2023-02-02 23:44   ` Reinette Chatre
2023-03-03 18:33     ` James Morse
2023-01-13 17:54 ` [PATCH v2 03/18] x86/resctrl: Create helper for RMID allocation and mondata dir creation James Morse
2023-01-13 17:54 ` [PATCH v2 04/18] x86/resctrl: Move rmid allocation out of mkdir_rdt_prepare() James Morse
2023-01-17 18:28   ` Yu, Fenghua
2023-02-02 23:45   ` Reinette Chatre
2023-03-03 18:33     ` James Morse
2023-01-13 17:54 ` [PATCH v2 05/18] x86/resctrl: Allow RMID allocation to be scoped by CLOSID James Morse
2023-01-17 18:53   ` Yu, Fenghua
2023-03-03 18:34     ` James Morse
2023-02-02 23:45   ` Reinette Chatre
2023-03-03 18:34     ` James Morse
2023-03-10 19:57       ` Reinette Chatre
2023-01-13 17:54 ` [PATCH v2 06/18] x86/resctrl: Allow the allocator to check if a CLOSID can allocate clean RMID James Morse
2023-01-17 18:29   ` Yu, Fenghua
2023-03-03 18:35     ` James Morse
2023-02-02 23:46   ` Reinette Chatre
2023-03-03 18:36     ` James Morse
2023-03-10 19:59       ` Reinette Chatre
2023-03-20 17:12         ` James Morse
2023-01-13 17:54 ` [PATCH v2 07/18] x86/resctrl: Move CLOSID/RMID matching and setting to use helpers James Morse
2023-01-17 19:10   ` Yu, Fenghua
2023-03-03 18:37     ` James Morse
2023-02-02 23:47   ` Reinette Chatre
2023-03-06 11:32     ` James Morse
2023-03-08 10:30       ` Peter Newman
2023-03-10 20:00       ` Reinette Chatre
2023-01-13 17:54 ` [PATCH v2 08/18] x86/resctrl: Queue mon_event_read() instead of sending an IPI James Morse
2023-01-17 18:29   ` Yu, Fenghua
2023-03-06 11:32     ` James Morse
2023-03-10 20:00       ` Reinette Chatre
2023-02-02 23:47   ` Reinette Chatre
2023-03-06 11:33     ` James Morse
2023-03-08 16:09       ` James Morse
2023-03-10 20:06         ` Reinette Chatre
2023-03-20 17:12           ` James Morse
2023-01-13 17:54 ` [PATCH v2 09/18] x86/resctrl: Allow resctrl_arch_rmid_read() to sleep James Morse
2023-01-23 13:54   ` Peter Newman
2023-03-06 11:33     ` James Morse
2023-01-23 15:33   ` Peter Newman
2023-03-06 11:33     ` James Morse
2023-03-06 13:14       ` Peter Newman
2023-03-08 17:45         ` James Morse
2023-03-09 13:41           ` Peter Newman
2023-03-09 17:35             ` James Morse
2023-03-10  9:28               ` Peter Newman
2023-03-20 17:12                 ` James Morse
2023-03-22 13:21                   ` Peter Newman
2023-01-13 17:54 ` [PATCH v2 10/18] x86/resctrl: Allow arch to allocate memory needed in resctrl_arch_rmid_read() James Morse
2023-01-13 17:54 ` [PATCH v2 11/18] x86/resctrl: Make resctrl_mounted checks explicit James Morse
2023-01-13 17:54 ` [PATCH v2 12/18] x86/resctrl: Move alloc/mon static keys into helpers James Morse
2023-01-13 17:54 ` [PATCH v2 13/18] x86/resctrl: Make rdt_enable_key the arch's decision to switch James Morse
2023-02-02 23:48   ` Reinette Chatre
2023-01-13 17:54 ` [PATCH v2 14/18] x86/resctrl: Add helpers for system wide mon/alloc capable James Morse
2023-01-25  7:16   ` Shaopeng Tan (Fujitsu)
2023-03-06 11:34     ` James Morse
2023-01-13 17:54 ` [PATCH v2 15/18] x86/resctrl: Add cpu online callback for resctrl work James Morse
2023-01-13 17:54 ` [PATCH v2 16/18] x86/resctrl: Allow overflow/limbo handlers to be scheduled on any-but cpu James Morse
2023-02-02 23:49   ` Reinette Chatre [this message]
2023-03-06 11:34     ` James Morse
2023-01-13 17:54 ` [PATCH v2 17/18] x86/resctrl: Add cpu offline callback for resctrl work James Morse
2023-01-13 17:54 ` [PATCH v2 18/18] x86/resctrl: Separate arch and fs resctrl locks James Morse
2023-02-02 23:50   ` Reinette Chatre
2023-03-06 11:34     ` James Morse
2023-03-11  0:22       ` Reinette Chatre
2023-03-20 17:12         ` James Morse
2023-01-25  7:19 ` [PATCH v2 00/18] x86/resctrl: monitored closid+rmid together, separate arch/fs locking Shaopeng Tan (Fujitsu)
