From: Reinette Chatre <reinette.chatre@intel.com>
To: James Morse <james.morse@arm.com>, <x86@kernel.org>,
	<linux-kernel@vger.kernel.org>
Cc: Fenghua Yu <fenghua.yu@intel.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	H Peter Anvin <hpa@zytor.com>, Babu Moger <Babu.Moger@amd.com>,
	<shameerali.kolothum.thodi@huawei.com>,
	D Scott Phillips OS <scott@os.amperecomputing.com>,
	<carl@os.amperecomputing.com>, <lcherian@marvell.com>,
	<bobo.shaobowang@huawei.com>, <tan.shaopeng@fujitsu.com>,
	<baolin.wang@linux.alibaba.com>,
	Jamie Iles <quic_jiles@quicinc.com>,
	Xin Hao <xhao@linux.alibaba.com>, <peternewman@google.com>,
	<dfustini@baylibre.com>, <amitsinght@marvell.com>
Subject: Re: [PATCH v7 21/24] x86/resctrl: Allow overflow/limbo handlers to be scheduled on any-but cpu
Date: Thu, 14 Dec 2023 10:53:24 -0800
Message-ID: <9a66b7f8-097b-45af-97b2-4c403c295301@intel.com>
In-Reply-To: <b077b38f-b42c-f679-1e08-70b55d116e17@arm.com>

Hi James,

On 12/14/2023 3:38 AM, James Morse wrote:
> On 09/11/2023 17:48, Reinette Chatre wrote:
>> On 10/25/2023 11:03 AM, James Morse wrote:

>>> diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
>>> index c4c1e1909058..f5fff2f0d866 100644
>>> --- a/arch/x86/kernel/cpu/resctrl/internal.h
>>> +++ b/arch/x86/kernel/cpu/resctrl/internal.h
>>> @@ -61,19 +61,36 @@
>>>   * cpumask_any_housekeeping() - Choose any CPU in @mask, preferring those that
>>>   *			        aren't marked nohz_full
>>>   * @mask:	The mask to pick a CPU from.
>>> + * @exclude_cpu:The CPU to avoid picking.
>>>   *
>>> - * Returns a CPU in @mask. If there are housekeeping CPUs that don't use
>>> - * nohz_full, these are preferred.
>>> + * Returns a CPU from @mask, but not @exclude_cpu. If there are housekeeping
>>> + * CPUs that don't use nohz_full, these are preferred. Pass
>>> + * RESCTRL_PICK_ANY_CPU to avoid excluding any CPUs.
>>> + *
>>> + * When a CPU is excluded, returns >= nr_cpu_ids if no CPUs are available.
>>>   */
>>> -static inline unsigned int cpumask_any_housekeeping(const struct cpumask *mask)
>>> +static inline unsigned int
>>> +cpumask_any_housekeeping(const struct cpumask *mask, int exclude_cpu)
>>>  {
>>>  	unsigned int cpu, hk_cpu;
>>>  
>>> -	cpu = cpumask_any(mask);
>>> -	if (!tick_nohz_full_cpu(cpu))
>>> +	if (exclude_cpu == RESCTRL_PICK_ANY_CPU)
>>> +		cpu = cpumask_any(mask);
>>> +	else
>>> +		cpu = cpumask_any_but(mask, exclude_cpu);
>>> +
>>> +	if (!IS_ENABLED(CONFIG_NO_HZ_FULL))
>>> +		return cpu;
>>
>> It is not clear to me how cpumask_any_but() failure is handled.
>>
>> cpumask_any_but() returns ">= nr_cpu_ids if no cpus set" ...
> 
> It wasn't a satisfiable request, there are no CPUs for this domain other than the one that
> was excluded. cpumask_any_but() also describes its errors as "returns >= nr_cpu_ids if no
> CPUs are available".
> 
> The places this can happen in resctrl are:
> cqm_setup_limbo_handler(), where it causes the schedule_delayed_work_on() call to be skipped.
> mbm_setup_overflow_handler(), which does similar.
> 
> These two cases are triggered from resctrl_offline_cpu() when the last CPU in a domain is
> going offline, and the domain is about to be free()d. This is how the limbo/overflow
> 'threads' stop.

Right ... yet this is a generic function. If there are any requirements on when/how it
should be called then they need to be specified in the function comments. I do not expect
this to be the case for this function.
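
For illustration only, the caller-side handling described in the quoted text could
be modelled in userspace roughly as below. This is a sketch under assumptions, not
the resctrl code: a 64-bit word stands in for struct cpumask, NR_CPU_IDS stands in
for nr_cpu_ids, and mask_any_but(), struct domain_model and setup_handler() are
hypothetical names. Unlike cpumask_any_but(), the model always picks the lowest
numbered CPU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_CPU_IDS 8u	/* assumption: stands in for nr_cpu_ids */

/* Lowest CPU set in mask other than exclude_cpu, or NR_CPU_IDS if none. */
static unsigned int mask_any_but(uint64_t mask, unsigned int exclude_cpu)
{
	unsigned int cpu;

	assert(exclude_cpu < NR_CPU_IDS);
	for (cpu = 0; cpu < NR_CPU_IDS; cpu++) {
		if (cpu != exclude_cpu && (mask & (UINT64_C(1) << cpu)))
			return cpu;
	}
	return NR_CPU_IDS;
}

/* Hypothetical stand-in for a domain with one delayed-work handler. */
struct domain_model {
	uint64_t cpu_mask;
	int handler_cpu;	/* -1 when no handler is scheduled */
};

/*
 * Hypothetical caller: pick a CPU other than exclude_cpu to run the handler.
 * Returns false, and schedules nothing, when no other CPU is left in the
 * domain. That is the "last CPU going offline" case.
 */
static bool setup_handler(struct domain_model *d, unsigned int exclude_cpu)
{
	unsigned int cpu = mask_any_but(d->cpu_mask, exclude_cpu);

	if (cpu >= NR_CPU_IDS) {
		d->handler_cpu = -1;
		return false;
	}

	d->handler_cpu = (int)cpu;
	return true;
}
```

The point of the model is that the ">= nr_cpu_ids" return is part of the helper's
contract, so a caller can rely on it without knowing when the helper is called.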

>>> +
>>> +	/* If the CPU picked isn't marked nohz_full, we're done */
>>
>> Please don't impersonate code.
>>
>>> +	if (cpu <= nr_cpu_ids && !tick_nohz_full_cpu(cpu))
>>>  		return cpu;
>>
>> Is this intended to be "cpu < nr_cpu_ids"?
> 
> Yes, fixed - thanks!
> 
> 
>> But that would have
>> code continue ... so maybe it needs explicit error check of
>> cpumask_any_but() failure with an earlier exit?
> 
> I'm not sure what the problem you refer to here is.
> If 'cpu' is valid, and not marked nohz_full, nothing more needs doing.
> If 'cpu' is invalid or a CPU marked nohz_full, then a second attempt is made to find a
> housekeeping CPU into 'hk_cpu'. If the second attempt is valid, it's used in preference.

Considering that the second attempt can only be on the same or smaller set of CPUs,
how could the second attempt ever succeed if the first attempt failed? I do not see
why it is worth continuing.
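
To make the subset argument concrete, here is a minimal userspace sketch of the
early exit. It is a model under assumptions, not the patch: 64-bit words stand in
for the CPU mask and for the nohz_full mask, MODEL_NR_CPUS stands in for
nr_cpu_ids, MODEL_PICK_ANY_CPU stands in for RESCTRL_PICK_ANY_CPU, and
model_any_but() and model_any_housekeeping() are hypothetical names. The model
always picks the lowest numbered CPU, which the real helpers do not promise.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_NR_CPUS		8u	/* assumption: stands in for nr_cpu_ids */
#define MODEL_PICK_ANY_CPU	(-1)	/* stands in for RESCTRL_PICK_ANY_CPU */

/* Lowest CPU set in mask other than exclude_cpu, or MODEL_NR_CPUS if none. */
static unsigned int model_any_but(uint64_t mask, int exclude_cpu)
{
	unsigned int cpu;

	assert(exclude_cpu < (int)MODEL_NR_CPUS);
	for (cpu = 0; cpu < MODEL_NR_CPUS; cpu++) {
		if ((int)cpu != exclude_cpu && (mask & (UINT64_C(1) << cpu)))
			return cpu;
	}
	return MODEL_NR_CPUS;
}

/*
 * Pick a CPU from mask, avoiding exclude_cpu and preferring CPUs that are
 * not set in nohz_full. Returns >= MODEL_NR_CPUS if no CPU is available.
 */
static unsigned int model_any_housekeeping(uint64_t mask, uint64_t nohz_full,
					   int exclude_cpu)
{
	unsigned int cpu, hk_cpu;

	cpu = model_any_but(mask, exclude_cpu);

	/*
	 * The first attempt failed. The housekeeping search below looks at a
	 * subset of the same CPUs, so it cannot succeed either.
	 */
	if (cpu >= MODEL_NR_CPUS)
		return cpu;

	/* The CPU picked is not nohz_full, use it. */
	if (!(nohz_full & (UINT64_C(1) << cpu)))
		return cpu;

	/* Prefer a housekeeping CPU if one exists, else keep the first pick. */
	hk_cpu = model_any_but(mask & ~nohz_full, exclude_cpu);
	if (hk_cpu < MODEL_NR_CPUS)
		cpu = hk_cpu;

	return cpu;
}
```

With the explicit check, the second search only runs when the first pick was a
valid but nohz_full CPU.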

> An error is returned if the request couldn't be satisfied, i.e. an empty mask was passed,
> or the only CPU set in the mask was excluded.
> There is a second attempt in this case for a housekeeping CPU - but that will fail too.
> As above, this only happens when CPUs are going offline, and this error is handled by the
> caller.

Reinette
