From: Reinette Chatre <reinette.chatre@intel.com>
To: James Morse <james.morse@arm.com>, <x86@kernel.org>,
	<linux-kernel@vger.kernel.org>
Cc: Fenghua Yu <fenghua.yu@intel.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	H Peter Anvin <hpa@zytor.com>, Babu Moger <Babu.Moger@amd.com>,
	<shameerali.kolothum.thodi@huawei.com>,
	Jamie Iles <jamie@nuviainc.com>,
	"D Scott Phillips OS" <scott@os.amperecomputing.com>,
	<lcherian@marvell.com>, <bobo.shaobowang@huawei.com>,
	<tan.shaopeng@fujitsu.com>
Subject: Re: [PATCH v3 19/21] x86/resctrl: Rename and change the units of resctrl_cqm_threshold
Date: Fri, 1 Apr 2022 15:55:13 -0700
Message-ID: <72d7ca0d-7235-2ac5-4639-38d27aafa222@intel.com>
In-Reply-To: <1d4220ef-277d-fbb0-edb7-14f09bae0c23@arm.com>

Hi James,

On 3/30/2022 9:45 AM, James Morse wrote:
> Hi Reinette,
> 
> On 17/03/2022 17:00, Reinette Chatre wrote:
>> On 2/17/2022 10:21 AM, James Morse wrote:
>>> resctrl_cqm_threshold is stored in a hardware specific chunk size,
>>> but exposed to user-space as bytes.
>>>
>>> This means the filesystem parts of resctrl need to know how the hardware
>>> counts, to convert the user provided byte value to chunks. The interface
>>> between the architecture's resctrl code and the filesystem ought to
>>> treat everything as bytes.
>>>
>>> Change the unit of resctrl_cqm_threshold to bytes. resctrl_arch_rmid_read()
>>> still returns its value in chunks, so this needs converting to bytes.
>>> As all the callers have been touched, rename the variable to
>>> resctrl_rmid_realloc_threshold, which describes what the value is for.
> 
>>> @@ -762,10 +763,7 @@ int rdt_get_mon_l3_config(struct rdt_resource *r)
>>>  	 *
>>>  	 * For a 35MB LLC and 56 RMIDs, this is ~1.8% of the LLC.
>>>  	 */
>>> -	resctrl_cqm_threshold = cl_size * 1024 / r->num_rmid;
>>> -
>>> -	/* h/w works in units of "boot_cpu_data.x86_cache_occ_scale" */
>>> -	resctrl_cqm_threshold /= hw_res->mon_scale;
>>> +	resctrl_rmid_realloc_threshold = cl_size * 1024 / r->num_rmid;
>>>  
>>>  	ret = dom_data_init(r);
>>>  	if (ret)
>>> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>>> index 7ec089d72ab7..93b3697027df 100644
>>> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>>> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>>> @@ -1030,10 +1030,7 @@ static int rdt_delay_linear_show(struct kernfs_open_file *of,
>>>  static int max_threshold_occ_show(struct kernfs_open_file *of,
>>>  				  struct seq_file *seq, void *v)
>>>  {
>>> -	struct rdt_resource *r = of->kn->parent->priv;
>>> -	struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
>>> -
>>> -	seq_printf(seq, "%u\n", resctrl_cqm_threshold * hw_res->mon_scale);
>>> +	seq_printf(seq, "%u\n", resctrl_rmid_realloc_threshold);
>>>  
>>>  	return 0;
>>>  }
>>
>>
>> This change has some user visible impact that I am still digesting but thought
>> that I would share for your consideration.
>>
>> As seen in the above two snippets, the original code did:
>>
>> resctrl_cqm_threshold /= hw_res->mon_scale; /* resctrl_cqm_threshold used internally */
>>
>> resctrl_cqm_threshold * hw_res->mon_scale; /* this is displayed to user */
>>
>> The original loss due to truncation during the division is not recovered
>> when the value is displayed to the user, so the user may see significant
>> differences before and after this patch.
>>
>> I tried this out on a system with a large cache and the difference between
>> the before and after values is significant:
>> Before this patch:
>> info/L3_MON/max_threshold_occupancy:147456
>>
>> After this patch:
>> info/L3_MON/max_threshold_occupancy:196608
> 
> Hmm. I hadn't considered that information would be lost by the current way of doing this.
> It looks like this happens because num_rmid isn't necessarily a power of 2.
> 
> 
>> As I understand this change indeed represents the information more accurately but
>> I found it noteworthy that this is not just a simple "change the units" and
>> may thus have broader impact and may indeed result in different behavior that
>> should be considered.
> 
> I agree it more accurately reflects resctrl's calculation of "the number
> of lines tagged per RMID if all RMIDs have the same number of lines", but if that
> produces a number the hardware will never actually measure, then the rounding is still
> happening, but somewhere else.
> 
> I think the right thing to do is round resctrl_rmid_realloc_threshold down to the nearest
> multiple of hw_res->mon_scale in rdt_get_mon_l3_config(). This way the filesystem parts
> still handle things in bytes, and the architecture code provides the quantised value that
> will actually get measured. It's this value that should be reported to user-space.
> 
> It doesn't look like the 'Upscaling Factor' is guaranteed to be a power of 2, so I can't
> use the round_down() helpers.
> 
> I've added this to the commit message:
> | Neither r->num_rmid nor hw_res->mon_scale are guaranteed to be a power
> | of 2, so the existing code introduces a rounding error from resctrl's
> | theoretical fraction of the cache usage. This behaviour is kept as it
> | ensures the user visible value matches the value read from hardware
> | when the rmid will be reallocated.
> 
> and the hunk below, which fixes it for me.
> 
> 
> 
> Thanks,
> 
> James
> 
> ---------------%<---------------
> diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
> index b18e227d585c..fb81d650c457 100644
> --- a/arch/x86/kernel/cpu/resctrl/monitor.c
> +++ b/arch/x86/kernel/cpu/resctrl/monitor.c
> @@ -753,6 +753,7 @@ int rdt_get_mon_l3_config(struct rdt_resource *r)
>         unsigned int mbm_offset = boot_cpu_data.x86_cache_mbm_width_offset;
>         struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
>         unsigned int cl_size = boot_cpu_data.x86_cache_size;
> +       u64 threshold;
>         int ret;
> 
>         hw_res->mon_scale = boot_cpu_data.x86_cache_occ_scale;
> @@ -771,7 +772,15 @@ int rdt_get_mon_l3_config(struct rdt_resource *r)
>          *
>          * For a 35MB LLC and 56 RMIDs, this is ~1.8% of the LLC.
>          */
> -       resctrl_rmid_realloc_threshold = cl_size * 1024 / r->num_rmid;
> +       threshold = cl_size * 1024 / r->num_rmid;
> +
> +       /*
> +        * Because num_rmid may not be a power of two, round the value
> +        * down to a multiple of hw_res->mon_scale so it matches a value
> +        * the hardware will measure. mon_scale may not be a power of 2.
> +        */
> +       threshold /= hw_res->mon_scale;
> +       resctrl_rmid_realloc_threshold = threshold * hw_res->mon_scale;
> 
>         ret = dom_data_init(r);
>         if (ret)
> ---------------%<---------------

Thank you for the added explanation. From what I can tell this also restores current
behavior.

Reinette
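
For reference, the rounding discussed above can be sketched in user-space C.
This is only an illustration, not the kernel code: the helper names
quantise_down() and mask_round_down() are hypothetical, and the scale of
73728 bytes used in the note below is an assumed value chosen because it
reproduces the two numbers quoted in this thread; the actual mon_scale of
the test system is not stated.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Round 'bytes' down to a multiple of 'scale' using divide-then-multiply,
 * as the hunk above does. This works for any non-zero scale.
 * (Hypothetical helper name, for illustration only.)
 */
static uint64_t quantise_down(uint64_t bytes, uint64_t scale)
{
	assert(scale != 0);
	return (bytes / scale) * scale;
}

/*
 * Mask-based rounding in the style of power-of-2 round-down helpers.
 * The result is a multiple of 'scale' only when 'scale' is a power of 2.
 * (Hypothetical helper name, for illustration only.)
 */
static uint64_t mask_round_down(uint64_t bytes, uint64_t scale)
{
	return bytes & ~(scale - 1);
}
```

With the assumed scale of 73728, quantise_down(196608, 73728) gives 147456,
which matches the value reported before the patch. mask_round_down() with
the same inputs gives 131072, which is not a multiple of 73728, so a
mask-based helper would not be a safe substitute when the scale is not a
power of 2.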
