From: James Morse <james.morse@arm.com>
To: Reinette Chatre <reinette.chatre@intel.com>,
	x86@kernel.org, linux-kernel@vger.kernel.org
Cc: Fenghua Yu <fenghua.yu@intel.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	H Peter Anvin <hpa@zytor.com>, Babu Moger <Babu.Moger@amd.com>,
	shameerali.kolothum.thodi@huawei.com,
	Jamie Iles <jamie@nuviainc.com>,
	D Scott Phillips OS <scott@os.amperecomputing.com>,
	lcherian@marvell.com, bobo.shaobowang@huawei.com,
	tan.shaopeng@fujitsu.com
Subject: Re: [PATCH v3 19/21] x86/resctrl: Rename and change the units of resctrl_cqm_threshold
Date: Wed, 30 Mar 2022 17:45:56 +0100
Message-ID: <1d4220ef-277d-fbb0-edb7-14f09bae0c23@arm.com>
In-Reply-To: <87c00fe2-e4fc-b006-f608-3dc2a209ed77@intel.com>

Hi Reinette,

On 17/03/2022 17:00, Reinette Chatre wrote:
> On 2/17/2022 10:21 AM, James Morse wrote:
>> resctrl_cqm_threshold is stored in a hardware specific chunk size,
>> but exposed to user-space as bytes.
>>
>> This means the filesystem parts of resctrl need to know how the hardware
>> counts, to convert the user provided byte value to chunks. The interface
>> between the architecture's resctrl code and the filesystem ought to
>> treat everything as bytes.
>>
>> Change the unit of resctrl_cqm_threshold to bytes. resctrl_arch_rmid_read()
>> still returns its value in chunks, so this needs converting to bytes.
>> As all the callers have been touched, rename the variable to
>> resctrl_rmid_realloc_threshold, which describes what the value is for.

>> @@ -762,10 +763,7 @@ int rdt_get_mon_l3_config(struct rdt_resource *r)
>>  	 *
>>  	 * For a 35MB LLC and 56 RMIDs, this is ~1.8% of the LLC.
>>  	 */
>> -	resctrl_cqm_threshold = cl_size * 1024 / r->num_rmid;
>> -
>> -	/* h/w works in units of "boot_cpu_data.x86_cache_occ_scale" */
>> -	resctrl_cqm_threshold /= hw_res->mon_scale;
>> +	resctrl_rmid_realloc_threshold = cl_size * 1024 / r->num_rmid;
>>  
>>  	ret = dom_data_init(r);
>>  	if (ret)
>> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> index 7ec089d72ab7..93b3697027df 100644
>> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>> @@ -1030,10 +1030,7 @@ static int rdt_delay_linear_show(struct kernfs_open_file *of,
>>  static int max_threshold_occ_show(struct kernfs_open_file *of,
>>  				  struct seq_file *seq, void *v)
>>  {
>> -	struct rdt_resource *r = of->kn->parent->priv;
>> -	struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
>> -
>> -	seq_printf(seq, "%u\n", resctrl_cqm_threshold * hw_res->mon_scale);
>> +	seq_printf(seq, "%u\n", resctrl_rmid_realloc_threshold);
>>  
>>  	return 0;
>>  }
> 
> 
> This change has some user visible impact that I am still digesting but thought
> that I would share for your consideration.
> 
> As seen in the above two snippets, the original code did:
> 
> resctrl_cqm_threshold /= hw_res->mon_scale; /* resctrl_cqm_threshold used internally */
> 
> resctrl_cqm_threshold * hw_res->mon_scale; /* this is displayed to user */
> 
> The original loss due to truncation during the division is not recovered
> when the value is displayed to the user, so the user may see significant
> differences before and after this patch.
> 
> I tried this out on a system with a large cache and the difference between
> the before and after values is significant:
> Before this patch:
> info/L3_MON/max_threshold_occupancy:147456
> 
> After this patch:
> info/L3_MON/max_threshold_occupancy:196608

Hmm. I hadn't considered that information would be lost by the current way of doing this.
It looks like this happens because num_rmid isn't necessarily a power of 2.
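
For illustration, here is a rough user-space sketch of the truncation (not kernel code).
The 196608 and 147456 figures come from the report above. The mon_scale of 73728 is an
assumption, chosen only because it reproduces those figures, and the function names are
hypothetical.

```c
#include <stdint.h>

/*
 * Old behaviour: the threshold is stored in hardware chunks, so the
 * integer division drops any remainder smaller than one chunk.
 */
static uint64_t threshold_in_chunks(uint64_t bytes_per_rmid, uint64_t mon_scale)
{
	return bytes_per_rmid / mon_scale;
}

/* Old behaviour: the stored chunk count is scaled back up for display. */
static uint64_t old_displayed_bytes(uint64_t bytes_per_rmid, uint64_t mon_scale)
{
	return threshold_in_chunks(bytes_per_rmid, mon_scale) * mon_scale;
}

/* v3 of this patch: the byte value is stored and displayed unmodified. */
static uint64_t v3_displayed_bytes(uint64_t bytes_per_rmid)
{
	return bytes_per_rmid;
}
```

With bytes_per_rmid = 196608 and the assumed mon_scale = 73728, the old path stores
2 chunks and displays 147456, while the v3 path displays 196608.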


> As I understand it, this change indeed represents the information more
> accurately, but I found it noteworthy that this is not just a simple "change
> the units": it may have broader impact and may result in different behavior
> that should be considered.

I agree it more accurately reflects resctrl's calculation of "the number
of lines tagged per RMID if all RMIDs have the same number of lines", but if that
produces a number the hardware will never actually measure, then the rounding is still
happening, but somewhere else.

I think the right thing to do is round resctrl_rmid_realloc_threshold down to the nearest
multiple of hw_res->mon_scale in rdt_get_mon_l3_config(). This way the filesystem parts
still handle things in bytes, and the architecture code provides the quantised value that
will actually get measured. It's this value that should be reported to user-space.

It doesn't look like the 'Upscaling Factor' is guaranteed to be a power of 2, so I can't
use the round_down() helpers.
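
To show why a mask-based helper is unsuitable here, the following user-space sketch
compares power-of-two rounding by masking with rounding by divide-then-multiply.
mask_round_down() only mimics the masking approach and is not the kernel macro itself;
both function names are hypothetical, and the scale of 73728 is an assumed
non-power-of-2 value.

```c
#include <stdint.h>

/*
 * Round down by clearing the low bits. This is only correct when
 * 'multiple' is a power of 2.
 */
static uint64_t mask_round_down(uint64_t value, uint64_t multiple)
{
	return value & ~(multiple - 1);
}

/* Round down by integer division. This is correct for any non-zero 'multiple'. */
static uint64_t div_round_down(uint64_t value, uint64_t multiple)
{
	return (value / multiple) * multiple;
}
```

For a power-of-2 scale such as 65536 both functions agree. For 73728 the masking
version returns 131072, which is not a multiple of 73728, whereas the division
version returns 147456.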

I've added this to the commit message:
| Neither r->num_rmid nor hw_res->mon_scale is guaranteed to be a power
| of 2, so the existing code introduces a rounding error from resctrl's
| theoretical fraction of the cache usage. This behaviour is kept as it
| ensures the user visible value matches the value read from hardware
| when the RMID will be reallocated.

and the hunk below, which fixes it for me.



Thanks,

James

---------------%<---------------
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index b18e227d585c..fb81d650c457 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -753,6 +753,7 @@ int rdt_get_mon_l3_config(struct rdt_resource *r)
        unsigned int mbm_offset = boot_cpu_data.x86_cache_mbm_width_offset;
        struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
        unsigned int cl_size = boot_cpu_data.x86_cache_size;
+       u64 threshold;
        int ret;

        hw_res->mon_scale = boot_cpu_data.x86_cache_occ_scale;
@@ -771,7 +772,15 @@ int rdt_get_mon_l3_config(struct rdt_resource *r)
         *
         * For a 35MB LLC and 56 RMIDs, this is ~1.8% of the LLC.
         */
-       resctrl_rmid_realloc_threshold = cl_size * 1024 / r->num_rmid;
+       threshold = cl_size * 1024 / r->num_rmid;
+
+       /*
+        * Because num_rmid may not be a power of two, round the value down
+        * to the nearest multiple of hw_res->mon_scale so it matches a
+        * value the hardware will measure. mon_scale may not be a power of 2.
+        */
+       threshold /= hw_res->mon_scale;
+       resctrl_rmid_realloc_threshold = threshold * hw_res->mon_scale;

        ret = dom_data_init(r);
        if (ret)
---------------%<---------------

