linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 1/2] x86/resctrl: Fix event counts regression in reused RMIDs
@ 2022-12-14 16:08 Peter Newman
  2022-12-14 16:08 ` [PATCH v2 2/2] x86/resctrl: Avoid redundant counter read in __mon_event_count() Peter Newman
  2022-12-17  0:59 ` [PATCH v2 1/2] x86/resctrl: Fix event counts regression in reused RMIDs Reinette Chatre
  0 siblings, 2 replies; 5+ messages in thread
From: Peter Newman @ 2022-12-14 16:08 UTC (permalink / raw)
  To: reinette.chatre, fenghua.yu
  Cc: Babu.Moger, bp, dave.hansen, eranian, hpa, james.morse,
	linux-kernel, mingo, quic_jiles, tan.shaopeng, tglx, x86,
	Peter Newman

When creating a new monitoring group, the RMID allocated for it may have
been used by a group which was previously removed. In this case, the
hardware counters will have non-zero values which should be deducted
from what is reported in the new group's counts.

resctrl_arch_reset_rmid() initializes the prev_msr value for counters to
0, causing the initial count to be charged to the new group. Resurrect
__rmid_read() and use it to initialize prev_msr correctly.

Unlike before, __rmid_read() checks for error bits in the MSR read so
that callers don't need to.

Fixes: 1d81d15db39c ("x86/resctrl: Move mbm_overflow_count() into resctrl_arch_rmid_read()")
Signed-off-by: Peter Newman <peternewman@google.com>
---
 arch/x86/kernel/cpu/resctrl/monitor.c | 49 ++++++++++++++++++---------
 1 file changed, 33 insertions(+), 16 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index efe0c30d3a12..77538abeb72a 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -146,6 +146,30 @@ static inline struct rmid_entry *__rmid_entry(u32 rmid)
 	return entry;
 }
 
+static int __rmid_read(u32 rmid, enum resctrl_event_id eventid, u64 *val)
+{
+	u64 msr_val;
+
+	/*
+	 * As per the SDM, when IA32_QM_EVTSEL.EvtID (bits 7:0) is configured
+	 * with a valid event code for supported resource type and the bits
+	 * IA32_QM_EVTSEL.RMID (bits 41:32) are configured with valid RMID,
+	 * IA32_QM_CTR.data (bits 61:0) reports the monitored data.
+	 * IA32_QM_CTR.Error (bit 63) and IA32_QM_CTR.Unavailable (bit 62)
+	 * are error bits.
+	 */
+	wrmsr(MSR_IA32_QM_EVTSEL, eventid, rmid);
+	rdmsrl(MSR_IA32_QM_CTR, msr_val);
+
+	if (msr_val & RMID_VAL_ERROR)
+		return -EIO;
+	if (msr_val & RMID_VAL_UNAVAIL)
+		return -EINVAL;
+
+	*val = msr_val;
+	return 0;
+}
+
 static struct arch_mbm_state *get_arch_mbm_state(struct rdt_hw_domain *hw_dom,
 						 u32 rmid,
 						 enum resctrl_event_id eventid)
@@ -172,8 +196,12 @@ void resctrl_arch_reset_rmid(struct rdt_resource *r, struct rdt_domain *d,
 	struct arch_mbm_state *am;
 
 	am = get_arch_mbm_state(hw_dom, rmid, eventid);
-	if (am)
+	if (am) {
 		memset(am, 0, sizeof(*am));
+
+		/* Record any initial, non-zero count value. */
+		__rmid_read(rmid, eventid, &am->prev_msr);
+	}
 }
 
 static u64 mbm_overflow_count(u64 prev_msr, u64 cur_msr, unsigned int width)
@@ -191,25 +219,14 @@ int resctrl_arch_rmid_read(struct rdt_resource *r, struct rdt_domain *d,
 	struct rdt_hw_domain *hw_dom = resctrl_to_arch_dom(d);
 	struct arch_mbm_state *am;
 	u64 msr_val, chunks;
+	int ret;
 
 	if (!cpumask_test_cpu(smp_processor_id(), &d->cpu_mask))
 		return -EINVAL;
 
-	/*
-	 * As per the SDM, when IA32_QM_EVTSEL.EvtID (bits 7:0) is configured
-	 * with a valid event code for supported resource type and the bits
-	 * IA32_QM_EVTSEL.RMID (bits 41:32) are configured with valid RMID,
-	 * IA32_QM_CTR.data (bits 61:0) reports the monitored data.
-	 * IA32_QM_CTR.Error (bit 63) and IA32_QM_CTR.Unavailable (bit 62)
-	 * are error bits.
-	 */
-	wrmsr(MSR_IA32_QM_EVTSEL, eventid, rmid);
-	rdmsrl(MSR_IA32_QM_CTR, msr_val);
-
-	if (msr_val & RMID_VAL_ERROR)
-		return -EIO;
-	if (msr_val & RMID_VAL_UNAVAIL)
-		return -EINVAL;
+	ret = __rmid_read(rmid, eventid, &msr_val);
+	if (ret)
+		return ret;
 
 	am = get_arch_mbm_state(hw_dom, rmid, eventid);
 	if (am) {

base-commit: 830b3c68c1fb1e9176028d02ef86f3cf76aa2476
-- 
2.39.0.rc1.256.g54fd8350bd-goog


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH v2 2/2] x86/resctrl: Avoid redundant counter read in __mon_event_count()
  2022-12-14 16:08 [PATCH v2 1/2] x86/resctrl: Fix event counts regression in reused RMIDs Peter Newman
@ 2022-12-14 16:08 ` Peter Newman
  2022-12-17  1:10   ` Reinette Chatre
  2022-12-17  0:59 ` [PATCH v2 1/2] x86/resctrl: Fix event counts regression in reused RMIDs Reinette Chatre
  1 sibling, 1 reply; 5+ messages in thread
From: Peter Newman @ 2022-12-14 16:08 UTC (permalink / raw)
  To: reinette.chatre, fenghua.yu
  Cc: Babu.Moger, bp, dave.hansen, eranian, hpa, james.morse,
	linux-kernel, mingo, quic_jiles, tan.shaopeng, tglx, x86,
	Peter Newman

__mon_event_count() does the per-RMID, per-domain work for
user-initiated event count reads and the initialization of new monitor
groups.

In the initialization case, after resctrl_arch_reset_rmid() calls
__rmid_read() to record an initial count for a new monitor group, it
immediately calls resctrl_arch_rmid_read(). This re-read of the hardware
counter is unnecessary.

Following return from resctrl_arch_reset_rmid(), just clear the
mbm_state and return.

Also, move the mbm_state lookup into the rr->first case, as it's not
needed for regular event count reads: the QOS_L3_OCCUP_EVENT_ID case was
redundant with the accumulating logic at the end of the function.

Signed-off-by: Peter Newman <peternewman@google.com>
---
 arch/x86/kernel/cpu/resctrl/monitor.c | 43 ++++++++++++---------------
 1 file changed, 19 insertions(+), 24 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index 77538abeb72a..e708df478077 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -366,41 +366,36 @@ void free_rmid(u32 rmid)
 		list_add_tail(&entry->list, &rmid_free_lru);
 }
 
+static struct mbm_state *get_mbm_state(struct rdt_domain *d, u32 rmid,
+				       enum resctrl_event_id evtid)
+{
+	switch (evtid) {
+	case QOS_L3_MBM_TOTAL_EVENT_ID:
+		return &d->mbm_total[rmid];
+	case QOS_L3_MBM_LOCAL_EVENT_ID:
+		return &d->mbm_local[rmid];
+	default:
+		return NULL;
+	}
+}
+
 static int __mon_event_count(u32 rmid, struct rmid_read *rr)
 {
 	struct mbm_state *m;
 	u64 tval = 0;
 
-	if (rr->first)
+	if (rr->first) {
 		resctrl_arch_reset_rmid(rr->r, rr->d, rmid, rr->evtid);
+		m = get_mbm_state(rr->d, rmid, rr->evtid);
+		if (m)
+			memset(m, 0, sizeof(struct mbm_state));
+		return 0;
+	}
 
 	rr->err = resctrl_arch_rmid_read(rr->r, rr->d, rmid, rr->evtid, &tval);
 	if (rr->err)
 		return rr->err;
 
-	switch (rr->evtid) {
-	case QOS_L3_OCCUP_EVENT_ID:
-		rr->val += tval;
-		return 0;
-	case QOS_L3_MBM_TOTAL_EVENT_ID:
-		m = &rr->d->mbm_total[rmid];
-		break;
-	case QOS_L3_MBM_LOCAL_EVENT_ID:
-		m = &rr->d->mbm_local[rmid];
-		break;
-	default:
-		/*
-		 * Code would never reach here because an invalid
-		 * event id would fail in resctrl_arch_rmid_read().
-		 */
-		return -EINVAL;
-	}
-
-	if (rr->first) {
-		memset(m, 0, sizeof(struct mbm_state));
-		return 0;
-	}
-
 	rr->val += tval;
 
 	return 0;
-- 
2.39.0.rc1.256.g54fd8350bd-goog


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH v2 1/2] x86/resctrl: Fix event counts regression in reused RMIDs
  2022-12-14 16:08 [PATCH v2 1/2] x86/resctrl: Fix event counts regression in reused RMIDs Peter Newman
  2022-12-14 16:08 ` [PATCH v2 2/2] x86/resctrl: Avoid redundant counter read in __mon_event_count() Peter Newman
@ 2022-12-17  0:59 ` Reinette Chatre
  2022-12-19 10:31   ` Peter Newman
  1 sibling, 1 reply; 5+ messages in thread
From: Reinette Chatre @ 2022-12-17  0:59 UTC (permalink / raw)
  To: Peter Newman, fenghua.yu
  Cc: Babu.Moger, bp, dave.hansen, eranian, hpa, james.morse,
	linux-kernel, mingo, quic_jiles, tan.shaopeng, tglx, x86

Hi Peter,

On 12/14/2022 8:08 AM, Peter Newman wrote:
> When creating a new monitoring group, the RMID allocated for it may have
> been used by a group which was previously removed. In this case, the
> hardware counters will have non-zero values which should be deducted
> from what is reported in the new group's counts.
> 
> resctrl_arch_reset_rmid() initializes the prev_msr value for counters to
> 0, causing the initial count to be charged to the new group. Resurrect
> __rmid_read() and use it to initialize prev_msr correctly.
> 
> Unlike before, __rmid_read() checks for error bits in the MSR read so
> that callers don't need to.
> 
> Fixes: 1d81d15db39c ("x86/resctrl: Move mbm_overflow_count() into resctrl_arch_rmid_read()")
> Signed-off-by: Peter Newman <peternewman@google.com>

This does look like a candidate for stable?

> ---

It is helpful to have a summary here of what changed since previous version.

>  arch/x86/kernel/cpu/resctrl/monitor.c | 49 ++++++++++++++++++---------
>  1 file changed, 33 insertions(+), 16 deletions(-)
> 
> diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
> index efe0c30d3a12..77538abeb72a 100644
> --- a/arch/x86/kernel/cpu/resctrl/monitor.c
> +++ b/arch/x86/kernel/cpu/resctrl/monitor.c
> @@ -146,6 +146,30 @@ static inline struct rmid_entry *__rmid_entry(u32 rmid)
>  	return entry;
>  }
>  
> +static int __rmid_read(u32 rmid, enum resctrl_event_id eventid, u64 *val)
> +{
> +	u64 msr_val;
> +
> +	/*
> +	 * As per the SDM, when IA32_QM_EVTSEL.EvtID (bits 7:0) is configured
> +	 * with a valid event code for supported resource type and the bits
> +	 * IA32_QM_EVTSEL.RMID (bits 41:32) are configured with valid RMID,
> +	 * IA32_QM_CTR.data (bits 61:0) reports the monitored data.
> +	 * IA32_QM_CTR.Error (bit 63) and IA32_QM_CTR.Unavailable (bit 62)
> +	 * are error bits.
> +	 */
> +	wrmsr(MSR_IA32_QM_EVTSEL, eventid, rmid);
> +	rdmsrl(MSR_IA32_QM_CTR, msr_val);
> +
> +	if (msr_val & RMID_VAL_ERROR)
> +		return -EIO;
> +	if (msr_val & RMID_VAL_UNAVAIL)
> +		return -EINVAL;
> +
> +	*val = msr_val;
> +	return 0;
> +}
> +
>  static struct arch_mbm_state *get_arch_mbm_state(struct rdt_hw_domain *hw_dom,
>  						 u32 rmid,
>  						 enum resctrl_event_id eventid)
> @@ -172,8 +196,12 @@ void resctrl_arch_reset_rmid(struct rdt_resource *r, struct rdt_domain *d,
>  	struct arch_mbm_state *am;
>  
>  	am = get_arch_mbm_state(hw_dom, rmid, eventid);
> -	if (am)
> +	if (am) {
>  		memset(am, 0, sizeof(*am));
> +
> +		/* Record any initial, non-zero count value. */
> +		__rmid_read(rmid, eventid, &am->prev_msr);
> +	}
>  }
>  
>  static u64 mbm_overflow_count(u64 prev_msr, u64 cur_msr, unsigned int width)
> @@ -191,25 +219,14 @@ int resctrl_arch_rmid_read(struct rdt_resource *r, struct rdt_domain *d,
>  	struct rdt_hw_domain *hw_dom = resctrl_to_arch_dom(d);
>  	struct arch_mbm_state *am;
>  	u64 msr_val, chunks;
> +	int ret;
>  
>  	if (!cpumask_test_cpu(smp_processor_id(), &d->cpu_mask))
>  		return -EINVAL;
>  
> -	/*
> -	 * As per the SDM, when IA32_QM_EVTSEL.EvtID (bits 7:0) is configured
> -	 * with a valid event code for supported resource type and the bits
> -	 * IA32_QM_EVTSEL.RMID (bits 41:32) are configured with valid RMID,
> -	 * IA32_QM_CTR.data (bits 61:0) reports the monitored data.
> -	 * IA32_QM_CTR.Error (bit 63) and IA32_QM_CTR.Unavailable (bit 62)
> -	 * are error bits.
> -	 */
> -	wrmsr(MSR_IA32_QM_EVTSEL, eventid, rmid);
> -	rdmsrl(MSR_IA32_QM_CTR, msr_val);
> -
> -	if (msr_val & RMID_VAL_ERROR)
> -		return -EIO;
> -	if (msr_val & RMID_VAL_UNAVAIL)
> -		return -EINVAL;
> +	ret = __rmid_read(rmid, eventid, &msr_val);
> +	if (ret)
> +		return ret;
>  
>  	am = get_arch_mbm_state(hw_dom, rmid, eventid);
>  	if (am) {
> 
> base-commit: 830b3c68c1fb1e9176028d02ef86f3cf76aa2476


Thank you very much for catching and fixing this.

Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>

Reinette

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH v2 2/2] x86/resctrl: Avoid redundant counter read in __mon_event_count()
  2022-12-14 16:08 ` [PATCH v2 2/2] x86/resctrl: Avoid redundant counter read in __mon_event_count() Peter Newman
@ 2022-12-17  1:10   ` Reinette Chatre
  0 siblings, 0 replies; 5+ messages in thread
From: Reinette Chatre @ 2022-12-17  1:10 UTC (permalink / raw)
  To: Peter Newman, fenghua.yu
  Cc: Babu.Moger, bp, dave.hansen, eranian, hpa, james.morse,
	linux-kernel, mingo, quic_jiles, tan.shaopeng, tglx, x86

Hi Peter,

On 12/14/2022 8:08 AM, Peter Newman wrote:
> __mon_event_count() does the per-RMID, per-domain work for
> user-initiated event count reads and the initialization of new monitor
> groups.
> 
> In the initialization case, after resctrl_arch_reset_rmid() calls
> __rmid_read() to record an initial count for a new monitor group, it
> immediately calls resctrl_arch_rmid_read(). This re-read of the hardware
> counter is unnecessary.

"... and the following computations are ignored by the caller during
initialization."

> 
> Following return from resctrl_arch_reset_rmid(), just clear the
> mbm_state and return.
> 
> Also, move the mbm_state lookup into the rr->first case, as it's not
> needed for regular event count reads: the QOS_L3_OCCUP_EVENT_ID case was
> redundant with the accumulating logic at the end of the function.

Starting the above with "Also" creates impression that two logical
changes are merged into a single patch. Since this is not the case you
can avoid the impression with something like:

Following return from resctrl_arch_reset_rmid(), just clear the
mbm_state and return. This involves moving the mbm_state lookup into
the rr->first case, as it's not needed for regular event count reads:
the QOS_L3_OCCUP_EVENT_ID case was redundant with the accumulating logic
at the end of the function.


> 
> Signed-off-by: Peter Newman <peternewman@google.com>
> ---
>  arch/x86/kernel/cpu/resctrl/monitor.c | 43 ++++++++++++---------------
>  1 file changed, 19 insertions(+), 24 deletions(-)
> 
> diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
> index 77538abeb72a..e708df478077 100644
> --- a/arch/x86/kernel/cpu/resctrl/monitor.c
> +++ b/arch/x86/kernel/cpu/resctrl/monitor.c
> @@ -366,41 +366,36 @@ void free_rmid(u32 rmid)
>  		list_add_tail(&entry->list, &rmid_free_lru);
>  }
>  
> +static struct mbm_state *get_mbm_state(struct rdt_domain *d, u32 rmid,
> +				       enum resctrl_event_id evtid)
> +{
> +	switch (evtid) {
> +	case QOS_L3_MBM_TOTAL_EVENT_ID:
> +		return &d->mbm_total[rmid];
> +	case QOS_L3_MBM_LOCAL_EVENT_ID:
> +		return &d->mbm_local[rmid];
> +	default:
> +		return NULL;
> +	}
> +}
> +
>  static int __mon_event_count(u32 rmid, struct rmid_read *rr)
>  {
>  	struct mbm_state *m;
>  	u64 tval = 0;
>  
> -	if (rr->first)
> +	if (rr->first) {
>  		resctrl_arch_reset_rmid(rr->r, rr->d, rmid, rr->evtid);
> +		m = get_mbm_state(rr->d, rmid, rr->evtid);
> +		if (m)
> +			memset(m, 0, sizeof(struct mbm_state));
> +		return 0;
> +	}
>  
>  	rr->err = resctrl_arch_rmid_read(rr->r, rr->d, rmid, rr->evtid, &tval);
>  	if (rr->err)
>  		return rr->err;
>  
> -	switch (rr->evtid) {
> -	case QOS_L3_OCCUP_EVENT_ID:
> -		rr->val += tval;
> -		return 0;
> -	case QOS_L3_MBM_TOTAL_EVENT_ID:
> -		m = &rr->d->mbm_total[rmid];
> -		break;
> -	case QOS_L3_MBM_LOCAL_EVENT_ID:
> -		m = &rr->d->mbm_local[rmid];
> -		break;
> -	default:
> -		/*
> -		 * Code would never reach here because an invalid
> -		 * event id would fail in resctrl_arch_rmid_read().
> -		 */
> -		return -EINVAL;
> -	}
> -
> -	if (rr->first) {
> -		memset(m, 0, sizeof(struct mbm_state));
> -		return 0;
> -	}
> -
>  	rr->val += tval;
>  
>  	return 0;

Thank you very much.

Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>

Reinette


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH v2 1/2] x86/resctrl: Fix event counts regression in reused RMIDs
  2022-12-17  0:59 ` [PATCH v2 1/2] x86/resctrl: Fix event counts regression in reused RMIDs Reinette Chatre
@ 2022-12-19 10:31   ` Peter Newman
  0 siblings, 0 replies; 5+ messages in thread
From: Peter Newman @ 2022-12-19 10:31 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: fenghua.yu, Babu.Moger, bp, dave.hansen, eranian, hpa,
	james.morse, linux-kernel, mingo, quic_jiles, tan.shaopeng, tglx,
	x86

Hi Reinette,

On Sat, Dec 17, 2022 at 1:59 AM Reinette Chatre
<reinette.chatre@intel.com> wrote:
> On 12/14/2022 8:08 AM, Peter Newman wrote:
> > When creating a new monitoring group, the RMID allocated for it may have
> > been used by a group which was previously removed. In this case, the
> > hardware counters will have non-zero values which should be deducted
> > from what is reported in the new group's counts.
> >
> > resctrl_arch_reset_rmid() initializes the prev_msr value for counters to
> > 0, causing the initial count to be charged to the new group. Resurrect
> > __rmid_read() and use it to initialize prev_msr correctly.
> >
> > Unlike before, __rmid_read() checks for error bits in the MSR read so
> > that callers don't need to.
> >
> > Fixes: 1d81d15db39c ("x86/resctrl: Move mbm_overflow_count() into resctrl_arch_rmid_read()")
> > Signed-off-by: Peter Newman <peternewman@google.com>
>
> This does look like a candidate for stable?

Yes, this bug is serious and reproducible. Every RMID reuse would
have up to one overflow's-worth of measurement error.

Should I elaborate on the impact more in the changelog?

>
> > ---
>
> It is helpful to have a summary here of what changed since previous version.

ok, I'll add this

> Thank you very much for catching and fixing this.
>
> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>

Thanks, Reinette!

-Peter

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2022-12-19 10:32 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-12-14 16:08 [PATCH v2 1/2] x86/resctrl: Fix event counts regression in reused RMIDs Peter Newman
2022-12-14 16:08 ` [PATCH v2 2/2] x86/resctrl: Avoid redundant counter read in __mon_event_count() Peter Newman
2022-12-17  1:10   ` Reinette Chatre
2022-12-17  0:59 ` [PATCH v2 1/2] x86/resctrl: Fix event counts regression in reused RMIDs Reinette Chatre
2022-12-19 10:31   ` Peter Newman

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).