All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2] x86/resctrl: Fix default monitoring groups reporting
@ 2021-08-02 19:38 Babu Moger
  2021-08-12 18:20 ` [tip: x86/urgent] " tip-bot2 for Babu Moger
  0 siblings, 1 reply; 2+ messages in thread
From: Babu Moger @ 2021-08-02 19:38 UTC (permalink / raw)
  To: tglx, mingo, bp
  Cc: fenghua.yu, x86, linux-kernel, babu.moger, hpa, reinette.chatre,
	pawel.szulik

From: Babu Moger <Babu.Moger@amd.com>

Creating a new sub monitoring group in the root /sys/fs/resctrl leads to
getting the "Unavailable" value for mbm_total_bytes and mbm_local_bytes
on the entire filesystem.

Steps to reproduce.
1. #mount -t resctrl resctrl /sys/fs/resctrl/

2. #cd /sys/fs/resctrl/

3. #cat mon_data/mon_L3_00/mbm_total_bytes
   23189832

4. #mkdir mon_groups/test1 (create sub monitor group)

5. #cat mon_data/mon_L3_00/mbm_total_bytes
   Unavailable

When a new monitoring group is created, a new RMID is assigned to the new
group. But the RMID is not active yet. When the events are read on the new
RMID, it is expected to report the status as "Unavailable".

When the user reads the events on the default monitoring group with
multiple subgroups, the events on all sub groups are consolidated together.
Currently, if any of the RMID reads report as "Unavailable", then
everything will be reported as "Unavailable".

Fix the issue by discarding the "Unavailable" reads and reporting all the
successful RMID reads. This is not a problem on Intel systems as Intel
reports 0 on Inactive RMIDs.

Cc: stable@vger.kernel.org
Fixes: d89b7379015f ("x86/intel_rdt/cqm: Add mon_data")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=213311
Reported-by: Paweł Szulik <pawel.szulik@intel.com>
Signed-off-by: Babu Moger <Babu.Moger@amd.com>
Acked-by: Reinette Chatre <reinette.chatre@intel.com>
---
v2:
  Taken care of Reinette's comments. Added Fixes line and Added Cc to stable.

v1:
  https://lore.kernel.org/lkml/162499005859.4842.12410192091197461691.stgit@bmoger-ubuntu/


 arch/x86/kernel/cpu/resctrl/monitor.c |   27 +++++++++++++--------------
 1 file changed, 13 insertions(+), 14 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index f07c10b87a87..57e4bb695ff9 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -285,15 +285,14 @@ static u64 mbm_overflow_count(u64 prev_msr, u64 cur_msr, unsigned int width)
 	return chunks >>= shift;
 }
 
-static int __mon_event_count(u32 rmid, struct rmid_read *rr)
+static u64 __mon_event_count(u32 rmid, struct rmid_read *rr)
 {
 	struct mbm_state *m;
 	u64 chunks, tval;
 
 	tval = __rmid_read(rmid, rr->evtid);
 	if (tval & (RMID_VAL_ERROR | RMID_VAL_UNAVAIL)) {
-		rr->val = tval;
-		return -EINVAL;
+		return tval;
 	}
 	switch (rr->evtid) {
 	case QOS_L3_OCCUP_EVENT_ID:
@@ -305,12 +304,6 @@ static int __mon_event_count(u32 rmid, struct rmid_read *rr)
 	case QOS_L3_MBM_LOCAL_EVENT_ID:
 		m = &rr->d->mbm_local[rmid];
 		break;
-	default:
-		/*
-		 * Code would never reach here because
-		 * an invalid event id would fail the __rmid_read.
-		 */
-		return -EINVAL;
 	}
 
 	if (rr->first) {
@@ -361,23 +354,29 @@ void mon_event_count(void *info)
 	struct rdtgroup *rdtgrp, *entry;
 	struct rmid_read *rr = info;
 	struct list_head *head;
+	u64 ret_val;
 
 	rdtgrp = rr->rgrp;
 
-	if (__mon_event_count(rdtgrp->mon.rmid, rr))
-		return;
+	ret_val = __mon_event_count(rdtgrp->mon.rmid, rr);
 
 	/*
-	 * For Ctrl groups read data from child monitor groups.
+	 * For Ctrl groups read data from child monitor groups and
+	 * add them together. Count events which are read successfully.
+	 * Discard the rmid_read's reporting errors.
 	 */
 	head = &rdtgrp->mon.crdtgrp_list;
 
 	if (rdtgrp->type == RDTCTRL_GROUP) {
 		list_for_each_entry(entry, head, mon.crdtgrp_list) {
-			if (__mon_event_count(entry->mon.rmid, rr))
-				return;
+			if (__mon_event_count(entry->mon.rmid, rr) == 0)
+				ret_val = 0;
 		}
 	}
+
+	/* Report error if none of rmid_reads are successful */
+	if (ret_val)
+		rr->val = ret_val;
 }
 
 /*


^ permalink raw reply	[flat|nested] 2+ messages in thread

* [tip: x86/urgent] x86/resctrl: Fix default monitoring groups reporting
  2021-08-02 19:38 [PATCH v2] x86/resctrl: Fix default monitoring groups reporting Babu Moger
@ 2021-08-12 18:20 ` tip-bot2 for Babu Moger
  0 siblings, 0 replies; 2+ messages in thread
From: tip-bot2 for Babu Moger @ 2021-08-12 18:20 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: pawel.szulik, Babu Moger, Borislav Petkov, Reinette Chatre,
	stable, x86, linux-kernel

The following commit has been merged into the x86/urgent branch of tip:

Commit-ID:     064855a69003c24bd6b473b367d364e418c57625
Gitweb:        https://git.kernel.org/tip/064855a69003c24bd6b473b367d364e418c57625
Author:        Babu Moger <Babu.Moger@amd.com>
AuthorDate:    Mon, 02 Aug 2021 14:38:58 -05:00
Committer:     Borislav Petkov <bp@suse.de>
CommitterDate: Thu, 12 Aug 2021 20:12:20 +02:00

x86/resctrl: Fix default monitoring groups reporting

Creating a new sub monitoring group in the root /sys/fs/resctrl leads to
getting the "Unavailable" value for mbm_total_bytes and mbm_local_bytes
on the entire filesystem.

Steps to reproduce:

  1. mount -t resctrl resctrl /sys/fs/resctrl/

  2. cd /sys/fs/resctrl/

  3. cat mon_data/mon_L3_00/mbm_total_bytes
     23189832

  4. Create sub monitor group:
  mkdir mon_groups/test1

  5. cat mon_data/mon_L3_00/mbm_total_bytes
     Unavailable

When a new monitoring group is created, a new RMID is assigned to the
new group. But the RMID is not active yet. When the events are read on
the new RMID, it is expected to report the status as "Unavailable".

When the user reads the events on the default monitoring group with
multiple subgroups, the events on all subgroups are consolidated
together. Currently, if any of the RMID reads report as "Unavailable",
then everything will be reported as "Unavailable".

Fix the issue by discarding the "Unavailable" reads and reporting all
the successful RMID reads. This is not a problem on Intel systems as
Intel reports 0 on Inactive RMIDs.

Fixes: d89b7379015f ("x86/intel_rdt/cqm: Add mon_data")
Reported-by: Paweł Szulik <pawel.szulik@intel.com>
Signed-off-by: Babu Moger <Babu.Moger@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Reinette Chatre <reinette.chatre@intel.com>
Cc: stable@vger.kernel.org
Link: https://bugzilla.kernel.org/show_bug.cgi?id=213311
Link: https://lkml.kernel.org/r/162793309296.9224.15871659871696482080.stgit@bmoger-ubuntu
---
 arch/x86/kernel/cpu/resctrl/monitor.c | 27 ++++++++++++--------------
 1 file changed, 13 insertions(+), 14 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index f07c10b..57e4bb6 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -285,15 +285,14 @@ static u64 mbm_overflow_count(u64 prev_msr, u64 cur_msr, unsigned int width)
 	return chunks >>= shift;
 }
 
-static int __mon_event_count(u32 rmid, struct rmid_read *rr)
+static u64 __mon_event_count(u32 rmid, struct rmid_read *rr)
 {
 	struct mbm_state *m;
 	u64 chunks, tval;
 
 	tval = __rmid_read(rmid, rr->evtid);
 	if (tval & (RMID_VAL_ERROR | RMID_VAL_UNAVAIL)) {
-		rr->val = tval;
-		return -EINVAL;
+		return tval;
 	}
 	switch (rr->evtid) {
 	case QOS_L3_OCCUP_EVENT_ID:
@@ -305,12 +304,6 @@ static int __mon_event_count(u32 rmid, struct rmid_read *rr)
 	case QOS_L3_MBM_LOCAL_EVENT_ID:
 		m = &rr->d->mbm_local[rmid];
 		break;
-	default:
-		/*
-		 * Code would never reach here because
-		 * an invalid event id would fail the __rmid_read.
-		 */
-		return -EINVAL;
 	}
 
 	if (rr->first) {
@@ -361,23 +354,29 @@ void mon_event_count(void *info)
 	struct rdtgroup *rdtgrp, *entry;
 	struct rmid_read *rr = info;
 	struct list_head *head;
+	u64 ret_val;
 
 	rdtgrp = rr->rgrp;
 
-	if (__mon_event_count(rdtgrp->mon.rmid, rr))
-		return;
+	ret_val = __mon_event_count(rdtgrp->mon.rmid, rr);
 
 	/*
-	 * For Ctrl groups read data from child monitor groups.
+	 * For Ctrl groups read data from child monitor groups and
+	 * add them together. Count events which are read successfully.
+	 * Discard the rmid_read's reporting errors.
 	 */
 	head = &rdtgrp->mon.crdtgrp_list;
 
 	if (rdtgrp->type == RDTCTRL_GROUP) {
 		list_for_each_entry(entry, head, mon.crdtgrp_list) {
-			if (__mon_event_count(entry->mon.rmid, rr))
-				return;
+			if (__mon_event_count(entry->mon.rmid, rr) == 0)
+				ret_val = 0;
 		}
 	}
+
+	/* Report error if none of rmid_reads are successful */
+	if (ret_val)
+		rr->val = ret_val;
 }
 
 /*

^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2021-08-12 18:20 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-08-02 19:38 [PATCH v2] x86/resctrl: Fix default monitoring groups reporting Babu Moger
2021-08-12 18:20 ` [tip: x86/urgent] " tip-bot2 for Babu Moger

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.