All of lore.kernel.org
 help / color / mirror / Atom feed
From: James Morse <james.morse@arm.com>
To: x86@kernel.org, linux-kernel@vger.kernel.org
Cc: Fenghua Yu <fenghua.yu@intel.com>,
	Reinette Chatre <reinette.chatre@intel.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	H Peter Anvin <hpa@zytor.com>, Babu Moger <Babu.Moger@amd.com>,
	James Morse <james.morse@arm.com>,
	shameerali.kolothum.thodi@huawei.com,
	D Scott Phillips OS <scott@os.amperecomputing.com>,
	carl@os.amperecomputing.com, lcherian@marvell.com,
	bobo.shaobowang@huawei.com, tan.shaopeng@fujitsu.com,
	baolin.wang@linux.alibaba.com,
	Jamie Iles <quic_jiles@quicinc.com>,
	Xin Hao <xhao@linux.alibaba.com>,
	peternewman@google.com, dfustini@baylibre.com,
	amitsinght@marvell.com, David Hildenbrand <david@redhat.com>,
	Babu Moger <babu.moger@amd.com>
Subject: [PATCH v9 13/24] x86/resctrl: Queue mon_event_read() instead of sending an IPI
Date: Tue, 13 Feb 2024 18:44:27 +0000	[thread overview]
Message-ID: <20240213184438.16675-14-james.morse@arm.com> (raw)
In-Reply-To: <20240213184438.16675-1-james.morse@arm.com>

Intel is blessed with an abundance of monitors, one per RMID, that can be
read from any CPU in the domain. MPAMs monitors reside in the MMIO MSC,
the number implemented is up to the manufacturer. This means when there are
fewer monitors than needed, they need to be allocated and freed.

MPAM's CSU monitors are used to back the 'llc_occupancy' monitor file. The
CSU counter is allowed to return 'not ready' for a small number of
micro-seconds after programming. To allow one CSU hardware monitor to be
used for multiple control or monitor groups, the CPU accessing the
monitor needs to be able to block when configuring and reading the
counter.

Worse, the domain may be broken up into slices, and the MMIO accesses
for each slice may need performing from different CPUs.

These two details mean MPAMs monitor code needs to be able to sleep, and
IPI another CPU in the domain to read from a resource that has been sliced.

mon_event_read() already invokes mon_event_count() via IPI, which means
this isn't possible. On systems using nohz-full, some CPUs need to be
interrupted to run kernel work as they otherwise stay in user-space
running realtime workloads. Interrupting these CPUs should be avoided,
and scheduling work on them may never complete.

Change mon_event_read() to pick a housekeeping CPU, (one that is not using
nohz_full) and schedule mon_event_count() and wait. If all the CPUs
in a domain are using nohz-full, then an IPI is used as the fallback.

This function is only used in response to a user-space filesystem request
(not the timing sensitive overflow code).

This allows MPAM to hide the slice behaviour from resctrl, and to keep
the monitor-allocation in monitor.c. When the IPI fallback is used on
machines where MPAM needs to make an access on multiple CPUs, the counter
read will always fail.

Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
Tested-by: Peter Newman <peternewman@google.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64
Reviewed-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
Reviewed-by: Peter Newman <peternewman@google.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reviewed-by: Babu Moger <babu.moger@amd.com>
---
Changes since v2:
 * Use cpumask_any_housekeeping() and fallback to an IPI if needed.

Changes since v3:
 * Actually include the IPI fallback code.

Changes since v4:
 * Tinkered with existing capitalisation.

Changes since v5:
 * Added a newline.

Changes since v6:
 * Moved lockdep annotations to a later patch.
---
 arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 26 +++++++++++++++++++++--
 arch/x86/kernel/cpu/resctrl/monitor.c     |  2 +-
 2 files changed, 25 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
index beccb0e87ba7..d07f99245851 100644
--- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
+++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
@@ -19,6 +19,8 @@
 #include <linux/kernfs.h>
 #include <linux/seq_file.h>
 #include <linux/slab.h>
+#include <linux/tick.h>
+
 #include "internal.h"
 
 /*
@@ -522,12 +524,21 @@ int rdtgroup_schemata_show(struct kernfs_open_file *of,
 	return ret;
 }
 
+static int smp_mon_event_count(void *arg)
+{
+	mon_event_count(arg);
+
+	return 0;
+}
+
 void mon_event_read(struct rmid_read *rr, struct rdt_resource *r,
 		    struct rdt_domain *d, struct rdtgroup *rdtgrp,
 		    int evtid, int first)
 {
+	int cpu;
+
 	/*
-	 * setup the parameters to send to the IPI to read the data.
+	 * Setup the parameters to pass to mon_event_count() to read the data.
 	 */
 	rr->rgrp = rdtgrp;
 	rr->evtid = evtid;
@@ -536,7 +547,18 @@ void mon_event_read(struct rmid_read *rr, struct rdt_resource *r,
 	rr->val = 0;
 	rr->first = first;
 
-	smp_call_function_any(&d->cpu_mask, mon_event_count, rr, 1);
+	cpu = cpumask_any_housekeeping(&d->cpu_mask);
+
+	/*
+	 * cpumask_any_housekeeping() prefers housekeeping CPUs, but
+	 * are all the CPUs nohz_full? If yes, pick a CPU to IPI.
+	 * MPAM's resctrl_arch_rmid_read() is unable to read the
+	 * counters on some platforms if its called in irq context.
+	 */
+	if (tick_nohz_full_cpu(cpu))
+		smp_call_function_any(&d->cpu_mask, mon_event_count, rr, 1);
+	else
+		smp_call_on_cpu(cpu, smp_mon_event_count, rr, false);
 }
 
 int rdtgroup_mondata_show(struct seq_file *m, void *arg)
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index 38f85e53ca93..fd060ef86f38 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -585,7 +585,7 @@ static void mbm_bw_count(u32 closid, u32 rmid, struct rmid_read *rr)
 }
 
 /*
- * This is called via IPI to read the CQM/MBM counters
+ * This is scheduled by mon_event_read() to read the CQM/MBM counters
  * on a domain.
  */
 void mon_event_count(void *info)
-- 
2.39.2


  parent reply	other threads:[~2024-02-13 18:45 UTC|newest]

Thread overview: 104+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-02-13 18:44 [PATCH v9 00/24] x86/resctrl: monitored closid+rmid together, separate arch/fs locking James Morse
2024-02-13 18:44 ` [PATCH v9 01/24] tick/nohz: Move tick_nohz_full_mask declaration outside the #ifdef James Morse
2024-02-19 18:37   ` [tip: x86/cache] " tip-bot2 for James Morse
2024-02-13 18:44 ` [PATCH v9 02/24] x86/resctrl: kfree() rmid_ptrs from resctrl_exit() James Morse
2024-02-13 23:14   ` Reinette Chatre
2024-02-19 16:53     ` James Morse
2024-02-19 18:37   ` [tip: x86/cache] x86/resctrl: Free " tip-bot2 for James Morse
2024-02-20 15:27   ` [PATCH v9 02/24] x86/resctrl: kfree() " David Hildenbrand
2024-02-20 15:46     ` James Morse
2024-02-20 15:54       ` Thomas Gleixner
2024-02-20 16:01         ` James Morse
2024-02-20 16:12           ` Thomas Gleixner
2024-02-13 18:44 ` [PATCH v9 03/24] x86/resctrl: Create helper for RMID allocation and mondata dir creation James Morse
2024-02-19 15:49   ` David Hildenbrand
2024-02-19 18:37   ` [tip: x86/cache] " tip-bot2 for James Morse
2024-02-13 18:44 ` [PATCH v9 04/24] x86/resctrl: Move rmid allocation out of mkdir_rdt_prepare() James Morse
2024-02-19 18:37   ` [tip: x86/cache] x86/resctrl: Move RMID " tip-bot2 for James Morse
2024-02-13 18:44 ` [PATCH v9 05/24] x86/resctrl: Track the closid with the rmid James Morse
2024-02-19 18:37   ` [tip: x86/cache] " tip-bot2 for James Morse
2024-02-13 18:44 ` [PATCH v9 06/24] x86/resctrl: Access per-rmid structures by index James Morse
2024-02-19 18:37   ` [tip: x86/cache] " tip-bot2 for James Morse
2024-02-13 18:44 ` [PATCH v9 07/24] x86/resctrl: Allow RMID allocation to be scoped by CLOSID James Morse
2024-02-19 18:37   ` [tip: x86/cache] " tip-bot2 for James Morse
2024-02-13 18:44 ` [PATCH v9 08/24] x86/resctrl: Track the number of dirty RMID a CLOSID has James Morse
2024-02-19 18:37   ` [tip: x86/cache] " tip-bot2 for James Morse
2024-02-13 18:44 ` [PATCH v9 09/24] x86/resctrl: Use __set_bit()/__clear_bit() instead of open coding James Morse
2024-02-19 18:37   ` [tip: x86/cache] " tip-bot2 for James Morse
2024-02-20 16:00   ` [PATCH v9 09/24] " David Hildenbrand
2024-02-20 16:27     ` James Morse
2024-02-20 16:44       ` David Hildenbrand
2024-02-13 18:44 ` [PATCH v9 10/24] x86/resctrl: Allocate the cleanest CLOSID by searching closid_num_dirty_rmid James Morse
2024-02-19 18:37   ` [tip: x86/cache] " tip-bot2 for James Morse
2024-02-13 18:44 ` [PATCH v9 11/24] x86/resctrl: Move CLOSID/RMID matching and setting to use helpers James Morse
2024-02-19 18:37   ` [tip: x86/cache] " tip-bot2 for James Morse
2024-02-13 18:44 ` [PATCH v9 12/24] x86/resctrl: Add cpumask_any_housekeeping() for limbo/overflow James Morse
2024-02-19 18:37   ` [tip: x86/cache] " tip-bot2 for James Morse
2024-02-13 18:44 ` James Morse [this message]
2024-02-19 18:37   ` [tip: x86/cache] x86/resctrl: Queue mon_event_read() instead of sending an IPI tip-bot2 for James Morse
2024-02-13 18:44 ` [PATCH v9 14/24] x86/resctrl: Allow resctrl_arch_rmid_read() to sleep James Morse
2024-02-19 18:37   ` [tip: x86/cache] " tip-bot2 for James Morse
2024-02-13 18:44 ` [PATCH v9 15/24] x86/resctrl: Allow arch to allocate memory needed in resctrl_arch_rmid_read() James Morse
2024-02-19 18:37   ` [tip: x86/cache] " tip-bot2 for James Morse
2024-02-13 18:44 ` [PATCH v9 16/24] x86/resctrl: Make resctrl_mounted checks explicit James Morse
2024-02-19 18:37   ` [tip: x86/cache] " tip-bot2 for James Morse
2024-02-13 18:44 ` [PATCH v9 17/24] x86/resctrl: Move alloc/mon static keys into helpers James Morse
2024-02-19 18:37   ` [tip: x86/cache] " tip-bot2 for James Morse
2024-02-13 18:44 ` [PATCH v9 18/24] x86/resctrl: Make rdt_enable_key the arch's decision to switch James Morse
2024-02-19 18:37   ` [tip: x86/cache] " tip-bot2 for James Morse
2024-02-13 18:44 ` [PATCH v9 19/24] x86/resctrl: Add helpers for system wide mon/alloc capable James Morse
2024-02-19 18:37   ` [tip: x86/cache] " tip-bot2 for James Morse
2024-02-13 18:44 ` [PATCH v9 20/24] x86/resctrl: Add CPU online callback for resctrl work James Morse
2024-02-19 18:37   ` [tip: x86/cache] " tip-bot2 for James Morse
2024-02-13 18:44 ` [PATCH v9 21/24] x86/resctrl: Allow overflow/limbo handlers to be scheduled on any-but cpu James Morse
2024-02-19 18:37   ` [tip: x86/cache] x86/resctrl: Allow overflow/limbo handlers to be scheduled on any-but CPU tip-bot2 for James Morse
2024-03-25 23:14   ` [PATCH v9 21/24] x86/resctrl: Allow overflow/limbo handlers to be scheduled on any-but cpu Tony Luck
2024-02-13 18:44 ` [PATCH v9 22/24] x86/resctrl: Add CPU offline callback for resctrl work James Morse
2024-02-19 18:37   ` [tip: x86/cache] " tip-bot2 for James Morse
2024-02-13 18:44 ` [PATCH v9 23/24] x86/resctrl: Move domain helper migration into resctrl_offline_cpu() James Morse
2024-02-19 18:37   ` [tip: x86/cache] " tip-bot2 for James Morse
2024-02-13 18:44 ` [PATCH v9 24/24] x86/resctrl: Separate arch and fs resctrl locks James Morse
2024-02-19 18:37   ` [tip: x86/cache] " tip-bot2 for James Morse
2024-02-13 23:26 ` [PATCH v9 00/24] x86/resctrl: monitored closid+rmid together, separate arch/fs locking Reinette Chatre
2024-02-14 15:01 ` Moger, Babu
2024-02-19 16:53   ` James Morse
2024-02-17  0:28 ` Tony Luck
2024-02-20 18:18   ` Reinette Chatre
2024-02-20 18:48     ` Luck, Tony
2024-02-17 10:55 ` Borislav Petkov
2024-02-19 16:49   ` Thomas Gleixner
2024-02-19 16:53     ` James Morse
2024-02-19 17:51       ` Borislav Petkov
2024-02-19 17:55         ` James Morse
2024-02-20 20:59     ` Tony Luck
2024-02-20 22:58       ` Reinette Chatre
2024-02-20 23:25         ` Luck, Tony
2024-02-21  0:34           ` [PATCH] x86/resctrl: Fix WARN in get_domain_from_cpu() Tony Luck
2024-02-21  5:10             ` Reinette Chatre
2024-02-21 12:06             ` James Morse
2024-02-21 19:31             ` [PATCH v2] " Tony Luck
2024-02-21 22:59               ` Reinette Chatre
2024-02-21 23:56                 ` Tony Luck
2024-02-22 18:50               ` [PATCH v3 0/2] x86/resctrl: Pass domain to target CPU Tony Luck
2024-02-22 18:50                 ` [PATCH v3 1/2] " Tony Luck
2024-02-27 22:05                   ` Reinette Chatre
2024-02-22 18:50                 ` [PATCH v3 2/2] x86/resctrl: Simply call convention for MSR update functions Tony Luck
2024-02-27 22:05                   ` Reinette Chatre
2024-02-22 23:18                 ` [PATCH v3 0/2] x86/resctrl: Pass domain to target CPU Reinette Chatre
2024-02-22 23:26                   ` Luck, Tony
2024-03-08 21:38               ` [PATCH v5 " Tony Luck
2024-03-08 21:38                 ` [PATCH v5 1/2] " Tony Luck
2024-03-12 20:06                   ` Moger, Babu
2024-03-12 20:08                     ` Luck, Tony
2024-04-24 12:01                   ` [tip: x86/cache] " tip-bot2 for Tony Luck
2024-03-08 21:38                 ` [PATCH v5 2/2] x86/resctrl: Simplify call convention for MSR update functions Tony Luck
2024-03-12 20:06                   ` Moger, Babu
2024-04-24 12:01                   ` [tip: x86/cache] " tip-bot2 for Tony Luck
2024-03-08 23:08                 ` [PATCH v5 0/2] x86/resctrl: Pass domain to target CPU Reinette Chatre
2024-03-08 23:26                   ` Luck, Tony
2024-03-29  4:55                 ` Reinette Chatre
2024-02-21 12:06         ` [PATCH v9 00/24] x86/resctrl: monitored closid+rmid together, separate arch/fs locking James Morse
2024-02-21 16:48           ` Reinette Chatre
2024-02-21 17:30             ` Luck, Tony
2024-02-22 15:26       ` [tip: x86/cache] x86/resctrl: Remove lockdep annotation that triggers false positive tip-bot2 for James Morse
2024-02-19 16:54   ` [PATCH v9 00/24] x86/resctrl: monitored closid+rmid together, separate arch/fs locking James Morse

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240213184438.16675-14-james.morse@arm.com \
    --to=james.morse@arm.com \
    --cc=Babu.Moger@amd.com \
    --cc=amitsinght@marvell.com \
    --cc=baolin.wang@linux.alibaba.com \
    --cc=bobo.shaobowang@huawei.com \
    --cc=bp@alien8.de \
    --cc=carl@os.amperecomputing.com \
    --cc=david@redhat.com \
    --cc=dfustini@baylibre.com \
    --cc=fenghua.yu@intel.com \
    --cc=hpa@zytor.com \
    --cc=lcherian@marvell.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=peternewman@google.com \
    --cc=quic_jiles@quicinc.com \
    --cc=reinette.chatre@intel.com \
    --cc=scott@os.amperecomputing.com \
    --cc=shameerali.kolothum.thodi@huawei.com \
    --cc=tan.shaopeng@fujitsu.com \
    --cc=tglx@linutronix.de \
    --cc=x86@kernel.org \
    --cc=xhao@linux.alibaba.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.