* [PATCH v2 00/10] x86/resctrl: Support for AMD QoS new features and bug fix
@ 2022-08-01 20:55 Babu Moger
2022-08-01 20:55 ` [PATCH v2 01/10] x86/resctrl: Fix min_cbm_bits for AMD Babu Moger
` (10 more replies)
0 siblings, 11 replies; 17+ messages in thread
From: Babu Moger @ 2022-08-01 20:55 UTC (permalink / raw)
To: fenghua.yu, reinette.chatre, tglx, mingo, bp
Cc: eranian, dave.hansen, x86, hpa, corbet, hpa, linux-kernel, linux-doc
New AMD processors can now support following QoS features.
1. Slow Memory Bandwidth Configuration
With this feature, the QOS enforcement policies can be applied
to the external slow memory connected to the host. QOS enforcement
is accomplished by assigning a Class Of Service (COS) to a processor
and specifying allocations or limits for that COS for each resource
to be allocated.
2. Bandwidth Monitoring Event Configuration (BMEC)
The bandwidth monitoring events mbm_total_event and mbm_local_event
are set to count all the total and local reads/writes respectively.
With the introduction of slow memory, the two counters are not enough
to count all the different types are memory events. With the feature
BMEC, the users have the option to configure mbm_total_event and
mbm_local_event to count the specific type of events.
Following are the bitmaps of events supported.
Bits Description
6 Dirty Victims from the QOS domain to all types of memory
5 Reads to slow memory in the non-local NUMA domain
4 Reads to slow memory in the local NUMA domain
3 Non-temporal writes to non-local NUMA domain
2 Non-temporal writes to local NUMA domain
1 Reads to memory in the non-local NUMA domain
0 Reads to memory in the local NUMA domain
This series adds support for these features.
Feature description is available in the specification, "AMD64 Technology Platform Quality
of Service Extensions, Revision: 1.03 Publication # 56375 Revision: 1.03 Issue Date: February 2022".
Link: https://www.amd.com/en/support/tech-docs/amd64-technology-platform-quality-service-extensions
Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
Also added a bugfix submitted by Stephane Eranian.
---
v2:
a. Rebased the patches to latest stable tree (v5.18.15). Resolved some conflicts.
b. Added the patch to fix CBM issue on AMD. This was originally discussed
https://lore.kernel.org/lkml/20220517001234.3137157-1-eranian@google.com/
v1:
https://lore.kernel.org/lkml/165757543252.416408.13547339307237713464.stgit@bmoger-ubuntu/
Babu Moger (10):
x86/resctrl: Fix min_cbm_bits for AMD
x86/cpufeatures: Add Slow Memory Bandwidth Allocation feature flag
x86/resctrl: Add a new resource type RDT_RESOURCE_SMBA
x86/resctrl: Detect and configure Slow Memory Bandwidth allocation
x86/cpufeatures: Add Bandwidth Monitoring Event Configuration feature flag
x86/resctrl: Introduce mon_configurable to detect Bandwidth Monitoring Event Configuration
x86/resctrl: Add sysfs interface files to read/write event configuration
x86/resctrl: Add the sysfs interface to read the event configuration
x86/resctrl: Add sysfs interface to write the event configuration
Documentation/x86: Update resctrl_ui.rst for new features
Documentation/x86/resctrl.rst | 123 +++++++++++
arch/x86/include/asm/cpufeatures.h | 2 +
arch/x86/kernel/cpu/resctrl/core.c | 70 ++++++-
arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 2 +-
arch/x86/kernel/cpu/resctrl/internal.h | 26 +++
arch/x86/kernel/cpu/resctrl/monitor.c | 16 ++
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 237 +++++++++++++++++++++-
arch/x86/kernel/cpu/scattered.c | 2 +
include/linux/resctrl.h | 1 +
9 files changed, 469 insertions(+), 10 deletions(-)
--
Signature
^ permalink raw reply [flat|nested] 17+ messages in thread
* [PATCH v2 01/10] x86/resctrl: Fix min_cbm_bits for AMD
2022-08-01 20:55 [PATCH v2 00/10] x86/resctrl: Support for AMD QoS new features and bug fix Babu Moger
@ 2022-08-01 20:55 ` Babu Moger
2022-08-01 20:55 ` [PATCH v2 02/10] x86/cpufeatures: Add Slow Memory Bandwidth Allocation feature flag Babu Moger
` (9 subsequent siblings)
10 siblings, 0 replies; 17+ messages in thread
From: Babu Moger @ 2022-08-01 20:55 UTC (permalink / raw)
To: fenghua.yu, reinette.chatre, tglx, mingo, bp
Cc: eranian, dave.hansen, x86, hpa, corbet, hpa, linux-kernel, linux-doc
AMD systems support zero CBM (capacity bit mask) for L3 allocation.
That is reflected in rdt_init_res_defs_amd() by:
r->cache.arch_has_empty_bitmaps = true;
However given the unified code in cbm_validate(), checking for:
val == 0 && !arch_has_empty_bitmaps
is not enough because of another check in cbm_validate():
if ((zero_bit - first_bit) < r->cache.min_cbm_bits)
The default value of r->cache.min_cbm_bits = 1.
Leading to:
$ cd /sys/fs/resctrl
$ mkdir foo
$ cd foo
$ echo L3:0=0 > schemata
-bash: echo: write error: Invalid argument
$ cat /sys/fs/resctrl/info/last_cmd_status
Need at least 1 bits in the mask
Fix the issue by initializing the min_cbm_bits to 0 for AMD. Also remove
the default setting of min_cbm_bits and initialize it separately.
After the fix
$ cd /sys/fs/resctrl
$ mkdir foo
$ cd foo
$ echo L3:0=0 > schemata
$ cat /sys/fs/resctrl/info/last_cmd_status
ok
Link: https://lore.kernel.org/lkml/20220517001234.3137157-1-eranian@google.com/
Fixes: 316e7f901f5a ("x86/resctrl: Add struct rdt_cache::arch_has_{sparse, empty}_bitmaps")
Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Babu Moger <babu.moger@amd.com>
---
arch/x86/kernel/cpu/resctrl/core.c | 8 ++------
1 file changed, 2 insertions(+), 6 deletions(-)
diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index bb1c3f5f60c8..a5c51a14fbce 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -66,9 +66,6 @@ struct rdt_hw_resource rdt_resources_all[] = {
.rid = RDT_RESOURCE_L3,
.name = "L3",
.cache_level = 3,
- .cache = {
- .min_cbm_bits = 1,
- },
.domains = domain_init(RDT_RESOURCE_L3),
.parse_ctrlval = parse_cbm,
.format_str = "%d=%0*x",
@@ -83,9 +80,6 @@ struct rdt_hw_resource rdt_resources_all[] = {
.rid = RDT_RESOURCE_L2,
.name = "L2",
.cache_level = 2,
- .cache = {
- .min_cbm_bits = 1,
- },
.domains = domain_init(RDT_RESOURCE_L2),
.parse_ctrlval = parse_cbm,
.format_str = "%d=%0*x",
@@ -877,6 +871,7 @@ static __init void rdt_init_res_defs_intel(void)
r->cache.arch_has_sparse_bitmaps = false;
r->cache.arch_has_empty_bitmaps = false;
r->cache.arch_has_per_cpu_cfg = false;
+ r->cache.min_cbm_bits = 1;
} else if (r->rid == RDT_RESOURCE_MBA) {
hw_res->msr_base = MSR_IA32_MBA_THRTL_BASE;
hw_res->msr_update = mba_wrmsr_intel;
@@ -897,6 +892,7 @@ static __init void rdt_init_res_defs_amd(void)
r->cache.arch_has_sparse_bitmaps = true;
r->cache.arch_has_empty_bitmaps = true;
r->cache.arch_has_per_cpu_cfg = true;
+ r->cache.min_cbm_bits = 0;
} else if (r->rid == RDT_RESOURCE_MBA) {
hw_res->msr_base = MSR_IA32_MBA_BW_BASE;
hw_res->msr_update = mba_wrmsr_amd;
^ permalink raw reply related [flat|nested] 17+ messages in thread
* [PATCH v2 02/10] x86/cpufeatures: Add Slow Memory Bandwidth Allocation feature flag
2022-08-01 20:55 [PATCH v2 00/10] x86/resctrl: Support for AMD QoS new features and bug fix Babu Moger
2022-08-01 20:55 ` [PATCH v2 01/10] x86/resctrl: Fix min_cbm_bits for AMD Babu Moger
@ 2022-08-01 20:55 ` Babu Moger
2022-08-01 20:55 ` [PATCH v2 03/10] x86/resctrl: Add a new resource type RDT_RESOURCE_SMBA Babu Moger
` (8 subsequent siblings)
10 siblings, 0 replies; 17+ messages in thread
From: Babu Moger @ 2022-08-01 20:55 UTC (permalink / raw)
To: fenghua.yu, reinette.chatre, tglx, mingo, bp
Cc: eranian, dave.hansen, x86, hpa, corbet, hpa, linux-kernel, linux-doc
Adds the new AMD feature X86_FEATURE_SMBA. With this feature, the QOS
enforcement policies can be applied to external slow memory connected
to the host. QOS enforcement is accomplished by assigning a Class Of
Service (COS) to a processor and specifying allocations or limits for
that COS for each resource to be allocated.
This feature is identified by the CPUID Function 8000_0020_EBX_x0.
CPUID Fn8000_0020_EBX_x0 AMD Bandwidth Enforcement Feature Identifiers (ECX=0)
Bits Field Name Description
2 L3SBE L3 external slow memory bandwidth enforcement
Feature description is available in the specification, "AMD64 Technology Platform Quality
of Service Extensions, Revision: 1.03 Publication # 56375 Revision: 1.03 Issue Date: February 2022".
Link: https://www.amd.com/en/support/tech-docs/amd64-technology-platform-quality-service-extensions
Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
Signed-off-by: Babu Moger <babu.moger@amd.com>
---
arch/x86/include/asm/cpufeatures.h | 1 +
arch/x86/kernel/cpu/scattered.c | 1 +
2 files changed, 2 insertions(+)
diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index a77b915d36a8..14b5cc8da75c 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -303,6 +303,7 @@
#define X86_FEATURE_RETHUNK (11*32+14) /* "" Use REturn THUNK */
#define X86_FEATURE_UNRET (11*32+15) /* "" AMD BTB untrain return */
#define X86_FEATURE_USE_IBPB_FW (11*32+16) /* "" Use IBPB during runtime firmware calls */
+#define X86_FEATURE_SMBA (11*32+17) /* SLOW Memory Bandwidth Allocation */
/* Intel-defined CPU features, CPUID level 0x00000007:1 (EAX), word 12 */
#define X86_FEATURE_AVX_VNNI (12*32+ 4) /* AVX VNNI instructions */
diff --git a/arch/x86/kernel/cpu/scattered.c b/arch/x86/kernel/cpu/scattered.c
index fd44b54c90d5..885ecf46abb2 100644
--- a/arch/x86/kernel/cpu/scattered.c
+++ b/arch/x86/kernel/cpu/scattered.c
@@ -44,6 +44,7 @@ static const struct cpuid_bit cpuid_bits[] = {
{ X86_FEATURE_CPB, CPUID_EDX, 9, 0x80000007, 0 },
{ X86_FEATURE_PROC_FEEDBACK, CPUID_EDX, 11, 0x80000007, 0 },
{ X86_FEATURE_MBA, CPUID_EBX, 6, 0x80000008, 0 },
+ { X86_FEATURE_SMBA, CPUID_EBX, 2, 0x80000020, 0 },
{ X86_FEATURE_PERFMON_V2, CPUID_EAX, 0, 0x80000022, 0 },
{ 0, 0, 0, 0, 0 }
};
^ permalink raw reply related [flat|nested] 17+ messages in thread
* [PATCH v2 03/10] x86/resctrl: Add a new resource type RDT_RESOURCE_SMBA
2022-08-01 20:55 [PATCH v2 00/10] x86/resctrl: Support for AMD QoS new features and bug fix Babu Moger
2022-08-01 20:55 ` [PATCH v2 01/10] x86/resctrl: Fix min_cbm_bits for AMD Babu Moger
2022-08-01 20:55 ` [PATCH v2 02/10] x86/cpufeatures: Add Slow Memory Bandwidth Allocation feature flag Babu Moger
@ 2022-08-01 20:55 ` Babu Moger
2022-08-01 20:56 ` [PATCH v2 04/10] x86/resctrl: Detect and configure Slow Memory Bandwidth allocation Babu Moger
` (7 subsequent siblings)
10 siblings, 0 replies; 17+ messages in thread
From: Babu Moger @ 2022-08-01 20:55 UTC (permalink / raw)
To: fenghua.yu, reinette.chatre, tglx, mingo, bp
Cc: eranian, dave.hansen, x86, hpa, corbet, hpa, linux-kernel, linux-doc
Adds a new resource type RDT_RESOURCE_SMBA to handle the QoS
enforcement policies on the external slow memory.
Signed-off-by: Babu Moger <babu.moger@amd.com>
---
arch/x86/kernel/cpu/resctrl/core.c | 12 ++++++++++++
arch/x86/kernel/cpu/resctrl/internal.h | 1 +
2 files changed, 13 insertions(+)
diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index a5c51a14fbce..6c38427b71a2 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -100,6 +100,18 @@ struct rdt_hw_resource rdt_resources_all[] = {
.fflags = RFTYPE_RES_MB,
},
},
+ [RDT_RESOURCE_SMBA] =
+ {
+ .r_resctrl = {
+ .rid = RDT_RESOURCE_SMBA,
+ .name = "SB",
+ .cache_level = 3,
+ .domains = domain_init(RDT_RESOURCE_SMBA),
+ .parse_ctrlval = parse_bw,
+ .format_str = "%d=%*u",
+ .fflags = RFTYPE_RES_MB,
+ },
+ },
};
/*
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 1d647188a43b..24a1dfeb6cb2 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -418,6 +418,7 @@ enum resctrl_res_level {
RDT_RESOURCE_L3,
RDT_RESOURCE_L2,
RDT_RESOURCE_MBA,
+ RDT_RESOURCE_SMBA,
/* Must be the last */
RDT_NUM_RESOURCES,
^ permalink raw reply related [flat|nested] 17+ messages in thread
* [PATCH v2 04/10] x86/resctrl: Detect and configure Slow Memory Bandwidth allocation
2022-08-01 20:55 [PATCH v2 00/10] x86/resctrl: Support for AMD QoS new features and bug fix Babu Moger
` (2 preceding siblings ...)
2022-08-01 20:55 ` [PATCH v2 03/10] x86/resctrl: Add a new resource type RDT_RESOURCE_SMBA Babu Moger
@ 2022-08-01 20:56 ` Babu Moger
2022-08-01 20:56 ` [PATCH v2 05/10] x86/cpufeatures: Add Bandwidth Monitoring Event Configuration feature flag Babu Moger
` (6 subsequent siblings)
10 siblings, 0 replies; 17+ messages in thread
From: Babu Moger @ 2022-08-01 20:56 UTC (permalink / raw)
To: fenghua.yu, reinette.chatre, tglx, mingo, bp
Cc: eranian, dave.hansen, x86, hpa, corbet, hpa, linux-kernel, linux-doc
The QoS slow memory configuration details are available via
CPUID_Fn80000020_EDX_x02. Detect the available details and
initialize the rest to defaults.
Signed-off-by: Babu Moger <babu.moger@amd.com>
---
arch/x86/kernel/cpu/resctrl/core.c | 50 +++++++++++++++++++++++++++++
arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 2 +
arch/x86/kernel/cpu/resctrl/internal.h | 1 +
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 9 +++--
4 files changed, 58 insertions(+), 4 deletions(-)
diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index 6c38427b71a2..36ad97ab7342 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -253,6 +253,37 @@ static bool __rdt_get_mem_config_amd(struct rdt_resource *r)
return true;
}
+static bool __rdt_get_s_mem_config_amd(struct rdt_resource *r)
+{
+ struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
+ union cpuid_0x10_3_eax eax;
+ union cpuid_0x10_x_edx edx;
+ u32 ebx, ecx;
+
+ cpuid_count(0x80000020, 2, &eax.full, &ebx, &ecx, &edx.full);
+ hw_res->num_closid = edx.split.cos_max + 1;
+ r->default_ctrl = MAX_MBA_BW_AMD;
+
+ /* AMD does not use delay */
+ r->membw.delay_linear = false;
+ r->membw.arch_needs_linear = false;
+
+ /*
+ * AMD does not use memory delay throttle model to control
+ * the allocation like Intel does.
+ */
+ r->membw.throttle_mode = THREAD_THROTTLE_UNDEFINED;
+ r->membw.min_bw = 0;
+ r->membw.bw_gran = 1;
+ /* Max value is 2048, Data width should be 4 in decimal */
+ r->data_width = 4;
+
+ r->alloc_capable = true;
+ r->alloc_enabled = true;
+
+ return true;
+}
+
static void rdt_get_cache_alloc_cfg(int idx, struct rdt_resource *r)
{
struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
@@ -787,6 +818,19 @@ static __init bool get_mem_config(void)
return false;
}
+static __init bool get_s_mem_config(void)
+{
+ struct rdt_hw_resource *hw_res = &rdt_resources_all[RDT_RESOURCE_SMBA];
+
+ if (!rdt_cpu_has(X86_FEATURE_SMBA))
+ return false;
+
+ if (boot_cpu_data.x86_vendor == X86_VENDOR_AMD)
+ return __rdt_get_s_mem_config_amd(&hw_res->r_resctrl);
+
+ return false;
+}
+
static __init bool get_rdt_alloc_resources(void)
{
struct rdt_resource *r;
@@ -817,6 +861,9 @@ static __init bool get_rdt_alloc_resources(void)
if (get_mem_config())
ret = true;
+ if (get_s_mem_config())
+ ret = true;
+
return ret;
}
@@ -908,6 +955,9 @@ static __init void rdt_init_res_defs_amd(void)
} else if (r->rid == RDT_RESOURCE_MBA) {
hw_res->msr_base = MSR_IA32_MBA_BW_BASE;
hw_res->msr_update = mba_wrmsr_amd;
+ } else if (r->rid == RDT_RESOURCE_SMBA) {
+ hw_res->msr_base = MSR_IA32_SMBA_BW_BASE;
+ hw_res->msr_update = mba_wrmsr_amd;
}
}
}
diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
index 87666275eed9..11ec3577db40 100644
--- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
+++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
@@ -203,7 +203,7 @@ static int parse_line(char *line, struct resctrl_schema *s,
unsigned long dom_id;
if (rdtgrp->mode == RDT_MODE_PSEUDO_LOCKSETUP &&
- r->rid == RDT_RESOURCE_MBA) {
+ (r->rid == RDT_RESOURCE_MBA || r->rid == RDT_RESOURCE_SMBA)) {
rdt_last_cmd_puts("Cannot pseudo-lock MBA resource\n");
return -EINVAL;
}
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 24a1dfeb6cb2..c049a274383c 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -14,6 +14,7 @@
#define MSR_IA32_L2_CBM_BASE 0xd10
#define MSR_IA32_MBA_THRTL_BASE 0xd50
#define MSR_IA32_MBA_BW_BASE 0xc0000200
+#define MSR_IA32_SMBA_BW_BASE 0xc0000280
#define MSR_IA32_QM_CTR 0x0c8e
#define MSR_IA32_QM_EVTSEL 0x0c8d
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index f276aff521e8..fc5286067201 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -1218,7 +1218,7 @@ static bool rdtgroup_mode_test_exclusive(struct rdtgroup *rdtgrp)
list_for_each_entry(s, &resctrl_schema_all, list) {
r = s->res;
- if (r->rid == RDT_RESOURCE_MBA)
+ if (r->rid == RDT_RESOURCE_MBA || r->rid == RDT_RESOURCE_SMBA)
continue;
has_cache = true;
list_for_each_entry(d, &r->domains, list) {
@@ -1399,7 +1399,8 @@ static int rdtgroup_size_show(struct kernfs_open_file *of,
ctrl = resctrl_arch_get_config(r, d,
rdtgrp->closid,
schema->conf_type);
- if (r->rid == RDT_RESOURCE_MBA)
+ if (r->rid == RDT_RESOURCE_MBA ||
+ r->rid == RDT_RESOURCE_SMBA)
size = ctrl;
else
size = rdtgroup_cbm_to_size(r, d, ctrl);
@@ -2807,7 +2808,9 @@ static int rdtgroup_init_alloc(struct rdtgroup *rdtgrp)
list_for_each_entry(s, &resctrl_schema_all, list) {
r = s->res;
- if (r->rid == RDT_RESOURCE_MBA) {
+ if (r->rid == RDT_RESOURCE_MBA ||
+ r->rid == RDT_RESOURCE_SMBA) {
+
rdtgroup_init_mba(r);
} else {
ret = rdtgroup_init_cat(s, rdtgrp->closid);
^ permalink raw reply related [flat|nested] 17+ messages in thread
* [PATCH v2 05/10] x86/cpufeatures: Add Bandwidth Monitoring Event Configuration feature flag
2022-08-01 20:55 [PATCH v2 00/10] x86/resctrl: Support for AMD QoS new features and bug fix Babu Moger
` (3 preceding siblings ...)
2022-08-01 20:56 ` [PATCH v2 04/10] x86/resctrl: Detect and configure Slow Memory Bandwidth allocation Babu Moger
@ 2022-08-01 20:56 ` Babu Moger
2022-08-01 20:56 ` [PATCH v2 06/10] x86/resctrl: Introduce mon_configurable to detect Bandwidth Monitoring Event Configuration Babu Moger
` (5 subsequent siblings)
10 siblings, 0 replies; 17+ messages in thread
From: Babu Moger @ 2022-08-01 20:56 UTC (permalink / raw)
To: fenghua.yu, reinette.chatre, tglx, mingo, bp
Cc: eranian, dave.hansen, x86, hpa, corbet, hpa, linux-kernel, linux-doc
Newer AMD processors support the new feature Bandwidth Monitoring Event
Configuration (BMEC).
Support of this feature is available via CPUID Fn8000_0020_EBX_x0 (ECX=0).
Bits Field Name Description
3 EVT_CFG Bandwidth Monitoring Event Configuration (BMEC)
Currently, the bandwidth monitoring events mbm_total_bytes and
mbm_local_bytes are set to count all the total and local reads/writes
respectively. With the introduction of SLOW memory, the two counters are
not enough to count all the different types of memory events. With the
feature BMEC, the users have the option to configure mbm_total_bytes and
mbm_local_bytes to count the specific type of events.
Each BMEC event has a configuration MSR, QOS_EVT_CFG (0x000_0400h +
EventID) which contains one field for each Bandwidth Type that can be used
to configure the Bandwidth Event to track any combination of supported
bandwidth types. The event will count requests from every Bandwidth Type
bit that is set in the corresponding configuration register.
Following are the types of events supported.
=======================================================================
Bits Description 6 Dirty Victims from the QOS domain to all types
of memory 5 Reads to slow memory in the non-local NUMA domain 4
Reads to slow memory in the local NUMA domain 3 Non-temporal writes
to non-local NUMA domain 2 Non-temporal writes to local NUMA domain 1
Reads to memory in the non-local NUMA domain 0 Reads to memory in the
local NUMA domain
Feature description is available in the specification, "AMD64 Technology
Platform Quality of Service Extensions, Revision: 1.03 Publication # 56375
Revision: 1.03 Issue Date: February 2022".
Link: https://www.amd.com/en/support/tech-docs/amd64-technology-platform-quality-service-extensions
Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
Signed-off-by: Babu Moger <babu.moger@amd.com>
---
arch/x86/include/asm/cpufeatures.h | 1 +
arch/x86/kernel/cpu/scattered.c | 1 +
2 files changed, 2 insertions(+)
diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 14b5cc8da75c..2608416f5d0a 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -304,6 +304,7 @@
#define X86_FEATURE_UNRET (11*32+15) /* "" AMD BTB untrain return */
#define X86_FEATURE_USE_IBPB_FW (11*32+16) /* "" Use IBPB during runtime firmware calls */
#define X86_FEATURE_SMBA (11*32+17) /* SLOW Memory Bandwidth Allocation */
+#define X86_FEATURE_BMEC (11*32+18) /* AMD Bandwidth Monitoring Event Configuration (BMEC) */
/* Intel-defined CPU features, CPUID level 0x00000007:1 (EAX), word 12 */
#define X86_FEATURE_AVX_VNNI (12*32+ 4) /* AVX VNNI instructions */
diff --git a/arch/x86/kernel/cpu/scattered.c b/arch/x86/kernel/cpu/scattered.c
index 885ecf46abb2..7981df0b910e 100644
--- a/arch/x86/kernel/cpu/scattered.c
+++ b/arch/x86/kernel/cpu/scattered.c
@@ -45,6 +45,7 @@ static const struct cpuid_bit cpuid_bits[] = {
{ X86_FEATURE_PROC_FEEDBACK, CPUID_EDX, 11, 0x80000007, 0 },
{ X86_FEATURE_MBA, CPUID_EBX, 6, 0x80000008, 0 },
{ X86_FEATURE_SMBA, CPUID_EBX, 2, 0x80000020, 0 },
+ { X86_FEATURE_BMEC, CPUID_EBX, 3, 0x80000020, 0 },
{ X86_FEATURE_PERFMON_V2, CPUID_EAX, 0, 0x80000022, 0 },
{ 0, 0, 0, 0, 0 }
};
^ permalink raw reply related [flat|nested] 17+ messages in thread
* [PATCH v2 06/10] x86/resctrl: Introduce mon_configurable to detect Bandwidth Monitoring Event Configuration
2022-08-01 20:55 [PATCH v2 00/10] x86/resctrl: Support for AMD QoS new features and bug fix Babu Moger
` (4 preceding siblings ...)
2022-08-01 20:56 ` [PATCH v2 05/10] x86/cpufeatures: Add Bandwidth Monitoring Event Configuration feature flag Babu Moger
@ 2022-08-01 20:56 ` Babu Moger
2022-08-01 20:56 ` [PATCH v2 07/10] x86/resctrl: Add sysfs interface files to read/write event configuration Babu Moger
` (4 subsequent siblings)
10 siblings, 0 replies; 17+ messages in thread
From: Babu Moger @ 2022-08-01 20:56 UTC (permalink / raw)
To: fenghua.yu, reinette.chatre, tglx, mingo, bp
Cc: eranian, dave.hansen, x86, hpa, corbet, hpa, linux-kernel, linux-doc
Newer AMD processors support the new feature Bandwidth Monitoring Event
Configuration (BMEC). The events mbm_total_bytes and mbm_local_bytes
are configurable when this feature is present.
Set mon_configurable if the feature is available.
Signed-off-by: Babu Moger <babu.moger@amd.com>
---
arch/x86/kernel/cpu/resctrl/monitor.c | 14 ++++++++++++++
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 17 +++++++++++++++++
include/linux/resctrl.h | 1 +
3 files changed, 32 insertions(+)
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index eaf25a234ff5..b9de417dac1c 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -682,6 +682,16 @@ static void l3_mon_evt_init(struct rdt_resource *r)
list_add_tail(&mbm_local_event.list, &r->evt_list);
}
+
+void __rdt_get_mon_l3_config_amd(struct rdt_resource *r)
+{
+ /*
+ * Check if CPU supports the Bandwidth Monitoring Event Configuration
+ */
+ if (boot_cpu_has(X86_FEATURE_BMEC))
+ r->mon_configurable = true;
+}
+
int rdt_get_mon_l3_config(struct rdt_resource *r)
{
unsigned int mbm_offset = boot_cpu_data.x86_cache_mbm_width_offset;
@@ -714,6 +724,10 @@ int rdt_get_mon_l3_config(struct rdt_resource *r)
if (ret)
return ret;
+ if (boot_cpu_data.x86_vendor == X86_VENDOR_AMD)
+ __rdt_get_mon_l3_config_amd(r);
+
+
l3_mon_evt_init(r);
r->mon_capable = true;
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index fc5286067201..855483b297a8 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -995,6 +995,16 @@ static int rdt_num_rmids_show(struct kernfs_open_file *of,
return 0;
}
+static int rdt_mon_configurable_show(struct kernfs_open_file *of,
+ struct seq_file *seq, void *v)
+{
+ struct rdt_resource *r = of->kn->parent->priv;
+
+ seq_printf(seq, "%d\n", r->mon_configurable);
+
+ return 0;
+}
+
static int rdt_mon_features_show(struct kernfs_open_file *of,
struct seq_file *seq, void *v)
{
@@ -1447,6 +1457,13 @@ static struct rftype res_common_files[] = {
.seq_show = rdt_num_rmids_show,
.fflags = RF_MON_INFO,
},
+ {
+ .name = "mon_configurable",
+ .mode = 0444,
+ .kf_ops = &rdtgroup_kf_single_ops,
+ .seq_show = rdt_mon_configurable_show,
+ .fflags = RF_MON_INFO,
+ },
{
.name = "cbm_mask",
.mode = 0444,
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 21deb5212bbd..4ee2b606ac14 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -154,6 +154,7 @@ struct rdt_resource {
bool mon_enabled;
bool alloc_capable;
bool mon_capable;
+ bool mon_configurable;
int num_rmid;
int cache_level;
struct resctrl_cache cache;
^ permalink raw reply related [flat|nested] 17+ messages in thread
* [PATCH v2 07/10] x86/resctrl: Add sysfs interface files to read/write event configuration
2022-08-01 20:55 [PATCH v2 00/10] x86/resctrl: Support for AMD QoS new features and bug fix Babu Moger
` (5 preceding siblings ...)
2022-08-01 20:56 ` [PATCH v2 06/10] x86/resctrl: Introduce mon_configurable to detect Bandwidth Monitoring Event Configuration Babu Moger
@ 2022-08-01 20:56 ` Babu Moger
2022-08-01 20:56 ` [PATCH v2 08/10] x86/resctrl: Add the sysfs interface to read the " Babu Moger
` (3 subsequent siblings)
10 siblings, 0 replies; 17+ messages in thread
From: Babu Moger @ 2022-08-01 20:56 UTC (permalink / raw)
To: fenghua.yu, reinette.chatre, tglx, mingo, bp
Cc: eranian, dave.hansen, x86, hpa, corbet, hpa, linux-kernel, linux-doc
Add two new sysfs files to read/write the event configuration if
the feature Bandwidth Monitoring Event Configuration (BMEC) is
supported. The file mbm_local_config is for the configuration
of the event mbm_local_bytes and the file mbm_total_config is
for the configuration of mbm_total_bytes.
$ls /sys/fs/resctrl/mon_data/mon_L3_00/mbm_local*
/sys/fs/resctrl/mon_data/mon_L3_00/mbm_local_bytes
/sys/fs/resctrl/mon_data/mon_L3_00/mbm_local_config
$ls /sys/fs/resctrl/mon_data/mon_L3_00/mbm_total*
/sys/fs/resctrl/mon_data/mon_L3_00/mbm_total_bytes
/sys/fs/resctrl/mon_data/mon_L3_00/mbm_total_config
Signed-off-by: Babu Moger <babu.moger@amd.com>
---
arch/x86/kernel/cpu/resctrl/internal.h | 3 +++
arch/x86/kernel/cpu/resctrl/monitor.c | 2 ++
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 32 ++++++++++++++++++++++++++++++++
3 files changed, 37 insertions(+)
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index c049a274383c..fc725f5e9024 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -72,11 +72,13 @@ DECLARE_STATIC_KEY_FALSE(rdt_mon_enable_key);
* struct mon_evt - Entry in the event list of a resource
* @evtid: event id
* @name: name of the event
+ * @config: current configuration
* @list: entry in &rdt_resource->evt_list
*/
struct mon_evt {
u32 evtid;
char *name;
+ char *config;
struct list_head list;
};
@@ -95,6 +97,7 @@ union mon_data_bits {
unsigned int rid : 10;
unsigned int evtid : 8;
unsigned int domid : 14;
+ unsigned int mon_config : 32;
} u;
};
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index b9de417dac1c..3f900241dbab 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -656,11 +656,13 @@ static struct mon_evt llc_occupancy_event = {
static struct mon_evt mbm_total_event = {
.name = "mbm_total_bytes",
.evtid = QOS_L3_MBM_TOTAL_EVENT_ID,
+ .config = "mbm_total_config",
};
static struct mon_evt mbm_local_event = {
.name = "mbm_local_bytes",
.evtid = QOS_L3_MBM_LOCAL_EVENT_ID,
+ .config = "mbm_local_config",
};
/*
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 855483b297a8..30d2182d4fda 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -254,6 +254,10 @@ static const struct kernfs_ops kf_mondata_ops = {
.seq_show = rdtgroup_mondata_show,
};
+static const struct kernfs_ops kf_mondata_config_ops = {
+ .atomic_write_len = PAGE_SIZE,
+};
+
static bool is_cpu_list(struct kernfs_open_file *of)
{
struct rftype *rft = of->kn->priv;
@@ -2534,6 +2538,25 @@ void rmdir_mondata_subdir_allrdtgrp(struct rdt_resource *r, unsigned int dom_id)
}
}
+static int mon_config_addfile(struct kernfs_node *parent_kn, const char *name,
+ void *priv)
+{
+ struct kernfs_node *kn;
+ int ret = 0;
+
+ kn = __kernfs_create_file(parent_kn, name, 0644,
+ GLOBAL_ROOT_UID, GLOBAL_ROOT_GID, 0,
+ &kf_mondata_config_ops, priv, NULL, NULL);
+ if (IS_ERR(kn))
+ return PTR_ERR(kn);
+
+ ret = rdtgroup_kn_set_ugid(kn);
+ if (ret)
+ kernfs_remove(kn);
+
+ return ret;
+}
+
static int mkdir_mondata_subdir(struct kernfs_node *parent_kn,
struct rdt_domain *d,
struct rdt_resource *r, struct rdtgroup *prgrp)
@@ -2568,6 +2591,15 @@ static int mkdir_mondata_subdir(struct kernfs_node *parent_kn,
if (ret)
goto out_destroy;
+ /* Create the sysfs event configuration files */
+ if (r->mon_configurable &&
+ (mevt->evtid == QOS_L3_MBM_TOTAL_EVENT_ID ||
+ mevt->evtid == QOS_L3_MBM_LOCAL_EVENT_ID)) {
+ ret = mon_config_addfile(kn, mevt->config, priv.priv);
+ if (ret)
+ goto out_destroy;
+ }
+
if (is_mbm_event(mevt->evtid))
mon_event_read(&rr, r, d, prgrp, mevt->evtid, true);
}
^ permalink raw reply related [flat|nested] 17+ messages in thread
* [PATCH v2 08/10] x86/resctrl: Add the sysfs interface to read the event configuration
2022-08-01 20:55 [PATCH v2 00/10] x86/resctrl: Support for AMD QoS new features and bug fix Babu Moger
` (6 preceding siblings ...)
2022-08-01 20:56 ` [PATCH v2 07/10] x86/resctrl: Add sysfs interface files to read/write event configuration Babu Moger
@ 2022-08-01 20:56 ` Babu Moger
2022-08-01 20:56 ` [PATCH v2 09/10] x86/resctrl: Add sysfs interface to write " Babu Moger
` (2 subsequent siblings)
10 siblings, 0 replies; 17+ messages in thread
From: Babu Moger @ 2022-08-01 20:56 UTC (permalink / raw)
To: fenghua.yu, reinette.chatre, tglx, mingo, bp
Cc: eranian, dave.hansen, x86, hpa, corbet, hpa, linux-kernel, linux-doc
The current event configuration can be viewed by the user by reading
the sysfs configuration file.
Following are the types of events supported.
====================================================================
Bits Description
6 Dirty Victims from the QOS domain to all types of memory
5 Reads to slow memory in the non-local NUMA domain
4 Reads to slow memory in the local NUMA domain
3 Non-temporal writes to non-local NUMA domain
2 Non-temporal writes to local NUMA domain
1 Reads to memory in the non-local NUMA domain
0 Reads to memory in the local NUMA domain
By default the mbm_total_bytes configuration is set to 0x7f to count
all the types of events and mbm_local_bytes configuration is set to
0x15 to count all the local memory events.
$cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_total_config
0x7f
$cat /sys/fs/resctrl/mon_data/mon_L3_00/mbm_local_config
0x15
Signed-off-by: Babu Moger <babu.moger@amd.com>
---
arch/x86/kernel/cpu/resctrl/internal.h | 21 ++++++++++
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 70 ++++++++++++++++++++++++++++++++
2 files changed, 91 insertions(+)
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index fc725f5e9024..457666709386 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -15,6 +15,7 @@
#define MSR_IA32_MBA_THRTL_BASE 0xd50
#define MSR_IA32_MBA_BW_BASE 0xc0000200
#define MSR_IA32_SMBA_BW_BASE 0xc0000280
+#define MSR_IA32_EVT_CFG_BASE 0xc0000400
#define MSR_IA32_QM_CTR 0x0c8e
#define MSR_IA32_QM_EVTSEL 0x0c8d
@@ -50,6 +51,26 @@
*/
#define MBM_CNTR_WIDTH_OFFSET_MAX (62 - MBM_CNTR_WIDTH_BASE)
+/* Reads to Local DRAM Memory */
+#define READS_TO_LOCAL_MEM BIT(0)
+
+/* Reads to Remote DRAM Memory */
+#define READS_TO_REMOTE_MEM BIT(1)
+
+/* Non-Temporal Writes to Local Memory */
+#define NON_TEMP_WRITE_TO_LOCAL_MEM BIT(2)
+
+/* Non-Temporal Writes to Remote Memory */
+#define NON_TEMP_WRITE_TO_REMOTE_MEM BIT(3)
+
+/* Reads to Local Memory the system identifies as "Slow Memory" */
+#define READS_TO_LOCAL_S_MEM BIT(4)
+
+/* Reads to Remote Memory the system identifies as "Slow Memory" */
+#define READS_TO_REMOTE_S_MEM BIT(5)
+
+/* Dirty Victims to All Types of Memory */
+#define DIRTY_VICTIS_TO_ALL_MEM BIT(6)
struct rdt_fs_context {
struct kernfs_fs_context kfc;
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 30d2182d4fda..e1847d49fa15 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -254,8 +254,78 @@ static const struct kernfs_ops kf_mondata_ops = {
.seq_show = rdtgroup_mondata_show,
};
+/*
+ * This is called via IPI to read the CQM/MBM counters
+ * in a domain.
+ */
+void mon_event_config_read(void *info)
+{
+ union mon_data_bits *md = info;
+ u32 evtid = md->u.evtid;
+ u32 h, msr_index;
+
+ switch (evtid) {
+ case QOS_L3_MBM_TOTAL_EVENT_ID:
+ msr_index = 0;
+ break;
+ case QOS_L3_MBM_LOCAL_EVENT_ID:
+ msr_index = 1;
+ break;
+ default:
+ return; /* Not expected to come here */
+ }
+
+ rdmsr(MSR_IA32_EVT_CFG_BASE + msr_index, md->u.mon_config, h);
+}
+
+void mondata_config_read(struct rdt_domain *d, union mon_data_bits *md)
+{
+ smp_call_function_any(&d->cpu_mask, mon_event_config_read, md, 1);
+}
+
+int rdtgroup_mondata_config_show(struct seq_file *m, void *arg)
+{
+ struct kernfs_open_file *of = m->private;
+ struct rdt_hw_resource *hw_res;
+ u32 resid, evtid, domid;
+ struct rdtgroup *rdtgrp;
+ struct rdt_resource *r;
+ union mon_data_bits md;
+ struct rdt_domain *d;
+ int ret = 0;
+
+ rdtgrp = rdtgroup_kn_lock_live(of->kn);
+ if (!rdtgrp) {
+ ret = -ENOENT;
+ goto out;
+ }
+
+ md.priv = of->kn->priv;
+ resid = md.u.rid;
+ domid = md.u.domid;
+ evtid = md.u.evtid;
+
+ hw_res = &rdt_resources_all[resid];
+ r = &hw_res->r_resctrl;
+
+ d = rdt_find_domain(r, domid, NULL);
+ if (IS_ERR_OR_NULL(d)) {
+ ret = -ENOENT;
+ goto out;
+ }
+
+ mondata_config_read(d, &md);
+
+ seq_printf(m, "0x%x\n", md.u.mon_config);
+
+out:
+ rdtgroup_kn_unlock(of->kn);
+ return ret;
+}
+
static const struct kernfs_ops kf_mondata_config_ops = {
.atomic_write_len = PAGE_SIZE,
+ .seq_show = rdtgroup_mondata_config_show,
};
static bool is_cpu_list(struct kernfs_open_file *of)
^ permalink raw reply related [flat|nested] 17+ messages in thread
* [PATCH v2 09/10] x86/resctrl: Add sysfs interface to write the event configuration
2022-08-01 20:55 [PATCH v2 00/10] x86/resctrl: Support for AMD QoS new features and bug fix Babu Moger
` (7 preceding siblings ...)
2022-08-01 20:56 ` [PATCH v2 08/10] x86/resctrl: Add the sysfs interface to read the " Babu Moger
@ 2022-08-01 20:56 ` Babu Moger
2022-08-01 20:56 ` [PATCH v2 10/10] Documentation/x86: Update resctrl_ui.rst for new features Babu Moger
2022-08-02 2:16 ` [PATCH v2 00/10] x86/resctrl: Support for AMD QoS new features and bug fix Bagas Sanjaya
10 siblings, 0 replies; 17+ messages in thread
From: Babu Moger @ 2022-08-01 20:56 UTC (permalink / raw)
To: fenghua.yu, reinette.chatre, tglx, mingo, bp
Cc: eranian, dave.hansen, x86, hpa, corbet, hpa, linux-kernel, linux-doc
Add the sysfs interface to write the event configuration for the
MBM configurable events. The event configuration can be changed by
writing to the sysfs file for that specific event.
Following are the types of events supported.
==================================================================
Bits Description
6 Dirty Victims from the QOS domain to all types of memory
5 Reads to slow memory in the non-local NUMA domain
4 Reads to slow memory in the local NUMA domain
3 Non-temporal writes to non-local NUMA domain
2 Non-temporal writes to local NUMA domain
1 Reads to memory in the non-local NUMA domain
0 Reads to memory in the local NUMA domain
By default the mbm_total_bytes configuration is set to 0x7f to count
all the types of events and mbm_local_bytes configuration is set to
0x15 to count all the local memory events.
For example:
To change the mbm_total_bytes to count all the reads, run the command.
$echo 0x33 > /sys/fs/resctrl/mon_data/mon_L3_00/mbm_total_config
To change the mbm_local_bytes to count all the slow memory reads, run
the command.
$echo 0x30 > /sys/fs/resctrl/mon_data/mon_L3_00/mbm_local_config
Signed-off-by: Babu Moger <babu.moger@amd.com>
---
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 109 ++++++++++++++++++++++++++++++++
1 file changed, 109 insertions(+)
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index e1847d49fa15..83c8780726ff 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -323,9 +323,118 @@ int rdtgroup_mondata_config_show(struct seq_file *m, void *arg)
return ret;
}
+/*
+ * This is called via IPI to read the CQM/MBM counters
+ * in a domain.
+ */
+void mon_event_config_write(void *info)
+{
+ union mon_data_bits *md = info;
+ u32 evtid = md->u.evtid;
+ u32 msr_index;
+
+ switch (evtid) {
+ case QOS_L3_MBM_TOTAL_EVENT_ID:
+ msr_index = 0;
+ break;
+ case QOS_L3_MBM_LOCAL_EVENT_ID:
+ msr_index = 1;
+ break;
+ default:
+ return; /* Not expected to come here */
+ }
+
+ wrmsr(MSR_IA32_EVT_CFG_BASE + msr_index, md->u.mon_config, 0);
+}
+
+ssize_t rdtgroup_mondata_config_write(struct kernfs_open_file *of,
+ char *buf, size_t nbytes, loff_t off)
+{
+ struct rdt_hw_resource *hw_res;
+ struct rdtgroup *rdtgrp;
+ struct rdt_resource *r;
+ unsigned int mon_config;
+ cpumask_var_t cpu_mask;
+ union mon_data_bits md;
+ struct rdt_domain *d;
+ u32 resid, domid;
+ int ret = 0, cpu;
+
+ ret = kstrtouint(buf, 0, &mon_config);
+ if (ret)
+ return ret;
+
+ rdt_last_cmd_clear();
+
+ /* mon_config cannot be more than the supported set of events */
+ if (mon_config > GENMASK(6, 0)) {
+ rdt_last_cmd_puts("Invalid event configuration\n");
+ return -EINVAL;
+ }
+
+ cpus_read_lock();
+ rdtgrp = rdtgroup_kn_lock_live(of->kn);
+ if (!rdtgrp) {
+ return -ENOENT;
+ goto e_unlock;
+ }
+
+ if (!zalloc_cpumask_var(&cpu_mask, GFP_KERNEL)) {
+ ret = -ENOMEM;
+ goto e_unlock;
+ }
+
+
+ md.priv = of->kn->priv;
+ resid = md.u.rid;
+ domid = md.u.domid;
+
+ hw_res = &rdt_resources_all[resid];
+ r = &hw_res->r_resctrl;
+ d = rdt_find_domain(r, domid, NULL);
+ if (IS_ERR_OR_NULL(d)) {
+ ret = -ENOENT;
+ goto e_cpumask;
+ }
+
+ md.u.mon_config = mon_config & 0xFF;
+
+ /* Pick all the CPUs in the domain instance */
+ for_each_cpu(cpu, &d->cpu_mask)
+ cpumask_set_cpu(cpu, cpu_mask);
+
+ cpu = get_cpu();
+ /* Update MSR_IA32_EVT_CFG_BASE MSR on this cpu if it's in cpu_mask */
+ if (cpumask_test_cpu(cpu, cpu_mask))
+ mon_event_config_write(&md);
+
+ /* Update MSR_IA32_EVT_CFG_BASE MSR on all other cpus in cpu_mask */
+ smp_call_function_many(cpu_mask, mon_event_config_write, &md, 1);
+ put_cpu();
+
+ /*
+ * When an Event Configuration is changed, the bandwidth counters
+ * for all RMIDs and Events will be cleared, and the U-bit for every
+ * RMID will be set on the next read to any BwEvent for every RMID.
+ * Clear the mbm_local and mbm_total counts for all the RMIDs.
+ */
+ memset(d->mbm_local, 0, sizeof(struct mbm_state) * r->num_rmid);
+ memset(d->mbm_total, 0, sizeof(struct mbm_state) * r->num_rmid);
+
+e_cpumask:
+ free_cpumask_var(cpu_mask);
+
+e_unlock:
+ rdtgroup_kn_unlock(of->kn);
+ cpus_read_unlock();
+
+ return ret ?: nbytes;
+}
+
static const struct kernfs_ops kf_mondata_config_ops = {
.atomic_write_len = PAGE_SIZE,
.seq_show = rdtgroup_mondata_config_show,
+ .write = rdtgroup_mondata_config_write,
};
static bool is_cpu_list(struct kernfs_open_file *of)
^ permalink raw reply related [flat|nested] 17+ messages in thread
* [PATCH v2 10/10] Documentation/x86: Update resctrl_ui.rst for new features
2022-08-01 20:55 [PATCH v2 00/10] x86/resctrl: Support for AMD QoS new features and bug fix Babu Moger
` (8 preceding siblings ...)
2022-08-01 20:56 ` [PATCH v2 09/10] x86/resctrl: Add sysfs interface to write " Babu Moger
@ 2022-08-01 20:56 ` Babu Moger
2022-08-03 3:50 ` Bagas Sanjaya
2022-08-02 2:16 ` [PATCH v2 00/10] x86/resctrl: Support for AMD QoS new features and bug fix Bagas Sanjaya
10 siblings, 1 reply; 17+ messages in thread
From: Babu Moger @ 2022-08-01 20:56 UTC (permalink / raw)
To: fenghua.yu, reinette.chatre, tglx, mingo, bp
Cc: eranian, dave.hansen, x86, hpa, corbet, hpa, linux-kernel, linux-doc
Update the documentation for the new features:
1. Slow Memory Bandwidth allocation.
With this feature, the QOS enforcement policies can be applied
to the external slow memory connected to the host. QOS enforcement
is accomplished by assigning a Class Of Service (COS) to a processor
and specifying allocations or limits for that COS for each resource
to be allocated.
2. Bandwidth Monitoring Event Configuration (BMEC).
The bandwidth monitoring events mbm_total_bytes and mbm_local_bytes
are set to count all the total and local reads/writes respectively.
With the introduction of slow memory, the two counters are not
enough to count all the different types are memory events. With the
feature BMEC, the users have the option to configure mbm_total_bytes
and mbm_local_bytes to count the specific type of events.
Added the instructions to configure with examples.
Signed-off-by: Babu Moger <babu.moger@amd.com>
---
Documentation/x86/resctrl.rst | 123 +++++++++++++++++++++++++++++++++++++++++
1 file changed, 123 insertions(+)
diff --git a/Documentation/x86/resctrl.rst b/Documentation/x86/resctrl.rst
index 71a531061e4e..ba6833171c0a 100644
--- a/Documentation/x86/resctrl.rst
+++ b/Documentation/x86/resctrl.rst
@@ -167,6 +167,12 @@ with the following files:
bytes) at which a previously used LLC_occupancy
counter can be considered for re-use.
+"mon_configurable":
+ Provides the information if the events mbm_total and
+ mbm_local are configurable. See the configuration
+ details for "mbm_total_config" and "mbm_local_config"
+ for more information.
+
Finally, in the top level of the "info" directory there is a file
named "last_cmd_status". This is reset with every "command" issued
via the file system (making new directories or writing to any of the
@@ -264,6 +270,29 @@ When monitoring is enabled all MON groups will also contain:
the sum for all tasks in the CTRL_MON group and all tasks in
MON groups. Please see example section for more details on usage.
+"mbm_total_config":
+"mbm_local_config":
+ This contains the current event configuration for the events
+ mbm_total_bytes and mbm_local_bytes, respectively, when the
+ Bandwidth Monitoring Event Configuration (BMEC) feature is supported.
+ These files are organized by L3 domains under the subdirectories
+ "mon_L3_00" and "mon_L3_01". When BMEC is supported, the events
+ mbm_local_bytes and mbm_total_bytes are configurable.
+
+ Following are the types of events supported.
+ Bits Description
+ 6 Dirty Victims from the QOS domain to all types of memory
+ 5 Reads to slow memory in the non-local NUMA domain
+ 4 Reads to slow memory in the local NUMA domain
+ 3 Non-temporal writes to non-local NUMA domain
+ 2 Non-temporal writes to local NUMA domain
+ 1 Reads to memory in the non-local NUMA domain
+ 0 Reads to memory in the local NUMA domain
+
+ By default, the mbm_total_bytes configuration is set to 0x7f to count
+ all the event types and the mbm_local_bytes configuration is set to
+ 0x15 to count all the local memory events.
+
Resource allocation rules
-------------------------
@@ -464,6 +493,14 @@ Memory bandwidth domain is L3 cache.
MB:<cache_id0>=bw_MBps0;<cache_id1>=bw_MBps1;...
+Slow Memory bandwidth Allocation (when supported)
+------------------------------------------
+
+Slow Memory b/w domain is L3 cache.
+::
+
+ SB:<cache_id0>=bandwidth0;<cache_id1>=bandwidth1;...
+
Reading/writing the schemata file
---------------------------------
Reading the schemata file will show the state of all resources
@@ -479,6 +516,44 @@ which you wish to change. E.g.
L3DATA:0=fffff;1=fffff;2=3c0;3=fffff
L3CODE:0=fffff;1=fffff;2=fffff;3=fffff
+Reading/writing the schemata file (on AMD systems)
+---------------------------------------------------------------
+Reading the schemata file will show the state of all resources
+on all domains. When writing the memory bandwidth allocation you
+only need to specify those values in an absolute number expressed
+in 1/8 GB/s increments. To allocate bandwidth limit of 2GB, you
+need to specify the value 16 (16 * 1/8 = 2). E.g.
+::
+
+ # cat schemata
+ MB:0=2048;1=2048;2=2048;3=2048
+ L3:0=ffff;1=ffff;2=ffff;3=ffff
+
+ # echo "MB:1=16" > schemata
+ # cat schemata
+ MB:0=2048;1= 16;2=2048;3=2048
+ L3:0=ffff;1=ffff;2=ffff;3=ffff
+
+Reading/writing the schemata file (on AMD systems) with slow memory
+---------------------------------------------------------------
+Reading the schemata file will show the state of all resources
+on all domains. When writing the memory bandwidth allocation you
+only need to specify those values in an absolute number expressed
+in 1/8 GB/s increments. To allocate bandwidth limit of 8GB, you
+need to specify the value 64 (64 * 1/8 = 8). E.g.
+::
+
+ # cat schemata
+ SB:0=2048;1=2048;2=2048;3=2048
+ MB:0=2048;1=2048;2=2048;3=2048
+ L3:0=ffff;1=ffff;2=ffff;3=ffff
+
+ # echo "SB:1=64" > schemata
+ # cat schemata
+ SB:0=2048;1= 64;2=2048;3=2048
+ MB:0=2048;1=2048;2=2048;3=2048
+ L3:0=ffff;1=ffff;2=ffff;3=ffff
+
Cache Pseudo-Locking
====================
CAT enables a user to specify the amount of cache space that an
@@ -1210,6 +1285,54 @@ View the llc occupancy snapshot::
# cat /sys/fs/resctrl/p1/mon_data/mon_L3_00/llc_occupancy
11234000
+Example 5 (Configure and Monitor specific event types)
+-------------------------------------------------
+
+A single socket system which has real time tasks running on cores 0-4
+and non real time tasks on other CPUs. We want to monitor the memory
+bandwidth allocation for specific events.
+::
+
+ # mount -t resctrl resctrl /sys/fs/resctrl
+ # cd /sys/fs/resctrl
+ # mkdir p1
+
+Move the CPUs 0-4 over to p1::
+
+ # echo 0xf > p1/cpus
+
+View the current mbm_local_bytes::
+
+ # cat /sys/fs/resctrl/p1/mon_data/mon_L3_00/mbm_local_bytes
+ 112501
+
+Change the mbm_local_bytes to count mon-temporal writes to both local
+and non-local NUMA domain. Refer to event supported bitmap under
+mbm_local_config::
+
+ # echo 0xc > /sys/fs/resctrl/p1/mon_data/mon_L3_00/mbm_local_config
+
+View the updated mbm_local_bytes::
+
+ # cat /sys/fs/resctrl/p1/mon_data/mon_L3_00/mbm_local_bytes
+ 12601
+
+Similar experiment on mbm_total_bytes. First view the current mbm_total_bytes::
+
+ # cat /sys/fs/resctrl/p1/mon_data/mon_L3_00/mbm_total_bytes
+ 1532501
+
+Change the mbm_total_bytes to count only reads to slow memory on both local
+and non-local NUMA domain. Refer to event supported bitmap under
+mbm_total_config::
+
+ # echo 0x30 > /sys/fs/resctrl/p1/mon_data/mon_L3_00/mbm_total_config
+
+View the updated mbm_total_bytes::
+
+ # cat /sys/fs/resctrl/p1/mon_data/mon_L3_00/mbm_total_bytes
+ 104562
+
Intel RDT Errata
================
^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH v2 00/10] x86/resctrl: Support for AMD QoS new features and bug fix
2022-08-01 20:55 [PATCH v2 00/10] x86/resctrl: Support for AMD QoS new features and bug fix Babu Moger
` (9 preceding siblings ...)
2022-08-01 20:56 ` [PATCH v2 10/10] Documentation/x86: Update resctrl_ui.rst for new features Babu Moger
@ 2022-08-02 2:16 ` Bagas Sanjaya
2022-08-02 9:39 ` Ingo Molnar
10 siblings, 1 reply; 17+ messages in thread
From: Bagas Sanjaya @ 2022-08-02 2:16 UTC (permalink / raw)
To: Babu Moger, fenghua.yu, reinette.chatre, tglx, mingo, bp
Cc: eranian, dave.hansen, x86, hpa, corbet, linux-kernel, linux-doc
On 8/2/22 03:55, Babu Moger wrote:
> v2:
> a. Rebased the patches to latest stable tree (v5.18.15). Resolved some conflicts.
> b. Added the patch to fix CBM issue on AMD. This was originally discussed
> https://lore.kernel.org/lkml/20220517001234.3137157-1-eranian@google.com/
>
Shouldn't this series be rebased on tip tree? I think it's odd to base
new feature series on stable tree, since patches on the latter are mostly
bugfixes backported from mainline.
--
An old man doll... just what I always wanted! - Clara
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v2 00/10] x86/resctrl: Support for AMD QoS new features and bug fix
2022-08-02 2:16 ` [PATCH v2 00/10] x86/resctrl: Support for AMD QoS new features and bug fix Bagas Sanjaya
@ 2022-08-02 9:39 ` Ingo Molnar
2022-08-02 13:41 ` Moger, Babu
0 siblings, 1 reply; 17+ messages in thread
From: Ingo Molnar @ 2022-08-02 9:39 UTC (permalink / raw)
To: Bagas Sanjaya
Cc: Babu Moger, fenghua.yu, reinette.chatre, tglx, mingo, bp,
eranian, dave.hansen, x86, hpa, corbet, linux-kernel, linux-doc
* Bagas Sanjaya <bagasdotme@gmail.com> wrote:
> On 8/2/22 03:55, Babu Moger wrote:
> > v2:
> > a. Rebased the patches to latest stable tree (v5.18.15). Resolved some conflicts.
> > b. Added the patch to fix CBM issue on AMD. This was originally discussed
> > https://lore.kernel.org/lkml/20220517001234.3137157-1-eranian@google.com/
> >
>
> Shouldn't this series be rebased on tip tree? I think it's odd to base
> new feature series on stable tree, since patches on the latter are mostly
> bugfixes backported from mainline.
Normally that's true, but AFAICS the patchset applies cleanly to latest
-tip as well, so 'stable == tip' in this regard.
Series looks good to me (without having tested it):
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Thanks,
Ingo
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v2 00/10] x86/resctrl: Support for AMD QoS new features and bug fix
2022-08-02 9:39 ` Ingo Molnar
@ 2022-08-02 13:41 ` Moger, Babu
0 siblings, 0 replies; 17+ messages in thread
From: Moger, Babu @ 2022-08-02 13:41 UTC (permalink / raw)
To: Ingo Molnar, Bagas Sanjaya
Cc: fenghua.yu, reinette.chatre, tglx, mingo, bp, eranian,
dave.hansen, x86, hpa, corbet, linux-kernel, linux-doc
On 8/2/22 04:39, Ingo Molnar wrote:
> * Bagas Sanjaya <bagasdotme@gmail.com> wrote:
>
>> On 8/2/22 03:55, Babu Moger wrote:
>>> v2:
>>> a. Rebased the patches to latest stable tree (v5.18.15). Resolved some conflicts.
>>> b. Added the patch to fix CBM issue on AMD. This was originally discussed
>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.kernel.org%2Flkml%2F20220517001234.3137157-1-eranian%40google.com%2F&data=05%7C01%7Cbabu.moger%40amd.com%7C90f7a070780f42fc9b5808da746ae74c%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637950299744423989%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=tZNlIIolEGbVfZmbeADA%2BHjLE%2BPhNWKDfcHfL2cNUuE%3D&reserved=0
>>>
>> Shouldn't this series be rebased on tip tree? I think it's odd to base
Sure. Will re-base on top of tip next revision.
>> new feature series on stable tree, since patches on the latter are mostly
>> bugfixes backported from mainline.
> Normally that's true, but AFAICS the patchset applies cleanly to latest
> -tip as well, so 'stable == tip' in this regard.
>
> Series looks good to me (without having tested it):
>
> Reviewed-by: Ingo Molnar <mingo@kernel.org>
Thank you.
Babu Moger
>
> Thanks,
>
> Ingo
--
Thanks
Babu Moger
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v2 10/10] Documentation/x86: Update resctrl_ui.rst for new features
2022-08-01 20:56 ` [PATCH v2 10/10] Documentation/x86: Update resctrl_ui.rst for new features Babu Moger
@ 2022-08-03 3:50 ` Bagas Sanjaya
2022-08-03 15:59 ` Moger, Babu
0 siblings, 1 reply; 17+ messages in thread
From: Bagas Sanjaya @ 2022-08-03 3:50 UTC (permalink / raw)
To: Babu Moger
Cc: fenghua.yu, reinette.chatre, tglx, mingo, bp, eranian,
dave.hansen, x86, hpa, corbet, linux-kernel, linux-doc
On Mon, Aug 01, 2022 at 03:56:43PM -0500, Babu Moger wrote:
> Update the documentation for the new features:
> 1. Slow Memory Bandwidth allocation.
> With this feature, the QOS enforcement policies can be applied
> to the external slow memory connected to the host. QOS enforcement
> is accomplished by assigning a Class Of Service (COS) to a processor
> and specifying allocations or limits for that COS for each resource
> to be allocated.
>
> 2. Bandwidth Monitoring Event Configuration (BMEC).
> The bandwidth monitoring events mbm_total_bytes and mbm_local_bytes
> are set to count all the total and local reads/writes respectively.
> With the introduction of slow memory, the two counters are not
> enough to count all the different types are memory events. With the
> feature BMEC, the users have the option to configure mbm_total_bytes
> and mbm_local_bytes to count the specific type of events.
>
> Added the instructions to configure with examples.
>
Say "Also add configuration instructions with examples" instead.
> Signed-off-by: Babu Moger <babu.moger@amd.com>
> ---
> Documentation/x86/resctrl.rst | 123 +++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 123 insertions(+)
>
> diff --git a/Documentation/x86/resctrl.rst b/Documentation/x86/resctrl.rst
> index 71a531061e4e..ba6833171c0a 100644
> --- a/Documentation/x86/resctrl.rst
> +++ b/Documentation/x86/resctrl.rst
> @@ -167,6 +167,12 @@ with the following files:
> bytes) at which a previously used LLC_occupancy
> counter can be considered for re-use.
>
> +"mon_configurable":
> + Provides the information if the events mbm_total and
> + mbm_local are configurable. See the configuration
> + details for "mbm_total_config" and "mbm_local_config"
> + for more information.
> +
> Finally, in the top level of the "info" directory there is a file
> named "last_cmd_status". This is reset with every "command" issued
> via the file system (making new directories or writing to any of the
> @@ -264,6 +270,29 @@ When monitoring is enabled all MON groups will also contain:
> the sum for all tasks in the CTRL_MON group and all tasks in
> MON groups. Please see example section for more details on usage.
>
> +"mbm_total_config":
> +"mbm_local_config":
> + This contains the current event configuration for the events
> + mbm_total_bytes and mbm_local_bytes, respectively, when the
> + Bandwidth Monitoring Event Configuration (BMEC) feature is supported.
> + These files are organized by L3 domains under the subdirectories
> + "mon_L3_00" and "mon_L3_01". When BMEC is supported, the events
> + mbm_local_bytes and mbm_total_bytes are configurable.
> +
> + Following are the types of events supported.
> + Bits Description
> + 6 Dirty Victims from the QOS domain to all types of memory
> + 5 Reads to slow memory in the non-local NUMA domain
> + 4 Reads to slow memory in the local NUMA domain
> + 3 Non-temporal writes to non-local NUMA domain
> + 2 Non-temporal writes to local NUMA domain
> + 1 Reads to memory in the non-local NUMA domain
> + 0 Reads to memory in the local NUMA domain
> +
Why not use table?
> + By default, the mbm_total_bytes configuration is set to 0x7f to count
> + all the event types and the mbm_local_bytes configuration is set to
> + 0x15 to count all the local memory events.
> +
> Resource allocation rules
> -------------------------
>
> @@ -464,6 +493,14 @@ Memory bandwidth domain is L3 cache.
>
> MB:<cache_id0>=bw_MBps0;<cache_id1>=bw_MBps1;...
>
> +Slow Memory bandwidth Allocation (when supported)
> +------------------------------------------
> +
> +Slow Memory b/w domain is L3 cache.
> +::
> +
> + SB:<cache_id0>=bandwidth0;<cache_id1>=bandwidth1;...
> +
> Reading/writing the schemata file
> ---------------------------------
> Reading the schemata file will show the state of all resources
> @@ -479,6 +516,44 @@ which you wish to change. E.g.
> L3DATA:0=fffff;1=fffff;2=3c0;3=fffff
> L3CODE:0=fffff;1=fffff;2=fffff;3=fffff
>
> +Reading/writing the schemata file (on AMD systems)
> +---------------------------------------------------------------
> +Reading the schemata file will show the state of all resources
> +on all domains. When writing the memory bandwidth allocation you
> +only need to specify those values in an absolute number expressed
> +in 1/8 GB/s increments. To allocate bandwidth limit of 2GB, you
> +need to specify the value 16 (16 * 1/8 = 2). E.g.
> +::
> +
> + # cat schemata
> + MB:0=2048;1=2048;2=2048;3=2048
> + L3:0=ffff;1=ffff;2=ffff;3=ffff
> +
> + # echo "MB:1=16" > schemata
> + # cat schemata
> + MB:0=2048;1= 16;2=2048;3=2048
> + L3:0=ffff;1=ffff;2=ffff;3=ffff
> +
> +Reading/writing the schemata file (on AMD systems) with slow memory
> +---------------------------------------------------------------
> +Reading the schemata file will show the state of all resources
> +on all domains. When writing the memory bandwidth allocation you
> +only need to specify those values in an absolute number expressed
> +in 1/8 GB/s increments. To allocate bandwidth limit of 8GB, you
> +need to specify the value 64 (64 * 1/8 = 8). E.g.
> +::
> +
> + # cat schemata
> + SB:0=2048;1=2048;2=2048;3=2048
> + MB:0=2048;1=2048;2=2048;3=2048
> + L3:0=ffff;1=ffff;2=ffff;3=ffff
> +
> + # echo "SB:1=64" > schemata
> + # cat schemata
> + SB:0=2048;1= 64;2=2048;3=2048
> + MB:0=2048;1=2048;2=2048;3=2048
> + L3:0=ffff;1=ffff;2=ffff;3=ffff
> +
> Cache Pseudo-Locking
> ====================
> CAT enables a user to specify the amount of cache space that an
> @@ -1210,6 +1285,54 @@ View the llc occupancy snapshot::
> # cat /sys/fs/resctrl/p1/mon_data/mon_L3_00/llc_occupancy
> 11234000
>
> +Example 5 (Configure and Monitor specific event types)
> +-------------------------------------------------
> +
> +A single socket system which has real time tasks running on cores 0-4
> +and non real time tasks on other CPUs. We want to monitor the memory
> +bandwidth allocation for specific events.
> +::
> +
> + # mount -t resctrl resctrl /sys/fs/resctrl
> + # cd /sys/fs/resctrl
> + # mkdir p1
> +
> +Move the CPUs 0-4 over to p1::
> +
> + # echo 0xf > p1/cpus
> +
> +View the current mbm_local_bytes::
> +
> + # cat /sys/fs/resctrl/p1/mon_data/mon_L3_00/mbm_local_bytes
> + 112501
> +
> +Change the mbm_local_bytes to count mon-temporal writes to both local
> +and non-local NUMA domain. Refer to event supported bitmap under
> +mbm_local_config::
> +
> + # echo 0xc > /sys/fs/resctrl/p1/mon_data/mon_L3_00/mbm_local_config
> +
> +View the updated mbm_local_bytes::
> +
> + # cat /sys/fs/resctrl/p1/mon_data/mon_L3_00/mbm_local_bytes
> + 12601
> +
> +Similar experiment on mbm_total_bytes. First view the current mbm_total_bytes::
> +
> + # cat /sys/fs/resctrl/p1/mon_data/mon_L3_00/mbm_total_bytes
> + 1532501
> +
> +Change the mbm_total_bytes to count only reads to slow memory on both local
> +and non-local NUMA domain. Refer to event supported bitmap under
> +mbm_total_config::
> +
> + # echo 0x30 > /sys/fs/resctrl/p1/mon_data/mon_L3_00/mbm_total_config
> +
> +View the updated mbm_total_bytes::
> +
> + # cat /sys/fs/resctrl/p1/mon_data/mon_L3_00/mbm_total_bytes
> + 104562
> +
> Intel RDT Errata
> ================
>
>
>
When building this documentation, I get new warnings:
Documentation/x86/resctrl.rst:275: WARNING: Unexpected indentation.
Documentation/x86/resctrl.rst:497: WARNING: Title underline too short.
Slow Memory bandwidth Allocation (when supported)
------------------------------------------
Documentation/x86/resctrl.rst:497: WARNING: Title underline too short.
Slow Memory bandwidth Allocation (when supported)
------------------------------------------
Documentation/x86/resctrl.rst:538: WARNING: Title underline too short.
Reading/writing the schemata file (on AMD systems) with slow memory
---------------------------------------------------------------
Documentation/x86/resctrl.rst:538: WARNING: Title underline too short.
Reading/writing the schemata file (on AMD systems) with slow memory
---------------------------------------------------------------
Documentation/x86/resctrl.rst:1289: WARNING: Title underline too short.
Example 5 (Configure and Monitor specific event types)
-------------------------------------------------
Documentation/x86/resctrl.rst:1289: WARNING: Title underline too short.
Example 5 (Configure and Monitor specific event types)
-------------------------------------------------
I have to apply the fixup:
---- >8 ----
diff --git a/Documentation/x86/resctrl.rst b/Documentation/x86/resctrl.rst
index ba6833171c0a39..886cb9bd7a181c 100644
--- a/Documentation/x86/resctrl.rst
+++ b/Documentation/x86/resctrl.rst
@@ -270,8 +270,7 @@ When monitoring is enabled all MON groups will also contain:
the sum for all tasks in the CTRL_MON group and all tasks in
MON groups. Please see example section for more details on usage.
-"mbm_total_config":
-"mbm_local_config":
+"mbm_total_config", "mbm_local_config":
This contains the current event configuration for the events
mbm_total_bytes and mbm_local_bytes, respectively, when the
Bandwidth Monitoring Event Configuration (BMEC) feature is supported.
@@ -279,8 +278,11 @@ When monitoring is enabled all MON groups will also contain:
"mon_L3_00" and "mon_L3_01". When BMEC is supported, the events
mbm_local_bytes and mbm_total_bytes are configurable.
- Following are the types of events supported.
+ Following are the types of events supported:
+
+ ==== ========================================================
Bits Description
+ ==== ========================================================
6 Dirty Victims from the QOS domain to all types of memory
5 Reads to slow memory in the non-local NUMA domain
4 Reads to slow memory in the local NUMA domain
@@ -288,6 +290,7 @@ When monitoring is enabled all MON groups will also contain:
2 Non-temporal writes to local NUMA domain
1 Reads to memory in the non-local NUMA domain
0 Reads to memory in the local NUMA domain
+ ==== ========================================================
By default, the mbm_total_bytes configuration is set to 0x7f to count
all the event types and the mbm_local_bytes configuration is set to
@@ -494,7 +497,7 @@ Memory bandwidth domain is L3 cache.
MB:<cache_id0>=bw_MBps0;<cache_id1>=bw_MBps1;...
Slow Memory bandwidth Allocation (when supported)
-------------------------------------------
+-------------------------------------------------
Slow Memory b/w domain is L3 cache.
::
@@ -517,7 +520,7 @@ which you wish to change. E.g.
L3CODE:0=fffff;1=fffff;2=fffff;3=fffff
Reading/writing the schemata file (on AMD systems)
----------------------------------------------------------------
+--------------------------------------------------
Reading the schemata file will show the state of all resources
on all domains. When writing the memory bandwidth allocation you
only need to specify those values in an absolute number expressed
@@ -535,7 +538,7 @@ need to specify the value 16 (16 * 1/8 = 2). E.g.
L3:0=ffff;1=ffff;2=ffff;3=ffff
Reading/writing the schemata file (on AMD systems) with slow memory
----------------------------------------------------------------
+-------------------------------------------------------------------
Reading the schemata file will show the state of all resources
on all domains. When writing the memory bandwidth allocation you
only need to specify those values in an absolute number expressed
@@ -1286,7 +1289,7 @@ View the llc occupancy snapshot::
11234000
Example 5 (Configure and Monitor specific event types)
--------------------------------------------------
+------------------------------------------------------
A single socket system which has real time tasks running on cores 0-4
and non real time tasks on other CPUs. We want to monitor the memory
Thanks.
--
An old man doll... just what I always wanted! - Clara
^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH v2 10/10] Documentation/x86: Update resctrl_ui.rst for new features
2022-08-03 3:50 ` Bagas Sanjaya
@ 2022-08-03 15:59 ` Moger, Babu
2022-08-04 7:57 ` Bagas Sanjaya
0 siblings, 1 reply; 17+ messages in thread
From: Moger, Babu @ 2022-08-03 15:59 UTC (permalink / raw)
To: Bagas Sanjaya
Cc: fenghua.yu, reinette.chatre, tglx, mingo, bp, eranian,
dave.hansen, x86, hpa, corbet, linux-kernel, linux-doc
Hi Bagas,
On 8/2/22 22:50, Bagas Sanjaya wrote:
> On Mon, Aug 01, 2022 at 03:56:43PM -0500, Babu Moger wrote:
>> Update the documentation for the new features:
>> 1. Slow Memory Bandwidth allocation.
>> With this feature, the QOS enforcement policies can be applied
>> to the external slow memory connected to the host. QOS enforcement
>> is accomplished by assigning a Class Of Service (COS) to a processor
>> and specifying allocations or limits for that COS for each resource
>> to be allocated.
>>
>> 2. Bandwidth Monitoring Event Configuration (BMEC).
>> The bandwidth monitoring events mbm_total_bytes and mbm_local_bytes
>> are set to count all the total and local reads/writes respectively.
>> With the introduction of slow memory, the two counters are not
>> enough to count all the different types are memory events. With the
>> feature BMEC, the users have the option to configure mbm_total_bytes
>> and mbm_local_bytes to count the specific type of events.
>>
>> Added the instructions to configure with examples.
>>
> Say "Also add configuration instructions with examples" instead.
Sure.
>
>> Signed-off-by: Babu Moger <babu.moger@amd.com>
>> ---
>> Documentation/x86/resctrl.rst | 123 +++++++++++++++++++++++++++++++++++++++++
>> 1 file changed, 123 insertions(+)
>>
>> diff --git a/Documentation/x86/resctrl.rst b/Documentation/x86/resctrl.rst
>> index 71a531061e4e..ba6833171c0a 100644
>> --- a/Documentation/x86/resctrl.rst
>> +++ b/Documentation/x86/resctrl.rst
>> @@ -167,6 +167,12 @@ with the following files:
>> bytes) at which a previously used LLC_occupancy
>> counter can be considered for re-use.
>>
>> +"mon_configurable":
>> + Provides the information if the events mbm_total and
>> + mbm_local are configurable. See the configuration
>> + details for "mbm_total_config" and "mbm_local_config"
>> + for more information.
>> +
>> Finally, in the top level of the "info" directory there is a file
>> named "last_cmd_status". This is reset with every "command" issued
>> via the file system (making new directories or writing to any of the
>> @@ -264,6 +270,29 @@ When monitoring is enabled all MON groups will also contain:
>> the sum for all tasks in the CTRL_MON group and all tasks in
>> MON groups. Please see example section for more details on usage.
>>
>> +"mbm_total_config":
>> +"mbm_local_config":
>> + This contains the current event configuration for the events
>> + mbm_total_bytes and mbm_local_bytes, respectively, when the
>> + Bandwidth Monitoring Event Configuration (BMEC) feature is supported.
>> + These files are organized by L3 domains under the subdirectories
>> + "mon_L3_00" and "mon_L3_01". When BMEC is supported, the events
>> + mbm_local_bytes and mbm_total_bytes are configurable.
>> +
>> + Following are the types of events supported.
>> + Bits Description
>> + 6 Dirty Victims from the QOS domain to all types of memory
>> + 5 Reads to slow memory in the non-local NUMA domain
>> + 4 Reads to slow memory in the local NUMA domain
>> + 3 Non-temporal writes to non-local NUMA domain
>> + 2 Non-temporal writes to local NUMA domain
>> + 1 Reads to memory in the non-local NUMA domain
>> + 0 Reads to memory in the local NUMA domain
>> +
> Why not use table?
I have not used the table in vim. But pulled your diff. Now it looks like
table.
Hope that works.
>
>> + By default, the mbm_total_bytes configuration is set to 0x7f to count
>> + all the event types and the mbm_local_bytes configuration is set to
>> + 0x15 to count all the local memory events.
>> +
>> Resource allocation rules
>> -------------------------
>>
>> @@ -464,6 +493,14 @@ Memory bandwidth domain is L3 cache.
>>
>> MB:<cache_id0>=bw_MBps0;<cache_id1>=bw_MBps1;...
>>
>> +Slow Memory bandwidth Allocation (when supported)
>> +------------------------------------------
>> +
>> +Slow Memory b/w domain is L3 cache.
>> +::
>> +
>> + SB:<cache_id0>=bandwidth0;<cache_id1>=bandwidth1;...
>> +
>> Reading/writing the schemata file
>> ---------------------------------
>> Reading the schemata file will show the state of all resources
>> @@ -479,6 +516,44 @@ which you wish to change. E.g.
>> L3DATA:0=fffff;1=fffff;2=3c0;3=fffff
>> L3CODE:0=fffff;1=fffff;2=fffff;3=fffff
>>
>> +Reading/writing the schemata file (on AMD systems)
>> +---------------------------------------------------------------
>> +Reading the schemata file will show the state of all resources
>> +on all domains. When writing the memory bandwidth allocation you
>> +only need to specify those values in an absolute number expressed
>> +in 1/8 GB/s increments. To allocate bandwidth limit of 2GB, you
>> +need to specify the value 16 (16 * 1/8 = 2). E.g.
>> +::
>> +
>> + # cat schemata
>> + MB:0=2048;1=2048;2=2048;3=2048
>> + L3:0=ffff;1=ffff;2=ffff;3=ffff
>> +
>> + # echo "MB:1=16" > schemata
>> + # cat schemata
>> + MB:0=2048;1= 16;2=2048;3=2048
>> + L3:0=ffff;1=ffff;2=ffff;3=ffff
>> +
>> +Reading/writing the schemata file (on AMD systems) with slow memory
>> +---------------------------------------------------------------
>> +Reading the schemata file will show the state of all resources
>> +on all domains. When writing the memory bandwidth allocation you
>> +only need to specify those values in an absolute number expressed
>> +in 1/8 GB/s increments. To allocate bandwidth limit of 8GB, you
>> +need to specify the value 64 (64 * 1/8 = 8). E.g.
>> +::
>> +
>> + # cat schemata
>> + SB:0=2048;1=2048;2=2048;3=2048
>> + MB:0=2048;1=2048;2=2048;3=2048
>> + L3:0=ffff;1=ffff;2=ffff;3=ffff
>> +
>> + # echo "SB:1=64" > schemata
>> + # cat schemata
>> + SB:0=2048;1= 64;2=2048;3=2048
>> + MB:0=2048;1=2048;2=2048;3=2048
>> + L3:0=ffff;1=ffff;2=ffff;3=ffff
>> +
>> Cache Pseudo-Locking
>> ====================
>> CAT enables a user to specify the amount of cache space that an
>> @@ -1210,6 +1285,54 @@ View the llc occupancy snapshot::
>> # cat /sys/fs/resctrl/p1/mon_data/mon_L3_00/llc_occupancy
>> 11234000
>>
>> +Example 5 (Configure and Monitor specific event types)
>> +-------------------------------------------------
>> +
>> +A single socket system which has real time tasks running on cores 0-4
>> +and non real time tasks on other CPUs. We want to monitor the memory
>> +bandwidth allocation for specific events.
>> +::
>> +
>> + # mount -t resctrl resctrl /sys/fs/resctrl
>> + # cd /sys/fs/resctrl
>> + # mkdir p1
>> +
>> +Move the CPUs 0-4 over to p1::
>> +
>> + # echo 0xf > p1/cpus
>> +
>> +View the current mbm_local_bytes::
>> +
>> + # cat /sys/fs/resctrl/p1/mon_data/mon_L3_00/mbm_local_bytes
>> + 112501
>> +
>> +Change the mbm_local_bytes to count mon-temporal writes to both local
>> +and non-local NUMA domain. Refer to event supported bitmap under
>> +mbm_local_config::
>> +
>> + # echo 0xc > /sys/fs/resctrl/p1/mon_data/mon_L3_00/mbm_local_config
>> +
>> +View the updated mbm_local_bytes::
>> +
>> + # cat /sys/fs/resctrl/p1/mon_data/mon_L3_00/mbm_local_bytes
>> + 12601
>> +
>> +Similar experiment on mbm_total_bytes. First view the current mbm_total_bytes::
>> +
>> + # cat /sys/fs/resctrl/p1/mon_data/mon_L3_00/mbm_total_bytes
>> + 1532501
>> +
>> +Change the mbm_total_bytes to count only reads to slow memory on both local
>> +and non-local NUMA domain. Refer to event supported bitmap under
>> +mbm_total_config::
>> +
>> + # echo 0x30 > /sys/fs/resctrl/p1/mon_data/mon_L3_00/mbm_total_config
>> +
>> +View the updated mbm_total_bytes::
>> +
>> + # cat /sys/fs/resctrl/p1/mon_data/mon_L3_00/mbm_total_bytes
>> + 104562
>> +
>> Intel RDT Errata
>> ================
>>
>>
>>
> When building this documentation, I get new warnings:
Is there a command I can use to build? That would be helpful next time.
>
> Documentation/x86/resctrl.rst:275: WARNING: Unexpected indentation.
> Documentation/x86/resctrl.rst:497: WARNING: Title underline too short.
>
> Slow Memory bandwidth Allocation (when supported)
> ------------------------------------------
> Documentation/x86/resctrl.rst:497: WARNING: Title underline too short.
>
> Slow Memory bandwidth Allocation (when supported)
> ------------------------------------------
> Documentation/x86/resctrl.rst:538: WARNING: Title underline too short.
>
> Reading/writing the schemata file (on AMD systems) with slow memory
> ---------------------------------------------------------------
> Documentation/x86/resctrl.rst:538: WARNING: Title underline too short.
>
> Reading/writing the schemata file (on AMD systems) with slow memory
> ---------------------------------------------------------------
> Documentation/x86/resctrl.rst:1289: WARNING: Title underline too short.
>
> Example 5 (Configure and Monitor specific event types)
> -------------------------------------------------
> Documentation/x86/resctrl.rst:1289: WARNING: Title underline too short.
>
> Example 5 (Configure and Monitor specific event types)
> -------------------------------------------------
>
> I have to apply the fixup:
>
> ---- >8 ----
>
> diff --git a/Documentation/x86/resctrl.rst b/Documentation/x86/resctrl.rst
> index ba6833171c0a39..886cb9bd7a181c 100644
> --- a/Documentation/x86/resctrl.rst
> +++ b/Documentation/x86/resctrl.rst
> @@ -270,8 +270,7 @@ When monitoring is enabled all MON groups will also contain:
> the sum for all tasks in the CTRL_MON group and all tasks in
> MON groups. Please see example section for more details on usage.
>
> -"mbm_total_config":
> -"mbm_local_config":
> +"mbm_total_config", "mbm_local_config":
> This contains the current event configuration for the events
> mbm_total_bytes and mbm_local_bytes, respectively, when the
> Bandwidth Monitoring Event Configuration (BMEC) feature is supported.
> @@ -279,8 +278,11 @@ When monitoring is enabled all MON groups will also contain:
> "mon_L3_00" and "mon_L3_01". When BMEC is supported, the events
> mbm_local_bytes and mbm_total_bytes are configurable.
>
> - Following are the types of events supported.
> + Following are the types of events supported:
> +
> + ==== ========================================================
> Bits Description
> + ==== ========================================================
> 6 Dirty Victims from the QOS domain to all types of memory
> 5 Reads to slow memory in the non-local NUMA domain
> 4 Reads to slow memory in the local NUMA domain
> @@ -288,6 +290,7 @@ When monitoring is enabled all MON groups will also contain:
> 2 Non-temporal writes to local NUMA domain
> 1 Reads to memory in the non-local NUMA domain
> 0 Reads to memory in the local NUMA domain
> + ==== ========================================================
>
> By default, the mbm_total_bytes configuration is set to 0x7f to count
> all the event types and the mbm_local_bytes configuration is set to
> @@ -494,7 +497,7 @@ Memory bandwidth domain is L3 cache.
> MB:<cache_id0>=bw_MBps0;<cache_id1>=bw_MBps1;...
>
> Slow Memory bandwidth Allocation (when supported)
> -------------------------------------------
> +-------------------------------------------------
>
> Slow Memory b/w domain is L3 cache.
> ::
> @@ -517,7 +520,7 @@ which you wish to change. E.g.
> L3CODE:0=fffff;1=fffff;2=fffff;3=fffff
>
> Reading/writing the schemata file (on AMD systems)
> ----------------------------------------------------------------
> +--------------------------------------------------
> Reading the schemata file will show the state of all resources
> on all domains. When writing the memory bandwidth allocation you
> only need to specify those values in an absolute number expressed
> @@ -535,7 +538,7 @@ need to specify the value 16 (16 * 1/8 = 2). E.g.
> L3:0=ffff;1=ffff;2=ffff;3=ffff
>
> Reading/writing the schemata file (on AMD systems) with slow memory
> ----------------------------------------------------------------
> +-------------------------------------------------------------------
> Reading the schemata file will show the state of all resources
> on all domains. When writing the memory bandwidth allocation you
> only need to specify those values in an absolute number expressed
> @@ -1286,7 +1289,7 @@ View the llc occupancy snapshot::
> 11234000
>
> Example 5 (Configure and Monitor specific event types)
> --------------------------------------------------
> +------------------------------------------------------
>
> A single socket system which has real time tasks running on cores 0-4
> and non real time tasks on other CPUs. We want to monitor the memory
>
> Thanks.
>
I have pulled your diff now. Thank you
--
Thanks
Babu Moger
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v2 10/10] Documentation/x86: Update resctrl_ui.rst for new features
2022-08-03 15:59 ` Moger, Babu
@ 2022-08-04 7:57 ` Bagas Sanjaya
0 siblings, 0 replies; 17+ messages in thread
From: Bagas Sanjaya @ 2022-08-04 7:57 UTC (permalink / raw)
To: babu.moger
Cc: fenghua.yu, reinette.chatre, tglx, mingo, bp, eranian,
dave.hansen, x86, hpa, corbet, linux-kernel, linux-doc
On 8/3/22 22:59, Moger, Babu wrote:
>> When building this documentation, I get new warnings:
> Is there a command I can use to build? That would be helpful next time.
Use `make htmldocs`.
--
An old man doll... just what I always wanted! - Clara
^ permalink raw reply [flat|nested] 17+ messages in thread
end of thread, other threads:[~2022-08-04 7:57 UTC | newest]
Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-08-01 20:55 [PATCH v2 00/10] x86/resctrl: Support for AMD QoS new features and bug fix Babu Moger
2022-08-01 20:55 ` [PATCH v2 01/10] x86/resctrl: Fix min_cbm_bits for AMD Babu Moger
2022-08-01 20:55 ` [PATCH v2 02/10] x86/cpufeatures: Add Slow Memory Bandwidth Allocation feature flag Babu Moger
2022-08-01 20:55 ` [PATCH v2 03/10] x86/resctrl: Add a new resource type RDT_RESOURCE_SMBA Babu Moger
2022-08-01 20:56 ` [PATCH v2 04/10] x86/resctrl: Detect and configure Slow Memory Bandwidth allocation Babu Moger
2022-08-01 20:56 ` [PATCH v2 05/10] x86/cpufeatures: Add Bandwidth Monitoring Event Configuration feature flag Babu Moger
2022-08-01 20:56 ` [PATCH v2 06/10] x86/resctrl: Introduce mon_configurable to detect Bandwidth Monitoring Event Configuration Babu Moger
2022-08-01 20:56 ` [PATCH v2 07/10] x86/resctrl: Add sysfs interface files to read/write event configuration Babu Moger
2022-08-01 20:56 ` [PATCH v2 08/10] x86/resctrl: Add the sysfs interface to read the " Babu Moger
2022-08-01 20:56 ` [PATCH v2 09/10] x86/resctrl: Add sysfs interface to write " Babu Moger
2022-08-01 20:56 ` [PATCH v2 10/10] Documentation/x86: Update resctrl_ui.rst for new features Babu Moger
2022-08-03 3:50 ` Bagas Sanjaya
2022-08-03 15:59 ` Moger, Babu
2022-08-04 7:57 ` Bagas Sanjaya
2022-08-02 2:16 ` [PATCH v2 00/10] x86/resctrl: Support for AMD QoS new features and bug fix Bagas Sanjaya
2022-08-02 9:39 ` Ingo Molnar
2022-08-02 13:41 ` Moger, Babu
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).