* [PATCH 00/10] x86/CPU and x86/resctrl: Support pseudo-lock regions spanning L2 and L3 cache
@ 2019-06-26 17:48 Reinette Chatre
  2019-06-26 17:48 ` [PATCH 01/10] x86/CPU: Expose if cache is inclusive of lower level caches Reinette Chatre
                   ` (10 more replies)
  0 siblings, 11 replies; 13+ messages in thread
From: Reinette Chatre @ 2019-06-26 17:48 UTC (permalink / raw)
  To: tglx, fenghua.yu, bp, tony.luck
  Cc: mingo, hpa, x86, linux-kernel, Reinette Chatre

Dear Maintainers,

Cache pseudo-locking involves preloading a region of physical memory into a
reserved portion of cache that no task or CPU can subsequently fill, and
that from then on serves only cache hits. At this time it is only possible
to create cache pseudo-locked regions in either the L2 or the L3 cache, on
systems that support L2 Cache Allocation Technology (CAT) or L3 CAT
respectively, because CAT is the mechanism used to manage reservations of
cache portions.

This series introduces support for cache pseudo-locked regions that can span
L2 and L3 cache in preparation for systems that may support CAT on both L2
and L3 cache. Only systems with an inclusive L3 cache are supported at this
time because, if the L3 cache is not inclusive, pseudo-locked memory within
the L3 cache would be evicted when migrated to L2. Because of this
constraint the first patch in this series introduces support in cacheinfo.c
for resctrl to discover whether the L3 cache is inclusive. All other patches
in this series are to the resctrl subsystem.

In support of cache pseudo-locked regions spanning L2 and L3 cache the term
"cache pseudo-lock portion" is introduced. Each portion of a cache
pseudo-locked region spans one level of cache and a cache pseudo-locked
region can be made up of one or two cache pseudo-lock portions.
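
Concretely, patch 8 of this series models a portion with roughly the
following structure, and each pseudo-locked region carries a list of one
or two of these:

  struct pseudo_lock_portion {
          struct rdt_resource     *r;     /* L2 or L3 resource */
          int                     d_id;   /* ID of the cache instance */
          u32                     cbm;    /* capacity bitmask of this portion */
          struct list_head        list;   /* entry in the region's portion list */
  };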

On systems supporting L2 and L3 CAT where the L3 cache is inclusive it is
possible to create two types of pseudo-locked regions:
1) A pseudo-locked region spanning just the L3 cache, consisting of a
   single pseudo-locked portion.
2) A pseudo-locked region spanning L2 and L3 cache, consisting of two
   pseudo-locked portions.

In an L3 inclusive cache system an L2 pseudo-locked portion is required to
be matched with an L3 pseudo-locked portion to prevent a cache line from
being evicted from L2 when it is evicted from L3.

Patches 2 to 8 to the resctrl subsystem prepare for the new feature and
should result in no functional change, although some comments already refer
to the new feature. Support for pseudo-locked regions spanning L2 and L3
cache is introduced in patches 9 and 10.

Your feedback will be greatly appreciated.

Regards,

Reinette

Reinette Chatre (10):
  x86/CPU: Expose if cache is inclusive of lower level caches
  x86/resctrl: Remove unnecessary size compute
  x86/resctrl: Constrain C-states during pseudo-lock region init
  x86/resctrl: Set cache line size using new utility
  x86/resctrl: Associate pseudo-locked region's cache instance by id
  x86/resctrl: Introduce utility to return pseudo-locked cache portion
  x86/resctrl: Remove unnecessary pointer to pseudo-locked region
  x86/resctrl: Support pseudo-lock regions spanning resources
  x86/resctrl: Pseudo-lock portions of multiple resources
  x86/resctrl: Only pseudo-lock L3 cache when inclusive

 arch/x86/kernel/cpu/cacheinfo.c           |  42 +-
 arch/x86/kernel/cpu/resctrl/core.c        |   7 -
 arch/x86/kernel/cpu/resctrl/ctrlmondata.c |  37 +-
 arch/x86/kernel/cpu/resctrl/internal.h    |  39 +-
 arch/x86/kernel/cpu/resctrl/pseudo_lock.c | 444 +++++++++++++++++++---
 arch/x86/kernel/cpu/resctrl/rdtgroup.c    |  61 ++-
 include/linux/cacheinfo.h                 |   4 +
 7 files changed, 512 insertions(+), 122 deletions(-)

-- 
2.17.2



* [PATCH 01/10] x86/CPU: Expose if cache is inclusive of lower level caches
  2019-06-26 17:48 [PATCH 00/10] x86/CPU and x86/resctrl: Support pseudo-lock regions spanning L2 and L3 cache Reinette Chatre
@ 2019-06-26 17:48 ` Reinette Chatre
  2019-06-26 17:48 ` [PATCH 02/10] x86/resctrl: Remove unnecessary size compute Reinette Chatre
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 13+ messages in thread
From: Reinette Chatre @ 2019-06-26 17:48 UTC (permalink / raw)
  To: tglx, fenghua.yu, bp, tony.luck
  Cc: mingo, hpa, x86, linux-kernel, Reinette Chatre

Deterministic cache parameters can be learned from CPUID leaf 04H.
Executing CPUID with EAX set to 04H and a particular cache index in ECX
returns the cache parameters associated with that index in the EAX, EBX,
ECX, and EDX registers.

At this time, when discovering cache parameters for a particular cache
index, only the parameters returned in EAX, EBX, and ECX are parsed.
Parameters returned in EDX are ignored. One of the parameters in EDX,
whether the cache is inclusive of lower level caches, is valuable to
know when determining if a system can support L3 cache pseudo-locking.
If the L3 cache is not inclusive then pseudo-locked data within the L3
cache would be evicted when migrated to L2.
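
As an illustration only (not part of this patch), the bit in question can
be read from user space with a minimal sketch; per the Intel SDM, EDX bit 1
of CPUID leaf 04H reports whether the cache at the queried index is
inclusive of lower level caches:

  #include <cpuid.h>
  #include <stdio.h>

  int main(void)
  {
          unsigned int eax, ebx, ecx, edx;
          unsigned int index = 3; /* cache index to query; 3 is commonly the L3 */

          /* CPUID.(EAX=04H, ECX=index): deterministic cache parameters */
          __cpuid_count(4, index, eax, ebx, ecx, edx);

          if ((eax & 0x1f) == 0)  /* cache type 0: no cache at this index */
                  return 1;

          printf("cache index %u inclusive: %u\n", index, (edx >> 1) & 1);
          return 0;
  }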

Add support for parsing the cache parameters obtained from EDX and make
the inclusive cache parameter available via the cacheinfo that can be
queried from the cache pseudo-locking code.

Do not expose this information to user space at this time; it is currently
required within the kernel only. It is also not obvious what the best
formatting of this information would be given the variety of ways in which
users may want to consume it.

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
 arch/x86/kernel/cpu/cacheinfo.c | 42 +++++++++++++++++++++++++++++----
 include/linux/cacheinfo.h       |  4 ++++
 2 files changed, 42 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kernel/cpu/cacheinfo.c b/arch/x86/kernel/cpu/cacheinfo.c
index 395d46f78582..f99104673329 100644
--- a/arch/x86/kernel/cpu/cacheinfo.c
+++ b/arch/x86/kernel/cpu/cacheinfo.c
@@ -154,10 +154,33 @@ union _cpuid4_leaf_ecx {
 	u32 full;
 };
 
+/*
+ * According to details about CPUID instruction documented in Intel SDM
+ * the third bit of the EDX register is used to indicate if complex
+ * cache indexing is in use.
+ * According to AMD specification (Open Source Register Reference For AMD
+ * Family 17h processors Models 00h-2Fh 56255 Rev 3.03 - July, 2018), only
+ * the first two bits are in use. Since HYGON is based on AMD the
+ * assumption is that it supports the same.
+ *
+ * There is no consumer for the complex indexing information so this bit is
+ * not added to the declaration of what processor can provide in EDX
+ * register. The declaration thus only considers bits supported by all
+ * architectures.
+ */
+union _cpuid4_leaf_edx {
+	struct {
+		unsigned int		wbinvd_no_guarantee:1;
+		unsigned int		inclusive:1;
+	} split;
+	u32 full;
+};
+
 struct _cpuid4_info_regs {
 	union _cpuid4_leaf_eax eax;
 	union _cpuid4_leaf_ebx ebx;
 	union _cpuid4_leaf_ecx ecx;
+	union _cpuid4_leaf_edx edx;
 	unsigned int id;
 	unsigned long size;
 	struct amd_northbridge *nb;
@@ -595,21 +618,24 @@ cpuid4_cache_lookup_regs(int index, struct _cpuid4_info_regs *this_leaf)
 	union _cpuid4_leaf_eax	eax;
 	union _cpuid4_leaf_ebx	ebx;
 	union _cpuid4_leaf_ecx	ecx;
-	unsigned		edx;
+	union _cpuid4_leaf_edx	edx;
+
+	edx.full = 0;
 
 	if (boot_cpu_data.x86_vendor == X86_VENDOR_AMD) {
 		if (boot_cpu_has(X86_FEATURE_TOPOEXT))
 			cpuid_count(0x8000001d, index, &eax.full,
-				    &ebx.full, &ecx.full, &edx);
+				    &ebx.full, &ecx.full, &edx.full);
 		else
 			amd_cpuid4(index, &eax, &ebx, &ecx);
 		amd_init_l3_cache(this_leaf, index);
 	} else if (boot_cpu_data.x86_vendor == X86_VENDOR_HYGON) {
 		cpuid_count(0x8000001d, index, &eax.full,
-			    &ebx.full, &ecx.full, &edx);
+			    &ebx.full, &ecx.full, &edx.full);
 		amd_init_l3_cache(this_leaf, index);
 	} else {
-		cpuid_count(4, index, &eax.full, &ebx.full, &ecx.full, &edx);
+		cpuid_count(4, index, &eax.full, &ebx.full, &ecx.full,
+			    &edx.full);
 	}
 
 	if (eax.split.type == CTYPE_NULL)
@@ -618,6 +644,7 @@ cpuid4_cache_lookup_regs(int index, struct _cpuid4_info_regs *this_leaf)
 	this_leaf->eax = eax;
 	this_leaf->ebx = ebx;
 	this_leaf->ecx = ecx;
+	this_leaf->edx = edx;
 	this_leaf->size = (ecx.split.number_of_sets          + 1) *
 			  (ebx.split.coherency_line_size     + 1) *
 			  (ebx.split.physical_line_partition + 1) *
@@ -983,6 +1010,13 @@ static void ci_leaf_init(struct cacheinfo *this_leaf,
 	this_leaf->number_of_sets = base->ecx.split.number_of_sets + 1;
 	this_leaf->physical_line_partition =
 				base->ebx.split.physical_line_partition + 1;
+	if ((boot_cpu_data.x86_vendor == X86_VENDOR_AMD &&
+	     boot_cpu_has(X86_FEATURE_TOPOEXT)) ||
+	    boot_cpu_data.x86_vendor == X86_VENDOR_HYGON ||
+	    boot_cpu_data.x86_vendor == X86_VENDOR_INTEL) {
+		this_leaf->attributes |= CACHE_INCLUSIVE_SET;
+		this_leaf->inclusive = base->edx.split.inclusive;
+	}
 	this_leaf->priv = base->nb;
 }
 
diff --git a/include/linux/cacheinfo.h b/include/linux/cacheinfo.h
index 70e19bc6cc9f..2550b5ce7fea 100644
--- a/include/linux/cacheinfo.h
+++ b/include/linux/cacheinfo.h
@@ -31,6 +31,8 @@ enum cache_type {
  * @physical_line_partition: number of physical cache lines sharing the
  *	same cachetag
  * @size: Total size of the cache
+ * @inclusive: Cache is inclusive of lower level caches. Only valid if
+ *	CACHE_INCLUSIVE_SET attribute is set.
  * @shared_cpu_map: logical cpumask representing all the cpus sharing
  *	this cache node
  * @attributes: bitfield representing various cache attributes
@@ -53,6 +55,7 @@ struct cacheinfo {
 	unsigned int ways_of_associativity;
 	unsigned int physical_line_partition;
 	unsigned int size;
+	unsigned int inclusive;
 	cpumask_t shared_cpu_map;
 	unsigned int attributes;
 #define CACHE_WRITE_THROUGH	BIT(0)
@@ -64,6 +67,7 @@ struct cacheinfo {
 #define CACHE_ALLOCATE_POLICY_MASK	\
 	(CACHE_READ_ALLOCATE | CACHE_WRITE_ALLOCATE)
 #define CACHE_ID		BIT(4)
+#define CACHE_INCLUSIVE_SET	BIT(5)
 	void *fw_token;
 	bool disable_sysfs;
 	void *priv;
-- 
2.17.2



* [PATCH 02/10] x86/resctrl: Remove unnecessary size compute
  2019-06-26 17:48 [PATCH 00/10] x86/CPU and x86/resctrl: Support pseudo-lock regions spanning L2 and L3 cache Reinette Chatre
  2019-06-26 17:48 ` [PATCH 01/10] x86/CPU: Expose if cache is inclusive of lower level caches Reinette Chatre
@ 2019-06-26 17:48 ` Reinette Chatre
  2019-06-26 17:48 ` [PATCH 03/10] x86/resctrl: Constrain C-states during pseudo-lock region init Reinette Chatre
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 13+ messages in thread
From: Reinette Chatre @ 2019-06-26 17:48 UTC (permalink / raw)
  To: tglx, fenghua.yu, bp, tony.luck
  Cc: mingo, hpa, x86, linux-kernel, Reinette Chatre

Information about a cache pseudo-locked region is maintained in its
struct pseudo_lock_region. One of these properties is the size of the
region, which is computed before the region is created and does not
change over the pseudo-locked region's lifetime.

When displaying the size of the pseudo-locked region to the user it is
thus not necessary to compute the size again from other properties; it
can simply be printed directly from its struct.

The code being changed is entered only when the resource group is
pseudo-locked. At that point the pseudo-locked region has already been
created and the size property is thus valid.

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
 arch/x86/kernel/cpu/resctrl/rdtgroup.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index bf3034994754..721fd7b0b0dc 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -1308,10 +1308,8 @@ static int rdtgroup_size_show(struct kernfs_open_file *of,
 		} else {
 			seq_printf(s, "%*s:", max_name_width,
 				   rdtgrp->plr->r->name);
-			size = rdtgroup_cbm_to_size(rdtgrp->plr->r,
-						    rdtgrp->plr->d,
-						    rdtgrp->plr->cbm);
-			seq_printf(s, "%d=%u\n", rdtgrp->plr->d->id, size);
+			seq_printf(s, "%d=%u\n", rdtgrp->plr->d->id,
+				   rdtgrp->plr->size);
 		}
 		goto out;
 	}
-- 
2.17.2



* [PATCH 03/10] x86/resctrl: Constrain C-states during pseudo-lock region init
  2019-06-26 17:48 [PATCH 00/10] x86/CPU and x86/resctrl: Support pseudo-lock regions spanning L2 and L3 cache Reinette Chatre
  2019-06-26 17:48 ` [PATCH 01/10] x86/CPU: Expose if cache is inclusive of lower level caches Reinette Chatre
  2019-06-26 17:48 ` [PATCH 02/10] x86/resctrl: Remove unnecessary size compute Reinette Chatre
@ 2019-06-26 17:48 ` Reinette Chatre
  2019-06-26 17:48 ` [PATCH 04/10] x86/resctrl: Set cache line size using new utility Reinette Chatre
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 13+ messages in thread
From: Reinette Chatre @ 2019-06-26 17:48 UTC (permalink / raw)
  To: tglx, fenghua.yu, bp, tony.luck
  Cc: mingo, hpa, x86, linux-kernel, Reinette Chatre

CPUs associated with a pseudo-locked cache region are prevented
from entering C6 and deeper C-states to ensure that the
power savings associated with those C-states cannot impact
the pseudo-locked region by forcing the pseudo-locked memory to
be evicted.

When supporting pseudo-locked regions that span the L2 and L3 cache
levels it is not necessary to prevent all CPUs associated with both
cache levels from entering deeper C-states. Instead, only the
CPUs associated with the L2 cache need to be limited. This potentially
results in more power savings since CPUs associated with another L2
cache that does not contain a pseudo-locked region but shares the L3
cache can still enter the deeper C-states.

In preparation for limiting the C-states only where required the code to
do so is moved to earlier in the pseudo-lock region initialization.
Moving this code has the consequence that its actions need to be undone
in more error paths. This is accommodated by moving the C-state cleanup
code to the generic cleanup code (pseudo_lock_region_clear()) and
ensuring that the C-state cleanup code can handle the case when C-states
have not yet been constrained.

Also in preparation for limiting the C-states only on the required CPUs,
the function now accepts a parameter that specifies which CPUs should
have their C-states constrained; at this time the parameter is still set
to all CPUs associated with the pseudo-locked region.
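
For context, the constraint is implemented with per-CPU PM QoS resume
latency requests. A condensed sketch of the per-CPU loop in
pseudo_lock_cstates_constrain() (error handling and user messages
trimmed) looks roughly like:

  	for_each_cpu(cpu, cpu_mask) {
  		pm_req = kzalloc(sizeof(*pm_req), GFP_KERNEL);
  		if (!pm_req)
  			return -ENOMEM;
  		/*
  		 * A low resume latency limit keeps the CPU out of C6 and
  		 * deeper C-states so pseudo-locked lines are not evicted.
  		 */
  		if (dev_pm_qos_add_request(get_cpu_device(cpu), &pm_req->req,
  					   DEV_PM_QOS_RESUME_LATENCY, 30) < 0) {
  			kfree(pm_req);
  			return -ENOMEM;
  		}
  		list_add(&pm_req->list, &plr->pm_reqs);
  	}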

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
 arch/x86/kernel/cpu/resctrl/pseudo_lock.c | 30 ++++++++++++-----------
 1 file changed, 16 insertions(+), 14 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
index 604c0e3bcc83..519057b6741f 100644
--- a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
+++ b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
@@ -175,6 +175,9 @@ static void pseudo_lock_cstates_relax(struct pseudo_lock_region *plr)
 {
 	struct pseudo_lock_pm_req *pm_req, *next;
 
+	if (list_empty(&plr->pm_reqs))
+		return;
+
 	list_for_each_entry_safe(pm_req, next, &plr->pm_reqs, list) {
 		dev_pm_qos_remove_request(&pm_req->req);
 		list_del(&pm_req->list);
@@ -184,6 +187,8 @@ static void pseudo_lock_cstates_relax(struct pseudo_lock_region *plr)
 
 /**
  * pseudo_lock_cstates_constrain - Restrict cores from entering C6
+ * @plr: pseudo-lock region requiring the C-states to be restricted
+ * @cpu_mask: the CPUs that should have their C-states restricted
  *
  * To prevent the cache from being affected by power management entering
  * C6 has to be avoided. This is accomplished by requesting a latency
@@ -197,13 +202,14 @@ static void pseudo_lock_cstates_relax(struct pseudo_lock_region *plr)
  * may be set to map to deeper sleep states. In this case the latency
  * requirement needs to prevent entering C2 also.
  */
-static int pseudo_lock_cstates_constrain(struct pseudo_lock_region *plr)
+static int pseudo_lock_cstates_constrain(struct pseudo_lock_region *plr,
+					 struct cpumask *cpu_mask)
 {
 	struct pseudo_lock_pm_req *pm_req;
 	int cpu;
 	int ret;
 
-	for_each_cpu(cpu, &plr->d->cpu_mask) {
+	for_each_cpu(cpu, cpu_mask) {
 		pm_req = kzalloc(sizeof(*pm_req), GFP_KERNEL);
 		if (!pm_req) {
 			rdt_last_cmd_puts("Failure to allocate memory for PM QoS\n");
@@ -251,6 +257,7 @@ static void pseudo_lock_region_clear(struct pseudo_lock_region *plr)
 		plr->d->plr = NULL;
 	plr->d = NULL;
 	plr->cbm = 0;
+	pseudo_lock_cstates_relax(plr);
 	plr->debugfs_dir = NULL;
 }
 
@@ -292,6 +299,10 @@ static int pseudo_lock_region_init(struct pseudo_lock_region *plr)
 
 	plr->size = rdtgroup_cbm_to_size(plr->r, plr->d, plr->cbm);
 
+	ret = pseudo_lock_cstates_constrain(plr, &plr->d->cpu_mask);
+	if (ret < 0)
+		goto out_region;
+
 	for (i = 0; i < ci->num_leaves; i++) {
 		if (ci->info_list[i].level == plr->r->cache_level) {
 			plr->line_size = ci->info_list[i].coherency_line_size;
@@ -1284,12 +1295,6 @@ int rdtgroup_pseudo_lock_create(struct rdtgroup *rdtgrp)
 	if (ret < 0)
 		return ret;
 
-	ret = pseudo_lock_cstates_constrain(plr);
-	if (ret < 0) {
-		ret = -EINVAL;
-		goto out_region;
-	}
-
 	plr->thread_done = 0;
 
 	thread = kthread_create_on_node(pseudo_lock_fn, rdtgrp,
@@ -1298,7 +1303,7 @@ int rdtgroup_pseudo_lock_create(struct rdtgroup *rdtgrp)
 	if (IS_ERR(thread)) {
 		ret = PTR_ERR(thread);
 		rdt_last_cmd_printf("Locking thread returned error %d\n", ret);
-		goto out_cstates;
+		goto out_region;
 	}
 
 	kthread_bind(thread, plr->cpu);
@@ -1316,13 +1321,13 @@ int rdtgroup_pseudo_lock_create(struct rdtgroup *rdtgrp)
 		 * empty pseudo-locking loop.
 		 */
 		rdt_last_cmd_puts("Locking thread interrupted\n");
-		goto out_cstates;
+		goto out_region;
 	}
 
 	ret = pseudo_lock_minor_get(&new_minor);
 	if (ret < 0) {
 		rdt_last_cmd_puts("Unable to obtain a new minor number\n");
-		goto out_cstates;
+		goto out_region;
 	}
 
 	/*
@@ -1379,8 +1384,6 @@ int rdtgroup_pseudo_lock_create(struct rdtgroup *rdtgrp)
 out_debugfs:
 	debugfs_remove_recursive(plr->debugfs_dir);
 	pseudo_lock_minor_release(new_minor);
-out_cstates:
-	pseudo_lock_cstates_relax(plr);
 out_region:
 	pseudo_lock_region_clear(plr);
 out:
@@ -1414,7 +1417,6 @@ void rdtgroup_pseudo_lock_remove(struct rdtgroup *rdtgrp)
 		goto free;
 	}
 
-	pseudo_lock_cstates_relax(plr);
 	debugfs_remove_recursive(rdtgrp->plr->debugfs_dir);
 	device_destroy(pseudo_lock_class, MKDEV(pseudo_lock_major, plr->minor));
 	pseudo_lock_minor_release(plr->minor);
-- 
2.17.2



* [PATCH 04/10] x86/resctrl: Set cache line size using new utility
  2019-06-26 17:48 [PATCH 00/10] x86/CPU and x86/resctrl: Support pseudo-lock regions spanning L2 and L3 cache Reinette Chatre
                   ` (2 preceding siblings ...)
  2019-06-26 17:48 ` [PATCH 03/10] x86/resctrl: Constrain C-states during pseudo-lock region init Reinette Chatre
@ 2019-06-26 17:48 ` Reinette Chatre
  2019-06-26 17:48 ` [PATCH 05/10] x86/resctrl: Associate pseudo-locked region's cache instance by id Reinette Chatre
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 13+ messages in thread
From: Reinette Chatre @ 2019-06-26 17:48 UTC (permalink / raw)
  To: tglx, fenghua.yu, bp, tony.luck
  Cc: mingo, hpa, x86, linux-kernel, Reinette Chatre

In preparation for support of pseudo-locked regions spanning two
cache levels the cache line size computation is moved into a new
utility, get_cache_line_size().

Setting of the cache line size is also moved a few lines earlier, before
the C-states are constrained, to reduce the amount of cleanup needed
on failure.

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
 arch/x86/kernel/cpu/resctrl/pseudo_lock.c | 42 +++++++++++++++++------
 1 file changed, 31 insertions(+), 11 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
index 519057b6741f..3d73b08871cc 100644
--- a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
+++ b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
@@ -101,6 +101,30 @@ static u64 get_prefetch_disable_bits(void)
 	return 0;
 }
 
+/**
+ * get_cache_line_size - Determine the cache coherency line size
+ * @cpu: CPU with which cache is associated
+ * @level: Cache level
+ *
+ * Context: @cpu has to be online.
+ * Return: The cache coherency line size for cache level @level associated
+ * with CPU @cpu. Zero on failure.
+ */
+static unsigned int get_cache_line_size(unsigned int cpu, int level)
+{
+	struct cpu_cacheinfo *ci;
+	int i;
+
+	ci = get_cpu_cacheinfo(cpu);
+
+	for (i = 0; i < ci->num_leaves; i++) {
+		if (ci->info_list[i].level == level)
+			return ci->info_list[i].coherency_line_size;
+	}
+
+	return 0;
+}
+
 /**
  * pseudo_lock_minor_get - Obtain available minor number
  * @minor: Pointer to where new minor number will be stored
@@ -281,9 +305,7 @@ static void pseudo_lock_region_clear(struct pseudo_lock_region *plr)
  */
 static int pseudo_lock_region_init(struct pseudo_lock_region *plr)
 {
-	struct cpu_cacheinfo *ci;
 	int ret;
-	int i;
 
 	/* Pick the first cpu we find that is associated with the cache. */
 	plr->cpu = cpumask_first(&plr->d->cpu_mask);
@@ -295,7 +317,12 @@ static int pseudo_lock_region_init(struct pseudo_lock_region *plr)
 		goto out_region;
 	}
 
-	ci = get_cpu_cacheinfo(plr->cpu);
+	plr->line_size = get_cache_line_size(plr->cpu, plr->r->cache_level);
+	if (plr->line_size == 0) {
+		rdt_last_cmd_puts("Unable to determine cache line size\n");
+		ret = -1;
+		goto out_region;
+	}
 
 	plr->size = rdtgroup_cbm_to_size(plr->r, plr->d, plr->cbm);
 
@@ -303,15 +330,8 @@ static int pseudo_lock_region_init(struct pseudo_lock_region *plr)
 	if (ret < 0)
 		goto out_region;
 
-	for (i = 0; i < ci->num_leaves; i++) {
-		if (ci->info_list[i].level == plr->r->cache_level) {
-			plr->line_size = ci->info_list[i].coherency_line_size;
-			return 0;
-		}
-	}
+	return 0;
 
-	ret = -1;
-	rdt_last_cmd_puts("Unable to determine cache line size\n");
 out_region:
 	pseudo_lock_region_clear(plr);
 	return ret;
-- 
2.17.2



* [PATCH 05/10] x86/resctrl: Associate pseudo-locked region's cache instance by id
  2019-06-26 17:48 [PATCH 00/10] x86/CPU and x86/resctrl: Support pseudo-lock regions spanning L2 and L3 cache Reinette Chatre
                   ` (3 preceding siblings ...)
  2019-06-26 17:48 ` [PATCH 04/10] x86/resctrl: Set cache line size using new utility Reinette Chatre
@ 2019-06-26 17:48 ` Reinette Chatre
  2019-06-26 17:48 ` [PATCH 06/10] x86/resctrl: Introduce utility to return pseudo-locked cache portion Reinette Chatre
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 13+ messages in thread
From: Reinette Chatre @ 2019-06-26 17:48 UTC (permalink / raw)
  To: tglx, fenghua.yu, bp, tony.luck
  Cc: mingo, hpa, x86, linux-kernel, Reinette Chatre

The properties of a cache pseudo-locked region that are maintained in
its struct pseudo_lock_region include a pointer to the cache domain to
which it belongs. A cache domain is a structure associated with a cache
instance; when all CPUs associated with that cache instance go offline
the cache domain is removed. When a cache domain is removed care is
taken to no longer point to it from the pseudo-locked region and to
ensure that all possible references to the removed information are
safe, often resulting in an error message to the user.

Replace the cache domain pointer in the properties of the cache
pseudo-locked region with the actual cache ID. This eliminates the
special care that needs to be taken when using this data and makes it
possible to keep displaying cache pseudo-locked region information to
the user even after the cache domain structure has been removed.
Associating the cache ID with the pseudo-locked region will also
simplify restoring the pseudo-locked region when the cache domain is
re-created as the CPUs come back online.
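
Every user of the pseudo-locked region now looks the domain up on demand
by its cache ID, following the pattern used throughout this patch:

  	struct rdt_domain *d;

  	d = rdt_find_domain(plr->r, plr->d_id, NULL);
  	if (IS_ERR_OR_NULL(d)) {
  		/* Cache instance (and all of its CPUs) is offline. */
  		return -ENODEV;
  	}
  	/* d can now be used for its CPU mask, size computations, etc. */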

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
 arch/x86/kernel/cpu/resctrl/core.c        |  7 -----
 arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 14 ++-------
 arch/x86/kernel/cpu/resctrl/internal.h    |  4 +--
 arch/x86/kernel/cpu/resctrl/pseudo_lock.c | 38 +++++++++++++++++------
 arch/x86/kernel/cpu/resctrl/rdtgroup.c    | 21 +++++--------
 5 files changed, 40 insertions(+), 44 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index 03eb90d00af0..e043074a88cc 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -633,13 +633,6 @@ static void domain_remove_cpu(int cpu, struct rdt_resource *r)
 			cancel_delayed_work(&d->cqm_limbo);
 		}
 
-		/*
-		 * rdt_domain "d" is going to be freed below, so clear
-		 * its pointer from pseudo_lock_region struct.
-		 */
-		if (d->plr)
-			d->plr->d = NULL;
-
 		kfree(d->ctrl_val);
 		kfree(d->mbps_val);
 		bitmap_free(d->rmid_busy_llc);
diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
index efbd54cc4e69..072f584cb238 100644
--- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
+++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
@@ -291,7 +291,7 @@ static int parse_line(char *line, struct rdt_resource *r,
 				 * region and return.
 				 */
 				rdtgrp->plr->r = r;
-				rdtgrp->plr->d = d;
+				rdtgrp->plr->d_id = d->id;
 				rdtgrp->plr->cbm = d->new_ctrl;
 				d->plr = rdtgrp->plr;
 				return 0;
@@ -471,16 +471,8 @@ int rdtgroup_schemata_show(struct kernfs_open_file *of,
 			for_each_alloc_enabled_rdt_resource(r)
 				seq_printf(s, "%s:uninitialized\n", r->name);
 		} else if (rdtgrp->mode == RDT_MODE_PSEUDO_LOCKED) {
-			if (!rdtgrp->plr->d) {
-				rdt_last_cmd_clear();
-				rdt_last_cmd_puts("Cache domain offline\n");
-				ret = -ENODEV;
-			} else {
-				seq_printf(s, "%s:%d=%x\n",
-					   rdtgrp->plr->r->name,
-					   rdtgrp->plr->d->id,
-					   rdtgrp->plr->cbm);
-			}
+			seq_printf(s, "%s:%d=%x\n", rdtgrp->plr->r->name,
+				   rdtgrp->plr->d_id, rdtgrp->plr->cbm);
 		} else {
 			closid = rdtgrp->closid;
 			for_each_alloc_enabled_rdt_resource(r) {
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index e49b77283924..65f558a2e806 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -149,7 +149,7 @@ struct mongroup {
  * struct pseudo_lock_region - pseudo-lock region information
  * @r:			RDT resource to which this pseudo-locked region
  *			belongs
- * @d:			RDT domain to which this pseudo-locked region
+ * @d_id:		ID of cache instance to which this pseudo-locked region
  *			belongs
  * @cbm:		bitmask of the pseudo-locked region
  * @lock_thread_wq:	waitqueue used to wait on the pseudo-locking thread
@@ -169,7 +169,7 @@ struct mongroup {
  */
 struct pseudo_lock_region {
 	struct rdt_resource	*r;
-	struct rdt_domain	*d;
+	int			d_id;
 	u32			cbm;
 	wait_queue_head_t	lock_thread_wq;
 	int			thread_done;
diff --git a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
index 3d73b08871cc..3ad0c5b59d34 100644
--- a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
+++ b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
@@ -272,14 +272,19 @@ static int pseudo_lock_cstates_constrain(struct pseudo_lock_region *plr,
  */
 static void pseudo_lock_region_clear(struct pseudo_lock_region *plr)
 {
+	struct rdt_domain *d;
+
 	plr->size = 0;
 	plr->line_size = 0;
 	kfree(plr->kmem);
 	plr->kmem = NULL;
+	if (plr->r && plr->d_id >= 0) {
+		d = rdt_find_domain(plr->r, plr->d_id, NULL);
+		if (!IS_ERR_OR_NULL(d))
+			d->plr = NULL;
+	}
 	plr->r = NULL;
-	if (plr->d)
-		plr->d->plr = NULL;
-	plr->d = NULL;
+	plr->d_id = -1;
 	plr->cbm = 0;
 	pseudo_lock_cstates_relax(plr);
 	plr->debugfs_dir = NULL;
@@ -305,10 +310,18 @@ static void pseudo_lock_region_clear(struct pseudo_lock_region *plr)
  */
 static int pseudo_lock_region_init(struct pseudo_lock_region *plr)
 {
+	struct rdt_domain *d;
 	int ret;
 
 	/* Pick the first cpu we find that is associated with the cache. */
-	plr->cpu = cpumask_first(&plr->d->cpu_mask);
+	d = rdt_find_domain(plr->r, plr->d_id, NULL);
+	if (IS_ERR_OR_NULL(d)) {
+		rdt_last_cmd_puts("Cache domain offline\n");
+		ret = -ENODEV;
+		goto out_region;
+	}
+
+	plr->cpu = cpumask_first(&d->cpu_mask);
 
 	if (!cpu_online(plr->cpu)) {
 		rdt_last_cmd_printf("CPU %u associated with cache not online\n",
@@ -324,9 +337,9 @@ static int pseudo_lock_region_init(struct pseudo_lock_region *plr)
 		goto out_region;
 	}
 
-	plr->size = rdtgroup_cbm_to_size(plr->r, plr->d, plr->cbm);
+	plr->size = rdtgroup_cbm_to_size(plr->r, d, plr->cbm);
 
-	ret = pseudo_lock_cstates_constrain(plr, &plr->d->cpu_mask);
+	ret = pseudo_lock_cstates_constrain(plr, &d->cpu_mask);
 	if (ret < 0)
 		goto out_region;
 
@@ -358,6 +371,7 @@ static int pseudo_lock_init(struct rdtgroup *rdtgrp)
 
 	init_waitqueue_head(&plr->lock_thread_wq);
 	INIT_LIST_HEAD(&plr->pm_reqs);
+	plr->d_id = -1;
 	rdtgrp->plr = plr;
 	return 0;
 }
@@ -1187,6 +1201,7 @@ static int pseudo_lock_measure_cycles(struct rdtgroup *rdtgrp, int sel)
 {
 	struct pseudo_lock_region *plr = rdtgrp->plr;
 	struct task_struct *thread;
+	struct rdt_domain *d;
 	unsigned int cpu;
 	int ret = -1;
 
@@ -1198,13 +1213,14 @@ static int pseudo_lock_measure_cycles(struct rdtgroup *rdtgrp, int sel)
 		goto out;
 	}
 
-	if (!plr->d) {
+	d = rdt_find_domain(plr->r, plr->d_id, NULL);
+	if (IS_ERR_OR_NULL(d)) {
 		ret = -ENODEV;
 		goto out;
 	}
 
 	plr->thread_done = 0;
-	cpu = cpumask_first(&plr->d->cpu_mask);
+	cpu = cpumask_first(&d->cpu_mask);
 	if (!cpu_online(cpu)) {
 		ret = -ENODEV;
 		goto out;
@@ -1501,6 +1517,7 @@ static int pseudo_lock_dev_mmap(struct file *filp, struct vm_area_struct *vma)
 	struct pseudo_lock_region *plr;
 	struct rdtgroup *rdtgrp;
 	unsigned long physical;
+	struct rdt_domain *d;
 	unsigned long psize;
 
 	mutex_lock(&rdtgroup_mutex);
@@ -1514,7 +1531,8 @@ static int pseudo_lock_dev_mmap(struct file *filp, struct vm_area_struct *vma)
 
 	plr = rdtgrp->plr;
 
-	if (!plr->d) {
+	d = rdt_find_domain(plr->r, plr->d_id, NULL);
+	if (IS_ERR_OR_NULL(d)) {
 		mutex_unlock(&rdtgroup_mutex);
 		return -ENODEV;
 	}
@@ -1525,7 +1543,7 @@ static int pseudo_lock_dev_mmap(struct file *filp, struct vm_area_struct *vma)
 	 * may be scheduled elsewhere and invalidate entries in the
 	 * pseudo-locked region.
 	 */
-	if (!cpumask_subset(&current->cpus_allowed, &plr->d->cpu_mask)) {
+	if (!cpumask_subset(&current->cpus_allowed, &d->cpu_mask)) {
 		mutex_unlock(&rdtgroup_mutex);
 		return -EINVAL;
 	}
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 721fd7b0b0dc..8e6bebd62646 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -262,22 +262,23 @@ static int rdtgroup_cpus_show(struct kernfs_open_file *of,
 			      struct seq_file *s, void *v)
 {
 	struct rdtgroup *rdtgrp;
-	struct cpumask *mask;
+	struct rdt_domain *d;
 	int ret = 0;
 
 	rdtgrp = rdtgroup_kn_lock_live(of->kn);
 
 	if (rdtgrp) {
 		if (rdtgrp->mode == RDT_MODE_PSEUDO_LOCKED) {
-			if (!rdtgrp->plr->d) {
+			d = rdt_find_domain(rdtgrp->plr->r, rdtgrp->plr->d_id,
+					    NULL);
+			if (IS_ERR_OR_NULL(d)) {
 				rdt_last_cmd_clear();
 				rdt_last_cmd_puts("Cache domain offline\n");
 				ret = -ENODEV;
 			} else {
-				mask = &rdtgrp->plr->d->cpu_mask;
 				seq_printf(s, is_cpu_list(of) ?
 					   "%*pbl\n" : "%*pb\n",
-					   cpumask_pr_args(mask));
+					   cpumask_pr_args(&d->cpu_mask));
 			}
 		} else {
 			seq_printf(s, is_cpu_list(of) ? "%*pbl\n" : "%*pb\n",
@@ -1301,16 +1302,8 @@ static int rdtgroup_size_show(struct kernfs_open_file *of,
 	}
 
 	if (rdtgrp->mode == RDT_MODE_PSEUDO_LOCKED) {
-		if (!rdtgrp->plr->d) {
-			rdt_last_cmd_clear();
-			rdt_last_cmd_puts("Cache domain offline\n");
-			ret = -ENODEV;
-		} else {
-			seq_printf(s, "%*s:", max_name_width,
-				   rdtgrp->plr->r->name);
-			seq_printf(s, "%d=%u\n", rdtgrp->plr->d->id,
-				   rdtgrp->plr->size);
-		}
+		seq_printf(s, "%*s:", max_name_width, rdtgrp->plr->r->name);
+		seq_printf(s, "%d=%u\n", rdtgrp->plr->d_id, rdtgrp->plr->size);
 		goto out;
 	}
 
-- 
2.17.2



* [PATCH 06/10] x86/resctrl: Introduce utility to return pseudo-locked cache portion
  2019-06-26 17:48 [PATCH 00/10] x86/CPU and x86/resctrl: Support pseudo-lock regions spanning L2 and L3 cache Reinette Chatre
                   ` (4 preceding siblings ...)
  2019-06-26 17:48 ` [PATCH 05/10] x86/resctrl: Associate pseudo-locked region's cache instance by id Reinette Chatre
@ 2019-06-26 17:48 ` Reinette Chatre
  2019-06-26 17:48 ` [PATCH 07/10] x86/resctrl: Remove unnecessary pointer to pseudo-locked region Reinette Chatre
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 13+ messages in thread
From: Reinette Chatre @ 2019-06-26 17:48 UTC (permalink / raw)
  To: tglx, fenghua.yu, bp, tony.luck
  Cc: mingo, hpa, x86, linux-kernel, Reinette Chatre

To prevent eviction of pseudo-locked memory it is required that no
other resource group uses any portion of a cache that is in use by
a cache pseudo-locked region.

Introduce a utility that returns a Capacity BitMask (CBM) indicating
all portions of a provided cache instance that are in use for cache
pseudo-locking. This CBM can be used for overlap checking as well as
for cache usage reporting.
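
A condensed example of the intended overlap check (a later patch in this
series reworks rdtgroup_cbm_overlaps_pseudo_locked() along these lines):

  	unsigned long pseudo_locked = rdtgroup_pseudo_locked_bits(r, d);

  	if (bitmap_intersects(&cbm, &pseudo_locked, r->cache.cbm_len))
  		return true;	/* @cbm overlaps a pseudo-locked portion on @d */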

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
 arch/x86/kernel/cpu/resctrl/internal.h    |  1 +
 arch/x86/kernel/cpu/resctrl/pseudo_lock.c | 23 +++++++++++++++++++++++
 2 files changed, 24 insertions(+)

diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 65f558a2e806..f17633cf4776 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -568,6 +568,7 @@ int rdtgroup_tasks_assigned(struct rdtgroup *r);
 int rdtgroup_locksetup_enter(struct rdtgroup *rdtgrp);
 int rdtgroup_locksetup_exit(struct rdtgroup *rdtgrp);
 bool rdtgroup_cbm_overlaps_pseudo_locked(struct rdt_domain *d, unsigned long cbm);
+u32 rdtgroup_pseudo_locked_bits(struct rdt_resource *r, struct rdt_domain *d);
 bool rdtgroup_pseudo_locked_in_hierarchy(struct rdt_domain *d);
 int rdt_pseudo_lock_init(void);
 void rdt_pseudo_lock_release(void);
diff --git a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
index 3ad0c5b59d34..9a4dbdb72d3e 100644
--- a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
+++ b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
@@ -1630,3 +1630,26 @@ void rdt_pseudo_lock_release(void)
 	unregister_chrdev(pseudo_lock_major, "pseudo_lock");
 	pseudo_lock_major = 0;
 }
+
+/**
+ * rdt_pseudo_locked_bits - Portions of cache instance used for pseudo-locking
+ * @r:		RDT resource to which cache instance belongs
+ * @d:		Cache instance
+ *
+ * Return: bits in CBM of @d that are used for cache pseudo-locking
+ */
+u32 rdtgroup_pseudo_locked_bits(struct rdt_resource *r, struct rdt_domain *d)
+{
+	struct rdtgroup *rdtgrp;
+	u32 pseudo_locked = 0;
+
+	list_for_each_entry(rdtgrp, &rdt_all_groups, rdtgroup_list) {
+		if (!rdtgrp->plr)
+			continue;
+		if (rdtgrp->plr->r && rdtgrp->plr->r->rid == r->rid &&
+		    rdtgrp->plr->d_id == d->id)
+			pseudo_locked |= rdtgrp->plr->cbm;
+	}
+
+	return pseudo_locked;
+}
-- 
2.17.2



* [PATCH 07/10] x86/resctrl: Remove unnecessary pointer to pseudo-locked region
  2019-06-26 17:48 [PATCH 00/10] x86/CPU and x86/resctrl: Support pseudo-lock regions spanning L2 and L3 cache Reinette Chatre
                   ` (5 preceding siblings ...)
  2019-06-26 17:48 ` [PATCH 06/10] x86/resctrl: Introduce utility to return pseudo-locked cache portion Reinette Chatre
@ 2019-06-26 17:48 ` Reinette Chatre
  2019-06-26 17:48 ` [PATCH 08/10] x86/resctrl: Support pseudo-lock regions spanning resources Reinette Chatre
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 13+ messages in thread
From: Reinette Chatre @ 2019-06-26 17:48 UTC (permalink / raw)
  To: tglx, fenghua.yu, bp, tony.luck
  Cc: mingo, hpa, x86, linux-kernel, Reinette Chatre

Each cache domain (struct rdt_domain) contains a pointer to a
pseudo-locked region that (if set) is associated with it. At the
same time each resource group (struct rdtgroup) also contains a
pointer to a pseudo-locked region that (if set) is associated with
it.

If a pointer from a cache domain to its pseudo-locked region is
maintained then multiple cache domains could point to a single
pseudo-locked region when a pseudo-locked region spans multiple
resources. Such an arrangement would make it harder to support the
current mechanism of iterating over cache domains in order to find all
pseudo-locked regions.

In preparation for pseudo-locked regions that could span multiple
resources, remove the pointer from a cache domain to a pseudo-locked
region. The pointer from a resource group to its pseudo-locked region
remains: when all pseudo-locked regions on the system need to be
processed, an iteration over all resource groups is used instead of an
iteration over all cache domains.
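
The new iteration pattern, condensed from what this patch does in
rdtgroup_pseudo_locked_in_hierarchy(), is:

  	struct rdtgroup *rdtgrp;

  	list_for_each_entry(rdtgrp, &rdt_all_groups, rdtgroup_list) {
  		if (!rdtgrp->plr)
  			continue;
  		/* rdtgrp->plr describes one pseudo-locked region. */
  	}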

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
 arch/x86/kernel/cpu/resctrl/ctrlmondata.c |  3 +-
 arch/x86/kernel/cpu/resctrl/internal.h    |  6 +--
 arch/x86/kernel/cpu/resctrl/pseudo_lock.c | 46 +++++++++++------------
 arch/x86/kernel/cpu/resctrl/rdtgroup.c    |  8 ++--
 4 files changed, 32 insertions(+), 31 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
index 072f584cb238..a0383ff80afe 100644
--- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
+++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
@@ -217,7 +217,7 @@ int parse_cbm(struct rdt_parse_data *data, struct rdt_resource *r,
 
 	if ((rdtgrp->mode == RDT_MODE_EXCLUSIVE ||
 	     rdtgrp->mode == RDT_MODE_SHAREABLE) &&
-	    rdtgroup_cbm_overlaps_pseudo_locked(d, cbm_val)) {
+	    rdtgroup_cbm_overlaps_pseudo_locked(r, d, cbm_val)) {
 		rdt_last_cmd_puts("CBM overlaps with pseudo-locked region\n");
 		return -EINVAL;
 	}
@@ -293,7 +293,6 @@ static int parse_line(char *line, struct rdt_resource *r,
 				rdtgrp->plr->r = r;
 				rdtgrp->plr->d_id = d->id;
 				rdtgrp->plr->cbm = d->new_ctrl;
-				d->plr = rdtgrp->plr;
 				return 0;
 			}
 			goto next;
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index f17633cf4776..892f38899dda 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -309,7 +309,6 @@ struct mbm_state {
  * @mbps_val:	When mba_sc is enabled, this holds the bandwidth in MBps
  * @new_ctrl:	new ctrl value to be loaded
  * @have_new_ctrl: did user provide new_ctrl for this domain
- * @plr:	pseudo-locked region (if any) associated with domain
  */
 struct rdt_domain {
 	struct list_head		list;
@@ -326,7 +325,6 @@ struct rdt_domain {
 	u32				*mbps_val;
 	u32				new_ctrl;
 	bool				have_new_ctrl;
-	struct pseudo_lock_region	*plr;
 };
 
 /**
@@ -567,7 +565,9 @@ enum rdtgrp_mode rdtgroup_mode_by_closid(int closid);
 int rdtgroup_tasks_assigned(struct rdtgroup *r);
 int rdtgroup_locksetup_enter(struct rdtgroup *rdtgrp);
 int rdtgroup_locksetup_exit(struct rdtgroup *rdtgrp);
-bool rdtgroup_cbm_overlaps_pseudo_locked(struct rdt_domain *d, unsigned long cbm);
+bool rdtgroup_cbm_overlaps_pseudo_locked(struct rdt_resource *r,
+					 struct rdt_domain *d,
+					 unsigned long cbm);
 u32 rdtgroup_pseudo_locked_bits(struct rdt_resource *r, struct rdt_domain *d);
 bool rdtgroup_pseudo_locked_in_hierarchy(struct rdt_domain *d);
 int rdt_pseudo_lock_init(void);
diff --git a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
index 9a4dbdb72d3e..8f20af017f7b 100644
--- a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
+++ b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
@@ -272,17 +272,10 @@ static int pseudo_lock_cstates_constrain(struct pseudo_lock_region *plr,
  */
 static void pseudo_lock_region_clear(struct pseudo_lock_region *plr)
 {
-	struct rdt_domain *d;
-
 	plr->size = 0;
 	plr->line_size = 0;
 	kfree(plr->kmem);
 	plr->kmem = NULL;
-	if (plr->r && plr->d_id >= 0) {
-		d = rdt_find_domain(plr->r, plr->d_id, NULL);
-		if (!IS_ERR_OR_NULL(d))
-			d->plr = NULL;
-	}
 	plr->r = NULL;
 	plr->d_id = -1;
 	plr->cbm = 0;
@@ -826,6 +819,7 @@ int rdtgroup_locksetup_exit(struct rdtgroup *rdtgrp)
 
 /**
  * rdtgroup_cbm_overlaps_pseudo_locked - Test if CBM or portion is pseudo-locked
+ * @r: RDT resource to which @d belongs
  * @d: RDT domain
  * @cbm: CBM to test
  *
@@ -839,17 +833,17 @@ int rdtgroup_locksetup_exit(struct rdtgroup *rdtgrp)
  * Return: true if @cbm overlaps with pseudo-locked region on @d, false
  * otherwise.
  */
-bool rdtgroup_cbm_overlaps_pseudo_locked(struct rdt_domain *d, unsigned long cbm)
+bool rdtgroup_cbm_overlaps_pseudo_locked(struct rdt_resource *r,
+					 struct rdt_domain *d,
+					 unsigned long cbm)
 {
+	unsigned long pseudo_locked;
 	unsigned int cbm_len;
-	unsigned long cbm_b;
 
-	if (d->plr) {
-		cbm_len = d->plr->r->cache.cbm_len;
-		cbm_b = d->plr->cbm;
-		if (bitmap_intersects(&cbm, &cbm_b, cbm_len))
-			return true;
-	}
+	pseudo_locked = rdtgroup_pseudo_locked_bits(r, d);
+	cbm_len = r->cache.cbm_len;
+	if (bitmap_intersects(&cbm, &pseudo_locked, cbm_len))
+		return true;
 	return false;
 }
 
@@ -863,13 +857,13 @@ bool rdtgroup_cbm_overlaps_pseudo_locked(struct rdt_domain *d, unsigned long cbm
  * attempts to create new pseudo-locked regions in the same hierarchy.
  *
  * Return: true if a pseudo-locked region exists in the hierarchy of @d or
- *         if it is not possible to test due to memory allocation issue,
- *         false otherwise.
+ *         if it is not possible to test due to memory allocation or other
+ *         failure, false otherwise.
  */
 bool rdtgroup_pseudo_locked_in_hierarchy(struct rdt_domain *d)
 {
 	cpumask_var_t cpu_with_psl;
-	struct rdt_resource *r;
+	struct rdtgroup *rdtgrp;
 	struct rdt_domain *d_i;
 	bool ret = false;
 
@@ -880,11 +874,16 @@ bool rdtgroup_pseudo_locked_in_hierarchy(struct rdt_domain *d)
 	 * First determine which cpus have pseudo-locked regions
 	 * associated with them.
 	 */
-	for_each_alloc_enabled_rdt_resource(r) {
-		list_for_each_entry(d_i, &r->domains, list) {
-			if (d_i->plr)
-				cpumask_or(cpu_with_psl, cpu_with_psl,
-					   &d_i->cpu_mask);
+	list_for_each_entry(rdtgrp, &rdt_all_groups, rdtgroup_list) {
+		if (rdtgrp->plr && rdtgrp->plr->d_id >= 0) {
+			d_i = rdt_find_domain(rdtgrp->plr->r, rdtgrp->plr->d_id,
+					      NULL);
+			if (IS_ERR_OR_NULL(d_i)) {
+				ret = true;
+				goto out;
+			}
+			cpumask_or(cpu_with_psl, cpu_with_psl,
+				   &d_i->cpu_mask);
 		}
 	}
 
@@ -895,6 +894,7 @@ bool rdtgroup_pseudo_locked_in_hierarchy(struct rdt_domain *d)
 	if (cpumask_intersects(&d->cpu_mask, cpu_with_psl))
 		ret = true;
 
+out:
 	free_cpumask_var(cpu_with_psl);
 	return ret;
 }
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 8e6bebd62646..c9070cb4b6a5 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -845,8 +845,10 @@ static int rdt_bit_usage_show(struct kernfs_open_file *of,
 				break;
 			}
 		}
+
+		pseudo_locked = rdtgroup_pseudo_locked_bits(r, dom);
+
 		for (i = r->cache.cbm_len - 1; i >= 0; i--) {
-			pseudo_locked = dom->plr ? dom->plr->cbm : 0;
 			hwb = test_bit(i, &hw_shareable);
 			swb = test_bit(i, &sw_shareable);
 			excl = test_bit(i, &exclusive);
@@ -2542,8 +2544,8 @@ static int __init_one_rdt_domain(struct rdt_domain *d, struct rdt_resource *r,
 				d->new_ctrl |= *ctrl | peer_ctl;
 		}
 	}
-	if (d->plr && d->plr->cbm > 0)
-		used_b |= d->plr->cbm;
+
+	used_b |= rdtgroup_pseudo_locked_bits(r, d);
 	unused_b = used_b ^ (BIT_MASK(r->cache.cbm_len) - 1);
 	unused_b &= BIT_MASK(r->cache.cbm_len) - 1;
 	d->new_ctrl |= unused_b;
-- 
2.17.2



* [PATCH 08/10] x86/resctrl: Support pseudo-lock regions spanning resources
  2019-06-26 17:48 [PATCH 00/10] x86/CPU and x86/resctrl: Support pseudo-lock regions spanning L2 and L3 cache Reinette Chatre
                   ` (6 preceding siblings ...)
  2019-06-26 17:48 ` [PATCH 07/10] x86/resctrl: Remove unnecessary pointer to pseudo-locked region Reinette Chatre
@ 2019-06-26 17:48 ` Reinette Chatre
  2019-06-26 17:48 ` [PATCH 09/10] x86/resctrl: Pseudo-lock portions of multiple resources Reinette Chatre
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 13+ messages in thread
From: Reinette Chatre @ 2019-06-26 17:48 UTC (permalink / raw)
  To: tglx, fenghua.yu, bp, tony.luck
  Cc: mingo, hpa, x86, linux-kernel, Reinette Chatre

Currently cache pseudo-locked regions only consider one cache level, but
cache pseudo-locked regions may need to span multiple cache levels.

In preparation for support of pseudo-locked regions spanning multiple
cache levels pseudo-lock 'portions' are introduced. A 'portion' is the
part of a pseudo-locked region that belongs to a specific resource. Each
pseudo-locked portion is identified by the resource (for example, L2 or
L3 cache), the domain (the specific cache instance), and the capacity
bitmask that specifies which part of the cache is used by the
pseudo-locked region.

In support of pseudo-locked regions spanning multiple cache levels a
pseudo-locked region could have multiple 'portions', but with this
introduction only a single portion per region is allowed.
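
With the portion list in place, code that previously dereferenced
plr->r, plr->d_id and plr->cbm directly now walks the portions, for
example in rdtgroup_schemata_show() (condensed from this patch):

  	struct pseudo_lock_portion *p;

  	list_for_each_entry(p, &rdtgrp->plr->portions, list)
  		seq_printf(s, "%s:%d=%x\n", p->r->name, p->d_id, p->cbm);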

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
 arch/x86/kernel/cpu/resctrl/ctrlmondata.c |  26 +++-
 arch/x86/kernel/cpu/resctrl/internal.h    |  32 ++--
 arch/x86/kernel/cpu/resctrl/pseudo_lock.c | 180 ++++++++++++++++------
 arch/x86/kernel/cpu/resctrl/rdtgroup.c    |  44 ++++--
 4 files changed, 211 insertions(+), 71 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
index a0383ff80afe..a60fb38a4d20 100644
--- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
+++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
@@ -207,7 +207,7 @@ int parse_cbm(struct rdt_parse_data *data, struct rdt_resource *r,
 	 * hierarchy.
 	 */
 	if (rdtgrp->mode == RDT_MODE_PSEUDO_LOCKSETUP &&
-	    rdtgroup_pseudo_locked_in_hierarchy(d)) {
+	    rdtgroup_pseudo_locked_in_hierarchy(rdtgrp, d)) {
 		rdt_last_cmd_puts("Pseudo-locked region in hierarchy\n");
 		return -EINVAL;
 	}
@@ -282,6 +282,7 @@ static int parse_line(char *line, struct rdt_resource *r,
 			if (r->parse_ctrlval(&data, r, d))
 				return -EINVAL;
 			if (rdtgrp->mode ==  RDT_MODE_PSEUDO_LOCKSETUP) {
+				struct pseudo_lock_portion *p;
 				/*
 				 * In pseudo-locking setup mode and just
 				 * parsed a valid CBM that should be
@@ -290,9 +291,15 @@ static int parse_line(char *line, struct rdt_resource *r,
 				 * the required initialization for single
 				 * region and return.
 				 */
-				rdtgrp->plr->r = r;
-				rdtgrp->plr->d_id = d->id;
-				rdtgrp->plr->cbm = d->new_ctrl;
+				p = kzalloc(sizeof(*p), GFP_KERNEL);
+				if (!p) {
+					rdt_last_cmd_puts("Unable to allocate memory for pseudo-lock portion\n");
+					return -ENOMEM;
+				}
+				p->r = r;
+				p->d_id = d->id;
+				p->cbm = d->new_ctrl;
+				list_add(&p->list, &rdtgrp->plr->portions);
 				return 0;
 			}
 			goto next;
@@ -410,8 +417,11 @@ ssize_t rdtgroup_schemata_write(struct kernfs_open_file *of,
 			goto out;
 		}
 		ret = rdtgroup_parse_resource(resname, tok, rdtgrp);
-		if (ret)
+		if (ret) {
+			if (rdtgrp->mode == RDT_MODE_PSEUDO_LOCKSETUP)
+				pseudo_lock_region_clear(rdtgrp->plr);
 			goto out;
+		}
 	}
 
 	for_each_alloc_enabled_rdt_resource(r) {
@@ -459,6 +469,7 @@ static void show_doms(struct seq_file *s, struct rdt_resource *r, int closid)
 int rdtgroup_schemata_show(struct kernfs_open_file *of,
 			   struct seq_file *s, void *v)
 {
+	struct pseudo_lock_portion *p;
 	struct rdtgroup *rdtgrp;
 	struct rdt_resource *r;
 	int ret = 0;
@@ -470,8 +481,9 @@ int rdtgroup_schemata_show(struct kernfs_open_file *of,
 			for_each_alloc_enabled_rdt_resource(r)
 				seq_printf(s, "%s:uninitialized\n", r->name);
 		} else if (rdtgrp->mode == RDT_MODE_PSEUDO_LOCKED) {
-			seq_printf(s, "%s:%d=%x\n", rdtgrp->plr->r->name,
-				   rdtgrp->plr->d_id, rdtgrp->plr->cbm);
+			list_for_each_entry(p, &rdtgrp->plr->portions, list)
+				seq_printf(s, "%s:%d=%x\n", p->r->name, p->d_id,
+					   p->cbm);
 		} else {
 			closid = rdtgrp->closid;
 			for_each_alloc_enabled_rdt_resource(r) {
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 892f38899dda..b041029d4de1 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -145,13 +145,27 @@ struct mongroup {
 	u32			rmid;
 };
 
+/**
+ * struct pseudo_lock_portion - portion of a pseudo-lock region on one resource
+ * @r:		RDT resource to which this pseudo-locked portion
+ *		belongs
+ * @d_id:	ID of cache instance to which this pseudo-locked portion
+ *		belongs
+ * @cbm:	bitmask of the pseudo-locked portion
+ * @list:	Entry in the list of pseudo-locked portion
+ *		belonging to the pseudo-locked region
+ */
+struct pseudo_lock_portion {
+	struct rdt_resource	*r;
+	int			d_id;
+	u32			cbm;
+	struct list_head	list;
+};
+
 /**
  * struct pseudo_lock_region - pseudo-lock region information
- * @r:			RDT resource to which this pseudo-locked region
- *			belongs
- * @d_id:		ID of cache instance to which this pseudo-locked region
- *			belongs
- * @cbm:		bitmask of the pseudo-locked region
+ * @portions:		list of portions across different resources that
+ *			are associated with this pseudo-locked region
  * @lock_thread_wq:	waitqueue used to wait on the pseudo-locking thread
  *			completion
  * @thread_done:	variable used by waitqueue to test if pseudo-locking
@@ -168,9 +182,7 @@ struct mongroup {
  * @pm_reqs:		Power management QoS requests related to this region
  */
 struct pseudo_lock_region {
-	struct rdt_resource	*r;
-	int			d_id;
-	u32			cbm;
+	struct list_head	portions;
 	wait_queue_head_t	lock_thread_wq;
 	int			thread_done;
 	int			cpu;
@@ -569,11 +581,13 @@ bool rdtgroup_cbm_overlaps_pseudo_locked(struct rdt_resource *r,
 					 struct rdt_domain *d,
 					 unsigned long cbm);
 u32 rdtgroup_pseudo_locked_bits(struct rdt_resource *r, struct rdt_domain *d);
-bool rdtgroup_pseudo_locked_in_hierarchy(struct rdt_domain *d);
+bool rdtgroup_pseudo_locked_in_hierarchy(struct rdtgroup *selfgrp,
+					 struct rdt_domain *d);
 int rdt_pseudo_lock_init(void);
 void rdt_pseudo_lock_release(void);
 int rdtgroup_pseudo_lock_create(struct rdtgroup *rdtgrp);
 void rdtgroup_pseudo_lock_remove(struct rdtgroup *rdtgrp);
+void pseudo_lock_region_clear(struct pseudo_lock_region *plr);
 struct rdt_domain *get_domain_from_cpu(int cpu, struct rdt_resource *r);
 int update_domains(struct rdt_resource *r, int closid);
 int closids_supported(void);
diff --git a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
index 8f20af017f7b..a7fe53447a7e 100644
--- a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
+++ b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
@@ -270,28 +270,85 @@ static int pseudo_lock_cstates_constrain(struct pseudo_lock_region *plr,
  *
  * Return: void
  */
-static void pseudo_lock_region_clear(struct pseudo_lock_region *plr)
+void pseudo_lock_region_clear(struct pseudo_lock_region *plr)
 {
+	struct pseudo_lock_portion *p, *tmp;
+
 	plr->size = 0;
 	plr->line_size = 0;
 	kfree(plr->kmem);
 	plr->kmem = NULL;
-	plr->r = NULL;
-	plr->d_id = -1;
-	plr->cbm = 0;
 	pseudo_lock_cstates_relax(plr);
+	if (!list_empty(&plr->portions)) {
+		list_for_each_entry_safe(p, tmp, &plr->portions, list) {
+			list_del(&p->list);
+			kfree(p);
+		}
+	}
 	plr->debugfs_dir = NULL;
 }
 
+/**
+ * pseudo_lock_single_portion_valid - Verify properties of pseudo-lock region
+ * @plr: the main pseudo-lock region
+ * @p: the single portion that makes up the pseudo-locked region
+ *
+ * Verify and initialize properties of the pseudo-locked region.
+ *
+ * Return: -1 if portion of cache unable to be used for pseudo-locking
+ *         0 if portion of cache can be used for pseudo-locking, in
+ *         addition the CPU on which pseudo-locking will be performed will
+ *         be initialized as well as the size and cache line size of the region
+ */
+static int pseudo_lock_single_portion_valid(struct pseudo_lock_region *plr,
+					    struct pseudo_lock_portion *p)
+{
+	struct rdt_domain *d;
+
+	d = rdt_find_domain(p->r, p->d_id, NULL);
+	if (IS_ERR_OR_NULL(d)) {
+		rdt_last_cmd_puts("Cannot find cache domain\n");
+		return -1;
+	}
+
+	plr->cpu = cpumask_first(&d->cpu_mask);
+	if (!cpu_online(plr->cpu)) {
+		rdt_last_cmd_printf("CPU %u not online\n", plr->cpu);
+		goto err_cpu;
+	}
+
+	plr->line_size = get_cache_line_size(plr->cpu, p->r->cache_level);
+	if (plr->line_size == 0) {
+		rdt_last_cmd_puts("Unable to compute cache line length\n");
+		goto err_cpu;
+	}
+
+	if (pseudo_lock_cstates_constrain(plr, &d->cpu_mask)) {
+		rdt_last_cmd_puts("Cannot limit C-states\n");
+		goto err_line;
+	}
+
+	plr->size = rdtgroup_cbm_to_size(p->r, d, p->cbm);
+
+	return 0;
+
+err_line:
+	plr->line_size = 0;
+err_cpu:
+	plr->cpu = 0;
+	return -1;
+}
+
 /**
  * pseudo_lock_region_init - Initialize pseudo-lock region information
  * @plr: pseudo-lock region
  *
  * Called after user provided a schemata to be pseudo-locked. From the
  * schemata the &struct pseudo_lock_region is on entry already initialized
- * with the resource, domain, and capacity bitmask. Here the information
- * required for pseudo-locking is deduced from this data and &struct
- * pseudo_lock_region initialized further. This information includes:
+ * with the resource, domain, and capacity bitmask. Here the
+ * provided data is validated and information required for pseudo-locking
+ * deduced, and &struct pseudo_lock_region initialized further. This
+ * information includes:
  * - size in bytes of the region to be pseudo-locked
  * - cache line size to know the stride with which data needs to be accessed
  *   to be pseudo-locked
@@ -303,44 +360,50 @@ static void pseudo_lock_region_clear(struct pseudo_lock_region *plr)
  */
 static int pseudo_lock_region_init(struct pseudo_lock_region *plr)
 {
-	struct rdt_domain *d;
+	struct rdt_resource *l3_resource = &rdt_resources_all[RDT_RESOURCE_L3];
+	struct pseudo_lock_portion *p;
 	int ret;
 
-	/* Pick the first cpu we find that is associated with the cache. */
-	d = rdt_find_domain(plr->r, plr->d_id, NULL);
-	if (IS_ERR_OR_NULL(d)) {
-		rdt_last_cmd_puts("Cache domain offline\n");
-		ret = -ENODEV;
+	if (list_empty(&plr->portions)) {
+		rdt_last_cmd_puts("No pseudo-lock portions provided\n");
 		goto out_region;
 	}
 
-	plr->cpu = cpumask_first(&d->cpu_mask);
-
-	if (!cpu_online(plr->cpu)) {
-		rdt_last_cmd_printf("CPU %u associated with cache not online\n",
-				    plr->cpu);
-		ret = -ENODEV;
-		goto out_region;
+	/* Cache Pseudo-Locking only supported on L2 and L3 resources */
+	list_for_each_entry(p, &plr->portions, list) {
+		if (p->r->rid != RDT_RESOURCE_L2 &&
+		    p->r->rid != RDT_RESOURCE_L3) {
+			rdt_last_cmd_puts("Unsupported resource\n");
+			goto out_region;
+		}
 	}
 
-	plr->line_size = get_cache_line_size(plr->cpu, plr->r->cache_level);
-	if (plr->line_size == 0) {
-		rdt_last_cmd_puts("Unable to determine cache line size\n");
-		ret = -1;
-		goto out_region;
+	/*
+	 * If only one resource requested to be pseudo-locked then:
+	 * - Just a L3 cache portion is valid
+	 * - Just a L2 cache portion on system without L3 cache is valid
+	 */
+	if (list_is_singular(&plr->portions)) {
+		p = list_first_entry(&plr->portions, struct pseudo_lock_portion,
+				     list);
+		if (p->r->rid == RDT_RESOURCE_L3 ||
+		    (p->r->rid == RDT_RESOURCE_L2 &&
+		     !l3_resource->alloc_capable)) {
+			ret = pseudo_lock_single_portion_valid(plr, p);
+			if (ret < 0)
+				goto out_region;
+			return 0;
+		} else {
+			rdt_last_cmd_puts("Invalid resource or just L2 provided when L3 is required\n");
+			goto out_region;
+		}
+	} else {
+		rdt_last_cmd_puts("Multiple pseudo-lock portions unsupported\n");
 	}
 
-	plr->size = rdtgroup_cbm_to_size(plr->r, d, plr->cbm);
-
-	ret = pseudo_lock_cstates_constrain(plr, &d->cpu_mask);
-	if (ret < 0)
-		goto out_region;
-
-	return 0;
-
 out_region:
 	pseudo_lock_region_clear(plr);
-	return ret;
+	return -1;
 }
 
 /**
@@ -362,9 +425,9 @@ static int pseudo_lock_init(struct rdtgroup *rdtgrp)
 	if (!plr)
 		return -ENOMEM;
 
+	INIT_LIST_HEAD(&plr->portions);
 	init_waitqueue_head(&plr->lock_thread_wq);
 	INIT_LIST_HEAD(&plr->pm_reqs);
-	plr->d_id = -1;
 	rdtgrp->plr = plr;
 	return 0;
 }
@@ -849,6 +912,7 @@ bool rdtgroup_cbm_overlaps_pseudo_locked(struct rdt_resource *r,
 
 /**
  * rdtgroup_pseudo_locked_in_hierarchy - Pseudo-locked region in cache hierarchy
+ * @selfgrp: current resource group testing for overlap
  * @d: RDT domain under test
  *
  * The setup of a pseudo-locked region affects all cache instances within
@@ -860,8 +924,10 @@ bool rdtgroup_cbm_overlaps_pseudo_locked(struct rdt_resource *r,
  *         if it is not possible to test due to memory allocation or other
  *         failure, false otherwise.
  */
-bool rdtgroup_pseudo_locked_in_hierarchy(struct rdt_domain *d)
+bool rdtgroup_pseudo_locked_in_hierarchy(struct rdtgroup *selfgrp,
+					 struct rdt_domain *d)
 {
+	struct pseudo_lock_portion *p;
 	cpumask_var_t cpu_with_psl;
 	struct rdtgroup *rdtgrp;
 	struct rdt_domain *d_i;
@@ -875,15 +941,15 @@ bool rdtgroup_pseudo_locked_in_hierarchy(struct rdt_domain *d)
 	 * associated with them.
 	 */
 	list_for_each_entry(rdtgrp, &rdt_all_groups, rdtgroup_list) {
-		if (rdtgrp->plr && rdtgrp->plr->d_id >= 0) {
-			d_i = rdt_find_domain(rdtgrp->plr->r, rdtgrp->plr->d_id,
-					      NULL);
+		if (!rdtgrp->plr || rdtgrp == selfgrp)
+			continue;
+		list_for_each_entry(p, &rdtgrp->plr->portions, list) {
+			d_i = rdt_find_domain(p->r, p->d_id, NULL);
 			if (IS_ERR_OR_NULL(d_i)) {
 				ret = true;
 				goto out;
 			}
-			cpumask_or(cpu_with_psl, cpu_with_psl,
-				   &d_i->cpu_mask);
+			cpumask_or(cpu_with_psl, cpu_with_psl, &d_i->cpu_mask);
 		}
 	}
 
@@ -1200,6 +1266,7 @@ static int measure_l3_residency(void *_plr)
 static int pseudo_lock_measure_cycles(struct rdtgroup *rdtgrp, int sel)
 {
 	struct pseudo_lock_region *plr = rdtgrp->plr;
+	struct pseudo_lock_portion *p;
 	struct task_struct *thread;
 	struct rdt_domain *d;
 	unsigned int cpu;
@@ -1213,7 +1280,16 @@ static int pseudo_lock_measure_cycles(struct rdtgroup *rdtgrp, int sel)
 		goto out;
 	}
 
-	d = rdt_find_domain(plr->r, plr->d_id, NULL);
+	/*
+	 * Ensure test is run on CPU associated with the pseudo-locked
+	 * region. If pseudo-locked region spans L2 and L3, L2 portion
+	 * would be first in the list and all CPUs associated with the
+	 * L2 cache instance would be associated with pseudo-locked
+	 * region.
+	 */
+	p = list_first_entry(&plr->portions, struct pseudo_lock_portion, list);
+
+	d = rdt_find_domain(p->r, p->d_id, NULL);
 	if (IS_ERR_OR_NULL(d)) {
 		ret = -ENODEV;
 		goto out;
@@ -1515,6 +1591,7 @@ static int pseudo_lock_dev_mmap(struct file *filp, struct vm_area_struct *vma)
 	unsigned long vsize = vma->vm_end - vma->vm_start;
 	unsigned long off = vma->vm_pgoff << PAGE_SHIFT;
 	struct pseudo_lock_region *plr;
+	struct pseudo_lock_portion *p;
 	struct rdtgroup *rdtgrp;
 	unsigned long physical;
 	struct rdt_domain *d;
@@ -1531,7 +1608,16 @@ static int pseudo_lock_dev_mmap(struct file *filp, struct vm_area_struct *vma)
 
 	plr = rdtgrp->plr;
 
-	d = rdt_find_domain(plr->r, plr->d_id, NULL);
+	/*
+	 * If pseudo-locked region spans one cache level, there will be
+	 * only one portion to consider. If pseudo-locked region spans two
+	 * cache levels then the L2 cache portion will be the first entry
+	 * and a CPU associated with it is what a task is required to be run
+	 * on.
+	 */
+	p = list_first_entry(&plr->portions, struct pseudo_lock_portion, list);
+
+	d = rdt_find_domain(p->r, p->d_id, NULL);
 	if (IS_ERR_OR_NULL(d)) {
 		mutex_unlock(&rdtgroup_mutex);
 		return -ENODEV;
@@ -1640,15 +1726,17 @@ void rdt_pseudo_lock_release(void)
  */
 u32 rdtgroup_pseudo_locked_bits(struct rdt_resource *r, struct rdt_domain *d)
 {
+	struct pseudo_lock_portion *p;
 	struct rdtgroup *rdtgrp;
 	u32 pseudo_locked = 0;
 
 	list_for_each_entry(rdtgrp, &rdt_all_groups, rdtgroup_list) {
 		if (!rdtgrp->plr)
 			continue;
-		if (rdtgrp->plr->r && rdtgrp->plr->r->rid == r->rid &&
-		    rdtgrp->plr->d_id == d->id)
-			pseudo_locked |= rdtgrp->plr->cbm;
+		list_for_each_entry(p, &rdtgrp->plr->portions, list) {
+			if (p->r->rid == r->rid && p->d_id == d->id)
+				pseudo_locked |= p->cbm;
+		}
 	}
 
 	return pseudo_locked;
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index c9070cb4b6a5..6bea91d87883 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -269,17 +269,31 @@ static int rdtgroup_cpus_show(struct kernfs_open_file *of,
 
 	if (rdtgrp) {
 		if (rdtgrp->mode == RDT_MODE_PSEUDO_LOCKED) {
-			d = rdt_find_domain(rdtgrp->plr->r, rdtgrp->plr->d_id,
-					    NULL);
+			struct pseudo_lock_portion *p;
+
+			/*
+			 * User space needs to know all CPUs associated
+			 * with the pseudo-locked region. When the
+			 * pseudo-locked region spans multiple resources it
+			 * is possible that not all CPUs are associated with
+			 * all portions of the pseudo-locked region.
+			 * Only display CPUs that are associated with _all_
+			 * portions of the region. The first portion in the
+			 * list will be the L2 cache if the region spans
+			 * multiple resource, it is thus only needed to
+			 * print CPUs associated with the first portion.
+			 */
+			p = list_first_entry(&rdtgrp->plr->portions,
+					     struct pseudo_lock_portion, list);
+			d = rdt_find_domain(p->r, p->d_id, NULL);
 			if (IS_ERR_OR_NULL(d)) {
 				rdt_last_cmd_clear();
 				rdt_last_cmd_puts("Cache domain offline\n");
 				ret = -ENODEV;
-			} else {
-				seq_printf(s, is_cpu_list(of) ?
-					   "%*pbl\n" : "%*pb\n",
-					   cpumask_pr_args(&d->cpu_mask));
+				goto out;
 			}
+			seq_printf(s, is_cpu_list(of) ?  "%*pbl\n" : "%*pb\n",
+				   cpumask_pr_args(&d->cpu_mask));
 		} else {
 			seq_printf(s, is_cpu_list(of) ? "%*pbl\n" : "%*pb\n",
 				   cpumask_pr_args(&rdtgrp->cpu_mask));
@@ -287,8 +301,9 @@ static int rdtgroup_cpus_show(struct kernfs_open_file *of,
 	} else {
 		ret = -ENOENT;
 	}
-	rdtgroup_kn_unlock(of->kn);
 
+out:
+	rdtgroup_kn_unlock(of->kn);
 	return ret;
 }
 
@@ -1304,8 +1319,19 @@ static int rdtgroup_size_show(struct kernfs_open_file *of,
 	}
 
 	if (rdtgrp->mode == RDT_MODE_PSEUDO_LOCKED) {
-		seq_printf(s, "%*s:", max_name_width, rdtgrp->plr->r->name);
-		seq_printf(s, "%d=%u\n", rdtgrp->plr->d_id, rdtgrp->plr->size);
+		struct pseudo_lock_portion *portion;
+
+		/*
+		 * While the portions of the L2 and L3 caches allocated for a
+		 * pseudo-locked region may be different, the size used for
+		 * the pseudo-locked region is the same.
+		 */
+		list_for_each_entry(portion, &rdtgrp->plr->portions, list) {
+			seq_printf(s, "%*s:", max_name_width,
+				   portion->r->name);
+			seq_printf(s, "%d=%u\n", portion->d_id,
+				   rdtgrp->plr->size);
+		}
 		goto out;
 	}
 
-- 
2.17.2


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 09/10] x86/resctrl: Pseudo-lock portions of multiple resources
  2019-06-26 17:48 [PATCH 00/10] x86/CPU and x86/resctrl: Support pseudo-lock regions spanning L2 and L3 cache Reinette Chatre
                   ` (7 preceding siblings ...)
  2019-06-26 17:48 ` [PATCH 08/10] x86/resctrl: Support pseudo-lock regions spanning resources Reinette Chatre
@ 2019-06-26 17:48 ` Reinette Chatre
  2019-06-26 17:48 ` [PATCH 10/10] x86/resctrl: Only pseudo-lock L3 cache when inclusive Reinette Chatre
  2019-06-27  9:12 ` [PATCH 00/10] x86/CPU and x86/resctrl: Support pseudo-lock regions spanning L2 and L3 cache David Laight
  10 siblings, 0 replies; 13+ messages in thread
From: Reinette Chatre @ 2019-06-26 17:48 UTC (permalink / raw)
  To: tglx, fenghua.yu, bp, tony.luck
  Cc: mingo, hpa, x86, linux-kernel, Reinette Chatre

A cache pseudo-locked region may span more than one level of cache. The
part of the pseudo-locked region that falls on one cache level is
referred to as a pseudo-lock portion, a term introduced previously.

Now a pseudo-locked region is allowed to have two portions instead of
the previous limit of one. When a pseudo-locked region consists of two
portions it can only span an L2 and an L3 resctrl resource.
When a pseudo-locked region consists of an L2 and an L3 portion then
the following requirements apply:
- the L2 and L3 cache have to be in the same cache hierarchy
- the L3 portion must be the same size as or larger than the L2 portion

As documented in previous changes the list of portions is maintained so
that the L2 portion always appears first in the list to simplify
information retrieval.

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
 arch/x86/kernel/cpu/resctrl/pseudo_lock.c | 142 +++++++++++++++++++++-
 1 file changed, 139 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
index a7fe53447a7e..4e47ad582db6 100644
--- a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
+++ b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
@@ -339,13 +339,104 @@ static int pseudo_lock_single_portion_valid(struct pseudo_lock_region *plr,
 	return -1;
 }
 
+/**
+ * pseudo_lock_l2_l3_portions_valid - Verify region across L2 and L3
+ * @plr: Pseudo-Locked region
+ * @l2_portion: L2 Cache portion of pseudo-locked region
+ * @l3_portion: L3 Cache portion of pseudo-locked region
+ *
+ * User requested a pseudo-locked region consisting of a L2 as well as L3
+ * cache portion. The portions are tested as follows:
+ *   - L2 and L3 cache instances have to be in the same cache hierarchy.
+ *     This is tested by ensuring that the L2 portion's cpumask is a
+ *     subset of the L3 portion's cpumask.
+ *   - L3 portion must be same size or larger than L2 portion.
+ *
+ * Return: -1 if the portions are unable to be used for a pseudo-locked
+ *         region, 0 if the portions could be used for a pseudo-locked
+ *         region. When returning 0:
+ *         - the pseudo-locked region's size, line_size (cache line length)
+ *           and CPU on which locking thread will be run are set.
+ *         - CPUs associated with L2 cache portion are constrained from
+ *           entering C-state that will affect the pseudo-locked region.
+ */
+static int pseudo_lock_l2_l3_portions_valid(struct pseudo_lock_region *plr,
+					    struct pseudo_lock_portion *l2_p,
+					    struct pseudo_lock_portion *l3_p)
+{
+	struct rdt_domain *l2_d, *l3_d;
+	unsigned int l2_size, l3_size;
+
+	l2_d = rdt_find_domain(l2_p->r, l2_p->d_id, NULL);
+	if (IS_ERR_OR_NULL(l2_d)) {
+		rdt_last_cmd_puts("Cannot locate L2 cache domain\n");
+		return -1;
+	}
+
+	l3_d = rdt_find_domain(l3_p->r, l3_p->d_id, NULL);
+	if (IS_ERR_OR_NULL(l3_d)) {
+		rdt_last_cmd_puts("Cannot locate L3 cache domain\n");
+		return -1;
+	}
+
+	if (!cpumask_subset(&l2_d->cpu_mask, &l3_d->cpu_mask)) {
+		rdt_last_cmd_puts("L2 and L3 caches need to be in same hierarchy\n");
+		return -1;
+	}
+
+	if (pseudo_lock_cstates_constrain(plr, &l2_d->cpu_mask)) {
+		rdt_last_cmd_puts("Cannot limit C-states\n");
+		return -1;
+	}
+
+	l2_size = rdtgroup_cbm_to_size(l2_p->r, l2_d, l2_p->cbm);
+	l3_size = rdtgroup_cbm_to_size(l3_p->r, l3_d, l3_p->cbm);
+
+	if (l2_size > l3_size) {
+		rdt_last_cmd_puts("L3 cache portion has to be same size or larger than L2 cache portion\n");
+		goto err_size;
+	}
+
+	plr->size = l2_size;
+
+	l2_size = get_cache_line_size(cpumask_first(&l2_d->cpu_mask),
+				      l2_p->r->cache_level);
+	l3_size = get_cache_line_size(cpumask_first(&l3_d->cpu_mask),
+				      l3_p->r->cache_level);
+	if (l2_size != l3_size) {
+		rdt_last_cmd_puts("L2 and L3 caches have different coherency cache line sizes\n");
+		goto err_line;
+	}
+
+	plr->line_size = l2_size;
+
+	plr->cpu = cpumask_first(&l2_d->cpu_mask);
+
+	if (!cpu_online(plr->cpu)) {
+		rdt_last_cmd_printf("CPU %u associated with cache not online\n",
+				    plr->cpu);
+		goto err_cpu;
+	}
+
+	return 0;
+
+err_cpu:
+	plr->line_size = 0;
+	plr->cpu = 0;
+err_line:
+	plr->size = 0;
+err_size:
+	pseudo_lock_cstates_relax(plr);
+	return -1;
+}
+
 /**
  * pseudo_lock_region_init - Initialize pseudo-lock region information
  * @plr: pseudo-lock region
  *
  * Called after user provided a schemata to be pseudo-locked. From the
  * schemata the &struct pseudo_lock_region is on entry already initialized
- * with the resource, domain, and capacity bitmask. Here the
+ * with the resource(s), domain(s), and capacity bitmask(s). Here the
  * provided data is validated and information required for pseudo-locking
  * deduced, and &struct pseudo_lock_region initialized further. This
  * information includes:
@@ -355,13 +446,24 @@ static int pseudo_lock_single_portion_valid(struct pseudo_lock_region *plr,
  * - a cpu associated with the cache instance on which the pseudo-locking
  *   flow can be executed
  *
+ * A user provides a schemata for a pseudo-locked region. This schemata may
+ * contain portions that span different resources, for example, a cache
+ * pseudo-locked region that spans L2 and L3 cache. After the schemata is
+ * parsed into portions it needs to be verified that the provided portions
+ * are valid with the following tests:
+ *
+ * - L2 only portion on system that has only L2 resource - OK
+ * - L3 only portion on any system that supports it - OK
+ * - L2 portion on system that has L3 resource - require L3 portion
+ *
+ *
  * Return: 0 on success, <0 on failure. Descriptive error will be written
  * to last_cmd_status buffer.
  */
 static int pseudo_lock_region_init(struct pseudo_lock_region *plr)
 {
 	struct rdt_resource *l3_resource = &rdt_resources_all[RDT_RESOURCE_L3];
-	struct pseudo_lock_portion *p;
+	struct pseudo_lock_portion *p, *n_p, *tmp;
 	int ret;
 
 	if (list_empty(&plr->portions)) {
@@ -397,8 +499,42 @@ static int pseudo_lock_region_init(struct pseudo_lock_region *plr)
 			rdt_last_cmd_puts("Invalid resource or just L2 provided when L3 is required\n");
 			goto out_region;
 		}
+	}
+
+	/*
+	 * List is neither empty nor singular, process first and second portions
+	 */
+	p = list_first_entry(&plr->portions, struct pseudo_lock_portion, list);
+	n_p = list_next_entry(p, list);
+
+	/*
+	 * If the second portion is not also the last portion user provided
+	 * more portions than can be supported.
+	 */
+	tmp = list_last_entry(&plr->portions, struct pseudo_lock_portion, list);
+	if (n_p != tmp) {
+		rdt_last_cmd_puts("Only two pseudo-lock portions supported\n");
+		goto out_region;
+	}
+
+	if (p->r->rid == RDT_RESOURCE_L2 && n_p->r->rid == RDT_RESOURCE_L3) {
+		ret = pseudo_lock_l2_l3_portions_valid(plr, p, n_p);
+		if (ret < 0)
+			goto out_region;
+		return 0;
+	} else if (p->r->rid == RDT_RESOURCE_L3 &&
+		   n_p->r->rid == RDT_RESOURCE_L2) {
+		if (pseudo_lock_l2_l3_portions_valid(plr, n_p, p) == 0) {
+			/*
+			 * Let L2 and L3 portions appear in order in the
+			 * portions list in support of consistent output to
+			 * user space.
+			 */
+			list_rotate_left(&plr->portions);
+			return 0;
+		}
 	} else {
-		rdt_last_cmd_puts("Multiple pseudo-lock portions unsupported\n");
+		rdt_last_cmd_puts("Invalid combination of resources\n");
 	}
 
 out_region:
-- 
2.17.2


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 10/10] x86/resctrl: Only pseudo-lock L3 cache when inclusive
  2019-06-26 17:48 [PATCH 00/10] x86/CPU and x86/resctrl: Support pseudo-lock regions spanning L2 and L3 cache Reinette Chatre
                   ` (8 preceding siblings ...)
  2019-06-26 17:48 ` [PATCH 09/10] x86/resctrl: Pseudo-lock portions of multiple resources Reinette Chatre
@ 2019-06-26 17:48 ` Reinette Chatre
  2019-06-27  9:12 ` [PATCH 00/10] x86/CPU and x86/resctrl: Support pseudo-lock regions spanning L2 and L3 cache David Laight
  10 siblings, 0 replies; 13+ messages in thread
From: Reinette Chatre @ 2019-06-26 17:48 UTC (permalink / raw)
  To: tglx, fenghua.yu, bp, tony.luck
  Cc: mingo, hpa, x86, linux-kernel, Reinette Chatre

Cache pseudo-locking is a model specific feature and platforms
supporting it are enabled by adding their x86 model data to the source
code after cache pseudo-locking has been validated for the particular
platform.

Indicating support for cache pseudo-locking for an entire platform is
sufficient when the cache characteristics of the platform are the same
for all instances of the platform. If this is not the case then an
additional check needs to be added. In particular, it is currently only
possible to pseudo-lock an L3 cache region if the L3 cache is inclusive
of lower level caches. If the L3 cache is not inclusive then any
pseudo-locked data would be evicted from the pseudo-locked region when
it is moved to the L2 cache.

When some SKUs of a platform have an inclusive cache while other SKUs
have a non-inclusive cache it is necessary to check, in addition to
whether the platform supports cache pseudo-locking, whether the cache
being pseudo-locked is inclusive.
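As a user-space illustration of the property being tested (the kernel
side obtains the same information through cacheinfo in this series),
CPUID leaf 4 reports per-cache inclusiveness in EDX bit 1. A minimal
sketch for Intel x86, built with GCC or Clang:

  /* Hedged sketch: report whether the L3 cache describes itself as
   * inclusive of lower cache levels via CPUID leaf 4 (EDX bit 1).
   */
  #include <cpuid.h>
  #include <stdio.h>

  int main(void)
  {
      unsigned int eax, ebx, ecx, edx, i;

      for (i = 0; ; i++) {
          __cpuid_count(4, i, eax, ebx, ecx, edx);
          if ((eax & 0x1f) == 0)          /* no more cache levels */
              break;
          if (((eax >> 5) & 0x7) == 3) {  /* level 3 cache */
              printf("L3 inclusive: %s\n",
                     (edx & 0x2) ? "yes" : "no");
              return 0;
          }
      }
      printf("no L3 cache reported\n");
      return 0;
  }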

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
 arch/x86/kernel/cpu/resctrl/pseudo_lock.c | 35 +++++++++++++++++++++++
 1 file changed, 35 insertions(+)

diff --git a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
index 4e47ad582db6..e79f555d5226 100644
--- a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
+++ b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
@@ -125,6 +125,30 @@ static unsigned int get_cache_line_size(unsigned int cpu, int level)
 	return 0;
 }
 
+/**
+ * get_cache_inclusive - Determine if cache is inclusive of lower levels
+ * @cpu: CPU with which cache is associated
+ * @level: Cache level
+ *
+ * Context: @cpu has to be online.
+ * Return: 1 if cache is inclusive of lower cache levels, 0 if cache is not
+ *         inclusive of lower cache levels or on failure.
+ */
+static unsigned int get_cache_inclusive(unsigned int cpu, int level)
+{
+	struct cpu_cacheinfo *ci;
+	int i;
+
+	ci = get_cpu_cacheinfo(cpu);
+
+	for (i = 0; i < ci->num_leaves; i++) {
+		if (ci->info_list[i].level == level)
+			return ci->info_list[i].inclusive;
+	}
+
+	return 0;
+}
+
 /**
  * pseudo_lock_minor_get - Obtain available minor number
  * @minor: Pointer to where new minor number will be stored
@@ -317,6 +341,12 @@ static int pseudo_lock_single_portion_valid(struct pseudo_lock_region *plr,
 		goto err_cpu;
 	}
 
+	if (p->r->cache_level == 3 &&
+	    !get_cache_inclusive(plr->cpu, p->r->cache_level)) {
+		rdt_last_cmd_puts("L3 cache not inclusive\n");
+		goto err_cpu;
+	}
+
 	plr->line_size = get_cache_line_size(plr->cpu, p->r->cache_level);
 	if (plr->line_size == 0) {
 		rdt_last_cmd_puts("Unable to compute cache line length\n");
@@ -418,6 +448,11 @@ static int pseudo_lock_l2_l3_portions_valid(struct pseudo_lock_region *plr,
 		goto err_cpu;
 	}
 
+	if (!get_cache_inclusive(plr->cpu, l3_p->r->cache_level)) {
+		rdt_last_cmd_puts("L3 cache not inclusive\n");
+		goto err_cpu;
+	}
+
 	return 0;
 
 err_cpu:
-- 
2.17.2


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* RE: [PATCH 00/10] x86/CPU and x86/resctrl: Support pseudo-lock regions spanning L2 and L3 cache
  2019-06-26 17:48 [PATCH 00/10] x86/CPU and x86/resctrl: Support pseudo-lock regions spanning L2 and L3 cache Reinette Chatre
                   ` (9 preceding siblings ...)
  2019-06-26 17:48 ` [PATCH 10/10] x86/resctrl: Only pseudo-lock L3 cache when inclusive Reinette Chatre
@ 2019-06-27  9:12 ` David Laight
  2019-06-27 17:55   ` Reinette Chatre
  10 siblings, 1 reply; 13+ messages in thread
From: David Laight @ 2019-06-27  9:12 UTC (permalink / raw)
  To: 'Reinette Chatre', tglx, fenghua.yu, bp, tony.luck
  Cc: mingo, hpa, x86, linux-kernel

From: Reinette Chatre
> Sent: 26 June 2019 18:49
> 
> Cache pseudo-locking involves preloading a region of physical memory into a
> reserved portion of cache that no task or CPU can subsequently fill into and
> from that point on will only serve cache hits. At this time it is only
> possible to create cache pseudo-locked regions in either L2 or L3 cache,
> supporting systems that support either L2 Cache Allocation Technology (CAT)
> or L3 CAT because CAT is the mechanism used to manage reservations of cache
> portions.

While this is a 'nice' hardware feature for some kinds of embedded systems,
I don't see how it can be sensibly used inside a Linux kernel.
There are an awful lot of places where things can go horribly wrong.
I can imagine:
- Multiple requests to lock regions that end up trying to use the same
  set-associative cache lines leaving none for normal operation.
- Excessive cache line bouncing because fewer lines are available.
- The effect of cache invalidate requests for the locked addresses.
- I suspect the Linux kernel can do full cache invalidates at certain times.

You've not given a use case.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 00/10] x86/CPU and x86/resctrl: Support pseudo-lock regions spanning L2 and L3 cache
  2019-06-27  9:12 ` [PATCH 00/10] x86/CPU and x86/resctrl: Support pseudo-lock regions spanning L2 and L3 cache David Laight
@ 2019-06-27 17:55   ` Reinette Chatre
  0 siblings, 0 replies; 13+ messages in thread
From: Reinette Chatre @ 2019-06-27 17:55 UTC (permalink / raw)
  To: David Laight, tglx, fenghua.yu, bp, tony.luck
  Cc: mingo, hpa, x86, linux-kernel

Hi David,

On 6/27/2019 2:12 AM, David Laight wrote:
> From: Reinette Chatre
>> Sent: 26 June 2019 18:49
>>
>> Cache pseudo-locking involves preloading a region of physical memory into a
>> reserved portion of cache that no task or CPU can subsequently fill into and
>> from that point on will only serve cache hits. At this time it is only
>> possible to create cache pseudo-locked regions in either L2 or L3 cache,
>> supporting systems that support either L2 Cache Allocation Technology (CAT)
>> or L3 CAT because CAT is the mechanism used to manage reservations of cache
>> portions.
> 
> While this is a 'nice' hardware feature for some kinds of embedded systems
> I don't see how it can be sensibly used inside a Linux kernel.

Cache pseudo-locking is an existing (obviously not well known) feature
that has been in the Linux kernel since v4.19.

> There are an awful lot of places where things can go horribly wrong.

The worst thing that can go wrong is that the memory is evicted from the
pseudo-locked region and, when it is accessed again, has to share cache
with all other memory sharing the same class of service it is accessed
under. The consequence is the loss of the lower latency when accessing
this high priority memory, as well as reduced cache availability due to
the orphaned ways used for the pseudo-locked region.

This worst case could happen when the task runs on a CPU that is not
associated with the cache on which its memory is pseudo-locked, so the
application is expected to be affined only to CPUs associated with the
correct cache. This is a familiar constraint for high priority
applications.

Other ways in which memory could be evicted are addressed below as part
of your detailed concerns.

> I can imagine:
> - Multiple requests to lock regions that end up trying to use the same
>   set-associative cache lines leaving none for normal operation.

I think that you are perhaps comparing this to cache coloring? Cache
pseudo-locking builds on CAT, which is a way-based cache allocation
mechanism. It is impossible to use all cache ways for pseudo-locking
since the default resource group cannot be used for pseudo-locking and
resource groups will always have cache available to them (specifically,
an all-zero capacity bitmask (CBM) is illegal on the Intel hardware to
which this feature is specific).

> - Excessive cache line bouncing because fewer lines are available.

This is not specific to cache pseudo-locking. With cache allocation
technology (CAT), on which cache pseudo-locking is built, the system
administrator can partition the cache into portions and assign
tasks/CPUs to these different portions to manage interference between
the different tasks/CPUs.

You are right that fewer cache lines would be available to different
tasks/CPUs. By reducing the number of cache lines available to specific
classes of service and managing overlap between these different classes
of service the system administrator is able to manage interference
between different classes of tasks or even CPUs.
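
As a concrete illustration of that kind of partitioning with the
existing resctrl interface (the group name, CPU list and bitmask below
are examples only, and resctrl is assumed to be mounted):

  /* Hedged sketch: carve out an L3 portion for CPUs 2-3 by assigning
   * them to a resource group with their own capacity bitmask.
   */
  #include <fcntl.h>
  #include <stdio.h>
  #include <string.h>
  #include <sys/stat.h>
  #include <unistd.h>

  static void write_str(const char *path, const char *val)
  {
      int fd = open(path, O_WRONLY);

      if (fd < 0 || write(fd, val, strlen(val)) < 0)
          perror(path);
      if (fd >= 0)
          close(fd);
  }

  int main(void)
  {
      if (mkdir("/sys/fs/resctrl/rt", 0755))
          perror("mkdir");
      write_str("/sys/fs/resctrl/rt/cpus_list", "2-3");
      write_str("/sys/fs/resctrl/rt/schemata", "L3:0=0x0f0\n");
      return 0;
  }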

> - The effect of cache invalidate requests for the locked addresses.

This is correct and is documented in Documentation/x86/resctrl_ui.rst:

<snip>
Cache pseudo-locking increases the probability that data will remain
in the cache via carefully configuring the CAT feature and controlling
application behavior. There is no guarantee that data is placed in
cache. Instructions like INVD, WBINVD, CLFLUSH, etc. can still evict
“locked” data from cache. Power management C-states may shrink or
power off cache. Deeper C-states will automatically be restricted on
pseudo-locked region creation.
<snip>

An application requesting pseudo-locked memory should not CLFLUSH that
memory.
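
For illustration, a minimal sketch of such an application, assuming a
pseudo-locked region already exists and is exposed as
/dev/pseudo_lock/newlock (the device name and mapping size are
assumptions for this example):

  /* Hedged sketch: map a pseudo-locked region through its character
   * device and access it with plain loads/stores only - no CLFLUSH.
   */
  #include <fcntl.h>
  #include <stdio.h>
  #include <sys/mman.h>
  #include <unistd.h>

  int main(void)
  {
      size_t size = 256 * 1024;   /* must not exceed the region size */
      unsigned char *mem;
      size_t i;
      int fd;

      fd = open("/dev/pseudo_lock/newlock", O_RDWR);
      if (fd < 0) {
          perror("open");
          return 1;
      }
      mem = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
      if (mem == MAP_FAILED) {
          perror("mmap");
          close(fd);
          return 1;
      }
      /* Ordinary accesses are served from the pseudo-locked lines. */
      for (i = 0; i < size; i += 64)
          mem[i]++;
      munmap(mem, size);
      close(fd);
      return 0;
  }

The application should additionally affine itself to one of the CPUs
listed in the resource group's "cpus" file, as noted earlier.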

> - I suspect the Linux kernel can do full cache invalidates at certain times.

This is correct. Fortunately the Linux kernel is averse to calling WBINVD
at runtime and not many instances remain. A previous attempt at
handling these found only two direct invocations of WBINVD, neither of
which was likely to be used on a cache pseudo-locking system. During that
discussion it was proposed that, instead of needing to handle these, we
should just get rid of WBINVD, but such a system wide change was too
daunting for me at that time. For reference, please see:
http://lkml.kernel.org/r/alpine.DEB.2.21.1808031343020.1745@nanos.tec.linutronix.de

> 
> You've not given a use case.
> 

I think you may be asking for a use case of the original cache
pseudo-locking feature, not a use case for the additional support
contained in this series? The primary use cases for cache
pseudo-locking right now are industrial PLCs/automation and
high-frequency trading/financial enterprise systems, but anything with
relatively small repeating data structures should see benefit.

Reinette

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2019-06-27 17:55 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-06-26 17:48 [PATCH 00/10] x86/CPU and x86/resctrl: Support pseudo-lock regions spanning L2 and L3 cache Reinette Chatre
2019-06-26 17:48 ` [PATCH 01/10] x86/CPU: Expose if cache is inclusive of lower level caches Reinette Chatre
2019-06-26 17:48 ` [PATCH 02/10] x86/resctrl: Remove unnecessary size compute Reinette Chatre
2019-06-26 17:48 ` [PATCH 03/10] x86/resctrl: Constrain C-states during pseudo-lock region init Reinette Chatre
2019-06-26 17:48 ` [PATCH 04/10] x86/resctrl: Set cache line size using new utility Reinette Chatre
2019-06-26 17:48 ` [PATCH 05/10] x86/resctrl: Associate pseudo-locked region's cache instance by id Reinette Chatre
2019-06-26 17:48 ` [PATCH 06/10] x86/resctrl: Introduce utility to return pseudo-locked cache portion Reinette Chatre
2019-06-26 17:48 ` [PATCH 07/10] x86/resctrl: Remove unnecessary pointer to pseudo-locked region Reinette Chatre
2019-06-26 17:48 ` [PATCH 08/10] x86/resctrl: Support pseudo-lock regions spanning resources Reinette Chatre
2019-06-26 17:48 ` [PATCH 09/10] x86/resctrl: Pseudo-lock portions of multiple resources Reinette Chatre
2019-06-26 17:48 ` [PATCH 10/10] x86/resctrl: Only pseudo-lock L3 cache when inclusive Reinette Chatre
2019-06-27  9:12 ` [PATCH 00/10] x86/CPU and x86/resctrl: Support pseudo-lock regions spanning L2 and L3 cache David Laight
2019-06-27 17:55   ` Reinette Chatre

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).