From: Reinette Chatre <reinette.chatre@intel.com>
To: tglx@linutronix.de, fenghua.yu@intel.com, tony.luck@intel.com
Cc: vikas.shivappa@linux.intel.com, dave.hansen@intel.com,
	mingo@redhat.com, hpa@zytor.com, x86@kernel.org,
	linux-kernel@vger.kernel.org,
	Reinette Chatre <reinette.chatre@intel.com>
Subject: [RFC PATCH 01/20] x86/intel_rdt: Documentation for Cache Pseudo-Locking
Date: Mon, 13 Nov 2017 08:39:24 -0800
Message-ID: <a223710811a54d69a61a92466c6ad528f4990f96.1510568528.git.reinette.chatre@intel.com>
In-Reply-To: <cover.1510568528.git.reinette.chatre@intel.com>

Add description of Cache Pseudo-Locking feature, its interface,
as well as an example of its usage.

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
 Documentation/x86/intel_rdt_ui.txt | 229 ++++++++++++++++++++++++++++++++++++-
 1 file changed, 228 insertions(+), 1 deletion(-)

diff --git a/Documentation/x86/intel_rdt_ui.txt b/Documentation/x86/intel_rdt_ui.txt
index 6851854cf69d..9924f7146c63 100644
--- a/Documentation/x86/intel_rdt_ui.txt
+++ b/Documentation/x86/intel_rdt_ui.txt
@@ -18,7 +18,10 @@ mount options are:
 "cdp": Enable code/data prioritization in L3 cache allocations.
 
 RDT features are orthogonal. A particular system may support only
-monitoring, only control, or both monitoring and control.
+monitoring, only control, or both monitoring and control. Cache
+pseudo-locking is a unique way of using cache control to "pin" or
+"lock" data in the cache. Details can be found in
+"Cache Pseudo-Locking".
 
 The mount succeeds if either of allocation or monitoring is present, but
 only those files and directories supported by the system will be created.
@@ -320,6 +323,149 @@ L3CODE:0=fffff;1=fffff;2=fffff;3=fffff
 L3DATA:0=fffff;1=fffff;2=3c0;3=fffff
 L3CODE:0=fffff;1=fffff;2=fffff;3=fffff
 
+Cache Pseudo-Locking
+--------------------
+CAT enables a user to specify the amount of cache space that an
+application can fill. Cache pseudo-locking builds on the fact that a
+CPU can still read and write data pre-allocated outside its current
+allocated area on a cache hit. With cache pseudo-locking, data can be
+preloaded into a reserved portion of cache that no application can
+fill, and from that point on will only serve cache hits. The cache
+pseudo-locked memory is made accessible to user space where an
+application can map it into its virtual address space and thus have
+a region of memory with reduced average read latency.
+
+Cache pseudo-locking increases the probability that data will remain
+in the cache via carefully configuring the CAT feature and controlling
+application behavior. There is no guarantee that data is placed in
+cache. Instructions like INVD, WBINVD, CLFLUSH, etc. can still evict
+"locked" data from cache. Power management C-states may shrink or
+power off cache. It is thus recommended to limit the processor maximum
+C-state, for example, by setting the processor.max_cstate kernel parameter.
+
+It is required that an application using a pseudo-locked region runs
+with affinity to the cores (or a subset of the cores) associated
+with the cache on which the pseudo-locked region resides. This is
+enforced by the implementation.
+
+Pseudo-locking is accomplished in two stages:
+1) During the first stage the system administrator allocates a portion
+   of cache that should be dedicated to pseudo-locking. At this time an
+   equivalent portion of memory is allocated, loaded into the allocated
+   cache portion, and exposed as a character device.
+2) During the second stage a user-space application maps (mmap()) the
+   pseudo-locked memory into its address space.
+
+Cache Pseudo-Locking Interface
+------------------------------
+Platforms supporting cache pseudo-locking will expose a new
+"/sys/fs/resctrl/pseudo_lock" directory after successful mount of the
+resctrl filesystem. Initially this directory will contain a single file,
+"avail", that contains the schemata, one line per resource, of the cache
+regions available for pseudo-locking.
+
+A pseudo-locked region is created by creating a new directory within
+/sys/fs/resctrl/pseudo_lock. On success two new files will appear in
+the directory:
+
+"schemata":
+	Shows the schemata representing the pseudo-locked cache region.
+	The user writes the schemata of the requested locked area to this
+	file. Only one id of a single resource is accepted: a region can
+	only be locked from a single cache instance. A write to this file
+	returns success once the pseudo-locked region is set up.
+"size":
+	After successful pseudo-locked region setup this read-only file
+	will contain the size in bytes of the pseudo-locked region.
+
+Cache Pseudo-Locking Debugging Interface
+----------------------------------------
+The pseudo-locking debugging interface is enabled with
+CONFIG_INTEL_RDT_DEBUGFS and can be found in
+/sys/kernel/debug/resctrl/pseudo_lock.
+
+There is no explicit way for the kernel to test if a provided memory
+location is present in the cache. The pseudo-locking debugging interface uses
+the tracing infrastructure to provide two ways to measure cache residency of
+the pseudo-locked region:
+1) Memory access latency using the pseudo_lock_mem_latency tracepoint. Data
+   from these measurements are best visualized using a hist trigger (see
+   example below). In this test the pseudo-locked region is traversed at
+   a stride of 32 bytes while hardware prefetchers, preemption, and interrupts
+   are disabled. This also provides a substitute visualization of cache
+   hits and misses.
+2) Cache hit and miss measurements using model specific precision counters if
+   available. Depending on the levels of cache on the system the following
+   tracepoints are available: pseudo_lock_l2_hits, pseudo_lock_l2_miss,
+   pseudo_lock_l3_miss, and pseudo_lock_l3_hits. WARNING: triggering this
+   measurement uses from two (for just L2 measurements) to four (for L2 and L3
+   measurements) precision counters on the system. If any other
+   measurements are in progress, the counters and their corresponding event
+   registers will be clobbered.
+
+When a pseudo-locked region is created, a new directory is created for
+it in debugfs as /sys/kernel/debug/resctrl/pseudo_lock/<newdir>. A single
+write-only file, measure_trigger, is present in this directory. The
+measurement performed on the pseudo-locked region depends on the number
+written to this file: 1 for latency, 2 for cache hits and misses. Since
+the measurements are recorded with the tracing infrastructure, the relevant
+tracepoints need to be enabled before the measurement is triggered.
+
+Example of latency debugging interface:
+In this example a pseudo-locked region named "newlock" was created. Here is
+how we can measure the latency in cycles of reading from this region:
+# :> /sys/kernel/debug/tracing/trace
+# echo 'hist:keys=latency' > /sys/kernel/debug/tracing/events/pseudo_lock/pseudo_lock_mem_latency/trigger
+# echo 1 > /sys/kernel/debug/tracing/events/pseudo_lock/pseudo_lock_mem_latency/enable
+# echo 1 > /sys/kernel/debug/resctrl/pseudo_lock/newlock/measure_trigger
+# echo 0 > /sys/kernel/debug/tracing/events/pseudo_lock/pseudo_lock_mem_latency/enable
+# cat /sys/kernel/debug/tracing/events/pseudo_lock/pseudo_lock_mem_latency/hist
+
+# event histogram
+#
+# trigger info: hist:keys=latency:vals=hitcount:sort=hitcount:size=2048 [active]
+#
+
+{ latency:        456 } hitcount:          1
+{ latency:         50 } hitcount:         83
+{ latency:         36 } hitcount:         96
+{ latency:         44 } hitcount:        174
+{ latency:         48 } hitcount:        195
+{ latency:         46 } hitcount:        262
+{ latency:         42 } hitcount:        693
+{ latency:         40 } hitcount:       3204
+{ latency:         38 } hitcount:       3484
+
+Totals:
+    Hits: 8192
+    Entries: 9
+    Dropped: 0
+
+Example of cache hits/misses debugging:
+In this example a pseudo-locked region named "newlock" was created on the L2
+cache of a platform. Here is how we can obtain details of the cache hits
+and misses using the platform's precision counters.
+
+# :> /sys/kernel/debug/tracing/trace
+# echo 1 > /sys/kernel/debug/tracing/events/pseudo_lock/pseudo_lock_l2_hits/enable
+# echo 1 > /sys/kernel/debug/tracing/events/pseudo_lock/pseudo_lock_l2_miss/enable
+# echo 2 > /sys/kernel/debug/resctrl/pseudo_lock/newlock/measure_trigger
+# echo 0 > /sys/kernel/debug/tracing/events/pseudo_lock/pseudo_lock_l2_hits/enable
+# echo 0 > /sys/kernel/debug/tracing/events/pseudo_lock/pseudo_lock_l2_miss/enable
+# cat /sys/kernel/debug/tracing/trace
+
+# tracer: nop
+#
+#                              _-----=> irqs-off
+#                             / _----=> need-resched
+#                            | / _---=> hardirq/softirq
+#                            || / _--=> preempt-depth
+#                            ||| /     delay
+#           TASK-PID   CPU#  ||||    TIMESTAMP  FUNCTION
+#              | |       |   ||||       |         |
+ pseudo_lock_mea-1039  [002] ....  1598.825180: pseudo_lock_l2_hits: L2 hits=4097
+ pseudo_lock_mea-1039  [002] ....  1598.825184: pseudo_lock_l2_miss: L2 miss=2
+
 Examples for RDT allocation usage:
 
 Example 1
@@ -434,6 +580,87 @@ siblings and only the real time threads are scheduled on the cores 4-7.
 
 # echo F0 > p0/cpus
 
+Example of Cache Pseudo-Locking
+-------------------------------
+Lock a portion of the L2 cache with cache id 1 using CBM 0x3. The
+pseudo-locked region is exposed at /dev/pseudo_lock/newlock, which an
+application can open and map with mmap().
+
+# cd /sys/fs/resctrl/pseudo_lock
+# cat avail
+L2:0=ff;1=ff
+# mkdir newlock
+# cd newlock
+# cat schemata
+L2:uninitialized
+# echo 'L2:1=3' > schemata
+# ls -l /dev/pseudo_lock/newlock
+crw------- 1 root root 244, 0 Mar 30 03:00 /dev/pseudo_lock/newlock
+
+/*
+ * Example code to access one page of pseudo-locked cache region
+ * from user space.
+ */
+#define _GNU_SOURCE
+#include <fcntl.h>
+#include <sched.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <sys/mman.h>
+
+/*
+ * It is required that the application runs with affinity to only
+ * cores associated with the pseudo-locked region. Here the cpu
+ * is hardcoded for convenience of example.
+ */
+static int cpuid = 2;
+
+int main(int argc, char *argv[])
+{
+	cpu_set_t cpuset;
+	long page_size;
+	void *mapping;
+	int dev_fd;
+	int ret;
+
+	page_size = sysconf(_SC_PAGESIZE);
+
+	CPU_ZERO(&cpuset);
+	CPU_SET(cpuid, &cpuset);
+	ret = sched_setaffinity(0, sizeof(cpuset), &cpuset);
+	if (ret < 0) {
+		perror("sched_setaffinity");
+		exit(EXIT_FAILURE);
+	}
+
+	dev_fd = open("/dev/pseudo_lock/newlock", O_RDWR);
+	if (dev_fd < 0) {
+		perror("open");
+		exit(EXIT_FAILURE);
+	}
+
+	mapping = mmap(0, page_size, PROT_READ | PROT_WRITE, MAP_SHARED,
+		       dev_fd, 0);
+	if (mapping == MAP_FAILED) {
+		perror("mmap");
+		close(dev_fd);
+		exit(EXIT_FAILURE);
+	}
+
+	/* Application interacts with pseudo-locked memory @mapping */
+
+	ret = munmap(mapping, page_size);
+	if (ret < 0) {
+		perror("munmap");
+		close(dev_fd);
+		exit(EXIT_FAILURE);
+	}
+
+	close(dev_fd);
+	exit(EXIT_SUCCESS);
+}
+
 4) Locking between applications
 
 Certain operations on the resctrl filesystem, composed of read/writes
-- 
2.13.5
