linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH V3 0/6] perf/core and x86/intel_rdt: Fix lack of coordination with perf
@ 2018-09-11 17:14 Reinette Chatre
  2018-09-11 17:14 ` [PATCH V3 1/6] perf/core: Add sanity check to deal with pinned event failure Reinette Chatre
                   ` (5 more replies)
  0 siblings, 6 replies; 17+ messages in thread
From: Reinette Chatre @ 2018-09-11 17:14 UTC (permalink / raw)
  To: tglx, fenghua.yu, tony.luck, peterz, mingo, acme
  Cc: gavin.hindman, jithu.joseph, dave.hansen, hpa, x86, linux-kernel,
	Reinette Chatre

Dear Maintainers,

This new series fixing the lack of coordination between the
pseudo-locking measurement code and perf addresses all feedback received
for V2.

Changes since V2:
- Move the helper to obtain the performance counter index to
  include/linux/perf_event.h. The request was actually to move this helper
  to arch/x86/include/asm/perf_event.h - but doing so would be more
  involved since this header file does not know about struct perf_event
  that is used by this helper. There was no response for further
  clarification of the request to move this helper so I proceeded to move it
  to include/linux/perf_event.h instead.
- Change name of helper to obtain the index to perf_rdpmc_index() - the
  original request was to name it x86_perf_rdpmc_index() but this seems to
  be tied to the suggested header location. With the header location of
  include/linux/perf_event.h the name perf_rdpmc_index() seems to fit
  better with the new destination. There was no response for further
  clarification of the naming change request so I proceeded with the change.
- Replace all local register variables used in the measurement routines
  with local variables using READ_ONCE().
- The removal of local register variables also enable us to replace the
  direct __wrmsr() with wrmsr().
- Merge the L2 and L3 measurement routines following Peter's suggested
  framework.
- Do not copy the text from SDM that refers to serializing instructions.
- Include another LFENCE call after loop reading pseudo-locked memory.

The above addresses all feedback received for V2. The one unanswered
question that remains following the review is why the memory reading
was done with asm: the reason I did so was to avoid any compiler
optimizations while constraining the code exactly to what needed to be
measured. By using the asm instruction I am able to use a single instruction
to read a cache line into a register. To me this seemed the most constrained
way to measure if a cache line is in the cache.

- Below is verbatim from V2 submission (except for diffstat below) -

This is the second attempt at fixing the lack of coordination between the
pseudo-locking measurement code and perf. Thank you very much for your
feedback on the first version. The entire solution, including the cover
letter, has been reworked based on your feedback, while submitted as a V2,
none of the patches from V1 remained.

Changes since V1:
- Use in-kernel interface to perf.
- Do not write directly to PMU registers.
- Do not introduce another PMU owner. perf maintains role as performing
  resource arbitration for PMU.
- User space is able to use perf and resctrl at the same time.
- event_base_rdpmc is accessed and used only within an interrupts
  disabled section.
- Internals of events are never accessed directly, inline function used.
- Due to "pinned" usage the scheduling of event may have failed.  Error
  state is checked in recommended way and have a credible error
  handling.
- use X86_CONFIG

This code is based on the x86/cache branch of tip.git

The success of Cache Pseudo-Locking, as measured by how many cache lines
from a physical memory region has been locked to cache, can be measured
via the use of hardware performance events. Specifically, the number of
cache hits and misses reading a memory region after it has been
pseudo-locked to cache. This measurement is triggered via the resctrl
debugfs interface.

The current solution accesses performance counters and their configuration
registers directly without coordination with other performance event users
(perf).
Two of the issues that exist with the current solution:
- By writing to the performance monitoring registers directly a new owner
  for these resources is introduced. The perf infrastructure already exist
  to perform resource arbitration and the in-kernel infrastructure should
  be used to do so.
- The current lack of coordination with perf will have consequences any time
  two users, for example perf and cache pseudo-locking, attempt to do any
  kind of measurement at the same time.

In this series the measurement of Cache Pseudo-Lock regions is moved to use
the in-kernel interface to perf. During the rework of the measurement
function the L2 and L3 cache measurements are separated to avoid the
additional code needed to decide on which measurement causing unrelated
cache hits and misses.

Your feedback on this work will be greatly appreciated.

Reinette

Reinette Chatre (6):
  perf/core: Add sanity check to deal with pinned event failure
  perf/core: Add helper to obtain performance counter index
  x86/intel_rdt: Remove local register variables
  x86/intel_rdt: Create required perf event attributes
  x86/intel_rdt: Use perf infrastructure for measurements
  x86/intel_rdt: Re-enable pseudo-lock measurements

 Documentation/x86/intel_rdt_ui.txt          |  22 +-
 arch/x86/kernel/cpu/intel_rdt_pseudo_lock.c | 365 +++++++++++---------
 include/linux/perf_event.h                  |  26 +-
 kernel/events/core.c                        |   6 +
 4 files changed, 255 insertions(+), 164 deletions(-)

-- 
2.17.0


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH V3 1/6] perf/core: Add sanity check to deal with pinned event failure
  2018-09-11 17:14 [PATCH V3 0/6] perf/core and x86/intel_rdt: Fix lack of coordination with perf Reinette Chatre
@ 2018-09-11 17:14 ` Reinette Chatre
  2018-09-11 17:14 ` [PATCH V3 2/6] perf/core: Add helper to obtain performance counter index Reinette Chatre
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 17+ messages in thread
From: Reinette Chatre @ 2018-09-11 17:14 UTC (permalink / raw)
  To: tglx, fenghua.yu, tony.luck, peterz, mingo, acme
  Cc: gavin.hindman, jithu.joseph, dave.hansen, hpa, x86, linux-kernel,
	Reinette Chatre

It is possible that a failure can occur during the scheduling of a
pinned event. The initial portion of perf_event_read_local() contains
the various error checks an event should pass before it can be
considered valid. Ensure that the potential scheduling failure
of a pinned event is checked for and have a credible error.

Link: http://lkml.kernel.org/r/20180807093615.GY2494@hirez.programming.kicks-ass.net
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
 kernel/events/core.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 2a62b96600ad..191c0c2e10de 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -3940,6 +3940,12 @@ int perf_event_read_local(struct perf_event *event, u64 *value,
 		goto out;
 	}
 
+	/* If this is a pinned event it must be running on this CPU */
+	if (event->attr.pinned && event->oncpu != smp_processor_id()) {
+		ret = -EBUSY;
+		goto out;
+	}
+
 	/*
 	 * If the event is currently on this CPU, its either a per-task event,
 	 * or local to this CPU. Furthermore it means its ACTIVE (otherwise
-- 
2.17.0


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH V3 2/6] perf/core: Add helper to obtain performance counter index
  2018-09-11 17:14 [PATCH V3 0/6] perf/core and x86/intel_rdt: Fix lack of coordination with perf Reinette Chatre
  2018-09-11 17:14 ` [PATCH V3 1/6] perf/core: Add sanity check to deal with pinned event failure Reinette Chatre
@ 2018-09-11 17:14 ` Reinette Chatre
  2018-09-17  8:23   ` Peter Zijlstra
  2018-09-11 17:14 ` [PATCH V3 3/6] x86/intel_rdt: Remove local register variables Reinette Chatre
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 17+ messages in thread
From: Reinette Chatre @ 2018-09-11 17:14 UTC (permalink / raw)
  To: tglx, fenghua.yu, tony.luck, peterz, mingo, acme
  Cc: gavin.hindman, jithu.joseph, dave.hansen, hpa, x86, linux-kernel,
	Reinette Chatre

perf_event_read_local() is the safest way to obtain measurements
associated with performance events. In some cases the overhead
introduced by perf_event_read_local() affects the measurements and the
use of rdpmcl() is needed. rdpmcl() requires the index
of the performance counter used so a helper is introduced to determine
the index used by a provided performance event.

The index used by a performance event may change when interrupts are
enabled. A check is added to ensure that the index is only accessed
with interrupts disabled. Even with this check the use of this counter
needs to be done with care to ensure it is queried and used within the
same disabled interrupts section.

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
 include/linux/perf_event.h | 26 +++++++++++++++++++++++++-
 1 file changed, 25 insertions(+), 1 deletion(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 53c500f0ca79..c04dc666425c 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -1025,6 +1025,27 @@ static inline int in_software_context(struct perf_event *event)
 	return event->ctx->pmu->task_ctx_nr == perf_sw_context;
 }
 
+/**
+ * perf_rdpmc_index - Return PMC counter used for event
+ * @event: the perf_event to which the PMC counter was assigned
+ *
+ * The counter assigned to this performance event may change if interrupts
+ * are enabled. This counter should thus never be used while interrupts are
+ * enabled. Before this function is used to obtain the assigned counter the
+ * event could be checked for validity using, for example,
+ * perf_event_read_local(), within the same interrupt disabled section in
+ * which this counter is planned to be used.
+ *
+ * Return: The index of the performance monitoring counter assigned to
+ * @perf_event.
+ */
+static inline int perf_rdpmc_index(struct perf_event *event)
+{
+	lockdep_assert_irqs_disabled();
+
+	return event->hw.event_base_rdpmc;
+}
+
 extern struct static_key perf_swevent_enabled[PERF_COUNT_SW_MAX];
 
 extern void ___perf_sw_event(u32, u64, struct pt_regs *, u64);
@@ -1319,7 +1340,10 @@ static inline int perf_event_refresh(struct perf_event *event, int refresh)
 {
 	return -EINVAL;
 }
-
+static inline int perf_rdpmc_index(struct perf_event *event)
+{
+	return -1;
+}
 static inline void
 perf_sw_event(u32 event_id, u64 nr, struct pt_regs *regs, u64 addr)	{ }
 static inline void
-- 
2.17.0


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH V3 3/6] x86/intel_rdt: Remove local register variables
  2018-09-11 17:14 [PATCH V3 0/6] perf/core and x86/intel_rdt: Fix lack of coordination with perf Reinette Chatre
  2018-09-11 17:14 ` [PATCH V3 1/6] perf/core: Add sanity check to deal with pinned event failure Reinette Chatre
  2018-09-11 17:14 ` [PATCH V3 2/6] perf/core: Add helper to obtain performance counter index Reinette Chatre
@ 2018-09-11 17:14 ` Reinette Chatre
  2018-09-11 17:14 ` [PATCH V3 4/6] x86/intel_rdt: Create required perf event attributes Reinette Chatre
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 17+ messages in thread
From: Reinette Chatre @ 2018-09-11 17:14 UTC (permalink / raw)
  To: tglx, fenghua.yu, tony.luck, peterz, mingo, acme
  Cc: gavin.hindman, jithu.joseph, dave.hansen, hpa, x86, linux-kernel,
	Reinette Chatre

Local register variables were used in an effort to improve the
accuracy of the measurement of cache residency of a pseudo-locked
region. This was done to ensure that only the cache residency of
the memory is measured and not the cache residency of the variables
used to perform the measurement.

While local register variables do accomplish the goal they do require
significant care since different architectures have different registers
available. Local register variables also cannot be used with valuable
developer tools like KASAN.

Significant testing has shown that similar accuracy in measurement
results can be obtained by replacing local register variables with
regular local variables.

Make use of local variables in the critical code but do so with
READ_ONCE() to prevent the compiler from merging or refetching reads.
Ensure these variables are initialized before the measurement starts,
and ensure it is only the local variables that are accessed during
the measurement.

With the removal of the local register variables and using READ_ONCE()
there is no longer a motivation for using a direct wrmsr call (that
avoids the additional tracing code that may clobber the local register
variables).

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
 arch/x86/kernel/cpu/intel_rdt_pseudo_lock.c | 53 ++++-----------------
 1 file changed, 9 insertions(+), 44 deletions(-)

diff --git a/arch/x86/kernel/cpu/intel_rdt_pseudo_lock.c b/arch/x86/kernel/cpu/intel_rdt_pseudo_lock.c
index 40f3903ae5d9..8ad83eb3fc89 100644
--- a/arch/x86/kernel/cpu/intel_rdt_pseudo_lock.c
+++ b/arch/x86/kernel/cpu/intel_rdt_pseudo_lock.c
@@ -886,31 +886,14 @@ static int measure_cycles_lat_fn(void *_plr)
 	struct pseudo_lock_region *plr = _plr;
 	unsigned long i;
 	u64 start, end;
-#ifdef CONFIG_KASAN
-	/*
-	 * The registers used for local register variables are also used
-	 * when KASAN is active. When KASAN is active we use a regular
-	 * variable to ensure we always use a valid pointer to access memory.
-	 * The cost is that accessing this pointer, which could be in
-	 * cache, will be included in the measurement of memory read latency.
-	 */
 	void *mem_r;
-#else
-#ifdef CONFIG_X86_64
-	register void *mem_r asm("rbx");
-#else
-	register void *mem_r asm("ebx");
-#endif /* CONFIG_X86_64 */
-#endif /* CONFIG_KASAN */
 
 	local_irq_disable();
 	/*
-	 * The wrmsr call may be reordered with the assignment below it.
-	 * Call wrmsr as directly as possible to avoid tracing clobbering
-	 * local register variable used for memory pointer.
+	 * Disable hardware prefetchers.
 	 */
-	__wrmsr(MSR_MISC_FEATURE_CONTROL, prefetch_disable_bits, 0x0);
-	mem_r = plr->kmem;
+	wrmsr(MSR_MISC_FEATURE_CONTROL, prefetch_disable_bits, 0x0);
+	mem_r = READ_ONCE(plr->kmem);
 	/*
 	 * Dummy execute of the time measurement to load the needed
 	 * instructions into the L1 instruction cache.
@@ -939,26 +922,10 @@ static int measure_cycles_perf_fn(void *_plr)
 	struct pseudo_lock_region *plr = _plr;
 	unsigned long long l2_hits, l2_miss;
 	u64 l2_hit_bits, l2_miss_bits;
-	unsigned long i;
-#ifdef CONFIG_KASAN
-	/*
-	 * The registers used for local register variables are also used
-	 * when KASAN is active. When KASAN is active we use regular variables
-	 * at the cost of including cache access latency to these variables
-	 * in the measurements.
-	 */
 	unsigned int line_size;
 	unsigned int size;
+	unsigned long i;
 	void *mem_r;
-#else
-	register unsigned int line_size asm("esi");
-	register unsigned int size asm("edi");
-#ifdef CONFIG_X86_64
-	register void *mem_r asm("rbx");
-#else
-	register void *mem_r asm("ebx");
-#endif /* CONFIG_X86_64 */
-#endif /* CONFIG_KASAN */
 
 	/*
 	 * Non-architectural event for the Goldmont Microarchitecture
@@ -1011,11 +978,9 @@ static int measure_cycles_perf_fn(void *_plr)
 
 	local_irq_disable();
 	/*
-	 * Call wrmsr direcly to avoid the local register variables from
-	 * being overwritten due to reordering of their assignment with
-	 * the wrmsr calls.
+	 * Disable hardware prefetchers.
 	 */
-	__wrmsr(MSR_MISC_FEATURE_CONTROL, prefetch_disable_bits, 0x0);
+	wrmsr(MSR_MISC_FEATURE_CONTROL, prefetch_disable_bits, 0x0);
 	/* Disable events and reset counters */
 	pseudo_wrmsrl_notrace(MSR_ARCH_PERFMON_EVENTSEL0, 0x0);
 	pseudo_wrmsrl_notrace(MSR_ARCH_PERFMON_EVENTSEL0 + 1, 0x0);
@@ -1028,6 +993,9 @@ static int measure_cycles_perf_fn(void *_plr)
 		pseudo_wrmsrl_notrace(MSR_ARCH_PERFMON_PERFCTR0 + 3, 0x0);
 	}
 	/* Set and enable the L2 counters */
+	mem_r = READ_ONCE(plr->kmem);
+	size = READ_ONCE(plr->size);
+	line_size = READ_ONCE(plr->line_size);
 	pseudo_wrmsrl_notrace(MSR_ARCH_PERFMON_EVENTSEL0, l2_hit_bits);
 	pseudo_wrmsrl_notrace(MSR_ARCH_PERFMON_EVENTSEL0 + 1, l2_miss_bits);
 	if (l3_hit_bits > 0) {
@@ -1036,9 +1004,6 @@ static int measure_cycles_perf_fn(void *_plr)
 		pseudo_wrmsrl_notrace(MSR_ARCH_PERFMON_EVENTSEL0 + 3,
 				      l3_miss_bits);
 	}
-	mem_r = plr->kmem;
-	size = plr->size;
-	line_size = plr->line_size;
 	for (i = 0; i < size; i += line_size) {
 		asm volatile("mov (%0,%1,1), %%eax\n\t"
 			     :
-- 
2.17.0


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH V3 4/6] x86/intel_rdt: Create required perf event attributes
  2018-09-11 17:14 [PATCH V3 0/6] perf/core and x86/intel_rdt: Fix lack of coordination with perf Reinette Chatre
                   ` (2 preceding siblings ...)
  2018-09-11 17:14 ` [PATCH V3 3/6] x86/intel_rdt: Remove local register variables Reinette Chatre
@ 2018-09-11 17:14 ` Reinette Chatre
  2018-09-12  5:49   ` kbuild test robot
                     ` (2 more replies)
  2018-09-11 17:14 ` [PATCH V3 5/6] x86/intel_rdt: Use perf infrastructure for measurements Reinette Chatre
  2018-09-11 17:14 ` [PATCH V3 6/6] x86/intel_rdt: Re-enable pseudo-lock measurements Reinette Chatre
  5 siblings, 3 replies; 17+ messages in thread
From: Reinette Chatre @ 2018-09-11 17:14 UTC (permalink / raw)
  To: tglx, fenghua.yu, tony.luck, peterz, mingo, acme
  Cc: gavin.hindman, jithu.joseph, dave.hansen, hpa, x86, linux-kernel,
	Reinette Chatre

A perf event has many attributes that are maintained in a separate
structure that should be provided when a new perf_event is created.

In preparation for the transition to perf_events the required attribute
structures are created for all the events that may be used in the
measurements. Most attributes for all the events are identical. The
actual configuration, what specifies what needs to be measured, is what
will be different between the events used. This configuration needs to
be done with X86_CONFIG that cannot be used as part of the designated
initializers used here, this will be introduced later.

Although they do look identical at this time the attribute structures
needs to be maintained separately since a perf_event will maintain a
pointer to its unique attributes.

In support of patch testing the new structs are given the unused attribute
until their use in later patches.

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
 arch/x86/kernel/cpu/intel_rdt_pseudo_lock.c | 25 +++++++++++++++++++++
 1 file changed, 25 insertions(+)

diff --git a/arch/x86/kernel/cpu/intel_rdt_pseudo_lock.c b/arch/x86/kernel/cpu/intel_rdt_pseudo_lock.c
index 8ad83eb3fc89..2191623f1f27 100644
--- a/arch/x86/kernel/cpu/intel_rdt_pseudo_lock.c
+++ b/arch/x86/kernel/cpu/intel_rdt_pseudo_lock.c
@@ -915,6 +915,31 @@ static int measure_cycles_lat_fn(void *_plr)
 	return 0;
 }
 
+/*
+ * Create a perf_event_attr for the hit and miss perf events that will
+ * be used during the performance measurement. A perf_event maintains
+ * a pointer to its perf_event_attr so a unique attribute structure is
+ * created for each perf_event.
+ *
+ * The actual configuration of the event is set right before use in order
+ * to use the X86_CONFIG macro.
+ */
+static struct perf_event_attr __attribute__((unused)) perf_miss_attr = {
+	.type		= PERF_TYPE_RAW,
+	.size		= sizeof(struct perf_event_attr),
+	.pinned		= 1,
+	.disabled	= 0,
+	.exclude_user	= 1,
+};
+
+static struct perf_event_attr __attribute__((unused)) perf_hit_attr = {
+	.type		= PERF_TYPE_RAW,
+	.size		= sizeof(struct perf_event_attr),
+	.pinned		= 1,
+	.disabled	= 0,
+	.exclude_user	= 1,
+};
+
 static int measure_cycles_perf_fn(void *_plr)
 {
 	unsigned long long l3_hits = 0, l3_miss = 0;
-- 
2.17.0


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH V3 5/6] x86/intel_rdt: Use perf infrastructure for measurements
  2018-09-11 17:14 [PATCH V3 0/6] perf/core and x86/intel_rdt: Fix lack of coordination with perf Reinette Chatre
                   ` (3 preceding siblings ...)
  2018-09-11 17:14 ` [PATCH V3 4/6] x86/intel_rdt: Create required perf event attributes Reinette Chatre
@ 2018-09-11 17:14 ` Reinette Chatre
  2018-09-17  8:58   ` Peter Zijlstra
  2018-09-11 17:14 ` [PATCH V3 6/6] x86/intel_rdt: Re-enable pseudo-lock measurements Reinette Chatre
  5 siblings, 1 reply; 17+ messages in thread
From: Reinette Chatre @ 2018-09-11 17:14 UTC (permalink / raw)
  To: tglx, fenghua.yu, tony.luck, peterz, mingo, acme
  Cc: gavin.hindman, jithu.joseph, dave.hansen, hpa, x86, linux-kernel,
	Reinette Chatre

The success of a cache pseudo-locked region is measured using
performance monitoring events that are programmed directly at the time
the user requests a measurement.

Modifying the performance event registers directly is not appropriate
since it circumvents the in-kernel perf infrastructure that exists to
manage these resources and provide resource arbitration to the
performance monitoring hardware.

The cache pseudo-locking measurements are modified to use the in-kernel
perf infrastructure. Performance events are created and validated with
the appropriate perf API. The performance counters are still read as
directly as possible to avoid the additional cache hits. This is
done safely by first ensuring with the perf API that the counters have
been programmed correctly and only accessing the counters in an
interrupt disabled section where they are not able to be moved.

As part of the transition to the in-kernel perf infrastructure the L2
and L3 measurements are split into two separate measurements that can
be triggered independently. This separation prevents additional cache
misses incurred during the extra testing code used to decide if a
L2 and/or L3 measurement should be made.

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
 Documentation/x86/intel_rdt_ui.txt          |  22 +-
 arch/x86/kernel/cpu/intel_rdt_pseudo_lock.c | 295 ++++++++++++--------
 2 files changed, 194 insertions(+), 123 deletions(-)

diff --git a/Documentation/x86/intel_rdt_ui.txt b/Documentation/x86/intel_rdt_ui.txt
index f662d3c530e5..52b10945ff75 100644
--- a/Documentation/x86/intel_rdt_ui.txt
+++ b/Documentation/x86/intel_rdt_ui.txt
@@ -520,18 +520,24 @@ the pseudo-locked region:
 2) Cache hit and miss measurements using model specific precision counters if
    available. Depending on the levels of cache on the system the pseudo_lock_l2
    and pseudo_lock_l3 tracepoints are available.
-   WARNING: triggering this  measurement uses from two (for just L2
-   measurements) to four (for L2 and L3 measurements) precision counters on
-   the system, if any other measurements are in progress the counters and
-   their corresponding event registers will be clobbered.
 
 When a pseudo-locked region is created a new debugfs directory is created for
 it in debugfs as /sys/kernel/debug/resctrl/<newdir>. A single
 write-only file, pseudo_lock_measure, is present in this directory. The
-measurement on the pseudo-locked region depends on the number, 1 or 2,
-written to this debugfs file. Since the measurements are recorded with the
-tracing infrastructure the relevant tracepoints need to be enabled before the
-measurement is triggered.
+measurement of the pseudo-locked region depends on the number written to this
+debugfs file:
+1 -  writing "1" to the pseudo_lock_measure file will trigger the latency
+     measurement captured in the pseudo_lock_mem_latency tracepoint. See
+     example below.
+2 -  writing "2" to the pseudo_lock_measure file will trigger the L2 cache
+     residency (cache hits and misses) measurement captured in the
+     pseudo_lock_l2 tracepoint. See example below.
+3 -  writing "3" to the pseudo_lock_measure file will trigger the L3 cache
+     residency (cache hits and misses) measurement captured in the
+     pseudo_lock_l3 tracepoint.
+
+All measurements are recorded with the tracing infrastructure. This requires
+the relevant tracepoints to be enabled before the measurement is triggered.
 
 Example of latency debugging interface:
 In this example a pseudo-locked region named "newlock" was created. Here is
diff --git a/arch/x86/kernel/cpu/intel_rdt_pseudo_lock.c b/arch/x86/kernel/cpu/intel_rdt_pseudo_lock.c
index 2191623f1f27..5a294fe080d7 100644
--- a/arch/x86/kernel/cpu/intel_rdt_pseudo_lock.c
+++ b/arch/x86/kernel/cpu/intel_rdt_pseudo_lock.c
@@ -26,6 +26,7 @@
 #include <asm/intel_rdt_sched.h>
 #include <asm/perf_event.h>
 
+#include "../../events/perf_event.h" /* For X86_CONFIG() */
 #include "intel_rdt.h"
 
 #define CREATE_TRACE_POINTS
@@ -106,16 +107,6 @@ static u64 get_prefetch_disable_bits(void)
 	return 0;
 }
 
-/*
- * Helper to write 64bit value to MSR without tracing. Used when
- * use of the cache should be restricted and use of registers used
- * for local variables avoided.
- */
-static inline void pseudo_wrmsrl_notrace(unsigned int msr, u64 val)
-{
-	__wrmsr(msr, (u32)(val & 0xffffffffULL), (u32)(val >> 32));
-}
-
 /**
  * pseudo_lock_minor_get - Obtain available minor number
  * @minor: Pointer to where new minor number will be stored
@@ -924,7 +915,7 @@ static int measure_cycles_lat_fn(void *_plr)
  * The actual configuration of the event is set right before use in order
  * to use the X86_CONFIG macro.
  */
-static struct perf_event_attr __attribute__((unused)) perf_miss_attr = {
+static struct perf_event_attr perf_miss_attr = {
 	.type		= PERF_TYPE_RAW,
 	.size		= sizeof(struct perf_event_attr),
 	.pinned		= 1,
@@ -932,7 +923,7 @@ static struct perf_event_attr __attribute__((unused)) perf_miss_attr = {
 	.exclude_user	= 1,
 };
 
-static struct perf_event_attr __attribute__((unused)) perf_hit_attr = {
+static struct perf_event_attr perf_hit_attr = {
 	.type		= PERF_TYPE_RAW,
 	.size		= sizeof(struct perf_event_attr),
 	.pinned		= 1,
@@ -940,139 +931,206 @@ static struct perf_event_attr __attribute__((unused)) perf_hit_attr = {
 	.exclude_user	= 1,
 };
 
-static int measure_cycles_perf_fn(void *_plr)
+struct residency_counts {
+	u64 miss_before, hits_before;
+	u64 miss_after,  hits_after;
+};
+
+static int measure_residency_fn(struct perf_event_attr *miss_attr,
+				struct perf_event_attr *hit_attr,
+				struct pseudo_lock_region *plr,
+				struct residency_counts *counts)
 {
-	unsigned long long l3_hits = 0, l3_miss = 0;
-	u64 l3_hit_bits = 0, l3_miss_bits = 0;
-	struct pseudo_lock_region *plr = _plr;
-	unsigned long long l2_hits, l2_miss;
-	u64 l2_hit_bits, l2_miss_bits;
+	u64 hits_before = 0, hits_after = 0, miss_before = 0, miss_after = 0;
+	struct perf_event *miss_event, *hit_event;
+	int hit_pmcnum, miss_pmcnum;
 	unsigned int line_size;
 	unsigned int size;
 	unsigned long i;
 	void *mem_r;
+	u64 tmp;
 
-	/*
-	 * Non-architectural event for the Goldmont Microarchitecture
-	 * from Intel x86 Architecture Software Developer Manual (SDM):
-	 * MEM_LOAD_UOPS_RETIRED D1H (event number)
-	 * Umask values:
-	 *     L1_HIT   01H
-	 *     L2_HIT   02H
-	 *     L1_MISS  08H
-	 *     L2_MISS  10H
-	 *
-	 * On Broadwell Microarchitecture the MEM_LOAD_UOPS_RETIRED event
-	 * has two "no fix" errata associated with it: BDM35 and BDM100. On
-	 * this platform we use the following events instead:
-	 *  L2_RQSTS 24H (Documented in https://download.01.org/perfmon/BDW/)
-	 *       REFERENCES FFH
-	 *       MISS       3FH
-	 *  LONGEST_LAT_CACHE 2EH (Documented in SDM)
-	 *       REFERENCE 4FH
-	 *       MISS      41H
-	 */
+	miss_event = perf_event_create_kernel_counter(miss_attr, plr->cpu,
+						      NULL, NULL, NULL);
+	if (IS_ERR(miss_event))
+		goto out;
+
+	hit_event = perf_event_create_kernel_counter(hit_attr, plr->cpu,
+						     NULL, NULL, NULL);
+	if (IS_ERR(hit_event))
+		goto out_miss;
 
+	local_irq_disable();
 	/*
-	 * Start by setting flags for IA32_PERFEVTSELx:
-	 *     OS  (Operating system mode)  0x2
-	 *     INT (APIC interrupt enable)  0x10
-	 *     EN  (Enable counter)         0x40
-	 *
-	 * Then add the Umask value and event number to select performance
-	 * event.
+	 * Check any possible error state of events used by performing
+	 * one local read.
 	 */
-
-	switch (boot_cpu_data.x86_model) {
-	case INTEL_FAM6_ATOM_GOLDMONT:
-	case INTEL_FAM6_ATOM_GEMINI_LAKE:
-		l2_hit_bits = (0x52ULL << 16) | (0x2 << 8) | 0xd1;
-		l2_miss_bits = (0x52ULL << 16) | (0x10 << 8) | 0xd1;
-		break;
-	case INTEL_FAM6_BROADWELL_X:
-		/* On BDW the l2_hit_bits count references, not hits */
-		l2_hit_bits = (0x52ULL << 16) | (0xff << 8) | 0x24;
-		l2_miss_bits = (0x52ULL << 16) | (0x3f << 8) | 0x24;
-		/* On BDW the l3_hit_bits count references, not hits */
-		l3_hit_bits = (0x52ULL << 16) | (0x4f << 8) | 0x2e;
-		l3_miss_bits = (0x52ULL << 16) | (0x41 << 8) | 0x2e;
-		break;
-	default:
-		goto out;
+	if (perf_event_read_local(miss_event, &tmp, NULL, NULL)) {
+		local_irq_enable();
+		goto out_hit;
+	}
+	if (perf_event_read_local(hit_event, &tmp, NULL, NULL)) {
+		local_irq_enable();
+		goto out_hit;
 	}
 
-	local_irq_disable();
 	/*
 	 * Disable hardware prefetchers.
 	 */
 	wrmsr(MSR_MISC_FEATURE_CONTROL, prefetch_disable_bits, 0x0);
-	/* Disable events and reset counters */
-	pseudo_wrmsrl_notrace(MSR_ARCH_PERFMON_EVENTSEL0, 0x0);
-	pseudo_wrmsrl_notrace(MSR_ARCH_PERFMON_EVENTSEL0 + 1, 0x0);
-	pseudo_wrmsrl_notrace(MSR_ARCH_PERFMON_PERFCTR0, 0x0);
-	pseudo_wrmsrl_notrace(MSR_ARCH_PERFMON_PERFCTR0 + 1, 0x0);
-	if (l3_hit_bits > 0) {
-		pseudo_wrmsrl_notrace(MSR_ARCH_PERFMON_EVENTSEL0 + 2, 0x0);
-		pseudo_wrmsrl_notrace(MSR_ARCH_PERFMON_EVENTSEL0 + 3, 0x0);
-		pseudo_wrmsrl_notrace(MSR_ARCH_PERFMON_PERFCTR0 + 2, 0x0);
-		pseudo_wrmsrl_notrace(MSR_ARCH_PERFMON_PERFCTR0 + 3, 0x0);
-	}
-	/* Set and enable the L2 counters */
+
+	/* Initialize rest of local variables */
+	/*
+	 * Performance event has been validated right before this with
+	 * interrupts disabled - it is thus safe to read the counter index.
+	 */
+	miss_pmcnum = perf_rdpmc_index(miss_event);
+	hit_pmcnum = perf_rdpmc_index(hit_event);
+	line_size = READ_ONCE(plr->line_size);
 	mem_r = READ_ONCE(plr->kmem);
 	size = READ_ONCE(plr->size);
-	line_size = READ_ONCE(plr->line_size);
-	pseudo_wrmsrl_notrace(MSR_ARCH_PERFMON_EVENTSEL0, l2_hit_bits);
-	pseudo_wrmsrl_notrace(MSR_ARCH_PERFMON_EVENTSEL0 + 1, l2_miss_bits);
-	if (l3_hit_bits > 0) {
-		pseudo_wrmsrl_notrace(MSR_ARCH_PERFMON_EVENTSEL0 + 2,
-				      l3_hit_bits);
-		pseudo_wrmsrl_notrace(MSR_ARCH_PERFMON_EVENTSEL0 + 3,
-				      l3_miss_bits);
-	}
+
+	/*
+	 * Read counter variables twice - first to load the instructions
+	 * used in L1 cache, second to capture accurate value that does not
+	 * include cache misses incurred because of instruction loads.
+	 */
+	rdpmcl(hit_pmcnum, hits_before);
+	rdpmcl(miss_pmcnum, miss_before);
+	/*
+	 * From SDM: Performing back-to-back fast reads are not guaranteed
+	 * to be monotonic.
+	 * Use LFENCE to ensure all previous instructions are retired
+	 * before proceeding.
+	 */
+	rmb();
+	rdpmcl(hit_pmcnum, hits_before);
+	rdpmcl(miss_pmcnum, miss_before);
+	/*
+	 * Use LFENCE to ensure all previous instructions are retired
+	 * before proceeding.
+	 */
+	rmb();
 	for (i = 0; i < size; i += line_size) {
+		/*
+		 * Add a barrier to prevent speculative execution of this
+		 * loop reading beyond the end of the buffer.
+		 */
+		rmb();
 		asm volatile("mov (%0,%1,1), %%eax\n\t"
 			     :
 			     : "r" (mem_r), "r" (i)
 			     : "%eax", "memory");
 	}
 	/*
-	 * Call wrmsr directly (no tracing) to not influence
-	 * the cache access counters as they are disabled.
+	 * Use LFENCE to ensure all previous instructions are retired
+	 * before proceeding.
 	 */
-	pseudo_wrmsrl_notrace(MSR_ARCH_PERFMON_EVENTSEL0,
-			      l2_hit_bits & ~(0x40ULL << 16));
-	pseudo_wrmsrl_notrace(MSR_ARCH_PERFMON_EVENTSEL0 + 1,
-			      l2_miss_bits & ~(0x40ULL << 16));
-	if (l3_hit_bits > 0) {
-		pseudo_wrmsrl_notrace(MSR_ARCH_PERFMON_EVENTSEL0 + 2,
-				      l3_hit_bits & ~(0x40ULL << 16));
-		pseudo_wrmsrl_notrace(MSR_ARCH_PERFMON_EVENTSEL0 + 3,
-				      l3_miss_bits & ~(0x40ULL << 16));
-	}
-	l2_hits = native_read_pmc(0);
-	l2_miss = native_read_pmc(1);
-	if (l3_hit_bits > 0) {
-		l3_hits = native_read_pmc(2);
-		l3_miss = native_read_pmc(3);
-	}
+	rmb();
+	rdpmcl(hit_pmcnum, hits_after);
+	rdpmcl(miss_pmcnum, miss_after);
+	/*
+	 * Use LFENCE to ensure all previous instructions are retired
+	 * before proceeding.
+	 */
+	rmb();
+	/* Re-enable hardware prefetchers */
 	wrmsr(MSR_MISC_FEATURE_CONTROL, 0x0, 0x0);
 	local_irq_enable();
+out_hit:
+	perf_event_release_kernel(hit_event);
+out_miss:
+	perf_event_release_kernel(miss_event);
+out:
+	counts->miss_before = miss_before;
+	counts->hits_before = hits_before;
+	counts->miss_after  = miss_after;
+	counts->hits_after  = hits_after;
+	return 0;
+}
+
+static int measure_l2_residency(void *_plr)
+{
+	struct pseudo_lock_region *plr = _plr;
+	struct residency_counts counts = {0};
+
 	/*
-	 * On BDW we count references and misses, need to adjust. Sometimes
-	 * the "hits" counter is a bit more than the references, for
-	 * example, x references but x + 1 hits. To not report invalid
-	 * hit values in this case we treat that as misses eaqual to
-	 * references.
+	 * Non-architectural event for the Goldmont Microarchitecture
+	 * from Intel x86 Architecture Software Developer Manual (SDM):
+	 * MEM_LOAD_UOPS_RETIRED D1H (event number)
+	 * Umask values:
+	 *     L2_HIT   02H
+	 *     L2_MISS  10H
 	 */
-	if (boot_cpu_data.x86_model == INTEL_FAM6_BROADWELL_X)
-		l2_hits -= (l2_miss > l2_hits ? l2_hits : l2_miss);
-	trace_pseudo_lock_l2(l2_hits, l2_miss);
-	if (l3_hit_bits > 0) {
-		if (boot_cpu_data.x86_model == INTEL_FAM6_BROADWELL_X)
-			l3_hits -= (l3_miss > l3_hits ? l3_hits : l3_miss);
-		trace_pseudo_lock_l3(l3_hits, l3_miss);
+	switch (boot_cpu_data.x86_model) {
+	case INTEL_FAM6_ATOM_GOLDMONT:
+	case INTEL_FAM6_ATOM_GEMINI_LAKE:
+		perf_miss_attr.config = X86_CONFIG(.event = 0xd1,
+						   .umask = 0x10);
+		perf_hit_attr.config = X86_CONFIG(.event = 0xd1,
+						  .umask = 0x2);
+		break;
+	default:
+		goto out;
 	}
 
+	measure_residency_fn(&perf_miss_attr, &perf_hit_attr, plr, &counts);
+
+	trace_pseudo_lock_l2(counts.hits_after - counts.hits_before,
+			     counts.miss_after - counts.miss_before);
+out:
+	plr->thread_done = 1;
+	wake_up_interruptible(&plr->lock_thread_wq);
+	return 0;
+}
+
+static int measure_l3_residency(void *_plr)
+{
+	struct pseudo_lock_region *plr = _plr;
+	struct residency_counts counts = {0};
+
+	/*
+	 * On Broadwell Microarchitecture the MEM_LOAD_UOPS_RETIRED event
+	 * has two "no fix" errata associated with it: BDM35 and BDM100. On
+	 * this platform the following events are used instead:
+	 * LONGEST_LAT_CACHE 2EH (Documented in SDM)
+	 *       REFERENCE 4FH
+	 *       MISS      41H
+	 */
+
+	switch (boot_cpu_data.x86_model) {
+	case INTEL_FAM6_BROADWELL_X:
+		/* On BDW the hit event counts references, not hits */
+		perf_hit_attr.config = X86_CONFIG(.event = 0x2e,
+						  .umask = 0x4f);
+		perf_miss_attr.config = X86_CONFIG(.event = 0x2e,
+						   .umask = 0x41);
+		break;
+	default:
+		goto out;
+	}
+
+	measure_residency_fn(&perf_miss_attr, &perf_hit_attr, plr, &counts);
+
+	counts.miss_after -= counts.miss_before;
+	if (boot_cpu_data.x86_model == INTEL_FAM6_BROADWELL_X) {
+		/*
+		 * On BDW references and misses are counted, need to adjust.
+		 * Sometimes the "hits" counter is a bit more than the
+		 * references, for example, x references but x + 1 hits.
+		 * To not report invalid hit values in this case we treat
+		 * that as misses equal to references.
+		 */
+		/* First compute the number of cache references measured */
+		counts.hits_after -= counts.hits_before;
+		/* Next convert references to cache hits */
+		counts.hits_after -= counts.miss_after > counts.hits_after ?
+					counts.hits_after : counts.miss_after;
+	} else {
+		counts.hits_after -= counts.hits_before;
+	}
+
+	trace_pseudo_lock_l3(counts.hits_after, counts.miss_after);
 out:
 	plr->thread_done = 1;
 	wake_up_interruptible(&plr->lock_thread_wq);
@@ -1111,13 +1169,20 @@ static int pseudo_lock_measure_cycles(struct rdtgroup *rdtgrp, int sel)
 		goto out;
 	}
 
+	plr->cpu = cpu;
+
 	if (sel == 1)
 		thread = kthread_create_on_node(measure_cycles_lat_fn, plr,
 						cpu_to_node(cpu),
 						"pseudo_lock_measure/%u",
 						cpu);
 	else if (sel == 2)
-		thread = kthread_create_on_node(measure_cycles_perf_fn, plr,
+		thread = kthread_create_on_node(measure_l2_residency, plr,
+						cpu_to_node(cpu),
+						"pseudo_lock_measure/%u",
+						cpu);
+	else if (sel == 3)
+		thread = kthread_create_on_node(measure_l3_residency, plr,
 						cpu_to_node(cpu),
 						"pseudo_lock_measure/%u",
 						cpu);
-- 
2.17.0


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH V3 6/6] x86/intel_rdt: Re-enable pseudo-lock measurements
  2018-09-11 17:14 [PATCH V3 0/6] perf/core and x86/intel_rdt: Fix lack of coordination with perf Reinette Chatre
                   ` (4 preceding siblings ...)
  2018-09-11 17:14 ` [PATCH V3 5/6] x86/intel_rdt: Use perf infrastructure for measurements Reinette Chatre
@ 2018-09-11 17:14 ` Reinette Chatre
  5 siblings, 0 replies; 17+ messages in thread
From: Reinette Chatre @ 2018-09-11 17:14 UTC (permalink / raw)
  To: tglx, fenghua.yu, tony.luck, peterz, mingo, acme
  Cc: gavin.hindman, jithu.joseph, dave.hansen, hpa, x86, linux-kernel,
	Reinette Chatre

Commit 4a7a54a55e ("x86/intel_rdt: Disable PMU access") disabled
measurements of pseudo-locked regions because of incorrect usage
of the performance monitoring hardware.

Cache pseudo-locking measurements are now done correctly with the
in-kernel perf API and its use can be re-enabled at this time.

The adjustment to the in-kernel perf API also separated the L2 and L3
measurements that can be triggered separately from user space. The
re-enabling of the measurements is thus not a simple revert of the
original disable in order to accommodate the additional parameter
possible.

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
 arch/x86/kernel/cpu/intel_rdt_pseudo_lock.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/intel_rdt_pseudo_lock.c b/arch/x86/kernel/cpu/intel_rdt_pseudo_lock.c
index 5a294fe080d7..e2bd212f382f 100644
--- a/arch/x86/kernel/cpu/intel_rdt_pseudo_lock.c
+++ b/arch/x86/kernel/cpu/intel_rdt_pseudo_lock.c
@@ -1226,7 +1226,7 @@ static ssize_t pseudo_lock_measure_trigger(struct file *file,
 	buf[buf_size] = '\0';
 	ret = kstrtoint(buf, 10, &sel);
 	if (ret == 0) {
-		if (sel != 1)
+		if (sel != 1 && sel != 2 && sel != 3)
 			return -EINVAL;
 		ret = debugfs_file_get(file->f_path.dentry);
 		if (ret)
-- 
2.17.0


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH V3 4/6] x86/intel_rdt: Create required perf event attributes
  2018-09-11 17:14 ` [PATCH V3 4/6] x86/intel_rdt: Create required perf event attributes Reinette Chatre
@ 2018-09-12  5:49   ` kbuild test robot
  2018-09-12  7:20   ` kbuild test robot
  2018-09-12 19:54   ` [PATCH V4 " Reinette Chatre
  2 siblings, 0 replies; 17+ messages in thread
From: kbuild test robot @ 2018-09-12  5:49 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: kbuild-all, tglx, fenghua.yu, tony.luck, peterz, mingo, acme,
	gavin.hindman, jithu.joseph, dave.hansen, hpa, x86, linux-kernel,
	Reinette Chatre

[-- Attachment #1: Type: text/plain, Size: 7746 bytes --]

Hi Reinette,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on tip/x86/core]
[also build test ERROR on v4.19-rc3 next-20180911]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Reinette-Chatre/perf-core-and-x86-intel_rdt-Fix-lack-of-coordination-with-perf/20180912-101526
config: i386-randconfig-x001-201836 (attached as .config)
compiler: gcc-7 (Debian 7.3.0-1) 7.3.0
reproduce:
        # save the attached .config to linux build tree
        make ARCH=i386 

Note: the linux-review/Reinette-Chatre/perf-core-and-x86-intel_rdt-Fix-lack-of-coordination-with-perf/20180912-101526 HEAD b684b8727deb9e3cf635badb292b3314904d17b2 builds fine.
      It only hurts bisectibility.

All error/warnings (new ones prefixed by >>):

>> arch/x86//kernel/cpu/intel_rdt_pseudo_lock.c:927:15: error: variable 'perf_miss_attr' has initializer but incomplete type
    static struct perf_event_attr __attribute__((unused)) perf_miss_attr = {
                  ^~~~~~~~~~~~~~~
>> arch/x86//kernel/cpu/intel_rdt_pseudo_lock.c:928:3: error: 'struct perf_event_attr' has no member named 'type'
     .type  = PERF_TYPE_RAW,
      ^~~~
>> arch/x86//kernel/cpu/intel_rdt_pseudo_lock.c:928:11: error: 'PERF_TYPE_RAW' undeclared here (not in a function); did you mean 'PIDTYPE_MAX'?
     .type  = PERF_TYPE_RAW,
              ^~~~~~~~~~~~~
              PIDTYPE_MAX
>> arch/x86//kernel/cpu/intel_rdt_pseudo_lock.c:928:11: warning: excess elements in struct initializer
   arch/x86//kernel/cpu/intel_rdt_pseudo_lock.c:928:11: note: (near initialization for 'perf_miss_attr')
>> arch/x86//kernel/cpu/intel_rdt_pseudo_lock.c:929:3: error: 'struct perf_event_attr' has no member named 'size'
     .size  = sizeof(struct perf_event_attr),
      ^~~~
>> arch/x86//kernel/cpu/intel_rdt_pseudo_lock.c:929:18: error: invalid application of 'sizeof' to incomplete type 'struct perf_event_attr'
     .size  = sizeof(struct perf_event_attr),
                     ^~~~~~
   arch/x86//kernel/cpu/intel_rdt_pseudo_lock.c:929:11: warning: excess elements in struct initializer
     .size  = sizeof(struct perf_event_attr),
              ^~~~~~
   arch/x86//kernel/cpu/intel_rdt_pseudo_lock.c:929:11: note: (near initialization for 'perf_miss_attr')
>> arch/x86//kernel/cpu/intel_rdt_pseudo_lock.c:930:3: error: 'struct perf_event_attr' has no member named 'pinned'
     .pinned  = 1,
      ^~~~~~
   arch/x86//kernel/cpu/intel_rdt_pseudo_lock.c:930:13: warning: excess elements in struct initializer
     .pinned  = 1,
                ^
   arch/x86//kernel/cpu/intel_rdt_pseudo_lock.c:930:13: note: (near initialization for 'perf_miss_attr')
>> arch/x86//kernel/cpu/intel_rdt_pseudo_lock.c:931:3: error: 'struct perf_event_attr' has no member named 'disabled'
     .disabled = 0,
      ^~~~~~~~
   arch/x86//kernel/cpu/intel_rdt_pseudo_lock.c:931:14: warning: excess elements in struct initializer
     .disabled = 0,
                 ^
   arch/x86//kernel/cpu/intel_rdt_pseudo_lock.c:931:14: note: (near initialization for 'perf_miss_attr')
>> arch/x86//kernel/cpu/intel_rdt_pseudo_lock.c:932:3: error: 'struct perf_event_attr' has no member named 'exclude_user'
     .exclude_user = 1,
      ^~~~~~~~~~~~
   arch/x86//kernel/cpu/intel_rdt_pseudo_lock.c:932:18: warning: excess elements in struct initializer
     .exclude_user = 1,
                     ^
   arch/x86//kernel/cpu/intel_rdt_pseudo_lock.c:932:18: note: (near initialization for 'perf_miss_attr')
>> arch/x86//kernel/cpu/intel_rdt_pseudo_lock.c:935:15: error: variable 'perf_hit_attr' has initializer but incomplete type
    static struct perf_event_attr __attribute__((unused)) perf_hit_attr = {
                  ^~~~~~~~~~~~~~~
   arch/x86//kernel/cpu/intel_rdt_pseudo_lock.c:936:3: error: 'struct perf_event_attr' has no member named 'type'
     .type  = PERF_TYPE_RAW,
      ^~~~
   arch/x86//kernel/cpu/intel_rdt_pseudo_lock.c:936:11: warning: excess elements in struct initializer
     .type  = PERF_TYPE_RAW,
              ^~~~~~~~~~~~~
   arch/x86//kernel/cpu/intel_rdt_pseudo_lock.c:936:11: note: (near initialization for 'perf_hit_attr')
   arch/x86//kernel/cpu/intel_rdt_pseudo_lock.c:937:3: error: 'struct perf_event_attr' has no member named 'size'
     .size  = sizeof(struct perf_event_attr),
      ^~~~
   arch/x86//kernel/cpu/intel_rdt_pseudo_lock.c:937:18: error: invalid application of 'sizeof' to incomplete type 'struct perf_event_attr'
     .size  = sizeof(struct perf_event_attr),
                     ^~~~~~
   arch/x86//kernel/cpu/intel_rdt_pseudo_lock.c:937:11: warning: excess elements in struct initializer
     .size  = sizeof(struct perf_event_attr),
              ^~~~~~
   arch/x86//kernel/cpu/intel_rdt_pseudo_lock.c:937:11: note: (near initialization for 'perf_hit_attr')
   arch/x86//kernel/cpu/intel_rdt_pseudo_lock.c:938:3: error: 'struct perf_event_attr' has no member named 'pinned'
     .pinned  = 1,
      ^~~~~~
   arch/x86//kernel/cpu/intel_rdt_pseudo_lock.c:938:13: warning: excess elements in struct initializer
     .pinned  = 1,
                ^
   arch/x86//kernel/cpu/intel_rdt_pseudo_lock.c:938:13: note: (near initialization for 'perf_hit_attr')
   arch/x86//kernel/cpu/intel_rdt_pseudo_lock.c:939:3: error: 'struct perf_event_attr' has no member named 'disabled'
     .disabled = 0,
      ^~~~~~~~
   arch/x86//kernel/cpu/intel_rdt_pseudo_lock.c:939:14: warning: excess elements in struct initializer
     .disabled = 0,
                 ^
   arch/x86//kernel/cpu/intel_rdt_pseudo_lock.c:939:14: note: (near initialization for 'perf_hit_attr')
   arch/x86//kernel/cpu/intel_rdt_pseudo_lock.c:940:3: error: 'struct perf_event_attr' has no member named 'exclude_user'
     .exclude_user = 1,
      ^~~~~~~~~~~~
   arch/x86//kernel/cpu/intel_rdt_pseudo_lock.c:940:18: warning: excess elements in struct initializer
     .exclude_user = 1,
                     ^
   arch/x86//kernel/cpu/intel_rdt_pseudo_lock.c:940:18: note: (near initialization for 'perf_hit_attr')
>> arch/x86//kernel/cpu/intel_rdt_pseudo_lock.c:927:55: error: storage size of 'perf_miss_attr' isn't known
    static struct perf_event_attr __attribute__((unused)) perf_miss_attr = {
                                                          ^~~~~~~~~~~~~~
>> arch/x86//kernel/cpu/intel_rdt_pseudo_lock.c:935:55: error: storage size of 'perf_hit_attr' isn't known
    static struct perf_event_attr __attribute__((unused)) perf_hit_attr = {
                                                          ^~~~~~~~~~~~~

vim +/perf_miss_attr +927 arch/x86//kernel/cpu/intel_rdt_pseudo_lock.c

   917	
   918	/*
   919	 * Create a perf_event_attr for the hit and miss perf events that will
   920	 * be used during the performance measurement. A perf_event maintains
   921	 * a pointer to its perf_event_attr so a unique attribute structure is
   922	 * created for each perf_event.
   923	 *
   924	 * The actual configuration of the event is set right before use in order
   925	 * to use the X86_CONFIG macro.
   926	 */
 > 927	static struct perf_event_attr __attribute__((unused)) perf_miss_attr = {
 > 928		.type		= PERF_TYPE_RAW,
 > 929		.size		= sizeof(struct perf_event_attr),
 > 930		.pinned		= 1,
 > 931		.disabled	= 0,
 > 932		.exclude_user	= 1,
   933	};
   934	
 > 935	static struct perf_event_attr __attribute__((unused)) perf_hit_attr = {
   936		.type		= PERF_TYPE_RAW,
   937		.size		= sizeof(struct perf_event_attr),
   938		.pinned		= 1,
 > 939		.disabled	= 0,
 > 940		.exclude_user	= 1,
   941	};
   942	

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 25812 bytes --]

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH V3 4/6] x86/intel_rdt: Create required perf event attributes
  2018-09-11 17:14 ` [PATCH V3 4/6] x86/intel_rdt: Create required perf event attributes Reinette Chatre
  2018-09-12  5:49   ` kbuild test robot
@ 2018-09-12  7:20   ` kbuild test robot
  2018-09-12 19:54   ` [PATCH V4 " Reinette Chatre
  2 siblings, 0 replies; 17+ messages in thread
From: kbuild test robot @ 2018-09-12  7:20 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: kbuild-all, tglx, fenghua.yu, tony.luck, peterz, mingo, acme,
	gavin.hindman, jithu.joseph, dave.hansen, hpa, x86, linux-kernel,
	Reinette Chatre

[-- Attachment #1: Type: text/plain, Size: 6683 bytes --]

Hi Reinette,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on tip/x86/core]
[also build test ERROR on v4.19-rc3 next-20180911]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Reinette-Chatre/perf-core-and-x86-intel_rdt-Fix-lack-of-coordination-with-perf/20180912-101526
config: i386-randconfig-x0-09121359 (attached as .config)
compiler: gcc-5 (Debian 5.5.0-3) 5.4.1 20171010
reproduce:
        # save the attached .config to linux build tree
        make ARCH=i386 

Note: the linux-review/Reinette-Chatre/perf-core-and-x86-intel_rdt-Fix-lack-of-coordination-with-perf/20180912-101526 HEAD b684b8727deb9e3cf635badb292b3314904d17b2 builds fine.
      It only hurts bisectibility.

All errors (new ones prefixed by >>):

   arch/x86/kernel/cpu/intel_rdt_pseudo_lock.c:927:15: error: variable 'perf_miss_attr' has initializer but incomplete type
    static struct perf_event_attr __attribute__((unused)) perf_miss_attr = {
                  ^
>> arch/x86/kernel/cpu/intel_rdt_pseudo_lock.c:928:2: error: unknown field 'type' specified in initializer
     .type  = PERF_TYPE_RAW,
     ^
>> arch/x86/kernel/cpu/intel_rdt_pseudo_lock.c:928:11: error: 'PERF_TYPE_RAW' undeclared here (not in a function)
     .type  = PERF_TYPE_RAW,
              ^
   arch/x86/kernel/cpu/intel_rdt_pseudo_lock.c:928:11: warning: excess elements in struct initializer
   arch/x86/kernel/cpu/intel_rdt_pseudo_lock.c:928:11: note: (near initialization for 'perf_miss_attr')
>> arch/x86/kernel/cpu/intel_rdt_pseudo_lock.c:929:2: error: unknown field 'size' specified in initializer
     .size  = sizeof(struct perf_event_attr),
     ^
   arch/x86/kernel/cpu/intel_rdt_pseudo_lock.c:929:18: error: invalid application of 'sizeof' to incomplete type 'struct perf_event_attr'
     .size  = sizeof(struct perf_event_attr),
                     ^
   arch/x86/kernel/cpu/intel_rdt_pseudo_lock.c:929:11: warning: excess elements in struct initializer
     .size  = sizeof(struct perf_event_attr),
              ^
   arch/x86/kernel/cpu/intel_rdt_pseudo_lock.c:929:11: note: (near initialization for 'perf_miss_attr')
>> arch/x86/kernel/cpu/intel_rdt_pseudo_lock.c:930:2: error: unknown field 'pinned' specified in initializer
     .pinned  = 1,
     ^
   arch/x86/kernel/cpu/intel_rdt_pseudo_lock.c:930:13: warning: excess elements in struct initializer
     .pinned  = 1,
                ^
   arch/x86/kernel/cpu/intel_rdt_pseudo_lock.c:930:13: note: (near initialization for 'perf_miss_attr')
>> arch/x86/kernel/cpu/intel_rdt_pseudo_lock.c:931:2: error: unknown field 'disabled' specified in initializer
     .disabled = 0,
     ^
   arch/x86/kernel/cpu/intel_rdt_pseudo_lock.c:931:14: warning: excess elements in struct initializer
     .disabled = 0,
                 ^
   arch/x86/kernel/cpu/intel_rdt_pseudo_lock.c:931:14: note: (near initialization for 'perf_miss_attr')
>> arch/x86/kernel/cpu/intel_rdt_pseudo_lock.c:932:2: error: unknown field 'exclude_user' specified in initializer
     .exclude_user = 1,
     ^
   arch/x86/kernel/cpu/intel_rdt_pseudo_lock.c:932:18: warning: excess elements in struct initializer
     .exclude_user = 1,
                     ^
   arch/x86/kernel/cpu/intel_rdt_pseudo_lock.c:932:18: note: (near initialization for 'perf_miss_attr')
   arch/x86/kernel/cpu/intel_rdt_pseudo_lock.c:935:15: error: variable 'perf_hit_attr' has initializer but incomplete type
    static struct perf_event_attr __attribute__((unused)) perf_hit_attr = {
                  ^
   arch/x86/kernel/cpu/intel_rdt_pseudo_lock.c:936:2: error: unknown field 'type' specified in initializer
     .type  = PERF_TYPE_RAW,
     ^
   arch/x86/kernel/cpu/intel_rdt_pseudo_lock.c:936:11: warning: excess elements in struct initializer
     .type  = PERF_TYPE_RAW,
              ^
   arch/x86/kernel/cpu/intel_rdt_pseudo_lock.c:936:11: note: (near initialization for 'perf_hit_attr')
   arch/x86/kernel/cpu/intel_rdt_pseudo_lock.c:937:2: error: unknown field 'size' specified in initializer
     .size  = sizeof(struct perf_event_attr),
     ^
   arch/x86/kernel/cpu/intel_rdt_pseudo_lock.c:937:18: error: invalid application of 'sizeof' to incomplete type 'struct perf_event_attr'
     .size  = sizeof(struct perf_event_attr),
                     ^
   arch/x86/kernel/cpu/intel_rdt_pseudo_lock.c:937:11: warning: excess elements in struct initializer
     .size  = sizeof(struct perf_event_attr),
              ^
   arch/x86/kernel/cpu/intel_rdt_pseudo_lock.c:937:11: note: (near initialization for 'perf_hit_attr')
   arch/x86/kernel/cpu/intel_rdt_pseudo_lock.c:938:2: error: unknown field 'pinned' specified in initializer
     .pinned  = 1,
     ^
   arch/x86/kernel/cpu/intel_rdt_pseudo_lock.c:938:13: warning: excess elements in struct initializer
     .pinned  = 1,
                ^
   arch/x86/kernel/cpu/intel_rdt_pseudo_lock.c:938:13: note: (near initialization for 'perf_hit_attr')
   arch/x86/kernel/cpu/intel_rdt_pseudo_lock.c:939:2: error: unknown field 'disabled' specified in initializer
     .disabled = 0,
     ^
   arch/x86/kernel/cpu/intel_rdt_pseudo_lock.c:939:14: warning: excess elements in struct initializer
     .disabled = 0,
                 ^
   arch/x86/kernel/cpu/intel_rdt_pseudo_lock.c:939:14: note: (near initialization for 'perf_hit_attr')
   arch/x86/kernel/cpu/intel_rdt_pseudo_lock.c:940:2: error: unknown field 'exclude_user' specified in initializer
     .exclude_user = 1,
     ^
   arch/x86/kernel/cpu/intel_rdt_pseudo_lock.c:940:18: warning: excess elements in struct initializer
     .exclude_user = 1,
                     ^
   arch/x86/kernel/cpu/intel_rdt_pseudo_lock.c:940:18: note: (near initialization for 'perf_hit_attr')

vim +/type +928 arch/x86/kernel/cpu/intel_rdt_pseudo_lock.c

   917	
   918	/*
   919	 * Create a perf_event_attr for the hit and miss perf events that will
   920	 * be used during the performance measurement. A perf_event maintains
   921	 * a pointer to its perf_event_attr so a unique attribute structure is
   922	 * created for each perf_event.
   923	 *
   924	 * The actual configuration of the event is set right before use in order
   925	 * to use the X86_CONFIG macro.
   926	 */
 > 927	static struct perf_event_attr __attribute__((unused)) perf_miss_attr = {
 > 928		.type		= PERF_TYPE_RAW,
 > 929		.size		= sizeof(struct perf_event_attr),
 > 930		.pinned		= 1,
 > 931		.disabled	= 0,
 > 932		.exclude_user	= 1,
   933	};
   934	

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 25464 bytes --]

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH V4 4/6] x86/intel_rdt: Create required perf event attributes
  2018-09-11 17:14 ` [PATCH V3 4/6] x86/intel_rdt: Create required perf event attributes Reinette Chatre
  2018-09-12  5:49   ` kbuild test robot
  2018-09-12  7:20   ` kbuild test robot
@ 2018-09-12 19:54   ` Reinette Chatre
  2 siblings, 0 replies; 17+ messages in thread
From: Reinette Chatre @ 2018-09-12 19:54 UTC (permalink / raw)
  To: tglx, fenghua.yu, tony.luck, peterz, mingo, acme
  Cc: gavin.hindman, jithu.joseph, dave.hansen, hpa, x86, linux-kernel,
	Reinette Chatre

A perf event has many attributes that are maintained in a separate
structure that should be provided when a new perf_event is created.

In preparation for the transition to perf_events the required attribute
structures are created for all the events that may be used in the
measurements. Most attributes for all the events are identical. The
actual configuration, what specifies what needs to be measured, is what
will be different between the events used. This configuration needs to
be done with X86_CONFIG that cannot be used as part of the designated
initializers used here, this will be introduced later.

Although they do look identical at this time the attribute structures
needs to be maintained separately since a perf_event will maintain a
pointer to its unique attributes.

In support of patch testing the new structs are given the unused attribute
until their use in later patches.

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
V4:
The kbuild test robot reported a build issue that at first seems to be
related to "make ARCH=i386" but was actually a result of the kernel
configuration used not having CONFIG_TRACEPOINTS set.
The consequence was that include/linux/perf_event.h was not being included
via the tracing code (from include/trace/define_trace.h) and we thus encounter
the build failure where struct perf_event_attr is unknown.
Fix this by explicitly including perf_event.h

 arch/x86/kernel/cpu/intel_rdt_pseudo_lock.c | 26 +++++++++++++++++++++
 1 file changed, 26 insertions(+)

diff --git a/arch/x86/kernel/cpu/intel_rdt_pseudo_lock.c b/arch/x86/kernel/cpu/intel_rdt_pseudo_lock.c
index 8ad83eb3fc89..33d7968f152a 100644
--- a/arch/x86/kernel/cpu/intel_rdt_pseudo_lock.c
+++ b/arch/x86/kernel/cpu/intel_rdt_pseudo_lock.c
@@ -17,6 +17,7 @@
 #include <linux/debugfs.h>
 #include <linux/kthread.h>
 #include <linux/mman.h>
+#include <linux/perf_event.h>
 #include <linux/pm_qos.h>
 #include <linux/slab.h>
 #include <linux/uaccess.h>
@@ -915,6 +916,31 @@ static int measure_cycles_lat_fn(void *_plr)
 	return 0;
 }
 
+/*
+ * Create a perf_event_attr for the hit and miss perf events that will
+ * be used during the performance measurement. A perf_event maintains
+ * a pointer to its perf_event_attr so a unique attribute structure is
+ * created for each perf_event.
+ *
+ * The actual configuration of the event is set right before use in order
+ * to use the X86_CONFIG macro.
+ */
+static struct perf_event_attr __attribute__((unused)) perf_miss_attr = {
+	.type		= PERF_TYPE_RAW,
+	.size		= sizeof(struct perf_event_attr),
+	.pinned		= 1,
+	.disabled	= 0,
+	.exclude_user	= 1,
+};
+
+static struct perf_event_attr __attribute__((unused)) perf_hit_attr = {
+	.type		= PERF_TYPE_RAW,
+	.size		= sizeof(struct perf_event_attr),
+	.pinned		= 1,
+	.disabled	= 0,
+	.exclude_user	= 1,
+};
+
 static int measure_cycles_perf_fn(void *_plr)
 {
 	unsigned long long l3_hits = 0, l3_miss = 0;
-- 
2.17.0


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH V3 2/6] perf/core: Add helper to obtain performance counter index
  2018-09-11 17:14 ` [PATCH V3 2/6] perf/core: Add helper to obtain performance counter index Reinette Chatre
@ 2018-09-17  8:23   ` Peter Zijlstra
  2018-09-17 16:37     ` Reinette Chatre
  0 siblings, 1 reply; 17+ messages in thread
From: Peter Zijlstra @ 2018-09-17  8:23 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: tglx, fenghua.yu, tony.luck, mingo, acme, gavin.hindman,
	jithu.joseph, dave.hansen, hpa, x86, linux-kernel

On Tue, Sep 11, 2018 at 10:14:33AM -0700, Reinette Chatre wrote:
> diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
> index 53c500f0ca79..c04dc666425c 100644
> --- a/include/linux/perf_event.h
> +++ b/include/linux/perf_event.h
> @@ -1025,6 +1025,27 @@ static inline int in_software_context(struct perf_event *event)
>  	return event->ctx->pmu->task_ctx_nr == perf_sw_context;
>  }
>  
> +/**
> + * perf_rdpmc_index - Return PMC counter used for event
> + * @event: the perf_event to which the PMC counter was assigned
> + *
> + * The counter assigned to this performance event may change if interrupts
> + * are enabled. This counter should thus never be used while interrupts are
> + * enabled. Before this function is used to obtain the assigned counter the
> + * event could be checked for validity using, for example,
> + * perf_event_read_local(), within the same interrupt disabled section in
> + * which this counter is planned to be used.
> + *
> + * Return: The index of the performance monitoring counter assigned to
> + * @perf_event.
> + */
> +static inline int perf_rdpmc_index(struct perf_event *event)
> +{
> +	lockdep_assert_irqs_disabled();
> +
> +	return event->hw.event_base_rdpmc;
> +}

I said arch/x86/include/asm/perf_events.h and call it:
x86_perf_rdpmc_index().

This function is very much x86 specific.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH V3 5/6] x86/intel_rdt: Use perf infrastructure for measurements
  2018-09-11 17:14 ` [PATCH V3 5/6] x86/intel_rdt: Use perf infrastructure for measurements Reinette Chatre
@ 2018-09-17  8:58   ` Peter Zijlstra
  2018-09-17 16:54     ` Reinette Chatre
  0 siblings, 1 reply; 17+ messages in thread
From: Peter Zijlstra @ 2018-09-17  8:58 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: tglx, fenghua.yu, tony.luck, mingo, acme, gavin.hindman,
	jithu.joseph, dave.hansen, hpa, x86, linux-kernel

On Tue, Sep 11, 2018 at 10:14:36AM -0700, Reinette Chatre wrote:
> +static int measure_l2_residency(void *_plr)
> +{

> +	measure_residency_fn(&perf_miss_attr, &perf_hit_attr, plr, &counts);

> +}
> +
> +static int measure_l3_residency(void *_plr)
> +{

> +	measure_residency_fn(&perf_miss_attr, &perf_hit_attr, plr, &counts);

Not sure it's important, but both sites loose the error return.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH V3 2/6] perf/core: Add helper to obtain performance counter index
  2018-09-17  8:23   ` Peter Zijlstra
@ 2018-09-17 16:37     ` Reinette Chatre
  2018-09-17 23:07       ` Peter Zijlstra
  0 siblings, 1 reply; 17+ messages in thread
From: Reinette Chatre @ 2018-09-17 16:37 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: tglx, fenghua.yu, tony.luck, mingo, acme, gavin.hindman,
	jithu.joseph, dave.hansen, hpa, x86, linux-kernel

Hi Peter,

On 9/17/2018 1:23 AM, Peter Zijlstra wrote:
> On Tue, Sep 11, 2018 at 10:14:33AM -0700, Reinette Chatre wrote:
>> diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
>> index 53c500f0ca79..c04dc666425c 100644
>> --- a/include/linux/perf_event.h
>> +++ b/include/linux/perf_event.h
>> @@ -1025,6 +1025,27 @@ static inline int in_software_context(struct perf_event *event)
>>  	return event->ctx->pmu->task_ctx_nr == perf_sw_context;
>>  }
>>  
>> +/**
>> + * perf_rdpmc_index - Return PMC counter used for event
>> + * @event: the perf_event to which the PMC counter was assigned
>> + *
>> + * The counter assigned to this performance event may change if interrupts
>> + * are enabled. This counter should thus never be used while interrupts are
>> + * enabled. Before this function is used to obtain the assigned counter the
>> + * event could be checked for validity using, for example,
>> + * perf_event_read_local(), within the same interrupt disabled section in
>> + * which this counter is planned to be used.
>> + *
>> + * Return: The index of the performance monitoring counter assigned to
>> + * @perf_event.
>> + */
>> +static inline int perf_rdpmc_index(struct perf_event *event)
>> +{
>> +	lockdep_assert_irqs_disabled();
>> +
>> +	return event->hw.event_base_rdpmc;
>> +}
> 
> I said arch/x86/include/asm/perf_events.h and call it:
> x86_perf_rdpmc_index().
> 
> This function is very much x86 specific.
> 

My response to your original request includes the reason why I made this
change instead. Since you did not reply I assumed that you agreed with
the conclusion and I proceeded with my proposal there:
http://lkml.kernel.org/r/f47a2146-2f1a-49fc-2306-3341154f1186@intel.com

The reason why I made this change is repeated in the cover letter of
this series:
http://lkml.kernel.org/r/cover.1536685533.git.reinette.chatre@intel.com

My original response is copied here for your convenience:
Hi Peter,

On 9/6/2018 7:47 AM, Peter Zijlstra wrote:
> On Thu, Aug 16, 2018 at 01:16:07PM -0700, Reinette Chatre wrote:
>
>> +static inline int x86_perf_rdpmc_ctr_get(struct perf_event *event)
>> +{
>> +	lockdep_assert_irqs_disabled();
>> +
>> +	return IS_ERR_OR_NULL(event) ? -1 : event->hw.event_base_rdpmc;
>> +}
>
> That should be in arch/x86/include/asm/perf_event.h if anywhere. Also,
> call the thing x86_perf_rdpmc_index(), that's consistent with the other
> naming.

Moving it to arch/x86/include/asm/perf_event.h is not trivial since this
file is not familiar with struct perf_event.

struct perf_event, struct hw_perf_event and its member event_base_rdpmc
are all defined in include/linux/perf_event.h - could this function
perhaps be moved there? If so, would perf_rdpmc_index() perhaps be a
better name to be consistent with the other naming?

>
> I don't think there's any point in testing for !event, this is an
> interface that mandates you know wth you're doing anyway.
>

I could add:
/* !CONFIG_PERF_EVENTS */
static inline int perf_rdpmc_index(struct perf_event *event)
{
        return -1;
}

Reinette

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH V3 5/6] x86/intel_rdt: Use perf infrastructure for measurements
  2018-09-17  8:58   ` Peter Zijlstra
@ 2018-09-17 16:54     ` Reinette Chatre
  0 siblings, 0 replies; 17+ messages in thread
From: Reinette Chatre @ 2018-09-17 16:54 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: tglx, fenghua.yu, tony.luck, mingo, acme, gavin.hindman,
	jithu.joseph, dave.hansen, hpa, x86, linux-kernel

Hi Peter,

On 9/17/2018 1:58 AM, Peter Zijlstra wrote:
> On Tue, Sep 11, 2018 at 10:14:36AM -0700, Reinette Chatre wrote:
>> +static int measure_l2_residency(void *_plr)
>> +{
> 
>> +	measure_residency_fn(&perf_miss_attr, &perf_hit_attr, plr, &counts);
> 
>> +}
>> +
>> +static int measure_l3_residency(void *_plr)
>> +{
> 
>> +	measure_residency_fn(&perf_miss_attr, &perf_hit_attr, plr, &counts);
> 
> Not sure it's important, but both sites loose the error return.
> 

measure_l2_residency() as well as measure_l3_residency() are thread
functions so this error does not propagate directly to the caller.

Even so, by not exiting the thread here if measure_residency_fn() fails
the tracepoints will be written with zeroes. If we exited here the
tracepoints will not be written and the trace buffer will be empty when
the user searches for the measurement data. Do you perhaps have an
inclination to which a user would prefer?

Reinette

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH V3 2/6] perf/core: Add helper to obtain performance counter index
  2018-09-17 16:37     ` Reinette Chatre
@ 2018-09-17 23:07       ` Peter Zijlstra
  2018-09-18 17:54         ` Reinette Chatre
  0 siblings, 1 reply; 17+ messages in thread
From: Peter Zijlstra @ 2018-09-17 23:07 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: tglx, fenghua.yu, tony.luck, mingo, acme, gavin.hindman,
	jithu.joseph, dave.hansen, hpa, x86, linux-kernel

On Mon, Sep 17, 2018 at 09:37:14AM -0700, Reinette Chatre wrote:
> On 9/17/2018 1:23 AM, Peter Zijlstra wrote:

> > I said arch/x86/include/asm/perf_events.h and call it:
> > x86_perf_rdpmc_index().
> > 
> > This function is very much x86 specific.

> Moving it to arch/x86/include/asm/perf_event.h is not trivial since this
> file is not familiar with struct perf_event.

Urgh, right you are. Does it work if you make it a regular function
instead of an inline? Put the thing in arch/x86/events/core.c or so and
only an extern decl in asm/perf_event.h.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH V3 2/6] perf/core: Add helper to obtain performance counter index
  2018-09-17 23:07       ` Peter Zijlstra
@ 2018-09-18 17:54         ` Reinette Chatre
  2018-09-19  9:17           ` Peter Zijlstra
  0 siblings, 1 reply; 17+ messages in thread
From: Reinette Chatre @ 2018-09-18 17:54 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: tglx, fenghua.yu, tony.luck, mingo, acme, gavin.hindman,
	jithu.joseph, dave.hansen, hpa, x86, linux-kernel

Hi Peter,

On 9/17/2018 4:07 PM, Peter Zijlstra wrote:
> On Mon, Sep 17, 2018 at 09:37:14AM -0700, Reinette Chatre wrote:
>> On 9/17/2018 1:23 AM, Peter Zijlstra wrote:
> 
>>> I said arch/x86/include/asm/perf_events.h and call it:
>>> x86_perf_rdpmc_index().
>>>
>>> This function is very much x86 specific.
> 
>> Moving it to arch/x86/include/asm/perf_event.h is not trivial since this
>> file is not familiar with struct perf_event.
> 
> Urgh, right you are. Does it work if you make it a regular function
> instead of an inline? Put the thing in arch/x86/events/core.c or so and
> only an extern decl in asm/perf_event.h.

It works, but checkpatch.pl does not like it very much:
> CHECK: extern prototypes should be avoided in .h files
> #66: FILE: arch/x86/include/asm/perf_event.h:273:
> +extern int x86_perf_rdpmc_index(struct perf_event *event);

Doing this also prevents the availability of a x86_perf_rdpmc_index()
for when !CONFIG_PERF_EVENTS. I do not know if this is of big concern
since CONFIG_PERF_EVENTS is automatically selected by CONFIG_X86 ... but
at the same time there are other function definitions in
arch/x86/include/asm/perf_event.h for when !CONFIG_PERF_EVENTS.

Considering the above, would you like me to continue with the move to
arch/x86/events/core.c? Here is what I understood your suggestion to mean:

diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index dfb2f7c0d019..3550d800b030 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -1033,6 +1033,27 @@ static inline void x86_assign_hw_event(struct
perf_event *event,
        }
 }

+/**
+ * x86_perf_rdpmc_index - Return PMC counter used for event
+ * @event: the perf_event to which the PMC counter was assigned
+ *
+ * The counter assigned to this performance event may change if interrupts
+ * are enabled. This counter should thus never be used while interrupts are
+ * enabled. Before this function is used to obtain the assigned counter the
+ * event should be checked for validity using, for example,
+ * perf_event_read_local(), within the same interrupt disabled section in
+ * which this counter is planned to be used.
+ *
+ * Return: The index of the performance monitoring counter assigned to
+ * @perf_event.
+ */
+int x86_perf_rdpmc_index(struct perf_event *event)
+{
+       lockdep_assert_irqs_disabled();
+
+       return event->hw.event_base_rdpmc;
+}
+
 static inline int match_prev_assignment(struct hw_perf_event *hwc,
                                        struct cpu_hw_events *cpuc,
                                        int i)
diff --git a/arch/x86/include/asm/perf_event.h
b/arch/x86/include/asm/perf_event.h
index 12f54082f4c8..b2cf84c35a6d 100644
--- a/arch/x86/include/asm/perf_event.h
+++ b/arch/x86/include/asm/perf_event.h
@@ -270,6 +270,7 @@ struct perf_guest_switch_msr {
 extern struct perf_guest_switch_msr *perf_guest_get_msrs(int *nr);
 extern void perf_get_x86_pmu_capability(struct x86_pmu_capability *cap);
 extern void perf_check_microcode(void);
+extern int x86_perf_rdpmc_index(struct perf_event *event);
 #else
 static inline struct perf_guest_switch_msr *perf_guest_get_msrs(int *nr)
 {

Thank you

Reinette

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH V3 2/6] perf/core: Add helper to obtain performance counter index
  2018-09-18 17:54         ` Reinette Chatre
@ 2018-09-19  9:17           ` Peter Zijlstra
  0 siblings, 0 replies; 17+ messages in thread
From: Peter Zijlstra @ 2018-09-19  9:17 UTC (permalink / raw)
  To: Reinette Chatre
  Cc: tglx, fenghua.yu, tony.luck, mingo, acme, gavin.hindman,
	jithu.joseph, dave.hansen, hpa, x86, linux-kernel

On Tue, Sep 18, 2018 at 10:54:28AM -0700, Reinette Chatre wrote:
> It works, but checkpatch.pl does not like it very much:
> > CHECK: extern prototypes should be avoided in .h files
> > #66: FILE: arch/x86/include/asm/perf_event.h:273:
> > +extern int x86_perf_rdpmc_index(struct perf_event *event);

Checkpatch is wrong. Ignore it.

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2018-09-19  9:18 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-09-11 17:14 [PATCH V3 0/6] perf/core and x86/intel_rdt: Fix lack of coordination with perf Reinette Chatre
2018-09-11 17:14 ` [PATCH V3 1/6] perf/core: Add sanity check to deal with pinned event failure Reinette Chatre
2018-09-11 17:14 ` [PATCH V3 2/6] perf/core: Add helper to obtain performance counter index Reinette Chatre
2018-09-17  8:23   ` Peter Zijlstra
2018-09-17 16:37     ` Reinette Chatre
2018-09-17 23:07       ` Peter Zijlstra
2018-09-18 17:54         ` Reinette Chatre
2018-09-19  9:17           ` Peter Zijlstra
2018-09-11 17:14 ` [PATCH V3 3/6] x86/intel_rdt: Remove local register variables Reinette Chatre
2018-09-11 17:14 ` [PATCH V3 4/6] x86/intel_rdt: Create required perf event attributes Reinette Chatre
2018-09-12  5:49   ` kbuild test robot
2018-09-12  7:20   ` kbuild test robot
2018-09-12 19:54   ` [PATCH V4 " Reinette Chatre
2018-09-11 17:14 ` [PATCH V3 5/6] x86/intel_rdt: Use perf infrastructure for measurements Reinette Chatre
2018-09-17  8:58   ` Peter Zijlstra
2018-09-17 16:54     ` Reinette Chatre
2018-09-11 17:14 ` [PATCH V3 6/6] x86/intel_rdt: Re-enable pseudo-lock measurements Reinette Chatre

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).