* [PATCH 0/4] perf: Implement event group read using txn interface
@ 2015-03-04  8:35 ` Sukadev Bhattiprolu
  0 siblings, 0 replies; 22+ messages in thread
From: Sukadev Bhattiprolu @ 2015-03-04  8:35 UTC (permalink / raw)
  To: Michael Ellerman, Paul Mackerras, peterz; +Cc: dev, linux-kernel, linuxppc-dev

Unlike normal hardware PMCs, the 24x7 counters[1] in Power8 are stored in
memory and accessed via a hypervisor call (HCALL).  A major aspect of the
HCALL is that it allows retrieving _SEVERAL_ counters at once (unlike
regular PMCs, which are read one at a time). By reading several counters
at once, we can get a more consistent snapshot of the system.

This patchset explores using the transaction interface to submit
several "events" to the PMU and have the PMU read them all at once.
Users are expected to submit the set of events they want to read as an
"event group".
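
As an illustration, here is a minimal userspace sketch of creating such
an event group with perf_event_open(2) and reading it in one shot.
Error handling is elided, and the generic hardware events and attr
settings are stand-ins for this sketch, not part of this patchset:

	#include <stdint.h>
	#include <stdio.h>
	#include <string.h>
	#include <unistd.h>
	#include <sys/ioctl.h>
	#include <sys/syscall.h>
	#include <sys/types.h>
	#include <linux/perf_event.h>

	static int perf_event_open(struct perf_event_attr *attr, pid_t pid,
				   int cpu, int group_fd, unsigned long flags)
	{
		return syscall(__NR_perf_event_open, attr, pid, cpu,
			       group_fd, flags);
	}

	int main(void)
	{
		struct perf_event_attr attr;
		uint64_t buf[3];	/* nr + one value per event */
		int leader, sibling;

		memset(&attr, 0, sizeof(attr));
		attr.size = sizeof(attr);
		attr.type = PERF_TYPE_HARDWARE;
		attr.config = PERF_COUNT_HW_INSTRUCTIONS;
		attr.read_format = PERF_FORMAT_GROUP;
		attr.disabled = 1;
		leader = perf_event_open(&attr, 0, -1, -1, 0);

		/* sibling joins the leader's group via group_fd */
		attr.config = PERF_COUNT_HW_CPU_CYCLES;
		attr.disabled = 0;
		sibling = perf_event_open(&attr, 0, -1, leader, 0);

		ioctl(leader, PERF_EVENT_IOC_ENABLE, 0);
		/* ... run workload ... */

		/* one read() on the leader returns the whole group */
		read(leader, buf, sizeof(buf));
		printf("nr=%llu insns=%llu cycles=%llu\n",
		       (unsigned long long)buf[0],
		       (unsigned long long)buf[1],
		       (unsigned long long)buf[2]);
		return 0;
	}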

In the kernel, we submit each event to the PMU using the following logic
(from Peter Zijlstra).
	
	pmu->start_txn(pmu, PMU_TXN_READ);

	leader->read();
	for_each_sibling()
		sibling->read();
	pmu->commit_txn();

where:
	- the ->read()s queue events to be submitted to the hypervisor, and,
	- the ->commit_txn() issues the HCALL, retrieves the result and
	  updates the event count.

Since ->commit_txn() updates the event count, the perf subsystem can
use the event count directly and skip the ->read() call immediately
after the transaction.

TODO: 
	- Lightly tested on powerpc; needs more testing on Power and x86.
	- Need to review/update txn and perf_event_read_value() interfaces
	  for other architectures.

Thanks to Peter Zijlstra for his input.

Sukadev Bhattiprolu (4):
  perf: Add 'flags' parameter to pmu txn interfaces
  perf: Split perf_event_read() and perf_event_count()
  perf: Add 'update' parameter to perf_event_read_value()
  perf/powerpc: Implement group_read() txn interface for 24x7 counters

 arch/powerpc/perf/core-book3s.c  |  15 +++-
 arch/powerpc/perf/hv-24x7.c      | 171 +++++++++++++++++++++++++++++++++++++++
 arch/x86/kernel/cpu/perf_event.c |  15 +++-
 arch/x86/kvm/pmu.c               |   2 +-
 include/linux/perf_event.h       |  17 +++-
 kernel/events/core.c             |  93 ++++++++++++++++-----
 6 files changed, 281 insertions(+), 32 deletions(-)

-- 
1.8.3.1


* [PATCH 1/4] perf: Add 'flags' parameter to pmu txn interfaces
  2015-03-04  8:35 ` Sukadev Bhattiprolu
@ 2015-03-04  8:35   ` Sukadev Bhattiprolu
  -1 siblings, 0 replies; 22+ messages in thread
From: Sukadev Bhattiprolu @ 2015-03-04  8:35 UTC (permalink / raw)
  To: Michael Ellerman, Paul Mackerras, peterz; +Cc: dev, linux-kernel, linuxppc-dev

In addition to using the transaction interface to schedule events
on a PMU, we will also use it to read a group of counters at once.
Accordingly, add a 'flags' parameter to the transaction interfaces.
The flags indicate whether the transaction is to add events to the
PMU (PERF_PMU_TXN_ADD) or to read the events (PERF_PMU_TXN_READ).

Based on input from Peter Zijlstra.

Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
---
 arch/powerpc/perf/core-book3s.c  | 15 ++++++++++++---
 arch/x86/kernel/cpu/perf_event.c | 15 ++++++++++++---
 include/linux/perf_event.h       | 14 +++++++++++---
 kernel/events/core.c             | 26 +++++++++++++++-----------
 4 files changed, 50 insertions(+), 20 deletions(-)

diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
index 7c4f669..3d3739a 100644
--- a/arch/powerpc/perf/core-book3s.c
+++ b/arch/powerpc/perf/core-book3s.c
@@ -1573,10 +1573,13 @@ static void power_pmu_stop(struct perf_event *event, int ef_flags)
  * Set the flag to make pmu::enable() not perform the
  * schedulability test, it will be performed at commit time
  */
-static void power_pmu_start_txn(struct pmu *pmu)
+static void power_pmu_start_txn(struct pmu *pmu, int flags)
 {
 	struct cpu_hw_events *cpuhw = this_cpu_ptr(&cpu_hw_events);
 
+	if (flags & ~PERF_PMU_TXN_ADD)
+		return;
+
 	perf_pmu_disable(pmu);
 	cpuhw->group_flag |= PERF_EVENT_TXN;
 	cpuhw->n_txn_start = cpuhw->n_events;
@@ -1587,10 +1590,13 @@ static void power_pmu_start_txn(struct pmu *pmu)
  * Clear the flag and pmu::enable() will perform the
  * schedulability test.
  */
-static void power_pmu_cancel_txn(struct pmu *pmu)
+static void power_pmu_cancel_txn(struct pmu *pmu, int flags)
 {
 	struct cpu_hw_events *cpuhw = this_cpu_ptr(&cpu_hw_events);
 
+	if (flags & ~PERF_PMU_TXN_ADD)
+		return;
+
 	cpuhw->group_flag &= ~PERF_EVENT_TXN;
 	perf_pmu_enable(pmu);
 }
@@ -1600,11 +1606,14 @@ static void power_pmu_cancel_txn(struct pmu *pmu)
  * Perform the group schedulability test as a whole
  * Return 0 if success
  */
-static int power_pmu_commit_txn(struct pmu *pmu)
+static int power_pmu_commit_txn(struct pmu *pmu, int flags)
 {
 	struct cpu_hw_events *cpuhw;
 	long i, n;
 
+	if (flags & ~PERF_PMU_TXN_ADD)
+		return -EINVAL;
+
 	if (!ppmu)
 		return -EAGAIN;
 	cpuhw = this_cpu_ptr(&cpu_hw_events);
diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c
index b71a7f8..b2c9e3b 100644
--- a/arch/x86/kernel/cpu/perf_event.c
+++ b/arch/x86/kernel/cpu/perf_event.c
@@ -1607,8 +1607,11 @@ static inline void x86_pmu_read(struct perf_event *event)
  * Set the flag to make pmu::enable() not perform the
  * schedulability test, it will be performed at commit time
  */
-static void x86_pmu_start_txn(struct pmu *pmu)
+static void x86_pmu_start_txn(struct pmu *pmu, int flags)
 {
+	if (flags & ~PERF_PMU_TXN_ADD)
+		return;
+
 	perf_pmu_disable(pmu);
 	__this_cpu_or(cpu_hw_events.group_flag, PERF_EVENT_TXN);
 	__this_cpu_write(cpu_hw_events.n_txn, 0);
@@ -1619,8 +1622,11 @@ static void x86_pmu_start_txn(struct pmu *pmu)
  * Clear the flag and pmu::enable() will perform the
  * schedulability test.
  */
-static void x86_pmu_cancel_txn(struct pmu *pmu)
+static void x86_pmu_cancel_txn(struct pmu *pmu, int flags)
 {
+	if (flags & ~PERF_PMU_TXN_ADD)
+		return;
+
 	__this_cpu_and(cpu_hw_events.group_flag, ~PERF_EVENT_TXN);
 	/*
 	 * Truncate collected array by the number of events added in this
@@ -1638,12 +1644,15 @@ static void x86_pmu_cancel_txn(struct pmu *pmu)
  *
  * Does not cancel the transaction on failure; expects the caller to do this.
  */
-static int x86_pmu_commit_txn(struct pmu *pmu)
+static int x86_pmu_commit_txn(struct pmu *pmu, int flags)
 {
 	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
 	int assign[X86_PMC_IDX_MAX];
 	int n, ret;
 
+	if (flags & ~PERF_PMU_TXN_ADD)
+		return -EINVAL;
+
 	n = cpuc->n_events;
 
 	if (!x86_pmu_initialized())
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 2b62198..c8fe60e 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -225,6 +225,8 @@ struct pmu {
 	 * should stop the counter when perf_event_overflow() returns
 	 * !0. ->start() will be used to continue.
 	 */
+#define	PERF_PMU_TXN_ADD	1
+#define	PERF_PMU_TXN_READ	2
 	void (*start)			(struct perf_event *event, int flags);
 	void (*stop)			(struct perf_event *event, int flags);
 
@@ -240,20 +242,26 @@ struct pmu {
 	 *
 	 * Start the transaction, after this ->add() doesn't need to
 	 * do schedulability tests.
+	 *
+	 * Optional.
 	 */
-	void (*start_txn)		(struct pmu *pmu); /* optional */
+	void (*start_txn)		(struct pmu *pmu, int flags);
 	/*
 	 * If ->start_txn() disabled the ->add() schedulability test
 	 * then ->commit_txn() is required to perform one. On success
 	 * the transaction is closed. On error the transaction is kept
 	 * open until ->cancel_txn() is called.
+	 *
+	 * Optional.
 	 */
-	int  (*commit_txn)		(struct pmu *pmu); /* optional */
+	int  (*commit_txn)		(struct pmu *pmu, int flags);
 	/*
 	 * Will cancel the transaction, assumes ->del() is called
 	 * for each successful ->add() during the transaction.
+	 *
+	 * Optional.
 	 */
-	void (*cancel_txn)		(struct pmu *pmu); /* optional */
+	void (*cancel_txn)		(struct pmu *pmu, int flags);
 
 	/*
 	 * Will return the value for perf_event_mmap_page::index for this event,
diff --git a/kernel/events/core.c b/kernel/events/core.c
index f04daab..dbc12bf 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -1924,10 +1924,10 @@ group_sched_in(struct perf_event *group_event,
 	if (group_event->state == PERF_EVENT_STATE_OFF)
 		return 0;
 
-	pmu->start_txn(pmu);
+	pmu->start_txn(pmu, PERF_PMU_TXN_ADD);
 
 	if (event_sched_in(group_event, cpuctx, ctx)) {
-		pmu->cancel_txn(pmu);
+		pmu->cancel_txn(pmu, PERF_PMU_TXN_ADD);
 		perf_cpu_hrtimer_restart(cpuctx);
 		return -EAGAIN;
 	}
@@ -1942,7 +1942,7 @@ group_sched_in(struct perf_event *group_event,
 		}
 	}
 
-	if (!pmu->commit_txn(pmu))
+	if (!pmu->commit_txn(pmu, PERF_PMU_TXN_ADD))
 		return 0;
 
 group_error:
@@ -1973,7 +1973,7 @@ group_error:
 	}
 	event_sched_out(group_event, cpuctx, ctx);
 
-	pmu->cancel_txn(pmu);
+	pmu->cancel_txn(pmu, PERF_PMU_TXN_ADD);
 
 	perf_cpu_hrtimer_restart(cpuctx);
 
@@ -6718,23 +6718,27 @@ static void perf_pmu_nop_void(struct pmu *pmu)
 {
 }
 
-static int perf_pmu_nop_int(struct pmu *pmu)
+static void perf_pmu_nop_txn(struct pmu *pmu, int flags)
+{
+}
+
+static int perf_pmu_nop_txn_int(struct pmu *pmu, int flags)
 {
 	return 0;
 }
 
-static void perf_pmu_start_txn(struct pmu *pmu)
+static void perf_pmu_start_txn(struct pmu *pmu, int flags)
 {
 	perf_pmu_disable(pmu);
 }
 
-static int perf_pmu_commit_txn(struct pmu *pmu)
+static int perf_pmu_commit_txn(struct pmu *pmu, int flags)
 {
 	perf_pmu_enable(pmu);
 	return 0;
 }
 
-static void perf_pmu_cancel_txn(struct pmu *pmu)
+static void perf_pmu_cancel_txn(struct pmu *pmu, int flags)
 {
 	perf_pmu_enable(pmu);
 }
@@ -6968,9 +6972,9 @@ got_cpu_context:
 			pmu->commit_txn = perf_pmu_commit_txn;
 			pmu->cancel_txn = perf_pmu_cancel_txn;
 		} else {
-			pmu->start_txn  = perf_pmu_nop_void;
-			pmu->commit_txn = perf_pmu_nop_int;
-			pmu->cancel_txn = perf_pmu_nop_void;
+			pmu->start_txn  = perf_pmu_nop_txn;
+			pmu->commit_txn = perf_pmu_nop_txn_int;
+			pmu->cancel_txn = perf_pmu_nop_txn;
 		}
 	}
 
-- 
1.8.3.1


* [PATCH 2/4] perf: Split perf_event_read() and perf_event_count()
  2015-03-04  8:35 ` Sukadev Bhattiprolu
@ 2015-03-04  8:35   ` Sukadev Bhattiprolu
  -1 siblings, 0 replies; 22+ messages in thread
From: Sukadev Bhattiprolu @ 2015-03-04  8:35 UTC (permalink / raw)
  To: Michael Ellerman, Paul Mackerras, peterz; +Cc: dev, linux-kernel, linuxppc-dev

perf_event_read() does two things:

	- calls the PMU to read/update the counter value, and
	- computes the total count of the event and its children

perf_event_reset() needs the first piece but doesn't need the second.

Similarly, when we implement the ability to read a group of events
using the transaction interface, we would sometimes need one but
not both.

Break up perf_event_read() and have it just read/update the counter
and have the callers compute the total count if necessary.
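
After this split, callers that need the total count follow this
pattern (a sketch mirroring the hunks below):

	perf_event_read(event);			/* refresh event->count from the PMU */
	total = perf_event_count(event);	/* event count + child count */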

Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
---
 kernel/events/core.c | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index dbc12bf..11c4154 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -3223,7 +3223,7 @@ static inline u64 perf_event_count(struct perf_event *event)
 	return local64_read(&event->count) + atomic64_read(&event->child_count);
 }
 
-static u64 perf_event_read(struct perf_event *event)
+static void perf_event_read(struct perf_event *event)
 {
 	/*
 	 * If event is enabled and currently active on a CPU, update the
@@ -3249,8 +3249,6 @@ static u64 perf_event_read(struct perf_event *event)
 		update_event_times(event);
 		raw_spin_unlock_irqrestore(&ctx->lock, flags);
 	}
-
-	return perf_event_count(event);
 }
 
 /*
@@ -3654,14 +3652,18 @@ u64 perf_event_read_value(struct perf_event *event, u64 *enabled, u64 *running)
 	*running = 0;
 
 	mutex_lock(&event->child_mutex);
-	total += perf_event_read(event);
+
+	perf_event_read(event);
+	total += perf_event_count(event);
+
 	*enabled += event->total_time_enabled +
 			atomic64_read(&event->child_total_time_enabled);
 	*running += event->total_time_running +
 			atomic64_read(&event->child_total_time_running);
 
 	list_for_each_entry(child, &event->child_list, child_list) {
-		total += perf_event_read(child);
+		perf_event_read(child);
+		total += perf_event_count(child);
 		*enabled += child->total_time_enabled;
 		*running += child->total_time_running;
 	}
@@ -3821,7 +3823,7 @@ static unsigned int perf_poll(struct file *file, poll_table *wait)
 
 static void _perf_event_reset(struct perf_event *event)
 {
-	(void)perf_event_read(event);
+	perf_event_read(event);
 	local64_set(&event->count, 0);
 	perf_event_update_userpage(event);
 }
-- 
1.8.3.1


* [PATCH 3/4] perf: Add 'update' parameter to perf_event_read_value()
  2015-03-04  8:35 ` Sukadev Bhattiprolu
@ 2015-03-04  8:35   ` Sukadev Bhattiprolu
  -1 siblings, 0 replies; 22+ messages in thread
From: Sukadev Bhattiprolu @ 2015-03-04  8:35 UTC (permalink / raw)
  To: Michael Ellerman, Paul Mackerras, peterz; +Cc: dev, linux-kernel, linuxppc-dev

perf_event_read_value() reads the counter from the PMC and computes the
total count (including child events). Add an 'update' parameter and have
it read the PMC only if 'update' is TRUE (which it always is for now).
When we add support for reading multiple events using the transaction
interface, we can skip consulting the PMU when we know that the counts
are already up to date.
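
Existing callers keep the current behavior by passing update = 1, e.g.
(sketch matching the hunks below):

	count = perf_event_read_value(event, &enabled, &running, 1);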

Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
---
 arch/x86/kvm/pmu.c         |  2 +-
 include/linux/perf_event.h |  2 +-
 kernel/events/core.c       | 20 ++++++++++++++------
 3 files changed, 16 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index 8e6b7d8..ed91009 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -146,7 +146,7 @@ static u64 read_pmc(struct kvm_pmc *pmc)
 
 	if (pmc->perf_event)
 		counter += perf_event_read_value(pmc->perf_event,
-						 &enabled, &running);
+						 &enabled, &running, 1);
 
 	/* FIXME: Scaling needed? */
 
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index c8fe60e..8c571fb 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -580,7 +580,7 @@ perf_event_create_kernel_counter(struct perf_event_attr *attr,
 extern void perf_pmu_migrate_context(struct pmu *pmu,
 				int src_cpu, int dst_cpu);
 extern u64 perf_event_read_value(struct perf_event *event,
-				 u64 *enabled, u64 *running);
+				 u64 *enabled, u64 *running, int update);
 
 
 struct perf_sample_data {
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 11c4154..77ce4f3 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -3643,7 +3643,8 @@ static void orphans_remove_work(struct work_struct *work)
 	put_ctx(ctx);
 }
 
-u64 perf_event_read_value(struct perf_event *event, u64 *enabled, u64 *running)
+u64 perf_event_read_value(struct perf_event *event, u64 *enabled, u64 *running,
+				int update)
 {
 	struct perf_event *child;
 	u64 total = 0;
@@ -3653,7 +3654,9 @@ u64 perf_event_read_value(struct perf_event *event, u64 *enabled, u64 *running)
 
 	mutex_lock(&event->child_mutex);
 
-	perf_event_read(event);
+	if (update)
+		perf_event_read(event);
+
 	total += perf_event_count(event);
 
 	*enabled += event->total_time_enabled +
@@ -3662,7 +3665,8 @@ u64 perf_event_read_value(struct perf_event *event, u64 *enabled, u64 *running)
 			atomic64_read(&event->child_total_time_running);
 
 	list_for_each_entry(child, &event->child_list, child_list) {
-		perf_event_read(child);
+		if (update)
+			perf_event_read(child);
 		total += perf_event_count(child);
 		*enabled += child->total_time_enabled;
 		*running += child->total_time_running;
@@ -3681,10 +3685,13 @@ static int perf_event_read_group(struct perf_event *event,
 	int n = 0, size = 0, ret;
 	u64 count, enabled, running;
 	u64 values[5];
+	int update;
+	
 
 	lockdep_assert_held(&ctx->mutex);
 
-	count = perf_event_read_value(leader, &enabled, &running);
+	update = 1;
+	count = perf_event_read_value(leader, &enabled, &running, update);
 
 	values[n++] = 1 + leader->nr_siblings;
 	if (read_format & PERF_FORMAT_TOTAL_TIME_ENABLED)
@@ -3705,7 +3712,8 @@ static int perf_event_read_group(struct perf_event *event,
 	list_for_each_entry(sub, &leader->sibling_list, group_entry) {
 		n = 0;
 
-		values[n++] = perf_event_read_value(sub, &enabled, &running);
+		values[n++] = perf_event_read_value(sub, &enabled, &running,
+								update);
 		if (read_format & PERF_FORMAT_ID)
 			values[n++] = primary_event_id(sub);
 
@@ -3728,7 +3736,7 @@ static int perf_event_read_one(struct perf_event *event,
 	u64 values[4];
 	int n = 0;
 
-	values[n++] = perf_event_read_value(event, &enabled, &running);
+	values[n++] = perf_event_read_value(event, &enabled, &running, 1);
 	if (read_format & PERF_FORMAT_TOTAL_TIME_ENABLED)
 		values[n++] = enabled;
 	if (read_format & PERF_FORMAT_TOTAL_TIME_RUNNING)
-- 
1.8.3.1


* [PATCH 4/4] perf/powerpc: Implement group_read() txn interface for 24x7 counters
  2015-03-04  8:35 ` Sukadev Bhattiprolu
@ 2015-03-04  8:35   ` Sukadev Bhattiprolu
  -1 siblings, 0 replies; 22+ messages in thread
From: Sukadev Bhattiprolu @ 2015-03-04  8:35 UTC (permalink / raw)
  To: Michael Ellerman, Paul Mackerras, peterz; +Cc: dev, linux-kernel, linuxppc-dev

The 24x7 counters on powerpc allow monitoring a large number of events
simultaneously. They also allow reading several counters in a single
HCALL, so we can get a more consistent snapshot of the system.

Use the PMU's transaction interface to monitor and read several event
counters at once. The idea is that users can group several 24x7 events
into a single event group. We use the following logic to submit the
group of events to the PMU and read the values:

	pmu->start_txn()		// Initialize before first event

	for each event in group
		pmu->read(event);	// queue the event to be read

	pmu->commit_txn()		// Read/update all queued counters

The ->commit_txn() also updates the event counts in the respective
perf_event objects.  The perf subsystem can then directly get the
event counts from the perf_event and can avoid submitting a new
->read() request to the PMU.

Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
---
 arch/powerpc/perf/hv-24x7.c | 171 ++++++++++++++++++++++++++++++++++++++++++++
 include/linux/perf_event.h  |   1 +
 kernel/events/core.c        |  37 ++++++++++
 3 files changed, 209 insertions(+)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 8c571fb..a144d67 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -166,6 +166,7 @@ struct perf_event;
  * pmu::capabilities flags
  */
 #define PERF_PMU_CAP_NO_INTERRUPT		0x01
+#define PERF_PMU_CAP_GROUP_READ			0x02
 
 /**
  * struct pmu - generic performance monitoring unit
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 77ce4f3..ff62ea5 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -3677,11 +3677,34 @@ u64 perf_event_read_value(struct perf_event *event, u64 *enabled, u64 *running,
 }
 EXPORT_SYMBOL_GPL(perf_event_read_value);
 
+static int do_pmu_group_read(struct perf_event *leader)
+{
+	int ret;
+	struct pmu *pmu;
+	struct perf_event *sub;
+
+	pmu = leader->pmu;
+	pmu->start_txn(pmu, PERF_PMU_TXN_READ);
+
+	pmu->read(leader);
+	list_for_each_entry(sub, &leader->sibling_list, group_entry)
+		pmu->read(sub);
+
+	/*
+	 * Commit_txn submits the transaction to read all the counters
+	 * in the group _and_ updates the event count.
+	 */
+	ret = pmu->commit_txn(pmu, PERF_PMU_TXN_READ);
+
+	return ret;
+}
+
 static int perf_event_read_group(struct perf_event *event,
 				   u64 read_format, char __user *buf)
 {
 	struct perf_event *leader = event->group_leader, *sub;
 	struct perf_event_context *ctx = leader->ctx;
+	struct pmu *pmu;
 	int n = 0, size = 0, ret;
 	u64 count, enabled, running;
 	u64 values[5];
@@ -3690,7 +3713,21 @@ static int perf_event_read_group(struct perf_event *event,
 
 	lockdep_assert_held(&ctx->mutex);
 
+	pmu = event->pmu;
 	update = 1;
+
+	if ((read_format & PERF_FORMAT_GROUP) &&
+			(pmu->capabilities & PERF_PMU_CAP_GROUP_READ)) {
+		ret = do_pmu_group_read(event);
+		if (ret)
+			return ret;
+		/*
+		 * ->commit_txn() would have updated the event count,
+		 * so we don't have to consult the PMU again.
+		 */
+		update = 0;
+	}
+
 	count = perf_event_read_value(leader, &enabled, &running, update);
 
 	values[n++] = 1 + leader->nr_siblings;
diff --git a/arch/powerpc/perf/hv-24x7.c b/arch/powerpc/perf/hv-24x7.c
index 66fa6c8..08c69c1 100644
--- a/arch/powerpc/perf/hv-24x7.c
+++ b/arch/powerpc/perf/hv-24x7.c
@@ -142,6 +142,13 @@ static struct attribute_group event_long_desc_group = {
 
 static struct kmem_cache *hv_page_cache;
 
+struct h_24x7_hw {
+	int flags;
+	int in_txn;
+	int txn_err;
+	struct perf_event *events[255];
+};
+
 /*
  * request_buffer and result_buffer are not required to be 4k aligned,
  * but are not allowed to cross any 4k boundary. Aligning them to 4k is
@@ -150,6 +157,7 @@ static struct kmem_cache *hv_page_cache;
 #define H24x7_DATA_BUFFER_SIZE	4096
 DEFINE_PER_CPU(char, hv_24x7_reqb[H24x7_DATA_BUFFER_SIZE]) __aligned(4096);
 DEFINE_PER_CPU(char, hv_24x7_resb[H24x7_DATA_BUFFER_SIZE]) __aligned(4096);
+DEFINE_PER_CPU(struct h_24x7_hw, h_24x7_hw);
 
 static char *event_name(struct hv_24x7_event_data *ev, int *len)
 {
@@ -1210,10 +1218,46 @@ static void update_event_count(struct perf_event *event, u64 now)
 
 static void h_24x7_event_read(struct perf_event *event)
 {
+	int ret;
 	u64 now;
 
+	struct h_24x7_hw *h24x7hw;
+	struct hv_24x7_request_buffer *request_buffer;
+
+	/*
+	 * If in a READ transaction, add this counter to the list of
+	 * counters to read during the next HCALL (i.e. commit_txn()).
+	 * If not in a READ transaction, go ahead and make the HCALL
+	 * to read this counter by itself.
+	 */
+	h24x7hw = &get_cpu_var(h_24x7_hw);
+	if (h24x7hw->txn_err)
+		goto out;
+
+	request_buffer = (void *)get_cpu_var(hv_24x7_reqb);
+	if (h24x7hw->in_txn) {
+		int i;
+
+		ret = add_event_to_24x7_request(event, request_buffer);
+		if (ret) {
+			h24x7hw->txn_err = ret;
+			goto out;
+		}
+		/*
+		 * Associate the event with the HCALL request index, so we
+		 * can quickly find/update the count in ->commit_txn().
+		 */
+		i = request_buffer->num_requests - 1;
+		h24x7hw->events[i] = event;
+		ret = 0;
+		goto out;
+	}
+
 	now = h_24x7_get_value(event);
 	update_event_count(event, now);
+
+out:
+	put_cpu_var(h_24x7_hw);
 }
 
 static void h_24x7_event_start(struct perf_event *event, int flags)
@@ -1235,6 +1279,129 @@ static int h_24x7_event_add(struct perf_event *event, int flags)
 	return 0;
 }
 
+static void h_24x7_event_start_txn(struct pmu *pmu, int flags)
+{
+	struct hv_24x7_request_buffer *request_buffer;
+	struct hv_24x7_data_result_buffer *result_buffer;
+	struct h_24x7_hw *h24x7hw;
+
+	/*
+	 * 24x7 counters only support READ transactions. They are
+	 * always counting and don't need/support ADD transactions.
+	 */
+	if (flags & ~PERF_PMU_TXN_READ)
+		return;
+
+	h24x7hw = &get_cpu_var(h_24x7_hw);
+	request_buffer = (void *)get_cpu_var(hv_24x7_reqb);
+	result_buffer = (void *)get_cpu_var(hv_24x7_resb);
+
+	/* We should not be called if we are already in a txn */
+	WARN_ON_ONCE(h24x7hw->in_txn);
+
+	start_24x7_get_data(request_buffer, result_buffer);
+	h24x7hw->in_txn = 1;
+
+	put_cpu_var(h_24x7_hw);
+
+	return;
+}
+
+static void reset_txn(struct h_24x7_hw *h24x7hw)
+{
+	/* Clean up transaction */
+	h24x7hw->in_txn = 0;
+	h24x7hw->txn_err = 0;
+	h24x7hw->flags = 0;
+
+	/*
+	 * request_buffer and result_buffer will be initialized
+	 * during the next read/txn.
+	 */
+	return;
+}
+
+static int h_24x7_event_commit_txn(struct pmu *pmu, int flags)
+{
+	struct hv_24x7_request_buffer *request_buffer;
+	struct hv_24x7_data_result_buffer *result_buffer;
+	struct h_24x7_hw *h24x7hw;
+	struct hv_24x7_result *resb;
+	struct perf_event *event;
+	u64 count;
+	int i, ret;
+
+	/*
+	 * 24x7 counters only support READ transactions. They are
+	 * always counting and don't need/support ADD transactions.
+	 */
+	if (flags & ~PERF_PMU_TXN_READ)
+		return 0;
+
+	h24x7hw = &get_cpu_var(h_24x7_hw);
+	if (h24x7hw->txn_err) {
+		ret = h24x7hw->txn_err;
+		goto out;
+	}
+
+	ret = -EINVAL;
+	if (!h24x7hw->in_txn) {
+		WARN_ON_ONCE(1);
+		goto out;
+	}
+
+	request_buffer = (void *)get_cpu_var(hv_24x7_reqb);
+	result_buffer = (void *)get_cpu_var(hv_24x7_resb);
+
+	ret = commit_24x7_get_data(request_buffer, result_buffer);
+	if (ret) {
+		log_24x7_hcall(request_buffer, result_buffer, ret);
+		goto put_reqb;
+	}
+
+	/* Update event counts from hcall */
+	for (i = 0; i < request_buffer->num_requests; i++) {
+		resb = &result_buffer->results[i];
+		count = be64_to_cpu(resb->elements[0].element_data[0]);
+		event = h24x7hw->events[i];
+		h24x7hw->events[i] = NULL;
+		update_event_count(event, count);
+	}
+
+put_reqb:
+	put_cpu_var(hv_24x7_reqb);
+	put_cpu_var(hv_24x7_resb);
+out:
+	reset_txn(h24x7hw);
+	put_cpu_var(h_24x7_hw);
+	return ret;
+}
+
+static void h_24x7_event_cancel_txn(struct pmu *pmu, int flags)
+{
+	struct h_24x7_hw *h24x7hw;
+
+	/*
+	 * 24x7 counters only support READ transactions. They are
+	 * always counting and don't need/support ADD transactions.
+	 */
+	if (flags & ~PERF_PMU_TXN_READ)
+		return;
+
+	h24x7hw = &get_cpu_var(h_24x7_hw);
+
+	if (!h24x7hw->in_txn) {
+		WARN_ON_ONCE(1);
+		goto out;
+	}
+
+	reset_txn(h24x7hw);
+
+out:
+	put_cpu_var(h_24x7_hw);
+	return;
+}
+
 static struct pmu h_24x7_pmu = {
 	.task_ctx_nr = perf_invalid_context,
 
@@ -1246,6 +1413,9 @@ static struct pmu h_24x7_pmu = {
 	.start       = h_24x7_event_start,
 	.stop        = h_24x7_event_stop,
 	.read        = h_24x7_event_read,
+	.start_txn   = h_24x7_event_start_txn,
+	.commit_txn  = h_24x7_event_commit_txn,
+	.cancel_txn  = h_24x7_event_cancel_txn,
 };
 
 static int hv_24x7_init(void)
@@ -1272,6 +1442,7 @@ static int hv_24x7_init(void)
 
 	/* sampling not supported */
 	h_24x7_pmu.capabilities |= PERF_PMU_CAP_NO_INTERRUPT;
+	h_24x7_pmu.capabilities |= PERF_PMU_CAP_GROUP_READ;
 
 	create_events_from_catalog(&event_group.attrs,
 				   &event_desc_group.attrs,
-- 
1.8.3.1


* Re: [PATCH 1/4] perf: Add 'flags' parameter to pmu txn interfaces
  2015-03-04  8:35   ` Sukadev Bhattiprolu
@ 2015-03-17  6:47     ` Peter Zijlstra
  -1 siblings, 0 replies; 22+ messages in thread
From: Peter Zijlstra @ 2015-03-17  6:47 UTC (permalink / raw)
  To: Sukadev Bhattiprolu
  Cc: Michael Ellerman, Paul Mackerras, dev, linux-kernel, linuxppc-dev

On Wed, Mar 04, 2015 at 12:35:05AM -0800, Sukadev Bhattiprolu wrote:
> In addition to using the transaction interface to schedule events
> on a PMU, we will also use it to read a group of counters at once.
> Accordingly, add a 'flags' parameter to the transaction interfaces.
> The flags indicate whether the transaction is to add events to the
> PMU (PERF_PMU_TXN_ADD) or to read the events (PERF_PMU_TXN_READ).
> 
> Based on input from Peter Zijlstra.
> 
> Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
> ---
>  arch/powerpc/perf/core-book3s.c  | 15 ++++++++++++---
>  arch/x86/kernel/cpu/perf_event.c | 15 ++++++++++++---
>  include/linux/perf_event.h       | 14 +++++++++++---
>  kernel/events/core.c             | 26 +++++++++++++++-----------
>  4 files changed, 50 insertions(+), 20 deletions(-)

s390 and sparc also implement the txn.

# git grep "\.start_txn"
arch/powerpc/perf/core-book3s.c:        .start_txn      = power_pmu_start_txn,
arch/s390/kernel/perf_cpum_cf.c:        .start_txn    = cpumf_pmu_start_txn,
arch/sparc/kernel/perf_event.c: .start_txn      = sparc_pmu_start_txn,
arch/x86/kernel/cpu/perf_event.c:       .start_txn              = x86_pmu_start_txn,

Also; you add the flag to all 3 calls; does it make sense to only pass
it to the first and save the txn type in the txn state itself? We could
add PERF_EVENT_TXN_READ for this..
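
A rough sketch of that alternative (the txn_flags field and the
signature changes below are hypothetical, not existing code):

	static void x86_pmu_start_txn(struct pmu *pmu, int flags)
	{
		/* record the txn type once; commit/cancel take no flag */
		__this_cpu_write(cpu_hw_events.txn_flags, flags);

		if (flags & ~PERF_PMU_TXN_ADD)
			return;

		perf_pmu_disable(pmu);
		__this_cpu_or(cpu_hw_events.group_flag, PERF_EVENT_TXN);
		__this_cpu_write(cpu_hw_events.n_txn, 0);
	}

	static int x86_pmu_commit_txn(struct pmu *pmu)
	{
		int txn_flags = __this_cpu_read(cpu_hw_events.txn_flags);

		if (txn_flags & ~PERF_PMU_TXN_ADD)
			return 0;

		/* ... existing schedulability test as before ... */
		return 0;
	}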

* Re: [PATCH 3/4] perf: Add 'update' parameter to perf_event_read_value()
  2015-03-04  8:35   ` Sukadev Bhattiprolu
@ 2015-03-17  6:54     ` Peter Zijlstra
  0 siblings, 0 replies; 22+ messages in thread
From: Peter Zijlstra @ 2015-03-17  6:54 UTC (permalink / raw)
  To: Sukadev Bhattiprolu
  Cc: Michael Ellerman, Paul Mackerras, dev, linux-kernel, linuxppc-dev

On Wed, Mar 04, 2015 at 12:35:07AM -0800, Sukadev Bhattiprolu wrote:

>  extern u64 perf_event_read_value(struct perf_event *event,
> -				 u64 *enabled, u64 *running);
> +				 u64 *enabled, u64 *running, int update);
>  

I think someone recently showed that bool generates better code in some
cases. The advantage of int is that you can stuff more bits in, but then
you need to call it flags or so anyhow.
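
(For illustration, the two shapes being weighed; both signatures below
are hypothetical variants, not code from the posted series.)

	/* bool: self-documenting at the call site, and the compiler can
	 * often generate tighter code for a genuine boolean test. */
	extern u64 perf_event_read_value(struct perf_event *event,
					 u64 *enabled, u64 *running,
					 bool update);

	/* int as a flags word: leaves room for more bits later, but then
	 * the parameter should be named 'flags' and get #defines. */
	#define PERF_READ_UPDATE	0x01	/* hypothetical flag */
	extern u64 perf_event_read_value(struct perf_event *event,
					 u64 *enabled, u64 *running,
					 int flags);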

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 4/4] perf/powerpc: Implement group_read() txn interface for 24x7 counters
  2015-03-04  8:35   ` Sukadev Bhattiprolu
@ 2015-03-17  6:57     ` Peter Zijlstra
  0 siblings, 0 replies; 22+ messages in thread
From: Peter Zijlstra @ 2015-03-17  6:57 UTC (permalink / raw)
  To: Sukadev Bhattiprolu
  Cc: Michael Ellerman, Paul Mackerras, dev, linux-kernel, linuxppc-dev

On Wed, Mar 04, 2015 at 12:35:08AM -0800, Sukadev Bhattiprolu wrote:
> +++ b/kernel/events/core.c
> @@ -3677,11 +3677,34 @@ u64 perf_event_read_value(struct perf_event *event, u64 *enabled, u64 *running,
>  }
>  EXPORT_SYMBOL_GPL(perf_event_read_value);
>  
> +static int do_pmu_group_read(struct perf_event *leader)
> +{
> +	int ret;
> +	struct pmu *pmu;
> +	struct perf_event *sub;
> +
> +	pmu = leader->pmu;
> +	pmu->start_txn(pmu, PERF_PMU_TXN_READ);
> +
> +	pmu->read(leader);
> +	list_for_each_entry(sub, &leader->sibling_list, group_entry)
> +		pmu->read(sub);
> +
> +	/*
> +	 * Commit_txn submits the transaction to read all the counters
> +	 * in the group _and_ updates the event count.
> +	 */
> +	ret = pmu->commit_txn(pmu, PERF_PMU_TXN_READ);
> +
> +	return ret;
> +}
> +
>  static int perf_event_read_group(struct perf_event *event,
>  				   u64 read_format, char __user *buf)
>  {
>  	struct perf_event *leader = event->group_leader, *sub;
>  	struct perf_event_context *ctx = leader->ctx;
> +	struct pmu *pmu;
>  	int n = 0, size = 0, ret;
>  	u64 count, enabled, running;
>  	u64 values[5];
> @@ -3690,7 +3713,21 @@ static int perf_event_read_group(struct perf_event *event,
>  
>  	lockdep_assert_held(&ctx->mutex);
>  
> +	pmu = event->pmu;
>  	update = 1;
> +
> +	if ((read_format & PERF_FORMAT_GROUP) &&
> +			(pmu->capabilities & PERF_PMU_CAP_GROUP_READ)) {
> +		ret = do_pmu_group_read(event);
> +		if (ret)
> +			return ret;
> +		/*
> +		 * ->commit_txn() would have updated the event count,
> +		 * so we don't have to consult the PMU again.
> +		 */
> +		update = 0;
> +	}
> +

Is there a down-side to always doing the txn based group read? If an
arch does not implement the read txn support it'll fall back to doing
independent read ops, but we end up doing those anyway.

That way we get less special case code.
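
(A sketch of the unconditional form, assuming the core substitutes
no-op txn callbacks for PMUs that do not supply their own, much as
perf_pmu_register() already does for the scheduling transaction.)

	static int group_read(struct perf_event *leader)
	{
		struct perf_event *sub;
		struct pmu *pmu = leader->pmu;

		/* No-op on a PMU without read-txn support ... */
		pmu->start_txn(pmu, PERF_PMU_TXN_READ);

		/* ... so these degenerate to independent reads there. */
		pmu->read(leader);
		list_for_each_entry(sub, &leader->sibling_list, group_entry)
			pmu->read(sub);

		return pmu->commit_txn(pmu, PERF_PMU_TXN_READ);
	}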

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/4] perf: Add 'flags' parameter to pmu txn interfaces
  2015-03-17  6:47     ` Peter Zijlstra
@ 2015-03-26  4:00       ` Sukadev Bhattiprolu
  0 siblings, 0 replies; 22+ messages in thread
From: Sukadev Bhattiprolu @ 2015-03-26  4:00 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Michael Ellerman, Paul Mackerras, dev, linux-kernel, linuxppc-dev

Peter Zijlstra [peterz@infradead.org] wrote:
| On Wed, Mar 04, 2015 at 12:35:05AM -0800, Sukadev Bhattiprolu wrote:
| > In addition to using the transaction interface to schedule events
| > on a PMU, we will use it to also read a group of counters at once.
| > Accordingly, add a flags parameter to the transaction interfaces.
| > The flags indicate whether the transaction is to add events to
| > the PMU (PERF_PMU_TXN_ADD) or to read the events (PERF_PMU_TXN_READ).
| > 
| > Based on input from Peter Zijlstra.
| > 
| > Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
| > ---
| >  arch/powerpc/perf/core-book3s.c  | 15 ++++++++++++---
| >  arch/x86/kernel/cpu/perf_event.c | 15 ++++++++++++---
| >  include/linux/perf_event.h       | 14 +++++++++++---
| >  kernel/events/core.c             | 26 +++++++++++++++-----------
| >  4 files changed, 50 insertions(+), 20 deletions(-)
| 
| s390 and sparc also implement the txn.

Yes, I have fixed that now. I was mostly exploring the basic txn interface.
| 
| # git grep "\.start_txn"
| arch/powerpc/perf/core-book3s.c:        .start_txn      = power_pmu_start_txn,
| arch/s390/kernel/perf_cpum_cf.c:        .start_txn    = cpumf_pmu_start_txn,
| arch/sparc/kernel/perf_event.c: .start_txn      = sparc_pmu_start_txn,
| arch/x86/kernel/cpu/perf_event.c:       .start_txn              = x86_pmu_start_txn,
| 
| Also; you add the flag to all 3 calls; does it make sense to only pass
| it to the first and save the txn type in the txn state itself? We could
| add PERF_EVENT_TXN_READ for this..

We could do that. The one small downside I see with passing the txn flag
only to ->start_txn() is that checks like this become more complicated,
even in PMUs that don't care about TXN_READ transactions.

@@ -1619,8 +1622,11 @@ static void x86_pmu_start_txn(struct pmu *pmu)
  * Clear the flag and pmu::enable() will perform the
  * schedulability test.
  */
-static void x86_pmu_cancel_txn(struct pmu *pmu)
+static void x86_pmu_cancel_txn(struct pmu *pmu, int flags)
 {
+       if (flags & ~PERF_PMU_TXN_ADD)
+               return;
+

The ->start_txn() will need to save the transaction type in the
architecture's 'cpuhw' state, and ->commit_txn() and ->cancel_txn() will
need to check and clear it, right?
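
(Something like the following, perhaps; the 'txn_flags' field in
cpu_hw_events is assumed here, it is not in the posted series.)

	static void x86_pmu_cancel_txn(struct pmu *pmu)
	{
		struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
		int txn_flags = cpuc->txn_flags;

		cpuc->txn_flags = 0;		/* txn done, clear the type */
		if (txn_flags & ~PERF_PMU_TXN_ADD)
			return;			/* read txn: nothing to undo */

		__this_cpu_and(cpu_hw_events.group_flag, ~PERF_EVENT_TXN);
		perf_pmu_enable(pmu);
	}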

Sukadev


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 4/4] perf/powerpc: Implement group_read() txn interface for 24x7 counters
  2015-03-17  6:57     ` Peter Zijlstra
@ 2015-03-26  5:57       ` Sukadev Bhattiprolu
  0 siblings, 0 replies; 22+ messages in thread
From: Sukadev Bhattiprolu @ 2015-03-26  5:57 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Michael Ellerman, Paul Mackerras, dev, linux-kernel, linuxppc-dev

Peter Zijlstra [peterz@infradead.org] wrote:
| 
| Is there a down-side to always doing the txn based group read? If an
| arch does not implement the read txn support it'll fall back to doing
| independent read ops, but we end up doing those anyway.
| 
| That way we get less special case code.

We could, but we would need to move the perf_event_read() earlier in
perf_event_read_group(). Would something like this work? (It could be
broken into two patches, but I am merging them here for easier review.)

----

perf_event_read_value() mostly computes the event count and the enabled
and running times. Move the perf_event_read() into the callers and
rename perf_event_read_value() to perf_event_compute_values().

Then, in perf_event_read_group(), read the event counts using the
transaction interface for all PMUs.


----
diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index 8e6b7d8..5896cb1 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -144,9 +144,11 @@ static u64 read_pmc(struct kvm_pmc *pmc)
 
 	counter = pmc->counter;
 
-	if (pmc->perf_event)
-		counter += perf_event_read_value(pmc->perf_event,
+	if (pmc->perf_event) {
+		perf_event_read(pmc->perf_event);
+		counter += perf_event_compute_values(pmc->perf_event,
 						 &enabled, &running);
+	}
 
 	/* FIXME: Scaling needed? */
 
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index c8fe60e..1e30560 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -579,7 +579,7 @@ perf_event_create_kernel_counter(struct perf_event_attr *attr,
 				void *context);
 extern void perf_pmu_migrate_context(struct pmu *pmu,
 				int src_cpu, int dst_cpu);
-extern u64 perf_event_read_value(struct perf_event *event,
+extern u64 perf_event_compute_values(struct perf_event *event,
 				 u64 *enabled, u64 *running);
 
 
diff --git a/kernel/events/core.c b/kernel/events/core.c
index a6abcd3..f7e4705 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -3643,7 +3643,27 @@ static void orphans_remove_work(struct work_struct *work)
 	put_ctx(ctx);
 }
 
-u64 perf_event_read_value(struct perf_event *event, u64 *enabled, u64 *running)
+static int perf_event_read_values(struct perf_event *leader)
+{
+	int ret;
+	struct perf_event *sub;
+	struct pmu *pmu;
+
+	pmu = leader->pmu;
+
+	pmu->start_txn(pmu, PERF_PMU_TXN_READ);
+
+	pmu->read(leader);
+	list_for_each_entry(sub, &leader->sibling_list, group_entry)
+		pmu->read(sub);
+
+	ret = pmu->commit_txn(pmu, PERF_PMU_TXN_READ);
+
+	return ret;
+}
+
+u64 perf_event_compute_values(struct perf_event *event, u64 *enabled,
+				u64 *running)
 {
 	struct perf_event *child;
 	u64 total = 0;
@@ -3653,7 +3673,6 @@ u64 perf_event_read_value(struct perf_event *event, u64 *enabled, u64 *running)
 
 	mutex_lock(&event->child_mutex);
 
-	perf_event_read(event);
 	total += perf_event_count(event);
 
 	*enabled += event->total_time_enabled +
@@ -3671,7 +3690,7 @@ u64 perf_event_read_value(struct perf_event *event, u64 *enabled, u64 *running)
 
 	return total;
 }
-EXPORT_SYMBOL_GPL(perf_event_read_value);
+EXPORT_SYMBOL_GPL(perf_event_compute_values);
 
 static int perf_event_read_group(struct perf_event *event,
 				   u64 read_format, char __user *buf)
@@ -3684,7 +3703,11 @@ static int perf_event_read_group(struct perf_event *event,
 
 	lockdep_assert_held(&ctx->mutex);
 
-	count = perf_event_read_value(leader, &enabled, &running);
+	ret = perf_event_read_values(leader);
+	if (ret)
+		return ret;
+
+	count = perf_event_compute_values(leader, &enabled, &running);
 
 	values[n++] = 1 + leader->nr_siblings;
 	if (read_format & PERF_FORMAT_TOTAL_TIME_ENABLED)
@@ -3705,7 +3728,7 @@ static int perf_event_read_group(struct perf_event *event,
 	list_for_each_entry(sub, &leader->sibling_list, group_entry) {
 		n = 0;
 
-		values[n++] = perf_event_read_value(sub, &enabled, &running);
+		values[n++] = perf_event_compute_values(sub, &enabled, &running);
 		if (read_format & PERF_FORMAT_ID)
 			values[n++] = primary_event_id(sub);
 
@@ -3728,7 +3751,8 @@ static int perf_event_read_one(struct perf_event *event,
 	u64 values[4];
 	int n = 0;
 
-	values[n++] = perf_event_read_value(event, &enabled, &running);
+	perf_event_read(event);
+	values[n++] = perf_event_compute_values(event, &enabled, &running);
 	if (read_format & PERF_FORMAT_TOTAL_TIME_ENABLED)
 		values[n++] = enabled;
 	if (read_format & PERF_FORMAT_TOTAL_TIME_RUNNING)


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH 4/4] perf/powerpc: Implement group_read() txn interface for 24x7 counters
  2015-03-26  5:57       ` Sukadev Bhattiprolu
@ 2015-04-01 18:10         ` Peter Zijlstra
  0 siblings, 0 replies; 22+ messages in thread
From: Peter Zijlstra @ 2015-04-01 18:10 UTC (permalink / raw)
  To: Sukadev Bhattiprolu
  Cc: Michael Ellerman, Paul Mackerras, dev, linux-kernel, linuxppc-dev

On Wed, Mar 25, 2015 at 10:57:05PM -0700, Sukadev Bhattiprolu wrote:
> Would something like this work?
> 

Sure, that looks fine.

Thanks.

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread

Thread overview: 22+ messages
2015-03-04  8:35 [PATCH 0/4] perf: Implement event group read using txn interface Sukadev Bhattiprolu
2015-03-04  8:35 ` [PATCH 1/4] perf: Add 'flags' parameter to pmu txn interfaces Sukadev Bhattiprolu
2015-03-17  6:47   ` Peter Zijlstra
2015-03-26  4:00     ` Sukadev Bhattiprolu
2015-03-04  8:35 ` [PATCH 2/4] perf: Split perf_event_read() and perf_event_count() Sukadev Bhattiprolu
2015-03-04  8:35 ` [PATCH 3/4] perf: Add 'update' parameter to perf_event_read_value() Sukadev Bhattiprolu
2015-03-17  6:54   ` Peter Zijlstra
2015-03-04  8:35 ` [PATCH 4/4] perf/powerpc: Implement group_read() txn interface for 24x7 counters Sukadev Bhattiprolu
2015-03-17  6:57   ` Peter Zijlstra
2015-03-26  5:57     ` Sukadev Bhattiprolu
2015-04-01 18:10       ` Peter Zijlstra
