linuxppc-dev.lists.ozlabs.org archive mirror
* [PATCH v2 00/10] powerpc/perf: Add support for power10 PMU Hardware
@ 2020-07-01  9:20 Athira Rajeev
  2020-07-01  9:20 ` [PATCH v2 01/10] powerpc/perf: Add support for ISA3.1 PMU SPRs Athira Rajeev
                   ` (9 more replies)
  0 siblings, 10 replies; 41+ messages in thread
From: Athira Rajeev @ 2020-07-01  9:20 UTC (permalink / raw)
  To: mpe; +Cc: mikey, maddy, linuxppc-dev

The patch series adds support for power10 PMU hardware.

Anju T Sudhakar (2):
  powerpc/perf: Add support for outputting extended regs in perf
    intr_regs
  tools/perf: Add perf tools support for extended register capability in
    powerpc

Athira Rajeev (5):
  KVM: PPC: Book3S HV: Save/restore new PMU registers
  powerpc/perf: Update Power PMU cache_events to u64 type
  powerpc/perf: power10 Performance Monitoring support
  powerpc/perf: support BHRB disable bit and new filtering modes
  powerpc/perf: Add extended regs support for power10 platform

Madhavan Srinivasan (3):
  powerpc/perf: Add support for ISA3.1 PMU SPRs
  powerpc/xmon: Add PowerISA v3.1 PMU SPRs
  powerpc/perf: Add power10_feat to dt_cpu_ftrs

---

Changes from v1 -> v2:
- Added support for extended regs in powerpc
  for the power9/power10 platforms (patches 8, 9, 10)
- Addressed change/removal of some event codes
  in the PMU driver

---

 arch/powerpc/include/asm/kvm_book3s_asm.h       |   2 +-
 arch/powerpc/include/asm/kvm_host.h             |   4 +-
 arch/powerpc/include/asm/perf_event_server.h    |  11 +-
 arch/powerpc/include/asm/processor.h            |   4 +
 arch/powerpc/include/asm/reg.h                  |   9 +
 arch/powerpc/include/uapi/asm/perf_regs.h       |  20 +-
 arch/powerpc/kernel/asm-offsets.c               |   3 +
 arch/powerpc/kernel/cpu_setup_power.S           |   7 +
 arch/powerpc/kernel/dt_cpu_ftrs.c               |  26 ++
 arch/powerpc/kernel/sysfs.c                     |   8 +
 arch/powerpc/kvm/book3s_hv.c                    |   6 +-
 arch/powerpc/kvm/book3s_hv_interrupts.S         |   8 +
 arch/powerpc/kvm/book3s_hv_rmhandlers.S         |  24 ++
 arch/powerpc/perf/Makefile                      |   2 +-
 arch/powerpc/perf/core-book3s.c                 |  61 +++-
 arch/powerpc/perf/generic-compat-pmu.c          |   2 +-
 arch/powerpc/perf/internal.h                    |   1 +
 arch/powerpc/perf/isa207-common.c               |  72 +++-
 arch/powerpc/perf/isa207-common.h               |  33 +-
 arch/powerpc/perf/mpc7450-pmu.c                 |   2 +-
 arch/powerpc/perf/perf_regs.c                   |  42 ++-
 arch/powerpc/perf/power10-events-list.h         |  70 ++++
 arch/powerpc/perf/power10-pmu.c                 | 425 ++++++++++++++++++++++++
 arch/powerpc/perf/power5+-pmu.c                 |   2 +-
 arch/powerpc/perf/power5-pmu.c                  |   2 +-
 arch/powerpc/perf/power6-pmu.c                  |   2 +-
 arch/powerpc/perf/power7-pmu.c                  |   2 +-
 arch/powerpc/perf/power8-pmu.c                  |   2 +-
 arch/powerpc/perf/power9-pmu.c                  |   8 +-
 arch/powerpc/perf/ppc970-pmu.c                  |   2 +-
 arch/powerpc/platforms/powernv/idle.c           |  14 +
 arch/powerpc/xmon/xmon.c                        |  15 +
 tools/arch/powerpc/include/uapi/asm/perf_regs.h |  20 +-
 tools/perf/arch/powerpc/include/perf_regs.h     |   8 +-
 tools/perf/arch/powerpc/util/perf_regs.c        |  61 ++++
 35 files changed, 939 insertions(+), 41 deletions(-)
 create mode 100644 arch/powerpc/perf/power10-events-list.h
 create mode 100644 arch/powerpc/perf/power10-pmu.c

-- 
1.8.3.1



* [PATCH v2 01/10] powerpc/perf: Add support for ISA3.1 PMU SPRs
  2020-07-01  9:20 [PATCH v2 00/10] powerpc/perf: Add support for power10 PMU Hardware Athira Rajeev
@ 2020-07-01  9:20 ` Athira Rajeev
  2020-07-08 11:02   ` Michael Ellerman
  2020-07-01  9:20 ` [PATCH v2 02/10] KVM: PPC: Book3S HV: Save/restore new PMU registers Athira Rajeev
                   ` (8 subsequent siblings)
  9 siblings, 1 reply; 41+ messages in thread
From: Athira Rajeev @ 2020-07-01  9:20 UTC (permalink / raw)
  To: mpe; +Cc: mikey, maddy, linuxppc-dev

From: Madhavan Srinivasan <maddy@linux.ibm.com>

PowerISA v3.1 includes new performance monitoring unit (PMU)
special purpose registers (SPRs). They are:

Monitor Mode Control Register 3 (MMCR3)
Sampled Instruction Event Register 2 (SIER2)
Sampled Instruction Event Register 3 (SIER3)

MMCR3 is added for further sampling-related configuration
control. SIER2/SIER3 are added to provide additional
information about the sampled instruction.

The patch adds a new PPMU flag called "PPMU_ARCH_310S" to support
handling of these new SPRs, updates struct thread_struct to
include them, and increases the size of the mmcr[] array in
struct cpu_hw_events by one to include MMCR3. This is needed
to support programming of the MMCR3 SPR during event enable/disable.
The patch also adds sysfs support for the MMCR3 SPR, along with
SPRN_ macros for these new PMU SPRs.
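
As a sketch of how the enlarged array is consumed (names are from the
diff below; the wrapper function itself is hypothetical):

	/*
	 * Sketch: cpu_hw_events.mmcr[] slots after this change are
	 * [0]=MMCR0, [1]=MMCR1, [2]=MMCRA, [3]=MMCR2 and the new
	 * [4]=MMCR3, written only when the PMU advertises the flag.
	 */
	static void sketch_write_mmcr3(struct cpu_hw_events *cpuhw)
	{
		if (ppmu->flags & PPMU_ARCH_310S)
			mtspr(SPRN_MMCR3, cpuhw->mmcr[4]);
	}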

Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
---
 arch/powerpc/include/asm/perf_event_server.h |  1 +
 arch/powerpc/include/asm/processor.h         |  4 ++++
 arch/powerpc/include/asm/reg.h               |  6 ++++++
 arch/powerpc/kernel/sysfs.c                  |  8 ++++++++
 arch/powerpc/perf/core-book3s.c              | 29 ++++++++++++++++++++++++++--
 5 files changed, 46 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/perf_event_server.h b/arch/powerpc/include/asm/perf_event_server.h
index 3e9703f..895aeaa 100644
--- a/arch/powerpc/include/asm/perf_event_server.h
+++ b/arch/powerpc/include/asm/perf_event_server.h
@@ -69,6 +69,7 @@ struct power_pmu {
 #define PPMU_HAS_SIER		0x00000040 /* Has SIER */
 #define PPMU_ARCH_207S		0x00000080 /* PMC is architecture v2.07S */
 #define PPMU_NO_SIAR		0x00000100 /* Do not use SIAR */
+#define PPMU_ARCH_310S		0x00000200 /* Has MMCR3, SIER2 and SIER3 */
 
 /*
  * Values for flags to get_alternatives()
diff --git a/arch/powerpc/include/asm/processor.h b/arch/powerpc/include/asm/processor.h
index 52a6783..a466e94 100644
--- a/arch/powerpc/include/asm/processor.h
+++ b/arch/powerpc/include/asm/processor.h
@@ -272,6 +272,10 @@ struct thread_struct {
 	unsigned 	mmcr0;
 
 	unsigned 	used_ebb;
+	unsigned long   mmcr3;
+	unsigned long   sier2;
+	unsigned long   sier3;
+
 #endif
 };
 
diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h
index 88e6c78..21a1b2d 100644
--- a/arch/powerpc/include/asm/reg.h
+++ b/arch/powerpc/include/asm/reg.h
@@ -876,7 +876,9 @@
 #define   MMCR0_FCHV	0x00000001UL /* freeze conditions in hypervisor mode */
 #define SPRN_MMCR1	798
 #define SPRN_MMCR2	785
+#define SPRN_MMCR3	754
 #define SPRN_UMMCR2	769
+#define SPRN_UMMCR3	738
 #define SPRN_MMCRA	0x312
 #define   MMCRA_SDSYNC	0x80000000UL /* SDAR synced with SIAR */
 #define   MMCRA_SDAR_DCACHE_MISS 0x40000000UL
@@ -918,6 +920,10 @@
 #define   SIER_SIHV		0x1000000	/* Sampled MSR_HV */
 #define   SIER_SIAR_VALID	0x0400000	/* SIAR contents valid */
 #define   SIER_SDAR_VALID	0x0200000	/* SDAR contents valid */
+#define SPRN_SIER2	752
+#define SPRN_SIER3	753
+#define SPRN_USIER2	736
+#define SPRN_USIER3	737
 #define SPRN_SIAR	796
 #define SPRN_SDAR	797
 #define SPRN_TACR	888
diff --git a/arch/powerpc/kernel/sysfs.c b/arch/powerpc/kernel/sysfs.c
index 571b325..46b4ebc 100644
--- a/arch/powerpc/kernel/sysfs.c
+++ b/arch/powerpc/kernel/sysfs.c
@@ -622,8 +622,10 @@ void ppc_enable_pmcs(void)
 SYSFS_PMCSETUP(pmc8, SPRN_PMC8);
 
 SYSFS_PMCSETUP(mmcra, SPRN_MMCRA);
+SYSFS_PMCSETUP(mmcr3, SPRN_MMCR3);
 
 static DEVICE_ATTR(mmcra, 0600, show_mmcra, store_mmcra);
+static DEVICE_ATTR(mmcr3, 0600, show_mmcr3, store_mmcr3);
 #endif /* HAS_PPC_PMC56 */
 
 
@@ -886,6 +888,9 @@ static int register_cpu_online(unsigned int cpu)
 #ifdef	CONFIG_PMU_SYSFS
 	if (cpu_has_feature(CPU_FTR_MMCRA))
 		device_create_file(s, &dev_attr_mmcra);
+
+	if (cpu_has_feature(CPU_FTR_ARCH_31))
+		device_create_file(s, &dev_attr_mmcr3);
 #endif /* CONFIG_PMU_SYSFS */
 
 	if (cpu_has_feature(CPU_FTR_PURR)) {
@@ -980,6 +985,9 @@ static int unregister_cpu_online(unsigned int cpu)
 #ifdef CONFIG_PMU_SYSFS
 	if (cpu_has_feature(CPU_FTR_MMCRA))
 		device_remove_file(s, &dev_attr_mmcra);
+
+	if (cpu_has_feature(CPU_FTR_ARCH_31))
+		device_remove_file(s, &dev_attr_mmcr3);
 #endif /* CONFIG_PMU_SYSFS */
 
 	if (cpu_has_feature(CPU_FTR_PURR)) {
diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
index cd6a742..5c64bd3 100644
--- a/arch/powerpc/perf/core-book3s.c
+++ b/arch/powerpc/perf/core-book3s.c
@@ -39,10 +39,10 @@ struct cpu_hw_events {
 	unsigned int flags[MAX_HWEVENTS];
 	/*
 	 * The order of the MMCR array is:
-	 *  - 64-bit, MMCR0, MMCR1, MMCRA, MMCR2
+	 *  - 64-bit, MMCR0, MMCR1, MMCRA, MMCR2, MMCR3
 	 *  - 32-bit, MMCR0, MMCR1, MMCR2
 	 */
-	unsigned long mmcr[4];
+	unsigned long mmcr[5];
 	struct perf_event *limited_counter[MAX_LIMITED_HWCOUNTERS];
 	u8  limited_hwidx[MAX_LIMITED_HWCOUNTERS];
 	u64 alternatives[MAX_HWEVENTS][MAX_EVENT_ALTERNATIVES];
@@ -586,6 +586,11 @@ static void ebb_switch_out(unsigned long mmcr0)
 	current->thread.sdar  = mfspr(SPRN_SDAR);
 	current->thread.mmcr0 = mmcr0 & MMCR0_USER_MASK;
 	current->thread.mmcr2 = mfspr(SPRN_MMCR2) & MMCR2_USER_MASK;
+	if (ppmu->flags & PPMU_ARCH_310S) {
+		current->thread.mmcr3 = mfspr(SPRN_MMCR3);
+		current->thread.sier2 = mfspr(SPRN_SIER2);
+		current->thread.sier3 = mfspr(SPRN_SIER3);
+	}
 }
 
 static unsigned long ebb_switch_in(bool ebb, struct cpu_hw_events *cpuhw)
@@ -625,6 +630,12 @@ static unsigned long ebb_switch_in(bool ebb, struct cpu_hw_events *cpuhw)
 	 * instead manage the MMCR2 entirely by itself.
 	 */
 	mtspr(SPRN_MMCR2, cpuhw->mmcr[3] | current->thread.mmcr2);
+
+	if (ppmu->flags & PPMU_ARCH_310S) {
+		mtspr(SPRN_MMCR3, current->thread.mmcr3);
+		mtspr(SPRN_SIER2, current->thread.sier2);
+		mtspr(SPRN_SIER3, current->thread.sier3);
+	}
 out:
 	return mmcr0;
 }
@@ -845,6 +856,11 @@ void perf_event_print_debug(void)
 		pr_info("EBBRR: %016lx BESCR: %016lx\n",
 			mfspr(SPRN_EBBRR), mfspr(SPRN_BESCR));
 	}
+
+	if (ppmu->flags & PPMU_ARCH_310S) {
+		pr_info("MMCR3: %016lx SIER2: %016lx SIER3: %016lx\n",
+		mfspr(SPRN_MMCR3), mfspr(SPRN_SIER2), mfspr(SPRN_SIER3));
+	}
 #endif
 	pr_info("SIAR:  %016lx SDAR:  %016lx SIER:  %016lx\n",
 		mfspr(SPRN_SIAR), sdar, sier);
@@ -1310,6 +1326,10 @@ static void power_pmu_enable(struct pmu *pmu)
 	if (!cpuhw->n_added) {
 		mtspr(SPRN_MMCRA, cpuhw->mmcr[2] & ~MMCRA_SAMPLE_ENABLE);
 		mtspr(SPRN_MMCR1, cpuhw->mmcr[1]);
+#ifdef CONFIG_PPC64
+		if (ppmu->flags & PPMU_ARCH_310S)
+			mtspr(SPRN_MMCR3, cpuhw->mmcr[4]);
+#endif /* CONFIG_PPC64 */
 		goto out_enable;
 	}
 
@@ -1353,6 +1373,11 @@ static void power_pmu_enable(struct pmu *pmu)
 	if (ppmu->flags & PPMU_ARCH_207S)
 		mtspr(SPRN_MMCR2, cpuhw->mmcr[3]);
 
+#ifdef CONFIG_PPC64
+	if (ppmu->flags & PPMU_ARCH_310S)
+		mtspr(SPRN_MMCR3, cpuhw->mmcr[4]);
+#endif /* CONFIG_PPC64 */
+
 	/*
 	 * Read off any pre-existing events that need to move
 	 * to another PMC.
-- 
1.8.3.1



* [PATCH v2 02/10] KVM: PPC: Book3S HV: Save/restore new PMU registers
  2020-07-01  9:20 [PATCH v2 00/10] powerpc/perf: Add support for power10 PMU Hardware Athira Rajeev
  2020-07-01  9:20 ` [PATCH v2 01/10] powerpc/perf: Add support for ISA3.1 PMU SPRs Athira Rajeev
@ 2020-07-01  9:20 ` Athira Rajeev
  2020-07-01 11:11   ` Paul Mackerras
  2020-07-07  6:13   ` Michael Neuling
  2020-07-01  9:20 ` [PATCH v2 03/10] powerpc/xmon: Add PowerISA v3.1 PMU SPRs Athira Rajeev
                   ` (7 subsequent siblings)
  9 siblings, 2 replies; 41+ messages in thread
From: Athira Rajeev @ 2020-07-01  9:20 UTC (permalink / raw)
  To: mpe; +Cc: mikey, maddy, linuxppc-dev

PowerISA v3.1 has added new performance monitoring unit (PMU)
special purpose registers (SPRs). They are:

Monitor Mode Control Register 3 (MMCR3)
Sampled Instruction Event Register 2 (SIER2)
Sampled Instruction Event Register 3 (SIER3)

The patch adds support to save/restore these new
SPRs while entering/exiting the guest.
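
For reference, a sketch of the save-area layouts the hunks below rely
on (it assumes the SIER/SIER2/SIER3 ONE_REG ids are consecutive, which
the indexing requires):

	/*
	 * kvmppc_host_state.host_mmcr[] after this change:
	 *   [0]=MMCR0 [1]=MMCR1 [2]=MMCRA [3]=SIAR [4]=SDAR
	 *   [5]=MMCR2 [6]=SIER  [7]=MMCR3 [8]=SIER2 [9]=SIER3
	 * kvm_vcpu_arch.sier[]: [0]=SIER [1]=SIER2 [2]=SIER3,
	 * accessed in the ONE_REG handlers as:
	 */
	i = id - KVM_REG_PPC_SIER;	/* 0, 1 or 2 */
	*val = get_reg_val(id, vcpu->arch.sier[i]);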

Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/kvm_book3s_asm.h |  2 +-
 arch/powerpc/include/asm/kvm_host.h       |  4 ++--
 arch/powerpc/kernel/asm-offsets.c         |  3 +++
 arch/powerpc/kvm/book3s_hv.c              |  6 ++++--
 arch/powerpc/kvm/book3s_hv_interrupts.S   |  8 ++++++++
 arch/powerpc/kvm/book3s_hv_rmhandlers.S   | 24 ++++++++++++++++++++++++
 6 files changed, 42 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_book3s_asm.h b/arch/powerpc/include/asm/kvm_book3s_asm.h
index 45704f2..078f464 100644
--- a/arch/powerpc/include/asm/kvm_book3s_asm.h
+++ b/arch/powerpc/include/asm/kvm_book3s_asm.h
@@ -119,7 +119,7 @@ struct kvmppc_host_state {
 	void __iomem *xive_tima_virt;
 	u32 saved_xirr;
 	u64 dabr;
-	u64 host_mmcr[7];	/* MMCR 0,1,A, SIAR, SDAR, MMCR2, SIER */
+	u64 host_mmcr[10];	/* MMCR 0,1,A, SIAR, SDAR, MMCR2, SIER, MMCR3, SIER2/3 */
 	u32 host_pmc[8];
 	u64 host_purr;
 	u64 host_spurr;
diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index 7e2d061..d718061 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -637,12 +637,12 @@ struct kvm_vcpu_arch {
 	u32 ccr1;
 	u32 dbsr;
 
-	u64 mmcr[5];
+	u64 mmcr[6];
 	u32 pmc[8];
 	u32 spmc[2];
 	u64 siar;
 	u64 sdar;
-	u64 sier;
+	u64 sier[3];
 #ifdef CONFIG_PPC_TRANSACTIONAL_MEM
 	u64 tfhar;
 	u64 texasr;
diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c
index 6657dc6..20a8b1e 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -696,6 +696,9 @@ int main(void)
 	HSTATE_FIELD(HSTATE_SDAR, host_mmcr[4]);
 	HSTATE_FIELD(HSTATE_MMCR2, host_mmcr[5]);
 	HSTATE_FIELD(HSTATE_SIER, host_mmcr[6]);
+	HSTATE_FIELD(HSTATE_MMCR3, host_mmcr[7]);
+	HSTATE_FIELD(HSTATE_SIER2, host_mmcr[8]);
+	HSTATE_FIELD(HSTATE_SIER3, host_mmcr[9]);
 	HSTATE_FIELD(HSTATE_PMC1, host_pmc[0]);
 	HSTATE_FIELD(HSTATE_PMC2, host_pmc[1]);
 	HSTATE_FIELD(HSTATE_PMC3, host_pmc[2]);
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 6bf66649..c265800 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -1698,7 +1698,8 @@ static int kvmppc_get_one_reg_hv(struct kvm_vcpu *vcpu, u64 id,
 		*val = get_reg_val(id, vcpu->arch.sdar);
 		break;
 	case KVM_REG_PPC_SIER:
-		*val = get_reg_val(id, vcpu->arch.sier);
+		i = id - KVM_REG_PPC_SIER;
+		*val = get_reg_val(id, vcpu->arch.sier[i]);
 		break;
 	case KVM_REG_PPC_IAMR:
 		*val = get_reg_val(id, vcpu->arch.iamr);
@@ -1919,7 +1920,8 @@ static int kvmppc_set_one_reg_hv(struct kvm_vcpu *vcpu, u64 id,
 		vcpu->arch.sdar = set_reg_val(id, *val);
 		break;
 	case KVM_REG_PPC_SIER:
-		vcpu->arch.sier = set_reg_val(id, *val);
+		i = id - KVM_REG_PPC_SIER;
+		vcpu->arch.sier[i] = set_reg_val(id, *val);
 		break;
 	case KVM_REG_PPC_IAMR:
 		vcpu->arch.iamr = set_reg_val(id, *val);
diff --git a/arch/powerpc/kvm/book3s_hv_interrupts.S b/arch/powerpc/kvm/book3s_hv_interrupts.S
index 63fd81f..59822cb 100644
--- a/arch/powerpc/kvm/book3s_hv_interrupts.S
+++ b/arch/powerpc/kvm/book3s_hv_interrupts.S
@@ -140,6 +140,14 @@ BEGIN_FTR_SECTION
 	std	r8, HSTATE_MMCR2(r13)
 	std	r9, HSTATE_SIER(r13)
 END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S)
+BEGIN_FTR_SECTION
+	mfspr	r5, SPRN_MMCR3
+	mfspr	r6, SPRN_SIER2
+	mfspr	r7, SPRN_SIER3
+	std	r5, HSTATE_MMCR3(r13)
+	std	r6, HSTATE_SIER2(r13)
+	std	r7, HSTATE_SIER3(r13)
+END_FTR_SECTION_IFSET(CPU_FTR_ARCH_31)
 	mfspr	r3, SPRN_PMC1
 	mfspr	r5, SPRN_PMC2
 	mfspr	r6, SPRN_PMC3
diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
index 7194389..57b6c14 100644
--- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
@@ -3436,6 +3436,14 @@ END_FTR_SECTION_IFSET(CPU_FTR_PMAO_BUG)
 	mtspr	SPRN_SIAR, r7
 	mtspr	SPRN_SDAR, r8
 BEGIN_FTR_SECTION
+	ld	r5, VCPU_MMCR + 40(r4)
+	ld	r6, VCPU_SIER + 8(r4)
+	ld	r7, VCPU_SIER + 16(r4)
+	mtspr	SPRN_MMCR3, r5
+	mtspr	SPRN_SIER2, r6
+	mtspr	SPRN_SIER3, r7
+END_FTR_SECTION_IFSET(CPU_FTR_ARCH_31)
+BEGIN_FTR_SECTION
 	ld	r5, VCPU_MMCR + 24(r4)
 	ld	r6, VCPU_SIER(r4)
 	mtspr	SPRN_MMCR2, r5
@@ -3496,6 +3504,14 @@ BEGIN_FTR_SECTION
 	mtspr	SPRN_MMCR2, r8
 	mtspr	SPRN_SIER, r9
 END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S)
+BEGIN_FTR_SECTION
+	ld	r5, HSTATE_MMCR3(r13)
+	ld	r6, HSTATE_SIER2(r13)
+	ld	r7, HSTATE_SIER3(r13)
+	mtspr	SPRN_MMCR3, r5
+	mtspr	SPRN_SIER2, r6
+	mtspr	SPRN_SIER3, r7
+END_FTR_SECTION_IFSET(CPU_FTR_ARCH_31)
 	mtspr	SPRN_MMCR0, r3
 	isync
 	mtlr	r0
@@ -3555,6 +3571,14 @@ END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S)
 BEGIN_FTR_SECTION
 	std	r10, VCPU_MMCR + 24(r9)
 END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S)
+BEGIN_FTR_SECTION
+	mfspr	r5, SPRN_MMCR3
+	mfspr	r6, SPRN_SIER2
+	mfspr	r7, SPRN_SIER3
+	std	r5, VCPU_MMCR + 40(r9)
+	std	r6, VCPU_SIER + 8(r9)
+	std	r7, VCPU_SIER + 16(r9)
+END_FTR_SECTION_IFSET(CPU_FTR_ARCH_31)
 	std	r7, VCPU_SIAR(r9)
 	std	r8, VCPU_SDAR(r9)
 	mfspr	r3, SPRN_PMC1
-- 
1.8.3.1



* [PATCH v2 03/10] powerpc/xmon: Add PowerISA v3.1 PMU SPRs
  2020-07-01  9:20 [PATCH v2 00/10] powerpc/perf: Add support for power10 PMU Hardware Athira Rajeev
  2020-07-01  9:20 ` [PATCH v2 01/10] powerpc/perf: Add support for ISA3.1 PMU SPRs Athira Rajeev
  2020-07-01  9:20 ` [PATCH v2 02/10] KVM: PPC: Book3S HV: Save/restore new PMU registers Athira Rajeev
@ 2020-07-01  9:20 ` Athira Rajeev
  2020-07-08 11:04   ` Michael Ellerman
  2020-07-01  9:20 ` [PATCH v2 04/10] powerpc/perf: Add power10_feat to dt_cpu_ftrs Athira Rajeev
                   ` (6 subsequent siblings)
  9 siblings, 1 reply; 41+ messages in thread
From: Athira Rajeev @ 2020-07-01  9:20 UTC (permalink / raw)
  To: mpe; +Cc: mikey, maddy, linuxppc-dev

From: Madhavan Srinivasan <maddy@linux.ibm.com>

PowerISA v3.1 added three new performance
monitoring unit (PMU) special purpose registers (SPRs).
They are Monitor Mode Control Register 3 (MMCR3),
Sampled Instruction Event Register 2 (SIER2), and
Sampled Instruction Event Register 3 (SIER3).

The patch adds a new dump function, dump_310_sprs(),
to print these SPR values.

Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
---
 arch/powerpc/xmon/xmon.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/arch/powerpc/xmon/xmon.c b/arch/powerpc/xmon/xmon.c
index 7efe4bc..8917fe8 100644
--- a/arch/powerpc/xmon/xmon.c
+++ b/arch/powerpc/xmon/xmon.c
@@ -2022,6 +2022,20 @@ static void dump_300_sprs(void)
 #endif
 }
 
+static void dump_310_sprs(void)
+{
+#ifdef CONFIG_PPC64
+	if (!cpu_has_feature(CPU_FTR_ARCH_31))
+		return;
+
+	printf("mmcr3  = %.16lx\n",
+		mfspr(SPRN_MMCR3));
+
+	printf("sier2  = %.16lx  sier3  = %.16lx\n",
+		mfspr(SPRN_SIER2), mfspr(SPRN_SIER3));
+#endif
+}
+
 static void dump_one_spr(int spr, bool show_unimplemented)
 {
 	unsigned long val;
@@ -2076,6 +2090,7 @@ static void super_regs(void)
 		dump_206_sprs();
 		dump_207_sprs();
 		dump_300_sprs();
+		dump_310_sprs();
 
 		return;
 	}
-- 
1.8.3.1



* [PATCH v2 04/10] powerpc/perf: Add power10_feat to dt_cpu_ftrs
  2020-07-01  9:20 [PATCH v2 00/10] powerpc/perf: Add support for power10 PMU Hardware Athira Rajeev
                   ` (2 preceding siblings ...)
  2020-07-01  9:20 ` [PATCH v2 03/10] powerpc/xmon: Add PowerISA v3.1 PMU SPRs Athira Rajeev
@ 2020-07-01  9:20 ` Athira Rajeev
  2020-07-07  6:22   ` Michael Neuling
  2020-07-08 11:15   ` Michael Ellerman
  2020-07-01  9:20 ` [PATCH v2 05/10] powerpc/perf: Update Power PMU cache_events to u64 type Athira Rajeev
                   ` (5 subsequent siblings)
  9 siblings, 2 replies; 41+ messages in thread
From: Athira Rajeev @ 2020-07-01  9:20 UTC (permalink / raw)
  To: mpe; +Cc: mikey, maddy, linuxppc-dev

From: Madhavan Srinivasan <maddy@linux.ibm.com>

Add a power10 feature function to dt_cpu_ftrs.c along
with a power10-specific init() to initialize the PMU SPRs.
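
One note on the new constant used by that init: MMCRA_BHRB_DISABLE is
IBM bit 26 of the 64-bit MMCRA (the BHRBRD field, see patch 07), which
in LSB-0 terms is:

	/* Sketch: 1UL << (63 - 26) == 0x0000002000000000 */
	#define MMCRA_BHRB_DISABLE	(1UL << (63 - 26))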

Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
---
 arch/powerpc/include/asm/reg.h        |  3 +++
 arch/powerpc/kernel/cpu_setup_power.S |  7 +++++++
 arch/powerpc/kernel/dt_cpu_ftrs.c     | 26 ++++++++++++++++++++++++++
 3 files changed, 36 insertions(+)

diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h
index 21a1b2d..900ada1 100644
--- a/arch/powerpc/include/asm/reg.h
+++ b/arch/powerpc/include/asm/reg.h
@@ -1068,6 +1068,9 @@
 #define MMCR0_PMC2_LOADMISSTIME	0x5
 #endif
 
+/* BHRB disable bit for PowerISA v3.1 */
+#define MMCRA_BHRB_DISABLE	0x0000002000000000
+
 /*
  * SPRG usage:
  *
diff --git a/arch/powerpc/kernel/cpu_setup_power.S b/arch/powerpc/kernel/cpu_setup_power.S
index efdcfa7..e8b3370c 100644
--- a/arch/powerpc/kernel/cpu_setup_power.S
+++ b/arch/powerpc/kernel/cpu_setup_power.S
@@ -233,3 +233,10 @@ __init_PMU_ISA207:
 	li	r5,0
 	mtspr	SPRN_MMCRS,r5
 	blr
+
+__init_PMU_ISA31:
+	li	r5,0
+	mtspr	SPRN_MMCR3,r5
+	LOAD_REG_IMMEDIATE(r5, MMCRA_BHRB_DISABLE)
+	mtspr	SPRN_MMCRA,r5
+	blr
diff --git a/arch/powerpc/kernel/dt_cpu_ftrs.c b/arch/powerpc/kernel/dt_cpu_ftrs.c
index a0edeb3..14a513f 100644
--- a/arch/powerpc/kernel/dt_cpu_ftrs.c
+++ b/arch/powerpc/kernel/dt_cpu_ftrs.c
@@ -449,6 +449,31 @@ static int __init feat_enable_pmu_power9(struct dt_cpu_feature *f)
 	return 1;
 }
 
+static void init_pmu_power10(void)
+{
+	init_pmu_power9();
+
+	mtspr(SPRN_MMCR3, 0);
+	mtspr(SPRN_MMCRA, MMCRA_BHRB_DISABLE);
+}
+
+static int __init feat_enable_pmu_power10(struct dt_cpu_feature *f)
+{
+	hfscr_pmu_enable();
+
+	init_pmu_power10();
+	init_pmu_registers = init_pmu_power10;
+
+	cur_cpu_spec->cpu_features |= CPU_FTR_MMCRA;
+	cur_cpu_spec->cpu_user_features |= PPC_FEATURE_PSERIES_PERFMON_COMPAT;
+
+	cur_cpu_spec->num_pmcs          = 6;
+	cur_cpu_spec->pmc_type          = PPC_PMC_IBM;
+	cur_cpu_spec->oprofile_cpu_type = "ppc64/power10";
+
+	return 1;
+}
+
 static int __init feat_enable_tm(struct dt_cpu_feature *f)
 {
 #ifdef CONFIG_PPC_TRANSACTIONAL_MEM
@@ -638,6 +663,7 @@ struct dt_cpu_feature_match {
 	{"pc-relative-addressing", feat_enable, 0},
 	{"machine-check-power9", feat_enable_mce_power9, 0},
 	{"performance-monitor-power9", feat_enable_pmu_power9, 0},
+	{"performance-monitor-power10", feat_enable_pmu_power10, 0},
 	{"event-based-branch-v3", feat_enable, 0},
 	{"random-number-generator", feat_enable, 0},
 	{"system-call-vectored", feat_disable, 0},
-- 
1.8.3.1



* [PATCH v2 05/10] powerpc/perf: Update Power PMU cache_events to u64 type
  2020-07-01  9:20 [PATCH v2 00/10] powerpc/perf: Add support for power10 PMU Hardware Athira Rajeev
                   ` (3 preceding siblings ...)
  2020-07-01  9:20 ` [PATCH v2 04/10] powerpc/perf: Add power10_feat to dt_cpu_ftrs Athira Rajeev
@ 2020-07-01  9:20 ` Athira Rajeev
  2020-07-01  9:20 ` [PATCH v2 06/10] powerpc/perf: power10 Performance Monitoring support Athira Rajeev
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 41+ messages in thread
From: Athira Rajeev @ 2020-07-01  9:20 UTC (permalink / raw)
  To: mpe; +Cc: mikey, maddy, linuxppc-dev

Events of type PERF_TYPE_HW_CACHE are described for the Power PMU
as: int (*cache_events)[type][op][result];

where the type, op and result values unpacked from the event attribute
config value are used to generate the raw event code at runtime.

So far the event code values used to create these cache-related
events fit within 32 bits, so the `int` type worked. In power10,
some of the event codes are 64-bit values, hence update the
Power PMU cache_events to `u64` type in the `power_pmu` struct.
Also propagate this change to all existing PMU driver code paths
which use ppmu->cache_events.
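
For context, a sketch of how the table is indexed (mirroring
hw_perf_cache_event() in core-book3s.c; the packing of attr.config is
the standard PERF_TYPE_HW_CACHE encoding, and the only change here is
that the looked-up value is now a u64):

	u64 ev;
	unsigned long type, op, result;

	/* attr.config packs indices as id | (op << 8) | (result << 16) */
	type   = (config >> 0)  & 0xff;
	op     = (config >> 8)  & 0xff;
	result = (config >> 16) & 0xff;

	ev = (*ppmu->cache_events)[type][op][result];	/* was int */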

Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/perf_event_server.h | 2 +-
 arch/powerpc/perf/core-book3s.c              | 2 +-
 arch/powerpc/perf/generic-compat-pmu.c       | 2 +-
 arch/powerpc/perf/mpc7450-pmu.c              | 2 +-
 arch/powerpc/perf/power5+-pmu.c              | 2 +-
 arch/powerpc/perf/power5-pmu.c               | 2 +-
 arch/powerpc/perf/power6-pmu.c               | 2 +-
 arch/powerpc/perf/power7-pmu.c               | 2 +-
 arch/powerpc/perf/power8-pmu.c               | 2 +-
 arch/powerpc/perf/power9-pmu.c               | 2 +-
 arch/powerpc/perf/ppc970-pmu.c               | 2 +-
 11 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/arch/powerpc/include/asm/perf_event_server.h b/arch/powerpc/include/asm/perf_event_server.h
index 895aeaa..cb207f8 100644
--- a/arch/powerpc/include/asm/perf_event_server.h
+++ b/arch/powerpc/include/asm/perf_event_server.h
@@ -47,7 +47,7 @@ struct power_pmu {
 	const struct attribute_group	**attr_groups;
 	int		n_generic;
 	int		*generic_events;
-	int		(*cache_events)[PERF_COUNT_HW_CACHE_MAX]
+	u64		(*cache_events)[PERF_COUNT_HW_CACHE_MAX]
 			       [PERF_COUNT_HW_CACHE_OP_MAX]
 			       [PERF_COUNT_HW_CACHE_RESULT_MAX];
 
diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
index 5c64bd3..58bfb9a 100644
--- a/arch/powerpc/perf/core-book3s.c
+++ b/arch/powerpc/perf/core-book3s.c
@@ -1820,7 +1820,7 @@ static void hw_perf_event_destroy(struct perf_event *event)
 static int hw_perf_cache_event(u64 config, u64 *eventp)
 {
 	unsigned long type, op, result;
-	int ev;
+	u64 ev;
 
 	if (!ppmu->cache_events)
 		return -EINVAL;
diff --git a/arch/powerpc/perf/generic-compat-pmu.c b/arch/powerpc/perf/generic-compat-pmu.c
index 5e5a54d..eb8a6aaf 100644
--- a/arch/powerpc/perf/generic-compat-pmu.c
+++ b/arch/powerpc/perf/generic-compat-pmu.c
@@ -101,7 +101,7 @@ enum {
  * 0 means not supported, -1 means nonsensical, other values
  * are event codes.
  */
-static int generic_compat_cache_events[C(MAX)][C(OP_MAX)][C(RESULT_MAX)] = {
+static u64 generic_compat_cache_events[C(MAX)][C(OP_MAX)][C(RESULT_MAX)] = {
 	[ C(L1D) ] = {
 		[ C(OP_READ) ] = {
 			[ C(RESULT_ACCESS) ] = 0,
diff --git a/arch/powerpc/perf/mpc7450-pmu.c b/arch/powerpc/perf/mpc7450-pmu.c
index 4d5ef92..cf1eb89 100644
--- a/arch/powerpc/perf/mpc7450-pmu.c
+++ b/arch/powerpc/perf/mpc7450-pmu.c
@@ -354,7 +354,7 @@ static void mpc7450_disable_pmc(unsigned int pmc, unsigned long mmcr[])
  * 0 means not supported, -1 means nonsensical, other values
  * are event codes.
  */
-static int mpc7450_cache_events[C(MAX)][C(OP_MAX)][C(RESULT_MAX)] = {
+static u64 mpc7450_cache_events[C(MAX)][C(OP_MAX)][C(RESULT_MAX)] = {
 	[C(L1D)] = {		/* 	RESULT_ACCESS	RESULT_MISS */
 		[C(OP_READ)] = {	0,		0x225	},
 		[C(OP_WRITE)] = {	0,		0x227	},
diff --git a/arch/powerpc/perf/power5+-pmu.c b/arch/powerpc/perf/power5+-pmu.c
index f857454..9252281 100644
--- a/arch/powerpc/perf/power5+-pmu.c
+++ b/arch/powerpc/perf/power5+-pmu.c
@@ -618,7 +618,7 @@ static void power5p_disable_pmc(unsigned int pmc, unsigned long mmcr[])
  * 0 means not supported, -1 means nonsensical, other values
  * are event codes.
  */
-static int power5p_cache_events[C(MAX)][C(OP_MAX)][C(RESULT_MAX)] = {
+static u64 power5p_cache_events[C(MAX)][C(OP_MAX)][C(RESULT_MAX)] = {
 	[C(L1D)] = {		/* 	RESULT_ACCESS	RESULT_MISS */
 		[C(OP_READ)] = {	0x1c10a8,	0x3c1088	},
 		[C(OP_WRITE)] = {	0x2c10a8,	0xc10c3		},
diff --git a/arch/powerpc/perf/power5-pmu.c b/arch/powerpc/perf/power5-pmu.c
index da52eca..3b36630 100644
--- a/arch/powerpc/perf/power5-pmu.c
+++ b/arch/powerpc/perf/power5-pmu.c
@@ -560,7 +560,7 @@ static void power5_disable_pmc(unsigned int pmc, unsigned long mmcr[])
  * 0 means not supported, -1 means nonsensical, other values
  * are event codes.
  */
-static int power5_cache_events[C(MAX)][C(OP_MAX)][C(RESULT_MAX)] = {
+static u64 power5_cache_events[C(MAX)][C(OP_MAX)][C(RESULT_MAX)] = {
 	[C(L1D)] = {		/* 	RESULT_ACCESS	RESULT_MISS */
 		[C(OP_READ)] = {	0x4c1090,	0x3c1088	},
 		[C(OP_WRITE)] = {	0x3c1090,	0xc10c3		},
diff --git a/arch/powerpc/perf/power6-pmu.c b/arch/powerpc/perf/power6-pmu.c
index 3929cac..540b78d 100644
--- a/arch/powerpc/perf/power6-pmu.c
+++ b/arch/powerpc/perf/power6-pmu.c
@@ -481,7 +481,7 @@ static void p6_disable_pmc(unsigned int pmc, unsigned long mmcr[])
  * are event codes.
  * The "DTLB" and "ITLB" events relate to the DERAT and IERAT.
  */
-static int power6_cache_events[C(MAX)][C(OP_MAX)][C(RESULT_MAX)] = {
+static u64 power6_cache_events[C(MAX)][C(OP_MAX)][C(RESULT_MAX)] = {
 	[C(L1D)] = {		/* 	RESULT_ACCESS	RESULT_MISS */
 		[C(OP_READ)] = {	0x280030,	0x80080		},
 		[C(OP_WRITE)] = {	0x180032,	0x80088		},
diff --git a/arch/powerpc/perf/power7-pmu.c b/arch/powerpc/perf/power7-pmu.c
index a137813..2b7f375 100644
--- a/arch/powerpc/perf/power7-pmu.c
+++ b/arch/powerpc/perf/power7-pmu.c
@@ -332,7 +332,7 @@ static void power7_disable_pmc(unsigned int pmc, unsigned long mmcr[])
  * 0 means not supported, -1 means nonsensical, other values
  * are event codes.
  */
-static int power7_cache_events[C(MAX)][C(OP_MAX)][C(RESULT_MAX)] = {
+static u64 power7_cache_events[C(MAX)][C(OP_MAX)][C(RESULT_MAX)] = {
 	[C(L1D)] = {		/* 	RESULT_ACCESS	RESULT_MISS */
 		[C(OP_READ)] = {	0xc880,		0x400f0	},
 		[C(OP_WRITE)] = {	0,		0x300f0	},
diff --git a/arch/powerpc/perf/power8-pmu.c b/arch/powerpc/perf/power8-pmu.c
index 3a5fcc2..5282e84 100644
--- a/arch/powerpc/perf/power8-pmu.c
+++ b/arch/powerpc/perf/power8-pmu.c
@@ -253,7 +253,7 @@ static void power8_config_bhrb(u64 pmu_bhrb_filter)
  * 0 means not supported, -1 means nonsensical, other values
  * are event codes.
  */
-static int power8_cache_events[C(MAX)][C(OP_MAX)][C(RESULT_MAX)] = {
+static u64 power8_cache_events[C(MAX)][C(OP_MAX)][C(RESULT_MAX)] = {
 	[ C(L1D) ] = {
 		[ C(OP_READ) ] = {
 			[ C(RESULT_ACCESS) ] = PM_LD_REF_L1,
diff --git a/arch/powerpc/perf/power9-pmu.c b/arch/powerpc/perf/power9-pmu.c
index 08c3ef7..05dae38 100644
--- a/arch/powerpc/perf/power9-pmu.c
+++ b/arch/powerpc/perf/power9-pmu.c
@@ -310,7 +310,7 @@ static void power9_config_bhrb(u64 pmu_bhrb_filter)
  * 0 means not supported, -1 means nonsensical, other values
  * are event codes.
  */
-static int power9_cache_events[C(MAX)][C(OP_MAX)][C(RESULT_MAX)] = {
+static u64 power9_cache_events[C(MAX)][C(OP_MAX)][C(RESULT_MAX)] = {
 	[ C(L1D) ] = {
 		[ C(OP_READ) ] = {
 			[ C(RESULT_ACCESS) ] = PM_LD_REF_L1,
diff --git a/arch/powerpc/perf/ppc970-pmu.c b/arch/powerpc/perf/ppc970-pmu.c
index 4035d93..2970d1e 100644
--- a/arch/powerpc/perf/ppc970-pmu.c
+++ b/arch/powerpc/perf/ppc970-pmu.c
@@ -432,7 +432,7 @@ static void p970_disable_pmc(unsigned int pmc, unsigned long mmcr[])
  * 0 means not supported, -1 means nonsensical, other values
  * are event codes.
  */
-static int ppc970_cache_events[C(MAX)][C(OP_MAX)][C(RESULT_MAX)] = {
+static u64 ppc970_cache_events[C(MAX)][C(OP_MAX)][C(RESULT_MAX)] = {
 	[C(L1D)] = {		/* 	RESULT_ACCESS	RESULT_MISS */
 		[C(OP_READ)] = {	0x8810,		0x3810	},
 		[C(OP_WRITE)] = {	0x7810,		0x813	},
-- 
1.8.3.1



* [PATCH v2 06/10] powerpc/perf: power10 Performance Monitoring support
  2020-07-01  9:20 [PATCH v2 00/10] powerpc/perf: Add support for power10 PMU Hardware Athira Rajeev
                   ` (4 preceding siblings ...)
  2020-07-01  9:20 ` [PATCH v2 05/10] powerpc/perf: Update Power PMU cache_events to u64 type Athira Rajeev
@ 2020-07-01  9:20 ` Athira Rajeev
  2020-07-02  9:06   ` kernel test robot
  2020-07-07  6:50   ` Michael Neuling
  2020-07-01  9:20 ` [PATCH v2 07/10] powerpc/perf: support BHRB disable bit and new filtering modes Athira Rajeev
                   ` (3 subsequent siblings)
  9 siblings, 2 replies; 41+ messages in thread
From: Athira Rajeev @ 2020-07-01  9:20 UTC (permalink / raw)
  To: mpe; +Cc: mikey, maddy, linuxppc-dev

Base enablement patch to register performance monitoring
hardware support for power10. The patch introduces the raw event
encoding format, defines the supported list of events and the
config fields for the event attributes, and their corresponding
bit values, which are exported via sysfs.

The patch also enhances the support functions in isa207_common.c to
include power10 PMU hardware.
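
As an example of the encoding, a sketch decoding PM_RUN_CYC (0x600f4)
with the sysfs format fields defined below (pmc = config:16-19,
unit = config:12-15, pmcxsel = config:0-7):

	u64 config = 0x600f4;
	unsigned int pmc     = (config >> 16) & 0xf;	/* 6 -> PMC6 */
	unsigned int unit    = (config >> 12) & 0xf;	/* 0 */
	unsigned int pmcxsel = config & 0xff;		/* 0xf4 */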

[Enablement of base PMU driver code]
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
[Addition of ISA macros for counter support functions]
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
---
 arch/powerpc/perf/Makefile              |   2 +-
 arch/powerpc/perf/core-book3s.c         |   2 +
 arch/powerpc/perf/internal.h            |   1 +
 arch/powerpc/perf/isa207-common.c       |  59 ++++-
 arch/powerpc/perf/isa207-common.h       |  33 ++-
 arch/powerpc/perf/power10-events-list.h |  70 ++++++
 arch/powerpc/perf/power10-pmu.c         | 410 ++++++++++++++++++++++++++++++++
 7 files changed, 566 insertions(+), 11 deletions(-)
 create mode 100644 arch/powerpc/perf/power10-events-list.h
 create mode 100644 arch/powerpc/perf/power10-pmu.c

diff --git a/arch/powerpc/perf/Makefile b/arch/powerpc/perf/Makefile
index 53d614e..c02854d 100644
--- a/arch/powerpc/perf/Makefile
+++ b/arch/powerpc/perf/Makefile
@@ -9,7 +9,7 @@ obj-$(CONFIG_PPC_PERF_CTRS)	+= core-book3s.o bhrb.o
 obj64-$(CONFIG_PPC_PERF_CTRS)	+= ppc970-pmu.o power5-pmu.o \
 				   power5+-pmu.o power6-pmu.o power7-pmu.o \
 				   isa207-common.o power8-pmu.o power9-pmu.o \
-				   generic-compat-pmu.o
+				   generic-compat-pmu.o power10-pmu.o
 obj32-$(CONFIG_PPC_PERF_CTRS)	+= mpc7450-pmu.o
 
 obj-$(CONFIG_PPC_POWERNV)	+= imc-pmu.o
diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
index 58bfb9a..fad5159 100644
--- a/arch/powerpc/perf/core-book3s.c
+++ b/arch/powerpc/perf/core-book3s.c
@@ -2333,6 +2333,8 @@ static int __init init_ppc64_pmu(void)
 		return 0;
 	else if (!init_power9_pmu())
 		return 0;
+	else if (!init_power10_pmu())
+		return 0;
 	else if (!init_ppc970_pmu())
 		return 0;
 	else
diff --git a/arch/powerpc/perf/internal.h b/arch/powerpc/perf/internal.h
index f755c64..80bbf72 100644
--- a/arch/powerpc/perf/internal.h
+++ b/arch/powerpc/perf/internal.h
@@ -9,4 +9,5 @@
 extern int init_power7_pmu(void);
 extern int init_power8_pmu(void);
 extern int init_power9_pmu(void);
+extern int init_power10_pmu(void);
 extern int init_generic_compat_pmu(void);
diff --git a/arch/powerpc/perf/isa207-common.c b/arch/powerpc/perf/isa207-common.c
index 4c86da5..7d4839e 100644
--- a/arch/powerpc/perf/isa207-common.c
+++ b/arch/powerpc/perf/isa207-common.c
@@ -55,7 +55,9 @@ static bool is_event_valid(u64 event)
 {
 	u64 valid_mask = EVENT_VALID_MASK;
 
-	if (cpu_has_feature(CPU_FTR_ARCH_300))
+	if (cpu_has_feature(CPU_FTR_ARCH_31))
+		valid_mask = p10_EVENT_VALID_MASK;
+	else if (cpu_has_feature(CPU_FTR_ARCH_300))
 		valid_mask = p9_EVENT_VALID_MASK;
 
 	return !(event & ~valid_mask);
@@ -69,6 +71,14 @@ static inline bool is_event_marked(u64 event)
 	return false;
 }
 
+static unsigned long sdar_mod_val(u64 event)
+{
+	if (cpu_has_feature(CPU_FTR_ARCH_31))
+		return p10_SDAR_MODE(event);
+
+	return p9_SDAR_MODE(event);
+}
+
 static void mmcra_sdar_mode(u64 event, unsigned long *mmcra)
 {
 	/*
@@ -79,7 +89,7 @@ static void mmcra_sdar_mode(u64 event, unsigned long *mmcra)
 	 * MMCRA[SDAR_MODE] will be programmed as "0b01" for continous sampling
 	 * mode and will be un-changed when setting MMCRA[63] (Marked events).
 	 *
-	 * Incase of Power9:
+	 * In case of Power9/Power10:
 	 * Marked event: MMCRA[SDAR_MODE] will be set to 0b00 ('No Updates'),
 	 *               or if group already have any marked events.
 	 * For rest
@@ -90,8 +100,8 @@ static void mmcra_sdar_mode(u64 event, unsigned long *mmcra)
 	if (cpu_has_feature(CPU_FTR_ARCH_300)) {
 		if (is_event_marked(event) || (*mmcra & MMCRA_SAMPLE_ENABLE))
 			*mmcra &= MMCRA_SDAR_MODE_NO_UPDATES;
-		else if (p9_SDAR_MODE(event))
-			*mmcra |=  p9_SDAR_MODE(event) << MMCRA_SDAR_MODE_SHIFT;
+		else if (sdar_mod_val(event))
+			*mmcra |= sdar_mod_val(event) << MMCRA_SDAR_MODE_SHIFT;
 		else
 			*mmcra |= MMCRA_SDAR_MODE_DCACHE;
 	} else
@@ -134,7 +144,11 @@ static bool is_thresh_cmp_valid(u64 event)
 	/*
 	 * Check the mantissa upper two bits are not zero, unless the
 	 * exponent is also zero. See the THRESH_CMP_MANTISSA doc.
+	 * Power10: thresh_cmp is replaced by l2_l3 event select.
 	 */
+	if (cpu_has_feature(CPU_FTR_ARCH_31))
+		return false;
+
 	cmp = (event >> EVENT_THR_CMP_SHIFT) & EVENT_THR_CMP_MASK;
 	exp = cmp >> 7;
 
@@ -251,7 +265,12 @@ int isa207_get_constraint(u64 event, unsigned long *maskp, unsigned long *valp)
 
 	pmc   = (event >> EVENT_PMC_SHIFT)        & EVENT_PMC_MASK;
 	unit  = (event >> EVENT_UNIT_SHIFT)       & EVENT_UNIT_MASK;
-	cache = (event >> EVENT_CACHE_SEL_SHIFT)  & EVENT_CACHE_SEL_MASK;
+	if (cpu_has_feature(CPU_FTR_ARCH_31))
+		cache = (event >> EVENT_CACHE_SEL_SHIFT) &
+			p10_EVENT_CACHE_SEL_MASK;
+	else
+		cache = (event >> EVENT_CACHE_SEL_SHIFT) &
+			EVENT_CACHE_SEL_MASK;
 	ebb   = (event >> EVENT_EBB_SHIFT)        & EVENT_EBB_MASK;
 
 	if (pmc) {
@@ -283,7 +302,10 @@ int isa207_get_constraint(u64 event, unsigned long *maskp, unsigned long *valp)
 	}
 
 	if (unit >= 6 && unit <= 9) {
-		if (cpu_has_feature(CPU_FTR_ARCH_300)) {
+		if (cpu_has_feature(CPU_FTR_ARCH_31) && (unit == 6)) {
+			mask |= CNST_L2L3_GROUP_MASK;
+			value |= CNST_L2L3_GROUP_VAL(event >> p10_L2L3_EVENT_SHIFT);
+		} else if (cpu_has_feature(CPU_FTR_ARCH_300)) {
 			mask  |= CNST_CACHE_GROUP_MASK;
 			value |= CNST_CACHE_GROUP_VAL(event & 0xff);
 
@@ -367,6 +389,7 @@ int isa207_compute_mmcr(u64 event[], int n_ev,
 			       struct perf_event *pevents[])
 {
 	unsigned long mmcra, mmcr1, mmcr2, unit, combine, psel, cache, val;
+	unsigned long mmcr3;
 	unsigned int pmc, pmc_inuse;
 	int i;
 
@@ -379,7 +402,7 @@ int isa207_compute_mmcr(u64 event[], int n_ev,
 			pmc_inuse |= 1 << pmc;
 	}
 
-	mmcra = mmcr1 = mmcr2 = 0;
+	mmcra = mmcr1 = mmcr2 = mmcr3 = 0;
 
 	/* Second pass: assign PMCs, set all MMCR1 fields */
 	for (i = 0; i < n_ev; ++i) {
@@ -438,8 +461,17 @@ int isa207_compute_mmcr(u64 event[], int n_ev,
 			mmcra |= val << MMCRA_THR_CTL_SHIFT;
 			val = (event[i] >> EVENT_THR_SEL_SHIFT) & EVENT_THR_SEL_MASK;
 			mmcra |= val << MMCRA_THR_SEL_SHIFT;
-			val = (event[i] >> EVENT_THR_CMP_SHIFT) & EVENT_THR_CMP_MASK;
-			mmcra |= thresh_cmp_val(val);
+			if (!cpu_has_feature(CPU_FTR_ARCH_31)) {
+				val = (event[i] >> EVENT_THR_CMP_SHIFT) &
+					EVENT_THR_CMP_MASK;
+				mmcra |= thresh_cmp_val(val);
+			}
+		}
+
+		if (cpu_has_feature(CPU_FTR_ARCH_31) && (unit == 6)) {
+			val = (event[i] >> p10_L2L3_EVENT_SHIFT) &
+				p10_EVENT_L2L3_SEL_MASK;
+			mmcr2 |= val << p10_L2L3_SEL_SHIFT;
 		}
 
 		if (event[i] & EVENT_WANTS_BHRB) {
@@ -460,6 +492,14 @@ int isa207_compute_mmcr(u64 event[], int n_ev,
 				mmcr2 |= MMCR2_FCS(pmc);
 		}
 
+		if (cpu_has_feature(CPU_FTR_ARCH_31)) {
+			if (pmc <= 4) {
+				val = (event[i] >> p10_EVENT_MMCR3_SHIFT) &
+					p10_EVENT_MMCR3_MASK;
+				mmcr3 |= val << MMCR3_SHIFT(pmc);
+			}
+		}
+
 		hwc[i] = pmc - 1;
 	}
 
@@ -480,6 +520,7 @@ int isa207_compute_mmcr(u64 event[], int n_ev,
 	mmcr[1] = mmcr1;
 	mmcr[2] = mmcra;
 	mmcr[3] = mmcr2;
+	mmcr[4] = mmcr3;
 
 	return 0;
 }
diff --git a/arch/powerpc/perf/isa207-common.h b/arch/powerpc/perf/isa207-common.h
index 63fd4f3..85cbce5 100644
--- a/arch/powerpc/perf/isa207-common.h
+++ b/arch/powerpc/perf/isa207-common.h
@@ -87,6 +87,31 @@
 	 EVENT_LINUX_MASK					|	\
 	 EVENT_PSEL_MASK))
 
+/* Constants to support power10 raw encoding format */
+#define p10_SDAR_MODE_SHIFT		22
+#define p10_SDAR_MODE_MASK		0x3ull
+#define p10_SDAR_MODE(v)		(((v) >> p10_SDAR_MODE_SHIFT) & \
+					p10_SDAR_MODE_MASK)
+#define p10_EVENT_L2L3_SEL_MASK		0x1f
+#define p10_L2L3_SEL_SHIFT		3
+#define p10_L2L3_EVENT_SHIFT		40
+#define p10_EVENT_THRESH_MASK		0xffffull
+#define p10_EVENT_CACHE_SEL_MASK	0x3ull
+#define p10_EVENT_MMCR3_MASK		0x7fffull
+#define p10_EVENT_MMCR3_SHIFT		45
+
+#define p10_EVENT_VALID_MASK		\
+	((p10_SDAR_MODE_MASK   << p10_SDAR_MODE_SHIFT		|	\
+	(p10_EVENT_THRESH_MASK  << EVENT_THRESH_SHIFT)		|	\
+	(EVENT_SAMPLE_MASK     << EVENT_SAMPLE_SHIFT)		|	\
+	(p10_EVENT_CACHE_SEL_MASK  << EVENT_CACHE_SEL_SHIFT)	|	\
+	(EVENT_PMC_MASK        << EVENT_PMC_SHIFT)		|	\
+	(EVENT_UNIT_MASK       << EVENT_UNIT_SHIFT)		|	\
+	(p9_EVENT_COMBINE_MASK << p9_EVENT_COMBINE_SHIFT)	|	\
+	(p10_EVENT_MMCR3_MASK  << p10_EVENT_MMCR3_SHIFT)	|	\
+	(EVENT_MARKED_MASK     << EVENT_MARKED_SHIFT)		|	\
+	 EVENT_LINUX_MASK					|	\
+	EVENT_PSEL_MASK))
 /*
  * Layout of constraint bits:
  *
@@ -135,6 +160,9 @@
 #define CNST_CACHE_PMC4_VAL	(1ull << 54)
 #define CNST_CACHE_PMC4_MASK	CNST_CACHE_PMC4_VAL
 
+#define CNST_L2L3_GROUP_VAL(v)	(((v) & 0x1full) << 55)
+#define CNST_L2L3_GROUP_MASK	CNST_L2L3_GROUP_VAL(0x1f)
+
 /*
  * For NC we are counting up to 4 events. This requires three bits, and we need
  * the fifth event to overflow and set the 4th bit. To achieve that we bias the
@@ -191,7 +219,7 @@
 #define MMCRA_THR_CTR_EXP(v)		(((v) >> MMCRA_THR_CTR_EXP_SHIFT) &\
 						MMCRA_THR_CTR_EXP_MASK)
 
-/* MMCR1 Threshold Compare bit constant for power9 */
+/* MMCRA Threshold Compare bit constant for power9/power10 */
 #define p9_MMCRA_THR_CMP_SHIFT	45
 
 /* Bits in MMCR2 for PowerISA v2.07 */
@@ -202,6 +230,9 @@
 #define MAX_ALT				2
 #define MAX_PMU_COUNTERS		6
 
+/* Bits in MMCR3 for PowerISA v3.1 */
+#define MMCR3_SHIFT(pmc)		(49 - (15 * ((pmc) - 1)))
+
 #define ISA207_SIER_TYPE_SHIFT		15
 #define ISA207_SIER_TYPE_MASK		(0x7ull << ISA207_SIER_TYPE_SHIFT)
 
diff --git a/arch/powerpc/perf/power10-events-list.h b/arch/powerpc/perf/power10-events-list.h
new file mode 100644
index 0000000..60c1b81
--- /dev/null
+++ b/arch/powerpc/perf/power10-events-list.h
@@ -0,0 +1,70 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/*
+ * Performance counter support for POWER10 processors.
+ *
+ * Copyright 2020 Madhavan Srinivasan, IBM Corporation.
+ * Copyright 2020 Athira Rajeev, IBM Corporation.
+ */
+
+/*
+ * Power10 event codes.
+ */
+EVENT(PM_RUN_CYC,				0x600f4);
+EVENT(PM_DISP_STALL_CYC,			0x100f8);
+EVENT(PM_EXEC_STALL,				0x30008);
+EVENT(PM_RUN_INST_CMPL,				0x500fa);
+EVENT(PM_BR_CMPL,                               0x4d05e);
+EVENT(PM_BR_MPRED_CMPL,                         0x400f6);
+
+/* All L1 D cache load references counted at finish, gated by reject */
+EVENT(PM_LD_REF_L1,				0x100fc);
+/* Load Missed L1 */
+EVENT(PM_LD_MISS_L1,				0x3e054);
+/* Store Missed L1 */
+EVENT(PM_ST_MISS_L1,				0x300f0);
+/* L1 cache data prefetches */
+EVENT(PM_LD_PREFETCH_CACHE_LINE_MISS,		0x1002c);
+/* Demand iCache Miss */
+EVENT(PM_L1_ICACHE_MISS,			0x200fc);
+/* Instruction fetches from L1 */
+EVENT(PM_INST_FROM_L1,				0x04080);
+/* Instruction Demand sectors written into IL1 */
+EVENT(PM_INST_FROM_L1MISS,			0x03f00000001c040);
+/* Instruction prefetch written into IL1 */
+EVENT(PM_IC_PREF_REQ,				0x040a0);
+/* The data cache was reloaded from local core's L3 due to a demand load */
+EVENT(PM_DATA_FROM_L3,				0x01340000001c040);
+/* Demand LD - L3 Miss (not L2 hit and not L3 hit) */
+EVENT(PM_DATA_FROM_L3MISS,			0x300fe);
+/* Data PTEG reload */
+EVENT(PM_DTLB_MISS,				0x300fc);
+/* ITLB Reloaded */
+EVENT(PM_ITLB_MISS,				0x400fc);
+
+EVENT(PM_RUN_CYC_ALT,				0x0001e);
+EVENT(PM_RUN_INST_CMPL_ALT,			0x00002);
+
+/*
+ * Memory Access Events
+ *
+ * Primary PMU event used here is PM_MRK_INST_CMPL (0x401e0)
+ * To enable capturing of memory profiling, these MMCRA bits
+ * needs to be programmed and corresponding raw event format
+ * encoding.
+ *
+ * MMCRA bits encoding needed are
+ *     SM (Sampling Mode)
+ *     EM (Eligibility for Random Sampling)
+ *     TECE (Threshold Event Counter Event)
+ *     TS (Threshold Start Event)
+ *     TE (Threshold End Event)
+ *
+ * Corresponding Raw Encoding bits:
+ *     sample [EM,SM]
+ *     thresh_sel (TECE)
+ *     thresh start (TS)
+ *     thresh end (TE)
+ */
+
+EVENT(MEM_LOADS,				0x34340401e0);
+EVENT(MEM_STORES,				0x343c0401e0);
diff --git a/arch/powerpc/perf/power10-pmu.c b/arch/powerpc/perf/power10-pmu.c
new file mode 100644
index 0000000..d64d69d
--- /dev/null
+++ b/arch/powerpc/perf/power10-pmu.c
@@ -0,0 +1,410 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Performance counter support for POWER10 processors.
+ *
+ * Copyright 2020 Madhavan Srinivasan, IBM Corporation.
+ * Copyright 2020 Athira Rajeev, IBM Corporation.
+ */
+
+#define pr_fmt(fmt)	"power10-pmu: " fmt
+
+#include "isa207-common.h"
+
+/*
+ * Raw event encoding for Power10:
+ *
+ *        60        56        52        48        44        40        36        32
+ * | - - - - | - - - - | - - - - | - - - - | - - - - | - - - - | - - - - | - - - - |
+ *   | | [ ]   [ src_match ] [  src_mask ]   | [ ] [ l2l3_sel ]  [  thresh_ctl   ]
+ *   | |  |                                  |  |                         |
+ *   | |  *- IFM (Linux)                     |  |        thresh start/stop -*
+ *   | *- BHRB (Linux)                       |  src_sel
+ *   *- EBB (Linux)                          *invert_bit
+ *
+ *        28        24        20        16        12         8         4         0
+ * | - - - - | - - - - | - - - - | - - - - | - - - - | - - - - | - - - - | - - - - |
+ *   [   ] [  sample ]   [ ] [ ]   [ pmc ]   [unit ]   [ ]   m   [    pmcxsel    ]
+ *     |        |        |    |                        |     |
+ *     |        |        |    |                        |     *- mark
+ *     |        |        |    *- L1/L2/L3 cache_sel    |
+ *     |        |        sdar_mode                     |
+ *     |        *- sampling mode for marked events     *- combine
+ *     |
+ *     *- thresh_sel
+ *
+ * Below uses IBM bit numbering.
+ *
+ * MMCR1[x:y] = unit    (PMCxUNIT)
+ * MMCR1[24]   = pmc1combine[0]
+ * MMCR1[25]   = pmc1combine[1]
+ * MMCR1[26]   = pmc2combine[0]
+ * MMCR1[27]   = pmc2combine[1]
+ * MMCR1[28]   = pmc3combine[0]
+ * MMCR1[29]   = pmc3combine[1]
+ * MMCR1[30]   = pmc4combine[0]
+ * MMCR1[31]   = pmc4combine[1]
+ *
+ * if pmc == 3 and unit == 0 and pmcxsel[0:6] == 0b0101011
+ *	MMCR1[20:27] = thresh_ctl
+ * else if pmc == 4 and unit == 0xf and pmcxsel[0:6] == 0b0101001
+ *	MMCR1[20:27] = thresh_ctl
+ * else
+ *	MMCRA[48:55] = thresh_ctl   (THRESH START/END)
+ *
+ * if thresh_sel:
+ *	MMCRA[45:47] = thresh_sel
+ *
+ * if l2l3_sel:
+ * MMCR2[56:60] = l2l3_sel[0:4]
+ *
+ * MMCR1[16] = cache_sel[0]
+ * MMCR1[17] = cache_sel[1]
+ *
+ * if mark:
+ *	MMCRA[63]    = 1		(SAMPLE_ENABLE)
+ *	MMCRA[57:59] = sample[0:2]	(RAND_SAMP_ELIG)
+ *	MMCRA[61:62] = sample[3:4]	(RAND_SAMP_MODE)
+ *
+ * if EBB and BHRB:
+ *	MMCRA[32:33] = IFM
+ *
+ * MMCRA[SDAR_MODE]  = sdar_mode[0:1]
+ */
+
+/*
+ * Some power10 event codes.
+ */
+#define EVENT(_name, _code)     enum{_name = _code}
+
+#include "power10-events-list.h"
+
+#undef EVENT
+
+/* MMCRA IFM bits - POWER10 */
+#define POWER10_MMCRA_IFM1		0x0000000040000000UL
+#define POWER10_MMCRA_BHRB_MASK		0x00000000C0000000UL
+
+/* Table of alternatives, sorted by column 0 */
+static const unsigned int power10_event_alternatives[][MAX_ALT] = {
+	{ PM_RUN_CYC_ALT,		PM_RUN_CYC },
+	{ PM_RUN_INST_CMPL_ALT,		PM_RUN_INST_CMPL },
+};
+
+static int power10_get_alternatives(u64 event, unsigned int flags, u64 alt[])
+{
+	int num_alt = 0;
+
+	num_alt = isa207_get_alternatives(event, alt,
+					  ARRAY_SIZE(power10_event_alternatives),
+					  flags,
+					  power10_event_alternatives);
+
+	return num_alt;
+}
+
+GENERIC_EVENT_ATTR(cpu-cycles,			PM_RUN_CYC);
+GENERIC_EVENT_ATTR(instructions,		PM_RUN_INST_CMPL);
+GENERIC_EVENT_ATTR(branch-instructions,		PM_BR_CMPL);
+GENERIC_EVENT_ATTR(branch-misses,		PM_BR_MPRED_CMPL);
+GENERIC_EVENT_ATTR(cache-references,		PM_LD_REF_L1);
+GENERIC_EVENT_ATTR(cache-misses,		PM_LD_MISS_L1);
+GENERIC_EVENT_ATTR(mem-loads,			MEM_LOADS);
+GENERIC_EVENT_ATTR(mem-stores,			MEM_STORES);
+
+CACHE_EVENT_ATTR(L1-dcache-load-misses,		PM_LD_MISS_L1);
+CACHE_EVENT_ATTR(L1-dcache-loads,		PM_LD_REF_L1);
+CACHE_EVENT_ATTR(L1-dcache-prefetches,		PM_LD_PREFETCH_CACHE_LINE_MISS);
+CACHE_EVENT_ATTR(L1-dcache-store-misses,	PM_ST_MISS_L1);
+CACHE_EVENT_ATTR(L1-icache-load-misses,		PM_L1_ICACHE_MISS);
+CACHE_EVENT_ATTR(L1-icache-loads,		PM_INST_FROM_L1);
+CACHE_EVENT_ATTR(L1-icache-prefetches,		PM_IC_PREF_REQ);
+CACHE_EVENT_ATTR(LLC-load-misses,		PM_DATA_FROM_L3MISS);
+CACHE_EVENT_ATTR(LLC-loads,			PM_DATA_FROM_L3);
+CACHE_EVENT_ATTR(branch-load-misses,		PM_BR_MPRED_CMPL);
+CACHE_EVENT_ATTR(branch-loads,			PM_BR_CMPL);
+CACHE_EVENT_ATTR(dTLB-load-misses,		PM_DTLB_MISS);
+CACHE_EVENT_ATTR(iTLB-load-misses,		PM_ITLB_MISS);
+
+static struct attribute *power10_events_attr[] = {
+	GENERIC_EVENT_PTR(PM_RUN_CYC),
+	GENERIC_EVENT_PTR(PM_RUN_INST_CMPL),
+	GENERIC_EVENT_PTR(PM_BR_CMPL),
+	GENERIC_EVENT_PTR(PM_BR_MPRED_CMPL),
+	GENERIC_EVENT_PTR(PM_LD_REF_L1),
+	GENERIC_EVENT_PTR(PM_LD_MISS_L1),
+	GENERIC_EVENT_PTR(MEM_LOADS),
+	GENERIC_EVENT_PTR(MEM_STORES),
+	CACHE_EVENT_PTR(PM_LD_MISS_L1),
+	CACHE_EVENT_PTR(PM_LD_REF_L1),
+	CACHE_EVENT_PTR(PM_LD_PREFETCH_CACHE_LINE_MISS),
+	CACHE_EVENT_PTR(PM_ST_MISS_L1),
+	CACHE_EVENT_PTR(PM_L1_ICACHE_MISS),
+	CACHE_EVENT_PTR(PM_INST_FROM_L1),
+	CACHE_EVENT_PTR(PM_IC_PREF_REQ),
+	CACHE_EVENT_PTR(PM_DATA_FROM_L3MISS),
+	CACHE_EVENT_PTR(PM_DATA_FROM_L3),
+	CACHE_EVENT_PTR(PM_BR_MPRED_CMPL),
+	CACHE_EVENT_PTR(PM_BR_CMPL),
+	CACHE_EVENT_PTR(PM_DTLB_MISS),
+	CACHE_EVENT_PTR(PM_ITLB_MISS),
+	NULL
+};
+
+static struct attribute_group power10_pmu_events_group = {
+	.name = "events",
+	.attrs = power10_events_attr,
+};
+
+PMU_FORMAT_ATTR(event,          "config:0-59");
+PMU_FORMAT_ATTR(pmcxsel,        "config:0-7");
+PMU_FORMAT_ATTR(mark,           "config:8");
+PMU_FORMAT_ATTR(combine,        "config:10-11");
+PMU_FORMAT_ATTR(unit,           "config:12-15");
+PMU_FORMAT_ATTR(pmc,            "config:16-19");
+PMU_FORMAT_ATTR(cache_sel,      "config:20-21");
+PMU_FORMAT_ATTR(sdar_mode,      "config:22-23");
+PMU_FORMAT_ATTR(sample_mode,    "config:24-28");
+PMU_FORMAT_ATTR(thresh_sel,     "config:29-31");
+PMU_FORMAT_ATTR(thresh_stop,    "config:32-35");
+PMU_FORMAT_ATTR(thresh_start,   "config:36-39");
+PMU_FORMAT_ATTR(l2l3_sel,       "config:40-44");
+PMU_FORMAT_ATTR(src_sel,        "config:45-46");
+PMU_FORMAT_ATTR(invert_bit,     "config:47");
+PMU_FORMAT_ATTR(src_mask,       "config:48-53");
+PMU_FORMAT_ATTR(src_match,      "config:54-59");
+
+static struct attribute *power10_pmu_format_attr[] = {
+	&format_attr_event.attr,
+	&format_attr_pmcxsel.attr,
+	&format_attr_mark.attr,
+	&format_attr_combine.attr,
+	&format_attr_unit.attr,
+	&format_attr_pmc.attr,
+	&format_attr_cache_sel.attr,
+	&format_attr_sdar_mode.attr,
+	&format_attr_sample_mode.attr,
+	&format_attr_thresh_sel.attr,
+	&format_attr_thresh_stop.attr,
+	&format_attr_thresh_start.attr,
+	&format_attr_l2l3_sel.attr,
+	&format_attr_src_sel.attr,
+	&format_attr_invert_bit.attr,
+	&format_attr_src_mask.attr,
+	&format_attr_src_match.attr,
+	NULL,
+};
+
+static struct attribute_group power10_pmu_format_group = {
+	.name = "format",
+	.attrs = power10_pmu_format_attr,
+};
+
+static const struct attribute_group *power10_pmu_attr_groups[] = {
+	&power10_pmu_format_group,
+	&power10_pmu_events_group,
+	NULL,
+};
+
+static int power10_generic_events[] = {
+	[PERF_COUNT_HW_CPU_CYCLES] =			PM_RUN_CYC,
+	[PERF_COUNT_HW_INSTRUCTIONS] =			PM_RUN_INST_CMPL,
+	[PERF_COUNT_HW_BRANCH_INSTRUCTIONS] =		PM_BR_CMPL,
+	[PERF_COUNT_HW_BRANCH_MISSES] =			PM_BR_MPRED_CMPL,
+	[PERF_COUNT_HW_CACHE_REFERENCES] =		PM_LD_REF_L1,
+	[PERF_COUNT_HW_CACHE_MISSES] =			PM_LD_MISS_L1,
+};
+
+static u64 power10_bhrb_filter_map(u64 branch_sample_type)
+{
+	u64 pmu_bhrb_filter = 0;
+
+	/* BHRB and regular PMU events share the same privilege state
+	 * filter configuration. BHRB is always recorded along with a
+	 * regular PMU event. As the privilege state filter is handled
+	 * in the basic PMC configuration of the accompanying regular
+	 * PMU event, we ignore any separate BHRB specific request.
+	 */
+
+	/* No branch filter requested */
+	if (branch_sample_type & PERF_SAMPLE_BRANCH_ANY)
+		return pmu_bhrb_filter;
+
+	/* Invalid branch filter options - HW does not support */
+	if (branch_sample_type & PERF_SAMPLE_BRANCH_ANY_RETURN)
+		return -1;
+
+	if (branch_sample_type & PERF_SAMPLE_BRANCH_IND_CALL)
+		return -1;
+
+	if (branch_sample_type & PERF_SAMPLE_BRANCH_CALL)
+		return -1;
+
+	if (branch_sample_type & PERF_SAMPLE_BRANCH_ANY_CALL) {
+		pmu_bhrb_filter |= POWER10_MMCRA_IFM1;
+		return pmu_bhrb_filter;
+	}
+
+	/* Everything else is unsupported */
+	return -1;
+}
+
+static void power10_config_bhrb(u64 pmu_bhrb_filter)
+{
+	pmu_bhrb_filter &= POWER10_MMCRA_BHRB_MASK;
+
+	/* Enable BHRB filter in PMU */
+	mtspr(SPRN_MMCRA, (mfspr(SPRN_MMCRA) | pmu_bhrb_filter));
+}
+
+#define C(x)	PERF_COUNT_HW_CACHE_##x
+
+/*
+ * Table of generalized cache-related events.
+ * 0 means not supported, -1 means nonsensical, other values
+ * are event codes.
+ */
+static u64 power10_cache_events[C(MAX)][C(OP_MAX)][C(RESULT_MAX)] = {
+	[C(L1D)] = {
+		[C(OP_READ)] = {
+			[C(RESULT_ACCESS)] = PM_LD_REF_L1,
+			[C(RESULT_MISS)] = PM_LD_MISS_L1,
+		},
+		[C(OP_WRITE)] = {
+			[C(RESULT_ACCESS)] = 0,
+			[C(RESULT_MISS)] = PM_ST_MISS_L1,
+		},
+		[C(OP_PREFETCH)] = {
+			[C(RESULT_ACCESS)] = PM_LD_PREFETCH_CACHE_LINE_MISS,
+			[C(RESULT_MISS)] = 0,
+		},
+	},
+	[C(L1I)] = {
+		[C(OP_READ)] = {
+			[C(RESULT_ACCESS)] = PM_INST_FROM_L1,
+			[C(RESULT_MISS)] = PM_L1_ICACHE_MISS,
+		},
+		[C(OP_WRITE)] = {
+			[C(RESULT_ACCESS)] = PM_INST_FROM_L1MISS,
+			[C(RESULT_MISS)] = -1,
+		},
+		[C(OP_PREFETCH)] = {
+			[C(RESULT_ACCESS)] = PM_IC_PREF_REQ,
+			[C(RESULT_MISS)] = 0,
+		},
+	},
+	[C(LL)] = {
+		[C(OP_READ)] = {
+			[C(RESULT_ACCESS)] = PM_DATA_FROM_L3,
+			[C(RESULT_MISS)] = PM_DATA_FROM_L3MISS,
+		},
+		[C(OP_WRITE)] = {
+			[C(RESULT_ACCESS)] = -1,
+			[C(RESULT_MISS)] = -1,
+		},
+		[C(OP_PREFETCH)] = {
+			[C(RESULT_ACCESS)] = -1,
+			[C(RESULT_MISS)] = 0,
+		},
+	},
+	[C(DTLB)] = {
+		[C(OP_READ)] = {
+			[C(RESULT_ACCESS)] = 0,
+			[C(RESULT_MISS)] = PM_DTLB_MISS,
+		},
+		[C(OP_WRITE)] = {
+			[C(RESULT_ACCESS)] = -1,
+			[C(RESULT_MISS)] = -1,
+		},
+		[C(OP_PREFETCH)] = {
+			[C(RESULT_ACCESS)] = -1,
+			[C(RESULT_MISS)] = -1,
+		},
+	},
+	[C(ITLB)] = {
+		[C(OP_READ)] = {
+			[C(RESULT_ACCESS)] = 0,
+			[C(RESULT_MISS)] = PM_ITLB_MISS,
+		},
+		[C(OP_WRITE)] = {
+			[C(RESULT_ACCESS)] = -1,
+			[C(RESULT_MISS)] = -1,
+		},
+		[C(OP_PREFETCH)] = {
+			[C(RESULT_ACCESS)] = -1,
+			[C(RESULT_MISS)] = -1,
+		},
+	},
+	[C(BPU)] = {
+		[C(OP_READ)] = {
+			[C(RESULT_ACCESS)] = PM_BR_CMPL,
+			[C(RESULT_MISS)] = PM_BR_MPRED_CMPL,
+		},
+		[C(OP_WRITE)] = {
+			[C(RESULT_ACCESS)] = -1,
+			[C(RESULT_MISS)] = -1,
+		},
+		[C(OP_PREFETCH)] = {
+			[C(RESULT_ACCESS)] = -1,
+			[C(RESULT_MISS)] = -1,
+		},
+	},
+	[C(NODE)] = {
+		[C(OP_READ)] = {
+			[C(RESULT_ACCESS)] = -1,
+			[C(RESULT_MISS)] = -1,
+		},
+		[C(OP_WRITE)] = {
+			[C(RESULT_ACCESS)] = -1,
+			[C(RESULT_MISS)] = -1,
+		},
+		[C(OP_PREFETCH)] = {
+			[C(RESULT_ACCESS)] = -1,
+			[C(RESULT_MISS)] = -1,
+		},
+	},
+};
+
+#undef C
+
+static struct power_pmu power10_pmu = {
+	.name			= "POWER10",
+	.n_counter		= MAX_PMU_COUNTERS,
+	.add_fields		= ISA207_ADD_FIELDS,
+	.test_adder		= ISA207_TEST_ADDER,
+	.group_constraint_mask	= CNST_CACHE_PMC4_MASK,
+	.group_constraint_val	= CNST_CACHE_PMC4_VAL,
+	.compute_mmcr		= isa207_compute_mmcr,
+	.config_bhrb		= power10_config_bhrb,
+	.bhrb_filter_map	= power10_bhrb_filter_map,
+	.get_constraint		= isa207_get_constraint,
+	.get_alternatives	= power10_get_alternatives,
+	.get_mem_data_src	= isa207_get_mem_data_src,
+	.get_mem_weight		= isa207_get_mem_weight,
+	.disable_pmc		= isa207_disable_pmc,
+	.flags			= PPMU_HAS_SIER | PPMU_ARCH_207S |
+				  PPMU_ARCH_310S,
+	.n_generic		= ARRAY_SIZE(power10_generic_events),
+	.generic_events		= power10_generic_events,
+	.cache_events		= &power10_cache_events,
+	.attr_groups		= power10_pmu_attr_groups,
+	.bhrb_nr		= 32,
+};
+
+int init_power10_pmu(void)
+{
+	int rc;
+
+	/* Comes from cpu_specs[] */
+	if (!cur_cpu_spec->oprofile_cpu_type ||
+	    strcmp(cur_cpu_spec->oprofile_cpu_type, "ppc64/power10"))
+		return -ENODEV;
+
+	rc = register_power_pmu(&power10_pmu);
+	if (rc)
+		return rc;
+
+	/* Tell userspace that EBB is supported */
+	cur_cpu_spec->cpu_user_features2 |= PPC_FEATURE2_EBB;
+
+	return 0;
+}
-- 
1.8.3.1



* [PATCH v2 07/10] powerpc/perf: support BHRB disable bit and new filtering modes
  2020-07-01  9:20 [PATCH v2 00/10] powerpc/perf: Add support for power10 PMU Hardware Athira Rajeev
                   ` (5 preceding siblings ...)
  2020-07-01  9:20 ` [PATCH v2 06/10] powerpc/perf: power10 Performance Monitoring support Athira Rajeev
@ 2020-07-01  9:20 ` Athira Rajeev
  2020-07-07  7:17   ` Michael Neuling
  2020-07-08 11:42   ` Michael Ellerman
  2020-07-01  9:21 ` [PATCH v2 08/10] powerpc/perf: Add support for outputting extended regs in perf intr_regs Athira Rajeev
                   ` (2 subsequent siblings)
  9 siblings, 2 replies; 41+ messages in thread
From: Athira Rajeev @ 2020-07-01  9:20 UTC (permalink / raw)
  To: mpe; +Cc: mikey, maddy, linuxppc-dev

PowerISA v3.1 has a few updates for the Branch History Rolling
Buffer (BHRB). The first is the addition of a BHRB disable bit, and
the second is new filtering modes for BHRB.

BHRB disable is controlled via Monitor Mode Control Register A (MMCRA)
bit 26, namely "BHRB Recording Disable (BHRBRD)". This field controls
whether BHRB entries are written when BHRB recording is enabled by other
bits. This patch implements support for the BHRB disable bit.

Secondly, PowerISA v3.1 introduces filtering support for
PERF_SAMPLE_BRANCH_IND_CALL/COND. The patch adds BHRB filter support
for "ind_call" and "cond" in power10_bhrb_filter_map().

Commit bb19af816025 ("powerpc/perf: Prevent kernel address leak to
userspace via BHRB buffer") added a check in bhrb_read() to filter
kernel addresses from the BHRB buffer. This patch modifies it to skip
that check for PowerISA v3.1 based processors, since PowerISA v3.1
allows only MSR[PR]=1 addresses to be written to the BHRB buffer.
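
For reference, the ISA numbers MMCRA bits from the most-significant
end, so BHRBRD "bit 26" is bit 63 - 26 = 37 in LSB-0 terms; a sketch of
the constant this corresponds to (matching the MMCRA_BHRB_DISABLE value
defined in patch 04 of this series):

	/* MMCRA "BHRB Recording Disable" (BHRBRD): ISA bit 26 (MSB-0) */
	#define MMCRA_BHRB_DISABLE	0x0000002000000000	/* 1UL << 37 */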

Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
---
 arch/powerpc/perf/core-book3s.c       | 27 +++++++++++++++++++++------
 arch/powerpc/perf/isa207-common.c     | 13 +++++++++++++
 arch/powerpc/perf/power10-pmu.c       | 13 +++++++++++--
 arch/powerpc/platforms/powernv/idle.c | 14 ++++++++++++++
 4 files changed, 59 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
index fad5159..9709606 100644
--- a/arch/powerpc/perf/core-book3s.c
+++ b/arch/powerpc/perf/core-book3s.c
@@ -466,9 +466,13 @@ static void power_pmu_bhrb_read(struct perf_event *event, struct cpu_hw_events *
 			 * addresses at this point. Check the privileges before
 			 * exporting it to userspace (avoid exposure of regions
 			 * where we could have speculative execution)
+			 * In case of ISA v3.1, BHRB will capture only user-space
+			 * addresses, hence include a check before the filtering code
 			 */
-			if (is_kernel_addr(addr) && perf_allow_kernel(&event->attr) != 0)
-				continue;
+			if (!(ppmu->flags & PPMU_ARCH_310S))
+				if (is_kernel_addr(addr) &&
+				perf_allow_kernel(&event->attr) != 0)
+					continue;
 
 			/* Branches are read most recent first (ie. mfbhrb 0 is
 			 * the most recent branch).
@@ -1212,7 +1216,7 @@ static void write_mmcr0(struct cpu_hw_events *cpuhw, unsigned long mmcr0)
 static void power_pmu_disable(struct pmu *pmu)
 {
 	struct cpu_hw_events *cpuhw;
-	unsigned long flags, mmcr0, val;
+	unsigned long flags, mmcr0, val, mmcra = 0;
 
 	if (!ppmu)
 		return;
@@ -1245,12 +1249,23 @@ static void power_pmu_disable(struct pmu *pmu)
 		mb();
 		isync();
 
+		val = mmcra = cpuhw->mmcr[2];
+
 		/*
 		 * Disable instruction sampling if it was enabled
 		 */
-		if (cpuhw->mmcr[2] & MMCRA_SAMPLE_ENABLE) {
-			mtspr(SPRN_MMCRA,
-			      cpuhw->mmcr[2] & ~MMCRA_SAMPLE_ENABLE);
+		if (cpuhw->mmcr[2] & MMCRA_SAMPLE_ENABLE)
+			mmcra = cpuhw->mmcr[2] & ~MMCRA_SAMPLE_ENABLE;
+
+		/* Disable BHRB via mmcra [:26] for p10 if needed */
+		if (!(cpuhw->mmcr[2] & MMCRA_BHRB_DISABLE))
+			mmcra |= MMCRA_BHRB_DISABLE;
+
+		/* Write SPRN_MMCRA if mmcra has either disabled
+		 * instruction sampling or BHRB
+		 */
+		if (val != mmcra) {
+			mtspr(SPRN_MMCRA, mmcra);
 			mb();
 			isync();
 		}
diff --git a/arch/powerpc/perf/isa207-common.c b/arch/powerpc/perf/isa207-common.c
index 7d4839e..463d925 100644
--- a/arch/powerpc/perf/isa207-common.c
+++ b/arch/powerpc/perf/isa207-common.c
@@ -404,6 +404,12 @@ int isa207_compute_mmcr(u64 event[], int n_ev,
 
 	mmcra = mmcr1 = mmcr2 = mmcr3 = 0;
 
+	/* Disable bhrb unless explicitly requested
+	 * by setting MMCRA [:26] bit.
+	 */
+	if (cpu_has_feature(CPU_FTR_ARCH_31))
+		mmcra |= MMCRA_BHRB_DISABLE;
+
 	/* Second pass: assign PMCs, set all MMCR1 fields */
 	for (i = 0; i < n_ev; ++i) {
 		pmc     = (event[i] >> EVENT_PMC_SHIFT) & EVENT_PMC_MASK;
@@ -475,10 +481,17 @@ int isa207_compute_mmcr(u64 event[], int n_ev,
 		}
 
 		if (event[i] & EVENT_WANTS_BHRB) {
+			/* set MMCRA[:26] to 0 for Power10 to enable BHRB */
+			if (cpu_has_feature(CPU_FTR_ARCH_31))
+				mmcra &= ~MMCRA_BHRB_DISABLE;
 			val = (event[i] >> EVENT_IFM_SHIFT) & EVENT_IFM_MASK;
 			mmcra |= val << MMCRA_IFM_SHIFT;
 		}
 
+		/* set MMCRA[:26] to 0 if there is user request for BHRB */
+		if (cpu_has_feature(CPU_FTR_ARCH_31) && has_branch_stack(pevents[i]))
+			mmcra &= ~MMCRA_BHRB_DISABLE;
+
 		if (pevents[i]->attr.exclude_user)
 			mmcr2 |= MMCR2_FCP(pmc);
 
diff --git a/arch/powerpc/perf/power10-pmu.c b/arch/powerpc/perf/power10-pmu.c
index d64d69d..07fb919 100644
--- a/arch/powerpc/perf/power10-pmu.c
+++ b/arch/powerpc/perf/power10-pmu.c
@@ -82,6 +82,8 @@
 
 /* MMCRA IFM bits - POWER10 */
 #define POWER10_MMCRA_IFM1		0x0000000040000000UL
+#define POWER10_MMCRA_IFM2		0x0000000080000000UL
+#define POWER10_MMCRA_IFM3		0x00000000C0000000UL
 #define POWER10_MMCRA_BHRB_MASK		0x00000000C0000000UL
 
 /* Table of alternatives, sorted by column 0 */
@@ -233,8 +235,15 @@ static u64 power10_bhrb_filter_map(u64 branch_sample_type)
 	if (branch_sample_type & PERF_SAMPLE_BRANCH_ANY_RETURN)
 		return -1;
 
-	if (branch_sample_type & PERF_SAMPLE_BRANCH_IND_CALL)
-		return -1;
+	if (branch_sample_type & PERF_SAMPLE_BRANCH_IND_CALL) {
+		pmu_bhrb_filter |= POWER10_MMCRA_IFM2;
+		return pmu_bhrb_filter;
+	}
+
+	if (branch_sample_type & PERF_SAMPLE_BRANCH_COND) {
+		pmu_bhrb_filter |= POWER10_MMCRA_IFM3;
+		return pmu_bhrb_filter;
+	}
 
 	if (branch_sample_type & PERF_SAMPLE_BRANCH_CALL)
 		return -1;
diff --git a/arch/powerpc/platforms/powernv/idle.c b/arch/powerpc/platforms/powernv/idle.c
index 2dd4673..7db99c7 100644
--- a/arch/powerpc/platforms/powernv/idle.c
+++ b/arch/powerpc/platforms/powernv/idle.c
@@ -611,6 +611,7 @@ static unsigned long power9_idle_stop(unsigned long psscr, bool mmu_on)
 	unsigned long srr1;
 	unsigned long pls;
 	unsigned long mmcr0 = 0;
+	unsigned long mmcra_bhrb = 0;
 	struct p9_sprs sprs = {}; /* avoid false used-uninitialised */
 	bool sprs_saved = false;
 
@@ -657,6 +658,15 @@ static unsigned long power9_idle_stop(unsigned long psscr, bool mmu_on)
 		  */
 		mmcr0		= mfspr(SPRN_MMCR0);
 	}
+
+	if (cpu_has_feature(CPU_FTR_ARCH_31)) {
+		/* POWER10 uses MMCRA[:26] as BHRB disable bit
+		 * to disable BHRB logic when not used. Hence save and
+		 * restore MMCRA after a state-loss idle.
+		 */
+		mmcra_bhrb		= mfspr(SPRN_MMCRA);
+	}
+
 	if ((psscr & PSSCR_RL_MASK) >= pnv_first_spr_loss_level) {
 		sprs.lpcr	= mfspr(SPRN_LPCR);
 		sprs.hfscr	= mfspr(SPRN_HFSCR);
@@ -721,6 +731,10 @@ static unsigned long power9_idle_stop(unsigned long psscr, bool mmu_on)
 			mtspr(SPRN_MMCR0, mmcr0);
 		}
 
+		/* Reload MMCRA to restore BHRB disable bit for POWER10 */
+		if (cpu_has_feature(CPU_FTR_ARCH_31))
+			mtspr(SPRN_MMCRA, mmcra_bhrb);
+
 		/*
 		 * DD2.2 and earlier need to set then clear bit 60 in MMCRA
 		 * to ensure the PMU starts running.
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v2 08/10] powerpc/perf: Add support for outputting extended regs in perf intr_regs
  2020-07-01  9:20 [PATCH v2 00/10] powerpc/perf: Add support for power10 PMU Hardware Athira Rajeev
                   ` (6 preceding siblings ...)
  2020-07-01  9:20 ` [PATCH v2 07/10] powerpc/perf: support BHRB disable bit and new filtering modes Athira Rajeev
@ 2020-07-01  9:21 ` Athira Rajeev
  2020-07-01  9:21 ` [PATCH v2 09/10] tools/perf: Add perf tools support for extended register capability in powerpc Athira Rajeev
  2020-07-01  9:21 ` [PATCH v2 10/10] powerpc/perf: Add extended regs support for power10 platform Athira Rajeev
  9 siblings, 0 replies; 41+ messages in thread
From: Athira Rajeev @ 2020-07-01  9:21 UTC (permalink / raw)
  To: mpe; +Cc: mikey, maddy, linuxppc-dev

From: Anju T Sudhakar <anju@linux.vnet.ibm.com>

Add support for perf extended register capability in powerpc.
The capability flag PERF_PMU_CAP_EXTENDED_REGS is used to indicate a
PMU which supports extended registers. The generic code defines the
mask of extended registers as 0 for unsupported architectures.

This patch adds extended regs support for the power9 platform by
exposing the MMCR0, MMCR1 and MMCR2 registers.

The REG_RESERVED mask is updated to include the extended regs.
`PERF_REG_EXTENDED_MASK`, which contains the mask value of the
supported registers, is defined at runtime in the kernel based on the
platform, since the supported registers (and hence the mask value) may
differ from one processor version to another.
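
For reference, a sketch of the mask arithmetic implied by the enum
layout in this patch (r0 through mmcra occupy indices 0..44):

	/*
	 * PERF_REG_POWERPC_MAX == PERF_REG_POWERPC_MMCRA + 1 == 45, so:
	 *   PERF_REG_PMU_MASK     == (1ULL << 45) - 1        -> bits  0..44
	 *   PERF_REG_PMU_MASK_300 == ((1ULL << 48) - 1)
	 *                            - PERF_REG_PMU_MASK     -> bits 45..47
	 * The three extra bits select mmcr0, mmcr1 and mmcr2; together they
	 * give the 48-bit sample mask (0xffffffffffff) shown below.
	 */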

with patch
----------

available registers: r0 r1 r2 r3 r4 r5 r6 r7 r8 r9 r10 r11
r12 r13 r14 r15 r16 r17 r18 r19 r20 r21 r22 r23 r24 r25 r26
r27 r28 r29 r30 r31 nip msr orig_r3 ctr link xer ccr softe
trap dar dsisr sier mmcra mmcr0 mmcr1 mmcr2

PERF_RECORD_SAMPLE(IP, 0x1): 4784/4784: 0 period: 1 addr: 0
... intr regs: mask 0xffffffffffff ABI 64-bit
.... r0    0xc00000000012b77c
.... r1    0xc000003fe5e03930
.... r2    0xc000000001b0e000
.... r3    0xc000003fdcddf800
.... r4    0xc000003fc7880000
.... r5    0x9c422724be
.... r6    0xc000003fe5e03908
.... r7    0xffffff63bddc8706
.... r8    0x9e4
.... r9    0x0
.... r10   0x1
.... r11   0x0
.... r12   0xc0000000001299c0
.... r13   0xc000003ffffc4800
.... r14   0x0
.... r15   0x7fffdd8b8b00
.... r16   0x0
.... r17   0x7fffdd8be6b8
.... r18   0x7e7076607730
.... r19   0x2f
.... r20   0xc00000001fc26c68
.... r21   0xc0002041e4227e00
.... r22   0xc00000002018fb60
.... r23   0x1
.... r24   0xc000003ffec4d900
.... r25   0x80000000
.... r26   0x0
.... r27   0x1
.... r28   0x1
.... r29   0xc000000001be1260
.... r30   0x6008010
.... r31   0xc000003ffebb7218
.... nip   0xc00000000012b910
.... msr   0x9000000000009033
.... orig_r3 0xc00000000012b86c
.... ctr   0xc0000000001299c0
.... link  0xc00000000012b77c
.... xer   0x0
.... ccr   0x28002222
.... softe 0x1
.... trap  0xf00
.... dar   0x0
.... dsisr 0x80000000000
.... sier  0x0
.... mmcra 0x80000000000
.... mmcr0 0x82008090
.... mmcr1 0x1e000000
.... mmcr2 0x0
 ... thread: perf:4784

Signed-off-by: Anju T Sudhakar <anju@linux.vnet.ibm.com>
[Defined PERF_REG_EXTENDED_MASK at run time to add support for different platforms]
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Reviewed-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/perf_event_server.h |  8 +++++++
 arch/powerpc/include/uapi/asm/perf_regs.h    | 14 +++++++++++-
 arch/powerpc/perf/core-book3s.c              |  1 +
 arch/powerpc/perf/perf_regs.c                | 34 +++++++++++++++++++++++++---
 arch/powerpc/perf/power9-pmu.c               |  6 +++++
 5 files changed, 59 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/include/asm/perf_event_server.h b/arch/powerpc/include/asm/perf_event_server.h
index cb207f8..e8d35b6 100644
--- a/arch/powerpc/include/asm/perf_event_server.h
+++ b/arch/powerpc/include/asm/perf_event_server.h
@@ -15,6 +15,9 @@
 #define MAX_EVENT_ALTERNATIVES	8
 #define MAX_LIMITED_HWCOUNTERS	2
 
+extern u64 mask_var;
+#define PERF_REG_EXTENDED_MASK          mask_var
+
 struct perf_event;
 
 /*
@@ -55,6 +58,11 @@ struct power_pmu {
 	int 		*blacklist_ev;
 	/* BHRB entries in the PMU */
 	int		bhrb_nr;
+	/*
+	 * set this flag with `PERF_PMU_CAP_EXTENDED_REGS` if
+	 * the pmu supports extended perf regs capability
+	 */
+	int		capabilities;
 };
 
 /*
diff --git a/arch/powerpc/include/uapi/asm/perf_regs.h b/arch/powerpc/include/uapi/asm/perf_regs.h
index f599064..485b1d5 100644
--- a/arch/powerpc/include/uapi/asm/perf_regs.h
+++ b/arch/powerpc/include/uapi/asm/perf_regs.h
@@ -48,6 +48,18 @@ enum perf_event_powerpc_regs {
 	PERF_REG_POWERPC_DSISR,
 	PERF_REG_POWERPC_SIER,
 	PERF_REG_POWERPC_MMCRA,
-	PERF_REG_POWERPC_MAX,
+	/* Extended registers */
+	PERF_REG_POWERPC_MMCR0,
+	PERF_REG_POWERPC_MMCR1,
+	PERF_REG_POWERPC_MMCR2,
+	/* Max regs without the extended regs */
+	PERF_REG_POWERPC_MAX = PERF_REG_POWERPC_MMCRA + 1,
 };
+
+#define PERF_REG_PMU_MASK	((1ULL << PERF_REG_POWERPC_MAX) - 1)
+
+/* PERF_REG_EXTENDED_MASK value for CPU_FTR_ARCH_300 */
+#define PERF_REG_PMU_MASK_300   (((1ULL << (PERF_REG_POWERPC_MMCR2 + 1)) - 1) \
+				- PERF_REG_PMU_MASK)
+
 #endif /* _UAPI_ASM_POWERPC_PERF_REGS_H */
diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
index 9709606..382d770 100644
--- a/arch/powerpc/perf/core-book3s.c
+++ b/arch/powerpc/perf/core-book3s.c
@@ -2317,6 +2317,7 @@ int register_power_pmu(struct power_pmu *pmu)
 		pmu->name);
 
 	power_pmu.attr_groups = ppmu->attr_groups;
+	power_pmu.capabilities |= (ppmu->capabilities & PERF_PMU_CAP_EXTENDED_REGS);
 
 #ifdef MSR_HV
 	/*
diff --git a/arch/powerpc/perf/perf_regs.c b/arch/powerpc/perf/perf_regs.c
index a213a0a..c8a7e8c 100644
--- a/arch/powerpc/perf/perf_regs.c
+++ b/arch/powerpc/perf/perf_regs.c
@@ -13,9 +13,11 @@
 #include <asm/ptrace.h>
 #include <asm/perf_regs.h>
 
+u64 mask_var;
+
 #define PT_REGS_OFFSET(id, r) [id] = offsetof(struct pt_regs, r)
 
-#define REG_RESERVED (~((1ULL << PERF_REG_POWERPC_MAX) - 1))
+#define REG_RESERVED (~(PERF_REG_EXTENDED_MASK | PERF_REG_PMU_MASK))
 
 static unsigned int pt_regs_offset[PERF_REG_POWERPC_MAX] = {
 	PT_REGS_OFFSET(PERF_REG_POWERPC_R0,  gpr[0]),
@@ -69,10 +71,26 @@
 	PT_REGS_OFFSET(PERF_REG_POWERPC_MMCRA, dsisr),
 };
 
+/* Function to return the extended register values */
+static u64 get_ext_regs_value(int idx)
+{
+	switch (idx) {
+	case PERF_REG_POWERPC_MMCR0:
+		return mfspr(SPRN_MMCR0);
+	case PERF_REG_POWERPC_MMCR1:
+		return mfspr(SPRN_MMCR1);
+	case PERF_REG_POWERPC_MMCR2:
+		return mfspr(SPRN_MMCR2);
+	default: return 0;
+	}
+}
+
 u64 perf_reg_value(struct pt_regs *regs, int idx)
 {
-	if (WARN_ON_ONCE(idx >= PERF_REG_POWERPC_MAX))
-		return 0;
+	u64 PERF_REG_EXTENDED_MAX;
+
+	if (cpu_has_feature(CPU_FTR_ARCH_300))
+		PERF_REG_EXTENDED_MAX = PERF_REG_POWERPC_MMCR2 + 1;
 
 	if (idx == PERF_REG_POWERPC_SIER &&
 	   (IS_ENABLED(CONFIG_FSL_EMB_PERF_EVENT) ||
@@ -85,6 +103,16 @@ u64 perf_reg_value(struct pt_regs *regs, int idx)
 	    IS_ENABLED(CONFIG_PPC32)))
 		return 0;
 
+	if (idx >= PERF_REG_POWERPC_MAX && idx < PERF_REG_EXTENDED_MAX)
+		return get_ext_regs_value(idx);
+
+	/*
+	 * If the idx is referring to value beyond the
+	 * supported registers, return 0 with a warning
+	 */
+	if (WARN_ON_ONCE(idx >= PERF_REG_EXTENDED_MAX))
+		return 0;
+
 	return regs_get_register(regs, pt_regs_offset[idx]);
 }
 
diff --git a/arch/powerpc/perf/power9-pmu.c b/arch/powerpc/perf/power9-pmu.c
index 05dae38..fb6fcad 100644
--- a/arch/powerpc/perf/power9-pmu.c
+++ b/arch/powerpc/perf/power9-pmu.c
@@ -90,6 +90,8 @@ enum {
 #define POWER9_MMCRA_IFM3		0x00000000C0000000UL
 #define POWER9_MMCRA_BHRB_MASK		0x00000000C0000000UL
 
+extern u64 mask_var;
+
 /* Nasty Power9 specific hack */
 #define PVR_POWER9_CUMULUS		0x00002000
 
@@ -434,6 +436,7 @@ static void power9_config_bhrb(u64 pmu_bhrb_filter)
 	.cache_events		= &power9_cache_events,
 	.attr_groups		= power9_pmu_attr_groups,
 	.bhrb_nr		= 32,
+	.capabilities           = PERF_PMU_CAP_EXTENDED_REGS,
 };
 
 int init_power9_pmu(void)
@@ -457,6 +460,9 @@ int init_power9_pmu(void)
 		}
 	}
 
+	/* Set the PERF_REG_EXTENDED_MASK here */
+	mask_var = PERF_REG_PMU_MASK_300;
+
 	rc = register_power_pmu(&power9_pmu);
 	if (rc)
 		return rc;
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v2 09/10] tools/perf: Add perf tools support for extended register capability in powerpc
  2020-07-01  9:20 [PATCH v2 00/10] powerpc/perf: Add support for power10 PMU Hardware Athira Rajeev
                   ` (7 preceding siblings ...)
  2020-07-01  9:21 ` [PATCH v2 08/10] powerpc/perf: Add support for outputting extended regs in perf intr_regs Athira Rajeev
@ 2020-07-01  9:21 ` Athira Rajeev
  2020-07-08 12:04   ` Michael Ellerman
  2020-07-01  9:21 ` [PATCH v2 10/10] powerpc/perf: Add extended regs support for power10 platform Athira Rajeev
  9 siblings, 1 reply; 41+ messages in thread
From: Athira Rajeev @ 2020-07-01  9:21 UTC (permalink / raw)
  To: mpe; +Cc: mikey, maddy, linuxppc-dev

From: Anju T Sudhakar <anju@linux.vnet.ibm.com>

Add extended regs to sample_reg_mask on the tool side for use with
the `-I?` option. The perf tools side uses the extended mask to display
the platform-supported register names (with the -I? option) to the user
and also sends this mask to the kernel to capture the extended
registers in each sample. Hence the mask value is decided based on the
processor version.
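
A usage sketch (the workload name is illustrative; this needs a kernel
carrying the matching kernel-side patch):

	# List the registers available for -I; mmcr0/mmcr1/mmcr2 appear
	# on a supported platform:
	perf record -I?

	# Capture the extended regs in each sample:
	perf record -I -e cycles -- ./workload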

Signed-off-by: Anju T Sudhakar <anju@linux.vnet.ibm.com>
[Decide extended mask at run time based on platform]
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Reviewed-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
---
 tools/arch/powerpc/include/uapi/asm/perf_regs.h | 14 ++++++-
 tools/perf/arch/powerpc/include/perf_regs.h     |  5 ++-
 tools/perf/arch/powerpc/util/perf_regs.c        | 55 +++++++++++++++++++++++++
 3 files changed, 72 insertions(+), 2 deletions(-)

diff --git a/tools/arch/powerpc/include/uapi/asm/perf_regs.h b/tools/arch/powerpc/include/uapi/asm/perf_regs.h
index f599064..485b1d5 100644
--- a/tools/arch/powerpc/include/uapi/asm/perf_regs.h
+++ b/tools/arch/powerpc/include/uapi/asm/perf_regs.h
@@ -48,6 +48,18 @@ enum perf_event_powerpc_regs {
 	PERF_REG_POWERPC_DSISR,
 	PERF_REG_POWERPC_SIER,
 	PERF_REG_POWERPC_MMCRA,
-	PERF_REG_POWERPC_MAX,
+	/* Extended registers */
+	PERF_REG_POWERPC_MMCR0,
+	PERF_REG_POWERPC_MMCR1,
+	PERF_REG_POWERPC_MMCR2,
+	/* Max regs without the extended regs */
+	PERF_REG_POWERPC_MAX = PERF_REG_POWERPC_MMCRA + 1,
 };
+
+#define PERF_REG_PMU_MASK	((1ULL << PERF_REG_POWERPC_MAX) - 1)
+
+/* PERF_REG_EXTENDED_MASK value for CPU_FTR_ARCH_300 */
+#define PERF_REG_PMU_MASK_300   (((1ULL << (PERF_REG_POWERPC_MMCR2 + 1)) - 1) \
+				- PERF_REG_PMU_MASK)
+
 #endif /* _UAPI_ASM_POWERPC_PERF_REGS_H */
diff --git a/tools/perf/arch/powerpc/include/perf_regs.h b/tools/perf/arch/powerpc/include/perf_regs.h
index e18a355..46ed00d 100644
--- a/tools/perf/arch/powerpc/include/perf_regs.h
+++ b/tools/perf/arch/powerpc/include/perf_regs.h
@@ -64,7 +64,10 @@
 	[PERF_REG_POWERPC_DAR] = "dar",
 	[PERF_REG_POWERPC_DSISR] = "dsisr",
 	[PERF_REG_POWERPC_SIER] = "sier",
-	[PERF_REG_POWERPC_MMCRA] = "mmcra"
+	[PERF_REG_POWERPC_MMCRA] = "mmcra",
+	[PERF_REG_POWERPC_MMCR0] = "mmcr0",
+	[PERF_REG_POWERPC_MMCR1] = "mmcr1",
+	[PERF_REG_POWERPC_MMCR2] = "mmcr2",
 };
 
 static inline const char *perf_reg_name(int id)
diff --git a/tools/perf/arch/powerpc/util/perf_regs.c b/tools/perf/arch/powerpc/util/perf_regs.c
index 0a52429..9179230 100644
--- a/tools/perf/arch/powerpc/util/perf_regs.c
+++ b/tools/perf/arch/powerpc/util/perf_regs.c
@@ -6,9 +6,14 @@
 
 #include "../../../util/perf_regs.h"
 #include "../../../util/debug.h"
+#include "../../../util/event.h"
+#include "../../../util/header.h"
+#include "../../../perf-sys.h"
 
 #include <linux/kernel.h>
 
+#define PVR_POWER9		0x004E
+
 const struct sample_reg sample_reg_masks[] = {
 	SMPL_REG(r0, PERF_REG_POWERPC_R0),
 	SMPL_REG(r1, PERF_REG_POWERPC_R1),
@@ -55,6 +60,9 @@
 	SMPL_REG(dsisr, PERF_REG_POWERPC_DSISR),
 	SMPL_REG(sier, PERF_REG_POWERPC_SIER),
 	SMPL_REG(mmcra, PERF_REG_POWERPC_MMCRA),
+	SMPL_REG(mmcr0, PERF_REG_POWERPC_MMCR0),
+	SMPL_REG(mmcr1, PERF_REG_POWERPC_MMCR1),
+	SMPL_REG(mmcr2, PERF_REG_POWERPC_MMCR2),
 	SMPL_REG_END
 };
 
@@ -163,3 +171,50 @@ int arch_sdt_arg_parse_op(char *old_op, char **new_op)
 
 	return SDT_ARG_VALID;
 }
+
+uint64_t arch__intr_reg_mask(void)
+{
+	struct perf_event_attr attr = {
+		.type                   = PERF_TYPE_HARDWARE,
+		.config                 = PERF_COUNT_HW_CPU_CYCLES,
+		.sample_type            = PERF_SAMPLE_REGS_INTR,
+		.precise_ip             = 1,
+		.disabled               = 1,
+		.exclude_kernel         = 1,
+	};
+	int fd, ret;
+	char buffer[64];
+	u32 version;
+	u64 extended_mask = 0;
+
+	/* Get the PVR value to set the extended
+	 * mask specific to platform
+	 */
+	get_cpuid(buffer, sizeof(buffer));
+	ret = sscanf(buffer, "%u,", &version);
+
+	if (ret != 1) {
+		pr_debug("Failed to get the processor version, unable to output extended registers\n");
+		return PERF_REGS_MASK;
+	}
+
+	if (version == PVR_POWER9)
+		extended_mask = PERF_REG_PMU_MASK_300;
+	else
+		return PERF_REGS_MASK;
+
+	attr.sample_regs_intr = extended_mask;
+	attr.sample_period = 1;
+	event_attr_init(&attr);
+
+	/*
+	 * check if the pmu supports perf extended regs, before
+	 * returning the register mask to sample.
+	 */
+	fd = sys_perf_event_open(&attr, 0, -1, -1, 0);
+	if (fd != -1) {
+		close(fd);
+		return (extended_mask | PERF_REGS_MASK);
+	}
+	return PERF_REGS_MASK;
+}
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v2 10/10] powerpc/perf: Add extended regs support for power10 platform
  2020-07-01  9:20 [PATCH v2 00/10] powerpc/perf: Add support for power10 PMU Hardware Athira Rajeev
                   ` (8 preceding siblings ...)
  2020-07-01  9:21 ` [PATCH v2 09/10] tools/perf: Add perf tools support for extended register capability in powerpc Athira Rajeev
@ 2020-07-01  9:21 ` Athira Rajeev
  2020-07-02  9:40   ` kernel test robot
  2020-07-08 12:04   ` Michael Ellerman
  9 siblings, 2 replies; 41+ messages in thread
From: Athira Rajeev @ 2020-07-01  9:21 UTC (permalink / raw)
  To: mpe; +Cc: mikey, maddy, linuxppc-dev

Include the capability flag `PERF_PMU_CAP_EXTENDED_REGS` for power10
and expose the MMCR3, SIER2 and SIER3 registers as part of the
extended regs. Also introduce `PERF_REG_PMU_MASK_31` to define the
extended mask value at runtime for power10.
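
For reference, a sketch of the resulting bit layout, assuming the enum
additions below:

	/*
	 * mmcr3/sier2/sier3 take indices 48..50, so:
	 *   PERF_REG_PMU_MASK_31 == ((1ULL << 51) - 1) - PERF_REG_PMU_MASK
	 * covers bits 45..50 (mmcr0..sier3), three more registers than
	 * PERF_REG_PMU_MASK_300 exposes on power9.
	 */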

Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
---
 arch/powerpc/include/uapi/asm/perf_regs.h       |  6 ++++++
 arch/powerpc/perf/perf_regs.c                   | 10 +++++++++-
 arch/powerpc/perf/power10-pmu.c                 |  6 ++++++
 tools/arch/powerpc/include/uapi/asm/perf_regs.h |  6 ++++++
 tools/perf/arch/powerpc/include/perf_regs.h     |  3 +++
 tools/perf/arch/powerpc/util/perf_regs.c        |  6 ++++++
 6 files changed, 36 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/uapi/asm/perf_regs.h b/arch/powerpc/include/uapi/asm/perf_regs.h
index 485b1d5..020b51c 100644
--- a/arch/powerpc/include/uapi/asm/perf_regs.h
+++ b/arch/powerpc/include/uapi/asm/perf_regs.h
@@ -52,6 +52,9 @@ enum perf_event_powerpc_regs {
 	PERF_REG_POWERPC_MMCR0,
 	PERF_REG_POWERPC_MMCR1,
 	PERF_REG_POWERPC_MMCR2,
+	PERF_REG_POWERPC_MMCR3,
+	PERF_REG_POWERPC_SIER2,
+	PERF_REG_POWERPC_SIER3,
 	/* Max regs without the extended regs */
 	PERF_REG_POWERPC_MAX = PERF_REG_POWERPC_MMCRA + 1,
 };
@@ -62,4 +65,7 @@ enum perf_event_powerpc_regs {
 #define PERF_REG_PMU_MASK_300   (((1ULL << (PERF_REG_POWERPC_MMCR2 + 1)) - 1) \
 				- PERF_REG_PMU_MASK)
 
+/* PERF_REG_EXTENDED_MASK value for CPU_FTR_ARCH_31 */
+#define PERF_REG_PMU_MASK_31	(((1ULL << (PERF_REG_POWERPC_SIER3 + 1)) - 1) \
+				- PERF_REG_PMU_MASK)
 #endif /* _UAPI_ASM_POWERPC_PERF_REGS_H */
diff --git a/arch/powerpc/perf/perf_regs.c b/arch/powerpc/perf/perf_regs.c
index c8a7e8c..c969935 100644
--- a/arch/powerpc/perf/perf_regs.c
+++ b/arch/powerpc/perf/perf_regs.c
@@ -81,6 +81,12 @@ static u64 get_ext_regs_value(int idx)
 		return mfspr(SPRN_MMCR1);
 	case PERF_REG_POWERPC_MMCR2:
 		return mfspr(SPRN_MMCR2);
+	case PERF_REG_POWERPC_MMCR3:
+		return mfspr(SPRN_MMCR3);
+	case PERF_REG_POWERPC_SIER2:
+		return mfspr(SPRN_SIER2);
+	case PERF_REG_POWERPC_SIER3:
+		return mfspr(SPRN_SIER3);
 	default: return 0;
 	}
 }
@@ -89,7 +95,9 @@ u64 perf_reg_value(struct pt_regs *regs, int idx)
 {
 	u64 PERF_REG_EXTENDED_MAX;
 
-	if (cpu_has_feature(CPU_FTR_ARCH_300))
+	if (cpu_has_feature(CPU_FTR_ARCH_31))
+		PERF_REG_EXTENDED_MAX = PERF_REG_POWERPC_SIER3 + 1;
+	else if (cpu_has_feature(CPU_FTR_ARCH_300))
 		PERF_REG_EXTENDED_MAX = PERF_REG_POWERPC_MMCR2 + 1;
 
 	if (idx == PERF_REG_POWERPC_SIER &&
diff --git a/arch/powerpc/perf/power10-pmu.c b/arch/powerpc/perf/power10-pmu.c
index 07fb919..51082d6 100644
--- a/arch/powerpc/perf/power10-pmu.c
+++ b/arch/powerpc/perf/power10-pmu.c
@@ -86,6 +86,8 @@
 #define POWER10_MMCRA_IFM3		0x00000000C0000000UL
 #define POWER10_MMCRA_BHRB_MASK		0x00000000C0000000UL
 
+extern u64 mask_var;
+
 /* Table of alternatives, sorted by column 0 */
 static const unsigned int power10_event_alternatives[][MAX_ALT] = {
 	{ PM_RUN_CYC_ALT,		PM_RUN_CYC },
@@ -397,6 +399,7 @@ static void power10_config_bhrb(u64 pmu_bhrb_filter)
 	.cache_events		= &power10_cache_events,
 	.attr_groups		= power10_pmu_attr_groups,
 	.bhrb_nr		= 32,
+	.capabilities           = PERF_PMU_CAP_EXTENDED_REGS,
 };
 
 int init_power10_pmu(void)
@@ -408,6 +411,9 @@ int init_power10_pmu(void)
 	    strcmp(cur_cpu_spec->oprofile_cpu_type, "ppc64/power10"))
 		return -ENODEV;
 
+	/* Set the PERF_REG_EXTENDED_MASK here */
+	mask_var = PERF_REG_PMU_MASK_31;
+
 	rc = register_power_pmu(&power10_pmu);
 	if (rc)
 		return rc;
diff --git a/tools/arch/powerpc/include/uapi/asm/perf_regs.h b/tools/arch/powerpc/include/uapi/asm/perf_regs.h
index 485b1d5..020b51c 100644
--- a/tools/arch/powerpc/include/uapi/asm/perf_regs.h
+++ b/tools/arch/powerpc/include/uapi/asm/perf_regs.h
@@ -52,6 +52,9 @@ enum perf_event_powerpc_regs {
 	PERF_REG_POWERPC_MMCR0,
 	PERF_REG_POWERPC_MMCR1,
 	PERF_REG_POWERPC_MMCR2,
+	PERF_REG_POWERPC_MMCR3,
+	PERF_REG_POWERPC_SIER2,
+	PERF_REG_POWERPC_SIER3,
 	/* Max regs without the extended regs */
 	PERF_REG_POWERPC_MAX = PERF_REG_POWERPC_MMCRA + 1,
 };
@@ -62,4 +65,7 @@ enum perf_event_powerpc_regs {
 #define PERF_REG_PMU_MASK_300   (((1ULL << (PERF_REG_POWERPC_MMCR2 + 1)) - 1) \
 				- PERF_REG_PMU_MASK)
 
+/* PERF_REG_EXTENDED_MASK value for CPU_FTR_ARCH_31 */
+#define PERF_REG_PMU_MASK_31	(((1ULL << (PERF_REG_POWERPC_SIER3 + 1)) - 1) \
+				- PERF_REG_PMU_MASK)
 #endif /* _UAPI_ASM_POWERPC_PERF_REGS_H */
diff --git a/tools/perf/arch/powerpc/include/perf_regs.h b/tools/perf/arch/powerpc/include/perf_regs.h
index 46ed00d..63f3ac9 100644
--- a/tools/perf/arch/powerpc/include/perf_regs.h
+++ b/tools/perf/arch/powerpc/include/perf_regs.h
@@ -68,6 +68,9 @@
 	[PERF_REG_POWERPC_MMCR0] = "mmcr0",
 	[PERF_REG_POWERPC_MMCR1] = "mmcr1",
 	[PERF_REG_POWERPC_MMCR2] = "mmcr2",
+	[PERF_REG_POWERPC_MMCR3] = "mmcr3",
+	[PERF_REG_POWERPC_SIER2] = "sier2",
+	[PERF_REG_POWERPC_SIER3] = "sier3",
 };
 
 static inline const char *perf_reg_name(int id)
diff --git a/tools/perf/arch/powerpc/util/perf_regs.c b/tools/perf/arch/powerpc/util/perf_regs.c
index 9179230..ccc625f 100644
--- a/tools/perf/arch/powerpc/util/perf_regs.c
+++ b/tools/perf/arch/powerpc/util/perf_regs.c
@@ -13,6 +13,7 @@
 #include <linux/kernel.h>
 
 #define PVR_POWER9		0x004E
+#define PVR_POWER10		0x0080
 
 const struct sample_reg sample_reg_masks[] = {
 	SMPL_REG(r0, PERF_REG_POWERPC_R0),
@@ -63,6 +64,9 @@
 	SMPL_REG(mmcr0, PERF_REG_POWERPC_MMCR0),
 	SMPL_REG(mmcr1, PERF_REG_POWERPC_MMCR1),
 	SMPL_REG(mmcr2, PERF_REG_POWERPC_MMCR2),
+	SMPL_REG(mmcr3, PERF_REG_POWERPC_MMCR3),
+	SMPL_REG(sier2, PERF_REG_POWERPC_SIER2),
+	SMPL_REG(sier3, PERF_REG_POWERPC_SIER3),
 	SMPL_REG_END
 };
 
@@ -200,6 +204,8 @@ uint64_t arch__intr_reg_mask(void)
 
 	if (version == PVR_POWER9)
 		extended_mask = PERF_REG_PMU_MASK_300;
+	else if (version == PVR_POWER10)
+		extended_mask = PERF_REG_PMU_MASK_31;
 	else
 		return PERF_REGS_MASK;
 
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* Re: [PATCH v2 02/10] KVM: PPC: Book3S HV: Save/restore new PMU registers
  2020-07-01  9:20 ` [PATCH v2 02/10] KVM: PPC: Book3S HV: Save/restore new PMU registers Athira Rajeev
@ 2020-07-01 11:11   ` Paul Mackerras
  2020-07-02  6:22     ` Athira Rajeev
  2020-07-07  6:13   ` Michael Neuling
  1 sibling, 1 reply; 41+ messages in thread
From: Paul Mackerras @ 2020-07-01 11:11 UTC (permalink / raw)
  To: Athira Rajeev; +Cc: mikey, maddy, linuxppc-dev

On Wed, Jul 01, 2020 at 05:20:54AM -0400, Athira Rajeev wrote:
> PowerISA v3.1 has added new performance monitoring unit (PMU)
> special purpose registers (SPRs). They are
> 
> Monitor Mode Control Register 3 (MMCR3)
> Sampled Instruction Event Register A (SIER2)
> Sampled Instruction Event Register B (SIER3)
> 
> Patch adds support to save/restore these new
> SPRs while entering/exiting guest.

This mostly looks reasonable, at a quick glance at least, but I am
puzzled by two of the changes you are making.  See below.

> diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
> index 6bf66649..c265800 100644
> --- a/arch/powerpc/kvm/book3s_hv.c
> +++ b/arch/powerpc/kvm/book3s_hv.c
> @@ -1698,7 +1698,8 @@ static int kvmppc_get_one_reg_hv(struct kvm_vcpu *vcpu, u64 id,
>  		*val = get_reg_val(id, vcpu->arch.sdar);
>  		break;
>  	case KVM_REG_PPC_SIER:
> -		*val = get_reg_val(id, vcpu->arch.sier);
> +		i = id - KVM_REG_PPC_SIER;
> +		*val = get_reg_val(id, vcpu->arch.sier[i]);

This is inside a switch (id) statement, so here we know that id is
KVM_REG_PPC_SIER.  In other words i will always be zero, so what is
the point of doing the subtraction?

>  		break;
>  	case KVM_REG_PPC_IAMR:
>  		*val = get_reg_val(id, vcpu->arch.iamr);
> @@ -1919,7 +1920,8 @@ static int kvmppc_set_one_reg_hv(struct kvm_vcpu *vcpu, u64 id,
>  		vcpu->arch.sdar = set_reg_val(id, *val);
>  		break;
>  	case KVM_REG_PPC_SIER:
> -		vcpu->arch.sier = set_reg_val(id, *val);
> +		i = id - KVM_REG_PPC_SIER;
> +		vcpu->arch.sier[i] = set_reg_val(id, *val);

Same comment here.

I think that new defines for the new registers will need to be added
to arch/powerpc/include/uapi/asm/kvm.h and
Documentation/virt/kvm/api.rst, and then new cases will need to be
added to these switch statements.

By the way, please cc kvm-ppc@vger.kernel.org and kvm@vger.kernel.org
on KVM patches.

Paul.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v2 02/10] KVM: PPC: Book3S HV: Save/restore new PMU registers
  2020-07-01 11:11   ` Paul Mackerras
@ 2020-07-02  6:22     ` Athira Rajeev
  0 siblings, 0 replies; 41+ messages in thread
From: Athira Rajeev @ 2020-07-02  6:22 UTC (permalink / raw)
  To: Paul Mackerras; +Cc: mikey, maddy, linuxppc-dev

[-- Attachment #1: Type: text/plain, Size: 2652 bytes --]



> On 01-Jul-2020, at 4:41 PM, Paul Mackerras <paulus@ozlabs.org> wrote:
> 
> On Wed, Jul 01, 2020 at 05:20:54AM -0400, Athira Rajeev wrote:
>> PowerISA v3.1 has added new performance monitoring unit (PMU)
>> special purpose registers (SPRs). They are
>> 
>> Monitor Mode Control Register 3 (MMCR3)
>> Sampled Instruction Event Register A (SIER2)
>> Sampled Instruction Event Register B (SIER3)
>> 
>> Patch adds support to save/restore these new
>> SPRs while entering/exiting guest.
> 
> This mostly looks reasonable, at a quick glance at least, but I am
> puzzled by two of the changes you are making.  See below.
> 
>> diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
>> index 6bf66649..c265800 100644
>> --- a/arch/powerpc/kvm/book3s_hv.c
>> +++ b/arch/powerpc/kvm/book3s_hv.c
>> @@ -1698,7 +1698,8 @@ static int kvmppc_get_one_reg_hv(struct kvm_vcpu *vcpu, u64 id,
>> 		*val = get_reg_val(id, vcpu->arch.sdar);
>> 		break;
>> 	case KVM_REG_PPC_SIER:
>> -		*val = get_reg_val(id, vcpu->arch.sier);
>> +		i = id - KVM_REG_PPC_SIER;
>> +		*val = get_reg_val(id, vcpu->arch.sier[i]);
> 
> This is inside a switch (id) statement, so here we know that id is
> KVM_REG_PPC_SIER.  In other words i will always be zero, so what is
> the point of doing the subtraction?
> 
>> 		break;
>> 	case KVM_REG_PPC_IAMR:
>> 		*val = get_reg_val(id, vcpu->arch.iamr);
>> @@ -1919,7 +1920,8 @@ static int kvmppc_set_one_reg_hv(struct kvm_vcpu *vcpu, u64 id,
>> 		vcpu->arch.sdar = set_reg_val(id, *val);
>> 		break;
>> 	case KVM_REG_PPC_SIER:
>> -		vcpu->arch.sier = set_reg_val(id, *val);
>> +		i = id - KVM_REG_PPC_SIER;
>> +		vcpu->arch.sier[i] = set_reg_val(id, *val);
> 
> Same comment here.

Hi Paul,

Thanks for reviewing the patch. Yes, it is true that currently `i` will
always be zero since the only case is KVM_REG_PPC_SIER. I have kept the
subtraction here anticipating the addition of new registers to the
switch case, for example:
case KVM_REG_PPC_SIER..KVM_REG_PPC_SIER3
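
Expanded slightly (a sketch; KVM_REG_PPC_SIER2/SIER3 are assumed,
not-yet-defined one_reg IDs that would need to be allocated
consecutively; the `...` case range is the GCC extension already used
in this file):

	case KVM_REG_PPC_SIER ... KVM_REG_PPC_SIER3:
		i = id - KVM_REG_PPC_SIER;
		*val = get_reg_val(id, vcpu->arch.sier[i]);
		break;
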

> 
> I think that new defines for the new registers will need to be added
> to arch/powerpc/include/uapi/asm/kvm.h and
> Documentation/virt/kvm/api.rst, and then new cases will need to be
> added to these switch statements.

Yes, the new registers are not yet added to kvm.h.
I will address these comments and include changes for arch/powerpc/include/uapi/asm/kvm.h and Documentation/virt/kvm/api.rst in the
next version.

> 
> By the way, please cc kvm-ppc@vger.kernel.org and kvm@vger.kernel.org
> on KVM patches.

Sure, will include KVM mailing list in the next version

Thanks
Athira 
> 
> Paul.


[-- Attachment #2: Type: text/html, Size: 7456 bytes --]

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v2 06/10] powerpc/perf: power10 Performance Monitoring support
  2020-07-01  9:20 ` [PATCH v2 06/10] powerpc/perf: power10 Performance Monitoring support Athira Rajeev
@ 2020-07-02  9:06   ` kernel test robot
  2020-07-07  6:50   ` Michael Neuling
  1 sibling, 0 replies; 41+ messages in thread
From: kernel test robot @ 2020-07-02  9:06 UTC (permalink / raw)
  To: Athira Rajeev, mpe; +Cc: linuxppc-dev, mikey, maddy, kbuild-all

[-- Attachment #1: Type: text/plain, Size: 1568 bytes --]

Hi Athira,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on powerpc/next]
[also build test WARNING on tip/perf/core v5.8-rc3 next-20200702]
[cannot apply to kvm-ppc/kvm-ppc-next]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Athira-Rajeev/powerpc-perf-Add-support-for-power10-PMU-Hardware/20200701-181147
base:   https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next
config: powerpc-allyesconfig (attached as .config)
compiler: powerpc64-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=powerpc 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

>> arch/powerpc/perf/power10-pmu.c:393:5: warning: no previous prototype for 'init_power10_pmu' [-Wmissing-prototypes]
     393 | int init_power10_pmu(void)
         |     ^~~~~~~~~~~~~~~~

vim +/init_power10_pmu +393 arch/powerpc/perf/power10-pmu.c

   392	
 > 393	int init_power10_pmu(void)
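
A sketch of the likely one-line fix (the series already touches
arch/powerpc/perf/internal.h per the diffstat), declaring the prototype
alongside the other init_powerN_pmu() declarations (assumed naming):

	/* arch/powerpc/perf/internal.h */
	extern int init_power10_pmu(void);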

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 69717 bytes --]

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v2 10/10] powerpc/perf: Add extended regs support for power10 platform
  2020-07-01  9:21 ` [PATCH v2 10/10] powerpc/perf: Add extended regs support for power10 platform Athira Rajeev
@ 2020-07-02  9:40   ` kernel test robot
  2020-07-08  1:53     ` Athira Rajeev
  2020-07-08 12:04   ` Michael Ellerman
  1 sibling, 1 reply; 41+ messages in thread
From: kernel test robot @ 2020-07-02  9:40 UTC (permalink / raw)
  To: Athira Rajeev, mpe; +Cc: linuxppc-dev, mikey, maddy, kbuild-all

[-- Attachment #1: Type: text/plain, Size: 1489 bytes --]

Hi Athira,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on powerpc/next]
[also build test ERROR on tip/perf/core v5.8-rc3 next-20200702]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Athira-Rajeev/powerpc-perf-Add-support-for-power10-PMU-Hardware/20200701-181147
base:   https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next
config: powerpc-pmac32_defconfig (attached as .config)
compiler: powerpc-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=powerpc 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

   {standard input}: Assembler messages:
>> {standard input}:84: Error: unsupported relocation against SPRN_SIER2
>> {standard input}:91: Error: unsupported relocation against SPRN_SIER3
>> {standard input}:119: Error: unsupported relocation against SPRN_MMCR3

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 24961 bytes --]

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v2 02/10] KVM: PPC: Book3S HV: Save/restore new PMU registers
  2020-07-01  9:20 ` [PATCH v2 02/10] KVM: PPC: Book3S HV: Save/restore new PMU registers Athira Rajeev
  2020-07-01 11:11   ` Paul Mackerras
@ 2020-07-07  6:13   ` Michael Neuling
  1 sibling, 0 replies; 41+ messages in thread
From: Michael Neuling @ 2020-07-07  6:13 UTC (permalink / raw)
  To: Athira Rajeev, mpe; +Cc: Paul Mackerras, maddy, linuxppc-dev

> @@ -637,12 +637,12 @@ struct kvm_vcpu_arch {
>  	u32 ccr1;
>  	u32 dbsr;
>  
> -	u64 mmcr[5];
> +	u64 mmcr[6];
>  	u32 pmc[8];
>  	u32 spmc[2];
>  	u64 siar;


> +	mfspr	r5, SPRN_MMCR3
> +	mfspr	r6, SPRN_SIER2
> +	mfspr	r7, SPRN_SIER3
> +	std	r5, VCPU_MMCR + 40(r9)
> +	std	r6, VCPU_SIER + 8(r9)
> +	std	r7, VCPU_SIER + 16(r9)


This is looking pretty fragile now. vcpu mmcr[6] stores (in this strict order):
   mmcr0, mmcr1, mmcra, mmcr2, mmcrs, mmcr3.

Can we clean that up? Give mmcra and mmcrs their own entries in vcpu and then
have a flat array for mmcr0-3.

Mikey

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v2 04/10] powerpc/perf: Add power10_feat to dt_cpu_ftrs
  2020-07-01  9:20 ` [PATCH v2 04/10] powerpc/perf: Add power10_feat to dt_cpu_ftrs Athira Rajeev
@ 2020-07-07  6:22   ` Michael Neuling
  2020-07-08  2:13     ` Athira Rajeev
  2020-07-08 11:15   ` Michael Ellerman
  1 sibling, 1 reply; 41+ messages in thread
From: Michael Neuling @ 2020-07-07  6:22 UTC (permalink / raw)
  To: Athira Rajeev, mpe; +Cc: maddy, linuxppc-dev

On Wed, 2020-07-01 at 05:20 -0400, Athira Rajeev wrote:
> From: Madhavan Srinivasan <maddy@linux.ibm.com>
> 
> Add power10 feature function to dt_cpu_ftrs.c along
> with a power10 specific init() to initialize pmu sprs.

Can you say why you're doing this?

Can you add some text about what you're doing to the BHRB in this patch?

Mikey

> 
> Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
> ---
>  arch/powerpc/include/asm/reg.h        |  3 +++
>  arch/powerpc/kernel/cpu_setup_power.S |  7 +++++++
>  arch/powerpc/kernel/dt_cpu_ftrs.c     | 26 ++++++++++++++++++++++++++
>  3 files changed, 36 insertions(+)
> 
> diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h
> index 21a1b2d..900ada1 100644
> --- a/arch/powerpc/include/asm/reg.h
> +++ b/arch/powerpc/include/asm/reg.h
> @@ -1068,6 +1068,9 @@
>  #define MMCR0_PMC2_LOADMISSTIME	0x5
>  #endif
>  
> +/* BHRB disable bit for PowerISA v3.1 */
> +#define MMCRA_BHRB_DISABLE	0x0000002000000000
> +
>  /*
>   * SPRG usage:
>   *
> diff --git a/arch/powerpc/kernel/cpu_setup_power.S
> b/arch/powerpc/kernel/cpu_setup_power.S
> index efdcfa7..e8b3370c 100644
> --- a/arch/powerpc/kernel/cpu_setup_power.S
> +++ b/arch/powerpc/kernel/cpu_setup_power.S
> @@ -233,3 +233,10 @@ __init_PMU_ISA207:
>  	li	r5,0
>  	mtspr	SPRN_MMCRS,r5
>  	blr
> +
> +__init_PMU_ISA31:
> +	li	r5,0
> +	mtspr	SPRN_MMCR3,r5
> +	LOAD_REG_IMMEDIATE(r5, MMCRA_BHRB_DISABLE)
> +	mtspr	SPRN_MMCRA,r5
> +	blr
> diff --git a/arch/powerpc/kernel/dt_cpu_ftrs.c
> b/arch/powerpc/kernel/dt_cpu_ftrs.c
> index a0edeb3..14a513f 100644
> --- a/arch/powerpc/kernel/dt_cpu_ftrs.c
> +++ b/arch/powerpc/kernel/dt_cpu_ftrs.c
> @@ -449,6 +449,31 @@ static int __init feat_enable_pmu_power9(struct
> dt_cpu_feature *f)
>  	return 1;
>  }
>  
> +static void init_pmu_power10(void)
> +{
> +	init_pmu_power9();
> +
> +	mtspr(SPRN_MMCR3, 0);
> +	mtspr(SPRN_MMCRA, MMCRA_BHRB_DISABLE);
> +}
> +
> +static int __init feat_enable_pmu_power10(struct dt_cpu_feature *f)
> +{
> +	hfscr_pmu_enable();
> +
> +	init_pmu_power10();
> +	init_pmu_registers = init_pmu_power10;
> +
> +	cur_cpu_spec->cpu_features |= CPU_FTR_MMCRA;
> +	cur_cpu_spec->cpu_user_features |= PPC_FEATURE_PSERIES_PERFMON_COMPAT;
> +
> +	cur_cpu_spec->num_pmcs          = 6;
> +	cur_cpu_spec->pmc_type          = PPC_PMC_IBM;
> +	cur_cpu_spec->oprofile_cpu_type = "ppc64/power10";
> +
> +	return 1;
> +}
> +
>  static int __init feat_enable_tm(struct dt_cpu_feature *f)
>  {
>  #ifdef CONFIG_PPC_TRANSACTIONAL_MEM
> @@ -638,6 +663,7 @@ struct dt_cpu_feature_match {
>  	{"pc-relative-addressing", feat_enable, 0},
>  	{"machine-check-power9", feat_enable_mce_power9, 0},
>  	{"performance-monitor-power9", feat_enable_pmu_power9, 0},
> +	{"performance-monitor-power10", feat_enable_pmu_power10, 0},
>  	{"event-based-branch-v3", feat_enable, 0},
>  	{"random-number-generator", feat_enable, 0},
>  	{"system-call-vectored", feat_disable, 0},


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v2 06/10] powerpc/perf: power10 Performance Monitoring support
  2020-07-01  9:20 ` [PATCH v2 06/10] powerpc/perf: power10 Performance Monitoring support Athira Rajeev
  2020-07-02  9:06   ` kernel test robot
@ 2020-07-07  6:50   ` Michael Neuling
  2020-07-08 10:56     ` Athira Rajeev
  1 sibling, 1 reply; 41+ messages in thread
From: Michael Neuling @ 2020-07-07  6:50 UTC (permalink / raw)
  To: Athira Rajeev, mpe; +Cc: maddy, linuxppc-dev


> @@ -480,6 +520,7 @@ int isa207_compute_mmcr(u64 event[], int n_ev,
>  	mmcr[1] = mmcr1;
>  	mmcr[2] = mmcra;
>  	mmcr[3] = mmcr2;
> +	mmcr[4] = mmcr3;

This is fragile like the kvm vcpu case I commented on before but it gets passed
in via a function parameter?! Can you create a struct to store these in rather
than this odd ball numbering?

The cleanup should start in patch 1/10 here:

        /*
         * The order of the MMCR array is:
-        *  - 64-bit, MMCR0, MMCR1, MMCRA, MMCR2
+        *  - 64-bit, MMCR0, MMCR1, MMCRA, MMCR2, MMCR3
         *  - 32-bit, MMCR0, MMCR1, MMCR2
         */
-       unsigned long mmcr[4];
+       unsigned long mmcr[5];
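
A sketch of the shape being suggested (field names illustrative, not
from the posted series):

	struct mmcr_regs {
		unsigned long mmcr0;
		unsigned long mmcr1;
		unsigned long mmcr2;
		unsigned long mmcr3;
		unsigned long mmcra;
		unsigned long mmcrs;
	};

Then isa207_compute_mmcr() and friends could assign named fields
(e.g. mmcr->mmcra = mmcra) instead of relying on magic indices.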



mikey

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v2 07/10] powerpc/perf: support BHRB disable bit and new filtering modes
  2020-07-01  9:20 ` [PATCH v2 07/10] powerpc/perf: support BHRB disable bit and new filtering modes Athira Rajeev
@ 2020-07-07  7:17   ` Michael Neuling
  2020-07-08  7:41     ` Athira Rajeev
  2020-07-08  7:43     ` Gautham R Shenoy
  2020-07-08 11:42   ` Michael Ellerman
  1 sibling, 2 replies; 41+ messages in thread
From: Michael Neuling @ 2020-07-07  7:17 UTC (permalink / raw)
  To: Athira Rajeev, mpe; +Cc: Vaidyanathan Srinivasan, ego, maddy, linuxppc-dev

On Wed, 2020-07-01 at 05:20 -0400, Athira Rajeev wrote:
> PowerISA v3.1 has a few updates for the Branch History Rolling
> Buffer (BHRB). The first is the addition of a BHRB disable bit, and
> the second is new filtering modes for BHRB.
> 
> BHRB disable is controlled via Monitor Mode Control Register A (MMCRA)
> bit 26, namely "BHRB Recording Disable (BHRBRD)". This field controls
> whether BHRB entries are written when BHRB recording is enabled by other
> bits. This patch implements support for the BHRB disable bit.

Probably good to note here that this is backwards compatible. So if you have a
kernel that doesn't know about this bit, it'll clear it and hence you still get
BHRB. 

You should also note why you'd want to disable this (ie. the core will
run faster).

> Secondly, PowerISA v3.1 introduces filtering support for
> PERF_SAMPLE_BRANCH_IND_CALL/COND. The patch adds BHRB filter support
> for "ind_call" and "cond" in power10_bhrb_filter_map().
> 
> Commit bb19af816025 ("powerpc/perf: Prevent kernel address leak to
> userspace via BHRB buffer") added a check in bhrb_read() to filter
> kernel addresses from the BHRB buffer. This patch modifies it to skip
> that check for PowerISA v3.1 based processors, since PowerISA v3.1
> allows only MSR[PR]=1 addresses to be written to the BHRB buffer.
> 
> Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
> ---
>  arch/powerpc/perf/core-book3s.c       | 27 +++++++++++++++++++++------
>  arch/powerpc/perf/isa207-common.c     | 13 +++++++++++++
>  arch/powerpc/perf/power10-pmu.c       | 13 +++++++++++--
>  arch/powerpc/platforms/powernv/idle.c | 14 ++++++++++++++

This touches the idle code so we should get those guys on CC (adding Vaidy and
Ego).

>  4 files changed, 59 insertions(+), 8 deletions(-)
> 
> diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
> index fad5159..9709606 100644
> --- a/arch/powerpc/perf/core-book3s.c
> +++ b/arch/powerpc/perf/core-book3s.c
> @@ -466,9 +466,13 @@ static void power_pmu_bhrb_read(struct perf_event *event, struct cpu_hw_events *
>  			 * addresses at this point. Check the privileges before
>  			 * exporting it to userspace (avoid exposure of regions
>  			 * where we could have speculative execution)
> +			 * In case of ISA v3.1, BHRB will capture only user-space
> +			 * addresses, hence include a check before the filtering code
>  			 */
> -			if (is_kernel_addr(addr) && perf_allow_kernel(&event->attr) != 0)
> -				continue;
> +			if (!(ppmu->flags & PPMU_ARCH_310S))
> +				if (is_kernel_addr(addr) &&
> +				perf_allow_kernel(&event->attr) != 0)
> +					continue;
>  
>  			/* Branches are read most recent first (ie. mfbhrb 0 is
>  			 * the most recent branch).
> @@ -1212,7 +1216,7 @@ static void write_mmcr0(struct cpu_hw_events *cpuhw, unsigned long mmcr0)
>  static void power_pmu_disable(struct pmu *pmu)
>  {
>  	struct cpu_hw_events *cpuhw;
> -	unsigned long flags, mmcr0, val;
> +	unsigned long flags, mmcr0, val, mmcra = 0;
>  
>  	if (!ppmu)
>  		return;
> @@ -1245,12 +1249,23 @@ static void power_pmu_disable(struct pmu *pmu)
>  		mb();
>  		isync();
>  
> +		val = mmcra = cpuhw->mmcr[2];
> +
>  		/*
>  		 * Disable instruction sampling if it was enabled
>  		 */
> -		if (cpuhw->mmcr[2] & MMCRA_SAMPLE_ENABLE) {
> -			mtspr(SPRN_MMCRA,
> -			      cpuhw->mmcr[2] & ~MMCRA_SAMPLE_ENABLE);
> +		if (cpuhw->mmcr[2] & MMCRA_SAMPLE_ENABLE)
> +			mmcra = cpuhw->mmcr[2] & ~MMCRA_SAMPLE_ENABLE;
> +
> +		/* Disable BHRB via mmcra [:26] for p10 if needed */
> +		if (!(cpuhw->mmcr[2] & MMCRA_BHRB_DISABLE))
> +			mmcra |= MMCRA_BHRB_DISABLE;
> +
> +		/* Write SPRN_MMCRA if mmcra has either disabled
> +		 * instruction sampling or BHRB
> +		 */
> +		if (val != mmcra) {
> +			mtspr(SPRN_MMCRA, mmcra);
>  			mb();
>  			isync();
>  		}
> diff --git a/arch/powerpc/perf/isa207-common.c b/arch/powerpc/perf/isa207-common.c
> index 7d4839e..463d925 100644
> --- a/arch/powerpc/perf/isa207-common.c
> +++ b/arch/powerpc/perf/isa207-common.c
> @@ -404,6 +404,12 @@ int isa207_compute_mmcr(u64 event[], int n_ev,
>  
>  	mmcra = mmcr1 = mmcr2 = mmcr3 = 0;
>  
> +	/* Disable bhrb unless explicitly requested
> +	 * by setting MMCRA [:26] bit.
> +	 */
> +	if (cpu_has_feature(CPU_FTR_ARCH_31))
> +		mmcra |= MMCRA_BHRB_DISABLE;
> +
>  	/* Second pass: assign PMCs, set all MMCR1 fields */
>  	for (i = 0; i < n_ev; ++i) {
>  		pmc     = (event[i] >> EVENT_PMC_SHIFT) & EVENT_PMC_MASK;
> @@ -475,10 +481,17 @@ int isa207_compute_mmcr(u64 event[], int n_ev,
>  		}
>  
>  		if (event[i] & EVENT_WANTS_BHRB) {
> +			/* set MMCRA[:26] to 0 for Power10 to enable BHRB */
> +			if (cpu_has_feature(CPU_FTR_ARCH_31))
> +				mmcra &= ~MMCRA_BHRB_DISABLE;
>  			val = (event[i] >> EVENT_IFM_SHIFT) & EVENT_IFM_MASK;
>  			mmcra |= val << MMCRA_IFM_SHIFT;
>  		}
>  
> +		/* set MMCRA[:26] to 0 if there is user request for BHRB */
> +		if (cpu_has_feature(CPU_FTR_ARCH_31) && has_branch_stack(pevents[i]))
> +			mmcra &= ~MMCRA_BHRB_DISABLE;
> +
>  		if (pevents[i]->attr.exclude_user)
>  			mmcr2 |= MMCR2_FCP(pmc);
>  
> diff --git a/arch/powerpc/perf/power10-pmu.c b/arch/powerpc/perf/power10-pmu.c
> index d64d69d..07fb919 100644
> --- a/arch/powerpc/perf/power10-pmu.c
> +++ b/arch/powerpc/perf/power10-pmu.c
> @@ -82,6 +82,8 @@
>  
>  /* MMCRA IFM bits - POWER10 */
>  #define POWER10_MMCRA_IFM1		0x0000000040000000UL
> +#define POWER10_MMCRA_IFM2		0x0000000080000000UL
> +#define POWER10_MMCRA_IFM3		0x00000000C0000000UL
>  #define POWER10_MMCRA_BHRB_MASK		0x00000000C0000000UL
>  
>  /* Table of alternatives, sorted by column 0 */
> @@ -233,8 +235,15 @@ static u64 power10_bhrb_filter_map(u64 branch_sample_type)
>  	if (branch_sample_type & PERF_SAMPLE_BRANCH_ANY_RETURN)
>  		return -1;
>  
> -	if (branch_sample_type & PERF_SAMPLE_BRANCH_IND_CALL)
> -		return -1;
> +	if (branch_sample_type & PERF_SAMPLE_BRANCH_IND_CALL) {
> +		pmu_bhrb_filter |= POWER10_MMCRA_IFM2;
> +		return pmu_bhrb_filter;
> +	}
> +
> +	if (branch_sample_type & PERF_SAMPLE_BRANCH_COND) {
> +		pmu_bhrb_filter |= POWER10_MMCRA_IFM3;
> +		return pmu_bhrb_filter;
> +	}
>  
>  	if (branch_sample_type & PERF_SAMPLE_BRANCH_CALL)
>  		return -1;
> diff --git a/arch/powerpc/platforms/powernv/idle.c b/arch/powerpc/platforms/powernv/idle.c
> index 2dd4673..7db99c7 100644
> --- a/arch/powerpc/platforms/powernv/idle.c
> +++ b/arch/powerpc/platforms/powernv/idle.c
> @@ -611,6 +611,7 @@ static unsigned long power9_idle_stop(unsigned long psscr, bool mmu_on)
>  	unsigned long srr1;
>  	unsigned long pls;
>  	unsigned long mmcr0 = 0;
> +	unsigned long mmcra_bhrb = 0;
>  	struct p9_sprs sprs = {}; /* avoid false used-uninitialised */
>  	bool sprs_saved = false;
>  
> @@ -657,6 +658,15 @@ static unsigned long power9_idle_stop(unsigned long psscr, bool mmu_on)
>  		  */
>  		mmcr0		= mfspr(SPRN_MMCR0);
>  	}
> +
> +	if (cpu_has_feature(CPU_FTR_ARCH_31)) {
> +		/* POWER10 uses MMCRA[:26] as BHRB disable bit
> +		 * to disable BHRB logic when not used. Hence save and
> +		 * restore MMCRA after a state-loss idle.
> +		 */
> +		mmcra_bhrb		= mfspr(SPRN_MMCRA);


Why is the bhrb bit of mmcra special here?

> +	}
> +
>  	if ((psscr & PSSCR_RL_MASK) >= pnv_first_spr_loss_level) {
>  		sprs.lpcr	= mfspr(SPRN_LPCR);
>  		sprs.hfscr	= mfspr(SPRN_HFSCR);
> @@ -721,6 +731,10 @@ static unsigned long power9_idle_stop(unsigned long psscr, bool mmu_on)
>  			mtspr(SPRN_MMCR0, mmcr0);
>  		}
>  
> +		/* Reload MMCRA to restore BHRB disable bit for POWER10 */
> +		if (cpu_has_feature(CPU_FTR_ARCH_31))
> +			mtspr(SPRN_MMCRA, mmcra_bhrb);
> +
>  		/*
>  		 * DD2.2 and earlier need to set then clear bit 60 in MMCRA
>  		 * to ensure the PMU starts running.


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v2 10/10] powerpc/perf: Add extended regs support for power10 platform
  2020-07-02  9:40   ` kernel test robot
@ 2020-07-08  1:53     ` Athira Rajeev
  0 siblings, 0 replies; 41+ messages in thread
From: Athira Rajeev @ 2020-07-08  1:53 UTC (permalink / raw)
  To: kernel test robot; +Cc: mikey, maddy, linuxppc-dev, kbuild-all



> On 02-Jul-2020, at 3:10 PM, kernel test robot <lkp@intel.com> wrote:
> 
> Hi Athira,
> 
> Thank you for the patch! Yet something to improve:
> 
> [auto build test ERROR on powerpc/next]
> [also build test ERROR on tip/perf/core v5.8-rc3 next-20200702]
> [If your patch is applied to the wrong git tree, kindly drop us a note.
> And when submitting patch, we suggest to use '--base' as documented in
> https://git-scm.com/docs/git-format-patch]
> 
> url:    https://github.com/0day-ci/linux/commits/Athira-Rajeev/powerpc-perf-Add-support-for-power10-PMU-Hardware/20200701-181147
> base:   https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next
> config: powerpc-pmac32_defconfig (attached as .config)
> compiler: powerpc-linux-gcc (GCC) 9.3.0
> reproduce (this is a W=1 build):
>        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
>        chmod +x ~/bin/make.cross
>        # save the attached .config to linux build tree
>        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=powerpc 
> 
> If you fix the issue, kindly add following tag as appropriate
> Reported-by: kernel test robot <lkp@intel.com>
> 
> All errors (new ones prefixed by >>):
> 
>   {standard input}: Assembler messages:
>>> {standard input}:84: Error: unsupported relocation against SPRN_SIER2
>>> {standard input}:91: Error: unsupported relocation against SPRN_SIER3
>>> {standard input}:119: Error: unsupported relocation against SPRN_MMCR3

These regs are not valid for the ppc32 platform. Will fix this by
placing the usage of these regs under a conditional check for
CONFIG_PPC64 in the next version.
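
For example, a minimal sketch of the guard in get_ext_regs_value()
(placement illustrative):

	#ifdef CONFIG_PPC64
	case PERF_REG_POWERPC_MMCR3:
		return mfspr(SPRN_MMCR3);
	case PERF_REG_POWERPC_SIER2:
		return mfspr(SPRN_SIER2);
	case PERF_REG_POWERPC_SIER3:
		return mfspr(SPRN_SIER3);
	#endif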

Thanks
Athira Rajeev
> 
> ---
> 0-DAY CI Kernel Test Service, Intel Corporation
> https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
> <.config.gz>


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v2 04/10] powerpc/perf: Add power10_feat to dt_cpu_ftrs
  2020-07-07  6:22   ` Michael Neuling
@ 2020-07-08  2:13     ` Athira Rajeev
  0 siblings, 0 replies; 41+ messages in thread
From: Athira Rajeev @ 2020-07-08  2:13 UTC (permalink / raw)
  To: Michael Neuling; +Cc: maddy, linuxppc-dev



> On 07-Jul-2020, at 11:52 AM, Michael Neuling <mikey@neuling.org> wrote:
> 
> On Wed, 2020-07-01 at 05:20 -0400, Athira Rajeev wrote:
>> From: Madhavan Srinivasan <maddy@linux.ibm.com>
>> 
>> Add power10 feature function to dt_cpu_ftrs.c along
>> with a power10 specific init() to initialize pmu sprs.
> 
> Can you say why you're doing this?
> 
> Can you add some text about what you're doing to the BHRB in this patch?

Sure, I will include this information in the commit message in the next version.

Thanks
Athira 

> 
> Mikey
> 
>> 
>> Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
>> ---
>> arch/powerpc/include/asm/reg.h        |  3 +++
>> arch/powerpc/kernel/cpu_setup_power.S |  7 +++++++
>> arch/powerpc/kernel/dt_cpu_ftrs.c     | 26 ++++++++++++++++++++++++++
>> 3 files changed, 36 insertions(+)
>> 
>> diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h
>> index 21a1b2d..900ada1 100644
>> --- a/arch/powerpc/include/asm/reg.h
>> +++ b/arch/powerpc/include/asm/reg.h
>> @@ -1068,6 +1068,9 @@
>> #define MMCR0_PMC2_LOADMISSTIME	0x5
>> #endif
>> 
>> +/* BHRB disable bit for PowerISA v3.1 */
>> +#define MMCRA_BHRB_DISABLE	0x0000002000000000
>> +
>> /*
>>  * SPRG usage:
>>  *
>> diff --git a/arch/powerpc/kernel/cpu_setup_power.S
>> b/arch/powerpc/kernel/cpu_setup_power.S
>> index efdcfa7..e8b3370c 100644
>> --- a/arch/powerpc/kernel/cpu_setup_power.S
>> +++ b/arch/powerpc/kernel/cpu_setup_power.S
>> @@ -233,3 +233,10 @@ __init_PMU_ISA207:
>> 	li	r5,0
>> 	mtspr	SPRN_MMCRS,r5
>> 	blr
>> +
>> +__init_PMU_ISA31:
>> +	li	r5,0
>> +	mtspr	SPRN_MMCR3,r5
>> +	LOAD_REG_IMMEDIATE(r5, MMCRA_BHRB_DISABLE)
>> +	mtspr	SPRN_MMCRA,r5
>> +	blr
>> diff --git a/arch/powerpc/kernel/dt_cpu_ftrs.c
>> b/arch/powerpc/kernel/dt_cpu_ftrs.c
>> index a0edeb3..14a513f 100644
>> --- a/arch/powerpc/kernel/dt_cpu_ftrs.c
>> +++ b/arch/powerpc/kernel/dt_cpu_ftrs.c
>> @@ -449,6 +449,31 @@ static int __init feat_enable_pmu_power9(struct
>> dt_cpu_feature *f)
>> 	return 1;
>> }
>> 
>> +static void init_pmu_power10(void)
>> +{
>> +	init_pmu_power9();
>> +
>> +	mtspr(SPRN_MMCR3, 0);
>> +	mtspr(SPRN_MMCRA, MMCRA_BHRB_DISABLE);
>> +}
>> +
>> +static int __init feat_enable_pmu_power10(struct dt_cpu_feature *f)
>> +{
>> +	hfscr_pmu_enable();
>> +
>> +	init_pmu_power10();
>> +	init_pmu_registers = init_pmu_power10;
>> +
>> +	cur_cpu_spec->cpu_features |= CPU_FTR_MMCRA;
>> +	cur_cpu_spec->cpu_user_features |= PPC_FEATURE_PSERIES_PERFMON_COMPAT;
>> +
>> +	cur_cpu_spec->num_pmcs          = 6;
>> +	cur_cpu_spec->pmc_type          = PPC_PMC_IBM;
>> +	cur_cpu_spec->oprofile_cpu_type = "ppc64/power10";
>> +
>> +	return 1;
>> +}
>> +
>> static int __init feat_enable_tm(struct dt_cpu_feature *f)
>> {
>> #ifdef CONFIG_PPC_TRANSACTIONAL_MEM
>> @@ -638,6 +663,7 @@ struct dt_cpu_feature_match {
>> 	{"pc-relative-addressing", feat_enable, 0},
>> 	{"machine-check-power9", feat_enable_mce_power9, 0},
>> 	{"performance-monitor-power9", feat_enable_pmu_power9, 0},
>> +	{"performance-monitor-power10", feat_enable_pmu_power10, 0},
>> 	{"event-based-branch-v3", feat_enable, 0},
>> 	{"random-number-generator", feat_enable, 0},
>> 	{"system-call-vectored", feat_disable, 0},
> 


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v2 07/10] powerpc/perf: support BHRB disable bit and new filtering modes
  2020-07-07  7:17   ` Michael Neuling
@ 2020-07-08  7:41     ` Athira Rajeev
  2020-07-08  7:43     ` Gautham R Shenoy
  1 sibling, 0 replies; 41+ messages in thread
From: Athira Rajeev @ 2020-07-08  7:41 UTC (permalink / raw)
  To: Michael Neuling; +Cc: Vaidyanathan Srinivasan, maddy, linuxppc-dev, ego




> On 07-Jul-2020, at 12:47 PM, Michael Neuling <mikey@neuling.org> wrote:
> 
> On Wed, 2020-07-01 at 05:20 -0400, Athira Rajeev wrote:
>> PowerISA v3.1 has few updates for the Branch History Rolling Buffer(BHRB).
>> First is the addition of BHRB disable bit and second new filtering
>> modes for BHRB.
>> 
>> BHRB disable is controlled via Monitor Mode Control Register A (MMCRA)
>> bit 26, namely "BHRB Recording Disable (BHRBRD)". This field controls
>> whether BHRB entries are written when BHRB recording is enabled by other
>> bits. Patch implements support for this BHRB disable bit.
> 
> Probably good to note here that this is backwards compatible. So if you have a
> kernel that doesn't know about this bit, it'll clear it and hence you still get
> BHRB. 
> 
> You should also note why you'd want to do disable this (ie. the core will run
> faster).
> 


Sure Mikey, I will add this information to the commit message.

Thanks
Athira


>> Secondly PowerISA v3.1 introduce filtering support for
>> PERF_SAMPLE_BRANCH_IND_CALL/COND. The patch adds BHRB filter support
>> for "ind_call" and "cond" in power10_bhrb_filter_map().
>> 
>> 'commit bb19af816025 ("powerpc/perf: Prevent kernel address leak to userspace
>> via BHRB buffer")'
>> added a check in bhrb_read() to filter the kernel address from BHRB buffer.
>> Patch here modified
>> it to avoid that check for PowerISA v3.1 based processors, since PowerISA v3.1
>> allows
>> only MSR[PR]=1 address to be written to BHRB buffer.
>> 
>> Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
>> ---
>> arch/powerpc/perf/core-book3s.c       | 27 +++++++++++++++++++++------
>> arch/powerpc/perf/isa207-common.c     | 13 +++++++++++++
>> arch/powerpc/perf/power10-pmu.c       | 13 +++++++++++--
>> arch/powerpc/platforms/powernv/idle.c | 14 ++++++++++++++
> 
> This touches the idle code so we should get those guys on CC (adding Vaidy and
> Ego).
> 
>> 4 files changed, 59 insertions(+), 8 deletions(-)
>> 
>> diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
>> index fad5159..9709606 100644
>> --- a/arch/powerpc/perf/core-book3s.c
>> +++ b/arch/powerpc/perf/core-book3s.c
>> @@ -466,9 +466,13 @@ static void power_pmu_bhrb_read(struct perf_event *event,
>> struct cpu_hw_events *
>> 			 * addresses at this point. Check the privileges before
>> 			 * exporting it to userspace (avoid exposure of regions
>> 			 * where we could have speculative execution)
>> +			 * Incase of ISA 310, BHRB will capture only user-space
>> +			 * address,hence include a check before filtering code
>> 			 */
>> -			if (is_kernel_addr(addr) && perf_allow_kernel(&event-
>>> attr) != 0)
>> -				continue;
>> +			if (!(ppmu->flags & PPMU_ARCH_310S))
>> +				if (is_kernel_addr(addr) &&
>> +				perf_allow_kernel(&event->attr) != 0)
>> +					continue;
>> 
>> 			/* Branches are read most recent first (ie. mfbhrb 0 is
>> 			 * the most recent branch).
>> @@ -1212,7 +1216,7 @@ static void write_mmcr0(struct cpu_hw_events *cpuhw,
>> unsigned long mmcr0)
>> static void power_pmu_disable(struct pmu *pmu)
>> {
>> 	struct cpu_hw_events *cpuhw;
>> -	unsigned long flags, mmcr0, val;
>> +	unsigned long flags, mmcr0, val, mmcra = 0;
>> 
>> 	if (!ppmu)
>> 		return;
>> @@ -1245,12 +1249,23 @@ static void power_pmu_disable(struct pmu *pmu)
>> 		mb();
>> 		isync();
>> 
>> +		val = mmcra = cpuhw->mmcr[2];
>> +
>> 		/*
>> 		 * Disable instruction sampling if it was enabled
>> 		 */
>> -		if (cpuhw->mmcr[2] & MMCRA_SAMPLE_ENABLE) {
>> -			mtspr(SPRN_MMCRA,
>> -			      cpuhw->mmcr[2] & ~MMCRA_SAMPLE_ENABLE);
>> +		if (cpuhw->mmcr[2] & MMCRA_SAMPLE_ENABLE)
>> +			mmcra = cpuhw->mmcr[2] & ~MMCRA_SAMPLE_ENABLE;
>> +
>> +		/* Disable BHRB via mmcra [:26] for p10 if needed */
>> +		if (!(cpuhw->mmcr[2] & MMCRA_BHRB_DISABLE))
>> +			mmcra |= MMCRA_BHRB_DISABLE;
>> +
>> +		/* Write SPRN_MMCRA if mmcra has either disabled
>> +		 * instruction sampling or BHRB
>> +		 */
>> +		if (val != mmcra) {
>> +			mtspr(SPRN_MMCRA, mmcra);
>> 			mb();
>> 			isync();
>> 		}
>> diff --git a/arch/powerpc/perf/isa207-common.c b/arch/powerpc/perf/isa207-
>> common.c
>> index 7d4839e..463d925 100644
>> --- a/arch/powerpc/perf/isa207-common.c
>> +++ b/arch/powerpc/perf/isa207-common.c
>> @@ -404,6 +404,12 @@ int isa207_compute_mmcr(u64 event[], int n_ev,
>> 
>> 	mmcra = mmcr1 = mmcr2 = mmcr3 = 0;
>> 
>> +	/* Disable bhrb unless explicitly requested
>> +	 * by setting MMCRA [:26] bit.
>> +	 */
>> +	if (cpu_has_feature(CPU_FTR_ARCH_31))
>> +		mmcra |= MMCRA_BHRB_DISABLE;
>> +
>> 	/* Second pass: assign PMCs, set all MMCR1 fields */
>> 	for (i = 0; i < n_ev; ++i) {
>> 		pmc     = (event[i] >> EVENT_PMC_SHIFT) & EVENT_PMC_MASK;
>> @@ -475,10 +481,17 @@ int isa207_compute_mmcr(u64 event[], int n_ev,
>> 		}
>> 
>> 		if (event[i] & EVENT_WANTS_BHRB) {
>> +			/* set MMCRA[:26] to 0 for Power10 to enable BHRB */
>> +			if (cpu_has_feature(CPU_FTR_ARCH_31))
>> +				mmcra &= ~MMCRA_BHRB_DISABLE;
>> 			val = (event[i] >> EVENT_IFM_SHIFT) & EVENT_IFM_MASK;
>> 			mmcra |= val << MMCRA_IFM_SHIFT;
>> 		}
>> 
>> +		/* set MMCRA[:26] to 0 if there is user request for BHRB */
>> +		if (cpu_has_feature(CPU_FTR_ARCH_31) &&
>> has_branch_stack(pevents[i]))
>> +			mmcra &= ~MMCRA_BHRB_DISABLE;
>> +
>> 		if (pevents[i]->attr.exclude_user)
>> 			mmcr2 |= MMCR2_FCP(pmc);
>> 
>> diff --git a/arch/powerpc/perf/power10-pmu.c b/arch/powerpc/perf/power10-pmu.c
>> index d64d69d..07fb919 100644
>> --- a/arch/powerpc/perf/power10-pmu.c
>> +++ b/arch/powerpc/perf/power10-pmu.c
>> @@ -82,6 +82,8 @@
>> 
>> /* MMCRA IFM bits - POWER10 */
>> #define POWER10_MMCRA_IFM1		0x0000000040000000UL
>> +#define POWER10_MMCRA_IFM2		0x0000000080000000UL
>> +#define POWER10_MMCRA_IFM3		0x00000000C0000000UL
>> #define POWER10_MMCRA_BHRB_MASK		0x00000000C0000000UL
>> 
>> /* Table of alternatives, sorted by column 0 */
>> @@ -233,8 +235,15 @@ static u64 power10_bhrb_filter_map(u64
>> branch_sample_type)
>> 	if (branch_sample_type & PERF_SAMPLE_BRANCH_ANY_RETURN)
>> 		return -1;
>> 
>> -	if (branch_sample_type & PERF_SAMPLE_BRANCH_IND_CALL)
>> -		return -1;
>> +	if (branch_sample_type & PERF_SAMPLE_BRANCH_IND_CALL) {
>> +		pmu_bhrb_filter |= POWER10_MMCRA_IFM2;
>> +		return pmu_bhrb_filter;
>> +	}
>> +
>> +	if (branch_sample_type & PERF_SAMPLE_BRANCH_COND) {
>> +		pmu_bhrb_filter |= POWER10_MMCRA_IFM3;
>> +		return pmu_bhrb_filter;
>> +	}
>> 
>> 	if (branch_sample_type & PERF_SAMPLE_BRANCH_CALL)
>> 		return -1;
>> diff --git a/arch/powerpc/platforms/powernv/idle.c
>> b/arch/powerpc/platforms/powernv/idle.c
>> index 2dd4673..7db99c7 100644
>> --- a/arch/powerpc/platforms/powernv/idle.c
>> +++ b/arch/powerpc/platforms/powernv/idle.c
>> @@ -611,6 +611,7 @@ static unsigned long power9_idle_stop(unsigned long psscr,
>> bool mmu_on)
>> 	unsigned long srr1;
>> 	unsigned long pls;
>> 	unsigned long mmcr0 = 0;
>> +	unsigned long mmcra_bhrb = 0;
>> 	struct p9_sprs sprs = {}; /* avoid false used-uninitialised */
>> 	bool sprs_saved = false;
>> 
>> @@ -657,6 +658,15 @@ static unsigned long power9_idle_stop(unsigned long
>> psscr, bool mmu_on)
>> 		  */
>> 		mmcr0		= mfspr(SPRN_MMCR0);
>> 	}
>> +
>> +	if (cpu_has_feature(CPU_FTR_ARCH_31)) {
>> +		/* POWER10 uses MMCRA[:26] as BHRB disable bit
>> +		 * to disable BHRB logic when not used. Hence Save and
>> +		 * restore MMCRA after a state-loss idle.
>> +		 */
>> +		mmcra_bhrb		= mfspr(SPRN_MMCRA);
> 
> 
> Why is the bhrb bit of mmcra special here?

This is to save/restore the BHRB disable bit across a state-loss idle state, to make sure
we keep BHRB disabled if it was not enabled by user request at runtime.
> 
>> +	}
>> +
>> 	if ((psscr & PSSCR_RL_MASK) >= pnv_first_spr_loss_level) {
>> 		sprs.lpcr	= mfspr(SPRN_LPCR);
>> 		sprs.hfscr	= mfspr(SPRN_HFSCR);
>> @@ -721,6 +731,10 @@ static unsigned long power9_idle_stop(unsigned long
>> psscr, bool mmu_on)
>> 			mtspr(SPRN_MMCR0, mmcr0);
>> 		}
>> 
>> +		/* Reload MMCRA to restore BHRB disable bit for POWER10 */
>> +		if (cpu_has_feature(CPU_FTR_ARCH_31))
>> +			mtspr(SPRN_MMCRA, mmcra_bhrb);
>> +
>> 		/*
>> 		 * DD2.2 and earlier need to set then clear bit 60 in MMCRA
>> 		 * to ensure the PMU starts running.



^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v2 07/10] powerpc/perf: support BHRB disable bit and new filtering modes
  2020-07-07  7:17   ` Michael Neuling
  2020-07-08  7:41     ` Athira Rajeev
@ 2020-07-08  7:43     ` Gautham R Shenoy
  2020-07-09  2:01       ` Athira Rajeev
  1 sibling, 1 reply; 41+ messages in thread
From: Gautham R Shenoy @ 2020-07-08  7:43 UTC (permalink / raw)
  To: Michael Neuling
  Cc: ego, Athira Rajeev, Vaidyanathan Srinivasan, maddy, linuxppc-dev

On Tue, Jul 07, 2020 at 05:17:55PM +1000, Michael Neuling wrote:
> On Wed, 2020-07-01 at 05:20 -0400, Athira Rajeev wrote:
> > PowerISA v3.1 has few updates for the Branch History Rolling Buffer(BHRB).
> > First is the addition of BHRB disable bit and second new filtering
> > modes for BHRB.
> > 
> > BHRB disable is controlled via Monitor Mode Control Register A (MMCRA)
> > bit 26, namely "BHRB Recording Disable (BHRBRD)". This field controls
> > whether BHRB entries are written when BHRB recording is enabled by other
> > bits. Patch implements support for this BHRB disable bit.
> 
> Probably good to note here that this is backwards compatible. So if you have a
> kernel that doesn't know about this bit, it'll clear it and hence you still get
> BHRB. 
> 
> You should also note why you'd want to do disable this (ie. the core will run
> faster).
> 
> > Secondly PowerISA v3.1 introduce filtering support for
> > PERF_SAMPLE_BRANCH_IND_CALL/COND. The patch adds BHRB filter support
> > for "ind_call" and "cond" in power10_bhrb_filter_map().
> > 
> > 'commit bb19af816025 ("powerpc/perf: Prevent kernel address leak to userspace
> > via BHRB buffer")'
> > added a check in bhrb_read() to filter the kernel address from BHRB buffer.
> > Patch here modified
> > it to avoid that check for PowerISA v3.1 based processors, since PowerISA v3.1
> > allows
> > only MSR[PR]=1 address to be written to BHRB buffer.
> > 
> > Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
> > ---
> >  arch/powerpc/perf/core-book3s.c       | 27 +++++++++++++++++++++------
> >  arch/powerpc/perf/isa207-common.c     | 13 +++++++++++++
> >  arch/powerpc/perf/power10-pmu.c       | 13 +++++++++++--
> >  arch/powerpc/platforms/powernv/idle.c | 14 ++++++++++++++
> 
> This touches the idle code so we should get those guys on CC (adding Vaidy and
> Ego).
> 
> >  4 files changed, 59 insertions(+), 8 deletions(-)
> > 

[..snip..]


> > diff --git a/arch/powerpc/platforms/powernv/idle.c
> > b/arch/powerpc/platforms/powernv/idle.c
> > index 2dd4673..7db99c7 100644
> > --- a/arch/powerpc/platforms/powernv/idle.c
> > +++ b/arch/powerpc/platforms/powernv/idle.c
> > @@ -611,6 +611,7 @@ static unsigned long power9_idle_stop(unsigned long psscr,
> > bool mmu_on)
> >  	unsigned long srr1;
> >  	unsigned long pls;
> >  	unsigned long mmcr0 = 0;
> > +	unsigned long mmcra_bhrb = 0;

We are saving the whole of MMCRA, aren't we? We might want to just
name it mmcra in that case.

> >  	struct p9_sprs sprs = {}; /* avoid false used-uninitialised */
> >  	bool sprs_saved = false;
> >  
> > @@ -657,6 +658,15 @@ static unsigned long power9_idle_stop(unsigned long
> > psscr, bool mmu_on)
> >  		  */
> >  		mmcr0		= mfspr(SPRN_MMCR0);
> >  	}
> > +
> > +	if (cpu_has_feature(CPU_FTR_ARCH_31)) {
> > +		/* POWER10 uses MMCRA[:26] as BHRB disable bit
> > +		 * to disable BHRB logic when not used. Hence Save and
> > +		 * restore MMCRA after a state-loss idle.
> > +		 */

Multi-line comment usually has the first line blank.

		/*
	         * Line 1
		 * Line 2
		 * .
		 * .
		 * .
		 * Line N
		 */

> > +		mmcra_bhrb		= mfspr(SPRN_MMCRA);
> 
> 
> Why is the bhrb bit of mmcra special here?

The comment above could include the consequence of not saving and
restoring MMCRA, i.e.

- If the user hasn't asked for the BHRB to be
  written, the value of MMCRA[BHRBD] = 1.

- On wakeup from stop, MMCRA[BHRBD] will be 0, since MMCRA is not a
  privileged resource and will be lost.

- Thus, if we do not save and restore the MMCRA[BHRBD], the hardware
  will be needlessly writing to the BHRB in the problem mode.
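
Putting the format fix and that extra context together, the comment might
read something like this (a sketch only; assumes the variable is renamed to
mmcra as suggested above):

	if (cpu_has_feature(CPU_FTR_ARCH_31)) {
		/*
		 * POWER10 uses MMCRA[BHRBD] (bit 26) to disable the BHRB
		 * when it is not in use. If the user has not asked for
		 * BHRB samples, the kernel runs with MMCRA[BHRBD] = 1.
		 * MMCRA is lost across a state-loss stop, so without a
		 * save/restore the bit would read back as 0 on wakeup
		 * and the hardware would needlessly write to the BHRB
		 * in problem mode. Hence save and restore MMCRA here.
		 */
		mmcra = mfspr(SPRN_MMCRA);
	}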

> 
> > +	}
> > +
> >  	if ((psscr & PSSCR_RL_MASK) >= pnv_first_spr_loss_level) {
> >  		sprs.lpcr	= mfspr(SPRN_LPCR);
> >  		sprs.hfscr	= mfspr(SPRN_HFSCR);
> > @@ -721,6 +731,10 @@ static unsigned long power9_idle_stop(unsigned long
> > psscr, bool mmu_on)
> >  			mtspr(SPRN_MMCR0, mmcr0);
> >  		}
> >  
> > +		/* Reload MMCRA to restore BHRB disable bit for POWER10 */
> > +		if (cpu_has_feature(CPU_FTR_ARCH_31))
> > +			mtspr(SPRN_MMCRA, mmcra_bhrb);
> > +
> >  		/*
> >  		 * DD2.2 and earlier need to set then clear bit 60 in MMCRA
> >  		 * to ensure the PMU starts running.
> 

--
Thanks and Regards
gautham.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v2 06/10] powerpc/perf: power10 Performance Monitoring support
  2020-07-07  6:50   ` Michael Neuling
@ 2020-07-08 10:56     ` Athira Rajeev
  0 siblings, 0 replies; 41+ messages in thread
From: Athira Rajeev @ 2020-07-08 10:56 UTC (permalink / raw)
  To: Michael Neuling; +Cc: maddy, linuxppc-dev



> On 07-Jul-2020, at 12:20 PM, Michael Neuling <mikey@neuling.org> wrote:
> 
> 
>> @@ -480,6 +520,7 @@ int isa207_compute_mmcr(u64 event[], int n_ev,
>> 	mmcr[1] = mmcr1;
>> 	mmcr[2] = mmcra;
>> 	mmcr[3] = mmcr2;
>> +	mmcr[4] = mmcr3;
> 
> This is fragile like the kvm vcpu case I commented on before but it gets passed
> in via a function parameter?! Can you create a struct to store these in rather
> than this odd ball numbering?

Mikey,
Yes, it gets passed in as the cpuhw->mmcr array.
I will look into these cleanup changes for the kvm vcpu case as well as the cpu_hw_events mmcr array
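
For illustration, the kind of struct Mikey is suggesting could look like the
below (hypothetical name and layout, not what this series currently does):

	/* Hypothetical replacement for the unsigned long mmcr[5] array */
	struct mmcr_regs {
		unsigned long mmcr0;
		unsigned long mmcr1;
		unsigned long mmcr2;
		unsigned long mmcr3;
		unsigned long mmcra;
	};

so that callers write cpuhw->mmcr.mmcr3 rather than remembering that index 4
means MMCR3.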

Thanks
Athira
> 
> The cleanup should start in patch 1/10 here:
> 
>        /*
>         * The order of the MMCR array is:
> -        *  - 64-bit, MMCR0, MMCR1, MMCRA, MMCR2
> +        *  - 64-bit, MMCR0, MMCR1, MMCRA, MMCR2, MMCR3
>         *  - 32-bit, MMCR0, MMCR1, MMCR2
>         */
> -       unsigned long mmcr[4];
> +       unsigned long mmcr[5];
> 
> 
> 
> mikey


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v2 01/10] powerpc/perf: Add support for ISA3.1 PMU SPRs
  2020-07-01  9:20 ` [PATCH v2 01/10] powerpc/perf: Add support for ISA3.1 PMU SPRs Athira Rajeev
@ 2020-07-08 11:02   ` Michael Ellerman
  2020-07-09  1:53     ` Athira Rajeev
  0 siblings, 1 reply; 41+ messages in thread
From: Michael Ellerman @ 2020-07-08 11:02 UTC (permalink / raw)
  To: Athira Rajeev; +Cc: mikey, maddy, linuxppc-dev

Athira Rajeev <atrajeev@linux.vnet.ibm.com> writes:
...
> diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
> index cd6a742..5c64bd3 100644
> --- a/arch/powerpc/perf/core-book3s.c
> +++ b/arch/powerpc/perf/core-book3s.c
> @@ -39,10 +39,10 @@ struct cpu_hw_events {
>  	unsigned int flags[MAX_HWEVENTS];
>  	/*
>  	 * The order of the MMCR array is:
> -	 *  - 64-bit, MMCR0, MMCR1, MMCRA, MMCR2
> +	 *  - 64-bit, MMCR0, MMCR1, MMCRA, MMCR2, MMCR3
>  	 *  - 32-bit, MMCR0, MMCR1, MMCR2
>  	 */
> -	unsigned long mmcr[4];
> +	unsigned long mmcr[5];
>  	struct perf_event *limited_counter[MAX_LIMITED_HWCOUNTERS];
>  	u8  limited_hwidx[MAX_LIMITED_HWCOUNTERS];
>  	u64 alternatives[MAX_HWEVENTS][MAX_EVENT_ALTERNATIVES];
...
> @@ -1310,6 +1326,10 @@ static void power_pmu_enable(struct pmu *pmu)
>  	if (!cpuhw->n_added) {
>  		mtspr(SPRN_MMCRA, cpuhw->mmcr[2] & ~MMCRA_SAMPLE_ENABLE);
>  		mtspr(SPRN_MMCR1, cpuhw->mmcr[1]);
> +#ifdef CONFIG_PPC64
> +		if (ppmu->flags & PPMU_ARCH_310S)
> +			mtspr(SPRN_MMCR3, cpuhw->mmcr[4]);
> +#endif /* CONFIG_PPC64 */
>  		goto out_enable;
>  	}
>  
> @@ -1353,6 +1373,11 @@ static void power_pmu_enable(struct pmu *pmu)
>  	if (ppmu->flags & PPMU_ARCH_207S)
>  		mtspr(SPRN_MMCR2, cpuhw->mmcr[3]);
>  
> +#ifdef CONFIG_PPC64
> +	if (ppmu->flags & PPMU_ARCH_310S)
> +		mtspr(SPRN_MMCR3, cpuhw->mmcr[4]);
> +#endif /* CONFIG_PPC64 */

I don't think you need the #ifdef CONFIG_PPC64?

cheers

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v2 03/10] powerpc/xmon: Add PowerISA v3.1 PMU SPRs
  2020-07-01  9:20 ` [PATCH v2 03/10] powerpc/xmon: Add PowerISA v3.1 PMU SPRs Athira Rajeev
@ 2020-07-08 11:04   ` Michael Ellerman
  2020-07-09  1:57     ` Athira Rajeev
  0 siblings, 1 reply; 41+ messages in thread
From: Michael Ellerman @ 2020-07-08 11:04 UTC (permalink / raw)
  To: Athira Rajeev; +Cc: mikey, maddy, linuxppc-dev

Athira Rajeev <atrajeev@linux.vnet.ibm.com> writes:
> From: Madhavan Srinivasan <maddy@linux.ibm.com>
>
> PowerISA v3.1 added three new performance
> monitoring unit (PMU) special purpose registers (SPRs).
> They are Monitor Mode Control Register 3 (MMCR3),
> Sampled Instruction Event Register 2 (SIER2),
> Sampled Instruction Event Register 3 (SIER3).
>
> Patch here adds a new dump function dump_310_sprs
> to print these SPR values.
>
> Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
> ---
>  arch/powerpc/xmon/xmon.c | 15 +++++++++++++++
>  1 file changed, 15 insertions(+)
>
> diff --git a/arch/powerpc/xmon/xmon.c b/arch/powerpc/xmon/xmon.c
> index 7efe4bc..8917fe8 100644
> --- a/arch/powerpc/xmon/xmon.c
> +++ b/arch/powerpc/xmon/xmon.c
> @@ -2022,6 +2022,20 @@ static void dump_300_sprs(void)
>  #endif
>  }
>  
> +static void dump_310_sprs(void)
> +{
> +#ifdef CONFIG_PPC64
> +	if (!cpu_has_feature(CPU_FTR_ARCH_31))
> +		return;
> +
> +	printf("mmcr3  = %.16lx\n",
> +		mfspr(SPRN_MMCR3));
> +
> +	printf("sier2  = %.16lx  sier3  = %.16lx\n",
> +		mfspr(SPRN_SIER2), mfspr(SPRN_SIER3));

Why not all on one line like many of the others?

cheers

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v2 04/10] powerpc/perf: Add power10_feat to dt_cpu_ftrs
  2020-07-01  9:20 ` [PATCH v2 04/10] powerpc/perf: Add power10_feat to dt_cpu_ftrs Athira Rajeev
  2020-07-07  6:22   ` Michael Neuling
@ 2020-07-08 11:15   ` Michael Ellerman
  2020-07-09 11:07     ` Athira Rajeev
  1 sibling, 1 reply; 41+ messages in thread
From: Michael Ellerman @ 2020-07-08 11:15 UTC (permalink / raw)
  To: Athira Rajeev; +Cc: mikey, maddy, linuxppc-dev

Athira Rajeev <atrajeev@linux.vnet.ibm.com> writes:
> From: Madhavan Srinivasan <maddy@linux.ibm.com>
>
> Add power10 feature function to dt_cpu_ftrs.c along
> with a power10 specific init() to initialize pmu sprs.
>
> Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
> ---
>  arch/powerpc/include/asm/reg.h        |  3 +++
>  arch/powerpc/kernel/cpu_setup_power.S |  7 +++++++
>  arch/powerpc/kernel/dt_cpu_ftrs.c     | 26 ++++++++++++++++++++++++++
>  3 files changed, 36 insertions(+)
>
> diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h
> index 21a1b2d..900ada1 100644
> --- a/arch/powerpc/include/asm/reg.h
> +++ b/arch/powerpc/include/asm/reg.h
> @@ -1068,6 +1068,9 @@
>  #define MMCR0_PMC2_LOADMISSTIME	0x5
>  #endif
>  
> +/* BHRB disable bit for PowerISA v3.10 */
> +#define MMCRA_BHRB_DISABLE	0x0000002000000000
> +
>  /*
>   * SPRG usage:
>   *
> diff --git a/arch/powerpc/kernel/cpu_setup_power.S b/arch/powerpc/kernel/cpu_setup_power.S
> index efdcfa7..e8b3370c 100644
> --- a/arch/powerpc/kernel/cpu_setup_power.S
> +++ b/arch/powerpc/kernel/cpu_setup_power.S
> @@ -233,3 +233,10 @@ __init_PMU_ISA207:
>  	li	r5,0
>  	mtspr	SPRN_MMCRS,r5
>  	blr
> +
> +__init_PMU_ISA31:
> +	li	r5,0
> +	mtspr	SPRN_MMCR3,r5
> +	LOAD_REG_IMMEDIATE(r5, MMCRA_BHRB_DISABLE)
> +	mtspr	SPRN_MMCRA,r5
> +	blr

This doesn't seem like it belongs in this patch. It's not called?

cheers

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v2 07/10] powerpc/perf: support BHRB disable bit and new filtering modes
  2020-07-01  9:20 ` [PATCH v2 07/10] powerpc/perf: support BHRB disable bit and new filtering modes Athira Rajeev
  2020-07-07  7:17   ` Michael Neuling
@ 2020-07-08 11:42   ` Michael Ellerman
  2020-07-09  2:43     ` Athira Rajeev
  1 sibling, 1 reply; 41+ messages in thread
From: Michael Ellerman @ 2020-07-08 11:42 UTC (permalink / raw)
  To: Athira Rajeev; +Cc: mikey, maddy, linuxppc-dev

Athira Rajeev <atrajeev@linux.vnet.ibm.com> writes:

> PowerISA v3.1 has few updates for the Branch History Rolling Buffer(BHRB).
                   ^
                   a
> First is the addition of BHRB disable bit and second new filtering
                                                      ^
                                                      is
> modes for BHRB.
>
> BHRB disable is controlled via Monitor Mode Control Register A (MMCRA)
> bit 26, namely "BHRB Recording Disable (BHRBRD)". This field controls

Most people call that bit 37.

> whether BHRB entries are written when BHRB recording is enabled by other
> bits. Patch implements support for this BHRB disable bit.
       ^
       This

> Secondly PowerISA v3.1 introduce filtering support for

.. that should be in a separate patch please.

> PERF_SAMPLE_BRANCH_IND_CALL/COND. The patch adds BHRB filter support
                                    ^
                                    This
> for "ind_call" and "cond" in power10_bhrb_filter_map().
>
> 'commit bb19af816025 ("powerpc/perf: Prevent kernel address leak to userspace via BHRB buffer")'

That doesn't need single quotes, and should be wrapped at 72 columns
like the rest of the text.

> added a check in bhrb_read() to filter the kernel address from BHRB buffer. Patch here modified
> it to avoid that check for PowerISA v3.1 based processors, since PowerISA v3.1 allows
> only MSR[PR]=1 address to be written to BHRB buffer.

And that should be a separate patch again please.

> Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
> ---
>  arch/powerpc/perf/core-book3s.c       | 27 +++++++++++++++++++++------
>  arch/powerpc/perf/isa207-common.c     | 13 +++++++++++++
>  arch/powerpc/perf/power10-pmu.c       | 13 +++++++++++--
>  arch/powerpc/platforms/powernv/idle.c | 14 ++++++++++++++
>  4 files changed, 59 insertions(+), 8 deletions(-)
>
> diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
> index fad5159..9709606 100644
> --- a/arch/powerpc/perf/core-book3s.c
> +++ b/arch/powerpc/perf/core-book3s.c
> @@ -466,9 +466,13 @@ static void power_pmu_bhrb_read(struct perf_event *event, struct cpu_hw_events *
>  			 * addresses at this point. Check the privileges before
>  			 * exporting it to userspace (avoid exposure of regions
>  			 * where we could have speculative execution)
> +			 * Incase of ISA 310, BHRB will capture only user-space
                           ^
                           In case of ISA v3.1,

> +			 * address,hence include a check before filtering code
                           ^                                                  ^
                           addresses, hence                                   .
>  			 */
> -			if (is_kernel_addr(addr) && perf_allow_kernel(&event->attr) != 0)
> -				continue;
> +			if (!(ppmu->flags & PPMU_ARCH_310S))
> +				if (is_kernel_addr(addr) &&
> +				perf_allow_kernel(&event->attr) != 0)
> +					continue;

The indentation is weird. You should just check all three conditions
with &&.
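
In other words, the combined check could read (a sketch):

			if (!(ppmu->flags & PPMU_ARCH_310S) &&
			    is_kernel_addr(addr) &&
			    perf_allow_kernel(&event->attr) != 0)
				continue;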

>  
>  			/* Branches are read most recent first (ie. mfbhrb 0 is
>  			 * the most recent branch).
> @@ -1212,7 +1216,7 @@ static void write_mmcr0(struct cpu_hw_events *cpuhw, unsigned long mmcr0)
>  static void power_pmu_disable(struct pmu *pmu)
>  {
>  	struct cpu_hw_events *cpuhw;
> -	unsigned long flags, mmcr0, val;
> +	unsigned long flags, mmcr0, val, mmcra = 0;

You initialise it below.

>  	if (!ppmu)
>  		return;
> @@ -1245,12 +1249,23 @@ static void power_pmu_disable(struct pmu *pmu)
>  		mb();
>  		isync();
>  
> +		val = mmcra = cpuhw->mmcr[2];
> +

For mmcr0 (above), val is the variable we mutate and mmcr0 is the
original value. But here you've done the reverse, which is confusing.
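
Following the mmcr0 convention, and folding in the comments below about
reusing the loaded value and always setting the disable bit under a feature
check, the block could read (a sketch, not the final form):

		val = mmcra = cpuhw->mmcr[2];

		/* Disable instruction sampling if it was enabled. */
		if (val & MMCRA_SAMPLE_ENABLE)
			val &= ~MMCRA_SAMPLE_ENABLE;

		/* Disable BHRB via MMCRA[BHRBD] on ISA v3.1. */
		if (ppmu->flags & PPMU_ARCH_310S)
			val |= MMCRA_BHRB_DISABLE;

		/* Write SPRN_MMCRA only if something actually changed. */
		if (val != mmcra) {
			mtspr(SPRN_MMCRA, val);
			mb();
			isync();
		}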

>  		/*
>  		 * Disable instruction sampling if it was enabled
>  		 */
> -		if (cpuhw->mmcr[2] & MMCRA_SAMPLE_ENABLE) {
> -			mtspr(SPRN_MMCRA,
> -			      cpuhw->mmcr[2] & ~MMCRA_SAMPLE_ENABLE);
> +		if (cpuhw->mmcr[2] & MMCRA_SAMPLE_ENABLE)
> +			mmcra = cpuhw->mmcr[2] & ~MMCRA_SAMPLE_ENABLE;

You just loaded cpuhw->mmcr[2] into mmcra, use it rather than referring
back to cpuhw->mmcr[2] over and over.

> +
> +		/* Disable BHRB via mmcra [:26] for p10 if needed */
> +		if (!(cpuhw->mmcr[2] & MMCRA_BHRB_DISABLE))

You don't need to check that it's clear AFAICS. Just always set disable
and the check against val below will catch the nop case.

> +			mmcra |= MMCRA_BHRB_DISABLE;
> +
> +		/* Write SPRN_MMCRA if mmcra has either disabled

Comment format is wrong.

> +		 * instruction sampling or BHRB

Full stop please.

> +		 */
> +		if (val != mmcra) {
> +			mtspr(SPRN_MMCRA, mmcra);
>  			mb();
>  			isync();
>  		}
> diff --git a/arch/powerpc/perf/isa207-common.c b/arch/powerpc/perf/isa207-common.c
> index 7d4839e..463d925 100644
> --- a/arch/powerpc/perf/isa207-common.c
> +++ b/arch/powerpc/perf/isa207-common.c
> @@ -404,6 +404,12 @@ int isa207_compute_mmcr(u64 event[], int n_ev,
>  
>  	mmcra = mmcr1 = mmcr2 = mmcr3 = 0;
>  
> +	/* Disable bhrb unless explicitly requested
> +	 * by setting MMCRA [:26] bit.
> +	 */

Comment format again.

> +	if (cpu_has_feature(CPU_FTR_ARCH_31))
> +		mmcra |= MMCRA_BHRB_DISABLE;

Here we do a feature check before setting MMCRA_BHRB_DISABLE, but you
didn't above?

> +
>  	/* Second pass: assign PMCs, set all MMCR1 fields */
>  	for (i = 0; i < n_ev; ++i) {
>  		pmc     = (event[i] >> EVENT_PMC_SHIFT) & EVENT_PMC_MASK;
> @@ -475,10 +481,17 @@ int isa207_compute_mmcr(u64 event[], int n_ev,
>  		}
>  
>  		if (event[i] & EVENT_WANTS_BHRB) {
> +			/* set MMCRA[:26] to 0 for Power10 to enable BHRB */

"set MMCRA[:26] to 0" == "clear MMCRA[:26]"

> +			if (cpu_has_feature(CPU_FTR_ARCH_31))
> +				mmcra &= ~MMCRA_BHRB_DISABLE;

Newline please.

>  			val = (event[i] >> EVENT_IFM_SHIFT) & EVENT_IFM_MASK;
>  			mmcra |= val << MMCRA_IFM_SHIFT;
>  		}
>  
> +		/* set MMCRA[:26] to 0 if there is user request for BHRB */
> +		if (cpu_has_feature(CPU_FTR_ARCH_31) && has_branch_stack(pevents[i]))
> +			mmcra &= ~MMCRA_BHRB_DISABLE;
> +

I think it would be cleaner if you did a single test, eg:

		if (cpu_has_feature(CPU_FTR_ARCH_31) &&
                   (has_branch_stack(pevents[i]) || (event[i] & EVENT_WANTS_BHRB)))
			mmcra &= ~MMCRA_BHRB_DISABLE;

>  		if (pevents[i]->attr.exclude_user)
>  			mmcr2 |= MMCR2_FCP(pmc);
>  
> diff --git a/arch/powerpc/perf/power10-pmu.c b/arch/powerpc/perf/power10-pmu.c
> index d64d69d..07fb919 100644
> --- a/arch/powerpc/perf/power10-pmu.c
> +++ b/arch/powerpc/perf/power10-pmu.c
> @@ -82,6 +82,8 @@
>  
>  /* MMCRA IFM bits - POWER10 */
>  #define POWER10_MMCRA_IFM1		0x0000000040000000UL
> +#define POWER10_MMCRA_IFM2		0x0000000080000000UL
> +#define POWER10_MMCRA_IFM3		0x00000000C0000000UL
>  #define POWER10_MMCRA_BHRB_MASK	0x00000000C0000000UL
>  
>  /* Table of alternatives, sorted by column 0 */
> @@ -233,8 +235,15 @@ static u64 power10_bhrb_filter_map(u64 branch_sample_type)
>  	if (branch_sample_type & PERF_SAMPLE_BRANCH_ANY_RETURN)
>  		return -1;
>  
> -	if (branch_sample_type & PERF_SAMPLE_BRANCH_IND_CALL)
> -		return -1;
> +	if (branch_sample_type & PERF_SAMPLE_BRANCH_IND_CALL) {
> +		pmu_bhrb_filter |= POWER10_MMCRA_IFM2;
> +		return pmu_bhrb_filter;
> +	}
> +
> +	if (branch_sample_type & PERF_SAMPLE_BRANCH_COND) {
> +		pmu_bhrb_filter |= POWER10_MMCRA_IFM3;
> +		return pmu_bhrb_filter;
> +	}
>  
>  	if (branch_sample_type & PERF_SAMPLE_BRANCH_CALL)
>  		return -1;
> diff --git a/arch/powerpc/platforms/powernv/idle.c b/arch/powerpc/platforms/powernv/idle.c
> index 2dd4673..7db99c7 100644
> --- a/arch/powerpc/platforms/powernv/idle.c
> +++ b/arch/powerpc/platforms/powernv/idle.c
> @@ -611,6 +611,7 @@ static unsigned long power9_idle_stop(unsigned long psscr, bool mmu_on)
>  	unsigned long srr1;
>  	unsigned long pls;
>  	unsigned long mmcr0 = 0;
> +	unsigned long mmcra_bhrb = 0;
>  	struct p9_sprs sprs = {}; /* avoid false used-uninitialised */
>  	bool sprs_saved = false;
>  
> @@ -657,6 +658,15 @@ static unsigned long power9_idle_stop(unsigned long psscr, bool mmu_on)
>  		  */
>  		mmcr0		= mfspr(SPRN_MMCR0);
>  	}
> +
> +	if (cpu_has_feature(CPU_FTR_ARCH_31)) {
> +		/* POWER10 uses MMCRA[:26] as BHRB disable bit

Comment format.

> +		 * to disable BHRB logic when not used. Hence Save and
> +		 * restore MMCRA after a state-loss idle.
> +		 */
> +		mmcra_bhrb		= mfspr(SPRN_MMCRA);
> +	}

If it's the whole mmcra, it should be called mmcra?

> +
>  	if ((psscr & PSSCR_RL_MASK) >= pnv_first_spr_loss_level) {
>  		sprs.lpcr	= mfspr(SPRN_LPCR);
>  		sprs.hfscr	= mfspr(SPRN_HFSCR);
> @@ -721,6 +731,10 @@ static unsigned long power9_idle_stop(unsigned long psscr, bool mmu_on)
>  			mtspr(SPRN_MMCR0, mmcr0);
>  		}
>  
> +		/* Reload MMCRA to restore BHRB disable bit for POWER10 */
> +		if (cpu_has_feature(CPU_FTR_ARCH_31))
> +			mtspr(SPRN_MMCRA, mmcra_bhrb);
> +
>  		/*
>  		 * DD2.2 and earlier need to set then clear bit 60 in MMCRA
>  		 * to ensure the PMU starts running.


cheers

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v2 10/10] powerpc/perf: Add extended regs support for power10 platform
  2020-07-01  9:21 ` [PATCH v2 10/10] powerpc/perf: Add extended regs support for power10 platform Athira Rajeev
  2020-07-02  9:40   ` kernel test robot
@ 2020-07-08 12:04   ` Michael Ellerman
  2020-07-09  6:29     ` Athira Rajeev
  1 sibling, 1 reply; 41+ messages in thread
From: Michael Ellerman @ 2020-07-08 12:04 UTC (permalink / raw)
  To: Athira Rajeev; +Cc: mikey, maddy, linuxppc-dev

Athira Rajeev <atrajeev@linux.vnet.ibm.com> writes:
> Include capability flag `PERF_PMU_CAP_EXTENDED_REGS` for power10
> and expose MMCR3, SIER2, SIER3 registers as part of extended regs.
> Also introduce `PERF_REG_PMU_MASK_31` to define extended mask
> value at runtime for power10
>
> Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
> ---
>  arch/powerpc/include/uapi/asm/perf_regs.h       |  6 ++++++
>  arch/powerpc/perf/perf_regs.c                   | 10 +++++++++-
>  arch/powerpc/perf/power10-pmu.c                 |  6 ++++++
>  tools/arch/powerpc/include/uapi/asm/perf_regs.h |  6 ++++++
>  tools/perf/arch/powerpc/include/perf_regs.h     |  3 +++
>  tools/perf/arch/powerpc/util/perf_regs.c        |  6 ++++++

Please split into a kernel patch and a tools patch. And cc the tools people.

>  6 files changed, 36 insertions(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/include/uapi/asm/perf_regs.h b/arch/powerpc/include/uapi/asm/perf_regs.h
> index 485b1d5..020b51c 100644
> --- a/arch/powerpc/include/uapi/asm/perf_regs.h
> +++ b/arch/powerpc/include/uapi/asm/perf_regs.h
> @@ -52,6 +52,9 @@ enum perf_event_powerpc_regs {
>  	PERF_REG_POWERPC_MMCR0,
>  	PERF_REG_POWERPC_MMCR1,
>  	PERF_REG_POWERPC_MMCR2,
> +	PERF_REG_POWERPC_MMCR3,
> +	PERF_REG_POWERPC_SIER2,
> +	PERF_REG_POWERPC_SIER3,
>  	/* Max regs without the extended regs */
>  	PERF_REG_POWERPC_MAX = PERF_REG_POWERPC_MMCRA + 1,
>  };
> @@ -62,4 +65,7 @@ enum perf_event_powerpc_regs {
>  #define PERF_REG_PMU_MASK_300   (((1ULL << (PERF_REG_POWERPC_MMCR2 + 1)) - 1) \
>  				- PERF_REG_PMU_MASK)
>  
> +/* PERF_REG_EXTENDED_MASK value for CPU_FTR_ARCH_31 */
> +#define PERF_REG_PMU_MASK_31	(((1ULL << (PERF_REG_POWERPC_SIER3 + 1)) - 1) \
> +				- PERF_REG_PMU_MASK)

Wrapping that provides no benefit, just let it be long.

>  #endif /* _UAPI_ASM_POWERPC_PERF_REGS_H */
> diff --git a/arch/powerpc/perf/perf_regs.c b/arch/powerpc/perf/perf_regs.c
> index c8a7e8c..c969935 100644
> --- a/arch/powerpc/perf/perf_regs.c
> +++ b/arch/powerpc/perf/perf_regs.c
> @@ -81,6 +81,12 @@ static u64 get_ext_regs_value(int idx)
>  		return mfspr(SPRN_MMCR1);
>  	case PERF_REG_POWERPC_MMCR2:
>  		return mfspr(SPRN_MMCR2);
> +	case PERF_REG_POWERPC_MMCR3:
> +			return mfspr(SPRN_MMCR3);
> +	case PERF_REG_POWERPC_SIER2:
> +			return mfspr(SPRN_SIER2);
> +	case PERF_REG_POWERPC_SIER3:
> +			return mfspr(SPRN_SIER3);

Indentation is wrong.

>  	default: return 0;
>  	}
>  }
> @@ -89,7 +95,9 @@ u64 perf_reg_value(struct pt_regs *regs, int idx)
>  {
>  	u64 PERF_REG_EXTENDED_MAX;
>  
> -	if (cpu_has_feature(CPU_FTR_ARCH_300))
> +	if (cpu_has_feature(CPU_FTR_ARCH_31))
> +		PERF_REG_EXTENDED_MAX = PERF_REG_POWERPC_SIER3 + 1;

There's no way to know if that's correct other than going back to the
header to look at the list of values.

So instead you should define it in the header, next to the other values,
with a meaningful name, like PERF_REG_MAX_ISA_31 or something.
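
For example (hypothetical macro names, following the suggestion above):

/* in arch/powerpc/include/uapi/asm/perf_regs.h, next to the enum */
#define PERF_REG_MAX_ISA_300	(PERF_REG_POWERPC_MMCR2 + 1)
#define PERF_REG_MAX_ISA_31	(PERF_REG_POWERPC_SIER3 + 1)

so that perf_reg_value() can assign PERF_REG_EXTENDED_MAX from these without
the reader having to cross-check the enum ordering in the header.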

> +	else if (cpu_has_feature(CPU_FTR_ARCH_300))
>  		PERF_REG_EXTENDED_MAX = PERF_REG_POWERPC_MMCR2 + 1;

Same.

>  	if (idx == PERF_REG_POWERPC_SIER &&
> diff --git a/arch/powerpc/perf/power10-pmu.c b/arch/powerpc/perf/power10-pmu.c
> index 07fb919..51082d6 100644
> --- a/arch/powerpc/perf/power10-pmu.c
> +++ b/arch/powerpc/perf/power10-pmu.c
> @@ -86,6 +86,8 @@
>  #define POWER10_MMCRA_IFM3		0x00000000C0000000UL
>  #define POWER10_MMCRA_BHRB_MASK		0x00000000C0000000UL
>  
> +extern u64 mask_var;

Why is it extern? Also not a good name for a global.

Hang on, it's not even used? Is there some macro magic somewhere?

>  /* Table of alternatives, sorted by column 0 */
>  static const unsigned int power10_event_alternatives[][MAX_ALT] = {
>  	{ PM_RUN_CYC_ALT,		PM_RUN_CYC },
> @@ -397,6 +399,7 @@ static void power10_config_bhrb(u64 pmu_bhrb_filter)
>  	.cache_events		= &power10_cache_events,
>  	.attr_groups		= power10_pmu_attr_groups,
>  	.bhrb_nr		= 32,
> +	.capabilities           = PERF_PMU_CAP_EXTENDED_REGS,
>  };
>  
>  int init_power10_pmu(void)
> @@ -408,6 +411,9 @@ int init_power10_pmu(void)
>  	    strcmp(cur_cpu_spec->oprofile_cpu_type, "ppc64/power10"))
>  		return -ENODEV;
>  
> +	/* Set the PERF_REG_EXTENDED_MASK here */
> +	mask_var = PERF_REG_PMU_MASK_31;
> +
>  	rc = register_power_pmu(&power10_pmu);
>  	if (rc)
>  		return rc;


cheers

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v2 09/10] tools/perf: Add perf tools support for extended register capability in powerpc
  2020-07-01  9:21 ` [PATCH v2 09/10] tools/perf: Add perf tools support for extended register capability in powerpc Athira Rajeev
@ 2020-07-08 12:04   ` Michael Ellerman
  2020-07-09  3:10     ` Athira Rajeev
  2020-07-13  2:36     ` Athira Rajeev
  0 siblings, 2 replies; 41+ messages in thread
From: Michael Ellerman @ 2020-07-08 12:04 UTC (permalink / raw)
  To: Athira Rajeev; +Cc: mikey, maddy, linuxppc-dev

Athira Rajeev <atrajeev@linux.vnet.ibm.com> writes:
> From: Anju T Sudhakar <anju@linux.vnet.ibm.com>
>
> Add extended regs to sample_reg_mask in the tool side to use
> with `-I?` option. Perf tools side uses extended mask to display
> the platform supported register names (with -I? option) to the user
> and also send this mask to the kernel to capture the extended registers
> in each sample. Hence decide the mask value based on the processor
> version.
>
> Signed-off-by: Anju T Sudhakar <anju@linux.vnet.ibm.com>
> [Decide extended mask at run time based on platform]
> Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
> Reviewed-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>

Will need an ack from perf tools folks, who are not on Cc by the looks.

> diff --git a/tools/arch/powerpc/include/uapi/asm/perf_regs.h b/tools/arch/powerpc/include/uapi/asm/perf_regs.h
> index f599064..485b1d5 100644
> --- a/tools/arch/powerpc/include/uapi/asm/perf_regs.h
> +++ b/tools/arch/powerpc/include/uapi/asm/perf_regs.h
> @@ -48,6 +48,18 @@ enum perf_event_powerpc_regs {
>  	PERF_REG_POWERPC_DSISR,
>  	PERF_REG_POWERPC_SIER,
>  	PERF_REG_POWERPC_MMCRA,
> -	PERF_REG_POWERPC_MAX,
> +	/* Extended registers */
> +	PERF_REG_POWERPC_MMCR0,
> +	PERF_REG_POWERPC_MMCR1,
> +	PERF_REG_POWERPC_MMCR2,
> +	/* Max regs without the extended regs */
> +	PERF_REG_POWERPC_MAX = PERF_REG_POWERPC_MMCRA + 1,

I don't really understand this idea of a max that's not the max.

>  };
> +
> +#define PERF_REG_PMU_MASK	((1ULL << PERF_REG_POWERPC_MAX) - 1)
> +
> +/* PERF_REG_EXTENDED_MASK value for CPU_FTR_ARCH_300 */
> +#define PERF_REG_PMU_MASK_300   (((1ULL << (PERF_REG_POWERPC_MMCR2 + 1)) - 1) \
> +				- PERF_REG_PMU_MASK)
> +
>  #endif /* _UAPI_ASM_POWERPC_PERF_REGS_H */
> diff --git a/tools/perf/arch/powerpc/include/perf_regs.h b/tools/perf/arch/powerpc/include/perf_regs.h
> index e18a355..46ed00d 100644
> --- a/tools/perf/arch/powerpc/include/perf_regs.h
> +++ b/tools/perf/arch/powerpc/include/perf_regs.h
> @@ -64,7 +64,10 @@
>  	[PERF_REG_POWERPC_DAR] = "dar",
>  	[PERF_REG_POWERPC_DSISR] = "dsisr",
>  	[PERF_REG_POWERPC_SIER] = "sier",
> -	[PERF_REG_POWERPC_MMCRA] = "mmcra"
> +	[PERF_REG_POWERPC_MMCRA] = "mmcra",
> +	[PERF_REG_POWERPC_MMCR0] = "mmcr0",
> +	[PERF_REG_POWERPC_MMCR1] = "mmcr1",
> +	[PERF_REG_POWERPC_MMCR2] = "mmcr2",
>  };
>  
>  static inline const char *perf_reg_name(int id)
> diff --git a/tools/perf/arch/powerpc/util/perf_regs.c b/tools/perf/arch/powerpc/util/perf_regs.c
> index 0a52429..9179230 100644
> --- a/tools/perf/arch/powerpc/util/perf_regs.c
> +++ b/tools/perf/arch/powerpc/util/perf_regs.c
> @@ -6,9 +6,14 @@
>  
>  #include "../../../util/perf_regs.h"
>  #include "../../../util/debug.h"
> +#include "../../../util/event.h"
> +#include "../../../util/header.h"
> +#include "../../../perf-sys.h"
>  
>  #include <linux/kernel.h>
>  
> +#define PVR_POWER9		0x004E
> +
>  const struct sample_reg sample_reg_masks[] = {
>  	SMPL_REG(r0, PERF_REG_POWERPC_R0),
>  	SMPL_REG(r1, PERF_REG_POWERPC_R1),
> @@ -55,6 +60,9 @@
>  	SMPL_REG(dsisr, PERF_REG_POWERPC_DSISR),
>  	SMPL_REG(sier, PERF_REG_POWERPC_SIER),
>  	SMPL_REG(mmcra, PERF_REG_POWERPC_MMCRA),
> +	SMPL_REG(mmcr0, PERF_REG_POWERPC_MMCR0),
> +	SMPL_REG(mmcr1, PERF_REG_POWERPC_MMCR1),
> +	SMPL_REG(mmcr2, PERF_REG_POWERPC_MMCR2),
>  	SMPL_REG_END
>  };
>  
> @@ -163,3 +171,50 @@ int arch_sdt_arg_parse_op(char *old_op, char **new_op)
>  
>  	return SDT_ARG_VALID;
>  }
> +
> +uint64_t arch__intr_reg_mask(void)
> +{
> +	struct perf_event_attr attr = {
> +		.type                   = PERF_TYPE_HARDWARE,
> +		.config                 = PERF_COUNT_HW_CPU_CYCLES,
> +		.sample_type            = PERF_SAMPLE_REGS_INTR,
> +		.precise_ip             = 1,
> +		.disabled               = 1,
> +		.exclude_kernel         = 1,
> +	};
> +	int fd, ret;
> +	char buffer[64];
> +	u32 version;
> +	u64 extended_mask = 0;
> +
> +	/* Get the PVR value to set the extended
> +	 * mask specific to platform

Comment format is wrong, and punctuation please.

> +	 */
> +	get_cpuid(buffer, sizeof(buffer));
> +	ret = sscanf(buffer, "%u,", &version);

This is powerpc specific code, why not just use mfspr(SPRN_PVR), rather
than redirecting via printf/sscanf.
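
For example, tools/perf already carries mfspr-style helpers for powerpc in
tools/perf/arch/powerpc/util/header.c, along these lines (a sketch; assumes
those definitions are made visible to perf_regs.c):

	#define mfspr(rn)	({ unsigned long rval; \
				   asm volatile("mfspr %0," __stringify(rn) \
						: "=r" (rval)); rval; })
	#define SPRN_PVR	0x11F	/* Processor Version Register */
	#define PVR_VER(pvr)	(((pvr) >> 16) & 0xFFFF)	/* Version field */

	version = PVR_VER(mfspr(SPRN_PVR));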

> +
> +	if (ret != 1) {
> +		pr_debug("Failed to get the processor version, unable to output extended registers\n");
> +		return PERF_REGS_MASK;
> +	}
> +
> +	if (version == PVR_POWER9)
> +		extended_mask = PERF_REG_PMU_MASK_300;
> +	else
> +		return PERF_REGS_MASK;
> +
> +	attr.sample_regs_intr = extended_mask;
> +	attr.sample_period = 1;
> +	event_attr_init(&attr);
> +
> +	/*
> +	 * check if the pmu supports perf extended regs, before
> +	 * returning the register mask to sample.
> +	 */
> +	fd = sys_perf_event_open(&attr, 0, -1, -1, 0);
> +	if (fd != -1) {
> +		close(fd);
> +		return (extended_mask | PERF_REGS_MASK);
> +	}
> +	return PERF_REGS_MASK;

I think this would read a bit better like:

	mask = PERF_REGS_MASK;

	if (version == PVR_POWER9)
		extended_mask = PERF_REG_PMU_MASK_300;
        else
        	return mask;

        attr.sample_regs_intr = extended_mask;
        attr.sample_period = 1;
        event_attr_init(&attr);

        /*
          * check if the pmu supports perf extended regs, before
          * returning the register mask to sample.
          */
        fd = sys_perf_event_open(&attr, 0, -1, -1, 0);
        if (fd != -1) {
                close(fd);
                mask |= extended_mask;
        }

	return mask;


cheers

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v2 01/10] powerpc/perf: Add support for ISA3.1 PMU SPRs
  2020-07-08 11:02   ` Michael Ellerman
@ 2020-07-09  1:53     ` Athira Rajeev
  2020-07-13 12:50       ` Michael Ellerman
  0 siblings, 1 reply; 41+ messages in thread
From: Athira Rajeev @ 2020-07-09  1:53 UTC (permalink / raw)
  To: Michael Ellerman; +Cc: Michael Neuling, maddy, linuxppc-dev



> On 08-Jul-2020, at 4:32 PM, Michael Ellerman <mpe@ellerman.id.au> wrote:
> 
> Athira Rajeev <atrajeev@linux.vnet.ibm.com> writes:
> ...
>> diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
>> index cd6a742..5c64bd3 100644
>> --- a/arch/powerpc/perf/core-book3s.c
>> +++ b/arch/powerpc/perf/core-book3s.c
>> @@ -39,10 +39,10 @@ struct cpu_hw_events {
>> 	unsigned int flags[MAX_HWEVENTS];
>> 	/*
>> 	 * The order of the MMCR array is:
>> -	 *  - 64-bit, MMCR0, MMCR1, MMCRA, MMCR2
>> +	 *  - 64-bit, MMCR0, MMCR1, MMCRA, MMCR2, MMCR3
>> 	 *  - 32-bit, MMCR0, MMCR1, MMCR2
>> 	 */
>> -	unsigned long mmcr[4];
>> +	unsigned long mmcr[5];
>> 	struct perf_event *limited_counter[MAX_LIMITED_HWCOUNTERS];
>> 	u8  limited_hwidx[MAX_LIMITED_HWCOUNTERS];
>> 	u64 alternatives[MAX_HWEVENTS][MAX_EVENT_ALTERNATIVES];
> ...
>> @@ -1310,6 +1326,10 @@ static void power_pmu_enable(struct pmu *pmu)
>> 	if (!cpuhw->n_added) {
>> 		mtspr(SPRN_MMCRA, cpuhw->mmcr[2] & ~MMCRA_SAMPLE_ENABLE);
>> 		mtspr(SPRN_MMCR1, cpuhw->mmcr[1]);
>> +#ifdef CONFIG_PPC64
>> +		if (ppmu->flags & PPMU_ARCH_310S)
>> +			mtspr(SPRN_MMCR3, cpuhw->mmcr[4]);
>> +#endif /* CONFIG_PPC64 */
>> 		goto out_enable;
>> 	}
>> 
>> @@ -1353,6 +1373,11 @@ static void power_pmu_enable(struct pmu *pmu)
>> 	if (ppmu->flags & PPMU_ARCH_207S)
>> 		mtspr(SPRN_MMCR2, cpuhw->mmcr[3]);
>> 
>> +#ifdef CONFIG_PPC64
>> +	if (ppmu->flags & PPMU_ARCH_310S)
>> +		mtspr(SPRN_MMCR3, cpuhw->mmcr[4]);
>> +#endif /* CONFIG_PPC64 */
> 
> I don't think you need the #ifdef CONFIG_PPC64?

Hi Michael

Thanks for reviewing this series.

SPRN_MMCR3 is not defined for PPC32, and we hit a build failure with pmac32_defconfig.
The #ifdef CONFIG_PPC64 is there to address this.

Thanks
Athira


> 
> cheers


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v2 03/10] powerpc/xmon: Add PowerISA v3.1 PMU SPRs
  2020-07-08 11:04   ` Michael Ellerman
@ 2020-07-09  1:57     ` Athira Rajeev
  0 siblings, 0 replies; 41+ messages in thread
From: Athira Rajeev @ 2020-07-09  1:57 UTC (permalink / raw)
  To: Michael Ellerman; +Cc: Michael Neuling, maddy, linuxppc-dev




> On 08-Jul-2020, at 4:34 PM, Michael Ellerman <mpe@ellerman.id.au> wrote:
> 
> Athira Rajeev <atrajeev@linux.vnet.ibm.com <mailto:atrajeev@linux.vnet.ibm.com>> writes:
>> From: Madhavan Srinivasan <maddy@linux.ibm.com>
>> 
>> PowerISA v3.1 added three new performance
>> monitoring unit (PMU) special purpose registers (SPRs).
>> They are Monitor Mode Control Register 3 (MMCR3),
>> Sampled Instruction Event Register 2 (SIER2),
>> Sampled Instruction Event Register 3 (SIER3).
>> 
>> Patch here adds a new dump function dump_310_sprs
>> to print these SPR values.
>> 
>> Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
>> ---
>> arch/powerpc/xmon/xmon.c | 15 +++++++++++++++
>> 1 file changed, 15 insertions(+)
>> 
>> diff --git a/arch/powerpc/xmon/xmon.c b/arch/powerpc/xmon/xmon.c
>> index 7efe4bc..8917fe8 100644
>> --- a/arch/powerpc/xmon/xmon.c
>> +++ b/arch/powerpc/xmon/xmon.c
>> @@ -2022,6 +2022,20 @@ static void dump_300_sprs(void)
>> #endif
>> }
>> 
>> +static void dump_310_sprs(void)
>> +{
>> +#ifdef CONFIG_PPC64
>> +	if (!cpu_has_feature(CPU_FTR_ARCH_31))
>> +		return;
>> +
>> +	printf("mmcr3  = %.16lx\n",
>> +		mfspr(SPRN_MMCR3));
>> +
>> +	printf("sier2  = %.16lx  sier3  = %.16lx\n",
>> +		mfspr(SPRN_SIER2), mfspr(SPRN_SIER3));
> 
> Why not all on one line like many of the others?

Sure, I will change this to one line.

Thanks
Athira
> 
> cheers



^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v2 07/10] powerpc/perf: support BHRB disable bit and new filtering modes
  2020-07-08  7:43     ` Gautham R Shenoy
@ 2020-07-09  2:01       ` Athira Rajeev
  0 siblings, 0 replies; 41+ messages in thread
From: Athira Rajeev @ 2020-07-09  2:01 UTC (permalink / raw)
  To: ego; +Cc: Vaidyanathan Srinivasan, Michael Neuling, maddy, linuxppc-dev




> On 08-Jul-2020, at 1:13 PM, Gautham R Shenoy <ego@linux.vnet.ibm.com> wrote:
> 
> On Tue, Jul 07, 2020 at 05:17:55PM +1000, Michael Neuling wrote:
>> On Wed, 2020-07-01 at 05:20 -0400, Athira Rajeev wrote:
>>> PowerISA v3.1 has few updates for the Branch History Rolling Buffer(BHRB).
>>> First is the addition of BHRB disable bit and second new filtering
>>> modes for BHRB.
>>> 
>>> BHRB disable is controlled via Monitor Mode Control Register A (MMCRA)
>>> bit 26, namely "BHRB Recording Disable (BHRBRD)". This field controls
>>> whether BHRB entries are written when BHRB recording is enabled by other
>>> bits. Patch implements support for this BHRB disable bit.
>> 
>> Probably good to note here that this is backwards compatible. So if you have a
>> kernel that doesn't know about this bit, it'll clear it and hence you still get
>> BHRB. 
>> 
>> You should also note why you'd want to do disable this (ie. the core will run
>> faster).
>> 
>>> Secondly PowerISA v3.1 introduce filtering support for
>>> PERF_SAMPLE_BRANCH_IND_CALL/COND. The patch adds BHRB filter support
>>> for "ind_call" and "cond" in power10_bhrb_filter_map().
>>> 
>>> 'commit bb19af816025 ("powerpc/perf: Prevent kernel address leak to userspace
>>> via BHRB buffer")'
>>> added a check in bhrb_read() to filter the kernel address from BHRB buffer.
>>> Patch here modified
>>> it to avoid that check for PowerISA v3.1 based processors, since PowerISA v3.1
>>> allows
>>> only MSR[PR]=1 address to be written to BHRB buffer.
>>> 
>>> Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
>>> ---
>>> arch/powerpc/perf/core-book3s.c       | 27 +++++++++++++++++++++------
>>> arch/powerpc/perf/isa207-common.c     | 13 +++++++++++++
>>> arch/powerpc/perf/power10-pmu.c       | 13 +++++++++++--
>>> arch/powerpc/platforms/powernv/idle.c | 14 ++++++++++++++
>> 
>> This touches the idle code so we should get those guys on CC (adding Vaidy and
>> Ego).
>> 
>>> 4 files changed, 59 insertions(+), 8 deletions(-)
>>> 
> 
> [..snip..]
> 
> 
>>> diff --git a/arch/powerpc/platforms/powernv/idle.c
>>> b/arch/powerpc/platforms/powernv/idle.c
>>> index 2dd4673..7db99c7 100644
>>> --- a/arch/powerpc/platforms/powernv/idle.c
>>> +++ b/arch/powerpc/platforms/powernv/idle.c
>>> @@ -611,6 +611,7 @@ static unsigned long power9_idle_stop(unsigned long psscr,
>>> bool mmu_on)
>>> 	unsigned long srr1;
>>> 	unsigned long pls;
>>> 	unsigned long mmcr0 = 0;
>>> +	unsigned long mmcra_bhrb = 0;
> 
> We are saving the whole of MMCRA, aren't we? We might want to just
> name it mmcra in that case.
> 
>>> 	struct p9_sprs sprs = {}; /* avoid false used-uninitialised */
>>> 	bool sprs_saved = false;
>>> 
>>> @@ -657,6 +658,15 @@ static unsigned long power9_idle_stop(unsigned long
>>> psscr, bool mmu_on)
>>> 		  */
>>> 		mmcr0		= mfspr(SPRN_MMCR0);
>>> 	}
>>> +
>>> +	if (cpu_has_feature(CPU_FTR_ARCH_31)) {
>>> +		/* POWER10 uses MMCRA[:26] as BHRB disable bit
>>> +		 * to disable BHRB logic when not used. Hence Save and
>>> +		 * restore MMCRA after a state-loss idle.
>>> +		 */
> 
> Multi-line comment usually has the first line blank.

Hi Gautham

Thanks for checking. I will change the comment format.
Yes, we are saving the whole of MMCRA.
> 
> 		/*
> 	         * Line 1
> 		 * Line 2
> 		 * .
> 		 * .
> 		 * .
> 		 * Line N
> 		 */
> 
>>> +		mmcra_bhrb		= mfspr(SPRN_MMCRA);
>> 
>> 
>> Why is the bhrb bit of mmcra special here?
> 
> The comment above could include the consequence of not saving and
> restoring MMCRA, i.e.
> 
> - If the user hasn't asked for the BHRB to be
>  written, the value of MMCRA[BHRBD] = 1.
> 
> - On wakeup from stop, MMCRA[BHRBD] will be 0, since MMCRA is not a
>  privileged resource and will be lost.
> 
> - Thus, if we do not save and restore the MMCRA[BHRBD], the hardware
>  will be needlessly writing to the BHRB in the problem mode.
> 
>> 
>>> +	}
>>> +
>>> 	if ((psscr & PSSCR_RL_MASK) >= pnv_first_spr_loss_level) {
>>> 		sprs.lpcr	= mfspr(SPRN_LPCR);
>>> 		sprs.hfscr	= mfspr(SPRN_HFSCR);
>>> @@ -721,6 +731,10 @@ static unsigned long power9_idle_stop(unsigned long
>>> psscr, bool mmu_on)
>>> 			mtspr(SPRN_MMCR0, mmcr0);
>>> 		}
>>> 
>>> +		/* Reload MMCRA to restore BHRB disable bit for POWER10 */
>>> +		if (cpu_has_feature(CPU_FTR_ARCH_31))
>>> +			mtspr(SPRN_MMCRA, mmcra_bhrb);
>>> +
>>> 		/*
>>> 		 * DD2.2 and earlier need to set then clear bit 60 in MMCRA
>>> 		 * to ensure the PMU starts running.
>> 
> 
> --
> Thanks and Regards
> gautham.



^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v2 07/10] powerpc/perf: support BHRB disable bit and new filtering modes
  2020-07-08 11:42   ` Michael Ellerman
@ 2020-07-09  2:43     ` Athira Rajeev
  0 siblings, 0 replies; 41+ messages in thread
From: Athira Rajeev @ 2020-07-09  2:43 UTC (permalink / raw)
  To: Michael Ellerman; +Cc: Michael Neuling, maddy, linuxppc-dev




> On 08-Jul-2020, at 5:12 PM, Michael Ellerman <mpe@ellerman.id.au> wrote:
> 
> Athira Rajeev <atrajeev@linux.vnet.ibm.com <mailto:atrajeev@linux.vnet.ibm.com>> writes:
> 
>> PowerISA v3.1 has few updates for the Branch History Rolling Buffer(BHRB).
>                   ^
>                   a
>> First is the addition of BHRB disable bit and second new filtering
>                                                      ^
>                                                      is
>> modes for BHRB.
>> 
>> BHRB disable is controlled via Monitor Mode Control Register A (MMCRA)
>> bit 26, namely "BHRB Recording Disable (BHRBRD)". This field controls
> 
> Most people call that bit 37.
> 
>> whether BHRB entries are written when BHRB recording is enabled by other
>> bits. Patch implements support for this BHRB disable bit.
>       ^
>       This
> 
>> Secondly PowerISA v3.1 introduce filtering support for
> 
> .. that should be in a separate patch please.
> 
>> PERF_SAMPLE_BRANCH_IND_CALL/COND. The patch adds BHRB filter support
>                                    ^
>                                    This
>> for "ind_call" and "cond" in power10_bhrb_filter_map().
>> 
>> 'commit bb19af816025 ("powerpc/perf: Prevent kernel address leak to userspace via BHRB buffer")'
> 
> That doesn't need single quotes, and should be wrapped at 72 columns
> like the rest of the text.
> 
>> added a check in bhrb_read() to filter the kernel address from BHRB buffer. Patch here modified
>> it to avoid that check for PowerISA v3.1 based processors, since PowerISA v3.1 allows
>> only MSR[PR]=1 address to be written to BHRB buffer.
> 
> And that should be a separate patch again please.

Sure, I will split these into separate patches.

> 
>> Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
>> ---
>> arch/powerpc/perf/core-book3s.c       | 27 +++++++++++++++++++++------
>> arch/powerpc/perf/isa207-common.c     | 13 +++++++++++++
>> arch/powerpc/perf/power10-pmu.c       | 13 +++++++++++--
>> arch/powerpc/platforms/powernv/idle.c | 14 ++++++++++++++
>> 4 files changed, 59 insertions(+), 8 deletions(-)
>> 
>> diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
>> index fad5159..9709606 100644
>> --- a/arch/powerpc/perf/core-book3s.c
>> +++ b/arch/powerpc/perf/core-book3s.c
>> @@ -466,9 +466,13 @@ static void power_pmu_bhrb_read(struct perf_event *event, struct cpu_hw_events *
>> 			 * addresses at this point. Check the privileges before
>> 			 * exporting it to userspace (avoid exposure of regions
>> 			 * where we could have speculative execution)
>> +			 * Incase of ISA 310, BHRB will capture only user-space
>                           ^
>                           In case of ISA v3.1,

Ok, 
> 
>> +			 * address,hence include a check before filtering code
>                           ^                                                  ^
>                           addresses, hence                                   .
>> 			 */
>> -			if (is_kernel_addr(addr) && perf_allow_kernel(&event->attr) != 0)
>> -				continue;
>> +			if (!(ppmu->flags & PPMU_ARCH_310S))
>> +				if (is_kernel_addr(addr) &&
>> +				perf_allow_kernel(&event->attr) != 0)
>> +					continue;
> 
> The indentation is weird. You should just check all three conditions
> with &&.

Ok, I will correct this.
> 
>> 
>> 			/* Branches are read most recent first (ie. mfbhrb 0 is
>> 			 * the most recent branch).
>> @@ -1212,7 +1216,7 @@ static void write_mmcr0(struct cpu_hw_events *cpuhw, unsigned long mmcr0)
>> static void power_pmu_disable(struct pmu *pmu)
>> {
>> 	struct cpu_hw_events *cpuhw;
>> -	unsigned long flags, mmcr0, val;
>> +	unsigned long flags, mmcr0, val, mmcra = 0;
> 
> You initialise it below.
> 
>> 	if (!ppmu)
>> 		return;
>> @@ -1245,12 +1249,23 @@ static void power_pmu_disable(struct pmu *pmu)
>> 		mb();
>> 		isync();
>> 
>> +		val = mmcra = cpuhw->mmcr[2];
>> +
> 
> For mmcr0 (above), val is the variable we mutate and mmcr0 is the
> original value. But here you've done the reverse, which is confusing.

Yes, I am altering mmcra here and using val as the original value. I should have done it the other way around.

> 
>> 		/*
>> 		 * Disable instruction sampling if it was enabled
>> 		 */
>> -		if (cpuhw->mmcr[2] & MMCRA_SAMPLE_ENABLE) {
>> -			mtspr(SPRN_MMCRA,
>> -			      cpuhw->mmcr[2] & ~MMCRA_SAMPLE_ENABLE);
>> +		if (cpuhw->mmcr[2] & MMCRA_SAMPLE_ENABLE)
>> +			mmcra = cpuhw->mmcr[2] & ~MMCRA_SAMPLE_ENABLE;
> 
> You just loaded cpuhw->mmcr[2] into mmcra, use it rather than referring
> back to cpuhw->mmcr[2] over and over.
> 

Ok,
>> +
>> +		/* Disable BHRB via mmcra [:26] for p10 if needed */
>> +		if (!(cpuhw->mmcr[2] & MMCRA_BHRB_DISABLE))
> 
> You don't need to check that it's clear AFAICS. Just always set disable
> and the check against val below will catch the nop case.

My thought here was to avoid writing to MMCRA (and the mb() and isync()) if it's not needed.
But as you suggested, since I am comparing against the original value before writing, I may not need this check.
And I missed the feature check here. I will correct it.
 
> 
>> +			mmcra |= MMCRA_BHRB_DISABLE;
>> +
>> +		/* Write SPRN_MMCRA if mmcra has either disabled
> 
> Comment format is wrong.
> 
>> +		 * instruction sampling or BHRB
> 
> Full stop please.

Sure
> 
>> +		 */
>> +		if (val != mmcra) {
>> +			mtspr(SPRN_MMCRA, mmcra);
>> 			mb();
>> 			isync();
>> 		}
>> diff --git a/arch/powerpc/perf/isa207-common.c b/arch/powerpc/perf/isa207-common.c
>> index 7d4839e..463d925 100644
>> --- a/arch/powerpc/perf/isa207-common.c
>> +++ b/arch/powerpc/perf/isa207-common.c
>> @@ -404,6 +404,12 @@ int isa207_compute_mmcr(u64 event[], int n_ev,
>> 
>> 	mmcra = mmcr1 = mmcr2 = mmcr3 = 0;
>> 
>> +	/* Disable bhrb unless explicitly requested
>> +	 * by setting MMCRA [:26] bit.
>> +	 */
> 
> Comment format again.
> 
>> +	if (cpu_has_feature(CPU_FTR_ARCH_31))
>> +		mmcra |= MMCRA_BHRB_DISABLE;
> 
> Here we do a feature check before setting MMCRA_BHRB_DISABLE, but you
> didn't above?
> 
>> +
>> 	/* Second pass: assign PMCs, set all MMCR1 fields */
>> 	for (i = 0; i < n_ev; ++i) {
>> 		pmc     = (event[i] >> EVENT_PMC_SHIFT) & EVENT_PMC_MASK;
>> @@ -475,10 +481,17 @@ int isa207_compute_mmcr(u64 event[], int n_ev,
>> 		}
>> 
>> 		if (event[i] & EVENT_WANTS_BHRB) {
>> +			/* set MMCRA[:26] to 0 for Power10 to enable BHRB */
> 
> "set MMCRA[:26] to 0" == "clear MMCRA[:26]”
> 
Ok

>> +			if (cpu_has_feature(CPU_FTR_ARCH_31))
>> +				mmcra &= ~MMCRA_BHRB_DISABLE;
> 
> Newline please.
> 
>> 			val = (event[i] >> EVENT_IFM_SHIFT) & EVENT_IFM_MASK;
>> 			mmcra |= val << MMCRA_IFM_SHIFT;
>> 		}
>> 
>> +		/* set MMCRA[:26] to 0 if there is user request for BHRB */
>> +		if (cpu_has_feature(CPU_FTR_ARCH_31) && has_branch_stack(pevents[i]))
>> +			mmcra &= ~MMCRA_BHRB_DISABLE;
>> +
> 
> I think it would be cleaner if you did a single test, eg:
> 
> 		if (cpu_has_feature(CPU_FTR_ARCH_31) &&
>                   (has_branch_stack(pevents[i]) || (event[i] & EVENT_WANTS_BHRB)))
> 			mmcra &= ~MMCRA_BHRB_DISABLE;

Sure Michael

Thanks for the review. I will address all these changes in the next version.

Thanks
Athira 
> 
>> 		if (pevents[i]->attr.exclude_user)
>> 			mmcr2 |= MMCR2_FCP(pmc);
>> 
>> diff --git a/arch/powerpc/perf/power10-pmu.c b/arch/powerpc/perf/power10-pmu.c
>> index d64d69d..07fb919 100644
>> --- a/arch/powerpc/perf/power10-pmu.c
>> +++ b/arch/powerpc/perf/power10-pmu.c
>> @@ -82,6 +82,8 @@
>> 
>> /* MMCRA IFM bits - POWER10 */
>> #define POWER10_MMCRA_IFM1		0x0000000040000000UL
>> +#define POWER10_MMCRA_IFM2		0x0000000080000000UL
>> +#define POWER10_MMCRA_IFM3		0x00000000C0000000UL
>> #define POWER10_MMCRA_BHRB_MASK	0x00000000C0000000UL
>> 
>> /* Table of alternatives, sorted by column 0 */
>> @@ -233,8 +235,15 @@ static u64 power10_bhrb_filter_map(u64 branch_sample_type)
>> 	if (branch_sample_type & PERF_SAMPLE_BRANCH_ANY_RETURN)
>> 		return -1;
>> 
>> -	if (branch_sample_type & PERF_SAMPLE_BRANCH_IND_CALL)
>> -		return -1;
>> +	if (branch_sample_type & PERF_SAMPLE_BRANCH_IND_CALL) {
>> +		pmu_bhrb_filter |= POWER10_MMCRA_IFM2;
>> +		return pmu_bhrb_filter;
>> +	}
>> +
>> +	if (branch_sample_type & PERF_SAMPLE_BRANCH_COND) {
>> +		pmu_bhrb_filter |= POWER10_MMCRA_IFM3;
>> +		return pmu_bhrb_filter;
>> +	}
>> 
>> 	if (branch_sample_type & PERF_SAMPLE_BRANCH_CALL)
>> 		return -1;
>> diff --git a/arch/powerpc/platforms/powernv/idle.c b/arch/powerpc/platforms/powernv/idle.c
>> index 2dd4673..7db99c7 100644
>> --- a/arch/powerpc/platforms/powernv/idle.c
>> +++ b/arch/powerpc/platforms/powernv/idle.c
>> @@ -611,6 +611,7 @@ static unsigned long power9_idle_stop(unsigned long psscr, bool mmu_on)
>> 	unsigned long srr1;
>> 	unsigned long pls;
>> 	unsigned long mmcr0 = 0;
>> +	unsigned long mmcra_bhrb = 0;
>> 	struct p9_sprs sprs = {}; /* avoid false used-uninitialised */
>> 	bool sprs_saved = false;
>> 
>> @@ -657,6 +658,15 @@ static unsigned long power9_idle_stop(unsigned long psscr, bool mmu_on)
>> 		  */
>> 		mmcr0		= mfspr(SPRN_MMCR0);
>> 	}
>> +
>> +	if (cpu_has_feature(CPU_FTR_ARCH_31)) {
>> +		/* POWER10 uses MMCRA[:26] as BHRB disable bit
> 
> Comment format.
> 
>> +		 * to disable BHRB logic when not used. Hence Save and
>> +		 * restore MMCRA after a state-loss idle.
>> +		 */
>> +		mmcra_bhrb		= mfspr(SPRN_MMCRA);
>> +	}
> 
> It's the whole mmcra it should be called mmcra?

Yes, we are saving the whole mmcra. 
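
So in v3 the save/restore can simply be (sketch):

	unsigned long mmcra = 0;
	...
	/* Save the whole MMCRA, it holds the BHRB disable bit on POWER10. */
	if (cpu_has_feature(CPU_FTR_ARCH_31))
		mmcra = mfspr(SPRN_MMCRA);
	...
	/* Restore MMCRA after the state-loss idle. */
	if (cpu_has_feature(CPU_FTR_ARCH_31))
		mtspr(SPRN_MMCRA, mmcra);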
> 
>> +
>> 	if ((psscr & PSSCR_RL_MASK) >= pnv_first_spr_loss_level) {
>> 		sprs.lpcr	= mfspr(SPRN_LPCR);
>> 		sprs.hfscr	= mfspr(SPRN_HFSCR);
>> @@ -721,6 +731,10 @@ static unsigned long power9_idle_stop(unsigned long psscr, bool mmu_on)
>> 			mtspr(SPRN_MMCR0, mmcr0);
>> 		}
>> 
>> +		/* Reload MMCRA to restore BHRB disable bit for POWER10 */
>> +		if (cpu_has_feature(CPU_FTR_ARCH_31))
>> +			mtspr(SPRN_MMCRA, mmcra_bhrb);
>> +
>> 		/*
>> 		 * DD2.2 and earlier need to set then clear bit 60 in MMCRA
>> 		 * to ensure the PMU starts running.
> 
> 
> cheers



* Re: [PATCH v2 09/10] tools/perf: Add perf tools support for extended register capability in powerpc
  2020-07-08 12:04   ` Michael Ellerman
@ 2020-07-09  3:10     ` Athira Rajeev
  2020-07-13 12:47       ` Michael Ellerman
  2020-07-13  2:36     ` Athira Rajeev
  1 sibling, 1 reply; 41+ messages in thread
From: Athira Rajeev @ 2020-07-09  3:10 UTC (permalink / raw)
  To: Michael Ellerman; +Cc: mikey, maddy, linuxppc-dev




> On 08-Jul-2020, at 5:34 PM, Michael Ellerman <mpe@ellerman.id.au> wrote:
> 
> Athira Rajeev <atrajeev@linux.vnet.ibm.com> writes:
>> From: Anju T Sudhakar <anju@linux.vnet.ibm.com>
>> 
>> Add extended regs to sample_reg_mask in the tool side to use
>> with `-I?` option. Perf tools side uses extended mask to display
>> the platform supported register names (with -I? option) to the user
>> and also send this mask to the kernel to capture the extended registers
>> in each sample. Hence decide the mask value based on the processor
>> version.
>> 
>> Signed-off-by: Anju T Sudhakar <anju@linux.vnet.ibm.com>
>> [Decide extended mask at run time based on platform]
>> Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
>> Reviewed-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
> 
> Will need an ack from perf tools folks, who are not on Cc by the looks.
> 

Yes, my bad. Will make sure to add the proper Cc list.

>> diff --git a/tools/arch/powerpc/include/uapi/asm/perf_regs.h b/tools/arch/powerpc/include/uapi/asm/perf_regs.h
>> index f599064..485b1d5 100644
>> --- a/tools/arch/powerpc/include/uapi/asm/perf_regs.h
>> +++ b/tools/arch/powerpc/include/uapi/asm/perf_regs.h
>> @@ -48,6 +48,18 @@ enum perf_event_powerpc_regs {
>> 	PERF_REG_POWERPC_DSISR,
>> 	PERF_REG_POWERPC_SIER,
>> 	PERF_REG_POWERPC_MMCRA,
>> -	PERF_REG_POWERPC_MAX,
>> +	/* Extended registers */
>> +	PERF_REG_POWERPC_MMCR0,
>> +	PERF_REG_POWERPC_MMCR1,
>> +	PERF_REG_POWERPC_MMCR2,
>> +	/* Max regs without the extended regs */
>> +	PERF_REG_POWERPC_MAX = PERF_REG_POWERPC_MMCRA + 1,
> 
> I don't really understand this idea of a max that's not the max.
> 
>> };
>> +
>> +#define PERF_REG_PMU_MASK	((1ULL << PERF_REG_POWERPC_MAX) - 1)
>> +
>> +/* PERF_REG_EXTENDED_MASK value for CPU_FTR_ARCH_300 */
>> +#define PERF_REG_PMU_MASK_300   (((1ULL << (PERF_REG_POWERPC_MMCR2 + 1)) - 1) \
>> +				- PERF_REG_PMU_MASK)
>> +
>> #endif /* _UAPI_ASM_POWERPC_PERF_REGS_H */
>> diff --git a/tools/perf/arch/powerpc/include/perf_regs.h b/tools/perf/arch/powerpc/include/perf_regs.h
>> index e18a355..46ed00d 100644
>> --- a/tools/perf/arch/powerpc/include/perf_regs.h
>> +++ b/tools/perf/arch/powerpc/include/perf_regs.h
>> @@ -64,7 +64,10 @@
>> 	[PERF_REG_POWERPC_DAR] = "dar",
>> 	[PERF_REG_POWERPC_DSISR] = "dsisr",
>> 	[PERF_REG_POWERPC_SIER] = "sier",
>> -	[PERF_REG_POWERPC_MMCRA] = "mmcra"
>> +	[PERF_REG_POWERPC_MMCRA] = "mmcra",
>> +	[PERF_REG_POWERPC_MMCR0] = "mmcr0",
>> +	[PERF_REG_POWERPC_MMCR1] = "mmcr1",
>> +	[PERF_REG_POWERPC_MMCR2] = "mmcr2",
>> };
>> 
>> static inline const char *perf_reg_name(int id)
>> diff --git a/tools/perf/arch/powerpc/util/perf_regs.c b/tools/perf/arch/powerpc/util/perf_regs.c
>> index 0a52429..9179230 100644
>> --- a/tools/perf/arch/powerpc/util/perf_regs.c
>> +++ b/tools/perf/arch/powerpc/util/perf_regs.c
>> @@ -6,9 +6,14 @@
>> 
>> #include "../../../util/perf_regs.h"
>> #include "../../../util/debug.h"
>> +#include "../../../util/event.h"
>> +#include "../../../util/header.h"
>> +#include "../../../perf-sys.h"
>> 
>> #include <linux/kernel.h>
>> 
>> +#define PVR_POWER9		0x004E
>> +
>> const struct sample_reg sample_reg_masks[] = {
>> 	SMPL_REG(r0, PERF_REG_POWERPC_R0),
>> 	SMPL_REG(r1, PERF_REG_POWERPC_R1),
>> @@ -55,6 +60,9 @@
>> 	SMPL_REG(dsisr, PERF_REG_POWERPC_DSISR),
>> 	SMPL_REG(sier, PERF_REG_POWERPC_SIER),
>> 	SMPL_REG(mmcra, PERF_REG_POWERPC_MMCRA),
>> +	SMPL_REG(mmcr0, PERF_REG_POWERPC_MMCR0),
>> +	SMPL_REG(mmcr1, PERF_REG_POWERPC_MMCR1),
>> +	SMPL_REG(mmcr2, PERF_REG_POWERPC_MMCR2),
>> 	SMPL_REG_END
>> };
>> 
>> @@ -163,3 +171,50 @@ int arch_sdt_arg_parse_op(char *old_op, char **new_op)
>> 
>> 	return SDT_ARG_VALID;
>> }
>> +
>> +uint64_t arch__intr_reg_mask(void)
>> +{
>> +	struct perf_event_attr attr = {
>> +		.type                   = PERF_TYPE_HARDWARE,
>> +		.config                 = PERF_COUNT_HW_CPU_CYCLES,
>> +		.sample_type            = PERF_SAMPLE_REGS_INTR,
>> +		.precise_ip             = 1,
>> +		.disabled               = 1,
>> +		.exclude_kernel         = 1,
>> +	};
>> +	int fd, ret;
>> +	char buffer[64];
>> +	u32 version;
>> +	u64 extended_mask = 0;
>> +
>> +	/* Get the PVR value to set the extended
>> +	 * mask specific to platform
> 
> Comment format is wrong, and punctuation please.
> 
>> +	 */
>> +	get_cpuid(buffer, sizeof(buffer));
>> +	ret = sscanf(buffer, "%u,", &version);
> 
> This is powerpc specific code, why not just use mfspr(SPRN_PVR), rather
> than redirecting via printf/sscanf.

Hi Michael

For perf tools, the defines for `mfspr` and `SPRN_PVR` are in arch/powerpc/util/header.c,
so I have re-used the existing utility. Otherwise, we would need to include these defines here as well.
Does that sound good?
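
For comparison, the direct version would be roughly (untested sketch,
assuming the mfspr()/SPRN_PVR/PVR_VER defines from header.c become
visible here):

	version = PVR_VER(mfspr(SPRN_PVR));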

> 
>> +
>> +	if (ret != 1) {
>> +		pr_debug("Failed to get the processor version, unable to output extended registers\n");
>> +		return PERF_REGS_MASK;
>> +	}
>> +
>> +	if (version == PVR_POWER9)
>> +		extended_mask = PERF_REG_PMU_MASK_300;
>> +	else
>> +		return PERF_REGS_MASK;
>> +
>> +	attr.sample_regs_intr = extended_mask;
>> +	attr.sample_period = 1;
>> +	event_attr_init(&attr);
>> +
>> +	/*
>> +	 * check if the pmu supports perf extended regs, before
>> +	 * returning the register mask to sample.
>> +	 */
>> +	fd = sys_perf_event_open(&attr, 0, -1, -1, 0);
>> +	if (fd != -1) {
>> +		close(fd);
>> +		return (extended_mask | PERF_REGS_MASK);
>> +	}
>> +	return PERF_REGS_MASK;
> 
> I think this would read a bit better like:
> 
> 	mask = PERF_REGS_MASK;
> 
> 	if (version == PVR_POWER9)
> 		extended_mask = PERF_REG_PMU_MASK_300;
>        else
>        	return mask;
> 
>        attr.sample_regs_intr = extended_mask;
>        attr.sample_period = 1;
>        event_attr_init(&attr);
> 
>        /*
>          * check if the pmu supports perf extended regs, before
>          * returning the register mask to sample.
>          */
>        fd = sys_perf_event_open(&attr, 0, -1, -1, 0);
>        if (fd != -1) {
>                close(fd);
>                mask |= extended_mask;
>        }
> 
> 	return mask;

Sure, I will try this change.

Thanks
Athira
> 
> 
> cheers



* Re: [PATCH v2 10/10] powerpc/perf: Add extended regs support for power10 platform
  2020-07-08 12:04   ` Michael Ellerman
@ 2020-07-09  6:29     ` Athira Rajeev
  0 siblings, 0 replies; 41+ messages in thread
From: Athira Rajeev @ 2020-07-09  6:29 UTC (permalink / raw)
  To: Michael Ellerman; +Cc: Michael Neuling, maddy, linuxppc-dev




> On 08-Jul-2020, at 5:34 PM, Michael Ellerman <mpe@ellerman.id.au> wrote:
> 
> Athira Rajeev <atrajeev@linux.vnet.ibm.com> writes:
>> Include capability flag `PERF_PMU_CAP_EXTENDED_REGS` for power10
>> and expose MMCR3, SIER2, SIER3 registers as part of extended regs.
>> Also introduce `PERF_REG_PMU_MASK_31` to define extended mask
>> value at runtime for power10
>> 
>> Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
>> ---
>> arch/powerpc/include/uapi/asm/perf_regs.h       |  6 ++++++
>> arch/powerpc/perf/perf_regs.c                   | 10 +++++++++-
>> arch/powerpc/perf/power10-pmu.c                 |  6 ++++++
>> tools/arch/powerpc/include/uapi/asm/perf_regs.h |  6 ++++++
>> tools/perf/arch/powerpc/include/perf_regs.h     |  3 +++
>> tools/perf/arch/powerpc/util/perf_regs.c        |  6 ++++++
> 
> Please split into a kernel patch and a tools patch. And cc the tools people.

Ok sure
> 
>> 6 files changed, 36 insertions(+), 1 deletion(-)
>> 
>> diff --git a/arch/powerpc/include/uapi/asm/perf_regs.h b/arch/powerpc/include/uapi/asm/perf_regs.h
>> index 485b1d5..020b51c 100644
>> --- a/arch/powerpc/include/uapi/asm/perf_regs.h
>> +++ b/arch/powerpc/include/uapi/asm/perf_regs.h
>> @@ -52,6 +52,9 @@ enum perf_event_powerpc_regs {
>> 	PERF_REG_POWERPC_MMCR0,
>> 	PERF_REG_POWERPC_MMCR1,
>> 	PERF_REG_POWERPC_MMCR2,
>> +	PERF_REG_POWERPC_MMCR3,
>> +	PERF_REG_POWERPC_SIER2,
>> +	PERF_REG_POWERPC_SIER3,
>> 	/* Max regs without the extended regs */
>> 	PERF_REG_POWERPC_MAX = PERF_REG_POWERPC_MMCRA + 1,
>> };
>> @@ -62,4 +65,7 @@ enum perf_event_powerpc_regs {
>> #define PERF_REG_PMU_MASK_300   (((1ULL << (PERF_REG_POWERPC_MMCR2 + 1)) - 1) \
>> 				- PERF_REG_PMU_MASK)
>> 
>> +/* PERF_REG_EXTENDED_MASK value for CPU_FTR_ARCH_31 */
>> +#define PERF_REG_PMU_MASK_31	(((1ULL << (PERF_REG_POWERPC_SIER3 + 1)) - 1) \
>> +				- PERF_REG_PMU_MASK)
> 
> Wrapping that provides no benefit, just let it be long.
> 

Ok,
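
i.e. just:

	#define PERF_REG_PMU_MASK_31	(((1ULL << (PERF_REG_POWERPC_SIER3 + 1)) - 1) - PERF_REG_PMU_MASK)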

>> #endif /* _UAPI_ASM_POWERPC_PERF_REGS_H */
>> diff --git a/arch/powerpc/perf/perf_regs.c b/arch/powerpc/perf/perf_regs.c
>> index c8a7e8c..c969935 100644
>> --- a/arch/powerpc/perf/perf_regs.c
>> +++ b/arch/powerpc/perf/perf_regs.c
>> @@ -81,6 +81,12 @@ static u64 get_ext_regs_value(int idx)
>> 		return mfspr(SPRN_MMCR1);
>> 	case PERF_REG_POWERPC_MMCR2:
>> 		return mfspr(SPRN_MMCR2);
>> +	case PERF_REG_POWERPC_MMCR3:
>> +			return mfspr(SPRN_MMCR3);
>> +	case PERF_REG_POWERPC_SIER2:
>> +			return mfspr(SPRN_SIER2);
>> +	case PERF_REG_POWERPC_SIER3:
>> +			return mfspr(SPRN_SIER3);
> 
> Indentation is wrong.
> 
>> 	default: return 0;
>> 	}
>> }
>> @@ -89,7 +95,9 @@ u64 perf_reg_value(struct pt_regs *regs, int idx)
>> {
>> 	u64 PERF_REG_EXTENDED_MAX;
>> 
>> -	if (cpu_has_feature(CPU_FTR_ARCH_300))
>> +	if (cpu_has_feature(CPU_FTR_ARCH_31))
>> +		PERF_REG_EXTENDED_MAX = PERF_REG_POWERPC_SIER3 + 1;
> 
> There's no way to know if that's correct other than going back to the
> header to look at the list of values.
> 
> So instead you should define it in the header, next to the other values,
> with a meaningful name, like PERF_REG_MAX_ISA_31 or something.
> 
>> +	else if (cpu_has_feature(CPU_FTR_ARCH_300))
>> 		PERF_REG_EXTENDED_MAX = PERF_REG_POWERPC_MMCR2 + 1;
> 
> Same.
> 

Ok, will make this change
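
Something like this then (sketch, using the naming you suggested):

	/* in the uapi header, next to the register list */
	#define PERF_REG_MAX_ISA_300	(PERF_REG_POWERPC_MMCR2 + 1)
	#define PERF_REG_MAX_ISA_31	(PERF_REG_POWERPC_SIER3 + 1)

	/* in perf_reg_value() */
	if (cpu_has_feature(CPU_FTR_ARCH_31))
		PERF_REG_EXTENDED_MAX = PERF_REG_MAX_ISA_31;
	else if (cpu_has_feature(CPU_FTR_ARCH_300))
		PERF_REG_EXTENDED_MAX = PERF_REG_MAX_ISA_300;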

>> 	if (idx == PERF_REG_POWERPC_SIER &&
>> diff --git a/arch/powerpc/perf/power10-pmu.c b/arch/powerpc/perf/power10-pmu.c
>> index 07fb919..51082d6 100644
>> --- a/arch/powerpc/perf/power10-pmu.c
>> +++ b/arch/powerpc/perf/power10-pmu.c
>> @@ -86,6 +86,8 @@
>> #define POWER10_MMCRA_IFM3		0x00000000C0000000UL
>> #define POWER10_MMCRA_BHRB_MASK		0x00000000C0000000UL
>> 
>> +extern u64 mask_var;
> 
> Why is it extern? Also not a good name for a global.
> 
> Hang on, it's not even used? Is there some macro magic somewhere?

This is defined in patch 8 "powerpc/perf: Add support for outputting extended regs in perf intr_regs",
which adds the base support for extended regs in powerpc. The current patch covers the changes to support
it for power10.

`mask_var` is used to define `PERF_REG_EXTENDED_MASK` at runtime.
`PERF_REG_EXTENDED_MASK` basically contains the mask value of the supported extended registers.
Since the supported registers may differ between processor versions, we define this mask at runtime.

The #define is done in arch/powerpc/include/asm/perf_event_server.h (in patch 8).
In the PMU driver init, we set the respective mask value (in the code below). Hence it is extern.
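
For reference, the relevant part of patch 8 is roughly (sketch):

	/* arch/powerpc/include/asm/perf_event_server.h */
	extern u64 mask_var;
	#define PERF_REG_EXTENDED_MASK	mask_var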

Sorry for the confusion here. 

Thanks
Athira

> 
>> /* Table of alternatives, sorted by column 0 */
>> static const unsigned int power10_event_alternatives[][MAX_ALT] = {
>> 	{ PM_RUN_CYC_ALT,		PM_RUN_CYC },
>> @@ -397,6 +399,7 @@ static void power10_config_bhrb(u64 pmu_bhrb_filter)
>> 	.cache_events		= &power10_cache_events,
>> 	.attr_groups		= power10_pmu_attr_groups,
>> 	.bhrb_nr		= 32,
>> +	.capabilities           = PERF_PMU_CAP_EXTENDED_REGS,
>> };
>> 
>> int init_power10_pmu(void)
>> @@ -408,6 +411,9 @@ int init_power10_pmu(void)
>> 	    strcmp(cur_cpu_spec->oprofile_cpu_type, "ppc64/power10"))
>> 		return -ENODEV;
>> 
>> +	/* Set the PERF_REG_EXTENDED_MASK here */
>> +	mask_var = PERF_REG_PMU_MASK_31;
>> +
>> 	rc = register_power_pmu(&power10_pmu);
>> 	if (rc)
>> 		return rc;
> 
> 
> cheers



* Re: [PATCH v2 04/10] powerpc/perf: Add power10_feat to dt_cpu_ftrs
  2020-07-08 11:15   ` Michael Ellerman
@ 2020-07-09 11:07     ` Athira Rajeev
  0 siblings, 0 replies; 41+ messages in thread
From: Athira Rajeev @ 2020-07-09 11:07 UTC (permalink / raw)
  To: Michael Ellerman; +Cc: Michael Neuling, maddy, linuxppc-dev




> On 08-Jul-2020, at 4:45 PM, Michael Ellerman <mpe@ellerman.id.au> wrote:
> 
> Athira Rajeev <atrajeev@linux.vnet.ibm.com> writes:
>> From: Madhavan Srinivasan <maddy@linux.ibm.com>
>> 
>> Add power10 feature function to dt_cpu_ftrs.c along
>> with a power10 specific init() to initialize pmu sprs.
>> 
>> Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
>> ---
>> arch/powerpc/include/asm/reg.h        |  3 +++
>> arch/powerpc/kernel/cpu_setup_power.S |  7 +++++++
>> arch/powerpc/kernel/dt_cpu_ftrs.c     | 26 ++++++++++++++++++++++++++
>> 3 files changed, 36 insertions(+)
>> 
>> diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h
>> index 21a1b2d..900ada1 100644
>> --- a/arch/powerpc/include/asm/reg.h
>> +++ b/arch/powerpc/include/asm/reg.h
>> @@ -1068,6 +1068,9 @@
>> #define MMCR0_PMC2_LOADMISSTIME	0x5
>> #endif
>> 
>> +/* BHRB disable bit for PowerISA v3.10 */
>> +#define MMCRA_BHRB_DISABLE	0x0000002000000000
>> +
>> /*
>>  * SPRG usage:
>>  *
>> diff --git a/arch/powerpc/kernel/cpu_setup_power.S b/arch/powerpc/kernel/cpu_setup_power.S
>> index efdcfa7..e8b3370c 100644
>> --- a/arch/powerpc/kernel/cpu_setup_power.S
>> +++ b/arch/powerpc/kernel/cpu_setup_power.S
>> @@ -233,3 +233,10 @@ __init_PMU_ISA207:
>> 	li	r5,0
>> 	mtspr	SPRN_MMCRS,r5
>> 	blr
>> +
>> +__init_PMU_ISA31:
>> +	li	r5,0
>> +	mtspr	SPRN_MMCR3,r5
>> +	LOAD_REG_IMMEDIATE(r5, MMCRA_BHRB_DISABLE)
>> +	mtspr	SPRN_MMCRA,r5
>> +	blr
> 
> This doesn't seem like it belongs in this patch. It's not called?

Yes, you are right, this needs to be called from `__setup_cpu_power10`.
Since we didn't have the setup part for power10 in the tree initially, I missed it.
I will include this update in v3.
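
i.e. something like (untested sketch):

	/* in __setup_cpu_power10, alongside the other PMU init calls */
	bl	__init_PMU_ISA31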

Thanks
Athira
> 
> cheers



* Re: [PATCH v2 09/10] tools/perf: Add perf tools support for extended register capability in powerpc
  2020-07-08 12:04   ` Michael Ellerman
  2020-07-09  3:10     ` Athira Rajeev
@ 2020-07-13  2:36     ` Athira Rajeev
  1 sibling, 0 replies; 41+ messages in thread
From: Athira Rajeev @ 2020-07-13  2:36 UTC (permalink / raw)
  To: Michael Ellerman; +Cc: Michael Neuling, maddy, linuxppc-dev




> On 08-Jul-2020, at 5:34 PM, Michael Ellerman <mpe@ellerman.id.au> wrote:
> 
> Athira Rajeev <atrajeev@linux.vnet.ibm.com> writes:
>> From: Anju T Sudhakar <anju@linux.vnet.ibm.com>
>> 
>> Add extended regs to sample_reg_mask in the tool side to use
>> with `-I?` option. Perf tools side uses extended mask to display
>> the platform supported register names (with -I? option) to the user
>> and also send this mask to the kernel to capture the extended registers
>> in each sample. Hence decide the mask value based on the processor
>> version.
>> 
>> Signed-off-by: Anju T Sudhakar <anju@linux.vnet.ibm.com>
>> [Decide extended mask at run time based on platform]
>> Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
>> Reviewed-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
> 
> Will need an ack from perf tools folks, who are not on Cc by the looks.
> 
>> diff --git a/tools/arch/powerpc/include/uapi/asm/perf_regs.h b/tools/arch/powerpc/include/uapi/asm/perf_regs.h
>> index f599064..485b1d5 100644
>> --- a/tools/arch/powerpc/include/uapi/asm/perf_regs.h
>> +++ b/tools/arch/powerpc/include/uapi/asm/perf_regs.h
>> @@ -48,6 +48,18 @@ enum perf_event_powerpc_regs {
>> 	PERF_REG_POWERPC_DSISR,
>> 	PERF_REG_POWERPC_SIER,
>> 	PERF_REG_POWERPC_MMCRA,
>> -	PERF_REG_POWERPC_MAX,
>> +	/* Extended registers */
>> +	PERF_REG_POWERPC_MMCR0,
>> +	PERF_REG_POWERPC_MMCR1,
>> +	PERF_REG_POWERPC_MMCR2,
>> +	/* Max regs without the extended regs */
>> +	PERF_REG_POWERPC_MAX = PERF_REG_POWERPC_MMCRA + 1,
> 
> I don't really understand this idea of a max that's not the max.

Hi Michael

This is the max without the extended regs. It is mainly used in `arch/powerpc/perf/perf_regs.c` to define pt_regs_offset (to get the index
for the other regs), and also to determine whether a requested register is an extended reg while capturing data in a sample
(in `perf_reg_value`).
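
For example, in perf_reg_value() (patch 8, roughly):

	if (idx >= PERF_REG_POWERPC_MAX && idx < PERF_REG_EXTENDED_MAX)
		return get_ext_regs_value(idx);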

Thanks
Athira

> 
>> };
>> +
>> +#define PERF_REG_PMU_MASK	((1ULL << PERF_REG_POWERPC_MAX) - 1)
>> +
>> +/* PERF_REG_EXTENDED_MASK value for CPU_FTR_ARCH_300 */
>> +#define PERF_REG_PMU_MASK_300   (((1ULL << (PERF_REG_POWERPC_MMCR2 + 1)) - 1) \
>> +				- PERF_REG_PMU_MASK)
>> +
>> #endif /* _UAPI_ASM_POWERPC_PERF_REGS_H */
>> diff --git a/tools/perf/arch/powerpc/include/perf_regs.h b/tools/perf/arch/powerpc/include/perf_regs.h
>> index e18a355..46ed00d 100644
>> --- a/tools/perf/arch/powerpc/include/perf_regs.h
>> +++ b/tools/perf/arch/powerpc/include/perf_regs.h
>> @@ -64,7 +64,10 @@
>> 	[PERF_REG_POWERPC_DAR] = "dar",
>> 	[PERF_REG_POWERPC_DSISR] = "dsisr",
>> 	[PERF_REG_POWERPC_SIER] = "sier",
>> -	[PERF_REG_POWERPC_MMCRA] = "mmcra"
>> +	[PERF_REG_POWERPC_MMCRA] = "mmcra",
>> +	[PERF_REG_POWERPC_MMCR0] = "mmcr0",
>> +	[PERF_REG_POWERPC_MMCR1] = "mmcr1",
>> +	[PERF_REG_POWERPC_MMCR2] = "mmcr2",
>> };
>> 
>> static inline const char *perf_reg_name(int id)
>> diff --git a/tools/perf/arch/powerpc/util/perf_regs.c b/tools/perf/arch/powerpc/util/perf_regs.c
>> index 0a52429..9179230 100644
>> --- a/tools/perf/arch/powerpc/util/perf_regs.c
>> +++ b/tools/perf/arch/powerpc/util/perf_regs.c
>> @@ -6,9 +6,14 @@
>> 
>> #include "../../../util/perf_regs.h"
>> #include "../../../util/debug.h"
>> +#include "../../../util/event.h"
>> +#include "../../../util/header.h"
>> +#include "../../../perf-sys.h"
>> 
>> #include <linux/kernel.h>
>> 
>> +#define PVR_POWER9		0x004E
>> +
>> const struct sample_reg sample_reg_masks[] = {
>> 	SMPL_REG(r0, PERF_REG_POWERPC_R0),
>> 	SMPL_REG(r1, PERF_REG_POWERPC_R1),
>> @@ -55,6 +60,9 @@
>> 	SMPL_REG(dsisr, PERF_REG_POWERPC_DSISR),
>> 	SMPL_REG(sier, PERF_REG_POWERPC_SIER),
>> 	SMPL_REG(mmcra, PERF_REG_POWERPC_MMCRA),
>> +	SMPL_REG(mmcr0, PERF_REG_POWERPC_MMCR0),
>> +	SMPL_REG(mmcr1, PERF_REG_POWERPC_MMCR1),
>> +	SMPL_REG(mmcr2, PERF_REG_POWERPC_MMCR2),
>> 	SMPL_REG_END
>> };
>> 
>> @@ -163,3 +171,50 @@ int arch_sdt_arg_parse_op(char *old_op, char **new_op)
>> 
>> 	return SDT_ARG_VALID;
>> }
>> +
>> +uint64_t arch__intr_reg_mask(void)
>> +{
>> +	struct perf_event_attr attr = {
>> +		.type                   = PERF_TYPE_HARDWARE,
>> +		.config                 = PERF_COUNT_HW_CPU_CYCLES,
>> +		.sample_type            = PERF_SAMPLE_REGS_INTR,
>> +		.precise_ip             = 1,
>> +		.disabled               = 1,
>> +		.exclude_kernel         = 1,
>> +	};
>> +	int fd, ret;
>> +	char buffer[64];
>> +	u32 version;
>> +	u64 extended_mask = 0;
>> +
>> +	/* Get the PVR value to set the extended
>> +	 * mask specific to platform
> 
> Comment format is wrong, and punctuation please.
> 
>> +	 */
>> +	get_cpuid(buffer, sizeof(buffer));
>> +	ret = sscanf(buffer, "%u,", &version);
> 
> This is powerpc specific code, why not just use mfspr(SPRN_PVR), rather
> than redirecting via printf/sscanf.
> 
>> +
>> +	if (ret != 1) {
>> +		pr_debug("Failed to get the processor version, unable to output extended registers\n");
>> +		return PERF_REGS_MASK;
>> +	}
>> +
>> +	if (version == PVR_POWER9)
>> +		extended_mask = PERF_REG_PMU_MASK_300;
>> +	else
>> +		return PERF_REGS_MASK;
>> +
>> +	attr.sample_regs_intr = extended_mask;
>> +	attr.sample_period = 1;
>> +	event_attr_init(&attr);
>> +
>> +	/*
>> +	 * check if the pmu supports perf extended regs, before
>> +	 * returning the register mask to sample.
>> +	 */
>> +	fd = sys_perf_event_open(&attr, 0, -1, -1, 0);
>> +	if (fd != -1) {
>> +		close(fd);
>> +		return (extended_mask | PERF_REGS_MASK);
>> +	}
>> +	return PERF_REGS_MASK;
> 
> I think this would read a bit better like:
> 
> 	mask = PERF_REGS_MASK;
> 
> 	if (version == PVR_POWER9)
> 		extended_mask = PERF_REG_PMU_MASK_300;
>        else
>        	return mask;
> 
>        attr.sample_regs_intr = extended_mask;
>        attr.sample_period = 1;
>        event_attr_init(&attr);
> 
>        /*
>          * check if the pmu supports perf extended regs, before
>          * returning the register mask to sample.
>          */
>        fd = sys_perf_event_open(&attr, 0, -1, -1, 0);
>        if (fd != -1) {
>                close(fd);
>                mask |= extended_mask;
>        }
> 
> 	return mask;
> 
> 
> cheers



* Re: [PATCH v2 09/10] tools/perf: Add perf tools support for extended register capability in powerpc
  2020-07-09  3:10     ` Athira Rajeev
@ 2020-07-13 12:47       ` Michael Ellerman
  0 siblings, 0 replies; 41+ messages in thread
From: Michael Ellerman @ 2020-07-13 12:47 UTC (permalink / raw)
  To: Athira Rajeev; +Cc: mikey, maddy, linuxppc-dev

Athira Rajeev <atrajeev@linux.vnet.ibm.com> writes:
>> On 08-Jul-2020, at 5:34 PM, Michael Ellerman <mpe@ellerman.id.au> wrote:
>> 
>> Athira Rajeev <atrajeev@linux.vnet.ibm.com> writes:
>>> From: Anju T Sudhakar <anju@linux.vnet.ibm.com>
>>> 
>>> Add extended regs to sample_reg_mask in the tool side to use
>>> with `-I?` option. Perf tools side uses extended mask to display
...
>> 
>>> +	 */
>>> +	get_cpuid(buffer, sizeof(buffer));
>>> +	ret = sscanf(buffer, "%u,", &version);
>> 
>> This is powerpc specific code, why not just use mfspr(SPRN_PVR), rather
>> than redirecting via printf/sscanf.
>
> Hi Michael
>
> For perf tools, defines for `mfspr` , `SPRN_PVR` are in arch/powerpc/util/header.c 
> So I have re-used existing utility. Otherwise, we will need to include these defines here as well
> Does that sounds good ?

They should be moved to a header in that directory that both C files can include.
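
e.g. something like (just a sketch, header name made up):

	/* tools/perf/arch/powerpc/util/utils_header.h */
	#include <linux/stringify.h>

	#define mfspr(rn)	({unsigned long rval; \
				 asm volatile("mfspr %0," __stringify(rn) \
					      : "=r" (rval)); rval; })

	#define SPRN_PVR	0x11F	/* Processor Version Register */
	#define PVR_VER(pvr)	(((pvr) >> 16) & 0xFFFF) /* Version field */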

cheers


* Re: [PATCH v2 01/10] powerpc/perf: Add support for ISA3.1 PMU SPRs
  2020-07-09  1:53     ` Athira Rajeev
@ 2020-07-13 12:50       ` Michael Ellerman
  2020-07-15  6:07         ` Athira Rajeev
  0 siblings, 1 reply; 41+ messages in thread
From: Michael Ellerman @ 2020-07-13 12:50 UTC (permalink / raw)
  To: Athira Rajeev; +Cc: Michael Neuling, maddy, linuxppc-dev

Athira Rajeev <atrajeev@linux.vnet.ibm.com> writes:
>> On 08-Jul-2020, at 4:32 PM, Michael Ellerman <mpe@ellerman.id.au> wrote:
>> 
>> Athira Rajeev <atrajeev@linux.vnet.ibm.com> writes:
>> ...
>>> diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
>>> index cd6a742..5c64bd3 100644
>>> --- a/arch/powerpc/perf/core-book3s.c
>>> +++ b/arch/powerpc/perf/core-book3s.c
>>> @@ -39,10 +39,10 @@ struct cpu_hw_events {
>>> 	unsigned int flags[MAX_HWEVENTS];
>>> 	/*
>>> 	 * The order of the MMCR array is:
>>> -	 *  - 64-bit, MMCR0, MMCR1, MMCRA, MMCR2
>>> +	 *  - 64-bit, MMCR0, MMCR1, MMCRA, MMCR2, MMCR3
>>> 	 *  - 32-bit, MMCR0, MMCR1, MMCR2
>>> 	 */
>>> -	unsigned long mmcr[4];
>>> +	unsigned long mmcr[5];
>>> 	struct perf_event *limited_counter[MAX_LIMITED_HWCOUNTERS];
>>> 	u8  limited_hwidx[MAX_LIMITED_HWCOUNTERS];
>>> 	u64 alternatives[MAX_HWEVENTS][MAX_EVENT_ALTERNATIVES];
>> ...
>>> @@ -1310,6 +1326,10 @@ static void power_pmu_enable(struct pmu *pmu)
>>> 	if (!cpuhw->n_added) {
>>> 		mtspr(SPRN_MMCRA, cpuhw->mmcr[2] & ~MMCRA_SAMPLE_ENABLE);
>>> 		mtspr(SPRN_MMCR1, cpuhw->mmcr[1]);
>>> +#ifdef CONFIG_PPC64
>>> +		if (ppmu->flags & PPMU_ARCH_310S)
>>> +			mtspr(SPRN_MMCR3, cpuhw->mmcr[4]);
>>> +#endif /* CONFIG_PPC64 */
>>> 		goto out_enable;
>>> 	}
>>> 
>>> @@ -1353,6 +1373,11 @@ static void power_pmu_enable(struct pmu *pmu)
>>> 	if (ppmu->flags & PPMU_ARCH_207S)
>>> 		mtspr(SPRN_MMCR2, cpuhw->mmcr[3]);
>>> 
>>> +#ifdef CONFIG_PPC64
>>> +	if (ppmu->flags & PPMU_ARCH_310S)
>>> +		mtspr(SPRN_MMCR3, cpuhw->mmcr[4]);
>>> +#endif /* CONFIG_PPC64 */
>> 
>> I don't think you need the #ifdef CONFIG_PPC64?
>
> Hi Michael
>
> Thanks for reviewing this series.
>
> SPRN_MMCR3 is not defined for PPC32 and we hit build failure for pmac32_defconfig.
> The #ifdef CONFIG_PPC64 is to address this.

We like to avoid #ifdefs in the body of the code like that.

There's a bunch of existing #defines near the top of the file to make
32-bit work, I think you should just add another for this, so eg:

#ifdef CONFIG_PPC32
...
#define SPRN_MMCR3	0

cheers


* Re: [PATCH v2 01/10] powerpc/perf: Add support for ISA3.1 PMU SPRs
  2020-07-13 12:50       ` Michael Ellerman
@ 2020-07-15  6:07         ` Athira Rajeev
  0 siblings, 0 replies; 41+ messages in thread
From: Athira Rajeev @ 2020-07-15  6:07 UTC (permalink / raw)
  To: Michael Ellerman; +Cc: Michael Neuling, maddy, linuxppc-dev




> On 13-Jul-2020, at 6:20 PM, Michael Ellerman <mpe@ellerman.id.au> wrote:
> 
> Athira Rajeev <atrajeev@linux.vnet.ibm.com> writes:
>>> On 08-Jul-2020, at 4:32 PM, Michael Ellerman <mpe@ellerman.id.au> wrote:
>>> 
>>> Athira Rajeev <atrajeev@linux.vnet.ibm.com> writes:
>>> ...
>>>> diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
>>>> index cd6a742..5c64bd3 100644
>>>> --- a/arch/powerpc/perf/core-book3s.c
>>>> +++ b/arch/powerpc/perf/core-book3s.c
>>>> @@ -39,10 +39,10 @@ struct cpu_hw_events {
>>>> 	unsigned int flags[MAX_HWEVENTS];
>>>> 	/*
>>>> 	 * The order of the MMCR array is:
>>>> -	 *  - 64-bit, MMCR0, MMCR1, MMCRA, MMCR2
>>>> +	 *  - 64-bit, MMCR0, MMCR1, MMCRA, MMCR2, MMCR3
>>>> 	 *  - 32-bit, MMCR0, MMCR1, MMCR2
>>>> 	 */
>>>> -	unsigned long mmcr[4];
>>>> +	unsigned long mmcr[5];
>>>> 	struct perf_event *limited_counter[MAX_LIMITED_HWCOUNTERS];
>>>> 	u8  limited_hwidx[MAX_LIMITED_HWCOUNTERS];
>>>> 	u64 alternatives[MAX_HWEVENTS][MAX_EVENT_ALTERNATIVES];
>>> ...
>>>> @@ -1310,6 +1326,10 @@ static void power_pmu_enable(struct pmu *pmu)
>>>> 	if (!cpuhw->n_added) {
>>>> 		mtspr(SPRN_MMCRA, cpuhw->mmcr[2] & ~MMCRA_SAMPLE_ENABLE);
>>>> 		mtspr(SPRN_MMCR1, cpuhw->mmcr[1]);
>>>> +#ifdef CONFIG_PPC64
>>>> +		if (ppmu->flags & PPMU_ARCH_310S)
>>>> +			mtspr(SPRN_MMCR3, cpuhw->mmcr[4]);
>>>> +#endif /* CONFIG_PPC64 */
>>>> 		goto out_enable;
>>>> 	}
>>>> 
>>>> @@ -1353,6 +1373,11 @@ static void power_pmu_enable(struct pmu *pmu)
>>>> 	if (ppmu->flags & PPMU_ARCH_207S)
>>>> 		mtspr(SPRN_MMCR2, cpuhw->mmcr[3]);
>>>> 
>>>> +#ifdef CONFIG_PPC64
>>>> +	if (ppmu->flags & PPMU_ARCH_310S)
>>>> +		mtspr(SPRN_MMCR3, cpuhw->mmcr[4]);
>>>> +#endif /* CONFIG_PPC64 */
>>> 
>>> I don't think you need the #ifdef CONFIG_PPC64?
>> 
>> Hi Michael
>> 
>> Thanks for reviewing this series.
>> 
>> SPRN_MMCR3 is not defined for PPC32 and we hit build failure for pmac32_defconfig.
>> The #ifdef CONFIG_PPC64 is to address this.
> 
> We like to avoid #ifdefs in the body of the code like that.
> 
> There's a bunch of existing #defines near the top of the file to make
> 32-bit work, I think you should just add another for this, so eg:
> 
> #ifdef CONFIG_PPC32
> ...
> #define SPRN_MMCR3	0

Ok. I found that we currently do the same as you suggest for MMCRA, which is not supported on 32-bit.
I will work on this change.
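
For reference, the existing 32-bit handling near the top of
core-book3s.c is roughly:

	#ifdef CONFIG_PPC32
	...
	#define SPRN_MMCRA	SPRN_MMCR2
	#define MMCRA_SAMPLE_ENABLE	0

so I will add a similar define for SPRN_MMCR3 there.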

Thanks
Athira 
> 
> cheers



end of thread, other threads:[~2020-07-15  6:13 UTC | newest]

Thread overview: 41+ messages
2020-07-01  9:20 [PATCH v2 00/10] powerpc/perf: Add support for power10 PMU Hardware Athira Rajeev
2020-07-01  9:20 ` [PATCH v2 01/10] powerpc/perf: Add support for ISA3.1 PMU SPRs Athira Rajeev
2020-07-08 11:02   ` Michael Ellerman
2020-07-09  1:53     ` Athira Rajeev
2020-07-13 12:50       ` Michael Ellerman
2020-07-15  6:07         ` Athira Rajeev
2020-07-01  9:20 ` [PATCH v2 02/10] KVM: PPC: Book3S HV: Save/restore new PMU registers Athira Rajeev
2020-07-01 11:11   ` Paul Mackerras
2020-07-02  6:22     ` Athira Rajeev
2020-07-07  6:13   ` Michael Neuling
2020-07-01  9:20 ` [PATCH v2 03/10] powerpc/xmon: Add PowerISA v3.1 PMU SPRs Athira Rajeev
2020-07-08 11:04   ` Michael Ellerman
2020-07-09  1:57     ` Athira Rajeev
2020-07-01  9:20 ` [PATCH v2 04/10] powerpc/perf: Add power10_feat to dt_cpu_ftrs Athira Rajeev
2020-07-07  6:22   ` Michael Neuling
2020-07-08  2:13     ` Athira Rajeev
2020-07-08 11:15   ` Michael Ellerman
2020-07-09 11:07     ` Athira Rajeev
2020-07-01  9:20 ` [PATCH v2 05/10] powerpc/perf: Update Power PMU cache_events to u64 type Athira Rajeev
2020-07-01  9:20 ` [PATCH v2 06/10] powerpc/perf: power10 Performance Monitoring support Athira Rajeev
2020-07-02  9:06   ` kernel test robot
2020-07-07  6:50   ` Michael Neuling
2020-07-08 10:56     ` Athira Rajeev
2020-07-01  9:20 ` [PATCH v2 07/10] powerpc/perf: support BHRB disable bit and new filtering modes Athira Rajeev
2020-07-07  7:17   ` Michael Neuling
2020-07-08  7:41     ` Athira Rajeev
2020-07-08  7:43     ` Gautham R Shenoy
2020-07-09  2:01       ` Athira Rajeev
2020-07-08 11:42   ` Michael Ellerman
2020-07-09  2:43     ` Athira Rajeev
2020-07-01  9:21 ` [PATCH v2 08/10] powerpc/perf: Add support for outputting extended regs in perf intr_regs Athira Rajeev
2020-07-01  9:21 ` [PATCH v2 09/10] tools/perf: Add perf tools support for extended register capability in powerpc Athira Rajeev
2020-07-08 12:04   ` Michael Ellerman
2020-07-09  3:10     ` Athira Rajeev
2020-07-13 12:47       ` Michael Ellerman
2020-07-13  2:36     ` Athira Rajeev
2020-07-01  9:21 ` [PATCH v2 10/10] powerpc/perf: Add extended regs support for power10 platform Athira Rajeev
2020-07-02  9:40   ` kernel test robot
2020-07-08  1:53     ` Athira Rajeev
2020-07-08 12:04   ` Michael Ellerman
2020-07-09  6:29     ` Athira Rajeev
