All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/6] perf/amd/iommu: Enable multi-IOMMU support
@ 2015-12-22 19:19 ` Suravee Suthikulpanit
  0 siblings, 0 replies; 18+ messages in thread
From: Suravee Suthikulpanit @ 2015-12-22 19:19 UTC (permalink / raw)
  To: joro, bp, peterz, mingo, acme; +Cc: linux-kernel, iommu, Suravee Suthikulpanit

This patch series modifies the existing perf_event_amd_iommu driver
to support systems with multiple IOMMUs. It introduces new AMD IOMMU APIs,
which will are used by the AMD IOMMU Perf driver to access performance
counters in multiple IOMMUs.

In addition, this series should also fix current AMD IOMMU PMU driver
initialization issue in some existing CZ platform.

Suravee Suthikulpanit (6):
  perf/amd/iommu: Consolidate and move perf_event_amd_iommu header
  perf/amd/iommu: Modify functions to query max banks and counters
  iommu/amd: Introduce amd_iommu_get_num_iommus()
  perf/amd/iommu: Introduce data structure for tracking prev count.
  perf/amd/iommu: Introduce get_iommu_bnk_cnt_evt_idx
  perf/amd/iommu: Enable support for multiple IOMMUs

 arch/x86/kernel/cpu/perf_event_amd_iommu.c | 145 +++++++++++++++++++----------
 arch/x86/kernel/cpu/perf_event_amd_iommu.h |  40 --------
 drivers/iommu/amd_iommu_init.c             | 125 +++++++++++++++++++++----
 drivers/iommu/amd_iommu_proto.h            |   7 --
 include/linux/perf/perf_event_amd_iommu.h  |  43 +++++++++
 5 files changed, 248 insertions(+), 112 deletions(-)
 delete mode 100644 arch/x86/kernel/cpu/perf_event_amd_iommu.h
 create mode 100644 include/linux/perf/perf_event_amd_iommu.h

-- 
1.9.1


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH 0/6] perf/amd/iommu: Enable multi-IOMMU support
@ 2015-12-22 19:19 ` Suravee Suthikulpanit
  0 siblings, 0 replies; 18+ messages in thread
From: Suravee Suthikulpanit @ 2015-12-22 19:19 UTC (permalink / raw)
  To: joro-zLv9SwRftAIdnm+yROfE0A, bp-Gina5bIWoIWzQB+pC5nmwQ,
	peterz-wEGCiKHe2LqWVfeAwA7xHQ, mingo-H+wXaHxf7aLQT0dZR+AlfA,
	acme-DgEjT+Ai2ygdnm+yROfE0A
  Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA

This patch series modifies the existing perf_event_amd_iommu driver
to support systems with multiple IOMMUs. It introduces new AMD IOMMU APIs,
which will are used by the AMD IOMMU Perf driver to access performance
counters in multiple IOMMUs.

In addition, this series should also fix current AMD IOMMU PMU driver
initialization issue in some existing CZ platform.

Suravee Suthikulpanit (6):
  perf/amd/iommu: Consolidate and move perf_event_amd_iommu header
  perf/amd/iommu: Modify functions to query max banks and counters
  iommu/amd: Introduce amd_iommu_get_num_iommus()
  perf/amd/iommu: Introduce data structure for tracking prev count.
  perf/amd/iommu: Introduce get_iommu_bnk_cnt_evt_idx
  perf/amd/iommu: Enable support for multiple IOMMUs

 arch/x86/kernel/cpu/perf_event_amd_iommu.c | 145 +++++++++++++++++++----------
 arch/x86/kernel/cpu/perf_event_amd_iommu.h |  40 --------
 drivers/iommu/amd_iommu_init.c             | 125 +++++++++++++++++++++----
 drivers/iommu/amd_iommu_proto.h            |   7 --
 include/linux/perf/perf_event_amd_iommu.h  |  43 +++++++++
 5 files changed, 248 insertions(+), 112 deletions(-)
 delete mode 100644 arch/x86/kernel/cpu/perf_event_amd_iommu.h
 create mode 100644 include/linux/perf/perf_event_amd_iommu.h

-- 
1.9.1

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH 1/6] perf/amd/iommu: Consolidate and move perf_event_amd_iommu header
@ 2015-12-22 19:19   ` Suravee Suthikulpanit
  0 siblings, 0 replies; 18+ messages in thread
From: Suravee Suthikulpanit @ 2015-12-22 19:19 UTC (permalink / raw)
  To: joro, bp, peterz, mingo, acme; +Cc: linux-kernel, iommu, Suravee Suthikulpanit

This patch consolidates "arch/x86/kernel/cpu/perf_event_amd_iommu.h" and
"drivers/iommu/amd_iommu_proto.h", which contain duplicate function
declarations, into "include/linux/perf/perf_event_amd_iommu.h"

Signed-off-by: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
---
 arch/x86/kernel/cpu/perf_event_amd_iommu.c |  2 +-
 arch/x86/kernel/cpu/perf_event_amd_iommu.h | 40 ------------------------------
 drivers/iommu/amd_iommu_init.c             |  2 ++
 drivers/iommu/amd_iommu_proto.h            |  7 ------
 include/linux/perf/perf_event_amd_iommu.h  | 40 ++++++++++++++++++++++++++++++
 5 files changed, 43 insertions(+), 48 deletions(-)
 delete mode 100644 arch/x86/kernel/cpu/perf_event_amd_iommu.h
 create mode 100644 include/linux/perf/perf_event_amd_iommu.h

diff --git a/arch/x86/kernel/cpu/perf_event_amd_iommu.c b/arch/x86/kernel/cpu/perf_event_amd_iommu.c
index 86f8259..1192f1d 100644
--- a/arch/x86/kernel/cpu/perf_event_amd_iommu.c
+++ b/arch/x86/kernel/cpu/perf_event_amd_iommu.c
@@ -12,12 +12,12 @@
  */
 
 #include <linux/perf_event.h>
+#include <linux/perf/perf_event_amd_iommu.h>
 #include <linux/module.h>
 #include <linux/cpumask.h>
 #include <linux/slab.h>
 
 #include "perf_event.h"
-#include "perf_event_amd_iommu.h"
 
 #define COUNTER_SHIFT		16
 
diff --git a/arch/x86/kernel/cpu/perf_event_amd_iommu.h b/arch/x86/kernel/cpu/perf_event_amd_iommu.h
deleted file mode 100644
index 845d173..0000000
--- a/arch/x86/kernel/cpu/perf_event_amd_iommu.h
+++ /dev/null
@@ -1,40 +0,0 @@
-/*
- * Copyright (C) 2013 Advanced Micro Devices, Inc.
- *
- * Author: Steven Kinney <Steven.Kinney@amd.com>
- * Author: Suravee Suthikulpanit <Suraveee.Suthikulpanit@amd.com>
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 as
- * published by the Free Software Foundation.
- */
-
-#ifndef _PERF_EVENT_AMD_IOMMU_H_
-#define _PERF_EVENT_AMD_IOMMU_H_
-
-/* iommu pc mmio region register indexes */
-#define IOMMU_PC_COUNTER_REG			0x00
-#define IOMMU_PC_COUNTER_SRC_REG		0x08
-#define IOMMU_PC_PASID_MATCH_REG		0x10
-#define IOMMU_PC_DOMID_MATCH_REG		0x18
-#define IOMMU_PC_DEVID_MATCH_REG		0x20
-#define IOMMU_PC_COUNTER_REPORT_REG		0x28
-
-/* maximun specified bank/counters */
-#define PC_MAX_SPEC_BNKS			64
-#define PC_MAX_SPEC_CNTRS			16
-
-/* iommu pc reg masks*/
-#define IOMMU_BASE_DEVID			0x0000
-
-/* amd_iommu_init.c external support functions */
-extern bool amd_iommu_pc_supported(void);
-
-extern u8 amd_iommu_pc_get_max_banks(u16 devid);
-
-extern u8 amd_iommu_pc_get_max_counters(u16 devid);
-
-extern int amd_iommu_pc_get_set_reg_val(u16 devid, u8 bank, u8 cntr,
-			u8 fxn, u64 *value, bool is_write);
-
-#endif /*_PERF_EVENT_AMD_IOMMU_H_*/
diff --git a/drivers/iommu/amd_iommu_init.c b/drivers/iommu/amd_iommu_init.c
index 013bdff..b6d684c 100644
--- a/drivers/iommu/amd_iommu_init.c
+++ b/drivers/iommu/amd_iommu_init.c
@@ -27,6 +27,8 @@
 #include <linux/amd-iommu.h>
 #include <linux/export.h>
 #include <linux/iommu.h>
+#include <linux/perf/perf_event_amd_iommu.h>
+
 #include <asm/pci-direct.h>
 #include <asm/iommu.h>
 #include <asm/gart.h>
diff --git a/drivers/iommu/amd_iommu_proto.h b/drivers/iommu/amd_iommu_proto.h
index 0bd9eb3..ac2da91 100644
--- a/drivers/iommu/amd_iommu_proto.h
+++ b/drivers/iommu/amd_iommu_proto.h
@@ -55,13 +55,6 @@ extern int amd_iommu_domain_set_gcr3(struct iommu_domain *dom, int pasid,
 extern int amd_iommu_domain_clear_gcr3(struct iommu_domain *dom, int pasid);
 extern struct iommu_domain *amd_iommu_get_v2_domain(struct pci_dev *pdev);
 
-/* IOMMU Performance Counter functions */
-extern bool amd_iommu_pc_supported(void);
-extern u8 amd_iommu_pc_get_max_banks(u16 devid);
-extern u8 amd_iommu_pc_get_max_counters(u16 devid);
-extern int amd_iommu_pc_get_set_reg_val(u16 devid, u8 bank, u8 cntr, u8 fxn,
-				    u64 *value, bool is_write);
-
 #ifdef CONFIG_IRQ_REMAP
 extern int amd_iommu_create_irq_domain(struct amd_iommu *iommu);
 #else
diff --git a/include/linux/perf/perf_event_amd_iommu.h b/include/linux/perf/perf_event_amd_iommu.h
new file mode 100644
index 0000000..845d173
--- /dev/null
+++ b/include/linux/perf/perf_event_amd_iommu.h
@@ -0,0 +1,40 @@
+/*
+ * Copyright (C) 2013 Advanced Micro Devices, Inc.
+ *
+ * Author: Steven Kinney <Steven.Kinney@amd.com>
+ * Author: Suravee Suthikulpanit <Suraveee.Suthikulpanit@amd.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#ifndef _PERF_EVENT_AMD_IOMMU_H_
+#define _PERF_EVENT_AMD_IOMMU_H_
+
+/* iommu pc mmio region register indexes */
+#define IOMMU_PC_COUNTER_REG			0x00
+#define IOMMU_PC_COUNTER_SRC_REG		0x08
+#define IOMMU_PC_PASID_MATCH_REG		0x10
+#define IOMMU_PC_DOMID_MATCH_REG		0x18
+#define IOMMU_PC_DEVID_MATCH_REG		0x20
+#define IOMMU_PC_COUNTER_REPORT_REG		0x28
+
+/* maximun specified bank/counters */
+#define PC_MAX_SPEC_BNKS			64
+#define PC_MAX_SPEC_CNTRS			16
+
+/* iommu pc reg masks*/
+#define IOMMU_BASE_DEVID			0x0000
+
+/* amd_iommu_init.c external support functions */
+extern bool amd_iommu_pc_supported(void);
+
+extern u8 amd_iommu_pc_get_max_banks(u16 devid);
+
+extern u8 amd_iommu_pc_get_max_counters(u16 devid);
+
+extern int amd_iommu_pc_get_set_reg_val(u16 devid, u8 bank, u8 cntr,
+			u8 fxn, u64 *value, bool is_write);
+
+#endif /*_PERF_EVENT_AMD_IOMMU_H_*/
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 1/6] perf/amd/iommu: Consolidate and move perf_event_amd_iommu header
@ 2015-12-22 19:19   ` Suravee Suthikulpanit
  0 siblings, 0 replies; 18+ messages in thread
From: Suravee Suthikulpanit @ 2015-12-22 19:19 UTC (permalink / raw)
  To: joro-zLv9SwRftAIdnm+yROfE0A, bp-Gina5bIWoIWzQB+pC5nmwQ,
	peterz-wEGCiKHe2LqWVfeAwA7xHQ, mingo-H+wXaHxf7aLQT0dZR+AlfA,
	acme-DgEjT+Ai2ygdnm+yROfE0A
  Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA

This patch consolidates "arch/x86/kernel/cpu/perf_event_amd_iommu.h" and
"drivers/iommu/amd_iommu_proto.h", which contain duplicate function
declarations, into "include/linux/perf/perf_event_amd_iommu.h"

Signed-off-by: Suravee Suthikulpanit <Suravee.Suthikulpanit-5C7GfCeVMHo@public.gmane.org>
---
 arch/x86/kernel/cpu/perf_event_amd_iommu.c |  2 +-
 arch/x86/kernel/cpu/perf_event_amd_iommu.h | 40 ------------------------------
 drivers/iommu/amd_iommu_init.c             |  2 ++
 drivers/iommu/amd_iommu_proto.h            |  7 ------
 include/linux/perf/perf_event_amd_iommu.h  | 40 ++++++++++++++++++++++++++++++
 5 files changed, 43 insertions(+), 48 deletions(-)
 delete mode 100644 arch/x86/kernel/cpu/perf_event_amd_iommu.h
 create mode 100644 include/linux/perf/perf_event_amd_iommu.h

diff --git a/arch/x86/kernel/cpu/perf_event_amd_iommu.c b/arch/x86/kernel/cpu/perf_event_amd_iommu.c
index 86f8259..1192f1d 100644
--- a/arch/x86/kernel/cpu/perf_event_amd_iommu.c
+++ b/arch/x86/kernel/cpu/perf_event_amd_iommu.c
@@ -12,12 +12,12 @@
  */
 
 #include <linux/perf_event.h>
+#include <linux/perf/perf_event_amd_iommu.h>
 #include <linux/module.h>
 #include <linux/cpumask.h>
 #include <linux/slab.h>
 
 #include "perf_event.h"
-#include "perf_event_amd_iommu.h"
 
 #define COUNTER_SHIFT		16
 
diff --git a/arch/x86/kernel/cpu/perf_event_amd_iommu.h b/arch/x86/kernel/cpu/perf_event_amd_iommu.h
deleted file mode 100644
index 845d173..0000000
--- a/arch/x86/kernel/cpu/perf_event_amd_iommu.h
+++ /dev/null
@@ -1,40 +0,0 @@
-/*
- * Copyright (C) 2013 Advanced Micro Devices, Inc.
- *
- * Author: Steven Kinney <Steven.Kinney-5C7GfCeVMHo@public.gmane.org>
- * Author: Suravee Suthikulpanit <Suraveee.Suthikulpanit-5C7GfCeVMHo@public.gmane.org>
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 as
- * published by the Free Software Foundation.
- */
-
-#ifndef _PERF_EVENT_AMD_IOMMU_H_
-#define _PERF_EVENT_AMD_IOMMU_H_
-
-/* iommu pc mmio region register indexes */
-#define IOMMU_PC_COUNTER_REG			0x00
-#define IOMMU_PC_COUNTER_SRC_REG		0x08
-#define IOMMU_PC_PASID_MATCH_REG		0x10
-#define IOMMU_PC_DOMID_MATCH_REG		0x18
-#define IOMMU_PC_DEVID_MATCH_REG		0x20
-#define IOMMU_PC_COUNTER_REPORT_REG		0x28
-
-/* maximun specified bank/counters */
-#define PC_MAX_SPEC_BNKS			64
-#define PC_MAX_SPEC_CNTRS			16
-
-/* iommu pc reg masks*/
-#define IOMMU_BASE_DEVID			0x0000
-
-/* amd_iommu_init.c external support functions */
-extern bool amd_iommu_pc_supported(void);
-
-extern u8 amd_iommu_pc_get_max_banks(u16 devid);
-
-extern u8 amd_iommu_pc_get_max_counters(u16 devid);
-
-extern int amd_iommu_pc_get_set_reg_val(u16 devid, u8 bank, u8 cntr,
-			u8 fxn, u64 *value, bool is_write);
-
-#endif /*_PERF_EVENT_AMD_IOMMU_H_*/
diff --git a/drivers/iommu/amd_iommu_init.c b/drivers/iommu/amd_iommu_init.c
index 013bdff..b6d684c 100644
--- a/drivers/iommu/amd_iommu_init.c
+++ b/drivers/iommu/amd_iommu_init.c
@@ -27,6 +27,8 @@
 #include <linux/amd-iommu.h>
 #include <linux/export.h>
 #include <linux/iommu.h>
+#include <linux/perf/perf_event_amd_iommu.h>
+
 #include <asm/pci-direct.h>
 #include <asm/iommu.h>
 #include <asm/gart.h>
diff --git a/drivers/iommu/amd_iommu_proto.h b/drivers/iommu/amd_iommu_proto.h
index 0bd9eb3..ac2da91 100644
--- a/drivers/iommu/amd_iommu_proto.h
+++ b/drivers/iommu/amd_iommu_proto.h
@@ -55,13 +55,6 @@ extern int amd_iommu_domain_set_gcr3(struct iommu_domain *dom, int pasid,
 extern int amd_iommu_domain_clear_gcr3(struct iommu_domain *dom, int pasid);
 extern struct iommu_domain *amd_iommu_get_v2_domain(struct pci_dev *pdev);
 
-/* IOMMU Performance Counter functions */
-extern bool amd_iommu_pc_supported(void);
-extern u8 amd_iommu_pc_get_max_banks(u16 devid);
-extern u8 amd_iommu_pc_get_max_counters(u16 devid);
-extern int amd_iommu_pc_get_set_reg_val(u16 devid, u8 bank, u8 cntr, u8 fxn,
-				    u64 *value, bool is_write);
-
 #ifdef CONFIG_IRQ_REMAP
 extern int amd_iommu_create_irq_domain(struct amd_iommu *iommu);
 #else
diff --git a/include/linux/perf/perf_event_amd_iommu.h b/include/linux/perf/perf_event_amd_iommu.h
new file mode 100644
index 0000000..845d173
--- /dev/null
+++ b/include/linux/perf/perf_event_amd_iommu.h
@@ -0,0 +1,40 @@
+/*
+ * Copyright (C) 2013 Advanced Micro Devices, Inc.
+ *
+ * Author: Steven Kinney <Steven.Kinney-5C7GfCeVMHo@public.gmane.org>
+ * Author: Suravee Suthikulpanit <Suraveee.Suthikulpanit-5C7GfCeVMHo@public.gmane.org>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#ifndef _PERF_EVENT_AMD_IOMMU_H_
+#define _PERF_EVENT_AMD_IOMMU_H_
+
+/* iommu pc mmio region register indexes */
+#define IOMMU_PC_COUNTER_REG			0x00
+#define IOMMU_PC_COUNTER_SRC_REG		0x08
+#define IOMMU_PC_PASID_MATCH_REG		0x10
+#define IOMMU_PC_DOMID_MATCH_REG		0x18
+#define IOMMU_PC_DEVID_MATCH_REG		0x20
+#define IOMMU_PC_COUNTER_REPORT_REG		0x28
+
+/* maximun specified bank/counters */
+#define PC_MAX_SPEC_BNKS			64
+#define PC_MAX_SPEC_CNTRS			16
+
+/* iommu pc reg masks*/
+#define IOMMU_BASE_DEVID			0x0000
+
+/* amd_iommu_init.c external support functions */
+extern bool amd_iommu_pc_supported(void);
+
+extern u8 amd_iommu_pc_get_max_banks(u16 devid);
+
+extern u8 amd_iommu_pc_get_max_counters(u16 devid);
+
+extern int amd_iommu_pc_get_set_reg_val(u16 devid, u8 bank, u8 cntr,
+			u8 fxn, u64 *value, bool is_write);
+
+#endif /*_PERF_EVENT_AMD_IOMMU_H_*/
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 2/6] perf/amd/iommu: Modify functions to query max banks and counters
@ 2015-12-22 19:19   ` Suravee Suthikulpanit
  0 siblings, 0 replies; 18+ messages in thread
From: Suravee Suthikulpanit @ 2015-12-22 19:19 UTC (permalink / raw)
  To: joro, bp, peterz, mingo, acme; +Cc: linux-kernel, iommu, Suravee Suthikulpanit

Currently, amd_iommu_pc_get_max_[banks|counters]() require devid,
which should not be the case. Also, these don't properly support
multi-IOMMU system.

Current and future AMD systems with IOMMU that support perf counter
would likely contain homogeneous IOMMUs where multiple IOMMUs are
availalbe. So, this patch modifies these function to iterate all IOMMU
to check the max banks and counters reported by the hardware.

Signed-off-by: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
---
 arch/x86/kernel/cpu/perf_event_amd_iommu.c | 17 +++++++----------
 drivers/iommu/amd_iommu_init.c             | 20 ++++++++++++--------
 include/linux/perf/perf_event_amd_iommu.h  |  7 ++-----
 3 files changed, 21 insertions(+), 23 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event_amd_iommu.c b/arch/x86/kernel/cpu/perf_event_amd_iommu.c
index 1192f1d..e6d2485 100644
--- a/arch/x86/kernel/cpu/perf_event_amd_iommu.c
+++ b/arch/x86/kernel/cpu/perf_event_amd_iommu.c
@@ -237,14 +237,6 @@ static int perf_iommu_event_init(struct perf_event *event)
 		return -EINVAL;
 	}
 
-	/* integrate with iommu base devid (0000), assume one iommu */
-	perf_iommu->max_banks =
-		amd_iommu_pc_get_max_banks(IOMMU_BASE_DEVID);
-	perf_iommu->max_counters =
-		amd_iommu_pc_get_max_counters(IOMMU_BASE_DEVID);
-	if ((perf_iommu->max_banks == 0) || (perf_iommu->max_counters == 0))
-		return -EINVAL;
-
 	/* update the hw_perf_event struct with the iommu config data */
 	hwc->config = config;
 	hwc->extra_reg.config = config1;
@@ -455,6 +447,11 @@ static __init int _init_perf_amd_iommu(
 	if (_init_events_attrs(perf_iommu) != 0)
 		pr_err("perf: amd_iommu: Only support raw events.\n");
 
+	perf_iommu->max_banks = amd_iommu_pc_get_max_banks();
+	perf_iommu->max_counters = amd_iommu_pc_get_max_counters();
+	if ((perf_iommu->max_banks == 0) || (perf_iommu->max_counters == 0))
+		return -EINVAL;
+
 	/* Init null attributes */
 	perf_iommu->null_group = NULL;
 	perf_iommu->pmu.attr_groups = perf_iommu->attr_groups;
@@ -465,8 +462,8 @@ static __init int _init_perf_amd_iommu(
 		amd_iommu_pc_exit();
 	} else {
 		pr_info("perf: amd_iommu: Detected. (%d banks, %d counters/bank)\n",
-			amd_iommu_pc_get_max_banks(IOMMU_BASE_DEVID),
-			amd_iommu_pc_get_max_counters(IOMMU_BASE_DEVID));
+			amd_iommu_pc_get_max_banks(),
+			amd_iommu_pc_get_max_counters());
 	}
 
 	return ret;
diff --git a/drivers/iommu/amd_iommu_init.c b/drivers/iommu/amd_iommu_init.c
index b6d684c..275c0f5 100644
--- a/drivers/iommu/amd_iommu_init.c
+++ b/drivers/iommu/amd_iommu_init.c
@@ -2251,15 +2251,17 @@ EXPORT_SYMBOL(amd_iommu_v2_supported);
  *
  ****************************************************************************/
 
-u8 amd_iommu_pc_get_max_banks(u16 devid)
+u8 amd_iommu_pc_get_max_banks(void)
 {
 	struct amd_iommu *iommu;
 	u8 ret = 0;
 
-	/* locate the iommu governing the devid */
-	iommu = amd_iommu_rlookup_table[devid];
-	if (iommu)
+	for_each_iommu(iommu) {
+		if (!iommu->max_banks ||
+		    (ret && (iommu->max_banks != ret)))
+			return 0;
 		ret = iommu->max_banks;
+	}
 
 	return ret;
 }
@@ -2271,15 +2273,17 @@ bool amd_iommu_pc_supported(void)
 }
 EXPORT_SYMBOL(amd_iommu_pc_supported);
 
-u8 amd_iommu_pc_get_max_counters(u16 devid)
+u8 amd_iommu_pc_get_max_counters(void)
 {
 	struct amd_iommu *iommu;
 	u8 ret = 0;
 
-	/* locate the iommu governing the devid */
-	iommu = amd_iommu_rlookup_table[devid];
-	if (iommu)
+	for_each_iommu(iommu) {
+		if (!iommu->max_counters ||
+		    (ret && (iommu->max_counters != ret)))
+			return 0;
 		ret = iommu->max_counters;
+	}
 
 	return ret;
 }
diff --git a/include/linux/perf/perf_event_amd_iommu.h b/include/linux/perf/perf_event_amd_iommu.h
index 845d173..815eabb 100644
--- a/include/linux/perf/perf_event_amd_iommu.h
+++ b/include/linux/perf/perf_event_amd_iommu.h
@@ -24,15 +24,12 @@
 #define PC_MAX_SPEC_BNKS			64
 #define PC_MAX_SPEC_CNTRS			16
 
-/* iommu pc reg masks*/
-#define IOMMU_BASE_DEVID			0x0000
-
 /* amd_iommu_init.c external support functions */
 extern bool amd_iommu_pc_supported(void);
 
-extern u8 amd_iommu_pc_get_max_banks(u16 devid);
+extern u8 amd_iommu_pc_get_max_banks(void);
 
-extern u8 amd_iommu_pc_get_max_counters(u16 devid);
+extern u8 amd_iommu_pc_get_max_counters(void);
 
 extern int amd_iommu_pc_get_set_reg_val(u16 devid, u8 bank, u8 cntr,
 			u8 fxn, u64 *value, bool is_write);
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 2/6] perf/amd/iommu: Modify functions to query max banks and counters
@ 2015-12-22 19:19   ` Suravee Suthikulpanit
  0 siblings, 0 replies; 18+ messages in thread
From: Suravee Suthikulpanit @ 2015-12-22 19:19 UTC (permalink / raw)
  To: joro-zLv9SwRftAIdnm+yROfE0A, bp-Gina5bIWoIWzQB+pC5nmwQ,
	peterz-wEGCiKHe2LqWVfeAwA7xHQ, mingo-H+wXaHxf7aLQT0dZR+AlfA,
	acme-DgEjT+Ai2ygdnm+yROfE0A
  Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA

Currently, amd_iommu_pc_get_max_[banks|counters]() require devid,
which should not be the case. Also, these don't properly support
multi-IOMMU system.

Current and future AMD systems with IOMMU that support perf counter
would likely contain homogeneous IOMMUs where multiple IOMMUs are
availalbe. So, this patch modifies these function to iterate all IOMMU
to check the max banks and counters reported by the hardware.

Signed-off-by: Suravee Suthikulpanit <Suravee.Suthikulpanit-5C7GfCeVMHo@public.gmane.org>
---
 arch/x86/kernel/cpu/perf_event_amd_iommu.c | 17 +++++++----------
 drivers/iommu/amd_iommu_init.c             | 20 ++++++++++++--------
 include/linux/perf/perf_event_amd_iommu.h  |  7 ++-----
 3 files changed, 21 insertions(+), 23 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event_amd_iommu.c b/arch/x86/kernel/cpu/perf_event_amd_iommu.c
index 1192f1d..e6d2485 100644
--- a/arch/x86/kernel/cpu/perf_event_amd_iommu.c
+++ b/arch/x86/kernel/cpu/perf_event_amd_iommu.c
@@ -237,14 +237,6 @@ static int perf_iommu_event_init(struct perf_event *event)
 		return -EINVAL;
 	}
 
-	/* integrate with iommu base devid (0000), assume one iommu */
-	perf_iommu->max_banks =
-		amd_iommu_pc_get_max_banks(IOMMU_BASE_DEVID);
-	perf_iommu->max_counters =
-		amd_iommu_pc_get_max_counters(IOMMU_BASE_DEVID);
-	if ((perf_iommu->max_banks == 0) || (perf_iommu->max_counters == 0))
-		return -EINVAL;
-
 	/* update the hw_perf_event struct with the iommu config data */
 	hwc->config = config;
 	hwc->extra_reg.config = config1;
@@ -455,6 +447,11 @@ static __init int _init_perf_amd_iommu(
 	if (_init_events_attrs(perf_iommu) != 0)
 		pr_err("perf: amd_iommu: Only support raw events.\n");
 
+	perf_iommu->max_banks = amd_iommu_pc_get_max_banks();
+	perf_iommu->max_counters = amd_iommu_pc_get_max_counters();
+	if ((perf_iommu->max_banks == 0) || (perf_iommu->max_counters == 0))
+		return -EINVAL;
+
 	/* Init null attributes */
 	perf_iommu->null_group = NULL;
 	perf_iommu->pmu.attr_groups = perf_iommu->attr_groups;
@@ -465,8 +462,8 @@ static __init int _init_perf_amd_iommu(
 		amd_iommu_pc_exit();
 	} else {
 		pr_info("perf: amd_iommu: Detected. (%d banks, %d counters/bank)\n",
-			amd_iommu_pc_get_max_banks(IOMMU_BASE_DEVID),
-			amd_iommu_pc_get_max_counters(IOMMU_BASE_DEVID));
+			amd_iommu_pc_get_max_banks(),
+			amd_iommu_pc_get_max_counters());
 	}
 
 	return ret;
diff --git a/drivers/iommu/amd_iommu_init.c b/drivers/iommu/amd_iommu_init.c
index b6d684c..275c0f5 100644
--- a/drivers/iommu/amd_iommu_init.c
+++ b/drivers/iommu/amd_iommu_init.c
@@ -2251,15 +2251,17 @@ EXPORT_SYMBOL(amd_iommu_v2_supported);
  *
  ****************************************************************************/
 
-u8 amd_iommu_pc_get_max_banks(u16 devid)
+u8 amd_iommu_pc_get_max_banks(void)
 {
 	struct amd_iommu *iommu;
 	u8 ret = 0;
 
-	/* locate the iommu governing the devid */
-	iommu = amd_iommu_rlookup_table[devid];
-	if (iommu)
+	for_each_iommu(iommu) {
+		if (!iommu->max_banks ||
+		    (ret && (iommu->max_banks != ret)))
+			return 0;
 		ret = iommu->max_banks;
+	}
 
 	return ret;
 }
@@ -2271,15 +2273,17 @@ bool amd_iommu_pc_supported(void)
 }
 EXPORT_SYMBOL(amd_iommu_pc_supported);
 
-u8 amd_iommu_pc_get_max_counters(u16 devid)
+u8 amd_iommu_pc_get_max_counters(void)
 {
 	struct amd_iommu *iommu;
 	u8 ret = 0;
 
-	/* locate the iommu governing the devid */
-	iommu = amd_iommu_rlookup_table[devid];
-	if (iommu)
+	for_each_iommu(iommu) {
+		if (!iommu->max_counters ||
+		    (ret && (iommu->max_counters != ret)))
+			return 0;
 		ret = iommu->max_counters;
+	}
 
 	return ret;
 }
diff --git a/include/linux/perf/perf_event_amd_iommu.h b/include/linux/perf/perf_event_amd_iommu.h
index 845d173..815eabb 100644
--- a/include/linux/perf/perf_event_amd_iommu.h
+++ b/include/linux/perf/perf_event_amd_iommu.h
@@ -24,15 +24,12 @@
 #define PC_MAX_SPEC_BNKS			64
 #define PC_MAX_SPEC_CNTRS			16
 
-/* iommu pc reg masks*/
-#define IOMMU_BASE_DEVID			0x0000
-
 /* amd_iommu_init.c external support functions */
 extern bool amd_iommu_pc_supported(void);
 
-extern u8 amd_iommu_pc_get_max_banks(u16 devid);
+extern u8 amd_iommu_pc_get_max_banks(void);
 
-extern u8 amd_iommu_pc_get_max_counters(u16 devid);
+extern u8 amd_iommu_pc_get_max_counters(void);
 
 extern int amd_iommu_pc_get_set_reg_val(u16 devid, u8 bank, u8 cntr,
 			u8 fxn, u64 *value, bool is_write);
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 3/6] iommu/amd: Introduce amd_iommu_get_num_iommus()
@ 2015-12-22 19:19   ` Suravee Suthikulpanit
  0 siblings, 0 replies; 18+ messages in thread
From: Suravee Suthikulpanit @ 2015-12-22 19:19 UTC (permalink / raw)
  To: joro, bp, peterz, mingo, acme; +Cc: linux-kernel, iommu, Suravee Suthikulpanit

This patch introduces amd_iommu_get_num_iommus(). Initially, this is
intended to be used by Perf AMD IOMMU driver.

Signed-off-by: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
---
 drivers/iommu/amd_iommu_init.c            | 16 ++++++++++++++++
 include/linux/perf/perf_event_amd_iommu.h |  2 ++
 2 files changed, 18 insertions(+)

diff --git a/drivers/iommu/amd_iommu_init.c b/drivers/iommu/amd_iommu_init.c
index 275c0f5..9c62613 100644
--- a/drivers/iommu/amd_iommu_init.c
+++ b/drivers/iommu/amd_iommu_init.c
@@ -2244,6 +2244,22 @@ bool amd_iommu_v2_supported(void)
 }
 EXPORT_SYMBOL(amd_iommu_v2_supported);
 
+static int amd_iommu_cnt;
+
+int amd_iommu_get_num_iommus(void)
+{
+	struct amd_iommu *iommu;
+
+	if (amd_iommu_cnt)
+		return amd_iommu_cnt;
+
+	for_each_iommu(iommu)
+		amd_iommu_cnt++;
+
+	return amd_iommu_cnt;
+}
+EXPORT_SYMBOL(amd_iommu_get_num_iommus);
+
 /****************************************************************************
  *
  * IOMMU EFR Performance Counter support functionality. This code allows
diff --git a/include/linux/perf/perf_event_amd_iommu.h b/include/linux/perf/perf_event_amd_iommu.h
index 815eabb..cb820c2 100644
--- a/include/linux/perf/perf_event_amd_iommu.h
+++ b/include/linux/perf/perf_event_amd_iommu.h
@@ -25,6 +25,8 @@
 #define PC_MAX_SPEC_CNTRS			16
 
 /* amd_iommu_init.c external support functions */
+extern int amd_iommu_get_num_iommus(void);
+
 extern bool amd_iommu_pc_supported(void);
 
 extern u8 amd_iommu_pc_get_max_banks(void);
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 3/6] iommu/amd: Introduce amd_iommu_get_num_iommus()
@ 2015-12-22 19:19   ` Suravee Suthikulpanit
  0 siblings, 0 replies; 18+ messages in thread
From: Suravee Suthikulpanit @ 2015-12-22 19:19 UTC (permalink / raw)
  To: joro-zLv9SwRftAIdnm+yROfE0A, bp-Gina5bIWoIWzQB+pC5nmwQ,
	peterz-wEGCiKHe2LqWVfeAwA7xHQ, mingo-H+wXaHxf7aLQT0dZR+AlfA,
	acme-DgEjT+Ai2ygdnm+yROfE0A
  Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA

This patch introduces amd_iommu_get_num_iommus(). Initially, this is
intended to be used by Perf AMD IOMMU driver.

Signed-off-by: Suravee Suthikulpanit <Suravee.Suthikulpanit-5C7GfCeVMHo@public.gmane.org>
---
 drivers/iommu/amd_iommu_init.c            | 16 ++++++++++++++++
 include/linux/perf/perf_event_amd_iommu.h |  2 ++
 2 files changed, 18 insertions(+)

diff --git a/drivers/iommu/amd_iommu_init.c b/drivers/iommu/amd_iommu_init.c
index 275c0f5..9c62613 100644
--- a/drivers/iommu/amd_iommu_init.c
+++ b/drivers/iommu/amd_iommu_init.c
@@ -2244,6 +2244,22 @@ bool amd_iommu_v2_supported(void)
 }
 EXPORT_SYMBOL(amd_iommu_v2_supported);
 
+static int amd_iommu_cnt;
+
+int amd_iommu_get_num_iommus(void)
+{
+	struct amd_iommu *iommu;
+
+	if (amd_iommu_cnt)
+		return amd_iommu_cnt;
+
+	for_each_iommu(iommu)
+		amd_iommu_cnt++;
+
+	return amd_iommu_cnt;
+}
+EXPORT_SYMBOL(amd_iommu_get_num_iommus);
+
 /****************************************************************************
  *
  * IOMMU EFR Performance Counter support functionality. This code allows
diff --git a/include/linux/perf/perf_event_amd_iommu.h b/include/linux/perf/perf_event_amd_iommu.h
index 815eabb..cb820c2 100644
--- a/include/linux/perf/perf_event_amd_iommu.h
+++ b/include/linux/perf/perf_event_amd_iommu.h
@@ -25,6 +25,8 @@
 #define PC_MAX_SPEC_CNTRS			16
 
 /* amd_iommu_init.c external support functions */
+extern int amd_iommu_get_num_iommus(void);
+
 extern bool amd_iommu_pc_supported(void);
 
 extern u8 amd_iommu_pc_get_max_banks(void);
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 4/6] perf/amd/iommu: Introduce data structure for tracking prev count.
@ 2015-12-22 19:19   ` Suravee Suthikulpanit
  0 siblings, 0 replies; 18+ messages in thread
From: Suravee Suthikulpanit @ 2015-12-22 19:19 UTC (permalink / raw)
  To: joro, bp, peterz, mingo, acme; +Cc: linux-kernel, iommu, Suravee Suthikulpanit

To enable AMD IOMMU PMU to support multiple IOMMUs, this patch introduces
a new data structure, perf_amd_iommu.prev_cnts, to track previous counts
of IOMMU performance counters in multi-IOMMU environment.

Also, this patch allocates perf_iommu_cnts for internal use
when manages counters.

Signed-off-by: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
---
 arch/x86/kernel/cpu/perf_event_amd_iommu.c | 26 ++++++++++++++++++++++----
 1 file changed, 22 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event_amd_iommu.c b/arch/x86/kernel/cpu/perf_event_amd_iommu.c
index e6d2485..99fcd10 100644
--- a/arch/x86/kernel/cpu/perf_event_amd_iommu.c
+++ b/arch/x86/kernel/cpu/perf_event_amd_iommu.c
@@ -42,6 +42,7 @@ struct perf_amd_iommu {
 	u64 cntr_assign_mask;
 	raw_spinlock_t lock;
 	const struct attribute_group *attr_groups[4];
+	local64_t *prev_cnts;
 };
 
 #define format_group	attr_groups[0]
@@ -126,6 +127,8 @@ static struct amd_iommu_event_desc amd_iommu_v2_event_descs[] = {
 	{ /* end: all zeroes */ },
 };
 
+static u64 *perf_iommu_cnts;
+
 /*---------------------------------------------
  * sysfs cpumask attributes
  *---------------------------------------------*/
@@ -423,10 +426,14 @@ static __init int _init_events_attrs(struct perf_amd_iommu *perf_iommu)
 
 static __init void amd_iommu_pc_exit(void)
 {
-	if (__perf_iommu.events_group != NULL) {
-		kfree(__perf_iommu.events_group);
-		__perf_iommu.events_group = NULL;
-	}
+	kfree(__perf_iommu.events_group);
+	__perf_iommu.events_group = NULL;
+
+	kfree(__perf_iommu.prev_cnts);
+	__perf_iommu.prev_cnts = NULL;
+
+	kfree(perf_iommu_cnts);
+	perf_iommu_cnts = NULL;
 }
 
 static __init int _init_perf_amd_iommu(
@@ -456,6 +463,17 @@ static __init int _init_perf_amd_iommu(
 	perf_iommu->null_group = NULL;
 	perf_iommu->pmu.attr_groups = perf_iommu->attr_groups;
 
+	perf_iommu->prev_cnts = kzalloc(sizeof(*perf_iommu->prev_cnts) *
+		(amd_iommu_get_num_iommus() * perf_iommu->max_banks *
+		perf_iommu->max_counters), GFP_KERNEL);
+	if (!perf_iommu->prev_cnts)
+		return -ENOMEM;
+
+	perf_iommu_cnts = kzalloc(sizeof(*perf_iommu_cnts) *
+				  amd_iommu_get_num_iommus(), GFP_KERNEL);
+	if (!perf_iommu_cnts)
+		return -ENOMEM;
+
 	ret = perf_pmu_register(&perf_iommu->pmu, name, -1);
 	if (ret) {
 		pr_err("perf: amd_iommu: Failed to initialized.\n");
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 4/6] perf/amd/iommu: Introduce data structure for tracking prev count.
@ 2015-12-22 19:19   ` Suravee Suthikulpanit
  0 siblings, 0 replies; 18+ messages in thread
From: Suravee Suthikulpanit @ 2015-12-22 19:19 UTC (permalink / raw)
  To: joro-zLv9SwRftAIdnm+yROfE0A, bp-Gina5bIWoIWzQB+pC5nmwQ,
	peterz-wEGCiKHe2LqWVfeAwA7xHQ, mingo-H+wXaHxf7aLQT0dZR+AlfA,
	acme-DgEjT+Ai2ygdnm+yROfE0A
  Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA

To enable AMD IOMMU PMU to support multiple IOMMUs, this patch introduces
a new data structure, perf_amd_iommu.prev_cnts, to track previous counts
of IOMMU performance counters in multi-IOMMU environment.

Also, this patch allocates perf_iommu_cnts for internal use
when manages counters.

Signed-off-by: Suravee Suthikulpanit <Suravee.Suthikulpanit-5C7GfCeVMHo@public.gmane.org>
---
 arch/x86/kernel/cpu/perf_event_amd_iommu.c | 26 ++++++++++++++++++++++----
 1 file changed, 22 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event_amd_iommu.c b/arch/x86/kernel/cpu/perf_event_amd_iommu.c
index e6d2485..99fcd10 100644
--- a/arch/x86/kernel/cpu/perf_event_amd_iommu.c
+++ b/arch/x86/kernel/cpu/perf_event_amd_iommu.c
@@ -42,6 +42,7 @@ struct perf_amd_iommu {
 	u64 cntr_assign_mask;
 	raw_spinlock_t lock;
 	const struct attribute_group *attr_groups[4];
+	local64_t *prev_cnts;
 };
 
 #define format_group	attr_groups[0]
@@ -126,6 +127,8 @@ static struct amd_iommu_event_desc amd_iommu_v2_event_descs[] = {
 	{ /* end: all zeroes */ },
 };
 
+static u64 *perf_iommu_cnts;
+
 /*---------------------------------------------
  * sysfs cpumask attributes
  *---------------------------------------------*/
@@ -423,10 +426,14 @@ static __init int _init_events_attrs(struct perf_amd_iommu *perf_iommu)
 
 static __init void amd_iommu_pc_exit(void)
 {
-	if (__perf_iommu.events_group != NULL) {
-		kfree(__perf_iommu.events_group);
-		__perf_iommu.events_group = NULL;
-	}
+	kfree(__perf_iommu.events_group);
+	__perf_iommu.events_group = NULL;
+
+	kfree(__perf_iommu.prev_cnts);
+	__perf_iommu.prev_cnts = NULL;
+
+	kfree(perf_iommu_cnts);
+	perf_iommu_cnts = NULL;
 }
 
 static __init int _init_perf_amd_iommu(
@@ -456,6 +463,17 @@ static __init int _init_perf_amd_iommu(
 	perf_iommu->null_group = NULL;
 	perf_iommu->pmu.attr_groups = perf_iommu->attr_groups;
 
+	perf_iommu->prev_cnts = kzalloc(sizeof(*perf_iommu->prev_cnts) *
+		(amd_iommu_get_num_iommus() * perf_iommu->max_banks *
+		perf_iommu->max_counters), GFP_KERNEL);
+	if (!perf_iommu->prev_cnts)
+		return -ENOMEM;
+
+	perf_iommu_cnts = kzalloc(sizeof(*perf_iommu_cnts) *
+				  amd_iommu_get_num_iommus(), GFP_KERNEL);
+	if (!perf_iommu_cnts)
+		return -ENOMEM;
+
 	ret = perf_pmu_register(&perf_iommu->pmu, name, -1);
 	if (ret) {
 		pr_err("perf: amd_iommu: Failed to initialized.\n");
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 5/6] perf/amd/iommu: Introduce get_iommu_bnk_cnt_evt_idx
@ 2015-12-22 19:19   ` Suravee Suthikulpanit
  0 siblings, 0 replies; 18+ messages in thread
From: Suravee Suthikulpanit @ 2015-12-22 19:19 UTC (permalink / raw)
  To: joro, bp, peterz, mingo, acme; +Cc: linux-kernel, iommu, Suravee Suthikulpanit

Introduce a helper function to calculate bit-index for assigning
performance counter assignment.

Signed-off-by: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
---
 arch/x86/kernel/cpu/perf_event_amd_iommu.c | 20 +++++++++++++++-----
 1 file changed, 15 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event_amd_iommu.c b/arch/x86/kernel/cpu/perf_event_amd_iommu.c
index 99fcd10..8af7149 100644
--- a/arch/x86/kernel/cpu/perf_event_amd_iommu.c
+++ b/arch/x86/kernel/cpu/perf_event_amd_iommu.c
@@ -153,18 +153,28 @@ static struct attribute_group amd_iommu_cpumask_group = {
 
 /*---------------------------------------------*/
 
+static inline
+int get_iommu_bnk_cnt_evt_idx(struct perf_amd_iommu *perf_iommu,
+				  int iommu_index, int bank_index,
+				  int cntr_index)
+{
+	int cntrs_per_iommu = perf_iommu->max_banks * perf_iommu->max_counters;
+	int index = (perf_iommu->max_counters * bank_index) + cntr_index;
+
+	return (cntrs_per_iommu * iommu_index) + index;
+}
+
 static int get_next_avail_iommu_bnk_cntr(struct perf_amd_iommu *perf_iommu)
 {
 	unsigned long flags;
 	int shift, bank, cntr, retval;
-	int max_banks = perf_iommu->max_banks;
-	int max_cntrs = perf_iommu->max_counters;
 
 	raw_spin_lock_irqsave(&perf_iommu->lock, flags);
 
-	for (bank = 0, shift = 0; bank < max_banks; bank++) {
-		for (cntr = 0; cntr < max_cntrs; cntr++) {
-			shift = bank + (bank*3) + cntr;
+	for (bank = 0, shift = 0; bank < perf_iommu->max_banks; bank++) {
+		for (cntr = 0; cntr < perf_iommu->max_counters; cntr++) {
+			shift = get_iommu_bnk_cnt_evt_idx(perf_iommu,
+							      0, bank, cntr);
 			if (perf_iommu->cntr_assign_mask & (1ULL<<shift)) {
 				continue;
 			} else {
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 5/6] perf/amd/iommu: Introduce get_iommu_bnk_cnt_evt_idx
@ 2015-12-22 19:19   ` Suravee Suthikulpanit
  0 siblings, 0 replies; 18+ messages in thread
From: Suravee Suthikulpanit @ 2015-12-22 19:19 UTC (permalink / raw)
  To: joro-zLv9SwRftAIdnm+yROfE0A, bp-Gina5bIWoIWzQB+pC5nmwQ,
	peterz-wEGCiKHe2LqWVfeAwA7xHQ, mingo-H+wXaHxf7aLQT0dZR+AlfA,
	acme-DgEjT+Ai2ygdnm+yROfE0A
  Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA

Introduce a helper function to calculate bit-index for assigning
performance counter assignment.

Signed-off-by: Suravee Suthikulpanit <Suravee.Suthikulpanit-5C7GfCeVMHo@public.gmane.org>
---
 arch/x86/kernel/cpu/perf_event_amd_iommu.c | 20 +++++++++++++++-----
 1 file changed, 15 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event_amd_iommu.c b/arch/x86/kernel/cpu/perf_event_amd_iommu.c
index 99fcd10..8af7149 100644
--- a/arch/x86/kernel/cpu/perf_event_amd_iommu.c
+++ b/arch/x86/kernel/cpu/perf_event_amd_iommu.c
@@ -153,18 +153,28 @@ static struct attribute_group amd_iommu_cpumask_group = {
 
 /*---------------------------------------------*/
 
+static inline
+int get_iommu_bnk_cnt_evt_idx(struct perf_amd_iommu *perf_iommu,
+				  int iommu_index, int bank_index,
+				  int cntr_index)
+{
+	int cntrs_per_iommu = perf_iommu->max_banks * perf_iommu->max_counters;
+	int index = (perf_iommu->max_counters * bank_index) + cntr_index;
+
+	return (cntrs_per_iommu * iommu_index) + index;
+}
+
 static int get_next_avail_iommu_bnk_cntr(struct perf_amd_iommu *perf_iommu)
 {
 	unsigned long flags;
 	int shift, bank, cntr, retval;
-	int max_banks = perf_iommu->max_banks;
-	int max_cntrs = perf_iommu->max_counters;
 
 	raw_spin_lock_irqsave(&perf_iommu->lock, flags);
 
-	for (bank = 0, shift = 0; bank < max_banks; bank++) {
-		for (cntr = 0; cntr < max_cntrs; cntr++) {
-			shift = bank + (bank*3) + cntr;
+	for (bank = 0, shift = 0; bank < perf_iommu->max_banks; bank++) {
+		for (cntr = 0; cntr < perf_iommu->max_counters; cntr++) {
+			shift = get_iommu_bnk_cnt_evt_idx(perf_iommu,
+							      0, bank, cntr);
 			if (perf_iommu->cntr_assign_mask & (1ULL<<shift)) {
 				continue;
 			} else {
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 6/6] perf/amd/iommu: Enable support for multiple IOMMUs
@ 2015-12-22 19:19   ` Suravee Suthikulpanit
  0 siblings, 0 replies; 18+ messages in thread
From: Suravee Suthikulpanit @ 2015-12-22 19:19 UTC (permalink / raw)
  To: joro, bp, peterz, mingo, acme; +Cc: linux-kernel, iommu, Suravee Suthikulpanit

The current amd_iommu_pc_get_set_reg_val() does not support muli-IOMMU
system. This patch replace amd_iommu_pc_get_set_reg_val() with
amd_iommu_pc_set_reg_val() and amd_iommu_pc_[set|get]_cnt_vals().

This implementation makes an assumption that the counters on all IOMMUs
will be programmed the same way (i.e with the same events).

Signed-off-by: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
---
 arch/x86/kernel/cpu/perf_event_amd_iommu.c | 80 +++++++++++++++++----------
 drivers/iommu/amd_iommu_init.c             | 87 ++++++++++++++++++++++++++----
 include/linux/perf/perf_event_amd_iommu.h  |  8 ++-
 3 files changed, 136 insertions(+), 39 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event_amd_iommu.c b/arch/x86/kernel/cpu/perf_event_amd_iommu.c
index 8af7149..9c60eb3 100644
--- a/arch/x86/kernel/cpu/perf_event_amd_iommu.c
+++ b/arch/x86/kernel/cpu/perf_event_amd_iommu.c
@@ -264,44 +264,46 @@ static void perf_iommu_enable_event(struct perf_event *ev)
 	u64 reg = 0ULL;
 
 	reg = csource;
-	amd_iommu_pc_get_set_reg_val(devid,
+	amd_iommu_pc_set_reg_val(devid,
 			_GET_BANK(ev), _GET_CNTR(ev) ,
-			 IOMMU_PC_COUNTER_SRC_REG, &reg, true);
+			 IOMMU_PC_COUNTER_SRC_REG, &reg);
 
 	reg = 0ULL | devid | (_GET_DEVID_MASK(ev) << 32);
 	if (reg)
 		reg |= (1UL << 31);
-	amd_iommu_pc_get_set_reg_val(devid,
+	amd_iommu_pc_set_reg_val(devid,
 			_GET_BANK(ev), _GET_CNTR(ev) ,
-			 IOMMU_PC_DEVID_MATCH_REG, &reg, true);
+			 IOMMU_PC_DEVID_MATCH_REG, &reg);
 
 	reg = 0ULL | _GET_PASID(ev) | (_GET_PASID_MASK(ev) << 32);
 	if (reg)
 		reg |= (1UL << 31);
-	amd_iommu_pc_get_set_reg_val(devid,
+	amd_iommu_pc_set_reg_val(devid,
 			_GET_BANK(ev), _GET_CNTR(ev) ,
-			 IOMMU_PC_PASID_MATCH_REG, &reg, true);
+			 IOMMU_PC_PASID_MATCH_REG, &reg);
 
 	reg = 0ULL | _GET_DOMID(ev) | (_GET_DOMID_MASK(ev) << 32);
 	if (reg)
 		reg |= (1UL << 31);
-	amd_iommu_pc_get_set_reg_val(devid,
+	amd_iommu_pc_set_reg_val(devid,
 			_GET_BANK(ev), _GET_CNTR(ev) ,
-			 IOMMU_PC_DOMID_MATCH_REG, &reg, true);
+			 IOMMU_PC_DOMID_MATCH_REG, &reg);
 }
 
 static void perf_iommu_disable_event(struct perf_event *event)
 {
 	u64 reg = 0ULL;
 
-	amd_iommu_pc_get_set_reg_val(_GET_DEVID(event),
+	amd_iommu_pc_set_reg_val(_GET_DEVID(event),
 			_GET_BANK(event), _GET_CNTR(event),
-			IOMMU_PC_COUNTER_SRC_REG, &reg, true);
+			IOMMU_PC_COUNTER_SRC_REG, &reg);
 }
 
 static void perf_iommu_start(struct perf_event *event, int flags)
 {
 	struct hw_perf_event *hwc = &event->hw;
+	struct perf_amd_iommu *perf_iommu =
+			container_of(event->pmu, struct perf_amd_iommu, pmu);
 
 	pr_debug("perf: amd_iommu:perf_iommu_start\n");
 	if (WARN_ON_ONCE(!(hwc->state & PERF_HES_STOPPED)))
@@ -311,10 +313,19 @@ static void perf_iommu_start(struct perf_event *event, int flags)
 	hwc->state = 0;
 
 	if (flags & PERF_EF_RELOAD) {
-		u64 prev_raw_count =  local64_read(&hwc->prev_count);
-		amd_iommu_pc_get_set_reg_val(_GET_DEVID(event),
-				_GET_BANK(event), _GET_CNTR(event),
-				IOMMU_PC_COUNTER_REG, &prev_raw_count, true);
+		int i;
+
+		for (i = 0; i < amd_iommu_get_num_iommus(); i++) {
+			int index = get_iommu_bnk_cnt_evt_idx(perf_iommu, i,
+					  _GET_BANK(event), _GET_CNTR(event));
+
+			perf_iommu_cnts[i] = local64_read(
+						&perf_iommu->prev_cnts[index]);
+		}
+
+		amd_iommu_pc_set_cnt_vals(_GET_BANK(event), _GET_CNTR(event),
+					  amd_iommu_get_num_iommus(),
+					  perf_iommu_cnts);
 	}
 
 	perf_iommu_enable_event(event);
@@ -324,29 +335,42 @@ static void perf_iommu_start(struct perf_event *event, int flags)
 
 static void perf_iommu_read(struct perf_event *event)
 {
-	u64 count = 0ULL;
+	int i;
 	u64 prev_raw_count = 0ULL;
 	u64 delta = 0ULL;
 	struct hw_perf_event *hwc = &event->hw;
+	struct perf_amd_iommu *perf_iommu =
+			container_of(event->pmu, struct perf_amd_iommu, pmu);
+
 	pr_debug("perf: amd_iommu:perf_iommu_read\n");
 
-	amd_iommu_pc_get_set_reg_val(_GET_DEVID(event),
-				_GET_BANK(event), _GET_CNTR(event),
-				IOMMU_PC_COUNTER_REG, &count, false);
+	if (amd_iommu_pc_get_cnt_vals(_GET_BANK(event), _GET_CNTR(event),
+				      amd_iommu_get_num_iommus(),
+				      perf_iommu_cnts))
+		return;
+
+	local64_set(&hwc->prev_count, 0);
+	for (i = 0; i < amd_iommu_get_num_iommus(); i++) {
+		int index = get_iommu_bnk_cnt_evt_idx(perf_iommu, i,
+					  _GET_BANK(event), _GET_CNTR(event));
 
-	/* IOMMU pc counter register is only 48 bits */
-	count &= 0xFFFFFFFFFFFFULL;
+		/* IOMMU pc counter register is only 48 bits */
+		perf_iommu_cnts[i] &= 0xFFFFFFFFFFFFULL;
 
-	prev_raw_count =  local64_read(&hwc->prev_count);
-	if (local64_cmpxchg(&hwc->prev_count, prev_raw_count,
-					count) != prev_raw_count)
-		return;
+		prev_raw_count = local64_read(&perf_iommu->prev_cnts[index]);
+		if (prev_raw_count != local64_cmpxchg(
+					&perf_iommu->prev_cnts[index],
+					prev_raw_count, perf_iommu_cnts[i]))
+			return;
 
-	/* Handling 48-bit counter overflowing */
-	delta = (count << COUNTER_SHIFT) - (prev_raw_count << COUNTER_SHIFT);
-	delta >>= COUNTER_SHIFT;
-	local64_add(delta, &event->count);
+		local64_add(prev_raw_count, &hwc->prev_count);
 
+		/* Handling 48-bit counter overflowing */
+		delta = (perf_iommu_cnts[i] << COUNTER_SHIFT) -
+				(prev_raw_count << COUNTER_SHIFT);
+		delta >>= COUNTER_SHIFT;
+		local64_add(delta, &event->count);
+	}
 }
 
 static void perf_iommu_stop(struct perf_event *event, int flags)
diff --git a/drivers/iommu/amd_iommu_init.c b/drivers/iommu/amd_iommu_init.c
index 9c62613..86b09ec 100644
--- a/drivers/iommu/amd_iommu_init.c
+++ b/drivers/iommu/amd_iommu_init.c
@@ -1133,6 +1133,9 @@ static int __init init_iommu_all(struct acpi_table_header *table)
 	return 0;
 }
 
+static int _amd_iommu_pc_get_set_reg_val(struct amd_iommu *iommu,
+					 u8 bank, u8 cntr, u8 fxn,
+					 u64 *value, bool is_write);
 
 static void init_iommu_perf_ctr(struct amd_iommu *iommu)
 {
@@ -1144,8 +1147,8 @@ static void init_iommu_perf_ctr(struct amd_iommu *iommu)
 	amd_iommu_pc_present = true;
 
 	/* Check if the performance counters can be written to */
-	if ((0 != amd_iommu_pc_get_set_reg_val(0, 0, 0, 0, &val, true)) ||
-	    (0 != amd_iommu_pc_get_set_reg_val(0, 0, 0, 0, &val2, false)) ||
+	if ((_amd_iommu_pc_get_set_reg_val(iommu, 0, 0, 0, &val, true)) ||
+	    (_amd_iommu_pc_get_set_reg_val(iommu, 0, 0, 0, &val2, false)) ||
 	    (val != val2)) {
 		pr_err("AMD-Vi: Unable to write to IOMMU perf counter.\n");
 		amd_iommu_pc_present = false;
@@ -2305,10 +2308,10 @@ u8 amd_iommu_pc_get_max_counters(void)
 }
 EXPORT_SYMBOL(amd_iommu_pc_get_max_counters);
 
-int amd_iommu_pc_get_set_reg_val(u16 devid, u8 bank, u8 cntr, u8 fxn,
-				    u64 *value, bool is_write)
+static int _amd_iommu_pc_get_set_reg_val(struct amd_iommu *iommu,
+					 u8 bank, u8 cntr, u8 fxn,
+					 u64 *value, bool is_write)
 {
-	struct amd_iommu *iommu;
 	u32 offset;
 	u32 max_offset_lim;
 
@@ -2316,9 +2319,6 @@ int amd_iommu_pc_get_set_reg_val(u16 devid, u8 bank, u8 cntr, u8 fxn,
 	if (!amd_iommu_pc_present)
 		return -ENODEV;
 
-	/* Locate the iommu associated with the device ID */
-	iommu = amd_iommu_rlookup_table[devid];
-
 	/* Check for valid iommu and pc register indexing */
 	if (WARN_ON((iommu == NULL) || (fxn > 0x28) || (fxn & 7)))
 		return -ENODEV;
@@ -2343,4 +2343,73 @@ int amd_iommu_pc_get_set_reg_val(u16 devid, u8 bank, u8 cntr, u8 fxn,
 
 	return 0;
 }
-EXPORT_SYMBOL(amd_iommu_pc_get_set_reg_val);
+
+int amd_iommu_pc_set_reg_val(u16 devid, u8 bank, u8 cntr, u8 fxn, u64 *value)
+{
+	struct amd_iommu *iommu;
+
+	for_each_iommu(iommu) {
+		int ret = _amd_iommu_pc_get_set_reg_val(iommu, bank, cntr,
+							fxn, value, true);
+		if (ret)
+			return ret;
+	}
+
+	return 0;
+}
+EXPORT_SYMBOL(amd_iommu_pc_set_reg_val);
+
+int amd_iommu_pc_set_cnt_vals(u8 bank, u8 cntr, int num, u64 *value)
+{
+	struct amd_iommu *iommu;
+	int i = 0;
+
+	if (num > amd_iommu_cnt)
+		return -EINVAL;
+
+	for_each_iommu(iommu) {
+		int ret = _amd_iommu_pc_get_set_reg_val(iommu, bank, cntr,
+							IOMMU_PC_COUNTER_REG,
+							&value[i], true);
+		if (ret)
+			return ret;
+		if (i++ == amd_iommu_cnt)
+			break;
+	}
+
+	return 0;
+}
+EXPORT_SYMBOL(amd_iommu_pc_set_cnt_vals);
+
+int amd_iommu_pc_get_cnt_vals(u8 bank, u8 cntr, int num, u64 *value)
+{
+	struct amd_iommu *iommu;
+	int i = 0, ret;
+
+	if (!num)
+		return -EINVAL;
+
+	/*
+	 * Here, we read the specified counters on all IOMMU,
+	 * which should have been programmed the same way.
+	 * and aggregate the counter values.
+	 */
+	for_each_iommu(iommu) {
+		u64 tmp;
+
+		if (i >= num)
+			return -EINVAL;
+
+		ret = _amd_iommu_pc_get_set_reg_val(iommu, bank, cntr,
+							IOMMU_PC_COUNTER_REG,
+							&tmp, false);
+		if (ret)
+			return ret;
+
+		/* IOMMU pc counter register is only 48 bits */
+		value[i] = tmp & 0xFFFFFFFFFFFFULL;
+	}
+
+	return 0;
+}
+EXPORT_SYMBOL(amd_iommu_pc_get_cnt_vals);
diff --git a/include/linux/perf/perf_event_amd_iommu.h b/include/linux/perf/perf_event_amd_iommu.h
index cb820c2..be1a17d 100644
--- a/include/linux/perf/perf_event_amd_iommu.h
+++ b/include/linux/perf/perf_event_amd_iommu.h
@@ -33,7 +33,11 @@ extern u8 amd_iommu_pc_get_max_banks(void);
 
 extern u8 amd_iommu_pc_get_max_counters(void);
 
-extern int amd_iommu_pc_get_set_reg_val(u16 devid, u8 bank, u8 cntr,
-			u8 fxn, u64 *value, bool is_write);
+extern int amd_iommu_pc_set_reg_val(u16 devid, u8 bank, u8 cntr, u8 fxn,
+				    u64 *value);
+
+extern int amd_iommu_pc_set_cnt_vals(u8 bank, u8 cntr, int num, u64 *value);
+
+extern int amd_iommu_pc_get_cnt_vals(u8 bank, u8 cntr, int num, u64 *value);
 
 #endif /*_PERF_EVENT_AMD_IOMMU_H_*/
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 6/6] perf/amd/iommu: Enable support for multiple IOMMUs
@ 2015-12-22 19:19   ` Suravee Suthikulpanit
  0 siblings, 0 replies; 18+ messages in thread
From: Suravee Suthikulpanit @ 2015-12-22 19:19 UTC (permalink / raw)
  To: joro-zLv9SwRftAIdnm+yROfE0A, bp-Gina5bIWoIWzQB+pC5nmwQ,
	peterz-wEGCiKHe2LqWVfeAwA7xHQ, mingo-H+wXaHxf7aLQT0dZR+AlfA,
	acme-DgEjT+Ai2ygdnm+yROfE0A
  Cc: iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA

The current amd_iommu_pc_get_set_reg_val() does not support muli-IOMMU
system. This patch replace amd_iommu_pc_get_set_reg_val() with
amd_iommu_pc_set_reg_val() and amd_iommu_pc_[set|get]_cnt_vals().

This implementation makes an assumption that the counters on all IOMMUs
will be programmed the same way (i.e with the same events).

Signed-off-by: Suravee Suthikulpanit <Suravee.Suthikulpanit-5C7GfCeVMHo@public.gmane.org>
---
 arch/x86/kernel/cpu/perf_event_amd_iommu.c | 80 +++++++++++++++++----------
 drivers/iommu/amd_iommu_init.c             | 87 ++++++++++++++++++++++++++----
 include/linux/perf/perf_event_amd_iommu.h  |  8 ++-
 3 files changed, 136 insertions(+), 39 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event_amd_iommu.c b/arch/x86/kernel/cpu/perf_event_amd_iommu.c
index 8af7149..9c60eb3 100644
--- a/arch/x86/kernel/cpu/perf_event_amd_iommu.c
+++ b/arch/x86/kernel/cpu/perf_event_amd_iommu.c
@@ -264,44 +264,46 @@ static void perf_iommu_enable_event(struct perf_event *ev)
 	u64 reg = 0ULL;
 
 	reg = csource;
-	amd_iommu_pc_get_set_reg_val(devid,
+	amd_iommu_pc_set_reg_val(devid,
 			_GET_BANK(ev), _GET_CNTR(ev) ,
-			 IOMMU_PC_COUNTER_SRC_REG, &reg, true);
+			 IOMMU_PC_COUNTER_SRC_REG, &reg);
 
 	reg = 0ULL | devid | (_GET_DEVID_MASK(ev) << 32);
 	if (reg)
 		reg |= (1UL << 31);
-	amd_iommu_pc_get_set_reg_val(devid,
+	amd_iommu_pc_set_reg_val(devid,
 			_GET_BANK(ev), _GET_CNTR(ev) ,
-			 IOMMU_PC_DEVID_MATCH_REG, &reg, true);
+			 IOMMU_PC_DEVID_MATCH_REG, &reg);
 
 	reg = 0ULL | _GET_PASID(ev) | (_GET_PASID_MASK(ev) << 32);
 	if (reg)
 		reg |= (1UL << 31);
-	amd_iommu_pc_get_set_reg_val(devid,
+	amd_iommu_pc_set_reg_val(devid,
 			_GET_BANK(ev), _GET_CNTR(ev) ,
-			 IOMMU_PC_PASID_MATCH_REG, &reg, true);
+			 IOMMU_PC_PASID_MATCH_REG, &reg);
 
 	reg = 0ULL | _GET_DOMID(ev) | (_GET_DOMID_MASK(ev) << 32);
 	if (reg)
 		reg |= (1UL << 31);
-	amd_iommu_pc_get_set_reg_val(devid,
+	amd_iommu_pc_set_reg_val(devid,
 			_GET_BANK(ev), _GET_CNTR(ev) ,
-			 IOMMU_PC_DOMID_MATCH_REG, &reg, true);
+			 IOMMU_PC_DOMID_MATCH_REG, &reg);
 }
 
 static void perf_iommu_disable_event(struct perf_event *event)
 {
 	u64 reg = 0ULL;
 
-	amd_iommu_pc_get_set_reg_val(_GET_DEVID(event),
+	amd_iommu_pc_set_reg_val(_GET_DEVID(event),
 			_GET_BANK(event), _GET_CNTR(event),
-			IOMMU_PC_COUNTER_SRC_REG, &reg, true);
+			IOMMU_PC_COUNTER_SRC_REG, &reg);
 }
 
 static void perf_iommu_start(struct perf_event *event, int flags)
 {
 	struct hw_perf_event *hwc = &event->hw;
+	struct perf_amd_iommu *perf_iommu =
+			container_of(event->pmu, struct perf_amd_iommu, pmu);
 
 	pr_debug("perf: amd_iommu:perf_iommu_start\n");
 	if (WARN_ON_ONCE(!(hwc->state & PERF_HES_STOPPED)))
@@ -311,10 +313,19 @@ static void perf_iommu_start(struct perf_event *event, int flags)
 	hwc->state = 0;
 
 	if (flags & PERF_EF_RELOAD) {
-		u64 prev_raw_count =  local64_read(&hwc->prev_count);
-		amd_iommu_pc_get_set_reg_val(_GET_DEVID(event),
-				_GET_BANK(event), _GET_CNTR(event),
-				IOMMU_PC_COUNTER_REG, &prev_raw_count, true);
+		int i;
+
+		for (i = 0; i < amd_iommu_get_num_iommus(); i++) {
+			int index = get_iommu_bnk_cnt_evt_idx(perf_iommu, i,
+					  _GET_BANK(event), _GET_CNTR(event));
+
+			perf_iommu_cnts[i] = local64_read(
+						&perf_iommu->prev_cnts[index]);
+		}
+
+		amd_iommu_pc_set_cnt_vals(_GET_BANK(event), _GET_CNTR(event),
+					  amd_iommu_get_num_iommus(),
+					  perf_iommu_cnts);
 	}
 
 	perf_iommu_enable_event(event);
@@ -324,29 +335,42 @@ static void perf_iommu_start(struct perf_event *event, int flags)
 
 static void perf_iommu_read(struct perf_event *event)
 {
-	u64 count = 0ULL;
+	int i;
 	u64 prev_raw_count = 0ULL;
 	u64 delta = 0ULL;
 	struct hw_perf_event *hwc = &event->hw;
+	struct perf_amd_iommu *perf_iommu =
+			container_of(event->pmu, struct perf_amd_iommu, pmu);
+
 	pr_debug("perf: amd_iommu:perf_iommu_read\n");
 
-	amd_iommu_pc_get_set_reg_val(_GET_DEVID(event),
-				_GET_BANK(event), _GET_CNTR(event),
-				IOMMU_PC_COUNTER_REG, &count, false);
+	if (amd_iommu_pc_get_cnt_vals(_GET_BANK(event), _GET_CNTR(event),
+				      amd_iommu_get_num_iommus(),
+				      perf_iommu_cnts))
+		return;
+
+	local64_set(&hwc->prev_count, 0);
+	for (i = 0; i < amd_iommu_get_num_iommus(); i++) {
+		int index = get_iommu_bnk_cnt_evt_idx(perf_iommu, i,
+					  _GET_BANK(event), _GET_CNTR(event));
 
-	/* IOMMU pc counter register is only 48 bits */
-	count &= 0xFFFFFFFFFFFFULL;
+		/* IOMMU pc counter register is only 48 bits */
+		perf_iommu_cnts[i] &= 0xFFFFFFFFFFFFULL;
 
-	prev_raw_count =  local64_read(&hwc->prev_count);
-	if (local64_cmpxchg(&hwc->prev_count, prev_raw_count,
-					count) != prev_raw_count)
-		return;
+		prev_raw_count = local64_read(&perf_iommu->prev_cnts[index]);
+		if (prev_raw_count != local64_cmpxchg(
+					&perf_iommu->prev_cnts[index],
+					prev_raw_count, perf_iommu_cnts[i]))
+			return;
 
-	/* Handling 48-bit counter overflowing */
-	delta = (count << COUNTER_SHIFT) - (prev_raw_count << COUNTER_SHIFT);
-	delta >>= COUNTER_SHIFT;
-	local64_add(delta, &event->count);
+		local64_add(prev_raw_count, &hwc->prev_count);
 
+		/* Handling 48-bit counter overflowing */
+		delta = (perf_iommu_cnts[i] << COUNTER_SHIFT) -
+				(prev_raw_count << COUNTER_SHIFT);
+		delta >>= COUNTER_SHIFT;
+		local64_add(delta, &event->count);
+	}
 }
 
 static void perf_iommu_stop(struct perf_event *event, int flags)
diff --git a/drivers/iommu/amd_iommu_init.c b/drivers/iommu/amd_iommu_init.c
index 9c62613..86b09ec 100644
--- a/drivers/iommu/amd_iommu_init.c
+++ b/drivers/iommu/amd_iommu_init.c
@@ -1133,6 +1133,9 @@ static int __init init_iommu_all(struct acpi_table_header *table)
 	return 0;
 }
 
+static int _amd_iommu_pc_get_set_reg_val(struct amd_iommu *iommu,
+					 u8 bank, u8 cntr, u8 fxn,
+					 u64 *value, bool is_write);
 
 static void init_iommu_perf_ctr(struct amd_iommu *iommu)
 {
@@ -1144,8 +1147,8 @@ static void init_iommu_perf_ctr(struct amd_iommu *iommu)
 	amd_iommu_pc_present = true;
 
 	/* Check if the performance counters can be written to */
-	if ((0 != amd_iommu_pc_get_set_reg_val(0, 0, 0, 0, &val, true)) ||
-	    (0 != amd_iommu_pc_get_set_reg_val(0, 0, 0, 0, &val2, false)) ||
+	if ((_amd_iommu_pc_get_set_reg_val(iommu, 0, 0, 0, &val, true)) ||
+	    (_amd_iommu_pc_get_set_reg_val(iommu, 0, 0, 0, &val2, false)) ||
 	    (val != val2)) {
 		pr_err("AMD-Vi: Unable to write to IOMMU perf counter.\n");
 		amd_iommu_pc_present = false;
@@ -2305,10 +2308,10 @@ u8 amd_iommu_pc_get_max_counters(void)
 }
 EXPORT_SYMBOL(amd_iommu_pc_get_max_counters);
 
-int amd_iommu_pc_get_set_reg_val(u16 devid, u8 bank, u8 cntr, u8 fxn,
-				    u64 *value, bool is_write)
+static int _amd_iommu_pc_get_set_reg_val(struct amd_iommu *iommu,
+					 u8 bank, u8 cntr, u8 fxn,
+					 u64 *value, bool is_write)
 {
-	struct amd_iommu *iommu;
 	u32 offset;
 	u32 max_offset_lim;
 
@@ -2316,9 +2319,6 @@ int amd_iommu_pc_get_set_reg_val(u16 devid, u8 bank, u8 cntr, u8 fxn,
 	if (!amd_iommu_pc_present)
 		return -ENODEV;
 
-	/* Locate the iommu associated with the device ID */
-	iommu = amd_iommu_rlookup_table[devid];
-
 	/* Check for valid iommu and pc register indexing */
 	if (WARN_ON((iommu == NULL) || (fxn > 0x28) || (fxn & 7)))
 		return -ENODEV;
@@ -2343,4 +2343,73 @@ int amd_iommu_pc_get_set_reg_val(u16 devid, u8 bank, u8 cntr, u8 fxn,
 
 	return 0;
 }
-EXPORT_SYMBOL(amd_iommu_pc_get_set_reg_val);
+
+int amd_iommu_pc_set_reg_val(u16 devid, u8 bank, u8 cntr, u8 fxn, u64 *value)
+{
+	struct amd_iommu *iommu;
+
+	for_each_iommu(iommu) {
+		int ret = _amd_iommu_pc_get_set_reg_val(iommu, bank, cntr,
+							fxn, value, true);
+		if (ret)
+			return ret;
+	}
+
+	return 0;
+}
+EXPORT_SYMBOL(amd_iommu_pc_set_reg_val);
+
+int amd_iommu_pc_set_cnt_vals(u8 bank, u8 cntr, int num, u64 *value)
+{
+	struct amd_iommu *iommu;
+	int i = 0;
+
+	if (num > amd_iommu_cnt)
+		return -EINVAL;
+
+	for_each_iommu(iommu) {
+		int ret = _amd_iommu_pc_get_set_reg_val(iommu, bank, cntr,
+							IOMMU_PC_COUNTER_REG,
+							&value[i], true);
+		if (ret)
+			return ret;
+		if (i++ == amd_iommu_cnt)
+			break;
+	}
+
+	return 0;
+}
+EXPORT_SYMBOL(amd_iommu_pc_set_cnt_vals);
+
+int amd_iommu_pc_get_cnt_vals(u8 bank, u8 cntr, int num, u64 *value)
+{
+	struct amd_iommu *iommu;
+	int i = 0, ret;
+
+	if (!num)
+		return -EINVAL;
+
+	/*
+	 * Here, we read the specified counters on all IOMMU,
+	 * which should have been programmed the same way.
+	 * and aggregate the counter values.
+	 */
+	for_each_iommu(iommu) {
+		u64 tmp;
+
+		if (i >= num)
+			return -EINVAL;
+
+		ret = _amd_iommu_pc_get_set_reg_val(iommu, bank, cntr,
+							IOMMU_PC_COUNTER_REG,
+							&tmp, false);
+		if (ret)
+			return ret;
+
+		/* IOMMU pc counter register is only 48 bits */
+		value[i] = tmp & 0xFFFFFFFFFFFFULL;
+	}
+
+	return 0;
+}
+EXPORT_SYMBOL(amd_iommu_pc_get_cnt_vals);
diff --git a/include/linux/perf/perf_event_amd_iommu.h b/include/linux/perf/perf_event_amd_iommu.h
index cb820c2..be1a17d 100644
--- a/include/linux/perf/perf_event_amd_iommu.h
+++ b/include/linux/perf/perf_event_amd_iommu.h
@@ -33,7 +33,11 @@ extern u8 amd_iommu_pc_get_max_banks(void);
 
 extern u8 amd_iommu_pc_get_max_counters(void);
 
-extern int amd_iommu_pc_get_set_reg_val(u16 devid, u8 bank, u8 cntr,
-			u8 fxn, u64 *value, bool is_write);
+extern int amd_iommu_pc_set_reg_val(u16 devid, u8 bank, u8 cntr, u8 fxn,
+				    u64 *value);
+
+extern int amd_iommu_pc_set_cnt_vals(u8 bank, u8 cntr, int num, u64 *value);
+
+extern int amd_iommu_pc_get_cnt_vals(u8 bank, u8 cntr, int num, u64 *value);
 
 #endif /*_PERF_EVENT_AMD_IOMMU_H_*/
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH 3/6] iommu/amd: Introduce amd_iommu_get_num_iommus()
@ 2015-12-28 15:43     ` Joerg Roedel
  0 siblings, 0 replies; 18+ messages in thread
From: Joerg Roedel @ 2015-12-28 15:43 UTC (permalink / raw)
  To: Suravee Suthikulpanit; +Cc: bp, peterz, mingo, acme, linux-kernel, iommu

On Tue, Dec 22, 2015 at 01:19:14PM -0600, Suthikulpanit, Suravee wrote:
> This patch introduces amd_iommu_get_num_iommus(). Initially, this is
> intended to be used by Perf AMD IOMMU driver.
> 
> Signed-off-by: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
> ---
>  drivers/iommu/amd_iommu_init.c            | 16 ++++++++++++++++
>  include/linux/perf/perf_event_amd_iommu.h |  2 ++
>  2 files changed, 18 insertions(+)
> 
> diff --git a/drivers/iommu/amd_iommu_init.c b/drivers/iommu/amd_iommu_init.c
> index 275c0f5..9c62613 100644
> --- a/drivers/iommu/amd_iommu_init.c
> +++ b/drivers/iommu/amd_iommu_init.c
> @@ -2244,6 +2244,22 @@ bool amd_iommu_v2_supported(void)
>  }
>  EXPORT_SYMBOL(amd_iommu_v2_supported);
>  
> +static int amd_iommu_cnt;
> +
> +int amd_iommu_get_num_iommus(void)
> +{
> +	struct amd_iommu *iommu;
> +
> +	if (amd_iommu_cnt)
> +		return amd_iommu_cnt;
> +
> +	for_each_iommu(iommu)
> +		amd_iommu_cnt++;

It is better to set amd_iommu_cnt during IOMMU initialization. You can
just increment this value after an IOMMU has been set up.



	Joerg


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 3/6] iommu/amd: Introduce amd_iommu_get_num_iommus()
@ 2015-12-28 15:43     ` Joerg Roedel
  0 siblings, 0 replies; 18+ messages in thread
From: Joerg Roedel @ 2015-12-28 15:43 UTC (permalink / raw)
  To: Suravee Suthikulpanit
  Cc: peterz-wEGCiKHe2LqWVfeAwA7xHQ,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, acme-DgEjT+Ai2ygdnm+yROfE0A,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	mingo-H+wXaHxf7aLQT0dZR+AlfA, bp-Gina5bIWoIWzQB+pC5nmwQ

On Tue, Dec 22, 2015 at 01:19:14PM -0600, Suthikulpanit, Suravee wrote:
> This patch introduces amd_iommu_get_num_iommus(). Initially, this is
> intended to be used by Perf AMD IOMMU driver.
> 
> Signed-off-by: Suravee Suthikulpanit <Suravee.Suthikulpanit-5C7GfCeVMHo@public.gmane.org>
> ---
>  drivers/iommu/amd_iommu_init.c            | 16 ++++++++++++++++
>  include/linux/perf/perf_event_amd_iommu.h |  2 ++
>  2 files changed, 18 insertions(+)
> 
> diff --git a/drivers/iommu/amd_iommu_init.c b/drivers/iommu/amd_iommu_init.c
> index 275c0f5..9c62613 100644
> --- a/drivers/iommu/amd_iommu_init.c
> +++ b/drivers/iommu/amd_iommu_init.c
> @@ -2244,6 +2244,22 @@ bool amd_iommu_v2_supported(void)
>  }
>  EXPORT_SYMBOL(amd_iommu_v2_supported);
>  
> +static int amd_iommu_cnt;
> +
> +int amd_iommu_get_num_iommus(void)
> +{
> +	struct amd_iommu *iommu;
> +
> +	if (amd_iommu_cnt)
> +		return amd_iommu_cnt;
> +
> +	for_each_iommu(iommu)
> +		amd_iommu_cnt++;

It is better to set amd_iommu_cnt during IOMMU initialization. You can
just increment this value after an IOMMU has been set up.



	Joerg

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 3/6] iommu/amd: Introduce amd_iommu_get_num_iommus()
@ 2015-12-29 21:20       ` Suravee Suthikulpanit
  0 siblings, 0 replies; 18+ messages in thread
From: Suravee Suthikulpanit @ 2015-12-29 21:20 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: bp, peterz, mingo, acme, linux-kernel, iommu

Hi Jorge,

On 12/28/15 09:43, Joerg Roedel wrote:
> On Tue, Dec 22, 2015 at 01:19:14PM -0600, Suthikulpanit, Suravee wrote:
>> This patch introduces amd_iommu_get_num_iommus(). Initially, this is
>> intended to be used by Perf AMD IOMMU driver.
>>
>> Signed-off-by: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
>> ---
>>   drivers/iommu/amd_iommu_init.c            | 16 ++++++++++++++++
>>   include/linux/perf/perf_event_amd_iommu.h |  2 ++
>>   2 files changed, 18 insertions(+)
>>
>> diff --git a/drivers/iommu/amd_iommu_init.c b/drivers/iommu/amd_iommu_init.c
>> index 275c0f5..9c62613 100644
>> --- a/drivers/iommu/amd_iommu_init.c
>> +++ b/drivers/iommu/amd_iommu_init.c
>> @@ -2244,6 +2244,22 @@ bool amd_iommu_v2_supported(void)
>>   }
>>   EXPORT_SYMBOL(amd_iommu_v2_supported);
>>
>> +static int amd_iommu_cnt;
>> +
>> +int amd_iommu_get_num_iommus(void)
>> +{
>> +	struct amd_iommu *iommu;
>> +
>> +	if (amd_iommu_cnt)
>> +		return amd_iommu_cnt;
>> +
>> +	for_each_iommu(iommu)
>> +		amd_iommu_cnt++;
>
> It is better to set amd_iommu_cnt during IOMMU initialization. You can
> just increment this value after an IOMMU has been set up.
>
>
>
> 	Joerg
>

Sure. I'll take care of this in V2.

Thanks,
Suravee

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 3/6] iommu/amd: Introduce amd_iommu_get_num_iommus()
@ 2015-12-29 21:20       ` Suravee Suthikulpanit
  0 siblings, 0 replies; 18+ messages in thread
From: Suravee Suthikulpanit @ 2015-12-29 21:20 UTC (permalink / raw)
  To: Joerg Roedel
  Cc: peterz-wEGCiKHe2LqWVfeAwA7xHQ,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, acme-DgEjT+Ai2ygdnm+yROfE0A,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	mingo-H+wXaHxf7aLQT0dZR+AlfA, bp-Gina5bIWoIWzQB+pC5nmwQ

Hi Jorge,

On 12/28/15 09:43, Joerg Roedel wrote:
> On Tue, Dec 22, 2015 at 01:19:14PM -0600, Suthikulpanit, Suravee wrote:
>> This patch introduces amd_iommu_get_num_iommus(). Initially, this is
>> intended to be used by Perf AMD IOMMU driver.
>>
>> Signed-off-by: Suravee Suthikulpanit <Suravee.Suthikulpanit-5C7GfCeVMHo@public.gmane.org>
>> ---
>>   drivers/iommu/amd_iommu_init.c            | 16 ++++++++++++++++
>>   include/linux/perf/perf_event_amd_iommu.h |  2 ++
>>   2 files changed, 18 insertions(+)
>>
>> diff --git a/drivers/iommu/amd_iommu_init.c b/drivers/iommu/amd_iommu_init.c
>> index 275c0f5..9c62613 100644
>> --- a/drivers/iommu/amd_iommu_init.c
>> +++ b/drivers/iommu/amd_iommu_init.c
>> @@ -2244,6 +2244,22 @@ bool amd_iommu_v2_supported(void)
>>   }
>>   EXPORT_SYMBOL(amd_iommu_v2_supported);
>>
>> +static int amd_iommu_cnt;
>> +
>> +int amd_iommu_get_num_iommus(void)
>> +{
>> +	struct amd_iommu *iommu;
>> +
>> +	if (amd_iommu_cnt)
>> +		return amd_iommu_cnt;
>> +
>> +	for_each_iommu(iommu)
>> +		amd_iommu_cnt++;
>
> It is better to set amd_iommu_cnt during IOMMU initialization. You can
> just increment this value after an IOMMU has been set up.
>
>
>
> 	Joerg
>

Sure. I'll take care of this in V2.

Thanks,
Suravee

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2015-12-29 21:20 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-12-22 19:19 [PATCH 0/6] perf/amd/iommu: Enable multi-IOMMU support Suravee Suthikulpanit
2015-12-22 19:19 ` Suravee Suthikulpanit
2015-12-22 19:19 ` [PATCH 1/6] perf/amd/iommu: Consolidate and move perf_event_amd_iommu header Suravee Suthikulpanit
2015-12-22 19:19   ` Suravee Suthikulpanit
2015-12-22 19:19 ` [PATCH 2/6] perf/amd/iommu: Modify functions to query max banks and counters Suravee Suthikulpanit
2015-12-22 19:19   ` Suravee Suthikulpanit
2015-12-22 19:19 ` [PATCH 3/6] iommu/amd: Introduce amd_iommu_get_num_iommus() Suravee Suthikulpanit
2015-12-22 19:19   ` Suravee Suthikulpanit
2015-12-28 15:43   ` Joerg Roedel
2015-12-28 15:43     ` Joerg Roedel
2015-12-29 21:20     ` Suravee Suthikulpanit
2015-12-29 21:20       ` Suravee Suthikulpanit
2015-12-22 19:19 ` [PATCH 4/6] perf/amd/iommu: Introduce data structure for tracking prev count Suravee Suthikulpanit
2015-12-22 19:19   ` Suravee Suthikulpanit
2015-12-22 19:19 ` [PATCH 5/6] perf/amd/iommu: Introduce get_iommu_bnk_cnt_evt_idx Suravee Suthikulpanit
2015-12-22 19:19   ` Suravee Suthikulpanit
2015-12-22 19:19 ` [PATCH 6/6] perf/amd/iommu: Enable support for multiple IOMMUs Suravee Suthikulpanit
2015-12-22 19:19   ` Suravee Suthikulpanit

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.