All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/2] iommu/amd: Revert and remove failing PMC test
@ 2021-04-09  8:58 ` Suravee Suthikulpanit
  0 siblings, 0 replies; 46+ messages in thread
From: Suravee Suthikulpanit @ 2021-04-09  8:58 UTC (permalink / raw)
  To: linux-kernel, iommu
  Cc: joro, will, jsnitsel, pmenzel, Jon.Grimm, Suravee Suthikulpanit

This has prevented PMC to work on more recent desktop/mobile platforms,
where the PMC power-gating is normally enabled. After consulting
with HW designers and IOMMU maintainer, we have decide to remove
the legacy test altogether to avoid future PMC enabling issues.

Thanks the community for helping to test, investigate, provide data
and report issues on several platforms in the field.

Regards,
Suravee 

Paul Menzel (1):
  Revert "iommu/amd: Fix performance counter initialization"

Suravee Suthikulpanit (1):
  iommu/amd: Remove performance counter pre-initialization test

 drivers/iommu/amd/init.c | 49 ++--------------------------------------
 1 file changed, 2 insertions(+), 47 deletions(-)

-- 
2.17.1


^ permalink raw reply	[flat|nested] 46+ messages in thread

* [PATCH 0/2] iommu/amd: Revert and remove failing PMC test
@ 2021-04-09  8:58 ` Suravee Suthikulpanit
  0 siblings, 0 replies; 46+ messages in thread
From: Suravee Suthikulpanit @ 2021-04-09  8:58 UTC (permalink / raw)
  To: linux-kernel, iommu; +Cc: pmenzel, Jon.Grimm, will

This has prevented PMC to work on more recent desktop/mobile platforms,
where the PMC power-gating is normally enabled. After consulting
with HW designers and IOMMU maintainer, we have decide to remove
the legacy test altogether to avoid future PMC enabling issues.

Thanks the community for helping to test, investigate, provide data
and report issues on several platforms in the field.

Regards,
Suravee 

Paul Menzel (1):
  Revert "iommu/amd: Fix performance counter initialization"

Suravee Suthikulpanit (1):
  iommu/amd: Remove performance counter pre-initialization test

 drivers/iommu/amd/init.c | 49 ++--------------------------------------
 1 file changed, 2 insertions(+), 47 deletions(-)

-- 
2.17.1

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 46+ messages in thread

* [PATCH 1/2] Revert "iommu/amd: Fix performance counter initialization"
  2021-04-09  8:58 ` Suravee Suthikulpanit
@ 2021-04-09  8:58   ` Suravee Suthikulpanit
  -1 siblings, 0 replies; 46+ messages in thread
From: Suravee Suthikulpanit @ 2021-04-09  8:58 UTC (permalink / raw)
  To: linux-kernel, iommu
  Cc: joro, will, jsnitsel, pmenzel, Jon.Grimm, Tj, Shuah Khan,
	Alexander Monakov, David Coe, Suravee Suthikulpanit

From: Paul Menzel <pmenzel@molgen.mpg.de>

This reverts commit 6778ff5b21bd8e78c8bd547fd66437cf2657fd9b.

The original commit tries to address an issue, where PMC power-gating
causing the IOMMU PMC pre-init test to fail on certain desktop/mobile
platforms where the power-gating is normally enabled.

There have been several reports that the workaround still does not
guarantee to work, and can add up to 100 ms (on the worst case)
to the boot process on certain platforms such as the MSI B350M MORTAR
with AMD Ryzen 3 2200G.

Therefore, revert this commit as a prelude to removing the pre-init
test.

Link: https://lore.kernel.org/linux-iommu/alpine.LNX.3.20.13.2006030935570.3181@monopod.intra.ispras.ru/
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=201753
Cc: Tj (Elloe Linux) <ml.linux@elloe.vision>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Cc: Alexander Monakov <amonakov@ispras.ru>
Cc: David Coe <david.coe@live.co.uk>
Signed-off-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
---
Note: I have revised the commit message to add more detail
      and remove uncessary information.

 drivers/iommu/amd/init.c | 45 ++++++++++------------------------------
 1 file changed, 11 insertions(+), 34 deletions(-)

diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index 321f5906e6ed..648cdfd03074 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -12,7 +12,6 @@
 #include <linux/acpi.h>
 #include <linux/list.h>
 #include <linux/bitmap.h>
-#include <linux/delay.h>
 #include <linux/slab.h>
 #include <linux/syscore_ops.h>
 #include <linux/interrupt.h>
@@ -257,8 +256,6 @@ static enum iommu_init_state init_state = IOMMU_START_STATE;
 static int amd_iommu_enable_interrupts(void);
 static int __init iommu_go_to_state(enum iommu_init_state state);
 static void init_device_table_dma(void);
-static int iommu_pc_get_set_reg(struct amd_iommu *iommu, u8 bank, u8 cntr,
-				u8 fxn, u64 *value, bool is_write);
 
 static bool amd_iommu_pre_enabled = true;
 
@@ -1717,11 +1714,13 @@ static int __init init_iommu_all(struct acpi_table_header *table)
 	return 0;
 }
 
-static void __init init_iommu_perf_ctr(struct amd_iommu *iommu)
+static int iommu_pc_get_set_reg(struct amd_iommu *iommu, u8 bank, u8 cntr,
+				u8 fxn, u64 *value, bool is_write);
+
+static void init_iommu_perf_ctr(struct amd_iommu *iommu)
 {
-	int retry;
 	struct pci_dev *pdev = iommu->dev;
-	u64 val = 0xabcd, val2 = 0, save_reg, save_src;
+	u64 val = 0xabcd, val2 = 0, save_reg = 0;
 
 	if (!iommu_feature(iommu, FEATURE_PC))
 		return;
@@ -1729,39 +1728,17 @@ static void __init init_iommu_perf_ctr(struct amd_iommu *iommu)
 	amd_iommu_pc_present = true;
 
 	/* save the value to restore, if writable */
-	if (iommu_pc_get_set_reg(iommu, 0, 0, 0, &save_reg, false) ||
-	    iommu_pc_get_set_reg(iommu, 0, 0, 8, &save_src, false))
-		goto pc_false;
-
-	/*
-	 * Disable power gating by programing the performance counter
-	 * source to 20 (i.e. counts the reads and writes from/to IOMMU
-	 * Reserved Register [MMIO Offset 1FF8h] that are ignored.),
-	 * which never get incremented during this init phase.
-	 * (Note: The event is also deprecated.)
-	 */
-	val = 20;
-	if (iommu_pc_get_set_reg(iommu, 0, 0, 8, &val, true))
+	if (iommu_pc_get_set_reg(iommu, 0, 0, 0, &save_reg, false))
 		goto pc_false;
 
 	/* Check if the performance counters can be written to */
-	val = 0xabcd;
-	for (retry = 5; retry; retry--) {
-		if (iommu_pc_get_set_reg(iommu, 0, 0, 0, &val, true) ||
-		    iommu_pc_get_set_reg(iommu, 0, 0, 0, &val2, false) ||
-		    val2)
-			break;
-
-		/* Wait about 20 msec for power gating to disable and retry. */
-		msleep(20);
-	}
-
-	/* restore */
-	if (iommu_pc_get_set_reg(iommu, 0, 0, 0, &save_reg, true) ||
-	    iommu_pc_get_set_reg(iommu, 0, 0, 8, &save_src, true))
+	if ((iommu_pc_get_set_reg(iommu, 0, 0, 0, &val, true)) ||
+	    (iommu_pc_get_set_reg(iommu, 0, 0, 0, &val2, false)) ||
+	    (val != val2))
 		goto pc_false;
 
-	if (val != val2)
+	/* restore */
+	if (iommu_pc_get_set_reg(iommu, 0, 0, 0, &save_reg, true))
 		goto pc_false;
 
 	pci_info(pdev, "IOMMU performance counters supported\n");
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 1/2] Revert "iommu/amd: Fix performance counter initialization"
@ 2021-04-09  8:58   ` Suravee Suthikulpanit
  0 siblings, 0 replies; 46+ messages in thread
From: Suravee Suthikulpanit @ 2021-04-09  8:58 UTC (permalink / raw)
  To: linux-kernel, iommu
  Cc: pmenzel, Alexander Monakov, David Coe, Jon.Grimm, Shuah Khan, Tj, will

From: Paul Menzel <pmenzel@molgen.mpg.de>

This reverts commit 6778ff5b21bd8e78c8bd547fd66437cf2657fd9b.

The original commit tries to address an issue, where PMC power-gating
causing the IOMMU PMC pre-init test to fail on certain desktop/mobile
platforms where the power-gating is normally enabled.

There have been several reports that the workaround still does not
guarantee to work, and can add up to 100 ms (on the worst case)
to the boot process on certain platforms such as the MSI B350M MORTAR
with AMD Ryzen 3 2200G.

Therefore, revert this commit as a prelude to removing the pre-init
test.

Link: https://lore.kernel.org/linux-iommu/alpine.LNX.3.20.13.2006030935570.3181@monopod.intra.ispras.ru/
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=201753
Cc: Tj (Elloe Linux) <ml.linux@elloe.vision>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Cc: Alexander Monakov <amonakov@ispras.ru>
Cc: David Coe <david.coe@live.co.uk>
Signed-off-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
---
Note: I have revised the commit message to add more detail
      and remove uncessary information.

 drivers/iommu/amd/init.c | 45 ++++++++++------------------------------
 1 file changed, 11 insertions(+), 34 deletions(-)

diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index 321f5906e6ed..648cdfd03074 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -12,7 +12,6 @@
 #include <linux/acpi.h>
 #include <linux/list.h>
 #include <linux/bitmap.h>
-#include <linux/delay.h>
 #include <linux/slab.h>
 #include <linux/syscore_ops.h>
 #include <linux/interrupt.h>
@@ -257,8 +256,6 @@ static enum iommu_init_state init_state = IOMMU_START_STATE;
 static int amd_iommu_enable_interrupts(void);
 static int __init iommu_go_to_state(enum iommu_init_state state);
 static void init_device_table_dma(void);
-static int iommu_pc_get_set_reg(struct amd_iommu *iommu, u8 bank, u8 cntr,
-				u8 fxn, u64 *value, bool is_write);
 
 static bool amd_iommu_pre_enabled = true;
 
@@ -1717,11 +1714,13 @@ static int __init init_iommu_all(struct acpi_table_header *table)
 	return 0;
 }
 
-static void __init init_iommu_perf_ctr(struct amd_iommu *iommu)
+static int iommu_pc_get_set_reg(struct amd_iommu *iommu, u8 bank, u8 cntr,
+				u8 fxn, u64 *value, bool is_write);
+
+static void init_iommu_perf_ctr(struct amd_iommu *iommu)
 {
-	int retry;
 	struct pci_dev *pdev = iommu->dev;
-	u64 val = 0xabcd, val2 = 0, save_reg, save_src;
+	u64 val = 0xabcd, val2 = 0, save_reg = 0;
 
 	if (!iommu_feature(iommu, FEATURE_PC))
 		return;
@@ -1729,39 +1728,17 @@ static void __init init_iommu_perf_ctr(struct amd_iommu *iommu)
 	amd_iommu_pc_present = true;
 
 	/* save the value to restore, if writable */
-	if (iommu_pc_get_set_reg(iommu, 0, 0, 0, &save_reg, false) ||
-	    iommu_pc_get_set_reg(iommu, 0, 0, 8, &save_src, false))
-		goto pc_false;
-
-	/*
-	 * Disable power gating by programing the performance counter
-	 * source to 20 (i.e. counts the reads and writes from/to IOMMU
-	 * Reserved Register [MMIO Offset 1FF8h] that are ignored.),
-	 * which never get incremented during this init phase.
-	 * (Note: The event is also deprecated.)
-	 */
-	val = 20;
-	if (iommu_pc_get_set_reg(iommu, 0, 0, 8, &val, true))
+	if (iommu_pc_get_set_reg(iommu, 0, 0, 0, &save_reg, false))
 		goto pc_false;
 
 	/* Check if the performance counters can be written to */
-	val = 0xabcd;
-	for (retry = 5; retry; retry--) {
-		if (iommu_pc_get_set_reg(iommu, 0, 0, 0, &val, true) ||
-		    iommu_pc_get_set_reg(iommu, 0, 0, 0, &val2, false) ||
-		    val2)
-			break;
-
-		/* Wait about 20 msec for power gating to disable and retry. */
-		msleep(20);
-	}
-
-	/* restore */
-	if (iommu_pc_get_set_reg(iommu, 0, 0, 0, &save_reg, true) ||
-	    iommu_pc_get_set_reg(iommu, 0, 0, 8, &save_src, true))
+	if ((iommu_pc_get_set_reg(iommu, 0, 0, 0, &val, true)) ||
+	    (iommu_pc_get_set_reg(iommu, 0, 0, 0, &val2, false)) ||
+	    (val != val2))
 		goto pc_false;
 
-	if (val != val2)
+	/* restore */
+	if (iommu_pc_get_set_reg(iommu, 0, 0, 0, &save_reg, true))
 		goto pc_false;
 
 	pci_info(pdev, "IOMMU performance counters supported\n");
-- 
2.17.1

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 2/2] iommu/amd: Remove performance counter pre-initialization test
  2021-04-09  8:58 ` Suravee Suthikulpanit
@ 2021-04-09  8:58   ` Suravee Suthikulpanit
  -1 siblings, 0 replies; 46+ messages in thread
From: Suravee Suthikulpanit @ 2021-04-09  8:58 UTC (permalink / raw)
  To: linux-kernel, iommu
  Cc: joro, will, jsnitsel, pmenzel, Jon.Grimm, Suravee Suthikulpanit,
	Tj, Shuah Khan, Alexander Monakov, David Coe

In early AMD desktop/mobile platforms (during 2013), when the IOMMU
Performance Counter (PMC) support was first introduced in
commit 30861ddc9cca ("perf/x86/amd: Add IOMMU Performance Counter
resource management"), there was a HW bug where the counters could not
be accessed. The result was reading of the counter always return zero.

At the time, the suggested workaround was to add a test logic prior
to initializing the PMC feature to check if the counters can be programmed
and read back the same value. This has been working fine until the more
recent desktop/mobile platforms start enabling power gating for the PMC,
which prevents access to the counters. This results in the PMC support
being disabled unnecesarily.

Unfortunatly, there is no documentation of since which generation
of hardware the original PMC HW bug was fixed. Although, it was fixed
soon after the first introduction of the PMC. Base on this, we assume
that the buggy platforms are less likely to be in used, and it should
be relatively safe to remove this legacy logic.

Link: https://lore.kernel.org/linux-iommu/alpine.LNX.3.20.13.2006030935570.3181@monopod.intra.ispras.ru/
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=201753
Cc: Tj (Elloe Linux) <ml.linux@elloe.vision>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Cc: Alexander Monakov <amonakov@ispras.ru>
Cc: David Coe <david.coe@live.co.uk>
Cc: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
---
 drivers/iommu/amd/init.c | 24 +-----------------------
 1 file changed, 1 insertion(+), 23 deletions(-)

diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index 648cdfd03074..247cdda5d683 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -1714,33 +1714,16 @@ static int __init init_iommu_all(struct acpi_table_header *table)
 	return 0;
 }
 
-static int iommu_pc_get_set_reg(struct amd_iommu *iommu, u8 bank, u8 cntr,
-				u8 fxn, u64 *value, bool is_write);
-
 static void init_iommu_perf_ctr(struct amd_iommu *iommu)
 {
+	u64 val;
 	struct pci_dev *pdev = iommu->dev;
-	u64 val = 0xabcd, val2 = 0, save_reg = 0;
 
 	if (!iommu_feature(iommu, FEATURE_PC))
 		return;
 
 	amd_iommu_pc_present = true;
 
-	/* save the value to restore, if writable */
-	if (iommu_pc_get_set_reg(iommu, 0, 0, 0, &save_reg, false))
-		goto pc_false;
-
-	/* Check if the performance counters can be written to */
-	if ((iommu_pc_get_set_reg(iommu, 0, 0, 0, &val, true)) ||
-	    (iommu_pc_get_set_reg(iommu, 0, 0, 0, &val2, false)) ||
-	    (val != val2))
-		goto pc_false;
-
-	/* restore */
-	if (iommu_pc_get_set_reg(iommu, 0, 0, 0, &save_reg, true))
-		goto pc_false;
-
 	pci_info(pdev, "IOMMU performance counters supported\n");
 
 	val = readl(iommu->mmio_base + MMIO_CNTR_CONF_OFFSET);
@@ -1748,11 +1731,6 @@ static void init_iommu_perf_ctr(struct amd_iommu *iommu)
 	iommu->max_counters = (u8) ((val >> 7) & 0xf);
 
 	return;
-
-pc_false:
-	pci_err(pdev, "Unable to read/write to IOMMU perf counter.\n");
-	amd_iommu_pc_present = false;
-	return;
 }
 
 static ssize_t amd_iommu_show_cap(struct device *dev,
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH 2/2] iommu/amd: Remove performance counter pre-initialization test
@ 2021-04-09  8:58   ` Suravee Suthikulpanit
  0 siblings, 0 replies; 46+ messages in thread
From: Suravee Suthikulpanit @ 2021-04-09  8:58 UTC (permalink / raw)
  To: linux-kernel, iommu
  Cc: pmenzel, Alexander Monakov, David Coe, Jon.Grimm, Shuah Khan, Tj, will

In early AMD desktop/mobile platforms (during 2013), when the IOMMU
Performance Counter (PMC) support was first introduced in
commit 30861ddc9cca ("perf/x86/amd: Add IOMMU Performance Counter
resource management"), there was a HW bug where the counters could not
be accessed. The result was reading of the counter always return zero.

At the time, the suggested workaround was to add a test logic prior
to initializing the PMC feature to check if the counters can be programmed
and read back the same value. This has been working fine until the more
recent desktop/mobile platforms start enabling power gating for the PMC,
which prevents access to the counters. This results in the PMC support
being disabled unnecesarily.

Unfortunatly, there is no documentation of since which generation
of hardware the original PMC HW bug was fixed. Although, it was fixed
soon after the first introduction of the PMC. Base on this, we assume
that the buggy platforms are less likely to be in used, and it should
be relatively safe to remove this legacy logic.

Link: https://lore.kernel.org/linux-iommu/alpine.LNX.3.20.13.2006030935570.3181@monopod.intra.ispras.ru/
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=201753
Cc: Tj (Elloe Linux) <ml.linux@elloe.vision>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Cc: Alexander Monakov <amonakov@ispras.ru>
Cc: David Coe <david.coe@live.co.uk>
Cc: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
---
 drivers/iommu/amd/init.c | 24 +-----------------------
 1 file changed, 1 insertion(+), 23 deletions(-)

diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index 648cdfd03074..247cdda5d683 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -1714,33 +1714,16 @@ static int __init init_iommu_all(struct acpi_table_header *table)
 	return 0;
 }
 
-static int iommu_pc_get_set_reg(struct amd_iommu *iommu, u8 bank, u8 cntr,
-				u8 fxn, u64 *value, bool is_write);
-
 static void init_iommu_perf_ctr(struct amd_iommu *iommu)
 {
+	u64 val;
 	struct pci_dev *pdev = iommu->dev;
-	u64 val = 0xabcd, val2 = 0, save_reg = 0;
 
 	if (!iommu_feature(iommu, FEATURE_PC))
 		return;
 
 	amd_iommu_pc_present = true;
 
-	/* save the value to restore, if writable */
-	if (iommu_pc_get_set_reg(iommu, 0, 0, 0, &save_reg, false))
-		goto pc_false;
-
-	/* Check if the performance counters can be written to */
-	if ((iommu_pc_get_set_reg(iommu, 0, 0, 0, &val, true)) ||
-	    (iommu_pc_get_set_reg(iommu, 0, 0, 0, &val2, false)) ||
-	    (val != val2))
-		goto pc_false;
-
-	/* restore */
-	if (iommu_pc_get_set_reg(iommu, 0, 0, 0, &save_reg, true))
-		goto pc_false;
-
 	pci_info(pdev, "IOMMU performance counters supported\n");
 
 	val = readl(iommu->mmio_base + MMIO_CNTR_CONF_OFFSET);
@@ -1748,11 +1731,6 @@ static void init_iommu_perf_ctr(struct amd_iommu *iommu)
 	iommu->max_counters = (u8) ((val >> 7) & 0xf);
 
 	return;
-
-pc_false:
-	pci_err(pdev, "Unable to read/write to IOMMU perf counter.\n");
-	amd_iommu_pc_present = false;
-	return;
 }
 
 static ssize_t amd_iommu_show_cap(struct device *dev,
-- 
2.17.1

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* Re: [PATCH 2/2] iommu/amd: Remove performance counter pre-initialization test
  2021-04-09  8:58   ` Suravee Suthikulpanit
@ 2021-04-09 16:37     ` Shuah Khan
  -1 siblings, 0 replies; 46+ messages in thread
From: Shuah Khan @ 2021-04-09 16:37 UTC (permalink / raw)
  To: Suravee Suthikulpanit, linux-kernel, iommu
  Cc: joro, will, jsnitsel, pmenzel, Jon.Grimm, Tj, Alexander Monakov,
	David Coe, Shuah Khan

On 4/9/21 2:58 AM, Suravee Suthikulpanit wrote:
> In early AMD desktop/mobile platforms (during 2013), when the IOMMU
> Performance Counter (PMC) support was first introduced in
> commit 30861ddc9cca ("perf/x86/amd: Add IOMMU Performance Counter
> resource management"), there was a HW bug where the counters could not
> be accessed. The result was reading of the counter always return zero.
> 
> At the time, the suggested workaround was to add a test logic prior
> to initializing the PMC feature to check if the counters can be programmed
> and read back the same value. This has been working fine until the more
> recent desktop/mobile platforms start enabling power gating for the PMC,
> which prevents access to the counters. This results in the PMC support
> being disabled unnecesarily.

unnecessarily

> 
> Unfortunatly, there is no documentation of since which generation

Unfortunately,

Rephrase suggestion:
Unfortunately, it is unclear when the PMC HW bug fixed.

> of hardware the original PMC HW bug was fixed. Although, it was fixed
> soon after the first introduction of the PMC. Base on this, we assume

Based

> that the buggy platforms are less likely to be in used, and it should

in use

> be relatively safe to remove this legacy logic.

> 
> Link: https://lore.kernel.org/linux-iommu/alpine.LNX.3.20.13.2006030935570.3181@monopod.intra.ispras.ru/
> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=201753
> Cc: Tj (Elloe Linux) <ml.linux@elloe.vision>
> Cc: Shuah Khan <skhan@linuxfoundation.org>
> Cc: Alexander Monakov <amonakov@ispras.ru>
> Cc: David Coe <david.coe@live.co.uk>
> Cc: Paul Menzel <pmenzel@molgen.mpg.de>
> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
> ---
>   drivers/iommu/amd/init.c | 24 +-----------------------
>   1 file changed, 1 insertion(+), 23 deletions(-)
> 
> diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
> index 648cdfd03074..247cdda5d683 100644
> --- a/drivers/iommu/amd/init.c
> +++ b/drivers/iommu/amd/init.c
> @@ -1714,33 +1714,16 @@ static int __init init_iommu_all(struct acpi_table_header *table)
>   	return 0;
>   }
>   
> -static int iommu_pc_get_set_reg(struct amd_iommu *iommu, u8 bank, u8 cntr,
> -				u8 fxn, u64 *value, bool is_write);
> -
>   static void init_iommu_perf_ctr(struct amd_iommu *iommu)
>   {
> +	u64 val;
>   	struct pci_dev *pdev = iommu->dev;
> -	u64 val = 0xabcd, val2 = 0, save_reg = 0;
>   
>   	if (!iommu_feature(iommu, FEATURE_PC))
>   		return;
>   
>   	amd_iommu_pc_present = true;
>   
> -	/* save the value to restore, if writable */
> -	if (iommu_pc_get_set_reg(iommu, 0, 0, 0, &save_reg, false))
> -		goto pc_false;
> -
> -	/* Check if the performance counters can be written to */
> -	if ((iommu_pc_get_set_reg(iommu, 0, 0, 0, &val, true)) ||
> -	    (iommu_pc_get_set_reg(iommu, 0, 0, 0, &val2, false)) ||
> -	    (val != val2))
> -		goto pc_false;
> -
> -	/* restore */
> -	if (iommu_pc_get_set_reg(iommu, 0, 0, 0, &save_reg, true))
> -		goto pc_false;
> -
>   	pci_info(pdev, "IOMMU performance counters supported\n");
>   
>   	val = readl(iommu->mmio_base + MMIO_CNTR_CONF_OFFSET);
> @@ -1748,11 +1731,6 @@ static void init_iommu_perf_ctr(struct amd_iommu *iommu)
>   	iommu->max_counters = (u8) ((val >> 7) & 0xf);
>   
>   	return;
> -
> -pc_false:
> -	pci_err(pdev, "Unable to read/write to IOMMU perf counter.\n");
> -	amd_iommu_pc_present = false;
> -	return;
>   }
>   
>   static ssize_t amd_iommu_show_cap(struct device *dev,
> 

I will test this patch and the revert on my two AMD systems and update
you in a day or two.

thanks,
-- Shuah

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 2/2] iommu/amd: Remove performance counter pre-initialization test
@ 2021-04-09 16:37     ` Shuah Khan
  0 siblings, 0 replies; 46+ messages in thread
From: Shuah Khan @ 2021-04-09 16:37 UTC (permalink / raw)
  To: Suravee Suthikulpanit, linux-kernel, iommu
  Cc: pmenzel, Alexander Monakov, David Coe, Jon.Grimm, Shuah Khan, Tj, will

On 4/9/21 2:58 AM, Suravee Suthikulpanit wrote:
> In early AMD desktop/mobile platforms (during 2013), when the IOMMU
> Performance Counter (PMC) support was first introduced in
> commit 30861ddc9cca ("perf/x86/amd: Add IOMMU Performance Counter
> resource management"), there was a HW bug where the counters could not
> be accessed. The result was reading of the counter always return zero.
> 
> At the time, the suggested workaround was to add a test logic prior
> to initializing the PMC feature to check if the counters can be programmed
> and read back the same value. This has been working fine until the more
> recent desktop/mobile platforms start enabling power gating for the PMC,
> which prevents access to the counters. This results in the PMC support
> being disabled unnecesarily.

unnecessarily

> 
> Unfortunatly, there is no documentation of since which generation

Unfortunately,

Rephrase suggestion:
Unfortunately, it is unclear when the PMC HW bug fixed.

> of hardware the original PMC HW bug was fixed. Although, it was fixed
> soon after the first introduction of the PMC. Base on this, we assume

Based

> that the buggy platforms are less likely to be in used, and it should

in use

> be relatively safe to remove this legacy logic.

> 
> Link: https://lore.kernel.org/linux-iommu/alpine.LNX.3.20.13.2006030935570.3181@monopod.intra.ispras.ru/
> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=201753
> Cc: Tj (Elloe Linux) <ml.linux@elloe.vision>
> Cc: Shuah Khan <skhan@linuxfoundation.org>
> Cc: Alexander Monakov <amonakov@ispras.ru>
> Cc: David Coe <david.coe@live.co.uk>
> Cc: Paul Menzel <pmenzel@molgen.mpg.de>
> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
> ---
>   drivers/iommu/amd/init.c | 24 +-----------------------
>   1 file changed, 1 insertion(+), 23 deletions(-)
> 
> diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
> index 648cdfd03074..247cdda5d683 100644
> --- a/drivers/iommu/amd/init.c
> +++ b/drivers/iommu/amd/init.c
> @@ -1714,33 +1714,16 @@ static int __init init_iommu_all(struct acpi_table_header *table)
>   	return 0;
>   }
>   
> -static int iommu_pc_get_set_reg(struct amd_iommu *iommu, u8 bank, u8 cntr,
> -				u8 fxn, u64 *value, bool is_write);
> -
>   static void init_iommu_perf_ctr(struct amd_iommu *iommu)
>   {
> +	u64 val;
>   	struct pci_dev *pdev = iommu->dev;
> -	u64 val = 0xabcd, val2 = 0, save_reg = 0;
>   
>   	if (!iommu_feature(iommu, FEATURE_PC))
>   		return;
>   
>   	amd_iommu_pc_present = true;
>   
> -	/* save the value to restore, if writable */
> -	if (iommu_pc_get_set_reg(iommu, 0, 0, 0, &save_reg, false))
> -		goto pc_false;
> -
> -	/* Check if the performance counters can be written to */
> -	if ((iommu_pc_get_set_reg(iommu, 0, 0, 0, &val, true)) ||
> -	    (iommu_pc_get_set_reg(iommu, 0, 0, 0, &val2, false)) ||
> -	    (val != val2))
> -		goto pc_false;
> -
> -	/* restore */
> -	if (iommu_pc_get_set_reg(iommu, 0, 0, 0, &save_reg, true))
> -		goto pc_false;
> -
>   	pci_info(pdev, "IOMMU performance counters supported\n");
>   
>   	val = readl(iommu->mmio_base + MMIO_CNTR_CONF_OFFSET);
> @@ -1748,11 +1731,6 @@ static void init_iommu_perf_ctr(struct amd_iommu *iommu)
>   	iommu->max_counters = (u8) ((val >> 7) & 0xf);
>   
>   	return;
> -
> -pc_false:
> -	pci_err(pdev, "Unable to read/write to IOMMU perf counter.\n");
> -	amd_iommu_pc_present = false;
> -	return;
>   }
>   
>   static ssize_t amd_iommu_show_cap(struct device *dev,
> 

I will test this patch and the revert on my two AMD systems and update
you in a day or two.

thanks,
-- Shuah
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 1/2] Revert "iommu/amd: Fix performance counter initialization"
  2021-04-09  8:58   ` Suravee Suthikulpanit
@ 2021-04-09 17:06     ` Shuah Khan
  -1 siblings, 0 replies; 46+ messages in thread
From: Shuah Khan @ 2021-04-09 17:06 UTC (permalink / raw)
  To: Suravee Suthikulpanit, linux-kernel, iommu
  Cc: joro, will, jsnitsel, pmenzel, Jon.Grimm, Tj, Alexander Monakov,
	David Coe, Shuah Khan

On 4/9/21 2:58 AM, Suravee Suthikulpanit wrote:
> From: Paul Menzel <pmenzel@molgen.mpg.de>
> 
> This reverts commit 6778ff5b21bd8e78c8bd547fd66437cf2657fd9b.
> 
> The original commit tries to address an issue, where PMC power-gating
> causing the IOMMU PMC pre-init test to fail on certain desktop/mobile
> platforms where the power-gating is normally enabled.
> 
> There have been several reports that the workaround still does not
> guarantee to work, and can add up to 100 ms (on the worst case)
> to the boot process on certain platforms such as the MSI B350M MORTAR
> with AMD Ryzen 3 2200G.
> 
> Therefore, revert this commit as a prelude to removing the pre-init
> test.
> 
> Link: https://lore.kernel.org/linux-iommu/alpine.LNX.3.20.13.2006030935570.3181@monopod.intra.ispras.ru/
> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=201753
> Cc: Tj (Elloe Linux) <ml.linux@elloe.vision>
> Cc: Shuah Khan <skhan@linuxfoundation.org>
> Cc: Alexander Monakov <amonakov@ispras.ru>
> Cc: David Coe <david.coe@live.co.uk>
> Signed-off-by: Paul Menzel <pmenzel@molgen.mpg.de>
> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
> ---
> Note: I have revised the commit message to add more detail
>        and remove uncessary information.
> 
>   drivers/iommu/amd/init.c | 45 ++++++++++------------------------------
>   1 file changed, 11 insertions(+), 34 deletions(-)
> 
> diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
> index 321f5906e6ed..648cdfd03074 100644
> --- a/drivers/iommu/amd/init.c
> +++ b/drivers/iommu/amd/init.c
> @@ -12,7 +12,6 @@
>   #include <linux/acpi.h>
>   #include <linux/list.h>
>   #include <linux/bitmap.h>
> -#include <linux/delay.h>
>   #include <linux/slab.h>
>   #include <linux/syscore_ops.h>
>   #include <linux/interrupt.h>
> @@ -257,8 +256,6 @@ static enum iommu_init_state init_state = IOMMU_START_STATE;
>   static int amd_iommu_enable_interrupts(void);
>   static int __init iommu_go_to_state(enum iommu_init_state state);
>   static void init_device_table_dma(void);
> -static int iommu_pc_get_set_reg(struct amd_iommu *iommu, u8 bank, u8 cntr,
> -				u8 fxn, u64 *value, bool is_write);
>   
>   static bool amd_iommu_pre_enabled = true;
>   
> @@ -1717,11 +1714,13 @@ static int __init init_iommu_all(struct acpi_table_header *table)
>   	return 0;
>   }
>   
> -static void __init init_iommu_perf_ctr(struct amd_iommu *iommu)
> +static int iommu_pc_get_set_reg(struct amd_iommu *iommu, u8 bank, u8 cntr,
> +				u8 fxn, u64 *value, bool is_write);
> +
> +static void init_iommu_perf_ctr(struct amd_iommu *iommu)
>   {
> -	int retry;
>   	struct pci_dev *pdev = iommu->dev;
> -	u64 val = 0xabcd, val2 = 0, save_reg, save_src;
> +	u64 val = 0xabcd, val2 = 0, save_reg = 0;
>   
>   	if (!iommu_feature(iommu, FEATURE_PC))
>   		return;
> @@ -1729,39 +1728,17 @@ static void __init init_iommu_perf_ctr(struct amd_iommu *iommu)
>   	amd_iommu_pc_present = true;
>   
>   	/* save the value to restore, if writable */
> -	if (iommu_pc_get_set_reg(iommu, 0, 0, 0, &save_reg, false) ||
> -	    iommu_pc_get_set_reg(iommu, 0, 0, 8, &save_src, false))
> -		goto pc_false;
> -
> -	/*
> -	 * Disable power gating by programing the performance counter
> -	 * source to 20 (i.e. counts the reads and writes from/to IOMMU
> -	 * Reserved Register [MMIO Offset 1FF8h] that are ignored.),
> -	 * which never get incremented during this init phase.
> -	 * (Note: The event is also deprecated.)
> -	 */
> -	val = 20;
> -	if (iommu_pc_get_set_reg(iommu, 0, 0, 8, &val, true))
> +	if (iommu_pc_get_set_reg(iommu, 0, 0, 0, &save_reg, false))
>   		goto pc_false;
>   
>   	/* Check if the performance counters can be written to */
> -	val = 0xabcd;
> -	for (retry = 5; retry; retry--) {
> -		if (iommu_pc_get_set_reg(iommu, 0, 0, 0, &val, true) ||
> -		    iommu_pc_get_set_reg(iommu, 0, 0, 0, &val2, false) ||
> -		    val2)
> -			break;
> -
> -		/* Wait about 20 msec for power gating to disable and retry. */
> -		msleep(20);
> -	}
> -
> -	/* restore */
> -	if (iommu_pc_get_set_reg(iommu, 0, 0, 0, &save_reg, true) ||
> -	    iommu_pc_get_set_reg(iommu, 0, 0, 8, &save_src, true))
> +	if ((iommu_pc_get_set_reg(iommu, 0, 0, 0, &val, true)) ||
> +	    (iommu_pc_get_set_reg(iommu, 0, 0, 0, &val2, false)) ||
> +	    (val != val2))

Probably don't need parentheses around 'val != val2'

>   		goto pc_false;
>   
> -	if (val != val2)
> +	/* restore */
> +	if (iommu_pc_get_set_reg(iommu, 0, 0, 0, &save_reg, true))
>   		goto pc_false;
>   
>   	pci_info(pdev, "IOMMU performance counters supported\n");
> 

thanks,
-- Shuah

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 1/2] Revert "iommu/amd: Fix performance counter initialization"
@ 2021-04-09 17:06     ` Shuah Khan
  0 siblings, 0 replies; 46+ messages in thread
From: Shuah Khan @ 2021-04-09 17:06 UTC (permalink / raw)
  To: Suravee Suthikulpanit, linux-kernel, iommu
  Cc: pmenzel, Alexander Monakov, David Coe, Jon.Grimm, Shuah Khan, Tj, will

On 4/9/21 2:58 AM, Suravee Suthikulpanit wrote:
> From: Paul Menzel <pmenzel@molgen.mpg.de>
> 
> This reverts commit 6778ff5b21bd8e78c8bd547fd66437cf2657fd9b.
> 
> The original commit tries to address an issue, where PMC power-gating
> causing the IOMMU PMC pre-init test to fail on certain desktop/mobile
> platforms where the power-gating is normally enabled.
> 
> There have been several reports that the workaround still does not
> guarantee to work, and can add up to 100 ms (on the worst case)
> to the boot process on certain platforms such as the MSI B350M MORTAR
> with AMD Ryzen 3 2200G.
> 
> Therefore, revert this commit as a prelude to removing the pre-init
> test.
> 
> Link: https://lore.kernel.org/linux-iommu/alpine.LNX.3.20.13.2006030935570.3181@monopod.intra.ispras.ru/
> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=201753
> Cc: Tj (Elloe Linux) <ml.linux@elloe.vision>
> Cc: Shuah Khan <skhan@linuxfoundation.org>
> Cc: Alexander Monakov <amonakov@ispras.ru>
> Cc: David Coe <david.coe@live.co.uk>
> Signed-off-by: Paul Menzel <pmenzel@molgen.mpg.de>
> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
> ---
> Note: I have revised the commit message to add more detail
>        and remove uncessary information.
> 
>   drivers/iommu/amd/init.c | 45 ++++++++++------------------------------
>   1 file changed, 11 insertions(+), 34 deletions(-)
> 
> diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
> index 321f5906e6ed..648cdfd03074 100644
> --- a/drivers/iommu/amd/init.c
> +++ b/drivers/iommu/amd/init.c
> @@ -12,7 +12,6 @@
>   #include <linux/acpi.h>
>   #include <linux/list.h>
>   #include <linux/bitmap.h>
> -#include <linux/delay.h>
>   #include <linux/slab.h>
>   #include <linux/syscore_ops.h>
>   #include <linux/interrupt.h>
> @@ -257,8 +256,6 @@ static enum iommu_init_state init_state = IOMMU_START_STATE;
>   static int amd_iommu_enable_interrupts(void);
>   static int __init iommu_go_to_state(enum iommu_init_state state);
>   static void init_device_table_dma(void);
> -static int iommu_pc_get_set_reg(struct amd_iommu *iommu, u8 bank, u8 cntr,
> -				u8 fxn, u64 *value, bool is_write);
>   
>   static bool amd_iommu_pre_enabled = true;
>   
> @@ -1717,11 +1714,13 @@ static int __init init_iommu_all(struct acpi_table_header *table)
>   	return 0;
>   }
>   
> -static void __init init_iommu_perf_ctr(struct amd_iommu *iommu)
> +static int iommu_pc_get_set_reg(struct amd_iommu *iommu, u8 bank, u8 cntr,
> +				u8 fxn, u64 *value, bool is_write);
> +
> +static void init_iommu_perf_ctr(struct amd_iommu *iommu)
>   {
> -	int retry;
>   	struct pci_dev *pdev = iommu->dev;
> -	u64 val = 0xabcd, val2 = 0, save_reg, save_src;
> +	u64 val = 0xabcd, val2 = 0, save_reg = 0;
>   
>   	if (!iommu_feature(iommu, FEATURE_PC))
>   		return;
> @@ -1729,39 +1728,17 @@ static void __init init_iommu_perf_ctr(struct amd_iommu *iommu)
>   	amd_iommu_pc_present = true;
>   
>   	/* save the value to restore, if writable */
> -	if (iommu_pc_get_set_reg(iommu, 0, 0, 0, &save_reg, false) ||
> -	    iommu_pc_get_set_reg(iommu, 0, 0, 8, &save_src, false))
> -		goto pc_false;
> -
> -	/*
> -	 * Disable power gating by programing the performance counter
> -	 * source to 20 (i.e. counts the reads and writes from/to IOMMU
> -	 * Reserved Register [MMIO Offset 1FF8h] that are ignored.),
> -	 * which never get incremented during this init phase.
> -	 * (Note: The event is also deprecated.)
> -	 */
> -	val = 20;
> -	if (iommu_pc_get_set_reg(iommu, 0, 0, 8, &val, true))
> +	if (iommu_pc_get_set_reg(iommu, 0, 0, 0, &save_reg, false))
>   		goto pc_false;
>   
>   	/* Check if the performance counters can be written to */
> -	val = 0xabcd;
> -	for (retry = 5; retry; retry--) {
> -		if (iommu_pc_get_set_reg(iommu, 0, 0, 0, &val, true) ||
> -		    iommu_pc_get_set_reg(iommu, 0, 0, 0, &val2, false) ||
> -		    val2)
> -			break;
> -
> -		/* Wait about 20 msec for power gating to disable and retry. */
> -		msleep(20);
> -	}
> -
> -	/* restore */
> -	if (iommu_pc_get_set_reg(iommu, 0, 0, 0, &save_reg, true) ||
> -	    iommu_pc_get_set_reg(iommu, 0, 0, 8, &save_src, true))
> +	if ((iommu_pc_get_set_reg(iommu, 0, 0, 0, &val, true)) ||
> +	    (iommu_pc_get_set_reg(iommu, 0, 0, 0, &val2, false)) ||
> +	    (val != val2))

Probably don't need parentheses around 'val != val2'

>   		goto pc_false;
>   
> -	if (val != val2)
> +	/* restore */
> +	if (iommu_pc_get_set_reg(iommu, 0, 0, 0, &save_reg, true))
>   		goto pc_false;
>   
>   	pci_info(pdev, "IOMMU performance counters supported\n");
> 

thanks,
-- Shuah
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 2/2] iommu/amd: Remove performance counter pre-initialization test
  2021-04-09 16:37     ` Shuah Khan
@ 2021-04-09 17:10       ` Shuah Khan
  -1 siblings, 0 replies; 46+ messages in thread
From: Shuah Khan @ 2021-04-09 17:10 UTC (permalink / raw)
  To: Suravee Suthikulpanit, linux-kernel, iommu
  Cc: joro, will, jsnitsel, pmenzel, Jon.Grimm, Tj, Alexander Monakov,
	David Coe, Shuah Khan

On 4/9/21 10:37 AM, Shuah Khan wrote:
> On 4/9/21 2:58 AM, Suravee Suthikulpanit wrote:
>> In early AMD desktop/mobile platforms (during 2013), when the IOMMU
>> Performance Counter (PMC) support was first introduced in
>> commit 30861ddc9cca ("perf/x86/amd: Add IOMMU Performance Counter
>> resource management"), there was a HW bug where the counters could not
>> be accessed. The result was reading of the counter always return zero.
>>
>> At the time, the suggested workaround was to add a test logic prior
>> to initializing the PMC feature to check if the counters can be 
>> programmed
>> and read back the same value. This has been working fine until the more
>> recent desktop/mobile platforms start enabling power gating for the PMC,
>> which prevents access to the counters. This results in the PMC support
>> being disabled unnecesarily.
> 
> unnecessarily
> 
>>
>> Unfortunatly, there is no documentation of since which generation
> 
> Unfortunately,
> 
> Rephrase suggestion:
> Unfortunately, it is unclear when the PMC HW bug fixed.
> 
>> of hardware the original PMC HW bug was fixed. Although, it was fixed
>> soon after the first introduction of the PMC. Base on this, we assume
> 
> Based
> 
>> that the buggy platforms are less likely to be in used, and it should
> 
> in use
> 
>> be relatively safe to remove this legacy logic.
> 
>>
>> Link: 
>> https://lore.kernel.org/linux-iommu/alpine.LNX.3.20.13.2006030935570.3181@monopod.intra.ispras.ru/ 
>>
>> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=201753
>> Cc: Tj (Elloe Linux) <ml.linux@elloe.vision>
>> Cc: Shuah Khan <skhan@linuxfoundation.org>
>> Cc: Alexander Monakov <amonakov@ispras.ru>
>> Cc: David Coe <david.coe@live.co.uk>
>> Cc: Paul Menzel <pmenzel@molgen.mpg.de>
>> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
>> ---
>>   drivers/iommu/amd/init.c | 24 +-----------------------
>>   1 file changed, 1 insertion(+), 23 deletions(-)
>>
>> diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
>> index 648cdfd03074..247cdda5d683 100644
>> --- a/drivers/iommu/amd/init.c
>> +++ b/drivers/iommu/amd/init.c
>> @@ -1714,33 +1714,16 @@ static int __init init_iommu_all(struct 
>> acpi_table_header *table)
>>       return 0;
>>   }
>> -static int iommu_pc_get_set_reg(struct amd_iommu *iommu, u8 bank, u8 
>> cntr,
>> -                u8 fxn, u64 *value, bool is_write);
>> -
>>   static void init_iommu_perf_ctr(struct amd_iommu *iommu)
>>   {
>> +    u64 val;
>>       struct pci_dev *pdev = iommu->dev;
>> -    u64 val = 0xabcd, val2 = 0, save_reg = 0;

Why not leave this u64 val here? Having the pdev assignment as the
first line makes it easier to read/follow.

>>       if (!iommu_feature(iommu, FEATURE_PC))
>>           return;
>>       amd_iommu_pc_present = true;
>> -    /* save the value to restore, if writable */
>> -    if (iommu_pc_get_set_reg(iommu, 0, 0, 0, &save_reg, false))
>> -        goto pc_false;
>> -
>> -    /* Check if the performance counters can be written to */
>> -    if ((iommu_pc_get_set_reg(iommu, 0, 0, 0, &val, true)) ||
>> -        (iommu_pc_get_set_reg(iommu, 0, 0, 0, &val2, false)) ||
>> -        (val != val2))

Aha - this is going away anyway. Please ignore my comment on 1/2
about parenthesis around (val != val2) being unnecessary.

>> -        goto pc_false;
>> -
>> -    /* restore */
>> -    if (iommu_pc_get_set_reg(iommu, 0, 0, 0, &save_reg, true))
>> -        goto pc_false;
>> -
>>       pci_info(pdev, "IOMMU performance counters supported\n");
>>       val = readl(iommu->mmio_base + MMIO_CNTR_CONF_OFFSET);
>> @@ -1748,11 +1731,6 @@ static void init_iommu_perf_ctr(struct 
>> amd_iommu *iommu)
>>       iommu->max_counters = (u8) ((val >> 7) & 0xf);
>>       return;
>> -
>> -pc_false:
>> -    pci_err(pdev, "Unable to read/write to IOMMU perf counter.\n");
>> -    amd_iommu_pc_present = false;
>> -    return;
>>   }
>>   static ssize_t amd_iommu_show_cap(struct device *dev,
>>
> 

thanks,
-- Shuah

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 2/2] iommu/amd: Remove performance counter pre-initialization test
@ 2021-04-09 17:10       ` Shuah Khan
  0 siblings, 0 replies; 46+ messages in thread
From: Shuah Khan @ 2021-04-09 17:10 UTC (permalink / raw)
  To: Suravee Suthikulpanit, linux-kernel, iommu
  Cc: pmenzel, Alexander Monakov, David Coe, Jon.Grimm, Shuah Khan, Tj, will

On 4/9/21 10:37 AM, Shuah Khan wrote:
> On 4/9/21 2:58 AM, Suravee Suthikulpanit wrote:
>> In early AMD desktop/mobile platforms (during 2013), when the IOMMU
>> Performance Counter (PMC) support was first introduced in
>> commit 30861ddc9cca ("perf/x86/amd: Add IOMMU Performance Counter
>> resource management"), there was a HW bug where the counters could not
>> be accessed. The result was reading of the counter always return zero.
>>
>> At the time, the suggested workaround was to add a test logic prior
>> to initializing the PMC feature to check if the counters can be 
>> programmed
>> and read back the same value. This has been working fine until the more
>> recent desktop/mobile platforms start enabling power gating for the PMC,
>> which prevents access to the counters. This results in the PMC support
>> being disabled unnecesarily.
> 
> unnecessarily
> 
>>
>> Unfortunatly, there is no documentation of since which generation
> 
> Unfortunately,
> 
> Rephrase suggestion:
> Unfortunately, it is unclear when the PMC HW bug fixed.
> 
>> of hardware the original PMC HW bug was fixed. Although, it was fixed
>> soon after the first introduction of the PMC. Base on this, we assume
> 
> Based
> 
>> that the buggy platforms are less likely to be in used, and it should
> 
> in use
> 
>> be relatively safe to remove this legacy logic.
> 
>>
>> Link: 
>> https://lore.kernel.org/linux-iommu/alpine.LNX.3.20.13.2006030935570.3181@monopod.intra.ispras.ru/ 
>>
>> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=201753
>> Cc: Tj (Elloe Linux) <ml.linux@elloe.vision>
>> Cc: Shuah Khan <skhan@linuxfoundation.org>
>> Cc: Alexander Monakov <amonakov@ispras.ru>
>> Cc: David Coe <david.coe@live.co.uk>
>> Cc: Paul Menzel <pmenzel@molgen.mpg.de>
>> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
>> ---
>>   drivers/iommu/amd/init.c | 24 +-----------------------
>>   1 file changed, 1 insertion(+), 23 deletions(-)
>>
>> diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
>> index 648cdfd03074..247cdda5d683 100644
>> --- a/drivers/iommu/amd/init.c
>> +++ b/drivers/iommu/amd/init.c
>> @@ -1714,33 +1714,16 @@ static int __init init_iommu_all(struct 
>> acpi_table_header *table)
>>       return 0;
>>   }
>> -static int iommu_pc_get_set_reg(struct amd_iommu *iommu, u8 bank, u8 
>> cntr,
>> -                u8 fxn, u64 *value, bool is_write);
>> -
>>   static void init_iommu_perf_ctr(struct amd_iommu *iommu)
>>   {
>> +    u64 val;
>>       struct pci_dev *pdev = iommu->dev;
>> -    u64 val = 0xabcd, val2 = 0, save_reg = 0;

Why not leave this u64 val here? Having the pdev assignment as the
first line makes it easier to read/follow.

>>       if (!iommu_feature(iommu, FEATURE_PC))
>>           return;
>>       amd_iommu_pc_present = true;
>> -    /* save the value to restore, if writable */
>> -    if (iommu_pc_get_set_reg(iommu, 0, 0, 0, &save_reg, false))
>> -        goto pc_false;
>> -
>> -    /* Check if the performance counters can be written to */
>> -    if ((iommu_pc_get_set_reg(iommu, 0, 0, 0, &val, true)) ||
>> -        (iommu_pc_get_set_reg(iommu, 0, 0, 0, &val2, false)) ||
>> -        (val != val2))

Aha - this is going away anyway. Please ignore my comment on 1/2
about parenthesis around (val != val2) being unnecessary.

>> -        goto pc_false;
>> -
>> -    /* restore */
>> -    if (iommu_pc_get_set_reg(iommu, 0, 0, 0, &save_reg, true))
>> -        goto pc_false;
>> -
>>       pci_info(pdev, "IOMMU performance counters supported\n");
>>       val = readl(iommu->mmio_base + MMIO_CNTR_CONF_OFFSET);
>> @@ -1748,11 +1731,6 @@ static void init_iommu_perf_ctr(struct 
>> amd_iommu *iommu)
>>       iommu->max_counters = (u8) ((val >> 7) & 0xf);
>>       return;
>> -
>> -pc_false:
>> -    pci_err(pdev, "Unable to read/write to IOMMU perf counter.\n");
>> -    amd_iommu_pc_present = false;
>> -    return;
>>   }
>>   static ssize_t amd_iommu_show_cap(struct device *dev,
>>
> 

thanks,
-- Shuah
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 2/2] iommu/amd: Remove performance counter pre-initialization test
  2021-04-09  8:58   ` Suravee Suthikulpanit
@ 2021-04-09 20:00     ` Shuah Khan
  -1 siblings, 0 replies; 46+ messages in thread
From: Shuah Khan @ 2021-04-09 20:00 UTC (permalink / raw)
  To: Suravee Suthikulpanit, linux-kernel, iommu
  Cc: joro, will, jsnitsel, pmenzel, Jon.Grimm, Tj, Alexander Monakov,
	David Coe, Shuah Khan

On 4/9/21 2:58 AM, Suravee Suthikulpanit wrote:
> In early AMD desktop/mobile platforms (during 2013), when the IOMMU
> Performance Counter (PMC) support was first introduced in
> commit 30861ddc9cca ("perf/x86/amd: Add IOMMU Performance Counter
> resource management"), there was a HW bug where the counters could not
> be accessed. The result was reading of the counter always return zero.
> 
> At the time, the suggested workaround was to add a test logic prior
> to initializing the PMC feature to check if the counters can be programmed
> and read back the same value. This has been working fine until the more
> recent desktop/mobile platforms start enabling power gating for the PMC,
> which prevents access to the counters. This results in the PMC support
> being disabled unnecesarily.
> 
> Unfortunatly, there is no documentation of since which generation
> of hardware the original PMC HW bug was fixed. Although, it was fixed
> soon after the first introduction of the PMC. Base on this, we assume
> that the buggy platforms are less likely to be in used, and it should
> be relatively safe to remove this legacy logic.
> 
> Link: https://lore.kernel.org/linux-iommu/alpine.LNX.3.20.13.2006030935570.3181@monopod.intra.ispras.ru/
> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=201753
> Cc: Tj (Elloe Linux) <ml.linux@elloe.vision>
> Cc: Shuah Khan <skhan@linuxfoundation.org>
> Cc: Alexander Monakov <amonakov@ispras.ru>
> Cc: David Coe <david.coe@live.co.uk>
> Cc: Paul Menzel <pmenzel@molgen.mpg.de>
> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
> ---


Tested-by: Shuah Khan <skhan@linuxfoundation.org>

thanks,
-- Shuah

Results with this patch on AMD Ryzen 5 PRO 2400GE w/ Radeon Vega
Graphics

sudo ./perf stat -e 'amd_iommu_0/cmd_processed/, 
amd_iommu_0/cmd_processed_inv/, amd_iommu_0/ign_rd_wr_mmio_1ff8h/, 
amd_iommu_0/int_dte_hit/, amd_iommu_0/int_dte_mis/, 
amd_iommu_0/mem_dte_hit/, amd_iommu_0/mem_dte_mis/, 
amd_iommu_0/mem_iommu_tlb_pde_hit/, amd_iommu_0/mem_iommu_tlb_pde_mis/, 
amd_iommu_0/mem_iommu_tlb_pte_hit/, amd_iommu_0/mem_iommu_tlb_pte_mis/, 
amd_iommu_0/mem_pass_excl/, amd_iommu_0/mem_pass_pretrans/, 
amd_iommu_0/mem_pass_untrans/, amd_iommu_0/mem_target_abort/, 
amd_iommu_0/mem_trans_total/, amd_iommu_0/page_tbl_read_gst/, 
amd_iommu_0/page_tbl_read_nst/, amd_iommu_0/page_tbl_read_tot/, 
amd_iommu_0/smi_blk/, amd_iommu_0/smi_recv/, amd_iommu_0/tlb_inv/, 
amd_iommu_0/vapic_int_guest/, amd_iommu_0/vapic_int_non_guest/' sleep 10

  Performance counter stats for 'system wide':

                156      amd_iommu_0/cmd_processed/ 
                 (33.30%)
                 80       amd_iommu_0/cmd_processed_inv/ 
                      (33.38%)
                  0       amd_iommu_0/ign_rd_wr_mmio_1ff8h/ 
                         (33.40%)
                  0       amd_iommu_0/int_dte_hit/ 
                (33.43%)
                325       amd_iommu_0/int_dte_mis/ 
                (33.44%)
              1,951       amd_iommu_0/mem_dte_hit/ 
                (33.45%)
              7,589       amd_iommu_0/mem_dte_mis/ 
                (33.49%)
                325       amd_iommu_0/mem_iommu_tlb_pde_hit/ 
                          (33.45%)
              2,460       amd_iommu_0/mem_iommu_tlb_pde_mis/ 
                          (33.41%)
              2,510       amd_iommu_0/mem_iommu_tlb_pte_hit/ 
                          (33.38%)
              5,526       amd_iommu_0/mem_iommu_tlb_pte_mis/ 
                          (33.33%)
                  0       amd_iommu_0/mem_pass_excl/ 
                  (33.29%)
                  0       amd_iommu_0/mem_pass_pretrans/ 
                      (33.28%)
              1,556       amd_iommu_0/mem_pass_untrans/ 
                     (33.27%)
                  0       amd_iommu_0/mem_target_abort/ 
                     (33.26%)
              3,112       amd_iommu_0/mem_trans_total/ 
                    (33.29%)
                  0       amd_iommu_0/page_tbl_read_gst/ 
                      (33.29%)
              1,813       amd_iommu_0/page_tbl_read_nst/ 
                      (33.25%)
              2,242       amd_iommu_0/page_tbl_read_tot/ 
                      (33.27%)
                  0       amd_iommu_0/smi_blk/ 
                (33.29%)
                  0       amd_iommu_0/smi_recv/ 
                (33.28%)
                  0       amd_iommu_0/tlb_inv/ 
                (33.28%)
                  0       amd_iommu_0/vapic_int_guest/ 
                    (33.25%)
                  0       amd_iommu_0/vapic_int_non_guest/ 
                        (33.26%)

       10.003200316 seconds time elapsed


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 2/2] iommu/amd: Remove performance counter pre-initialization test
@ 2021-04-09 20:00     ` Shuah Khan
  0 siblings, 0 replies; 46+ messages in thread
From: Shuah Khan @ 2021-04-09 20:00 UTC (permalink / raw)
  To: Suravee Suthikulpanit, linux-kernel, iommu
  Cc: pmenzel, Alexander Monakov, David Coe, Jon.Grimm, Shuah Khan, Tj, will

On 4/9/21 2:58 AM, Suravee Suthikulpanit wrote:
> In early AMD desktop/mobile platforms (during 2013), when the IOMMU
> Performance Counter (PMC) support was first introduced in
> commit 30861ddc9cca ("perf/x86/amd: Add IOMMU Performance Counter
> resource management"), there was a HW bug where the counters could not
> be accessed. The result was reading of the counter always return zero.
> 
> At the time, the suggested workaround was to add a test logic prior
> to initializing the PMC feature to check if the counters can be programmed
> and read back the same value. This has been working fine until the more
> recent desktop/mobile platforms start enabling power gating for the PMC,
> which prevents access to the counters. This results in the PMC support
> being disabled unnecesarily.
> 
> Unfortunatly, there is no documentation of since which generation
> of hardware the original PMC HW bug was fixed. Although, it was fixed
> soon after the first introduction of the PMC. Base on this, we assume
> that the buggy platforms are less likely to be in used, and it should
> be relatively safe to remove this legacy logic.
> 
> Link: https://lore.kernel.org/linux-iommu/alpine.LNX.3.20.13.2006030935570.3181@monopod.intra.ispras.ru/
> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=201753
> Cc: Tj (Elloe Linux) <ml.linux@elloe.vision>
> Cc: Shuah Khan <skhan@linuxfoundation.org>
> Cc: Alexander Monakov <amonakov@ispras.ru>
> Cc: David Coe <david.coe@live.co.uk>
> Cc: Paul Menzel <pmenzel@molgen.mpg.de>
> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
> ---


Tested-by: Shuah Khan <skhan@linuxfoundation.org>

thanks,
-- Shuah

Results with this patch on AMD Ryzen 5 PRO 2400GE w/ Radeon Vega
Graphics

sudo ./perf stat -e 'amd_iommu_0/cmd_processed/, 
amd_iommu_0/cmd_processed_inv/, amd_iommu_0/ign_rd_wr_mmio_1ff8h/, 
amd_iommu_0/int_dte_hit/, amd_iommu_0/int_dte_mis/, 
amd_iommu_0/mem_dte_hit/, amd_iommu_0/mem_dte_mis/, 
amd_iommu_0/mem_iommu_tlb_pde_hit/, amd_iommu_0/mem_iommu_tlb_pde_mis/, 
amd_iommu_0/mem_iommu_tlb_pte_hit/, amd_iommu_0/mem_iommu_tlb_pte_mis/, 
amd_iommu_0/mem_pass_excl/, amd_iommu_0/mem_pass_pretrans/, 
amd_iommu_0/mem_pass_untrans/, amd_iommu_0/mem_target_abort/, 
amd_iommu_0/mem_trans_total/, amd_iommu_0/page_tbl_read_gst/, 
amd_iommu_0/page_tbl_read_nst/, amd_iommu_0/page_tbl_read_tot/, 
amd_iommu_0/smi_blk/, amd_iommu_0/smi_recv/, amd_iommu_0/tlb_inv/, 
amd_iommu_0/vapic_int_guest/, amd_iommu_0/vapic_int_non_guest/' sleep 10

  Performance counter stats for 'system wide':

                156      amd_iommu_0/cmd_processed/ 
                 (33.30%)
                 80       amd_iommu_0/cmd_processed_inv/ 
                      (33.38%)
                  0       amd_iommu_0/ign_rd_wr_mmio_1ff8h/ 
                         (33.40%)
                  0       amd_iommu_0/int_dte_hit/ 
                (33.43%)
                325       amd_iommu_0/int_dte_mis/ 
                (33.44%)
              1,951       amd_iommu_0/mem_dte_hit/ 
                (33.45%)
              7,589       amd_iommu_0/mem_dte_mis/ 
                (33.49%)
                325       amd_iommu_0/mem_iommu_tlb_pde_hit/ 
                          (33.45%)
              2,460       amd_iommu_0/mem_iommu_tlb_pde_mis/ 
                          (33.41%)
              2,510       amd_iommu_0/mem_iommu_tlb_pte_hit/ 
                          (33.38%)
              5,526       amd_iommu_0/mem_iommu_tlb_pte_mis/ 
                          (33.33%)
                  0       amd_iommu_0/mem_pass_excl/ 
                  (33.29%)
                  0       amd_iommu_0/mem_pass_pretrans/ 
                      (33.28%)
              1,556       amd_iommu_0/mem_pass_untrans/ 
                     (33.27%)
                  0       amd_iommu_0/mem_target_abort/ 
                     (33.26%)
              3,112       amd_iommu_0/mem_trans_total/ 
                    (33.29%)
                  0       amd_iommu_0/page_tbl_read_gst/ 
                      (33.29%)
              1,813       amd_iommu_0/page_tbl_read_nst/ 
                      (33.25%)
              2,242       amd_iommu_0/page_tbl_read_tot/ 
                      (33.27%)
                  0       amd_iommu_0/smi_blk/ 
                (33.29%)
                  0       amd_iommu_0/smi_recv/ 
                (33.28%)
                  0       amd_iommu_0/tlb_inv/ 
                (33.28%)
                  0       amd_iommu_0/vapic_int_guest/ 
                    (33.25%)
                  0       amd_iommu_0/vapic_int_non_guest/ 
                        (33.26%)

       10.003200316 seconds time elapsed

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 2/2] iommu/amd: Remove performance counter pre-initialization test
  2021-04-09  8:58   ` Suravee Suthikulpanit
@ 2021-04-09 20:11     ` David Coe
  -1 siblings, 0 replies; 46+ messages in thread
From: David Coe @ 2021-04-09 20:11 UTC (permalink / raw)
  To: Suravee Suthikulpanit, linux-kernel, iommu
  Cc: joro, will, jsnitsel, pmenzel, Jon.Grimm, Tj, Shuah Khan,
	Alexander Monakov, Alex Hung

On 09/04/2021 09:58, Suravee Suthikulpanit wrote:
> In early AMD desktop/mobile platforms (during 2013), when the IOMMU
> Performance Counter (PMC) support was first introduced in
> commit 30861ddc9cca ("perf/x86/amd: Add IOMMU Performance Counter
> resource management"), there was a HW bug where the counters could not
> be accessed. The result was reading of the counter always return zero.
> 
> At the time, the suggested workaround was to add a test logic prior
> to initializing the PMC feature to check if the counters can be programmed
> and read back the same value. This has been working fine until the more
> recent desktop/mobile platforms start enabling power gating for the PMC,
> which prevents access to the counters. This results in the PMC support
> being disabled unnecesarily.
> 
> Unfortunatly, there is no documentation of since which generation
> of hardware the original PMC HW bug was fixed. Although, it was fixed
> soon after the first introduction of the PMC. Base on this, we assume
> that the buggy platforms are less likely to be in used, and it should
> be relatively safe to remove this legacy logic.

Thanks for explaining the 'context', Suravee.

> Link: https://lore.kernel.org/linux-iommu/alpine.LNX.3.20.13.2006030935570.3181@monopod.intra.ispras.ru/
> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=201753
> Cc: Tj (Elloe Linux) <ml.linux@elloe.vision>
> Cc: Shuah Khan <skhan@linuxfoundation.org>
> Cc: Alexander Monakov <amonakov@ispras.ru>
> Cc: David Coe <david.coe@live.co.uk>
> Cc: Paul Menzel <pmenzel@molgen.mpg.de>
> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
> ---
>   drivers/iommu/amd/init.c | 24 +-----------------------
>   1 file changed, 1 insertion(+), 23 deletions(-)
> 
> diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
> index 648cdfd03074..247cdda5d683 100644
> --- a/drivers/iommu/amd/init.c
> +++ b/drivers/iommu/amd/init.c
> @@ -1714,33 +1714,16 @@ static int __init init_iommu_all(struct acpi_table_header *table)
>   	return 0;
>   }
>   
> -static int iommu_pc_get_set_reg(struct amd_iommu *iommu, u8 bank, u8 cntr,
> -				u8 fxn, u64 *value, bool is_write);
> -
>   static void init_iommu_perf_ctr(struct amd_iommu *iommu)
>   {
> +	u64 val;
>   	struct pci_dev *pdev = iommu->dev;
> -	u64 val = 0xabcd, val2 = 0, save_reg = 0;
>   
>   	if (!iommu_feature(iommu, FEATURE_PC))
>   		return;
>   
>   	amd_iommu_pc_present = true;
>   
> -	/* save the value to restore, if writable */
> -	if (iommu_pc_get_set_reg(iommu, 0, 0, 0, &save_reg, false))
> -		goto pc_false;
> -
> -	/* Check if the performance counters can be written to */
> -	if ((iommu_pc_get_set_reg(iommu, 0, 0, 0, &val, true)) ||
> -	    (iommu_pc_get_set_reg(iommu, 0, 0, 0, &val2, false)) ||
> -	    (val != val2))
> -		goto pc_false;
> -
> -	/* restore */
> -	if (iommu_pc_get_set_reg(iommu, 0, 0, 0, &save_reg, true))
> -		goto pc_false;
> -
>   	pci_info(pdev, "IOMMU performance counters supported\n");
>   
>   	val = readl(iommu->mmio_base + MMIO_CNTR_CONF_OFFSET);
> @@ -1748,11 +1731,6 @@ static void init_iommu_perf_ctr(struct amd_iommu *iommu)
>   	iommu->max_counters = (u8) ((val >> 7) & 0xf);
>   
>   	return;
> -
> -pc_false:
> -	pci_err(pdev, "Unable to read/write to IOMMU perf counter.\n");
> -	amd_iommu_pc_present = false;
> -	return;
>   }
>   
>   static ssize_t amd_iommu_show_cap(struct device *dev,
> 

I'll test your revert + update IOMMU patch on my Ryzen 2400G and 4700U 
most likely over the weekend. Very interesting!

Please be aware that your original IOMMU patch has already reached the 
imminent release of Ubuntu 21.04 (Hirsute). I've taken the liberty of 
adding Alex Hung (lead kernel developer at Ubuntu) to the circulation list.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 2/2] iommu/amd: Remove performance counter pre-initialization test
@ 2021-04-09 20:11     ` David Coe
  0 siblings, 0 replies; 46+ messages in thread
From: David Coe @ 2021-04-09 20:11 UTC (permalink / raw)
  To: Suravee Suthikulpanit, linux-kernel, iommu
  Cc: pmenzel, Alexander Monakov, Alex Hung, Jon.Grimm, Shuah Khan, Tj, will

On 09/04/2021 09:58, Suravee Suthikulpanit wrote:
> In early AMD desktop/mobile platforms (during 2013), when the IOMMU
> Performance Counter (PMC) support was first introduced in
> commit 30861ddc9cca ("perf/x86/amd: Add IOMMU Performance Counter
> resource management"), there was a HW bug where the counters could not
> be accessed. The result was reading of the counter always return zero.
> 
> At the time, the suggested workaround was to add a test logic prior
> to initializing the PMC feature to check if the counters can be programmed
> and read back the same value. This has been working fine until the more
> recent desktop/mobile platforms start enabling power gating for the PMC,
> which prevents access to the counters. This results in the PMC support
> being disabled unnecesarily.
> 
> Unfortunatly, there is no documentation of since which generation
> of hardware the original PMC HW bug was fixed. Although, it was fixed
> soon after the first introduction of the PMC. Base on this, we assume
> that the buggy platforms are less likely to be in used, and it should
> be relatively safe to remove this legacy logic.

Thanks for explaining the 'context', Suravee.

> Link: https://lore.kernel.org/linux-iommu/alpine.LNX.3.20.13.2006030935570.3181@monopod.intra.ispras.ru/
> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=201753
> Cc: Tj (Elloe Linux) <ml.linux@elloe.vision>
> Cc: Shuah Khan <skhan@linuxfoundation.org>
> Cc: Alexander Monakov <amonakov@ispras.ru>
> Cc: David Coe <david.coe@live.co.uk>
> Cc: Paul Menzel <pmenzel@molgen.mpg.de>
> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
> ---
>   drivers/iommu/amd/init.c | 24 +-----------------------
>   1 file changed, 1 insertion(+), 23 deletions(-)
> 
> diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
> index 648cdfd03074..247cdda5d683 100644
> --- a/drivers/iommu/amd/init.c
> +++ b/drivers/iommu/amd/init.c
> @@ -1714,33 +1714,16 @@ static int __init init_iommu_all(struct acpi_table_header *table)
>   	return 0;
>   }
>   
> -static int iommu_pc_get_set_reg(struct amd_iommu *iommu, u8 bank, u8 cntr,
> -				u8 fxn, u64 *value, bool is_write);
> -
>   static void init_iommu_perf_ctr(struct amd_iommu *iommu)
>   {
> +	u64 val;
>   	struct pci_dev *pdev = iommu->dev;
> -	u64 val = 0xabcd, val2 = 0, save_reg = 0;
>   
>   	if (!iommu_feature(iommu, FEATURE_PC))
>   		return;
>   
>   	amd_iommu_pc_present = true;
>   
> -	/* save the value to restore, if writable */
> -	if (iommu_pc_get_set_reg(iommu, 0, 0, 0, &save_reg, false))
> -		goto pc_false;
> -
> -	/* Check if the performance counters can be written to */
> -	if ((iommu_pc_get_set_reg(iommu, 0, 0, 0, &val, true)) ||
> -	    (iommu_pc_get_set_reg(iommu, 0, 0, 0, &val2, false)) ||
> -	    (val != val2))
> -		goto pc_false;
> -
> -	/* restore */
> -	if (iommu_pc_get_set_reg(iommu, 0, 0, 0, &save_reg, true))
> -		goto pc_false;
> -
>   	pci_info(pdev, "IOMMU performance counters supported\n");
>   
>   	val = readl(iommu->mmio_base + MMIO_CNTR_CONF_OFFSET);
> @@ -1748,11 +1731,6 @@ static void init_iommu_perf_ctr(struct amd_iommu *iommu)
>   	iommu->max_counters = (u8) ((val >> 7) & 0xf);
>   
>   	return;
> -
> -pc_false:
> -	pci_err(pdev, "Unable to read/write to IOMMU perf counter.\n");
> -	amd_iommu_pc_present = false;
> -	return;
>   }
>   
>   static ssize_t amd_iommu_show_cap(struct device *dev,
> 

I'll test your revert + update IOMMU patch on my Ryzen 2400G and 4700U 
most likely over the weekend. Very interesting!

Please be aware that your original IOMMU patch has already reached the 
imminent release of Ubuntu 21.04 (Hirsute). I've taken the liberty of 
adding Alex Hung (lead kernel developer at Ubuntu) to the circulation list.
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 2/2] iommu/amd: Remove performance counter pre-initialization test
  2021-04-09 20:00     ` Shuah Khan
@ 2021-04-09 20:19       ` Shuah Khan
  -1 siblings, 0 replies; 46+ messages in thread
From: Shuah Khan @ 2021-04-09 20:19 UTC (permalink / raw)
  To: Suravee Suthikulpanit, linux-kernel, iommu
  Cc: joro, will, jsnitsel, pmenzel, Jon.Grimm, Tj, Alexander Monakov,
	David Coe, Shuah Khan, 1917203

On 4/9/21 2:00 PM, Shuah Khan wrote:
> On 4/9/21 2:58 AM, Suravee Suthikulpanit wrote:
>> In early AMD desktop/mobile platforms (during 2013), when the IOMMU
>> Performance Counter (PMC) support was first introduced in
>> commit 30861ddc9cca ("perf/x86/amd: Add IOMMU Performance Counter
>> resource management"), there was a HW bug where the counters could not
>> be accessed. The result was reading of the counter always return zero.
>>
>> At the time, the suggested workaround was to add a test logic prior
>> to initializing the PMC feature to check if the counters can be 
>> programmed
>> and read back the same value. This has been working fine until the more
>> recent desktop/mobile platforms start enabling power gating for the PMC,
>> which prevents access to the counters. This results in the PMC support
>> being disabled unnecesarily.
>>
>> Unfortunatly, there is no documentation of since which generation
>> of hardware the original PMC HW bug was fixed. Although, it was fixed
>> soon after the first introduction of the PMC. Base on this, we assume
>> that the buggy platforms are less likely to be in used, and it should
>> be relatively safe to remove this legacy logic.
>>
>> Link: 
>> https://lore.kernel.org/linux-iommu/alpine.LNX.3.20.13.2006030935570.3181@monopod.intra.ispras.ru/ 
>>
>> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=201753
>> Cc: Tj (Elloe Linux) <ml.linux@elloe.vision>
>> Cc: Shuah Khan <skhan@linuxfoundation.org>
>> Cc: Alexander Monakov <amonakov@ispras.ru>
>> Cc: David Coe <david.coe@live.co.uk>
>> Cc: Paul Menzel <pmenzel@molgen.mpg.de>
>> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
>> ---
> 
> 
> Tested-by: Shuah Khan <skhan@linuxfoundation.org>
> 

Revert + this patch - same as my test on Ryzen 5

On AMD Ryzen 7 4700G with Radeon Graphics

These look real odd to me. Let me know if I should look further.

sudo ./perf stat -e 'amd_iommu_0/cmd_processed/, 
amd_iommu_0/cmd_processed_inv/, amd_iommu_0/ign_rd_wr_mmio_1ff8h/, 
amd_iommu_0/int_dte_hit/, amd_iommu_0/int_dte_mis/, 
amd_iommu_0/mem_dte_hit/, amd_iommu_0/mem_dte_mis/, 
amd_iommu_0/mem_iommu_tlb_pde_hit/, amd_iommu_0/mem_iommu_tlb_pde_mis/, 
amd_iommu_0/mem_iommu_tlb_pte_hit/, amd_iommu_0/mem_iommu_tlb_pte_mis/, 
amd_iommu_0/mem_pass_excl/, amd_iommu_0/mem_pass_pretrans/, 
amd_iommu_0/mem_pass_untrans/, amd_iommu_0/mem_target_abort/, 
amd_iommu_0/mem_trans_total/, amd_iommu_0/page_tbl_read_gst/, 
amd_iommu_0/page_tbl_read_nst/, amd_iommu_0/page_tbl_read_tot/, 
amd_iommu_0/smi_blk/, amd_iommu_0/smi_recv/, amd_iommu_0/tlb_inv/, 
amd_iommu_0/vapic_int_guest/, amd_iommu_0/vapic_int_non_guest/' sleep 10

  Performance counter stats for 'system wide':

17,761,952,514,865,374      amd_iommu_0/cmd_processed/ 
                    (33.28%)
18,582,155,570,607,472       amd_iommu_0/cmd_processed_inv/ 
                         (33.32%)
                  0       amd_iommu_0/ign_rd_wr_mmio_1ff8h/ 
                         (33.36%)
5,056,087,645,262,255       amd_iommu_0/int_dte_hit/ 
                  (33.40%)
32,831,106,446,308,888       amd_iommu_0/int_dte_mis/ 
                   (33.44%)
13,461,819,655,591,296       amd_iommu_0/mem_dte_hit/ 
                   (33.45%)
208,555,436,221,050,464       amd_iommu_0/mem_dte_mis/ 
                    (33.47%)
196,824,154,635,609,888       amd_iommu_0/mem_iommu_tlb_pde_hit/ 
                              (33.46%)
193,552,630,440,410,144       amd_iommu_0/mem_iommu_tlb_pde_mis/ 
                              (33.45%)
176,936,647,809,098,368       amd_iommu_0/mem_iommu_tlb_pte_hit/ 
                              (33.41%)
184,737,401,623,626,464       amd_iommu_0/mem_iommu_tlb_pte_mis/ 
                              (33.37%)
                  0       amd_iommu_0/mem_pass_excl/ 
                  (33.33%)
                  0       amd_iommu_0/mem_pass_pretrans/ 
                      (33.30%)
                  0       amd_iommu_0/mem_pass_untrans/ 
                     (33.28%)
                  0       amd_iommu_0/mem_target_abort/ 
                     (33.27%)
245,383,212,924,004,288       amd_iommu_0/mem_trans_total/ 
                        (33.27%)
                  0       amd_iommu_0/page_tbl_read_gst/ 
                      (33.28%)
262,267,045,917,967,264       amd_iommu_0/page_tbl_read_nst/ 
                          (33.27%)
256,308,216,913,137,600       amd_iommu_0/page_tbl_read_tot/ 
                          (33.28%)
                  0       amd_iommu_0/smi_blk/ 
                (33.27%)
                  0       amd_iommu_0/smi_recv/ 
                (33.27%)
                  0       amd_iommu_0/tlb_inv/ 
                (33.27%)
                  0       amd_iommu_0/vapic_int_guest/ 
                    (33.26%)
38,913,544,420,579,888       amd_iommu_0/vapic_int_non_guest/ 
                           (33.27%)

       10.003967760 seconds time elapsed

thanks,
-- Shuah

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 2/2] iommu/amd: Remove performance counter pre-initialization test
@ 2021-04-09 20:19       ` Shuah Khan
  0 siblings, 0 replies; 46+ messages in thread
From: Shuah Khan @ 2021-04-09 20:19 UTC (permalink / raw)
  To: Suravee Suthikulpanit, linux-kernel, iommu
  Cc: pmenzel, Alexander Monakov, David Coe, Jon.Grimm, Shuah Khan, Tj,
	1917203, will

On 4/9/21 2:00 PM, Shuah Khan wrote:
> On 4/9/21 2:58 AM, Suravee Suthikulpanit wrote:
>> In early AMD desktop/mobile platforms (during 2013), when the IOMMU
>> Performance Counter (PMC) support was first introduced in
>> commit 30861ddc9cca ("perf/x86/amd: Add IOMMU Performance Counter
>> resource management"), there was a HW bug where the counters could not
>> be accessed. The result was reading of the counter always return zero.
>>
>> At the time, the suggested workaround was to add a test logic prior
>> to initializing the PMC feature to check if the counters can be 
>> programmed
>> and read back the same value. This has been working fine until the more
>> recent desktop/mobile platforms start enabling power gating for the PMC,
>> which prevents access to the counters. This results in the PMC support
>> being disabled unnecesarily.
>>
>> Unfortunatly, there is no documentation of since which generation
>> of hardware the original PMC HW bug was fixed. Although, it was fixed
>> soon after the first introduction of the PMC. Base on this, we assume
>> that the buggy platforms are less likely to be in used, and it should
>> be relatively safe to remove this legacy logic.
>>
>> Link: 
>> https://lore.kernel.org/linux-iommu/alpine.LNX.3.20.13.2006030935570.3181@monopod.intra.ispras.ru/ 
>>
>> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=201753
>> Cc: Tj (Elloe Linux) <ml.linux@elloe.vision>
>> Cc: Shuah Khan <skhan@linuxfoundation.org>
>> Cc: Alexander Monakov <amonakov@ispras.ru>
>> Cc: David Coe <david.coe@live.co.uk>
>> Cc: Paul Menzel <pmenzel@molgen.mpg.de>
>> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
>> ---
> 
> 
> Tested-by: Shuah Khan <skhan@linuxfoundation.org>
> 

Revert + this patch - same as my test on Ryzen 5

On AMD Ryzen 7 4700G with Radeon Graphics

These look real odd to me. Let me know if I should look further.

sudo ./perf stat -e 'amd_iommu_0/cmd_processed/, 
amd_iommu_0/cmd_processed_inv/, amd_iommu_0/ign_rd_wr_mmio_1ff8h/, 
amd_iommu_0/int_dte_hit/, amd_iommu_0/int_dte_mis/, 
amd_iommu_0/mem_dte_hit/, amd_iommu_0/mem_dte_mis/, 
amd_iommu_0/mem_iommu_tlb_pde_hit/, amd_iommu_0/mem_iommu_tlb_pde_mis/, 
amd_iommu_0/mem_iommu_tlb_pte_hit/, amd_iommu_0/mem_iommu_tlb_pte_mis/, 
amd_iommu_0/mem_pass_excl/, amd_iommu_0/mem_pass_pretrans/, 
amd_iommu_0/mem_pass_untrans/, amd_iommu_0/mem_target_abort/, 
amd_iommu_0/mem_trans_total/, amd_iommu_0/page_tbl_read_gst/, 
amd_iommu_0/page_tbl_read_nst/, amd_iommu_0/page_tbl_read_tot/, 
amd_iommu_0/smi_blk/, amd_iommu_0/smi_recv/, amd_iommu_0/tlb_inv/, 
amd_iommu_0/vapic_int_guest/, amd_iommu_0/vapic_int_non_guest/' sleep 10

  Performance counter stats for 'system wide':

17,761,952,514,865,374      amd_iommu_0/cmd_processed/ 
                    (33.28%)
18,582,155,570,607,472       amd_iommu_0/cmd_processed_inv/ 
                         (33.32%)
                  0       amd_iommu_0/ign_rd_wr_mmio_1ff8h/ 
                         (33.36%)
5,056,087,645,262,255       amd_iommu_0/int_dte_hit/ 
                  (33.40%)
32,831,106,446,308,888       amd_iommu_0/int_dte_mis/ 
                   (33.44%)
13,461,819,655,591,296       amd_iommu_0/mem_dte_hit/ 
                   (33.45%)
208,555,436,221,050,464       amd_iommu_0/mem_dte_mis/ 
                    (33.47%)
196,824,154,635,609,888       amd_iommu_0/mem_iommu_tlb_pde_hit/ 
                              (33.46%)
193,552,630,440,410,144       amd_iommu_0/mem_iommu_tlb_pde_mis/ 
                              (33.45%)
176,936,647,809,098,368       amd_iommu_0/mem_iommu_tlb_pte_hit/ 
                              (33.41%)
184,737,401,623,626,464       amd_iommu_0/mem_iommu_tlb_pte_mis/ 
                              (33.37%)
                  0       amd_iommu_0/mem_pass_excl/ 
                  (33.33%)
                  0       amd_iommu_0/mem_pass_pretrans/ 
                      (33.30%)
                  0       amd_iommu_0/mem_pass_untrans/ 
                     (33.28%)
                  0       amd_iommu_0/mem_target_abort/ 
                     (33.27%)
245,383,212,924,004,288       amd_iommu_0/mem_trans_total/ 
                        (33.27%)
                  0       amd_iommu_0/page_tbl_read_gst/ 
                      (33.28%)
262,267,045,917,967,264       amd_iommu_0/page_tbl_read_nst/ 
                          (33.27%)
256,308,216,913,137,600       amd_iommu_0/page_tbl_read_tot/ 
                          (33.28%)
                  0       amd_iommu_0/smi_blk/ 
                (33.27%)
                  0       amd_iommu_0/smi_recv/ 
                (33.27%)
                  0       amd_iommu_0/tlb_inv/ 
                (33.27%)
                  0       amd_iommu_0/vapic_int_guest/ 
                    (33.26%)
38,913,544,420,579,888       amd_iommu_0/vapic_int_non_guest/ 
                           (33.27%)

       10.003967760 seconds time elapsed

thanks,
-- Shuah
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 2/2] iommu/amd: Remove performance counter pre-initialization test
  2021-04-09  8:58   ` Suravee Suthikulpanit
@ 2021-04-10  8:17     ` David Coe
  -1 siblings, 0 replies; 46+ messages in thread
From: David Coe @ 2021-04-10  8:17 UTC (permalink / raw)
  To: Suravee Suthikulpanit, linux-kernel, iommu
  Cc: joro, will, jsnitsel, pmenzel, Jon.Grimm, Tj, Shuah Khan,
	Alexander Monakov, Alex Hung

Well, well! This is a promising start to the weekend. Thank you both, 
Suravee and Paul.

Results for AMD Ryzen 4 2400G
   running Ubuntu 21.04β kernel 5.11.0-13 and Windows 10 2H2 under KVM

$ sudo dmesg | grep IOMMU
[    0.557725] pci 0000:00:00.2: AMD-Vi: IOMMU performance counters 
supported
[    0.561538] pci 0000:00:00.2: AMD-Vi: Found IOMMU cap 0x40
[    0.562828] perf/amd_iommu: Detected AMD IOMMU #0 (2 banks, 4 
counters/bank).
[    0.881108] AMD-Vi: AMD IOMMUv2 driver by Joerg Roedel <jroedel@suse.de>

$ systool -m kvm_amd -v
Module = "kvm_amd"

   Attributes:
     coresize            = "114688"
     initsize            = "0"
     initstate           = "live"
     refcnt              = "0"
     srcversion          = "4371BA17A41823101F90761"
     taint               = ""
     uevent              = <store method only>

   Parameters:
     avic                = "0"
     dump_invalid_vmcb   = "N"
     nested              = "1"
     npt                 = "1"
     nrips               = "1"
     pause_filter_count_grow= "2"
     pause_filter_count_max= "65535"
     pause_filter_count_shrink= "0"
     pause_filter_count  = "3000"
     pause_filter_thresh = "128"
     sev_es              = "0"
     sev                 = "0"
     vgif                = "1"
     vls                 = "1"

   Sections:

$ compgen -G "/sys/kernel/iommu_groups/*/devices/*"
/sys/kernel/iommu_groups/9/devices/0000:09:00.0
/sys/kernel/iommu_groups/0/devices/0000:00:01.0
/sys/kernel/iommu_groups/10/devices/0000:0a:00.0
/sys/kernel/iommu_groups/2/devices/0000:00:01.6
/sys/kernel/iommu_groups/12/devices/0000:0b:00.0
/sys/kernel/iommu_groups/4/devices/0000:00:08.1
/sys/kernel/iommu_groups/6/devices/0000:00:14.0
/sys/kernel/iommu_groups/6/devices/0000:00:14.3
/sys/kernel/iommu_groups/8/devices/0000:02:04.0
/sys/kernel/iommu_groups/8/devices/0000:02:01.0
/sys/kernel/iommu_groups/8/devices/0000:04:00.0
/sys/kernel/iommu_groups/8/devices/0000:01:00.1
/sys/kernel/iommu_groups/8/devices/0000:02:06.0
/sys/kernel/iommu_groups/8/devices/0000:01:00.0
/sys/kernel/iommu_groups/8/devices/0000:01:00.2
/sys/kernel/iommu_groups/8/devices/0000:08:00.0
/sys/kernel/iommu_groups/8/devices/0000:02:00.0
/sys/kernel/iommu_groups/8/devices/0000:02:07.0
/sys/kernel/iommu_groups/8/devices/0000:03:00.0
/sys/kernel/iommu_groups/1/devices/0000:00:01.2
/sys/kernel/iommu_groups/11/devices/0000:0a:00.1
/sys/kernel/iommu_groups/11/devices/0000:0a:00.3
/sys/kernel/iommu_groups/11/devices/0000:0a:00.2
/sys/kernel/iommu_groups/11/devices/0000:0a:00.4
/sys/kernel/iommu_groups/11/devices/0000:0a:00.6
/sys/kernel/iommu_groups/3/devices/0000:00:08.0
/sys/kernel/iommu_groups/5/devices/0000:00:08.2
/sys/kernel/iommu_groups/7/devices/0000:00:18.5
/sys/kernel/iommu_groups/7/devices/0000:00:18.7
/sys/kernel/iommu_groups/7/devices/0000:00:18.0
/sys/kernel/iommu_groups/7/devices/0000:00:18.2
/sys/kernel/iommu_groups/7/devices/0000:00:18.4
/sys/kernel/iommu_groups/7/devices/0000:00:18.6
/sys/kernel/iommu_groups/7/devices/0000:00:18.1
/sys/kernel/iommu_groups/7/devices/0000:00:18.3

$ sudo kvm-ok
INFO: /dev/kvm exists
KVM acceleration can be used

$ sudo perf stat -e 'amd_iommu_0/cmd_processed/, 
amd_iommu_0/cmd_processed_inv/, amd_iommu_0/ign_rd_wr_mmio_1ff8h/, 
amd_iommu_0/int_dte_hit/, amd_iommu_0/int_dte_mis/, 
amd_iommu_0/mem_dte_hit/, amd_iommu_0/mem_dte_mis/, 
amd_iommu_0/mem_iommu_tlb_pde_hit/, amd_iommu_0/mem_iommu_tlb_pde_mis/, 
amd_iommu_0/mem_iommu_tlb_pte_hit/, amd_iommu_0/mem_iommu_tlb_pte_mis/, 
amd_iommu_0/mem_pass_excl/, amd_iommu_0/mem_pass_pretrans/, 
amd_iommu_0/mem_pass_untrans/, amd_iommu_0/mem_target_abort/, 
amd_iommu_0/mem_trans_total/, amd_iommu_0/page_tbl_read_gst/, 
amd_iommu_0/page_tbl_read_nst/, amd_iommu_0/page_tbl_read_tot/, 
amd_iommu_0/smi_blk/, amd_iommu_0/smi_recv/, amd_iommu_0/tlb_inv/, 
amd_iommu_0/vapic_int_guest/, amd_iommu_0/vapic_int_non_guest/' sleep 10

Performance counter stats for 'system wide':

                   0       amd_iommu_0/cmd_processed/            (33.32%)
                   2       amd_iommu_0/cmd_processed_inv/        (33.35%)
                   0       amd_iommu_0/ign_rd_wr_mmio_1ff8h/     (33.39%)
                 340       amd_iommu_0/int_dte_hit/              (33.43%)
                  23       amd_iommu_0/int_dte_mis/              (33.44%)
                 556       amd_iommu_0/mem_dte_hit/              (33.44%)
841,689,151,202,939       amd_iommu_0/mem_dte_mis/              (33.44%)
                  74       amd_iommu_0/mem_iommu_tlb_pde_hit/    (33.44%)
                 502       amd_iommu_0/mem_iommu_tlb_pde_mis/    (33.41%)
               1,195       amd_iommu_0/mem_iommu_tlb_pte_hit/    (33.36%)
               8,017       amd_iommu_0/mem_iommu_tlb_pte_mis/    (33.33%)
                   0       amd_iommu_0/mem_pass_excl/            (33.29%)
                   0       amd_iommu_0/mem_pass_pretrans/        (33.28%)
              16,504       amd_iommu_0/mem_pass_untrans/         (33.28%)
                   0       amd_iommu_0/mem_target_abort/         (33.28%)
               2,842       amd_iommu_0/mem_trans_total/          (33.28%)
                   0       amd_iommu_0/page_tbl_read_gst/        (33.28%)
                 111       amd_iommu_0/page_tbl_read_nst/        (33.29%)
                 111       amd_iommu_0/page_tbl_read_tot/        (33.28%)
                   0       amd_iommu_0/smi_blk/                  (33.28%)
                   0       amd_iommu_0/smi_recv/                 (33.29%)
                   0       amd_iommu_0/tlb_inv/                  (33.28%)
                   0       amd_iommu_0/vapic_int_guest/          (33.28%)
                 345       amd_iommu_0/vapic_int_non_guest/      (33.29%)

        10.000799128 seconds time elapsed

Results for Ryzen 7 4700U to follow.

-- 
David Coe

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 2/2] iommu/amd: Remove performance counter pre-initialization test
@ 2021-04-10  8:17     ` David Coe
  0 siblings, 0 replies; 46+ messages in thread
From: David Coe @ 2021-04-10  8:17 UTC (permalink / raw)
  To: Suravee Suthikulpanit, linux-kernel, iommu
  Cc: pmenzel, Alexander Monakov, Alex Hung, Jon.Grimm, Shuah Khan, Tj, will

Well, well! This is a promising start to the weekend. Thank you both, 
Suravee and Paul.

Results for AMD Ryzen 4 2400G
   running Ubuntu 21.04β kernel 5.11.0-13 and Windows 10 2H2 under KVM

$ sudo dmesg | grep IOMMU
[    0.557725] pci 0000:00:00.2: AMD-Vi: IOMMU performance counters 
supported
[    0.561538] pci 0000:00:00.2: AMD-Vi: Found IOMMU cap 0x40
[    0.562828] perf/amd_iommu: Detected AMD IOMMU #0 (2 banks, 4 
counters/bank).
[    0.881108] AMD-Vi: AMD IOMMUv2 driver by Joerg Roedel <jroedel@suse.de>

$ systool -m kvm_amd -v
Module = "kvm_amd"

   Attributes:
     coresize            = "114688"
     initsize            = "0"
     initstate           = "live"
     refcnt              = "0"
     srcversion          = "4371BA17A41823101F90761"
     taint               = ""
     uevent              = <store method only>

   Parameters:
     avic                = "0"
     dump_invalid_vmcb   = "N"
     nested              = "1"
     npt                 = "1"
     nrips               = "1"
     pause_filter_count_grow= "2"
     pause_filter_count_max= "65535"
     pause_filter_count_shrink= "0"
     pause_filter_count  = "3000"
     pause_filter_thresh = "128"
     sev_es              = "0"
     sev                 = "0"
     vgif                = "1"
     vls                 = "1"

   Sections:

$ compgen -G "/sys/kernel/iommu_groups/*/devices/*"
/sys/kernel/iommu_groups/9/devices/0000:09:00.0
/sys/kernel/iommu_groups/0/devices/0000:00:01.0
/sys/kernel/iommu_groups/10/devices/0000:0a:00.0
/sys/kernel/iommu_groups/2/devices/0000:00:01.6
/sys/kernel/iommu_groups/12/devices/0000:0b:00.0
/sys/kernel/iommu_groups/4/devices/0000:00:08.1
/sys/kernel/iommu_groups/6/devices/0000:00:14.0
/sys/kernel/iommu_groups/6/devices/0000:00:14.3
/sys/kernel/iommu_groups/8/devices/0000:02:04.0
/sys/kernel/iommu_groups/8/devices/0000:02:01.0
/sys/kernel/iommu_groups/8/devices/0000:04:00.0
/sys/kernel/iommu_groups/8/devices/0000:01:00.1
/sys/kernel/iommu_groups/8/devices/0000:02:06.0
/sys/kernel/iommu_groups/8/devices/0000:01:00.0
/sys/kernel/iommu_groups/8/devices/0000:01:00.2
/sys/kernel/iommu_groups/8/devices/0000:08:00.0
/sys/kernel/iommu_groups/8/devices/0000:02:00.0
/sys/kernel/iommu_groups/8/devices/0000:02:07.0
/sys/kernel/iommu_groups/8/devices/0000:03:00.0
/sys/kernel/iommu_groups/1/devices/0000:00:01.2
/sys/kernel/iommu_groups/11/devices/0000:0a:00.1
/sys/kernel/iommu_groups/11/devices/0000:0a:00.3
/sys/kernel/iommu_groups/11/devices/0000:0a:00.2
/sys/kernel/iommu_groups/11/devices/0000:0a:00.4
/sys/kernel/iommu_groups/11/devices/0000:0a:00.6
/sys/kernel/iommu_groups/3/devices/0000:00:08.0
/sys/kernel/iommu_groups/5/devices/0000:00:08.2
/sys/kernel/iommu_groups/7/devices/0000:00:18.5
/sys/kernel/iommu_groups/7/devices/0000:00:18.7
/sys/kernel/iommu_groups/7/devices/0000:00:18.0
/sys/kernel/iommu_groups/7/devices/0000:00:18.2
/sys/kernel/iommu_groups/7/devices/0000:00:18.4
/sys/kernel/iommu_groups/7/devices/0000:00:18.6
/sys/kernel/iommu_groups/7/devices/0000:00:18.1
/sys/kernel/iommu_groups/7/devices/0000:00:18.3

$ sudo kvm-ok
INFO: /dev/kvm exists
KVM acceleration can be used

$ sudo perf stat -e 'amd_iommu_0/cmd_processed/, 
amd_iommu_0/cmd_processed_inv/, amd_iommu_0/ign_rd_wr_mmio_1ff8h/, 
amd_iommu_0/int_dte_hit/, amd_iommu_0/int_dte_mis/, 
amd_iommu_0/mem_dte_hit/, amd_iommu_0/mem_dte_mis/, 
amd_iommu_0/mem_iommu_tlb_pde_hit/, amd_iommu_0/mem_iommu_tlb_pde_mis/, 
amd_iommu_0/mem_iommu_tlb_pte_hit/, amd_iommu_0/mem_iommu_tlb_pte_mis/, 
amd_iommu_0/mem_pass_excl/, amd_iommu_0/mem_pass_pretrans/, 
amd_iommu_0/mem_pass_untrans/, amd_iommu_0/mem_target_abort/, 
amd_iommu_0/mem_trans_total/, amd_iommu_0/page_tbl_read_gst/, 
amd_iommu_0/page_tbl_read_nst/, amd_iommu_0/page_tbl_read_tot/, 
amd_iommu_0/smi_blk/, amd_iommu_0/smi_recv/, amd_iommu_0/tlb_inv/, 
amd_iommu_0/vapic_int_guest/, amd_iommu_0/vapic_int_non_guest/' sleep 10

Performance counter stats for 'system wide':

                   0       amd_iommu_0/cmd_processed/            (33.32%)
                   2       amd_iommu_0/cmd_processed_inv/        (33.35%)
                   0       amd_iommu_0/ign_rd_wr_mmio_1ff8h/     (33.39%)
                 340       amd_iommu_0/int_dte_hit/              (33.43%)
                  23       amd_iommu_0/int_dte_mis/              (33.44%)
                 556       amd_iommu_0/mem_dte_hit/              (33.44%)
841,689,151,202,939       amd_iommu_0/mem_dte_mis/              (33.44%)
                  74       amd_iommu_0/mem_iommu_tlb_pde_hit/    (33.44%)
                 502       amd_iommu_0/mem_iommu_tlb_pde_mis/    (33.41%)
               1,195       amd_iommu_0/mem_iommu_tlb_pte_hit/    (33.36%)
               8,017       amd_iommu_0/mem_iommu_tlb_pte_mis/    (33.33%)
                   0       amd_iommu_0/mem_pass_excl/            (33.29%)
                   0       amd_iommu_0/mem_pass_pretrans/        (33.28%)
              16,504       amd_iommu_0/mem_pass_untrans/         (33.28%)
                   0       amd_iommu_0/mem_target_abort/         (33.28%)
               2,842       amd_iommu_0/mem_trans_total/          (33.28%)
                   0       amd_iommu_0/page_tbl_read_gst/        (33.28%)
                 111       amd_iommu_0/page_tbl_read_nst/        (33.29%)
                 111       amd_iommu_0/page_tbl_read_tot/        (33.28%)
                   0       amd_iommu_0/smi_blk/                  (33.28%)
                   0       amd_iommu_0/smi_recv/                 (33.29%)
                   0       amd_iommu_0/tlb_inv/                  (33.28%)
                   0       amd_iommu_0/vapic_int_guest/          (33.28%)
                 345       amd_iommu_0/vapic_int_non_guest/      (33.29%)

        10.000799128 seconds time elapsed

Results for Ryzen 7 4700U to follow.

-- 
David Coe
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 2/2] iommu/amd: Remove performance counter pre-initialization test
  2021-04-09  8:58   ` Suravee Suthikulpanit
@ 2021-04-10 10:03     ` David Coe
  -1 siblings, 0 replies; 46+ messages in thread
From: David Coe @ 2021-04-10 10:03 UTC (permalink / raw)
  To: Suravee Suthikulpanit, linux-kernel, iommu
  Cc: joro, will, jsnitsel, pmenzel, Jon.Grimm, Tj, Shuah Khan,
	Alexander Monakov, Alex Hung

Results for AMD Ryzen 4700U running Ubuntu 21.04β kernel 5.11.0-13

$ sudo dmesg | grep IOMMU
[    0.490352] pci 0000:00:00.2: AMD-Vi: IOMMU performance counters 
supported
[    0.491985] pci 0000:00:00.2: AMD-Vi: Found IOMMU cap 0x40
[    0.493732] perf/amd_iommu: Detected AMD IOMMU #0 (2 banks, 4 
counters/bank).
[    0.793259] AMD-Vi: AMD IOMMUv2 driver by Joerg Roedel <jroedel@suse.de>

$ systool -m kvm_amd -v
Module = "kvm_amd"

   Attributes:
     coresize            = "114688"
     initsize            = "0"
     initstate           = "live"
     refcnt              = "0"
     srcversion          = "4371BA17A41823101F90761"
     taint               = ""
     uevent              = <store method only>

   Parameters:
     avic                = "0"
     dump_invalid_vmcb   = "N"
     nested              = "1"
     npt                 = "1"
     nrips               = "1"
     pause_filter_count_grow= "2"
     pause_filter_count_max= "65535"
     pause_filter_count_shrink= "0"
     pause_filter_count  = "3000"
     pause_filter_thresh = "128"
     sev_es              = "0"
     sev                 = "0"
     vgif                = "1"
     vls                 = "1"

   Sections:

$ compgen -G "/sys/kernel/iommu_groups/*/devices/*"
/sys/kernel/iommu_groups/0/devices/0000:00:01.0
/sys/kernel/iommu_groups/2/devices/0000:00:02.2
/sys/kernel/iommu_groups/4/devices/0000:00:08.2
/sys/kernel/iommu_groups/4/devices/0000:03:00.2
/sys/kernel/iommu_groups/4/devices/0000:04:00.0
/sys/kernel/iommu_groups/4/devices/0000:03:00.4
/sys/kernel/iommu_groups/4/devices/0000:03:00.6
/sys/kernel/iommu_groups/4/devices/0000:00:08.1
/sys/kernel/iommu_groups/4/devices/0000:03:00.1
/sys/kernel/iommu_groups/4/devices/0000:03:00.3
/sys/kernel/iommu_groups/4/devices/0000:04:00.1
/sys/kernel/iommu_groups/4/devices/0000:03:00.5
/sys/kernel/iommu_groups/4/devices/0000:00:08.0
/sys/kernel/iommu_groups/4/devices/0000:03:00.0
/sys/kernel/iommu_groups/6/devices/0000:00:18.5
/sys/kernel/iommu_groups/6/devices/0000:00:18.7
/sys/kernel/iommu_groups/6/devices/0000:00:18.0
/sys/kernel/iommu_groups/6/devices/0000:00:18.2
/sys/kernel/iommu_groups/6/devices/0000:00:18.4
/sys/kernel/iommu_groups/6/devices/0000:00:18.6
/sys/kernel/iommu_groups/6/devices/0000:00:18.1
/sys/kernel/iommu_groups/6/devices/0000:00:18.3
/sys/kernel/iommu_groups/8/devices/0000:02:00.0
/sys/kernel/iommu_groups/1/devices/0000:00:02.0
/sys/kernel/iommu_groups/3/devices/0000:00:02.4
/sys/kernel/iommu_groups/5/devices/0000:00:14.0
/sys/kernel/iommu_groups/5/devices/0000:00:14.3
/sys/kernel/iommu_groups/7/devices/0000:01:00.0

$ sudo perf stat -e 'amd_iommu_0/cmd_processed/, 
amd_iommu_0/cmd_processed_inv/, amd_iommu_0/ign_rd_wr_mmio_1ff8h/, 
amd_iommu_0/int_dte_hit/, amd_iommu_0/int_dte_mis/, 
amd_iommu_0/mem_dte_hit/, amd_iommu_0/mem_dte_mis/, 
amd_iommu_0/mem_iommu_tlb_pde_hit/, amd_iommu_0/mem_iommu_tlb_pde_mis/, 
amd_iommu_0/mem_iommu_tlb_pte_hit/, amd_iommu_0/mem_iommu_tlb_pte_mis/, 
amd_iommu_0/mem_pass_excl/, amd_iommu_0/mem_pass_pretrans/, 
amd_iommu_0/mem_pass_untrans/, amd_iommu_0/mem_target_abort/, 
amd_iommu_0/mem_trans_total/, amd_iommu_0/page_tbl_read_gst/, 
amd_iommu_0/page_tbl_read_nst/, amd_iommu_0/page_tbl_read_tot/, 
amd_iommu_0/smi_blk/, amd_iommu_0/smi_recv/, amd_iommu_0/tlb_inv/, 
amd_iommu_0/vapic_int_guest/, amd_iommu_0/vapic_int_non_guest/' sleep 10

Performance counter stats for 'system wide':

                 12      amd_iommu_0/cmd_processed/             (33.28%)
                  6       amd_iommu_0/cmd_processed_inv/        (33.32%)
                  0       amd_iommu_0/ign_rd_wr_mmio_1ff8h/     (33.36%)
                290       amd_iommu_0/int_dte_hit/              (33.40%)
                 20       amd_iommu_0/int_dte_mis/              (33.46%)
                391       amd_iommu_0/mem_dte_hit/              (33.49%)
              3,720       amd_iommu_0/mem_dte_mis/              (33.49%)
                 44       amd_iommu_0/mem_iommu_tlb_pde_hit/    (33.46%)
                810       amd_iommu_0/mem_iommu_tlb_pde_mis/    (33.45%)
                 35       amd_iommu_0/mem_iommu_tlb_pte_hit/    (33.41%)
                746       amd_iommu_0/mem_iommu_tlb_pte_mis/    (33.37%)
                  0       amd_iommu_0/mem_pass_excl/            (33.32%)
                  0       amd_iommu_0/mem_pass_pretrans/        (33.28%)
                  0       amd_iommu_0/mem_pass_untrans/         (33.28%)
                  0       amd_iommu_0/mem_target_abort/         (33.27%)
                715       amd_iommu_0/mem_trans_total/          (33.27%)
                  0       amd_iommu_0/page_tbl_read_gst/        (33.28%)
                 36       amd_iommu_0/page_tbl_read_nst/        (33.27%)
                 36       amd_iommu_0/page_tbl_read_tot/        (33.27%)
                  0       amd_iommu_0/smi_blk/                  (33.28%)
                  0       amd_iommu_0/smi_recv/                 (33.26%)
                  0       amd_iommu_0/tlb_inv/                  (33.23%)
                  0       amd_iommu_0/vapic_int_guest/          (33.24%)
                366       amd_iommu_0/vapic_int_non_guest/      (33.27%)

The immediately obvious difference is the with the enormous count seen 
on mem_dte_mis on the older Ryzen 2400G. Will do some RTFM but anyone 
with comments and insight?

841,689,151,202,939       amd_iommu_0/mem_dte_mis/              (33.44%)

Otherwise, all seems to running smoothly (especially for a distribution 
still in β). Bravo and many thanks all!

-- 
David Coe

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 2/2] iommu/amd: Remove performance counter pre-initialization test
@ 2021-04-10 10:03     ` David Coe
  0 siblings, 0 replies; 46+ messages in thread
From: David Coe @ 2021-04-10 10:03 UTC (permalink / raw)
  To: Suravee Suthikulpanit, linux-kernel, iommu
  Cc: pmenzel, Alexander Monakov, Alex Hung, Jon.Grimm, Shuah Khan, Tj, will

Results for AMD Ryzen 4700U running Ubuntu 21.04β kernel 5.11.0-13

$ sudo dmesg | grep IOMMU
[    0.490352] pci 0000:00:00.2: AMD-Vi: IOMMU performance counters 
supported
[    0.491985] pci 0000:00:00.2: AMD-Vi: Found IOMMU cap 0x40
[    0.493732] perf/amd_iommu: Detected AMD IOMMU #0 (2 banks, 4 
counters/bank).
[    0.793259] AMD-Vi: AMD IOMMUv2 driver by Joerg Roedel <jroedel@suse.de>

$ systool -m kvm_amd -v
Module = "kvm_amd"

   Attributes:
     coresize            = "114688"
     initsize            = "0"
     initstate           = "live"
     refcnt              = "0"
     srcversion          = "4371BA17A41823101F90761"
     taint               = ""
     uevent              = <store method only>

   Parameters:
     avic                = "0"
     dump_invalid_vmcb   = "N"
     nested              = "1"
     npt                 = "1"
     nrips               = "1"
     pause_filter_count_grow= "2"
     pause_filter_count_max= "65535"
     pause_filter_count_shrink= "0"
     pause_filter_count  = "3000"
     pause_filter_thresh = "128"
     sev_es              = "0"
     sev                 = "0"
     vgif                = "1"
     vls                 = "1"

   Sections:

$ compgen -G "/sys/kernel/iommu_groups/*/devices/*"
/sys/kernel/iommu_groups/0/devices/0000:00:01.0
/sys/kernel/iommu_groups/2/devices/0000:00:02.2
/sys/kernel/iommu_groups/4/devices/0000:00:08.2
/sys/kernel/iommu_groups/4/devices/0000:03:00.2
/sys/kernel/iommu_groups/4/devices/0000:04:00.0
/sys/kernel/iommu_groups/4/devices/0000:03:00.4
/sys/kernel/iommu_groups/4/devices/0000:03:00.6
/sys/kernel/iommu_groups/4/devices/0000:00:08.1
/sys/kernel/iommu_groups/4/devices/0000:03:00.1
/sys/kernel/iommu_groups/4/devices/0000:03:00.3
/sys/kernel/iommu_groups/4/devices/0000:04:00.1
/sys/kernel/iommu_groups/4/devices/0000:03:00.5
/sys/kernel/iommu_groups/4/devices/0000:00:08.0
/sys/kernel/iommu_groups/4/devices/0000:03:00.0
/sys/kernel/iommu_groups/6/devices/0000:00:18.5
/sys/kernel/iommu_groups/6/devices/0000:00:18.7
/sys/kernel/iommu_groups/6/devices/0000:00:18.0
/sys/kernel/iommu_groups/6/devices/0000:00:18.2
/sys/kernel/iommu_groups/6/devices/0000:00:18.4
/sys/kernel/iommu_groups/6/devices/0000:00:18.6
/sys/kernel/iommu_groups/6/devices/0000:00:18.1
/sys/kernel/iommu_groups/6/devices/0000:00:18.3
/sys/kernel/iommu_groups/8/devices/0000:02:00.0
/sys/kernel/iommu_groups/1/devices/0000:00:02.0
/sys/kernel/iommu_groups/3/devices/0000:00:02.4
/sys/kernel/iommu_groups/5/devices/0000:00:14.0
/sys/kernel/iommu_groups/5/devices/0000:00:14.3
/sys/kernel/iommu_groups/7/devices/0000:01:00.0

$ sudo perf stat -e 'amd_iommu_0/cmd_processed/, 
amd_iommu_0/cmd_processed_inv/, amd_iommu_0/ign_rd_wr_mmio_1ff8h/, 
amd_iommu_0/int_dte_hit/, amd_iommu_0/int_dte_mis/, 
amd_iommu_0/mem_dte_hit/, amd_iommu_0/mem_dte_mis/, 
amd_iommu_0/mem_iommu_tlb_pde_hit/, amd_iommu_0/mem_iommu_tlb_pde_mis/, 
amd_iommu_0/mem_iommu_tlb_pte_hit/, amd_iommu_0/mem_iommu_tlb_pte_mis/, 
amd_iommu_0/mem_pass_excl/, amd_iommu_0/mem_pass_pretrans/, 
amd_iommu_0/mem_pass_untrans/, amd_iommu_0/mem_target_abort/, 
amd_iommu_0/mem_trans_total/, amd_iommu_0/page_tbl_read_gst/, 
amd_iommu_0/page_tbl_read_nst/, amd_iommu_0/page_tbl_read_tot/, 
amd_iommu_0/smi_blk/, amd_iommu_0/smi_recv/, amd_iommu_0/tlb_inv/, 
amd_iommu_0/vapic_int_guest/, amd_iommu_0/vapic_int_non_guest/' sleep 10

Performance counter stats for 'system wide':

                 12      amd_iommu_0/cmd_processed/             (33.28%)
                  6       amd_iommu_0/cmd_processed_inv/        (33.32%)
                  0       amd_iommu_0/ign_rd_wr_mmio_1ff8h/     (33.36%)
                290       amd_iommu_0/int_dte_hit/              (33.40%)
                 20       amd_iommu_0/int_dte_mis/              (33.46%)
                391       amd_iommu_0/mem_dte_hit/              (33.49%)
              3,720       amd_iommu_0/mem_dte_mis/              (33.49%)
                 44       amd_iommu_0/mem_iommu_tlb_pde_hit/    (33.46%)
                810       amd_iommu_0/mem_iommu_tlb_pde_mis/    (33.45%)
                 35       amd_iommu_0/mem_iommu_tlb_pte_hit/    (33.41%)
                746       amd_iommu_0/mem_iommu_tlb_pte_mis/    (33.37%)
                  0       amd_iommu_0/mem_pass_excl/            (33.32%)
                  0       amd_iommu_0/mem_pass_pretrans/        (33.28%)
                  0       amd_iommu_0/mem_pass_untrans/         (33.28%)
                  0       amd_iommu_0/mem_target_abort/         (33.27%)
                715       amd_iommu_0/mem_trans_total/          (33.27%)
                  0       amd_iommu_0/page_tbl_read_gst/        (33.28%)
                 36       amd_iommu_0/page_tbl_read_nst/        (33.27%)
                 36       amd_iommu_0/page_tbl_read_tot/        (33.27%)
                  0       amd_iommu_0/smi_blk/                  (33.28%)
                  0       amd_iommu_0/smi_recv/                 (33.26%)
                  0       amd_iommu_0/tlb_inv/                  (33.23%)
                  0       amd_iommu_0/vapic_int_guest/          (33.24%)
                366       amd_iommu_0/vapic_int_non_guest/      (33.27%)

The immediately obvious difference is the with the enormous count seen 
on mem_dte_mis on the older Ryzen 2400G. Will do some RTFM but anyone 
with comments and insight?

841,689,151,202,939       amd_iommu_0/mem_dte_mis/              (33.44%)

Otherwise, all seems to running smoothly (especially for a distribution 
still in β). Bravo and many thanks all!

-- 
David Coe
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 2/2] iommu/amd: Remove performance counter pre-initialization test
  2021-04-09  8:58   ` Suravee Suthikulpanit
@ 2021-04-13  9:38     ` David Coe
  -1 siblings, 0 replies; 46+ messages in thread
From: David Coe @ 2021-04-13  9:38 UTC (permalink / raw)
  To: Suravee Suthikulpanit, linux-kernel, iommu
  Cc: joro, will, jsnitsel, pmenzel, Jon.Grimm, Tj, Shuah Khan,
	Alexander Monakov, Alex Hung

Hi Suravee!

Just in case (!), I've run your revert+update patch on kernel 5.11.0-13, 
Ubuntu 21.04β running on an AMD FX-8350 (pre Zen and IOMMUv2). As with 
the AMD Ryzen 2400G and 4700U, I'm finding no obvious issues.


$ sudo dmesg | grep IOMMU
[    0.948890] pci 0000:00:00.2: AMD-Vi: Found IOMMU cap 0x40
[    4.393773] AMD-Vi: AMD IOMMUv2 driver by Joerg Roedel <jroedel@suse.de>
[    4.393776] AMD-Vi: AMD IOMMUv2 functionality not available on this 
system


$ systool -m kvm_amd -v
Module = "kvm_amd"

   Attributes:
     coresize            = "114688"
     initsize            = "0"
     initstate           = "live"
     refcnt              = "0"
     srcversion          = "4371BA17A41823101F90761"
     taint               = ""
     uevent              = <store method only>

   Parameters:
     avic                = "0"
     dump_invalid_vmcb   = "N"
     nested              = "1"
     npt                 = "1"
     nrips               = "1"
     pause_filter_count_grow= "2"
     pause_filter_count_max= "65535"
     pause_filter_count_shrink= "0"
     pause_filter_count  = "3000"
     pause_filter_thresh = "128"
     sev_es              = "0"
     sev                 = "0"
     vgif                = "0"
     vls                 = "0"

   Sections:


$ compgen -G "/sys/kernel/iommu_groups/*/devices/*"
/sys/kernel/iommu_groups/9/devices/0000:00:14.2
/sys/kernel/iommu_groups/0/devices/0000:00:00.0
/sys/kernel/iommu_groups/10/devices/0000:00:14.3
/sys/kernel/iommu_groups/2/devices/0000:00:04.0
/sys/kernel/iommu_groups/12/devices/0000:00:14.5
/sys/kernel/iommu_groups/4/devices/0000:00:0d.0
/sys/kernel/iommu_groups/14/devices/0000:00:16.0
/sys/kernel/iommu_groups/14/devices/0000:00:16.2
/sys/kernel/iommu_groups/6/devices/0000:00:12.0
/sys/kernel/iommu_groups/6/devices/0000:00:12.2
/sys/kernel/iommu_groups/16/devices/0000:02:00.0
/sys/kernel/iommu_groups/8/devices/0000:00:14.0
/sys/kernel/iommu_groups/1/devices/0000:00:02.0
/sys/kernel/iommu_groups/11/devices/0000:00:14.4
/sys/kernel/iommu_groups/3/devices/0000:00:0b.0
/sys/kernel/iommu_groups/13/devices/0000:00:15.3
/sys/kernel/iommu_groups/13/devices/0000:00:15.0
/sys/kernel/iommu_groups/13/devices/0000:06:00.0
/sys/kernel/iommu_groups/13/devices/0000:00:15.2
/sys/kernel/iommu_groups/13/devices/0000:07:00.0
/sys/kernel/iommu_groups/13/devices/0000:08:00.0
/sys/kernel/iommu_groups/13/devices/0000:00:15.1
/sys/kernel/iommu_groups/13/devices/0000:09:00.0
/sys/kernel/iommu_groups/5/devices/0000:00:11.0
/sys/kernel/iommu_groups/15/devices/0000:01:00.1
/sys/kernel/iommu_groups/15/devices/0000:01:00.0
/sys/kernel/iommu_groups/7/devices/0000:00:13.0
/sys/kernel/iommu_groups/7/devices/0000:00:13.2
/sys/kernel/iommu_groups/17/devices/0000:04:00.0


$ sudo kvm-ok
INFO: /dev/kvm exists
KVM acceleration can be used


$ perf list | grep iommu
No amd_iommu events

Best regards and many thanks.

-- 
David Coe


^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 2/2] iommu/amd: Remove performance counter pre-initialization test
@ 2021-04-13  9:38     ` David Coe
  0 siblings, 0 replies; 46+ messages in thread
From: David Coe @ 2021-04-13  9:38 UTC (permalink / raw)
  To: Suravee Suthikulpanit, linux-kernel, iommu
  Cc: pmenzel, Alexander Monakov, Alex Hung, Jon.Grimm, Shuah Khan, Tj, will

Hi Suravee!

Just in case (!), I've run your revert+update patch on kernel 5.11.0-13, 
Ubuntu 21.04β running on an AMD FX-8350 (pre Zen and IOMMUv2). As with 
the AMD Ryzen 2400G and 4700U, I'm finding no obvious issues.


$ sudo dmesg | grep IOMMU
[    0.948890] pci 0000:00:00.2: AMD-Vi: Found IOMMU cap 0x40
[    4.393773] AMD-Vi: AMD IOMMUv2 driver by Joerg Roedel <jroedel@suse.de>
[    4.393776] AMD-Vi: AMD IOMMUv2 functionality not available on this 
system


$ systool -m kvm_amd -v
Module = "kvm_amd"

   Attributes:
     coresize            = "114688"
     initsize            = "0"
     initstate           = "live"
     refcnt              = "0"
     srcversion          = "4371BA17A41823101F90761"
     taint               = ""
     uevent              = <store method only>

   Parameters:
     avic                = "0"
     dump_invalid_vmcb   = "N"
     nested              = "1"
     npt                 = "1"
     nrips               = "1"
     pause_filter_count_grow= "2"
     pause_filter_count_max= "65535"
     pause_filter_count_shrink= "0"
     pause_filter_count  = "3000"
     pause_filter_thresh = "128"
     sev_es              = "0"
     sev                 = "0"
     vgif                = "0"
     vls                 = "0"

   Sections:


$ compgen -G "/sys/kernel/iommu_groups/*/devices/*"
/sys/kernel/iommu_groups/9/devices/0000:00:14.2
/sys/kernel/iommu_groups/0/devices/0000:00:00.0
/sys/kernel/iommu_groups/10/devices/0000:00:14.3
/sys/kernel/iommu_groups/2/devices/0000:00:04.0
/sys/kernel/iommu_groups/12/devices/0000:00:14.5
/sys/kernel/iommu_groups/4/devices/0000:00:0d.0
/sys/kernel/iommu_groups/14/devices/0000:00:16.0
/sys/kernel/iommu_groups/14/devices/0000:00:16.2
/sys/kernel/iommu_groups/6/devices/0000:00:12.0
/sys/kernel/iommu_groups/6/devices/0000:00:12.2
/sys/kernel/iommu_groups/16/devices/0000:02:00.0
/sys/kernel/iommu_groups/8/devices/0000:00:14.0
/sys/kernel/iommu_groups/1/devices/0000:00:02.0
/sys/kernel/iommu_groups/11/devices/0000:00:14.4
/sys/kernel/iommu_groups/3/devices/0000:00:0b.0
/sys/kernel/iommu_groups/13/devices/0000:00:15.3
/sys/kernel/iommu_groups/13/devices/0000:00:15.0
/sys/kernel/iommu_groups/13/devices/0000:06:00.0
/sys/kernel/iommu_groups/13/devices/0000:00:15.2
/sys/kernel/iommu_groups/13/devices/0000:07:00.0
/sys/kernel/iommu_groups/13/devices/0000:08:00.0
/sys/kernel/iommu_groups/13/devices/0000:00:15.1
/sys/kernel/iommu_groups/13/devices/0000:09:00.0
/sys/kernel/iommu_groups/5/devices/0000:00:11.0
/sys/kernel/iommu_groups/15/devices/0000:01:00.1
/sys/kernel/iommu_groups/15/devices/0000:01:00.0
/sys/kernel/iommu_groups/7/devices/0000:00:13.0
/sys/kernel/iommu_groups/7/devices/0000:00:13.2
/sys/kernel/iommu_groups/17/devices/0000:04:00.0


$ sudo kvm-ok
INFO: /dev/kvm exists
KVM acceleration can be used


$ perf list | grep iommu
No amd_iommu events

Best regards and many thanks.

-- 
David Coe

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 1/2] Revert "iommu/amd: Fix performance counter initialization"
  2021-04-09 17:06     ` Shuah Khan
@ 2021-04-13 13:36       ` Suthikulpanit, Suravee
  -1 siblings, 0 replies; 46+ messages in thread
From: Suthikulpanit, Suravee @ 2021-04-13 13:36 UTC (permalink / raw)
  To: Shuah Khan, linux-kernel, iommu
  Cc: joro, will, jsnitsel, pmenzel, Jon.Grimm, Tj, Alexander Monakov,
	David Coe

Shuah,

On 4/10/2021 12:06 AM, Shuah Khan wrote:
>> diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
>> index 321f5906e6ed..648cdfd03074 100644
>> --- a/drivers/iommu/amd/init.c
>> +++ b/drivers/iommu/amd/init.c
>> ....
>> @@ -1729,39 +1728,17 @@ static void __init init_iommu_perf_ctr(struct amd_iommu *iommu)
>>       amd_iommu_pc_present = true;
>>       /* save the value to restore, if writable */
>> -    if (iommu_pc_get_set_reg(iommu, 0, 0, 0, &save_reg, false) ||
>> -        iommu_pc_get_set_reg(iommu, 0, 0, 8, &save_src, false))
>> -        goto pc_false;
>> -
>> -    /*
>> -     * Disable power gating by programing the performance counter
>> -     * source to 20 (i.e. counts the reads and writes from/to IOMMU
>> -     * Reserved Register [MMIO Offset 1FF8h] that are ignored.),
>> -     * which never get incremented during this init phase.
>> -     * (Note: The event is also deprecated.)
>> -     */
>> -    val = 20;
>> -    if (iommu_pc_get_set_reg(iommu, 0, 0, 8, &val, true))
>> +    if (iommu_pc_get_set_reg(iommu, 0, 0, 0, &save_reg, false))
>>           goto pc_false;
>>       /* Check if the performance counters can be written to */
>> -    val = 0xabcd;
>> -    for (retry = 5; retry; retry--) {
>> -        if (iommu_pc_get_set_reg(iommu, 0, 0, 0, &val, true) ||
>> -            iommu_pc_get_set_reg(iommu, 0, 0, 0, &val2, false) ||
>> -            val2)
>> -            break;
>> -
>> -        /* Wait about 20 msec for power gating to disable and retry. */
>> -        msleep(20);
>> -    }
>> -
>> -    /* restore */
>> -    if (iommu_pc_get_set_reg(iommu, 0, 0, 0, &save_reg, true) ||
>> -        iommu_pc_get_set_reg(iommu, 0, 0, 8, &save_src, true))
>> +    if ((iommu_pc_get_set_reg(iommu, 0, 0, 0, &val, true)) ||
>> +        (iommu_pc_get_set_reg(iommu, 0, 0, 0, &val2, false)) ||
>> +        (val != val2))
> 
> Probably don't need parentheses around 'val != val2'

This is the result from git revert. Also, the logic is removed in patch 2/2.

Suravee

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 1/2] Revert "iommu/amd: Fix performance counter initialization"
@ 2021-04-13 13:36       ` Suthikulpanit, Suravee
  0 siblings, 0 replies; 46+ messages in thread
From: Suthikulpanit, Suravee @ 2021-04-13 13:36 UTC (permalink / raw)
  To: Shuah Khan, linux-kernel, iommu
  Cc: pmenzel, Alexander Monakov, David Coe, Jon.Grimm, Tj, will

Shuah,

On 4/10/2021 12:06 AM, Shuah Khan wrote:
>> diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
>> index 321f5906e6ed..648cdfd03074 100644
>> --- a/drivers/iommu/amd/init.c
>> +++ b/drivers/iommu/amd/init.c
>> ....
>> @@ -1729,39 +1728,17 @@ static void __init init_iommu_perf_ctr(struct amd_iommu *iommu)
>>       amd_iommu_pc_present = true;
>>       /* save the value to restore, if writable */
>> -    if (iommu_pc_get_set_reg(iommu, 0, 0, 0, &save_reg, false) ||
>> -        iommu_pc_get_set_reg(iommu, 0, 0, 8, &save_src, false))
>> -        goto pc_false;
>> -
>> -    /*
>> -     * Disable power gating by programing the performance counter
>> -     * source to 20 (i.e. counts the reads and writes from/to IOMMU
>> -     * Reserved Register [MMIO Offset 1FF8h] that are ignored.),
>> -     * which never get incremented during this init phase.
>> -     * (Note: The event is also deprecated.)
>> -     */
>> -    val = 20;
>> -    if (iommu_pc_get_set_reg(iommu, 0, 0, 8, &val, true))
>> +    if (iommu_pc_get_set_reg(iommu, 0, 0, 0, &save_reg, false))
>>           goto pc_false;
>>       /* Check if the performance counters can be written to */
>> -    val = 0xabcd;
>> -    for (retry = 5; retry; retry--) {
>> -        if (iommu_pc_get_set_reg(iommu, 0, 0, 0, &val, true) ||
>> -            iommu_pc_get_set_reg(iommu, 0, 0, 0, &val2, false) ||
>> -            val2)
>> -            break;
>> -
>> -        /* Wait about 20 msec for power gating to disable and retry. */
>> -        msleep(20);
>> -    }
>> -
>> -    /* restore */
>> -    if (iommu_pc_get_set_reg(iommu, 0, 0, 0, &save_reg, true) ||
>> -        iommu_pc_get_set_reg(iommu, 0, 0, 8, &save_src, true))
>> +    if ((iommu_pc_get_set_reg(iommu, 0, 0, 0, &val, true)) ||
>> +        (iommu_pc_get_set_reg(iommu, 0, 0, 0, &val2, false)) ||
>> +        (val != val2))
> 
> Probably don't need parentheses around 'val != val2'

This is the result from git revert. Also, the logic is removed in patch 2/2.

Suravee
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 2/2] iommu/amd: Remove performance counter pre-initialization test
  2021-04-10 10:03     ` David Coe
@ 2021-04-13 13:51       ` Suthikulpanit, Suravee
  -1 siblings, 0 replies; 46+ messages in thread
From: Suthikulpanit, Suravee @ 2021-04-13 13:51 UTC (permalink / raw)
  To: David Coe, linux-kernel, iommu
  Cc: joro, will, jsnitsel, pmenzel, Jon.Grimm, Tj, Shuah Khan,
	Alexander Monakov, Alex Hung



On 4/10/2021 5:03 PM, David Coe wrote:
> Results for AMD Ryzen 4700U running Ubuntu 21.04β kernel 5.11.0-13
> 
> $ sudo dmesg | grep IOMMU
> [    0.490352] pci 0000:00:00.2: AMD-Vi: IOMMU performance counters supported
> [    0.491985] pci 0000:00:00.2: AMD-Vi: Found IOMMU cap 0x40
> [    0.493732] perf/amd_iommu: Detected AMD IOMMU #0 (2 banks, 4 counters/bank).
> [    0.793259] AMD-Vi: AMD IOMMUv2 driver by Joerg Roedel <jroedel@suse.de>
> 
> ....
> 
> $ sudo perf stat -e 'amd_iommu_0/cmd_processed/, amd_iommu_0/cmd_processed_inv/, amd_iommu_0/ign_rd_wr_mmio_1ff8h/, amd_iommu_0/int_dte_hit/, amd_iommu_0/int_dte_mis/, amd_iommu_0/mem_dte_hit/, amd_iommu_0/mem_dte_mis/, amd_iommu_0/mem_iommu_tlb_pde_hit/, amd_iommu_0/mem_iommu_tlb_pde_mis/, amd_iommu_0/mem_iommu_tlb_pte_hit/, amd_iommu_0/mem_iommu_tlb_pte_mis/, amd_iommu_0/mem_pass_excl/, amd_iommu_0/mem_pass_pretrans/, amd_iommu_0/mem_pass_untrans/, amd_iommu_0/mem_target_abort/, 
> amd_iommu_0/mem_trans_total/, amd_iommu_0/page_tbl_read_gst/, amd_iommu_0/page_tbl_read_nst/, amd_iommu_0/page_tbl_read_tot/, amd_iommu_0/smi_blk/, amd_iommu_0/smi_recv/, amd_iommu_0/tlb_inv/, amd_iommu_0/vapic_int_guest/, amd_iommu_0/vapic_int_non_guest/' sleep 10
> 
> Performance counter stats for 'system wide':
> 
>                 12      amd_iommu_0/cmd_processed/             (33.28%)
>                  6       amd_iommu_0/cmd_processed_inv/        (33.32%)
>                  0       amd_iommu_0/ign_rd_wr_mmio_1ff8h/     (33.36%)
>                290       amd_iommu_0/int_dte_hit/              (33.40%)
>                 20       amd_iommu_0/int_dte_mis/              (33.46%)
>                391       amd_iommu_0/mem_dte_hit/              (33.49%)
>              3,720       amd_iommu_0/mem_dte_mis/              (33.49%)
>                 44       amd_iommu_0/mem_iommu_tlb_pde_hit/    (33.46%)
>                810       amd_iommu_0/mem_iommu_tlb_pde_mis/    (33.45%)
>                 35       amd_iommu_0/mem_iommu_tlb_pte_hit/    (33.41%)
>                746       amd_iommu_0/mem_iommu_tlb_pte_mis/    (33.37%)
>                  0       amd_iommu_0/mem_pass_excl/            (33.32%)
>                  0       amd_iommu_0/mem_pass_pretrans/        (33.28%)
>                  0       amd_iommu_0/mem_pass_untrans/         (33.28%)
>                  0       amd_iommu_0/mem_target_abort/         (33.27%)
>                715       amd_iommu_0/mem_trans_total/          (33.27%)
>                  0       amd_iommu_0/page_tbl_read_gst/        (33.28%)
>                 36       amd_iommu_0/page_tbl_read_nst/        (33.27%)
>                 36       amd_iommu_0/page_tbl_read_tot/        (33.27%)
>                  0       amd_iommu_0/smi_blk/                  (33.28%)
>                  0       amd_iommu_0/smi_recv/                 (33.26%)
>                  0       amd_iommu_0/tlb_inv/                  (33.23%)
>                  0       amd_iommu_0/vapic_int_guest/          (33.24%)
>                366       amd_iommu_0/vapic_int_non_guest/      (33.27%)
> 
> The immediately obvious difference is the with the enormous count seen on mem_dte_mis on the older Ryzen 2400G. Will do some RTFM but anyone with comments and insight?
> 
> 841,689,151,202,939       amd_iommu_0/mem_dte_mis/              (33.44%)
> 
> Otherwise, all seems to running smoothly (especially for a distribution still in β). Bravo and many thanks all!
> 
That doesn't look correct. Lemme do some more investigation also.

Thanks,
Suravee

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 2/2] iommu/amd: Remove performance counter pre-initialization test
@ 2021-04-13 13:51       ` Suthikulpanit, Suravee
  0 siblings, 0 replies; 46+ messages in thread
From: Suthikulpanit, Suravee @ 2021-04-13 13:51 UTC (permalink / raw)
  To: David Coe, linux-kernel, iommu
  Cc: pmenzel, Alexander Monakov, Alex Hung, Jon.Grimm, Shuah Khan, Tj, will



On 4/10/2021 5:03 PM, David Coe wrote:
> Results for AMD Ryzen 4700U running Ubuntu 21.04β kernel 5.11.0-13
> 
> $ sudo dmesg | grep IOMMU
> [    0.490352] pci 0000:00:00.2: AMD-Vi: IOMMU performance counters supported
> [    0.491985] pci 0000:00:00.2: AMD-Vi: Found IOMMU cap 0x40
> [    0.493732] perf/amd_iommu: Detected AMD IOMMU #0 (2 banks, 4 counters/bank).
> [    0.793259] AMD-Vi: AMD IOMMUv2 driver by Joerg Roedel <jroedel@suse.de>
> 
> ....
> 
> $ sudo perf stat -e 'amd_iommu_0/cmd_processed/, amd_iommu_0/cmd_processed_inv/, amd_iommu_0/ign_rd_wr_mmio_1ff8h/, amd_iommu_0/int_dte_hit/, amd_iommu_0/int_dte_mis/, amd_iommu_0/mem_dte_hit/, amd_iommu_0/mem_dte_mis/, amd_iommu_0/mem_iommu_tlb_pde_hit/, amd_iommu_0/mem_iommu_tlb_pde_mis/, amd_iommu_0/mem_iommu_tlb_pte_hit/, amd_iommu_0/mem_iommu_tlb_pte_mis/, amd_iommu_0/mem_pass_excl/, amd_iommu_0/mem_pass_pretrans/, amd_iommu_0/mem_pass_untrans/, amd_iommu_0/mem_target_abort/, 
> amd_iommu_0/mem_trans_total/, amd_iommu_0/page_tbl_read_gst/, amd_iommu_0/page_tbl_read_nst/, amd_iommu_0/page_tbl_read_tot/, amd_iommu_0/smi_blk/, amd_iommu_0/smi_recv/, amd_iommu_0/tlb_inv/, amd_iommu_0/vapic_int_guest/, amd_iommu_0/vapic_int_non_guest/' sleep 10
> 
> Performance counter stats for 'system wide':
> 
>                 12      amd_iommu_0/cmd_processed/             (33.28%)
>                  6       amd_iommu_0/cmd_processed_inv/        (33.32%)
>                  0       amd_iommu_0/ign_rd_wr_mmio_1ff8h/     (33.36%)
>                290       amd_iommu_0/int_dte_hit/              (33.40%)
>                 20       amd_iommu_0/int_dte_mis/              (33.46%)
>                391       amd_iommu_0/mem_dte_hit/              (33.49%)
>              3,720       amd_iommu_0/mem_dte_mis/              (33.49%)
>                 44       amd_iommu_0/mem_iommu_tlb_pde_hit/    (33.46%)
>                810       amd_iommu_0/mem_iommu_tlb_pde_mis/    (33.45%)
>                 35       amd_iommu_0/mem_iommu_tlb_pte_hit/    (33.41%)
>                746       amd_iommu_0/mem_iommu_tlb_pte_mis/    (33.37%)
>                  0       amd_iommu_0/mem_pass_excl/            (33.32%)
>                  0       amd_iommu_0/mem_pass_pretrans/        (33.28%)
>                  0       amd_iommu_0/mem_pass_untrans/         (33.28%)
>                  0       amd_iommu_0/mem_target_abort/         (33.27%)
>                715       amd_iommu_0/mem_trans_total/          (33.27%)
>                  0       amd_iommu_0/page_tbl_read_gst/        (33.28%)
>                 36       amd_iommu_0/page_tbl_read_nst/        (33.27%)
>                 36       amd_iommu_0/page_tbl_read_tot/        (33.27%)
>                  0       amd_iommu_0/smi_blk/                  (33.28%)
>                  0       amd_iommu_0/smi_recv/                 (33.26%)
>                  0       amd_iommu_0/tlb_inv/                  (33.23%)
>                  0       amd_iommu_0/vapic_int_guest/          (33.24%)
>                366       amd_iommu_0/vapic_int_non_guest/      (33.27%)
> 
> The immediately obvious difference is the with the enormous count seen on mem_dte_mis on the older Ryzen 2400G. Will do some RTFM but anyone with comments and insight?
> 
> 841,689,151,202,939       amd_iommu_0/mem_dte_mis/              (33.44%)
> 
> Otherwise, all seems to running smoothly (especially for a distribution still in β). Bravo and many thanks all!
> 
That doesn't look correct. Lemme do some more investigation also.

Thanks,
Suravee
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 2/2] iommu/amd: Remove performance counter pre-initialization test
  2021-04-13 13:51       ` Suthikulpanit, Suravee
@ 2021-04-14 15:33         ` David Coe
  -1 siblings, 0 replies; 46+ messages in thread
From: David Coe @ 2021-04-14 15:33 UTC (permalink / raw)
  To: Suthikulpanit, Suravee, linux-kernel, iommu
  Cc: joro, will, jsnitsel, pmenzel, Jon.Grimm, Tj, Shuah Khan,
	Alexander Monakov, Alex Hung

[-- Attachment #1: Type: text/plain, Size: 775 bytes --]

Hi Suravee!

I've re-run your revert+update patch on Ubuntu's latest kernel 5.11.0-14 
partly to check my mailer's 'mangling' hadn't also reached the code!

There are 3 sets of results in the attachment, all for the Ryzen 2400G. 
The as-distributed kernel already incorporates your IOMMU RFCv3 patch.

A. As-distributed kernel (cold boot)
    >5 retries, so no IOMMU read/write capability, no amd_iommu events.

B. As-distributed kernel (warm boot)
    <5 retries, amd_iommu running stats show large numbers as before.

C. Revert+Update kernel
    amd_iommu events listed and also show large hit/miss numbers.

In due course, I'll load the new (revert+update) kernel on the 4700G but 
won't overload your mail-box unless something unusual turns up.

Best regards,

-- 
David

[-- Attachment #2: 2400G-5.11.0-14.txt --]
[-- Type: text/plain, Size: 10025 bytes --]

A. As Supplied - Cold Boot
**************************

$ sudo dmesg | grep IOMMU
[    0.710610] pci 0000:00:00.2: AMD-Vi: Unable to read/write to IOMMU perf counter.
[    0.714365] pci 0000:00:00.2: AMD-Vi: Found IOMMU cap 0x40
[    0.984616] AMD-Vi: AMD IOMMUv2 driver by Joerg Roedel <jroedel@suse.de>

$ sudo perf list | grep iommu
  intel_iommu:bounce_map_sg                          [Tracepoint event]
  intel_iommu:bounce_map_single                      [Tracepoint event]
  intel_iommu:bounce_unmap_single                    [Tracepoint event]
  intel_iommu:map_sg                                 [Tracepoint event]
  intel_iommu:map_single                             [Tracepoint event]
  intel_iommu:unmap_sg                               [Tracepoint event]
  intel_iommu:unmap_single                           [Tracepoint event]
  iommu:add_device_to_group                          [Tracepoint event]
  iommu:attach_device_to_domain                      [Tracepoint event]
  iommu:detach_device_from_domain                    [Tracepoint event]
  iommu:io_page_fault                                [Tracepoint event]
  iommu:map                                          [Tracepoint event]
  iommu:remove_device_from_group                     [Tracepoint event]
  iommu:unmap                                        [Tracepoint event]

No amd_iommu events listed.


B. As Supplied - Warm Boot
**************************

$ sudo dmesg | grep IOMMU
[    0.515523] pci 0000:00:00.2: AMD-Vi: IOMMU performance counters supported
[    0.519236] pci 0000:00:00.2: AMD-Vi: Found IOMMU cap 0x40
[    0.520549] perf/amd_iommu: Detected AMD IOMMU #0 (2 banks, 4 counters/bank).
[    0.795781] AMD-Vi: AMD IOMMUv2 driver by Joerg Roedel <jroedel@suse.de>

$ sudo perf stat -e 'amd_iommu_0/cmd_processed/, amd_iommu_0/cmd_processed_inv/, amd_iommu_0/ign_rd_wr_mmio_1ff8h/, amd_iommu_0/int_dte_hit/, amd_iommu_0/int_dte_mis/, amd_iommu_0/mem_dte_hit/, amd_iommu_0/mem_dte_mis/, amd_iommu_0/mem_iommu_tlb_pde_hit/, amd_iommu_0/mem_iommu_tlb_pde_mis/, amd_iommu_0/mem_iommu_tlb_pte_hit/, amd_iommu_0/mem_iommu_tlb_pte_mis/, amd_iommu_0/mem_pass_excl/, amd_iommu_0/mem_pass_pretrans/, amd_iommu_0/mem_pass_untrans/, amd_iommu_0/mem_target_abort/, amd_iommu_0/mem_trans_total/, amd_iommu_0/page_tbl_read_gst/, amd_iommu_0/page_tbl_read_nst/, amd_iommu_0/page_tbl_read_tot/, amd_iommu_0/smi_blk/, amd_iommu_0/smi_recv/, amd_iommu_0/tlb_inv/, amd_iommu_0/vapic_int_guest/, amd_iommu_0/vapic_int_non_guest/' sleep 10

Performance counter stats for 'system wide':

                  0       amd_iommu_0/cmd_processed/           (33.39%)
                  0       amd_iommu_0/cmd_processed_inv/       (33.06%)
                  0       amd_iommu_0/ign_rd_wr_mmio_1ff8h/    (33.58%)
842,245,888,947,849       amd_iommu_0/int_dte_hit/             (33.42%)
848,982,159,636,118       amd_iommu_0/int_dte_mis/             (33.15%)
835,698,854,752,581       amd_iommu_0/mem_dte_hit/             (33.68%)
839,060,819,932,270       amd_iommu_0/mem_dte_mis/             (33.55%)
                  0       amd_iommu_0/mem_iommu_tlb_pde_hit/   (33.44%)
837,231,240,047,576       amd_iommu_0/mem_iommu_tlb_pde_mis/   (33.62%)
842,688,371,629,123       amd_iommu_0/mem_iommu_tlb_pte_hit/   (33.40%)
851,647,568,857,291       amd_iommu_0/mem_iommu_tlb_pte_mis/   (33.05%)
                  0       amd_iommu_0/mem_pass_excl/           (33.30%)
                  0       amd_iommu_0/mem_pass_pretrans/       (33.36%)
852,801,037,224,491       amd_iommu_0/mem_pass_untrans/        (33.01%)
                  0       amd_iommu_0/mem_target_abort/        (33.50%)
             46,371        amd_iommu_0/mem_trans_total/        (33.28%)
                  0       amd_iommu_0/page_tbl_read_gst/       (33.00%)
              1,663        amd_iommu_0/page_tbl_read_nst/      (33.55%)
                 17        amd_iommu_0/page_tbl_read_tot/      (33.38%)
                  0       amd_iommu_0/smi_blk/                 (33.28%)
                  0       amd_iommu_0/smi_recv/                (33.49%)
                  0       amd_iommu_0/tlb_inv/                 (33.32%)
                  0       amd_iommu_0/vapic_int_guest/         (32.96%)
                318       amd_iommu_0/vapic_int_non_guest/     (33.28%)

       10.001298695 seconds time elapsed
 

C. Revert + Update
******************

$ sudo dmesg | grep IOMMU
[    0.557917] pci 0000:00:00.2: AMD-Vi: IOMMU performance counters supported
[    0.561705] pci 0000:00:00.2: AMD-Vi: Found IOMMU cap 0x40
[    0.563004] perf/amd_iommu: Detected AMD IOMMU #0 (2 banks, 4 counters/bank).
[    0.905418] AMD-Vi: AMD IOMMUv2 driver by Joerg Roedel <jroedel@suse.de>
 
$ sudo perf list | grep iommu
  amd_iommu_0/cmd_processed/                         [Kernel PMU event]
  amd_iommu_0/cmd_processed_inv/                     [Kernel PMU event]
  amd_iommu_0/ign_rd_wr_mmio_1ff8h/                  [Kernel PMU event]
  amd_iommu_0/int_dte_hit/                           [Kernel PMU event]
  amd_iommu_0/int_dte_mis/                           [Kernel PMU event]
  amd_iommu_0/mem_dte_hit/                           [Kernel PMU event]
  amd_iommu_0/mem_dte_mis/                           [Kernel PMU event]
  amd_iommu_0/mem_iommu_tlb_pde_hit/                 [Kernel PMU event]
  amd_iommu_0/mem_iommu_tlb_pde_mis/                 [Kernel PMU event]
  amd_iommu_0/mem_iommu_tlb_pte_hit/                 [Kernel PMU event]
  amd_iommu_0/mem_iommu_tlb_pte_mis/                 [Kernel PMU event]
  amd_iommu_0/mem_pass_excl/                         [Kernel PMU event]
  amd_iommu_0/mem_pass_pretrans/                     [Kernel PMU event]
  amd_iommu_0/mem_pass_untrans/                      [Kernel PMU event]
  amd_iommu_0/mem_target_abort/                      [Kernel PMU event]
  amd_iommu_0/mem_trans_total/                       [Kernel PMU event]
  amd_iommu_0/page_tbl_read_gst/                     [Kernel PMU event]
  amd_iommu_0/page_tbl_read_nst/                     [Kernel PMU event]
  amd_iommu_0/page_tbl_read_tot/                     [Kernel PMU event]
  amd_iommu_0/smi_blk/                               [Kernel PMU event]
  amd_iommu_0/smi_recv/                              [Kernel PMU event]
  amd_iommu_0/tlb_inv/                               [Kernel PMU event]
  amd_iommu_0/vapic_int_guest/                       [Kernel PMU event]
  amd_iommu_0/vapic_int_non_guest/                   [Kernel PMU event]
  intel_iommu:bounce_map_sg                          [Tracepoint event]
  intel_iommu:bounce_map_single                      [Tracepoint event]
  intel_iommu:bounce_unmap_single                    [Tracepoint event]
  intel_iommu:map_sg                                 [Tracepoint event]
  intel_iommu:map_single                             [Tracepoint event]
  intel_iommu:unmap_sg                               [Tracepoint event]
  intel_iommu:unmap_single                           [Tracepoint event]
  iommu:add_device_to_group                          [Tracepoint event]
  iommu:attach_device_to_domain                      [Tracepoint event]
  iommu:detach_device_from_domain                    [Tracepoint event]
  iommu:io_page_fault                                [Tracepoint event]
  iommu:map                                          [Tracepoint event]
  iommu:remove_device_from_group                     [Tracepoint event]
  iommu:unmap                                        [Tracepoint event]

$ sudo perf stat -e 'amd_iommu_0/cmd_processed/, amd_iommu_0/cmd_processed_inv/, amd_iommu_0/ign_rd_wr_mmio_1ff8h/, amd_iommu_0/int_dte_hit/, amd_iommu_0/int_dte_mis/, amd_iommu_0/mem_dte_hit/, amd_iommu_0/mem_dte_mis/, amd_iommu_0/mem_iommu_tlb_pde_hit/, amd_iommu_0/mem_iommu_tlb_pde_mis/, amd_iommu_0/mem_iommu_tlb_pte_hit/, amd_iommu_0/mem_iommu_tlb_pte_mis/, amd_iommu_0/mem_pass_excl/, amd_iommu_0/mem_pass_pretrans/, amd_iommu_0/mem_pass_untrans/, amd_iommu_0/mem_target_abort/, amd_iommu_0/mem_trans_total/, amd_iommu_0/page_tbl_read_gst/, amd_iommu_0/page_tbl_read_nst/, amd_iommu_0/page_tbl_read_tot/, amd_iommu_0/smi_blk/, amd_iommu_0/smi_recv/, amd_iommu_0/tlb_inv/, amd_iommu_0/vapic_int_guest/, amd_iommu_0/vapic_int_non_guest/' sleep 10

Performance counter stats for 'system wide':

                 36       amd_iommu_0/cmd_processed/           (33.30%)
                 20       amd_iommu_0/cmd_processed_inv/       (33.34%)
                  0       amd_iommu_0/ign_rd_wr_mmio_1ff8h/    (33.38%)
842,158,150,822,877       amd_iommu_0/int_dte_hit/             (33.42%)
                  0       amd_iommu_0/int_dte_mis/             (33.44%)
                310       amd_iommu_0/mem_dte_hit/             (33.44%)
841,709,432,950,259       amd_iommu_0/mem_dte_mis/             (33.44%)
                  0       amd_iommu_0/mem_iommu_tlb_pde_hit/   (33.44%)
                320       amd_iommu_0/mem_iommu_tlb_pde_mis/   (33.42%)
                182       amd_iommu_0/mem_iommu_tlb_pte_hit/   (33.38%)
             10,863       amd_iommu_0/mem_iommu_tlb_pte_mis/   (33.34%)
                  0       amd_iommu_0/mem_pass_excl/           (33.30%)
                  0       amd_iommu_0/mem_pass_pretrans/       (33.28%)
             10,834       amd_iommu_0/mem_pass_untrans/        (33.28%)
                  0       amd_iommu_0/mem_target_abort/        (33.28%)
              1,430       amd_iommu_0/mem_trans_total/         (33.28%)
                  0       amd_iommu_0/page_tbl_read_gst/       (33.28%)
                201       amd_iommu_0/page_tbl_read_nst/       (33.28%)
                192       amd_iommu_0/page_tbl_read_tot/       (33.28%)
                  0       amd_iommu_0/smi_blk/                 (33.28%)
                  0       amd_iommu_0/smi_recv/                (33.28%)
                  0       amd_iommu_0/tlb_inv/                 (33.28%)
                  0       amd_iommu_0/vapic_int_guest/         (33.28%)
                336       amd_iommu_0/vapic_int_non_guest/     (33.28%)

       10.001183400 seconds time elapsed

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 2/2] iommu/amd: Remove performance counter pre-initialization test
@ 2021-04-14 15:33         ` David Coe
  0 siblings, 0 replies; 46+ messages in thread
From: David Coe @ 2021-04-14 15:33 UTC (permalink / raw)
  To: Suthikulpanit, Suravee, linux-kernel, iommu
  Cc: pmenzel, Alexander Monakov, Alex Hung, Jon.Grimm, Shuah Khan, Tj, will

[-- Attachment #1: Type: text/plain, Size: 775 bytes --]

Hi Suravee!

I've re-run your revert+update patch on Ubuntu's latest kernel 5.11.0-14 
partly to check my mailer's 'mangling' hadn't also reached the code!

There are 3 sets of results in the attachment, all for the Ryzen 2400G. 
The as-distributed kernel already incorporates your IOMMU RFCv3 patch.

A. As-distributed kernel (cold boot)
    >5 retries, so no IOMMU read/write capability, no amd_iommu events.

B. As-distributed kernel (warm boot)
    <5 retries, amd_iommu running stats show large numbers as before.

C. Revert+Update kernel
    amd_iommu events listed and also show large hit/miss numbers.

In due course, I'll load the new (revert+update) kernel on the 4700G but 
won't overload your mail-box unless something unusual turns up.

Best regards,

-- 
David

[-- Attachment #2: 2400G-5.11.0-14.txt --]
[-- Type: text/plain, Size: 10025 bytes --]

A. As Supplied - Cold Boot
**************************

$ sudo dmesg | grep IOMMU
[    0.710610] pci 0000:00:00.2: AMD-Vi: Unable to read/write to IOMMU perf counter.
[    0.714365] pci 0000:00:00.2: AMD-Vi: Found IOMMU cap 0x40
[    0.984616] AMD-Vi: AMD IOMMUv2 driver by Joerg Roedel <jroedel@suse.de>

$ sudo perf list | grep iommu
  intel_iommu:bounce_map_sg                          [Tracepoint event]
  intel_iommu:bounce_map_single                      [Tracepoint event]
  intel_iommu:bounce_unmap_single                    [Tracepoint event]
  intel_iommu:map_sg                                 [Tracepoint event]
  intel_iommu:map_single                             [Tracepoint event]
  intel_iommu:unmap_sg                               [Tracepoint event]
  intel_iommu:unmap_single                           [Tracepoint event]
  iommu:add_device_to_group                          [Tracepoint event]
  iommu:attach_device_to_domain                      [Tracepoint event]
  iommu:detach_device_from_domain                    [Tracepoint event]
  iommu:io_page_fault                                [Tracepoint event]
  iommu:map                                          [Tracepoint event]
  iommu:remove_device_from_group                     [Tracepoint event]
  iommu:unmap                                        [Tracepoint event]

No amd_iommu events listed.


B. As Supplied - Warm Boot
**************************

$ sudo dmesg | grep IOMMU
[    0.515523] pci 0000:00:00.2: AMD-Vi: IOMMU performance counters supported
[    0.519236] pci 0000:00:00.2: AMD-Vi: Found IOMMU cap 0x40
[    0.520549] perf/amd_iommu: Detected AMD IOMMU #0 (2 banks, 4 counters/bank).
[    0.795781] AMD-Vi: AMD IOMMUv2 driver by Joerg Roedel <jroedel@suse.de>

$ sudo perf stat -e 'amd_iommu_0/cmd_processed/, amd_iommu_0/cmd_processed_inv/, amd_iommu_0/ign_rd_wr_mmio_1ff8h/, amd_iommu_0/int_dte_hit/, amd_iommu_0/int_dte_mis/, amd_iommu_0/mem_dte_hit/, amd_iommu_0/mem_dte_mis/, amd_iommu_0/mem_iommu_tlb_pde_hit/, amd_iommu_0/mem_iommu_tlb_pde_mis/, amd_iommu_0/mem_iommu_tlb_pte_hit/, amd_iommu_0/mem_iommu_tlb_pte_mis/, amd_iommu_0/mem_pass_excl/, amd_iommu_0/mem_pass_pretrans/, amd_iommu_0/mem_pass_untrans/, amd_iommu_0/mem_target_abort/, amd_iommu_0/mem_trans_total/, amd_iommu_0/page_tbl_read_gst/, amd_iommu_0/page_tbl_read_nst/, amd_iommu_0/page_tbl_read_tot/, amd_iommu_0/smi_blk/, amd_iommu_0/smi_recv/, amd_iommu_0/tlb_inv/, amd_iommu_0/vapic_int_guest/, amd_iommu_0/vapic_int_non_guest/' sleep 10

Performance counter stats for 'system wide':

                  0       amd_iommu_0/cmd_processed/           (33.39%)
                  0       amd_iommu_0/cmd_processed_inv/       (33.06%)
                  0       amd_iommu_0/ign_rd_wr_mmio_1ff8h/    (33.58%)
842,245,888,947,849       amd_iommu_0/int_dte_hit/             (33.42%)
848,982,159,636,118       amd_iommu_0/int_dte_mis/             (33.15%)
835,698,854,752,581       amd_iommu_0/mem_dte_hit/             (33.68%)
839,060,819,932,270       amd_iommu_0/mem_dte_mis/             (33.55%)
                  0       amd_iommu_0/mem_iommu_tlb_pde_hit/   (33.44%)
837,231,240,047,576       amd_iommu_0/mem_iommu_tlb_pde_mis/   (33.62%)
842,688,371,629,123       amd_iommu_0/mem_iommu_tlb_pte_hit/   (33.40%)
851,647,568,857,291       amd_iommu_0/mem_iommu_tlb_pte_mis/   (33.05%)
                  0       amd_iommu_0/mem_pass_excl/           (33.30%)
                  0       amd_iommu_0/mem_pass_pretrans/       (33.36%)
852,801,037,224,491       amd_iommu_0/mem_pass_untrans/        (33.01%)
                  0       amd_iommu_0/mem_target_abort/        (33.50%)
             46,371        amd_iommu_0/mem_trans_total/        (33.28%)
                  0       amd_iommu_0/page_tbl_read_gst/       (33.00%)
              1,663        amd_iommu_0/page_tbl_read_nst/      (33.55%)
                 17        amd_iommu_0/page_tbl_read_tot/      (33.38%)
                  0       amd_iommu_0/smi_blk/                 (33.28%)
                  0       amd_iommu_0/smi_recv/                (33.49%)
                  0       amd_iommu_0/tlb_inv/                 (33.32%)
                  0       amd_iommu_0/vapic_int_guest/         (32.96%)
                318       amd_iommu_0/vapic_int_non_guest/     (33.28%)

       10.001298695 seconds time elapsed
 

C. Revert + Update
******************

$ sudo dmesg | grep IOMMU
[    0.557917] pci 0000:00:00.2: AMD-Vi: IOMMU performance counters supported
[    0.561705] pci 0000:00:00.2: AMD-Vi: Found IOMMU cap 0x40
[    0.563004] perf/amd_iommu: Detected AMD IOMMU #0 (2 banks, 4 counters/bank).
[    0.905418] AMD-Vi: AMD IOMMUv2 driver by Joerg Roedel <jroedel@suse.de>
 
$ sudo perf list | grep iommu
  amd_iommu_0/cmd_processed/                         [Kernel PMU event]
  amd_iommu_0/cmd_processed_inv/                     [Kernel PMU event]
  amd_iommu_0/ign_rd_wr_mmio_1ff8h/                  [Kernel PMU event]
  amd_iommu_0/int_dte_hit/                           [Kernel PMU event]
  amd_iommu_0/int_dte_mis/                           [Kernel PMU event]
  amd_iommu_0/mem_dte_hit/                           [Kernel PMU event]
  amd_iommu_0/mem_dte_mis/                           [Kernel PMU event]
  amd_iommu_0/mem_iommu_tlb_pde_hit/                 [Kernel PMU event]
  amd_iommu_0/mem_iommu_tlb_pde_mis/                 [Kernel PMU event]
  amd_iommu_0/mem_iommu_tlb_pte_hit/                 [Kernel PMU event]
  amd_iommu_0/mem_iommu_tlb_pte_mis/                 [Kernel PMU event]
  amd_iommu_0/mem_pass_excl/                         [Kernel PMU event]
  amd_iommu_0/mem_pass_pretrans/                     [Kernel PMU event]
  amd_iommu_0/mem_pass_untrans/                      [Kernel PMU event]
  amd_iommu_0/mem_target_abort/                      [Kernel PMU event]
  amd_iommu_0/mem_trans_total/                       [Kernel PMU event]
  amd_iommu_0/page_tbl_read_gst/                     [Kernel PMU event]
  amd_iommu_0/page_tbl_read_nst/                     [Kernel PMU event]
  amd_iommu_0/page_tbl_read_tot/                     [Kernel PMU event]
  amd_iommu_0/smi_blk/                               [Kernel PMU event]
  amd_iommu_0/smi_recv/                              [Kernel PMU event]
  amd_iommu_0/tlb_inv/                               [Kernel PMU event]
  amd_iommu_0/vapic_int_guest/                       [Kernel PMU event]
  amd_iommu_0/vapic_int_non_guest/                   [Kernel PMU event]
  intel_iommu:bounce_map_sg                          [Tracepoint event]
  intel_iommu:bounce_map_single                      [Tracepoint event]
  intel_iommu:bounce_unmap_single                    [Tracepoint event]
  intel_iommu:map_sg                                 [Tracepoint event]
  intel_iommu:map_single                             [Tracepoint event]
  intel_iommu:unmap_sg                               [Tracepoint event]
  intel_iommu:unmap_single                           [Tracepoint event]
  iommu:add_device_to_group                          [Tracepoint event]
  iommu:attach_device_to_domain                      [Tracepoint event]
  iommu:detach_device_from_domain                    [Tracepoint event]
  iommu:io_page_fault                                [Tracepoint event]
  iommu:map                                          [Tracepoint event]
  iommu:remove_device_from_group                     [Tracepoint event]
  iommu:unmap                                        [Tracepoint event]

$ sudo perf stat -e 'amd_iommu_0/cmd_processed/, amd_iommu_0/cmd_processed_inv/, amd_iommu_0/ign_rd_wr_mmio_1ff8h/, amd_iommu_0/int_dte_hit/, amd_iommu_0/int_dte_mis/, amd_iommu_0/mem_dte_hit/, amd_iommu_0/mem_dte_mis/, amd_iommu_0/mem_iommu_tlb_pde_hit/, amd_iommu_0/mem_iommu_tlb_pde_mis/, amd_iommu_0/mem_iommu_tlb_pte_hit/, amd_iommu_0/mem_iommu_tlb_pte_mis/, amd_iommu_0/mem_pass_excl/, amd_iommu_0/mem_pass_pretrans/, amd_iommu_0/mem_pass_untrans/, amd_iommu_0/mem_target_abort/, amd_iommu_0/mem_trans_total/, amd_iommu_0/page_tbl_read_gst/, amd_iommu_0/page_tbl_read_nst/, amd_iommu_0/page_tbl_read_tot/, amd_iommu_0/smi_blk/, amd_iommu_0/smi_recv/, amd_iommu_0/tlb_inv/, amd_iommu_0/vapic_int_guest/, amd_iommu_0/vapic_int_non_guest/' sleep 10

Performance counter stats for 'system wide':

                 36       amd_iommu_0/cmd_processed/           (33.30%)
                 20       amd_iommu_0/cmd_processed_inv/       (33.34%)
                  0       amd_iommu_0/ign_rd_wr_mmio_1ff8h/    (33.38%)
842,158,150,822,877       amd_iommu_0/int_dte_hit/             (33.42%)
                  0       amd_iommu_0/int_dte_mis/             (33.44%)
                310       amd_iommu_0/mem_dte_hit/             (33.44%)
841,709,432,950,259       amd_iommu_0/mem_dte_mis/             (33.44%)
                  0       amd_iommu_0/mem_iommu_tlb_pde_hit/   (33.44%)
                320       amd_iommu_0/mem_iommu_tlb_pde_mis/   (33.42%)
                182       amd_iommu_0/mem_iommu_tlb_pte_hit/   (33.38%)
             10,863       amd_iommu_0/mem_iommu_tlb_pte_mis/   (33.34%)
                  0       amd_iommu_0/mem_pass_excl/           (33.30%)
                  0       amd_iommu_0/mem_pass_pretrans/       (33.28%)
             10,834       amd_iommu_0/mem_pass_untrans/        (33.28%)
                  0       amd_iommu_0/mem_target_abort/        (33.28%)
              1,430       amd_iommu_0/mem_trans_total/         (33.28%)
                  0       amd_iommu_0/page_tbl_read_gst/       (33.28%)
                201       amd_iommu_0/page_tbl_read_nst/       (33.28%)
                192       amd_iommu_0/page_tbl_read_tot/       (33.28%)
                  0       amd_iommu_0/smi_blk/                 (33.28%)
                  0       amd_iommu_0/smi_recv/                (33.28%)
                  0       amd_iommu_0/tlb_inv/                 (33.28%)
                  0       amd_iommu_0/vapic_int_guest/         (33.28%)
                336       amd_iommu_0/vapic_int_non_guest/     (33.28%)

       10.001183400 seconds time elapsed

[-- Attachment #3: Type: text/plain, Size: 156 bytes --]

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 2/2] iommu/amd: Remove performance counter pre-initialization test
  2021-04-13 13:51       ` Suthikulpanit, Suravee
@ 2021-04-14 22:18         ` David Coe
  -1 siblings, 0 replies; 46+ messages in thread
From: David Coe @ 2021-04-14 22:18 UTC (permalink / raw)
  To: Suthikulpanit, Suravee, linux-kernel, iommu
  Cc: joro, will, jsnitsel, pmenzel, Jon.Grimm, Tj, Shuah Khan,
	Alexander Monakov, Alex Hung

[-- Attachment #1: Type: text/plain, Size: 411 bytes --]

Hi again!

For completeness, I'm attaching results for the revert+update patch 
running the Ubuntu 21.04β kernel 5.11.0-14 on a Ryzen 4700U laptop.

The enormous amd_iommu running stats aren't always there, as they nearly 
always are on the the 2400G desktop, but they do turn up (depending on 
what the machine's been doing).

Be very interested in your thoughts on their relevance!

Best regards,

-- 
David

[-- Attachment #2: 4700U-5.11.0-14.txt --]
[-- Type: text/plain, Size: 5661 bytes --]

$ sudo dmesg | grep IOMMU
[    0.498593] pci 0000:00:00.2: AMD-Vi: IOMMU performance counters supported
[    0.500507] pci 0000:00:00.2: AMD-Vi: Found IOMMU cap 0x40
[    0.502011] perf/amd_iommu: Detected AMD IOMMU #0 (2 banks, 4 counters/bank).
[    1.113195] AMD-Vi: AMD IOMMUv2 driver by Joerg Roedel <jroedel@suse.de>


$ sudo perf list | grep iommu
  amd_iommu_0/cmd_processed/                         [Kernel PMU event]
  amd_iommu_0/cmd_processed_inv/                     [Kernel PMU event]
  amd_iommu_0/ign_rd_wr_mmio_1ff8h/                  [Kernel PMU event]
  amd_iommu_0/int_dte_hit/                           [Kernel PMU event]
  amd_iommu_0/int_dte_mis/                           [Kernel PMU event]
  amd_iommu_0/mem_dte_hit/                           [Kernel PMU event]
  amd_iommu_0/mem_dte_mis/                           [Kernel PMU event]
  amd_iommu_0/mem_iommu_tlb_pde_hit/                 [Kernel PMU event]
  amd_iommu_0/mem_iommu_tlb_pde_mis/                 [Kernel PMU event]
  amd_iommu_0/mem_iommu_tlb_pte_hit/                 [Kernel PMU event]
  amd_iommu_0/mem_iommu_tlb_pte_mis/                 [Kernel PMU event]
  amd_iommu_0/mem_pass_excl/                         [Kernel PMU event]
  amd_iommu_0/mem_pass_pretrans/                     [Kernel PMU event]
  amd_iommu_0/mem_pass_untrans/                      [Kernel PMU event]
  amd_iommu_0/mem_target_abort/                      [Kernel PMU event]
  amd_iommu_0/mem_trans_total/                       [Kernel PMU event]
  amd_iommu_0/page_tbl_read_gst/                     [Kernel PMU event]
  amd_iommu_0/page_tbl_read_nst/                     [Kernel PMU event]
  amd_iommu_0/page_tbl_read_tot/                     [Kernel PMU event]
  amd_iommu_0/smi_blk/                               [Kernel PMU event]
  amd_iommu_0/smi_recv/                              [Kernel PMU event]
  amd_iommu_0/tlb_inv/                               [Kernel PMU event]
  amd_iommu_0/vapic_int_guest/                       [Kernel PMU event]
  amd_iommu_0/vapic_int_non_guest/                   [Kernel PMU event]
  intel_iommu:bounce_map_sg                          [Tracepoint event]
  intel_iommu:bounce_map_single                      [Tracepoint event]
  intel_iommu:bounce_unmap_single                    [Tracepoint event]
  intel_iommu:map_sg                                 [Tracepoint event]
  intel_iommu:map_single                             [Tracepoint event]
  intel_iommu:unmap_sg                               [Tracepoint event]
  intel_iommu:unmap_single                           [Tracepoint event]
  iommu:add_device_to_group                          [Tracepoint event]
  iommu:attach_device_to_domain                      [Tracepoint event]
  iommu:detach_device_from_domain                    [Tracepoint event]
  iommu:io_page_fault                                [Tracepoint event]
  iommu:map                                          [Tracepoint event]
  iommu:remove_device_from_group                     [Tracepoint event]
  iommu:unmap                                        [Tracepoint event]

$ sudo perf stat -e 'amd_iommu_0/cmd_processed/, amd_iommu_0/cmd_processed_inv/, amd_iommu_0/ign_rd_wr_mmio_1ff8h/, amd_iommu_0/int_dte_hit/, amd_iommu_0/int_dte_mis/, amd_iommu_0/mem_dte_hit/, amd_iommu_0/mem_dte_mis/, amd_iommu_0/mem_iommu_tlb_pde_hit/, amd_iommu_0/mem_iommu_tlb_pde_mis/, amd_iommu_0/mem_iommu_tlb_pte_hit/, amd_iommu_0/mem_iommu_tlb_pte_mis/, amd_iommu_0/mem_pass_excl/, amd_iommu_0/mem_pass_pretrans/, amd_iommu_0/mem_pass_untrans/, amd_iommu_0/mem_target_abort/, amd_iommu_0/mem_trans_total/, amd_iommu_0/page_tbl_read_gst/, amd_iommu_0/page_tbl_read_nst/, amd_iommu_0/page_tbl_read_tot/, amd_iommu_0/smi_blk/, amd_iommu_0/smi_recv/, amd_iommu_0/tlb_inv/, amd_iommu_0/vapic_int_guest/, amd_iommu_0/vapic_int_non_guest/' sleep 10

Performance counter stats for 'system wide':

                30      amd_iommu_0/cmd_processed/             (33.31%)
                17       amd_iommu_0/cmd_processed_inv/        (33.34%)
                 0       amd_iommu_0/ign_rd_wr_mmio_1ff8h/     (33.36%)
               374       amd_iommu_0/int_dte_hit/              (33.39%)
                29       amd_iommu_0/int_dte_mis/              (33.44%)
               394       amd_iommu_0/mem_dte_hit/              (33.46%)
             9,117       amd_iommu_0/mem_dte_mis/              (33.45%)
                 5       amd_iommu_0/mem_iommu_tlb_pde_hit/    (33.46%)
               819       amd_iommu_0/mem_iommu_tlb_pde_mis/    (33.42%)
                 2       amd_iommu_0/mem_iommu_tlb_pte_hit/    (33.39%)
             1,012       amd_iommu_0/mem_iommu_tlb_pte_mis/    (33.37%)
                 0       amd_iommu_0/mem_pass_excl/            (33.34%)
                 0       amd_iommu_0/mem_pass_pretrans/        (33.29%)
             1,553       amd_iommu_0/mem_pass_untrans/         (33.28%)
                 0       amd_iommu_0/mem_target_abort/         (33.28%)
            39,207       amd_iommu_0/mem_trans_total/          (33.27%)
                 0       amd_iommu_0/page_tbl_read_gst/        (33.27%)
             1,625       amd_iommu_0/page_tbl_read_nst/        (33.27%)
               901       amd_iommu_0/page_tbl_read_tot/        (33.27%)
                 0       amd_iommu_0/smi_blk/                  (33.7%)
                 0       amd_iommu_0/smi_recv/                 (33.28%)
                 0       amd_iommu_0/tlb_inv/                  (33.27%)
                 0       amd_iommu_0/vapic_int_guest/          (33.27%)
               420       amd_iommu_0/vapic_int_non_guest/      (33.27%)

      10.003004096 seconds time elapsed



^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 2/2] iommu/amd: Remove performance counter pre-initialization test
@ 2021-04-14 22:18         ` David Coe
  0 siblings, 0 replies; 46+ messages in thread
From: David Coe @ 2021-04-14 22:18 UTC (permalink / raw)
  To: Suthikulpanit, Suravee, linux-kernel, iommu
  Cc: pmenzel, Alexander Monakov, Alex Hung, Jon.Grimm, Shuah Khan, Tj, will

[-- Attachment #1: Type: text/plain, Size: 411 bytes --]

Hi again!

For completeness, I'm attaching results for the revert+update patch 
running the Ubuntu 21.04β kernel 5.11.0-14 on a Ryzen 4700U laptop.

The enormous amd_iommu running stats aren't always there, as they nearly 
always are on the the 2400G desktop, but they do turn up (depending on 
what the machine's been doing).

Be very interested in your thoughts on their relevance!

Best regards,

-- 
David

[-- Attachment #2: 4700U-5.11.0-14.txt --]
[-- Type: text/plain, Size: 5661 bytes --]

$ sudo dmesg | grep IOMMU
[    0.498593] pci 0000:00:00.2: AMD-Vi: IOMMU performance counters supported
[    0.500507] pci 0000:00:00.2: AMD-Vi: Found IOMMU cap 0x40
[    0.502011] perf/amd_iommu: Detected AMD IOMMU #0 (2 banks, 4 counters/bank).
[    1.113195] AMD-Vi: AMD IOMMUv2 driver by Joerg Roedel <jroedel@suse.de>


$ sudo perf list | grep iommu
  amd_iommu_0/cmd_processed/                         [Kernel PMU event]
  amd_iommu_0/cmd_processed_inv/                     [Kernel PMU event]
  amd_iommu_0/ign_rd_wr_mmio_1ff8h/                  [Kernel PMU event]
  amd_iommu_0/int_dte_hit/                           [Kernel PMU event]
  amd_iommu_0/int_dte_mis/                           [Kernel PMU event]
  amd_iommu_0/mem_dte_hit/                           [Kernel PMU event]
  amd_iommu_0/mem_dte_mis/                           [Kernel PMU event]
  amd_iommu_0/mem_iommu_tlb_pde_hit/                 [Kernel PMU event]
  amd_iommu_0/mem_iommu_tlb_pde_mis/                 [Kernel PMU event]
  amd_iommu_0/mem_iommu_tlb_pte_hit/                 [Kernel PMU event]
  amd_iommu_0/mem_iommu_tlb_pte_mis/                 [Kernel PMU event]
  amd_iommu_0/mem_pass_excl/                         [Kernel PMU event]
  amd_iommu_0/mem_pass_pretrans/                     [Kernel PMU event]
  amd_iommu_0/mem_pass_untrans/                      [Kernel PMU event]
  amd_iommu_0/mem_target_abort/                      [Kernel PMU event]
  amd_iommu_0/mem_trans_total/                       [Kernel PMU event]
  amd_iommu_0/page_tbl_read_gst/                     [Kernel PMU event]
  amd_iommu_0/page_tbl_read_nst/                     [Kernel PMU event]
  amd_iommu_0/page_tbl_read_tot/                     [Kernel PMU event]
  amd_iommu_0/smi_blk/                               [Kernel PMU event]
  amd_iommu_0/smi_recv/                              [Kernel PMU event]
  amd_iommu_0/tlb_inv/                               [Kernel PMU event]
  amd_iommu_0/vapic_int_guest/                       [Kernel PMU event]
  amd_iommu_0/vapic_int_non_guest/                   [Kernel PMU event]
  intel_iommu:bounce_map_sg                          [Tracepoint event]
  intel_iommu:bounce_map_single                      [Tracepoint event]
  intel_iommu:bounce_unmap_single                    [Tracepoint event]
  intel_iommu:map_sg                                 [Tracepoint event]
  intel_iommu:map_single                             [Tracepoint event]
  intel_iommu:unmap_sg                               [Tracepoint event]
  intel_iommu:unmap_single                           [Tracepoint event]
  iommu:add_device_to_group                          [Tracepoint event]
  iommu:attach_device_to_domain                      [Tracepoint event]
  iommu:detach_device_from_domain                    [Tracepoint event]
  iommu:io_page_fault                                [Tracepoint event]
  iommu:map                                          [Tracepoint event]
  iommu:remove_device_from_group                     [Tracepoint event]
  iommu:unmap                                        [Tracepoint event]

$ sudo perf stat -e 'amd_iommu_0/cmd_processed/, amd_iommu_0/cmd_processed_inv/, amd_iommu_0/ign_rd_wr_mmio_1ff8h/, amd_iommu_0/int_dte_hit/, amd_iommu_0/int_dte_mis/, amd_iommu_0/mem_dte_hit/, amd_iommu_0/mem_dte_mis/, amd_iommu_0/mem_iommu_tlb_pde_hit/, amd_iommu_0/mem_iommu_tlb_pde_mis/, amd_iommu_0/mem_iommu_tlb_pte_hit/, amd_iommu_0/mem_iommu_tlb_pte_mis/, amd_iommu_0/mem_pass_excl/, amd_iommu_0/mem_pass_pretrans/, amd_iommu_0/mem_pass_untrans/, amd_iommu_0/mem_target_abort/, amd_iommu_0/mem_trans_total/, amd_iommu_0/page_tbl_read_gst/, amd_iommu_0/page_tbl_read_nst/, amd_iommu_0/page_tbl_read_tot/, amd_iommu_0/smi_blk/, amd_iommu_0/smi_recv/, amd_iommu_0/tlb_inv/, amd_iommu_0/vapic_int_guest/, amd_iommu_0/vapic_int_non_guest/' sleep 10

Performance counter stats for 'system wide':

                30      amd_iommu_0/cmd_processed/             (33.31%)
                17       amd_iommu_0/cmd_processed_inv/        (33.34%)
                 0       amd_iommu_0/ign_rd_wr_mmio_1ff8h/     (33.36%)
               374       amd_iommu_0/int_dte_hit/              (33.39%)
                29       amd_iommu_0/int_dte_mis/              (33.44%)
               394       amd_iommu_0/mem_dte_hit/              (33.46%)
             9,117       amd_iommu_0/mem_dte_mis/              (33.45%)
                 5       amd_iommu_0/mem_iommu_tlb_pde_hit/    (33.46%)
               819       amd_iommu_0/mem_iommu_tlb_pde_mis/    (33.42%)
                 2       amd_iommu_0/mem_iommu_tlb_pte_hit/    (33.39%)
             1,012       amd_iommu_0/mem_iommu_tlb_pte_mis/    (33.37%)
                 0       amd_iommu_0/mem_pass_excl/            (33.34%)
                 0       amd_iommu_0/mem_pass_pretrans/        (33.29%)
             1,553       amd_iommu_0/mem_pass_untrans/         (33.28%)
                 0       amd_iommu_0/mem_target_abort/         (33.28%)
            39,207       amd_iommu_0/mem_trans_total/          (33.27%)
                 0       amd_iommu_0/page_tbl_read_gst/        (33.27%)
             1,625       amd_iommu_0/page_tbl_read_nst/        (33.27%)
               901       amd_iommu_0/page_tbl_read_tot/        (33.27%)
                 0       amd_iommu_0/smi_blk/                  (33.7%)
                 0       amd_iommu_0/smi_recv/                 (33.28%)
                 0       amd_iommu_0/tlb_inv/                  (33.27%)
                 0       amd_iommu_0/vapic_int_guest/          (33.27%)
               420       amd_iommu_0/vapic_int_non_guest/      (33.27%)

      10.003004096 seconds time elapsed



[-- Attachment #3: Type: text/plain, Size: 156 bytes --]

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 2/2] iommu/amd: Remove performance counter pre-initialization test
  2021-04-14 15:33         ` David Coe
@ 2021-04-15  9:28           ` Suthikulpanit, Suravee
  -1 siblings, 0 replies; 46+ messages in thread
From: Suthikulpanit, Suravee @ 2021-04-15  9:28 UTC (permalink / raw)
  To: David Coe, linux-kernel, iommu
  Cc: joro, will, jsnitsel, pmenzel, Jon.Grimm, Tj, Shuah Khan,
	Alexander Monakov, Alex Hung

David,

On 4/14/2021 10:33 PM, David Coe wrote:
> Hi Suravee!
> 
> I've re-run your revert+update patch on Ubuntu's latest kernel 5.11.0-14 partly to check my mailer's 'mangling' hadn't also reached the code!
> 
> There are 3 sets of results in the attachment, all for the Ryzen 2400G. The as-distributed kernel already incorporates your IOMMU RFCv3 patch.
> 
> A. As-distributed kernel (cold boot)
>     >5 retries, so no IOMMU read/write capability, no amd_iommu events.
> 
> B. As-distributed kernel (warm boot)
>     <5 retries, amd_iommu running stats show large numbers as before.
> 
> C. Revert+Update kernel
>     amd_iommu events listed and also show large hit/miss numbers.
> 
> In due course, I'll load the new (revert+update) kernel on the 4700G but won't overload your mail-box unless something unusual turns up.
> 
> Best regards,
> 

For the Ryzen 2400G, could you please try with:
- 1 event at a time
- Not more than 8 events (On your system, it has 2 banks x 4 counters/bank.
I am trying to see if this issue might be related to the counters multiplexing).

Thanks,
Suravee

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 2/2] iommu/amd: Remove performance counter pre-initialization test
@ 2021-04-15  9:28           ` Suthikulpanit, Suravee
  0 siblings, 0 replies; 46+ messages in thread
From: Suthikulpanit, Suravee @ 2021-04-15  9:28 UTC (permalink / raw)
  To: David Coe, linux-kernel, iommu
  Cc: pmenzel, Alexander Monakov, Alex Hung, Jon.Grimm, Shuah Khan, Tj, will

David,

On 4/14/2021 10:33 PM, David Coe wrote:
> Hi Suravee!
> 
> I've re-run your revert+update patch on Ubuntu's latest kernel 5.11.0-14 partly to check my mailer's 'mangling' hadn't also reached the code!
> 
> There are 3 sets of results in the attachment, all for the Ryzen 2400G. The as-distributed kernel already incorporates your IOMMU RFCv3 patch.
> 
> A. As-distributed kernel (cold boot)
>     >5 retries, so no IOMMU read/write capability, no amd_iommu events.
> 
> B. As-distributed kernel (warm boot)
>     <5 retries, amd_iommu running stats show large numbers as before.
> 
> C. Revert+Update kernel
>     amd_iommu events listed and also show large hit/miss numbers.
> 
> In due course, I'll load the new (revert+update) kernel on the 4700G but won't overload your mail-box unless something unusual turns up.
> 
> Best regards,
> 

For the Ryzen 2400G, could you please try with:
- 1 event at a time
- Not more than 8 events (On your system, it has 2 banks x 4 counters/bank.
I am trying to see if this issue might be related to the counters multiplexing).

Thanks,
Suravee
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 0/2] iommu/amd: Revert and remove failing PMC test
  2021-04-09  8:58 ` Suravee Suthikulpanit
@ 2021-04-15 13:41   ` Joerg Roedel
  -1 siblings, 0 replies; 46+ messages in thread
From: Joerg Roedel @ 2021-04-15 13:41 UTC (permalink / raw)
  To: Suravee Suthikulpanit
  Cc: linux-kernel, iommu, will, jsnitsel, pmenzel, Jon.Grimm

On Fri, Apr 09, 2021 at 03:58:46AM -0500, Suravee Suthikulpanit wrote:
> Paul Menzel (1):
>   Revert "iommu/amd: Fix performance counter initialization"
> 
> Suravee Suthikulpanit (1):
>   iommu/amd: Remove performance counter pre-initialization test

Applied, thanks Paul and Suravee.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 0/2] iommu/amd: Revert and remove failing PMC test
@ 2021-04-15 13:41   ` Joerg Roedel
  0 siblings, 0 replies; 46+ messages in thread
From: Joerg Roedel @ 2021-04-15 13:41 UTC (permalink / raw)
  To: Suravee Suthikulpanit; +Cc: pmenzel, Jon.Grimm, linux-kernel, iommu, will

On Fri, Apr 09, 2021 at 03:58:46AM -0500, Suravee Suthikulpanit wrote:
> Paul Menzel (1):
>   Revert "iommu/amd: Fix performance counter initialization"
> 
> Suravee Suthikulpanit (1):
>   iommu/amd: Remove performance counter pre-initialization test

Applied, thanks Paul and Suravee.
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 2/2] iommu/amd: Remove performance counter pre-initialization test
  2021-04-15  9:28           ` Suthikulpanit, Suravee
@ 2021-04-15 14:39             ` David Coe
  -1 siblings, 0 replies; 46+ messages in thread
From: David Coe @ 2021-04-15 14:39 UTC (permalink / raw)
  To: Suthikulpanit, Suravee, linux-kernel, iommu
  Cc: joro, will, jsnitsel, pmenzel, Jon.Grimm, Tj, Shuah Khan,
	Alexander Monakov, Alex Hung

[-- Attachment #1: Type: text/plain, Size: 1721 bytes --]

I think you've put your finger on it, Suravee!

On 15/04/2021 10:28, Suthikulpanit, Suravee wrote:
> David,
> 
> On 4/14/2021 10:33 PM, David Coe wrote:
>> Hi Suravee!
>>
>> I've re-run your revert+update patch on Ubuntu's latest kernel 
>> 5.11.0-14 partly to check my mailer's 'mangling' hadn't also reached 
>> the code!
>>
>> There are 3 sets of results in the attachment, all for the Ryzen 
>> 2400G. The as-distributed kernel already incorporates your IOMMU RFCv3 
>> patch.
>>
>> A. As-distributed kernel (cold boot)
>>     >5 retries, so no IOMMU read/write capability, no amd_iommu events.
>>
>> B. As-distributed kernel (warm boot)
>>     <5 retries, amd_iommu running stats show large numbers as before.
>>
>> C. Revert+Update kernel
>>     amd_iommu events listed and also show large hit/miss numbers.
>>
>> In due course, I'll load the new (revert+update) kernel on the 4700G 
>> but won't overload your mail-box unless something unusual turns up.
>>
>> Best regards,
>>
> 
> For the Ryzen 2400G, could you please try with:
> - 1 event at a time
> - Not more than 8 events (On your system, it has 2 banks x 4 counters/bank.
> I am trying to see if this issue might be related to the counters 
> multiplexing).
> 
> Thanks,

Attached are the results you requested for the 2400G along with a tiny 
shell-script.

One event at a time and various batches of less than 8 events produce 
unexceptionable data. One final batch of 10 events and (hoopla) up go 
the counter stats.

Will you be doing something in mitigation or does this just go with the 
patch? Is there anything further you need from me? I'll run the script 
on the 4700U but I don't expect surprises :-).

All most appreciated,

-- 
David

[-- Attachment #2: iommu_list.sh --]
[-- Type: application/x-shellscript, Size: 849 bytes --]

[-- Attachment #3: EventList.txt --]
[-- Type: text/plain, Size: 8041 bytes --]

$ sudo ./iommu_list.sh

 Performance counter stats for 'system wide':

                12      amd_iommu_0/cmd_processed/

      10.001266851 seconds time elapsed


 Performance counter stats for 'system wide':

                11       amd_iommu_0/cmd_processed_inv/

      10.001259049 seconds time elapsed


 Performance counter stats for 'system wide':

                 0       amd_iommu_0/ign_rd_wr_mmio_1ff8h/

      10.000791810 seconds time elapsed


 Performance counter stats for 'system wide':

               350       amd_iommu_0/int_dte_hit/

      10.000848437 seconds time elapsed


 Performance counter stats for 'system wide':

                16       amd_iommu_0/int_dte_mis/

      10.001271989 seconds time elapsed


 Performance counter stats for 'system wide':

               348       amd_iommu_0/mem_dte_hit/

      10.000808074 seconds time elapsed


 Performance counter stats for 'system wide':

           211,925       amd_iommu_0/mem_dte_mis/

      10.000915362 seconds time elapsed


 Performance counter stats for 'system wide':

                30       amd_iommu_0/mem_iommu_tlb_pde_hit/

      10.001520597 seconds time elapsed


 Performance counter stats for 'system wide':

               450       amd_iommu_0/mem_iommu_tlb_pde_mis/

      10.000877493 seconds time elapsed


 Performance counter stats for 'system wide':

            10,953       amd_iommu_0/mem_iommu_tlb_pte_hit/

      10.000831802 seconds time elapsed


 Performance counter stats for 'system wide':

            13,235       amd_iommu_0/mem_iommu_tlb_pte_mis/

      10.001292003 seconds time elapsed


 Performance counter stats for 'system wide':

                 0       amd_iommu_0/mem_pass_excl/

      10.000836000 seconds time elapsed


 Performance counter stats for 'system wide':

                 0       amd_iommu_0/mem_pass_pretrans/

      10.000799887 seconds time elapsed


 Performance counter stats for 'system wide':

            12,283       amd_iommu_0/mem_pass_untrans/

      10.000815339 seconds time elapsed


 Performance counter stats for 'system wide':

                 0       amd_iommu_0/mem_target_abort/

      10.001205168 seconds time elapsed


 Performance counter stats for 'system wide':

             1,333       amd_iommu_0/mem_trans_total/

      10.000915359 seconds time elapsed


 Performance counter stats for 'system wide':

                 0       amd_iommu_0/page_tbl_read_gst/

      10.001248235 seconds time elapsed


 Performance counter stats for 'system wide':

                65       amd_iommu_0/page_tbl_read_nst/

      10.001266411 seconds time elapsed


 Performance counter stats for 'system wide':

                78       amd_iommu_0/page_tbl_read_tot/

      10.001272406 seconds time elapsed


 Performance counter stats for 'system wide':

                 0       amd_iommu_0/smi_blk/

      10.001282912 seconds time elapsed


 Performance counter stats for 'system wide':

                 0       amd_iommu_0/smi_recv/

      10.001223193 seconds time elapsed


 Performance counter stats for 'system wide':

                 0       amd_iommu_0/tlb_inv/

      10.001234853 seconds time elapsed


 Performance counter stats for 'system wide':

                 0       amd_iommu_0/vapic_int_guest/

      10.000848081 seconds time elapsed


 Performance counter stats for 'system wide':

               428       amd_iommu_0/vapic_int_non_guest/

      10.000806041 seconds time elapsed


$ sudo perf stat -e 'amd_iommu_0/cmd_processed/, amd_iommu_0/cmd_processed_inv/, amd_iommu_0/ign_rd_wr_mmio_1ff8h/, amd_iommu_0/int_dte_hit/, amd_iommu_0/int_dte_mis/, amd_iommu_0/mem_dte_hit/, amd_iommu_0/mem_dte_mis/' sleep 10
 
Performance counter stats for 'system wide':

                16       amd_iommu_0/cmd_processed/                                   
                 8       amd_iommu_0/cmd_processed_inv/                                   
                 0       amd_iommu_0/ign_rd_wr_mmio_1ff8h/                                   
               358       amd_iommu_0/int_dte_hit/                                   
                10       amd_iommu_0/int_dte_mis/                                   
               465       amd_iommu_0/mem_dte_hit/                                   
             4,296       amd_iommu_0/mem_dte_mis/                                   

      10.001297570 seconds time elapsed


$ sudo perf stat -e 'amd_iommu_0/mem_iommu_tlb_pde_hit/, amd_iommu_0/mem_iommu_tlb_pde_mis/, amd_iommu_0/mem_iommu_tlb_pte_hit/, amd_iommu_0/mem_iommu_tlb_pte_mis/, amd_iommu_0/mem_pass_excl/, amd_iommu_0/mem_pass_pretrans/, amd_iommu_0/mem_pass_untrans/' sleep 10

Performance counter stats for 'system wide':

                24       amd_iommu_0/mem_iommu_tlb_pde_hit/                                   
               407       amd_iommu_0/mem_iommu_tlb_pde_mis/                                   
               478       amd_iommu_0/mem_iommu_tlb_pte_hit/                                   
             7,113       amd_iommu_0/mem_iommu_tlb_pte_mis/                                   
                 0       amd_iommu_0/mem_pass_excl/                                   
                 0       amd_iommu_0/mem_pass_pretrans/                                   
             7,040       amd_iommu_0/mem_pass_untrans/                                   

      10.001246489 seconds time elapsed

$ sudo perf stat -e 'amd_iommu_0/mem_target_abort/, amd_iommu_0/mem_trans_total/, amd_iommu_0/page_tbl_read_gst/, amd_iommu_0/page_tbl_read_nst/, amd_iommu_0/page_tbl_read_tot/' sleep 10

Performance counter stats for 'system wide':

                 0       amd_iommu_0/mem_target_abort/                                   
             1,898       amd_iommu_0/mem_trans_total/                                   
                 0       amd_iommu_0/page_tbl_read_gst/                                   
               140       amd_iommu_0/page_tbl_read_nst/                                   
               140       amd_iommu_0/page_tbl_read_tot/                                   

      10.001295526 seconds time elapsed


$ sudo perf stat -e 'amd_iommu_0/smi_blk/, amd_iommu_0/smi_recv/, amd_iommu_0/tlb_inv/, amd_iommu_0/vapic_int_guest/, amd_iommu_0/vapic_int_non_guest/' sleep 10

Performance counter stats for 'system wide':

                 0       amd_iommu_0/smi_blk/                                        
                 0       amd_iommu_0/smi_recv/                                      
                 0       amd_iommu_0/tlb_inv/                                       
                 0       amd_iommu_0/vapic_int_guest/                                   
               433       amd_iommu_0/vapic_int_non_guest/                                   

      10.001286515 seconds time elapsed


$ sudo perf stat -e 'amd_iommu_0/mem_target_abort/, amd_iommu_0/mem_trans_total/, amd_iommu_0/page_tbl_read_gst/, amd_iommu_0/page_tbl_read_nst/, amd_iommu_0/page_tbl_read_tot/, amd_iommu_0/smi_blk/, amd_iommu_0/smi_recv/, amd_iommu_0/tlb_inv/, amd_iommu_0/vapic_int_guest/, amd_iommu_0/vapic_int_non_guest/' sleep 10                                                                        

Performance counter stats for 'system wide':

                  0       amd_iommu_0/mem_target_abort/         (80.00%)
703,650,342,510,810       amd_iommu_0/mem_trans_total/          (80.00%)
                  0       amd_iommu_0/page_tbl_read_gst/        (80.00%)
351,839,572,857,842       amd_iommu_0/page_tbl_read_nst/        (80.00%)
351,849,973,332,309       amd_iommu_0/page_tbl_read_tot/        (80.00%)
                  0       amd_iommu_0/smi_blk/                  (80.00%)
                  0       amd_iommu_0/smi_recv/                 (80.00%)
                  0       amd_iommu_0/tlb_inv/                  (80.00%)
                  0       amd_iommu_0/vapic_int_guest/          (80.00%)
703,720,763,722,288       amd_iommu_0/vapic_int_non_guest/      (80.00%)

      10.000790762 seconds time elapsed

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 2/2] iommu/amd: Remove performance counter pre-initialization test
@ 2021-04-15 14:39             ` David Coe
  0 siblings, 0 replies; 46+ messages in thread
From: David Coe @ 2021-04-15 14:39 UTC (permalink / raw)
  To: Suthikulpanit, Suravee, linux-kernel, iommu
  Cc: pmenzel, Alexander Monakov, Alex Hung, Jon.Grimm, Shuah Khan, Tj, will

[-- Attachment #1: Type: text/plain, Size: 1721 bytes --]

I think you've put your finger on it, Suravee!

On 15/04/2021 10:28, Suthikulpanit, Suravee wrote:
> David,
> 
> On 4/14/2021 10:33 PM, David Coe wrote:
>> Hi Suravee!
>>
>> I've re-run your revert+update patch on Ubuntu's latest kernel 
>> 5.11.0-14 partly to check my mailer's 'mangling' hadn't also reached 
>> the code!
>>
>> There are 3 sets of results in the attachment, all for the Ryzen 
>> 2400G. The as-distributed kernel already incorporates your IOMMU RFCv3 
>> patch.
>>
>> A. As-distributed kernel (cold boot)
>>     >5 retries, so no IOMMU read/write capability, no amd_iommu events.
>>
>> B. As-distributed kernel (warm boot)
>>     <5 retries, amd_iommu running stats show large numbers as before.
>>
>> C. Revert+Update kernel
>>     amd_iommu events listed and also show large hit/miss numbers.
>>
>> In due course, I'll load the new (revert+update) kernel on the 4700G 
>> but won't overload your mail-box unless something unusual turns up.
>>
>> Best regards,
>>
> 
> For the Ryzen 2400G, could you please try with:
> - 1 event at a time
> - Not more than 8 events (On your system, it has 2 banks x 4 counters/bank.
> I am trying to see if this issue might be related to the counters 
> multiplexing).
> 
> Thanks,

Attached are the results you requested for the 2400G along with a tiny 
shell-script.

One event at a time and various batches of less than 8 events produce 
unexceptionable data. One final batch of 10 events and (hoopla) up go 
the counter stats.

Will you be doing something in mitigation or does this just go with the 
patch? Is there anything further you need from me? I'll run the script 
on the 4700U but I don't expect surprises :-).

All most appreciated,

-- 
David

[-- Attachment #2: iommu_list.sh --]
[-- Type: application/x-shellscript, Size: 849 bytes --]

[-- Attachment #3: EventList.txt --]
[-- Type: text/plain, Size: 8041 bytes --]

$ sudo ./iommu_list.sh

 Performance counter stats for 'system wide':

                12      amd_iommu_0/cmd_processed/

      10.001266851 seconds time elapsed


 Performance counter stats for 'system wide':

                11       amd_iommu_0/cmd_processed_inv/

      10.001259049 seconds time elapsed


 Performance counter stats for 'system wide':

                 0       amd_iommu_0/ign_rd_wr_mmio_1ff8h/

      10.000791810 seconds time elapsed


 Performance counter stats for 'system wide':

               350       amd_iommu_0/int_dte_hit/

      10.000848437 seconds time elapsed


 Performance counter stats for 'system wide':

                16       amd_iommu_0/int_dte_mis/

      10.001271989 seconds time elapsed


 Performance counter stats for 'system wide':

               348       amd_iommu_0/mem_dte_hit/

      10.000808074 seconds time elapsed


 Performance counter stats for 'system wide':

           211,925       amd_iommu_0/mem_dte_mis/

      10.000915362 seconds time elapsed


 Performance counter stats for 'system wide':

                30       amd_iommu_0/mem_iommu_tlb_pde_hit/

      10.001520597 seconds time elapsed


 Performance counter stats for 'system wide':

               450       amd_iommu_0/mem_iommu_tlb_pde_mis/

      10.000877493 seconds time elapsed


 Performance counter stats for 'system wide':

            10,953       amd_iommu_0/mem_iommu_tlb_pte_hit/

      10.000831802 seconds time elapsed


 Performance counter stats for 'system wide':

            13,235       amd_iommu_0/mem_iommu_tlb_pte_mis/

      10.001292003 seconds time elapsed


 Performance counter stats for 'system wide':

                 0       amd_iommu_0/mem_pass_excl/

      10.000836000 seconds time elapsed


 Performance counter stats for 'system wide':

                 0       amd_iommu_0/mem_pass_pretrans/

      10.000799887 seconds time elapsed


 Performance counter stats for 'system wide':

            12,283       amd_iommu_0/mem_pass_untrans/

      10.000815339 seconds time elapsed


 Performance counter stats for 'system wide':

                 0       amd_iommu_0/mem_target_abort/

      10.001205168 seconds time elapsed


 Performance counter stats for 'system wide':

             1,333       amd_iommu_0/mem_trans_total/

      10.000915359 seconds time elapsed


 Performance counter stats for 'system wide':

                 0       amd_iommu_0/page_tbl_read_gst/

      10.001248235 seconds time elapsed


 Performance counter stats for 'system wide':

                65       amd_iommu_0/page_tbl_read_nst/

      10.001266411 seconds time elapsed


 Performance counter stats for 'system wide':

                78       amd_iommu_0/page_tbl_read_tot/

      10.001272406 seconds time elapsed


 Performance counter stats for 'system wide':

                 0       amd_iommu_0/smi_blk/

      10.001282912 seconds time elapsed


 Performance counter stats for 'system wide':

                 0       amd_iommu_0/smi_recv/

      10.001223193 seconds time elapsed


 Performance counter stats for 'system wide':

                 0       amd_iommu_0/tlb_inv/

      10.001234853 seconds time elapsed


 Performance counter stats for 'system wide':

                 0       amd_iommu_0/vapic_int_guest/

      10.000848081 seconds time elapsed


 Performance counter stats for 'system wide':

               428       amd_iommu_0/vapic_int_non_guest/

      10.000806041 seconds time elapsed


$ sudo perf stat -e 'amd_iommu_0/cmd_processed/, amd_iommu_0/cmd_processed_inv/, amd_iommu_0/ign_rd_wr_mmio_1ff8h/, amd_iommu_0/int_dte_hit/, amd_iommu_0/int_dte_mis/, amd_iommu_0/mem_dte_hit/, amd_iommu_0/mem_dte_mis/' sleep 10
 
Performance counter stats for 'system wide':

                16       amd_iommu_0/cmd_processed/                                   
                 8       amd_iommu_0/cmd_processed_inv/                                   
                 0       amd_iommu_0/ign_rd_wr_mmio_1ff8h/                                   
               358       amd_iommu_0/int_dte_hit/                                   
                10       amd_iommu_0/int_dte_mis/                                   
               465       amd_iommu_0/mem_dte_hit/                                   
             4,296       amd_iommu_0/mem_dte_mis/                                   

      10.001297570 seconds time elapsed


$ sudo perf stat -e 'amd_iommu_0/mem_iommu_tlb_pde_hit/, amd_iommu_0/mem_iommu_tlb_pde_mis/, amd_iommu_0/mem_iommu_tlb_pte_hit/, amd_iommu_0/mem_iommu_tlb_pte_mis/, amd_iommu_0/mem_pass_excl/, amd_iommu_0/mem_pass_pretrans/, amd_iommu_0/mem_pass_untrans/' sleep 10

Performance counter stats for 'system wide':

                24       amd_iommu_0/mem_iommu_tlb_pde_hit/                                   
               407       amd_iommu_0/mem_iommu_tlb_pde_mis/                                   
               478       amd_iommu_0/mem_iommu_tlb_pte_hit/                                   
             7,113       amd_iommu_0/mem_iommu_tlb_pte_mis/                                   
                 0       amd_iommu_0/mem_pass_excl/                                   
                 0       amd_iommu_0/mem_pass_pretrans/                                   
             7,040       amd_iommu_0/mem_pass_untrans/                                   

      10.001246489 seconds time elapsed

$ sudo perf stat -e 'amd_iommu_0/mem_target_abort/, amd_iommu_0/mem_trans_total/, amd_iommu_0/page_tbl_read_gst/, amd_iommu_0/page_tbl_read_nst/, amd_iommu_0/page_tbl_read_tot/' sleep 10

Performance counter stats for 'system wide':

                 0       amd_iommu_0/mem_target_abort/                                   
             1,898       amd_iommu_0/mem_trans_total/                                   
                 0       amd_iommu_0/page_tbl_read_gst/                                   
               140       amd_iommu_0/page_tbl_read_nst/                                   
               140       amd_iommu_0/page_tbl_read_tot/                                   

      10.001295526 seconds time elapsed


$ sudo perf stat -e 'amd_iommu_0/smi_blk/, amd_iommu_0/smi_recv/, amd_iommu_0/tlb_inv/, amd_iommu_0/vapic_int_guest/, amd_iommu_0/vapic_int_non_guest/' sleep 10

Performance counter stats for 'system wide':

                 0       amd_iommu_0/smi_blk/                                        
                 0       amd_iommu_0/smi_recv/                                      
                 0       amd_iommu_0/tlb_inv/                                       
                 0       amd_iommu_0/vapic_int_guest/                                   
               433       amd_iommu_0/vapic_int_non_guest/                                   

      10.001286515 seconds time elapsed


$ sudo perf stat -e 'amd_iommu_0/mem_target_abort/, amd_iommu_0/mem_trans_total/, amd_iommu_0/page_tbl_read_gst/, amd_iommu_0/page_tbl_read_nst/, amd_iommu_0/page_tbl_read_tot/, amd_iommu_0/smi_blk/, amd_iommu_0/smi_recv/, amd_iommu_0/tlb_inv/, amd_iommu_0/vapic_int_guest/, amd_iommu_0/vapic_int_non_guest/' sleep 10                                                                        

Performance counter stats for 'system wide':

                  0       amd_iommu_0/mem_target_abort/         (80.00%)
703,650,342,510,810       amd_iommu_0/mem_trans_total/          (80.00%)
                  0       amd_iommu_0/page_tbl_read_gst/        (80.00%)
351,839,572,857,842       amd_iommu_0/page_tbl_read_nst/        (80.00%)
351,849,973,332,309       amd_iommu_0/page_tbl_read_tot/        (80.00%)
                  0       amd_iommu_0/smi_blk/                  (80.00%)
                  0       amd_iommu_0/smi_recv/                 (80.00%)
                  0       amd_iommu_0/tlb_inv/                  (80.00%)
                  0       amd_iommu_0/vapic_int_guest/          (80.00%)
703,720,763,722,288       amd_iommu_0/vapic_int_non_guest/      (80.00%)

      10.000790762 seconds time elapsed

[-- Attachment #4: Type: text/plain, Size: 156 bytes --]

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 2/2] iommu/amd: Remove performance counter pre-initialization test
  2021-04-15  9:28           ` Suthikulpanit, Suravee
@ 2021-04-15 16:20             ` David Coe
  -1 siblings, 0 replies; 46+ messages in thread
From: David Coe @ 2021-04-15 16:20 UTC (permalink / raw)
  To: Suthikulpanit, Suravee, linux-kernel, iommu
  Cc: joro, will, jsnitsel, pmenzel, Jon.Grimm, Tj, Shuah Khan,
	Alexander Monakov, Alex Hung

[-- Attachment #1: Type: text/plain, Size: 1407 bytes --]

Hi Suravee!

On 15/04/2021 10:28, Suthikulpanit, Suravee wrote:
> David,
> 
> On 4/14/2021 10:33 PM, David Coe wrote:
>> Hi Suravee!
>>
>> I've re-run your revert+update patch on Ubuntu's latest kernel 
>> 5.11.0-14 partly to check my mailer's 'mangling' hadn't also reached 
>> the code!
>>
>> There are 3 sets of results in the attachment, all for the Ryzen 
>> 2400G. The as-distributed kernel already incorporates your IOMMU RFCv3 
>> patch.
>>
>> A. As-distributed kernel (cold boot)
>>     >5 retries, so no IOMMU read/write capability, no amd_iommu events.
>>
>> B. As-distributed kernel (warm boot)
>>     <5 retries, amd_iommu running stats show large numbers as before.
>>
>> C. Revert+Update kernel
>>     amd_iommu events listed and also show large hit/miss numbers.
>>
>> In due course, I'll load the new (revert+update) kernel on the 4700G 
>> but won't overload your mail-box unless something unusual turns up.
>>
>> Best regards,
>>
> 
> For the Ryzen 2400G, could you please try with:
> - 1 event at a time
> - Not more than 8 events (On your system, it has 2 banks x 4 counters/bank.
> I am trying to see if this issue might be related to the counters 
> multiplexing).
> 
> Thanks,

Herewith similar one-at-a-time perf stat results for the Ryzen 4700U. As 
before, more than 8 events displays the third (%) column and (sometimes) 
'silly' numbers in the first column.

-- 
David

[-- Attachment #2: EventList-4700U.txt --]
[-- Type: text/plain, Size: 5650 bytes --]

$ sudo ./iommu_list.sh

 Performance counter stats for 'system wide':

                20      amd_iommu_0/cmd_processed/

      10.003010666 seconds time elapsed


 Performance counter stats for 'system wide':

                 5       amd_iommu_0/cmd_processed_inv/

      10.002349464 seconds time elapsed


 Performance counter stats for 'system wide':

                 0       amd_iommu_0/ign_rd_wr_mmio_1ff8h/

      10.002386129 seconds time elapsed


 Performance counter stats for 'system wide':

               325       amd_iommu_0/int_dte_hit/

      10.002346630 seconds time elapsed


 Performance counter stats for 'system wide':

                 2       amd_iommu_0/int_dte_mis/

      10.002365656 seconds time elapsed


 Performance counter stats for 'system wide':

               356       amd_iommu_0/mem_dte_hit/

      10.002426866 seconds time elapsed


 Performance counter stats for 'system wide':

             3,955       amd_iommu_0/mem_dte_mis/

      10.002729058 seconds time elapsed


 Performance counter stats for 'system wide':

                 4       amd_iommu_0/mem_iommu_tlb_pde_hit/

      10.002422610 seconds time elapsed


 Performance counter stats for 'system wide':

             1,326       amd_iommu_0/mem_iommu_tlb_pde_mis/

      10.002397045 seconds time elapsed


 Performance counter stats for 'system wide':

             1,009       amd_iommu_0/mem_iommu_tlb_pte_hit/

      10.002445347 seconds time elapsed


 Performance counter stats for 'system wide':

             1,072       amd_iommu_0/mem_iommu_tlb_pte_mis/

      10.002414734 seconds time elapsed


 Performance counter stats for 'system wide':

                 0       amd_iommu_0/mem_pass_excl/

      10.002435482 seconds time elapsed


 Performance counter stats for 'system wide':

                 0       amd_iommu_0/mem_pass_pretrans/

      10.002409956 seconds time elapsed


 Performance counter stats for 'system wide':

             3,405       amd_iommu_0/mem_pass_untrans/

      10.002563812 seconds time elapsed


 Performance counter stats for 'system wide':

                 0       amd_iommu_0/mem_target_abort/

      10.002473657 seconds time elapsed


 Performance counter stats for 'system wide':

             1,311       amd_iommu_0/mem_trans_total/

      10.002471787 seconds time elapsed


 Performance counter stats for 'system wide':

                 0       amd_iommu_0/page_tbl_read_gst/

      10.002216033 seconds time elapsed


 Performance counter stats for 'system wide':

           276,609       amd_iommu_0/page_tbl_read_nst/

      10.002029261 seconds time elapsed


 Performance counter stats for 'system wide':

           126,161       amd_iommu_0/page_tbl_read_tot/

      10.003569029 seconds time elapsed


 Performance counter stats for 'system wide':

                 0       amd_iommu_0/smi_blk/

      10.001871818 seconds time elapsed


 Performance counter stats for 'system wide':

                 0       amd_iommu_0/smi_recv/

      10.002212891 seconds time elapsed


 Performance counter stats for 'system wide':

                 0       amd_iommu_0/tlb_inv/

      10.002396606 seconds time elapsed


 Performance counter stats for 'system wide':

                 0       amd_iommu_0/vapic_int_guest/

      10.002435308 seconds time elapsed


 Performance counter stats for 'system wide':

               340       amd_iommu_0/vapic_int_non_guest/

      10.002405868 seconds time elapsed


$ sudo perf stat -e 'amd_iommu_0/mem_dte_hit/, amd_iommu_0/mem_dte_mis/, amd_iommu_0/mem_iommu_tlb_pde_hit/, amd_iommu_0/mem_iommu_tlb_pde_mis/, amd_iommu_0/mem_iommu_tlb_pte_hit/, amd_iommu_0/mem_iommu_tlb_pte_mis/' sleep 10

 Performance counter stats for 'system wide':

               345       amd_iommu_0/mem_dte_hit/
               606       amd_iommu_0/mem_dte_mis/
                10       amd_iommu_0/mem_iommu_tlb_pde_hit/
               910       amd_iommu_0/mem_iommu_tlb_pde_mis/
               140       amd_iommu_0/mem_iommu_tlb_pte_hit/
               797       amd_iommu_0/mem_iommu_tlb_pte_mis/

      10.002553966 seconds time elapsed


$ sudo perf stat -e 'amd_iommu_0/mem_dte_hit/, amd_iommu_0/mem_dte_mis/, amd_iommu_0/mem_iommu_tlb_pde_hit/, amd_iommu_0/mem_iommu_tlb_pde_mis/, amd_iommu_0/mem_iommu_tlb_pte_hit/, amd_iommu_0/mem_iommu_tlb_pte_mis/, amd_iommu_0/mem_pass_excl/, amd_iommu_0/mem_pass_pretrans/, amd_iommu_0/mem_pass_untrans/, amd_iommu_0/mem_target_abort/, amd_iommu_0/mem_trans_total/' sleep 10

 Performance counter stats for 'system wide':

               340       amd_iommu_0/mem_dte_hit/                                     (72.65%)
               389       amd_iommu_0/mem_dte_mis/                                     (72.67%)
                 8       amd_iommu_0/mem_iommu_tlb_pde_hit/                           (72.71%)
               849       amd_iommu_0/mem_iommu_tlb_pde_mis/                           (72.76%)
                96       amd_iommu_0/mem_iommu_tlb_pte_hit/                           (72.78%)
               742       amd_iommu_0/mem_iommu_tlb_pte_mis/                           (72.75%)
                 0       amd_iommu_0/mem_pass_excl/                                   (72.76%)
                 0       amd_iommu_0/mem_pass_pretrans/                               (72.76%)
                 6       amd_iommu_0/mem_pass_untrans/                                (72.77%)
                 0       amd_iommu_0/mem_target_abort/                                (72.71%)
             1,122       amd_iommu_0/mem_trans_total/                                 (72.69%)

      10.002471480 seconds time elapsed

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 2/2] iommu/amd: Remove performance counter pre-initialization test
@ 2021-04-15 16:20             ` David Coe
  0 siblings, 0 replies; 46+ messages in thread
From: David Coe @ 2021-04-15 16:20 UTC (permalink / raw)
  To: Suthikulpanit, Suravee, linux-kernel, iommu
  Cc: pmenzel, Alexander Monakov, Alex Hung, Jon.Grimm, Shuah Khan, Tj, will

[-- Attachment #1: Type: text/plain, Size: 1407 bytes --]

Hi Suravee!

On 15/04/2021 10:28, Suthikulpanit, Suravee wrote:
> David,
> 
> On 4/14/2021 10:33 PM, David Coe wrote:
>> Hi Suravee!
>>
>> I've re-run your revert+update patch on Ubuntu's latest kernel 
>> 5.11.0-14 partly to check my mailer's 'mangling' hadn't also reached 
>> the code!
>>
>> There are 3 sets of results in the attachment, all for the Ryzen 
>> 2400G. The as-distributed kernel already incorporates your IOMMU RFCv3 
>> patch.
>>
>> A. As-distributed kernel (cold boot)
>>     >5 retries, so no IOMMU read/write capability, no amd_iommu events.
>>
>> B. As-distributed kernel (warm boot)
>>     <5 retries, amd_iommu running stats show large numbers as before.
>>
>> C. Revert+Update kernel
>>     amd_iommu events listed and also show large hit/miss numbers.
>>
>> In due course, I'll load the new (revert+update) kernel on the 4700G 
>> but won't overload your mail-box unless something unusual turns up.
>>
>> Best regards,
>>
> 
> For the Ryzen 2400G, could you please try with:
> - 1 event at a time
> - Not more than 8 events (On your system, it has 2 banks x 4 counters/bank.
> I am trying to see if this issue might be related to the counters 
> multiplexing).
> 
> Thanks,

Herewith similar one-at-a-time perf stat results for the Ryzen 4700U. As 
before, more than 8 events displays the third (%) column and (sometimes) 
'silly' numbers in the first column.

-- 
David

[-- Attachment #2: EventList-4700U.txt --]
[-- Type: text/plain, Size: 5650 bytes --]

$ sudo ./iommu_list.sh

 Performance counter stats for 'system wide':

                20      amd_iommu_0/cmd_processed/

      10.003010666 seconds time elapsed


 Performance counter stats for 'system wide':

                 5       amd_iommu_0/cmd_processed_inv/

      10.002349464 seconds time elapsed


 Performance counter stats for 'system wide':

                 0       amd_iommu_0/ign_rd_wr_mmio_1ff8h/

      10.002386129 seconds time elapsed


 Performance counter stats for 'system wide':

               325       amd_iommu_0/int_dte_hit/

      10.002346630 seconds time elapsed


 Performance counter stats for 'system wide':

                 2       amd_iommu_0/int_dte_mis/

      10.002365656 seconds time elapsed


 Performance counter stats for 'system wide':

               356       amd_iommu_0/mem_dte_hit/

      10.002426866 seconds time elapsed


 Performance counter stats for 'system wide':

             3,955       amd_iommu_0/mem_dte_mis/

      10.002729058 seconds time elapsed


 Performance counter stats for 'system wide':

                 4       amd_iommu_0/mem_iommu_tlb_pde_hit/

      10.002422610 seconds time elapsed


 Performance counter stats for 'system wide':

             1,326       amd_iommu_0/mem_iommu_tlb_pde_mis/

      10.002397045 seconds time elapsed


 Performance counter stats for 'system wide':

             1,009       amd_iommu_0/mem_iommu_tlb_pte_hit/

      10.002445347 seconds time elapsed


 Performance counter stats for 'system wide':

             1,072       amd_iommu_0/mem_iommu_tlb_pte_mis/

      10.002414734 seconds time elapsed


 Performance counter stats for 'system wide':

                 0       amd_iommu_0/mem_pass_excl/

      10.002435482 seconds time elapsed


 Performance counter stats for 'system wide':

                 0       amd_iommu_0/mem_pass_pretrans/

      10.002409956 seconds time elapsed


 Performance counter stats for 'system wide':

             3,405       amd_iommu_0/mem_pass_untrans/

      10.002563812 seconds time elapsed


 Performance counter stats for 'system wide':

                 0       amd_iommu_0/mem_target_abort/

      10.002473657 seconds time elapsed


 Performance counter stats for 'system wide':

             1,311       amd_iommu_0/mem_trans_total/

      10.002471787 seconds time elapsed


 Performance counter stats for 'system wide':

                 0       amd_iommu_0/page_tbl_read_gst/

      10.002216033 seconds time elapsed


 Performance counter stats for 'system wide':

           276,609       amd_iommu_0/page_tbl_read_nst/

      10.002029261 seconds time elapsed


 Performance counter stats for 'system wide':

           126,161       amd_iommu_0/page_tbl_read_tot/

      10.003569029 seconds time elapsed


 Performance counter stats for 'system wide':

                 0       amd_iommu_0/smi_blk/

      10.001871818 seconds time elapsed


 Performance counter stats for 'system wide':

                 0       amd_iommu_0/smi_recv/

      10.002212891 seconds time elapsed


 Performance counter stats for 'system wide':

                 0       amd_iommu_0/tlb_inv/

      10.002396606 seconds time elapsed


 Performance counter stats for 'system wide':

                 0       amd_iommu_0/vapic_int_guest/

      10.002435308 seconds time elapsed


 Performance counter stats for 'system wide':

               340       amd_iommu_0/vapic_int_non_guest/

      10.002405868 seconds time elapsed


$ sudo perf stat -e 'amd_iommu_0/mem_dte_hit/, amd_iommu_0/mem_dte_mis/, amd_iommu_0/mem_iommu_tlb_pde_hit/, amd_iommu_0/mem_iommu_tlb_pde_mis/, amd_iommu_0/mem_iommu_tlb_pte_hit/, amd_iommu_0/mem_iommu_tlb_pte_mis/' sleep 10

 Performance counter stats for 'system wide':

               345       amd_iommu_0/mem_dte_hit/
               606       amd_iommu_0/mem_dte_mis/
                10       amd_iommu_0/mem_iommu_tlb_pde_hit/
               910       amd_iommu_0/mem_iommu_tlb_pde_mis/
               140       amd_iommu_0/mem_iommu_tlb_pte_hit/
               797       amd_iommu_0/mem_iommu_tlb_pte_mis/

      10.002553966 seconds time elapsed


$ sudo perf stat -e 'amd_iommu_0/mem_dte_hit/, amd_iommu_0/mem_dte_mis/, amd_iommu_0/mem_iommu_tlb_pde_hit/, amd_iommu_0/mem_iommu_tlb_pde_mis/, amd_iommu_0/mem_iommu_tlb_pte_hit/, amd_iommu_0/mem_iommu_tlb_pte_mis/, amd_iommu_0/mem_pass_excl/, amd_iommu_0/mem_pass_pretrans/, amd_iommu_0/mem_pass_untrans/, amd_iommu_0/mem_target_abort/, amd_iommu_0/mem_trans_total/' sleep 10

 Performance counter stats for 'system wide':

               340       amd_iommu_0/mem_dte_hit/                                     (72.65%)
               389       amd_iommu_0/mem_dte_mis/                                     (72.67%)
                 8       amd_iommu_0/mem_iommu_tlb_pde_hit/                           (72.71%)
               849       amd_iommu_0/mem_iommu_tlb_pde_mis/                           (72.76%)
                96       amd_iommu_0/mem_iommu_tlb_pte_hit/                           (72.78%)
               742       amd_iommu_0/mem_iommu_tlb_pte_mis/                           (72.75%)
                 0       amd_iommu_0/mem_pass_excl/                                   (72.76%)
                 0       amd_iommu_0/mem_pass_pretrans/                               (72.76%)
                 6       amd_iommu_0/mem_pass_untrans/                                (72.77%)
                 0       amd_iommu_0/mem_target_abort/                                (72.71%)
             1,122       amd_iommu_0/mem_trans_total/                                 (72.69%)

      10.002471480 seconds time elapsed

[-- Attachment #3: Type: text/plain, Size: 156 bytes --]

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 2/2] iommu/amd: Remove performance counter pre-initialization test
  2021-04-15  9:28           ` Suthikulpanit, Suravee
@ 2021-04-18 19:16             ` David Coe
  -1 siblings, 0 replies; 46+ messages in thread
From: David Coe @ 2021-04-18 19:16 UTC (permalink / raw)
  To: Suthikulpanit, Suravee, linux-kernel, iommu
  Cc: joro, will, jsnitsel, pmenzel, Jon.Grimm, Tj, Shuah Khan,
	Alexander Monakov, Alex Hung

Hi Suravee!

Results for Ryzen 2400G on Ubuntu 20.10, kernel 5.8.0-50 with patch 2/2 
alone. Events batched 3 x 8 to avoid counter-multiplexing (?) artefacts.

On 15/04/2021 10:28, Suthikulpanit, Suravee wrote:
> David,
> 
> For the Ryzen 2400G, could you please try with:
> - 1 event at a time
> - Not more than 8 events (On your system, it has 2 banks x 4 counters/bank.
> I am trying to see if this issue might be related to the counters 
> multiplexing).
> 
$ sudo dmesg | grep IOMMU
[sudo] password for info:
[    0.543768] pci 0000:00:00.2: AMD-Vi: IOMMU performance counters 
supported
[    0.547696] pci 0000:00:00.2: AMD-Vi: Found IOMMU cap 0x40
[    0.549196] perf/amd_iommu: Detected AMD IOMMU #0 (2 banks, 4 
counters/bank).
[    0.811538] AMD-Vi: AMD IOMMUv2 driver by Joerg Roedel <jroedel@suse.de>


$ declare -a EventList=("amd_iommu_0/cmd_processed/, 
amd_iommu_0/cmd_processed_inv/, amd_iommu_0/ign_rd_wr_mmio_1ff8h/, 
amd_iommu_0/int_dte_hit/, amd_iommu_0/int_dte_mis/, 
amd_iommu_0/mem_dte_hit/, amd_iommu_0/mem_dte_mis/, 
amd_iommu_0/mem_iommu_tlb_pde_hit/" "amd_iommu_0/mem_iommu_tlb_pde_mis/, 
amd_iommu_0/mem_iommu_tlb_pte_hit/, amd_iommu_0/mem_iommu_tlb_pte_mis/, 
amd_iommu_0/mem_pass_excl/, amd_iommu_0/mem_pass_pretrans/, 
amd_iommu_0/mem_pass_untrans/, amd_iommu_0/mem_target_abort/, 
amd_iommu_0/mem_trans_total/" "amd_iommu_0/page_tbl_read_gst/, 
amd_iommu_0/page_tbl_read_nst/, amd_iommu_0/page_tbl_read_tot/, 
amd_iommu_0/smi_blk/, amd_iommu_0/smi_recv/, amd_iommu_0/tlb_inv/, 
amd_iommu_0/vapic_int_guest/, amd_iommu_0/vapic_int_non_guest/")


$ for event in "${EventList[@]}"; do sudo perf stat -e "$event" sleep 10 
; done

  Performance counter stats for 'system wide':

                 18       amd_iommu_0/cmd_processed/ 

                  9       amd_iommu_0/cmd_processed_inv/ 

                  0       amd_iommu_0/ign_rd_wr_mmio_1ff8h/ 

                399       amd_iommu_0/int_dte_hit/ 

                 19       amd_iommu_0/int_dte_mis/ 

              1,177       amd_iommu_0/mem_dte_hit/ 

              5,521       amd_iommu_0/mem_dte_mis/ 

                 70       amd_iommu_0/mem_iommu_tlb_pde_hit/ 


       10.001490092 seconds time elapsed


  Performance counter stats for 'system wide':

                394       amd_iommu_0/mem_iommu_tlb_pde_mis/ 

                602       amd_iommu_0/mem_iommu_tlb_pte_hit/ 

              6,612       amd_iommu_0/mem_iommu_tlb_pte_mis/ 

                  0       amd_iommu_0/mem_pass_excl/ 

                  0       amd_iommu_0/mem_pass_pretrans/ 

              6,590       amd_iommu_0/mem_pass_untrans/ 

                  0       amd_iommu_0/mem_target_abort/ 

                616       amd_iommu_0/mem_trans_total/ 


       10.001237585 seconds time elapsed


  Performance counter stats for 'system wide':

                  0       amd_iommu_0/page_tbl_read_gst/ 

                 78       amd_iommu_0/page_tbl_read_nst/ 

                 78       amd_iommu_0/page_tbl_read_tot/ 

                  0       amd_iommu_0/smi_blk/ 

                  0       amd_iommu_0/smi_recv/ 

                  0       amd_iommu_0/tlb_inv/ 

                  0       amd_iommu_0/vapic_int_guest/ 

                637       amd_iommu_0/vapic_int_non_guest/ 


       10.001186031 seconds time elapsed

Best regards,

-- 
David

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 2/2] iommu/amd: Remove performance counter pre-initialization test
@ 2021-04-18 19:16             ` David Coe
  0 siblings, 0 replies; 46+ messages in thread
From: David Coe @ 2021-04-18 19:16 UTC (permalink / raw)
  To: Suthikulpanit, Suravee, linux-kernel, iommu
  Cc: pmenzel, Alexander Monakov, Alex Hung, Jon.Grimm, Shuah Khan, Tj, will

Hi Suravee!

Results for Ryzen 2400G on Ubuntu 20.10, kernel 5.8.0-50 with patch 2/2 
alone. Events batched 3 x 8 to avoid counter-multiplexing (?) artefacts.

On 15/04/2021 10:28, Suthikulpanit, Suravee wrote:
> David,
> 
> For the Ryzen 2400G, could you please try with:
> - 1 event at a time
> - Not more than 8 events (On your system, it has 2 banks x 4 counters/bank.
> I am trying to see if this issue might be related to the counters 
> multiplexing).
> 
$ sudo dmesg | grep IOMMU
[sudo] password for info:
[    0.543768] pci 0000:00:00.2: AMD-Vi: IOMMU performance counters 
supported
[    0.547696] pci 0000:00:00.2: AMD-Vi: Found IOMMU cap 0x40
[    0.549196] perf/amd_iommu: Detected AMD IOMMU #0 (2 banks, 4 
counters/bank).
[    0.811538] AMD-Vi: AMD IOMMUv2 driver by Joerg Roedel <jroedel@suse.de>


$ declare -a EventList=("amd_iommu_0/cmd_processed/, 
amd_iommu_0/cmd_processed_inv/, amd_iommu_0/ign_rd_wr_mmio_1ff8h/, 
amd_iommu_0/int_dte_hit/, amd_iommu_0/int_dte_mis/, 
amd_iommu_0/mem_dte_hit/, amd_iommu_0/mem_dte_mis/, 
amd_iommu_0/mem_iommu_tlb_pde_hit/" "amd_iommu_0/mem_iommu_tlb_pde_mis/, 
amd_iommu_0/mem_iommu_tlb_pte_hit/, amd_iommu_0/mem_iommu_tlb_pte_mis/, 
amd_iommu_0/mem_pass_excl/, amd_iommu_0/mem_pass_pretrans/, 
amd_iommu_0/mem_pass_untrans/, amd_iommu_0/mem_target_abort/, 
amd_iommu_0/mem_trans_total/" "amd_iommu_0/page_tbl_read_gst/, 
amd_iommu_0/page_tbl_read_nst/, amd_iommu_0/page_tbl_read_tot/, 
amd_iommu_0/smi_blk/, amd_iommu_0/smi_recv/, amd_iommu_0/tlb_inv/, 
amd_iommu_0/vapic_int_guest/, amd_iommu_0/vapic_int_non_guest/")


$ for event in "${EventList[@]}"; do sudo perf stat -e "$event" sleep 10 
; done

  Performance counter stats for 'system wide':

                 18       amd_iommu_0/cmd_processed/ 

                  9       amd_iommu_0/cmd_processed_inv/ 

                  0       amd_iommu_0/ign_rd_wr_mmio_1ff8h/ 

                399       amd_iommu_0/int_dte_hit/ 

                 19       amd_iommu_0/int_dte_mis/ 

              1,177       amd_iommu_0/mem_dte_hit/ 

              5,521       amd_iommu_0/mem_dte_mis/ 

                 70       amd_iommu_0/mem_iommu_tlb_pde_hit/ 


       10.001490092 seconds time elapsed


  Performance counter stats for 'system wide':

                394       amd_iommu_0/mem_iommu_tlb_pde_mis/ 

                602       amd_iommu_0/mem_iommu_tlb_pte_hit/ 

              6,612       amd_iommu_0/mem_iommu_tlb_pte_mis/ 

                  0       amd_iommu_0/mem_pass_excl/ 

                  0       amd_iommu_0/mem_pass_pretrans/ 

              6,590       amd_iommu_0/mem_pass_untrans/ 

                  0       amd_iommu_0/mem_target_abort/ 

                616       amd_iommu_0/mem_trans_total/ 


       10.001237585 seconds time elapsed


  Performance counter stats for 'system wide':

                  0       amd_iommu_0/page_tbl_read_gst/ 

                 78       amd_iommu_0/page_tbl_read_nst/ 

                 78       amd_iommu_0/page_tbl_read_tot/ 

                  0       amd_iommu_0/smi_blk/ 

                  0       amd_iommu_0/smi_recv/ 

                  0       amd_iommu_0/tlb_inv/ 

                  0       amd_iommu_0/vapic_int_guest/ 

                637       amd_iommu_0/vapic_int_non_guest/ 


       10.001186031 seconds time elapsed

Best regards,

-- 
David
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 2/2] iommu/amd: Remove performance counter pre-initialization test
  2021-04-10 10:03     ` David Coe
@ 2021-04-20  8:38       ` Suthikulpanit, Suravee
  -1 siblings, 0 replies; 46+ messages in thread
From: Suthikulpanit, Suravee @ 2021-04-20  8:38 UTC (permalink / raw)
  To: David Coe, linux-kernel, iommu
  Cc: joro, will, jsnitsel, pmenzel, Jon.Grimm, Tj, Shuah Khan,
	Alexander Monakov, Alex Hung

David / Joerg,

On 4/10/2021 5:03 PM, David Coe wrote:
> 
> The immediately obvious difference is the with the enormous count seen on mem_dte_mis on the older Ryzen 2400G. Will do some RTFM but anyone with comments and insight?
> 
> 841,689,151,202,939       amd_iommu_0/mem_dte_mis/              (33.44%)
> 
> Otherwise, all seems to running smoothly (especially for a distribution still in β). Bravo and many thanks all!

The initial hypothesis is that the issue happens only when users specify more number of events than
the available counters, which Perf will time-multiplex the events onto the counters.

Looking at the Perf and AMD IOMMU PMU multiplexing logic, it requires:
  1. Stop the counter (i.e. set CSOURCE to zero to stop counting)
  2. Save the counter value of the current event
  3. Reload the counter value of the new event (previously saved)
  4. Start the counter (i.e. set CSOURCE to count new events)

The problem here is that when the driver writes zero to CSOURCE register in step 1, this would enable power-gating,
which prevents access to the counter and result in writing/reading value in step 2 and 3.

I have found a system that reproduced this case (w/ unusually large number of count), and debug the issue further.
As a hack, I have tried skipping step 1, and it seems to eliminate this issue. However, this is logically incorrect,
and might result in inaccurate data depending on the events.

Here are the options:
1. Continue to look for workaround for this issue.
2. Find a way to disable event time-multiplexing (e.g. only limit the number of counters to 8)
    if power gating is enabled on the platform.
3. Back to the original logic where we had the pre-init check of the counter vlues, which is still the safest choice
    at the moment unless

Regards,
Suravee

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 2/2] iommu/amd: Remove performance counter pre-initialization test
@ 2021-04-20  8:38       ` Suthikulpanit, Suravee
  0 siblings, 0 replies; 46+ messages in thread
From: Suthikulpanit, Suravee @ 2021-04-20  8:38 UTC (permalink / raw)
  To: David Coe, linux-kernel, iommu
  Cc: pmenzel, Alexander Monakov, Alex Hung, Jon.Grimm, Shuah Khan, Tj, will

David / Joerg,

On 4/10/2021 5:03 PM, David Coe wrote:
> 
> The immediately obvious difference is the with the enormous count seen on mem_dte_mis on the older Ryzen 2400G. Will do some RTFM but anyone with comments and insight?
> 
> 841,689,151,202,939       amd_iommu_0/mem_dte_mis/              (33.44%)
> 
> Otherwise, all seems to running smoothly (especially for a distribution still in β). Bravo and many thanks all!

The initial hypothesis is that the issue happens only when users specify more number of events than
the available counters, which Perf will time-multiplex the events onto the counters.

Looking at the Perf and AMD IOMMU PMU multiplexing logic, it requires:
  1. Stop the counter (i.e. set CSOURCE to zero to stop counting)
  2. Save the counter value of the current event
  3. Reload the counter value of the new event (previously saved)
  4. Start the counter (i.e. set CSOURCE to count new events)

The problem here is that when the driver writes zero to CSOURCE register in step 1, this would enable power-gating,
which prevents access to the counter and result in writing/reading value in step 2 and 3.

I have found a system that reproduced this case (w/ unusually large number of count), and debug the issue further.
As a hack, I have tried skipping step 1, and it seems to eliminate this issue. However, this is logically incorrect,
and might result in inaccurate data depending on the events.

Here are the options:
1. Continue to look for workaround for this issue.
2. Find a way to disable event time-multiplexing (e.g. only limit the number of counters to 8)
    if power gating is enabled on the platform.
3. Back to the original logic where we had the pre-init check of the counter vlues, which is still the safest choice
    at the moment unless

Regards,
Suravee
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 2/2] iommu/amd: Remove performance counter pre-initialization test
  2021-04-20  8:38       ` Suthikulpanit, Suravee
@ 2021-04-20 10:33         ` Alexander Monakov
  -1 siblings, 0 replies; 46+ messages in thread
From: Alexander Monakov @ 2021-04-20 10:33 UTC (permalink / raw)
  To: Suthikulpanit, Suravee
  Cc: David Coe, linux-kernel, iommu, joro, will, jsnitsel, pmenzel,
	Jon.Grimm, Tj, Shuah Khan, Alex Hung

[-- Attachment #1: Type: text/plain, Size: 2742 bytes --]

On Tue, 20 Apr 2021, Suthikulpanit, Suravee wrote:

> David / Joerg,
> 
> On 4/10/2021 5:03 PM, David Coe wrote:
> > 
> > The immediately obvious difference is the with the enormous count seen on
> > mem_dte_mis on the older Ryzen 2400G. Will do some RTFM but anyone
> > with comments and insight?
> > 
> > 841,689,151,202,939       amd_iommu_0/mem_dte_mis/              (33.44%)
> > 
> > Otherwise, all seems to running smoothly (especially for a distribution
> > still in β). Bravo and many thanks all!
> 
> The initial hypothesis is that the issue happens only when users specify more
> number of events than
> the available counters, which Perf will time-multiplex the events onto the
> counters.
> 
> Looking at the Perf and AMD IOMMU PMU multiplexing logic, it requires:
>  1. Stop the counter (i.e. set CSOURCE to zero to stop counting)
>  2. Save the counter value of the current event
>  3. Reload the counter value of the new event (previously saved)
>  4. Start the counter (i.e. set CSOURCE to count new events)
> 
> The problem here is that when the driver writes zero to CSOURCE register in
> step 1, this would enable power-gating,
> which prevents access to the counter and result in writing/reading value in
> step 2 and 3.
> 
> I have found a system that reproduced this case (w/ unusually large number of
> count), and debug the issue further.
> As a hack, I have tried skipping step 1, and it seems to eliminate this issue.
> However, this is logically incorrect,
> and might result in inaccurate data depending on the events.
> 
> Here are the options:
> 1. Continue to look for workaround for this issue.
> 2. Find a way to disable event time-multiplexing (e.g. only limit the number
> of counters to 8)
>    if power gating is enabled on the platform.
> 3. Back to the original logic where we had the pre-init check of the counter
> vlues, which is still the safest choice
>    at the moment unless

If the "power-gated" counter only ignores writes, but yields correct values on
reads, you don't need to change its value.

0. When initializing the counter, save its value as 'raw_counter_value'.

When switching:

1. Set CSOURCE to zero (pauses the counter).
2. Read current counter value ('new_value').
3. Add '(new_value - raw_counter_value) & mask' to the current event count;
   where 'mask' is '(1ULL << 48) - 1' to account for overflow correctly.
4. Assign 'raw_counter_value = new_value'.
5. Set CSOURCE to new configuration value.

Modifying the hardware counter value is only needed to get overflow interrupts,
but there's no code to handle them on Linux (since the interrupts are
asynchronous, and given the nature of the events, I don't see how it would be
useful).

Alexander

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH 2/2] iommu/amd: Remove performance counter pre-initialization test
@ 2021-04-20 10:33         ` Alexander Monakov
  0 siblings, 0 replies; 46+ messages in thread
From: Alexander Monakov @ 2021-04-20 10:33 UTC (permalink / raw)
  To: Suthikulpanit, Suravee
  Cc: David Coe, pmenzel, Jon.Grimm, linux-kernel, iommu, Shuah Khan,
	Tj, Alex Hung, will

[-- Attachment #1: Type: text/plain, Size: 2742 bytes --]

On Tue, 20 Apr 2021, Suthikulpanit, Suravee wrote:

> David / Joerg,
> 
> On 4/10/2021 5:03 PM, David Coe wrote:
> > 
> > The immediately obvious difference is the with the enormous count seen on
> > mem_dte_mis on the older Ryzen 2400G. Will do some RTFM but anyone
> > with comments and insight?
> > 
> > 841,689,151,202,939       amd_iommu_0/mem_dte_mis/              (33.44%)
> > 
> > Otherwise, all seems to running smoothly (especially for a distribution
> > still in β). Bravo and many thanks all!
> 
> The initial hypothesis is that the issue happens only when users specify more
> number of events than
> the available counters, which Perf will time-multiplex the events onto the
> counters.
> 
> Looking at the Perf and AMD IOMMU PMU multiplexing logic, it requires:
>  1. Stop the counter (i.e. set CSOURCE to zero to stop counting)
>  2. Save the counter value of the current event
>  3. Reload the counter value of the new event (previously saved)
>  4. Start the counter (i.e. set CSOURCE to count new events)
> 
> The problem here is that when the driver writes zero to CSOURCE register in
> step 1, this would enable power-gating,
> which prevents access to the counter and result in writing/reading value in
> step 2 and 3.
> 
> I have found a system that reproduced this case (w/ unusually large number of
> count), and debug the issue further.
> As a hack, I have tried skipping step 1, and it seems to eliminate this issue.
> However, this is logically incorrect,
> and might result in inaccurate data depending on the events.
> 
> Here are the options:
> 1. Continue to look for workaround for this issue.
> 2. Find a way to disable event time-multiplexing (e.g. only limit the number
> of counters to 8)
>    if power gating is enabled on the platform.
> 3. Back to the original logic where we had the pre-init check of the counter
> vlues, which is still the safest choice
>    at the moment unless

If the "power-gated" counter only ignores writes, but yields correct values on
reads, you don't need to change its value.

0. When initializing the counter, save its value as 'raw_counter_value'.

When switching:

1. Set CSOURCE to zero (pauses the counter).
2. Read current counter value ('new_value').
3. Add '(new_value - raw_counter_value) & mask' to the current event count;
   where 'mask' is '(1ULL << 48) - 1' to account for overflow correctly.
4. Assign 'raw_counter_value = new_value'.
5. Set CSOURCE to new configuration value.

Modifying the hardware counter value is only needed to get overflow interrupts,
but there's no code to handle them on Linux (since the interrupts are
asynchronous, and given the nature of the events, I don't see how it would be
useful).

Alexander

[-- Attachment #2: Type: text/plain, Size: 156 bytes --]

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 46+ messages in thread

end of thread, other threads:[~2021-04-20 10:33 UTC | newest]

Thread overview: 46+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-04-09  8:58 [PATCH 0/2] iommu/amd: Revert and remove failing PMC test Suravee Suthikulpanit
2021-04-09  8:58 ` Suravee Suthikulpanit
2021-04-09  8:58 ` [PATCH 1/2] Revert "iommu/amd: Fix performance counter initialization" Suravee Suthikulpanit
2021-04-09  8:58   ` Suravee Suthikulpanit
2021-04-09 17:06   ` Shuah Khan
2021-04-09 17:06     ` Shuah Khan
2021-04-13 13:36     ` Suthikulpanit, Suravee
2021-04-13 13:36       ` Suthikulpanit, Suravee
2021-04-09  8:58 ` [PATCH 2/2] iommu/amd: Remove performance counter pre-initialization test Suravee Suthikulpanit
2021-04-09  8:58   ` Suravee Suthikulpanit
2021-04-09 16:37   ` Shuah Khan
2021-04-09 16:37     ` Shuah Khan
2021-04-09 17:10     ` Shuah Khan
2021-04-09 17:10       ` Shuah Khan
2021-04-09 20:00   ` Shuah Khan
2021-04-09 20:00     ` Shuah Khan
2021-04-09 20:19     ` Shuah Khan
2021-04-09 20:19       ` Shuah Khan
2021-04-09 20:11   ` David Coe
2021-04-09 20:11     ` David Coe
2021-04-10  8:17   ` David Coe
2021-04-10  8:17     ` David Coe
2021-04-10 10:03   ` David Coe
2021-04-10 10:03     ` David Coe
2021-04-13 13:51     ` Suthikulpanit, Suravee
2021-04-13 13:51       ` Suthikulpanit, Suravee
2021-04-14 15:33       ` David Coe
2021-04-14 15:33         ` David Coe
2021-04-15  9:28         ` Suthikulpanit, Suravee
2021-04-15  9:28           ` Suthikulpanit, Suravee
2021-04-15 14:39           ` David Coe
2021-04-15 14:39             ` David Coe
2021-04-15 16:20           ` David Coe
2021-04-15 16:20             ` David Coe
2021-04-18 19:16           ` David Coe
2021-04-18 19:16             ` David Coe
2021-04-14 22:18       ` David Coe
2021-04-14 22:18         ` David Coe
2021-04-20  8:38     ` Suthikulpanit, Suravee
2021-04-20  8:38       ` Suthikulpanit, Suravee
2021-04-20 10:33       ` Alexander Monakov
2021-04-20 10:33         ` Alexander Monakov
2021-04-13  9:38   ` David Coe
2021-04-13  9:38     ` David Coe
2021-04-15 13:41 ` [PATCH 0/2] iommu/amd: Revert and remove failing PMC test Joerg Roedel
2021-04-15 13:41   ` Joerg Roedel

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.