iommu.lists.linux-foundation.org archive mirror
* [PATCH] Revert "iommu/amd: Fix performance counter initialization"
@ 2021-03-03 12:11 Paul Menzel
  2021-03-03 13:25 ` Suravee Suthikulpanit
  0 siblings, 1 reply; 5+ messages in thread
From: Paul Menzel @ 2021-03-03 12:11 UTC (permalink / raw)
  To: Joerg Roedel
  Cc: Paul Menzel, David Coe, Alexander Monakov, iommu, Shuah Khan, Tj

This reverts commit 6778ff5b21bd8e78c8bd547fd66437cf2657fd9b.

The commit adds up to 100 ms to the boot process, which is not mentioned
in the commit message, and is making up more than 20 % on current
systems, where the Linux kernel takes 500 ms.

    [    0.000000] Linux version 5.11.0-10281-g19b4f3edd5c9 (root@a2ab663d937e) (gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.1) #138 SMP Wed Feb 24 11:28:17 UTC 2021
    […]
    [    0.106422] smpboot: CPU0: AMD Ryzen 3 2200G with Radeon Vega Graphics (family: 0x17, model: 0x11, stepping: 0x0)
    […]
    [    0.291257] pci 0000:00:00.2: AMD-Vi: Unable to read/write to IOMMU perf counter.
    […]

Also, it does not fix the problem on an MSI B350M MORTAR with AMD Ryzen
3 2200G (even with ten retries, resulting in 200 ms time-out).

    [    0.401152] pci 0000:00:00.2: AMD-Vi: Unable to read/write to IOMMU perf counter.

Additionally, alternative proposed solutions [1] were not considered or
discussed.

[1]: https://lore.kernel.org/linux-iommu/alpine.LNX.2.20.13.2006030935570.3181@monopod.intra.ispras.ru/

Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Cc: Tj (Elloe Linux) <ml.linux@elloe.vision>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Cc: Alexander Monakov <amonakov@ispras.ru>
Cc: David Coe <david.coe@live.co.uk>
Cc: iommu@lists.linux-foundation.org
Signed-off-by: Paul Menzel <pmenzel@molgen.mpg.de>
---
 drivers/iommu/amd/init.c | 45 ++++++++++------------------------------
 1 file changed, 11 insertions(+), 34 deletions(-)

diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index 9126efcbaf2c..af195f11d254 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -12,7 +12,6 @@
 #include <linux/acpi.h>
 #include <linux/list.h>
 #include <linux/bitmap.h>
-#include <linux/delay.h>
 #include <linux/slab.h>
 #include <linux/syscore_ops.h>
 #include <linux/interrupt.h>
@@ -257,8 +256,6 @@ static enum iommu_init_state init_state = IOMMU_START_STATE;
 static int amd_iommu_enable_interrupts(void);
 static int __init iommu_go_to_state(enum iommu_init_state state);
 static void init_device_table_dma(void);
-static int iommu_pc_get_set_reg(struct amd_iommu *iommu, u8 bank, u8 cntr,
-				u8 fxn, u64 *value, bool is_write);
 
 static bool amd_iommu_pre_enabled = true;
 
@@ -1717,11 +1714,13 @@ static int __init init_iommu_all(struct acpi_table_header *table)
 	return 0;
 }
 
-static void __init init_iommu_perf_ctr(struct amd_iommu *iommu)
+static int iommu_pc_get_set_reg(struct amd_iommu *iommu, u8 bank, u8 cntr,
+				u8 fxn, u64 *value, bool is_write);
+
+static void init_iommu_perf_ctr(struct amd_iommu *iommu)
 {
-	int retry;
 	struct pci_dev *pdev = iommu->dev;
-	u64 val = 0xabcd, val2 = 0, save_reg, save_src;
+	u64 val = 0xabcd, val2 = 0, save_reg = 0;
 
 	if (!iommu_feature(iommu, FEATURE_PC))
 		return;
@@ -1729,39 +1728,17 @@ static void __init init_iommu_perf_ctr(struct amd_iommu *iommu)
 	amd_iommu_pc_present = true;
 
 	/* save the value to restore, if writable */
-	if (iommu_pc_get_set_reg(iommu, 0, 0, 0, &save_reg, false) ||
-	    iommu_pc_get_set_reg(iommu, 0, 0, 8, &save_src, false))
-		goto pc_false;
-
-	/*
-	 * Disable power gating by programing the performance counter
-	 * source to 20 (i.e. counts the reads and writes from/to IOMMU
-	 * Reserved Register [MMIO Offset 1FF8h] that are ignored.),
-	 * which never get incremented during this init phase.
-	 * (Note: The event is also deprecated.)
-	 */
-	val = 20;
-	if (iommu_pc_get_set_reg(iommu, 0, 0, 8, &val, true))
+	if (iommu_pc_get_set_reg(iommu, 0, 0, 0, &save_reg, false))
 		goto pc_false;
 
 	/* Check if the performance counters can be written to */
-	val = 0xabcd;
-	for (retry = 5; retry; retry--) {
-		if (iommu_pc_get_set_reg(iommu, 0, 0, 0, &val, true) ||
-		    iommu_pc_get_set_reg(iommu, 0, 0, 0, &val2, false) ||
-		    val2)
-			break;
-
-		/* Wait about 20 msec for power gating to disable and retry. */
-		msleep(20);
-	}
-
-	/* restore */
-	if (iommu_pc_get_set_reg(iommu, 0, 0, 0, &save_reg, true) ||
-	    iommu_pc_get_set_reg(iommu, 0, 0, 8, &save_src, true))
+	if ((iommu_pc_get_set_reg(iommu, 0, 0, 0, &val, true)) ||
+	    (iommu_pc_get_set_reg(iommu, 0, 0, 0, &val2, false)) ||
+	    (val != val2))
 		goto pc_false;
 
-	if (val != val2)
+	/* restore */
+	if (iommu_pc_get_set_reg(iommu, 0, 0, 0, &save_reg, true))
 		goto pc_false;
 
 	pci_info(pdev, "IOMMU performance counters supported\n");
-- 
2.30.1

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

* Re: [PATCH] Revert "iommu/amd: Fix performance counter initialization"
  2021-03-03 12:11 [PATCH] Revert "iommu/amd: Fix performance counter initialization" Paul Menzel
@ 2021-03-03 13:25 ` Suravee Suthikulpanit
  2021-03-03 14:10   ` Alexander Monakov
  0 siblings, 1 reply; 5+ messages in thread
From: Suravee Suthikulpanit @ 2021-03-03 13:25 UTC (permalink / raw)
  To: Paul Menzel, Joerg Roedel
  Cc: Tj, iommu, Alexander Monakov, David Coe, Shuah Khan

Paul,

On 3/3/21 7:11 PM, Paul Menzel wrote:
> This reverts commit 6778ff5b21bd8e78c8bd547fd66437cf2657fd9b.
> 
> The commit adds up to 100 ms to the boot process, which is not mentioned
> in the commit message, and is making up more than 20 % on current
> systems, where the Linux kernel takes 500 ms.

The 100 ms (5 * 20 ms) applies only in the worst-case scenario. In most cases,
the delay does not apply. In addition, this patch has been shown to fix the issue for
some users in the field.
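
To make the worst-case arithmetic concrete, here is a minimal user-space sketch of the retry loop's timing. It is not the driver code: `struct fake_counter`, `fake_write()` and `probe_delay_ms()` are hypothetical names that simulate a counter which only accepts writes after a given time, and the sleep is simulated rather than performed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the counter hardware: a write only sticks
 * once `ready_after_ms` of simulated time has passed (power gating off).
 */
struct fake_counter {
	uint64_t value;
	unsigned int now_ms;
	unsigned int ready_after_ms;
};

void fake_write(struct fake_counter *c, uint64_t v)
{
	if (c->now_ms >= c->ready_after_ms)
		c->value = v;
}

/*
 * Same shape as the loop being reverted: write a test value, read it
 * back, and sleep before the next attempt while the read-back is zero.
 * Returns the total simulated sleep time in milliseconds.
 */
unsigned int probe_delay_ms(struct fake_counter *c, int retries,
			    unsigned int sleep_ms, bool *writable)
{
	const uint64_t val = 0xabcd;
	unsigned int slept = 0;

	assert(c != NULL && writable != NULL);

	for (; retries > 0; retries--) {
		fake_write(c, val);
		if (c->value)		/* read-back is non-zero: stop retrying */
			break;
		c->now_ms += sleep_ms;	/* stands in for msleep(20) */
		slept += sleep_ms;
	}

	*writable = (c->value == val);
	return slept;
}
```

With 5 retries of 20 ms, a counter that never becomes writable costs the full 100 ms, a counter that is writable immediately costs nothing, and ten retries would cost 200 ms in the failing case.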

> 
>      [    0.000000] Linux version 5.11.0-10281-g19b4f3edd5c9 (root@a2ab663d937e) (gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.1) #138 SMP Wed Feb 24 11:28:17 UTC 2021
>      […]
>      [    0.106422] smpboot: CPU0: AMD Ryzen 3 2200G with Radeon Vega Graphics (family: 0x17, model: 0x11, stepping: 0x0)
>      […]
>      [    0.291257] pci 0000:00:00.2: AMD-Vi: Unable to read/write to IOMMU perf counter.
>      […]
> 
> Also, it does not fix the problem on an MSI B350M MORTAR with AMD Ryzen
> 3 2200G (even with ten retries, resulting in 200 ms time-out).

We are still investigating the root cause of the long delay before the IOMMU
performance counter unit disables power gating and allows access to
the performance counters. If your concern is the number of retries,
we can try to reduce it.

> 
>      [    0.401152] pci 0000:00:00.2: AMD-Vi: Unable to read/write to IOMMU perf counter.
> 
> Additionally, alternative proposed solutions [1] were not considered or
> discussed.
> 
> [1]:https://lore.kernel.org/linux-iommu/alpine.LNX.2.20.13.2006030935570.3181@monopod.intra.ispras.ru/

This check was introduced early on to detect a HW issue on certain past platforms,
where the performance counters are not accessible, which would result in silent failure when
trying to use the counters. This is considered legacy code and can be removed if we decide to no
longer provide a sanity check for such cases.

Regards,
Suravee

* Re: [PATCH] Revert "iommu/amd: Fix performance counter initialization"
  2021-03-03 13:25 ` Suravee Suthikulpanit
@ 2021-03-03 14:10   ` Alexander Monakov
  2021-03-18  9:20     ` Paul Menzel
  0 siblings, 1 reply; 5+ messages in thread
From: Alexander Monakov @ 2021-03-03 14:10 UTC (permalink / raw)
  To: Suravee Suthikulpanit; +Cc: Paul Menzel, David Coe, iommu, Shuah Khan, Tj

On Wed, 3 Mar 2021, Suravee Suthikulpanit wrote:

> > Additionally, alternative proposed solutions [1] were not considered or
> > discussed.
> > 
> > [1]:https://lore.kernel.org/linux-iommu/alpine.LNX.2.20.13.2006030935570.3181@monopod.intra.ispras.ru/
> 
> This check was introduced early on to detect a HW issue on certain
> past platforms, where the performance counters are not accessible,
> which would result in silent failure when trying to use the
> counters. This is considered legacy code and can be removed if we
> decide to no longer provide a sanity check for such cases.

Which platforms? There is no such information in the code or the commit
messages that introduced this.

According to AMD's documentation, presence of performance counters is
indicated by "PCSup" bit in the "EFR" register. I don't think the driver
should second-guess that. If there were platforms where the CPU or the
firmware lied to the OS (EFR[PCSup] was 1, but counters were not present),
I think that should have been handled in a more explicit manner, e.g.
via matching broken CPUs by cpuid.
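
A minimal user-space sketch of what such an explicit check could look like, under stated assumptions: `struct cpu_id`, `broken_pc_cpus` and `iommu_pc_usable()` are hypothetical names, and the quirk table holds only a placeholder entry because no affected platform has been named in this thread. The bit position for PCSup (bit 9) follows my reading of the kernel's `FEATURE_PC` definition.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* EFR[PCSup]; assumed to be bit 9, matching the kernel's FEATURE_PC. */
#define EFR_PCSUP (1ULL << 9)

/* Hypothetical CPU identification, as it would be derived from cpuid. */
struct cpu_id {
	unsigned int family;
	unsigned int model;
};

/*
 * Hypothetical quirk list of CPUs that advertise PCSup but have
 * unusable counters. The single entry is a placeholder, not a real CPU.
 */
static const struct cpu_id broken_pc_cpus[] = {
	{ 0xff, 0xff },
};

/* Trust EFR[PCSup] unless the CPU is on the explicit quirk list. */
bool iommu_pc_usable(uint64_t efr, struct cpu_id cpu)
{
	size_t i;

	if (!(efr & EFR_PCSUP))
		return false;

	for (i = 0; i < sizeof(broken_pc_cpus) / sizeof(broken_pc_cpus[0]); i++) {
		if (broken_pc_cpus[i].family == cpu.family &&
		    broken_pc_cpus[i].model == cpu.model)
			return false;
	}

	return true;
}
```

The point of this shape is that no write/read-back probe and no sleep are needed at boot: the decision depends only on the feature bit and a static table.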

Alexander

* Re: [PATCH] Revert "iommu/amd: Fix performance counter initialization"
  2021-03-03 14:10   ` Alexander Monakov
@ 2021-03-18  9:20     ` Paul Menzel
  2021-04-07 10:04       ` Joerg Roedel
  0 siblings, 1 reply; 5+ messages in thread
From: Paul Menzel @ 2021-03-18  9:20 UTC (permalink / raw)
  To: Alexander Monakov, Suravee Suthikulpanit
  Cc: Sasha Levin, David Coe, Greg KH, stable, iommu, Shuah Khan, Tj

Dear Jörg, dear Suravee,


On 03.03.21 at 15:10, Alexander Monakov wrote:
> On Wed, 3 Mar 2021, Suravee Suthikulpanit wrote:
> 
>>> Additionally, alternative proposed solutions [1] were not considered or
>>> discussed.
>>>
>>> [1]:https://lore.kernel.org/linux-iommu/alpine.LNX.2.20.13.2006030935570.3181@monopod.intra.ispras.ru/
>>
>> This check was introduced early on to detect a HW issue on certain
>> past platforms, where the performance counters are not accessible,
>> which would result in silent failure when trying to use the
>> counters. This is considered legacy code and can be removed if we
>> decide to no longer provide a sanity check for such cases.
> 
> Which platforms? There is no such information in the code or the commit
> messages that introduced this.
> 
> According to AMD's documentation, presence of performance counters is
> indicated by "PCSup" bit in the "EFR" register. I don't think the driver
> should second-guess that. If there were platforms where the CPU or the
> firmware lied to the OS (EFR[PCSup] was 1, but counters were not present),
> I think that should have been handled in a more explicit manner, e.g.
> via matching broken CPUs by cpuid.

Suravee, could you please answer the questions?

Jörg, I know you are probably busy, but the patch was applied to the
stable series (v5.11.7). There are still too many open questions
regarding the patch, and Suravee has not yet addressed the comments.
It’d be great if you could revert it.


Kind regards,

Paul


* Re: [PATCH] Revert "iommu/amd: Fix performance counter initialization"
  2021-03-18  9:20     ` Paul Menzel
@ 2021-04-07 10:04       ` Joerg Roedel
  0 siblings, 0 replies; 5+ messages in thread
From: Joerg Roedel @ 2021-04-07 10:04 UTC (permalink / raw)
  To: Paul Menzel
  Cc: Sasha Levin, David Coe, Greg KH, Alexander Monakov, stable,
	iommu, Shuah Khan, Tj

Hi Paul,

On Thu, Mar 18, 2021 at 10:20:16AM +0100, Paul Menzel wrote:
> Jörg, I know you are probably busy, but the patch was applied to the stable
> series (v5.11.7). There are still too many open questions regarding the
> patch, and Suravee has not yet addressed the comments. It’d be great if you
> could revert it.

We are currently discussing the next steps here. Maybe the retry logic
can be removed entirely.

Regards,

	Joerg