* [PATCH] iommu/amd: Fix I/O page fault logging ratelimit test
@ 2021-07-19  0:47 Lennert Buytenhek
  2021-07-21  0:05 ` Suthikulpanit, Suravee via iommu
  0 siblings, 1 reply; 3+ messages in thread
From: Lennert Buytenhek @ 2021-07-19  0:47 UTC (permalink / raw)
  To: iommu, Joerg Roedel, Suravee Suthikulpanit

On an AMD system, I/O page faults are usually logged like this:

	drvname 0000:05:00.0: Event logged [IO_PAGE_FAULT domain=0x0000 address=0x0000000092050da0 flags=0x0020]

But sometimes they are logged like this instead, even for the exact
same PCI device:

	AMD-Vi: Event logged [IO_PAGE_FAULT device=05:00.0 domain=0x0000 address=0x0000000092050de0 flags=0x0020]

This discrepancy appears to be caused by this code:

	if (dev_data && __ratelimit(&dev_data->rs)) {
		pci_err(pdev, "Event logged [IO_PAGE_FAULT domain=0x%04x address=0x%llx flags=0x%04x]\n",
			domain_id, address, flags);
	} else if (printk_ratelimit()) {
		pr_err("Event logged [IO_PAGE_FAULT device=%02x:%02x.%x domain=0x%04x address=0x%llx flags=0x%04x]\n",
			PCI_BUS_NUM(devid), PCI_SLOT(devid), PCI_FUNC(devid),
			domain_id, address, flags);
	}

If an I/O page fault occurs for a PCI device with associated
iommu_dev_data, but for which the __ratelimit(&dev_data->rs) check fails,
we'll give it a second chance with printk_ratelimit(), and if that check
succeeds, we will log the fault anyway, but in a different format.

Change this to only check printk_ratelimit() if !dev_data, which seems to
be what was originally intended.
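
For illustration only, the difference between the two conditions can be
modelled in plain userspace C. This is a rough sketch, not kernel code:
the enum, the function names, and the three boolean inputs (standing in
for dev_data != NULL, the result of __ratelimit(&dev_data->rs), and the
result of printk_ratelimit()) are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Which message format, if any, a fault ends up being logged in. */
enum log_path { LOG_NONE, LOG_PCI_ERR, LOG_PR_ERR };

/* Condition before the patch: a device whose own ratelimit check fails
 * falls through to the global check and may be logged via pr_err(). */
enum log_path old_logic(bool have_dev_data, bool dev_rl_ok, bool global_rl_ok)
{
	if (have_dev_data && dev_rl_ok)
		return LOG_PCI_ERR;
	else if (global_rl_ok)
		return LOG_PR_ERR;
	return LOG_NONE;
}

/* Condition after the patch: the global check is only consulted when
 * there is no dev_data at all. */
enum log_path new_logic(bool have_dev_data, bool dev_rl_ok, bool global_rl_ok)
{
	if (have_dev_data && dev_rl_ok)
		return LOG_PCI_ERR;
	else if (!have_dev_data && global_rl_ok)
		return LOG_PR_ERR;
	return LOG_NONE;
}

/* The one input combination where the two versions disagree. */
void check_difference(void)
{
	assert(old_logic(true, false, true) == LOG_PR_ERR);
	assert(new_logic(true, false, true) == LOG_NONE);
}
```

For every other combination of the three inputs, the two functions
return the same value.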

Signed-off-by: Lennert Buytenhek <buytenh@wantstofly.org>
---
 drivers/iommu/amd/iommu.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 811a49a95d04..7ae426b092f2 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -483,7 +483,7 @@ static void amd_iommu_report_page_fault(u16 devid, u16 domain_id,
 	if (dev_data && __ratelimit(&dev_data->rs)) {
 		pci_err(pdev, "Event logged [IO_PAGE_FAULT domain=0x%04x address=0x%llx flags=0x%04x]\n",
 			domain_id, address, flags);
-	} else if (printk_ratelimit()) {
+	} else if (!dev_data && printk_ratelimit()) {
 		pr_err("Event logged [IO_PAGE_FAULT device=%02x:%02x.%x domain=0x%04x address=0x%llx flags=0x%04x]\n",
 			PCI_BUS_NUM(devid), PCI_SLOT(devid), PCI_FUNC(devid),
 			domain_id, address, flags);
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

* Re: [PATCH] iommu/amd: Fix I/O page fault logging ratelimit test
  2021-07-19  0:47 [PATCH] iommu/amd: Fix I/O page fault logging ratelimit test Lennert Buytenhek
@ 2021-07-21  0:05 ` Suthikulpanit, Suravee via iommu
  2021-07-21 13:46   ` Lennert Buytenhek
  0 siblings, 1 reply; 3+ messages in thread
From: Suthikulpanit, Suravee via iommu @ 2021-07-21  0:05 UTC (permalink / raw)
  To: Lennert Buytenhek, iommu, Joerg Roedel

Hi Lennert,

On 7/18/2021 7:47 PM, Lennert Buytenhek wrote:
> On an AMD system, I/O page faults are usually logged like this:
> ....
> diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
> index 811a49a95d04..7ae426b092f2 100644
> --- a/drivers/iommu/amd/iommu.c
> +++ b/drivers/iommu/amd/iommu.c
> @@ -483,7 +483,7 @@ static void amd_iommu_report_page_fault(u16 devid, u16 domain_id,
>   	if (dev_data && __ratelimit(&dev_data->rs)) {
>   		pci_err(pdev, "Event logged [IO_PAGE_FAULT domain=0x%04x address=0x%llx flags=0x%04x]\n",
>   			domain_id, address, flags);
> -	} else if (printk_ratelimit()) {
> +	} else if (!dev_data && printk_ratelimit()) {

This seems a bit confusing. Also, according to the following comment in include/linux/printk.h:

/*
  * Please don't use printk_ratelimit(), because it shares ratelimiting state
  * with all other unrelated printk_ratelimit() callsites.  Instead use
  * printk_ratelimited() or plain old __ratelimit().
  */
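
To make the shared-state concern concrete, here is a small userspace
sketch. It is an assumption-laden toy, not the kernel's limiter: struct
toy_ratelimit and both helpers are hypothetical names, and the model
only tracks a burst count and ignores the time interval after which the
real limiter allows messages again.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified limiter: lets `burst` messages through and
 * suppresses everything after that. No time-based refill is modelled. */
struct toy_ratelimit {
	int burst;
	int printed;
};

/* Returns true if the caller may print, false if it is suppressed. */
bool toy_ratelimit_ok(struct toy_ratelimit *rs)
{
	assert(rs != NULL);
	if (rs->printed >= rs->burst)
		return false;
	rs->printed++;
	return true;
}

/* Counts how many of `attempts` messages get past the limiter. */
int toy_count_allowed(struct toy_ratelimit *rs, int attempts)
{
	int allowed = 0;

	for (int i = 0; i < attempts; i++)
		if (toy_ratelimit_ok(rs))
			allowed++;
	return allowed;
}
```

If two unrelated callsites share one struct toy_ratelimit with a burst
of 10, a noisy callsite making 100 attempts uses up the whole budget and
a later single message from the other callsite is dropped. With one
state per callsite, which is what pr_err_ratelimited() provides through
a static per-callsite ratelimit state, the second callsite still gets
its message out.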

We probably should move away from using printk_ratelimit() here.
What about the following change instead?

diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 811a49a95d04..8eb5d3519743 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -480,11 +480,12 @@ static void amd_iommu_report_page_fault(u16 devid, u16 domain_id,
         if (pdev)
                 dev_data = dev_iommu_priv_get(&pdev->dev);

-       if (dev_data && __ratelimit(&dev_data->rs)) {
-               pci_err(pdev, "Event logged [IO_PAGE_FAULT domain=0x%04x address=0x%llx flags=0x%04x]\n",
-                       domain_id, address, flags);
-       } else if (printk_ratelimit()) {
-               pr_err("Event logged [IO_PAGE_FAULT device=%02x:%02x.%x domain=0x%04x address=0x%llx flags=0x%04x]\n",
+       if (dev_data) {
+               if (__ratelimit(&dev_data->rs))
+                       pci_err(pdev, "Event logged [IO_PAGE_FAULT domain=0x%04x address=0x%llx flags=0x%04x]\n",
+                               domain_id, address, flags);
+       } else {
+               pr_err_ratelimited("Event logged [IO_PAGE_FAULT device=%02x:%02x.%x domain=0x%04x address=0x%llx flags=0x%04x]\n",
                         PCI_BUS_NUM(devid), PCI_SLOT(devid), PCI_FUNC(devid),
                         domain_id, address, flags);
         }

Note that other places in this file might need a similar modification as well.

Thanks,
Suravee


* Re: [PATCH] iommu/amd: Fix I/O page fault logging ratelimit test
  2021-07-21  0:05 ` Suthikulpanit, Suravee via iommu
@ 2021-07-21 13:46   ` Lennert Buytenhek
  0 siblings, 0 replies; 3+ messages in thread
From: Lennert Buytenhek @ 2021-07-21 13:46 UTC (permalink / raw)
  To: Suthikulpanit, Suravee; +Cc: iommu

On Tue, Jul 20, 2021 at 07:05:50PM -0500, Suthikulpanit, Suravee wrote:

> Hi Lennert,

Hi Suravee,


> > On an AMD system, I/O page faults are usually logged like this:
> > ....
> > diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
> > index 811a49a95d04..7ae426b092f2 100644
> > --- a/drivers/iommu/amd/iommu.c
> > +++ b/drivers/iommu/amd/iommu.c
> > @@ -483,7 +483,7 @@ static void amd_iommu_report_page_fault(u16 devid, u16 domain_id,
> >   	if (dev_data && __ratelimit(&dev_data->rs)) {
> >   		pci_err(pdev, "Event logged [IO_PAGE_FAULT domain=0x%04x address=0x%llx flags=0x%04x]\n",
> >   			domain_id, address, flags);
> > -	} else if (printk_ratelimit()) {
> > +	} else if (!dev_data && printk_ratelimit()) {
> 
> This seems a bit confusing. Also, according to the following comment in include/linux/printk.h:
> 
> /*
>  * Please don't use printk_ratelimit(), because it shares ratelimiting state
>  * with all other unrelated printk_ratelimit() callsites.  Instead use
>  * printk_ratelimited() or plain old __ratelimit().
>  */
> 
> We probably should move away from using printk_ratelimit() here.
> What about the following change instead?
> 
> diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
> index 811a49a95d04..8eb5d3519743 100644
> --- a/drivers/iommu/amd/iommu.c
> +++ b/drivers/iommu/amd/iommu.c
> @@ -480,11 +480,12 @@ static void amd_iommu_report_page_fault(u16 devid, u16 domain_id,
>         if (pdev)
>                 dev_data = dev_iommu_priv_get(&pdev->dev);
> 
> -       if (dev_data && __ratelimit(&dev_data->rs)) {
> -               pci_err(pdev, "Event logged [IO_PAGE_FAULT domain=0x%04x address=0x%llx flags=0x%04x]\n",
> -                       domain_id, address, flags);
> -       } else if (printk_ratelimit()) {
> -               pr_err("Event logged [IO_PAGE_FAULT device=%02x:%02x.%x domain=0x%04x address=0x%llx flags=0x%04x]\n",
> +       if (dev_data) {
> +               if (__ratelimit(&dev_data->rs))
> +                       pci_err(pdev, "Event logged [IO_PAGE_FAULT domain=0x%04x address=0x%llx flags=0x%04x]\n",
> +                               domain_id, address, flags);
> +       } else {
> +               pr_err_ratelimited("Event logged [IO_PAGE_FAULT device=%02x:%02x.%x domain=0x%04x address=0x%llx flags=0x%04x]\n",
>                         PCI_BUS_NUM(devid), PCI_SLOT(devid), PCI_FUNC(devid),
>                         domain_id, address, flags);
>         }

Looks good!


> Note also that there might be other places in this file that would need
> similar modification as well.

Indeed, there are two more sites like these.

I've sent a new patch that incorporates your feedback.  Thank you!


Cheers,
Lennert