* [PATCH v2] iommu/dma: Add config for PCI SAC address trick
From: Robin Murphy @ 2022-06-09 15:12 UTC
  To: will, joro; +Cc: iommu, linux-kernel, hch, john.garry

For devices stuck behind a conventional PCI bus, saving extra cycles at
33MHz is probably fairly significant. However since native PCI Express
is now the norm for high-performance devices, the optimisation to always
prefer 32-bit addresses for the sake of avoiding DAC is starting to look
rather anachronistic. Technically 32-bit addresses do have shorter TLPs
on PCIe, but unless the device is saturating its link bandwidth with
small transfers it seems unlikely that the difference is appreciable.
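
(For scale: a 64-bit address means a 4DW rather than a 3DW TLP header,
i.e. 4 extra bytes per memory request - under 2% extra overhead even on
a modest 256-byte payload, and proportionally less for larger
transfers.)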

What definitely is appreciable, however, is that the IOVA allocator
doesn't behave all that well once the 32-bit space starts getting full.
As DMA working sets get bigger, this optimisation increasingly backfires
and adds considerable overhead to the dma_map path for use-cases like
high-bandwidth networking. We've increasingly bandaged the allocator
in attempts to mitigate this, but it remains fundamentally at odds with
other valid requirements to try as hard as possible to satisfy a request
within the given limit; what we really need is to just avoid this odd
notion of a speculative allocation when it isn't beneficial anyway.
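
For reference, the trick amounts to roughly the following in
iommu_dma_alloc_iova() (simplified):

	/* Try to get PCI devices a SAC address */
	if (dma_limit > DMA_BIT_MASK(32) && !iommu_dma_forcedac && dev_is_pci(dev))
		iova = alloc_iova_fast(iovad, iova_len,
				       DMA_BIT_MASK(32) >> shift, false);
	/* On failure (or for everyone else), use the full usable range */
	if (!iova)
		iova = alloc_iova_fast(iovad, iova_len,
				       dma_limit >> shift, true);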

Unfortunately that's where things get awkward... Having been present on
x86 for 15 years or so now, it turns out there are systems which fail to
properly define the upper limit of usable IOVA space for certain devices
and this trick was the only thing letting them work OK. I had a similar
ulterior motive for a couple of early arm64 systems when originally
adding it to iommu-dma, but those really should be fixed with proper
firmware bindings by now. Let's be brave and default it to off in the
hope that CI systems and developers will find and fix those bugs, but
expect that desktop-focused distro configs are likely to want to turn
it back on for maximum compatibility.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
---

v2: Tweak wording to clarify that it's not really an optimisation in
    general, remove "default X86".

 drivers/iommu/Kconfig     | 26 ++++++++++++++++++++++++++
 drivers/iommu/dma-iommu.c |  2 +-
 2 files changed, 27 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index c79a0df090c0..5a225b48dd00 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -144,6 +144,32 @@ config IOMMU_DMA
 	select IRQ_MSI_IOMMU
 	select NEED_SG_DMA_LENGTH
 
+config IOMMU_DMA_PCI_SAC
+	bool "Enable 64-bit legacy PCI optimisation by default"
+	depends on IOMMU_DMA
+	help
+	  Enable by default an IOMMU optimisation for 64-bit legacy PCI devices,
+	  wherein the DMA API layer will always first try to allocate a 32-bit
+	  DMA address suitable for a single address cycle, before falling back
+	  to allocating from the device's full usable address range. If your
+	  system has 64-bit legacy PCI devices in 32-bit slots where using dual
+	  address cycles reduces DMA throughput significantly, this may be
+	  beneficial to overall performance.
+
+	  If you have a modern PCI Express based system, this feature mostly just
+	  represents extra overhead in the allocation path for no practical
+	  benefit, and it should usually be preferable to say "n" here.
+
+	  However, beware that this feature has also historically papered over
+	  bugs where the IOMMU address width and/or device DMA mask is not set
+	  correctly. If device DMA problems and IOMMU faults start occurring
+	  after disabling this option, it is almost certainly indicative of a
+	  latent driver or firmware/BIOS bug, which would previously have only
+	  manifested with several gigabytes worth of concurrent DMA mappings.
+
+	  If this option is not set, the feature can still be re-enabled at
+	  boot time with the "iommu.forcedac=0" command-line argument.
+
 # Shared Virtual Addressing
 config IOMMU_SVA
 	bool
diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index f90251572a5d..9f9d9ba7f376 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -67,7 +67,7 @@ struct iommu_dma_cookie {
 };
 
 static DEFINE_STATIC_KEY_FALSE(iommu_deferred_attach_enabled);
-bool iommu_dma_forcedac __read_mostly;
+bool iommu_dma_forcedac __read_mostly = !IS_ENABLED(CONFIG_IOMMU_DMA_PCI_SAC);
 
 static int __init iommu_dma_forcedac_setup(char *str)
 {
-- 
2.36.1.dirty


* Re: [PATCH v2] iommu/dma: Add config for PCI SAC address trick
From: John Garry @ 2022-06-09 18:09 UTC
  To: Robin Murphy, will, joro; +Cc: iommu, linux-kernel, hch

On 09/06/2022 16:12, Robin Murphy wrote:
> For devices stuck behind a conventional PCI bus, saving extra cycles at
> 33MHz is probably fairly significant. However since native PCI Express
> is now the norm for high-performance devices, the optimisation to always
> prefer 32-bit addresses for the sake of avoiding DAC is starting to look
> rather anachronistic. Technically 32-bit addresses do have shorter TLPs
> on PCIe, but unless the device is saturating its link bandwidth with
> small transfers it seems unlikely that the difference is appreciable.
> 
> What definitely is appreciable, however, is that the IOVA allocator
> doesn't behave all that well once the 32-bit space starts getting full.
> As DMA working sets get bigger, this optimisation increasingly backfires
> and adds considerable overhead to the dma_map path for use-cases like
> high-bandwidth networking. We've increasingly bandaged the allocator
> in attempts to mitigate this, but it remains fundamentally at odds with
> other valid requirements to try as hard as possible to satisfy a request
> within the given limit; what we really need is to just avoid this odd
> notion of a speculative allocation when it isn't beneficial anyway.
> 
> Unfortunately that's where things get awkward... Having been present on
> x86 for 15 years or so now, it turns out there are systems which fail to
> properly define the upper limit of usable IOVA space for certain devices
> and this trick was the only thing letting them work OK. I had a similar
> ulterior motive for a couple of early arm64 systems when originally
> adding it to iommu-dma, but those really should be fixed with proper
> firmware bindings by now. Let's be brave and default it to off in the
> hope that CI systems and developers will find and fix those bugs, but
> expect that desktop-focused distro configs are likely to want to turn
> it back on for maximum compatibility.
> 
> Signed-off-by: Robin Murphy <robin.murphy@arm.com>

FWIW,
Reviewed-by: John Garry <john.garry@huawei.com>

If we're not enabling this by default for x86, then doesn't Joerg have
some xHCI issue which we would now need to quirk? I don't remember which
device exactly. Or, alternatively, simply ask him to enable this new config.


> ---
> 
> v2: Tweak wording to clarify that it's not really an optimisation in
>      general, remove "default X86".
> 
>   drivers/iommu/Kconfig     | 26 ++++++++++++++++++++++++++
>   drivers/iommu/dma-iommu.c |  2 +-
>   2 files changed, 27 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
> index c79a0df090c0..5a225b48dd00 100644
> --- a/drivers/iommu/Kconfig
> +++ b/drivers/iommu/Kconfig
> @@ -144,6 +144,32 @@ config IOMMU_DMA
>   	select IRQ_MSI_IOMMU
>   	select NEED_SG_DMA_LENGTH
>   
> +config IOMMU_DMA_PCI_SAC
> +	bool "Enable 64-bit legacy PCI optimisation by default"
> +	depends on IOMMU_DMA
> +	help
> +	  Enable by default an IOMMU optimisation for 64-bit legacy PCI devices,
> +	  wherein the DMA API layer will always first try to allocate a 32-bit
> +	  DMA address suitable for a single address cycle, before falling back
> +	  to allocating from the device's full usable address range. If your
> +	  system has 64-bit legacy PCI devices in 32-bit slots where using dual
> +	  address cycles reduces DMA throughput significantly, this may be
> +	  beneficial to overall performance.
> +
> +	  If you have a modern PCI Express based system, this feature mostly just
> +	  represents extra overhead in the allocation path for no practical
> +	  benefit, and it should usually be preferable to say "n" here.
> +
> +	  However, beware that this feature has also historically papered over
> +	  bugs where the IOMMU address width and/or device DMA mask is not set
> +	  correctly. If device DMA problems and IOMMU faults start occurring
> +	  after disabling this option, it is almost certainly indicative of a
> +	  latent driver or firmware/BIOS bug, which would previously have only
> +	  manifested with several gigabytes worth of concurrent DMA mappings.
> +
> +	  If this option is not set, the feature can still be re-enabled at
> +	  boot time with the "iommu.forcedac=0" command-line argument.
> +
>   # Shared Virtual Addressing
>   config IOMMU_SVA
>   	bool
> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> index f90251572a5d..9f9d9ba7f376 100644
> --- a/drivers/iommu/dma-iommu.c
> +++ b/drivers/iommu/dma-iommu.c
> @@ -67,7 +67,7 @@ struct iommu_dma_cookie {
>   };
>   
>   static DEFINE_STATIC_KEY_FALSE(iommu_deferred_attach_enabled);
> -bool iommu_dma_forcedac __read_mostly;
> +bool iommu_dma_forcedac __read_mostly = !IS_ENABLED(CONFIG_IOMMU_DMA_PCI_SAC);
>   
>   static int __init iommu_dma_forcedac_setup(char *str)
>   {


* Re: [PATCH v2] iommu/dma: Add config for PCI SAC address trick
From: Christoph Hellwig @ 2022-06-10  6:01 UTC
  To: Robin Murphy; +Cc: will, joro, iommu, linux-kernel, hch, john.garry

On Thu, Jun 09, 2022 at 04:12:10PM +0100, Robin Murphy wrote:
> +	  If you have a modern PCI Express based system, this feature mostly just

Overly long line here.

Otherwise looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>


* Re: [PATCH v2] iommu/dma: Add config for PCI SAC address trick
From: Joerg Roedel @ 2022-06-22 12:59 UTC
  To: Robin Murphy; +Cc: will, iommu, linux-kernel, hch, john.garry

On Thu, Jun 09, 2022 at 04:12:10PM +0100, Robin Murphy wrote:
> firmware bindings by now. Let's be brave and default it to off in the

I don't have an overall good feeling about this, but as you said, let's
be brave. This is now applied to the core branch.

If it causes too much trouble we can still revert it (and re-revert it
later, ...)

Thanks,

	Joerg


* Re: [PATCH v2] iommu/dma: Add config for PCI SAC address trick
From: Robin Murphy @ 2022-06-22 13:12 UTC
  To: Joerg Roedel; +Cc: iommu, will, linux-kernel, hch

On 2022-06-22 13:59, Joerg Roedel wrote:
> On Thu, Jun 09, 2022 at 04:12:10PM +0100, Robin Murphy wrote:
>> firmware bindings by now. Let's be brave and default it to off in the
> 
> I don't have an overall good feeling about this, but as you said, let's
> be brave. This is applied now to the core branch.
> 
> If it causes too much trouble we can still revert it (and re-revert it
> later, ...)

Even easier, we can just bring back "default X86", or "default y", if 
too many folks object to configuring it manually :)

Thanks for your bravery!
Robin.

* Re: [PATCH v2] iommu/dma: Add config for PCI SAC address trick
From: Joerg Roedel @ 2022-06-23 11:33 UTC
  To: Robin Murphy; +Cc: will, iommu, linux-kernel, hch, john.garry

On Wed, Jun 22, 2022 at 02:12:39PM +0100, Robin Murphy wrote:
> Thanks for your bravery!

It already starts. With that patch I am getting:

	xhci_hcd 0000:02:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x000f address=0xff00ffffffefe000 flags=0x0000]

in my kernel log. The device is an AMD xHCI controller and seems to
function normally after boot. The message disappears with
iommu.forcedac=0.

Need to look more into that...


* Re: [PATCH v2] iommu/dma: Add config for PCI SAC address trick
From: Robin Murphy @ 2022-06-23 11:41 UTC
  To: Joerg Roedel; +Cc: will, iommu, linux-kernel, hch, john.garry

On 2022-06-23 12:33, Joerg Roedel wrote:
> On Wed, Jun 22, 2022 at 02:12:39PM +0100, Robin Murphy wrote:
>> Thanks for your bravery!
> 
> It already starts. With that patch I am getting:
> 
> 	xhci_hcd 0000:02:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x000f address=0xff00ffffffefe000 flags=0x0000]
> 
> in my kernel log. The device is an AMD xHCI controller and seems to
> function normally after boot. The message disappears with
> iommu.forcedac=0.
> 
> Need to look more into that...

Given how amd_iommu_domain_alloc() sets the domain aperture, presumably 
the DMA address allocated was 0xffffffffffefe000? Odd that it gets bits 
punched out in the middle rather than simply truncated off the top as I 
would have expected :/
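
(For reference, a from-memory sketch of what amd_iommu_domain_alloc()
advertises - the full 64-bit input address space:

	domain->domain.geometry.aperture_start = 0;
	domain->domain.geometry.aperture_end   = ~0ULL;

hence the allocator handing out IOVAs from the very top.)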

Robin.


* Re: [PATCH v2] iommu/dma: Add config for PCI SAC address trick
From: Joerg Roedel @ 2022-06-24 13:28 UTC
  To: Robin Murphy; +Cc: will, iommu, linux-kernel, hch, john.garry

On Thu, Jun 23, 2022 at 12:41:00PM +0100, Robin Murphy wrote:
> On 2022-06-23 12:33, Joerg Roedel wrote:
> > On Wed, Jun 22, 2022 at 02:12:39PM +0100, Robin Murphy wrote:
> > > Thanks for your bravery!
> > 
> > It already starts. With that patch I am getting:
> > 
> > 	xhci_hcd 0000:02:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x000f address=0xff00ffffffefe000 flags=0x0000]
> > 
> > in my kernel log. The device is an AMD xHCI controller and seems to
> > function normally after boot. The message disappears with
> > iommu.forcedac=0.
> > 
> > Need to look more into that...
> 
> Given how amd_iommu_domain_alloc() sets the domain aperture, presumably the
> DMA address allocated was 0xffffffffffefe000? Odd that it gets bits punched
> out in the middle rather than simply truncated off the top as I would have
> expected :/

So it gets even weirder: as a workaround I changed the AMD IOMMU driver
to allocate a 4-level page-table and limit the DMA aperture to 48 bits.
I still get the same message.


* Re: [PATCH v2] iommu/dma: Add config for PCI SAC address trick
From: Robin Murphy @ 2022-06-24 14:49 UTC
  To: Joerg Roedel; +Cc: iommu, will, linux-kernel, hch

On 2022-06-24 14:28, Joerg Roedel wrote:
> On Thu, Jun 23, 2022 at 12:41:00PM +0100, Robin Murphy wrote:
>> On 2022-06-23 12:33, Joerg Roedel wrote:
>>> On Wed, Jun 22, 2022 at 02:12:39PM +0100, Robin Murphy wrote:
>>>> Thanks for your bravery!
>>>
>>> It already starts. With that patch I am getting:
>>>
>>> 	xhci_hcd 0000:02:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x000f address=0xff00ffffffefe000 flags=0x0000]
>>>
>>> in my kernel log. The device is an AMD xHCI controller and seems to
>>> function normally after boot. The message disappears with
>>> iommu.forcedac=0.
>>>
>>> Need to look more into that...
>>
>> Given how amd_iommu_domain_alloc() sets the domain aperture, presumably the
>> DMA address allocated was 0xffffffffffefe000? Odd that it gets bits punched
>> out in the middle rather than simply truncated off the top as I would have
>> expected :/
> 
> So it gets even weirder: as a workaround I changed the AMD IOMMU driver
> to allocate a 4-level page-table and limit the DMA aperture to 48 bits.
> I still get the same message.

Hmm, in that case my best guess would be that somewhere between the 
device itself and the IOMMU input it's trying to sign-extend the address 
from bit 47 or lower, but for whatever reason bits 55:48 get lost.
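
Concretely, if the allocated IOVA really was 0xffffffffffefe000, the
arithmetic lines up:

	bits 47:0 of the IOVA:   0x0000ffffffefe000
	full sign-extension:     0xffffffffffefe000  (bit 47 copied into bits 63:48)
	observed fault address:  0xff00ffffffefe000  (bits 63:56 extended, 55:48 lost)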

For comparison, the PCI xHCI I have to hand (with nothing plugged in)
only has 6 pages mapped for its command ring and other stuff. Thus
unless it's sharing that domain with other devices, to be accessing 
something down in the second MB of IOVA space suggests that this 
probably isn't the very first access it's made, and therefore it would 
almost certainly have to be the endpoint emitting a corrupted address, 
but only for certain operations.

FWIW I'd be inclined to turn on DMA debug and call 
debug_dma_dump_mappings() from the IOMMU fault handler, and/or add a bit 
of tracing to all the DMA mapping/allocation sites in the xHCI driver, 
to see what the offending address most likely represents.
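
Something like the below, say (untested sketch, assuming the fault
handler has, or can look up, the struct device for the faulting ID):

	#ifdef CONFIG_DMA_API_DEBUG
		/* Dump the device's live DMA mappings when a fault is logged */
		debug_dma_dump_mappings(dev);
	#endif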

Robin.

Thread overview: 9 messages
2022-06-09 15:12 [PATCH v2] iommu/dma: Add config for PCI SAC address trick Robin Murphy
2022-06-09 18:09 ` John Garry
2022-06-10  6:01 ` Christoph Hellwig
2022-06-22 12:59 ` Joerg Roedel
2022-06-22 13:12   ` Robin Murphy
2022-06-23 11:33     ` Joerg Roedel
2022-06-23 11:41       ` Robin Murphy
2022-06-24 13:28         ` Joerg Roedel
2022-06-24 14:49           ` Robin Murphy
