iommu.lists.linux-foundation.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 1/2] iommu/intel: Avoid SAC address trick for PCIe devices
@ 2020-07-08 11:32 Robin Murphy
  2020-07-08 11:32 ` [PATCH 2/2] iommu/dma: " Robin Murphy
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Robin Murphy @ 2020-07-08 11:32 UTC (permalink / raw)
  To: joro
  Cc: linux-pci, linux-kernel, iommu, jonathan.lemon, dwmw2, hch,
	linux-arm-kernel

For devices stuck behind a conventional PCI bus, saving extra cycles at
33MHz is probably fairly significant. However since native PCI Express
is now the norm for high-performance devices, the optimisation to always
prefer 32-bit addresses for the sake of avoiding DAC is starting to look
rather anachronistic. Technically 32-bit addresses do have shorter TLPs
on PCIe, but unless the device is saturating its link bandwidth with
small transfers it seems unlikely that the difference is appreciable.

What definitely is appreciable, however, is that the IOVA allocator
doesn't behave all that well once the 32-bit space starts getting full.
As DMA working sets get bigger, this optimisation increasingly backfires
and adds considerable overhead to the dma_map path for use-cases like
high-bandwidth networking.

As such, let's simply take it out of consideration for PCIe devices.
Technically this might work out suboptimal for a PCIe device stuck
behind a conventional PCI bridge, or for PCI-X devices that also have
native 64-bit addressing, but neither of those are likely to be found
in performance-critical parts of modern systems.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
---
 drivers/iommu/intel/iommu.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 9129663a7406..21b11de23a53 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -3348,7 +3348,8 @@ static unsigned long intel_alloc_iova(struct device *dev,
 	/* Ensure we reserve the whole size-aligned region */
 	nrpages = __roundup_pow_of_two(nrpages);
 
-	if (!dmar_forcedac && dma_mask > DMA_BIT_MASK(32)) {
+	if (!dmar_forcedac && dma_mask > DMA_BIT_MASK(32) &&
+	    dev_is_pci(dev) && !pci_is_pcie(to_pci_dev(dev))) {
 		/*
 		 * First try to allocate an io virtual address in
 		 * DMA_BIT_MASK(32) and if that fails then try allocating
-- 
2.23.0.dirty

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH 2/2] iommu/dma: Avoid SAC address trick for PCIe devices
  2020-07-08 11:32 [PATCH 1/2] iommu/intel: Avoid SAC address trick for PCIe devices Robin Murphy
@ 2020-07-08 11:32 ` Robin Murphy
  2020-07-13 13:14   ` Joerg Roedel
  2020-07-08 12:26 ` [PATCH 1/2] iommu/intel: " Christoph Hellwig
  2020-07-10 14:20 ` Joerg Roedel
  2 siblings, 1 reply; 7+ messages in thread
From: Robin Murphy @ 2020-07-08 11:32 UTC (permalink / raw)
  To: joro
  Cc: linux-pci, linux-kernel, iommu, jonathan.lemon, dwmw2, hch,
	linux-arm-kernel

As for the intel-iommu implementation, relegate the opportunistic
attempt to allocate a SAC address to the domain of conventional PCI
devices only, to avoid it increasingly causing far more performance
issues than possible benefits on modern PCI Express systems.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
---
 drivers/iommu/dma-iommu.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 4959f5df21bd..0ff124f16ad4 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -426,7 +426,8 @@ static dma_addr_t iommu_dma_alloc_iova(struct iommu_domain *domain,
 		dma_limit = min(dma_limit, (u64)domain->geometry.aperture_end);
 
 	/* Try to get PCI devices a SAC address */
-	if (dma_limit > DMA_BIT_MASK(32) && dev_is_pci(dev))
+	if (dma_limit > DMA_BIT_MASK(32) &&
+	    dev_is_pci(dev) && !pci_is_pcie(to_pci_dev(dev)))
 		iova = alloc_iova_fast(iovad, iova_len,
 				       DMA_BIT_MASK(32) >> shift, false);
 
-- 
2.23.0.dirty

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH 1/2] iommu/intel: Avoid SAC address trick for PCIe devices
  2020-07-08 11:32 [PATCH 1/2] iommu/intel: Avoid SAC address trick for PCIe devices Robin Murphy
  2020-07-08 11:32 ` [PATCH 2/2] iommu/dma: " Robin Murphy
@ 2020-07-08 12:26 ` Christoph Hellwig
  2020-07-10 14:20 ` Joerg Roedel
  2 siblings, 0 replies; 7+ messages in thread
From: Christoph Hellwig @ 2020-07-08 12:26 UTC (permalink / raw)
  To: Robin Murphy
  Cc: linux-pci, linux-kernel, iommu, jonathan.lemon, dwmw2, hch,
	linux-arm-kernel

Looks pretty sensible.

> -	if (!dmar_forcedac && dma_mask > DMA_BIT_MASK(32)) {
> +	if (!dmar_forcedac && dma_mask > DMA_BIT_MASK(32) &&
> +	    dev_is_pci(dev) && !pci_is_pcie(to_pci_dev(dev))) {

The only thing I wonder is if it is worth to add a little helper
for this check, but with everything moving to dma-iommu that might
not be needed.

Btw, does anyone know what the status of the intel-iommu conversion
to dma-iommu is?
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 1/2] iommu/intel: Avoid SAC address trick for PCIe devices
  2020-07-08 11:32 [PATCH 1/2] iommu/intel: Avoid SAC address trick for PCIe devices Robin Murphy
  2020-07-08 11:32 ` [PATCH 2/2] iommu/dma: " Robin Murphy
  2020-07-08 12:26 ` [PATCH 1/2] iommu/intel: " Christoph Hellwig
@ 2020-07-10 14:20 ` Joerg Roedel
  2 siblings, 0 replies; 7+ messages in thread
From: Joerg Roedel @ 2020-07-10 14:20 UTC (permalink / raw)
  To: Robin Murphy
  Cc: linux-pci, linux-kernel, iommu, jonathan.lemon, dwmw2, hch,
	linux-arm-kernel

On Wed, Jul 08, 2020 at 12:32:41PM +0100, Robin Murphy wrote:
> For devices stuck behind a conventional PCI bus, saving extra cycles at
> 33MHz is probably fairly significant. However since native PCI Express
> is now the norm for high-performance devices, the optimisation to always
> prefer 32-bit addresses for the sake of avoiding DAC is starting to look
> rather anachronistic. Technically 32-bit addresses do have shorter TLPs
> on PCIe, but unless the device is saturating its link bandwidth with
> small transfers it seems unlikely that the difference is appreciable.
> 
> What definitely is appreciable, however, is that the IOVA allocator
> doesn't behave all that well once the 32-bit space starts getting full.
> As DMA working sets get bigger, this optimisation increasingly backfires
> and adds considerable overhead to the dma_map path for use-cases like
> high-bandwidth networking.
> 
> As such, let's simply take it out of consideration for PCIe devices.
> Technically this might work out suboptimal for a PCIe device stuck
> behind a conventional PCI bridge, or for PCI-X devices that also have
> native 64-bit addressing, but neither of those are likely to be found
> in performance-critical parts of modern systems.
> 
> Signed-off-by: Robin Murphy <robin.murphy@arm.com>
> ---
>  drivers/iommu/intel/iommu.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)

Applied both, thanks.

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 2/2] iommu/dma: Avoid SAC address trick for PCIe devices
  2020-07-08 11:32 ` [PATCH 2/2] iommu/dma: " Robin Murphy
@ 2020-07-13 13:14   ` Joerg Roedel
  2020-07-14 11:42     ` Robin Murphy
  0 siblings, 1 reply; 7+ messages in thread
From: Joerg Roedel @ 2020-07-13 13:14 UTC (permalink / raw)
  To: Robin Murphy
  Cc: linux-pci, linux-kernel, iommu, jonathan.lemon, dwmw2, hch,
	linux-arm-kernel

On Wed, Jul 08, 2020 at 12:32:42PM +0100, Robin Murphy wrote:
> As for the intel-iommu implementation, relegate the opportunistic
> attempt to allocate a SAC address to the domain of conventional PCI
> devices only, to avoid it increasingly causing far more performance
> issues than possible benefits on modern PCI Express systems.
> 
> Signed-off-by: Robin Murphy <robin.murphy@arm.com>
> ---
>  drivers/iommu/dma-iommu.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> index 4959f5df21bd..0ff124f16ad4 100644
> --- a/drivers/iommu/dma-iommu.c
> +++ b/drivers/iommu/dma-iommu.c
> @@ -426,7 +426,8 @@ static dma_addr_t iommu_dma_alloc_iova(struct iommu_domain *domain,
>  		dma_limit = min(dma_limit, (u64)domain->geometry.aperture_end);
>  
>  	/* Try to get PCI devices a SAC address */
> -	if (dma_limit > DMA_BIT_MASK(32) && dev_is_pci(dev))
> +	if (dma_limit > DMA_BIT_MASK(32) &&
> +	    dev_is_pci(dev) && !pci_is_pcie(to_pci_dev(dev)))
>  		iova = alloc_iova_fast(iovad, iova_len,
>  				       DMA_BIT_MASK(32) >> shift, false);
>  

Unfortunatly this patch causes XHCI initialization failures on my AMD
Ryzen system. I will remove both from the IOMMU tree for now.

I guess the XHCI chip in my system does not support full 64bit dma
addresses and needs a quirk or something like that. But until this is
resolved its better to not change the IOVA allocation behavior.

Regards,

	Joerg
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 2/2] iommu/dma: Avoid SAC address trick for PCIe devices
  2020-07-13 13:14   ` Joerg Roedel
@ 2020-07-14 11:42     ` Robin Murphy
  2020-07-22 12:48       ` Joerg Roedel
  0 siblings, 1 reply; 7+ messages in thread
From: Robin Murphy @ 2020-07-14 11:42 UTC (permalink / raw)
  To: Joerg Roedel
  Cc: linux-pci, linux-kernel, iommu, jonathan.lemon, dwmw2, hch,
	linux-arm-kernel

On 2020-07-13 14:14, Joerg Roedel wrote:
> On Wed, Jul 08, 2020 at 12:32:42PM +0100, Robin Murphy wrote:
>> As for the intel-iommu implementation, relegate the opportunistic
>> attempt to allocate a SAC address to the domain of conventional PCI
>> devices only, to avoid it increasingly causing far more performance
>> issues than possible benefits on modern PCI Express systems.
>>
>> Signed-off-by: Robin Murphy <robin.murphy@arm.com>
>> ---
>>   drivers/iommu/dma-iommu.c | 3 ++-
>>   1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
>> index 4959f5df21bd..0ff124f16ad4 100644
>> --- a/drivers/iommu/dma-iommu.c
>> +++ b/drivers/iommu/dma-iommu.c
>> @@ -426,7 +426,8 @@ static dma_addr_t iommu_dma_alloc_iova(struct iommu_domain *domain,
>>   		dma_limit = min(dma_limit, (u64)domain->geometry.aperture_end);
>>   
>>   	/* Try to get PCI devices a SAC address */
>> -	if (dma_limit > DMA_BIT_MASK(32) && dev_is_pci(dev))
>> +	if (dma_limit > DMA_BIT_MASK(32) &&
>> +	    dev_is_pci(dev) && !pci_is_pcie(to_pci_dev(dev)))
>>   		iova = alloc_iova_fast(iovad, iova_len,
>>   				       DMA_BIT_MASK(32) >> shift, false);
>>   
> 
> Unfortunatly this patch causes XHCI initialization failures on my AMD
> Ryzen system. I will remove both from the IOMMU tree for now.
> 
> I guess the XHCI chip in my system does not support full 64bit dma
> addresses and needs a quirk or something like that. But until this is
> resolved its better to not change the IOVA allocation behavior.

Oh bother - yes, this could have been masking all manner of bugs. That 
system will presumably also break if you managed to exhaust the 32-bit 
IOVA space such that the allocator moved up to the higher range anyway, 
or if you passed the XHCI through to a VM with a sufficiently wacky GPA 
layout, but I guess those are cases that simply nobody's run into yet.

Does the firmware actually report any upper address constraint such that 
Sebastian's IVRS aperture patches might help?

Robin.
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 2/2] iommu/dma: Avoid SAC address trick for PCIe devices
  2020-07-14 11:42     ` Robin Murphy
@ 2020-07-22 12:48       ` Joerg Roedel
  0 siblings, 0 replies; 7+ messages in thread
From: Joerg Roedel @ 2020-07-22 12:48 UTC (permalink / raw)
  To: Robin Murphy
  Cc: linux-pci, linux-kernel, iommu, jonathan.lemon, dwmw2, hch,
	linux-arm-kernel

On Tue, Jul 14, 2020 at 12:42:36PM +0100, Robin Murphy wrote:
> Oh bother - yes, this could have been masking all manner of bugs. That
> system will presumably also break if you managed to exhaust the 32-bit IOVA
> space such that the allocator moved up to the higher range anyway, or if you
> passed the XHCI through to a VM with a sufficiently wacky GPA layout, but I
> guess those are cases that simply nobody's run into yet.
> 
> Does the firmware actually report any upper address constraint such that
> Sebastian's IVRS aperture patches might help?

No, it doesn't. I am not sure what the best way is to get these issues
found and fixed. I doubt they will be getting fixed when the allocation
pattern isn't changed, maybe we can put your changes behind a config
variable and start testing/reporting bugs/etc.

Regards,

	Joerg
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2020-07-22 12:49 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-07-08 11:32 [PATCH 1/2] iommu/intel: Avoid SAC address trick for PCIe devices Robin Murphy
2020-07-08 11:32 ` [PATCH 2/2] iommu/dma: " Robin Murphy
2020-07-13 13:14   ` Joerg Roedel
2020-07-14 11:42     ` Robin Murphy
2020-07-22 12:48       ` Joerg Roedel
2020-07-08 12:26 ` [PATCH 1/2] iommu/intel: " Christoph Hellwig
2020-07-10 14:20 ` Joerg Roedel

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).