linux-kernel.vger.kernel.org archive mirror
* [PATCH v2 1/2] arm64: dts: ls1028a: declare cache-coherent page table walk feature for IOMMU
@ 2022-12-15 13:56 Vladimir Oltean
  2022-12-15 13:56 ` [PATCH v2 2/2] arm64: dts: ls1088a: " Vladimir Oltean
  2022-12-31 13:34 ` [PATCH v2 1/2] arm64: dts: ls1028a: " Shawn Guo
  0 siblings, 2 replies; 3+ messages in thread
From: Vladimir Oltean @ 2022-12-15 13:56 UTC
  To: devicetree, iommu
  Cc: Laurentiu Tudor, Will Deacon, Robin Murphy, linux-arm-kernel,
	Shawn Guo, Li Yang, Rob Herring, Krzysztof Kozlowski,
	linux-kernel, Michael Walle

The SMMUv2 driver for MMU-500 reads the ARM_SMMU_GR0_ID0 register at
probe time and tries to determine based on the CTTW (Coherent
Translation Table Walk) bit whether this feature is supported.

Unfortunately, it looks like the SMMU integration in the NXP LS1028A has
wrongly tied the cfg_cttw signal to 0, even though the SoC documentation
specifies that "The SMMU supports cache coherency for page table walks
and DVM transactions for page table cache maintenance operations."

Device tree provides the option of overriding the ID register via the
dma-coherent property since commit bae2c2d421cd ("iommu/arm-smmu: Sort
out coherency"), and that's what we do here.
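
To illustrate the override described above, here is a minimal userspace
sketch of the decision, not the driver's actual code. It assumes that
the firmware description (dma-coherent present or absent) is what ends
up being used, and that the ID register is only compared against it.
The names model_probe(), struct model_coherency and MODEL_ID0_CTTW are
hypothetical, and the bit position is an assumption made for the
example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed position of the CTTW bit in ID0; used for illustration only. */
#define MODEL_ID0_CTTW (UINT32_C(1) << 14)

struct model_coherency {
	bool coherent_walk;	/* what the page table code would be told */
	bool overridden;	/* firmware and ID register disagree */
};

/*
 * Hypothetical model, not driver code: the firmware description
 * (dma-coherent present or absent) decides, and the ID register is
 * only compared against it to detect a mismatch.
 */
static struct model_coherency model_probe(uint32_t id0, bool fw_dma_coherent)
{
	bool cttw_reg = (id0 & MODEL_ID0_CTTW) != 0;
	struct model_coherency c = {
		.coherent_walk = fw_dma_coherent,
		.overridden = (fw_dma_coherent != cttw_reg),
	};

	return c;
}

/* LS1028A-like case: CTTW reads as 0, device tree says dma-coherent. */
static void model_example(void)
{
	struct model_coherency c = model_probe(0, true);

	assert(c.coherent_walk && c.overridden);
}
```

In this model, a register value of 0 combined with dma-coherent yields a
coherent walk flagged as an override, which is the situation this patch
creates on purpose.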

Telling struct io_pgtable_cfg that the SMMU page table walks are
coherent with the CPU caches brings performance benefits, because it
avoids certain operations such as __arm_lpae_sync_pte() for PTE updates.
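
The saving can be sketched with a small userspace model, under stated
assumptions: struct model_pgtable_cfg stands in for struct
io_pgtable_cfg, its coherent_walk member is assumed to mirror the flag
the SMMU driver passes down, and model_sync_pte() merely counts calls
where the real __arm_lpae_sync_pte() would do per-PTE synchronization
work. All model_* names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the relevant part of struct io_pgtable_cfg. */
struct model_pgtable_cfg {
	bool coherent_walk;		/* table walks snoop the CPU caches */
	unsigned long sync_calls;	/* model-only counter of sync operations */
};

/* Stand-in for __arm_lpae_sync_pte(): only counts how often it runs. */
static void model_sync_pte(struct model_pgtable_cfg *cfg)
{
	cfg->sync_calls++;
}

/* Write one PTE; the sync step is skipped when walks are coherent. */
static void model_set_pte(uint64_t *ptep, uint64_t pte,
			  struct model_pgtable_cfg *cfg)
{
	*ptep = pte;
	if (!cfg->coherent_walk)
		model_sync_pte(cfg);
}
```

With coherent_walk set, every PTE update in this model skips the sync
step entirely; with it clear, each update pays for one.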

Link: https://lore.kernel.org/linux-iommu/3f3112e4-65ff-105d-8cd7-60495ec9054a@arm.com/
Suggested-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
v1->v2: reword commit message, drop Fixes: tag

vfio's problem with arm_smmu_capable(IOMMU_CAP_CACHE_COHERENCY) should
be resolved independently; I'm not claiming that this is the only fix
for that.

v1 at:
https://lore.kernel.org/linux-iommu/20221208151514.3840720-1-vladimir.oltean@nxp.com/

 arch/arm64/boot/dts/freescale/fsl-ls1028a.dtsi | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm64/boot/dts/freescale/fsl-ls1028a.dtsi b/arch/arm64/boot/dts/freescale/fsl-ls1028a.dtsi
index ac1c3a7e5f7a..9be0b4b7babf 100644
--- a/arch/arm64/boot/dts/freescale/fsl-ls1028a.dtsi
+++ b/arch/arm64/boot/dts/freescale/fsl-ls1028a.dtsi
@@ -712,6 +712,7 @@ smmu: iommu@5000000 {
 			reg = <0 0x5000000 0 0x800000>;
 			#global-interrupts = <8>;
 			#iommu-cells = <1>;
+			dma-coherent;
 			stream-match-mask = <0x7c00>;
 			/* global secure fault */
 			interrupts = <GIC_SPI 13 IRQ_TYPE_LEVEL_HIGH>,
-- 
2.34.1


* [PATCH v2 2/2] arm64: dts: ls1088a: declare cache-coherent page table walk feature for IOMMU
  2022-12-15 13:56 [PATCH v2 1/2] arm64: dts: ls1028a: declare cache-coherent page table walk feature for IOMMU Vladimir Oltean
@ 2022-12-15 13:56 ` Vladimir Oltean
  2022-12-31 13:34 ` [PATCH v2 1/2] arm64: dts: ls1028a: " Shawn Guo
  1 sibling, 0 replies; 3+ messages in thread
From: Vladimir Oltean @ 2022-12-15 13:56 UTC
  To: devicetree, iommu
  Cc: Laurentiu Tudor, Will Deacon, Robin Murphy, linux-arm-kernel,
	Shawn Guo, Li Yang, Rob Herring, Krzysztof Kozlowski,
	linux-kernel, Michael Walle

The SMMUv2 driver for MMU-500 reads the ARM_SMMU_GR0_ID0 register at
probe time and tries to determine based on the CTTW (Coherent
Translation Table Walk) bit whether this feature is supported.

Unfortunately, it looks like the SMMU integration in the NXP LS1088A has
wrongly tied the cfg_cttw signal to 0, even though the SoC documentation
specifies that "The SMMU supports cache coherency for page table walks
and DVM transactions for page table cache maintenance operations."

Device tree provides the option of overriding the ID register via the
dma-coherent property since commit bae2c2d421cd ("iommu/arm-smmu: Sort
out coherency"), and that's what we do here.

Telling struct io_pgtable_cfg that the SMMU page table walks are
coherent with the CPU caches brings performance benefits, because it
avoids certain operations such as __arm_lpae_sync_pte() for PTE updates.

Link: https://lore.kernel.org/linux-iommu/3f3112e4-65ff-105d-8cd7-60495ec9054a@arm.com/
Suggested-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
v1->v2: patch is new

 arch/arm64/boot/dts/freescale/fsl-ls1088a.dtsi | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm64/boot/dts/freescale/fsl-ls1088a.dtsi b/arch/arm64/boot/dts/freescale/fsl-ls1088a.dtsi
index 260d045dbd9a..e5fb137ac02b 100644
--- a/arch/arm64/boot/dts/freescale/fsl-ls1088a.dtsi
+++ b/arch/arm64/boot/dts/freescale/fsl-ls1088a.dtsi
@@ -674,6 +674,7 @@ smmu: iommu@5000000 {
 			reg = <0 0x5000000 0 0x800000>;
 			#iommu-cells = <1>;
 			stream-match-mask = <0x7C00>;
+			dma-coherent;
 			#global-interrupts = <12>;
 				     // global secure fault
 			interrupts = <GIC_SPI 13 IRQ_TYPE_LEVEL_HIGH>,
-- 
2.34.1


* Re: [PATCH v2 1/2] arm64: dts: ls1028a: declare cache-coherent page table walk feature for IOMMU
  2022-12-15 13:56 [PATCH v2 1/2] arm64: dts: ls1028a: declare cache-coherent page table walk feature for IOMMU Vladimir Oltean
  2022-12-15 13:56 ` [PATCH v2 2/2] arm64: dts: ls1088a: " Vladimir Oltean
@ 2022-12-31 13:34 ` Shawn Guo
  1 sibling, 0 replies; 3+ messages in thread
From: Shawn Guo @ 2022-12-31 13:34 UTC
  To: Vladimir Oltean
  Cc: devicetree, iommu, Laurentiu Tudor, Will Deacon, Robin Murphy,
	linux-arm-kernel, Li Yang, Rob Herring, Krzysztof Kozlowski,
	linux-kernel, Michael Walle

On Thu, Dec 15, 2022 at 03:56:35PM +0200, Vladimir Oltean wrote:
> The SMMUv2 driver for MMU-500 reads the ARM_SMMU_GR0_ID0 register at
> probe time and tries to determine based on the CTTW (Coherent
> Translation Table Walk) bit whether this feature is supported.
> 
> Unfortunately, it looks like the SMMU integration in the NXP LS1028A has
> wrongly tied the cfg_cttw signal to 0, even though the SoC documentation
> specifies that "The SMMU supports cache coherency for page table walks
> and DVM transactions for page table cache maintenance operations."
> 
> Device tree provides the option of overriding the ID register via the
> dma-coherent property since commit bae2c2d421cd ("iommu/arm-smmu: Sort
> out coherency"), and that's what we do here.
> 
> Telling struct io_pgtable_cfg that the SMMU page table walks are
> coherent with the CPU caches brings performance benefits, because it
> avoids certain operations such as __arm_lpae_sync_pte() for PTE updates.
> 
> Link: https://lore.kernel.org/linux-iommu/3f3112e4-65ff-105d-8cd7-60495ec9054a@arm.com/
> Suggested-by: Robin Murphy <robin.murphy@arm.com>
> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>

Applied both, thanks!
