* [PATCH v8 0/5] iommu/smmu-v3: Workaround for hisilicon 161010801 erratum(reserve HW MSI)
@ 2017-09-27 13:32 ` Shameer Kolothum
  0 siblings, 0 replies; 59+ messages in thread
From: Shameer Kolothum @ 2017-09-27 13:32 UTC (permalink / raw)
  To: lorenzo.pieralisi, marc.zyngier, sudeep.holla, will.deacon,
	robin.murphy, joro, mark.rutland, robh
  Cc: gabriele.paoloni, john.garry, iommu, linux-arm-kernel,
	linux-acpi, devicetree, devel, linuxarm, wangzhou1, guohanjun,
	Shameer Kolothum

On certain HiSilicon platforms (hip06/hip07) the GIC ITS and PCIe RC
deviate from the standard implementation and this breaks PCIe MSI
functionality when the SMMU is enabled.

HiSilicon erratum 161010801 describes the inability of certain
HiSilicon platforms to support SMMU mappings for MSI transactions.
On these platforms the GICv3 ITS translator is presented with the deviceID
by extending the MSI payload data to 64 bits to include the deviceID.
Hence, the PCIe controller on these platforms has to differentiate the MSI
payload from other DMA payloads and has to modify the MSI payload.
This makes it difficult for these platforms to have an SMMU
translation for MSI.
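
As an illustration of the payload extension described above, here is a
minimal userspace C sketch. The bit layout (deviceID in the upper 32 bits,
original MSI data in the lower 32 bits) and all function names are
assumptions made purely for illustration; the erratum description quoted
here does not spell out the actual layout.

```c
#include <stdint.h>

/*
 * Hypothetical layout, assumed for this sketch only:
 * deviceID in bits [63:32], original MSI data in bits [31:0].
 */
static uint64_t sketch_extend_msi_payload(uint32_t msi_data, uint32_t device_id)
{
	return ((uint64_t)device_id << 32) | msi_data;
}

/* Recover the deviceID from an extended payload (same assumed layout). */
static uint32_t sketch_payload_device_id(uint64_t payload)
{
	return (uint32_t)(payload >> 32);
}

/* Recover the original 32-bit MSI data from an extended payload. */
static uint32_t sketch_payload_msi_data(uint64_t payload)
{
	return (uint32_t)(payload & 0xffffffffu);
}
```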

This series implements an ACPI and DT based quirk to reserve the HW MSI
regions in the smmu-v3 driver, which means these address regions will
not be translated and will be excluded from IOVA allocations.

To implement this quirk, the following changes are incorporated:
1. Added a generic helper function to IORT code to retrieve the
   associated ITS base address from a device IORT node.
2. Added a generic helper function to the of_iommu code to retrieve the
   associated MSI controller base address for a PCI RC msi-mapping
   and also for a platform device msi-parent.
3. Added quirk to SMMUv3 to retrieve the HW ITS address and replace
   the default SW MSI reserve address based on the IORT SMMU model
   or DT bindings.
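
The effect of reserving the HW MSI regions can be sketched with a small
userspace model. This is not the kernel's IOVA allocator: the struct, the
first-fit search and the function names below are hypothetical and only
show what "excluded from IOVA allocations" means in practice.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a reserved region; not the kernel's struct. */
struct sketch_resv_region {
	uint64_t start;
	uint64_t length;
};

/* True if [start, start + len) overlaps the reserved region. */
static bool sketch_overlaps(const struct sketch_resv_region *r,
			    uint64_t start, uint64_t len)
{
	return start < r->start + r->length && r->start < start + len;
}

/*
 * First-fit search for 'len' bytes of IOVA space in [lo, hi) that avoids
 * every reserved region. Returns true and stores the address in *iova on
 * success, false if no suitable range exists.
 */
static bool sketch_alloc_iova(const struct sketch_resv_region *resv,
			      size_t nr_resv, uint64_t lo, uint64_t hi,
			      uint64_t len, uint64_t *iova)
{
	uint64_t candidate = lo;

	while (len <= hi && candidate <= hi - len) {
		bool clash = false;
		size_t i;

		for (i = 0; i < nr_resv; i++) {
			if (sketch_overlaps(&resv[i], candidate, len)) {
				/* Skip past the reserved window and retry. */
				candidate = resv[i].start + resv[i].length;
				clash = true;
				break;
			}
		}
		if (!clash) {
			*iova = candidate;
			return true;
		}
	}
	return false;
}
```

With a 128K window reserved at a (made-up) ITS base, a request that would
have overlapped the window is placed immediately after it instead, so a
device's DMA mappings never alias the doorbell address.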

Changelog:

v7 --> v8
Addressed comments from Rob and Lorenzo:
 -Modified to use DT compatible string for errata.
 -Changed logic to retrieve the msi-parent for DT case.

v6 --> v7
Addressed request from Will to add DT support for the erratum:
 - added DT binding
 - add of_iommu_msi_get_resv_regions()
New arm64 silicon errata entry
Rename iort_iommu_{its->msi}_get_resv_regions

v5 --> v6
Addressed comments from Robin and Lorenzo:
-No change to patch#1 .
-Reverted v5 patch#2 as this might break the platforms where this quirk
  is not applicable. Provided a generic function in iommu code and added
  back the quirk implementation in SMMU v3 driver(patch#3)
 
v4 --> v5
Addressed comments from Robin and Lorenzo:
-Added a comment to make it clear that, for now, only straightforward 
  HW topologies are handled while reserving ITS regions(patch #1).

v3 --> v4
Rebased on 4.13-rc1.
Addressed comments from Robin, Will and Lorenzo:
-As suggested by Robin, moved the ITS msi reservation into 
  iommu_dma_get_resv_regions().
-Added its_count != resv region failure case(patch #1).

v2 --> v3
Addressed comments from Lorenzo and Robin:
-Removed dev_is_pci() check in smmuV3 driver.
-Don't treat device not having an ITS mapping as an error in
  iort helper function.

v1 --> v2
-patch 2/2: Invoke iort helper fn based on fwnode type(acpi).

RFCv2 -->PATCH
-Incorporated Lorenzo's review comments.

RFC v1 --> RFC v2
Based on Robin's review comments,
-Removed the generic erratum framework.
-Using IORT/MADT tables to retrieve the ITS base addr instead of vendor specific CSRT table.

John Garry (2):
  Doc: iommu/arm-smmu-v3: Add workaround for HiSilicon erratum 161010801
  iommu/of: Add msi address regions reservation helper

Shameer Kolothum (3):
  ACPI/IORT: Add msi address regions reservation helper
  iommu/dma: Add a helper function to reserve HW MSI address regions for
    IOMMU drivers
  iommu/arm-smmu-v3:Enable ACPI based HiSilicon erratum 161010801

 Documentation/arm64/silicon-errata.txt             |  1 +
 .../devicetree/bindings/iommu/arm,smmu-v3.txt      |  9 +-
 drivers/acpi/arm64/iort.c                          | 96 +++++++++++++++++++++-
 drivers/iommu/arm-smmu-v3.c                        | 41 +++++++--
 drivers/iommu/dma-iommu.c                          | 19 +++++
 drivers/iommu/of_iommu.c                           | 95 +++++++++++++++++++++
 drivers/irqchip/irq-gic-v3-its.c                   |  3 +-
 include/linux/acpi_iort.h                          |  7 +-
 include/linux/dma-iommu.h                          |  7 ++
 include/linux/of_iommu.h                           | 10 +++
 10 files changed, 276 insertions(+), 12 deletions(-)

-- 
1.9.1



* [PATCH v8 1/5] Doc: iommu/arm-smmu-v3: Add workaround for HiSilicon erratum 161010801
  2017-09-27 13:32 ` Shameer Kolothum
  (?)
@ 2017-09-27 13:32   ` Shameer Kolothum
  -1 siblings, 0 replies; 59+ messages in thread
From: Shameer Kolothum @ 2017-09-27 13:32 UTC (permalink / raw)
  To: lorenzo.pieralisi, marc.zyngier, sudeep.holla, will.deacon,
	robin.murphy, joro, mark.rutland, robh
  Cc: gabriele.paoloni, john.garry, iommu, linux-arm-kernel,
	linux-acpi, devicetree, devel, linuxarm, wangzhou1, guohanjun,
	Shameer Kolothum

From: John Garry <john.garry@huawei.com>

HiSilicon erratum 161010801 describes the inability of HiSilicon
platforms hip06/hip07 to support SMMU mappings for MSI transactions.

On these platforms, the GICv3 ITS translator is presented with the deviceID
by extending the MSI payload data to 64 bits to include the deviceID.
Hence, the PCIe controller on these platforms has to differentiate the MSI
payload from other DMA payloads and has to modify the MSI payload.
This makes it difficult for these platforms to have an SMMU
translation for MSI.

This patch adds a compatible string to implement this erratum for the
HiSilicon Hi161x SMMUv3 model on hip06/hip07 platforms.

Also, the arm64 silicon errata list is updated with this same erratum.
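
As a rough illustration of how a compatible string can select the
workaround, the userspace sketch below walks a node's compatible list and
returns per-implementation options. The table, the option flag and the
function name are hypothetical and are not taken from the SMMUv3 driver;
only the two compatible strings come from the binding in this patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical option flag for this sketch; not the driver's definition. */
#define SKETCH_OPT_RESV_HW_MSI	(1u << 0)

struct sketch_compat_quirk {
	const char *compatible;
	unsigned int options;
};

static const struct sketch_compat_quirk sketch_quirks[] = {
	{ "hisilicon,hi161x-smmu-v3", SKETCH_OPT_RESV_HW_MSI },
	{ "arm,smmu-v3", 0 },
};

/*
 * Walk the node's compatible list in order (most specific entry first)
 * and return the options of the first entry found in the quirk table.
 * Returns false if no entry matches.
 */
static bool sketch_match_compatible(const char *const *compat, size_t n,
				    unsigned int *options)
{
	size_t i, j;

	for (i = 0; i < n; i++) {
		for (j = 0; j < sizeof(sketch_quirks) / sizeof(sketch_quirks[0]); j++) {
			if (strcmp(compat[i], sketch_quirks[j].compatible) == 0) {
				*options = sketch_quirks[j].options;
				return true;
			}
		}
	}
	return false;
}
```

Because the list is walked in order, a node that lists the HiSilicon string
ahead of the generic "arm,smmu-v3" fallback gets the quirk, while a node
with only the generic string gets no options set.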

Signed-off-by: John Garry <john.garry@huawei.com>
[Shameer: Modified to use compatible string for errata]
Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
---
 Documentation/arm64/silicon-errata.txt                  | 1 +
 Documentation/devicetree/bindings/iommu/arm,smmu-v3.txt | 9 ++++++++-
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/Documentation/arm64/silicon-errata.txt b/Documentation/arm64/silicon-errata.txt
index 66e8ce1..02816b1 100644
--- a/Documentation/arm64/silicon-errata.txt
+++ b/Documentation/arm64/silicon-errata.txt
@@ -70,6 +70,7 @@ stable kernels.
 |                |                 |                 |                             |
 | Hisilicon      | Hip0{5,6,7}     | #161010101      | HISILICON_ERRATUM_161010101 |
 | Hisilicon      | Hip0{6,7}       | #161010701      | N/A                         |
+| Hisilicon      | Hip0{6,7}       | #161010801      | N/A                         |
 |                |                 |                 |                             |
 | Qualcomm Tech. | Falkor v1       | E1003           | QCOM_FALKOR_ERRATUM_1003    |
 | Qualcomm Tech. | Falkor v1       | E1009           | QCOM_FALKOR_ERRATUM_1009    |
diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu-v3.txt b/Documentation/devicetree/bindings/iommu/arm,smmu-v3.txt
index c9abbf3..3b0d599 100644
--- a/Documentation/devicetree/bindings/iommu/arm,smmu-v3.txt
+++ b/Documentation/devicetree/bindings/iommu/arm,smmu-v3.txt
@@ -7,11 +7,18 @@ the PCIe specification.
 
 ** SMMUv3 required properties:
 
-- compatible        : Should include:
+- compatible        : Should be one of:
+
+                      "arm,smmu-v3"
+                      "hisilicon,hi161x-smmu-v3"
+
+                      depending on the particular implementation.
 
                       * "arm,smmu-v3" for any SMMUv3 compliant
                         implementation. This entry should be last in the
                         compatible list.
+                      * "hisilicon,hi161x-smmu-v3" for HiSilicon hi161x
+                         SMMUv3 implementation on hip06/hip07 platforms.
 
 - reg               : Base address and size of the SMMU.
 
-- 
1.9.1



* [PATCH v8 2/5] ACPI/IORT: Add msi address regions reservation helper
  2017-09-27 13:32 ` Shameer Kolothum
  (?)
@ 2017-09-27 13:32     ` Shameer Kolothum
  -1 siblings, 0 replies; 59+ messages in thread
From: Shameer Kolothum @ 2017-09-27 13:32 UTC (permalink / raw)
  To: lorenzo.pieralisi, marc.zyngier, sudeep.holla, will.deacon,
	robin.murphy, joro, mark.rutland, robh
  Cc: gabriele.paoloni, john.garry, iommu, linux-arm-kernel,
	linux-acpi, devicetree, devel, linuxarm, wangzhou1, guohanjun,
	Shameer Kolothum

On some platforms msi parent address regions have to be excluded from
normal IOVA allocation in that they are detected and decoded in a HW
specific way by system components and so they cannot be considered normal
IOVA address space.

Add a helper function that retrieves ITS address regions - the msi
parent - through IORT device <-> ITS mappings and reserves them so that
these regions will not be translated by the IOMMU and will be excluded
from IOVA allocations.
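
The lookup-and-reserve flow can be modelled in userspace as below. This is
a simplified sketch, not the kernel code: the table of registered ITS
entries, the struct names and the -1 error value are hypothetical, while
the 128K window size and the "all ITS IDs must resolve" rule mirror the
helper added by this patch.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_ITS_WINDOW	0x20000u /* 128 KiB, mirroring SZ_128K */

/* Hypothetical registry entry: ITS ID and its physical base address. */
struct sketch_its_entry {
	uint32_t id;
	uint64_t base;
};

/* Hypothetical reserved window produced by the sketch. */
struct sketch_resv_window {
	uint64_t start;
	uint64_t length;
};

/* Look up the base address registered for an ITS ID; 0 on success. */
static int sketch_find_its_base(const struct sketch_its_entry *tbl, size_t n,
				uint32_t id, uint64_t *base)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (tbl[i].id == id) {
			*base = tbl[i].base;
			return 0;
		}
	}
	return -1;
}

/*
 * Reserve one window per ITS ID in 'ids'; 'out' must have room for
 * nr_ids entries. Returns the number of windows reserved if every ID
 * resolved to a base address, -1 otherwise.
 */
static int sketch_reserve_its_windows(const struct sketch_its_entry *tbl,
				      size_t n, const uint32_t *ids,
				      size_t nr_ids,
				      struct sketch_resv_window *out)
{
	int resv = 0;
	size_t i;

	for (i = 0; i < nr_ids; i++) {
		uint64_t base;

		if (sketch_find_its_base(tbl, n, ids[i], &base) == 0) {
			out[resv].start = base;
			out[resv].length = SKETCH_ITS_WINDOW;
			resv++;
		}
	}
	return ((size_t)resv == nr_ids) ? resv : -1;
}
```

Treating a partially resolved ITS group as an error is the conservative
choice: a caller never proceeds believing all MSI doorbells are protected
when one of them could not be found.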

Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
[lorenzo.pieralisi@arm.com: updated commit log/added comments]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
---
 drivers/acpi/arm64/iort.c        | 96 ++++++++++++++++++++++++++++++++++++++--
 drivers/irqchip/irq-gic-v3-its.c |  3 +-
 include/linux/acpi_iort.h        |  7 ++-
 3 files changed, 101 insertions(+), 5 deletions(-)

diff --git a/drivers/acpi/arm64/iort.c b/drivers/acpi/arm64/iort.c
index 9565d57..14efa9d 100644
--- a/drivers/acpi/arm64/iort.c
+++ b/drivers/acpi/arm64/iort.c
@@ -39,6 +39,7 @@
 struct iort_its_msi_chip {
 	struct list_head	list;
 	struct fwnode_handle	*fw_node;
+	phys_addr_t		base_addr;
 	u32			translation_id;
 };
 
@@ -136,14 +137,16 @@ typedef acpi_status (*iort_find_node_callback)
 static DEFINE_SPINLOCK(iort_msi_chip_lock);
 
 /**
- * iort_register_domain_token() - register domain token and related ITS ID
- * to the list from where we can get it back later on.
+ * iort_register_domain_token() - register domain token along with related
+ * ITS ID and base address to the list from where we can get it back later on.
  * @trans_id: ITS ID.
+ * @base: ITS base address.
  * @fw_node: Domain token.
  *
  * Returns: 0 on success, -ENOMEM if no memory when allocating list element
  */
-int iort_register_domain_token(int trans_id, struct fwnode_handle *fw_node)
+int iort_register_domain_token(int trans_id, phys_addr_t base,
+			       struct fwnode_handle *fw_node)
 {
 	struct iort_its_msi_chip *its_msi_chip;
 
@@ -153,6 +156,7 @@ int iort_register_domain_token(int trans_id, struct fwnode_handle *fw_node)
 
 	its_msi_chip->fw_node = fw_node;
 	its_msi_chip->translation_id = trans_id;
+	its_msi_chip->base_addr = base;
 
 	spin_lock(&iort_msi_chip_lock);
 	list_add(&its_msi_chip->list, &iort_msi_chip_list);
@@ -481,6 +485,24 @@ int iort_pmsi_get_dev_id(struct device *dev, u32 *dev_id)
 	return -ENODEV;
 }
 
+static int __maybe_unused iort_find_its_base(u32 its_id, phys_addr_t *base)
+{
+	struct iort_its_msi_chip *its_msi_chip;
+	bool match = false;
+
+	spin_lock(&iort_msi_chip_lock);
+	list_for_each_entry(its_msi_chip, &iort_msi_chip_list, list) {
+		if (its_msi_chip->translation_id == its_id) {
+			*base = its_msi_chip->base_addr;
+			match = true;
+			break;
+		}
+	}
+	spin_unlock(&iort_msi_chip_lock);
+
+	return match ? 0 : -ENODEV;
+}
+
 /**
  * iort_dev_find_its_id() - Find the ITS identifier for a device
  * @dev: The device.
@@ -639,6 +661,72 @@ int iort_add_device_replay(const struct iommu_ops *ops, struct device *dev)
 
 	return err;
 }
+
+/**
+ * iort_iommu_msi_get_resv_regions - Reserved region driver helper
+ * @dev: Device from iommu_get_resv_regions()
+ * @head: Reserved region list from iommu_get_resv_regions()
+ *
+ * Returns: Number of reserved regions on success (0 if no associated msi
+ *          regions), appropriate error value otherwise. The ITS regions
+ *          associated with the device are the msi reserved regions.
+ */
+int iort_iommu_msi_get_resv_regions(struct device *dev, struct list_head *head)
+{
+	struct acpi_iort_its_group *its;
+	struct acpi_iort_node *node, *its_node = NULL;
+	int i, resv = 0;
+
+	node = iort_find_dev_node(dev);
+	if (!node)
+		return -ENODEV;
+
+	/*
+	 * Current logic to reserve ITS regions relies on HW topologies
+	 * where a given PCI or named component maps its IDs to only one
+	 * ITS group; if a PCI or named component can map its IDs to
+	 * different ITS groups through IORT mappings this function has
+	 * to be reworked to ensure we reserve regions for all ITS groups
+	 * a given PCI or named component may map IDs to.
+	 */
+	if (dev_is_pci(dev)) {
+		u32 rid;
+
+		pci_for_each_dma_alias(to_pci_dev(dev), __get_pci_rid, &rid);
+		its_node = iort_node_map_id(node, rid, NULL, IORT_MSI_TYPE);
+	} else {
+		for (i = 0; i < node->mapping_count; i++) {
+			its_node = iort_node_map_platform_id(node, NULL,
+							 IORT_MSI_TYPE, i);
+			if (its_node)
+				break;
+		}
+	}
+
+	if (!its_node)
+		return 0;
+
+	/* Move to ITS specific data */
+	its = (struct acpi_iort_its_group *)its_node->node_data;
+
+	for (i = 0; i < its->its_count; i++) {
+		phys_addr_t base;
+
+		if (!iort_find_its_base(its->identifiers[i], &base)) {
+			int prot = IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO;
+			struct iommu_resv_region *region;
+
+			region = iommu_alloc_resv_region(base, SZ_128K, prot,
+							 IOMMU_RESV_MSI);
+			if (region) {
+				list_add_tail(&region->list, head);
+				resv++;
+			}
+		}
+	}
+
+	return (resv == its->its_count) ? resv : -ENODEV;
+}
 #else
 static inline
 const struct iommu_ops *iort_fwspec_iommu_ops(struct iommu_fwspec *fwspec)
@@ -646,6 +734,8 @@ const struct iommu_ops *iort_fwspec_iommu_ops(struct iommu_fwspec *fwspec)
 static inline
 int iort_add_device_replay(const struct iommu_ops *ops, struct device *dev)
 { return 0; }
+int iort_iommu_msi_get_resv_regions(struct device *dev, struct list_head *head)
+{ return -ENODEV; }
 #endif
 
 static int iort_iommu_xlate(struct device *dev, struct acpi_iort_node *node,
diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
index e8d8934..19d1ff6 100644
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -3197,7 +3197,8 @@ static int __init gic_acpi_parse_madt_its(struct acpi_subtable_header *header,
 		return -ENOMEM;
 	}
 
-	err = iort_register_domain_token(its_entry->translation_id, dom_handle);
+	err = iort_register_domain_token(its_entry->translation_id, res.start,
+					 dom_handle);
 	if (err) {
 		pr_err("ITS@%pa: Unable to register GICv3 ITS domain token (ITS ID %d) to IORT\n",
 		       &res.start, its_entry->translation_id);
diff --git a/include/linux/acpi_iort.h b/include/linux/acpi_iort.h
index 8d3f0bf..182a577 100644
--- a/include/linux/acpi_iort.h
+++ b/include/linux/acpi_iort.h
@@ -26,7 +26,8 @@
 #define IORT_IRQ_MASK(irq)		(irq & 0xffffffffULL)
 #define IORT_IRQ_TRIGGER_MASK(irq)	((irq >> 32) & 0xffffffffULL)
 
-int iort_register_domain_token(int trans_id, struct fwnode_handle *fw_node);
+int iort_register_domain_token(int trans_id, phys_addr_t base,
+			       struct fwnode_handle *fw_node);
 void iort_deregister_domain_token(int trans_id);
 struct fwnode_handle *iort_find_domain_token(int trans_id);
 #ifdef CONFIG_ACPI_IORT
@@ -38,6 +39,7 @@
 /* IOMMU interface */
 void iort_dma_setup(struct device *dev, u64 *dma_addr, u64 *size);
 const struct iommu_ops *iort_iommu_configure(struct device *dev);
+int iort_iommu_msi_get_resv_regions(struct device *dev, struct list_head *head);
 #else
 static inline void acpi_iort_init(void) { }
 static inline u32 iort_msi_map_rid(struct device *dev, u32 req_id)
@@ -52,6 +54,9 @@ static inline void iort_dma_setup(struct device *dev, u64 *dma_addr,
 static inline
 const struct iommu_ops *iort_iommu_configure(struct device *dev)
 { return NULL; }
+static inline
+int iort_iommu_msi_get_resv_regions(struct device *dev, struct list_head *head)
+{ return -ENODEV; }
 #endif
 
 #endif /* __ACPI_IORT_H__ */
-- 
1.9.1
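
For readers following the ACPI path, the lookup-and-reserve logic that
patch 2/5 adds to drivers/acpi/arm64/iort.c (iort_find_its_base() feeding
iort_iommu_msi_get_resv_regions()) can be modelled in plain userspace C.
This is only a sketch under stated assumptions: the struct names, the
array-based registry and the output array are hypothetical stand-ins for
the kernel's list, spinlock and iommu_alloc_resv_region() machinery. What
it keeps from the patch is the 128K region size and the all-or-nothing
return value: the number of reserved regions if every ITS identifier in
the group resolves to a base address, -ENODEV otherwise.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Mirrors SZ_128K, the per-ITS region size used by the patch. */
#define SKETCH_ITS_REGION_SIZE (128u * 1024u)

/* Hypothetical stand-in for struct iort_its_msi_chip. */
struct its_chip {
	uint32_t translation_id;
	uint64_t base_addr;
};

/* Hypothetical stand-in for struct iommu_resv_region. */
struct resv_region {
	uint64_t start;
	uint64_t length;
};

/* Look up the base address registered for an ITS ID: 0 on match, -ENODEV otherwise. */
static int find_its_base(const struct its_chip *chips, size_t n_chips,
			 uint32_t its_id, uint64_t *base)
{
	for (size_t i = 0; i < n_chips; i++) {
		if (chips[i].translation_id == its_id) {
			*base = chips[i].base_addr;
			return 0;
		}
	}
	return -ENODEV;
}

/*
 * Reserve one region per ITS identifier in the group. Returns the number
 * of regions written to @out if every identifier resolved, -ENODEV if any
 * identifier had no registered base address.
 */
static int reserve_its_group(const struct its_chip *chips, size_t n_chips,
			     const uint32_t *ids, size_t n_ids,
			     struct resv_region *out)
{
	int resv = 0;

	assert(out != NULL);

	for (size_t i = 0; i < n_ids; i++) {
		uint64_t base;

		if (!find_its_base(chips, n_chips, ids[i], &base)) {
			out[resv].start = base;
			out[resv].length = SKETCH_ITS_REGION_SIZE;
			resv++;
		}
	}

	return ((size_t)resv == n_ids) ? resv : -ENODEV;
}
```

Note that, as in the patch, a partial match still leaves the regions that
did resolve in the output; the caller only learns from the error value
that the set is incomplete.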



^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH v8 3/5] iommu/of: Add msi address regions reservation helper
  2017-09-27 13:32 ` Shameer Kolothum
  (?)
@ 2017-09-27 13:32   ` Shameer Kolothum
  -1 siblings, 0 replies; 59+ messages in thread
From: Shameer Kolothum @ 2017-09-27 13:32 UTC (permalink / raw)
  To: lorenzo.pieralisi, marc.zyngier, sudeep.holla, will.deacon,
	robin.murphy, joro, mark.rutland, robh
  Cc: gabriele.paoloni, john.garry, iommu, linux-arm-kernel,
	linux-acpi, devicetree, devel, linuxarm, wangzhou1, guohanjun,
	Shameer Kolothum

From: John Garry <john.garry@huawei.com>

On some platforms the msi-controller address regions have to be excluded
from normal IOVA allocation, because system components detect and decode
them in a HW-specific way, so they cannot be treated as normal IOVA
address space.

Add a helper function that retrieves the MSI address regions through the
device tree MSI mapping and reserves them, so that the IOMMU does not
translate these regions and they are excluded from IOVA allocations.
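
The platform-device half of this helper walks every "msi-parent" entry
of the device node and reserves the first "reg" range of each referenced
controller. A minimal userspace sketch of that loop follows; struct
msi_ctrl, the pointer array and the output array are hypothetical
stand-ins for the device tree nodes, of_parse_phandle_with_args() and the
reserved-region list, and -EINVAL is only an assumed error code for a
controller without a usable "reg" property.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an msi-controller node and its first "reg" entry. */
struct msi_ctrl {
	int has_reg;
	uint64_t reg_start;
	uint64_t reg_size;
};

/* Hypothetical stand-in for struct iommu_resv_region. */
struct resv_region {
	uint64_t start;
	uint64_t length;
};

/* Turn one controller's "reg" range into a reserved region, or fail. */
static int alloc_resv_region(const struct msi_ctrl *np, struct resv_region *slot)
{
	if (!np->has_reg)
		return -EINVAL;	/* assumed error code, see lead-in */

	slot->start = np->reg_start;
	slot->length = np->reg_size;
	return 0;
}

/*
 * Reserve one region per msi-parent entry. Returns the number of regions
 * reserved (0 if the device has no msi-parent), or the first error hit.
 */
static int platform_msi_get_resv_regions(const struct msi_ctrl *const *parents,
					 size_t n_parents,
					 struct resv_region *out)
{
	int err, resv = 0;

	assert(out != NULL);

	/* The running count doubles as the msi-parent index, as in the patch. */
	while ((size_t)resv < n_parents) {
		err = alloc_resv_region(parents[resv], &out[resv]);
		if (err)
			return err;
		resv++;
	}

	return resv;
}
```

Unlike the ACPI helper, this path stops at the first failure and returns
that error instead of comparing the count against an expected total.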

Signed-off-by: John Garry <john.garry@huawei.com>
[Shameer: Modified msi-parent retrieval logic]
Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
---
 drivers/iommu/of_iommu.c | 95 ++++++++++++++++++++++++++++++++++++++++++++++++
 include/linux/of_iommu.h | 10 +++++
 2 files changed, 105 insertions(+)

diff --git a/drivers/iommu/of_iommu.c b/drivers/iommu/of_iommu.c
index e60e3db..ffd7fb7 100644
--- a/drivers/iommu/of_iommu.c
+++ b/drivers/iommu/of_iommu.c
@@ -21,6 +21,7 @@
 #include <linux/iommu.h>
 #include <linux/limits.h>
 #include <linux/of.h>
+#include <linux/of_address.h>
 #include <linux/of_iommu.h>
 #include <linux/of_pci.h>
 #include <linux/slab.h>
@@ -138,6 +139,14 @@ static int of_iommu_xlate(struct device *dev,
 	return ops->of_xlate(dev, iommu_spec);
 }
 
+static int __get_pci_rid(struct pci_dev *pdev, u16 alias, void *data)
+{
+	u32 *rid = data;
+
+	*rid = alias;
+	return 0;
+}
+
 struct of_pci_iommu_alias_info {
 	struct device *dev;
 	struct device_node *np;
@@ -163,6 +172,73 @@ static int of_pci_iommu_init(struct pci_dev *pdev, u16 alias, void *data)
 	return info->np == pdev->bus->dev.of_node;
 }
 
+static int of_iommu_alloc_resv_region(struct device_node *np,
+				      struct list_head *head)
+{
+	int prot = IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO;
+	struct iommu_resv_region *region;
+	struct resource res;
+	int err;
+
+	err = of_address_to_resource(np, 0, &res);
+	if (err)
+		return err;
+
+	region = iommu_alloc_resv_region(res.start, resource_size(&res),
+					 prot, IOMMU_RESV_MSI);
+	if (!region)
+		return -ENOMEM;
+
+	list_add_tail(&region->list, head);
+
+	return 0;
+}
+
+static int of_pci_msi_get_resv_regions(struct device *dev,
+				       struct list_head *head)
+{
+	struct device_node *msi_np;
+	struct device *pdev;
+	u32 rid;
+	int err, resv = 0;
+
+	pci_for_each_dma_alias(to_pci_dev(dev), __get_pci_rid, &rid);
+
+	for_each_node_with_property(msi_np, "msi-controller") {
+		for (pdev = dev; pdev; pdev = pdev->parent) {
+			if (!of_pci_map_rid(pdev->of_node, rid, "msi-map",
+					    "msi-map-mask", &msi_np, NULL)) {
+
+				err = of_iommu_alloc_resv_region(msi_np, head);
+				if (err)
+					return err;
+				resv++;
+			}
+		}
+	}
+
+	return resv;
+}
+
+static int of_platform_msi_get_resv_regions(struct device *dev,
+				       struct list_head *head)
+{
+	struct of_phandle_args args;
+	int err, resv = 0;
+
+	while (!of_parse_phandle_with_args(dev->of_node, "msi-parent",
+					   "#msi-cells", resv, &args)) {
+
+		err = of_iommu_alloc_resv_region(args.np, head);
+		of_node_put(args.np);
+		if (err)
+			return err;
+		resv++;
+	}
+
+	return resv;
+}
+
 const struct iommu_ops *of_iommu_configure(struct device *dev,
 					   struct device_node *master_np)
 {
@@ -235,6 +311,25 @@ const struct iommu_ops *of_iommu_configure(struct device *dev,
 	return ops;
 }
 
+/**
+ * of_iommu_msi_get_resv_regions - Reserved region driver helper
+ * @dev: Device from iommu_get_resv_regions()
+ * @head: Reserved region list from iommu_get_resv_regions()
+ *
+ * Returns: Number of reserved regions on success (0 if no associated
+ *          msi parent), appropriate error value otherwise.
+ */
+int of_iommu_msi_get_resv_regions(struct device *dev, struct list_head *head)
+{
+
+	if (dev_is_pci(dev))
+		return of_pci_msi_get_resv_regions(dev, head);
+	else if (dev->of_node)
+		return of_platform_msi_get_resv_regions(dev, head);
+
+	return 0;
+}
+
 static int __init of_iommu_init(void)
 {
 	struct device_node *np;
diff --git a/include/linux/of_iommu.h b/include/linux/of_iommu.h
index 13394ac..9267772 100644
--- a/include/linux/of_iommu.h
+++ b/include/linux/of_iommu.h
@@ -14,6 +14,9 @@ extern int of_get_dma_window(struct device_node *dn, const char *prefix,
 extern const struct iommu_ops *of_iommu_configure(struct device *dev,
 					struct device_node *master_np);
 
+extern int of_iommu_msi_get_resv_regions(struct device *dev,
+					struct list_head *head);
+
 #else
 
 static inline int of_get_dma_window(struct device_node *dn, const char *prefix,
@@ -29,6 +32,13 @@ static inline const struct iommu_ops *of_iommu_configure(struct device *dev,
 	return NULL;
 }
 
+static inline int of_iommu_msi_get_resv_regions(struct device *dev,
+						struct list_head *head)
+{
+	return -ENODEV;
+}
+
+
 #endif	/* CONFIG_OF_IOMMU */
 
 extern struct of_device_id __iommu_of_table;
-- 
1.9.1



^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [Devel] [PATCH v8 3/5] iommu/of: Add msi address regions reservation helper
@ 2017-09-27 13:32   ` Shameer Kolothum
  0 siblings, 0 replies; 59+ messages in thread
From: Shameer Kolothum @ 2017-09-27 13:32 UTC (permalink / raw)
  To: devel

[-- Attachment #1: Type: text/plain, Size: 5026 bytes --]

From: John Garry <john.garry(a)huawei.com>

On some platforms msi-controller address regions have to be excluded
from normal IOVA allocation in that they are detected and decoded in
a HW specific way by system components and so they cannot be considered
normal IOVA address space.

Add a helper function that retrieves msi address regions through device
tree msi mapping, so that these regions will not be translated by IOMMU
and will be excluded from IOVA allocations.

Signed-off-by: John Garry <john.garry(a)huawei.com>
[Shameer: Modified msi-parent retrieval logic]
Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi(a)huawei.com>
---
 drivers/iommu/of_iommu.c | 95 ++++++++++++++++++++++++++++++++++++++++++++++++
 include/linux/of_iommu.h | 10 +++++
 2 files changed, 105 insertions(+)

diff --git a/drivers/iommu/of_iommu.c b/drivers/iommu/of_iommu.c
index e60e3db..ffd7fb7 100644
--- a/drivers/iommu/of_iommu.c
+++ b/drivers/iommu/of_iommu.c
@@ -21,6 +21,7 @@
 #include <linux/iommu.h>
 #include <linux/limits.h>
 #include <linux/of.h>
+#include <linux/of_address.h>
 #include <linux/of_iommu.h>
 #include <linux/of_pci.h>
 #include <linux/slab.h>
@@ -138,6 +139,14 @@ static int of_iommu_xlate(struct device *dev,
 	return ops->of_xlate(dev, iommu_spec);
 }
 
+static int __get_pci_rid(struct pci_dev *pdev, u16 alias, void *data)
+{
+	u32 *rid = data;
+
+	*rid = alias;
+	return 0;
+}
+
 struct of_pci_iommu_alias_info {
 	struct device *dev;
 	struct device_node *np;
@@ -163,6 +172,73 @@ static int of_pci_iommu_init(struct pci_dev *pdev, u16 alias, void *data)
 	return info->np == pdev->bus->dev.of_node;
 }
 
+static int of_iommu_alloc_resv_region(struct device_node *np,
+				      struct list_head *head)
+{
+	int prot = IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO;
+	struct iommu_resv_region *region;
+	struct resource res;
+	int err;
+
+	err = of_address_to_resource(np, 0, &res);
+	if (err)
+		return err;
+
+	region = iommu_alloc_resv_region(res.start, resource_size(&res),
+					 prot, IOMMU_RESV_MSI);
+	if (!region)
+		return -ENOMEM;
+
+	list_add_tail(&region->list, head);
+
+	return 0;
+}
+
+static int of_pci_msi_get_resv_regions(struct device *dev,
+				       struct list_head *head)
+{
+	struct device_node *msi_np;
+	struct device *pdev;
+	u32 rid;
+	int err, resv = 0;
+
+	pci_for_each_dma_alias(to_pci_dev(dev), __get_pci_rid, &rid);
+
+	for_each_node_with_property(msi_np, "msi-controller") {
+		for (pdev = dev; pdev; pdev = pdev->parent) {
+			if (!of_pci_map_rid(pdev->of_node, rid, "msi-map",
+					    "msi-map-mask", &msi_np, NULL)) {
+
+				err = of_iommu_alloc_resv_region(msi_np, head);
+				if (err)
+					return err;
+				resv++;
+			}
+		}
+	}
+
+	return resv;
+}
+
+static int of_platform_msi_get_resv_regions(struct device *dev,
+				       struct list_head *head)
+{
+	struct of_phandle_args args;
+	int err, resv = 0;
+
+	while (!of_parse_phandle_with_args(dev->of_node, "msi-parent",
+					   "#msi-cells", resv, &args)) {
+
+		err = of_iommu_alloc_resv_region(args.np, head);
+		of_node_put(args.np);
+		if (err)
+			return err;
+		resv++;
+	}
+
+	return resv;
+}
+
 const struct iommu_ops *of_iommu_configure(struct device *dev,
 					   struct device_node *master_np)
 {
@@ -235,6 +311,25 @@ const struct iommu_ops *of_iommu_configure(struct device *dev,
 	return ops;
 }
 
+/**
+ * of_iommu_msi_get_resv_regions - Reserved region driver helper
+ * @dev: Device from iommu_get_resv_regions()
+ * @head: Reserved region list from iommu_get_resv_regions()
+ *
+ * Returns: Number of reserved regions on success (0 if no associated
+ *          msi parent), appropriate error value otherwise.
+ */
+int of_iommu_msi_get_resv_regions(struct device *dev, struct list_head *head)
+{
+
+	if (dev_is_pci(dev))
+		return of_pci_msi_get_resv_regions(dev, head);
+	else if (dev->of_node)
+		return of_platform_msi_get_resv_regions(dev, head);
+
+	return 0;
+}
+
 static int __init of_iommu_init(void)
 {
 	struct device_node *np;
diff --git a/include/linux/of_iommu.h b/include/linux/of_iommu.h
index 13394ac..9267772 100644
--- a/include/linux/of_iommu.h
+++ b/include/linux/of_iommu.h
@@ -14,6 +14,9 @@ extern int of_get_dma_window(struct device_node *dn, const char *prefix,
 extern const struct iommu_ops *of_iommu_configure(struct device *dev,
 					struct device_node *master_np);
 
+extern int of_iommu_msi_get_resv_regions(struct device *dev,
+					struct list_head *head);
+
 #else
 
 static inline int of_get_dma_window(struct device_node *dn, const char *prefix,
@@ -29,6 +32,13 @@ static inline const struct iommu_ops *of_iommu_configure(struct device *dev,
 	return NULL;
 }
 
+static int of_iommu_msi_get_resv_regions(struct device *dev,
+					struct list_head *head)
+{
+	return -ENODEV;
+}
+
+
 #endif	/* CONFIG_OF_IOMMU */
 
 extern struct of_device_id __iommu_of_table;
-- 
1.9.1




* [PATCH v8 4/5] iommu/dma: Add a helper function to reserve HW MSI address regions for IOMMU drivers
  2017-09-27 13:32 ` Shameer Kolothum
  (?)
@ 2017-09-27 13:32   ` Shameer Kolothum
  -1 siblings, 0 replies; 59+ messages in thread
From: Shameer Kolothum @ 2017-09-27 13:32 UTC (permalink / raw)
  To: lorenzo.pieralisi, marc.zyngier, sudeep.holla, will.deacon,
	robin.murphy, joro, mark.rutland, robh
  Cc: gabriele.paoloni, john.garry, iommu, linux-arm-kernel,
	linux-acpi, devicetree, devel, linuxarm, wangzhou1, guohanjun,
	Shameer Kolothum

IOMMU drivers can use this to implement their .get_resv_regions callback
for HW MSI specific reservations (e.g. the ARM GICv3 ITS MSI region).
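
For illustration only, the firmware-based dispatch this helper performs
can be modelled in plain C outside the kernel. This is a minimal sketch,
not the kernel code: every model_* name, the call counters and the
-ENODEV error value are hypothetical stand-ins. The only behaviour taken
from the patch is that a DT-described IOMMU goes to the OF helper, that
everything else goes to the IORT helper, and that the return value is a
region count or a negative error.

```c
#include <errno.h>

/* Hypothetical stand-ins; none of these are real kernel types. */
enum model_fw { MODEL_FW_OF, MODEL_FW_ACPI };

struct model_dev {
	enum model_fw fw;	/* firmware interface that described the IOMMU */
	int nr_msi_ctrl;	/* MSI controllers found, < 0 on lookup failure */
};

struct model_list {
	int nr_regions;		/* regions queued on the reserved list */
	int of_calls;		/* how often the DT backend ran */
	int iort_calls;		/* how often the ACPI/IORT backend ran */
};

/* Shared bookkeeping: queue one region per MSI controller. */
static int model_queue(const struct model_dev *dev, struct model_list *list)
{
	if (dev->nr_msi_ctrl < 0)
		return -ENODEV;	/* assumed error code, for illustration */

	list->nr_regions += dev->nr_msi_ctrl;
	return dev->nr_msi_ctrl;
}

/* Plays the role of of_iommu_msi_get_resv_regions(). */
static int model_of_msi_resv(const struct model_dev *dev,
			     struct model_list *list)
{
	list->of_calls++;
	return model_queue(dev, list);
}

/* Plays the role of iort_iommu_msi_get_resv_regions(). */
static int model_iort_msi_resv(const struct model_dev *dev,
			       struct model_list *list)
{
	list->iort_calls++;
	return model_queue(dev, list);
}

/* Mirrors the dispatch in iommu_dma_get_msi_resv_regions(). */
int model_get_msi_resv_regions(const struct model_dev *dev,
			       struct model_list *list)
{
	if (dev->fw == MODEL_FW_OF)
		return model_of_msi_resv(dev, list);

	return model_iort_msi_resv(dev, list);
}
```

A caller would treat a positive return as "HW MSI regions were queued",
zero as "no MSI controller is associated with the device" and a negative
value as a failure.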

Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
[John: added DT support]
Signed-off-by: John Garry <john.garry@huawei.com>
---
 drivers/iommu/dma-iommu.c | 19 +++++++++++++++++++
 include/linux/dma-iommu.h |  7 +++++++
 2 files changed, 26 insertions(+)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 9d1cebe..f8709a2 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -19,6 +19,7 @@
  * along with this program.  If not, see <http://www.gnu.org/licenses/>.
  */
 
+#include <linux/acpi_iort.h>
 #include <linux/device.h>
 #include <linux/dma-iommu.h>
 #include <linux/gfp.h>
@@ -27,6 +28,7 @@
 #include <linux/iova.h>
 #include <linux/irq.h>
 #include <linux/mm.h>
+#include <linux/of_iommu.h>
 #include <linux/pci.h>
 #include <linux/scatterlist.h>
 #include <linux/vmalloc.h>
@@ -198,6 +200,23 @@ void iommu_dma_get_resv_regions(struct device *dev, struct list_head *list)
 }
 EXPORT_SYMBOL(iommu_dma_get_resv_regions);
 
+/**
+ * iommu_dma_get_msi_resv_regions - Reserved region driver helper
+ * @dev: Device from iommu_get_resv_regions()
+ * @list: Reserved region list from iommu_get_resv_regions()
+ *
+ * IOMMU drivers can use this to implement their .get_resv_regions
+ * callback for HW MSI specific reservations.
+ */
+int iommu_dma_get_msi_resv_regions(struct device *dev, struct list_head *list)
+{
+	if (is_of_node(dev->iommu_fwspec->iommu_fwnode))
+		return of_iommu_msi_get_resv_regions(dev, list);
+
+	return iort_iommu_msi_get_resv_regions(dev, list);
+}
+EXPORT_SYMBOL(iommu_dma_get_msi_resv_regions);
+
 static int cookie_init_hw_msi_region(struct iommu_dma_cookie *cookie,
 		phys_addr_t start, phys_addr_t end)
 {
diff --git a/include/linux/dma-iommu.h b/include/linux/dma-iommu.h
index 92f2083..6062ef0 100644
--- a/include/linux/dma-iommu.h
+++ b/include/linux/dma-iommu.h
@@ -74,6 +74,8 @@ void iommu_dma_unmap_resource(struct device *dev, dma_addr_t handle,
 void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg);
 void iommu_dma_get_resv_regions(struct device *dev, struct list_head *list);
 
+int iommu_dma_get_msi_resv_regions(struct device *dev, struct list_head *list);
+
 #else
 
 struct iommu_domain;
@@ -107,6 +109,11 @@ static inline void iommu_dma_get_resv_regions(struct device *dev, struct list_he
 {
 }
 
+static inline int iommu_dma_get_msi_resv_regions(struct device *dev, struct list_head *list)
+{
+	return -ENODEV;
+}
+
 #endif	/* CONFIG_IOMMU_DMA */
 #endif	/* __KERNEL__ */
 #endif	/* __DMA_IOMMU_H */
-- 
1.9.1




* [PATCH v8 5/5] iommu/arm-smmu-v3:Enable ACPI based HiSilicon erratum 161010801
  2017-09-27 13:32 ` Shameer Kolothum
  (?)
@ 2017-09-27 13:32   ` Shameer Kolothum
  -1 siblings, 0 replies; 59+ messages in thread
From: Shameer Kolothum @ 2017-09-27 13:32 UTC (permalink / raw)
  To: lorenzo.pieralisi, marc.zyngier, sudeep.holla, will.deacon,
	robin.murphy, joro, mark.rutland, robh
  Cc: gabriele.paoloni, john.garry, iommu, linux-arm-kernel,
	linux-acpi, devicetree, devel, linuxarm, wangzhou1, guohanjun,
	Shameer Kolothum

The HiSilicon erratum 161010801 describes a limitation of the HiSilicon
Hip06/Hip07 platforms in supporting SMMU mappings for MSI transactions.

On these platforms the GICv3 ITS translator is presented with the
deviceID by extending the MSI payload data to 64 bits to include the
deviceID. Hence, the PCIe controller on these platforms has to
differentiate the MSI payload from other DMA payloads and has to modify
the MSI payload. This makes it difficult for these platforms to have
SMMU translation for MSI.

This patch implements a quirk to reserve the HW MSI regions in the
smmu-v3 driver, which means these address regions will not be
translated and will be excluded from IOVA allocations.
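
The resulting selection between HW MSI regions and the default SW MSI
window can be sketched as a small user-space model. This is an
illustration under stated assumptions rather than the driver code: the
model_* names, the fixed-size array, the doorbell addresses and the
doorbell length are all hypothetical, and tagging the HW regions with a
separate "MSI" type is an assumption about what the helpers do. The
control flow follows the arm_smmu_get_resv_regions() hunk in this patch:
an error from the HW lookup reserves nothing, a count of zero falls back
to the SW MSI window.

```c
#include <stdbool.h>
#include <stddef.h>

/* Same values as MSI_IOVA_BASE / MSI_IOVA_LENGTH in the driver. */
#define MODEL_MSI_IOVA_BASE	0x8000000ULL
#define MODEL_MSI_IOVA_LENGTH	0x100000ULL
#define MODEL_MAX_REGIONS	8

enum model_resv_type { MODEL_RESV_MSI, MODEL_RESV_SW_MSI };

struct model_region {
	unsigned long long start;
	unsigned long long length;
	enum model_resv_type type;
};

struct model_resv_list {
	struct model_region regions[MODEL_MAX_REGIONS];
	size_t count;
};

/* Append one region; returns 0 on success, -1 if the list is full. */
static int model_add_region(struct model_resv_list *list,
			    unsigned long long start,
			    unsigned long long length,
			    enum model_resv_type type)
{
	if (list->count >= MODEL_MAX_REGIONS)
		return -1;

	list->regions[list->count].start = start;
	list->regions[list->count].length = length;
	list->regions[list->count].type = type;
	list->count++;
	return 0;
}

/*
 * resv_hw_msi:  models ARM_SMMU_OPT_RESV_HW_MSI being set.
 * nr_doorbells: result of the HW MSI lookup; negative means the lookup
 *               failed, zero means no MSI controller was found.
 * Returns the number of regions on the list, or a negative error.
 */
int model_get_resv_regions(bool resv_hw_msi,
			   const unsigned long long *doorbells,
			   int nr_doorbells,
			   unsigned long long doorbell_len,
			   struct model_resv_list *list)
{
	int resv = 0;
	int i;

	if (resv_hw_msi) {
		if (nr_doorbells < 0)
			return nr_doorbells;	/* reserve nothing on error */

		for (i = 0; i < nr_doorbells; i++) {
			if (model_add_region(list, doorbells[i], doorbell_len,
					     MODEL_RESV_MSI))
				return -1;
			resv++;
		}
	}

	/* No HW MSI region reserved: keep the default SW MSI window. */
	if (!resv && model_add_region(list, MODEL_MSI_IOVA_BASE,
				      MODEL_MSI_IOVA_LENGTH,
				      MODEL_RESV_SW_MSI))
		return -1;

	return (int)list->count;
}
```

In this model the SW MSI window and the HW MSI regions are mutually
exclusive for a given device, which matches the "replace the default SW
MSI reserve address" wording of the cover letter.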

Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
---
 drivers/iommu/arm-smmu-v3.c | 41 +++++++++++++++++++++++++++++++++++------
 1 file changed, 35 insertions(+), 6 deletions(-)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index e67ba6c..fb7f08d 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -413,6 +413,9 @@
 #define MSI_IOVA_BASE			0x8000000
 #define MSI_IOVA_LENGTH			0x100000
 
+#define SMMU_V3_GENERIC_ARM		0x0
+#define SMMU_V3_HISILICON_HI161X	0x1
+
 /* Until ACPICA headers cover IORT rev. C */
 #ifndef ACPI_IORT_SMMU_HISILICON_HI161X
 #define ACPI_IORT_SMMU_HISILICON_HI161X		0x1
@@ -608,6 +611,7 @@ struct arm_smmu_device {
 
 #define ARM_SMMU_OPT_SKIP_PREFETCH	(1 << 0)
 #define ARM_SMMU_OPT_PAGE0_REGS_ONLY	(1 << 1)
+#define ARM_SMMU_OPT_RESV_HW_MSI	(1 << 2)
 	u32				options;
 
 	struct arm_smmu_cmdq		cmdq;
@@ -696,6 +700,8 @@ static struct arm_smmu_domain *to_smmu_domain(struct iommu_domain *dom)
 static void parse_driver_options(struct arm_smmu_device *smmu)
 {
 	int i = 0;
+	const void *data = of_device_get_match_data(smmu->dev);
+	u32 model = (u32)(unsigned long)data;
 
 	do {
 		if (of_property_read_bool(smmu->dev->of_node,
@@ -705,6 +711,11 @@ static void parse_driver_options(struct arm_smmu_device *smmu)
 				arm_smmu_options[i].prop);
 		}
 	} while (arm_smmu_options[++i].opt);
+
+	if (model == SMMU_V3_HISILICON_HI161X) {
+		smmu->options |= ARM_SMMU_OPT_RESV_HW_MSI;
+		dev_notice(smmu->dev, "\tenabling workaround for HiSilicon erratum 161010801\n");
+	}
 }
 
 /* Low-level queue manipulation functions */
@@ -1934,14 +1945,29 @@ static void arm_smmu_get_resv_regions(struct device *dev,
 				      struct list_head *head)
 {
 	struct iommu_resv_region *region;
+	struct arm_smmu_master_data *master = dev->iommu_fwspec->iommu_priv;
+	struct arm_smmu_device *smmu = master->smmu;
 	int prot = IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO;
+	int resv = 0;
 
-	region = iommu_alloc_resv_region(MSI_IOVA_BASE, MSI_IOVA_LENGTH,
-					 prot, IOMMU_RESV_SW_MSI);
-	if (!region)
-		return;
+	if (smmu->options & ARM_SMMU_OPT_RESV_HW_MSI) {
 
-	list_add_tail(&region->list, head);
+		resv = iommu_dma_get_msi_resv_regions(dev, head);
+
+		if (resv < 0) {
+			dev_warn(dev, "HW MSI region resv failed: %d\n", resv);
+			return;
+		}
+	}
+
+	if (!resv) {
+		region = iommu_alloc_resv_region(MSI_IOVA_BASE, MSI_IOVA_LENGTH,
+						 prot, IOMMU_RESV_SW_MSI);
+		if (!region)
+			return;
+
+		list_add_tail(&region->list, head);
+	}
 
 	iommu_dma_get_resv_regions(dev, head);
 }
@@ -2667,6 +2693,7 @@ static void acpi_smmu_get_options(u32 model, struct arm_smmu_device *smmu)
 		break;
 	case ACPI_IORT_SMMU_HISILICON_HI161X:
 		smmu->options |= ARM_SMMU_OPT_SKIP_PREFETCH;
+		smmu->options |= ARM_SMMU_OPT_RESV_HW_MSI;
 		break;
 	}
 
@@ -2862,7 +2889,9 @@ static void arm_smmu_device_shutdown(struct platform_device *pdev)
 }
 
 static const struct of_device_id arm_smmu_of_match[] = {
-	{ .compatible = "arm,smmu-v3", },
+	{ .compatible = "hisilicon,hi161x-smmu-v3",
+			.data = (void *)SMMU_V3_HISILICON_HI161X },
+	{ .compatible = "arm,smmu-v3", .data = (void *)SMMU_V3_GENERIC_ARM },
 	{ },
 };
 MODULE_DEVICE_TABLE(of, arm_smmu_of_match);
-- 
1.9.1




* RE: [PATCH v8 0/5] iommu/smmu-v3: Workaround for hisilicon 161010801 erratum(reserve HW MSI)
  2017-09-27 13:32 ` Shameer Kolothum
  (?)
@ 2017-10-04 10:03     ` Shameerali Kolothum Thodi
  -1 siblings, 0 replies; 59+ messages in thread
From: Shameerali Kolothum Thodi @ 2017-10-04 10:03 UTC (permalink / raw)
  To: lorenzo.pieralisi, marc.zyngier, sudeep.holla, will.deacon,
	robin.murphy, joro, mark.rutland, robh
  Cc: devicetree, Gabriele Paoloni, Linuxarm, linux-acpi, iommu,
	Guohanjun (Hanjun Guo), linux-arm-kernel, devel

Hi Will/Lorenzo/Marc,

Any feedback on this series, please? I would really appreciate it if you
could take a look and let us know.

Thanks,
Shameer

> -----Original Message-----
> From: Shameerali Kolothum Thodi
> Sent: Wednesday, September 27, 2017 2:33 PM
> To: lorenzo.pieralisi@arm.com; marc.zyngier@arm.com;
> sudeep.holla@arm.com; will.deacon@arm.com; robin.murphy@arm.com;
> joro@8bytes.org; mark.rutland@arm.com; robh@kernel.org
> Cc: Gabriele Paoloni <gabriele.paoloni@huawei.com>; John Garry
> <john.garry@huawei.com>; iommu@lists.linux-foundation.org;
> linux-arm-kernel@lists.infradead.org; linux-acpi@vger.kernel.org;
> devicetree@vger.kernel.org; devel@acpica.org; Linuxarm
> <linuxarm@huawei.com>; Wangzhou (B) <wangzhou1@hisilicon.com>;
> Guohanjun (Hanjun Guo) <guohanjun@huawei.com>; Shameerali Kolothum
> Thodi <shameerali.kolothum.thodi@huawei.com>
> Subject: [PATCH v8 0/5] iommu/smmu-v3: Workaround for hisilicon
> 161010801 erratum(reserve HW MSI)
> 
> On certain HiSilicon platforms (hip06/hip07) the GIC ITS and PCIe RC
> deviate from the standard implementation and this breaks PCIe MSI
> functionality when SMMU is enabled.
> 
> The HiSilicon erratum 161010801 describes this limitation of certain
> HiSilicon platforms in supporting SMMU mappings for MSI transactions.
> On these platforms the GICv3 ITS translator is presented with the deviceID
> by extending the MSI payload data to 64 bits to include the deviceID.
> Hence, the PCIe controller on these platforms has to differentiate the MSI
> payload from other DMA payloads and has to modify the MSI payload.
> This makes it difficult for these platforms to have SMMU
> translation for MSI.
> 
> This patch implements an ACPI and DT based quirk to reserve the HW MSI
> regions in the smmu-v3 driver, which means these address regions will
> not be translated and will be excluded from IOVA allocations.
> 
> To implement this quirk, the following changes are incorporated:
> 1. Added a generic helper function to the IORT code to retrieve the
>    associated ITS base address from a device IORT node.
> 2. Added a generic helper function to the OF IOMMU code to retrieve the
>    associated MSI controller base address for a PCI RC
>    msi-mapping and also a platform device msi-parent.
> 3. Added a quirk to SMMUv3 to retrieve the HW ITS address and replace
>    the default SW MSI reserve address based on the IORT SMMU model
>    or DT bindings.
> 
> Changelog:
> 
> v7 --> v8
> Addressed comments from Rob and Lorenzo:
>  -Modified to use DT compatible string for errata.
>  -Changed logic to retrieve the msi-parent for DT case.
> 
> v6 --> v7
> Addressed request from Will to add DT support for the erratum:
>  - added dt binding
>  - add of_iommu_msi_get_resv_regions()
> New arm64 silicon errata entry
> Rename iort_iommu_{its->msi}_get_resv_regions
> 
> v5 --> v6
> Addressed comments from Robin and Lorenzo:
> -No change to patch#1.
> -Reverted v5 patch#2 as this might break the platforms where this quirk
>   is not applicable. Provided a generic function in iommu code and added
>   back the quirk implementation in SMMU v3 driver(patch#3)
> 
> v4 --> v5
> Addressed comments from Robin and Lorenzo:
> -Added a comment to make it clear that, for now, only straightforward
>   HW topologies are handled while reserving ITS regions(patch #1).
> 
> v3 --> v4
> Rebased on 4.13-rc1.
> Addressed comments from Robin, Will and Lorenzo:
> -As suggested by Robin, moved the ITS msi reservation into
>   iommu_dma_get_resv_regions().
> -Added its_count != resv region failure case(patch #1).
> 
> v2 --> v3
> Addressed comments from Lorenzo and Robin:
> -Removed dev_is_pci() check in smmuV3 driver.
> -Don't treat device not having an ITS mapping as an error in
>   iort helper function.
> 
> v1 --> v2
> -patch 2/2: Invoke iort helper fn based on fwnode type(acpi).
> 
> RFCv2 -->PATCH
> -Incorporated Lorenzo's review comments.
> 
> RFC v1 --> RFC v2
> Based on Robin's review comments:
> -Removed the generic erratum framework.
> -Using IORT/MADT tables to retrieve the ITS base addr instead of the
> vendor-specific CSRT table.
> 
> John Garry (2):
>   Doc: iommu/arm-smmu-v3: Add workaround for HiSilicon erratum
> 161010801
>   iommu/of: Add msi address regions reservation helper
> 
> Shameer Kolothum (3):
>   ACPI/IORT: Add msi address regions reservation helper
>   iommu/dma: Add a helper function to reserve HW MSI address regions for
>     IOMMU drivers
>   iommu/arm-smmu-v3:Enable ACPI based HiSilicon erratum 161010801
> 
>  Documentation/arm64/silicon-errata.txt             |  1 +
>  .../devicetree/bindings/iommu/arm,smmu-v3.txt      |  9 +-
>  drivers/acpi/arm64/iort.c                          | 96 +++++++++++++++++++++-
>  drivers/iommu/arm-smmu-v3.c                        | 41 +++++++--
>  drivers/iommu/dma-iommu.c                          | 19 +++++
>  drivers/iommu/of_iommu.c                           | 95 +++++++++++++++++++++
>  drivers/irqchip/irq-gic-v3-its.c                   |  3 +-
>  include/linux/acpi_iort.h                          |  7 +-
>  include/linux/dma-iommu.h                          |  7 ++
>  include/linux/of_iommu.h                           | 10 +++
>  10 files changed, 276 insertions(+), 12 deletions(-)
> 
> --
> 1.9.1
> 


* [PATCH v8 0/5] iommu/smmu-v3: Workaround for hisilicon 161010801 erratum(reserve HW MSI)
@ 2017-10-04 10:03     ` Shameerali Kolothum Thodi
  0 siblings, 0 replies; 59+ messages in thread
From: Shameerali Kolothum Thodi @ 2017-10-04 10:03 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Will/Lorenzo/Marc,

Any feedback on this series, please? I would really appreciate it if you
could take a look and let us know.

Thanks,
Shameer

> -----Original Message-----
> From: Shameerali Kolothum Thodi
> Sent: Wednesday, September 27, 2017 2:33 PM
> To: lorenzo.pieralisi at arm.com; marc.zyngier at arm.com;
> sudeep.holla at arm.com; will.deacon at arm.com; robin.murphy at arm.com;
> joro at 8bytes.org; mark.rutland at arm.com; robh at kernel.org
> Cc: Gabriele Paoloni <gabriele.paoloni@huawei.com>; John Garry
> <john.garry@huawei.com>; iommu at lists.linux-foundation.org; linux-arm-
> kernel at lists.infradead.org; linux-acpi at vger.kernel.org;
> devicetree at vger.kernel.org; devel at acpica.org; Linuxarm
> <linuxarm@huawei.com>; Wangzhou (B) <wangzhou1@hisilicon.com>;
> Guohanjun (Hanjun Guo) <guohanjun@huawei.com>; Shameerali Kolothum
> Thodi <shameerali.kolothum.thodi@huawei.com>
> Subject: [PATCH v8 0/5] iommu/smmu-v3: Workaround for hisilicon
> 161010801 erratum(reserve HW MSI)
> 
> On certain HiSilicon platforms (hip06/hip07) the GIC ITS and PCIe RC
> deviate from the standard implementation and this breaks PCIe MSI
> functionality when SMMU is enabled.
> 
> The HiSilicon erratum 161010801 describes this limitation of certain
> HiSilicon platforms in supporting SMMU mappings for MSI transactions.
> On these platforms the GICv3 ITS translator is presented with the deviceID
> by extending the MSI payload data to 64 bits to include the deviceID.
> Hence, the PCIe controller on these platforms has to differentiate the MSI
> payload from other DMA payloads and has to modify the MSI payload.
> This makes it difficult for these platforms to have SMMU
> translation for MSI.
> 
> This patch implements an ACPI and DT based quirk to reserve the HW MSI
> regions in the smmu-v3 driver, which means these address regions will
> not be translated and will be excluded from IOVA allocations.
> 
> To implement this quirk, the following changes are incorporated:
> 1. Added a generic helper function to the IORT code to retrieve the
>    associated ITS base address from a device IORT node.
> 2. Added a generic helper function to the OF IOMMU code to retrieve the
>    associated MSI controller base address for a PCI RC
>    msi-mapping and also a platform device msi-parent.
> 3. Added a quirk to SMMUv3 to retrieve the HW ITS address and replace
>    the default SW MSI reserve address based on the IORT SMMU model
>    or DT bindings.
> 
> Changelog:
> 
> v7 --> v8
> Addressed comments from Rob and Lorenzo:
>  -Modified to use DT compatible string for errata.
>  -Changed logic to retrieve the msi-parent for DT case.
> 
> v6 --> v7
> Addressed request from Will to add DT support for the erratum:
>  - added dt binding
>  - add of_iommu_msi_get_resv_regions()
> New arm64 silicon errata entry
> Rename iort_iommu_{its->msi}_get_resv_regions
> 
> v5 --> v6
> Addressed comments from Robin and Lorenzo:
> -No change to patch#1 .
> -Reverted v5 patch#2 as this might break the platforms where this quirk
>   is not applicable. Provided a generic function in iommu code and added
>   back the quirk implementation in SMMU v3 driver(patch#3)
> 
> v4 --> v5
> Addressed comments from Robin and Lorenzo:
> -Added a comment to make it clear that, for now, only straightforward
>   HW topologies are handled while reserving ITS regions(patch #1).
> 
> v3 --> v4
> Rebased on 4.13-rc1.
> Addressed comments from Robin, Will and Lorenzo:
> -As suggested by Robin, moved the ITS msi reservation into
>   iommu_dma_get_resv_regions().
> -Added its_count != resv region failure case(patch #1).
> 
> v2 --> v3
> Addressed comments from Lorenzo and Robin:
> -Removed dev_is_pci() check in smmuV3 driver.
> -Don't treat device not having an ITS mapping as an error in
>   iort helper function.
> 
> v1 --> v2
> -patch 2/2: Invoke iort helper fn based on fwnode type(acpi).
> 
> RFCv2 -->PATCH
> -Incorporated Lorenzo's review comments.
> 
> RFC v1 --> RFC v2
> Based on Robin's review comments,
> -Removed  the generic erratum framework.
> -Using IORT/MADT tables to retrieve the ITS base addr instead  of vendor
> specific CSRT table.
> 
> John Garry (2):
>   Doc: iommu/arm-smmu-v3: Add workaround for HiSilicon erratum
> 161010801
>   iommu/of: Add msi address regions reservation helper
> 
> Shameer Kolothum (3):
>   ACPI/IORT: Add msi address regions reservation helper
>   iommu/dma: Add a helper function to reserve HW MSI address regions for
>     IOMMU drivers
>   iommu/arm-smmu-v3:Enable ACPI based HiSilicon erratum 161010801
> 
>  Documentation/arm64/silicon-errata.txt             |  1 +
>  .../devicetree/bindings/iommu/arm,smmu-v3.txt      |  9 +-
>  drivers/acpi/arm64/iort.c                          | 96 +++++++++++++++++++++-
>  drivers/iommu/arm-smmu-v3.c                        | 41 +++++++--
>  drivers/iommu/dma-iommu.c                          | 19 +++++
>  drivers/iommu/of_iommu.c                           | 95 +++++++++++++++++++++
>  drivers/irqchip/irq-gic-v3-its.c                   |  3 +-
>  include/linux/acpi_iort.h                          |  7 +-
>  include/linux/dma-iommu.h                          |  7 ++
>  include/linux/of_iommu.h                           | 10 +++
>  10 files changed, 276 insertions(+), 12 deletions(-)
> 
> --
> 1.9.1
> 

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v8 3/5] iommu/of: Add msi address regions reservation helper
  2017-09-27 13:32   ` Shameer Kolothum
  (?)
@ 2017-10-04 11:22       ` Marc Zyngier
  -1 siblings, 0 replies; 59+ messages in thread
From: Marc Zyngier @ 2017-10-04 11:22 UTC (permalink / raw)
  To: Shameer Kolothum, lorenzo.pieralisi-5wv7dgnIgG8,
	sudeep.holla-5wv7dgnIgG8, will.deacon-5wv7dgnIgG8,
	robin.murphy-5wv7dgnIgG8, joro-zLv9SwRftAIdnm+yROfE0A,
	mark.rutland-5wv7dgnIgG8, robh-DgEjT+Ai2ygdnm+yROfE0A
  Cc: gabriele.paoloni-hv44wF8Li93QT0dZR+AlfA,
	john.garry-hv44wF8Li93QT0dZR+AlfA,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-acpi-u79uwXL29TY76Z2rM5mHXA,
	devicetree-u79uwXL29TY76Z2rM5mHXA, devel-E0kO6a4B6psdnm+yROfE0A,
	linuxarm-hv44wF8Li93QT0dZR+AlfA,
	wangzhou1-C8/M+/jPZTeaMJb+Lgu22Q,
	guohanjun-hv44wF8Li93QT0dZR+AlfA

On 27/09/17 14:32, Shameer Kolothum wrote:
> From: John Garry <john.garry-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
> 
> On some platforms msi-controller address regions have to be excluded
> from normal IOVA allocation in that they are detected and decoded in
> a HW specific way by system components and so they cannot be considered
> normal IOVA address space.
> 
> Add a helper function that retrieves msi address regions through device
> tree msi mapping, so that these regions will not be translated by IOMMU
> and will be excluded from IOVA allocations.
> 
> Signed-off-by: John Garry <john.garry-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
> [Shameer: Modified msi-parent retrieval logic]
> Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
> ---
>  drivers/iommu/of_iommu.c | 95 ++++++++++++++++++++++++++++++++++++++++++++++++
>  include/linux/of_iommu.h | 10 +++++
>  2 files changed, 105 insertions(+)
> 
> diff --git a/drivers/iommu/of_iommu.c b/drivers/iommu/of_iommu.c
> index e60e3db..ffd7fb7 100644
> --- a/drivers/iommu/of_iommu.c
> +++ b/drivers/iommu/of_iommu.c
> @@ -21,6 +21,7 @@
>  #include <linux/iommu.h>
>  #include <linux/limits.h>
>  #include <linux/of.h>
> +#include <linux/of_address.h>
>  #include <linux/of_iommu.h>
>  #include <linux/of_pci.h>
>  #include <linux/slab.h>
> @@ -138,6 +139,14 @@ static int of_iommu_xlate(struct device *dev,
>  	return ops->of_xlate(dev, iommu_spec);
>  }
>  
> +static int __get_pci_rid(struct pci_dev *pdev, u16 alias, void *data)
> +{
> +	u32 *rid = data;
> +
> +	*rid = alias;
> +	return 0;
> +}
> +
>  struct of_pci_iommu_alias_info {
>  	struct device *dev;
>  	struct device_node *np;
> @@ -163,6 +172,73 @@ static int of_pci_iommu_init(struct pci_dev *pdev, u16 alias, void *data)
>  	return info->np == pdev->bus->dev.of_node;
>  }
>  
> +static int of_iommu_alloc_resv_region(struct device_node *np,
> +				      struct list_head *head)
> +{
> +	int prot = IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO;
> +	struct iommu_resv_region *region;
> +	struct resource res;
> +	int err;
> +
> +	err = of_address_to_resource(np, 0, &res);

What is the rationale for registering the first region only? Surely you
can have more than one region in an MSI controller (yes, your particular
case is the ITS which has only one, but we're trying to do something
generic here).
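
To illustrate the question above, here is a minimal userspace sketch (not
the patch's code) of recording every address region of a controller
instead of only index 0. `struct mock_node`, `mock_address_to_resource()`
and `reserve_all_regions()` are hypothetical stand-ins; in the kernel the
loop would presumably call of_address_to_resource() with an increasing
index until it returns an error.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one "reg" entry of a device node. */
struct mock_region {
	uint64_t start;
	uint64_t size;
};

/* Hypothetical stand-in for a device node with several "reg" entries. */
struct mock_node {
	const struct mock_region *regs;
	size_t nr_regs;
};

/* Mimics of_address_to_resource(): 0 on success, negative past the end. */
int mock_address_to_resource(const struct mock_node *np, size_t index,
			     struct mock_region *res)
{
	if (index >= np->nr_regs)
		return -1;
	*res = np->regs[index];
	return 0;
}

/*
 * Walk every region of the node, not just index 0, and copy each one
 * into the caller's array. Returns the number of regions recorded, or
 * -1 if the output array is too small.
 */
int reserve_all_regions(const struct mock_node *np,
			struct mock_region *out, size_t max)
{
	struct mock_region res;
	size_t i;

	for (i = 0; mock_address_to_resource(np, i, &res) == 0; i++) {
		if (i >= max)
			return -1;
		out[i] = res;
	}
	return (int)i;
}
```

With a node that has two regions, the loop records both; a controller
with a single region (such as the ITS case mentioned above) behaves the
same as the index-0-only version.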

Another issue I have with this code is that it inserts all of the ITS
MMIO in the RESV_MSI range. As long as we don't generate any page tables
for this, we're fine. But if we ever change this, we'll end up with the
ITS programming interface being exposed to a device, which wouldn't be
acceptable.
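
One way to picture a narrower reservation is sketched below. It assumes
the architectural GICv3 ITS layout, where the control frame sits at the
ITS base and the translation frame (holding GITS_TRANSLATER) is the next
64K frame. `struct msi_window` and `its_translation_window()` are
hypothetical names, and the thread does not say whether the hip06/hip07
quirk would work with only this window reserved.

```c
#include <assert.h>
#include <stdint.h>

#define ITS_TRANSLATION_FRAME_OFFSET	0x10000ULL	/* second 64K frame */
#define ITS_FRAME_SIZE			0x10000ULL	/* each frame is 64K */

/* Hypothetical description of a window to exclude from translation. */
struct msi_window {
	uint64_t start;
	uint64_t length;
};

/*
 * Given the ITS base address, return only the translation frame (the
 * frame that holds GITS_TRANSLATER), leaving the control frame with the
 * programming interface out of the reserved window.
 */
struct msi_window its_translation_window(uint64_t its_base)
{
	struct msi_window w = {
		.start = its_base + ITS_TRANSLATION_FRAME_OFFSET,
		.length = ITS_FRAME_SIZE,
	};

	return w;
}
```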

Why can't we have some generic property instead, that would describe the
actual ranges that cannot be translated? That way, no random insertion
of a random range, and relatively future proof.
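
As a rough sketch of what consuming such a property could look like, the
code below decodes a list of <base size> tuples into ranges. The layout
(two 32-bit cells for the base, two for the size), `struct
untranslated_range` and both function names are assumptions made for
illustration; no such binding is defined in this thread, and real device
tree cells are big-endian, whereas this mock takes host-order values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical decoded form of one range that must not be translated. */
struct untranslated_range {
	uint64_t base;
	uint64_t size;
};

/* Combine two host-order 32-bit cells into one 64-bit value, high first. */
uint64_t read_two_cells(const uint32_t *cells)
{
	return ((uint64_t)cells[0] << 32) | cells[1];
}

/*
 * Decode a hypothetical property laid out as <base-hi base-lo size-hi
 * size-lo> tuples. Returns the number of ranges decoded, or -1 if the
 * cell count is not a multiple of four or the output array is too small.
 */
int parse_untranslated_ranges(const uint32_t *cells, size_t nr_cells,
			      struct untranslated_range *out, size_t max)
{
	size_t i, n = nr_cells / 4;

	if (nr_cells % 4 || n > max)
		return -1;

	for (i = 0; i < n; i++) {
		out[i].base = read_two_cells(&cells[4 * i]);
		out[i].size = read_two_cells(&cells[4 * i + 2]);
	}
	return (int)n;
}
```

The appeal of this shape is that the firmware description, rather than
the driver, decides exactly which ranges are excluded.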

I'm not sure where to declare it (PCIe node? IOMMU node?), but it'd feel
like a much nicer and simpler solution to this problem.

Thoughts?

	M.
-- 
Jazz is not dead. It just smells funny...
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v8 3/5] iommu/of: Add msi address regions reservation helper
  2017-10-04 11:22       ` Marc Zyngier
  (?)
@ 2017-10-04 13:50           ` Lorenzo Pieralisi
  -1 siblings, 0 replies; 59+ messages in thread
From: Lorenzo Pieralisi @ 2017-10-04 13:50 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Shameer Kolothum, sudeep.holla-5wv7dgnIgG8,
	will.deacon-5wv7dgnIgG8, robin.murphy-5wv7dgnIgG8,
	joro-zLv9SwRftAIdnm+yROfE0A, mark.rutland-5wv7dgnIgG8,
	robh-DgEjT+Ai2ygdnm+yROfE0A,
	gabriele.paoloni-hv44wF8Li93QT0dZR+AlfA,
	john.garry-hv44wF8Li93QT0dZR+AlfA,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-acpi-u79uwXL29TY76Z2rM5mHXA,
	devicetree-u79uwXL29TY76Z2rM5mHXA, devel-E0kO6a4B6psdnm+yROfE0A,
	linuxarm-hv44wF8Li93QT0dZR+AlfA,
	wangzhou1-C8/M+/jPZTeaMJb+Lgu22Q,
	guohanjun-hv44wF8Li93QT0dZR+AlfA

On Wed, Oct 04, 2017 at 12:22:04PM +0100, Marc Zyngier wrote:
> On 27/09/17 14:32, Shameer Kolothum wrote:
> > From: John Garry <john.garry-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
> > 
> > On some platforms msi-controller address regions have to be excluded
> > from normal IOVA allocation in that they are detected and decoded in
> > a HW specific way by system components and so they cannot be considered
> > normal IOVA address space.
> > 
> > Add a helper function that retrieves msi address regions through device
> > tree msi mapping, so that these regions will not be translated by IOMMU
> > and will be excluded from IOVA allocations.
> > 
> > Signed-off-by: John Garry <john.garry-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
> > [Shameer: Modified msi-parent retrieval logic]
> > Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
> > ---
> >  drivers/iommu/of_iommu.c | 95 ++++++++++++++++++++++++++++++++++++++++++++++++
> >  include/linux/of_iommu.h | 10 +++++
> >  2 files changed, 105 insertions(+)
> > 
> > diff --git a/drivers/iommu/of_iommu.c b/drivers/iommu/of_iommu.c
> > index e60e3db..ffd7fb7 100644
> > --- a/drivers/iommu/of_iommu.c
> > +++ b/drivers/iommu/of_iommu.c
> > @@ -21,6 +21,7 @@
> >  #include <linux/iommu.h>
> >  #include <linux/limits.h>
> >  #include <linux/of.h>
> > +#include <linux/of_address.h>
> >  #include <linux/of_iommu.h>
> >  #include <linux/of_pci.h>
> >  #include <linux/slab.h>
> > @@ -138,6 +139,14 @@ static int of_iommu_xlate(struct device *dev,
> >  	return ops->of_xlate(dev, iommu_spec);
> >  }
> >  
> > +static int __get_pci_rid(struct pci_dev *pdev, u16 alias, void *data)
> > +{
> > +	u32 *rid = data;
> > +
> > +	*rid = alias;
> > +	return 0;
> > +}
> > +
> >  struct of_pci_iommu_alias_info {
> >  	struct device *dev;
> >  	struct device_node *np;
> > @@ -163,6 +172,73 @@ static int of_pci_iommu_init(struct pci_dev *pdev, u16 alias, void *data)
> >  	return info->np == pdev->bus->dev.of_node;
> >  }
> >  
> > +static int of_iommu_alloc_resv_region(struct device_node *np,
> > +				      struct list_head *head)
> > +{
> > +	int prot = IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO;
> > +	struct iommu_resv_region *region;
> > +	struct resource res;
> > +	int err;
> > +
> > +	err = of_address_to_resource(np, 0, &res);
> 
> What is the rationale for registering the first region only? Surely you
> can have more than one region in an MSI controller (yes, your particular
> case is the ITS which has only one, but we're trying to do something
> generic here).
> 
> Another issue I have with this code is that it inserts all of the ITS
> MMIO in the RESV_MSI range. As long as we don't generate any page tables
> for this, we're fine. But if we ever change this, we'll end up with the
> ITS programming interface being exposed to a device, which wouldn't be
> acceptable.
> 
> Why can't we have some generic property instead, that would describe the
> actual ranges that cannot be translated? That way, no random insertion
> of a random range, and relatively future proof.

The IORT code has the same issue (i.e. it reserves all ITS regions) and I
do not know where a property could be added to describe those ranges (the
IORT specs? I'd rather not) in ACPI other than the IORT table entries
themselves.

> I'm not sure where to declare it (PCIe node? IOMMU node?), but it'd feel
> like a much nicer and simpler solution to this problem.

It could be, but if we throw ACPI into the picture we have to knock
together HiSilicon-specific namespace bindings to handle this, and
quickly.

Lorenzo

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v8 2/5] ACPI/IORT: Add msi address regions reservation helper
  2017-09-27 13:32     ` Shameer Kolothum
  (?)
@ 2017-10-04 14:17         ` Marc Zyngier
  -1 siblings, 0 replies; 59+ messages in thread
From: Marc Zyngier @ 2017-10-04 14:17 UTC (permalink / raw)
  To: Shameer Kolothum, lorenzo.pieralisi, sudeep.holla, will.deacon,
	robin.murphy, joro, mark.rutland, robh
  Cc: devicetree, gabriele.paoloni, linuxarm, linux-acpi, iommu,
	guohanjun, linux-arm-kernel, devel

On 27/09/17 14:32, Shameer Kolothum wrote:
> On some platforms msi parent address regions have to be excluded from
> normal IOVA allocation in that they are detected and decoded in a HW
> specific way by system components and so they cannot be considered normal
> IOVA address space.
> 
> Add a helper function that retrieves ITS address regions - the msi
> parent - through IORT device <-> ITS mappings and reserves it so that
> these regions will not be translated by IOMMU and will be excluded from
> IOVA allocations.
> 
> Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
> [lorenzo.pieralisi@arm.com: updated commit log/added comments]
> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
> ---
>  drivers/acpi/arm64/iort.c        | 96 ++++++++++++++++++++++++++++++++++++++--
>  drivers/irqchip/irq-gic-v3-its.c |  3 +-
>  include/linux/acpi_iort.h        |  7 ++-
>  3 files changed, 101 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/acpi/arm64/iort.c b/drivers/acpi/arm64/iort.c
> index 9565d57..14efa9d 100644
> --- a/drivers/acpi/arm64/iort.c
> +++ b/drivers/acpi/arm64/iort.c
> @@ -39,6 +39,7 @@
>  struct iort_its_msi_chip {
>  	struct list_head	list;
>  	struct fwnode_handle	*fw_node;
> +	phys_addr_t		base_addr;
>  	u32			translation_id;
>  };
>  
> @@ -136,14 +137,16 @@ typedef acpi_status (*iort_find_node_callback)
>  static DEFINE_SPINLOCK(iort_msi_chip_lock);
>  
>  /**
> - * iort_register_domain_token() - register domain token and related ITS ID
> - * to the list from where we can get it back later on.
> + * iort_register_domain_token() - register domain token along with related
> + * ITS ID and base address to the list from where we can get it back later on.
>   * @trans_id: ITS ID.
> + * @base: ITS base address.
>   * @fw_node: Domain token.
>   *
>   * Returns: 0 on success, -ENOMEM if no memory when allocating list element
>   */
> -int iort_register_domain_token(int trans_id, struct fwnode_handle *fw_node)
> +int iort_register_domain_token(int trans_id, phys_addr_t base,
> +			       struct fwnode_handle *fw_node)
>  {
>  	struct iort_its_msi_chip *its_msi_chip;
>  
> @@ -153,6 +156,7 @@ int iort_register_domain_token(int trans_id, struct fwnode_handle *fw_node)
>  
>  	its_msi_chip->fw_node = fw_node;
>  	its_msi_chip->translation_id = trans_id;
> +	its_msi_chip->base_addr = base;
>  
>  	spin_lock(&iort_msi_chip_lock);
>  	list_add(&its_msi_chip->list, &iort_msi_chip_list);
> @@ -481,6 +485,24 @@ int iort_pmsi_get_dev_id(struct device *dev, u32 *dev_id)
>  	return -ENODEV;
>  }
>  
> +static int __maybe_unused iort_find_its_base(u32 its_id, phys_addr_t *base)
> +{
> +	struct iort_its_msi_chip *its_msi_chip;
> +	bool match = false;
> +
> +	spin_lock(&iort_msi_chip_lock);
> +	list_for_each_entry(its_msi_chip, &iort_msi_chip_list, list) {
> +		if (its_msi_chip->translation_id == its_id) {
> +			*base = its_msi_chip->base_addr;
> +			match = true;
> +			break;
> +		}
> +	}
> +	spin_unlock(&iort_msi_chip_lock);
> +
> +	return match ? 0 : -ENODEV;
> +}
> +
>  /**
>   * iort_dev_find_its_id() - Find the ITS identifier for a device
>   * @dev: The device.
> @@ -639,6 +661,72 @@ int iort_add_device_replay(const struct iommu_ops *ops, struct device *dev)
>  
>  	return err;
>  }
> +
> +/**
> + * iort_iommu_msi_get_resv_regions - Reserved region driver helper
> + * @dev: Device from iommu_get_resv_regions()
> + * @head: Reserved region list from iommu_get_resv_regions()
> + *
> + * Returns: Number of reserved regions on success (0 if no associated msi
> + *          regions), appropriate error value otherwise. The ITS regions
> + *          associated with the device are the msi reserved regions.
> + */
> +int iort_iommu_msi_get_resv_regions(struct device *dev, struct list_head *head)
> +{
> +	struct acpi_iort_its_group *its;
> +	struct acpi_iort_node *node, *its_node = NULL;
> +	int i, resv = 0;
> +
> +	node = iort_find_dev_node(dev);
> +	if (!node)
> +		return -ENODEV;
> +
> +	/*
> +	 * Current logic to reserve ITS regions relies on HW topologies
> +	 * where a given PCI or named component maps its IDs to only one
> +	 * ITS group; if a PCI or named component can map its IDs to
> +	 * different ITS groups through IORT mappings this function has
> +	 * to be reworked to ensure we reserve regions for all ITS groups
> +	 * a given PCI or named component may map IDs to.
> +	 */
> +	if (dev_is_pci(dev)) {
> +		u32 rid;
> +
> +		pci_for_each_dma_alias(to_pci_dev(dev), __get_pci_rid, &rid);
> +		its_node = iort_node_map_id(node, rid, NULL, IORT_MSI_TYPE);
> +	} else {
> +		for (i = 0; i < node->mapping_count; i++) {
> +			its_node = iort_node_map_platform_id(node, NULL,
> +							 IORT_MSI_TYPE, i);
> +			if (its_node)
> +				break;
> +		}
> +	}
> +
> +	if (!its_node)
> +		return 0;
> +
> +	/* Move to ITS specific data */
> +	its = (struct acpi_iort_its_group *)its_node->node_data;
> +
> +	for (i = 0; i < its->its_count; i++) {
> +		phys_addr_t base;
> +
> +		if (!iort_find_its_base(its->identifiers[i], &base)) {
> +			int prot = IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO;
> +			struct iommu_resv_region *region;
> +
> +			region = iommu_alloc_resv_region(base, SZ_128K, prot,
> +							 IOMMU_RESV_MSI);

Same as the OF part: I strongly object to reserving the whole 128kB
range. What we really care about is the second 64kB page, and that is
what should get reserved.
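
To make the difference concrete, here is a small user-space sketch (plain
C, not kernel code). The struct and helper names are made up for
illustration and the base address used with them is arbitrary; the only
architectural facts assumed are that a GICv3 ITS occupies two 64kB frames
and that GITS_TRANSLATER sits at offset 0x10040 from the ITS base, i.e. in
the second frame.

```c
#include <stdint.h>

#define SZ_64K  0x10000ULL
#define SZ_128K 0x20000ULL

/* Hypothetical stand-in for a reserved region: start address and length. */
struct resv_range {
	uint64_t start;
	uint64_t length;
};

/* What the patch as posted reserves: the whole 128kB ITS MMIO window. */
static struct resv_range its_resv_whole(uint64_t its_base)
{
	return (struct resv_range){ .start = its_base, .length = SZ_128K };
}

/* The narrower alternative: only the second 64kB frame (translation frame). */
static struct resv_range its_resv_translation_frame(uint64_t its_base)
{
	return (struct resv_range){ .start = its_base + SZ_64K,
				    .length = SZ_64K };
}

/* Returns non-zero if addr falls inside the range. */
static int range_contains(struct resv_range r, uint64_t addr)
{
	return addr >= r.start && addr - r.start < r.length;
}
```

With the narrower range, the doorbell at base + 0x10040 is still covered,
while the control frame at the ITS base itself is left out of the
reservation.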

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 59+ messages in thread

* RE: [PATCH v8 3/5] iommu/of: Add msi address regions reservation helper
  2017-10-04 11:22       ` Marc Zyngier
  (?)
@ 2017-10-04 17:07           ` Shameerali Kolothum Thodi
  -1 siblings, 0 replies; 59+ messages in thread
From: Shameerali Kolothum Thodi @ 2017-10-04 17:07 UTC (permalink / raw)
  To: Marc Zyngier, lorenzo.pieralisi, sudeep.holla, will.deacon,
	robin.murphy, joro, mark.rutland, robh
  Cc: devicetree, Gabriele Paoloni, Linuxarm, linux-acpi, iommu,
	Guohanjun (Hanjun Guo), linux-arm-kernel, devel



> -----Original Message-----
> From: Marc Zyngier [mailto:marc.zyngier@arm.com]
> Sent: Wednesday, October 04, 2017 12:22 PM
> To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>;
> lorenzo.pieralisi@arm.com; sudeep.holla@arm.com; will.deacon@arm.com;
> robin.murphy@arm.com; joro@8bytes.org; mark.rutland@arm.com;
> robh@kernel.org
> Cc: Gabriele Paoloni <gabriele.paoloni@huawei.com>; John Garry
> <john.garry@huawei.com>; iommu@lists.linux-foundation.org;
> linux-arm-kernel@lists.infradead.org; linux-acpi@vger.kernel.org;
> devicetree@vger.kernel.org; devel@acpica.org; Linuxarm
> <linuxarm@huawei.com>; Wangzhou (B) <wangzhou1@hisilicon.com>;
> Guohanjun (Hanjun Guo) <guohanjun@huawei.com>
> Subject: Re: [PATCH v8 3/5] iommu/of: Add msi address regions reservation
> helper
> 
> On 27/09/17 14:32, Shameer Kolothum wrote:
> > From: John Garry <john.garry@huawei.com>
> >
> > On some platforms msi-controller address regions have to be excluded
> > from normal IOVA allocation in that they are detected and decoded in a
> > HW specific way by system components and so they cannot be considered
> > normal IOVA address space.
> >
> > Add a helper function that retrieves msi address regions through
> > device tree msi mapping, so that these regions will not be translated
> > by IOMMU and will be excluded from IOVA allocations.
> >
> > Signed-off-by: John Garry <john.garry@huawei.com>
> > [Shameer: Modified msi-parent retrieval logic]
> > Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
> > ---
> >  drivers/iommu/of_iommu.c | 95 ++++++++++++++++++++++++++++++++++++++++++++++++
> >  include/linux/of_iommu.h | 10 +++++
> >  2 files changed, 105 insertions(+)
> >
> > diff --git a/drivers/iommu/of_iommu.c b/drivers/iommu/of_iommu.c
> > index e60e3db..ffd7fb7 100644
> > --- a/drivers/iommu/of_iommu.c
> > +++ b/drivers/iommu/of_iommu.c
> > @@ -21,6 +21,7 @@
> >  #include <linux/iommu.h>
> >  #include <linux/limits.h>
> >  #include <linux/of.h>
> > +#include <linux/of_address.h>
> >  #include <linux/of_iommu.h>
> >  #include <linux/of_pci.h>
> >  #include <linux/slab.h>
> > @@ -138,6 +139,14 @@ static int of_iommu_xlate(struct device *dev,
> >  	return ops->of_xlate(dev, iommu_spec);
> >  }
> >
> > +static int __get_pci_rid(struct pci_dev *pdev, u16 alias, void *data)
> > +{
> > +	u32 *rid = data;
> > +
> > +	*rid = alias;
> > +	return 0;
> > +}
> > +
> >  struct of_pci_iommu_alias_info {
> >  	struct device *dev;
> >  	struct device_node *np;
> > @@ -163,6 +172,73 @@ static int of_pci_iommu_init(struct pci_dev *pdev, u16 alias, void *data)
> >  	return info->np == pdev->bus->dev.of_node;
> >  }
> >
> > +static int of_iommu_alloc_resv_region(struct device_node *np,
> > +				      struct list_head *head)
> > +{
> > +	int prot = IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO;
> > +	struct iommu_resv_region *region;
> > +	struct resource res;
> > +	int err;
> > +
> > +	err = of_address_to_resource(np, 0, &res);
> 
> What is the rationale for registering the first region only? Surely you can have
> more than one region in an MSI controller (yes, your particular case is the ITS
> which has only one, but we're trying to do something generic here).

Ok. 
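
As a rough illustration of handling every region rather than only index 0,
here is a user-space sketch. It is not the patch code: struct fake_node,
struct fake_reg and fake_address_to_resource() are hypothetical stand-ins
that only mimic the calling convention of of_address_to_resource() (0 on
success, a negative error code once the index runs past the last "reg"
entry).

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for struct resource and struct device_node. */
struct fake_reg {
	uint64_t start;
	uint64_t size;
};

struct fake_node {
	const struct fake_reg *reg;	/* "reg" entries of the node */
	size_t nr_reg;
};

/* Mock lookup: 0 on success, -EINVAL when index is out of range. */
static int fake_address_to_resource(const struct fake_node *np, size_t index,
				    struct fake_reg *res)
{
	if (index >= np->nr_reg)
		return -EINVAL;

	*res = np->reg[index];
	return 0;
}

/*
 * Walk all "reg" entries instead of stopping at the first one.
 * Copies at most max regions into out and returns how many were copied.
 */
static size_t collect_all_regions(const struct fake_node *np,
				  struct fake_reg *out, size_t max)
{
	struct fake_reg res;
	size_t n = 0;

	while (n < max && !fake_address_to_resource(np, n, &res))
		out[n++] = res;

	return n;
}
```

In the real helper, each region found this way would presumably be passed
to iommu_alloc_resv_region() and added to the list, as the posted patch
already does for the single region it looks up.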

> Another issue I have with this code is that it inserts all of the ITS MMIO in the
> RESV_MSI range. As long as we don't generate any page tables for this, we're
> fine. But if we ever change this, we'll end up with the ITS programming
> interface being exposed to a device, which wouldn't be acceptable.

I understand the concern about reserving the whole ITS MMIO region. Sorry,
just being curious: how would this be exposed to a device? Do you mean a device
could be configured to access the ITS MMIO region, and the access would fail
because there is no SMMU mapping for it?
 
> Why can't we have some generic property instead, that would describe the
> actual ranges that cannot be translated? That way, no random insertion of a
> random range, and relatively future proof.
> 
> I'm not sure where to declare it (PCIe node? IOMMU node?), but it'd feel like
> a much nicer and simpler solution to this problem.

I am not sure whether that would be seen as legitimizing the untranslated regions.
We had this discussion about describing the regions in the IORT spec, and the
answer was that it would, in a way, legitimize this and encourage future
implementations.

How about making the _msi_get_resv_regions() function very specific to the GIC
ITS, e.g. _its_msi_get_resv_regions()? Is that something we can consider?
(In fact, we had a check for "arm,gic-v3-its" in a previous version of this series.)

Thanks,
Shameer

^ permalink raw reply	[flat|nested] 59+ messages in thread

* [PATCH v8 3/5] iommu/of: Add msi address regions reservation helper
@ 2017-10-04 17:07           ` Shameerali Kolothum Thodi
  0 siblings, 0 replies; 59+ messages in thread
From: Shameerali Kolothum Thodi @ 2017-10-04 17:07 UTC (permalink / raw)
  To: linux-arm-kernel



> -----Original Message-----
> From: Marc Zyngier [mailto:marc.zyngier at arm.com]
> Sent: Wednesday, October 04, 2017 12:22 PM
> To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>;
> lorenzo.pieralisi at arm.com; sudeep.holla at arm.com; will.deacon at arm.com;
> robin.murphy at arm.com; joro at 8bytes.org; mark.rutland at arm.com;
> robh at kernel.org
> Cc: Gabriele Paoloni <gabriele.paoloni@huawei.com>; John Garry
> <john.garry@huawei.com>; iommu at lists.linux-foundation.org; linux-arm-
> kernel at lists.infradead.org; linux-acpi at vger.kernel.org;
> devicetree at vger.kernel.org; devel at acpica.org; Linuxarm
> <linuxarm@huawei.com>; Wangzhou (B) <wangzhou1@hisilicon.com>;
> Guohanjun (Hanjun Guo) <guohanjun@huawei.com>
> Subject: Re: [PATCH v8 3/5] iommu/of: Add msi address regions reservation
> helper
> 
> On 27/09/17 14:32, Shameer Kolothum wrote:
> > From: John Garry <john.garry@huawei.com>
> >
> > On some platforms msi-controller address regions have to be excluded
> > from normal IOVA allocation in that they are detected and decoded in a
> > HW specific way by system components and so they cannot be considered
> > normal IOVA address space.
> >
> > Add a helper function that retrieves msi address regions through
> > device tree msi mapping, so that these regions will not be translated
> > by IOMMU and will be excluded from IOVA allocations.
> >
> > Signed-off-by: John Garry <john.garry@huawei.com>
> > [Shameer: Modified msi-parent retrieval logic]
> > Signed-off-by: Shameer Kolothum
> <shameerali.kolothum.thodi@huawei.com>
> > ---
> >  drivers/iommu/of_iommu.c | 95
> > ++++++++++++++++++++++++++++++++++++++++++++++++
> >  include/linux/of_iommu.h | 10 +++++
> >  2 files changed, 105 insertions(+)
> >
> > diff --git a/drivers/iommu/of_iommu.c b/drivers/iommu/of_iommu.c
> index
> > e60e3db..ffd7fb7 100644
> > --- a/drivers/iommu/of_iommu.c
> > +++ b/drivers/iommu/of_iommu.c
> > @@ -21,6 +21,7 @@
> >  #include <linux/iommu.h>
> >  #include <linux/limits.h>
> >  #include <linux/of.h>
> > +#include <linux/of_address.h>
> >  #include <linux/of_iommu.h>
> >  #include <linux/of_pci.h>
> >  #include <linux/slab.h>
> > @@ -138,6 +139,14 @@ static int of_iommu_xlate(struct device *dev,
> >  	return ops->of_xlate(dev, iommu_spec);  }
> >
> > +static int __get_pci_rid(struct pci_dev *pdev, u16 alias, void *data)
> > +{
> > +	u32 *rid = data;
> > +
> > +	*rid = alias;
> > +	return 0;
> > +}
> > +
> >  struct of_pci_iommu_alias_info {
> >  	struct device *dev;
> >  	struct device_node *np;
> > @@ -163,6 +172,73 @@ static int of_pci_iommu_init(struct pci_dev *pdev,
> u16 alias, void *data)
> >  	return info->np == pdev->bus->dev.of_node;  }
> >
> > +static int of_iommu_alloc_resv_region(struct device_node *np,
> > +				      struct list_head *head)
> > +{
> > +	int prot = IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO;
> > +	struct iommu_resv_region *region;
> > +	struct resource res;
> > +	int err;
> > +
> > +	err = of_address_to_resource(np, 0, &res);
> 
> What is the rational for registering the first region only? Surely you can have
> more than one region in an MSI controller (yes, your particular case is the ITS
> which has only one, but we're trying to do something generic here).

Ok. 

> Another issue I have with this code is that it inserts all of the ITS MMIO in the
> RESV_MSI range. As long as we don't generate any page tables for this, we're
> fine. But if we ever change this, we'll end-up with the ITS programming
> interface being exposed to a device, which wouldn't be acceptable.

I understand the concern of reserving the whole of ITS MMIO region. Sorry, 
but just being curious, how this will be exposed to a  device ? You mean a device
can  be configured to access the ITS MMIO region and it will fail because there is
no SMMU mapping for that?
 
> Why can't we have some generic property instead, that would describe the
> actual ranges that cannot be translated? That way, no random insertion of a
> random range, and relatively future proof.
> 
> I'm not sure where to declare it (PCIe node? IOMMU node?), but it'd feel like
> a much nicer and simpler solution to this problem.

I am not sure that will be seen as legitimizing the untranslated regions or not.
We had this discussion to include the regions specified in IORT spec and the
answer was that, it will in a way legitimize this and encourage future
implementations.

How about making the _msi_get_resv_regions() function be very specific to GIC
ITS like _its_msi_get_resv_regions() ? Is that something we can consider?
(In fact we had a checking for "arm, gic-v3-its" in a previous version of this series).

Thanks,
Shameer

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Devel] [PATCH v8 3/5] iommu/of: Add msi address regions reservation helper
@ 2017-10-04 17:07           ` Shameerali Kolothum Thodi
  0 siblings, 0 replies; 59+ messages in thread
From: Shameerali Kolothum Thodi @ 2017-10-04 17:07 UTC (permalink / raw)
  To: devel

[-- Attachment #1: Type: text/plain, Size: 4800 bytes --]



> -----Original Message-----
> From: Marc Zyngier [mailto:marc.zyngier(a)arm.com]
> Sent: Wednesday, October 04, 2017 12:22 PM
> To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi(a)huawei.com>;
> lorenzo.pieralisi(a)arm.com; sudeep.holla(a)arm.com; will.deacon(a)arm.com;
> robin.murphy(a)arm.com; joro(a)8bytes.org; mark.rutland(a)arm.com;
> robh(a)kernel.org
> Cc: Gabriele Paoloni <gabriele.paoloni(a)huawei.com>; John Garry
> <john.garry(a)huawei.com>; iommu(a)lists.linux-foundation.org; linux-arm-
> kernel(a)lists.infradead.org; linux-acpi(a)vger.kernel.org;
> devicetree(a)vger.kernel.org; devel(a)acpica.org; Linuxarm
> <linuxarm(a)huawei.com>; Wangzhou (B) <wangzhou1(a)hisilicon.com>;
> Guohanjun (Hanjun Guo) <guohanjun(a)huawei.com>
> Subject: Re: [PATCH v8 3/5] iommu/of: Add msi address regions reservation
> helper
> 
> On 27/09/17 14:32, Shameer Kolothum wrote:
> > From: John Garry <john.garry(a)huawei.com>
> >
> > On some platforms msi-controller address regions have to be excluded
> > from normal IOVA allocation in that they are detected and decoded in a
> > HW specific way by system components and so they cannot be considered
> > normal IOVA address space.
> >
> > Add a helper function that retrieves msi address regions through
> > device tree msi mapping, so that these regions will not be translated
> > by IOMMU and will be excluded from IOVA allocations.
> >
> > Signed-off-by: John Garry <john.garry(a)huawei.com>
> > [Shameer: Modified msi-parent retrieval logic]
> > Signed-off-by: Shameer Kolothum
> <shameerali.kolothum.thodi(a)huawei.com>
> > ---
> >  drivers/iommu/of_iommu.c | 95
> > ++++++++++++++++++++++++++++++++++++++++++++++++
> >  include/linux/of_iommu.h | 10 +++++
> >  2 files changed, 105 insertions(+)
> >
> > diff --git a/drivers/iommu/of_iommu.c b/drivers/iommu/of_iommu.c
> index
> > e60e3db..ffd7fb7 100644
> > --- a/drivers/iommu/of_iommu.c
> > +++ b/drivers/iommu/of_iommu.c
> > @@ -21,6 +21,7 @@
> >  #include <linux/iommu.h>
> >  #include <linux/limits.h>
> >  #include <linux/of.h>
> > +#include <linux/of_address.h>
> >  #include <linux/of_iommu.h>
> >  #include <linux/of_pci.h>
> >  #include <linux/slab.h>
> > @@ -138,6 +139,14 @@ static int of_iommu_xlate(struct device *dev,
> >  	return ops->of_xlate(dev, iommu_spec);  }
> >
> > +static int __get_pci_rid(struct pci_dev *pdev, u16 alias, void *data)
> > +{
> > +	u32 *rid = data;
> > +
> > +	*rid = alias;
> > +	return 0;
> > +}
> > +
> >  struct of_pci_iommu_alias_info {
> >  	struct device *dev;
> >  	struct device_node *np;
> > @@ -163,6 +172,73 @@ static int of_pci_iommu_init(struct pci_dev *pdev,
> u16 alias, void *data)
> >  	return info->np == pdev->bus->dev.of_node;  }
> >
> > +static int of_iommu_alloc_resv_region(struct device_node *np,
> > +				      struct list_head *head)
> > +{
> > +	int prot = IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO;
> > +	struct iommu_resv_region *region;
> > +	struct resource res;
> > +	int err;
> > +
> > +	err = of_address_to_resource(np, 0, &res);
> 
> What is the rationale for registering the first region only? Surely you can have
> more than one region in an MSI controller (yes, your particular case is the ITS
> which has only one, but we're trying to do something generic here).

Ok. 
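
One way to cover every region, sketched here in plain user-space C rather
than against the real OF API: walk the node's address entries until the
lookup fails, instead of reading index 0 only. The names below (struct res,
struct node, node_address_to_resource, collect_resv_regions) are
hypothetical stand-ins; node_address_to_resource only mimics the way
of_address_to_resource() returns an error once the index runs past the last
entry.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct resource: inclusive [start, end]. */
struct res {
	uint64_t start;
	uint64_t end;
};

/* Hypothetical stand-in for a device node and its "reg" entries. */
struct node {
	const struct res *regs;
	size_t nregs;
};

/* Mimics of_address_to_resource(): 0 on success, -1 past the last entry. */
static int node_address_to_resource(const struct node *np, size_t index,
				    struct res *out)
{
	if (index >= np->nregs)
		return -1;
	*out = np->regs[index];
	return 0;
}

/* Collect every region of the node; returns how many were stored. */
static size_t collect_resv_regions(const struct node *np, struct res *out,
				   size_t max)
{
	size_t n = 0;
	struct res r;

	assert(np && out);
	while (n < max && node_address_to_resource(np, n, &r) == 0)
		out[n++] = r;
	return n;
}
```

This only addresses the "first region only" point; which of those regions
should actually be reserved is a separate question.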

> Another issue I have with this code is that it inserts all of the ITS MMIO in the
> RESV_MSI range. As long as we don't generate any page tables for this, we're
> fine. But if we ever change this, we'll end-up with the ITS programming
> interface being exposed to a device, which wouldn't be acceptable.

I understand the concern about reserving the whole of the ITS MMIO region.
Sorry, just being curious: how would this be exposed to a device? Do you mean
a device could be configured to access the ITS MMIO region, and the access
would fail because there is no SMMU mapping for it?
 
> Why can't we have some generic property instead, that would describe the
> actual ranges that cannot be translated? That way, no random insertion of a
> random range, and relatively future proof.
> 
> I'm not sure where to declare it (PCIe node? IOMMU node?), but it'd feel like
> a much nicer and simpler solution to this problem.

I am not sure whether that would be seen as legitimizing the untranslated
regions. We had this discussion about including the regions in the IORT spec,
and the answer was that doing so would, in a way, legitimize this and
encourage future implementations.

How about making the _msi_get_resv_regions() function specific to the GIC
ITS, e.g. _its_msi_get_resv_regions()? Is that something we can consider?
(In fact, we had a check for "arm,gic-v3-its" in a previous version of this series.)

Thanks,
Shameer



^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v8 3/5] iommu/of: Add msi address regions reservation helper
  2017-10-04 13:50           ` Lorenzo Pieralisi
  (?)
@ 2017-10-05 11:07             ` Robin Murphy
  -1 siblings, 0 replies; 59+ messages in thread
From: Robin Murphy @ 2017-10-05 11:07 UTC (permalink / raw)
  To: Lorenzo Pieralisi, Marc Zyngier
  Cc: Shameer Kolothum, sudeep.holla, will.deacon, joro, mark.rutland,
	robh, gabriele.paoloni, john.garry, iommu, linux-arm-kernel,
	linux-acpi, devicetree, devel, linuxarm, wangzhou1, guohanjun

On 04/10/17 14:50, Lorenzo Pieralisi wrote:
> On Wed, Oct 04, 2017 at 12:22:04PM +0100, Marc Zyngier wrote:
>> On 27/09/17 14:32, Shameer Kolothum wrote:
>>> From: John Garry <john.garry@huawei.com>
>>>
>>> On some platforms msi-controller address regions have to be excluded
>>> from normal IOVA allocation in that they are detected and decoded in
>>> a HW specific way by system components and so they cannot be considered
>>> normal IOVA address space.
>>>
>>> Add a helper function that retrieves msi address regions through device
>>> tree msi mapping, so that these regions will not be translated by IOMMU
>>> and will be excluded from IOVA allocations.
>>>
>>> Signed-off-by: John Garry <john.garry@huawei.com>
>>> [Shameer: Modified msi-parent retrieval logic]
>>> Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
>>> ---
>>>  drivers/iommu/of_iommu.c | 95 ++++++++++++++++++++++++++++++++++++++++++++++++
>>>  include/linux/of_iommu.h | 10 +++++
>>>  2 files changed, 105 insertions(+)
>>>
>>> diff --git a/drivers/iommu/of_iommu.c b/drivers/iommu/of_iommu.c
>>> index e60e3db..ffd7fb7 100644
>>> --- a/drivers/iommu/of_iommu.c
>>> +++ b/drivers/iommu/of_iommu.c
>>> @@ -21,6 +21,7 @@
>>>  #include <linux/iommu.h>
>>>  #include <linux/limits.h>
>>>  #include <linux/of.h>
>>> +#include <linux/of_address.h>
>>>  #include <linux/of_iommu.h>
>>>  #include <linux/of_pci.h>
>>>  #include <linux/slab.h>
>>> @@ -138,6 +139,14 @@ static int of_iommu_xlate(struct device *dev,
>>>  	return ops->of_xlate(dev, iommu_spec);
>>>  }
>>>  
>>> +static int __get_pci_rid(struct pci_dev *pdev, u16 alias, void *data)
>>> +{
>>> +	u32 *rid = data;
>>> +
>>> +	*rid = alias;
>>> +	return 0;
>>> +}
>>> +
>>>  struct of_pci_iommu_alias_info {
>>>  	struct device *dev;
>>>  	struct device_node *np;
>>> @@ -163,6 +172,73 @@ static int of_pci_iommu_init(struct pci_dev *pdev, u16 alias, void *data)
>>>  	return info->np == pdev->bus->dev.of_node;
>>>  }
>>>  
>>> +static int of_iommu_alloc_resv_region(struct device_node *np,
>>> +				      struct list_head *head)
>>> +{
>>> +	int prot = IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO;
>>> +	struct iommu_resv_region *region;
>>> +	struct resource res;
>>> +	int err;
>>> +
>>> +	err = of_address_to_resource(np, 0, &res);
>>
>> What is the rationale for registering the first region only? Surely you
>> can have more than one region in an MSI controller (yes, your particular
>> case is the ITS which has only one, but we're trying to do something
>> generic here).
>>
>> Another issue I have with this code is that it inserts all of the ITS
>> MMIO in the RESV_MSI range. As long as we don't generate any page tables
>> for this, we're fine. But if we ever change this, we'll end-up with the
>> ITS programming interface being exposed to a device, which wouldn't be
>> acceptable.
>>
>> Why can't we have some generic property instead, that would describe the
>> actual ranges that cannot be translated? That way, no random insertion
>> of a random range, and relatively future proof.

Indeed. Furthermore, IORT has the advantage of being limited to a few
distinct device types and SBSA-compliant system topologies, so the
ITS-chasing is reasonable there (modulo only reserving GITS_TRANSLATER).
The scope of DT covers more embedded things as well like PCI host
controllers with internal MSI doorbells, and potentially even
direct-mapped memory regions for things like bootloader framebuffers to
prevent display glitches - a generic address/size/flags description of a
region could work for just about everything.
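
To make the GITS_TRANSLATER point concrete, here is a minimal user-space
sketch (not part of the patch) that derives the single 64K frame to reserve
from an ITS base address. It assumes the GICv3 ITS layout in which the
translation register sits at offset 0x10040 from a 64K-aligned ITS base;
struct resv_region and its_translater_frame() are made-up names.

```c
#include <assert.h>
#include <stdint.h>

#define SZ_64K          0x10000ULL
#define GITS_TRANSLATER 0x10040ULL /* offset from the ITS base (assumed layout) */

/* Hypothetical address/size description of one reserved region. */
struct resv_region {
	uint64_t start;
	uint64_t length;
};

/*
 * Return the 64K frame that contains GITS_TRANSLATER, so that the control
 * frame at the ITS base itself is not part of the reservation.
 */
static struct resv_region its_translater_frame(uint64_t its_base)
{
	struct resv_region r;

	assert((its_base & (SZ_64K - 1)) == 0); /* assumption: 64K-aligned base */
	r.start = (its_base + GITS_TRANSLATER) & ~(SZ_64K - 1);
	r.length = SZ_64K;
	return r;
}
```

With a hypothetical ITS at 0x12340000 this yields a 64K region starting at
0x12350000, leaving the programming interface at the base address out of the
reservation.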

> IORT code has the same issue (ie it reserves all ITS regions) and I do
> not know where a property can be added to describe those ranges (IORT
> specs ? I'd rather not) in ACPI other than the IORT tables entries
> themselves.
> 
>> I'm not sure where to declare it (PCIe node? IOMMU node?), but it'd feel
>> like a much nicer and simpler solution to this problem.
> 
> It could be but if we throw ACPI into the picture we have to knock
> together Hisilicon specific namespace bindings to handle this and
> quickly.

As above I'm happy with the ITS-specific solution for ACPI given the
limits of IORT. What I had in mind for DT was something tied in with the
generic IOMMU bindings. Something like this is probably easiest to
handle, but does rather spread the information around:


  pci {
  	...
  	iommu-map = <0 0 &iommu1 0x10000>;
  	iommu-reserved-ranges = <0x12340000 0x1000 IOMMU_MSI>,
  				<0x34560000 0x1000 IOMMU_MSI>;
  };

  display {
  	...
  	iommus = <&iommu1 0x20000>;
  	/* simplefb region */
  	iommu-reserved-ranges = <0x80080000 0x80000 IOMMU_DIRECT>;
  };
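
For illustration only, decoding the three-cell form above could look roughly
like this in user-space C. It assumes one address cell and one size cell per
entry, as in the example; the flag values, struct resv_range and
parse_reserved_ranges() are hypothetical, since no such binding exists yet.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag encodings; nothing in this thread defines them. */
enum {
	IOMMU_MSI = 1,
	IOMMU_DIRECT = 2,
};

struct resv_range {
	uint32_t addr;
	uint32_t size;
	uint32_t flags;
};

/*
 * Decode <addr size flags> triplets from a flat cell array.
 * Returns the number of ranges, or -1 if the cell count is not a
 * multiple of three or the output array is too small.
 */
static int parse_reserved_ranges(const uint32_t *cells, size_t ncells,
				 struct resv_range *out, size_t max)
{
	size_t i, n = ncells / 3;

	assert(cells && out);
	if (ncells % 3 || n > max)
		return -1;

	for (i = 0; i < n; i++) {
		out[i].addr = cells[3 * i];
		out[i].size = cells[3 * i + 1];
		out[i].flags = cells[3 * i + 2];
	}
	return (int)n;
}
```

A real implementation would have to honour the parent's #address-cells and
#size-cells rather than hard-coding one cell each.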


Alternatively, something inspired by reserved-memory might perhaps be
conceptually neater, at the risk of being more complicated:


  iommu1: iommu@acbd0000 {
  	...
  	#iommu-cells = <1>;

  	iommu-reserved-ranges {
  		#address-cells = <1>;
  		#size-cells = <1>;

  		its0_resv: msi@12340000 {
  			compatible = "iommu-msi-region";
  			reg = <0x12340000 0x1000>;
  		};

  		its1_resv: msi@34560000 {
  			compatible = "iommu-msi-region";
  			reg = <0x34560000 0x1000>;
  		};

  		fb_resv: direct@80080000 {
  			compatible = "iommu-direct-region";
  			reg = <0x80080000 0x80000>;
  		};
  	};
  };


DT folks - any opinions?

Robin.

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v8 3/5] iommu/of: Add msi address regions reservation helper
  2017-10-05 11:07             ` Robin Murphy
  (?)
@ 2017-10-05 11:57                 ` Will Deacon
  -1 siblings, 0 replies; 59+ messages in thread
From: Will Deacon @ 2017-10-05 11:57 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Lorenzo Pieralisi, Marc Zyngier, Shameer Kolothum, sudeep.holla,
	joro, mark.rutland, robh, gabriele.paoloni, john.garry, iommu,
	linux-arm-kernel, linux-acpi, devicetree, devel, linuxarm, wangzhou1,
	guohanjun

On Thu, Oct 05, 2017 at 12:07:26PM +0100, Robin Murphy wrote:
> On 04/10/17 14:50, Lorenzo Pieralisi wrote:
> > On Wed, Oct 04, 2017 at 12:22:04PM +0100, Marc Zyngier wrote:
> >> On 27/09/17 14:32, Shameer Kolothum wrote:
> >>> From: John Garry <john.garry@huawei.com>
> >>>
> >>> On some platforms msi-controller address regions have to be excluded
> >>> from normal IOVA allocation in that they are detected and decoded in
> >>> a HW specific way by system components and so they cannot be considered
> >>> normal IOVA address space.
> >>>
> >>> Add a helper function that retrieves msi address regions through device
> >>> tree msi mapping, so that these regions will not be translated by IOMMU
> >>> and will be excluded from IOVA allocations.
> >>>
> >>> Signed-off-by: John Garry <john.garry@huawei.com>
> >>> [Shameer: Modified msi-parent retrieval logic]
> >>> Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
> >>> ---
> >>>  drivers/iommu/of_iommu.c | 95 ++++++++++++++++++++++++++++++++++++++++++++++++
> >>>  include/linux/of_iommu.h | 10 +++++
> >>>  2 files changed, 105 insertions(+)
> >>>
> >>> diff --git a/drivers/iommu/of_iommu.c b/drivers/iommu/of_iommu.c
> >>> index e60e3db..ffd7fb7 100644
> >>> --- a/drivers/iommu/of_iommu.c
> >>> +++ b/drivers/iommu/of_iommu.c
> >>> @@ -21,6 +21,7 @@
> >>>  #include <linux/iommu.h>
> >>>  #include <linux/limits.h>
> >>>  #include <linux/of.h>
> >>> +#include <linux/of_address.h>
> >>>  #include <linux/of_iommu.h>
> >>>  #include <linux/of_pci.h>
> >>>  #include <linux/slab.h>
> >>> @@ -138,6 +139,14 @@ static int of_iommu_xlate(struct device *dev,
> >>>  	return ops->of_xlate(dev, iommu_spec);
> >>>  }
> >>>  
> >>> +static int __get_pci_rid(struct pci_dev *pdev, u16 alias, void *data)
> >>> +{
> >>> +	u32 *rid = data;
> >>> +
> >>> +	*rid = alias;
> >>> +	return 0;
> >>> +}
> >>> +
> >>>  struct of_pci_iommu_alias_info {
> >>>  	struct device *dev;
> >>>  	struct device_node *np;
> >>> @@ -163,6 +172,73 @@ static int of_pci_iommu_init(struct pci_dev *pdev, u16 alias, void *data)
> >>>  	return info->np == pdev->bus->dev.of_node;
> >>>  }
> >>>  
> >>> +static int of_iommu_alloc_resv_region(struct device_node *np,
> >>> +				      struct list_head *head)
> >>> +{
> >>> +	int prot = IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO;
> >>> +	struct iommu_resv_region *region;
> >>> +	struct resource res;
> >>> +	int err;
> >>> +
> >>> +	err = of_address_to_resource(np, 0, &res);
> >>
> >> What is the rationale for registering the first region only? Surely you
> >> can have more than one region in an MSI controller (yes, your particular
> >> case is the ITS which has only one, but we're trying to do something
> >> generic here).
> >>
> >> Another issue I have with this code is that it inserts all of the ITS
> >> MMIO in the RESV_MSI range. As long as we don't generate any page tables
> >> for this, we're fine. But if we ever change this, we'll end-up with the
> >> ITS programming interface being exposed to a device, which wouldn't be
> >> acceptable.
> >>
> >> Why can't we have some generic property instead, that would describe the
> >> actual ranges that cannot be translated? That way, no random insertion
> >> of a random range, and relatively future proof.
> 
> Indeed. Furthermore, IORT has the advantage of being limited to a few
> distinct device types and SBSA-compliant system topologies, so the
> ITS-chasing is reasonable there (modulo only reserving GITS_TRANSLATER).
> The scope of DT covers more embedded things as well like PCI host
> controllers with internal MSI doorbells, and potentially even
> direct-mapped memory regions for things like bootloader framebuffers to
> prevent display glitches - a generic address/size/flags description of a
> region could work for just about everything.
> 
> > IORT code has the same issue (ie it reserves all ITS regions) and I do
> > not know where a property can be added to describe those ranges (IORT
> > specs ? I'd rather not) in ACPI other than the IORT tables entries
> > themselves.
> > 
> >> I'm not sure where to declare it (PCIe node? IOMMU node?), but it'd feel
> >> like a much nicer and simpler solution to this problem.
> > 
> > It could be but if we throw ACPI into the picture we have to knock
> > together Hisilicon specific namespace bindings to handle this and
> > quickly.
> 
> As above I'm happy with the ITS-specific solution for ACPI given the
> limits of IORT. What I had in mind for DT was something tied in with the
> generic IOMMU bindings. Something like this is probably easiest to
> handle, but does rather spread the information around:
> 
> 
>   pci {
>   	...
>   	iommu-map = <0 0 &iommu1 0x10000>;
>   	iommu-reserved-ranges = <0x12340000 0x1000 IOMMU_MSI>,
>   				<0x34560000 0x1000 IOMMU_MSI>;
>   };
> 
>   display {
>   	...
>   	iommus = <&iommu1 0x20000>;
>   	/* simplefb region */
>   	iommu-reserved-ranges = <0x80080000 0x80000 IOMMU_DIRECT>;
>   };
> 
> 
> Alternatively, something inspired by reserved-memory might perhaps be
> conceptually neater, at the risk of being more complicated:
> 
> 
>   iommu1: iommu@acbd0000 {
>   	...
>   	#iommu-cells = <1>;
> 
>   	iommu-reserved-ranges {
>   		#address-cells = <1>;
>   		#size-cells = <1>;
> 
> 		its0_resv: msi@12340000 {
>   			compatible = "iommu-msi-region";
>   			reg = <0x12340000 0x1000>;
>   		};
> 
> 		its1_resv: msi@34560000 {
>   			compatible = "iommu-msi-region";
>   			reg = <0x34560000 0x1000>;
>   		};
> 
>   		fb_resv: direct@80080000 {
>   			compatible = "iommu-direct-region";
>   			reg = <0x80080000 0x80000>;
>   		};
>   	};
>   };

I like the locality of this, but is it definitely flexible enough? Do we
need to deal with systems where the reserved regions are specific to the
master (i.e. TBU) as opposed to the entire SMMU block?

Will

^ permalink raw reply	[flat|nested] 59+ messages in thread

* [PATCH v8 3/5] iommu/of: Add msi address regions reservation helper
@ 2017-10-05 11:57                 ` Will Deacon
  0 siblings, 0 replies; 59+ messages in thread
From: Will Deacon @ 2017-10-05 11:57 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Oct 05, 2017 at 12:07:26PM +0100, Robin Murphy wrote:
> On 04/10/17 14:50, Lorenzo Pieralisi wrote:
> > On Wed, Oct 04, 2017 at 12:22:04PM +0100, Marc Zyngier wrote:
> >> On 27/09/17 14:32, Shameer Kolothum wrote:
> >>> From: John Garry <john.garry@huawei.com>
> >>>
> >>> On some platforms msi-controller address regions have to be excluded
> >>> from normal IOVA allocation in that they are detected and decoded in
> >>> a HW specific way by system components and so they cannot be considered
> >>> normal IOVA address space.
> >>>
> >>> Add a helper function that retrieves msi address regions through device
> >>> tree msi mapping, so that these regions will not be translated by IOMMU
> >>> and will be excluded from IOVA allocations.
> >>>
> >>> Signed-off-by: John Garry <john.garry@huawei.com>
> >>> [Shameer: Modified msi-parent retrieval logic]
> >>> Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
> >>> ---
> >>>  drivers/iommu/of_iommu.c | 95 ++++++++++++++++++++++++++++++++++++++++++++++++
> >>>  include/linux/of_iommu.h | 10 +++++
> >>>  2 files changed, 105 insertions(+)
> >>>
> >>> diff --git a/drivers/iommu/of_iommu.c b/drivers/iommu/of_iommu.c
> >>> index e60e3db..ffd7fb7 100644
> >>> --- a/drivers/iommu/of_iommu.c
> >>> +++ b/drivers/iommu/of_iommu.c
> >>> @@ -21,6 +21,7 @@
> >>>  #include <linux/iommu.h>
> >>>  #include <linux/limits.h>
> >>>  #include <linux/of.h>
> >>> +#include <linux/of_address.h>
> >>>  #include <linux/of_iommu.h>
> >>>  #include <linux/of_pci.h>
> >>>  #include <linux/slab.h>
> >>> @@ -138,6 +139,14 @@ static int of_iommu_xlate(struct device *dev,
> >>>  	return ops->of_xlate(dev, iommu_spec);
> >>>  }
> >>>  
> >>> +static int __get_pci_rid(struct pci_dev *pdev, u16 alias, void *data)
> >>> +{
> >>> +	u32 *rid = data;
> >>> +
> >>> +	*rid = alias;
> >>> +	return 0;
> >>> +}
> >>> +
> >>>  struct of_pci_iommu_alias_info {
> >>>  	struct device *dev;
> >>>  	struct device_node *np;
> >>> @@ -163,6 +172,73 @@ static int of_pci_iommu_init(struct pci_dev *pdev, u16 alias, void *data)
> >>>  	return info->np == pdev->bus->dev.of_node;
> >>>  }
> >>>  
> >>> +static int of_iommu_alloc_resv_region(struct device_node *np,
> >>> +				      struct list_head *head)
> >>> +{
> >>> +	int prot = IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO;
> >>> +	struct iommu_resv_region *region;
> >>> +	struct resource res;
> >>> +	int err;
> >>> +
> >>> +	err = of_address_to_resource(np, 0, &res);
> >>
> >> What is the rational for registering the first region only? Surely you
> >> can have more than one region in an MSI controller (yes, your particular
> >> case is the ITS which has only one, but we're trying to do something
> >> generic here).
> >>
> >> Another issue I have with this code is that it inserts all of the ITS
> >> MMIO in the RESV_MSI range. As long as we don't generate any page tables
> >> for this, we're fine. But if we ever change this, we'll end-up with the
> >> ITS programming interface being exposed to a device, which wouldn't be
> >> acceptable.
> >>
> >> Why can't we have some generic property instead, that would describe the
> >> actual ranges that cannot be translated? That way, no random insertion
> >> of a random range, and relatively future proof.
> 
> Indeed. Furthermore, IORT has the advantage of being limited to a few
> distinct device types and SBSA-compliant system topologies, so the
> ITS-chasing is reasonable there (modulo only reserving GITS_TRANSLATER).
> The scope of DT covers more embedded things as well like PCI host
> controllers with internal MSI doorbells, and potentially even
> direct-mapped memory regions for things like bootloader framebuffers to
> prevent display glitches - a generic address/size/flags description of a
> region could work for just about everything.
> 
> > IORT code has the same issue (ie it reserves all ITS regions) and I do
> > not know where a property can be added to describe those ranges (IORT
> > specs ? I'd rather not) in ACPI other than the IORT tables entries
> > themselves.
> > 
> >> I'm not sure where to declare it (PCIe node? IOMMU node?), but it'd feel
> >> like a much nicer and simpler solution to this problem.
> > 
> > It could be but if we throw ACPI into the picture we have to knock
> > together Hisilicon specific namespace bindings to handle this and
> > quickly.
> 
> As above I'm happy with the ITS-specific solution for ACPI given the
> limits of IORT. What I had in mind for DT was something tied in with the
> generic IOMMU bindings. Something like this is probably easiest to
> handle, but does rather spread the information around:
> 
> 
>   pci {
>   	...
>   	iommu-map = <0 0 &iommu1 0x10000>;
>   	iommu-reserved-ranges = <0x12340000 0x1000 IOMMU_MSI>,
>   				<0x34560000 0x1000 IOMMU_MSI>;
>   };
> 
>   display {
>   	...
>   	iommus = <&iommu1 0x20000>;
>   	/* simplefb region */
>   	iommu-reserved-ranges = <0x80080000 0x80000 IOMMU_DIRECT>,
>   };
> 
> 
> Alternatively, something inspired by reserved-memory might perhaps be
> conceptually neater, at the risk of being more complicated:
> 
> 
>   iommu1: iommu at acbd0000 {
>   	...
>   	#iommu-cells = <1>;
> 
>   	iommu-reserved-ranges {
>   		#address-cells = <1>;
>   		#size-cells = <1>;
> 
> 		its0_resv: msi at 12340000 {
>   			compatible = "iommu-msi-region";
>   			reg = <0x12340000 0x1000>;
>   		};
> 
> 		its1_resv: msi at 34560000 {
>   			compatible = "iommu-msi-region";
>   			reg = <0x34560000 0x1000>;
>   		};
> 
> 		fb_resv: direct at 12340000 {
>   			compatible = "iommu-direct-region";
>   			reg = <0x80080000 0x80000>;
>   		};
>   	};
>   };

I like the locality of this, but is it definitely flexible enough? Do we
need to deal with systems where the reserved regions are specific to the
master (i.e. TBU) as opposed to the entire SMMU block?

Will
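
A minimal sketch of the distinction raised above, assuming a hypothetical
per-region "master" field (a stream ID, or a wildcard meaning the region
applies to the whole SMMU). None of these names come from the patch or from
any existing binding; they only illustrate what a per-master lookup would
have to do.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical wildcard: the region applies to every master behind the IOMMU. */
#define RESV_ANY_MASTER UINT32_MAX

enum resv_type { RESV_MSI, RESV_DIRECT };

struct resv_region {
	uint64_t start;
	uint64_t length;
	enum resv_type type;
	uint32_t master;	/* stream ID, or RESV_ANY_MASTER */
};

/*
 * Copy into out[] every region that applies to the given master:
 * IOMMU-wide regions plus regions tagged with that master's ID.
 * Returns the number of regions copied (at most out_cap).
 */
static size_t resv_regions_for_master(const struct resv_region *all, size_t n,
				      uint32_t master,
				      struct resv_region *out, size_t out_cap)
{
	size_t count = 0;

	for (size_t i = 0; i < n && count < out_cap; i++) {
		if (all[i].master == RESV_ANY_MASTER || all[i].master == master)
			out[count++] = all[i];
	}
	return count;
}
```

With an IOMMU-wide MSI region and a framebuffer region tagged for one master,
a lookup for any other master returns only the MSI region. A description kept
solely under the IOMMU node could not express that difference without some
equivalent of the master field.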

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v8 3/5] iommu/of: Add msi address regions reservation helper
  2017-10-05 11:07             ` Robin Murphy
  (?)
@ 2017-10-05 12:37               ` John Garry
  -1 siblings, 0 replies; 59+ messages in thread
From: John Garry @ 2017-10-05 12:37 UTC (permalink / raw)
  To: Robin Murphy, Lorenzo Pieralisi, Marc Zyngier
  Cc: Shameer Kolothum, sudeep.holla, will.deacon, joro, mark.rutland,
	robh, gabriele.paoloni, iommu, linux-arm-kernel, linux-acpi,
	devicetree, devel, linuxarm, wangzhou1, guohanjun

On 05/10/2017 12:07, Robin Murphy wrote:
> On 04/10/17 14:50, Lorenzo Pieralisi wrote:
>> On Wed, Oct 04, 2017 at 12:22:04PM +0100, Marc Zyngier wrote:
>>> On 27/09/17 14:32, Shameer Kolothum wrote:
>>>> From: John Garry <john.garry@huawei.com>
>>>>
>>>> On some platforms msi-controller address regions have to be excluded
>>>> from normal IOVA allocation in that they are detected and decoded in
>>>> a HW specific way by system components and so they cannot be considered
>>>> normal IOVA address space.
>>>>
>>>> Add a helper function that retrieves msi address regions through device
>>>> tree msi mapping, so that these regions will not be translated by IOMMU
>>>> and will be excluded from IOVA allocations.
>>>>
>>>> Signed-off-by: John Garry <john.garry@huawei.com>
>>>> [Shameer: Modified msi-parent retrieval logic]
>>>> Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
>>>> ---
>>>>  drivers/iommu/of_iommu.c | 95 ++++++++++++++++++++++++++++++++++++++++++++++++
>>>>  include/linux/of_iommu.h | 10 +++++
>>>>  2 files changed, 105 insertions(+)
>>>>
>>>> diff --git a/drivers/iommu/of_iommu.c b/drivers/iommu/of_iommu.c
>>>> index e60e3db..ffd7fb7 100644
>>>> --- a/drivers/iommu/of_iommu.c
>>>> +++ b/drivers/iommu/of_iommu.c
>>>> @@ -21,6 +21,7 @@
>>>>  #include <linux/iommu.h>
>>>>  #include <linux/limits.h>
>>>>  #include <linux/of.h>
>>>> +#include <linux/of_address.h>
>>>>  #include <linux/of_iommu.h>
>>>>  #include <linux/of_pci.h>
>>>>  #include <linux/slab.h>
>>>> @@ -138,6 +139,14 @@ static int of_iommu_xlate(struct device *dev,
>>>>  	return ops->of_xlate(dev, iommu_spec);
>>>>  }
>>>>
>>>> +static int __get_pci_rid(struct pci_dev *pdev, u16 alias, void *data)
>>>> +{
>>>> +	u32 *rid = data;
>>>> +
>>>> +	*rid = alias;
>>>> +	return 0;
>>>> +}
>>>> +
>>>>  struct of_pci_iommu_alias_info {
>>>>  	struct device *dev;
>>>>  	struct device_node *np;
>>>> @@ -163,6 +172,73 @@ static int of_pci_iommu_init(struct pci_dev *pdev, u16 alias, void *data)
>>>>  	return info->np == pdev->bus->dev.of_node;
>>>>  }
>>>>
>>>> +static int of_iommu_alloc_resv_region(struct device_node *np,
>>>> +				      struct list_head *head)
>>>> +{
>>>> +	int prot = IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO;
>>>> +	struct iommu_resv_region *region;
>>>> +	struct resource res;
>>>> +	int err;
>>>> +
>>>> +	err = of_address_to_resource(np, 0, &res);
>>>
>>> What is the rationale for registering the first region only? Surely you
>>> can have more than one region in an MSI controller (yes, your particular
>>> case is the ITS which has only one, but we're trying to do something
>>> generic here).
>>>
>>> Another issue I have with this code is that it inserts all of the ITS
>>> MMIO in the RESV_MSI range. As long as we don't generate any page tables
>>> for this, we're fine. But if we ever change this, we'll end-up with the
>>> ITS programming interface being exposed to a device, which wouldn't be
>>> acceptable.
>>>
>>> Why can't we have some generic property instead, that would describe the
>>> actual ranges that cannot be translated? That way, no random insertion
>>> of a random range, and relatively future proof.
>
> Indeed. Furthermore, IORT has the advantage of being limited to a few
> distinct device types and SBSA-compliant system topologies, so the
> ITS-chasing is reasonable there (modulo only reserving GITS_TRANSLATER).
> The scope of DT covers more embedded things as well like PCI host
> controllers with internal MSI doorbells, and potentially even
> direct-mapped memory regions for things like bootloader framebuffers to
> prevent display glitches - a generic address/size/flags description of a
> region could work for just about everything.
>
>> IORT code has the same issue (ie it reserves all ITS regions) and I do
>> not know where a property can be added to describe those ranges (IORT
>> specs ? I'd rather not) in ACPI other than the IORT tables entries
>> themselves.
>>
>>> I'm not sure where to declare it (PCIe node? IOMMU node?), but it'd feel
>>> like a much nicer and simpler solution to this problem.
>>
>> It could be but if we throw ACPI into the picture we have to knock
>> together Hisilicon specific namespace bindings to handle this and
>> quickly.
>
> As above I'm happy with the ITS-specific solution for ACPI given the
> limits of IORT. What I had in mind for DT was something tied in with the
> generic IOMMU bindings. Something like this is probably easiest to
> handle, but does rather spread the information around:
>
>
>   pci {
>   	...
>   	iommu-map = <0 0 &iommu1 0x10000>;
>   	iommu-reserved-ranges = <0x12340000 0x1000 IOMMU_MSI>,
>   				<0x34560000 0x1000 IOMMU_MSI>;
>   };
>
>   display {
>   	...
>   	iommus = <&iommu1 0x20000>;
>   	/* simplefb region */
>   	iommu-reserved-ranges = <0x80080000 0x80000 IOMMU_DIRECT>;
>   };
>
>
> Alternatively, something inspired by reserved-memory might perhaps be
> conceptually neater, at the risk of being more complicated:
>
>
>   iommu1: iommu@acbd0000 {
>   	...
>   	#iommu-cells = <1>;
>
>   	iommu-reserved-ranges {
>   		#address-cells = <1>;
>   		#size-cells = <1>;
>
> 		its0_resv: msi@12340000 {
>   			compatible = "iommu-msi-region";
>   			reg = <0x12340000 0x1000>;
>   		};
>
> 		its1_resv: msi@34560000 {
>   			compatible = "iommu-msi-region";
>   			reg = <0x34560000 0x1000>;
>   		};
>
> 		fb_resv: direct@80080000 {
>   			compatible = "iommu-direct-region";
>   			reg = <0x80080000 0x80000>;
>   		};
>   	};
>   };
>
>

If we did this, wouldn't it be easier to create dangerous reserved 
regions, like our ITS region which Marc was concerned by? It's not so 
hard to get dts changes into the kernel.

John
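
A minimal sketch of the kind of sanity check this concern suggests, assuming
the standard GICv3 ITS layout: a 64KB control frame at the ITS base, followed
by a 64KB translation frame holding the 32-bit GITS_TRANSLATER doorbell at
offset 0x10040 from the base. The helper names are hypothetical, and address
overflow is not handled.

```c
#include <stdbool.h>
#include <stdint.h>

#define ITS_FRAME_SIZE		0x10000ULL	/* each ITS frame is 64KB */
#define ITS_TRANSLATER_OFFSET	0x10040ULL	/* GITS_TRANSLATER, from the ITS base */

/*
 * True if [start, start + len) overlaps the ITS control frame
 * [its_base, its_base + 64KB), i.e. the programming interface
 * that must never become reachable by a device.
 */
static bool resv_exposes_its_control(uint64_t its_base, uint64_t start,
				     uint64_t len)
{
	return start < its_base + ITS_FRAME_SIZE && its_base < start + len;
}

/* True if the range fully contains the 4-byte GITS_TRANSLATER doorbell. */
static bool resv_covers_doorbell(uint64_t its_base, uint64_t start,
				 uint64_t len)
{
	uint64_t db = its_base + ITS_TRANSLATER_OFFSET;

	return start <= db && db + 4 <= start + len;
}
```

A region that covers the doorbell without touching the control frame is the
safe case. Reserving the whole ITS MMIO range satisfies the first check but
fails the second, which is the situation Marc objected to.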

> DT folks - any opinions?
>
> Robin.
>
> .
>



^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v8 3/5] iommu/of: Add msi address regions reservation helper
  2017-10-05 12:37               ` John Garry
  (?)
@ 2017-10-05 12:44                   ` Robin Murphy
  -1 siblings, 0 replies; 59+ messages in thread
From: Robin Murphy @ 2017-10-05 12:44 UTC (permalink / raw)
  To: John Garry, Lorenzo Pieralisi, Marc Zyngier
  Cc: mark.rutland, robh, gabriele.paoloni, devicetree, guohanjun,
	will.deacon, linuxarm, Shameer Kolothum, linux-acpi, iommu,
	sudeep.holla, linux-arm-kernel, devel

On 05/10/17 13:37, John Garry wrote:
> On 05/10/2017 12:07, Robin Murphy wrote:
>> On 04/10/17 14:50, Lorenzo Pieralisi wrote:
>>> On Wed, Oct 04, 2017 at 12:22:04PM +0100, Marc Zyngier wrote:
>>>> On 27/09/17 14:32, Shameer Kolothum wrote:
>>>>> From: John Garry <john.garry@huawei.com>
>>>>>
>>>>> On some platforms msi-controller address regions have to be excluded
>>>>> from normal IOVA allocation in that they are detected and decoded in
>>>>> a HW specific way by system components and so they cannot be
>>>>> considered
>>>>> normal IOVA address space.
>>>>>
>>>>> Add a helper function that retrieves msi address regions through
>>>>> device
>>>>> tree msi mapping, so that these regions will not be translated by
>>>>> IOMMU
>>>>> and will be excluded from IOVA allocations.
>>>>>
>>>>> Signed-off-by: John Garry <john.garry@huawei.com>
>>>>> [Shameer: Modified msi-parent retrieval logic]
>>>>> Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
>>>>> ---
>>>>>  drivers/iommu/of_iommu.c | 95
>>>>> ++++++++++++++++++++++++++++++++++++++++++++++++
>>>>>  include/linux/of_iommu.h | 10 +++++
>>>>>  2 files changed, 105 insertions(+)
>>>>>
>>>>> diff --git a/drivers/iommu/of_iommu.c b/drivers/iommu/of_iommu.c
>>>>> index e60e3db..ffd7fb7 100644
>>>>> --- a/drivers/iommu/of_iommu.c
>>>>> +++ b/drivers/iommu/of_iommu.c
>>>>> @@ -21,6 +21,7 @@
>>>>>  #include <linux/iommu.h>
>>>>>  #include <linux/limits.h>
>>>>>  #include <linux/of.h>
>>>>> +#include <linux/of_address.h>
>>>>>  #include <linux/of_iommu.h>
>>>>>  #include <linux/of_pci.h>
>>>>>  #include <linux/slab.h>
>>>>> @@ -138,6 +139,14 @@ static int of_iommu_xlate(struct device *dev,
>>>>>      return ops->of_xlate(dev, iommu_spec);
>>>>>  }
>>>>>
>>>>> +static int __get_pci_rid(struct pci_dev *pdev, u16 alias, void *data)
>>>>> +{
>>>>> +    u32 *rid = data;
>>>>> +
>>>>> +    *rid = alias;
>>>>> +    return 0;
>>>>> +}
>>>>> +
>>>>>  struct of_pci_iommu_alias_info {
>>>>>      struct device *dev;
>>>>>      struct device_node *np;
>>>>> @@ -163,6 +172,73 @@ static int of_pci_iommu_init(struct pci_dev
>>>>> *pdev, u16 alias, void *data)
>>>>>      return info->np == pdev->bus->dev.of_node;
>>>>>  }
>>>>>
>>>>> +static int of_iommu_alloc_resv_region(struct device_node *np,
>>>>> +                      struct list_head *head)
>>>>> +{
>>>>> +    int prot = IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO;
>>>>> +    struct iommu_resv_region *region;
>>>>> +    struct resource res;
>>>>> +    int err;
>>>>> +
>>>>> +    err = of_address_to_resource(np, 0, &res);
>>>>
>>>> What is the rationale for registering the first region only? Surely you
>>>> can have more than one region in an MSI controller (yes, your
>>>> particular
>>>> case is the ITS which has only one, but we're trying to do something
>>>> generic here).
>>>>
>>>> Another issue I have with this code is that it inserts all of the ITS
>>>> MMIO in the RESV_MSI range. As long as we don't generate any page
>>>> tables
>>>> for this, we're fine. But if we ever change this, we'll end-up with the
>>>> ITS programming interface being exposed to a device, which wouldn't be
>>>> acceptable.
>>>>
>>>> Why can't we have some generic property instead, that would describe
>>>> the
>>>> actual ranges that cannot be translated? That way, no random insertion
>>>> of a random range, and relatively future proof.
>>
>> Indeed. Furthermore, IORT has the advantage of being limited to a few
>> distinct device types and SBSA-compliant system topologies, so the
>> ITS-chasing is reasonable there (modulo only reserving GITS_TRANSLATER).
>> The scope of DT covers more embedded things as well like PCI host
>> controllers with internal MSI doorbells, and potentially even
>> direct-mapped memory regions for things like bootloader framebuffers to
>> prevent display glitches - a generic address/size/flags description of a
>> region could work for just about everything.
>>
>>> IORT code has the same issue (ie it reserves all ITS regions) and I do
>>> not know where a property can be added to describe those ranges (IORT
>>> specs ? I'd rather not) in ACPI other than the IORT tables entries
>>> themselves.
>>>
>>>> I'm not sure where to declare it (PCIe node? IOMMU node?), but it'd
>>>> feel
>>>> like a much nicer and simpler solution to this problem.
>>>
>>> It could be but if we throw ACPI into the picture we have to knock
>>> together Hisilicon specific namespace bindings to handle this and
>>> quickly.
>>
>> As above I'm happy with the ITS-specific solution for ACPI given the
>> limits of IORT. What I had in mind for DT was something tied in with the
>> generic IOMMU bindings. Something like this is probably easiest to
>> handle, but does rather spread the information around:
>>
>>
>>   pci {
>>       ...
>>       iommu-map = <0 0 &iommu1 0x10000>;
>>       iommu-reserved-ranges = <0x12340000 0x1000 IOMMU_MSI>,
>>                   <0x34560000 0x1000 IOMMU_MSI>;
>>   };
>>
>>   display {
>>       ...
>>       iommus = <&iommu1 0x20000>;
>>       /* simplefb region */
>>       iommu-reserved-ranges = <0x80080000 0x80000 IOMMU_DIRECT>;
>>   };
>>
>>
>> Alternatively, something inspired by reserved-memory might perhaps be
>> conceptually neater, at the risk of being more complicated:
>>
>>
>>   iommu1: iommu@acbd0000 {
>>       ...
>>       #iommu-cells = <1>;
>>
>>       iommu-reserved-ranges {
>>           #address-cells = <1>;
>>           #size-cells = <1>;
>>
>>         its0_resv: msi@12340000 {
>>               compatible = "iommu-msi-region";
>>               reg = <0x12340000 0x1000>;
>>           };
>>
>>         its1_resv: msi@34560000 {
>>               compatible = "iommu-msi-region";
>>               reg = <0x34560000 0x1000>;
>>           };
>>
>>         fb_resv: direct@80080000 {
>>               compatible = "iommu-direct-region";
>>               reg = <0x80080000 0x80000>;
>>           };
>>       };
>>   };
>>
>>
> 
> If we did this, wouldn't it be easier to create dangerous reserved
> regions, like our ITS region which Marc was concerned by? It's not so
> hard to get dts changes into the kernel.

Well, yeah. It's also equally easy to add some peripheral register
region to the /memory node and watch hilarity ensue. The solution to
both is "don't put bogus crap in your DT".

There's a big difference between wilfully misdescribing your platform
requirements vs. having the kernel automagically infer something but
leave a hole open in the process.

Robin.

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v8 3/5] iommu/of: Add msi address regions reservation helper
  2017-10-05 11:57                 ` Will Deacon
  (?)
@ 2017-10-05 12:55                     ` Robin Murphy
  -1 siblings, 0 replies; 59+ messages in thread
From: Robin Murphy @ 2017-10-05 12:55 UTC (permalink / raw)
  To: Will Deacon
  Cc: Lorenzo Pieralisi, Marc Zyngier, Shameer Kolothum, sudeep.holla,
	joro, mark.rutland, robh, gabriele.paoloni, john.garry, iommu,
	linux-arm-kernel, linux-acpi, devicetree, devel, linuxarm,
	wangzhou1, guohanjun

On 05/10/17 12:57, Will Deacon wrote:
> On Thu, Oct 05, 2017 at 12:07:26PM +0100, Robin Murphy wrote:
>> On 04/10/17 14:50, Lorenzo Pieralisi wrote:
>>> On Wed, Oct 04, 2017 at 12:22:04PM +0100, Marc Zyngier wrote:
>>>> On 27/09/17 14:32, Shameer Kolothum wrote:
>>>>> From: John Garry <john.garry@huawei.com>
>>>>>
>>>>> On some platforms msi-controller address regions have to be excluded
>>>>> from normal IOVA allocation in that they are detected and decoded in
>>>>> a HW specific way by system components and so they cannot be considered
>>>>> normal IOVA address space.
>>>>>
>>>>> Add a helper function that retrieves msi address regions through device
>>>>> tree msi mapping, so that these regions will not be translated by IOMMU
>>>>> and will be excluded from IOVA allocations.
>>>>>
>>>>> Signed-off-by: John Garry <john.garry@huawei.com>
>>>>> [Shameer: Modified msi-parent retrieval logic]
>>>>> Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
>>>>> ---
>>>>>  drivers/iommu/of_iommu.c | 95 ++++++++++++++++++++++++++++++++++++++++++++++++
>>>>>  include/linux/of_iommu.h | 10 +++++
>>>>>  2 files changed, 105 insertions(+)
>>>>>
>>>>> diff --git a/drivers/iommu/of_iommu.c b/drivers/iommu/of_iommu.c
>>>>> index e60e3db..ffd7fb7 100644
>>>>> --- a/drivers/iommu/of_iommu.c
>>>>> +++ b/drivers/iommu/of_iommu.c
>>>>> @@ -21,6 +21,7 @@
>>>>>  #include <linux/iommu.h>
>>>>>  #include <linux/limits.h>
>>>>>  #include <linux/of.h>
>>>>> +#include <linux/of_address.h>
>>>>>  #include <linux/of_iommu.h>
>>>>>  #include <linux/of_pci.h>
>>>>>  #include <linux/slab.h>
>>>>> @@ -138,6 +139,14 @@ static int of_iommu_xlate(struct device *dev,
>>>>>  	return ops->of_xlate(dev, iommu_spec);
>>>>>  }
>>>>>  
>>>>> +static int __get_pci_rid(struct pci_dev *pdev, u16 alias, void *data)
>>>>> +{
>>>>> +	u32 *rid = data;
>>>>> +
>>>>> +	*rid = alias;
>>>>> +	return 0;
>>>>> +}
>>>>> +
>>>>>  struct of_pci_iommu_alias_info {
>>>>>  	struct device *dev;
>>>>>  	struct device_node *np;
>>>>> @@ -163,6 +172,73 @@ static int of_pci_iommu_init(struct pci_dev *pdev, u16 alias, void *data)
>>>>>  	return info->np == pdev->bus->dev.of_node;
>>>>>  }
>>>>>  
>>>>> +static int of_iommu_alloc_resv_region(struct device_node *np,
>>>>> +				      struct list_head *head)
>>>>> +{
>>>>> +	int prot = IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO;
>>>>> +	struct iommu_resv_region *region;
>>>>> +	struct resource res;
>>>>> +	int err;
>>>>> +
>>>>> +	err = of_address_to_resource(np, 0, &res);
>>>>
>>>> What is the rationale for registering the first region only? Surely you
>>>> can have more than one region in an MSI controller (yes, your particular
>>>> case is the ITS which has only one, but we're trying to do something
>>>> generic here).
>>>>
>>>> Another issue I have with this code is that it inserts all of the ITS
>>>> MMIO in the RESV_MSI range. As long as we don't generate any page tables
>>>> for this, we're fine. But if we ever change this, we'll end-up with the
>>>> ITS programming interface being exposed to a device, which wouldn't be
>>>> acceptable.
>>>>
>>>> Why can't we have some generic property instead, that would describe the
>>>> actual ranges that cannot be translated? That way, no random insertion
>>>> of a random range, and relatively future proof.
>>
>> Indeed. Furthermore, IORT has the advantage of being limited to a few
>> distinct device types and SBSA-compliant system topologies, so the
>> ITS-chasing is reasonable there (modulo only reserving GITS_TRANSLATER).
>> The scope of DT covers more embedded things as well like PCI host
>> controllers with internal MSI doorbells, and potentially even
>> direct-mapped memory regions for things like bootloader framebuffers to
>> prevent display glitches - a generic address/size/flags description of a
>> region could work for just about everything.
>>
>>> IORT code has the same issue (ie it reserves all ITS regions) and I do
>>> not know where a property can be added to describe those ranges (IORT
>>> specs ? I'd rather not) in ACPI other than the IORT tables entries
>>> themselves.
>>>
>>>> I'm not sure where to declare it (PCIe node? IOMMU node?), but it'd feel
>>>> like a much nicer and simpler solution to this problem.
>>>
>>> It could be but if we throw ACPI into the picture we have to knock
>>> together Hisilicon specific namespace bindings to handle this and
>>> quickly.
>>
>> As above I'm happy with the ITS-specific solution for ACPI given the
>> limits of IORT. What I had in mind for DT was something tied in with the
>> generic IOMMU bindings. Something like this is probably easiest to
>> handle, but does rather spread the information around:
>>
>>
>>   pci {
>>   	...
>>   	iommu-map = <0 0 &iommu1 0x10000>;
>>   	iommu-reserved-ranges = <0x12340000 0x1000 IOMMU_MSI>,
>>   				<0x34560000 0x1000 IOMMU_MSI>;
>>   };
>>
>>   display {
>>   	...
>>   	iommus = <&iommu1 0x20000>;
>>   	/* simplefb region */
>>   	iommu-reserved-ranges = <0x80080000 0x80000 IOMMU_DIRECT>;
>>   };
>>
>>
>> Alternatively, something inspired by reserved-memory might perhaps be
>> conceptually neater, at the risk of being more complicated:
>>
>>
>>   iommu1: iommu@acbd0000 {
>>   	...
>>   	#iommu-cells = <1>;
>>
>>   	iommu-reserved-ranges {
>>   		#address-cells = <1>;
>>   		#size-cells = <1>;
>>
>> 		its0_resv: msi@12340000 {
>>   			compatible = "iommu-msi-region";
>>   			reg = <0x12340000 0x1000>;
>>   		};
>>
>> 		its1_resv: msi@34560000 {
>>   			compatible = "iommu-msi-region";
>>   			reg = <0x34560000 0x1000>;
>>   		};
>>
>> 		fb_resv: direct@80080000 {
>>   			compatible = "iommu-direct-region";
>>   			reg = <0x80080000 0x80000>;
>>   		};
>>   	};
>>   };
> 
> I like the locality of this, but is it definitely flexible enough? Do we
> need to deal with systems where the reserved regions are specific to the
> master (i.e. TBU) as opposed to the entire SMMU block?

That would certainly be true for most direct-mapping cases, where the
reservation is tied to the specific stream ID(s) of one master, let
alone a TBU. I guess we'd have to make a phandle reference from the
device node(s) to the region node mandatory, such that software need
only make the actual reservation/mapping upon encountering a device that
actually needs it.
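
The phandle idea above can be modelled roughly as follows. This is only a
userspace C sketch under stated assumptions: struct resv_region, struct
master and master_get_resv_regions() are hypothetical names made up for
illustration, not kernel structures or an agreed binding. The pointer
array in each master stands in for the phandle list in its device node,
and the addresses are the ones from the example quoted above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a reserved-region node; not a kernel structure. */
enum resv_type { RESV_MSI, RESV_DIRECT };

struct resv_region {
	uint64_t start;
	uint64_t length;
	enum resv_type type;
};

/* Hypothetical model of a master (device node) and its phandle list. */
struct master {
	const char *name;
	const struct resv_region *const *refs;	/* stands in for phandles */
	size_t nr_refs;
};

/* Region nodes, using the addresses from the quoted DT example. */
static const struct resv_region its0_resv = { 0x12340000, 0x1000, RESV_MSI };
static const struct resv_region its1_resv = { 0x34560000, 0x1000, RESV_MSI };
static const struct resv_region fb_resv = { 0x80080000, 0x80000, RESV_DIRECT };

static const struct resv_region *const pci_refs[] = { &its0_resv, &its1_resv };
static const struct resv_region *const display_refs[] = { &fb_resv };

static const struct master pci = { "pci", pci_refs, 2 };
static const struct master display = { "display", display_refs, 1 };

/*
 * Copy only the regions that this master references into out[], up to
 * max entries, and return how many were copied. Regions that no
 * encountered master references are never reserved or mapped.
 */
static size_t master_get_resv_regions(const struct master *m,
				      struct resv_region *out, size_t max)
{
	size_t i, n = 0;

	assert(m && out);
	for (i = 0; i < m->nr_refs && n < max; i++)
		out[n++] = *m->refs[i];
	return n;
}
```

With this model, looking up the display master yields only the
framebuffer region, and the pci master yields only the two MSI regions,
so nothing is set up for a region until a device referencing it shows up.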

Robin.

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Devel] [PATCH v8 3/5] iommu/of: Add msi address regions reservation helper
@ 2017-10-05 12:55                     ` Robin Murphy
  0 siblings, 0 replies; 59+ messages in thread
From: Robin Murphy @ 2017-10-05 12:55 UTC (permalink / raw)
  To: devel

[-- Attachment #1: Type: text/plain, Size: 6465 bytes --]

On 05/10/17 12:57, Will Deacon wrote:
> On Thu, Oct 05, 2017 at 12:07:26PM +0100, Robin Murphy wrote:
>> On 04/10/17 14:50, Lorenzo Pieralisi wrote:
>>> On Wed, Oct 04, 2017 at 12:22:04PM +0100, Marc Zyngier wrote:
>>>> On 27/09/17 14:32, Shameer Kolothum wrote:
>>>>> From: John Garry <john.garry@huawei.com>
>>>>>
>>>>> On some platforms msi-controller address regions have to be excluded
>>>>> from normal IOVA allocation in that they are detected and decoded in
>>>>> a HW specific way by system components and so they cannot be considered
>>>>> normal IOVA address space.
>>>>>
>>>>> Add a helper function that retrieves msi address regions through device
>>>>> tree msi mapping, so that these regions will not be translated by IOMMU
>>>>> and will be excluded from IOVA allocations.
>>>>>
>>>>> Signed-off-by: John Garry <john.garry@huawei.com>
>>>>> [Shameer: Modified msi-parent retrieval logic]
>>>>> Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
>>>>> ---
>>>>>  drivers/iommu/of_iommu.c | 95 ++++++++++++++++++++++++++++++++++++++++++++++++
>>>>>  include/linux/of_iommu.h | 10 +++++
>>>>>  2 files changed, 105 insertions(+)
>>>>>
>>>>> diff --git a/drivers/iommu/of_iommu.c b/drivers/iommu/of_iommu.c
>>>>> index e60e3db..ffd7fb7 100644
>>>>> --- a/drivers/iommu/of_iommu.c
>>>>> +++ b/drivers/iommu/of_iommu.c
>>>>> @@ -21,6 +21,7 @@
>>>>>  #include <linux/iommu.h>
>>>>>  #include <linux/limits.h>
>>>>>  #include <linux/of.h>
>>>>> +#include <linux/of_address.h>
>>>>>  #include <linux/of_iommu.h>
>>>>>  #include <linux/of_pci.h>
>>>>>  #include <linux/slab.h>
>>>>> @@ -138,6 +139,14 @@ static int of_iommu_xlate(struct device *dev,
>>>>>  	return ops->of_xlate(dev, iommu_spec);
>>>>>  }
>>>>>  
>>>>> +static int __get_pci_rid(struct pci_dev *pdev, u16 alias, void *data)
>>>>> +{
>>>>> +	u32 *rid = data;
>>>>> +
>>>>> +	*rid = alias;
>>>>> +	return 0;
>>>>> +}
>>>>> +
>>>>>  struct of_pci_iommu_alias_info {
>>>>>  	struct device *dev;
>>>>>  	struct device_node *np;
>>>>> @@ -163,6 +172,73 @@ static int of_pci_iommu_init(struct pci_dev *pdev, u16 alias, void *data)
>>>>>  	return info->np == pdev->bus->dev.of_node;
>>>>>  }
>>>>>  
>>>>> +static int of_iommu_alloc_resv_region(struct device_node *np,
>>>>> +				      struct list_head *head)
>>>>> +{
>>>>> +	int prot = IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO;
>>>>> +	struct iommu_resv_region *region;
>>>>> +	struct resource res;
>>>>> +	int err;
>>>>> +
>>>>> +	err = of_address_to_resource(np, 0, &res);
>>>>
>>>> What is the rationale for registering the first region only? Surely you
>>>> can have more than one region in an MSI controller (yes, your particular
>>>> case is the ITS which has only one, but we're trying to do something
>>>> generic here).
>>>>
>>>> Another issue I have with this code is that it inserts all of the ITS
>>>> MMIO in the RESV_MSI range. As long as we don't generate any page tables
>>>> for this, we're fine. But if we ever change this, we'll end-up with the
>>>> ITS programming interface being exposed to a device, which wouldn't be
>>>> acceptable.
>>>>
>>>> Why can't we have some generic property instead, that would describe the
>>>> actual ranges that cannot be translated? That way, no random insertion
>>>> of a random range, and relatively future proof.
>>
>> Indeed. Furthermore, IORT has the advantage of being limited to a few
>> distinct device types and SBSA-compliant system topologies, so the
>> ITS-chasing is reasonable there (modulo only reserving GITS_TRANSLATER).
>> The scope of DT covers more embedded things as well like PCI host
>> controllers with internal MSI doorbells, and potentially even
>> direct-mapped memory regions for things like bootloader framebuffers to
>> prevent display glitches - a generic address/size/flags description of a
>> region could work for just about everything.
>>
>>> IORT code has the same issue (ie it reserves all ITS regions) and I do
>>> not know where a property can be added to describe those ranges (IORT
>>> specs ? I'd rather not) in ACPI other than the IORT tables entries
>>> themselves.
>>>
>>>> I'm not sure where to declare it (PCIe node? IOMMU node?), but it'd feel
>>>> like a much nicer and simpler solution to this problem.
>>>
>>> It could be but if we throw ACPI into the picture we have to knock
>>> together Hisilicon specific namespace bindings to handle this and
>>> quickly.
>>
>> As above I'm happy with the ITS-specific solution for ACPI given the
>> limits of IORT. What I had in mind for DT was something tied in with the
>> generic IOMMU bindings. Something like this is probably easiest to
>> handle, but does rather spread the information around:
>>
>>
>>   pci {
>>   	...
>>   	iommu-map = <0 0 &iommu1 0x10000>;
>>   	iommu-reserved-ranges = <0x12340000 0x1000 IOMMU_MSI>,
>>   				<0x34560000 0x1000 IOMMU_MSI>;
>>   };
>>
>>   display {
>>   	...
>>   	iommus = <&iommu1 0x20000>;
>>   	/* simplefb region */
>>   	iommu-reserved-ranges = <0x80080000 0x80000 IOMMU_DIRECT>;
>>   };
>>
>>
>> Alternatively, something inspired by reserved-memory might perhaps be
>> conceptually neater, at the risk of being more complicated:
>>
>>
>>   iommu1: iommu@acbd0000 {
>>   	...
>>   	#iommu-cells = <1>;
>>
>>   	iommu-reserved-ranges {
>>   		#address-cells = <1>;
>>   		#size-cells = <1>;
>>
>> 		its0_resv: msi@12340000 {
>>   			compatible = "iommu-msi-region";
>>   			reg = <0x12340000 0x1000>;
>>   		};
>>
>> 		its1_resv: msi@34560000 {
>>   			compatible = "iommu-msi-region";
>>   			reg = <0x34560000 0x1000>;
>>   		};
>>
>> 		fb_resv: direct@80080000 {
>>   			compatible = "iommu-direct-region";
>>   			reg = <0x80080000 0x80000>;
>>   		};
>>   	};
>>   };
> 
> I like the locality of this, but is it definitely flexible enough? Do we
> need to deal with systems where the reserved regions are specific to the
> master (i.e. TBU) as opposed to the entire SMMU block?

That would certainly be true for most direct-mapping cases, where the
reservation is tied to the specific stream ID(s) of one master, let
alone a TBU. I guess we'd have to make a phandle reference from the
device node(s) to the region node mandatory, such that software need
only make the actual reservation/mapping upon encountering a device that
actually needs it.
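
A rough sketch of that phandle arrangement, reusing the reserved-memory
style layout quoted above. The `iommu-region` property name is purely
hypothetical and is not an existing binding; the node labels and addresses
are the illustrative ones from the earlier example:

```dts
/* Hypothetical binding sketch, not an existing devicetree binding. */
iommu1: iommu@acbd0000 {
	#iommu-cells = <1>;

	iommu-reserved-ranges {
		#address-cells = <1>;
		#size-cells = <1>;

		/* Region that only the display master needs direct-mapped. */
		fb_resv: direct@80080000 {
			compatible = "iommu-direct-region";
			reg = <0x80080000 0x80000>;
		};
	};
};

display {
	iommus = <&iommu1 0x20000>;
	/* Hypothetical mandatory phandle from the master to its region. */
	iommu-region = <&fb_resv>;
};
```

With a reference like this, software would only create the reservation or
mapping for the stream ID(s) of a device that actually points at the region,
rather than applying it to every master behind the SMMU.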

Robin.

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v8 3/5] iommu/of: Add msi address regions reservation helper
  2017-10-05 12:44                   ` Robin Murphy
  (?)
@ 2017-10-05 13:08                     ` John Garry
  -1 siblings, 0 replies; 59+ messages in thread
From: John Garry @ 2017-10-05 13:08 UTC (permalink / raw)
  To: Robin Murphy, Lorenzo Pieralisi, Marc Zyngier
  Cc: Shameer Kolothum, sudeep.holla, will.deacon, joro, mark.rutland,
	robh, gabriele.paoloni, iommu, linux-arm-kernel, linux-acpi,
	devicetree, devel, linuxarm, wangzhou1, guohanjun

On 05/10/2017 13:44, Robin Murphy wrote:
> On 05/10/17 13:37, John Garry wrote:
>> On 05/10/2017 12:07, Robin Murphy wrote:
>>> On 04/10/17 14:50, Lorenzo Pieralisi wrote:
>>>> On Wed, Oct 04, 2017 at 12:22:04PM +0100, Marc Zyngier wrote:
>>>>> On 27/09/17 14:32, Shameer Kolothum wrote:
>>>>>> From: John Garry <john.garry@huawei.com>
>>>>>>
>>>>>> On some platforms msi-controller address regions have to be excluded
>>>>>> from normal IOVA allocation in that they are detected and decoded in
>>>>>> a HW specific way by system components and so they cannot be
>>>>>> considered
>>>>>> normal IOVA address space.
>>>>>>
>>>>>> Add a helper function that retrieves msi address regions through
>>>>>> device
>>>>>> tree msi mapping, so that these regions will not be translated by
>>>>>> IOMMU
>>>>>> and will be excluded from IOVA allocations.
>>>>>>
>>>>>> Signed-off-by: John Garry <john.garry@huawei.com>
>>>>>> [Shameer: Modified msi-parent retrieval logic]
>>>>>> Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
>>>>>> ---
>>>>>>  drivers/iommu/of_iommu.c | 95
>>>>>> ++++++++++++++++++++++++++++++++++++++++++++++++
>>>>>>  include/linux/of_iommu.h | 10 +++++
>>>>>>  2 files changed, 105 insertions(+)
>>>>>>
>>>>>> diff --git a/drivers/iommu/of_iommu.c b/drivers/iommu/of_iommu.c
>>>>>> index e60e3db..ffd7fb7 100644
>>>>>> --- a/drivers/iommu/of_iommu.c
>>>>>> +++ b/drivers/iommu/of_iommu.c
>>>>>> @@ -21,6 +21,7 @@
>>>>>>  #include <linux/iommu.h>
>>>>>>  #include <linux/limits.h>
>>>>>>  #include <linux/of.h>
>>>>>> +#include <linux/of_address.h>
>>>>>>  #include <linux/of_iommu.h>
>>>>>>  #include <linux/of_pci.h>
>>>>>>  #include <linux/slab.h>
>>>>>> @@ -138,6 +139,14 @@ static int of_iommu_xlate(struct device *dev,
>>>>>>      return ops->of_xlate(dev, iommu_spec);
>>>>>>  }
>>>>>>
>>>>>> +static int __get_pci_rid(struct pci_dev *pdev, u16 alias, void *data)
>>>>>> +{
>>>>>> +    u32 *rid = data;
>>>>>> +
>>>>>> +    *rid = alias;
>>>>>> +    return 0;
>>>>>> +}
>>>>>> +
>>>>>>  struct of_pci_iommu_alias_info {
>>>>>>      struct device *dev;
>>>>>>      struct device_node *np;
>>>>>> @@ -163,6 +172,73 @@ static int of_pci_iommu_init(struct pci_dev
>>>>>> *pdev, u16 alias, void *data)
>>>>>>      return info->np == pdev->bus->dev.of_node;
>>>>>>  }
>>>>>>
>>>>>> +static int of_iommu_alloc_resv_region(struct device_node *np,
>>>>>> +                      struct list_head *head)
>>>>>> +{
>>>>>> +    int prot = IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO;
>>>>>> +    struct iommu_resv_region *region;
>>>>>> +    struct resource res;
>>>>>> +    int err;
>>>>>> +
>>>>>> +    err = of_address_to_resource(np, 0, &res);
>>>>>
>>>>> What is the rationale for registering the first region only? Surely
>>>>> you can have more than one region in an MSI controller (yes, your
>>>>> particular case is the ITS which has only one, but we're trying to
>>>>> do something generic here).
>>>>>
>>>>> Another issue I have with this code is that it inserts all of the ITS
>>>>> MMIO in the RESV_MSI range. As long as we don't generate any page
>>>>> tables for this, we're fine. But if we ever change this, we'll end up
>>>>> with the ITS programming interface being exposed to a device, which
>>>>> wouldn't be acceptable.
>>>>>
>>>>> Why can't we have some generic property instead, that would describe
>>>>> the
>>>>> actual ranges that cannot be translated? That way, no random insertion
>>>>> of a random range, and relatively future proof.
>>>
>>> Indeed. Furthermore, IORT has the advantage of being limited to a few
>>> distinct device types and SBSA-compliant system topologies, so the
>>> ITS-chasing is reasonable there (modulo only reserving GITS_TRANSLATER).
>>> The scope of DT covers more embedded things as well like PCI host
>>> controllers with internal MSI doorbells, and potentially even
>>> direct-mapped memory regions for things like bootloader framebuffers to
>>> prevent display glitches - a generic address/size/flags description of a
>>> region could work for just about everything.
>>>
>>>> IORT code has the same issue (ie it reserves all ITS regions) and I do
>>>> not know where a property can be added to describe those ranges (IORT
>>>> specs ? I'd rather not) in ACPI other than the IORT tables entries
>>>> themselves.
>>>>
>>>>> I'm not sure where to declare it (PCIe node? IOMMU node?), but it'd
>>>>> feel
>>>>> like a much nicer and simpler solution to this problem.
>>>>
>>>> It could be but if we throw ACPI into the picture we have to knock
>>>> together Hisilicon specific namespace bindings to handle this and
>>>> quickly.
>>>
>>> As above I'm happy with the ITS-specific solution for ACPI given the
>>> limits of IORT. What I had in mind for DT was something tied in with the
>>> generic IOMMU bindings. Something like this is probably easiest to
>>> handle, but does rather spread the information around:
>>>
>>>
>>>   pci {
>>>       ...
>>>       iommu-map = <0 0 &iommu1 0x10000>;
>>>       iommu-reserved-ranges = <0x12340000 0x1000 IOMMU_MSI>,
>>>                   <0x34560000 0x1000 IOMMU_MSI>;
>>>   };
>>>
>>>   display {
>>>       ...
>>>       iommus = <&iommu1 0x20000>;
>>>       /* simplefb region */
>>>       iommu-reserved-ranges = <0x80080000 0x80000 IOMMU_DIRECT>;
>>>   };
>>>
>>>
>>> Alternatively, something inspired by reserved-memory might perhaps be
>>> conceptually neater, at the risk of being more complicated:
>>>
>>>
>>>   iommu1: iommu@acbd0000 {
>>>       ...
>>>       #iommu-cells = <1>;
>>>
>>>       iommu-reserved-ranges {
>>>           #address-cells = <1>;
>>>           #size-cells = <1>;
>>>
>>>         its0_resv: msi@12340000 {
>>>               compatible = "iommu-msi-region";
>>>               reg = <0x12340000 0x1000>;
>>>           };
>>>
>>>         its1_resv: msi@34560000 {
>>>               compatible = "iommu-msi-region";
>>>               reg = <0x34560000 0x1000>;
>>>           };
>>>
>>>         fb_resv: direct@80080000 {
>>>               compatible = "iommu-direct-region";
>>>               reg = <0x80080000 0x80000>;
>>>           };
>>>       };
>>>   };
>>>
>>>
>>
>> If we did this, wouldn't it be easier to create dangerous reserved
>> regions, like our ITS region which Marc was concerned by? It's not so
>> hard to get dts changes into the kernel.
>
> Well, yeah. It's also equally easy to add some peripheral register
> region to the /memory node and watch hilarity ensue. The solution to
> both is "don't put bogus crap in your DT".
>

Sure, but people make mistakes and often a lot more subtle than your 
example.

Only one person spotted the problem with our approach to the ITS 
problem. I'm pretty confident it would not have been spotted if it was 
submitted as a dts change.

Much appreciated,
John

> There's a big difference between wilfully misdescribing your platform
> requirements vs. having the kernel automatically infer something but
> leave a hole open in the process.
>
> Robin.
>
> .
>



^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v8 3/5] iommu/of: Add msi address regions reservation helper
  2017-10-05 13:08                     ` John Garry
  (?)
@ 2017-10-05 14:05                         ` Robin Murphy
  -1 siblings, 0 replies; 59+ messages in thread
From: Robin Murphy @ 2017-10-05 14:05 UTC (permalink / raw)
  To: John Garry, Lorenzo Pieralisi, Marc Zyngier
  Cc: mark.rutland, robh, gabriele.paoloni, devicetree, guohanjun,
	will.deacon, linuxarm, Shameer Kolothum, linux-acpi, iommu,
	sudeep.holla, linux-arm-kernel, devel

On 05/10/17 14:08, John Garry wrote:
> On 05/10/2017 13:44, Robin Murphy wrote:
>> On 05/10/17 13:37, John Garry wrote:
>>> On 05/10/2017 12:07, Robin Murphy wrote:
>>>> On 04/10/17 14:50, Lorenzo Pieralisi wrote:
>>>>> On Wed, Oct 04, 2017 at 12:22:04PM +0100, Marc Zyngier wrote:
>>>>>> On 27/09/17 14:32, Shameer Kolothum wrote:
>>>>>>> From: John Garry <john.garry@huawei.com>
>>>>>>>
>>>>>>> On some platforms msi-controller address regions have to be excluded
>>>>>>> from normal IOVA allocation in that they are detected and decoded in
>>>>>>> a HW specific way by system components and so they cannot be
>>>>>>> considered
>>>>>>> normal IOVA address space.
>>>>>>>
>>>>>>> Add a helper function that retrieves msi address regions through
>>>>>>> device
>>>>>>> tree msi mapping, so that these regions will not be translated by
>>>>>>> IOMMU
>>>>>>> and will be excluded from IOVA allocations.
>>>>>>>
>>>>>>> Signed-off-by: John Garry <john.garry@huawei.com>
>>>>>>> [Shameer: Modified msi-parent retrieval logic]
>>>>>>> Signed-off-by: Shameer Kolothum
>>>>>>> <shameerali.kolothum.thodi@huawei.com>
>>>>>>> ---
>>>>>>>  drivers/iommu/of_iommu.c | 95
>>>>>>> ++++++++++++++++++++++++++++++++++++++++++++++++
>>>>>>>  include/linux/of_iommu.h | 10 +++++
>>>>>>>  2 files changed, 105 insertions(+)
>>>>>>>
>>>>>>> diff --git a/drivers/iommu/of_iommu.c b/drivers/iommu/of_iommu.c
>>>>>>> index e60e3db..ffd7fb7 100644
>>>>>>> --- a/drivers/iommu/of_iommu.c
>>>>>>> +++ b/drivers/iommu/of_iommu.c
>>>>>>> @@ -21,6 +21,7 @@
>>>>>>>  #include <linux/iommu.h>
>>>>>>>  #include <linux/limits.h>
>>>>>>>  #include <linux/of.h>
>>>>>>> +#include <linux/of_address.h>
>>>>>>>  #include <linux/of_iommu.h>
>>>>>>>  #include <linux/of_pci.h>
>>>>>>>  #include <linux/slab.h>
>>>>>>> @@ -138,6 +139,14 @@ static int of_iommu_xlate(struct device *dev,
>>>>>>>      return ops->of_xlate(dev, iommu_spec);
>>>>>>>  }
>>>>>>>
>>>>>>> +static int __get_pci_rid(struct pci_dev *pdev, u16 alias, void
>>>>>>> *data)
>>>>>>> +{
>>>>>>> +    u32 *rid = data;
>>>>>>> +
>>>>>>> +    *rid = alias;
>>>>>>> +    return 0;
>>>>>>> +}
>>>>>>> +
>>>>>>>  struct of_pci_iommu_alias_info {
>>>>>>>      struct device *dev;
>>>>>>>      struct device_node *np;
>>>>>>> @@ -163,6 +172,73 @@ static int of_pci_iommu_init(struct pci_dev
>>>>>>> *pdev, u16 alias, void *data)
>>>>>>>      return info->np == pdev->bus->dev.of_node;
>>>>>>>  }
>>>>>>>
>>>>>>> +static int of_iommu_alloc_resv_region(struct device_node *np,
>>>>>>> +                      struct list_head *head)
>>>>>>> +{
>>>>>>> +    int prot = IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO;
>>>>>>> +    struct iommu_resv_region *region;
>>>>>>> +    struct resource res;
>>>>>>> +    int err;
>>>>>>> +
>>>>>>> +    err = of_address_to_resource(np, 0, &res);
>>>>>>
>>>>>> What is the rationale for registering the first region only? Surely
>>>>>> you can have more than one region in an MSI controller (yes, your
>>>>>> particular case is the ITS which has only one, but we're trying to
>>>>>> do something generic here).
>>>>>>
>>>>>> Another issue I have with this code is that it inserts all of the ITS
>>>>>> MMIO in the RESV_MSI range. As long as we don't generate any page
>>>>>> tables for this, we're fine. But if we ever change this, we'll end up
>>>>>> with the ITS programming interface being exposed to a device, which
>>>>>> wouldn't be acceptable.
>>>>>>
>>>>>> Why can't we have some generic property instead, that would describe
>>>>>> the
>>>>>> actual ranges that cannot be translated? That way, no random
>>>>>> insertion
>>>>>> of a random range, and relatively future proof.
>>>>
>>>> Indeed. Furthermore, IORT has the advantage of being limited to a few
>>>> distinct device types and SBSA-compliant system topologies, so the
>>>> ITS-chasing is reasonable there (modulo only reserving
>>>> GITS_TRANSLATER).
>>>> The scope of DT covers more embedded things as well like PCI host
>>>> controllers with internal MSI doorbells, and potentially even
>>>> direct-mapped memory regions for things like bootloader framebuffers to
>>>> prevent display glitches - a generic address/size/flags description
>>>> of a
>>>> region could work for just about everything.
>>>>
>>>>> IORT code has the same issue (ie it reserves all ITS regions) and I do
>>>>> not know where a property can be added to describe those ranges (IORT
>>>>> specs ? I'd rather not) in ACPI other than the IORT tables entries
>>>>> themselves.
>>>>>
>>>>>> I'm not sure where to declare it (PCIe node? IOMMU node?), but it'd
>>>>>> feel
>>>>>> like a much nicer and simpler solution to this problem.
>>>>>
>>>>> It could be but if we throw ACPI into the picture we have to knock
>>>>> together Hisilicon specific namespace bindings to handle this and
>>>>> quickly.
>>>>
>>>> As above I'm happy with the ITS-specific solution for ACPI given the
>>>> limits of IORT. What I had in mind for DT was something tied in with
>>>> the
>>>> generic IOMMU bindings. Something like this is probably easiest to
>>>> handle, but does rather spread the information around:
>>>>
>>>>
>>>>   pci {
>>>>       ...
>>>>       iommu-map = <0 0 &iommu1 0x10000>;
>>>>       iommu-reserved-ranges = <0x12340000 0x1000 IOMMU_MSI>,
>>>>                   <0x34560000 0x1000 IOMMU_MSI>;
>>>>   };
>>>>
>>>>   display {
>>>>       ...
>>>>       iommus = <&iommu1 0x20000>;
>>>>       /* simplefb region */
>>>>       iommu-reserved-ranges = <0x80080000 0x80000 IOMMU_DIRECT>;
>>>>   };
>>>>
>>>>
>>>> Alternatively, something inspired by reserved-memory might perhaps be
>>>> conceptually neater, at the risk of being more complicated:
>>>>
>>>>
>>>>   iommu1: iommu@acbd0000 {
>>>>       ...
>>>>       #iommu-cells = <1>;
>>>>
>>>>       iommu-reserved-ranges {
>>>>           #address-cells = <1>;
>>>>           #size-cells = <1>;
>>>>
>>>>         its0_resv: msi@12340000 {
>>>>               compatible = "iommu-msi-region";
>>>>               reg = <0x12340000 0x1000>;
>>>>           };
>>>>
>>>>         its1_resv: msi@34560000 {
>>>>               compatible = "iommu-msi-region";
>>>>               reg = <0x34560000 0x1000>;
>>>>           };
>>>>
>>>>         fb_resv: direct@80080000 {
>>>>               compatible = "iommu-direct-region";
>>>>               reg = <0x80080000 0x80000>;
>>>>           };
>>>>       };
>>>>   };
>>>>
>>>>
>>>
>>> If we did this, wouldn't it be easier to create dangerous reserved
>>> regions, like our ITS region which Marc was concerned by? It's not so
>>> hard to get dts changes into the kernel.
>>
>> Well, yeah. It's also equally easy to add some peripheral register
>> region to the /memory node and watch hilarity ensue. The solution to
>> both is "don't put bogus crap in your DT".
>>
> 
> Sure, but people make mistakes and often a lot more subtle than your
> example.

I wanted to write a response to that but since it was too easy to hit
the wrong keys on my keyboard, I didn't.

I don't see a proposal for some alternate method of describing
restricted IOVA regions which scales beyond HiSilicon Hip07 and would be
entirely immune to bugs, so I'm really not sure what your argument is
here :/

Robin.

> Only one person spotted the problem with our approach to the ITS
> problem. I'm pretty confident it would not have been spotted if it was
> submitted as a dts change.
> 
> Much appreciated,
> John
> 
>> There's a big difference between wilfully misdescribing your platform
>> requirements vs. having the kernel automatically infer something but
>> leave a hole open in the process.
>>
>> Robin.
>>
>> .
>>
> 
> 
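
For reference, the flat <address size flags> form of the proposed
iommu-reserved-ranges property quoted above could be decoded roughly as
follows. This is only a userspace sketch in C: the property is a proposal
from this thread rather than an existing binding, the numeric values chosen
for the MSI and direct flags are assumptions, and struct resv_range and
parse_reserved_ranges() are hypothetical names, not kernel APIs.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed encodings; the thread names the flags but not their values. */
#define RESV_FLAG_MSI     0u
#define RESV_FLAG_DIRECT  1u

/* Hypothetical in-memory form of one <address size flags> triplet. */
struct resv_range {
	uint64_t start;
	uint64_t length;
	uint32_t flags;
};

/*
 * Decode a flat array of 32-bit cells laid out as <address size flags>
 * triplets, assuming one address cell and one size cell per entry.
 * Returns the number of ranges written to @out, or -1 if the cell count
 * is not a multiple of three, a flag is unknown, a size is zero, or
 * @out is too small.
 */
static int parse_reserved_ranges(const uint32_t *cells, size_t ncells,
				 struct resv_range *out, size_t max_out)
{
	size_t i, n = 0;

	if (ncells % 3 != 0)
		return -1;

	for (i = 0; i < ncells; i += 3) {
		uint32_t flags = cells[i + 2];

		if (n == max_out)
			return -1;
		if (flags != RESV_FLAG_MSI && flags != RESV_FLAG_DIRECT)
			return -1;
		if (cells[i + 1] == 0)
			return -1;

		out[n].start = cells[i];
		out[n].length = cells[i + 1];
		out[n].flags = flags;
		n++;
	}

	return (int)n;
}
```

Fed the three triplets from the pci and display examples, this would yield
two MSI ranges and one direct-mapped range; malformed input is rejected
rather than silently reserved, which is about as much protection against
a bogus DT as a parser can offer.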

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v8 1/5] Doc: iommu/arm-smmu-v3: Add workaround for HiSilicon erratum 161010801
  2017-09-27 13:32   ` Shameer Kolothum
@ 2017-10-05 22:24     ` Rob Herring
  -1 siblings, 0 replies; 59+ messages in thread
From: Rob Herring @ 2017-10-05 22:24 UTC (permalink / raw)
  To: Shameer Kolothum
  Cc: lorenzo.pieralisi, marc.zyngier, sudeep.holla, will.deacon,
	robin.murphy, joro, mark.rutland, gabriele.paoloni, john.garry,
	iommu, linux-arm-kernel, linux-acpi, devicetree, devel, linuxarm,
	wangzhou1, guohanjun

On Wed, Sep 27, 2017 at 02:32:37PM +0100, Shameer Kolothum wrote:
> From: John Garry <john.garry@huawei.com>
> 
> The HiSilicon erratum 161010801 describes the limitation of HiSilicon
> platforms hip06/hip07 to support the SMMU mappings for MSI transactions.
> 
> On these platforms, GICv3 ITS translator is presented with the deviceID
> by extending the MSI payload data to 64 bits to include the deviceID.
> Hence, the PCIe controller on these platforms has to differentiate the MSI
> payload against other DMA payload and has to modify the MSI payload.
> This basically makes it difficult for these platforms to have a SMMU
> translation for MSI.
> 
> This patch adds a compatible string to implement this erratum for
> HiSilicon Hi161x SMMUv3 model on hip06/hip07 platforms.
> 
> Also, the arm64 silicon errata list is updated with this same erratum.
> 
> Signed-off-by: John Garry <john.garry@huawei.com>
> [Shameer: Modified to use compatible string for errata]
> Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
> ---
>  Documentation/arm64/silicon-errata.txt                  | 1 +
>  Documentation/devicetree/bindings/iommu/arm,smmu-v3.txt | 9 ++++++++-
>  2 files changed, 9 insertions(+), 1 deletion(-)

Acked-by: Rob Herring <robh@kernel.org>

^ permalink raw reply	[flat|nested] 59+ messages in thread

* RE: [PATCH v8 2/5] ACPI/IORT: Add msi address regions reservation helper
  2017-10-04 14:17         ` Marc Zyngier
  (?)
@ 2017-10-06 10:17             ` Shameerali Kolothum Thodi
  -1 siblings, 0 replies; 59+ messages in thread
From: Shameerali Kolothum Thodi @ 2017-10-06 10:17 UTC (permalink / raw)
  To: Marc Zyngier, lorenzo.pieralisi, sudeep.holla, will.deacon,
	robin.murphy, joro, mark.rutland, robh
  Cc: devicetree, Gabriele Paoloni, Linuxarm, linux-acpi, iommu,
	Guohanjun (Hanjun Guo), linux-arm-kernel, devel



> -----Original Message-----
> From: Marc Zyngier [mailto:marc.zyngier@arm.com]
> Sent: Wednesday, October 04, 2017 3:18 PM
> To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>;
> lorenzo.pieralisi@arm.com; sudeep.holla@arm.com; will.deacon@arm.com;
> robin.murphy@arm.com; joro@8bytes.org; mark.rutland@arm.com;
> robh@kernel.org
> Cc: Gabriele Paoloni <gabriele.paoloni@huawei.com>; John Garry
> <john.garry@huawei.com>; iommu@lists.linux-foundation.org;
> linux-arm-kernel@lists.infradead.org; linux-acpi@vger.kernel.org;
> devicetree@vger.kernel.org; devel@acpica.org; Linuxarm
> <linuxarm@huawei.com>; Wangzhou (B) <wangzhou1@hisilicon.com>;
> Guohanjun (Hanjun Guo) <guohanjun@huawei.com>
> Subject: Re: [PATCH v8 2/5] ACPI/IORT: Add msi address regions reservation
> helper
> 
> On 27/09/17 14:32, Shameer Kolothum wrote:
> > On some platforms msi parent address regions have to be excluded from
> > normal IOVA allocation in that they are detected and decoded in a HW
> > specific way by system components and so they cannot be considered
> > normal IOVA address space.
> >
> > Add a helper function that retrieves ITS address regions - the msi
> > parent - through IORT device <-> ITS mappings and reserves it so that
> > these regions will not be translated by IOMMU and will be excluded
> > from IOVA allocations.
> >
> > Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
> > [lorenzo.pieralisi@arm.com: updated commit log/added comments]
> > Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
> > ---
> >  drivers/acpi/arm64/iort.c        | 96 ++++++++++++++++++++++++++++++++++++++--
> >  drivers/irqchip/irq-gic-v3-its.c |  3 +-
> >  include/linux/acpi_iort.h        |  7 ++-
> >  3 files changed, 101 insertions(+), 5 deletions(-)
> >
> > diff --git a/drivers/acpi/arm64/iort.c b/drivers/acpi/arm64/iort.c
> > index 9565d57..14efa9d 100644
> > --- a/drivers/acpi/arm64/iort.c
> > +++ b/drivers/acpi/arm64/iort.c
> > @@ -39,6 +39,7 @@
> >  struct iort_its_msi_chip {
> >  	struct list_head	list;
> >  	struct fwnode_handle	*fw_node;
> > +	phys_addr_t		base_addr;
> >  	u32			translation_id;
> >  };
> >
> > @@ -136,14 +137,16 @@ typedef acpi_status (*iort_find_node_callback)
> >  static DEFINE_SPINLOCK(iort_msi_chip_lock);
> >
> >  /**
> > - * iort_register_domain_token() - register domain token and related ITS ID
> > - * to the list from where we can get it back later on.
> > + * iort_register_domain_token() - register domain token along with related
> > + * ITS ID and base address to the list from where we can get it back later on.
> >   * @trans_id: ITS ID.
> > + * @base: ITS base address.
> >   * @fw_node: Domain token.
> >   *
> >   * Returns: 0 on success, -ENOMEM if no memory when allocating list element
> >   */
> > -int iort_register_domain_token(int trans_id, struct fwnode_handle *fw_node)
> > +int iort_register_domain_token(int trans_id, phys_addr_t base,
> > +			       struct fwnode_handle *fw_node)
> >  {
> >  	struct iort_its_msi_chip *its_msi_chip;
> >
> > @@ -153,6 +156,7 @@ int iort_register_domain_token(int trans_id, struct fwnode_handle *fw_node)
> >
> >  	its_msi_chip->fw_node = fw_node;
> >  	its_msi_chip->translation_id = trans_id;
> > +	its_msi_chip->base_addr = base;
> >
> >  	spin_lock(&iort_msi_chip_lock);
> >  	list_add(&its_msi_chip->list, &iort_msi_chip_list);
> > @@ -481,6 +485,24 @@ int iort_pmsi_get_dev_id(struct device *dev, u32 *dev_id)
> >  	return -ENODEV;
> >  }
> >
> > +static int __maybe_unused iort_find_its_base(u32 its_id, phys_addr_t *base)
> > +{
> > +	struct iort_its_msi_chip *its_msi_chip;
> > +	bool match = false;
> > +
> > +	spin_lock(&iort_msi_chip_lock);
> > +	list_for_each_entry(its_msi_chip, &iort_msi_chip_list, list) {
> > +		if (its_msi_chip->translation_id == its_id) {
> > +			*base = its_msi_chip->base_addr;
> > +			match = true;
> > +			break;
> > +		}
> > +	}
> > +	spin_unlock(&iort_msi_chip_lock);
> > +
> > +	return match ? 0 : -ENODEV;
> > +}
> > +
> >  /**
> >   * iort_dev_find_its_id() - Find the ITS identifier for a device
> >   * @dev: The device.
> > @@ -639,6 +661,72 @@ int iort_add_device_replay(const struct iommu_ops *ops, struct device *dev)
> >
> >  	return err;
> >  }
> > +
> > +/**
> > + * iort_iommu_msi_get_resv_regions - Reserved region driver helper
> > + * @dev: Device from iommu_get_resv_regions()
> > + * @head: Reserved region list from iommu_get_resv_regions()
> > + *
> > + * Returns: Number of reserved regions on success (0 if no associated msi
> > + *          regions), appropriate error value otherwise. The ITS regions
> > + *          associated with the device are the msi reserved regions.
> > + */
> > +int iort_iommu_msi_get_resv_regions(struct device *dev, struct list_head *head)
> > +{
> > +	struct acpi_iort_its_group *its;
> > +	struct acpi_iort_node *node, *its_node = NULL;
> > +	int i, resv = 0;
> > +
> > +	node = iort_find_dev_node(dev);
> > +	if (!node)
> > +		return -ENODEV;
> > +
> > +	/*
> > +	 * Current logic to reserve ITS regions relies on HW topologies
> > +	 * where a given PCI or named component maps its IDs to only one
> > +	 * ITS group; if a PCI or named component can map its IDs to
> > +	 * different ITS groups through IORT mappings this function has
> > +	 * to be reworked to ensure we reserve regions for all ITS groups
> > +	 * a given PCI or named component may map IDs to.
> > +	 */
> > +	if (dev_is_pci(dev)) {
> > +		u32 rid;
> > +
> > +		pci_for_each_dma_alias(to_pci_dev(dev), __get_pci_rid, &rid);
> > +		its_node = iort_node_map_id(node, rid, NULL, IORT_MSI_TYPE);
> > +	} else {
> > +		for (i = 0; i < node->mapping_count; i++) {
> > +			its_node = iort_node_map_platform_id(node, NULL,
> > +							 IORT_MSI_TYPE, i);
> > +			if (its_node)
> > +				break;
> > +		}
> > +	}
> > +
> > +	if (!its_node)
> > +		return 0;
> > +
> > +	/* Move to ITS specific data */
> > +	its = (struct acpi_iort_its_group *)its_node->node_data;
> > +
> > +	for (i = 0; i < its->its_count; i++) {
> > +		phys_addr_t base;
> > +
> > +		if (!iort_find_its_base(its->identifiers[i], &base)) {
> > +			int prot = IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO;
> > +			struct iommu_resv_region *region;
> > +
> > +			region = iommu_alloc_resv_region(base, SZ_128K, prot,
> > +							 IOMMU_RESV_MSI);
> 
> Same as the OF part: I strongly object to reserving the whole 128kB range.
> What we really care about is the second 64kB page, and that is what should
> get reserved.

Thanks Marc. I will make the changes in next revision.
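
Marc's objection can be made concrete with a small sketch. In the GICv3
architecture the ITS occupies two 64kB frames: the control registers in the
first and the translation registers in the second, with GITS_TRANSLATER at
offset 0x10040 from the ITS base. Reserving only the page that holds
GITS_TRANSLATER therefore leaves the programming interface outside the MSI
window. The code below is userspace C for illustration only; struct
msi_window, its_translater_window() and window_contains() are hypothetical
names, and it is not taken from any posted revision of the series.

```c
#include <stdint.h>

#define SZ_64K           0x10000u
#define GITS_TRANSLATER  0x10040u	/* offset from the ITS base */

/* Hypothetical description of the IOVA window to reserve for MSIs. */
struct msi_window {
	uint64_t start;
	uint64_t length;
};

/*
 * Return the single 64kB page containing GITS_TRANSLATER, i.e. the
 * second 64kB frame of the ITS, instead of the whole 128kB region.
 */
static struct msi_window its_translater_window(uint64_t its_base)
{
	struct msi_window w;

	w.start = its_base + (GITS_TRANSLATER & ~(uint64_t)(SZ_64K - 1));
	w.length = SZ_64K;
	return w;
}

/* True if @addr falls inside @w. */
static int window_contains(const struct msi_window *w, uint64_t addr)
{
	return addr >= w->start && addr - w->start < w->length;
}
```

With a made-up ITS base of 0x4c000000 the window would start at 0x4c010000
and cover 64kB, so the doorbell at base + 0x10040 is inside it while the
control registers at the base itself are not.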

Also, as we are still discussing the DT approach for this, I am thinking
of sending out v9 with the above fix and blacklisting the HiSilicon PCIe
controllers on DT-based hip06/hip07 systems when SMMUv3 is enabled.

Hi Will,

I hope that will address your concerns about having only an ACPI quirk
for this.

Thanks,
Shameer

^ permalink raw reply	[flat|nested] 59+ messages in thread

* [PATCH v8 2/5] ACPI/IORT: Add msi address regions reservation helper
@ 2017-10-06 10:17             ` Shameerali Kolothum Thodi
  0 siblings, 0 replies; 59+ messages in thread
From: Shameerali Kolothum Thodi @ 2017-10-06 10:17 UTC (permalink / raw)
  To: linux-arm-kernel



> -----Original Message-----
> From: Marc Zyngier [mailto:marc.zyngier@arm.com]
> Sent: Wednesday, October 04, 2017 3:18 PM
> To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>;
> lorenzo.pieralisi@arm.com; sudeep.holla@arm.com; will.deacon@arm.com;
> robin.murphy@arm.com; joro@8bytes.org; mark.rutland@arm.com;
> robh@kernel.org
> Cc: Gabriele Paoloni <gabriele.paoloni@huawei.com>; John Garry
> <john.garry@huawei.com>; iommu@lists.linux-foundation.org;
> linux-arm-kernel@lists.infradead.org; linux-acpi@vger.kernel.org;
> devicetree@vger.kernel.org; devel@acpica.org; Linuxarm
> <linuxarm@huawei.com>; Wangzhou (B) <wangzhou1@hisilicon.com>;
> Guohanjun (Hanjun Guo) <guohanjun@huawei.com>
> Subject: Re: [PATCH v8 2/5] ACPI/IORT: Add msi address regions reservation
> helper
> 
> On 27/09/17 14:32, Shameer Kolothum wrote:
> > On some platforms msi parent address regions have to be excluded from
> > normal IOVA allocation in that they are detected and decoded in a HW
> > specific way by system components and so they cannot be considered
> > normal IOVA address space.
> >
> > Add a helper function that retrieves ITS address regions - the msi
> > parent - through IORT device <-> ITS mappings and reserves it so that
> > these regions will not be translated by IOMMU and will be excluded
> > from IOVA allocations.
> >
> > Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
> > [lorenzo.pieralisi@arm.com: updated commit log/added comments]
> > Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
> > ---
> >  drivers/acpi/arm64/iort.c        | 96 ++++++++++++++++++++++++++++++++++++++--
> >  drivers/irqchip/irq-gic-v3-its.c |  3 +-
> >  include/linux/acpi_iort.h        |  7 ++-
> >  3 files changed, 101 insertions(+), 5 deletions(-)
> >
> > diff --git a/drivers/acpi/arm64/iort.c b/drivers/acpi/arm64/iort.c
> > index 9565d57..14efa9d 100644
> > --- a/drivers/acpi/arm64/iort.c
> > +++ b/drivers/acpi/arm64/iort.c
> > @@ -39,6 +39,7 @@
> >  struct iort_its_msi_chip {
> >  	struct list_head	list;
> >  	struct fwnode_handle	*fw_node;
> > +	phys_addr_t		base_addr;
> >  	u32			translation_id;
> >  };
> >
> > @@ -136,14 +137,16 @@ typedef acpi_status (*iort_find_node_callback)
> > static DEFINE_SPINLOCK(iort_msi_chip_lock);
> >
> >  /**
> > - * iort_register_domain_token() - register domain token and related ITS ID
> > - * to the list from where we can get it back later on.
> > + * iort_register_domain_token() - register domain token along with related
> > + * ITS ID and base address to the list from where we can get it back later on.
> >   * @trans_id: ITS ID.
> > + * @base: ITS base address.
> >   * @fw_node: Domain token.
> >   *
> >   * Returns: 0 on success, -ENOMEM if no memory when allocating list element
> >   */
> > -int iort_register_domain_token(int trans_id, struct fwnode_handle *fw_node)
> > +int iort_register_domain_token(int trans_id, phys_addr_t base,
> > +			       struct fwnode_handle *fw_node)
> >  {
> >  	struct iort_its_msi_chip *its_msi_chip;
> >
> > @@ -153,6 +156,7 @@ int iort_register_domain_token(int trans_id, struct fwnode_handle *fw_node)
> >
> >  	its_msi_chip->fw_node = fw_node;
> >  	its_msi_chip->translation_id = trans_id;
> > +	its_msi_chip->base_addr = base;
> >
> >  	spin_lock(&iort_msi_chip_lock);
> >  	list_add(&its_msi_chip->list, &iort_msi_chip_list);
> > @@ -481,6 +485,24 @@ int iort_pmsi_get_dev_id(struct device *dev, u32 *dev_id)
> >  	return -ENODEV;
> >  }
> >
> > +static int __maybe_unused iort_find_its_base(u32 its_id, phys_addr_t *base)
> > +{
> > +	struct iort_its_msi_chip *its_msi_chip;
> > +	bool match = false;
> > +
> > +	spin_lock(&iort_msi_chip_lock);
> > +	list_for_each_entry(its_msi_chip, &iort_msi_chip_list, list) {
> > +		if (its_msi_chip->translation_id == its_id) {
> > +			*base = its_msi_chip->base_addr;
> > +			match = true;
> > +			break;
> > +		}
> > +	}
> > +	spin_unlock(&iort_msi_chip_lock);
> > +
> > +	return match ? 0 : -ENODEV;
> > +}
> > +
> >  /**
> >   * iort_dev_find_its_id() - Find the ITS identifier for a device
> >   * @dev: The device.
> > @@ -639,6 +661,72 @@ int iort_add_device_replay(const struct iommu_ops *ops, struct device *dev)
> >
> >  	return err;
> >  }
> > +
> > +/**
> > + * iort_iommu_msi_get_resv_regions - Reserved region driver helper
> > + * @dev: Device from iommu_get_resv_regions()
> > + * @head: Reserved region list from iommu_get_resv_regions()
> > + *
> > + * Returns: Number of reserved regions on success (0 if no associated msi
> > + *          regions), appropriate error value otherwise. The ITS regions
> > + *          associated with the device are the msi reserved regions.
> > + */
> > +int iort_iommu_msi_get_resv_regions(struct device *dev, struct list_head *head)
> > +{
> > +	struct acpi_iort_its_group *its;
> > +	struct acpi_iort_node *node, *its_node = NULL;
> > +	int i, resv = 0;
> > +
> > +	node = iort_find_dev_node(dev);
> > +	if (!node)
> > +		return -ENODEV;
> > +
> > +	/*
> > +	 * Current logic to reserve ITS regions relies on HW topologies
> > +	 * where a given PCI or named component maps its IDs to only one
> > +	 * ITS group; if a PCI or named component can map its IDs to
> > +	 * different ITS groups through IORT mappings this function has
> > +	 * to be reworked to ensure we reserve regions for all ITS groups
> > +	 * a given PCI or named component may map IDs to.
> > +	 */
> > +	if (dev_is_pci(dev)) {
> > +		u32 rid;
> > +
> > +		pci_for_each_dma_alias(to_pci_dev(dev), __get_pci_rid, &rid);
> > +		its_node = iort_node_map_id(node, rid, NULL, IORT_MSI_TYPE);
> > +	} else {
> > +		for (i = 0; i < node->mapping_count; i++) {
> > +			its_node = iort_node_map_platform_id(node, NULL,
> > +							 IORT_MSI_TYPE, i);
> > +			if (its_node)
> > +				break;
> > +		}
> > +	}
> > +
> > +	if (!its_node)
> > +		return 0;
> > +
> > +	/* Move to ITS specific data */
> > +	its = (struct acpi_iort_its_group *)its_node->node_data;
> > +
> > +	for (i = 0; i < its->its_count; i++) {
> > +		phys_addr_t base;
> > +
> > +		if (!iort_find_its_base(its->identifiers[i], &base)) {
> > +			int prot = IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO;
> > +			struct iommu_resv_region *region;
> > +
> > +			region = iommu_alloc_resv_region(base, SZ_128K, prot,
> > +							 IOMMU_RESV_MSI);
> 
> Same as the OF part: I strongly object to reserving the whole 128kB range.
> What we really care about is the second 64kB page, and that is what should
> get reserved.

Thanks Marc. I will make the changes in the next revision.
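
For illustration only, here is a minimal standalone sketch of the narrower
reservation. It assumes the GICv3 ITS layout in which the second 64kB frame
is the translation frame that holds GITS_TRANSLATER. The names
its_msi_resv_range() and struct resv_range are hypothetical and are not part
of this series; in the patch itself the equivalent change would presumably be
to pass base + SZ_64K and SZ_64K to iommu_alloc_resv_region().

```c
#include <assert.h>
#include <stdint.h>

#define SZ_64K	0x10000ULL

/* Hypothetical stand-in for the start/length pair of a reserved region. */
struct resv_range {
	uint64_t start;
	uint64_t length;
};

/*
 * Given the ITS base address, return only the second 64kB page
 * instead of the whole 128kB ITS register range.
 */
struct resv_range its_msi_resv_range(uint64_t its_base)
{
	struct resv_range r;

	/* Assumption: the ITS base is 64kB aligned. */
	assert((its_base & (SZ_64K - 1)) == 0);

	r.start = its_base + SZ_64K;	/* skip the first (control) frame */
	r.length = SZ_64K;		/* reserve one 64kB page only */
	return r;
}
```

With this shape the first 64kB page of the ITS stays available for normal
IOVA allocation, which is the point of the review comment.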

Also, as we are still discussing the DT approach for this, I am thinking
of sending out v9 with the above fix and blacklisting the HiSilicon PCIe
controllers on DT-based hip06/hip07 systems when SMMUv3 is enabled.

Hi Will,

Hope that will address your concerns with respect to having only an ACPI quirk
for this.

Thanks,
Shameer

^ permalink raw reply	[flat|nested] 59+ messages in thread

end of thread, other threads:[~2017-10-06 10:17 UTC | newest]

Thread overview: 59+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-09-27 13:32 [PATCH v8 0/5] iommu/smmu-v3: Workaround for hisilicon 161010801 erratum(reserve HW MSI) Shameer Kolothum
2017-09-27 13:32 ` [Devel] " Shameer Kolothum
2017-09-27 13:32 ` Shameer Kolothum
2017-09-27 13:32 ` [PATCH v8 1/5] Doc: iommu/arm-smmu-v3: Add workaround for HiSilicon erratum 161010801 Shameer Kolothum
2017-09-27 13:32   ` [Devel] " Shameer Kolothum
2017-09-27 13:32   ` Shameer Kolothum
2017-10-05 22:24   ` Rob Herring
2017-10-05 22:24     ` Rob Herring
     [not found] ` <20170927133241.21036-1-shameerali.kolothum.thodi-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
2017-09-27 13:32   ` [PATCH v8 2/5] ACPI/IORT: Add msi address regions reservation helper Shameer Kolothum
2017-09-27 13:32     ` [Devel] " Shameer Kolothum
2017-09-27 13:32     ` Shameer Kolothum
     [not found]     ` <20170927133241.21036-3-shameerali.kolothum.thodi-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
2017-10-04 14:17       ` Marc Zyngier
2017-10-04 14:17         ` [Devel] " Marc Zyngier
2017-10-04 14:17         ` Marc Zyngier
     [not found]         ` <001729e8-7b34-0bb4-8af6-8af100661906-5wv7dgnIgG8@public.gmane.org>
2017-10-06 10:17           ` Shameerali Kolothum Thodi
2017-10-06 10:17             ` [Devel] " Shameerali Kolothum Thodi
2017-10-06 10:17             ` Shameerali Kolothum Thodi
2017-10-04 10:03   ` [PATCH v8 0/5] iommu/smmu-v3: Workaround for hisilicon 161010801 erratum(reserve HW MSI) Shameerali Kolothum Thodi
2017-10-04 10:03     ` [Devel] " Shameerali Kolothum Thodi
2017-10-04 10:03     ` Shameerali Kolothum Thodi
2017-09-27 13:32 ` [PATCH v8 3/5] iommu/of: Add msi address regions reservation helper Shameer Kolothum
2017-09-27 13:32   ` [Devel] " Shameer Kolothum
2017-09-27 13:32   ` Shameer Kolothum
     [not found]   ` <20170927133241.21036-4-shameerali.kolothum.thodi-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
2017-10-04 11:22     ` Marc Zyngier
2017-10-04 11:22       ` [Devel] " Marc Zyngier
2017-10-04 11:22       ` Marc Zyngier
     [not found]       ` <f3146f49-6004-7a15-866b-653e3ea54856-5wv7dgnIgG8@public.gmane.org>
2017-10-04 13:50         ` Lorenzo Pieralisi
2017-10-04 13:50           ` [Devel] " Lorenzo Pieralisi
2017-10-04 13:50           ` Lorenzo Pieralisi
2017-10-05 11:07           ` Robin Murphy
2017-10-05 11:07             ` [Devel] " Robin Murphy
2017-10-05 11:07             ` Robin Murphy
     [not found]             ` <5c80f292-dc5d-a1c6-d7a0-df2a84cc77d1-5wv7dgnIgG8@public.gmane.org>
2017-10-05 11:57               ` Will Deacon
2017-10-05 11:57                 ` [Devel] " Will Deacon
2017-10-05 11:57                 ` Will Deacon
     [not found]                 ` <20171005115719.GC11088-5wv7dgnIgG8@public.gmane.org>
2017-10-05 12:55                   ` Robin Murphy
2017-10-05 12:55                     ` [Devel] " Robin Murphy
2017-10-05 12:55                     ` Robin Murphy
2017-10-05 12:37             ` John Garry
2017-10-05 12:37               ` [Devel] " John Garry
2017-10-05 12:37               ` John Garry
     [not found]               ` <d525f112-fb20-d14b-bea0-6547c8b933dd-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
2017-10-05 12:44                 ` Robin Murphy
2017-10-05 12:44                   ` [Devel] " Robin Murphy
2017-10-05 12:44                   ` Robin Murphy
2017-10-05 13:08                   ` John Garry
2017-10-05 13:08                     ` [Devel] " John Garry
2017-10-05 13:08                     ` John Garry
     [not found]                     ` <82272c18-717d-bbaf-b8c3-61785af496bd-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
2017-10-05 14:05                       ` Robin Murphy
2017-10-05 14:05                         ` [Devel] " Robin Murphy
2017-10-05 14:05                         ` Robin Murphy
2017-10-04 17:07         ` Shameerali Kolothum Thodi
2017-10-04 17:07           ` [Devel] " Shameerali Kolothum Thodi
2017-10-04 17:07           ` Shameerali Kolothum Thodi
2017-09-27 13:32 ` [PATCH v8 4/5] iommu/dma: Add a helper function to reserve HW MSI address regions for IOMMU drivers Shameer Kolothum
2017-09-27 13:32   ` [Devel] " Shameer Kolothum
2017-09-27 13:32   ` Shameer Kolothum
2017-09-27 13:32 ` [PATCH v8 5/5] iommu/arm-smmu-v3:Enable ACPI based HiSilicon erratum 161010801 Shameer Kolothum
2017-09-27 13:32   ` [Devel] " Shameer Kolothum
2017-09-27 13:32   ` Shameer Kolothum
