Xen-Devel Archive on lore.kernel.org
* [Xen-devel] [PATCH V2 0/6] iommu/arm: Add Renesas IPMMU-VMSA support + Linux's iommu_fwspec
@ 2019-08-02 16:39 Oleksandr Tyshchenko
  2019-08-02 16:39 ` [Xen-devel] [PATCH V2 1/6] iommu/arm: Add iommu_helpers.c file to keep common for IOMMUs stuff Oleksandr Tyshchenko
                   ` (6 more replies)
  0 siblings, 7 replies; 59+ messages in thread
From: Oleksandr Tyshchenko @ 2019-08-02 16:39 UTC (permalink / raw)
  To: xen-devel
  Cc: Oleksandr Tyshchenko, julien.grall, sstabellini, Yoshihiro Shimoda

From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>

The purpose of this patch series is to add IPMMU-VMSA support to Xen on ARM.

Besides the new IOMMU driver, this series contains "iommu_fwspec" support
and a new API, iommu_add_dt_device(), for adding a DT device to the IOMMU.

The IPMMU-VMSA is a VMSA-compatible I/O Memory Management Unit (IOMMU)
which provides address translation and access protection functionality
to processing units and interconnect networks.

Please note, this driver is supposed to work only with the newest
Gen3 SoC revisions, whose IPMMU hardware supports the stage 2 translation
table format and is able to use the CPU's P2M table as is, provided it is
a 3-level page table (up to 40 bit IPA).

----------
This driver is based on Linux's IPMMU-VMSA driver from Renesas BSP:
https://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas-bsp.git/tree/drivers/iommu/ipmmu-vmsa.c?h=v4.14.75-ltsi/rcar-3.9.6
and Xen's SMMU driver:
xen/drivers/passthrough/arm/smmu.c

Although the Xen driver has a lot in common with the Linux driver, it is
not a "direct port" and should be treated as such.

The major differences compared to the Linux driver are:

1. Stage 1/Stage 2 translation. The Linux driver supports Stage 1
translation only (with the Stage 1 translation table format). It manages
the page table by itself. The Xen driver, in contrast, supports Stage 2
translation (with the Stage 2 translation table format) to be able to
share the P2M with the CPU. Stage 1 translation is always bypassed in the
Xen driver.

So, the Xen driver is supposed to be used only with the newest Gen3 SoC
revisions (H3 ES3.0, M3 ES3.0, etc.), whose IPMMU H/W supports the stage 2
translation table format.

2. AArch64 support. The Linux driver uses VMSAv8-32 mode, while the Xen
driver enables Armv8 VMSAv8-64 mode to cover up to 40 bit input addresses.

3. Context bank (set of page tables) usage. In Xen, each context bank is
mapped to one Xen domain. So, all devices passed through to the
same Xen domain share the same context bank.

----------
The driver was tested on Gen3 H3 ES3.0 based boards using current staging
(7d1460c xen/arm: optee: fix compilation with GCC 4.8)
in a system with several DMA masters assigned to different guest domains.

You can find it here:
repo: https://github.com/otyshchenko1/xen.git branch: ipmmu_upstream2

You can find previous discussions here:
[V1] https://lists.xenproject.org/archives/html/xen-devel/2019-06/msg01755.html


Oleksandr Tyshchenko (6):
  iommu/arm: Add iommu_helpers.c file to keep common for IOMMUs stuff
  iommu/arm: Add ability to handle deferred probing request
  [RFC] xen/common: Introduce _xrealloc function
  iommu/arm: Add lightweight iommu_fwspec support
  iommu/arm: Introduce iommu_add_dt_device API
  iommu/arm: Add Renesas IPMMU-VMSA support

 xen/arch/arm/domain_build.c                 |   12 +
 xen/arch/arm/platforms/Kconfig              |    1 +
 xen/common/device_tree.c                    |    1 +
 xen/common/xmalloc_tlsf.c                   |   21 +
 xen/drivers/passthrough/Kconfig             |   13 +
 xen/drivers/passthrough/arm/Makefile        |    3 +-
 xen/drivers/passthrough/arm/iommu.c         |   80 +-
 xen/drivers/passthrough/arm/iommu_fwspec.c  |   91 ++
 xen/drivers/passthrough/arm/iommu_helpers.c |   78 ++
 xen/drivers/passthrough/arm/ipmmu-vmsa.c    | 1342 +++++++++++++++++++++++++++
 xen/drivers/passthrough/arm/smmu.c          |   48 +-
 xen/include/asm-arm/device.h                |    7 +-
 xen/include/asm-arm/iommu.h                 |   12 +
 xen/include/asm-arm/iommu_fwspec.h          |   65 ++
 xen/include/xen/device_tree.h               |    1 +
 xen/include/xen/xmalloc.h                   |    1 +
 16 files changed, 1727 insertions(+), 49 deletions(-)
 create mode 100644 xen/drivers/passthrough/arm/iommu_fwspec.c
 create mode 100644 xen/drivers/passthrough/arm/iommu_helpers.c
 create mode 100644 xen/drivers/passthrough/arm/ipmmu-vmsa.c
 create mode 100644 xen/include/asm-arm/iommu_fwspec.h

-- 
2.7.4


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel


* [Xen-devel] [PATCH V2 1/6] iommu/arm: Add iommu_helpers.c file to keep common for IOMMUs stuff
  2019-08-02 16:39 [Xen-devel] [PATCH V2 0/6] iommu/arm: Add Renesas IPMMU-VMSA support + Linux's iommu_fwspec Oleksandr Tyshchenko
@ 2019-08-02 16:39 ` Oleksandr Tyshchenko
  2019-08-09 17:35   ` Julien Grall
  2019-08-02 16:39 ` [Xen-devel] [PATCH V2 2/6] iommu/arm: Add ability to handle deferred probing request Oleksandr Tyshchenko
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 59+ messages in thread
From: Oleksandr Tyshchenko @ 2019-08-02 16:39 UTC (permalink / raw)
  To: xen-devel; +Cc: Oleksandr Tyshchenko, julien.grall, sstabellini

From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>

Introduce a separate file to keep various helpers which could be used
by more than one IOMMU driver in order not to duplicate code.

The first candidates to be moved to the new file are the SMMU driver's
"map_page/unmap_page" callbacks. These callbacks neither contain any
SMMU specific info nor perform any SMMU specific actions, and are going
to be the same across all IOMMU drivers whose H/W IP shares the P2M
with the CPU like the SMMU does.

So, move the callbacks to iommu_helpers.c for the upcoming IPMMU driver
to be able to re-use them.

Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
---
 xen/drivers/passthrough/arm/Makefile        |  2 +-
 xen/drivers/passthrough/arm/iommu_helpers.c | 78 +++++++++++++++++++++++++++++
 xen/drivers/passthrough/arm/smmu.c          | 48 +-----------------
 xen/include/asm-arm/iommu.h                 |  7 +++
 4 files changed, 88 insertions(+), 47 deletions(-)
 create mode 100644 xen/drivers/passthrough/arm/iommu_helpers.c

diff --git a/xen/drivers/passthrough/arm/Makefile b/xen/drivers/passthrough/arm/Makefile
index b3efcfd..4abb87a 100644
--- a/xen/drivers/passthrough/arm/Makefile
+++ b/xen/drivers/passthrough/arm/Makefile
@@ -1,2 +1,2 @@
-obj-y += iommu.o
+obj-y += iommu.o iommu_helpers.o
 obj-$(CONFIG_ARM_SMMU) += smmu.o
diff --git a/xen/drivers/passthrough/arm/iommu_helpers.c b/xen/drivers/passthrough/arm/iommu_helpers.c
new file mode 100644
index 0000000..53e8daa
--- /dev/null
+++ b/xen/drivers/passthrough/arm/iommu_helpers.c
@@ -0,0 +1,78 @@
+/*
+ * xen/drivers/passthrough/arm/iommu_helpers.c
+ *
+ * Contains various helpers to be used by IOMMU drivers.
+ *
+ * Copyright (C) 2019 EPAM Systems Inc.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms and conditions of the GNU General Public
+ * License, version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public
+ * License along with this program; If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <xen/lib.h>
+#include <xen/sched.h>
+#include <xen/iommu.h>
+#include <asm/device.h>
+
+/* Should only be used if P2M Table is shared between the CPU and the IOMMU. */
+int __must_check arm_iommu_map_page(struct domain *d, dfn_t dfn, mfn_t mfn,
+                                    unsigned int flags,
+                                    unsigned int *flush_flags)
+{
+    p2m_type_t t;
+
+    /*
+     * Grant mappings can be used for DMA requests. The dev_bus_addr
+     * returned by the hypercall is the MFN (not the IPA). For device
+     * protected by an IOMMU, Xen needs to add a 1:1 mapping in the domain
+     * p2m to allow DMA request to work.
+     * This is only valid when the domain is directed mapped. Hence this
+     * function should only be used by gnttab code with gfn == mfn == dfn.
+     */
+    BUG_ON(!is_domain_direct_mapped(d));
+    BUG_ON(mfn_x(mfn) != dfn_x(dfn));
+
+    /* We only support readable and writable flags */
+    if ( !(flags & (IOMMUF_readable | IOMMUF_writable)) )
+        return -EINVAL;
+
+    t = (flags & IOMMUF_writable) ? p2m_iommu_map_rw : p2m_iommu_map_ro;
+
+    /*
+     * The function guest_physmap_add_entry replaces the current mapping
+     * if there is already one...
+     */
+    return guest_physmap_add_entry(d, _gfn(dfn_x(dfn)), _mfn(dfn_x(dfn)), 0, t);
+}
+
+/* Should only be used if P2M Table is shared between the CPU and the IOMMU. */
+int __must_check arm_iommu_unmap_page(struct domain *d, dfn_t dfn,
+                                      unsigned int *flush_flags)
+{
+    /*
+     * This function should only be used by gnttab code when the domain
+     * is direct mapped (i.e. gfn == mfn == dfn).
+     */
+    if ( !is_domain_direct_mapped(d) )
+        return -EINVAL;
+
+    return guest_physmap_remove_page(d, _gfn(dfn_x(dfn)), _mfn(dfn_x(dfn)), 0);
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/drivers/passthrough/arm/smmu.c b/xen/drivers/passthrough/arm/smmu.c
index f151b9f..8ae986a 100644
--- a/xen/drivers/passthrough/arm/smmu.c
+++ b/xen/drivers/passthrough/arm/smmu.c
@@ -2734,50 +2734,6 @@ static void arm_smmu_iommu_domain_teardown(struct domain *d)
 	xfree(xen_domain);
 }
 
-static int __must_check arm_smmu_map_page(struct domain *d, dfn_t dfn,
-					  mfn_t mfn, unsigned int flags,
-					  unsigned int *flush_flags)
-{
-	p2m_type_t t;
-
-	/*
-	 * Grant mappings can be used for DMA requests. The dev_bus_addr
-	 * returned by the hypercall is the MFN (not the IPA). For device
-	 * protected by an IOMMU, Xen needs to add a 1:1 mapping in the domain
-	 * p2m to allow DMA request to work.
-	 * This is only valid when the domain is directed mapped. Hence this
-	 * function should only be used by gnttab code with gfn == mfn == dfn.
-	 */
-	BUG_ON(!is_domain_direct_mapped(d));
-	BUG_ON(mfn_x(mfn) != dfn_x(dfn));
-
-	/* We only support readable and writable flags */
-	if (!(flags & (IOMMUF_readable | IOMMUF_writable)))
-		return -EINVAL;
-
-	t = (flags & IOMMUF_writable) ? p2m_iommu_map_rw : p2m_iommu_map_ro;
-
-	/*
-	 * The function guest_physmap_add_entry replaces the current mapping
-	 * if there is already one...
-	 */
-	return guest_physmap_add_entry(d, _gfn(dfn_x(dfn)), _mfn(dfn_x(dfn)),
-				       0, t);
-}
-
-static int __must_check arm_smmu_unmap_page(struct domain *d, dfn_t dfn,
-                                            unsigned int *flush_flags)
-{
-	/*
-	 * This function should only be used by gnttab code when the domain
-	 * is direct mapped (i.e. gfn == mfn == dfn).
-	 */
-	if ( !is_domain_direct_mapped(d) )
-		return -EINVAL;
-
-	return guest_physmap_remove_page(d, _gfn(dfn_x(dfn)), _mfn(dfn_x(dfn)), 0);
-}
-
 static const struct iommu_ops arm_smmu_iommu_ops = {
     .init = arm_smmu_iommu_domain_init,
     .hwdom_init = arm_smmu_iommu_hwdom_init,
@@ -2786,8 +2742,8 @@ static const struct iommu_ops arm_smmu_iommu_ops = {
     .iotlb_flush_all = arm_smmu_iotlb_flush_all,
     .assign_device = arm_smmu_assign_dev,
     .reassign_device = arm_smmu_reassign_dev,
-    .map_page = arm_smmu_map_page,
-    .unmap_page = arm_smmu_unmap_page,
+    .map_page = arm_iommu_map_page,
+    .unmap_page = arm_iommu_unmap_page,
 };
 
 static __init const struct arm_smmu_device *find_smmu(const struct device *dev)
diff --git a/xen/include/asm-arm/iommu.h b/xen/include/asm-arm/iommu.h
index 904c9ae..20d865e 100644
--- a/xen/include/asm-arm/iommu.h
+++ b/xen/include/asm-arm/iommu.h
@@ -26,6 +26,13 @@ struct arch_iommu
 const struct iommu_ops *iommu_get_ops(void);
 void iommu_set_ops(const struct iommu_ops *ops);
 
+/* mapping helpers */
+int __must_check arm_iommu_map_page(struct domain *d, dfn_t dfn, mfn_t mfn,
+                                    unsigned int flags,
+                                    unsigned int *flush_flags);
+int __must_check arm_iommu_unmap_page(struct domain *d, dfn_t dfn,
+                                      unsigned int *flush_flags);
+
 #endif /* __ARCH_ARM_IOMMU_H__ */
 
 /*
-- 
2.7.4



* [Xen-devel] [PATCH V2 2/6] iommu/arm: Add ability to handle deferred probing request
  2019-08-02 16:39 [Xen-devel] [PATCH V2 0/6] iommu/arm: Add Renesas IPMMU-VMSA support + Linux's iommu_fwspec Oleksandr Tyshchenko
  2019-08-02 16:39 ` [Xen-devel] [PATCH V2 1/6] iommu/arm: Add iommu_helpers.c file to keep common for IOMMUs stuff Oleksandr Tyshchenko
@ 2019-08-02 16:39 ` Oleksandr Tyshchenko
  2019-08-12 11:11   ` Julien Grall
  2019-08-02 16:39 ` [Xen-devel] [PATCH V2 3/6] [RFC] xen/common: Introduce _xrealloc function Oleksandr Tyshchenko
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 59+ messages in thread
From: Oleksandr Tyshchenko @ 2019-08-02 16:39 UTC (permalink / raw)
  To: xen-devel; +Cc: Oleksandr Tyshchenko, julien.grall, sstabellini

From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>

This patch adds the minimal required support to the generic IOMMU
framework to handle the case when an IOMMU driver requests deferred
probing for a device.

In order not to pull Linux's error code (-EPROBE_DEFER) into Xen,
we have chosen -EAGAIN to indicate that device probing is deferred.

This is needed for the upcoming IPMMU driver, which may request
deferred probing depending on which device is probed first.
There is a dependency between these devices: the Root device must be
registered before the Cache devices. If it is not, the driver will deny
further Cache device probes until the Root device is registered.
As we can't guarantee a fixed pre-defined order for the device nodes
in the DT, we need to be ready for devices being probed in any order.

Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
---
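For illustration only, the retry scheme described above can be sketched
in plain C outside of Xen. Everything here (struct mock_dev, mock_probe(),
probe_all()) is hypothetical and merely mimics a Root/Cache dependency;
a plain array stands in for Xen's list helpers. The sketch also stops when
a full pass over the deferred devices makes no progress, which is an extra
guard assumed here and not something the patch below does.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a DT node; not Xen's struct dt_device_node. */
struct mock_dev {
    bool is_root;   /* Root device (true) or Cache device (false) */
    bool probed;    /* set once the probe has succeeded */
    bool deferred;  /* set while the device waits for a retry */
};

static bool root_registered;

/* Hypothetical probe: Cache devices defer until the Root device is up. */
static int mock_probe(struct mock_dev *dev)
{
    if ( !dev->is_root && !root_registered )
        return -EAGAIN;

    if ( dev->is_root )
        root_registered = true;

    dev->probed = true;
    return 0;
}

/* Probe all devices and return how many were probed successfully. */
static unsigned int probe_all(struct mock_dev *devs, size_t n)
{
    unsigned int num_ok = 0, pending = 0;
    size_t i;

    /* First pass: probe everything and remember the deferrals. */
    for ( i = 0; i < n; i++ )
    {
        int rc = mock_probe(&devs[i]);

        if ( !rc )
            num_ok++;
        else if ( rc == -EAGAIN )
        {
            devs[i].deferred = true;
            pending++;
        }
    }

    /* Retry deferred devices while at least one device has been probed. */
    while ( pending && num_ok )
    {
        unsigned int before = pending;

        for ( i = 0; i < n; i++ )
        {
            int rc;

            if ( !devs[i].deferred )
                continue;

            rc = mock_probe(&devs[i]);
            if ( !rc )
                num_ok++;
            if ( rc != -EAGAIN )
            {
                devs[i].deferred = false;
                pending--;
            }
        }

        /* Assumed guard: give up if a whole pass made no progress. */
        if ( pending == before )
            break;
    }

    return num_ok;
}
```

With two Cache devices listed before the Root device, the first pass
probes only the Root device and the retry pass picks up both Cache devices.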
 xen/common/device_tree.c            |  1 +
 xen/drivers/passthrough/arm/iommu.c | 35 ++++++++++++++++++++++++++++++++++-
 xen/include/asm-arm/device.h        |  6 +++++-
 xen/include/xen/device_tree.h       |  1 +
 4 files changed, 41 insertions(+), 2 deletions(-)

diff --git a/xen/common/device_tree.c b/xen/common/device_tree.c
index e107c6f..6f37448 100644
--- a/xen/common/device_tree.c
+++ b/xen/common/device_tree.c
@@ -1774,6 +1774,7 @@ static unsigned long __init unflatten_dt_node(const void *fdt,
         /* By default the device is not protected */
         np->is_protected = false;
         INIT_LIST_HEAD(&np->domain_list);
+        INIT_LIST_HEAD(&np->deferred_probe);
 
         if ( new_format )
         {
diff --git a/xen/drivers/passthrough/arm/iommu.c b/xen/drivers/passthrough/arm/iommu.c
index 2135233..3195919 100644
--- a/xen/drivers/passthrough/arm/iommu.c
+++ b/xen/drivers/passthrough/arm/iommu.c
@@ -20,6 +20,12 @@
 #include <xen/device_tree.h>
 #include <asm/device.h>
 
+/*
+ * Used to keep track of devices for which driver requested deferred probing
+ * (returns -EAGAIN).
+ */
+static LIST_HEAD(deferred_probe_list);
+
 static const struct iommu_ops *iommu_ops;
 
 const struct iommu_ops *iommu_get_ops(void)
@@ -42,7 +48,7 @@ void __init iommu_set_ops(const struct iommu_ops *ops)
 
 int __init iommu_hardware_setup(void)
 {
-    struct dt_device_node *np;
+    struct dt_device_node *np, *tmp;
     int rc;
     unsigned int num_iommus = 0;
 
@@ -51,6 +57,33 @@ int __init iommu_hardware_setup(void)
         rc = device_init(np, DEVICE_IOMMU, NULL);
         if ( !rc )
             num_iommus++;
+        else if (rc == -EAGAIN)
+            /*
+             * Driver requested deferred probing, so add this device to
+             * the deferred list for further processing.
+             */
+            list_add(&np->deferred_probe, &deferred_probe_list);
+    }
+
+    /*
+     * Process devices in the deferred list if at least one successfully
+     * probed device is present.
+     */
+    while ( !list_empty(&deferred_probe_list) && num_iommus )
+    {
+        list_for_each_entry_safe ( np, tmp, &deferred_probe_list,
+                                   deferred_probe )
+        {
+            rc = device_init(np, DEVICE_IOMMU, NULL);
+            if ( !rc )
+                num_iommus++;
+            if ( rc != -EAGAIN )
+                /*
+                 * Driver didn't request deferred probing, so remove this device
+                 * from the deferred list.
+                 */
+                list_del_init(&np->deferred_probe);
+        }
     }
 
     return ( num_iommus > 0 ) ? 0 : -ENODEV;
diff --git a/xen/include/asm-arm/device.h b/xen/include/asm-arm/device.h
index 63a0f36..ee1c3bc 100644
--- a/xen/include/asm-arm/device.h
+++ b/xen/include/asm-arm/device.h
@@ -44,7 +44,11 @@ struct device_desc {
     enum device_class class;
     /* List of devices supported by this driver */
     const struct dt_device_match *dt_match;
-    /* Device initialization */
+    /*
+     * Device initialization.
+     *
+     * -EAGAIN is used to indicate that device probing is deferred.
+     */
     int (*init)(struct dt_device_node *dev, const void *data);
 };
 
diff --git a/xen/include/xen/device_tree.h b/xen/include/xen/device_tree.h
index 8315629..71b0e47 100644
--- a/xen/include/xen/device_tree.h
+++ b/xen/include/xen/device_tree.h
@@ -93,6 +93,7 @@ struct dt_device_node {
     /* IOMMU specific fields */
     bool is_protected;
     struct list_head domain_list;
+    struct list_head deferred_probe;
 
     struct device dev;
 };
-- 
2.7.4



* [Xen-devel] [PATCH V2 3/6] [RFC] xen/common: Introduce _xrealloc function
  2019-08-02 16:39 [Xen-devel] [PATCH V2 0/6] iommu/arm: Add Renesas IPMMU-VMSA support + Linux's iommu_fwspec Oleksandr Tyshchenko
  2019-08-02 16:39 ` [Xen-devel] [PATCH V2 1/6] iommu/arm: Add iommu_helpers.c file to keep common for IOMMUs stuff Oleksandr Tyshchenko
  2019-08-02 16:39 ` [Xen-devel] [PATCH V2 2/6] iommu/arm: Add ability to handle deferred probing request Oleksandr Tyshchenko
@ 2019-08-02 16:39 ` Oleksandr Tyshchenko
  2019-08-05 10:02   ` Jan Beulich
  2019-08-02 16:39 ` [Xen-devel] [PATCH V2 4/6] iommu/arm: Add lightweight iommu_fwspec support Oleksandr Tyshchenko
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 59+ messages in thread
From: Oleksandr Tyshchenko @ 2019-08-02 16:39 UTC (permalink / raw)
  To: xen-devel
  Cc: sstabellini, Wei Liu, Konrad Rzeszutek Wilk, George Dunlap,
	Andrew Cooper, Ian Jackson, Tim Deegan, Oleksandr Tyshchenko,
	julien.grall, Jan Beulich

From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>

The next patch in this series will make use of it.

Original patch was initially posted by Sameer Goel:
https://lists.xen.org/archives/html/xen-devel/2017-06/msg00858.html

This could be considered as another attempt to add it:
https://www.mail-archive.com/kexec@lists.infradead.org/msg21335.html

Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
CC: George Dunlap <George.Dunlap@eu.citrix.com>
CC: Ian Jackson <ian.jackson@eu.citrix.com>
CC: Jan Beulich <jbeulich@suse.com>
CC: Julien Grall <julien.grall@arm.com>
CC: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
CC: Stefano Stabellini <sstabellini@kernel.org>
CC: Tim Deegan <tim@xen.org>
CC: Wei Liu <wl@xen.org>

---
   [As previously discussed with Julien on IRC]

   The reason for this patch being an RFC is that the patch itself is not
   completely correct and I don't fully understand what should be done,
   and how, for it to be accepted, or whether the community even
   wants this to go in. So, to avoid bikeshedding, the first target is
   to collect feedback from the maintainers.

   In a nutshell, the upcoming "iommu_fwspec" support on ARM
   is going to use xrealloc when adding a new device ID.

   We really want to have "iommu_fwspec" support, which will give us
   a generic abstract way to add a new device to the IOMMU based on
   the generic IOMMU DT binding.

   This is how Linux does it:
   https://github.com/torvalds/linux/blob/master/drivers/iommu/iommu.c#L2257
   and we are doing something similar in the next patch of this thread:
   "iommu/arm: Add lightweight iommu_fwspec support"
---
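As a point of comparison, here is a minimal sketch of a grow-and-copy
helper written against the standard C allocator rather than Xen's. The
posted _xrealloc() passes new_size to memcpy(), so growing a block reads
beyond the end of the old allocation. The sketch assumes the caller can
supply the old size; realloc_copy() and its old_size parameter are
hypothetical and not part of this patch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: resize a block by allocating a new one and copying.
 * Unlike the posted _xrealloc(), the caller passes the old size explicitly.
 */
static void *realloc_copy(void *p, size_t old_size, size_t new_size)
{
    void *new_p;

    /* A zero new size frees the block, as in the posted function. */
    if ( new_size == 0 )
    {
        free(p);
        return NULL;
    }

    new_p = malloc(new_size);
    if ( new_p == NULL )
        return NULL;    /* the old block is left untouched */

    if ( p != NULL )
    {
        /* Copy only as many bytes as both blocks can hold. */
        memcpy(new_p, p, old_size < new_size ? old_size : new_size);
        free(p);
    }

    return new_p;
}
```

Whether Xen's allocator can recover the old size from its own block
header, so that callers need not pass it, is left open here.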
 xen/common/xmalloc_tlsf.c | 21 +++++++++++++++++++++
 xen/include/xen/xmalloc.h |  1 +
 2 files changed, 22 insertions(+)

diff --git a/xen/common/xmalloc_tlsf.c b/xen/common/xmalloc_tlsf.c
index 2076953..c080763 100644
--- a/xen/common/xmalloc_tlsf.c
+++ b/xen/common/xmalloc_tlsf.c
@@ -610,6 +610,27 @@ void *_xzalloc(unsigned long size, unsigned long align)
     return p ? memset(p, 0, size) : p;
 }
 
+void *_xrealloc(void *p, unsigned long new_size, unsigned long align)
+{
+    void *new_p;
+
+    if ( !new_size )
+    {
+        xfree(p);
+        return NULL;
+    }
+
+    new_p = _xmalloc(new_size, align);
+
+    if ( new_p && p )
+    {
+        memcpy(new_p, p, new_size);
+        xfree(p);
+    }
+
+    return new_p;
+}
+
 void xfree(void *p)
 {
     struct bhdr *b;
diff --git a/xen/include/xen/xmalloc.h b/xen/include/xen/xmalloc.h
index b486fe4..63961ef 100644
--- a/xen/include/xen/xmalloc.h
+++ b/xen/include/xen/xmalloc.h
@@ -51,6 +51,7 @@ extern void xfree(void *);
 /* Underlying functions */
 extern void *_xmalloc(unsigned long size, unsigned long align);
 extern void *_xzalloc(unsigned long size, unsigned long align);
+extern void *_xrealloc(void *p, unsigned long new_size, unsigned long align);
 
 static inline void *_xmalloc_array(
     unsigned long size, unsigned long align, unsigned long num)
-- 
2.7.4



* [Xen-devel] [PATCH V2 4/6] iommu/arm: Add lightweight iommu_fwspec support
  2019-08-02 16:39 [Xen-devel] [PATCH V2 0/6] iommu/arm: Add Renesas IPMMU-VMSA support + Linux's iommu_fwspec Oleksandr Tyshchenko
                   ` (2 preceding siblings ...)
  2019-08-02 16:39 ` [Xen-devel] [PATCH V2 3/6] [RFC] xen/common: Introduce _xrealloc function Oleksandr Tyshchenko
@ 2019-08-02 16:39 ` Oleksandr Tyshchenko
  2019-08-13 12:39   ` Julien Grall
  2019-08-13 13:40   ` Julien Grall
  2019-08-02 16:39 ` [Xen-devel] [PATCH V2 5/6] iommu/arm: Introduce iommu_add_dt_device API Oleksandr Tyshchenko
                   ` (2 subsequent siblings)
  6 siblings, 2 replies; 59+ messages in thread
From: Oleksandr Tyshchenko @ 2019-08-02 16:39 UTC (permalink / raw)
  To: xen-devel; +Cc: Oleksandr Tyshchenko, julien.grall, sstabellini

From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>

We need to have some abstract way to add a new device to the IOMMU
based on the generic IOMMU DT binding [1] which can be used for
both DT (right now) and ACPI (in future).

For that reason we can borrow the idea used in Linux these days
called "iommu_fwspec". With this in place, it will be possible
to configure the IOMMU master interfaces of a device (device IDs)
from a single common place and avoid keeping almost identical look-up
implementations in each IOMMU driver.

There is no need to port the whole implementation of "iommu_fwspec"
to Xen; we can probably end up with a much simpler solution,
a "stripped down" version which fits our requirements.

So, this patch adds the following:
1. A common structure "iommu_fwspec" to hold the per-device
   firmware data
2. New member "iommu_fwspec" of struct device
3. Functions/helpers to deal with "dev->iommu_fwspec"

It should be noted that, compared with the original "iommu_fwspec",
Xen's variant doesn't contain some fields which are not really
needed at the moment (ops, flag), and the "iommu_fwnode" field was
replaced by "iommu_dev" to avoid porting a lot of code (to support
"fwnode_handle") with little benefit.

The next patch in this series will make use of that support.

[1] https://www.kernel.org/doc/Documentation/devicetree/bindings/iommu/iommu.txt

Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
---
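The ID list lives in an array at the tail of the structure, so adding IDs
means resizing the whole object. A minimal sketch of that pattern in
standard C follows, under stated assumptions: struct fwspec_sketch and
fwspec_sketch_add_ids() are hypothetical names, a C99 flexible array member
is used where the patch declares ids[1], and the standard realloc() stands
in for _xrealloc().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, reduced version of the per-device data. */
struct fwspec_sketch {
    unsigned int num_ids;   /* number of valid entries in ids[] */
    uint32_t ids[];         /* flexible array member (the patch uses ids[1]) */
};

/*
 * Append num_ids IDs to *pspec, growing the object as needed.
 * *pspec may be NULL on the first call. Returns 0 or -ENOMEM.
 */
static int fwspec_sketch_add_ids(struct fwspec_sketch **pspec,
                                 const uint32_t *ids, unsigned int num_ids)
{
    struct fwspec_sketch *spec = *pspec;
    unsigned int old = spec ? spec->num_ids : 0;
    size_t size = offsetof(struct fwspec_sketch, ids) +
                  (size_t)(old + num_ids) * sizeof(uint32_t);
    struct fwspec_sketch *grown = realloc(spec, size);
    unsigned int i;

    if ( grown == NULL )
        return -ENOMEM;     /* *pspec is still valid and unchanged */

    for ( i = 0; i < num_ids; i++ )
        grown->ids[old + i] = ids[i];

    grown->num_ids = old + num_ids;
    *pspec = grown;

    return 0;
}
```

Keeping the old pointer until the resize has succeeded means a failed
allocation leaves the existing ID list intact.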
 xen/drivers/passthrough/arm/Makefile       |  2 +-
 xen/drivers/passthrough/arm/iommu_fwspec.c | 91 ++++++++++++++++++++++++++++++
 xen/include/asm-arm/device.h               |  1 +
 xen/include/asm-arm/iommu.h                |  2 +
 xen/include/asm-arm/iommu_fwspec.h         | 65 +++++++++++++++++++++
 5 files changed, 160 insertions(+), 1 deletion(-)
 create mode 100644 xen/drivers/passthrough/arm/iommu_fwspec.c
 create mode 100644 xen/include/asm-arm/iommu_fwspec.h

diff --git a/xen/drivers/passthrough/arm/Makefile b/xen/drivers/passthrough/arm/Makefile
index 4abb87a..5fbad45 100644
--- a/xen/drivers/passthrough/arm/Makefile
+++ b/xen/drivers/passthrough/arm/Makefile
@@ -1,2 +1,2 @@
-obj-y += iommu.o iommu_helpers.o
+obj-y += iommu.o iommu_helpers.o iommu_fwspec.o
 obj-$(CONFIG_ARM_SMMU) += smmu.o
diff --git a/xen/drivers/passthrough/arm/iommu_fwspec.c b/xen/drivers/passthrough/arm/iommu_fwspec.c
new file mode 100644
index 0000000..3474192
--- /dev/null
+++ b/xen/drivers/passthrough/arm/iommu_fwspec.c
@@ -0,0 +1,91 @@
+/*
+ * xen/drivers/passthrough/arm/iommu_fwspec.c
+ *
+ * Contains functions to maintain per-device firmware data
+ *
+ * Based on Linux's iommu_fwspec support you can find at:
+ *    drivers/iommu/iommu.c
+ *
+ * Copyright (C) 2019 EPAM Systems Inc.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms and conditions of the GNU General Public
+ * License, version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public
+ * License along with this program; If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <xen/lib.h>
+#include <xen/iommu.h>
+#include <asm/device.h>
+#include <asm/iommu_fwspec.h>
+
+int iommu_fwspec_init(struct device *dev, struct device *iommu_dev)
+{
+    struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
+
+    if ( fwspec )
+        return 0;
+
+    fwspec = xzalloc(struct iommu_fwspec);
+    if ( !fwspec )
+        return -ENOMEM;
+
+    fwspec->iommu_dev = iommu_dev;
+    dev_iommu_fwspec_set(dev, fwspec);
+
+    return 0;
+}
+
+void iommu_fwspec_free(struct device *dev)
+{
+    struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
+
+    if ( fwspec )
+    {
+        xfree(fwspec);
+        dev_iommu_fwspec_set(dev, NULL);
+    }
+}
+
+int iommu_fwspec_add_ids(struct device *dev, uint32_t *ids, int num_ids)
+{
+    struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
+    size_t size;
+    int i;
+
+    if ( !fwspec )
+        return -EINVAL;
+
+    size = offsetof(struct iommu_fwspec, ids[fwspec->num_ids + num_ids]);
+    if ( size > sizeof(*fwspec) )
+    {
+        fwspec = _xrealloc(fwspec, size, sizeof(void *));
+        if ( !fwspec )
+            return -ENOMEM;
+
+        dev_iommu_fwspec_set(dev, fwspec);
+    }
+
+    for ( i = 0; i < num_ids; i++ )
+        fwspec->ids[fwspec->num_ids + i] = ids[i];
+
+    fwspec->num_ids += num_ids;
+
+    return 0;
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/include/asm-arm/device.h b/xen/include/asm-arm/device.h
index ee1c3bc..ee7cff2 100644
--- a/xen/include/asm-arm/device.h
+++ b/xen/include/asm-arm/device.h
@@ -18,6 +18,7 @@ struct device
     struct dt_device_node *of_node; /* Used by drivers imported from Linux */
 #endif
     struct dev_archdata archdata;
+    struct iommu_fwspec *iommu_fwspec; /* per-device IOMMU instance data */
 };
 
 typedef struct device device_t;
diff --git a/xen/include/asm-arm/iommu.h b/xen/include/asm-arm/iommu.h
index 20d865e..1853bd9 100644
--- a/xen/include/asm-arm/iommu.h
+++ b/xen/include/asm-arm/iommu.h
@@ -14,6 +14,8 @@
 #ifndef __ARCH_ARM_IOMMU_H__
 #define __ARCH_ARM_IOMMU_H__
 
+#include <asm/iommu_fwspec.h>
+
 struct arch_iommu
 {
     /* Private information for the IOMMU drivers */
diff --git a/xen/include/asm-arm/iommu_fwspec.h b/xen/include/asm-arm/iommu_fwspec.h
new file mode 100644
index 0000000..0676285
--- /dev/null
+++ b/xen/include/asm-arm/iommu_fwspec.h
@@ -0,0 +1,65 @@
+/*
+ * xen/include/asm-arm/iommu_fwspec.h
+ *
+ * Contains a common structure to hold the per-device firmware data and
+ * declaration of functions used to maintain that data
+ *
+ * Based on Linux's iommu_fwspec support you can find at:
+ *    include/linux/iommu.h
+ *
+ * Copyright (C) 2019 EPAM Systems Inc.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms and conditions of the GNU General Public
+ * License, version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public
+ * License along with this program; If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef __ARCH_ARM_IOMMU_FWSPEC_H__
+#define __ARCH_ARM_IOMMU_FWSPEC_H__
+
+/* per-device IOMMU instance data */
+struct iommu_fwspec {
+    /* device which represents this IOMMU H/W */
+    struct device *iommu_dev;
+    /* IOMMU driver private data for this device */
+    void *iommu_priv;
+    /* number of associated device IDs */
+    unsigned int num_ids;
+    /* IDs which this device may present to the IOMMU */
+    uint32_t ids[1];
+};
+
+int iommu_fwspec_init(struct device *dev, struct device *iommu_dev);
+void iommu_fwspec_free(struct device *dev);
+int iommu_fwspec_add_ids(struct device *dev, uint32_t *ids, int num_ids);
+
+static inline struct iommu_fwspec *dev_iommu_fwspec_get(struct device *dev)
+{
+    return dev->iommu_fwspec;
+}
+
+static inline void dev_iommu_fwspec_set(struct device *dev,
+                                        struct iommu_fwspec *fwspec)
+{
+    dev->iommu_fwspec = fwspec;
+}
+
+#endif /* __ARCH_ARM_IOMMU_FWSPEC_H__ */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
-- 
2.7.4



* [Xen-devel] [PATCH V2 5/6] iommu/arm: Introduce iommu_add_dt_device API
  2019-08-02 16:39 [Xen-devel] [PATCH V2 0/6] iommu/arm: Add Renesas IPMMU-VMSA support + Linux's iommu_fwspec Oleksandr Tyshchenko
                   ` (3 preceding siblings ...)
  2019-08-02 16:39 ` [Xen-devel] [PATCH V2 4/6] iommu/arm: Add lightweight iommu_fwspec support Oleksandr Tyshchenko
@ 2019-08-02 16:39 ` Oleksandr Tyshchenko
  2019-08-13 13:49   ` Julien Grall
  2019-08-02 16:39 ` [Xen-devel] [PATCH V2 6/6] iommu/arm: Add Renesas IPMMU-VMSA support Oleksandr Tyshchenko
  2019-08-05  7:58 ` [Xen-devel] [PATCH V2 0/6] iommu/arm: Add Renesas IPMMU-VMSA support + Linux's iommu_fwspec Oleksandr
  6 siblings, 1 reply; 59+ messages in thread
From: Oleksandr Tyshchenko @ 2019-08-02 16:39 UTC (permalink / raw)
  To: xen-devel; +Cc: Oleksandr Tyshchenko, julien.grall, sstabellini

From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>

This patch adds a new iommu_add_dt_device API for adding a DT device
to the IOMMU using the generic IOMMU DT binding [1] and the previously
added "iommu_fwspec" support.

The new function parses the DT binding, prepares "dev->iommu_fwspec"
with the correct information and calls the IOMMU driver's "add_device"
callback to register the new DT device.
It is the IOMMU driver's responsibility to check whether
"dev->iommu_fwspec" is initialized and to mark that device as protected.
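
For illustration only, the driver side of this contract could look roughly
like the sketch below. The structures are cut-down mocks rather than the
real Xen definitions, and demo_add_device() and the is_protected flag are
hypothetical names used just to show the "check fwspec, then mark as
protected" step.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Mock stand-ins for the Xen structures; only the fields used here. */
struct iommu_fwspec {
    unsigned int num_ids;
    uint32_t ids[1];
};

struct device {
    struct iommu_fwspec *iommu_fwspec; /* prepared by the common code */
    bool is_protected;                 /* hypothetical "protected" flag */
};

/*
 * Hypothetical "add_device" callback: refuse a device whose fwspec has
 * not been initialized, otherwise mark the device as protected.
 */
static int demo_add_device(uint8_t devfn, struct device *dev)
{
    (void)devfn; /* unused for DT devices in this sketch */

    if ( !dev->iommu_fwspec )
        return -ENODEV;

    dev->is_protected = true;

    return 0;
}
```

A negative return value here corresponds to the case where the common
code frees the fwspec again and reports the error to the caller.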

An additional benefit is that the IOMMU driver no longer has to walk
the whole DT multiple times to locate the master devices which
belong to each IOMMU device being probed.

The upcoming IPMMU driver will implement the "add_device" callback.

I hope this patch won't break the SMMU driver's functionality,
as that driver doesn't implement this callback.

[1] https://www.kernel.org/doc/Documentation/devicetree/bindings/iommu/iommu.txt

Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
---
 xen/arch/arm/domain_build.c         | 12 ++++++++++
 xen/drivers/passthrough/arm/iommu.c | 45 +++++++++++++++++++++++++++++++++++++
 xen/include/asm-arm/iommu.h         |  3 +++
 3 files changed, 60 insertions(+)

diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
index d983677..d67f7d4 100644
--- a/xen/arch/arm/domain_build.c
+++ b/xen/arch/arm/domain_build.c
@@ -1241,6 +1241,18 @@ static int __init handle_device(struct domain *d, struct dt_device_node *dev,
     u64 addr, size;
     bool need_mapping = !dt_device_for_passthrough(dev);
 
+    if ( dt_parse_phandle(dev, "iommus", 0) )
+    {
+        dt_dprintk("%s add to iommu\n", dt_node_full_name(dev));
+        res = iommu_add_dt_device(dev);
+        if ( res )
+        {
+            printk(XENLOG_ERR "Failed to add %s to the IOMMU\n",
+                   dt_node_full_name(dev));
+            return res;
+        }
+    }
+
     nirq = dt_number_of_irq(dev);
     naddr = dt_number_of_address(dev);
 
diff --git a/xen/drivers/passthrough/arm/iommu.c b/xen/drivers/passthrough/arm/iommu.c
index 3195919..19516af 100644
--- a/xen/drivers/passthrough/arm/iommu.c
+++ b/xen/drivers/passthrough/arm/iommu.c
@@ -113,3 +113,48 @@ int arch_iommu_populate_page_table(struct domain *d)
 void __hwdom_init arch_iommu_hwdom_init(struct domain *d)
 {
 }
+
+int __init iommu_add_dt_device(struct dt_device_node *np)
+{
+    const struct iommu_ops *ops = iommu_get_ops();
+    struct dt_phandle_args iommu_spec;
+    struct device *dev = dt_to_dev(np);
+    int rc = 1, index = 0;
+
+    if ( !iommu_enabled || !ops || !ops->add_device )
+        return 0;
+
+    if ( dev_iommu_fwspec_get(dev) )
+        return -EEXIST;
+
+    /* According to the Documentation/devicetree/bindings/iommu/iommu.txt */
+    while ( !dt_parse_phandle_with_args(np, "iommus", "#iommu-cells",
+                                        index, &iommu_spec) )
+    {
+        if ( !dt_device_is_available(iommu_spec.np) )
+            break;
+
+        rc = iommu_fwspec_init(dev, &iommu_spec.np->dev);
+        if ( rc )
+            break;
+
+        rc = iommu_fwspec_add_ids(dev, iommu_spec.args, 1);
+        if ( rc )
+            break;
+
+        index++;
+    }
+
+    /*
+     * Add DT device to the IOMMU if the latter is present and available.
+     * The IOMMU driver's responsibility is to check whether dev->iommu_fwspec
+     * field is initialized and mark that device as protected.
+     */
+    if ( !rc )
+        rc = ops->add_device(0, dev);
+
+    if ( rc < 0 )
+        iommu_fwspec_free(dev);
+
+    return rc < 0 ? rc : 0;
+}
diff --git a/xen/include/asm-arm/iommu.h b/xen/include/asm-arm/iommu.h
index 1853bd9..06b07fa 100644
--- a/xen/include/asm-arm/iommu.h
+++ b/xen/include/asm-arm/iommu.h
@@ -28,6 +28,9 @@ struct arch_iommu
 const struct iommu_ops *iommu_get_ops(void);
 void iommu_set_ops(const struct iommu_ops *ops);
 
+/* helper to add DT device to the IOMMU */
+int iommu_add_dt_device(struct dt_device_node *np);
+
 /* mapping helpers */
 int __must_check arm_iommu_map_page(struct domain *d, dfn_t dfn, mfn_t mfn,
                                     unsigned int flags,
-- 
2.7.4



* [Xen-devel] [PATCH V2 6/6] iommu/arm: Add Renesas IPMMU-VMSA support
  2019-08-02 16:39 [Xen-devel] [PATCH V2 0/6] iommu/arm: Add Renesas IPMMU-VMSA support + Linux's iommu_fwspec Oleksandr Tyshchenko
                   ` (4 preceding siblings ...)
  2019-08-02 16:39 ` [Xen-devel] [PATCH V2 5/6] iommu/arm: Introduce iommu_add_dt_device API Oleksandr Tyshchenko
@ 2019-08-02 16:39 ` Oleksandr Tyshchenko
  2019-08-07  2:41   ` Yoshihiro Shimoda
  2019-08-14 17:38   ` Julien Grall
  2019-08-05  7:58 ` [Xen-devel] [PATCH V2 0/6] iommu/arm: Add Renesas IPMMU-VMSA support + Linux's iommu_fwspec Oleksandr
  6 siblings, 2 replies; 59+ messages in thread
From: Oleksandr Tyshchenko @ 2019-08-02 16:39 UTC (permalink / raw)
  To: xen-devel
  Cc: Oleksandr Tyshchenko, julien.grall, sstabellini, Yoshihiro Shimoda

From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>

The IPMMU-VMSA is a VMSA-compatible I/O Memory Management Unit (IOMMU)
which provides address translation and access protection functionalities
to processing units and interconnect networks.

Please note, the current driver is supposed to work only with the newest
Gen3 SoC revisions, whose IPMMU hardware supports the stage 2 translation
table format and is able to use the CPU's P2M table as is, provided it is
a 3-level page table (up to 40 bit IPA).
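
As a rough worked example of the 40 bit limit: the driver programs the
TSZ0 field of IMTTBCR with 64 minus the number of IPA bits, so a 40 bit
IPA yields 24. The standalone sketch below only reproduces that
arithmetic. demo_tsz0() is a hypothetical helper, and rejecting values
that do not fit the 5-bit field is an assumption of this sketch, not
something taken from the hardware manual.

```c
#include <stdint.h>

/* Values mirror the definitions used in the driver. */
#define IMTTBCR_TSZ0_SHIFT        0
#define IMTTBCR_TSZ0_MASK         (0x1fu << IMTTBCR_TSZ0_SHIFT)
#define IPMMU_MAX_P2M_IPA_BITS    40

/*
 * Compute the TSZ0 field for a given IPA width.
 * Returns 0 and stores the shifted field value in *tsz0 on success,
 * -1 if the width exceeds 40 bits or the result does not fit the field.
 */
static int demo_tsz0(unsigned int ipa_bits, uint32_t *tsz0)
{
    uint32_t val;

    if ( ipa_bits > IPMMU_MAX_P2M_IPA_BITS )
        return -1;

    val = 64 - ipa_bits;
    if ( val > (IMTTBCR_TSZ0_MASK >> IMTTBCR_TSZ0_SHIFT) )
        return -1;

    *tsz0 = val << IMTTBCR_TSZ0_SHIFT;

    return 0;
}
```

With ipa_bits == 40 the helper stores 24, which matches the
"(64 - p2m_ipa_bits)" expression used when the context is initialized.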

The major differences compared to the Linux driver are:

1. Stage 1/Stage 2 translation. Linux driver supports Stage 1
translation only (with Stage 1 translation table format). It manages
page table by itself. But Xen driver supports Stage 2 translation
(with Stage 2 translation table format) to be able to share the P2M
with the CPU. Stage 1 translation is always bypassed in Xen driver.

So, the Xen driver is supposed to be used with the newest Gen3 SoC
revisions only (H3 ES3.0, M3 ES3.0, etc.), whose IPMMU H/W supports the
stage 2 translation table format.

2. AArch64 support. Linux driver uses VMSAv8-32 mode, while Xen driver
enables Armv8 VMSAv8-64 mode to cover up to 40 bit input address.

3. Context bank (set of page tables) usage. In Xen, each context bank is
mapped to one Xen domain. So, all devices being passed through to the
same Xen domain share the same context bank.
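
To make point 3 a bit more concrete, the context bookkeeping can be
pictured as a small bitmap allocator on the Root IPMMU: one bit per
context bank, one owning Xen domain per bit. The code below is a
simplified, lock-free sketch with hypothetical names (demo_root_ipmmu,
demo_alloc_context, demo_free_context). The limit of 8 contexts is the
IPMMU_CTX_MAX value used by the driver.

```c
#include <stdint.h>

#define IPMMU_CTX_MAX    8

/* Hypothetical, cut-down view of the Root IPMMU context bookkeeping. */
struct demo_root_ipmmu {
    uint8_t ctx_bitmap;                /* bit n set: context n is in use */
    int owner_domid[IPMMU_CTX_MAX];    /* Xen domain id owning context n */
};

/*
 * Claim the lowest free context for the given domain.
 * Returns the context index, or -1 if all contexts are in use.
 */
static int demo_alloc_context(struct demo_root_ipmmu *mmu, int domid)
{
    unsigned int i;

    for ( i = 0; i < IPMMU_CTX_MAX; i++ )
    {
        if ( !(mmu->ctx_bitmap & (1u << i)) )
        {
            mmu->ctx_bitmap |= (uint8_t)(1u << i);
            mmu->owner_domid[i] = domid;
            return (int)i;
        }
    }

    return -1;
}

/* Release a context so that another domain can reuse it. */
static void demo_free_context(struct demo_root_ipmmu *mmu, unsigned int ctx)
{
    mmu->ctx_bitmap &= (uint8_t)~(1u << ctx);
    mmu->owner_domid[ctx] = -1;
}
```

Every master device passed through to the same domain would then look up
and reuse the context already claimed for that domain instead of
allocating a new one.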

Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
CC: Julien Grall <julien.grall@arm.com>
CC: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>

---
Changes V1 -> V2:
    - rewrote driver to use iommu_fwspec
    - removed DT parsing code for micro-TLBs
    - removed struct ipmmu_vmsa_master_cfg, dev_archdata macro
    - added ipmmu_find_mmu_by_dev(), various helpers to access
      fwspec->iommu_priv
    - implemented new callback "add_device" to add master device to IPMMU
    - removed ipmmu_protect_masters()
    - removed code to locate Root device in the first place,
      used EAGAIN to request deferred probing
    - used printk_once for the system wide error messages in ipmmu_init()
      which don't need to be shown for every device being probed
    - removed map_page/unmap_page implementation, reused them
      from iommu_helpers.c
    - used %pd for printing domain id
    - performed various cosmetic fixes
    - changed u32 -> uint32_t, u64 -> uint64_t,
      unsigned int -> uint32_t where needed
    - clarified TODOs
    - clarified supported SoC versions in config IPMMU_VMSA,
      set default to "n"
    - updated comments in code, provided more accurate description,
      added new comments where needed
    - updated patch description by providing differences between
      Linux/Xen implementations
    - removed fields for cache snoop transaction when configuring IMTTBCR
      (update from Renesas BSP)
---
 xen/arch/arm/platforms/Kconfig           |    1 +
 xen/drivers/passthrough/Kconfig          |   13 +
 xen/drivers/passthrough/arm/Makefile     |    1 +
 xen/drivers/passthrough/arm/ipmmu-vmsa.c | 1342 ++++++++++++++++++++++++++++++
 4 files changed, 1357 insertions(+)
 create mode 100644 xen/drivers/passthrough/arm/ipmmu-vmsa.c

diff --git a/xen/arch/arm/platforms/Kconfig b/xen/arch/arm/platforms/Kconfig
index bc0e9cd..c93a6b2 100644
--- a/xen/arch/arm/platforms/Kconfig
+++ b/xen/arch/arm/platforms/Kconfig
@@ -25,6 +25,7 @@ config RCAR3
 	bool "Renesas RCar3 support"
 	depends on ARM_64
 	select HAS_SCIF
+	select IPMMU_VMSA
 	---help---
 	Enable all the required drivers for Renesas RCar3
 
diff --git a/xen/drivers/passthrough/Kconfig b/xen/drivers/passthrough/Kconfig
index a3c0649..3daee16 100644
--- a/xen/drivers/passthrough/Kconfig
+++ b/xen/drivers/passthrough/Kconfig
@@ -12,4 +12,17 @@ config ARM_SMMU
 
 	  Say Y here if your SoC includes an IOMMU device implementing the
 	  ARM SMMU architecture.
+
+config IPMMU_VMSA
+	bool "Renesas IPMMU-VMSA found in R-Car Gen3 SoCs"
+	default n
+	depends on ARM_64
+	---help---
+	  Support for implementations of the Renesas IPMMU-VMSA found
+	  in R-Car Gen3 SoCs.
+
+	  Say Y here if you are using the newest R-Car Gen3 SoC revisions
+	  (H3 ES3.0, M3 ES3.0, etc.), whose IPMMU hardware supports the stage 2
+	  translation table format and is able to use the CPU's P2M table as is.
+
 endif
diff --git a/xen/drivers/passthrough/arm/Makefile b/xen/drivers/passthrough/arm/Makefile
index 5fbad45..fcd918e 100644
--- a/xen/drivers/passthrough/arm/Makefile
+++ b/xen/drivers/passthrough/arm/Makefile
@@ -1,2 +1,3 @@
 obj-y += iommu.o iommu_helpers.o iommu_fwspec.o
 obj-$(CONFIG_ARM_SMMU) += smmu.o
+obj-$(CONFIG_IPMMU_VMSA) += ipmmu-vmsa.o
diff --git a/xen/drivers/passthrough/arm/ipmmu-vmsa.c b/xen/drivers/passthrough/arm/ipmmu-vmsa.c
new file mode 100644
index 0000000..a34a8f8
--- /dev/null
+++ b/xen/drivers/passthrough/arm/ipmmu-vmsa.c
@@ -0,0 +1,1342 @@
+/*
+ * xen/drivers/passthrough/arm/ipmmu-vmsa.c
+ *
+ * Driver for the Renesas IPMMU-VMSA found in R-Car Gen3 SoCs.
+ *
+ * The IPMMU-VMSA is a VMSA-compatible I/O Memory Management Unit (IOMMU)
+ * which provides address translation and access protection functionalities
+ * to processing units and interconnect networks.
+ *
+ * Please note, the current driver is supposed to work only with the newest
+ * Gen3 SoC revisions, whose IPMMU hardware supports the stage 2 translation
+ * table format and is able to use the CPU's P2M table as is.
+ *
+ * Based on Linux's IPMMU-VMSA driver from Renesas BSP:
+ *    drivers/iommu/ipmmu-vmsa.c
+ * which can be found at:
+ *    url: git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas-bsp.git
+ *    branch: v4.14.75-ltsi/rcar-3.9.6
+ *    commit: e206eb5b81a60e64c35fbc3a999b1a0db2b98044
+ * and Xen's SMMU driver:
+ *    xen/drivers/passthrough/arm/smmu.c
+ *
+ * Copyright (C) 2016-2019 EPAM Systems Inc.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms and conditions of the GNU General Public
+ * License, version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public
+ * License along with this program; If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <xen/delay.h>
+#include <xen/err.h>
+#include <xen/irq.h>
+#include <xen/lib.h>
+#include <xen/list.h>
+#include <xen/mm.h>
+#include <xen/sched.h>
+#include <xen/vmap.h>
+#include <asm/atomic.h>
+#include <asm/device.h>
+#include <asm/io.h>
+
+#define dev_name(dev) dt_node_full_name(dev_to_dt(dev))
+
+/* Device logger functions */
+#define dev_print(dev, lvl, fmt, ...)    \
+    printk(lvl "ipmmu: %s: " fmt, dev_name(dev), ## __VA_ARGS__)
+
+#define dev_info(dev, fmt, ...)    \
+    dev_print(dev, XENLOG_INFO, fmt, ## __VA_ARGS__)
+#define dev_warn(dev, fmt, ...)    \
+    dev_print(dev, XENLOG_WARNING, fmt, ## __VA_ARGS__)
+#define dev_err(dev, fmt, ...)     \
+    dev_print(dev, XENLOG_ERR, fmt, ## __VA_ARGS__)
+#define dev_err_ratelimited(dev, fmt, ...)    \
+    dev_print(dev, XENLOG_ERR, fmt, ## __VA_ARGS__)
+
+/*
+ * Gen3 SoCs make use of up to 8 IPMMU contexts (sets of page tables) and
+ * these can be managed independently. Each context is mapped to one Xen domain.
+ */
+#define IPMMU_CTX_MAX     8
+/* Gen3 SoCs make use of up to 48 micro-TLBs per IPMMU device. */
+#define IPMMU_UTLB_MAX    48
+
+/* IPMMU context supports IPA size up to 40 bit. */
+#define IPMMU_MAX_P2M_IPA_BITS    40
+
+/*
+ * Xen's domain IPMMU information stored in dom_iommu(d)->arch.priv
+ *
+ * As each context (set of page tables) is mapped to one Xen domain,
+ * all associated IPMMU domains use the same context mapped to this Xen domain.
+ * This makes all master devices being attached to the same Xen domain share
+ * the same context (P2M table).
+ */
+struct ipmmu_vmsa_xen_domain {
+    /*
+     * Used to protect everything which belongs to this Xen domain:
+     * device assignment, domain init/destroy, flush ops, etc
+     */
+    spinlock_t lock;
+    /* One or more Cache IPMMU domains associated with this Xen domain */
+    struct list_head cache_domains;
+    /* Root IPMMU domain associated with this Xen domain */
+    struct ipmmu_vmsa_domain *root_domain;
+};
+
+/* Xen master device's IPMMU information stored in fwspec->iommu_priv */
+struct ipmmu_vmsa_xen_device {
+    /* Cache IPMMU domain this master device is logically attached to */
+    struct ipmmu_vmsa_domain *domain;
+    /* Cache IPMMU this master device is physically connected to */
+    struct ipmmu_vmsa_device *mmu;
+};
+
+/* Root/Cache IPMMU device's information */
+struct ipmmu_vmsa_device {
+    struct device *dev;
+    void __iomem *base;
+    struct ipmmu_vmsa_device *root;
+    struct list_head list;
+    unsigned int num_utlbs;
+    unsigned int num_ctx;
+    spinlock_t lock;    /* Protects ctx and domains[] */
+    DECLARE_BITMAP(ctx, IPMMU_CTX_MAX);
+    struct ipmmu_vmsa_domain *domains[IPMMU_CTX_MAX];
+};
+
+/*
+ * Root/Cache IPMMU domain's information
+ *
+ * Root IPMMU device is assigned to Root IPMMU domain while Cache IPMMU device
+ * is assigned to Cache IPMMU domain. Master devices are connected to Cache
+ * IPMMU devices through specific ports called micro-TLBs.
+ * All Cache IPMMU devices, in turn, are connected to Root IPMMU device
+ * which manages IPMMU context.
+ */
+struct ipmmu_vmsa_domain {
+    /*
+     * IPMMU device assigned to this IPMMU domain.
+     * Either Root device which is located at the main memory bus domain or
+     * Cache device which is located at each hierarchy bus domain.
+     */
+    struct ipmmu_vmsa_device *mmu;
+
+    /* Context used for this IPMMU domain */
+    unsigned int context_id;
+
+    /* Xen domain associated with this IPMMU domain */
+    struct domain *d;
+
+    /* The fields below are used for Cache IPMMU domain only */
+
+    /*
+     * Used to keep track of the master devices which are attached to this
+     * IPMMU domain (domain users). Master devices behind the same IPMMU device
+     * are grouped together by putting into the same IPMMU domain.
+     * Only when the refcount reaches 0 this IPMMU domain can be destroyed.
+     */
+    unsigned int refcount;
+    /* Used to link this IPMMU domain for the same Xen domain */
+    struct list_head list;
+};
+
+/* Used to keep track of registered IPMMU devices */
+static LIST_HEAD(ipmmu_devices);
+static DEFINE_SPINLOCK(ipmmu_devices_lock);
+
+#define TLB_LOOP_TIMEOUT    100 /* 100us */
+
+/* Registers Definition */
+#define IM_CTX_SIZE    0x40
+
+#define IMCTR                0x0000
+/*
+ * These fields are implemented in IPMMU-MM only. So, can be set for
+ * Root IPMMU only.
+ */
+#define IMCTR_VA64           (1 << 29)
+#define IMCTR_TRE            (1 << 17)
+#define IMCTR_AFE            (1 << 16)
+#define IMCTR_RTSEL_MASK     (3 << 4)
+#define IMCTR_RTSEL_SHIFT    4
+#define IMCTR_TREN           (1 << 3)
+/*
+ * These fields are common for all IPMMU devices. So, can be set for
+ * Cache IPMMUs as well.
+ */
+#define IMCTR_INTEN          (1 << 2)
+#define IMCTR_FLUSH          (1 << 1)
+#define IMCTR_MMUEN          (1 << 0)
+#define IMCTR_COMMON_MASK    (7 << 0)
+
+#define IMCAAR               0x0004
+
+#define IMTTBCR                        0x0008
+#define IMTTBCR_EAE                    (1 << 31)
+#define IMTTBCR_PMB                    (1 << 30)
+#define IMTTBCR_SH1_NON_SHAREABLE      (0 << 28)
+#define IMTTBCR_SH1_OUTER_SHAREABLE    (2 << 28)
+#define IMTTBCR_SH1_INNER_SHAREABLE    (3 << 28)
+#define IMTTBCR_SH1_MASK               (3 << 28)
+#define IMTTBCR_ORGN1_NC               (0 << 26)
+#define IMTTBCR_ORGN1_WB_WA            (1 << 26)
+#define IMTTBCR_ORGN1_WT               (2 << 26)
+#define IMTTBCR_ORGN1_WB               (3 << 26)
+#define IMTTBCR_ORGN1_MASK             (3 << 26)
+#define IMTTBCR_IRGN1_NC               (0 << 24)
+#define IMTTBCR_IRGN1_WB_WA            (1 << 24)
+#define IMTTBCR_IRGN1_WT               (2 << 24)
+#define IMTTBCR_IRGN1_WB               (3 << 24)
+#define IMTTBCR_IRGN1_MASK             (3 << 24)
+#define IMTTBCR_TSZ1_MASK              (0x1f << 16)
+#define IMTTBCR_TSZ1_SHIFT             16
+#define IMTTBCR_SH0_NON_SHAREABLE      (0 << 12)
+#define IMTTBCR_SH0_OUTER_SHAREABLE    (2 << 12)
+#define IMTTBCR_SH0_INNER_SHAREABLE    (3 << 12)
+#define IMTTBCR_SH0_MASK               (3 << 12)
+#define IMTTBCR_ORGN0_NC               (0 << 10)
+#define IMTTBCR_ORGN0_WB_WA            (1 << 10)
+#define IMTTBCR_ORGN0_WT               (2 << 10)
+#define IMTTBCR_ORGN0_WB               (3 << 10)
+#define IMTTBCR_ORGN0_MASK             (3 << 10)
+#define IMTTBCR_IRGN0_NC               (0 << 8)
+#define IMTTBCR_IRGN0_WB_WA            (1 << 8)
+#define IMTTBCR_IRGN0_WT               (2 << 8)
+#define IMTTBCR_IRGN0_WB               (3 << 8)
+#define IMTTBCR_IRGN0_MASK             (3 << 8)
+#define IMTTBCR_SL0_LVL_2              (0 << 6)
+#define IMTTBCR_SL0_LVL_1              (1 << 6)
+#define IMTTBCR_TSZ0_MASK              (0x1f << 0)
+#define IMTTBCR_TSZ0_SHIFT             0
+
+#define IMTTLBR0              0x0010
+#define IMTTLBR0_TTBR_MASK    (0xfffff << 12)
+#define IMTTUBR0              0x0014
+#define IMTTUBR0_TTBR_MASK    (0xff << 0)
+#define IMTTLBR1              0x0018
+#define IMTTLBR1_TTBR_MASK    (0xfffff << 12)
+#define IMTTUBR1              0x001c
+#define IMTTUBR1_TTBR_MASK    (0xff << 0)
+
+#define IMSTR                          0x0020
+#define IMSTR_ERRLVL_MASK              (3 << 12)
+#define IMSTR_ERRLVL_SHIFT             12
+#define IMSTR_ERRCODE_TLB_FORMAT       (1 << 8)
+#define IMSTR_ERRCODE_ACCESS_PERM      (4 << 8)
+#define IMSTR_ERRCODE_SECURE_ACCESS    (5 << 8)
+#define IMSTR_ERRCODE_MASK             (7 << 8)
+#define IMSTR_MHIT                     (1 << 4)
+#define IMSTR_ABORT                    (1 << 2)
+#define IMSTR_PF                       (1 << 1)
+#define IMSTR_TF                       (1 << 0)
+
+#define IMELAR    0x0030
+#define IMEUAR    0x0034
+
+#define IMUCTR(n)              ((n) < 32 ? IMUCTR0(n) : IMUCTR32(n))
+#define IMUCTR0(n)             (0x0300 + ((n) * 16))
+#define IMUCTR32(n)            (0x0600 + (((n) - 32) * 16))
+#define IMUCTR_FIXADDEN        (1 << 31)
+#define IMUCTR_FIXADD_MASK     (0xff << 16)
+#define IMUCTR_FIXADD_SHIFT    16
+#define IMUCTR_TTSEL_MMU(n)    ((n) << 4)
+#define IMUCTR_TTSEL_PMB       (8 << 4)
+#define IMUCTR_TTSEL_MASK      (15 << 4)
+#define IMUCTR_FLUSH           (1 << 1)
+#define IMUCTR_MMUEN           (1 << 0)
+
+#define IMUASID(n)             ((n) < 32 ? IMUASID0(n) : IMUASID32(n))
+#define IMUASID0(n)            (0x0308 + ((n) * 16))
+#define IMUASID32(n)           (0x0608 + (((n) - 32) * 16))
+#define IMUASID_ASID8_MASK     (0xff << 8)
+#define IMUASID_ASID8_SHIFT    8
+#define IMUASID_ASID0_MASK     (0xff << 0)
+#define IMUASID_ASID0_SHIFT    0
+
+#define IMSAUXCTLR          0x0504
+#define IMSAUXCTLR_S2PTE    (1 << 3)
+
+static struct ipmmu_vmsa_device *to_ipmmu(struct device *dev)
+{
+    struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
+
+    return fwspec && fwspec->iommu_priv ?
+        ((struct ipmmu_vmsa_xen_device *)fwspec->iommu_priv)->mmu : NULL;
+}
+
+static void set_ipmmu(struct device *dev, struct ipmmu_vmsa_device *mmu)
+{
+    struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
+
+    ((struct ipmmu_vmsa_xen_device *)fwspec->iommu_priv)->mmu = mmu;
+}
+
+static struct ipmmu_vmsa_domain *to_domain(struct device *dev)
+{
+    struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
+
+    return fwspec && fwspec->iommu_priv ?
+        ((struct ipmmu_vmsa_xen_device *)fwspec->iommu_priv)->domain : NULL;
+}
+
+static void set_domain(struct device *dev, struct ipmmu_vmsa_domain *domain)
+{
+    struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
+
+    ((struct ipmmu_vmsa_xen_device *)fwspec->iommu_priv)->domain = domain;
+}
+
+static struct ipmmu_vmsa_device *ipmmu_find_mmu_by_dev(struct device *dev)
+{
+    struct ipmmu_vmsa_device *mmu = NULL;
+    bool found = false;
+
+    spin_lock(&ipmmu_devices_lock);
+
+    list_for_each_entry ( mmu, &ipmmu_devices, list )
+    {
+        if ( mmu->dev == dev )
+        {
+            found = true;
+            break;
+        }
+    }
+
+    spin_unlock(&ipmmu_devices_lock);
+
+    return found ? mmu : NULL;
+}
+
+/* Root device handling */
+static bool ipmmu_is_root(struct ipmmu_vmsa_device *mmu)
+{
+    return mmu->root == mmu;
+}
+
+static struct ipmmu_vmsa_device *ipmmu_find_root(void)
+{
+    struct ipmmu_vmsa_device *mmu = NULL;
+    bool found = false;
+
+    spin_lock(&ipmmu_devices_lock);
+
+    list_for_each_entry( mmu, &ipmmu_devices, list )
+    {
+        if ( ipmmu_is_root(mmu) )
+        {
+            found = true;
+            break;
+        }
+    }
+
+    spin_unlock(&ipmmu_devices_lock);
+
+    return found ? mmu : NULL;
+}
+
+/* Read/Write Access */
+static uint32_t ipmmu_read(struct ipmmu_vmsa_device *mmu, uint32_t offset)
+{
+    return readl(mmu->base + offset);
+}
+
+static void ipmmu_write(struct ipmmu_vmsa_device *mmu, uint32_t offset,
+                        uint32_t data)
+{
+    writel(data, mmu->base + offset);
+}
+
+static uint32_t ipmmu_ctx_read_root(struct ipmmu_vmsa_domain *domain,
+                                    uint32_t reg)
+{
+    return ipmmu_read(domain->mmu->root,
+                      domain->context_id * IM_CTX_SIZE + reg);
+}
+
+static void ipmmu_ctx_write_root(struct ipmmu_vmsa_domain *domain,
+                                 uint32_t reg, uint32_t data)
+{
+    ipmmu_write(domain->mmu->root,
+                domain->context_id * IM_CTX_SIZE + reg, data);
+}
+
+static void ipmmu_ctx_write_cache(struct ipmmu_vmsa_domain *domain,
+                                  uint32_t reg, uint32_t data)
+{
+    /* We expect only IMCTR value to be passed as a reg. */
+    ASSERT(reg == IMCTR);
+
+    /* Mask fields which are implemented in IPMMU-MM only. */
+    if ( !ipmmu_is_root(domain->mmu) )
+        ipmmu_write(domain->mmu, domain->context_id * IM_CTX_SIZE + reg,
+                    data & IMCTR_COMMON_MASK);
+}
+
+/*
+ * Write the context to both Root IPMMU and all Cache IPMMUs assigned
+ * to this Xen domain.
+ */
+static void ipmmu_ctx_write_all(struct ipmmu_vmsa_domain *domain,
+                                uint32_t reg, uint32_t data)
+{
+    struct ipmmu_vmsa_xen_domain *xen_domain = dom_iommu(domain->d)->arch.priv;
+    struct ipmmu_vmsa_domain *cache_domain;
+
+    list_for_each_entry( cache_domain, &xen_domain->cache_domains, list )
+        ipmmu_ctx_write_cache(cache_domain, reg, data);
+
+    ipmmu_ctx_write_root(domain, reg, data);
+}
+
+/* TLB and micro-TLB Management */
+
+/* Wait for any pending TLB invalidations to complete. */
+static void ipmmu_tlb_sync(struct ipmmu_vmsa_domain *domain)
+{
+    unsigned int count = 0;
+
+    while ( ipmmu_ctx_read_root(domain, IMCTR) & IMCTR_FLUSH )
+    {
+        cpu_relax();
+        if ( ++count == TLB_LOOP_TIMEOUT )
+        {
+            dev_err_ratelimited(domain->mmu->dev, "TLB sync timed out -- MMU may be deadlocked\n");
+            return;
+        }
+        udelay(1);
+    }
+}
+
+static void ipmmu_tlb_invalidate(struct ipmmu_vmsa_domain *domain)
+{
+    uint32_t data;
+
+    data = ipmmu_ctx_read_root(domain, IMCTR);
+    data |= IMCTR_FLUSH;
+    ipmmu_ctx_write_all(domain, IMCTR, data);
+
+    ipmmu_tlb_sync(domain);
+}
+
+/* Enable MMU translation for the micro-TLB. */
+static void ipmmu_utlb_enable(struct ipmmu_vmsa_domain *domain,
+                              unsigned int utlb)
+{
+    struct ipmmu_vmsa_device *mmu = domain->mmu;
+
+    /*
+     * TODO: Reference-count the micro-TLB as several bus masters can be
+     * connected to the same micro-TLB. Prevent the use cases where
+     * the same micro-TLB could be shared between multiple Xen domains.
+     */
+    ipmmu_write(mmu, IMUASID(utlb), 0);
+    ipmmu_write(mmu, IMUCTR(utlb), ipmmu_read(mmu, IMUCTR(utlb)) |
+                IMUCTR_TTSEL_MMU(domain->context_id) | IMUCTR_MMUEN);
+}
+
+/* Disable MMU translation for the micro-TLB. */
+static void ipmmu_utlb_disable(struct ipmmu_vmsa_domain *domain,
+                               unsigned int utlb)
+{
+    struct ipmmu_vmsa_device *mmu = domain->mmu;
+
+    ipmmu_write(mmu, IMUCTR(utlb), 0);
+}
+
+/* Domain/Context Management */
+static int ipmmu_domain_allocate_context(struct ipmmu_vmsa_device *mmu,
+                                         struct ipmmu_vmsa_domain *domain)
+{
+    unsigned long flags;
+    int ret;
+
+    spin_lock_irqsave(&mmu->lock, flags);
+
+    ret = find_first_zero_bit(mmu->ctx, mmu->num_ctx);
+    if ( ret != mmu->num_ctx )
+    {
+        mmu->domains[ret] = domain;
+        set_bit(ret, mmu->ctx);
+    }
+    else
+        ret = -EBUSY;
+
+    spin_unlock_irqrestore(&mmu->lock, flags);
+
+    return ret;
+}
+
+static void ipmmu_domain_free_context(struct ipmmu_vmsa_device *mmu,
+                                      unsigned int context_id)
+{
+    unsigned long flags;
+
+    spin_lock_irqsave(&mmu->lock, flags);
+
+    clear_bit(context_id, mmu->ctx);
+    mmu->domains[context_id] = NULL;
+
+    spin_unlock_irqrestore(&mmu->lock, flags);
+}
+
+static int ipmmu_domain_init_context(struct ipmmu_vmsa_domain *domain)
+{
+    uint64_t ttbr;
+    uint32_t tsz0;
+    int ret;
+
+    /* Find an unused context. */
+    ret = ipmmu_domain_allocate_context(domain->mmu->root, domain);
+    if ( ret < 0 )
+        return ret;
+
+    domain->context_id = ret;
+
+    /*
+     * TTBR0
+     * Use P2M table for this Xen domain.
+     */
+    ASSERT(domain->d != NULL);
+    ttbr = page_to_maddr(domain->d->arch.p2m.root);
+
+    dev_info(domain->mmu->root->dev, "%pd: Set IPMMU context %u (pgd 0x%"PRIx64")\n",
+             domain->d, domain->context_id, ttbr);
+
+    ipmmu_ctx_write_root(domain, IMTTLBR0, ttbr & IMTTLBR0_TTBR_MASK);
+    ipmmu_ctx_write_root(domain, IMTTUBR0, (ttbr >> 32) & IMTTUBR0_TTBR_MASK);
+
+    /*
+     * TTBCR
+     * We use long descriptors and allocate the whole "p2m_ipa_bits" IPA space
+     * to TTBR0. Use 4KB page granule. Start page table walks at first level.
+     * Always bypass stage 1 translation.
+     */
+    tsz0 = (64 - p2m_ipa_bits) << IMTTBCR_TSZ0_SHIFT;
+    ipmmu_ctx_write_root(domain, IMTTBCR, IMTTBCR_EAE | IMTTBCR_PMB |
+                         IMTTBCR_SL0_LVL_1 | tsz0);
+
+    /*
+     * IMSTR
+     * Clear all interrupt flags.
+     */
+    ipmmu_ctx_write_root(domain, IMSTR, ipmmu_ctx_read_root(domain, IMSTR));
+
+    /*
+     * IMCTR
+     * Enable the MMU and interrupt generation. The long-descriptor
+     * translation table format doesn't use TEX remapping. Don't enable AF
+     * software management as we have no use for it. Use VMSAv8-64 mode.
+     * Enable the context for Root IPMMU only. Flush the TLB as required
+     * when modifying the context registers.
+     */
+    ipmmu_ctx_write_root(domain, IMCTR,
+                         IMCTR_VA64 | IMCTR_INTEN | IMCTR_FLUSH | IMCTR_MMUEN);
+
+    return 0;
+}
+
+static void ipmmu_domain_destroy_context(struct ipmmu_vmsa_domain *domain)
+{
+    if ( !domain->mmu )
+        return;
+
+    /*
+     * Disable the context for Root IPMMU only. Flush the TLB as required
+     * when modifying the context registers.
+     */
+    ipmmu_ctx_write_root(domain, IMCTR, IMCTR_FLUSH);
+    ipmmu_tlb_sync(domain);
+
+    ipmmu_domain_free_context(domain->mmu->root, domain->context_id);
+}
+
+/* Fault Handling */
+static void ipmmu_domain_irq(struct ipmmu_vmsa_domain *domain)
+{
+    const uint32_t err_mask = IMSTR_MHIT | IMSTR_ABORT | IMSTR_PF | IMSTR_TF;
+    struct ipmmu_vmsa_device *mmu = domain->mmu;
+    uint32_t status;
+    uint64_t iova;
+
+    status = ipmmu_ctx_read_root(domain, IMSTR);
+    if ( !(status & err_mask) )
+        return;
+
+    iova = ipmmu_ctx_read_root(domain, IMELAR) |
+        ((uint64_t)ipmmu_ctx_read_root(domain, IMEUAR) << 32);
+
+    /*
+     * Clear the error status flags. Unlike traditional interrupt flag
+     * registers that must be cleared by writing 1, this status register
+     * seems to require 0. The error address register must be read before,
+     * otherwise its value will be 0.
+     */
+    ipmmu_ctx_write_root(domain, IMSTR, 0);
+
+    /* Log fatal errors. */
+    if ( status & IMSTR_MHIT )
+        dev_err_ratelimited(mmu->dev, "%pd: Multiple TLB hits @0x%"PRIx64"\n",
+                            domain->d, iova);
+    if ( status & IMSTR_ABORT )
+        dev_err_ratelimited(mmu->dev, "%pd: Page Table Walk Abort @0x%"PRIx64"\n",
+                            domain->d, iova);
+
+    /* Return if it is neither Permission Fault nor Translation Fault. */
+    if ( !(status & (IMSTR_PF | IMSTR_TF)) )
+        return;
+
+    /* Flush the TLB as required when IPMMU translation error occurred. */
+    ipmmu_tlb_invalidate(domain);
+
+    dev_err_ratelimited(mmu->dev, "%pd: Unhandled fault: status 0x%08x iova 0x%"PRIx64"\n",
+                        domain->d, status, iova);
+}
+
+static void ipmmu_irq(int irq, void *dev, struct cpu_user_regs *regs)
+{
+    struct ipmmu_vmsa_device *mmu = dev;
+    unsigned int i;
+    unsigned long flags;
+
+    spin_lock_irqsave(&mmu->lock, flags);
+
+    /*
+     * When interrupt arrives, we don't know the context it is related to.
+     * So, check interrupts for all active contexts to locate a context
+     * with status bits set.
+     */
+    for ( i = 0; i < mmu->num_ctx; i++ )
+    {
+        if ( !mmu->domains[i] )
+            continue;
+        ipmmu_domain_irq(mmu->domains[i]);
+    }
+
+    spin_unlock_irqrestore(&mmu->lock, flags);
+}
+
+/* Master devices management */
+static int ipmmu_attach_device(struct ipmmu_vmsa_domain *domain,
+                               struct device *dev)
+{
+    struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
+    struct ipmmu_vmsa_device *mmu = to_ipmmu(dev);
+    unsigned int i;
+
+    if ( !mmu )
+    {
+        dev_err(dev, "Cannot attach to IPMMU\n");
+        return -ENXIO;
+    }
+
+    if ( !domain->mmu )
+    {
+        /* The domain hasn't been used yet, initialize it. */
+        domain->mmu = mmu;
+
+        /*
+         * We have already enabled context for Root IPMMU assigned to this
+         * Xen domain in ipmmu_domain_init_context().
+         * Enable the context for Cache IPMMU only. Flush the TLB as required
+         * when modifying the context registers.
+         */
+        ipmmu_ctx_write_cache(domain, IMCTR,
+                              ipmmu_ctx_read_root(domain, IMCTR) | IMCTR_FLUSH);
+
+        dev_info(dev, "Using IPMMU context %u\n", domain->context_id);
+    }
+    else if ( domain->mmu != mmu )
+    {
+        /*
+         * Something is wrong, we can't attach two master devices using
+         * different IOMMUs to the same IPMMU domain.
+         */
+        dev_err(dev, "Can't attach IPMMU %s to domain on IPMMU %s\n",
+                dev_name(mmu->dev), dev_name(domain->mmu->dev));
+        return -EINVAL;
+    }
+    else
+        dev_info(dev, "Reusing IPMMU context %u\n", domain->context_id);
+
+    for ( i = 0; i < fwspec->num_ids; ++i )
+        ipmmu_utlb_enable(domain, fwspec->ids[i]);
+
+    return 0;
+}
+
+static void ipmmu_detach_device(struct ipmmu_vmsa_domain *domain,
+                                struct device *dev)
+{
+    struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
+    unsigned int i;
+
+    for ( i = 0; i < fwspec->num_ids; ++i )
+        ipmmu_utlb_disable(domain, fwspec->ids[i]);
+}
+
+static void ipmmu_device_reset(struct ipmmu_vmsa_device *mmu)
+{
+    unsigned int i;
+
+    /* Disable all contexts. */
+    for ( i = 0; i < mmu->num_ctx; ++i )
+        ipmmu_write(mmu, i * IM_CTX_SIZE + IMCTR, 0);
+}
+
+/*
+ * This function relies on the Root IPMMU device being probed first.
+ * Otherwise, it denies Cache IPMMU device probes (returning -EAGAIN)
+ * until the Root IPMMU device has been registered.
+ */
+static int ipmmu_probe(struct dt_device_node *node)
+{
+    struct ipmmu_vmsa_device *mmu;
+    uint64_t addr, size;
+    int irq, ret;
+
+    mmu = xzalloc(struct ipmmu_vmsa_device);
+    if ( !mmu )
+    {
+        dev_err(&node->dev, "Cannot allocate device data\n");
+        return -ENOMEM;
+    }
+
+    mmu->dev = &node->dev;
+    mmu->num_utlbs = IPMMU_UTLB_MAX;
+    mmu->num_ctx = IPMMU_CTX_MAX;
+    spin_lock_init(&mmu->lock);
+    bitmap_zero(mmu->ctx, IPMMU_CTX_MAX);
+
+    /* Map I/O memory and request IRQ. */
+    ret = dt_device_get_address(node, 0, &addr, &size);
+    if ( ret )
+    {
+        dev_err(&node->dev, "Failed to get MMIO\n");
+        goto out;
+    }
+
+    mmu->base = ioremap_nocache(addr, size);
+    if ( !mmu->base )
+    {
+        dev_err(&node->dev, "Failed to ioremap MMIO (addr 0x%"PRIx64" size 0x%"PRIx64")\n",
+                addr, size);
+        ret = -ENOMEM;
+        goto out;
+    }
+
+    /*
+     * Determine if this IPMMU node is a Root device by checking for
+     * the lack of renesas,ipmmu-main property.
+     */
+    if ( !dt_find_property(node, "renesas,ipmmu-main", NULL) )
+        mmu->root = mmu;
+    else
+        mmu->root = ipmmu_find_root();
+
+    /* Wait until the Root device has been registered for sure. */
+    if ( !mmu->root )
+    {
+        dev_err(&node->dev, "Root IPMMU hasn't been registered yet\n");
+        ret = -EAGAIN;
+        goto out;
+    }
+
+    /* Root devices have mandatory IRQs. */
+    if ( ipmmu_is_root(mmu) )
+    {
+        irq = platform_get_irq(node, 0);
+        if ( irq < 0 )
+        {
+            dev_err(&node->dev, "No IRQ found\n");
+            ret = irq;
+            goto out;
+        }
+
+        ret = request_irq(irq, 0, ipmmu_irq, dev_name(&node->dev), mmu);
+        if ( ret < 0 )
+        {
+            dev_err(&node->dev, "Failed to request IRQ %d\n", irq);
+            goto out;
+        }
+
+        ipmmu_device_reset(mmu);
+
+        /*
+         * Use stage 2 translation table format when stage 2 translation
+         * is enabled.
+         */
+        ipmmu_write(mmu, IMSAUXCTLR,
+                    ipmmu_read(mmu, IMSAUXCTLR) | IMSAUXCTLR_S2PTE);
+
+        dev_info(&node->dev, "IPMMU context 0 is reserved\n");
+        set_bit(0, mmu->ctx);
+    }
+
+    spin_lock(&ipmmu_devices_lock);
+    list_add(&mmu->list, &ipmmu_devices);
+    spin_unlock(&ipmmu_devices_lock);
+
+    dev_info(&node->dev, "Registered %s IPMMU\n",
+             ipmmu_is_root(mmu) ? "Root" : "Cache");
+
+    return 0;
+
+out:
+    if ( mmu->base )
+        iounmap(mmu->base);
+    xfree(mmu);
+
+    return ret;
+}
+
+/* Xen IOMMU ops */
+static int __must_check ipmmu_iotlb_flush_all(struct domain *d)
+{
+    struct ipmmu_vmsa_xen_domain *xen_domain = dom_iommu(d)->arch.priv;
+
+    if ( !xen_domain || !xen_domain->root_domain )
+        return 0;
+
+    spin_lock(&xen_domain->lock);
+    ipmmu_tlb_invalidate(xen_domain->root_domain);
+    spin_unlock(&xen_domain->lock);
+
+    return 0;
+}
+
+static int __must_check ipmmu_iotlb_flush(struct domain *d, dfn_t dfn,
+                                          unsigned int page_count,
+                                          unsigned int flush_flags)
+{
+    ASSERT(flush_flags);
+
+    /* The hardware doesn't support selective TLB flush. */
+    return ipmmu_iotlb_flush_all(d);
+}
+
+static struct ipmmu_vmsa_domain *ipmmu_get_cache_domain(struct domain *d,
+                                                        struct device *dev)
+{
+    struct ipmmu_vmsa_xen_domain *xen_domain = dom_iommu(d)->arch.priv;
+    struct ipmmu_vmsa_device *mmu = to_ipmmu(dev);
+    struct ipmmu_vmsa_domain *domain;
+
+    if ( !mmu )
+        return NULL;
+
+    /*
+     * Loop through all Cache IPMMU domains associated with this Xen domain
+     * to locate an IPMMU domain this IPMMU device is assigned to.
+     */
+    list_for_each_entry( domain, &xen_domain->cache_domains, list )
+    {
+        if ( domain->mmu == mmu )
+            return domain;
+    }
+
+    return NULL;
+}
+
+static struct ipmmu_vmsa_domain *ipmmu_alloc_cache_domain(struct domain *d)
+{
+    struct ipmmu_vmsa_xen_domain *xen_domain = dom_iommu(d)->arch.priv;
+    struct ipmmu_vmsa_domain *domain;
+
+    domain = xzalloc(struct ipmmu_vmsa_domain);
+    if ( !domain )
+        return ERR_PTR(-ENOMEM);
+
+    /*
+     * We don't assign the Cache IPMMU device here, it will be assigned when
+     * attaching master device to this domain in ipmmu_attach_device().
+     * domain->mmu = NULL;
+     */
+
+    domain->d = d;
+    /* Use the same context mapped to this Xen domain. */
+    domain->context_id = xen_domain->root_domain->context_id;
+
+    return domain;
+}
+
+static void ipmmu_free_cache_domain(struct ipmmu_vmsa_domain *domain)
+{
+    list_del(&domain->list);
+    /*
+     * Disable the context for Cache IPMMU only. Flush the TLB as required
+     * when modifying the context registers.
+     */
+    ipmmu_ctx_write_cache(domain, IMCTR, IMCTR_FLUSH);
+    xfree(domain);
+}
+
+static struct ipmmu_vmsa_domain *ipmmu_alloc_root_domain(struct domain *d)
+{
+    struct ipmmu_vmsa_domain *domain;
+    struct ipmmu_vmsa_device *root;
+    int ret;
+
+    /* If we are here then the Root device must have been registered. */
+    root = ipmmu_find_root();
+    if ( !root )
+    {
+        printk(XENLOG_ERR "ipmmu: Unable to locate Root IPMMU\n");
+        return ERR_PTR(-ENODEV);
+    }
+
+    domain = xzalloc(struct ipmmu_vmsa_domain);
+    if ( !domain )
+        return ERR_PTR(-ENOMEM);
+
+    domain->mmu = root;
+    domain->d = d;
+
+    /* Initialize the context to be mapped to this Xen domain. */
+    ret = ipmmu_domain_init_context(domain);
+    if ( ret < 0 )
+    {
+        dev_err(root->dev, "%pd: Unable to initialize IPMMU context\n", d);
+        xfree(domain);
+        return ERR_PTR(ret);
+    }
+
+    return domain;
+}
+
+static void ipmmu_free_root_domain(struct ipmmu_vmsa_domain *domain)
+{
+    ipmmu_domain_destroy_context(domain);
+    xfree(domain);
+}
+
+static int ipmmu_assign_device(struct domain *d, u8 devfn, struct device *dev,
+                               uint32_t flag)
+{
+    struct ipmmu_vmsa_xen_domain *xen_domain = dom_iommu(d)->arch.priv;
+    struct ipmmu_vmsa_domain *domain;
+    int ret;
+
+    if ( !xen_domain )
+        return -EINVAL;
+
+    if ( !to_ipmmu(dev) )
+        return -ENODEV;
+
+    spin_lock(&xen_domain->lock);
+
+    /*
+     * The IPMMU context for the Xen domain is not allocated beforehand
+     * (at the Xen domain creation time), but on demand only, when the first
+     * master device is being attached to it.
+     * Create the Root IPMMU domain whose context will be mapped to this
+     * Xen domain if it does not exist yet.
+     */
+    if ( !xen_domain->root_domain )
+    {
+        domain = ipmmu_alloc_root_domain(d);
+        if ( IS_ERR(domain) )
+        {
+            ret = PTR_ERR(domain);
+            goto out;
+        }
+
+        xen_domain->root_domain = domain;
+    }
+
+    if ( to_domain(dev) )
+    {
+        dev_err(dev, "Already attached to IPMMU domain\n");
+        ret = -EEXIST;
+        goto out;
+    }
+
+    /*
+     * Master devices behind the same Cache IPMMU can be attached to the same
+     * Cache IPMMU domain.
+     * Before creating new IPMMU domain check to see if the required one
+     * already exists for this Xen domain.
+     */
+    domain = ipmmu_get_cache_domain(d, dev);
+    if ( !domain )
+    {
+        /* Create new IPMMU domain this master device will be attached to. */
+        domain = ipmmu_alloc_cache_domain(d);
+        if ( IS_ERR(domain) )
+        {
+            ret = PTR_ERR(domain);
+            goto out;
+        }
+
+        /* Chain new IPMMU domain to the Xen domain. */
+        list_add(&domain->list, &xen_domain->cache_domains);
+    }
+
+    ret = ipmmu_attach_device(domain, dev);
+    if ( ret )
+    {
+        /*
+         * Destroy Cache IPMMU domain only if there are no master devices
+         * attached to it.
+         */
+        if ( !domain->refcount )
+            ipmmu_free_cache_domain(domain);
+    }
+    else
+    {
+        domain->refcount++;
+        set_domain(dev, domain);
+    }
+
+out:
+    spin_unlock(&xen_domain->lock);
+
+    return ret;
+}
+
+static int ipmmu_deassign_device(struct domain *d, struct device *dev)
+{
+    struct ipmmu_vmsa_xen_domain *xen_domain = dom_iommu(d)->arch.priv;
+    struct ipmmu_vmsa_domain *domain = to_domain(dev);
+
+    if ( !domain || domain->d != d )
+    {
+        dev_err(dev, "Not attached to %pd\n", d);
+        return -ESRCH;
+    }
+
+    spin_lock(&xen_domain->lock);
+
+    ipmmu_detach_device(domain, dev);
+    set_domain(dev, NULL);
+    domain->refcount--;
+
+    /*
+     * Destroy Cache IPMMU domain only if there are no master devices
+     * attached to it.
+     */
+    if ( !domain->refcount )
+        ipmmu_free_cache_domain(domain);
+
+    spin_unlock(&xen_domain->lock);
+
+    return 0;
+}
+
+static int ipmmu_reassign_device(struct domain *s, struct domain *t,
+                                 u8 devfn,  struct device *dev)
+{
+    int ret = 0;
+
+    /* Don't allow remapping to any domain other than hwdom. */
+    if ( t && t != hardware_domain )
+        return -EPERM;
+
+    if ( t == s )
+        return 0;
+
+    ret = ipmmu_deassign_device(s, dev);
+    if ( ret )
+        return ret;
+
+    if ( t )
+    {
+        /* No flags are defined for ARM. */
+        ret = ipmmu_assign_device(t, devfn, dev, 0);
+        if ( ret )
+            return ret;
+    }
+
+    return 0;
+}
+
+static int ipmmu_add_device(u8 devfn, struct device *dev)
+{
+    struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
+    struct ipmmu_vmsa_device *mmu;
+    unsigned int i;
+
+    if ( !fwspec || !fwspec->iommu_dev )
+        return -EINVAL;
+
+    mmu = ipmmu_find_mmu_by_dev(fwspec->iommu_dev);
+    if ( !mmu )
+        return -ENODEV;
+
+    /*
+     * Perform sanity check of fwspec->num_ids and fwspec->ids[] fields.
+     * These fields describe master device's connection to Cache IPMMU
+     * (micro-TLBs). Each master device gets micro-TLB assignment via
+     * the "iommus" property in DT.
+     */
+    if ( fwspec->num_ids == 0 )
+        return -EINVAL;
+
+    for ( i = 0; i < fwspec->num_ids; ++i )
+    {
+        if ( fwspec->ids[i] >= mmu->num_utlbs )
+            return -EINVAL;
+    }
+
+    if ( to_ipmmu(dev) )
+    {
+        dev_err(dev, "Already added to IPMMU\n");
+        return -EEXIST;
+    }
+
+    fwspec->iommu_priv = xzalloc(struct ipmmu_vmsa_xen_device);
+    if ( !fwspec->iommu_priv )
+        return -ENOMEM;
+
+    set_ipmmu(dev, mmu);
+
+    /* Let Xen know that the master device is protected by an IOMMU. */
+    dt_device_set_protected(dev_to_dt(dev));
+
+    dev_info(dev, "Added master device (IPMMU %s micro-TLBs %u)\n",
+             dev_name(mmu->dev), fwspec->num_ids);
+
+    return 0;
+}
+
+static int ipmmu_iommu_domain_init(struct domain *d)
+{
+    struct ipmmu_vmsa_xen_domain *xen_domain;
+
+    xen_domain = xzalloc(struct ipmmu_vmsa_xen_domain);
+    if ( !xen_domain )
+        return -ENOMEM;
+
+    spin_lock_init(&xen_domain->lock);
+    INIT_LIST_HEAD(&xen_domain->cache_domains);
+    /*
+     * We don't create Root IPMMU domain here, it will be created on demand
+     * only, when attaching the first master device to this Xen domain in
+     * ipmmu_assign_device().
+     * xen_domain->root_domain = NULL;
+     */
+
+    dom_iommu(d)->arch.priv = xen_domain;
+
+    return 0;
+}
+
+static void __hwdom_init ipmmu_iommu_hwdom_init(struct domain *d)
+{
+    /* Set to false options not supported on ARM. */
+    if ( iommu_hwdom_inclusive )
+        printk(XENLOG_WARNING "ipmmu: map-inclusive dom0-iommu option is not supported on ARM\n");
+    iommu_hwdom_inclusive = false;
+    if ( iommu_hwdom_reserved == 1 )
+        printk(XENLOG_WARNING "ipmmu: map-reserved dom0-iommu option is not supported on ARM\n");
+    iommu_hwdom_reserved = 0;
+
+    arch_iommu_hwdom_init(d);
+}
+
+static void ipmmu_iommu_domain_teardown(struct domain *d)
+{
+    struct ipmmu_vmsa_xen_domain *xen_domain = dom_iommu(d)->arch.priv;
+
+    if ( !xen_domain )
+        return;
+
+    spin_lock(&xen_domain->lock);
+    /*
+     * Destroy the Root IPMMU domain whose context is mapped to this Xen
+     * domain, if it exists.
+     */
+    if ( xen_domain->root_domain )
+        ipmmu_free_root_domain(xen_domain->root_domain);
+
+    spin_unlock(&xen_domain->lock);
+
+    /*
+     * We assume that all master devices have already been detached from
+     * this Xen domain and there must be no associated Cache IPMMU domains
+     * in use.
+     */
+    ASSERT(list_empty(&xen_domain->cache_domains));
+    xfree(xen_domain);
+    dom_iommu(d)->arch.priv = NULL;
+}
+
+static const struct iommu_ops ipmmu_iommu_ops =
+{
+    .init            = ipmmu_iommu_domain_init,
+    .hwdom_init      = ipmmu_iommu_hwdom_init,
+    .teardown        = ipmmu_iommu_domain_teardown,
+    .iotlb_flush     = ipmmu_iotlb_flush,
+    .iotlb_flush_all = ipmmu_iotlb_flush_all,
+    .assign_device   = ipmmu_assign_device,
+    .reassign_device = ipmmu_reassign_device,
+    .map_page        = arm_iommu_map_page,
+    .unmap_page      = arm_iommu_unmap_page,
+    .add_device      = ipmmu_add_device,
+};
+
+/* RCAR GEN3 product and cut information. */
+#define RCAR_PRODUCT_MASK    0x00007F00
+#define RCAR_PRODUCT_H3      0x00004F00
+#define RCAR_PRODUCT_M3      0x00005200
+#define RCAR_PRODUCT_M3N     0x00005500
+#define RCAR_CUT_MASK        0x000000FF
+#define RCAR_CUT_VER30       0x00000020
+
+static __init bool ipmmu_stage2_supported(void)
+{
+    struct dt_device_node *np;
+    uint64_t addr, size;
+    void __iomem *base;
+    uint32_t product, cut;
+    static enum
+    {
+        UNKNOWN,
+        SUPPORTED,
+        NOTSUPPORTED
+    } stage2_supported = UNKNOWN;
+
+    /* Use the flag to avoid checking for the compatibility more than once. */
+    switch ( stage2_supported )
+    {
+    case SUPPORTED:
+        return true;
+
+    case NOTSUPPORTED:
+        return false;
+
+    case UNKNOWN:
+    default:
+        stage2_supported = NOTSUPPORTED;
+        break;
+    }
+
+    np = dt_find_compatible_node(NULL, NULL, "renesas,prr");
+    if ( !np )
+    {
+        printk(XENLOG_ERR "ipmmu: Failed to find PRR node\n");
+        return false;
+    }
+
+    if ( dt_device_get_address(np, 0, &addr, &size) )
+    {
+        printk(XENLOG_ERR "ipmmu: Failed to get PRR MMIO\n");
+        return false;
+    }
+
+    base = ioremap_nocache(addr, size);
+    if ( !base )
+    {
+        printk(XENLOG_ERR "ipmmu: Failed to ioremap PRR MMIO\n");
+        return false;
+    }
+
+    product = readl(base);
+    cut = product & RCAR_CUT_MASK;
+    product &= RCAR_PRODUCT_MASK;
+
+    switch ( product )
+    {
+    case RCAR_PRODUCT_H3:
+    case RCAR_PRODUCT_M3:
+        if ( cut >= RCAR_CUT_VER30 )
+            stage2_supported = SUPPORTED;
+        break;
+
+    case RCAR_PRODUCT_M3N:
+        stage2_supported = SUPPORTED;
+        break;
+
+    default:
+        printk(XENLOG_ERR "ipmmu: Unsupported SoC version\n");
+        break;
+    }
+
+    iounmap(base);
+
+    return stage2_supported == SUPPORTED;
+}
+
+static const struct dt_device_match ipmmu_dt_match[] __initconst =
+{
+    DT_MATCH_COMPATIBLE("renesas,ipmmu-r8a7795"),
+    DT_MATCH_COMPATIBLE("renesas,ipmmu-r8a77965"),
+    DT_MATCH_COMPATIBLE("renesas,ipmmu-r8a7796"),
+    { /* sentinel */ },
+};
+
+static __init int ipmmu_init(struct dt_device_node *node, const void *data)
+{
+    int ret;
+
+    /*
+     * Even if the device can't be initialized, we don't want to give
+     * the IPMMU device to dom0.
+     */
+    dt_device_set_used_by(node, DOMID_XEN);
+
+    if ( !iommu_hap_pt_share )
+    {
+        printk_once(XENLOG_ERR "ipmmu: P2M table must always be shared between the CPU and the IPMMU\n");
+        return -EINVAL;
+    }
+
+    if ( !ipmmu_stage2_supported() )
+    {
+        printk_once(XENLOG_ERR "ipmmu: P2M sharing is not supported in current SoC revision\n");
+        return -EOPNOTSUPP;
+    }
+    else
+    {
+        /*
+         * As 4-level translation table is not supported in IPMMU, we need
+         * to check IPA size used for P2M table beforehand to be sure it is
+         * 3-level and the IPMMU will be able to use it.
+         *
+         * TODO: First initialize the IOMMU and gather the requirements and
+         * then initialize the P2M. In the P2M code, take into account
+         * the IOMMU requirements and restrict "pa_range" if necessary.
+         */
+        if ( IPMMU_MAX_P2M_IPA_BITS < p2m_ipa_bits )
+        {
+            printk_once(XENLOG_ERR "ipmmu: P2M IPA size is not supported (P2M=%u IPMMU=%u)!\n",
+                        p2m_ipa_bits, IPMMU_MAX_P2M_IPA_BITS);
+            return -EOPNOTSUPP;
+        }
+    }
+
+    ret = ipmmu_probe(node);
+    if ( ret )
+    {
+        dev_err(&node->dev, "Failed to init IPMMU (%d)\n", ret);
+        return ret;
+    }
+
+    iommu_set_ops(&ipmmu_iommu_ops);
+
+    return 0;
+}
+
+DT_DEVICE_START(ipmmu, "Renesas IPMMU-VMSA", DEVICE_IOMMU)
+    .dt_match = ipmmu_dt_match,
+    .init = ipmmu_init,
+DT_DEVICE_END
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
-- 
2.7.4
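
Editorial note: the SoC check in ipmmu_stage2_supported() above reads the
Product Register (PRR), masks out the product and cut fields, and accepts
H3/M3 from cut 3.0 onwards and any M3N. The decode step alone can be sketched
as a pure function, as below. This is only a sketch under assumptions: the
helper name prr_stage2_supported() is hypothetical and not part of the patch,
and the raw register values used to exercise it are illustrative rather than
taken from a datasheet. The masks and constants are copied from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Masks and values copied from the patch above. */
#define RCAR_PRODUCT_MASK    0x00007F00u
#define RCAR_PRODUCT_H3      0x00004F00u
#define RCAR_PRODUCT_M3      0x00005200u
#define RCAR_PRODUCT_M3N     0x00005500u
#define RCAR_CUT_MASK        0x000000FFu
#define RCAR_CUT_VER30       0x00000020u

/*
 * Hypothetical helper (not in the patch): decide from a raw PRR value
 * whether stage 2 translation table format is usable. No MMIO access here.
 */
static bool prr_stage2_supported(uint32_t prr)
{
    uint32_t product = prr & RCAR_PRODUCT_MASK;
    uint32_t cut = prr & RCAR_CUT_MASK;

    switch ( product )
    {
    case RCAR_PRODUCT_H3:
    case RCAR_PRODUCT_M3:
        /* Only cut 3.0 and newer qualify for these products. */
        return cut >= RCAR_CUT_VER30;

    case RCAR_PRODUCT_M3N:
        /* Every cut qualifies. */
        return true;

    default:
        return false;
    }
}
```

Keeping the decode separate from the ioremap/readl part would make the
revision policy easy to check in isolation, which is the only reason for
splitting it out here.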


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel


* Re: [Xen-devel] [PATCH V2 0/6] iommu/arm: Add Renesas IPMMU-VMSA support + Linux's iommu_fwspec
  2019-08-02 16:39 [Xen-devel] [PATCH V2 0/6] iommu/arm: Add Renesas IPMMU-VMSA support + Linux's iommu_fwspec Oleksandr Tyshchenko
                   ` (5 preceding siblings ...)
  2019-08-02 16:39 ` [Xen-devel] [PATCH V2 6/6] iommu/arm: Add Renesas IPMMU-VMSA support Oleksandr Tyshchenko
@ 2019-08-05  7:58 ` Oleksandr
  2019-08-05  8:29   ` Julien Grall
  6 siblings, 1 reply; 59+ messages in thread
From: Oleksandr @ 2019-08-05  7:58 UTC (permalink / raw)
  To: xen-devel
  Cc: Oleksandr Tyshchenko, julien.grall, sstabellini, Yoshihiro Shimoda


Hello, all.


Forgot to mention that an additional patch from Xen staging is needed, 
otherwise Xen may crash at the early stage:
ead6b9f78355e8d366e0c80c4a73fa7fbd6d26cc
"xen/arm: cpuerrata: Align a virtual address before unmap"


-- 
Regards,

Oleksandr Tyshchenko



* Re: [Xen-devel] [PATCH V2 0/6] iommu/arm: Add Renesas IPMMU-VMSA support + Linux's iommu_fwspec
  2019-08-05  7:58 ` [Xen-devel] [PATCH V2 0/6] iommu/arm: Add Renesas IPMMU-VMSA support + Linux's iommu_fwspec Oleksandr
@ 2019-08-05  8:29   ` Julien Grall
  0 siblings, 0 replies; 59+ messages in thread
From: Julien Grall @ 2019-08-05  8:29 UTC (permalink / raw)
  To: Oleksandr, xen-devel; +Cc: Oleksandr Tyshchenko, Yoshihiro Shimoda, sstabellini

On 05/08/2019 08:58, Oleksandr wrote:
> 
> Hello, all.

Hi,

> 
> Forgot to mention that an additional patch from Xen staging is needed, otherwise 
> Xen may crash at the early stage:
> ead6b9f78355e8d366e0c80c4a73fa7fbd6d26cc
> "xen/arm: cpuerrata: Align a virtual address before unmap"

This patch is already merged in Xen :).

Cheers,

-- 
Julien Grall


* Re: [Xen-devel] [PATCH V2 3/6] [RFC] xen/common: Introduce _xrealloc function
  2019-08-02 16:39 ` [Xen-devel] [PATCH V2 3/6] [RFC] xen/common: Introduce _xrealloc function Oleksandr Tyshchenko
@ 2019-08-05 10:02   ` Jan Beulich
  2019-08-06 18:50     ` Oleksandr
  2019-08-06 19:51     ` Volodymyr Babchuk
  0 siblings, 2 replies; 59+ messages in thread
From: Jan Beulich @ 2019-08-05 10:02 UTC (permalink / raw)
  To: Oleksandr Tyshchenko
  Cc: sstabellini, Wei Liu, Konrad Rzeszutek Wilk, George Dunlap,
	Andrew Cooper, Ian Jackson, Tim Deegan, Oleksandr Tyshchenko,
	julien.grall, xen-devel

On 02.08.2019 18:39, Oleksandr Tyshchenko wrote:
> --- a/xen/common/xmalloc_tlsf.c
> +++ b/xen/common/xmalloc_tlsf.c
> @@ -610,6 +610,27 @@ void *_xzalloc(unsigned long size, unsigned long align)
>       return p ? memset(p, 0, size) : p;
>   }
>   
> +void *_xrealloc(void *p, unsigned long new_size, unsigned long align)
> +{
> +    void *new_p;
> +
> +    if ( !new_size )
> +    {
> +        xfree(p);
> +        return NULL;
> +    }
> +
> +    new_p = _xmalloc(new_size, align);
> +
> +    if ( new_p && p )
> +    {
> +        memcpy(new_p, p, new_size);
> +        xfree(p);
> +    }
> +
> +    return new_p;
> +}

While I can see why having a re-allocation function may be handy,
explicit / direct use of _xmalloc() and _xzalloc() are discouraged,
in favor of the more type-safe underscore-less variants. I can't
see though how a type-safe "realloc" could look like, except for
arrays. If resizing arrays is all you're after, I'd like to
recommend to go that route rather than the suggested one here. If
resizing arbitrary objects is the goal, then what you suggest may
be the only route, but I'd still be not overly happy to see such
added.

Furthermore you don't even use internals of the allocator: It is
common practice to avoid re-allocation if the requested size fits
within the already allocated block. That's not the least helpful
because in such a case you can't possibly suffer any -ENOMEM
condition.

And finally - please note _xmalloc()'s and _xfree()'s use /
special casing of ZERO_BLOCK_PTR: You absolutely would need to
mirror this here.

Jan
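
Editorial note: the two allocator points raised above (reuse the existing
block when the new size fits, and mirror the ZERO_BLOCK_PTR special case) can
be illustrated with a small userspace model. This is a sketch under
assumptions, not Xen code: sk_malloc(), sk_free(), sk_realloc() and the
blk_hdr header are hypothetical names, the allocator is plain malloc() with a
size header in front of each block, and ZERO_BLOCK_PTR is modelled as the
address of a static marker object. The sketch also copies only the old
block's size when growing, whereas the posted version copies new_size bytes.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for Xen's ZERO_BLOCK_PTR: non-NULL marker for zero-size blocks. */
static char zero_block;
#define ZERO_BLOCK_PTR ((void *)&zero_block)

/* Hypothetical per-block header recording the usable size. */
union blk_hdr {
    size_t size;
    max_align_t align;   /* keeps the payload suitably aligned */
};

static void *sk_malloc(size_t size)
{
    union blk_hdr *h;

    if ( size == 0 )
        return ZERO_BLOCK_PTR;

    h = malloc(sizeof(*h) + size);
    if ( h == NULL )
        return NULL;

    h->size = size;
    return h + 1;
}

static void sk_free(void *p)
{
    /* Both NULL and the zero-size marker own no memory. */
    if ( p == NULL || p == ZERO_BLOCK_PTR )
        return;

    free((union blk_hdr *)p - 1);
}

static void *sk_realloc(void *p, size_t new_size)
{
    size_t old_size = 0;
    void *new_p;

    if ( new_size == 0 )
    {
        sk_free(p);
        return ZERO_BLOCK_PTR;
    }

    if ( p != NULL && p != ZERO_BLOCK_PTR )
    {
        old_size = ((union blk_hdr *)p - 1)->size;

        /* Fits in the existing block: no allocation, so no failure path. */
        if ( new_size <= old_size )
            return p;
    }

    new_p = sk_malloc(new_size);
    if ( new_p != NULL && old_size != 0 )
    {
        /* Growing: copy only what the old block actually held. */
        memcpy(new_p, p, old_size);
        sk_free(p);
    }

    /* On failure NULL is returned and the old block stays valid. */
    return new_p;
}
```

A real implementation inside the allocator would read the block size from
the allocator's own metadata instead of a separate header.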

* Re: [Xen-devel] [PATCH V2 3/6] [RFC] xen/common: Introduce _xrealloc function
  2019-08-05 10:02   ` Jan Beulich
@ 2019-08-06 18:50     ` Oleksandr
  2019-08-07  6:22       ` Jan Beulich
  2019-08-06 19:51     ` Volodymyr Babchuk
  1 sibling, 1 reply; 59+ messages in thread
From: Oleksandr @ 2019-08-06 18:50 UTC (permalink / raw)
  To: Jan Beulich
  Cc: sstabellini, Wei Liu, Konrad Rzeszutek Wilk, George Dunlap,
	Andrew Cooper, Ian Jackson, Tim Deegan, Oleksandr Tyshchenko,
	julien.grall, xen-devel


On 05.08.19 13:02, Jan Beulich wrote:

Hi, Jan

> While I can see why having a re-allocation function may be handy,
> explicit / direct use of _xmalloc() and _xzalloc() are discouraged,
> in favor of the more type-safe underscore-less variants.

Taken into account.


> I can't
> see though how a type-safe "realloc" could look like, except for
> arrays. If resizing arrays is all you're after, I'd like to
> recommend to go that route rather than the suggested one here. If
> resizing arbitrary objects is the goal, then what you suggest may
> be the only route, but I'd still be not overly happy to see such
> added.

My main goal is to get the "iommu_fwspec" support ported from Linux (the
xrealloc user) in [1].

I tried to retain as much of the original code as possible while porting, so
this patch adds almost exactly what the ported code expects.

That said, I would be OK with modifying the code to resize an array instead,
or with any other variant, if there is one.


>
> Furthermore you don't even use internals of the allocator: It is
> common practice to avoid re-allocation if the requested size fits
> within the already allocated block. That's not the least helpful
> because in such a case you can't possibly suffer any -ENOMEM
> condition.

Agreed, taken into account as well.


>
> And finally - please note _xmalloc()'s and _xfree()'s use /
> special casing of ZERO_BLOCK_PTR: You absolutely would need to
> mirror this here.

Got it, I will use it for zero-size allocations.


[1] 
https://lists.xenproject.org/archives/html/xen-devel/2019-08/msg00257.html


Thank you.


-- 
Regards,

Oleksandr Tyshchenko



* Re: [Xen-devel] [PATCH V2 3/6] [RFC] xen/common: Introduce _xrealloc function
  2019-08-05 10:02   ` Jan Beulich
  2019-08-06 18:50     ` Oleksandr
@ 2019-08-06 19:51     ` Volodymyr Babchuk
  2019-08-07  6:26       ` Jan Beulich
  1 sibling, 1 reply; 59+ messages in thread
From: Volodymyr Babchuk @ 2019-08-06 19:51 UTC (permalink / raw)
  To: Jan Beulich
  Cc: sstabellini, Wei Liu, Konrad Rzeszutek Wilk, George Dunlap,
	Andrew Cooper, Ian Jackson, Tim Deegan, Oleksandr Tyshchenko,
	Oleksandr Tyshchenko, julien.grall, xen-devel


Hi Jan,

Jan Beulich writes:

> On 02.08.2019 18:39, Oleksandr Tyshchenko wrote:
>> --- a/xen/common/xmalloc_tlsf.c
>> +++ b/xen/common/xmalloc_tlsf.c
>> @@ -610,6 +610,27 @@ void *_xzalloc(unsigned long size, unsigned long align)
>>       return p ? memset(p, 0, size) : p;
>>   }
>>   
>> +void *_xrealloc(void *p, unsigned long new_size, unsigned long align)
>> +{
>> +    void *new_p;
>> +
>> +    if ( !new_size )
>> +    {
>> +        xfree(p);
>> +        return NULL;
>> +    }
>> +
>> +    new_p = _xmalloc(new_size, align);
>> +
>> +    if ( new_p && p )
>> +    {
>> +        memcpy(new_p, p, new_size);
>> +        xfree(p);
>> +    }
>> +
>> +    return new_p;
>> +}
>
> While I can see why having a re-allocation function may be handy,
> explicit / direct use of _xmalloc() and _xzalloc() are discouraged,
> in favor of the more type-safe underscore-less variants. I can't
> see though how a type-safe "realloc" could look like, except for
> arrays. If resizing arrays is all you're after, I'd like to
> recommend to go that route rather than the suggested one here. If
> resizing arbitrary objects is the goal, then what you suggest may
> be the only route, but I'd still be not overly happy to see such
> added.

I can see 3 uses for realloc:

 a. re-allocate generic data buffer
 b. re-allocate array
 c. re-allocate struct with flexible buffer.

option c. is about structures like this:

struct arrlen
{
        size_t len;
        int data[1];
};

This is Oleksandr's case.

So for option a. we can use _xrealloc(ptr, size, align)
For option b. we can use xrealloc_array(_ptr, _type, _num)
And for option c. I propose to implement the following macro:

#define realloc_flex_struct(_ptr, _type, _field, _len)                        \
 ((_type *)_xrealloc(_ptr, offsetof(_type, _field[_len]) , __alignof__(_type)))

It can be used in the following way:

newptr = realloc_flex_struct(ptr, struct arrlen, data, newsize);

As you can see, this approach is type-safe and covers Oleksandr's case.

-- 
Volodymyr Babchuk at EPAM
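
Editorial note: the macro proposed above can be tried out in userspace with
the standard realloc() in place of _xrealloc(). The sketch below rests on
assumptions: it drops the alignment argument (realloc() has none), uses a C99
flexible array member instead of data[1], computes the size as the field
offset plus the element count times the element size rather than relying on
offsetof() with a runtime index, and adds a hypothetical arrlen_resize()
wrapper that is not part of the proposal.

```c
#include <stddef.h>
#include <stdlib.h>

struct arrlen {
    size_t len;
    int data[];   /* C99 flexible array member; the mail uses data[1] */
};

/*
 * Userspace variant of the proposed macro, built on realloc().
 * Size = offset of the flexible field + _len elements of that field.
 * The result is cast to _type *, but _ptr itself is not type-checked.
 */
#define realloc_flex_struct(_ptr, _type, _field, _len)                    \
    ((_type *)realloc(_ptr, offsetof(_type, _field) +                     \
                            (_len) * sizeof(((_type *)0)->_field[0])))

/* Hypothetical wrapper: resize to n elements and record the new length. */
static struct arrlen *arrlen_resize(struct arrlen *old, size_t n)
{
    struct arrlen *p = realloc_flex_struct(old, struct arrlen, data, n);

    if ( p != NULL )
        p->len = n;

    /* On failure NULL is returned and 'old' is still owned by the caller. */
    return p;
}
```

Note that the macro takes four arguments, so every call site has to name the
flexible field ("data" here) explicitly.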

* Re: [Xen-devel] [PATCH V2 6/6] iommu/arm: Add Renesas IPMMU-VMSA support
  2019-08-02 16:39 ` [Xen-devel] [PATCH V2 6/6] iommu/arm: Add Renesas IPMMU-VMSA support Oleksandr Tyshchenko
@ 2019-08-07  2:41   ` Yoshihiro Shimoda
  2019-08-07 16:01     ` Oleksandr
  2019-08-14 17:38   ` Julien Grall
  1 sibling, 1 reply; 59+ messages in thread
From: Yoshihiro Shimoda @ 2019-08-07  2:41 UTC (permalink / raw)
  To: Oleksandr Tyshchenko, xen-devel
  Cc: Oleksandr Tyshchenko, julien.grall, sstabellini

Hi Oleksandr-san,

I have access to the datasheet for this hardware, so I reviewed this patch.
I'm not familiar with the Xen development rules, so some comments might
not be a good fit. If so, please ignore them :)

> From: Oleksandr Tyshchenko, Sent: Saturday, August 3, 2019 1:40 AM
> 
> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> 
> The IPMMU-VMSA is VMSA-compatible I/O Memory Management Unit (IOMMU)
> which provides address translation and access protection functionalities
> to processing units and interconnect networks.
> 
> Please note, current driver is supposed to work only with newest
> Gen3 SoCs revisions which IPMMU hardware supports stage 2 translation

This should be "R-Car Gen3 SoCs", instead of "Gen3 SoCs".

> table format and is able to use CPU's P2M table as is if one is
> 3-level page table (up to 40 bit IPA).
> 
> The major differences compare to the Linux driver are:
> 
> 1. Stage 1/Stage 2 translation. Linux driver supports Stage 1
> translation only (with Stage 1 translation table format). It manages
> page table by itself. But Xen driver supports Stage 2 translation
> (with Stage 2 translation table format) to be able to share the P2M
> with the CPU. Stage 1 translation is always bypassed in Xen driver.
> 
> So, Xen driver is supposed to be used with newest Gen3 SoC revisions only

Same here.

> (H3 ES3.0, M3 ES3.0, etc.) which IPMMU H/W supports stage 2 translation

According to the latest manual, M3 ES3.0 is named "M3-W+".

> table format.

<snip>
> diff --git a/xen/drivers/passthrough/arm/ipmmu-vmsa.c b/xen/drivers/passthrough/arm/ipmmu-vmsa.c
> new file mode 100644
> index 0000000..a34a8f8
> --- /dev/null
> +++ b/xen/drivers/passthrough/arm/ipmmu-vmsa.c
> @@ -0,0 +1,1342 @@
> +/*
> + * xen/drivers/passthrough/arm/ipmmu-vmsa.c
> + *
> + * Driver for the Renesas IPMMU-VMSA found in R-Car Gen3 SoCs.
> + *
> + * The IPMMU-VMSA is VMSA-compatible I/O Memory Management Unit (IOMMU)
> + * which provides address translation and access protection functionalities
> + * to processing units and interconnect networks.
> + *
> + * Please note, current driver is supposed to work only with newest Gen3 SoCs
> + * revisions which IPMMU hardware supports stage 2 translation table format and
> + * is able to use CPU's P2M table as is.
> + *
> + * Based on Linux's IPMMU-VMSA driver from Renesas BSP:
> + *    drivers/iommu/ipmmu-vmsa.c

So I think the Linux driver's copyright notices should be listed here.

> + * you can found at:
> + *    url: git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas-bsp.git
> + *    branch: v4.14.75-ltsi/rcar-3.9.6
> + *    commit: e206eb5b81a60e64c35fbc3a999b1a0db2b98044
> + * and Xen's SMMU driver:
> + *    xen/drivers/passthrough/arm/smmu.c
> + *
> + * Copyright (C) 2016-2019 EPAM Systems Inc.
> + *
> + * This program is free software; you can redistribute it and/or
> + * modify it under the terms and conditions of the GNU General Public
> + * License, version 2, as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public
> + * License along with this program; If not, see <http://www.gnu.org/licenses/>.

I don't know Xen's rules for license descriptions, but since a few source files
already have an SPDX-License-Identifier, can we also use one in this driver?

> + */
> +
> +#include <xen/delay.h>
> +#include <xen/err.h>
> +#include <xen/irq.h>
> +#include <xen/lib.h>
> +#include <xen/list.h>

I don't know the rules for Xen passthrough drivers, but doesn't this need
#include <xen/iommu.h>? (xen/sched.h seems to include it, so no compile
error occurs.)

<snip>
> +/* Registers Definition */
> +#define IM_CTX_SIZE    0x40
> +
> +#define IMCTR                0x0000
> +/*
> + * These fields are implemented in IPMMU-MM only. So, can be set for
> + * Root IPMMU only.
> + */
> +#define IMCTR_VA64           (1 << 29)
> +#define IMCTR_TRE            (1 << 17)
> +#define IMCTR_AFE            (1 << 16)
> +#define IMCTR_RTSEL_MASK     (3 << 4)
> +#define IMCTR_RTSEL_SHIFT    4
> +#define IMCTR_TREN           (1 << 3)
> +/*
> + * These fields are common for all IPMMU devices. So, can be set for
> + * Cache IPMMUs as well.
> + */
> +#define IMCTR_INTEN          (1 << 2)
> +#define IMCTR_FLUSH          (1 << 1)
> +#define IMCTR_MMUEN          (1 << 0)
> +#define IMCTR_COMMON_MASK    (7 << 0)
> +
> +#define IMCAAR               0x0004
> +
> +#define IMTTBCR                        0x0008
> +#define IMTTBCR_EAE                    (1 << 31)
> +#define IMTTBCR_PMB                    (1 << 30)
> +#define IMTTBCR_SH1_NON_SHAREABLE      (0 << 28)
> +#define IMTTBCR_SH1_OUTER_SHAREABLE    (2 << 28)
> +#define IMTTBCR_SH1_INNER_SHAREABLE    (3 << 28)
> +#define IMTTBCR_SH1_MASK               (3 << 28)
> +#define IMTTBCR_ORGN1_NC               (0 << 26)
> +#define IMTTBCR_ORGN1_WB_WA            (1 << 26)
> +#define IMTTBCR_ORGN1_WT               (2 << 26)
> +#define IMTTBCR_ORGN1_WB               (3 << 26)
> +#define IMTTBCR_ORGN1_MASK             (3 << 26)
> +#define IMTTBCR_IRGN1_NC               (0 << 24)
> +#define IMTTBCR_IRGN1_WB_WA            (1 << 24)
> +#define IMTTBCR_IRGN1_WT               (2 << 24)
> +#define IMTTBCR_IRGN1_WB               (3 << 24)
> +#define IMTTBCR_IRGN1_MASK             (3 << 24)
> +#define IMTTBCR_TSZ1_MASK              (1f << 16)

Nothing uses it at the moment, but this should be (0x1f << 16).
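
A minimal standalone sketch of the corrected mask, for illustration only. A macro that is never expanded is never parsed as an expression, which is why the invalid constant `1f` compiles silently today. The names IMTTBCR_TSZ1_SHIFT and imttbcr_tsz1() are assumptions made for this example and are not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

#define IMTTBCR_TSZ1_SHIFT    16                             /* assumed name */
#define IMTTBCR_TSZ1_MASK     (0x1fU << IMTTBCR_TSZ1_SHIFT)  /* bits 20:16 */

/* The corrected mask is a contiguous 5-bit field starting at bit 16. */
static_assert(IMTTBCR_TSZ1_MASK == 0x001f0000U, "TSZ1 mask covers bits 20:16");

/* Hypothetical helper: extract the TSZ1 field from an IMTTBCR value. */
static inline uint32_t imttbcr_tsz1(uint32_t imttbcr)
{
    return (imttbcr & IMTTBCR_TSZ1_MASK) >> IMTTBCR_TSZ1_SHIFT;
}
```

With the original `(1f << 16)` the first use of the macro would fail to compile, so the fix is worth making even while the macro is unused.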

<snip>
> +/* Xen IOMMU ops */
> +static int __must_check ipmmu_iotlb_flush_all(struct domain *d)
> +{
> +    struct ipmmu_vmsa_xen_domain *xen_domain = dom_iommu(d)->arch.priv;
> +
> +    if ( !xen_domain || !xen_domain->root_domain )
> +        return 0;
> +
> +    spin_lock(&xen_domain->lock);

Are local IRQs already disabled here?
If not, you should use spin_lock_irqsave() because ipmmu_irq() also
takes the lock.
# To be honest, in the normal case no IRQ should happen with the current
# implementation, though.

> +    ipmmu_tlb_invalidate(xen_domain->root_domain);
> +    spin_unlock(&xen_domain->lock);
> +
> +    return 0;
> +}
> +
<snip>
> +static int ipmmu_assign_device(struct domain *d, u8 devfn, struct device *dev,
> +                               uint32_t flag)
> +{
> +    struct ipmmu_vmsa_xen_domain *xen_domain = dom_iommu(d)->arch.priv;
> +    struct ipmmu_vmsa_domain *domain;
> +    int ret;
> +
> +    if ( !xen_domain )
> +        return -EINVAL;
> +
> +    if ( !to_ipmmu(dev) )
> +        return -ENODEV;
> +
> +    spin_lock(&xen_domain->lock);

Same here.

> +    /*
> +     * The IPMMU context for the Xen domain is not allocated beforehand
> +     * (at the Xen domain creation time), but on demand only, when the first
> +     * master device being attached to it.
> +     * Create Root IPMMU domain which context will be mapped to this Xen domain
> +     * if not exits yet.
> +     */
> +    if ( !xen_domain->root_domain )
> +    {
> +        domain = ipmmu_alloc_root_domain(d);
> +        if ( IS_ERR(domain) )
> +        {
> +            ret = PTR_ERR(domain);
> +            goto out;
> +        }
> +
> +        xen_domain->root_domain = domain;
> +    }
> +
> +    if ( to_domain(dev) )
> +    {
> +        dev_err(dev, "Already attached to IPMMU domain\n");
> +        ret = -EEXIST;
> +        goto out;
> +    }
> +
> +    /*
> +     * Master devices behind the same Cache IPMMU can be attached to the same
> +     * Cache IPMMU domain.
> +     * Before creating new IPMMU domain check to see if the required one
> +     * already exists for this Xen domain.
> +     */
> +    domain = ipmmu_get_cache_domain(d, dev);
> +    if ( !domain )
> +    {
> +        /* Create new IPMMU domain this master device will be attached to. */
> +        domain = ipmmu_alloc_cache_domain(d);
> +        if ( IS_ERR(domain) )
> +        {
> +            ret = PTR_ERR(domain);
> +            goto out;
> +        }
> +
> +        /* Chain new IPMMU domain to the Xen domain. */
> +        list_add(&domain->list, &xen_domain->cache_domains);
> +    }
> +
> +    ret = ipmmu_attach_device(domain, dev);
> +    if ( ret )
> +    {
> +        /*
> +         * Destroy Cache IPMMU domain only if there are no master devices
> +         * attached to it.
> +         */
> +        if ( !domain->refcount )
> +            ipmmu_free_cache_domain(domain);
> +    }
> +    else
> +    {
> +        domain->refcount++;
> +        set_domain(dev, domain);
> +    }
> +
> +out:
> +    spin_unlock(&xen_domain->lock);
> +
> +    return ret;
> +}
> +
> +static int ipmmu_deassign_device(struct domain *d, struct device *dev)
> +{
> +    struct ipmmu_vmsa_xen_domain *xen_domain = dom_iommu(d)->arch.priv;
> +    struct ipmmu_vmsa_domain *domain = to_domain(dev);
> +
> +    if ( !domain || domain->d != d )
> +    {
> +        dev_err(dev, "Not attached to %pd\n", d);
> +        return -ESRCH;
> +    }
> +
> +    spin_lock(&xen_domain->lock);

Same here.

> +    ipmmu_detach_device(domain, dev);
> +    set_domain(dev, NULL);
> +    domain->refcount--;
> +
> +    /*
> +     * Destroy Cache IPMMU domain only if there are no master devices
> +     * attached to it.
> +     */
> +    if ( !domain->refcount )
> +        ipmmu_free_cache_domain(domain);
> +
> +    spin_unlock(&xen_domain->lock);
> +
> +    return 0;
> +}
<snip>
> +static void __hwdom_init ipmmu_iommu_hwdom_init(struct domain *d)
> +{
> +    /* Set to false options not supported on ARM. */
> +    if ( iommu_hwdom_inclusive )
> +        printk(XENLOG_WARNING "ipmmu: map-inclusive dom0-iommu option is not supported on ARM\n");
> +    iommu_hwdom_inclusive = false;
> +    if ( iommu_hwdom_reserved == 1 )
> +        printk(XENLOG_WARNING "ipmmu: map-reserved dom0-iommu option is not supported on ARM\n");
> +    iommu_hwdom_reserved = 0;
> +
> +    arch_iommu_hwdom_init(d);
> +}
> +
> +static void ipmmu_iommu_domain_teardown(struct domain *d)
> +{
> +    struct ipmmu_vmsa_xen_domain *xen_domain = dom_iommu(d)->arch.priv;
> +
> +    if ( !xen_domain )
> +        return;
> +
> +    spin_lock(&xen_domain->lock);

Same here.

> +    /*
> +     * Destroy Root IPMMU domain which context is mapped to this Xen domain
> +     * if exits.
> +     */
> +    if ( xen_domain->root_domain )
> +        ipmmu_free_root_domain(xen_domain->root_domain);
> +
> +    spin_unlock(&xen_domain->lock);
> +
> +    /*
> +     * We assume that all master devices have already been detached from
> +     * this Xen domain and there must be no associated Cache IPMMU domains
> +     * in use.
> +     */
> +    ASSERT(list_empty(&xen_domain->cache_domains));

I think this should be done while xen_domain->lock is still held.

> +    xfree(xen_domain);
> +    dom_iommu(d)->arch.priv = NULL;
> +}
> +
> +static const struct iommu_ops ipmmu_iommu_ops =
> +{
> +    .init            = ipmmu_iommu_domain_init,
> +    .hwdom_init      = ipmmu_iommu_hwdom_init,
> +    .teardown        = ipmmu_iommu_domain_teardown,
> +    .iotlb_flush     = ipmmu_iotlb_flush,
> +    .iotlb_flush_all = ipmmu_iotlb_flush_all,
> +    .assign_device   = ipmmu_assign_device,
> +    .reassign_device = ipmmu_reassign_device,
> +    .map_page        = arm_iommu_map_page,
> +    .unmap_page      = arm_iommu_unmap_page,
> +    .add_device      = ipmmu_add_device,
> +};
> +
> +/* RCAR GEN3 product and cut information. */

"R-Car Gen3 SoCs" is better than "RCAR GEN3".

> +#define RCAR_PRODUCT_MASK    0x00007F00
> +#define RCAR_PRODUCT_H3      0x00004F00
> +#define RCAR_PRODUCT_M3      0x00005200

I think this should at least be M3W instead of M3.
# FYI, M3-W and M3-W+ have the same product value.

<snip>

Best regards,
Yoshihiro Shimoda


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel


* Re: [Xen-devel] [PATCH V2 3/6] [RFC] xen/common: Introduce _xrealloc function
  2019-08-06 18:50     ` Oleksandr
@ 2019-08-07  6:22       ` Jan Beulich
  2019-08-07 17:31         ` Oleksandr
  0 siblings, 1 reply; 59+ messages in thread
From: Jan Beulich @ 2019-08-07  6:22 UTC (permalink / raw)
  To: Oleksandr
  Cc: sstabellini, Wei Liu, Konrad Rzeszutek Wilk, George Dunlap,
	Andrew Cooper, Ian Jackson, Tim Deegan, Oleksandr Tyshchenko,
	julien.grall, xen-devel

On 06.08.2019 20:50, Oleksandr wrote:
> On 05.08.19 13:02, Jan Beulich wrote:
>> I can't
>> see though how a type-safe "realloc" could look like, except for
>> arrays. If resizing arrays is all you're after, I'd like to
>> recommend to go that route rather then the suggested one here. If
>> resizing arbitrary objects is the goal, then what you suggest may
>> be the only route, but I'd still be not overly happy to see such
>> added.
> 
> My main goal is to get "ported" from Linux "iommu_fwspec" support (xrealloc user) in [1].
> 
> I tried to retain code as much as possible while porting. So, this patch adds almost the same thing what the ported code expects.
> 
> But, I would be OK to consider modifying a code in a way to resize an array as well as any other variants if present.

I've looked at the use in patch 4, and it really isn't an array
allocation. Even a basic allocation would use _xmalloc() in this
case (you'll find examples in the tree if you want). Nevertheless
I'd appreciate it if the type-unsafe _xrealloc() didn't remain the
only re-allocation construct, so as to avoid people using it just
because there's no better alternative.

Jan


* Re: [Xen-devel] [PATCH V2 3/6] [RFC] xen/common: Introduce _xrealloc function
  2019-08-06 19:51     ` Volodymyr Babchuk
@ 2019-08-07  6:26       ` Jan Beulich
  2019-08-07 18:36         ` Oleksandr
  0 siblings, 1 reply; 59+ messages in thread
From: Jan Beulich @ 2019-08-07  6:26 UTC (permalink / raw)
  To: Volodymyr Babchuk
  Cc: sstabellini, Wei Liu, Konrad Rzeszutek Wilk, George Dunlap,
	AndrewCooper, Ian Jackson, Tim Deegan, Oleksandr Tyshchenko,
	Oleksandr Tyshchenko, julien.grall, xen-devel

On 06.08.2019 21:51, Volodymyr Babchuk wrote:
> 
> Hi Jan,
> 
> Jan Beulich writes:
> 
>> On 02.08.2019 18:39, Oleksandr Tyshchenko wrote:
>>> --- a/xen/common/xmalloc_tlsf.c
>>> +++ b/xen/common/xmalloc_tlsf.c
>>> @@ -610,6 +610,27 @@ void *_xzalloc(unsigned long size, unsigned long align)
>>>        return p ? memset(p, 0, size) : p;
>>>    }
>>>    
>>> +void *_xrealloc(void *p, unsigned long new_size, unsigned long align)
>>> +{
>>> +    void *new_p;
>>> +
>>> +    if ( !new_size )
>>> +    {
>>> +        xfree(p);
>>> +        return NULL;
>>> +    }
>>> +
>>> +    new_p = _xmalloc(new_size, align);
>>> +
>>> +    if ( new_p && p )
>>> +    {
>>> +        memcpy(new_p, p, new_size);
>>> +        xfree(p);
>>> +    }
>>> +
>>> +    return new_p;
>>> +}
>>
>> While I can see why having a re-allocation function may be handy,
>> explicit / direct use of _xmalloc() and _xzalloc() are discouraged,
>> in favor of the more type-safe underscore-less variants. I can't
>> see though how a type-safe "realloc" could look like, except for
>> arrays. If resizing arrays is all you're after, I'd like to
>> recommend to go that route rather then the suggested one here. If
>> resizing arbitrary objects is the goal, then what you suggest may
>> be the only route, but I'd still be not overly happy to see such
>> added.
> 
> I can see 3 uses for realloc:
> 
>   a. re-allocate generic data buffer
>   b. re-allocate array
>   c. re-allocate struct with flexible buffer.
> 
> option c. is about structures like this:
> 
> struct arrlen
> {
>          size_t len;
>          int data[1];
> };
> 
> This is Oleksandr's case.
> 
> So for option a. we can use _xreallocate(ptr, size, align)
> For option b. we can use xrealloc_array(_ptr, _type, _num)
> And for option c. I propose to implement the following macro:
> 
> #define realloc_flex_struct(_ptr, _type, _field, _len)                        \
>   ((_type *)_xrealloc(_ptr, offsetof(_type, _field[_len]) , __alignof__(_type)))
> 
> It can be used in the following way:
> 
> newptr = realloc_flex_struct(ptr, struct arrlen, newsize);
> 
> As you can see, this approach is type-safe and covers Oleksanrd's case.

This looks fine to me, but then wants to be accompanied by a
similar xmalloc_flex_struct(), which could be used right away
to replace a number of open-coded instances of the above.

There's one more thing for the re-alloc case though (besides
cosmetic aspects): The incoming pointer should also be verified
to be of correct type.
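
As an illustration of the malloc/realloc pair discussed above, here is a user-space sketch built on malloc() and realloc() instead of Xen's _xmalloc()/_xrealloc(), so it takes no alignment argument. The names flex_struct_size, malloc_flex_struct and arrlen_resize are assumptions made for this example. Note that the usage line quoted above omits the _field argument; the sketch passes it explicitly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Example structure with a trailing flexible array member. */
struct arrlen {
    size_t len;
    int data[];
};

/* Bytes needed for `type` followed by `nr` elements of `field`. */
#define flex_struct_size(type, field, nr) \
    (offsetof(type, field) + (size_t)(nr) * sizeof(((type *)0)->field[0]))

/* With zero elements only the fixed header is allocated. */
static_assert(flex_struct_size(struct arrlen, data, 0) ==
              offsetof(struct arrlen, data), "header only");

/* Stand-ins for the proposed macros, built on malloc()/realloc(). */
#define malloc_flex_struct(type, field, nr) \
    ((type *)malloc(flex_struct_size(type, field, nr)))

#define realloc_flex_struct(ptr, type, field, nr) \
    ((type *)realloc((ptr), flex_struct_size(type, field, nr)))

/* Resize (or create, when `a` is NULL) an arrlen to hold `nr` elements. */
static struct arrlen *arrlen_resize(struct arrlen *a, size_t nr)
{
    struct arrlen *n = realloc_flex_struct(a, struct arrlen, data, nr);

    if ( n )
        n->len = nr;

    return n; /* on failure `a` is still valid and owned by the caller */
}
```

The size is computed from the field's own element type, so callers never spell out sizeof(int) by hand. This sketch does not yet check the type of the incoming pointer.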

Jan


* Re: [Xen-devel] [PATCH V2 6/6] iommu/arm: Add Renesas IPMMU-VMSA support
  2019-08-07  2:41   ` Yoshihiro Shimoda
@ 2019-08-07 16:01     ` Oleksandr
  2019-08-07 19:15       ` Julien Grall
  2019-08-08  4:05       ` Yoshihiro Shimoda
  0 siblings, 2 replies; 59+ messages in thread
From: Oleksandr @ 2019-08-07 16:01 UTC (permalink / raw)
  To: Yoshihiro Shimoda, xen-devel
  Cc: Oleksandr Tyshchenko, julien.grall, sstabellini


Hi, Shimoda-san.

Thank you for the review.


>
>> From: Oleksandr Tyshchenko, Sent: Saturday, August 3, 2019 1:40 AM
>>
>> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
>>
>> The IPMMU-VMSA is VMSA-compatible I/O Memory Management Unit (IOMMU)
>> which provides address translation and access protection functionalities
>> to processing units and interconnect networks.
>>
>> Please note, current driver is supposed to work only with newest
>> Gen3 SoCs revisions which IPMMU hardware supports stage 2 translation
> This should be "R-Car Gen3 SoCs", instead of "Gen3 SoCs".

Will update.


>
>> table format and is able to use CPU's P2M table as is if one is
>> 3-level page table (up to 40 bit IPA).
>>
>> The major differences compare to the Linux driver are:
>>
>> 1. Stage 1/Stage 2 translation. Linux driver supports Stage 1
>> translation only (with Stage 1 translation table format). It manages
>> page table by itself. But Xen driver supports Stage 2 translation
>> (with Stage 2 translation table format) to be able to share the P2M
>> with the CPU. Stage 1 translation is always bypassed in Xen driver.
>>
>> So, Xen driver is supposed to be used with newest Gen3 SoC revisions only
> Same here.

ok


>
>> (H3 ES3.0, M3 ES3.0, etc.) which IPMMU H/W supports stage 2 translation
> According to the latest manual, M3 ES3.0 is named as "M3-W+".

Will update.


>> diff --git a/xen/drivers/passthrough/arm/ipmmu-vmsa.c b/xen/drivers/passthrough/arm/ipmmu-vmsa.c
>> new file mode 100644
>> index 0000000..a34a8f8
>> --- /dev/null
>> +++ b/xen/drivers/passthrough/arm/ipmmu-vmsa.c
>> @@ -0,0 +1,1342 @@
>> +/*
>> + * xen/drivers/passthrough/arm/ipmmu-vmsa.c
>> + *
>> + * Driver for the Renesas IPMMU-VMSA found in R-Car Gen3 SoCs.
>> + *
>> + * The IPMMU-VMSA is VMSA-compatible I/O Memory Management Unit (IOMMU)
>> + * which provides address translation and access protection functionalities
>> + * to processing units and interconnect networks.
>> + *
>> + * Please note, current driver is supposed to work only with newest Gen3 SoCs
>> + * revisions which IPMMU hardware supports stage 2 translation table format and
>> + * is able to use CPU's P2M table as is.
>> + *
>> + * Based on Linux's IPMMU-VMSA driver from Renesas BSP:
>> + *    drivers/iommu/ipmmu-vmsa.c
> So, I think the Linux's Copyrights should be described here.

Yes, will add.


>
>> + * you can found at:
>> + *    url: git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas-bsp.git
>> + *    branch: v4.14.75-ltsi/rcar-3.9.6
>> + *    commit: e206eb5b81a60e64c35fbc3a999b1a0db2b98044
>> + * and Xen's SMMU driver:
>> + *    xen/drivers/passthrough/arm/smmu.c
>> + *
>> + * Copyright (C) 2016-2019 EPAM Systems Inc.
>> + *
>> + * This program is free software; you can redistribute it and/or
>> + * modify it under the terms and conditions of the GNU General Public
>> + * License, version 2, as published by the Free Software Foundation.
>> + *
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>> + * General Public License for more details.
>> + *
>> + * You should have received a copy of the GNU General Public
>> + * License along with this program; If not, see <http://www.gnu.org/licenses/>.
> I don't know that Xen license description rule, but since a few source files have
> SPDX-License-Identifier, can we also use it on the driver?

I am afraid I don't know the correct answer to this question. I would
leave it to the maintainers.

I just followed the sample copyright notice for the GPL v2 license given in
this document:

http://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=CONTRIBUTING


>> + */
>> +
>> +#include <xen/delay.h>
>> +#include <xen/err.h>
>> +#include <xen/irq.h>
>> +#include <xen/lib.h>
>> +#include <xen/list.h>
> I don't know that Xen passthrough driver rule though, doesn't here need
> #include <xen/iommu.h>? (The xen/sched.h seems to have it so that
> no compile error happens though.)

Probably, yes, I should have included that header.


>> +
>> +#define IMTTBCR                        0x0008
>> +#define IMTTBCR_EAE                    (1 << 31)
>> +#define IMTTBCR_PMB                    (1 << 30)
>> +#define IMTTBCR_SH1_NON_SHAREABLE      (0 << 28)
>> +#define IMTTBCR_SH1_OUTER_SHAREABLE    (2 << 28)
>> +#define IMTTBCR_SH1_INNER_SHAREABLE    (3 << 28)
>> +#define IMTTBCR_SH1_MASK               (3 << 28)
>> +#define IMTTBCR_ORGN1_NC               (0 << 26)
>> +#define IMTTBCR_ORGN1_WB_WA            (1 << 26)
>> +#define IMTTBCR_ORGN1_WT               (2 << 26)
>> +#define IMTTBCR_ORGN1_WB               (3 << 26)
>> +#define IMTTBCR_ORGN1_MASK             (3 << 26)
>> +#define IMTTBCR_IRGN1_NC               (0 << 24)
>> +#define IMTTBCR_IRGN1_WB_WA            (1 << 24)
>> +#define IMTTBCR_IRGN1_WT               (2 << 24)
>> +#define IMTTBCR_IRGN1_WB               (3 << 24)
>> +#define IMTTBCR_IRGN1_MASK             (3 << 24)
>> +#define IMTTBCR_TSZ1_MASK              (1f << 16)
> At the moment, no one uses it though, this should be (0x1f << 16).

Will correct.


>
> <snip>
>> +/* Xen IOMMU ops */
>> +static int __must_check ipmmu_iotlb_flush_all(struct domain *d)
>> +{
>> +    struct ipmmu_vmsa_xen_domain *xen_domain = dom_iommu(d)->arch.priv;
>> +
>> +    if ( !xen_domain || !xen_domain->root_domain )
>> +        return 0;
>> +
>> +    spin_lock(&xen_domain->lock);
> Is local irq is already disabled here?
> If no, you should use spin_lock_irqsave() because the ipmmu_irq() also
> gets the lock.


No, they are not disabled. But ipmmu_irq() takes a different lock,
mmu->lock, so I think there won't be a deadlock.

Or have I missed something?

If we worry about ipmmu_tlb_invalidate(), which is called both here (to
perform a flush at the request of the P2M code, which manages the page
table) and from the irq handler (to perform a flush to resume address
translation), I could use a tasklet to schedule ipmmu_tlb_invalidate()
from the irq handler. This way the calls would be serialized. What
do you think?


> # To be honest, in normal case, any irq on the current implementation
> # should not happen though.

Agree here.


>> +    /*
>> +     * Destroy Root IPMMU domain which context is mapped to this Xen domain
>> +     * if exits.
>> +     */
>> +    if ( xen_domain->root_domain )
>> +        ipmmu_free_root_domain(xen_domain->root_domain);
>> +
>> +    spin_unlock(&xen_domain->lock);
>> +
>> +    /*
>> +     * We assume that all master devices have already been detached from
>> +     * this Xen domain and there must be no associated Cache IPMMU domains
>> +     * in use.
>> +     */
>> +    ASSERT(list_empty(&xen_domain->cache_domains));
> I think this should be in the spin lock held by &xen_domain->lock.

OK. Will put spin_unlock after it.


>
>> +    xfree(xen_domain);
>> +    dom_iommu(d)->arch.priv = NULL;
>> +}
>> +
>> +static const struct iommu_ops ipmmu_iommu_ops =
>> +{
>> +    .init            = ipmmu_iommu_domain_init,
>> +    .hwdom_init      = ipmmu_iommu_hwdom_init,
>> +    .teardown        = ipmmu_iommu_domain_teardown,
>> +    .iotlb_flush     = ipmmu_iotlb_flush,
>> +    .iotlb_flush_all = ipmmu_iotlb_flush_all,
>> +    .assign_device   = ipmmu_assign_device,
>> +    .reassign_device = ipmmu_reassign_device,
>> +    .map_page        = arm_iommu_map_page,
>> +    .unmap_page      = arm_iommu_unmap_page,
>> +    .add_device      = ipmmu_add_device,
>> +};
>> +
>> +/* RCAR GEN3 product and cut information. */
> "R-Car Gen3 SoCs" is better than "RCAR GEN3".

Will update.


>
>> +#define RCAR_PRODUCT_MASK    0x00007F00
>> +#define RCAR_PRODUCT_H3      0x00004F00
>> +#define RCAR_PRODUCT_M3      0x00005200
> At least, I think we should be M3W, instead of M3.
> # FYI, M3-W and M3-W+ are the same value.

Will update.


-- 
Regards,

Oleksandr Tyshchenko



* Re: [Xen-devel] [PATCH V2 3/6] [RFC] xen/common: Introduce _xrealloc function
  2019-08-07  6:22       ` Jan Beulich
@ 2019-08-07 17:31         ` Oleksandr
  0 siblings, 0 replies; 59+ messages in thread
From: Oleksandr @ 2019-08-07 17:31 UTC (permalink / raw)
  To: Jan Beulich
  Cc: sstabellini, Wei Liu, Konrad Rzeszutek Wilk, George Dunlap,
	Andrew Cooper, Ian Jackson, Tim Deegan, Oleksandr Tyshchenko,
	julien.grall, xen-devel


Hi,


> Nevertheless
> I'd appreciate if the type-unsafe _xrealloc() didn't remain the
> only re-allocation construct, as to avoiding people using it just
> because there's no better alternative.

I got your point.


-- 
Regards,

Oleksandr Tyshchenko



* Re: [Xen-devel] [PATCH V2 3/6] [RFC] xen/common: Introduce _xrealloc function
  2019-08-07  6:26       ` Jan Beulich
@ 2019-08-07 18:36         ` Oleksandr
  2019-08-08  6:08           ` Jan Beulich
  2019-08-08  7:05           ` Jan Beulich
  0 siblings, 2 replies; 59+ messages in thread
From: Oleksandr @ 2019-08-07 18:36 UTC (permalink / raw)
  To: Jan Beulich, Volodymyr Babchuk
  Cc: sstabellini, Wei Liu, Konrad Rzeszutek Wilk, George Dunlap,
	AndrewCooper, Ian Jackson, Tim Deegan, Oleksandr Tyshchenko,
	julien.grall, xen-devel


Hi, Jan, Volodymyr.


>>   c. re-allocate struct with flexible buffer.
>>
>> option c. is about structures like this:
>>
>> struct arrlen
>> {
>>          size_t len;
>>          int data[1];
>> };
>>
>> This is Oleksandr's case.
>>
>> So for option a. we can use _xreallocate(ptr, size, align)
>> For option b. we can use xrealloc_array(_ptr, _type, _num)
>> And for option c. I propose to implement the following macro:
>>
>> #define realloc_flex_struct(_ptr, _type, _field, _len)                        \
>>   ((_type *)_xrealloc(_ptr, offsetof(_type, _field[_len]) , __alignof__(_type)))
>>
>> It can be used in the following way:
>>
>> newptr = realloc_flex_struct(ptr, struct arrlen, newsize);
>>
>> As you can see, this approach is type-safe and covers Oleksanrd's case.
>
> This looks fine to me, but then wants to be accompanied by a
> similar xmalloc_flex_struct(), which could be used right away
> to replace a number of open-coded instances of the above.

Thank you, Volodymyr, for the idea. It looks like we can get a type-safe
approach that suits my particular case.

So, I need to focus first on a proper implementation of the non-type-safe
(_xrealloc) variant, taking Jan's comments into account. Then I will come
back to the suggested type-safe macro (realloc_flex_struct).


>
> There's one more thing for the re-alloc case though (besides
> cosmetic aspects): The incoming pointer should also be verified
> to be of correct type.

Jan, how could this be technically implemented? Are there any
existing examples in Xen?
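
One possible shape for such a check, shown as a standalone user-space sketch and not as an existing Xen construct: compare the incoming pointer against a null pointer of the expected type inside sizeof, so the compiler type-checks the comparison without evaluating it. This is similar in spirit to Linux's typecheck() macro. The name check_ptr_type and the use of realloc() in place of _xrealloc() are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct arrlen {
    size_t len;
    int data[];
};

/*
 * Compile-time check that `ptr` is compatible with `type *`. The comparison
 * sits inside sizeof, so it is type-checked but never evaluated. Passing,
 * say, a `char *` draws a "comparison of distinct pointer types" diagnostic
 * from GCC and Clang. A `void *` argument is still accepted.
 */
#define check_ptr_type(ptr, type) ((void)sizeof((ptr) == (type *)0))

#define realloc_flex_struct(ptr, type, field, nr)                          \
    (check_ptr_type(ptr, type),                                            \
     (type *)realloc((ptr), offsetof(type, field) +                        \
                            (size_t)(nr) * sizeof(((type *)0)->field[0])))

/* Resize (or create, when `a` is NULL) an arrlen to hold `nr` elements. */
static struct arrlen *arrlen_grow(struct arrlen *a, size_t nr)
{
    struct arrlen *n = realloc_flex_struct(a, struct arrlen, data, nr);

    if ( n )
        n->len = nr;

    return n;
}
```

The diagnostic is only a warning by default, so the check is effective only in builds that treat warnings as errors.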


>
> Jan

-- 
Regards,

Oleksandr Tyshchenko



* Re: [Xen-devel] [PATCH V2 6/6] iommu/arm: Add Renesas IPMMU-VMSA support
  2019-08-07 16:01     ` Oleksandr
@ 2019-08-07 19:15       ` Julien Grall
  2019-08-07 20:28         ` Oleksandr Tyshchenko
                           ` (2 more replies)
  2019-08-08  4:05       ` Yoshihiro Shimoda
  1 sibling, 3 replies; 59+ messages in thread
From: Julien Grall @ 2019-08-07 19:15 UTC (permalink / raw)
  To: Oleksandr, Yoshihiro Shimoda, xen-devel
  Cc: Oleksandr Tyshchenko, sstabellini, Lars Kurth

(+ Lars)

Hi,

On 8/7/19 5:01 PM, Oleksandr wrote:
>>> + * you can found at:
>>> + *    url: git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas-bsp.git
>>> + *    branch: v4.14.75-ltsi/rcar-3.9.6
>>> + *    commit: e206eb5b81a60e64c35fbc3a999b1a0db2b98044
>>> + * and Xen's SMMU driver:
>>> + *    xen/drivers/passthrough/arm/smmu.c
>>> + *
>>> + * Copyright (C) 2016-2019 EPAM Systems Inc.
>>> + *
>>> + * This program is free software; you can redistribute it and/or
>>> + * modify it under the terms and conditions of the GNU General Public
>>> + * License, version 2, as published by the Free Software Foundation.
>>> + *
>>> + * This program is distributed in the hope that it will be useful,
>>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>>> + * General Public License for more details.
>>> + *
>>> + * You should have received a copy of the GNU General Public
>>> + * License along with this program; If not, see <http://www.gnu.org/licenses/>.
>> I don't know that Xen license description rule, but since a few source files have
>> SPDX-License-Identifier, can we also use it on the driver?
> 
> I am afraid, I don't know a correct answer for this question. I would 
> leave this to maintainers.
> 
> I just followed sample copyright notice for GPL v2 License according to 
> the document:
> 
> http://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=CONTRIBUTING

The CONTRIBUTING file only gives examples of common license notices. So I
think it is fine to use SPDX, all the more since it is already used. The
only request is to use either SPDX or the full-blown text, but not
both :). Lars, any objection?

I am quite in favor of SPDX because it makes it easier to find out the
license. With the full-blown text, the wording may vary only slightly between
licenses. For instance, the only difference between GPLv2 and GPLv2+ is
", or (at your option) any later version". You can imagine how easy it is
to miss that when reviewing ;).

We had a discussion last year about using SPDX in Xen code base but I 
never got the time to formally suggest it.
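
For illustration, this is what the two cases mentioned above look like as SPDX tags. The identifiers come from the SPDX license list; whether Xen would prefer these spellings or the older `GPL-2.0` / `GPL-2.0+` forms is an open question in this thread, so treat the choice as an assumption.

```c
/* GPLv2 only, matching the boilerplate text quoted above: */
/* SPDX-License-Identifier: GPL-2.0-only */

/* GPLv2 "or (at your option) any later version": */
/* SPDX-License-Identifier: GPL-2.0-or-later */
```

A file would carry exactly one such line, which makes the "only" versus "or later" distinction hard to overlook in review.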

> 
>>> + */
>>> +
>>> +#include <xen/delay.h>
>>> +#include <xen/err.h>
>>> +#include <xen/irq.h>
>>> +#include <xen/lib.h>
>>> +#include <xen/list.h>
>> I don't know that Xen passthrough driver rule though, doesn't here need
>> #include <xen/iommu.h>? (The xen/sched.h seems to have it so that
>> no compile error happens though.)
> 
> Probably, yes, I should have included that header.

I am fine either way :). Indirect inclusion happens quite often and
we only notice it when someone decides to rework the headers.

[...]
>>> +/* Xen IOMMU ops */
>>> +static int __must_check ipmmu_iotlb_flush_all(struct domain *d)
>>> +{
>>> +    struct ipmmu_vmsa_xen_domain *xen_domain = dom_iommu(d)->arch.priv;
>>> +
>>> +    if ( !xen_domain || !xen_domain->root_domain )
>>> +        return 0;
>>> +
>>> +    spin_lock(&xen_domain->lock);
>> Is local irq is already disabled here?
>> If no, you should use spin_lock_irqsave() because the ipmmu_irq() also
>> gets the lock.
> 
> 
> No, it is not disabled. But, ipmmu_irq() uses another mmu->lock. So, I 
> think, there won't be a deadlock.
> 
> Or I really missed something?
> 
> If we worry about ipmmu_tlb_invalidate() which is called here (to 
> perform a flush by request from P2M code, which manages a page table) 
> and from the irq handler (to perform a flush to resume address 
> translation), I could use a tasklet to schedule ipmmu_tlb_invalidate() 
> from the irq handler then. This way we would get this serialized. What 
> do you think?

I am afraid a tasklet is not an option. You need to perform the TLB
flush when requested, otherwise you are introducing a security issue.

This is because as soon as a region is unmapped in the page table, we
drop the reference on any page backing that region. When the
reference count drops to zero, the page can be reallocated to another
domain or even Xen. If the TLB flush happens afterwards, the guest may
still be able to access the page for a short time if the translation has
been cached by any TLB (IOMMU, processor).
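
The ordering problem can be seen in a toy user-space model, given here only as a sketch. It models one page, one page-table entry and one cached translation; all names are hypothetical and none of this is Xen or IPMMU code.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model: one page, one page-table entry, one cached TLB entry. */
struct page  { int refcount; };
struct model {
    struct page *pte;   /* page-table entry: mapped page or NULL */
    struct page *tlb;   /* translation cached by the IOMMU TLB, or NULL */
};

/* A device access hits the TLB first and only then walks the page table. */
static struct page *device_translate(struct model *m)
{
    if ( !m->tlb )
        m->tlb = m->pte;      /* miss: walk the table and cache the result */

    return m->tlb;
}

static void tlb_flush(struct model *m)
{
    m->tlb = NULL;
}

/* Safe ordering: clear the entry, flush synchronously, then drop the ref. */
static void unmap_page(struct model *m)
{
    struct page *pg = m->pte;

    assert(pg != NULL);
    m->pte = NULL;
    tlb_flush(m);             /* completes before the page can be reused */
    pg->refcount--;           /* page may now be reallocated */
}

/* Unsafe ordering: the flush is left to deferred work such as a tasklet. */
static void unmap_page_deferred_flush(struct model *m)
{
    struct page *pg = m->pte;

    assert(pg != NULL);
    m->pte = NULL;
    pg->refcount--;           /* page is free, but the TLB still maps it */
}
```

In the deferred variant a device access between the unmap and the later flush still resolves to the freed page, which is the window described above.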

[...]

>>> +    /*
>>> +     * Destroy Root IPMMU domain which context is mapped to this Xen domain
>>> +     * if exits.
>>> +     */
>>> +    if ( xen_domain->root_domain )
>>> +        ipmmu_free_root_domain(xen_domain->root_domain);
>>> +
>>> +    spin_unlock(&xen_domain->lock);
>>> +
>>> +    /*
>>> +     * We assume that all master devices have already been detached from
>>> +     * this Xen domain and there must be no associated Cache IPMMU domains
>>> +     * in use.
>>> +     */
>>> +    ASSERT(list_empty(&xen_domain->cache_domains));
>> I think this should be in the spin lock held by &xen_domain->lock.
> 
> OK. Will put spin_unlock after it.

The spin_lock is actually pointless here. This is done when the domain 
is destroyed, so nobody should touch it.

If you think concurrent access can still happen, then you are going to 
be in deep trouble as you free the xen_domain (and therefore the 
spinlock) below :).

Cheers,

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Xen-devel] [PATCH V2 6/6] iommu/arm: Add Renesas IPMMU-VMSA support
  2019-08-07 19:15       ` Julien Grall
@ 2019-08-07 20:28         ` Oleksandr Tyshchenko
  2019-08-08  9:05           ` Julien Grall
  2019-08-08 12:28         ` Oleksandr
  2019-08-08 14:23         ` Lars Kurth
  2 siblings, 1 reply; 59+ messages in thread
From: Oleksandr Tyshchenko @ 2019-08-07 20:28 UTC (permalink / raw)
  To: Julien Grall
  Cc: xen-devel, Yoshihiro Shimoda, sstabellini, Lars Kurth,
	Oleksandr Tyshchenko

[-- Attachment #1.1: Type: text/plain, Size: 1611 bytes --]

Hi, Julien.
Sorry for the possible format issues.


> > No, it is not disabled. But, ipmmu_irq() uses another mmu->lock. So, I
> > think, there won't be a deadlock.
> >
> > Or I really missed something?
> >
> > If we worry about ipmmu_tlb_invalidate() which is called here (to
> > perform a flush by request from P2M code, which manages a page table)
> > and from the irq handler (to perform a flush to resume address
> > translation), I could use a tasklet to schedule ipmmu_tlb_invalidate()
> > from the irq handler then. This way we would get this serialized. What
> > do you think?
>
> I am afraid a tasklet is not an option. You need to perform the TLB
> flush when requested otherwise you are introducing a security issue.
>
> This is because as soon as a region is unmapped in the page table, we
> drop the reference on any page backing that region. When the reference
> count drops to zero, the page can be reallocated to another domain or
> even to Xen. If the TLB flush happens afterwards, the guest may still be
> able to access the page for a short time if the translation has been
> cached by any TLB (IOMMU, processor).
>

I understand this. I am not proposing to delay a TLB flush requested by the
P2M code in any case. I am only proposing to issue the TLB flush that we have
to perform on page faults (to resolve the error condition and resume
translation) from a tasklet rather than directly from the interrupt handler.
This is the TLB flush I am speaking about:

https://github.com/otyshchenko1/xen/blob/ipmmu_upstream2/xen/drivers/passthrough/arm/ipmmu-vmsa.c#L598

Sorry if I was unclear.

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Xen-devel] [PATCH V2 6/6] iommu/arm: Add Renesas IPMMU-VMSA support
  2019-08-07 16:01     ` Oleksandr
  2019-08-07 19:15       ` Julien Grall
@ 2019-08-08  4:05       ` Yoshihiro Shimoda
  1 sibling, 0 replies; 59+ messages in thread
From: Yoshihiro Shimoda @ 2019-08-08  4:05 UTC (permalink / raw)
  To: Oleksandr, xen-devel; +Cc: Oleksandr Tyshchenko, julien.grall, sstabellini

Hi Oleksandr-san,

> From: Oleksandr, Sent: Thursday, August 8, 2019 1:01 AM
> 
> 
> Hi, Shimoda-san.
> 
> Thank you for the review.

You're welcome.

<snip>
> > +/* Xen IOMMU ops */
> >> +static int __must_check ipmmu_iotlb_flush_all(struct domain *d)
> >> +{
> >> +    struct ipmmu_vmsa_xen_domain *xen_domain = dom_iommu(d)->arch.priv;
> >> +
> >> +    if ( !xen_domain || !xen_domain->root_domain )
> >> +        return 0;
> >> +
> >> +    spin_lock(&xen_domain->lock);
> > Is local irq already disabled here?
> > If no, you should use spin_lock_irqsave() because the ipmmu_irq() also
> > gets the lock.
> 
> 
> No, it is not disabled. But, ipmmu_irq() uses another mmu->lock. So, I
> think, there won't be a deadlock.
> 
> Or I really missed something?

You're correct. I didn't realize that ipmmu_irq() uses a different lock, mmu->lock.

> If we worry about ipmmu_tlb_invalidate() which is called here (to
> perform a flush by request from P2M code, which manages a page table)
> and from the irq handler (to perform a flush to resume address
> translation), I could use a tasklet to schedule ipmmu_tlb_invalidate()
> from the irq handler then. This way we would get this serialized. What
> do you think?

I was just concerned about a deadlock caused by recursive spin locks.
So, calling ipmmu_tlb_invalidate() here is OK, I think.
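
The recursive-lock concern can be sketched with a toy, single-CPU model. This is not Xen's spinlock implementation: `struct toy_lock` and the helpers are hypothetical, and a failed `toy_trylock()` stands in for a real `spin_lock()` that would spin forever.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy non-recursive lock for a single CPU. */
struct toy_lock {
    bool held;
};

static bool toy_trylock(struct toy_lock *l)
{
    if ( l->held )
        return false;   /* a real spin_lock() would spin here forever */
    l->held = true;
    return true;
}

static void toy_unlock(struct toy_lock *l)
{
    l->held = false;
}

/*
 * Model an interrupt arriving on the same CPU while 'caller_lock' is held
 * with interrupts enabled. Returns true if the handler can take 'irq_lock',
 * false if it would deadlock.
 */
static bool toy_irq_can_proceed(struct toy_lock *caller_lock,
                                struct toy_lock *irq_lock)
{
    bool ok;

    if ( !toy_trylock(caller_lock) )
        return false;

    ok = toy_trylock(irq_lock);   /* handler runs with caller_lock held */
    if ( ok )
        toy_unlock(irq_lock);

    toy_unlock(caller_lock);
    return ok;
}
```

With two distinct locks (one standing in for xen_domain->lock, the other for mmu->lock) the handler proceeds; passing the same lock twice reproduces the case where spin_lock_irqsave() would have been needed.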

Best regards,
Yoshihiro Shimoda

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Xen-devel] [PATCH V2 3/6] [RFC] xen/common: Introduce _xrealloc function
  2019-08-07 18:36         ` Oleksandr
@ 2019-08-08  6:08           ` Jan Beulich
  2019-08-08  7:05           ` Jan Beulich
  1 sibling, 0 replies; 59+ messages in thread
From: Jan Beulich @ 2019-08-08  6:08 UTC (permalink / raw)
  To: Oleksandr
  Cc: sstabellini, Wei Liu, Konrad Rzeszutek Wilk, George Dunlap,
	AndrewCooper, Ian Jackson, Tim Deegan, Oleksandr Tyshchenko,
	julien.grall, xen-devel, Volodymyr Babchuk

On 07.08.2019 20:36, Oleksandr wrote:
>> There's one more thing for the re-alloc case though (besides
>> cosmetic aspects): The incoming pointer should also be verified
>> to be of correct type.
> 
> Jan, how could this be technically implemented, and are there any existing examples in Xen?

See x86's copy_to_guest_offset(), for example. To get the compiler
to emit a warning (at least), a (typically otherwise dead)
comparison of pointers is commonly used.
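
A minimal sketch of the dead-comparison technique, assuming a hypothetical allocator `toy_xrealloc()` and a hypothetical wrapper macro `toy_xrealloc_array()` (neither is the interface proposed in the patch, and this is not how copy_to_guest_offset() itself is written):

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the untyped reallocation function. */
static void *toy_xrealloc(void *ptr, size_t size)
{
    assert(size != 0);
    return realloc(ptr, size);
}

/*
 * Hypothetical typed wrapper. The result of "(ptr) == (type *)0" is never
 * used; the comparison is there only so that the compiler diagnoses a 'ptr'
 * whose type is not compatible with 'type *' (GCC reports a comparison of
 * distinct pointer types). A 'void *' argument is still accepted silently.
 *
 * Simplifications: 'ptr' is evaluated twice, and the size multiplication
 * is not checked for overflow.
 */
#define toy_xrealloc_array(ptr, type, num)                      \
    ((void)((ptr) == (type *)0),                                \
     (type *)toy_xrealloc(ptr, sizeof(type) * (num)))
```

Calling `toy_xrealloc_array(p, int, 8)` with `p` declared as `long *` would then draw a compiler diagnostic, while an `int *` argument compiles cleanly.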

Jan

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Xen-devel] [PATCH V2 3/6] [RFC] xen/common: Introduce _xrealloc function
  2019-08-07 18:36         ` Oleksandr
  2019-08-08  6:08           ` Jan Beulich
@ 2019-08-08  7:05           ` Jan Beulich
  2019-08-08 11:05             ` Oleksandr
  1 sibling, 1 reply; 59+ messages in thread
From: Jan Beulich @ 2019-08-08  7:05 UTC (permalink / raw)
  To: Oleksandr
  Cc: sstabellini, Wei Liu, Konrad Rzeszutek Wilk, George Dunlap,
	AndrewCooper, Ian Jackson, Tim Deegan, Oleksandr Tyshchenko,
	julien.grall, xen-devel, Volodymyr Babchuk

(I'm sorry if you receive duplicates of this, but I've got a reply
back from our mail system that several of the recipients did not
have their host names resolved correctly on the first attempt.)

On 07.08.2019 20:36, Oleksandr wrote:
>> There's one more thing for the re-alloc case though (besides
>> cosmetic aspects): The incoming pointer should also be verified
>> to be of correct type.
> 
> Jan, how could this be technically implemented, and are there any existing examples in Xen?

See x86's copy_to_guest_offset(), for example. To get the compiler
to emit a warning (at least), a (typically otherwise dead)
comparison of pointers is commonly used.

Jan

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Xen-devel] [PATCH V2 6/6] iommu/arm: Add Renesas IPMMU-VMSA support
  2019-08-07 20:28         ` Oleksandr Tyshchenko
@ 2019-08-08  9:05           ` Julien Grall
  2019-08-08 10:14             ` Oleksandr
  0 siblings, 1 reply; 59+ messages in thread
From: Julien Grall @ 2019-08-08  9:05 UTC (permalink / raw)
  To: Oleksandr Tyshchenko
  Cc: xen-devel, Yoshihiro Shimoda, sstabellini, Lars Kurth,
	Oleksandr Tyshchenko

On 07/08/2019 21:28, Oleksandr Tyshchenko wrote:
> Hi, Julien.

Hi,

> Sorry for the possible format issues.
> 
> 
>      > No, it is not disabled. But, ipmmu_irq() uses another mmu->lock. So, I
>      > think, there won't be a deadlock.
>      >
>      > Or I really missed something?
>      >
>      > If we worry about ipmmu_tlb_invalidate() which is called here (to
>      > perform a flush by request from P2M code, which manages a page table)
>      > and from the irq handler (to perform a flush to resume address
>      > translation), I could use a tasklet to schedule ipmmu_tlb_invalidate()
>      > from the irq handler then. This way we would get this serialized. What
>      > do you think?
> 
>     I am afraid a tasklet is not an option. You need to perform the TLB
>     flush when requested otherwise you are introducing a security issue.
> 
>     This is because as soon as a region is unmapped in the page table, we
>     drop the reference on any page backing that region. When the reference
>     count drops to zero, the page can be reallocated to another domain or
>     even to Xen. If the TLB flush happens afterwards, the guest may still be
>     able to access the page for a short time if the translation has been
>     cached by any TLB (IOMMU, processor).
> 
> 
> 
> I understand this. I am not proposing to delay a requested by P2M code TLB flush 
> in any case. I just propose to issue TLB flush (which we have to perform in case 
> of page faults, to resolve error condition and resume translations) from a 
> tasklet rather than from interrupt handler directly. This is the TLB flush I am 
> speaking about:
> 
> https://github.com/otyshchenko1/xen/blob/ipmmu_upstream2/xen/drivers/passthrough/arm/ipmmu-vmsa.c#L598
> 
> Sorry if I was unclear.

My mistake, I misread what you wrote.

I found the flush in the renesas-bsp tree but not in upstream Linux, and it is 
not clear why it is actually required. You are not fixing any translation 
error, so what will this flush do?

Regarding the placement of the flush: if you execute it in a tasklet, it will 
likely be done later on, once the IRQ has been acknowledged. What is the 
implication of delaying it?

Cheers,

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Xen-devel] [PATCH V2 6/6] iommu/arm: Add Renesas IPMMU-VMSA support
  2019-08-08  9:05           ` Julien Grall
@ 2019-08-08 10:14             ` Oleksandr
  2019-08-08 12:44               ` Julien Grall
  0 siblings, 1 reply; 59+ messages in thread
From: Oleksandr @ 2019-08-08 10:14 UTC (permalink / raw)
  To: Julien Grall
  Cc: xen-devel, Yoshihiro Shimoda, sstabellini, Lars Kurth,
	Oleksandr Tyshchenko



> Hi,

Hi, Julien.


>
>> Sorry for the possible format issues.
>>
>>
>>      > No, it is not disabled. But, ipmmu_irq() uses another mmu->lock. So, I
>>      > think, there won't be a deadlock.
>>      >
>>      > Or I really missed something?
>>      >
>>      > If we worry about ipmmu_tlb_invalidate() which is called here (to
>>      > perform a flush by request from P2M code, which manages a page table)
>>      > and from the irq handler (to perform a flush to resume address
>>      > translation), I could use a tasklet to schedule ipmmu_tlb_invalidate()
>>      > from the irq handler then. This way we would get this serialized. What
>>      > do you think?
>>
>>     I am afraid a tasklet is not an option. You need to perform the TLB
>>     flush when requested otherwise you are introducing a security issue.
>>
>>     This is because as soon as a region is unmapped in the page table, we
>>     drop the reference on any page backing that region. When the reference
>>     count drops to zero, the page can be reallocated to another domain or
>>     even to Xen. If the TLB flush happens afterwards, the guest may still be
>>     able to access the page for a short time if the translation has been
>>     cached by any TLB (IOMMU, processor).
>>
>>
>>
>> I understand this. I am not proposing to delay a requested by P2M 
>> code TLB flush in any case. I just propose to issue TLB flush (which 
>> we have to perform in case of page faults, to resolve error condition 
>> and resume translations) from a tasklet rather than from interrupt 
>> handler directly. This is the TLB flush I am speaking about:
>>
>> https://github.com/otyshchenko1/xen/blob/ipmmu_upstream2/xen/drivers/passthrough/arm/ipmmu-vmsa.c#L598 
>>
>>
>> Sorry if I was unclear.
>
> My mistake, I misread what you wrote.
>
> I found the flush in the renesas-bsp and not Linux upstream but it is 
> not clear why this is actually required. You are not fixing any 
> translation error. So what this flush will do?
>
> Regarding the placement of the flush, then if you execute in a tasklet 
> it will likely be done later on when the IRQ has been acknowledge. 
> What's the implication to delay it?


It looks like there is no need to put this flush into a tasklet. As I 
understand from Shimoda-san's answer, it is OK to call the flush here.

So, my worry about calling ipmmu_tlb_invalidate() directly from the 
interrupt handler is no longer relevant.
----------
This is my understanding of the purpose of the flush here. This code 
just follows the TRM, no more, no less. The TRM mentions the need to 
flush the TLB after clearing the error status register and updating the 
page table entries (which, I assume, means resolving the reason why the 
translation/page fault actually happened) in order to resume the address 
translation request.
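
A minimal sketch of the sequence described above (clear the error status, then flush), written as a toy model. The structure, field and function names are illustrative assumptions and are neither the IPMMU register map nor the driver's code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy context: the fields only model the behaviour discussed in the thread. */
struct toy_ctx {
    uint32_t status;            /* non-zero while a fault is latched */
    uint64_t fault_addr;        /* faulting address reported by hardware */
    unsigned int tlb_flushes;   /* counts flushes issued by the handler */
};

static void toy_tlb_invalidate(struct toy_ctx *ctx)
{
    ctx->tlb_flushes++;
}

/* Returns true if a latched fault was handled, false if nothing was pending. */
static bool toy_fault_irq(struct toy_ctx *ctx, uint64_t *reported_addr)
{
    if ( !ctx->status )
        return false;

    *reported_addr = ctx->fault_addr;   /* what a real handler would log */
    ctx->status = 0;                    /* clear the error status */

    /* No page-table fix-up: this model, like the driver, does not own it. */
    toy_tlb_invalidate(ctx);            /* flush so translation can resume */

    return true;
}
```

The point of the sketch is only the ordering: the flush is issued directly in the handler, right after the status is cleared, rather than being deferred.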

But, with one remark: as you have already noted, we are not trying to 
handle/fix this fault (update page table entries). The driver doesn't 
manage the page table and is not aware of what the page table is. What is 
more, it is unclear what would actually need to be fixed in the page 
table, which is the CPU page table at the same time.

I have heard there is a break-before-make sequence when updating the 
page table. So, if a device in a domain issues DMA somewhere in the 
middle of a page table update, theoretically, we might hit this fault. 
In this case the page table is correct and we don't need to fix 
anything... To be honest, I have never seen a fault caused by the 
break-before-make sequence.
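
The window mentioned above can be sketched with a toy page table entry. This is a simplified model with hypothetical names (`struct toy_pte`, `toy_bbm_break()`, `toy_bbm_make()`), not the P2M code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct toy_pte {
    bool valid;
    uint64_t out_addr;
};

/* A device lookup faults whenever it observes an invalid entry. */
static bool toy_dma_lookup(const struct toy_pte *pte, uint64_t *out)
{
    if ( !pte->valid )
        return false;

    *out = pte->out_addr;
    return true;
}

/* "Break": invalidate the old entry (a TLB flush would follow here). */
static void toy_bbm_break(struct toy_pte *pte)
{
    pte->valid = false;
}

/* "Make": install the new entry. */
static void toy_bbm_make(struct toy_pte *pte, uint64_t new_addr)
{
    pte->out_addr = new_addr;
    pte->valid = true;
}
```

A lookup issued between `toy_bbm_break()` and `toy_bbm_make()` fails even though the table is correct both before and after the update, which matches the theoretical case where there is nothing to fix in the fault handler.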

>
> Cheers,
>
-- 
Regards,

Oleksandr Tyshchenko


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Xen-devel] [PATCH V2 3/6] [RFC] xen/common: Introduce _xrealloc function
  2019-08-08  7:05           ` Jan Beulich
@ 2019-08-08 11:05             ` Oleksandr
  0 siblings, 0 replies; 59+ messages in thread
From: Oleksandr @ 2019-08-08 11:05 UTC (permalink / raw)
  To: Jan Beulich
  Cc: sstabellini, Wei Liu, Konrad Rzeszutek Wilk, George Dunlap,
	AndrewCooper, Ian Jackson, Tim Deegan, Oleksandr Tyshchenko,
	julien.grall, xen-devel, Volodymyr Babchuk


On 08.08.19 10:05, Jan Beulich wrote:

Hi Jan

> (I'm sorry if you receive duplicates of this, but I've got a reply
> back from our mail system that several of the recipients did not
> have their host names resolved correctly on the first attempt.)

Absolutely no problem.


>> Jan, how could this be technically implemented, and are there any existing examples in Xen?
> See x86's copy_to_guest_offset(), for example. To get the compiler
> to emit a warning (at least), a (typically otherwise dead)
> comparison of pointers is commonly used.


Thank you for the pointer. It is clear now.


-- 
Regards,

Oleksandr Tyshchenko


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Xen-devel] [PATCH V2 6/6] iommu/arm: Add Renesas IPMMU-VMSA support
  2019-08-07 19:15       ` Julien Grall
  2019-08-07 20:28         ` Oleksandr Tyshchenko
@ 2019-08-08 12:28         ` Oleksandr
  2019-08-08 14:23         ` Lars Kurth
  2 siblings, 0 replies; 59+ messages in thread
From: Oleksandr @ 2019-08-08 12:28 UTC (permalink / raw)
  To: Julien Grall
  Cc: xen-devel, Yoshihiro Shimoda, sstabellini, Lars Kurth,
	Oleksandr Tyshchenko


Hi, Julien.


>> I am afraid, I don't know a correct answer for this question. I would 
>> leave this to maintainers.
>>
>> I just followed sample copyright notice for GPL v2 License according 
>> to the document:
>>
>> http://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=CONTRIBUTING
>
> The file CONTRIBUTING only gives examples of common licenses. So I 
> think it is fine to use SPDX, all the more since they are already 
> used. The only request is to put either SPDX or the full-blown text, 
> but not both :). Lars, any objection?
>
> I am quite in favor of SPDX because it is easier to find out the 
> license. With the full-blown text, the text may slightly vary between 
> licenses. For instance, the only difference between GPLv2 and GPLv2+ 
> is ",or (at your option) any later version". I let you imagine how it 
> can be easy to miss it when reviewing ;).
>
> We had a discussion last year about using SPDX in Xen code base but I 
> never got the time to formally suggest it.

I tried to locate files in Xen where SPDX is used. After finding only 
nospec.h, I got the incorrect impression that it is not popular in Xen))


Just to clarify: the header for the driver should then be the following (if 
there are no objections):

// SPDX-License-Identifier: GPL-2.0
/*
  * xen/drivers/passthrough/arm/ipmmu-vmsa.c
  *
  * Driver for the Renesas IPMMU-VMSA found in R-Car Gen3 SoCs.
  *
  * Copyright (C) 2014-2019 Renesas Electronics Corporation
  *
  * The IPMMU-VMSA is VMSA-compatible I/O Memory Management Unit (IOMMU)
  * which provides address translation and access protection functionalities
  * to processing units and interconnect networks.
  *
  * Please note, current driver is supposed to work only with newest R-Car Gen3
  * SoCs revisions which IPMMU hardware supports stage 2 translation table format
  * and is able to use CPU's P2M table as is.
  *
  * Based on Linux's IPMMU-VMSA driver from Renesas BSP:
  *    drivers/iommu/ipmmu-vmsa.c
  * which can be found at:
  *    url: git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas-bsp.git
  *    branch: v4.14.75-ltsi/rcar-3.9.6
  *    commit: e206eb5b81a60e64c35fbc3a999b1a0db2b98044
  * and Xen's SMMU driver:
  *    xen/drivers/passthrough/arm/smmu.c
  *
  * Copyright (C) 2016-2019 EPAM Systems Inc.
  */


Note to self:

It looks like I have to do the same for all newly added files in this 
series (iommu_fwspec, etc.).


>>>> +    /*
>>>> +     * Destroy Root IPMMU domain which context is mapped to this Xen domain
>>>> +     * if exits.
>>>> +     */
>>>> +    if ( xen_domain->root_domain )
>>>> +        ipmmu_free_root_domain(xen_domain->root_domain);
>>>> +
>>>> +    spin_unlock(&xen_domain->lock);
>>>> +
>>>> +    /*
>>>> +     * We assume that all master devices have already been detached from
>>>> +     * this Xen domain and there must be no associated Cache IPMMU domains
>>>> +     * in use.
>>>> +     */
>>>> +    ASSERT(list_empty(&xen_domain->cache_domains));
>>> I think this should be in the spin lock held by &xen_domain->lock.
>>
>> OK. Will put spin_unlock after it.
>
> The spin_lock is actually pointless here. This is done when the domain 
> is destroyed, so nobody should touch it.
>
> If you think concurrent access can still happen, then you are going to 
> be in deep trouble as you free the xen_domain (and therefore the 
> spinlock) below :).

Indeed, this is pointless. We don't really expect any other operations 
with the domain which is being destroyed. No assign/deassign devices, no 
flush, no map, nothing...


>
>
-- 
Regards,

Oleksandr Tyshchenko


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Xen-devel] [PATCH V2 6/6] iommu/arm: Add Renesas IPMMU-VMSA support
  2019-08-08 10:14             ` Oleksandr
@ 2019-08-08 12:44               ` Julien Grall
  2019-08-08 15:04                 ` Oleksandr
  0 siblings, 1 reply; 59+ messages in thread
From: Julien Grall @ 2019-08-08 12:44 UTC (permalink / raw)
  To: Oleksandr; +Cc: xen-devel, Yoshihiro Shimoda, sstabellini, Oleksandr Tyshchenko

Hi,

Removing Lars; there is no need to spam him with the technical discussion :)

On 08/08/2019 11:14, Oleksandr wrote:
> 
> 
>> Hi,
> 
> Hi, Julien.
> 
> 
>>
>>> Sorry for the possible format issues.
>>>
>>>
>>>      > No, it is not disabled. But, ipmmu_irq() uses another mmu->lock. So, I
>>>      > think, there won't be a deadlock.
>>>      >
>>>      > Or I really missed something?
>>>      >
>>>      > If we worry about ipmmu_tlb_invalidate() which is called here (to
>>>      > perform a flush by request from P2M code, which manages a page table)
>>>      > and from the irq handler (to perform a flush to resume address
>>>      > translation), I could use a tasklet to schedule ipmmu_tlb_invalidate()
>>>      > from the irq handler then. This way we would get this serialized. What
>>>      > do you think?
>>>
>>>     I am afraid a tasklet is not an option. You need to perform the TLB
>>>     flush when requested otherwise you are introducing a security issue.
>>>
>>>     This is because as soon as a region is unmapped in the page table, we
>>>     drop the reference on any page backing that region. When the reference
>>>     count drops to zero, the page can be reallocated to another domain or
>>>     even to Xen. If the TLB flush happens afterwards, the guest may still be
>>>     able to access the page for a short time if the translation has been
>>>     cached by any TLB (IOMMU, processor).
>>>
>>>
>>>
>>> I understand this. I am not proposing to delay a requested by P2M code TLB 
>>> flush in any case. I just propose to issue TLB flush (which we have to 
>>> perform in case of page faults, to resolve error condition and resume 
>>> translations) from a tasklet rather than from interrupt handler directly. 
>>> This is the TLB flush I am speaking about:
>>>
>>> https://github.com/otyshchenko1/xen/blob/ipmmu_upstream2/xen/drivers/passthrough/arm/ipmmu-vmsa.c#L598 
>>>
>>>
>>> Sorry if I was unclear.
>>
>> My mistake, I misread what you wrote.
>>
>> I found the flush in the renesas-bsp and not Linux upstream but it is not 
>> clear why this is actually required. You are not fixing any translation error. 
>> So what this flush will do?
>>
>> Regarding the placement of the flush, then if you execute in a tasklet it will 
>> likely be done later on when the IRQ has been acknowledge. What's the 
>> implication to delay it?
> 
> 
> Looks like, there is no need to put this flush into a tasklet. As I understand 
> from Shimoda-san's answer it is OK to call flush here.
> 
> So, my worry about calling ipmmu_tlb_invalidate() directly from the interrupt 
> handler is not actual anymore.
> ----------
> This is my understanding regarding the flush purpose here. This code, just 
> follows the TRM, no more no less,
> which mentions about a need to flush TLB after clearing error status register 
> and updating a page table entries (which, I assume, means to resolve a reason 
> why translation/page fault error actually have happened) to resume address 
> translation request.

Well, I don't have the TRM... so my point of reference is Linux. Why does 
upstream not do the TLB flush?

I have been told this is an erratum on the IPMMU. Is it related to the series 
posted on linux-iommu [1]?

> 
> But, with one remark, as you have already noted, we are not trying to handle/fix 
> this fault (update page table entries), driver doesn't manage page table and is 
> not aware what the page table is. What is more, it is unclear what actually need 
> to be fixed in the page table which is a CPU page table as the same time.
> 
> I have heard there is a break-before-make sequence when updating the page table. 
> So, if device in a domain is issuing DMA somewhere in the middle of updating a 
> page table, theoretically, we might hit into this fault. In this case the page 
> table is correct and we don't need to fix anything...   Being honest, I have 
> never seen a fault caused by break-before-make sequence.

OK, so it looks like you are trying to fix [1]. My first concern here is that 
someone without access to the TRM has no way to tell why this is done.

Furthermore, AFAICT, the patch series never reached upstream. So is it present 
on all revisions of Gen3?

Cheers,

[1] 
https://lore.kernel.org/linux-iommu/1485348842-23712-1-git-send-email-yoshihiro.shimoda.uh@renesas.com/

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Xen-devel] [PATCH V2 6/6] iommu/arm: Add Renesas IPMMU-VMSA support
  2019-08-07 19:15       ` Julien Grall
  2019-08-07 20:28         ` Oleksandr Tyshchenko
  2019-08-08 12:28         ` Oleksandr
@ 2019-08-08 14:23         ` Lars Kurth
  2 siblings, 0 replies; 59+ messages in thread
From: Lars Kurth @ 2019-08-08 14:23 UTC (permalink / raw)
  To: Julien Grall, Oleksandr, Yoshihiro Shimoda, xen-devel
  Cc: Oleksandr Tyshchenko, sstabellini



On 07/08/2019, 20:15, "Julien Grall" <julien.grall@arm.com> wrote:

    (+ Lars)
    
    Hi,
    
    On 8/7/19 5:01 PM, Oleksandr wrote:
    >>> + * you can found at:
    >>> + *    url: 
    >>> git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas-bsp.git
    >>> + *    branch: v4.14.75-ltsi/rcar-3.9.6
    >>> + *    commit: e206eb5b81a60e64c35fbc3a999b1a0db2b98044
    >>> + * and Xen's SMMU driver:
    >>> + *    xen/drivers/passthrough/arm/smmu.c
    >>> + *
    >>> + * Copyright (C) 2016-2019 EPAM Systems Inc.
    >>> + *
    >>> + * This program is free software; you can redistribute it and/or
    >>> + * modify it under the terms and conditions of the GNU General Public
    >>> + * License, version 2, as published by the Free Software Foundation.
    >>> + *
    >>> + * This program is distributed in the hope that it will be useful,
    >>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
    >>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
    >>> + * General Public License for more details.
    >>> + *
    >>> + * You should have received a copy of the GNU General Public
    >>> + * License along with this program; If not, see 
    >>> <http://www.gnu.org/licenses/>.
    >> I don't know that Xen license description rule, but since a few source 
    >> files have
    >> SPDX-License-Identifier, can we also use it on the driver?
    > 
    > I am afraid, I don't know a correct answer for this question. I would 
    > leave this to maintainers.
    > 
    > I just followed sample copyright notice for GPL v2 License according to 
    > the document:
    > 
    > http://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=CONTRIBUTING
    
    The file CONTRIBUTING only gives examples of common licenses. So I 
    think it is fine to use SPDX, all the more since they are already 
    used. The only request is to put either SPDX or the full-blown text, 
    but not both :). Lars, any objection?
    
    I am quite in favor of SPDX because it is easier to find out the 
    license. With the full-blown text, the text may slightly vary between 
    licenses. For instance, the only difference between GPLv2 and GPLv2+ is 
    ",or (at your option) any later version". I let you imagine how it can 
    be easy to miss it when reviewing ;).
    
    We had a discussion last year about using SPDX in Xen code base but I 
    never got the time to formally suggest it.
    
I did not push it either. 

In the past one of the committers had major objections against SPDX, but after a conversation last year and changes to the latest version of SPDX he dropped these.

The only remaining objection was to have both SPDX identifier AND a license in the same file. The argument against it is: what does it mean if they contradict each other? To be fair that is a valid concern.

I am not sure it is a good idea to introduce SPDX piecemeal. It would be much better to
a) agree on it
b) transform the codebase using a tool
rather than introducing it file by file.

Lars
 

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Xen-devel] [PATCH V2 6/6] iommu/arm: Add Renesas IPMMU-VMSA support
  2019-08-08 12:44               ` Julien Grall
@ 2019-08-08 15:04                 ` Oleksandr
  2019-08-08 17:16                   ` Julien Grall
  0 siblings, 1 reply; 59+ messages in thread
From: Oleksandr @ 2019-08-08 15:04 UTC (permalink / raw)
  To: Julien Grall
  Cc: xen-devel, Yoshihiro Shimoda, sstabellini, Oleksandr Tyshchenko


Hi

>>>
>>>> Sorry for the possible format issues.
>>>>
>>>>
>>>>      > No, it is not disabled. But, ipmmu_irq() uses another 
>>>> mmu->lock. So, I
>>>>      > think, there won't be a deadlock.
>>>>      >
>>>>      > Or I really missed something?
>>>>      >
>>>>      > If we worry about ipmmu_tlb_invalidate() which is called 
>>>> here (to
>>>>      > perform a flush by request from P2M code, which manages a 
>>>> page table)
>>>>      > and from the irq handler (to perform a flush to resume address
>>>>      > translation), I could use a tasklet to schedule 
>>>> ipmmu_tlb_invalidate()
>>>>      > from the irq handler then. This way we would get this 
>>>> serialized. What
>>>>      > do you think?
>>>>
>>>>     I am afraid a tasklet is not an option. You need to perform the 
>>>> TLB
>>>>     flush when requested otherwise you are introducing a security 
>>>> issue.
>>>>
>>>>     This is because as soon as a region is unmapped in the page 
>>>> table, we
>>>>     remove the drop the reference on any page backing that region. 
>>>> When the
>>>>     reference is dropped to zero, the page can be reallocated to 
>>>> another
>>>>     domain or even Xen. If the TLB flush happen after, then the 
>>>> guest may
>>>>     still be able to access the page for a short time if the 
>>>> translation has
>>>>     been cached by the any TLB (IOMMU, Processor).
>>>>
>>>>
>>>>
>>>> I understand this. I am not proposing to delay a requested by P2M 
>>>> code TLB flush in any case. I just propose to issue TLB flush 
>>>> (which we have to perform in case of page faults, to resolve error 
>>>> condition and resume translations) from a tasklet rather than from 
>>>> interrupt handler directly. This is the TLB flush I am speaking about:
>>>>
>>>> https://github.com/otyshchenko1/xen/blob/ipmmu_upstream2/xen/drivers/passthrough/arm/ipmmu-vmsa.c#L598 
>>>>
>>>>
>>>> Sorry if I was unclear.
>>>
>>> My mistake, I misread what you wrote.
>>>
>>> I found the flush in the renesas-bsp and not Linux upstream but it 
>>> is not clear why this is actually required. You are not fixing any 
>>> translation error. So what this flush will do?
>>>
>>> Regarding the placement of the flush, then if you execute in a 
>>> tasklet it will likely be done later on when the IRQ has been 
>>> acknowledge. What's the implication to delay it?
>>
>>
>> Looks like, there is no need to put this flush into a tasklet. As I 
>> understand from Shimoda-san's answer it is OK to call flush here.
>>
>> So, my worry about calling ipmmu_tlb_invalidate() directly from the 
>> interrupt handler is not actual anymore.
>> ----------
>> This is my understanding regarding the flush purpose here. This code, 
>> just follows the TRM, no more no less,
>> which mentions about a need to flush TLB after clearing error status 
>> register and updating a page table entries (which, I assume, means to 
>> resolve a reason why translation/page fault error actually have 
>> happened) to resume address translation request.
>
> Well, I don't have the TRM... so my point of reference is Linux. Why 
> does upstream not do the TLB flush?

I have no idea regarding that.


>
>
> I have been told this is an errata on the IPMMU. Is it related to the 
> series posted on linux-iommu [1]?

I don't think the TLB flush we are talking about is related to that 
series [1]. As I understand it, this TLB flush is just the last step in 
the sequence of actions which should be performed when the error occurs, 
no more, no less.


>
>>
>> But, with one remark, as you have already noted, we are not trying to 
>> handle/fix this fault (update page table entries), driver doesn't 
>> manage page table and is not aware what the page table is. What is 
>> more, it is unclear what actually need to be fixed in the page table 
>> which is a CPU page table as the same time.
>>
>> I have heard there is a break-before-make sequence when updating the 
>> page table. So, if device in a domain is issuing DMA somewhere in the 
>> middle of updating a page table, theoretically, we might hit into 
>> this fault. In this case the page table is correct and we don't need 
>> to fix anything...   Being honest, I have never seen a fault caused 
>> by break-before-make sequence.
>
> Ok, so it looks like you are trying to fix [1]. My first concern here 
> is there are no ground for someone without access to the TRM why this 
> is done.

No, I am definitely not trying to fix [1]. I just follow the BSP driver 
this one is based on, which in turn follows the TRM. I can extend the 
comment in the code before the call to ipmmu_tlb_invalidate().


>
> Furthermore, AFAICT, the patch series never reached upstream. So is it 
> present on all revision of GEN3?

I think the newest SoC revisions (ES 3.0), the only ones this driver is 
supposed to support, are *not* affected by that erratum and do *not* 
require such a workaround.


-- 
Regards,

Oleksandr Tyshchenko



^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Xen-devel] [PATCH V2 6/6] iommu/arm: Add Renesas IPMMU-VMSA support
  2019-08-08 15:04                 ` Oleksandr
@ 2019-08-08 17:16                   ` Julien Grall
  2019-08-08 19:29                     ` Oleksandr
  0 siblings, 1 reply; 59+ messages in thread
From: Julien Grall @ 2019-08-08 17:16 UTC (permalink / raw)
  To: Oleksandr; +Cc: xen-devel, Yoshihiro Shimoda, sstabellini, Oleksandr Tyshchenko



On 08/08/2019 16:04, Oleksandr wrote:
> 
> Hi
> 
>>>>
>>>>> Sorry for the possible format issues.
>>>>>
>>>>>
>>>>>      > No, it is not disabled. But, ipmmu_irq() uses another mmu->lock. So, I
>>>>>      > think, there won't be a deadlock.
>>>>>      >
>>>>>      > Or I really missed something?
>>>>>      >
>>>>>      > If we worry about ipmmu_tlb_invalidate() which is called here (to
>>>>>      > perform a flush by request from P2M code, which manages a page table)
>>>>>      > and from the irq handler (to perform a flush to resume address
>>>>>      > translation), I could use a tasklet to schedule ipmmu_tlb_invalidate()
>>>>>      > from the irq handler then. This way we would get this serialized. What
>>>>>      > do you think?
>>>>>
>>>>>     I am afraid a tasklet is not an option. You need to perform the TLB
>>>>>     flush when requested otherwise you are introducing a security issue.
>>>>>
>>>>>     This is because as soon as a region is unmapped in the page table, we
>>>>>     remove the drop the reference on any page backing that region. When the
>>>>>     reference is dropped to zero, the page can be reallocated to another
>>>>>     domain or even Xen. If the TLB flush happen after, then the guest may
>>>>>     still be able to access the page for a short time if the translation has
>>>>>     been cached by the any TLB (IOMMU, Processor).
>>>>>
>>>>>
>>>>>
>>>>> I understand this. I am not proposing to delay a requested by P2M code TLB 
>>>>> flush in any case. I just propose to issue TLB flush (which we have to 
>>>>> perform in case of page faults, to resolve error condition and resume 
>>>>> translations) from a tasklet rather than from interrupt handler directly. 
>>>>> This is the TLB flush I am speaking about:
>>>>>
>>>>> https://github.com/otyshchenko1/xen/blob/ipmmu_upstream2/xen/drivers/passthrough/arm/ipmmu-vmsa.c#L598 
>>>>>
>>>>>
>>>>> Sorry if I was unclear.
>>>>
>>>> My mistake, I misread what you wrote.
>>>>
>>>> I found the flush in the renesas-bsp and not Linux upstream but it is not 
>>>> clear why this is actually required. You are not fixing any translation 
>>>> error. So what this flush will do?
>>>>
>>>> Regarding the placement of the flush, then if you execute in a tasklet it 
>>>> will likely be done later on when the IRQ has been acknowledge. What's the 
>>>> implication to delay it?
>>>
>>>
>>> Looks like, there is no need to put this flush into a tasklet. As I 
>>> understand from Shimoda-san's answer it is OK to call flush here.
>>>
>>> So, my worry about calling ipmmu_tlb_invalidate() directly from the interrupt 
>>> handler is not actual anymore.
>>> ----------
>>> This is my understanding regarding the flush purpose here. This code, just 
>>> follows the TRM, no more no less,
>>> which mentions about a need to flush TLB after clearing error status register 
>>> and updating a page table entries (which, I assume, means to resolve a reason 
>>> why translation/page fault error actually have happened) to resume address 
>>> translation request.
>>
>> Well, I don't have the TRM... so my point of reference is Linux. Why does 
>> upstream not do the TLB flush?
> 
> I have no idea regarding that.
> 
>>
>>
>> I have been told this is an errata on the IPMMU. Is it related to the series 
>> posted on linux-iommu [1]?
> 
> I don't think, the TLB flush we are speaking about, is related to that series 
> [1] somehow. This TLB flush, I think, is just the last step in a sequence of 
> actions which should be performed when the error occurs, no more no less. This 
> is how I understand this.

If you have to flush the TLBs in the IRQ context then something has gone really 
wrong.

I don't deny that Break-Before-Make is an issue. However, if it is handled 
correctly in the P2M code, you should only be there because there is no mapping 
in the TLBs for the address accessed. So flushing the TLBs should be 
unnecessary, unless your TLB also caches invalid entries?

>>
>>>
>>> But, with one remark, as you have already noted, we are not trying to 
>>> handle/fix this fault (update page table entries), driver doesn't manage page 
>>> table and is not aware what the page table is. What is more, it is unclear 
>>> what actually need to be fixed in the page table which is a CPU page table as 
>>> the same time.
>>>
>>> I have heard there is a break-before-make sequence when updating the page 
>>> table. So, if device in a domain is issuing DMA somewhere in the middle of 
>>> updating a page table, theoretically, we might hit into this fault. In this 
>>> case the page table is correct and we don't need to fix anything...   Being 
>>> honest, I have never seen a fault caused by break-before-make sequence.
>>
>> Ok, so it looks like you are trying to fix [1]. My first concern here is there 
>> are no ground for someone without access to the TRM why this is done.
> 
> No, I am definitely not trying to fix [1]. I just follow the BSP driver I am 
> based on, which in turn follows the TRM. I can extend a comment in the code 
> before calling ipmmu_tlb_invalidate().

The fact that the code is in the BSP and not in Linux is worrying me. The commit 
message in the BSP does not help to determine the exact reason.

It either means Linux rejected the patch or it was never submitted. Either way, 
we should understand why there is such a discrepancy.

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Xen-devel] [PATCH V2 6/6] iommu/arm: Add Renesas IPMMU-VMSA support
  2019-08-08 17:16                   ` Julien Grall
@ 2019-08-08 19:29                     ` Oleksandr
  2019-08-08 20:32                       ` Julien Grall
  0 siblings, 1 reply; 59+ messages in thread
From: Oleksandr @ 2019-08-08 19:29 UTC (permalink / raw)
  To: Julien Grall
  Cc: xen-devel, Yoshihiro Shimoda, sstabellini, Oleksandr Tyshchenko


Hi, Julien.


>>>>>
>>>>>> Sorry for the possible format issues.
>>>>>>
>>>>>>
>>>>>>      > No, it is not disabled. But, ipmmu_irq() uses another 
>>>>>> mmu->lock. So, I
>>>>>>      > think, there won't be a deadlock.
>>>>>>      >
>>>>>>      > Or I really missed something?
>>>>>>      >
>>>>>>      > If we worry about ipmmu_tlb_invalidate() which is called 
>>>>>> here (to
>>>>>>      > perform a flush by request from P2M code, which manages a 
>>>>>> page table)
>>>>>>      > and from the irq handler (to perform a flush to resume 
>>>>>> address
>>>>>>      > translation), I could use a tasklet to schedule 
>>>>>> ipmmu_tlb_invalidate()
>>>>>>      > from the irq handler then. This way we would get this 
>>>>>> serialized. What
>>>>>>      > do you think?
>>>>>>
>>>>>>     I am afraid a tasklet is not an option. You need to perform 
>>>>>> the TLB
>>>>>>     flush when requested otherwise you are introducing a security 
>>>>>> issue.
>>>>>>
>>>>>>     This is because as soon as a region is unmapped in the page 
>>>>>> table, we
>>>>>>     remove the drop the reference on any page backing that 
>>>>>> region. When the
>>>>>>     reference is dropped to zero, the page can be reallocated to 
>>>>>> another
>>>>>>     domain or even Xen. If the TLB flush happen after, then the 
>>>>>> guest may
>>>>>>     still be able to access the page for a short time if the 
>>>>>> translation has
>>>>>>     been cached by the any TLB (IOMMU, Processor).
>>>>>>
>>>>>>
>>>>>>
>>>>>> I understand this. I am not proposing to delay a requested by P2M 
>>>>>> code TLB flush in any case. I just propose to issue TLB flush 
>>>>>> (which we have to perform in case of page faults, to resolve 
>>>>>> error condition and resume translations) from a tasklet rather 
>>>>>> than from interrupt handler directly. This is the TLB flush I am 
>>>>>> speaking about:
>>>>>>
>>>>>> https://github.com/otyshchenko1/xen/blob/ipmmu_upstream2/xen/drivers/passthrough/arm/ipmmu-vmsa.c#L598 
>>>>>>
>>>>>>
>>>>>> Sorry if I was unclear.
>>>>>
>>>>> My mistake, I misread what you wrote.
>>>>>
>>>>> I found the flush in the renesas-bsp and not Linux upstream but it 
>>>>> is not clear why this is actually required. You are not fixing any 
>>>>> translation error. So what this flush will do?
>>>>>
>>>>> Regarding the placement of the flush, then if you execute in a 
>>>>> tasklet it will likely be done later on when the IRQ has been 
>>>>> acknowledge. What's the implication to delay it?
>>>>
>>>>
>>>> Looks like, there is no need to put this flush into a tasklet. As I 
>>>> understand from Shimoda-san's answer it is OK to call flush here.
>>>>
>>>> So, my worry about calling ipmmu_tlb_invalidate() directly from the 
>>>> interrupt handler is not actual anymore.
>>>> ----------
>>>> This is my understanding regarding the flush purpose here. This 
>>>> code, just follows the TRM, no more no less,
>>>> which mentions about a need to flush TLB after clearing error 
>>>> status register and updating a page table entries (which, I assume, 
>>>> means to resolve a reason why translation/page fault error actually 
>>>> have happened) to resume address translation request.
>>>
>>> Well, I don't have the TRM... so my point of reference is Linux. Why 
>>> does upstream not do the TLB flush?
>>
>> I have no idea regarding that.
>>
>>>
>>>
>>> I have been told this is an errata on the IPMMU. Is it related to 
>>> the series posted on linux-iommu [1]?
>>
>> I don't think, the TLB flush we are speaking about, is related to 
>> that series [1] somehow. This TLB flush, I think, is just the last 
>> step in a sequence of actions which should be performed when the 
>> error occurs, no more no less. This is how I understand this.
>
> If you have to flush the TLBs in the IRQ context then something has 
> gone really wrong.
>
> I don't deny that Break-Before-Make is an issue. However, if it is 
> handled correctly in the P2M code. You should only be there because 
> there are no mapping in the TLBs for the address accessed. So flushing 
> the TLBs should be unnecessary, unless your TLB is also caching 
> invalid entry?

Sorry, I don't quite understand why we need to worry so much about this 
flush for a case which won't occur under normal conditions (if everything 
is correct). Why can't we just consider this flush a required action, 
needed to exit the error state and resume the stopped address 
translation request? It is the same kind of required action as clearing 
the error status flags before it. We are not trying to understand why it 
is necessary to clear the error flags when an error happens, or whether 
we could get away without clearing them, for example. We just follow what 
is described in the document. I think the same applies to that flush: if 
it is described, it should be followed. It looks like this flush acts as 
a trigger to unblock the stopped transaction in that particular case.

Different H/W could have different recovery sequences. Some H/W 
requires just clearing the error status, other H/W requires full 
re-initialization in a specific order to recover from the error state.

Please correct me if I am wrong.

>
>>>
>>>>
>>>> But, with one remark, as you have already noted, we are not trying 
>>>> to handle/fix this fault (update page table entries), driver 
>>>> doesn't manage page table and is not aware what the page table is. 
>>>> What is more, it is unclear what actually need to be fixed in the 
>>>> page table which is a CPU page table as the same time.
>>>>
>>>> I have heard there is a break-before-make sequence when updating 
>>>> the page table. So, if device in a domain is issuing DMA somewhere 
>>>> in the middle of updating a page table, theoretically, we might hit 
>>>> into this fault. In this case the page table is correct and we 
>>>> don't need to fix anything...   Being honest, I have never seen a 
>>>> fault caused by break-before-make sequence.
>>>
>>> Ok, so it looks like you are trying to fix [1]. My first concern 
>>> here is there are no ground for someone without access to the TRM 
>>> why this is done.
>>
>> No, I am definitely not trying to fix [1]. I just follow the BSP 
>> driver I am based on, which in turn follows the TRM. I can extend a 
>> comment in the code before calling ipmmu_tlb_invalidate().
>
> The fact that the code is in the BSP and not in Linux is worrying me. 
> The commit message in the BSP is quite unhelpful to determine the 
> exact reason.
>
> It either means Linux rejected the patch or this was not submitted. 
> Either way, this should be understood why such discrepancy.

I failed to find anything similar on the ML, so it was probably not 
submitted. I hope we will be able to clarify the reason.


-- 
Regards,

Oleksandr Tyshchenko



^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Xen-devel] [PATCH V2 6/6] iommu/arm: Add Renesas IPMMU-VMSA support
  2019-08-08 19:29                     ` Oleksandr
@ 2019-08-08 20:32                       ` Julien Grall
  2019-08-08 23:32                         ` Oleksandr Tyshchenko
  0 siblings, 1 reply; 59+ messages in thread
From: Julien Grall @ 2019-08-08 20:32 UTC (permalink / raw)
  To: Oleksandr; +Cc: xen-devel, Yoshihiro Shimoda, sstabellini, Oleksandr Tyshchenko

Hi Oleksandr,

On 8/8/19 8:29 PM, Oleksandr wrote:
>>>>>>
>>>>>>> Sorry for the possible format issues.
>>>>>>>
>>>>>>>
>>>>>>>      > No, it is not disabled. But, ipmmu_irq() uses another 
>>>>>>> mmu->lock. So, I
>>>>>>>      > think, there won't be a deadlock.
>>>>>>>      >
>>>>>>>      > Or I really missed something?
>>>>>>>      >
>>>>>>>      > If we worry about ipmmu_tlb_invalidate() which is called 
>>>>>>> here (to
>>>>>>>      > perform a flush by request from P2M code, which manages a 
>>>>>>> page table)
>>>>>>>      > and from the irq handler (to perform a flush to resume 
>>>>>>> address
>>>>>>>      > translation), I could use a tasklet to schedule 
>>>>>>> ipmmu_tlb_invalidate()
>>>>>>>      > from the irq handler then. This way we would get this 
>>>>>>> serialized. What
>>>>>>>      > do you think?
>>>>>>>
>>>>>>>     I am afraid a tasklet is not an option. You need to perform 
>>>>>>> the TLB
>>>>>>>     flush when requested otherwise you are introducing a security 
>>>>>>> issue.
>>>>>>>
>>>>>>>     This is because as soon as a region is unmapped in the page 
>>>>>>> table, we
>>>>>>>     remove the drop the reference on any page backing that 
>>>>>>> region. When the
>>>>>>>     reference is dropped to zero, the page can be reallocated to 
>>>>>>> another
>>>>>>>     domain or even Xen. If the TLB flush happen after, then the 
>>>>>>> guest may
>>>>>>>     still be able to access the page for a short time if the 
>>>>>>> translation has
>>>>>>>     been cached by the any TLB (IOMMU, Processor).
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> I understand this. I am not proposing to delay a requested by P2M 
>>>>>>> code TLB flush in any case. I just propose to issue TLB flush 
>>>>>>> (which we have to perform in case of page faults, to resolve 
>>>>>>> error condition and resume translations) from a tasklet rather 
>>>>>>> than from interrupt handler directly. This is the TLB flush I am 
>>>>>>> speaking about:
>>>>>>>
>>>>>>> https://github.com/otyshchenko1/xen/blob/ipmmu_upstream2/xen/drivers/passthrough/arm/ipmmu-vmsa.c#L598 
>>>>>>>
>>>>>>>
>>>>>>> Sorry if I was unclear.
>>>>>>
>>>>>> My mistake, I misread what you wrote.
>>>>>>
>>>>>> I found the flush in the renesas-bsp and not Linux upstream but it 
>>>>>> is not clear why this is actually required. You are not fixing any 
>>>>>> translation error. So what this flush will do?
>>>>>>
>>>>>> Regarding the placement of the flush, then if you execute in a 
>>>>>> tasklet it will likely be done later on when the IRQ has been 
>>>>>> acknowledge. What's the implication to delay it?
>>>>>
>>>>>
>>>>> Looks like, there is no need to put this flush into a tasklet. As I 
>>>>> understand from Shimoda-san's answer it is OK to call flush here.
>>>>>
>>>>> So, my worry about calling ipmmu_tlb_invalidate() directly from the 
>>>>> interrupt handler is not actual anymore.
>>>>> ----------
>>>>> This is my understanding regarding the flush purpose here. This 
>>>>> code, just follows the TRM, no more no less,
>>>>> which mentions about a need to flush TLB after clearing error 
>>>>> status register and updating a page table entries (which, I assume, 
>>>>> means to resolve a reason why translation/page fault error actually 
>>>>> have happened) to resume address translation request.
>>>>
>>>> Well, I don't have the TRM... so my point of reference is Linux. Why 
>>>> does upstream not do the TLB flush?
>>>
>>> I have no idea regarding that.
>>>
>>>>
>>>>
>>>> I have been told this is an errata on the IPMMU. Is it related to 
>>>> the series posted on linux-iommu [1]?
>>>
>>> I don't think, the TLB flush we are speaking about, is related to 
>>> that series [1] somehow. This TLB flush, I think, is just the last 
>>> step in a sequence of actions which should be performed when the 
>>> error occurs, no more no less. This is how I understand this.
>>
>> If you have to flush the TLBs in the IRQ context then something has 
>> gone really wrong.
>>
>> I don't deny that Break-Before-Make is an issue. However, if it is 
>> handled correctly in the P2M code. You should only be there because 
>> there are no mapping in the TLBs for the address accessed. So flushing 
>> the TLBs should be unnecessary, unless your TLB is also caching 
>> invalid entry?
> 
> Sorry, I don't quite understand why we need to worry about this flush 
> too much for a case which won't occur in normal condition (if everything 
> is correct). Why we can't just consider this flush as a required action, 

A translation error is easy to trigger, for instance if the guest does 
not program the device correctly and requests access to an address that 
is not mapped.

> which needed to exit from the error state and resume stopped address 
> translation request. The same required action as "clearing error status 
> flags" before. We are not trying to understand, why is it so necessary 
> to clear error flags when error happens, or can we end up without 
> clearing it, for example. We just follow what described in document. The 
> same, I think, we have for that flush, if described, then should be 
> followed. Looks like this flush acts as a trigger to unblock stopped 
> transaction in that particular case.

What will actually happen if the transaction fails again? For instance, 
if the IOVA was not mapped. Will you receive the interrupt again?
If so, are you going to do the flush again and again until the guest 
is killed?

> 
> Different H/W could have different restoring sequences. Some H/W 
> requires just clearing error status, other H/W requires full 
> re-initialization in a specific order to recover from the error state.
> 
> Please correct me if I am wrong.

I am not comfortable accepting any code that I don't understand or don't 
find sensible. As I pointed out in my previous e-mail, this hasn't 
reached upstream, so something looks quite fishy here.

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Xen-devel] [PATCH V2 6/6] iommu/arm: Add Renesas IPMMU-VMSA support
  2019-08-08 20:32                       ` Julien Grall
@ 2019-08-08 23:32                         ` Oleksandr Tyshchenko
  2019-08-09  9:56                           ` Julien Grall
  0 siblings, 1 reply; 59+ messages in thread
From: Oleksandr Tyshchenko @ 2019-08-08 23:32 UTC (permalink / raw)
  To: Julien Grall
  Cc: xen-devel, Yoshihiro Shimoda, sstabellini, Oleksandr Tyshchenko

[-- Attachment #1.1: Type: text/plain, Size: 7636 bytes --]

Hi Julien.

Sorry for the possible format issues.

Thu, 8 Aug 2019, 23:32 Julien Grall <julien.grall@arm.com>:

> Hi Oleksandr,
>
> On 8/8/19 8:29 PM, Oleksandr wrote:
> >>>>>>
> >>>>>>> Sorry for the possible format issues.
> >>>>>>>
> >>>>>>>
> >>>>>>>      > No, it is not disabled. But, ipmmu_irq() uses another
> >>>>>>> mmu->lock. So, I
> >>>>>>>      > think, there won't be a deadlock.
> >>>>>>>      >
> >>>>>>>      > Or I really missed something?
> >>>>>>>      >
> >>>>>>>      > If we worry about ipmmu_tlb_invalidate() which is called
> >>>>>>> here (to
> >>>>>>>      > perform a flush by request from P2M code, which manages a
> >>>>>>> page table)
> >>>>>>>      > and from the irq handler (to perform a flush to resume
> >>>>>>> address
> >>>>>>>      > translation), I could use a tasklet to schedule
> >>>>>>> ipmmu_tlb_invalidate()
> >>>>>>>      > from the irq handler then. This way we would get this
> >>>>>>> serialized. What
> >>>>>>>      > do you think?
> >>>>>>>
> >>>>>>>     I am afraid a tasklet is not an option. You need to perform
> >>>>>>> the TLB
> >>>>>>>     flush when requested otherwise you are introducing a security
> >>>>>>> issue.
> >>>>>>>
> >>>>>>>     This is because as soon as a region is unmapped in the page
> >>>>>>> table, we
> >>>>>>>     remove the drop the reference on any page backing that
> >>>>>>> region. When the
> >>>>>>>     reference is dropped to zero, the page can be reallocated to
> >>>>>>> another
> >>>>>>>     domain or even Xen. If the TLB flush happen after, then the
> >>>>>>> guest may
> >>>>>>>     still be able to access the page for a short time if the
> >>>>>>> translation has
> >>>>>>>     been cached by the any TLB (IOMMU, Processor).
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> I understand this. I am not proposing to delay a requested by P2M
> >>>>>>> code TLB flush in any case. I just propose to issue TLB flush
> >>>>>>> (which we have to perform in case of page faults, to resolve
> >>>>>>> error condition and resume translations) from a tasklet rather
> >>>>>>> than from interrupt handler directly. This is the TLB flush I am
> >>>>>>> speaking about:
> >>>>>>>
> >>>>>>>
> https://github.com/otyshchenko1/xen/blob/ipmmu_upstream2/xen/drivers/passthrough/arm/ipmmu-vmsa.c#L598
> >>>>>>>
> >>>>>>>
> >>>>>>> Sorry if I was unclear.
> >>>>>>
> >>>>>> My mistake, I misread what you wrote.
> >>>>>>
> >>>>>> I found the flush in the renesas-bsp and not Linux upstream but it
> >>>>>> is not clear why this is actually required. You are not fixing any
> >>>>>> translation error. So what this flush will do?
> >>>>>>
> >>>>>> Regarding the placement of the flush, then if you execute in a
> >>>>>> tasklet it will likely be done later on when the IRQ has been
> >>>>>> acknowledge. What's the implication to delay it?
> >>>>>
> >>>>>
> >>>>> Looks like, there is no need to put this flush into a tasklet. As I
> >>>>> understand from Shimoda-san's answer it is OK to call flush here.
> >>>>>
> >>>>> So, my worry about calling ipmmu_tlb_invalidate() directly from the
> >>>>> interrupt handler is not actual anymore.
> >>>>> ----------
> >>>>> This is my understanding regarding the flush purpose here. This
> >>>>> code, just follows the TRM, no more no less,
> >>>>> which mentions about a need to flush TLB after clearing error
> >>>>> status register and updating a page table entries (which, I assume,
> >>>>> means to resolve a reason why translation/page fault error actually
> >>>>> have happened) to resume address translation request.
> >>>>
> >>>> Well, I don't have the TRM... so my point of reference is Linux. Why
> >>>> does upstream not do the TLB flush?
> >>>
> >>> I have no idea regarding that.
> >>>
> >>>>
> >>>>
> >>>> I have been told this is an errata on the IPMMU. Is it related to
> >>>> the series posted on linux-iommu [1]?
> >>>
> >>> I don't think, the TLB flush we are speaking about, is related to
> >>> that series [1] somehow. This TLB flush, I think, is just the last
> >>> step in a sequence of actions which should be performed when the
> >>> error occurs, no more no less. This is how I understand this.
> >>
> >> If you have to flush the TLBs in the IRQ context then something has
> >> gone really wrong.
> >>
> >> I don't deny that Break-Before-Make is an issue. However, if it is
> >> handled correctly in the P2M code. You should only be there because
> >> there are no mapping in the TLBs for the address accessed. So flushing
> >> the TLBs should be unnecessary, unless your TLB is also caching
> >> invalid entry?
> >
> > Sorry, I don't quite understand why we need to worry about this flush
> > too much for a case which won't occur in normal condition (if everything
> > is correct). Why we can't just consider this flush as a required action,
>
> A translation error can be easy to reach. For instance if the guest does
> not program the Device correctly and request to access an address that
> is not mapped.
>

Yes, I understand that. But what I wrote was that the error would not occur
under normal conditions (if everything is correct).



>
>
> > which needed to exit from the error state and resume stopped address
> > translation request. The same required action as "clearing error status
> > flags" before. We are not trying to understand, why is it so necessary
> > to clear error flags when error happens, or can we end up without
> > clearing it, for example. We just follow what described in document. The
> > same, I think, we have for that flush, if described, then should be
> > followed. Looks like this flush acts as a trigger to unblock stopped
> > transaction in that particular case.
>
> What will actually happen if the transaction fail again? For instance,
> if the IOVA was not mapped. Will you receive the interrupt again?
> If so, are you going to make the flush again and again until the guest
> is killed?
>

This is a good question. I think that if the address is not mapped, the
transaction will fail again and we will get the interrupt again. I am not
sure whether this continues until the guest is killed or until the driver
in the guest detects a timeout and cancels the DMA. Let's consider the
worst case: until the guest is killed.

So my question is: what do you think would be the proper driver behavior
in that case? Do nothing and not even try to resolve the error
condition/unblock translation at the first page fault, give it a few
attempts, or unblock every time? How does the SMMU driver act in such a
situation?

Clearly, if we get a fault, then the address is not mapped. I think it can
happen in two ways: by a wrong address being issued (buggy driver,
malicious driver) or by a race (unlikely). If this is a real race (the
device hits break-before-make, for example), we could give it another
attempt. It looks like we need some mechanism to pass the faulting address
to the P2M code (which manages the page table) for analysis? Or is it not
worth doing that?
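
To make the "give it a few attempts" option concrete, here is a minimal
sketch. It is not taken from the driver: struct fault_state, on_fault() and
MAX_FAULT_RETRIES are hypothetical names, and the limit of 3 is arbitrary.
It only counts repeated faults on the same IOVA and gives up once the limit
is exceeded.

```c
/* Arbitrary illustrative limit, not a value from any TRM or driver. */
#define MAX_FAULT_RETRIES 3

/* Hypothetical per-context bookkeeping for repeated faults. */
struct fault_state {
    unsigned long last_iova;   /* IOVA of the most recent fault */
    unsigned int count;        /* consecutive faults seen on last_iova */
};

enum fault_action { FAULT_RETRY, FAULT_GIVE_UP };

/*
 * Decide what to do with a fault on 'iova': retry (unblock translation)
 * for the first MAX_FAULT_RETRIES faults on the same address, then give up.
 * A fault on a different address restarts the count.
 */
static enum fault_action on_fault(struct fault_state *s, unsigned long iova)
{
    if (s->count == 0 || s->last_iova != iova) {
        s->last_iova = iova;
        s->count = 1;
        return FAULT_RETRY;
    }

    if (++s->count > MAX_FAULT_RETRIES)
        return FAULT_GIVE_UP;

    return FAULT_RETRY;
}
```

What "give up" should mean in practice (terminate the transaction, stop
unblocking, or something else) is exactly the open question above.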


> >
> > Different H/W could have different restoring sequences. Some H/W
> > requires just clearing error status, other H/W requires full
> > re-initialization in a specific order to recover from the error state.
> >
> > Please correct me if I am wrong.
>
> I am not confident to accept any code that I don't understand or I don't
> find sensible. As I pointed out in my previous e-mail, this hasn't
> reached upstream so something looks quite fishy here.
>
>
As I answered in my previous e-mail, I hope we will be able to clarify the
reason why this hasn't reached upstream.

>

[-- Attachment #2: Type: text/plain, Size: 157 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Xen-devel] [PATCH V2 6/6] iommu/arm: Add Renesas IPMMU-VMSA support
  2019-08-08 23:32                         ` Oleksandr Tyshchenko
@ 2019-08-09  9:56                           ` Julien Grall
  2019-08-09 18:38                             ` Oleksandr
  0 siblings, 1 reply; 59+ messages in thread
From: Julien Grall @ 2019-08-09  9:56 UTC (permalink / raw)
  To: Oleksandr Tyshchenko
  Cc: xen-devel, Yoshihiro Shimoda, sstabellini, Robin Murphy,
	Oleksandr Tyshchenko

(+ Robin)

On 09/08/2019 00:32, Oleksandr Tyshchenko wrote:
> Hi Julien.

Hi,

> 
> Sorry for the possible format issues.
> 
> Thu, 8 Aug 2019, 23:32 Julien Grall <julien.grall@arm.com 
> <mailto:julien.grall@arm.com>>:
> 
>     Hi Oleksandr,
> 
>     On 8/8/19 8:29 PM, Oleksandr wrote:
>      >>>>>>
>      >>>>>>> Sorry for the possible format issues.
>      >>>>>>>
>      >>>>>>>
>      >>>>>>>      > No, it is not disabled. But, ipmmu_irq() uses another
>      >>>>>>> mmu->lock. So, I
>      >>>>>>>      > think, there won't be a deadlock.
>      >>>>>>>      >
>      >>>>>>>      > Or I really missed something?
>      >>>>>>>      >
>      >>>>>>>      > If we worry about ipmmu_tlb_invalidate() which is called
>      >>>>>>> here (to
>      >>>>>>>      > perform a flush by request from P2M code, which manages a
>      >>>>>>> page table)
>      >>>>>>>      > and from the irq handler (to perform a flush to resume
>      >>>>>>> address
>      >>>>>>>      > translation), I could use a tasklet to schedule
>      >>>>>>> ipmmu_tlb_invalidate()
>      >>>>>>>      > from the irq handler then. This way we would get this
>      >>>>>>> serialized. What
>      >>>>>>>      > do you think?
>      >>>>>>>
>      >>>>>>>     I am afraid a tasklet is not an option. You need to perform
>      >>>>>>> the TLB
>      >>>>>>>     flush when requested otherwise you are introducing a security
>      >>>>>>> issue.
>      >>>>>>>
>      >>>>>>>     This is because as soon as a region is unmapped in the page
>      >>>>>>> table, we
>      >>>>>>>     remove the drop the reference on any page backing that
>      >>>>>>> region. When the
>      >>>>>>>     reference is dropped to zero, the page can be reallocated to
>      >>>>>>> another
>      >>>>>>>     domain or even Xen. If the TLB flush happen after, then the
>      >>>>>>> guest may
>      >>>>>>>     still be able to access the page for a short time if the
>      >>>>>>> translation has
>      >>>>>>>     been cached by the any TLB (IOMMU, Processor).
>      >>>>>>>
>      >>>>>>>
>      >>>>>>>
>      >>>>>>> I understand this. I am not proposing to delay a requested by P2M
>      >>>>>>> code TLB flush in any case. I just propose to issue TLB flush
>      >>>>>>> (which we have to perform in case of page faults, to resolve
>      >>>>>>> error condition and resume translations) from a tasklet rather
>      >>>>>>> than from interrupt handler directly. This is the TLB flush I am
>      >>>>>>> speaking about:
>      >>>>>>>
>      >>>>>>>
>     https://github.com/otyshchenko1/xen/blob/ipmmu_upstream2/xen/drivers/passthrough/arm/ipmmu-vmsa.c#L598
> 
>      >>>>>>>
>      >>>>>>>
>      >>>>>>> Sorry if I was unclear.
>      >>>>>>
>      >>>>>> My mistake, I misread what you wrote.
>      >>>>>>
>      >>>>>> I found the flush in the renesas-bsp and not Linux upstream but it
>      >>>>>> is not clear why this is actually required. You are not fixing any
>      >>>>>> translation error. So what this flush will do?
>      >>>>>>
>      >>>>>> Regarding the placement of the flush, then if you execute in a
>      >>>>>> tasklet it will likely be done later on when the IRQ has been
>      >>>>>> acknowledge. What's the implication to delay it?
>      >>>>>
>      >>>>>
>      >>>>> Looks like, there is no need to put this flush into a tasklet. As I
>      >>>>> understand from Shimoda-san's answer it is OK to call flush here.
>      >>>>>
>      >>>>> So, my worry about calling ipmmu_tlb_invalidate() directly from the
>      >>>>> interrupt handler is not actual anymore.
>      >>>>> ----------
>      >>>>> This is my understanding regarding the flush purpose here. This
>      >>>>> code, just follows the TRM, no more no less,
>      >>>>> which mentions about a need to flush TLB after clearing error
>      >>>>> status register and updating a page table entries (which, I assume,
>      >>>>> means to resolve a reason why translation/page fault error actually
>      >>>>> have happened) to resume address translation request.
>      >>>>
>      >>>> Well, I don't have the TRM... so my point of reference is Linux. Why
>      >>>> does upstream not do the TLB flush?
>      >>>
>      >>> I have no idea regarding that. >
>      >>>
>      >>>>
>      >>>>
>      >>>> I have been told this is an errata on the IPMMU. Is it related to
>      >>>> the series posted on linux-iommu [1]?
>      >>>
>      >>> I don't think, the TLB flush we are speaking about, is related to
>      >>> that series [1] somehow. This TLB flush, I think, is just the last
>      >>> step in a sequence of actions which should be performed when the
>      >>> error occurs, no more no less. This is how I understand this.
>      >>
>      >> If you have to flush the TLBs in the IRQ context then something has
>      >> gone really wrong.
>      >>
>      >> I don't deny that Break-Before-Make is an issue. However, if it is
>      >> handled correctly in the P2M code. You should only be there because
>      >> there are no mapping in the TLBs for the address accessed. So flushing
>      >> the TLBs should be unnecessary, unless your TLB is also caching
>      >> invalid entry?
>      >
>      > Sorry, I don't quite understand why we need to worry about this flush
>      > too much for a case which won't occur in normal condition (if everything
>      > is correct). Why we can't just consider this flush as a required action,
> 
>     A translation error can be easy to reach. For instance if the guest does
>     not program the Device correctly and request to access an address that
>     is not mapped.
> 
> 
> Yes, I understand these bits. But, I wrote that error wouldn't occur in normal 
> condition (if everything was correct).

I don't understand your point here. Whether this is in an error path or the
normal path, we should be able to understand the reason behind it. Otherwise,
error paths would become the wild west...

> 
> 
> 
> 
> 
>      > which needed to exit from the error state and resume stopped address
>      > translation request. The same required action as "clearing error status
>      > flags" before. We are not trying to understand, why is it so necessary
>      > to clear error flags when error happens, or can we end up without
>      > clearing it, for example. We just follow what described in document. The
>      > same, I think, we have for that flush, if described, then should be
>      > followed. Looks like this flush acts as a trigger to unblock stopped
>      > transaction in that particular case.
> 
>     What will actually happen if the transaction fail again? For instance,
>     if the IOVA was not mapped. Will you receive the interrupt again?
>     If so, are you going to make the flush again and again until the guest
>     is killed?
> 
> 
> This is a good question. I think, if address is not mapped, the transaction will 
> fail again and we will get the interrupt again. Not sure, until the guest is 
> killed or until the driver in the guest detects timeout and cancels DMA. Let's 
> consider the worst case, until the guest is killed.
> 
> So my questions are what do you think would be the proper driver's behavior in 
> that case? Do nothing and don't even try to resolve error condition/unblock 
> translation at the first page fault, or give it a few attempts, or unblock every 
> time.

Let me answer with a question here. How is the TLB flush going to unblock
anything? Moreover, you are not fixing any error condition here... And the
"Unhandled fault" print just afterwards clearly suggests that there is very
little chance the fault has been resolved.

> How does the SMMU driver act in such situation?

I have CCed Robin, who knows the SMMU driver better than I do. That is the
Linux one, though, but the Xen driver is based on it.

From my understanding, it is implementation defined whether the SMMU supports
stalling a transaction on fault. AFAICT, the current Xen driver will just
terminate the transaction, and therefore the client transaction behaves as
RAZ/WI.
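
For reference, RAZ/WI (read-as-zero, write-ignored) can be modelled with a toy
example. The names below are made up for illustration and none of this is SMMU
driver code: 'backing' stands in for memory behind a valid mapping, and the
'mapped' argument stands in for whether the translation succeeded.

```c
#include <stdint.h>

/* Toy stand-in for the memory behind a valid mapping. */
static uint32_t backing = 0xdeadbeef;

/* A read through a terminated (unmapped) translation reads as zero. */
static uint32_t dev_read(int mapped)
{
    return mapped ? backing : 0;
}

/* A write through a terminated (unmapped) translation is ignored. */
static void dev_write(int mapped, uint32_t value)
{
    if (mapped)
        backing = value;
}
```

In this model a faulting device can neither observe nor modify the memory,
which is the property that matters for isolation.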

> 
> Quite clear, if we get a fault, then address is not mapped. I think, it can be 
> both: by issuing wrong address (buggy driver, malicious driver) or by race 
> (unlikely). If this is the real race (device hits break-before-make, for 
> example), we could give it another attempt, for example. Looks like we need some 
> mechanism to deploy faulted address to P2M code (which manages page table) to 
> analyze? Or it is not worth doing that?

You seem to speak about break-before-make as if it were an error.
Break-Before-Make is just a sequence to prevent the TLB walker from caching
both the old and new mappings at the same time. At any given point the IOVA
translation can only be:
    1) The old physical address
    2) No address -> results in a fault
    3) The new physical address

1) and 3) should not result in a fault. 2) will result in a fault, but then
the TLB should not cache an invalid entry, right?

In order to see 2), we always flush the TLBs after removing the old physical
address.

Unfortunately, some of the IOMMUs are not able to restart transactions, so Xen
currently avoids flushing the TLBs after 2). So you may be able to see both
mappings at the same time.
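
The three states above can be sketched with a toy model. This is only an
illustration of the textbook sequence (break, flush, make), not Xen's P2M
code: the single-entry "page table" and "TLB" and all the names below are
hypothetical, and the model assumes a TLB that never caches invalid entries.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model: one page-table entry and one cached TLB entry. */
struct entry {
    bool valid;
    uint64_t pa;
};

static struct entry table_entry;   /* what the table walker sees */
static struct entry tlb_entry;     /* what the TLB has cached */

static void tlb_flush(void)
{
    tlb_entry.valid = false;
}

/*
 * Translate: use the TLB on a hit, otherwise walk the table.
 * Returns false on a fault; invalid entries are never cached.
 */
static bool translate(uint64_t *pa)
{
    if (tlb_entry.valid) {
        *pa = tlb_entry.pa;
        return true;
    }

    if (!table_entry.valid)
        return false;              /* state 2): no address -> fault */

    tlb_entry = table_entry;       /* cache the valid translation */
    *pa = table_entry.pa;
    return true;
}

/* "Break": remove the old mapping, then flush so no stale copy remains. */
static void bbm_break(void)
{
    table_entry.valid = false;
    tlb_flush();
}

/* "Make": install the new mapping; the next walk will pick it up. */
static void bbm_make(uint64_t new_pa)
{
    table_entry.pa = new_pa;
    table_entry.valid = true;
}
```

Between bbm_break() and bbm_make() a translation faults, but because nothing
invalid was cached, no further flush is needed for the new mapping to be
picked up by the next walk.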

Looking at your driver, I believe you would have the flag IMSTR.MHIT (multiple
TLB hits) set because this is the condition we are trying to prevent with
break-before-make. The comment in the code leads me to think this is a fault
error, so I am not sure why you would recover here...

If your IOMMU is able to stall transactions, then it would be best if we
properly handle break-before-make with it.

Overall, it feels to me the TLB flush is here for a different reason.

> 
> 
>      >
>      > Different H/W could have different restoring sequences. Some H/W
>      > requires just clearing error status, other H/W requires full
>      > re-initialization in a specific order to recover from the error state.
>      >
>      > Please correct me if I am wrong.
> 
>     I am not confident to accept any code that I don't understand or I don't
>     find sensible. As I pointed out in my previous e-mail, this hasn't
>     reached upstream so something looks quite fishy here.
> 
> 
> As I answered in previous e-mail, I hope, we will be able to clarify a reason 
> why this hasn't reached upstream.

Thank you.

Cheers,


-- 
Julien Grall

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Xen-devel] [PATCH V2 1/6] iommu/arm: Add iommu_helpers.c file to keep common for IOMMUs stuff
  2019-08-02 16:39 ` [Xen-devel] [PATCH V2 1/6] iommu/arm: Add iommu_helpers.c file to keep common for IOMMUs stuff Oleksandr Tyshchenko
@ 2019-08-09 17:35   ` Julien Grall
  2019-08-09 18:10     ` Oleksandr
  0 siblings, 1 reply; 59+ messages in thread
From: Julien Grall @ 2019-08-09 17:35 UTC (permalink / raw)
  To: Oleksandr Tyshchenko, xen-devel; +Cc: Oleksandr Tyshchenko, sstabellini

Hi Oleksandr,

On 02/08/2019 17:39, Oleksandr Tyshchenko wrote:
> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> 
> Introduce a separate file to keep various helpers which could be used
> by more than one IOMMU driver in order not to duplicate code.
> 
> The first condidates to be moved to the new file are SMMU driver's

NIT: s/condidates/candidates/

> "map_page/unmap_page" callbacks. There callbacks neither contain any
> SMMU specific info nor perform any SMMU specific actions and are going
> to be the same across all IOMMU drivers which H/W IP shares P2M
> with the CPU like SMMU does.
> 
> So, move callbacks to iommu_helpers.c for the upcoming IPMMU driver
> to be able to re-use them.
> 
> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> ---
>   xen/drivers/passthrough/arm/Makefile        |  2 +-
>   xen/drivers/passthrough/arm/iommu_helpers.c | 78 +++++++++++++++++++++++++++++
>   xen/drivers/passthrough/arm/smmu.c          | 48 +-----------------
>   xen/include/asm-arm/iommu.h                 |  7 +++
>   4 files changed, 88 insertions(+), 47 deletions(-)
>   create mode 100644 xen/drivers/passthrough/arm/iommu_helpers.c
> 
> diff --git a/xen/drivers/passthrough/arm/Makefile b/xen/drivers/passthrough/arm/Makefile
> index b3efcfd..4abb87a 100644
> --- a/xen/drivers/passthrough/arm/Makefile
> +++ b/xen/drivers/passthrough/arm/Makefile
> @@ -1,2 +1,2 @@
> -obj-y += iommu.o
> +obj-y += iommu.o iommu_helpers.o
>   obj-$(CONFIG_ARM_SMMU) += smmu.o
> diff --git a/xen/drivers/passthrough/arm/iommu_helpers.c b/xen/drivers/passthrough/arm/iommu_helpers.c
> new file mode 100644
> index 0000000..53e8daa
> --- /dev/null
> +++ b/xen/drivers/passthrough/arm/iommu_helpers.c
> @@ -0,0 +1,78 @@
> +/*
> + * xen/drivers/passthrough/arm/iommu_helpers.c
> + *
> + * Contains various helpers to be used by IOMMU drivers.
> + *
> + * Copyright (C) 2019 EPAM Systems Inc.

You mostly moved the code from the SMMU code, so the copyright there should be 
retained. As this is a Xen modification, the copyright here should be:

  * Copyright (C) 2014 Linaro Limited.

> + *
> + * This program is free software; you can redistribute it and/or
> + * modify it under the terms and conditions of the GNU General Public
> + * License, version 2, as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public
> + * License along with this program; If not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#include <xen/lib.h>
> +#include <xen/sched.h>
> +#include <xen/iommu.h>

Could you order the headers above alphabetically, please?

And also, as an extra NIT, add a newline between the 'xen' headers and the
'asm' one :).

> +#include <asm/device.h>

The rest of the code looks good to me.

Cheers,

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Xen-devel] [PATCH V2 1/6] iommu/arm: Add iommu_helpers.c file to keep common for IOMMUs stuff
  2019-08-09 17:35   ` Julien Grall
@ 2019-08-09 18:10     ` Oleksandr
  0 siblings, 0 replies; 59+ messages in thread
From: Oleksandr @ 2019-08-09 18:10 UTC (permalink / raw)
  To: Julien Grall, xen-devel; +Cc: Oleksandr Tyshchenko, sstabellini


Hi, Julien


>
> On 02/08/2019 17:39, Oleksandr Tyshchenko wrote:
>> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
>>
>> Introduce a separate file to keep various helpers which could be used
>> by more than one IOMMU driver in order not to duplicate code.
>>
>> The first condidates to be moved to the new file are SMMU driver's
>
> NIT: s/condidates/candidates/

ok


>
>> "map_page/unmap_page" callbacks. There callbacks neither contain any
>> SMMU specific info nor perform any SMMU specific actions and are going
>> to be the same across all IOMMU drivers which H/W IP shares P2M
>> with the CPU like SMMU does.
>>
>> So, move callbacks to iommu_helpers.c for the upcoming IPMMU driver
>> to be able to re-use them.
>>
>> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
>> ---
>>   xen/drivers/passthrough/arm/Makefile        |  2 +-
>>   xen/drivers/passthrough/arm/iommu_helpers.c | 78 
>> +++++++++++++++++++++++++++++
>>   xen/drivers/passthrough/arm/smmu.c          | 48 +-----------------
>>   xen/include/asm-arm/iommu.h                 |  7 +++
>>   4 files changed, 88 insertions(+), 47 deletions(-)
>>   create mode 100644 xen/drivers/passthrough/arm/iommu_helpers.c
>>
>> diff --git a/xen/drivers/passthrough/arm/Makefile 
>> b/xen/drivers/passthrough/arm/Makefile
>> index b3efcfd..4abb87a 100644
>> --- a/xen/drivers/passthrough/arm/Makefile
>> +++ b/xen/drivers/passthrough/arm/Makefile
>> @@ -1,2 +1,2 @@
>> -obj-y += iommu.o
>> +obj-y += iommu.o iommu_helpers.o
>>   obj-$(CONFIG_ARM_SMMU) += smmu.o
>> diff --git a/xen/drivers/passthrough/arm/iommu_helpers.c 
>> b/xen/drivers/passthrough/arm/iommu_helpers.c
>> new file mode 100644
>> index 0000000..53e8daa
>> --- /dev/null
>> +++ b/xen/drivers/passthrough/arm/iommu_helpers.c
>> @@ -0,0 +1,78 @@
>> +/*
>> + * xen/drivers/passthrough/arm/iommu_helpers.c
>> + *
>> + * Contains various helpers to be used by IOMMU drivers.
>> + *
>> + * Copyright (C) 2019 EPAM Systems Inc.
>
> You mostly moved the code from the SMMU code, so the copyright there 
> should be retained. As this is a Xen modification, the copyright here 
> should be:
>
>  * Copyright (C) 2014 Linaro Limited.

Oh, yes. Sorry, forgot about it. Will add.


>
>> + *
>> + * This program is free software; you can redistribute it and/or
>> + * modify it under the terms and conditions of the GNU General Public
>> + * License, version 2, as published by the Free Software Foundation.
>> + *
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>> + * General Public License for more details.
>> + *
>> + * You should have received a copy of the GNU General Public
>> + * License along with this program; If not, see 
>> <http://www.gnu.org/licenses/>.
>> + */
>> +
>> +#include <xen/lib.h>
>> +#include <xen/sched.h>
>> +#include <xen/iommu.h>
>
> Could you order the headers above alphabetically, please?
>
> And also, as an extra NIT, add a newline between the 'xen' headers and
> the 'asm' one :).

Yes, will follow this rule.


>
>> +#include <asm/device.h>
>
> The rest of the code looks good to me.
>
> Cheers,
>
-- 
Regards,

Oleksandr Tyshchenko


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Xen-devel] [PATCH V2 6/6] iommu/arm: Add Renesas IPMMU-VMSA support
  2019-08-09  9:56                           ` Julien Grall
@ 2019-08-09 18:38                             ` Oleksandr
  0 siblings, 0 replies; 59+ messages in thread
From: Oleksandr @ 2019-08-09 18:38 UTC (permalink / raw)
  To: Julien Grall
  Cc: xen-devel, Yoshihiro Shimoda, sstabellini, Robin Murphy,
	Oleksandr Tyshchenko


Hi, Julien


>>
>>     What will actually happen if the transaction fail again? For 
>> instance,
>>     if the IOVA was not mapped. Will you receive the interrupt again?
>>     If so, are you going to make the flush again and again until the 
>> guest
>>     is killed?
>>
>>
>> This is a good question. I think, if address is not mapped, the 
>> transaction will fail again and we will get the interrupt again. Not 
>> sure, until the guest is killed or until the driver in the guest 
>> detects timeout and cancels DMA. Let's consider the worst case, until 
>> the guest is killed.
>>
>> So my questions are what do you think would be the proper driver's 
>> behavior in that case? Do nothing and don't even try to resolve error 
>> condition/unblock translation at the first page fault, or give it a 
>> few attempts, or unblock every time.
>
> I will answer back with a question here. How is the TLB flush going 
> to unblock anything? Moreover, you are not fixing any error condition 
> here... And the print "Unhandled fault" just afterwards clearly leads 
> me to think that there is very little chance the fault has been resolved.

Now I understand your point. This really makes sense.


>
>> How does the SMMU driver act in such situation?
>
> I have CCed Robin, who knows the SMMU driver better than me. That is 
> the Linux driver, but the Xen one is based on it.
>
> From my understanding, it is implementation defined whether the SMMU 
> supports stalling a transaction on fault. AFAICT, the current Xen 
> driver will just terminate the transaction and therefore the client 
> transaction behaves as RAZ/WI.

I got it. So, it sounds like the client won't be able to do anything bad, 
and we won't receive an interrupt storm here in Xen.


>
>
>>
>> It is quite clear that if we get a fault, the address is not mapped. 
>> I think there can be two causes: a wrong address was issued (buggy or 
>> malicious driver) or a race (unlikely). If this is a real race (the 
>> device hits break-before-make, for example), we could give it another 
>> attempt. It looks like we need some mechanism to pass the faulting 
>> address to the P2M code (which manages the page table) for analysis? 
>> Or is it not worth doing that?
>
> You seem to speak about break-before-make as if it were an error. 
> Break-Before-Make is just a sequence to prevent the TLB walker from 
> caching both the old and new mappings at the same time. At a given 
> point the IOVA translation can only be:
>    1) The old physical address
>    2) No address -> results in a fault
>    3) The new physical address
>
> 1) and 3) should not result in a fault. 2) will result in a fault but 
> then the TLB should not cache an invalid entry, right?

right.


>
> In order to see 2), we always flush the TLBs after removing the old 
> physical address.
>
> Unfortunately, as some IOMMUs are not able to restart transactions, 
> Xen currently avoids flushing the TLBs after 2). So you may be able 
> to see both mappings at the same time.
>
> Looking at your driver, I believe you would have the flag IMSTR.MHIT 
> (multiple TLB hits) set because this is the condition we are trying to 
> prevent with break-before-make. The comment in the code leads me to 
> think this is a fault error, so I am not sure why you would recover 
> here...
>
> If your IOMMU is able to stall transactions, then it would be best if 
> we properly handle break-before-make with it.

Thank you for the detailed answer. I would like to say that I have never 
seen a "Multiple TLB hits" error raised by the IPMMU in Xen.


>
> Overall, it feels to me the TLB flush is here for a different reason.


I will drop this TLB flush from interrupt handler until clarified.


-- 
Regards,

Oleksandr Tyshchenko



* Re: [Xen-devel] [PATCH V2 2/6] iommu/arm: Add ability to handle deferred probing request
  2019-08-02 16:39 ` [Xen-devel] [PATCH V2 2/6] iommu/arm: Add ability to handle deferred probing request Oleksandr Tyshchenko
@ 2019-08-12 11:11   ` Julien Grall
  2019-08-12 12:01     ` Oleksandr
  0 siblings, 1 reply; 59+ messages in thread
From: Julien Grall @ 2019-08-12 11:11 UTC (permalink / raw)
  To: Oleksandr Tyshchenko, xen-devel; +Cc: Oleksandr Tyshchenko, sstabellini

Hi Oleksandr,

On 02/08/2019 17:39, Oleksandr Tyshchenko wrote:
> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> 
> This patch adds minimal required support to General IOMMU framework
> to be able to handle a case when IOMMU driver requesting deferred
> probing for a device.
> 
> In order not to pull Linux's error code (-EPROBE_DEFER) to Xen
> we have chosen -EAGAIN to be used for indicating that device
> probing is deferred.
> 
> This is needed for the upcoming IPMMU driver which may request
> deferred probing depending on what device will be probed the first
> (there is some dependency between these devices, Root device must be
> registered before Cache devices. If not the case, driver will deny
> further Cache device probes until Root device is registered).
> As we can't guarantee a fixed pre-defined order for the device nodes
> in DT, we need to be ready for the situation where devices being
> probed in "any" order.
> 
> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> ---
>   xen/common/device_tree.c            |  1 +
>   xen/drivers/passthrough/arm/iommu.c | 35 ++++++++++++++++++++++++++++++++++-
>   xen/include/asm-arm/device.h        |  6 +++++-
>   xen/include/xen/device_tree.h       |  1 +
>   4 files changed, 41 insertions(+), 2 deletions(-)
> 
> diff --git a/xen/common/device_tree.c b/xen/common/device_tree.c
> index e107c6f..6f37448 100644
> --- a/xen/common/device_tree.c
> +++ b/xen/common/device_tree.c
> @@ -1774,6 +1774,7 @@ static unsigned long __init unflatten_dt_node(const void *fdt,
>           /* By default the device is not protected */
>           np->is_protected = false;
>           INIT_LIST_HEAD(&np->domain_list);
> +        INIT_LIST_HEAD(&np->deferred_probe);

I am not entirely happy to add a new list_head field per node just for the 
benefit of boot code. Could we re-use domain_list (with a comment in the code 
and an appropriate ASSERT)?

>   
>           if ( new_format )
>           {
> diff --git a/xen/drivers/passthrough/arm/iommu.c b/xen/drivers/passthrough/arm/iommu.c
> index 2135233..3195919 100644
> --- a/xen/drivers/passthrough/arm/iommu.c
> +++ b/xen/drivers/passthrough/arm/iommu.c
> @@ -20,6 +20,12 @@
>   #include <xen/device_tree.h>
>   #include <asm/device.h>
>   
> +/*
> + * Used to keep track of devices for which driver requested deferred probing
> + * (returns -EAGAIN).
> + */
> +static LIST_HEAD(deferred_probe_list);

This wants to be in the init section, as it is only used at boot.

> +
>   static const struct iommu_ops *iommu_ops;
>   
>   const struct iommu_ops *iommu_get_ops(void)
> @@ -42,7 +48,7 @@ void __init iommu_set_ops(const struct iommu_ops *ops)
>   
>   int __init iommu_hardware_setup(void)
>   {
> -    struct dt_device_node *np;
> +    struct dt_device_node *np, *tmp;
>       int rc;
>       unsigned int num_iommus = 0;
>   
> @@ -51,6 +57,33 @@ int __init iommu_hardware_setup(void)
>           rc = device_init(np, DEVICE_IOMMU, NULL);
>           if ( !rc )
>               num_iommus++;
> +        else if (rc == -EAGAIN)
> +            /*
> +             * Driver requested deferred probing, so add this device to
> +             * the deferred list for further processing.
> +             */
> +            list_add(&np->deferred_probe, &deferred_probe_list);
> +    }
> +
> +    /*
> +     * Process devices in the deferred list if at least one successfully
> +     * probed device is present.
> +     */

I think this can turn into an infinite loop if all devices in deferred_probe_list 
still return -EAGAIN and num_iommus is non-zero.

A better condition would be to check that at least one IOMMU is added at each 
loop. If not, then we should bail with an error because it likely means 
something is buggy.
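
A rough standalone sketch of such a progress check follows. It is not the
actual Xen code: struct fake_node, the *_init callbacks, retry_deferred()
and the -ENODEV bail-out value are all made up for illustration, and an
array with a "deferred" flag stands in for the real list. Errors other
than -EAGAIN simply drop the node, as the patch does today.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a DT node whose probing was deferred. */
struct fake_node {
    int (*init)(struct fake_node *np); /* 0, -EAGAIN, or another error */
    bool deferred;                     /* still on the deferred list */
};

static bool root_ready; /* models "Root device is registered" */

static int root_init(struct fake_node *np)
{
    (void)np;
    root_ready = true;
    return 0;
}

static int cache_init(struct fake_node *np)
{
    (void)np;
    return root_ready ? 0 : -EAGAIN;
}

static int stuck_init(struct fake_node *np)
{
    (void)np;
    return -EAGAIN; /* never becomes ready */
}

/*
 * Retry deferred nodes until none is pending. Every pass must add at
 * least one IOMMU; otherwise give up instead of spinning forever.
 */
static int retry_deferred(struct fake_node *nodes, size_t n,
                          unsigned int *num_iommus)
{
    for ( ; ; )
    {
        size_t i, pending = 0;
        unsigned int added = 0;

        for ( i = 0; i < n; i++ )
        {
            int rc;

            if ( !nodes[i].deferred )
                continue;

            rc = nodes[i].init(&nodes[i]);
            if ( rc == -EAGAIN )
            {
                pending++;
                continue;
            }

            /* Probed (or failed for good): drop it from the list. */
            nodes[i].deferred = false;
            if ( !rc )
                added++;
        }

        *num_iommus += added;

        if ( !pending )
            return 0;
        if ( !added )
            return -ENODEV; /* assumption: error value chosen for the sketch */
    }
}
```

With a Cache node queued ahead of its Root node, the first pass registers
the Root and the second pass picks up the Cache. A node that keeps
returning -EAGAIN makes the function return after a single pass without
progress.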

> +    while ( !list_empty(&deferred_probe_list) && num_iommus )
> +    {
> +        list_for_each_entry_safe ( np, tmp, &deferred_probe_list,
> +                                   deferred_probe )
> +        {
> +            rc = device_init(np, DEVICE_IOMMU, NULL);
> +            if ( !rc )
> +                num_iommus++;
> +            if ( rc != -EAGAIN )
> +                /*
> +                 * Driver didn't request deferred probing, so remove this device
> +                 * from the deferred list.
> +                 */
> +                list_del_init(&np->deferred_probe);
> +        }
>       }
>   
>       return ( num_iommus > 0 ) ? 0 : -ENODEV;
> diff --git a/xen/include/asm-arm/device.h b/xen/include/asm-arm/device.h
> index 63a0f36..ee1c3bc 100644
> --- a/xen/include/asm-arm/device.h
> +++ b/xen/include/asm-arm/device.h
> @@ -44,7 +44,11 @@ struct device_desc {
>       enum device_class class;
>       /* List of devices supported by this driver */
>       const struct dt_device_match *dt_match;
> -    /* Device initialization */
> +    /*
> +     * Device initialization.
> +     *
> +     * -EAGAIN is used to indicate that device probing is deferred.
> +     */
>       int (*init)(struct dt_device_node *dev, const void *data);
>   };
>   
> diff --git a/xen/include/xen/device_tree.h b/xen/include/xen/device_tree.h
> index 8315629..71b0e47 100644
> --- a/xen/include/xen/device_tree.h
> +++ b/xen/include/xen/device_tree.h
> @@ -93,6 +93,7 @@ struct dt_device_node {
>       /* IOMMU specific fields */
>       bool is_protected;
>       struct list_head domain_list;
> +    struct list_head deferred_probe;
>   
>       struct device dev;
>   };
> 

Cheers,

-- 
Julien Grall


* Re: [Xen-devel] [PATCH V2 2/6] iommu/arm: Add ability to handle deferred probing request
  2019-08-12 11:11   ` Julien Grall
@ 2019-08-12 12:01     ` Oleksandr
  2019-08-12 19:46       ` Julien Grall
  0 siblings, 1 reply; 59+ messages in thread
From: Oleksandr @ 2019-08-12 12:01 UTC (permalink / raw)
  To: Julien Grall, xen-devel; +Cc: Oleksandr Tyshchenko, sstabellini


On 12.08.19 14:11, Julien Grall wrote:
> Hi Oleksandr,

Hi, Julien


>
> On 02/08/2019 17:39, Oleksandr Tyshchenko wrote:
>> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
>>
>> This patch adds minimal required support to General IOMMU framework
>> to be able to handle a case when IOMMU driver requesting deferred
>> probing for a device.
>>
>> In order not to pull Linux's error code (-EPROBE_DEFER) to Xen
>> we have chosen -EAGAIN to be used for indicating that device
>> probing is deferred.
>>
>> This is needed for the upcoming IPMMU driver which may request
>> deferred probing depending on what device will be probed the first
>> (there is some dependency between these devices, Root device must be
>> registered before Cache devices. If not the case, driver will deny
>> further Cache device probes until Root device is registered).
>> As we can't guarantee a fixed pre-defined order for the device nodes
>> in DT, we need to be ready for the situation where devices being
>> probed in "any" order.
>>
>> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
>> ---
>>   xen/common/device_tree.c            |  1 +
>>   xen/drivers/passthrough/arm/iommu.c | 35 
>> ++++++++++++++++++++++++++++++++++-
>>   xen/include/asm-arm/device.h        |  6 +++++-
>>   xen/include/xen/device_tree.h       |  1 +
>>   4 files changed, 41 insertions(+), 2 deletions(-)
>>
>> diff --git a/xen/common/device_tree.c b/xen/common/device_tree.c
>> index e107c6f..6f37448 100644
>> --- a/xen/common/device_tree.c
>> +++ b/xen/common/device_tree.c
>> @@ -1774,6 +1774,7 @@ static unsigned long __init 
>> unflatten_dt_node(const void *fdt,
>>           /* By default the device is not protected */
>>           np->is_protected = false;
>>           INIT_LIST_HEAD(&np->domain_list);
>> +        INIT_LIST_HEAD(&np->deferred_probe);
>
> I am not entirely happy to add a new list_head field per node just for 
> the benefit of boot code. Could we re-use domain_list (with a comment 
> in the code and an appropriate ASSERT)?

Agree that only boot code uses deferred_probe field. I will consider 
re-using domain_list. Could you please clarify regarding ASSERT (where 
to put and what to check).


>
>>             if ( new_format )
>>           {
>> diff --git a/xen/drivers/passthrough/arm/iommu.c 
>> b/xen/drivers/passthrough/arm/iommu.c
>> index 2135233..3195919 100644
>> --- a/xen/drivers/passthrough/arm/iommu.c
>> +++ b/xen/drivers/passthrough/arm/iommu.c
>> @@ -20,6 +20,12 @@
>>   #include <xen/device_tree.h>
>>   #include <asm/device.h>
>>   +/*
>> + * Used to keep track of devices for which driver requested deferred 
>> probing
>> + * (returns -EAGAIN).
>> + */
>> +static LIST_HEAD(deferred_probe_list);
>
> This wants to be in init section as this is only used at boot.

Will do.


>
>
>> +
>>   static const struct iommu_ops *iommu_ops;
>>     const struct iommu_ops *iommu_get_ops(void)
>> @@ -42,7 +48,7 @@ void __init iommu_set_ops(const struct iommu_ops *ops)
>>     int __init iommu_hardware_setup(void)
>>   {
>> -    struct dt_device_node *np;
>> +    struct dt_device_node *np, *tmp;
>>       int rc;
>>       unsigned int num_iommus = 0;
>>   @@ -51,6 +57,33 @@ int __init iommu_hardware_setup(void)
>>           rc = device_init(np, DEVICE_IOMMU, NULL);
>>           if ( !rc )
>>               num_iommus++;
>> +        else if (rc == -EAGAIN)
>> +            /*
>> +             * Driver requested deferred probing, so add this device to
>> +             * the deferred list for further processing.
>> +             */
>> +            list_add(&np->deferred_probe, &deferred_probe_list);
>> +    }
>> +
>> +    /*
>> +     * Process devices in the deferred list if at least one 
>> successfully
>> +     * probed device is present.
>> +     */
>
> I think this can turn into an infinite loop if all devices in 
> deferred_probe_list still return -EAGAIN and num_iommus is 
> non-zero.

Agree.


>
> A better condition would be to check that at least one IOMMU is added 
> at each loop. If not, then we should bail with an error because it 
> likely means something is buggy.

Sounds reasonable. Will do.


Just to clarify:

 >>> A better condition would be to check that at least one IOMMU is 
added at each loop.

Maybe the condition should be not only that an IOMMU was added (rc == 0), 
but that the driver didn't request deferred probing (rc != -EAGAIN)?



-- 
Regards,

Oleksandr Tyshchenko



* Re: [Xen-devel] [PATCH V2 2/6] iommu/arm: Add ability to handle deferred probing request
  2019-08-12 12:01     ` Oleksandr
@ 2019-08-12 19:46       ` Julien Grall
  2019-08-13 12:35         ` Oleksandr
  0 siblings, 1 reply; 59+ messages in thread
From: Julien Grall @ 2019-08-12 19:46 UTC (permalink / raw)
  To: Oleksandr, xen-devel; +Cc: Oleksandr Tyshchenko, sstabellini

Hi Oleksandr,

On 8/12/19 1:01 PM, Oleksandr wrote:
> On 12.08.19 14:11, Julien Grall wrote:
>> On 02/08/2019 17:39, Oleksandr Tyshchenko wrote:
>>> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
>>>
>>> This patch adds minimal required support to General IOMMU framework
>>> to be able to handle a case when IOMMU driver requesting deferred
>>> probing for a device.
>>>
>>> In order not to pull Linux's error code (-EPROBE_DEFER) to Xen
>>> we have chosen -EAGAIN to be used for indicating that device
>>> probing is deferred.
>>>
>>> This is needed for the upcoming IPMMU driver which may request
>>> deferred probing depending on what device will be probed the first
>>> (there is some dependency between these devices, Root device must be
>>> registered before Cache devices. If not the case, driver will deny
>>> further Cache device probes until Root device is registered).
>>> As we can't guarantee a fixed pre-defined order for the device nodes
>>> in DT, we need to be ready for the situation where devices being
>>> probed in "any" order.
>>>
>>> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
>>> ---
>>>   xen/common/device_tree.c            |  1 +
>>>   xen/drivers/passthrough/arm/iommu.c | 35 
>>> ++++++++++++++++++++++++++++++++++-
>>>   xen/include/asm-arm/device.h        |  6 +++++-
>>>   xen/include/xen/device_tree.h       |  1 +
>>>   4 files changed, 41 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/xen/common/device_tree.c b/xen/common/device_tree.c
>>> index e107c6f..6f37448 100644
>>> --- a/xen/common/device_tree.c
>>> +++ b/xen/common/device_tree.c
>>> @@ -1774,6 +1774,7 @@ static unsigned long __init 
>>> unflatten_dt_node(const void *fdt,
>>>           /* By default the device is not protected */
>>>           np->is_protected = false;
>>>           INIT_LIST_HEAD(&np->domain_list);
>>> +        INIT_LIST_HEAD(&np->deferred_probe);
>>
>> I am not entirely happy to add a new list_head field per node just for 
>> the benefit of boot code. Could we re-use domain_list (with a comment 
>> in the code and an appropriate ASSERT)?
> 
> Agree that only boot code uses deferred_probe field. I will consider 
> re-using domain_list. Could you please clarify regarding ASSERT (where 
> to put and what to check).

What I meant is adding an ASSERT to check that np->domain_list is 
empty, at least before trying to add the node to the list. This would 
help to debug any potential issue if we end up using domain_list earlier 
in the future. I can't see why we would, as the IOMMU code is called 
earlier, but who knows :).
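
Roughly along these lines, as a standalone sketch only. It uses libc's
assert() in place of Xen's ASSERT(), a hand-rolled list_head, and the
hypothetical names fake_dt_node and defer_probe():

```c
#include <assert.h>
#include <stdbool.h>

/* Minimal doubly linked list, modelled on the usual list_head idiom. */
struct list_head {
    struct list_head *next, *prev;
};

static void init_list_head(struct list_head *h)
{
    h->next = h->prev = h;
}

static bool list_empty(const struct list_head *h)
{
    return h->next == h;
}

/* Insert entry right after head. */
static void list_add(struct list_head *entry, struct list_head *head)
{
    entry->next = head->next;
    entry->prev = head;
    head->next->prev = entry;
    head->next = entry;
}

/* Hypothetical cut-down node: only the field being re-used. */
struct fake_dt_node {
    struct list_head domain_list;
};

/*
 * Queue a node for a later probe attempt, borrowing domain_list.
 * The assertion documents the assumption that nothing else has linked
 * the node through domain_list this early in boot.
 */
static void defer_probe(struct fake_dt_node *np, struct list_head *deferred)
{
    assert(list_empty(&np->domain_list));
    list_add(&np->domain_list, deferred);
}
```

If domain_list ever gets used before the IOMMU setup, a debug build would
trip the assertion at the point of misuse rather than corrupting a list.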

> 
> 
>>
>>>             if ( new_format )
>>>           {
>>> diff --git a/xen/drivers/passthrough/arm/iommu.c 
>>> b/xen/drivers/passthrough/arm/iommu.c
>>> index 2135233..3195919 100644
>>> --- a/xen/drivers/passthrough/arm/iommu.c
>>> +++ b/xen/drivers/passthrough/arm/iommu.c
>>> @@ -20,6 +20,12 @@
>>>   #include <xen/device_tree.h>
>>>   #include <asm/device.h>
>>>   +/*
>>> + * Used to keep track of devices for which driver requested deferred 
>>> probing
>>> + * (returns -EAGAIN).
>>> + */
>>> +static LIST_HEAD(deferred_probe_list);
>>
>> This wants to be in init section as this is only used at boot.
> 
> Will do.
> 
> 
>>
>>
>>> +
>>>   static const struct iommu_ops *iommu_ops;
>>>     const struct iommu_ops *iommu_get_ops(void)
>>> @@ -42,7 +48,7 @@ void __init iommu_set_ops(const struct iommu_ops *ops)
>>>     int __init iommu_hardware_setup(void)
>>>   {
>>> -    struct dt_device_node *np;
>>> +    struct dt_device_node *np, *tmp;
>>>       int rc;
>>>       unsigned int num_iommus = 0;
>>>   @@ -51,6 +57,33 @@ int __init iommu_hardware_setup(void)
>>>           rc = device_init(np, DEVICE_IOMMU, NULL);
>>>           if ( !rc )
>>>               num_iommus++;
>>> +        else if (rc == -EAGAIN)
>>> +            /*
>>> +             * Driver requested deferred probing, so add this device to
>>> +             * the deferred list for further processing.
>>> +             */
>>> +            list_add(&np->deferred_probe, &deferred_probe_list);
>>> +    }
>>> +
>>> +    /*
>>> +     * Process devices in the deferred list if at least one 
>>> successfully
>>> +     * probed device is present.
>>> +     */
>>
>> I think this can turn into an infinite loop if all devices in 
>> deferred_probe_list still return -EAGAIN and num_iommus is 
>> non-zero.
> 
> Agree.
> 
> 
>>
>> A better condition would be to check that at least one IOMMU is added 
>> at each loop. If not, then we should bail with an error because it 
>> likely means something is buggy.
> 
> Sounds reasonable. Will do.
> 
> 
> Just to clarify:
> 
>  >>> A better condition would be to check that at least one IOMMU is 
> added at each loop.
> 
> Maybe the condition should be not only that an IOMMU was added (rc == 0), 
> but that the driver didn't request deferred probing (rc != -EAGAIN)?

I think adding an IOMMU is enough. If you return an error other than 
-EAGAIN here after deferring probing, then you are likely going to fail 
at the next loop. So better to stop early.

I realize this is not what the current code is doing (I know I wrote it 
;)). But I am not sure it is sane to continue if only part of the IOMMUs 
are initialized. Most likely you will see an error much later that may 
not be trivial to track down.

Imagine you want to pass through your network card to a guest but the 
IOMMU initialization failed...

Cheers,

--
Julien Grall


* Re: [Xen-devel] [PATCH V2 2/6] iommu/arm: Add ability to handle deferred probing request
  2019-08-12 19:46       ` Julien Grall
@ 2019-08-13 12:35         ` Oleksandr
  2019-08-14 17:34           ` Julien Grall
  0 siblings, 1 reply; 59+ messages in thread
From: Oleksandr @ 2019-08-13 12:35 UTC (permalink / raw)
  To: Julien Grall, xen-devel; +Cc: Oleksandr Tyshchenko, sstabellini


On 12.08.19 22:46, Julien Grall wrote:
> Hi Oleksandr,

Hi, Julien


>
> On 8/12/19 1:01 PM, Oleksandr wrote:
>> On 12.08.19 14:11, Julien Grall wrote:
>>> On 02/08/2019 17:39, Oleksandr Tyshchenko wrote:
>>>> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
>>>>
>>>> This patch adds minimal required support to General IOMMU framework
>>>> to be able to handle a case when IOMMU driver requesting deferred
>>>> probing for a device.
>>>>
>>>> In order not to pull Linux's error code (-EPROBE_DEFER) to Xen
>>>> we have chosen -EAGAIN to be used for indicating that device
>>>> probing is deferred.
>>>>
>>>> This is needed for the upcoming IPMMU driver which may request
>>>> deferred probing depending on what device will be probed the first
>>>> (there is some dependency between these devices, Root device must be
>>>> registered before Cache devices. If not the case, driver will deny
>>>> further Cache device probes until Root device is registered).
>>>> As we can't guarantee a fixed pre-defined order for the device nodes
>>>> in DT, we need to be ready for the situation where devices being
>>>> probed in "any" order.
>>>>
>>>> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
>>>> ---
>>>>   xen/common/device_tree.c            |  1 +
>>>>   xen/drivers/passthrough/arm/iommu.c | 35 
>>>> ++++++++++++++++++++++++++++++++++-
>>>>   xen/include/asm-arm/device.h        |  6 +++++-
>>>>   xen/include/xen/device_tree.h       |  1 +
>>>>   4 files changed, 41 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/xen/common/device_tree.c b/xen/common/device_tree.c
>>>> index e107c6f..6f37448 100644
>>>> --- a/xen/common/device_tree.c
>>>> +++ b/xen/common/device_tree.c
>>>> @@ -1774,6 +1774,7 @@ static unsigned long __init 
>>>> unflatten_dt_node(const void *fdt,
>>>>           /* By default the device is not protected */
>>>>           np->is_protected = false;
>>>>           INIT_LIST_HEAD(&np->domain_list);
>>>> +        INIT_LIST_HEAD(&np->deferred_probe);
>>>
>>> I am not entirely happy to add a new list_head field per node just 
>>> for the benefit of boot code. Could we re-use domain_list (with a 
>>> comment in the code and an appropriate ASSERT)?
>>
>> Agree that only boot code uses deferred_probe field. I will consider 
>> re-using domain_list. Could you please clarify regarding ASSERT 
>> (where to put and what to check).
>
> What I meant is adding an ASSERT to check that np->domain_list is 
> empty, at least before trying to add the node to the list. This would 
> help to debug any potential issue if we end up using domain_list 
> earlier in the future. I can't see why we would, as the IOMMU code is 
> called earlier, but who knows :).

Got it. Thank you for clarification.


>>>> +
>>>>   static const struct iommu_ops *iommu_ops;
>>>>     const struct iommu_ops *iommu_get_ops(void)
>>>> @@ -42,7 +48,7 @@ void __init iommu_set_ops(const struct iommu_ops 
>>>> *ops)
>>>>     int __init iommu_hardware_setup(void)
>>>>   {
>>>> -    struct dt_device_node *np;
>>>> +    struct dt_device_node *np, *tmp;
>>>>       int rc;
>>>>       unsigned int num_iommus = 0;
>>>>   @@ -51,6 +57,33 @@ int __init iommu_hardware_setup(void)
>>>>           rc = device_init(np, DEVICE_IOMMU, NULL);
>>>>           if ( !rc )
>>>>               num_iommus++;
>>>> +        else if (rc == -EAGAIN)
>>>> +            /*
>>>> +             * Driver requested deferred probing, so add this 
>>>> device to
>>>> +             * the deferred list for further processing.
>>>> +             */
>>>> +            list_add(&np->deferred_probe, &deferred_probe_list);
>>>> +    }
>>>> +
>>>> +    /*
>>>> +     * Process devices in the deferred list if at least one 
>>>> successfully
>>>> +     * probed device is present.
>>>> +     */
>>>
>>> I think this can turn into an infinite loop if all devices in 
>>> deferred_probe_list still return -EAGAIN and num_iommus is 
>>> non-zero.
>>
>> Agree.
>>
>>
>>>
>>> A better condition would be to check that at least one IOMMU is 
>>> added at each loop. If not, then we should bail with an error 
>>> because it likely means something is buggy.
>>
>> Sounds reasonable. Will do.
>>
>>
>> Just to clarify:
>>
>>  >>> A better condition would be to check that at least one IOMMU is 
>> added at each loop.
>>
>> Maybe the condition should be not only that an IOMMU was added (rc == 0), 
>> but that the driver didn't request deferred probing (rc != -EAGAIN)?
>
> I think adding an IOMMU is enough. If you return an error other than 
> -EAGAIN here after deferring probing, then you are likely going to 
> fail at the next loop. So better to stop early.

It makes sense.


>
>
> I realize this is not what the current code is doing (I know I wrote 
> it ;)). But I am not sure it is sane to continue if only part of the 
> IOMMUs are initialized. Most likely you will see an error much later 
> that may not be trivial to track down.
>
> Imagine you want to pass through your network card to a guest but the 
> IOMMU initialization failed...

Oh, agree.

As I understand, the new strict logic would be the following:

If initialization of at least one IOMMU device fails (device_init 
returns an error other than -EAGAIN), we should stop and return an error 
to the upper layer (even if num_iommus > 0), no matter whether it is 
during the first attempt or after a deferred probe. We don't allow "I/O 
virtualisation" to be enabled (iommu_enabled == true) with only part of 
the IOMMU devices being initialized. Is my understanding correct?
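
To make sure we mean the same thing, here is a small standalone sketch of
how the results of one probing pass would be judged under that strict
rule. strict_pass_result() and the rcs[] array of device_init() return
values are hypothetical and only illustrate the decision, not the final
code:

```c
#include <errno.h>
#include <stddef.h>

/*
 * Judge one pass over the IOMMU nodes under the strict policy.
 * rcs[] holds the (hypothetical) device_init() return values.
 *
 *   0        every node in this pass was initialised
 *   -EAGAIN  no hard failure, but at least one probe was deferred
 *   other    first hard error; the caller must fail the whole setup,
 *            even if *num_iommus is already non-zero
 */
static int strict_pass_result(const int *rcs, size_t n,
                              unsigned int *num_iommus)
{
    int result = 0;
    size_t i;

    for ( i = 0; i < n; i++ )
    {
        if ( rcs[i] == 0 )
            (*num_iommus)++;
        else if ( rcs[i] == -EAGAIN )
            result = -EAGAIN;
        else
            return rcs[i];
    }

    return result;
}
```

So a pass returning { 0, -EAGAIN, -EINVAL } ends with -EINVAL although one
IOMMU was already counted, and iommu_enabled would stay false.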


>
> Cheers,
>
> -- 
> Julien Grall

-- 
Regards,

Oleksandr Tyshchenko



* Re: [Xen-devel] [PATCH V2 4/6] iommu/arm: Add lightweight iommu_fwspec support
  2019-08-02 16:39 ` [Xen-devel] [PATCH V2 4/6] iommu/arm: Add lightweight iommu_fwspec support Oleksandr Tyshchenko
@ 2019-08-13 12:39   ` Julien Grall
  2019-08-13 15:17     ` Oleksandr
  2019-08-13 13:40   ` Julien Grall
  1 sibling, 1 reply; 59+ messages in thread
From: Julien Grall @ 2019-08-13 12:39 UTC (permalink / raw)
  To: Oleksandr Tyshchenko, xen-devel; +Cc: Oleksandr Tyshchenko, sstabellini

Hi Oleksandr,

On 8/2/19 5:39 PM, Oleksandr Tyshchenko wrote:
> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> 
> We need to have some abstract way to add new device to the IOMMU
> based on the generic IOMMU DT binding [1] which can be used for
> both DT (right now) and ACPI (in future).
> 
> For that reason we can borrow the idea used in Linux these days
> called "iommu_fwspec". Having this in, it will be possible
> to configure IOMMU master interfaces of the device (device IDs)
> from a single common place and avoid keeping almost identifical look-up

s/identifical/identical/

> implementations in each IOMMU driver.
> 
> There is no need to port the whole implementation of "iommu_fwspec"
> to Xen, we could, probably, end up with a much simpler solution,
> some "stripped down" version which fits our requirments.

s/requirments/requirements/

> 
> So, this patch adds the following:
> 1. A common structure "iommu_fwspec" to hold the the per-device
>     firmware data
> 2. New member "iommu_fwspec" of struct device
> 3. Functions/helpers to deal with "dev->iommu_fwspec"
> 
> It should be noted that in comparing with original "iommu_fwspec"
> Xen's variant doesn't contain some fields, which are not really
> needed at the moment (ops, flag) and "iommu_fwnode" field was replaced
> by "iommu_dev" to avoid porting a lot of code (to support "fwnode_handle")
> with little benefit.
> 
> Next patch in this series will make use of that support.
> 
> [1] https://www.kernel.org/doc/Documentation/devicetree/bindings/iommu/iommu.txt
> 
> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> ---
>   xen/drivers/passthrough/arm/Makefile       |  2 +-
>   xen/drivers/passthrough/arm/iommu_fwspec.c | 91 ++++++++++++++++++++++++++++++
>   xen/include/asm-arm/device.h               |  1 +
>   xen/include/asm-arm/iommu.h                |  2 +
>   xen/include/asm-arm/iommu_fwspec.h         | 65 +++++++++++++++++++++
>   5 files changed, 160 insertions(+), 1 deletion(-)
>   create mode 100644 xen/drivers/passthrough/arm/iommu_fwspec.c
>   create mode 100644 xen/include/asm-arm/iommu_fwspec.h
> 
> diff --git a/xen/drivers/passthrough/arm/Makefile b/xen/drivers/passthrough/arm/Makefile
> index 4abb87a..5fbad45 100644
> --- a/xen/drivers/passthrough/arm/Makefile
> +++ b/xen/drivers/passthrough/arm/Makefile
> @@ -1,2 +1,2 @@
> -obj-y += iommu.o iommu_helpers.o
> +obj-y += iommu.o iommu_helpers.o iommu_fwspec.o
>   obj-$(CONFIG_ARM_SMMU) += smmu.o
> diff --git a/xen/drivers/passthrough/arm/iommu_fwspec.c b/xen/drivers/passthrough/arm/iommu_fwspec.c
> new file mode 100644
> index 0000000..3474192
> --- /dev/null
> +++ b/xen/drivers/passthrough/arm/iommu_fwspec.c
> @@ -0,0 +1,91 @@
> +/*
> + * xen/drivers/passthrough/arm/iommu_fwspec.c
> + *
> + * Contains functions to maintain per-device firmware data
> + *
> + * Based on Linux's iommu_fwspec support you can find at:
> + *    drivers/iommu/iommu.c
> + *
> + * Copyright (C) 2019 EPAM Systems Inc.
> + *
> + * This program is free software; you can redistribute it and/or
> + * modify it under the terms and conditions of the GNU General Public
> + * License, version 2, as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public
> + * License along with this program; If not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#include <xen/lib.h>
> +#include <xen/iommu.h>

Please order the headers alphabetically.

NIT: Can you add a newline between xen and asm headers?

> +#include <asm/device.h>
> +#include <asm/iommu_fwspec.h>

> +
> +int iommu_fwspec_init(struct device *dev, struct device *iommu_dev)
> +{
> +    struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
> +
> +    if ( fwspec )
> +        return 0;
> +
> +    fwspec = xzalloc(struct iommu_fwspec);
> +    if ( !fwspec )
> +        return -ENOMEM;
> +
> +    fwspec->iommu_dev = iommu_dev;
> +    dev_iommu_fwspec_set(dev, fwspec);
> +
> +    return 0;
> +}
> +
> +void iommu_fwspec_free(struct device *dev)
> +{
> +    struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
> +
> +    if ( fwspec )

xfree is able to deal with a NULL pointer, so the check is not necessary.

> +    {
> +        xfree(fwspec);
> +        dev_iommu_fwspec_set(dev, NULL);
> +    }
> +}
> +
> +int iommu_fwspec_add_ids(struct device *dev, uint32_t *ids, int num_ids)

While I realize the prototype is coming from Linux, num_ids cannot be 
negative (the code below would not work properly). So the parameter 
should be unsigned.

> +{
> +    struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
> +    size_t size;
> +    int i;

Any variable that can't be negative should be unsigned.
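
A minimal sketch of the function with unsigned types, assuming realloc() as a stand-in for Xen's _xrealloc() and cut-down structures so that it builds on its own (the const on ids is also an addition, not part of the patch):

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Cut-down stand-ins for the structures in the patch. */
struct iommu_fwspec {
    unsigned int num_ids;
    uint32_t ids[1];
};

struct device {
    struct iommu_fwspec *iommu_fwspec;
};

int iommu_fwspec_add_ids(struct device *dev, const uint32_t *ids,
                         unsigned int num_ids)
{
    struct iommu_fwspec *fwspec = dev->iommu_fwspec;
    size_t size;
    unsigned int i;

    if ( !fwspec )
        return -EINVAL;

    /* Bytes needed to hold the existing IDs plus the new ones. */
    size = offsetof(struct iommu_fwspec, ids) +
           ((size_t)fwspec->num_ids + num_ids) * sizeof(fwspec->ids[0]);
    if ( size > sizeof(*fwspec) )
    {
        fwspec = realloc(fwspec, size); /* _xrealloc() in the patch */
        if ( !fwspec )
            return -ENOMEM;

        dev->iommu_fwspec = fwspec;
    }

    for ( i = 0; i < num_ids; i++ )
        fwspec->ids[fwspec->num_ids + i] = ids[i];

    fwspec->num_ids += num_ids;

    return 0;
}
```

With both the count and the loop index unsigned, a negative value can no longer reach the size computation or the copy loop.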

> +
> +    if ( !fwspec )
> +        return -EINVAL;
> +
> +    size = offsetof(struct iommu_fwspec, ids[fwspec->num_ids + num_ids]);
> +    if ( size > sizeof(*fwspec) )
> +    {
> +        fwspec = _xrealloc(fwspec, size, sizeof(void *));
> +        if ( !fwspec )
> +            return -ENOMEM;
> +
> +        dev_iommu_fwspec_set(dev, fwspec);
> +    }
> +
> +    for ( i = 0; i < num_ids; i++ )
> +        fwspec->ids[fwspec->num_ids + i] = ids[i];
> +
> +    fwspec->num_ids += num_ids;
> +
> +    return 0;
> +}
> +
> +/*
> + * Local variables:
> + * mode: C
> + * c-file-style: "BSD"
> + * c-basic-offset: 4
> + * indent-tabs-mode: nil
> + * End:
> + */
> diff --git a/xen/include/asm-arm/device.h b/xen/include/asm-arm/device.h
> index ee1c3bc..ee7cff2 100644
> --- a/xen/include/asm-arm/device.h
> +++ b/xen/include/asm-arm/device.h
> @@ -18,6 +18,7 @@ struct device
>       struct dt_device_node *of_node; /* Used by drivers imported from Linux */
>   #endif
>       struct dev_archdata archdata;
> +    struct iommu_fwspec *iommu_fwspec; /* per-device IOMMU instance data */
>   };
>   
>   typedef struct device device_t;
> diff --git a/xen/include/asm-arm/iommu.h b/xen/include/asm-arm/iommu.h
> index 20d865e..1853bd9 100644
> --- a/xen/include/asm-arm/iommu.h
> +++ b/xen/include/asm-arm/iommu.h
> @@ -14,6 +14,8 @@
>   #ifndef __ARCH_ARM_IOMMU_H__
>   #define __ARCH_ARM_IOMMU_H__
>   
> +#include <asm/iommu_fwspec.h>

iommu.h does not seem to use anything defined in iommu_fwspec.h. So why 
do you include it here?

> +
>   struct arch_iommu
>   {
>       /* Private information for the IOMMU drivers */
> diff --git a/xen/include/asm-arm/iommu_fwspec.h b/xen/include/asm-arm/iommu_fwspec.h
> new file mode 100644
> index 0000000..0676285
> --- /dev/null
> +++ b/xen/include/asm-arm/iommu_fwspec.h
> @@ -0,0 +1,65 @@
> +/*
> + * xen/include/asm-arm/iommu_fwspec.h
> + *
> + * Contains a common structure to hold the per-device firmware data and
> + * declaration of functions used to maintain that data
> + *
> + * Based on Linux's iommu_fwspec support you can find at:
> + *    include/linux/iommu.h
> + *
> + * Copyright (C) 2019 EPAM Systems Inc.
> + *
> + * This program is free software; you can redistribute it and/or
> + * modify it under the terms and conditions of the GNU General Public
> + * License, version 2, as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public
> + * License along with this program; If not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#ifndef __ARCH_ARM_IOMMU_FWSPEC_H__
> +#define __ARCH_ARM_IOMMU_FWSPEC_H__
> +
> +/* per-device IOMMU instance data */
> +struct iommu_fwspec {
> +    /* device which represents this IOMMU H/W */

Did you intend to say "this device's IOMMU"?

> +    struct device *iommu_dev;
> +    /* IOMMU driver private data for this device */
> +    void *iommu_priv;
> +    /* number of associated device IDs */
> +    unsigned int num_ids;
> +    /* IDs which this device may present to the IOMMU */
> +    uint32_t ids[1];
> +};
> +
> +int iommu_fwspec_init(struct device *dev, struct device *iommu_dev);
> +void iommu_fwspec_free(struct device *dev);
> +int iommu_fwspec_add_ids(struct device *dev, uint32_t *ids, int num_ids);
> +
> +static inline struct iommu_fwspec *dev_iommu_fwspec_get(struct device *dev)
> +{
> +    return dev->iommu_fwspec;
> +}
> +
> +static inline void dev_iommu_fwspec_set(struct device *dev,
> +                                        struct iommu_fwspec *fwspec)
> +{
> +    dev->iommu_fwspec = fwspec;
> +}
> +
> +#endif /* __ARCH_ARM_IOMMU_FWSPEC_H__ */
> +
> +/*
> + * Local variables:
> + * mode: C
> + * c-file-style: "BSD"
> + * c-basic-offset: 4
> + * tab-width: 4
> + * indent-tabs-mode: nil
> + * End:
> + */
> 

Cheers,

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel


* Re: [Xen-devel] [PATCH V2 4/6] iommu/arm: Add lightweight iommu_fwspec support
  2019-08-02 16:39 ` [Xen-devel] [PATCH V2 4/6] iommu/arm: Add lightweight iommu_fwspec support Oleksandr Tyshchenko
  2019-08-13 12:39   ` Julien Grall
@ 2019-08-13 13:40   ` Julien Grall
  2019-08-13 16:28     ` Oleksandr
  1 sibling, 1 reply; 59+ messages in thread
From: Julien Grall @ 2019-08-13 13:40 UTC (permalink / raw)
  To: Oleksandr Tyshchenko, xen-devel; +Cc: Oleksandr Tyshchenko, sstabellini

Hi Oleksandr,

One more comment :).

On 8/2/19 5:39 PM, Oleksandr Tyshchenko wrote:
> +int iommu_fwspec_init(struct device *dev, struct device *iommu_dev)
> +{
> +    struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
> +
> +    if ( fwspec )

I would actually check that the iommu_dev passed as a parameter is the same as 
the one stored. We expect each device to be protected by only one IOMMU, so 
it is better to be safe than to allow overriding ;).

Cheers,

-- 
Julien Grall


* Re: [Xen-devel] [PATCH V2 5/6] iommu/arm: Introduce iommu_add_dt_device API
  2019-08-02 16:39 ` [Xen-devel] [PATCH V2 5/6] iommu/arm: Introduce iommu_add_dt_device API Oleksandr Tyshchenko
@ 2019-08-13 13:49   ` Julien Grall
  2019-08-13 16:05     ` Oleksandr
  0 siblings, 1 reply; 59+ messages in thread
From: Julien Grall @ 2019-08-13 13:49 UTC (permalink / raw)
  To: Oleksandr Tyshchenko, xen-devel; +Cc: Oleksandr Tyshchenko, sstabellini

Hi Oleksandr,

On 8/2/19 5:39 PM, Oleksandr Tyshchenko wrote:
> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> 
> This patch adds new iommu_add_dt_device API for adding DT device
> to the IOMMU using generic IOMMU DT binding [1] and previously
> added "iommu_fwspec" support.
> 
> New function parses the DT binding, prepares "dev->iommu_fwspec"
> with correct information and calls the IOMMU driver using "add_device"
> callback to register new DT device.
> The IOMMU driver's responsibility is to check whether "dev->iommu_fwspec"
> is initialized and mark that device as protected.
> 
> The additional benefit here is to avoid to go through the whole DT
> multiple times in IOMMU driver trying to locate master devices which
> belong to each IOMMU device being probed.
> 
> The upcoming IPMMU driver will have "add_device" callback implemented.
> 
> I hope, this patch won't break SMMU driver's functionality,
> which doesn't have this callback implemented.

The last two sentences do not really belong to the commit message. So I 
think they should go after ---.

Anyway, the only concern for the SMMU is to not break the old bindings. 
New bindings are not supported, so it does not matter whether they are 
broken or not. Once this series is merged, we can have a look at how new 
bindings for the SMMU can be supported.

> 
> [1] https://www.kernel.org/doc/Documentation/devicetree/bindings/iommu/iommu.txt
> 
> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> ---
>   xen/arch/arm/domain_build.c         | 12 ++++++++++
>   xen/drivers/passthrough/arm/iommu.c | 45 +++++++++++++++++++++++++++++++++++++
>   xen/include/asm-arm/iommu.h         |  3 +++
>   3 files changed, 60 insertions(+)
> 
> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
> index d983677..d67f7d4 100644
> --- a/xen/arch/arm/domain_build.c
> +++ b/xen/arch/arm/domain_build.c
> @@ -1241,6 +1241,18 @@ static int __init handle_device(struct domain *d, struct dt_device_node *dev,
>       u64 addr, size;
>       bool need_mapping = !dt_device_for_passthrough(dev);
>   
> +    if ( dt_parse_phandle(dev, "iommus", 0) )

I don't particularly like this check. dt_parse_phandle is non-trivial to 
execute.

TBH, what we should do is try to call iommu_add_dt_device if the IOMMU is 
enabled. We can then return a recognizable value to tell that the device is 
not protected.
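
A rough sketch of the caller side under one possible convention (an assumption for illustration only): a negative value is an error, 0 means the device was added, and a positive value means the device is not behind an IOMMU. The node structure and the stub iommu_add_dt_device() below are stand-ins so that the snippet builds on its own; handle_device_iommu() is a made-up name for the relevant part of handle_device().

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Stand-in for struct dt_device_node with just enough state for the demo. */
struct dt_device_node {
    const char *full_name;
    bool has_iommus;   /* node carries an "iommus" property */
    bool add_fails;    /* simulate a driver failure */
};

/* Stub: <0 on error, 0 when added, >0 when the device is not protected. */
static int iommu_add_dt_device(struct dt_device_node *np)
{
    if ( !np->has_iommus )
        return 1;

    return np->add_fails ? -ENODEV : 0;
}

/* Hypothetical excerpt of handle_device(): no separate phandle lookup. */
int handle_device_iommu(struct dt_device_node *dev)
{
    int res = iommu_add_dt_device(dev);

    if ( res < 0 )
    {
        fprintf(stderr, "Failed to add %s to the IOMMU\n", dev->full_name);
        return res;
    }

    /* res > 0: the device is not protected by an IOMMU, simply carry on. */
    return 0;
}
```

The caller only distinguishes failure from everything else, so the "iommus" property is parsed once, inside iommu_add_dt_device().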

> +    {
> +        dt_dprintk("%s add to iommu\n", dt_node_full_name(dev));
> +        res = iommu_add_dt_device(dev);
> +        if ( res )
> +        {
> +            printk(XENLOG_ERR "Failed to add %s to the IOMMU\n",
> +                   dt_node_full_name(dev));
> +            return res;
> +        }
> +    }
> +
>       nirq = dt_number_of_irq(dev);
>       naddr = dt_number_of_address(dev);
>   
> diff --git a/xen/drivers/passthrough/arm/iommu.c b/xen/drivers/passthrough/arm/iommu.c
> index 3195919..19516af 100644
> --- a/xen/drivers/passthrough/arm/iommu.c
> +++ b/xen/drivers/passthrough/arm/iommu.c
> @@ -113,3 +113,48 @@ int arch_iommu_populate_page_table(struct domain *d)
>   void __hwdom_init arch_iommu_hwdom_init(struct domain *d)
>   {
>   }
> +
> +int __init iommu_add_dt_device(struct dt_device_node *np)
> +{
> +    const struct iommu_ops *ops = iommu_get_ops();
> +    struct dt_phandle_args iommu_spec;
> +    struct device *dev = dt_to_dev(np);
> +    int rc = 1, index = 0;
> +
> +    if ( !iommu_enabled || !ops || !ops->add_device )
> +        return 0;

While I agree that !iommu_enabled should return 0, for the two others I 
am not entirely sure this is the right thing to do.

!ops is definitely an error because if you have the IOMMU enabled then 
you should have ops installed.

!ops->add_device means that you will not be able to add any device using 
the new bindings. IOW, the device will be unusable later on as most 
likely the IOMMU was configured to deny any transaction. Therefore, this 
should return an error.
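
The three cases could be split roughly as follows. This is only a sketch: the structures are cut-down stand-ins, installed_ops replaces iommu_get_ops(), demo_add_device() is a made-up callback, and -EINVAL is an assumed error code for both failure cases.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Cut-down stand-ins for the Xen definitions. */
struct device {
    void *iommu_fwspec;
};

struct iommu_ops {
    int (*add_device)(uint8_t devfn, struct device *dev);
};

static bool iommu_enabled;
static const struct iommu_ops *installed_ops; /* iommu_get_ops() in Xen */

/* Made-up driver callback, only here to exercise the success path. */
static int demo_add_device(uint8_t devfn, struct device *dev)
{
    (void)devfn;
    (void)dev;

    return 0;
}

int iommu_add_dt_device(struct device *dev)
{
    const struct iommu_ops *ops = installed_ops;

    /* No IOMMU in use: nothing to do, and not an error. */
    if ( !iommu_enabled )
        return 0;

    /* IOMMU enabled without ops installed is a bug. */
    if ( !ops )
        return -EINVAL;

    /* The driver cannot add devices described with the generic bindings. */
    if ( !ops->add_device )
        return -EINVAL;

    /* The DT parsing and fwspec setup from the patch would go here. */
    return ops->add_device(0, dev);
}
```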

> +
> +    if ( dev_iommu_fwspec_get(dev) )
> +        return -EEXIST;
> +
> +    /* According to the Documentation/devicetree/bindings/iommu/iommu.txt */

This file does not exist in Xen, so Linux should at least be mentioned 
in the comment.

> +    while ( !dt_parse_phandle_with_args(np, "iommus", "#iommu-cells",
> +                                        index, &iommu_spec) )
> +    {
> +        if ( !dt_device_is_available(iommu_spec.np) )
> +            break;
> +
> +        rc = iommu_fwspec_init(dev, &iommu_spec.np->dev);
> +        if ( rc )
> +            break;
> +
> +        rc = iommu_fwspec_add_ids(dev, iommu_spec.args, 1);

Here you assume that there will always be at least one cell and that the 
first cell is the ID. To start with, #iommu-cells may be 0 (and therefore 
there are no cells) when the master IOMMU device cannot be configured.

Furthermore, the content of the #iommu-cells depends on the driver. This 
is why Linux provides an of_xlate callback to let the driver decide how 
to interpret it.

For instance, the SMMU can support either 1 or 2 cells. It may also need 
to look up other properties in the node (e.g. stream-match-mask).

So I think we also want to provide an of_xlate callback in Xen.
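
To illustrate the idea, here is a sketch of what such a hook could look like. Everything in it is an assumption: the callback signature is modelled loosely on Linux's of_xlate, the structures are cut down, and demo_of_xlate() belongs to a made-up driver that accepts one cell (an ID) or two cells (an ID plus a mask packed into the upper 16 bits).

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_PHANDLE_ARGS 16
#define DEMO_MAX_IDS     4

/* Cut-down stand-ins for the Xen structures. */
struct device {
    unsigned int num_ids;
    uint32_t ids[DEMO_MAX_IDS];
};

struct dt_phandle_args {
    int args_count;
    uint32_t args[MAX_PHANDLE_ARGS];
};

struct iommu_ops {
    /* Hypothetical hook: the driver interprets the specifier cells. */
    int (*of_xlate)(struct device *dev, const struct dt_phandle_args *spec);
};

/* Made-up driver: one cell is an ID, two cells are an ID and a mask. */
static int demo_of_xlate(struct device *dev,
                         const struct dt_phandle_args *spec)
{
    uint32_t id;

    if ( spec->args_count < 1 || spec->args_count > 2 )
        return -EINVAL;

    if ( dev->num_ids >= DEMO_MAX_IDS )
        return -ENOSPC;

    id = spec->args[0] & 0xffff;
    if ( spec->args_count == 2 )
        id |= (spec->args[1] & 0xffff) << 16;

    dev->ids[dev->num_ids++] = id;

    return 0;
}

/* Generic side: hand each "iommus" entry to the driver untouched. */
int iommu_xlate_one(const struct iommu_ops *ops, struct device *dev,
                    const struct dt_phandle_args *spec)
{
    if ( !ops || !ops->of_xlate )
        return -EINVAL;

    return ops->of_xlate(dev, spec);
}
```

The generic code never looks inside the cells, so a driver with zero, one or two cells can each apply its own rules.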

> +        if ( rc )
> +            break;
> +
> +        index++;
> +    }
> +
> +    /*
> +     * Add DT device to the IOMMU if latter is present and available.
> +     * The IOMMU driver's responsibility is to check whether dev->iommu_fwspec
> +     * field is initialized and mark that device as protected.
> +     */
> +    if ( !rc )
> +        rc = ops->add_device(0, dev);
> +
> +    if ( rc < 0 )
> +        iommu_fwspec_free(dev);
> +
> +    return rc < 0 ? rc : 0;
> +}
> diff --git a/xen/include/asm-arm/iommu.h b/xen/include/asm-arm/iommu.h
> index 1853bd9..06b07fa 100644
> --- a/xen/include/asm-arm/iommu.h
> +++ b/xen/include/asm-arm/iommu.h
> @@ -28,6 +28,9 @@ struct arch_iommu
>   const struct iommu_ops *iommu_get_ops(void);
>   void iommu_set_ops(const struct iommu_ops *ops);
>   
> +/* helper to add DT device to the IOMMU */
> +int iommu_add_dt_device(struct dt_device_node *np);
> +
>   /* mapping helpers */
>   int __must_check arm_iommu_map_page(struct domain *d, dfn_t dfn, mfn_t mfn,
>                                       unsigned int flags,
> 

Cheers,

-- 
Julien Grall


* Re: [Xen-devel] [PATCH V2 4/6] iommu/arm: Add lightweight iommu_fwspec support
  2019-08-13 12:39   ` Julien Grall
@ 2019-08-13 15:17     ` Oleksandr
  2019-08-13 15:28       ` Julien Grall
  0 siblings, 1 reply; 59+ messages in thread
From: Oleksandr @ 2019-08-13 15:17 UTC (permalink / raw)
  To: Julien Grall, xen-devel; +Cc: Oleksandr Tyshchenko, sstabellini


On 13.08.19 15:39, Julien Grall wrote:
> Hi Oleksandr,

Hi Julien.


>
> On 8/2/19 5:39 PM, Oleksandr Tyshchenko wrote:
>> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
>>
>> We need to have some abstract way to add new device to the IOMMU
>> based on the generic IOMMU DT binding [1] which can be used for
>> both DT (right now) and ACPI (in future).
>>
>> For that reason we can borrow the idea used in Linux these days
>> called "iommu_fwspec". Having this in, it will be possible
>> to configure IOMMU master interfaces of the device (device IDs)
>> from a single common place and avoid keeping almost identifical look-up
>
> s/identifical/identical/

ok


>
>> implementations in each IOMMU driver.
>>
>> There is no need to port the whole implementation of "iommu_fwspec"
>> to Xen, we could, probably, end up with a much simpler solution,
>> some "stripped down" version which fits our requirments.
>
> s/requirments/requirements/

ok


>
>> + */
>> +
>> +#include <xen/lib.h>
>> +#include <xen/iommu.h>
>
> Please order the headers alphabetically.
>
> NIT: Can you a newline between xen and asm headers?

Will do


>
>> +#include <asm/device.h>
>> +#include <asm/iommu_fwspec.h>
>
>> +
>> +int iommu_fwspec_init(struct device *dev, struct device *iommu_dev)
>> +{
>> +    struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
>> +
>> +    if ( fwspec )
>> +        return 0;
>> +
>> +    fwspec = xzalloc(struct iommu_fwspec);
>> +    if ( !fwspec )
>> +        return -ENOMEM;
>> +
>> +    fwspec->iommu_dev = iommu_dev;
>> +    dev_iommu_fwspec_set(dev, fwspec);
>> +
>> +    return 0;
>> +}
>> +
>> +void iommu_fwspec_free(struct device *dev)
>> +{
>> +    struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
>> +
>> +    if ( fwspec )
>
> xfree is able to deal with NULL pointer, so the check is not necessary.

Yes, the reason I left this check is to not perform an extra operation 
(dev_iommu_fwspec_set). Shall I drop this check anyway?

>
>> +    {
>> +        xfree(fwspec);
>> +        dev_iommu_fwspec_set(dev, NULL);
>> +    }
>> +}
>> +
>> +int iommu_fwspec_add_ids(struct device *dev, uint32_t *ids, int 
>> num_ids)
>
> While I realize the prototype is coming from Linux, num_ids cannot be 
> negative (the code below would not work properly). So the parameter 
> should be unsigned.

Agree, will use unsigned.


>
>> +{
>> +    struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
>> +    size_t size;
>> +    int i;
>
> Any variable that can't be negative should be unsigned.

Yes, will follow.


>> diff --git a/xen/include/asm-arm/iommu.h b/xen/include/asm-arm/iommu.h
>> index 20d865e..1853bd9 100644
>> --- a/xen/include/asm-arm/iommu.h
>> +++ b/xen/include/asm-arm/iommu.h
>> @@ -14,6 +14,8 @@
>>   #ifndef __ARCH_ARM_IOMMU_H__
>>   #define __ARCH_ARM_IOMMU_H__
>>   +#include <asm/iommu_fwspec.h>
>
> iommu.h does not seem to use anything defined in iommu_fwspec.h. So 
> why do you include it here?

I was thinking that every source which includes iommu.h will get 
iommu_fwspec.h included indirectly. No need to include iommu_fwspec.h in 
many sources.

That was the reason. Shall I include it directly where needed?


>
>> +
>>   struct arch_iommu
>>   {
>>       /* Private information for the IOMMU drivers */
>> diff --git a/xen/include/asm-arm/iommu_fwspec.h 
>> b/xen/include/asm-arm/iommu_fwspec.h
>> new file mode 100644
>> index 0000000..0676285
>> --- /dev/null
>> +++ b/xen/include/asm-arm/iommu_fwspec.h
>> @@ -0,0 +1,65 @@
>> +/*
>> + * xen/include/asm-arm/iommu_fwspec.h
>> + *
>> + * Contains a common structure to hold the per-device firmware data and
>> + * declaration of functions used to maintain that data
>> + *
>> + * Based on Linux's iommu_fwspec support you can find at:
>> + *    include/linux/iommu.h
>> + *
>> + * Copyright (C) 2019 EPAM Systems Inc.
>> + *
>> + * This program is free software; you can redistribute it and/or
>> + * modify it under the terms and conditions of the GNU General Public
>> + * License, version 2, as published by the Free Software Foundation.
>> + *
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>> + * General Public License for more details.
>> + *
>> + * You should have received a copy of the GNU General Public
>> + * License along with this program; If not, see 
>> <http://www.gnu.org/licenses/>.
>> + */
>> +
>> +#ifndef __ARCH_ARM_IOMMU_FWSPEC_H__
>> +#define __ARCH_ARM_IOMMU_FWSPEC_H__
>> +
>> +/* per-device IOMMU instance data */
>> +struct iommu_fwspec {
>> +    /* device which represents this IOMMU H/W */
>
> Did you intend to say "this device's IOMMU"?

Exactly)


-- 
Regards,

Oleksandr Tyshchenko



* Re: [Xen-devel] [PATCH V2 4/6] iommu/arm: Add lightweight iommu_fwspec support
  2019-08-13 15:17     ` Oleksandr
@ 2019-08-13 15:28       ` Julien Grall
  2019-08-13 16:18         ` Oleksandr
  0 siblings, 1 reply; 59+ messages in thread
From: Julien Grall @ 2019-08-13 15:28 UTC (permalink / raw)
  To: Oleksandr, xen-devel; +Cc: Oleksandr Tyshchenko, sstabellini

Hi,

On 8/13/19 4:17 PM, Oleksandr wrote:
> 
> On 13.08.19 15:39, Julien Grall wrote:
>>
>> xfree is able to deal with NULL pointer, so the check is not necessary.
> 
> Yes, the reason I left this check is to not perform an extra operation 
> (dev_iommu_fwspec_set). Shall I drop this check anyway?

I can't see any issue with doing the extra operation. This is not a hot path 
and it is harmless.
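
For reference, a sketch of the simplified function, with free() standing in for xfree() and a cut-down struct device so that it builds on its own:

```c
#include <stdlib.h>

struct iommu_fwspec;

/* Cut-down stand-in for Xen's struct device. */
struct device {
    struct iommu_fwspec *iommu_fwspec;
};

void iommu_fwspec_free(struct device *dev)
{
    /* Like xfree(), free() accepts NULL, so no check is needed. */
    free(dev->iommu_fwspec);
    dev->iommu_fwspec = NULL;
}
```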


>>> diff --git a/xen/include/asm-arm/iommu.h b/xen/include/asm-arm/iommu.h
>>> index 20d865e..1853bd9 100644
>>> --- a/xen/include/asm-arm/iommu.h
>>> +++ b/xen/include/asm-arm/iommu.h
>>> @@ -14,6 +14,8 @@
>>>   #ifndef __ARCH_ARM_IOMMU_H__
>>>   #define __ARCH_ARM_IOMMU_H__
>>>   +#include <asm/iommu_fwspec.h>
>>
>> iommu.h does not seem to use anything defined in iommu_fwspec.h. So 
>> why do you include it here?
> 
> I was thinking that every source which includes iommu.h will get 
> iommu_fwspec.h included indirectly. No need to include iommu_fwspec.h in 
> many sources.
> 
> This was a reason. Shall I included it directly where needed?

There are a few cases where iommu.h is required but not iommu_fwspec.h. 
In general, I would prefer if headers are only included when strictly 
necessary.

Cheers,

-- 
Julien Grall


* Re: [Xen-devel] [PATCH V2 5/6] iommu/arm: Introduce iommu_add_dt_device API
  2019-08-13 13:49   ` Julien Grall
@ 2019-08-13 16:05     ` Oleksandr
  2019-08-13 17:13       ` Julien Grall
  0 siblings, 1 reply; 59+ messages in thread
From: Oleksandr @ 2019-08-13 16:05 UTC (permalink / raw)
  To: Julien Grall, xen-devel; +Cc: Oleksandr Tyshchenko, sstabellini


On 13.08.19 16:49, Julien Grall wrote:
> Hi Oleksandr,

Hi Julien


>
> On 8/2/19 5:39 PM, Oleksandr Tyshchenko wrote:
>> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
>>
>> This patch adds new iommu_add_dt_device API for adding DT device
>> to the IOMMU using generic IOMMU DT binding [1] and previously
>> added "iommu_fwspec" support.
>>
>> New function parses the DT binding, prepares "dev->iommu_fwspec"
>> with correct information and calls the IOMMU driver using "add_device"
>> callback to register new DT device.
>> The IOMMU driver's responsibility is to check whether 
>> "dev->iommu_fwspec"
>> is initialized and mark that device as protected.
>>
>> The additional benefit here is to avoid to go through the whole DT
>> multiple times in IOMMU driver trying to locate master devices which
>> belong to each IOMMU device being probed.
>>
>> The upcoming IPMMU driver will have "add_device" callback implemented.
>>
>> I hope, this patch won't break SMMU driver's functionality,
>> which doesn't have this callback implemented.
>
> The last two sentence does not really belong to the commit message. So 
> I think they should go after ---.
>
> Anyway, the only concern for the SMMU is to not break the old 
> bindings. New bindings are not supported, so it does not matter 
> whether they are broken or not. Once this series is merged, we can 
> have a look how new bindings for the SMMU can be supported.

Sounds reasonable.


>
>>
>> [1] 
>> https://www.kernel.org/doc/Documentation/devicetree/bindings/iommu/iommu.txt
>>
>> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
>> ---
>>   xen/arch/arm/domain_build.c         | 12 ++++++++++
>>   xen/drivers/passthrough/arm/iommu.c | 45 
>> +++++++++++++++++++++++++++++++++++++
>>   xen/include/asm-arm/iommu.h         |  3 +++
>>   3 files changed, 60 insertions(+)
>>
>> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
>> index d983677..d67f7d4 100644
>> --- a/xen/arch/arm/domain_build.c
>> +++ b/xen/arch/arm/domain_build.c
>> @@ -1241,6 +1241,18 @@ static int __init handle_device(struct domain 
>> *d, struct dt_device_node *dev,
>>       u64 addr, size;
>>       bool need_mapping = !dt_device_for_passthrough(dev);
>>   +    if ( dt_parse_phandle(dev, "iommus", 0) )
>
> I don't particularly like this check. dt_parse_phandle is non-trivial 
> to execute.
>
> TBH, what we should do is trying to call iommu_add_dt_device if IOMMU 
> is enabled. We can then return a recognizable value to tell the device 
> is not protected.

Well. Don't really mind.


>
>> +    {
>> +        dt_dprintk("%s add to iommu\n", dt_node_full_name(dev));
>> +        res = iommu_add_dt_device(dev);
>> +        if ( res )
>> +        {
>> +            printk(XENLOG_ERR "Failed to add %s to the IOMMU\n",
>> +                   dt_node_full_name(dev));
>> +            return res;
>> +        }
>> +    }
>> +
>>       nirq = dt_number_of_irq(dev);
>>       naddr = dt_number_of_address(dev);
>>   diff --git a/xen/drivers/passthrough/arm/iommu.c 
>> b/xen/drivers/passthrough/arm/iommu.c
>> index 3195919..19516af 100644
>> --- a/xen/drivers/passthrough/arm/iommu.c
>> +++ b/xen/drivers/passthrough/arm/iommu.c
>> @@ -113,3 +113,48 @@ int arch_iommu_populate_page_table(struct domain 
>> *d)
>>   void __hwdom_init arch_iommu_hwdom_init(struct domain *d)
>>   {
>>   }
>> +
>> +int __init iommu_add_dt_device(struct dt_device_node *np)
>> +{
>> +    const struct iommu_ops *ops = iommu_get_ops();
>> +    struct dt_phandle_args iommu_spec;
>> +    struct device *dev = dt_to_dev(np);
>> +    int rc = 1, index = 0;
>> +
>> +    if ( !iommu_enabled || !ops || !ops->add_device )
>> +        return 0;
>
> While I agree that !iommu_enabled should return 0, for the two others 
> I am not entirely sure this is the right thing to do.
>
> !ops is definitely an error because if you have the IOMMU enabled then 
> you should have ops installed.

Agree.


>
>
> !ops->add_device means that you will not be able to add any device 
> using the new bindings. IOW, the device will be unusable later one as 
> most likely the IOMMU was configured to deny any transaction. 
> Therefore, this should return an error.

The initial reason *was* to not break the SMMU, which doesn't have the 
add_device callback implemented yet. But I got your point regarding the SMMU 
above (the only concern for the SMMU is to not break the old bindings), so 
I agree here.


>
>> +
>> +    if ( dev_iommu_fwspec_get(dev) )
>> +        return -EEXIST;
>> +
>> +    /* According to the 
>> Documentation/devicetree/bindings/iommu/iommu.txt */
>
> This file does not exist in Xen, so Linux should at least be mentioned 
> in the comment.

Will do.


>
>> +    while ( !dt_parse_phandle_with_args(np, "iommus", "#iommu-cells",
>> +                                        index, &iommu_spec) )
>> +    {
>> +        if ( !dt_device_is_available(iommu_spec.np) )
>> +            break;
>> +
>> +        rc = iommu_fwspec_init(dev, &iommu_spec.np->dev);
>> +        if ( rc )
>> +            break;
>> +
>> +        rc = iommu_fwspec_add_ids(dev, iommu_spec.args, 1);
>
> Here you assume that there will at least always be one cells and the 
> first cell is the IDs. For a first, #iommu-cells may be 0 (and 
> therefore no cells) when the master IOMMU device cannot be configured.
>
> Furthermore, the content of the #iommu-cells depends on the driver. 
> This is why Linux provides a callback of_xlate to let the driver 
> decide how to interpret it.
>
> For instance, the SMMU can support either 1 or 2 cells. It also may 
> need to look-up for other properties in the node (e.g stream-match-mask).
>
> So I think we also want to provide the of_xlate in Xen.


Hmm, I was thinking about how to end up with only one re-used callback 
(add_device); I really didn't want to add a new one (of_xlate). But I 
didn't take into account that this stuff is really 
driver-dependent. So, likely yes, we need to provide an of_xlate callback.


May I ask some questions to clarify:

1. Do you want me to introduce of_xlate callback in a separate patch 
(under CONFIG_ARM?)?

2. Can we avoid introducing new API for that callback? 
iommu_add_dt_device will be able to call it directly.


-- 
Regards,

Oleksandr Tyshchenko



* Re: [Xen-devel] [PATCH V2 4/6] iommu/arm: Add lightweight iommu_fwspec support
  2019-08-13 15:28       ` Julien Grall
@ 2019-08-13 16:18         ` Oleksandr
  0 siblings, 0 replies; 59+ messages in thread
From: Oleksandr @ 2019-08-13 16:18 UTC (permalink / raw)
  To: Julien Grall, xen-devel; +Cc: Oleksandr Tyshchenko, sstabellini


On 13.08.19 18:28, Julien Grall wrote:
> Hi,

Hi Julien


>
> On 8/13/19 4:17 PM, Oleksandr wrote:
>>
>> On 13.08.19 15:39, Julien Grall wrote:
>>>
>>> xfree is able to deal with NULL pointer, so the check is not necessary.
>>
>> Yes, the reason I left this check is to not perform an extra 
>> operation (dev_iommu_fwspec_set). Shall I drop this check anyway?
>
> I can't see any issue to do the extra operation. This is not hotpath 
> and it is harmless.

ok, will drop.


>
>
>>>> diff --git a/xen/include/asm-arm/iommu.h b/xen/include/asm-arm/iommu.h
>>>> index 20d865e..1853bd9 100644
>>>> --- a/xen/include/asm-arm/iommu.h
>>>> +++ b/xen/include/asm-arm/iommu.h
>>>> @@ -14,6 +14,8 @@
>>>>   #ifndef __ARCH_ARM_IOMMU_H__
>>>>   #define __ARCH_ARM_IOMMU_H__
>>>>   +#include <asm/iommu_fwspec.h>
>>>
>>> iommu.h does not seem to use anything defined in iommu_fwspec.h. So 
>>> why do you include it here?
>>
>> I was thinking that every source which includes iommu.h will get 
>> iommu_fwspec.h included indirectly. No need to include iommu_fwspec.h 
>> in many sources.
>>
>> This was a reason. Shall I included it directly where needed?
>
> There are a few cases where iommu.h is required but not 
> iommu_fwspec.h. In general, I would prefer if headers are only 
> included when strictly necessary.

got it, will drop from here and include where necessary.


-- 
Regards,

Oleksandr Tyshchenko



* Re: [Xen-devel] [PATCH V2 4/6] iommu/arm: Add lightweight iommu_fwspec support
  2019-08-13 13:40   ` Julien Grall
@ 2019-08-13 16:28     ` Oleksandr
  0 siblings, 0 replies; 59+ messages in thread
From: Oleksandr @ 2019-08-13 16:28 UTC (permalink / raw)
  To: Julien Grall, xen-devel; +Cc: Oleksandr Tyshchenko, sstabellini


On 13.08.19 16:40, Julien Grall wrote:
> Hi Oleksandr,

Hi Julien.


>
> One more comment :).
>
> On 8/2/19 5:39 PM, Oleksandr Tyshchenko wrote:
>> +int iommu_fwspec_init(struct device *dev, struct device *iommu_dev)
>> +{
>> +    struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
>> +
>> +    if ( fwspec )
>
> I would actually check the iommu_dev passed in parameter is the same 
> as stored. We expect all device to be protected by only one IOMMU. So 
> better to be safe than allow overriding ;).

Actually, it makes sense, will add appropriate check.


-- 
Regards,

Oleksandr Tyshchenko



* Re: [Xen-devel] [PATCH V2 5/6] iommu/arm: Introduce iommu_add_dt_device API
  2019-08-13 16:05     ` Oleksandr
@ 2019-08-13 17:13       ` Julien Grall
  0 siblings, 0 replies; 59+ messages in thread
From: Julien Grall @ 2019-08-13 17:13 UTC (permalink / raw)
  To: Oleksandr, xen-devel; +Cc: Oleksandr Tyshchenko, sstabellini

Hi,

On 8/13/19 5:05 PM, Oleksandr wrote:
> On 13.08.19 16:49, Julien Grall wrote:
>> On 8/2/19 5:39 PM, Oleksandr Tyshchenko wrote:
> Hmm, I was thinking how to end up with only one callback re-used 
> (add_device), really didn't want to add a new one (of_xlate). But, I 
> didn't take into the account that this stuff is a really 
> driver-depended. So, likely yes, we need to provide of_xlate callback.
> 
> 
> May I ask some questions to clarify:
> 
> 1. Do you want me to introduce of_xlate callback in a separate patch 
> (under CONFIG_ARM?)?

Preferably yes. I think this wants to be under CONFIG_HAS_DEVICE_TREE 
rather than CONFIG_ARM.

> 2. Can we avoid introducing new API for that callback?

Do you mean a wrapper for the callback? If so, yes.

> iommu_add_dt_device will be able to call it directly.

Cheers,

-- 
Julien Grall


* Re: [Xen-devel] [PATCH V2 2/6] iommu/arm: Add ability to handle deferred probing request
  2019-08-13 12:35         ` Oleksandr
@ 2019-08-14 17:34           ` Julien Grall
  2019-08-14 19:25             ` Stefano Stabellini
  0 siblings, 1 reply; 59+ messages in thread
From: Julien Grall @ 2019-08-14 17:34 UTC (permalink / raw)
  To: Oleksandr, xen-devel; +Cc: Oleksandr Tyshchenko, sstabellini

Hi Oleksandr,

On 13/08/2019 13:35, Oleksandr wrote:
> 
> On 12.08.19 22:46, Julien Grall wrote:
>> Hi Oleksandr,
> 
> Hi, Julien
> 
> 
>>
>> On 8/12/19 1:01 PM, Oleksandr wrote:
>>> On 12.08.19 14:11, Julien Grall wrote:
>>>> On 02/08/2019 17:39, Oleksandr Tyshchenko wrote:
>>>>> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
>>>>>
>>>>> This patch adds minimal required support to General IOMMU framework
>>>>> to be able to handle a case when IOMMU driver requesting deferred
>>>>> probing for a device.
>>>>>
>>>>> In order not to pull Linux's error code (-EPROBE_DEFER) to Xen
>>>>> we have chosen -EAGAIN to be used for indicating that device
>>>>> probing is deferred.
>>>>>
>>>>> This is needed for the upcoming IPMMU driver which may request
>>>>> deferred probing depending on what device will be probed the first
>>>>> (there is some dependency between these devices, Root device must be
>>>>> registered before Cache devices. If not the case, driver will deny
>>>>> further Cache device probes until Root device is registered).
>>>>> As we can't guarantee a fixed pre-defined order for the device nodes
>>>>> in DT, we need to be ready for the situation where devices being
>>>>> probed in "any" order.
>>>>>
>>>>> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
>>>>> ---
>>>>>   xen/common/device_tree.c            |  1 +
>>>>>   xen/drivers/passthrough/arm/iommu.c | 35 ++++++++++++++++++++++++++++++++++-
>>>>>   xen/include/asm-arm/device.h        |  6 +++++-
>>>>>   xen/include/xen/device_tree.h       |  1 +
>>>>>   4 files changed, 41 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/xen/common/device_tree.c b/xen/common/device_tree.c
>>>>> index e107c6f..6f37448 100644
>>>>> --- a/xen/common/device_tree.c
>>>>> +++ b/xen/common/device_tree.c
>>>>> @@ -1774,6 +1774,7 @@ static unsigned long __init unflatten_dt_node(const 
>>>>> void *fdt,
>>>>>           /* By default the device is not protected */
>>>>>           np->is_protected = false;
>>>>>           INIT_LIST_HEAD(&np->domain_list);
>>>>> +        INIT_LIST_HEAD(&np->deferred_probe);
>>>>
>>>> I am not entirely happy to add a new list_head field per node just for the 
>>>> benefits of boot code. Could we re-use domain_list (with a comment in the 
>>>> code and appropriate ASSERT)?
>>>
>>> Agree that only boot code uses deferred_probe field. I will consider re-using 
>>> domain_list. Could you please clarify regarding ASSERT (where to put and what 
>>> to check).
>>
>> What I meant is adding an ASSERT to check that np->domain_list is at empty at 
>> least before trying to add in the list. This would help to debug any potential 
>> issue if we end up to use domain_list earlier in the future. I can't see why 
>> it would as iommu is called earlier, but who knows :).
> 
> Got it. Thank you for clarification.
> 
> 
>>>>> +
>>>>>   static const struct iommu_ops *iommu_ops;
>>>>>     const struct iommu_ops *iommu_get_ops(void)
>>>>> @@ -42,7 +48,7 @@ void __init iommu_set_ops(const struct iommu_ops *ops)
>>>>>     int __init iommu_hardware_setup(void)
>>>>>   {
>>>>> -    struct dt_device_node *np;
>>>>> +    struct dt_device_node *np, *tmp;
>>>>>       int rc;
>>>>>       unsigned int num_iommus = 0;
>>>>>   @@ -51,6 +57,33 @@ int __init iommu_hardware_setup(void)
>>>>>           rc = device_init(np, DEVICE_IOMMU, NULL);
>>>>>           if ( !rc )
>>>>>               num_iommus++;
>>>>> +        else if (rc == -EAGAIN)
>>>>> +            /*
>>>>> +             * Driver requested deferred probing, so add this device to
>>>>> +             * the deferred list for further processing.
>>>>> +             */
>>>>> +            list_add(&np->deferred_probe, &deferred_probe_list);
>>>>> +    }
>>>>> +
>>>>> +    /*
>>>>> +     * Process devices in the deferred list if at least one successfully
>>>>> +     * probed device is present.
>>>>> +     */
>>>>
>>>> I think this can turn into an infinite loop if all device in 
>>>> deferred_probe_list still return -EDEFER_PROBE and num_iommus is a non-zero.
>>>
>>> Agree.
>>>
>>>
>>>>
>>>> A better condition would be to check that at least one IOMMU is added at 
>>>> each loop. If not, then we should bail with an error because it likely means 
>>>> something is buggy.
>>>
>>> Sounds reasonable. Will do.
>>>
>>>
>>> Just to clarify:
>>>
>>>  >>> A better condition would be to check that at least one IOMMU is added at 
>>> each loop.
>>>
>>> Maybe, not only added (rc == 0), but driver didn't request deferred probe (rc 
>>> != -EAGAIN).
>>
>> I think adding an IOMMU is enough. If you return an error other than -EAGAIN 
>> here after deferring probing, then you are likely going to fail at the next 
>> loop. So better to stop early.
> 
> It makes sense.
> 
> 
>>
>>
>> I realize this not what the current code is doing (I know I wrote it ;)). But 
>> I am not sure it is sane to continue if only part of the IOMMUs are 
>> initialized. Most likely you will see an error much later that may be not 
>> trivial to find out.
>>
>> Imagine you want to passthrough you network card to a guest but the IOMMU 
>> initialization failed...
> 
> Oh, agree.
> 
> As I understand, the new strict logic would be the following:
> 
> If initialization for at least one IOMMU device failed (device_init returns an 
> error other than -EAGAIN), we should stop and return an error to upper layer 
> (even if num_iommus > 0). No matter whether it is during the first attempt or 
> after deferring probe. We don't allow the "I/O virtualisation" to be enabled 
> (iommu_enabled == true) with only part of the IOMMU devices being initialized. 
> Is my understanding correct?

Let me summarize the discussion we had on IRC :). Without your patch, Xen may
initialize only some of the IOMMUs. If a device is behind an IOMMU that wasn't
initialized, then there are two possibilities:
    1) The device was already marked as protected (if using the old binding in
the SMMU). Xen will not be able to assign the device to Dom0 and therefore Xen
will crash (not able to build dom0). For domU, it will depend on whether the
configuration contains the 'dtdev' option. If the option is specified, then the
guest will fail to build. If the option isn't specified, then the guest will
boot, and this could lead either to transaction failures (if the IOMMU was
already reset) or to the device bypassing the IOMMU. Note that the latter can
already happen today if your IOMMU was disabled, but we can't do much about it.
    2) The device is not marked as protected. Xen will not be able to "assign"
the device to Dom0 and this could lead either to the device bypassing the IOMMU
or to a transaction failure. For domU, the problem is similar to 1).

In the case of the SMMU driver, we only support the old bindings. So devices are
marked as protected during SMMU initialization. This is done before the SMMU is
reset. Before reset, the SMMU will be bypassed.

So the risk is ending up with a half-secure system, which may go unnoticed until
later. I realize this is the current behavior, which is not ideal.

It feels to me that if the user requested to use the IOMMU, then we should panic
if any of the available IOMMUs is not initialized correctly. This will save a
lot of debugging afterwards.
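
As a rough illustration of the deferred-probing rule agreed earlier in this
thread (retry devices that return -EAGAIN, stop on any other error, and bail
out if a full pass adds no IOMMU), here is a minimal standalone sketch. It is
not the patch itself: struct fake_iommu, fake_device_init() and probe_all() are
hypothetical, an array rescan replaces the deferred list, and the -EBUSY
returned when no progress is made is an arbitrary choice.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of an IOMMU node with an optional probe dependency. */
struct fake_iommu {
    int depends_on;   /* index of the device that must probe first, or -1 */
    bool probed;
};

/* Stand-in for device_init(): defer until the dependency has been probed. */
static int fake_device_init(struct fake_iommu *devs, size_t i)
{
    int dep = devs[i].depends_on;

    if ( dep >= 0 && !devs[dep].probed )
        return -EAGAIN;

    devs[i].probed = true;
    return 0;
}

/*
 * Probe every device, retrying deferred ones. Each pass must add at least
 * one IOMMU; otherwise the loop could spin forever, so give up instead.
 */
int probe_all(struct fake_iommu *devs, size_t n)
{
    size_t remaining = n;

    while ( remaining > 0 )
    {
        size_t added = 0;

        for ( size_t i = 0; i < n; i++ )
        {
            int rc;

            if ( devs[i].probed )
                continue;

            rc = fake_device_init(devs, i);
            if ( rc == 0 )
                added++;
            else if ( rc != -EAGAIN )
                return rc;          /* real failure: stop early */
        }

        if ( added == 0 )
            return -EBUSY;          /* no progress in a whole pass */

        remaining -= added;
    }

    return 0;
}
```

With a Root device listed after its Cache devices, the first pass probes only
the Root and the second pass probes the rest; a dependency cycle ends with an
error after a single pass without progress.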

@Stefano, any opinions?


Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Xen-devel] [PATCH V2 6/6] iommu/arm: Add Renesas IPMMU-VMSA support
  2019-08-02 16:39 ` [Xen-devel] [PATCH V2 6/6] iommu/arm: Add Renesas IPMMU-VMSA support Oleksandr Tyshchenko
  2019-08-07  2:41   ` Yoshihiro Shimoda
@ 2019-08-14 17:38   ` Julien Grall
  2019-08-14 18:45     ` Oleksandr
  1 sibling, 1 reply; 59+ messages in thread
From: Julien Grall @ 2019-08-14 17:38 UTC (permalink / raw)
  To: Oleksandr Tyshchenko, xen-devel
  Cc: Oleksandr Tyshchenko, Yoshihiro Shimoda, sstabellini

Hi Oleksandr,

On 02/08/2019 17:39, Oleksandr Tyshchenko wrote:
> +static int ipmmu_iommu_domain_init(struct domain *d)
> +{
> +    struct ipmmu_vmsa_xen_domain *xen_domain;
> +
> +    xen_domain = xzalloc(struct ipmmu_vmsa_xen_domain);
> +    if ( !xen_domain )
> +        return -ENOMEM;
> +
> +    spin_lock_init(&xen_domain->lock);
> +    INIT_LIST_HEAD(&xen_domain->cache_domains);
> +    /*
> +     * We don't create Root IPMMU domain here, it will be created on demand
> +     * only, when attaching the first master device to this Xen domain in
> +     * ipmmu_assign_device().
> +     * xen_domain->root_domain = NULL;
> +    */
> +
> +    dom_iommu(d)->arch.priv = xen_domain;

While looking at other parts of Xen I realized you don't set
IOMMU_FEAT_COHERENT_WALK. Does it mean the IOMMU walker does not support
coherent walk (i.e. snooping the cache)?

Note that when this feature is not set, the p2m code is required to clean each
P2M entry when it is updated. So if the IPMMU supports coherent walk, I would
strongly suggest setting the flag :).
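
To make the cost concrete, here is a minimal standalone sketch of the idea,
not Xen's actual p2m code: struct p2m, p2m_init(), p2m_write_pte() and the
flag encoding are simplified assumptions, and the cache clean is replaced by
a counter so the effect can be observed.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical encoding of the feature flag as a bit mask. */
#define IOMMU_FEAT_COHERENT_WALK (1u << 0)

struct p2m {
    bool clean_pte;        /* must each updated entry be cleaned? */
    unsigned int cleans;   /* counts the simulated cache cleans */
};

/* Cleaning is only needed when an IOMMU shares the tables non-coherently. */
void p2m_init(struct p2m *p2m, bool iommu_in_use, unsigned int iommu_features)
{
    p2m->clean_pte = iommu_in_use &&
                     !(iommu_features & IOMMU_FEAT_COHERENT_WALK);
    p2m->cleans = 0;
}

/* Stand-in for a data cache clean of the line holding the entry. */
static void clean_dcache_stub(struct p2m *p2m, const uint64_t *pte)
{
    (void)pte;
    p2m->cleans++;
}

/* Write one entry and, if required, push it out to memory for the walker. */
void p2m_write_pte(struct p2m *p2m, uint64_t *entry, uint64_t val)
{
    *entry = val;
    if ( p2m->clean_pte )
        clean_dcache_stub(p2m, entry);
}
```

Without the flag every entry update pays for a clean; with it the write alone
is enough because the walker can snoop the cache.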

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Xen-devel] [PATCH V2 6/6] iommu/arm: Add Renesas IPMMU-VMSA support
  2019-08-14 17:38   ` Julien Grall
@ 2019-08-14 18:45     ` Oleksandr
  0 siblings, 0 replies; 59+ messages in thread
From: Oleksandr @ 2019-08-14 18:45 UTC (permalink / raw)
  To: Julien Grall, xen-devel
  Cc: Oleksandr Tyshchenko, Yoshihiro Shimoda, sstabellini


On 14.08.19 20:38, Julien Grall wrote:
> Hi Oleksandr,

Hi Julien.


>
> On 02/08/2019 17:39, Oleksandr Tyshchenko wrote:
>> +static int ipmmu_iommu_domain_init(struct domain *d)
>> +{
>> +    struct ipmmu_vmsa_xen_domain *xen_domain;
>> +
>> +    xen_domain = xzalloc(struct ipmmu_vmsa_xen_domain);
>> +    if ( !xen_domain )
>> +        return -ENOMEM;
>> +
>> +    spin_lock_init(&xen_domain->lock);
>> +    INIT_LIST_HEAD(&xen_domain->cache_domains);
>> +    /*
>> +     * We don't create Root IPMMU domain here, it will be created on 
>> demand
>> +     * only, when attaching the first master device to this Xen 
>> domain in
>> +     * ipmmu_assign_device().
>> +     * xen_domain->root_domain = NULL;
>> +    */
>> +
>> +    dom_iommu(d)->arch.priv = xen_domain;
>
> While looking at other part of Xen I realized you don't set 
> IOMMU_FEAT_COHERENT_WALK. Does it mean the IOMMU walker does not 
> support coherent walk (i.e snooping the cache)?

*AFAIK*, it is not supported.

The Linux driver reports that coherent_walk is not supported as well.


>
>
> Note that when this feature is not set, the p2m code will require to 
> clean each P2M entry when updated. So if the IPMMU supports coherent 
> walk, I would strongly suggest to set the flag :).

When playing with a non-shared IOMMU in Xen (two years ago), I noticed
that I had forgotten to use clean_dcache after updating a page table
entry. I could hit faults when shattering superpages, for example. Once I
added it, the faults went away completely.

So, I will leave IOMMU_FEAT_COHERENT_WALK disabled, but will keep your
suggestion in mind.


>
> Cheers,
>
-- 
Regards,

Oleksandr Tyshchenko



^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Xen-devel] [PATCH V2 2/6] iommu/arm: Add ability to handle deferred probing request
  2019-08-14 17:34           ` Julien Grall
@ 2019-08-14 19:25             ` Stefano Stabellini
  2019-08-15  9:29               ` Julien Grall
  0 siblings, 1 reply; 59+ messages in thread
From: Stefano Stabellini @ 2019-08-14 19:25 UTC (permalink / raw)
  To: Julien Grall; +Cc: Oleksandr, xen-devel, sstabellini, Oleksandr Tyshchenko

[-- Attachment #1: Type: text/plain, Size: 8864 bytes --]

On Wed, 14 Aug 2019, Julien Grall wrote:
> Hi Oleksandr,
> 
> On 13/08/2019 13:35, Oleksandr wrote:
> > 
> > On 12.08.19 22:46, Julien Grall wrote:
> > > Hi Oleksandr,
> > 
> > Hi, Julien
> > 
> > 
> > > 
> > > On 8/12/19 1:01 PM, Oleksandr wrote:
> > > > On 12.08.19 14:11, Julien Grall wrote:
> > > > > On 02/08/2019 17:39, Oleksandr Tyshchenko wrote:
> > > > > > From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> > > > > > 
> > > > > > This patch adds minimal required support to General IOMMU framework
> > > > > > to be able to handle a case when IOMMU driver requesting deferred
> > > > > > probing for a device.
> > > > > > 
> > > > > > In order not to pull Linux's error code (-EPROBE_DEFER) to Xen
> > > > > > we have chosen -EAGAIN to be used for indicating that device
> > > > > > probing is deferred.
> > > > > > 
> > > > > > This is needed for the upcoming IPMMU driver which may request
> > > > > > deferred probing depending on what device will be probed the first
> > > > > > (there is some dependency between these devices, Root device must be
> > > > > > registered before Cache devices. If not the case, driver will deny
> > > > > > further Cache device probes until Root device is registered).
> > > > > > As we can't guarantee a fixed pre-defined order for the device nodes
> > > > > > in DT, we need to be ready for the situation where devices being
> > > > > > probed in "any" order.
> > > > > > 
> > > > > > Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
> > > > > > ---
> > > > > >   xen/common/device_tree.c            |  1 +
> > > > > >   xen/drivers/passthrough/arm/iommu.c | 35
> > > > > > ++++++++++++++++++++++++++++++++++-
> > > > > >   xen/include/asm-arm/device.h        |  6 +++++-
> > > > > >   xen/include/xen/device_tree.h       |  1 +
> > > > > >   4 files changed, 41 insertions(+), 2 deletions(-)
> > > > > > 
> > > > > > diff --git a/xen/common/device_tree.c b/xen/common/device_tree.c
> > > > > > index e107c6f..6f37448 100644
> > > > > > --- a/xen/common/device_tree.c
> > > > > > +++ b/xen/common/device_tree.c
> > > > > > @@ -1774,6 +1774,7 @@ static unsigned long __init
> > > > > > unflatten_dt_node(const void *fdt,
> > > > > >           /* By default the device is not protected */
> > > > > >           np->is_protected = false;
> > > > > >           INIT_LIST_HEAD(&np->domain_list);
> > > > > > +        INIT_LIST_HEAD(&np->deferred_probe);
> > > > > 
> > > > > I am not entirely happy to add a new list_head field per node just for
> > > > > the benefits of boot code. Could we re-use domain_list (with a comment
> > > > > in the code and appropriate ASSERT)?
> > > > 
> > > > Agree that only boot code uses deferred_probe field. I will consider
> > > > re-using domain_list. Could you please clarify regarding ASSERT (where
> > > > to put and what to check).
> > > 
> > > What I meant is adding an ASSERT to check that np->domain_list is at empty
> > > at least before trying to add in the list. This would help to debug any
> > > potential issue if we end up to use domain_list earlier in the future. I
> > > can't see why it would as iommu is called earlier, but who knows :).
> > 
> > Got it. Thank you for clarification.
> > 
> > 
> > > > > > +
> > > > > >   static const struct iommu_ops *iommu_ops;
> > > > > >     const struct iommu_ops *iommu_get_ops(void)
> > > > > > @@ -42,7 +48,7 @@ void __init iommu_set_ops(const struct iommu_ops
> > > > > > *ops)
> > > > > >     int __init iommu_hardware_setup(void)
> > > > > >   {
> > > > > > -    struct dt_device_node *np;
> > > > > > +    struct dt_device_node *np, *tmp;
> > > > > >       int rc;
> > > > > >       unsigned int num_iommus = 0;
> > > > > >   @@ -51,6 +57,33 @@ int __init iommu_hardware_setup(void)
> > > > > >           rc = device_init(np, DEVICE_IOMMU, NULL);
> > > > > >           if ( !rc )
> > > > > >               num_iommus++;
> > > > > > +        else if (rc == -EAGAIN)
> > > > > > +            /*
> > > > > > +             * Driver requested deferred probing, so add this
> > > > > > device to
> > > > > > +             * the deferred list for further processing.
> > > > > > +             */
> > > > > > +            list_add(&np->deferred_probe, &deferred_probe_list);
> > > > > > +    }
> > > > > > +
> > > > > > +    /*
> > > > > > +     * Process devices in the deferred list if at least one
> > > > > > successfully
> > > > > > +     * probed device is present.
> > > > > > +     */
> > > > > 
> > > > > I think this can turn into an infinite loop if all device in
> > > > > deferred_probe_list still return -EDEFER_PROBE and num_iommus is a
> > > > > non-zero.
> > > > 
> > > > Agree.
> > > > 
> > > > 
> > > > > 
> > > > > A better condition would be to check that at least one IOMMU is added
> > > > > at each loop. If not, then we should bail with an error because it
> > > > > likely means something is buggy.
> > > > 
> > > > Sounds reasonable. Will do.
> > > > 
> > > > 
> > > > Just to clarify:
> > > > 
> > > >  >>> A better condition would be to check that at least one IOMMU is
> > > > added at each loop.
> > > > 
> > > > Maybe, not only added (rc == 0), but driver didn't request deferred
> > > > probe (rc != -EAGAIN).
> > > 
> > > I think adding an IOMMU is enough. If you return an error other than
> > > -EAGAIN here after deferring probing, then you are likely going to fail at
> > > the next loop. So better to stop early.
> > 
> > It makes sense.
> > 
> > 
> > > 
> > > 
> > > I realize this not what the current code is doing (I know I wrote it ;)).
> > > But I am not sure it is sane to continue if only part of the IOMMUs are
> > > initialized. Most likely you will see an error much later that may be not
> > > trivial to find out.
> > > 
> > > Imagine you want to passthrough you network card to a guest but the IOMMU
> > > initialization failed...
> > 
> > Oh, agree.
> > 
> > As I understand, the new strict logic would be the following:
> > 
> > If initialization for at least one IOMMU device failed (device_init returns
> > an error other than -EAGAIN), we should stop and return an error to upper
> > layer (even if num_iommus > 0). No matter whether it is during the first
> > attempt or after deferring probe. We don't allow the "I/O virtualisation" to
> > be enabled (iommu_enabled == true) with only part of the IOMMU devices being
> > initialized. Is my understanding correct?
> 
> Let me summarize the discussion we had on IRC :). Without your patch, Xen may
> initialize only half the IOMMUs. If the device is behind an IOMMU that wasn't
> initialized, then we have two possibility:
>    1) The device was already mark as protected (if using the old binding in
> the SMMU). Xen will not be able to assign the device to Dom0 and therefore Xen
> will crash (not able to build dom0). For domU, it will depend whether the
> configuration contain the options 'dtdev'. If the option is specified, then
> guest will fail to build. On the contrary if the option isn't specified then
> the guest will boot and this could either lead to transaction failure (if the
> IOMMU was already reset) or bypassing the IOMMU. Note that the latter can
> today happen if your IOMMU was disabled. But we can't do much against it.
>    2) The device is not marked as protected. Xen will not be able to "assign"
> the device to Dom0 and this could either lead to the device bypassing the
> IOMMU or a transaction failure. For domU, the problem is similar to 1).
> 
> In the case of the SMMU driver, we only support old bindings. So devices are
> marked as protected during SMMU initialization. This is done before the SMMU
> is reset. Before reset the SMMU will bypassed.
> 
> So the risk is to have an half secure system and may be unnoticed until later.
> I realize this is the current behavior, so not very ideal.
> 
> It feels to me if the user requested to use IOMMU then if we should panic if
> any of the available IOMMU are not initialized correctly. This will save a lot
> of debug afterwards.
> 
> @Stefano, any opinions?

I agree that we should enable all IOMMUs or none. I don't think we want
to deal with systems where the IOMMUs are only partially initialized.

Failure to enable all IOMMUs should lead to returning an error from the
relevant function (arch_iommu_populate_page_table?), which should
translate into Xen failing to boot and crashing. I think that is
what you are suggesting, right?

(I wouldn't call panic() inside the IOMMU specific initializer, because
there might be iommu= command line options for Xen that specify a
different desired outcome. I would deal with this case the same way we
would deal with an error during initialization of a single IOMMU.)


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Xen-devel] [PATCH V2 2/6] iommu/arm: Add ability to handle deferred probing request
  2019-08-14 19:25             ` Stefano Stabellini
@ 2019-08-15  9:29               ` Julien Grall
  2019-08-15 12:54                 ` Julien Grall
  0 siblings, 1 reply; 59+ messages in thread
From: Julien Grall @ 2019-08-15  9:29 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: Oleksandr, xen-devel, Oleksandr Tyshchenko

Hi Stefano,

On 14/08/2019 20:25, Stefano Stabellini wrote:
> On Wed, 14 Aug 2019, Julien Grall wrote:
> I agree that we should enable all IOMMUs or none. I don't think we want
> to deal with partially initialized IOMMUs systems.
> 
> Failure to enable all IOMMUs should lead to returning an error from the
> relevant function (arch_iommu_populate_page_table?) which should

The path is:

|> start_xen()
|>   iommu_setup()
|>     iommu_hardware_setup()

> translate into Xen failing to boot and crashing. Which I think it is
> what you are suggesting, right?

That's correct. At the moment the return value of iommu_setup() is ignored. What 
I would like to suggest is something along the following lines:

rc = iommu_setup();
if ( iommu_enable && rc && rc != -ENODEV )
   panic("Unable to set up IOMMUs");

> 
> (I wouldn't call panic() inside the IOMMU specific initializer, because
> there might be iommu= command line options for Xen that specify a
> different desired outcome. I would deal with this case the same way we
> would deal with an error during initialization of a single IOMMU.)

I am not sure I understand this. If you have a half-initialized IOMMU (note
the "single" here!), then continuing is likely to make things much worse.

I am not advocating putting the panic() inside the IOMMU-specific initializer
(see above). But clearly, we should return an error no matter the content of
the 'iommu' command line if the user requested to enable the IOMMUs (if any).
It wouldn't be right if the user could say "ignore IOMMU errors", as most
likely you would get unexpected errors afterwards.

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Xen-devel] [PATCH V2 2/6] iommu/arm: Add ability to handle deferred probing request
  2019-08-15  9:29               ` Julien Grall
@ 2019-08-15 12:54                 ` Julien Grall
  2019-08-15 13:14                   ` Oleksandr
  0 siblings, 1 reply; 59+ messages in thread
From: Julien Grall @ 2019-08-15 12:54 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: Oleksandr, xen-devel, Oleksandr Tyshchenko

Hi,

On 15/08/2019 10:29, Julien Grall wrote:
> On 14/08/2019 20:25, Stefano Stabellini wrote:
>> On Wed, 14 Aug 2019, Julien Grall wrote:
>> I agree that we should enable all IOMMUs or none. I don't think we want
>> to deal with partially initialized IOMMUs systems.
>>
>> Failure to enable all IOMMUs should lead to returning an error from the
>> relevant function (arch_iommu_populate_page_table?) which should
> 
> The patch is:
> 
> |> start_xen()
> |>   iommu_setup()
> |>     iommu_hardware_setup()
> 
>> translate into Xen failing to boot and crashing. Which I think it is
>> what you are suggesting, right?
> 
> That's correct. At the moment the return value of iommu_setup() is ignored. What 
> I would like to suggest is something along the following lines:
> 
> rc = iommu_setup();
> if ( iommu_enable && rc != -ENODEV )
>    panic("Unable to setup IOMMUs");
> 
>>
>> (I wouldn't call panic() inside the IOMMU specific initializer, because
>> there might be iommu= command line options for Xen that specify a
>> different desired outcome. I would deal with this case the same way we
>> would deal with an error during initialization of a single IOMMU.)
> 
> I am not sure to understand this. If you have an half initialized IOMMU (note 
> the "single" here!), then continuing is likely to make things much worst.
> 
> I don't advocate to put the panic() inside the IOMMU specific initializer (see 
> above). But clearly, we should return an error no matter the content of 'iommu' 
> command line if the user requested to enable the IOMMUs (if any). It wouldn't be 
> right if the user can say "ignore IOMMU error" as most likely you will have 
> unexpected error afterwards.

I noticed there is already a panic() in iommu_setup() for the case where the
user forces the use of the IOMMU but it was not initialized. I was half-tempted
to set iommu_force to true for Arm, but I think this is a different issue.

So here is my take (neither tested nor compiled).

diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c
index 2c5d1372c0..8f94f618b0 100644
--- a/xen/arch/arm/setup.c
+++ b/xen/arch/arm/setup.c
@@ -755,6 +755,7 @@ void __init start_xen(unsigned long boot_phys_offset,
         .max_grant_frames = gnttab_dom0_frames(),
         .max_maptrack_frames = opt_max_maptrack_frames,
     };
+    int rc;
 
     dcache_line_bytes = read_dcache_line_bytes();
 
@@ -890,7 +891,9 @@ void __init start_xen(unsigned long boot_phys_offset,
 
     setup_virt_paging();
 
-    iommu_setup();
+    rc = iommu_setup();
+    if ( !iommu_enabled && rc != -ENODEV )
+        panic("Couldn't configure correctly all the IOMMUs.");
 
     do_initcalls();
 
diff --git a/xen/drivers/passthrough/arm/iommu.c b/xen/drivers/passthrough/arm/iommu.c
index 2135233736..f219de9ac3 100644
--- a/xen/drivers/passthrough/arm/iommu.c
+++ b/xen/drivers/passthrough/arm/iommu.c
@@ -51,6 +51,14 @@ int __init iommu_hardware_setup(void)
         rc = device_init(np, DEVICE_IOMMU, NULL);
         if ( !rc )
             num_iommus++;
+        /*
+         * Ignore the following error codes:
+         *   - EBADF: Indicates the current node is not an IOMMU
+         *   - ENODEV: The IOMMU is not present or cannot be used by
+         *     Xen.
+         */
+        else if ( rc != -EBADF && rc != -ENODEV )
+            return rc;
     }
 
     return ( num_iommus > 0 ) ? 0 : -ENODEV;




> 
> Cheers,
> 

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Xen-devel] [PATCH V2 2/6] iommu/arm: Add ability to handle deferred probing request
  2019-08-15 12:54                 ` Julien Grall
@ 2019-08-15 13:14                   ` Oleksandr
  2019-08-15 16:39                     ` Oleksandr
  0 siblings, 1 reply; 59+ messages in thread
From: Oleksandr @ 2019-08-15 13:14 UTC (permalink / raw)
  To: Julien Grall, Stefano Stabellini; +Cc: xen-devel, Oleksandr Tyshchenko


On 15.08.19 15:54, Julien Grall wrote:
> Hi,

Hi Julien


> I noticed there was already a panic() in iommu_setup() just in case the user
> force the use of IOMMU but they were not initialized. I was half-tempted to set
> iommu_force to true for Arm, but I think this is a different issue.
>
> So here my take (not tested nor compiled).

Thank you. I will check it and come back with results.


>
> diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c
> index 2c5d1372c0..8f94f618b0 100644
> --- a/xen/arch/arm/setup.c
> +++ b/xen/arch/arm/setup.c
> @@ -755,6 +755,7 @@ void __init start_xen(unsigned long boot_phys_offset,
>           .max_grant_frames = gnttab_dom0_frames(),
>           .max_maptrack_frames = opt_max_maptrack_frames,
>       };
> +    int rc;
>   
>       dcache_line_bytes = read_dcache_line_bytes();
>   
> @@ -890,7 +891,9 @@ void __init start_xen(unsigned long boot_phys_offset,
>   
>       setup_virt_paging();
>   
> -    iommu_setup();
> +    rc = iommu_setup();
> +    if ( !iommu_enabled && rc != -ENODEV )
> +        panic("Couldn't configure correctly all the IOMMUs.");
>   
>       do_initcalls();
>   
> diff --git a/xen/drivers/passthrough/arm/iommu.c b/xen/drivers/passthrough/arm/iommu.c
> index 2135233736..f219de9ac3 100644
> --- a/xen/drivers/passthrough/arm/iommu.c
> +++ b/xen/drivers/passthrough/arm/iommu.c
> @@ -51,6 +51,14 @@ int __init iommu_hardware_setup(void)
>           rc = device_init(np, DEVICE_IOMMU, NULL);
>           if ( !rc )
>               num_iommus++;
> +        /*
> +         * Ignore the following error codes:
> +         *   - EBADF: Indicate the current not is not an IOMMU
> +         *   - ENODEV: The IOMMU is not present or cannot be used by
> +         *     Xen.
> +         */
> +        else if ( rc != -EBADF && rc != -ENODEV )
> +            return rc;
>       }
>   
>       return ( num_iommus > 0 ) ? 0 : -ENODEV;
>
>
>
>
>> Cheers,
>>
-- 
Regards,

Oleksandr Tyshchenko



^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [Xen-devel] [PATCH V2 2/6] iommu/arm: Add ability to handle deferred probing request
  2019-08-15 13:14                   ` Oleksandr
@ 2019-08-15 16:39                     ` Oleksandr
  0 siblings, 0 replies; 59+ messages in thread
From: Oleksandr @ 2019-08-15 16:39 UTC (permalink / raw)
  To: Julien Grall, Stefano Stabellini; +Cc: xen-devel, Oleksandr Tyshchenko

[-- Attachment #1: Type: text/plain, Size: 2171 bytes --]


Hi, Julien


>
>> I noticed there was already a panic() in iommu_setup() just in case 
>> the user
>> force the use of IOMMU but they were not initialized. I was 
>> half-tempted to set
>> iommu_force to true for Arm, but I think this is a different issue.
>>
>> So here my take (not tested nor compiled).
>
> Thank you. I will check it and come back with results.

I have done preliminary testing with my IPMMU series, including a new
version of the "deferred probing" patch (attached). To be honest, my
series is not based on the *current* staging, but it is not too outdated.

So, your patch works as expected in the following scenarios:

1. [No panic] Without IOMMU driver in Xen (#CONFIG_IPMMU_VMSA is not set).

2. [No panic] IOMMU driver is present, but reports that it cannot be used
in Xen (P2M sharing not supported, etc) by returning -ENODEV.

3. [No panic] IOMMU is globally disabled in command line "iommu=0".

4. [No panic] IOMMU driver requests deferred probing until the last
device in DT is initialized, after that all deferred devices get
initialized.

5. [Panic] IOMMU driver returns an error (other than -EBADF and -ENODEV),
either at the very beginning or when some of the devices are already
initialized.

6. [Panic] IOMMU driver always requests deferred probing, or returns an
error other than -EAGAIN after deferred probing.
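
The scenarios above can be checked against the proposed condition with a
small standalone model. This is only a sketch: would_panic() is a
hypothetical helper, and it assumes that iommu_setup() returns -ENODEV when
the IOMMU is disabled on the command line and sets iommu_enabled only when
hardware setup returns 0.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Model of the check proposed for start_xen():
 *     if ( !iommu_enabled && rc != -ENODEV ) panic(...);
 * iommu_enable is the command line setting, hw_setup_rc is what
 * iommu_hardware_setup() would return.
 */
bool would_panic(bool iommu_enable, int hw_setup_rc)
{
    int rc = -ENODEV;            /* assumed default when IOMMU is disabled */
    bool iommu_enabled = false;

    if ( iommu_enable )
    {
        rc = hw_setup_rc;
        iommu_enabled = (rc == 0);
    }

    return !iommu_enabled && rc != -ENODEV;
}
```

Scenarios 1 and 2 correspond to hardware setup returning -ENODEV, scenario 3
to iommu_enable being false, scenario 4 to a return value of 0, and
scenarios 5 and 6 to any other error reaching the caller.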


>>
>> diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c
>> index 2c5d1372c0..8f94f618b0 100644
>> --- a/xen/arch/arm/setup.c
>> +++ b/xen/arch/arm/setup.c
>> @@ -755,6 +755,7 @@ void __init start_xen(unsigned long boot_phys_offset,
>>          .max_grant_frames = gnttab_dom0_frames(),
>>          .max_maptrack_frames = opt_max_maptrack_frames,
>>      };
>> +    int rc;
>>
>>      dcache_line_bytes = read_dcache_line_bytes();
>>
>> @@ -890,7 +891,9 @@ void __init start_xen(unsigned long boot_phys_offset,
>>
>>      setup_virt_paging();
>>
>> -    iommu_setup();
>> +    rc = iommu_setup();
>> +    if ( !iommu_enabled && rc != -ENODEV )
>> +        panic("Couldn't configure correctly all the IOMMUs.");

"\n" should be added at the end of the panic message.



-- 
Regards,

Oleksandr Tyshchenko


[-- Attachment #2: 0001-iommu-arm-Add-ability-to-handle-deferred-probing-req.patch --]
[-- Type: text/x-patch, Size: 5887 bytes --]


Thread overview: 59+ messages
2019-08-02 16:39 [Xen-devel] [PATCH V2 0/6] iommu/arm: Add Renesas IPMMU-VMSA support + Linux's iommu_fwspec Oleksandr Tyshchenko
2019-08-02 16:39 ` [Xen-devel] [PATCH V2 1/6] iommu/arm: Add iommu_helpers.c file to keep common for IOMMUs stuff Oleksandr Tyshchenko
2019-08-09 17:35   ` Julien Grall
2019-08-09 18:10     ` Oleksandr
2019-08-02 16:39 ` [Xen-devel] [PATCH V2 2/6] iommu/arm: Add ability to handle deferred probing request Oleksandr Tyshchenko
2019-08-12 11:11   ` Julien Grall
2019-08-12 12:01     ` Oleksandr
2019-08-12 19:46       ` Julien Grall
2019-08-13 12:35         ` Oleksandr
2019-08-14 17:34           ` Julien Grall
2019-08-14 19:25             ` Stefano Stabellini
2019-08-15  9:29               ` Julien Grall
2019-08-15 12:54                 ` Julien Grall
2019-08-15 13:14                   ` Oleksandr
2019-08-15 16:39                     ` Oleksandr
2019-08-02 16:39 ` [Xen-devel] [PATCH V2 3/6] [RFC] xen/common: Introduce _xrealloc function Oleksandr Tyshchenko
2019-08-05 10:02   ` Jan Beulich
2019-08-06 18:50     ` Oleksandr
2019-08-07  6:22       ` Jan Beulich
2019-08-07 17:31         ` Oleksandr
2019-08-06 19:51     ` Volodymyr Babchuk
2019-08-07  6:26       ` Jan Beulich
2019-08-07 18:36         ` Oleksandr
2019-08-08  6:08           ` Jan Beulich
2019-08-08  7:05           ` Jan Beulich
2019-08-08 11:05             ` Oleksandr
2019-08-02 16:39 ` [Xen-devel] [PATCH V2 4/6] iommu/arm: Add lightweight iommu_fwspec support Oleksandr Tyshchenko
2019-08-13 12:39   ` Julien Grall
2019-08-13 15:17     ` Oleksandr
2019-08-13 15:28       ` Julien Grall
2019-08-13 16:18         ` Oleksandr
2019-08-13 13:40   ` Julien Grall
2019-08-13 16:28     ` Oleksandr
2019-08-02 16:39 ` [Xen-devel] [PATCH V2 5/6] iommu/arm: Introduce iommu_add_dt_device API Oleksandr Tyshchenko
2019-08-13 13:49   ` Julien Grall
2019-08-13 16:05     ` Oleksandr
2019-08-13 17:13       ` Julien Grall
2019-08-02 16:39 ` [Xen-devel] [PATCH V2 6/6] iommu/arm: Add Renesas IPMMU-VMSA support Oleksandr Tyshchenko
2019-08-07  2:41   ` Yoshihiro Shimoda
2019-08-07 16:01     ` Oleksandr
2019-08-07 19:15       ` Julien Grall
2019-08-07 20:28         ` Oleksandr Tyshchenko
2019-08-08  9:05           ` Julien Grall
2019-08-08 10:14             ` Oleksandr
2019-08-08 12:44               ` Julien Grall
2019-08-08 15:04                 ` Oleksandr
2019-08-08 17:16                   ` Julien Grall
2019-08-08 19:29                     ` Oleksandr
2019-08-08 20:32                       ` Julien Grall
2019-08-08 23:32                         ` Oleksandr Tyshchenko
2019-08-09  9:56                           ` Julien Grall
2019-08-09 18:38                             ` Oleksandr
2019-08-08 12:28         ` Oleksandr
2019-08-08 14:23         ` Lars Kurth
2019-08-08  4:05       ` Yoshihiro Shimoda
2019-08-14 17:38   ` Julien Grall
2019-08-14 18:45     ` Oleksandr
2019-08-05  7:58 ` [Xen-devel] [PATCH V2 0/6] iommu/arm: Add Renesas IPMMU-VMSA support + Linux's iommu_fwspec Oleksandr
2019-08-05  8:29   ` Julien Grall
