* [RFC 00/19] xe/arm: Add support for non-pci passthrough
@ 2014-06-16 16:17 Julien Grall
  2014-06-16 16:17 ` [RFC 01/19] xen/arm: guest_physmap_remove_page: Print a warning if we fail to unmap the page Julien Grall
                   ` (18 more replies)
  0 siblings, 19 replies; 122+ messages in thread
From: Julien Grall @ 2014-06-16 16:17 UTC (permalink / raw)
  To: xen-devel; +Cc: stefano.stabellini, Julien Grall, tim, ian.campbell

Hello all,

This is the first attempt to add support for device tree passthrough (i.e.
non-PCI passthrough) on ARM.

The user will have to specify the list of device nodes to pass through via
the new "dtdev" option in the xl configuration file. Only devices protected
by an IOMMU can be passed through to the guest. This is because the device
can use DMA and, without an IOMMU, would use the wrong address space.
I'm thinking of adding a "force" option for when the user knows that the
device doesn't use DMA. This might be useful for passing through a serial
device.

This has been tested on Midway by assigning the secondary network card to a
guest. As this is an early stage of device passthrough, DOM0 has not been
modified. Therefore, to prevent DOM0 from using the network card, a
status="disabled" property has been added to the device tree. I can send my
device tree if necessary.

TODO list:
    - Deassign device from dom0 before passthrough
    - Specific device properties in the DT

There are also some TODOs in individual patches (see the /* TODO: */ comments).

The series is based on Stefano's interrupts series [1] and Arianna's memory
mapping series [2]. A working tree can be found here:

git://xenbits.xen.org/people/julieng/xen-unstable.git branch passthrough-v1

Sincerely yours,

[1] http://lists.xen.org/archives/html/xen-devel/2014-06/msg01224.html
[2] http://lists.xenproject.org/archives/html/xen-devel/2014-05/msg03050.html

Julien Grall (19):
  xen/arm: guest_physmap_remove_page: Print a warning if we fail to
    unmap the page
  xen: guestcopy: Provide an helper to copy string from guest
  xen/arm: follow-up to allow DOM0 manage IRQ and MMIO
  xen/arm: route_irq_to_guest: Check validity of the IRQ
  xen/arm: Release IRQ routed to a domain when it's destroying
  xen/arm: Implement hypercall PHYSDEVOP_map_pirq
  xen/dts: Use unsigned int for MMIO and IRQ index
  xen/dts: Provide an helper to get a DT node from a path provided by a
    guest
  xen/dts: Add hypercalls to retrieve device node information
  xen/passthrough: Introduce iommu_buildup
  xen/passthrough: Call arch_iommu_domain_destroy before calling
    iommu_teardown
  xen/passthrough: iommu_deassign_device_dt: By default reassign device
    to nobody
  xen/iommu: arm: Wire iommu DOMCTL for ARM
  xen/passthrough: dt: Add new domctl XEN_DOMCTL_assign_dt_device
  xen/arm: Reserve region in guest memory for device passthrough
  libxl/arm: Introduce DT_IRQ_TYPE_*
  libxl/arm: Rename set_interrupt_ppi to set_interrupt and handle SPIs
  libxl: Add support for non-PCI passthrough
  xl: Add new option dtdev

 docs/man/xl.cfg.pod.5                 |    5 ++
 tools/libxc/xc_domain.c               |   29 ++++++
 tools/libxc/xc_physdev.c              |  129 ++++++++++++++++++++++++++
 tools/libxc/xenctrl.h                 |   40 +++++++++
 tools/libxl/Makefile                  |    2 +-
 tools/libxl/libxl_arch.h              |    4 +-
 tools/libxl/libxl_arm.c               |  142 +++++++++++++++++++++++++++--
 tools/libxl/libxl_create.c            |   11 +++
 tools/libxl/libxl_dom.c               |   22 ++++-
 tools/libxl/libxl_dtdev.c             |  159 +++++++++++++++++++++++++++++++++
 tools/libxl/libxl_internal.h          |   31 +++++++
 tools/libxl/libxl_types.idl           |    5 ++
 tools/libxl/libxl_x86.c               |    4 +-
 tools/libxl/xl_cmdimpl.c              |   21 ++++-
 xen/arch/arm/domain_build.c           |   66 +++++++++-----
 xen/arch/arm/gic.c                    |   12 +++
 xen/arch/arm/irq.c                    |   40 ++++++++-
 xen/arch/arm/p2m.c                    |   14 ++-
 xen/arch/arm/physdev.c                |   93 ++++++++++++++++++-
 xen/arch/arm/vgic.c                   |   15 +++-
 xen/arch/x86/domctl.c                 |    2 +-
 xen/common/Makefile                   |    1 +
 xen/common/device_tree.c              |  141 +++++++++++++++++++++++++++--
 xen/common/domctl.c                   |    4 +
 xen/common/guestcopy.c                |   28 ++++++
 xen/drivers/passthrough/arm/iommu.c   |    6 ++
 xen/drivers/passthrough/arm/smmu.c    |    7 +-
 xen/drivers/passthrough/device_tree.c |   60 +++++++++++--
 xen/drivers/passthrough/iommu.c       |   36 +++++++-
 xen/drivers/passthrough/pci.c         |   12 +--
 xen/include/asm-arm/gic.h             |    5 ++
 xen/include/asm-arm/irq.h             |    6 ++
 xen/include/public/arch-arm.h         |    4 +
 xen/include/public/domctl.h           |   10 +++
 xen/include/public/physdev.h          |   40 +++++++++
 xen/include/xen/device_tree.h         |   27 +++++-
 xen/include/xen/guest_access.h        |    5 ++
 xen/include/xen/iommu.h               |    5 ++
 xen/xsm/flask/flask_op.c              |   29 +-----
 39 files changed, 1166 insertions(+), 106 deletions(-)
 create mode 100644 tools/libxl/libxl_dtdev.c
 create mode 100644 xen/common/guestcopy.c

-- 
1.7.10.4


* [RFC 01/19] xen/arm: guest_physmap_remove_page: Print a warning if we fail to unmap the page
  2014-06-16 16:17 [RFC 00/19] xe/arm: Add support for non-pci passthrough Julien Grall
@ 2014-06-16 16:17 ` Julien Grall
  2014-06-18 15:03   ` Stefano Stabellini
  2014-07-03 10:52   ` Ian Campbell
  2014-06-16 16:17 ` [RFC 02/19] xen: guestcopy: Provide an helper to copy string from guest Julien Grall
                   ` (17 subsequent siblings)
  18 siblings, 2 replies; 122+ messages in thread
From: Julien Grall @ 2014-06-16 16:17 UTC (permalink / raw)
  To: xen-devel; +Cc: stefano.stabellini, Julien Grall, tim, ian.campbell

The function guest_physmap_remove_page doesn't have a return value. With
the change "arch/arm: add consistency check to REMOVE p2m changes",
apply_p2m_changes can fail, although this is unlikely. Warn the user in
this case.

Signed-off-by: Julien Grall <julien.grall@linaro.org>
---
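The pattern described above (a void wrapper that can only log when the
underlying operation fails) can be sketched in plain userspace C. This is an
illustration only: fake_unmap() and remove_page_sketch() are hypothetical
names, and fprintf() stands in for dprintk().

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for apply_p2m_changes(): fails only when asked to. */
static int fake_unmap(unsigned long start, unsigned long end, int fail)
{
    (void)start;
    (void)end;
    return fail ? -22 : 0;   /* -22 mirrors -EINVAL */
}

/* Returns the number of warnings emitted (0 or 1). The real function is
 * void, so a warning is the only way the failure becomes visible. */
static int remove_page_sketch(unsigned long gpfn, unsigned int page_order,
                              int fail)
{
    unsigned long end = gpfn + (1UL << page_order);
    int ret = fake_unmap(gpfn, end, fail);

    if ( ret )
    {
        fprintf(stderr, "Unable to unmap region GPFN 0x%lx - 0x%lx (%d)\n",
                gpfn, end, ret);
        return 1;
    }

    return 0;
}
```
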
 xen/arch/arm/p2m.c |   14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
index f93f99a..51f9225 100644
--- a/xen/arch/arm/p2m.c
+++ b/xen/arch/arm/p2m.c
@@ -607,10 +607,16 @@ void guest_physmap_remove_page(struct domain *d,
                                unsigned long gpfn,
                                unsigned long mfn, unsigned int page_order)
 {
-    apply_p2m_changes(d, REMOVE,
-                      pfn_to_paddr(gpfn),
-                      pfn_to_paddr(gpfn + (1<<page_order)),
-                      pfn_to_paddr(mfn), NULL, MATTR_MEM, p2m_invalid);
+    int ret;
+
+    ret = apply_p2m_changes(d, REMOVE,
+                            pfn_to_paddr(gpfn),
+                            pfn_to_paddr(gpfn + (1<<page_order)),
+                            pfn_to_paddr(mfn), NULL, MATTR_MEM, p2m_invalid);
+    if ( ret )
+        dprintk(XENLOG_G_WARNING,
+                "DOM%u: Unable to unmap region GPFN 0x%lx - 0x%lx MFN 0x%lx\n",
+                d->domain_id, gpfn, gpfn + (1 << page_order), mfn);
 }
 
 int p2m_alloc_table(struct domain *d)
-- 
1.7.10.4


* [RFC 02/19] xen: guestcopy: Provide an helper to copy string from guest
  2014-06-16 16:17 [RFC 00/19] xe/arm: Add support for non-pci passthrough Julien Grall
  2014-06-16 16:17 ` [RFC 01/19] xen/arm: guest_physmap_remove_page: Print a warning if we fail to unmap the page Julien Grall
@ 2014-06-16 16:17 ` Julien Grall
  2014-06-17  8:01   ` Jan Beulich
  2014-06-16 16:17 ` [RFC 03/19] xen/arm: follow-up to allow DOM0 manage IRQ and MMIO Julien Grall
                   ` (16 subsequent siblings)
  18 siblings, 1 reply; 122+ messages in thread
From: Julien Grall @ 2014-06-16 16:17 UTC (permalink / raw)
  To: xen-devel
  Cc: Keir Fraser, ian.campbell, tim, Julien Grall, Ian Jackson,
	stefano.stabellini, Jan Beulich, Daniel De Graaf

Flask code already provides a helper to copy a string from a guest. In a later
patch, the new DT hypercalls will need a similar function.

To avoid code duplication, move the flask helper (flask_copyin_string) to
common code:
    - Rename it to copy_string_from_guest
    - Update the arguments. The size provided by the hypercall is unsigned,
    not signed
    - Add a comment to explain the extra +1

Signed-off-by: Julien Grall <julien.grall@linaro.org>
Cc: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Keir Fraser <keir@xen.org>
Cc: Tim Deegan <tim@xen.org>
---
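For readers unfamiliar with the helper, here is a minimal userspace analogue
of the "extra +1" logic. It is a sketch under assumptions, not the Xen code:
copy_string_sketch() is a hypothetical name, memcpy() stands in for
copy_from_guest() (so the -EFAULT path is not modelled), and malloc() stands
in for xmalloc_array().

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Copy `size` bytes that may lack a terminating NUL into a freshly
 * allocated, always NUL-terminated buffer. The caller frees *buf. */
static int copy_string_sketch(const char *u_buf, char **buf,
                              unsigned long size, unsigned long max_size)
{
    char *tmp;

    assert(u_buf != NULL && buf != NULL);

    if ( size > max_size )
        return -ENOENT;

    /* One extra byte: the source is not trusted to be a valid string. */
    tmp = malloc(size + 1);
    if ( !tmp )
        return -ENOMEM;

    memcpy(tmp, u_buf, size);
    tmp[size] = '\0';

    *buf = tmp;
    return 0;
}
```

A caller passes the raw bytes and their length, then treats *buf as an
ordinary C string and frees it when done.
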
 xen/common/Makefile            |    1 +
 xen/common/guestcopy.c         |   28 ++++++++++++++++++++++++++++
 xen/include/xen/guest_access.h |    5 +++++
 xen/xsm/flask/flask_op.c       |   29 +++--------------------------
 4 files changed, 37 insertions(+), 26 deletions(-)
 create mode 100644 xen/common/guestcopy.c

diff --git a/xen/common/Makefile b/xen/common/Makefile
index 3683ae3..627b6e3 100644
--- a/xen/common/Makefile
+++ b/xen/common/Makefile
@@ -9,6 +9,7 @@ obj-y += event_2l.o
 obj-y += event_channel.o
 obj-y += event_fifo.o
 obj-y += grant_table.o
+obj-y += guestcopy.o
 obj-y += irq.o
 obj-y += kernel.o
 obj-y += keyhandler.o
diff --git a/xen/common/guestcopy.c b/xen/common/guestcopy.c
new file mode 100644
index 0000000..13bde81
--- /dev/null
+++ b/xen/common/guestcopy.c
@@ -0,0 +1,28 @@
+#include <xen/config.h>
+#include <xen/lib.h>
+#include <xen/guest_access.h>
+
+int copy_string_from_guest(XEN_GUEST_HANDLE(char) u_buf, char **buf,
+                           unsigned long size, unsigned long max_size)
+{
+    char *tmp;
+
+    if ( size > max_size )
+        return -ENOENT;
+
+    /* Add an extra +1 to append \0. We can't assume the guest will
+     * provide a valid string */
+    tmp = xmalloc_array(char, size + 1);
+    if ( !tmp )
+        return -ENOMEM;
+
+    if ( copy_from_guest(tmp, u_buf, size) )
+    {
+        xfree(tmp);
+        return -EFAULT;
+    }
+    tmp[size] = 0;
+
+    *buf = tmp;
+    return 0;
+}
diff --git a/xen/include/xen/guest_access.h b/xen/include/xen/guest_access.h
index 373454e..61756ae 100644
--- a/xen/include/xen/guest_access.h
+++ b/xen/include/xen/guest_access.h
@@ -8,6 +8,8 @@
 #define __XEN_GUEST_ACCESS_H__
 
 #include <asm/guest_access.h>
+#include <xen/types.h>
+#include <public/xen.h>
 
 #define copy_to_guest(hnd, ptr, nr)                     \
     copy_to_guest_offset(hnd, 0, ptr, nr)
@@ -27,4 +29,7 @@
 #define __clear_guest(hnd, nr)                          \
     __clear_guest_offset(hnd, 0, nr)
 
+int copy_string_from_guest(XEN_GUEST_HANDLE(char) u_buf, char **buf,
+                           unsigned long size, unsigned long max_size);
+
 #endif /* __XEN_GUEST_ACCESS_H__ */
diff --git a/xen/xsm/flask/flask_op.c b/xen/xsm/flask/flask_op.c
index 7743aac..6847108 100644
--- a/xen/xsm/flask/flask_op.c
+++ b/xen/xsm/flask/flask_op.c
@@ -76,29 +76,6 @@ static int domain_has_security(struct domain *d, u32 perms)
                         perms, NULL);
 }
 
-static int flask_copyin_string(XEN_GUEST_HANDLE(char) u_buf, char **buf,
-                               size_t size, size_t max_size)
-{
-    char *tmp;
-
-    if ( size > max_size )
-        return -ENOENT;
-
-    tmp = xmalloc_array(char, size + 1);
-    if ( !tmp )
-        return -ENOMEM;
-
-    if ( copy_from_guest(tmp, u_buf, size) )
-    {
-        xfree(tmp);
-        return -EFAULT;
-    }
-    tmp[size] = 0;
-
-    *buf = tmp;
-    return 0;
-}
-
 #endif /* COMPAT */
 
 static int flask_security_user(struct xen_flask_userlist *arg)
@@ -112,7 +89,7 @@ static int flask_security_user(struct xen_flask_userlist *arg)
     if ( rv )
         return rv;
 
-    rv = flask_copyin_string(arg->u.user, &user, arg->size, PAGE_SIZE);
+    rv = copy_string_from_guest(arg->u.user, &user, arg->size, PAGE_SIZE);
     if ( rv )
         return rv;
 
@@ -227,7 +204,7 @@ static int flask_security_context(struct xen_flask_sid_context *arg)
     if ( rv )
         return rv;
 
-    rv = flask_copyin_string(arg->context, &buf, arg->size, PAGE_SIZE);
+    rv = copy_string_from_guest(arg->context, &buf, arg->size, PAGE_SIZE);
     if ( rv )
         return rv;
 
@@ -324,7 +301,7 @@ static int flask_security_resolve_bool(struct xen_flask_boolean *arg)
     if ( arg->bool_id != -1 )
         return 0;
 
-    rv = flask_copyin_string(arg->name, &name, arg->size, bool_maxstr);
+    rv = copy_string_from_guest(arg->name, &name, arg->size, bool_maxstr);
     if ( rv )
         return rv;
 
-- 
1.7.10.4


* [RFC 03/19] xen/arm: follow-up to allow DOM0 manage IRQ and MMIO
  2014-06-16 16:17 [RFC 00/19] xe/arm: Add support for non-pci passthrough Julien Grall
  2014-06-16 16:17 ` [RFC 01/19] xen/arm: guest_physmap_remove_page: Print a warning if we fail to unmap the page Julien Grall
  2014-06-16 16:17 ` [RFC 02/19] xen: guestcopy: Provide an helper to copy string from guest Julien Grall
@ 2014-06-16 16:17 ` Julien Grall
  2014-06-18 20:21   ` Stefano Stabellini
  2014-06-16 16:17 ` [RFC 04/19] xen/arm: route_irq_to_guest: Check validity of the IRQ Julien Grall
                   ` (15 subsequent siblings)
  18 siblings, 1 reply; 122+ messages in thread
From: Julien Grall @ 2014-06-16 16:17 UTC (permalink / raw)
  To: xen-devel; +Cc: stefano.stabellini, Julien Grall, tim, ian.campbell

The commit 33233c2 "arch/arm: domain build: let dom0 access I/O memory of
mapped series" fills the iomem_caps to allow DOM0 to manage the MMIO of
mapped devices.

A device can be disabled (i.e. by adding a property status="disabled" in the
device tree) because the user may want to pass this device through to a guest.
This avoids DOM0 loading (and, a few minutes later, unloading) the driver to
handle this device.

Even though we don't want to let DOM0 use this device, the domain needs
to be able to manage the MMIO/IRQ range. Rework the function map_device
(renamed to handle_device) to:

* For a given device node:
    - Give permission to manage IRQ/MMIO for this device
    - Retrieve the IRQ configuration (i.e. edge/level) from the device
    tree
* For an available device (i.e. status != disabled in the DT):
    - Assign the device to the guest if it's protected by an IOMMU
    - Map the IRQs and MMIO regions to the guest

Signed-off-by: Julien Grall <julien.grall@linaro.org>
---
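The split between "always grant permission" and "map only when available"
can be sketched as follows. This is a simplified model, not the Xen
implementation: handle_device_sketch() and struct actions are hypothetical,
and counters stand in for the real permission, routing and mapping calls.

```c
#include <assert.h>
#include <stdbool.h>

/* Records what the sketch decided to do for one device node. */
struct actions {
    unsigned int irq_permitted;
    unsigned int irq_routed;
    unsigned int mmio_permitted;
    unsigned int mmio_mapped;
    bool iommu_assigned;
};

static struct actions handle_device_sketch(unsigned int nirq,
                                           unsigned int naddr,
                                           bool is_protected, bool map)
{
    struct actions a = { 0, 0, 0, 0, false };
    unsigned int i;

    /* IOMMU assignment only happens for available, protected devices. */
    if ( is_protected && map )
        a.iommu_assigned = true;

    for ( i = 0; i < nirq; i++ )
    {
        a.irq_permitted++;          /* always granted */
        if ( map )
            a.irq_routed++;         /* only for available devices */
    }

    for ( i = 0; i < naddr; i++ )
    {
        a.mmio_permitted++;         /* always granted */
        if ( map )
            a.mmio_mapped++;        /* only for available devices */
    }

    return a;
}
```

With map == false (a node marked status="disabled"), the domain ends up with
permissions on every IRQ and MMIO range of the node but nothing is routed,
mapped or assigned.
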
 xen/arch/arm/domain_build.c |   66 ++++++++++++++++++++++++++++---------------
 1 file changed, 44 insertions(+), 22 deletions(-)

diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
index c3783cf..6a711cc 100644
--- a/xen/arch/arm/domain_build.c
+++ b/xen/arch/arm/domain_build.c
@@ -680,8 +680,14 @@ static int make_timer_node(const struct domain *d, void *fdt,
     return res;
 }
 
-/* Map the device in the domain */
-static int map_device(struct domain *d, struct dt_device_node *dev)
+/* For a given device node:
+ *  - Give permission to the guest to manage IRQ and MMIO range
+ *  - Retrieve the IRQ configuration (i.e edge/level) from device tree
+ * When the device is available:
+ *  - Assign the device to the guest if it's protected by an IOMMU
+ *  - Map the IRQs and iomem regions to DOM0
+ */
+static int handle_device(struct domain *d, struct dt_device_node *dev, bool_t map)
 {
     unsigned int nirq;
     unsigned int naddr;
@@ -694,9 +700,10 @@ static int map_device(struct domain *d, struct dt_device_node *dev)
     nirq = dt_number_of_irq(dev);
     naddr = dt_number_of_address(dev);
 
-    DPRINT("%s nirq = %d naddr = %u\n", dt_node_full_name(dev), nirq, naddr);
+    DPRINT("%s map = %d nirq = %d naddr = %u\n", dt_node_full_name(dev),
+           map, nirq, naddr);
 
-    if ( dt_device_is_protected(dev) )
+    if ( dt_device_is_protected(dev) && map )
     {
         DPRINT("%s setup iommu\n", dt_node_full_name(dev));
         res = iommu_assign_dt_device(d, dev);
@@ -708,7 +715,7 @@ static int map_device(struct domain *d, struct dt_device_node *dev)
         }
     }
 
-    /* Map IRQs */
+    /* Give permission and  map IRQs */
     for ( i = 0; i < nirq; i++ )
     {
         res = dt_device_get_raw_irq(dev, i, &rirq);
@@ -741,16 +748,28 @@ static int map_device(struct domain *d, struct dt_device_node *dev)
         irq = res;
 
         DPRINT("irq %u = %u\n", i, irq);
-        res = route_irq_to_guest(d, irq, dt_node_name(dev));
+
+        res = irq_permit_access(d, irq);
         if ( res )
         {
-            printk(XENLOG_ERR "Unable to route IRQ %u to domain %u\n",
-                   irq, d->domain_id);
+            printk(XENLOG_ERR "Unable to permit to dom%u access to IRQ %u\n",
+                   d->domain_id, irq);
             return res;
         }
+
+        if ( map )
+        {
+            res = route_irq_to_guest(d, irq, dt_node_name(dev));
+            if ( res )
+            {
+                printk(XENLOG_ERR "Unable to route IRQ %u to domain %u\n",
+                       irq, d->domain_id);
+                return res;
+            }
+        }
     }
 
-    /* Map the address ranges */
+    /* Give permission and map MMIOs */
     for ( i = 0; i < naddr; i++ )
     {
         res = dt_device_get_address(dev, i, &addr, &size);
@@ -774,17 +793,21 @@ static int map_device(struct domain *d, struct dt_device_node *dev)
                    addr & PAGE_MASK, PAGE_ALIGN(addr + size) - 1);
             return res;
         }
-        res = map_mmio_regions(d,
-                               paddr_to_pfn(addr & PAGE_MASK),
-                               DIV_ROUND_UP(size, PAGE_SIZE),
-                               paddr_to_pfn(addr & PAGE_MASK));
-        if ( res )
+
+        if ( map )
         {
-            printk(XENLOG_ERR "Unable to map 0x%"PRIx64
-                   " - 0x%"PRIx64" in domain %d\n",
-                   addr & PAGE_MASK, PAGE_ALIGN(addr + size) - 1,
-                   d->domain_id);
-            return res;
+            res = map_mmio_regions(d,
+                                   paddr_to_pfn(addr & PAGE_MASK),
+                                   DIV_ROUND_UP(size, PAGE_SIZE),
+                                   paddr_to_pfn(addr & PAGE_MASK));
+            if ( res )
+            {
+                printk(XENLOG_ERR "Unable to map 0x%"PRIx64
+                       " - 0x%"PRIx64" in domain %d\n",
+                       addr & PAGE_MASK, PAGE_ALIGN(addr + size) - 1,
+                       d->domain_id);
+                return res;
+            }
         }
     }
 
@@ -865,10 +888,9 @@ static int handle_node(struct domain *d, struct kernel_info *kinfo,
      *  property. Therefore these device doesn't need to be mapped. This
      *  solution can be use later for pass through.
      */
-    if ( !dt_device_type_is_equal(node, "memory") &&
-         dt_device_is_available(node) )
+    if ( !dt_device_type_is_equal(node, "memory") )
     {
-        res = map_device(d, node);
+        res = handle_device(d, node, dt_device_is_available(node));
 
         if ( res )
             return res;
-- 
1.7.10.4


* [RFC 04/19] xen/arm: route_irq_to_guest: Check validity of the IRQ
  2014-06-16 16:17 [RFC 00/19] xe/arm: Add support for non-pci passthrough Julien Grall
                   ` (2 preceding siblings ...)
  2014-06-16 16:17 ` [RFC 03/19] xen/arm: follow-up to allow DOM0 manage IRQ and MMIO Julien Grall
@ 2014-06-16 16:17 ` Julien Grall
  2014-06-18 18:52   ` Stefano Stabellini
  2014-07-03 11:04   ` Ian Campbell
  2014-06-16 16:17 ` [RFC 05/19] xen/arm: Release IRQ routed to a domain when it's destroying Julien Grall
                   ` (14 subsequent siblings)
  18 siblings, 2 replies; 122+ messages in thread
From: Julien Grall @ 2014-06-16 16:17 UTC (permalink / raw)
  To: xen-devel; +Cc: stefano.stabellini, Julien Grall, tim, ian.campbell

Currently Xen only supports routing SPIs to a guest. Add a function
is_routable_irq to check whether a given IRQ can be routed to the guest.

Secondly, make sure the vIRQ (which is currently the same as the pIRQ) is
not greater than the number of IRQs handled by the vGIC.

Finally, desc->arch.type, which contains the IRQ type (i.e. level/edge), must
be correctly configured beforehand. The IRQ type won't be configured when:
    - the device has been blacklisted for the current platform
    - the IRQ has not been described in the device tree

I think we can safely assume that a user will never ask to route
such an IRQ to the guest.

Also, use XENLOG_G_ERR in the error messages within the function as it will
later be called from a guest.

Signed-off-by: Julien Grall <julien.grall@linaro.org>
---
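The three checks described above can be modelled in standalone C. This is a
sketch under assumptions: the *_sketch names are hypothetical, the value 32
for the number of local IRQs (SGIs plus PPIs) is assumed, and the sketch
treats the vGIC IRQ count as an exclusive upper bound.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define NR_LOCAL_IRQS_SKETCH 32u   /* assumed: SGIs + PPIs use IDs 0..31 */

/* Only SPIs (IDs at or above the local range) can be routed. */
static bool is_routable_irq_sketch(unsigned int irq)
{
    return irq >= NR_LOCAL_IRQS_SKETCH;
}

/* nr_spis: number of SPIs exposed by the (hypothetical) virtual GIC.
 * type_configured: whether the IRQ type was set from the device tree. */
static int check_route_sketch(unsigned int irq, unsigned int nr_spis,
                              bool type_configured)
{
    unsigned int num_irqs = nr_spis + NR_LOCAL_IRQS_SKETCH;

    if ( !is_routable_irq_sketch(irq) )
        return -EINVAL;

    if ( irq >= num_irqs )          /* valid IDs are 0 .. num_irqs - 1 */
        return -EINVAL;

    if ( !type_configured )
        return -EIO;

    return 0;
}
```
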
 xen/arch/arm/irq.c        |   32 +++++++++++++++++++++++++++++---
 xen/include/asm-arm/gic.h |    2 ++
 xen/include/asm-arm/irq.h |    6 ++++++
 3 files changed, 37 insertions(+), 3 deletions(-)

diff --git a/xen/arch/arm/irq.c b/xen/arch/arm/irq.c
index 9c141bc..4e51fee 100644
--- a/xen/arch/arm/irq.c
+++ b/xen/arch/arm/irq.c
@@ -361,6 +361,10 @@ err:
     return rc;
 }
 
+/* Route an IRQ to a specific guest.
+ * For now the vIRQ is equal to the pIRQ and only SPIs are routable to
+ * the guest.
+ */
 int route_irq_to_guest(struct domain *d, unsigned int irq,
                        const char * devname)
 {
@@ -369,6 +373,20 @@ int route_irq_to_guest(struct domain *d, unsigned int irq,
     unsigned long flags;
     int retval = 0;
 
+    if ( !is_routable_irq(irq) )
+    {
+        dprintk(XENLOG_G_ERR, "the IRQ%u is not routable\n", irq);
+        return -EINVAL;
+    }
+
+    if ( irq >= vgic_num_irqs(d) )
+    {
+        dprintk(XENLOG_G_ERR,
+                "the IRQ number %u is too high for domain %u (max = %u)\n",
+                irq, d->domain_id, vgic_num_irqs(d));
+        return -EINVAL;
+    }
+
     action = xmalloc(struct irqaction);
     if (!action)
         return -ENOMEM;
@@ -379,6 +397,14 @@ int route_irq_to_guest(struct domain *d, unsigned int irq,
 
     spin_lock_irqsave(&desc->lock, flags);
 
+    if ( desc->arch.type == DT_IRQ_TYPE_INVALID )
+    {
+        dprintk(XENLOG_G_ERR, "IRQ %u has not been configured\n",
+                irq);
+        retval = -EIO;
+        goto out;
+    }
+
     /* If the IRQ is already used by someone
      *  - If it's the same domain -> Xen doesn't need to update the IRQ desc
      *  - Otherwise -> For now, don't allow the IRQ to be shared between
@@ -392,10 +418,10 @@ int route_irq_to_guest(struct domain *d, unsigned int irq,
             goto out;
 
         if ( desc->status & IRQ_GUEST )
-            printk(XENLOG_ERR "ERROR: IRQ %u is already used by domain %u\n",
-                   irq, ad->domain_id);
+            dprintk(XENLOG_G_ERR, "IRQ %u is already used by domain %u\n",
+                    irq, ad->domain_id);
         else
-            printk(XENLOG_ERR "ERROR: IRQ %u is already used by Xen\n", irq);
+            dprintk(XENLOG_G_ERR, "IRQ %u is already used by Xen\n", irq);
         retval = -EBUSY;
         goto out;
     }
diff --git a/xen/include/asm-arm/gic.h b/xen/include/asm-arm/gic.h
index 8e37ccf..6e7375c 100644
--- a/xen/include/asm-arm/gic.h
+++ b/xen/include/asm-arm/gic.h
@@ -163,6 +163,8 @@
 #define DT_MATCH_GIC    DT_MATCH_COMPATIBLE("arm,cortex-a15-gic"), \
                         DT_MATCH_COMPATIBLE("arm,cortex-a7-gic")
 
+#define vgic_num_irqs(d)    ((d)->arch.vgic.nr_lines + 32)
+
 extern int domain_vgic_init(struct domain *d);
 extern void domain_vgic_free(struct domain *d);
 
diff --git a/xen/include/asm-arm/irq.h b/xen/include/asm-arm/irq.h
index e567f71..63926a5 100644
--- a/xen/include/asm-arm/irq.h
+++ b/xen/include/asm-arm/irq.h
@@ -37,6 +37,12 @@ void do_IRQ(struct cpu_user_regs *regs, unsigned int irq, int is_fiq);
 
 #define domain_pirq_to_irq(d, pirq) (pirq)
 
+static inline bool_t is_routable_irq(unsigned int irq)
+{
+    /* For now, we can only route SPIs to the guest */
+    return (irq >= NR_LOCAL_IRQS);
+}
+
 void init_IRQ(void);
 void init_secondary_IRQ(void);
 
-- 
1.7.10.4


* [RFC 05/19] xen/arm: Release IRQ routed to a domain when it's destroying
  2014-06-16 16:17 [RFC 00/19] xe/arm: Add support for non-pci passthrough Julien Grall
                   ` (3 preceding siblings ...)
  2014-06-16 16:17 ` [RFC 04/19] xen/arm: route_irq_to_guest: Check validity of the IRQ Julien Grall
@ 2014-06-16 16:17 ` Julien Grall
  2014-06-18 18:08   ` Stefano Stabellini
  2014-06-16 16:17 ` [RFC 06/19] xen/arm: Implement hypercall PHYSDEVOP_map_pirq Julien Grall
                   ` (13 subsequent siblings)
  18 siblings, 1 reply; 122+ messages in thread
From: Julien Grall @ 2014-06-16 16:17 UTC (permalink / raw)
  To: xen-devel; +Cc: stefano.stabellini, Julien Grall, tim, ian.campbell

Xen has to release the IRQs routed to a domain so that they can be reused
later. Currently only SPIs can be routed to the guest, so we only need to
browse the SPIs for a specific domain.

Furthermore, a guest can crash and leave an IRQ in an incorrect state (i.e.
it has not been EOIed). Add a function to reset a given IRQ so that Xen can
route the IRQ again in the future.

Also, reset desc->handler to no_irq_type. This will let us know if we
did something wrong with the IRQ management.

Signed-off-by: Julien Grall <julien.grall@linaro.org>
---
 xen/arch/arm/gic.c        |   12 ++++++++++++
 xen/arch/arm/irq.c        |    8 ++++++++
 xen/arch/arm/vgic.c       |   10 ++++++++++
 xen/include/asm-arm/gic.h |    3 +++
 4 files changed, 33 insertions(+)

diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
index 11e53af..42fc3bc 100644
--- a/xen/arch/arm/gic.c
+++ b/xen/arch/arm/gic.c
@@ -928,6 +928,18 @@ int gicv_setup(struct domain *d)
 
 }
 
+/* The guest may not have EOIed the IRQ.
+ * Be sure to reset correctly the IRQ.
+ */
+void gic_reset_guest_irq(struct irq_desc *desc)
+{
+    ASSERT(spin_is_locked(&desc->lock));
+    ASSERT(desc->status & IRQ_GUEST);
+
+    if ( desc->status & IRQ_INPROGRESS )
+        GICC[GICC_DIR] = desc->irq;
+}
+
 static void maintenance_interrupt(int irq, void *dev_id, struct cpu_user_regs *regs)
 {
     /* 
diff --git a/xen/arch/arm/irq.c b/xen/arch/arm/irq.c
index 4e51fee..e44a90f 100644
--- a/xen/arch/arm/irq.c
+++ b/xen/arch/arm/irq.c
@@ -274,7 +274,15 @@ void release_irq(unsigned int irq, const void *dev_id)
     if ( !desc->action )
     {
         desc->handler->shutdown(desc);
+
+        if ( desc->status & IRQ_GUEST )
+        {
+            gic_reset_guest_irq(desc);
+            desc->status &= ~IRQ_INPROGRESS;
+        }
+
         desc->status &= ~IRQ_GUEST;
+        desc->handler = &no_irq_type;
     }
 
     spin_unlock_irqrestore(&desc->lock,flags);
diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c
index cb8df3a..e451324 100644
--- a/xen/arch/arm/vgic.c
+++ b/xen/arch/arm/vgic.c
@@ -112,6 +112,16 @@ int domain_vgic_init(struct domain *d)
 
 void domain_vgic_free(struct domain *d)
 {
+    int i;
+
+    for ( i = NR_LOCAL_IRQS; i < d->arch.vgic.nr_lines; i++ )
+    {
+        struct irq_desc *desc = d->arch.vgic.pending_irqs[i].desc;
+
+        if ( desc )
+            release_irq(desc->irq, d);
+    }
+
     xfree(d->arch.vgic.shared_irqs);
     xfree(d->arch.vgic.pending_irqs);
 }
diff --git a/xen/include/asm-arm/gic.h b/xen/include/asm-arm/gic.h
index 6e7375c..841d845 100644
--- a/xen/include/asm-arm/gic.h
+++ b/xen/include/asm-arm/gic.h
@@ -228,6 +228,9 @@ int gic_irq_xlate(const u32 *intspec, unsigned int intsize,
                   unsigned int *out_hwirq, unsigned int *out_type);
 void gic_clear_lrs(struct vcpu *v);
 
+/* Reset an IRQ passthrough to a guest */
+void gic_reset_guest_irq(struct irq_desc *desc);
+
 #endif /* __ASSEMBLY__ */
 #endif
 
-- 
1.7.10.4


* [RFC 06/19] xen/arm: Implement hypercall PHYSDEVOP_map_pirq
  2014-06-16 16:17 [RFC 00/19] xe/arm: Add support for non-pci passthrough Julien Grall
                   ` (4 preceding siblings ...)
  2014-06-16 16:17 ` [RFC 05/19] xen/arm: Release IRQ routed to a domain when it's destroying Julien Grall
@ 2014-06-16 16:17 ` Julien Grall
  2014-06-18 19:24   ` Stefano Stabellini
  2014-06-16 16:17 ` [RFC 07/19] xen/dts: Use unsigned int for MMIO and IRQ index Julien Grall
                   ` (12 subsequent siblings)
  18 siblings, 1 reply; 122+ messages in thread
From: Julien Grall @ 2014-06-16 16:17 UTC (permalink / raw)
  To: xen-devel; +Cc: stefano.stabellini, Julien Grall, tim, ian.campbell

The physdev sub-hypercall PHYSDEVOP_map_pirq allows the toolstack to route
a physical IRQ to the guest (via the config option "irqs" for xl).
For now, we only allow SPIs to be mapped to the guest.
The type MAP_PIRQ_TYPE_GSI is used for this purpose.

The virtual IRQ number is equal to the physical one. This avoids adding
logic in Xen to allocate the vIRQ number. The drawback is that we
unconditionally allocate the same number of SPIs as the host. This value will
never be more than 1024 with GICv2.

Signed-off-by: Julien Grall <julien.grall@linaro.org>

---
    I'm wondering if we should introduce an alias of MAP_PIRQ_TYPE_GSI
    for ARM. It would be less confusing for the user.
---
 xen/arch/arm/physdev.c |   77 ++++++++++++++++++++++++++++++++++++++++++++++--
 xen/arch/arm/vgic.c    |    5 +---
 2 files changed, 76 insertions(+), 6 deletions(-)

diff --git a/xen/arch/arm/physdev.c b/xen/arch/arm/physdev.c
index 61b4a18..d17589c 100644
--- a/xen/arch/arm/physdev.c
+++ b/xen/arch/arm/physdev.c
@@ -8,13 +8,86 @@
 #include <xen/types.h>
 #include <xen/lib.h>
 #include <xen/errno.h>
+#include <xen/iocap.h>
+#include <xen/guest_access.h>
+#include <xsm/xsm.h>
+#include <asm/current.h>
 #include <asm/hypercall.h>
+#include <public/physdev.h>
+
+static int physdev_map_pirq(domid_t domid, int type, int index, int *pirq_p)
+{
+    struct domain *d;
+    int ret;
+    int irq = index;
+
+    d = rcu_lock_domain_by_any_id(domid);
+    if ( d == NULL )
+        return -ESRCH;
+
+    ret = xsm_map_domain_pirq(XSM_TARGET, d);
+    if ( ret )
+        goto free_domain;
+
+    /* For now we only support GSI */
+    if ( type != MAP_PIRQ_TYPE_GSI )
+    {
+        ret = -EINVAL;
+        dprintk(XENLOG_G_ERR, "dom%u: wrong map_pirq type 0x%x\n",
+                d->domain_id, type);
+        goto free_domain;
+    }
+
+    if ( !is_routable_irq(irq) )
+    {
+        ret = -EINVAL;
+        dprintk(XENLOG_G_ERR, "IRQ%u is not routable to a guest\n", irq);
+        goto free_domain;
+    }
+
+    ret = -EPERM;
+    if ( !irq_access_permitted(current->domain, irq) )
+        goto free_domain;
+
+    ret = route_irq_to_guest(d, irq, "routed IRQ");
+
+    /* GSIs are mapped 1:1 to the guest */
+    if ( !ret )
+        *pirq_p = irq;
+
+free_domain:
+    rcu_unlock_domain(d);
+
+    return ret;
+}
 
 
 int do_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
 {
-    printk("%s %d cmd=%d: not implemented yet\n", __func__, __LINE__, cmd);
-    return -ENOSYS;
+    int ret;
+
+    switch ( cmd )
+    {
+    case PHYSDEVOP_map_pirq:
+        {
+            physdev_map_pirq_t map;
+
+            ret = -EFAULT;
+            if ( copy_from_guest(&map, arg, 1) != 0 )
+                break;
+
+            ret = physdev_map_pirq(map.domid, map.type, map.index, &map.pirq);
+
+            if ( __copy_to_guest(arg, &map, 1) )
+                ret = -EFAULT;
+        }
+        break;
+    default:
+        ret = -ENOSYS;
+        break;
+    }
+
+    return ret;
 }
 
 /*
diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c
index e451324..c18b2ca 100644
--- a/xen/arch/arm/vgic.c
+++ b/xen/arch/arm/vgic.c
@@ -82,10 +82,7 @@ int domain_vgic_init(struct domain *d)
     /* Currently nr_lines in vgic and gic doesn't have the same meanings
      * Here nr_lines = number of SPIs
      */
-    if ( is_hardware_domain(d) )
-        d->arch.vgic.nr_lines = gic_number_lines() - 32;
-    else
-        d->arch.vgic.nr_lines = 0; /* We don't need SPIs for the guest */
+    d->arch.vgic.nr_lines = gic_number_lines() - 32;
 
     d->arch.vgic.shared_irqs =
         xzalloc_array(struct vgic_irq_rank, DOMAIN_NR_RANKS(d));
-- 
1.7.10.4


* [RFC 07/19] xen/dts: Use unsigned int for MMIO and IRQ index
  2014-06-16 16:17 [RFC 00/19] xe/arm: Add support for non-pci passthrough Julien Grall
                   ` (5 preceding siblings ...)
  2014-06-16 16:17 ` [RFC 06/19] xen/arm: Implement hypercall PHYSDEVOP_map_pirq Julien Grall
@ 2014-06-16 16:17 ` Julien Grall
  2014-06-18 18:54   ` Stefano Stabellini
  2014-06-16 16:17 ` [RFC 08/19] xen/dts: Provide an helper to get a DT node from a path provided by a guest Julien Grall
                   ` (11 subsequent siblings)
  18 siblings, 1 reply; 122+ messages in thread
From: Julien Grall @ 2014-06-16 16:17 UTC (permalink / raw)
  To: xen-devel; +Cc: stefano.stabellini, Julien Grall, tim, ian.campbell

There is no reason to use a signed integer for an index. Furthermore, this will
avoid possible issues when these functions are exposed to the guest
via new hypercalls.
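
To illustrate the concern, here is a minimal standalone sketch (not part of
this patch; index_ok_signed and index_ok_unsigned are made-up names) of how an
upper-bound-only check on a signed index accepts a negative value coming from
a guest, while the same check on an unsigned index rejects it:

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper: upper-bound check on a signed index.
 * A negative index is smaller than any valid count, so it is accepted
 * and would later be used to read before the start of the array.
 */
bool index_ok_signed(int index, int count)
{
    assert(count >= 0);
    return index < count;
}

/*
 * Hypothetical helper: the same check on an unsigned index.
 * A guest passing -1 ends up with UINT_MAX, which fails the comparison.
 */
bool index_ok_unsigned(unsigned int index, unsigned int count)
{
    return index < count;
}
```

With an unsigned index a single comparison is enough to validate a value
supplied by the guest.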

Signed-off-by: Julien Grall <julien.grall@linaro.org>
---
 xen/common/device_tree.c      |   10 +++++-----
 xen/include/xen/device_tree.h |    7 ++++---
 2 files changed, 9 insertions(+), 8 deletions(-)

diff --git a/xen/common/device_tree.c b/xen/common/device_tree.c
index f0b17a3..4736e0d 100644
--- a/xen/common/device_tree.c
+++ b/xen/common/device_tree.c
@@ -876,7 +876,7 @@ static const struct dt_bus *dt_match_bus(const struct dt_device_node *np)
 }
 
 static const __be32 *dt_get_address(const struct dt_device_node *dev,
-                                    int index, u64 *size,
+                                    unsigned index, u64 *size,
                                     unsigned int *flags)
 {
     const __be32 *prop;
@@ -1063,7 +1063,7 @@ bail:
 }
 
 /* dt_device_address - Translate device tree address and return it */
-int dt_device_get_address(const struct dt_device_node *dev, int index,
+int dt_device_get_address(const struct dt_device_node *dev, unsigned int index,
                           u64 *addr, u64 *size)
 {
     const __be32 *addrp;
@@ -1386,7 +1386,7 @@ fail:
     return -EINVAL;
 }
 
-int dt_device_get_raw_irq(const struct dt_device_node *device, int index,
+int dt_device_get_raw_irq(const struct dt_device_node *device, uint32_t index,
                           struct dt_raw_irq *out_irq)
 {
     const struct dt_device_node *p;
@@ -1394,7 +1394,7 @@ int dt_device_get_raw_irq(const struct dt_device_node *device, int index,
     u32 intsize, intlen;
     int res = -EINVAL;
 
-    dt_dprintk("dt_device_get_raw_irq: dev=%s, index=%d\n",
+    dt_dprintk("dt_device_get_raw_irq: dev=%s, index=%u\n",
                device->full_name, index);
 
     /* Get the interrupts property */
@@ -1445,7 +1445,7 @@ int dt_irq_translate(const struct dt_raw_irq *raw,
                         &out_irq->irq, &out_irq->type);
 }
 
-int dt_device_get_irq(const struct dt_device_node *device, int index,
+int dt_device_get_irq(const struct dt_device_node *device, uint32_t index,
                       struct dt_irq *out_irq)
 {
     struct dt_raw_irq raw;
diff --git a/xen/include/xen/device_tree.h b/xen/include/xen/device_tree.h
index 25db076..e413447 100644
--- a/xen/include/xen/device_tree.h
+++ b/xen/include/xen/device_tree.h
@@ -502,7 +502,7 @@ const struct dt_device_node *dt_get_parent(const struct dt_device_node *node);
  * This function resolves an address, walking the tree, for a give
  * device-tree node. It returns 0 on success.
  */
-int dt_device_get_address(const struct dt_device_node *dev, int index,
+int dt_device_get_address(const struct dt_device_node *dev, unsigned int index,
                           u64 *addr, u64 *size);
 
 /**
@@ -532,7 +532,7 @@ unsigned int dt_number_of_address(const struct dt_device_node *device);
  * This function resolves an interrupt, walking the tree, for a given
  * device-tree node. It's the high level pendant to dt_device_get_raw_irq().
  */
-int dt_device_get_irq(const struct dt_device_node *device, int index,
+int dt_device_get_irq(const struct dt_device_node *device, unsigned int index,
                       struct dt_irq *irq);
 
 /**
@@ -544,7 +544,8 @@ int dt_device_get_irq(const struct dt_device_node *device, int index,
  * This function resolves an interrupt for a device, no translation is
  * made. dt_irq_translate can be called after.
  */
-int dt_device_get_raw_irq(const struct dt_device_node *device, int index,
+int dt_device_get_raw_irq(const struct dt_device_node *device,
+                          unsigned int index,
                           struct dt_raw_irq *irq);
 
 /**
-- 
1.7.10.4


* [RFC 08/19] xen/dts: Provide an helper to get a DT node from a path provided by a guest
  2014-06-16 16:17 [RFC 00/19] xe/arm: Add support for non-pci passthrough Julien Grall
                   ` (6 preceding siblings ...)
  2014-06-16 16:17 ` [RFC 07/19] xen/dts: Use unsigned int for MMIO and IRQ index Julien Grall
@ 2014-06-16 16:17 ` Julien Grall
  2014-07-03 11:30   ` Ian Campbell
  2014-06-16 16:17 ` [RFC 09/19] xen/dts: Add hypercalls to retrieve device node information Julien Grall
                   ` (10 subsequent siblings)
  18 siblings, 1 reply; 122+ messages in thread
From: Julien Grall @ 2014-06-16 16:17 UTC (permalink / raw)
  To: xen-devel; +Cc: stefano.stabellini, Julien Grall, tim, ian.campbell

Signed-off-by: Julien Grall <julien.grall@linaro.org>
---
 xen/common/device_tree.c      |   19 +++++++++++++++++++
 xen/include/xen/device_tree.h |   17 +++++++++++++++++
 2 files changed, 36 insertions(+)

diff --git a/xen/common/device_tree.c b/xen/common/device_tree.c
index 4736e0d..fd95307 100644
--- a/xen/common/device_tree.c
+++ b/xen/common/device_tree.c
@@ -13,6 +13,7 @@
 #include <xen/config.h>
 #include <xen/types.h>
 #include <xen/init.h>
+#include <xen/guest_access.h>
 #include <xen/device_tree.h>
 #include <xen/kernel.h>
 #include <xen/lib.h>
@@ -661,6 +662,24 @@ struct dt_device_node *dt_find_node_by_path(const char *path)
     return np;
 }
 
+int dt_find_node_by_gpath(XEN_GUEST_HANDLE(char) u_path, uint32_t u_plen,
+                          struct dt_device_node **node)
+{
+    char *path;
+    int ret;
+
+    ret = copy_string_from_guest(u_path, &path, u_plen,
+                                 DEVICE_TREE_MAX_PATHLEN);
+    if ( ret )
+        return ret;
+
+    *node = dt_find_node_by_path(path);
+
+    xfree(path);
+
+    return (*node == NULL) ? -ESRCH : 0;
+}
+
 struct dt_device_node *dt_find_node_by_alias(const char *alias)
 {
     const struct dt_alias_prop *app;
diff --git a/xen/include/xen/device_tree.h b/xen/include/xen/device_tree.h
index e413447..bb33e54 100644
--- a/xen/include/xen/device_tree.h
+++ b/xen/include/xen/device_tree.h
@@ -20,6 +20,9 @@
 
 #define DEVICE_TREE_MAX_DEPTH 16
 
+/* This limit is used by the hypercalls to restrict the size of the path */
+#define DEVICE_TREE_MAX_PATHLEN 1024
+
 #define NR_MEM_BANKS 8
 
 #define MOD_XEN    0
@@ -484,6 +487,20 @@ struct dt_device_node *dt_find_node_by_alias(const char *alias);
  */
 struct dt_device_node *dt_find_node_by_path(const char *path);
 
+
+/**
+ * dt_find_node_by_gpath - Same as dt_find_node_by_path but retrieve the
+ * path from the guest
+ *
+ * @u_path: Xen Guest handle to the buffer containing the path
+ * @u_plen: Length of the buffer
+ * @node: TODO
+ *
+ * Return 0 on success, otherwise -errno
+ */
+int dt_find_node_by_gpath(XEN_GUEST_HANDLE(char) u_path, uint32_t u_plen,
+                          struct dt_device_node **node);
+
 /**
  * dt_get_parent - Get a node's parent if any
  * @node: Node to get parent
-- 
1.7.10.4


* [RFC 09/19] xen/dts: Add hypercalls to retrieve device node information
  2014-06-16 16:17 [RFC 00/19] xe/arm: Add support for non-pci passthrough Julien Grall
                   ` (7 preceding siblings ...)
  2014-06-16 16:17 ` [RFC 08/19] xen/dts: Provide an helper to get a DT node from a path provided by a guest Julien Grall
@ 2014-06-16 16:17 ` Julien Grall
  2014-06-18 19:38   ` Stefano Stabellini
  2014-07-03 11:33   ` Ian Campbell
  2014-06-16 16:17 ` [RFC 10/19] xen/passthrough: Introduce iommu_buildup Julien Grall
                   ` (9 subsequent siblings)
  18 siblings, 2 replies; 122+ messages in thread
From: Julien Grall @ 2014-06-16 16:17 UTC (permalink / raw)
  To: xen-devel
  Cc: ian.campbell, Stefano Stabellini, tim, Julien Grall, Ian Jackson,
	stefano.stabellini

DOM0 doesn't provide a generic way to get information about a device tree
node. If we want to do it in userspace, we will have to duplicate the
MMIO/IRQ translation from Xen. Therefore, we can let the hypervisor
do the job for us and get nearly all the information.

This new physdev operation will let the toolstack get the IRQ/MMIO regions
and the compatible string. Most device nodes can be described with only
these 3 items. If we need to add specific properties, then we will have
to implement it in userspace (one idea was to use a configuration file
describing the additional properties).

The hypercall is divided into 4 parts:
    - GET_INFO: get the number of IRQs/MMIOs and the size of the
    compatible string;
    - GET_IRQ: get the IRQ by index. If the IRQ is not routable (i.e. not
    an SPI), the hypercall fails with -EINVAL;
    - GET_MMIO: get the MMIO range by index. If the base or the size
    is not page-aligned, the hypercall fails with -EINVAL;
    - GET_COMPAT: get the compatible string.

The information is only accessible if the device is not used by Xen
and is protected by an IOMMU.
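
The GET_IRQ and GET_MMIO checks described above can be sketched as standalone
C. This is only an illustration, not the hypervisor code: the sketch_* names
are made up, 4KB pages are assumed, "routable" is taken to mean "is an SPI"
(interrupt ID 32 and above on the GIC), and nr_lines is assumed to be the
total number of interrupt IDs including the 32 private ones:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12                      /* assumption: 4KB pages */
#define SKETCH_PAGE_MASK  ((UINT64_C(1) << SKETCH_PAGE_SHIFT) - 1)
#define SKETCH_FIRST_SPI  32U                     /* IDs 0-31 are SGIs/PPIs */

/* Result of translating an MMIO byte range into a frame range. */
struct sketch_frames {
    int rc;           /* 0 or -EINVAL */
    uint64_t mfn;     /* first frame number */
    uint64_t nr_mfn;  /* number of frames */
};

/* GET_IRQ check: only SPIs below the number of lines are accepted. */
bool sketch_is_spi(unsigned int irq, unsigned int nr_lines)
{
    assert(nr_lines >= SKETCH_FIRST_SPI);
    return irq >= SKETCH_FIRST_SPI && irq < nr_lines;
}

/*
 * GET_MMIO check: refuse a range whose base or size is not page-aligned,
 * as the rest of the page may belong to another device.
 */
struct sketch_frames sketch_mmio_to_frames(uint64_t addr, uint64_t size)
{
    struct sketch_frames f = { 0, 0, 0 };

    if ( (addr & SKETCH_PAGE_MASK) || (size & SKETCH_PAGE_MASK) )
    {
        f.rc = -EINVAL;
        return f;
    }

    f.mfn = addr >> SKETCH_PAGE_SHIFT;
    f.nr_mfn = size >> SKETCH_PAGE_SHIFT;

    return f;
}
```

For example, a region at 0xfff51000 of size 0x2000 translates to 2 frames
starting at frame 0xfff51, whereas a 0x800 byte region is rejected.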

Signed-off-by: Julien Grall <julien.grall@linaro.org>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>

---

I'm wondering if we can let the toolstack retrieve device information for
every device not used by Xen. This would allow embedded guys to use passthrough
"easily" when their devices are not under an IOMMU.
---
 tools/libxc/xc_physdev.c      |  129 +++++++++++++++++++++++++++++++++++++++++
 tools/libxc/xenctrl.h         |   36 ++++++++++++
 xen/arch/arm/physdev.c        |   16 +++++
 xen/common/device_tree.c      |  112 +++++++++++++++++++++++++++++++++++
 xen/include/public/physdev.h  |   40 +++++++++++++
 xen/include/xen/device_tree.h |    3 +
 6 files changed, 336 insertions(+)

diff --git a/tools/libxc/xc_physdev.c b/tools/libxc/xc_physdev.c
index cf02d85..405fe78 100644
--- a/tools/libxc/xc_physdev.c
+++ b/tools/libxc/xc_physdev.c
@@ -108,3 +108,132 @@ int xc_physdev_unmap_pirq(xc_interface *xch,
     return rc;
 }
 
+int xc_physdev_dtdev_getinfo(xc_interface *xch,
+                             char *path,
+                             xc_dtdev_info_t *info)
+{
+    int rc;
+    size_t size = strlen(path);
+    struct physdev_dtdev_op op;
+    DECLARE_HYPERCALL_BOUNCE(path, size, XC_HYPERCALL_BUFFER_BOUNCE_IN);
+
+    if ( xc_hypercall_bounce_pre(xch, path) )
+        return -1;
+
+    op.op = PHYSDEVOP_DTDEV_GET_INFO;
+    op.plen = size;
+    set_xen_guest_handle(op.path, path);
+
+    rc = do_physdev_op(xch, PHYSDEVOP_dtdev_op, &op, sizeof(op));
+
+    xc_hypercall_bounce_post(xch, path);
+
+    if ( !rc )
+    {
+        info->num_irqs = op.u.info.num_irqs;
+        info->num_mmios = op.u.info.num_mmios;
+        info->compat_len = op.u.info.compat_len;
+    }
+
+    return rc;
+}
+
+int xc_physdev_dtdev_getirq(xc_interface *xch,
+                            char *path,
+                            uint32_t index,
+                            xc_dtdev_irq_t *irq)
+{
+    int rc;
+    size_t size = strlen(path);
+    struct physdev_dtdev_op op;
+    DECLARE_HYPERCALL_BOUNCE(path, size, XC_HYPERCALL_BUFFER_BOUNCE_IN);
+
+    if ( xc_hypercall_bounce_pre(xch, path) )
+        return -1;
+
+    op.op = PHYSDEVOP_DTDEV_GET_IRQ;
+    op.plen = size;
+    op.index = index;
+    set_xen_guest_handle(op.path, path);
+
+    rc = do_physdev_op(xch, PHYSDEVOP_dtdev_op, &op, sizeof(op));
+
+    xc_hypercall_bounce_post(xch, path);
+
+    if ( !rc )
+    {
+        irq->irq = op.u.irq.irq;
+        irq->type = op.u.irq.type;
+    }
+
+    return rc;
+}
+
+int xc_physdev_dtdev_getmmio(xc_interface *xch,
+                             char *path,
+                             uint32_t index,
+                             xc_dtdev_mmio_t *mmio)
+{
+    int rc;
+    size_t size = strlen(path);
+    struct physdev_dtdev_op op;
+    DECLARE_HYPERCALL_BOUNCE(path, size, XC_HYPERCALL_BUFFER_BOUNCE_IN);
+
+    if ( xc_hypercall_bounce_pre(xch, path) )
+        return -1;
+
+    op.op = PHYSDEVOP_DTDEV_GET_MMIO;
+    op.plen = size;
+    op.index = index;
+    set_xen_guest_handle(op.path, path);
+
+    rc = do_physdev_op(xch, PHYSDEVOP_dtdev_op, &op, sizeof(op));
+
+    xc_hypercall_bounce_post(xch, path);
+
+    if ( !rc )
+    {
+        mmio->mfn = op.u.mmio.mfn;
+        mmio->nr_mfn = op.u.mmio.nr_mfn;
+    }
+
+    return rc;
+}
+
+int xc_physdev_dtdev_getcompat(xc_interface *xch,
+                               char *path,
+                               char *compat,
+                               uint32_t *clen)
+{
+    int rc;
+    size_t size = strlen(path);
+    struct physdev_dtdev_op op;
+    DECLARE_HYPERCALL_BOUNCE(path, size, XC_HYPERCALL_BUFFER_BOUNCE_IN);
+    DECLARE_HYPERCALL_BOUNCE(compat, *clen, XC_HYPERCALL_BUFFER_BOUNCE_OUT);
+
+    if ( xc_hypercall_bounce_pre(xch, path) )
+        return -1;
+
+    rc = -1;
+    if ( xc_hypercall_bounce_pre(xch, compat) )
+        goto out;
+
+    op.op = PHYSDEVOP_DTDEV_GET_COMPAT;
+    op.plen = size;
+    set_xen_guest_handle(op.path, path);
+
+    op.u.compat.clen = *clen;
+    set_xen_guest_handle(op.u.compat.compat, compat);
+
+    rc = do_physdev_op(xch, PHYSDEVOP_dtdev_op, &op, sizeof(op));
+
+    if ( !rc )
+        *clen = op.u.compat.clen;
+
+    xc_hypercall_bounce_post(xch, compat);
+
+out:
+    xc_hypercall_bounce_post(xch, path);
+
+    return rc;
+}
diff --git a/tools/libxc/xenctrl.h b/tools/libxc/xenctrl.h
index b55d857..5ad2d65 100644
--- a/tools/libxc/xenctrl.h
+++ b/tools/libxc/xenctrl.h
@@ -1143,6 +1143,42 @@ int xc_physdev_pci_access_modify(xc_interface *xch,
                                  int func,
                                  int enable);
 
+typedef struct xc_dtdev_info {
+    uint32_t num_irqs;
+    uint32_t num_mmios;
+    uint32_t compat_len;
+} xc_dtdev_info_t;
+
+int xc_physdev_dtdev_getinfo(xc_interface *xch,
+                             char *path,
+                             xc_dtdev_info_t *info);
+
+typedef struct xc_dtdev_irq {
+    uint32_t irq;
+    /* TODO: Maybe an enum here? */
+    uint32_t type;
+} xc_dtdev_irq_t;
+
+int xc_physdev_dtdev_getirq(xc_interface *xch,
+                            char *path,
+                            uint32_t index,
+                            xc_dtdev_irq_t *irq);
+
+typedef struct xc_dtdev_mmio {
+    uint64_t mfn;
+    uint64_t nr_mfn;
+} xc_dtdev_mmio_t;
+
+int xc_physdev_dtdev_getmmio(xc_interface *xch,
+                             char *path,
+                             uint32_t index,
+                             xc_dtdev_mmio_t *mmio);
+
+int xc_physdev_dtdev_getcompat(xc_interface *xch,
+                               char *path,
+                               char *compat,
+                               uint32_t *clen);
+
 int xc_readconsolering(xc_interface *xch,
                        char *buffer,
                        unsigned int *pnr_chars,
diff --git a/xen/arch/arm/physdev.c b/xen/arch/arm/physdev.c
index d17589c..11c5b59 100644
--- a/xen/arch/arm/physdev.c
+++ b/xen/arch/arm/physdev.c
@@ -82,6 +82,22 @@ int do_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
                 ret = -EFAULT;
         }
         break;
+
+    case PHYSDEVOP_dtdev_op:
+        {
+            physdev_dtdev_op_t info;
+
+            ret = -EFAULT;
+            if ( copy_from_guest(&info, arg, 1) != 0 )
+                break;
+
+            /* TODO: Add xsm */
+            ret = dt_do_physdev_op(&info);
+
+            if ( __copy_to_guest(arg, &info, 1) )
+                ret = -EFAULT;
+        }
+        break;
     default:
         ret = -ENOSYS;
         break;
diff --git a/xen/common/device_tree.c b/xen/common/device_tree.c
index fd95307..482ff8f 100644
--- a/xen/common/device_tree.c
+++ b/xen/common/device_tree.c
@@ -24,6 +24,7 @@
 #include <xen/cpumask.h>
 #include <xen/ctype.h>
 #include <xen/lib.h>
+#include <xen/irq.h>
 
 struct dt_early_info __initdata early_info;
 const void *device_tree_flattened;
@@ -2021,6 +2022,117 @@ void __init dt_unflatten_host_device_tree(void)
     dt_alias_scan();
 }
 
+
+/* TODO: I think we need a bit of caching in each device node to get the
+ * information in constant time.
+ * For now we need to translate IRQs/MMIOs every time
+ */
+int dt_do_physdev_op(physdev_dtdev_op_t *info)
+{
+    struct dt_device_node *dev;
+    int ret;
+
+    ret = dt_find_node_by_gpath(info->path, info->plen, &dev);
+    if ( ret )
+        return ret;
+
+    /* Only allow access to protected devices not used by Xen */
+    if ( !dt_device_is_protected(dev) || dt_device_used_by(dev) == DOMID_XEN )
+        return -EACCES;
+
+    switch ( info->op )
+    {
+    case PHYSDEVOP_DTDEV_GET_INFO:
+        {
+            const struct dt_property *compat;
+
+            compat = dt_find_property(dev, "compatible", NULL);
+            /* Hopefully, this case should never happen, print error
+             * if it occurs
+             */
+            if ( !compat )
+            {
+                dprintk(XENLOG_G_ERR, "Unable to find compatible node for %s\n",
+                        dt_node_full_name(dev));
+                return -EBADFD;
+            }
+
+            info->u.info.num_irqs = dt_number_of_irq(dev);
+            info->u.info.num_mmios = dt_number_of_address(dev);
+            info->u.info.compat_len = compat->length;
+        }
+        break;
+
+    case PHYSDEVOP_DTDEV_GET_IRQ:
+        {
+            struct dt_irq irq;
+
+            ret = dt_device_get_irq(dev, info->index, &irq);
+            if ( ret )
+                return ret;
+
+            /* Check if Xen is able to route the IRQ to the guest */
+            if ( !is_routable_irq(irq.irq) )
+                return -EINVAL;
+
+            info->u.irq.irq = irq.irq;
+            /* TODO: Translate the type into an exportable value */
+            info->u.irq.type = irq.type;
+        }
+        break;
+
+    case PHYSDEVOP_DTDEV_GET_MMIO:
+        {
+            uint64_t addr, size;
+
+            ret = dt_device_get_address(dev, info->index, &addr, &size);
+            if ( ret )
+                return ret;
+
+            /* Make sure the address and the size are page aligned.
+             * If not, we may passthrough MMIO regions which may belong
+             * to another device. Deny it!
+             */
+            if ( (addr & (PAGE_SIZE - 1)) || (size & (PAGE_SIZE - 1)) )
+            {
+                dprintk(XENLOG_ERR, "%s: contain non-page aligned range:"
+                        " addr = 0x%"PRIx64" size = 0x%"PRIx64"\n",
+                        dt_node_full_name(dev), addr, size);
+                return -EINVAL;
+            }
+
+            info->u.mmio.mfn = paddr_to_pfn(addr);
+            info->u.mmio.nr_mfn = paddr_to_pfn(size);
+        }
+        break;
+
+    case PHYSDEVOP_DTDEV_GET_COMPAT:
+        {
+            const struct dt_property *compat;
+
+            compat = dt_find_property(dev, "compatible", NULL);
+            if ( !compat || !compat->length )
+                return -ENOENT;
+
+            if ( info->u.compat.clen < compat->length )
+                return -ENOSPC;
+
+            if ( copy_to_guest(info->u.compat.compat, compat->value,
+                               compat->length) != 0 )
+                return -EFAULT;
+
+            info->u.compat.clen = compat->length;
+        }
+        break;
+
+    default:
+        return -ENOSYS;
+    }
+
+
+    return 0;
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/include/public/physdev.h b/xen/include/public/physdev.h
index d547928..23cf673 100644
--- a/xen/include/public/physdev.h
+++ b/xen/include/public/physdev.h
@@ -337,6 +337,46 @@ struct physdev_dbgp_op {
 typedef struct physdev_dbgp_op physdev_dbgp_op_t;
 DEFINE_XEN_GUEST_HANDLE(physdev_dbgp_op_t);
 
+/* Retrieve information about a device node */
+#define PHYSDEVOP_dtdev_op        32
+
+struct physdev_dtdev_op {
+    /* IN */
+    uint32_t plen;                  /* Length of the path */
+    XEN_GUEST_HANDLE(char) path;    /* Path to the device tree node */
+#define PHYSDEVOP_DTDEV_GET_INFO        0
+#define PHYSDEVOP_DTDEV_GET_IRQ         1
+#define PHYSDEVOP_DTDEV_GET_MMIO        2
+#define PHYSDEVOP_DTDEV_GET_COMPAT      3
+    uint8_t op;
+    uint32_t pad0:24;
+    uint32_t index;                 /* Index for the IRQ/MMIO to retrieve */
+    /* OUT */
+    union {
+        struct {
+            uint32_t num_irqs;      /* Number of IRQs */
+            uint32_t num_mmios;     /* Number of MMIOs */
+            uint32_t compat_len;    /* Length of the compatible string */
+        } info;
+        struct {
+            /* TODO: Do we need to handle MSI-X? */
+            uint32_t irq;           /* IRQ number */
+            /* TODO: Describe with defines the IRQ type */
+            uint32_t type;          /* IRQ type (i.e edge, level...) */
+        } irq;
+        struct {
+            uint64_t mfn;
+            uint64_t nr_mfn;
+        } mmio;
+        struct {
+            uint32_t clen;          /* IN: Size of buffer. OUT: Size copied */
+            XEN_GUEST_HANDLE_64(char) compat;
+        } compat;
+    } u;
+};
+typedef struct physdev_dtdev_op physdev_dtdev_op_t;
+DEFINE_XEN_GUEST_HANDLE(physdev_dtdev_op_t);
+
 /*
  * Notify that some PIRQ-bound event channels have been unmasked.
  * ** This command is obsolete since interface version 0x00030202 and is **
diff --git a/xen/include/xen/device_tree.h b/xen/include/xen/device_tree.h
index bb33e54..3d5101c 100644
--- a/xen/include/xen/device_tree.h
+++ b/xen/include/xen/device_tree.h
@@ -12,6 +12,7 @@
 
 #include <asm/byteorder.h>
 #include <public/xen.h>
+#include <public/physdev.h>
 #include <xen/init.h>
 #include <xen/string.h>
 #include <xen/types.h>
@@ -711,6 +712,8 @@ int dt_parse_phandle_with_args(const struct dt_device_node *np,
                                const char *cells_name, int index,
                                struct dt_phandle_args *out_args);
 
+int dt_do_physdev_op(physdev_dtdev_op_t *info);
+
 #endif /* __XEN_DEVICE_TREE_H */
 
 /*
-- 
1.7.10.4


* [RFC 10/19] xen/passthrough: Introduce iommu_buildup
  2014-06-16 16:17 [RFC 00/19] xe/arm: Add support for non-pci passthrough Julien Grall
                   ` (8 preceding siblings ...)
  2014-06-16 16:17 ` [RFC 09/19] xen/dts: Add hypercalls to retrieve device node information Julien Grall
@ 2014-06-16 16:17 ` Julien Grall
  2014-07-03 11:45   ` Ian Campbell
  2014-06-16 16:17 ` [RFC 11/19] xen/passthrough: Call arch_iommu_domain_destroy before calling iommu_teardown Julien Grall
                   ` (8 subsequent siblings)
  18 siblings, 1 reply; 122+ messages in thread
From: Julien Grall @ 2014-06-16 16:17 UTC (permalink / raw)
  To: xen-devel; +Cc: stefano.stabellini, Julien Grall, tim, ian.campbell

This new function will correctly initialize the IOMMU page table for the
current domain.

Also use it in iommu_assign_dt_device even though the current IOMMU
implementation on ARM shares P2M with the processor.
---
 xen/drivers/passthrough/arm/iommu.c   |    6 ++++++
 xen/drivers/passthrough/device_tree.c |    7 +++++++
 xen/drivers/passthrough/iommu.c       |   25 +++++++++++++++++++++++++
 xen/drivers/passthrough/pci.c         |   12 ++++--------
 xen/include/xen/iommu.h               |    2 ++
 5 files changed, 44 insertions(+), 8 deletions(-)

diff --git a/xen/drivers/passthrough/arm/iommu.c b/xen/drivers/passthrough/arm/iommu.c
index 3007b99..de4ed64 100644
--- a/xen/drivers/passthrough/arm/iommu.c
+++ b/xen/drivers/passthrough/arm/iommu.c
@@ -68,3 +68,9 @@ void arch_iommu_domain_destroy(struct domain *d)
 {
     iommu_dt_domain_destroy(d);
 }
+
+int arch_iommu_populate_page_table(struct domain *d)
+{
+    /* The IOMMU shares the p2m with the CPU */
+    return -ENOSYS;
+}
diff --git a/xen/drivers/passthrough/device_tree.c b/xen/drivers/passthrough/device_tree.c
index 3e47df5..afb4dfc 100644
--- a/xen/drivers/passthrough/device_tree.c
+++ b/xen/drivers/passthrough/device_tree.c
@@ -41,6 +41,13 @@ int iommu_assign_dt_device(struct domain *d, struct dt_device_node *dev)
     if ( !list_empty(&dev->domain_list) )
         goto fail;
 
+    if ( need_iommu(d) <= 0 )
+    {
+        rc = iommu_buildup(d);
+        if ( rc )
+            goto fail;
+    }
+
     rc = hd->platform_ops->assign_dt_device(d, dev);
 
     if ( rc )
diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
index cc12735..2e9b48d 100644
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -187,6 +187,31 @@ void iommu_teardown(struct domain *d)
     tasklet_schedule(&iommu_pt_cleanup_tasklet);
 }
 
+int iommu_buildup(struct domain *d)
+{
+    int rc = 0;
+
+    /*
+     * The caller should check we effectively need to set up the IOMMU
+     * for this domain.
+     */
+    ASSERT(need_iommu(d) <= 0);
+
+    if ( need_iommu(d) > 0 )
+        return 0;
+
+    if ( !iommu_use_hap_pt(d) )
+    {
+        rc = arch_iommu_populate_page_table(d);
+        if ( rc )
+            return rc;
+    }
+
+    d->need_iommu = 1;
+
+    return rc;
+}
+
 void iommu_domain_destroy(struct domain *d)
 {
     struct hvm_iommu *hd = domain_hvm_iommu(d);
diff --git a/xen/drivers/passthrough/pci.c b/xen/drivers/passthrough/pci.c
index 1eba833..e31dcfb 100644
--- a/xen/drivers/passthrough/pci.c
+++ b/xen/drivers/passthrough/pci.c
@@ -1342,14 +1342,10 @@ static int assign_device(struct domain *d, u16 seg, u8 bus, u8 devfn)
 
     if ( need_iommu(d) <= 0 )
     {
-        if ( !iommu_use_hap_pt(d) )
-        {
-            rc = arch_iommu_populate_page_table(d);
-            if ( rc )
-            {
-                spin_unlock(&pcidevs_lock);
-                return rc;
-            }
+        rc = iommu_buildup(d);
+        if ( rc ) {
+            spin_unlock(&pcidevs_lock);
+            return rc;
         }
         d->need_iommu = 1;
     }
diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h
index 8eb764a..9b2af51 100644
--- a/xen/include/xen/iommu.h
+++ b/xen/include/xen/iommu.h
@@ -64,6 +64,8 @@ int arch_iommu_domain_init(struct domain *d);
 int arch_iommu_populate_page_table(struct domain *d);
 void arch_iommu_check_autotranslated_hwdom(struct domain *d);
 
+int iommu_buildup(struct domain *d);
+
 /* Function used internally, use iommu_domain_destroy */
 void iommu_teardown(struct domain *d);
 
-- 
1.7.10.4


* [RFC 11/19] xen/passthrough: Call arch_iommu_domain_destroy before calling iommu_teardown
  2014-06-16 16:17 [RFC 00/19] xe/arm: Add support for non-pci passthrough Julien Grall
                   ` (9 preceding siblings ...)
  2014-06-16 16:17 ` [RFC 10/19] xen/passthrough: Introduce iommu_buildup Julien Grall
@ 2014-06-16 16:17 ` Julien Grall
  2014-06-17  8:07   ` Jan Beulich
  2014-06-16 16:17 ` [RFC 12/19] xen/passthrough: iommu_deassign_device_dt: By default reassign device to nobody Julien Grall
                   ` (7 subsequent siblings)
  18 siblings, 1 reply; 122+ messages in thread
From: Julien Grall @ 2014-06-16 16:17 UTC (permalink / raw)
  To: xen-devel
  Cc: stefano.stabellini, Julien Grall, tim, ian.campbell, Jan Beulich

arch_iommu_domain_destroy contains architecture-specific code.

On x86, this code will clean up the ioport_list, which is not used by
either of the iommu (i.e AMD & x86) drivers.

On ARM, the toolstack may not have deassigned every device from the guest.
Therefore, we have to go through the device list and remove them before
asking the IOMMU drivers to release memory for this domain. This is done
by iommu_dt_domain_destroy which is called by arch_iommu_domain_destroy.
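
The ordering constraint can be shown with a toy model. Everything here is
hypothetical (the sketch_* names, and in particular a teardown that reports
-EBUSY, which the real iommu_teardown does not do); it only shows why the
devices have to be detached before the per-domain IOMMU resources are
released:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy domain: number of devices still attached and IOMMU context state. */
struct sketch_domain {
    unsigned int nr_devices;
    bool ctx_live;
};

/* Stand-in for the arch hook: detach whatever the toolstack left behind. */
void sketch_arch_destroy(struct sketch_domain *d)
{
    assert(d != NULL);
    d->nr_devices = 0;
}

/* Stand-in for the driver teardown: cannot release with devices attached. */
int sketch_teardown(struct sketch_domain *d)
{
    assert(d != NULL);
    if ( d->nr_devices )
        return -EBUSY;

    d->ctx_live = false;

    return 0;
}

/* Destroy a toy domain with nr_devices attached, in either order. */
int sketch_destroy(unsigned int nr_devices, bool arch_first)
{
    struct sketch_domain d = { nr_devices, true };
    int rc;

    if ( arch_first )
        sketch_arch_destroy(&d);

    rc = sketch_teardown(&d);

    if ( !arch_first )
        sketch_arch_destroy(&d);

    return rc;
}
```

In this model only the "arch hook first" order succeeds when devices are
still attached at destruction time.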

Signed-off-by: Julien Grall <julien.grall@linaro.org>
Cc: Jan Beulich <jbeulich@suse.com>
---
 xen/drivers/passthrough/iommu.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
index 2e9b48d..d71ab03 100644
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -219,10 +219,10 @@ void iommu_domain_destroy(struct domain *d)
     if ( !iommu_enabled || !hd->platform_ops )
         return;
 
+    arch_iommu_domain_destroy(d);
+
     if ( need_iommu(d) )
         iommu_teardown(d);
-
-    arch_iommu_domain_destroy(d);
 }
 
 int iommu_map_page(struct domain *d, unsigned long gfn, unsigned long mfn,
-- 
1.7.10.4


* [RFC 12/19] xen/passthrough: iommu_deassign_device_dt: By default reassign device to nobody
  2014-06-16 16:17 [RFC 00/19] xe/arm: Add support for non-pci passthrough Julien Grall
                   ` (10 preceding siblings ...)
  2014-06-16 16:17 ` [RFC 11/19] xen/passthrough: Call arch_iommu_domain_destroy before calling iommu_teardown Julien Grall
@ 2014-06-16 16:17 ` Julien Grall
  2014-06-18 19:28   ` Stefano Stabellini
  2014-07-03 11:48   ` Ian Campbell
  2014-06-16 16:18 ` [RFC 13/19] xen/iommu: arm: Wire iommu DOMCTL for ARM Julien Grall
                   ` (6 subsequent siblings)
  18 siblings, 2 replies; 122+ messages in thread
From: Julien Grall @ 2014-06-16 16:17 UTC (permalink / raw)
  To: xen-devel; +Cc: stefano.stabellini, Julien Grall, tim, ian.campbell

Currently, when the device is deassigned from a domain, we directly reassign
it to DOM0.

As the device may not have been correctly reset, this may corrupt or
expose some part of DOM0 memory.

If Xen reassigns the device to "nobody", it may receive some global/context
faults because the transaction has failed (indeed the context has been
marked invalid).

DOM0 will have to issue a hypercall to assign the device to itself if it
wants to use it.
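
The resulting policy can be modelled in a few lines of standalone C. The
types and names below are invented for the illustration (they are not the
SMMU driver structures), and the detach/attach steps are reduced to updating
an owner pointer:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Toy stand-ins for a domain and a passthrough device. */
struct sketch_domain { int id; };
struct sketch_dev { struct sketch_domain *owner; };

struct sketch_domain sketch_hwdom  = { 0 };
struct sketch_domain sketch_guest  = { 1 };
struct sketch_domain sketch_guest2 = { 2 };
struct sketch_dev sketch_nic = { &sketch_guest };

/* Move dev from domain s to domain t, where t == NULL means "nobody". */
int sketch_reassign(struct sketch_domain *s, struct sketch_domain *t,
                    struct sketch_dev *dev)
{
    assert(dev != NULL);

    /* Only the hardware domain or nobody is accepted as a target. */
    if ( t && t != &sketch_hwdom )
        return -EPERM;

    if ( t == s )
        return 0;

    /* Detach from the source domain. */
    dev->owner = NULL;

    /* Attach only when there is a target; otherwise leave it unowned. */
    if ( t )
        dev->owner = t;

    return 0;
}
```

Deassigning with a NULL target leaves the device without an owner, so any
later use has to go through an explicit assignment.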

Signed-off-by: Julien Grall <julien.grall@linaro.org>
---
 xen/drivers/passthrough/arm/smmu.c    |    7 ++++---
 xen/drivers/passthrough/device_tree.c |    8 +++-----
 2 files changed, 7 insertions(+), 8 deletions(-)

diff --git a/xen/drivers/passthrough/arm/smmu.c b/xen/drivers/passthrough/arm/smmu.c
index f4eb2a2..b25034e 100644
--- a/xen/drivers/passthrough/arm/smmu.c
+++ b/xen/drivers/passthrough/arm/smmu.c
@@ -1245,8 +1245,8 @@ static int arm_smmu_reassign_dt_dev(struct domain *s, struct domain *t,
 {
     int ret = 0;
 
-    /* Don't allow remapping on other domain than hwdom */
-    if ( t != hardware_domain )
+    /* Allow remapping either on the hardware domain or to nothing */
+    if ( t && t != hardware_domain )
         return -EPERM;
 
     if ( t == s )
@@ -1256,7 +1256,8 @@ static int arm_smmu_reassign_dt_dev(struct domain *s, struct domain *t,
     if ( ret )
         return ret;
 
-    ret = arm_smmu_attach_dev(t, dev);
+    if ( t )
+        ret = arm_smmu_attach_dev(t, dev);
 
     return ret;
 }
diff --git a/xen/drivers/passthrough/device_tree.c b/xen/drivers/passthrough/device_tree.c
index afb4dfc..8a4bc69 100644
--- a/xen/drivers/passthrough/device_tree.c
+++ b/xen/drivers/passthrough/device_tree.c
@@ -75,14 +75,12 @@ int iommu_deassign_dt_device(struct domain *d, struct dt_device_node *dev)
 
     spin_lock(&dtdevs_lock);
 
-    rc = hd->platform_ops->reassign_dt_device(d, hardware_domain, dev);
+    rc = hd->platform_ops->reassign_dt_device(d, NULL, dev);
     if ( rc )
         goto fail;
 
-    list_del(&dev->domain_list);
-
-    dt_device_set_used_by(dev, hardware_domain->domain_id);
-    list_add(&dev->domain_list, &domain_hvm_iommu(hardware_domain)->dt_devices);
+    list_del_init(&dev->domain_list);
+    dt_device_set_used_by(dev, DOMID_IO);
 
 fail:
     spin_unlock(&dtdevs_lock);
-- 
1.7.10.4


* [RFC 13/19] xen/iommu: arm: Wire iommu DOMCTL for ARM
  2014-06-16 16:17 [RFC 00/19] xe/arm: Add support for non-pci passthrough Julien Grall
                   ` (11 preceding siblings ...)
  2014-06-16 16:17 ` [RFC 12/19] xen/passthrough: iommu_deassign_device_dt: By default reassign device to nobody Julien Grall
@ 2014-06-16 16:18 ` Julien Grall
  2014-06-17  8:24   ` Jan Beulich
  2014-06-16 16:18 ` [RFC 14/19] xen/passthrough: dt: Add new domctl XEN_DOMCTL_assign_dt_device Julien Grall
                   ` (5 subsequent siblings)
  18 siblings, 1 reply; 122+ messages in thread
From: Julien Grall @ 2014-06-16 16:18 UTC (permalink / raw)
  To: xen-devel
  Cc: Keir Fraser, ian.campbell, Julien Grall, tim, stefano.stabellini,
	Jan Beulich

The call to iommu_do_domctl is similar to the x86 one. Move this code to
the common code and protect it with HAS_PASSTHROUGH.

Signed-off-by: Julien Grall <julien.grall@linaro.org>
Cc: Keir Fraser <keir@xen.org>
Cc: Jan Beulich <jbeulich@suse.com>
---
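A minimal sketch of the resulting dispatch order, assuming made-up command
numbers and toy handlers (none of these names exist in Xen): the arch handler
is tried first, and the IOMMU handler is only consulted when the arch one
reports -ENOSYS.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical command numbers, for illustration only. */
enum { TOY_CMD_ARCH = 1, TOY_CMD_IOMMU = 2, TOY_CMD_UNKNOWN = 3 };

/* Arch handler: knows one command, reports -ENOSYS for the rest. */
static long toy_arch_do_domctl(int cmd)
{
    return (cmd == TOY_CMD_ARCH) ? 0 : -ENOSYS;
}

/* IOMMU handler: likewise knows a single command. */
static long toy_iommu_do_domctl(int cmd)
{
    return (cmd == TOY_CMD_IOMMU) ? 0 : -ENOSYS;
}

/* Common entry point: fall back to the IOMMU handler on -ENOSYS. */
static long toy_do_domctl(int cmd)
{
    long ret;

    assert(cmd > 0);    /* command numbers are positive in this sketch */

    ret = toy_arch_do_domctl(cmd);
    if (ret == -ENOSYS)
        ret = toy_iommu_do_domctl(cmd);

    return ret;
}
```

One consequence of this ordering is that an arch handler must return -ENOSYS,
and not another error code, for every command it does not recognise;
otherwise the IOMMU handler is never reached.
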
 xen/arch/x86/domctl.c |    2 +-
 xen/common/domctl.c   |    4 ++++
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/xen/arch/x86/domctl.c b/xen/arch/x86/domctl.c
index 437ba11..ffecc15 100644
--- a/xen/arch/x86/domctl.c
+++ b/xen/arch/x86/domctl.c
@@ -1320,7 +1320,7 @@ long arch_do_domctl(
     break;
 
     default:
-        ret = iommu_do_domctl(domctl, d, u_domctl);
+        ret = -ENOSYS;
         break;
     }
 
diff --git a/xen/common/domctl.c b/xen/common/domctl.c
index 5d3ac87..85866b7 100644
--- a/xen/common/domctl.c
+++ b/xen/common/domctl.c
@@ -1028,6 +1028,10 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
 
     default:
         ret = arch_do_domctl(op, d, u_domctl);
+#ifdef HAS_PASSTHROUGH
+        if ( ret == -ENOSYS )
+            ret = iommu_do_domctl(op, d, u_domctl);
+#endif
         break;
     }
 
-- 
1.7.10.4


* [RFC 14/19] xen/passthrough: dt: Add new domctl XEN_DOMCTL_assign_dt_device
  2014-06-16 16:17 [RFC 00/19] xe/arm: Add support for non-pci passthrough Julien Grall
                   ` (12 preceding siblings ...)
  2014-06-16 16:18 ` [RFC 13/19] xen/iommu: arm: Wire iommu DOMCTL for ARM Julien Grall
@ 2014-06-16 16:18 ` Julien Grall
  2014-06-17  8:34   ` Jan Beulich
  2014-06-16 16:18 ` [RFC 15/19] xen/arm: Reserve region in guest memory for device passthrough Julien Grall
                   ` (4 subsequent siblings)
  18 siblings, 1 reply; 122+ messages in thread
From: Julien Grall @ 2014-06-16 16:18 UTC (permalink / raw)
  To: xen-devel
  Cc: ian.campbell, Stefano Stabellini, tim, Julien Grall, Ian Jackson,
	stefano.stabellini

A device node is described by a path. It will be used to retrieve the
node in the device tree and assign the related device to the domain.

Only devices protected by an IOMMU can be assigned to a guest.

Signed-off-by: Julien Grall <julien.grall@linaro.org>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
---
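A rough sketch of a lookup by path and length follows. The libxc wrapper in
this patch passes size = strlen(path), so the buffer handed to the hypervisor
carries no terminating NUL and the comparison has to rely on the length. The
node table and toy_find_node_by_path are hypothetical and only model the
idea; they are not the real dt_find_node_by_gpath.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical node table; the paths are made up for illustration. */
static const char *const toy_nodes[] = {
    "/soc/ethernet@fff51000",
    "/soc/serial@fff36000",
};

/*
 * Return the index of the node whose full path equals the first `size`
 * bytes of `path`, or -1 if there is none. `path` need not be
 * NUL-terminated.
 */
static int toy_find_node_by_path(const char *path, size_t size)
{
    size_t i;

    assert(path != NULL);

    for (i = 0; i < sizeof(toy_nodes) / sizeof(toy_nodes[0]); i++) {
        if (strlen(toy_nodes[i]) == size &&
            memcmp(toy_nodes[i], path, size) == 0)
            return (int)i;
    }

    return -1;
}
```

Comparing the length first means a prefix such as "/soc/serial" does not
match "/soc/serial@fff36000".
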
 tools/libxc/xc_domain.c               |   29 +++++++++++++++++++++
 tools/libxc/xenctrl.h                 |    4 +++
 xen/drivers/passthrough/device_tree.c |   45 ++++++++++++++++++++++++++++++---
 xen/drivers/passthrough/iommu.c       |    7 +++++
 xen/include/public/domctl.h           |   10 ++++++++
 xen/include/xen/iommu.h               |    3 +++
 6 files changed, 95 insertions(+), 3 deletions(-)

diff --git a/tools/libxc/xc_domain.c b/tools/libxc/xc_domain.c
index 7909536..ea8fc0d 100644
--- a/tools/libxc/xc_domain.c
+++ b/tools/libxc/xc_domain.c
@@ -1628,6 +1628,35 @@ int xc_deassign_device(
     return do_domctl(xch, &domctl);
 }
 
+int xc_assign_dt_device(
+    xc_interface *xch,
+    uint32_t domid,
+    char *path)
+{
+    int rc;
+    size_t size = strlen(path);
+    xen_domctl_assign_dt_device_t *assign_dt_device;
+    DECLARE_DOMCTL;
+    DECLARE_HYPERCALL_BOUNCE(path, size, XC_HYPERCALL_BUFFER_BOUNCE_IN);
+
+    if ( xc_hypercall_bounce_pre(xch, path) )
+        return -1;
+
+    domctl.cmd = XEN_DOMCTL_assign_dt_device;
+    domctl.domain = (domid_t)domid;
+
+    assign_dt_device = &domctl.u.assign_dt_device;
+
+    assign_dt_device->size = size;
+    set_xen_guest_handle(assign_dt_device->path, path);
+
+    rc = do_domctl(xch, &domctl);
+
+    xc_hypercall_bounce_post(xch, path);
+
+    return rc;
+}
+
 int xc_domain_update_msi_irq(
     xc_interface *xch,
     uint32_t domid,
diff --git a/tools/libxc/xenctrl.h b/tools/libxc/xenctrl.h
index 5ad2d65..07dcadc 100644
--- a/tools/libxc/xenctrl.h
+++ b/tools/libxc/xenctrl.h
@@ -2010,6 +2010,10 @@ int xc_deassign_device(xc_interface *xch,
                      uint32_t domid,
                      uint32_t machine_bdf);
 
+int xc_assign_dt_device(xc_interface *xch,
+                        uint32_t domid,
+                        char *path);
+
 int xc_domain_memory_mapping(xc_interface *xch,
                              uint32_t domid,
                              unsigned long first_gfn,
diff --git a/xen/drivers/passthrough/device_tree.c b/xen/drivers/passthrough/device_tree.c
index 8a4bc69..9df4343 100644
--- a/xen/drivers/passthrough/device_tree.c
+++ b/xen/drivers/passthrough/device_tree.c
@@ -1,9 +1,6 @@
 /*
  * Code to passthrough a device tree node to a guest
  *
- * TODO: This contains only the necessary code to protected device passed to
- * dom0. It will need some updates when device passthrough will is added.
- *
  * Julien Grall <julien.grall@linaro.org>
  * Copyright (c) 2014 Linaro Limited.
  *
@@ -20,6 +17,7 @@
 
 #include <xen/lib.h>
 #include <xen/sched.h>
+#include <xen/guest_access.h>
 #include <xen/iommu.h>
 #include <xen/device_tree.h>
 
@@ -111,3 +109,44 @@ void iommu_dt_domain_destroy(struct domain *d)
                     dt_node_full_name(dev), d->domain_id);
     }
 }
+
+int iommu_do_dt_domctl(struct xen_domctl *domctl, struct domain *d,
+                       XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
+{
+    int ret;
+
+    /* TODO: How to deal with XSM? */
+
+    switch ( domctl->cmd )
+    {
+    case XEN_DOMCTL_assign_dt_device:
+    {
+        struct dt_device_node *dev;
+
+
+        /* TODO: Do we need to check is_dying? Mostly to protect against
+         * hypercall trying to passthrough a device while we are
+         * dying.
+         */
+
+        ret = dt_find_node_by_gpath(domctl->u.assign_dt_device.path,
+                                    domctl->u.assign_dt_device.size,
+                                    &dev);
+        if ( ret )
+            break;
+
+        ret = iommu_assign_dt_device(d, dev);
+
+        if ( ret )
+            printk(XENLOG_G_ERR "XEN_DOMCTL_assign_dt_device: assign \"%s\""
+                   " to dom%u failed (%d)\n",
+                   dt_node_full_name(dev), d->domain_id, ret);
+    }
+    break;
+    default:
+        ret = -ENOSYS;
+        break;
+    }
+
+    return ret;
+}
diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
index d71ab03..8351814 100644
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -343,6 +343,13 @@ int iommu_do_domctl(
     ret = iommu_do_pci_domctl(domctl, d, u_domctl);
 #endif
 
+    if ( ret != -ENOSYS )
+        return ret;
+
+#ifdef HAS_DEVICE_TREE
+    ret = iommu_do_dt_domctl(domctl, d, u_domctl);
+#endif
+
     return ret;
 }
 
diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h
index 5b11bbf..66806d2 100644
--- a/xen/include/public/domctl.h
+++ b/xen/include/public/domctl.h
@@ -936,6 +936,14 @@ typedef struct xen_domctl_vcpu_msrs xen_domctl_vcpu_msrs_t;
 DEFINE_XEN_GUEST_HANDLE(xen_domctl_vcpu_msrs_t);
 #endif
 
+/* Device Tree: Assign a non-PCI device to a guest */
+struct xen_domctl_assign_dt_device {
+    uint32_t size; /* IN: Length of the path */
+    XEN_GUEST_HANDLE_64(char) path; /* IN: path to the device tree node */
+};
+typedef struct xen_domctl_assign_dt_device xen_domctl_assign_dt_device_t;
+DEFINE_XEN_GUEST_HANDLE(xen_domctl_assign_dt_device_t);
+
 struct xen_domctl {
     uint32_t cmd;
 #define XEN_DOMCTL_createdomain                   1
@@ -1008,6 +1016,7 @@ struct xen_domctl {
 #define XEN_DOMCTL_cacheflush                    71
 #define XEN_DOMCTL_get_vcpu_msrs                 72
 #define XEN_DOMCTL_set_vcpu_msrs                 73
+#define XEN_DOMCTL_assign_dt_device              74
 #define XEN_DOMCTL_gdbsx_guestmemio            1000
 #define XEN_DOMCTL_gdbsx_pausevcpu             1001
 #define XEN_DOMCTL_gdbsx_unpausevcpu           1002
@@ -1044,6 +1053,7 @@ struct xen_domctl {
         struct xen_domctl_sendtrigger       sendtrigger;
         struct xen_domctl_get_device_group  get_device_group;
         struct xen_domctl_assign_device     assign_device;
+        struct xen_domctl_assign_dt_device  assign_dt_device;
         struct xen_domctl_bind_pt_irq       bind_pt_irq;
         struct xen_domctl_memory_mapping    memory_mapping;
         struct xen_domctl_ioport_mapping    ioport_mapping;
diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h
index 9b2af51..833baca 100644
--- a/xen/include/xen/iommu.h
+++ b/xen/include/xen/iommu.h
@@ -118,6 +118,9 @@ int iommu_deassign_dt_device(struct domain *d, struct dt_device_node *dev);
 int iommu_dt_domain_init(struct domain *d);
 void iommu_dt_domain_destroy(struct domain *d);
 
+int iommu_do_dt_domctl(struct xen_domctl *, struct domain *,
+                       XEN_GUEST_HANDLE_PARAM(xen_domctl_t));
+
 #endif /* HAS_DEVICE_TREE */
 
 struct page_info;
-- 
1.7.10.4


* [RFC 15/19] xen/arm: Reserve region in guest memory for device passthrough
  2014-06-16 16:17 [RFC 00/19] xe/arm: Add support for non-pci passthrough Julien Grall
                   ` (13 preceding siblings ...)
  2014-06-16 16:18 ` [RFC 14/19] xen/passthrough: dt: Add new domctl XEN_DOMCTL_assign_dt_device Julien Grall
@ 2014-06-16 16:18 ` Julien Grall
  2014-06-18 15:12   ` Stefano Stabellini
  2014-06-16 16:18 ` [RFC 16/19] libxl/arm: Introduce DT_IRQ_TYPE_* Julien Grall
                   ` (3 subsequent siblings)
  18 siblings, 1 reply; 122+ messages in thread
From: Julien Grall @ 2014-06-16 16:18 UTC (permalink / raw)
  To: xen-devel; +Cc: stefano.stabellini, Julien Grall, tim, ian.campbell

This region will be split by the toolstack to allocate an MMIO range for each
device.

For now only reserve a 512MB region; this should be enough to pass through
multiple devices at the same time.

Signed-off-by: Julien Grall <julien.grall@linaro.org>
---
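One way the toolstack could split the region is a simple bump allocator over
guest frame numbers. The sketch below assumes 4KB pages and uses hypothetical
names (toy_mmio_region, toy_mmio_init, toy_mmio_reserve); only the base and
size values come from this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT  12              /* assumes 4KB pages */
#define TOY_MMIO_BASE   0x10000000ULL   /* GUEST_MMIO_BASE from the patch */
#define TOY_MMIO_SIZE   0x20000000ULL   /* GUEST_MMIO_SIZE from the patch */

struct toy_mmio_region {
    uint64_t next;  /* next free guest frame number */
    uint64_t end;   /* first frame past the reserved region */
};

static void toy_mmio_init(struct toy_mmio_region *r)
{
    assert(r != NULL);
    r->next = TOY_MMIO_BASE >> TOY_PAGE_SHIFT;
    r->end = (TOY_MMIO_BASE + TOY_MMIO_SIZE) >> TOY_PAGE_SHIFT;
}

/*
 * Reserve nr_pages contiguous frames and store the first one in *gfn.
 * Returns 0 on success, -1 when the region cannot hold the request.
 */
static int toy_mmio_reserve(struct toy_mmio_region *r, uint64_t nr_pages,
                            uint64_t *gfn)
{
    assert(r != NULL && gfn != NULL);

    if (nr_pages > r->end - r->next)
        return -1;

    *gfn = r->next;
    r->next += nr_pages;

    return 0;
}
```

Because ranges are handed out back to back and never freed, they cannot
overlap each other in this model.
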
 xen/include/public/arch-arm.h |    4 ++++
 1 file changed, 4 insertions(+)

diff --git a/xen/include/public/arch-arm.h b/xen/include/public/arch-arm.h
index ac54cd6..789bffb 100644
--- a/xen/include/public/arch-arm.h
+++ b/xen/include/public/arch-arm.h
@@ -369,6 +369,10 @@ typedef uint64_t xen_callback_t;
 #define GUEST_GICC_BASE   0x03002000ULL
 #define GUEST_GICC_SIZE   0x00000100ULL
 
+/* Space for mapping MMIO from device passthrough: 512MB @ 256MB */
+#define GUEST_MMIO_BASE   0x10000000ULL
+#define GUEST_MMIO_SIZE   0x20000000ULL
+
 /* 16MB == 4096 pages reserved for guest to use as a region to map its
  * grant table in.
  */
-- 
1.7.10.4


* [RFC 16/19] libxl/arm: Introduce DT_IRQ_TYPE_*
  2014-06-16 16:17 [RFC 00/19] xe/arm: Add support for non-pci passthrough Julien Grall
                   ` (14 preceding siblings ...)
  2014-06-16 16:18 ` [RFC 15/19] xen/arm: Reserve region in guest memory for device passthrough Julien Grall
@ 2014-06-16 16:18 ` Julien Grall
  2014-07-03 11:56   ` Ian Campbell
  2014-06-16 16:18 ` [RFC 17/19] libxl/arm: Rename set_interrupt_ppi to set_interrupt and handle SPIs Julien Grall
                   ` (2 subsequent siblings)
  18 siblings, 1 reply; 122+ messages in thread
From: Julien Grall @ 2014-06-16 16:18 UTC (permalink / raw)
  To: xen-devel
  Cc: ian.campbell, Stefano Stabellini, tim, Julien Grall, Ian Jackson,
	stefano.stabellini

Avoid using hardcoded values when the interrupt type is set.

Signed-off-by: Julien Grall <julien.grall@linaro.org>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
---
 tools/libxl/libxl_arm.c |   29 +++++++++++++++++++++++++----
 1 file changed, 25 insertions(+), 4 deletions(-)

diff --git a/tools/libxl/libxl_arm.c b/tools/libxl/libxl_arm.c
index 21c3399..1edb87a 100644
--- a/tools/libxl/libxl_arm.c
+++ b/tools/libxl/libxl_arm.c
@@ -6,6 +6,23 @@
 #include <libfdt.h>
 #include <assert.h>
 
+/**
+ * IRQ line type.
+ * DT_IRQ_TYPE_NONE            - default, unspecified type
+ * DT_IRQ_TYPE_EDGE_RISING     - rising edge triggered
+ * DT_IRQ_TYPE_EDGE_FALLING    - falling edge triggered
+ * DT_IRQ_TYPE_EDGE_BOTH       - rising and falling edge triggered
+ * DT_IRQ_TYPE_LEVEL_HIGH      - high level triggered
+ * DT_IRQ_TYPE_LEVEL_LOW       - low level triggered
+ */
+#define DT_IRQ_TYPE_NONE           0x00000000
+#define DT_IRQ_TYPE_EDGE_RISING    0x00000001
+#define DT_IRQ_TYPE_EDGE_FALLING   0x00000002
+#define DT_IRQ_TYPE_EDGE_BOTH                           \
+    (DT_IRQ_TYPE_EDGE_FALLING | DT_IRQ_TYPE_EDGE_RISING)
+#define DT_IRQ_TYPE_LEVEL_HIGH     0x00000004
+#define DT_IRQ_TYPE_LEVEL_LOW      0x00000008
+
 int libxl__arch_domain_create(libxl__gc *gc, libxl_domain_config *d_config,
                               uint32_t domid)
 {
@@ -338,9 +355,12 @@ static int make_timer_node(libxl__gc *gc, void *fdt, const struct arch_info *ain
     res = fdt_property_compat(gc, fdt, 1, ainfo->timer_compat);
     if (res) return res;
 
-    set_interrupt_ppi(ints[0], GUEST_TIMER_PHYS_S_PPI, 0xf, 0x8);
-    set_interrupt_ppi(ints[1], GUEST_TIMER_PHYS_NS_PPI, 0xf, 0x8);
-    set_interrupt_ppi(ints[2], GUEST_TIMER_VIRT_PPI, 0xf, 0x8);
+    set_interrupt_ppi(ints[0], GUEST_TIMER_PHYS_S_PPI, 0xf,
+                      DT_IRQ_TYPE_LEVEL_LOW);
+    set_interrupt_ppi(ints[1], GUEST_TIMER_PHYS_NS_PPI, 0xf,
+                      DT_IRQ_TYPE_LEVEL_LOW);
+    set_interrupt_ppi(ints[2], GUEST_TIMER_VIRT_PPI, 0xf,
+                      DT_IRQ_TYPE_LEVEL_LOW);
 
     res = fdt_property_interrupts(gc, fdt, ints, 3);
     if (res) return res;
@@ -378,7 +398,8 @@ static int make_hypervisor_node(libxl__gc *gc, void *fdt,
      *  - Active-low level-sensitive
      *  - All cpus
      */
-    set_interrupt_ppi(intr, GUEST_EVTCHN_PPI, 0xf, 0x8);
+    set_interrupt_ppi(intr, GUEST_EVTCHN_PPI, 0xf,
+                      DT_IRQ_TYPE_LEVEL_LOW);
 
     res = fdt_property_interrupts(gc, fdt, &intr, 1);
     if (res) return res;
-- 
1.7.10.4


* [RFC 17/19] libxl/arm: Rename set_interrupt_ppi to set_interrupt and handle SPIs
  2014-06-16 16:17 [RFC 00/19] xe/arm: Add support for non-pci passthrough Julien Grall
                   ` (15 preceding siblings ...)
  2014-06-16 16:18 ` [RFC 16/19] libxl/arm: Introduce DT_IRQ_TYPE_* Julien Grall
@ 2014-06-16 16:18 ` Julien Grall
  2014-07-03 11:58   ` Ian Campbell
  2014-06-16 16:18 ` [RFC 18/19] libxl: Add support for non-PCI passthrough Julien Grall
  2014-06-16 16:18 ` [RFC 19/19] xl: Add new option dtdev Julien Grall
  18 siblings, 1 reply; 122+ messages in thread
From: Julien Grall @ 2014-06-16 16:18 UTC (permalink / raw)
  To: xen-devel
  Cc: ian.campbell, Stefano Stabellini, tim, Julien Grall, Ian Jackson,
	stefano.stabellini

The function will be used later during device passthrough to create
interrupts in the device tree. Those interrupts are usually SPIs.

Signed-off-by: Julien Grall <julien.grall@linaro.org>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
---
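For reference, the three-cell GIC interrupt specifier built by the renamed
function can be sketched as below. This is a simplified model: the cells are
kept in host byte order (the real code writes big-endian cells through
set_cell), and toy_set_interrupt and the TOY_* constants are illustrative
names only.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_IRQ_TYPE_LEVEL_HIGH 0x4
#define TOY_IRQ_TYPE_LEVEL_LOW  0x8

/* Fill a three-cell GIC interrupt specifier, in host byte order. */
static void toy_set_interrupt(uint32_t cells[3], unsigned int irq,
                              unsigned int cpumask, unsigned int level)
{
    int is_ppi = (irq < 32);

    /* SGIs (0-15) are not described in the device tree */
    assert(irq >= 16);

    cells[0] = is_ppi ? 1 : 0;              /* 1 = PPI, 0 = SPI */
    cells[1] = irq - (is_ppi ? 16 : 32);    /* PPIs start at 16, SPIs at 32 */
    cells[2] = (cpumask << 8) | level;      /* CPU mask and trigger type */
}
```

For example, PPI 27 becomes the pair (1, 11) and SPI 40 becomes (0, 8) in the
first two cells.
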
 tools/libxl/libxl_arm.c |   26 ++++++++++++++------------
 1 file changed, 14 insertions(+), 12 deletions(-)

diff --git a/tools/libxl/libxl_arm.c b/tools/libxl/libxl_arm.c
index 1edb87a..e19e2f4 100644
--- a/tools/libxl/libxl_arm.c
+++ b/tools/libxl/libxl_arm.c
@@ -64,14 +64,20 @@ static void set_cell(be32 **cellp, int size, uint64_t val)
     (*cellp) += cells;
 }
 
-static void set_interrupt_ppi(gic_interrupt interrupt, unsigned int irq,
-                              unsigned int cpumask, unsigned int level)
+static void set_interrupt(gic_interrupt interrupt, unsigned int irq,
+                          unsigned int cpumask, unsigned int level)
 {
     be32 *cells = interrupt;
+    int is_ppi = (irq < 32);
+
+    /* SGIs are not described in the device tree */
+    assert(irq >= 16);
+
+    irq -= (is_ppi) ? 16: 32; /* PPIs start at 16, SPIs at 32 */
 
     /* See linux Documentation/devictree/bindings/arm/gic.txt */
-    set_cell(&cells, 1, 1); /* is a PPI */
-    set_cell(&cells, 1, irq - 16); /* PPIs start at 16 */
+    set_cell(&cells, 1, is_ppi); /* is a PPI? */
+    set_cell(&cells, 1, irq);
     set_cell(&cells, 1, (cpumask << 8) | level);
 }
 
@@ -355,12 +361,9 @@ static int make_timer_node(libxl__gc *gc, void *fdt, const struct arch_info *ain
     res = fdt_property_compat(gc, fdt, 1, ainfo->timer_compat);
     if (res) return res;
 
-    set_interrupt_ppi(ints[0], GUEST_TIMER_PHYS_S_PPI, 0xf,
-                      DT_IRQ_TYPE_LEVEL_LOW);
-    set_interrupt_ppi(ints[1], GUEST_TIMER_PHYS_NS_PPI, 0xf,
-                      DT_IRQ_TYPE_LEVEL_LOW);
-    set_interrupt_ppi(ints[2], GUEST_TIMER_VIRT_PPI, 0xf,
-                      DT_IRQ_TYPE_LEVEL_LOW);
+    set_interrupt(ints[0], GUEST_TIMER_PHYS_S_PPI, 0xf, DT_IRQ_TYPE_LEVEL_LOW);
+    set_interrupt(ints[1], GUEST_TIMER_PHYS_NS_PPI, 0xf, DT_IRQ_TYPE_LEVEL_LOW);
+    set_interrupt(ints[2], GUEST_TIMER_VIRT_PPI, 0xf, DT_IRQ_TYPE_LEVEL_LOW);
 
     res = fdt_property_interrupts(gc, fdt, ints, 3);
     if (res) return res;
@@ -398,8 +401,7 @@ static int make_hypervisor_node(libxl__gc *gc, void *fdt,
      *  - Active-low level-sensitive
      *  - All cpus
      */
-    set_interrupt_ppi(intr, GUEST_EVTCHN_PPI, 0xf,
-                      DT_IRQ_TYPE_LEVEL_LOW);
+    set_interrupt(intr, GUEST_EVTCHN_PPI, 0xf, DT_IRQ_TYPE_LEVEL_LOW);
 
     res = fdt_property_interrupts(gc, fdt, &intr, 1);
     if (res) return res;
-- 
1.7.10.4


* [RFC 18/19] libxl: Add support for non-PCI passthrough
  2014-06-16 16:17 [RFC 00/19] xe/arm: Add support for non-pci passthrough Julien Grall
                   ` (16 preceding siblings ...)
  2014-06-16 16:18 ` [RFC 17/19] libxl/arm: Rename set_interrupt_ppi to set_interrupt and handle SPIs Julien Grall
@ 2014-06-16 16:18 ` Julien Grall
  2014-06-16 17:19   ` Wei Liu
  2014-06-16 16:18 ` [RFC 19/19] xl: Add new option dtdev Julien Grall
  18 siblings, 1 reply; 122+ messages in thread
From: Julien Grall @ 2014-06-16 16:18 UTC (permalink / raw)
  To: xen-devel
  Cc: ian.campbell, Stefano Stabellini, tim, Julien Grall, Ian Jackson,
	stefano.stabellini

On ARM, every non-PCI device is described in the device tree. Each of them
can be found via a path.

This path will be used to retrieve the different pieces of information about
the device (compatible string, interrupts, MMIOs). Libxl will take care of:
    - Allocating the MMIO regions for the device in the guest
    - Creating the device node in the guest device tree
    - Mapping the IRQs and MMIO ranges in the guest P2M

Note that the device node won't contain device-specific properties.
Only generic ones (compatible, interrupts, regs) will be created by libxl.

In the future, per-device properties will be added, maybe via a configuration
file listing what is needed.

Signed-off-by: Julien Grall <julien.grall@linaro.org>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
---
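The naming of the generated guest node can be sketched as follows, assuming
4KB pages: a unit address (the part after '@') is only appended when the
device has at least one MMIO region, and it is the guest address of the first
region. toy_dtdev_name is a hypothetical helper built on snprintf, not the
libxl code itself.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

#define TOY_PAGE_SHIFT 12   /* assumes 4KB pages */

/*
 * Write the guest node name into buf and return the value of snprintf,
 * i.e. the length of the full name excluding the terminating NUL.
 */
static int toy_dtdev_name(char *buf, size_t len, unsigned int index,
                          unsigned int num_mmios, uint64_t first_gfn)
{
    assert(buf != NULL && len > 0);

    if (num_mmios > 0)
        return snprintf(buf, len, "dtdev-%u@%" PRIx64, index,
                        first_gfn << TOY_PAGE_SHIFT);

    return snprintf(buf, len, "dtdev-%u", index);
}
```

A device whose first region was placed at guest frame 0x10000 would be named
"dtdev-0@10000000", while a device with interrupts only would get "dtdev-0".
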
 tools/libxl/Makefile         |    2 +-
 tools/libxl/libxl_arch.h     |    4 +-
 tools/libxl/libxl_arm.c      |  103 ++++++++++++++++++++++++++-
 tools/libxl/libxl_create.c   |   11 +++
 tools/libxl/libxl_dom.c      |   22 +++++-
 tools/libxl/libxl_dtdev.c    |  159 ++++++++++++++++++++++++++++++++++++++++++
 tools/libxl/libxl_internal.h |   31 ++++++++
 tools/libxl/libxl_types.idl  |    5 ++
 tools/libxl/libxl_x86.c      |    4 +-
 9 files changed, 335 insertions(+), 6 deletions(-)
 create mode 100644 tools/libxl/libxl_dtdev.c

diff --git a/tools/libxl/Makefile b/tools/libxl/Makefile
index 4cfa275..986df08 100644
--- a/tools/libxl/Makefile
+++ b/tools/libxl/Makefile
@@ -76,7 +76,7 @@ LIBXL_OBJS = flexarray.o libxl.o libxl_create.o libxl_dm.o libxl_pci.o \
 			libxl_internal.o libxl_utils.o libxl_uuid.o \
 			libxl_json.o libxl_aoutils.o libxl_numa.o \
 			libxl_save_callout.o _libxl_save_msgs_callout.o \
-			libxl_qmp.o libxl_event.o libxl_fork.o $(LIBXL_OBJS-y)
+			libxl_qmp.o libxl_event.o libxl_fork.o libxl_dtdev.o $(LIBXL_OBJS-y)
 LIBXL_OBJS += _libxl_types.o libxl_flask.o _libxl_types_internal.o
 
 LIBXL_TESTS += timedereg
diff --git a/tools/libxl/libxl_arch.h b/tools/libxl/libxl_arch.h
index d3bc136..fea1893 100644
--- a/tools/libxl/libxl_arch.h
+++ b/tools/libxl/libxl_arch.h
@@ -17,11 +17,13 @@
 
 /* arch specific internal domain creation function */
 int libxl__arch_domain_create(libxl__gc *gc, libxl_domain_config *d_config,
-               uint32_t domid);
+                              libxl__domain_build_state *state,
+                              uint32_t domid);
 
 /* setup arch specific hardware description, i.e. DTB on ARM */
 int libxl__arch_domain_init_hw_description(libxl__gc *gc,
                                            libxl_domain_build_info *info,
+                                           libxl__domain_build_state *state,
                                            struct xc_dom_image *dom);
 /* finalize arch specific hardware description. */
 int libxl__arch_domain_finalise_hw_description(libxl__gc *gc,
diff --git a/tools/libxl/libxl_arm.c b/tools/libxl/libxl_arm.c
index e19e2f4..0eec174 100644
--- a/tools/libxl/libxl_arm.c
+++ b/tools/libxl/libxl_arm.c
@@ -24,8 +24,38 @@
 #define DT_IRQ_TYPE_LEVEL_LOW      0x00000008
 
 int libxl__arch_domain_create(libxl__gc *gc, libxl_domain_config *d_config,
+                              libxl__domain_build_state *state,
                               uint32_t domid)
 {
+    int dev_index;
+    uint32_t mmio_index;
+    uint64_t mmiocurr = GUEST_MMIO_BASE >> XC_PAGE_SHIFT;
+    /* Convenient */
+    const uint64_t mmioend = (GUEST_MMIO_BASE + GUEST_MMIO_SIZE) >> XC_PAGE_SHIFT;
+
+
+    for (dev_index = 0; dev_index < d_config->num_dtdevs; dev_index++) {
+        const libxl_device_dt *dtdev = &d_config->dtdevs[dev_index];
+        libxl__dtdev_info *info = &state->dtdevs_info[dev_index];
+
+        LOG(DEBUG, "Allocate %d MMIOs region for \"%s\"",
+            info->num_mmios, dtdev->path);
+        for (mmio_index = 0; mmio_index < info->num_mmios; mmio_index++) {
+            libxl_iomem_range *io = &info->mmios[mmio_index];
+
+            /* Check if we have enough space for the MMIO region */
+            /* TODO: Do I need to check overlap? */
+            if ((mmiocurr + io->number) > mmioend) {
+                LOG(ERROR, "Not enough space in the guest layout to allocate the MMIOs regions");
+                return -ENOMEM;
+            }
+            LOG(DEBUG, "\t0x%"PRIx64"-0x%"PRIx64,
+                mmiocurr, mmiocurr + io->number);
+            io->gfn = mmiocurr;
+            mmiocurr += io->number;
+        }
+    }
+
     return 0;
 }
 
@@ -412,6 +442,72 @@ static int make_hypervisor_node(libxl__gc *gc, void *fdt,
     return 0;
 }
 
+static int make_dtdev_node(libxl__gc *gc, void *fdt,
+                           int index, const libxl__dtdev_info *dtdev)
+{
+    const char *name;
+    uint32_t i;
+    int res;
+
+    /* The unit-address (after @) is only required when the device has MMIO */
+    if (dtdev->num_mmios > 0) {
+        uint64_t base = dtdev->mmios[0].gfn << XC_PAGE_SHIFT;
+
+        name = GCSPRINTF("dtdev-%u@%"PRIx64, index, base);
+    } else
+        name = GCSPRINTF("dtdev-%u", index);
+
+    res = fdt_begin_node(fdt, name);
+    if (res) return res;
+
+    assert(dtdev->compat_len != 0);
+    fdt_property(fdt, "compatible", dtdev->compat, dtdev->compat_len);
+
+    if (dtdev->num_mmios > 0) {
+        be32 *regs, *cells;
+        /* Convenient */
+        const unsigned addr_cells = ROOT_ADDRESS_CELLS;
+        const unsigned size_cells = ROOT_SIZE_CELLS;
+        const unsigned len = sizeof(*regs) * (addr_cells + size_cells) * dtdev->num_mmios;
+
+        regs = libxl__malloc(gc, len);
+        cells = &regs[0];
+
+        for (i = 0; i < dtdev->num_mmios; i++) {
+            uint64_t base = dtdev->mmios[i].gfn << XC_PAGE_SHIFT;
+            uint64_t size = dtdev->mmios[i].number << XC_PAGE_SHIFT;
+
+            set_range(&cells, addr_cells, size_cells, base, size);
+        }
+
+        res = fdt_property(fdt, "reg", regs, len);
+        if (res) return res;
+    }
+
+    if (dtdev->num_irqs > 0) {
+        gic_interrupt *ints;
+
+        ints = libxl__malloc(gc, sizeof(*ints) * dtdev->num_irqs);
+        for (i =0; i < dtdev->num_irqs; i++) {
+            /* TODO: Translate the IRQ type into DT type. We should
+             * not assume a 1:1 mapping */
+            /* For now, Xen is only handling SPIs passthrough and
+             * forward to VCPU0
+             */
+            assert(dtdev->irqs[i].irq >= 32);
+            set_interrupt(ints[i], dtdev->irqs[i].irq,
+                          0x1, dtdev->irqs[i].type);
+        }
+
+        res = fdt_property_interrupts(gc, fdt, ints, dtdev->num_irqs);
+    }
+
+    res = fdt_end_node(fdt);
+    if (res) return res;
+
+    return 0;
+}
+
 static const struct arch_info *get_arch_info(libxl__gc *gc,
                                              const struct xc_dom_image *dom)
 {
@@ -454,10 +550,11 @@ out:
 
 int libxl__arch_domain_init_hw_description(libxl__gc *gc,
                                            libxl_domain_build_info *info,
+                                           libxl__domain_build_state *state,
                                            struct xc_dom_image *dom)
 {
     void *fdt = NULL;
-    int rc, res;
+    int rc, res, i;
     size_t fdt_size = 0;
 
     const libxl_version_info *vers;
@@ -527,6 +624,10 @@ next_resize:
         FDT( make_timer_node(gc, fdt, ainfo) );
         FDT( make_hypervisor_node(gc, fdt, vers) );
 
+        for (i = 0; i < state->num_dtdevs; i++) {
+            FDT( make_dtdev_node(gc, fdt, i, &state->dtdevs_info[i]) );
+        }
+
         FDT( fdt_end_node(fdt) );
 
         FDT( fdt_finish(fdt) );
diff --git a/tools/libxl/libxl_create.c b/tools/libxl/libxl_create.c
index e544bbf..cad9987 100644
--- a/tools/libxl/libxl_create.c
+++ b/tools/libxl/libxl_create.c
@@ -1298,6 +1298,7 @@ static void domcreate_attach_pci(libxl__egc *egc, libxl__multidev *multidev,
 
     /* convenience aliases */
     libxl_domain_config *const d_config = dcs->guest_config;
+    libxl__domain_build_state *const state = &dcs->build_state;
 
     if (ret) {
         LOG(ERROR, "unable to add vtpm devices");
@@ -1323,6 +1324,16 @@ static void domcreate_attach_pci(libxl__egc *egc, libxl__multidev *multidev,
         }
     }
 
+    for (i = 0; i < d_config->num_dtdevs; i++) {
+
+        ret = libxl__device_dt_add(gc, domid, &state->dtdevs_info[i]);
+        if (ret < 0) {
+            LIBXL__LOG(ctx, LIBXL__LOG_ERROR,
+                       "libxl__device_dt_add failed: %d\n", ret);
+            goto error_out;
+        }
+    }
+
     domcreate_console_available(egc, dcs);
 
     domcreate_complete(egc, dcs, 0);
diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index 661999c..c58a4a8 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -233,6 +233,7 @@ int libxl__build_pre(libxl__gc *gc, uint32_t domid,
     libxl_domain_build_info *const info = &d_config->b_info;
     libxl_ctx *ctx = libxl__gc_owner(gc);
     char *xs_domid, *con_domid;
+    int i;
     int rc;
 
     if (xc_domain_max_vcpus(ctx->xch, domid, info->max_vcpus) != 0) {
@@ -284,7 +285,23 @@ int libxl__build_pre(libxl__gc *gc, uint32_t domid,
     if (info->type == LIBXL_DOMAIN_TYPE_HVM)
         hvm_set_conf_params(ctx->xch, domid, info);
 
-    rc = libxl__arch_domain_create(gc, d_config, domid);
+    /* We need to retrieve DT devs information before calling
+     * libxl__arch_domain_create. On ARM, the function will allocate GFNs
+     * for each MMIO region
+     */
+    state->num_dtdevs = d_config->num_dtdevs;
+    state->dtdevs_info = libxl__calloc(gc, sizeof(*state->dtdevs_info),
+                                       d_config->num_dtdevs);
+    for (i = 0; i < d_config->num_dtdevs; i++) {
+        rc = libxl__device_dt_get_info(gc, &d_config->dtdevs[i],
+                                       &state->dtdevs_info[i]);
+        if (rc) {
+            LOG(ERROR, "libxl__device_dt_get_info failed: %d", rc);
+            return ERROR_INVAL;
+        }
+    }
+
+    rc = libxl__arch_domain_create(gc, d_config, state, domid);
 
     return rc;
 }
@@ -436,7 +453,8 @@ int libxl__build_pv(libxl__gc *gc, uint32_t domid,
         LOGE(ERROR, "xc_dom_parse_image failed");
         goto out;
     }
-    if ( (ret = libxl__arch_domain_init_hw_description(gc, info, dom)) != 0 ) {
+    ret = libxl__arch_domain_init_hw_description(gc, info, state, dom);
+    if ( ret != 0 ) {
         LOGE(ERROR, "libxl__arch_domain_init_hw_description failed");
         goto out;
     }
diff --git a/tools/libxl/libxl_dtdev.c b/tools/libxl/libxl_dtdev.c
new file mode 100644
index 0000000..995de5c
--- /dev/null
+++ b/tools/libxl/libxl_dtdev.c
@@ -0,0 +1,159 @@
+/*
+ * Copyright (C) 2014      Linaro Limited.
+ * Author Julien Grall <julien.grall@linaro.org>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU Lesser General Public License as published
+ * by the Free Software Foundation; version 2.1 only. with the special
+ * exception on linking described in file LICENSE.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU Lesser General Public License for more details.
+ */
+
+#include "libxl_osdeps.h" /* Must come before other headers */
+
+#include "libxl_internal.h"
+
+int libxl__device_dt_add(libxl__gc *gc, uint32_t domid,
+                         const libxl__dtdev_info *dtdev)
+{
+    uint32_t i;
+    int ret;
+
+    LOG(DEBUG, "Assign device \"%s\" to dom%u", dtdev->conf->path, domid);
+
+    for (i = 0; i < dtdev->num_mmios; i++) {
+        const libxl_iomem_range *io = &dtdev->mmios[i];
+
+        ret = xc_domain_iomem_permission(CTX->xch, domid, io->start,
+                                         io->number, 1);
+        if (ret < 0) {
+            LOGE(ERROR,
+                 "%s: failed to give dom%u access to iomem range "
+                 "%"PRIx64"-%"PRIx64, dtdev->conf->path,
+                 domid, io->start, io->start + io->number - 1);
+            return ret;
+        }
+
+        ret = xc_domain_memory_mapping(CTX->xch, domid, io->gfn,
+                                       io->start, io->number, 1);
+        if (ret < 0) {
+            LOGE(ERROR,
+                "%s: failed to map to dom%u iomem range %"PRIx64"-%"PRIx64
+                " to guest address %"PRIx64,
+                dtdev->conf->path, domid, io->start,
+                io->start + io->number - 1, io->gfn);
+            return ret;
+        }
+    }
+
+    for (i = 0; i < dtdev->num_irqs; i++) {
+        int irq = dtdev->irqs[i].irq;
+
+        ret = xc_physdev_map_pirq(CTX->xch, domid, irq, &irq);
+        if (ret < 0) {
+            LOGE(ERROR, "%s: failed to map the IRQ %u into dom%u",
+                 dtdev->conf->path, irq, domid);
+            return ret;
+        }
+    }
+
+    return xc_assign_dt_device(CTX->xch, domid, dtdev->conf->path);
+}
+
+int libxl__device_dt_get_info(libxl__gc *gc,
+                              const libxl_device_dt *dtdev,
+                              libxl__dtdev_info *dtinfo)
+{
+    xc_dtdev_info_t info;
+    int ret = 0;
+    uint32_t i;
+    char *buff;
+    uint32_t len;
+
+    LOG(DEBUG, "Get information for DT dev \"%s\"", dtdev->path);
+
+    ret = xc_physdev_dtdev_getinfo(CTX->xch, dtdev->path, &info);
+    if (ret) {
+        LOGE(ERROR, "Unable to get the information for \"%s\"",
+             dtdev->path);
+        return ret;
+    }
+
+    LOG(DEBUG, "num_irqs = %u num_mmios = %u compat_len = %u",
+        info.num_irqs, info.num_mmios, info.compat_len);
+
+    dtinfo->conf = dtdev;
+
+    /* Retrieve the IRQs */
+    dtinfo->num_irqs = info.num_irqs;
+    dtinfo->irqs = libxl__calloc(gc, dtinfo->num_irqs,
+                                 sizeof(*dtinfo->irqs));
+
+    LOG(DEBUG, "List of IRQs");
+    for (i = 0; i < dtinfo->num_irqs; i++) {
+        xc_dtdev_irq_t irq;
+
+        ret = xc_physdev_dtdev_getirq(CTX->xch, dtdev->path, i, &irq);
+        if (ret) {
+            LOGE(ERROR, "Unable to get IRQ%u for \"%s\"", i, dtdev->path);
+            return ret;
+        }
+
+        LOG(DEBUG, "\t- irq = %u type = %u", irq.irq, irq.type);
+        dtinfo->irqs[i].irq = irq.irq;
+        /* TODO translate the type correctly */
+        dtinfo->irqs[i].type = irq.type;
+    }
+
+    /* Retrieve the MMIOs range */
+    dtinfo->num_mmios = info.num_mmios;
+    dtinfo->mmios =  libxl__calloc(gc, dtinfo->num_mmios,
+                                   sizeof(*dtinfo->mmios));
+
+    LOG(DEBUG, "List of MMIOs");
+    for (i = 0; i < dtinfo->num_mmios; i++) {
+        xc_dtdev_mmio_t mmio;
+
+        ret = xc_physdev_dtdev_getmmio(CTX->xch, dtdev->path, i, &mmio);
+        if (ret) {
+            LOGE(ERROR, "Unable to get the MMIO range %u for \"%s\"",
+                 i, dtdev->path);
+            return ret;
+        }
+
+        LOG(DEBUG, "\t- mfn = 0x%"PRIx64" nr_mfn = 0x%"PRIx64,
+            mmio.mfn, mmio.nr_mfn);
+
+        dtinfo->mmios[i].start = mmio.mfn;
+        dtinfo->mmios[i].number = mmio.nr_mfn;
+        dtinfo->mmios[i].gfn = LIBXL_INVALID_GFN;
+    }
+
+    /* Retrieve the compatible property */
+    len = info.compat_len;
+    buff = libxl__malloc(gc, len);
+
+    ret = xc_physdev_dtdev_getcompat(CTX->xch, dtdev->path,
+                                     buff, &len);
+    if (ret) {
+        LOGE(ERROR, "Unable to get the compatible string for \"%s\"",
+             dtdev->path);
+        return ret;
+    }
+    dtinfo->compat_len = len;
+    dtinfo->compat = buff;
+
+    return 0;
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 60d6f1d..cfe4482 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -944,6 +944,25 @@ typedef struct {
 _hidden int libxl__file_reference_map(libxl__file_reference *f);
 _hidden int libxl__file_reference_unmap(libxl__file_reference *f);
 
+/* DT dev representation */
+typedef struct {
+    uint32_t num_mmios;
+    libxl_iomem_range *mmios;
+
+    uint32_t num_irqs;
+    struct
+    {
+        uint32_t irq;
+        uint32_t type;
+    } *irqs;
+
+    const char *compat;
+    ssize_t compat_len; /* The compatible string may contain \0 */
+
+    /* Short-hand to the User configuration for this device */
+    const libxl_device_dt *conf;
+} libxl__dtdev_info;
+
 /* from xl_dom */
 _hidden libxl_domain_type libxl__domain_type(libxl__gc *gc, uint32_t domid);
 _hidden int libxl__domain_shutdown_reason(libxl__gc *gc, uint32_t domid);
@@ -969,6 +988,9 @@ typedef struct {
     libxl__file_reference pv_ramdisk;
     const char * pv_cmdline;
     bool pvh_enabled;
+
+    int num_dtdevs;
+    libxl__dtdev_info *dtdevs_info;
 } libxl__domain_build_state;
 
 _hidden int libxl__build_pre(libxl__gc *gc, uint32_t domid,
@@ -1152,6 +1174,15 @@ _hidden int libxl__create_pci_backend(libxl__gc *gc, uint32_t domid,
                                       libxl_device_pci *pcidev, int num);
 _hidden int libxl__device_pci_destroy_all(libxl__gc *gc, uint32_t domid);
 
+/* from libxl_dtdev */
+
+_hidden int libxl__device_dt_add(libxl__gc *gc, uint32_t domid,
+                                 const libxl__dtdev_info *info);
+_hidden int libxl__device_dt_get_info(libxl__gc *gc,
+                                      const libxl_device_dt *dtdev,
+                                      libxl__dtdev_info *info);
+
+
 /*----- xswait: wait for a xenstore node to be suitable -----*/
 
 typedef struct libxl__xswait_state libxl__xswait_state;
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index 9201461..74ba20d 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -452,6 +452,10 @@ libxl_device_pci = Struct("device_pci", [
     ("seize", bool),
     ])
 
+libxl_device_dt = Struct("device_dt", [
+    ("path", string),
+    ])
+
 libxl_device_vtpm = Struct("device_vtpm", [
     ("backend_domid",    libxl_domid),
     ("backend_domname",  string),
@@ -466,6 +470,7 @@ libxl_domain_config = Struct("domain_config", [
     ("disks", Array(libxl_device_disk, "num_disks")),
     ("nics", Array(libxl_device_nic, "num_nics")),
     ("pcidevs", Array(libxl_device_pci, "num_pcidevs")),
+    ("dtdevs", Array(libxl_device_dt, "num_dtdevs")),
     ("vfbs", Array(libxl_device_vfb, "num_vfbs")),
     ("vkbs", Array(libxl_device_vkb, "num_vkbs")),
     ("vtpms", Array(libxl_device_vtpm, "num_vtpms")),
diff --git a/tools/libxl/libxl_x86.c b/tools/libxl/libxl_x86.c
index 7589060..3e9640b 100644
--- a/tools/libxl/libxl_x86.c
+++ b/tools/libxl/libxl_x86.c
@@ -245,7 +245,8 @@ static int libxl__e820_alloc(libxl__gc *gc, uint32_t domid,
 }
 
 int libxl__arch_domain_create(libxl__gc *gc, libxl_domain_config *d_config,
-        uint32_t domid)
+                              libxl__domain_build_state *state,
+                              uint32_t domid)
 {
     int ret = 0;
     int tsc_mode;
@@ -313,6 +314,7 @@ int libxl__arch_domain_create(libxl__gc *gc, libxl_domain_config *d_config,
 
 int libxl__arch_domain_init_hw_description(libxl__gc *gc,
                                            libxl_domain_build_info *info,
+                                           libxl__domain_build_state *state,
                                            struct xc_dom_image *dom)
 {
     return 0;
-- 
1.7.10.4

* [RFC 19/19] xl: Add new option dtdev
  2014-06-16 16:17 [RFC 00/19] xe/arm: Add support for non-pci passthrough Julien Grall
                   ` (17 preceding siblings ...)
  2014-06-16 16:18 ` [RFC 18/19] libxl: Add support for non-PCI passthrough Julien Grall
@ 2014-06-16 16:18 ` Julien Grall
  2014-06-16 17:19   ` Wei Liu
  18 siblings, 1 reply; 122+ messages in thread
From: Julien Grall @ 2014-06-16 16:18 UTC (permalink / raw)
  To: xen-devel
  Cc: ian.campbell, Stefano Stabellini, tim, Julien Grall, Ian Jackson,
	stefano.stabellini

The option "dtdev" will be used to pass through a non-PCI device described
in the device tree to a guest.

Signed-off-by: Julien Grall <julien.grall@linaro.org>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
---
 docs/man/xl.cfg.pod.5    |    5 +++++
 tools/libxl/xl_cmdimpl.c |   21 ++++++++++++++++++++-
 2 files changed, 25 insertions(+), 1 deletion(-)

diff --git a/docs/man/xl.cfg.pod.5 b/docs/man/xl.cfg.pod.5
index 025df27..fdadbe6 100644
--- a/docs/man/xl.cfg.pod.5
+++ b/docs/man/xl.cfg.pod.5
@@ -601,6 +601,11 @@ More information about Xen gfx_passthru feature is available
 on the XenVGAPassthrough L<http://wiki.xen.org/wiki/XenVGAPassthrough>
 wiki page.
 
+=item B<dtdev=[ "DTDEV_PATH", "DTDEV_PATH", ... ]>
+
+Specifies the host device nodes to pass through to this guest. Each
+B<DTDEV_PATH> is the absolute path of a node in the host device tree.
+
 =item B<ioports=[ "IOPORT_RANGE", "IOPORT_RANGE", ... ]>
 
 Allow guest to access specific legacy I/O ports. Each B<IOPORT_RANGE>
diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
index f99e6b6..3ecc804 100644
--- a/tools/libxl/xl_cmdimpl.c
+++ b/tools/libxl/xl_cmdimpl.c
@@ -702,7 +702,7 @@ static void parse_config_data(const char *config_source,
     long l;
     XLU_Config *config;
     XLU_ConfigList *cpus, *vbds, *nics, *pcis, *cvfbs, *cpuids, *vtpms;
-    XLU_ConfigList *ioports, *irqs, *iomem;
+    XLU_ConfigList *ioports, *irqs, *iomem, *dtdevs;
     int num_ioports, num_irqs, num_iomem;
     int pci_power_mgmt = 0;
     int pci_msitranslate = 0;
@@ -1463,6 +1463,25 @@ skip_vfb:
             libxl_defbool_set(&b_info->u.pv.e820_host, true);
     }
 
+    if (!xlu_cfg_get_list (config, "dtdev", &dtdevs, 0, 0)) {
+        d_config->num_dtdevs = 0;
+        d_config->dtdevs = NULL;
+        for (i = 0; (buf = xlu_cfg_get_listitem(dtdevs, i)) != NULL; i++) {
+            libxl_device_dt *dtdev;
+
+            d_config->dtdevs = (libxl_device_dt *) realloc(d_config->dtdevs, sizeof (libxl_device_dt) * (d_config->num_dtdevs + 1));
+            dtdev = d_config->dtdevs + d_config->num_dtdevs;
+            libxl_device_dt_init(dtdev);
+
+            dtdev->path = strdup(buf);
+            if (dtdev->path == NULL) {
+                fprintf(stderr, "unable to duplicate string for dtdevs\n");
+                exit(-1);
+            }
+            d_config->num_dtdevs++;
+        }
+    }
+
     switch (xlu_cfg_get_list(config, "cpuid", &cpuids, 0, 1)) {
     case 0:
         {
-- 
1.7.10.4

* Re: [RFC 18/19] libxl: Add support for non-PCI passthrough
  2014-06-16 16:18 ` [RFC 18/19] libxl: Add support for non-PCI passthrough Julien Grall
@ 2014-06-16 17:19   ` Wei Liu
  2014-06-18 12:26     ` Julien Grall
  0 siblings, 1 reply; 122+ messages in thread
From: Wei Liu @ 2014-06-16 17:19 UTC (permalink / raw)
  To: Julien Grall
  Cc: wei.liu2, ian.campbell, Stefano Stabellini, tim, Ian Jackson,
	stefano.stabellini, xen-devel

On Mon, Jun 16, 2014 at 05:18:05PM +0100, Julien Grall wrote:
[...]
>  /*----- xswait: wait for a xenstore node to be suitable -----*/
>  
>  typedef struct libxl__xswait_state libxl__xswait_state;
> diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
> index 9201461..74ba20d 100644
> --- a/tools/libxl/libxl_types.idl
> +++ b/tools/libxl/libxl_types.idl
> @@ -452,6 +452,10 @@ libxl_device_pci = Struct("device_pci", [
>      ("seize", bool),
>      ])
>  
> +libxl_device_dt = Struct("device_dt", [
> +    ("path", string),
> +    ])
> +
>  libxl_device_vtpm = Struct("device_vtpm", [
>      ("backend_domid",    libxl_domid),
>      ("backend_domname",  string),
> @@ -466,6 +470,7 @@ libxl_domain_config = Struct("domain_config", [
>      ("disks", Array(libxl_device_disk, "num_disks")),
>      ("nics", Array(libxl_device_nic, "num_nics")),
>      ("pcidevs", Array(libxl_device_pci, "num_pcidevs")),
> +    ("dtdevs", Array(libxl_device_dt, "num_dtdevs")),

I would say let's go for "dts" instead of "dtdevs", just like "nics",
"vtpms" etc. Or you can do it the other way around, make
"libxl_device_dt" "libxl_device_dtdev". So that it follows the pattern

  pointer libxl_device_FOO *FOOs, counter num_FOOs.

"pcidevs" has caused enough trouble already. :-)


Wei.

* Re: [RFC 19/19] xl: Add new option dtdev
  2014-06-16 16:18 ` [RFC 19/19] xl: Add new option dtdev Julien Grall
@ 2014-06-16 17:19   ` Wei Liu
  2014-06-18 13:40     ` Julien Grall
  0 siblings, 1 reply; 122+ messages in thread
From: Wei Liu @ 2014-06-16 17:19 UTC (permalink / raw)
  To: Julien Grall
  Cc: wei.liu2, ian.campbell, Stefano Stabellini, tim, Ian Jackson,
	stefano.stabellini, xen-devel

On Mon, Jun 16, 2014 at 05:18:06PM +0100, Julien Grall wrote:
[...]
> +    if (!xlu_cfg_get_list (config, "dtdev", &dtdevs, 0, 0)) {
> +        d_config->num_dtdevs = 0;
> +        d_config->dtdevs = NULL;
> +        for (i = 0; (buf = xlu_cfg_get_listitem(dtdevs, i)) != NULL; i++) {
> +            libxl_device_dt *dtdev;
> +
> +            d_config->dtdevs = (libxl_device_dt *) realloc(d_config->dtdevs, sizeof (libxl_device_dt) * (d_config->num_dtdevs + 1));
> +            dtdev = d_config->dtdevs + d_config->num_dtdevs;
> +            libxl_device_dt_init(dtdev);
> +

There's a macro called ARRAY_EXTEND_INIT, you can probably use that.

Wei.

* Re: [RFC 02/19] xen: guestcopy: Provide an helper to copy string from guest
  2014-06-16 16:17 ` [RFC 02/19] xen: guestcopy: Provide an helper to copy string from guest Julien Grall
@ 2014-06-17  8:01   ` Jan Beulich
  2014-06-17  9:09     ` Julien Grall
  0 siblings, 1 reply; 122+ messages in thread
From: Jan Beulich @ 2014-06-17  8:01 UTC (permalink / raw)
  To: Julien Grall
  Cc: Keir Fraser, ian.campbell, tim, Ian Jackson, stefano.stabellini,
	xen-devel, Daniel De Graaf

>>> On 16.06.14 at 18:17, <julien.grall@linaro.org> wrote:

While generally I'm okay with adding such a helper, it should be done
a little more cleanly I think:

> --- /dev/null
> +++ b/xen/common/guestcopy.c
> @@ -0,0 +1,28 @@
> +#include <xen/config.h>
> +#include <xen/lib.h>
> +#include <xen/guest_access.h>
> +
> +int copy_string_from_guest(XEN_GUEST_HANDLE(char) u_buf, char **buf,
> +                           unsigned long size, unsigned long max_size)

Both of these ought to be size_t (as it was in the flask original).

> +{
> +    char *tmp;
> +
> +    if ( size > max_size )
> +        return -ENOENT;

ENOBUFS would seem the better error code here.

> +
> +    /* Add an extra +1 to append \0. We can't assume the guest will
> +     * provide a valid string */

Now this is the case for flask, but for a generic string copying
routine I don't think this is desirable. It seems especially wrong to
aid the guest with putting a NUL where none was. If you really
want this, I guess you would be better off adding two variants:
One which demands the string to be NUL-terminated (in which
case passing in a size is sort of bogus), and one which takes a
size and inserts a NUL.

And in the end with the above I guess you realize why flask
rolled its own special purpose function rather than adding a
generic helper.

Jan

* Re: [RFC 11/19] xen/passthrough: Call arch_iommu_domain_destroy before calling iommu_teardown
  2014-06-16 16:17 ` [RFC 11/19] xen/passthrough: Call arch_iommu_domain_destroy before calling iommu_teardown Julien Grall
@ 2014-06-17  8:07   ` Jan Beulich
  2014-06-17  9:18     ` Julien Grall
  0 siblings, 1 reply; 122+ messages in thread
From: Jan Beulich @ 2014-06-17  8:07 UTC (permalink / raw)
  To: Julien Grall; +Cc: xen-devel, stefano.stabellini, ian.campbell, tim

>>> On 16.06.14 at 18:17, <julien.grall@linaro.org> wrote:
> --- a/xen/drivers/passthrough/iommu.c
> +++ b/xen/drivers/passthrough/iommu.c
> @@ -219,10 +219,10 @@ void iommu_domain_destroy(struct domain *d)
>      if ( !iommu_enabled || !hd->platform_ops )
>          return;
>  
> +    arch_iommu_domain_destroy(d);
> +
>      if ( need_iommu(d) )
>          iommu_teardown(d);
> -
> -    arch_iommu_domain_destroy(d);

At the first glance this doesn't look right, including the explanation
you gave (why would devices still be assigned to a guest at this
point). And it's rather hard to properly decide with the series here
depending on two other series, i.e. there not being a
arch_iommu_domain_destroy() at all in current staging.

Jan

* Re: [RFC 13/19] xen/iommu: arm: Wire iommu DOMCTL for ARM
  2014-06-16 16:18 ` [RFC 13/19] xen/iommu: arm: Wire iommu DOMCTL for ARM Julien Grall
@ 2014-06-17  8:24   ` Jan Beulich
  2014-06-17 13:05     ` Julien Grall
  0 siblings, 1 reply; 122+ messages in thread
From: Jan Beulich @ 2014-06-17  8:24 UTC (permalink / raw)
  To: Julien Grall
  Cc: xen-devel, Keir Fraser, stefano.stabellini, ian.campbell, tim

>>> On 16.06.14 at 18:18, <julien.grall@linaro.org> wrote:
> --- a/xen/arch/x86/domctl.c
> +++ b/xen/arch/x86/domctl.c
> @@ -1320,7 +1320,7 @@ long arch_do_domctl(
>      break;
>  
>      default:
> -        ret = iommu_do_domctl(domctl, d, u_domctl);
> +        ret = -ENOSYS;
>          break;
>      }
>  
> diff --git a/xen/common/domctl.c b/xen/common/domctl.c
> index 5d3ac87..85866b7 100644
> --- a/xen/common/domctl.c
> +++ b/xen/common/domctl.c
> @@ -1028,6 +1028,10 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
>  
>      default:
>          ret = arch_do_domctl(op, d, u_domctl);
> +#ifdef HAS_PASSTHROUGH
> +        if ( ret == -ENOSYS )
> +            ret = iommu_do_domctl(op, d, u_domctl);
> +#endif
>          break;
>      }
>  

To be honest I'm not convinced of this approach. I'd prefer ARM's
arch_do_domctl() to invoke iommu_do_domctl() just like x86's does.
In particular I'm neither in favor of checking for specific error codes
before chaining, nor do I think that - despite there being a number
of such cases in the tree - ENOSYS is the right error value for not
implemented sub-hypercalls (to me only top level hypercalls may
produce this).

Jan
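
The alternative described above, where the arch handler chains to the
IOMMU handler itself and common code does not inspect error codes, might
look roughly like the following. All names and command numbers are
hypothetical stand-ins (the real functions take a domctl structure, a
domain and a guest handle), and -ENOSYS is kept for unknown sub-ops only
because the thread leaves open which value should replace it.

```c
#include <errno.h>

/* Hypothetical command numbers, for illustration only. */
enum {
    CMD_ARCH_SPECIFIC = 1,  /* handled by the arch code */
    CMD_ASSIGN_DEVICE = 2   /* handled by the IOMMU code */
};

/* Stand-in for iommu_do_domctl(). */
static long iommu_do_domctl_stub(int cmd)
{
    switch (cmd) {
    case CMD_ASSIGN_DEVICE:
        return 0;
    default:
        return -ENOSYS; /* placeholder value, see the note above */
    }
}

/*
 * Stand-in for an arch_do_domctl() that chains to the IOMMU handler
 * from its own default case, as x86 does before the quoted patch.
 */
static long arch_do_domctl_stub(int cmd)
{
    switch (cmd) {
    case CMD_ARCH_SPECIFIC:
        return 0;
    default:
        return iommu_do_domctl_stub(cmd);
    }
}

/*
 * Stand-in for the default case of the common do_domctl(): it only
 * forwards to the arch handler and never tests for a specific errno.
 */
static long do_domctl_stub(int cmd)
{
    return arch_do_domctl_stub(cmd);
}
```

The common code stays free of any "retry on a particular error" logic;
each architecture decides for itself whether it has an IOMMU handler to
fall back to.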

* Re: [RFC 14/19] xen/passthrough: dt: Add new domctl XEN_DOMCTL_assign_dt_device
  2014-06-16 16:18 ` [RFC 14/19] xen/passthrough: dt: Add new domctl XEN_DOMCTL_assign_dt_device Julien Grall
@ 2014-06-17  8:34   ` Jan Beulich
  2014-06-17 13:23     ` Julien Grall
  0 siblings, 1 reply; 122+ messages in thread
From: Jan Beulich @ 2014-06-17  8:34 UTC (permalink / raw)
  To: Julien Grall
  Cc: ian.campbell, Stefano Stabellini, tim, Ian Jackson,
	stefano.stabellini, xen-devel

>>> On 16.06.14 at 18:18, <julien.grall@linaro.org> wrote:
> --- a/xen/include/public/domctl.h
> +++ b/xen/include/public/domctl.h

I think you should be Cc-ing all relevant maintainers for common code
(here: interface) changes.

> @@ -936,6 +936,14 @@ typedef struct xen_domctl_vcpu_msrs xen_domctl_vcpu_msrs_t;
>  DEFINE_XEN_GUEST_HANDLE(xen_domctl_vcpu_msrs_t);
>  #endif
>  
> +/* Device Tree: Assign a non-PCI device to a guest */
> +struct xen_domctl_assign_dt_device {
> +    uint32_t size; /* IN: Length of the path */
> +    XEN_GUEST_HANDLE_64(char) path; /* IN: path to the device tree node */

Are paths (encoded as strings) indeed the canonical way of
representing devices? How does the tool stack know what is valid
to be passed in here?

> +};
> +typedef struct xen_domctl_assign_dt_device xen_domctl_assign_dt_device_t;
> +DEFINE_XEN_GUEST_HANDLE(xen_domctl_assign_dt_device_t);
> +
>  struct xen_domctl {
>      uint32_t cmd;
>  #define XEN_DOMCTL_createdomain                   1
> @@ -1008,6 +1016,7 @@ struct xen_domctl {
>  #define XEN_DOMCTL_cacheflush                    71
>  #define XEN_DOMCTL_get_vcpu_msrs                 72
>  #define XEN_DOMCTL_set_vcpu_msrs                 73
> +#define XEN_DOMCTL_assign_dt_device              74

How come you get away with just one operation here, when for PCI
pass-through we have three (assign, test-assign, and deassign)?

Jan

* Re: [RFC 02/19] xen: guestcopy: Provide an helper to copy string from guest
  2014-06-17  8:01   ` Jan Beulich
@ 2014-06-17  9:09     ` Julien Grall
  2014-06-17  9:17       ` Jan Beulich
  0 siblings, 1 reply; 122+ messages in thread
From: Julien Grall @ 2014-06-17  9:09 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Keir Fraser, ian.campbell, tim, Ian Jackson, stefano.stabellini,
	xen-devel, Daniel De Graaf

Hi Jan,

On 17/06/14 09:01, Jan Beulich wrote:
>>>> On 16.06.14 at 18:17, <julien.grall@linaro.org> wrote:
>
> While generally I'm okay with adding such a helper, it should be done
> a little more cleanly I think:
>
>> --- /dev/null
>> +++ b/xen/common/guestcopy.c
>> @@ -0,0 +1,28 @@
>> +#include <xen/config.h>
>> +#include <xen/lib.h>
>> +#include <xen/guest_access.h>
>> +
>> +int copy_string_from_guest(XEN_GUEST_HANDLE(char) u_buf, char **buf,
>> +                           unsigned long size, unsigned long max_size)
>
> Both of these ought to be size_t (as it was in the flask original).

Hrrmmm... I'm not sure why I made this change. I thought the hypercall
used unsigned long but it uses uint32_t.

I will use size_t in the next version.



>> +{
>> +    char *tmp;
>> +
>> +    if ( size > max_size )
>> +        return -ENOENT;
>
> ENOBUFS would seem the better error code here.

Ok.

>> +
>> +    /* Add an extra +1 to append \0. We can't assume the guest will
>> +     * provide a valid string */
>
> Now this is the case for flask, but for a generic string copying
> routine I don't think this is desirable. It seems especially wrong to
> aid the guest with putting a NUL where none was. If you really
> want this, I guess you would be better off adding two variants:
> One which demands the string to be NUL-terminated (in which
> case passing in a size is sort of bogus), and one which takes a
> size and inserts a NUL.

A malicious guest could pass a big buffer without a NUL terminator. If
we don't limit the size and check for the NUL terminator, the guest
could respectively exhaust Xen memory and exploit it.

Therefore we can't rely on the guest to provide a valid string. This
solution avoids having to check in every caller that the string is
correctly terminated.

Regards,

-- 
Julien Grall

* Re: [RFC 02/19] xen: guestcopy: Provide an helper to copy string from guest
  2014-06-17  9:09     ` Julien Grall
@ 2014-06-17  9:17       ` Jan Beulich
  2014-06-17  9:23         ` Julien Grall
  0 siblings, 1 reply; 122+ messages in thread
From: Jan Beulich @ 2014-06-17  9:17 UTC (permalink / raw)
  To: Julien Grall
  Cc: Keir Fraser, ian.campbell, tim, Ian Jackson, stefano.stabellini,
	xen-devel, Daniel De Graaf

>>> On 17.06.14 at 11:09, <julien.grall@linaro.org> wrote:
> On 17/06/14 09:01, Jan Beulich wrote:
>>>>> On 16.06.14 at 18:17, <julien.grall@linaro.org> wrote:
>>> +
>>> +    /* Add an extra +1 to append \0. We can't assume the guest will
>>> +     * provide a valid string */
>>
>> Now this is the case for flask, but for a generic string copying
>> routine I don't think this is desirable. It seems especially wrong to
>> aid the guest with putting a NUL where none was. If you really
>> want this, I guess you would be better off adding two variants:
>> One which demands the string to be NUL-terminated (in which
>> case passing in a size is sort of bogus), and one which takes a
>> size and inserts a NUL.
> 
> A malicious guest could pass a big buffer without a NUL terminator. If
> we don't limit the size and check for the NUL terminator, the guest
> could respectively exhaust Xen memory and exploit it.
> 
> Therefore we can't rely on the guest to provide a valid string. This
> solution avoids having to check in every caller that the string is
> correctly terminated.

You seem to imply that by not passing in a size I also meant not
passing in a maximum size - I didn't say that, though. You absolutely
have to limit the string length for security reasons, but it's clearly a
difference whether you silently NUL-terminate the value after the
maximum number of characters, or return with an error.

Jan
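
The two behaviours being contrasted here can be sketched in plain
userspace C. This is only an illustration under assumptions: memcpy()
stands in for the copy from guest memory, both function names are
hypothetical, ENOBUFS follows the suggestion made earlier in the review,
and ENOMEM is an assumption of this sketch.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/*
 * Variant A: copy exactly `size` bytes and append a NUL, whether or
 * not the source was terminated. The limit is still enforced.
 */
static int copy_string_append_nul(const char *src, size_t size,
                                  size_t max_size, char **buf)
{
    char *tmp;

    if (size > max_size)
        return -ENOBUFS;

    tmp = malloc(size + 1);
    if (tmp == NULL)
        return -ENOMEM;

    memcpy(tmp, src, size);
    tmp[size] = '\0';
    *buf = tmp;
    return 0;
}

/*
 * Variant B: no size argument. The source must contain a NUL within
 * the first `max_size` bytes, otherwise the call fails instead of
 * silently terminating the string.
 */
static int copy_string_strict(const char *src, size_t max_size,
                              char **buf)
{
    size_t len;
    char *tmp;

    for (len = 0; len < max_size; len++)
        if (src[len] == '\0')
            break;

    if (len == max_size)
        return -ENOBUFS; /* no terminator within the limit */

    tmp = malloc(len + 1);
    if (tmp == NULL)
        return -ENOMEM;

    memcpy(tmp, src, len + 1); /* includes the terminator */
    *buf = tmp;
    return 0;
}
```

Both variants bound the amount of memory the caller can make the callee
allocate; they differ only in whether an unterminated input is repaired
or rejected.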

* Re: [RFC 11/19] xen/passthrough: Call arch_iommu_domain_destroy before calling iommu_teardown
  2014-06-17  8:07   ` Jan Beulich
@ 2014-06-17  9:18     ` Julien Grall
  2014-06-17  9:29       ` Jan Beulich
  0 siblings, 1 reply; 122+ messages in thread
From: Julien Grall @ 2014-06-17  9:18 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, stefano.stabellini, ian.campbell, tim

Hi Ian,

On 17/06/14 09:07, Jan Beulich wrote:
>>>> On 16.06.14 at 18:17, <julien.grall@linaro.org> wrote:
>> --- a/xen/drivers/passthrough/iommu.c
>> +++ b/xen/drivers/passthrough/iommu.c
>> @@ -219,10 +219,10 @@ void iommu_domain_destroy(struct domain *d)
>>       if ( !iommu_enabled || !hd->platform_ops )
>>           return;
>>
>> +    arch_iommu_domain_destroy(d);
>> +
>>       if ( need_iommu(d) )
>>           iommu_teardown(d);
>> -
>> -    arch_iommu_domain_destroy(d);
>
> At the first glance this doesn't look right, including the explanation
> you gave (why would devices still be assigned to a guest at this
> point).

Because the toolstack may forget to deassign a device. How do you handle 
this case in x86? In the SMMU case, this will mean a memory leak and 
misconfiguration of the registers.

I think it's safer to let Xen deassign the remaining devices.

> And it's rather hard to properly decide with the series here
> depending on two other series, i.e. there not being a
> arch_iommu_domain_destroy() at all in current staging.

Are you sure? The other series doesn't deal with the IOMMU stuff. This 
change has been pushed upstream a month ago (see commit 4905b35c " 
iommu: introduce arch specific code").

Regards,

-- 
Julien Grall

* Re: [RFC 02/19] xen: guestcopy: Provide an helper to copy string from guest
  2014-06-17  9:17       ` Jan Beulich
@ 2014-06-17  9:23         ` Julien Grall
  2014-06-17 22:43           ` Daniel De Graaf
  0 siblings, 1 reply; 122+ messages in thread
From: Julien Grall @ 2014-06-17  9:23 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Keir Fraser, ian.campbell, tim, Ian Jackson, stefano.stabellini,
	xen-devel, Daniel De Graaf



On 17/06/14 10:17, Jan Beulich wrote:
>>>> On 17.06.14 at 11:09, <julien.grall@linaro.org> wrote:
>> On 17/06/14 09:01, Jan Beulich wrote:
>>>>>> On 16.06.14 at 18:17, <julien.grall@linaro.org> wrote:
>>>> +
>>>> +    /* Add an extra +1 to append \0. We can't assume the guest will
>>>> +     * provide a valid string */
>>>
>>> Now this is the case for flask, but for a generic string copying
>>> routine I don't think this is desirable. It seems especially wrong to
>>> aid the guest with putting a NUL where none was. If you really
>>> want this, I guess you would be better off adding two variants:
>>> One which demands the string to be NUL-terminated (in which
>>> case passing in a size is sort of bogus), and one which takes a
>>> size and inserts a NUL.
>>
>> A malicious guest could pass a big buffer without a NUL terminator. If
>> we don't limit the size and check for the NUL terminator, the guest
>> could respectively exhaust Xen memory and exploit it.
>>
>> Therefore we can't rely on the guest to provide a valid string. This
>> solution avoids having to check in every caller that the string is
>> correctly terminated.
>
> You seem to imply that by not passing in a size I also meant not
> passing in a maximum size - I didn't say that, though. You absolutely
> have to limit the string length for security reasons, but it's clearly a
> difference whether you silently NUL-terminate the value after the
> maximum number of characters, or return with an error.

I didn't understand your previous mail that way. Thank you for the
explanation.

It looks like for my use case it's better to return an error if we don't
have enough space. It would help us if one day the paths become very
long.

I'm wondering if we can also make this change for flask... Daniel, any
thoughts?

Regards,

-- 
Julien Grall

* Re: [RFC 11/19] xen/passthrough: Call arch_iommu_domain_destroy before calling iommu_teardown
  2014-06-17  9:18     ` Julien Grall
@ 2014-06-17  9:29       ` Jan Beulich
  2014-06-17 12:38         ` Julien Grall
  0 siblings, 1 reply; 122+ messages in thread
From: Jan Beulich @ 2014-06-17  9:29 UTC (permalink / raw)
  To: Julien Grall; +Cc: xen-devel, stefano.stabellini, ian.campbell, tim

>>> On 17.06.14 at 11:18, <julien.grall@linaro.org> wrote:
> Hi Ian,
> 
> On 17/06/14 09:07, Jan Beulich wrote:
>>>>> On 16.06.14 at 18:17, <julien.grall@linaro.org> wrote:
>>> --- a/xen/drivers/passthrough/iommu.c
>>> +++ b/xen/drivers/passthrough/iommu.c
>>> @@ -219,10 +219,10 @@ void iommu_domain_destroy(struct domain *d)
>>>       if ( !iommu_enabled || !hd->platform_ops )
>>>           return;
>>>
>>> +    arch_iommu_domain_destroy(d);
>>> +
>>>       if ( need_iommu(d) )
>>>           iommu_teardown(d);
>>> -
>>> -    arch_iommu_domain_destroy(d);
>>
>> At the first glance this doesn't look right, including the explanation
>> you gave (why would devices still be assigned to a guest at this
>> point).
> 
> Because the toolstack may forget to deassign a device. How do you handle 
> this case in x86? In the SMMU case, this will mean a memory leak and 
> misconfiguration of the registers.

Proper tool stack behavior is required (and not just here).

>> And it's rather hard to properly decide with the series here
>> depending on two other series, i.e. there not being a
>> arch_iommu_domain_destroy() at all in current staging.
> 
> Are you sure? The other series doesn't deal with the IOMMU stuff. This 
> change has been pushed upstream a month ago (see commit 4905b35c " 
> iommu: introduce arch specific code").

Oops, indeed - I'm sorry, I looked at a stale branch. Looking at the
correct code I still think the current order is the correct one, and if
you need to take extra steps you ought to do so from the .teardown
hook.

Jan

* Re: [RFC 11/19] xen/passthrough: Call arch_iommu_domain_destroy before calling iommu_teardown
  2014-06-17  9:29       ` Jan Beulich
@ 2014-06-17 12:38         ` Julien Grall
  2014-06-17 13:04           ` Jan Beulich
  0 siblings, 1 reply; 122+ messages in thread
From: Julien Grall @ 2014-06-17 12:38 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, stefano.stabellini, ian.campbell, tim

On 06/17/2014 10:29 AM, Jan Beulich wrote:
>>>> On 17.06.14 at 11:18, <julien.grall@linaro.org> wrote:
>> On 17/06/14 09:07, Jan Beulich wrote:
>>>>>> On 16.06.14 at 18:17, <julien.grall@linaro.org> wrote:
>>>> --- a/xen/drivers/passthrough/iommu.c
>>>> +++ b/xen/drivers/passthrough/iommu.c
>>>> @@ -219,10 +219,10 @@ void iommu_domain_destroy(struct domain *d)
>>>>       if ( !iommu_enabled || !hd->platform_ops )
>>>>           return;
>>>>
>>>> +    arch_iommu_domain_destroy(d);
>>>> +
>>>>       if ( need_iommu(d) )
>>>>           iommu_teardown(d);
>>>> -
>>>> -    arch_iommu_domain_destroy(d);
>>>
>>> At the first glance this doesn't look right, including the explanation
>>> you gave (why would devices still be assigned to a guest at this
>>> point).
>>
>> Because the toolstack may forget to deassign a device. How do you handle 
>> this case in x86? In the SMMU case, this will mean a memory leak and 
>> misconfiguration of the registers.
> 
> Proper tool stack behavior is required (and not just here).

I think it is important to handle toolstack failure (such as a crash),
just in case. Besides, it doesn't add much code.

>>> And it's rather hard to properly decide with the series here
>>> depending on two other series, i.e. there not being a
>>> arch_iommu_domain_destroy() at all in current staging.
>>
>> Are you sure? The other series doesn't deal with the IOMMU stuff. This 
>> change has been pushed upstream a month ago (see commit 4905b35c " 
>> iommu: introduce arch specific code").
> 
> Oops, indeed - I'm sorry, I looked at a stale branch. Looking at the
> correct code I still think the current order is the correct one, and if
> you need to take extra steps you ought to do so from the .teardown
> hook.

I thought about implementing it in .teardown, but that results in
non-obvious code.

I could call iommu_dt_domain_destroy in .teardown, but that would mean
calling "arch dt" code from the SMMU driver, which I think breaks the
design. I would prefer to call it from the arch-specific function. Do you
mind if I add a new function called arch_iommu_reassign_devices? This
function would reassign every device of a given domain to the hardware
domain.

iommu_domain_destroy would then look like:

void iommu_domain_destroy(struct domain *d)
{
    if ( !iommu_enabled )
        return;

    arch_iommu_reassign_devices(d);
    if ( need_iommu(d) )
        iommu_teardown(d);
    arch_iommu_domain_destroy(d);
}
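
For illustration only (this is not code from the series): a minimal,
self-contained sketch of what a helper like arch_iommu_reassign_devices
could do. The toy_device type, the TOY_HWDOM_ID constant and the flat
array are assumptions made up for the sketch; real code would walk
whatever per-domain device list the DT passthrough code maintains.

```c
#include <stddef.h>

#define TOY_HWDOM_ID 0   /* assumed id of the hardware domain in this toy */

/* Hypothetical stand-in for a passthrough device record. */
struct toy_device {
    const char *name;
    int owner;           /* id of the domain the device is assigned to */
};

/*
 * Hand every device still owned by domid back to the hardware domain.
 * Returns the number of devices that were moved.
 */
size_t toy_reassign_devices(struct toy_device *devs, size_t nr, int domid)
{
    size_t moved = 0;

    for ( size_t i = 0; i < nr; i++ )
    {
        if ( devs[i].owner != domid )
            continue;            /* not ours, leave it alone */
        devs[i].owner = TOY_HWDOM_ID;
        moved++;
    }

    return moved;
}
```

Running this before the teardown step means the IOMMU driver no longer
sees any device attached to the dying domain, even if the toolstack
crashed before deassigning them.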

Regards,

-- 
Julien Grall

* Re: [RFC 11/19] xen/passthrough: Call arch_iommu_domain_destroy before calling iommu_teardown
  2014-06-17 12:38         ` Julien Grall
@ 2014-06-17 13:04           ` Jan Beulich
  2014-06-18 12:24             ` Julien Grall
  0 siblings, 1 reply; 122+ messages in thread
From: Jan Beulich @ 2014-06-17 13:04 UTC (permalink / raw)
  To: Julien Grall; +Cc: xen-devel, stefano.stabellini, ian.campbell, tim

>>> On 17.06.14 at 14:38, <julien.grall@linaro.org> wrote:
> On 06/17/2014 10:29 AM, Jan Beulich wrote:
>>>>> On 17.06.14 at 11:18, <julien.grall@linaro.org> wrote:
>>> On 17/06/14 09:07, Jan Beulich wrote:
>>>>>>> On 16.06.14 at 18:17, <julien.grall@linaro.org> wrote:
>>>>> --- a/xen/drivers/passthrough/iommu.c
>>>>> +++ b/xen/drivers/passthrough/iommu.c
>>>>> @@ -219,10 +219,10 @@ void iommu_domain_destroy(struct domain *d)
>>>>>       if ( !iommu_enabled || !hd->platform_ops )
>>>>>           return;
>>>>>
>>>>> +    arch_iommu_domain_destroy(d);
>>>>> +
>>>>>       if ( need_iommu(d) )
>>>>>           iommu_teardown(d);
>>>>> -
>>>>> -    arch_iommu_domain_destroy(d);
>>>>
>>>> At the first glance this doesn't look right, including the explanation
>>>> you gave (why would devices still be assigned to a guest at this
>>>> point).
>>>
>>> Because the toolstack may forget to deassign a device. How do you handle 
>>> this case in x86? In the SMMU case, this will mean a memory leak and 
>>> misconfiguration of the registers.
>> 
>> Proper tool stack behavior is required (and not just here).
> 
> I think this is important to handle toolstack failure (such as crash)
> just in case. Hence it doesn't add much code for this purpose.

If you think this is necessary, then there's no reason to make this
ARM-specific (which in turn would eliminate the need for this to sit
in an arch hook).

Jan

* Re: [RFC 13/19] xen/iommu: arm: Wire iommu DOMCTL for ARM
  2014-06-17  8:24   ` Jan Beulich
@ 2014-06-17 13:05     ` Julien Grall
  0 siblings, 0 replies; 122+ messages in thread
From: Julien Grall @ 2014-06-17 13:05 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Keir Fraser, stefano.stabellini, ian.campbell, tim

Hi Jan,

On 06/17/2014 09:24 AM, Jan Beulich wrote:
>>>> On 16.06.14 at 18:18, <julien.grall@linaro.org> wrote:
>> --- a/xen/arch/x86/domctl.c
>> +++ b/xen/arch/x86/domctl.c
>> @@ -1320,7 +1320,7 @@ long arch_do_domctl(
>>      break;
>>  
>>      default:
>> -        ret = iommu_do_domctl(domctl, d, u_domctl);
>> +        ret = -ENOSYS;
>>          break;
>>      }
>>  
>> diff --git a/xen/common/domctl.c b/xen/common/domctl.c
>> index 5d3ac87..85866b7 100644
>> --- a/xen/common/domctl.c
>> +++ b/xen/common/domctl.c
>> @@ -1028,6 +1028,10 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
>>  
>>      default:
>>          ret = arch_do_domctl(op, d, u_domctl);
>> +#ifdef HAS_PASSTHROUGH
>> +        if ( ret == -ENOSYS )
>> +            ret = iommu_do_domctl(op, d, u_domctl);
>> +#endif
>>          break;
>>      }
>>  
> 
> To be honest I'm not convinced of this approach. I'd prefer ARM's
> arch_do_domctl() to invoke iommu_do_domctl() just like x86's does.
> In particular I'm neither in favor of checking for specific error codes
> before chaining, nor do I think that - despite there being a number
> of such cases in the tree - ENOSYS is the right error value for not
> implemented sub-hypercalls (to me only top level hypercalls may
> produce this).

Ok. I will add the iommu_do_domctl call directly in arch_do_domctl.
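
As an illustration of that direction (a sketch under assumptions, not the
actual patch): the arch handler falls through to iommu_do_domctl in its
default case, so common code never has to test for a particular error
value. The types and sub-op numbers below are stand-ins, the guest-handle
parameter is left out, and -EOPNOTSUPP is an arbitrary choice for unknown
sub-ops.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-ins for Xen's types; only what the sketch needs. */
struct domain { int domain_id; };
struct xen_domctl { uint32_t cmd; };

#define TOY_DOMCTL_ARCH_OP   1u  /* made-up arch-specific sub-op */
#define TOY_DOMCTL_IOMMU_OP  2u  /* made-up passthrough sub-op */

/* Stub for the common IOMMU handler: it owns the passthrough sub-ops and
 * decides on the error for anything it does not recognise. */
long iommu_do_domctl(struct xen_domctl *domctl, struct domain *d)
{
    (void)d;

    switch ( domctl->cmd )
    {
    case TOY_DOMCTL_IOMMU_OP:
        return 0;
    default:
        return -EOPNOTSUPP;
    }
}

/* Arch handler: deal with arch sub-ops, chain everything else. */
long arch_do_domctl(struct xen_domctl *domctl, struct domain *d)
{
    switch ( domctl->cmd )
    {
    case TOY_DOMCTL_ARCH_OP:
        return 0;
    default:
        /* Chain unconditionally; no check for a specific error code. */
        return iommu_do_domctl(domctl, d);
    }
}
```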

Regards,


-- 
Julien Grall

* Re: [RFC 14/19] xen/passthrough: dt: Add new domctl XEN_DOMCTL_assign_dt_device
  2014-06-17  8:34   ` Jan Beulich
@ 2014-06-17 13:23     ` Julien Grall
  2014-06-17 13:30       ` Jan Beulich
  0 siblings, 1 reply; 122+ messages in thread
From: Julien Grall @ 2014-06-17 13:23 UTC (permalink / raw)
  To: Jan Beulich
  Cc: ian.campbell, Stefano Stabellini, tim, Ian Jackson,
	stefano.stabellini, xen-devel

Hi Jan,

On 06/17/2014 09:34 AM, Jan Beulich wrote:
>>>> On 16.06.14 at 18:18, <julien.grall@linaro.org> wrote:
>> --- a/xen/include/public/domctl.h
>> +++ b/xen/include/public/domctl.h
> 
> I think you should be Cc-ing all relevant maintainers for common code
> (here: interface) changes.

Sorry, get_maintainers.pl misled me about the relevant maintainers (see
output below).

42sh> scripts/get_maintainers.pl < this.patch
Ian Jackson <ian.jackson@eu.citrix.com>
Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Ian Campbell <ian.campbell@citrix.com>
xen-devel@lists.xen.org

For some reason it lists neither Keir nor you.

> 
>> @@ -936,6 +936,14 @@ typedef struct xen_domctl_vcpu_msrs xen_domctl_vcpu_msrs_t;
>>  DEFINE_XEN_GUEST_HANDLE(xen_domctl_vcpu_msrs_t);
>>  #endif
>>  
>> +/* Device Tree: Assign a non-PCI device to a guest */
>> +struct xen_domctl_assign_dt_device {
>> +    uint32_t size; /* IN: Length of the path */
>> +    XEN_GUEST_HANDLE_64(char) path; /* IN: path to the device tree node */
> 
> Are paths (encoded as strings) indeed the canonical way of
> representing devices?

Yes, a device node is uniquely identified by the full path from the root
node.

This path will look like:

/soc/ethernet@fff51000

> How does the tool stack know what is valid
> to be passed in here?

The path is provided directly by the user. The sanity check is only done
on the hypervisor side.

>> +};
>> +typedef struct xen_domctl_assign_dt_device xen_domctl_assign_dt_device_t;
>> +DEFINE_XEN_GUEST_HANDLE(xen_domctl_assign_dt_device_t);
>> +
>>  struct xen_domctl {
>>      uint32_t cmd;
>>  #define XEN_DOMCTL_createdomain                   1
>> @@ -1008,6 +1016,7 @@ struct xen_domctl {
>>  #define XEN_DOMCTL_cacheflush                    71
>>  #define XEN_DOMCTL_get_vcpu_msrs                 72
>>  #define XEN_DOMCTL_set_vcpu_msrs                 73
>> +#define XEN_DOMCTL_assign_dt_device              74
> 
> How come you get away with just one operation here, when for PCI
> pass-through we have three (assign, test-assign, and deassign)?

As said in my cover letter, this is a very early RFC. I sent it to get
some comments on the design.

For now, devices are reassigned directly in Xen when the domain is
destroyed. I plan to implement deassign in the next version.

I don't think test-assign is useful for non-PCI passthrough. The assign
hypercall will correctly check if we can passthrough this device to the
guest.
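
To make the argument concrete, here is a toy sketch (not the code from
this series) of an assign operation that does its own checking: it looks
the node up by its full path and refuses devices that are not behind an
IOMMU. The toy_dt_node type, the second node path and the error codes are
all assumptions made up for the example.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for a device tree node; not Xen's struct dt_device_node. */
struct toy_dt_node {
    const char *full_path;   /* full path from the root node */
    bool iommu_protected;    /* is the device behind an IOMMU? */
    int owner;               /* id of the domain owning the device */
};

struct toy_dt_node toy_tree[] = {
    { "/soc/ethernet@fff51000", true,  0 },
    { "/soc/uart@1000",         false, 0 },  /* made-up node */
};

/* Look a node up by (path, size), as passed in the domctl structure. */
struct toy_dt_node *toy_find_node(const char *path, size_t size)
{
    for ( size_t i = 0; i < sizeof(toy_tree) / sizeof(toy_tree[0]); i++ )
        if ( strlen(toy_tree[i].full_path) == size &&
             memcmp(toy_tree[i].full_path, path, size) == 0 )
            return &toy_tree[i];

    return NULL;
}

/* Assign and validate in one step, so no separate test-assign is needed. */
int toy_assign_dt_device(int domid, const char *path, size_t size)
{
    struct toy_dt_node *node = toy_find_node(path, size);

    if ( node == NULL )
        return -ENODEV;      /* no node with this path */
    if ( !node->iommu_protected )
        return -EPERM;       /* DMA would use the wrong address space */

    node->owner = domid;
    return 0;
}
```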

Regards,

-- 
Julien Grall

* Re: [RFC 14/19] xen/passthrough: dt: Add new domctl XEN_DOMCTL_assign_dt_device
  2014-06-17 13:23     ` Julien Grall
@ 2014-06-17 13:30       ` Jan Beulich
  2014-06-17 13:48         ` Julien Grall
  0 siblings, 1 reply; 122+ messages in thread
From: Jan Beulich @ 2014-06-17 13:30 UTC (permalink / raw)
  To: Julien Grall
  Cc: ian.campbell, Stefano Stabellini, tim, Ian Jackson,
	stefano.stabellini, xen-devel

>>> On 17.06.14 at 15:23, <julien.grall@linaro.org> wrote:
> On 06/17/2014 09:34 AM, Jan Beulich wrote:
>>>>> On 16.06.14 at 18:18, <julien.grall@linaro.org> wrote:
>>> @@ -1008,6 +1016,7 @@ struct xen_domctl {
>>>  #define XEN_DOMCTL_cacheflush                    71
>>>  #define XEN_DOMCTL_get_vcpu_msrs                 72
>>>  #define XEN_DOMCTL_set_vcpu_msrs                 73
>>> +#define XEN_DOMCTL_assign_dt_device              74
>> 
>> How come you get away with just one operation here, when for PCI
>> pass-through we have three (assign, test-assign, and deassign)?
> 
> As said on my cover letter this a very very RFC. I send it to have some
> comment about the design.
> 
> For now device are reassign directly in Xen when the domain is
> destroyed. I plan to implement deassign in the next version.
> 
> I don't think test-assign is useful for non-PCI passthrough. The assign
> hypercall will correctly check if we can passthrough this device to the
> guest.

But that may mean more diverging code on the tools side. Unless the
current PCI model is wrong or unusable for DT pass-through, I think
best would be to retain the abstract model as is.

Jan

* Re: [RFC 14/19] xen/passthrough: dt: Add new domctl XEN_DOMCTL_assign_dt_device
  2014-06-17 13:30       ` Jan Beulich
@ 2014-06-17 13:48         ` Julien Grall
  2014-06-17 13:55           ` Jan Beulich
  0 siblings, 1 reply; 122+ messages in thread
From: Julien Grall @ 2014-06-17 13:48 UTC (permalink / raw)
  To: Jan Beulich
  Cc: ian.campbell, Stefano Stabellini, tim, Ian Jackson,
	stefano.stabellini, xen-devel

On 06/17/2014 02:30 PM, Jan Beulich wrote:
>>>> On 17.06.14 at 15:23, <julien.grall@linaro.org> wrote:
>> On 06/17/2014 09:34 AM, Jan Beulich wrote:
>>>>>> On 16.06.14 at 18:18, <julien.grall@linaro.org> wrote:
>>>> @@ -1008,6 +1016,7 @@ struct xen_domctl {
>>>>  #define XEN_DOMCTL_cacheflush                    71
>>>>  #define XEN_DOMCTL_get_vcpu_msrs                 72
>>>>  #define XEN_DOMCTL_set_vcpu_msrs                 73
>>>> +#define XEN_DOMCTL_assign_dt_device              74
>>>
>>> How come you get away with just one operation here, when for PCI
>>> pass-through we have three (assign, test-assign, and deassign)?
>>
>> As said on my cover letter this a very very RFC. I send it to have some
>> comment about the design.
>>
>> For now device are reassign directly in Xen when the domain is
>> destroyed. I plan to implement deassign in the next version.
>>
>> I don't think test-assign is useful for non-PCI passthrough. The assign
>> hypercall will correctly check if we can passthrough this device to the
>> guest.
> 
> But that may mean more diverging code on the tools side. Unless the
> current PCI model is wrong or unusable for DT pass-through, I think
> best would be to retain the abstract model as is.

I looked at the libxl code for PCI, and test-assign is only used in
libxl__device_pci_add, just before calling do_pci_add, in the HVM case.

I'm not sure why this function is used, but from the DT pass-through POV
we only need to ask the hypervisor to assign this specific device to the
guest. The hypercall will return "ok" if it has succeeded or an error if
the device is not protected (i.e. not behind an IOMMU). I don't see why
we should have a test-assign hypercall in this case...

Anyway, I'm fine with introducing the hypercall for consistency, but I
don't plan to use it in the toolstack because it's pointless.

Regards,

-- 
Julien Grall

* Re: [RFC 14/19] xen/passthrough: dt: Add new domctl XEN_DOMCTL_assign_dt_device
  2014-06-17 13:48         ` Julien Grall
@ 2014-06-17 13:55           ` Jan Beulich
  2014-07-03 11:54             ` Ian Campbell
  0 siblings, 1 reply; 122+ messages in thread
From: Jan Beulich @ 2014-06-17 13:55 UTC (permalink / raw)
  To: Julien Grall
  Cc: ian.campbell, Stefano Stabellini, tim, Ian Jackson,
	stefano.stabellini, xen-devel

>>> On 17.06.14 at 15:48, <julien.grall@linaro.org> wrote:
> I'm not sure why we function is used but from the DT pass-through POV,
> we only need to ask the hypervisor to assign this specific device to the
> guest. The hypercall will return "ok" if it has succeeded or an error if
> the device is not protected (i.e not behind an IOMMU). I don't see why
> we should have a test-assign hypercall in this case...

The reason for its existence isn't entirely clear to me either - you may
want to ping whoever added this.

> Anyway, I'm fine to introduce the hypercall for consistency but I don't
> plan to use it in the toolstack because it's pointless.

It's the tools maintainers' call in the end, but my advice would be to not
have it used in the PCI case, but not in the DT one. Either rip it out on
the PCI side too, or have it be called even if not strictly needed.

Jan

* Re: [RFC 02/19] xen: guestcopy: Provide an helper to copy string from guest
  2014-06-17  9:23         ` Julien Grall
@ 2014-06-17 22:43           ` Daniel De Graaf
  2014-06-18 11:59             ` Jan Beulich
  0 siblings, 1 reply; 122+ messages in thread
From: Daniel De Graaf @ 2014-06-17 22:43 UTC (permalink / raw)
  To: Julien Grall, Jan Beulich
  Cc: Keir Fraser, ian.campbell, tim, Ian Jackson, stefano.stabellini,
	xen-devel

On 06/17/2014 05:23 AM, Julien Grall wrote:
>
>
> On 17/06/14 10:17, Jan Beulich wrote:
>>>>> On 17.06.14 at 11:09, <julien.grall@linaro.org> wrote:
>>> On 17/06/14 09:01, Jan Beulich wrote:
>>>>>>> On 16.06.14 at 18:17, <julien.grall@linaro.org> wrote:
>>>>> +
>>>>> +    /* Add an extra +1 to append \0. We can't assume the guest will
>>>>> +     * provide a valid string */
>>>>
>>>> Now this is the case for flask, but for a generic string copying
>>>> routine I don't think this is desirable. It seems especially wrong to
>>>> aid the guest with putting a NUL where none was. If you really
>>>> want this, I guess you would be better off adding two variants:
>>>> One which demands the string to be NUL-terminated (in which
>>>> case passing in a size is sort of bogus), and one which takes a
>>>> size and inserts a NUL.

I'm not sure why you would want a string copy-in function to not
NUL-terminate the strings it copies in.  If you don't want the strings
to be NUL-terminated at all, I would call it a buffer copy-in function
(and copy_from_guest seems to cover buffer copy-in better).  If you want
the strings to be NUL-terminated and the guest has passed you a length,
it's simpler to have the hypervisor add the NUL instead of copying it
and then checking that it is there.  The current toolstack code for
XSM/FLASK relies on the hypervisor to add the NUL terminator, since it
often passes in (s, strlen(s)).

>>> A malicious guest could pass a big buffer without a NUL-terminated. If
>>> we don't limit the size and check the NUL-terminated character the guest
>>> could respectively exhaust Xen memory and exploit it.
>>>
>>> Therefore we can't rely on the guest to provide a valid string. This
>>> solution will avoid to check in every caller that the string is
>>> correctly terminated.
>>
>> You seem to imply that by not passing in a size I also meant not
>> passing in a maximum size - I didn't say that, though. You absolutely
>> have to limit the string length for security reasons, but it's clearly a
>> difference whether you silently NUL-terminate the value after the
>> maximum number of characters, or return with an error.
>
> I didn't understand in this way your previous mail. Thank you for the explanation.
>
> It looks like for my use case it's better to throw an error if we don't have enough place. It would help us if one day the path start to be very long.
>
> I'm wondering if we can also make this change for flask... Daniel, any though?

Silently cropping the string at some maximum would be a problem for
FLASK (and probably other places), as it could result in a valid label
that has a different meaning than intended - better to return an error
and force the caller to deal with it.  Otherwise, I don't think there
is any change to be made here, unless I missed something?

-- 
Daniel De Graaf
National Security Agency

* Re: [RFC 02/19] xen: guestcopy: Provide an helper to copy string from guest
  2014-06-17 22:43           ` Daniel De Graaf
@ 2014-06-18 11:59             ` Jan Beulich
  2014-06-18 12:22               ` Julien Grall
  0 siblings, 1 reply; 122+ messages in thread
From: Jan Beulich @ 2014-06-18 11:59 UTC (permalink / raw)
  To: Julien Grall, Daniel De Graaf
  Cc: Keir Fraser, ian.campbell, tim, Ian Jackson, stefano.stabellini,
	xen-devel

>>> On 18.06.14 at 00:43, <dgdegra@tycho.nsa.gov> wrote:
> On 06/17/2014 05:23 AM, Julien Grall wrote:
>>
>>
>> On 17/06/14 10:17, Jan Beulich wrote:
>>>>>> On 17.06.14 at 11:09, <julien.grall@linaro.org> wrote:
>>>> On 17/06/14 09:01, Jan Beulich wrote:
>>>>>>>> On 16.06.14 at 18:17, <julien.grall@linaro.org> wrote:
>>>>>> +
>>>>>> +    /* Add an extra +1 to append \0. We can't assume the guest will
>>>>>> +     * provide a valid string */
>>>>>
>>>>> Now this is the case for flask, but for a generic string copying
>>>>> routine I don't think this is desirable. It seems especially wrong to
>>>>> aid the guest with putting a NUL where none was. If you really
>>>>> want this, I guess you would be better off adding two variants:
>>>>> One which demands the string to be NUL-terminated (in which
>>>>> case passing in a size is sort of bogus), and one which takes a
>>>>> size and inserts a NUL.
> 
> I'm not sure why you would want a string copy-in function to not
> NUL-terminate the strings it copies in.  If you don't want the strings
> to be NUL-terminated at all, I would call it buffer copy-in function
> (and copy_from_guest seems to cover buffer copy-in better).  If you want
> the strings to be NUL-terminated and the guest has passed you a length,
> it's simpler to have the hypervisor add the NUL instead of copying it
> and then checking that it is there.  The current toolstack code for
> XSM/FLASK relies on the hypervisor to add the NUL terminator, since it
> often passes in (s, strlen(s)).

I didn't say to just leave such strings unterminated. Instead I said
that if there is no zero terminator, rather than putting one there we
should just fail the operation if the buffer size limit was exceeded.
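
For illustration, a self-contained sketch of the two behaviours under
discussion (an assumption-laden toy, not the helper from patch 02): one
variant takes a length and appends the NUL itself, the other requires a
NUL within the size limit and fails otherwise. toy_copy_from_guest is a
plain memcpy standing in for copy_from_guest, and the function names and
the -E2BIG error code are made up for the example.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for copy_from_guest(); a plain memcpy that cannot fail. */
static int toy_copy_from_guest(char *dst, const char *guest, size_t len)
{
    memcpy(dst, guest, len);
    return 0;
}

/* Variant 1: the caller supplies a length, the helper appends the NUL. */
int toy_copy_string_sized(char **out, const char *guest, size_t size,
                          size_t max_size)
{
    char *tmp;

    if ( size > max_size )
        return -E2BIG;           /* reject rather than silently crop */

    tmp = malloc(size + 1);      /* +1 for the terminator added below */
    if ( tmp == NULL )
        return -ENOMEM;

    toy_copy_from_guest(tmp, guest, size);
    tmp[size] = '\0';
    *out = tmp;
    return 0;
}

/* Variant 2: the guest data must contain a NUL within max_size bytes. */
int toy_copy_string_terminated(char **out, const char *guest,
                               size_t max_size)
{
    size_t len = 0;
    char *tmp;

    /* Scan for the terminator, never looking past the limit. */
    while ( len < max_size && guest[len] != '\0' )
        len++;

    if ( len == max_size )
        return -E2BIG;           /* no NUL within the limit: fail */

    tmp = malloc(len + 1);
    if ( tmp == NULL )
        return -ENOMEM;

    toy_copy_from_guest(tmp, guest, len + 1);  /* includes the guest's NUL */
    *out = tmp;
    return 0;
}
```

In both variants an over-long input is an error, never a truncated
string; they differ only in who is responsible for the terminator.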

Jan

* Re: [RFC 02/19] xen: guestcopy: Provide an helper to copy string from guest
  2014-06-18 11:59             ` Jan Beulich
@ 2014-06-18 12:22               ` Julien Grall
  2014-06-18 12:49                 ` Jan Beulich
  0 siblings, 1 reply; 122+ messages in thread
From: Julien Grall @ 2014-06-18 12:22 UTC (permalink / raw)
  To: Jan Beulich, Daniel De Graaf
  Cc: Keir Fraser, ian.campbell, tim, Ian Jackson, stefano.stabellini,
	xen-devel

On 06/18/2014 12:59 PM, Jan Beulich wrote:
>>>> On 18.06.14 at 00:43, <dgdegra@tycho.nsa.gov> wrote:
>> On 06/17/2014 05:23 AM, Julien Grall wrote:
>>>
>>>
>>> On 17/06/14 10:17, Jan Beulich wrote:
>>>>>>> On 17.06.14 at 11:09, <julien.grall@linaro.org> wrote:
>>>>> On 17/06/14 09:01, Jan Beulich wrote:
>>>>>>>>> On 16.06.14 at 18:17, <julien.grall@linaro.org> wrote:
>>>>>>> +
>>>>>>> +    /* Add an extra +1 to append \0. We can't assume the guest will
>>>>>>> +     * provide a valid string */
>>>>>>
>>>>>> Now this is the case for flask, but for a generic string copying
>>>>>> routine I don't think this is desirable. It seems especially wrong to
>>>>>> aid the guest with putting a NUL where none was. If you really
>>>>>> want this, I guess you would be better off adding two variants:
>>>>>> One which demands the string to be NUL-terminated (in which
>>>>>> case passing in a size is sort of bogus), and one which takes a
>>>>>> size and inserts a NUL.
>>
>> I'm not sure why you would want a string copy-in function to not
>> NUL-terminate the strings it copies in.  If you don't want the strings
>> to be NUL-terminated at all, I would call it buffer copy-in function
>> (and copy_from_guest seems to cover buffer copy-in better).  If you want
>> the strings to be NUL-terminated and the guest has passed you a length,
>> it's simpler to have the hypervisor add the NUL instead of copying it
>> and then checking that it is there.  The current toolstack code for
>> XSM/FLASK relies on the hypervisor to add the NUL terminator, since it
>> often passes in (s, strlen(s)).
> 
> I didn't say to just leave such strings unterminated. Instead I said
> that if there is no zero terminator, rather than putting one there we
> should just fail the operation if the buffer size limit was exceeded.

It looks like I use the same trick as flask, i.e. passing strlen(s) and
therefore letting the hypervisor set the NUL terminator.

I will add a comment on this function to say that we expect the
hypervisor to set the NUL terminator.

Regards,

-- 
Julien Grall

* Re: [RFC 11/19] xen/passthrough: Call arch_iommu_domain_destroy before calling iommu_teardown
  2014-06-17 13:04           ` Jan Beulich
@ 2014-06-18 12:24             ` Julien Grall
  2014-06-18 12:50               ` Jan Beulich
  0 siblings, 1 reply; 122+ messages in thread
From: Julien Grall @ 2014-06-18 12:24 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, stefano.stabellini, ian.campbell, tim

On 06/17/2014 02:04 PM, Jan Beulich wrote:
>>>> On 17.06.14 at 14:38, <julien.grall@linaro.org> wrote:
>> On 06/17/2014 10:29 AM, Jan Beulich wrote:
>>>>>> On 17.06.14 at 11:18, <julien.grall@linaro.org> wrote:
>>>> On 17/06/14 09:07, Jan Beulich wrote:
>>>>>>>> On 16.06.14 at 18:17, <julien.grall@linaro.org> wrote:
>>>>>> --- a/xen/drivers/passthrough/iommu.c
>>>>>> +++ b/xen/drivers/passthrough/iommu.c
>>>>>> @@ -219,10 +219,10 @@ void iommu_domain_destroy(struct domain *d)
>>>>>>       if ( !iommu_enabled || !hd->platform_ops )
>>>>>>           return;
>>>>>>
>>>>>> +    arch_iommu_domain_destroy(d);
>>>>>> +
>>>>>>       if ( need_iommu(d) )
>>>>>>           iommu_teardown(d);
>>>>>> -
>>>>>> -    arch_iommu_domain_destroy(d);
>>>>>
>>>>> At the first glance this doesn't look right, including the explanation
>>>>> you gave (why would devices still be assigned to a guest at this
>>>>> point).
>>>>
>>>> Because the toolstack may forget to deassign a device. How do you handle 
>>>> this case in x86? In the SMMU case, this will mean a memory leak and 
>>>> misconfiguration of the registers.
>>>
>>> Proper tool stack behavior is required (and not just here).
>>
>> I think this is important to handle toolstack failure (such as crash)
>> just in case. Hence it doesn't add much code for this purpose.
> 
> If you think this is necessary, then there's no reason to make this
> ARM-specific (which in turn would eliminate the need for this to sit
> in an arch hook).

We will have to make this DT/PCI-specific, as I don't use the same
mechanism to track which device is assigned to which guest.

I won't be able to write (or at least test) the code for the PCI part,
as my ARM board doesn't support PCI and I don't have a Xen setup for
x86.

Regards,

-- 
Julien Grall

* Re: [RFC 18/19] libxl: Add support for non-PCI passthrough
  2014-06-16 17:19   ` Wei Liu
@ 2014-06-18 12:26     ` Julien Grall
  0 siblings, 0 replies; 122+ messages in thread
From: Julien Grall @ 2014-06-18 12:26 UTC (permalink / raw)
  To: Wei Liu
  Cc: ian.campbell, Stefano Stabellini, tim, Ian Jackson,
	stefano.stabellini, xen-devel

Hi Wei,

On 06/16/2014 06:19 PM, Wei Liu wrote:
> On Mon, Jun 16, 2014 at 05:18:05PM +0100, Julien Grall wrote:
> [...]
>>  /*----- xswait: wait for a xenstore node to be suitable -----*/
>>  
>>  typedef struct libxl__xswait_state libxl__xswait_state;
>> diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
>> index 9201461..74ba20d 100644
>> --- a/tools/libxl/libxl_types.idl
>> +++ b/tools/libxl/libxl_types.idl
>> @@ -452,6 +452,10 @@ libxl_device_pci = Struct("device_pci", [
>>      ("seize", bool),
>>      ])
>>  
>> +libxl_device_dt = Struct("device_dt", [
>> +    ("path", string),
>> +    ])
>> +
>>  libxl_device_vtpm = Struct("device_vtpm", [
>>      ("backend_domid",    libxl_domid),
>>      ("backend_domname",  string),
>> @@ -466,6 +470,7 @@ libxl_domain_config = Struct("domain_config", [
>>      ("disks", Array(libxl_device_disk, "num_disks")),
>>      ("nics", Array(libxl_device_nic, "num_nics")),
>>      ("pcidevs", Array(libxl_device_pci, "num_pcidevs")),
>> +    ("dtdevs", Array(libxl_device_dt, "num_dtdevs")),
> 
> I would say let's go for "dts" instead of "dtdevs", just like "nics",
> "vtpms" etc. Or you can do it the other way around, make
> "libxl_device_dt" "libxl_device_dtdev". So that it follows the pattern

I find "dts" confusing because it is also the extension used for the
files describing the device tree.

I will rename the structure libxl_device_dt to libxl_device_dtdev.

Regards,

-- 
Julien Grall

* Re: [RFC 02/19] xen: guestcopy: Provide an helper to copy string from guest
  2014-06-18 12:22               ` Julien Grall
@ 2014-06-18 12:49                 ` Jan Beulich
  2014-06-18 12:53                   ` Julien Grall
  0 siblings, 1 reply; 122+ messages in thread
From: Jan Beulich @ 2014-06-18 12:49 UTC (permalink / raw)
  To: Julien Grall, Daniel De Graaf
  Cc: Keir Fraser, ian.campbell, tim, Ian Jackson, stefano.stabellini,
	xen-devel

>>> On 18.06.14 at 14:22, <julien.grall@linaro.org> wrote:
> On 06/18/2014 12:59 PM, Jan Beulich wrote:
>>>>> On 18.06.14 at 00:43, <dgdegra@tycho.nsa.gov> wrote:
>>> On 06/17/2014 05:23 AM, Julien Grall wrote:
>>>>
>>>>
>>>> On 17/06/14 10:17, Jan Beulich wrote:
>>>>>>>> On 17.06.14 at 11:09, <julien.grall@linaro.org> wrote:
>>>>>> On 17/06/14 09:01, Jan Beulich wrote:
>>>>>>>>>> On 16.06.14 at 18:17, <julien.grall@linaro.org> wrote:
>>>>>>>> +
>>>>>>>> +    /* Add an extra +1 to append \0. We can't assume the guest will
>>>>>>>> +     * provide a valid string */
>>>>>>>
>>>>>>> Now this is the case for flask, but for a generic string copying
>>>>>>> routine I don't think this is desirable. It seems especially wrong to
>>>>>>> aid the guest with putting a NUL where none was. If you really
>>>>>>> want this, I guess you would be better off adding two variants:
>>>>>>> One which demands the string to be NUL-terminated (in which
>>>>>>> case passing in a size is sort of bogus), and one which takes a
>>>>>>> size and inserts a NUL.
>>>
>>> I'm not sure why you would want a string copy-in function to not
>>> NUL-terminate the strings it copies in.  If you don't want the strings
>>> to be NUL-terminated at all, I would call it buffer copy-in function
>>> (and copy_from_guest seems to cover buffer copy-in better).  If you want
>>> the strings to be NUL-terminated and the guest has passed you a length,
>>> it's simpler to have the hypervisor add the NUL instead of copying it
>>> and then checking that it is there.  The current toolstack code for
>>> XSM/FLASK relies on the hypervisor to add the NUL terminator, since it
>>> often passes in (s, strlen(s)).
>> 
>> I didn't say to just leave such strings unterminated. Instead I said
>> that if there is no zero terminator, rather than putting one there we
>> should just fail the operation if the buffer size limit was exceeded.
> 
> It looks like I use the same trick as for flask, i.e using strlen(s) and
> therefore let the hypervisor set the NUL-terminator.
> 
> I will add a comment on this function to say that we expect the
> hypervisor to set the NUL-terminator.

But just to make sure - the generic helper introduced there shouldn't
behave that way if being given the proposed name.

Jan

* Re: [RFC 11/19] xen/passthrough: Call arch_iommu_domain_destroy before calling iommu_teardown
  2014-06-18 12:24             ` Julien Grall
@ 2014-06-18 12:50               ` Jan Beulich
  0 siblings, 0 replies; 122+ messages in thread
From: Jan Beulich @ 2014-06-18 12:50 UTC (permalink / raw)
  To: Julien Grall; +Cc: xen-devel, stefano.stabellini, ian.campbell, tim

>>> On 18.06.14 at 14:24, <julien.grall@linaro.org> wrote:
> On 06/17/2014 02:04 PM, Jan Beulich wrote:
>>>>> On 17.06.14 at 14:38, <julien.grall@linaro.org> wrote:
>>> On 06/17/2014 10:29 AM, Jan Beulich wrote:
>>>>>>> On 17.06.14 at 11:18, <julien.grall@linaro.org> wrote:
>>>>> On 17/06/14 09:07, Jan Beulich wrote:
>>>>>>>>> On 16.06.14 at 18:17, <julien.grall@linaro.org> wrote:
>>>>>>> --- a/xen/drivers/passthrough/iommu.c
>>>>>>> +++ b/xen/drivers/passthrough/iommu.c
>>>>>>> @@ -219,10 +219,10 @@ void iommu_domain_destroy(struct domain *d)
>>>>>>>       if ( !iommu_enabled || !hd->platform_ops )
>>>>>>>           return;
>>>>>>>
>>>>>>> +    arch_iommu_domain_destroy(d);
>>>>>>> +
>>>>>>>       if ( need_iommu(d) )
>>>>>>>           iommu_teardown(d);
>>>>>>> -
>>>>>>> -    arch_iommu_domain_destroy(d);
>>>>>>
>>>>>> At the first glance this doesn't look right, including the explanation
>>>>>> you gave (why would devices still be assigned to a guest at this
>>>>>> point).
>>>>>
>>>>> Because the toolstack may forget to deassign a device. How do you handle 
>>>>> this case in x86? In the SMMU case, this will mean a memory leak and 
>>>>> misconfiguration of the registers.
>>>>
>>>> Proper tool stack behavior is required (and not just here).
>>>
>>> I think this is important to handle toolstack failure (such as crash)
>>> just in case. Hence it doesn't add much code for this purpose.
>> 
>> If you think this is necessary, then there's no reason to make this
>> ARM-specific (which in turn would eliminate the need for this to sit
>> in an arch hook).
> 
> We will have to do DT/PCI specific as I don't use the same way to know
> which device is assigned to which guest.

Which sounds wrong from a design pov. But again, it's the tool stack
maintainers' call in the end.

Jan

* Re: [RFC 02/19] xen: guestcopy: Provide an helper to copy string from guest
  2014-06-18 12:49                 ` Jan Beulich
@ 2014-06-18 12:53                   ` Julien Grall
  2014-06-18 13:01                     ` Jan Beulich
  0 siblings, 1 reply; 122+ messages in thread
From: Julien Grall @ 2014-06-18 12:53 UTC (permalink / raw)
  To: Jan Beulich, Daniel De Graaf
  Cc: Keir Fraser, ian.campbell, tim, Ian Jackson, stefano.stabellini,
	xen-devel

On 06/18/2014 01:49 PM, Jan Beulich wrote:
>>>> On 18.06.14 at 14:22, <julien.grall@linaro.org> wrote:
>> On 06/18/2014 12:59 PM, Jan Beulich wrote:
>>>>>> On 18.06.14 at 00:43, <dgdegra@tycho.nsa.gov> wrote:
>>>> On 06/17/2014 05:23 AM, Julien Grall wrote:
>>>>>
>>>>>
>>>>> On 17/06/14 10:17, Jan Beulich wrote:
>>>>>>>>> On 17.06.14 at 11:09, <julien.grall@linaro.org> wrote:
>>>>>>> On 17/06/14 09:01, Jan Beulich wrote:
>>>>>>>>>>> On 16.06.14 at 18:17, <julien.grall@linaro.org> wrote:
>>>>>>>>> +
>>>>>>>>> +    /* Add an extra +1 to append \0. We can't assume the guest will
>>>>>>>>> +     * provide a valid string */
>>>>>>>>
>>>>>>>> Now this is the case for flask, but for a generic string copying
>>>>>>>> routine I don't think this is desirable. It seems especially wrong to
>>>>>>>> aid the guest with putting a NUL where none was. If you really
>>>>>>>> want this, I guess you would be better off adding two variants:
>>>>>>>> One which demands the string to be NUL-terminated (in which
>>>>>>>> case passing in a size is sort of bogus), and one which takes a
>>>>>>>> size and inserts a NUL.
>>>>
>>>> I'm not sure why you would want a string copy-in function to not
>>>> NUL-terminate the strings it copies in.  If you don't want the strings
>>>> to be NUL-terminated at all, I would call it buffer copy-in function
>>>> (and copy_from_guest seems to cover buffer copy-in better).  If you want
>>>> the strings to be NUL-terminated and the guest has passed you a length,
>>>> it's simpler to have the hypervisor add the NUL instead of copying it
>>>> and then checking that it is there.  The current toolstack code for
>>>> XSM/FLASK relies on the hypervisor to add the NUL terminator, since it
>>>> often passes in (s, strlen(s)).
>>>
>>> I didn't say to just leave such strings unterminated. Instead I said
>>> that if there is no zero terminator, rather than putting one there we
>>> should just fail the operation if the buffer size limit was exceeded.
>>
>> It looks like I use the same trick as for flask, i.e using strlen(s) and
>> therefore let the hypervisor set the NUL-terminator.
>>
>> I will add a comment on this function to say that we expect the
>> hypervisor to set the NUL-terminator.
> 
> But just to make sure - the generic helper introduced there shouldn't
> behave that way if being given the proposed name.

How will you rename the function?

-- 
Julien Grall


* Re: [RFC 02/19] xen: guestcopy: Provide an helper to copy string from guest
  2014-06-18 12:53                   ` Julien Grall
@ 2014-06-18 13:01                     ` Jan Beulich
  2014-06-24 14:58                       ` Julien Grall
  0 siblings, 1 reply; 122+ messages in thread
From: Jan Beulich @ 2014-06-18 13:01 UTC (permalink / raw)
  To: Julien Grall, Daniel De Graaf
  Cc: Keir Fraser, ian.campbell, tim, Ian Jackson, stefano.stabellini,
	xen-devel

>>> On 18.06.14 at 14:53, <julien.grall@linaro.org> wrote:
> On 06/18/2014 01:49 PM, Jan Beulich wrote:
>>>>> On 18.06.14 at 14:22, <julien.grall@linaro.org> wrote:
>>> On 06/18/2014 12:59 PM, Jan Beulich wrote:
>>>>>>> On 18.06.14 at 00:43, <dgdegra@tycho.nsa.gov> wrote:
>>>>> On 06/17/2014 05:23 AM, Julien Grall wrote:
>>>>>>
>>>>>>
>>>>>> On 17/06/14 10:17, Jan Beulich wrote:
>>>>>>>>>> On 17.06.14 at 11:09, <julien.grall@linaro.org> wrote:
>>>>>>>> On 17/06/14 09:01, Jan Beulich wrote:
>>>>>>>>>>>> On 16.06.14 at 18:17, <julien.grall@linaro.org> wrote:
>>>>>>>>>> +
>>>>>>>>>> +    /* Add an extra +1 to append \0. We can't assume the guest will
>>>>>>>>>> +     * provide a valid string */
>>>>>>>>>
>>>>>>>>> Now this is the case for flask, but for a generic string copying
>>>>>>>>> routine I don't think this is desirable. It seems especially wrong to
>>>>>>>>> aid the guest with putting a NUL where none was. If you really
>>>>>>>>> want this, I guess you would be better off adding two variants:
>>>>>>>>> One which demands the string to be NUL-terminated (in which
>>>>>>>>> case passing in a size is sort of bogus), and one which takes a
>>>>>>>>> size and inserts a NUL.
>>>>>
>>>>> I'm not sure why you would want a string copy-in function to not
>>>>> NUL-terminate the strings it copies in.  If you don't want the strings
>>>>> to be NUL-terminated at all, I would call it buffer copy-in function
>>>>> (and copy_from_guest seems to cover buffer copy-in better).  If you want
>>>>> the strings to be NUL-terminated and the guest has passed you a length,
>>>>> it's simpler to have the hypervisor add the NUL instead of copying it
>>>>> and then checking that it is there.  The current toolstack code for
>>>>> XSM/FLASK relies on the hypervisor to add the NUL terminator, since it
>>>>> often passes in (s, strlen(s)).
>>>>
>>>> I didn't say to just leave such strings unterminated. Instead I said
>>>> that if there is no zero terminator, rather than putting one there we
>>>> should just fail the operation if the buffer size limit was exceeded.
>>>
>>> It looks like I use the same trick as for flask, i.e using strlen(s) and
>>> therefore let the hypervisor set the NUL-terminator.
>>>
>>> I will add a comment on this function to say that we expect the
>>> hypervisor to set the NUL-terminator.
>> 
>> But just to make sure - the generic helper introduced there shouldn't
>> behave that way if being given the proposed name.
> 
> How will you rename the function?

I don't know. All I know is that the function isn't simply copying in a
string.
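
To make the distinction concrete, here is a minimal sketch of the two
behaviours discussed in this sub-thread. It works on plain in-memory
buffers rather than guest handles, and both function names as well as the
max_len parameter are hypothetical, not taken from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/*
 * Variant 1 (hypothetical name): copy exactly len bytes and append a NUL.
 * This matches callers that pass (s, strlen(s)).
 */
static int copy_buf_add_nul(char **out, const char *src, size_t len,
                            size_t max_len)
{
    char *tmp;

    assert(out != NULL && src != NULL);

    if ( len > max_len )
        return -ERANGE;

    tmp = malloc(len + 1);          /* +1 for the terminator we add */
    if ( tmp == NULL )
        return -ENOMEM;

    memcpy(tmp, src, len);
    tmp[len] = '\0';
    *out = tmp;

    return 0;
}

/*
 * Variant 2 (hypothetical name): require a NUL within the first max_len
 * bytes of src and fail otherwise, instead of adding one.
 */
static int copy_string_strict(char **out, const char *src, size_t max_len)
{
    const char *end;
    size_t len;
    char *tmp;

    assert(out != NULL && src != NULL);

    end = memchr(src, '\0', max_len);
    if ( end == NULL )
        return -EINVAL;             /* no terminator inside the limit */

    len = (size_t)(end - src);
    tmp = malloc(len + 1);
    if ( tmp == NULL )
        return -ENOMEM;

    memcpy(tmp, src, len + 1);      /* terminator comes from the source */
    *out = tmp;

    return 0;
}
```

With the first variant the caller decides the length and the copy is
always terminated; with the second an unterminated input is an error.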

Jan


* Re: [RFC 19/19] xl: Add new option dtdev
  2014-06-16 17:19   ` Wei Liu
@ 2014-06-18 13:40     ` Julien Grall
  2014-06-18 13:43       ` Wei Liu
  0 siblings, 1 reply; 122+ messages in thread
From: Julien Grall @ 2014-06-18 13:40 UTC (permalink / raw)
  To: Wei Liu
  Cc: ian.campbell, Stefano Stabellini, tim, Ian Jackson,
	stefano.stabellini, xen-devel

Hi Wei,

On 06/16/2014 06:19 PM, Wei Liu wrote:
> On Mon, Jun 16, 2014 at 05:18:06PM +0100, Julien Grall wrote:
> [...]
>> +    if (!xlu_cfg_get_list (config, "dtdev", &dtdevs, 0, 0)) {
>> +        d_config->num_dtdevs = 0;
>> +        d_config->dtdevs = NULL;
>> +        for (i = 0; (buf = xlu_cfg_get_listitem(dtdevs, i)) != NULL; i++) {
>> +            libxl_device_dt *dtdev;
>> +
>> +            d_config->dtdevs = (libxl_device_dt *) realloc(d_config->dtdevs, sizeof (libxl_device_dt) * (d_config->num_dtdevs + 1));
>> +            dtdev = d_config->dtdevs + d_config->num_dtdevs;
>> +            libxl_device_dt_init(dtdev);
>> +
> 
> There's a macro called ARRAY_EXTEND_INIT, you can probably use that.

I can't use this macro because it requires the structure to have a
devid field.

I don't think it is worth adding this field just for code conciseness.
I could, though, rename this macro to ARRAY_EXTEND_INIT_DEVID and
introduce an ARRAY_EXTEND_INIT that could be used in the more general
case.
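
As a rough idea of what such a general variant could look like, here is a
self-contained sketch. Everything prefixed with demo_/DEMO_ is
hypothetical: struct demo_dev only stands in for libxl_device_dt, and the
macro is not the existing xl one.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Stand-in for libxl_device_dt; the real structure comes from the series. */
struct demo_dev {
    char *path;
};

static void demo_dev_init(struct demo_dev *d)
{
    assert(d != NULL);
    d->path = NULL;
}

/*
 * Hypothetical generic variant: grow ARRAY by one element, run INITFN on
 * the new slot and store a pointer to that slot in NEWP. It does not touch
 * any devid field. Aborts on allocation failure.
 */
#define DEMO_ARRAY_EXTEND_INIT(array, count, initfn, newp)                  \
    do {                                                                    \
        void *grown_ = realloc((array), sizeof(*(array)) * ((count) + 1));  \
        if ( grown_ == NULL )                                               \
        {                                                                   \
            fprintf(stderr, "out of memory\n");                             \
            abort();                                                        \
        }                                                                   \
        (array) = grown_;                                                   \
        (newp) = &(array)[(count)];                                         \
        (initfn)(newp);                                                     \
        (count)++;                                                          \
    } while (0)
```

A caller would use it as
DEMO_ARRAY_EXTEND_INIT(devs, num_devs, demo_dev_init, dev) inside the loop
over the configuration list.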

Regards,

-- 
Julien Grall


* Re: [RFC 19/19] xl: Add new option dtdev
  2014-06-18 13:40     ` Julien Grall
@ 2014-06-18 13:43       ` Wei Liu
  2014-06-18 13:46         ` Julien Grall
  0 siblings, 1 reply; 122+ messages in thread
From: Wei Liu @ 2014-06-18 13:43 UTC (permalink / raw)
  To: Julien Grall
  Cc: Wei Liu, ian.campbell, Stefano Stabellini, tim, Ian Jackson,
	stefano.stabellini, xen-devel

On Wed, Jun 18, 2014 at 02:40:37PM +0100, Julien Grall wrote:
> Hi Wei,
> 
> On 06/16/2014 06:19 PM, Wei Liu wrote:
> > On Mon, Jun 16, 2014 at 05:18:06PM +0100, Julien Grall wrote:
> > [...]
> >> +    if (!xlu_cfg_get_list (config, "dtdev", &dtdevs, 0, 0)) {
> >> +        d_config->num_dtdevs = 0;
> >> +        d_config->dtdevs = NULL;
> >> +        for (i = 0; (buf = xlu_cfg_get_listitem(dtdevs, i)) != NULL; i++) {
> >> +            libxl_device_dt *dtdev;
> >> +
> >> +            d_config->dtdevs = (libxl_device_dt *) realloc(d_config->dtdevs, sizeof (libxl_device_dt) * (d_config->num_dtdevs + 1));
> >> +            dtdev = d_config->dtdevs + d_config->num_dtdevs;
> >> +            libxl_device_dt_init(dtdev);
> >> +
> > 
> > There's a macro called ARRAY_EXTEND_INIT, you can probably use that.
> 
> I can't use this macro because it requires to have a field devid in the
> structure.

Ah, OK.

In that case, you need to use xrealloc in your code.

> 
> I don't think it worth to add this field just for code conciseness.
> Though I can add rename this macro into ARRAY_EXTEND_INIT_DEVID and
> introduce an ARRAY_EXTEND_INIT that could be use in more general case.
> 

No need to do that in my opinion.

Wei.

> Regards,
> 
> -- 
> Julien Grall


* Re: [RFC 19/19] xl: Add new option dtdev
  2014-06-18 13:43       ` Wei Liu
@ 2014-06-18 13:46         ` Julien Grall
  0 siblings, 0 replies; 122+ messages in thread
From: Julien Grall @ 2014-06-18 13:46 UTC (permalink / raw)
  To: Wei Liu
  Cc: ian.campbell, Stefano Stabellini, tim, Ian Jackson,
	stefano.stabellini, xen-devel

On 06/18/2014 02:43 PM, Wei Liu wrote:
> On Wed, Jun 18, 2014 at 02:40:37PM +0100, Julien Grall wrote:
>> Hi Wei,
>>
>> On 06/16/2014 06:19 PM, Wei Liu wrote:
>>> On Mon, Jun 16, 2014 at 05:18:06PM +0100, Julien Grall wrote:
>>> [...]
>>>> +    if (!xlu_cfg_get_list (config, "dtdev", &dtdevs, 0, 0)) {
>>>> +        d_config->num_dtdevs = 0;
>>>> +        d_config->dtdevs = NULL;
>>>> +        for (i = 0; (buf = xlu_cfg_get_listitem(dtdevs, i)) != NULL; i++) {
>>>> +            libxl_device_dt *dtdev;
>>>> +
>>>> +            d_config->dtdevs = (libxl_device_dt *) realloc(d_config->dtdevs, sizeof (libxl_device_dt) * (d_config->num_dtdevs + 1));
>>>> +            dtdev = d_config->dtdevs + d_config->num_dtdevs;
>>>> +            libxl_device_dt_init(dtdev);
>>>> +
>>>
>>> There's a macro called ARRAY_EXTEND_INIT, you can probably use that.
>>
>> I can't use this macro because it requires to have a field devid in the
>> structure.
> 
> Ah, OK.
> 
> In that case, you need to use xrealloc in your code.

Right, I blindly copied the pcidevs code which is also using realloc. I
will fix both of them in my next series.
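
For reference, the problem with the bare "p = realloc(p, n)" pattern is
that realloc returns NULL on failure and leaves the old block allocated,
so the assignment loses the only pointer to it and the next dereference
faults. A minimal sketch of a checked wrapper follows; the name
demo_xrealloc_array is hypothetical and this is not xl's xrealloc.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper: resize ptr to hold nmemb elements of the given
 * size, exiting instead of returning NULL on failure.
 */
static void *demo_xrealloc_array(void *ptr, size_t nmemb, size_t size)
{
    void *r;

    assert(size != 0);

    /* Reject element counts whose byte size would wrap around. */
    if ( nmemb > SIZE_MAX / size )
    {
        fprintf(stderr, "array size overflow\n");
        exit(EXIT_FAILURE);
    }

    /* Shrinking to zero elements frees the array. */
    if ( nmemb == 0 )
    {
        free(ptr);
        return NULL;
    }

    r = realloc(ptr, nmemb * size);
    if ( r == NULL )
    {
        /* realloc left ptr valid; give up rather than leak or crash. */
        fprintf(stderr, "unable to realloc to %zu bytes\n", nmemb * size);
        exit(EXIT_FAILURE);
    }

    return r;
}
```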

Regards,

-- 
Julien Grall


* Re: [RFC 01/19] xen/arm: guest_physmap_remove_page: Print a warning if we fail to unmap the page
  2014-06-16 16:17 ` [RFC 01/19] xen/arm: guest_physmap_remove_page: Print a warning if we fail to unmap the page Julien Grall
@ 2014-06-18 15:03   ` Stefano Stabellini
  2014-07-03 10:52   ` Ian Campbell
  1 sibling, 0 replies; 122+ messages in thread
From: Stefano Stabellini @ 2014-06-18 15:03 UTC (permalink / raw)
  To: Julien Grall; +Cc: xen-devel, stefano.stabellini, ian.campbell, tim

On Mon, 16 Jun 2014, Julien Grall wrote:
> The function guest_physmap_remove_page doesn't have a return value. With
> the change "arch/arm: add consistency check to REMOVE p2m changes",
> apply_p2m_changes can fail, although this is unlikely. Warn the user in
> this case.
> 
> Signed-off-by: Julien Grall <julien.grall@linaro.org>

Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>


>  xen/arch/arm/p2m.c |   14 ++++++++++----
>  1 file changed, 10 insertions(+), 4 deletions(-)
> 
> diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
> index f93f99a..51f9225 100644
> --- a/xen/arch/arm/p2m.c
> +++ b/xen/arch/arm/p2m.c
> @@ -607,10 +607,16 @@ void guest_physmap_remove_page(struct domain *d,
>                                 unsigned long gpfn,
>                                 unsigned long mfn, unsigned int page_order)
>  {
> -    apply_p2m_changes(d, REMOVE,
> -                      pfn_to_paddr(gpfn),
> -                      pfn_to_paddr(gpfn + (1<<page_order)),
> -                      pfn_to_paddr(mfn), NULL, MATTR_MEM, p2m_invalid);
> +    int ret;
> +
> +    ret = apply_p2m_changes(d, REMOVE,
> +                            pfn_to_paddr(gpfn),
> +                            pfn_to_paddr(gpfn + (1<<page_order)),
> +                            pfn_to_paddr(mfn), NULL, MATTR_MEM, p2m_invalid);
> +    if ( ret )
> +        dprintk(XENLOG_G_WARNING,
> +                "DOM%u: Unable to unmap region GPFN 0x%lx - 0x%lx MFN 0x%lx\n",
> +                d->domain_id, gpfn, gpfn + (1 << page_order), mfn);
>  }
>  
>  int p2m_alloc_table(struct domain *d)
> -- 
> 1.7.10.4
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel
> 


* Re: [RFC 15/19] xen/arm: Reserve region in guest memory for device passthrough
  2014-06-16 16:18 ` [RFC 15/19] xen/arm: Reserve region in guest memory for device passthrough Julien Grall
@ 2014-06-18 15:12   ` Stefano Stabellini
  2014-06-18 15:23     ` Julien Grall
  0 siblings, 1 reply; 122+ messages in thread
From: Stefano Stabellini @ 2014-06-18 15:12 UTC (permalink / raw)
  To: Julien Grall; +Cc: xen-devel, stefano.stabellini, ian.campbell, tim

On Mon, 16 Jun 2014, Julien Grall wrote:
> This region will be split by the toolstack to allocate an MMIO range for
> each device.
> 
> For now only reserve a 512MB region; this should be enough to pass through
> multiple devices at the same time.
> 
> Signed-off-by: Julien Grall <julien.grall@linaro.org>
> ---
>  xen/include/public/arch-arm.h |    4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/xen/include/public/arch-arm.h b/xen/include/public/arch-arm.h
> index ac54cd6..789bffb 100644
> --- a/xen/include/public/arch-arm.h
> +++ b/xen/include/public/arch-arm.h
> @@ -369,6 +369,10 @@ typedef uint64_t xen_callback_t;
>  #define GUEST_GICC_BASE   0x03002000ULL
>  #define GUEST_GICC_SIZE   0x00000100ULL
>  
> +/* Space for mapping MMIO from device passthrough: 512MB @ 256MB*/
> +#define GUEST_MMIO_BASE   0x10000000ULL
> +#define GUEST_MMIO_SIZE   0x20000000ULL

Is it really necessary to specify the size here? It looks like an
artificial limitation to me: given that it is unlikely that we'll ever be
able to support non-PCI device hotplug, we only have to handle cold-plug
here. So the toolstack has all the information it needs to build the
perfect memory layout for the guest at VM creation time.


>  /* 16MB == 4096 pages reserved for guest to use as a region to map its
>   * grant table in.
>   */
> -- 
> 1.7.10.4
> 
> 
> 


* Re: [RFC 15/19] xen/arm: Reserve region in guest memory for device passthrough
  2014-06-18 15:12   ` Stefano Stabellini
@ 2014-06-18 15:23     ` Julien Grall
  2014-06-18 15:26       ` Ian Campbell
  0 siblings, 1 reply; 122+ messages in thread
From: Julien Grall @ 2014-06-18 15:23 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: xen-devel, stefano.stabellini, ian.campbell, tim

On 06/18/2014 04:12 PM, Stefano Stabellini wrote:
> On Mon, 16 Jun 2014, Julien Grall wrote:
>> This region will be split by the toolstack to allocate MMIO range for eac
>> device.
>>
>> For now only reserve a 512MB region, this should be enought to passthrough
>> multiple device at the same time.
>>
>> Signed-off-by: Julien Grall <julien.grall@linaro.org>
>> ---
>>  xen/include/public/arch-arm.h |    4 ++++
>>  1 file changed, 4 insertions(+)
>>
>> diff --git a/xen/include/public/arch-arm.h b/xen/include/public/arch-arm.h
>> index ac54cd6..789bffb 100644
>> --- a/xen/include/public/arch-arm.h
>> +++ b/xen/include/public/arch-arm.h
>> @@ -369,6 +369,10 @@ typedef uint64_t xen_callback_t;
>>  #define GUEST_GICC_BASE   0x03002000ULL
>>  #define GUEST_GICC_SIZE   0x00000100ULL
>>  
>> +/* Space for mapping MMIO from device passthrough: 512MB @ 256MB*/
>> +#define GUEST_MMIO_BASE   0x10000000ULL
>> +#define GUEST_MMIO_SIZE   0x20000000ULL
> 
> Is it really necessary to specify size here? It looks like an artifical
> limitation to me: given that is unlikely that we'll ever be able to
> support non-PCI device hotplug, we only have to handle cold-plug here.
> So the toolstack has all the information it needs to build the perfect
> memory layout for the guest at VM creation time.

We have the same "artificial" limitation for the RAM banks... The
toolstack doesn't know where the different regions end up without the
size. As the layout may move in the future, adding the size avoids
modifying the toolstack every time we change it.

For instance, the layout will change again soon with guest support for
GICv3.

Regards,

-- 
Julien Grall


* Re: [RFC 15/19] xen/arm: Reserve region in guest memory for device passthrough
  2014-06-18 15:23     ` Julien Grall
@ 2014-06-18 15:26       ` Ian Campbell
  2014-06-18 17:48         ` Stefano Stabellini
  0 siblings, 1 reply; 122+ messages in thread
From: Ian Campbell @ 2014-06-18 15:26 UTC (permalink / raw)
  To: Julien Grall; +Cc: xen-devel, tim, stefano.stabellini, Stefano Stabellini

On Wed, 2014-06-18 at 16:23 +0100, Julien Grall wrote:
> On 06/18/2014 04:12 PM, Stefano Stabellini wrote:
> > On Mon, 16 Jun 2014, Julien Grall wrote:
> >> This region will be split by the toolstack to allocate MMIO range for eac
> >> device.
> >>
> >> For now only reserve a 512MB region, this should be enought to passthrough
> >> multiple device at the same time.
> >>
> >> Signed-off-by: Julien Grall <julien.grall@linaro.org>
> >> ---
> >>  xen/include/public/arch-arm.h |    4 ++++
> >>  1 file changed, 4 insertions(+)
> >>
> >> diff --git a/xen/include/public/arch-arm.h b/xen/include/public/arch-arm.h
> >> index ac54cd6..789bffb 100644
> >> --- a/xen/include/public/arch-arm.h
> >> +++ b/xen/include/public/arch-arm.h
> >> @@ -369,6 +369,10 @@ typedef uint64_t xen_callback_t;
> >>  #define GUEST_GICC_BASE   0x03002000ULL
> >>  #define GUEST_GICC_SIZE   0x00000100ULL
> >>  
> >> +/* Space for mapping MMIO from device passthrough: 512MB @ 256MB*/
> >> +#define GUEST_MMIO_BASE   0x10000000ULL
> >> +#define GUEST_MMIO_SIZE   0x20000000ULL
> > 
> > Is it really necessary to specify size here? It looks like an artifical
> > limitation to me: given that is unlikely that we'll ever be able to
> > support non-PCI device hotplug, we only have to handle cold-plug here.
> > So the toolstack has all the information it needs to build the perfect
> > memory layout for the guest at VM creation time.
> 
> We have the same "artificial" limitation for the RAM banks... The
> toolstack doesn't know where the different regions end up without the
> size. As the layout may move in the future, adding the size avoid
> modifying the toolstack every time we change it.

It also provides a documentary clue to people modifying things in this
file to remind them to think about how much space they need to try and
leave for this when adding something else.

Ian.


* Re: [RFC 15/19] xen/arm: Reserve region in guest memory for device passthrough
  2014-06-18 15:26       ` Ian Campbell
@ 2014-06-18 17:48         ` Stefano Stabellini
  2014-06-18 17:54           ` Julien Grall
  0 siblings, 1 reply; 122+ messages in thread
From: Stefano Stabellini @ 2014-06-18 17:48 UTC (permalink / raw)
  To: Ian Campbell
  Cc: xen-devel, tim, Julien Grall, stefano.stabellini, Stefano Stabellini

On Wed, 18 Jun 2014, Ian Campbell wrote:
> On Wed, 2014-06-18 at 16:23 +0100, Julien Grall wrote:
> > On 06/18/2014 04:12 PM, Stefano Stabellini wrote:
> > > On Mon, 16 Jun 2014, Julien Grall wrote:
> > >> This region will be split by the toolstack to allocate MMIO range for eac
> > >> device.
> > >>
> > >> For now only reserve a 512MB region, this should be enought to passthrough
> > >> multiple device at the same time.
> > >>
> > >> Signed-off-by: Julien Grall <julien.grall@linaro.org>
> > >> ---
> > >>  xen/include/public/arch-arm.h |    4 ++++
> > >>  1 file changed, 4 insertions(+)
> > >>
> > >> diff --git a/xen/include/public/arch-arm.h b/xen/include/public/arch-arm.h
> > >> index ac54cd6..789bffb 100644
> > >> --- a/xen/include/public/arch-arm.h
> > >> +++ b/xen/include/public/arch-arm.h
> > >> @@ -369,6 +369,10 @@ typedef uint64_t xen_callback_t;
> > >>  #define GUEST_GICC_BASE   0x03002000ULL
> > >>  #define GUEST_GICC_SIZE   0x00000100ULL
> > >>  
> > >> +/* Space for mapping MMIO from device passthrough: 512MB @ 256MB*/
> > >> +#define GUEST_MMIO_BASE   0x10000000ULL
> > >> +#define GUEST_MMIO_SIZE   0x20000000ULL
> > > 
> > > Is it really necessary to specify size here? It looks like an artifical
> > > limitation to me: given that is unlikely that we'll ever be able to
> > > support non-PCI device hotplug, we only have to handle cold-plug here.
> > > So the toolstack has all the information it needs to build the perfect
> > > memory layout for the guest at VM creation time.
> > 
> > We have the same "artificial" limitation for the RAM banks... The
> > toolstack doesn't know where the different regions end up without the
> > size. As the layout may move in the future, adding the size avoid
> > modifying the toolstack every time we change it.
> 
> It also provides a documentary clue to people modifying things in this
> file to remind them to think about how much space they need to try and
> leave for this when adding something else.

I am not against documentation or reasonable defaults. Let me explain
what I mean more clearly: if we are trying to assign 1 device with an
MMIO region of 1024MB, we know that it is not going to fit.

Can we rearrange the guest memory layout to increase GUEST_MMIO_SIZE?
After all the guest hasn't booted yet.

Otherwise could we calculate the size of the MMIO hole needed earlier,
at the time of building the guest p2m? This sounds harder.


* Re: [RFC 15/19] xen/arm: Reserve region in guest memory for device passthrough
  2014-06-18 17:48         ` Stefano Stabellini
@ 2014-06-18 17:54           ` Julien Grall
  2014-06-18 18:14             ` Stefano Stabellini
  0 siblings, 1 reply; 122+ messages in thread
From: Julien Grall @ 2014-06-18 17:54 UTC (permalink / raw)
  To: Stefano Stabellini, Ian Campbell; +Cc: xen-devel, stefano.stabellini, tim

On 06/18/2014 06:48 PM, Stefano Stabellini wrote:
> On Wed, 18 Jun 2014, Ian Campbell wrote:
>> On Wed, 2014-06-18 at 16:23 +0100, Julien Grall wrote:
>>> On 06/18/2014 04:12 PM, Stefano Stabellini wrote:
>>>> On Mon, 16 Jun 2014, Julien Grall wrote:
>>>>> This region will be split by the toolstack to allocate MMIO range for eac
>>>>> device.
>>>>>
>>>>> For now only reserve a 512MB region, this should be enought to passthrough
>>>>> multiple device at the same time.
>>>>>
>>>>> Signed-off-by: Julien Grall <julien.grall@linaro.org>
>>>>> ---
>>>>>  xen/include/public/arch-arm.h |    4 ++++
>>>>>  1 file changed, 4 insertions(+)
>>>>>
>>>>> diff --git a/xen/include/public/arch-arm.h b/xen/include/public/arch-arm.h
>>>>> index ac54cd6..789bffb 100644
>>>>> --- a/xen/include/public/arch-arm.h
>>>>> +++ b/xen/include/public/arch-arm.h
>>>>> @@ -369,6 +369,10 @@ typedef uint64_t xen_callback_t;
>>>>>  #define GUEST_GICC_BASE   0x03002000ULL
>>>>>  #define GUEST_GICC_SIZE   0x00000100ULL
>>>>>  
>>>>> +/* Space for mapping MMIO from device passthrough: 512MB @ 256MB*/
>>>>> +#define GUEST_MMIO_BASE   0x10000000ULL
>>>>> +#define GUEST_MMIO_SIZE   0x20000000ULL
>>>>
>>>> Is it really necessary to specify size here? It looks like an artifical
>>>> limitation to me: given that is unlikely that we'll ever be able to
>>>> support non-PCI device hotplug, we only have to handle cold-plug here.
>>>> So the toolstack has all the information it needs to build the perfect
>>>> memory layout for the guest at VM creation time.
>>>
>>> We have the same "artificial" limitation for the RAM banks... The
>>> toolstack doesn't know where the different regions end up without the
>>> size. As the layout may move in the future, adding the size avoid
>>> modifying the toolstack every time we change it.
>>
>> It also provides a documentary clue to people modifying things in this
>> file to remind them to think about how much space they need to try and
>> leave for this when adding something else.
> 
> I am not against documentation or resonable defaults. Let me explain
> what I mean more clearly: if we are trying to assign 1 device with an
> MMIO region of 1024MB, we know that it is not going to fit.

For non-PCI passthrough a region of this size is unlikely. We always map
only a few pages per device.

I think this size is enough for the time being. I plan to revisit it for
PCI passthrough where we will be able to allocate some of them after 4G.

-- 
Julien Grall


* Re: [RFC 05/19] xen/arm: Release IRQ routed to a domain when it's destroying
  2014-06-16 16:17 ` [RFC 05/19] xen/arm: Release IRQ routed to a domain when it's destroying Julien Grall
@ 2014-06-18 18:08   ` Stefano Stabellini
  2014-06-18 18:26     ` Julien Grall
  0 siblings, 1 reply; 122+ messages in thread
From: Stefano Stabellini @ 2014-06-18 18:08 UTC (permalink / raw)
  To: Julien Grall; +Cc: xen-devel, stefano.stabellini, ian.campbell, tim

On Mon, 16 Jun 2014, Julien Grall wrote:
> Xen has to release the IRQs routed to a domain in order to reuse them
> later. Currently only SPIs can be routed to the guest, so we only need to
> browse the SPIs of a specific domain.
> 
> Furthermore, a guest can crash and leave an IRQ in an incorrect state
> (i.e. it has not been EOIed). Add a function to reset a given IRQ so that
> Xen can route the IRQ again in the future.
> 
> Also, reset the desc->handler to no_irq_type. This will let you know if we
> did something wrong with the IRQ management.
> 
> Signed-off-by: Julien Grall <julien.grall@linaro.org>
> ---
>  xen/arch/arm/gic.c        |   12 ++++++++++++
>  xen/arch/arm/irq.c        |    8 ++++++++
>  xen/arch/arm/vgic.c       |   10 ++++++++++
>  xen/include/asm-arm/gic.h |    3 +++
>  4 files changed, 33 insertions(+)
> 
> diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
> index 11e53af..42fc3bc 100644
> --- a/xen/arch/arm/gic.c
> +++ b/xen/arch/arm/gic.c
> @@ -928,6 +928,18 @@ int gicv_setup(struct domain *d)
>  
>  }
>  
> +/* The guest may not have EOIed the IRQ.
> + * Be sure to reset correctly the IRQ.
> + */
> +void gic_reset_guest_irq(struct irq_desc *desc)
> +{
> +    ASSERT(spin_is_locked(&desc->lock));
> +    ASSERT(desc->status & IRQ_GUEST);
> +
> +    if ( desc->status & IRQ_INPROGRESS )
> +        GICC[GICC_DIR] = desc->irq;
> +}

You should call gic_update_one_lr first, then check IRQ_INPROGRESS.
You should also call gic_remove_from_queues, remove the irq from the
inflight queue and clear the GIC_IRQ_GUEST_* status bits.


>  static void maintenance_interrupt(int irq, void *dev_id, struct cpu_user_regs *regs)
>  {
>      /* 
> diff --git a/xen/arch/arm/irq.c b/xen/arch/arm/irq.c
> index 4e51fee..e44a90f 100644
> --- a/xen/arch/arm/irq.c
> +++ b/xen/arch/arm/irq.c
> @@ -274,7 +274,15 @@ void release_irq(unsigned int irq, const void *dev_id)
>      if ( !desc->action )
>      {
>          desc->handler->shutdown(desc);
> +
> +        if ( desc->status & IRQ_GUEST )
> +        {
> +            gic_reset_guest_irq(desc);
> +            desc->status &= ~IRQ_INPROGRESS;
> +        }
> +
>          desc->status &= ~IRQ_GUEST;
> +        desc->handler = &no_irq_type;
>      }
>  
>      spin_unlock_irqrestore(&desc->lock,flags);
> diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c
> index cb8df3a..e451324 100644
> --- a/xen/arch/arm/vgic.c
> +++ b/xen/arch/arm/vgic.c
> @@ -112,6 +112,16 @@ int domain_vgic_init(struct domain *d)
>  
>  void domain_vgic_free(struct domain *d)
>  {
> +    int i;
> +
> +    for ( i = NR_LOCAL_IRQS; i < d->arch.vgic.nr_lines; i++ )
> +    {
> +        struct irq_desc *desc = d->arch.vgic.pending_irqs[i].desc;
> +
> +        if ( desc )
> +            release_irq(desc->irq, d);
> +    }
> +
>      xfree(d->arch.vgic.shared_irqs);
>      xfree(d->arch.vgic.pending_irqs);
>  }
> diff --git a/xen/include/asm-arm/gic.h b/xen/include/asm-arm/gic.h
> index 6e7375c..841d845 100644
> --- a/xen/include/asm-arm/gic.h
> +++ b/xen/include/asm-arm/gic.h
> @@ -228,6 +228,9 @@ int gic_irq_xlate(const u32 *intspec, unsigned int intsize,
>                    unsigned int *out_hwirq, unsigned int *out_type);
>  void gic_clear_lrs(struct vcpu *v);
>  
> +/* Reset an IRQ passthrough to a guest */
> +void gic_reset_guest_irq(struct irq_desc *desc);
> +
>  #endif /* __ASSEMBLY__ */
>  #endif
>  
> -- 
> 1.7.10.4
> 
> 
> 


* Re: [RFC 15/19] xen/arm: Reserve region in guest memory for device passthrough
  2014-06-18 17:54           ` Julien Grall
@ 2014-06-18 18:14             ` Stefano Stabellini
  2014-06-18 18:33               ` Julien Grall
  0 siblings, 1 reply; 122+ messages in thread
From: Stefano Stabellini @ 2014-06-18 18:14 UTC (permalink / raw)
  To: Julien Grall
  Cc: xen-devel, tim, stefano.stabellini, Ian Campbell, Stefano Stabellini

On Wed, 18 Jun 2014, Julien Grall wrote:
> On 06/18/2014 06:48 PM, Stefano Stabellini wrote:
> > On Wed, 18 Jun 2014, Ian Campbell wrote:
> >> On Wed, 2014-06-18 at 16:23 +0100, Julien Grall wrote:
> >>> On 06/18/2014 04:12 PM, Stefano Stabellini wrote:
> >>>> On Mon, 16 Jun 2014, Julien Grall wrote:
> >>>>> This region will be split by the toolstack to allocate MMIO range for eac
> >>>>> device.
> >>>>>
> >>>>> For now only reserve a 512MB region, this should be enought to passthrough
> >>>>> multiple device at the same time.
> >>>>>
> >>>>> Signed-off-by: Julien Grall <julien.grall@linaro.org>
> >>>>> ---
> >>>>>  xen/include/public/arch-arm.h |    4 ++++
> >>>>>  1 file changed, 4 insertions(+)
> >>>>>
> >>>>> diff --git a/xen/include/public/arch-arm.h b/xen/include/public/arch-arm.h
> >>>>> index ac54cd6..789bffb 100644
> >>>>> --- a/xen/include/public/arch-arm.h
> >>>>> +++ b/xen/include/public/arch-arm.h
> >>>>> @@ -369,6 +369,10 @@ typedef uint64_t xen_callback_t;
> >>>>>  #define GUEST_GICC_BASE   0x03002000ULL
> >>>>>  #define GUEST_GICC_SIZE   0x00000100ULL
> >>>>>  
> >>>>> +/* Space for mapping MMIO from device passthrough: 512MB @ 256MB*/
> >>>>> +#define GUEST_MMIO_BASE   0x10000000ULL
> >>>>> +#define GUEST_MMIO_SIZE   0x20000000ULL
> >>>>
> >>>> Is it really necessary to specify size here? It looks like an artifical
> >>>> limitation to me: given that is unlikely that we'll ever be able to
> >>>> support non-PCI device hotplug, we only have to handle cold-plug here.
> >>>> So the toolstack has all the information it needs to build the perfect
> >>>> memory layout for the guest at VM creation time.
> >>>
> >>> We have the same "artificial" limitation for the RAM banks... The
> >>> toolstack doesn't know where the different regions end up without the
> >>> size. As the layout may move in the future, adding the size avoid
> >>> modifying the toolstack every time we change it.
> >>
> >> It also provides a documentary clue to people modifying things in this
> >> file to remind them to think about how much space they need to try and
> >> leave for this when adding something else.
> > 
> > I am not against documentation or resonable defaults. Let me explain
> > what I mean more clearly: if we are trying to assign 1 device with an
> > MMIO region of 1024MB, we know that it is not going to fit.
> 
> For non-PCI passthrough this size will unlikely happen. We always map a
> matter of few pages per-device.
> 
> I think this size is enough for the time being. I plan to revisit it for
> PCI passthrough where we will be able to allocate some of them after 4G.

Modern GPUs can easily exceed 512MB and they work on ARM.


* Re: [RFC 05/19] xen/arm: Release IRQ routed to a domain when it's destroying
  2014-06-18 18:08   ` Stefano Stabellini
@ 2014-06-18 18:26     ` Julien Grall
  2014-06-18 18:48       ` Stefano Stabellini
  0 siblings, 1 reply; 122+ messages in thread
From: Julien Grall @ 2014-06-18 18:26 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: xen-devel, stefano.stabellini, ian.campbell, tim



On 18/06/14 19:08, Stefano Stabellini wrote:
>> +/* The guest may not have EOIed the IRQ.
>> + * Be sure to reset correctly the IRQ.
>> + */
>> +void gic_reset_guest_irq(struct irq_desc *desc)
>> +{
>> +    ASSERT(spin_is_locked(&desc->lock));
>> +    ASSERT(desc->status & IRQ_GUEST);
>> +
>> +    if ( desc->status & IRQ_INPROGRESS )
>> +        GICC[GICC_DIR] = desc->irq;
>> +}
>
> You should call gic_update_one_lr first, then check IRQ_INPROGRESS.
> You should also call gic_remove_from_queues, remove the irq from the
> inflight queue and clear the GIC_IRQ_GUEST_* status bits.

Are you sure? This function is only called when the domain is dying, so 
the guest is already unscheduled. Therefore gic_update_one_lr won't work.

I can add an ASSERT(irq_get_domain(desc)->is_dying) here...

Regards,

-- 
Julien Grall

* Re: [RFC 15/19] xen/arm: Reserve region in guest memory for device passthrough
  2014-06-18 18:14             ` Stefano Stabellini
@ 2014-06-18 18:33               ` Julien Grall
  2014-06-18 18:55                 ` Stefano Stabellini
  2014-07-03 11:56                 ` Ian Campbell
  0 siblings, 2 replies; 122+ messages in thread
From: Julien Grall @ 2014-06-18 18:33 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: xen-devel, stefano.stabellini, Ian Campbell, tim



On 18/06/14 19:14, Stefano Stabellini wrote:
>> For non-PCI passthrough this size will unlikely happen. We always map a
>> matter of few pages per-device.
>>
>> I think this size is enough for the time being. I plan to revisit it for
>> PCI passthrough where we will be able to allocate some of them after 4G.
>
> Modern GPUs can easily exceed 512MB and they work on ARM.

If it's like on x86, passthrough GPUs may also require firmware. At 
least that looks to be the case with some configurations on the Arndale, 
which doesn't have any SMMU.

I'd like to keep this series as simple as possible so we can quickly get 
a working solution for passing through devices with a small amount of 
MMIO.

So, if you don't mind, I suggest bumping the MMIO region to 1GB (I may 
need to shrink the first RAM bank) and thinking about support for bigger 
regions in a follow-up series.

Regards,

-- 
Julien Grall

* Re: [RFC 05/19] xen/arm: Release IRQ routed to a domain when it's destroying
  2014-06-18 18:26     ` Julien Grall
@ 2014-06-18 18:48       ` Stefano Stabellini
  2014-06-18 18:54         ` Julien Grall
  0 siblings, 1 reply; 122+ messages in thread
From: Stefano Stabellini @ 2014-06-18 18:48 UTC (permalink / raw)
  To: Julien Grall
  Cc: xen-devel, tim, stefano.stabellini, ian.campbell, Stefano Stabellini

On Wed, 18 Jun 2014, Julien Grall wrote:
> On 18/06/14 19:08, Stefano Stabellini wrote:
> > > +/* The guest may not have EOIed the IRQ.
> > > + * Be sure to reset correctly the IRQ.
> > > + */
> > > +void gic_reset_guest_irq(struct irq_desc *desc)
> > > +{
> > > +    ASSERT(spin_is_locked(&desc->lock));
> > > +    ASSERT(desc->status & IRQ_GUEST);
> > > +
> > > +    if ( desc->status & IRQ_INPROGRESS )
> > > +        GICC[GICC_DIR] = desc->irq;
> > > +}
> > 
> > You should call gic_update_one_lr first, then check IRQ_INPROGRESS.
> > You should also call gic_remove_from_queues, remove the irq from the
> > inflight queue and clear the GIC_IRQ_GUEST_* status bits.
> 
> Are you sure? This function is only called when the domain is dying, so the
> guest is already unscheduled. Therefore gic_update_one_lr won't work.
> 
> I can add an ASSERT(irq_get_domain(desc)->is_dying) here...

The ASSERT is a good idea.

Given that the domain has been descheduled, gic_update_one_lr won't work
but you can read the saved lr (pending_irq->lr) from v->arch.gic_lr.
You can obtain the target vcpu by calling vgic_get_target_vcpu.
You only need to write to GICC_DIR if (gic_lr & (GICH_LR_ACTIVE|GICH_LR_PENDING)).
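
For illustration, the saved-LR test described above could look roughly
like the standalone sketch below. It is not code from the series: the
struct, the NR_SAVED_LRS size and the helper name are hypothetical, and
the bit positions are the GICv2 list-register state bits as I understand
them, so they should be checked against the real headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed GICv2 list-register state bits; verify against the real headers. */
#define GICH_LR_PENDING (UINT32_C(1) << 28)
#define GICH_LR_ACTIVE  (UINT32_C(1) << 29)

/* Hypothetical number of list registers saved per vCPU. */
#define NR_SAVED_LRS 4

/* Hypothetical stand-in for the LRs saved when a vCPU is descheduled. */
struct saved_vcpu_gic {
    uint32_t gic_lr[NR_SAVED_LRS];
};

/*
 * Return true if the IRQ held in saved list register `lr` is still
 * pending and/or active, i.e. the physical IRQ would need a GICC_DIR
 * write to be deactivated.
 */
static bool saved_lr_needs_deactivation(const struct saved_vcpu_gic *v,
                                        unsigned int lr)
{
    assert(lr < NR_SAVED_LRS);
    return (v->gic_lr[lr] & (GICH_LR_ACTIVE | GICH_LR_PENDING)) != 0;
}
```

A caller would look up the LR index recorded for the IRQ and only issue
the deactivation when this helper returns true.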

gic_remove_from_queues should still work.

Also I wonder if you need to call gic_reset_guest_irq before
desc->handler->shutdown.
The specification states (4.3.5):

'Disabling an interrupt only disables the forwarding of the interrupt
from the Distributor to any CPU interface. It does not prevent the
interrupt from changing state, for example becoming pending, or active
and pending if it is already active.'

So from the text above I think that EOIing an interrupt that has been
disabled at the GICD level should work, but it is not 100% clear.

* Re: [RFC 04/19] xen/arm: route_irq_to_guest: Check validity of the IRQ
  2014-06-16 16:17 ` [RFC 04/19] xen/arm: route_irq_to_guest: Check validity of the IRQ Julien Grall
@ 2014-06-18 18:52   ` Stefano Stabellini
  2014-06-18 19:03     ` Julien Grall
  2014-07-03 11:04   ` Ian Campbell
  1 sibling, 1 reply; 122+ messages in thread
From: Stefano Stabellini @ 2014-06-18 18:52 UTC (permalink / raw)
  To: Julien Grall; +Cc: xen-devel, stefano.stabellini, ian.campbell, tim

On Mon, 16 Jun 2014, Julien Grall wrote:
> Currently Xen only supports routing SPIs to a guest. Add a function
> is_routable_irq to check whether we can route a given IRQ to the guest.
> 
> Secondly, make sure the vIRQ (which is currently the same as the pIRQ) is not
> greater than the number of IRQs handled by the vGIC.
> 
> Finally, desc->arch.type, which contains the IRQ type (i.e. level/edge), must
> be correctly configured beforehand. The IRQ type won't be configured when:
>     - the device has been blacklisted for the current platform
>     - the IRQ has not been described in the device tree
> 
> I think we can safely assume that a user will never ask to route
> such an IRQ to the guest.
> 
> Also, use XENLOG_G_ERR in the error message within the function as it will
> be later called from a guest.
> 
> Signed-off-by: Julien Grall <julien.grall@linaro.org>
> ---
>  xen/arch/arm/irq.c        |   32 +++++++++++++++++++++++++++++---
>  xen/include/asm-arm/gic.h |    2 ++
>  xen/include/asm-arm/irq.h |    6 ++++++
>  3 files changed, 37 insertions(+), 3 deletions(-)
> 
> diff --git a/xen/arch/arm/irq.c b/xen/arch/arm/irq.c
> index 9c141bc..4e51fee 100644
> --- a/xen/arch/arm/irq.c
> +++ b/xen/arch/arm/irq.c
> @@ -361,6 +361,10 @@ err:
>      return rc;
>  }
>  
> +/* Route an IRQ to a specific guest.
> + * For now the vIRQ is equal to the pIRQ and only SPIs are routabled to
> + * the guest.
> + */
>  int route_irq_to_guest(struct domain *d, unsigned int irq,
>                         const char * devname)
>  {
> @@ -369,6 +373,20 @@ int route_irq_to_guest(struct domain *d, unsigned int irq,
>      unsigned long flags;
>      int retval = 0;
>  
> +    if ( !is_routable_irq(irq) )
> +    {
> +        dprintk(XENLOG_G_ERR, "the IRQ%u is not routable\n", irq);
> +        return -EINVAL;
> +    }
> +
> +    if ( irq > vgic_num_irqs(d) )
> +    {
> +        dprintk(XENLOG_G_ERR,
> +                "the IRQ number %u is too high for domain %u (max = %u)\n",
> +                irq, d->domain_id, vgic_num_irqs(d));
> +        return -EINVAL;
> +    }

I think it makes sense to move the "irq > vgic_num_irqs(d)" check
within is_routable_irq.


>      action = xmalloc(struct irqaction);
>      if (!action)
>          return -ENOMEM;
> @@ -379,6 +397,14 @@ int route_irq_to_guest(struct domain *d, unsigned int irq,
>  
>      spin_lock_irqsave(&desc->lock, flags);
>  
> +    if ( desc->arch.type == DT_IRQ_TYPE_INVALID )
> +    {
> +        dprintk(XENLOG_G_ERR, "IRQ %u has not been configured\n",
> +                irq);
> +        retval = -EIO;
> +        goto out;
> +    }
> +
>      /* If the IRQ is already used by someone
>       *  - If it's the same domain -> Xen doesn't need to update the IRQ desc
>       *  - Otherwise -> For now, don't allow the IRQ to be shared between
> @@ -392,10 +418,10 @@ int route_irq_to_guest(struct domain *d, unsigned int irq,
>              goto out;
>  
>          if ( desc->status & IRQ_GUEST )
> -            printk(XENLOG_ERR "ERROR: IRQ %u is already used by domain %u\n",
> -                   irq, ad->domain_id);
> +            dprintk(XENLOG_G_ERR, "IRQ %u is already used by domain %u\n",
> +                    irq, ad->domain_id);
>          else
> -            printk(XENLOG_ERR "ERROR: IRQ %u is already used by Xen\n", irq);
> +            dprintk(XENLOG_G_ERR, "IRQ %u is already used by Xen\n", irq);
>          retval = -EBUSY;
>          goto out;
>      }
> diff --git a/xen/include/asm-arm/gic.h b/xen/include/asm-arm/gic.h
> index 8e37ccf..6e7375c 100644
> --- a/xen/include/asm-arm/gic.h
> +++ b/xen/include/asm-arm/gic.h
> @@ -163,6 +163,8 @@
>  #define DT_MATCH_GIC    DT_MATCH_COMPATIBLE("arm,cortex-a15-gic"), \
>                          DT_MATCH_COMPATIBLE("arm,cortex-a7-gic")
>  
> +#define vgic_num_irqs(d)    ((d)->arch.vgic.nr_lines + 32)
> +
>  extern int domain_vgic_init(struct domain *d);
>  extern void domain_vgic_free(struct domain *d);
>  
> diff --git a/xen/include/asm-arm/irq.h b/xen/include/asm-arm/irq.h
> index e567f71..63926a5 100644
> --- a/xen/include/asm-arm/irq.h
> +++ b/xen/include/asm-arm/irq.h
> @@ -37,6 +37,12 @@ void do_IRQ(struct cpu_user_regs *regs, unsigned int irq, int is_fiq);
>  
>  #define domain_pirq_to_irq(d, pirq) (pirq)
>  
> +static inline bool_t is_routable_irq(unsigned int irq)
> +{
> +    /* For now, we can only route SPIs to the guest */
> +    return (irq >= NR_LOCAL_IRQS);
> +}
> +
>  void init_IRQ(void);
>  void init_secondary_IRQ(void);
>  
> -- 
> 1.7.10.4
> 

* Re: [RFC 07/19] xen/dts: Use unsigned int for MMIO and IRQ index
  2014-06-16 16:17 ` [RFC 07/19] xen/dts: Use unsigned int for MMIO and IRQ index Julien Grall
@ 2014-06-18 18:54   ` Stefano Stabellini
  2014-06-19 11:42     ` Julien Grall
  0 siblings, 1 reply; 122+ messages in thread
From: Stefano Stabellini @ 2014-06-18 18:54 UTC (permalink / raw)
  To: Julien Grall; +Cc: xen-devel, stefano.stabellini, ian.campbell, tim

On Mon, 16 Jun 2014, Julien Grall wrote:
> There is no reason to use a signed integer for an index. Furthermore, this will
> avoid possible issues when these functions are exposed to the guest
> via new hypercalls.
> 
> Signed-off-by: Julien Grall <julien.grall@linaro.org>
> ---
>  xen/common/device_tree.c      |   10 +++++-----
>  xen/include/xen/device_tree.h |    7 ++++---
>  2 files changed, 9 insertions(+), 8 deletions(-)
> 
> diff --git a/xen/common/device_tree.c b/xen/common/device_tree.c
> index f0b17a3..4736e0d 100644
> --- a/xen/common/device_tree.c
> +++ b/xen/common/device_tree.c
> @@ -876,7 +876,7 @@ static const struct dt_bus *dt_match_bus(const struct dt_device_node *np)
>  }
>  
>  static const __be32 *dt_get_address(const struct dt_device_node *dev,
> -                                    int index, u64 *size,
> +                                    unsigned index, u64 *size,

It is the same thing but you might as well use unsigned int.


>                                      unsigned int *flags)
>  {
>      const __be32 *prop;
> @@ -1063,7 +1063,7 @@ bail:
>  }
>  
>  /* dt_device_address - Translate device tree address and return it */
> -int dt_device_get_address(const struct dt_device_node *dev, int index,
> +int dt_device_get_address(const struct dt_device_node *dev, unsigned int index,
>                            u64 *addr, u64 *size)
>  {
>      const __be32 *addrp;
> @@ -1386,7 +1386,7 @@ fail:
>      return -EINVAL;
>  }
>  
> -int dt_device_get_raw_irq(const struct dt_device_node *device, int index,
> +int dt_device_get_raw_irq(const struct dt_device_node *device, uint32_t index,

Why are you changing the other indexes to unsigned int and this one to
uint32_t?


>                            struct dt_raw_irq *out_irq)
>  {
>      const struct dt_device_node *p;
> @@ -1394,7 +1394,7 @@ int dt_device_get_raw_irq(const struct dt_device_node *device, int index,
>      u32 intsize, intlen;
>      int res = -EINVAL;
>  
> -    dt_dprintk("dt_device_get_raw_irq: dev=%s, index=%d\n",
> +    dt_dprintk("dt_device_get_raw_irq: dev=%s, index=%u\n",
>                 device->full_name, index);
>  
>      /* Get the interrupts property */
> @@ -1445,7 +1445,7 @@ int dt_irq_translate(const struct dt_raw_irq *raw,
>                          &out_irq->irq, &out_irq->type);
>  }
>  
> -int dt_device_get_irq(const struct dt_device_node *device, int index,
> +int dt_device_get_irq(const struct dt_device_node *device, uint32_t index,

ditto


>                        struct dt_irq *out_irq)
>  {
>      struct dt_raw_irq raw;
> diff --git a/xen/include/xen/device_tree.h b/xen/include/xen/device_tree.h
> index 25db076..e413447 100644
> --- a/xen/include/xen/device_tree.h
> +++ b/xen/include/xen/device_tree.h
> @@ -502,7 +502,7 @@ const struct dt_device_node *dt_get_parent(const struct dt_device_node *node);
>   * This function resolves an address, walking the tree, for a give
>   * device-tree node. It returns 0 on success.
>   */
> -int dt_device_get_address(const struct dt_device_node *dev, int index,
> +int dt_device_get_address(const struct dt_device_node *dev, unsigned int index,
>                            u64 *addr, u64 *size);
>  
>  /**
> @@ -532,7 +532,7 @@ unsigned int dt_number_of_address(const struct dt_device_node *device);
>   * This function resolves an interrupt, walking the tree, for a given
>   * device-tree node. It's the high level pendant to dt_device_get_raw_irq().
>   */
> -int dt_device_get_irq(const struct dt_device_node *device, int index,
> +int dt_device_get_irq(const struct dt_device_node *device, unsigned int index,
>                        struct dt_irq *irq);
>  
>  /**
> @@ -544,7 +544,8 @@ int dt_device_get_irq(const struct dt_device_node *device, int index,
>   * This function resolves an interrupt for a device, no translation is
>   * made. dt_irq_translate can be called after.
>   */
> -int dt_device_get_raw_irq(const struct dt_device_node *device, int index,
> +int dt_device_get_raw_irq(const struct dt_device_node *device,
> +                          unsigned int index,
>                            struct dt_raw_irq *irq);
>  
>  /**
> -- 
> 1.7.10.4
> 

* Re: [RFC 05/19] xen/arm: Release IRQ routed to a domain when it's destroying
  2014-06-18 18:48       ` Stefano Stabellini
@ 2014-06-18 18:54         ` Julien Grall
  2014-06-18 19:06           ` Stefano Stabellini
  0 siblings, 1 reply; 122+ messages in thread
From: Julien Grall @ 2014-06-18 18:54 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: xen-devel, stefano.stabellini, ian.campbell, tim



On 18/06/14 19:48, Stefano Stabellini wrote:
>> I can add an ASSERT(irq_get_domain(desc)->is_dying) here...
>
> The ASSERT is a good idea.
>
> Given that the domain has been descheduled, gic_update_one_lr won't work
> but you can read the saved lr (pending_irq->lr) from v->arch.gic_lr.
> You can obtain the target vcpu calling vgic_get_target_vcpu.
> You only need to write to GICC_DIR if (gic_lr & (GICH_LR_ACTIVE|GICH_LR_PENDING)).
> gic_remove_from_queues should still work.

Unless we assume virq == pirq, we can't retrieve the irq_pending 
structure via an irq_desc. Also, it may have been freed earlier, even 
though for now SPIs are stored directly in arch_domain.

I prefer to keep the IRQ_INPROGRESS check and call gic_update_lrs 
before descheduling the VCPU. BTW, is that already the case?

>
> Also I wonder if you need to call gic_reset_guest_irq before
> desc->handler->shutdown.
> The specification states (4.3.5):
>
> 'Disabling an interrupt only disables the forwarding of the interrupt
> from the Distributor to any CPU interface. It does not prevent the
> interrupt from changing state, for example becoming pending, or active
> and pending if it is already active.'
>
> So from the text above I think that EOIing an interrupt that has been
> disabled at the GICD level should work, but it is not 100% clear.

For oneshot IRQs, Linux disables the IRQ before EOIing it. This avoids 
receiving spurious interrupts.

We need to do the same thing here, otherwise we may receive spurious 
interrupts.

Regards,

-- 
Julien Grall

* Re: [RFC 15/19] xen/arm: Reserve region in guest memory for device passthrough
  2014-06-18 18:33               ` Julien Grall
@ 2014-06-18 18:55                 ` Stefano Stabellini
  2014-07-03 11:56                 ` Ian Campbell
  1 sibling, 0 replies; 122+ messages in thread
From: Stefano Stabellini @ 2014-06-18 18:55 UTC (permalink / raw)
  To: Julien Grall
  Cc: xen-devel, tim, stefano.stabellini, Ian Campbell, Stefano Stabellini

On Wed, 18 Jun 2014, Julien Grall wrote:
> On 18/06/14 19:14, Stefano Stabellini wrote:
> > > For non-PCI passthrough this size will unlikely happen. We always map a
> > > matter of few pages per-device.
> > > 
> > > I think this size is enough for the time being. I plan to revisit it for
> > > PCI passthrough where we will be able to allocate some of them after 4G.
> > 
> > Modern GPUs can easily exceed 512MB and they work on ARM.
> 
> If it's like on x86, passthrough GPUs may also require firmware. At least it
> looks like the case with some configuration on the Arndale, which doesn't have
> any SMMU.
> 
> I'd like to make this series as simple as possible so we can get "quickly" a
> working solution to passthrough device with small amount of MMIO region.
> 
> So if you don't mind I suggest to bump the amount of MMIO region to 1GB, I may
> need to shrink the first RAM bank, and think about bigger region support in a
> follow-up series.

OK

* Re: [RFC 04/19] xen/arm: route_irq_to_guest: Check validity of the IRQ
  2014-06-18 18:52   ` Stefano Stabellini
@ 2014-06-18 19:03     ` Julien Grall
  0 siblings, 0 replies; 122+ messages in thread
From: Julien Grall @ 2014-06-18 19:03 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: xen-devel, stefano.stabellini, ian.campbell, tim

Hi Stefano,

On 18/06/14 19:52, Stefano Stabellini wrote:
>> +/* Route an IRQ to a specific guest.
>> + * For now the vIRQ is equal to the pIRQ and only SPIs are routabled to
>> + * the guest.
>> + */
>>   int route_irq_to_guest(struct domain *d, unsigned int irq,
>>                          const char * devname)
>>   {
>> @@ -369,6 +373,20 @@ int route_irq_to_guest(struct domain *d, unsigned int irq,
>>       unsigned long flags;
>>       int retval = 0;
>>
>> +    if ( !is_routable_irq(irq) )
>> +    {
>> +        dprintk(XENLOG_G_ERR, "the IRQ%u is not routable\n", irq);
>> +        return -EINVAL;
>> +    }
>> +
>> +    if ( irq > vgic_num_irqs(d) )
>> +    {
>> +        dprintk(XENLOG_G_ERR,
>> +                "the IRQ number %u is too high for domain %u (max = %u)\n",
>> +                irq, d->domain_id, vgic_num_irqs(d));
>> +        return -EINVAL;
>> +    }
>
> I think it makes sense to move the "irq > vgic_num_irqs(d)" check
> within is_routable_irq.

is_routable_irq checks whether Xen is actually able to route the IRQ to 
a guest, whereas the "irq > vgic_num_irqs(d)" check is only here because 
we assume virq == pirq.

I suspect we will have to handle virq != pirq sooner or later, because 
unconditionally allocating 1000 irq_pending structures per guest is a 
waste of memory.

Furthermore, I will use is_routable_irq in other places (see patches 
#6 and #9) where we don't necessarily have the domain at hand.
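
To make the distinction concrete, here is a minimal standalone sketch of
the two checks kept as separate predicates. The domain structure and the
name irq_fits_in_vgic are hypothetical, and the sketch treats the vGIC
IRQ count as an exclusive upper bound, which is an assumption on my side
rather than what the quoted hunk does.

```c
#include <assert.h>
#include <stdbool.h>

/* SGIs and PPIs occupy IRQs 0..31; SPIs start at 32. */
#define NR_LOCAL_IRQS 32u

/* Hypothetical, trimmed-down domain: only the field the check needs. */
struct domain_sketch {
    unsigned int nr_spis;   /* plays the role of d->arch.vgic.nr_lines */
};

/* Hardware-level question: can Xen route this IRQ to any guest at all? */
static bool is_routable_irq(unsigned int irq)
{
    return irq >= NR_LOCAL_IRQS;
}

/* Per-domain question, only meaningful while vIRQ == pIRQ. */
static bool irq_fits_in_vgic(const struct domain_sketch *d, unsigned int irq)
{
    return irq < d->nr_spis + NR_LOCAL_IRQS;
}
```

Keeping the first predicate free of any domain argument is what allows
it to be reused in paths that have no domain to hand.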

Regards,

-- 
Julien Grall

* Re: [RFC 05/19] xen/arm: Release IRQ routed to a domain when it's destroying
  2014-06-18 18:54         ` Julien Grall
@ 2014-06-18 19:06           ` Stefano Stabellini
  2014-06-18 19:09             ` Julien Grall
  0 siblings, 1 reply; 122+ messages in thread
From: Stefano Stabellini @ 2014-06-18 19:06 UTC (permalink / raw)
  To: Julien Grall
  Cc: xen-devel, tim, stefano.stabellini, ian.campbell, Stefano Stabellini

On Wed, 18 Jun 2014, Julien Grall wrote:
> On 18/06/14 19:48, Stefano Stabellini wrote:
> > > I can add an ASSERT(irq_get_domain(desc)->is_dying) here...
> > 
> > The ASSERT is a good idea.
> > 
> > Given that the domain has been descheduled, gic_update_one_lr won't work
> > but you can read the saved lr (pending_irq->lr) from v->arch.gic_lr.
> > You can obtain the target vcpu calling vgic_get_target_vcpu.
> > You only need to write to GICC_DIR if (gic_lr &
> > (GICH_LR_ACTIVE|GICH_LR_PENDING)).
> > gic_remove_from_queues should still work.
> 
> Unless we assume a virq == pirq  we can't retrieve the irq_pending structure
> via an irq_desc.

That's annoying. We might have to fix this sooner or later.


> Also this may has been free earlier, even tho for now SPIs
> are stored directly in arch_domain.
> 
> I prefer to keep the check IRQ_INPROGRESS, and call gic_update_lrs before
> unscheduled the VCPU.

I was thinking about this alternative, and it is better (see below).


> BTW, is it the case?
 
At the moment gic_update_lrs is called on hypervisor entry, so whenever
the vcpu gets interrupted, gic_update_lrs is called.  If
gic_reset_guest_irq is called always after all the guest vcpus have been
descheduled, IRQ_INPROGRESS is surely up to date and you don't need to
go through the lrs.
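
As a rough standalone model of that reasoning (not the actual patch):
the types, the flag values and deactivate_irq below are hypothetical
stand-ins, with deactivate_irq taking the place of the GICC_DIR write.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed flag values, for illustration only. */
#define IRQ_INPROGRESS (1u << 0)
#define IRQ_GUEST      (1u << 1)

struct domain_sketch {
    bool is_dying;
};

struct irq_desc_sketch {
    unsigned int irq;
    unsigned int status;
    const struct domain_sketch *owner;
};

/* Stand-in for the GICC_DIR write: records the last deactivated IRQ. */
static int last_deactivated = -1;

static void deactivate_irq(unsigned int irq)
{
    last_deactivated = (int)irq;
}

static void reset_guest_irq(const struct irq_desc_sketch *desc)
{
    assert(desc->status & IRQ_GUEST);
    /*
     * The domain is dying, so all its vCPUs have been descheduled and
     * IRQ_INPROGRESS reflects the final state of the IRQ.
     */
    assert(desc->owner->is_dying);

    if ( desc->status & IRQ_INPROGRESS )
        deactivate_irq(desc->irq);
}
```

The second assertion is the one that documents why the status flag can
be trusted without walking the list registers.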



> > Also I wonder if you need to call gic_reset_guest_irq before
> > desc->handler->shutdown.
> > The specification states (4.3.5):
> > 
> > 'Disabling an interrupt only disables the forwarding of the interrupt
> > from the Distributor to any CPU interface. It does not prevent the
> > interrupt from changing state, for example becoming pending, or active
> > and pending if it is already active.'
> > 
> > So from the text above I think that EOIing an interrupt that has been
> > disabled at the GICD level should work, but it is not 100% clear.
> 
> On oneshot IRQ, Linux is disabling the IRQ before EOIing it. This will avoid
> to receive spurious interrupt.
> 
> Here we need to do the same thing otherwise we may receive spurious interrupt.
> 

Good to know

* Re: [RFC 05/19] xen/arm: Release IRQ routed to a domain when it's destroying
  2014-06-18 19:06           ` Stefano Stabellini
@ 2014-06-18 19:09             ` Julien Grall
  0 siblings, 0 replies; 122+ messages in thread
From: Julien Grall @ 2014-06-18 19:09 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: xen-devel, stefano.stabellini, ian.campbell, tim



On 18/06/14 20:06, Stefano Stabellini wrote:
> At the moment gic_update_lrs is called on hypervisor entry, so whenever
> the vcpu gets interrupted, gic_update_lrs is called.  If
> gic_reset_guest_irq is called always after all the guest vcpus have been
> descheduled, IRQ_INPROGRESS is surely up to date and you don't need to
> go through the lrs.

Thanks. I will add a comment in the code explaining why IRQ_INPROGRESS 
is valid here.

-- 
Julien Grall

* Re: [RFC 06/19] xen/arm: Implement hypercall PHYSDEVOP_map_pirq
  2014-06-16 16:17 ` [RFC 06/19] xen/arm: Implement hypercall PHYSDEVOP_map_pirq Julien Grall
@ 2014-06-18 19:24   ` Stefano Stabellini
  2014-06-19 11:39     ` Julien Grall
  0 siblings, 1 reply; 122+ messages in thread
From: Stefano Stabellini @ 2014-06-18 19:24 UTC (permalink / raw)
  To: Julien Grall; +Cc: xen-devel, stefano.stabellini, ian.campbell, tim

On Mon, 16 Jun 2014, Julien Grall wrote:
> The physdev sub-hypercall PHYSDEVOP_map_pirq allows the toolstack to route
> a physical IRQ to the guest (via the config option "irqs" for xl).
> For now, we only allow SPIs to be mapped to the guest.
> The type MAP_PIRQ_TYPE_GSI is used for this purpose.
> 
> The virtual IRQ number is equal to the physical one. This avoids adding
> logic in Xen to allocate the vIRQ number. The drawback is that we
> unconditionally allocate the same number of SPIs as the host. This value will
> never be more than 1024 with GICv2.
> 
> Signed-off-by: Julien Grall <julien.grall@linaro.org>
> 
> ---
>     I'm wondering if we should introduce an alias of MAP_PIRQ_TYPE_GSI
>     for ARM. It would be less confusing for the user.
> ---
>  xen/arch/arm/physdev.c |   77 ++++++++++++++++++++++++++++++++++++++++++++++--
>  xen/arch/arm/vgic.c    |    5 +---
>  2 files changed, 76 insertions(+), 6 deletions(-)
> 
> diff --git a/xen/arch/arm/physdev.c b/xen/arch/arm/physdev.c
> index 61b4a18..d17589c 100644
> --- a/xen/arch/arm/physdev.c
> +++ b/xen/arch/arm/physdev.c
> @@ -8,13 +8,86 @@
>  #include <xen/types.h>
>  #include <xen/lib.h>
>  #include <xen/errno.h>
> +#include <xen/iocap.h>
> +#include <xen/guest_access.h>
> +#include <xsm/xsm.h>
> +#include <asm/current.h>
>  #include <asm/hypercall.h>
> +#include <public/physdev.h>
> +
> +static int physdev_map_pirq(domid_t domid, int type, int index, int *pirq_p)
> +{
> +    struct domain *d;
> +    int ret;
> +    int irq = index;
> +
> +    d = rcu_lock_domain_by_any_id(domid);
> +    if ( d == NULL )
> +        return -ESRCH;
> +
> +    ret = xsm_map_domain_pirq(XSM_TARGET, d);
> +    if ( ret )
> +        goto free_domain;
> +
> +    /* For now we only suport GSI */
> +    if ( type != MAP_PIRQ_TYPE_GSI )
> +    {
> +        ret = -EINVAL;
> +        dprintk(XENLOG_G_ERR, "dom%u: wrong map_pirq type 0x%x\n",
> +                d->domain_id, type);
> +        goto free_domain;
> +    }
> +
> +    if ( !is_routable_irq(irq) )
> +    {
> +        ret = -EINVAL;
> +        dprintk(XENLOG_G_ERR, "IRQ%u is not routable to a guest\n", irq);
> +        goto free_domain;
> +    }
> +
> +    ret = -EPERM;
> +    if ( !irq_access_permitted(current->domain, irq) )
> +        goto free_domain;
> +
> +    ret = route_irq_to_guest(d, irq, "routed IRQ");
> +
> +    /* GSIs are mapped 1:1 to the guest */
> +    if ( !ret )
> +        *pirq_p = irq;
> +
> +free_domain:
> +    rcu_unlock_domain(d);
> +
> +    return ret;
> +}
>  
>  
>  int do_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>  {
> -    printk("%s %d cmd=%d: not implemented yet\n", __func__, __LINE__, cmd);
> -    return -ENOSYS;
> +    int ret;
> +
> +    switch ( cmd )
> +    {
> +    case PHYSDEVOP_map_pirq:
> +        {
> +            physdev_map_pirq_t map;
> +
> +            ret = -EFAULT;
> +            if ( copy_from_guest(&map, arg, 1) != 0 )
> +                break;
> +
> +            ret = physdev_map_pirq(map.domid, map.type, map.index, &map.pirq);
> +
> +            if ( __copy_to_guest(arg, &map, 1) )
> +                ret = -EFAULT;
> +        }
> +        break;
> +    default:
> +        ret = -ENOSYS;
> +        break;
> +    }
> +
> +    return ret;
>  }
>  
>  /*
> diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c
> index e451324..c18b2ca 100644
> --- a/xen/arch/arm/vgic.c
> +++ b/xen/arch/arm/vgic.c
> @@ -82,10 +82,7 @@ int domain_vgic_init(struct domain *d)
>      /* Currently nr_lines in vgic and gic doesn't have the same meanings
>       * Here nr_lines = number of SPIs
>       */
> -    if ( is_hardware_domain(d) )
> -        d->arch.vgic.nr_lines = gic_number_lines() - 32;
> -    else
> -        d->arch.vgic.nr_lines = 0; /* We don't need SPIs for the guest */
> +    d->arch.vgic.nr_lines = gic_number_lines() - 32;
>  
>      d->arch.vgic.shared_irqs =
>          xzalloc_array(struct vgic_irq_rank, DOMAIN_NR_RANKS(d));

I see what you mean about virq != pirq.

It seems to me that setting d->arch.vgic.nr_lines = gic_number_lines() -
32 for the hardware domain is OK, but it is really a waste for the
others. We could find a way to pass down the info about how many SPIs we
need from libxl. Or we could delay the vgic allocations until the first
SPI is assigned to the domU.

Similarly to the MMIO hole sizing, I don't think that it would be a
requirement for this patch series but it is something to keep in mind.
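
One possible shape for the "delay the allocation" option, as a purely
illustrative standalone sketch: the structures and vgic_get_spi are
hypothetical names, and calloc stands in for whatever allocator the
hypervisor would really use.

```c
#include <assert.h>
#include <stdlib.h>

/* SGIs and PPIs occupy IRQs 0..31; SPIs start at 32. */
#define NR_LOCAL_IRQS 32u

/* Hypothetical per-SPI state. */
struct pending_irq_sketch {
    unsigned int irq;
    int inflight;
};

/* Hypothetical vGIC state: the SPI array starts out unallocated. */
struct vgic_sketch {
    unsigned int nr_spis;               /* SPIs the vGIC could expose */
    struct pending_irq_sketch *spis;    /* NULL until the first SPI is assigned */
};

/*
 * Return the state for SPI `irq`, allocating the whole array on first
 * use. Returns NULL if `irq` is not an SPI of this vGIC or on
 * allocation failure.
 */
static struct pending_irq_sketch *vgic_get_spi(struct vgic_sketch *v,
                                               unsigned int irq)
{
    if ( irq < NR_LOCAL_IRQS || irq - NR_LOCAL_IRQS >= v->nr_spis )
        return NULL;

    if ( v->spis == NULL )
    {
        v->spis = calloc(v->nr_spis, sizeof(*v->spis));
        if ( v->spis == NULL )
            return NULL;
    }

    v->spis[irq - NR_LOCAL_IRQS].irq = irq;
    return &v->spis[irq - NR_LOCAL_IRQS];
}
```

A domU that never gets an SPI assigned then pays nothing for the array,
at the cost of an allocation (and a failure path) at assignment time.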

* Re: [RFC 12/19] xen/passthrough: iommu_deassign_device_dt: By default reassign device to nobody
  2014-06-16 16:17 ` [RFC 12/19] xen/passthrough: iommu_deassign_device_dt: By default reassign device to nobody Julien Grall
@ 2014-06-18 19:28   ` Stefano Stabellini
  2014-07-03 11:48   ` Ian Campbell
  1 sibling, 0 replies; 122+ messages in thread
From: Stefano Stabellini @ 2014-06-18 19:28 UTC (permalink / raw)
  To: Julien Grall; +Cc: xen-devel, stefano.stabellini, ian.campbell, tim

On Mon, 16 Jun 2014, Julien Grall wrote:
> Currently, when the device is deassigned from a domain, we directly reassign
> it to DOM0.
> 
> As the device may not have been correctly reset, this may lead to corrupting or
> exposing some part of DOM0 memory.
> 
> If Xen reassigns the device to "nobody", it may receive some global/context
> faults because the transaction has failed (indeed the context has been
> marked invalid).
> 
> DOM0 will have to issue a hypercall to assign the device to itself if it
> wants to use it.
> 
> Signed-off-by: Julien Grall <julien.grall@linaro.org>

Makes sense. The toolstack knows if it is able to reset the device.
If so, it could issue a hypercall to give it back to dom0.


>  xen/drivers/passthrough/arm/smmu.c    |    7 ++++---
>  xen/drivers/passthrough/device_tree.c |    8 +++-----
>  2 files changed, 7 insertions(+), 8 deletions(-)
> 
> diff --git a/xen/drivers/passthrough/arm/smmu.c b/xen/drivers/passthrough/arm/smmu.c
> index f4eb2a2..b25034e 100644
> --- a/xen/drivers/passthrough/arm/smmu.c
> +++ b/xen/drivers/passthrough/arm/smmu.c
> @@ -1245,8 +1245,8 @@ static int arm_smmu_reassign_dt_dev(struct domain *s, struct domain *t,
>  {
>      int ret = 0;
>  
> -    /* Don't allow remapping on other domain than hwdom */
> -    if ( t != hardware_domain )
> +    /* Allow remapping either on the hardware domain or to nothing */
> +    if ( t && t != hardware_domain )
>          return -EPERM;
>  
>      if ( t == s )
> @@ -1256,7 +1256,8 @@ static int arm_smmu_reassign_dt_dev(struct domain *s, struct domain *t,
>      if ( ret )
>          return ret;
>  
> -    ret = arm_smmu_attach_dev(t, dev);
> +    if ( t )
> +        ret = arm_smmu_attach_dev(t, dev);
>  
>      return ret;
>  }
> diff --git a/xen/drivers/passthrough/device_tree.c b/xen/drivers/passthrough/device_tree.c
> index afb4dfc..8a4bc69 100644
> --- a/xen/drivers/passthrough/device_tree.c
> +++ b/xen/drivers/passthrough/device_tree.c
> @@ -75,14 +75,12 @@ int iommu_deassign_dt_device(struct domain *d, struct dt_device_node *dev)
>  
>      spin_lock(&dtdevs_lock);
>  
> -    rc = hd->platform_ops->reassign_dt_device(d, hardware_domain, dev);
> +    rc = hd->platform_ops->reassign_dt_device(d, NULL, dev);
>      if ( rc )
>          goto fail;
>  
> -    list_del(&dev->domain_list);
> -
> -    dt_device_set_used_by(dev, hardware_domain->domain_id);
> -    list_add(&dev->domain_list, &domain_hvm_iommu(hardware_domain)->dt_devices);
> +    list_del_init(&dev->domain_list);
> +    dt_device_set_used_by(dev, DOMID_IO);
>  
>  fail:
>      spin_unlock(&dtdevs_lock);
> -- 
> 1.7.10.4
> 

* Re: [RFC 09/19] xen/dts: Add hypercalls to retrieve device node information
  2014-06-16 16:17 ` [RFC 09/19] xen/dts: Add hypercalls to retrieve device node information Julien Grall
@ 2014-06-18 19:38   ` Stefano Stabellini
  2014-06-19 11:58     ` Julien Grall
  2014-07-03 11:33   ` Ian Campbell
  1 sibling, 1 reply; 122+ messages in thread
From: Stefano Stabellini @ 2014-06-18 19:38 UTC (permalink / raw)
  To: Julien Grall
  Cc: ian.campbell, Stefano Stabellini, tim, Ian Jackson,
	stefano.stabellini, xen-devel

On Mon, 16 Jun 2014, Julien Grall wrote:
> DOM0 doesn't provide a generic way to get information about a device tree
> node. If we want to do it in userspace, we will have to duplicate the
> MMIO/IRQ translation from Xen. Therefore, we can let the hypervisor
> do the job for us and get nearly all the information.
> 
> This new physdev operation will let the toolstack get the IRQ/MMIO regions
> and the compatible string. Most device nodes can be described with only
> these 3 items. If we need to add specific properties, then we will have
> to implement it in userspace (one idea was to use a configuration file
> describing the additional properties).
> 
> The hypercall is divided in 4 parts:
>     - GET_INFO: get the number of IRQs/MMIO regions and the size of the
>     compatible string;
>     - GET_IRQ: get the IRQ by index. If the IRQ is not routable (i.e. not
>     an SPI), the errno will be set to -EINVAL;
>     - GET_MMIO: get the MMIO range by index. If the base and the size
>     are not page-aligned, the errno will be set to -EINVAL;
>     - GET_COMPAT: get the compatible string
> 
> All the information will be accessible if the device is not used by Xen
> and protected by an IOMMU.
> 
> Signed-off-by: Julien Grall <julien.grall@linaro.org>
> Cc: Ian Jackson <ian.jackson@eu.citrix.com>
> Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
> Cc: Ian Campbell <ian.campbell@citrix.com>
> 
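
For reference, the GET_MMIO rule described in the commit message (refuse any
range whose base or size is not page-aligned, otherwise return frame numbers)
can be sketched in isolation. This is only an illustration: the function name,
the SKETCH_* macros and the 4K page size are assumptions, not code from the
patch, which uses PAGE_SIZE and paddr_to_pfn(). The addresses in the note
below are made up as well.

```c
#include <errno.h>
#include <stdint.h>

/* Assumption: 4K pages. The patch itself relies on Xen's PAGE_SIZE. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  (UINT64_C(1) << SKETCH_PAGE_SHIFT)

/*
 * Hypothetical helper mirroring the GET_MMIO check: return -EINVAL if the
 * base or the size is not page-aligned, otherwise convert both to frames.
 */
static int mmio_range_to_frames(uint64_t addr, uint64_t size,
                                uint64_t *mfn, uint64_t *nr_mfn)
{
    if ( (addr & (SKETCH_PAGE_SIZE - 1)) || (size & (SKETCH_PAGE_SIZE - 1)) )
        return -EINVAL;

    *mfn = addr >> SKETCH_PAGE_SHIFT;
    *nr_mfn = size >> SKETCH_PAGE_SHIFT;

    return 0;
}
```

A base of 0x1c090000 with a size of 0x1000 is accepted, while a base of
0x1c090100 is refused, since mapping it would expose whatever else shares
the page.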

I know that we talked about this face to face already, but this troubles
me: is it really so uncommon for a device tree node corresponding to a
device to have a key-value pair that is critical for the initialization
of the device?

The ACPI on ARM people are discussing how to introduce these key-value
pairs in ACPI too, so I wonder if we can really dismiss them so easily
for device assignment.

Could Xen discard everything that it knows cannot be passed to the guest
(information on clocks and phandles for example), but return to the
toolstack other harmless key-value pairs, such as device specific
configurations? Maybe we could introduce PHYSDEVOP_DTDEV_GET_KEYVALUE.
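
To make the idea concrete, here is a minimal sketch of the filtering step,
assuming a simple deny-list keyed on the property name. Nothing below exists
in the series: the function name and the list contents are hypothetical, and
the list is certainly incomplete.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical deny-list: properties whose value refers to another node
 * and therefore makes no sense inside the guest device tree.
 */
static const char *const dt_prop_denied[] = {
    "clocks",
    "phandle",
    "linux,phandle",
    "interrupt-parent",
};

/* Return true if the property could be handed back to the toolstack. */
static bool dt_prop_is_exportable(const char *name)
{
    size_t i;

    for ( i = 0; i < sizeof(dt_prop_denied) / sizeof(dt_prop_denied[0]); i++ )
        if ( !strcmp(name, dt_prop_denied[i]) )
            return false;

    return true;
}
```

A name-based list like this only catches the properties it knows about, so
it would need to be kept in sync with the bindings.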


> I'm wondering if we can let the toolstack retrieve device information for
> every device not used by Xen. This would allow embedded guys using passthrough
> "easily" when their devices are not under an IOMMU.
> ---
>  tools/libxc/xc_physdev.c      |  129 +++++++++++++++++++++++++++++++++++++++++
>  tools/libxc/xenctrl.h         |   36 ++++++++++++
>  xen/arch/arm/physdev.c        |   16 +++++
>  xen/common/device_tree.c      |  112 +++++++++++++++++++++++++++++++++++
>  xen/include/public/physdev.h  |   40 +++++++++++++
>  xen/include/xen/device_tree.h |    3 +
>  6 files changed, 336 insertions(+)
> 
> diff --git a/tools/libxc/xc_physdev.c b/tools/libxc/xc_physdev.c
> index cf02d85..405fe78 100644
> --- a/tools/libxc/xc_physdev.c
> +++ b/tools/libxc/xc_physdev.c
> @@ -108,3 +108,132 @@ int xc_physdev_unmap_pirq(xc_interface *xch,
>      return rc;
>  }
>  
> +int xc_physdev_dtdev_getinfo(xc_interface *xch,
> +                             char *path,
> +                             xc_dtdev_info_t *info)
> +{
> +    int rc;
> +    size_t size = strlen(path);
> +    struct physdev_dtdev_op op;
> +    DECLARE_HYPERCALL_BOUNCE(path, size, XC_HYPERCALL_BUFFER_BOUNCE_IN);
> +
> +    if ( xc_hypercall_bounce_pre(xch, path) )
> +        return -1;
> +
> +    op.op = PHYSDEVOP_DTDEV_GET_INFO;
> +    op.plen = size;
> +    set_xen_guest_handle(op.path, path);
> +
> +    rc = do_physdev_op(xch, PHYSDEVOP_dtdev_op, &op, sizeof(op));
> +
> +    xc_hypercall_bounce_post(xch, path);
> +
> +    if ( !rc )
> +    {
> +        info->num_irqs = op.u.info.num_irqs;
> +        info->num_mmios = op.u.info.num_mmios;
> +        info->compat_len = op.u.info.compat_len;
> +    }
> +
> +    return rc;
> +}
> +
> +int xc_physdev_dtdev_getirq(xc_interface *xch,
> +                            char *path,
> +                            uint32_t index,
> +                            xc_dtdev_irq_t *irq)
> +{
> +    int rc;
> +    size_t size = strlen(path);
> +    struct physdev_dtdev_op op;
> +    DECLARE_HYPERCALL_BOUNCE(path, size, XC_HYPERCALL_BUFFER_BOUNCE_IN);
> +
> +    if ( xc_hypercall_bounce_pre(xch, path) )
> +        return -1;
> +
> +    op.op = PHYSDEVOP_DTDEV_GET_IRQ;
> +    op.plen = size;
> +    op.index = index;
> +    set_xen_guest_handle(op.path, path);
> +
> +    rc = do_physdev_op(xch, PHYSDEVOP_dtdev_op, &op, sizeof(op));
> +
> +    xc_hypercall_bounce_post(xch, path);
> +
> +    if ( !rc )
> +    {
> +        irq->irq = op.u.irq.irq;
> +        irq->type = op.u.irq.type;
> +    }
> +
> +    return rc;
> +}
> +
> +int xc_physdev_dtdev_getmmio(xc_interface *xch,
> +                             char *path,
> +                             uint32_t index,
> +                             xc_dtdev_mmio_t *mmio)
> +{
> +    int rc;
> +    size_t size = strlen(path);
> +    struct physdev_dtdev_op op;
> +    DECLARE_HYPERCALL_BOUNCE(path, size, XC_HYPERCALL_BUFFER_BOUNCE_IN);
> +
> +    if ( xc_hypercall_bounce_pre(xch, path) )
> +        return -1;
> +
> +    op.op = PHYSDEVOP_DTDEV_GET_MMIO;
> +    op.plen = size;
> +    op.index = index;
> +    set_xen_guest_handle(op.path, path);
> +
> +    rc = do_physdev_op(xch, PHYSDEVOP_dtdev_op, &op, sizeof(op));
> +
> +    xc_hypercall_bounce_post(xch, path);
> +
> +    if ( !rc )
> +    {
> +        mmio->mfn = op.u.mmio.mfn;
> +        mmio->nr_mfn = op.u.mmio.nr_mfn;
> +    }
> +
> +    return rc;
> +}
> +
> +int xc_physdev_dtdev_getcompat(xc_interface *xch,
> +                               char *path,
> +                               char *compat,
> +                               uint32_t *clen)
> +{
> +    int rc;
> +    size_t size = strlen(path);
> +    struct physdev_dtdev_op op;
> +    DECLARE_HYPERCALL_BOUNCE(path, size, XC_HYPERCALL_BUFFER_BOUNCE_IN);
> +    DECLARE_HYPERCALL_BOUNCE(compat, *clen, XC_HYPERCALL_BUFFER_BOUNCE_OUT);
> +
> +    if ( xc_hypercall_bounce_pre(xch, path) )
> +        return -1;
> +
> +    rc = -1;
> +    if ( xc_hypercall_bounce_pre(xch, compat) )
> +        goto out;
> +
> +    op.op = PHYSDEVOP_DTDEV_GET_COMPAT;
> +    op.plen = size;
> +    set_xen_guest_handle(op.path, path);
> +
> +    op.u.compat.clen = *clen;
> +    set_xen_guest_handle(op.u.compat.compat, compat);
> +
> +    rc = do_physdev_op(xch, PHYSDEVOP_dtdev_op, &op, sizeof(op));
> +
> +    if ( !rc )
> +        *clen = op.u.compat.clen;
> +
> +    xc_hypercall_bounce_post(xch, compat);
> +
> +out:
> +    xc_hypercall_bounce_post(xch, path);
> +
> +    return rc;
> +}
> diff --git a/tools/libxc/xenctrl.h b/tools/libxc/xenctrl.h
> index b55d857..5ad2d65 100644
> --- a/tools/libxc/xenctrl.h
> +++ b/tools/libxc/xenctrl.h
> @@ -1143,6 +1143,42 @@ int xc_physdev_pci_access_modify(xc_interface *xch,
>                                   int func,
>                                   int enable);
>  
> +typedef struct xc_dtdev_info {
> +    uint32_t num_irqs;
> +    uint32_t num_mmios;
> +    uint32_t compat_len;
> +} xc_dtdev_info_t;
> +
> +int xc_physdev_dtdev_getinfo(xc_interface *xch,
> +                             char *path,
> +                             xc_dtdev_info_t *info);
> +
> +typedef struct xc_dtdev_irq {
> +    uint32_t irq;
> +    /* TODO: Maybe an enum here? */
> +    uint32_t type;
> +} xc_dtdev_irq_t;
> +
> +int xc_physdev_dtdev_getirq(xc_interface *xch,
> +                            char *path,
> +                            uint32_t index,
> +                            xc_dtdev_irq_t *irq);
> +
> +typedef struct xc_dtdev_mmio {
> +    uint64_t mfn;
> +    uint64_t nr_mfn;
> +} xc_dtdev_mmio_t;
> +
> +int xc_physdev_dtdev_getmmio(xc_interface *xch,
> +                             char *path,
> +                             uint32_t index,
> +                             xc_dtdev_mmio_t *mmio);
> +
> +int xc_physdev_dtdev_getcompat(xc_interface *xch,
> +                               char *path,
> +                               char *compat,
> +                               uint32_t *clen);
> +
>  int xc_readconsolering(xc_interface *xch,
>                         char *buffer,
>                         unsigned int *pnr_chars,
> diff --git a/xen/arch/arm/physdev.c b/xen/arch/arm/physdev.c
> index d17589c..11c5b59 100644
> --- a/xen/arch/arm/physdev.c
> +++ b/xen/arch/arm/physdev.c
> @@ -82,6 +82,22 @@ int do_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>                  ret = -EFAULT;
>          }
>          break;
> +
> +    case PHYSDEVOP_dtdev_op:
> +        {
> +            physdev_dtdev_op_t info;
> +
> +            ret = -EFAULT;
> +            if ( copy_from_guest(&info, arg, 1) != 0 )
> +                break;
> +
> +            /* TODO: Add xsm */
> +            ret = dt_do_physdev_op(&info);
> +
> +            if ( __copy_to_guest(arg, &info, 1) )
> +                ret = -EFAULT;
> +        }
> +        break;
>      default:
>          ret = -ENOSYS;
>          break;
> diff --git a/xen/common/device_tree.c b/xen/common/device_tree.c
> index fd95307..482ff8f 100644
> --- a/xen/common/device_tree.c
> +++ b/xen/common/device_tree.c
> @@ -24,6 +24,7 @@
>  #include <xen/cpumask.h>
>  #include <xen/ctype.h>
>  #include <xen/lib.h>
> +#include <xen/irq.h>
>  
>  struct dt_early_info __initdata early_info;
>  const void *device_tree_flattened;
> @@ -2021,6 +2022,117 @@ void __init dt_unflatten_host_device_tree(void)
>      dt_alias_scan();
>  }
>  
> +
> +/* TODO: I think we need a bit of caching in each device node to get the
> + * information in constant time.
> + * For now we need to translate IRQs/MMIOs every time
> + */
> +int dt_do_physdev_op(physdev_dtdev_op_t *info)
> +{
> +    struct dt_device_node *dev;
> +    int ret;
> +
> +    ret = dt_find_node_by_gpath(info->path, info->plen, &dev);
> +    if ( ret )
> +        return ret;
> +
> +    /* Only allow access to protected device and not used by Xen */
> +    if ( !dt_device_is_protected(dev) || dt_device_used_by(dev) == DOMID_XEN )
> +        return -EACCES;
> +
> +    switch ( info->op )
> +    {
> +    case PHYSDEVOP_DTDEV_GET_INFO:
> +        {
> +            const struct dt_property *compat;
> +
> +            compat = dt_find_property(dev, "compatible", NULL);
> +            /* Hopefully, this case should never happen, print error
> +             * if it occurs
> +             */
> +            if ( !compat )
> +            {
> +                dprintk(XENLOG_G_ERR, "Unable to find compatible node for %s\n",
> +                        dt_node_full_name(dev));
> +                return -EBADFD;
> +            }
> +
> +            info->u.info.num_irqs = dt_number_of_irq(dev);
> +            info->u.info.num_mmios = dt_number_of_address(dev);
> +            info->u.info.compat_len = compat->length;
> +        }
> +        break;
> +
> +    case PHYSDEVOP_DTDEV_GET_IRQ:
> +        {
> +            struct dt_irq irq;
> +
> +            ret = dt_device_get_irq(dev, info->index, &irq);
> +            if ( ret )
> +                return ret;
> +
> +            /* Check if Xen is able to route the IRQ to the guest */
> +            if ( !is_routable_irq(irq.irq) )
> +                return -EINVAL;
> +
> +            info->u.irq.irq = irq.irq;
> +            /* TODO: Translate the type into an exportable value */
> +            info->u.irq.type = irq.type;
> +        }
> +        break;
> +
> +    case PHYSDEVOP_DTDEV_GET_MMIO:
> +        {
> +            uint64_t addr, size;
> +
> +            ret = dt_device_get_address(dev, info->index, &addr, &size);
> +            if ( ret )
> +                return ret;
> +
> +            /* Make sure the address and the size are page aligned.
> +             * If not, we may passthrough MMIO regions which may belong
> +             * to another device. Deny it!
> +             */
> +            if ( (addr & (PAGE_SIZE - 1)) || (size & (PAGE_SIZE - 1)) )
> +            {
> +                dprintk(XENLOG_ERR, "%s: contain non-page aligned range:"
> +                        " addr = 0x%"PRIx64" size = 0x%"PRIx64"\n",
> +                        dt_node_full_name(dev), addr, size);
> +                return -EINVAL;
> +            }
> +
> +            info->u.mmio.mfn = paddr_to_pfn(addr);
> +            info->u.mmio.nr_mfn = paddr_to_pfn(size);
> +        }
> +        break;
> +
> +    case PHYSDEVOP_DTDEV_GET_COMPAT:
> +        {
> +            const struct dt_property *compat;
> +
> +            compat = dt_find_property(dev, "compatible", NULL);
> +            if ( !compat || !compat->length )
> +                return -ENOENT;
> +
> +            if ( info->u.compat.clen < compat->length )
> +                return -ENOSPC;
> +
> +            if ( copy_to_guest(info->u.compat.compat, compat->value,
> +                               compat->length) != 0 )
> +                return -EFAULT;
> +
> +            info->u.compat.clen = compat->length;
> +        }
> +        break;
> +
> +    default:
> +        return -ENOSYS;
> +    }
> +
> +
> +    return 0;
> +}
> +
>  /*
>   * Local variables:
>   * mode: C
> diff --git a/xen/include/public/physdev.h b/xen/include/public/physdev.h
> index d547928..23cf673 100644
> --- a/xen/include/public/physdev.h
> +++ b/xen/include/public/physdev.h
> @@ -337,6 +337,46 @@ struct physdev_dbgp_op {
>  typedef struct physdev_dbgp_op physdev_dbgp_op_t;
>  DEFINE_XEN_GUEST_HANDLE(physdev_dbgp_op_t);
>  
> +/* Retrieve informations about a device node */
> +#define PHYSDEVOP_dtdev_op        32
> +
> +struct physdev_dtdev_op {
> +    /* IN */
> +    uint32_t plen;                  /* Length of the path */
> +    XEN_GUEST_HANDLE(char) path;    /* Path to the device tree node */
> +#define PHYSDEVOP_DTDEV_GET_INFO        0
> +#define PHYSDEVOP_DTDEV_GET_IRQ         1
> +#define PHYSDEVOP_DTDEV_GET_MMIO        2
> +#define PHYSDEVOP_DTDEV_GET_COMPAT      3
> +    uint8_t op;
> +    uint32_t pad0:24;
> +    uint32_t index;                 /* Index for the IRQ/MMIO to retrieve */
> +    /* OUT */
> +    union {
> +        struct {
> +            uint32_t num_irqs;      /* Number of IRQs */
> +            uint32_t num_mmios;     /* Number of MMIOs */
> +            uint32_t compat_len;    /* Length of the compatible string */
> +        } info;
> +        struct {
> +            /* TODO: Do we need to handle MSI-X? */
> +            uint32_t irq;           /* IRQ number */
> +            /* TODO: Describe with defines the IRQ type */
> +            uint32_t type;          /* IRQ type (i.e edge, level...) */
> +        } irq;
> +        struct {
> +            uint64_t mfn;
> +            uint64_t nr_mfn;
> +        } mmio;
> +        struct {
> +            uint32_t clen;          /* IN: Size of buffer. OUT: Size copied */
> +            XEN_GUEST_HANDLE_64(char) compat;
> +        } compat;
> +    } u;
> +};
> +typedef struct physdev_dtdev_op physdev_dtdev_op_t;
> +DEFINE_XEN_GUEST_HANDLE(physdev_dtdev_op_t);
> +
>  /*
>   * Notify that some PIRQ-bound event channels have been unmasked.
>   * ** This command is obsolete since interface version 0x00030202 and is **
> diff --git a/xen/include/xen/device_tree.h b/xen/include/xen/device_tree.h
> index bb33e54..3d5101c 100644
> --- a/xen/include/xen/device_tree.h
> +++ b/xen/include/xen/device_tree.h
> @@ -12,6 +12,7 @@
>  
>  #include <asm/byteorder.h>
>  #include <public/xen.h>
> +#include <public/physdev.h>
>  #include <xen/init.h>
>  #include <xen/string.h>
>  #include <xen/types.h>
> @@ -711,6 +712,8 @@ int dt_parse_phandle_with_args(const struct dt_device_node *np,
>                                 const char *cells_name, int index,
>                                 struct dt_phandle_args *out_args);
>  
> +int dt_do_physdev_op(physdev_dtdev_op_t *info);
> +
>  #endif /* __XEN_DEVICE_TREE_H */
>  
>  /*
> -- 
> 1.7.10.4
> 


* Re: [RFC 03/19] xen/arm: follow-up to allow DOM0 manage IRQ and MMIO
  2014-06-16 16:17 ` [RFC 03/19] xen/arm: follow-up to allow DOM0 manage IRQ and MMIO Julien Grall
@ 2014-06-18 20:21   ` Stefano Stabellini
  2014-06-18 20:32     ` Julien Grall
  0 siblings, 1 reply; 122+ messages in thread
From: Stefano Stabellini @ 2014-06-18 20:21 UTC (permalink / raw)
  To: Julien Grall; +Cc: xen-devel, stefano.stabellini, ian.campbell, tim

On Mon, 16 Jun 2014, Julien Grall wrote:
> The commit 33233c2 "arch/arm: domain build: let dom0 access I/O memory of
> mapped series" fill the iomem_caps to allow DOM0 managing MMIO of mapped
> device.
> 
> A device can be disabled (i.e by adding a property status="disabled" in the
> device tree) because the user may want to passthrough this device to a guest.
> This will avoid DOM0 loading (and few minutes after unloading) the driver to
> handle this device.
> 
> Even though, we don't want to let DOM0 using this device, the domain needs
> to be able to manage the MMIO/IRQ range. Rework the function map_device
> (renamed into handle_device) to:

Is that so the toolstack in dom0 could re-assign the device to another
guest?

In any case it would be good to write down exactly why DOM0 would still
need to be able to manage the MMIO/IRQ range as a comment in the code.


> * For a given device node:
>     - Give permission to manage IRQ/MMIO for this device
>     - Retrieve the IRQ configuration (i.e edge/level) from the device
>     tree
> * For available device (i.e status != disabled in the DT)
>     - Assign the device to the guest if it's protected by an IOMMU
>     - Map the IRQs and MMIOs regions to the guest
> 
> Signed-off-by: Julien Grall <julien.grall@linaro.org>
> ---
>  xen/arch/arm/domain_build.c |   66 ++++++++++++++++++++++++++++---------------
>  1 file changed, 44 insertions(+), 22 deletions(-)
> 
> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
> index c3783cf..6a711cc 100644
> --- a/xen/arch/arm/domain_build.c
> +++ b/xen/arch/arm/domain_build.c
> @@ -680,8 +680,14 @@ static int make_timer_node(const struct domain *d, void *fdt,
>      return res;
>  }
>  
> -/* Map the device in the domain */
> -static int map_device(struct domain *d, struct dt_device_node *dev)
> +/* For a given device node:
> + *  - Give permission to the guest to manage IRQ and MMIO range
> + *  - Retrieve the IRQ configuration (i.e edge/level) from device tree
> + * When the device is available:
> + *  - Assign the device to the guest if it's protected by an IOMMU
> + *  - Map the IRQs and iomem regions to DOM0
> + */
> +static int handle_device(struct domain *d, struct dt_device_node *dev, bool_t map)

This is confusing.
If map == true then we get a similar behavior as before. If map ==
false, what is the function supposed to achieve? Only permit IRQ access?
Could we split it into a separate function?


>  {
>      unsigned int nirq;
>      unsigned int naddr;
> @@ -694,9 +700,10 @@ static int map_device(struct domain *d, struct dt_device_node *dev)
>      nirq = dt_number_of_irq(dev);
>      naddr = dt_number_of_address(dev);
>  
> -    DPRINT("%s nirq = %d naddr = %u\n", dt_node_full_name(dev), nirq, naddr);
> +    DPRINT("%s map = %d nirq = %d naddr = %u\n", dt_node_full_name(dev),
> +           map, nirq, naddr);
>  
> -    if ( dt_device_is_protected(dev) )
> +    if ( dt_device_is_protected(dev) && map )
>      {
>          DPRINT("%s setup iommu\n", dt_node_full_name(dev));
>          res = iommu_assign_dt_device(d, dev);
> @@ -708,7 +715,7 @@ static int map_device(struct domain *d, struct dt_device_node *dev)
>          }
>      }
>  
> -    /* Map IRQs */
> +    /* Give permission and  map IRQs */
>      for ( i = 0; i < nirq; i++ )
>      {
>          res = dt_device_get_raw_irq(dev, i, &rirq);
> @@ -741,16 +748,28 @@ static int map_device(struct domain *d, struct dt_device_node *dev)
>          irq = res;
>  
>          DPRINT("irq %u = %u\n", i, irq);
> -        res = route_irq_to_guest(d, irq, dt_node_name(dev));
> +
> +        res = irq_permit_access(d, irq);
>          if ( res )
>          {
> -            printk(XENLOG_ERR "Unable to route IRQ %u to domain %u\n",
> -                   irq, d->domain_id);
> +            printk(XENLOG_ERR "Unable to permit to dom%u access to IRQ %u\n",
> +                   d->domain_id, irq);
>              return res;
>          }
> +
> +        if ( map )
> +        {
> +            res = route_irq_to_guest(d, irq, dt_node_name(dev));
> +            if ( res )
> +            {
> +                printk(XENLOG_ERR "Unable to route IRQ %u to domain %u\n",
> +                       irq, d->domain_id);
> +                return res;
> +            }
> +        }
>      }
>  
> -    /* Map the address ranges */
> +    /* Give permission and map MMIOs */
>      for ( i = 0; i < naddr; i++ )
>      {
>          res = dt_device_get_address(dev, i, &addr, &size);
> @@ -774,17 +793,21 @@ static int map_device(struct domain *d, struct dt_device_node *dev)
>                     addr & PAGE_MASK, PAGE_ALIGN(addr + size) - 1);
>              return res;
>          }
> -        res = map_mmio_regions(d,
> -                               paddr_to_pfn(addr & PAGE_MASK),
> -                               DIV_ROUND_UP(size, PAGE_SIZE),
> -                               paddr_to_pfn(addr & PAGE_MASK));
> -        if ( res )
> +
> +        if ( map )
>          {
> -            printk(XENLOG_ERR "Unable to map 0x%"PRIx64
> -                   " - 0x%"PRIx64" in domain %d\n",
> -                   addr & PAGE_MASK, PAGE_ALIGN(addr + size) - 1,
> -                   d->domain_id);
> -            return res;
> +            res = map_mmio_regions(d,
> +                                   paddr_to_pfn(addr & PAGE_MASK),
> +                                   DIV_ROUND_UP(size, PAGE_SIZE),
> +                                   paddr_to_pfn(addr & PAGE_MASK));
> +            if ( res )
> +            {
> +                printk(XENLOG_ERR "Unable to map 0x%"PRIx64
> +                       " - 0x%"PRIx64" in domain %d\n",
> +                       addr & PAGE_MASK, PAGE_ALIGN(addr + size) - 1,
> +                       d->domain_id);
> +                return res;
> +            }
>          }
>      }
>  
> @@ -865,10 +888,9 @@ static int handle_node(struct domain *d, struct kernel_info *kinfo,
>       *  property. Therefore these device doesn't need to be mapped. This
>       *  solution can be use later for pass through.
>       */
> -    if ( !dt_device_type_is_equal(node, "memory") &&
> -         dt_device_is_available(node) )
> +    if ( !dt_device_type_is_equal(node, "memory") )
>      {
> -        res = map_device(d, node);
> +        res = handle_device(d, node, dt_device_is_available(node));
>  
>          if ( res )
>              return res;

We need a comment here


* Re: [RFC 03/19] xen/arm: follow-up to allow DOM0 manage IRQ and MMIO
  2014-06-18 20:21   ` Stefano Stabellini
@ 2014-06-18 20:32     ` Julien Grall
  2014-07-03 11:02       ` Ian Campbell
  0 siblings, 1 reply; 122+ messages in thread
From: Julien Grall @ 2014-06-18 20:32 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: xen-devel, stefano.stabellini, ian.campbell, tim



On 18/06/14 21:21, Stefano Stabellini wrote:
> On Mon, 16 Jun 2014, Julien Grall wrote:
>> The commit 33233c2 "arch/arm: domain build: let dom0 access I/O memory of
>> mapped series" fill the iomem_caps to allow DOM0 managing MMIO of mapped
>> device.
>>
>> A device can be disabled (i.e by adding a property status="disabled" in the
>> device tree) because the user may want to passthrough this device to a guest.
>> This will avoid DOM0 loading (and few minutes after unloading) the driver to
>> handle this device.
>>
>> Even though, we don't want to let DOM0 using this device, the domain needs
>> to be able to manage the MMIO/IRQ range. Rework the function map_device
>> (renamed into handle_device) to:
>
> Is that so the toolstack in dom0 could re-assign the device to another
> guest?

Yes.


> In any case it would be good to write down exactly why DOM0 would still
> need to be able to manage the MMIO/IRQ range as a comment in the code.

Will do.

>> +static int handle_device(struct domain *d, struct dt_device_node *dev, bool_t map)
>
> This is confusing.
> If map == true then we get a similar behavior as before. If map ==
> false, what is the function supposed to achieve? Only permit IRQ access?
> Could we split it into a separate function?

I think the name "map" is confusing here. I should rename it to "available".

This function is supposed to give permission to access the IRQs and MMIO
regions (the latter was partially done by Arianna's patch series) and, if
the device is available, map it to DOM0.

Splitting it into 2 functions would mean duplicating the loops, which are
quite complex in terms of translation (see the interrupt and MMIO translation).
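
A stripped-down sketch of the shape I have in mind, with the translation
left out. The struct and the two helpers are stand-ins invented for this
example (they only count calls); the real code uses irq_permit_access() and
route_irq_to_guest() as in the patch.

```c
#include <stdbool.h>

/* Stand-in for a device node: only what the sketch needs. */
struct dev_sketch {
    unsigned int nirq;      /* number of IRQs described by the node */
    unsigned int permitted; /* how many IRQs were granted */
    unsigned int routed;    /* how many IRQs were routed */
};

/* Hypothetical helpers standing in for the real permission/routing calls. */
static int permit_irq(struct dev_sketch *dev, unsigned int irq)
{
    (void)irq;
    dev->permitted++;
    return 0;
}

static int route_irq(struct dev_sketch *dev, unsigned int irq)
{
    (void)irq;
    dev->routed++;
    return 0;
}

/*
 * One loop, two behaviours: permission is always granted, the mapping
 * only happens when the device is available.
 */
static int handle_device_sketch(struct dev_sketch *dev, bool available)
{
    unsigned int i;
    int res;

    for ( i = 0; i < dev->nirq; i++ )
    {
        res = permit_irq(dev, i);
        if ( res )
            return res;

        if ( available )
        {
            res = route_irq(dev, i);
            if ( res )
                return res;
        }
    }

    return 0;
}
```

The MMIO loop would follow the same pattern, which is why a single function
with a flag avoids duplicating the translation code.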

>> @@ -865,10 +888,9 @@ static int handle_node(struct domain *d, struct kernel_info *kinfo,
>>        *  property. Therefore these device doesn't need to be mapped. This
>>        *  solution can be use later for pass through.
>>        */
>> -    if ( !dt_device_type_is_equal(node, "memory") &&
>> -         dt_device_is_available(node) )
>> +    if ( !dt_device_type_is_equal(node, "memory") )
>>       {
>> -        res = map_device(d, node);
>> +        res = handle_device(d, node, dt_device_is_available(node));
>>
>>           if ( res )
>>               return res;
>
> We need a comment here

Hmmm... I don't see what kind of comment I can add here. There are
already lots of comments explaining handle_device and the previous if.

Regards,

-- 
Julien Grall


* Re: [RFC 06/19] xen/arm: Implement hypercall PHYSDEVOP_map_pirq
  2014-06-18 19:24   ` Stefano Stabellini
@ 2014-06-19 11:39     ` Julien Grall
  2014-06-19 12:29       ` Stefano Stabellini
  0 siblings, 1 reply; 122+ messages in thread
From: Julien Grall @ 2014-06-19 11:39 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: xen-devel, stefano.stabellini, ian.campbell, tim

On 06/18/2014 08:24 PM, Stefano Stabellini wrote:
>>  /*
>> diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c
>> index e451324..c18b2ca 100644
>> --- a/xen/arch/arm/vgic.c
>> +++ b/xen/arch/arm/vgic.c
>> @@ -82,10 +82,7 @@ int domain_vgic_init(struct domain *d)
>>      /* Currently nr_lines in vgic and gic doesn't have the same meanings
>>       * Here nr_lines = number of SPIs
>>       */
>> -    if ( is_hardware_domain(d) )
>> -        d->arch.vgic.nr_lines = gic_number_lines() - 32;
>> -    else
>> -        d->arch.vgic.nr_lines = 0; /* We don't need SPIs for the guest */
>> +    d->arch.vgic.nr_lines = gic_number_lines() - 32;
>>  
>>      d->arch.vgic.shared_irqs =
>>          xzalloc_array(struct vgic_irq_rank, DOMAIN_NR_RANKS(d));
> 
> I see what you mean about virq != pirq.
> 
> It seems to me that setting d->arch.vgic.nr_lines = gic_number_lines() -
> 32 for the hardware domain is OK, but it is really a waste for the
> others. We could find a way to pass down the info about how many SPIs we
> need from libxl. Or we could delay the vgic allocations until the first
> SPI is assigned to the domU.

I checked on both Midway and the Versatile Express, and there are about
200 lines.

That makes an overhead of less than 8K per domain, which is not too bad.

If the host really supports 1024 IRQs, that would make an overhead of ~32K.
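
The arithmetic behind those figures, as a rough sketch. The 32 bytes per
line is an assumption inferred from the ~32K estimate for 1024 IRQs, not a
measured structure size, and the names are made up for this example.

```c
#include <stddef.h>

/* Assumption: ~32 bytes of per-domain vgic state for each line. */
#define VGIC_BYTES_PER_LINE_ASSUMED 32u

/* Estimated per-domain overhead for a given number of lines. */
static size_t vgic_overhead_bytes(unsigned int nr_lines)
{
    return (size_t)nr_lines * VGIC_BYTES_PER_LINE_ASSUMED;
}
```

With this assumption, 200 lines cost 6400 bytes (under 8K) and 1024 lines
cost 32768 bytes.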

> Similarly to the MMIO hole sizing, I don't think that it would be a
> requirement for this patch series but it is something to keep in mind.

Handling virq != pirq will be more complex, as we need to take the
hotplug solution into account.

The vgic has a register which provides the number of lines; I suspect
this number can't grow while the guest is running.
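
The register in question is, as far as I know, GICD_TYPER: its
ITLinesNumber field (bits [4:0]) encodes the number of interrupt lines as
32 * (N + 1). A small sketch of the decoding, with helper names made up for
this example:

```c
#include <stdint.h>

/* ITLinesNumber lives in bits [4:0] of GICD_TYPER. */
#define GICD_TYPER_LINES_MASK 0x1fu

/* Total number of interrupt lines advertised by the distributor. */
static unsigned int gic_lines_from_typer(uint32_t typer)
{
    return 32u * ((typer & GICD_TYPER_LINES_MASK) + 1u);
}

/* Number of SPIs: the first 32 IDs are SGIs and PPIs. */
static unsigned int gic_spis_from_typer(uint32_t typer)
{
    return gic_lines_from_typer(typer) - 32u;
}
```

Since a guest typically reads this once at boot, a value that changes later
would most likely go unnoticed.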

Regards,

-- 
Julien Grall


* Re: [RFC 07/19] xen/dts: Use unsigned int for MMIO and IRQ index
  2014-06-18 18:54   ` Stefano Stabellini
@ 2014-06-19 11:42     ` Julien Grall
  0 siblings, 0 replies; 122+ messages in thread
From: Julien Grall @ 2014-06-19 11:42 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: xen-devel, stefano.stabellini, ian.campbell, tim

Hi Stefano,

On 06/18/2014 07:54 PM, Stefano Stabellini wrote:
> On Mon, 16 Jun 2014, Julien Grall wrote:
>> There is no reason to use signed integer for an index. Futhermore, this will
>> avoid possible issue when theses functions will be exposed to the guest
>> via new hypercalls.
>>
>> Signed-off-by: Julien Grall <julien.grall@linaro.org>
>> ---
>>  xen/common/device_tree.c      |   10 +++++-----
>>  xen/include/xen/device_tree.h |    7 ++++---
>>  2 files changed, 9 insertions(+), 8 deletions(-)
>>
>> diff --git a/xen/common/device_tree.c b/xen/common/device_tree.c
>> index f0b17a3..4736e0d 100644
>> --- a/xen/common/device_tree.c
>> +++ b/xen/common/device_tree.c
>> @@ -876,7 +876,7 @@ static const struct dt_bus *dt_match_bus(const struct dt_device_node *np)
>>  }
>>  
>>  static const __be32 *dt_get_address(const struct dt_device_node *dev,
>> -                                    int index, u64 *size,
>> +                                    unsigned index, u64 *size,
> 
> It is the same thing but you might as well use unsigned int.

I forgot to add int here. I will do it in the next version.

>>                                      unsigned int *flags)
>>  {
>>      const __be32 *prop;
>> @@ -1063,7 +1063,7 @@ bail:
>>  }
>>  
>>  /* dt_device_address - Translate device tree address and return it */
>> -int dt_device_get_address(const struct dt_device_node *dev, int index,
>> +int dt_device_get_address(const struct dt_device_node *dev, unsigned int index,
>>                            u64 *addr, u64 *size)
>>  {
>>      const __be32 *addrp;
>> @@ -1386,7 +1386,7 @@ fail:
>>      return -EINVAL;
>>  }
>>  
>> -int dt_device_get_raw_irq(const struct dt_device_node *device, int index,
>> +int dt_device_get_raw_irq(const struct dt_device_node *device, uint32_t index,
> 
> Why are you changing the other indexes to unsigned int and this one to
> uint32_t?

I don't remember. I will change it to unsigned int.

>>                            struct dt_raw_irq *out_irq)
>>  {
>>      const struct dt_device_node *p;
>> @@ -1394,7 +1394,7 @@ int dt_device_get_raw_irq(const struct dt_device_node *device, int index,
>>      u32 intsize, intlen;
>>      int res = -EINVAL;
>>  
>> -    dt_dprintk("dt_device_get_raw_irq: dev=%s, index=%d\n",
>> +    dt_dprintk("dt_device_get_raw_irq: dev=%s, index=%u\n",
>>                 device->full_name, index);
>>  
>>      /* Get the interrupts property */
>> @@ -1445,7 +1445,7 @@ int dt_irq_translate(const struct dt_raw_irq *raw,
>>                          &out_irq->irq, &out_irq->type);
>>  }
>>  
>> -int dt_device_get_irq(const struct dt_device_node *device, int index,
>> +int dt_device_get_irq(const struct dt_device_node *device, uint32_t index,
> 
> ditto

Same here :/.
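
As a reminder of why the index should be unsigned once it comes from a
hypercall, here is a minimal sketch. Both helpers are hypothetical and only
model a bounds check against the number of entries.

```c
#include <stdbool.h>

/* With a signed index, a single upper-bound check lets negative values in. */
static bool index_ok_signed(int index, int count)
{
    return index < count;
}

/* With an unsigned index, the same check rejects every out-of-range value. */
static bool index_ok_unsigned(unsigned int index, unsigned int count)
{
    return index < count;
}
```

A guest passing 0xffffffff would be seen as -1 by the signed variant and
accepted, while the unsigned variant compares it as a very large index and
rejects it.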

Regards,

-- 
Julien Grall


* Re: [RFC 09/19] xen/dts: Add hypercalls to retrieve device node information
  2014-06-18 19:38   ` Stefano Stabellini
@ 2014-06-19 11:58     ` Julien Grall
  2014-06-19 12:21       ` Stefano Stabellini
                         ` (2 more replies)
  0 siblings, 3 replies; 122+ messages in thread
From: Julien Grall @ 2014-06-19 11:58 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: ian.campbell, tim, Ian Jackson, stefano.stabellini, xen-devel,
	Christoffer Dall

(Adding Christoffer)

On 06/18/2014 08:38 PM, Stefano Stabellini wrote:
> On Mon, 16 Jun 2014, Julien Grall wrote:
>> DOM0 doesn't provide a generic way to get information about a device tree
>> node. If we want to do it in userspace, we will have to duplicate the
>> MMIO/IRQ translation from Xen. Therefore, we can let the hypervisor
>> doing the job for us and get nearly all the informations.
>>
>> This new physdev operation will let the toolstack get the IRQ/MMIO regions
>> and the compatible string. Most the device node can be described with only
>> theses 3 items. If we need to add a specific properties, then we will have
>> to implement it in userspace (some idea was to use a configuration file
>> describing the additional properties).
>>
>> The hypercall is divided in 4 parts:
>>     - GET_INFO: get the numbers of IRQ/MMIO and the size of the
>>     compatible string;
>>     - GET_IRQ: get the IRQ by index. If the IRQ is not routable (i.e not
>>     an SPIs), the errno will be set to -EINVAL;
>>     - GET_MMIO: get the MMIO range by index. If the base and the size of
>>     is not page-aligned, the errno will be set to -EINVAL;
>>     - GET_COMPAT: get the compatible string
>>
>> All the information will be accessible if the device is not used by Xen
>> and protected by an IOMMU.
>>
>> Signed-off-by: Julien Grall <julien.grall@linaro.org>
>> Cc: Ian Jackson <ian.jackson@eu.citrix.com>
>> Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
>> Cc: Ian Campbell <ian.campbell@citrix.com>
>>
> 
> I know that we talked about this face to face already, but this troubles
> me: is it really so uncommon for a device tree node corresponding to a
> device to have a key-value pair that is critical for the initialization
> of the device?

I remember a chat with Christoffer (I think you were in CC) about
specific device properties, but I can't find it in my mailbox.

I think the idea was that Xen provides the generic properties (regs,
interrupts) and we implement device-specific properties in a
configuration file that could be shared with KVM (IIRC, KVM has the same
needs).

> The ACPI on ARM people are discussing how to introduce these key-value
> pairs in ACPI too, so I wonder if we can really dismiss them so easily
> for device assignment.
> 
> Could Xen discard everything that it knows cannot be passed to the guest
> (information on clocks and phandles for example), but return to the
> toolstack other harmless key-value pairs, such as device specific
> configurations? Maybe we could introduce PHYSDEVOP_DTDEV_GET_KEYVALUE.

A blacklist won't work here because Xen may return properties that
contain a list of phandles (for instance, see the SMMU bindings). The
names of those properties are not necessarily generic.

IMHO, we need to let the toolstack decide whether we need to add specific
properties. Those properties can be written down in a configuration file
which will be parsed by the toolstack.

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [RFC 09/19] xen/dts: Add hypercalls to retrieve device node information
  2014-06-19 11:58     ` Julien Grall
@ 2014-06-19 12:21       ` Stefano Stabellini
  2014-06-19 12:25         ` Julien Grall
  2014-06-24  8:46       ` Christoffer Dall
  2014-07-03 11:34       ` Ian Campbell
  2 siblings, 1 reply; 122+ messages in thread
From: Stefano Stabellini @ 2014-06-19 12:21 UTC (permalink / raw)
  To: Julien Grall
  Cc: ian.campbell, Stefano Stabellini, tim, Ian Jackson,
	stefano.stabellini, xen-devel, Christoffer Dall

On Thu, 19 Jun 2014, Julien Grall wrote:
> (Adding Christoffer)
> 
> On 06/18/2014 08:38 PM, Stefano Stabellini wrote:
> > On Mon, 16 Jun 2014, Julien Grall wrote:
> >> DOM0 doesn't provide a generic way to get information about a device tree
> >> node. If we want to do it in userspace, we will have to duplicate the
> >> MMIO/IRQ translation from Xen. Therefore, we can let the hypervisor
> >> doing the job for us and get nearly all the informations.
> >>
> >> This new physdev operation will let the toolstack get the IRQ/MMIO regions
> >> and the compatible string. Most the device node can be described with only
> >> theses 3 items. If we need to add a specific properties, then we will have
> >> to implement it in userspace (some idea was to use a configuration file
> >> describing the additional properties).
> >>
> >> The hypercall is divided in 4 parts:
> >>     - GET_INFO: get the numbers of IRQ/MMIO and the size of the
> >>     compatible string;
> >>     - GET_IRQ: get the IRQ by index. If the IRQ is not routable (i.e not
> >>     an SPIs), the errno will be set to -EINVAL;
> >>     - GET_MMIO: get the MMIO range by index. If the base and the size of
> >>     is not page-aligned, the errno will be set to -EINVAL;
> >>     - GET_COMPAT: get the compatible string
> >>
> >> All the information will be accessible if the device is not used by Xen
> >> and protected by an IOMMU.
> >>
> >> Signed-off-by: Julien Grall <julien.grall@linaro.org>
> >> Cc: Ian Jackson <ian.jackson@eu.citrix.com>
> >> Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
> >> Cc: Ian Campbell <ian.campbell@citrix.com>
> >>
> > 
> > I know that we talked about this face to face already, but this troubles
> > me: is it really so uncommon for a device tree node corresponding to a
> > device to have a key-value pair that is critical for the initialization
> > of the device?
> 
> I remembered a chat with Christoffer (I think you were in CC) about
> specific device properties. But I can't find it in my mailbox.
> 
> I think the idea was Xen provides the generic properties (regs,
> interrupts) and we implement device specific properties in a
> configuration file that could be share with KVM (IIRC, KVM has the same
> needs).

What configuration file? Where would it live?
I would rather avoid forcing the user to specify these properties in the
VM config file.


> > The ACPI on ARM people are discussing how to introduce these key-value
> > pairs in ACPI too, so I wonder if we can really dismiss them so easily
> > for device assignment.
> > 
> > Could Xen discard everything that it knows cannot be passed to the guest
> > (information on clocks and phandles for example), but return to the
> > toolstack other harmless key-value pairs, such as device specific
> > configurations? Maybe we could introduce PHYSDEVOP_DTDEV_GET_KEYVALUE.
> 
> A blacklist won't work here because Xen may return properties that
> contain a list of phandle (for instance see the SMMU bindings). The name
> of those properties are not necessary generic.
> 
> IHMO, need to let the toolstack device whether we need to add specific
> properties. Those properties can be write down in a configuration file
> which will be parsed by the toolstack.

Could we simply remove anything that contains phandles? Is there a way
to detect if a value is a phandle?

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [RFC 09/19] xen/dts: Add hypercalls to retrieve device node information
  2014-06-19 12:21       ` Stefano Stabellini
@ 2014-06-19 12:25         ` Julien Grall
  2014-07-03 11:40           ` Ian Campbell
  0 siblings, 1 reply; 122+ messages in thread
From: Julien Grall @ 2014-06-19 12:25 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: ian.campbell, tim, Ian Jackson, stefano.stabellini, xen-devel,
	Christoffer Dall

On 06/19/2014 01:21 PM, Stefano Stabellini wrote:
>>> I know that we talked about this face to face already, but this troubles
>>> me: is it really so uncommon for a device tree node corresponding to a
>>> device to have a key-value pair that is critical for the initialization
>>> of the device?
>>
>> I remembered a chat with Christoffer (I think you were in CC) about
>> specific device properties. But I can't find it in my mailbox.
>>
>> I think the idea was Xen provides the generic properties (regs,
>> interrupts) and we implement device specific properties in a
>> configuration file that could be share with KVM (IIRC, KVM has the same
>> needs).
> 
> What configuration file? Where would it live?
> I would rather avoid forcing the user to specify these properties in the
> VM config file.

I meant a host config file.

>>> The ACPI on ARM people are discussing how to introduce these key-value
>>> pairs in ACPI too, so I wonder if we can really dismiss them so easily
>>> for device assignment.
>>>
>>> Could Xen discard everything that it knows cannot be passed to the guest
>>> (information on clocks and phandles for example), but return to the
>>> toolstack other harmless key-value pairs, such as device specific
>>> configurations? Maybe we could introduce PHYSDEVOP_DTDEV_GET_KEYVALUE.
>>
>> A blacklist won't work here because Xen may return properties that
>> contain a list of phandle (for instance see the SMMU bindings). The name
>> of those properties are not necessary generic.
>>
>> IHMO, need to let the toolstack device whether we need to add specific
>> properties. Those properties can be write down in a configuration file
>> which will be parsed by the toolstack.
> 
> Could we simply remove anything that contains phandles? Is there a way
> to detect if a value is a phandle?

A phandle is only a way to interpret a number. AFAIK, there is no way to
tell a phandle apart from a plain integer by looking at the value.
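
To illustrate the point, here is a minimal sketch, assuming two made-up
properties (the names and values are hypothetical, not taken from any real
binding). Both are stored as one big-endian 32-bit cell, so the raw bytes
carry nothing that marks one of them as a phandle:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Decode one big-endian 32-bit cell, as stored in a flattened device tree. */
uint32_t dt_cell_to_cpu(const uint8_t cell[4])
{
    return ((uint32_t)cell[0] << 24) | ((uint32_t)cell[1] << 16) |
           ((uint32_t)cell[2] << 8)  |  (uint32_t)cell[3];
}

/* Hypothetical payloads: same bytes, different meaning. */
const uint8_t prop_clock_ref[4]  = { 0x00, 0x00, 0x00, 0x02 }; /* phandle of node 2 */
const uint8_t prop_fifo_depth[4] = { 0x00, 0x00, 0x00, 0x02 }; /* plain integer 2 */

/* Returns 1 when the two cells are byte-for-byte identical. */
int cells_identical(const uint8_t a[4], const uint8_t b[4])
{
    return memcmp(a, b, 4) == 0;
}
```

Only the binding of the property tells you which interpretation applies,
which is why filtering on the value alone cannot work.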

Regards,

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [RFC 06/19] xen/arm: Implement hypercall PHYSDEVOP_map_pirq
  2014-06-19 11:39     ` Julien Grall
@ 2014-06-19 12:29       ` Stefano Stabellini
  2014-07-03 11:27         ` Ian Campbell
  0 siblings, 1 reply; 122+ messages in thread
From: Stefano Stabellini @ 2014-06-19 12:29 UTC (permalink / raw)
  To: Julien Grall
  Cc: xen-devel, tim, stefano.stabellini, ian.campbell, Stefano Stabellini

On Thu, 19 Jun 2014, Julien Grall wrote:
> On 06/18/2014 08:24 PM, Stefano Stabellini wrote:
> >>  /*
> >> diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c
> >> index e451324..c18b2ca 100644
> >> --- a/xen/arch/arm/vgic.c
> >> +++ b/xen/arch/arm/vgic.c
> >> @@ -82,10 +82,7 @@ int domain_vgic_init(struct domain *d)
> >>      /* Currently nr_lines in vgic and gic doesn't have the same meanings
> >>       * Here nr_lines = number of SPIs
> >>       */
> >> -    if ( is_hardware_domain(d) )
> >> -        d->arch.vgic.nr_lines = gic_number_lines() - 32;
> >> -    else
> >> -        d->arch.vgic.nr_lines = 0; /* We don't need SPIs for the guest */
> >> +    d->arch.vgic.nr_lines = gic_number_lines() - 32;
> >>  
> >>      d->arch.vgic.shared_irqs =
> >>          xzalloc_array(struct vgic_irq_rank, DOMAIN_NR_RANKS(d));
> > 
> > I see what you mean about virq != pirq.
> > 
> > It seems to me that setting d->arch.vgic.nr_lines = gic_number_lines() -
> > 32 for the hardware domain is OK, but it is really a waste for the
> > others. We could find a way to pass down the info about how many SPIs we
> > need from libxl. Or we could delay the vgic allocations until the first
> > SPI is assigned to the domU.
> 
> I gave a check on both midway and the versatile express and there is
> about 200 lines.
> 
> It make the overhead of less than 8K per domain. Which is not too bad.
> 
> If the host really support 1024 IRQs that would make an overhead of ~32K.
> 
> > Similarly to the MMIO hole sizing, I don't think that it would be a
> > requirement for this patch series but it is something to keep in mind.
> 
> Handling virq != pirq will be more complex as we need to take into
> account of the hotplug solution.
> 
> The vgic has a register which provide the number of lines, I suspect
> this number can't grow up while the guest is running.

Of course not. But keep in mind that for non-PCI passthrough we would be
fully aware of all the assigned interrupts before starting the VM.

PCI passthrough and MSI-X are the issue because there can be many MSIs
per device and the device can be hotplugged into the guest.
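
For reference, the overhead figures quoted above are consistent with
roughly 32 bytes of vgic state per interrupt line. That per-line size is an
assumption picked to match the quoted numbers, not a measurement of the
actual structures, and the helper name is made up. A rough sketch of the
arithmetic:

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: about 32 bytes of per-line vgic state (not measured). */
#define ASSUMED_BYTES_PER_LINE 32u

/* Estimated per-domain allocation for nr_lines interrupt lines. */
size_t vgic_overhead_bytes(unsigned int nr_lines)
{
    return (size_t)nr_lines * ASSUMED_BYTES_PER_LINE;
}
```

With ~200 lines this gives 6400 bytes (under 8K), and with 1024 lines it
gives 32768 bytes (32K).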

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [RFC 09/19] xen/dts: Add hypercalls to retrieve device node information
  2014-06-19 11:58     ` Julien Grall
  2014-06-19 12:21       ` Stefano Stabellini
@ 2014-06-24  8:46       ` Christoffer Dall
  2014-07-03 11:34       ` Ian Campbell
  2 siblings, 0 replies; 122+ messages in thread
From: Christoffer Dall @ 2014-06-24  8:46 UTC (permalink / raw)
  To: Julien Grall
  Cc: Ian Campbell, Stefano Stabellini, tim, Ian Jackson,
	Stefano Stabellini, xen-devel

On 19 June 2014 13:58, Julien Grall <julien.grall@linaro.org> wrote:
> (Adding Christoffer)
>
> On 06/18/2014 08:38 PM, Stefano Stabellini wrote:
>> On Mon, 16 Jun 2014, Julien Grall wrote:
>>> DOM0 doesn't provide a generic way to get information about a device tree
>>> node. If we want to do it in userspace, we will have to duplicate the
>>> MMIO/IRQ translation from Xen. Therefore, we can let the hypervisor
>>> doing the job for us and get nearly all the informations.
>>>
>>> This new physdev operation will let the toolstack get the IRQ/MMIO regions
>>> and the compatible string. Most the device node can be described with only
>>> theses 3 items. If we need to add a specific properties, then we will have
>>> to implement it in userspace (some idea was to use a configuration file
>>> describing the additional properties).
>>>
>>> The hypercall is divided in 4 parts:
>>>     - GET_INFO: get the numbers of IRQ/MMIO and the size of the
>>>     compatible string;
>>>     - GET_IRQ: get the IRQ by index. If the IRQ is not routable (i.e not
>>>     an SPIs), the errno will be set to -EINVAL;
>>>     - GET_MMIO: get the MMIO range by index. If the base and the size of
>>>     is not page-aligned, the errno will be set to -EINVAL;
>>>     - GET_COMPAT: get the compatible string
>>>
>>> All the information will be accessible if the device is not used by Xen
>>> and protected by an IOMMU.
>>>
>>> Signed-off-by: Julien Grall <julien.grall@linaro.org>
>>> Cc: Ian Jackson <ian.jackson@eu.citrix.com>
>>> Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
>>> Cc: Ian Campbell <ian.campbell@citrix.com>
>>>
>>
>> I know that we talked about this face to face already, but this troubles
>> me: is it really so uncommon for a device tree node corresponding to a
>> device to have a key-value pair that is critical for the initialization
>> of the device?
>
> I remembered a chat with Christoffer (I think you were in CC) about
> specific device properties. But I can't find it in my mailbox.
>
> I think the idea was Xen provides the generic properties (regs,
> interrupts) and we implement device specific properties in a
> configuration file that could be share with KVM (IIRC, KVM has the same
> needs).
>

Yeah, experience just shows us that when you start exposing the raw
hardware information to user space or through to VMs without having
semantic control over them, then you will very likely end up in a lot
of trouble.

It really should be possible to limit the properties of devices that you
want to export to a reasonable set through a well-defined API.

If you have an extremely complicated device with interesting
inter-dependency, chances are you're going to need a device-specific
user space driver to couple devices, tie inter-dependent devices
together when you describe the machine to your VM, etc.  The API
suggested should take care of the common generic case.

The suggestion about a VM config file was more of a loose thought: if
we start having a bunch of networking devices that have special
properties and we wish to support passthrough of these on both Xen and
KVM, then keeping device-specific data in a config file may be a way
to accomplish that.  This is pretty much speculation and loose ideas
at this point.

Personally, I would prefer doing something along the lines of what
Julien suggests and adding necessary properties as needed.  If it turns
out you need a lot more information about devices someone actually
tries to pass through to VMs, then revisit the issue.  This patch
doesn't seem to suggest an awfully complicated ABI that will cause a
lot of headache to maintain in the future or anything like that.

My 2 cents.

-Christoffer

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [RFC 02/19] xen: guestcopy: Provide an helper to copy string from guest
  2014-06-18 13:01                     ` Jan Beulich
@ 2014-06-24 14:58                       ` Julien Grall
  0 siblings, 0 replies; 122+ messages in thread
From: Julien Grall @ 2014-06-24 14:58 UTC (permalink / raw)
  To: Jan Beulich, Daniel De Graaf
  Cc: Keir Fraser, ian.campbell, tim, Ian Jackson, stefano.stabellini,
	xen-devel

On 06/18/2014 02:01 PM, Jan Beulich wrote:
>> How will you rename the function?
> 
> I don't know. All I know is that the function isn't simply coping in a
> string.

I thought about it a bit more. I will rename the function to
safe_copy_string_from_guest and add a comment explaining what this
function really does (i.e. it adds a NUL terminator).
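
Roughly the behaviour I have in mind, as a userspace sketch only: the
function name, the plain pointer standing in for the guest handle, and
memcpy()/malloc() standing in for the guest-copy and allocation helpers are
all assumptions for illustration, not the code from the patch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Sketch only: 'src' stands in for guest memory and memcpy() for the
 * real guest-copy primitive. Copies 'size' bytes and appends a NUL.
 * Returns a heap buffer the caller must free(), or NULL on failure.
 */
char *copy_string_nul_terminated(const char *src, size_t size, size_t max_size)
{
    char *buf;

    /* Refuse oversized requests before allocating anything. */
    if ( size > max_size )
        return NULL;

    buf = malloc(size + 1);
    if ( buf == NULL )
        return NULL;

    memcpy(buf, src, size);
    buf[size] = '\0';   /* the guest buffer is not trusted to be terminated */

    return buf;
}
```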

Regards,

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [RFC 01/19] xen/arm: guest_physmap_remove_page: Print a warning if we fail to unmap the page
  2014-06-16 16:17 ` [RFC 01/19] xen/arm: guest_physmap_remove_page: Print a warning if we fail to unmap the page Julien Grall
  2014-06-18 15:03   ` Stefano Stabellini
@ 2014-07-03 10:52   ` Ian Campbell
  2014-07-03 11:17     ` Julien Grall
  1 sibling, 1 reply; 122+ messages in thread
From: Ian Campbell @ 2014-07-03 10:52 UTC (permalink / raw)
  To: Julien Grall; +Cc: xen-devel, stefano.stabellini, tim

On Mon, 2014-06-16 at 17:17 +0100, Julien Grall wrote:
> The function guest_physmap_remove_page does't have a return value. With
> the change "arch/arm: add consistency check to REMOVE p2m changes",
> apply_p2m_changes can unlikely fail.

Looking at v9 of that patch I don't see it adding any new failures, is
this comment (I suppose written against an older version) still
accurate?

>  Warn the user in this case.

Given that apply_p2m_changes can fail it seems reasonable to log
regardless of the above:
        Acked-by: Ian Campbell <ian.campbell@citrix.com>

Two questions:

Would it be better to instead ensure that apply_p2m_changes always logs
on failure? I suppose it would be more able to give a specific message.

On failure do we retain any reference counts to prevent these pages
getting reused?

Ian.

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [RFC 03/19] xen/arm: follow-up to allow DOM0 manage IRQ and MMIO
  2014-06-18 20:32     ` Julien Grall
@ 2014-07-03 11:02       ` Ian Campbell
  2014-07-03 11:23         ` Julien Grall
  0 siblings, 1 reply; 122+ messages in thread
From: Ian Campbell @ 2014-07-03 11:02 UTC (permalink / raw)
  To: Julien Grall; +Cc: xen-devel, tim, stefano.stabellini, Stefano Stabellini

On Wed, 2014-06-18 at 21:32 +0100, Julien Grall wrote:
> >> @@ -865,10 +888,9 @@ static int handle_node(struct domain *d, struct kernel_info *kinfo,
> >>        *  property. Therefore these device doesn't need to be mapped. This
> >>        *  solution can be use later for pass through.
> >>        */
> >> -    if ( !dt_device_type_is_equal(node, "memory") &&
> >> -         dt_device_is_available(node) )
> >> +    if ( !dt_device_type_is_equal(node, "memory") )
> >>       {
> >> -        res = map_device(d, node);
> >> +        res = handle_device(d, node, dt_device_is_available(node));
> >>
> >>           if ( res )
> >>               return res;
> >
> > We need a comment here
> 
> Hmmm... I don't see what kind of comment I can add here. There is 
> already lots of comments explaining handle_device and the previous if.

I'd be inclined to push the dt_device_is_available call down into the
handle_function. 

Otherwise I would go with
    res = make_device_available(node)
    if (!res && dt_...available(node))
        res = map_device(node).

Ian

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [RFC 04/19] xen/arm: route_irq_to_guest: Check validity of the IRQ
  2014-06-16 16:17 ` [RFC 04/19] xen/arm: route_irq_to_guest: Check validity of the IRQ Julien Grall
  2014-06-18 18:52   ` Stefano Stabellini
@ 2014-07-03 11:04   ` Ian Campbell
  2014-07-03 11:47     ` Julien Grall
  1 sibling, 1 reply; 122+ messages in thread
From: Ian Campbell @ 2014-07-03 11:04 UTC (permalink / raw)
  To: Julien Grall; +Cc: xen-devel, stefano.stabellini, tim

On Mon, 2014-06-16 at 17:17 +0100, Julien Grall wrote:
> diff --git a/xen/include/asm-arm/irq.h b/xen/include/asm-arm/irq.h
> index e567f71..63926a5 100644
> --- a/xen/include/asm-arm/irq.h
> +++ b/xen/include/asm-arm/irq.h
> @@ -37,6 +37,12 @@ void do_IRQ(struct cpu_user_regs *regs, unsigned int irq, int is_fiq);
>  
>  #define domain_pirq_to_irq(d, pirq) (pirq)
>  
> +static inline bool_t is_routable_irq(unsigned int irq)

is_assignable_irq I think better suits your intention.

routable doesn't imply "to the guest" since you might also route it to
Xen.
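
Something like the following is what I would expect behind that name. This
is a minimal sketch, assuming "assignable" simply means "is an SPI" (GIC
interrupt IDs 32 to 1019); the macro names are made up, and a real check
would presumably also bound the ID by the number of lines the GIC reports.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_LOCAL_IRQS 32u    /* IDs 0-31: SGIs and PPIs, private to a CPU */
#define MAX_SPI_ID    1019u  /* IDs 1020-1023 are special interrupt IDs */

/* True only for shared peripheral interrupts, which can go to a guest. */
bool is_assignable_irq(unsigned int irq)
{
    return irq >= NR_LOCAL_IRQS && irq <= MAX_SPI_ID;
}
```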

Ian.

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [RFC 01/19] xen/arm: guest_physmap_remove_page: Print a warning if we fail to unmap the page
  2014-07-03 10:52   ` Ian Campbell
@ 2014-07-03 11:17     ` Julien Grall
  0 siblings, 0 replies; 122+ messages in thread
From: Julien Grall @ 2014-07-03 11:17 UTC (permalink / raw)
  To: Ian Campbell; +Cc: xen-devel, stefano.stabellini, tim

Hi Ian,

On 07/03/2014 11:52 AM, Ian Campbell wrote:
> On Mon, 2014-06-16 at 17:17 +0100, Julien Grall wrote:
>> The function guest_physmap_remove_page does't have a return value. With
>> the change "arch/arm: add consistency check to REMOVE p2m changes",
>> apply_p2m_changes can unlikely fail.
> 
> Looking at v9 of that patch I don't see it adding any new failures, is
> this comment (I suppose written against an older version) still
> accurate?

It doesn't look accurate for v9. I can change the commit message to
explain that apply_p2m_changes may fail, so it's better to log it.

>>  Warn the user in this case.
> 
> Given that apply_p2m_changes can fail it seems reasonable to log
> regardless of the above:
>         Acked-by: Ian Campbell <ian.campbell@citrix.com>
> Two questions:
> 
> Would it be better to instead ensure that apply_p2m_changes always logs
> on failure? I suppose it would be more able to give a specific message.

I didn't think about this solution. It may make debugging easier later.


> On failure do we retain any reference counts to prevent these pages
> getting reused?

We don't have any mapping reference count so it's fine.

I've asked for a few changes in Arianna's series to clear the p2m even if
the mapping doesn't correspond. Otherwise we may have issues with foreign
mappings.

I will drop this patch from the series and send a separate patch.

Regards,

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [RFC 03/19] xen/arm: follow-up to allow DOM0 manage IRQ and MMIO
  2014-07-03 11:02       ` Ian Campbell
@ 2014-07-03 11:23         ` Julien Grall
  2014-07-03 12:12           ` Ian Campbell
  0 siblings, 1 reply; 122+ messages in thread
From: Julien Grall @ 2014-07-03 11:23 UTC (permalink / raw)
  To: Ian Campbell; +Cc: xen-devel, tim, stefano.stabellini, Stefano Stabellini

On 07/03/2014 12:02 PM, Ian Campbell wrote:
> On Wed, 2014-06-18 at 21:32 +0100, Julien Grall wrote:
>>>> @@ -865,10 +888,9 @@ static int handle_node(struct domain *d, struct kernel_info *kinfo,
>>>>        *  property. Therefore these device doesn't need to be mapped. This
>>>>        *  solution can be use later for pass through.
>>>>        */
>>>> -    if ( !dt_device_type_is_equal(node, "memory") &&
>>>> -         dt_device_is_available(node) )
>>>> +    if ( !dt_device_type_is_equal(node, "memory") )
>>>>       {
>>>> -        res = map_device(d, node);
>>>> +        res = handle_device(d, node, dt_device_is_available(node));
>>>>
>>>>           if ( res )
>>>>               return res;
>>>
>>> We need a comment here
>>
>> Hmmm... I don't see what kind of comment I can add here. There is 
>> already lots of comments explaining handle_device and the previous if.
> 
> I'd be inclined to push the dt_device_is_available call down into the
> handle_function. 
> 
> Otherwise I would go with
>     res = make_deevice_available(node)

What would make_device_available do?

>     if (!res && dt_...available(node)
>         res = map_device(node).

AFAIU, it would mean duplicating the loop that gets the interrupts/MMIO,
which I think is stupid.

So, I prefer to push the dt_device_is_available call down into the
handle_function.
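
A rough sketch of the shape I mean, with hypothetical stand-in types and
counters instead of the real Xen structures and permission/routing calls.
The point is only that a single loop walks the resources, always grants
access, and maps only when the node is available:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the Xen types and helpers. */
struct dt_device_node { bool available; unsigned int nr_irq; };
struct domain { unsigned int permitted_irqs; unsigned int mapped_irqs; };

bool dt_device_is_available(const struct dt_device_node *node)
{
    return node->available;
}

int handle_device(struct domain *d, const struct dt_device_node *node)
{
    /* The availability check now lives here, not in the caller. */
    bool map = dt_device_is_available(node);
    unsigned int i;

    for ( i = 0; i < node->nr_irq; i++ )
    {
        d->permitted_irqs++;     /* always: let the domain manage the IRQ */
        if ( map )
            d->mapped_irqs++;    /* only for available devices: route it */
    }

    return 0;
}
```

The MMIO regions would be handled the same way in a second loop.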

Regards,

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [RFC 06/19] xen/arm: Implement hypercall PHYSDEVOP_map_pirq
  2014-06-19 12:29       ` Stefano Stabellini
@ 2014-07-03 11:27         ` Ian Campbell
  2014-07-03 12:02           ` Julien Grall
  0 siblings, 1 reply; 122+ messages in thread
From: Ian Campbell @ 2014-07-03 11:27 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: xen-devel, Julien Grall, stefano.stabellini, tim

On Thu, 2014-06-19 at 13:29 +0100, Stefano Stabellini wrote:
> On Thu, 19 Jun 2014, Julien Grall wrote:
> > On 06/18/2014 08:24 PM, Stefano Stabellini wrote:
> > >>  /*
> > >> diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c
> > >> index e451324..c18b2ca 100644
> > >> --- a/xen/arch/arm/vgic.c
> > >> +++ b/xen/arch/arm/vgic.c
> > >> @@ -82,10 +82,7 @@ int domain_vgic_init(struct domain *d)
> > >>      /* Currently nr_lines in vgic and gic doesn't have the same meanings
> > >>       * Here nr_lines = number of SPIs
> > >>       */
> > >> -    if ( is_hardware_domain(d) )
> > >> -        d->arch.vgic.nr_lines = gic_number_lines() - 32;
> > >> -    else
> > >> -        d->arch.vgic.nr_lines = 0; /* We don't need SPIs for the guest */
> > >> +    d->arch.vgic.nr_lines = gic_number_lines() - 32;
> > >>  
> > >>      d->arch.vgic.shared_irqs =
> > >>          xzalloc_array(struct vgic_irq_rank, DOMAIN_NR_RANKS(d));
> > > 
> > > I see what you mean about virq != pirq.
> > > 
> > > It seems to me that setting d->arch.vgic.nr_lines = gic_number_lines() -
> > > 32 for the hardware domain is OK, but it is really a waste for the
> > > others. We could find a way to pass down the info about how many SPIs we
> > > need from libxl. Or we could delay the vgic allocations until the first
> > > SPI is assigned to the domU.
> > 
> > I gave a check on both midway and the versatile express and there is
> > about 200 lines.
> > 
> > It make the overhead of less than 8K per domain. Which is not too bad.
> > 
> > If the host really support 1024 IRQs that would make an overhead of ~32K.
> > 
> > > Similarly to the MMIO hole sizing, I don't think that it would be a
> > > requirement for this patch series but it is something to keep in mind.
> > 
> > Handling virq != pirq will be more complex as we need to take into
> > account of the hotplug solution.

What's the issue here? Something to do with irqdesc->irq-pending lookup?

Seems like irqdesc needs to store the domain and virq number when the
irq is passed through. I assume it must store the domain already.


> > The vgic has a register which provide the number of lines, I suspect
> > this number can't grow up while the guest is running.
> 
> Of course not. But keep in mind that for non-PCI passthrough we would be
> fully aware of all the assigned interrupts before starting the VM.

Are we ruling out hotplug of such devices? (I don't have a problem with
that BTW)

> PCI passthrough and MSI-X are the issue because there can be many MSI
> per device and the device can be hotplugged into the guest.

MSI(-X) AKA LPIs are in a different more dynamic number space though
(from 8192 onwards). I think for that specific case we can dynamically
do things.

The bigger issue would be the legacy INT-x interrupts (which I expect
look like SPIs), those would no doubt need exposing somehow.

Do we think it is the case that we are eventually going to need a guest
cfg option pci = 0|1? I think the answer is yes. Assigning a pci device
would cause pci=1, or you can set pci=1 to enable hotplug of pci devices
later (i.e. mmio space is reserved, INTx interrupts are assigned etc).

We already have something not totally different in the e820_host option
on x86.

Ian.

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [RFC 08/19] xen/dts: Provide an helper to get a DT node from a path provided by a guest
  2014-06-16 16:17 ` [RFC 08/19] xen/dts: Provide an helper to get a DT node from a path provided by a guest Julien Grall
@ 2014-07-03 11:30   ` Ian Campbell
  2014-07-03 11:49     ` Julien Grall
  0 siblings, 1 reply; 122+ messages in thread
From: Ian Campbell @ 2014-07-03 11:30 UTC (permalink / raw)
  To: Julien Grall; +Cc: xen-devel, stefano.stabellini, tim

On Mon, 2014-06-16 at 17:17 +0100, Julien Grall wrote:
> Signed-off-by: Julien Grall <julien.grall@linaro.org>
> ---
>  xen/common/device_tree.c      |   19 +++++++++++++++++++
>  xen/include/xen/device_tree.h |   17 +++++++++++++++++
>  2 files changed, 36 insertions(+)
> 
> diff --git a/xen/common/device_tree.c b/xen/common/device_tree.c
> index 4736e0d..fd95307 100644
> --- a/xen/common/device_tree.c
> +++ b/xen/common/device_tree.c
> @@ -13,6 +13,7 @@
>  #include <xen/config.h>
>  #include <xen/types.h>
>  #include <xen/init.h>
> +#include <xen/guest_access.h>
>  #include <xen/device_tree.h>
>  #include <xen/kernel.h>
>  #include <xen/lib.h>
> @@ -661,6 +662,24 @@ struct dt_device_node *dt_find_node_by_path(const char *path)
>      return np;
>  }
>  
> +int dt_find_node_by_gpath(XEN_GUEST_HANDLE(char) u_path, uint32_t u_plen,
> +                          struct dt_device_node **node)

My initial feeling is that this should be a static helper in whichever
file is using this rather than part of the "dt library" interface.

Ian.

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [RFC 09/19] xen/dts: Add hypercalls to retrieve device node information
  2014-06-16 16:17 ` [RFC 09/19] xen/dts: Add hypercalls to retrieve device node information Julien Grall
  2014-06-18 19:38   ` Stefano Stabellini
@ 2014-07-03 11:33   ` Ian Campbell
  2014-07-03 11:51     ` Julien Grall
  1 sibling, 1 reply; 122+ messages in thread
From: Ian Campbell @ 2014-07-03 11:33 UTC (permalink / raw)
  To: Julien Grall
  Cc: xen-devel, tim, Ian Jackson, stefano.stabellini, Stefano Stabellini

On Mon, 2014-06-16 at 17:17 +0100, Julien Grall wrote:

>  /*
>   * Local variables:
>   * mode: C

> +        struct {
> +            uint64_t mfn;
> +            uint64_t nr_mfn;

xen_pfn_t for both of these I think.

> +        } mmio;
> +        struct {
> +            uint32_t clen;          /* IN: Size of buffer. OUT: Size copied */

clen?

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [RFC 09/19] xen/dts: Add hypercalls to retrieve device node information
  2014-06-19 11:58     ` Julien Grall
  2014-06-19 12:21       ` Stefano Stabellini
  2014-06-24  8:46       ` Christoffer Dall
@ 2014-07-03 11:34       ` Ian Campbell
  2 siblings, 0 replies; 122+ messages in thread
From: Ian Campbell @ 2014-07-03 11:34 UTC (permalink / raw)
  To: Julien Grall
  Cc: Stefano Stabellini, Ian Jackson, tim, stefano.stabellini,
	xen-devel, Christoffer Dall

On Thu, 2014-06-19 at 12:58 +0100, Julien Grall wrote:
> I think the idea was Xen provides the generic properties (regs,
> interrupts) and we implement device specific properties in a
> configuration file that could be share with KVM (IIRC, KVM has the same
> needs).

For this, is the retrieval of the compatible string really needed?

Making this interface DT-specific makes me uncomfortable. I'd much rather
that it only contained hardware stuff (MMIO regions, irqs, etc).

Ian.

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [RFC 09/19] xen/dts: Add hypercalls to retrieve device node information
  2014-06-19 12:25         ` Julien Grall
@ 2014-07-03 11:40           ` Ian Campbell
  0 siblings, 0 replies; 122+ messages in thread
From: Ian Campbell @ 2014-07-03 11:40 UTC (permalink / raw)
  To: Julien Grall
  Cc: Stefano Stabellini, tim, Ian Jackson, stefano.stabellini,
	xen-devel, Christoffer Dall

On Thu, 2014-06-19 at 13:25 +0100, Julien Grall wrote:
> On 06/19/2014 01:21 PM, Stefano Stabellini wrote:
> >>> I know that we talked about this face to face already, but this troubles
> >>> me: is it really so uncommon for a device tree node corresponding to a
> >>> device to have a key-value pair that is critical for the initialization
> >>> of the device?
> >>
> >> I remembered a chat with Christoffer (I think you were in CC) about
> >> specific device properties. But I can't find it in my mailbox.
> >>
> >> I think the idea was Xen provides the generic properties (regs,
> >> interrupts) and we implement device specific properties in a
> >> configuration file that could be share with KVM (IIRC, KVM has the same
> >> needs).
> > 
> > What configuration file? Where would it live?
> > I would rather avoid forcing the user to specify these properties in the
> > VM config file.
> 
> I meant an host config file.

I think this is basically unavoidable. There is simply no sane way to
expose the necessary stuff from the actual dtb.

My opinion is that we should aim to get something which has the
underlying moving parts (mmio and irq mapping etc) working and in tree
sooner rather than later and leave the question of the sharp edges on
the UI until later.

Regardless of what syntactic sugar we add in the future we should always
have the lowlevel "inject a snippet of dts" interface as an option, and
that is all we really need to implement on the first pass.

A host config file mapping some useful string to such snippets is a nice
simple enhancement to that (and if it's already implemented, great, but
it's not mandatory on this pass IMHO).

> >>> The ACPI on ARM people are discussing how to introduce these key-value
> >>> pairs in ACPI too, so I wonder if we can really dismiss them so easily
> >>> for device assignment.
> >>>
> >>> Could Xen discard everything that it knows cannot be passed to the guest
> >>> (information on clocks and phandles for example), but return to the
> >>> toolstack other harmless key-value pairs, such as device specific
> >>> configurations? Maybe we could introduce PHYSDEVOP_DTDEV_GET_KEYVALUE.
> >>
> >> A blacklist won't work here because Xen may return properties that
> >> contain a list of phandle (for instance see the SMMU bindings). The name
> >> of those properties are not necessary generic.
> >>
> >> IHMO, need to let the toolstack device whether we need to add specific
> >> properties. Those properties can be write down in a configuration file
> >> which will be parsed by the toolstack.
> > 
> > Could we simply remove anything that contains phandles? Is there a way
> > to detect if a value is a phandle?
> 
> A phandle is only a way to interpret a number. AFAIK, there is no way to
> differentiate it.

Correct, it's just a number which the driver knows (by virtue of the
name etc) happens to be a phandle.

Ian.
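
To illustrate the point about phandles, here is a minimal sketch (not code from the series). The two property names and their bytes are hypothetical; the only fact relied on is that a device tree cell is a big-endian 32-bit value. Both properties decode to the same number, and nothing in the bytes says which one is a phandle.

```c
#include <stdint.h>

/* Decode one big-endian 32-bit device tree cell. */
static uint32_t dt_cell(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/*
 * Hypothetical property values as they would sit in a flattened tree.
 * A driver would treat the first as a phandle and the second as a plain
 * integer purely because of the property name it asked for.
 */
static const uint8_t prop_clock_ref[4]  = { 0x00, 0x00, 0x00, 0x05 };
static const uint8_t prop_fifo_depth[4] = { 0x00, 0x00, 0x00, 0x05 };
```

This is why filtering properties "that contain phandles" cannot be done by inspecting values alone.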

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [RFC 10/19] xen/passthrough: Introduce iommu_buildup
  2014-06-16 16:17 ` [RFC 10/19] xen/passthrough: Introduce iommu_buildup Julien Grall
@ 2014-07-03 11:45   ` Ian Campbell
  2014-07-03 11:55     ` Julien Grall
  0 siblings, 1 reply; 122+ messages in thread
From: Ian Campbell @ 2014-07-03 11:45 UTC (permalink / raw)
  To: Julien Grall; +Cc: xen-devel, stefano.stabellini, tim

On Mon, 2014-06-16 at 17:17 +0100, Julien Grall wrote:
> This new function will correctly initialize the IOMMU page table for the
> current domain.

_setup, _configure, _initialise, _construct etc would be more normal
names I think.

"buildup" is the half hour of mindless blather from pundits that you
have to watch before a televised sporting event begins ;-)

> Also use it in iommu_assign_dt_device even though the current IOMMU
> implementation on ARM shares P2M with the processor.
> ---
>  xen/drivers/passthrough/arm/iommu.c   |    6 ++++++
>  xen/drivers/passthrough/device_tree.c |    7 +++++++
>  xen/drivers/passthrough/iommu.c       |   25 +++++++++++++++++++++++++
>  xen/drivers/passthrough/pci.c         |   12 ++++--------
>  xen/include/xen/iommu.h               |    2 ++
>  5 files changed, 44 insertions(+), 8 deletions(-)
> 
> diff --git a/xen/drivers/passthrough/arm/iommu.c b/xen/drivers/passthrough/arm/iommu.c
> index 3007b99..de4ed64 100644
> --- a/xen/drivers/passthrough/arm/iommu.c
> +++ b/xen/drivers/passthrough/arm/iommu.c
> @@ -68,3 +68,9 @@ void arch_iommu_domain_destroy(struct domain *d)
>  {
>      iommu_dt_domain_destroy(d);
>  }
> +
> +int arch_iommu_populate_page_table(struct domain *d)
> +{
> +    /* The IOMMU share the p2m with the CPU */

"shares"

> +    return -ENOSYS;
> +}
> diff --git a/xen/drivers/passthrough/device_tree.c b/xen/drivers/passthrough/device_tree.c
> index 3e47df5..afb4dfc 100644
> --- a/xen/drivers/passthrough/device_tree.c
> +++ b/xen/drivers/passthrough/device_tree.c
> @@ -41,6 +41,13 @@ int iommu_assign_dt_device(struct domain *d, struct dt_device_node *dev)
>      if ( !list_empty(&dev->domain_list) )
>          goto fail;
>  
> +    if ( need_iommu(d) <= 0 )
> +    {
> +        rc = iommu_buildup(d);
> +        if ( rc )
> +            goto fail;
> +    }
> +
>      rc = hd->platform_ops->assign_dt_device(d, dev);
>  
>      if ( rc )
> diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
> index cc12735..2e9b48d 100644
> --- a/xen/drivers/passthrough/iommu.c
> +++ b/xen/drivers/passthrough/iommu.c
> @@ -187,6 +187,31 @@ void iommu_teardown(struct domain *d)
>      tasklet_schedule(&iommu_pt_cleanup_tasklet);
>  }
>  
> +int iommu_buildup(struct domain *d)

Is there really not such a function already? How does x86 set up the
iommu then?

Ian.

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [RFC 04/19] xen/arm: route_irq_to_guest: Check validity of the IRQ
  2014-07-03 11:04   ` Ian Campbell
@ 2014-07-03 11:47     ` Julien Grall
  0 siblings, 0 replies; 122+ messages in thread
From: Julien Grall @ 2014-07-03 11:47 UTC (permalink / raw)
  To: Ian Campbell; +Cc: xen-devel, stefano.stabellini, tim

On 07/03/2014 12:04 PM, Ian Campbell wrote:
> On Mon, 2014-06-16 at 17:17 +0100, Julien Grall wrote:
>> diff --git a/xen/include/asm-arm/irq.h b/xen/include/asm-arm/irq.h
>> index e567f71..63926a5 100644
>> --- a/xen/include/asm-arm/irq.h
>> +++ b/xen/include/asm-arm/irq.h
>> @@ -37,6 +37,12 @@ void do_IRQ(struct cpu_user_regs *regs, unsigned int irq, int is_fiq);
>>  
>>  #define domain_pirq_to_irq(d, pirq) (pirq)
>>  
>> +static inline bool_t is_routable_irq(unsigned int irq)
> 
> is_assignable_irq I think better suits your intention.
> 
> routable doesn't imply "to the guest" since you might also route it to
> Xen.

Hmmm... right. I will do the change in the next version.

Regards,

-- 
Julien Grall
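
A possible shape for the renamed helper, as a sketch only. The body of the original is_routable_irq is not shown in this thread, so the rule below (only SPIs, i.e. IRQ numbers from 32 up to the number of lines the host GIC reports, can be assigned to a guest) is an assumption, and host_nr_lines is a hypothetical stand-in for whatever the GIC driver reports.

```c
#include <stdbool.h>

#define NR_LOCAL_IRQS 32u   /* SGIs (0-15) and PPIs (16-31) are per-CPU */

/* Hypothetical: number of interrupt lines reported by the host GIC. */
static unsigned int host_nr_lines = 160;

/* Assumption: only SPIs that exist on the host may be assigned to a guest. */
static bool is_assignable_irq(unsigned int irq)
{
    return irq >= NR_LOCAL_IRQS && irq < host_nr_lines;
}
```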

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [RFC 12/19] xen/passthrough: iommu_deassign_device_dt: By default reassign device to nobody
  2014-06-16 16:17 ` [RFC 12/19] xen/passthrough: iommu_deassign_device_dt: By default reassign device to nobody Julien Grall
  2014-06-18 19:28   ` Stefano Stabellini
@ 2014-07-03 11:48   ` Ian Campbell
  2014-07-03 12:07     ` Julien Grall
  1 sibling, 1 reply; 122+ messages in thread
From: Ian Campbell @ 2014-07-03 11:48 UTC (permalink / raw)
  To: Julien Grall; +Cc: xen-devel, stefano.stabellini, tim

On Mon, 2014-06-16 at 17:17 +0100, Julien Grall wrote:
> Currently, when the device is deassigned from a domain, we directly reassign
> to DOM0.
> 
> As the device may not have been correctly reset, this may lead to corrupt or
> expose some part of DOM0 memory.

"corruption".

I'd go further and say "and we may have no way to reset some platform
devices".

> If Xen reassigns the device to "nobody", it may receive some global/context
> fault because the transaction has failed (indeed the context has been
> marked invalid).

Can you describe here what happens in this case (I presume Xen tears down
the iommu to quiesce them somehow?)

> DOM0 will have to issue an hypercall to assign the device to itself if it
> wants to use it.
> 
> Signed-off-by: Julien Grall <julien.grall@linaro.org>
> ---
>  xen/drivers/passthrough/arm/smmu.c    |    7 ++++---
>  xen/drivers/passthrough/device_tree.c |    8 +++-----
>  2 files changed, 7 insertions(+), 8 deletions(-)
> 
> diff --git a/xen/drivers/passthrough/arm/smmu.c b/xen/drivers/passthrough/arm/smmu.c
> index f4eb2a2..b25034e 100644
> --- a/xen/drivers/passthrough/arm/smmu.c
> +++ b/xen/drivers/passthrough/arm/smmu.c
> @@ -1245,8 +1245,8 @@ static int arm_smmu_reassign_dt_dev(struct domain *s, struct domain *t,
>  {
>      int ret = 0;
>  
> -    /* Don't allow remapping on other domain than hwdom */
> -    if ( t != hardware_domain )
> +    /* Allow remapping either on the hardware domain or to nothing */
> +    if ( t && t != hardware_domain )
>          return -EPERM;
>  
>      if ( t == s )
> @@ -1256,7 +1256,8 @@ static int arm_smmu_reassign_dt_dev(struct domain *s, struct domain *t,
>      if ( ret )
>          return ret;
>  
> -    ret = arm_smmu_attach_dev(t, dev);
> +    if ( t )
> +        ret = arm_smmu_attach_dev(t, dev);
>  
>      return ret;
>  }
> diff --git a/xen/drivers/passthrough/device_tree.c b/xen/drivers/passthrough/device_tree.c
> index afb4dfc..8a4bc69 100644
> --- a/xen/drivers/passthrough/device_tree.c
> +++ b/xen/drivers/passthrough/device_tree.c
> @@ -75,14 +75,12 @@ int iommu_deassign_dt_device(struct domain *d, struct dt_device_node *dev)
>  
>      spin_lock(&dtdevs_lock);
>  
> -    rc = hd->platform_ops->reassign_dt_device(d, hardware_domain, dev);
> +    rc = hd->platform_ops->reassign_dt_device(d, NULL, dev);
>      if ( rc )
>          goto fail;
>  
> -    list_del(&dev->domain_list);
> -
> -    dt_device_set_used_by(dev, hardware_domain->domain_id);
> -    list_add(&dev->domain_list, &domain_hvm_iommu(hardware_domain)->dt_devices);
> +    list_del_init(&dev->domain_list);
> +    dt_device_set_used_by(dev, DOMID_IO);
>  
>  fail:
>      spin_unlock(&dtdevs_lock);

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [RFC 08/19] xen/dts: Provide an helper to get a DT node from a path provided by a guest
  2014-07-03 11:30   ` Ian Campbell
@ 2014-07-03 11:49     ` Julien Grall
  2014-07-03 12:13       ` Ian Campbell
  0 siblings, 1 reply; 122+ messages in thread
From: Julien Grall @ 2014-07-03 11:49 UTC (permalink / raw)
  To: Ian Campbell; +Cc: xen-devel, stefano.stabellini, tim

Hi Ian,

On 07/03/2014 12:30 PM, Ian Campbell wrote:
> On Mon, 2014-06-16 at 17:17 +0100, Julien Grall wrote:
>> Signed-off-by: Julien Grall <julien.grall@linaro.org>
>> ---
>>  xen/common/device_tree.c      |   19 +++++++++++++++++++
>>  xen/include/xen/device_tree.h |   17 +++++++++++++++++
>>  2 files changed, 36 insertions(+)
>>
>> diff --git a/xen/common/device_tree.c b/xen/common/device_tree.c
>> index 4736e0d..fd95307 100644
>> --- a/xen/common/device_tree.c
>> +++ b/xen/common/device_tree.c
>> @@ -13,6 +13,7 @@
>>  #include <xen/config.h>
>>  #include <xen/types.h>
>>  #include <xen/init.h>
>> +#include <xen/guest_access.h>
>>  #include <xen/device_tree.h>
>>  #include <xen/kernel.h>
>>  #include <xen/lib.h>
>> @@ -661,6 +662,24 @@ struct dt_device_node *dt_find_node_by_path(const char *path)
>>      return np;
>>  }
>>  
>> +int dt_find_node_by_gpath(XEN_GUEST_HANDLE(char) u_path, uint32_t u_plen,
>> +                          struct dt_device_node **node)
> 
> My initial feeling is that this should be a static helper in whichever
> file is using this rather than part of the "dt library" interface.

It's used in 2 different places (see patch #9 and #12).

Regards,

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [RFC 09/19] xen/dts: Add hypercalls to retrieve device node information
  2014-07-03 11:33   ` Ian Campbell
@ 2014-07-03 11:51     ` Julien Grall
  2014-07-03 12:13       ` Ian Campbell
  0 siblings, 1 reply; 122+ messages in thread
From: Julien Grall @ 2014-07-03 11:51 UTC (permalink / raw)
  To: Ian Campbell
  Cc: xen-devel, tim, Ian Jackson, stefano.stabellini, Stefano Stabellini

Hi Ian,

On 07/03/2014 12:33 PM, Ian Campbell wrote:
> On Mon, 2014-06-16 at 17:17 +0100, Julien Grall wrote:
> 
>>  /*
>>   * Local variables:
>>   * mode: C
> 
>> +        struct {
>> +            uint64_t mfn;
>> +            uint64_t nr_mfn;
> 
> xen_pfn_t for both of these I think.

Ok. I will change it in the next version.

>> +        } mmio;
>> +        struct {
>> +            uint32_t clen;          /* IN: Size of buffer. OUT: Size copied */
> 
> clen?

compatible length.

Regards,

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [RFC 14/19] xen/passthrough: dt: Add new domctl XEN_DOMCTL_assign_dt_device
  2014-06-17 13:55           ` Jan Beulich
@ 2014-07-03 11:54             ` Ian Campbell
  0 siblings, 0 replies; 122+ messages in thread
From: Ian Campbell @ 2014-07-03 11:54 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Stefano Stabellini, tim, Julien Grall, Ian Jackson,
	stefano.stabellini, xen-devel

On Tue, 2014-06-17 at 14:55 +0100, Jan Beulich wrote:
> >>> On 17.06.14 at 15:48, <julien.grall@linaro.org> wrote:
> > I'm not sure why we function is used but from the DT pass-through POV,
> > we only need to ask the hypervisor to assign this specific device to the
> > guest. The hypercall will return "ok" if it has succeeded or an error if
> > the device is not protected (i.e not behind an IOMMU). I don't see why
> > we should have a test-assign hypercall in this case...
> 
> The reason for its existence isn't entirely clear to me either - you may
> want to ping whoever added this.

Yes, I was about to suggest trawling through git log a bit.

This may be an old xend-ism perhaps.

I can sort of imagine an early sanity check of the availability of
the devices before building the domain (despite the TOCTOU race perhaps
that's useful?)

The other use case might be to be able to list all assignable devices.

Those are all wild guesses on my part though.

Ian.

> 
> > Anyway, I'm fine to introduce the hypercall for consistency but I don't
> > plan to use it in the toolstack because it's pointless.
> 
> It's the tools maintainers' call in the end, but my advice would be to not
> have it used in the PCI case, but not in the DT one. Either rip it out on
> the PCI side too, or have it be called even if not strictly needed.
> 
> Jan
> 

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [RFC 10/19] xen/passthrough: Introduce iommu_buildup
  2014-07-03 11:45   ` Ian Campbell
@ 2014-07-03 11:55     ` Julien Grall
  0 siblings, 0 replies; 122+ messages in thread
From: Julien Grall @ 2014-07-03 11:55 UTC (permalink / raw)
  To: Ian Campbell; +Cc: xen-devel, stefano.stabellini, tim

On 07/03/2014 12:45 PM, Ian Campbell wrote:
> On Mon, 2014-06-16 at 17:17 +0100, Julien Grall wrote:
>> This new function will correctly initialize the IOMMU page table for the
>> current domain.
> 
> _setup, _configure, _initialise, _construct etc would be more normal
> names I think.

I will rename it to iommu_construct. The function is used to construct the
page table and set need_iommu(d) to 1.

> "buildup" is the half hour of mindless blather from pundits that you
> have to watch before a televised sporting event begins ;-)
> 
>> Also use it in iommu_assign_dt_device even though the current IOMMU
>> implementation on ARM shares P2M with the processor.
>> ---
>>  xen/drivers/passthrough/arm/iommu.c   |    6 ++++++
>>  xen/drivers/passthrough/device_tree.c |    7 +++++++
>>  xen/drivers/passthrough/iommu.c       |   25 +++++++++++++++++++++++++
>>  xen/drivers/passthrough/pci.c         |   12 ++++--------
>>  xen/include/xen/iommu.h               |    2 ++
>>  5 files changed, 44 insertions(+), 8 deletions(-)
>>
>> diff --git a/xen/drivers/passthrough/arm/iommu.c b/xen/drivers/passthrough/arm/iommu.c
>> index 3007b99..de4ed64 100644
>> --- a/xen/drivers/passthrough/arm/iommu.c
>> +++ b/xen/drivers/passthrough/arm/iommu.c
>> @@ -68,3 +68,9 @@ void arch_iommu_domain_destroy(struct domain *d)
>>  {
>>      iommu_dt_domain_destroy(d);
>>  }
>> +
>> +int arch_iommu_populate_page_table(struct domain *d)
>> +{
>> +    /* The IOMMU share the p2m with the CPU */
> 
> "shares"

Ok.

>> +    return -ENOSYS;
>> +}

[..]

>> +int iommu_buildup(struct domain *d)
> 
> Is there really not such a function already? How does x86 set up the
> iommu then?

There is a function to initialize the IOMMU but the activation and the
setup of the page table are done later, I guess to avoid using memory for
nothing.

Regards,

-- 
Julien Grall
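
A rough sketch of what the renamed iommu_construct could look like, based only on the description in this mail (construct the page table, then set need_iommu(d) to 1). The struct and its fields are simplified stand-ins, not Xen's real struct domain, and skipping the populate step when the P2M is shared is an assumption about how the shared case would be handled.

```c
/* Minimal stand-in for Xen's struct domain; only what the sketch needs. */
struct domain {
    int need_iommu;     /* what need_iommu(d) would report */
    int shares_p2m;     /* 1 if the IOMMU shares the CPU's P2M (ARM today) */
    int pt_populated;   /* set once separate IOMMU page tables exist */
};

/* Hypothetical arch hook: build separate IOMMU page tables for d. */
static int arch_iommu_populate_page_table(struct domain *d)
{
    d->pt_populated = 1;
    return 0;
}

static int iommu_construct(struct domain *d)
{
    int rc;

    if (d->need_iommu > 0)
        return 0;               /* already constructed */

    if (!d->shares_p2m) {
        rc = arch_iommu_populate_page_table(d);
        if (rc)
            return rc;
    }

    d->need_iommu = 1;
    return 0;
}
```

With a shared P2M the arch hook is never reached, which would be consistent with the ARM stub in the patch simply returning -ENOSYS.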

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [RFC 15/19] xen/arm: Reserve region in guest memory for device passthrough
  2014-06-18 18:33               ` Julien Grall
  2014-06-18 18:55                 ` Stefano Stabellini
@ 2014-07-03 11:56                 ` Ian Campbell
  1 sibling, 0 replies; 122+ messages in thread
From: Ian Campbell @ 2014-07-03 11:56 UTC (permalink / raw)
  To: Julien Grall; +Cc: xen-devel, tim, stefano.stabellini, Stefano Stabellini

On Wed, 2014-06-18 at 19:33 +0100, Julien Grall wrote:
> 
> On 18/06/14 19:14, Stefano Stabellini wrote:
> >> For non-PCI passthrough this size will unlikely happen. We always map a
> >> matter of few pages per-device.
> >>
> >> I think this size is enough for the time being. I plan to revisit it for
> >> PCI passthrough where we will be able to allocate some of them after 4G.
> >
> > Modern GPUs can easily exceed 512MB and they work on ARM.
> 
> If it's like on x86, passthrough GPUs may also require firmware. At 
> least it looks like the case with some configuration on the Arndale, 
> which doesn't have any SMMU.
> 
> I'd like to make this series as simple as possible so we can get 
> "quickly" a working solution to passthrough device with small amount of 
> MMIO region.
> 
> So if you don't mind I suggest to bump the amount of MMIO region to 1GB, 
> I may need to shrink the first RAM bank, and think about bigger region 
> support in a follow-up series.

I think so long as we aren't painting ourselves into any ABI corners (I
don't think we are) then deferring the handling of large devices is fine.

> 
> Regards,
> 

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [RFC 16/19] libxl/arm: Introduce DT_IRQ_TYPE_*
  2014-06-16 16:18 ` [RFC 16/19] libxl/arm: Introduce DT_IRQ_TYPE_* Julien Grall
@ 2014-07-03 11:56   ` Ian Campbell
  0 siblings, 0 replies; 122+ messages in thread
From: Ian Campbell @ 2014-07-03 11:56 UTC (permalink / raw)
  To: Julien Grall
  Cc: xen-devel, tim, Ian Jackson, stefano.stabellini, Stefano Stabellini

On Mon, 2014-06-16 at 17:18 +0100, Julien Grall wrote:
> Avoid to use hardcode value when the interrupt type is set.
> 
> Signed-off-by: Julien Grall <julien.grall@linaro.org>
> Cc: Ian Jackson <ian.jackson@eu.citrix.com>
> Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
> Cc: Ian Campbell <ian.campbell@citrix.com>

Acked-by: Ian Campbell <ian.campbell@citrix.com>

> ---
>  tools/libxl/libxl_arm.c |   29 +++++++++++++++++++++++++----
>  1 file changed, 25 insertions(+), 4 deletions(-)
> 
> diff --git a/tools/libxl/libxl_arm.c b/tools/libxl/libxl_arm.c
> index 21c3399..1edb87a 100644
> --- a/tools/libxl/libxl_arm.c
> +++ b/tools/libxl/libxl_arm.c
> @@ -6,6 +6,23 @@
>  #include <libfdt.h>
>  #include <assert.h>
>  
> +/**
> + * IRQ line type.
> + * DT_IRQ_TYPE_NONE            - default, unspecified type
> + * DT_IRQ_TYPE_EDGE_RISING     - rising edge triggered
> + * DT_IRQ_TYPE_EDGE_FALLING    - falling edge triggered
> + * DT_IRQ_TYPE_EDGE_BOTH       - rising and falling edge triggered
> + * DT_IRQ_TYPE_LEVEL_HIGH      - high level triggered
> + * DT_IRQ_TYPE_LEVEL_LOW       - low level triggered
> + */
> +#define DT_IRQ_TYPE_NONE           0x00000000
> +#define DT_IRQ_TYPE_EDGE_RISING    0x00000001
> +#define DT_IRQ_TYPE_EDGE_FALLING   0x00000002
> +#define DT_IRQ_TYPE_EDGE_BOTH                           \
> +    (DT_IRQ_TYPE_EDGE_FALLING | DT_IRQ_TYPE_EDGE_RISING)
> +#define DT_IRQ_TYPE_LEVEL_HIGH     0x00000004
> +#define DT_IRQ_TYPE_LEVEL_LOW      0x00000008
> +
>  int libxl__arch_domain_create(libxl__gc *gc, libxl_domain_config *d_config,
>                                uint32_t domid)
>  {
> @@ -338,9 +355,12 @@ static int make_timer_node(libxl__gc *gc, void *fdt, const struct arch_info *ain
>      res = fdt_property_compat(gc, fdt, 1, ainfo->timer_compat);
>      if (res) return res;
>  
> -    set_interrupt_ppi(ints[0], GUEST_TIMER_PHYS_S_PPI, 0xf, 0x8);
> -    set_interrupt_ppi(ints[1], GUEST_TIMER_PHYS_NS_PPI, 0xf, 0x8);
> -    set_interrupt_ppi(ints[2], GUEST_TIMER_VIRT_PPI, 0xf, 0x8);
> +    set_interrupt_ppi(ints[0], GUEST_TIMER_PHYS_S_PPI, 0xf,
> +                      DT_IRQ_TYPE_LEVEL_LOW);
> +    set_interrupt_ppi(ints[1], GUEST_TIMER_PHYS_NS_PPI, 0xf,
> +                      DT_IRQ_TYPE_LEVEL_LOW);
> +    set_interrupt_ppi(ints[2], GUEST_TIMER_VIRT_PPI, 0xf,
> +                      DT_IRQ_TYPE_LEVEL_LOW);
>  
>      res = fdt_property_interrupts(gc, fdt, ints, 3);
>      if (res) return res;
> @@ -378,7 +398,8 @@ static int make_hypervisor_node(libxl__gc *gc, void *fdt,
>       *  - Active-low level-sensitive
>       *  - All cpus
>       */
> -    set_interrupt_ppi(intr, GUEST_EVTCHN_PPI, 0xf, 0x8);
> +    set_interrupt_ppi(intr, GUEST_EVTCHN_PPI, 0xf,
> +                      DT_IRQ_TYPE_LEVEL_LOW);
>  
>      res = fdt_property_interrupts(gc, fdt, &intr, 1);
>      if (res) return res;

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [RFC 17/19] libxl/arm: Rename set_interrupt_ppi to set_interrupt and handle SPIs
  2014-06-16 16:18 ` [RFC 17/19] libxl/arm: Rename set_interrupt_ppi to set_interrupt and handle SPIs Julien Grall
@ 2014-07-03 11:58   ` Ian Campbell
  2014-07-03 12:04     ` Julien Grall
  0 siblings, 1 reply; 122+ messages in thread
From: Ian Campbell @ 2014-07-03 11:58 UTC (permalink / raw)
  To: Julien Grall
  Cc: xen-devel, tim, Ian Jackson, stefano.stabellini, Stefano Stabellini

On Mon, 2014-06-16 at 17:18 +0100, Julien Grall wrote:
> The function will be used later during device passthrough to create
> interrupts in the device tree. Those interrupts are usually SPIs.
> 
> Signed-off-by: Julien Grall <julien.grall@linaro.org>
> Cc: Ian Jackson <ian.jackson@eu.citrix.com>
> Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>

Acked-by: Ian Campbell <ian.campbell@citrix.com>

Could this and the previous patch be applied today? I don't see any
dependency on what went before and they seem reasonable cleanups
regardless of the rest of the series.

Ian.
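
For readers without the patch body at hand, a sketch of how a combined set_interrupt could encode both PPIs and SPIs. This is not the libxl code: it assumes the usual three-cell GIC interrupt specifier (type, number relative to the start of its range, flags), keeps the cells in host byte order for simplicity where a real device tree stores them big-endian, and reuses the DT_IRQ_TYPE_LEVEL_LOW value from the previous patch.

```c
#include <assert.h>
#include <stdint.h>

#define DT_IRQ_TYPE_LEVEL_LOW 0x00000008   /* value from the previous patch */

/* One GIC interrupt specifier: <type number flags>. */
typedef uint32_t gic_interrupt[3];

static void set_interrupt(gic_interrupt intr, unsigned int irq,
                          unsigned int cpumask, unsigned int level)
{
    int is_ppi = irq < 32;

    assert(irq >= 16);      /* SGIs are not described in the device tree */

    intr[0] = is_ppi ? 1 : 0;               /* 1 = PPI, 0 = SPI */
    intr[1] = irq - (is_ppi ? 16 : 32);     /* PPIs start at 16, SPIs at 32 */
    intr[2] = (cpumask << 8) | level;       /* CPU mask is meaningful for PPIs */
}
```

A caller describing an SPI would be expected to pass a cpumask of 0, since the mask bits only apply to PPIs.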

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [RFC 06/19] xen/arm: Implement hypercall PHYSDEVOP_map_pirq
  2014-07-03 11:27         ` Ian Campbell
@ 2014-07-03 12:02           ` Julien Grall
  2014-07-03 12:53             ` Ian Campbell
  0 siblings, 1 reply; 122+ messages in thread
From: Julien Grall @ 2014-07-03 12:02 UTC (permalink / raw)
  To: Ian Campbell, Stefano Stabellini; +Cc: xen-devel, stefano.stabellini, tim

On 07/03/2014 12:27 PM, Ian Campbell wrote:
> On Thu, 2014-06-19 at 13:29 +0100, Stefano Stabellini wrote:
>> On Thu, 19 Jun 2014, Julien Grall wrote:
>>> On 06/18/2014 08:24 PM, Stefano Stabellini wrote:
>>>>>  /*
>>>>> diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c
>>>>> index e451324..c18b2ca 100644
>>>>> --- a/xen/arch/arm/vgic.c
>>>>> +++ b/xen/arch/arm/vgic.c
>>>>> @@ -82,10 +82,7 @@ int domain_vgic_init(struct domain *d)
>>>>>      /* Currently nr_lines in vgic and gic doesn't have the same meanings
>>>>>       * Here nr_lines = number of SPIs
>>>>>       */
>>>>> -    if ( is_hardware_domain(d) )
>>>>> -        d->arch.vgic.nr_lines = gic_number_lines() - 32;
>>>>> -    else
>>>>> -        d->arch.vgic.nr_lines = 0; /* We don't need SPIs for the guest */
>>>>> +    d->arch.vgic.nr_lines = gic_number_lines() - 32;
>>>>>  
>>>>>      d->arch.vgic.shared_irqs =
>>>>>          xzalloc_array(struct vgic_irq_rank, DOMAIN_NR_RANKS(d));
>>>>
>>>> I see what you mean about virq != pirq.
>>>>
>>>> It seems to me that setting d->arch.vgic.nr_lines = gic_number_lines() -
>>>> 32 for the hardware domain is OK, but it is really a waste for the
>>>> others. We could find a way to pass down the info about how many SPIs we
>>>> need from libxl. Or we could delay the vgic allocations until the first
>>>> SPI is assigned to the domU.
>>>
>>> I gave a check on both midway and the versatile express and there is
>>> about 200 lines.
>>>
>>> It make the overhead of less than 8K per domain. Which is not too bad.
>>>
>>> If the host really support 1024 IRQs that would make an overhead of ~32K.
>>>
>>>> Similarly to the MMIO hole sizing, I don't think that it would be a
>>>> requirement for this patch series but it is something to keep in mind.
>>>
>>> Handling virq != pirq will be more complex as we need to take into
>>> account of the hotplug solution.
> 
> What's the issue here? Something to do with irqdesc->irq-pending lookup?
> 
> Seems like irqdesc needs to store the domain and virq number when the
> irq is passed through. I assume it must store the domain already.

The issues are mostly:
	- we need to defer the vGIC IRQs allocation
	- Add a new hypercall to setup the number of IRQs
	- How do we handle hotplug?

>>> The vgic has a register which provide the number of lines, I suspect
>>> this number can't grow up while the guest is running.
>>
>> Of course not. But keep in mind that for non-PCI passthrough we would be
>> fully aware of all the assigned interrupts before starting the VM.
> 
> Are we ruling out hotplug of such devices? (I don't have a problem with
> that BTW)
> 
>> PCI passthrough and MSI-X are the issue because there can be many MSI
>> per device and the device can be hotplugged into the guest.
> 
> MSI(-X) AKA LPIs are in a different more dynamic number space though
> (from 8192 onwards). I think for that specific case we can dynamically
> do things.
> 
> The bigger issue would be the legacy INT-x interrupts (which I expect
> look like SPIs), those would no doubt need exposing somehow.

INT-x is shared between different PCI devices and this will mean lots of rework
in the interrupt code (mostly now with the no maintenance interrupt
series). I hope we won't have to handle them.

> Do we think it is the case that we are eventually going to need a guest
> cfg option pci = 0|1? I think the answer is yes. Assigning a pci device
> would cause pci=1, or you can set pci=1 to enable hotplug of pci devices
> later (i.e. mmio space is reserved, INTx interrupts are assigned etc).

I'm not sure I understand why we would need a "pci" cfg option... For
now, this series doesn't aim to support PCI. So I think we could defer
this problem until later.

-- 
Julien Grall
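
The overhead figures quoted above can be reproduced with simple arithmetic. The per-line cost of 32 bytes below is not taken from the code; it is an assumption back-derived from the "~32K for 1024 IRQs" estimate in this thread.

```c
#include <stddef.h>

/* Assumption: roughly 32 bytes of vGIC state per interrupt line. */
#define ASSUMED_BYTES_PER_IRQ 32u

static size_t vgic_overhead_bytes(unsigned int nr_lines)
{
    return (size_t)nr_lines * ASSUMED_BYTES_PER_IRQ;
}
```

Under that assumption, about 200 lines cost 6400 bytes (under 8K) and 1024 lines cost 32768 bytes, which matches both numbers given in the discussion.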

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [RFC 17/19] libxl/arm: Rename set_interrupt_ppi to set_interrupt and handle SPIs
  2014-07-03 11:58   ` Ian Campbell
@ 2014-07-03 12:04     ` Julien Grall
  2014-07-03 14:04       ` Ian Campbell
  0 siblings, 1 reply; 122+ messages in thread
From: Julien Grall @ 2014-07-03 12:04 UTC (permalink / raw)
  To: Ian Campbell
  Cc: xen-devel, tim, Ian Jackson, stefano.stabellini, Stefano Stabellini

Hi Ian,

On 07/03/2014 12:58 PM, Ian Campbell wrote:
> On Mon, 2014-06-16 at 17:18 +0100, Julien Grall wrote:
>> The function will be used later during device passthrough to create
>> interrupts in the device tree. Those interrupts are usually SPIs.
>>
>> Signed-off-by: Julien Grall <julien.grall@linaro.org>
>> Cc: Ian Jackson <ian.jackson@eu.citrix.com>
>> Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
> 
> Acked-by: Ian Campbell <ian.campbell@citrix.com>
> 
> Could this and the previous patch be applied today? I don't see any
> dependency on what went before and they seem reasonable cleanups
> regardless of the rest of the series.

They have no dependencies on the other patches of the series. You can
apply them without any issue.

Regards,

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [RFC 12/19] xen/passthrough: iommu_deassign_device_dt: By default reassign device to nobody
  2014-07-03 11:48   ` Ian Campbell
@ 2014-07-03 12:07     ` Julien Grall
  2014-07-03 12:53       ` Ian Campbell
  0 siblings, 1 reply; 122+ messages in thread
From: Julien Grall @ 2014-07-03 12:07 UTC (permalink / raw)
  To: Ian Campbell; +Cc: xen-devel, stefano.stabellini, tim

Hi Ian,

On 07/03/2014 12:48 PM, Ian Campbell wrote:
> On Mon, 2014-06-16 at 17:17 +0100, Julien Grall wrote:
>> Currently, when the device is deassigned from a domain, we directly reassign
>> to DOM0.
>>
>> As the device may not have been correctly reset, this may lead to corrupt or
>> expose some part of DOM0 memory.
> 
> "corruption".
> 
> I'd go further and say "and we may have no way to reset some platform
> devices".

Ok.

>> If Xen reassigns the device to "nobody", it may receive some global/context
>> fault because the transaction has failed (indeed the context has been
>> marked invalid).
> 
> Can you describe here what happens in this case (I presume Xen tears down
> the iommu to quiesce them somehow?)

The SMMU driver will mark the different Context Banks, S2CRs and SMRs as
invalid. If the device then attempts to access memory, we will
receive an interrupt in Xen.

Actually it only happens once, if the device is still enabled when the
domain is shut down.

Regards,

-- 
Julien Grall
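
The "reassign to nobody" rule from the patch can be modelled in isolation. The sketch below mirrors the control flow of the quoted arm_smmu_reassign_dt_dev hunk, but the types, the owner field and the -ENODEV check are hypothetical simplifications standing in for the real attach/detach calls.

```c
#include <errno.h>
#include <stddef.h>

struct domain { int id; };
struct device { struct domain *owner; };    /* NULL: attached to nobody */

static struct domain hwdom = { 0 };         /* stand-in for hardware_domain */

static int reassign_dt_dev(struct domain *s, struct domain *t,
                           struct device *dev)
{
    /* Allow remapping either to the hardware domain or to nothing. */
    if (t && t != &hwdom)
        return -EPERM;

    if (t == s)
        return 0;

    /* Detach from the source (hypothetical check for a wrong owner). */
    if (dev->owner != s)
        return -ENODEV;
    dev->owner = NULL;

    /* Only attach when there is a target; NULL leaves the device idle. */
    if (t)
        dev->owner = t;

    return 0;
}
```

After a deassign with t == NULL the device has no owner, so DOM0 has to ask for it explicitly before using it again.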

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [RFC 03/19] xen/arm: follow-up to allow DOM0 manage IRQ and MMIO
  2014-07-03 11:23         ` Julien Grall
@ 2014-07-03 12:12           ` Ian Campbell
  0 siblings, 0 replies; 122+ messages in thread
From: Ian Campbell @ 2014-07-03 12:12 UTC (permalink / raw)
  To: Julien Grall; +Cc: xen-devel, tim, stefano.stabellini, Stefano Stabellini

On Thu, 2014-07-03 at 12:23 +0100, Julien Grall wrote:
> On 07/03/2014 12:02 PM, Ian Campbell wrote:
> > On Wed, 2014-06-18 at 21:32 +0100, Julien Grall wrote:
> >>>> @@ -865,10 +888,9 @@ static int handle_node(struct domain *d, struct kernel_info *kinfo,
> >>>>        *  property. Therefore these device doesn't need to be mapped. This
> >>>>        *  solution can be use later for pass through.
> >>>>        */
> >>>> -    if ( !dt_device_type_is_equal(node, "memory") &&
> >>>> -         dt_device_is_available(node) )
> >>>> +    if ( !dt_device_type_is_equal(node, "memory") )
> >>>>       {
> >>>> -        res = map_device(d, node);
> >>>> +        res = handle_device(d, node, dt_device_is_available(node));
> >>>>
> >>>>           if ( res )
> >>>>               return res;
> >>>
> >>> We need a comment here
> >>
> >> Hmmm... I don't see what kind of comment I can add here. There is 
> >> already lots of comments explaining handle_device and the previous if.
> > 
> > I'd be inclined to push the dt_device_is_available call down into the
> > handle_function. 
> > 
> > Otherwise I would go with
> >     res = make_device_available(node)
> 
> What will do make_device_available?

The tweaking of the iocaps arrays etc (perhaps a badly chosen name given
the dt_....available).

> >     if (!res && dt_...available(node)
> >         res = map_device(node).
> 
> AFAIU, it would means to duplicate the loop to get interrupt/MMIO twice
> which I think it's stupid.
> 
> So, I prefer to push the dt_device_is_available call down into the
> handle_function.

Sure.

Ian.
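
A sketch of the agreed direction, with the availability check pushed down so the interrupt list is walked only once. Everything here is a simplified stand-in: the node and domain types are hypothetical, and the two counters merely represent "permission granted" (the iocaps tweak) and "actually routed" (the mapping).

```c
#include <stdbool.h>

#define MAX_IRQS 8

/* Hypothetical, flattened view of a device tree node. */
struct dt_node {
    bool available;             /* what dt_device_is_available() would say */
    unsigned int nr_irqs;
    unsigned int irqs[MAX_IRQS];
};

/* Hypothetical per-domain bookkeeping. */
struct dom {
    unsigned int permitted;     /* IRQs the domain may manage */
    unsigned int routed;        /* IRQs actually routed to it */
};

static int handle_device(struct dom *d, const struct dt_node *node)
{
    bool map = node->available;     /* checked once, inside the function */
    unsigned int i;

    for (i = 0; i < node->nr_irqs; i++) {
        d->permitted++;             /* always grant permission */
        if (map)
            d->routed++;            /* only map devices that are available */
    }

    return 0;
}
```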

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [RFC 09/19] xen/dts: Add hypercalls to retrieve device node information
  2014-07-03 11:51     ` Julien Grall
@ 2014-07-03 12:13       ` Ian Campbell
  0 siblings, 0 replies; 122+ messages in thread
From: Ian Campbell @ 2014-07-03 12:13 UTC (permalink / raw)
  To: Julien Grall
  Cc: xen-devel, tim, Ian Jackson, stefano.stabellini, Stefano Stabellini

On Thu, 2014-07-03 at 12:51 +0100, Julien Grall wrote:
> Hi Ian,
> 
> On 07/03/2014 12:33 PM, Ian Campbell wrote:
> > On Mon, 2014-06-16 at 17:17 +0100, Julien Grall wrote:
> > 
> >>  /*
> >>   * Local variables:
> >>   * mode: C
> > 
> >> +        struct {
> >> +            uint64_t mfn;
> >> +            uint64_t nr_mfn;
> > 
> > xen_pfn_t for both of these I think.
> 
> Ok. I will change it in the next version.
> 
> >> +        } mmio;
> >> +        struct {
> >> +            uint32_t clen;          /* IN: Size of buffer. OUT: Size copied */
> > 
> > clen?
> 
> compatible length.

I thought you had typo'd clean and didn't understand.

This is in a member called compat isn't it? In that case len would be
fine I think.

Ian.

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [RFC 08/19] xen/dts: Provide an helper to get a DT node from a path provided by a guest
  2014-07-03 11:49     ` Julien Grall
@ 2014-07-03 12:13       ` Ian Campbell
  2014-07-03 12:22         ` Julien Grall
  0 siblings, 1 reply; 122+ messages in thread
From: Ian Campbell @ 2014-07-03 12:13 UTC (permalink / raw)
  To: Julien Grall; +Cc: xen-devel, stefano.stabellini, tim

On Thu, 2014-07-03 at 12:49 +0100, Julien Grall wrote:
> Hi Ian,
> 
> On 07/03/2014 12:30 PM, Ian Campbell wrote:
> > On Mon, 2014-06-16 at 17:17 +0100, Julien Grall wrote:
> >> Signed-off-by: Julien Grall <julien.grall@linaro.org>
> >> ---
> >>  xen/common/device_tree.c      |   19 +++++++++++++++++++
> >>  xen/include/xen/device_tree.h |   17 +++++++++++++++++
> >>  2 files changed, 36 insertions(+)
> >>
> >> diff --git a/xen/common/device_tree.c b/xen/common/device_tree.c
> >> index 4736e0d..fd95307 100644
> >> --- a/xen/common/device_tree.c
> >> +++ b/xen/common/device_tree.c
> >> @@ -13,6 +13,7 @@
> >>  #include <xen/config.h>
> >>  #include <xen/types.h>
> >>  #include <xen/init.h>
> >> +#include <xen/guest_access.h>
> >>  #include <xen/device_tree.h>
> >>  #include <xen/kernel.h>
> >>  #include <xen/lib.h>
> >> @@ -661,6 +662,24 @@ struct dt_device_node *dt_find_node_by_path(const char *path)
> >>      return np;
> >>  }
> >>  
> >> +int dt_find_node_by_gpath(XEN_GUEST_HANDLE(char) u_path, uint32_t u_plen,
> >> +                          struct dt_device_node **node)
> > 
> > My initial feeling is that this should be a static helper in whichever
> > file is using this rather than part of the "dt library" interface.
> 
> > It's used in 2 different places (see patch #9 and #12).

domctl and physdev op, I think?

I suppose device_tree.c isn't so bad, I just felt odd putting guest
specific stuff in there.

Ian.
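
For reference, a minimal sketch of what such a guest-path helper has to do wherever it ends up living: bound the length, copy the bytes, terminate the string itself and only then look the node up. Everything here is a hypothetical stand-in: the sketch_ names, the 256-byte limit, the node names and the error codes are assumptions, and a plain memcpy replaces the real guest copy.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_MAX_PATH 256u         /* assumed upper bound on a DT path */

struct sketch_node { const char *full_name; };

/* Made-up node names, standing in for the host device tree. */
static struct sketch_node sketch_nodes[] = {
    { "/soc/ethernet" },
    { "/soc/serial" },
};

static struct sketch_node *sketch_find_node_by_path(const char *path)
{
    for ( size_t i = 0; i < sizeof(sketch_nodes) / sizeof(sketch_nodes[0]); i++ )
        if ( strcmp(sketch_nodes[i].full_name, path) == 0 )
            return &sketch_nodes[i];
    return NULL;
}

/* Stand-in for the guest copy; the real one can fail on a bad guest address. */
static int sketch_copy_from_guest(char *dst, const char *guest_src, size_t len)
{
    memcpy(dst, guest_src, len);
    return 0;
}

static int sketch_find_node_by_gpath(const char *u_path, uint32_t u_plen,
                                     struct sketch_node **node)
{
    char path[SKETCH_MAX_PATH];

    assert(node != NULL);
    *node = NULL;

    /* Leave room for the terminator and refuse oversized requests. */
    if ( u_plen == 0 || u_plen >= sizeof(path) )
        return -EINVAL;

    if ( sketch_copy_from_guest(path, u_path, u_plen) != 0 )
        return -EFAULT;

    /* Never rely on the guest having terminated the string. */
    path[u_plen] = '\0';

    *node = sketch_find_node_by_path(path);
    return *node ? 0 : -ENOENT;
}
```

Both callers would share this one helper, which is the argument for not duplicating it as a static function in each file.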

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [RFC 08/19] xen/dts: Provide an helper to get a DT node from a path provided by a guest
  2014-07-03 12:13       ` Ian Campbell
@ 2014-07-03 12:22         ` Julien Grall
  0 siblings, 0 replies; 122+ messages in thread
From: Julien Grall @ 2014-07-03 12:22 UTC (permalink / raw)
  To: Ian Campbell; +Cc: xen-devel, stefano.stabellini, tim

On 07/03/2014 01:13 PM, Ian Campbell wrote:
> On Thu, 2014-07-03 at 12:49 +0100, Julien Grall wrote:
>> Hi Ian,
>>
>> On 07/03/2014 12:30 PM, Ian Campbell wrote:
>>> On Mon, 2014-06-16 at 17:17 +0100, Julien Grall wrote:
>>>> Signed-off-by: Julien Grall <julien.grall@linaro.org>
>>>> ---
>>>>  xen/common/device_tree.c      |   19 +++++++++++++++++++
>>>>  xen/include/xen/device_tree.h |   17 +++++++++++++++++
>>>>  2 files changed, 36 insertions(+)
>>>>
>>>> diff --git a/xen/common/device_tree.c b/xen/common/device_tree.c
>>>> index 4736e0d..fd95307 100644
>>>> --- a/xen/common/device_tree.c
>>>> +++ b/xen/common/device_tree.c
>>>> @@ -13,6 +13,7 @@
>>>>  #include <xen/config.h>
>>>>  #include <xen/types.h>
>>>>  #include <xen/init.h>
>>>> +#include <xen/guest_access.h>
>>>>  #include <xen/device_tree.h>
>>>>  #include <xen/kernel.h>
>>>>  #include <xen/lib.h>
>>>> @@ -661,6 +662,24 @@ struct dt_device_node *dt_find_node_by_path(const char *path)
>>>>      return np;
>>>>  }
>>>>  
>>>> +int dt_find_node_by_gpath(XEN_GUEST_HANDLE(char) u_path, uint32_t u_plen,
>>>> +                          struct dt_device_node **node)
>>>
>>> My initial feeling is that this should be a static helper in whichever
>>> file is using this rather than part of the "dt library" interface.
>>
>> It's used in 2 different places (see patch #9 and #12).
> 
> domctl and physdev op, I think?

Yes.

> I suppose device_tree.c isn't so bad, I just felt odd putting guest
> specific stuff in there.

I can move it to arch/arm/guestcopy.c, even though for me it's not arch
specific but device tree specific.

Regards,

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [RFC 06/19] xen/arm: Implement hypercall PHYSDEVOP_map_pirq
  2014-07-03 12:02           ` Julien Grall
@ 2014-07-03 12:53             ` Ian Campbell
  2014-07-15 13:01               ` Julien Grall
  0 siblings, 1 reply; 122+ messages in thread
From: Ian Campbell @ 2014-07-03 12:53 UTC (permalink / raw)
  To: Julien Grall; +Cc: xen-devel, tim, stefano.stabellini, Stefano Stabellini

On Thu, 2014-07-03 at 13:02 +0100, Julien Grall wrote:
> >>> Handling virq != pirq will be more complex as we need to take the
> >>> hotplug solution into account.
> > 
> > What's the issue here? Something to do with irqdesc->irq-pending lookup?
> > 
> > Seems like irqdesc needs to store the domain and virq number when the
> > irq is passed through. I assume it must store the domain already.
> 
> The issues are mostly:
> 	- we need to defer the vGIC IRQs allocation
> 	- Add a new hypercall to setup the number of IRQs
> 	- How do we handle hotplug?

Those all sound like issues with having dynamic numbers of irqs, rather
than with a non-1:1 virq<->pirq mapping per-se.

> >>> The vgic has a register which provides the number of lines; I suspect
> >>> this number can't grow while the guest is running.
> >>
> >> Of course not. But keep in mind that for non-PCI passthrough we would be
> >> fully aware of all the assigned interrupts before starting the VM.
> > 
> > Are we ruling out hotplug of such devices? (I don't have a problem with
> > that BTW)
> > 
> >> PCI passthrough and MSI-X are the issue because there can be many MSI
> >> per device and the device can be hotplugged into the guest.
> > 
> > MSI(-X) AKA LPIs are in a different more dynamic number space though
> > (from 8192 onwards). I think for that specific case we can dynamically
> > do things.
> > 
> > The bigger issue would be the legacy INT-x interrupts (which I expect
> > look like SPIs), those would no doubt need exposing somehow.
> 
> INT-x is shared between different PCI devices and this will mean lots of
> rework in the interrupt code (especially now with the no maintenance
> interrupt series). I hope we won't have to handle them.

I've no idea what PCI on ARM is going to be like in this respect. Xgene
PCI controller exports them though and I suppose if the h/w wants to
support all PCI devices it might be needed.

Not sure how PCI-X changes that though.

> 
> > Do we think it is the case that we are eventually going to need a guest
> > cfg option pci = 0|1? I think the answer is yes. Assigning a pci device
> > would cause pci=1, or you can set pci=1 to enable hotplug of pci devices
> > later (i.e. mmio space is reserved, INTx interrupts are assigned etc).
> 
> I'm not sure I understand why we would need a "pci" cfg option... For
> now, this series doesn't aim to support PCI. So I think we could defer
> this problem until later.

Yeah, we got onto PCI somehow.

So long as we are happy not being able to hotplug platform devices I
think we don't need an equivalent option (the point would be to only
setup huge numbers of SPIs, reserve MMIO space etc if it was going to be
used).
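
To make the number spaces mentioned above concrete, here is a small sketch of how GIC interrupt IDs split into ranges: SGIs 0-15, PPIs 16-31, SPIs 32-1019, special IDs 1020-1023, and LPIs from 8192 onwards. The enum and function names are made up for illustration and are not taken from Xen.

```c
#include <stdint.h>

/* Illustrative names only; these are not Xen identifiers. */
enum gic_irq_kind {
    GIC_KIND_SGI,       /* 0-15: software generated, per CPU */
    GIC_KIND_PPI,       /* 16-31: private peripheral, per CPU */
    GIC_KIND_SPI,       /* 32-1019: shared peripheral */
    GIC_KIND_SPECIAL,   /* 1020-1023: special IDs such as spurious */
    GIC_KIND_RESERVED,  /* 1024-8191: not used for interrupts */
    GIC_KIND_LPI,       /* 8192 and up: message based (MSI/MSI-X) */
};

static enum gic_irq_kind gic_classify(uint32_t intid)
{
    if ( intid < 16 )
        return GIC_KIND_SGI;
    if ( intid < 32 )
        return GIC_KIND_PPI;
    if ( intid < 1020 )
        return GIC_KIND_SPI;
    if ( intid < 1024 )
        return GIC_KIND_SPECIAL;
    if ( intid < 8192 )
        return GIC_KIND_RESERVED;
    return GIC_KIND_LPI;
}
```

This is why legacy INTx lines, which would show up as SPIs, compete for a small fixed range, while MSIs live in the much larger LPI space.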

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [RFC 12/19] xen/passthrough: iommu_deassign_device_dt: By default reassign device to nobody
  2014-07-03 12:07     ` Julien Grall
@ 2014-07-03 12:53       ` Ian Campbell
  2014-07-03 13:01         ` Julien Grall
  0 siblings, 1 reply; 122+ messages in thread
From: Ian Campbell @ 2014-07-03 12:53 UTC (permalink / raw)
  To: Julien Grall; +Cc: xen-devel, stefano.stabellini, tim

On Thu, 2014-07-03 at 13:07 +0100, Julien Grall wrote:
> >> If Xen reassigns the device to "nobody", it may receive some global/context
> >> fault because the transaction has failed (indeed the context has been
> >> marked invalid).
> > 
> > Can you describe here what happens in this case (I presume Xen tears down
> > the iommu to quiesce them somehow?)
> 
> The SMMU drivers will mark the different Context Banks, S2CRs and SMRs as
> invalid. If the device then attempts to access memory, we will
> receive an interrupt in Xen.
> 
> Actually it only happens once, if the device is still enabled when the
> domain is shut down.

My concern was with getting a storm of such interrupts after this point.
If it only happens once and any subsequent ones are damped by some means
then great.

Ian.

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [RFC 12/19] xen/passthrough: iommu_deassign_device_dt: By default reassign device to nobody
  2014-07-03 12:53       ` Ian Campbell
@ 2014-07-03 13:01         ` Julien Grall
  2014-07-03 13:42           ` Ian Campbell
  0 siblings, 1 reply; 122+ messages in thread
From: Julien Grall @ 2014-07-03 13:01 UTC (permalink / raw)
  To: Ian Campbell; +Cc: xen-devel, stefano.stabellini, tim

On 07/03/2014 01:53 PM, Ian Campbell wrote:
> On Thu, 2014-07-03 at 13:07 +0100, Julien Grall wrote:
>>>> If Xen reassigns the device to "nobody", it may receive some global/context
>>>> fault because the transaction has failed (indeed the context has been
>>>> marked invalid).
>>>
>>> Can you describe here what happens in this case (I presume Xen tears down
>>> the iommu to quiesce them somehow?)
>>
>> The SMMU drivers will mark the different Context Banks, S2CRs and SMRs as
>> invalid. If the device then attempts to access memory, we will
>> receive an interrupt in Xen.
>>
>> Actually it only happens once, if the device is still enabled when the
>> domain is shut down.
> 
> My concern was with getting a storm of such interrupts after this point.
> If it only happens once and any subsequent ones are damped by some means
> then great.

I guess it can happen with a buggy device trying to access memory
on its own. But I don't think we should care about this case.

Regards,

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [RFC 12/19] xen/passthrough: iommu_deassign_device_dt: By default reassign device to nobody
  2014-07-03 13:01         ` Julien Grall
@ 2014-07-03 13:42           ` Ian Campbell
  2014-07-03 13:51             ` Julien Grall
  0 siblings, 1 reply; 122+ messages in thread
From: Ian Campbell @ 2014-07-03 13:42 UTC (permalink / raw)
  To: Julien Grall; +Cc: xen-devel, stefano.stabellini, tim

On Thu, 2014-07-03 at 14:01 +0100, Julien Grall wrote:
> On 07/03/2014 01:53 PM, Ian Campbell wrote:
> > On Thu, 2014-07-03 at 13:07 +0100, Julien Grall wrote:
> >>>> If Xen reassigns the device to "nobody", it may receive some global/context
> >>>> fault because the transaction has failed (indeed the context has been
> >>>> marked invalid).
> >>>
> >>> Can you describe here what happens in this case (I presume Xen tears down
> >>> the iommu to quiesce them somehow?)
> >>
> >> The SMMU drivers will mark the different Context Banks, S2CRs and SMRs as
> >> invalid. If the device then attempts to access memory, we will
> >> receive an interrupt in Xen.
> >>
> >> Actually it only happens once, if the device is still enabled when the
> >> domain is shut down.
> > 
> > My concern was with getting a storm of such interrupts after this point.
> > If it only happens once and any subsequent ones are damped by some means
> > then great.
> 
> I guess, it can happen with a buggy device trying to access memory
> alone. But I don't think we should care about this case.

Ideally such a device wouldn't be able to DoS the rest of the system.

Does the SMMU not have a bit to say: deny all MMIO from this context
without raising an exception?

Ian.

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [RFC 12/19] xen/passthrough: iommu_deassign_device_dt: By default reassign device to nobody
  2014-07-03 13:42           ` Ian Campbell
@ 2014-07-03 13:51             ` Julien Grall
  2014-07-03 14:04               ` Ian Campbell
  0 siblings, 1 reply; 122+ messages in thread
From: Julien Grall @ 2014-07-03 13:51 UTC (permalink / raw)
  To: Ian Campbell; +Cc: xen-devel, stefano.stabellini, tim

On 07/03/2014 02:42 PM, Ian Campbell wrote:
> On Thu, 2014-07-03 at 14:01 +0100, Julien Grall wrote:
>> On 07/03/2014 01:53 PM, Ian Campbell wrote:
>>> On Thu, 2014-07-03 at 13:07 +0100, Julien Grall wrote:
>>>>>> If Xen reassigns the device to "nobody", it may receive some global/context
>>>>>> fault because the transaction has failed (indeed the context has been
>>>>>> marked invalid).
>>>>>
>>>>> Can you describe here what happens in this case (I presume Xen tears down
>>>>> the iommu to quiesce them somehow?)
>>>>
>>>> The SMMU drivers will mark the different Context Banks, S2CRs and SMRs as
>>>> invalid. If the device then attempts to access memory, we will
>>>> receive an interrupt in Xen.
>>>>
>>>> Actually it only happens once, if the device is still enabled when the
>>>> domain is shut down.
>>>
>>> My concern was with getting a storm of such interrupts after this point.
>>> If it only happens once and any subsequent ones are damped by some means
>>> then great.
>>
>> I guess, it can happen with a buggy device trying to access memory
>> alone. But I don't think we should care about this case.
> 
> Ideally such a device wouldn't be able to DoS the rest of the system.
> 
> Does the SMMU not have a bit to say: deny all MMIO from this context
> without raising an exception?

AFAIK, no. We receive a transaction fault via the global interrupt. If
we disable this interrupt we also disable potentially helpful messages
when the registers are misconfigured.

Regards,


-- 
Julien Grall

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [RFC 12/19] xen/passthrough: iommu_deassign_device_dt: By default reassign device to nobody
  2014-07-03 13:51             ` Julien Grall
@ 2014-07-03 14:04               ` Ian Campbell
  2014-07-03 14:09                 ` Julien Grall
  0 siblings, 1 reply; 122+ messages in thread
From: Ian Campbell @ 2014-07-03 14:04 UTC (permalink / raw)
  To: Julien Grall; +Cc: xen-devel, stefano.stabellini, tim

On Thu, 2014-07-03 at 14:51 +0100, Julien Grall wrote:
> On 07/03/2014 02:42 PM, Ian Campbell wrote:
> > On Thu, 2014-07-03 at 14:01 +0100, Julien Grall wrote:
> >> On 07/03/2014 01:53 PM, Ian Campbell wrote:
> >>> On Thu, 2014-07-03 at 13:07 +0100, Julien Grall wrote:
> >>>>>> If Xen reassigns the device to "nobody", it may receive some global/context
> >>>>>> fault because the transaction has failed (indeed the context has been
> >>>>>> marked invalid).
> >>>>>
> >>>>> Can you describe here what happens in this case (I presume Xen tears down
> >>>>> the iommu to quiesce them somehow?)
> >>>>
> >>>> The SMMU drivers will mark the different Context Banks, S2CRs and SMRs as
> >>>> invalid. If the device then attempts to access memory, we will
> >>>> receive an interrupt in Xen.
> >>>>
> >>>> Actually it only happens once, if the device is still enabled when the
> >>>> domain is shut down.
> >>>
> >>> My concern was with getting a storm of such interrupts after this point.
> >>> If it only happens once and any subsequent ones are damped by some means
> >>> then great.
> >>
> >> I guess, it can happen with a buggy device trying to access memory
> >> alone. But I don't think we should care about this case.
> > 
> > Ideally such a device wouldn't be able to DoS the rest of the system.
> > 
> > Does the SMMU not have a bit to say: deny all MMIO from this context
> > without raising an exception?
> 
> AFAIK, no. We receive a transaction fault via the global interrupt. If
> we disable this interrupt we also disable potentially helpful messages
> when the registers are misconfigured.

That seems like something of a hardware shortcoming.

It might be worth asking one of the ARM guys what we should do with a
device which won't shut up.

Ian.

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [RFC 17/19] libxl/arm: Rename set_interrupt_ppi to set_interrupt and handle SPIs
  2014-07-03 12:04     ` Julien Grall
@ 2014-07-03 14:04       ` Ian Campbell
  0 siblings, 0 replies; 122+ messages in thread
From: Ian Campbell @ 2014-07-03 14:04 UTC (permalink / raw)
  To: Julien Grall
  Cc: xen-devel, tim, Ian Jackson, stefano.stabellini, Stefano Stabellini

On Thu, 2014-07-03 at 13:04 +0100, Julien Grall wrote:
> Hi Ian,
> 
> On 07/03/2014 12:58 PM, Ian Campbell wrote:
> > On Mon, 2014-06-16 at 17:18 +0100, Julien Grall wrote:
> >> The function will be used later during device passthrough to create
> >> interrupts in the device tree. Those interrupts are usually SPIs.
> >>
> >> Signed-off-by: Julien Grall <julien.grall@linaro.org>
> >> Cc: Ian Jackson <ian.jackson@eu.citrix.com>
> >> Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
> > 
> > Acked-by: Ian Campbell <ian.campbell@citrix.com>
> > 
> > Could this and the previous patch be applied today? I don't see any
> > dependency on what went before and they seem reasonable cleanups
> > regardless of the rest of the series.
> 
> They have no dependencies on the other patches of the series. You can
> apply them without any issue.

I have done so.

Thanks,
Ian.

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [RFC 12/19] xen/passthrough: iommu_deassign_device_dt: By default reassign device to nobody
  2014-07-03 14:04               ` Ian Campbell
@ 2014-07-03 14:09                 ` Julien Grall
  0 siblings, 0 replies; 122+ messages in thread
From: Julien Grall @ 2014-07-03 14:09 UTC (permalink / raw)
  To: Ian Campbell; +Cc: xen-devel, stefano.stabellini, tim

On 07/03/2014 03:04 PM, Ian Campbell wrote:
> On Thu, 2014-07-03 at 14:51 +0100, Julien Grall wrote:
>> On 07/03/2014 02:42 PM, Ian Campbell wrote:
>>> On Thu, 2014-07-03 at 14:01 +0100, Julien Grall wrote:
>>>> On 07/03/2014 01:53 PM, Ian Campbell wrote:
>>>>> On Thu, 2014-07-03 at 13:07 +0100, Julien Grall wrote:
>>>>>>>> If Xen reassigns the device to "nobody", it may receive some global/context
>>>>>>>> fault because the transaction has failed (indeed the context has been
>>>>>>>> marked invalid).
>>>>>>>
>>>>>>> Can you describe here what happens in this case (I presume Xen tears down
>>>>>>> the iommu to quiesce them somehow?)
>>>>>>
>>>>>> The SMMU drivers will mark the different Context Banks, S2CRs and SMRs as
>>>>>> invalid. If the device then attempts to access memory, we will
>>>>>> receive an interrupt in Xen.
>>>>>>
>>>>>> Actually it only happens once, if the device is still enabled when the
>>>>>> domain is shut down.
>>>>>
>>>>> My concern was with getting a storm of such interrupts after this point.
>>>>> If it only happens once and any subsequent ones are damped by some means
>>>>> then great.
>>>>
>>>> I guess, it can happen with a buggy device trying to access memory
>>>> alone. But I don't think we should care about this case.
>>>
>>> Ideally such a device wouldn't be able to DoS the rest of the system.
>>>
>>> Does the SMMU not have a bit to say: deny all MMIO from this context
>>> without raising an exception?
>>
>> AFAIK, no. We receive a transaction fault via the global interrupt. If
>> we disable this interrupt we also disable potentially helpful messages
>> when the registers are misconfigured.
> 
> That seems like something of a hardware shortcoming.
> 
> It might be worth asking one of the ARM guys what we should do with a
> device which won't shut up.

I will send an email to Marc & Will.

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [RFC 06/19] xen/arm: Implement hypercall PHYSDEVOP_map_pirq
  2014-07-03 12:53             ` Ian Campbell
@ 2014-07-15 13:01               ` Julien Grall
  2014-07-15 13:03                 ` Ian Campbell
  0 siblings, 1 reply; 122+ messages in thread
From: Julien Grall @ 2014-07-15 13:01 UTC (permalink / raw)
  To: Ian Campbell; +Cc: xen-devel, tim, stefano.stabellini, Stefano Stabellini

Hi Ian,

Sorry, I forgot to answer this mail.

On 03/07/14 13:53, Ian Campbell wrote:
> On Thu, 2014-07-03 at 13:02 +0100, Julien Grall wrote:
>>>>> Handling virq != pirq will be more complex as we need to take the
>>>>> hotplug solution into account.
>>>
>>> What's the issue here? Something to do with irqdesc->irq-pending lookup?
>>>
>>> Seems like irqdesc needs to store the domain and virq number when the
>>> irq is passed through. I assume it must store the domain already.
>>
>> The issues are mostly:
>> 	- we need to defer the vGIC IRQs allocation
>> 	- Add a new hypercall to setup the number of IRQs
>> 	- How do we handle hotplug?
>
> Those all sound like issues with having dynamic numbers of irqs, rather
> than with a non-1:1 virq<->pirq mapping per-se.

Yes. I will stick with a static allocation and 1:1 mapping for this first
version. We could rework it when PCI passthrough is added.

>>
>>> Do we think it is the case that we are eventually going to need a guest
>>> cfg option pci = 0|1? I think the answer is yes. Assigning a pci device
>>> would cause pci=1, or you can set pci=1 to enable hotplug of pci devices
>>> later (i.e. mmio space is reserved, INTx interrupts are assigned etc).
>>
>> I'm not sure I understand why we would need a "pci" cfg option... For
>> now, this series doesn't aim to support PCI. So I think we could defer
>> this problem until later.
>
> Yeah, we got onto PCI somehow.
>
> So long as we are happy not being able to hotplug platform devices I
> think we don't need an equivalent option (the point would be to only
> setup huge numbers of SPIs, reserve MMIO space etc if it was going to be
> used).

We can reuse the number of IRQs used by the HW. The current series
allocates SPIs for the guest even if we don't plan to pass through a device.
Are we fine with this drawback for Xen 4.5?

Regards,

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [RFC 06/19] xen/arm: Implement hypercall PHYSDEVOP_map_pirq
  2014-07-15 13:01               ` Julien Grall
@ 2014-07-15 13:03                 ` Ian Campbell
  2014-08-18 19:20                   ` Andrii Tseglytskyi
  0 siblings, 1 reply; 122+ messages in thread
From: Ian Campbell @ 2014-07-15 13:03 UTC (permalink / raw)
  To: Julien Grall; +Cc: xen-devel, tim, stefano.stabellini, Stefano Stabellini

On Tue, 2014-07-15 at 14:01 +0100, Julien Grall wrote:
> >>> Do we think it is the case that we are eventually going to need a guest
> >>> cfg option pci = 0|1? I think the answer is yes. Assigning a pci device
> >>> would cause pci=1, or you can set pci=1 to enable hotplug of pci devices
> >>> later (i.e. mmio space is reserved, INTx interrupts are assigned etc).
> >>
> >> I'm not sure I understand why we would need a "pci" cfg option... For
> >> now, this series doesn't aim to support PCI. So I think we could defer
> >> this problem until later.
> >
> > Yeah, we got onto PCI somehow.
> >
> > So long as we are happy not being able to hotplug platform devices I
> > think we don't need an equivalent option (the point would be to only
> > setup huge numbers of SPIs, reserve MMIO space etc if it was going to be
> > used).
> 
> We can reuse the number of IRQs used by the HW. The current series
> allocates SPIs for the guest even if we don't plan to pass through a device.
> Are we fine with this drawback for Xen 4.5?

I meant no need for an equivalent user facing option. I'd much prefer it
if the toolstack did the Right Thing and configured the number of SPIs
for each guest at domain build time, based on its knowledge of the
devices configured for passthrough.

Ian.
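
A minimal sketch of that toolstack-side calculation, assuming the 1:1 virq == pirq mapping planned for this first version, so that the guest needs enough SPIs to cover the highest passed-through interrupt. The function name is hypothetical; rounding up to a multiple of 32 reflects how the GIC distributor reports its line count in steps of 32.

```c
#include <assert.h>
#include <stddef.h>

#define NR_LOCAL_IRQS 32u   /* SGIs (0-15) + PPIs (16-31) */

/*
 * Hypothetical helper: given the physical IRQs of the devices configured
 * for passthrough, return how many SPIs the guest vGIC must expose.
 */
static unsigned int sketch_nr_spis(const unsigned int *irqs, size_t n)
{
    unsigned int nr = 0;

    assert(irqs != NULL || n == 0);

    for ( size_t i = 0; i < n; i++ )
    {
        if ( irqs[i] < NR_LOCAL_IRQS )
            continue;                       /* per-CPU interrupt, not an SPI */
        if ( irqs[i] - NR_LOCAL_IRQS + 1 > nr )
            nr = irqs[i] - NR_LOCAL_IRQS + 1;
    }

    /* Round up to the next multiple of 32. */
    return (nr + 31u) & ~31u;
}
```

A guest with no passthrough devices gets 0 SPIs this way, which avoids allocating vGIC state that is never used.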

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [RFC 06/19] xen/arm: Implement hypercall PHYSDEVOP_map_pirq
  2014-07-15 13:03                 ` Ian Campbell
@ 2014-08-18 19:20                   ` Andrii Tseglytskyi
  2014-08-18 21:55                     ` Julien Grall
  0 siblings, 1 reply; 122+ messages in thread
From: Andrii Tseglytskyi @ 2014-08-18 19:20 UTC (permalink / raw)
  To: Ian Campbell
  Cc: xen-devel, Julien Grall, Tim Deegan, Stefano Stabellini,
	Stefano Stabellini

Hi All,

Could someone tell me what the future of this patch series is? Are
you going to post the patches as non-RFC? Will it be merged into Xen 4.5?
I'm asking because it is *very useful* for the development we are doing at
GlobalLogic. In our current state we need to route some HW IRQs to
DomU (Android) and we need them mapped 1 to 1.

For now we use a similar patch series -
http://lists.xen.org/archives/html/xen-devel/2013-07/msg01146.html but
it looks like it will not be merged.

Regards,
Andrii

On Tue, Jul 15, 2014 at 4:03 PM, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> On Tue, 2014-07-15 at 14:01 +0100, Julien Grall wrote:
>> >>> Do we think it is the case that we are eventually going to need a guest
>> >>> cfg option pci = 0|1? I think the answer is yes. Assigning a pci device
>> >>> would cause pci=1, or you can set pci=1 to enable hotplug of pci devices
>> >>> later (i.e. mmio space is reserved, INTx interrupts are assigned etc).
>> >>
>> >> I'm not sure I understand why we would need a "pci" cfg option... For
>> >> now, this series doesn't aim to support PCI. So I think we could defer
>> >> this problem until later.
>> >
>> > Yeah, we got onto PCI somehow.
>> >
>> > So long as we are happy not being able to hotplug platform devices I
>> > think we don't need an equivalent option (the point would be to only
>> > setup huge numbers of SPIs, reserve MMIO space etc if it was going to be
>> > used).
>>
>> We can reuse the number of IRQs used by the HW. The current series
>> allocates SPIs for the guest even if we don't plan to pass through a device.
>> Are we fine with this drawback for Xen 4.5?
>
> I meant no need for an equivalent user facing option. I'd much prefer it
> if the toolstack did the Right Thing and configured the number of SPIs
> for each guest at domain build time, based on its knowledge of the
> devices configured for passthrough.
>
> Ian.



-- 

Andrii Tseglytskyi | Embedded Dev
GlobalLogic
www.globallogic.com

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [RFC 06/19] xen/arm: Implement hypercall PHYSDEVOP_map_pirq
  2014-08-18 19:20                   ` Andrii Tseglytskyi
@ 2014-08-18 21:55                     ` Julien Grall
  2014-08-19  9:11                       ` Andrii Tseglytskyi
  0 siblings, 1 reply; 122+ messages in thread
From: Julien Grall @ 2014-08-18 21:55 UTC (permalink / raw)
  To: Andrii Tseglytskyi, Ian Campbell
  Cc: xen-devel, Tim Deegan, Stefano Stabellini, Stefano Stabellini



On 18/08/14 14:20, Andrii Tseglytskyi wrote:
> Hi All,

Hello,

> Could someone answer - what is the future of this patch series? Are
> you going to post patches as non RFC ?

I sent a v2 a couple of weeks ago:

https://patches.linaro.org/34666/

> Will it be merged to Xen 4.5 ?

I hope so. If I can't get the whole series in Xen 4.5, I will at least 
try to get the interrupt assignment in it.

> I'm asking because it is *very useful* for development we have in
> GlobalLogic. In our current state we need to route some HW irqs to
> domainU (Android) and we need them 1 to 1.

The new approach allocates the virtual IRQ number dynamically. I chose
this solution because otherwise Xen allocates memory which is never
used.

You could hack this patch to support 1:1 (see vgic_allocate_virq and 
vgic_free_virq).

OOI, why do you need Virtual IRQ == Physical IRQ?

> For now we use similar patch series -
> http://lists.xen.org/archives/html/xen-devel/2013-07/msg01146.html but
> looks like it will not be merged

IIRC, there were a few comments on the v2 and no v3 has been sent since.

Regards,

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [RFC 06/19] xen/arm: Implement hypercall PHYSDEVOP_map_pirq
  2014-08-18 21:55                     ` Julien Grall
@ 2014-08-19  9:11                       ` Andrii Tseglytskyi
  2014-08-19 14:24                         ` Julien Grall
  0 siblings, 1 reply; 122+ messages in thread
From: Andrii Tseglytskyi @ 2014-08-19  9:11 UTC (permalink / raw)
  To: Julien Grall
  Cc: xen-devel, Tim Deegan, Ian Campbell, Stefano Stabellini,
	Stefano Stabellini

Hi Julien,

On Tue, Aug 19, 2014 at 12:55 AM, Julien Grall <julien.grall@linaro.org> wrote:
>
>
> On 18/08/14 14:20, Andrii Tseglytskyi wrote:
>>
>> Hi All,
>
>
> Hello,
>
>
>> Could someone answer - what is the future of this patch series? Are
>> you going to post patches as non RFC ?
>
>
> I've sent a v2 a couple of weeks ago:
>
> https://patches.linaro.org/34666/
>
>
>> Will it be merged to Xen 4.5 ?
>
> I hope so. If I can't get the whole series in Xen 4.5, I will at least try
> to get the interrupt assignment in it.
>

Sounds great. Thank you.

>
>> I'm asking because it is *very useful* for development we have in
>> GlobalLogic. In our current state we need to route some HW irqs to
>> domainU (Android) and we need them 1 to 1.
>
>
> The new approach allocate dynamically the virtual IRQ number. I chose this
> solution because otherwise Xen is allocating memory which is never used.
>
> You could hack this patch to support 1:1 (see vgic_allocate_virq and
> vgic_free_virq).
>
> OOI, why do you need Virtual IRQ == Physical IRQ?
>

Because I have a few devices in DomU which use hardcoded IRQ numbers.

BTW - which code allocates the IRQ number dynamically? My code is almost
the same and I have 1 to 1 in DomU.

Regards,
Andrii

>
>> For now we use similar patch series -
>> http://lists.xen.org/archives/html/xen-devel/2013-07/msg01146.html but
>> looks like it will not be merged
>
>
> IIRC, there was few comments in the v2 and no v3 has been sent after.
>
> Regards,
>
> --
> Julien Grall



-- 

Andrii Tseglytskyi | Embedded Dev
GlobalLogic
www.globallogic.com

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [RFC 06/19] xen/arm: Implement hypercall PHYSDEVOP_map_pirq
  2014-08-19  9:11                       ` Andrii Tseglytskyi
@ 2014-08-19 14:24                         ` Julien Grall
  0 siblings, 0 replies; 122+ messages in thread
From: Julien Grall @ 2014-08-19 14:24 UTC (permalink / raw)
  To: Andrii Tseglytskyi
  Cc: xen-devel, Tim Deegan, Ian Campbell, Stefano Stabellini,
	Stefano Stabellini


Hello Andrii,


On 19/08/14 04:11, Andrii Tseglytskyi wrote:
>> OOI, why do you need Virtual IRQ == Physical IRQ?
>>
>
> Because I have few devices in DomU, which use hardcoded IRQ numbers.
>
> BTW - which code allocates IRQ number dynamically? My code is almost
> the same and I have 1 to 1 in domU ?

With this series, the toolstack decides the number of SPIs (see the
xen_domctl_configure_domain domctl).

Then when map_pirq is called, the function vgic_allocate_virq allocates
the SPI number.
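
As an illustration of the difference between the two schemes, here is a small bitmap allocator sketch. It is not the code from the series: the sketch_ names, the 64-SPI cap and the struct layout are assumptions. The first function hands out the lowest free virtual SPI (the dynamic approach), the second one claims a specific number, which is what a 1:1 hack would need.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_LOCAL_IRQS 32u   /* SGIs + PPIs come before the first SPI */

/* Hypothetical per-domain state; bit i of 'allocated' tracks SPI 32 + i. */
struct vgic_sketch {
    unsigned int nr_spis;            /* chosen by the toolstack, <= 64 here */
    unsigned long long allocated;
};

/* Dynamic scheme: return the lowest free virtual SPI, or -1 if none is left. */
static int sketch_allocate_virq(struct vgic_sketch *v)
{
    assert(v != NULL && v->nr_spis <= 64);

    for ( unsigned int i = 0; i < v->nr_spis; i++ )
    {
        if ( !(v->allocated & (1ULL << i)) )
        {
            v->allocated |= 1ULL << i;
            return (int)(NR_LOCAL_IRQS + i);
        }
    }
    return -1;
}

/* 1:1 scheme: claim exactly 'virq' (== the physical IRQ) if it is free. */
static bool sketch_reserve_virq(struct vgic_sketch *v, unsigned int virq)
{
    assert(v != NULL && v->nr_spis <= 64);

    if ( virq < NR_LOCAL_IRQS || virq - NR_LOCAL_IRQS >= v->nr_spis )
        return false;                /* outside the guest's SPI range */
    if ( v->allocated & (1ULL << (virq - NR_LOCAL_IRQS)) )
        return false;                /* already in use */

    v->allocated |= 1ULL << (virq - NR_LOCAL_IRQS);
    return true;
}

static void sketch_free_virq(struct vgic_sketch *v, unsigned int virq)
{
    assert(v != NULL);

    if ( virq >= NR_LOCAL_IRQS && virq - NR_LOCAL_IRQS < v->nr_spis )
        v->allocated &= ~(1ULL << (virq - NR_LOCAL_IRQS));
}
```

Note that the 1:1 variant only works if the toolstack sized nr_spis to cover the highest physical IRQ, which is the memory cost the dynamic scheme tries to avoid.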

Regards,

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 122+ messages in thread

end of thread, other threads:[~2014-08-19 16:06 UTC | newest]

Thread overview: 122+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-06-16 16:17 [RFC 00/19] xe/arm: Add support for non-pci passthrough Julien Grall
2014-06-16 16:17 ` [RFC 01/19] xen/arm: guest_physmap_remove_page: Print a warning if we fail to unmap the page Julien Grall
2014-06-18 15:03   ` Stefano Stabellini
2014-07-03 10:52   ` Ian Campbell
2014-07-03 11:17     ` Julien Grall
2014-06-16 16:17 ` [RFC 02/19] xen: guestcopy: Provide an helper to copy string from guest Julien Grall
2014-06-17  8:01   ` Jan Beulich
2014-06-17  9:09     ` Julien Grall
2014-06-17  9:17       ` Jan Beulich
2014-06-17  9:23         ` Julien Grall
2014-06-17 22:43           ` Daniel De Graaf
2014-06-18 11:59             ` Jan Beulich
2014-06-18 12:22               ` Julien Grall
2014-06-18 12:49                 ` Jan Beulich
2014-06-18 12:53                   ` Julien Grall
2014-06-18 13:01                     ` Jan Beulich
2014-06-24 14:58                       ` Julien Grall
2014-06-16 16:17 ` [RFC 03/19] xen/arm: follow-up to allow DOM0 manage IRQ and MMIO Julien Grall
2014-06-18 20:21   ` Stefano Stabellini
2014-06-18 20:32     ` Julien Grall
2014-07-03 11:02       ` Ian Campbell
2014-07-03 11:23         ` Julien Grall
2014-07-03 12:12           ` Ian Campbell
2014-06-16 16:17 ` [RFC 04/19] xen/arm: route_irq_to_guest: Check validity of the IRQ Julien Grall
2014-06-18 18:52   ` Stefano Stabellini
2014-06-18 19:03     ` Julien Grall
2014-07-03 11:04   ` Ian Campbell
2014-07-03 11:47     ` Julien Grall
2014-06-16 16:17 ` [RFC 05/19] xen/arm: Release IRQ routed to a domain when it's destroying Julien Grall
2014-06-18 18:08   ` Stefano Stabellini
2014-06-18 18:26     ` Julien Grall
2014-06-18 18:48       ` Stefano Stabellini
2014-06-18 18:54         ` Julien Grall
2014-06-18 19:06           ` Stefano Stabellini
2014-06-18 19:09             ` Julien Grall
2014-06-16 16:17 ` [RFC 06/19] xen/arm: Implement hypercall PHYSDEVOP_map_pirq Julien Grall
2014-06-18 19:24   ` Stefano Stabellini
2014-06-19 11:39     ` Julien Grall
2014-06-19 12:29       ` Stefano Stabellini
2014-07-03 11:27         ` Ian Campbell
2014-07-03 12:02           ` Julien Grall
2014-07-03 12:53             ` Ian Campbell
2014-07-15 13:01               ` Julien Grall
2014-07-15 13:03                 ` Ian Campbell
2014-08-18 19:20                   ` Andrii Tseglytskyi
2014-08-18 21:55                     ` Julien Grall
2014-08-19  9:11                       ` Andrii Tseglytskyi
2014-08-19 14:24                         ` Julien Grall
2014-06-16 16:17 ` [RFC 07/19] xen/dts: Use unsigned int for MMIO and IRQ index Julien Grall
2014-06-18 18:54   ` Stefano Stabellini
2014-06-19 11:42     ` Julien Grall
2014-06-16 16:17 ` [RFC 08/19] xen/dts: Provide an helper to get a DT node from a path provided by a guest Julien Grall
2014-07-03 11:30   ` Ian Campbell
2014-07-03 11:49     ` Julien Grall
2014-07-03 12:13       ` Ian Campbell
2014-07-03 12:22         ` Julien Grall
2014-06-16 16:17 ` [RFC 09/19] xen/dts: Add hypercalls to retrieve device node information Julien Grall
2014-06-18 19:38   ` Stefano Stabellini
2014-06-19 11:58     ` Julien Grall
2014-06-19 12:21       ` Stefano Stabellini
2014-06-19 12:25         ` Julien Grall
2014-07-03 11:40           ` Ian Campbell
2014-06-24  8:46       ` Christoffer Dall
2014-07-03 11:34       ` Ian Campbell
2014-07-03 11:33   ` Ian Campbell
2014-07-03 11:51     ` Julien Grall
2014-07-03 12:13       ` Ian Campbell
2014-06-16 16:17 ` [RFC 10/19] xen/passthrough: Introduce iommu_buildup Julien Grall
2014-07-03 11:45   ` Ian Campbell
2014-07-03 11:55     ` Julien Grall
2014-06-16 16:17 ` [RFC 11/19] xen/passthrough: Call arch_iommu_domain_destroy before calling iommu_teardown Julien Grall
2014-06-17  8:07   ` Jan Beulich
2014-06-17  9:18     ` Julien Grall
2014-06-17  9:29       ` Jan Beulich
2014-06-17 12:38         ` Julien Grall
2014-06-17 13:04           ` Jan Beulich
2014-06-18 12:24             ` Julien Grall
2014-06-18 12:50               ` Jan Beulich
2014-06-16 16:17 ` [RFC 12/19] xen/passthrough: iommu_deassign_device_dt: By default reassign device to nobody Julien Grall
2014-06-18 19:28   ` Stefano Stabellini
2014-07-03 11:48   ` Ian Campbell
2014-07-03 12:07     ` Julien Grall
2014-07-03 12:53       ` Ian Campbell
2014-07-03 13:01         ` Julien Grall
2014-07-03 13:42           ` Ian Campbell
2014-07-03 13:51             ` Julien Grall
2014-07-03 14:04               ` Ian Campbell
2014-07-03 14:09                 ` Julien Grall
2014-06-16 16:18 ` [RFC 13/19] xen/iommu: arm: Wire iommu DOMCTL for ARM Julien Grall
2014-06-17  8:24   ` Jan Beulich
2014-06-17 13:05     ` Julien Grall
2014-06-16 16:18 ` [RFC 14/19] xen/passthrough: dt: Add new domctl XEN_DOMCTL_assign_dt_device Julien Grall
2014-06-17  8:34   ` Jan Beulich
2014-06-17 13:23     ` Julien Grall
2014-06-17 13:30       ` Jan Beulich
2014-06-17 13:48         ` Julien Grall
2014-06-17 13:55           ` Jan Beulich
2014-07-03 11:54             ` Ian Campbell
2014-06-16 16:18 ` [RFC 15/19] xen/arm: Reserve region in guest memory for device passthrough Julien Grall
2014-06-18 15:12   ` Stefano Stabellini
2014-06-18 15:23     ` Julien Grall
2014-06-18 15:26       ` Ian Campbell
2014-06-18 17:48         ` Stefano Stabellini
2014-06-18 17:54           ` Julien Grall
2014-06-18 18:14             ` Stefano Stabellini
2014-06-18 18:33               ` Julien Grall
2014-06-18 18:55                 ` Stefano Stabellini
2014-07-03 11:56                 ` Ian Campbell
2014-06-16 16:18 ` [RFC 16/19] libxl/arm: Introduce DT_IRQ_TYPE_* Julien Grall
2014-07-03 11:56   ` Ian Campbell
2014-06-16 16:18 ` [RFC 17/19] libxl/arm: Rename set_interrupt_ppi to set_interrupt and handle SPIs Julien Grall
2014-07-03 11:58   ` Ian Campbell
2014-07-03 12:04     ` Julien Grall
2014-07-03 14:04       ` Ian Campbell
2014-06-16 16:18 ` [RFC 18/19] libxl: Add support for non-PCI passthrough Julien Grall
2014-06-16 17:19   ` Wei Liu
2014-06-18 12:26     ` Julien Grall
2014-06-16 16:18 ` [RFC 19/19] xl: Add new option dtdev Julien Grall
2014-06-16 17:19   ` Wei Liu
2014-06-18 13:40     ` Julien Grall
2014-06-18 13:43       ` Wei Liu
2014-06-18 13:46         ` Julien Grall
