* [RFC PATCH 00/24] [FOR 4.9] arm64: Dom0 ITS emulation
@ 2016-09-28 18:24 Andre Przywara
  2016-09-28 18:24 ` [RFC PATCH 01/24] ARM: GICv3 ITS: parse and store ITS subnodes from hardware DT Andre Przywara
                   ` (24 more replies)
  0 siblings, 25 replies; 144+ messages in thread
From: Andre Przywara @ 2016-09-28 18:24 UTC (permalink / raw)
  To: Julien Grall, Stefano Stabellini; +Cc: xen-devel

Hi,

apologies for sending this series, which due to its early status now
targets Xen 4.9, still before the 4.8 hard feature freeze; but there
seem to be some interested parties and I wanted to get the discussion
started on this.

This series introduces ARM GICv3 ITS emulation, for now restricted to
Dom0 only. The ITS is an interrupt controller widget providing a
sophisticated way to deal with MSIs in a scalable manner.
For hardware which relies on the ITS to provide interrupts for its
peripherals this code is needed to get a machine booted into Dom0 at all.
ITS emulation for DomUs is only really useful with PCI passthrough,
which is not yet available for ARM. It is expected that this feature
will be co-developed with the ITS DomU code.

This implementation is totally independent of earlier submissions and
tries to provide a new approach:

* The current GIC code statically allocates structures for each supported
IRQ (both for the host and the guest), which is not feasible to copy for
the ITS given its potentially millions of LPI interrupts.
So we refrain from introducing the ITS as a first class Xen interrupt
controller, and we don't hold struct irq_desc's or struct pending_irq's
for each possible LPI.
Fortunately LPIs are only interesting to guests, so we get away with
storing only the virtual IRQ number and the guest VCPU for each allocated
host LPI, which can be stashed into one uint64_t. This data is stored in
a two-level table, which is both memory efficient and quick to access.
We hook into the existing IRQ handling and VGIC code to avoid accessing
the normal structures, providing alternative methods for getting the
needed information (priority, is enabled?) for LPIs.
For interrupts which are queued to or are actually in a guest we
allocate struct pending_irq's on demand. As only a very small number of
interrupts is expected to be pending on a VCPU at any given time, this
seems like the best approach. For now allocated structs are re-used and
held in a linked list.
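The per-LPI bookkeeping described above can be sketched as follows. This
is a simplified illustration, not the actual Xen code: the field split
(32-bit virtual IRQ, 16-bit domain, 16-bit VCPU) and all names are
assumptions made up for this example.

```c
#include <stdint.h>
#include <stdlib.h>

/* One uint64_t per host LPI, packing the virtual IRQ number and the
 * target domain/VCPU. Entries live in second-level pages which are
 * only allocated when an LPI in their range is actually mapped. */
#define HLPI_PER_PAGE   512                 /* 4K page / 8 bytes per entry */

struct host_lpi_table {
    uint64_t **level1;                      /* first level: page pointers */
    unsigned int nr_l1;
};

static uint64_t pack_host_lpi(uint32_t virq, uint16_t domid, uint16_t vcpu)
{
    return (uint64_t)virq | ((uint64_t)domid << 32) | ((uint64_t)vcpu << 48);
}

/* Store the (virq, domain, vcpu) tuple for a host LPI, allocating the
 * covering second-level page on demand. Returns 0 on success. */
static int host_lpi_set(struct host_lpi_table *t, uint32_t lpi,
                        uint32_t virq, uint16_t domid, uint16_t vcpu)
{
    uint32_t idx = lpi / HLPI_PER_PAGE;

    if ( idx >= t->nr_l1 )
        return -1;
    if ( !t->level1[idx] )
    {
        t->level1[idx] = calloc(HLPI_PER_PAGE, sizeof(uint64_t));
        if ( !t->level1[idx] )
            return -1;
    }
    t->level1[idx][lpi % HLPI_PER_PAGE] = pack_host_lpi(virq, domid, vcpu);
    return 0;
}
```

The two-level structure keeps memory proportional to the LPIs actually
in use while lookup stays a constant two dereferences.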

* On the guest side we will (later) have to deal with malicious guests
trying to hog Xen with mapping requests for a lot of LPIs, for instance.
As the ITS actually uses system memory for storing status information,
we use this memory (which the guest has to provide) to naturally limit
a guest. For those tables which are page sized (devices, collections (CPUs),
LPI properties) we map those pages into Xen, so we can easily access
them from the virtual GIC code.
Unfortunately the actual interrupt mapping tables are not necessarily
page aligned and can be much smaller than a page, so mapping all of
them permanently is fiddly. As ITS commands that need to iterate those
tables are pretty rare after all, we for now map them on demand while
emulating a virtual ITS command.
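The "guest memory naturally limits the guest" idea above can be sketched
like this: the number of entries a guest can ever map is bounded by the
table it programmed into its virtual GITS_BASERn register. Field
positions follow the GICv3 spec (Size in bits[7:0] as number of pages
minus one, Page_Size in bits[9:8]); the function name is made up for
this example.

```c
#include <stdint.h>

/* Derive the maximum number of entries backed by a guest-provided ITS
 * table, from the (virtual) GITS_BASERn value the guest wrote and the
 * size of one table entry in bytes. */
static uint64_t baser_max_entries(uint64_t baser, unsigned int entry_size)
{
    uint64_t nr_pages = (baser & 0xff) + 1;     /* Size field: pages - 1 */
    unsigned int page_shift;

    switch ( (baser >> 8) & 0x3 )               /* Page_Size field */
    {
    case 0:  page_shift = 12; break;            /* 4KB pages */
    case 1:  page_shift = 14; break;            /* 16KB pages */
    default: page_shift = 16; break;            /* 64KB pages */
    }

    return (nr_pages << page_shift) / entry_size;
}
```

Any mapping request beyond this bound can be rejected outright, so a
guest can only consume resources proportional to the memory it donated.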

* At least for the Dom0 use case we need to pass some virtual ITS
commands on to the hardware, namely device and interrupt mapping requests.
As we have to deal with command emulation synchronously, this could lead
to situations where the host command queue gets congested and CPUs stall
on this. For now we assume that Dom0 (being non-malicious) is not
affected by this, so we allow it.
DomUs will later use PCI passthrough, so we can do the device and IRQ
mapping at domain creation time, obviating ITS command passthrough for
DomUs during their runtime entirely.

This series is an early draft, with some known and many unknown issues.
I made ITS support a Kconfig option, and it is only supported on arm64.
This leads to some hideous constructs like an #ifdef'ed header file with
empty function stubs, but I guess we can clean this up later in the
upstreaming process.
Also I am not sure the host ITS and LPI initialization code is correct,
especially when it comes to the subtle differences between enabling and
initializing the ITS versus the LPIs. Affinity handling is only
rudimentarily implemented at this point. Locking of some of the shared
data structures isn't fully implemented yet either, partly because Xen's
lack of mutexes, RCU and preemption requires more clever solutions than
my Linux experience would readily provide.

So for now I am mostly interested in feedback on the general architecture
approach. Please comment on anything that looks suspicious or not
sustainable.
Oh, and apologies in advance for any tabs and missing "spaces in
if-statements" that slipped through my (unfortunately only manual) QA
process.

For now this code happens to boot Dom0 on an ARM fast model with ITS
support. I haven't had the chance to get hold of a Xen supported hardware
platform with an ITS yet, so I expect some surprises when this code sees
real hardware for the first time ;-)
That being said any testing and feedback is warmly welcomed!

The code can also be found on the its/rfc branch here:
git://linux-arm.org/xen-ap.git
http://www.linux-arm.org/git?p=xen-ap.git;a=shortlog;h=refs/heads/its/rfc

Cheers,
Andre

Andre Przywara (24):
  ARM: GICv3 ITS: parse and store ITS subnodes from hardware DT
  ARM: GICv3: allocate LPI pending and property table
  ARM: GICv3 ITS: allocate device and collection table
  ARM: GICv3 ITS: map ITS command buffer
  ARM: GICv3 ITS: introduce ITS command handling
  ARM: GICv3 ITS: introduce host LPI array
  ARM: GICv3 ITS: introduce device mapping
  ARM: GICv3: introduce separate pending_irq structs for LPIs
  ARM: GICv3: forward pending LPIs to guests
  ARM: GICv3: enable ITS and LPIs on the host
  ARM: vGICv3: handle virtual LPI pending and property tables
  ARM: vGICv3: introduce basic ITS emulation bits
  ARM: vITS: handle CLEAR command
  ARM: vITS: handle INT command
  ARM: vITS: handle MAPC command
  ARM: vITS: handle MAPD command
  ARM: vITS: handle MAPTI command
  ARM: vITS: handle MOVI command
  ARM: vITS: handle DISCARD command
  ARM: vITS: handle INV command
  ARM: vITS: handle INVALL command
  ARM: vITS: create and initialize virtual ITSes for Dom0
  ARM: vITS: create ITS subnodes for Dom0 DT
  ARM: vGIC: advertising LPI support

 xen/arch/arm/Kconfig              |  11 +
 xen/arch/arm/Makefile             |   2 +
 xen/arch/arm/efi/efi-boot.h       |   1 -
 xen/arch/arm/gic-its.c            | 861 ++++++++++++++++++++++++++++++++++++++
 xen/arch/arm/gic-v3.c             |  78 +++-
 xen/arch/arm/gic.c                |   9 +-
 xen/arch/arm/vgic-its.c           | 861 ++++++++++++++++++++++++++++++++++++++
 xen/arch/arm/vgic-v3.c            | 240 +++++++++--
 xen/arch/arm/vgic.c               |  60 ++-
 xen/include/asm-arm/cache.h       |   4 +
 xen/include/asm-arm/domain.h      |   9 +-
 xen/include/asm-arm/gic-its.h     | 246 +++++++++++
 xen/include/asm-arm/gic_v3_defs.h |  67 ++-
 xen/include/asm-arm/irq.h         |   8 +
 xen/include/asm-arm/vgic.h        |  12 +
 15 files changed, 2432 insertions(+), 37 deletions(-)
 create mode 100644 xen/arch/arm/gic-its.c
 create mode 100644 xen/arch/arm/vgic-its.c
 create mode 100644 xen/include/asm-arm/gic-its.h

-- 
2.9.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


* [RFC PATCH 01/24] ARM: GICv3 ITS: parse and store ITS subnodes from hardware DT
  2016-09-28 18:24 [RFC PATCH 00/24] [FOR 4.9] arm64: Dom0 ITS emulation Andre Przywara
@ 2016-09-28 18:24 ` Andre Przywara
  2016-10-26  1:11   ` Stefano Stabellini
  2016-11-01 15:13   ` Julien Grall
  2016-09-28 18:24 ` [RFC PATCH 02/24] ARM: GICv3: allocate LPI pending and property table Andre Przywara
                   ` (23 subsequent siblings)
  24 siblings, 2 replies; 144+ messages in thread
From: Andre Przywara @ 2016-09-28 18:24 UTC (permalink / raw)
  To: Julien Grall, Stefano Stabellini; +Cc: xen-devel

Parse the DT GIC subnodes to find every ITS MSI controller the hardware
offers. Store that information in a list, both to propagate all of them
later to Dom0 and to be able to iterate over all ITSes.
This introduces an ITS Kconfig option.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
 xen/arch/arm/Kconfig          |  5 ++++
 xen/arch/arm/Makefile         |  1 +
 xen/arch/arm/gic-its.c        | 67 +++++++++++++++++++++++++++++++++++++++++++
 xen/arch/arm/gic-v3.c         |  6 ++++
 xen/include/asm-arm/gic-its.h | 57 ++++++++++++++++++++++++++++++++++++
 5 files changed, 136 insertions(+)
 create mode 100644 xen/arch/arm/gic-its.c
 create mode 100644 xen/include/asm-arm/gic-its.h

diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
index 797c91f..9fe3b8e 100644
--- a/xen/arch/arm/Kconfig
+++ b/xen/arch/arm/Kconfig
@@ -45,6 +45,11 @@ config ACPI
 config HAS_GICV3
 	bool
 
+config HAS_ITS
+        bool "GICv3 ITS MSI controller support"
+        depends on ARM_64
+        depends on HAS_GICV3
+
 config ALTERNATIVE
 	bool
 
diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
index 64fdf41..c2c4daa 100644
--- a/xen/arch/arm/Makefile
+++ b/xen/arch/arm/Makefile
@@ -18,6 +18,7 @@ obj-$(EARLY_PRINTK) += early_printk.o
 obj-y += gic.o
 obj-y += gic-v2.o
 obj-$(CONFIG_HAS_GICV3) += gic-v3.o
+obj-$(CONFIG_HAS_ITS) += gic-its.o
 obj-y += guestcopy.o
 obj-y += hvm.o
 obj-y += io.o
diff --git a/xen/arch/arm/gic-its.c b/xen/arch/arm/gic-its.c
new file mode 100644
index 0000000..0f42a77
--- /dev/null
+++ b/xen/arch/arm/gic-its.c
@@ -0,0 +1,67 @@
+/*
+ * xen/arch/arm/gic-its.c
+ *
+ * ARM Generic Interrupt Controller ITS support
+ *
+ * Copyright (C) 2016 - ARM Ltd
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include <xen/config.h>
+#include <xen/lib.h>
+#include <xen/device_tree.h>
+#include <xen/libfdt/libfdt.h>
+#include <asm/gic.h>
+#include <asm/gic_v3_defs.h>
+#include <asm/gic-its.h>
+
+void gicv3_its_dt_init(const struct dt_device_node *node)
+{
+    const struct dt_device_node *its = NULL;
+    struct host_its *its_data;
+
+    /*
+     * Check for ITS MSI subnodes. If any, add the ITS register
+     * frames to the ITS list.
+     */
+    dt_for_each_child_node(node, its)
+    {
+        paddr_t addr, size;
+
+        if ( !dt_device_is_compatible(its, "arm,gic-v3-its") )
+            continue;
+
+        if ( dt_device_get_address(its, 0, &addr, &size) )
+            panic("GICv3: Cannot find a valid ITS frame address");
+
+        its_data = xzalloc(struct host_its);
+        if ( !its_data )
+            panic("GICv3: Cannot allocate memory for ITS frame");
+
+        its_data->addr = addr;
+        its_data->size = size;
+        its_data->dt_node = its;
+
+        printk("GICv3: Found ITS @0x%lx\n", addr);
+
+        list_add_tail(&its_data->entry, &host_its_list);
+    }
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
index b8be395..238da84 100644
--- a/xen/arch/arm/gic-v3.c
+++ b/xen/arch/arm/gic-v3.c
@@ -43,9 +43,12 @@
 #include <asm/device.h>
 #include <asm/gic.h>
 #include <asm/gic_v3_defs.h>
+#include <asm/gic-its.h>
 #include <asm/cpufeature.h>
 #include <asm/acpi.h>
 
+LIST_HEAD(host_its_list);
+
 /* Global state */
 static struct {
     void __iomem *map_dbase;  /* Mapped address of distributor registers */
@@ -1229,6 +1232,9 @@ static void __init gicv3_dt_init(void)
 
     dt_device_get_address(node, 1 + gicv3.rdist_count + 2,
                           &vbase, &vsize);
+
+    /* Check for ITS child nodes and build the host ITS list accordingly. */
+    gicv3_its_dt_init(node);
 }
 
 static int gicv3_iomem_deny_access(const struct domain *d)
diff --git a/xen/include/asm-arm/gic-its.h b/xen/include/asm-arm/gic-its.h
new file mode 100644
index 0000000..2f5c51c
--- /dev/null
+++ b/xen/include/asm-arm/gic-its.h
@@ -0,0 +1,57 @@
+/*
+ * ARM GICv3 ITS support
+ *
+ * Andre Przywara <andre.przywara@arm.com>
+ * Copyright (c) 2016 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#ifndef __ASM_ARM_ITS_H__
+#define __ASM_ARM_ITS_H__
+
+#ifndef __ASSEMBLY__
+#include <xen/device_tree.h>
+
+/* data structure for each hardware ITS */
+struct host_its {
+    struct list_head entry;
+    const struct dt_device_node *dt_node;
+    paddr_t addr;
+    paddr_t size;
+};
+
+extern struct list_head host_its_list;
+
+#ifdef CONFIG_HAS_ITS
+
+/* Parse the host DT and pick up all host ITSes. */
+void gicv3_its_dt_init(const struct dt_device_node *node);
+
+#else
+
+static inline void gicv3_its_dt_init(const struct dt_device_node *node)
+{
+}
+
+#endif /* CONFIG_HAS_ITS */
+
+#endif /* __ASSEMBLY__ */
+#endif
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
-- 
2.9.0




* [RFC PATCH 02/24] ARM: GICv3: allocate LPI pending and property table
  2016-09-28 18:24 [RFC PATCH 00/24] [FOR 4.9] arm64: Dom0 ITS emulation Andre Przywara
  2016-09-28 18:24 ` [RFC PATCH 01/24] ARM: GICv3 ITS: parse and store ITS subnodes from hardware DT Andre Przywara
@ 2016-09-28 18:24 ` Andre Przywara
  2016-10-24 14:28   ` Vijay Kilari
                     ` (2 more replies)
  2016-09-28 18:24 ` [RFC PATCH 03/24] ARM: GICv3 ITS: allocate device and collection table Andre Przywara
                   ` (22 subsequent siblings)
  24 siblings, 3 replies; 144+ messages in thread
From: Andre Przywara @ 2016-09-28 18:24 UTC (permalink / raw)
  To: Julien Grall, Stefano Stabellini; +Cc: xen-devel

The ARM GICv3 ITS provides a new kind of interrupt: LPIs (Locality-specific
Peripheral Interrupts).
The pending bits and the configuration data (priority, enable bit) for
those LPIs are stored in tables in normal memory, which software has to
provide to the hardware.
Allocate the required memory, initialize it and hand it over to each
redistributor. We limit the number of LPIs we use with a compile time
constant to avoid wasting memory.
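The configuration data mentioned here is one byte per LPI in the
property table: bits[7:2] hold the priority, bit[1] is RES1 and bit[0]
is the enable bit (matching the LPI_PROP_* definitions in this patch).
A small illustrative helper, with a made-up name, to compose such a
byte:

```c
#include <stdint.h>

/* Per-LPI configuration byte layout, as in the GICv3 property table. */
#define LPI_PROP_RES1     (1 << 1)   /* bit[1]: reserved, must be one */
#define LPI_PROP_ENABLED  (1 << 0)   /* bit[0]: LPI enable */

/* Compose a property byte from a priority (upper 6 bits are used)
 * and an enable flag. */
static uint8_t lpi_prop_byte(uint8_t priority, int enabled)
{
    return (priority & 0xfc) | LPI_PROP_RES1 |
           (enabled ? LPI_PROP_ENABLED : 0);
}
```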

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
 xen/arch/arm/Kconfig              |  6 ++++
 xen/arch/arm/efi/efi-boot.h       |  1 -
 xen/arch/arm/gic-its.c            | 76 +++++++++++++++++++++++++++++++++++++++
 xen/arch/arm/gic-v3.c             | 27 ++++++++++++++
 xen/include/asm-arm/cache.h       |  4 +++
 xen/include/asm-arm/gic-its.h     | 22 +++++++++++-
 xen/include/asm-arm/gic_v3_defs.h | 48 ++++++++++++++++++++++++-
 7 files changed, 181 insertions(+), 3 deletions(-)

diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
index 9fe3b8e..66e2bb8 100644
--- a/xen/arch/arm/Kconfig
+++ b/xen/arch/arm/Kconfig
@@ -50,6 +50,12 @@ config HAS_ITS
         depends on ARM_64
         depends on HAS_GICV3
 
+config HOST_LPI_BITS
+        depends on HAS_ITS
+        int "Maximum bits for GICv3 host LPIs (14-32)"
+        range 14 32
+        default "20"
+
 config ALTERNATIVE
 	bool
 
diff --git a/xen/arch/arm/efi/efi-boot.h b/xen/arch/arm/efi/efi-boot.h
index 045d6ce..dc64aec 100644
--- a/xen/arch/arm/efi/efi-boot.h
+++ b/xen/arch/arm/efi/efi-boot.h
@@ -10,7 +10,6 @@
 #include "efi-dom0.h"
 
 void noreturn efi_xen_start(void *fdt_ptr, uint32_t fdt_size);
-void __flush_dcache_area(const void *vaddr, unsigned long size);
 
 #define DEVICE_TREE_GUID \
 {0xb1b621d5, 0xf19c, 0x41a5, {0x83, 0x0b, 0xd9, 0x15, 0x2c, 0x69, 0xaa, 0xe0}}
diff --git a/xen/arch/arm/gic-its.c b/xen/arch/arm/gic-its.c
index 0f42a77..b52dff3 100644
--- a/xen/arch/arm/gic-its.c
+++ b/xen/arch/arm/gic-its.c
@@ -20,10 +20,86 @@
 #include <xen/lib.h>
 #include <xen/device_tree.h>
 #include <xen/libfdt/libfdt.h>
+#include <asm/p2m.h>
 #include <asm/gic.h>
 #include <asm/gic_v3_defs.h>
 #include <asm/gic-its.h>
 
+/* Global state */
+static struct {
+    uint8_t *lpi_property;
+    int host_lpi_bits;
+} lpi_data;
+
+/* Pending table for each redistributor */
+static DEFINE_PER_CPU(void *, pending_table);
+
+#define MAX_HOST_LPI_BITS                                                \
+        min_t(unsigned int, lpi_data.host_lpi_bits, CONFIG_HOST_LPI_BITS)
+#define MAX_HOST_LPIS   (BIT(MAX_HOST_LPI_BITS) - 8192)
+
+uint64_t gicv3_lpi_allocate_pendtable(void)
+{
+    uint64_t reg, attr;
+    void *pendtable;
+
+    attr  = GIC_BASER_CACHE_RaWaWb << GICR_PENDBASER_INNER_CACHEABILITY_SHIFT;
+    attr |= GIC_BASER_CACHE_SameAsInner << GICR_PENDBASER_OUTER_CACHEABILITY_SHIFT;
+    attr |= GIC_BASER_InnerShareable << GICR_PENDBASER_SHAREABILITY_SHIFT;
+
+    /*
+     * The pending table holds one bit per LPI, so it needs
+     * 2^(host_lpi_bits - 3) bytes. The ITS requires 64K alignment,
+     * so request an order of at least 16 - 12 (i.e. a 64K allocation).
+     */
+    pendtable = alloc_xenheap_pages(MAX(lpi_data.host_lpi_bits - 3, 16) - 12, 0);
+    if ( !pendtable )
+        return 0;
+
+    memset(pendtable, 0, BIT(lpi_data.host_lpi_bits - 3));
+    this_cpu(pending_table) = pendtable;
+
+    reg  = attr | GICR_PENDBASER_PTZ;
+    reg |= virt_to_maddr(pendtable) & GENMASK(51, 16);
+
+    return reg;
+}
+
+uint64_t gicv3_lpi_get_proptable(void)
+{
+    uint64_t attr;
+    static uint64_t reg = 0;
+
+    /* The property table is shared across all redistributors. */
+    if ( reg )
+        return reg;
+
+    attr  = GIC_BASER_CACHE_RaWaWb << GICR_PROPBASER_INNER_CACHEABILITY_SHIFT;
+    attr |= GIC_BASER_CACHE_SameAsInner << GICR_PROPBASER_OUTER_CACHEABILITY_SHIFT;
+    attr |= GIC_BASER_InnerShareable << GICR_PROPBASER_SHAREABILITY_SHIFT;
+
+    lpi_data.lpi_property = alloc_xenheap_pages(MAX_HOST_LPI_BITS - 12, 0);
+    if ( !lpi_data.lpi_property )
+        return 0;
+
+    memset(lpi_data.lpi_property, GIC_PRI_IRQ | LPI_PROP_RES1, MAX_HOST_LPIS);
+    __flush_dcache_area(lpi_data.lpi_property, MAX_HOST_LPIS);
+
+    reg  = attr | ((MAX_HOST_LPI_BITS - 1) << 0);
+    reg |= virt_to_maddr(lpi_data.lpi_property) & GENMASK(51, 12);
+
+    return reg;
+}
+
+int gicv3_lpi_init_host_lpis(int lpi_bits)
+{
+    lpi_data.host_lpi_bits = lpi_bits;
+
+    printk("GICv3: using at most %lu LPIs on the host.\n", MAX_HOST_LPIS);
+
+    return 0;
+}
+
 void gicv3_its_dt_init(const struct dt_device_node *node)
 {
     const struct dt_device_node *its = NULL;
diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
index 238da84..2534aa5 100644
--- a/xen/arch/arm/gic-v3.c
+++ b/xen/arch/arm/gic-v3.c
@@ -546,6 +546,9 @@ static void __init gicv3_dist_init(void)
     type = readl_relaxed(GICD + GICD_TYPER);
     nr_lines = 32 * ((type & GICD_TYPE_LINES) + 1);
 
+    if ( type & GICD_TYPE_LPIS )
+        gicv3_lpi_init_host_lpis(((type >> GICD_TYPE_ID_BITS_SHIFT) & 0x1f) + 1);
+
     printk("GICv3: %d lines, (IID %8.8x).\n",
            nr_lines, readl_relaxed(GICD + GICD_IIDR));
 
@@ -615,6 +618,26 @@ static int gicv3_enable_redist(void)
 
     return 0;
 }
+static void gicv3_rdist_init_lpis(void __iomem * rdist_base)
+{
+    uint32_t reg;
+    uint64_t table_reg;
+
+    if ( list_empty(&host_its_list) )
+        return;
+
+    /* Make sure LPIs are disabled before setting up the BASERs. */
+    reg = readl_relaxed(rdist_base + GICR_CTLR);
+    writel_relaxed(reg & ~GICR_CTLR_ENABLE_LPIS, rdist_base + GICR_CTLR);
+
+    table_reg = gicv3_lpi_allocate_pendtable();
+    if ( table_reg )
+        writeq_relaxed(table_reg, rdist_base + GICR_PENDBASER);
+
+    table_reg = gicv3_lpi_get_proptable();
+    if ( table_reg )
+        writeq_relaxed(table_reg, rdist_base + GICR_PROPBASER);
+}
 
 static int __init gicv3_populate_rdist(void)
 {
@@ -658,6 +681,10 @@ static int __init gicv3_populate_rdist(void)
             if ( (typer >> 32) == aff )
             {
                 this_cpu(rbase) = ptr;
+
+                if ( typer & GICR_TYPER_PLPIS )
+                    gicv3_rdist_init_lpis(ptr);
+
                 printk("GICv3: CPU%d: Found redistributor in region %d @%p\n",
                         smp_processor_id(), i, ptr);
                 return 0;
diff --git a/xen/include/asm-arm/cache.h b/xen/include/asm-arm/cache.h
index 2de6564..af96eee 100644
--- a/xen/include/asm-arm/cache.h
+++ b/xen/include/asm-arm/cache.h
@@ -7,6 +7,10 @@
 #define L1_CACHE_SHIFT  (CONFIG_ARM_L1_CACHE_SHIFT)
 #define L1_CACHE_BYTES  (1 << L1_CACHE_SHIFT)
 
+#ifndef __ASSEMBLY__
+void __flush_dcache_area(const void *vaddr, unsigned long size);
+#endif
+
 #define __read_mostly __section(".data.read_mostly")
 
 #endif
diff --git a/xen/include/asm-arm/gic-its.h b/xen/include/asm-arm/gic-its.h
index 2f5c51c..48c6c78 100644
--- a/xen/include/asm-arm/gic-its.h
+++ b/xen/include/asm-arm/gic-its.h
@@ -36,12 +36,32 @@ extern struct list_head host_its_list;
 /* Parse the host DT and pick up all host ITSes. */
 void gicv3_its_dt_init(const struct dt_device_node *node);
 
+/* Allocate and initialize tables for each host redistributor.
+ * Returns the respective {PROP,PEND}BASER register value.
+ */
+uint64_t gicv3_lpi_get_proptable(void);
+uint64_t gicv3_lpi_allocate_pendtable(void);
+
+/* Initialize the host structures for LPIs. */
+int gicv3_lpi_init_host_lpis(int nr_lpis);
+
 #else
 
 static inline void gicv3_its_dt_init(const struct dt_device_node *node)
 {
 }
-
+static inline uint64_t gicv3_lpi_get_proptable(void)
+{
+    return 0;
+}
+static inline uint64_t gicv3_lpi_allocate_pendtable(void)
+{
+    return 0;
+}
+static inline int gicv3_lpi_init_host_lpis(int nr_lpis)
+{
+    return 0;
+}
 #endif /* CONFIG_HAS_ITS */
 
 #endif /* __ASSEMBLY__ */
diff --git a/xen/include/asm-arm/gic_v3_defs.h b/xen/include/asm-arm/gic_v3_defs.h
index 6bd25a5..da5fb77 100644
--- a/xen/include/asm-arm/gic_v3_defs.h
+++ b/xen/include/asm-arm/gic_v3_defs.h
@@ -44,7 +44,8 @@
 #define GICC_SRE_EL2_ENEL1           (1UL << 3)
 
 /* Additional bits in GICD_TYPER defined by GICv3 */
-#define GICD_TYPE_ID_BITS_SHIFT 19
+#define GICD_TYPE_ID_BITS_SHIFT      19
+#define GICD_TYPE_LPIS               (1U << 17)
 
 #define GICD_CTLR_RWP                (1UL << 31)
 #define GICD_CTLR_ARE_NS             (1U << 4)
@@ -95,12 +96,57 @@
 #define GICR_IGRPMODR0               (0x0D00)
 #define GICR_NSACR                   (0x0E00)
 
+#define GICR_CTLR_ENABLE_LPIS        (1U << 0)
 #define GICR_TYPER_PLPIS             (1U << 0)
 #define GICR_TYPER_VLPIS             (1U << 1)
 #define GICR_TYPER_LAST              (1U << 4)
 
+#define GIC_BASER_CACHE_nCnB         0ULL
+#define GIC_BASER_CACHE_SameAsInner  0ULL
+#define GIC_BASER_CACHE_nC           1ULL
+#define GIC_BASER_CACHE_RaWt         2ULL
+#define GIC_BASER_CACHE_RaWb         3ULL
+#define GIC_BASER_CACHE_WaWt         4ULL
+#define GIC_BASER_CACHE_WaWb         5ULL
+#define GIC_BASER_CACHE_RaWaWt       6ULL
+#define GIC_BASER_CACHE_RaWaWb       7ULL
+#define GIC_BASER_CACHE_MASK         7ULL
+#define GIC_BASER_NonShareable       0ULL
+#define GIC_BASER_InnerShareable     1ULL
+#define GIC_BASER_OuterShareable     2ULL
+
+#define GICR_PROPBASER_SHAREABILITY_SHIFT               10
+#define GICR_PROPBASER_INNER_CACHEABILITY_SHIFT         7
+#define GICR_PROPBASER_OUTER_CACHEABILITY_SHIFT         56
+#define GICR_PROPBASER_SHAREABILITY_MASK                     \
+        (3UL << GICR_PROPBASER_SHAREABILITY_SHIFT)
+#define GICR_PROPBASER_INNER_CACHEABILITY_MASK               \
+        (7UL << GICR_PROPBASER_INNER_CACHEABILITY_SHIFT)
+#define GICR_PROPBASER_OUTER_CACHEABILITY_MASK               \
+        (7UL << GICR_PROPBASER_OUTER_CACHEABILITY_SHIFT)
+#define PROPBASER_RES0_MASK                                  \
+        (GENMASK(63, 59) | GENMASK(55, 52) | GENMASK(6, 5))
+
+#define GICR_PENDBASER_SHAREABILITY_SHIFT               10
+#define GICR_PENDBASER_INNER_CACHEABILITY_SHIFT         7
+#define GICR_PENDBASER_OUTER_CACHEABILITY_SHIFT         56
+#define GICR_PENDBASER_SHAREABILITY_MASK                     \
+	(3UL << GICR_PENDBASER_SHAREABILITY_SHIFT)
+#define GICR_PENDBASER_INNER_CACHEABILITY_MASK               \
+	(7UL << GICR_PENDBASER_INNER_CACHEABILITY_SHIFT)
+#define GICR_PENDBASER_OUTER_CACHEABILITY_MASK               \
+        (7UL << GICR_PENDBASER_OUTER_CACHEABILITY_SHIFT)
+#define GICR_PENDBASER_PTZ                              BIT(62)
+#define PENDBASER_RES0_MASK                                  \
+        (BIT(63) | GENMASK(61, 59) | GENMASK(55, 52) |       \
+         GENMASK(15, 12) | GENMASK(6, 0))
+
 #define DEFAULT_PMR_VALUE            0xff
 
+#define LPI_PROP_DEFAULT_PRIO        0xa0
+#define LPI_PROP_RES1                (1 << 1)
+#define LPI_PROP_ENABLED             (1 << 0)
+
 #define GICH_VMCR_EOI                (1 << 9)
 #define GICH_VMCR_VENG1              (1 << 1)
 
-- 
2.9.0




* [RFC PATCH 03/24] ARM: GICv3 ITS: allocate device and collection table
  2016-09-28 18:24 [RFC PATCH 00/24] [FOR 4.9] arm64: Dom0 ITS emulation Andre Przywara
  2016-09-28 18:24 ` [RFC PATCH 01/24] ARM: GICv3 ITS: parse and store ITS subnodes from hardware DT Andre Przywara
  2016-09-28 18:24 ` [RFC PATCH 02/24] ARM: GICv3: allocate LPI pending and property table Andre Przywara
@ 2016-09-28 18:24 ` Andre Przywara
  2016-10-09 13:55   ` Vijay Kilari
                     ` (3 more replies)
  2016-09-28 18:24 ` [RFC PATCH 04/24] ARM: GICv3 ITS: map ITS command buffer Andre Przywara
                   ` (21 subsequent siblings)
  24 siblings, 4 replies; 144+ messages in thread
From: Andre Przywara @ 2016-09-28 18:24 UTC (permalink / raw)
  To: Julien Grall, Stefano Stabellini; +Cc: xen-devel

Each ITS maps a pair of DeviceID (usually the PCI b/d/f triplet) and
EventID (the MSI payload or interrupt ID) to a pair of LPI number and
collection ID, the latter identifying the target CPU.
This mapping is stored in the device and collection tables, which software
has to provide for the ITS to use.
Allocate the required memory and hand it to the ITS.
We limit the number of devices to cover 4 PCI buses for now.
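The DeviceID presented to the ITS is typically the PCI requester ID. One
common encoding (bus in bits[15:8], device in bits[7:3], function in
bits[2:0]) is sketched below, though the exact scheme depends on the
host bridge; the function name is made up for this example. With this
encoding, limiting the device table to 1024 entries covers the DeviceIDs
of 4 PCI buses (4 * 256 requester IDs).

```c
#include <stdint.h>

/* Compose a DeviceID from a PCI bus/device/function triplet, using the
 * conventional 16-bit requester ID layout. */
static uint32_t pci_bdf_to_devid(uint8_t bus, uint8_t dev, uint8_t fn)
{
    return ((uint32_t)bus << 8) | ((uint32_t)(dev & 0x1f) << 3) | (fn & 0x7);
}
```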

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
 xen/arch/arm/gic-its.c        | 114 ++++++++++++++++++++++++++++++++++++++++++
 xen/arch/arm/gic-v3.c         |   5 ++
 xen/include/asm-arm/gic-its.h |  49 +++++++++++++++++-
 3 files changed, 167 insertions(+), 1 deletion(-)

diff --git a/xen/arch/arm/gic-its.c b/xen/arch/arm/gic-its.c
index b52dff3..40238a2 100644
--- a/xen/arch/arm/gic-its.c
+++ b/xen/arch/arm/gic-its.c
@@ -21,6 +21,7 @@
 #include <xen/device_tree.h>
 #include <xen/libfdt/libfdt.h>
 #include <asm/p2m.h>
+#include <asm/io.h>
 #include <asm/gic.h>
 #include <asm/gic_v3_defs.h>
 #include <asm/gic-its.h>
@@ -38,6 +39,119 @@ static DEFINE_PER_CPU(void *, pending_table);
         min_t(unsigned int, lpi_data.host_lpi_bits, CONFIG_HOST_LPI_BITS)
 #define MAX_HOST_LPIS   (BIT(MAX_HOST_LPI_BITS) - 8192)
 
+#define BASER_ATTR_MASK                                           \
+        ((0x3UL << GITS_BASER_SHAREABILITY_SHIFT)               | \
+         (0x7UL << GITS_BASER_OUTER_CACHEABILITY_SHIFT)         | \
+         (0x7UL << GITS_BASER_INNER_CACHEABILITY_SHIFT))
+#define BASER_RO_MASK   (GENMASK(52, 48) | GENMASK(58, 56))
+
+static uint64_t encode_phys_addr(paddr_t addr, int page_bits)
+{
+    uint64_t ret;
+
+    if ( page_bits < 16 )
+        return (uint64_t)addr & GENMASK(47, page_bits);
+
+    ret = addr & GENMASK(47, 16);
+    return ret | (addr & GENMASK(51, 48)) >> (48 - 12);
+}
+
+static int gicv3_map_baser(void __iomem *basereg, uint64_t regc, int nr_items)
+{
+    uint64_t attr;
+    int entry_size = ((regc >> GITS_BASER_ENTRY_SIZE_SHIFT) & 0x1f) + 1;
+    int pagesz;
+    int order;
+    void *buffer = NULL;
+
+    attr  = GIC_BASER_InnerShareable << GITS_BASER_SHAREABILITY_SHIFT;
+    attr |= GIC_BASER_CACHE_SameAsInner << GITS_BASER_OUTER_CACHEABILITY_SHIFT;
+    attr |= GIC_BASER_CACHE_RaWaWb << GITS_BASER_INNER_CACHEABILITY_SHIFT;
+
+    /*
+     * Loop over the page sizes (4K, 16K, 64K) to find out what the host
+     * supports.
+     */
+    for ( pagesz = 0; pagesz < 3; pagesz++ )
+    {
+        uint64_t reg;
+        int nr_bytes;
+
+        nr_bytes = ROUNDUP(nr_items * entry_size, BIT(pagesz * 2 + 12));
+        order = get_order_from_bytes(nr_bytes);
+
+        if ( !buffer )
+            buffer = alloc_xenheap_pages(order, 0);
+        if ( !buffer )
+            return -ENOMEM;
+
+        reg  = attr;
+        reg |= (pagesz << GITS_BASER_PAGE_SIZE_SHIFT);
+        reg |= nr_bytes >> (pagesz * 2 + 12);
+        reg |= regc & BASER_RO_MASK;
+        reg |= GITS_BASER_VALID;
+        reg |= encode_phys_addr(virt_to_maddr(buffer), pagesz * 2 + 12);
+
+        writeq_relaxed(reg, basereg);
+        regc = readq_relaxed(basereg);
+
+        /* The host didn't like our attributes, just use what it returned. */
+        if ( (regc & BASER_ATTR_MASK) != attr )
+            attr = regc & BASER_ATTR_MASK;
+
+        /* If the host accepted our page size, we are done. */
+        if ( ((regc >> GITS_BASER_PAGE_SIZE_SHIFT) & 3) == pagesz )
+            return 0;
+
+        /* Keep the buffer only if it is aligned to the next page size. */
+        if ( virt_to_maddr(buffer) & (BIT(pagesz * 2 + 12 + 2) - 1) )
+        {
+            free_xenheap_pages(buffer, order);
+            buffer = NULL;
+        }
+    }
+
+    if ( buffer )
+        free_xenheap_pages(buffer, order);
+
+    return -EINVAL;
+}
+
+int gicv3_its_init(struct host_its *hw_its)
+{
+    uint64_t reg;
+    int i;
+
+    hw_its->its_base = ioremap_nocache(hw_its->addr, hw_its->size);
+    if ( !hw_its->its_base )
+        return -ENOMEM;
+
+    for ( i = 0; i < 8; i++ )
+    {
+        void __iomem *basereg = hw_its->its_base + GITS_BASER0 + i * 8;
+        int type;
+
+        reg = readq_relaxed(basereg);
+        type = (reg >> GITS_BASER_TYPE_SHIFT) & 0x7;
+        switch ( type )
+        {
+        case GITS_BASER_TYPE_NONE:
+            continue;
+        case GITS_BASER_TYPE_DEVICE:
+            /* TODO: find some better way of limiting the number of devices */
+            gicv3_map_baser(basereg, reg, 1024);
+            break;
+        case GITS_BASER_TYPE_COLLECTION:
+            gicv3_map_baser(basereg, reg, NR_CPUS);
+            break;
+        default:
+            continue;
+        }
+    }
+
+    return 0;
+}
+
 uint64_t gicv3_lpi_allocate_pendtable(void)
 {
     uint64_t reg, attr;
diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
index 2534aa5..5cf4618 100644
--- a/xen/arch/arm/gic-v3.c
+++ b/xen/arch/arm/gic-v3.c
@@ -29,6 +29,7 @@
 #include <xen/irq.h>
 #include <xen/iocap.h>
 #include <xen/sched.h>
+#include <xen/err.h>
 #include <xen/errno.h>
 #include <xen/delay.h>
 #include <xen/device_tree.h>
@@ -1548,6 +1549,7 @@ static int __init gicv3_init(void)
 {
     int res, i;
     uint32_t reg;
+    struct host_its *hw_its;
 
     if ( !cpu_has_gicv3 )
     {
@@ -1603,6 +1605,9 @@ static int __init gicv3_init(void)
     res = gicv3_cpu_init();
     gicv3_hyp_init();
 
+    list_for_each_entry(hw_its, &host_its_list, entry)
+        gicv3_its_init(hw_its);
+
     spin_unlock(&gicv3.lock);
 
     return res;
diff --git a/xen/include/asm-arm/gic-its.h b/xen/include/asm-arm/gic-its.h
index 48c6c78..589b889 100644
--- a/xen/include/asm-arm/gic-its.h
+++ b/xen/include/asm-arm/gic-its.h
@@ -18,6 +18,47 @@
 #ifndef __ASM_ARM_ITS_H__
 #define __ASM_ARM_ITS_H__
 
+#define LPI_OFFSET      8192
+
+#define GITS_CTLR       (0x000)
+#define GITS_IIDR       (0x004)
+#define GITS_TYPER      (0x008)
+#define GITS_CBASER     (0x080)
+#define GITS_CWRITER    (0x088)
+#define GITS_CREADR     (0x090)
+#define GITS_BASER0     (0x100)
+#define GITS_BASER1     (0x108)
+#define GITS_BASER2     (0x110)
+#define GITS_BASER3     (0x118)
+#define GITS_BASER4     (0x120)
+#define GITS_BASER5     (0x128)
+#define GITS_BASER6     (0x130)
+#define GITS_BASER7     (0x138)
+
+/* Register bits */
+#define GITS_CTLR_ENABLE     0x1
+#define GITS_IIDR_VALUE      0x34c
+
+#define GITS_BASER_VALID                BIT(63)
+#define GITS_BASER_INDIRECT             BIT(62)
+#define GITS_BASER_INNER_CACHEABILITY_SHIFT        59
+#define GITS_BASER_TYPE_SHIFT           56
+#define GITS_BASER_OUTER_CACHEABILITY_SHIFT        53
+#define GITS_BASER_TYPE_NONE            0UL
+#define GITS_BASER_TYPE_DEVICE          1UL
+#define GITS_BASER_TYPE_VCPU            2UL
+#define GITS_BASER_TYPE_CPU             3UL
+#define GITS_BASER_TYPE_COLLECTION      4UL
+#define GITS_BASER_TYPE_RESERVED5       5UL
+#define GITS_BASER_TYPE_RESERVED6       6UL
+#define GITS_BASER_TYPE_RESERVED7       7UL
+#define GITS_BASER_ENTRY_SIZE_SHIFT     48
+#define GITS_BASER_SHAREABILITY_SHIFT   10
+#define GITS_BASER_PAGE_SIZE_SHIFT      8
+#define GITS_BASER_RO_MASK              ((7UL << GITS_BASER_TYPE_SHIFT) | \
+                                        (31UL << GITS_BASER_ENTRY_SIZE_SHIFT) |\
+                                        GITS_BASER_INDIRECT)
+
 #ifndef __ASSEMBLY__
 #include <xen/device_tree.h>
 
@@ -27,6 +68,7 @@ struct host_its {
     const struct dt_device_node *dt_node;
     paddr_t addr;
     paddr_t size;
+    void __iomem *its_base;
 };
 
 extern struct list_head host_its_list;
@@ -42,8 +84,9 @@ void gicv3_its_dt_init(const struct dt_device_node *node);
 uint64_t gicv3_lpi_get_proptable(void);
 uint64_t gicv3_lpi_allocate_pendtable(void);
 
-/* Initialize the host structures for LPIs. */
+/* Initialize the host structures for LPIs and the host ITSes. */
 int gicv3_lpi_init_host_lpis(int nr_lpis);
+int gicv3_its_init(struct host_its *hw_its);
 
 #else
 
@@ -62,6 +105,10 @@ static inline int gicv3_lpi_init_host_lpis(int nr_lpis)
 {
     return 0;
 }
+static inline int gicv3_its_init(struct host_its *hw_its)
+{
+    return 0;
+}
 #endif /* CONFIG_HAS_ITS */
 
 #endif /* __ASSEMBLY__ */
-- 
2.9.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


* [RFC PATCH 04/24] ARM: GICv3 ITS: map ITS command buffer
  2016-09-28 18:24 [RFC PATCH 00/24] [FOR 4.9] arm64: Dom0 ITS emulation Andre Przywara
                   ` (2 preceding siblings ...)
  2016-09-28 18:24 ` [RFC PATCH 03/24] ARM: GICv3 ITS: allocate device and collection table Andre Przywara
@ 2016-09-28 18:24 ` Andre Przywara
  2016-10-24 14:31   ` Vijay Kilari
                     ` (2 more replies)
  2016-09-28 18:24 ` [RFC PATCH 05/24] ARM: GICv3 ITS: introduce ITS command handling Andre Przywara
                   ` (20 subsequent siblings)
  24 siblings, 3 replies; 144+ messages in thread
From: Andre Przywara @ 2016-09-28 18:24 UTC (permalink / raw)
  To: Julien Grall, Stefano Stabellini; +Cc: xen-devel

Instead of directly manipulating the tables in memory, an ITS driver
sends commands via a ring buffer to the ITS hardware to create or alter
LPI mappings.
Allocate memory for that buffer and tell the ITS about it, so that we
can send commands to the ITS later.
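Not part of the patch, but a host-side sketch of how gicv3_map_cbaser() composes the GITS_CBASER value for a single 4K page: valid bit, cacheability/shareability attributes, and the physical address, with the Size field (bits 7:0, "number of 4K pages minus one") left at 0. The attribute encodings and shift values are taken from the GICv3 spec and the patch; the helper name is mine:

```c
#include <assert.h>
#include <stdint.h>

#define BIT(n)           (1ULL << (n))
#define GENMASK(h, l)    (((~0ULL) << (l)) & (~0ULL >> (63 - (h))))

/* Encodings per the GICv3 spec (Xen takes these from gic_v3_defs.h). */
#define GIC_BASER_InnerShareable     1ULL
#define GIC_BASER_CACHE_SameAsInner  0ULL
#define GIC_BASER_CACHE_RaWaWb       7ULL

#define SHAREABILITY_SHIFT        10
#define OUTER_CACHEABILITY_SHIFT  53
#define INNER_CACHEABILITY_SHIFT  59

/* Compose the CBASER value for one 4K command queue page at "paddr". */
static uint64_t make_cbaser(uint64_t paddr)
{
    uint64_t attr = (GIC_BASER_InnerShareable << SHAREABILITY_SHIFT) |
                    (GIC_BASER_CACHE_SameAsInner << OUTER_CACHEABILITY_SHIFT) |
                    (GIC_BASER_CACHE_RaWaWb << INNER_CACHEABILITY_SHIFT);

    /* Valid bit, attributes, physical address bits 51:12; Size field 0. */
    return attr | BIT(63) | (paddr & GENMASK(51, 12));
}
```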

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
 xen/arch/arm/gic-its.c        | 25 +++++++++++++++++++++++++
 xen/include/asm-arm/gic-its.h |  1 +
 2 files changed, 26 insertions(+)

diff --git a/xen/arch/arm/gic-its.c b/xen/arch/arm/gic-its.c
index 40238a2..c8a7a7e 100644
--- a/xen/arch/arm/gic-its.c
+++ b/xen/arch/arm/gic-its.c
@@ -18,6 +18,7 @@
 
 #include <xen/config.h>
 #include <xen/lib.h>
+#include <xen/err.h>
 #include <xen/device_tree.h>
 #include <xen/libfdt/libfdt.h>
 #include <asm/p2m.h>
@@ -56,6 +57,26 @@ static uint64_t encode_phys_addr(paddr_t addr, int page_bits)
     return ret | (addr & GENMASK(51, 48)) >> (48 - 12);
 }
 
+static void *gicv3_map_cbaser(void __iomem *cbasereg)
+{
+    uint64_t attr, reg;
+    void *buffer;
+
+    attr  = GIC_BASER_InnerShareable << GITS_BASER_SHAREABILITY_SHIFT;
+    attr |= GIC_BASER_CACHE_SameAsInner << GITS_BASER_OUTER_CACHEABILITY_SHIFT;
+    attr |= GIC_BASER_CACHE_RaWaWb << GITS_BASER_INNER_CACHEABILITY_SHIFT;
+
+    buffer = alloc_xenheap_pages(0, 0);
+    if ( !buffer )
+        return ERR_PTR(-ENOMEM);
+
+    /* We use exactly one 4K page, so the "Size" field is 0. */
+    reg = attr | BIT(63) | (virt_to_maddr(buffer) & GENMASK(51, 12));
+    writeq_relaxed(reg, cbasereg);
+
+    return buffer;
+}
+
 static int gicv3_map_baser(void __iomem *basereg, uint64_t regc, int nr_items)
 {
     uint64_t attr;
@@ -149,6 +170,10 @@ int gicv3_its_init(struct host_its *hw_its)
         }
     }
 
+    hw_its->cmd_buf = gicv3_map_cbaser(hw_its->its_base + GITS_CBASER);
+    if ( IS_ERR(hw_its->cmd_buf) )
+        return PTR_ERR(hw_its->cmd_buf);
+
     return 0;
 }
 
diff --git a/xen/include/asm-arm/gic-its.h b/xen/include/asm-arm/gic-its.h
index 589b889..b2a003f 100644
--- a/xen/include/asm-arm/gic-its.h
+++ b/xen/include/asm-arm/gic-its.h
@@ -69,6 +69,7 @@ struct host_its {
     paddr_t addr;
     paddr_t size;
     void __iomem *its_base;
+    void *cmd_buf;
 };
 
 extern struct list_head host_its_list;
-- 
2.9.0



* [RFC PATCH 05/24] ARM: GICv3 ITS: introduce ITS command handling
  2016-09-28 18:24 [RFC PATCH 00/24] [FOR 4.9] arm64: Dom0 ITS emulation Andre Przywara
                   ` (3 preceding siblings ...)
  2016-09-28 18:24 ` [RFC PATCH 04/24] ARM: GICv3 ITS: map ITS command buffer Andre Przywara
@ 2016-09-28 18:24 ` Andre Przywara
  2016-10-26 23:55   ` Stefano Stabellini
  2016-11-02 15:05   ` Julien Grall
  2016-09-28 18:24 ` [RFC PATCH 06/24] ARM: GICv3 ITS: introduce host LPI array Andre Przywara
                   ` (19 subsequent siblings)
  24 siblings, 2 replies; 144+ messages in thread
From: Andre Przywara @ 2016-09-28 18:24 UTC (permalink / raw)
  To: Julien Grall, Stefano Stabellini; +Cc: xen-devel

To be able to easily send commands to the ITS, create the respective
wrapper functions, which take care of the ring buffer.
The first two commands we implement map a collection to a redistributor
(aka host core) and synchronise the command queue (SYNC).
Start using these commands to map one collection to each host CPU.
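The ring-buffer arithmetic used by its_send_command() below (32-byte commands in one 4K page, with GITS_CREADR/GITS_CWRITER holding byte offsets) can be sketched as follows; the constants mirror the patch, the helper names are mine:

```c
#include <assert.h>
#include <stdint.h>

#define ITS_COMMAND_SIZE 32
#define ITS_CMDQ_SIZE    4096   /* one 4K page, as set up via GITS_CBASER */

/* The queue is full when advancing the write pointer would land on the
 * read pointer; one slot always stays unused to tell "full" from "empty". */
static int its_queue_full(uint32_t readp, uint32_t writep)
{
    return ((writep + ITS_COMMAND_SIZE) % ITS_CMDQ_SIZE) == readp;
}

/* Advance the write pointer by one command, wrapping at the page end. */
static uint32_t its_advance(uint32_t writep)
{
    return (writep + ITS_COMMAND_SIZE) % ITS_CMDQ_SIZE;
}
```

With this layout a 4K queue holds at most 127 outstanding commands, since one 32-byte slot is sacrificed to disambiguate the full condition.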

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
 xen/arch/arm/gic-its.c        | 101 ++++++++++++++++++++++++++++++++++++++++++
 xen/arch/arm/gic-v3.c         |  17 +++++++
 xen/include/asm-arm/gic-its.h |  32 +++++++++++++
 3 files changed, 150 insertions(+)

diff --git a/xen/arch/arm/gic-its.c b/xen/arch/arm/gic-its.c
index c8a7a7e..88397bc 100644
--- a/xen/arch/arm/gic-its.c
+++ b/xen/arch/arm/gic-its.c
@@ -33,6 +33,10 @@ static struct {
     int host_lpi_bits;
 } lpi_data;
 
+/* Physical redistributor address */
+static DEFINE_PER_CPU(uint64_t, rdist_addr);
+/* Redistributor ID */
+static DEFINE_PER_CPU(uint64_t, rdist_id);
 /* Pending table for each redistributor */
 static DEFINE_PER_CPU(void *, pending_table);
 
@@ -40,6 +44,86 @@ static DEFINE_PER_CPU(void *, pending_table);
         min_t(unsigned int, lpi_data.host_lpi_bits, CONFIG_HOST_LPI_BITS)
 #define MAX_HOST_LPIS   (BIT(MAX_HOST_LPI_BITS) - 8192)
 
+#define ITS_COMMAND_SIZE        32
+
+static int its_send_command(struct host_its *hw_its, void *its_cmd)
+{
+    int readp, writep;
+
+    spin_lock(&hw_its->cmd_lock);
+
+    readp = readl_relaxed(hw_its->its_base + GITS_CREADR) & GENMASK(19, 5);
+    writep = readl_relaxed(hw_its->its_base + GITS_CWRITER) & GENMASK(19, 5);
+
+    if ( ((writep + ITS_COMMAND_SIZE) % PAGE_SIZE) == readp )
+    {
+        spin_unlock(&hw_its->cmd_lock);
+        return -EBUSY;
+    }
+
+    memcpy(hw_its->cmd_buf + writep, its_cmd, ITS_COMMAND_SIZE);
+    __flush_dcache_area(hw_its->cmd_buf + writep, ITS_COMMAND_SIZE);
+    writep = (writep + ITS_COMMAND_SIZE) % PAGE_SIZE;
+
+    writeq_relaxed(writep & GENMASK(19, 5), hw_its->its_base + GITS_CWRITER);
+
+    spin_unlock(&hw_its->cmd_lock);
+
+    return 0;
+}
+
+static uint64_t encode_rdbase(struct host_its *hw_its, int cpu, uint64_t reg)
+{
+    reg &= ~GENMASK(51, 16);
+
+    if ( hw_its->pta )
+        reg |= per_cpu(rdist_addr, cpu) & GENMASK(51, 16);
+    else
+        reg |= per_cpu(rdist_id, cpu) << 16;
+
+    return reg;
+}
+
+static int its_send_cmd_sync(struct host_its *its, int cpu)
+{
+    uint64_t cmd[4];
+
+    cmd[0] = GITS_CMD_SYNC;
+    cmd[1] = 0x00;
+    cmd[2] = encode_rdbase(its, cpu, 0x0);
+    cmd[3] = 0x00;
+
+    return its_send_command(its, cmd);
+}
+
+static int its_send_cmd_mapc(struct host_its *its, int collection_id, int cpu)
+{
+    uint64_t cmd[4];
+
+    cmd[0] = GITS_CMD_MAPC;
+    cmd[1] = 0x00;
+    cmd[2] = encode_rdbase(its, cpu, (collection_id & GENMASK(15, 0)) | BIT(63));
+    cmd[3] = 0x00;
+
+    return its_send_command(its, cmd);
+}
+
+/* Set up the (1:1) collection mapping for the given host CPU. */
+void gicv3_its_setup_collection(int cpu)
+{
+    struct host_its *its;
+
+    list_for_each_entry(its, &host_its_list, entry)
+    {
+        /* Only send commands to ITS that have been initialized already. */
+        if ( !its->cmd_buf )
+            continue;
+
+        its_send_cmd_mapc(its, cpu, cpu);
+        its_send_cmd_sync(its, cpu);
+    }
+}
+
 #define BASER_ATTR_MASK                                           \
         ((0x3UL << GITS_BASER_SHAREABILITY_SHIFT)               | \
          (0x7UL << GITS_BASER_OUTER_CACHEABILITY_SHIFT)         | \
@@ -147,6 +231,13 @@ int gicv3_its_init(struct host_its *hw_its)
     if ( !hw_its->its_base )
         return -ENOMEM;
 
+    /* Make sure the ITS is disabled before programming the BASE registers. */
+    reg = readl_relaxed(hw_its->its_base + GITS_CTLR);
+    writel_relaxed(reg & ~GITS_CTLR_ENABLE, hw_its->its_base + GITS_CTLR);
+
+    reg = readq_relaxed(hw_its->its_base + GITS_TYPER);
+    hw_its->pta = reg & GITS_TYPER_PTA;
+
     for (i = 0; i < 8; i++)
     {
         void __iomem *basereg = hw_its->its_base + GITS_BASER0 + i * 8;
@@ -174,9 +265,18 @@ int gicv3_its_init(struct host_its *hw_its)
     if ( IS_ERR(hw_its->cmd_buf) )
         return PTR_ERR(hw_its->cmd_buf);
 
+    its_send_cmd_mapc(hw_its, smp_processor_id(), smp_processor_id());
+    its_send_cmd_sync(hw_its, smp_processor_id());
+
     return 0;
 }
 
+void gicv3_set_redist_addr(paddr_t address, int redist_id)
+{
+    this_cpu(rdist_addr) = address;
+    this_cpu(rdist_id) = redist_id;
+}
+
 uint64_t gicv3_lpi_allocate_pendtable(void)
 {
     uint64_t reg, attr;
@@ -265,6 +365,7 @@ void gicv3_its_dt_init(const struct dt_device_node *node)
         its_data->addr = addr;
         its_data->size = size;
         its_data->dt_node = its;
+        spin_lock_init(&its_data->cmd_lock);
 
         printk("GICv3: Found ITS @0x%lx\n", addr);
 
diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
index 5cf4618..b9387a3 100644
--- a/xen/arch/arm/gic-v3.c
+++ b/xen/arch/arm/gic-v3.c
@@ -638,6 +638,8 @@ static void gicv3_rdist_init_lpis(void __iomem * rdist_base)
     table_reg = gicv3_lpi_get_proptable();
     if ( table_reg )
         writeq_relaxed(table_reg, rdist_base + GICR_PROPBASER);
+
+    gicv3_its_setup_collection(smp_processor_id());
 }
 
 static int __init gicv3_populate_rdist(void)
@@ -684,7 +686,22 @@ static int __init gicv3_populate_rdist(void)
                 this_cpu(rbase) = ptr;
 
                 if ( typer & GICR_TYPER_PLPIS )
+                {
+                    paddr_t rdist_addr;
+
+                    rdist_addr = gicv3.rdist_regions[i].base;
+                    rdist_addr += ptr - gicv3.rdist_regions[i].map_base;
+
+                    /* The ITS refers to redistributors either by their physical
+                     * address or by their ID. Determine those two values and
+                     * let the ITS code store them in per host CPU variables to
+                     * later be able to address those redistributors.
+                     */
+                    gicv3_set_redist_addr(rdist_addr,
+                                          (typer >> 8) & GENMASK(15, 0));
+
                     gicv3_rdist_init_lpis(ptr);
+                }
 
                 printk("GICv3: CPU%d: Found redistributor in region %d @%p\n",
                         smp_processor_id(), i, ptr);
diff --git a/xen/include/asm-arm/gic-its.h b/xen/include/asm-arm/gic-its.h
index b2a003f..b49d274 100644
--- a/xen/include/asm-arm/gic-its.h
+++ b/xen/include/asm-arm/gic-its.h
@@ -37,6 +37,7 @@
 
 /* Register bits */
 #define GITS_CTLR_ENABLE     0x1
+#define GITS_TYPER_PTA       BIT(19)
 #define GITS_IIDR_VALUE      0x34c
 
 #define GITS_BASER_VALID                BIT(63)
@@ -59,6 +60,22 @@
                                         (31UL << GITS_BASER_ENTRY_SIZE_SHIFT) |\
                                         GITS_BASER_INDIRECT)
 
+/* ITS command definitions */
+#define ITS_CMD_SIZE                    32
+
+#define GITS_CMD_MOVI                   0x01
+#define GITS_CMD_INT                    0x03
+#define GITS_CMD_CLEAR                  0x04
+#define GITS_CMD_SYNC                   0x05
+#define GITS_CMD_MAPD                   0x08
+#define GITS_CMD_MAPC                   0x09
+#define GITS_CMD_MAPTI                  0x0a
+#define GITS_CMD_MAPI                   0x0b
+#define GITS_CMD_INV                    0x0c
+#define GITS_CMD_INVALL                 0x0d
+#define GITS_CMD_MOVALL                 0x0e
+#define GITS_CMD_DISCARD                0x0f
+
 #ifndef __ASSEMBLY__
 #include <xen/device_tree.h>
 
@@ -69,7 +86,9 @@ struct host_its {
     paddr_t addr;
     paddr_t size;
     void __iomem *its_base;
+    spinlock_t cmd_lock;
     void *cmd_buf;
+    bool pta;
 };
 
 extern struct list_head host_its_list;
@@ -89,6 +108,12 @@ uint64_t gicv3_lpi_allocate_pendtable(void);
 int gicv3_lpi_init_host_lpis(int nr_lpis);
 int gicv3_its_init(struct host_its *hw_its);
 
+/* Set the physical address and ID for each redistributor as read from DT. */
+void gicv3_set_redist_addr(paddr_t address, int redist_id);
+
+/* Map a collection for this host CPU to each host ITS. */
+void gicv3_its_setup_collection(int cpu);
+
 #else
 
 static inline void gicv3_its_dt_init(const struct dt_device_node *node)
@@ -110,6 +135,13 @@ static inline int gicv3_its_init(struct host_its *hw_its)
 {
     return 0;
 }
+static inline void gicv3_set_redist_addr(paddr_t address, int redist_id)
+{
+}
+static inline void gicv3_its_setup_collection(int cpu)
+{
+}
+
 #endif /* CONFIG_HAS_ITS */
 
 #endif /* __ASSEMBLY__ */
-- 
2.9.0



* [RFC PATCH 06/24] ARM: GICv3 ITS: introduce host LPI array
  2016-09-28 18:24 [RFC PATCH 00/24] [FOR 4.9] arm64: Dom0 ITS emulation Andre Przywara
                   ` (4 preceding siblings ...)
  2016-09-28 18:24 ` [RFC PATCH 05/24] ARM: GICv3 ITS: introduce ITS command handling Andre Przywara
@ 2016-09-28 18:24 ` Andre Przywara
  2016-10-27 22:59   ` Stefano Stabellini
  2016-09-28 18:24 ` [RFC PATCH 07/24] ARM: GICv3 ITS: introduce device mapping Andre Przywara
                   ` (18 subsequent siblings)
  24 siblings, 1 reply; 144+ messages in thread
From: Andre Przywara @ 2016-09-28 18:24 UTC (permalink / raw)
  To: Julien Grall, Stefano Stabellini; +Cc: xen-devel

The number of LPIs on a host can be potentially huge (millions),
although in practice it will mostly be reasonable. So prematurely
allocating an array of struct irq_desc's for each LPI is not an option.
However Xen itself does not care about LPIs, as every LPI will be injected
into a guest (Dom0 for now).
Create a dense data structure (8 bytes) for each LPI which holds just
enough information to determine the virtual IRQ number and the VCPU into
which the LPI needs to be injected.
Also, to avoid artificially limiting the number of LPIs, we create a
two-level table for holding those structures.
This patch introduces functions to initialize these tables and to
create, lookup and destroy entries for a given LPI.
We allocate and access LPI information in a way that does not require
a lock.
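The two-level lookup described above can be sketched as below: the first level is an array of page pointers, each page packing 512 of the 8-byte entries, so an LPI number maps to a (chunk, index) pair. The union layout is from the patch; the index helpers are mine:

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

#define LPI_OFFSET          8192    /* LPI IDs start at 8192 */
#define HOST_LPIS_PER_PAGE  (4096 / sizeof(union host_lpi))

/* One dense 8-byte entry per host LPI: just enough to route it. */
union host_lpi {
    uint64_t data;
    struct {
        uint64_t virt_lpi:32;
        uint64_t dom_id:16;
        uint64_t vcpu_id:16;
    };
};

/* Index into the first-level pointer array. */
static uint64_t lpi_to_chunk(uint32_t lpi)
{
    return (lpi - LPI_OFFSET) / HOST_LPIS_PER_PAGE;
}

/* Index into the second-level page of entries. */
static uint64_t lpi_to_idx(uint32_t lpi)
{
    return (lpi - LPI_OFFSET) % HOST_LPIS_PER_PAGE;
}
```

Because each entry fits in one naturally aligned uint64_t, an entry can be updated with a single write to hlpi->data, which is what lets the patch avoid taking a lock on the lookup path.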

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
 xen/arch/arm/gic-its.c        | 154 ++++++++++++++++++++++++++++++++++++++++++
 xen/include/asm-arm/gic-its.h |  18 +++++
 2 files changed, 172 insertions(+)

diff --git a/xen/arch/arm/gic-its.c b/xen/arch/arm/gic-its.c
index 88397bc..2140e4a 100644
--- a/xen/arch/arm/gic-its.c
+++ b/xen/arch/arm/gic-its.c
@@ -18,18 +18,31 @@
 
 #include <xen/config.h>
 #include <xen/lib.h>
+#include <xen/sched.h>
 #include <xen/err.h>
 #include <xen/device_tree.h>
 #include <xen/libfdt/libfdt.h>
 #include <asm/p2m.h>
+#include <asm/domain.h>
 #include <asm/io.h>
 #include <asm/gic.h>
 #include <asm/gic_v3_defs.h>
 #include <asm/gic-its.h>
 
+/* LPIs on the host always go to a guest, so no struct irq_desc for them. */
+union host_lpi {
+    uint64_t data;
+    struct {
+        uint64_t virt_lpi:32;
+        uint64_t dom_id:16;
+        uint64_t vcpu_id:16;
+    };
+};
+
 /* Global state */
 static struct {
     uint8_t *lpi_property;
+    union host_lpi **host_lpis;
     int host_lpi_bits;
 } lpi_data;
 
@@ -43,6 +56,26 @@ static DEFINE_PER_CPU(void *, pending_table);
 #define MAX_HOST_LPI_BITS                                                \
         min_t(unsigned int, lpi_data.host_lpi_bits, CONFIG_HOST_LPI_BITS)
 #define MAX_HOST_LPIS   (BIT(MAX_HOST_LPI_BITS) - 8192)
+#define HOST_LPIS_PER_PAGE      (PAGE_SIZE / sizeof(union host_lpi))
+
+static union host_lpi *gic_find_host_lpi(uint32_t lpi, struct domain *d)
+{
+    union host_lpi *hlpi;
+
+    if ( lpi < 8192 || lpi >= MAX_HOST_LPIS + 8192 )
+        return NULL;
+
+    lpi -= 8192;
+    if ( !lpi_data.host_lpis[lpi / HOST_LPIS_PER_PAGE] )
+        return NULL;
+
+    hlpi = &lpi_data.host_lpis[lpi / HOST_LPIS_PER_PAGE][lpi % HOST_LPIS_PER_PAGE];
+
+    if ( d && hlpi->dom_id != d->domain_id )
+        return NULL;
+
+    return hlpi;
+}
 
 #define ITS_COMMAND_SIZE        32
 
@@ -96,6 +129,33 @@ static int its_send_cmd_sync(struct host_its *its, int cpu)
     return its_send_command(its, cmd);
 }
 
+static int its_send_cmd_discard(struct host_its *its,
+                                uint32_t deviceid, uint32_t eventid)
+{
+    uint64_t cmd[4];
+
+    cmd[0] = GITS_CMD_DISCARD | ((uint64_t)deviceid << 32);
+    cmd[1] = eventid;
+    cmd[2] = 0x00;
+    cmd[3] = 0x00;
+
+    return its_send_command(its, cmd);
+}
+
+static int its_send_cmd_mapti(struct host_its *its,
+                              uint32_t deviceid, uint32_t eventid,
+                              uint32_t pintid, uint16_t icid)
+{
+    uint64_t cmd[4];
+
+    cmd[0] = GITS_CMD_MAPTI | ((uint64_t)deviceid << 32);
+    cmd[1] = eventid | ((uint64_t)pintid << 32);
+    cmd[2] = icid;
+    cmd[3] = 0x00;
+
+    return its_send_command(its, cmd);
+}
+
 static int its_send_cmd_mapc(struct host_its *its, int collection_id, int cpu)
 {
     uint64_t cmd[4];
@@ -330,15 +390,109 @@ uint64_t gicv3_lpi_get_proptable()
     return reg;
 }
 
+/* Allocate the first-level array for host LPIs, which holds pointers
+ * to the pages with the actual "union host_lpi" entries. Our LPI limit
+ * avoids excessive memory usage.
+ */
 int gicv3_lpi_init_host_lpis(int lpi_bits)
 {
+    int nr_lpi_ptrs;
+
     lpi_data.host_lpi_bits = lpi_bits;
 
+    nr_lpi_ptrs = MAX_HOST_LPIS / (PAGE_SIZE / sizeof(union host_lpi));
+
+    lpi_data.host_lpis = xzalloc_array(union host_lpi *, nr_lpi_ptrs);
+    if ( !lpi_data.host_lpis )
+        return -ENOMEM;
+
     printk("GICv3: using at most %ld LPIs on the host.\n", MAX_HOST_LPIS);
 
     return 0;
 }
 
+/* Allocates a new host LPI to be injected as "virt_lpi" into the specified
+ * VCPU. Returns the host LPI ID or a negative error value.
+ */
+int gicv3_lpi_allocate_host_lpi(struct host_its *its,
+                                uint32_t devid, uint32_t eventid,
+                                struct vcpu *v, int virt_lpi)
+{
+    int chunk, i;
+    union host_lpi hlpi, *new_chunk;
+
+    /* TODO: handle some kind of preassigned LPI mapping for DomUs */
+    if ( !its )
+        return -EPERM;
+
+    /* TODO: This could be optimized by storing some "next available" hint and
+     * only iterate if this one doesn't work. But this function should be
+     * called rarely.
+     */
+    for (chunk = 0; chunk < MAX_HOST_LPIS / HOST_LPIS_PER_PAGE; chunk++)
+    {
+        /* If we hit an unallocated chunk, we initialize it and use entry 0. */
+        if ( !lpi_data.host_lpis[chunk] )
+        {
+            new_chunk = alloc_xenheap_pages(0, 0);
+            if ( !new_chunk )
+                return -ENOMEM;
+
+            memset(new_chunk, 0, PAGE_SIZE);
+            lpi_data.host_lpis[chunk] = new_chunk;
+            i = 0;
+        }
+        else
+        {
+            /* Find an unallocated entry in this chunk. */
+            for (i = 0; i < HOST_LPIS_PER_PAGE; i++)
+                if ( !lpi_data.host_lpis[chunk][i].virt_lpi )
+                    break;
+
+            /* If this chunk is fully allocated, advance to the next one. */
+            if ( i == HOST_LPIS_PER_PAGE )
+                continue;
+        }
+
+        hlpi.virt_lpi = virt_lpi;
+        hlpi.dom_id = v->domain->domain_id;
+        hlpi.vcpu_id = v->vcpu_id;
+        lpi_data.host_lpis[chunk][i].data = hlpi.data;
+
+        if (its)
+        {
+            its_send_cmd_mapti(its, devid, eventid,
+                               chunk * HOST_LPIS_PER_PAGE + i + 8192, 0);
+            its_send_cmd_sync(its, 0);
+        }
+
+        return chunk * HOST_LPIS_PER_PAGE + i + 8192;
+    }
+
+    return -ENOSPC;
+}
+
+/* Drop the connection of the given host LPI to a virtual LPI. */
+int gicv3_lpi_drop_host_lpi(struct host_its *its,
+                            uint32_t devid, uint32_t eventid, uint32_t host_lpi)
+{
+    union host_lpi *hlpip;
+
+    if ( !its )
+        return -EPERM;
+
+    hlpip = gic_find_host_lpi(host_lpi, NULL);
+    if ( !hlpip )
+        return -1;
+
+    hlpip->data = 0;
+
+    its_send_cmd_discard(its, devid, eventid);
+
+    return 0;
+}
+
 void gicv3_its_dt_init(const struct dt_device_node *node)
 {
     const struct dt_device_node *its = NULL;
diff --git a/xen/include/asm-arm/gic-its.h b/xen/include/asm-arm/gic-its.h
index b49d274..512a388 100644
--- a/xen/include/asm-arm/gic-its.h
+++ b/xen/include/asm-arm/gic-its.h
@@ -114,6 +114,12 @@ void gicv3_set_redist_addr(paddr_t address, int redist_id);
 /* Map a collection for this host CPU to each host ITS. */
 void gicv3_its_setup_collection(int cpu);
 
+int gicv3_lpi_allocate_host_lpi(struct host_its *its,
+                                uint32_t devid, uint32_t eventid,
+                                struct vcpu *v, int virt_lpi);
+int gicv3_lpi_drop_host_lpi(struct host_its *its,
+                            uint32_t devid, uint32_t eventid,
+                            uint32_t host_lpi);
 #else
 
 static inline void gicv3_its_dt_init(const struct dt_device_node *node)
@@ -141,6 +147,18 @@ static inline void gicv3_set_redist_addr(paddr_t address, int redist_id)
 static inline void gicv3_its_setup_collection(int cpu)
 {
 }
+static inline int gicv3_lpi_allocate_host_lpi(struct host_its *its,
+                                              uint32_t devid, uint32_t eventid,
+                                              struct vcpu *v, int virt_lpi)
+{
+    return 0;
+}
+static inline int gicv3_lpi_drop_host_lpi(struct host_its *its,
+                                          uint32_t devid, uint32_t eventid,
+                                          uint32_t host_lpi)
+{
+    return 0;
+}
 
 #endif /* CONFIG_HAS_ITS */
 
-- 
2.9.0



* [RFC PATCH 07/24] ARM: GICv3 ITS: introduce device mapping
  2016-09-28 18:24 [RFC PATCH 00/24] [FOR 4.9] arm64: Dom0 ITS emulation Andre Przywara
                   ` (5 preceding siblings ...)
  2016-09-28 18:24 ` [RFC PATCH 06/24] ARM: GICv3 ITS: introduce host LPI array Andre Przywara
@ 2016-09-28 18:24 ` Andre Przywara
  2016-10-24 15:31   ` Vijay Kilari
  2016-10-28  0:08   ` Stefano Stabellini
  2016-09-28 18:24 ` [RFC PATCH 08/24] ARM: GICv3: introduce separate pending_irq structs for LPIs Andre Przywara
                   ` (17 subsequent siblings)
  24 siblings, 2 replies; 144+ messages in thread
From: Andre Przywara @ 2016-09-28 18:24 UTC (permalink / raw)
  To: Julien Grall, Stefano Stabellini; +Cc: xen-devel

The ITS uses device IDs to map LPIs to a device. Dom0 will later use
those IDs, which we directly pass on to the host.
For this we have to map each device that Dom0 may request to a host
ITS device with the same identifier.
Allocate the respective memory and put each device on a list, to later
be able to iterate over all devices or to easily tear down guests.
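The ITT sizing done in gicv3_its_map_device() can be sketched as follows: the per-entry size comes from GITS_TYPER (bits 7:4 hold "entry size minus one"), the table covers 2^bits events, and the MAPD command encodes the size field as the number of EventID bits minus one. The helper names are mine:

```c
#include <assert.h>
#include <stdint.h>

/* Bytes needed for an ITT covering 2^bits events, with the per-entry
 * size taken from GITS_TYPER (bits 7:4 encode entry size minus one). */
static uint64_t itt_bytes(unsigned int bits, uint64_t gits_typer)
{
    uint64_t itte_size = ((gits_typer >> 4) & 0xf) + 1;

    return (1ULL << bits) * itte_size;
}

/* The "Size" field of the MAPD command holds EventID bits minus one,
 * limited to 5 bits (GENMASK(4, 0) in the patch). */
static uint64_t mapd_size_field(unsigned int bits)
{
    return (bits - 1) & 0x1f;
}
```

The 256-byte alignment passed to _xmalloc() in the patch matches the alignment MAPD requires for the ITT base address (bits 51:8).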

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
 xen/arch/arm/gic-its.c        | 90 +++++++++++++++++++++++++++++++++++++++++++
 xen/include/asm-arm/gic-its.h | 16 ++++++++
 2 files changed, 106 insertions(+)

diff --git a/xen/arch/arm/gic-its.c b/xen/arch/arm/gic-its.c
index 2140e4a..bf1f5b5 100644
--- a/xen/arch/arm/gic-its.c
+++ b/xen/arch/arm/gic-its.c
@@ -168,6 +168,94 @@ static int its_send_cmd_mapc(struct host_its *its, int collection_id, int cpu)
     return its_send_command(its, cmd);
 }
 
+static int its_send_cmd_mapd(struct host_its *its, uint32_t deviceid,
+                             int size, uint64_t itt_addr, bool valid)
+{
+    uint64_t cmd[4];
+
+    cmd[0] = GITS_CMD_MAPD | ((uint64_t)deviceid << 32);
+    cmd[1] = size & GENMASK(4, 0);
+    cmd[2] = itt_addr & GENMASK(51, 8);
+    if ( valid )
+        cmd[2] |= BIT(63);
+    cmd[3] = 0x00;
+
+    return its_send_command(its, cmd);
+}
+
+int gicv3_its_map_device(struct host_its *hw_its, struct domain *d,
+                         int devid, int bits, bool valid)
+{
+    void *itt_addr = NULL;
+    struct its_devices *dev, *temp;
+    bool reuse_dev = false;
+
+    list_for_each_entry_safe(dev, temp, &hw_its->its_devices, entry)
+    {
+        if ( (dev->d->domain_id != d->domain_id) || (dev->devid != devid) )
+            continue;
+
+        its_send_cmd_mapd(hw_its, dev->devid, 0, 0, false);
+        xfree(dev->itt_addr);
+        if ( !valid )
+        {
+            xfree(dev);
+            list_del(&dev->entry);
+
+            return 0;
+        }
+
+        reuse_dev = true;
+        break;
+    }
+
+    if ( !valid )
+        return 0;
+
+    itt_addr = _xmalloc(BIT(bits) * hw_its->itte_size, 256);
+    if ( !itt_addr )
+        return -ENOMEM;
+
+    if ( !reuse_dev )
+    {
+        dev = xmalloc(struct its_devices);
+        if ( !dev )
+        {
+            xfree(itt_addr);
+            return -ENOMEM;
+        }
+
+        list_add_tail(&dev->entry, &hw_its->its_devices);
+    }
+
+    dev->itt_addr = itt_addr;
+    dev->d = d;
+    dev->devid = devid;
+
+    return its_send_cmd_mapd(hw_its, devid, bits - 1,
+                             itt_addr ? virt_to_maddr(itt_addr) : 0, true);
+}
+
+/* Remove any connections a domain had to any ITS in the system. */
+int its_remove_domain(struct domain *d)
+{
+    struct host_its *its;
+    struct its_devices *dev, *temp;
+
+    list_for_each_entry(its, &host_its_list, entry)
+    {
+        list_for_each_entry_safe(dev, temp, &its->its_devices, entry)
+        {
+            if ( dev->d->domain_id != d->domain_id )
+                continue;
+
+            its_send_cmd_mapd(its, dev->devid, 0, 0, false);
+            xfree(dev->itt_addr);
+            xfree(dev);
+            list_del(&dev->entry);
+        }
+    }
+
+    return 0;
+}
+
 /* Set up the (1:1) collection mapping for the given host CPU. */
 void gicv3_its_setup_collection(int cpu)
 {
@@ -297,6 +385,7 @@ int gicv3_its_init(struct host_its *hw_its)
 
     reg = readq_relaxed(hw_its->its_base + GITS_TYPER);
     hw_its->pta = reg & GITS_TYPER_PTA;
+    hw_its->itte_size = ((reg >> 4) & 0xf) + 1;
 
     for (i = 0; i < 8; i++)
     {
@@ -520,6 +609,7 @@ void gicv3_its_dt_init(const struct dt_device_node *node)
         its_data->size = size;
         its_data->dt_node = its;
         spin_lock_init(&its_data->cmd_lock);
+        INIT_LIST_HEAD(&its_data->its_devices);
 
         printk("GICv3: Found ITS @0x%lx\n", addr);
 
diff --git a/xen/include/asm-arm/gic-its.h b/xen/include/asm-arm/gic-its.h
index 512a388..4e9841a 100644
--- a/xen/include/asm-arm/gic-its.h
+++ b/xen/include/asm-arm/gic-its.h
@@ -79,6 +79,13 @@
 #ifndef __ASSEMBLY__
 #include <xen/device_tree.h>
 
+struct its_devices {
+    struct list_head entry;
+    struct domain *d;
+    void *itt_addr;
+    int devid;
+};
+
 /* data structure for each hardware ITS */
 struct host_its {
     struct list_head entry;
@@ -88,6 +95,8 @@ struct host_its {
     void __iomem *its_base;
     spinlock_t cmd_lock;
     void *cmd_buf;
+    struct list_head its_devices;
+    int itte_size;
     bool pta;
 };
 
@@ -114,6 +123,13 @@ void gicv3_set_redist_addr(paddr_t address, int redist_id);
 /* Map a collection for this host CPU to each host ITS. */
 void gicv3_its_setup_collection(int cpu);
 
+/* Map a device on the host by allocating an ITT on the host (ITS).
+ * "bits" specifies how many events (interrupts) this device will need.
+ * Setting "valid" to false deallocates the device.
+ */
+int gicv3_its_map_device(struct host_its *hw_its, struct domain *d,
+                         int devid, int bits, bool valid);
+
 int gicv3_lpi_allocate_host_lpi(struct host_its *its,
                                 uint32_t devid, uint32_t eventid,
                                 struct vcpu *v, int virt_lpi);
-- 
2.9.0



* [RFC PATCH 08/24] ARM: GICv3: introduce separate pending_irq structs for LPIs
  2016-09-28 18:24 [RFC PATCH 00/24] [FOR 4.9] arm64: Dom0 ITS emulation Andre Przywara
                   ` (6 preceding siblings ...)
  2016-09-28 18:24 ` [RFC PATCH 07/24] ARM: GICv3 ITS: introduce device mapping Andre Przywara
@ 2016-09-28 18:24 ` Andre Przywara
  2016-10-24 15:31   ` Vijay Kilari
                     ` (2 more replies)
  2016-09-28 18:24 ` [RFC PATCH 09/24] ARM: GICv3: forward pending LPIs to guests Andre Przywara
                   ` (16 subsequent siblings)
  24 siblings, 3 replies; 144+ messages in thread
From: Andre Przywara @ 2016-09-28 18:24 UTC (permalink / raw)
  To: Julien Grall, Stefano Stabellini; +Cc: xen-devel

For the same reason that allocating a struct irq_desc for each
possible LPI is not an option, having a struct pending_irq for each LPI
is also not feasible. However we actually only need those when an
interrupt is on a vCPU (or is about to be injected).
Maintain a per-VCPU list of those structs that we use over the lifecycle
of a guest LPI: new entries are allocated when necessary, but retired
entries are reused whenever possible.
Teach the existing VGIC functions to find the right pointer when being
given a virtual LPI number.
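
The allocate-or-reuse scheme can be sketched in plain, self-contained C
(hypothetical, simplified types: standard `calloc` and a hand-rolled singly
linked list stand in for Xen's `xzalloc` and `list_head`):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Simplified stand-in for struct lpi_pending_irq: an entry whose
 * irq field is 0 is considered retired and free for reuse. */
struct lpi_pending {
    struct lpi_pending *next;
    unsigned int irq;
};

/* Find the entry tracking "lpi", optionally allocating one. Retired
 * entries are reused before any new memory is allocated. */
static struct lpi_pending *lpi_to_pending(struct lpi_pending **head,
                                          unsigned int lpi, bool allocate)
{
    struct lpi_pending *p, *empty = NULL;

    for ( p = *head; p; p = p->next )
    {
        if ( p->irq == lpi )            /* already tracked */
            return p;
        if ( p->irq == 0 && !empty )    /* remember first retired slot */
            empty = p;
    }

    if ( !allocate )
        return NULL;

    if ( !empty )                       /* no retired slot: allocate one */
    {
        empty = calloc(1, sizeof(*empty));
        if ( !empty )
            return NULL;
        empty->next = *head;
        *head = empty;
    }
    empty->irq = lpi;
    return empty;
}
```

When an LPI leaves the LRs (see the gic_update_one_lr() hunk below), writing
0 back into the irq field is all it takes to return the entry to the pool.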

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
 xen/arch/arm/gic.c            |  3 +++
 xen/arch/arm/vgic-v3.c        |  2 ++
 xen/arch/arm/vgic.c           | 56 ++++++++++++++++++++++++++++++++++++++++---
 xen/include/asm-arm/domain.h  |  1 +
 xen/include/asm-arm/gic-its.h | 10 ++++++++
 xen/include/asm-arm/vgic.h    |  9 +++++++
 6 files changed, 78 insertions(+), 3 deletions(-)

diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
index 63c744a..ebe4035 100644
--- a/xen/arch/arm/gic.c
+++ b/xen/arch/arm/gic.c
@@ -506,6 +506,9 @@ static void gic_update_one_lr(struct vcpu *v, int i)
                 struct vcpu *v_target = vgic_get_target_vcpu(v, irq);
                 irq_set_affinity(p->desc, cpumask_of(v_target->processor));
             }
+            /* If this was an LPI, mark this struct as available again. */
+            if ( p->irq >= 8192 )
+                p->irq = 0;
         }
     }
 }
diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c
index ec038a3..e9b6490 100644
--- a/xen/arch/arm/vgic-v3.c
+++ b/xen/arch/arm/vgic-v3.c
@@ -1388,6 +1388,8 @@ static int vgic_v3_vcpu_init(struct vcpu *v)
     if ( v->vcpu_id == last_cpu || (v->vcpu_id == (d->max_vcpus - 1)) )
         v->arch.vgic.flags |= VGIC_V3_RDIST_LAST;
 
+    INIT_LIST_HEAD(&v->arch.vgic.pending_lpi_list);
+
     return 0;
 }
 
diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c
index 0965119..b961551 100644
--- a/xen/arch/arm/vgic.c
+++ b/xen/arch/arm/vgic.c
@@ -31,6 +31,8 @@
 #include <asm/mmio.h>
 #include <asm/gic.h>
 #include <asm/vgic.h>
+#include <asm/gic_v3_defs.h>
+#include <asm/gic-its.h>
 
 static inline struct vgic_irq_rank *vgic_get_rank(struct vcpu *v, int rank)
 {
@@ -61,7 +63,7 @@ struct vgic_irq_rank *vgic_rank_irq(struct vcpu *v, unsigned int irq)
     return vgic_get_rank(v, rank);
 }
 
-static void vgic_init_pending_irq(struct pending_irq *p, unsigned int virq)
+void vgic_init_pending_irq(struct pending_irq *p, unsigned int virq)
 {
     INIT_LIST_HEAD(&p->inflight);
     INIT_LIST_HEAD(&p->lr_queue);
@@ -244,10 +246,14 @@ struct vcpu *vgic_get_target_vcpu(struct vcpu *v, unsigned int virq)
 
 static int vgic_get_virq_priority(struct vcpu *v, unsigned int virq)
 {
-    struct vgic_irq_rank *rank = vgic_rank_irq(v, virq);
+    struct vgic_irq_rank *rank;
     unsigned long flags;
     int priority;
 
+    if ( virq >= 8192 )
+        return gicv3_lpi_get_priority(v->domain, virq);
+
+    rank = vgic_rank_irq(v, virq);
     vgic_lock_rank(v, rank, flags);
     priority = rank->priority[virq & INTERRUPT_RANK_MASK];
     vgic_unlock_rank(v, rank, flags);
@@ -446,13 +452,55 @@ int vgic_to_sgi(struct vcpu *v, register_t sgir, enum gic_sgi_mode irqmode, int
     return 1;
 }
 
+/*
+ * Holding struct pending_irq's for each possible virtual LPI in each domain
+ * requires too much Xen memory; also, a malicious guest could potentially
+ * spam Xen with LPI map requests. We cannot cover those with (guest allocated)
+ * ITS memory, so we use a dynamic scheme of allocating struct pending_irq's
+ * on demand.
+ */
+struct pending_irq *lpi_to_pending(struct vcpu *v, unsigned int lpi,
+                                   bool allocate)
+{
+    struct lpi_pending_irq *lpi_irq, *empty = NULL;
+
+    /* TODO: locking! */
+    list_for_each_entry(lpi_irq, &v->arch.vgic.pending_lpi_list, entry)
+    {
+        if ( lpi_irq->pirq.irq == lpi )
+            return &lpi_irq->pirq;
+
+        if ( lpi_irq->pirq.irq == 0 && !empty )
+            empty = lpi_irq;
+    }
+
+    if ( !allocate )
+        return NULL;
+
+    if ( !empty )
+    {
+        empty = xzalloc(struct lpi_pending_irq);
+        vgic_init_pending_irq(&empty->pirq, lpi);
+        list_add_tail(&empty->entry, &v->arch.vgic.pending_lpi_list);
+    } else
+    {
+        empty->pirq.status = 0;
+        empty->pirq.irq = lpi;
+    }
+
+    return &empty->pirq;
+}
+
 struct pending_irq *irq_to_pending(struct vcpu *v, unsigned int irq)
 {
     struct pending_irq *n;
+
     /* Pending irqs allocation strategy: the first vgic.nr_spis irqs
      * are used for SPIs; the rests are used for per cpu irqs */
     if ( irq < 32 )
         n = &v->arch.vgic.pending_irqs[irq];
+    else if ( irq >= 8192 )
+        n = lpi_to_pending(v, irq, true);
     else
         n = &v->domain->arch.vgic.pending_irqs[irq - 32];
     return n;
@@ -480,7 +528,7 @@ void vgic_clear_pending_irqs(struct vcpu *v)
 void vgic_vcpu_inject_irq(struct vcpu *v, unsigned int virq)
 {
     uint8_t priority;
-    struct pending_irq *iter, *n = irq_to_pending(v, virq);
+    struct pending_irq *iter, *n;
     unsigned long flags;
     bool_t running;
 
@@ -488,6 +536,8 @@ void vgic_vcpu_inject_irq(struct vcpu *v, unsigned int virq)
 
     spin_lock_irqsave(&v->arch.vgic.lock, flags);
 
+    n = irq_to_pending(v, virq);
+
     /* vcpu offline */
     if ( test_bit(_VPF_down, &v->pause_flags) )
     {
diff --git a/xen/include/asm-arm/domain.h b/xen/include/asm-arm/domain.h
index 9452fcd..ae8a9de 100644
--- a/xen/include/asm-arm/domain.h
+++ b/xen/include/asm-arm/domain.h
@@ -249,6 +249,7 @@ struct arch_vcpu
         paddr_t rdist_base;
 #define VGIC_V3_RDIST_LAST  (1 << 0)        /* last vCPU of the rdist */
         uint8_t flags;
+        struct list_head pending_lpi_list;
     } vgic;
 
     /* Timer registers  */
diff --git a/xen/include/asm-arm/gic-its.h b/xen/include/asm-arm/gic-its.h
index 4e9841a..1f881c0 100644
--- a/xen/include/asm-arm/gic-its.h
+++ b/xen/include/asm-arm/gic-its.h
@@ -136,6 +136,12 @@ int gicv3_lpi_allocate_host_lpi(struct host_its *its,
 int gicv3_lpi_drop_host_lpi(struct host_its *its,
                             uint32_t devid, uint32_t eventid,
                             uint32_t host_lpi);
+
+static inline int gicv3_lpi_get_priority(struct domain *d, uint32_t lpi)
+{
+    return GIC_PRI_IRQ;
+}
+
 #else
 
 static inline void gicv3_its_dt_init(const struct dt_device_node *node)
@@ -175,6 +181,10 @@ static inline int gicv3_lpi_drop_host_lpi(struct host_its *its,
 {
     return 0;
 }
+static inline int gicv3_lpi_get_priority(struct domain *d, uint32_t lpi)
+{
+    return GIC_PRI_IRQ;
+}
 
 #endif /* CONFIG_HAS_ITS */
 
diff --git a/xen/include/asm-arm/vgic.h b/xen/include/asm-arm/vgic.h
index 300f461..4e29ba6 100644
--- a/xen/include/asm-arm/vgic.h
+++ b/xen/include/asm-arm/vgic.h
@@ -83,6 +83,12 @@ struct pending_irq
     struct list_head lr_queue;
 };
 
+struct lpi_pending_irq
+{
+    struct list_head entry;
+    struct pending_irq pirq;
+};
+
 #define NR_INTERRUPT_PER_RANK   32
 #define INTERRUPT_RANK_MASK (NR_INTERRUPT_PER_RANK - 1)
 
@@ -296,8 +302,11 @@ extern struct vcpu *vgic_get_target_vcpu(struct vcpu *v, unsigned int virq);
 extern void vgic_vcpu_inject_irq(struct vcpu *v, unsigned int virq);
 extern void vgic_vcpu_inject_spi(struct domain *d, unsigned int virq);
 extern void vgic_clear_pending_irqs(struct vcpu *v);
+extern void vgic_init_pending_irq(struct pending_irq *p, unsigned int virq);
 extern struct pending_irq *irq_to_pending(struct vcpu *v, unsigned int irq);
 extern struct pending_irq *spi_to_pending(struct domain *d, unsigned int irq);
+extern struct pending_irq *lpi_to_pending(struct vcpu *v, unsigned int irq,
+                                          bool allocate);
 extern struct vgic_irq_rank *vgic_rank_offset(struct vcpu *v, int b, int n, int s);
 extern struct vgic_irq_rank *vgic_rank_irq(struct vcpu *v, unsigned int irq);
 extern int vgic_emulate(struct cpu_user_regs *regs, union hsr hsr);
-- 
2.9.0




* [RFC PATCH 09/24] ARM: GICv3: forward pending LPIs to guests
  2016-09-28 18:24 [RFC PATCH 00/24] [FOR 4.9] arm64: Dom0 ITS emulation Andre Przywara
                   ` (7 preceding siblings ...)
  2016-09-28 18:24 ` [RFC PATCH 08/24] ARM: GICv3: introduce separate pending_irq structs for LPIs Andre Przywara
@ 2016-09-28 18:24 ` Andre Przywara
  2016-10-28  1:51   ` Stefano Stabellini
  2016-09-28 18:24 ` [RFC PATCH 10/24] ARM: GICv3: enable ITS and LPIs on the host Andre Przywara
                   ` (15 subsequent siblings)
  24 siblings, 1 reply; 144+ messages in thread
From: Andre Przywara @ 2016-09-28 18:24 UTC (permalink / raw)
  To: Julien Grall, Stefano Stabellini; +Cc: xen-devel

Upon receiving an LPI, we need to find the right VCPU and virtual IRQ
number to get this IRQ injected.
Iterate our two-level LPI table to find this information quickly when
the host takes an LPI. Call the existing injection function to let the
GIC emulation deal with this interrupt.
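
A minimal model of such a two-level lookup is sketched below. The names,
field widths and table sizes are assumptions for illustration, not Xen's
actual encoding; the cover letter only states that the routing data for
each host LPI fits into one uint64_t:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Illustrative layout: routing data for one host LPI packed into a
 * single uint64_t (field widths are assumptions for this sketch). */
union host_lpi {
    uint64_t data;
    struct {
        uint32_t virt_lpi;
        uint16_t dom_id;
        uint16_t vcpu_id;
    };
};

#define LPI_BASE            8192U   /* LPI numbers start at 8192 */
#define HOST_LPI_PAGES      256U
#define HOST_LPIS_PER_PAGE  (4096 / sizeof(union host_lpi))

static union host_lpi *host_lpi_pages[HOST_LPI_PAGES]; /* first level */

/* Two-level lookup: a flat array of page pointers, with second-level
 * pages allocated on demand. Returns NULL for out-of-range LPIs or,
 * when alloc is false, for entries in unallocated pages. */
static union host_lpi *find_host_lpi(uint32_t lpi, bool alloc)
{
    uint32_t idx, page;

    if ( lpi < LPI_BASE )
        return NULL;

    idx = lpi - LPI_BASE;
    page = idx / HOST_LPIS_PER_PAGE;
    if ( page >= HOST_LPI_PAGES )
        return NULL;

    if ( !host_lpi_pages[page] )
    {
        if ( !alloc )
            return NULL;
        host_lpi_pages[page] = calloc(HOST_LPIS_PER_PAGE,
                                      sizeof(union host_lpi));
        if ( !host_lpi_pages[page] )
            return NULL;
    }
    return &host_lpi_pages[page][idx % HOST_LPIS_PER_PAGE];
}
```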

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
 xen/arch/arm/gic-its.c    | 32 ++++++++++++++++++++++++++++++++
 xen/arch/arm/gic.c        |  6 ++++--
 xen/include/asm-arm/irq.h |  8 ++++++++
 3 files changed, 44 insertions(+), 2 deletions(-)

diff --git a/xen/arch/arm/gic-its.c b/xen/arch/arm/gic-its.c
index bf1f5b5..b7aa918 100644
--- a/xen/arch/arm/gic-its.c
+++ b/xen/arch/arm/gic-its.c
@@ -77,6 +77,38 @@ static union host_lpi *gic_find_host_lpi(uint32_t lpi, struct domain *d)
     return hlpi;
 }
 
+/* Handle incoming LPIs, which are a bit special, because they are potentially
+ * numerous and also only get injected into guests. Treat them specially here,
+ * by just looking up their target vCPU and virtual LPI number and handing
+ * them over to the injection function.
+ */
+void do_LPI(unsigned int lpi)
+{
+    struct domain *d;
+    union host_lpi *hlpip, hlpi;
+
+    WRITE_SYSREG32(lpi, ICC_EOIR1_EL1);
+    WRITE_SYSREG32(lpi, ICC_DIR_EL1);
+
+    hlpip = gic_find_host_lpi(lpi, NULL);
+    if ( !hlpip )
+        return;
+
+    hlpi.data = hlpip->data;
+
+    if ( !hlpi.virt_lpi )
+        return;
+
+    d = get_domain_by_id(hlpi.dom_id);
+    if ( !d )
+        return;
+
+    if ( hlpi.vcpu_id >= d->max_vcpus )
+        return;
+
+    vgic_vcpu_inject_irq(d->vcpu[hlpi.vcpu_id], hlpi.virt_lpi);
+}
+
 #define ITS_COMMAND_SIZE        32
 
 static int its_send_command(struct host_its *hw_its, void *its_cmd)
diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
index ebe4035..2fad2f1 100644
--- a/xen/arch/arm/gic.c
+++ b/xen/arch/arm/gic.c
@@ -697,8 +697,10 @@ void gic_interrupt(struct cpu_user_regs *regs, int is_fiq)
             local_irq_enable();
             do_IRQ(regs, irq, is_fiq);
             local_irq_disable();
-        }
-        else if (unlikely(irq < 16))
+        } else if ( irq >= 8192 )
+        {
+            do_LPI(irq);
+        } else if ( unlikely(irq < 16) )
         {
             do_sgi(regs, irq);
         }
diff --git a/xen/include/asm-arm/irq.h b/xen/include/asm-arm/irq.h
index 8f7a167..ee47de8 100644
--- a/xen/include/asm-arm/irq.h
+++ b/xen/include/asm-arm/irq.h
@@ -34,6 +34,14 @@ struct irq_desc *__irq_to_desc(int irq);
 
 void do_IRQ(struct cpu_user_regs *regs, unsigned int irq, int is_fiq);
 
+#ifdef CONFIG_HAS_ITS
+void do_LPI(unsigned int irq);
+#else
+static inline void do_LPI(unsigned int irq)
+{
+}
+#endif
+
 #define domain_pirq_to_irq(d, pirq) (pirq)
 
 bool_t is_assignable_irq(unsigned int irq);
-- 
2.9.0




* [RFC PATCH 10/24] ARM: GICv3: enable ITS and LPIs on the host
  2016-09-28 18:24 [RFC PATCH 00/24] [FOR 4.9] arm64: Dom0 ITS emulation Andre Przywara
                   ` (8 preceding siblings ...)
  2016-09-28 18:24 ` [RFC PATCH 09/24] ARM: GICv3: forward pending LPIs to guests Andre Przywara
@ 2016-09-28 18:24 ` Andre Przywara
  2016-10-28 23:07   ` Stefano Stabellini
  2016-09-28 18:24 ` [RFC PATCH 11/24] ARM: vGICv3: handle virtual LPI pending and property tables Andre Przywara
                   ` (14 subsequent siblings)
  24 siblings, 1 reply; 144+ messages in thread
From: Andre Przywara @ 2016-09-28 18:24 UTC (permalink / raw)
  To: Julien Grall, Stefano Stabellini; +Cc: xen-devel

Now that the host part of the ITS code is in place, we can enable the
ITS and also LPIs on each redistributor to get the show rolling.
At this point there would be no LPIs mapped, as guests don't know about
the ITS yet.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
 xen/arch/arm/gic-its.c |  4 ++++
 xen/arch/arm/gic-v3.c  | 19 +++++++++++++++++++
 2 files changed, 23 insertions(+)

diff --git a/xen/arch/arm/gic-its.c b/xen/arch/arm/gic-its.c
index b7aa918..6bac422 100644
--- a/xen/arch/arm/gic-its.c
+++ b/xen/arch/arm/gic-its.c
@@ -449,6 +449,10 @@ int gicv3_its_init(struct host_its *hw_its)
     its_send_cmd_mapc(hw_its, smp_processor_id(), smp_processor_id());
     its_send_cmd_sync(hw_its, smp_processor_id());
 
+    /* Now enable interrupt translation on that ITS. */
+    reg = readl_relaxed(hw_its->its_base + GITS_CTLR);
+    writel_relaxed(reg | GITS_CTLR_ENABLE, hw_its->its_base + GITS_CTLR);
+
     return 0;
 }
 
diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
index b9387a3..57009c6 100644
--- a/xen/arch/arm/gic-v3.c
+++ b/xen/arch/arm/gic-v3.c
@@ -642,6 +642,21 @@ static void gicv3_rdist_init_lpis(void __iomem * rdist_base)
     gicv3_its_setup_collection(smp_processor_id());
 }
 
+/* Enable LPIs on this redistributor (only useful when the host has an ITS). */
+static bool gicv3_enable_lpis(void)
+{
+    uint32_t val;
+
+    val = readl_relaxed(GICD_RDIST_BASE + GICR_TYPER);
+    if ( !(val & GICR_TYPER_PLPIS) )
+        return false;
+
+    val = readl_relaxed(GICD_RDIST_BASE + GICR_CTLR);
+    writel_relaxed(val | GICR_CTLR_ENABLE_LPIS, GICD_RDIST_BASE + GICR_CTLR);
+
+    return true;
+}
+
 static int __init gicv3_populate_rdist(void)
 {
     int i;
@@ -741,6 +756,10 @@ static int gicv3_cpu_init(void)
     if ( gicv3_enable_redist() )
         return -ENODEV;
 
+    /* If the host has any ITSes, enable LPIs now. */
+    if ( !list_empty(&host_its_list) )
+        gicv3_enable_lpis();
+
     /* Set priority on PPI and SGI interrupts */
     priority = (GIC_PRI_IPI << 24 | GIC_PRI_IPI << 16 | GIC_PRI_IPI << 8 |
                 GIC_PRI_IPI);
-- 
2.9.0




* [RFC PATCH 11/24] ARM: vGICv3: handle virtual LPI pending and property tables
  2016-09-28 18:24 [RFC PATCH 00/24] [FOR 4.9] arm64: Dom0 ITS emulation Andre Przywara
                   ` (9 preceding siblings ...)
  2016-09-28 18:24 ` [RFC PATCH 10/24] ARM: GICv3: enable ITS and LPIs on the host Andre Przywara
@ 2016-09-28 18:24 ` Andre Przywara
  2016-10-24 15:32   ` Vijay Kilari
                     ` (2 more replies)
  2016-09-28 18:24 ` [RFC PATCH 12/24] ARM: vGICv3: introduce basic ITS emulation bits Andre Przywara
                   ` (13 subsequent siblings)
  24 siblings, 3 replies; 144+ messages in thread
From: Andre Przywara @ 2016-09-28 18:24 UTC (permalink / raw)
  To: Julien Grall, Stefano Stabellini; +Cc: xen-devel

Allow a guest to provide the address and size for the memory regions
it has reserved for the GICv3 pending and property tables.
We sanitise the various fields of the respective redistributor
registers and map those pages into Xen's address space to have easy
access.
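
The sanitising pattern used below (extract a field, run it through a fixup
function, merge the result back) can be shown standalone. The shareability
encodings (1 InnerShareable, 2 OuterShareable) match the GICv3 BASER
conventions, but the bit position used in this sketch is illustrative only:

```c
#include <assert.h>
#include <stdint.h>

/* The pattern from vgic_sanitise_field(): extract a register field,
 * run it through a per-field fixup function, then merge it back. */
static uint64_t sanitise_field(uint64_t reg, uint64_t mask, int shift,
                               uint64_t (*fn)(uint64_t))
{
    uint64_t field = (reg & mask) >> shift;

    return (reg & ~mask) | (fn(field) << shift);
}

/* Example fixup, mirroring vgic_sanitise_shareability(): demote
 * OuterShareable (2) to InnerShareable (1), pass everything else. */
static uint64_t fix_shareability(uint64_t field)
{
    return (field == 2) ? 1 : field;
}
```

This keeps each register's fixed policy (which fields to demote, which to
pass through) in one small function per field, while the masking/shifting
boilerplate lives in one place.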

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
 xen/arch/arm/vgic-v3.c        | 189 ++++++++++++++++++++++++++++++++++++++----
 xen/arch/arm/vgic.c           |   4 +
 xen/include/asm-arm/domain.h  |   7 +-
 xen/include/asm-arm/gic-its.h |  10 ++-
 xen/include/asm-arm/vgic.h    |   3 +
 5 files changed, 197 insertions(+), 16 deletions(-)

diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c
index e9b6490..8fe8386 100644
--- a/xen/arch/arm/vgic-v3.c
+++ b/xen/arch/arm/vgic-v3.c
@@ -20,12 +20,14 @@
 
 #include <xen/bitops.h>
 #include <xen/config.h>
+#include <xen/domain_page.h>
 #include <xen/lib.h>
 #include <xen/init.h>
 #include <xen/softirq.h>
 #include <xen/irq.h>
 #include <xen/sched.h>
 #include <xen/sizes.h>
+#include <xen/vmap.h>
 #include <asm/current.h>
 #include <asm/mmio.h>
 #include <asm/gic_v3_defs.h>
@@ -228,12 +230,14 @@ static int __vgic_v3_rdistr_rd_mmio_read(struct vcpu *v, mmio_info_t *info,
         goto read_reserved;
 
     case VREG64(GICR_PROPBASER):
-        /* LPI's not implemented */
-        goto read_as_zero_64;
+        if ( !vgic_reg64_check_access(dabt) ) goto bad_width;
+        *r = vgic_reg64_extract(v->domain->arch.vgic.rdist_propbase, info);
+        return 1;
 
     case VREG64(GICR_PENDBASER):
-        /* LPI's not implemented */
-        goto read_as_zero_64;
+        if ( !vgic_reg64_check_access(dabt) ) goto bad_width;
+        *r = vgic_reg64_extract(v->arch.vgic.rdist_pendbase, info);
+        return 1;
 
     case 0x0080:
         goto read_reserved;
@@ -301,11 +305,6 @@ bad_width:
     domain_crash_synchronous();
     return 0;
 
-read_as_zero_64:
-    if ( !vgic_reg64_check_access(dabt) ) goto bad_width;
-    *r = 0;
-    return 1;
-
 read_as_zero_32:
     if ( dabt.size != DABT_WORD ) goto bad_width;
     *r = 0;
@@ -330,11 +329,149 @@ read_unknown:
     return 1;
 }
 
+static uint64_t vgic_sanitise_field(uint64_t reg, uint64_t field_mask,
+                                    int field_shift,
+                                    uint64_t (*sanitise_fn)(uint64_t))
+{
+    uint64_t field = (reg & field_mask) >> field_shift;
+
+    field = sanitise_fn(field) << field_shift;
+    return (reg & ~field_mask) | field;
+}
+
+/* We want to avoid outer shareable. */
+static uint64_t vgic_sanitise_shareability(uint64_t field)
+{
+    switch (field) {
+    case GIC_BASER_OuterShareable:
+        return GIC_BASER_InnerShareable;
+    default:
+        return field;
+    }
+}
+
+/* Avoid any inner non-cacheable mapping. */
+static uint64_t vgic_sanitise_inner_cacheability(uint64_t field)
+{
+    switch (field) {
+    case GIC_BASER_CACHE_nCnB:
+    case GIC_BASER_CACHE_nC:
+        return GIC_BASER_CACHE_RaWb;
+    default:
+        return field;
+    }
+}
+
+/* Non-cacheable or same-as-inner are OK. */
+static uint64_t vgic_sanitise_outer_cacheability(uint64_t field)
+{
+    switch (field) {
+    case GIC_BASER_CACHE_SameAsInner:
+    case GIC_BASER_CACHE_nC:
+        return field;
+    default:
+        return GIC_BASER_CACHE_nC;
+    }
+}
+
+static uint64_t sanitize_propbaser(uint64_t reg)
+{
+    reg = vgic_sanitise_field(reg, GICR_PROPBASER_SHAREABILITY_MASK,
+                              GICR_PROPBASER_SHAREABILITY_SHIFT,
+                              vgic_sanitise_shareability);
+    reg = vgic_sanitise_field(reg, GICR_PROPBASER_INNER_CACHEABILITY_MASK,
+                              GICR_PROPBASER_INNER_CACHEABILITY_SHIFT,
+                              vgic_sanitise_inner_cacheability);
+    reg = vgic_sanitise_field(reg, GICR_PROPBASER_OUTER_CACHEABILITY_MASK,
+                              GICR_PROPBASER_OUTER_CACHEABILITY_SHIFT,
+                              vgic_sanitise_outer_cacheability);
+
+    reg &= ~PROPBASER_RES0_MASK;
+    reg &= ~GENMASK(51, 48);
+    return reg;
+}
+
+static uint64_t sanitize_pendbaser(uint64_t reg)
+{
+    reg = vgic_sanitise_field(reg, GICR_PENDBASER_SHAREABILITY_MASK,
+                              GICR_PENDBASER_SHAREABILITY_SHIFT,
+                              vgic_sanitise_shareability);
+    reg = vgic_sanitise_field(reg, GICR_PENDBASER_INNER_CACHEABILITY_MASK,
+                              GICR_PENDBASER_INNER_CACHEABILITY_SHIFT,
+                              vgic_sanitise_inner_cacheability);
+    reg = vgic_sanitise_field(reg, GICR_PENDBASER_OUTER_CACHEABILITY_MASK,
+                              GICR_PENDBASER_OUTER_CACHEABILITY_SHIFT,
+                              vgic_sanitise_outer_cacheability);
+
+    reg &= ~PENDBASER_RES0_MASK;
+    reg &= ~GENMASK(51, 48);
+    return reg;
+}
+
+/*
+ * Allow mapping some parts of guest memory into Xen's VA space to have easy
+ * access to it. This is to allow ITS configuration data to be held in
+ * guest memory and avoid using Xen memory for that.
+ */
+void *map_guest_pages(struct domain *d, paddr_t guest_addr, int nr_pages)
+{
+    mfn_t onepage;
+    mfn_t *pages;
+    int i;
+    void *ptr;
+
+    /* TODO: free previous mapping, change prototype? use get-put-put? */
+
+    guest_addr &= PAGE_MASK;
+
+    if ( nr_pages == 1 )
+    {
+        pages = &onepage;
+    } else
+    {
+        pages = xmalloc_array(mfn_t, nr_pages);
+        if ( !pages )
+            return NULL;
+    }
+
+    for (i = 0; i < nr_pages; i++)
+    {
+        get_page_from_gfn(d, (guest_addr >> PAGE_SHIFT) + i, NULL, P2M_ALLOC);
+        pages[i] = _mfn((guest_addr + i * PAGE_SIZE) >> PAGE_SHIFT);
+    }
+
+    ptr = vmap(pages, nr_pages);
+
+    if ( nr_pages > 1 )
+        xfree(pages);
+
+    return ptr;
+}
+
+void unmap_guest_pages(void *va, int nr_pages)
+{
+    paddr_t pa;
+    unsigned long i;
+
+    if ( !va )
+        return;
+
+    va = (void *)((uintptr_t)va & PAGE_MASK);
+    pa = virt_to_maddr(va);
+
+    vunmap(va);
+    for (i = 0; i < nr_pages; i++)
+        put_page(mfn_to_page((pa >> PAGE_SHIFT) + i));
+
+    return;
+}
+
 static int __vgic_v3_rdistr_rd_mmio_write(struct vcpu *v, mmio_info_t *info,
                                           uint32_t gicr_reg,
                                           register_t r)
 {
     struct hsr_dabt dabt = info->dabt;
+    uint64_t reg;
 
     switch ( gicr_reg )
     {
@@ -375,13 +512,37 @@ static int __vgic_v3_rdistr_rd_mmio_write(struct vcpu *v, mmio_info_t *info,
     case 0x0050:
         goto write_reserved;
 
-    case VREG64(GICR_PROPBASER):
-        /* LPI is not implemented */
-        goto write_ignore_64;
+    case VREG64(GICR_PROPBASER): {
+        int nr_pages;
+
+        if ( info->dabt.size < DABT_WORD ) goto bad_width;
+        if ( v->arch.vgic.flags & VGIC_V3_LPIS_ENABLED )
+            return 1;
+
+        reg = v->domain->arch.vgic.rdist_propbase;
+        vgic_reg64_update(&reg, r, info);
+        reg = sanitize_propbaser(reg);
+        v->domain->arch.vgic.rdist_propbase = reg;
 
+        nr_pages = BIT((v->domain->arch.vgic.rdist_propbase & 0x1f) + 1) - 8192;
+        nr_pages = DIV_ROUND_UP(nr_pages, PAGE_SIZE);
+        unmap_guest_pages(v->domain->arch.vgic.proptable, nr_pages);
+        v->domain->arch.vgic.proptable = map_guest_pages(v->domain,
+                                                         reg & GENMASK(47, 12),
+                                                         nr_pages);
+        return 1;
+    }
     case VREG64(GICR_PENDBASER):
-        /* LPI is not implemented */
-        goto write_ignore_64;
+        if ( info->dabt.size < DABT_WORD ) goto bad_width;
+        reg = v->arch.vgic.rdist_pendbase;
+        vgic_reg64_update(&reg, r, info);
+        reg = sanitize_pendbaser(reg);
+        v->arch.vgic.rdist_pendbase = reg;
+
+        unmap_guest_pages(v->arch.vgic.pendtable, 16);
+        v->arch.vgic.pendtable = map_guest_pages(v->domain,
+                                                 reg & GENMASK(47, 12), 16);
+        return 1;
 
     case 0x0080:
         goto write_reserved;
diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c
index b961551..4d9304f 100644
--- a/xen/arch/arm/vgic.c
+++ b/xen/arch/arm/vgic.c
@@ -488,6 +488,10 @@ struct pending_irq *lpi_to_pending(struct vcpu *v, unsigned int lpi,
         empty->pirq.irq = lpi;
     }
 
+    /* Update the enabled status */
+    if ( gicv3_lpi_is_enabled(v->domain, lpi) )
+        set_bit(GIC_IRQ_GUEST_ENABLED, &empty->pirq.status);
+
     return &empty->pirq;
 }
 
diff --git a/xen/include/asm-arm/domain.h b/xen/include/asm-arm/domain.h
index ae8a9de..0cd3500 100644
--- a/xen/include/asm-arm/domain.h
+++ b/xen/include/asm-arm/domain.h
@@ -109,6 +109,8 @@ struct arch_domain
         } *rdist_regions;
         int nr_regions;                     /* Number of rdist regions */
         uint32_t rdist_stride;              /* Re-Distributor stride */
+        uint64_t rdist_propbase;
+        uint8_t *proptable;
 #endif
     } vgic;
 
@@ -247,7 +249,10 @@ struct arch_vcpu
 
         /* GICv3: redistributor base and flags for this vCPU */
         paddr_t rdist_base;
-#define VGIC_V3_RDIST_LAST  (1 << 0)        /* last vCPU of the rdist */
+#define VGIC_V3_RDIST_LAST      (1 << 0)        /* last vCPU of the rdist */
+#define VGIC_V3_LPIS_ENABLED    (1 << 1)
+        uint64_t rdist_pendbase;
+        unsigned long *pendtable;
         uint8_t flags;
         struct list_head pending_lpi_list;
     } vgic;
diff --git a/xen/include/asm-arm/gic-its.h b/xen/include/asm-arm/gic-its.h
index 1f881c0..3b2e5c0 100644
--- a/xen/include/asm-arm/gic-its.h
+++ b/xen/include/asm-arm/gic-its.h
@@ -139,7 +139,11 @@ int gicv3_lpi_drop_host_lpi(struct host_its *its,
 
 static inline int gicv3_lpi_get_priority(struct domain *d, uint32_t lpi)
 {
-    return GIC_PRI_IRQ;
+    return d->arch.vgic.proptable[lpi - 8192] & 0xfc;
+}
+static inline bool gicv3_lpi_is_enabled(struct domain *d, uint32_t lpi)
+{
+    return d->arch.vgic.proptable[lpi - 8192] & LPI_PROP_ENABLED;
 }
 
 #else
@@ -185,6 +189,10 @@ static inline int gicv3_lpi_get_priority(struct domain *d, uint32_t lpi)
 {
     return GIC_PRI_IRQ;
 }
+static inline bool gicv3_lpi_is_enabled(struct domain *d, uint32_t lpi)
+{
+    return false;
+}
 
 #endif /* CONFIG_HAS_ITS */
 
diff --git a/xen/include/asm-arm/vgic.h b/xen/include/asm-arm/vgic.h
index 4e29ba6..2b216cc 100644
--- a/xen/include/asm-arm/vgic.h
+++ b/xen/include/asm-arm/vgic.h
@@ -285,6 +285,9 @@ VGIC_REG_HELPERS(32, 0x3);
 
 #undef VGIC_REG_HELPERS
 
+void *map_guest_pages(struct domain *d, paddr_t guest_addr, int nr_pages);
+void unmap_guest_pages(void *va, int nr_pages);
+
 enum gic_sgi_mode;
 
 /*
-- 
2.9.0




* [RFC PATCH 12/24] ARM: vGICv3: introduce basic ITS emulation bits
  2016-09-28 18:24 [RFC PATCH 00/24] [FOR 4.9] arm64: Dom0 ITS emulation Andre Przywara
                   ` (10 preceding siblings ...)
  2016-09-28 18:24 ` [RFC PATCH 11/24] ARM: vGICv3: handle virtual LPI pending and property tables Andre Przywara
@ 2016-09-28 18:24 ` Andre Przywara
  2016-10-09 14:20   ` Vijay Kilari
                     ` (3 more replies)
  2016-09-28 18:24 ` [RFC PATCH 13/24] ARM: vITS: handle CLEAR command Andre Przywara
                   ` (12 subsequent siblings)
  24 siblings, 4 replies; 144+ messages in thread
From: Andre Przywara @ 2016-09-28 18:24 UTC (permalink / raw)
  To: Julien Grall, Stefano Stabellini; +Cc: xen-devel

Create a new file to hold the emulation code for the ITS widget.
For now we emulate the memory mapped ITS registers and provide a stub
to introduce the ITS command handling framework (but without actually
emulating any commands at this time).
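
The command-decoding helper introduced below can be exercised standalone.
This sketch assumes a little-endian host and therefore omits the
le64_to_cpu() conversion that the patch performs:

```c
#include <assert.h>
#include <stdint.h>

/* Sketch of its_cmd_mask_field(): an ITS command is four 64-bit
 * little-endian words, and each field of interest is described by a
 * (word, shift, size) triple, with size < 64. */
static uint64_t cmd_field(const uint64_t *cmd, int word, int shift, int size)
{
    return (cmd[word] >> shift) & ((1ULL << size) - 1);
}

#define cmd_get_command(c)    cmd_field(c, 0,  0,  8)
#define cmd_get_deviceid(c)   cmd_field(c, 0, 32, 32)
#define cmd_get_id(c)         cmd_field(c, 1,  0, 32)
```

Keeping the accessors as macros over one masking helper, as the patch does,
means each new command only needs its field triples spelled out once.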

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
 xen/arch/arm/Makefile             |   1 +
 xen/arch/arm/vgic-its.c           | 378 ++++++++++++++++++++++++++++++++++++++
 xen/arch/arm/vgic-v3.c            |   9 -
 xen/include/asm-arm/gic_v3_defs.h |  19 ++
 4 files changed, 398 insertions(+), 9 deletions(-)
 create mode 100644 xen/arch/arm/vgic-its.c

diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
index c2c4daa..cb0201f 100644
--- a/xen/arch/arm/Makefile
+++ b/xen/arch/arm/Makefile
@@ -44,6 +44,7 @@ obj-y += traps.o
 obj-y += vgic.o
 obj-y += vgic-v2.o
 obj-$(CONFIG_ARM_64) += vgic-v3.o
+obj-$(CONFIG_HAS_ITS) += vgic-its.o
 obj-y += vm_event.o
 obj-y += vtimer.o
 obj-y += vpsci.o
diff --git a/xen/arch/arm/vgic-its.c b/xen/arch/arm/vgic-its.c
new file mode 100644
index 0000000..875b992
--- /dev/null
+++ b/xen/arch/arm/vgic-its.c
@@ -0,0 +1,378 @@
+/*
+ * xen/arch/arm/vgic-its.c
+ *
+ * ARM Interrupt Translation Service (ITS) emulation
+ *
+ * Andre Przywara <andre.przywara@arm.com>
+ * Copyright (c) 2016 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include <xen/bitops.h>
+#include <xen/config.h>
+#include <xen/domain_page.h>
+#include <xen/lib.h>
+#include <xen/init.h>
+#include <xen/softirq.h>
+#include <xen/irq.h>
+#include <xen/sched.h>
+#include <xen/sizes.h>
+#include <asm/current.h>
+#include <asm/mmio.h>
+#include <asm/gic_v3_defs.h>
+#include <asm/gic-its.h>
+#include <asm/vgic.h>
+#include <asm/vgic-emul.h>
+
+/* Data structure to describe a virtual ITS */
+struct virt_its {
+    struct domain *d;
+    struct host_its *hw_its;
+    spinlock_t vcmd_lock;       /* protects the virtual command buffer */
+    uint64_t cbaser;
+    uint64_t *cmdbuf;
+    int cwriter;
+    int creadr;
+    spinlock_t its_lock;        /* protects the collection and device tables */
+    uint64_t baser0, baser1;
+    uint16_t *coll_table;
+    int max_collections;
+    uint64_t *dev_table;
+    int max_devices;
+    bool enabled;
+};
+
+/* An Interrupt Translation Table Entry: this is indexed by a
+ * DeviceID/EventID pair and is located in guest memory.
+ */
+struct vits_itte
+{
+    uint64_t hlpi:24;
+    uint64_t vlpi:24;
+    uint64_t collection:16;
+};
+
+/**************************************
+ * Functions that handle ITS commands *
+ **************************************/
+
+static uint64_t its_cmd_mask_field(uint64_t *its_cmd,
+                                   int word, int shift, int size)
+{
+    return (le64_to_cpu(its_cmd[word]) >> shift) & (BIT(size) - 1);
+}
+
+#define its_cmd_get_command(cmd)        its_cmd_mask_field(cmd, 0,  0,  8)
+#define its_cmd_get_deviceid(cmd)       its_cmd_mask_field(cmd, 0, 32, 32)
+#define its_cmd_get_size(cmd)           its_cmd_mask_field(cmd, 1,  0,  5)
+#define its_cmd_get_id(cmd)             its_cmd_mask_field(cmd, 1,  0, 32)
+#define its_cmd_get_physical_id(cmd)    its_cmd_mask_field(cmd, 1, 32, 32)
+#define its_cmd_get_collection(cmd)     its_cmd_mask_field(cmd, 2,  0, 16)
+#define its_cmd_get_target_addr(cmd)    its_cmd_mask_field(cmd, 2, 16, 32)
+#define its_cmd_get_validbit(cmd)       its_cmd_mask_field(cmd, 2, 63,  1)
+
+#define ITS_CMD_BUFFER_SIZE(baser)      ((((baser) & 0xff) + 1) << 12)
+
+static int vgic_its_handle_cmds(struct domain *d, struct virt_its *its,
+                                uint32_t writer)
+{
+    uint64_t *cmdptr;
+
+    if ( !its->cmdbuf )
+        return -1;
+
+    if ( writer >= ITS_CMD_BUFFER_SIZE(its->cbaser) )
+        return -1;
+
+    spin_lock(&its->vcmd_lock);
+
+    while ( its->creadr != writer )
+    {
+        cmdptr = its->cmdbuf + (its->creadr / sizeof(*its->cmdbuf));
+        switch (its_cmd_get_command(cmdptr))
+        {
+        case GITS_CMD_SYNC:
+            /* We handle ITS commands synchronously, so we ignore SYNC. */
+            break;
+        default:
+            gdprintk(XENLOG_G_WARNING, "ITS: unhandled ITS command %ld\n",
+                   its_cmd_get_command(cmdptr));
+            break;
+        }
+
+        its->creadr += ITS_CMD_SIZE;
+        if ( its->creadr == ITS_CMD_BUFFER_SIZE(its->cbaser) )
+            its->creadr = 0;
+    }
+    its->cwriter = writer;
+
+    spin_unlock(&its->vcmd_lock);
+
+    return 0;
+}
+
+/*****************************
+ * ITS registers read access *
+ *****************************/
+
+/*
+ * The physical address is encoded slightly differently depending on
+ * the used page size: the highest four bits are stored in the lowest
+ * four bits of the field for 64K pages.
+ */
+static paddr_t get_baser_phys_addr(uint64_t reg)
+{
+    if ( reg & BIT(9) )
+        return (reg & GENMASK(47, 16)) | ((reg & GENMASK(15, 12)) << 36);
+    else
+        return reg & GENMASK(47, 12);
+}
+
+static int vgic_v3_its_mmio_read(struct vcpu *v, mmio_info_t *info,
+                                 register_t *r, void *priv)
+{
+    struct virt_its *its = priv;
+
+    switch ( info->gpa & 0xffff )
+    {
+    case VREG32(GITS_CTLR):
+        if ( info->dabt.size != DABT_WORD ) goto bad_width;
+        *r = vgic_reg32_extract(its->enabled | BIT(31), info);
+        break;
+    case VREG32(GITS_IIDR):
+        if ( info->dabt.size != DABT_WORD ) goto bad_width;
+        *r = vgic_reg32_extract(GITS_IIDR_VALUE, info);
+        break;
+    case VREG64(GITS_TYPER):
+        if ( info->dabt.size < DABT_WORD ) goto bad_width;
+        *r = vgic_reg64_extract(0x1eff1, info);
+        break;
+    case VREG64(GITS_CBASER):
+        if ( info->dabt.size < DABT_WORD ) goto bad_width;
+        *r = vgic_reg64_extract(its->cbaser, info);
+        break;
+    case VREG64(GITS_CWRITER):
+        if ( info->dabt.size < DABT_WORD ) goto bad_width;
+        *r = vgic_reg64_extract(its->cwriter, info);
+        break;
+    case VREG64(GITS_CREADR):
+        if ( info->dabt.size < DABT_WORD ) goto bad_width;
+        *r = vgic_reg64_extract(its->creadr, info);
+        break;
+    case VREG64(GITS_BASER0):
+        if ( info->dabt.size < DABT_WORD ) goto bad_width;
+        *r = vgic_reg64_extract(its->baser0, info);
+        break;
+    case VREG64(GITS_BASER1):
+        if ( info->dabt.size < DABT_WORD ) goto bad_width;
+        *r = vgic_reg64_extract(its->baser1, info);
+        break;
+    case VRANGE64(GITS_BASER2, GITS_BASER7):
+        if ( info->dabt.size < DABT_WORD ) goto bad_width;
+        *r = vgic_reg64_extract(0, info);
+        break;
+    case VREG32(GICD_PIDR2):
+        if ( info->dabt.size != DABT_WORD ) goto bad_width;
+        *r = vgic_reg32_extract(GICV3_GICD_PIDR2, info);
+        break;
+    default:
+        /* Without this, *r would be left uninitialized for unhandled offsets. */
+        gdprintk(XENLOG_G_WARNING, "ITS: unhandled ITS register read 0x%lx\n",
+                 info->gpa & 0xffff);
+        return 0;
+    }
+
+    return 1;
+
+bad_width:
+    domain_crash_synchronous();
+
+    return 0;
+}
+
+/******************************
+ * ITS registers write access *
+ ******************************/
+
+static int its_baser_table_size(uint64_t baser)
+{
+    int page_size = 0;
+
+    switch ( (baser >> 8) & 3 )
+    {
+    case 0: page_size = SZ_4K; break;
+    case 1: page_size = SZ_16K; break;
+    case 2:
+    case 3: page_size = SZ_64K; break;
+    }
+
+    return page_size * ((baser & GENMASK(7, 0)) + 1);
+}
+
+static int its_baser_nr_entries(uint64_t baser)
+{
+    int entry_size = ((baser & GENMASK(52, 48)) >> 48) + 1;
+
+    return its_baser_table_size(baser) / entry_size;
+}
+
+static int vgic_v3_its_mmio_write(struct vcpu *v, mmio_info_t *info,
+                                  register_t r, void *priv)
+{
+    struct domain *d = v->domain;
+    struct virt_its *its = priv;
+    uint64_t reg;
+    uint32_t ctlr;
+
+    switch ( info->gpa & 0xffff )
+    {
+    case VREG32(GITS_CTLR):
+        if ( info->dabt.size != DABT_WORD ) goto bad_width;
+        ctlr = its->enabled ? GITS_CTLR_ENABLE : 0;
+        vgic_reg32_update(&ctlr, r, info);
+        its->enabled = ctlr & GITS_CTLR_ENABLE;
+        /* TODO: trigger something ... */
+        return 1;
+    case VREG32(GITS_IIDR):
+        goto write_ignore_32;
+    case VREG32(GITS_TYPER):
+        goto write_ignore_32;
+    case VREG64(GITS_CBASER):
+        if ( info->dabt.size < DABT_WORD ) goto bad_width;
+
+        /* Changing base registers with the ITS enabled is UNPREDICTABLE. */
+        if ( its->enabled )
+            return 1;
+
+        reg = its->cbaser;
+        vgic_reg64_update(&reg, r, info);
+        /* TODO: sanitise! */
+        its->cbaser = reg;
+
+        if ( reg & BIT(63) )
+        {
+            its->cmdbuf = map_guest_pages(d, reg & GENMASK(51, 12), 1);
+        }
+        else
+        {
+            unmap_guest_pages(its->cmdbuf, 1);
+            its->cmdbuf = NULL;
+        }
+
+        return 1;
+    case VREG64(GITS_CWRITER):
+        if ( info->dabt.size < DABT_WORD ) goto bad_width;
+        reg = its->cwriter;
+        vgic_reg64_update(&reg, r, info);
+        vgic_its_handle_cmds(d, its, reg);
+        return 1;
+    case VREG64(GITS_CREADR):
+        goto write_ignore_64;
+    case VREG64(GITS_BASER0):
+        if ( info->dabt.size < DABT_WORD ) goto bad_width;
+
+        /* Changing base registers with the ITS enabled is UNPREDICTABLE. */
+        if ( its->enabled )
+            return 1;
+
+        reg = its->baser0;
+        vgic_reg64_update(&reg, r, info);
+
+        reg &= ~GITS_BASER_RO_MASK;
+        reg |= (sizeof(uint64_t) - 1) << GITS_BASER_ENTRY_SIZE_SHIFT;
+        reg |= GITS_BASER_TYPE_DEVICE << GITS_BASER_TYPE_SHIFT;
+        /* TODO: sanitise! */
+        /* TODO: locking(?) */
+
+        if ( reg & GITS_BASER_VALID )
+        {
+            its->dev_table = map_guest_pages(d,
+                                             get_baser_phys_addr(reg),
+                                             its_baser_table_size(reg) >> PAGE_SHIFT);
+            its->max_devices = its_baser_nr_entries(reg);
+            memset(its->dev_table, 0, its->max_devices * sizeof(uint64_t));
+        }
+        else
+        {
+            unmap_guest_pages(its->dev_table,
+                              its_baser_table_size(reg) >> PAGE_SHIFT);
+            its->max_devices = 0;
+        }
+
+        its->baser0 = reg;
+        return 1;
+    case VREG64(GITS_BASER1):
+        if ( info->dabt.size < DABT_WORD ) goto bad_width;
+
+        /* Changing base registers with the ITS enabled is UNPREDICTABLE. */
+        if ( its->enabled )
+            return 1;
+
+        reg = its->baser1;
+        vgic_reg64_update(&reg, r, info);
+        reg &= ~GITS_BASER_RO_MASK;
+        reg |= (sizeof(uint16_t) - 1) << GITS_BASER_ENTRY_SIZE_SHIFT;
+        reg |= GITS_BASER_TYPE_COLLECTION << GITS_BASER_TYPE_SHIFT;
+        /* TODO: sanitise! */
+
+        /* TODO: sort out locking */
+        /* TODO: repeated calls: free old mapping */
+        if ( reg & GITS_BASER_VALID )
+        {
+            its->coll_table = map_guest_pages(d, get_baser_phys_addr(reg),
+                                              its_baser_table_size(reg) >> PAGE_SHIFT);
+            its->max_collections = its_baser_nr_entries(reg);
+            memset(its->coll_table, 0xff,
+                   its->max_collections * sizeof(uint16_t));
+        }
+        else
+        {
+            unmap_guest_pages(its->coll_table,
+                              its_baser_table_size(reg) >> PAGE_SHIFT);
+            its->max_collections = 0;
+        }
+        its->baser1 = reg;
+        return 1;
+    case VRANGE64(GITS_BASER2, GITS_BASER7):
+        goto write_ignore_64;
+    default:
+        gdprintk(XENLOG_G_WARNING, "ITS: unhandled ITS register 0x%lx\n",
+                 info->gpa & 0xffff);
+        return 0;
+    }
+
+    return 1;
+
+write_ignore_64:
+    if ( !vgic_reg64_check_access(info->dabt) ) goto bad_width;
+    return 1;
+
+write_ignore_32:
+    if ( info->dabt.size != DABT_WORD ) goto bad_width;
+    return 1;
+
+bad_width:
+    printk(XENLOG_G_ERR "%pv vGITS: bad write width %d r%d offset %#08lx\n",
+           v, info->dabt.size, info->dabt.reg, info->gpa & 0xffff);
+
+    domain_crash_synchronous();
+
+    return 0;
+}
+
+static const struct mmio_handler_ops vgic_its_mmio_handler = {
+    .read  = vgic_v3_its_mmio_read,
+    .write = vgic_v3_its_mmio_write,
+};
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c
index 8fe8386..aa53a1e 100644
--- a/xen/arch/arm/vgic-v3.c
+++ b/xen/arch/arm/vgic-v3.c
@@ -158,15 +158,6 @@ static void vgic_store_irouter(struct domain *d, struct vgic_irq_rank *rank,
     rank->vcpu[offset] = new_vcpu->vcpu_id;
 }
 
-static inline bool vgic_reg64_check_access(struct hsr_dabt dabt)
-{
-    /*
-     * 64 bits registers can be accessible using 32-bit and 64-bit unless
-     * stated otherwise (See 8.1.3 ARM IHI 0069A).
-     */
-    return ( dabt.size == DABT_DOUBLE_WORD || dabt.size == DABT_WORD );
-}
-
 static int __vgic_v3_rdistr_rd_mmio_read(struct vcpu *v, mmio_info_t *info,
                                          uint32_t gicr_reg,
                                          register_t *r)
diff --git a/xen/include/asm-arm/gic_v3_defs.h b/xen/include/asm-arm/gic_v3_defs.h
index da5fb77..6a91f5b 100644
--- a/xen/include/asm-arm/gic_v3_defs.h
+++ b/xen/include/asm-arm/gic_v3_defs.h
@@ -147,6 +147,16 @@
 #define LPI_PROP_RES1                (1 << 1)
 #define LPI_PROP_ENABLED             (1 << 0)
 
+/*
+ * PIDR2: Only bits[7:4] are not implementation defined. We are
+ * emulating a GICv3 ([7:4] = 0x3).
+ *
+ * We don't emulate a specific register scheme, so we implement the
+ * other bits as RES0, as recommended by the spec (see 8.1.13 in
+ * ARM IHI 0069A).
+ */
+#define GICV3_GICD_PIDR2  0x30
+#define GICV3_GICR_PIDR2  GICV3_GICD_PIDR2
+
 #define GICH_VMCR_EOI                (1 << 9)
 #define GICH_VMCR_VENG1              (1 << 1)
 
@@ -190,6 +200,15 @@ struct rdist_region {
     bool single_rdist;
 };
 
+/*
+ * 64 bits registers can be accessible using 32-bit and 64-bit unless
+ * stated otherwise (See 8.1.3 ARM IHI 0069A).
+ */
+static inline bool vgic_reg64_check_access(struct hsr_dabt dabt)
+{
+    return ( dabt.size == DABT_DOUBLE_WORD || dabt.size == DABT_WORD );
+}
+
 #endif /* __ASM_ARM_GIC_V3_DEFS_H__ */
 
 /*
-- 
2.9.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 144+ messages in thread

* [RFC PATCH 13/24] ARM: vITS: handle CLEAR command
  2016-09-28 18:24 [RFC PATCH 00/24] [FOR 4.9] arm64: Dom0 ITS emulation Andre Przywara
                   ` (11 preceding siblings ...)
  2016-09-28 18:24 ` [RFC PATCH 12/24] ARM: vGICv3: introduce basic ITS emulation bits Andre Przywara
@ 2016-09-28 18:24 ` Andre Przywara
  2016-11-04 15:48   ` Julien Grall
  2016-11-09  0:39   ` Stefano Stabellini
  2016-09-28 18:24 ` [RFC PATCH 14/24] ARM: vITS: handle INT command Andre Przywara
                   ` (11 subsequent siblings)
  24 siblings, 2 replies; 144+ messages in thread
From: Andre Przywara @ 2016-09-28 18:24 UTC (permalink / raw)
  To: Julien Grall, Stefano Stabellini; +Cc: xen-devel

This introduces the ITS command handler for the CLEAR command, which
clears the pending state of an LPI.
This removes a not-yet injected, but already queued IRQ from a VCPU.

In addition this patch introduces the lookup function which translates
a given DeviceID/EventID pair into a pointer to our vITTE structure.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
 xen/arch/arm/vgic-its.c | 115 ++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 115 insertions(+)

diff --git a/xen/arch/arm/vgic-its.c b/xen/arch/arm/vgic-its.c
index 875b992..99d9e9c 100644
--- a/xen/arch/arm/vgic-its.c
+++ b/xen/arch/arm/vgic-its.c
@@ -61,6 +61,73 @@ struct vits_itte
     uint64_t collection:16;
 };
 
+#define UNMAPPED_COLLECTION      ((uint16_t)~0)
+
+/* Must be called with the ITS lock held. */
+static struct vcpu *get_vcpu_from_collection(struct virt_its *its, int collid)
+{
+    uint16_t vcpu_id;
+
+    if ( collid >= its->max_collections )
+        return NULL;
+
+    vcpu_id = its->coll_table[collid];
+    if ( vcpu_id == UNMAPPED_COLLECTION || vcpu_id >= its->d->max_vcpus )
+        return NULL;
+
+    return its->d->vcpu[vcpu_id];
+}
+
+#define DEV_TABLE_ITT_ADDR(x) ((x) & GENMASK(51, 8))
+#define DEV_TABLE_ITT_SIZE(x) (BIT(((x) & GENMASK(7, 0)) + 1))
+#define DEV_TABLE_ENTRY(addr, bits)                     \
+        (((addr) & GENMASK(51, 8)) | (((bits) - 1) & GENMASK(7, 0)))
+
+static paddr_t get_itte_address(struct virt_its *its,
+                                uint32_t devid, uint32_t evid)
+{
+    paddr_t addr;
+
+    if ( devid >= its->max_devices )
+        return ~0;
+
+    if ( evid >= DEV_TABLE_ITT_SIZE(its->dev_table[devid]) )
+        return ~0;
+
+    addr = DEV_TABLE_ITT_ADDR(its->dev_table[devid]);
+
+    return addr + evid * sizeof(struct vits_itte);
+}
+
+/*
+ * Looks up a given deviceID/eventID pair on an ITS and returns a pointer to
+ * the corresponding ITTE. This maps the respective guest page into Xen.
+ * Once finished with handling the ITTE, call put_devid_evid() to unmap
+ * the page again.
+ * Must be called with the ITS lock held.
+ */
+static struct vits_itte *get_devid_evid(struct virt_its *its,
+                                        uint32_t devid, uint32_t evid)
+{
+    paddr_t addr = get_itte_address(its, devid, evid);
+    struct vits_itte *itte;
+
+    if ( addr == ~0 )
+        return NULL;
+
+    /* TODO: check locking for map_guest_pages() */
+    itte = map_guest_pages(its->d, addr & PAGE_MASK, 1);
+    if ( !itte )
+        return NULL;
+
+    return itte + (addr & ~PAGE_MASK) / sizeof(struct vits_itte);
+}
+
+/* Must be called with the ITS lock held. */
+static void put_devid_evid(struct virt_its *its, struct vits_itte *itte)
+{
+    unmap_guest_pages(itte, 1);
+}
+
 /**************************************
  * Functions that handle ITS commands *
  **************************************/
@@ -80,6 +147,51 @@ static uint64_t its_cmd_mask_field(uint64_t *its_cmd,
 #define its_cmd_get_target_addr(cmd)    its_cmd_mask_field(cmd, 2, 16, 32)
 #define its_cmd_get_validbit(cmd)       its_cmd_mask_field(cmd, 2, 63,  1)
 
+static int its_handle_clear(struct virt_its *its, uint64_t *cmdptr)
+{
+    uint32_t devid = its_cmd_get_deviceid(cmdptr);
+    uint32_t eventid = its_cmd_get_id(cmdptr);
+    struct pending_irq *pirq;
+    struct vits_itte *itte;
+    struct vcpu *vcpu;
+    uint32_t vlpi;
+
+    spin_lock(&its->its_lock);
+
+    itte = get_devid_evid(its, devid, eventid);
+    if ( !itte )
+    {
+        spin_unlock(&its->its_lock);
+        return -1;
+    }
+
+    vcpu = get_vcpu_from_collection(its, itte->collection);
+    if ( !vcpu )
+    {
+        spin_unlock(&its->its_lock);
+        return -1;
+    }
+
+    vlpi = itte->vlpi;
+
+    put_devid_evid(its, itte);
+    spin_unlock(&its->its_lock);
+
+    /* Remove a pending, but not yet injected guest IRQ. */
+    pirq = lpi_to_pending(vcpu, vlpi, false);
+    if ( pirq )
+    {
+        clear_bit(GIC_IRQ_GUEST_QUEUED, &pirq->status);
+        gic_remove_from_queues(vcpu, vlpi);
+
+        /* Mark this pending IRQ struct as available again. */
+        if ( !test_bit(GIC_IRQ_GUEST_VISIBLE, &pirq->status) )
+            pirq->irq = 0;
+    }
+
+    return 0;
+}
+
 #define ITS_CMD_BUFFER_SIZE(baser)      ((((baser) & 0xff) + 1) << 12)
 
 static int vgic_its_handle_cmds(struct domain *d, struct virt_its *its,
@@ -100,6 +212,9 @@ static int vgic_its_handle_cmds(struct domain *d, struct virt_its *its,
         cmdptr = its->cmdbuf + (its->creadr / sizeof(*its->cmdbuf));
         switch (its_cmd_get_command(cmdptr))
         {
+        case GITS_CMD_CLEAR:
+            its_handle_clear(its, cmdptr);
+            break;
         case GITS_CMD_SYNC:
             /* We handle ITS commands synchronously, so we ignore SYNC. */
 	    break;
-- 
2.9.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 144+ messages in thread

* [RFC PATCH 14/24] ARM: vITS: handle INT command
  2016-09-28 18:24 [RFC PATCH 00/24] [FOR 4.9] arm64: Dom0 ITS emulation Andre Przywara
                   ` (12 preceding siblings ...)
  2016-09-28 18:24 ` [RFC PATCH 13/24] ARM: vITS: handle CLEAR command Andre Przywara
@ 2016-09-28 18:24 ` Andre Przywara
  2016-11-09  0:42   ` Stefano Stabellini
  2016-09-28 18:24 ` [RFC PATCH 15/24] ARM: vITS: handle MAPC command Andre Przywara
                   ` (10 subsequent siblings)
  24 siblings, 1 reply; 144+ messages in thread
From: Andre Przywara @ 2016-09-28 18:24 UTC (permalink / raw)
  To: Julien Grall, Stefano Stabellini; +Cc: xen-devel

The INT command sets a given LPI identified by a DeviceID/EventID pair
as pending and thus triggers it to be injected.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
 xen/arch/arm/vgic-its.c | 34 ++++++++++++++++++++++++++++++++++
 1 file changed, 34 insertions(+)

diff --git a/xen/arch/arm/vgic-its.c b/xen/arch/arm/vgic-its.c
index 99d9e9c..7072753 100644
--- a/xen/arch/arm/vgic-its.c
+++ b/xen/arch/arm/vgic-its.c
@@ -192,6 +192,37 @@ static int its_handle_clear(struct virt_its *its, uint64_t *cmdptr)
     return 0;
 }
 
+static int its_handle_int(struct virt_its *its, uint64_t *cmdptr)
+{
+    uint32_t devid = its_cmd_get_deviceid(cmdptr);
+    uint32_t eventid = its_cmd_get_id(cmdptr);
+    struct vits_itte *itte;
+    struct vcpu *vcpu;
+    int ret = -1;
+    uint32_t vlpi;
+
+    spin_lock(&its->its_lock);
+
+    itte = get_devid_evid(its, devid, eventid);
+    if ( !itte )
+        goto out_unlock;
+
+    /* Validate the collection mapping instead of indexing d->vcpu[] raw. */
+    vcpu = get_vcpu_from_collection(its, itte->collection);
+    vlpi = itte->vlpi;
+
+    if ( vcpu )
+        ret = 0;
+
+    put_devid_evid(its, itte);
+
+out_unlock:
+    spin_unlock(&its->its_lock);
+
+    if ( !ret )
+        vgic_vcpu_inject_irq(vcpu, vlpi);
+
+    return ret;
+}
+
 #define ITS_CMD_BUFFER_SIZE(baser)      ((((baser) & 0xff) + 1) << 12)
 
 static int vgic_its_handle_cmds(struct domain *d, struct virt_its *its,
@@ -215,6 +246,9 @@ static int vgic_its_handle_cmds(struct domain *d, struct virt_its *its,
         case GITS_CMD_CLEAR:
             its_handle_clear(its, cmdptr);
             break;
+        case GITS_CMD_INT:
+            its_handle_int(its, cmdptr);
+            break;
         case GITS_CMD_SYNC:
             /* We handle ITS commands synchronously, so we ignore SYNC. */
 	    break;
-- 
2.9.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 144+ messages in thread

* [RFC PATCH 15/24] ARM: vITS: handle MAPC command
  2016-09-28 18:24 [RFC PATCH 00/24] [FOR 4.9] arm64: Dom0 ITS emulation Andre Przywara
                   ` (13 preceding siblings ...)
  2016-09-28 18:24 ` [RFC PATCH 14/24] ARM: vITS: handle INT command Andre Przywara
@ 2016-09-28 18:24 ` Andre Przywara
  2016-11-09  0:48   ` Stefano Stabellini
  2016-09-28 18:24 ` [RFC PATCH 16/24] ARM: vITS: handle MAPD command Andre Przywara
                   ` (9 subsequent siblings)
  24 siblings, 1 reply; 144+ messages in thread
From: Andre Przywara @ 2016-09-28 18:24 UTC (permalink / raw)
  To: Julien Grall, Stefano Stabellini; +Cc: xen-devel

The MAPC command associates a given collection ID with a given
redistributor, thus mapping collections to VCPUs.
We just store the vcpu_id in the collection table for that.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
 xen/arch/arm/vgic-its.c | 30 ++++++++++++++++++++++++++++++
 1 file changed, 30 insertions(+)

diff --git a/xen/arch/arm/vgic-its.c b/xen/arch/arm/vgic-its.c
index 7072753..caad320 100644
--- a/xen/arch/arm/vgic-its.c
+++ b/xen/arch/arm/vgic-its.c
@@ -223,6 +223,33 @@ out_unlock:
     return ret;
 }
 
+static int its_handle_mapc(struct virt_its *its, uint64_t *cmdptr)
+{
+    uint32_t collid = its_cmd_get_collection(cmdptr);
+    uint64_t rdbase = its_cmd_mask_field(cmdptr, 2, 16, 44);
+    int ret = -1;
+
+    if ( collid >= its->max_collections )
+        return ret;
+
+    if ( rdbase >= its->d->max_vcpus )
+        return ret;
+
+    spin_lock(&its->its_lock);
+    if ( its->coll_table )
+    {
+        if ( its_cmd_get_validbit(cmdptr) )
+            its->coll_table[collid] = rdbase;
+        else
+            its->coll_table[collid] = UNMAPPED_COLLECTION;
+
+        ret = 0;
+    }
+    spin_unlock(&its->its_lock);
+
+    return ret;
+}
+
 #define ITS_CMD_BUFFER_SIZE(baser)      ((((baser) & 0xff) + 1) << 12)
 
 static int vgic_its_handle_cmds(struct domain *d, struct virt_its *its,
@@ -249,6 +276,9 @@ static int vgic_its_handle_cmds(struct domain *d, struct virt_its *its,
         case GITS_CMD_INT:
             its_handle_int(its, cmdptr);
             break;
+        case GITS_CMD_MAPC:
+            its_handle_mapc(its, cmdptr);
+            break;
         case GITS_CMD_SYNC:
             /* We handle ITS commands synchronously, so we ignore SYNC. */
 	    break;
-- 
2.9.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 144+ messages in thread

* [RFC PATCH 16/24] ARM: vITS: handle MAPD command
  2016-09-28 18:24 [RFC PATCH 00/24] [FOR 4.9] arm64: Dom0 ITS emulation Andre Przywara
                   ` (14 preceding siblings ...)
  2016-09-28 18:24 ` [RFC PATCH 15/24] ARM: vITS: handle MAPC command Andre Przywara
@ 2016-09-28 18:24 ` Andre Przywara
  2016-11-09  0:54   ` Stefano Stabellini
  2016-09-28 18:24 ` [RFC PATCH 17/24] ARM: vITS: handle MAPTI command Andre Przywara
                   ` (8 subsequent siblings)
  24 siblings, 1 reply; 144+ messages in thread
From: Andre Przywara @ 2016-09-28 18:24 UTC (permalink / raw)
  To: Julien Grall, Stefano Stabellini; +Cc: xen-devel

The MAPD command maps a device by associating a memory region for
storing ITTEs with a certain device ID.
We just store the given guest physical address in the device table.
We don't map the ITTs permanently, as their alignment requirement is
only 256 Bytes, which makes mapping several tables at once complicated.
Instead we map each ITT on demand when we need it later.

Also we propagate the MAPD request to the hardware ITS, as the device ID
is only meaningful there.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
 xen/arch/arm/vgic-its.c | 31 +++++++++++++++++++++++++++++++
 1 file changed, 31 insertions(+)

diff --git a/xen/arch/arm/vgic-its.c b/xen/arch/arm/vgic-its.c
index caad320..83d47e1 100644
--- a/xen/arch/arm/vgic-its.c
+++ b/xen/arch/arm/vgic-its.c
@@ -250,6 +250,34 @@ static int its_handle_mapc(struct virt_its *its, uint64_t *cmdptr)
     return ret;
 }
 
+static int its_handle_mapd(struct virt_its *its, uint64_t *cmdptr)
+{
+    uint32_t devid = its_cmd_get_deviceid(cmdptr);
+    int size = its_cmd_get_size(cmdptr);
+    bool valid = its_cmd_get_validbit(cmdptr);
+    paddr_t itt_addr = its_cmd_mask_field(cmdptr, 2, 0, 52) & GENMASK(51, 8);
+
+    if ( !its->dev_table )
+        return -1;
+
+    spin_lock(&its->its_lock);
+    if ( valid )
+        its->dev_table[devid] = DEV_TABLE_ENTRY(itt_addr, size + 1);
+    else
+        its->dev_table[devid] = 0;
+
+    spin_unlock(&its->its_lock);
+
+    /*
+     * DomUs will (later) have their ITTs allocated at domain creation time,
+     * when Dom0 configures the passthrough.
+     */
+    if ( its->hw_its )
+        return gicv3_its_map_device(its->hw_its,
+                                    its->d, devid, size + 1, valid);
+
+    return 0;
+}
+
 #define ITS_CMD_BUFFER_SIZE(baser)      ((((baser) & 0xff) + 1) << 12)
 
 static int vgic_its_handle_cmds(struct domain *d, struct virt_its *its,
@@ -279,6 +307,9 @@ static int vgic_its_handle_cmds(struct domain *d, struct virt_its *its,
         case GITS_CMD_MAPC:
             its_handle_mapc(its, cmdptr);
             break;
+        case GITS_CMD_MAPD:
+            its_handle_mapd(its, cmdptr);
+            break;
         case GITS_CMD_SYNC:
             /* We handle ITS commands synchronously, so we ignore SYNC. */
 	    break;
-- 
2.9.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 144+ messages in thread

* [RFC PATCH 17/24] ARM: vITS: handle MAPTI command
  2016-09-28 18:24 [RFC PATCH 00/24] [FOR 4.9] arm64: Dom0 ITS emulation Andre Przywara
                   ` (15 preceding siblings ...)
  2016-09-28 18:24 ` [RFC PATCH 16/24] ARM: vITS: handle MAPD command Andre Przywara
@ 2016-09-28 18:24 ` Andre Przywara
  2016-11-09  1:07   ` Stefano Stabellini
  2016-09-28 18:24 ` [RFC PATCH 18/24] ARM: vITS: handle MOVI command Andre Przywara
                   ` (7 subsequent siblings)
  24 siblings, 1 reply; 144+ messages in thread
From: Andre Przywara @ 2016-09-28 18:24 UTC (permalink / raw)
  To: Julien Grall, Stefano Stabellini; +Cc: xen-devel

The MAPTI command associates a DeviceID/EventID pair with an LPI/CPU
pair and actually instantiates LPI interrupts.
We allocate a new host LPI and connect that one to this virtual LPI,
so that any triggering IRQ on the host can be quickly forwarded to
a guest.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
 xen/arch/arm/vgic-its.c | 53 +++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 53 insertions(+)

diff --git a/xen/arch/arm/vgic-its.c b/xen/arch/arm/vgic-its.c
index 83d47e1..70897dd 100644
--- a/xen/arch/arm/vgic-its.c
+++ b/xen/arch/arm/vgic-its.c
@@ -278,6 +278,55 @@ static int its_handle_mapd(struct virt_its *its, uint64_t *cmdptr)
     return 0;
 }
 
+static int its_handle_mapti(struct virt_its *its, uint64_t *cmdptr)
+{
+    uint32_t devid = its_cmd_get_deviceid(cmdptr);
+    uint32_t eventid = its_cmd_get_id(cmdptr);
+    uint32_t intid = its_cmd_get_physical_id(cmdptr);
+    int collid = its_cmd_get_collection(cmdptr);
+    struct vits_itte *itte;
+    int host_lpi;
+    struct vcpu *vcpu;
+    int ret = -1;
+
+    if ( its_cmd_get_command(cmdptr) == GITS_CMD_MAPI )
+        intid = eventid;
+
+    if ( collid >= its->max_collections )
+        return -1;
+
+    spin_lock(&its->its_lock);
+    vcpu = get_vcpu_from_collection(its, collid);
+    if ( !vcpu )
+        goto out_unlock;
+
+    itte = get_devid_evid(its, devid, eventid);
+    if ( !itte )
+        goto out_unlock;
+
+    if ( itte->hlpi )
+        goto out_unmap;
+
+    host_lpi = gicv3_lpi_allocate_host_lpi(its->hw_its,
+                                           devid, eventid,
+                                           vcpu, intid);
+    if ( host_lpi >= 0 )
+        itte->hlpi = host_lpi;
+
+    itte->vlpi = intid;
+    itte->collection = collid;
+
+    ret = 0;
+
+out_unmap:
+    put_devid_evid(its, itte);
+
+out_unlock:
+    spin_unlock(&its->its_lock);
+
+    return ret;
+}
+
 #define ITS_CMD_BUFFER_SIZE(baser)      ((((baser) & 0xff) + 1) << 12)
 
 static int vgic_its_handle_cmds(struct domain *d, struct virt_its *its,
@@ -310,6 +359,10 @@ static int vgic_its_handle_cmds(struct domain *d, struct virt_its *its,
         case GITS_CMD_MAPD:
             its_handle_mapd(its, cmdptr);
 	    break;
+        case GITS_CMD_MAPI:
+        case GITS_CMD_MAPTI:
+            its_handle_mapti(its, cmdptr);
+            break;
         case GITS_CMD_SYNC:
             /* We handle ITS commands synchronously, so we ignore SYNC. */
 	    break;
-- 
2.9.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 144+ messages in thread

* [RFC PATCH 18/24] ARM: vITS: handle MOVI command
  2016-09-28 18:24 [RFC PATCH 00/24] [FOR 4.9] arm64: Dom0 ITS emulation Andre Przywara
                   ` (16 preceding siblings ...)
  2016-09-28 18:24 ` [RFC PATCH 17/24] ARM: vITS: handle MAPTI command Andre Przywara
@ 2016-09-28 18:24 ` Andre Przywara
  2016-11-09  1:13   ` Stefano Stabellini
  2016-09-28 18:24 ` [RFC PATCH 19/24] ARM: vITS: handle DISCARD command Andre Przywara
                   ` (6 subsequent siblings)
  24 siblings, 1 reply; 144+ messages in thread
From: Andre Przywara @ 2016-09-28 18:24 UTC (permalink / raw)
  To: Julien Grall, Stefano Stabellini; +Cc: xen-devel

The MOVI command moves the interrupt affinity from one redistributor
(read: VCPU) to another.
For now migration of "live" LPIs is not yet implemented, but we store
the changed affinity in the host LPI structure and in our virtual ITTE.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
 xen/arch/arm/gic-its.c        | 16 +++++++++++++++
 xen/arch/arm/vgic-its.c       | 46 +++++++++++++++++++++++++++++++++++++++++++
 xen/include/asm-arm/gic-its.h |  1 +
 3 files changed, 63 insertions(+)

diff --git a/xen/arch/arm/gic-its.c b/xen/arch/arm/gic-its.c
index 6bac422..d1b1cbb 100644
--- a/xen/arch/arm/gic-its.c
+++ b/xen/arch/arm/gic-its.c
@@ -618,6 +618,22 @@ int gicv3_lpi_drop_host_lpi(struct host_its *its,
     return 0;
 }
 
+/* Changes the target VCPU for a given host LPI assigned to a domain. */
+int gicv3_lpi_change_vcpu(struct domain *d, uint32_t host_lpi, int new_vcpu_id)
+{
+    union host_lpi *hlpip, hlpi;
+
+    hlpip = gic_find_host_lpi(host_lpi, d);
+    if ( !hlpip )
+        return -1;
+
+    hlpi.data = hlpip->data;
+    hlpi.vcpu_id = new_vcpu_id;
+    hlpip->data = hlpi.data;
+
+    return 0;
+}
+
 void gicv3_its_dt_init(const struct dt_device_node *node)
 {
     const struct dt_device_node *its = NULL;
diff --git a/xen/arch/arm/vgic-its.c b/xen/arch/arm/vgic-its.c
index 70897dd..c0a60ad 100644
--- a/xen/arch/arm/vgic-its.c
+++ b/xen/arch/arm/vgic-its.c
@@ -327,6 +327,46 @@ out_unlock:
     return ret;
 }
 
+static int its_handle_movi(struct virt_its *its, uint64_t *cmdptr)
+{
+    uint32_t devid = its_cmd_get_deviceid(cmdptr);
+    uint32_t eventid = its_cmd_get_id(cmdptr);
+    int collid = its_cmd_get_collection(cmdptr);
+    struct vits_itte *itte;
+    struct vcpu *vcpu;
+    uint32_t host_lpi = 0;
+
+    if ( collid >= its->max_collections )
+        return -1;
+
+    spin_lock(&its->its_lock);
+
+    vcpu = get_vcpu_from_collection(its, collid);
+    if ( !vcpu )
+        goto out_unlock;
+
+    itte = get_devid_evid(its, devid, eventid);
+    if ( !itte )
+        goto out_unlock;
+
+    itte->collection = collid;
+    host_lpi = itte->hlpi;
+
+    /* TODO: lookup currently-in-guest virtual IRQs and migrate them */
+
+    put_devid_evid(its, itte);
+
+out_unlock:
+    spin_unlock(&its->its_lock);
+
+    if ( !host_lpi )
+        return -1;
+
+    gicv3_lpi_change_vcpu(its->d, host_lpi, vcpu->vcpu_id);
+
+    return 0;
+}
+
 #define ITS_CMD_BUFFER_SIZE(baser)      ((((baser) & 0xff) + 1) << 12)
 
 static int vgic_its_handle_cmds(struct domain *d, struct virt_its *its,
@@ -363,6 +403,12 @@ static int vgic_its_handle_cmds(struct domain *d, struct virt_its *its,
         case GITS_CMD_MAPTI:
             its_handle_mapti(its, cmdptr);
             break;
+        case GITS_CMD_MOVALL:
+            gdprintk(XENLOG_G_INFO, "ITS: ignoring MOVALL command\n");
+            break;
+        case GITS_CMD_MOVI:
+            its_handle_movi(its, cmdptr);
+            break;
         case GITS_CMD_SYNC:
             /* We handle ITS commands synchronously, so we ignore SYNC. */
 	    break;
diff --git a/xen/include/asm-arm/gic-its.h b/xen/include/asm-arm/gic-its.h
index 3b2e5c0..7e1142f 100644
--- a/xen/include/asm-arm/gic-its.h
+++ b/xen/include/asm-arm/gic-its.h
@@ -133,6 +133,7 @@ int gicv3_its_map_device(struct host_its *hw_its, struct domain *d,
 int gicv3_lpi_allocate_host_lpi(struct host_its *its,
                                 uint32_t devid, uint32_t eventid,
                                 struct vcpu *v, int virt_lpi);
+int gicv3_lpi_change_vcpu(struct domain *d, uint32_t host_lpi, int new_vcpu_id);
 int gicv3_lpi_drop_host_lpi(struct host_its *its,
                             uint32_t devid, uint32_t eventid,
                             uint32_t host_lpi);
-- 
2.9.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 144+ messages in thread

* [RFC PATCH 19/24] ARM: vITS: handle DISCARD command
  2016-09-28 18:24 [RFC PATCH 00/24] [FOR 4.9] arm64: Dom0 ITS emulation Andre Przywara
                   ` (17 preceding siblings ...)
  2016-09-28 18:24 ` [RFC PATCH 18/24] ARM: vITS: handle MOVI command Andre Przywara
@ 2016-09-28 18:24 ` Andre Przywara
  2016-11-09  1:28   ` Stefano Stabellini
  2016-09-28 18:24 ` [RFC PATCH 20/24] ARM: vITS: handle INV command Andre Przywara
                   ` (5 subsequent siblings)
  24 siblings, 1 reply; 144+ messages in thread
From: Andre Przywara @ 2016-09-28 18:24 UTC (permalink / raw)
  To: Julien Grall, Stefano Stabellini; +Cc: xen-devel

The DISCARD command drops the connection between a DeviceID/EventID
and an LPI/collection pair.
We mark the respective structure entries as not allocated and make
sure that any queued IRQs are removed.
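The `its_cmd_get_*()` accessors used by these handlers are not shown in this
hunk; a minimal sketch of how such helpers presumably decode the 32-byte
command (four 64-bit doublewords, with the command type in DW0[7:0], the
DeviceID in DW0[63:32], the EventID in DW1[31:0] and the collection ID in
DW2[15:0], per the GICv3 architecture spec) could look like this. The names
are illustrative, not necessarily the patch's exact implementation:

```c
#include <assert.h>
#include <stdint.h>

/* ITS commands are four 64-bit doublewords (32 bytes). */
static inline uint8_t cmd_get_type(const uint64_t *cmd)
{
    return cmd[0] & 0xff;                /* DW0[7:0]: command number */
}

static inline uint32_t cmd_get_deviceid(const uint64_t *cmd)
{
    return cmd[0] >> 32;                 /* DW0[63:32]: DeviceID */
}

static inline uint32_t cmd_get_eventid(const uint64_t *cmd)
{
    return cmd[1] & 0xffffffffULL;       /* DW1[31:0]: EventID */
}

static inline uint16_t cmd_get_collection(const uint64_t *cmd)
{
    return cmd[2] & 0xffff;              /* DW2[15:0]: collection ID */
}
```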

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
 xen/arch/arm/gic-its.c        | 21 +++++++++++++++++++
 xen/arch/arm/vgic-its.c       | 48 +++++++++++++++++++++++++++++++++++++++++++
 xen/include/asm-arm/gic-its.h |  5 +++++
 3 files changed, 74 insertions(+)

diff --git a/xen/arch/arm/gic-its.c b/xen/arch/arm/gic-its.c
index d1b1cbb..766a7cb 100644
--- a/xen/arch/arm/gic-its.c
+++ b/xen/arch/arm/gic-its.c
@@ -634,6 +634,27 @@ int gicv3_lpi_change_vcpu(struct domain *d, uint32_t host_lpi, int new_vcpu_id)
     return 0;
 }
 
+/* Looks up a given host LPI assigned to that domain and returns the
+ * connected virtual LPI number. Also stores the target vcpu ID in
+ * the passed vcpu_id pointer.
+ * Returns 0 if no host LPI could be found for that domain, or the
+ * virtual LPI number (>= 8192) if the lookup succeeded.
+ */
+uint32_t gicv3_lpi_lookup_lpi(struct domain *d, uint32_t host_lpi, int *vcpu_id)
+{
+    union host_lpi *hlpip, hlpi;
+
+    hlpip = gic_find_host_lpi(host_lpi, d);
+    if ( !hlpip )
+        return 0;
+
+    hlpi.data = hlpip->data;
+    if ( vcpu_id )
+        *vcpu_id = hlpi.vcpu_id;
+
+    return hlpi.virt_lpi;
+}
+
 void gicv3_its_dt_init(const struct dt_device_node *node)
 {
     const struct dt_device_node *its = NULL;
diff --git a/xen/arch/arm/vgic-its.c b/xen/arch/arm/vgic-its.c
index c0a60ad..028d234 100644
--- a/xen/arch/arm/vgic-its.c
+++ b/xen/arch/arm/vgic-its.c
@@ -367,6 +367,51 @@ out_unlock:
     return 0;
 }
 
+static int its_handle_discard(struct virt_its *its, uint64_t *cmdptr)
+{
+    uint32_t devid = its_cmd_get_deviceid(cmdptr);
+    uint32_t eventid = its_cmd_get_id(cmdptr);
+    struct pending_irq *pirq;
+    struct vits_itte *itte;
+    struct vcpu *vcpu;
+    uint32_t vlpi;
+    int ret = -1, vcpu_id;
+
+    spin_lock(&its->its_lock);
+    itte = get_devid_evid(its, devid, eventid);
+    if ( !itte )
+        goto out_unlock;
+
+    vlpi = gicv3_lpi_lookup_lpi(its->d, itte->hlpi, &vcpu_id);
+    if ( !vlpi )
+        goto out_unlock;
+
+    vcpu = its->d->vcpu[vcpu_id];
+
+    pirq = lpi_to_pending(vcpu, vlpi, false);
+    if ( pirq )
+    {
+        clear_bit(GIC_IRQ_GUEST_QUEUED, &pirq->status);
+        gic_remove_from_queues(vcpu, vlpi);
+
+        /* Mark this pending IRQ struct as available again. */
+        if ( !test_bit(GIC_IRQ_GUEST_VISIBLE, &pirq->status) )
+            pirq->irq = 0;
+    }
+
+    gicv3_lpi_drop_host_lpi(its->hw_its, devid, eventid, itte->hlpi);
+
+    itte->hlpi = 0;             /* Mark this ITTE as unused. */
+    ret = 0;
+
+    put_devid_evid(its, itte);
+
+out_unlock:
+    spin_unlock(&its->its_lock);
+
+    return ret;
+}
+
 #define ITS_CMD_BUFFER_SIZE(baser)      ((((baser) & 0xff) + 1) << 12)
 
 static int vgic_its_handle_cmds(struct domain *d, struct virt_its *its,
@@ -390,6 +435,9 @@ static int vgic_its_handle_cmds(struct domain *d, struct virt_its *its,
         case GITS_CMD_CLEAR:
             its_handle_clear(its, cmdptr);
             break;
+        case GITS_CMD_DISCARD:
+            its_handle_discard(its, cmdptr);
+            break;
         case GITS_CMD_INT:
             its_handle_int(its, cmdptr);
             break;
diff --git a/xen/include/asm-arm/gic-its.h b/xen/include/asm-arm/gic-its.h
index 7e1142f..3f5698d 100644
--- a/xen/include/asm-arm/gic-its.h
+++ b/xen/include/asm-arm/gic-its.h
@@ -133,6 +133,11 @@ int gicv3_its_map_device(struct host_its *hw_its, struct domain *d,
 int gicv3_lpi_allocate_host_lpi(struct host_its *its,
                                 uint32_t devid, uint32_t eventid,
                                 struct vcpu *v, int virt_lpi);
+/* Given a physical LPI, looks up and returns the associated virtual LPI
+ * and the target VCPU in the given domain.
+ */
+uint32_t gicv3_lpi_lookup_lpi(struct domain *d, uint32_t host_lpi,
+                              int *vcpu_id);
 int gicv3_lpi_change_vcpu(struct domain *d, uint32_t host_lpi, int new_vcpu_id);
 int gicv3_lpi_drop_host_lpi(struct host_its *its,
                             uint32_t devid, uint32_t eventid,
-- 
2.9.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 144+ messages in thread

* [RFC PATCH 20/24] ARM: vITS: handle INV command
  2016-09-28 18:24 [RFC PATCH 00/24] [FOR 4.9] arm64: Dom0 ITS emulation Andre Przywara
                   ` (18 preceding siblings ...)
  2016-09-28 18:24 ` [RFC PATCH 19/24] ARM: vITS: handle DISCARD command Andre Przywara
@ 2016-09-28 18:24 ` Andre Przywara
  2016-11-09  1:49   ` Stefano Stabellini
  2016-09-28 18:24 ` [RFC PATCH 21/24] ARM: vITS: handle INVALL command Andre Przywara
                   ` (4 subsequent siblings)
  24 siblings, 1 reply; 144+ messages in thread
From: Andre Przywara @ 2016-09-28 18:24 UTC (permalink / raw)
  To: Julien Grall, Stefano Stabellini; +Cc: xen-devel

The INV command instructs the ITS to update the configuration data for
a given LPI by re-reading its entry from the property table.
The priority value needs no special handling, but enabling or disabling
an LPI does: we remove virtual LPIs from or push them to their VCPUs,
and propagate the enable bit to the hardware LPI.
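The per-LPI configuration byte that INV re-reads from the property table has
a fixed layout in the GICv3 spec: bit 0 is the enable bit and bits [7:2]
hold the priority, which matches the `LPI_PROP_ENABLED` and `0xfc` masks
used throughout the series. A small sketch (illustrative helper names):

```c
#include <assert.h>
#include <stdint.h>

#define LPI_PROP_ENABLED    (1U << 0)   /* bit 0: LPI enabled */
#define LPI_PROP_PRIO_MASK  0xfcU       /* bits [7:2]: priority */

static inline int lpi_is_enabled(uint8_t prop)
{
    return prop & LPI_PROP_ENABLED;
}

static inline uint8_t lpi_priority(uint8_t prop)
{
    /* The two low bits are not part of the priority. */
    return prop & LPI_PROP_PRIO_MASK;
}
```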

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
 xen/arch/arm/gic-its.c        | 35 ++++++++++++++++++++
 xen/arch/arm/vgic-its.c       | 74 +++++++++++++++++++++++++++++++++++++++++++
 xen/include/asm-arm/gic-its.h |  3 ++
 3 files changed, 112 insertions(+)

diff --git a/xen/arch/arm/gic-its.c b/xen/arch/arm/gic-its.c
index 766a7cb..6f4329f 100644
--- a/xen/arch/arm/gic-its.c
+++ b/xen/arch/arm/gic-its.c
@@ -215,6 +215,19 @@ static int its_send_cmd_mapd(struct host_its *its, uint32_t deviceid,
     return its_send_command(its, cmd);
 }
 
+static int its_send_cmd_inv(struct host_its *its,
+                            uint32_t deviceid, uint32_t eventid)
+{
+    uint64_t cmd[4];
+
+    cmd[0] = GITS_CMD_INV | ((uint64_t)deviceid << 32);
+    cmd[1] = eventid;
+    cmd[2] = 0x00;
+    cmd[3] = 0x00;
+
+    return its_send_command(its, cmd);
+}
+
 int gicv3_its_map_device(struct host_its *hw_its, struct domain *d,
                          int devid, int bits, bool valid)
 {
@@ -655,6 +668,28 @@ uint32_t gicv3_lpi_lookup_lpi(struct domain *d, uint32_t host_lpi, int *vcpu_id)
     return hlpi.virt_lpi;
 }
 
+void gicv3_lpi_set_enable(struct host_its *its,
+                          uint32_t deviceid, uint32_t eventid,
+                          uint32_t host_lpi, bool enabled)
+{
+    host_lpi -= 8192;
+
+    if ( host_lpi >= MAX_HOST_LPIS )
+        return;
+
+    if ( !its )
+        return;
+
+    if ( enabled )
+        lpi_data.lpi_property[host_lpi] |= LPI_PROP_ENABLED;
+    else
+        lpi_data.lpi_property[host_lpi] &= ~LPI_PROP_ENABLED;
+
+    __flush_dcache_area(&lpi_data.lpi_property[host_lpi], 1);
+    its_send_cmd_inv(its, deviceid, eventid);
+    its_send_cmd_sync(its, 0);
+}
+
 void gicv3_its_dt_init(const struct dt_device_node *node)
 {
     const struct dt_device_node *its = NULL;
diff --git a/xen/arch/arm/vgic-its.c b/xen/arch/arm/vgic-its.c
index 028d234..74da8fc 100644
--- a/xen/arch/arm/vgic-its.c
+++ b/xen/arch/arm/vgic-its.c
@@ -223,6 +223,77 @@ out_unlock:
     return ret;
 }
 
+/* For a given virtual LPI read the enabled bit from the virtual property
+ * table and update the virtual IRQ's state.
+ * This enables or disables the associated hardware LPI, also takes care
+ * of removing or pushing of virtual LPIs to their VCPUs.
+ */
+static void update_lpi_enabled_status(struct virt_its *its,
+                                      struct vcpu *vcpu, uint32_t vlpi,
+                                      uint32_t deviceid, uint32_t eventid,
+                                      uint32_t hlpi)
+{
+    struct pending_irq *pirq = lpi_to_pending(vcpu, vlpi, false);
+    uint8_t property = its->d->arch.vgic.proptable[vlpi - 8192];
+
+    if ( property & LPI_PROP_ENABLED )
+    {
+        if ( pirq )
+        {
+            unsigned long flags;
+
+            set_bit(GIC_IRQ_GUEST_ENABLED, &pirq->status);
+            spin_lock_irqsave(&vcpu->arch.vgic.lock, flags);
+            if ( !list_empty(&pirq->inflight) &&
+                 !test_bit(GIC_IRQ_GUEST_VISIBLE, &pirq->status) )
+                gic_raise_guest_irq(vcpu, vlpi, property & 0xfc);
+            spin_unlock_irqrestore(&vcpu->arch.vgic.lock, flags);
+        }
+        gicv3_lpi_set_enable(its->hw_its, deviceid, eventid, hlpi, true);
+    }
+    else
+    {
+        if ( pirq )
+        {
+            clear_bit(GIC_IRQ_GUEST_ENABLED, &pirq->status);
+            gic_remove_from_queues(vcpu, vlpi);
+        }
+        gicv3_lpi_set_enable(its->hw_its, deviceid, eventid, hlpi, false);
+    }
+}
+
+static int its_handle_inv(struct virt_its *its, uint64_t *cmdptr)
+{
+    uint32_t devid = its_cmd_get_deviceid(cmdptr);
+    uint32_t eventid = its_cmd_get_id(cmdptr);
+    struct vits_itte *itte;
+    struct vcpu *vcpu;
+    uint32_t hlpi, vlpi;
+    int ret = -1;
+
+    spin_lock(&its->its_lock);
+
+    itte = get_devid_evid(its, devid, eventid);
+    if ( !itte )
+        goto out_unlock;
+
+    vcpu = its->d->vcpu[itte->collection];
+    vlpi = itte->vlpi;
+    hlpi = itte->hlpi;
+
+    ret = 0;
+
+    put_devid_evid(its, itte);
+
+out_unlock:
+    spin_unlock(&its->its_lock);
+
+    if ( !ret )
+        update_lpi_enabled_status(its, vcpu, vlpi, devid, eventid, hlpi);
+
+    return ret;
+}
+
 static int its_handle_mapc(struct virt_its *its, uint64_t *cmdptr)
 {
     uint32_t collid = its_cmd_get_collection(cmdptr);
@@ -441,6 +512,9 @@ static int vgic_its_handle_cmds(struct domain *d, struct virt_its *its,
         case GITS_CMD_INT:
             its_handle_int(its, cmdptr);
             break;
+        case GITS_CMD_INV:
+            its_handle_inv(its, cmdptr);
+            break;
         case GITS_CMD_MAPC:
             its_handle_mapc(its, cmdptr);
             break;
diff --git a/xen/include/asm-arm/gic-its.h b/xen/include/asm-arm/gic-its.h
index 3f5698d..2cdb3e1 100644
--- a/xen/include/asm-arm/gic-its.h
+++ b/xen/include/asm-arm/gic-its.h
@@ -139,6 +139,9 @@ int gicv3_lpi_allocate_host_lpi(struct host_its *its,
 uint32_t gicv3_lpi_lookup_lpi(struct domain *d, uint32_t host_lpi,
                               int *vcpu_id);
 int gicv3_lpi_change_vcpu(struct domain *d, uint32_t host_lpi, int new_vcpu_id);
+void gicv3_lpi_set_enable(struct host_its *its,
+                          uint32_t deviceid, uint32_t eventid,
+                          uint32_t host_lpi, bool enabled);
 int gicv3_lpi_drop_host_lpi(struct host_its *its,
                             uint32_t devid, uint32_t eventid,
                             uint32_t host_lpi);
-- 
2.9.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 144+ messages in thread

* [RFC PATCH 21/24] ARM: vITS: handle INVALL command
  2016-09-28 18:24 [RFC PATCH 00/24] [FOR 4.9] arm64: Dom0 ITS emulation Andre Przywara
                   ` (19 preceding siblings ...)
  2016-09-28 18:24 ` [RFC PATCH 20/24] ARM: vITS: handle INV command Andre Przywara
@ 2016-09-28 18:24 ` Andre Przywara
  2016-10-24 15:32   ` Vijay Kilari
  2016-09-28 18:24 ` [RFC PATCH 22/24] ARM: vITS: create and initialize virtual ITSes for Dom0 Andre Przywara
                   ` (3 subsequent siblings)
  24 siblings, 1 reply; 144+ messages in thread
From: Andre Przywara @ 2016-09-28 18:24 UTC (permalink / raw)
  To: Julien Grall, Stefano Stabellini; +Cc: xen-devel

The INVALL command instructs an ITS to invalidate the configuration
data for all LPIs associated with a given redistributor (read: VCPU).
To avoid iterating (and mapping!) all guest tables, we instead go through
the host LPI table to find any LPIs targeting this VCPU. We then update
the configuration bits for the connected virtual LPIs.
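The cover letter describes stashing the per-host-LPI data into one
uint64_t; a hedged sketch of such a `union host_lpi` follows. The exact
field widths are an assumption, but they match the `.data`, `.virt_lpi`,
`.dom_id` and `.vcpu_id` accesses in the patch, and reading `.data` gives
an atomic 64-bit snapshot of the whole entry:

```c
#include <assert.h>
#include <stdint.h>

/* One entry of the two-level host LPI table (field widths assumed). */
union host_lpi {
    uint64_t data;              /* whole-entry snapshot */
    struct {
        uint32_t virt_lpi;      /* connected virtual LPI (>= 8192) */
        uint16_t dom_id;        /* owning domain */
        uint16_t vcpu_id;       /* target VCPU within that domain */
    };
};
```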

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
 xen/arch/arm/gic-its.c        | 58 +++++++++++++++++++++++++++++++++++++++++++
 xen/arch/arm/vgic-its.c       | 30 ++++++++++++++++++++++
 xen/include/asm-arm/gic-its.h |  2 ++
 3 files changed, 90 insertions(+)

diff --git a/xen/arch/arm/gic-its.c b/xen/arch/arm/gic-its.c
index 6f4329f..5129d6e 100644
--- a/xen/arch/arm/gic-its.c
+++ b/xen/arch/arm/gic-its.c
@@ -228,6 +228,18 @@ static int its_send_cmd_inv(struct host_its *its,
     return its_send_command(its, cmd);
 }
 
+static int its_send_cmd_invall(struct host_its *its, int cpu)
+{
+    uint64_t cmd[4];
+
+    cmd[0] = GITS_CMD_INVALL;
+    cmd[1] = 0x00;
+    cmd[2] = cpu & GENMASK(15, 0);
+    cmd[3] = 0x00;
+
+    return its_send_command(its, cmd);
+}
+
 int gicv3_its_map_device(struct host_its *hw_its, struct domain *d,
                          int devid, int bits, bool valid)
 {
@@ -668,6 +680,52 @@ uint32_t gicv3_lpi_lookup_lpi(struct domain *d, uint32_t host_lpi, int *vcpu_id)
     return hlpi.virt_lpi;
 }
 
+/* Iterate over all host LPIs and update the "enabled" state for a given
+ * guest redistributor (VCPU), using the respective state in the provided
+ * proptable. This proptable is indexed by the stored virtual LPI number.
+ * This is to implement a guest INVALL command.
+ */
+void gicv3_lpi_update_configurations(struct vcpu *v, uint8_t *proptable)
+{
+    int chunk, i;
+    struct host_its *its;
+
+    for ( chunk = 0; chunk < MAX_HOST_LPIS / HOST_LPIS_PER_PAGE; chunk++ )
+    {
+        if ( !lpi_data.host_lpis[chunk] )
+            continue;
+
+        for ( i = 0; i < HOST_LPIS_PER_PAGE; i++ )
+        {
+            union host_lpi *hlpip = &lpi_data.host_lpis[chunk][i], hlpi;
+            uint32_t hlpi_nr;
+
+            hlpi.data = hlpip->data;
+            if ( !hlpi.virt_lpi )
+                continue;
+
+            if ( hlpi.dom_id != v->domain->domain_id )
+                continue;
+
+            if ( hlpi.vcpu_id != v->vcpu_id )
+                continue;
+
+            hlpi_nr = chunk * HOST_LPIS_PER_PAGE + i;
+
+            if ( proptable[hlpi.virt_lpi - 8192] & LPI_PROP_ENABLED )
+                lpi_data.lpi_property[hlpi_nr - 8192] |= LPI_PROP_ENABLED;
+            else
+                lpi_data.lpi_property[hlpi_nr - 8192] &= ~LPI_PROP_ENABLED;
+        }
+    }
+
+    /* Tell all ITSes that they should update the property table for CPU 0,
+     * which is where we map all LPIs to.
+     */
+    list_for_each_entry(its, &host_its_list, entry)
+        its_send_cmd_invall(its, 0);
+}
+
 void gicv3_lpi_set_enable(struct host_its *its,
                           uint32_t deviceid, uint32_t eventid,
                           uint32_t host_lpi, bool enabled)
diff --git a/xen/arch/arm/vgic-its.c b/xen/arch/arm/vgic-its.c
index 74da8fc..1e429b7 100644
--- a/xen/arch/arm/vgic-its.c
+++ b/xen/arch/arm/vgic-its.c
@@ -294,6 +294,33 @@ out_unlock:
     return ret;
 }
 
+/* INVALL updates the per-LPI configuration status for every LPI mapped to
+ * this redistributor. For the guest side we don't need to update anything,
+ * as we always refer to the actual table for the enabled bit and the
+ * priority.
+ * Enabling or disabling a virtual LPI however needs to be propagated to
+ * the respective host LPI. Instead of iterating over all mapped LPIs in our
+ * emulated GIC (which is expensive due to the required on-demand mapping),
+ * we iterate over all mapped _host_ LPIs and filter for those which are
+ * forwarded to this virtual redistributor.
+ */
+static int its_handle_invall(struct virt_its *its, uint64_t *cmdptr)
+{
+    uint32_t collid = its_cmd_get_collection(cmdptr);
+    struct vcpu *vcpu;
+
+    spin_lock(&its->its_lock);
+    vcpu = get_vcpu_from_collection(its, collid);
+    spin_unlock(&its->its_lock);
+
+    if ( !vcpu )
+        return -1;
+
+    gicv3_lpi_update_configurations(vcpu, its->d->arch.vgic.proptable);
+
+    return 0;
+}
+
 static int its_handle_mapc(struct virt_its *its, uint64_t *cmdptr)
 {
     uint32_t collid = its_cmd_get_collection(cmdptr);
@@ -515,6 +542,9 @@ static int vgic_its_handle_cmds(struct domain *d, struct virt_its *its,
         case GITS_CMD_INV:
             its_handle_inv(its, cmdptr);
 	    break;
+        case GITS_CMD_INVALL:
+            its_handle_invall(its, cmdptr);
+            break;
         case GITS_CMD_MAPC:
             its_handle_mapc(its, cmdptr);
             break;
diff --git a/xen/include/asm-arm/gic-its.h b/xen/include/asm-arm/gic-its.h
index 2cdb3e1..ba6b2d5 100644
--- a/xen/include/asm-arm/gic-its.h
+++ b/xen/include/asm-arm/gic-its.h
@@ -146,6 +146,8 @@ int gicv3_lpi_drop_host_lpi(struct host_its *its,
                             uint32_t devid, uint32_t eventid,
                             uint32_t host_lpi);
 
+void gicv3_lpi_update_configurations(struct vcpu *v, uint8_t *proptable);
+
 static inline int gicv3_lpi_get_priority(struct domain *d, uint32_t lpi)
 {
     return d->arch.vgic.proptable[lpi - 8192] & 0xfc;
-- 
2.9.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 144+ messages in thread

* [RFC PATCH 22/24] ARM: vITS: create and initialize virtual ITSes for Dom0
  2016-09-28 18:24 [RFC PATCH 00/24] [FOR 4.9] arm64: Dom0 ITS emulation Andre Przywara
                   ` (20 preceding siblings ...)
  2016-09-28 18:24 ` [RFC PATCH 21/24] ARM: vITS: handle INVALL command Andre Przywara
@ 2016-09-28 18:24 ` Andre Przywara
  2016-11-10  0:38   ` Stefano Stabellini
  2016-09-28 18:24 ` [RFC PATCH 23/24] ARM: vITS: create ITS subnodes for Dom0 DT Andre Przywara
                   ` (2 subsequent siblings)
  24 siblings, 1 reply; 144+ messages in thread
From: Andre Przywara @ 2016-09-28 18:24 UTC (permalink / raw)
  To: Julien Grall, Stefano Stabellini; +Cc: xen-devel

For each hardware ITS create and initialize a virtual ITS for Dom0.
We use the same memory mapped address to keep the doorbell working.
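The patch initializes GITS_BASER0/1 and GITS_CBASER with magic constants;
those could arguably be composed from named fields instead. The field
positions below follow the GICv3 spec (Valid[63], Indirect[62],
InnerCache[61:59], Type[58:56], EntrySize[52:48], Shareability[11:10]);
the helper is an illustrative sketch, not the patch's API:

```c
#include <assert.h>
#include <stdint.h>

#define ITS_BASER_VALID       (1ULL << 63)
#define ITS_BASER_INDIRECT    (1ULL << 62)
#define ITS_BASER_INNER_SHIFT 59
#define ITS_BASER_TYPE_SHIFT  56
#define ITS_BASER_ESZ_SHIFT   48
#define ITS_BASER_SHR_SHIFT   10

/* Compose a (read-only-to-the-guest) BASER skeleton from named fields.
 * "entry_size" is the real size in bytes; the register stores size - 1.
 */
static uint64_t its_baser_compose(int type, int entry_size,
                                  int inner_cache, int shareability)
{
    return ((uint64_t)inner_cache << ITS_BASER_INNER_SHIFT) |
           ((uint64_t)type << ITS_BASER_TYPE_SHIFT) |
           ((uint64_t)(entry_size - 1) << ITS_BASER_ESZ_SHIFT) |
           ((uint64_t)shareability << ITS_BASER_SHR_SHIFT);
}
```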

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
 xen/arch/arm/vgic-its.c       | 22 ++++++++++++++++++++++
 xen/arch/arm/vgic-v3.c        | 12 ++++++++++++
 xen/include/asm-arm/domain.h  |  1 +
 xen/include/asm-arm/gic-its.h | 13 +++++++++++++
 4 files changed, 48 insertions(+)

diff --git a/xen/arch/arm/vgic-its.c b/xen/arch/arm/vgic-its.c
index 1e429b7..5c605b5 100644
--- a/xen/arch/arm/vgic-its.c
+++ b/xen/arch/arm/vgic-its.c
@@ -829,6 +829,28 @@ static const struct mmio_handler_ops vgic_its_mmio_handler = {
     .write = vgic_v3_its_mmio_write,
 };
 
+int vgic_v3_its_init_virtual(struct domain *d, struct host_its *hw_its,
+                             paddr_t guest_addr)
+{
+    struct virt_its *its;
+
+    its = xzalloc(struct virt_its);
+    if ( !its )
+        return -ENOMEM;
+
+    its->d = d;
+    its->hw_its = hw_its;
+    its->baser0 = 0x7917000000000400;
+    its->baser1 = 0x3c01000000000400;
+    its->cbaser = 0x380e000000000400;
+    spin_lock_init(&its->vcmd_lock);
+    spin_lock_init(&its->its_lock);
+
+    register_mmio_handler(d, &vgic_its_mmio_handler, guest_addr, SZ_64K, its);
+
+    return 0;
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c
index aa53a1e..d230a1f 100644
--- a/xen/arch/arm/vgic-v3.c
+++ b/xen/arch/arm/vgic-v3.c
@@ -31,6 +31,7 @@
 #include <asm/current.h>
 #include <asm/mmio.h>
 #include <asm/gic_v3_defs.h>
+#include <asm/gic-its.h>
 #include <asm/vgic.h>
 #include <asm/vgic-emul.h>
 
@@ -1572,6 +1573,7 @@ static int vgic_v3_domain_init(struct domain *d)
      */
     if ( is_hardware_domain(d) )
     {
+        struct host_its *hw_its;
         unsigned int first_cpu = 0;
 
         d->arch.vgic.dbase = vgic_v3_hw.dbase;
@@ -1597,6 +1599,16 @@ static int vgic_v3_domain_init(struct domain *d)
 
             first_cpu += size / d->arch.vgic.rdist_stride;
         }
+        d->arch.vgic.nr_regions = vgic_v3_hw.nr_rdist_regions;
+
+        list_for_each_entry(hw_its, &host_its_list, entry)
+        {
+            /* Emulate the control registers frame (lower 64K). */
+            vgic_v3_its_init_virtual(d, hw_its, hw_its->addr);
+
+            d->arch.vgic.has_its = true;
+        }
+
     }
     else
     {
diff --git a/xen/include/asm-arm/domain.h b/xen/include/asm-arm/domain.h
index 0cd3500..1c2f7c7 100644
--- a/xen/include/asm-arm/domain.h
+++ b/xen/include/asm-arm/domain.h
@@ -111,6 +111,7 @@ struct arch_domain
         uint32_t rdist_stride;              /* Re-Distributor stride */
         uint64_t rdist_propbase;
         uint8_t *proptable;
+        bool has_its;
 #endif
     } vgic;
 
diff --git a/xen/include/asm-arm/gic-its.h b/xen/include/asm-arm/gic-its.h
index ba6b2d5..b58e092 100644
--- a/xen/include/asm-arm/gic-its.h
+++ b/xen/include/asm-arm/gic-its.h
@@ -123,6 +123,13 @@ void gicv3_set_redist_addr(paddr_t address, int redist_id);
 /* Map a collection for this host CPU to each host ITS. */
 void gicv3_its_setup_collection(int cpu);
 
+/* Create and register a virtual ITS at the given guest address.
+ * If a host ITS is specified, a hardware domain can reach out to that host
+ * ITS to deal with devices and LPI mappings and can enable/disable LPIs.
+ */
+int vgic_v3_its_init_virtual(struct domain *d, struct host_its *hw_its,
+                             paddr_t guest_addr);
+
 /* Map a device on the host by allocating an ITT on the host (ITS).
  * "bits" specifies how many events (interrupts) this device will need.
  * Setting "valid" to false deallocates the device.
@@ -204,6 +211,12 @@ static inline bool gicv3_lpi_is_enabled(struct domain *d, uint32_t lpi)
 {
     return false;
 }
+static inline int vgic_v3_its_init_virtual(struct domain *d,
+                                           struct host_its *hw_its,
+                                           paddr_t guest_addr)
+{
+    return 0;
+}
 
 #endif /* CONFIG_HAS_ITS */
 
-- 
2.9.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 144+ messages in thread

* [RFC PATCH 23/24] ARM: vITS: create ITS subnodes for Dom0 DT
  2016-09-28 18:24 [RFC PATCH 00/24] [FOR 4.9] arm64: Dom0 ITS emulation Andre Przywara
                   ` (21 preceding siblings ...)
  2016-09-28 18:24 ` [RFC PATCH 22/24] ARM: vITS: create and initialize virtual ITSes for Dom0 Andre Przywara
@ 2016-09-28 18:24 ` Andre Przywara
  2016-09-28 18:24 ` [RFC PATCH 24/24] ARM: vGIC: advertising LPI support Andre Przywara
  2016-11-02 13:56 ` [RFC PATCH 00/24] [FOR 4.9] arm64: Dom0 ITS emulation Julien Grall
  24 siblings, 0 replies; 144+ messages in thread
From: Andre Przywara @ 2016-09-28 18:24 UTC (permalink / raw)
  To: Julien Grall, Stefano Stabellini; +Cc: xen-devel

Dom0 expects every ITS in the system to be advertised in its device tree
to be able to use MSIs.
Create Dom0 DT nodes for each hardware ITS, keeping the register frame
address the same, as the doorbell address that the Dom0 drivers program
into the BARs has to match the hardware.
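A hedged sketch of the sub-node this code generates under the GIC node in
the Dom0 device tree (addresses, labels and the phandle value are purely
illustrative; the `reg` property is copied verbatim from the host DTB):

```dts
gic: interrupt-controller@2f000000 {
        /* ... existing GIC properties and "ranges" ... */

        its: its@2f020000 {
                compatible = "arm,gic-v3-its";
                msi-controller;
                reg = <0x0 0x2f020000 0x0 0x20000>;
                phandle = <0x100>;
        };
};
```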

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
 xen/arch/arm/gic-its.c        | 68 +++++++++++++++++++++++++++++++++++++++++++
 xen/arch/arm/gic-v3.c         |  4 ++-
 xen/include/asm-arm/gic-its.h | 13 +++++++++
 3 files changed, 84 insertions(+), 1 deletion(-)

diff --git a/xen/arch/arm/gic-its.c b/xen/arch/arm/gic-its.c
index 5129d6e..bb0b80a 100644
--- a/xen/arch/arm/gic-its.c
+++ b/xen/arch/arm/gic-its.c
@@ -748,6 +748,74 @@ void gicv3_lpi_set_enable(struct host_its *its,
     its_send_cmd_sync(its, 0);
 }
 
+int gicv3_its_make_dt_nodes(struct list_head *its_list,
+                            const struct domain *d,
+                            const struct dt_device_node *gic,
+                            void *fdt)
+{
+    uint32_t len;
+    int res;
+    const void *prop = NULL;
+    const struct dt_device_node *its = NULL;
+    const struct host_its *its_data;
+
+    if ( list_empty(its_list) )
+        return 0;
+
+    /* The sub-nodes require the ranges property */
+    prop = dt_get_property(gic, "ranges", &len);
+    if ( !prop )
+    {
+        printk(XENLOG_ERR "Can't find ranges property for the gic node\n");
+        return -FDT_ERR_XEN(ENOENT);
+    }
+
+    res = fdt_property(fdt, "ranges", prop, len);
+    if ( res )
+        return res;
+
+    list_for_each_entry(its_data, its_list, entry)
+    {
+        its = its_data->dt_node;
+
+        res = fdt_begin_node(fdt, its->name);
+        if ( res )
+            return res;
+
+        res = fdt_property_string(fdt, "compatible", "arm,gic-v3-its");
+        if ( res )
+            return res;
+
+        res = fdt_property(fdt, "msi-controller", NULL, 0);
+        if ( res )
+            return res;
+
+        if ( its->phandle )
+        {
+            res = fdt_property_cell(fdt, "phandle", its->phandle);
+            if ( res )
+                return res;
+        }
+
+        /* Use the same reg regions as the ITS node in host DTB. */
+        prop = dt_get_property(its, "reg", &len);
+        if ( !prop )
+        {
+            printk(XENLOG_ERR "GICv3: Can't find ITS reg property.\n");
+            res = -FDT_ERR_XEN(ENOENT);
+            return res;
+        }
+
+        res = fdt_property(fdt, "reg", prop, len);
+        if ( res )
+            return res;
+
+        fdt_end_node(fdt);
+    }
+
+    return res;
+}
+
 void gicv3_its_dt_init(const struct dt_device_node *node)
 {
     const struct dt_device_node *its = NULL;
diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
index 57009c6..9fba3eb 100644
--- a/xen/arch/arm/gic-v3.c
+++ b/xen/arch/arm/gic-v3.c
@@ -1176,8 +1176,10 @@ static int gicv3_make_hwdom_dt_node(const struct domain *d,
 
     res = fdt_property(fdt, "reg", new_cells, len);
     xfree(new_cells);
+    if ( res )
+        return res;
 
-    return res;
+    return gicv3_its_make_dt_nodes(&host_its_list, d, gic, fdt);
 }
 
 static const hw_irq_controller gicv3_host_irq_type = {
diff --git a/xen/include/asm-arm/gic-its.h b/xen/include/asm-arm/gic-its.h
index b58e092..e20b5bc 100644
--- a/xen/include/asm-arm/gic-its.h
+++ b/xen/include/asm-arm/gic-its.h
@@ -130,6 +130,12 @@ void gicv3_its_setup_collection(int cpu);
 int vgic_v3_its_init_virtual(struct domain *d, struct host_its *hw_its,
                              paddr_t guest_addr);
 
+/* Given a list of ITSes, create the appropriate DT nodes for a domain. */
+int gicv3_its_make_dt_nodes(struct list_head *its_list,
+                            const struct domain *d,
+                            const struct dt_device_node *gic,
+                            void *fdt);
+
 /* Map a device on the host by allocating an ITT on the host (ITS).
  * "bits" specifies how many events (interrupts) this device will need.
  * Setting "valid" to false deallocates the device.
@@ -217,6 +223,13 @@ static inline int vgic_v3_its_init_virtual(struct domain *d,
 {
     return 0;
 }
+static inline int gicv3_its_make_dt_nodes(struct list_head *its_list,
+                                       const struct domain *d,
+                                       const struct dt_device_node *gic,
+                                       void *fdt)
+{
+    return 0;
+}
 
 #endif /* CONFIG_HAS_ITS */
 
-- 
2.9.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 144+ messages in thread

* [RFC PATCH 24/24] ARM: vGIC: advertising LPI support
  2016-09-28 18:24 [RFC PATCH 00/24] [FOR 4.9] arm64: Dom0 ITS emulation Andre Przywara
                   ` (22 preceding siblings ...)
  2016-09-28 18:24 ` [RFC PATCH 23/24] ARM: vITS: create ITS subnodes for Dom0 DT Andre Przywara
@ 2016-09-28 18:24 ` Andre Przywara
  2016-11-10  0:49   ` Stefano Stabellini
  2016-11-02 13:56 ` [RFC PATCH 00/24] [FOR 4.9] arm64: Dom0 ITS emulation Julien Grall
  24 siblings, 1 reply; 144+ messages in thread
From: Andre Przywara @ 2016-09-28 18:24 UTC (permalink / raw)
  To: Julien Grall, Stefano Stabellini; +Cc: xen-devel

To let a guest know about the availability of virtual LPIs, set the
respective bits in the virtual GIC registers and let the guest control
the LPI enable bit.
Only report the LPI capability if the host has initialized at least
one ITS.
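The GICR_TYPER bits the patch assembles have fixed positions in the GICv3
spec: bit 0 (PLPIS) advertises physical LPI support, bit 4 (Last) marks
the final redistributor, and bits [23:8] carry the processor number. A
sketch of that assembly (illustrative helper, not the patch's exact code):

```c
#include <assert.h>
#include <stdint.h>

#define GICR_TYPER_PLPIS (1ULL << 0)   /* physical LPIs supported */
#define GICR_TYPER_LAST  (1ULL << 4)   /* last redistributor in region */

static uint64_t redist_typer(uint64_t aff, int vcpu_id, int has_its, int last)
{
    /* Affinity value in [63:32], processor number in [23:8]. */
    uint64_t typer = aff | ((uint64_t)(vcpu_id & 0xffff) << 8);

    if ( has_its )
        typer |= GICR_TYPER_PLPIS;
    if ( last )
        typer |= GICR_TYPER_LAST;

    return typer;
}
```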

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
 xen/arch/arm/vgic-v3.c | 28 +++++++++++++++++++++++-----
 1 file changed, 23 insertions(+), 5 deletions(-)

diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c
index d230a1f..61c97a2 100644
--- a/xen/arch/arm/vgic-v3.c
+++ b/xen/arch/arm/vgic-v3.c
@@ -168,8 +168,10 @@ static int __vgic_v3_rdistr_rd_mmio_read(struct vcpu *v, mmio_info_t *info,
     switch ( gicr_reg )
     {
     case VREG32(GICR_CTLR):
-        /* We have not implemented LPI's, read zero */
-        goto read_as_zero_32;
+        if ( dabt.size != DABT_WORD ) goto bad_width;
+        *r = vgic_reg32_extract(!!(v->arch.vgic.flags & VGIC_V3_LPIS_ENABLED),
+                                info);
+        return 1;
 
     case VREG32(GICR_IIDR):
         if ( dabt.size != DABT_WORD ) goto bad_width;
@@ -181,16 +183,19 @@ static int __vgic_v3_rdistr_rd_mmio_read(struct vcpu *v, mmio_info_t *info,
         uint64_t typer, aff;
 
         if ( !vgic_reg64_check_access(dabt) ) goto bad_width;
-        /* TBD: Update processor id in [23:8] when ITS support is added */
         aff = (MPIDR_AFFINITY_LEVEL(v->arch.vmpidr, 3) << 56 |
                MPIDR_AFFINITY_LEVEL(v->arch.vmpidr, 2) << 48 |
                MPIDR_AFFINITY_LEVEL(v->arch.vmpidr, 1) << 40 |
                MPIDR_AFFINITY_LEVEL(v->arch.vmpidr, 0) << 32);
         typer = aff;
+        typer |= (v->vcpu_id & 0xffff) << 8;
 
         if ( v->arch.vgic.flags & VGIC_V3_RDIST_LAST )
             typer |= GICR_TYPER_LAST;
 
+        if ( v->domain->arch.vgic.has_its )
+            typer |= GICR_TYPER_PLPIS;
+
         *r = vgic_reg64_extract(typer, info);
 
         return 1;
@@ -468,8 +473,16 @@ static int __vgic_v3_rdistr_rd_mmio_write(struct vcpu *v, mmio_info_t *info,
     switch ( gicr_reg )
     {
     case VREG32(GICR_CTLR):
-        /* LPI's not implemented */
-        goto write_ignore_32;
+        if ( dabt.size != DABT_WORD ) goto bad_width;
+        if ( !v->domain->arch.vgic.has_its )
+            return 1;
+
+        if ( r & 1 )
+            v->arch.vgic.flags |= VGIC_V3_LPIS_ENABLED;
+        else
+            v->arch.vgic.flags &= ~VGIC_V3_LPIS_ENABLED;
+
+        return 1;
 
     case VREG32(GICR_IIDR):
         /* RO */
@@ -1075,6 +1088,11 @@ static int vgic_v3_distr_mmio_read(struct vcpu *v, mmio_info_t *info,
         typer = ((ncpus - 1) << GICD_TYPE_CPUS_SHIFT |
                  DIV_ROUND_UP(v->domain->arch.vgic.nr_spis, 32));
 
+        if ( v->domain->arch.vgic.has_its )
+        {
+            typer |= GICD_TYPE_LPIS;
+            irq_bits = 16;
+        }
         typer |= (irq_bits - 1) << GICD_TYPE_ID_BITS_SHIFT;
 
         *r = vgic_reg32_extract(typer, info);
-- 
2.9.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


* Re: [RFC PATCH 03/24] ARM: GICv3 ITS: allocate device and collection table
  2016-09-28 18:24 ` [RFC PATCH 03/24] ARM: GICv3 ITS: allocate device and collection table Andre Przywara
@ 2016-10-09 13:55   ` Vijay Kilari
  2016-10-10  9:05     ` Andre Przywara
  2016-10-24 14:30   ` Vijay Kilari
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 144+ messages in thread
From: Vijay Kilari @ 2016-10-09 13:55 UTC (permalink / raw)
  To: Andre Przywara; +Cc: xen-devel, Julien Grall, Stefano Stabellini

Hi Andre,

   On ThunderX, MAPD commands are failing with error 0x1,
which means the DEVID is out of range.

On Wed, Sep 28, 2016 at 11:54 PM, Andre Przywara <andre.przywara@arm.com> wrote:
> Each ITS maps a pair of a DeviceID (usually the PCI b/d/f triplet) and
> an EventID (the MSI payload or interrupt ID) to a pair of LPI number
> and collection ID, which points to the target CPU.
> This mapping is stored in the device and collection tables, which software
> has to provide for the ITS to use.
> Allocate the required memory and hand it over to the ITS.
> We limit the number of devices to cover 4 PCI buses for now.
>
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> ---
>  xen/arch/arm/gic-its.c        | 114 ++++++++++++++++++++++++++++++++++++++++++
>  xen/arch/arm/gic-v3.c         |   5 ++
>  xen/include/asm-arm/gic-its.h |  49 +++++++++++++++++-
>  3 files changed, 167 insertions(+), 1 deletion(-)
>
> diff --git a/xen/arch/arm/gic-its.c b/xen/arch/arm/gic-its.c
> index b52dff3..40238a2 100644
> --- a/xen/arch/arm/gic-its.c
> +++ b/xen/arch/arm/gic-its.c
> @@ -21,6 +21,7 @@
>  #include <xen/device_tree.h>
>  #include <xen/libfdt/libfdt.h>
>  #include <asm/p2m.h>
> +#include <asm/io.h>
>  #include <asm/gic.h>
>  #include <asm/gic_v3_defs.h>
>  #include <asm/gic-its.h>
> @@ -38,6 +39,119 @@ static DEFINE_PER_CPU(void *, pending_table);
>          min_t(unsigned int, lpi_data.host_lpi_bits, CONFIG_HOST_LPI_BITS)
>  #define MAX_HOST_LPIS   (BIT(MAX_HOST_LPI_BITS) - 8192)
>
> +#define BASER_ATTR_MASK                                           \
> +        ((0x3UL << GITS_BASER_SHAREABILITY_SHIFT)               | \
> +         (0x7UL << GITS_BASER_OUTER_CACHEABILITY_SHIFT)         | \
> +         (0x7UL << GITS_BASER_INNER_CACHEABILITY_SHIFT))
> +#define BASER_RO_MASK   (GENMASK(52, 48) | GENMASK(58, 56))
> +
> +static uint64_t encode_phys_addr(paddr_t addr, int page_bits)
> +{
> +    uint64_t ret;
> +
> +    if ( page_bits < 16)
> +        return (uint64_t)addr & GENMASK(47, page_bits);
> +
> +    ret = addr & GENMASK(47, 16);
> +    return ret | (addr & GENMASK(51, 48)) >> (48 - 12);
> +}
> +
> +static int gicv3_map_baser(void __iomem *basereg, uint64_t regc, int nr_items)
> +{
> +    uint64_t attr;
> +    int entry_size = (regc >> GITS_BASER_ENTRY_SIZE_SHIFT) & 0x1f;
> +    int pagesz;
> +    int order;
> +    void *buffer = NULL;
> +
> +    attr  = GIC_BASER_InnerShareable << GITS_BASER_SHAREABILITY_SHIFT;
> +    attr |= GIC_BASER_CACHE_SameAsInner << GITS_BASER_OUTER_CACHEABILITY_SHIFT;
> +    attr |= GIC_BASER_CACHE_RaWaWb << GITS_BASER_INNER_CACHEABILITY_SHIFT;
> +
> +    /*
> +     * Loop over the page sizes (4K, 16K, 64K) to find out what the host
> +     * supports.
> +     */
> +    for (pagesz = 0; pagesz < 3; pagesz++)
> +    {
> +        uint64_t reg;
> +        int nr_bytes;
> +
> +        nr_bytes = ROUNDUP(nr_items * entry_size, BIT(pagesz * 2 + 12));
> +        order = get_order_from_bytes(nr_bytes);
> +
> +        if ( !buffer )
> +            buffer = alloc_xenheap_pages(order, 0);
> +        if ( !buffer )
> +            return -ENOMEM;
> +
> +        reg  = attr;
> +        reg |= (pagesz << GITS_BASER_PAGE_SIZE_SHIFT);
> +        reg |= nr_bytes >> (pagesz * 2 + 12);
> +        reg |= regc & BASER_RO_MASK;
> +        reg |= GITS_BASER_VALID;
> +        reg |= encode_phys_addr(virt_to_maddr(buffer), pagesz * 2 + 12);
> +
> +        writeq_relaxed(reg, basereg);
> +        regc = readl_relaxed(basereg);
> +
> +        /* The host didn't like our attributes, just use what it returned. */
> +        if ( (regc & BASER_ATTR_MASK) != attr )
> +            attr = regc & BASER_ATTR_MASK;
> +
> +        /* If the host accepted our page size, we are done. */
> +        if ( (reg & (3UL << GITS_BASER_PAGE_SIZE_SHIFT)) == pagesz )
> +            return 0;
> +
> +        /* Check whether our buffer is aligned to the next page size already. */
> +        if ( !(virt_to_maddr(buffer) & (BIT(pagesz * 2 + 12 + 2) - 1)) )
> +        {
> +            free_xenheap_pages(buffer, order);
> +            buffer = NULL;
> +        }
> +    }
> +
> +    if ( buffer )
> +        free_xenheap_pages(buffer, order);
> +
> +    return -EINVAL;
> +}
> +
> +int gicv3_its_init(struct host_its *hw_its)
> +{
> +    uint64_t reg;
> +    int i;
> +
> +    hw_its->its_base = ioremap_nocache(hw_its->addr, hw_its->size);
> +    if ( !hw_its->its_base )
> +        return -ENOMEM;
> +
> +    for (i = 0; i < 8; i++)
> +    {
> +        void __iomem *basereg = hw_its->its_base + GITS_BASER0 + i * 8;
> +        int type;
> +
> +        reg = readq_relaxed(basereg);
> +        type = (reg >> 56) & 0x7;
> +        switch ( type )
> +        {
> +        case GITS_BASER_TYPE_NONE:
> +            continue;
> +        case GITS_BASER_TYPE_DEVICE:
> +            /* TODO: find some better way of limiting the number of devices */
> +            gicv3_map_baser(basereg, reg, 1024);

ThunderX has larger device ID values.
Changing this to the number of device IDs the hardware supports makes
the MAPD commands pass and ThunderX boot.
You can refer to the ThunderX BDF numbers here:
https://github.com/vijaykilari/its_v6/commit/e1a8ec82ad2bb00b299727d0847b89671e9ba66d

> +            break;
> +        case GITS_BASER_TYPE_COLLECTION:
> +            gicv3_map_baser(basereg, reg, NR_CPUS);
> +            break;
> +        default:
> +            continue;
> +        }
> +    }
> +
> +    return 0;
> +}
> +
>  uint64_t gicv3_lpi_allocate_pendtable(void)
>  {
>      uint64_t reg, attr;
> diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
> index 2534aa5..5cf4618 100644
> --- a/xen/arch/arm/gic-v3.c
> +++ b/xen/arch/arm/gic-v3.c
> @@ -29,6 +29,7 @@
>  #include <xen/irq.h>
>  #include <xen/iocap.h>
>  #include <xen/sched.h>
> +#include <xen/err.h>
>  #include <xen/errno.h>
>  #include <xen/delay.h>
>  #include <xen/device_tree.h>
> @@ -1548,6 +1549,7 @@ static int __init gicv3_init(void)
>  {
>      int res, i;
>      uint32_t reg;
> +    struct host_its *hw_its;
>
>      if ( !cpu_has_gicv3 )
>      {
> @@ -1603,6 +1605,9 @@ static int __init gicv3_init(void)
>      res = gicv3_cpu_init();
>      gicv3_hyp_init();
>
> +    list_for_each_entry(hw_its, &host_its_list, entry)
> +        gicv3_its_init(hw_its);
> +
>      spin_unlock(&gicv3.lock);
>
>      return res;
> diff --git a/xen/include/asm-arm/gic-its.h b/xen/include/asm-arm/gic-its.h
> index 48c6c78..589b889 100644
> --- a/xen/include/asm-arm/gic-its.h
> +++ b/xen/include/asm-arm/gic-its.h
> @@ -18,6 +18,47 @@
>  #ifndef __ASM_ARM_ITS_H__
>  #define __ASM_ARM_ITS_H__
>
> +#define LPI_OFFSET      8192
> +
> +#define GITS_CTLR       (0x000)
> +#define GITS_IIDR       (0x004)
> +#define GITS_TYPER      (0x008)
> +#define GITS_CBASER     (0x080)
> +#define GITS_CWRITER    (0x088)
> +#define GITS_CREADR     (0x090)
> +#define GITS_BASER0     (0x100)
> +#define GITS_BASER1     (0x108)
> +#define GITS_BASER2     (0x110)
> +#define GITS_BASER3     (0x118)
> +#define GITS_BASER4     (0x120)
> +#define GITS_BASER5     (0x128)
> +#define GITS_BASER6     (0x130)
> +#define GITS_BASER7     (0x138)
> +
> +/* Register bits */
> +#define GITS_CTLR_ENABLE     0x1
> +#define GITS_IIDR_VALUE      0x34c
> +
> +#define GITS_BASER_VALID                BIT(63)
> +#define GITS_BASER_INDIRECT             BIT(62)
> +#define GITS_BASER_INNER_CACHEABILITY_SHIFT        59
> +#define GITS_BASER_TYPE_SHIFT           56
> +#define GITS_BASER_OUTER_CACHEABILITY_SHIFT        53
> +#define GITS_BASER_TYPE_NONE            0UL
> +#define GITS_BASER_TYPE_DEVICE          1UL
> +#define GITS_BASER_TYPE_VCPU            2UL
> +#define GITS_BASER_TYPE_CPU             3UL
> +#define GITS_BASER_TYPE_COLLECTION      4UL
> +#define GITS_BASER_TYPE_RESERVED5       5UL
> +#define GITS_BASER_TYPE_RESERVED6       6UL
> +#define GITS_BASER_TYPE_RESERVED7       7UL
> +#define GITS_BASER_ENTRY_SIZE_SHIFT     48
> +#define GITS_BASER_SHAREABILITY_SHIFT   10
> +#define GITS_BASER_PAGE_SIZE_SHIFT      8
> +#define GITS_BASER_RO_MASK              ((7UL << GITS_BASER_TYPE_SHIFT) | \
> +                                        (31UL << GITS_BASER_ENTRY_SIZE_SHIFT) |\
> +                                        GITS_BASER_INDIRECT)
> +
>  #ifndef __ASSEMBLY__
>  #include <xen/device_tree.h>
>
> @@ -27,6 +68,7 @@ struct host_its {
>      const struct dt_device_node *dt_node;
>      paddr_t addr;
>      paddr_t size;
> +    void __iomem *its_base;
>  };
>
>  extern struct list_head host_its_list;
> @@ -42,8 +84,9 @@ void gicv3_its_dt_init(const struct dt_device_node *node);
>  uint64_t gicv3_lpi_get_proptable(void);
>  uint64_t gicv3_lpi_allocate_pendtable(void);
>
> -/* Initialize the host structures for LPIs. */
> +/* Initialize the host structures for LPIs and the host ITSes. */
>  int gicv3_lpi_init_host_lpis(int nr_lpis);
> +int gicv3_its_init(struct host_its *hw_its);
>
>  #else
>
> @@ -62,6 +105,10 @@ static inline int gicv3_lpi_init_host_lpis(int nr_lpis)
>  {
>      return 0;
>  }
> +static inline int gicv3_its_init(struct host_its *hw_its)
> +{
> +    return 0;
> +}
>  #endif /* CONFIG_HAS_ITS */
>
>  #endif /* __ASSEMBLY__ */
> --
> 2.9.0
>
>



* Re: [RFC PATCH 12/24] ARM: vGICv3: introduce basic ITS emulation bits
  2016-09-28 18:24 ` [RFC PATCH 12/24] ARM: vGICv3: introduce basic ITS emulation bits Andre Przywara
@ 2016-10-09 14:20   ` Vijay Kilari
  2016-10-10 10:38     ` Andre Przywara
  2016-10-24 15:31   ` Vijay Kilari
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 144+ messages in thread
From: Vijay Kilari @ 2016-10-09 14:20 UTC (permalink / raw)
  To: Andre Przywara; +Cc: xen-devel, Julien Grall, Stefano Stabellini

On Wed, Sep 28, 2016 at 11:54 PM, Andre Przywara <andre.przywara@arm.com> wrote:
> Create a new file to hold the emulation code for the ITS widget.
> For now we emulate the memory mapped ITS registers and provide a stub
> to introduce the ITS command handling framework (but without actually
> emulating any commands at this time).
>
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> ---
>  xen/arch/arm/Makefile             |   1 +
>  xen/arch/arm/vgic-its.c           | 378 ++++++++++++++++++++++++++++++++++++++
>  xen/arch/arm/vgic-v3.c            |   9 -
>  xen/include/asm-arm/gic_v3_defs.h |  19 ++
>  4 files changed, 398 insertions(+), 9 deletions(-)
>  create mode 100644 xen/arch/arm/vgic-its.c
>
> diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
> index c2c4daa..cb0201f 100644
> --- a/xen/arch/arm/Makefile
> +++ b/xen/arch/arm/Makefile
> @@ -44,6 +44,7 @@ obj-y += traps.o
>  obj-y += vgic.o
>  obj-y += vgic-v2.o
>  obj-$(CONFIG_ARM_64) += vgic-v3.o
> +obj-$(CONFIG_HAS_ITS) += vgic-its.o
>  obj-y += vm_event.o
>  obj-y += vtimer.o
>  obj-y += vpsci.o
> diff --git a/xen/arch/arm/vgic-its.c b/xen/arch/arm/vgic-its.c
> new file mode 100644
> index 0000000..875b992
> --- /dev/null
> +++ b/xen/arch/arm/vgic-its.c
> @@ -0,0 +1,378 @@
> +/*
> + * xen/arch/arm/vgic-its.c
> + *
> + * ARM Interrupt Translation Service (ITS) emulation
> + *
> + * Andre Przywara <andre.przywara@arm.com>
> + * Copyright (c) 2016 ARM Ltd.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + */
> +
> +#include <xen/bitops.h>
> +#include <xen/config.h>
> +#include <xen/domain_page.h>
> +#include <xen/lib.h>
> +#include <xen/init.h>
> +#include <xen/softirq.h>
> +#include <xen/irq.h>
> +#include <xen/sched.h>
> +#include <xen/sizes.h>
> +#include <asm/current.h>
> +#include <asm/mmio.h>
> +#include <asm/gic_v3_defs.h>
> +#include <asm/gic-its.h>
> +#include <asm/vgic.h>
> +#include <asm/vgic-emul.h>
> +
> +/* Data structure to describe a virtual ITS */
> +struct virt_its {
> +    struct domain *d;
> +    struct host_its *hw_its;
> +    spinlock_t vcmd_lock;       /* protects the virtual command buffer */
> +    uint64_t cbaser;
> +    uint64_t *cmdbuf;
> +    int cwriter;
> +    int creadr;
> +    spinlock_t its_lock;        /* protects the collection and device tables */
> +    uint64_t baser0, baser1;
> +    uint16_t *coll_table;
> +    int max_collections;
> +    uint64_t *dev_table;
> +    int max_devices;
> +    bool enabled;
> +};
> +
> +/* An Interrupt Translation Table Entry: this is indexed by a
> + * DeviceID/EventID pair and is located in guest memory.
> + */
> +struct vits_itte
> +{
> +    uint64_t hlpi:24;
> +    uint64_t vlpi:24;
> +    uint64_t collection:16;
> +};
> +
> +/**************************************
> + * Functions that handle ITS commands *
> + **************************************/
> +
> +static uint64_t its_cmd_mask_field(uint64_t *its_cmd,
> +                                   int word, int shift, int size)
> +{
> +    return (le64_to_cpu(its_cmd[word]) >> shift) & (BIT(size) - 1);
> +}
> +
> +#define its_cmd_get_command(cmd)        its_cmd_mask_field(cmd, 0,  0,  8)
> +#define its_cmd_get_deviceid(cmd)       its_cmd_mask_field(cmd, 0, 32, 32)
> +#define its_cmd_get_size(cmd)           its_cmd_mask_field(cmd, 1,  0,  5)
> +#define its_cmd_get_id(cmd)             its_cmd_mask_field(cmd, 1,  0, 32)
> +#define its_cmd_get_physical_id(cmd)    its_cmd_mask_field(cmd, 1, 32, 32)
> +#define its_cmd_get_collection(cmd)     its_cmd_mask_field(cmd, 2,  0, 16)
> +#define its_cmd_get_target_addr(cmd)    its_cmd_mask_field(cmd, 2, 16, 32)
> +#define its_cmd_get_validbit(cmd)       its_cmd_mask_field(cmd, 2, 63,  1)
> +
> +#define ITS_CMD_BUFFER_SIZE(baser)      ((((baser) & 0xff) + 1) << 12)
> +
> +static int vgic_its_handle_cmds(struct domain *d, struct virt_its *its,
> +                                uint32_t writer)
> +{
> +    uint64_t *cmdptr;
> +
> +    if ( !its->cmdbuf )
> +        return -1;
> +
> +    if ( writer >= ITS_CMD_BUFFER_SIZE(its->cbaser) )
> +        return -1;
> +
> +    spin_lock(&its->vcmd_lock);
> +
> +    while ( its->creadr != writer )
> +    {
> +        cmdptr = its->cmdbuf + (its->creadr / sizeof(*its->cmdbuf));
> +        switch (its_cmd_get_command(cmdptr))
> +        {
> +        case GITS_CMD_SYNC:
> +            /* We handle ITS commands synchronously, so we ignore SYNC. */
> +           break;
> +        default:
> +            gdprintk(XENLOG_G_WARNING, "ITS: unhandled ITS command %ld\n",
> +                   its_cmd_get_command(cmdptr));
> +            break;
> +        }
> +
> +        its->creadr += ITS_CMD_SIZE;
> +        if ( its->creadr == ITS_CMD_BUFFER_SIZE(its->cbaser) )
> +            its->creadr = 0;
> +    }
> +    its->cwriter = writer;
> +
> +    spin_unlock(&its->vcmd_lock);
> +
> +    return 0;
> +}
> +
> +/*****************************
> + * ITS registers read access *
> + *****************************/
> +
> +/* The physical address is encoded slightly differently depending on
> + * the used page size: the highest four bits are stored in the lowest
> + * four bits of the field for 64K pages.
> + */
> +static paddr_t get_baser_phys_addr(uint64_t reg)
> +{
> +    if ( reg & BIT(9) )
> +        return (reg & GENMASK(47, 16)) | ((reg & GENMASK(15, 12)) << 36);
> +    else
> +        return reg & GENMASK(47, 12);
> +}
> +
> +static int vgic_v3_its_mmio_read(struct vcpu *v, mmio_info_t *info,
> +                                 register_t *r, void *priv)
> +{
> +    struct virt_its *its = priv;
> +
> +    switch ( info->gpa & 0xffff )
> +    {
> +    case VREG32(GITS_CTLR):
> +        if ( info->dabt.size != DABT_WORD ) goto bad_width;
> +        *r = vgic_reg32_extract(its->enabled | BIT(31), info);
> +       break;
> +    case VREG32(GITS_IIDR):
> +        if ( info->dabt.size != DABT_WORD ) goto bad_width;
> +        *r = vgic_reg32_extract(GITS_IIDR_VALUE, info);
> +        break;
> +    case VREG64(GITS_TYPER):
> +        if ( info->dabt.size < DABT_WORD ) goto bad_width;
> +        *r = vgic_reg64_extract(0x1eff1, info);

     Here you are limiting the DevID bits field to 15 (i.e. 16 DeviceID bits), which is not enough.



* Re: [RFC PATCH 03/24] ARM: GICv3 ITS: allocate device and collection table
  2016-10-09 13:55   ` Vijay Kilari
@ 2016-10-10  9:05     ` Andre Przywara
  0 siblings, 0 replies; 144+ messages in thread
From: Andre Przywara @ 2016-10-10  9:05 UTC (permalink / raw)
  To: Vijay Kilari; +Cc: xen-devel, Julien Grall, Stefano Stabellini

Hi,

On 09/10/16 14:55, Vijay Kilari wrote:
> Hi Andre,
> 
>    On ThunderX, MAPD commands are failing with error 0x1,
> which means the DEVID is out of range.

MAPD commands from Dom0, you mean?

And thanks for giving it a try!

> On Wed, Sep 28, 2016 at 11:54 PM, Andre Przywara <andre.przywara@arm.com> wrote:
>> Each ITS maps a pair of a DeviceID (usually the PCI b/d/f triplet) and
>> an EventID (the MSI payload or interrupt ID) to a pair of LPI number
>> and collection ID, which points to the target CPU.
>> This mapping is stored in the device and collection tables, which software
>> has to provide for the ITS to use.
>> Allocate the required memory and hand it over to the ITS.
>> We limit the number of devices to cover 4 PCI buses for now.
>>
>> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
>> ---
>>  xen/arch/arm/gic-its.c        | 114 ++++++++++++++++++++++++++++++++++++++++++
>>  xen/arch/arm/gic-v3.c         |   5 ++
>>  xen/include/asm-arm/gic-its.h |  49 +++++++++++++++++-
>>  3 files changed, 167 insertions(+), 1 deletion(-)
>>
>> diff --git a/xen/arch/arm/gic-its.c b/xen/arch/arm/gic-its.c
>> index b52dff3..40238a2 100644
>> --- a/xen/arch/arm/gic-its.c
>> +++ b/xen/arch/arm/gic-its.c
>> @@ -21,6 +21,7 @@
>>  #include <xen/device_tree.h>
>>  #include <xen/libfdt/libfdt.h>
>>  #include <asm/p2m.h>
>> +#include <asm/io.h>
>>  #include <asm/gic.h>
>>  #include <asm/gic_v3_defs.h>
>>  #include <asm/gic-its.h>
>> @@ -38,6 +39,119 @@ static DEFINE_PER_CPU(void *, pending_table);
>>          min_t(unsigned int, lpi_data.host_lpi_bits, CONFIG_HOST_LPI_BITS)
>>  #define MAX_HOST_LPIS   (BIT(MAX_HOST_LPI_BITS) - 8192)
>>
>> +#define BASER_ATTR_MASK                                           \
>> +        ((0x3UL << GITS_BASER_SHAREABILITY_SHIFT)               | \
>> +         (0x7UL << GITS_BASER_OUTER_CACHEABILITY_SHIFT)         | \
>> +         (0x7UL << GITS_BASER_INNER_CACHEABILITY_SHIFT))
>> +#define BASER_RO_MASK   (GENMASK(52, 48) | GENMASK(58, 56))
>> +
>> +static uint64_t encode_phys_addr(paddr_t addr, int page_bits)
>> +{
>> +    uint64_t ret;
>> +
>> +    if ( page_bits < 16)
>> +        return (uint64_t)addr & GENMASK(47, page_bits);
>> +
>> +    ret = addr & GENMASK(47, 16);
>> +    return ret | (addr & GENMASK(51, 48)) >> (48 - 12);
>> +}
>> +
>> +static int gicv3_map_baser(void __iomem *basereg, uint64_t regc, int nr_items)
>> +{
>> +    uint64_t attr;
>> +    int entry_size = (regc >> GITS_BASER_ENTRY_SIZE_SHIFT) & 0x1f;
>> +    int pagesz;
>> +    int order;
>> +    void *buffer = NULL;
>> +
>> +    attr  = GIC_BASER_InnerShareable << GITS_BASER_SHAREABILITY_SHIFT;
>> +    attr |= GIC_BASER_CACHE_SameAsInner << GITS_BASER_OUTER_CACHEABILITY_SHIFT;
>> +    attr |= GIC_BASER_CACHE_RaWaWb << GITS_BASER_INNER_CACHEABILITY_SHIFT;
>> +
>> +    /*
>> +     * Loop over the page sizes (4K, 16K, 64K) to find out what the host
>> +     * supports.
>> +     */
>> +    for (pagesz = 0; pagesz < 3; pagesz++)
>> +    {
>> +        uint64_t reg;
>> +        int nr_bytes;
>> +
>> +        nr_bytes = ROUNDUP(nr_items * entry_size, BIT(pagesz * 2 + 12));
>> +        order = get_order_from_bytes(nr_bytes);
>> +
>> +        if ( !buffer )
>> +            buffer = alloc_xenheap_pages(order, 0);
>> +        if ( !buffer )
>> +            return -ENOMEM;
>> +
>> +        reg  = attr;
>> +        reg |= (pagesz << GITS_BASER_PAGE_SIZE_SHIFT);
>> +        reg |= nr_bytes >> (pagesz * 2 + 12);
>> +        reg |= regc & BASER_RO_MASK;
>> +        reg |= GITS_BASER_VALID;
>> +        reg |= encode_phys_addr(virt_to_maddr(buffer), pagesz * 2 + 12);
>> +
>> +        writeq_relaxed(reg, basereg);
>> +        regc = readl_relaxed(basereg);
>> +
>> +        /* The host didn't like our attributes, just use what it returned. */
>> +        if ( (regc & BASER_ATTR_MASK) != attr )
>> +            attr = regc & BASER_ATTR_MASK;
>> +
>> +        /* If the host accepted our page size, we are done. */
>> +        if ( (reg & (3UL << GITS_BASER_PAGE_SIZE_SHIFT)) == pagesz )
>> +            return 0;
>> +
>> +        /* Check whether our buffer is aligned to the next page size already. */
>> +        if ( !(virt_to_maddr(buffer) & (BIT(pagesz * 2 + 12 + 2) - 1)) )
>> +        {
>> +            free_xenheap_pages(buffer, order);
>> +            buffer = NULL;
>> +        }
>> +    }
>> +
>> +    if ( buffer )
>> +        free_xenheap_pages(buffer, order);
>> +
>> +    return -EINVAL;
>> +}
>> +
>> +int gicv3_its_init(struct host_its *hw_its)
>> +{
>> +    uint64_t reg;
>> +    int i;
>> +
>> +    hw_its->its_base = ioremap_nocache(hw_its->addr, hw_its->size);
>> +    if ( !hw_its->its_base )
>> +        return -ENOMEM;
>> +
>> +    for (i = 0; i < 8; i++)
>> +    {
>> +        void __iomem *basereg = hw_its->its_base + GITS_BASER0 + i * 8;
>> +        int type;
>> +
>> +        reg = readq_relaxed(basereg);
>> +        type = (reg >> 56) & 0x7;
>> +        switch ( type )
>> +        {
>> +        case GITS_BASER_TYPE_NONE:
>> +            continue;
>> +        case GITS_BASER_TYPE_DEVICE:
>> +            /* TODO: find some better way of limiting the number of devices */
>> +            gicv3_map_baser(basereg, reg, 1024);
> 
> ThunderX has larger device ID values.
> Changing this to the number of device IDs the hardware supports makes
> the MAPD commands pass and ThunderX boot.

Ah, thanks for the heads up.
Obviously this "1024" is a hack.
Julien wanted to use platform-specific code for the Dom0 device mapping,
which could take care of those cases.
I am not so happy with this, since it requires code for each and every
platform. Instead I was thinking of using the PV PCI calls that Dom0
issues anyway to get an idea of how many devices we need, but this may
be too late in the boot process.
We would need to take a closer look into this.

> You can refer to Thunderx bdf numbers here
> https://github.com/vijaykilari/its_v6/commit/e1a8ec82ad2bb00b299727d0847b89671e9ba66d

Thanks for the link. To be honest, I deliberately didn't look into the
previous patches to get a clean start. I guess I will take a look now.

So you need something like 200,000 devices to cover those "three and
some" segments?
I guess this means no excuse anymore for postponing indirect mapping ;-)

Cheers,
Andre.



* Re: [RFC PATCH 12/24] ARM: vGICv3: introduce basic ITS emulation bits
  2016-10-09 14:20   ` Vijay Kilari
@ 2016-10-10 10:38     ` Andre Przywara
  0 siblings, 0 replies; 144+ messages in thread
From: Andre Przywara @ 2016-10-10 10:38 UTC (permalink / raw)
  To: Vijay Kilari; +Cc: xen-devel, Julien Grall, Stefano Stabellini

Hi,

On 09/10/16 15:20, Vijay Kilari wrote:
> On Wed, Sep 28, 2016 at 11:54 PM, Andre Przywara <andre.przywara@arm.com> wrote:
>> Create a new file to hold the emulation code for the ITS widget.
>> For now we emulate the memory mapped ITS registers and provide a stub
>> to introduce the ITS command handling framework (but without actually
>> emulating any commands at this time).
>>
>> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
>> ---
>>  xen/arch/arm/Makefile             |   1 +
>>  xen/arch/arm/vgic-its.c           | 378 ++++++++++++++++++++++++++++++++++++++
>>  xen/arch/arm/vgic-v3.c            |   9 -
>>  xen/include/asm-arm/gic_v3_defs.h |  19 ++
>>  4 files changed, 398 insertions(+), 9 deletions(-)
>>  create mode 100644 xen/arch/arm/vgic-its.c
>>
>> diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
>> index c2c4daa..cb0201f 100644
>> --- a/xen/arch/arm/Makefile
>> +++ b/xen/arch/arm/Makefile
>> @@ -44,6 +44,7 @@ obj-y += traps.o
>>  obj-y += vgic.o
>>  obj-y += vgic-v2.o
>>  obj-$(CONFIG_ARM_64) += vgic-v3.o
>> +obj-$(CONFIG_HAS_ITS) += vgic-its.o
>>  obj-y += vm_event.o
>>  obj-y += vtimer.o
>>  obj-y += vpsci.o
>> diff --git a/xen/arch/arm/vgic-its.c b/xen/arch/arm/vgic-its.c
>> new file mode 100644
>> index 0000000..875b992
>> --- /dev/null
>> +++ b/xen/arch/arm/vgic-its.c
>> @@ -0,0 +1,378 @@
>> +/*
>> + * xen/arch/arm/vgic-its.c
>> + *
>> + * ARM Interrupt Translation Service (ITS) emulation
>> + *
>> + * Andre Przywara <andre.przywara@arm.com>
>> + * Copyright (c) 2016 ARM Ltd.
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License as published by
>> + * the Free Software Foundation; either version 2 of the License, or
>> + * (at your option) any later version.
>> + *
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> + * GNU General Public License for more details.
>> + */
>> +
>> +#include <xen/bitops.h>
>> +#include <xen/config.h>
>> +#include <xen/domain_page.h>
>> +#include <xen/lib.h>
>> +#include <xen/init.h>
>> +#include <xen/softirq.h>
>> +#include <xen/irq.h>
>> +#include <xen/sched.h>
>> +#include <xen/sizes.h>
>> +#include <asm/current.h>
>> +#include <asm/mmio.h>
>> +#include <asm/gic_v3_defs.h>
>> +#include <asm/gic-its.h>
>> +#include <asm/vgic.h>
>> +#include <asm/vgic-emul.h>
>> +
>> +/* Data structure to describe a virtual ITS */
>> +struct virt_its {
>> +    struct domain *d;
>> +    struct host_its *hw_its;
>> +    spinlock_t vcmd_lock;       /* protects the virtual command buffer */
>> +    uint64_t cbaser;
>> +    uint64_t *cmdbuf;
>> +    int cwriter;
>> +    int creadr;
>> +    spinlock_t its_lock;        /* protects the collection and device tables */
>> +    uint64_t baser0, baser1;
>> +    uint16_t *coll_table;
>> +    int max_collections;
>> +    uint64_t *dev_table;
>> +    int max_devices;
>> +    bool enabled;
>> +};
>> +
>> +/* An Interrupt Translation Table Entry: this is indexed by a
>> + * DeviceID/EventID pair and is located in guest memory.
>> + */
>> +struct vits_itte
>> +{
>> +    uint64_t hlpi:24;
>> +    uint64_t vlpi:24;
>> +    uint64_t collection:16;
>> +};
>> +
>> +/**************************************
>> + * Functions that handle ITS commands *
>> + **************************************/
>> +
>> +static uint64_t its_cmd_mask_field(uint64_t *its_cmd,
>> +                                   int word, int shift, int size)
>> +{
>> +    return (le64_to_cpu(its_cmd[word]) >> shift) & (BIT(size) - 1);
>> +}
>> +
>> +#define its_cmd_get_command(cmd)        its_cmd_mask_field(cmd, 0,  0,  8)
>> +#define its_cmd_get_deviceid(cmd)       its_cmd_mask_field(cmd, 0, 32, 32)
>> +#define its_cmd_get_size(cmd)           its_cmd_mask_field(cmd, 1,  0,  5)
>> +#define its_cmd_get_id(cmd)             its_cmd_mask_field(cmd, 1,  0, 32)
>> +#define its_cmd_get_physical_id(cmd)    its_cmd_mask_field(cmd, 1, 32, 32)
>> +#define its_cmd_get_collection(cmd)     its_cmd_mask_field(cmd, 2,  0, 16)
>> +#define its_cmd_get_target_addr(cmd)    its_cmd_mask_field(cmd, 2, 16, 32)
>> +#define its_cmd_get_validbit(cmd)       its_cmd_mask_field(cmd, 2, 63,  1)
>> +
>> +#define ITS_CMD_BUFFER_SIZE(baser)      ((((baser) & 0xff) + 1) << 12)
>> +
>> +static int vgic_its_handle_cmds(struct domain *d, struct virt_its *its,
>> +                                uint32_t writer)
>> +{
>> +    uint64_t *cmdptr;
>> +
>> +    if ( !its->cmdbuf )
>> +        return -1;
>> +
>> +    if ( writer >= ITS_CMD_BUFFER_SIZE(its->cbaser) )
>> +        return -1;
>> +
>> +    spin_lock(&its->vcmd_lock);
>> +
>> +    while ( its->creadr != writer )
>> +    {
>> +        cmdptr = its->cmdbuf + (its->creadr / sizeof(*its->cmdbuf));
>> +        switch (its_cmd_get_command(cmdptr))
>> +        {
>> +        case GITS_CMD_SYNC:
>> +            /* We handle ITS commands synchronously, so we ignore SYNC. */
>> +           break;
>> +        default:
>> +            gdprintk(XENLOG_G_WARNING, "ITS: unhandled ITS command %ld\n",
>> +                   its_cmd_get_command(cmdptr));
>> +            break;
>> +        }
>> +
>> +        its->creadr += ITS_CMD_SIZE;
>> +        if ( its->creadr == ITS_CMD_BUFFER_SIZE(its->cbaser) )
>> +            its->creadr = 0;
>> +    }
>> +    its->cwriter = writer;
>> +
>> +    spin_unlock(&its->vcmd_lock);
>> +
>> +    return 0;
>> +}
>> +
>> +/*****************************
>> + * ITS registers read access *
>> + *****************************/
>> +
>> +/* The physical address is encoded slightly differently depending on
>> + * the used page size: the highest four bits are stored in the lowest
>> + * four bits of the field for 64K pages.
>> + */
>> +static paddr_t get_baser_phys_addr(uint64_t reg)
>> +{
>> +    if ( reg & BIT(9) )
>> +        return (reg & GENMASK(47, 16)) | ((reg & GENMASK(15, 12)) << 36);
>> +    else
>> +        return reg & GENMASK(47, 12);
>> +}
>> +
>> +static int vgic_v3_its_mmio_read(struct vcpu *v, mmio_info_t *info,
>> +                                 register_t *r, void *priv)
>> +{
>> +    struct virt_its *its = priv;
>> +
>> +    switch ( info->gpa & 0xffff )
>> +    {
>> +    case VREG32(GITS_CTLR):
>> +        if ( info->dabt.size != DABT_WORD ) goto bad_width;
>> +        *r = vgic_reg32_extract(its->enabled | BIT(31), info);
>> +       break;
>> +    case VREG32(GITS_IIDR):
>> +        if ( info->dabt.size != DABT_WORD ) goto bad_width;
>> +        *r = vgic_reg32_extract(GITS_IIDR_VALUE, info);
>> +        break;
>> +    case VREG64(GITS_TYPER):
>> +        if ( info->dabt.size < DABT_WORD ) goto bad_width;
>> +        *r = vgic_reg64_extract(0x1eff1, info);
> 
>      Here you are limiting DevID bits to 15, which is not enough

Right, I wasn't aware of those segments Cavium uses.
I will try to find out other vendors' requirements for maximum device
IDs and adjust this number accordingly.
This also seems to be one of the places where I missed changing the
hacked-up numbers into nice #defines ;-)
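For the record, the DeviceID width encoded in a hard-coded GITS_TYPER value such as the 0x1eff1 above can be decoded like this (a standalone sketch following the GICv3 field layout, where Devbits occupies bits [17:13] and holds the width minus one; the helper name is made up):

```c
#include <assert.h>
#include <stdint.h>

/* Devbits lives in GITS_TYPER bits [17:13] and encodes the number of
 * supported DeviceID bits minus one. */
static unsigned int gits_typer_devbits(uint64_t typer)
{
    return (unsigned int)((typer >> 13) & 0x1f) + 1;
}
```

With 0x1eff1 this yields a 16-bit DeviceID space (raw field value 15), which is what the review comment is objecting to.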

Cheers,
Andre.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [RFC PATCH 02/24] ARM: GICv3: allocate LPI pending and property table
  2016-09-28 18:24 ` [RFC PATCH 02/24] ARM: GICv3: allocate LPI pending and property table Andre Przywara
@ 2016-10-24 14:28   ` Vijay Kilari
  2016-11-02 16:22     ` Andre Przywara
  2016-10-26  1:10   ` Stefano Stabellini
  2016-11-01 17:22   ` Julien Grall
  2 siblings, 1 reply; 144+ messages in thread
From: Vijay Kilari @ 2016-10-24 14:28 UTC (permalink / raw)
  To: Andre Przywara; +Cc: xen-devel, Julien Grall, Stefano Stabellini

On Wed, Sep 28, 2016 at 11:54 PM, Andre Przywara <andre.przywara@arm.com> wrote:
> The ARM GICv3 ITS provides a new kind of interrupt called LPIs.
> The pending bits and the configuration data (priority, enable bits) for
> those LPIs are stored in tables in normal memory, which software has to
> provide to the hardware.
> Allocate the required memory, initialize it and hand it over to each
> ITS. We limit the number of LPIs we use with a compile time constant to
> avoid wasting memory.
>
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> ---
>  xen/arch/arm/Kconfig              |  6 ++++
>  xen/arch/arm/efi/efi-boot.h       |  1 -
>  xen/arch/arm/gic-its.c            | 76 +++++++++++++++++++++++++++++++++++++++
>  xen/arch/arm/gic-v3.c             | 27 ++++++++++++++
>  xen/include/asm-arm/cache.h       |  4 +++
>  xen/include/asm-arm/gic-its.h     | 22 +++++++++++-
>  xen/include/asm-arm/gic_v3_defs.h | 48 ++++++++++++++++++++++++-
>  7 files changed, 181 insertions(+), 3 deletions(-)
>
> diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
> index 9fe3b8e..66e2bb8 100644
> --- a/xen/arch/arm/Kconfig
> +++ b/xen/arch/arm/Kconfig
> @@ -50,6 +50,12 @@ config HAS_ITS
>          depends on ARM_64
>          depends on HAS_GICV3
>
> +config HOST_LPI_BITS
> +        depends on HAS_ITS
> +        int "Maximum bits for GICv3 host LPIs (14-32)"
> +        range 14 32
> +        default "20"
> +
>  config ALTERNATIVE
>         bool
>
> diff --git a/xen/arch/arm/efi/efi-boot.h b/xen/arch/arm/efi/efi-boot.h
> index 045d6ce..dc64aec 100644
> --- a/xen/arch/arm/efi/efi-boot.h
> +++ b/xen/arch/arm/efi/efi-boot.h
> @@ -10,7 +10,6 @@
>  #include "efi-dom0.h"
>
>  void noreturn efi_xen_start(void *fdt_ptr, uint32_t fdt_size);
> -void __flush_dcache_area(const void *vaddr, unsigned long size);
>
>  #define DEVICE_TREE_GUID \
>  {0xb1b621d5, 0xf19c, 0x41a5, {0x83, 0x0b, 0xd9, 0x15, 0x2c, 0x69, 0xaa, 0xe0}}
> diff --git a/xen/arch/arm/gic-its.c b/xen/arch/arm/gic-its.c
> index 0f42a77..b52dff3 100644
> --- a/xen/arch/arm/gic-its.c
> +++ b/xen/arch/arm/gic-its.c
> @@ -20,10 +20,86 @@
>  #include <xen/lib.h>
>  #include <xen/device_tree.h>
>  #include <xen/libfdt/libfdt.h>
> +#include <asm/p2m.h>
>  #include <asm/gic.h>
>  #include <asm/gic_v3_defs.h>
>  #include <asm/gic-its.h>
>
> +/* Global state */
> +static struct {
> +    uint8_t *lpi_property;
> +    int host_lpi_bits;
> +} lpi_data;
> +
> +/* Pending table for each redistributor */
> +static DEFINE_PER_CPU(void *, pending_table);
> +
> +#define MAX_HOST_LPI_BITS                                                \
> +        min_t(unsigned int, lpi_data.host_lpi_bits, CONFIG_HOST_LPI_BITS)
> +#define MAX_HOST_LPIS   (BIT(MAX_HOST_LPI_BITS) - 8192)
> +
> +uint64_t gicv3_lpi_allocate_pendtable(void)
> +{
> +    uint64_t reg, attr;
> +    void *pendtable;
> +
> +    attr  = GIC_BASER_CACHE_RaWaWb << GICR_PENDBASER_INNER_CACHEABILITY_SHIFT;
> +    attr |= GIC_BASER_CACHE_SameAsInner << GICR_PENDBASER_OUTER_CACHEABILITY_SHIFT;
> +    attr |= GIC_BASER_InnerShareable << GICR_PENDBASER_SHAREABILITY_SHIFT;
> +
> +    /*
> +     * The pending table holds one bit per LPI, so we need three bits less
> +     * than the number of LPI_BITs. But the alignment requirement from the
> +     * ITS is 64K, so make order at least 16 (-12).
> +     */
> +    pendtable = alloc_xenheap_pages(MAX(lpi_data.host_lpi_bits - 3, 16) - 12, 0);

    The allocated pending table size differs from the property table size?

> +    if ( !pendtable )
> +        return 0;
> +
> +    memset(pendtable, 0, BIT(lpi_data.host_lpi_bits - 3));
         The memset size is different from the allocated size?
         Shouldn't the zeroed pending table also be flushed?
> +    this_cpu(pending_table) = pendtable;
> +
> +    reg  = attr | GICR_PENDBASER_PTZ;
> +    reg |= virt_to_maddr(pendtable) & GENMASK(51, 16);
     You can use __pa() instead of virt_to_maddr().
     Isn't it GENMASK(47, 12) here?

> +
> +    return reg;
> +}
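The sizing reasoning in the comment above (one bit per LPI, but at least 64K for the ITS alignment requirement, with the Xen page order relative to 4K pages) can be sketched as a small helper; this is an illustrative model with a made-up name, not code from the patch:

```c
#include <assert.h>

/* One bit per LPI means 2^(lpi_bits - 3) bytes; the ITS requires the
 * pending table to be 64K (2^16 bytes) aligned, and the allocation
 * order is in units of 4K (2^12 byte) pages. */
static unsigned int pendtable_order(unsigned int lpi_bits)
{
    unsigned int size_bits = lpi_bits - 3;

    if ( size_bits < 16 )
        size_bits = 16;

    return size_bits - 12;
}
```

For the default CONFIG_HOST_LPI_BITS of 20 this gives order 5 (a 128K table), while the minimum of 14 LPI bits still allocates the 64K-aligned minimum (order 4).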
> +
> +uint64_t gicv3_lpi_get_proptable()
> +{
> +    uint64_t attr;
> +    static uint64_t reg = 0;
> +
> +    /* The property table is shared across all redistributors. */
> +    if ( reg )
> +        return reg;
> +
> +    attr  = GIC_BASER_CACHE_RaWaWb << GICR_PENDBASER_INNER_CACHEABILITY_SHIFT;
> +    attr |= GIC_BASER_CACHE_SameAsInner << GICR_PENDBASER_OUTER_CACHEABILITY_SHIFT;
> +    attr |= GIC_BASER_InnerShareable << GICR_PENDBASER_SHAREABILITY_SHIFT;

     You are using the PENDBASER definitions for PROPBASER?

> +
> +    lpi_data.lpi_property = alloc_xenheap_pages(MAX_HOST_LPI_BITS - 12, 0);
> +    if ( !lpi_data.lpi_property )
> +        return 0;
> +
> +    memset(lpi_data.lpi_property, GIC_PRI_IRQ | LPI_PROP_RES1, MAX_HOST_LPIS);
> +    __flush_dcache_area(lpi_data.lpi_property, MAX_HOST_LPIS);
> +
> +    reg  = attr | ((MAX_HOST_LPI_BITS - 1) << 0);
> +    reg |= virt_to_maddr(lpi_data.lpi_property) & GENMASK(51, 12);
   Isn't it GENMASK(47, 12)?
> +
> +    return reg;
> +}
> +
> +int gicv3_lpi_init_host_lpis(int lpi_bits)
> +{
> +    lpi_data.host_lpi_bits = lpi_bits;
> +
> +    printk("GICv3: using at most %ld LPIs on the host.\n", MAX_HOST_LPIS);
> +
> +    return 0;
> +}
> +
>  void gicv3_its_dt_init(const struct dt_device_node *node)
>  {
>      const struct dt_device_node *its = NULL;
> diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
> index 238da84..2534aa5 100644
> --- a/xen/arch/arm/gic-v3.c
> +++ b/xen/arch/arm/gic-v3.c
> @@ -546,6 +546,9 @@ static void __init gicv3_dist_init(void)
>      type = readl_relaxed(GICD + GICD_TYPER);
>      nr_lines = 32 * ((type & GICD_TYPE_LINES) + 1);
>
> +    if ( type & GICD_TYPE_LPIS )
> +        gicv3_lpi_init_host_lpis(((type >> GICD_TYPE_ID_BITS_SHIFT) & 0x1f) + 1);
> +
>      printk("GICv3: %d lines, (IID %8.8x).\n",
>             nr_lines, readl_relaxed(GICD + GICD_IIDR));
>
> @@ -615,6 +618,26 @@ static int gicv3_enable_redist(void)
>
>      return 0;
>  }
> +static void gicv3_rdist_init_lpis(void __iomem * rdist_base)
> +{
> +    uint32_t reg;
> +    uint64_t table_reg;
> +
> +    if ( list_empty(&host_its_list) )
> +        return;
> +
> +    /* Make sure LPIs are disabled before setting up the BASERs. */
> +    reg = readl_relaxed(rdist_base + GICR_CTLR);
> +    writel_relaxed(reg & ~GICR_CTLR_ENABLE_LPIS, rdist_base + GICR_CTLR);
> +
> +    table_reg = gicv3_lpi_allocate_pendtable();
> +    if ( table_reg )
> +        writeq_relaxed(table_reg, rdist_base + GICR_PENDBASER);
> +
> +    table_reg = gicv3_lpi_get_proptable();

   Here the LPI property table is allocated per CPU. One property table
should be enough and can be shared by all CPUs.

> +    if ( table_reg )
> +        writeq_relaxed(table_reg, rdist_base + GICR_PROPBASER);
    After updating the GICR_PENDBASER and GICR_PROPBASER registers,
shouldn't we read them back and check whether the shareability bits are
supported by the hardware, like it is done in the Linux driver?

> +}
>
>  static int __init gicv3_populate_rdist(void)
>  {
> @@ -658,6 +681,10 @@ static int __init gicv3_populate_rdist(void)
>              if ( (typer >> 32) == aff )
>              {
>                  this_cpu(rbase) = ptr;
> +
> +                if ( typer & GICR_TYPER_PLPIS )
> +                    gicv3_rdist_init_lpis(ptr);
> +
>                  printk("GICv3: CPU%d: Found redistributor in region %d @%p\n",
>                          smp_processor_id(), i, ptr);
>                  return 0;
> diff --git a/xen/include/asm-arm/cache.h b/xen/include/asm-arm/cache.h
> index 2de6564..af96eee 100644
> --- a/xen/include/asm-arm/cache.h
> +++ b/xen/include/asm-arm/cache.h
> @@ -7,6 +7,10 @@
>  #define L1_CACHE_SHIFT  (CONFIG_ARM_L1_CACHE_SHIFT)
>  #define L1_CACHE_BYTES  (1 << L1_CACHE_SHIFT)
>
> +#ifndef __ASSEMBLY__
> +void __flush_dcache_area(const void *vaddr, unsigned long size);
> +#endif
> +
>  #define __read_mostly __section(".data.read_mostly")
>
>  #endif
> diff --git a/xen/include/asm-arm/gic-its.h b/xen/include/asm-arm/gic-its.h
> index 2f5c51c..48c6c78 100644
> --- a/xen/include/asm-arm/gic-its.h
> +++ b/xen/include/asm-arm/gic-its.h
> @@ -36,12 +36,32 @@ extern struct list_head host_its_list;
>  /* Parse the host DT and pick up all host ITSes. */
>  void gicv3_its_dt_init(const struct dt_device_node *node);
>
> +/* Allocate and initialize tables for each host redistributor.
> + * Returns the respective {PROP,PEND}BASER register value.
> + */
> +uint64_t gicv3_lpi_get_proptable(void);
> +uint64_t gicv3_lpi_allocate_pendtable(void);
> +
> +/* Initialize the host structures for LPIs. */
> +int gicv3_lpi_init_host_lpis(int nr_lpis);
> +
>  #else
>
>  static inline void gicv3_its_dt_init(const struct dt_device_node *node)
>  {
>  }
> -
> +static inline uint64_t gicv3_lpi_get_proptable(void)
> +{
> +    return 0;
> +}
> +static inline uint64_t gicv3_lpi_allocate_pendtable(void)
> +{
> +    return 0;
> +}
> +static inline int gicv3_lpi_init_host_lpis(int nr_lpis)
> +{
> +    return 0;
> +}
>  #endif /* CONFIG_HAS_ITS */
>
>  #endif /* __ASSEMBLY__ */
> diff --git a/xen/include/asm-arm/gic_v3_defs.h b/xen/include/asm-arm/gic_v3_defs.h
> index 6bd25a5..da5fb77 100644
> --- a/xen/include/asm-arm/gic_v3_defs.h
> +++ b/xen/include/asm-arm/gic_v3_defs.h
> @@ -44,7 +44,8 @@
>  #define GICC_SRE_EL2_ENEL1           (1UL << 3)
>
>  /* Additional bits in GICD_TYPER defined by GICv3 */
> -#define GICD_TYPE_ID_BITS_SHIFT 19
> +#define GICD_TYPE_ID_BITS_SHIFT      19
> +#define GICD_TYPE_LPIS               (1U << 17)
>
>  #define GICD_CTLR_RWP                (1UL << 31)
>  #define GICD_CTLR_ARE_NS             (1U << 4)
> @@ -95,12 +96,57 @@
>  #define GICR_IGRPMODR0               (0x0D00)
>  #define GICR_NSACR                   (0x0E00)
>
> +#define GICR_CTLR_ENABLE_LPIS        (1U << 0)
>  #define GICR_TYPER_PLPIS             (1U << 0)
>  #define GICR_TYPER_VLPIS             (1U << 1)
>  #define GICR_TYPER_LAST              (1U << 4)
>
> +#define GIC_BASER_CACHE_nCnB         0ULL
> +#define GIC_BASER_CACHE_SameAsInner  0ULL
> +#define GIC_BASER_CACHE_nC           1ULL
> +#define GIC_BASER_CACHE_RaWt         2ULL
> +#define GIC_BASER_CACHE_RaWb         3ULL
> +#define GIC_BASER_CACHE_WaWt         4ULL
> +#define GIC_BASER_CACHE_WaWb         5ULL
> +#define GIC_BASER_CACHE_RaWaWt       6ULL
> +#define GIC_BASER_CACHE_RaWaWb       7ULL
> +#define GIC_BASER_CACHE_MASK         7ULL
> +#define GIC_BASER_NonShareable       0ULL
> +#define GIC_BASER_InnerShareable     1ULL
> +#define GIC_BASER_OuterShareable     2ULL
> +
> +#define GICR_PROPBASER_SHAREABILITY_SHIFT               10
> +#define GICR_PROPBASER_INNER_CACHEABILITY_SHIFT         7
> +#define GICR_PROPBASER_OUTER_CACHEABILITY_SHIFT         56
> +#define GICR_PROPBASER_SHAREABILITY_MASK                     \
> +        (3UL << GICR_PROPBASER_SHAREABILITY_SHIFT)
> +#define GICR_PROPBASER_INNER_CACHEABILITY_MASK               \
> +        (7UL << GICR_PROPBASER_INNER_CACHEABILITY_SHIFT)
> +#define GICR_PROPBASER_OUTER_CACHEABILITY_MASK               \
> +        (7UL << GICR_PROPBASER_OUTER_CACHEABILITY_SHIFT)
> +#define PROPBASER_RES0_MASK                                  \
> +        (GENMASK(63, 59) | GENMASK(55, 52) | GENMASK(6, 5))
> +
> +#define GICR_PENDBASER_SHAREABILITY_SHIFT               10
> +#define GICR_PENDBASER_INNER_CACHEABILITY_SHIFT         7
> +#define GICR_PENDBASER_OUTER_CACHEABILITY_SHIFT         56
> +#define GICR_PENDBASER_SHAREABILITY_MASK                     \
> +       (3UL << GICR_PENDBASER_SHAREABILITY_SHIFT)
> +#define GICR_PENDBASER_INNER_CACHEABILITY_MASK               \
> +       (7UL << GICR_PENDBASER_INNER_CACHEABILITY_SHIFT)
> +#define GICR_PENDBASER_OUTER_CACHEABILITY_MASK               \
> +        (7UL << GICR_PENDBASER_OUTER_CACHEABILITY_SHIFT)
> +#define GICR_PENDBASER_PTZ                              BIT(62)
> +#define PENDBASER_RES0_MASK                                  \
> +        (BIT(63) | GENMASK(61, 59) | GENMASK(55, 52) |       \
> +         GENMASK(15, 12) | GENMASK(6, 0))
> +
>  #define DEFAULT_PMR_VALUE            0xff
>
> +#define LPI_PROP_DEFAULT_PRIO        0xa0
> +#define LPI_PROP_RES1                (1 << 1)
> +#define LPI_PROP_ENABLED             (1 << 0)
> +
>  #define GICH_VMCR_EOI                (1 << 9)
>  #define GICH_VMCR_VENG1              (1 << 1)
>
> --
> 2.9.0
>
>



* Re: [RFC PATCH 03/24] ARM: GICv3 ITS: allocate device and collection table
  2016-09-28 18:24 ` [RFC PATCH 03/24] ARM: GICv3 ITS: allocate device and collection table Andre Przywara
  2016-10-09 13:55   ` Vijay Kilari
@ 2016-10-24 14:30   ` Vijay Kilari
  2016-11-02 17:51     ` Andre Przywara
  2016-10-26 22:57   ` Stefano Stabellini
  2016-11-01 18:19   ` Julien Grall
  3 siblings, 1 reply; 144+ messages in thread
From: Vijay Kilari @ 2016-10-24 14:30 UTC (permalink / raw)
  To: Andre Przywara; +Cc: xen-devel, Julien Grall, Stefano Stabellini

On Wed, Sep 28, 2016 at 11:54 PM, Andre Przywara <andre.przywara@arm.com> wrote:
> Each ITS maps a pair of a DeviceID (usually the PCI b/d/f triplet) and
> an EventID (the MSI payload or interrupt ID) to a pair of LPI number
> and collection ID, which points to the target CPU.
> This mapping is stored in the device and collection tables, which software
> has to provide for the ITS to use.
> Allocate the required memory and hand it the ITS.
> We limit the number of devices to cover 4 PCI busses for now.

   ThunderX has more than 4 PCI busses.

>
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> ---
>  xen/arch/arm/gic-its.c        | 114 ++++++++++++++++++++++++++++++++++++++++++
>  xen/arch/arm/gic-v3.c         |   5 ++
>  xen/include/asm-arm/gic-its.h |  49 +++++++++++++++++-
>  3 files changed, 167 insertions(+), 1 deletion(-)
>
> diff --git a/xen/arch/arm/gic-its.c b/xen/arch/arm/gic-its.c
> index b52dff3..40238a2 100644
> --- a/xen/arch/arm/gic-its.c
> +++ b/xen/arch/arm/gic-its.c
> @@ -21,6 +21,7 @@
>  #include <xen/device_tree.h>
>  #include <xen/libfdt/libfdt.h>
>  #include <asm/p2m.h>
> +#include <asm/io.h>
>  #include <asm/gic.h>
>  #include <asm/gic_v3_defs.h>
>  #include <asm/gic-its.h>
> @@ -38,6 +39,119 @@ static DEFINE_PER_CPU(void *, pending_table);
>          min_t(unsigned int, lpi_data.host_lpi_bits, CONFIG_HOST_LPI_BITS)
>  #define MAX_HOST_LPIS   (BIT(MAX_HOST_LPI_BITS) - 8192)
>
> +#define BASER_ATTR_MASK                                           \
> +        ((0x3UL << GITS_BASER_SHAREABILITY_SHIFT)               | \
> +         (0x7UL << GITS_BASER_OUTER_CACHEABILITY_SHIFT)         | \
> +         (0x7UL << GITS_BASER_INNER_CACHEABILITY_SHIFT))
> +#define BASER_RO_MASK   (GENMASK(52, 48) | GENMASK(58, 56))
> +
> +static uint64_t encode_phys_addr(paddr_t addr, int page_bits)
> +{
> +    uint64_t ret;
> +
> +    if ( page_bits < 16)
> +        return (uint64_t)addr & GENMASK(47, page_bits);
> +
> +    ret = addr & GENMASK(47, 16);
> +    return ret | (addr & GENMASK(51, 48)) >> (48 - 12);

    What are this mask and shift for?
> +}
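To illustrate the question above: when a 64K page size is selected, the ITS expects physical-address bits [51:48] in register bits [15:12], which is what the `>> (48 - 12)` shift implements. A standalone round-trip sketch (mirroring, not replacing, the patch code; the GENMASK-style macro and helper names are made up):

```c
#include <assert.h>
#include <stdint.h>

/* Bits h..l set, inclusive (a local stand-in for Xen's GENMASK). */
#define MASK(h, l)  ((~0ULL << (l)) & (~0ULL >> (63 - (h))))

/* Move address bits [51:48] down into register bits [15:12], as
 * GITS_BASERn requires when a 64K page size is programmed. */
static uint64_t encode_64k(uint64_t addr)
{
    return (addr & MASK(47, 16)) | ((addr & MASK(51, 48)) >> (48 - 12));
}

/* Inverse operation: recover the physical address from the register. */
static uint64_t decode_64k(uint64_t reg)
{
    return (reg & MASK(47, 16)) | ((reg & MASK(15, 12)) << (48 - 12));
}
```

So for addresses above 2^48 the top nibble is stashed in the low nibble of the address field; for sub-48-bit addresses the encoding is an identity.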
> +
> +static int gicv3_map_baser(void __iomem *basereg, uint64_t regc, int nr_items)
> +{
> +    uint64_t attr;
> +    int entry_size = (regc >> GITS_BASER_ENTRY_SIZE_SHIFT) & 0x1f;
> +    int pagesz;
> +    int order;
> +    void *buffer = NULL;
> +
> +    attr  = GIC_BASER_InnerShareable << GITS_BASER_SHAREABILITY_SHIFT;
> +    attr |= GIC_BASER_CACHE_SameAsInner << GITS_BASER_OUTER_CACHEABILITY_SHIFT;
> +    attr |= GIC_BASER_CACHE_RaWaWb << GITS_BASER_INNER_CACHEABILITY_SHIFT;
> +
> +    /*
> +     * Loop over the page sizes (4K, 16K, 64K) to find out what the host
> +     * supports.
> +     */
> +    for (pagesz = 0; pagesz < 3; pagesz++)
> +    {
> +        uint64_t reg;
> +        int nr_bytes;
> +
> +        nr_bytes = ROUNDUP(nr_items * entry_size, BIT(pagesz * 2 + 12));
> +        order = get_order_from_bytes(nr_bytes);
> +
> +        if ( !buffer )
> +            buffer = alloc_xenheap_pages(order, 0);
           Don't we need to zero all the pages before handing the
memory to the ITS hardware?

> +        if ( !buffer )
> +            return -ENOMEM;
> +
> +        reg  = attr;
> +        reg |= (pagesz << GITS_BASER_PAGE_SIZE_SHIFT);
> +        reg |= nr_bytes >> (pagesz * 2 + 12);
> +        reg |= regc & BASER_RO_MASK;
> +        reg |= GITS_BASER_VALID;
> +        reg |= encode_phys_addr(virt_to_maddr(buffer), pagesz * 2 + 12);
> +
> +        writeq_relaxed(reg, basereg);
> +        regc = readl_relaxed(basereg);
> +
> +        /* The host didn't like our attributes, just use what it returned. */
> +        if ( (regc & BASER_ATTR_MASK) != attr )
> +            attr = regc & BASER_ATTR_MASK;
> +
> +        /* If the host accepted our page size, we are done. */
> +        if ( (reg & (3UL << GITS_BASER_PAGE_SIZE_SHIFT)) == pagesz )
> +            return 0;
> +
> +        /* Check whether our buffer is aligned to the next page size already. */
> +        if ( !(virt_to_maddr(buffer) & (BIT(pagesz * 2 + 12 + 2) - 1)) )
> +        {
> +            free_xenheap_pages(buffer, order);
> +            buffer = NULL;
> +        }
> +    }
> +
> +    if ( buffer )
> +        free_xenheap_pages(buffer, order);
> +
> +    return -EINVAL;
> +}
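The table sizing inside the probe loop above can be modelled as follows (an illustrative helper with a made-up name; pagesz values 0, 1 and 2 select 4K, 16K and 64K ITS pages, matching the `pagesz * 2 + 12` expression in the patch):

```c
#include <assert.h>

/* Round a table of nr_items entries of entry_size bytes each up to a
 * multiple of the candidate ITS page size, 2^(pagesz * 2 + 12) bytes. */
static unsigned long its_table_bytes(unsigned int nr_items,
                                     unsigned int entry_size,
                                     unsigned int pagesz)
{
    unsigned long page = 1UL << (pagesz * 2 + 12);
    unsigned long bytes = (unsigned long)nr_items * entry_size;

    return (bytes + page - 1) & ~(page - 1);
}
```

E.g. 1024 device-table entries of 8 bytes fit exactly into two 4K pages, but round up to a full 64K page when only 64K pages are supported.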
> +
> +int gicv3_its_init(struct host_its *hw_its)
> +{
> +    uint64_t reg;
> +    int i;
> +
> +    hw_its->its_base = ioremap_nocache(hw_its->addr, hw_its->size);
> +    if ( !hw_its->its_base )
> +        return -ENOMEM;
> +
> +    for (i = 0; i < 8; i++)
> +    {
> +        void __iomem *basereg = hw_its->its_base + GITS_BASER0 + i * 8;
> +        int type;
> +
> +        reg = readq_relaxed(basereg);
> +        type = (reg >> 56) & 0x7;

      Please define macros for these constants.
> +        switch ( type )
> +        {
> +        case GITS_BASER_TYPE_NONE:
> +            continue;
> +        case GITS_BASER_TYPE_DEVICE:
> +            /* TODO: find some better way of limiting the number of devices */
> +            gicv3_map_baser(basereg, reg, 1024);
> +            break;
> +        case GITS_BASER_TYPE_COLLECTION:
> +            gicv3_map_baser(basereg, reg, NR_CPUS);
> +            break;
> +        default:
> +            continue;
> +        }
> +    }
> +
> +    return 0;
> +}
> +
>  uint64_t gicv3_lpi_allocate_pendtable(void)
>  {
>      uint64_t reg, attr;
> diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
> index 2534aa5..5cf4618 100644
> --- a/xen/arch/arm/gic-v3.c
> +++ b/xen/arch/arm/gic-v3.c
> @@ -29,6 +29,7 @@
>  #include <xen/irq.h>
>  #include <xen/iocap.h>
>  #include <xen/sched.h>
> +#include <xen/err.h>
>  #include <xen/errno.h>
>  #include <xen/delay.h>
>  #include <xen/device_tree.h>
> @@ -1548,6 +1549,7 @@ static int __init gicv3_init(void)
>  {
>      int res, i;
>      uint32_t reg;
> +    struct host_its *hw_its;
>
>      if ( !cpu_has_gicv3 )
>      {
> @@ -1603,6 +1605,9 @@ static int __init gicv3_init(void)
>      res = gicv3_cpu_init();
>      gicv3_hyp_init();
>
> +    list_for_each_entry(hw_its, &host_its_list, entry)
> +        gicv3_its_init(hw_its);
> +
>      spin_unlock(&gicv3.lock);
>
>      return res;
> diff --git a/xen/include/asm-arm/gic-its.h b/xen/include/asm-arm/gic-its.h
> index 48c6c78..589b889 100644
> --- a/xen/include/asm-arm/gic-its.h
> +++ b/xen/include/asm-arm/gic-its.h
> @@ -18,6 +18,47 @@
>  #ifndef __ASM_ARM_ITS_H__
>  #define __ASM_ARM_ITS_H__
>
> +#define LPI_OFFSET      8192
> +
> +#define GITS_CTLR       (0x000)
> +#define GITS_IIDR       (0x004)
> +#define GITS_TYPER      (0x008)
> +#define GITS_CBASER     (0x080)
> +#define GITS_CWRITER    (0x088)
> +#define GITS_CREADR     (0x090)
> +#define GITS_BASER0     (0x100)
> +#define GITS_BASER1     (0x108)
> +#define GITS_BASER2     (0x110)
> +#define GITS_BASER3     (0x118)
> +#define GITS_BASER4     (0x120)
> +#define GITS_BASER5     (0x128)
> +#define GITS_BASER6     (0x130)
> +#define GITS_BASER7     (0x138)
> +
> +/* Register bits */
> +#define GITS_CTLR_ENABLE     0x1
> +#define GITS_IIDR_VALUE      0x34c
> +
> +#define GITS_BASER_VALID                BIT(63)
> +#define GITS_BASER_INDIRECT             BIT(62)
> +#define GITS_BASER_INNER_CACHEABILITY_SHIFT        59
> +#define GITS_BASER_TYPE_SHIFT           56
> +#define GITS_BASER_OUTER_CACHEABILITY_SHIFT        53
> +#define GITS_BASER_TYPE_NONE            0UL
> +#define GITS_BASER_TYPE_DEVICE          1UL
> +#define GITS_BASER_TYPE_VCPU            2UL
> +#define GITS_BASER_TYPE_CPU             3UL
> +#define GITS_BASER_TYPE_COLLECTION      4UL
> +#define GITS_BASER_TYPE_RESERVED5       5UL
> +#define GITS_BASER_TYPE_RESERVED6       6UL
> +#define GITS_BASER_TYPE_RESERVED7       7UL
> +#define GITS_BASER_ENTRY_SIZE_SHIFT     48
> +#define GITS_BASER_SHAREABILITY_SHIFT   10
> +#define GITS_BASER_PAGE_SIZE_SHIFT      8
> +#define GITS_BASER_RO_MASK              ((7UL << GITS_BASER_TYPE_SHIFT) | \
> +                                        (31UL << GITS_BASER_ENTRY_SIZE_SHIFT) |\
> +                                        GITS_BASER_INDIRECT)
> +
>  #ifndef __ASSEMBLY__
>  #include <xen/device_tree.h>
>
> @@ -27,6 +68,7 @@ struct host_its {
>      const struct dt_device_node *dt_node;
>      paddr_t addr;
>      paddr_t size;
> +    void __iomem *its_base;
>  };
>
>  extern struct list_head host_its_list;
> @@ -42,8 +84,9 @@ void gicv3_its_dt_init(const struct dt_device_node *node);
>  uint64_t gicv3_lpi_get_proptable(void);
>  uint64_t gicv3_lpi_allocate_pendtable(void);
>
> -/* Initialize the host structures for LPIs. */
> +/* Initialize the host structures for LPIs and the host ITSes. */
>  int gicv3_lpi_init_host_lpis(int nr_lpis);
> +int gicv3_its_init(struct host_its *hw_its);
>
>  #else
>
> @@ -62,6 +105,10 @@ static inline int gicv3_lpi_init_host_lpis(int nr_lpis)
>  {
>      return 0;
>  }
> +static inline int gicv3_its_init(struct host_its *hw_its)
> +{
> +    return 0;
> +}
>  #endif /* CONFIG_HAS_ITS */
>
>  #endif /* __ASSEMBLY__ */
> --
> 2.9.0
>
>



* Re: [RFC PATCH 04/24] ARM: GICv3 ITS: map ITS command buffer
  2016-09-28 18:24 ` [RFC PATCH 04/24] ARM: GICv3 ITS: map ITS command buffer Andre Przywara
@ 2016-10-24 14:31   ` Vijay Kilari
  2016-10-26 23:03   ` Stefano Stabellini
  2016-11-02 13:38   ` Julien Grall
  2 siblings, 0 replies; 144+ messages in thread
From: Vijay Kilari @ 2016-10-24 14:31 UTC (permalink / raw)
  To: Andre Przywara; +Cc: xen-devel, Julien Grall, Stefano Stabellini

On Wed, Sep 28, 2016 at 11:54 PM, Andre Przywara <andre.przywara@arm.com> wrote:
> Instead of directly manipulating the tables in memory, an ITS driver
> sends commands via a ring buffer to the ITS h/w to create or alter the
> LPI mappings.
> Allocate memory for that buffer and tell the ITS about it to be able
> to send ITS commands.
>
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> ---
>  xen/arch/arm/gic-its.c        | 25 +++++++++++++++++++++++++
>  xen/include/asm-arm/gic-its.h |  1 +
>  2 files changed, 26 insertions(+)
>
> diff --git a/xen/arch/arm/gic-its.c b/xen/arch/arm/gic-its.c
> index 40238a2..c8a7a7e 100644
> --- a/xen/arch/arm/gic-its.c
> +++ b/xen/arch/arm/gic-its.c
> @@ -18,6 +18,7 @@
>
>  #include <xen/config.h>
>  #include <xen/lib.h>
> +#include <xen/err.h>
>  #include <xen/device_tree.h>
>  #include <xen/libfdt/libfdt.h>
>  #include <asm/p2m.h>
> @@ -56,6 +57,26 @@ static uint64_t encode_phys_addr(paddr_t addr, int page_bits)
>      return ret | (addr & GENMASK(51, 48)) >> (48 - 12);
>  }
>
> +static void *gicv3_map_cbaser(void __iomem *cbasereg)
> +{
> +    uint64_t attr, reg;
> +    void *buffer;
> +
> +    attr  = GIC_BASER_InnerShareable << GITS_BASER_SHAREABILITY_SHIFT;
> +    attr |= GIC_BASER_CACHE_SameAsInner << GITS_BASER_OUTER_CACHEABILITY_SHIFT;
> +    attr |= GIC_BASER_CACHE_RaWaWb << GITS_BASER_INNER_CACHEABILITY_SHIFT;
> +
> +    buffer = alloc_xenheap_pages(0, 0);
> +    if ( !buffer )
> +        return ERR_PTR(-ENOMEM);
> +
> +    /* We use exactly one 4K page, so the "Size" field is 0. */
> +    reg = attr | BIT(63) | (virt_to_maddr(buffer) & GENMASK(51, 12));

Isn't it GENMASK(47, 12)? Am I referring to the wrong spec?



* Re: [RFC PATCH 12/24] ARM: vGICv3: introduce basic ITS emulation bits
  2016-09-28 18:24 ` [RFC PATCH 12/24] ARM: vGICv3: introduce basic ITS emulation bits Andre Przywara
  2016-10-09 14:20   ` Vijay Kilari
@ 2016-10-24 15:31   ` Vijay Kilari
  2016-11-03 19:26     ` Andre Przywara
  2016-11-03 17:50   ` Julien Grall
  2016-11-08 23:54   ` Stefano Stabellini
  3 siblings, 1 reply; 144+ messages in thread
From: Vijay Kilari @ 2016-10-24 15:31 UTC (permalink / raw)
  To: Andre Przywara; +Cc: xen-devel, Julien Grall, Stefano Stabellini

On Wed, Sep 28, 2016 at 11:54 PM, Andre Przywara <andre.przywara@arm.com> wrote:
> Create a new file to hold the emulation code for the ITS widget.
> For now we emulate the memory mapped ITS registers and provide a stub
> to introduce the ITS command handling framework (but without actually
> emulating any commands at this time).
>
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> ---
>  xen/arch/arm/Makefile             |   1 +
>  xen/arch/arm/vgic-its.c           | 378 ++++++++++++++++++++++++++++++++++++++
>  xen/arch/arm/vgic-v3.c            |   9 -
>  xen/include/asm-arm/gic_v3_defs.h |  19 ++
>  4 files changed, 398 insertions(+), 9 deletions(-)
>  create mode 100644 xen/arch/arm/vgic-its.c
>
> diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
> index c2c4daa..cb0201f 100644
> --- a/xen/arch/arm/Makefile
> +++ b/xen/arch/arm/Makefile
> @@ -44,6 +44,7 @@ obj-y += traps.o
>  obj-y += vgic.o
>  obj-y += vgic-v2.o
>  obj-$(CONFIG_ARM_64) += vgic-v3.o
> +obj-$(CONFIG_HAS_ITS) += vgic-its.o
>  obj-y += vm_event.o
>  obj-y += vtimer.o
>  obj-y += vpsci.o
> diff --git a/xen/arch/arm/vgic-its.c b/xen/arch/arm/vgic-its.c
> new file mode 100644
> index 0000000..875b992
> --- /dev/null
> +++ b/xen/arch/arm/vgic-its.c
> @@ -0,0 +1,378 @@
> +/*
> + * xen/arch/arm/vgic-its.c
> + *
> + * ARM Interrupt Translation Service (ITS) emulation
> + *
> + * Andre Przywara <andre.przywara@arm.com>
> + * Copyright (c) 2016 ARM Ltd.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + */
> +
> +#include <xen/bitops.h>
> +#include <xen/config.h>
> +#include <xen/domain_page.h>
> +#include <xen/lib.h>
> +#include <xen/init.h>
> +#include <xen/softirq.h>
> +#include <xen/irq.h>
> +#include <xen/sched.h>
> +#include <xen/sizes.h>
> +#include <asm/current.h>
> +#include <asm/mmio.h>
> +#include <asm/gic_v3_defs.h>
> +#include <asm/gic-its.h>
> +#include <asm/vgic.h>
> +#include <asm/vgic-emul.h>
> +
> +/* Data structure to describe a virtual ITS */
> +struct virt_its {
> +    struct domain *d;
> +    struct host_its *hw_its;
> +    spinlock_t vcmd_lock;       /* protects the virtual command buffer */
> +    uint64_t cbaser;
> +    uint64_t *cmdbuf;
> +    int cwriter;
> +    int creadr;
> +    spinlock_t its_lock;        /* protects the collection and device tables */
> +    uint64_t baser0, baser1;
> +    uint16_t *coll_table;
> +    int max_collections;
> +    uint64_t *dev_table;
> +    int max_devices;
> +    bool enabled;
> +};
> +
> +/* An Interrupt Translation Table Entry: this is indexed by a
> + * DeviceID/EventID pair and is located in guest memory.
> + */
> +struct vits_itte
> +{
> +    uint64_t hlpi:24;
> +    uint64_t vlpi:24;
> +    uint64_t collection:16;
> +};
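The three bitfields above pack exactly into one 64-bit word (24 + 24 + 16 bits). A sketch of the equivalent manual packing, useful for checking the layout (helper names are made up; this assumes the usual little-endian ABI where the first-declared bitfield occupies the low bits):

```c
#include <assert.h>
#include <stdint.h>

/* Pack host LPI, virtual LPI and collection ID the way the bitfield
 * struct above lays them out: hlpi in bits [23:0], vlpi in [47:24],
 * collection in [63:48]. */
static uint64_t itte_pack(uint32_t hlpi, uint32_t vlpi, uint16_t coll)
{
    return ((uint64_t)hlpi & 0xffffff) |
           (((uint64_t)vlpi & 0xffffff) << 24) |
           ((uint64_t)coll << 48);
}

/* Extract the virtual LPI number again. */
static uint32_t itte_vlpi(uint64_t itte)
{
    return (uint32_t)((itte >> 24) & 0xffffff);
}
```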
> +
> +/**************************************
> + * Functions that handle ITS commands *
> + **************************************/
> +
> +static uint64_t its_cmd_mask_field(uint64_t *its_cmd,
> +                                   int word, int shift, int size)
> +{
> +    return (le64_to_cpu(its_cmd[word]) >> shift) & (BIT(size) - 1);
> +}
> +
> +#define its_cmd_get_command(cmd)        its_cmd_mask_field(cmd, 0,  0,  8)
> +#define its_cmd_get_deviceid(cmd)       its_cmd_mask_field(cmd, 0, 32, 32)
> +#define its_cmd_get_size(cmd)           its_cmd_mask_field(cmd, 1,  0,  5)
> +#define its_cmd_get_id(cmd)             its_cmd_mask_field(cmd, 1,  0, 32)
> +#define its_cmd_get_physical_id(cmd)    its_cmd_mask_field(cmd, 1, 32, 32)
> +#define its_cmd_get_collection(cmd)     its_cmd_mask_field(cmd, 2,  0, 16)
> +#define its_cmd_get_target_addr(cmd)    its_cmd_mask_field(cmd, 2, 16, 32)
> +#define its_cmd_get_validbit(cmd)       its_cmd_mask_field(cmd, 2, 63,  1)
> +
> +#define ITS_CMD_BUFFER_SIZE(baser)      ((((baser) & 0xff) + 1) << 12)
> +
> +static int vgic_its_handle_cmds(struct domain *d, struct virt_its *its,
> +                                uint32_t writer)
> +{
> +    uint64_t *cmdptr;
> +
> +    if ( !its->cmdbuf )
> +        return -1;
> +
> +    if ( writer >= ITS_CMD_BUFFER_SIZE(its->cbaser) )
> +        return -1;
> +
> +    spin_lock(&its->vcmd_lock);
> +
> +    while ( its->creadr != writer )
> +    {
> +        cmdptr = its->cmdbuf + (its->creadr / sizeof(*its->cmdbuf));
> +        switch (its_cmd_get_command(cmdptr))
> +        {
> +        case GITS_CMD_SYNC:
> +            /* We handle ITS commands synchronously, so we ignore SYNC. */
> +           break;
> +        default:
> +            gdprintk(XENLOG_G_WARNING, "ITS: unhandled ITS command %ld\n",
> +                   its_cmd_get_command(cmdptr));
> +            break;
> +        }
> +
> +        its->creadr += ITS_CMD_SIZE;
> +        if ( its->creadr == ITS_CMD_BUFFER_SIZE(its->cbaser) )
> +            its->creadr = 0;
> +    }
> +    its->cwriter = writer;
> +
> +    spin_unlock(&its->vcmd_lock);
> +
> +    return 0;
> +}
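The read-pointer arithmetic in the loop above boils down to the following (a minimal model with a made-up helper name; ITS commands are 32 bytes each per the GICv3 spec):

```c
#include <assert.h>

#define ITS_CMD_SIZE 32U

/* CREADR advances one 32-byte command at a time and wraps back to
 * zero when it reaches the end of the command buffer. */
static unsigned int next_creadr(unsigned int creadr, unsigned int buf_size)
{
    creadr += ITS_CMD_SIZE;
    return (creadr == buf_size) ? 0 : creadr;
}
```

The loop keeps stepping CREADR this way until it catches up with the guest-written CWRITER value, which is why the SYNC command can be ignored: all commands complete before the write to GITS_CWRITER returns.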
> +
> +/*****************************
> + * ITS registers read access *
> + *****************************/
> +
> +/* The physical address is encoded slightly differently depending on
> + * the used page size: the highest four bits are stored in the lowest
> + * four bits of the field for 64K pages.
> + */
> +static paddr_t get_baser_phys_addr(uint64_t reg)
> +{
> +    if ( reg & BIT(9) )
> +        return (reg & GENMASK(47, 16)) | ((reg & GENMASK(15, 12)) << 36);
> +    else
> +        return reg & GENMASK(47, 12);
> +}
> +
> +static int vgic_v3_its_mmio_read(struct vcpu *v, mmio_info_t *info,
> +                                 register_t *r, void *priv)
> +{
> +    struct virt_its *its = priv;
> +
> +    switch ( info->gpa & 0xffff )
> +    {
> +    case VREG32(GITS_CTLR):
> +        if ( info->dabt.size != DABT_WORD ) goto bad_width;
> +        *r = vgic_reg32_extract(its->enabled | BIT(31), info);
> +       break;
> +    case VREG32(GITS_IIDR):
> +        if ( info->dabt.size != DABT_WORD ) goto bad_width;
> +        *r = vgic_reg32_extract(GITS_IIDR_VALUE, info);
> +        break;
> +    case VREG64(GITS_TYPER):
> +        if ( info->dabt.size < DABT_WORD ) goto bad_width;
> +        *r = vgic_reg64_extract(0x1eff1, info);
       GITS_TYPER.HCC is not set here; it should reflect the maximum
       number of VCPUs of the domain.
       GITS_TYPER.ID_bits is also just hard-coded (field value 15).
> +        break;
> +    case VREG64(GITS_CBASER):
> +        if ( info->dabt.size < DABT_WORD ) goto bad_width;
> +        *r = vgic_reg64_extract(its->cbaser, info);
> +        break;
> +    case VREG64(GITS_CWRITER):
> +        if ( info->dabt.size < DABT_WORD ) goto bad_width;
> +        *r = vgic_reg64_extract(its->cwriter, info);
> +        break;
> +    case VREG64(GITS_CREADR):
> +        if ( info->dabt.size < DABT_WORD ) goto bad_width;
> +        *r = vgic_reg64_extract(its->creadr, info);
> +        break;
> +    case VREG64(GITS_BASER0):
> +        if ( info->dabt.size < DABT_WORD ) goto bad_width;
> +        *r = vgic_reg64_extract(its->baser0, info);
> +        break;
> +    case VREG64(GITS_BASER1):
> +        if ( info->dabt.size < DABT_WORD ) goto bad_width;
> +        *r = vgic_reg64_extract(its->baser1, info);
> +        break;
> +    case VRANGE64(GITS_BASER2, GITS_BASER7):
> +        if ( info->dabt.size < DABT_WORD ) goto bad_width;
> +        *r = vgic_reg64_extract(0, info);
> +        break;
> +    case VREG32(GICD_PIDR2):
> +        if ( info->dabt.size != DABT_WORD ) goto bad_width;
> +        *r = vgic_reg32_extract(GICV3_GICD_PIDR2, info);
> +        break;
         A default case is missing here, so unhandled offsets fall
         through silently.
> +    }
> +
> +    return 1;
> +
> +bad_width:
    A diagnostic printk here would be helpful (like the one in the
    write handler below).
> +    domain_crash_synchronous();
> +
> +    return 0;
> +}
> +
> +/******************************
> + * ITS registers write access *
> + ******************************/
> +
> +static int its_baser_table_size(uint64_t baser)
> +{
> +    int page_size = 0;
> +
> +    switch ( (baser >> 8) & 3 )
> +    {
> +    case 0: page_size = SZ_4K; break;
> +    case 1: page_size = SZ_16K; break;
> +    case 2:
> +    case 3: page_size = SZ_64K; break;
> +    }
> +
> +    return page_size * ((baser & GENMASK(7, 0)) + 1);
> +}
> +
> +static int its_baser_nr_entries(uint64_t baser)
> +{
> +    int entry_size = ((baser & GENMASK(52, 48)) >> 48) + 1;
> +
> +    return its_baser_table_size(baser) / entry_size;
> +}
> +
> +static int vgic_v3_its_mmio_write(struct vcpu *v, mmio_info_t *info,
> +                                  register_t r, void *priv)
> +{
> +    struct domain *d = v->domain;
> +    struct virt_its *its = priv;
> +    uint64_t reg;
> +    uint32_t ctlr;
> +
> +    switch ( info->gpa & 0xffff )
> +    {
> +    case VREG32(GITS_CTLR):
> +        ctlr = its->enabled ? GITS_CTLR_ENABLE : 0;
> +        if ( info->dabt.size != DABT_WORD ) goto bad_width;
> +       vgic_reg32_update(&ctlr, r, info);
> +       its->enabled = ctlr & GITS_CTLR_ENABLE;
> +       /* TODO: trigger something ... */
> +        return 1;
> +    case VREG32(GITS_IIDR):
> +        goto write_ignore_32;
> +    case VREG32(GITS_TYPER):
> +        goto write_ignore_32;
> +    case VREG64(GITS_CBASER):
> +        if ( info->dabt.size < DABT_WORD ) goto bad_width;
> +
> +        /* Changing base registers with the ITS enabled is UNPREDICTABLE. */
> +        if ( its->enabled )
> +            return 1;
> +
> +        reg = its->cbaser;
> +        vgic_reg64_update(&reg, r, info);
> +        /* TODO: sanitise! */
> +        its->cbaser = reg;
> +
> +        if ( reg & BIT(63) )
> +        {
> +            its->cmdbuf = map_guest_pages(d, reg & GENMASK(51, 12), 1);

       Only one page of the guest command queue is mapped here. Once
       CWRITER moves beyond the first page, a panic is observed. All of
       the guest's command queue pages should be mapped.
> +        }
> +        else
> +        {
> +            unmap_guest_pages(its->cmdbuf, 1);
   Same here: only one page is unmapped.
> +            its->cmdbuf = NULL;
> +        }
> +
> +       return 1;
> +    case VREG64(GITS_CWRITER):
> +        if ( info->dabt.size < DABT_WORD ) goto bad_width;
> +        reg = its->cwriter;
> +        vgic_reg64_update(&reg, r, info);
> +        vgic_its_handle_cmds(d, its, reg);
> +        return 1;
> +    case VREG64(GITS_CREADR):
> +        goto write_ignore_64;
> +    case VREG64(GITS_BASER0):
> +        if ( info->dabt.size < DABT_WORD ) goto bad_width;
> +
> +        /* Changing base registers with the ITS enabled is UNPREDICTABLE. */
> +        if ( its->enabled )
> +            return 1;
> +
> +        reg = its->baser0;
> +        vgic_reg64_update(&reg, r, info);
> +
> +        reg &= ~GITS_BASER_RO_MASK;
> +        reg |= (sizeof(uint64_t) - 1) << GITS_BASER_ENTRY_SIZE_SHIFT;
> +        reg |= GITS_BASER_TYPE_DEVICE << GITS_BASER_TYPE_SHIFT;
> +        /* TODO: sanitise! */
> +        /* TODO: locking(?) */
> +
> +        if ( reg & GITS_BASER_VALID )
> +        {
> +            its->dev_table = map_guest_pages(d,
> +                                             get_baser_phys_addr(reg),
> +                                             its_baser_table_size(reg) >> PAGE_SHIFT);
> +            its->max_devices = its_baser_nr_entries(reg);
> +            memset(its->dev_table, 0, its->max_devices * sizeof(uint64_t));
> +        }
> +        else
> +        {
> +            unmap_guest_pages(its->dev_table,
> +                              its_baser_table_size(reg) >> PAGE_SHIFT);
> +            its->max_devices = 0;
> +        }
> +
> +        its->baser0 = reg;
> +        return 1;
> +    case VREG64(GITS_BASER1):
> +        if ( info->dabt.size < DABT_WORD ) goto bad_width;
> +
> +        /* Changing base registers with the ITS enabled is UNPREDICTABLE. */
> +        if ( its->enabled )
> +            return 1;
> +
> +        reg = its->baser1;
> +        vgic_reg64_update(&reg, r, info);
> +        reg &= ~GITS_BASER_RO_MASK;
> +        reg |= (sizeof(uint16_t) - 1) << GITS_BASER_ENTRY_SIZE_SHIFT;
> +        reg |= GITS_BASER_TYPE_COLLECTION << GITS_BASER_TYPE_SHIFT;
> +        /* TODO: sanitise! */
> +
> +        /* TODO: sort out locking */
> +        /* TODO: repeated calls: free old mapping */
> +        if ( reg & GITS_BASER_VALID )
> +        {
> +            its->coll_table = map_guest_pages(d, get_baser_phys_addr(reg),
> +                                              its_baser_table_size(reg) >> PAGE_SHIFT);
> +            its->max_collections = its_baser_nr_entries(reg);
> +            memset(its->coll_table, 0xff,
> +                   its->max_collections * sizeof(uint16_t));
> +        }
> +        else
> +        {
> +            unmap_guest_pages(its->coll_table,
> +                              its_baser_table_size(reg) >> PAGE_SHIFT);
> +            its->max_collections = 0;
> +        }
> +        its->baser1 = reg;
> +        return 1;
> +    case VRANGE64(GITS_BASER2, GITS_BASER7):
> +        goto write_ignore_64;
> +    default:
> +        gdprintk(XENLOG_G_WARNING, "ITS: unhandled ITS register 0x%lx\n",
> +                 info->gpa & 0xffff);
> +        return 0;
> +    }
> +
> +    return 1;
> +
> +write_ignore_64:
> +    if ( ! vgic_reg64_check_access(info->dabt) ) goto bad_width;
> +    return 1;
> +
> +write_ignore_32:
> +    if ( info->dabt.size != DABT_WORD ) goto bad_width;
> +    return 1;
> +
> +bad_width:
> +    printk(XENLOG_G_ERR "%pv vGICR: bad read width %d r%d offset %#08lx\n",
> +           v, info->dabt.size, info->dabt.reg, info->gpa & 0xffff);
> +
> +    domain_crash_synchronous();
> +
> +    return 0;
> +}
> +
> +static const struct mmio_handler_ops vgic_its_mmio_handler = {
> +    .read  = vgic_v3_its_mmio_read,
> +    .write = vgic_v3_its_mmio_write,
> +};
> +
> +/*
> + * Local variables:
> + * mode: C
> + * c-file-style: "BSD"
> + * c-basic-offset: 4
> + * indent-tabs-mode: nil
> + * End:
> + */
> diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c
> index 8fe8386..aa53a1e 100644
> --- a/xen/arch/arm/vgic-v3.c
> +++ b/xen/arch/arm/vgic-v3.c
> @@ -158,15 +158,6 @@ static void vgic_store_irouter(struct domain *d, struct vgic_irq_rank *rank,
>      rank->vcpu[offset] = new_vcpu->vcpu_id;
>  }
>
> -static inline bool vgic_reg64_check_access(struct hsr_dabt dabt)
> -{
> -    /*
> -     * 64 bits registers can be accessible using 32-bit and 64-bit unless
> -     * stated otherwise (See 8.1.3 ARM IHI 0069A).
> -     */
> -    return ( dabt.size == DABT_DOUBLE_WORD || dabt.size == DABT_WORD );
> -}
> -
>  static int __vgic_v3_rdistr_rd_mmio_read(struct vcpu *v, mmio_info_t *info,
>                                           uint32_t gicr_reg,
>                                           register_t *r)
> diff --git a/xen/include/asm-arm/gic_v3_defs.h b/xen/include/asm-arm/gic_v3_defs.h
> index da5fb77..6a91f5b 100644
> --- a/xen/include/asm-arm/gic_v3_defs.h
> +++ b/xen/include/asm-arm/gic_v3_defs.h
> @@ -147,6 +147,16 @@
>  #define LPI_PROP_RES1                (1 << 1)
>  #define LPI_PROP_ENABLED             (1 << 0)
>
> +/*
> + * PIDR2: Only bits[7:4] are not implementation defined. We are
> + * emulating a GICv3 ([7:4] = 0x3).
> + *
> + * We don't emulate a specific registers scheme so implement the others
> + * bits as RES0 as recommended by the spec (see 8.1.13 in ARM IHI 0069A).
> + */
> +#define GICV3_GICD_PIDR2  0x30
> +#define GICV3_GICR_PIDR2  GICV3_GICD_PIDR2
> +
>  #define GICH_VMCR_EOI                (1 << 9)
>  #define GICH_VMCR_VENG1              (1 << 1)
>
> @@ -190,6 +200,15 @@ struct rdist_region {
>      bool single_rdist;
>  };
>
> +/*
> + * 64 bits registers can be accessible using 32-bit and 64-bit unless
> + * stated otherwise (See 8.1.3 ARM IHI 0069A).
> + */
> +static inline bool vgic_reg64_check_access(struct hsr_dabt dabt)
> +{
> +    return ( dabt.size == DABT_DOUBLE_WORD || dabt.size == DABT_WORD );
> +}
> +
>  #endif /* __ASM_ARM_GIC_V3_DEFS_H__ */
>
>  /*
> --
> 2.9.0
>
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> https://lists.xen.org/xen-devel


^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [RFC PATCH 07/24] ARM: GICv3 ITS: introduce device mapping
  2016-09-28 18:24 ` [RFC PATCH 07/24] ARM: GICv3 ITS: introduce device mapping Andre Przywara
@ 2016-10-24 15:31   ` Vijay Kilari
  2016-11-03 19:33     ` Andre Przywara
  2016-10-28  0:08   ` Stefano Stabellini
  1 sibling, 1 reply; 144+ messages in thread
From: Vijay Kilari @ 2016-10-24 15:31 UTC (permalink / raw)
  To: Andre Przywara; +Cc: xen-devel, Julien Grall, Stefano Stabellini

On Wed, Sep 28, 2016 at 11:54 PM, Andre Przywara <andre.przywara@arm.com> wrote:
> The ITS uses device IDs to map LPIs to a device. Dom0 will later use
> those IDs, which we directly pass on to the host.
> For this we have to map each device that Dom0 may request to a host
> ITS device with the same identifier.
> Allocate the respective memory and enter each device into a list to
> later be able to iterate over it or to easily teardown guests.
>
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> ---
>  xen/arch/arm/gic-its.c        | 90 +++++++++++++++++++++++++++++++++++++++++++
>  xen/include/asm-arm/gic-its.h | 16 ++++++++
>  2 files changed, 106 insertions(+)
>
> diff --git a/xen/arch/arm/gic-its.c b/xen/arch/arm/gic-its.c
> index 2140e4a..bf1f5b5 100644
> --- a/xen/arch/arm/gic-its.c
> +++ b/xen/arch/arm/gic-its.c
> @@ -168,6 +168,94 @@ static int its_send_cmd_mapc(struct host_its *its, int collection_id, int cpu)
>      return its_send_command(its, cmd);
>  }
>
> +static int its_send_cmd_mapd(struct host_its *its, uint32_t deviceid,
> +                             int size, uint64_t itt_addr, bool valid)
> +{
> +    uint64_t cmd[4];
> +
> +    cmd[0] = GITS_CMD_MAPD | ((uint64_t)deviceid << 32);
> +    cmd[1] = size & GENMASK(4, 0);
> +    cmd[2] = itt_addr & GENMASK(51, 8);
> +    if ( valid )
> +        cmd[2] |= BIT(63);
> +    cmd[3] = 0x00;
> +
> +    return its_send_command(its, cmd);
> +}
> +
> +int gicv3_its_map_device(struct host_its *hw_its, struct domain *d,
> +                         int devid, int bits, bool valid)
> +{
> +    void *itt_addr = NULL;
> +    struct its_devices *dev, *temp;
> +    bool reuse_dev = false;
> +
> +    list_for_each_entry_safe(dev, temp, &hw_its->its_devices, entry)
> +    {
> +        if ( (dev->d->domain_id != d->domain_id) || (dev->devid != devid) )
> +            continue;
> +
> +        its_send_cmd_mapd(hw_its, dev->devid, 0, 0, false);
> +        xfree(dev->itt_addr);
> +        if ( !valid )
> +        {
> +            xfree(dev);
    xfree() should be done after list_del(); otherwise list_del()
    operates on freed memory.
> +            list_del(&dev->entry);
> +
> +            return 0;
> +        }
> +
> +        reuse_dev = true;
> +        break;
> +    }
> +
> +    if ( !valid )
> +        return 0;
> +
> +    itt_addr = _xmalloc(BIT(bits) * hw_its->itte_size, 256);
> +    if ( !itt_addr )
> +        return -ENOMEM;
> +
> +    if ( !reuse_dev )
> +    {
> +        dev = xmalloc(struct its_devices);
> +        if ( !dev )
> +            return -ENOMEM;
> +
> +        list_add_tail(&dev->entry, &hw_its->its_devices);
> +    }
> +
> +    dev->itt_addr = itt_addr;
> +    dev->d = d;
> +    dev->devid = devid;
> +
> +    return its_send_cmd_mapd(hw_its, devid, bits - 1,
> +                             itt_addr ? virt_to_maddr(itt_addr) : 0, true);
          The check on itt_addr is redundant; it is already known to be
          non-NULL at this point.

> +}
> +
> +/* Removing any connections a domain had to any ITS in the system. */
> +int its_remove_domain(struct domain *d)
> +{
> +    struct host_its *its;
> +    struct its_devices *dev, *temp;
> +
> +    list_for_each_entry(its, &host_its_list, entry)
> +    {
> +        list_for_each_entry_safe(dev, temp, &its->its_devices, entry)
> +        {
> +            if ( dev->d->domain_id != d->domain_id )
> +                continue;
> +
> +            its_send_cmd_mapd(its, dev->devid, 0, 0, false);
> +            xfree(dev->itt_addr);
> +            xfree(dev);

xfree() should be done after list_del(); otherwise list_del() operates
on freed memory.
> +            list_del(&dev->entry);
> +        }
        This code is the same as above. Could it be moved into a
        separate helper function?

> +    }
> +
> +    return 0;
> +}
> +
>  /* Set up the (1:1) collection mapping for the given host CPU. */
>  void gicv3_its_setup_collection(int cpu)
>  {
> @@ -297,6 +385,7 @@ int gicv3_its_init(struct host_its *hw_its)
>
>      reg = readq_relaxed(hw_its->its_base + GITS_TYPER);
>      hw_its->pta = reg & GITS_TYPER_PTA;
> +    hw_its->itte_size = ((reg >> 4) & 0xf) + 1;
      Macros could be defined for these constants (the shift and mask of
      the GITS_TYPER ITT entry size field).
>
>      for (i = 0; i < 8; i++)
>      {
> @@ -520,6 +609,7 @@ void gicv3_its_dt_init(const struct dt_device_node *node)
>          its_data->size = size;
>          its_data->dt_node = its;
>          spin_lock_init(&its_data->cmd_lock);
> +        INIT_LIST_HEAD(&its_data->its_devices);
>
>          printk("GICv3: Found ITS @0x%lx\n", addr);
>
> diff --git a/xen/include/asm-arm/gic-its.h b/xen/include/asm-arm/gic-its.h
> index 512a388..4e9841a 100644
> --- a/xen/include/asm-arm/gic-its.h
> +++ b/xen/include/asm-arm/gic-its.h
> @@ -79,6 +79,13 @@
>  #ifndef __ASSEMBLY__
>  #include <xen/device_tree.h>
>
> +struct its_devices {
> +    struct list_head entry;
> +    struct domain *d;
> +    void *itt_addr;
> +    int devid;
> +};
> +
>  /* data structure for each hardware ITS */
>  struct host_its {
>      struct list_head entry;
> @@ -88,6 +95,8 @@ struct host_its {
>      void __iomem *its_base;
>      spinlock_t cmd_lock;
>      void *cmd_buf;
> +    struct list_head its_devices;
> +    int itte_size;
>      bool pta;
>  };
>
> @@ -114,6 +123,13 @@ void gicv3_set_redist_addr(paddr_t address, int redist_id);
>  /* Map a collection for this host CPU to each host ITS. */
>  void gicv3_its_setup_collection(int cpu);
>
> +/* Map a device on the host by allocating an ITT on the host (ITS).
> + * "bits" specifies how many events (interrupts) this device will need.
> + * Setting "valid" to false deallocates the device.
> + */
> +int gicv3_its_map_device(struct host_its *hw_its, struct domain *d,
> +                         int devid, int bits, bool valid);
> +
>  int gicv3_lpi_allocate_host_lpi(struct host_its *its,
>                                  uint32_t devid, uint32_t eventid,
>                                  struct vcpu *v, int virt_lpi);
> --
> 2.9.0
>
>


* Re: [RFC PATCH 08/24] ARM: GICv3: introduce separate pending_irq structs for LPIs
  2016-09-28 18:24 ` [RFC PATCH 08/24] ARM: GICv3: introduce separate pending_irq structs for LPIs Andre Przywara
@ 2016-10-24 15:31   ` Vijay Kilari
  2016-11-03 19:47     ` Andre Przywara
  2016-10-28  1:04   ` Stefano Stabellini
  2016-11-04 15:46   ` Julien Grall
  2 siblings, 1 reply; 144+ messages in thread
From: Vijay Kilari @ 2016-10-24 15:31 UTC (permalink / raw)
  To: Andre Przywara; +Cc: xen-devel, Julien Grall, Stefano Stabellini

On Wed, Sep 28, 2016 at 11:54 PM, Andre Przywara <andre.przywara@arm.com> wrote:
> For the same reason that allocating a struct irq_desc for each
> possible LPI is not an option, having a struct pending_irq for each LPI
> is also not feasible. However we actually only need those when an
> interrupt is on a vCPU (or is about to be injected).
> Maintain a list of those structs that we can use for the lifecycle of
> a guest LPI. We allocate new entries if necessary, however reuse
> pre-owned entries whenever possible.
> Teach the existing VGIC functions to find the right pointer when being
> given a virtual LPI number.
>
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> ---
>  xen/arch/arm/gic.c            |  3 +++
>  xen/arch/arm/vgic-v3.c        |  2 ++
>  xen/arch/arm/vgic.c           | 56 ++++++++++++++++++++++++++++++++++++++++---
>  xen/include/asm-arm/domain.h  |  1 +
>  xen/include/asm-arm/gic-its.h | 10 ++++++++
>  xen/include/asm-arm/vgic.h    |  9 +++++++
>  6 files changed, 78 insertions(+), 3 deletions(-)
>
> diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
> index 63c744a..ebe4035 100644
> --- a/xen/arch/arm/gic.c
> +++ b/xen/arch/arm/gic.c
> @@ -506,6 +506,9 @@ static void gic_update_one_lr(struct vcpu *v, int i)
>                  struct vcpu *v_target = vgic_get_target_vcpu(v, irq);
>                  irq_set_affinity(p->desc, cpumask_of(v_target->processor));
>              }
> +            /* If this was an LPI, mark this struct as available again. */
> +            if ( p->irq >= 8192 )
 Something like is_lpi(irq) could be defined and used everywhere
 instead of the magic number 8192.
> +                p->irq = 0;
>          }
>      }
>  }
> diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c
> index ec038a3..e9b6490 100644
> --- a/xen/arch/arm/vgic-v3.c
> +++ b/xen/arch/arm/vgic-v3.c
> @@ -1388,6 +1388,8 @@ static int vgic_v3_vcpu_init(struct vcpu *v)
>      if ( v->vcpu_id == last_cpu || (v->vcpu_id == (d->max_vcpus - 1)) )
>          v->arch.vgic.flags |= VGIC_V3_RDIST_LAST;
>
> +    INIT_LIST_HEAD(&v->arch.vgic.pending_lpi_list);
> +
>      return 0;
>  }
>
> diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c
> index 0965119..b961551 100644
> --- a/xen/arch/arm/vgic.c
> +++ b/xen/arch/arm/vgic.c
> @@ -31,6 +31,8 @@
>  #include <asm/mmio.h>
>  #include <asm/gic.h>
>  #include <asm/vgic.h>
> +#include <asm/gic_v3_defs.h>
> +#include <asm/gic-its.h>
>
>  static inline struct vgic_irq_rank *vgic_get_rank(struct vcpu *v, int rank)
>  {
> @@ -61,7 +63,7 @@ struct vgic_irq_rank *vgic_rank_irq(struct vcpu *v, unsigned int irq)
>      return vgic_get_rank(v, rank);
>  }
>
> -static void vgic_init_pending_irq(struct pending_irq *p, unsigned int virq)
> +void vgic_init_pending_irq(struct pending_irq *p, unsigned int virq)
>  {
>      INIT_LIST_HEAD(&p->inflight);
>      INIT_LIST_HEAD(&p->lr_queue);
> @@ -244,10 +246,14 @@ struct vcpu *vgic_get_target_vcpu(struct vcpu *v, unsigned int virq)
>
>  static int vgic_get_virq_priority(struct vcpu *v, unsigned int virq)
>  {
> -    struct vgic_irq_rank *rank = vgic_rank_irq(v, virq);
> +    struct vgic_irq_rank *rank;
>      unsigned long flags;
>      int priority;
>
> +    if ( virq >= 8192 )
> +        return gicv3_lpi_get_priority(v->domain, virq);
> +
> +    rank = vgic_rank_irq(v, virq);
>      vgic_lock_rank(v, rank, flags);
>      priority = rank->priority[virq & INTERRUPT_RANK_MASK];
>      vgic_unlock_rank(v, rank, flags);
> @@ -446,13 +452,55 @@ int vgic_to_sgi(struct vcpu *v, register_t sgir, enum gic_sgi_mode irqmode, int
>      return 1;
>  }
>
> +/*
> + * Holding struct pending_irq's for each possible virtual LPI in each domain
> + * requires too much Xen memory, also a malicious guest could potentially
> + * spam Xen with LPI map requests. We cannot cover those with (guest allocated)
> + * ITS memory, so we use a dynamic scheme of allocating struct pending_irq's
> + * on demand.
> + */
> +struct pending_irq *lpi_to_pending(struct vcpu *v, unsigned int lpi,
> +                                   bool allocate)
> +{
> +    struct lpi_pending_irq *lpi_irq, *empty = NULL;
> +
> +    /* TODO: locking! */
> +    list_for_each_entry(lpi_irq, &v->arch.vgic.pending_lpi_list, entry)
> +    {
> +        if ( lpi_irq->pirq.irq == lpi )
> +            return &lpi_irq->pirq;
> +
> +        if ( lpi_irq->pirq.irq == 0 && !empty )
> +            empty = lpi_irq;
> +    }
   With this approach of allocating pending_irq structs on demand, if
the matching entry sits near the tail of pending_lpi_list, the lookup
has to iterate over the whole list. This will increase the LPI
injection latency for the domain.

Why can't we use a btree (or some other sub-linear lookup structure)?

> +
> +    if ( !allocate )
> +        return NULL;
> +
> +    if ( !empty )
> +    {
> +        empty = xzalloc(struct lpi_pending_irq);
> +        vgic_init_pending_irq(&empty->pirq, lpi);
> +        list_add_tail(&empty->entry, &v->arch.vgic.pending_lpi_list);
> +    } else
> +    {
> +        empty->pirq.status = 0;
> +        empty->pirq.irq = lpi;
> +    }
> +
> +    return &empty->pirq;
> +}
> +
>  struct pending_irq *irq_to_pending(struct vcpu *v, unsigned int irq)
>  {
>      struct pending_irq *n;
> +
>      /* Pending irqs allocation strategy: the first vgic.nr_spis irqs
>       * are used for SPIs; the rests are used for per cpu irqs */
>      if ( irq < 32 )
>          n = &v->arch.vgic.pending_irqs[irq];
> +    else if ( irq >= 8192 )
> +        n = lpi_to_pending(v, irq, true);
>      else
>          n = &v->domain->arch.vgic.pending_irqs[irq - 32];
>      return n;
> @@ -480,7 +528,7 @@ void vgic_clear_pending_irqs(struct vcpu *v)
>  void vgic_vcpu_inject_irq(struct vcpu *v, unsigned int virq)
>  {
>      uint8_t priority;
> -    struct pending_irq *iter, *n = irq_to_pending(v, virq);
> +    struct pending_irq *iter, *n;
>      unsigned long flags;
>      bool_t running;
>
> @@ -488,6 +536,8 @@ void vgic_vcpu_inject_irq(struct vcpu *v, unsigned int virq)
>
>      spin_lock_irqsave(&v->arch.vgic.lock, flags);
>
> +    n = irq_to_pending(v, virq);
> +
>      /* vcpu offline */
>      if ( test_bit(_VPF_down, &v->pause_flags) )
>      {
> diff --git a/xen/include/asm-arm/domain.h b/xen/include/asm-arm/domain.h
> index 9452fcd..ae8a9de 100644
> --- a/xen/include/asm-arm/domain.h
> +++ b/xen/include/asm-arm/domain.h
> @@ -249,6 +249,7 @@ struct arch_vcpu
>          paddr_t rdist_base;
>  #define VGIC_V3_RDIST_LAST  (1 << 0)        /* last vCPU of the rdist */
>          uint8_t flags;
> +        struct list_head pending_lpi_list;
>      } vgic;
>
>      /* Timer registers  */
> diff --git a/xen/include/asm-arm/gic-its.h b/xen/include/asm-arm/gic-its.h
> index 4e9841a..1f881c0 100644
> --- a/xen/include/asm-arm/gic-its.h
> +++ b/xen/include/asm-arm/gic-its.h
> @@ -136,6 +136,12 @@ int gicv3_lpi_allocate_host_lpi(struct host_its *its,
>  int gicv3_lpi_drop_host_lpi(struct host_its *its,
>                              uint32_t devid, uint32_t eventid,
>                              uint32_t host_lpi);
> +
> +static inline int gicv3_lpi_get_priority(struct domain *d, uint32_t lpi)
> +{
> +    return GIC_PRI_IRQ;
   Why is the LPI priority fixed? Can't we use the priority the domain
   set in its LPI property table?

> +}
> +
>  #else
>
>  static inline void gicv3_its_dt_init(const struct dt_device_node *node)
> @@ -175,6 +181,10 @@ static inline int gicv3_lpi_drop_host_lpi(struct host_its *its,
>  {
>      return 0;
>  }
> +static inline int gicv3_lpi_get_priority(struct domain *d, uint32_t lpi)
> +{
> +    return GIC_PRI_IRQ;
> +}
>
>  #endif /* CONFIG_HAS_ITS */
>
> diff --git a/xen/include/asm-arm/vgic.h b/xen/include/asm-arm/vgic.h
> index 300f461..4e29ba6 100644
> --- a/xen/include/asm-arm/vgic.h
> +++ b/xen/include/asm-arm/vgic.h
> @@ -83,6 +83,12 @@ struct pending_irq
>      struct list_head lr_queue;
>  };
>
> +struct lpi_pending_irq
> +{
> +    struct list_head entry;
> +    struct pending_irq pirq;
> +};
> +
>  #define NR_INTERRUPT_PER_RANK   32
>  #define INTERRUPT_RANK_MASK (NR_INTERRUPT_PER_RANK - 1)
>
> @@ -296,8 +302,11 @@ extern struct vcpu *vgic_get_target_vcpu(struct vcpu *v, unsigned int virq);
>  extern void vgic_vcpu_inject_irq(struct vcpu *v, unsigned int virq);
>  extern void vgic_vcpu_inject_spi(struct domain *d, unsigned int virq);
>  extern void vgic_clear_pending_irqs(struct vcpu *v);
> +extern void vgic_init_pending_irq(struct pending_irq *p, unsigned int virq);
>  extern struct pending_irq *irq_to_pending(struct vcpu *v, unsigned int irq);
>  extern struct pending_irq *spi_to_pending(struct domain *d, unsigned int irq);
> +extern struct pending_irq *lpi_to_pending(struct vcpu *v, unsigned int irq,
> +                                          bool allocate);
>  extern struct vgic_irq_rank *vgic_rank_offset(struct vcpu *v, int b, int n, int s);
>  extern struct vgic_irq_rank *vgic_rank_irq(struct vcpu *v, unsigned int irq);
>  extern int vgic_emulate(struct cpu_user_regs *regs, union hsr hsr);
> --
> 2.9.0
>
>


* Re: [RFC PATCH 11/24] ARM: vGICv3: handle virtual LPI pending and property tables
  2016-09-28 18:24 ` [RFC PATCH 11/24] ARM: vGICv3: handle virtual LPI pending and property tables Andre Przywara
@ 2016-10-24 15:32   ` Vijay Kilari
  2016-11-03 20:21     ` Andre Przywara
  2016-10-29  0:39   ` Stefano Stabellini
  2016-11-02 17:18   ` Julien Grall
  2 siblings, 1 reply; 144+ messages in thread
From: Vijay Kilari @ 2016-10-24 15:32 UTC (permalink / raw)
  To: Andre Przywara; +Cc: xen-devel, Julien Grall, Stefano Stabellini

On Wed, Sep 28, 2016 at 11:54 PM, Andre Przywara <andre.przywara@arm.com> wrote:
> Allow a guest to provide the address and size for the memory regions
> it has reserved for the GICv3 pending and property tables.
> We sanitise the various fields of the respective redistributor
> registers and map those pages into Xen's address space to have easy
> access.
>
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> ---
>  xen/arch/arm/vgic-v3.c        | 189 ++++++++++++++++++++++++++++++++++++++----
>  xen/arch/arm/vgic.c           |   4 +
>  xen/include/asm-arm/domain.h  |   7 +-
>  xen/include/asm-arm/gic-its.h |  10 ++-
>  xen/include/asm-arm/vgic.h    |   3 +
>  5 files changed, 197 insertions(+), 16 deletions(-)
>
> diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c
> index e9b6490..8fe8386 100644
> --- a/xen/arch/arm/vgic-v3.c
> +++ b/xen/arch/arm/vgic-v3.c
> @@ -20,12 +20,14 @@
>
>  #include <xen/bitops.h>
>  #include <xen/config.h>
> +#include <xen/domain_page.h>
>  #include <xen/lib.h>
>  #include <xen/init.h>
>  #include <xen/softirq.h>
>  #include <xen/irq.h>
>  #include <xen/sched.h>
>  #include <xen/sizes.h>
> +#include <xen/vmap.h>
>  #include <asm/current.h>
>  #include <asm/mmio.h>
>  #include <asm/gic_v3_defs.h>
> @@ -228,12 +230,14 @@ static int __vgic_v3_rdistr_rd_mmio_read(struct vcpu *v, mmio_info_t *info,
>          goto read_reserved;
>
>      case VREG64(GICR_PROPBASER):
> -        /* LPI's not implemented */
> -        goto read_as_zero_64;
> +        if ( !vgic_reg64_check_access(dabt) ) goto bad_width;
> +        *r = vgic_reg64_extract(v->domain->arch.vgic.rdist_propbase, info);
> +        return 1;
>
>      case VREG64(GICR_PENDBASER):
> -        /* LPI's not implemented */
> -        goto read_as_zero_64;
> +        if ( !vgic_reg64_check_access(dabt) ) goto bad_width;
> +        *r = vgic_reg64_extract(v->arch.vgic.rdist_pendbase, info);
> +        return 1;
>
>      case 0x0080:
>          goto read_reserved;
> @@ -301,11 +305,6 @@ bad_width:
>      domain_crash_synchronous();
>      return 0;
>
> -read_as_zero_64:
> -    if ( !vgic_reg64_check_access(dabt) ) goto bad_width;
> -    *r = 0;
> -    return 1;
> -
>  read_as_zero_32:
>      if ( dabt.size != DABT_WORD ) goto bad_width;
>      *r = 0;
> @@ -330,11 +329,149 @@ read_unknown:
>      return 1;
>  }
>
> +static uint64_t vgic_sanitise_field(uint64_t reg, uint64_t field_mask,
> +                                    int field_shift,
> +                                    uint64_t (*sanitise_fn)(uint64_t))
> +{
> +    uint64_t field = (reg & field_mask) >> field_shift;
> +
> +    field = sanitise_fn(field) << field_shift;
> +    return (reg & ~field_mask) | field;
> +}
> +
> +/* We want to avoid outer shareable. */
> +static uint64_t vgic_sanitise_shareability(uint64_t field)
> +{
> +    switch (field) {
> +    case GIC_BASER_OuterShareable:
> +        return GIC_BASER_InnerShareable;
> +    default:
> +        return field;
> +    }
> +}
> +
> +/* Avoid any inner non-cacheable mapping. */
> +static uint64_t vgic_sanitise_inner_cacheability(uint64_t field)
> +{
> +    switch (field) {
> +    case GIC_BASER_CACHE_nCnB:
> +    case GIC_BASER_CACHE_nC:
> +        return GIC_BASER_CACHE_RaWb;
> +    default:
> +        return field;
> +    }
> +}
> +
> +/* Non-cacheable or same-as-inner are OK. */
> +static uint64_t vgic_sanitise_outer_cacheability(uint64_t field)
> +{
> +    switch (field) {
> +    case GIC_BASER_CACHE_SameAsInner:
> +    case GIC_BASER_CACHE_nC:
> +        return field;
> +    default:
> +        return GIC_BASER_CACHE_nC;
> +    }
> +}
> +
> +static uint64_t sanitize_propbaser(uint64_t reg)
> +{
> +    reg = vgic_sanitise_field(reg, GICR_PROPBASER_SHAREABILITY_MASK,
> +                              GICR_PROPBASER_SHAREABILITY_SHIFT,
> +                              vgic_sanitise_shareability);
> +    reg = vgic_sanitise_field(reg, GICR_PROPBASER_INNER_CACHEABILITY_MASK,
> +                              GICR_PROPBASER_INNER_CACHEABILITY_SHIFT,
> +                              vgic_sanitise_inner_cacheability);
> +    reg = vgic_sanitise_field(reg, GICR_PROPBASER_OUTER_CACHEABILITY_MASK,
> +                              GICR_PROPBASER_OUTER_CACHEABILITY_SHIFT,
> +                              vgic_sanitise_outer_cacheability);
> +
> +    reg &= ~PROPBASER_RES0_MASK;
> +    reg &= ~GENMASK(51, 48);
> +    return reg;
> +}
> +
> +static uint64_t sanitize_pendbaser(uint64_t reg)
> +{
> +    reg = vgic_sanitise_field(reg, GICR_PENDBASER_SHAREABILITY_MASK,
> +                              GICR_PENDBASER_SHAREABILITY_SHIFT,
> +                              vgic_sanitise_shareability);
> +    reg = vgic_sanitise_field(reg, GICR_PENDBASER_INNER_CACHEABILITY_MASK,
> +                              GICR_PENDBASER_INNER_CACHEABILITY_SHIFT,
> +                              vgic_sanitise_inner_cacheability);
> +    reg = vgic_sanitise_field(reg, GICR_PENDBASER_OUTER_CACHEABILITY_MASK,
> +                              GICR_PENDBASER_OUTER_CACHEABILITY_SHIFT,
> +                              vgic_sanitise_outer_cacheability);
> +
> +    reg &= ~PENDBASER_RES0_MASK;
> +    reg &= ~GENMASK(51, 48);
> +    return reg;
> +}
> +
> +/*
> + * Map parts of guest memory into Xen's VA space, to allow easy access to
> + * it. This lets ITS configuration data be held in guest memory, avoiding
> + * the use of Xen memory for that purpose.
> + */
> +void *map_guest_pages(struct domain *d, paddr_t guest_addr, int nr_pages)
   I think this file is not the right place to put this generic function.
> +{
> +    mfn_t onepage;
> +    mfn_t *pages;
> +    int i;
> +    void *ptr;
> +
> +    /* TODO: free previous mapping, change prototype? use get-put-put? */
> +
> +    guest_addr &= PAGE_MASK;
> +
> +    if ( nr_pages == 1 )
> +    {
> +        pages = &onepage;
> +    }
> +    else
> +    {
> +        pages = xmalloc_array(mfn_t, nr_pages);
> +        if ( !pages )
> +            return NULL;
> +    }
> +
> +    for (i = 0; i < nr_pages; i++)
> +    {
> +        get_page_from_gfn(d, (guest_addr >> PAGE_SHIFT) + i, NULL, P2M_ALLOC);

             Check the return value of this function.

> +        pages[i] = _mfn((guest_addr + i * PAGE_SIZE) >> PAGE_SHIFT);
> +    }
> +
> +    ptr = vmap(pages, nr_pages);
> +
> +    if ( nr_pages > 1 )
> +        xfree(pages);
> +
> +    return ptr;
> +}
> +
> +void unmap_guest_pages(void *va, int nr_pages)
      Same here. Could this be moved to a generic file, e.g. p2m.c?
> +{
> +    paddr_t pa;
> +    unsigned long i;
> +
> +    if ( !va )
> +        return;
> +
> +    va = (void *)((uintptr_t)va & PAGE_MASK);
> +    pa = virt_to_maddr(va);
  You could use _pa() here.
> +
> +    vunmap(va);
> +    for (i = 0; i < nr_pages; i++)
> +        put_page(mfn_to_page((pa >> PAGE_SHIFT) + i));
> +
> +    return;
> +}
> +
>  static int __vgic_v3_rdistr_rd_mmio_write(struct vcpu *v, mmio_info_t *info,
>                                            uint32_t gicr_reg,
>                                            register_t r)
>  {
>      struct hsr_dabt dabt = info->dabt;
> +    uint64_t reg;
>
>      switch ( gicr_reg )
>      {
> @@ -375,13 +512,37 @@ static int __vgic_v3_rdistr_rd_mmio_write(struct vcpu *v, mmio_info_t *info,
>      case 0x0050:
>          goto write_reserved;
>
> -    case VREG64(GICR_PROPBASER):
> -        /* LPI is not implemented */
> -        goto write_ignore_64;
> +    case VREG64(GICR_PROPBASER): {
> +        int nr_pages;
> +
> +        if ( info->dabt.size < DABT_WORD ) goto bad_width;
> +        if ( v->arch.vgic.flags & VGIC_V3_LPIS_ENABLED )
> +            return 1;
> +
> +        reg = v->domain->arch.vgic.rdist_propbase;
> +        vgic_reg64_update(&reg, r, info);
> +        reg = sanitize_propbaser(reg);
> +        v->domain->arch.vgic.rdist_propbase = reg;
>
> +        nr_pages = BIT((v->domain->arch.vgic.rdist_propbase & 0x1f) + 1) - 8192;
             Should this be validated against HOST_LPIS?

> +        nr_pages = DIV_ROUND_UP(nr_pages, PAGE_SIZE);
> +        unmap_guest_pages(v->domain->arch.vgic.proptable, nr_pages);
> +        v->domain->arch.vgic.proptable = map_guest_pages(v->domain,
> +                                                         reg & GENMASK(47, 12),
> +                                                         nr_pages);
> +        return 1;
> +    }
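The nr_pages computation above can be checked in isolation: GICR_PROPBASER bits [4:0] store (number of interrupt ID bits - 1), the property table holds one byte per LPI, and LPI IDs start at 8192. A sketch assuming 4 KiB pages (propbase_nr_pages() is a hypothetical name):

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE 4096UL

/*
 * GICR_PROPBASER bits [4:0] hold (number of interrupt ID bits - 1).
 * The property table has one byte per LPI, and LPI IDs start at 8192,
 * so the table covers 2^IDbits - 8192 bytes.
 */
static unsigned long propbase_nr_pages(uint64_t propbase)
{
    unsigned int id_bits = (propbase & 0x1f) + 1;
    unsigned long nr_bytes = (1UL << id_bits) - 8192;

    return (nr_bytes + PAGE_SIZE - 1) / PAGE_SIZE;   /* DIV_ROUND_UP */
}
```

For instance, a field value of 15 means 16 ID bits, i.e. 65536 - 8192 = 57344 property bytes, spanning 14 pages.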
>      case VREG64(GICR_PENDBASER):
> -        /* LPI is not implemented */
> -        goto write_ignore_64;
> +        if ( info->dabt.size < DABT_WORD ) goto bad_width;

   A check on VGIC_V3_LPIS_ENABLED is required here as well.
> +        reg = v->arch.vgic.rdist_pendbase;
> +        vgic_reg64_update(&reg, r, info);
> +        reg = sanitize_pendbaser(reg);
> +        v->arch.vgic.rdist_pendbase = reg;
> +
> +        unmap_guest_pages(v->arch.vgic.pendtable, 16);
      Why are only 16 pages unmapped here?
> +        v->arch.vgic.pendtable = map_guest_pages(v->domain,
> +                                                  reg & GENMASK(47, 12), 16);
> +        return 1;
>
>      case 0x0080:
>          goto write_reserved;
> diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c
> index b961551..4d9304f 100644
> --- a/xen/arch/arm/vgic.c
> +++ b/xen/arch/arm/vgic.c
> @@ -488,6 +488,10 @@ struct pending_irq *lpi_to_pending(struct vcpu *v, unsigned int lpi,
>          empty->pirq.irq = lpi;
>      }
>
> +    /* Update the enabled status */
> +    if ( gicv3_lpi_is_enabled(v->domain, lpi) )
> +        set_bit(GIC_IRQ_GUEST_ENABLED, &empty->pirq.status);
> +
>      return &empty->pirq;
>  }
>
> diff --git a/xen/include/asm-arm/domain.h b/xen/include/asm-arm/domain.h
> index ae8a9de..0cd3500 100644
> --- a/xen/include/asm-arm/domain.h
> +++ b/xen/include/asm-arm/domain.h
> @@ -109,6 +109,8 @@ struct arch_domain
>          } *rdist_regions;
>          int nr_regions;                     /* Number of rdist regions */
>          uint32_t rdist_stride;              /* Re-Distributor stride */
> +        uint64_t rdist_propbase;
> +        uint8_t *proptable;
>  #endif
>      } vgic;
>
> @@ -247,7 +249,10 @@ struct arch_vcpu
>
>          /* GICv3: redistributor base and flags for this vCPU */
>          paddr_t rdist_base;
> -#define VGIC_V3_RDIST_LAST  (1 << 0)        /* last vCPU of the rdist */
> +#define VGIC_V3_RDIST_LAST      (1 << 0)        /* last vCPU of the rdist */
> +#define VGIC_V3_LPIS_ENABLED    (1 << 1)
> +        uint64_t rdist_pendbase;
> +        unsigned long *pendtable;
>          uint8_t flags;
>          struct list_head pending_lpi_list;
>      } vgic;
> diff --git a/xen/include/asm-arm/gic-its.h b/xen/include/asm-arm/gic-its.h
> index 1f881c0..3b2e5c0 100644
> --- a/xen/include/asm-arm/gic-its.h
> +++ b/xen/include/asm-arm/gic-its.h
> @@ -139,7 +139,11 @@ int gicv3_lpi_drop_host_lpi(struct host_its *its,
>
>  static inline int gicv3_lpi_get_priority(struct domain *d, uint32_t lpi)
>  {
> -    return GIC_PRI_IRQ;
> +    return d->arch.vgic.proptable[lpi - 8192] & 0xfc;
> +}
> +static inline bool gicv3_lpi_is_enabled(struct domain *d, uint32_t lpi)
> +{
> +    return d->arch.vgic.proptable[lpi - 8192] & LPI_PROP_ENABLED;
>  }
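For reference, these helpers decode the per-LPI property byte: priority in bits [7:2], bit 1 RES1, the enable bit in bit 0. A trivial standalone sketch of the same decoding (helper names made up here):

```c
#include <assert.h>
#include <stdint.h>

#define LPI_PROP_ENABLED  (1 << 0)
#define LPI_PROP_RES1     (1 << 1)

/* Priority lives in bits [7:2] of the per-LPI property byte. */
static uint8_t lpi_prop_priority(uint8_t prop)
{
    return prop & 0xfc;
}

/* The enable bit is bit 0. */
static int lpi_prop_is_enabled(uint8_t prop)
{
    return !!(prop & LPI_PROP_ENABLED);
}
```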
>
>  #else
> @@ -185,6 +189,10 @@ static inline int gicv3_lpi_get_priority(struct domain *d, uint32_t lpi)
>  {
>      return GIC_PRI_IRQ;
>  }
> +static inline bool gicv3_lpi_is_enabled(struct domain *d, uint32_t lpi)
> +{
> +    return false;
> +}
>
>  #endif /* CONFIG_HAS_ITS */
>
> diff --git a/xen/include/asm-arm/vgic.h b/xen/include/asm-arm/vgic.h
> index 4e29ba6..2b216cc 100644
> --- a/xen/include/asm-arm/vgic.h
> +++ b/xen/include/asm-arm/vgic.h
> @@ -285,6 +285,9 @@ VGIC_REG_HELPERS(32, 0x3);
>
>  #undef VGIC_REG_HELPERS
>
> +void *map_guest_pages(struct domain *d, paddr_t guest_addr, int nr_pages);
> +void unmap_guest_pages(void *va, int nr_pages);
> +
>  enum gic_sgi_mode;
>
>  /*
> --
> 2.9.0
>
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> https://lists.xen.org/xen-devel


^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [RFC PATCH 21/24] ARM: vITS: handle INVALL command
  2016-09-28 18:24 ` [RFC PATCH 21/24] ARM: vITS: handle INVALL command Andre Przywara
@ 2016-10-24 15:32   ` Vijay Kilari
  2016-11-04  9:22     ` Andre Przywara
  0 siblings, 1 reply; 144+ messages in thread
From: Vijay Kilari @ 2016-10-24 15:32 UTC (permalink / raw)
  To: Andre Przywara; +Cc: xen-devel, Julien Grall, Stefano Stabellini

On Wed, Sep 28, 2016 at 11:54 PM, Andre Przywara <andre.przywara@arm.com> wrote:
> The INVALL command instructs an ITS to invalidate the configuration
> data for all LPIs associated with a given redistributor (read: VCPU).
> To avoid iterating (and mapping!) all guest tables, we instead go through
> the host LPI table to find any LPIs targeting this VCPU. We then update
> the configuration bits for the connected virtual LPIs.
>
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> ---
>  xen/arch/arm/gic-its.c        | 58 +++++++++++++++++++++++++++++++++++++++++++
>  xen/arch/arm/vgic-its.c       | 30 ++++++++++++++++++++++
>  xen/include/asm-arm/gic-its.h |  2 ++
>  3 files changed, 90 insertions(+)
>
> diff --git a/xen/arch/arm/gic-its.c b/xen/arch/arm/gic-its.c
> index 6f4329f..5129d6e 100644
> --- a/xen/arch/arm/gic-its.c
> +++ b/xen/arch/arm/gic-its.c
> @@ -228,6 +228,18 @@ static int its_send_cmd_inv(struct host_its *its,
>      return its_send_command(its, cmd);
>  }
>
> +static int its_send_cmd_invall(struct host_its *its, int cpu)
> +{
> +    uint64_t cmd[4];
> +
> +    cmd[0] = GITS_CMD_INVALL;
> +    cmd[1] = 0x00;
> +    cmd[2] = cpu & GENMASK(15, 0);
> +    cmd[3] = 0x00;
> +
> +    return its_send_command(its, cmd);
> +}
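ITS commands are 32-byte entries (four 64-bit doublewords) written to the command queue; for INVALL only the command code in DW0 bits [7:0] and the collection ID in DW2 bits [15:0] carry information. A sketch of the same packing (build_cmd_invall() is a made-up name; the 0x0d encoding is from the GICv3 specification):

```c
#include <assert.h>
#include <stdint.h>

#define GITS_CMD_INVALL  0x0dULL   /* command encoding per the GICv3 spec */

/* Build a 4-doubleword INVALL command for a given collection ID. */
static void build_cmd_invall(uint64_t cmd[4], unsigned int collection)
{
    cmd[0] = GITS_CMD_INVALL;              /* DW0 bits [7:0]: command  */
    cmd[1] = 0;
    cmd[2] = collection & 0xffffULL;       /* DW2 bits [15:0]: ICID    */
    cmd[3] = 0;
}
```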
> +
>  int gicv3_its_map_device(struct host_its *hw_its, struct domain *d,
>                           int devid, int bits, bool valid)
>  {
> @@ -668,6 +680,52 @@ uint32_t gicv3_lpi_lookup_lpi(struct domain *d, uint32_t host_lpi, int *vcpu_id)
>      return hlpi.virt_lpi;
>  }
>
> +/* Iterate over all host LPIs, and update the "enabled" state for a given
> + * guest redistributor (VCPU) given the respective state in the provided
> + * proptable. This proptable is indexed by the stored virtual LPI number.
> + * This is to implement a guest INVALL command.
> + */
> +void gicv3_lpi_update_configurations(struct vcpu *v, uint8_t *proptable)
> +{
> +    int chunk, i;
> +    struct host_its *its;
> +
> +    for (chunk = 0; chunk < MAX_HOST_LPIS / HOST_LPIS_PER_PAGE; chunk++)
> +    {
> +        if ( !lpi_data.host_lpis[chunk] )
> +            continue;
> +
> +        for (i = 0; i < HOST_LPIS_PER_PAGE; i++)
> +        {
> +            union host_lpi *hlpip = &lpi_data.host_lpis[chunk][i], hlpi;
> +            uint32_t hlpi_nr;
> +
> +            hlpi.data = hlpip->data;
> +            if ( !hlpi.virt_lpi )
> +                continue;
> +
> +            if ( hlpi.dom_id != v->domain->domain_id )
> +                continue;
> +
> +            if ( hlpi.vcpu_id != v->vcpu_id )
> +                continue;
> +
> +            hlpi_nr = chunk * HOST_LPIS_PER_PAGE + i;
> +
> +            if ( proptable[hlpi.virt_lpi] & LPI_PROP_ENABLED )
> +                lpi_data.lpi_property[hlpi_nr - 8192] |= LPI_PROP_ENABLED;
> +            else
> +                lpi_data.lpi_property[hlpi_nr - 8192] &= ~LPI_PROP_ENABLED;
> +        }
> +    }
        AFAIK, the initial design was to use a tasklet to update the
property table, since updating the whole table consumes a lot of time.

> +
> +    /* Tell all ITSes that they should update the property table for CPU 0,
> +     * to which we map all LPIs.
> +     */
> +    list_for_each_entry(its, &host_its_list, entry)
> +        its_send_cmd_invall(its, 0);
> +}
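The walk above can be modelled in isolation: the first level of the host LPI table is a sparsely populated array of chunk pointers, so unallocated chunks are skipped wholesale before the per-entry filter runs. A simplified sketch (structure layout and names are illustrative only, not the real union host_lpi):

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define LPIS_PER_CHUNK  512
#define NR_CHUNKS       8

/* Simplified stand-in for the patch's union host_lpi. */
struct host_lpi {
    uint32_t virt_lpi;    /* 0 means "not allocated" */
    uint16_t dom_id;
    uint16_t vcpu_id;
};

/* First level: sparsely populated array of chunk pointers. */
static struct host_lpi *chunks[NR_CHUNKS];

/*
 * Walk the two-level table as gicv3_lpi_update_configurations() does:
 * skip first-level entries that were never allocated, then filter the
 * second level for entries targeting one (domain, VCPU) pair.
 */
static int count_lpis_for_vcpu(uint16_t dom_id, uint16_t vcpu_id)
{
    int chunk, i, count = 0;

    for (chunk = 0; chunk < NR_CHUNKS; chunk++)
    {
        if ( !chunks[chunk] )
            continue;

        for (i = 0; i < LPIS_PER_CHUNK; i++)
        {
            struct host_lpi *hlpi = &chunks[chunk][i];

            if ( !hlpi->virt_lpi )
                continue;
            if ( hlpi->dom_id == dom_id && hlpi->vcpu_id == vcpu_id )
                count++;
        }
    }
    return count;
}
```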
> +
>  void gicv3_lpi_set_enable(struct host_its *its,
>                            uint32_t deviceid, uint32_t eventid,
>                            uint32_t host_lpi, bool enabled)
> diff --git a/xen/arch/arm/vgic-its.c b/xen/arch/arm/vgic-its.c
> index 74da8fc..1e429b7 100644
> --- a/xen/arch/arm/vgic-its.c
> +++ b/xen/arch/arm/vgic-its.c
> @@ -294,6 +294,33 @@ out_unlock:
>      return ret;
>  }
>
> +/* INVALL updates the per-LPI configuration status for every LPI mapped to
> + * this redistributor. For the guest side we don't need to update anything,
> + * as we always refer to the actual table for the enabled bit and the
> + * priority.
> + * Enabling or disabling a virtual LPI however needs to be propagated to
> + * the respective host LPI. Instead of iterating over all mapped LPIs in our
> + * emulated GIC (which is expensive due to the required on-demand mapping),
> + * we iterate over all mapped _host_ LPIs and filter for those which are
> + * forwarded to this virtual redistributor.
> + */
> +static int its_handle_invall(struct virt_its *its, uint64_t *cmdptr)
> +{
> +    uint32_t collid = its_cmd_get_collection(cmdptr);
> +    struct vcpu *vcpu;
> +
> +    spin_lock(&its->its_lock);
> +    vcpu = get_vcpu_from_collection(its, collid);
> +    spin_unlock(&its->its_lock);
> +
> +    if ( !vcpu )
> +        return -1;
> +
> +    gicv3_lpi_update_configurations(vcpu, its->d->arch.vgic.proptable);
> +
> +    return 0;
> +}
> +
>  static int its_handle_mapc(struct virt_its *its, uint64_t *cmdptr)
>  {
>      uint32_t collid = its_cmd_get_collection(cmdptr);
> @@ -515,6 +542,9 @@ static int vgic_its_handle_cmds(struct domain *d, struct virt_its *its,
>          case GITS_CMD_INV:
>              its_handle_inv(its, cmdptr);
>             break;
> +        case GITS_CMD_INVALL:
> +            its_handle_invall(its, cmdptr);
> +           break;
>          case GITS_CMD_MAPC:
>              its_handle_mapc(its, cmdptr);
>              break;
> diff --git a/xen/include/asm-arm/gic-its.h b/xen/include/asm-arm/gic-its.h
> index 2cdb3e1..ba6b2d5 100644
> --- a/xen/include/asm-arm/gic-its.h
> +++ b/xen/include/asm-arm/gic-its.h
> @@ -146,6 +146,8 @@ int gicv3_lpi_drop_host_lpi(struct host_its *its,
>                              uint32_t devid, uint32_t eventid,
>                              uint32_t host_lpi);
>
> +void gicv3_lpi_update_configurations(struct vcpu *v, uint8_t *proptable);
> +
>  static inline int gicv3_lpi_get_priority(struct domain *d, uint32_t lpi)
>  {
>      return d->arch.vgic.proptable[lpi - 8192] & 0xfc;
> --
> 2.9.0
>
>


* Re: [RFC PATCH 02/24] ARM: GICv3: allocate LPI pending and property table
  2016-09-28 18:24 ` [RFC PATCH 02/24] ARM: GICv3: allocate LPI pending and property table Andre Przywara
  2016-10-24 14:28   ` Vijay Kilari
@ 2016-10-26  1:10   ` Stefano Stabellini
  2016-11-10 15:29     ` Andre Przywara
  2016-11-01 17:22   ` Julien Grall
  2 siblings, 1 reply; 144+ messages in thread
From: Stefano Stabellini @ 2016-10-26  1:10 UTC (permalink / raw)
  To: Andre Przywara; +Cc: xen-devel, Julien Grall, Stefano Stabellini

Hi Andre,

Sorry for the late reply, I'll try to be faster for the next rounds of
review. The patch looks good for a first iteration. Some comments below.

On Wed, 28 Sep 2016, Andre Przywara wrote:
> The ARM GICv3 ITS provides a new kind of interrupt called LPIs.
> The pending bits and the configuration data (priority, enable bits) for
> those LPIs are stored in tables in normal memory, which software has to
> provide to the hardware.
> Allocate the required memory, initialize it and hand it over to each
> ITS. We limit the number of LPIs we use with a compile-time constant to
> avoid wasting memory.
> 
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> ---
>  xen/arch/arm/Kconfig              |  6 ++++
>  xen/arch/arm/efi/efi-boot.h       |  1 -
>  xen/arch/arm/gic-its.c            | 76 +++++++++++++++++++++++++++++++++++++++
>  xen/arch/arm/gic-v3.c             | 27 ++++++++++++++
>  xen/include/asm-arm/cache.h       |  4 +++
>  xen/include/asm-arm/gic-its.h     | 22 +++++++++++-
>  xen/include/asm-arm/gic_v3_defs.h | 48 ++++++++++++++++++++++++-
>  7 files changed, 181 insertions(+), 3 deletions(-)
> 
> diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
> index 9fe3b8e..66e2bb8 100644
> --- a/xen/arch/arm/Kconfig
> +++ b/xen/arch/arm/Kconfig
> @@ -50,6 +50,12 @@ config HAS_ITS
>          depends on ARM_64
>          depends on HAS_GICV3
>  
> +config HOST_LPI_BITS
> +        depends on HAS_ITS
> +        int "Maximum bits for GICv3 host LPIs (14-32)"
> +        range 14 32
> +        default "20"
> +
>  config ALTERNATIVE
>  	bool
>  
> diff --git a/xen/arch/arm/efi/efi-boot.h b/xen/arch/arm/efi/efi-boot.h
> index 045d6ce..dc64aec 100644
> --- a/xen/arch/arm/efi/efi-boot.h
> +++ b/xen/arch/arm/efi/efi-boot.h
> @@ -10,7 +10,6 @@
>  #include "efi-dom0.h"
>  
>  void noreturn efi_xen_start(void *fdt_ptr, uint32_t fdt_size);
> -void __flush_dcache_area(const void *vaddr, unsigned long size);
>  
>  #define DEVICE_TREE_GUID \
>  {0xb1b621d5, 0xf19c, 0x41a5, {0x83, 0x0b, 0xd9, 0x15, 0x2c, 0x69, 0xaa, 0xe0}}
> diff --git a/xen/arch/arm/gic-its.c b/xen/arch/arm/gic-its.c
> index 0f42a77..b52dff3 100644
> --- a/xen/arch/arm/gic-its.c
> +++ b/xen/arch/arm/gic-its.c
> @@ -20,10 +20,86 @@
>  #include <xen/lib.h>
>  #include <xen/device_tree.h>
>  #include <xen/libfdt/libfdt.h>
> +#include <asm/p2m.h>
>  #include <asm/gic.h>
>  #include <asm/gic_v3_defs.h>
>  #include <asm/gic-its.h>
>  
> +/* Global state */
> +static struct {
> +    uint8_t *lpi_property;
> +    int host_lpi_bits;
> +} lpi_data;
> +
> +/* Pending table for each redistributor */
> +static DEFINE_PER_CPU(void *, pending_table);
> +
> +#define MAX_HOST_LPI_BITS                                                \

To avoid confusion, I would call this MAX_PHYS_LPI_BITS


> +        min_t(unsigned int, lpi_data.host_lpi_bits, CONFIG_HOST_LPI_BITS)
> +#define MAX_HOST_LPIS   (BIT(MAX_HOST_LPI_BITS) - 8192)

And this MAX_PHYS_LPIS


> +uint64_t gicv3_lpi_allocate_pendtable(void)
> +{
> +    uint64_t reg, attr;
> +    void *pendtable;

I would introduce a check to make sure that this_cpu(pending_table) == NULL.


> +    attr  = GIC_BASER_CACHE_RaWaWb << GICR_PENDBASER_INNER_CACHEABILITY_SHIFT;
> +    attr |= GIC_BASER_CACHE_SameAsInner << GICR_PENDBASER_OUTER_CACHEABILITY_SHIFT;
> +    attr |= GIC_BASER_InnerShareable << GICR_PENDBASER_SHAREABILITY_SHIFT;
> +
> +    /*
> +     * The pending table holds one bit per LPI, so we need three bits less
> +     * than the number of LPI_BITs.

Why 3 bits less? Please add more info on how you came up with 3.


>         But the alignment requirement from the
> +     * ITS is 64K, so make order at least 16 (-12).

Does it need to be 64K aligned or does it need to be at least 64K in
size? That makes a big difference. If it just needs to be 64K aligned,
you can do that with xmalloc.


> +     */
> +    pendtable = alloc_xenheap_pages(MAX(lpi_data.host_lpi_bits - 3, 16) - 12, 0);

Shouldn't we be using MAX_HOST_LPI_BITS instead of
lpi_data.host_lpi_bits to make this calculation?


> +    if ( !pendtable )
> +        return 0;
> +
> +    memset(pendtable, 0, BIT(lpi_data.host_lpi_bits - 3));

flush_dcache?


> +    this_cpu(pending_table) = pendtable;
> +
> +    reg  = attr | GICR_PENDBASER_PTZ;
> +    reg |= virt_to_maddr(pendtable) & GENMASK(51, 16);
> +
> +    return reg;
> +}
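The size arithmetic in the allocation above can be spelled out: one pending bit per LPI gives 2^(lpi_bits - 3) bytes, and the 64 KiB alignment requirement puts a floor on the allocation order. A standalone sketch with made-up helper names:

```c
#include <assert.h>

/*
 * One pending bit per LPI: 2^lpi_bits bits = 2^(lpi_bits - 3) bytes,
 * which is where the "- 3" in the allocation above comes from.
 */
static long pendtable_bytes(int lpi_bits)
{
    return 1L << (lpi_bits - 3);
}

/*
 * alloc_xenheap_pages() takes an order (log2 of 4 KiB pages); the ITS
 * requires 64 KiB, hence the floor of 2^16 bytes, i.e. order 4.
 */
static int pendtable_order(int lpi_bits)
{
    int log2_bytes = lpi_bits - 3;

    if ( log2_bytes < 16 )
        log2_bytes = 16;
    return log2_bytes - 12;   /* log2 bytes -> log2 of 4 KiB pages */
}
```

With the default CONFIG_HOST_LPI_BITS of 20, this gives a 128 KiB pending table (order 5).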
> +
> +uint64_t gicv3_lpi_get_proptable()
> +{
> +    uint64_t attr;
> +    static uint64_t reg = 0;
> +
> +    /* The property table is shared across all redistributors. */
> +    if ( reg )
> +        return reg;

Can't you just use lpi_data.lpi_property != NULL instead of introducing
a new static local variable?


> +    attr  = GIC_BASER_CACHE_RaWaWb << GICR_PENDBASER_INNER_CACHEABILITY_SHIFT;
> +    attr |= GIC_BASER_CACHE_SameAsInner << GICR_PENDBASER_OUTER_CACHEABILITY_SHIFT;
> +    attr |= GIC_BASER_InnerShareable << GICR_PENDBASER_SHAREABILITY_SHIFT;
> +
> +    lpi_data.lpi_property = alloc_xenheap_pages(MAX_HOST_LPI_BITS - 12, 0);

Please add a comment on how the order is calculated.


> +    if ( !lpi_data.lpi_property )
> +        return 0;
> +
> +    memset(lpi_data.lpi_property, GIC_PRI_IRQ | LPI_PROP_RES1, MAX_HOST_LPIS);
> +    __flush_dcache_area(lpi_data.lpi_property, MAX_HOST_LPIS);
> +
> +    reg  = attr | ((MAX_HOST_LPI_BITS - 1) << 0);
> +    reg |= virt_to_maddr(lpi_data.lpi_property) & GENMASK(51, 12);
> +
> +    return reg;
> +}
> +
> +int gicv3_lpi_init_host_lpis(int lpi_bits)
> +{
> +    lpi_data.host_lpi_bits = lpi_bits;
> +
> +    printk("GICv3: using at most %ld LPIs on the host.\n", MAX_HOST_LPIS);
> +
> +    return 0;
> +}
> +
>  void gicv3_its_dt_init(const struct dt_device_node *node)
>  {
>      const struct dt_device_node *its = NULL;
> diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
> index 238da84..2534aa5 100644
> --- a/xen/arch/arm/gic-v3.c
> +++ b/xen/arch/arm/gic-v3.c
> @@ -546,6 +546,9 @@ static void __init gicv3_dist_init(void)
>      type = readl_relaxed(GICD + GICD_TYPER);
>      nr_lines = 32 * ((type & GICD_TYPE_LINES) + 1);
>  
> +    if ( type & GICD_TYPE_LPIS )
> +        gicv3_lpi_init_host_lpis(((type >> GICD_TYPE_ID_BITS_SHIFT) & 0x1f) + 1);

Please #define a mask instead of using 0x1f


> +
>      printk("GICv3: %d lines, (IID %8.8x).\n",
>             nr_lines, readl_relaxed(GICD + GICD_IIDR));
>  
> @@ -615,6 +618,26 @@ static int gicv3_enable_redist(void)
>  
>      return 0;
>  }
> +static void gicv3_rdist_init_lpis(void __iomem * rdist_base)
> +{
> +    uint32_t reg;
> +    uint64_t table_reg;
> +
> +    if ( list_empty(&host_its_list) )
> +        return;
> +
> +    /* Make sure LPIs are disabled before setting up the BASERs. */
> +    reg = readl_relaxed(rdist_base + GICR_CTLR);
> +    writel_relaxed(reg & ~GICR_CTLR_ENABLE_LPIS, rdist_base + GICR_CTLR);
> +
> +    table_reg = gicv3_lpi_allocate_pendtable();
> +    if ( table_reg )
> +        writeq_relaxed(table_reg, rdist_base + GICR_PENDBASER);

Maybe we want to return in case table_reg == 0?


> +    table_reg = gicv3_lpi_get_proptable();
> +    if ( table_reg )
> +        writeq_relaxed(table_reg, rdist_base + GICR_PROPBASER);
> +}
>  
>  static int __init gicv3_populate_rdist(void)
>  {
> @@ -658,6 +681,10 @@ static int __init gicv3_populate_rdist(void)
>              if ( (typer >> 32) == aff )
>              {
>                  this_cpu(rbase) = ptr;
> +
> +                if ( typer & GICR_TYPER_PLPIS )
> +                    gicv3_rdist_init_lpis(ptr);
> +
>                  printk("GICv3: CPU%d: Found redistributor in region %d @%p\n",
>                          smp_processor_id(), i, ptr);
>                  return 0;
> diff --git a/xen/include/asm-arm/cache.h b/xen/include/asm-arm/cache.h
> index 2de6564..af96eee 100644
> --- a/xen/include/asm-arm/cache.h
> +++ b/xen/include/asm-arm/cache.h
> @@ -7,6 +7,10 @@
>  #define L1_CACHE_SHIFT  (CONFIG_ARM_L1_CACHE_SHIFT)
>  #define L1_CACHE_BYTES  (1 << L1_CACHE_SHIFT)
>  
> +#ifndef __ASSEMBLY__
> +void __flush_dcache_area(const void *vaddr, unsigned long size);
> +#endif
> +
>  #define __read_mostly __section(".data.read_mostly")
>  
>  #endif
> diff --git a/xen/include/asm-arm/gic-its.h b/xen/include/asm-arm/gic-its.h
> index 2f5c51c..48c6c78 100644
> --- a/xen/include/asm-arm/gic-its.h
> +++ b/xen/include/asm-arm/gic-its.h
> @@ -36,12 +36,32 @@ extern struct list_head host_its_list;
>  /* Parse the host DT and pick up all host ITSes. */
>  void gicv3_its_dt_init(const struct dt_device_node *node);
>  
> +/* Allocate and initialize tables for each host redistributor.
> + * Returns the respective {PROP,PEND}BASER register value.
> + */
> +uint64_t gicv3_lpi_get_proptable(void);
> +uint64_t gicv3_lpi_allocate_pendtable(void);
> +
> +/* Initialize the host structures for LPIs. */
> +int gicv3_lpi_init_host_lpis(int nr_lpis);
> +
>  #else
>  
>  static inline void gicv3_its_dt_init(const struct dt_device_node *node)
>  {
>  }
> -
> +static inline uint64_t gicv3_lpi_get_proptable(void)
> +{
> +    return 0;
> +}
> +static inline uint64_t gicv3_lpi_allocate_pendtable(void)
> +{
> +    return 0;
> +}
> +static inline int gicv3_lpi_init_host_lpis(int nr_lpis)
> +{
> +    return 0;
> +}
>  #endif /* CONFIG_HAS_ITS */
>  
>  #endif /* __ASSEMBLY__ */
> diff --git a/xen/include/asm-arm/gic_v3_defs.h b/xen/include/asm-arm/gic_v3_defs.h
> index 6bd25a5..da5fb77 100644
> --- a/xen/include/asm-arm/gic_v3_defs.h
> +++ b/xen/include/asm-arm/gic_v3_defs.h
> @@ -44,7 +44,8 @@
>  #define GICC_SRE_EL2_ENEL1           (1UL << 3)
>  
>  /* Additional bits in GICD_TYPER defined by GICv3 */
> -#define GICD_TYPE_ID_BITS_SHIFT 19
> +#define GICD_TYPE_ID_BITS_SHIFT      19
> +#define GICD_TYPE_LPIS               (1U << 17)
>  
>  #define GICD_CTLR_RWP                (1UL << 31)
>  #define GICD_CTLR_ARE_NS             (1U << 4)
> @@ -95,12 +96,57 @@
>  #define GICR_IGRPMODR0               (0x0D00)
>  #define GICR_NSACR                   (0x0E00)
>  
> +#define GICR_CTLR_ENABLE_LPIS        (1U << 0)
>  #define GICR_TYPER_PLPIS             (1U << 0)
>  #define GICR_TYPER_VLPIS             (1U << 1)
>  #define GICR_TYPER_LAST              (1U << 4)
>  
> +#define GIC_BASER_CACHE_nCnB         0ULL
> +#define GIC_BASER_CACHE_SameAsInner  0ULL
> +#define GIC_BASER_CACHE_nC           1ULL
> +#define GIC_BASER_CACHE_RaWt         2ULL
> +#define GIC_BASER_CACHE_RaWb         3ULL
> +#define GIC_BASER_CACHE_WaWt         4ULL
> +#define GIC_BASER_CACHE_WaWb         5ULL
> +#define GIC_BASER_CACHE_RaWaWt       6ULL
> +#define GIC_BASER_CACHE_RaWaWb       7ULL
> +#define GIC_BASER_CACHE_MASK         7ULL
> +#define GIC_BASER_NonShareable       0ULL
> +#define GIC_BASER_InnerShareable     1ULL
> +#define GIC_BASER_OuterShareable     2ULL
> +
> +#define GICR_PROPBASER_SHAREABILITY_SHIFT               10
> +#define GICR_PROPBASER_INNER_CACHEABILITY_SHIFT         7
> +#define GICR_PROPBASER_OUTER_CACHEABILITY_SHIFT         56
> +#define GICR_PROPBASER_SHAREABILITY_MASK                     \
> +        (3UL << GICR_PROPBASER_SHAREABILITY_SHIFT)
> +#define GICR_PROPBASER_INNER_CACHEABILITY_MASK               \
> +        (7UL << GICR_PROPBASER_INNER_CACHEABILITY_SHIFT)
> +#define GICR_PROPBASER_OUTER_CACHEABILITY_MASK               \
> +        (7UL << GICR_PROPBASER_OUTER_CACHEABILITY_SHIFT)
> +#define PROPBASER_RES0_MASK                                  \
> +        (GENMASK(63, 59) | GENMASK(55, 52) | GENMASK(6, 5))
> +
> +#define GICR_PENDBASER_SHAREABILITY_SHIFT               10
> +#define GICR_PENDBASER_INNER_CACHEABILITY_SHIFT         7
> +#define GICR_PENDBASER_OUTER_CACHEABILITY_SHIFT         56
> +#define GICR_PENDBASER_SHAREABILITY_MASK                     \
> +	(3UL << GICR_PENDBASER_SHAREABILITY_SHIFT)
> +#define GICR_PENDBASER_INNER_CACHEABILITY_MASK               \
> +	(7UL << GICR_PENDBASER_INNER_CACHEABILITY_SHIFT)
> +#define GICR_PENDBASER_OUTER_CACHEABILITY_MASK               \
> +        (7UL << GICR_PENDBASER_OUTER_CACHEABILITY_SHIFT)
> +#define GICR_PENDBASER_PTZ                              BIT(62)
> +#define PENDBASER_RES0_MASK                                  \
> +        (BIT(63) | GENMASK(61, 59) | GENMASK(55, 52) |       \
> +         GENMASK(15, 12) | GENMASK(6, 0))
> +
>  #define DEFAULT_PMR_VALUE            0xff
>  
> +#define LPI_PROP_DEFAULT_PRIO        0xa0
> +#define LPI_PROP_RES1                (1 << 1)
> +#define LPI_PROP_ENABLED             (1 << 0)
> +
>  #define GICH_VMCR_EOI                (1 << 9)
>  #define GICH_VMCR_VENG1              (1 << 1)
>  
> -- 
> 2.9.0
> 



* Re: [RFC PATCH 01/24] ARM: GICv3 ITS: parse and store ITS subnodes from hardware DT
  2016-09-28 18:24 ` [RFC PATCH 01/24] ARM: GICv3 ITS: parse and store ITS subnodes from hardware DT Andre Przywara
@ 2016-10-26  1:11   ` Stefano Stabellini
  2016-11-01 15:13   ` Julien Grall
  1 sibling, 0 replies; 144+ messages in thread
From: Stefano Stabellini @ 2016-10-26  1:11 UTC (permalink / raw)
  To: Andre Przywara; +Cc: xen-devel, Julien Grall, Stefano Stabellini

On Wed, 28 Sep 2016, Andre Przywara wrote:
> Parse the DT GIC subnodes to find every ITS MSI controller the hardware
> offers. Store that information in a list to both propagate all of them
> later to Dom0, but also to be able to iterate over all ITSes.
> This introduces an ITS Kconfig option.
> 
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>

Looks good to me


>  xen/arch/arm/Kconfig          |  5 ++++
>  xen/arch/arm/Makefile         |  1 +
>  xen/arch/arm/gic-its.c        | 67 +++++++++++++++++++++++++++++++++++++++++++
>  xen/arch/arm/gic-v3.c         |  6 ++++
>  xen/include/asm-arm/gic-its.h | 57 ++++++++++++++++++++++++++++++++++++
>  5 files changed, 136 insertions(+)
>  create mode 100644 xen/arch/arm/gic-its.c
>  create mode 100644 xen/include/asm-arm/gic-its.h
> 
> diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
> index 797c91f..9fe3b8e 100644
> --- a/xen/arch/arm/Kconfig
> +++ b/xen/arch/arm/Kconfig
> @@ -45,6 +45,11 @@ config ACPI
>  config HAS_GICV3
>  	bool
>  
> +config HAS_ITS
> +        bool "GICv3 ITS MSI controller support"
> +        depends on ARM_64
> +        depends on HAS_GICV3
> +
>  config ALTERNATIVE
>  	bool
>  
> diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
> index 64fdf41..c2c4daa 100644
> --- a/xen/arch/arm/Makefile
> +++ b/xen/arch/arm/Makefile
> @@ -18,6 +18,7 @@ obj-$(EARLY_PRINTK) += early_printk.o
>  obj-y += gic.o
>  obj-y += gic-v2.o
>  obj-$(CONFIG_HAS_GICV3) += gic-v3.o
> +obj-$(CONFIG_HAS_ITS) += gic-its.o
>  obj-y += guestcopy.o
>  obj-y += hvm.o
>  obj-y += io.o
> diff --git a/xen/arch/arm/gic-its.c b/xen/arch/arm/gic-its.c
> new file mode 100644
> index 0000000..0f42a77
> --- /dev/null
> +++ b/xen/arch/arm/gic-its.c
> @@ -0,0 +1,67 @@
> +/*
> + * xen/arch/arm/gic-its.c
> + *
> + * ARM Generic Interrupt Controller ITS support
> + *
> + * Copyright (C) 2016 - ARM Ltd
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + */
> +
> +#include <xen/config.h>
> +#include <xen/lib.h>
> +#include <xen/device_tree.h>
> +#include <xen/libfdt/libfdt.h>
> +#include <asm/gic.h>
> +#include <asm/gic_v3_defs.h>
> +#include <asm/gic-its.h>
> +
> +void gicv3_its_dt_init(const struct dt_device_node *node)
> +{
> +    const struct dt_device_node *its = NULL;
> +    struct host_its *its_data;
> +
> +    /*
> +     * Check for ITS MSI subnodes. If any, add the ITS register
> +     * frames to the ITS list.
> +     */
> +    dt_for_each_child_node(node, its)
> +    {
> +        paddr_t addr, size;
> +
> +        if ( !dt_device_is_compatible(its, "arm,gic-v3-its") )
> +            continue;
> +
> +        if ( dt_device_get_address(its, 0, &addr, &size) )
> +            panic("GICv3: Cannot find a valid ITS frame address");
> +
> +        its_data = xzalloc(struct host_its);
> +        if ( !its_data )
> +            panic("GICv3: Cannot allocate memory for ITS frame");
> +
> +        its_data->addr = addr;
> +        its_data->size = size;
> +        its_data->dt_node = its;
> +
> +        printk("GICv3: Found ITS @0x%lx\n", addr);
> +
> +        list_add_tail(&its_data->entry, &host_its_list);
> +    }
> +}
> +
> +/*
> + * Local variables:
> + * mode: C
> + * c-file-style: "BSD"
> + * c-basic-offset: 4
> + * indent-tabs-mode: nil
> + * End:
> + */
> diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
> index b8be395..238da84 100644
> --- a/xen/arch/arm/gic-v3.c
> +++ b/xen/arch/arm/gic-v3.c
> @@ -43,9 +43,12 @@
>  #include <asm/device.h>
>  #include <asm/gic.h>
>  #include <asm/gic_v3_defs.h>
> +#include <asm/gic-its.h>
>  #include <asm/cpufeature.h>
>  #include <asm/acpi.h>
>  
> +LIST_HEAD(host_its_list);
> +
>  /* Global state */
>  static struct {
>      void __iomem *map_dbase;  /* Mapped address of distributor registers */
> @@ -1229,6 +1232,9 @@ static void __init gicv3_dt_init(void)
>  
>      dt_device_get_address(node, 1 + gicv3.rdist_count + 2,
>                            &vbase, &vsize);
> +
> +    /* Check for ITS child nodes and build the host ITS list accordingly. */
> +    gicv3_its_dt_init(node);
>  }
>  
>  static int gicv3_iomem_deny_access(const struct domain *d)
> diff --git a/xen/include/asm-arm/gic-its.h b/xen/include/asm-arm/gic-its.h
> new file mode 100644
> index 0000000..2f5c51c
> --- /dev/null
> +++ b/xen/include/asm-arm/gic-its.h
> @@ -0,0 +1,57 @@
> +/*
> + * ARM GICv3 ITS support
> + *
> + * Andre Przywara <andre.przywara@arm.com>
> + * Copyright (c) 2016 ARM Ltd.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + */
> +
> +#ifndef __ASM_ARM_ITS_H__
> +#define __ASM_ARM_ITS_H__
> +
> +#ifndef __ASSEMBLY__
> +#include <xen/device_tree.h>
> +
> +/* data structure for each hardware ITS */
> +struct host_its {
> +    struct list_head entry;
> +    const struct dt_device_node *dt_node;
> +    paddr_t addr;
> +    paddr_t size;
> +};
> +
> +extern struct list_head host_its_list;
> +
> +#ifdef CONFIG_HAS_ITS
> +
> +/* Parse the host DT and pick up all host ITSes. */
> +void gicv3_its_dt_init(const struct dt_device_node *node);
> +
> +#else
> +
> +static inline void gicv3_its_dt_init(const struct dt_device_node *node)
> +{
> +}
> +
> +#endif /* CONFIG_HAS_ITS */
> +
> +#endif /* __ASSEMBLY__ */
> +#endif
> +
> +/*
> + * Local variables:
> + * mode: C
> + * c-file-style: "BSD"
> + * c-basic-offset: 4
> + * indent-tabs-mode: nil
> + * End:
> + */
> -- 
> 2.9.0
> 

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [RFC PATCH 03/24] ARM: GICv3 ITS: allocate device and collection table
  2016-09-28 18:24 ` [RFC PATCH 03/24] ARM: GICv3 ITS: allocate device and collection table Andre Przywara
  2016-10-09 13:55   ` Vijay Kilari
  2016-10-24 14:30   ` Vijay Kilari
@ 2016-10-26 22:57   ` Stefano Stabellini
  2016-11-01 17:34     ` Julien Grall
  2016-11-10 15:32     ` Andre Przywara
  2016-11-01 18:19   ` Julien Grall
  3 siblings, 2 replies; 144+ messages in thread
From: Stefano Stabellini @ 2016-10-26 22:57 UTC (permalink / raw)
  To: Andre Przywara; +Cc: xen-devel, Julien Grall, Stefano Stabellini

On Wed, 28 Sep 2016, Andre Przywara wrote:
> Each ITS maps a pair of a DeviceID (usually the PCI b/d/f triplet) and
> an EventID (the MSI payload or interrupt ID) to a pair of LPI number
> and collection ID, which points to the target CPU.
> This mapping is stored in the device and collection tables, which software
> has to provide for the ITS to use.
> Allocate the required memory and hand it to the ITS.
> We limit the number of devices to cover 4 PCI buses for now.
> 
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> ---
>  xen/arch/arm/gic-its.c        | 114 ++++++++++++++++++++++++++++++++++++++++++
>  xen/arch/arm/gic-v3.c         |   5 ++
>  xen/include/asm-arm/gic-its.h |  49 +++++++++++++++++-
>  3 files changed, 167 insertions(+), 1 deletion(-)
> 
> diff --git a/xen/arch/arm/gic-its.c b/xen/arch/arm/gic-its.c
> index b52dff3..40238a2 100644
> --- a/xen/arch/arm/gic-its.c
> +++ b/xen/arch/arm/gic-its.c
> @@ -21,6 +21,7 @@
>  #include <xen/device_tree.h>
>  #include <xen/libfdt/libfdt.h>
>  #include <asm/p2m.h>
> +#include <asm/io.h>
>  #include <asm/gic.h>
>  #include <asm/gic_v3_defs.h>
>  #include <asm/gic-its.h>
> @@ -38,6 +39,119 @@ static DEFINE_PER_CPU(void *, pending_table);
>          min_t(unsigned int, lpi_data.host_lpi_bits, CONFIG_HOST_LPI_BITS)
>  #define MAX_HOST_LPIS   (BIT(MAX_HOST_LPI_BITS) - 8192)
>  
> +#define BASER_ATTR_MASK                                           \
> +        ((0x3UL << GITS_BASER_SHAREABILITY_SHIFT)               | \
> +         (0x7UL << GITS_BASER_OUTER_CACHEABILITY_SHIFT)         | \
> +         (0x7UL << GITS_BASER_INNER_CACHEABILITY_SHIFT))
> +#define BASER_RO_MASK   (GENMASK(52, 48) | GENMASK(58, 56))
> +
> +static uint64_t encode_phys_addr(paddr_t addr, int page_bits)
> +{
> +    uint64_t ret;
> +
> +    if ( page_bits < 16)
> +        return (uint64_t)addr & GENMASK(47, page_bits);
> +
> +    ret = addr & GENMASK(47, 16);
> +    return ret | (addr & GENMASK(51, 48)) >> (48 - 12);
> +}
> +
> +static int gicv3_map_baser(void __iomem *basereg, uint64_t regc, int nr_items)

Shouldn't this be called its_map_baser?


> +{
> +    uint64_t attr;
> +    int entry_size = (regc >> GITS_BASER_ENTRY_SIZE_SHIFT) & 0x1f;

The spec says "This field is read-only and specifies the number of
bytes per entry, minus one." Do we need to increment it by 1?


> +    int pagesz;
> +    int order;
> +    void *buffer = NULL;
> +
> +    attr  = GIC_BASER_InnerShareable << GITS_BASER_SHAREABILITY_SHIFT;
> +    attr |= GIC_BASER_CACHE_SameAsInner << GITS_BASER_OUTER_CACHEABILITY_SHIFT;
> +    attr |= GIC_BASER_CACHE_RaWaWb << GITS_BASER_INNER_CACHEABILITY_SHIFT;
> +
> +    /*
> +     * Loop over the page sizes (4K, 16K, 64K) to find out what the host
> +     * supports.
> +     */

Is this really the best way to do it? Can't we assume ITS supports 4K,
given that Xen requires 4K pages at the moment? Is it actually possible
to find hardware that supports 4K but with an ITS that only support 64K
or 16K pages? It seems insane to me. Otherwise can't we probe the page
size somehow?


> +    for (pagesz = 0; pagesz < 3; pagesz++)
> +    {
> +        uint64_t reg;
> +        int nr_bytes;
> +
> +        nr_bytes = ROUNDUP(nr_items * entry_size, BIT(pagesz * 2 + 12));
> +        order = get_order_from_bytes(nr_bytes);
> +
> +        if ( !buffer )
> +            buffer = alloc_xenheap_pages(order, 0);
> +        if ( !buffer )
> +            return -ENOMEM;
> +
> +        reg  = attr;
> +        reg |= (pagesz << GITS_BASER_PAGE_SIZE_SHIFT);
> +        reg |= nr_bytes >> (pagesz * 2 + 12);
> +        reg |= regc & BASER_RO_MASK;
> +        reg |= GITS_BASER_VALID;
> +        reg |= encode_phys_addr(virt_to_maddr(buffer), pagesz * 2 + 12);
> +
> +        writeq_relaxed(reg, basereg);
> +        regc = readl_relaxed(basereg);
> +
> +        /* The host didn't like our attributes, just use what it returned. */
> +        if ( (regc & BASER_ATTR_MASK) != attr )
> +            attr = regc & BASER_ATTR_MASK;
> +
> +        /* If the host accepted our page size, we are done. */
> +        if ( (reg & (3UL << GITS_BASER_PAGE_SIZE_SHIFT)) == pagesz )
> +            return 0;
> +
> +        /* Check whether our buffer is aligned to the next page size already. */
> +        if ( !(virt_to_maddr(buffer) & (BIT(pagesz * 2 + 12 + 2) - 1)) )
> +        {
> +            free_xenheap_pages(buffer, order);
> +            buffer = NULL;
> +        }
> +    }
> +
> +    if ( buffer )
> +        free_xenheap_pages(buffer, order);
> +
> +    return -EINVAL;
> +}
> +
> +int gicv3_its_init(struct host_its *hw_its)
> +{
> +    uint64_t reg;
> +    int i;
> +
> +    hw_its->its_base = ioremap_nocache(hw_its->addr, hw_its->size);
> +    if ( !hw_its->its_base )
> +        return -ENOMEM;
> +
> +    for (i = 0; i < 8; i++)

Code style. Unfortunately we don't have a script to check, but please
refer to CODING_STYLE. I'd prefer if every number was #define'ed,
including `8' (something like GITS_BASER_MAX).


> +    {
> +        void __iomem *basereg = hw_its->its_base + GITS_BASER0 + i * 8;
> +        int type;
> +
> +        reg = readq_relaxed(basereg);
> +        type = (reg >> 56) & 0x7;

Please #define 56 and 0x7


> +        switch ( type )
> +        {
> +        case GITS_BASER_TYPE_NONE:
> +            continue;
> +        case GITS_BASER_TYPE_DEVICE:
> +            /* TODO: find some better way of limiting the number of devices */
> +            gicv3_map_baser(basereg, reg, 1024);

A hardcoded max value might be OK, but please #define it.


> +            break;
> +        case GITS_BASER_TYPE_COLLECTION:
> +            gicv3_map_baser(basereg, reg, NR_CPUS);
> +            break;
> +        default:
> +            continue;
> +        }
> +    }
> +
> +    return 0;
> +}
> +
>  uint64_t gicv3_lpi_allocate_pendtable(void)
>  {
>      uint64_t reg, attr;
> diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
> index 2534aa5..5cf4618 100644
> --- a/xen/arch/arm/gic-v3.c
> +++ b/xen/arch/arm/gic-v3.c
> @@ -29,6 +29,7 @@
>  #include <xen/irq.h>
>  #include <xen/iocap.h>
>  #include <xen/sched.h>
> +#include <xen/err.h>
>  #include <xen/errno.h>
>  #include <xen/delay.h>
>  #include <xen/device_tree.h>
> @@ -1548,6 +1549,7 @@ static int __init gicv3_init(void)
>  {
>      int res, i;
>      uint32_t reg;
> +    struct host_its *hw_its;
>  
>      if ( !cpu_has_gicv3 )
>      {
> @@ -1603,6 +1605,9 @@ static int __init gicv3_init(void)
>      res = gicv3_cpu_init();
>      gicv3_hyp_init();
>  
> +    list_for_each_entry(hw_its, &host_its_list, entry)
> +        gicv3_its_init(hw_its);
> +
>      spin_unlock(&gicv3.lock);
>  
>      return res;
> diff --git a/xen/include/asm-arm/gic-its.h b/xen/include/asm-arm/gic-its.h
> index 48c6c78..589b889 100644
> --- a/xen/include/asm-arm/gic-its.h
> +++ b/xen/include/asm-arm/gic-its.h
> @@ -18,6 +18,47 @@
>  #ifndef __ASM_ARM_ITS_H__
>  #define __ASM_ARM_ITS_H__
>  
> +#define LPI_OFFSET      8192
> +
> +#define GITS_CTLR       (0x000)
> +#define GITS_IIDR       (0x004)
> +#define GITS_TYPER      (0x008)
> +#define GITS_CBASER     (0x080)
> +#define GITS_CWRITER    (0x088)
> +#define GITS_CREADR     (0x090)
> +#define GITS_BASER0     (0x100)
> +#define GITS_BASER1     (0x108)
> +#define GITS_BASER2     (0x110)
> +#define GITS_BASER3     (0x118)
> +#define GITS_BASER4     (0x120)
> +#define GITS_BASER5     (0x128)
> +#define GITS_BASER6     (0x130)
> +#define GITS_BASER7     (0x138)
> +
> +/* Register bits */
> +#define GITS_CTLR_ENABLE     0x1
> +#define GITS_IIDR_VALUE      0x34c
> +
> +#define GITS_BASER_VALID                BIT(63)
> +#define GITS_BASER_INDIRECT             BIT(62)
> +#define GITS_BASER_INNER_CACHEABILITY_SHIFT        59
> +#define GITS_BASER_TYPE_SHIFT           56
> +#define GITS_BASER_OUTER_CACHEABILITY_SHIFT        53
> +#define GITS_BASER_TYPE_NONE            0UL
> +#define GITS_BASER_TYPE_DEVICE          1UL
> +#define GITS_BASER_TYPE_VCPU            2UL
> +#define GITS_BASER_TYPE_CPU             3UL
> +#define GITS_BASER_TYPE_COLLECTION      4UL
> +#define GITS_BASER_TYPE_RESERVED5       5UL
> +#define GITS_BASER_TYPE_RESERVED6       6UL
> +#define GITS_BASER_TYPE_RESERVED7       7UL
> +#define GITS_BASER_ENTRY_SIZE_SHIFT     48
> +#define GITS_BASER_SHAREABILITY_SHIFT   10
> +#define GITS_BASER_PAGE_SIZE_SHIFT      8
> +#define GITS_BASER_RO_MASK              ((7UL << GITS_BASER_TYPE_SHIFT) | \
> +                                        (31UL << GITS_BASER_ENTRY_SIZE_SHIFT) |\
> +                                        GITS_BASER_INDIRECT)
> +
>  #ifndef __ASSEMBLY__
>  #include <xen/device_tree.h>
>  
> @@ -27,6 +68,7 @@ struct host_its {
>      const struct dt_device_node *dt_node;
>      paddr_t addr;
>      paddr_t size;
> +    void __iomem *its_base;
>  };
>  
>  extern struct list_head host_its_list;
> @@ -42,8 +84,9 @@ void gicv3_its_dt_init(const struct dt_device_node *node);
>  uint64_t gicv3_lpi_get_proptable(void);
>  uint64_t gicv3_lpi_allocate_pendtable(void);
>  
> -/* Initialize the host structures for LPIs. */
> +/* Initialize the host structures for LPIs and the host ITSes. */
>  int gicv3_lpi_init_host_lpis(int nr_lpis);
> +int gicv3_its_init(struct host_its *hw_its);
>  
>  #else
>  
> @@ -62,6 +105,10 @@ static inline int gicv3_lpi_init_host_lpis(int nr_lpis)
>  {
>      return 0;
>  }
> +static inline int gicv3_its_init(struct host_its *hw_its)
> +{
> +    return 0;
> +}
>  #endif /* CONFIG_HAS_ITS */
>  
>  #endif /* __ASSEMBLY__ */
> -- 
> 2.9.0
> 


* Re: [RFC PATCH 04/24] ARM: GICv3 ITS: map ITS command buffer
  2016-09-28 18:24 ` [RFC PATCH 04/24] ARM: GICv3 ITS: map ITS command buffer Andre Przywara
  2016-10-24 14:31   ` Vijay Kilari
@ 2016-10-26 23:03   ` Stefano Stabellini
  2016-11-10 16:04     ` Andre Przywara
  2016-11-02 13:38   ` Julien Grall
  2 siblings, 1 reply; 144+ messages in thread
From: Stefano Stabellini @ 2016-10-26 23:03 UTC (permalink / raw)
  To: Andre Przywara; +Cc: xen-devel, Julien Grall, Stefano Stabellini

On Wed, 28 Sep 2016, Andre Przywara wrote:
> Instead of directly manipulating the tables in memory, an ITS driver
> sends commands via a ring buffer to the ITS h/w to create or alter the
> LPI mappings.
> Allocate memory for that buffer and tell the ITS about it to be able
> to send ITS commands.
> 
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> ---
>  xen/arch/arm/gic-its.c        | 25 +++++++++++++++++++++++++
>  xen/include/asm-arm/gic-its.h |  1 +
>  2 files changed, 26 insertions(+)
> 
> diff --git a/xen/arch/arm/gic-its.c b/xen/arch/arm/gic-its.c
> index 40238a2..c8a7a7e 100644
> --- a/xen/arch/arm/gic-its.c
> +++ b/xen/arch/arm/gic-its.c
> @@ -18,6 +18,7 @@
>  
>  #include <xen/config.h>
>  #include <xen/lib.h>
> +#include <xen/err.h>
>  #include <xen/device_tree.h>
>  #include <xen/libfdt/libfdt.h>
>  #include <asm/p2m.h>
> @@ -56,6 +57,26 @@ static uint64_t encode_phys_addr(paddr_t addr, int page_bits)
>      return ret | (addr & GENMASK(51, 48)) >> (48 - 12);
>  }
>  
> +static void *gicv3_map_cbaser(void __iomem *cbasereg)

Shouldn't it be its_map_cbaser?


> +{
> +    uint64_t attr, reg;
> +    void *buffer;
> +
> +    attr  = GIC_BASER_InnerShareable << GITS_BASER_SHAREABILITY_SHIFT;
> +    attr |= GIC_BASER_CACHE_SameAsInner << GITS_BASER_OUTER_CACHEABILITY_SHIFT;
> +    attr |= GIC_BASER_CACHE_RaWaWb << GITS_BASER_INNER_CACHEABILITY_SHIFT;
> +
> +    buffer = alloc_xenheap_pages(0, 0);
> +    if ( !buffer )
> +        return ERR_PTR(-ENOMEM);

We haven't used ERR_PTR much on arm so far. In this case I'd just return
NULL.


> +
> +    /* We use exactly one 4K page, so the "Size" field is 0. */
> +    reg = attr | BIT(63) | (virt_to_maddr(buffer) & GENMASK(51, 12));

Shouldn't the mask be GENMASK(47, 12)? Maybe I have an old spec
version.


> +    writeq_relaxed(reg, cbasereg);
> +
> +    return buffer;
> +}
> +
>  static int gicv3_map_baser(void __iomem *basereg, uint64_t regc, int nr_items)
>  {
>      uint64_t attr;
> @@ -149,6 +170,10 @@ int gicv3_its_init(struct host_its *hw_its)
>          }
>      }
>  
> +    hw_its->cmd_buf = gicv3_map_cbaser(hw_its->its_base + GITS_CBASER);
> +    if ( IS_ERR(hw_its->cmd_buf) )
> +        return PTR_ERR(hw_its->cmd_buf);
> +
>      return 0;
>  }
>  
> diff --git a/xen/include/asm-arm/gic-its.h b/xen/include/asm-arm/gic-its.h
> index 589b889..b2a003f 100644
> --- a/xen/include/asm-arm/gic-its.h
> +++ b/xen/include/asm-arm/gic-its.h
> @@ -69,6 +69,7 @@ struct host_its {
>      paddr_t addr;
>      paddr_t size;
>      void __iomem *its_base;
> +    void *cmd_buf;
>  };
>  
>  extern struct list_head host_its_list;
> -- 
> 2.9.0
> 


* Re: [RFC PATCH 05/24] ARM: GICv3 ITS: introduce ITS command handling
  2016-09-28 18:24 ` [RFC PATCH 05/24] ARM: GICv3 ITS: introduce ITS command handling Andre Przywara
@ 2016-10-26 23:55   ` Stefano Stabellini
  2016-10-27 21:52     ` Stefano Stabellini
  2016-11-10 15:57     ` Andre Przywara
  2016-11-02 15:05   ` Julien Grall
  1 sibling, 2 replies; 144+ messages in thread
From: Stefano Stabellini @ 2016-10-26 23:55 UTC (permalink / raw)
  To: Andre Przywara; +Cc: xen-devel, Julien Grall, Stefano Stabellini

On Wed, 28 Sep 2016, Andre Przywara wrote:
> To be able to easily send commands to the ITS, create the respective
> wrapper functions, which take care of the ring buffer.
> The first two commands we implement provide methods to map a collection
> to a redistributor (aka host core) and to flush the command queue (SYNC).
> Start using these commands for mapping one collection to each host CPU.
> 
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> ---
>  xen/arch/arm/gic-its.c        | 101 ++++++++++++++++++++++++++++++++++++++++++
>  xen/arch/arm/gic-v3.c         |  17 +++++++
>  xen/include/asm-arm/gic-its.h |  32 +++++++++++++
>  3 files changed, 150 insertions(+)
> 
> diff --git a/xen/arch/arm/gic-its.c b/xen/arch/arm/gic-its.c
> index c8a7a7e..88397bc 100644
> --- a/xen/arch/arm/gic-its.c
> +++ b/xen/arch/arm/gic-its.c
> @@ -33,6 +33,10 @@ static struct {
>      int host_lpi_bits;
>  } lpi_data;
>  
> +/* Physical redistributor address */
> +static DEFINE_PER_CPU(uint64_t, rdist_addr);
> +/* Redistributor ID */
> +static DEFINE_PER_CPU(uint64_t, rdist_id);
>  /* Pending table for each redistributor */
>  static DEFINE_PER_CPU(void *, pending_table);
>  
> @@ -40,6 +44,86 @@ static DEFINE_PER_CPU(void *, pending_table);
>          min_t(unsigned int, lpi_data.host_lpi_bits, CONFIG_HOST_LPI_BITS)
>  #define MAX_HOST_LPIS   (BIT(MAX_HOST_LPI_BITS) - 8192)
>  
> +#define ITS_COMMAND_SIZE        32
> +
> +static int its_send_command(struct host_its *hw_its, void *its_cmd)
> +{
> +    int readp, writep;

uint64_t


> +    spin_lock(&hw_its->cmd_lock);
> +
> +    readp = readl_relaxed(hw_its->its_base + GITS_CREADR) & GENMASK(19, 5);
> +    writep = readl_relaxed(hw_its->its_base + GITS_CWRITER) & GENMASK(19, 5);

It might be worth adding

  #define ITS_CMD_RING_SIZE PAGE_SIZE

for clarity


> +    if ( ((writep + ITS_COMMAND_SIZE) % PAGE_SIZE) == readp )
> +    {
> +        spin_unlock(&hw_its->cmd_lock);
> +        return -EBUSY;
> +    }
> +
> +    memcpy(hw_its->cmd_buf + writep, its_cmd, ITS_COMMAND_SIZE);
> +    __flush_dcache_area(hw_its->cmd_buf + writep, ITS_COMMAND_SIZE);
> +    writep = (writep + ITS_COMMAND_SIZE) % PAGE_SIZE;
> +
> +    writeq_relaxed(writep & GENMASK(19, 5), hw_its->its_base + GITS_CWRITER);
> +
> +    spin_unlock(&hw_its->cmd_lock);
> +
> +    return 0;
> +}
> +
> +static uint64_t encode_rdbase(struct host_its *hw_its, int cpu, uint64_t reg)
> +{
> +    reg &= ~GENMASK(51, 16);
> +
> +    if ( hw_its->pta )
> +        reg |= per_cpu(rdist_addr, cpu) & GENMASK(51, 16);

Again, in my version of the spec it is GENMASK(47, 16).


> +    else
> +        reg |= per_cpu(rdist_id, cpu) << 16;
> +    return reg;
> +}
> +
> +static int its_send_cmd_sync(struct host_its *its, int cpu)
> +{
> +    uint64_t cmd[4];
> +
> +    cmd[0] = GITS_CMD_SYNC;
> +    cmd[1] = 0x00;
> +    cmd[2] = encode_rdbase(its, cpu, 0x0);
> +    cmd[3] = 0x00;
> +
> +    return its_send_command(its, cmd);
> +}
> +
> +static int its_send_cmd_mapc(struct host_its *its, int collection_id, int cpu)
> +{
> +    uint64_t cmd[4];
> +
> +    cmd[0] = GITS_CMD_MAPC;
> +    cmd[1] = 0x00;
> +    cmd[2] = encode_rdbase(its, cpu, (collection_id & GENMASK(15, 0)) | BIT(63));
> +    cmd[3] = 0x00;
> +
> +    return its_send_command(its, cmd);
> +}
> +
> +/* Set up the (1:1) collection mapping for the given host CPU. */
> +void gicv3_its_setup_collection(int cpu)
> +{
> +    struct host_its *its;
> +
> +    list_for_each_entry(its, &host_its_list, entry)
> +    {
> +        /* Only send commands to ITS that have been initialized already. */
> +        if ( !its->cmd_buf )
> +            continue;
> +
> +        its_send_cmd_mapc(its, cpu, cpu);
> +        its_send_cmd_sync(its, cpu);
> +    }
> +}
> +
>  #define BASER_ATTR_MASK                                           \
>          ((0x3UL << GITS_BASER_SHAREABILITY_SHIFT)               | \
>           (0x7UL << GITS_BASER_OUTER_CACHEABILITY_SHIFT)         | \
> @@ -147,6 +231,13 @@ int gicv3_its_init(struct host_its *hw_its)
>      if ( !hw_its->its_base )
>          return -ENOMEM;
>  
> +    /* Make sure the ITS is disabled before programming the BASE registers. */
> +    reg = readl_relaxed(hw_its->its_base + GITS_CTLR);
> +    writel_relaxed(reg & ~GITS_CTLR_ENABLE, hw_its->its_base + GITS_CTLR);
> +
> +    reg = readq_relaxed(hw_its->its_base + GITS_TYPER);
> +    hw_its->pta = reg & GITS_TYPER_PTA;

To avoid problems:

  pta = !!(reg & GITS_TYPER_PTA);


>      for (i = 0; i < 8; i++)
>      {
>          void __iomem *basereg = hw_its->its_base + GITS_BASER0 + i * 8;
> @@ -174,9 +265,18 @@ int gicv3_its_init(struct host_its *hw_its)
>      if ( IS_ERR(hw_its->cmd_buf) )
>          return PTR_ERR(hw_its->cmd_buf);
>  
> +    its_send_cmd_mapc(hw_its, smp_processor_id(), smp_processor_id());
> +    its_send_cmd_sync(hw_its, smp_processor_id());

Why do we need these two commands in addition to the ones issued by
gicv3_its_setup_collection?


>      return 0;
>  }
>  
> +void gicv3_set_redist_addr(paddr_t address, int redist_id)
> +{
> +    this_cpu(rdist_addr) = address;
> +    this_cpu(rdist_id) = redist_id;
> +}
> +
>  uint64_t gicv3_lpi_allocate_pendtable(void)
>  {
>      uint64_t reg, attr;
> @@ -265,6 +365,7 @@ void gicv3_its_dt_init(const struct dt_device_node *node)
>          its_data->addr = addr;
>          its_data->size = size;
>          its_data->dt_node = its;
> +        spin_lock_init(&its_data->cmd_lock);
>  
>          printk("GICv3: Found ITS @0x%lx\n", addr);
>  
> diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
> index 5cf4618..b9387a3 100644
> --- a/xen/arch/arm/gic-v3.c
> +++ b/xen/arch/arm/gic-v3.c
> @@ -638,6 +638,8 @@ static void gicv3_rdist_init_lpis(void __iomem * rdist_base)
>      table_reg = gicv3_lpi_get_proptable();
>      if ( table_reg )
>          writeq_relaxed(table_reg, rdist_base + GICR_PROPBASER);
> +
> +    gicv3_its_setup_collection(smp_processor_id());
>  }
>  
>  static int __init gicv3_populate_rdist(void)
> @@ -684,7 +686,22 @@ static int __init gicv3_populate_rdist(void)
>                  this_cpu(rbase) = ptr;
>  
>                  if ( typer & GICR_TYPER_PLPIS )
> +                {
> +                    paddr_t rdist_addr;
> +
> +                    rdist_addr = gicv3.rdist_regions[i].base;
> +                    rdist_addr += ptr - gicv3.rdist_regions[i].map_base;
> +
> +                    /* The ITS refers to redistributors either by their physical
> +                     * address or by their ID. Determine those two values and
> +                     * let the ITS code store them in per host CPU variables to
> +                     * later be able to address those redistributors.
> +                     */
> +                    gicv3_set_redist_addr(rdist_addr,
> +                                          (typer >> 8) & GENMASK(15, 0));

Please #define the 8


>                      gicv3_rdist_init_lpis(ptr);
> +                }
>  
>                  printk("GICv3: CPU%d: Found redistributor in region %d @%p\n",
>                          smp_processor_id(), i, ptr);
> diff --git a/xen/include/asm-arm/gic-its.h b/xen/include/asm-arm/gic-its.h
> index b2a003f..b49d274 100644
> --- a/xen/include/asm-arm/gic-its.h
> +++ b/xen/include/asm-arm/gic-its.h
> @@ -37,6 +37,7 @@
>  
>  /* Register bits */
>  #define GITS_CTLR_ENABLE     0x1
> +#define GITS_TYPER_PTA       BIT(19)
>  #define GITS_IIDR_VALUE      0x34c
>  
>  #define GITS_BASER_VALID                BIT(63)
> @@ -59,6 +60,22 @@
>                                          (31UL << GITS_BASER_ENTRY_SIZE_SHIFT) |\
>                                          GITS_BASER_INDIRECT)
>  
> +/* ITS command definitions */
> +#define ITS_CMD_SIZE                    32
> +
> +#define GITS_CMD_MOVI                   0x01
> +#define GITS_CMD_INT                    0x03
> +#define GITS_CMD_CLEAR                  0x04
> +#define GITS_CMD_SYNC                   0x05
> +#define GITS_CMD_MAPD                   0x08
> +#define GITS_CMD_MAPC                   0x09
> +#define GITS_CMD_MAPTI                  0x0a

In my version of the spec (PRD03-GENC-010745 24.0) 0x0a is MAPVI.


> +#define GITS_CMD_MAPI                   0x0b
> +#define GITS_CMD_INV                    0x0c
> +#define GITS_CMD_INVALL                 0x0d
> +#define GITS_CMD_MOVALL                 0x0e
> +#define GITS_CMD_DISCARD                0x0f
> +
>  #ifndef __ASSEMBLY__
>  #include <xen/device_tree.h>
>  
> @@ -69,7 +86,9 @@ struct host_its {
>      paddr_t addr;
>      paddr_t size;
>      void __iomem *its_base;
> +    spinlock_t cmd_lock;
>      void *cmd_buf;
> +    bool pta;
>  };
>  
>  extern struct list_head host_its_list;
> @@ -89,6 +108,12 @@ uint64_t gicv3_lpi_allocate_pendtable(void);
>  int gicv3_lpi_init_host_lpis(int nr_lpis);
>  int gicv3_its_init(struct host_its *hw_its);
>  
> +/* Set the physical address and ID for each redistributor as read from DT. */
> +void gicv3_set_redist_addr(paddr_t address, int redist_id);
> +
> +/* Map a collection for this host CPU to each host ITS. */
> +void gicv3_its_setup_collection(int cpu);
> +
>  #else
>  
>  static inline void gicv3_its_dt_init(const struct dt_device_node *node)
> @@ -110,6 +135,13 @@ static inline int gicv3_its_init(struct host_its *hw_its)
>  {
>      return 0;
>  }
> +static inline void gicv3_set_redist_addr(paddr_t address, int redist_id)
> +{
> +}
> +static inline void gicv3_its_setup_collection(int cpu)
> +{
> +}
> +
>  #endif /* CONFIG_HAS_ITS */
>  
>  #endif /* __ASSEMBLY__ */


* Re: [RFC PATCH 05/24] ARM: GICv3 ITS: introduce ITS command handling
  2016-10-26 23:55   ` Stefano Stabellini
@ 2016-10-27 21:52     ` Stefano Stabellini
  2016-11-10 15:57     ` Andre Przywara
  1 sibling, 0 replies; 144+ messages in thread
From: Stefano Stabellini @ 2016-10-27 21:52 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: Andre Przywara, Julien Grall, xen-devel

On Wed, 26 Oct 2016, Stefano Stabellini wrote:
> > +/* ITS command definitions */
> > +#define ITS_CMD_SIZE                    32
> > +
> > +#define GITS_CMD_MOVI                   0x01
> > +#define GITS_CMD_INT                    0x03
> > +#define GITS_CMD_CLEAR                  0x04
> > +#define GITS_CMD_SYNC                   0x05
> > +#define GITS_CMD_MAPD                   0x08
> > +#define GITS_CMD_MAPC                   0x09
> > +#define GITS_CMD_MAPTI                  0x0a
> 
> In my version of the spec (PRD03-GENC-010745 24.0) 0x0a is MAPVI.

For reference, I had an older version of the spec. I found the new
version, which confirms Andre's numbers.


* Re: [RFC PATCH 06/24] ARM: GICv3 ITS: introduce host LPI array
  2016-09-28 18:24 ` [RFC PATCH 06/24] ARM: GICv3 ITS: introduce host LPI array Andre Przywara
@ 2016-10-27 22:59   ` Stefano Stabellini
  2016-11-02 15:14     ` Julien Grall
  2016-11-10 17:22     ` Andre Przywara
  0 siblings, 2 replies; 144+ messages in thread
From: Stefano Stabellini @ 2016-10-27 22:59 UTC (permalink / raw)
  To: Andre Przywara; +Cc: xen-devel, Julien Grall, Stefano Stabellini

On Wed, 28 Sep 2016, Andre Przywara wrote:
> The number of LPIs on a host can be potentially huge (millions),
> although in practice it will mostly be reasonable. So prematurely allocating
> an array of struct irq_desc's for each LPI is not an option.
> However Xen itself does not care about LPIs, as every LPI will be injected
> into a guest (Dom0 for now).
> Create a dense data structure (8 Bytes) for each LPI which holds just
> enough information to determine the virtual IRQ number and the VCPU into
> which the LPI needs to be injected.
> Also to not artificially limit the number of LPIs, we create a 2-level
> table for holding those structures.
> This patch introduces functions to initialize these tables and to
> create, lookup and destroy entries for a given LPI.
> We allocate and access LPI information in a way that does not require
> a lock.
> 
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> ---
>  xen/arch/arm/gic-its.c        | 154 ++++++++++++++++++++++++++++++++++++++++++
>  xen/include/asm-arm/gic-its.h |  18 +++++
>  2 files changed, 172 insertions(+)
> 
> diff --git a/xen/arch/arm/gic-its.c b/xen/arch/arm/gic-its.c
> index 88397bc..2140e4a 100644
> --- a/xen/arch/arm/gic-its.c
> +++ b/xen/arch/arm/gic-its.c
> @@ -18,18 +18,31 @@
>  
>  #include <xen/config.h>
>  #include <xen/lib.h>
> +#include <xen/sched.h>
>  #include <xen/err.h>
>  #include <xen/device_tree.h>
>  #include <xen/libfdt/libfdt.h>
>  #include <asm/p2m.h>
> +#include <asm/domain.h>
>  #include <asm/io.h>
>  #include <asm/gic.h>
>  #include <asm/gic_v3_defs.h>
>  #include <asm/gic-its.h>
>  
> +/* LPIs on the host always go to a guest, so no struct irq_desc for them. */
> +union host_lpi {
> +    uint64_t data;
> +    struct {
> +        uint64_t virt_lpi:32;
> +        uint64_t dom_id:16;
> +        uint64_t vcpu_id:16;
> +    };
> +};

Why not the following?

  union host_lpi {
      uint64_t data;
      struct {
          uint32_t virt_lpi;
          uint16_t dom_id;
          uint16_t vcpu_id;
      };
  };


>  /* Global state */
>  static struct {
>      uint8_t *lpi_property;
> +    union host_lpi **host_lpis;
>      int host_lpi_bits;
>  } lpi_data;
>  
> @@ -43,6 +56,26 @@ static DEFINE_PER_CPU(void *, pending_table);
>  #define MAX_HOST_LPI_BITS                                                \
>          min_t(unsigned int, lpi_data.host_lpi_bits, CONFIG_HOST_LPI_BITS)
>  #define MAX_HOST_LPIS   (BIT(MAX_HOST_LPI_BITS) - 8192)
> +#define HOST_LPIS_PER_PAGE      (PAGE_SIZE / sizeof(union host_lpi))
> +
> +static union host_lpi *gic_find_host_lpi(uint32_t lpi, struct domain *d)

I take it "lpi" is the physical LPI here. Maybe we should rename it to "plpi"
for clarity.


> +{
> +    union host_lpi *hlpi;
> +
> +    if ( lpi < 8192 || lpi >= MAX_HOST_LPIS + 8192 )
> +        return NULL;
> +
> +    lpi -= 8192;
> +    if ( !lpi_data.host_lpis[lpi / HOST_LPIS_PER_PAGE] )
> +        return NULL;
> +
> +    hlpi = &lpi_data.host_lpis[lpi / HOST_LPIS_PER_PAGE][lpi % HOST_LPIS_PER_PAGE];

I realize I am sometimes obsessive about this, but division operations
are expensive and this is on the hot path, so I would do:

#define HOST_LPIS_PER_PAGE      (PAGE_SIZE >> 3)

unsigned int table = lpi / HOST_LPIS_PER_PAGE;

then use table throughout this function.


> +    if ( d && hlpi->dom_id != d->domain_id )
> +        return NULL;

I think this function is very useful so I would avoid making any domain
checks here: one day we might want to retrieve hlpi even if hlpi->dom_id
!= d->domain_id. I would move the domain check outside.


> +    return hlpi;
> +}
>  
>  #define ITS_COMMAND_SIZE        32
>  
> @@ -96,6 +129,33 @@ static int its_send_cmd_sync(struct host_its *its, int cpu)
>      return its_send_command(its, cmd);
>  }
>  
> +static int its_send_cmd_discard(struct host_its *its,
> +                                uint32_t deviceid, uint32_t eventid)
> +{
> +    uint64_t cmd[4];
> +
> +    cmd[0] = GITS_CMD_DISCARD | ((uint64_t)deviceid << 32);
> +    cmd[1] = eventid;
> +    cmd[2] = 0x00;
> +    cmd[3] = 0x00;
> +
> +    return its_send_command(its, cmd);
> +}
> +
> +static int its_send_cmd_mapti(struct host_its *its,
> +                              uint32_t deviceid, uint32_t eventid,
> +                              uint32_t pintid, uint16_t icid)
> +{
> +    uint64_t cmd[4];
> +
> +    cmd[0] = GITS_CMD_MAPTI | ((uint64_t)deviceid << 32);
> +    cmd[1] = eventid | ((uint64_t)pintid << 32);
> +    cmd[2] = icid;
> +    cmd[3] = 0x00;
> +
> +    return its_send_command(its, cmd);
> +}
> +
>  static int its_send_cmd_mapc(struct host_its *its, int collection_id, int cpu)
>  {
>      uint64_t cmd[4];
> @@ -330,15 +390,109 @@ uint64_t gicv3_lpi_get_proptable()
>      return reg;
>  }
>  
> +/* Allocate the 2nd level array for host LPIs. This one holds pointers
> + * to the page with the actual "union host_lpi" entries. Our LPI limit
> + * avoids excessive memory usage.
> + */
>  int gicv3_lpi_init_host_lpis(int lpi_bits)
>  {
> +    int nr_lpi_ptrs;
> +
>      lpi_data.host_lpi_bits = lpi_bits;
>  
> +    nr_lpi_ptrs = MAX_HOST_LPIS / (PAGE_SIZE / sizeof(union host_lpi));
> +
> +    lpi_data.host_lpis = xzalloc_array(union host_lpi *, nr_lpi_ptrs);
> +    if ( !lpi_data.host_lpis )
> +        return -ENOMEM;

Why are we not allocating the 2nd level right away? To save memory? If
so, I would like some numbers for a realistic use case written either
here or in the commit message.
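
For a rough feel of the numbers (back-of-the-envelope arithmetic, not
Xen code; 24 LPI bits, 8-byte entries and 64-bit pointers are all
assumptions):

```c
#include <stdint.h>

/* Bytes for a flat table covering every possible LPI
 * (8 bytes per "union host_lpi"). */
static uint64_t flat_table_bytes(unsigned int lpi_bits)
{
    return ((uint64_t)1 << lpi_bits) * 8;
}

/* Bytes for just the first-level pointer array of the two-level scheme:
 * one 8-byte pointer per 4K page of 512 entries. Second-level pages are
 * only allocated when an LPI in their range is actually used. */
static uint64_t first_level_bytes(unsigned int lpi_bits)
{
    uint64_t entries = (uint64_t)1 << lpi_bits;
    uint64_t per_page = 4096 / 8;

    return (entries / per_page) * 8;
}
```

With 24 LPI bits that is 128 MB up front for the flat table versus 256 KB
for the pointer array, which is presumably the motivation here.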


>      printk("GICv3: using at most %ld LPIs on the host.\n", MAX_HOST_LPIS);
>  
>      return 0;
>  }
>  
> +/* Allocates a new host LPI to be injected as "virt_lpi" into the specified
> + * VCPU. Returns the host LPI ID or a negative error value.
> + */
> +int gicv3_lpi_allocate_host_lpi(struct host_its *its,
> +                                uint32_t devid, uint32_t eventid,
> +                                struct vcpu *v, int virt_lpi)
> +{
> +    int chunk, i;
> +    union host_lpi hlpi, *new_chunk;
> +
> +    /* TODO: handle some kind of preassigned LPI mapping for DomUs */
> +    if ( !its )
> +        return -EPERM;
> +
> +    /* TODO: This could be optimized by storing some "next available" hint and
> +     * only iterate if this one doesn't work. But this function should be
> +     * called rarely.
> +     */

Yes please. Even a trivial pointer to the last allocation would be far
better than this. It would be nice to run some numbers and prove that in
realistic scenarios finding an empty pLPI doesn't take more than 5-10
ops, which should be the case unless we have to wrap around and the
initial chunks are still fully populated, causing Xen to scan 512
entries at a time. We definitely want to avoid that, except perhaps in
rare worst-case scenarios.
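
A trivial form of that hint could look like this (standalone sketch with
a flat array instead of the two-level table; "next_free" is a
hypothetical name):

```c
#include <stdint.h>
#include <stddef.h>

#define NR_LPIS 1024

static uint64_t lpis[NR_LPIS];  /* 0 == unallocated, as in the patch */
static size_t next_free;        /* hint: first index worth probing */

/* Returns the index of a free slot or -1. Scanning starts at the hint,
 * so repeated allocations don't re-walk the fully populated prefix. */
static int alloc_lpi_slot(void)
{
    for (size_t n = 0; n < NR_LPIS; n++)
    {
        size_t i = (next_free + n) % NR_LPIS;

        if (!lpis[i])
        {
            lpis[i] = 1;        /* caller fills in the real data */
            next_free = i + 1;
            return (int)i;
        }
    }
    return -1;
}
```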


> +    for (chunk = 0; chunk < MAX_HOST_LPIS / HOST_LPIS_PER_PAGE; chunk++)
> +    {
> +        /* If we hit an unallocated chunk, we initialize it and use entry 0. */
> +        if ( !lpi_data.host_lpis[chunk] )
> +        {
> +            new_chunk = alloc_xenheap_pages(0, 0);
> +            if ( !new_chunk )
> +                return -ENOMEM;
> +
> +            memset(new_chunk, 0, PAGE_SIZE);
> +            lpi_data.host_lpis[chunk] = new_chunk;
> +            i = 0;
> +        }
> +        else
> +        {
> +            /* Find an unallocted entry in this chunk. */
> +            for (i = 0; i < HOST_LPIS_PER_PAGE; i++)
> +                if ( !lpi_data.host_lpis[chunk][i].virt_lpi )
> +                    break;
> +
> +            /* If this chunk is fully allocted, advance to the next one. */
                                           ^ allocated


> +            if ( i == HOST_LPIS_PER_PAGE)
> +                continue;
> +        }
> +
> +        hlpi.virt_lpi = virt_lpi;
> +        hlpi.dom_id = v->domain->domain_id;
> +        hlpi.vcpu_id = v->vcpu_id;
> +        lpi_data.host_lpis[chunk][i].data = hlpi.data;
> +
> +        if (its)

code style


> +        {
> +            its_send_cmd_mapti(its, devid, eventid,
> +                               chunk * HOST_LPIS_PER_PAGE + i + 8192, 0);
> +            its_send_cmd_sync(its, 0);

Why hardcode the physical cpu to 0? Should we get the pcpu the vcpu is
currently running on?


> +        }
> +
> +        return chunk * HOST_LPIS_PER_PAGE + i + 8192;
> +    }
> +
> +    return -ENOSPC;
> +}
> +
> +/* Drops the connection of the given host LPI to a virtual LPI.
> + */
> +int gicv3_lpi_drop_host_lpi(struct host_its *its,
> +                            uint32_t devid, uint32_t eventid, uint32_t host_lpi)
> +{
> +    union host_lpi *hlpip;
> +
> +    if ( !its )
> +        return -EPERM;
> +
> +    hlpip = gic_find_host_lpi(host_lpi, NULL);
> +    if ( !hlpip )
> +        return -1;
> +
> +    hlpip->data = 0;
> +
> +    its_send_cmd_discard(its, devid, eventid);
> +
> +    return 0;
> +}
> +
>  void gicv3_its_dt_init(const struct dt_device_node *node)
>  {
>      const struct dt_device_node *its = NULL;
> diff --git a/xen/include/asm-arm/gic-its.h b/xen/include/asm-arm/gic-its.h
> index b49d274..512a388 100644
> --- a/xen/include/asm-arm/gic-its.h
> +++ b/xen/include/asm-arm/gic-its.h
> @@ -114,6 +114,12 @@ void gicv3_set_redist_addr(paddr_t address, int redist_id);
>  /* Map a collection for this host CPU to each host ITS. */
>  void gicv3_its_setup_collection(int cpu);
>  
> +int gicv3_lpi_allocate_host_lpi(struct host_its *its,
> +                                uint32_t devid, uint32_t eventid,
> +                                struct vcpu *v, int virt_lpi);
> +int gicv3_lpi_drop_host_lpi(struct host_its *its,
> +                            uint32_t devid, uint32_t eventid,
> +                            uint32_t host_lpi);
>  #else
>  
>  static inline void gicv3_its_dt_init(const struct dt_device_node *node)
> @@ -141,6 +147,18 @@ static inline void gicv3_set_redist_addr(paddr_t address, int redist_id)
>  static inline void gicv3_its_setup_collection(int cpu)
>  {
>  }
> +static inline int gicv3_lpi_allocate_host_lpi(struct host_its *its,
> +                                              uint32_t devid, uint32_t eventid,
> +                                              struct vcpu *v, int virt_lpi)
> +{
> +    return 0;
> +}
> +static inline int gicv3_lpi_drop_host_lpi(struct host_its *its,
> +                                          uint32_t devid, uint32_t eventid,
> +                                          uint32_t host_lpi)
> +{
> +    return 0;
> +}
>  
>  #endif /* CONFIG_HAS_ITS */
>  
> -- 
> 2.9.0
> 

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [RFC PATCH 07/24] ARM: GICv3 ITS: introduce device mapping
  2016-09-28 18:24 ` [RFC PATCH 07/24] ARM: GICv3 ITS: introduce device mapping Andre Przywara
  2016-10-24 15:31   ` Vijay Kilari
@ 2016-10-28  0:08   ` Stefano Stabellini
  1 sibling, 0 replies; 144+ messages in thread
From: Stefano Stabellini @ 2016-10-28  0:08 UTC (permalink / raw)
  To: Andre Przywara; +Cc: xen-devel, Julien Grall, Stefano Stabellini

On Wed, 28 Sep 2016, Andre Przywara wrote:
> The ITS uses device IDs to map LPIs to a device. Dom0 will later use
> those IDs, which we directly pass on to the host.
> For this we have to map each device that Dom0 may request to a host
> ITS device with the same identifier.
> Allocate the respective memory and enter each device into a list to
> later be able to iterate over it or to easily teardown guests.
> 
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> ---
>  xen/arch/arm/gic-its.c        | 90 +++++++++++++++++++++++++++++++++++++++++++
>  xen/include/asm-arm/gic-its.h | 16 ++++++++
>  2 files changed, 106 insertions(+)
> 
> diff --git a/xen/arch/arm/gic-its.c b/xen/arch/arm/gic-its.c
> index 2140e4a..bf1f5b5 100644
> --- a/xen/arch/arm/gic-its.c
> +++ b/xen/arch/arm/gic-its.c
> @@ -168,6 +168,94 @@ static int its_send_cmd_mapc(struct host_its *its, int collection_id, int cpu)
>      return its_send_command(its, cmd);
>  }
>  
> +static int its_send_cmd_mapd(struct host_its *its, uint32_t deviceid,
> +                             int size, uint64_t itt_addr, bool valid)
> +{
> +    uint64_t cmd[4];
> +
> +    cmd[0] = GITS_CMD_MAPD | ((uint64_t)deviceid << 32);
> +    cmd[1] = size & GENMASK(4, 0);
> +    cmd[2] = itt_addr & GENMASK(51, 8);
> +    if ( valid )
> +        cmd[2] |= BIT(63);
> +    cmd[3] = 0x00;
> +
> +    return its_send_command(its, cmd);
> +}
> +
> +int gicv3_its_map_device(struct host_its *hw_its, struct domain *d,
> +                         int devid, int bits, bool valid)
> +{
> +    void *itt_addr = NULL;
> +    struct its_devices *dev, *temp;
> +    bool reuse_dev = false;
> +
> +    list_for_each_entry_safe(dev, temp, &hw_its->its_devices, entry)
> +    {
> +        if ( (dev->d->domain_id != d->domain_id) || (dev->devid != devid) )
> +            continue;
> +
> +        its_send_cmd_mapd(hw_its, dev->devid, 0, 0, false);
> +        xfree(dev->itt_addr);
> +        if ( !valid )
> +        {
> +            xfree(dev);
> +            list_del(&dev->entry);
> +
> +            return 0;
> +        }
> +
> +        reuse_dev = true;
> +        break;
> +    }

I don't think we want to go through the whole list every time this
function is called. There can be thousands of devices. I would split it
in two: one function to retrieve existing mappings and another to
allocate new ones. We need to make sure we don't call the function that
retrieves existing mappings often.

We can also consider using a rbtree instead of a list, or if devids are
a dense numeric group, we could use an array and direct access.
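
If the IDs are dense, the direct-access variant is trivial (standalone
sketch, all names hypothetical):

```c
#include <stddef.h>

/* With a dense devid range, a direct-mapped pointer array gives O(1)
 * lookup instead of walking a list of potentially thousands of entries. */
struct its_device {
    int devid;
    void *itt_addr;
};

#define NR_DEVIDS 16

static struct its_device *devices[NR_DEVIDS];
static struct its_device dev3 = { .devid = 3 };   /* demo entry */

static int device_insert(struct its_device *dev)
{
    if ((unsigned int)dev->devid >= NR_DEVIDS)
        return -1;
    devices[dev->devid] = dev;
    return 0;
}

static struct its_device *device_lookup(unsigned int devid)
{
    return devid < NR_DEVIDS ? devices[devid] : NULL;
}
```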


> +    if ( !valid )
> +        return 0;
> +
> +    itt_addr = _xmalloc(BIT(bits) * hw_its->itte_size, 256);
> +    if ( !itt_addr )
> +        return -ENOMEM;
> +
> +    if ( !reuse_dev )
> +    {
> +        dev = xmalloc(struct its_devices);
> +        if ( !dev )
> +            return -ENOMEM;
> +
> +        list_add_tail(&dev->entry, &hw_its->its_devices);
> +    }
> +
> +    dev->itt_addr = itt_addr;
> +    dev->d = d;
> +    dev->devid = devid;
> +
> +    return its_send_cmd_mapd(hw_its, devid, bits - 1,
> +                             itt_addr ? virt_to_maddr(itt_addr) : 0, true);
> +}
> +
> +/* Removing any connections a domain had to any ITS in the system. */
> +int its_remove_domain(struct domain *d)
> +{
> +    struct host_its *its;
> +    struct its_devices *dev, *temp;
> +
> +    list_for_each_entry(its, &host_its_list, entry)
> +    {
> +        list_for_each_entry_safe(dev, temp, &its->its_devices, entry)
> +        {
> +            if ( dev->d->domain_id != d->domain_id )
> +                continue;
> +
> +            its_send_cmd_mapd(its, dev->devid, 0, 0, false);
> +            xfree(dev->itt_addr);
> +            xfree(dev);
> +            list_del(&dev->entry);
> +        }
> +    }

Again scanning the full list on every domain destruction is not good.
This is easy to work around, even without completely reworking the data
structures, because we could add a second per-domain list to link all
devices that belong to the same domain.
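
The two-list idea would look roughly like this (minimal standalone list
primitives stand in for Xen's struct list_head; names are hypothetical):

```c
#include <stddef.h>

struct list_head { struct list_head *next, *prev; };

static void list_init(struct list_head *h) { h->next = h->prev = h; }

static void list_add(struct list_head *n, struct list_head *h)
{
    n->next = h->next; n->prev = h;
    h->next->prev = n; h->next = n;
}

static void list_del(struct list_head *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
}

/* Each device sits on two lists: the per-ITS one (as in the patch) and
 * a per-domain one, so domain teardown only walks its own devices. */
struct its_device {
    struct list_head its_entry;     /* linked off struct host_its */
    struct list_head domain_entry;  /* linked off struct domain   */
};

static void device_unlink(struct its_device *dev)
{
    list_del(&dev->its_entry);
    list_del(&dev->domain_entry);
}

/* Tiny self-check: link one device on both lists, then unlink it. */
static int lists_selftest(void)
{
    struct list_head its_devs, dom_devs;
    struct its_device dev;

    list_init(&its_devs);
    list_init(&dom_devs);
    list_add(&dev.its_entry, &its_devs);
    list_add(&dev.domain_entry, &dom_devs);
    device_unlink(&dev);

    return its_devs.next == &its_devs && dom_devs.next == &dom_devs;
}
```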


> +    return 0;
> +}
> +
>  /* Set up the (1:1) collection mapping for the given host CPU. */
>  void gicv3_its_setup_collection(int cpu)
>  {
> @@ -297,6 +385,7 @@ int gicv3_its_init(struct host_its *hw_its)
>  
>      reg = readq_relaxed(hw_its->its_base + GITS_TYPER);
>      hw_its->pta = reg & GITS_TYPER_PTA;
> +    hw_its->itte_size = ((reg >> 4) & 0xf) + 1;

Please add a #define for this.


>      for (i = 0; i < 8; i++)
>      {
> @@ -520,6 +609,7 @@ void gicv3_its_dt_init(const struct dt_device_node *node)
>          its_data->size = size;
>          its_data->dt_node = its;
>          spin_lock_init(&its_data->cmd_lock);
> +        INIT_LIST_HEAD(&its_data->its_devices);
>  
>          printk("GICv3: Found ITS @0x%lx\n", addr);
>  
> diff --git a/xen/include/asm-arm/gic-its.h b/xen/include/asm-arm/gic-its.h
> index 512a388..4e9841a 100644
> --- a/xen/include/asm-arm/gic-its.h
> +++ b/xen/include/asm-arm/gic-its.h
> @@ -79,6 +79,13 @@
>  #ifndef __ASSEMBLY__
>  #include <xen/device_tree.h>
>  
> +struct its_devices {
> +    struct list_head entry;
> +    struct domain *d;
> +    void *itt_addr;
> +    int devid;
> +};
> +
>  /* data structure for each hardware ITS */
>  struct host_its {
>      struct list_head entry;
> @@ -88,6 +95,8 @@ struct host_its {
>      void __iomem *its_base;
>      spinlock_t cmd_lock;
>      void *cmd_buf;
> +    struct list_head its_devices;
> +    int itte_size;
>      bool pta;
>  };
>  
> @@ -114,6 +123,13 @@ void gicv3_set_redist_addr(paddr_t address, int redist_id);
>  /* Map a collection for this host CPU to each host ITS. */
>  void gicv3_its_setup_collection(int cpu);
>  
> +/* Map a device on the host by allocating an ITT on the host (ITS).
> + * "bits" specifies how many events (interrupts) this device will need.
> + * Setting "valid" to false deallocates the device.
> + */
> +int gicv3_its_map_device(struct host_its *hw_its, struct domain *d,
> +                         int devid, int bits, bool valid);
> +
>  int gicv3_lpi_allocate_host_lpi(struct host_its *its,
>                                  uint32_t devid, uint32_t eventid,
>                                  struct vcpu *v, int virt_lpi);
> -- 
> 2.9.0
> 


* Re: [RFC PATCH 08/24] ARM: GICv3: introduce separate pending_irq structs for LPIs
  2016-09-28 18:24 ` [RFC PATCH 08/24] ARM: GICv3: introduce separate pending_irq structs for LPIs Andre Przywara
  2016-10-24 15:31   ` Vijay Kilari
@ 2016-10-28  1:04   ` Stefano Stabellini
  2017-01-12 19:14     ` Andre Przywara
  2016-11-04 15:46   ` Julien Grall
  2 siblings, 1 reply; 144+ messages in thread
From: Stefano Stabellini @ 2016-10-28  1:04 UTC (permalink / raw)
  To: Andre Przywara; +Cc: xen-devel, Julien Grall, Stefano Stabellini

On Wed, 28 Sep 2016, Andre Przywara wrote:
> For the same reason that allocating a struct irq_desc for each
> possible LPI is not an option, having a struct pending_irq for each LPI
> is also not feasible. However we actually only need those when an
> interrupt is on a vCPU (or is about to be injected).
> Maintain a list of those structs that we can use for the lifecycle of
> a guest LPI. We allocate new entries if necessary, however reuse
> pre-owned entries whenever possible.
> Teach the existing VGIC functions to find the right pointer when being
> given a virtual LPI number.
> 
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> ---
>  xen/arch/arm/gic.c            |  3 +++
>  xen/arch/arm/vgic-v3.c        |  2 ++
>  xen/arch/arm/vgic.c           | 56 ++++++++++++++++++++++++++++++++++++++++---
>  xen/include/asm-arm/domain.h  |  1 +
>  xen/include/asm-arm/gic-its.h | 10 ++++++++
>  xen/include/asm-arm/vgic.h    |  9 +++++++
>  6 files changed, 78 insertions(+), 3 deletions(-)
> 
> diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
> index 63c744a..ebe4035 100644
> --- a/xen/arch/arm/gic.c
> +++ b/xen/arch/arm/gic.c
> @@ -506,6 +506,9 @@ static void gic_update_one_lr(struct vcpu *v, int i)
>                  struct vcpu *v_target = vgic_get_target_vcpu(v, irq);
>                  irq_set_affinity(p->desc, cpumask_of(v_target->processor));
>              }
> +            /* If this was an LPI, mark this struct as available again. */
> +            if ( p->irq >= 8192 )
> +                p->irq = 0;

I believe that 0 is a valid irq number, so we need to come up with a
different invalid_irq value, and we should #define it. We could also
consider checking whether the irq is inflight (linked to the inflight
list) instead of using irq == 0 to decide if it is reusable.


>          }
>      }
>  }
> diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c
> index ec038a3..e9b6490 100644
> --- a/xen/arch/arm/vgic-v3.c
> +++ b/xen/arch/arm/vgic-v3.c
> @@ -1388,6 +1388,8 @@ static int vgic_v3_vcpu_init(struct vcpu *v)
>      if ( v->vcpu_id == last_cpu || (v->vcpu_id == (d->max_vcpus - 1)) )
>          v->arch.vgic.flags |= VGIC_V3_RDIST_LAST;
>  
> +    INIT_LIST_HEAD(&v->arch.vgic.pending_lpi_list);
> +
>      return 0;
>  }
>  
> diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c
> index 0965119..b961551 100644
> --- a/xen/arch/arm/vgic.c
> +++ b/xen/arch/arm/vgic.c
> @@ -31,6 +31,8 @@
>  #include <asm/mmio.h>
>  #include <asm/gic.h>
>  #include <asm/vgic.h>
> +#include <asm/gic_v3_defs.h>
> +#include <asm/gic-its.h>
>  
>  static inline struct vgic_irq_rank *vgic_get_rank(struct vcpu *v, int rank)
>  {
> @@ -61,7 +63,7 @@ struct vgic_irq_rank *vgic_rank_irq(struct vcpu *v, unsigned int irq)
>      return vgic_get_rank(v, rank);
>  }
>  
> -static void vgic_init_pending_irq(struct pending_irq *p, unsigned int virq)
> +void vgic_init_pending_irq(struct pending_irq *p, unsigned int virq)
>  {
>      INIT_LIST_HEAD(&p->inflight);
>      INIT_LIST_HEAD(&p->lr_queue);
> @@ -244,10 +246,14 @@ struct vcpu *vgic_get_target_vcpu(struct vcpu *v, unsigned int virq)
>  
>  static int vgic_get_virq_priority(struct vcpu *v, unsigned int virq)
>  {
> -    struct vgic_irq_rank *rank = vgic_rank_irq(v, virq);
> +    struct vgic_irq_rank *rank;
>      unsigned long flags;
>      int priority;
>  
> +    if ( virq >= 8192 )

Please introduce a convenience static inline function such as:

  bool is_lpi(unsigned int irq)
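
For completeness, a possible implementation (LPI INTIDs start at 8192
per the GICv3 architecture):

```c
#include <stdbool.h>

/* LPIs occupy the interrupt ID space from 8192 upwards. */
static inline bool is_lpi(unsigned int irq)
{
    return irq >= 8192;
}
```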


> +        return gicv3_lpi_get_priority(v->domain, virq);
> +
> +    rank = vgic_rank_irq(v, virq);
>      vgic_lock_rank(v, rank, flags);
>      priority = rank->priority[virq & INTERRUPT_RANK_MASK];
>      vgic_unlock_rank(v, rank, flags);
> @@ -446,13 +452,55 @@ int vgic_to_sgi(struct vcpu *v, register_t sgir, enum gic_sgi_mode irqmode, int
>      return 1;
>  }
>  
> +/*
> + * Holding struct pending_irq's for each possible virtual LPI in each domain
> + * requires too much Xen memory, also a malicious guest could potentially
> + * spam Xen with LPI map requests. We cannot cover those with (guest allocated)
> + * ITS memory, so we use a dynamic scheme of allocating struct pending_irq's
> + * on demand.
> + */
> +struct pending_irq *lpi_to_pending(struct vcpu *v, unsigned int lpi,
> +                                   bool allocate)
> +{
> +    struct lpi_pending_irq *lpi_irq, *empty = NULL;
> +
> +    /* TODO: locking! */

Yeah, this needs to be fixed in v1 :-)


> +    list_for_each_entry(lpi_irq, &v->arch.vgic.pending_lpi_list, entry)
> +    {
> +        if ( lpi_irq->pirq.irq == lpi )
> +            return &lpi_irq->pirq;
> +
> +        if ( lpi_irq->pirq.irq == 0 && !empty )
> +            empty = lpi_irq;
> +    }

This is another one of those cases where a list is too slow for the hot
path. The idea of allocating pending_irq structs on demand is good, but
storing them in a linked list would kill performance. Probably the best
thing we could do is a hashtable, and we should preallocate the initial
array of elements. I don't know what the size of the initial array
should be, but we can start around 50 and change it in the future once
we run tests with real workloads. Of course the other key parameter is
the hash function; I am not sure which one is the right one, but ideally
we would never have to allocate a new pending_irq struct for an LPI
because the preallocated set would suffice.

I could be convinced that a list is sufficient if we do some real
benchmarking and it turns out that lpi_to_pending always resolves in
fewer than ~5 steps.
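
A fixed-size open-addressing table would be one way to do it (standalone
sketch with hypothetical names; the real entry would embed the full
struct pending_irq):

```c
#include <stdint.h>
#include <stddef.h>

#define PENDING_TABLE_SIZE 64       /* power of two, preallocated */

struct pending_lpi {
    uint32_t lpi;                   /* 0 == slot unused */
};

static struct pending_lpi pending[PENDING_TABLE_SIZE];

/* Multiplicative hash; LPIs are >= 8192, so 0 can mark empty slots. */
static unsigned int hash_lpi(uint32_t lpi)
{
    return (lpi * 2654435761u) & (PENDING_TABLE_SIZE - 1);
}

/* Linear probing: returns the entry for "lpi", optionally claiming an
 * empty slot for it; NULL if absent (or the table is full). */
static struct pending_lpi *lpi_to_pending(uint32_t lpi, int allocate)
{
    for (unsigned int n = 0; n < PENDING_TABLE_SIZE; n++)
    {
        unsigned int i = (hash_lpi(lpi) + n) & (PENDING_TABLE_SIZE - 1);

        if (pending[i].lpi == lpi)
            return &pending[i];
        if (pending[i].lpi == 0)
        {
            if (!allocate)
                return NULL;
            pending[i].lpi = lpi;
            return &pending[i];
        }
    }
    return NULL;
}
```

With a decent hash and a mostly-empty preallocated array, lookups stay
within a couple of probes, which is the property argued for above.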


> +    if ( !allocate )
> +        return NULL;
> +
> +    if ( !empty )
> +    {
> +        empty = xzalloc(struct lpi_pending_irq);
> +        vgic_init_pending_irq(&empty->pirq, lpi);
> +        list_add_tail(&empty->entry, &v->arch.vgic.pending_lpi_list);
> +    } else
> +    {
> +        empty->pirq.status = 0;
> +        empty->pirq.irq = lpi;
> +    }
> +
> +    return &empty->pirq;
> +}
> +
>  struct pending_irq *irq_to_pending(struct vcpu *v, unsigned int irq)
>  {
>      struct pending_irq *n;
> +

spurious change


>      /* Pending irqs allocation strategy: the first vgic.nr_spis irqs
>       * are used for SPIs; the rests are used for per cpu irqs */
>      if ( irq < 32 )
>          n = &v->arch.vgic.pending_irqs[irq];
> +    else if ( irq >= 8192 )

Use the new static inline


> +        n = lpi_to_pending(v, irq, true);
>      else
>          n = &v->domain->arch.vgic.pending_irqs[irq - 32];
>      return n;
> @@ -480,7 +528,7 @@ void vgic_clear_pending_irqs(struct vcpu *v)
>  void vgic_vcpu_inject_irq(struct vcpu *v, unsigned int virq)
>  {
>      uint8_t priority;
> -    struct pending_irq *iter, *n = irq_to_pending(v, virq);
> +    struct pending_irq *iter, *n;
>      unsigned long flags;
>      bool_t running;
>  
> @@ -488,6 +536,8 @@ void vgic_vcpu_inject_irq(struct vcpu *v, unsigned int virq)
>  
>      spin_lock_irqsave(&v->arch.vgic.lock, flags);
>  
> +    n = irq_to_pending(v, virq);

Why this change?


>      /* vcpu offline */
>      if ( test_bit(_VPF_down, &v->pause_flags) )
>      {
> diff --git a/xen/include/asm-arm/domain.h b/xen/include/asm-arm/domain.h
> index 9452fcd..ae8a9de 100644
> --- a/xen/include/asm-arm/domain.h
> +++ b/xen/include/asm-arm/domain.h
> @@ -249,6 +249,7 @@ struct arch_vcpu
>          paddr_t rdist_base;
>  #define VGIC_V3_RDIST_LAST  (1 << 0)        /* last vCPU of the rdist */
>          uint8_t flags;
> +        struct list_head pending_lpi_list;
>      } vgic;
>  
>      /* Timer registers  */
> diff --git a/xen/include/asm-arm/gic-its.h b/xen/include/asm-arm/gic-its.h
> index 4e9841a..1f881c0 100644
> --- a/xen/include/asm-arm/gic-its.h
> +++ b/xen/include/asm-arm/gic-its.h
> @@ -136,6 +136,12 @@ int gicv3_lpi_allocate_host_lpi(struct host_its *its,
>  int gicv3_lpi_drop_host_lpi(struct host_its *its,
>                              uint32_t devid, uint32_t eventid,
>                              uint32_t host_lpi);
> +
> +static inline int gicv3_lpi_get_priority(struct domain *d, uint32_t lpi)
> +{
> +    return GIC_PRI_IRQ;
> +}

Does it mean that we don't allow changes to LPI priorities?


>  #else
>  
>  static inline void gicv3_its_dt_init(const struct dt_device_node *node)
> @@ -175,6 +181,10 @@ static inline int gicv3_lpi_drop_host_lpi(struct host_its *its,
>  {
>      return 0;
>  }
> +static inline int gicv3_lpi_get_priority(struct domain *d, uint32_t lpi)
> +{
> +    return GIC_PRI_IRQ;
> +}
>  
>  #endif /* CONFIG_HAS_ITS */
>  
> diff --git a/xen/include/asm-arm/vgic.h b/xen/include/asm-arm/vgic.h
> index 300f461..4e29ba6 100644
> --- a/xen/include/asm-arm/vgic.h
> +++ b/xen/include/asm-arm/vgic.h
> @@ -83,6 +83,12 @@ struct pending_irq
>      struct list_head lr_queue;
>  };
>  
> +struct lpi_pending_irq
> +{
> +    struct list_head entry;
> +    struct pending_irq pirq;
> +};
> +
>  #define NR_INTERRUPT_PER_RANK   32
>  #define INTERRUPT_RANK_MASK (NR_INTERRUPT_PER_RANK - 1)
>  
> @@ -296,8 +302,11 @@ extern struct vcpu *vgic_get_target_vcpu(struct vcpu *v, unsigned int virq);
>  extern void vgic_vcpu_inject_irq(struct vcpu *v, unsigned int virq);
>  extern void vgic_vcpu_inject_spi(struct domain *d, unsigned int virq);
>  extern void vgic_clear_pending_irqs(struct vcpu *v);
> +extern void vgic_init_pending_irq(struct pending_irq *p, unsigned int virq);
>  extern struct pending_irq *irq_to_pending(struct vcpu *v, unsigned int irq);
>  extern struct pending_irq *spi_to_pending(struct domain *d, unsigned int irq);
> +extern struct pending_irq *lpi_to_pending(struct vcpu *v, unsigned int irq,
> +                                          bool allocate);
>  extern struct vgic_irq_rank *vgic_rank_offset(struct vcpu *v, int b, int n, int s);
>  extern struct vgic_irq_rank *vgic_rank_irq(struct vcpu *v, unsigned int irq);
>  extern int vgic_emulate(struct cpu_user_regs *regs, union hsr hsr);
> -- 
> 2.9.0
> 


* Re: [RFC PATCH 09/24] ARM: GICv3: forward pending LPIs to guests
  2016-09-28 18:24 ` [RFC PATCH 09/24] ARM: GICv3: forward pending LPIs to guests Andre Przywara
@ 2016-10-28  1:51   ` Stefano Stabellini
  0 siblings, 0 replies; 144+ messages in thread
From: Stefano Stabellini @ 2016-10-28  1:51 UTC (permalink / raw)
  To: Andre Przywara; +Cc: xen-devel, Julien Grall, Stefano Stabellini

On Wed, 28 Sep 2016, Andre Przywara wrote:
> Upon receiving an LPI, we need to find the right VCPU and virtual IRQ
> number to get this IRQ injected.
> Iterate our two-level LPI table to find this information quickly when
> the host takes an LPI. Call the existing injection function to let the
> GIC emulation deal with this interrupt.
> 
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> ---
>  xen/arch/arm/gic-its.c    | 32 ++++++++++++++++++++++++++++++++
>  xen/arch/arm/gic.c        |  6 ++++--
>  xen/include/asm-arm/irq.h |  8 ++++++++
>  3 files changed, 44 insertions(+), 2 deletions(-)
> 
> diff --git a/xen/arch/arm/gic-its.c b/xen/arch/arm/gic-its.c
> index bf1f5b5..b7aa918 100644
> --- a/xen/arch/arm/gic-its.c
> +++ b/xen/arch/arm/gic-its.c
> @@ -77,6 +77,38 @@ static union host_lpi *gic_find_host_lpi(uint32_t lpi, struct domain *d)
>      return hlpi;
>  }
>  
> +/* Handle incoming LPIs, which are a bit special, because they are potentially
> + * numerous and also only get injected into guests. Treat them specially here,
> + * by just looking up their target vCPU and virtual LPI number and hand it
> + * over to the injection function.
> + */
> +void do_LPI(unsigned int lpi)
> +{
> +    struct domain *d;
> +    union host_lpi *hlpip, hlpi;
> +
> +    WRITE_SYSREG32(lpi, ICC_EOIR1_EL1);
> +    WRITE_SYSREG32(lpi, ICC_DIR_EL1);

Given that LPIs are always guest IRQs, and given that we support EOImode
= 1, shouldn't we just do ICC_EOIR1_EL1 here (no ICC_DIR_EL1)?


> +    hlpip = gic_find_host_lpi(lpi, NULL);
> +    if ( !hlpip )
> +        return;
> +
> +    hlpi.data = hlpip->data;

Why can't we just reference hlpip directly? Why do we need hlpi?


> +    if ( !hlpi.virt_lpi )
> +        return;
> +
> +    d = get_domain_by_id(hlpi.dom_id);
> +    if ( !d )
> +        return;
> +
> +    if ( hlpi.vcpu_id >= d->max_vcpus )
> +        return;
> +
> +    vgic_vcpu_inject_irq(d->vcpu[hlpi.vcpu_id], hlpi.virt_lpi);
> +}
> +
>  #define ITS_COMMAND_SIZE        32
>  
>  static int its_send_command(struct host_its *hw_its, void *its_cmd)
> diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
> index ebe4035..2fad2f1 100644
> --- a/xen/arch/arm/gic.c
> +++ b/xen/arch/arm/gic.c
> @@ -697,8 +697,10 @@ void gic_interrupt(struct cpu_user_regs *regs, int is_fiq)
>              local_irq_enable();
>              do_IRQ(regs, irq, is_fiq);
>              local_irq_disable();
> -        }
> -        else if (unlikely(irq < 16))
> +        } else if ( irq >= 8192 )
> +        {
> +            do_LPI(irq);
> +        } else if ( unlikely(irq < 16) )
>          {
>              do_sgi(regs, irq);
>          }
> diff --git a/xen/include/asm-arm/irq.h b/xen/include/asm-arm/irq.h
> index 8f7a167..ee47de8 100644
> --- a/xen/include/asm-arm/irq.h
> +++ b/xen/include/asm-arm/irq.h
> @@ -34,6 +34,14 @@ struct irq_desc *__irq_to_desc(int irq);
>  
>  void do_IRQ(struct cpu_user_regs *regs, unsigned int irq, int is_fiq);
>  
> +#ifdef CONFIG_HAS_ITS
> +void do_LPI(unsigned int irq);
> +#else
> +static inline void do_LPI(unsigned int irq)
> +{
> +}
> +#endif
> +
>  #define domain_pirq_to_irq(d, pirq) (pirq)
>  
>  bool_t is_assignable_irq(unsigned int irq);
> -- 
> 2.9.0
> 


* Re: [RFC PATCH 10/24] ARM: GICv3: enable ITS and LPIs on the host
  2016-09-28 18:24 ` [RFC PATCH 10/24] ARM: GICv3: enable ITS and LPIs on the host Andre Przywara
@ 2016-10-28 23:07   ` Stefano Stabellini
  0 siblings, 0 replies; 144+ messages in thread
From: Stefano Stabellini @ 2016-10-28 23:07 UTC (permalink / raw)
  To: Andre Przywara; +Cc: xen-devel, Julien Grall, Stefano Stabellini

On Wed, 28 Sep 2016, Andre Przywara wrote:
> Now that the host part of the ITS code is in place, we can enable the
> ITS and also LPIs on each redistributor to get the show rolling.
> At this point there would be no LPIs mapped, as guests don't know about
> the ITS yet.
> 
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>

Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>


>  xen/arch/arm/gic-its.c |  4 ++++
>  xen/arch/arm/gic-v3.c  | 19 +++++++++++++++++++
>  2 files changed, 23 insertions(+)
> 
> diff --git a/xen/arch/arm/gic-its.c b/xen/arch/arm/gic-its.c
> index b7aa918..6bac422 100644
> --- a/xen/arch/arm/gic-its.c
> +++ b/xen/arch/arm/gic-its.c
> @@ -449,6 +449,10 @@ int gicv3_its_init(struct host_its *hw_its)
>      its_send_cmd_mapc(hw_its, smp_processor_id(), smp_processor_id());
>      its_send_cmd_sync(hw_its, smp_processor_id());
>  
> +    /* Now enable interrupt translation on that ITS. */
> +    reg = readl_relaxed(hw_its->its_base + GITS_CTLR);
> +    writel_relaxed(reg | GITS_CTLR_ENABLE, hw_its->its_base + GITS_CTLR);
> +
>      return 0;
>  }
>  
> diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
> index b9387a3..57009c6 100644
> --- a/xen/arch/arm/gic-v3.c
> +++ b/xen/arch/arm/gic-v3.c
> @@ -642,6 +642,21 @@ static void gicv3_rdist_init_lpis(void __iomem * rdist_base)
>      gicv3_its_setup_collection(smp_processor_id());
>  }
>  
> +/* Enable LPIs on this redistributor (only useful when the host has an ITS. */
> +static bool gicv3_enable_lpis(void)
> +{
> +    uint32_t val;
> +
> +    val = readl_relaxed(GICD_RDIST_BASE + GICR_TYPER);
> +    if ( !(val & GICR_TYPER_PLPIS) )
> +        return false;
> +
> +    val = readl_relaxed(GICD_RDIST_BASE + GICR_CTLR);
> +    writel_relaxed(val | GICR_CTLR_ENABLE_LPIS, GICD_RDIST_BASE + GICR_CTLR);
> +
> +    return true;
> +}
> +
>  static int __init gicv3_populate_rdist(void)
>  {
>      int i;
> @@ -741,6 +756,10 @@ static int gicv3_cpu_init(void)
>      if ( gicv3_enable_redist() )
>          return -ENODEV;
>  
> +    /* If the host has any ITSes, enable LPIs now. */
> +    if ( !list_empty(&host_its_list) )
> +        gicv3_enable_lpis();
> +
>      /* Set priority on PPI and SGI interrupts */
>      priority = (GIC_PRI_IPI << 24 | GIC_PRI_IPI << 16 | GIC_PRI_IPI << 8 |
>                  GIC_PRI_IPI);
> -- 
> 2.9.0
> 


* Re: [RFC PATCH 11/24] ARM: vGICv3: handle virtual LPI pending and property tables
  2016-09-28 18:24 ` [RFC PATCH 11/24] ARM: vGICv3: handle virtual LPI pending and property tables Andre Przywara
  2016-10-24 15:32   ` Vijay Kilari
@ 2016-10-29  0:39   ` Stefano Stabellini
  2017-03-29 15:47     ` Andre Przywara
  2016-11-02 17:18   ` Julien Grall
  2 siblings, 1 reply; 144+ messages in thread
From: Stefano Stabellini @ 2016-10-29  0:39 UTC (permalink / raw)
  To: Andre Przywara; +Cc: xen-devel, Julien Grall, Stefano Stabellini

On Wed, 28 Sep 2016, Andre Przywara wrote:
> Allow a guest to provide the address and size for the memory regions
> it has reserved for the GICv3 pending and property tables.
> We sanitise the various fields of the respective redistributor
> registers and map those pages into Xen's address space to have easy
> access.
> 
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> ---
>  xen/arch/arm/vgic-v3.c        | 189 ++++++++++++++++++++++++++++++++++++++----
>  xen/arch/arm/vgic.c           |   4 +
>  xen/include/asm-arm/domain.h  |   7 +-
>  xen/include/asm-arm/gic-its.h |  10 ++-
>  xen/include/asm-arm/vgic.h    |   3 +
>  5 files changed, 197 insertions(+), 16 deletions(-)
> 
> diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c
> index e9b6490..8fe8386 100644
> --- a/xen/arch/arm/vgic-v3.c
> +++ b/xen/arch/arm/vgic-v3.c
> @@ -20,12 +20,14 @@
>  
>  #include <xen/bitops.h>
>  #include <xen/config.h>
> +#include <xen/domain_page.h>
>  #include <xen/lib.h>
>  #include <xen/init.h>
>  #include <xen/softirq.h>
>  #include <xen/irq.h>
>  #include <xen/sched.h>
>  #include <xen/sizes.h>
> +#include <xen/vmap.h>
>  #include <asm/current.h>
>  #include <asm/mmio.h>
>  #include <asm/gic_v3_defs.h>
> @@ -228,12 +230,14 @@ static int __vgic_v3_rdistr_rd_mmio_read(struct vcpu *v, mmio_info_t *info,
>          goto read_reserved;
>  
>      case VREG64(GICR_PROPBASER):
> -        /* LPI's not implemented */
> -        goto read_as_zero_64;
> +        if ( !vgic_reg64_check_access(dabt) ) goto bad_width;
> +        *r = vgic_reg64_extract(v->domain->arch.vgic.rdist_propbase, info);
> +        return 1;
>  
>      case VREG64(GICR_PENDBASER):
> -        /* LPI's not implemented */
> -        goto read_as_zero_64;
> +        if ( !vgic_reg64_check_access(dabt) ) goto bad_width;
> +        *r = vgic_reg64_extract(v->arch.vgic.rdist_pendbase, info);
> +        return 1;
>  
>      case 0x0080:
>          goto read_reserved;
> @@ -301,11 +305,6 @@ bad_width:
>      domain_crash_synchronous();
>      return 0;
>  
> -read_as_zero_64:
> -    if ( !vgic_reg64_check_access(dabt) ) goto bad_width;
> -    *r = 0;
> -    return 1;
> -
>  read_as_zero_32:
>      if ( dabt.size != DABT_WORD ) goto bad_width;
>      *r = 0;
> @@ -330,11 +329,149 @@ read_unknown:
>      return 1;
>  }
>  
> +static uint64_t vgic_sanitise_field(uint64_t reg, uint64_t field_mask,
> +                                    int field_shift,
> +                                    uint64_t (*sanitise_fn)(uint64_t))
> +{
> +    uint64_t field = (reg & field_mask) >> field_shift;
> +
> +    field = sanitise_fn(field) << field_shift;
> +    return (reg & ~field_mask) | field;
> +}
> +
> +/* We want to avoid outer shareable. */
> +static uint64_t vgic_sanitise_shareability(uint64_t field)
> +{
> +    switch (field) {
> +    case GIC_BASER_OuterShareable:
> +        return GIC_BASER_InnerShareable;
> +    default:
> +        return field;
> +    }
> +}
> +
> +/* Avoid any inner non-cacheable mapping. */
> +static uint64_t vgic_sanitise_inner_cacheability(uint64_t field)
> +{
> +    switch (field) {
> +    case GIC_BASER_CACHE_nCnB:
> +    case GIC_BASER_CACHE_nC:
> +        return GIC_BASER_CACHE_RaWb;
> +    default:
> +        return field;
> +    }
> +}
> +
> +/* Non-cacheable or same-as-inner are OK. */
> +static uint64_t vgic_sanitise_outer_cacheability(uint64_t field)
> +{
> +    switch (field) {
> +    case GIC_BASER_CACHE_SameAsInner:
> +    case GIC_BASER_CACHE_nC:
> +        return field;
> +    default:
> +        return GIC_BASER_CACHE_nC;
> +    }
> +}
> +
> +static uint64_t sanitize_propbaser(uint64_t reg)
> +{
> +    reg = vgic_sanitise_field(reg, GICR_PROPBASER_SHAREABILITY_MASK,
> +                              GICR_PROPBASER_SHAREABILITY_SHIFT,
> +                              vgic_sanitise_shareability);
> +    reg = vgic_sanitise_field(reg, GICR_PROPBASER_INNER_CACHEABILITY_MASK,
> +                              GICR_PROPBASER_INNER_CACHEABILITY_SHIFT,
> +                              vgic_sanitise_inner_cacheability);
> +    reg = vgic_sanitise_field(reg, GICR_PROPBASER_OUTER_CACHEABILITY_MASK,
> +                              GICR_PROPBASER_OUTER_CACHEABILITY_SHIFT,
> +                              vgic_sanitise_outer_cacheability);
> +
> +    reg &= ~PROPBASER_RES0_MASK;
> +    reg &= ~GENMASK(51, 48);
> +    return reg;
> +}
> +
> +static uint64_t sanitize_pendbaser(uint64_t reg)
> +{
> +    reg = vgic_sanitise_field(reg, GICR_PENDBASER_SHAREABILITY_MASK,
> +                              GICR_PENDBASER_SHAREABILITY_SHIFT,
> +                              vgic_sanitise_shareability);
> +    reg = vgic_sanitise_field(reg, GICR_PENDBASER_INNER_CACHEABILITY_MASK,
> +                              GICR_PENDBASER_INNER_CACHEABILITY_SHIFT,
> +                              vgic_sanitise_inner_cacheability);
> +    reg = vgic_sanitise_field(reg, GICR_PENDBASER_OUTER_CACHEABILITY_MASK,
> +                              GICR_PENDBASER_OUTER_CACHEABILITY_SHIFT,
> +                              vgic_sanitise_outer_cacheability);
> +
> +    reg &= ~PENDBASER_RES0_MASK;
> +    reg &= ~GENMASK(51, 48);
> +    return reg;
> +}
> +
> +/*
> + * Allow mapping some parts of guest memory into Xen's VA space to have easy
> + * access to it. This is to allow ITS configuration data to be held in
> + * guest memory and avoid using Xen memory for that.
> + */
> +void *map_guest_pages(struct domain *d, paddr_t guest_addr, int nr_pages)

map_guest_pages and unmap_guest_pages are not ARM specific and should
live somewhere in xen/common, maybe xen/common/vmap.c.


> +{
> +    mfn_t onepage;
> +    mfn_t *pages;
> +    int i;
> +    void *ptr;
> +
> +    /* TODO: free previous mapping, change prototype? use get-put-put? */

No need: the caller is already doing that, right?


> +    guest_addr &= PAGE_MASK;
> +
> +    if ( nr_pages == 1 )
> +    {
> +        pages = &onepage;
> +    } else
> +    {
> +        pages = xmalloc_array(mfn_t, nr_pages);
> +        if ( !pages )
> +            return NULL;
> +    }

How often do you think only 1 page will be used? If the answer is not
often, I would get rid of the onepage optimization.


> +    for (i = 0; i < nr_pages; i++)
> +    {
> +        get_page_from_gfn(d, (guest_addr >> PAGE_SHIFT) + i, NULL, P2M_ALLOC);
> +        pages[i] = _mfn((guest_addr + i * PAGE_SIZE) >> PAGE_SHIFT);

Don't you need to call page_to_mfn (or gfn_to_mfn) to get the mfn?


> +    }
> +
> +    ptr = vmap(pages, nr_pages);

It is possible for vmap to fail and return NULL. Maybe because the guest
intentionally tried to break the hypervisor by passing an array of pages
that ends in an MMIO region. We need to check vmap errors and handle
them.


> +    if ( nr_pages > 1 )
> +        xfree(pages);
> +
> +    return ptr;
> +}
> +
> +void unmap_guest_pages(void *va, int nr_pages)
> +{
> +    paddr_t pa;
> +    unsigned long i;
> +
> +    if ( !va )
> +        return;
> +
> +    va = (void *)((uintptr_t)va & PAGE_MASK);
> +    pa = virt_to_maddr(va);
> +
> +    vunmap(va);
> +    for (i = 0; i < nr_pages; i++)
> +        put_page(mfn_to_page((pa >> PAGE_SHIFT) + i));
> +
> +    return;
> +}
> +
>  static int __vgic_v3_rdistr_rd_mmio_write(struct vcpu *v, mmio_info_t *info,
>                                            uint32_t gicr_reg,
>                                            register_t r)
>  {
>      struct hsr_dabt dabt = info->dabt;
> +    uint64_t reg;
>  
>      switch ( gicr_reg )
>      {
> @@ -375,13 +512,37 @@ static int __vgic_v3_rdistr_rd_mmio_write(struct vcpu *v, mmio_info_t *info,
>      case 0x0050:
>          goto write_reserved;
>  
> -    case VREG64(GICR_PROPBASER):
> -        /* LPI is not implemented */
> -        goto write_ignore_64;
> +    case VREG64(GICR_PROPBASER): {
> +        int nr_pages;

unsigned int


> +        if ( info->dabt.size < DABT_WORD ) goto bad_width;

Why not use vgic_reg64_check_access?


> +        if ( v->arch.vgic.flags & VGIC_V3_LPIS_ENABLED )
> +            return 1;

Why? If there is a good reason for this, it is probably worth writing it
in an in-code comment.


> +        reg = v->domain->arch.vgic.rdist_propbase;
> +        vgic_reg64_update(&reg, r, info);
> +        reg = sanitize_propbaser(reg);
> +        v->domain->arch.vgic.rdist_propbase = reg;
>  
> +        nr_pages = BIT((v->domain->arch.vgic.rdist_propbase & 0x1f) + 1) - 8192;
> +        nr_pages = DIV_ROUND_UP(nr_pages, PAGE_SIZE);

Do we need to set an upper limit on nr_pages? We don't really want to
allow up to 2^32/4096 pages, right?
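
With the size field unchecked, a guest can make Xen map a very large
property table. A self-contained sketch of the size computation with a
clamp (the MAX_VLPI_BITS cap and helper name are illustrative assumptions,
not part of the patch):

```c
#include <assert.h>
#include <stdint.h>

#define LPI_OFFSET      8192u   /* the first 8192 interrupt IDs are not LPIs */
#define PAGE_SIZE_4K    4096u
/* Hypothetical cap on guest-visible LPI bits (assumption for this sketch). */
#define MAX_VLPI_BITS   20u

/* Number of property-table pages implied by the PROPBASER IDbits field
 * (one byte per LPI, as in the patch's size computation). */
static unsigned int propbaser_nr_pages(uint64_t propbaser)
{
    unsigned int idbits = (propbaser & 0x1f) + 1;
    uint64_t nr_lpis;

    /* Clamp: without this a guest could request 2^32 IDs -> ~1M pages. */
    if ( idbits > MAX_VLPI_BITS )
        idbits = MAX_VLPI_BITS;
    if ( idbits < 14 )          /* fewer than 14 bits means no LPIs at all */
        return 0;

    nr_lpis = (1ULL << idbits) - LPI_OFFSET;
    return (nr_lpis + PAGE_SIZE_4K - 1) / PAGE_SIZE_4K;  /* DIV_ROUND_UP */
}
```

The same clamp would bound the vmap allocation below.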


> +        unmap_guest_pages(v->domain->arch.vgic.proptable, nr_pages);
> +        v->domain->arch.vgic.proptable = map_guest_pages(v->domain,
> +                                                         reg & GENMASK(47, 12),
> +                                                         nr_pages);

I am pretty sure I am reading the right spec now and it should be
GENMASK(51, 12). Also, don't we need to sanitize the table too before
using it?


> +        return 1;
> +    }
>      case VREG64(GICR_PENDBASER):
> -        /* LPI is not implemented */
> -        goto write_ignore_64;
> +        if ( info->dabt.size < DABT_WORD ) goto bad_width;

Why not use vgic_reg64_check_access?


> +	reg = v->arch.vgic.rdist_pendbase;
> +	vgic_reg64_update(&reg, r, info);
> +	reg = sanitize_pendbaser(reg);
> +	v->arch.vgic.rdist_pendbase = reg;
> +        unmap_guest_pages(v->arch.vgic.pendtable, 16);
> +	v->arch.vgic.pendtable = map_guest_pages(v->domain,
> +                                                 reg & GENMASK(47, 12), 16);

Indentation.
We need to make sure the vmap mapping is correct. We also need to
sanitize this table before using it.


> +	return 1;
>  
>      case 0x0080:
>          goto write_reserved;
> diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c
> index b961551..4d9304f 100644
> --- a/xen/arch/arm/vgic.c
> +++ b/xen/arch/arm/vgic.c
> @@ -488,6 +488,10 @@ struct pending_irq *lpi_to_pending(struct vcpu *v, unsigned int lpi,
>          empty->pirq.irq = lpi;
>      }
>  
> +    /* Update the enabled status */
> +    if ( gicv3_lpi_is_enabled(v->domain, lpi) )
> +        set_bit(GIC_IRQ_GUEST_ENABLED, &empty->pirq.status);

Where is the GIC_IRQ_GUEST_ENABLED unset?


>      return &empty->pirq;
>  }
>  
> diff --git a/xen/include/asm-arm/domain.h b/xen/include/asm-arm/domain.h
> index ae8a9de..0cd3500 100644
> --- a/xen/include/asm-arm/domain.h
> +++ b/xen/include/asm-arm/domain.h
> @@ -109,6 +109,8 @@ struct arch_domain
>          } *rdist_regions;
>          int nr_regions;                     /* Number of rdist regions */
>          uint32_t rdist_stride;              /* Re-Distributor stride */
> +        uint64_t rdist_propbase;
> +        uint8_t *proptable;

Do we need to keep both rdist_propbase and proptable? It is easy to go
from proptable to rdist_propbase and I guess it is not an operation that
is done often? If so, we could save some memory and remove it.


>  #endif
>      } vgic;
>  
> @@ -247,7 +249,10 @@ struct arch_vcpu
>  
>          /* GICv3: redistributor base and flags for this vCPU */
>          paddr_t rdist_base;
> -#define VGIC_V3_RDIST_LAST  (1 << 0)        /* last vCPU of the rdist */
> +#define VGIC_V3_RDIST_LAST      (1 << 0)        /* last vCPU of the rdist */
> +#define VGIC_V3_LPIS_ENABLED    (1 << 1)
> +        uint64_t rdist_pendbase;
> +        unsigned long *pendtable;

Same here.


>          uint8_t flags;
>          struct list_head pending_lpi_list;
>      } vgic;
> diff --git a/xen/include/asm-arm/gic-its.h b/xen/include/asm-arm/gic-its.h
> index 1f881c0..3b2e5c0 100644
> --- a/xen/include/asm-arm/gic-its.h
> +++ b/xen/include/asm-arm/gic-its.h
> @@ -139,7 +139,11 @@ int gicv3_lpi_drop_host_lpi(struct host_its *its,
>  
>  static inline int gicv3_lpi_get_priority(struct domain *d, uint32_t lpi)
>  {
> -    return GIC_PRI_IRQ;
> +    return d->arch.vgic.proptable[lpi - 8192] & 0xfc;

Please #define 0xfc. Do we need to check for LPI overflows, i.e. LPI
numbers larger than the proptable size?
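
A bounds-checked accessor would address both points; a self-contained
sketch (the mask/helper names and the fallback-to-default behaviour are
illustrative assumptions):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LPI_OFFSET            8192u
#define LPI_PROP_PRIO_MASK    0xfc   /* priority lives in bits [7:2] */
#define GIC_PRI_IRQ           0xa0   /* fallback priority */

/* Sketch of a bounds-checked lookup into a per-domain property table. */
static int lpi_get_priority(const uint8_t *proptable, size_t nr_lpis,
                            uint32_t lpi)
{
    /* Reject IDs below the LPI range or beyond the allocated table. */
    if ( lpi < LPI_OFFSET || lpi - LPI_OFFSET >= nr_lpis )
        return GIC_PRI_IRQ;

    return proptable[lpi - LPI_OFFSET] & LPI_PROP_PRIO_MASK;
}
```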


> +}
> +static inline bool gicv3_lpi_is_enabled(struct domain *d, uint32_t lpi)
> +{
> +    return d->arch.vgic.proptable[lpi - 8192] & LPI_PROP_ENABLED;
>  }
>  
>  #else
> @@ -185,6 +189,10 @@ static inline int gicv3_lpi_get_priority(struct domain *d, uint32_t lpi)
>  {
>      return GIC_PRI_IRQ;
>  }
> +static inline bool gicv3_lpi_is_enabled(struct domain *d, uint32_t lpi)
> +{
> +    return false;
> +}
>  
>  #endif /* CONFIG_HAS_ITS */
>  
> diff --git a/xen/include/asm-arm/vgic.h b/xen/include/asm-arm/vgic.h
> index 4e29ba6..2b216cc 100644
> --- a/xen/include/asm-arm/vgic.h
> +++ b/xen/include/asm-arm/vgic.h
> @@ -285,6 +285,9 @@ VGIC_REG_HELPERS(32, 0x3);
>  
>  #undef VGIC_REG_HELPERS
>  
> +void *map_guest_pages(struct domain *d, paddr_t guest_addr, int nr_pages);
> +void unmap_guest_pages(void *va, int nr_pages);
> +
>  enum gic_sgi_mode;
>  
>  /*


* Re: [RFC PATCH 01/24] ARM: GICv3 ITS: parse and store ITS subnodes from hardware DT
  2016-09-28 18:24 ` [RFC PATCH 01/24] ARM: GICv3 ITS: parse and store ITS subnodes from hardware DT Andre Przywara
  2016-10-26  1:11   ` Stefano Stabellini
@ 2016-11-01 15:13   ` Julien Grall
  2016-11-14 17:35     ` Andre Przywara
  1 sibling, 1 reply; 144+ messages in thread
From: Julien Grall @ 2016-11-01 15:13 UTC (permalink / raw)
  To: Andre Przywara, Stefano Stabellini; +Cc: xen-devel

Hi Andre,

On 28/09/2016 19:24, Andre Przywara wrote:
> Parse the DT GIC subnodes to find every ITS MSI controller the hardware
> offers. Store that information in a list to both propagate all of them
> later to Dom0, but also to be able to iterate over all ITSes.
> This introduces an ITS Kconfig option.
>
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> ---
>  xen/arch/arm/Kconfig          |  5 ++++
>  xen/arch/arm/Makefile         |  1 +
>  xen/arch/arm/gic-its.c        | 67 +++++++++++++++++++++++++++++++++++++++++++
>  xen/arch/arm/gic-v3.c         |  6 ++++
>  xen/include/asm-arm/gic-its.h | 57 ++++++++++++++++++++++++++++++++++++
>  5 files changed, 136 insertions(+)
>  create mode 100644 xen/arch/arm/gic-its.c
>  create mode 100644 xen/include/asm-arm/gic-its.h
>
> diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
> index 797c91f..9fe3b8e 100644
> --- a/xen/arch/arm/Kconfig
> +++ b/xen/arch/arm/Kconfig
> @@ -45,6 +45,11 @@ config ACPI
>  config HAS_GICV3
>  	bool
>
> +config HAS_ITS
> +        bool "GICv3 ITS MSI controller support"
> +        depends on ARM_64

HAS_GICV3 will only be selected for 64-bit. It would need some rework to 
be supported on 32-bit. So I would drop this dependency.

> +        depends on HAS_GICV3
> +

I am not convinced that we should (currently) let the user select ITS 
support. It increases the test matrix (we have to test both with and 
without it). Do we expect people to use GICv3 without an ITS?

Regards,

-- 
Julien Grall


* Re: [RFC PATCH 02/24] ARM: GICv3: allocate LPI pending and property table
  2016-09-28 18:24 ` [RFC PATCH 02/24] ARM: GICv3: allocate LPI pending and property table Andre Przywara
  2016-10-24 14:28   ` Vijay Kilari
  2016-10-26  1:10   ` Stefano Stabellini
@ 2016-11-01 17:22   ` Julien Grall
  2016-11-15 11:32     ` Andre Przywara
  2 siblings, 1 reply; 144+ messages in thread
From: Julien Grall @ 2016-11-01 17:22 UTC (permalink / raw)
  To: Andre Przywara, Stefano Stabellini; +Cc: xen-devel

Hi Andre,

On 28/09/2016 19:24, Andre Przywara wrote:
> The ARM GICv3 ITS provides a new kind of interrupt called LPIs.
> The pending bits and the configuration data (priority, enable bits) for
> those LPIs are stored in tables in normal memory, which software has to
> provide to the hardware.
> Allocate the required memory, initialize it and hand it over to each
> ITS. We limit the number of LPIs we use with a compile time constant to
> avoid wasting memory.
>
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> ---
>  xen/arch/arm/Kconfig              |  6 ++++
>  xen/arch/arm/efi/efi-boot.h       |  1 -
>  xen/arch/arm/gic-its.c            | 76 +++++++++++++++++++++++++++++++++++++++
>  xen/arch/arm/gic-v3.c             | 27 ++++++++++++++
>  xen/include/asm-arm/cache.h       |  4 +++
>  xen/include/asm-arm/gic-its.h     | 22 +++++++++++-
>  xen/include/asm-arm/gic_v3_defs.h | 48 ++++++++++++++++++++++++-
>  7 files changed, 181 insertions(+), 3 deletions(-)
>
> diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
> index 9fe3b8e..66e2bb8 100644
> --- a/xen/arch/arm/Kconfig
> +++ b/xen/arch/arm/Kconfig
> @@ -50,6 +50,12 @@ config HAS_ITS
>          depends on ARM_64
>          depends on HAS_GICV3
>
> +config HOST_LPI_BITS
> +        depends on HAS_ITS
> +        int "Maximum bits for GICv3 host LPIs (14-32)"
> +        range 14 32
> +        default "20"
> +

This would be better defined as a command-line parameter, so the 
user does not need to rebuild Xen in order to increase the number of 
bits supported. It would also be useful to give the rationale behind the 
default number in the commit message.
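
A self-contained sketch of parsing and clamping such a value (the option
name is hypothetical, and Xen's actual command-line plumbing would use its
own registration hooks rather than a bare parser; the 14..32 range mirrors
the Kconfig range above):

```c
#include <assert.h>
#include <stdlib.h>

/* Parse a hypothetical "max_lpi_bits=<n>" argument, clamped to the
 * architectural range of the GICD_TYPER IDbits field (14..32). */
static unsigned int parse_max_lpi_bits(const char *arg)
{
    unsigned long v = strtoul(arg, NULL, 0);

    if ( v < 14 )
        return 14;
    if ( v > 32 )
        return 32;
    return (unsigned int)v;
}
```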

>  config ALTERNATIVE
>  	bool
>
> diff --git a/xen/arch/arm/efi/efi-boot.h b/xen/arch/arm/efi/efi-boot.h
> index 045d6ce..dc64aec 100644
> --- a/xen/arch/arm/efi/efi-boot.h
> +++ b/xen/arch/arm/efi/efi-boot.h
> @@ -10,7 +10,6 @@
>  #include "efi-dom0.h"
>
>  void noreturn efi_xen_start(void *fdt_ptr, uint32_t fdt_size);
> -void __flush_dcache_area(const void *vaddr, unsigned long size);
>
>  #define DEVICE_TREE_GUID \
>  {0xb1b621d5, 0xf19c, 0x41a5, {0x83, 0x0b, 0xd9, 0x15, 0x2c, 0x69, 0xaa, 0xe0}}
> diff --git a/xen/arch/arm/gic-its.c b/xen/arch/arm/gic-its.c
> index 0f42a77..b52dff3 100644
> --- a/xen/arch/arm/gic-its.c
> +++ b/xen/arch/arm/gic-its.c

Please rename this file gic-v3-its to make clear ITS is only GICv3.

> @@ -20,10 +20,86 @@
>  #include <xen/lib.h>
>  #include <xen/device_tree.h>
>  #include <xen/libfdt/libfdt.h>
> +#include <asm/p2m.h>

Why did you include p2m.h? This header contains stage-2 page table 
functions but I don't see any use of them within this patch.

>  #include <asm/gic.h>
>  #include <asm/gic_v3_defs.h>
>  #include <asm/gic-its.h>
>
> +/* Global state */
> +static struct {
> +    uint8_t *lpi_property;
> +    int host_lpi_bits;

Please use unsigned int.

> +} lpi_data;
> +
> +/* Pending table for each redistributor */
> +static DEFINE_PER_CPU(void *, pending_table);
> +
> +#define MAX_HOST_LPI_BITS                                                \
> +        min_t(unsigned int, lpi_data.host_lpi_bits, CONFIG_HOST_LPI_BITS)

Why don't you directly initialize host_lpi_bits to the correct value? 
This would avoid computing the min every time you use MAX_HOST_LPI_BITS 
and save a few instructions.

> +#define MAX_HOST_LPIS   (BIT(MAX_HOST_LPI_BITS) - 8192)

I know that we don't support ITS on 32-bit, but I would rather avoid 
using BIT, as this macro works on unsigned long. I would prefer that 
you introduce BIT_ULL or open-code the shift.
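
For illustration, a BIT_ULL along these lines (the name is the suggested
addition, not something Xen defines today):

```c
#include <assert.h>
#include <stdint.h>

/* On 32-bit, (1UL << n) is undefined for n >= 32; a ULL variant is safe
 * for bit positions up to 63. */
#define BIT_ULL(nr) (1ULL << (nr))
```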

> +
> +uint64_t gicv3_lpi_allocate_pendtable(void)
> +{
> +    uint64_t reg, attr;
> +    void *pendtable;
> +
> +    attr  = GIC_BASER_CACHE_RaWaWb << GICR_PENDBASER_INNER_CACHEABILITY_SHIFT;
> +    attr |= GIC_BASER_CACHE_SameAsInner << GICR_PENDBASER_OUTER_CACHEABILITY_SHIFT;
> +    attr |= GIC_BASER_InnerShareable << GICR_PENDBASER_SHAREABILITY_SHIFT;

From the spec (8.11.18 in ARM IHI 0069C), the cacheability and 
shareability could be fixed (though this is marked as deprecated). Should 
we check whether the value sticks?

Also the variable attr seems pointless, as you assign the value directly 
to reg with no further computation.

> +
> +    /*
> +     * The pending table holds one bit per LPI, so we need three bits less
> +     * than the number of LPI_BITs. But the alignment requirement from the
> +     * ITS is 64K, so make order at least 16 (-12).
> +     */
> +    pendtable = alloc_xenheap_pages(MAX(lpi_data.host_lpi_bits - 3, 16) - 12, 0);
> +    if ( !pendtable )
> +        return 0;
> +
> +    memset(pendtable, 0, BIT(lpi_data.host_lpi_bits - 3));

Same remark for BIT here.

> +    this_cpu(pending_table) = pendtable;
> +
> +    reg  = attr | GICR_PENDBASER_PTZ;
> +    reg |= virt_to_maddr(pendtable) & GENMASK(51, 16);

I don't think the mask is useful, and it would need to be changed if the 
physical address bits increase, as happened in ARMv8.2.

> +
> +    return reg;
> +}
> +
> +uint64_t gicv3_lpi_get_proptable()
> +{
> +    uint64_t attr;
> +    static uint64_t reg = 0;
> +
> +    /* The property table is shared across all redistributors. */
> +    if ( reg )
> +        return reg;
> +
> +    attr  = GIC_BASER_CACHE_RaWaWb << GICR_PENDBASER_INNER_CACHEABILITY_SHIFT;
> +    attr |= GIC_BASER_CACHE_SameAsInner << GICR_PENDBASER_OUTER_CACHEABILITY_SHIFT;
> +    attr |= GIC_BASER_InnerShareable << GICR_PENDBASER_SHAREABILITY_SHIFT;

Same question for the cacheability and shareability.

Also the variable attr seems pointless, as you assign the value directly 
to reg with no further computation.

> +
> +    lpi_data.lpi_property = alloc_xenheap_pages(MAX_HOST_LPI_BITS - 12, 0);
> +    if ( !lpi_data.lpi_property )
> +        return 0;
> +
> +    memset(lpi_data.lpi_property, GIC_PRI_IRQ | LPI_PROP_RES1, MAX_HOST_LPIS);
> +    __flush_dcache_area(lpi_data.lpi_property, MAX_HOST_LPIS);
> +
> +    reg  = attr | ((MAX_HOST_LPI_BITS - 1) << 0);
> +    reg |= virt_to_maddr(lpi_data.lpi_property) & GENMASK(51, 12);

Same remark for the mask here.

> +
> +    return reg;
> +}
> +
> +int gicv3_lpi_init_host_lpis(int lpi_bits)

Please use unsigned int for lpi_bits.

Also this function should probably be in the __init section.

> +{
> +    lpi_data.host_lpi_bits = lpi_bits;
> +
> +    printk("GICv3: using at most %ld LPIs on the host.\n", MAX_HOST_LPIS);

%lu.

> +
> +    return 0;
> +}
> +
>  void gicv3_its_dt_init(const struct dt_device_node *node)
>  {
>      const struct dt_device_node *its = NULL;
> diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
> index 238da84..2534aa5 100644
> --- a/xen/arch/arm/gic-v3.c
> +++ b/xen/arch/arm/gic-v3.c
> @@ -546,6 +546,9 @@ static void __init gicv3_dist_init(void)
>      type = readl_relaxed(GICD + GICD_TYPER);
>      nr_lines = 32 * ((type & GICD_TYPE_LINES) + 1);
>
> +    if ( type & GICD_TYPE_LPIS )
> +        gicv3_lpi_init_host_lpis(((type >> GICD_TYPE_ID_BITS_SHIFT) & 0x1f) + 1);
> +
>      printk("GICv3: %d lines, (IID %8.8x).\n",
>             nr_lines, readl_relaxed(GICD + GICD_IIDR));
>
> @@ -615,6 +618,26 @@ static int gicv3_enable_redist(void)
>
>      return 0;
>  }

Missing blank line here.

> +static void gicv3_rdist_init_lpis(void __iomem * rdist_base)
> +{
> +    uint32_t reg;
> +    uint64_t table_reg;
> +
> +    if ( list_empty(&host_its_list) )
> +        return;
> +
> +    /* Make sure LPIs are disabled before setting up the BASERs. */
> +    reg = readl_relaxed(rdist_base + GICR_CTLR);
> +    writel_relaxed(reg & ~GICR_CTLR_ENABLE_LPIS, rdist_base + GICR_CTLR);
> +
> +    table_reg = gicv3_lpi_allocate_pendtable();
> +    if ( table_reg )
> +        writeq_relaxed(table_reg, rdist_base + GICR_PENDBASER);

Is it fine to continue silently if gicv3_lpi_allocate_pendtable has failed?

> +
> +    table_reg = gicv3_lpi_get_proptable();
> +    if ( table_reg )
> +        writeq_relaxed(table_reg, rdist_base + GICR_PROPBASER);

Ditto.


> +}
>
>  static int __init gicv3_populate_rdist(void)
>  {
> @@ -658,6 +681,10 @@ static int __init gicv3_populate_rdist(void)
>              if ( (typer >> 32) == aff )
>              {
>                  this_cpu(rbase) = ptr;
> +
> +                if ( typer & GICR_TYPER_PLPIS )
> +                    gicv3_rdist_init_lpis(ptr);
> +
>                  printk("GICv3: CPU%d: Found redistributor in region %d @%p\n",
>                          smp_processor_id(), i, ptr);
>                  return 0;
> diff --git a/xen/include/asm-arm/cache.h b/xen/include/asm-arm/cache.h
> index 2de6564..af96eee 100644
> --- a/xen/include/asm-arm/cache.h
> +++ b/xen/include/asm-arm/cache.h
> @@ -7,6 +7,10 @@
>  #define L1_CACHE_SHIFT  (CONFIG_ARM_L1_CACHE_SHIFT)
>  #define L1_CACHE_BYTES  (1 << L1_CACHE_SHIFT)
>
> +#ifndef __ASSEMBLY__
> +void __flush_dcache_area(const void *vaddr, unsigned long size);
> +#endif
> +

Please move this change in a separate patch.

>  #define __read_mostly __section(".data.read_mostly")
>
>  #endif
> diff --git a/xen/include/asm-arm/gic-its.h b/xen/include/asm-arm/gic-its.h
> index 2f5c51c..48c6c78 100644
> --- a/xen/include/asm-arm/gic-its.h
> +++ b/xen/include/asm-arm/gic-its.h
> @@ -36,12 +36,32 @@ extern struct list_head host_its_list;
>  /* Parse the host DT and pick up all host ITSes. */
>  void gicv3_its_dt_init(const struct dt_device_node *node);
>
> +/* Allocate and initialize tables for each host redistributor.
> + * Returns the respective {PROP,PEND}BASER register value.
> + */
> +uint64_t gicv3_lpi_get_proptable(void);
> +uint64_t gicv3_lpi_allocate_pendtable(void);
> +
> +/* Initialize the host structures for LPIs. */
> +int gicv3_lpi_init_host_lpis(int nr_lpis);
> +
>  #else
>
>  static inline void gicv3_its_dt_init(const struct dt_device_node *node)
>  {
>  }
> -

Please add a newline here.

> +static inline uint64_t gicv3_lpi_get_proptable(void)
> +{
> +    return 0;
> +}

Ditto

> +static inline uint64_t gicv3_lpi_allocate_pendtable(void)
> +{
> +    return 0;
> +}

Ditto

> +static inline int gicv3_lpi_init_host_lpis(int nr_lpis)
> +{
> +    return 0;
> +}

Ditto

>  #endif /* CONFIG_HAS_ITS */
>
>  #endif /* __ASSEMBLY__ */
> diff --git a/xen/include/asm-arm/gic_v3_defs.h b/xen/include/asm-arm/gic_v3_defs.h
> index 6bd25a5..da5fb77 100644
> --- a/xen/include/asm-arm/gic_v3_defs.h
> +++ b/xen/include/asm-arm/gic_v3_defs.h
> @@ -44,7 +44,8 @@
>  #define GICC_SRE_EL2_ENEL1           (1UL << 3)
>
>  /* Additional bits in GICD_TYPER defined by GICv3 */
> -#define GICD_TYPE_ID_BITS_SHIFT 19
> +#define GICD_TYPE_ID_BITS_SHIFT      19
> +#define GICD_TYPE_LPIS               (1U << 17)

I was about to say that this should be named GICD_TYPER... but it looks 
like we already define and use GICD_TYPE_ID_BITS_SHIFT. So it is up to 
you whether to rename it to match the correct register name.

>
>  #define GICD_CTLR_RWP                (1UL << 31)
>  #define GICD_CTLR_ARE_NS             (1U << 4)
> @@ -95,12 +96,57 @@
>  #define GICR_IGRPMODR0               (0x0D00)
>  #define GICR_NSACR                   (0x0E00)
>
> +#define GICR_CTLR_ENABLE_LPIS        (1U << 0)

Please add a new line here to separate definition for GICR_CTLR and 
GICR_TYPER.

>  #define GICR_TYPER_PLPIS             (1U << 0)
>  #define GICR_TYPER_VLPIS             (1U << 1)
>  #define GICR_TYPER_LAST              (1U << 4)
>
> +#define GIC_BASER_CACHE_nCnB         0ULL
> +#define GIC_BASER_CACHE_SameAsInner  0ULL

I think this would require some description in the code, as it is not 
clear whether nCnB applies to Outer or Inner. From my understanding it 
is only the latter.

> +#define GIC_BASER_CACHE_nC           1ULL
> +#define GIC_BASER_CACHE_RaWt         2ULL
> +#define GIC_BASER_CACHE_RaWb         3ULL
> +#define GIC_BASER_CACHE_WaWt         4ULL
> +#define GIC_BASER_CACHE_WaWb         5ULL
> +#define GIC_BASER_CACHE_RaWaWt       6ULL
> +#define GIC_BASER_CACHE_RaWaWb       7ULL
> +#define GIC_BASER_CACHE_MASK         7ULL

New line here please, and maybe a comment to say these are the 
shareability definitions.

> +#define GIC_BASER_NonShareable       0ULL
> +#define GIC_BASER_InnerShareable     1ULL
> +#define GIC_BASER_OuterShareable     2ULL
> +
> +#define GICR_PROPBASER_SHAREABILITY_SHIFT               10
> +#define GICR_PROPBASER_INNER_CACHEABILITY_SHIFT         7

> +#define GICR_PROPBASER_OUTER_CACHEABILITY_SHIFT         56
> +#define GICR_PROPBASER_SHAREABILITY_MASK                     \
> +        (3UL << GICR_PROPBASER_SHAREABILITY_SHIFT)

It might be better to define GIC_BASER_SHAREABILITY_MASK rather than 
open-coding 3UL. Also technically 3UL should be 3ULL.
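
For example (a sketch; the GIC_BASER_*_MASK names are the suggested
additions):

```c
#include <assert.h>
#include <stdint.h>

/* Field widths shared by PROPBASER/PENDBASER/GITS_BASERn, so the
 * per-register masks do not open-code 3ULL/7ULL. */
#define GIC_BASER_SHAREABILITY_MASK          0x3ULL
#define GIC_BASER_CACHEABILITY_MASK          0x7ULL

#define GICR_PROPBASER_SHAREABILITY_SHIFT         10
#define GICR_PROPBASER_INNER_CACHEABILITY_SHIFT   7
#define GICR_PROPBASER_SHAREABILITY_MASK \
    (GIC_BASER_SHAREABILITY_MASK << GICR_PROPBASER_SHAREABILITY_SHIFT)
#define GICR_PROPBASER_INNER_CACHEABILITY_MASK \
    (GIC_BASER_CACHEABILITY_MASK << GICR_PROPBASER_INNER_CACHEABILITY_SHIFT)
```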

> +#define GICR_PROPBASER_INNER_CACHEABILITY_MASK               \
> +        (7UL << GICR_PROPBASER_INNER_CACHEABILITY_SHIFT)

Same remark here.

> +#define GICR_PROPBASER_OUTER_CACHEABILITY_MASK               \
> +        (7UL << GICR_PROPBASER_OUTER_CACHEABILITY_SHIFT)

Ditto

> +#define PROPBASER_RES0_MASK                                  \

I would probably rename this field GICR_PROPBASER_RES0_MASK.

> +        (GENMASK(63, 59) | GENMASK(55, 52) | GENMASK(6, 5))
> +
> +#define GICR_PENDBASER_SHAREABILITY_SHIFT               10
> +#define GICR_PENDBASER_INNER_CACHEABILITY_SHIFT         7
> +#define GICR_PENDBASER_OUTER_CACHEABILITY_SHIFT         56
> +#define GICR_PENDBASER_SHAREABILITY_MASK                     \
> +	(3UL << GICR_PENDBASER_SHAREABILITY_SHIFT)

See my remark above.

> +#define GICR_PENDBASER_INNER_CACHEABILITY_MASK               \
> +	(7UL << GICR_PENDBASER_INNER_CACHEABILITY_SHIFT)

Ditto

> +#define GICR_PENDBASER_OUTER_CACHEABILITY_MASK               \
> +        (7UL << GICR_PENDBASER_OUTER_CACHEABILITY_SHIFT)
> +#define GICR_PENDBASER_PTZ                              BIT(62)

Please don't use BIT but either 1ULL << 62 or introduce BIT_ULL.

> +#define PENDBASER_RES0_MASK                                  \

GICR_PENDBASER_RES0_MASK

> +        (BIT(63) | GENMASK(61, 59) | GENMASK(55, 52) |       \
> +         GENMASK(15, 12) | GENMASK(6, 0))
> +
>  #define DEFAULT_PMR_VALUE            0xff
>
> +#define LPI_PROP_DEFAULT_PRIO        0xa0

You define LPI_PROP_DEFAULT_PRIO but never use it within this series. 
In any case, it would be better to keep using GIC_PRI_IRQ (as you did) 
and make LPI_PROP_DEFAULT_PRIO an alias of GIC_PRI_IRQ, to avoid 
spreading priority values everywhere (for now they are all defined in 
gic.h).

> +#define LPI_PROP_RES1                (1 << 1)
> +#define LPI_PROP_ENABLED             (1 << 0)
> +
>  #define GICH_VMCR_EOI                (1 << 9)
>  #define GICH_VMCR_VENG1              (1 << 1)
>
>

Regards,

-- 
Julien Grall


* Re: [RFC PATCH 03/24] ARM: GICv3 ITS: allocate device and collection table
  2016-10-26 22:57   ` Stefano Stabellini
@ 2016-11-01 17:34     ` Julien Grall
  2016-11-10 15:32     ` Andre Przywara
  1 sibling, 0 replies; 144+ messages in thread
From: Julien Grall @ 2016-11-01 17:34 UTC (permalink / raw)
  To: Stefano Stabellini, Andre Przywara; +Cc: xen-devel

Hi Stefano,

On 26/10/2016 23:57, Stefano Stabellini wrote:
>> +    int pagesz;
>> +    int order;
>> +    void *buffer = NULL;
>> +
>> +    attr  = GIC_BASER_InnerShareable << GITS_BASER_SHAREABILITY_SHIFT;
>> +    attr |= GIC_BASER_CACHE_SameAsInner << GITS_BASER_OUTER_CACHEABILITY_SHIFT;
>> +    attr |= GIC_BASER_CACHE_RaWaWb << GITS_BASER_INNER_CACHEABILITY_SHIFT;
>> +
>> +    /*
>> +     * Loop over the page sizes (4K, 16K, 64K) to find out what the host
>> +     * supports.
>> +     */
>
> Is this really the best way to do it? Can't we assume ITS supports 4K,
> given that Xen requires 4K pages at the moment? Is it actually possible
> to find hardware that supports 4K but with an ITS that only supports 64K
> or 16K pages? It seems insane to me. Otherwise can't we probe the page
> size somehow?

By reading the spec (8.19.1 in IHI 0069C):

"If the GIC implementation supports only a single, fixed page size, this 
field might be RO.
When this register has an architecturally-defined reset value, if this 
field is implemented as an RW
field, it resets to a value that is architecturally UNKNOWN."

As the reset value is architecturally unknown the only way to find out 
the correct page size is to try them one by one.

The GIC is a separate component of the platform and will be programmed 
using physical addresses (not virtual ones). It would be fine to have 
BASER registers supporting only 64K to save a few lines in the GIC.
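
The try-one-by-one probing can be illustrated with a self-contained model
of a BASER whose page-size field is read-only (register layout simplified
to just that field; the fake_* accessors stand in for MMIO reads/writes):

```c
#include <assert.h>
#include <stdint.h>

#define GITS_BASER_PAGE_SIZE_SHIFT  8
#define GITS_BASER_PAGE_SIZE_MASK   (0x3ULL << GITS_BASER_PAGE_SIZE_SHIFT)
enum { PSZ_4K = 0, PSZ_16K = 1, PSZ_64K = 2 };

/* Model a BASER whose page-size field is RO at 64K, as the spec permits. */
static uint64_t fake_baser;
static void baser_write(uint64_t val)
{
    fake_baser = (val & ~GITS_BASER_PAGE_SIZE_MASK) |
                 ((uint64_t)PSZ_64K << GITS_BASER_PAGE_SIZE_SHIFT);
}
static uint64_t baser_read(void) { return fake_baser; }

/* Probe: try each page size, read back, keep the first one that sticks. */
static int probe_baser_pagesize(void)
{
    int psz;

    for ( psz = PSZ_4K; psz <= PSZ_64K; psz++ )
    {
        uint64_t reg = (uint64_t)psz << GITS_BASER_PAGE_SIZE_SHIFT;

        baser_write(reg);
        if ( ((baser_read() & GITS_BASER_PAGE_SIZE_MASK)
              >> GITS_BASER_PAGE_SIZE_SHIFT) == (uint64_t)psz )
            return psz;
    }
    return -1;
}
```

Here the 4K and 16K writes do not stick, so the probe settles on 64K.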

Regards,

-- 
Julien Grall


* Re: [RFC PATCH 03/24] ARM: GICv3 ITS: allocate device and collection table
  2016-09-28 18:24 ` [RFC PATCH 03/24] ARM: GICv3 ITS: allocate device and collection table Andre Przywara
                     ` (2 preceding siblings ...)
  2016-10-26 22:57   ` Stefano Stabellini
@ 2016-11-01 18:19   ` Julien Grall
  3 siblings, 0 replies; 144+ messages in thread
From: Julien Grall @ 2016-11-01 18:19 UTC (permalink / raw)
  To: Andre Przywara, Stefano Stabellini; +Cc: xen-devel

Hi Andre,

On 28/09/2016 19:24, Andre Przywara wrote:
> Each ITS maps a pair of a DeviceID (usually the PCI b/d/f triplet) and
> an EventID (the MSI payload or interrupt ID) to a pair of LPI number
> and collection ID, which points to the target CPU.
> This mapping is stored in the device and collection tables, which software
> has to provide for the ITS to use.
> Allocate the required memory and hand it over to the ITS.
> We limit the number of devices to cover 4 PCI busses for now.
>
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> ---
>  xen/arch/arm/gic-its.c        | 114 ++++++++++++++++++++++++++++++++++++++++++
>  xen/arch/arm/gic-v3.c         |   5 ++
>  xen/include/asm-arm/gic-its.h |  49 +++++++++++++++++-
>  3 files changed, 167 insertions(+), 1 deletion(-)
>
> diff --git a/xen/arch/arm/gic-its.c b/xen/arch/arm/gic-its.c
> index b52dff3..40238a2 100644
> --- a/xen/arch/arm/gic-its.c
> +++ b/xen/arch/arm/gic-its.c
> @@ -21,6 +21,7 @@
>  #include <xen/device_tree.h>
>  #include <xen/libfdt/libfdt.h>
>  #include <asm/p2m.h>
> +#include <asm/io.h>
>  #include <asm/gic.h>
>  #include <asm/gic_v3_defs.h>
>  #include <asm/gic-its.h>
> @@ -38,6 +39,119 @@ static DEFINE_PER_CPU(void *, pending_table);
>          min_t(unsigned int, lpi_data.host_lpi_bits, CONFIG_HOST_LPI_BITS)
>  #define MAX_HOST_LPIS   (BIT(MAX_HOST_LPI_BITS) - 8192)
>
> +#define BASER_ATTR_MASK                                           \
> +        ((0x3UL << GITS_BASER_SHAREABILITY_SHIFT)               | \
> +         (0x7UL << GITS_BASER_OUTER_CACHEABILITY_SHIFT)         | \
> +         (0x7UL << GITS_BASER_INNER_CACHEABILITY_SHIFT))
> +#define BASER_RO_MASK   (GENMASK(52, 48) | GENMASK(58, 56))

I did not notice this in the previous patch, but I would rather introduce 
GENMASK_ULL to avoid any issue the day we decide to port this to 32-bit 
(for a similar reason as BIT).

> +
> +static uint64_t encode_phys_addr(paddr_t addr, int page_bits)

Please make page_bits unsigned int.

> +{
> +    uint64_t ret;
> +
> +    if ( page_bits < 16)

Coding style.

> +        return (uint64_t)addr & GENMASK(47, page_bits);
> +
> +    ret = addr & GENMASK(47, 16);
> +    return ret | (addr & GENMASK(51, 48)) >> (48 - 12);
> +}
> +
> +static int gicv3_map_baser(void __iomem *basereg, uint64_t regc, int nr_items)

unsigned int nr_items

Also, can you find a better name for regc and/or reg? The names are very 
similar, which could be error-prone.

> +{
> +    uint64_t attr;
> +    int entry_size = (regc >> GITS_BASER_ENTRY_SIZE_SHIFT) & 0x1f;

unsigned int here.

> +    int pagesz;

Same.

> +    int order;

Same.

> +    void *buffer = NULL;
> +
> +    attr  = GIC_BASER_InnerShareable << GITS_BASER_SHAREABILITY_SHIFT;
> +    attr |= GIC_BASER_CACHE_SameAsInner << GITS_BASER_OUTER_CACHEABILITY_SHIFT;
> +    attr |= GIC_BASER_CACHE_RaWaWb << GITS_BASER_INNER_CACHEABILITY_SHIFT;
> +
> +    /*
> +     * Loop over the page sizes (4K, 16K, 64K) to find out what the host
> +     * supports.
> +     */
> +    for (pagesz = 0; pagesz < 3; pagesz++)

Coding style: for ( ... )

> +    {
> +        uint64_t reg;
> +        int nr_bytes;

unsigned int nr_bytes.

> +
> +        nr_bytes = ROUNDUP(nr_items * entry_size, BIT(pagesz * 2 + 12));
> +        order = get_order_from_bytes(nr_bytes);
> +
> +        if ( !buffer )
> +            buffer = alloc_xenheap_pages(order, 0);
> +        if ( !buffer )
> +            return -ENOMEM;
> +
> +        reg  = attr;
> +        reg |= (pagesz << GITS_BASER_PAGE_SIZE_SHIFT);
> +        reg |= nr_bytes >> (pagesz * 2 + 12);
> +        reg |= regc & BASER_RO_MASK;
> +        reg |= GITS_BASER_VALID;
> +        reg |= encode_phys_addr(virt_to_maddr(buffer), pagesz * 2 + 12);
> +
> +        writeq_relaxed(reg, basereg);
> +        regc = readl_relaxed(basereg);

Shouldn't it be readq_relaxed?

> +
> +        /* The host didn't like our attributes, just use what it returned. */
> +        if ( (regc & BASER_ATTR_MASK) != attr )
> +            attr = regc & BASER_ATTR_MASK;

Should we flush the cache if the GIC does not support shareability?


> +
> +        /* If the host accepted our page size, we are done. */
> +        if ( (reg & (3UL << GITS_BASER_PAGE_SIZE_SHIFT)) == pagesz )

You probably want a define for 3UL << GITS_BASER_PAGE_SIZE_SHIFT.

> +            return 0;
> +
> +        /* Check whether our buffer is aligned to the next page size already. */
> +        if ( !(virt_to_maddr(buffer) & (BIT(pagesz * 2 + 12 + 2) - 1)) )

Well, the buffer could be aligned to the next page size, but the size 
might not be page aligned. So you would provide the GIC with memory that 
it cannot touch.

So you always have to reallocate the buffer from scratch to avoid this 
issue.
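
The reuse condition can be written down explicitly — a hypothetical helper
for illustration only (`baser_buffer_reusable` is not a name from the
series):

```c
#include <stdint.h>
#include <stdbool.h>

/*
 * A buffer allocated for one ITS page size can only be handed to the GIC
 * unchanged for the next page size (0=4K, 1=16K, 2=64K) if both its base
 * address *and* its size are aligned to that bigger page size; otherwise
 * the trailing partial page would be memory the GIC must not touch.
 */
static bool baser_buffer_reusable(uint64_t base, uint64_t size,
                                  int next_pagesz)
{
    uint64_t pgsize = 1ULL << (next_pagesz * 2 + 12);

    return !(base & (pgsize - 1)) && !(size & (pgsize - 1));
}
```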

> +        {
> +            free_xenheap_pages(buffer, order);
> +            buffer = NULL;
> +        }
> +    }
> +
> +    if ( buffer )
> +        free_xenheap_pages(buffer, order);
> +
> +    return -EINVAL;
> +}
> +
> +int gicv3_its_init(struct host_its *hw_its)
> +{
> +    uint64_t reg;
> +    int i;
> +
> +    hw_its->its_base = ioremap_nocache(hw_its->addr, hw_its->size);
> +    if ( !hw_its->its_base )
> +        return -ENOMEM;
> +
> +    for (i = 0; i < 8; i++)

Coding style: for ( ... )

> +    {
> +        void __iomem *basereg = hw_its->its_base + GITS_BASER0 + i * 8;
> +        int type;
> +
> +        reg = readq_relaxed(basereg);
> +        type = (reg >> 56) & 0x7;
> +        switch ( type )
> +        {
> +        case GITS_BASER_TYPE_NONE:
> +            continue;
> +        case GITS_BASER_TYPE_DEVICE:
> +            /* TODO: find some better way of limiting the number of devices */
> +            gicv3_map_baser(basereg, reg, 1024);

A flat table may use a huge amount of memory on some platforms (depending 
on how many devices are present and how sparse the ID space is). It would 
probably be worth adding a TODO regarding the support of one/two-level 
tables.

Also, the limitation could be dropped if we used a one/two-level table, 
as the amount of memory pre-allocated would be smaller.

> +            break;
> +        case GITS_BASER_TYPE_COLLECTION:
> +            gicv3_map_baser(basereg, reg, NR_CPUS);
> +            break;
> +        default:
> +            continue;
> +        }
> +    }
> +
> +    return 0;
> +}
> +
>  uint64_t gicv3_lpi_allocate_pendtable(void)
>  {
>      uint64_t reg, attr;
> diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
> index 2534aa5..5cf4618 100644
> --- a/xen/arch/arm/gic-v3.c
> +++ b/xen/arch/arm/gic-v3.c
> @@ -29,6 +29,7 @@
>  #include <xen/irq.h>
>  #include <xen/iocap.h>
>  #include <xen/sched.h>
> +#include <xen/err.h>
>  #include <xen/errno.h>
>  #include <xen/delay.h>
>  #include <xen/device_tree.h>
> @@ -1548,6 +1549,7 @@ static int __init gicv3_init(void)
>  {
>      int res, i;
>      uint32_t reg;
> +    struct host_its *hw_its;
>
>      if ( !cpu_has_gicv3 )
>      {
> @@ -1603,6 +1605,9 @@ static int __init gicv3_init(void)
>      res = gicv3_cpu_init();
>      gicv3_hyp_init();
>
> +    list_for_each_entry(hw_its, &host_its_list, entry)
> +        gicv3_its_init(hw_its);
> +
>      spin_unlock(&gicv3.lock);
>
>      return res;
> diff --git a/xen/include/asm-arm/gic-its.h b/xen/include/asm-arm/gic-its.h
> index 48c6c78..589b889 100644
> --- a/xen/include/asm-arm/gic-its.h
> +++ b/xen/include/asm-arm/gic-its.h
> @@ -18,6 +18,47 @@
>  #ifndef __ASM_ARM_ITS_H__
>  #define __ASM_ARM_ITS_H__
>
> +#define LPI_OFFSET      8192
> +
> +#define GITS_CTLR       (0x000)
> +#define GITS_IIDR       (0x004)
> +#define GITS_TYPER      (0x008)
> +#define GITS_CBASER     (0x080)
> +#define GITS_CWRITER    (0x088)
> +#define GITS_CREADR     (0x090)
> +#define GITS_BASER0     (0x100)
> +#define GITS_BASER1     (0x108)
> +#define GITS_BASER2     (0x110)
> +#define GITS_BASER3     (0x118)
> +#define GITS_BASER4     (0x120)
> +#define GITS_BASER5     (0x128)
> +#define GITS_BASER6     (0x130)
> +#define GITS_BASER7     (0x138)
> +
> +/* Register bits */
> +#define GITS_CTLR_ENABLE     0x1
> +#define GITS_IIDR_VALUE      0x34c
> +
> +#define GITS_BASER_VALID                BIT(63)
> +#define GITS_BASER_INDIRECT             BIT(62)
> +#define GITS_BASER_INNER_CACHEABILITY_SHIFT        59
> +#define GITS_BASER_TYPE_SHIFT           56
> +#define GITS_BASER_OUTER_CACHEABILITY_SHIFT        53
> +#define GITS_BASER_TYPE_NONE            0UL
> +#define GITS_BASER_TYPE_DEVICE          1UL
> +#define GITS_BASER_TYPE_VCPU            2UL
> +#define GITS_BASER_TYPE_CPU             3UL
> +#define GITS_BASER_TYPE_COLLECTION      4UL
> +#define GITS_BASER_TYPE_RESERVED5       5UL
> +#define GITS_BASER_TYPE_RESERVED6       6UL
> +#define GITS_BASER_TYPE_RESERVED7       7UL
> +#define GITS_BASER_ENTRY_SIZE_SHIFT     48
> +#define GITS_BASER_SHAREABILITY_SHIFT   10
> +#define GITS_BASER_PAGE_SIZE_SHIFT      8
> +#define GITS_BASER_RO_MASK              ((7UL << GITS_BASER_TYPE_SHIFT) | \
> +                                        (31UL << GITS_BASER_ENTRY_SIZE_SHIFT) |\
> +                                        GITS_BASER_INDIRECT)
> +
>  #ifndef __ASSEMBLY__
>  #include <xen/device_tree.h>
>
> @@ -27,6 +68,7 @@ struct host_its {
>      const struct dt_device_node *dt_node;
>      paddr_t addr;
>      paddr_t size;
> +    void __iomem *its_base;
>  };
>
>  extern struct list_head host_its_list;
> @@ -42,8 +84,9 @@ void gicv3_its_dt_init(const struct dt_device_node *node);
>  uint64_t gicv3_lpi_get_proptable(void);
>  uint64_t gicv3_lpi_allocate_pendtable(void);
>
> -/* Initialize the host structures for LPIs. */
> +/* Initialize the host structures for LPIs and the host ITSes. */
>  int gicv3_lpi_init_host_lpis(int nr_lpis);
> +int gicv3_its_init(struct host_its *hw_its);
>
>  #else
>
> @@ -62,6 +105,10 @@ static inline int gicv3_lpi_init_host_lpis(int nr_lpis)
>  {
>      return 0;
>  }

Please add a newline here and ...

> +static inline int gicv3_its_init(struct host_its *hw_its)
> +{
> +    return 0;
> +}

here.

>  #endif /* CONFIG_HAS_ITS */
>
>  #endif /* __ASSEMBLY__ */
>

Regards,

-- 
Julien Grall


* Re: [RFC PATCH 04/24] ARM: GICv3 ITS: map ITS command buffer
  2016-09-28 18:24 ` [RFC PATCH 04/24] ARM: GICv3 ITS: map ITS command buffer Andre Przywara
  2016-10-24 14:31   ` Vijay Kilari
  2016-10-26 23:03   ` Stefano Stabellini
@ 2016-11-02 13:38   ` Julien Grall
  2 siblings, 0 replies; 144+ messages in thread
From: Julien Grall @ 2016-11-02 13:38 UTC (permalink / raw)
  To: Andre Przywara, Stefano Stabellini; +Cc: xen-devel

Hello Andre,

On 28/09/16 19:24, Andre Przywara wrote:
> Instead of directly manipulating the tables in memory, an ITS driver
> sends commands via a ring buffer to the ITS h/w to create or alter the
> LPI mappings.
> Allocate memory for that buffer and tell the ITS about it to be able
> to send ITS commands.
>
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> ---
>  xen/arch/arm/gic-its.c        | 25 +++++++++++++++++++++++++
>  xen/include/asm-arm/gic-its.h |  1 +
>  2 files changed, 26 insertions(+)
>
> diff --git a/xen/arch/arm/gic-its.c b/xen/arch/arm/gic-its.c
> index 40238a2..c8a7a7e 100644
> --- a/xen/arch/arm/gic-its.c
> +++ b/xen/arch/arm/gic-its.c
> @@ -18,6 +18,7 @@
>
>  #include <xen/config.h>
>  #include <xen/lib.h>
> +#include <xen/err.h>
>  #include <xen/device_tree.h>
>  #include <xen/libfdt/libfdt.h>
>  #include <asm/p2m.h>
> @@ -56,6 +57,26 @@ static uint64_t encode_phys_addr(paddr_t addr, int page_bits)
>      return ret | (addr & GENMASK(51, 48)) >> (48 - 12);
>  }
>
> +static void *gicv3_map_cbaser(void __iomem *cbasereg)
> +{
> +    uint64_t attr, reg;
> +    void *buffer;
> +
> +    attr  = GIC_BASER_InnerShareable << GITS_BASER_SHAREABILITY_SHIFT;
> +    attr |= GIC_BASER_CACHE_SameAsInner << GITS_BASER_OUTER_CACHEABILITY_SHIFT;
> +    attr |= GIC_BASER_CACHE_RaWaWb << GITS_BASER_INNER_CACHEABILITY_SHIFT;
> +

You could directly use 'reg' here rather than have a temporary variable.

Also what if the shareability/cacheability has been fixed by the hardware?

> +    buffer = alloc_xenheap_pages(0, 0);

Please document why you decided to use only a 4K page (is there 
potentially a drawback?). Also I would prefer if you added a define for 
the size of the command queue. This would be more readable.

> +    if ( !buffer )
> +        return ERR_PTR(-ENOMEM);
> +
> +    /* We use exactly one 4K page, so the "Size" field is 0. */
> +    reg = attr | BIT(63) | (virt_to_maddr(buffer) & GENMASK(51, 12));

Please introduce a define for bit 63. Also masking the address is not 
useful.

> +    writeq_relaxed(reg, cbasereg);

Shouldn't we initialize GITS_CWRITER to 0? From the spec, the field 
resets to an UNKNOWN value (see 8.19.5).

> +
> +    return buffer;
> +}
> +
>  static int gicv3_map_baser(void __iomem *basereg, uint64_t regc, int nr_items)
>  {
>      uint64_t attr;
> @@ -149,6 +170,10 @@ int gicv3_its_init(struct host_its *hw_its)
>          }
>      }
>
> +    hw_its->cmd_buf = gicv3_map_cbaser(hw_its->its_base + GITS_CBASER);
> +    if ( IS_ERR(hw_its->cmd_buf) )
> +        return PTR_ERR(hw_its->cmd_buf);
> +
>      return 0;
>  }
>
> diff --git a/xen/include/asm-arm/gic-its.h b/xen/include/asm-arm/gic-its.h
> index 589b889..b2a003f 100644
> --- a/xen/include/asm-arm/gic-its.h
> +++ b/xen/include/asm-arm/gic-its.h
> @@ -69,6 +69,7 @@ struct host_its {
>      paddr_t addr;
>      paddr_t size;
>      void __iomem *its_base;
> +    void *cmd_buf;
>  };
>
>  extern struct list_head host_its_list;
>

Regards,

-- 
Julien Grall


* Re: [RFC PATCH 00/24] [FOR 4.9] arm64: Dom0 ITS emulation
  2016-09-28 18:24 [RFC PATCH 00/24] [FOR 4.9] arm64: Dom0 ITS emulation Andre Przywara
                   ` (23 preceding siblings ...)
  2016-09-28 18:24 ` [RFC PATCH 24/24] ARM: vGIC: advertising LPI support Andre Przywara
@ 2016-11-02 13:56 ` Julien Grall
  24 siblings, 0 replies; 144+ messages in thread
From: Julien Grall @ 2016-11-02 13:56 UTC (permalink / raw)
  To: Andre Przywara, Stefano Stabellini; +Cc: xen-devel, vijay.kilari

Hi,

On 28/09/16 19:24, Andre Przywara wrote:
> Andre Przywara (24):
>   ARM: GICv3 ITS: parse and store ITS subnodes from hardware DT
>   ARM: GICv3: allocate LPI pending and property table
>   ARM: GICv3 ITS: allocate device and collection table
>   ARM: GICv3 ITS: map ITS command buffer
>   ARM: GICv3 ITS: introduce ITS command handling
>   ARM: GICv3 ITS: introduce host LPI array
>   ARM: GICv3 ITS: introduce device mapping

Andre and I talked IRL about the GICv3 ITS host driver; I will summarize
here for the others.

Whilst reviewing the host driver, I was wondering if we could share the
driver with Linux. I remember that Vijay managed to do it in his
series [1] and I quite liked the idea. It makes it easier to track bugs
and errata, as we would rely on Linux.

We have already done that in quite a few places in Xen (e.g. the SMMUv2
driver).

Any opinions?

Regards,

[1]
https://lists.xenproject.org/archives/html/xen-devel/2016-02/msg00047.html

>   ARM: GICv3: introduce separate pending_irq structs for LPIs
>   ARM: GICv3: forward pending LPIs to guests
>   ARM: GICv3: enable ITS and LPIs on the host


--
Julien Grall

* Re: [RFC PATCH 05/24] ARM: GICv3 ITS: introduce ITS command handling
  2016-09-28 18:24 ` [RFC PATCH 05/24] ARM: GICv3 ITS: introduce ITS command handling Andre Przywara
  2016-10-26 23:55   ` Stefano Stabellini
@ 2016-11-02 15:05   ` Julien Grall
  2017-01-31  9:10     ` Andre Przywara
  1 sibling, 1 reply; 144+ messages in thread
From: Julien Grall @ 2016-11-02 15:05 UTC (permalink / raw)
  To: Andre Przywara, Stefano Stabellini; +Cc: xen-devel

Hi Andre,

On 28/09/16 19:24, Andre Przywara wrote:
> To be able to easily send commands to the ITS, create the respective
> wrapper functions, which take care of the ring buffer.
> The first two commands we implement provide methods to map a collection
> to a redistributor (aka host core) and to flush the command queue (SYNC).
> Start using these commands for mapping one collection to each host CPU.
>
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> ---
>  xen/arch/arm/gic-its.c        | 101 ++++++++++++++++++++++++++++++++++++++++++
>  xen/arch/arm/gic-v3.c         |  17 +++++++
>  xen/include/asm-arm/gic-its.h |  32 +++++++++++++
>  3 files changed, 150 insertions(+)
>
> diff --git a/xen/arch/arm/gic-its.c b/xen/arch/arm/gic-its.c
> index c8a7a7e..88397bc 100644
> --- a/xen/arch/arm/gic-its.c
> +++ b/xen/arch/arm/gic-its.c
> @@ -33,6 +33,10 @@ static struct {
>      int host_lpi_bits;
>  } lpi_data;
>
> +/* Physical redistributor address */
> +static DEFINE_PER_CPU(uint64_t, rdist_addr);

The type should be paddr_t.

> +/* Redistributor ID */
> +static DEFINE_PER_CPU(uint64_t, rdist_id);
>  /* Pending table for each redistributor */
>  static DEFINE_PER_CPU(void *, pending_table);
>
> @@ -40,6 +44,86 @@ static DEFINE_PER_CPU(void *, pending_table);
>          min_t(unsigned int, lpi_data.host_lpi_bits, CONFIG_HOST_LPI_BITS)
>  #define MAX_HOST_LPIS   (BIT(MAX_HOST_LPI_BITS) - 8192)
>
> +#define ITS_COMMAND_SIZE        32
> +
> +static int its_send_command(struct host_its *hw_its, void *its_cmd)

The its_cmd could be const as you don't modify it.

> +{
> +    int readp, writep;

Please use uint32_t (or maybe uint64_t) here.

> +
> +    spin_lock(&hw_its->cmd_lock);
> +
> +    readp = readl_relaxed(hw_its->its_base + GITS_CREADR) & GENMASK(19, 5);
> +    writep = readl_relaxed(hw_its->its_base + GITS_CWRITER) & GENMASK(19, 5);

Please introduce a define for GENMASK(19, 5) rather than hardcoding it 
in multiple places.

> +
> +    if ( ((writep + ITS_COMMAND_SIZE) % PAGE_SIZE) == readp )
> +    {
> +        spin_unlock(&hw_its->cmd_lock);
> +        return -EBUSY;
> +    }
> +
> +    memcpy(hw_its->cmd_buf + writep, its_cmd, ITS_COMMAND_SIZE);
> +    __flush_dcache_area(hw_its->cmd_buf + writep, ITS_COMMAND_SIZE);

Why the flush here? From patch #4, the GIC has been configured to be 
able to snoop the cache. So a dsb(ish) would be enough here.

> +    writep = (writep + ITS_COMMAND_SIZE) % PAGE_SIZE;
> +
> +    writeq_relaxed(writep & GENMASK(19, 5), hw_its->its_base + GITS_CWRITER);
> +
> +    spin_unlock(&hw_its->cmd_lock);
> +
> +    return 0;

This function returns either -EBUSY or 0. Wouldn't it be better to 
return a bool instead?

> +}
> +
> +static uint64_t encode_rdbase(struct host_its *hw_its, int cpu, uint64_t reg)
> +{
> +    reg &= ~GENMASK(51, 16);
> +
> +    if ( hw_its->pta )
> +        reg |= per_cpu(rdist_addr, cpu) & GENMASK(51, 16);
> +    else
> +        reg |= per_cpu(rdist_id, cpu) << 16;

I would prefer if we set up the target address at per-CPU initialization 
rather than computing it every time we send a SYNC command (or any other 
command).
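
One way would be a per-CPU value computed once at boot — an illustrative
sketch only (`precompute_rdbase` is a hypothetical name, and GENMASK_ULL
is spelled out here to keep the snippet self-contained):

```c
#include <stdint.h>
#include <stdbool.h>

#define GENMASK_ULL(h, l) \
    ((~0ULL << (l)) & (~0ULL >> (63 - (h))))

/*
 * Compute the RDbase field once per CPU: with GITS_TYPER.PTA == 1 the ITS
 * wants the redistributor's physical address, otherwise its processor
 * number shifted into place. The result can then be OR'ed into every
 * MAPC/SYNC command without re-deriving it each time.
 */
static uint64_t precompute_rdbase(bool pta, uint64_t rdist_addr,
                                  uint64_t rdist_id)
{
    if ( pta )
        return rdist_addr & GENMASK_ULL(51, 16);

    return rdist_id << 16;
}
```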

> +
> +    return reg;
> +}
> +
> +static int its_send_cmd_sync(struct host_its *its, int cpu)
> +{
> +    uint64_t cmd[4];
> +
> +    cmd[0] = GITS_CMD_SYNC;
> +    cmd[1] = 0x00;
> +    cmd[2] = encode_rdbase(its, cpu, 0x0);
> +    cmd[3] = 0x00;
> +
> +    return its_send_command(its, cmd);
> +}
> +
> +static int its_send_cmd_mapc(struct host_its *its, int collection_id, int cpu)
> +{
> +    uint64_t cmd[4];
> +
> +    cmd[0] = GITS_CMD_MAPC;
> +    cmd[1] = 0x00;
> +    cmd[2] = encode_rdbase(its, cpu, (collection_id & GENMASK(15, 0)) | BIT(63));
> +    cmd[3] = 0x00;
> +
> +    return its_send_command(its, cmd);
> +}
> +
> +/* Set up the (1:1) collection mapping for the given host CPU. */
> +void gicv3_its_setup_collection(int cpu)
> +{
> +    struct host_its *its;
> +
> +    list_for_each_entry(its, &host_its_list, entry)
> +    {
> +        /* Only send commands to ITS that have been initialized already. */
> +        if ( !its->cmd_buf )
> +            continue;
> +
> +        its_send_cmd_mapc(its, cpu, cpu);
> +        its_send_cmd_sync(its, cpu);

Looking at the implementation of its_send_cmd_*, the functions may 
return an error if the command queue is full. However you don't check 
the return value and continue as if it was fine. We will get in trouble 
much later.

Furthermore, sending the SYNC command does not mean the ITS has 
executed the command. You have to ensure that GITS_CREADR == 
GITS_CWRITER, and I didn't find this code within this series.
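
A drain loop along these lines would do (sketched here against a mocked
ITS that retires one 32-byte command per poll; none of this is the
series' code, and the names are illustrative):

```c
#include <stdint.h>

#define ITS_CMD_SIZE    32
#define CMD_QUEUE_SIZE  4096

/* Mocked queue pointers; on real hardware these would be MMIO reads of
 * GITS_CREADR and GITS_CWRITER. */
static uint64_t mock_creadr, mock_cwriter;

/* The mock ITS retires one command each time the read pointer is polled. */
static uint64_t read_creadr(void)
{
    if ( mock_creadr != mock_cwriter )
        mock_creadr = (mock_creadr + ITS_CMD_SIZE) % CMD_QUEUE_SIZE;

    return mock_creadr;
}

/*
 * Wait until the ITS has caught up with all queued commands, i.e. until
 * GITS_CREADR == GITS_CWRITER. Returns 0 on success, -1 if the ITS does
 * not drain within max_polls iterations.
 */
static int its_wait_commands(unsigned int max_polls)
{
    while ( max_polls-- )
    {
        if ( read_creadr() == mock_cwriter )
            return 0;
        /* real code: cpu_relax()/udelay() between polls */
    }

    return -1;
}
```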

> +    }
> +}
> +
>  #define BASER_ATTR_MASK                                           \
>          ((0x3UL << GITS_BASER_SHAREABILITY_SHIFT)               | \
>           (0x7UL << GITS_BASER_OUTER_CACHEABILITY_SHIFT)         | \
> @@ -147,6 +231,13 @@ int gicv3_its_init(struct host_its *hw_its)
>      if ( !hw_its->its_base )
>          return -ENOMEM;
>
> +    /* Make sure the ITS is disabled before programming the BASE registers. */
> +    reg = readl_relaxed(hw_its->its_base + GITS_CTLR);
> +    writel_relaxed(reg & ~GITS_CTLR_ENABLE, hw_its->its_base + GITS_CTLR);

The spec (6.2.1 in IHI 0069C) requires the ITS to be disabled and 
quiescent before programming the BASE registers. So I don't think this 
check is enough here.

> +
> +    reg = readq_relaxed(hw_its->its_base + GITS_TYPER);
> +    hw_its->pta = reg & GITS_TYPER_PTA;
> +
>      for (i = 0; i < 8; i++)
>      {
>          void __iomem *basereg = hw_its->its_base + GITS_BASER0 + i * 8;
> @@ -174,9 +265,18 @@ int gicv3_its_init(struct host_its *hw_its)
>      if ( IS_ERR(hw_its->cmd_buf) )
>          return PTR_ERR(hw_its->cmd_buf);
>
> +    its_send_cmd_mapc(hw_its, smp_processor_id(), smp_processor_id());
> +    its_send_cmd_sync(hw_its, smp_processor_id());

See my comments on the previous its_send_* functions

> +
>      return 0;
>  }
>
> +void gicv3_set_redist_addr(paddr_t address, int redist_id)

The second parameter should probably be unsigned, maybe uint64_t?

> +{
> +    this_cpu(rdist_addr) = address;
> +    this_cpu(rdist_id) = redist_id;
> +}
> +
>  uint64_t gicv3_lpi_allocate_pendtable(void)
>  {
>      uint64_t reg, attr;
> @@ -265,6 +365,7 @@ void gicv3_its_dt_init(const struct dt_device_node *node)
>          its_data->addr = addr;
>          its_data->size = size;
>          its_data->dt_node = its;
> +        spin_lock_init(&its_data->cmd_lock);
>
>          printk("GICv3: Found ITS @0x%lx\n", addr);
>
> diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
> index 5cf4618..b9387a3 100644
> --- a/xen/arch/arm/gic-v3.c
> +++ b/xen/arch/arm/gic-v3.c
> @@ -638,6 +638,8 @@ static void gicv3_rdist_init_lpis(void __iomem * rdist_base)
>      table_reg = gicv3_lpi_get_proptable();
>      if ( table_reg )
>          writeq_relaxed(table_reg, rdist_base + GICR_PROPBASER);
> +
> +    gicv3_its_setup_collection(smp_processor_id());
>  }
>
>  static int __init gicv3_populate_rdist(void)
> @@ -684,7 +686,22 @@ static int __init gicv3_populate_rdist(void)
>                  this_cpu(rbase) = ptr;
>
>                  if ( typer & GICR_TYPER_PLPIS )
> +                {
> +                    paddr_t rdist_addr;
> +
> +                    rdist_addr = gicv3.rdist_regions[i].base;
> +                    rdist_addr += ptr - gicv3.rdist_regions[i].map_base;
> +
> +                    /* The ITS refers to redistributors either by their physical

Coding style:

/*
  * Foo

> +                     * address or by their ID. Determine those two values and
> +                     * let the ITS code store them in per host CPU variables to
> +                     * later be able to address those redistributors.
> +                     */
> +                    gicv3_set_redist_addr(rdist_addr,
> +                                          (typer >> 8) & GENMASK(15, 0));

Please avoid hardcoding the mask; use a define.

> +
>                      gicv3_rdist_init_lpis(ptr);
> +                }
>
>                  printk("GICv3: CPU%d: Found redistributor in region %d @%p\n",
>                          smp_processor_id(), i, ptr);
> diff --git a/xen/include/asm-arm/gic-its.h b/xen/include/asm-arm/gic-its.h
> index b2a003f..b49d274 100644
> --- a/xen/include/asm-arm/gic-its.h
> +++ b/xen/include/asm-arm/gic-its.h
> @@ -37,6 +37,7 @@
>
>  /* Register bits */
>  #define GITS_CTLR_ENABLE     0x1
> +#define GITS_TYPER_PTA       BIT(19)
>  #define GITS_IIDR_VALUE      0x34c
>
>  #define GITS_BASER_VALID                BIT(63)
> @@ -59,6 +60,22 @@
>                                          (31UL << GITS_BASER_ENTRY_SIZE_SHIFT) |\
>                                          GITS_BASER_INDIRECT)
>
> +/* ITS command definitions */
> +#define ITS_CMD_SIZE                    32
> +
> +#define GITS_CMD_MOVI                   0x01
> +#define GITS_CMD_INT                    0x03
> +#define GITS_CMD_CLEAR                  0x04
> +#define GITS_CMD_SYNC                   0x05
> +#define GITS_CMD_MAPD                   0x08
> +#define GITS_CMD_MAPC                   0x09
> +#define GITS_CMD_MAPTI                  0x0a
> +#define GITS_CMD_MAPI                   0x0b
> +#define GITS_CMD_INV                    0x0c
> +#define GITS_CMD_INVALL                 0x0d
> +#define GITS_CMD_MOVALL                 0x0e
> +#define GITS_CMD_DISCARD                0x0f
> +
>  #ifndef __ASSEMBLY__
>  #include <xen/device_tree.h>
>
> @@ -69,7 +86,9 @@ struct host_its {
>      paddr_t addr;
>      paddr_t size;
>      void __iomem *its_base;
> +    spinlock_t cmd_lock;
>      void *cmd_buf;
> +    bool pta;
>  };
>
>  extern struct list_head host_its_list;
> @@ -89,6 +108,12 @@ uint64_t gicv3_lpi_allocate_pendtable(void);
>  int gicv3_lpi_init_host_lpis(int nr_lpis);
>  int gicv3_its_init(struct host_its *hw_its);
>
> +/* Set the physical address and ID for each redistributor as read from DT. */
> +void gicv3_set_redist_addr(paddr_t address, int redist_id);
> +
> +/* Map a collection for this host CPU to each host ITS. */
> +void gicv3_its_setup_collection(int cpu);
> +
>  #else
>
>  static inline void gicv3_its_dt_init(const struct dt_device_node *node)
> @@ -110,6 +135,13 @@ static inline int gicv3_its_init(struct host_its *hw_its)
>  {
>      return 0;
>  }

Newline here

> +static inline void gicv3_set_redist_addr(paddr_t address, int redist_id)
> +{
> +}

Ditto

> +static inline void gicv3_its_setup_collection(int cpu)
> +{
> +}
> +
>  #endif /* CONFIG_HAS_ITS */
>
>  #endif /* __ASSEMBLY__ */
>

Regards,

-- 
Julien Grall


* Re: [RFC PATCH 06/24] ARM: GICv3 ITS: introduce host LPI array
  2016-10-27 22:59   ` Stefano Stabellini
@ 2016-11-02 15:14     ` Julien Grall
  2016-11-10 17:22     ` Andre Przywara
  1 sibling, 0 replies; 144+ messages in thread
From: Julien Grall @ 2016-11-02 15:14 UTC (permalink / raw)
  To: Stefano Stabellini, Andre Przywara; +Cc: xen-devel

Hi,

On 27/10/16 23:59, Stefano Stabellini wrote:
> On Wed, 28 Sep 2016, Andre Przywara wrote:
>> The number of LPIs on a host can be potentially huge (millions),
>> although in practice it will be mostly reasonable. So prematurely allocating
>> an array of struct irq_desc's for each LPI is not an option.
>> However Xen itself does not care about LPIs, as every LPI will be injected
>> into a guest (Dom0 for now).
>> Create a dense data structure (8 Bytes) for each LPI which holds just
>> enough information to determine the virtual IRQ number and the VCPU into
>> which the LPI needs to be injected.
>> Also to not artificially limit the number of LPIs, we create a 2-level
>> table for holding those structures.
>> This patch introduces functions to initialize these tables and to
>> create, lookup and destroy entries for a given LPI.
>> We allocate and access LPI information in a way that does not require
>> a lock.
>>
>> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
>> ---
>>  xen/arch/arm/gic-its.c        | 154 ++++++++++++++++++++++++++++++++++++++++++
>>  xen/include/asm-arm/gic-its.h |  18 +++++
>>  2 files changed, 172 insertions(+)
>>
>> diff --git a/xen/arch/arm/gic-its.c b/xen/arch/arm/gic-its.c
>> index 88397bc..2140e4a 100644
>> --- a/xen/arch/arm/gic-its.c
>> +++ b/xen/arch/arm/gic-its.c
>> @@ -18,18 +18,31 @@
>>
>>  #include <xen/config.h>
>>  #include <xen/lib.h>
>> +#include <xen/sched.h>
>>  #include <xen/err.h>
>>  #include <xen/device_tree.h>
>>  #include <xen/libfdt/libfdt.h>
>>  #include <asm/p2m.h>
>> +#include <asm/domain.h>
>>  #include <asm/io.h>
>>  #include <asm/gic.h>
>>  #include <asm/gic_v3_defs.h>
>>  #include <asm/gic-its.h>
>>
>> +/* LPIs on the host always go to a guest, so no struct irq_desc for them. */
>> +union host_lpi {
>> +    uint64_t data;
>> +    struct {
>> +        uint64_t virt_lpi:32;
>> +        uint64_t dom_id:16;
>> +        uint64_t vcpu_id:16;
>> +    };
>> +};
>
> Why not the following?
>
>   union host_lpi {
>       uint64_t data;
>       struct {
>           uint32_t virt_lpi;
>           uint16_t dom_id;
>           uint16_t vcpu_id;
>       };
>   };
>
>
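
One nice property of the explicit-width variant suggested above is that
the intended layout can be asserted directly (a sketch using C11
static_assert/offsetof; in Xen itself one would use BUILD_BUG_ON):

```c
#include <stdint.h>
#include <stddef.h>
#include <assert.h>

union host_lpi {
    uint64_t data;
    struct {
        uint32_t virt_lpi;
        uint16_t dom_id;
        uint16_t vcpu_id;
    };
};

/* The union must stay exactly 64 bits so a whole entry can be read and
 * written atomically via 'data'. */
static_assert(sizeof(union host_lpi) == sizeof(uint64_t),
              "host_lpi must fit in one uint64_t");
```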
>>  /* Global state */
>>  static struct {
>>      uint8_t *lpi_property;
>> +    union host_lpi **host_lpis;
>>      int host_lpi_bits;
>>  } lpi_data;
>>
>> @@ -43,6 +56,26 @@ static DEFINE_PER_CPU(void *, pending_table);
>>  #define MAX_HOST_LPI_BITS                                                \
>>          min_t(unsigned int, lpi_data.host_lpi_bits, CONFIG_HOST_LPI_BITS)
>>  #define MAX_HOST_LPIS   (BIT(MAX_HOST_LPI_BITS) - 8192)
>> +#define HOST_LPIS_PER_PAGE      (PAGE_SIZE / sizeof(union host_lpi))
>> +
>> +static union host_lpi *gic_find_host_lpi(uint32_t lpi, struct domain *d)
>
> I take it "lpi" is the physical LPI here. Maybe we should rename it to
> "plpi" for clarity.

+1 here. We tend to use the prefix 'p' for physical and 'v' for virtual 
(e.g virq/pirq, vcpu/pcpu). I'd like to see the same for the LPIs.

While we are here, I think the function should be named gic_find_plpi.

>
>
>> +{
>> +    union host_lpi *hlpi;
>> +
>> +    if ( lpi < 8192 || lpi >= MAX_HOST_LPIS + 8192 )
>> +        return NULL;
>> +
>> +    lpi -= 8192;
>> +    if ( !lpi_data.host_lpis[lpi / HOST_LPIS_PER_PAGE] )
>> +        return NULL;
>> +
>> +    hlpi = &lpi_data.host_lpis[lpi / HOST_LPIS_PER_PAGE][lpi % HOST_LPIS_PER_PAGE];
>
> I realize I am sometimes obsessive about this, but division operations
> are expensive and this is on the hot path, so I would do:
>
> #define HOST_LPIS_PER_PAGE      (PAGE_SIZE >> 3)
>
> unsigned int table = lpi / HOST_LPIS_PER_PAGE;
>
> then use table throughout this function.
>
>
>> +    if ( d && hlpi->dom_id != d->domain_id )
>> +        return NULL;
>
> I think this function is very useful so I would avoid making any domain
> checks here: one day we might want to retrieve hlpi even if hlpi->dom_id
> != d->domain_id. I would move the domain check outside.

+1.

Regards,

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [RFC PATCH 02/24] ARM: GICv3: allocate LPI pending and property table
  2016-10-24 14:28   ` Vijay Kilari
@ 2016-11-02 16:22     ` Andre Przywara
  0 siblings, 0 replies; 144+ messages in thread
From: Andre Przywara @ 2016-11-02 16:22 UTC (permalink / raw)
  To: Vijay Kilari; +Cc: xen-devel, Julien Grall, Stefano Stabellini

On 24/10/16 15:28, Vijay Kilari wrote:

Hi Vijay,

thanks for having a look!

> On Wed, Sep 28, 2016 at 11:54 PM, Andre Przywara <andre.przywara@arm.com> wrote:
>> The ARM GICv3 ITS provides a new kind of interrupt called LPIs.
>> The pending bits and the configuration data (priority, enable bits) for
>> those LPIs are stored in tables in normal memory, which software has to
>> provide to the hardware.
>> Allocate the required memory, initialize it and hand it over to each
>> ITS. We limit the number of LPIs we use with a compile time constant to
>> avoid wasting memory.
>>
>> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
>> ---
>>  xen/arch/arm/Kconfig              |  6 ++++
>>  xen/arch/arm/efi/efi-boot.h       |  1 -
>>  xen/arch/arm/gic-its.c            | 76 +++++++++++++++++++++++++++++++++++++++
>>  xen/arch/arm/gic-v3.c             | 27 ++++++++++++++
>>  xen/include/asm-arm/cache.h       |  4 +++
>>  xen/include/asm-arm/gic-its.h     | 22 +++++++++++-
>>  xen/include/asm-arm/gic_v3_defs.h | 48 ++++++++++++++++++++++++-
>>  7 files changed, 181 insertions(+), 3 deletions(-)
>>
>> diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
>> index 9fe3b8e..66e2bb8 100644
>> --- a/xen/arch/arm/Kconfig
>> +++ b/xen/arch/arm/Kconfig
>> @@ -50,6 +50,12 @@ config HAS_ITS
>>          depends on ARM_64
>>          depends on HAS_GICV3
>>
>> +config HOST_LPI_BITS
>> +        depends on HAS_ITS
>> +        int "Maximum bits for GICv3 host LPIs (14-32)"
>> +        range 14 32
>> +        default "20"
>> +
>>  config ALTERNATIVE
>>         bool
>>
>> diff --git a/xen/arch/arm/efi/efi-boot.h b/xen/arch/arm/efi/efi-boot.h
>> index 045d6ce..dc64aec 100644
>> --- a/xen/arch/arm/efi/efi-boot.h
>> +++ b/xen/arch/arm/efi/efi-boot.h
>> @@ -10,7 +10,6 @@
>>  #include "efi-dom0.h"
>>
>>  void noreturn efi_xen_start(void *fdt_ptr, uint32_t fdt_size);
>> -void __flush_dcache_area(const void *vaddr, unsigned long size);
>>
>>  #define DEVICE_TREE_GUID \
>>  {0xb1b621d5, 0xf19c, 0x41a5, {0x83, 0x0b, 0xd9, 0x15, 0x2c, 0x69, 0xaa, 0xe0}}
>> diff --git a/xen/arch/arm/gic-its.c b/xen/arch/arm/gic-its.c
>> index 0f42a77..b52dff3 100644
>> --- a/xen/arch/arm/gic-its.c
>> +++ b/xen/arch/arm/gic-its.c
>> @@ -20,10 +20,86 @@
>>  #include <xen/lib.h>
>>  #include <xen/device_tree.h>
>>  #include <xen/libfdt/libfdt.h>
>> +#include <asm/p2m.h>
>>  #include <asm/gic.h>
>>  #include <asm/gic_v3_defs.h>
>>  #include <asm/gic-its.h>
>>
>> +/* Global state */
>> +static struct {
>> +    uint8_t *lpi_property;
>> +    int host_lpi_bits;
>> +} lpi_data;
>> +
>> +/* Pending table for each redistributor */
>> +static DEFINE_PER_CPU(void *, pending_table);
>> +
>> +#define MAX_HOST_LPI_BITS                                                \
>> +        min_t(unsigned int, lpi_data.host_lpi_bits, CONFIG_HOST_LPI_BITS)
>> +#define MAX_HOST_LPIS   (BIT(MAX_HOST_LPI_BITS) - 8192)
>> +
>> +uint64_t gicv3_lpi_allocate_pendtable(void)
>> +{
>> +    uint64_t reg, attr;
>> +    void *pendtable;
>> +
>> +    attr  = GIC_BASER_CACHE_RaWaWb << GICR_PENDBASER_INNER_CACHEABILITY_SHIFT;
>> +    attr |= GIC_BASER_CACHE_SameAsInner << GICR_PENDBASER_OUTER_CACHEABILITY_SHIFT;
>> +    attr |= GIC_BASER_InnerShareable << GICR_PENDBASER_SHAREABILITY_SHIFT;
>> +
>> +    /*
>> +     * The pending table holds one bit per LPI, so we need three bits less
>> +     * than the number of LPI_BITs. But the alignment requirement from the
>> +     * ITS is 64K, so make order at least 16 (-12).
>> +     */
>> +    pendtable = alloc_xenheap_pages(MAX(lpi_data.host_lpi_bits - 3, 16) - 12, 0);
> 
>     The pending table size allocated differs from the proptable size?

According to the spec, the pending table always covers all LPIs that the
ITS advertises; consequently GICR_PENDBASER has no field to indicate the
size of the table.

> 
>> +    if ( !pendtable )
>> +        return 0;
>> +
>> +    memset(pendtable, 0, BIT(lpi_data.host_lpi_bits - 3));
>          memset size is different from allocated size?

We only zero what the ITS needs. The potentially bigger allocation above
is just to match the alignment requirement; I didn't find a nice function
to allocate pages with a specific alignment beyond a single page.
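As a sanity check of the arithmetic being discussed, a minimal model of
the order computation (one pending bit per LPI gives host_lpi_bits - 3
bits of table size; the 64K alignment requirement puts a floor of
16 - PAGE_SHIFT on the order):

```c
#include <assert.h>

#define PAGE_SHIFT 12
#define MAX(a, b)  ((a) > (b) ? (a) : (b))

/* Page order of the pending-table allocation, mirroring
 * alloc_xenheap_pages(MAX(lpi_data.host_lpi_bits - 3, 16) - 12, 0). */
static int pendtable_order(int host_lpi_bits)
{
    return MAX(host_lpi_bits - 3, 16) - PAGE_SHIFT;
}
```

So even for the minimum of 14 LPI bits the allocation is order 4
(16 pages, 64KB), purely to satisfy the ITS alignment requirement.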

>          flushing zeroed pendtable?
>> +    this_cpu(pending_table) = pendtable;
>> +
>> +    reg  = attr | GICR_PENDBASER_PTZ;
>> +    reg |= virt_to_maddr(pendtable) & GENMASK(51, 16);
>      can use __pa instead of virt_to_maddr()
>      Isn't GENMASK(47, 12) here?

Please download the newest revision (issue C) of the spec. Issue B
extended the physical address space to cover 52 bits in many places.

>> +
>> +    return reg;
>> +}
>> +
>> +uint64_t gicv3_lpi_get_proptable()
>> +{
>> +    uint64_t attr;
>> +    static uint64_t reg = 0;
>> +
>> +    /* The property table is shared across all redistributors. */
>> +    if ( reg )
>> +        return reg;
>> +
>> +    attr  = GIC_BASER_CACHE_RaWaWb << GICR_PENDBASER_INNER_CACHEABILITY_SHIFT;
>> +    attr |= GIC_BASER_CACHE_SameAsInner << GICR_PENDBASER_OUTER_CACHEABILITY_SHIFT;
>> +    attr |= GIC_BASER_InnerShareable << GICR_PENDBASER_SHAREABILITY_SHIFT;
> 
>      using PENDBASER definitions for PROPBASER?

Good catch.

>> +
>> +    lpi_data.lpi_property = alloc_xenheap_pages(MAX_HOST_LPI_BITS - 12, 0);
>> +    if ( !lpi_data.lpi_property )
>> +        return 0;
>> +
>> +    memset(lpi_data.lpi_property, GIC_PRI_IRQ | LPI_PROP_RES1, MAX_HOST_LPIS);
>> +    __flush_dcache_area(lpi_data.lpi_property, MAX_HOST_LPIS);
>> +
>> +    reg  = attr | ((MAX_HOST_LPI_BITS - 1) << 0);
>> +    reg |= virt_to_maddr(lpi_data.lpi_property) & GENMASK(51, 12);
>    Isn't GENMASK(47, 12)?
>> +
>> +    return reg;
>> +}
>> +
>> +int gicv3_lpi_init_host_lpis(int lpi_bits)
>> +{
>> +    lpi_data.host_lpi_bits = lpi_bits;
>> +
>> +    printk("GICv3: using at most %ld LPIs on the host.\n", MAX_HOST_LPIS);
>> +
>> +    return 0;
>> +}
>> +
>>  void gicv3_its_dt_init(const struct dt_device_node *node)
>>  {
>>      const struct dt_device_node *its = NULL;
>> diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
>> index 238da84..2534aa5 100644
>> --- a/xen/arch/arm/gic-v3.c
>> +++ b/xen/arch/arm/gic-v3.c
>> @@ -546,6 +546,9 @@ static void __init gicv3_dist_init(void)
>>      type = readl_relaxed(GICD + GICD_TYPER);
>>      nr_lines = 32 * ((type & GICD_TYPE_LINES) + 1);
>>
>> +    if ( type & GICD_TYPE_LPIS )
>> +        gicv3_lpi_init_host_lpis(((type >> GICD_TYPE_ID_BITS_SHIFT) & 0x1f) + 1);
>> +
>>      printk("GICv3: %d lines, (IID %8.8x).\n",
>>             nr_lines, readl_relaxed(GICD + GICD_IIDR));
>>
>> @@ -615,6 +618,26 @@ static int gicv3_enable_redist(void)
>>
>>      return 0;
>>  }
>> +static void gicv3_rdist_init_lpis(void __iomem * rdist_base)
>> +{
>> +    uint32_t reg;
>> +    uint64_t table_reg;
>> +
>> +    if ( list_empty(&host_its_list) )
>> +        return;
>> +
>> +    /* Make sure LPIs are disabled before setting up the BASERs. */
>> +    reg = readl_relaxed(rdist_base + GICR_CTLR);
>> +    writel_relaxed(reg & ~GICR_CTLR_ENABLE_LPIS, rdist_base + GICR_CTLR);
>> +
>> +    table_reg = gicv3_lpi_allocate_pendtable();
>> +    if ( table_reg )
>> +        writeq_relaxed(table_reg, rdist_base + GICR_PENDBASER);
>> +
>> +    table_reg = gicv3_lpi_get_proptable();
> 
>    Here the LPI property table is allocated per CPU. One property table
> should be enough and can be shared by all CPUs.

The function is called _get_ and not _allocate_: multiple calls to it
return the same pointer, allocated on the first invocation by the magic
of a function-local static variable. See the above:
	if ( reg ) return reg;
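A reduced model of that get-not-allocate pattern (the malloc stand-in is
purely illustrative; the real function computes the GICR_PROPBASER value):

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* The register value is computed once and cached in a function-local
 * static, so every caller (i.e. every redistributor) ends up
 * programming the same shared property table. */
static uint64_t get_proptable_reg(void)
{
    static uint64_t reg = 0;

    if ( reg )               /* already set up: share it */
        return reg;

    reg = (uint64_t)(uintptr_t)malloc(4096);  /* stand-in for the real setup */
    return reg;
}
```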

>> +    if ( table_reg )
>> +        writeq_relaxed(table_reg, rdist_base + GICR_PROPBASER);
>     After updating GICR_PENDBASER and GICR_PROPBASER regs
> shouldn't we read back and check if sharability bits are support by HW or not
> like it is done in linux driver?

Possibly.

Cheers,
Andre.


* Re: [RFC PATCH 11/24] ARM: vGICv3: handle virtual LPI pending and property tables
  2016-09-28 18:24 ` [RFC PATCH 11/24] ARM: vGICv3: handle virtual LPI pending and property tables Andre Przywara
  2016-10-24 15:32   ` Vijay Kilari
  2016-10-29  0:39   ` Stefano Stabellini
@ 2016-11-02 17:18   ` Julien Grall
  2016-11-02 17:41     ` Stefano Stabellini
  2017-01-31  9:10     ` Andre Przywara
  2 siblings, 2 replies; 144+ messages in thread
From: Julien Grall @ 2016-11-02 17:18 UTC (permalink / raw)
  To: Andre Przywara, Stefano Stabellini; +Cc: xen-devel, Steve Capper

Hi Andre,

On 28/09/16 19:24, Andre Przywara wrote:
> Allow a guest to provide the address and size for the memory regions
> it has reserved for the GICv3 pending and property tables.
> We sanitise the various fields of the respective redistributor
> registers and map those pages into Xen's address space to have easy
> access.
>
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> ---
>  xen/arch/arm/vgic-v3.c        | 189 ++++++++++++++++++++++++++++++++++++++----
>  xen/arch/arm/vgic.c           |   4 +
>  xen/include/asm-arm/domain.h  |   7 +-
>  xen/include/asm-arm/gic-its.h |  10 ++-
>  xen/include/asm-arm/vgic.h    |   3 +
>  5 files changed, 197 insertions(+), 16 deletions(-)
>
> diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c
> index e9b6490..8fe8386 100644
> --- a/xen/arch/arm/vgic-v3.c
> +++ b/xen/arch/arm/vgic-v3.c
> @@ -20,12 +20,14 @@
>
>  #include <xen/bitops.h>
>  #include <xen/config.h>
> +#include <xen/domain_page.h>
>  #include <xen/lib.h>
>  #include <xen/init.h>
>  #include <xen/softirq.h>
>  #include <xen/irq.h>
>  #include <xen/sched.h>
>  #include <xen/sizes.h>
> +#include <xen/vmap.h>
>  #include <asm/current.h>
>  #include <asm/mmio.h>
>  #include <asm/gic_v3_defs.h>
> @@ -228,12 +230,14 @@ static int __vgic_v3_rdistr_rd_mmio_read(struct vcpu *v, mmio_info_t *info,
>          goto read_reserved;
>
>      case VREG64(GICR_PROPBASER):
> -        /* LPI's not implemented */
> -        goto read_as_zero_64;
> +        if ( !vgic_reg64_check_access(dabt) ) goto bad_width;
> +        *r = vgic_reg64_extract(v->domain->arch.vgic.rdist_propbase, info);
> +        return 1;
>
>      case VREG64(GICR_PENDBASER):
> -        /* LPI's not implemented */
> -        goto read_as_zero_64;
> +        if ( !vgic_reg64_check_access(dabt) ) goto bad_width;
> +        *r = vgic_reg64_extract(v->arch.vgic.rdist_pendbase, info);

The field PTZ read as 0.

> +        return 1;
>
>      case 0x0080:
>          goto read_reserved;
> @@ -301,11 +305,6 @@ bad_width:
>      domain_crash_synchronous();
>      return 0;
>
> -read_as_zero_64:
> -    if ( !vgic_reg64_check_access(dabt) ) goto bad_width;
> -    *r = 0;
> -    return 1;
> -
>  read_as_zero_32:
>      if ( dabt.size != DABT_WORD ) goto bad_width;
>      *r = 0;
> @@ -330,11 +329,149 @@ read_unknown:
>      return 1;
>  }
>
> +static uint64_t vgic_sanitise_field(uint64_t reg, uint64_t field_mask,
> +                                    int field_shift,
> +                                    uint64_t (*sanitise_fn)(uint64_t))
> +{
> +    uint64_t field = (reg & field_mask) >> field_shift;
> +
> +    field = sanitise_fn(field) << field_shift;

Newline here please.

> +    return (reg & ~field_mask) | field;
> +}
> +
> +/* We want to avoid outer shareable. */
> +static uint64_t vgic_sanitise_shareability(uint64_t field)
> +{
> +    switch (field) {
> +    case GIC_BASER_OuterShareable:
> +        return GIC_BASER_InnerShareable;
> +    default:
> +        return field;
> +    }
> +}

I am not sure I understand why we need to sanitise the value here. From 
my understanding of the spec (see 8.11.18 in IHI 0069C) we should 
support any shareability/cacheability, correct?

> +
> +/* Avoid any inner non-cacheable mapping. */
> +static uint64_t vgic_sanitise_inner_cacheability(uint64_t field)
> +{
> +    switch (field) {
> +    case GIC_BASER_CACHE_nCnB:
> +    case GIC_BASER_CACHE_nC:
> +        return GIC_BASER_CACHE_RaWb;
> +    default:
> +        return field;
> +    }
> +}
> +
> +/* Non-cacheable or same-as-inner are OK. */
> +static uint64_t vgic_sanitise_outer_cacheability(uint64_t field)
> +{
> +    switch (field) {
> +    case GIC_BASER_CACHE_SameAsInner:
> +    case GIC_BASER_CACHE_nC:
> +        return field;
> +    default:
> +        return GIC_BASER_CACHE_nC;
> +    }
> +}
> +
> +static uint64_t sanitize_propbaser(uint64_t reg)
> +{
> +    reg = vgic_sanitise_field(reg, GICR_PROPBASER_SHAREABILITY_MASK,
> +                              GICR_PROPBASER_SHAREABILITY_SHIFT,
> +                              vgic_sanitise_shareability);
> +    reg = vgic_sanitise_field(reg, GICR_PROPBASER_INNER_CACHEABILITY_MASK,
> +                              GICR_PROPBASER_INNER_CACHEABILITY_SHIFT,
> +                              vgic_sanitise_inner_cacheability);
> +    reg = vgic_sanitise_field(reg, GICR_PROPBASER_OUTER_CACHEABILITY_MASK,
> +                              GICR_PROPBASER_OUTER_CACHEABILITY_SHIFT,
> +                              vgic_sanitise_outer_cacheability);
> +
> +    reg &= ~PROPBASER_RES0_MASK;
> +    reg &= ~GENMASK(51, 48);

Why do you mask bits 51:48? There is no restriction in Xen on the size
of the IPA (though 52-bit support is part of ARMv8.2), so we should
avoid open-coding masks everywhere in the code. Otherwise it will be
more painful to extend the number of supported bits.

FWIW, all the p2m code is checking whether the IPA is supported.

> +    return reg;
> +}
> +
> +static uint64_t sanitize_pendbaser(uint64_t reg)
> +{
> +    reg = vgic_sanitise_field(reg, GICR_PENDBASER_SHAREABILITY_MASK,
> +                              GICR_PENDBASER_SHAREABILITY_SHIFT,
> +                              vgic_sanitise_shareability);
> +    reg = vgic_sanitise_field(reg, GICR_PENDBASER_INNER_CACHEABILITY_MASK,
> +                              GICR_PENDBASER_INNER_CACHEABILITY_SHIFT,
> +                              vgic_sanitise_inner_cacheability);
> +    reg = vgic_sanitise_field(reg, GICR_PENDBASER_OUTER_CACHEABILITY_MASK,
> +                              GICR_PENDBASER_OUTER_CACHEABILITY_SHIFT,
> +                              vgic_sanitise_outer_cacheability);
> +
> +    reg &= ~PENDBASER_RES0_MASK;
> +    reg &= ~GENMASK(51, 48);

Ditto.

> +    return reg;
> +}
> +
> +/*
> + * Allow mapping some parts of guest memory into Xen's VA space to have easy
> + * access to it. This is to allow ITS configuration data to be held in
> + * guest memory and avoid using Xen memory for that.
> + */
> +void *map_guest_pages(struct domain *d, paddr_t guest_addr, int nr_pages)

Please pass a gfn_t rather than paddr_t.

> +{
> +    mfn_t onepage;
> +    mfn_t *pages;

s/pages/mfns/

> +    int i;
> +    void *ptr;
> +
> +    /* TODO: free previous mapping, change prototype? use get-put-put? */
> +
> +    guest_addr &= PAGE_MASK;
> +
> +    if ( nr_pages == 1 )
> +    {
> +        pages = &onepage;
> +    } else
> +    {
> +        pages = xmalloc_array(mfn_t, nr_pages);
> +        if ( !pages )
> +            return NULL;
> +    }
> +
> +    for (i = 0; i < nr_pages; i++)
> +    {
> +        get_page_from_gfn(d, (guest_addr >> PAGE_SHIFT) + i, NULL, P2M_ALLOC);

get_page_from_gfn can fail if you try to get a page of memory that is
not backed by a RAM region. Also, get_page_from_gfn will work on foreign
mappings; we don't want the guest using foreign memory (e.g. memory
belonging to another domain) for the ITS internal memory.

Also, please try to pay attention to error paths while you write code.
It is a pain to handle them after the code has been written. I will try
to point them out when I spot them.

> +        pages[i] = _mfn((guest_addr + i * PAGE_SIZE) >> PAGE_SHIFT);

You cannot assume a 1:1 mapping between the IPA and the PA. Please use
the struct page_info returned by get_page_from_gfn.

> +    }
> +
> +    ptr = vmap(pages, nr_pages);

I am not a big fan of the vmap solution for various reasons:
	- the VMAP area is small (only 1GB), so it will not scale (you seem to
use it to map pretty much all memory provisioned for the ITS)
	- writing to a register cannot fail, so how do you cope with that?

I think the best approach here is to use a similar approach as 
copy_*_guests helpers but dealing with IPA rather than guest VA.
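A minimal sketch of that page-at-a-time approach (all names are
hypothetical; map_page()/unmap_page() stand in for the real gfn lookup
and domain-page mapping, and a flat array stands in for guest RAM):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE 4096

static uint8_t guest_ram[4 * PAGE_SIZE];        /* toy "guest memory" */

/* Stand-ins for the real per-page IPA -> mapping lookup. */
static void *map_page(uint64_t ipa)
{
    return &guest_ram[ipa & ~(uint64_t)(PAGE_SIZE - 1)];
}
static void unmap_page(void *p) { (void)p; }

/* Copy into guest memory by IPA, one page at a time, so no contiguous
 * mapping of the whole region is ever needed. */
static void copy_to_ipa(uint64_t ipa, const void *buf, size_t len)
{
    while ( len )
    {
        size_t off = ipa & (PAGE_SIZE - 1);
        size_t chunk = len < PAGE_SIZE - off ? len : PAGE_SIZE - off;
        void *p = map_page(ipa);

        memcpy((uint8_t *)p + off, buf, chunk);
        unmap_page(p);
        ipa += chunk;
        buf = (const uint8_t *)buf + chunk;
        len -= chunk;
    }
}
```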

> +
> +    if ( nr_pages > 1 )
> +        xfree(pages);
> +
> +    return ptr;
> +}
> +
> +void unmap_guest_pages(void *va, int nr_pages)
> +{
> +    paddr_t pa;
> +    unsigned long i;
> +
> +    if ( !va )
> +        return;
> +
> +    va = (void *)((uintptr_t)va & PAGE_MASK);
> +    pa = virt_to_maddr(va);
> +
> +    vunmap(va);
> +    for (i = 0; i < nr_pages; i++)
> +        put_page(mfn_to_page((pa >> PAGE_SHIFT) + i));
> +
> +    return;
> +}
> +
>  static int __vgic_v3_rdistr_rd_mmio_write(struct vcpu *v, mmio_info_t *info,
>                                            uint32_t gicr_reg,
>                                            register_t r)
>  {
>      struct hsr_dabt dabt = info->dabt;
> +    uint64_t reg;
>
>      switch ( gicr_reg )
>      {
> @@ -375,13 +512,37 @@ static int __vgic_v3_rdistr_rd_mmio_write(struct vcpu *v, mmio_info_t *info,
>      case 0x0050:
>          goto write_reserved;
>
> -    case VREG64(GICR_PROPBASER):
> -        /* LPI is not implemented */
> -        goto write_ignore_64;
> +    case VREG64(GICR_PROPBASER): {

Coding style: the { should be on its own line.

> +        int nr_pages;

unsigned int

> +
> +        if ( info->dabt.size < DABT_WORD ) goto bad_width;

Newline here for clarity. Also please use vgic_reg64_check_access rather 
than open-coding it.

> +        if ( v->arch.vgic.flags & VGIC_V3_LPIS_ENABLED )

 From my understanding, VGIC_V3_LPIS_ENABLED is set when the guest enables 
LPIs on this re-distributor. However, this check is not safe, as 
GICR_CTLR.Enable_LPIs may be set concurrently (the re-distributors are 
accessible from any vCPU).

Also, when the ITS is not available we should avoid handling the register 
(i.e. treat it as write-ignore). My rationale here is that we should limit 
the amount of emulation exposed to the guest whenever possible.

> +            return 1;

I think we should at least print a warning, as writing to GICR_PROPBASER 
while GICR_CTLR.Enable_LPIs is set is unpredictable. IMHO, I would even 
crash the guest.

The code below likely needs locking, as the property table is common to 
all re-distributors and hence could be modified concurrently. Also, I would 
like to see a comment on top of the emulation of GICR_TYPER mentioning that 
all re-distributors share the same common property table 
(GICR_TYPER.CommonLPIAff = 0).

> +
> +        reg = v->domain->arch.vgic.rdist_propbase;
> +        vgic_reg64_update(&reg, r, info);
> +        reg = sanitize_propbaser(reg);
> +        v->domain->arch.vgic.rdist_propbase = reg;
>
> +        nr_pages = BIT((v->domain->arch.vgic.rdist_propbase & 0x1f) + 1) - 8192;

The spec (see 8.11.19) says: "If the value of this field is larger than 
the value of GICD_TYPER.IDbits, the GICD_TYPER.IDbits value applies." We 
don't want to map more than necessary.
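A sketch of the clamping described above (hypothetical helper; assumes at
least 14 ID bits so subtracting the 8192 non-LPI INTIDs stays
non-negative):

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE 4096

/* Number of property-table pages to map for a given GICR_PROPBASER
 * value, clamping its IDbits field to GICD_TYPER.IDbits as the spec
 * requires (8.11.19). */
static unsigned int propbase_nr_pages(uint64_t propbase,
                                      unsigned int gicd_idbits)
{
    unsigned int idbits = (propbase & 0x1f) + 1;
    unsigned long nr_bytes;

    if ( idbits > gicd_idbits )
        idbits = gicd_idbits;

    nr_bytes = (1UL << idbits) - 8192;            /* one byte per LPI */
    return (nr_bytes + PAGE_SIZE - 1) / PAGE_SIZE;
}
```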

> +        nr_pages = DIV_ROUND_UP(nr_pages, PAGE_SIZE);
> +        unmap_guest_pages(v->domain->arch.vgic.proptable, nr_pages);

This looks wrong to me. A guest could specify a size different from the 
previous write.

> +        v->domain->arch.vgic.proptable = map_guest_pages(v->domain,
> +                                                         reg & GENMASK(47, 12),
> +                                                         nr_pages);
> +        return 1;
> +    }
>      case VREG64(GICR_PENDBASER):
> -        /* LPI is not implemented */
> -        goto write_ignore_64;
> +        if ( info->dabt.size < DABT_WORD ) goto bad_width;

Newline + vgic_reg64_check_access

Also, you don't check whether the LPIs have been enabled here.

All my comments above stands. Furthermore, the code is not correctly 
indented (you are using hard tab).

> +	reg = v->arch.vgic.rdist_pendbase;
> +	vgic_reg64_update(&reg, r, info);
> +	reg = sanitize_pendbaser(reg);
> +	v->arch.vgic.rdist_pendbase = reg;
> +
> +        unmap_guest_pages(v->arch.vgic.pendtable, 16);
> +	v->arch.vgic.pendtable = map_guest_pages(v->domain,
> +                                                 reg & GENMASK(47, 12), 16);

The pending table is never touched by Xen. So I would avoid to mapping it.

> +	return 1;
>
>      case 0x0080:
>          goto write_reserved;
> diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c
> index b961551..4d9304f 100644
> --- a/xen/arch/arm/vgic.c
> +++ b/xen/arch/arm/vgic.c
> @@ -488,6 +488,10 @@ struct pending_irq *lpi_to_pending(struct vcpu *v, unsigned int lpi,
>          empty->pirq.irq = lpi;
>      }
>
> +    /* Update the enabled status */
> +    if ( gicv3_lpi_is_enabled(v->domain, lpi) )
> +        set_bit(GIC_IRQ_GUEST_ENABLED, &empty->pirq.status);
> +
>      return &empty->pirq;
>  }
>
> diff --git a/xen/include/asm-arm/domain.h b/xen/include/asm-arm/domain.h
> index ae8a9de..0cd3500 100644
> --- a/xen/include/asm-arm/domain.h
> +++ b/xen/include/asm-arm/domain.h
> @@ -109,6 +109,8 @@ struct arch_domain
>          } *rdist_regions;
>          int nr_regions;                     /* Number of rdist regions */
>          uint32_t rdist_stride;              /* Re-Distributor stride */
> +        uint64_t rdist_propbase;
> +        uint8_t *proptable;
>  #endif
>      } vgic;
>
> @@ -247,7 +249,10 @@ struct arch_vcpu
>
>          /* GICv3: redistributor base and flags for this vCPU */
>          paddr_t rdist_base;
> -#define VGIC_V3_RDIST_LAST  (1 << 0)        /* last vCPU of the rdist */
> +#define VGIC_V3_RDIST_LAST      (1 << 0)        /* last vCPU of the rdist */

Please avoid spurious changes. In Xen we don't require all the constants 
to be aligned. This also makes it harder to go through the changes.

> +#define VGIC_V3_LPIS_ENABLED    (1 << 1)

Please document the purpose of this bit.

> +        uint64_t rdist_pendbase;
> +        unsigned long *pendtable;
>          uint8_t flags;
>          struct list_head pending_lpi_list;
>      } vgic;
> diff --git a/xen/include/asm-arm/gic-its.h b/xen/include/asm-arm/gic-its.h
> index 1f881c0..3b2e5c0 100644
> --- a/xen/include/asm-arm/gic-its.h
> +++ b/xen/include/asm-arm/gic-its.h
> @@ -139,7 +139,11 @@ int gicv3_lpi_drop_host_lpi(struct host_its *its,
>
>  static inline int gicv3_lpi_get_priority(struct domain *d, uint32_t lpi)

s/lpi/vlpi/ to make clear this is a function deal with virtual LPIs.

>  {
> -    return GIC_PRI_IRQ;
> +    return d->arch.vgic.proptable[lpi - 8192] & 0xfc;

I think this is the best place to ask this question: I don't see any 
code within this series to check that the guest actually initialized 
proptable and that the size is correct (you don't check that the guest 
provided enough memory compared to the number of vLPIs advertised).

FWIW, I already made those comments back when Vijay sent his patch 
series. It might be worth looking at what he did regarding all the 
sanity checks.

> +}

Newline here for clarity.

> +static inline bool gicv3_lpi_is_enabled(struct domain *d, uint32_t lpi)

Ditto for the naming.

> +{
> +    return d->arch.vgic.proptable[lpi - 8192] & LPI_PROP_ENABLED;
>  }
>
>  #else
> @@ -185,6 +189,10 @@ static inline int gicv3_lpi_get_priority(struct domain *d, uint32_t lpi)
>  {
>      return GIC_PRI_IRQ;
>  }

Newline here for clarity.

> +static inline bool gicv3_lpi_is_enabled(struct domain *d, uint32_t lpi)
> +{
> +    return false;
> +}
>
>  #endif /* CONFIG_HAS_ITS */
>
> diff --git a/xen/include/asm-arm/vgic.h b/xen/include/asm-arm/vgic.h
> index 4e29ba6..2b216cc 100644
> --- a/xen/include/asm-arm/vgic.h
> +++ b/xen/include/asm-arm/vgic.h
> @@ -285,6 +285,9 @@ VGIC_REG_HELPERS(32, 0x3);
>
>  #undef VGIC_REG_HELPERS
>
> +void *map_guest_pages(struct domain *d, paddr_t guest_addr, int nr_pages);
> +void unmap_guest_pages(void *va, int nr_pages);
> +
>  enum gic_sgi_mode;
>
>  /*
>

Regards,

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [RFC PATCH 11/24] ARM: vGICv3: handle virtual LPI pending and property tables
  2016-11-02 17:18   ` Julien Grall
@ 2016-11-02 17:41     ` Stefano Stabellini
  2016-11-02 18:03       ` Julien Grall
  2017-01-31  9:10     ` Andre Przywara
  1 sibling, 1 reply; 144+ messages in thread
From: Stefano Stabellini @ 2016-11-02 17:41 UTC (permalink / raw)
  To: Julien Grall; +Cc: Andre Przywara, Stefano Stabellini, Steve Capper, xen-devel

On Wed, 2 Nov 2016, Julien Grall wrote:
> > +    }
> > +
> > +    ptr = vmap(pages, nr_pages);
> 
> I am not a big fan of the vmap solution for various reasons:
> 	- the VMAP area is small (only 1GB) it will not scale (you seem
> to use it to map pretty much all memory provisioned for the ITS)
> 	- writing to a register cannot fail, so how do you cope with that?
> 
> I think the best approach here is to use a similar approach as
> copy_*_guests helpers but dealing with IPA rather than guest VA.

I don't like the idea of using the vmap for this either, but the problem
with the copy_*_guest approach is that it only maps one page at a time.
It is unable to map multiple pages contiguously, which seems to be
required here.


* Re: [RFC PATCH 03/24] ARM: GICv3 ITS: allocate device and collection table
  2016-10-24 14:30   ` Vijay Kilari
@ 2016-11-02 17:51     ` Andre Przywara
  0 siblings, 0 replies; 144+ messages in thread
From: Andre Przywara @ 2016-11-02 17:51 UTC (permalink / raw)
  To: Vijay Kilari; +Cc: xen-devel, Julien Grall, Stefano Stabellini

Hi,

On 24/10/16 15:30, Vijay Kilari wrote:
> On Wed, Sep 28, 2016 at 11:54 PM, Andre Przywara <andre.przywara@arm.com> wrote:
>> Each ITS maps a pair of a DeviceID (usually the PCI b/d/f triplet) and
>> an EventID (the MSI payload or interrupt ID) to a pair of LPI number
>> and collection ID, which points to the target CPU.
>> This mapping is stored in the device and collection tables, which software
>> has to provide for the ITS to use.
>> Allocate the required memory and hand it the ITS.
>> We limit the number of devices to cover 4 PCI busses for now.
> 
>    Thunderx has more than 4 PCI busses

Yeah, I am thinking about a proper solution for that hack.
We may use a default of 4 buses and allow the platform to override this.
Or make this a configuration value.
Or copy the Linux behaviour by limiting the number of pages to some
sensible value (16MB, if I got this correctly).
Anyway I think we need indirect table support to keep ThunderX from
allocating too much memory for this.

>>
>> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
>> ---
>>  xen/arch/arm/gic-its.c        | 114 ++++++++++++++++++++++++++++++++++++++++++
>>  xen/arch/arm/gic-v3.c         |   5 ++
>>  xen/include/asm-arm/gic-its.h |  49 +++++++++++++++++-
>>  3 files changed, 167 insertions(+), 1 deletion(-)
>>
>> diff --git a/xen/arch/arm/gic-its.c b/xen/arch/arm/gic-its.c
>> index b52dff3..40238a2 100644
>> --- a/xen/arch/arm/gic-its.c
>> +++ b/xen/arch/arm/gic-its.c
>> @@ -21,6 +21,7 @@
>>  #include <xen/device_tree.h>
>>  #include <xen/libfdt/libfdt.h>
>>  #include <asm/p2m.h>
>> +#include <asm/io.h>
>>  #include <asm/gic.h>
>>  #include <asm/gic_v3_defs.h>
>>  #include <asm/gic-its.h>
>> @@ -38,6 +39,119 @@ static DEFINE_PER_CPU(void *, pending_table);
>>          min_t(unsigned int, lpi_data.host_lpi_bits, CONFIG_HOST_LPI_BITS)
>>  #define MAX_HOST_LPIS   (BIT(MAX_HOST_LPI_BITS) - 8192)
>>
>> +#define BASER_ATTR_MASK                                           \
>> +        ((0x3UL << GITS_BASER_SHAREABILITY_SHIFT)               | \
>> +         (0x7UL << GITS_BASER_OUTER_CACHEABILITY_SHIFT)         | \
>> +         (0x7UL << GITS_BASER_INNER_CACHEABILITY_SHIFT))
>> +#define BASER_RO_MASK   (GENMASK(52, 48) | GENMASK(58, 56))
>> +
>> +static uint64_t encode_phys_addr(paddr_t addr, int page_bits)
>> +{
>> +    uint64_t ret;
>> +
>> +    if ( page_bits < 16)
>> +        return (uint64_t)addr & GENMASK(47, page_bits);
>> +
>> +    ret = addr & GENMASK(47, 16);
>> +    return ret | (addr & GENMASK(51, 48)) >> (48 - 12);
> 
>     why this mask and shift for?.

According to the (current issue of the) spec bits 48-51 of the address
are stored in bits 12-15 of the register.

>> +}
>> +
>> +static int gicv3_map_baser(void __iomem *basereg, uint64_t regc, int nr_items)
>> +{
>> +    uint64_t attr;
>> +    int entry_size = (regc >> GITS_BASER_ENTRY_SIZE_SHIFT) & 0x1f;
>> +    int pagesz;
>> +    int order;
>> +    void *buffer = NULL;
>> +
>> +    attr  = GIC_BASER_InnerShareable << GITS_BASER_SHAREABILITY_SHIFT;
>> +    attr |= GIC_BASER_CACHE_SameAsInner << GITS_BASER_OUTER_CACHEABILITY_SHIFT;
>> +    attr |= GIC_BASER_CACHE_RaWaWb << GITS_BASER_INNER_CACHEABILITY_SHIFT;
>> +
>> +    /*
>> +     * Loop over the page sizes (4K, 16K, 64K) to find out what the host
>> +     * supports.
>> +     */
>> +    for (pagesz = 0; pagesz < 3; pagesz++)
>> +    {
>> +        uint64_t reg;
>> +        int nr_bytes;
>> +
>> +        nr_bytes = ROUNDUP(nr_items * entry_size, BIT(pagesz * 2 + 12));
>> +        order = get_order_from_bytes(nr_bytes);
>> +
>> +        if ( !buffer )
>> +            buffer = alloc_xenheap_pages(order, 0);
>            Don't we need to reset to zero all the pages before handing
> memory to ITS hw?

True. Will fix it.

>> +        if ( !buffer )
>> +            return -ENOMEM;
>> +
>> +        reg  = attr;
>> +        reg |= (pagesz << GITS_BASER_PAGE_SIZE_SHIFT);
>> +        reg |= nr_bytes >> (pagesz * 2 + 12);
>> +        reg |= regc & BASER_RO_MASK;
>> +        reg |= GITS_BASER_VALID;
>> +        reg |= encode_phys_addr(virt_to_maddr(buffer), pagesz * 2 + 12);
>> +
>> +        writeq_relaxed(reg, basereg);
>> +        regc = readl_relaxed(basereg);
>> +
>> +        /* The host didn't like our attributes, just use what it returned. */
>> +        if ( (regc & BASER_ATTR_MASK) != attr )
>> +            attr = regc & BASER_ATTR_MASK;
>> +
>> +        /* If the host accepted our page size, we are done. */
>> +        if ( (reg & (3UL << GITS_BASER_PAGE_SIZE_SHIFT)) == pagesz )
>> +            return 0;
>> +
>> +        /* Check whether our buffer is aligned to the next page size already. */
>> +        if ( !(virt_to_maddr(buffer) & (BIT(pagesz * 2 + 12 + 2) - 1)) )
>> +        {
>> +            free_xenheap_pages(buffer, order);
>> +            buffer = NULL;
>> +        }
>> +    }
>> +
>> +    if ( buffer )
>> +        free_xenheap_pages(buffer, order);
>> +
>> +    return -EINVAL;
>> +}
>> +
>> +int gicv3_its_init(struct host_its *hw_its)
>> +{
>> +    uint64_t reg;
>> +    int i;
>> +
>> +    hw_its->its_base = ioremap_nocache(hw_its->addr, hw_its->size);
>> +    if ( !hw_its->its_base )
>> +        return -ENOMEM;
>> +
>> +    for (i = 0; i < 8; i++)
>> +    {
>> +        void __iomem *basereg = hw_its->its_base + GITS_BASER0 + i * 8;
>> +        int type;
>> +
>> +        reg = readq_relaxed(basereg);
>> +        type = (reg >> 56) & 0x7;
> 
>       define a macro for these constants

Sure.

Cheers,
Andre.


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


* Re: [RFC PATCH 11/24] ARM: vGICv3: handle virtual LPI pending and property tables
  2016-11-02 17:41     ` Stefano Stabellini
@ 2016-11-02 18:03       ` Julien Grall
  2016-11-02 18:09         ` Stefano Stabellini
  0 siblings, 1 reply; 144+ messages in thread
From: Julien Grall @ 2016-11-02 18:03 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: Andre Przywara, Steve Capper, xen-devel

Hi Stefano,

On 02/11/16 17:41, Stefano Stabellini wrote:
> On Wed, 2 Nov 2016, Julien Grall wrote:
>>> +    }
>>> +
>>> +    ptr = vmap(pages, nr_pages);
>>
>> I am not a big fan of the vmap solution for various reasons:
>> 	- the VMAP area is small (only 1GB), so it will not scale (you seem
>> to use it to map pretty much all the memory provisioned for the ITS)
>> 	- writing to a register cannot fail, how do you cope with that?
>>
>> I think the best approach here is to use a similar approach to the
>> copy_*_guest helpers, but dealing with IPA rather than guest VA.
>
> I don't like the idea of using the vmap for this either, but the problem
> with the copy_*_guest approach is that it only maps one page at a time.
> It is unable to map multiple pages contiguously, which seems to be
> required here.

We will get into trouble very quickly with the vmap solution. The memory 
provisioned for the ITS can be quite big: each GITS_BASER can hold up to 
16MB of memory (if I computed correctly).

The question is: why do we need to map the pages contiguously? Can we do 
it in a different way?
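To make the suggestion concrete, a page-at-a-time copy can be sketched as a self-contained toy model (map_one_guest_page and copy_from_guest_ipa are made-up stand-ins for Xen's real per-page mapping and copy_*_guest-style helpers, and guest_ram stands in for actual guest memory):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SHIFT 12
#define PAGE_SIZE  (1UL << PAGE_SHIFT)

/* Toy stand-in for guest RAM; Xen would translate the IPA and map
 * the backing page instead. */
static uint8_t guest_ram[4 * PAGE_SIZE];

/* Hypothetical: map the single guest page containing 'ipa'. */
static void *map_one_guest_page(uint64_t ipa)
{
    return &guest_ram[ipa & ~(uint64_t)(PAGE_SIZE - 1)];
}

/* Copy 'len' bytes starting at guest IPA 'ipa', one page at a time,
 * so no contiguous virtual mapping of the whole range is required. */
static void copy_from_guest_ipa(void *dst, uint64_t ipa, size_t len)
{
    uint8_t *d = dst;

    while ( len )
    {
        size_t offset = ipa & (PAGE_SIZE - 1);
        size_t chunk = PAGE_SIZE - offset;

        if ( chunk > len )
            chunk = len;

        memcpy(d, (uint8_t *)map_one_guest_page(ipa) + offset, chunk);
        /* the real helper would unmap the page here */
        ipa += chunk;
        d += chunk;
        len -= chunk;
    }
}
```

The point being that only one page is ever mapped at a time, so no contiguous virtual mapping of a whole ITS table is needed.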

Regards,

-- 
Julien Grall


* Re: [RFC PATCH 11/24] ARM: vGICv3: handle virtual LPI pending and property tables
  2016-11-02 18:03       ` Julien Grall
@ 2016-11-02 18:09         ` Stefano Stabellini
  0 siblings, 0 replies; 144+ messages in thread
From: Stefano Stabellini @ 2016-11-02 18:09 UTC (permalink / raw)
  To: Julien Grall; +Cc: Andre Przywara, Stefano Stabellini, Steve Capper, xen-devel

On Wed, 2 Nov 2016, Julien Grall wrote:
> Hi Stefano,
> 
> On 02/11/16 17:41, Stefano Stabellini wrote:
> > On Wed, 2 Nov 2016, Julien Grall wrote:
> > > > +    }
> > > > +
> > > > +    ptr = vmap(pages, nr_pages);
> > > 
> > > I am not a big fan of the vmap solution for various reasons:
> > > 	- the VMAP area is small (only 1GB), so it will not scale (you seem
> > > to use it to map pretty much all the memory provisioned for the ITS)
> > > 	- writing to a register cannot fail, how do you cope with that?
> > > 
> > > I think the best approach here is to use a similar approach to the
> > > copy_*_guest helpers, but dealing with IPA rather than guest VA.
> > 
> > I don't like the idea of using the vmap for this either, but the problem
> > with the copy_*_guest approach is that it only maps one page at a time.
> > It is unable to map multiple pages contiguously, which seems to be
> > required here.
> 
> We will get into trouble very quickly with the vmap solution. The memory
> provisioned for the ITS can be quite big: each GITS_BASER can hold up to
> 16MB of memory (if I computed correctly).
> 
> The question is: why do we need to map the pages contiguously? Can we do it
> in a different way?
 
I agree with Julien: if we can get away with mapping one page at a time
without performance penalties, we should do it. Keep in mind that on
ARM64 Xen doesn't actually need to map anything because the whole
physical memory is already mapped.


* Re: [RFC PATCH 12/24] ARM: vGICv3: introduce basic ITS emulation bits
  2016-09-28 18:24 ` [RFC PATCH 12/24] ARM: vGICv3: introduce basic ITS emulation bits Andre Przywara
  2016-10-09 14:20   ` Vijay Kilari
  2016-10-24 15:31   ` Vijay Kilari
@ 2016-11-03 17:50   ` Julien Grall
  2016-11-08 23:54   ` Stefano Stabellini
  3 siblings, 0 replies; 144+ messages in thread
From: Julien Grall @ 2016-11-03 17:50 UTC (permalink / raw)
  To: Andre Przywara, Stefano Stabellini; +Cc: xen-devel

Hi Andre,

On 28/09/16 19:24, Andre Przywara wrote:
> Create a new file to hold the emulation code for the ITS widget.
> For now we emulate the memory mapped ITS registers and provide a stub
> to introduce the ITS command handling framework (but without actually
> emulating any commands at this time).

The ITS is a complex piece, so I think it would be good for the commit 
message to describe in more detail how this will work. Also, documentation 
in the tree would be very helpful for understanding the code.

>
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> ---
>  xen/arch/arm/Makefile             |   1 +
>  xen/arch/arm/vgic-its.c           | 378 ++++++++++++++++++++++++++++++++++++++
>  xen/arch/arm/vgic-v3.c            |   9 -
>  xen/include/asm-arm/gic_v3_defs.h |  19 ++
>  4 files changed, 398 insertions(+), 9 deletions(-)
>  create mode 100644 xen/arch/arm/vgic-its.c
>
> diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
> index c2c4daa..cb0201f 100644
> --- a/xen/arch/arm/Makefile
> +++ b/xen/arch/arm/Makefile
> @@ -44,6 +44,7 @@ obj-y += traps.o
>  obj-y += vgic.o
>  obj-y += vgic-v2.o
>  obj-$(CONFIG_ARM_64) += vgic-v3.o
> +obj-$(CONFIG_HAS_ITS) += vgic-its.o
>  obj-y += vm_event.o
>  obj-y += vtimer.o
>  obj-y += vpsci.o
> diff --git a/xen/arch/arm/vgic-its.c b/xen/arch/arm/vgic-its.c
> new file mode 100644
> index 0000000..875b992
> --- /dev/null
> +++ b/xen/arch/arm/vgic-its.c
> @@ -0,0 +1,378 @@
> +/*
> + * xen/arch/arm/vgic-its.c
> + *
> + * ARM Interrupt Translation Service (ITS) emulation
> + *
> + * Andre Przywara <andre.przywara@arm.com>
> + * Copyright (c) 2016 ARM Ltd.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + */
> +
> +#include <xen/bitops.h>
> +#include <xen/config.h>
> +#include <xen/domain_page.h>
> +#include <xen/lib.h>
> +#include <xen/init.h>
> +#include <xen/softirq.h>
> +#include <xen/irq.h>
> +#include <xen/sched.h>
> +#include <xen/sizes.h>
> +#include <asm/current.h>
> +#include <asm/mmio.h>
> +#include <asm/gic_v3_defs.h>
> +#include <asm/gic-its.h>
> +#include <asm/vgic.h>
> +#include <asm/vgic-emul.h>
> +
> +/* Data structure to describe a virtual ITS */
> +struct virt_its {
> +    struct domain *d;
> +    struct host_its *hw_its;
> +    spinlock_t vcmd_lock;       /* protects the virtual command buffer */
> +    uint64_t cbaser;
> +    uint64_t *cmdbuf;
> +    int cwriter;
> +    int creadr;

CWRITER and CREADR are registers, so they need to be described in terms of 
a number of bits. Also, even though the top word of CREADR/CWRITER is 
RES0, I would much prefer to see uint64_t rather than uint32_t, as this is 
the real size of the register.

> +    spinlock_t its_lock;        /* protects the collection and device tables */
> +    uint64_t baser0, baser1;

Please describe what baser0 and baser1 contain. If I understand the code 
correctly, baser0 will store the device table information whilst baser1 
the collection table.

> +    uint16_t *coll_table;

What is the layout of the collection table?

> +    int max_collections;

unsigned int

> +    uint64_t *dev_table;

What is the layout of the device table?

> +    int max_devices;

unsigned int.

> +    bool enabled;
> +};
> +
> +/* An Interrupt Translation Table Entry: this is indexed by a

Coding style:

/*
  * Foo

> + * DeviceID/EventID pair and is located in guest memory.
> + */
> +struct vits_itte
> +{
> +    uint64_t hlpi:24;
> +    uint64_t vlpi:24;
> +    uint64_t collection:16;
> +};
> +
> +/**************************************
> + * Functions that handle ITS commands *
> + **************************************/
> +
> +static uint64_t its_cmd_mask_field(uint64_t *its_cmd,

Please make this function inline.

> +                                   int word, int shift, int size)

unsigned for all those parameters.

> +{
> +    return (le64_to_cpu(its_cmd[word]) >> shift) & (BIT(size) - 1);

It is probably better to use BIT_ULL (see my explanation on previous 
patches).

> +}
> +
> +#define its_cmd_get_command(cmd)        its_cmd_mask_field(cmd, 0,  0,  8)
> +#define its_cmd_get_deviceid(cmd)       its_cmd_mask_field(cmd, 0, 32, 32)
> +#define its_cmd_get_size(cmd)           its_cmd_mask_field(cmd, 1,  0,  5)
> +#define its_cmd_get_id(cmd)             its_cmd_mask_field(cmd, 1,  0, 32)
> +#define its_cmd_get_physical_id(cmd)    its_cmd_mask_field(cmd, 1, 32, 32)
> +#define its_cmd_get_collection(cmd)     its_cmd_mask_field(cmd, 2,  0, 16)
> +#define its_cmd_get_target_addr(cmd)    its_cmd_mask_field(cmd, 2, 16, 32)
> +#define its_cmd_get_validbit(cmd)       its_cmd_mask_field(cmd, 2, 63,  1)
> +
> +#define ITS_CMD_BUFFER_SIZE(baser)      ((((baser) & 0xff) + 1) << 12)
> +
> +static int vgic_its_handle_cmds(struct domain *d, struct virt_its *its,
> +                                uint32_t writer)

uint64_t here.

> +{
> +    uint64_t *cmdptr;
> +
> +    if ( !its->cmdbuf )
> +        return -1;
> +
> +    if ( writer >= ITS_CMD_BUFFER_SIZE(its->cbaser) )
> +        return -1;

You return an error value but the caller does not check it. Should the 
caller not take a different action when you return -1? If not, it should 
be documented.

> +
> +    spin_lock(&its->vcmd_lock);

I am quite concerned about this locking.

> +
> +    while ( its->creadr != writer )
> +    {
> +        cmdptr = its->cmdbuf + (its->creadr / sizeof(*its->cmdbuf));
> +        switch (its_cmd_get_command(cmdptr))

Coding style: switch ( ... )

> +        {
> +        case GITS_CMD_SYNC:
> +            /* We handle ITS commands synchronously, so we ignore SYNC. */
> +	    break;

The indentation is wrong.

> +        default:
> +            gdprintk(XENLOG_G_WARNING, "ITS: unhandled ITS command %ld\n",

gdprintk already adds XENLOG_GUEST, so you can use XENLOG_WARNING here.

Also s/%ld/%lu/

> +                   its_cmd_get_command(cmdptr));

Should we not report the error to the guest, or crash it? We tend to 
do the latter in Xen for constrained unpredictable behaviour.

> +            break;
> +        }
> +
> +        its->creadr += ITS_CMD_SIZE;
> +        if ( its->creadr == ITS_CMD_BUFFER_SIZE(its->cbaser) )
> +            its->creadr = 0;
> +    }
> +    its->cwriter = writer;

I think its->cwriter should be updated before the loop, so another vCPU 
could read the correct CWRITER whilst this vCPU is executing the commands.

> +
> +    spin_unlock(&its->vcmd_lock);
> +
> +    return 0;
> +}
> +
> +/*****************************
> + * ITS registers read access *
> + *****************************/
> +
> +/* The physical address is encoded slightly differently depending on

Coding style:

/*
  * foo

> + * the used page size: the highest four bits are stored in the lowest
> + * four bits of the field for 64K pages.
> + */
> +static paddr_t get_baser_phys_addr(uint64_t reg)
> +{
> +    if ( reg & BIT(9) )

Please document what is bit 9.

> +        return (reg & GENMASK(47, 16)) | ((reg & GENMASK(15, 12)) << 36);
> +    else
> +        return reg & GENMASK(47, 12);
> +}
> +
> +static int vgic_v3_its_mmio_read(struct vcpu *v, mmio_info_t *info,
> +                                 register_t *r, void *priv)
> +{
> +    struct virt_its *its = priv;
> +
> +    switch ( info->gpa & 0xffff )
> +    {
> +    case VREG32(GITS_CTLR):
> +        if ( info->dabt.size != DABT_WORD ) goto bad_width;
> +        *r = vgic_reg32_extract(its->enabled | BIT(31), info);

Please use a define for BIT(31). Also, technically the ITS is not 
quiescent while commands are executed (GITS_CTLR could be read from 
another vCPU).

> +	break;
> +    case VREG32(GITS_IIDR):
> +        if ( info->dabt.size != DABT_WORD ) goto bad_width;
> +        *r = vgic_reg32_extract(GITS_IIDR_VALUE, info);
> +        break;
> +    case VREG64(GITS_TYPER):
> +        if ( info->dabt.size < DABT_WORD ) goto bad_width;

Please use vgic_reg64_check_access

> +        *r = vgic_reg64_extract(0x1eff1, info);

Please document the value and add defines. Vijay mentioned the number of 
device IDs, but the number of collections likely needs to be dynamic, as 
it depends on the number of vCPUs.

> +        break;
> +    case VREG64(GITS_CBASER):
> +        if ( info->dabt.size < DABT_WORD ) goto bad_width;

Please use vgic_reg64_check_access

> +        *r = vgic_reg64_extract(its->cbaser, info);
> +        break;
> +    case VREG64(GITS_CWRITER):
> +        if ( info->dabt.size < DABT_WORD ) goto bad_width;

Please use vgic_reg64_check_access

> +        *r = vgic_reg64_extract(its->cwriter, info);
> +        break;
> +    case VREG64(GITS_CREADR):
> +        if ( info->dabt.size < DABT_WORD ) goto bad_width;

Please use vgic_reg64_check_access

> +        *r = vgic_reg64_extract(its->creadr, info);
> +        break;
> +    case VREG64(GITS_BASER0):
> +        if ( info->dabt.size < DABT_WORD ) goto bad_width;

Please use vgic_reg64_check_access

> +        *r = vgic_reg64_extract(its->baser0, info);
> +        break;
> +    case VREG64(GITS_BASER1):
> +        if ( info->dabt.size < DABT_WORD ) goto bad_width;

Please use vgic_reg64_check_access

> +        *r = vgic_reg64_extract(its->baser1, info);
> +        break;
> +    case VRANGE64(GITS_BASER2, GITS_BASER7):
> +        if ( info->dabt.size < DABT_WORD ) goto bad_width;

Please use vgic_reg64_check_access

> +        *r = vgic_reg64_extract(0, info);

Please introduce a label read_as_zero_64 at the end and implement the RAZ 
behaviour there. It will act as documentation too (see an example in 
vgic-v3.c).

Also, vgic_reg64_extract(0, info) will ... always return 0. So you can 
optimize it ;).

> +        break;
> +    case VREG32(GICD_PIDR2):

This feels odd to use GICD_PIDR2 here. Please define GITS_PIDR2 to avoid 
any confusion.

> +        if ( info->dabt.size != DABT_WORD ) goto bad_width;
> +        *r = vgic_reg32_extract(GICV3_GICD_PIDR2, info);

Ditto.

> +        break;

Please add all the registers, even implementation defined and reserved 
ones. Ignoring registers without any warning is usually a bad idea, as it 
makes them very difficult to debug. You can look at vgic-v3.c for an example.


> +    }
> +
> +    return 1;
> +
> +bad_width:

Please print an error here (see vgic-v3.c).

> +    domain_crash_synchronous();
> +
> +    return 0;
> +}
> +
> +/******************************
> + * ITS registers write access *
> + ******************************/
> +
> +static int its_baser_table_size(uint64_t baser)

unsigned int for the return type, and the function would probably benefit 
from being inlined.

> +{
> +    int page_size = 0;

unsigned int.

> +
> +    switch ( (baser >> 8) & 3 )

Please define 8 and 3.

> +    {
> +    case 0: page_size = SZ_4K; break;
> +    case 1: page_size = SZ_16K; break;
> +    case 2:
> +    case 3: page_size = SZ_64K; break;
> +    }

It looks to me like the switch could be turned into an array:

unsigned int page_size[] = { SZ_4K, SZ_16K, SZ_64K, SZ_64K };

This would make the code simpler.
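For illustration only, that shape spelled out as a compilable sketch (the SZ_* values from xen/sizes.h are written out as plain constants here):

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for Xen's xen/sizes.h constants. */
#define SZ_4K  0x1000u
#define SZ_16K 0x4000u
#define SZ_64K 0x10000u

/* Page size is encoded in baser[9:8], number of pages in baser[7:0] + 1. */
static unsigned int its_baser_table_size(uint64_t baser)
{
    static const unsigned int page_sizes[] = { SZ_4K, SZ_16K, SZ_64K, SZ_64K };

    return page_sizes[(baser >> 8) & 3] * ((baser & 0xff) + 1);
}
```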

> +
> +    return page_size * ((baser & GENMASK(7, 0)) + 1);
> +}
> +
> +static int its_baser_nr_entries(uint64_t baser)

unsigned int for the return type, and the function would probably benefit 
from being inlined.

> +{
> +    int entry_size = ((baser & GENMASK(52, 48)) >> 48) + 1;

unsigned int for the type. Also please use a define for 48.

> +
> +    return its_baser_table_size(baser) / entry_size;
> +}
> +
> +static int vgic_v3_its_mmio_write(struct vcpu *v, mmio_info_t *info,
> +                                  register_t r, void *priv)
> +{
> +    struct domain *d = v->domain;
> +    struct virt_its *its = priv;
> +    uint64_t reg;
> +    uint32_t ctlr;

ctlr could be defined in the case...

> +
> +    switch ( info->gpa & 0xffff )
> +    {
> +    case VREG32(GITS_CTLR):

here. I tend to prefer to restrict the scope whenever it is possible.

> +        ctlr = its->enabled ? GITS_CTLR_ENABLE : 0;
> +        if ( info->dabt.size != DABT_WORD ) goto bad_width;
> +	vgic_reg32_update(&ctlr, r, info);
> +	its->enabled = ctlr & GITS_CTLR_ENABLE;
> +	/* TODO: trigger something ... */

The indentation is wrong.

> +        return 1;
> +    case VREG32(GITS_IIDR):
> +        goto write_ignore_32;
> +    case VREG32(GITS_TYPER):
> +        goto write_ignore_32;
> +    case VREG64(GITS_CBASER):
> +        if ( info->dabt.size < DABT_WORD ) goto bad_width;

Please use vgic_reg64_check_access.

> +
> +        /* Changing base registers with the ITS enabled is UNPREDICTABLE. */
> +        if ( its->enabled )
> +            return 1;
> +

There may be concurrent accesses to GITS_CBASER, so you want to have some 
locking here.

> +        reg = its->cbaser;
> +        vgic_reg64_update(&reg, r, info);
> +        /* TODO: sanitise! */

Please fix this todo as soon as possible.

> +        its->cbaser = reg;

Also, I am not sure I understand why you need a temporary variable, when 
you could directly update its->cbaser:

vgic_reg64_update(&its->cbaser, r, info);

Also, per the spec (8.19.2 in ARM IHI 0069C), GITS_CREADR (i.e. 
its->creadr) should be reset to 0.

> +
> +        if ( reg & BIT(63) )

Please define bit 63.

> +        {
> +            its->cmdbuf = map_guest_pages(d, reg & GENMASK(51, 12), 1);
> +        }
> +        else
> +        {
> +            unmap_guest_pages(its->cmdbuf, 1);
> +            its->cmdbuf = NULL;
> +        }
> +
> +	return 1;
> +    case VREG64(GITS_CWRITER):
> +        if ( info->dabt.size < DABT_WORD ) goto bad_width;

Please use vgic_reg64_check_access.

> +        reg = its->cwriter;
> +        vgic_reg64_update(&reg, r, info);

vgic_its_handle_cmds expects CWRITER to have bit 0 (Retry) masked, and 
bits [32:20] and [4:1] should be RES0 (i.e. masked).
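As a sketch of that masking (GENMASK is redefined locally to match Xen's semantics; only the Offset field, bits [19:5], survives, which drops Retry and the RES0 bits in one go):

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-in for Xen's GENMASK(): bits h..l set. */
#define GENMASK(h, l) (((~0ULL) << (l)) & (~0ULL >> (63 - (h))))

/* Keep only GITS_CWRITER.Offset (bits [19:5]); this masks the Retry
 * bit (bit 0) and the RES0 bits in one go. */
static uint64_t sanitise_cwriter(uint64_t r)
{
    return r & GENMASK(19, 5);
}
```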

> +        vgic_its_handle_cmds(d, its, reg);

Should not you check the return value?

> +        return 1;
> +    case VREG64(GITS_CREADR):
> +        goto write_ignore_64;
> +    case VREG64(GITS_BASER0):
> +        if ( info->dabt.size < DABT_WORD ) goto bad_width;

Please use vgic_reg64_check_access

> +
> +        /* Changing base registers with the ITS enabled is UNPREDICTABLE. */
> +        if ( its->enabled )
> +            return 1;
> +
> +        reg = its->baser0;
> +        vgic_reg64_update(&reg, r, info);
> +
> +        reg &= ~GITS_BASER_RO_MASK;
> +        reg |= (sizeof(uint64_t) - 1) << GITS_BASER_ENTRY_SIZE_SHIFT;

Where does this sizeof(uint64_t) come from?

> +        reg |= GITS_BASER_TYPE_DEVICE << GITS_BASER_TYPE_SHIFT;
> +        /* TODO: sanitise! */
> +        /* TODO: locking(?) */

Yes, some locking is needed.

> +
> +        if ( reg & GITS_BASER_VALID )
> +        {
> +            its->dev_table = map_guest_pages(d,
> +                                             get_baser_phys_addr(reg),
> +                                             its_baser_table_size(reg) >> PAGE_SHIFT);
> +            its->max_devices = its_baser_nr_entries(reg);
> +            memset(its->dev_table, 0, its->max_devices * sizeof(uint64_t));

I am not sure I understand why we need the memset and what the value 
corresponds to.

> +        }
> +        else
> +        {
> +            unmap_guest_pages(its->dev_table,
> +                              its_baser_table_size(reg) >> PAGE_SHIFT);
> +            its->max_devices = 0;
> +        }
> +
> +        its->baser0 = reg;

Why don't you update baser0 directly (with vgic_reg64_update)?

> +        return 1;
> +    case VREG64(GITS_BASER1):
> +        if ( info->dabt.size < DABT_WORD ) goto bad_width;

Please use vgic_reg64_check_access

> +
> +        /* Changing base registers with the ITS enabled is UNPREDICTABLE. */
> +        if ( its->enabled )
> +            return 1;
> +
> +        reg = its->baser1;
> +        vgic_reg64_update(&reg, r, info);
> +        reg &= ~GITS_BASER_RO_MASK;
> +        reg |= (sizeof(uint16_t) - 1) << GITS_BASER_ENTRY_SIZE_SHIFT;
> +        reg |= GITS_BASER_TYPE_COLLECTION << GITS_BASER_TYPE_SHIFT;
> +        /* TODO: sanitise! */
> +
> +        /* TODO: sort out locking */

I am expecting this to be fixed in the next version.

> +        /* TODO: repeated calls: free old mapping */
> +        if ( reg & GITS_BASER_VALID )
> +        {
> +            its->coll_table = map_guest_pages(d, get_baser_phys_addr(reg),
> +                                              its_baser_table_size(reg) >> PAGE_SHIFT);
> +            its->max_collections = its_baser_nr_entries(reg);
> +            memset(its->coll_table, 0xff,
> +                   its->max_collections * sizeof(uint16_t));

I am not sure I understand why we need the memset and what the value 
corresponds to.

> +        }
> +        else
> +        {
> +            unmap_guest_pages(its->coll_table,
> +                              its_baser_table_size(reg) >> PAGE_SHIFT);
> +            its->max_collections = 0;
> +        }
> +        its->baser1 = reg;

Why don't you update baser1 directly (with vgic_reg64_update)?

> +        return 1;
> +    case VRANGE64(GITS_BASER2, GITS_BASER7):
> +        goto write_ignore_64;

From the ITS register map, we would have to emulate more registers (at 
least the reserved, implementation defined and RAZ ones).

> +    default:
> +        gdprintk(XENLOG_G_WARNING, "ITS: unhandled ITS register 0x%lx\n",
> +                 info->gpa & 0xffff);
> +        return 0;
> +    }
> +
> +    return 1;
> +
> +write_ignore_64:
> +    if ( ! vgic_reg64_check_access(info->dabt) ) goto bad_width;
> +    return 1;
> +
> +write_ignore_32:
> +    if ( info->dabt.size != DABT_WORD ) goto bad_width;
> +    return 1;
> +
> +bad_width:
> +    printk(XENLOG_G_ERR "%pv vGICR: bad read width %d r%d offset %#08lx\n",
> +           v, info->dabt.size, info->dabt.reg, info->gpa & 0xffff);
> +
> +    domain_crash_synchronous();
> +
> +    return 0;
> +}
> +
> +static const struct mmio_handler_ops vgic_its_mmio_handler = {
> +    .read  = vgic_v3_its_mmio_read,
> +    .write = vgic_v3_its_mmio_write,
> +};

This will break compilation with randconfig, as the ITS is selectable. 
Please make sure that every patch builds one by one. A good approach 
would be to only allow selecting the ITS at the end of this series.
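One hypothetical shape for that (the HAS_ITS symbol name comes from the Makefile hunk above; the prompt text and dependency line are assumptions, not taken from the series):

```kconfig
# While the series is incomplete: no prompt, so randconfig
# cannot enable half-built code.
config HAS_ITS
	bool

# The final patch then makes it user-selectable:
config HAS_ITS
	bool "GICv3 ITS MSI controller support" if EXPERT
	depends on ARM_64
```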

> +
> +/*
> + * Local variables:
> + * mode: C
> + * c-file-style: "BSD"
> + * c-basic-offset: 4
> + * indent-tabs-mode: nil
> + * End:
> + */
> diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c
> index 8fe8386..aa53a1e 100644
> --- a/xen/arch/arm/vgic-v3.c
> +++ b/xen/arch/arm/vgic-v3.c
> @@ -158,15 +158,6 @@ static void vgic_store_irouter(struct domain *d, struct vgic_irq_rank *rank,
>      rank->vcpu[offset] = new_vcpu->vcpu_id;
>  }
>
> -static inline bool vgic_reg64_check_access(struct hsr_dabt dabt)
> -{
> -    /*
> -     * 64 bits registers can be accessible using 32-bit and 64-bit unless
> -     * stated otherwise (See 8.1.3 ARM IHI 0069A).
> -     */
> -    return ( dabt.size == DABT_DOUBLE_WORD || dabt.size == DABT_WORD );
> -}
> -
>  static int __vgic_v3_rdistr_rd_mmio_read(struct vcpu *v, mmio_info_t *info,
>                                           uint32_t gicr_reg,
>                                           register_t *r)
> diff --git a/xen/include/asm-arm/gic_v3_defs.h b/xen/include/asm-arm/gic_v3_defs.h
> index da5fb77..6a91f5b 100644
> --- a/xen/include/asm-arm/gic_v3_defs.h
> +++ b/xen/include/asm-arm/gic_v3_defs.h
> @@ -147,6 +147,16 @@
>  #define LPI_PROP_RES1                (1 << 1)
>  #define LPI_PROP_ENABLED             (1 << 0)
>
> +/*
> + * PIDR2: Only bits[7:4] are not implementation defined. We are
> + * emulating a GICv3 ([7:4] = 0x3).
> + *
> + * We don't emulate a specific registers scheme so implement the others
> + * bits as RES0 as recommended by the spec (see 8.1.13 in ARM IHI 0069A).
> + */
> +#define GICV3_GICD_PIDR2  0x30
> +#define GICV3_GICR_PIDR2  GICV3_GICD_PIDR2

Those values should not be defined in gic_v3_defs.h but in a vgic header. 
My rationale is that those values are implementation defined (i.e. they 
depend on the emulation).

> +
>  #define GICH_VMCR_EOI                (1 << 9)
>  #define GICH_VMCR_VENG1              (1 << 1)
>
> @@ -190,6 +200,15 @@ struct rdist_region {
>      bool single_rdist;
>  };
>
> +/*
> + * 64 bits registers can be accessible using 32-bit and 64-bit unless
> + * stated otherwise (See 8.1.3 ARM IHI 0069A).
> + */
> +static inline bool vgic_reg64_check_access(struct hsr_dabt dabt)
> +{
> +    return ( dabt.size == DABT_DOUBLE_WORD || dabt.size == DABT_WORD );
> +}
> +

This function should be defined in vgic.h and not gic_v3_defs.h

>  #endif /* __ASM_ARM_GIC_V3_DEFS_H__ */
>
>  /*
>

Regards,

-- 
Julien Grall


* Re: [RFC PATCH 12/24] ARM: vGICv3: introduce basic ITS emulation bits
  2016-10-24 15:31   ` Vijay Kilari
@ 2016-11-03 19:26     ` Andre Przywara
  2016-11-04 12:07       ` Julien Grall
  0 siblings, 1 reply; 144+ messages in thread
From: Andre Przywara @ 2016-11-03 19:26 UTC (permalink / raw)
  To: Vijay Kilari; +Cc: xen-devel, Julien Grall, Stefano Stabellini

Hi,

On 24/10/16 16:31, Vijay Kilari wrote:
> On Wed, Sep 28, 2016 at 11:54 PM, Andre Przywara <andre.przywara@arm.com> wrote:
>> Create a new file to hold the emulation code for the ITS widget.
>> For now we emulate the memory mapped ITS registers and provide a stub
>> to introduce the ITS command handling framework (but without actually
>> emulating any commands at this time).
>>
>> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
>> ---
>>  xen/arch/arm/Makefile             |   1 +
>>  xen/arch/arm/vgic-its.c           | 378 ++++++++++++++++++++++++++++++++++++++
>>  xen/arch/arm/vgic-v3.c            |   9 -
>>  xen/include/asm-arm/gic_v3_defs.h |  19 ++
>>  4 files changed, 398 insertions(+), 9 deletions(-)
>>  create mode 100644 xen/arch/arm/vgic-its.c
>>
>> diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
>> index c2c4daa..cb0201f 100644
>> --- a/xen/arch/arm/Makefile
>> +++ b/xen/arch/arm/Makefile
>> @@ -44,6 +44,7 @@ obj-y += traps.o
>>  obj-y += vgic.o
>>  obj-y += vgic-v2.o
>>  obj-$(CONFIG_ARM_64) += vgic-v3.o
>> +obj-$(CONFIG_HAS_ITS) += vgic-its.o
>>  obj-y += vm_event.o
>>  obj-y += vtimer.o
>>  obj-y += vpsci.o
>> diff --git a/xen/arch/arm/vgic-its.c b/xen/arch/arm/vgic-its.c
>> new file mode 100644
>> index 0000000..875b992
>> --- /dev/null
>> +++ b/xen/arch/arm/vgic-its.c
>> @@ -0,0 +1,378 @@
>> +/*
>> + * xen/arch/arm/vgic-its.c
>> + *
>> + * ARM Interrupt Translation Service (ITS) emulation
>> + *
>> + * Andre Przywara <andre.przywara@arm.com>
>> + * Copyright (c) 2016 ARM Ltd.
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License as published by
>> + * the Free Software Foundation; either version 2 of the License, or
>> + * (at your option) any later version.
>> + *
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> + * GNU General Public License for more details.
>> + */
>> +
>> +#include <xen/bitops.h>
>> +#include <xen/config.h>
>> +#include <xen/domain_page.h>
>> +#include <xen/lib.h>
>> +#include <xen/init.h>
>> +#include <xen/softirq.h>
>> +#include <xen/irq.h>
>> +#include <xen/sched.h>
>> +#include <xen/sizes.h>
>> +#include <asm/current.h>
>> +#include <asm/mmio.h>
>> +#include <asm/gic_v3_defs.h>
>> +#include <asm/gic-its.h>
>> +#include <asm/vgic.h>
>> +#include <asm/vgic-emul.h>
>> +
>> +/* Data structure to describe a virtual ITS */
>> +struct virt_its {
>> +    struct domain *d;
>> +    struct host_its *hw_its;
>> +    spinlock_t vcmd_lock;       /* protects the virtual command buffer */
>> +    uint64_t cbaser;
>> +    uint64_t *cmdbuf;
>> +    int cwriter;
>> +    int creadr;
>> +    spinlock_t its_lock;        /* protects the collection and device tables */
>> +    uint64_t baser0, baser1;
>> +    uint16_t *coll_table;
>> +    int max_collections;
>> +    uint64_t *dev_table;
>> +    int max_devices;
>> +    bool enabled;
>> +};
>> +
>> +/* An Interrupt Translation Table Entry: this is indexed by a
>> + * DeviceID/EventID pair and is located in guest memory.
>> + */
>> +struct vits_itte
>> +{
>> +    uint64_t hlpi:24;
>> +    uint64_t vlpi:24;
>> +    uint64_t collection:16;
>> +};
>> +
>> +/**************************************
>> + * Functions that handle ITS commands *
>> + **************************************/
>> +
>> +static uint64_t its_cmd_mask_field(uint64_t *its_cmd,
>> +                                   int word, int shift, int size)
>> +{
>> +    return (le64_to_cpu(its_cmd[word]) >> shift) & (BIT(size) - 1);
>> +}
>> +
>> +#define its_cmd_get_command(cmd)        its_cmd_mask_field(cmd, 0,  0,  8)
>> +#define its_cmd_get_deviceid(cmd)       its_cmd_mask_field(cmd, 0, 32, 32)
>> +#define its_cmd_get_size(cmd)           its_cmd_mask_field(cmd, 1,  0,  5)
>> +#define its_cmd_get_id(cmd)             its_cmd_mask_field(cmd, 1,  0, 32)
>> +#define its_cmd_get_physical_id(cmd)    its_cmd_mask_field(cmd, 1, 32, 32)
>> +#define its_cmd_get_collection(cmd)     its_cmd_mask_field(cmd, 2,  0, 16)
>> +#define its_cmd_get_target_addr(cmd)    its_cmd_mask_field(cmd, 2, 16, 32)
>> +#define its_cmd_get_validbit(cmd)       its_cmd_mask_field(cmd, 2, 63,  1)
>> +
>> +#define ITS_CMD_BUFFER_SIZE(baser)      ((((baser) & 0xff) + 1) << 12)
>> +
>> +static int vgic_its_handle_cmds(struct domain *d, struct virt_its *its,
>> +                                uint32_t writer)
>> +{
>> +    uint64_t *cmdptr;
>> +
>> +    if ( !its->cmdbuf )
>> +        return -1;
>> +
>> +    if ( writer >= ITS_CMD_BUFFER_SIZE(its->cbaser) )
>> +        return -1;
>> +
>> +    spin_lock(&its->vcmd_lock);
>> +
>> +    while ( its->creadr != writer )
>> +    {
>> +        cmdptr = its->cmdbuf + (its->creadr / sizeof(*its->cmdbuf));
>> +        switch (its_cmd_get_command(cmdptr))
>> +        {
>> +        case GITS_CMD_SYNC:
>> +            /* We handle ITS commands synchronously, so we ignore SYNC. */
>> +           break;
>> +        default:
>> +            gdprintk(XENLOG_G_WARNING, "ITS: unhandled ITS command %ld\n",
>> +                   its_cmd_get_command(cmdptr));
>> +            break;
>> +        }
>> +
>> +        its->creadr += ITS_CMD_SIZE;
>> +        if ( its->creadr == ITS_CMD_BUFFER_SIZE(its->cbaser) )
>> +            its->creadr = 0;
>> +    }
>> +    its->cwriter = writer;
>> +
>> +    spin_unlock(&its->vcmd_lock);
>> +
>> +    return 0;
>> +}
>> +
>> +/*****************************
>> + * ITS registers read access *
>> + *****************************/
>> +
>> +/* The physical address is encoded slightly differently depending on
>> + * the used page size: the highest four bits are stored in the lowest
>> + * four bits of the field for 64K pages.
>> + */
>> +static paddr_t get_baser_phys_addr(uint64_t reg)
>> +{
>> +    if ( reg & BIT(9) )
>> +        return (reg & GENMASK(47, 16)) | ((reg & GENMASK(15, 12)) << 36);
>> +    else
>> +        return reg & GENMASK(47, 12);
>> +}
>> +
>> +static int vgic_v3_its_mmio_read(struct vcpu *v, mmio_info_t *info,
>> +                                 register_t *r, void *priv)
>> +{
>> +    struct virt_its *its = priv;
>> +
>> +    switch ( info->gpa & 0xffff )
>> +    {
>> +    case VREG32(GITS_CTLR):
>> +        if ( info->dabt.size != DABT_WORD ) goto bad_width;
>> +        *r = vgic_reg32_extract(its->enabled | BIT(31), info);
>> +       break;
>> +    case VREG32(GITS_IIDR):
>> +        if ( info->dabt.size != DABT_WORD ) goto bad_width;
>> +        *r = vgic_reg32_extract(GITS_IIDR_VALUE, info);
>> +        break;
>> +    case VREG64(GITS_TYPER):
>> +        if ( info->dabt.size < DABT_WORD ) goto bad_width;
>> +        *r = vgic_reg64_extract(0x1eff1, info);
>        GITS_TYPER.HCC is not set. Should be max vcpus of the domain

HCC is clear on purpose. We want the guest to provide memory for
everything it allocates, to keep it from hogging Xen's memory with
allocations.

>        GITS_TYPER.ID_bits are also just set to 15.

Yeah, I guess it should match what the hardware ITS provides, so as not
to impose an artificial limit here.

>> +        break;
>> +    case VREG64(GITS_CBASER):
>> +        if ( info->dabt.size < DABT_WORD ) goto bad_width;
>> +        *r = vgic_reg64_extract(its->cbaser, info);
>> +        break;
>> +    case VREG64(GITS_CWRITER):
>> +        if ( info->dabt.size < DABT_WORD ) goto bad_width;
>> +        *r = vgic_reg64_extract(its->cwriter, info);
>> +        break;
>> +    case VREG64(GITS_CREADR):
>> +        if ( info->dabt.size < DABT_WORD ) goto bad_width;
>> +        *r = vgic_reg64_extract(its->creadr, info);
>> +        break;
>> +    case VREG64(GITS_BASER0):
>> +        if ( info->dabt.size < DABT_WORD ) goto bad_width;
>> +        *r = vgic_reg64_extract(its->baser0, info);
>> +        break;
>> +    case VREG64(GITS_BASER1):
>> +        if ( info->dabt.size < DABT_WORD ) goto bad_width;
>> +        *r = vgic_reg64_extract(its->baser1, info);
>> +        break;
>> +    case VRANGE64(GITS_BASER2, GITS_BASER7):
>> +        if ( info->dabt.size < DABT_WORD ) goto bad_width;
>> +        *r = vgic_reg64_extract(0, info);
>> +        break;
>> +    case VREG32(GICD_PIDR2):
>> +        if ( info->dabt.size != DABT_WORD ) goto bad_width;
>> +        *r = vgic_reg32_extract(GICV3_GICD_PIDR2, info);
>> +        break;
>          missing default
>> +    }
>> +
>> +    return 1;
>> +
>> +bad_width:
>     A print here would be helpful

Yes.

>> +    domain_crash_synchronous();
>> +
>> +    return 0;
>> +}
>> +
>> +/******************************
>> + * ITS registers write access *
>> + ******************************/
>> +
>> +static int its_baser_table_size(uint64_t baser)
>> +{
>> +    int page_size = 0;
>> +
>> +    switch ( (baser >> 8) & 3 )
>> +    {
>> +    case 0: page_size = SZ_4K; break;
>> +    case 1: page_size = SZ_16K; break;
>> +    case 2:
>> +    case 3: page_size = SZ_64K; break;
>> +    }
>> +
>> +    return page_size * ((baser & GENMASK(7, 0)) + 1);
>> +}
>> +
>> +static int its_baser_nr_entries(uint64_t baser)
>> +{
>> +    int entry_size = ((baser & GENMASK(52, 48)) >> 48) + 1;
>> +
>> +    return its_baser_table_size(baser) / entry_size;
>> +}
>> +
>> +static int vgic_v3_its_mmio_write(struct vcpu *v, mmio_info_t *info,
>> +                                  register_t r, void *priv)
>> +{
>> +    struct domain *d = v->domain;
>> +    struct virt_its *its = priv;
>> +    uint64_t reg;
>> +    uint32_t ctlr;
>> +
>> +    switch ( info->gpa & 0xffff )
>> +    {
>> +    case VREG32(GITS_CTLR):
>> +        ctlr = its->enabled ? GITS_CTLR_ENABLE : 0;
>> +        if ( info->dabt.size != DABT_WORD ) goto bad_width;
>> +       vgic_reg32_update(&ctlr, r, info);
>> +       its->enabled = ctlr & GITS_CTLR_ENABLE;
>> +       /* TODO: trigger something ... */
>> +        return 1;
>> +    case VREG32(GITS_IIDR):
>> +        goto write_ignore_32;
>> +    case VREG32(GITS_TYPER):
>> +        goto write_ignore_32;
>> +    case VREG64(GITS_CBASER):
>> +        if ( info->dabt.size < DABT_WORD ) goto bad_width;
>> +
>> +        /* Changing base registers with the ITS enabled is UNPREDICTABLE. */
>> +        if ( its->enabled )
>> +            return 1;
>> +
>> +        reg = its->cbaser;
>> +        vgic_reg64_update(&reg, r, info);
>> +        /* TODO: sanitise! */
>> +        its->cbaser = reg;
>> +
>> +        if ( reg & BIT(63) )
>> +        {
>> +            its->cmdbuf = map_guest_pages(d, reg & GENMASK(51, 12), 1);
> 
>        Only one page of the guest cmd queue is mapped. After cwriter
> moves beyond one page, a panic is observed. Map all the guest pages.

Right, good catch.

Cheers,
Andre.

>> +        }
>> +        else
>> +        {
>> +            unmap_guest_pages(its->cmdbuf, 1);
>    Same here.
>> +            its->cmdbuf = NULL;
>> +        }
>> +
>> +       return 1;

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


* Re: [RFC PATCH 07/24] ARM: GICv3 ITS: introduce device mapping
  2016-10-24 15:31   ` Vijay Kilari
@ 2016-11-03 19:33     ` Andre Przywara
  0 siblings, 0 replies; 144+ messages in thread
From: Andre Przywara @ 2016-11-03 19:33 UTC (permalink / raw)
  To: Vijay Kilari; +Cc: xen-devel, Julien Grall, Stefano Stabellini

Hi,

On 24/10/16 16:31, Vijay Kilari wrote:
> On Wed, Sep 28, 2016 at 11:54 PM, Andre Przywara <andre.przywara@arm.com> wrote:
>> The ITS uses device IDs to map LPIs to a device. Dom0 will later use
>> those IDs, which we directly pass on to the host.
>> For this we have to map each device that Dom0 may request to a host
>> ITS device with the same identifier.
>> Allocate the respective memory and enter each device into a list to
>> later be able to iterate over it or to easily teardown guests.
>>
>> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
>> ---
>>  xen/arch/arm/gic-its.c        | 90 +++++++++++++++++++++++++++++++++++++++++++
>>  xen/include/asm-arm/gic-its.h | 16 ++++++++
>>  2 files changed, 106 insertions(+)
>>
>> diff --git a/xen/arch/arm/gic-its.c b/xen/arch/arm/gic-its.c
>> index 2140e4a..bf1f5b5 100644
>> --- a/xen/arch/arm/gic-its.c
>> +++ b/xen/arch/arm/gic-its.c
>> @@ -168,6 +168,94 @@ static int its_send_cmd_mapc(struct host_its *its, int collection_id, int cpu)
>>      return its_send_command(its, cmd);
>>  }
>>
>> +static int its_send_cmd_mapd(struct host_its *its, uint32_t deviceid,
>> +                             int size, uint64_t itt_addr, bool valid)
>> +{
>> +    uint64_t cmd[4];
>> +
>> +    cmd[0] = GITS_CMD_MAPD | ((uint64_t)deviceid << 32);
>> +    cmd[1] = size & GENMASK(4, 0);
>> +    cmd[2] = itt_addr & GENMASK(51, 8);
>> +    if ( valid )
>> +        cmd[2] |= BIT(63);
>> +    cmd[3] = 0x00;
>> +
>> +    return its_send_command(its, cmd);
>> +}
>> +
>> +int gicv3_its_map_device(struct host_its *hw_its, struct domain *d,
>> +                         int devid, int bits, bool valid)
>> +{
>> +    void *itt_addr = NULL;
>> +    struct its_devices *dev, *temp;
>> +    bool reuse_dev = false;
>> +
>> +    list_for_each_entry_safe(dev, temp, &hw_its->its_devices, entry)
>> +    {
>> +        if ( (dev->d->domain_id != d->domain_id) || (dev->devid != devid) )
>> +            continue;
>> +
>> +        its_send_cmd_mapd(hw_its, dev->devid, 0, 0, false);
>> +        xfree(dev->itt_addr);
>> +        if ( !valid )
>> +        {
>> +            xfree(dev);
>     xfree() should be done after list_del()

Oh, indeed.

>> +            list_del(&dev->entry);
>> +
>> +            return 0;
>> +        }
>> +
>> +        reuse_dev = true;
>> +        break;
>> +    }
>> +
>> +    if ( !valid )
>> +        return 0;
>> +
>> +    itt_addr = _xmalloc(BIT(bits) * hw_its->itte_size, 256);
>> +    if ( !itt_addr )
>> +        return -ENOMEM;
>> +
>> +    if ( !reuse_dev )
>> +    {
>> +        dev = xmalloc(struct its_devices);
>> +        if ( !dev )
>> +            return -ENOMEM;
>> +
>> +        list_add_tail(&dev->entry, &hw_its->its_devices);
>> +    }
>> +
>> +    dev->itt_addr = itt_addr;
>> +    dev->d = d;
>> +    dev->devid = devid;
>> +
>> +    return its_send_cmd_mapd(hw_its, devid, bits - 1,
>> +                             itt_addr ? virt_to_maddr(itt_addr) : 0, true);
>           the check on itt_addr is redundant

Ah, yes, this is an artifact of an earlier version. Thanks for spotting
this.

> 
>> +}
>> +
>> +/* Removing any connections a domain had to any ITS in the system. */
>> +int its_remove_domain(struct domain *d)
>> +{
>> +    struct host_its *its;
>> +    struct its_devices *dev, *temp;
>> +
>> +    list_for_each_entry(its, &host_its_list, entry)
>> +    {
>> +        list_for_each_entry_safe(dev, temp, &its->its_devices, entry)
>> +        {
>> +            if ( dev->d->domain_id != d->domain_id )
>> +                continue;
>> +
>> +            its_send_cmd_mapd(its, dev->devid, 0, 0, false);
>> +            xfree(dev->itt_addr);
>> +            xfree(dev);
> 
> xfree() should be done after list_del()
>> +            list_del(&dev->entry);
>> +        }
>         This code is the same as above. Can it be moved to a separate function?

Probably. Will take a look.

Thanks,
Andre.

>> +    }
>> +
>> +    return 0;
>> +}
>> +
>>  /* Set up the (1:1) collection mapping for the given host CPU. */
>>  void gicv3_its_setup_collection(int cpu)
>>  {
>> @@ -297,6 +385,7 @@ int gicv3_its_init(struct host_its *hw_its)
>>
>>      reg = readq_relaxed(hw_its->its_base + GITS_TYPER);
>>      hw_its->pta = reg & GITS_TYPER_PTA;
>> +    hw_its->itte_size = ((reg >> 4) & 0xf) + 1;
>       can define a macro for these constants
>>
>>      for (i = 0; i < 8; i++)
>>      {
>> @@ -520,6 +609,7 @@ void gicv3_its_dt_init(const struct dt_device_node *node)
>>          its_data->size = size;
>>          its_data->dt_node = its;
>>          spin_lock_init(&its_data->cmd_lock);
>> +        INIT_LIST_HEAD(&its_data->its_devices);
>>
>>          printk("GICv3: Found ITS @0x%lx\n", addr);
>>
>> diff --git a/xen/include/asm-arm/gic-its.h b/xen/include/asm-arm/gic-its.h
>> index 512a388..4e9841a 100644
>> --- a/xen/include/asm-arm/gic-its.h
>> +++ b/xen/include/asm-arm/gic-its.h
>> @@ -79,6 +79,13 @@
>>  #ifndef __ASSEMBLY__
>>  #include <xen/device_tree.h>
>>
>> +struct its_devices {
>> +    struct list_head entry;
>> +    struct domain *d;
>> +    void *itt_addr;
>> +    int devid;
>> +};
>> +
>>  /* data structure for each hardware ITS */
>>  struct host_its {
>>      struct list_head entry;
>> @@ -88,6 +95,8 @@ struct host_its {
>>      void __iomem *its_base;
>>      spinlock_t cmd_lock;
>>      void *cmd_buf;
>> +    struct list_head its_devices;
>> +    int itte_size;
>>      bool pta;
>>  };
>>
>> @@ -114,6 +123,13 @@ void gicv3_set_redist_addr(paddr_t address, int redist_id);
>>  /* Map a collection for this host CPU to each host ITS. */
>>  void gicv3_its_setup_collection(int cpu);
>>
>> +/* Map a device on the host by allocating an ITT on the host (ITS).
>> + * "bits" specifies how many events (interrupts) this device will need.
>> + * Setting "valid" to false deallocates the device.
>> + */
>> +int gicv3_its_map_device(struct host_its *hw_its, struct domain *d,
>> +                         int devid, int bits, bool valid);
>> +
>>  int gicv3_lpi_allocate_host_lpi(struct host_its *its,
>>                                  uint32_t devid, uint32_t eventid,
>>                                  struct vcpu *v, int virt_lpi);
>> --
>> 2.9.0
>>
>>
> 



* Re: [RFC PATCH 08/24] ARM: GICv3: introduce separate pending_irq structs for LPIs
  2016-10-24 15:31   ` Vijay Kilari
@ 2016-11-03 19:47     ` Andre Przywara
  0 siblings, 0 replies; 144+ messages in thread
From: Andre Przywara @ 2016-11-03 19:47 UTC (permalink / raw)
  To: Vijay Kilari; +Cc: xen-devel, Julien Grall, Stefano Stabellini

Hi,

On 24/10/16 16:31, Vijay Kilari wrote:
> On Wed, Sep 28, 2016 at 11:54 PM, Andre Przywara <andre.przywara@arm.com> wrote:
>> For the same reason that allocating a struct irq_desc for each
>> possible LPI is not an option, having a struct pending_irq for each LPI
>> is also not feasible. However we actually only need those when an
>> interrupt is on a vCPU (or is about to be injected).
>> Maintain a list of those structs that we can use for the lifecycle of
>> a guest LPI. We allocate new entries if necessary, however reuse
>> pre-owned entries whenever possible.
>> Teach the existing VGIC functions to find the right pointer when being
>> given a virtual LPI number.
>>
>> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
>> ---
>>  xen/arch/arm/gic.c            |  3 +++
>>  xen/arch/arm/vgic-v3.c        |  2 ++
>>  xen/arch/arm/vgic.c           | 56 ++++++++++++++++++++++++++++++++++++++++---
>>  xen/include/asm-arm/domain.h  |  1 +
>>  xen/include/asm-arm/gic-its.h | 10 ++++++++
>>  xen/include/asm-arm/vgic.h    |  9 +++++++
>>  6 files changed, 78 insertions(+), 3 deletions(-)
>>
>> diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
>> index 63c744a..ebe4035 100644
>> --- a/xen/arch/arm/gic.c
>> +++ b/xen/arch/arm/gic.c
>> @@ -506,6 +506,9 @@ static void gic_update_one_lr(struct vcpu *v, int i)
>>                  struct vcpu *v_target = vgic_get_target_vcpu(v, irq);
>>                  irq_set_affinity(p->desc, cpumask_of(v_target->processor));
>>              }
>> +            /* If this was an LPI, mark this struct as available again. */
>> +            if ( p->irq >= 8192 )
>  Can define something like is_lpi(irq) and use it everywhere

Yes, that was on my list anyway.

>> +                p->irq = 0;
>>          }
>>      }
>>  }
>> diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c
>> index ec038a3..e9b6490 100644
>> --- a/xen/arch/arm/vgic-v3.c
>> +++ b/xen/arch/arm/vgic-v3.c
>> @@ -1388,6 +1388,8 @@ static int vgic_v3_vcpu_init(struct vcpu *v)
>>      if ( v->vcpu_id == last_cpu || (v->vcpu_id == (d->max_vcpus - 1)) )
>>          v->arch.vgic.flags |= VGIC_V3_RDIST_LAST;
>>
>> +    INIT_LIST_HEAD(&v->arch.vgic.pending_lpi_list);
>> +
>>      return 0;
>>  }
>>
>> diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c
>> index 0965119..b961551 100644
>> --- a/xen/arch/arm/vgic.c
>> +++ b/xen/arch/arm/vgic.c
>> @@ -31,6 +31,8 @@
>>  #include <asm/mmio.h>
>>  #include <asm/gic.h>
>>  #include <asm/vgic.h>
>> +#include <asm/gic_v3_defs.h>
>> +#include <asm/gic-its.h>
>>
>>  static inline struct vgic_irq_rank *vgic_get_rank(struct vcpu *v, int rank)
>>  {
>> @@ -61,7 +63,7 @@ struct vgic_irq_rank *vgic_rank_irq(struct vcpu *v, unsigned int irq)
>>      return vgic_get_rank(v, rank);
>>  }
>>
>> -static void vgic_init_pending_irq(struct pending_irq *p, unsigned int virq)
>> +void vgic_init_pending_irq(struct pending_irq *p, unsigned int virq)
>>  {
>>      INIT_LIST_HEAD(&p->inflight);
>>      INIT_LIST_HEAD(&p->lr_queue);
>> @@ -244,10 +246,14 @@ struct vcpu *vgic_get_target_vcpu(struct vcpu *v, unsigned int virq)
>>
>>  static int vgic_get_virq_priority(struct vcpu *v, unsigned int virq)
>>  {
>> -    struct vgic_irq_rank *rank = vgic_rank_irq(v, virq);
>> +    struct vgic_irq_rank *rank;
>>      unsigned long flags;
>>      int priority;
>>
>> +    if ( virq >= 8192 )
>> +        return gicv3_lpi_get_priority(v->domain, virq);
>> +
>> +    rank = vgic_rank_irq(v, virq);
>>      vgic_lock_rank(v, rank, flags);
>>      priority = rank->priority[virq & INTERRUPT_RANK_MASK];
>>      vgic_unlock_rank(v, rank, flags);
>> @@ -446,13 +452,55 @@ int vgic_to_sgi(struct vcpu *v, register_t sgir, enum gic_sgi_mode irqmode, int
>>      return 1;
>>  }
>>
>> +/*
>> + * Holding struct pending_irq's for each possible virtual LPI in each domain
>> + * requires too much Xen memory, also a malicious guest could potentially
>> + * spam Xen with LPI map requests. We cannot cover those with (guest allocated)
>> + * ITS memory, so we use a dynamic scheme of allocating struct pending_irq's
>> + * on demand.
>> + */
>> +struct pending_irq *lpi_to_pending(struct vcpu *v, unsigned int lpi,
>> +                                   bool allocate)
>> +{
>> +    struct lpi_pending_irq *lpi_irq, *empty = NULL;
>> +
>> +    /* TODO: locking! */
>> +    list_for_each_entry(lpi_irq, &v->arch.vgic.pending_lpi_list, entry)
>> +    {
>> +        if ( lpi_irq->pirq.irq == lpi )
>> +            return &lpi_irq->pirq;
>> +
>> +        if ( lpi_irq->pirq.irq == 0 && !empty )
>> +            empty = lpi_irq;
>> +    }
>    With this approach of allocating pending_irq on demand, if the entry
> is at position n in pending_lpi_list, it takes a long iteration to find
> the pending_irq entry. This will increase the LPI injection time for
> the domain.
> 
> Why can't we use a btree?

That's an optimization. You will find that the actual number of
interrupts on a VCPU at any given time is very low, especially if we
look at LPIs only. So my hunch is that it's either 0, 1 or 2, not more.
So for simplicity I'd keep it as a list for now. If we see issues, we
can amend this at any time.

>> +
>> +    if ( !allocate )
>> +        return NULL;
>> +
>> +    if ( !empty )
>> +    {
>> +        empty = xzalloc(struct lpi_pending_irq);
>> +        vgic_init_pending_irq(&empty->pirq, lpi);
>> +        list_add_tail(&empty->entry, &v->arch.vgic.pending_lpi_list);
>> +    } else
>> +    {
>> +        empty->pirq.status = 0;
>> +        empty->pirq.irq = lpi;
>> +    }
>> +
>> +    return &empty->pirq;
>> +}
>> +
>>  struct pending_irq *irq_to_pending(struct vcpu *v, unsigned int irq)
>>  {
>>      struct pending_irq *n;
>> +
>>      /* Pending irqs allocation strategy: the first vgic.nr_spis irqs
>>       * are used for SPIs; the rests are used for per cpu irqs */
>>      if ( irq < 32 )
>>          n = &v->arch.vgic.pending_irqs[irq];
>> +    else if ( irq >= 8192 )
>> +        n = lpi_to_pending(v, irq, true);
>>      else
>>          n = &v->domain->arch.vgic.pending_irqs[irq - 32];
>>      return n;
>> @@ -480,7 +528,7 @@ void vgic_clear_pending_irqs(struct vcpu *v)
>>  void vgic_vcpu_inject_irq(struct vcpu *v, unsigned int virq)
>>  {
>>      uint8_t priority;
>> -    struct pending_irq *iter, *n = irq_to_pending(v, virq);
>> +    struct pending_irq *iter, *n;
>>      unsigned long flags;
>>      bool_t running;
>>
>> @@ -488,6 +536,8 @@ void vgic_vcpu_inject_irq(struct vcpu *v, unsigned int virq)
>>
>>      spin_lock_irqsave(&v->arch.vgic.lock, flags);
>>
>> +    n = irq_to_pending(v, virq);
>> +
>>      /* vcpu offline */
>>      if ( test_bit(_VPF_down, &v->pause_flags) )
>>      {
>> diff --git a/xen/include/asm-arm/domain.h b/xen/include/asm-arm/domain.h
>> index 9452fcd..ae8a9de 100644
>> --- a/xen/include/asm-arm/domain.h
>> +++ b/xen/include/asm-arm/domain.h
>> @@ -249,6 +249,7 @@ struct arch_vcpu
>>          paddr_t rdist_base;
>>  #define VGIC_V3_RDIST_LAST  (1 << 0)        /* last vCPU of the rdist */
>>          uint8_t flags;
>> +        struct list_head pending_lpi_list;
>>      } vgic;
>>
>>      /* Timer registers  */
>> diff --git a/xen/include/asm-arm/gic-its.h b/xen/include/asm-arm/gic-its.h
>> index 4e9841a..1f881c0 100644
>> --- a/xen/include/asm-arm/gic-its.h
>> +++ b/xen/include/asm-arm/gic-its.h
>> @@ -136,6 +136,12 @@ int gicv3_lpi_allocate_host_lpi(struct host_its *its,
>>  int gicv3_lpi_drop_host_lpi(struct host_its *its,
>>                              uint32_t devid, uint32_t eventid,
>>                              uint32_t host_lpi);
>> +
>> +static inline int gicv3_lpi_get_priority(struct domain *d, uint32_t lpi)
>> +{
>> +    return GIC_PRI_IRQ;
>    Why is the LPI priority fixed? Can't we use the domain-set LPI priority?

Mmh, looks like a rebase artifact. The fixed value is correct for the
stub used when the ITS isn't configured, but the ITS version should
just be a prototype at this point.
The final file has it right, so I guess this gets amended in some later
patch. Thanks for spotting this.

Cheers,
Andre.

>> +}
>> +
>>  #else
>>
>>  static inline void gicv3_its_dt_init(const struct dt_device_node *node)
>> @@ -175,6 +181,10 @@ static inline int gicv3_lpi_drop_host_lpi(struct host_its *its,
>>  {
>>      return 0;
>>  }
>> +static inline int gicv3_lpi_get_priority(struct domain *d, uint32_t lpi)
>> +{
>> +    return GIC_PRI_IRQ;
>> +}
>>
>>  #endif /* CONFIG_HAS_ITS */
>>
>> diff --git a/xen/include/asm-arm/vgic.h b/xen/include/asm-arm/vgic.h
>> index 300f461..4e29ba6 100644
>> --- a/xen/include/asm-arm/vgic.h
>> +++ b/xen/include/asm-arm/vgic.h
>> @@ -83,6 +83,12 @@ struct pending_irq
>>      struct list_head lr_queue;
>>  };
>>
>> +struct lpi_pending_irq
>> +{
>> +    struct list_head entry;
>> +    struct pending_irq pirq;
>> +};
>> +
>>  #define NR_INTERRUPT_PER_RANK   32
>>  #define INTERRUPT_RANK_MASK (NR_INTERRUPT_PER_RANK - 1)
>>
>> @@ -296,8 +302,11 @@ extern struct vcpu *vgic_get_target_vcpu(struct vcpu *v, unsigned int virq);
>>  extern void vgic_vcpu_inject_irq(struct vcpu *v, unsigned int virq);
>>  extern void vgic_vcpu_inject_spi(struct domain *d, unsigned int virq);
>>  extern void vgic_clear_pending_irqs(struct vcpu *v);
>> +extern void vgic_init_pending_irq(struct pending_irq *p, unsigned int virq);
>>  extern struct pending_irq *irq_to_pending(struct vcpu *v, unsigned int irq);
>>  extern struct pending_irq *spi_to_pending(struct domain *d, unsigned int irq);
>> +extern struct pending_irq *lpi_to_pending(struct vcpu *v, unsigned int irq,
>> +                                          bool allocate);
>>  extern struct vgic_irq_rank *vgic_rank_offset(struct vcpu *v, int b, int n, int s);
>>  extern struct vgic_irq_rank *vgic_rank_irq(struct vcpu *v, unsigned int irq);
>>  extern int vgic_emulate(struct cpu_user_regs *regs, union hsr hsr);
>> --
>> 2.9.0
>>
>>
> 



* Re: [RFC PATCH 11/24] ARM: vGICv3: handle virtual LPI pending and property tables
  2016-10-24 15:32   ` Vijay Kilari
@ 2016-11-03 20:21     ` Andre Przywara
  2016-11-04 11:53       ` Julien Grall
  0 siblings, 1 reply; 144+ messages in thread
From: Andre Przywara @ 2016-11-03 20:21 UTC (permalink / raw)
  To: Vijay Kilari; +Cc: xen-devel, Julien Grall, Stefano Stabellini

Hi,

On 24/10/16 16:32, Vijay Kilari wrote:
> On Wed, Sep 28, 2016 at 11:54 PM, Andre Przywara <andre.przywara@arm.com> wrote:
>> Allow a guest to provide the address and size for the memory regions
>> it has reserved for the GICv3 pending and property tables.
>> We sanitise the various fields of the respective redistributor
>> registers and map those pages into Xen's address space to have easy
>> access.
>>
>> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
>> ---
>>  xen/arch/arm/vgic-v3.c        | 189 ++++++++++++++++++++++++++++++++++++++----
>>  xen/arch/arm/vgic.c           |   4 +
>>  xen/include/asm-arm/domain.h  |   7 +-
>>  xen/include/asm-arm/gic-its.h |  10 ++-
>>  xen/include/asm-arm/vgic.h    |   3 +
>>  5 files changed, 197 insertions(+), 16 deletions(-)
>>
>> diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c
>> index e9b6490..8fe8386 100644
>> --- a/xen/arch/arm/vgic-v3.c
>> +++ b/xen/arch/arm/vgic-v3.c
>> @@ -20,12 +20,14 @@
>>
>>  #include <xen/bitops.h>
>>  #include <xen/config.h>
>> +#include <xen/domain_page.h>
>>  #include <xen/lib.h>
>>  #include <xen/init.h>
>>  #include <xen/softirq.h>
>>  #include <xen/irq.h>
>>  #include <xen/sched.h>
>>  #include <xen/sizes.h>
>> +#include <xen/vmap.h>
>>  #include <asm/current.h>
>>  #include <asm/mmio.h>
>>  #include <asm/gic_v3_defs.h>
>> @@ -228,12 +230,14 @@ static int __vgic_v3_rdistr_rd_mmio_read(struct vcpu *v, mmio_info_t *info,
>>          goto read_reserved;
>>
>>      case VREG64(GICR_PROPBASER):
>> -        /* LPI's not implemented */
>> -        goto read_as_zero_64;
>> +        if ( !vgic_reg64_check_access(dabt) ) goto bad_width;
>> +        *r = vgic_reg64_extract(v->domain->arch.vgic.rdist_propbase, info);
>> +        return 1;
>>
>>      case VREG64(GICR_PENDBASER):
>> -        /* LPI's not implemented */
>> -        goto read_as_zero_64;
>> +        if ( !vgic_reg64_check_access(dabt) ) goto bad_width;
>> +        *r = vgic_reg64_extract(v->arch.vgic.rdist_pendbase, info);
>> +        return 1;
>>
>>      case 0x0080:
>>          goto read_reserved;
>> @@ -301,11 +305,6 @@ bad_width:
>>      domain_crash_synchronous();
>>      return 0;
>>
>> -read_as_zero_64:
>> -    if ( !vgic_reg64_check_access(dabt) ) goto bad_width;
>> -    *r = 0;
>> -    return 1;
>> -
>>  read_as_zero_32:
>>      if ( dabt.size != DABT_WORD ) goto bad_width;
>>      *r = 0;
>> @@ -330,11 +329,149 @@ read_unknown:
>>      return 1;
>>  }
>>
>> +static uint64_t vgic_sanitise_field(uint64_t reg, uint64_t field_mask,
>> +                                    int field_shift,
>> +                                    uint64_t (*sanitise_fn)(uint64_t))
>> +{
>> +    uint64_t field = (reg & field_mask) >> field_shift;
>> +
>> +    field = sanitise_fn(field) << field_shift;
>> +    return (reg & ~field_mask) | field;
>> +}
>> +
>> +/* We want to avoid outer shareable. */
>> +static uint64_t vgic_sanitise_shareability(uint64_t field)
>> +{
>> +    switch (field) {
>> +    case GIC_BASER_OuterShareable:
>> +        return GIC_BASER_InnerShareable;
>> +    default:
>> +        return field;
>> +    }
>> +}
>> +
>> +/* Avoid any inner non-cacheable mapping. */
>> +static uint64_t vgic_sanitise_inner_cacheability(uint64_t field)
>> +{
>> +    switch (field) {
>> +    case GIC_BASER_CACHE_nCnB:
>> +    case GIC_BASER_CACHE_nC:
>> +        return GIC_BASER_CACHE_RaWb;
>> +    default:
>> +        return field;
>> +    }
>> +}
>> +
>> +/* Non-cacheable or same-as-inner are OK. */
>> +static uint64_t vgic_sanitise_outer_cacheability(uint64_t field)
>> +{
>> +    switch (field) {
>> +    case GIC_BASER_CACHE_SameAsInner:
>> +    case GIC_BASER_CACHE_nC:
>> +        return field;
>> +    default:
>> +        return GIC_BASER_CACHE_nC;
>> +    }
>> +}
>> +
>> +static uint64_t sanitize_propbaser(uint64_t reg)
>> +{
>> +    reg = vgic_sanitise_field(reg, GICR_PROPBASER_SHAREABILITY_MASK,
>> +                              GICR_PROPBASER_SHAREABILITY_SHIFT,
>> +                              vgic_sanitise_shareability);
>> +    reg = vgic_sanitise_field(reg, GICR_PROPBASER_INNER_CACHEABILITY_MASK,
>> +                              GICR_PROPBASER_INNER_CACHEABILITY_SHIFT,
>> +                              vgic_sanitise_inner_cacheability);
>> +    reg = vgic_sanitise_field(reg, GICR_PROPBASER_OUTER_CACHEABILITY_MASK,
>> +                              GICR_PROPBASER_OUTER_CACHEABILITY_SHIFT,
>> +                              vgic_sanitise_outer_cacheability);
>> +
>> +    reg &= ~PROPBASER_RES0_MASK;
>> +    reg &= ~GENMASK(51, 48);
>> +    return reg;
>> +}
>> +
>> +static uint64_t sanitize_pendbaser(uint64_t reg)
>> +{
>> +    reg = vgic_sanitise_field(reg, GICR_PENDBASER_SHAREABILITY_MASK,
>> +                              GICR_PENDBASER_SHAREABILITY_SHIFT,
>> +                              vgic_sanitise_shareability);
>> +    reg = vgic_sanitise_field(reg, GICR_PENDBASER_INNER_CACHEABILITY_MASK,
>> +                              GICR_PENDBASER_INNER_CACHEABILITY_SHIFT,
>> +                              vgic_sanitise_inner_cacheability);
>> +    reg = vgic_sanitise_field(reg, GICR_PENDBASER_OUTER_CACHEABILITY_MASK,
>> +                              GICR_PENDBASER_OUTER_CACHEABILITY_SHIFT,
>> +                              vgic_sanitise_outer_cacheability);
>> +
>> +    reg &= ~PENDBASER_RES0_MASK;
>> +    reg &= ~GENMASK(51, 48);
>> +    return reg;
>> +}
>> +
>> +/*
>> + * Allow mapping some parts of guest memory into Xen's VA space to have easy
>> + * access to it. This is to allow ITS configuration data to be held in
>> + * guest memory and avoid using Xen memory for that.
>> + */
>> +void *map_guest_pages(struct domain *d, paddr_t guest_addr, int nr_pages)
>    I think this file is not the right place to put this generic function

Yeah, possibly.

>> +{
>> +    mfn_t onepage;
>> +    mfn_t *pages;
>> +    int i;
>> +    void *ptr;
>> +
>> +    /* TODO: free previous mapping, change prototype? use get-put-put? */
>> +
>> +    guest_addr &= PAGE_MASK;
>> +
>> +    if ( nr_pages == 1 )
>> +    {
>> +        pages = &onepage;
>> +    } else
>> +    {
>> +        pages = xmalloc_array(mfn_t, nr_pages);
>> +        if ( !pages )
>> +            return NULL;
>> +    }
>> +
>> +    for (i = 0; i < nr_pages; i++)
>> +    {
>> +        get_page_from_gfn(d, (guest_addr >> PAGE_SHIFT) + i, NULL, P2M_ALLOC);
> 
>              check return value of this function

Yes.

>> +        pages[i] = _mfn((guest_addr + i * PAGE_SIZE) >> PAGE_SHIFT);
>> +    }
>> +
>> +    ptr = vmap(pages, nr_pages);
>> +
>> +    if ( nr_pages > 1 )
>> +        xfree(pages);
>> +
>> +    return ptr;
>> +}
>> +
>> +void unmap_guest_pages(void *va, int nr_pages)
>       Same here. Could this be put in a generic file, like p2m.c?
>> +{
>> +    paddr_t pa;
>> +    unsigned long i;
>> +
>> +    if ( !va )
>> +        return;
>> +
>> +    va = (void *)((uintptr_t)va & PAGE_MASK);
>> +    pa = virt_to_maddr(va);
>   can use _pa()

Do you mean __pa(), which is defined to be exactly virt_to_maddr()?
I prefer the more verbose version, which is more readable, IMHO.

>> +
>> +    vunmap(va);
>> +    for (i = 0; i < nr_pages; i++)
>> +        put_page(mfn_to_page((pa >> PAGE_SHIFT) + i));
>> +
>> +    return;
>> +}
>> +
>>  static int __vgic_v3_rdistr_rd_mmio_write(struct vcpu *v, mmio_info_t *info,
>>                                            uint32_t gicr_reg,
>>                                            register_t r)
>>  {
>>      struct hsr_dabt dabt = info->dabt;
>> +    uint64_t reg;
>>
>>      switch ( gicr_reg )
>>      {
>> @@ -375,13 +512,37 @@ static int __vgic_v3_rdistr_rd_mmio_write(struct vcpu *v, mmio_info_t *info,
>>      case 0x0050:
>>          goto write_reserved;
>>
>> -    case VREG64(GICR_PROPBASER):
>> -        /* LPI is not implemented */
>> -        goto write_ignore_64;
>> +    case VREG64(GICR_PROPBASER): {
>> +        int nr_pages;
>> +
>> +        if ( info->dabt.size < DABT_WORD ) goto bad_width;
>> +        if ( v->arch.vgic.flags & VGIC_V3_LPIS_ENABLED )
>> +            return 1;
>> +
>> +        reg = v->domain->arch.vgic.rdist_propbase;
>> +        vgic_reg64_update(&reg, r, info);
>> +        reg = sanitize_propbaser(reg);
>> +        v->domain->arch.vgic.rdist_propbase = reg;
>>
>> +        nr_pages = BIT((v->domain->arch.vgic.rdist_propbase & 0x1f) + 1) - 8192;
>              should be validated against HOST_LPIS?

I don't think so. The actual LPI numbers are totally independent between
host and Dom0.
So why and how should this be matched?

>> +        nr_pages = DIV_ROUND_UP(nr_pages, PAGE_SIZE);
>> +        unmap_guest_pages(v->domain->arch.vgic.proptable, nr_pages);
>> +        v->domain->arch.vgic.proptable = map_guest_pages(v->domain,
>> +                                                         reg & GENMASK(47, 12),
>> +                                                         nr_pages);
>> +        return 1;
>> +    }
>>      case VREG64(GICR_PENDBASER):
>> -        /* LPI is not implemented */
>> -        goto write_ignore_64;
>> +        if ( info->dabt.size < DABT_WORD ) goto bad_width;
> 
>    check on VGIC_V3_LPIS_ENABLED is required

Right, forgot this.

>> +       reg = v->arch.vgic.rdist_pendbase;
>> +       vgic_reg64_update(&reg, r, info);
>> +       reg = sanitize_pendbaser(reg);
>> +       v->arch.vgic.rdist_pendbase = reg;
>> +
>> +        unmap_guest_pages(v->arch.vgic.pendtable, 16);
>      why only 16 pages are unmapped?

Well, it matches the allocation below, but I agree that this should
match the advertised number of LPIs in GICD_TYPER.
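To tie the mapping size to what GICD_TYPER advertises, the pending table size can be derived from the number of ID bits (per the GICv3 spec it holds one pending bit per interrupt ID, with the first 1 KB covering IDs 0-8191). A rough standalone sketch, not the actual Xen code:

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE 4096UL

/*
 * Illustrative only: with (idbits + 1) interrupt ID bits the pending
 * table needs 2^(idbits + 1) / 8 bytes. The fixed "16" in the hunk
 * above corresponds to idbits == 18: 2^19 bits == 64 KB == 16 pages.
 */
static unsigned long pendtable_nr_pages(unsigned int idbits)
{
    unsigned long nr_bytes = (1UL << (idbits + 1)) / 8;

    return (nr_bytes + PAGE_SIZE - 1) / PAGE_SIZE;
}
```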

Cheers,
Andre.

>> +       v->arch.vgic.pendtable = map_guest_pages(v->domain,
>> +                                                 reg & GENMASK(47, 12), 16);
>> +       return 1;
>>
>>      case 0x0080:
>>          goto write_reserved;
>> diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c
>> index b961551..4d9304f 100644
>> --- a/xen/arch/arm/vgic.c
>> +++ b/xen/arch/arm/vgic.c
>> @@ -488,6 +488,10 @@ struct pending_irq *lpi_to_pending(struct vcpu *v, unsigned int lpi,
>>          empty->pirq.irq = lpi;
>>      }
>>
>> +    /* Update the enabled status */
>> +    if ( gicv3_lpi_is_enabled(v->domain, lpi) )
>> +        set_bit(GIC_IRQ_GUEST_ENABLED, &empty->pirq.status);
>> +
>>      return &empty->pirq;
>>  }
>>
>> diff --git a/xen/include/asm-arm/domain.h b/xen/include/asm-arm/domain.h
>> index ae8a9de..0cd3500 100644
>> --- a/xen/include/asm-arm/domain.h
>> +++ b/xen/include/asm-arm/domain.h
>> @@ -109,6 +109,8 @@ struct arch_domain
>>          } *rdist_regions;
>>          int nr_regions;                     /* Number of rdist regions */
>>          uint32_t rdist_stride;              /* Re-Distributor stride */
>> +        uint64_t rdist_propbase;
>> +        uint8_t *proptable;
>>  #endif
>>      } vgic;
>>
>> @@ -247,7 +249,10 @@ struct arch_vcpu
>>
>>          /* GICv3: redistributor base and flags for this vCPU */
>>          paddr_t rdist_base;
>> -#define VGIC_V3_RDIST_LAST  (1 << 0)        /* last vCPU of the rdist */
>> +#define VGIC_V3_RDIST_LAST      (1 << 0)        /* last vCPU of the rdist */
>> +#define VGIC_V3_LPIS_ENABLED    (1 << 1)
>> +        uint64_t rdist_pendbase;
>> +        unsigned long *pendtable;
>>          uint8_t flags;
>>          struct list_head pending_lpi_list;
>>      } vgic;
>> diff --git a/xen/include/asm-arm/gic-its.h b/xen/include/asm-arm/gic-its.h
>> index 1f881c0..3b2e5c0 100644
>> --- a/xen/include/asm-arm/gic-its.h
>> +++ b/xen/include/asm-arm/gic-its.h
>> @@ -139,7 +139,11 @@ int gicv3_lpi_drop_host_lpi(struct host_its *its,
>>
>>  static inline int gicv3_lpi_get_priority(struct domain *d, uint32_t lpi)
>>  {
>> -    return GIC_PRI_IRQ;
>> +    return d->arch.vgic.proptable[lpi - 8192] & 0xfc;
>> +}
>> +static inline bool gicv3_lpi_is_enabled(struct domain *d, uint32_t lpi)
>> +{
>> +    return d->arch.vgic.proptable[lpi - 8192] & LPI_PROP_ENABLED;
>>  }
>>
>>  #else
>> @@ -185,6 +189,10 @@ static inline int gicv3_lpi_get_priority(struct domain *d, uint32_t lpi)
>>  {
>>      return GIC_PRI_IRQ;
>>  }
>> +static inline bool gicv3_lpi_is_enabled(struct domain *d, uint32_t lpi)
>> +{
>> +    return false;
>> +}
>>
>>  #endif /* CONFIG_HAS_ITS */
>>
>> diff --git a/xen/include/asm-arm/vgic.h b/xen/include/asm-arm/vgic.h
>> index 4e29ba6..2b216cc 100644
>> --- a/xen/include/asm-arm/vgic.h
>> +++ b/xen/include/asm-arm/vgic.h
>> @@ -285,6 +285,9 @@ VGIC_REG_HELPERS(32, 0x3);
>>
>>  #undef VGIC_REG_HELPERS
>>
>> +void *map_guest_pages(struct domain *d, paddr_t guest_addr, int nr_pages);
>> +void unmap_guest_pages(void *va, int nr_pages);
>> +
>>  enum gic_sgi_mode;
>>
>>  /*
>> --
>> 2.9.0
>>
>>
>> _______________________________________________
>> Xen-devel mailing list
>> Xen-devel@lists.xen.org
>> https://lists.xen.org/xen-devel
> 


^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [RFC PATCH 21/24] ARM: vITS: handle INVALL command
  2016-10-24 15:32   ` Vijay Kilari
@ 2016-11-04  9:22     ` Andre Przywara
  2016-11-10  0:21       ` Stefano Stabellini
  0 siblings, 1 reply; 144+ messages in thread
From: Andre Przywara @ 2016-11-04  9:22 UTC (permalink / raw)
  To: Vijay Kilari; +Cc: xen-devel, Julien Grall, Stefano Stabellini

Hi,

On 24/10/16 16:32, Vijay Kilari wrote:
> On Wed, Sep 28, 2016 at 11:54 PM, Andre Przywara <andre.przywara@arm.com> wrote:
>> The INVALL command instructs an ITS to invalidate the configuration
>> data for all LPIs associated with a given redistributor (read: VCPU).
>> To avoid iterating (and mapping!) all guest tables, we instead go through
>> the host LPI table to find any LPIs targeting this VCPU. We then update
>> the configuration bits for the connected virtual LPIs.
>>
>> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
>> ---
>>  xen/arch/arm/gic-its.c        | 58 +++++++++++++++++++++++++++++++++++++++++++
>>  xen/arch/arm/vgic-its.c       | 30 ++++++++++++++++++++++
>>  xen/include/asm-arm/gic-its.h |  2 ++
>>  3 files changed, 90 insertions(+)
>>
>> diff --git a/xen/arch/arm/gic-its.c b/xen/arch/arm/gic-its.c
>> index 6f4329f..5129d6e 100644
>> --- a/xen/arch/arm/gic-its.c
>> +++ b/xen/arch/arm/gic-its.c
>> @@ -228,6 +228,18 @@ static int its_send_cmd_inv(struct host_its *its,
>>      return its_send_command(its, cmd);
>>  }
>>
>> +static int its_send_cmd_invall(struct host_its *its, int cpu)
>> +{
>> +    uint64_t cmd[4];
>> +
>> +    cmd[0] = GITS_CMD_INVALL;
>> +    cmd[1] = 0x00;
>> +    cmd[2] = cpu & GENMASK(15, 0);
>> +    cmd[3] = 0x00;
>> +
>> +    return its_send_command(its, cmd);
>> +}
>> +
>>  int gicv3_its_map_device(struct host_its *hw_its, struct domain *d,
>>                           int devid, int bits, bool valid)
>>  {
>> @@ -668,6 +680,52 @@ uint32_t gicv3_lpi_lookup_lpi(struct domain *d, uint32_t host_lpi, int *vcpu_id)
>>      return hlpi.virt_lpi;
>>  }
>>
>> +/* Iterate over all host LPIs, and update the "enabled" state for a given
>> + * guest redistributor (VCPU) given the respective state in the provided
>> + * proptable. This proptable is indexed by the stored virtual LPI number.
>> + * This is to implement a guest INVALL command.
>> + */
>> +void gicv3_lpi_update_configurations(struct vcpu *v, uint8_t *proptable)
>> +{
>> +    int chunk, i;
>> +    struct host_its *its;
>> +
>> +    for (chunk = 0; chunk < MAX_HOST_LPIS / HOST_LPIS_PER_PAGE; chunk++)
>> +    {
>> +        if ( !lpi_data.host_lpis[chunk] )
>> +            continue;
>> +
>> +        for (i = 0; i < HOST_LPIS_PER_PAGE; i++)
>> +        {
>> +            union host_lpi *hlpip = &lpi_data.host_lpis[chunk][i], hlpi;
>> +            uint32_t hlpi_nr;
>> +
>> +            hlpi.data = hlpip->data;
>> +            if ( !hlpi.virt_lpi )
>> +                continue;
>> +
>> +            if ( hlpi.dom_id != v->domain->domain_id )
>> +                continue;
>> +
>> +            if ( hlpi.vcpu_id != v->vcpu_id )
>> +                continue;
>> +
>> +            hlpi_nr = chunk * HOST_LPIS_PER_PAGE + i;
>> +
>> +            if ( proptable[hlpi.virt_lpi] & LPI_PROP_ENABLED )
>> +                lpi_data.lpi_property[hlpi_nr - 8192] |= LPI_PROP_ENABLED;
>> +            else
>> +                lpi_data.lpi_property[hlpi_nr - 8192] &= ~LPI_PROP_ENABLED;
>> +        }
>> +    }
>         AFAIK, the initial design is to use a tasklet to update the property
> table, as updating it consumes a lot of time.

This is a possible, but premature optimisation.
Linux (at the moment, at least) only calls INVALL _once_, just after
initialising the collections. At this point no LPI is mapped yet, so the
whole routine does basically nothing - and does so quite fast.
We can introduce a fancier algorithm later if the need arises.
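The two-level host LPI table this loop walks can be sketched in isolation like so (toy sizes and simplified field types, purely illustrative of the walk-and-filter pattern):

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define LPIS_PER_PAGE 512   /* toy sizes; the real code derives these */
#define NR_CHUNKS     8

union host_lpi {
    uint64_t data;
    struct {
        uint32_t virt_lpi;
        uint16_t dom_id;
        uint16_t vcpu_id;
    };
};

/* First level: one pointer per chunk, allocated on demand (or not at all). */
static union host_lpi *host_lpis[NR_CHUNKS];

/* Walk all chunks and count mapped host LPIs routed to one domain/VCPU. */
static unsigned int count_lpis_for_vcpu(uint16_t dom_id, uint16_t vcpu_id)
{
    unsigned int chunk, i, count = 0;

    for ( chunk = 0; chunk < NR_CHUNKS; chunk++ )
    {
        if ( !host_lpis[chunk] )        /* skip never-allocated chunks */
            continue;

        for ( i = 0; i < LPIS_PER_PAGE; i++ )
        {
            union host_lpi hlpi = host_lpis[chunk][i];

            if ( hlpi.virt_lpi && hlpi.dom_id == dom_id &&
                 hlpi.vcpu_id == vcpu_id )
                count++;
        }
    }

    return count;
}

/* Map one fictional host LPI in chunk 2 so the walk has something to find. */
static void demo_populate(void)
{
    host_lpis[2] = calloc(LPIS_PER_PAGE, sizeof(union host_lpi));
    assert(host_lpis[2]);
    host_lpis[2][5].virt_lpi = 8192;
    host_lpis[2][5].dom_id = 0;
    host_lpis[2][5].vcpu_id = 1;
}
```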

Cheers,
Andre.


>> +
>> +    /* Tell all ITSes that they should update the property table for CPU 0,
>> +     * which is where we map all LPIs to.
>> +     */
>> +    list_for_each_entry(its, &host_its_list, entry)
>> +        its_send_cmd_invall(its, 0);
>> +}
>> +
>>  void gicv3_lpi_set_enable(struct host_its *its,
>>                            uint32_t deviceid, uint32_t eventid,
>>                            uint32_t host_lpi, bool enabled)
>> diff --git a/xen/arch/arm/vgic-its.c b/xen/arch/arm/vgic-its.c
>> index 74da8fc..1e429b7 100644
>> --- a/xen/arch/arm/vgic-its.c
>> +++ b/xen/arch/arm/vgic-its.c
>> @@ -294,6 +294,33 @@ out_unlock:
>>      return ret;
>>  }
>>
>> +/* INVALL updates the per-LPI configuration status for every LPI mapped to
>> + * this redistributor. For the guest side we don't need to update anything,
>> + * as we always refer to the actual table for the enabled bit and the
>> + * priority.
>> + * Enabling or disabling a virtual LPI however needs to be propagated to
>> + * the respective host LPI. Instead of iterating over all mapped LPIs in our
>> + * emulated GIC (which is expensive due to the required on-demand mapping),
>> + * we iterate over all mapped _host_ LPIs and filter for those which are
>> + * forwarded to this virtual redistributor.
>> + */
>> +static int its_handle_invall(struct virt_its *its, uint64_t *cmdptr)
>> +{
>> +    uint32_t collid = its_cmd_get_collection(cmdptr);
>> +    struct vcpu *vcpu;
>> +
>> +    spin_lock(&its->its_lock);
>> +    vcpu = get_vcpu_from_collection(its, collid);
>> +    spin_unlock(&its->its_lock);
>> +
>> +    if ( !vcpu )
>> +        return -1;
>> +
>> +    gicv3_lpi_update_configurations(vcpu, its->d->arch.vgic.proptable);
>> +
>> +    return 0;
>> +}
>> +
>>  static int its_handle_mapc(struct virt_its *its, uint64_t *cmdptr)
>>  {
>>      uint32_t collid = its_cmd_get_collection(cmdptr);
>> @@ -515,6 +542,9 @@ static int vgic_its_handle_cmds(struct domain *d, struct virt_its *its,
>>          case GITS_CMD_INV:
>>              its_handle_inv(its, cmdptr);
>>             break;
>> +        case GITS_CMD_INVALL:
>> +            its_handle_invall(its, cmdptr);
>> +           break;
>>          case GITS_CMD_MAPC:
>>              its_handle_mapc(its, cmdptr);
>>              break;
>> diff --git a/xen/include/asm-arm/gic-its.h b/xen/include/asm-arm/gic-its.h
>> index 2cdb3e1..ba6b2d5 100644
>> --- a/xen/include/asm-arm/gic-its.h
>> +++ b/xen/include/asm-arm/gic-its.h
>> @@ -146,6 +146,8 @@ int gicv3_lpi_drop_host_lpi(struct host_its *its,
>>                              uint32_t devid, uint32_t eventid,
>>                              uint32_t host_lpi);
>>
>> +void gicv3_lpi_update_configurations(struct vcpu *v, uint8_t *proptable);
>> +
>>  static inline int gicv3_lpi_get_priority(struct domain *d, uint32_t lpi)
>>  {
>>      return d->arch.vgic.proptable[lpi - 8192] & 0xfc;
>> --
>> 2.9.0
>>
>>
> 


* Re: [RFC PATCH 11/24] ARM: vGICv3: handle virtual LPI pending and property tables
  2016-11-03 20:21     ` Andre Przywara
@ 2016-11-04 11:53       ` Julien Grall
  0 siblings, 0 replies; 144+ messages in thread
From: Julien Grall @ 2016-11-04 11:53 UTC (permalink / raw)
  To: Andre Przywara, Vijay Kilari; +Cc: xen-devel, Stefano Stabellini

Hi,

On 03/11/16 20:21, Andre Przywara wrote:
> On 24/10/16 16:32, Vijay Kilari wrote:
>> On Wed, Sep 28, 2016 at 11:54 PM, Andre Przywara <andre.przywara@arm.com> wrote:
>>> +    va = (void *)((uintptr_t)va & PAGE_MASK);
>>> +    pa = virt_to_maddr(va);
>>   can use _pa()
>
> Do you mean __pa()? Which is defined to be exactly virt_to_maddr()?
> I prefer the more verbose version, which is more readable, IMHO.

FWIW, __pa tends to be used more often than virt_to_maddr within the code base.

-- 
Julien Grall



* Re: [RFC PATCH 12/24] ARM: vGICv3: introduce basic ITS emulation bits
  2016-11-03 19:26     ` Andre Przywara
@ 2016-11-04 12:07       ` Julien Grall
  0 siblings, 0 replies; 144+ messages in thread
From: Julien Grall @ 2016-11-04 12:07 UTC (permalink / raw)
  To: Andre Przywara, Vijay Kilari; +Cc: xen-devel, Stefano Stabellini

Hello Andre,

On 03/11/16 19:26, Andre Przywara wrote:
> On 24/10/16 16:31, Vijay Kilari wrote:
>> On Wed, Sep 28, 2016 at 11:54 PM, Andre Przywara <andre.przywara@arm.com> wrote:
>>> +    switch ( info->gpa & 0xffff )
>>> +    {
>>> +    case VREG32(GITS_CTLR):
>>> +        if ( info->dabt.size != DABT_WORD ) goto bad_width;
>>> +        *r = vgic_reg32_extract(its->enabled | BIT(31), info);
>>> +       break;
>>> +    case VREG32(GITS_IIDR):
>>> +        if ( info->dabt.size != DABT_WORD ) goto bad_width;
>>> +        *r = vgic_reg32_extract(GITS_IIDR_VALUE, info);
>>> +        break;
>>> +    case VREG64(GITS_TYPER):
>>> +        if ( info->dabt.size < DABT_WORD ) goto bad_width;
>>> +        *r = vgic_reg64_extract(0x1eff1, info);
>>        GITS_TYPER.HCC is not set. Should be max vcpus of the domain
>
> HCC is clear on purpose. We want the guest to provide memory for
> everything that it allocates, to avoid it to hog Xen with allocations.

Whilst I agree that we want to limit the memory allocated by Xen itself,
each collection entry is just 16 bits. So unless we want to support a
very big number of collections, I don't see any reason to ask the
guest to provision memory.

This makes the code more complex and you also have to validate the 
collection every time.

I remember the minimum number of collections an implementation has to
support is "max_vcpus + 1", but I can't find the statement in the spec again.
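To put a number on it: with 16-bit entries, a Xen-allocated collection table sized for (max_vcpus + 1) collections costs almost nothing, as this back-of-the-envelope sketch shows:

```c
#include <assert.h>
#include <stdint.h>

/* Each collection entry is a 16-bit VCPU ID, so the table for
 * (max_vcpus + 1) collections is tiny even for large guests. */
static unsigned long coll_table_bytes(unsigned int max_vcpus)
{
    return (max_vcpus + 1) * sizeof(uint16_t);
}
```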

Regards,

-- 
Julien Grall



* Re: [RFC PATCH 08/24] ARM: GICv3: introduce separate pending_irq structs for LPIs
  2016-09-28 18:24 ` [RFC PATCH 08/24] ARM: GICv3: introduce separate pending_irq structs for LPIs Andre Przywara
  2016-10-24 15:31   ` Vijay Kilari
  2016-10-28  1:04   ` Stefano Stabellini
@ 2016-11-04 15:46   ` Julien Grall
  2 siblings, 0 replies; 144+ messages in thread
From: Julien Grall @ 2016-11-04 15:46 UTC (permalink / raw)
  To: Andre Przywara, Stefano Stabellini; +Cc: xen-devel

Hi Andre,

On 28/09/16 19:24, Andre Przywara wrote:
> +/*
> + * Holding struct pending_irq's for each possible virtual LPI in each domain
> + * requires too much Xen memory, also a malicious guest could potentially
> + * spam Xen with LPI map requests. We cannot cover those with (guest allocated)
> + * ITS memory, so we use a dynamic scheme of allocating struct pending_irq's
> + * on demand.
> + */
> +struct pending_irq *lpi_to_pending(struct vcpu *v, unsigned int lpi,
> +                                   bool allocate)
> +{
> +    struct lpi_pending_irq *lpi_irq, *empty = NULL;
> +
> +    /* TODO: locking! */
> +    list_for_each_entry(lpi_irq, &v->arch.vgic.pending_lpi_list, entry)
> +    {
> +        if ( lpi_irq->pirq.irq == lpi )
> +            return &lpi_irq->pirq;
> +
> +        if ( lpi_irq->pirq.irq == 0 && !empty )
> +            empty = lpi_irq;
> +    }
> +
> +    if ( !allocate )
> +        return NULL;
> +
> +    if ( !empty )
> +    {
> +        empty = xzalloc(struct lpi_pending_irq);

xzalloc can return NULL if we fail to allocate memory.

> +        vgic_init_pending_irq(&empty->pirq, lpi);
> +        list_add_tail(&empty->entry, &v->arch.vgic.pending_lpi_list);
> +    } else
> +    {
> +        empty->pirq.status = 0;
> +        empty->pirq.irq = lpi;
> +    }
> +
> +    return &empty->pirq;
> +}
> +
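The missing allocation-failure check would make the allocation path look roughly like this (a simplified stand-in outside the real Xen context, with calloc() playing the role of xzalloc() and a plain singly linked list instead of pending_lpi_list):

```c
#include <assert.h>
#include <stdlib.h>

struct pirq { unsigned int irq; unsigned int status; };
struct lpi_pending_irq { struct pirq pirq; struct lpi_pending_irq *next; };

static struct pirq *alloc_pending(struct lpi_pending_irq **head,
                                  unsigned int lpi)
{
    struct lpi_pending_irq *empty = calloc(1, sizeof(*empty));

    if ( !empty )       /* the check missing from the hunk above */
        return NULL;

    empty->pirq.irq = lpi;
    empty->next = *head;    /* head insert; the original uses list_add_tail() */
    *head = empty;

    return &empty->pirq;
}

/* Allocate one entry and report the LPI number it was tagged with. */
static unsigned int demo(void)
{
    struct lpi_pending_irq *head = NULL;
    struct pirq *p = alloc_pending(&head, 8192);

    return p ? p->irq : 0;
}
```

The caller (lpi_to_pending() here) would then have to propagate the NULL up as well.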

Regards,

-- 
Julien Grall



* Re: [RFC PATCH 13/24] ARM: vITS: handle CLEAR command
  2016-09-28 18:24 ` [RFC PATCH 13/24] ARM: vITS: handle CLEAR command Andre Przywara
@ 2016-11-04 15:48   ` Julien Grall
  2016-11-09  0:39   ` Stefano Stabellini
  1 sibling, 0 replies; 144+ messages in thread
From: Julien Grall @ 2016-11-04 15:48 UTC (permalink / raw)
  To: Andre Przywara, Stefano Stabellini; +Cc: xen-devel

Hi Andre,

On 28/09/16 19:24, Andre Przywara wrote:
> This introduces the ITS command handler for the CLEAR command, which
> clears the pending state of an LPI.
> This removes a not-yet injected, but already queued IRQ from a VCPU.
>
> In addition this patch introduces the lookup function which translates
> a given DeviceID/EventID pair into a pointer to our vITTE structure.
>
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> ---
>  xen/arch/arm/vgic-its.c | 115 ++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 115 insertions(+)
>
> diff --git a/xen/arch/arm/vgic-its.c b/xen/arch/arm/vgic-its.c
> index 875b992..99d9e9c 100644
> --- a/xen/arch/arm/vgic-its.c
> +++ b/xen/arch/arm/vgic-its.c
> @@ -61,6 +61,73 @@ struct vits_itte
>      uint64_t collection:16;
>  };
>
> +#define UNMAPPED_COLLECTION      ((uint16_t)~0)
> +
> +/* Must be called with the ITS lock held. */

This comment is a call to have an ASSERT in the function.

> +static struct vcpu *get_vcpu_from_collection(struct virt_its *its, int collid)

Please use unsigned int.

> +{
> +    uint16_t vcpu_id;
> +
> +    if ( collid >= its->max_collections )
> +        return NULL;
> +
> +    vcpu_id = its->coll_table[collid];
> +    if ( vcpu_id == UNMAPPED_COLLECTION || vcpu_id >= its->d->max_vcpus )
> +        return NULL;
> +
> +    return its->d->vcpu[vcpu_id];
> +}
> +
> +#define DEV_TABLE_ITT_ADDR(x) ((x) & GENMASK(51, 8))
> +#define DEV_TABLE_ITT_SIZE(x) (BIT(((x) & GENMASK(7, 0)) + 1))

The layout of dev_table[...] really needs to be explained. It took me 
quite a while to understand how it works. For instance why you skip the 
first 8 bits for the address...
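For what it's worth, the packing can be restated standalone: one dev_table[] entry carries the ITT base address in bits [51:8] (the ITT is at least 256-byte aligned, so address bits [7:0] are always zero and can be reused) and "event ID bits minus one" in bits [7:0]:

```c
#include <assert.h>
#include <stdint.h>

#define GENMASK(h, l) \
    (((~0ULL) << (l)) & (~0ULL >> (63 - (h))))

/* Same packing as the quoted macros: address in [51:8], size-1 in [7:0]. */
#define DEV_TABLE_ITT_ADDR(x) ((x) & GENMASK(51, 8))
#define DEV_TABLE_ITT_SIZE(x) (1ULL << (((x) & GENMASK(7, 0)) + 1))
#define DEV_TABLE_ENTRY(addr, bits) \
    (((addr) & GENMASK(51, 8)) | (((bits) - 1) & GENMASK(7, 0)))
```

So an entry built for a 256-byte-aligned ITT and 5 event ID bits decodes back to the same address and to 2^5 = 32 events.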

> +#define DEV_TABLE_ENTRY(addr, bits)                     \
> +        (((addr) & GENMASK(51, 8)) | (((bits) - 1) & GENMASK(7, 0)))
> +
> +static paddr_t get_itte_address(struct virt_its *its,
> +                                uint32_t devid, uint32_t evid)
> +{
> +    paddr_t addr;
> +
> +    if ( devid >= its->max_devices )
> +        return ~0;

Please use INVALID_PADDR here.

> +
> +    if ( evid >= DEV_TABLE_ITT_SIZE(its->dev_table[devid]) )
> +        return ~0;

Ditto.

> +
> +    addr = DEV_TABLE_ITT_ADDR(its->dev_table[devid]);
> +
> +    return addr + evid * sizeof(struct vits_itte);
> +}
> +
> +/* Looks up a given deviceID/eventID pair on an ITS and returns a pointer to

Coding style:

/*
  * Foo

> + * the corresponding ITTE. This maps the respective guest page into Xen.
> + * Once finished with handling the ITTE, call put_devid_evid() to unmap
> + * the page again.
> + * Must be called with the ITS lock held.

This is a call for an ASSERT in the code.

> + */
> +static struct vits_itte *get_devid_evid(struct virt_its *its,
> +                                        uint32_t devid, uint32_t evid)

The naming of the function is confusing. It doesn't look up a device 
ID/event ID but an ITTE. So I would rename it to find_itte.

> +{
> +    paddr_t addr = get_itte_address(its, devid, evid);
> +    struct vits_itte *itte;
> +
> +    if (addr == ~0)

Coding style: if ( ... )

And use INVALID_PADDR instead of ~0.

> +        return NULL;
> +
> +    /* TODO: check locking for map_guest_pages() */
> +    itte = map_guest_pages(its->d, addr & PAGE_MASK, 1);
> +    if ( !itte )
> +        return NULL;
> +
> +    return itte + (addr & ~PAGE_MASK) / sizeof(struct vits_itte);
> +}
> +
> +/* Must be called with the ITS lock held. */
> +static void put_devid_evid(struct virt_its *its, struct vits_itte *itte)
> +{
> +    unmap_guest_pages(itte, 1);
> +}
> +
>  /**************************************
>   * Functions that handle ITS commands *
>   **************************************/
> @@ -80,6 +147,51 @@ static uint64_t its_cmd_mask_field(uint64_t *its_cmd,
>  #define its_cmd_get_target_addr(cmd)    its_cmd_mask_field(cmd, 2, 16, 32)
>  #define its_cmd_get_validbit(cmd)       its_cmd_mask_field(cmd, 2, 63,  1)
>
> +static int its_handle_clear(struct virt_its *its, uint64_t *cmdptr)
> +{
> +    uint32_t devid = its_cmd_get_deviceid(cmdptr);
> +    uint32_t eventid = its_cmd_get_id(cmdptr);
> +    struct pending_irq *pirq;
> +    struct vits_itte *itte;
> +    struct vcpu *vcpu;
> +    uint32_t vlpi;
> +
> +    spin_lock(&its->its_lock);
> +
> +    itte = get_devid_evid(its, devid, eventid);
> +    if ( !itte )
> +    {
> +        spin_unlock(&its->its_lock);
> +        return -1;
> +    }
> +
> +    vcpu = get_vcpu_from_collection(its, itte->collection);
> +    if ( !vcpu )
> +    {
> +        spin_unlock(&its->its_lock);
> +        return -1;
> +    }
> +
> +    vlpi = itte->vlpi;
> +
> +    put_devid_evid(its, itte);
> +    spin_unlock(&its->its_lock);
> +
> +    /* Remove a pending, but not yet injected guest IRQ. */
> +    pirq = lpi_to_pending(vcpu, vlpi, false);
> +    if ( pirq )
> +    {
> +        clear_bit(GIC_IRQ_GUEST_QUEUED, &pirq->status);
> +        gic_remove_from_queues(vcpu, vlpi);
> +
> +        /* Mark this pending IRQ struct as availabe again. */

NIT: s/availabe/available/

> +        if ( !test_bit(GIC_IRQ_GUEST_VISIBLE, &pirq->status) )
> +            pirq->irq = 0;

This code should be in a separate helper. It will be helpful to make the 
structure available again easily without open coding it.

> +    }
> +
> +    return 0;
> +}
> +
>  #define ITS_CMD_BUFFER_SIZE(baser)      ((((baser) & 0xff) + 1) << 12)
>
>  static int vgic_its_handle_cmds(struct domain *d, struct virt_its *its,
> @@ -100,6 +212,9 @@ static int vgic_its_handle_cmds(struct domain *d, struct virt_its *its,
>          cmdptr = its->cmdbuf + (its->creadr / sizeof(*its->cmdbuf));
>          switch (its_cmd_get_command(cmdptr))
>          {
> +        case GITS_CMD_CLEAR:
> +            its_handle_clear(its, cmdptr);

Shouldn't you check the return value of its_handle_clear()?

> +            break;
>          case GITS_CMD_SYNC:
>              /* We handle ITS commands synchronously, so we ignore SYNC. */
>  	    break;
>

Regards,

-- 
Julien Grall



* Re: [RFC PATCH 12/24] ARM: vGICv3: introduce basic ITS emulation bits
  2016-09-28 18:24 ` [RFC PATCH 12/24] ARM: vGICv3: introduce basic ITS emulation bits Andre Przywara
                     ` (2 preceding siblings ...)
  2016-11-03 17:50   ` Julien Grall
@ 2016-11-08 23:54   ` Stefano Stabellini
  3 siblings, 0 replies; 144+ messages in thread
From: Stefano Stabellini @ 2016-11-08 23:54 UTC (permalink / raw)
  To: Andre Przywara; +Cc: xen-devel, Julien Grall, Stefano Stabellini

On Wed, 28 Sep 2016, Andre Przywara wrote:
> Create a new file to hold the emulation code for the ITS widget.
> For now we emulate the memory mapped ITS registers and provide a stub
> to introduce the ITS command handling framework (but without actually
> emulating any commands at this time).
> 
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> ---
>  xen/arch/arm/Makefile             |   1 +
>  xen/arch/arm/vgic-its.c           | 378 ++++++++++++++++++++++++++++++++++++++
>  xen/arch/arm/vgic-v3.c            |   9 -
>  xen/include/asm-arm/gic_v3_defs.h |  19 ++
>  4 files changed, 398 insertions(+), 9 deletions(-)
>  create mode 100644 xen/arch/arm/vgic-its.c
> 
> diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
> index c2c4daa..cb0201f 100644
> --- a/xen/arch/arm/Makefile
> +++ b/xen/arch/arm/Makefile
> @@ -44,6 +44,7 @@ obj-y += traps.o
>  obj-y += vgic.o
>  obj-y += vgic-v2.o
>  obj-$(CONFIG_ARM_64) += vgic-v3.o
> +obj-$(CONFIG_HAS_ITS) += vgic-its.o
>  obj-y += vm_event.o
>  obj-y += vtimer.o
>  obj-y += vpsci.o
> diff --git a/xen/arch/arm/vgic-its.c b/xen/arch/arm/vgic-its.c
> new file mode 100644
> index 0000000..875b992
> --- /dev/null
> +++ b/xen/arch/arm/vgic-its.c
> @@ -0,0 +1,378 @@
> +/*
> + * xen/arch/arm/vgic-its.c
> + *
> + * ARM Interrupt Translation Service (ITS) emulation
> + *
> + * Andre Przywara <andre.przywara@arm.com>
> + * Copyright (c) 2016 ARM Ltd.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + */
> +
> +#include <xen/bitops.h>
> +#include <xen/config.h>
> +#include <xen/domain_page.h>
> +#include <xen/lib.h>
> +#include <xen/init.h>
> +#include <xen/softirq.h>
> +#include <xen/irq.h>
> +#include <xen/sched.h>
> +#include <xen/sizes.h>
> +#include <asm/current.h>
> +#include <asm/mmio.h>
> +#include <asm/gic_v3_defs.h>
> +#include <asm/gic-its.h>
> +#include <asm/vgic.h>
> +#include <asm/vgic-emul.h>
> +
> +/* Data structure to describe a virtual ITS */
> +struct virt_its {
> +    struct domain *d;
> +    struct host_its *hw_its;
> +    spinlock_t vcmd_lock;       /* protects the virtual command buffer */
> +    uint64_t cbaser;
> +    uint64_t *cmdbuf;
> +    int cwriter;
> +    int creadr;
> +    spinlock_t its_lock;        /* protects the collection and device tables */
> +    uint64_t baser0, baser1;
> +    uint16_t *coll_table;
> +    int max_collections;
> +    uint64_t *dev_table;
> +    int max_devices;
> +    bool enabled;
> +};
> +
> +/* An Interrupt Translation Table Entry: this is indexed by a
> + * DeviceID/EventID pair and is located in guest memory.
> + */
> +struct vits_itte
> +{
> +    uint64_t hlpi:24;
> +    uint64_t vlpi:24;
> +    uint64_t collection:16;
> +};
> +
> +/**************************************
> + * Functions that handle ITS commands *
> + **************************************/
> +
> +static uint64_t its_cmd_mask_field(uint64_t *its_cmd,
> +                                   int word, int shift, int size)
> +{
> +    return (le64_to_cpu(its_cmd[word]) >> shift) & (BIT(size) - 1);
> +}
> +
> +#define its_cmd_get_command(cmd)        its_cmd_mask_field(cmd, 0,  0,  8)
> +#define its_cmd_get_deviceid(cmd)       its_cmd_mask_field(cmd, 0, 32, 32)
> +#define its_cmd_get_size(cmd)           its_cmd_mask_field(cmd, 1,  0,  5)
> +#define its_cmd_get_id(cmd)             its_cmd_mask_field(cmd, 1,  0, 32)
> +#define its_cmd_get_physical_id(cmd)    its_cmd_mask_field(cmd, 1, 32, 32)
> +#define its_cmd_get_collection(cmd)     its_cmd_mask_field(cmd, 2,  0, 16)
> +#define its_cmd_get_target_addr(cmd)    its_cmd_mask_field(cmd, 2, 16, 32)
> +#define its_cmd_get_validbit(cmd)       its_cmd_mask_field(cmd, 2, 63,  1)
> +
> +#define ITS_CMD_BUFFER_SIZE(baser)      ((((baser) & 0xff) + 1) << 12)
> +
> +static int vgic_its_handle_cmds(struct domain *d, struct virt_its *its,
> +                                uint32_t writer)
> +{
> +    uint64_t *cmdptr;
> +
> +    if ( !its->cmdbuf )
> +        return -1;
> +
> +    if ( writer >= ITS_CMD_BUFFER_SIZE(its->cbaser) )
> +        return -1;
> +
> +    spin_lock(&its->vcmd_lock);
> +
> +    while ( its->creadr != writer )
> +    {
> +        cmdptr = its->cmdbuf + (its->creadr / sizeof(*its->cmdbuf));
> +
> +        switch (its_cmd_get_command(cmdptr))
> +        {
> +        case GITS_CMD_SYNC:
> +            /* We handle ITS commands synchronously, so we ignore SYNC. */
> +	    break;

indentation


> +        default:
> +            gdprintk(XENLOG_G_WARNING, "ITS: unhandled ITS command %ld\n",
> +                   its_cmd_get_command(cmdptr));
> +            break;
> +        }
> +
> +        its->creadr += ITS_CMD_SIZE;
> +        if ( its->creadr == ITS_CMD_BUFFER_SIZE(its->cbaser) )
> +            its->creadr = 0;
> +    }
> +    its->cwriter = writer;
> +
> +    spin_unlock(&its->vcmd_lock);
> +
> +    return 0;
> +}
> +
> +/*****************************
> + * ITS registers read access *
> + *****************************/
> +
> +/* The physical address is encoded slightly differently depending on
> + * the used page size: the highest four bits are stored in the lowest
> + * four bits of the field for 64K pages.
> + */
> +static paddr_t get_baser_phys_addr(uint64_t reg)
> +{
> +    if ( reg & BIT(9) )
> +        return (reg & GENMASK(47, 16)) | ((reg & GENMASK(15, 12)) << 36);
> +    else
> +        return reg & GENMASK(47, 12);
> +}

I would simplify the code by supporting only one page size, maybe 4K.


> +
> +static int vgic_v3_its_mmio_read(struct vcpu *v, mmio_info_t *info,
> +                                 register_t *r, void *priv)
> +{
> +    struct virt_its *its = priv;
> +
> +    switch ( info->gpa & 0xffff )
> +    {
> +    case VREG32(GITS_CTLR):
> +        if ( info->dabt.size != DABT_WORD ) goto bad_width;
> +        *r = vgic_reg32_extract(its->enabled | BIT(31), info);
> +	break;

indentation


> +    case VREG32(GITS_IIDR):
> +        if ( info->dabt.size != DABT_WORD ) goto bad_width;
> +        *r = vgic_reg32_extract(GITS_IIDR_VALUE, info);
> +        break;
> +    case VREG64(GITS_TYPER):
> +        if ( info->dabt.size < DABT_WORD ) goto bad_width;
> +        *r = vgic_reg64_extract(0x1eff1, info);

please #define 0x1eff1


> +        break;
> +    case VREG64(GITS_CBASER):
> +        if ( info->dabt.size < DABT_WORD ) goto bad_width;
> +        *r = vgic_reg64_extract(its->cbaser, info);
> +        break;
> +    case VREG64(GITS_CWRITER):
> +        if ( info->dabt.size < DABT_WORD ) goto bad_width;
> +        *r = vgic_reg64_extract(its->cwriter, info);
> +        break;
> +    case VREG64(GITS_CREADR):
> +        if ( info->dabt.size < DABT_WORD ) goto bad_width;
> +        *r = vgic_reg64_extract(its->creadr, info);
> +        break;
> +    case VREG64(GITS_BASER0):
> +        if ( info->dabt.size < DABT_WORD ) goto bad_width;
> +        *r = vgic_reg64_extract(its->baser0, info);
> +        break;
> +    case VREG64(GITS_BASER1):
> +        if ( info->dabt.size < DABT_WORD ) goto bad_width;
> +        *r = vgic_reg64_extract(its->baser1, info);
> +        break;
> +    case VRANGE64(GITS_BASER2, GITS_BASER7):
> +        if ( info->dabt.size < DABT_WORD ) goto bad_width;
> +        *r = vgic_reg64_extract(0, info);
> +        break;

I notice that this patch lacks the code to initialize the vits registers
to sensible defaults. For example, who initializes the entry size
(52:48) of GITS_BASER?
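For context, the Entry_Size field at bits [52:48] encodes "entry size in bytes minus one"; a sketch of setting and reading it back (GITS_BASER_ENTRY_SIZE_SHIFT assumed to be 48, as in the write handler further down):

```c
#include <assert.h>
#include <stdint.h>

#define GITS_BASER_ENTRY_SIZE_SHIFT 48

/* Store (bytes - 1) into GITS_BASER bits [52:48]. */
static uint64_t baser_set_entry_size(uint64_t reg, unsigned int bytes)
{
    reg &= ~(0x1fULL << GITS_BASER_ENTRY_SIZE_SHIFT);
    reg |= (uint64_t)(bytes - 1) << GITS_BASER_ENTRY_SIZE_SHIFT;
    return reg;
}

/* Decode the field back to a size in bytes. */
static unsigned int baser_entry_size(uint64_t reg)
{
    return ((reg >> GITS_BASER_ENTRY_SIZE_SHIFT) & 0x1f) + 1;
}
```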


> +    case VREG32(GICD_PIDR2):
> +        if ( info->dabt.size != DABT_WORD ) goto bad_width;
> +        *r = vgic_reg32_extract(GICV3_GICD_PIDR2, info);
> +        break;
> +    }
> +
> +    return 1;
> +
> +bad_width:
> +    domain_crash_synchronous();
> +
> +    return 0;
> +}
> +
> +/******************************
> + * ITS registers write access *
> + ******************************/
> +
> +static int its_baser_table_size(uint64_t baser)
> +{
> +    int page_size = 0;
> +
> +    switch ( (baser >> 8) & 3 )
> +    {
> +    case 0: page_size = SZ_4K; break;
> +    case 1: page_size = SZ_16K; break;
> +    case 2:
> +    case 3: page_size = SZ_64K; break;
> +    }
> +
> +    return page_size * ((baser & GENMASK(7, 0)) + 1);
> +}
> +
> +static int its_baser_nr_entries(uint64_t baser)
> +{
> +    int entry_size = ((baser & GENMASK(52, 48)) >> 48) + 1;
> +
> +    return its_baser_table_size(baser) / entry_size;
> +}
> +
> +static int vgic_v3_its_mmio_write(struct vcpu *v, mmio_info_t *info,
> +                                  register_t r, void *priv)
> +{
> +    struct domain *d = v->domain;
> +    struct virt_its *its = priv;
> +    uint64_t reg;
> +    uint32_t ctlr;
> +
> +    switch ( info->gpa & 0xffff )
> +    {
> +    case VREG32(GITS_CTLR):
> +        ctlr = its->enabled ? GITS_CTLR_ENABLE : 0;
> +        if ( info->dabt.size != DABT_WORD ) goto bad_width;
> +	vgic_reg32_update(&ctlr, r, info);
> +	its->enabled = ctlr & GITS_CTLR_ENABLE;
> +	/* TODO: trigger something ... */

indentation



> +        return 1;
> +    case VREG32(GITS_IIDR):
> +        goto write_ignore_32;
> +    case VREG32(GITS_TYPER):
> +        goto write_ignore_32;
> +    case VREG64(GITS_CBASER):
> +        if ( info->dabt.size < DABT_WORD ) goto bad_width;
> +
> +        /* Changing base registers with the ITS enabled is UNPREDICTABLE. */
> +        if ( its->enabled )

It is worth printing an error (gdprintk).


> +            return 1;
> +
> +        reg = its->cbaser;
> +        vgic_reg64_update(&reg, r, info);
> +        /* TODO: sanitise! */

Yeah, we really need to do that :-)


> +        its->cbaser = reg;
> +
> +        if ( reg & BIT(63) )
> +        {
> +            its->cmdbuf = map_guest_pages(d, reg & GENMASK(51, 12), 1);

This is only one page, there is no need to use the vmap.


> +        }
> +        else
> +        {
> +            unmap_guest_pages(its->cmdbuf, 1);
> +            its->cmdbuf = NULL;
> +        }
> +
> +	return 1;

indentation


> +    case VREG64(GITS_CWRITER):
> +        if ( info->dabt.size < DABT_WORD ) goto bad_width;
> +        reg = its->cwriter;
> +        vgic_reg64_update(&reg, r, info);
> +        vgic_its_handle_cmds(d, its, reg);
> +        return 1;
> +    case VREG64(GITS_CREADR):
> +        goto write_ignore_64;
> +    case VREG64(GITS_BASER0):
> +        if ( info->dabt.size < DABT_WORD ) goto bad_width;
> +
> +        /* Changing base registers with the ITS enabled is UNPREDICTABLE. */
> +        if ( its->enabled )

please add a warning


> +            return 1;
> +
> +        reg = its->baser0;
> +        vgic_reg64_update(&reg, r, info);
> +
> +        reg &= ~GITS_BASER_RO_MASK;
> +        reg |= (sizeof(uint64_t) - 1) << GITS_BASER_ENTRY_SIZE_SHIFT;
> +        reg |= GITS_BASER_TYPE_DEVICE << GITS_BASER_TYPE_SHIFT;

Why not | with its->baser0?


> +        /* TODO: sanitise! */

Indeed


> +        /* TODO: locking(?) */

vITS state can be modified concurrently by two or more vCPUs, so
anything that changes shared state accessible by multiple vCPUs needs a
lock.
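
A minimal illustration of the pattern being asked for (a toy C11 spinlock
standing in for Xen's spinlock_t, and a trimmed-down stand-in for the vITS
structure — none of these names are real Xen code): the register copy and the
table pointer must change together, so both are only touched while holding
the lock.

```c
#include <stdatomic.h>
#include <stdint.h>
#include <stddef.h>

/* Hypothetical, trimmed-down vITS state: baser0 and dev_table must stay
 * consistent with each other, so both are only updated under its_lock. */
struct toy_vits {
    atomic_flag its_lock;
    uint64_t baser0;
    void *dev_table;
};

static void toy_write_baser0(struct toy_vits *its, uint64_t reg, void *table)
{
    while ( atomic_flag_test_and_set(&its->its_lock) )
        ;                       /* spin until the lock is free */
    its->dev_table = table;     /* the old mapping would be dropped here */
    its->baser0 = reg;
    atomic_flag_clear(&its->its_lock);
}
```

Without the lock, one vCPU could unmap a table that another vCPU had just
installed, leaving `baser0` and `dev_table` describing different tables.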


> +        if ( reg & GITS_BASER_VALID )
> +        {
> +            its->dev_table = map_guest_pages(d,
> +                                             get_baser_phys_addr(reg),
> +                                             its_baser_table_size(reg) >> PAGE_SHIFT);
> +            its->max_devices = its_baser_nr_entries(reg);
> +            memset(its->dev_table, 0, its->max_devices * sizeof(uint64_t));
> +        }
> +        else
> +        {
> +            unmap_guest_pages(its->dev_table,
> +                              its_baser_table_size(reg) >> PAGE_SHIFT);
> +            its->max_devices = 0;
> +        }
> +
> +        its->baser0 = reg;
> +        return 1;
> +    case VREG64(GITS_BASER1):

We need to be able to share this code with the GITS_BASER0 case above
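
One possible shape for the shared part, sketched standalone (the mapping call
itself is left out so only the common geometry computation is shown; these
helper names are invented, though the BASER field layout matches the code
above):

```c
#include <stdint.h>
#include <stddef.h>

#define GITS_BASER_VALID (1ULL << 63)

/* Page size is encoded in bits [9:8], number of pages minus one in [7:0]. */
static size_t baser_table_size(uint64_t baser)
{
    static const size_t psz[4] = { 4096, 16384, 65536, 65536 };

    return psz[(baser >> 8) & 3] * ((baser & 0xff) + 1);
}

/* Shared tail of the GITS_BASER0/GITS_BASER1 write handlers: how many
 * entries of entry_size bytes fit, or 0 if the table is not valid. */
static size_t baser_max_entries(uint64_t baser, size_t entry_size)
{
    if ( !(baser & GITS_BASER_VALID) )
        return 0;

    return baser_table_size(baser) / entry_size;
}
```

The BASER0 case would call this with `sizeof(uint64_t)` and the BASER1 case
with `sizeof(uint16_t)`, removing the duplicated map/memset/unmap logic.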


> +        if ( info->dabt.size < DABT_WORD ) goto bad_width;
> +
> +        /* Changing base registers with the ITS enabled is UNPREDICTABLE. */
> +        if ( its->enabled )
> +            return 1;
> +
> +        reg = its->baser1;
> +        vgic_reg64_update(&reg, r, info);
> +        reg &= ~GITS_BASER_RO_MASK;
> +        reg |= (sizeof(uint16_t) - 1) << GITS_BASER_ENTRY_SIZE_SHIFT;
> +        reg |= GITS_BASER_TYPE_COLLECTION << GITS_BASER_TYPE_SHIFT;
> +        /* TODO: sanitise! */
> +
> +        /* TODO: sort out locking */
> +        /* TODO: repeated calls: free old mapping */
> +        if ( reg & GITS_BASER_VALID )
> +        {
> +            its->coll_table = map_guest_pages(d, get_baser_phys_addr(reg),
> +                                              its_baser_table_size(reg) >> PAGE_SHIFT);
> +            its->max_collections = its_baser_nr_entries(reg);
> +            memset(its->coll_table, 0xff,
> +                   its->max_collections * sizeof(uint16_t));
> +        }
> +        else
> +        {
> +            unmap_guest_pages(its->coll_table,
> +                              its_baser_table_size(reg) >> PAGE_SHIFT);
> +            its->max_collections = 0;
> +        }
> +        its->baser1 = reg;
> +        return 1;
> +    case VRANGE64(GITS_BASER2, GITS_BASER7):
> +        goto write_ignore_64;
> +    default:
> +        gdprintk(XENLOG_G_WARNING, "ITS: unhandled ITS register 0x%lx\n",
> +                 info->gpa & 0xffff);
> +        return 0;
> +    }
> +
> +    return 1;
> +
> +write_ignore_64:
> +    if ( ! vgic_reg64_check_access(info->dabt) ) goto bad_width;
> +    return 1;
> +
> +write_ignore_32:
> +    if ( info->dabt.size != DABT_WORD ) goto bad_width;
> +    return 1;
> +
> +bad_width:
> +    printk(XENLOG_G_ERR "%pv vGITS: bad write width %d r%d offset %#08lx\n",
> +           v, info->dabt.size, info->dabt.reg, info->gpa & 0xffff);
> +
> +    domain_crash_synchronous();
> +
> +    return 0;
> +}
> +
> +static const struct mmio_handler_ops vgic_its_mmio_handler = {
> +    .read  = vgic_v3_its_mmio_read,
> +    .write = vgic_v3_its_mmio_write,
> +};
> +
> +/*
> + * Local variables:
> + * mode: C
> + * c-file-style: "BSD"
> + * c-basic-offset: 4
> + * indent-tabs-mode: nil
> + * End:
> + */
> diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c
> index 8fe8386..aa53a1e 100644
> --- a/xen/arch/arm/vgic-v3.c
> +++ b/xen/arch/arm/vgic-v3.c
> @@ -158,15 +158,6 @@ static void vgic_store_irouter(struct domain *d, struct vgic_irq_rank *rank,
>      rank->vcpu[offset] = new_vcpu->vcpu_id;
>  }
>  
> -static inline bool vgic_reg64_check_access(struct hsr_dabt dabt)
> -{
> -    /*
> -     * 64 bits registers can be accessible using 32-bit and 64-bit unless
> -     * stated otherwise (See 8.1.3 ARM IHI 0069A).
> -     */
> -    return ( dabt.size == DABT_DOUBLE_WORD || dabt.size == DABT_WORD );
> -}
> -
>  static int __vgic_v3_rdistr_rd_mmio_read(struct vcpu *v, mmio_info_t *info,
>                                           uint32_t gicr_reg,
>                                           register_t *r)
> diff --git a/xen/include/asm-arm/gic_v3_defs.h b/xen/include/asm-arm/gic_v3_defs.h
> index da5fb77..6a91f5b 100644
> --- a/xen/include/asm-arm/gic_v3_defs.h
> +++ b/xen/include/asm-arm/gic_v3_defs.h
> @@ -147,6 +147,16 @@
>  #define LPI_PROP_RES1                (1 << 1)
>  #define LPI_PROP_ENABLED             (1 << 0)
>  
> +/*
> + * PIDR2: Only bits[7:4] are not implementation defined. We are
> + * emulating a GICv3 ([7:4] = 0x3).
> + *
> + * We don't emulate a specific register scheme so implement the other
> + * bits as RES0 as recommended by the spec (see 8.1.13 in ARM IHI 0069A).
> + */
> +#define GICV3_GICD_PIDR2  0x30
> +#define GICV3_GICR_PIDR2  GICV3_GICD_PIDR2
> +
>  #define GICH_VMCR_EOI                (1 << 9)
>  #define GICH_VMCR_VENG1              (1 << 1)
>  
> @@ -190,6 +200,15 @@ struct rdist_region {
>      bool single_rdist;
>  };
>  
> +/*
> + * 64 bits registers can be accessible using 32-bit and 64-bit unless
> + * stated otherwise (See 8.1.3 ARM IHI 0069A).
> + */
> +static inline bool vgic_reg64_check_access(struct hsr_dabt dabt)
> +{
> +    return ( dabt.size == DABT_DOUBLE_WORD || dabt.size == DABT_WORD );
> +}
> +
>  #endif /* __ASM_ARM_GIC_V3_DEFS_H__ */
>  
>  /*
> -- 
> 2.9.0
> 

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [RFC PATCH 13/24] ARM: vITS: handle CLEAR command
  2016-09-28 18:24 ` [RFC PATCH 13/24] ARM: vITS: handle CLEAR command Andre Przywara
  2016-11-04 15:48   ` Julien Grall
@ 2016-11-09  0:39   ` Stefano Stabellini
  2016-11-09 13:32     ` Julien Grall
  1 sibling, 1 reply; 144+ messages in thread
From: Stefano Stabellini @ 2016-11-09  0:39 UTC (permalink / raw)
  To: Andre Przywara; +Cc: xen-devel, Julien Grall, Stefano Stabellini

On Wed, 28 Sep 2016, Andre Przywara wrote:
> This introduces the ITS command handler for the CLEAR command, which
> clears the pending state of an LPI.
> This removes a not-yet injected, but already queued IRQ from a VCPU.
> 
> In addition this patch introduces the lookup function which translates
> a given DeviceID/EventID pair into a pointer to our vITTE structure.
> 
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> ---
>  xen/arch/arm/vgic-its.c | 115 ++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 115 insertions(+)
> 
> diff --git a/xen/arch/arm/vgic-its.c b/xen/arch/arm/vgic-its.c
> index 875b992..99d9e9c 100644
> --- a/xen/arch/arm/vgic-its.c
> +++ b/xen/arch/arm/vgic-its.c
> @@ -61,6 +61,73 @@ struct vits_itte
>      uint64_t collection:16;
>  };
>  
> +#define UNMAPPED_COLLECTION      ((uint16_t)~0)
> +
> +/* Must be called with the ITS lock held. */
> +static struct vcpu *get_vcpu_from_collection(struct virt_its *its, int collid)
> +{
> +    uint16_t vcpu_id;
> +
> +    if ( collid >= its->max_collections )
> +        return NULL;
> +
> +    vcpu_id = its->coll_table[collid];
> +    if ( vcpu_id == UNMAPPED_COLLECTION || vcpu_id >= its->d->max_vcpus )
> +        return NULL;
> +
> +    return its->d->vcpu[vcpu_id];
> +}
> +
> +#define DEV_TABLE_ITT_ADDR(x) ((x) & GENMASK(51, 8))
> +#define DEV_TABLE_ITT_SIZE(x) (BIT(((x) & GENMASK(7, 0)) + 1))
> +#define DEV_TABLE_ENTRY(addr, bits)                     \
> +        (((addr) & GENMASK(51, 8)) | (((bits) - 1) & GENMASK(7, 0)))
> +
> +static paddr_t get_itte_address(struct virt_its *its,
> +                                uint32_t devid, uint32_t evid)
> +{
> +    paddr_t addr;
> +
> +    if ( devid >= its->max_devices )
> +        return ~0;

Please #define the error value
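
Something like Xen's INVALID_PADDR would do; a sketch, assuming a 64-bit
`paddr_t` (the lookup function here is a made-up stand-in for
`get_itte_address()`):

```c
#include <stdint.h>

typedef uint64_t paddr_t;
#define INVALID_PADDR (~(paddr_t)0)

/* Returning a named sentinel instead of a bare ~0 makes the error
 * path self-documenting at the call site. */
static paddr_t lookup_or_fail(int valid, paddr_t addr)
{
    return valid ? addr : INVALID_PADDR;
}
```

Callers then compare against `INVALID_PADDR` rather than the magic `~0`.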


> +    if ( evid >= DEV_TABLE_ITT_SIZE(its->dev_table[devid]) )
> +        return ~0;

same here


> +    addr = DEV_TABLE_ITT_ADDR(its->dev_table[devid]);
> +
> +    return addr + evid * sizeof(struct vits_itte);
> +}
> +
> +/* Looks up a given deviceID/eventID pair on an ITS and returns a pointer to
> + * the corresponding ITTE. This maps the respective guest page into Xen.
> + * Once finished with handling the ITTE, call put_devid_evid() to unmap
> + * the page again.
> + * Must be called with the ITS lock held.
> + */
> +static struct vits_itte *get_devid_evid(struct virt_its *its,
> +                                        uint32_t devid, uint32_t evid)
> +{
> +    paddr_t addr = get_itte_address(its, devid, evid);
> +    struct vits_itte *itte;
> +
> +    if (addr == ~0)
> +        return NULL;
> +
> +    /* TODO: check locking for map_guest_pages() */
> +    itte = map_guest_pages(its->d, addr & PAGE_MASK, 1);
> +    if ( !itte )
> +        return NULL;

No need to use the vmap to map 1 page


> +    return itte + (addr & ~PAGE_MASK) / sizeof(struct vits_itte);

Please use () around the div operation for clarity


> +}
> +
> +/* Must be called with the ITS lock held. */
> +static void put_devid_evid(struct virt_its *its, struct vits_itte *itte)
> +{
> +    unmap_guest_pages(itte, 1);

No need for this, once you use __pa instead of the vmap


> +}
> +
>  /**************************************
>   * Functions that handle ITS commands *
>   **************************************/
> @@ -80,6 +147,51 @@ static uint64_t its_cmd_mask_field(uint64_t *its_cmd,
>  #define its_cmd_get_target_addr(cmd)    its_cmd_mask_field(cmd, 2, 16, 32)
>  #define its_cmd_get_validbit(cmd)       its_cmd_mask_field(cmd, 2, 63,  1)
>  
> +static int its_handle_clear(struct virt_its *its, uint64_t *cmdptr)
> +{
> +    uint32_t devid = its_cmd_get_deviceid(cmdptr);
> +    uint32_t eventid = its_cmd_get_id(cmdptr);
> +    struct pending_irq *pirq;
> +    struct vits_itte *itte;
> +    struct vcpu *vcpu;
> +    uint32_t vlpi;
> +
> +    spin_lock(&its->its_lock);
> +
> +    itte = get_devid_evid(its, devid, eventid);
> +    if ( !itte )
> +    {
> +        spin_unlock(&its->its_lock);
> +        return -1;
> +    }
> +
> +    vcpu = get_vcpu_from_collection(its, itte->collection);
> +    if ( !vcpu )
> +    {
> +        spin_unlock(&its->its_lock);
> +        return -1;
> +    }
> +
> +    vlpi = itte->vlpi;
> +
> +    put_devid_evid(its, itte);
> +    spin_unlock(&its->its_lock);
> +
> +    /* Remove a pending, but not yet injected guest IRQ. */

We need to check that the vlpi hasn't already been added to an LR
register. We can do that with GIC_IRQ_GUEST_VISIBLE.

In case GIC_IRQ_GUEST_VISIBLE is set, we need to clear the LR
(clear_lr). If we don't handle this case, we should at least print a
warning.
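
A sketch of that ordering, with the LR manipulation stubbed out (the bit
numbers and the `toy_` types are placeholders, not the real Xen definitions):

```c
#include <stdbool.h>

#define GIC_IRQ_GUEST_QUEUED   0
#define GIC_IRQ_GUEST_VISIBLE  1

struct toy_pirq {
    unsigned long status;
    int irq;
};

/* Clear the pending state; if the vLPI already sits in an LR it must be
 * scrubbed there too before the pending_irq struct can be recycled. */
static bool toy_clear_pending(struct toy_pirq *p)
{
    p->status &= ~(1UL << GIC_IRQ_GUEST_QUEUED);

    if ( p->status & (1UL << GIC_IRQ_GUEST_VISIBLE) )
        return false;          /* caller must clear the LR first */

    p->irq = 0;                /* safe to recycle immediately */
    return true;
}
```

The `false` return is where the patch would hook in the clear_lr handling
(or at minimum the suggested warning).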


> +    pirq = lpi_to_pending(vcpu, vlpi, false);
> +    if ( pirq )
> +    {
> +        clear_bit(GIC_IRQ_GUEST_QUEUED, &pirq->status);
> +        gic_remove_from_queues(vcpu, vlpi);
> +
> +        /* Mark this pending IRQ struct as available again. */
> +        if ( !test_bit(GIC_IRQ_GUEST_VISIBLE, &pirq->status) )
> +            pirq->irq = 0;
> +    }
> +
> +    return 0;
> +}
> +
>  #define ITS_CMD_BUFFER_SIZE(baser)      ((((baser) & 0xff) + 1) << 12)
>  
>  static int vgic_its_handle_cmds(struct domain *d, struct virt_its *its,
> @@ -100,6 +212,9 @@ static int vgic_its_handle_cmds(struct domain *d, struct virt_its *its,
>          cmdptr = its->cmdbuf + (its->creadr / sizeof(*its->cmdbuf));
>          switch (its_cmd_get_command(cmdptr))
>          {
> +        case GITS_CMD_CLEAR:
> +            its_handle_clear(its, cmdptr);
> +            break;
>          case GITS_CMD_SYNC:
>              /* We handle ITS commands synchronously, so we ignore SYNC. */
>  	    break;
> -- 
> 2.9.0
> 


* Re: [RFC PATCH 14/24] ARM: vITS: handle INT command
  2016-09-28 18:24 ` [RFC PATCH 14/24] ARM: vITS: handle INT command Andre Przywara
@ 2016-11-09  0:42   ` Stefano Stabellini
  0 siblings, 0 replies; 144+ messages in thread
From: Stefano Stabellini @ 2016-11-09  0:42 UTC (permalink / raw)
  To: Andre Przywara; +Cc: xen-devel, Julien Grall, Stefano Stabellini

On Wed, 28 Sep 2016, Andre Przywara wrote:
> The INT command sets a given LPI identified by a DeviceID/EventID pair
> as pending and thus triggers it to be injected.
> 
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> ---
>  xen/arch/arm/vgic-its.c | 34 ++++++++++++++++++++++++++++++++++
>  1 file changed, 34 insertions(+)
> 
> diff --git a/xen/arch/arm/vgic-its.c b/xen/arch/arm/vgic-its.c
> index 99d9e9c..7072753 100644
> --- a/xen/arch/arm/vgic-its.c
> +++ b/xen/arch/arm/vgic-its.c
> @@ -192,6 +192,37 @@ static int its_handle_clear(struct virt_its *its, uint64_t *cmdptr)
>      return 0;
>  }
>  
> +static int its_handle_int(struct virt_its *its, uint64_t *cmdptr)
> +{
> +    uint32_t devid = its_cmd_get_deviceid(cmdptr);
> +    uint32_t eventid = its_cmd_get_id(cmdptr);
> +    struct vits_itte *itte;
> +    struct vcpu *vcpu;
> +    int ret = -1;
> +    uint32_t vlpi;
> +
> +    spin_lock(&its->its_lock);
> +
> +    itte = get_devid_evid(its, devid, eventid);
> +    if ( !itte )
> +        goto out_unlock;
> +
> +    vcpu = its->d->vcpu[itte->collection];

We need to check that itte->collection is a valid vcpu before using it
as an array index.
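
A standalone sketch of the bounds check (toy types and array, not the Xen
structures — the point is that a value read from guest-writable memory is
never used as an index before being validated):

```c
#include <stddef.h>
#include <stdint.h>

#define UNMAPPED_COLLECTION ((uint16_t)~0)

/* Return the vcpu id for a collection, or -1 if the id read from the
 * (guest-writable) table would index out of bounds. */
static int collection_to_vcpu(const uint16_t *coll_table,
                              size_t max_collections,
                              size_t max_vcpus, uint32_t collid)
{
    uint16_t vcpu_id;

    if ( collid >= max_collections )
        return -1;

    vcpu_id = coll_table[collid];
    if ( vcpu_id == UNMAPPED_COLLECTION || vcpu_id >= max_vcpus )
        return -1;

    return vcpu_id;
}
```

This is essentially what get_vcpu_from_collection() already does for CLEAR;
the INT handler should go through the same guard instead of indexing
`d->vcpu[]` directly.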


> +    vlpi = itte->vlpi;
> +
> +    ret = 0;
> +
> +    put_devid_evid(its, itte);
> +
> +out_unlock:
> +    spin_unlock(&its->its_lock);
> +
> +    if ( !ret)

code style


> +        vgic_vcpu_inject_irq(vcpu, vlpi);
> +
> +    return ret;
> +}
> +
>  #define ITS_CMD_BUFFER_SIZE(baser)      ((((baser) & 0xff) + 1) << 12)
>  
>  static int vgic_its_handle_cmds(struct domain *d, struct virt_its *its,
> @@ -215,6 +246,9 @@ static int vgic_its_handle_cmds(struct domain *d, struct virt_its *its,
>          case GITS_CMD_CLEAR:
>              its_handle_clear(its, cmdptr);
>              break;
> +        case GITS_CMD_INT:
> +            its_handle_int(its, cmdptr);
> +            break;
>          case GITS_CMD_SYNC:
>              /* We handle ITS commands synchronously, so we ignore SYNC. */
>  	    break;
> -- 
> 2.9.0
> 


* Re: [RFC PATCH 15/24] ARM: vITS: handle MAPC command
  2016-09-28 18:24 ` [RFC PATCH 15/24] ARM: vITS: handle MAPC command Andre Przywara
@ 2016-11-09  0:48   ` Stefano Stabellini
  0 siblings, 0 replies; 144+ messages in thread
From: Stefano Stabellini @ 2016-11-09  0:48 UTC (permalink / raw)
  To: Andre Przywara; +Cc: xen-devel, Julien Grall, Stefano Stabellini

On Wed, 28 Sep 2016, Andre Przywara wrote:
> The MAPC command associates a given collection ID with a given
> redistributor, thus mapping collections to VCPUs.
> We just store the vcpu_id in the collection table for that.
> 
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> ---
>  xen/arch/arm/vgic-its.c | 30 ++++++++++++++++++++++++++++++
>  1 file changed, 30 insertions(+)
> 
> diff --git a/xen/arch/arm/vgic-its.c b/xen/arch/arm/vgic-its.c
> index 7072753..caad320 100644
> --- a/xen/arch/arm/vgic-its.c
> +++ b/xen/arch/arm/vgic-its.c
> @@ -223,6 +223,33 @@ out_unlock:
>      return ret;
>  }
>  
> +static int its_handle_mapc(struct virt_its *its, uint64_t *cmdptr)
> +{
> +    uint32_t collid = its_cmd_get_collection(cmdptr);
> +    uint64_t rdbase = its_cmd_mask_field(cmdptr, 2, 16, 44);
> +    int ret = -1;

I take it 44 is a bit arbitrary here? It might be best to #define it.
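
A sketch of how the command field widths could get names (the macro names are
invented; the 44-bit width just mirrors the literal used above, and
`its_cmd_mask_field` is re-sketched here so the example is self-contained):

```c
#include <stdint.h>

/* Extract `size` bits starting at bit `shift` of command dword `word`. */
static uint64_t its_cmd_mask_field(const uint64_t *its_cmd, int word,
                                   int shift, int size)
{
    uint64_t mask = (size == 64) ? ~0ULL : (1ULL << size) - 1;

    return (its_cmd[word] >> shift) & mask;
}

/* Named field geometry instead of bare magic numbers. */
#define ITS_CMD_RDBASE_WORD   2
#define ITS_CMD_RDBASE_SHIFT  16
#define ITS_CMD_RDBASE_BITS   44

#define its_cmd_get_rdbase(cmd) \
    its_cmd_mask_field(cmd, ITS_CMD_RDBASE_WORD, ITS_CMD_RDBASE_SHIFT, \
                       ITS_CMD_RDBASE_BITS)
```

With that, the MAPC handler reads `its_cmd_get_rdbase(cmdptr)` and the field
width is defined in exactly one place.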


> +    if ( collid >= its->max_collections )
> +        return ret;
> +
> +    if ( rdbase >= its->d->max_vcpus )
> +        return ret;
> +
> +    spin_lock(&its->its_lock);
> +    if ( its->coll_table )
> +    {
> +        if ( its_cmd_get_validbit(cmdptr) )
> +            its->coll_table[collid] = rdbase;
> +        else
> +            its->coll_table[collid] = UNMAPPED_COLLECTION;
> +
> +        ret = 0;
> +    }
> +    spin_unlock(&its->its_lock);
> +
> +    return ret;
> +}
> +
>  #define ITS_CMD_BUFFER_SIZE(baser)      ((((baser) & 0xff) + 1) << 12)
>  
>  static int vgic_its_handle_cmds(struct domain *d, struct virt_its *its,
> @@ -249,6 +276,9 @@ static int vgic_its_handle_cmds(struct domain *d, struct virt_its *its,
>          case GITS_CMD_INT:
>              its_handle_int(its, cmdptr);
>              break;
> +        case GITS_CMD_MAPC:
> +            its_handle_mapc(its, cmdptr);
> +            break;
>          case GITS_CMD_SYNC:
>              /* We handle ITS commands synchronously, so we ignore SYNC. */
>  	    break;
> -- 
> 2.9.0
> 


* Re: [RFC PATCH 16/24] ARM: vITS: handle MAPD command
  2016-09-28 18:24 ` [RFC PATCH 16/24] ARM: vITS: handle MAPD command Andre Przywara
@ 2016-11-09  0:54   ` Stefano Stabellini
  0 siblings, 0 replies; 144+ messages in thread
From: Stefano Stabellini @ 2016-11-09  0:54 UTC (permalink / raw)
  To: Andre Przywara; +Cc: xen-devel, Julien Grall, Stefano Stabellini

On Wed, 28 Sep 2016, Andre Przywara wrote:
> The MAPD command maps a device by associating a memory region for
> storing ITTEs with a certain device ID.
> We just store the given guest physical address in the device table.
> We don't map the device tables permanently, as their alignment
> requirement is only 256 Bytes, thus making mapping of several tables
> complicated. We map the device tables on demand when we need them later.
> 
> Also we propagate the MAPD request to the hardware ITS, as the device ID
> is only meaningful there.
> 
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> ---
>  xen/arch/arm/vgic-its.c | 31 +++++++++++++++++++++++++++++++
>  1 file changed, 31 insertions(+)
> 
> diff --git a/xen/arch/arm/vgic-its.c b/xen/arch/arm/vgic-its.c
> index caad320..83d47e1 100644
> --- a/xen/arch/arm/vgic-its.c
> +++ b/xen/arch/arm/vgic-its.c
> @@ -250,6 +250,34 @@ static int its_handle_mapc(struct virt_its *its, uint64_t *cmdptr)
>      return ret;
>  }
>  
> +static int its_handle_mapd(struct virt_its *its, uint64_t *cmdptr)
> +{
> +    uint32_t devid = its_cmd_get_deviceid(cmdptr);
> +    int size = its_cmd_get_size(cmdptr);
> +    bool valid = its_cmd_get_validbit(cmdptr);
> +    paddr_t itt_addr = its_cmd_mask_field(cmdptr, 2, 0, 52) & GENMASK(51, 8);
> +
> +    if ( !its->dev_table )
> +        return -1;

Shouldn't we validate devid, size and itt_addr?


> +    spin_lock(&its->its_lock);
> +    if ( valid )
> +        its->dev_table[devid] = DEV_TABLE_ENTRY(itt_addr, size + 1);
> +    else
> +        its->dev_table[devid] = 0;
> +
> +    spin_unlock(&its->its_lock);
> +
> +    /* DomUs (will later) have their ITTs allocated at domain creation time,
> +     * when Dom0 configures the passthrough.
> +     */
> +    if ( its->hw_its )
> +        return gicv3_its_map_device(its->hw_its,
> +                                    its->d, devid, size + 1, valid);
> +
> +    return 0;
> +}
> +
>  #define ITS_CMD_BUFFER_SIZE(baser)      ((((baser) & 0xff) + 1) << 12)
>  
>  static int vgic_its_handle_cmds(struct domain *d, struct virt_its *its,
> @@ -279,6 +307,9 @@ static int vgic_its_handle_cmds(struct domain *d, struct virt_its *its,
>          case GITS_CMD_MAPC:
>              its_handle_mapc(its, cmdptr);
>              break;
> +        case GITS_CMD_MAPD:
> +            its_handle_mapd(its, cmdptr);
> +	    break;
>          case GITS_CMD_SYNC:
>              /* We handle ITS commands synchronously, so we ignore SYNC. */
>  	    break;
> -- 
> 2.9.0
> 


* Re: [RFC PATCH 17/24] ARM: vITS: handle MAPTI command
  2016-09-28 18:24 ` [RFC PATCH 17/24] ARM: vITS: handle MAPTI command Andre Przywara
@ 2016-11-09  1:07   ` Stefano Stabellini
  0 siblings, 0 replies; 144+ messages in thread
From: Stefano Stabellini @ 2016-11-09  1:07 UTC (permalink / raw)
  To: Andre Przywara; +Cc: xen-devel, Julien Grall, Stefano Stabellini

On Wed, 28 Sep 2016, Andre Przywara wrote:
> The MAPTI commands associates a DeviceID/EventID pair with a LPI/CPU
> pair and actually instantiates LPI interrupts.
> We allocate a new host LPI and connect that one to this virtual LPI,
> so that any triggering IRQ on the host can be quickly forwarded to
> a guest.
> 
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> ---
>  xen/arch/arm/vgic-its.c | 53 +++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 53 insertions(+)
> 
> diff --git a/xen/arch/arm/vgic-its.c b/xen/arch/arm/vgic-its.c
> index 83d47e1..70897dd 100644
> --- a/xen/arch/arm/vgic-its.c
> +++ b/xen/arch/arm/vgic-its.c
> @@ -278,6 +278,55 @@ static int its_handle_mapd(struct virt_its *its, uint64_t *cmdptr)
>      return 0;
>  }
>  
> +static int its_handle_mapti(struct virt_its *its, uint64_t *cmdptr)
> +{
> +    uint32_t devid = its_cmd_get_deviceid(cmdptr);
> +    uint32_t eventid = its_cmd_get_id(cmdptr);
> +    uint32_t intid = its_cmd_get_physical_id(cmdptr);
> +    int collid = its_cmd_get_collection(cmdptr);
> +    struct vits_itte *itte;
> +    uint32_t host_lpi;
> +    struct vcpu *vcpu;
> +    int ret = -1;
> +
> +    if ( its_cmd_get_command(cmdptr) == GITS_CMD_MAPI )
> +        intid = eventid;
> +
> +    if ( collid >= its->max_collections )
> +        return -1;
> +
> +    spin_lock(&its->its_lock);
> +    vcpu = get_vcpu_from_collection(its, collid);
> +    if ( !vcpu )
> +        goto out_unlock;
> +
> +    itte = get_devid_evid(its, devid, eventid);
> +    if ( !itte )
> +        goto out_unlock;
> +
> +    if ( itte->hlpi )
> +        goto out_unmap;

get_vcpu_from_collection and get_devid_evid take care of checking the
validity of devid, eventid and collid. Do we need to also check that
devid, eventid and intid are valid from a host perspective, given that
we are calling gicv3_lpi_allocate_host_lpi?


> +    host_lpi = gicv3_lpi_allocate_host_lpi(its->hw_its,
> +                                           devid, eventid,
> +                                           vcpu, intid);
> +    if ( host_lpi >= 0 )
> +        itte->hlpi = host_lpi;
> +
> +    itte->vlpi = intid;
> +    itte->collection = collid;
> +
> +    ret = 0;
> +
> +out_unmap:
> +    put_devid_evid(its, itte);
> +
> +out_unlock:
> +    spin_unlock(&its->its_lock);
> +    
> +    return ret;
> +}
> +
>  #define ITS_CMD_BUFFER_SIZE(baser)      ((((baser) & 0xff) + 1) << 12)
>  
>  static int vgic_its_handle_cmds(struct domain *d, struct virt_its *its,
> @@ -310,6 +359,10 @@ static int vgic_its_handle_cmds(struct domain *d, struct virt_its *its,
>          case GITS_CMD_MAPD:
>              its_handle_mapd(its, cmdptr);
>  	    break;
> +        case GITS_CMD_MAPI:
> +        case GITS_CMD_MAPTI:
> +            its_handle_mapti(its, cmdptr);
> +            break;
>          case GITS_CMD_SYNC:
>              /* We handle ITS commands synchronously, so we ignore SYNC. */
>  	    break;
> -- 
> 2.9.0
> 


* Re: [RFC PATCH 18/24] ARM: vITS: handle MOVI command
  2016-09-28 18:24 ` [RFC PATCH 18/24] ARM: vITS: handle MOVI command Andre Przywara
@ 2016-11-09  1:13   ` Stefano Stabellini
  0 siblings, 0 replies; 144+ messages in thread
From: Stefano Stabellini @ 2016-11-09  1:13 UTC (permalink / raw)
  To: Andre Przywara; +Cc: xen-devel, Julien Grall, Stefano Stabellini

On Wed, 28 Sep 2016, Andre Przywara wrote:
> The MOVI command moves the interrupt affinity from one redistributor
> (read: VCPU) to another.
> For now migration of "live" LPIs is not yet implemented, but we store
> the changed affinity in the host LPI structure and in our virtual ITTE.
> 
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> ---
>  xen/arch/arm/gic-its.c        | 16 +++++++++++++++
>  xen/arch/arm/vgic-its.c       | 46 +++++++++++++++++++++++++++++++++++++++++++
>  xen/include/asm-arm/gic-its.h |  1 +
>  3 files changed, 63 insertions(+)
> 
> diff --git a/xen/arch/arm/gic-its.c b/xen/arch/arm/gic-its.c
> index 6bac422..d1b1cbb 100644
> --- a/xen/arch/arm/gic-its.c
> +++ b/xen/arch/arm/gic-its.c
> @@ -618,6 +618,22 @@ int gicv3_lpi_drop_host_lpi(struct host_its *its,
>      return 0;
>  }
>  
> +/* Changes the target VCPU for a given host LPI assigned to a domain. */
> +int gicv3_lpi_change_vcpu(struct domain *d, uint32_t host_lpi, int new_vcpu_id)
> +{
> +    union host_lpi *hlpip, hlpi;
> +
> +    hlpip = gic_find_host_lpi(host_lpi, d);
> +    if ( !hlpip )
> +        return -1;
> +
> +    hlpi.data = hlpip->data;
> +    hlpi.vcpu_id = new_vcpu_id;
> +    hlpip->data = hlpi.data;

Almost surely we need to call vgic_migrate_irq here.


> +    return 0;
> +}
> +
>  void gicv3_its_dt_init(const struct dt_device_node *node)
>  {
>      const struct dt_device_node *its = NULL;
> diff --git a/xen/arch/arm/vgic-its.c b/xen/arch/arm/vgic-its.c
> index 70897dd..c0a60ad 100644
> --- a/xen/arch/arm/vgic-its.c
> +++ b/xen/arch/arm/vgic-its.c
> @@ -327,6 +327,46 @@ out_unlock:
>      return ret;
>  }
>  
> +static int its_handle_movi(struct virt_its *its, uint64_t *cmdptr)
> +{
> +    uint32_t devid = its_cmd_get_deviceid(cmdptr);
> +    uint32_t eventid = its_cmd_get_id(cmdptr);
> +    int collid = its_cmd_get_collection(cmdptr);
> +    struct vits_itte *itte;
> +    struct vcpu *vcpu;
> +    uint32_t host_lpi = 0;
> +
> +    if ( collid >= its->max_collections )
> +        return -1;
> +
> +    spin_lock(&its->its_lock);
> +
> +    vcpu = get_vcpu_from_collection(its, collid);
> +    if ( !vcpu )
> +        goto out_unlock;
> +
> +    itte = get_devid_evid(its, devid, eventid);
> +    if ( !itte )
> +        goto out_unlock;
> +
> +    itte->collection = collid;
> +    host_lpi = itte->hlpi;
> +
> +    /* TODO: lookup currently-in-guest virtual IRQs and migrate them */
> +
> +    put_devid_evid(its, itte);
> +
> +out_unlock:
> +    spin_unlock(&its->its_lock);
> +
> +    if ( !host_lpi )
> +        return -1;
> +
> +    gicv3_lpi_change_vcpu(its->d, host_lpi, vcpu->vcpu_id);
> +
> +    return 0;
> +}
> +
>  #define ITS_CMD_BUFFER_SIZE(baser)      ((((baser) & 0xff) + 1) << 12)
>  
>  static int vgic_its_handle_cmds(struct domain *d, struct virt_its *its,
> @@ -363,6 +403,12 @@ static int vgic_its_handle_cmds(struct domain *d, struct virt_its *its,
>          case GITS_CMD_MAPTI:
>              its_handle_mapti(its, cmdptr);
>              break;
> +        case GITS_CMD_MOVALL:
> +            gdprintk(XENLOG_G_INFO, "ITS: ignoring MOVALL command\n");
> +            break;
> +        case GITS_CMD_MOVI:
> +            its_handle_movi(its, cmdptr);
> +            break;
>          case GITS_CMD_SYNC:
>              /* We handle ITS commands synchronously, so we ignore SYNC. */
>  	    break;
> diff --git a/xen/include/asm-arm/gic-its.h b/xen/include/asm-arm/gic-its.h
> index 3b2e5c0..7e1142f 100644
> --- a/xen/include/asm-arm/gic-its.h
> +++ b/xen/include/asm-arm/gic-its.h
> @@ -133,6 +133,7 @@ int gicv3_its_map_device(struct host_its *hw_its, struct domain *d,
>  int gicv3_lpi_allocate_host_lpi(struct host_its *its,
>                                  uint32_t devid, uint32_t eventid,
>                                  struct vcpu *v, int virt_lpi);
> +int gicv3_lpi_change_vcpu(struct domain *d, uint32_t host_lpi, int new_vcpu_id);
>  int gicv3_lpi_drop_host_lpi(struct host_its *its,
>                              uint32_t devid, uint32_t eventid,
>                              uint32_t host_lpi);
> -- 
> 2.9.0
> 


* Re: [RFC PATCH 19/24] ARM: vITS: handle DISCARD command
  2016-09-28 18:24 ` [RFC PATCH 19/24] ARM: vITS: handle DISCARD command Andre Przywara
@ 2016-11-09  1:28   ` Stefano Stabellini
  0 siblings, 0 replies; 144+ messages in thread
From: Stefano Stabellini @ 2016-11-09  1:28 UTC (permalink / raw)
  To: Andre Przywara; +Cc: xen-devel, Julien Grall, Stefano Stabellini

On Wed, 28 Sep 2016, Andre Przywara wrote:
> The DISCARD command drops the connection between a DeviceID/EventID
> and an LPI/collection pair.
> We mark the respective structure entries as not allocated and make
> sure that any queued IRQs are removed.
> 
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> ---
>  xen/arch/arm/gic-its.c        | 21 +++++++++++++++++++
>  xen/arch/arm/vgic-its.c       | 48 +++++++++++++++++++++++++++++++++++++++++++
>  xen/include/asm-arm/gic-its.h |  5 +++++
>  3 files changed, 74 insertions(+)
> 
> diff --git a/xen/arch/arm/gic-its.c b/xen/arch/arm/gic-its.c
> index d1b1cbb..766a7cb 100644
> --- a/xen/arch/arm/gic-its.c
> +++ b/xen/arch/arm/gic-its.c
> @@ -634,6 +634,27 @@ int gicv3_lpi_change_vcpu(struct domain *d, uint32_t host_lpi, int new_vcpu_id)
>      return 0;
>  }
>  
> +/* Looks up a given host LPI assigned to that domain and returns the
> + * connected virtual LPI number. Also stores the target vcpu ID in
> + * the passed vcpu_id pointer.
> + * Returns 0 if no host LPI could be found for that domain, or the
> + * virtual LPI number (>= 8192) if the lookup succeeded.
> + */
> +uint32_t gicv3_lpi_lookup_lpi(struct domain *d, uint32_t host_lpi, int *vcpu_id)
> +{
> +    union host_lpi *hlpip, hlpi;
> +
> +    hlpip = gic_find_host_lpi(host_lpi, d);
> +    if ( !hlpip )
> +        return 0;
> +
> +    hlpi.data = hlpip->data;
> +    if ( vcpu_id )
> +        *vcpu_id = hlpi.vcpu_id;
> +
> +    return hlpi.virt_lpi;
> +}
> +
>  void gicv3_its_dt_init(const struct dt_device_node *node)
>  {
>      const struct dt_device_node *its = NULL;
> diff --git a/xen/arch/arm/vgic-its.c b/xen/arch/arm/vgic-its.c
> index c0a60ad..028d234 100644
> --- a/xen/arch/arm/vgic-its.c
> +++ b/xen/arch/arm/vgic-its.c
> @@ -367,6 +367,51 @@ out_unlock:
>      return 0;
>  }
>  
> +static int its_handle_discard(struct virt_its *its, uint64_t *cmdptr)
> +{
> +    uint32_t devid = its_cmd_get_deviceid(cmdptr);
> +    uint32_t eventid = its_cmd_get_id(cmdptr);
> +    struct pending_irq *pirq;
> +    struct vits_itte *itte;
> +    struct vcpu *vcpu;
> +    uint32_t vlpi;
> +    int ret = -1, vcpu_id;
> +
> +    spin_lock(&its->its_lock);
> +    itte = get_devid_evid(its, devid, eventid);
> +    if ( !itte )
> +        goto out_unlock;
> +
> +    vlpi = gicv3_lpi_lookup_lpi(its->d, itte->hlpi, &vcpu_id);
> +    if ( !vlpi )
> +        goto out_unlock;

Using itte->hlpi like that is very dangerous because the guest could be
modifying that field while we run gicv3_lpi_lookup_lpi or
gicv3_lpi_drop_host_lpi. Actually we need a compiler barrier after
reading all guest accessible fields and before using them to access our
own data structures.


> +    vcpu = its->d->vcpu[vcpu_id];
> +
> +    pirq = lpi_to_pending(vcpu, vlpi, false);
> +    if ( pirq )
> +    {
> +        clear_bit(GIC_IRQ_GUEST_QUEUED, &pirq->status);
> +        gic_remove_from_queues(vcpu, vlpi);
> +
> +        /* Mark this pending IRQ struct as available again. */
> +        if ( !test_bit(GIC_IRQ_GUEST_VISIBLE, &pirq->status) )
> +            pirq->irq = 0;

We need to do something in case the vlpi is in a GICH_LR register


> +    }
> +
> +    gicv3_lpi_drop_host_lpi(its->hw_its, devid, eventid, itte->hlpi);

Same here regarding itte->hlpi


> +    itte->hlpi = 0;             /* Mark this ITTE as unused. */
> +    ret = 0;
> +
> +    put_devid_evid(its, itte);
> +
> +out_unlock:
> +    spin_unlock(&its->its_lock);
> +
> +    return ret;
> +}
> +
>  #define ITS_CMD_BUFFER_SIZE(baser)      ((((baser) & 0xff) + 1) << 12)
>  
>  static int vgic_its_handle_cmds(struct domain *d, struct virt_its *its,
> @@ -390,6 +435,9 @@ static int vgic_its_handle_cmds(struct domain *d, struct virt_its *its,
>          case GITS_CMD_CLEAR:
>              its_handle_clear(its, cmdptr);
>              break;
> +        case GITS_CMD_DISCARD:
> +            its_handle_discard(its, cmdptr);
> +            break;
>          case GITS_CMD_INT:
>              its_handle_int(its, cmdptr);
>              break;
> diff --git a/xen/include/asm-arm/gic-its.h b/xen/include/asm-arm/gic-its.h
> index 7e1142f..3f5698d 100644
> --- a/xen/include/asm-arm/gic-its.h
> +++ b/xen/include/asm-arm/gic-its.h
> @@ -133,6 +133,11 @@ int gicv3_its_map_device(struct host_its *hw_its, struct domain *d,
>  int gicv3_lpi_allocate_host_lpi(struct host_its *its,
>                                  uint32_t devid, uint32_t eventid,
>                                  struct vcpu *v, int virt_lpi);
> +/* Given a physical LPI, looks up and returns the associated virtual LPI
> + * and the target VCPU in the given domain.
> + */
> +uint32_t gicv3_lpi_lookup_lpi(struct domain *d, uint32_t host_lpi,
> +                              int *vcpu_id);
>  int gicv3_lpi_change_vcpu(struct domain *d, uint32_t host_lpi, int new_vcpu_id);
>  int gicv3_lpi_drop_host_lpi(struct host_its *its,
>                              uint32_t devid, uint32_t eventid,
> -- 
> 2.9.0
> 


* Re: [RFC PATCH 20/24] ARM: vITS: handle INV command
  2016-09-28 18:24 ` [RFC PATCH 20/24] ARM: vITS: handle INV command Andre Przywara
@ 2016-11-09  1:49   ` Stefano Stabellini
  0 siblings, 0 replies; 144+ messages in thread
From: Stefano Stabellini @ 2016-11-09  1:49 UTC (permalink / raw)
  To: Andre Przywara; +Cc: xen-devel, Julien Grall, Stefano Stabellini

On Wed, 28 Sep 2016, Andre Przywara wrote:
> The INV command instructs the ITS to update the configuration data for
> a given LPI by re-reading its entry from the property table.
> We don't need to care much about the priority value, but enabling
> or disabling an LPI has visible effects: we remove virtual LPIs from
> or push them to their VCPUs, and also propagate the enable bit to the
> hardware.
> 
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> ---
>  xen/arch/arm/gic-its.c        | 35 ++++++++++++++++++++
>  xen/arch/arm/vgic-its.c       | 74 +++++++++++++++++++++++++++++++++++++++++++
>  xen/include/asm-arm/gic-its.h |  3 ++
>  3 files changed, 112 insertions(+)
> 
> diff --git a/xen/arch/arm/gic-its.c b/xen/arch/arm/gic-its.c
> index 766a7cb..6f4329f 100644
> --- a/xen/arch/arm/gic-its.c
> +++ b/xen/arch/arm/gic-its.c
> @@ -215,6 +215,19 @@ static int its_send_cmd_mapd(struct host_its *its, uint32_t deviceid,
>      return its_send_command(its, cmd);
>  }
>  
> +static int its_send_cmd_inv(struct host_its *its,
> +                            uint32_t deviceid, uint32_t eventid)
> +{
> +    uint64_t cmd[4];
> +
> +    cmd[0] = GITS_CMD_INV | ((uint64_t)deviceid << 32);
> +    cmd[1] = eventid;
> +    cmd[2] = 0x00;
> +    cmd[3] = 0x00;
> +
> +    return its_send_command(its, cmd);
> +}
> +
>  int gicv3_its_map_device(struct host_its *hw_its, struct domain *d,
>                           int devid, int bits, bool valid)
>  {
> @@ -655,6 +668,28 @@ uint32_t gicv3_lpi_lookup_lpi(struct domain *d, uint32_t host_lpi, int *vcpu_id)
>      return hlpi.virt_lpi;
>  }
>  
> +void gicv3_lpi_set_enable(struct host_its *its,
> +                          uint32_t deviceid, uint32_t eventid,
> +                          uint32_t host_lpi, bool enabled)
> +{
> +    host_lpi -= 8192;
> +
> +    if ( host_lpi >= MAX_HOST_LPIS )
> +        return;
> +
> +    if ( !its )
> +        return;
> +
> +    if ( enabled )
> +        lpi_data.lpi_property[host_lpi] |= LPI_PROP_ENABLED;
> +    else
> +        lpi_data.lpi_property[host_lpi] &= ~LPI_PROP_ENABLED;
> +
> +    __flush_dcache_area(&lpi_data.lpi_property[host_lpi], 1);
> +
> +    its_send_cmd_inv(its, deviceid, eventid);
> +    its_send_cmd_sync(its, 0);
> +}
> +
>  void gicv3_its_dt_init(const struct dt_device_node *node)
>  {
>      const struct dt_device_node *its = NULL;
> diff --git a/xen/arch/arm/vgic-its.c b/xen/arch/arm/vgic-its.c
> index 028d234..74da8fc 100644
> --- a/xen/arch/arm/vgic-its.c
> +++ b/xen/arch/arm/vgic-its.c
> @@ -223,6 +223,77 @@ out_unlock:
>      return ret;
>  }
>  
> +/* For a given virtual LPI read the enabled bit from the virtual property
> + * table and update the virtual IRQ's state.
> + * This enables or disables the associated hardware LPI, and also takes
> + * care of removing virtual LPIs from or pushing them to their VCPUs.
> + */
> +static void update_lpi_enabled_status(struct virt_its* its,
> +                                      struct vcpu *vcpu, uint32_t vlpi,
> +                                      uint32_t deviceid, uint32_t eventid,
> +                                      uint32_t hlpi)
> +{
> +    struct pending_irq *pirq = lpi_to_pending(vcpu, vlpi, false);
> +    uint8_t property = its->d->arch.vgic.proptable[vlpi - 8192];

We need to check vlpi before using it to access an array. We also need a
barrier before using property.


> +    if ( property & LPI_PROP_ENABLED )
> +    {
> +        if ( pirq )
> +        {
> +            unsigned long flags;
> +
> +            set_bit(GIC_IRQ_GUEST_ENABLED, &pirq->status);
> +            spin_lock_irqsave(&vcpu->arch.vgic.lock, flags);
> +            if ( !list_empty(&pirq->inflight) &&
> +                 !test_bit(GIC_IRQ_GUEST_VISIBLE, &pirq->status) )
> +                gic_raise_guest_irq(vcpu, vlpi, property & 0xfc);
> +            spin_unlock_irqrestore(&vcpu->arch.vgic.lock, flags);
> +
> +        }
> +        gicv3_lpi_set_enable(its->hw_its, deviceid, eventid, hlpi, true);
> +    }
> +    else
> +    {
> +        if ( pirq )
> +        {
> +            clear_bit(GIC_IRQ_GUEST_ENABLED, &pirq->status);
> +            gic_remove_from_queues(vcpu, vlpi);
> +        }
> +        gicv3_lpi_set_enable(its->hw_its, deviceid, eventid, hlpi, false);
> +    }
> +}
> +
> +static int its_handle_inv(struct virt_its *its, uint64_t *cmdptr)
> +{
> +    uint32_t devid = its_cmd_get_deviceid(cmdptr);
> +    uint32_t eventid = its_cmd_get_id(cmdptr);
> +    struct vits_itte *itte;
> +    struct vcpu *vcpu;
> +    uint32_t hlpi, vlpi;
> +    int ret = -1;
> +
> +    spin_lock(&its->its_lock);
> +
> +    itte = get_devid_evid(its, devid, eventid);
> +    if ( !itte )
> +        goto out_unlock;

We need to check itte->collection before using it to access d->vcpu.


> +    vcpu = its->d->vcpu[itte->collection];
> +    vlpi = itte->vlpi;
> +    hlpi = itte->hlpi;
> +
> +    ret = 0;
> +
> +    put_devid_evid(its, itte);
> +
> +out_unlock:
> +    spin_unlock(&its->its_lock);
> +
> +    if ( !ret )
> +        update_lpi_enabled_status(its, vcpu, vlpi, devid, eventid, hlpi);
> +
> +    return ret;
> +}
> +
>  static int its_handle_mapc(struct virt_its *its, uint64_t *cmdptr)
>  {
>      uint32_t collid = its_cmd_get_collection(cmdptr);
> @@ -441,6 +512,9 @@ static int vgic_its_handle_cmds(struct domain *d, struct virt_its *its,
>          case GITS_CMD_INT:
>              its_handle_int(its, cmdptr);
>              break;
> +        case GITS_CMD_INV:
> +            its_handle_inv(its, cmdptr);
> +            break;
>          case GITS_CMD_MAPC:
>              its_handle_mapc(its, cmdptr);
>              break;
> diff --git a/xen/include/asm-arm/gic-its.h b/xen/include/asm-arm/gic-its.h
> index 3f5698d..2cdb3e1 100644
> --- a/xen/include/asm-arm/gic-its.h
> +++ b/xen/include/asm-arm/gic-its.h
> @@ -139,6 +139,9 @@ int gicv3_lpi_allocate_host_lpi(struct host_its *its,
>  uint32_t gicv3_lpi_lookup_lpi(struct domain *d, uint32_t host_lpi,
>                                int *vcpu_id);
>  int gicv3_lpi_change_vcpu(struct domain *d, uint32_t host_lpi, int new_vcpu_id);
> +void gicv3_lpi_set_enable(struct host_its *its,
> +                          uint32_t deviceid, uint32_t eventid,
> +                          uint32_t host_lpi, bool enabled);
>  int gicv3_lpi_drop_host_lpi(struct host_its *its,
>                              uint32_t devid, uint32_t eventid,
>                              uint32_t host_lpi);


* Re: [RFC PATCH 13/24] ARM: vITS: handle CLEAR command
  2016-11-09  0:39   ` Stefano Stabellini
@ 2016-11-09 13:32     ` Julien Grall
  0 siblings, 0 replies; 144+ messages in thread
From: Julien Grall @ 2016-11-09 13:32 UTC (permalink / raw)
  To: Stefano Stabellini, Andre Przywara; +Cc: xen-devel

Hi,

On 09/11/16 00:39, Stefano Stabellini wrote:
> On Wed, 28 Sep 2016, Andre Przywara wrote:
>> This introduces the ITS command handler for the CLEAR command, which
>> clears the pending state of an LPI.
>> This removes a not-yet injected, but already queued IRQ from a VCPU.
>>
>> In addition this patch introduces the lookup function which translates
>> a given DeviceID/EventID pair into a pointer to our vITTE structure.
>>
>> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
>> ---
>>  xen/arch/arm/vgic-its.c | 115 ++++++++++++++++++++++++++++++++++++++++++++++++
>>  1 file changed, 115 insertions(+)
>>
>> diff --git a/xen/arch/arm/vgic-its.c b/xen/arch/arm/vgic-its.c
>> index 875b992..99d9e9c 100644
>> --- a/xen/arch/arm/vgic-its.c
>> +++ b/xen/arch/arm/vgic-its.c
>> @@ -61,6 +61,73 @@ struct vits_itte
>>      uint64_t collection:16;
>>  };
>>
>> +#define UNMAPPED_COLLECTION      ((uint16_t)~0)
>> +
>> +/* Must be called with the ITS lock held. */
>> +static struct vcpu *get_vcpu_from_collection(struct virt_its *its, int collid)
>> +{
>> +    uint16_t vcpu_id;
>> +
>> +    if ( collid >= its->max_collections )
>> +        return NULL;
>> +
>> +    vcpu_id = its->coll_table[collid];
>> +    if ( vcpu_id == UNMAPPED_COLLECTION || vcpu_id >= its->d->max_vcpus )
>> +        return NULL;
>> +
>> +    return its->d->vcpu[vcpu_id];
>> +}
>> +
>> +#define DEV_TABLE_ITT_ADDR(x) ((x) & GENMASK(51, 8))
>> +#define DEV_TABLE_ITT_SIZE(x) (BIT(((x) & GENMASK(7, 0)) + 1))
>> +#define DEV_TABLE_ENTRY(addr, bits)                     \
>> +        (((addr) & GENMASK(51, 8)) | (((bits) - 1) & GENMASK(7, 0)))
>> +
>> +static paddr_t get_itte_address(struct virt_its *its,
>> +                                uint32_t devid, uint32_t evid)
>> +{
>> +    paddr_t addr;
>> +
>> +    if ( devid >= its->max_devices )
>> +        return ~0;
>
> Please #define the error

Technically this should be INVALID_PADDR here.

>
>> +    if ( evid >= DEV_TABLE_ITT_SIZE(its->dev_table[devid]) )
>> +        return ~0;
>
> same here

Ditto.

>
>
>> +    addr = DEV_TABLE_ITT_ADDR(its->dev_table[devid]);
>> +
>> +    return addr + evid * sizeof(struct vits_itte);
>> +}
>> +
>> +/* Looks up a given deviceID/eventID pair on an ITS and returns a pointer to
>> + * the corresponding ITTE. This maps the respective guest page into Xen.
>> + * Once finished with handling the ITTE, call put_devid_evid() to unmap
>> + * the page again.
>> + * Must be called with the ITS lock held.
>> + */
>> +static struct vits_itte *get_devid_evid(struct virt_its *its,
>> +                                        uint32_t devid, uint32_t evid)
>> +{
>> +    paddr_t addr = get_itte_address(its, devid, evid);
>> +    struct vits_itte *itte;
>> +
>> +    if (addr == ~0)
>> +        return NULL;
>> +
>> +    /* TODO: check locking for map_guest_pages() */
>> +    itte = map_guest_pages(its->d, addr & PAGE_MASK, 1);
>> +    if ( !itte )
>> +        return NULL;
>
> No need to use the vmap to map 1 page

But you do have to translate the IPA to a PA, so you cannot directly use 
__pa on it.

>
>> +    return itte + (addr & ~PAGE_MASK) / sizeof(struct vits_itte);
>
> Please use () around the div operation for clarity
>
>
>> +}
>> +
>> +/* Must be called with the ITS lock held. */
>> +static void put_devid_evid(struct virt_its *its, struct vits_itte *itte)
>> +{
>> +    unmap_guest_pages(itte, 1);
>
> No need for this, once you use __pa instead of the vmap

Well, we should at least use map_domain_page/unmap_domain_page even if 
they are a nop on ARM64. And not directly __pa.

Regards,

-- 
Julien Grall


* Re: [RFC PATCH 21/24] ARM: vITS: handle INVALL command
  2016-11-04  9:22     ` Andre Przywara
@ 2016-11-10  0:21       ` Stefano Stabellini
  2016-11-10 11:57         ` Julien Grall
  0 siblings, 1 reply; 144+ messages in thread
From: Stefano Stabellini @ 2016-11-10  0:21 UTC (permalink / raw)
  To: Andre Przywara; +Cc: xen-devel, Julien Grall, Stefano Stabellini, Vijay Kilari

On Fri, 4 Nov 2016, Andre Przywara wrote:
> Hi,
> 
> On 24/10/16 16:32, Vijay Kilari wrote:
> > On Wed, Sep 28, 2016 at 11:54 PM, Andre Przywara <andre.przywara@arm.com> wrote:
> >> The INVALL command instructs an ITS to invalidate the configuration
> >> data for all LPIs associated with a given redistributor (read: VCPU).
> >> To avoid iterating (and mapping!) all guest tables, we instead go through
> >> the host LPI table to find any LPIs targeting this VCPU. We then update
> >> the configuration bits for the connected virtual LPIs.
> >>
> >> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> >> ---
> >>  xen/arch/arm/gic-its.c        | 58 +++++++++++++++++++++++++++++++++++++++++++
> >>  xen/arch/arm/vgic-its.c       | 30 ++++++++++++++++++++++
> >>  xen/include/asm-arm/gic-its.h |  2 ++
> >>  3 files changed, 90 insertions(+)
> >>
> >> diff --git a/xen/arch/arm/gic-its.c b/xen/arch/arm/gic-its.c
> >> index 6f4329f..5129d6e 100644
> >> --- a/xen/arch/arm/gic-its.c
> >> +++ b/xen/arch/arm/gic-its.c
> >> @@ -228,6 +228,18 @@ static int its_send_cmd_inv(struct host_its *its,
> >>      return its_send_command(its, cmd);
> >>  }
> >>
> >> +static int its_send_cmd_invall(struct host_its *its, int cpu)
> >> +{
> >> +    uint64_t cmd[4];
> >> +
> >> +    cmd[0] = GITS_CMD_INVALL;
> >> +    cmd[1] = 0x00;
> >> +    cmd[2] = cpu & GENMASK(15, 0);
> >> +    cmd[3] = 0x00;
> >> +
> >> +    return its_send_command(its, cmd);
> >> +}
> >> +
> >>  int gicv3_its_map_device(struct host_its *hw_its, struct domain *d,
> >>                           int devid, int bits, bool valid)
> >>  {
> >> @@ -668,6 +680,52 @@ uint32_t gicv3_lpi_lookup_lpi(struct domain *d, uint32_t host_lpi, int *vcpu_id)
> >>      return hlpi.virt_lpi;
> >>  }
> >>
> >> +/* Iterate over all host LPIs, and update the "enabled" state for a given
> >> + * guest redistributor (VCPU) given the respective state in the provided
> >> + * proptable. This proptable is indexed by the stored virtual LPI number.
> >> + * This is to implement a guest INVALL command.
> >> + */
> >> +void gicv3_lpi_update_configurations(struct vcpu *v, uint8_t *proptable)
> >> +{
> >> +    int chunk, i;
> >> +    struct host_its *its;
> >> +
> >> +    for (chunk = 0; chunk < MAX_HOST_LPIS / HOST_LPIS_PER_PAGE; chunk++)
> >> +    {
> >> +        if ( !lpi_data.host_lpis[chunk] )
> >> +            continue;
> >> +
> >> +        for (i = 0; i < HOST_LPIS_PER_PAGE; i++)
> >> +        {
> >> +            union host_lpi *hlpip = &lpi_data.host_lpis[chunk][i], hlpi;
> >> +            uint32_t hlpi_nr;
> >> +
> >> +            hlpi.data = hlpip->data;
> >> +            if ( !hlpi.virt_lpi )
> >> +                continue;
> >> +
> >> +            if ( hlpi.dom_id != v->domain->domain_id )
> >> +                continue;
> >> +
> >> +            if ( hlpi.vcpu_id != v->vcpu_id )
> >> +                continue;
> >> +
> >> +            hlpi_nr = chunk * HOST_LPIS_PER_PAGE + i;
> >> +
> >> +            if ( proptable[hlpi.virt_lpi] & LPI_PROP_ENABLED )
> >> +                lpi_data.lpi_property[hlpi_nr - 8192] |= LPI_PROP_ENABLED;
> >> +            else
> >> +                lpi_data.lpi_property[hlpi_nr - 8192] &= ~LPI_PROP_ENABLED;
> >> +        }
> >> +    }
> >         AFAIK, the initial design was to use a tasklet to update the
> > property table, as updating it consumes a lot of time.
> 
> This is a possible, but premature optimization.
> Linux (at the moment, at least) only calls INVALL _once_, just after
> initialising the collections. And at this point no LPI is mapped, so the
> whole routine does basically nothing - and does so quite fast.
> We can later have any kind of fancy algorithm if there is a need for.

I understand, but as-is it's so expensive that it could be a DoS vector.
Also, other OSes could issue INVALL much more often than Linux does.

Considering that we might support device assignment with ITS soon, I
think it might be best to parse per-domain virtual tables rather than
the full list of physical LPIs, which theoretically could be much
larger. Or alternatively we need to think about adding another field to
lpi_data, to link together all lpis assigned to the same domain, but
that would cost even more memory. Or we could rate-limit the INVALL
calls to one every few seconds or something. Or all of the above :-)

We need to protect Xen from too frequent and too expensive requests like
this.


> >> +
> >> +    /* Tell all ITSes that they should update the property table for CPU 0,
> >> +     * which is where we map all LPIs to.
> >> +     */
> >> +    list_for_each_entry(its, &host_its_list, entry)
> >> +        its_send_cmd_invall(its, 0);
> >> +}
> >> +
> >>  void gicv3_lpi_set_enable(struct host_its *its,
> >>                            uint32_t deviceid, uint32_t eventid,
> >>                            uint32_t host_lpi, bool enabled)
> >> diff --git a/xen/arch/arm/vgic-its.c b/xen/arch/arm/vgic-its.c
> >> index 74da8fc..1e429b7 100644
> >> --- a/xen/arch/arm/vgic-its.c
> >> +++ b/xen/arch/arm/vgic-its.c
> >> @@ -294,6 +294,33 @@ out_unlock:
> >>      return ret;
> >>  }
> >>
> >> +/* INVALL updates the per-LPI configuration status for every LPI mapped to
> >> + * this redistributor. For the guest side we don't need to update anything,
> >> + * as we always refer to the actual table for the enabled bit and the
> >> + * priority.
> >> + * Enabling or disabling a virtual LPI however needs to be propagated to
> >> + * the respective host LPI. Instead of iterating over all mapped LPIs in our
> >> + * emulated GIC (which is expensive due to the required on-demand mapping),
> >> + * we iterate over all mapped _host_ LPIs and filter for those which are
> >> + * forwarded to this virtual redistributor.
> >> + */
> >> +static int its_handle_invall(struct virt_its *its, uint64_t *cmdptr)
> >> +{
> >> +    uint32_t collid = its_cmd_get_collection(cmdptr);
> >> +    struct vcpu *vcpu;
> >> +
> >> +    spin_lock(&its->its_lock);
> >> +    vcpu = get_vcpu_from_collection(its, collid);
> >> +    spin_unlock(&its->its_lock);
> >> +
> >> +    if ( !vcpu )
> >> +        return -1;
> >> +
> >> +    gicv3_lpi_update_configurations(vcpu, its->d->arch.vgic.proptable);
> >> +
> >> +    return 0;
> >> +}
> >> +
> >>  static int its_handle_mapc(struct virt_its *its, uint64_t *cmdptr)
> >>  {
> >>      uint32_t collid = its_cmd_get_collection(cmdptr);
> >> @@ -515,6 +542,9 @@ static int vgic_its_handle_cmds(struct domain *d, struct virt_its *its,
> >>          case GITS_CMD_INV:
> >>              its_handle_inv(its, cmdptr);
> >>             break;
> >> +        case GITS_CMD_INVALL:
> >> +            its_handle_invall(its, cmdptr);
> >> +           break;
> >>          case GITS_CMD_MAPC:
> >>              its_handle_mapc(its, cmdptr);
> >>              break;
> >> diff --git a/xen/include/asm-arm/gic-its.h b/xen/include/asm-arm/gic-its.h
> >> index 2cdb3e1..ba6b2d5 100644
> >> --- a/xen/include/asm-arm/gic-its.h
> >> +++ b/xen/include/asm-arm/gic-its.h
> >> @@ -146,6 +146,8 @@ int gicv3_lpi_drop_host_lpi(struct host_its *its,
> >>                              uint32_t devid, uint32_t eventid,
> >>                              uint32_t host_lpi);
> >>
> >> +void gicv3_lpi_update_configurations(struct vcpu *v, uint8_t *proptable);
> >> +
> >>  static inline int gicv3_lpi_get_priority(struct domain *d, uint32_t lpi)
> >>  {
> >>      return d->arch.vgic.proptable[lpi - 8192] & 0xfc;
> >> --
> >> 2.9.0
> >>
> >>
> > 
> 


* Re: [RFC PATCH 22/24] ARM: vITS: create and initialize virtual ITSes for Dom0
  2016-09-28 18:24 ` [RFC PATCH 22/24] ARM: vITS: create and initialize virtual ITSes for Dom0 Andre Przywara
@ 2016-11-10  0:38   ` Stefano Stabellini
  0 siblings, 0 replies; 144+ messages in thread
From: Stefano Stabellini @ 2016-11-10  0:38 UTC (permalink / raw)
  To: Andre Przywara; +Cc: xen-devel, Julien Grall, Stefano Stabellini

On Wed, 28 Sep 2016, Andre Przywara wrote:
> For each hardware ITS create and initialize a virtual ITS for Dom0.
> We use the same memory mapped address to keep the doorbell working.
> 
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> ---
>  xen/arch/arm/vgic-its.c       | 22 ++++++++++++++++++++++
>  xen/arch/arm/vgic-v3.c        | 12 ++++++++++++
>  xen/include/asm-arm/domain.h  |  1 +
>  xen/include/asm-arm/gic-its.h | 13 +++++++++++++
>  4 files changed, 48 insertions(+)
> 
> diff --git a/xen/arch/arm/vgic-its.c b/xen/arch/arm/vgic-its.c
> index 1e429b7..5c605b5 100644
> --- a/xen/arch/arm/vgic-its.c
> +++ b/xen/arch/arm/vgic-its.c
> @@ -829,6 +829,28 @@ static const struct mmio_handler_ops vgic_its_mmio_handler = {
>      .write = vgic_v3_its_mmio_write,
>  };
>  
> +int vgic_v3_its_init_virtual(struct domain *d, struct host_its *hw_its,
> +                             paddr_t guest_addr)
> +{
> +    struct virt_its *its;
> +
> +    its = xzalloc(struct virt_its);
> +    if ( !its )
> +        return -ENOMEM;
> +
> +    its->d = d;
> +    its->hw_its = hw_its;
> +    its->baser0 = 0x7917000000000400;
> +    its->baser1 = 0x3c01000000000400;
> +    its->cbaser = 0x380e000000000400;

Please #define these values.


> +    spin_lock_init(&its->vcmd_lock);
> +    spin_lock_init(&its->its_lock);
> +
> +    register_mmio_handler(d, &vgic_its_mmio_handler, guest_addr, SZ_64K, its);
> +
> +    return 0;
> +}
> +
>  /*
>   * Local variables:
>   * mode: C
> diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c
> index aa53a1e..d230a1f 100644
> --- a/xen/arch/arm/vgic-v3.c
> +++ b/xen/arch/arm/vgic-v3.c
> @@ -31,6 +31,7 @@
>  #include <asm/current.h>
>  #include <asm/mmio.h>
>  #include <asm/gic_v3_defs.h>
> +#include <asm/gic-its.h>
>  #include <asm/vgic.h>
>  #include <asm/vgic-emul.h>
>  
> @@ -1572,6 +1573,7 @@ static int vgic_v3_domain_init(struct domain *d)
>       */
>      if ( is_hardware_domain(d) )
>      {
> +        struct host_its *hw_its;
>          unsigned int first_cpu = 0;
>  
>          d->arch.vgic.dbase = vgic_v3_hw.dbase;
> @@ -1597,6 +1599,16 @@ static int vgic_v3_domain_init(struct domain *d)
>  
>              first_cpu += size / d->arch.vgic.rdist_stride;
>          }
> +        d->arch.vgic.nr_regions = vgic_v3_hw.nr_rdist_regions;
> +
> +        list_for_each_entry(hw_its, &host_its_list, entry)
> +        {
> +            /* Emulate the control registers frame (lower 64K). */
> +            vgic_v3_its_init_virtual(d, hw_its, hw_its->addr);
> +
> +            d->arch.vgic.has_its = true;
> +        }
> +
>      }
>      else
>      {
> diff --git a/xen/include/asm-arm/domain.h b/xen/include/asm-arm/domain.h
> index 0cd3500..1c2f7c7 100644
> --- a/xen/include/asm-arm/domain.h
> +++ b/xen/include/asm-arm/domain.h
> @@ -111,6 +111,7 @@ struct arch_domain
>          uint32_t rdist_stride;              /* Re-Distributor stride */
>          uint64_t rdist_propbase;
>          uint8_t *proptable;
> +        bool has_its;
>  #endif
>      } vgic;
>  
> diff --git a/xen/include/asm-arm/gic-its.h b/xen/include/asm-arm/gic-its.h
> index ba6b2d5..b58e092 100644
> --- a/xen/include/asm-arm/gic-its.h
> +++ b/xen/include/asm-arm/gic-its.h
> @@ -123,6 +123,13 @@ void gicv3_set_redist_addr(paddr_t address, int redist_id);
>  /* Map a collection for this host CPU to each host ITS. */
>  void gicv3_its_setup_collection(int cpu);
>  
> +/* Create and register a virtual ITS at the given guest address.
> + * If a host ITS is specified, a hardware domain can reach out to that host
> + * ITS to deal with devices and LPI mappings and can enable/disable LPIs.
> + */
> +int vgic_v3_its_init_virtual(struct domain *d, struct host_its *hw_its,
> +                             paddr_t guest_addr);
> +
>  /* Map a device on the host by allocating an ITT on the host (ITS).
>   * "bits" specifies how many events (interrupts) this device will need.
>   * Setting "valid" to false deallocates the device.
> @@ -204,6 +211,12 @@ static inline bool gicv3_lpi_is_enabled(struct domain *d, uint32_t lpi)
>  {
>      return false;
>  }
> +static inline int vgic_v3_its_init_virtual(struct domain *d,
> +                                           struct host_its *hw_its,
> +                                           paddr_t guest_addr)
> +{
> +    return 0;
> +}
>  
>  #endif /* CONFIG_HAS_ITS */
>  
> -- 
> 2.9.0
> 


* Re: [RFC PATCH 24/24] ARM: vGIC: advertising LPI support
  2016-09-28 18:24 ` [RFC PATCH 24/24] ARM: vGIC: advertising LPI support Andre Przywara
@ 2016-11-10  0:49   ` Stefano Stabellini
  2016-11-10 11:22     ` Julien Grall
  0 siblings, 1 reply; 144+ messages in thread
From: Stefano Stabellini @ 2016-11-10  0:49 UTC (permalink / raw)
  To: Andre Przywara; +Cc: xen-devel, Julien Grall, Stefano Stabellini

On Wed, 28 Sep 2016, Andre Przywara wrote:
> To let a guest know about the availability of virtual LPIs, set the
> respective bits in the virtual GIC registers and let a guest control
> the LPI enable bit.
> Only report the LPI capability if the host has initialized at least
> one ITS.
> 
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> ---
>  xen/arch/arm/vgic-v3.c | 28 +++++++++++++++++++++++-----
>  1 file changed, 23 insertions(+), 5 deletions(-)
> 
> diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c
> index d230a1f..61c97a2 100644
> --- a/xen/arch/arm/vgic-v3.c
> +++ b/xen/arch/arm/vgic-v3.c
> @@ -168,8 +168,10 @@ static int __vgic_v3_rdistr_rd_mmio_read(struct vcpu *v, mmio_info_t *info,
>      switch ( gicr_reg )
>      {
>      case VREG32(GICR_CTLR):
> -        /* We have not implemented LPI's, read zero */
> -        goto read_as_zero_32;
> +        if ( dabt.size != DABT_WORD ) goto bad_width;
> +        *r = vgic_reg32_extract(!!(v->arch.vgic.flags & VGIC_V3_LPIS_ENABLED),
> +                                info);

I don't think it is useful to call vgic_reg32_extract in this case.
vgic.flags is not a register.


> +        return 1;
>  
>      case VREG32(GICR_IIDR):
>          if ( dabt.size != DABT_WORD ) goto bad_width;
> @@ -181,16 +183,19 @@ static int __vgic_v3_rdistr_rd_mmio_read(struct vcpu *v, mmio_info_t *info,
>          uint64_t typer, aff;
>  
>          if ( !vgic_reg64_check_access(dabt) ) goto bad_width;
> -        /* TBD: Update processor id in [23:8] when ITS support is added */
>          aff = (MPIDR_AFFINITY_LEVEL(v->arch.vmpidr, 3) << 56 |
>                 MPIDR_AFFINITY_LEVEL(v->arch.vmpidr, 2) << 48 |
>                 MPIDR_AFFINITY_LEVEL(v->arch.vmpidr, 1) << 40 |
>                 MPIDR_AFFINITY_LEVEL(v->arch.vmpidr, 0) << 32);
>          typer = aff;
> +        typer |= (v->vcpu_id & 0xffff) << 8;
>  
>          if ( v->arch.vgic.flags & VGIC_V3_RDIST_LAST )
>              typer |= GICR_TYPER_LAST;
>  
> +        if ( v->domain->arch.vgic.has_its )
> +            typer |= GICR_TYPER_PLPIS;
> +
>          *r = vgic_reg64_extract(typer, info);
>  
>          return 1;
> @@ -468,8 +473,16 @@ static int __vgic_v3_rdistr_rd_mmio_write(struct vcpu *v, mmio_info_t *info,
>      switch ( gicr_reg )
>      {
>      case VREG32(GICR_CTLR):
> -        /* LPI's not implemented */
> -        goto write_ignore_32;
> +        if ( dabt.size != DABT_WORD ) goto bad_width;
> +        if ( !v->domain->arch.vgic.has_its )
> +            return 1;
> +
> +        if ( r & 1 )
> +            v->arch.vgic.flags |= VGIC_V3_LPIS_ENABLED;
> +        else
> +            v->arch.vgic.flags &= ~VGIC_V3_LPIS_ENABLED;
> +
> +        return 1;
>  
>      case VREG32(GICR_IIDR):
>          /* RO */
> @@ -1075,6 +1088,11 @@ static int vgic_v3_distr_mmio_read(struct vcpu *v, mmio_info_t *info,
>          typer = ((ncpus - 1) << GICD_TYPE_CPUS_SHIFT |
>                   DIV_ROUND_UP(v->domain->arch.vgic.nr_spis, 32));
>  
> +        if ( v->domain->arch.vgic.has_its )
> +        {
> +            typer |= GICD_TYPE_LPIS;
> +            irq_bits = 16;
> +        }
>          typer |= (irq_bits - 1) << GICD_TYPE_ID_BITS_SHIFT;
>  
>          *r = vgic_reg32_extract(typer, info);
> -- 
> 2.9.0
> 


* Re: [RFC PATCH 24/24] ARM: vGIC: advertising LPI support
  2016-11-10  0:49   ` Stefano Stabellini
@ 2016-11-10 11:22     ` Julien Grall
  0 siblings, 0 replies; 144+ messages in thread
From: Julien Grall @ 2016-11-10 11:22 UTC (permalink / raw)
  To: Stefano Stabellini, Andre Przywara; +Cc: xen-devel

On 10/11/16 00:49, Stefano Stabellini wrote:
> On Wed, 28 Sep 2016, Andre Przywara wrote:
>> To let a guest know about the availability of virtual LPIs, set the
>> respective bits in the virtual GIC registers and let a guest control
>> the LPI enable bit.
>> Only report the LPI capability if the host has initialized at least
>> one ITS.
>>
>> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
>> ---
>>  xen/arch/arm/vgic-v3.c | 28 +++++++++++++++++++++++-----
>>  1 file changed, 23 insertions(+), 5 deletions(-)
>>
>> diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c
>> index d230a1f..61c97a2 100644
>> --- a/xen/arch/arm/vgic-v3.c
>> +++ b/xen/arch/arm/vgic-v3.c
>> @@ -168,8 +168,10 @@ static int __vgic_v3_rdistr_rd_mmio_read(struct vcpu *v, mmio_info_t *info,
>>      switch ( gicr_reg )
>>      {
>>      case VREG32(GICR_CTLR):
>> -        /* We have not implemented LPI's, read zero */
>> -        goto read_as_zero_32;
>> +        if ( dabt.size != DABT_WORD ) goto bad_width;
>> +        *r = vgic_reg32_extract(!!(v->arch.vgic.flags & VGIC_V3_LPIS_ENABLED),
>> +                                info);
>
> I don't think it is useful to call vgic_reg32_extract in this case.
> vgic.flags is not a register.

All the emulation uses vgic_reg*_extract and constructs a register when 
necessary. So I would keep vgic_reg32_extract.

However, it would be more readable to have:

uint32_t ctlr;

ctlr = !!(v->arch.vgic.flags & VGIC_V3_LPIS_ENABLED);
*r = vgic_reg32_extract(ctlr, info);

Regards,

-- 
Julien Grall


* Re: [RFC PATCH 21/24] ARM: vITS: handle INVALL command
  2016-11-10  0:21       ` Stefano Stabellini
@ 2016-11-10 11:57         ` Julien Grall
  2016-11-10 20:42           ` Stefano Stabellini
  0 siblings, 1 reply; 144+ messages in thread
From: Julien Grall @ 2016-11-10 11:57 UTC (permalink / raw)
  To: Stefano Stabellini, Andre Przywara; +Cc: xen-devel, Vijay Kilari

Hi,

On 10/11/16 00:21, Stefano Stabellini wrote:
> On Fri, 4 Nov 2016, Andre Przywara wrote:
>> On 24/10/16 16:32, Vijay Kilari wrote:
>>> On Wed, Sep 28, 2016 at 11:54 PM, Andre Przywara <andre.przywara@arm.com> wrote:
>>>> The INVALL command instructs an ITS to invalidate the configuration
>>>> data for all LPIs associated with a given redistributor (read: VCPU).
>>>> To avoid iterating (and mapping!) all guest tables, we instead go through
>>>> the host LPI table to find any LPIs targeting this VCPU. We then update
>>>> the configuration bits for the connected virtual LPIs.
>>>>
>>>> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
>>>> ---
>>>>  xen/arch/arm/gic-its.c        | 58 +++++++++++++++++++++++++++++++++++++++++++
>>>>  xen/arch/arm/vgic-its.c       | 30 ++++++++++++++++++++++
>>>>  xen/include/asm-arm/gic-its.h |  2 ++
>>>>  3 files changed, 90 insertions(+)
>>>>
>>>> diff --git a/xen/arch/arm/gic-its.c b/xen/arch/arm/gic-its.c
>>>> index 6f4329f..5129d6e 100644
>>>> --- a/xen/arch/arm/gic-its.c
>>>> +++ b/xen/arch/arm/gic-its.c
>>>> @@ -228,6 +228,18 @@ static int its_send_cmd_inv(struct host_its *its,
>>>>      return its_send_command(its, cmd);
>>>>  }
>>>>
>>>> +static int its_send_cmd_invall(struct host_its *its, int cpu)
>>>> +{
>>>> +    uint64_t cmd[4];
>>>> +
>>>> +    cmd[0] = GITS_CMD_INVALL;
>>>> +    cmd[1] = 0x00;
>>>> +    cmd[2] = cpu & GENMASK(15, 0);
>>>> +    cmd[3] = 0x00;
>>>> +
>>>> +    return its_send_command(its, cmd);
>>>> +}
>>>> +
>>>>  int gicv3_its_map_device(struct host_its *hw_its, struct domain *d,
>>>>                           int devid, int bits, bool valid)
>>>>  {
>>>> @@ -668,6 +680,52 @@ uint32_t gicv3_lpi_lookup_lpi(struct domain *d, uint32_t host_lpi, int *vcpu_id)
>>>>      return hlpi.virt_lpi;
>>>>  }
>>>>
>>>> +/* Iterate over all host LPIs, updating the "enabled" state for a given
>>>> + * guest redistributor (VCPU) given the respective state in the provided
>>>> + * proptable. This proptable is indexed by the stored virtual LPI number.
>>>> + * This is to implement a guest INVALL command.
>>>> + */
>>>> +void gicv3_lpi_update_configurations(struct vcpu *v, uint8_t *proptable)
>>>> +{
>>>> +    int chunk, i;
>>>> +    struct host_its *its;
>>>> +
>>>> +    for (chunk = 0; chunk < MAX_HOST_LPIS / HOST_LPIS_PER_PAGE; chunk++)
>>>> +    {
>>>> +        if ( !lpi_data.host_lpis[chunk] )
>>>> +            continue;
>>>> +
>>>> +        for (i = 0; i < HOST_LPIS_PER_PAGE; i++)
>>>> +        {
>>>> +            union host_lpi *hlpip = &lpi_data.host_lpis[chunk][i], hlpi;
>>>> +            uint32_t hlpi_nr;
>>>> +
>>>> +            hlpi.data = hlpip->data;
>>>> +            if ( !hlpi.virt_lpi )
>>>> +                continue;
>>>> +
>>>> +            if ( hlpi.dom_id != v->domain->domain_id )
>>>> +                continue;
>>>> +
>>>> +            if ( hlpi.vcpu_id != v->vcpu_id )
>>>> +                continue;
>>>> +
>>>> +            hlpi_nr = chunk * HOST_LPIS_PER_PAGE + i;
>>>> +
>>>> +            if ( proptable[hlpi.virt_lpi] & LPI_PROP_ENABLED )
>>>> +                lpi_data.lpi_property[hlpi_nr - 8192] |= LPI_PROP_ENABLED;
>>>> +            else
>>>> +                lpi_data.lpi_property[hlpi_nr - 8192] &= ~LPI_PROP_ENABLED;
>>>> +        }
>>>> +    }
>>>         AFAIK, the initial design is to use a tasklet to update the
>>> property table, as updating the table consumes a lot of time.
>>
>> This is a possible, but premature optimization.
>> Linux (at the moment, at least) only calls INVALL _once_, just after
>> initialising the collections. And at this point no LPI is mapped, so the
>> whole routine does basically nothing - and that quite fast.
>> We can later have any kind of fancy algorithm if there is a need for.
>
> I understand, but as-is it's so expensive that it could be a DoS vector.
> Also other OSes could issue INVALL much more often than Linux.
>
> Considering that we might support device assigment with ITS soon, I
> think it might be best to parse per-domain virtual tables rather than
> the full list of physical LPIs, which theoretically could be much
> larger. Or alternatively we need to think about adding another field to
> lpi_data, to link together all lpis assigned to the same domain, but
> that would cost even more memory. Or we could rate-limit the INVALL
> calls to one every few seconds or something. Or all of the above :-)

It is not necessary for an ITS implementation to wait until an 
INVALL/INV command is issued to take changes to the LPI configuration 
tables (aka property table in this thread) into account.

So how about trapping the property table? We would still have to go 
through the property table the first time (i.e. when the guest writes 
GICR_PROPBASER), but INVALL would be a nop.

The idea would be to unmap the region when GICR_PROPBASER is written, 
so any read/write access would be trapped. For a write access, Xen 
would update its internal LPI data structures and then write the value 
through to the (unmapped) guest page. If we don't want the overhead on 
read accesses, we could instead just write-protect the page in the 
stage-2 page table, so that only write accesses are trapped.

Going further: for the ITS, Xen uses guest memory to store the ITS 
information, which means Xen has to validate that information on every 
access. So how about restricting access via the stage-2 page tables as 
well? That would remove the overhead of validating the data.
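To make the write-trap idea concrete, here is a minimal, self-contained sketch (plain C, outside Xen; the handler name and the table layout are made up for illustration). A trapped write updates Xen's internal per-LPI state first and is then written through to the guest's page, so Xen's view can never go stale:

```c
#include <stdint.h>
#include <stddef.h>

#define NR_LPIS          16            /* illustrative table size */
#define LPI_PROP_ENABLED (1u << 0)

static uint8_t guest_proptable[NR_LPIS]; /* the write-protected guest page */
static uint8_t xen_lpi_config[NR_LPIS];  /* Xen's validated internal copy */

/* Hypothetical stage-2 permission-fault handler for a one-byte write to
 * the guest's LPI configuration (property) table. */
static void proptable_write_handler(size_t offset, uint8_t val)
{
    if (offset >= NR_LPIS)
        return;                        /* outside the table: ignore */
    xen_lpi_config[offset] = val;      /* update internal state first ... */
    guest_proptable[offset] = val;     /* ... then write through to the page */
}
```

With this scheme a guest INVALL degenerates to a no-op, because the internal state was already updated at write time.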

Any thoughts?

Regards,

-- 
Julien Grall


* Re: [RFC PATCH 02/24] ARM: GICv3: allocate LPI pending and property table
  2016-10-26  1:10   ` Stefano Stabellini
@ 2016-11-10 15:29     ` Andre Przywara
  2016-11-10 21:00       ` Stefano Stabellini
  0 siblings, 1 reply; 144+ messages in thread
From: Andre Przywara @ 2016-11-10 15:29 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: xen-devel, Julien Grall

Hi,

On 26/10/16 02:10, Stefano Stabellini wrote:
> Hi Andre,
> 
> Sorry for the late reply, I'll try to be faster for the next rounds of
> review. The patch looks good for a first iteration. Some comments below.

No worries and thanks for the thorough review, much appreciated.
As you can see I took my time to respond as well ;-)

> 
> On Wed, 28 Sep 2016, Andre Przywara wrote:
>> The ARM GICv3 ITS provides a new kind of interrupt called LPIs.
>> The pending bits and the configuration data (priority, enable bits) for
>> those LPIs are stored in tables in normal memory, which software has to
>> provide to the hardware.
>> Allocate the required memory, initialize it and hand it over to each
>> ITS. We limit the number of LPIs we use with a compile time constant to
>> avoid wasting memory.
>>
>> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
>> ---
>>  xen/arch/arm/Kconfig              |  6 ++++
>>  xen/arch/arm/efi/efi-boot.h       |  1 -
>>  xen/arch/arm/gic-its.c            | 76 +++++++++++++++++++++++++++++++++++++++
>>  xen/arch/arm/gic-v3.c             | 27 ++++++++++++++
>>  xen/include/asm-arm/cache.h       |  4 +++
>>  xen/include/asm-arm/gic-its.h     | 22 +++++++++++-
>>  xen/include/asm-arm/gic_v3_defs.h | 48 ++++++++++++++++++++++++-
>>  7 files changed, 181 insertions(+), 3 deletions(-)
>>
>> diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
>> index 9fe3b8e..66e2bb8 100644
>> --- a/xen/arch/arm/Kconfig
>> +++ b/xen/arch/arm/Kconfig
>> @@ -50,6 +50,12 @@ config HAS_ITS
>>          depends on ARM_64
>>          depends on HAS_GICV3
>>  
>> +config HOST_LPI_BITS
>> +        depends on HAS_ITS
>> +        int "Maximum bits for GICv3 host LPIs (14-32)"
>> +        range 14 32
>> +        default "20"
>> +
>>  config ALTERNATIVE
>>  	bool
>>  
>> diff --git a/xen/arch/arm/efi/efi-boot.h b/xen/arch/arm/efi/efi-boot.h
>> index 045d6ce..dc64aec 100644
>> --- a/xen/arch/arm/efi/efi-boot.h
>> +++ b/xen/arch/arm/efi/efi-boot.h
>> @@ -10,7 +10,6 @@
>>  #include "efi-dom0.h"
>>  
>>  void noreturn efi_xen_start(void *fdt_ptr, uint32_t fdt_size);
>> -void __flush_dcache_area(const void *vaddr, unsigned long size);
>>  
>>  #define DEVICE_TREE_GUID \
>>  {0xb1b621d5, 0xf19c, 0x41a5, {0x83, 0x0b, 0xd9, 0x15, 0x2c, 0x69, 0xaa, 0xe0}}
>> diff --git a/xen/arch/arm/gic-its.c b/xen/arch/arm/gic-its.c
>> index 0f42a77..b52dff3 100644
>> --- a/xen/arch/arm/gic-its.c
>> +++ b/xen/arch/arm/gic-its.c
>> @@ -20,10 +20,86 @@
>>  #include <xen/lib.h>
>>  #include <xen/device_tree.h>
>>  #include <xen/libfdt/libfdt.h>
>> +#include <asm/p2m.h>
>>  #include <asm/gic.h>
>>  #include <asm/gic_v3_defs.h>
>>  #include <asm/gic-its.h>
>>  
>> +/* Global state */
>> +static struct {
>> +    uint8_t *lpi_property;
>> +    int host_lpi_bits;
>> +} lpi_data;
>> +
>> +/* Pending table for each redistributor */
>> +static DEFINE_PER_CPU(void *, pending_table);
>> +
>> +#define MAX_HOST_LPI_BITS                                                \
> 
> To avoid confusion, I would call this MAX_PHYS_LPI_BITS
> 
> 
>> +        min_t(unsigned int, lpi_data.host_lpi_bits, CONFIG_HOST_LPI_BITS)
>> +#define MAX_HOST_LPIS   (BIT(MAX_HOST_LPI_BITS) - 8192)
> 
> And this MAX_PHYS_LPIS

Done.

>> +uint64_t gicv3_lpi_allocate_pendtable(void)
>> +{
>> +    uint64_t reg, attr;
>> +    void *pendtable;
> 
> I would introduce a check to make sure that this_cpu(pending_table) == NULL.

Can do. So I'll return this value then, though this should never happen.

> 
>> +    attr  = GIC_BASER_CACHE_RaWaWb << GICR_PENDBASER_INNER_CACHEABILITY_SHIFT;
>> +    attr |= GIC_BASER_CACHE_SameAsInner << GICR_PENDBASER_OUTER_CACHEABILITY_SHIFT;
>> +    attr |= GIC_BASER_InnerShareable << GICR_PENDBASER_SHAREABILITY_SHIFT;
>> +
>> +    /*
>> +     * The pending table holds one bit per LPI, so we need three bits less
>> +     * than the number of LPI_BITs.
> 
> Why 3 bit less? Please add more info on how you came up with 3.

3 bits as in 1 << 3 = 8 = BITS_PER_BYTE. We need to divide by that,
which is a shift by 3, hence ORDER - 3. Does that make sense?
But this mayhem goes away anyway with _xmalloc.
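Spelled out as standalone code (the constants mirror this discussion, not necessarily Xen's final values): one pending bit per LPI means the table size in bytes is 2^(lpi_bits - 3), and the allocation order subtracts the 4K page shift after enforcing the 64K (2^16 bytes) minimum:

```c
#include <stdint.h>

/* One pending bit per LPI: table size in bytes is 2^(lpi_bits - 3). */
static uint64_t pendtable_bytes(unsigned int lpi_bits)
{
    return UINT64_C(1) << (lpi_bits - 3);
}

/* Page order for an alloc_xenheap_pages()-style allocator: at least
 * 16 bits (64K) to satisfy the ITS alignment requirement, minus 12
 * for the 4K page shift. */
static unsigned int pendtable_order(unsigned int lpi_bits)
{
    unsigned int bits = lpi_bits - 3;

    if (bits < 16)
        bits = 16;
    return bits - 12;
}
```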

> 
>>         But the alignment requirement from the
>> +     * ITS is 64K, so make order at least 16 (-12).
> 
> Does it need to be 64K aligned or does it need to be at least 64K in
> size?

The first.

> That makes a big difference. If it just needs to be 64K aligned,
> you can do that with xmalloc.

Well, not xmalloc (since I don't have a data structure of that size),
but _xmalloc. I just saw that this is exported as well (I dismissed this
before because of the leading underscore).
Also "alloc pages" sounded more like what I had in mind, but I guess
aligning it to 64K serves the same purpose.
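For reference, the 64K alignment requirement can be modeled outside Xen with posix_memalign (a sketch only; inside Xen, _xmalloc(size, align) would provide the alignment directly):

```c
#include <stdlib.h>

#define ITS_TABLE_ALIGN (64 * 1024)   /* 64K alignment required by PENDBASER */

/* Allocate a 64K-aligned table buffer; returns NULL on failure. */
static void *alloc_its_table(size_t size)
{
    void *p = NULL;

    if (posix_memalign(&p, ITS_TABLE_ALIGN, size))
        return NULL;
    return p;
}
```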

>> +     */
>> +    pendtable = alloc_xenheap_pages(MAX(lpi_data.host_lpi_bits - 3, 16) - 12, 0);
> 
> Shouldn't we be using MAX_HOST_LPI_BITS instead of
> lpi_data.host_lpi_bits to make this calculation?

I was under the impression that the redistributors expect the pending
table to cover every possible LPI as reported in GICD_TYPER (because in
contrast to PROPBASER the PENDBASER register lacks a size field).
But thinking about this again, this seems insane, since 32 bits' worth
of LPIs would lead to a 0.5GB pending table. But as the LPI
numbers are under the control of software, we can go with allocating
less - up to our internal limit - which is also what Linux does.

> 
>> +    if ( !pendtable )
>> +        return 0;
>> +
>> +    memset(pendtable, 0, BIT(lpi_data.host_lpi_bits - 3));
> 
> flush_dcache?

Uhm, yes.

> 
>> +    this_cpu(pending_table) = pendtable;
>> +
>> +    reg  = attr | GICR_PENDBASER_PTZ;
>> +    reg |= virt_to_maddr(pendtable) & GENMASK(51, 16);
>> +
>> +    return reg;
>> +}
>> +
>> +uint64_t gicv3_lpi_get_proptable()
>> +{
>> +    uint64_t attr;
>> +    static uint64_t reg = 0;
>> +
>> +    /* The property table is shared across all redistributors. */
>> +    if ( reg )
>> +        return reg;
> 
> Can't you just use lpi_data.lpi_property != NULL instead of introducing
> a new static local variable?

Seems like a good idea actually. We have to reconstruct the register
content, but that seems doable.

>> +    attr  = GIC_BASER_CACHE_RaWaWb << GICR_PENDBASER_INNER_CACHEABILITY_SHIFT;
>> +    attr |= GIC_BASER_CACHE_SameAsInner << GICR_PENDBASER_OUTER_CACHEABILITY_SHIFT;
>> +    attr |= GIC_BASER_InnerShareable << GICR_PENDBASER_SHAREABILITY_SHIFT;
>> +
>> +    lpi_data.lpi_property = alloc_xenheap_pages(MAX_HOST_LPI_BITS - 12, 0);
> 
> Please add a comment on how the order is calculated.

Does " ... - PAGE_SHIFT" suffice?

> 
> 
>> +    if ( !lpi_data.lpi_property )
>> +        return 0;
>> +
>> +    memset(lpi_data.lpi_property, GIC_PRI_IRQ | LPI_PROP_RES1, MAX_HOST_LPIS);
>> +    __flush_dcache_area(lpi_data.lpi_property, MAX_HOST_LPIS);
>> +
>> +    reg  = attr | ((MAX_HOST_LPI_BITS - 1) << 0);
>> +    reg |= virt_to_maddr(lpi_data.lpi_property) & GENMASK(51, 12);
>> +
>> +    return reg;
>> +}
>> +
>> +int gicv3_lpi_init_host_lpis(int lpi_bits)
>> +{
>> +    lpi_data.host_lpi_bits = lpi_bits;
>> +
>> +    printk("GICv3: using at most %ld LPIs on the host.\n", MAX_HOST_LPIS);
>> +
>> +    return 0;
>> +}
>> +
>>  void gicv3_its_dt_init(const struct dt_device_node *node)
>>  {
>>      const struct dt_device_node *its = NULL;
>> diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
>> index 238da84..2534aa5 100644
>> --- a/xen/arch/arm/gic-v3.c
>> +++ b/xen/arch/arm/gic-v3.c
>> @@ -546,6 +546,9 @@ static void __init gicv3_dist_init(void)
>>      type = readl_relaxed(GICD + GICD_TYPER);
>>      nr_lines = 32 * ((type & GICD_TYPE_LINES) + 1);
>>  
>> +    if ( type & GICD_TYPE_LPIS )
>> +        gicv3_lpi_init_host_lpis(((type >> GICD_TYPE_ID_BITS_SHIFT) & 0x1f) + 1);
> 
> Please #define a mask instead of using 0x1f
> 
> 
>> +
>>      printk("GICv3: %d lines, (IID %8.8x).\n",
>>             nr_lines, readl_relaxed(GICD + GICD_IIDR));
>>  
>> @@ -615,6 +618,26 @@ static int gicv3_enable_redist(void)
>>  
>>      return 0;
>>  }
>> +static void gicv3_rdist_init_lpis(void __iomem * rdist_base)
>> +{
>> +    uint32_t reg;
>> +    uint64_t table_reg;
>> +
>> +    if ( list_empty(&host_its_list) )
>> +        return;
>> +
>> +    /* Make sure LPIs are disabled before setting up the BASERs. */
>> +    reg = readl_relaxed(rdist_base + GICR_CTLR);
>> +    writel_relaxed(reg & ~GICR_CTLR_ENABLE_LPIS, rdist_base + GICR_CTLR);
>> +
>> +    table_reg = gicv3_lpi_allocate_pendtable();
>> +    if ( table_reg )
>> +        writeq_relaxed(table_reg, rdist_base + GICR_PENDBASER);
> 
> Maybe we want to return in case table_reg == NULL ?

I guess so. I just wonder what we would do in this case? Panic?
Theoretically we could just proceed without enabling LPIs on this
redistributor, but that's probably not what a user would expect.


Cheers,
Andre.


* Re: [RFC PATCH 03/24] ARM: GICv3 ITS: allocate device and collection table
  2016-10-26 22:57   ` Stefano Stabellini
  2016-11-01 17:34     ` Julien Grall
@ 2016-11-10 15:32     ` Andre Przywara
  2016-11-10 21:06       ` Stefano Stabellini
  1 sibling, 1 reply; 144+ messages in thread
From: Andre Przywara @ 2016-11-10 15:32 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: xen-devel, Julien Grall

Hi,

On 26/10/16 23:57, Stefano Stabellini wrote:
> On Wed, 28 Sep 2016, Andre Przywara wrote:
>> Each ITS maps a pair of a DeviceID (usually the PCI b/d/f triplet) and
>> an EventID (the MSI payload or interrupt ID) to a pair of LPI number
>> and collection ID, which points to the target CPU.
>> This mapping is stored in the device and collection tables, which software
>> has to provide for the ITS to use.
>> Allocate the required memory and hand it the ITS.
>> We limit the number of devices to cover 4 PCI busses for now.
>>
>> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
>> ---
>>  xen/arch/arm/gic-its.c        | 114 ++++++++++++++++++++++++++++++++++++++++++
>>  xen/arch/arm/gic-v3.c         |   5 ++
>>  xen/include/asm-arm/gic-its.h |  49 +++++++++++++++++-
>>  3 files changed, 167 insertions(+), 1 deletion(-)
>>
>> diff --git a/xen/arch/arm/gic-its.c b/xen/arch/arm/gic-its.c
>> index b52dff3..40238a2 100644
>> --- a/xen/arch/arm/gic-its.c
>> +++ b/xen/arch/arm/gic-its.c
>> @@ -21,6 +21,7 @@
>>  #include <xen/device_tree.h>
>>  #include <xen/libfdt/libfdt.h>
>>  #include <asm/p2m.h>
>> +#include <asm/io.h>
>>  #include <asm/gic.h>
>>  #include <asm/gic_v3_defs.h>
>>  #include <asm/gic-its.h>
>> @@ -38,6 +39,119 @@ static DEFINE_PER_CPU(void *, pending_table);
>>          min_t(unsigned int, lpi_data.host_lpi_bits, CONFIG_HOST_LPI_BITS)
>>  #define MAX_HOST_LPIS   (BIT(MAX_HOST_LPI_BITS) - 8192)
>>  
>> +#define BASER_ATTR_MASK                                           \
>> +        ((0x3UL << GITS_BASER_SHAREABILITY_SHIFT)               | \
>> +         (0x7UL << GITS_BASER_OUTER_CACHEABILITY_SHIFT)         | \
>> +         (0x7UL << GITS_BASER_INNER_CACHEABILITY_SHIFT))
>> +#define BASER_RO_MASK   (GENMASK(52, 48) | GENMASK(58, 56))
>> +
>> +static uint64_t encode_phys_addr(paddr_t addr, int page_bits)
>> +{
>> +    uint64_t ret;
>> +
>> +    if ( page_bits < 16)
>> +        return (uint64_t)addr & GENMASK(47, page_bits);
>> +
>> +    ret = addr & GENMASK(47, 16);
>> +    return ret | (addr & GENMASK(51, 48)) >> (48 - 12);
>> +}
>> +
>> +static int gicv3_map_baser(void __iomem *basereg, uint64_t regc, int nr_items)
> 
> Shouldn't this be called its_map_baser?

Yes, the BASER registers are an ITS property.

>> +{
>> +    uint64_t attr;
>> +    int entry_size = (regc >> GITS_BASER_ENTRY_SIZE_SHIFT) & 0x1f;
> 
> The spec says "This field is read-only and specifies the number of
> bytes per entry, minus one." Do we need to increment it by 1?

Mmh, looks so. I guess it worked because the number gets dwarfed by the
page size round up below.

>> +    int pagesz;
>> +    int order;
>> +    void *buffer = NULL;
>> +
>> +    attr  = GIC_BASER_InnerShareable << GITS_BASER_SHAREABILITY_SHIFT;
>> +    attr |= GIC_BASER_CACHE_SameAsInner << GITS_BASER_OUTER_CACHEABILITY_SHIFT;
>> +    attr |= GIC_BASER_CACHE_RaWaWb << GITS_BASER_INNER_CACHEABILITY_SHIFT;
>> +
>> +    /*
>> +     * Loop over the page sizes (4K, 16K, 64K) to find out what the host
>> +     * supports.
>> +     */
> 
> Is this really the best way to do it? Can't we assume ITS supports 4K,
> given that Xen requires 4K pages at the moment?

The ITS pages are totally independent of the core's MMU page size.
So the spec says: "If the GIC implementation supports only a single,
fixed page size, this field might be RO."
I take it that this means that the only implemented page size could be
64K, for instance. And in fact the KVM ITS emulation advertises exactly
this to a guest.

> Is it actually possible
> to find hardware that supports 4K but with an ITS that only support 64K
> or 16K pages? It seems insane to me. Otherwise can't we probe the page
> size somehow?

We can probe by writing and seeing if it sticks - that's what the code
does. Is it really so horrible? I agree it's nasty, but isn't it
basically a loop around the code needed anyway?
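The write-and-read-back probe can be modeled standalone. Here the simulated BASER register implements only 64K pages (as, for instance, KVM's ITS emulation advertises to guests); the field offset and encodings below are illustrative, not copied from the spec:

```c
#include <stdint.h>

#define BASER_PSZ_SHIFT 8
#define BASER_PSZ_MASK  (3u << BASER_PSZ_SHIFT)  /* 0=4K, 1=16K, 2=64K */

/* Simulated GITS_BASERn whose page-size field is RO and fixed to 64K:
 * whatever is written, reading back always shows encoding 2. */
static uint32_t simulated_baser(uint32_t written)
{
    return (written & ~BASER_PSZ_MASK) | (2u << BASER_PSZ_SHIFT);
}

/* Try 4K, 16K and 64K in turn; return the encoding that sticks, -1 if none. */
static int probe_its_page_size(void)
{
    unsigned int psz;

    for (psz = 0; psz < 3; psz++)
    {
        uint32_t readback = simulated_baser(psz << BASER_PSZ_SHIFT);

        if (((readback & BASER_PSZ_MASK) >> BASER_PSZ_SHIFT) == psz)
            return psz;
    }
    return -1;
}
```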

Yes to the rest of the comments.

Cheers,
Andre.

>> +    for (pagesz = 0; pagesz < 3; pagesz++)
>> +    {
>> +        uint64_t reg;
>> +        int nr_bytes;
>> +
>> +        nr_bytes = ROUNDUP(nr_items * entry_size, BIT(pagesz * 2 + 12));
>> +        order = get_order_from_bytes(nr_bytes);
>> +
>> +        if ( !buffer )
>> +            buffer = alloc_xenheap_pages(order, 0);
>> +        if ( !buffer )
>> +            return -ENOMEM;
>> +
>> +        reg  = attr;
>> +        reg |= (pagesz << GITS_BASER_PAGE_SIZE_SHIFT);
>> +        reg |= nr_bytes >> (pagesz * 2 + 12);
>> +        reg |= regc & BASER_RO_MASK;
>> +        reg |= GITS_BASER_VALID;
>> +        reg |= encode_phys_addr(virt_to_maddr(buffer), pagesz * 2 + 12);
>> +
>> +        writeq_relaxed(reg, basereg);
>> +        regc = readl_relaxed(basereg);
>> +
>> +        /* The host didn't like our attributes, just use what it returned. */
>> +        if ( (regc & BASER_ATTR_MASK) != attr )
>> +            attr = regc & BASER_ATTR_MASK;
>> +
>> +        /* If the host accepted our page size, we are done. */
>> +        if ( ((regc >> GITS_BASER_PAGE_SIZE_SHIFT) & 3UL) == pagesz )
>> +            return 0;
>> +
>> +        /* Check whether our buffer is aligned to the next page size already. */
>> +        if ( !(virt_to_maddr(buffer) & (BIT(pagesz * 2 + 12 + 2) - 1)) )
>> +        {
>> +            free_xenheap_pages(buffer, order);
>> +            buffer = NULL;
>> +        }
>> +    }
>> +
>> +    if ( buffer )
>> +        free_xenheap_pages(buffer, order);
>> +
>> +    return -EINVAL;
>> +}
>> +
>> +int gicv3_its_init(struct host_its *hw_its)
>> +{
>> +    uint64_t reg;
>> +    int i;
>> +
>> +    hw_its->its_base = ioremap_nocache(hw_its->addr, hw_its->size);
>> +    if ( !hw_its->its_base )
>> +        return -ENOMEM;
>> +
>> +    for (i = 0; i < 8; i++)
> 
> Code style. Unfortunately we don't have a script to check, but please
> refer to CODING_STYLE. I'd prefer if every number was #define'ed,
> including `8' (something like GITS_BASER_MAX).
> 
> 
>> +    {
>> +        void __iomem *basereg = hw_its->its_base + GITS_BASER0 + i * 8;
>> +        int type;
>> +
>> +        reg = readq_relaxed(basereg);
>> +        type = (reg >> 56) & 0x7;
> 
> Please #define 56 and 0x7
> 
> 
>> +        switch ( type )
>> +        {
>> +        case GITS_BASER_TYPE_NONE:
>> +            continue;
>> +        case GITS_BASER_TYPE_DEVICE:
>> +            /* TODO: find some better way of limiting the number of devices */
>> +            gicv3_map_baser(basereg, reg, 1024);
> 
> An hardcoded max value might be OK, but please #define it.
> 
> 


* Re: [RFC PATCH 05/24] ARM: GICv3 ITS: introduce ITS command handling
  2016-10-26 23:55   ` Stefano Stabellini
  2016-10-27 21:52     ` Stefano Stabellini
@ 2016-11-10 15:57     ` Andre Przywara
  1 sibling, 0 replies; 144+ messages in thread
From: Andre Przywara @ 2016-11-10 15:57 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: xen-devel, Julien Grall

Hi,

On 27/10/16 00:55, Stefano Stabellini wrote:
> On Wed, 28 Sep 2016, Andre Przywara wrote:
>> To be able to easily send commands to the ITS, create the respective
>> wrapper functions, which take care of the ring buffer.
>> The first two commands we implement provide methods to map a collection
>> to a redistributor (aka host core) and to flush the command queue (SYNC).
>> Start using these commands for mapping one collection to each host CPU.
>>
>> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
>> ---
>>  xen/arch/arm/gic-its.c        | 101 ++++++++++++++++++++++++++++++++++++++++++
>>  xen/arch/arm/gic-v3.c         |  17 +++++++
>>  xen/include/asm-arm/gic-its.h |  32 +++++++++++++
>>  3 files changed, 150 insertions(+)
>>
>> diff --git a/xen/arch/arm/gic-its.c b/xen/arch/arm/gic-its.c
>> index c8a7a7e..88397bc 100644
>> --- a/xen/arch/arm/gic-its.c
>> +++ b/xen/arch/arm/gic-its.c
>> @@ -33,6 +33,10 @@ static struct {
>>      int host_lpi_bits;
>>  } lpi_data;
>>  
>> +/* Physical redistributor address */
>> +static DEFINE_PER_CPU(uint64_t, rdist_addr);
>> +/* Redistributor ID */
>> +static DEFINE_PER_CPU(uint64_t, rdist_id);
>>  /* Pending table for each redistributor */
>>  static DEFINE_PER_CPU(void *, pending_table);
>>  
>> @@ -40,6 +44,86 @@ static DEFINE_PER_CPU(void *, pending_table);
>>          min_t(unsigned int, lpi_data.host_lpi_bits, CONFIG_HOST_LPI_BITS)
>>  #define MAX_HOST_LPIS   (BIT(MAX_HOST_LPI_BITS) - 8192)
>>  
>> +#define ITS_COMMAND_SIZE        32
>> +
>> +static int its_send_command(struct host_its *hw_its, void *its_cmd)
>> +{
>> +    int readp, writep;
> 
> uint64_t

A bit overkill, but probably right type-wise.

> 
>> +    spin_lock(&hw_its->cmd_lock);
>> +
>> +    readp = readl_relaxed(hw_its->its_base + GITS_CREADR) & GENMASK(19, 5);
>> +    writep = readl_relaxed(hw_its->its_base + GITS_CWRITER) & GENMASK(19, 5);
> 
> It might be worth to
> 
>   #define ITS_CMD_RING_SIZE PAGE_SIZE
> 
> for clarity

Or revisit this to allow bigger queues than one page.
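The full-queue check with a named ring size, as suggested, can be exercised standalone (ITS_CMD_RING_SIZE is one page here, but the logic is independent of that choice):

```c
#include <stdint.h>

#define ITS_CMD_SIZE      32
#define ITS_CMD_RING_SIZE 4096   /* currently one page; could be enlarged */

/* The queue is full when advancing the write pointer by one command
 * slot would make it collide with the read pointer. */
static int its_queue_full(uint32_t readp, uint32_t writep)
{
    return ((writep + ITS_CMD_SIZE) % ITS_CMD_RING_SIZE) == readp;
}
```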

> 
>> +    if ( ((writep + ITS_COMMAND_SIZE) % PAGE_SIZE) == readp )
>> +    {
>> +        spin_unlock(&hw_its->cmd_lock);
>> +        return -EBUSY;
>> +    }
>> +
>> +    memcpy(hw_its->cmd_buf + writep, its_cmd, ITS_COMMAND_SIZE);
>> +    __flush_dcache_area(hw_its->cmd_buf + writep, ITS_COMMAND_SIZE);
>> +    writep = (writep + ITS_COMMAND_SIZE) % PAGE_SIZE;
>> +
>> +    writeq_relaxed(writep & GENMASK(19, 5), hw_its->its_base + GITS_CWRITER);
>> +
>> +    spin_unlock(&hw_its->cmd_lock);
>> +
>> +    return 0;
>> +}
>> +
>> +static uint64_t encode_rdbase(struct host_its *hw_its, int cpu, uint64_t reg)
>> +{
>> +    reg &= ~GENMASK(51, 16);
>> +
>> +    if ( hw_its->pta )
>> +        reg |= per_cpu(rdist_addr, cpu) & GENMASK(51, 16);
> 
> Again in my version of the spec is GENMASK(47, 16).
> 
> 
>> +    else
>> +        reg |= per_cpu(rdist_id, cpu) << 16;
>> +    return reg;
>> +}
>> +
>> +static int its_send_cmd_sync(struct host_its *its, int cpu)
>> +{
>> +    uint64_t cmd[4];
>> +
>> +    cmd[0] = GITS_CMD_SYNC;
>> +    cmd[1] = 0x00;
>> +    cmd[2] = encode_rdbase(its, cpu, 0x0);
>> +    cmd[3] = 0x00;
>> +
>> +    return its_send_command(its, cmd);
>> +}
>> +
>> +static int its_send_cmd_mapc(struct host_its *its, int collection_id, int cpu)
>> +{
>> +    uint64_t cmd[4];
>> +
>> +    cmd[0] = GITS_CMD_MAPC;
>> +    cmd[1] = 0x00;
>> +    cmd[2] = encode_rdbase(its, cpu, (collection_id & GENMASK(15, 0)) | BIT(63));
>> +    cmd[3] = 0x00;
>> +
>> +    return its_send_command(its, cmd);
>> +}
>> +
>> +/* Set up the (1:1) collection mapping for the given host CPU. */
>> +void gicv3_its_setup_collection(int cpu)
>> +{
>> +    struct host_its *its;
>> +
>> +    list_for_each_entry(its, &host_its_list, entry)
>> +    {
>> +        /* Only send commands to ITS that have been initialized already. */
>> +        if ( !its->cmd_buf )
>> +            continue;
>> +
>> +        its_send_cmd_mapc(its, cpu, cpu);
>> +        its_send_cmd_sync(its, cpu);
>> +    }
>> +}
>> +
>>  #define BASER_ATTR_MASK                                           \
>>          ((0x3UL << GITS_BASER_SHAREABILITY_SHIFT)               | \
>>           (0x7UL << GITS_BASER_OUTER_CACHEABILITY_SHIFT)         | \
>> @@ -147,6 +231,13 @@ int gicv3_its_init(struct host_its *hw_its)
>>      if ( !hw_its->its_base )
>>          return -ENOMEM;
>>  
>> +    /* Make sure the ITS is disabled before programming the BASE registers. */
>> +    reg = readl_relaxed(hw_its->its_base + GITS_CTLR);
>> +    writel_relaxed(reg & ~GITS_CTLR_ENABLE, hw_its->its_base + GITS_CTLR);
>> +
>> +    reg = readq_relaxed(hw_its->its_base + GITS_TYPER);
>> +    hw_its->pta = reg & GITS_TYPER_PTA;
> 
> To avoid problems:
> 
>   pta = !!(reg & GITS_TYPER_PTA);
> 
> 
>>      for (i = 0; i < 8; i++)
>>      {
>>          void __iomem *basereg = hw_its->its_base + GITS_BASER0 + i * 8;
>> @@ -174,9 +265,18 @@ int gicv3_its_init(struct host_its *hw_its)
>>      if ( IS_ERR(hw_its->cmd_buf) )
>>          return PTR_ERR(hw_its->cmd_buf);
>>  
>> +    its_send_cmd_mapc(hw_its, smp_processor_id(), smp_processor_id());
>> +    its_send_cmd_sync(hw_its, smp_processor_id());
> 
> Why do we need these two commands in addition to the ones issued by
> gicv3_its_setup_collection?

gicv3_its_setup_collection() gets called for each redistributor. On the
first CPU this happens _before_ we have actually set up the ITS, so we
can't send any commands at that time. The function checks for an
initialised command buffer, so it ends up doing nothing on the first core.
The function _here_, however, is only called once (on core 0).
So we send the commands for core 0 now, which is the earliest
possible time.

Cheers,
Andre.

>>      return 0;
>>  }
>>  
>> +void gicv3_set_redist_addr(paddr_t address, int redist_id)
>> +{
>> +    this_cpu(rdist_addr) = address;
>> +    this_cpu(rdist_id) = redist_id;
>> +}
>> +
>>  uint64_t gicv3_lpi_allocate_pendtable(void)
>>  {
>>      uint64_t reg, attr;
>> @@ -265,6 +365,7 @@ void gicv3_its_dt_init(const struct dt_device_node *node)
>>          its_data->addr = addr;
>>          its_data->size = size;
>>          its_data->dt_node = its;
>> +        spin_lock_init(&its_data->cmd_lock);
>>  
>>          printk("GICv3: Found ITS @0x%lx\n", addr);
>>  
>> diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
>> index 5cf4618..b9387a3 100644
>> --- a/xen/arch/arm/gic-v3.c
>> +++ b/xen/arch/arm/gic-v3.c
>> @@ -638,6 +638,8 @@ static void gicv3_rdist_init_lpis(void __iomem * rdist_base)
>>      table_reg = gicv3_lpi_get_proptable();
>>      if ( table_reg )
>>          writeq_relaxed(table_reg, rdist_base + GICR_PROPBASER);
>> +
>> +    gicv3_its_setup_collection(smp_processor_id());
>>  }
>>  
>>  static int __init gicv3_populate_rdist(void)
>> @@ -684,7 +686,22 @@ static int __init gicv3_populate_rdist(void)
>>                  this_cpu(rbase) = ptr;
>>  
>>                  if ( typer & GICR_TYPER_PLPIS )
>> +                {
>> +                    paddr_t rdist_addr;
>> +
>> +                    rdist_addr = gicv3.rdist_regions[i].base;
>> +                    rdist_addr += ptr - gicv3.rdist_regions[i].map_base;
>> +
>> +                    /* The ITS refers to redistributors either by their physical
>> +                     * address or by their ID. Determine those two values and
>> +                     * let the ITS code store them in per host CPU variables to
>> +                     * later be able to address those redistributors.
>> +                     */
>> +                    gicv3_set_redist_addr(rdist_addr,
>> +                                          (typer >> 8) & GENMASK(15, 0));
> 
> Please #define the 8
> 
> 
>>                      gicv3_rdist_init_lpis(ptr);
>> +                }
>>  
>>                  printk("GICv3: CPU%d: Found redistributor in region %d @%p\n",
>>                          smp_processor_id(), i, ptr);
>> diff --git a/xen/include/asm-arm/gic-its.h b/xen/include/asm-arm/gic-its.h
>> index b2a003f..b49d274 100644
>> --- a/xen/include/asm-arm/gic-its.h
>> +++ b/xen/include/asm-arm/gic-its.h
>> @@ -37,6 +37,7 @@
>>  
>>  /* Register bits */
>>  #define GITS_CTLR_ENABLE     0x1
>> +#define GITS_TYPER_PTA       BIT(19)
>>  #define GITS_IIDR_VALUE      0x34c
>>  
>>  #define GITS_BASER_VALID                BIT(63)
>> @@ -59,6 +60,22 @@
>>                                          (31UL << GITS_BASER_ENTRY_SIZE_SHIFT) |\
>>                                          GITS_BASER_INDIRECT)
>>  
>> +/* ITS command definitions */
>> +#define ITS_CMD_SIZE                    32
>> +
>> +#define GITS_CMD_MOVI                   0x01
>> +#define GITS_CMD_INT                    0x03
>> +#define GITS_CMD_CLEAR                  0x04
>> +#define GITS_CMD_SYNC                   0x05
>> +#define GITS_CMD_MAPD                   0x08
>> +#define GITS_CMD_MAPC                   0x09
>> +#define GITS_CMD_MAPTI                  0x0a
> 
> In my version of the spec (PRD03-GENC-010745 24.0) 0x0a is MAPVI.
> 
> 
>> +#define GITS_CMD_MAPI                   0x0b
>> +#define GITS_CMD_INV                    0x0c
>> +#define GITS_CMD_INVALL                 0x0d
>> +#define GITS_CMD_MOVALL                 0x0e
>> +#define GITS_CMD_DISCARD                0x0f
>> +
>>  #ifndef __ASSEMBLY__
>>  #include <xen/device_tree.h>
>>  
>> @@ -69,7 +86,9 @@ struct host_its {
>>      paddr_t addr;
>>      paddr_t size;
>>      void __iomem *its_base;
>> +    spinlock_t cmd_lock;
>>      void *cmd_buf;
>> +    bool pta;
>>  };
>>  
>>  extern struct list_head host_its_list;
>> @@ -89,6 +108,12 @@ uint64_t gicv3_lpi_allocate_pendtable(void);
>>  int gicv3_lpi_init_host_lpis(int nr_lpis);
>>  int gicv3_its_init(struct host_its *hw_its);
>>  
>> +/* Set the physical address and ID for each redistributor as read from DT. */
>> +void gicv3_set_redist_addr(paddr_t address, int redist_id);
>> +
>> +/* Map a collection for this host CPU to each host ITS. */
>> +void gicv3_its_setup_collection(int cpu);
>> +
>>  #else
>>  
>>  static inline void gicv3_its_dt_init(const struct dt_device_node *node)
>> @@ -110,6 +135,13 @@ static inline int gicv3_its_init(struct host_its *hw_its)
>>  {
>>      return 0;
>>  }
>> +static inline void gicv3_set_redist_addr(paddr_t address, int redist_id)
>> +{
>> +}
>> +static inline void gicv3_its_setup_collection(int cpu)
>> +{
>> +}
>> +
>>  #endif /* CONFIG_HAS_ITS */
>>  
>>  #endif /* __ASSEMBLY__ */
> 

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [RFC PATCH 04/24] ARM: GICv3 ITS: map ITS command buffer
  2016-10-26 23:03   ` Stefano Stabellini
@ 2016-11-10 16:04     ` Andre Przywara
  0 siblings, 0 replies; 144+ messages in thread
From: Andre Przywara @ 2016-11-10 16:04 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: xen-devel, Julien Grall

Hi,

On 27/10/16 00:03, Stefano Stabellini wrote:
> On Wed, 28 Sep 2016, Andre Przywara wrote:
>> Instead of directly manipulating the tables in memory, an ITS driver
>> sends commands via a ring buffer to the ITS h/w to create or alter the
>> LPI mappings.
>> Allocate memory for that buffer and tell the ITS about it to be able
>> to send ITS commands.
>>
>> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
>> ---
>>  xen/arch/arm/gic-its.c        | 25 +++++++++++++++++++++++++
>>  xen/include/asm-arm/gic-its.h |  1 +
>>  2 files changed, 26 insertions(+)
>>
>> diff --git a/xen/arch/arm/gic-its.c b/xen/arch/arm/gic-its.c
>> index 40238a2..c8a7a7e 100644
>> --- a/xen/arch/arm/gic-its.c
>> +++ b/xen/arch/arm/gic-its.c
>> @@ -18,6 +18,7 @@
>>  
>>  #include <xen/config.h>
>>  #include <xen/lib.h>
>> +#include <xen/err.h>
>>  #include <xen/device_tree.h>
>>  #include <xen/libfdt/libfdt.h>
>>  #include <asm/p2m.h>
>> @@ -56,6 +57,26 @@ static uint64_t encode_phys_addr(paddr_t addr, int page_bits)
>>      return ret | (addr & GENMASK(51, 48)) >> (48 - 12);
>>  }
>>  
>> +static void *gicv3_map_cbaser(void __iomem *cbasereg)
> 
> Shouldn't it be its_map_cbaser?

Yes.

>> +{
>> +    uint64_t attr, reg;
>> +    void *buffer;
>> +
>> +    attr  = GIC_BASER_InnerShareable << GITS_BASER_SHAREABILITY_SHIFT;
>> +    attr |= GIC_BASER_CACHE_SameAsInner << GITS_BASER_OUTER_CACHEABILITY_SHIFT;
>> +    attr |= GIC_BASER_CACHE_RaWaWb << GITS_BASER_INNER_CACHEABILITY_SHIFT;
>> +
>> +    buffer = alloc_xenheap_pages(0, 0);
>> +    if ( !buffer )
>> +        return ERR_PTR(-ENOMEM);
> 
> We haven't use much ERR_PTR on arm so far. In this case I'd just return
> NULL.

In this case I agree, though "we haven't used it much" isn't really a
good argument ;-)

Cheers,
Andre.

>> +
>> +    /* We use exactly one 4K page, so the "Size" field is 0. */
>> +    reg = attr | BIT(63) | (virt_to_maddr(buffer) & GENMASK(51, 12));
> 
> Shouldn't the mask be GENMASK(47, 12)? Maybe I have an old spec
> version.
> 
> 
>> +    writeq_relaxed(reg, cbasereg);
>> +
>> +    return buffer;
>> +}
>> +
>>  static int gicv3_map_baser(void __iomem *basereg, uint64_t regc, int nr_items)
>>  {
>>      uint64_t attr;
>> @@ -149,6 +170,10 @@ int gicv3_its_init(struct host_its *hw_its)
>>          }
>>      }
>>  
>> +    hw_its->cmd_buf = gicv3_map_cbaser(hw_its->its_base + GITS_CBASER);
>> +    if ( IS_ERR(hw_its->cmd_buf) )
>> +        return PTR_ERR(hw_its->cmd_buf);
>> +
>>      return 0;
>>  }
>>  
>> diff --git a/xen/include/asm-arm/gic-its.h b/xen/include/asm-arm/gic-its.h
>> index 589b889..b2a003f 100644
>> --- a/xen/include/asm-arm/gic-its.h
>> +++ b/xen/include/asm-arm/gic-its.h
>> @@ -69,6 +69,7 @@ struct host_its {
>>      paddr_t addr;
>>      paddr_t size;
>>      void __iomem *its_base;
>> +    void *cmd_buf;
>>  };
>>  
>>  extern struct list_head host_its_list;
>> -- 
>> 2.9.0
>>
> 


* Re: [RFC PATCH 06/24] ARM: GICv3 ITS: introduce host LPI array
  2016-10-27 22:59   ` Stefano Stabellini
  2016-11-02 15:14     ` Julien Grall
@ 2016-11-10 17:22     ` Andre Przywara
  2016-11-10 21:48       ` Stefano Stabellini
  1 sibling, 1 reply; 144+ messages in thread
From: Andre Przywara @ 2016-11-10 17:22 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: xen-devel, Julien Grall

Hi,

On 27/10/16 23:59, Stefano Stabellini wrote:
> On Wed, 28 Sep 2016, Andre Przywara wrote:
>> The number of LPIs on a host can be potentially huge (millions),
>> although in practise will be mostly reasonable. So prematurely allocating
>> an array of struct irq_desc's for each LPI is not an option.
>> However Xen itself does not care about LPIs, as every LPI will be injected
>> into a guest (Dom0 for now).
>> Create a dense data structure (8 Bytes) for each LPI which holds just
>> enough information to determine the virtual IRQ number and the VCPU into
>> which the LPI needs to be injected.
>> Also to not artificially limit the number of LPIs, we create a 2-level
>> table for holding those structures.
>> This patch introduces functions to initialize these tables and to
>> create, lookup and destroy entries for a given LPI.
>> We allocate and access LPI information in a way that does not require
>> a lock.
>>
>> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
>> ---
>>  xen/arch/arm/gic-its.c        | 154 ++++++++++++++++++++++++++++++++++++++++++
>>  xen/include/asm-arm/gic-its.h |  18 +++++
>>  2 files changed, 172 insertions(+)
>>
>> diff --git a/xen/arch/arm/gic-its.c b/xen/arch/arm/gic-its.c
>> index 88397bc..2140e4a 100644
>> --- a/xen/arch/arm/gic-its.c
>> +++ b/xen/arch/arm/gic-its.c
>> @@ -18,18 +18,31 @@
>>  
>>  #include <xen/config.h>
>>  #include <xen/lib.h>
>> +#include <xen/sched.h>
>>  #include <xen/err.h>
>>  #include <xen/device_tree.h>
>>  #include <xen/libfdt/libfdt.h>
>>  #include <asm/p2m.h>
>> +#include <asm/domain.h>
>>  #include <asm/io.h>
>>  #include <asm/gic.h>
>>  #include <asm/gic_v3_defs.h>
>>  #include <asm/gic-its.h>
>>  
>> +/* LPIs on the host always go to a guest, so no struct irq_desc for them. */
>> +union host_lpi {
>> +    uint64_t data;
>> +    struct {
>> +        uint64_t virt_lpi:32;
>> +        uint64_t dom_id:16;
>> +        uint64_t vcpu_id:16;
>> +    };
>> +};
> 
> Why not the following?
> 
>   union host_lpi {
>       uint64_t data;
>       struct {
>           uint32_t virt_lpi;
>           uint16_t dom_id;
>           uint16_t vcpu_id;
>       };
>   };

I am not sure that gives me a guarantee of stuffing everything into a
u64 (as per the C standard). It probably will on arm64 with gcc, but I
thought better safe than sorry.

>>  /* Global state */
>>  static struct {
>>      uint8_t *lpi_property;
>> +    union host_lpi **host_lpis;
>>      int host_lpi_bits;
>>  } lpi_data;
>>  
>> @@ -43,6 +56,26 @@ static DEFINE_PER_CPU(void *, pending_table);
>>  #define MAX_HOST_LPI_BITS                                                \
>>          min_t(unsigned int, lpi_data.host_lpi_bits, CONFIG_HOST_LPI_BITS)
>>  #define MAX_HOST_LPIS   (BIT(MAX_HOST_LPI_BITS) - 8192)
>> +#define HOST_LPIS_PER_PAGE      (PAGE_SIZE / sizeof(union host_lpi))
>> +
>> +static union host_lpi *gic_find_host_lpi(uint32_t lpi, struct domain *d)
> 
> I take "lpi" is the physical lpi here. Maybe we would rename it to "plpi"
> for clarity.

Indeed.

> 
>> +{
>> +    union host_lpi *hlpi;
>> +
>> +    if ( lpi < 8192 || lpi >= MAX_HOST_LPIS + 8192 )
>> +        return NULL;
>> +
>> +    lpi -= 8192;
>> +    if ( !lpi_data.host_lpis[lpi / HOST_LPIS_PER_PAGE] )
>> +        return NULL;
>> +
>> +    hlpi = &lpi_data.host_lpis[lpi / HOST_LPIS_PER_PAGE][lpi % HOST_LPIS_PER_PAGE];
> 
> I realize I am sometimes obsessive about this, but division operations
> are expensive and this is on the hot path, so I would do:
> 
> #define HOST_LPIS_PER_PAGE      (PAGE_SIZE >> 3)

to replace
#define HOST_LPIS_PER_PAGE      (PAGE_SIZE / sizeof(union host_lpi))?

This should be computed by the compiler, as it's constant.

> unsigned int table = lpi / HOST_LPIS_PER_PAGE;

So I'd rather replace this by ">> (PAGE_SHIFT - 3)".
But again the compiler would do this for us, as replacing a "constant
division by a power of two" with a "right shift" is a textbook example
of an easy optimization, if I remember my compiler class at uni correctly ;-)

> then use table throughout this function.

I see your point (though this is ARMv8, which always has udiv).
But to prove your paranoia wrong: I don't see any divisions in the
disassembly, but an lsr #3 and an lsr #9 and various other clever and
cheap ARMv8 instructions ;-)
Compilers have really come a long way in 2016 ...

> 
>> +    if ( d && hlpi->dom_id != d->domain_id )
>> +        return NULL;
> 
> I think this function is very useful so I would avoid making any domain
> checks here: one day we might want to retrieve hlpi even if hlpi->dom_id
> != d->domain_id. I would move the domain check outside.

That's why I have "d && ..." in front. If you pass in NULL for the
domain, it will skip this check. That saves us from coding the check in
every caller.
Is that not good enough?

> 
>> +    return hlpi;
>> +}
>>  
>>  #define ITS_COMMAND_SIZE        32
>>  
>> @@ -96,6 +129,33 @@ static int its_send_cmd_sync(struct host_its *its, int cpu)
>>      return its_send_command(its, cmd);
>>  }
>>  
>> +static int its_send_cmd_discard(struct host_its *its,
>> +                                uint32_t deviceid, uint32_t eventid)
>> +{
>> +    uint64_t cmd[4];
>> +
>> +    cmd[0] = GITS_CMD_DISCARD | ((uint64_t)deviceid << 32);
>> +    cmd[1] = eventid;
>> +    cmd[2] = 0x00;
>> +    cmd[3] = 0x00;
>> +
>> +    return its_send_command(its, cmd);
>> +}
>> +
>> +static int its_send_cmd_mapti(struct host_its *its,
>> +                              uint32_t deviceid, uint32_t eventid,
>> +                              uint32_t pintid, uint16_t icid)
>> +{
>> +    uint64_t cmd[4];
>> +
>> +    cmd[0] = GITS_CMD_MAPTI | ((uint64_t)deviceid << 32);
>> +    cmd[1] = eventid | ((uint64_t)pintid << 32);
>> +    cmd[2] = icid;
>> +    cmd[3] = 0x00;
>> +
>> +    return its_send_command(its, cmd);
>> +}
>> +
>>  static int its_send_cmd_mapc(struct host_its *its, int collection_id, int cpu)
>>  {
>>      uint64_t cmd[4];
>> @@ -330,15 +390,109 @@ uint64_t gicv3_lpi_get_proptable()
>>      return reg;
>>  }
>>  
>> +/* Allocate the 2nd level array for host LPIs. This one holds pointers
>> + * to the page with the actual "union host_lpi" entries. Our LPI limit
>> + * avoids excessive memory usage.
>> + */
>>  int gicv3_lpi_init_host_lpis(int lpi_bits)
>>  {
>> +    int nr_lpi_ptrs;
>> +
>>      lpi_data.host_lpi_bits = lpi_bits;
>>  
>> +    nr_lpi_ptrs = MAX_HOST_LPIS / (PAGE_SIZE / sizeof(union host_lpi));
>> +
>> +    lpi_data.host_lpis = xzalloc_array(union host_lpi *, nr_lpi_ptrs);
>> +    if ( !lpi_data.host_lpis )
>> +        return -ENOMEM;
> 
> Why are we not allocating the 2nd level right away? To save memory? If
> so, I would like some numbers in a real use case scenario written either
> here on in the commit message.

LPIs can be allocated sparsely. Each LPI entry uses 8 bytes; chances are
we never use more than a few dozen LPIs on a real system, so with this
scheme we get away with just two pages.

Allocating memory for all 1 << 20 LPIs (the default) would take 8 MB
(probably for nothing); extending this to 24 bits would use 128 MB already.
The problem is that Xen cannot know how many LPIs Dom0 will use, so I'd
rather make this number generous here - hence the allocation scheme.

Not sure if this is actually overkill or paranoid and we would get away
with a much smaller single level allocation, driven by a config option
or runtime parameter, though.

>>      printk("GICv3: using at most %ld LPIs on the host.\n", MAX_HOST_LPIS);
>>  
>>      return 0;
>>  }
>>  
>> +/* Allocates a new host LPI to be injected as "virt_lpi" into the specified
>> + * VCPU. Returns the host LPI ID or a negative error value.
>> + */
>> +int gicv3_lpi_allocate_host_lpi(struct host_its *its,
>> +                                uint32_t devid, uint32_t eventid,
>> +                                struct vcpu *v, int virt_lpi)
>> +{
>> +    int chunk, i;
>> +    union host_lpi hlpi, *new_chunk;
>> +
>> +    /* TODO: handle some kind of preassigned LPI mapping for DomUs */
>> +    if ( !its )
>> +        return -EPERM;
>> +
>> +    /* TODO: This could be optimized by storing some "next available" hint and
>> +     * only iterate if this one doesn't work. But this function should be
>> +     * called rarely.
>> +     */
> 
> Yes please. Even a trivial pointer to last would be far better than this.
> It would be nice to run some numbers and prove that in realistic
> scenarios finding an empty plpi doesn't take more than 5-10 ops, which
> should be the case unless we have to loop over and the initial chunks
> are still fully populated, causing Xen to scan for 512 units at a time.
> We definitely want to avoid that, except perhaps in rare worst-case scenarios.

I can try, though keep in mind that this code is really only called when
allocating a host LPI, which only happens when an LPI gets mapped.
And that is done only when a Dom0 driver initializes a device. Normally
you wouldn't expect this during actual guest runtime.

> 
>> +    for (chunk = 0; chunk < MAX_HOST_LPIS / HOST_LPIS_PER_PAGE; chunk++)
>> +    {
>> +        /* If we hit an unallocated chunk, we initialize it and use entry 0. */
>> +        if ( !lpi_data.host_lpis[chunk] )
>> +        {
>> +            new_chunk = alloc_xenheap_pages(0, 0);
>> +            if ( !new_chunk )
>> +                return -ENOMEM;
>> +
>> +            memset(new_chunk, 0, PAGE_SIZE);
>> +            lpi_data.host_lpis[chunk] = new_chunk;
>> +            i = 0;
>> +        }
>> +        else
>> +        {
>> +            /* Find an unallocted entry in this chunk. */
>> +            for (i = 0; i < HOST_LPIS_PER_PAGE; i++)
>> +                if ( !lpi_data.host_lpis[chunk][i].virt_lpi )
>> +                    break;
>> +
>> +            /* If this chunk is fully allocted, advance to the next one. */
>                                            ^ allocated
> 
> 
>> +            if ( i == HOST_LPIS_PER_PAGE)
>> +                continue;
>> +        }
>> +
>> +        hlpi.virt_lpi = virt_lpi;
>> +        hlpi.dom_id = v->domain->domain_id;
>> +        hlpi.vcpu_id = v->vcpu_id;
>> +        lpi_data.host_lpis[chunk][i].data = hlpi.data;
>> +
>> +        if (its)
> 
> code style
> 
> 
>> +        {
>> +            its_send_cmd_mapti(its, devid, eventid,
>> +                               chunk * HOST_LPIS_PER_PAGE + i + 8192, 0);
>> +            its_send_cmd_sync(its, 0);
> 
> Why hardcode the physical cpu to 0? Should we get the pcpu the vcpu is
> currently running on?

Yes, admittedly I was papering over this for the RFC (as I am afraid
there's more than that).
Will look at this.

Cheers,
Andre.

>> +        }
>> +
>> +        return chunk * HOST_LPIS_PER_PAGE + i + 8192;
>> +    }
>> +
>> +    return -ENOSPC;
>> +}
>> +
>> +/* Drops the connection of the given host LPI to a virtual LPI.
>> + */
>> +int gicv3_lpi_drop_host_lpi(struct host_its *its,
>> +                            uint32_t devid, uint32_t eventid, uint32_t host_lpi)
>> +{
>> +    union host_lpi *hlpip;
>> +
>> +    if ( !its )
>> +        return -EPERM;
>> +
>> +    hlpip = gic_find_host_lpi(host_lpi, NULL);
>> +    if ( !hlpip )
>> +        return -1;
>> +
>> +    hlpip->data = 0;
>> +
>> +    its_send_cmd_discard(its, devid, eventid);
>> +
>> +    return 0;
>> +}
>> +
>>  void gicv3_its_dt_init(const struct dt_device_node *node)
>>  {
>>      const struct dt_device_node *its = NULL;
>> diff --git a/xen/include/asm-arm/gic-its.h b/xen/include/asm-arm/gic-its.h
>> index b49d274..512a388 100644
>> --- a/xen/include/asm-arm/gic-its.h
>> +++ b/xen/include/asm-arm/gic-its.h
>> @@ -114,6 +114,12 @@ void gicv3_set_redist_addr(paddr_t address, int redist_id);
>>  /* Map a collection for this host CPU to each host ITS. */
>>  void gicv3_its_setup_collection(int cpu);
>>  
>> +int gicv3_lpi_allocate_host_lpi(struct host_its *its,
>> +                                uint32_t devid, uint32_t eventid,
>> +                                struct vcpu *v, int virt_lpi);
>> +int gicv3_lpi_drop_host_lpi(struct host_its *its,
>> +                            uint32_t devid, uint32_t eventid,
>> +                            uint32_t host_lpi);
>>  #else
>>  
>>  static inline void gicv3_its_dt_init(const struct dt_device_node *node)
>> @@ -141,6 +147,18 @@ static inline void gicv3_set_redist_addr(paddr_t address, int redist_id)
>>  static inline void gicv3_its_setup_collection(int cpu)
>>  {
>>  }
>> +static inline int gicv3_lpi_allocate_host_lpi(struct host_its *its,
>> +                                              uint32_t devid, uint32_t eventid,
>> +                                              struct vcpu *v, int virt_lpi)
>> +{
>> +    return 0;
>> +}
>> +static inline int gicv3_lpi_drop_host_lpi(struct host_its *its,
>> +                                          uint32_t devid, uint32_t eventid,
>> +                                          uint32_t host_lpi)
>> +{
>> +    return 0;
>> +}
>>  
>>  #endif /* CONFIG_HAS_ITS */
>>  
>> -- 
>> 2.9.0
>>
> 


* Re: [RFC PATCH 21/24] ARM: vITS: handle INVALL command
  2016-11-10 11:57         ` Julien Grall
@ 2016-11-10 20:42           ` Stefano Stabellini
  2016-11-11 15:53             ` Julien Grall
  0 siblings, 1 reply; 144+ messages in thread
From: Stefano Stabellini @ 2016-11-10 20:42 UTC (permalink / raw)
  To: Julien Grall; +Cc: Andre Przywara, Stefano Stabellini, Vijay Kilari, xen-devel

On Thu, 10 Nov 2016, Julien Grall wrote:
> Hi,
> 
> On 10/11/16 00:21, Stefano Stabellini wrote:
> > On Fri, 4 Nov 2016, Andre Przywara wrote:
> > > On 24/10/16 16:32, Vijay Kilari wrote:
> > > > On Wed, Sep 28, 2016 at 11:54 PM, Andre Przywara
> > > > <andre.przywara@arm.com> wrote:
> > > > > The INVALL command instructs an ITS to invalidate the configuration
> > > > > data for all LPIs associated with a given redistributor (read: VCPU).
> > > > > To avoid iterating (and mapping!) all guest tables, we instead go
> > > > > through
> > > > > the host LPI table to find any LPIs targetting this VCPU. We then
> > > > > update
> > > > > the configuration bits for the connected virtual LPIs.
> > > > > 
> > > > > Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> > > > > ---
> > > > >  xen/arch/arm/gic-its.c        | 58
> > > > > +++++++++++++++++++++++++++++++++++++++++++
> > > > >  xen/arch/arm/vgic-its.c       | 30 ++++++++++++++++++++++
> > > > >  xen/include/asm-arm/gic-its.h |  2 ++
> > > > >  3 files changed, 90 insertions(+)
> > > > > 
> > > > > diff --git a/xen/arch/arm/gic-its.c b/xen/arch/arm/gic-its.c
> > > > > index 6f4329f..5129d6e 100644
> > > > > --- a/xen/arch/arm/gic-its.c
> > > > > +++ b/xen/arch/arm/gic-its.c
> > > > > @@ -228,6 +228,18 @@ static int its_send_cmd_inv(struct host_its *its,
> > > > >      return its_send_command(its, cmd);
> > > > >  }
> > > > > 
> > > > > +static int its_send_cmd_invall(struct host_its *its, int cpu)
> > > > > +{
> > > > > +    uint64_t cmd[4];
> > > > > +
> > > > > +    cmd[0] = GITS_CMD_INVALL;
> > > > > +    cmd[1] = 0x00;
> > > > > +    cmd[2] = cpu & GENMASK(15, 0);
> > > > > +    cmd[3] = 0x00;
> > > > > +
> > > > > +    return its_send_command(its, cmd);
> > > > > +}
> > > > > +
> > > > >  int gicv3_its_map_device(struct host_its *hw_its, struct domain *d,
> > > > >                           int devid, int bits, bool valid)
> > > > >  {
> > > > > @@ -668,6 +680,52 @@ uint32_t gicv3_lpi_lookup_lpi(struct domain *d,
> > > > > uint32_t host_lpi, int *vcpu_id)
> > > > >      return hlpi.virt_lpi;
> > > > >  }
> > > > > 
> > > > > +/* Iterate over all host LPIs, and updating the "enabled" state for a
> > > > > given
> > > > > + * guest redistributor (VCPU) given the respective state in the
> > > > > provided
> > > > > + * proptable. This proptable is indexed by the stored virtual LPI
> > > > > number.
> > > > > + * This is to implement a guest INVALL command.
> > > > > + */
> > > > > +void gicv3_lpi_update_configurations(struct vcpu *v, uint8_t
> > > > > *proptable)
> > > > > +{
> > > > > +    int chunk, i;
> > > > > +    struct host_its *its;
> > > > > +
> > > > > +    for (chunk = 0; chunk < MAX_HOST_LPIS / HOST_LPIS_PER_PAGE;
> > > > > chunk++)
> > > > > +    {
> > > > > +        if ( !lpi_data.host_lpis[chunk] )
> > > > > +            continue;
> > > > > +
> > > > > +        for (i = 0; i < HOST_LPIS_PER_PAGE; i++)
> > > > > +        {
> > > > > +            union host_lpi *hlpip = &lpi_data.host_lpis[chunk][i],
> > > > > hlpi;
> > > > > +            uint32_t hlpi_nr;
> > > > > +
> > > > > +            hlpi.data = hlpip->data;
> > > > > +            if ( !hlpi.virt_lpi )
> > > > > +                continue;
> > > > > +
> > > > > +            if ( hlpi.dom_id != v->domain->domain_id )
> > > > > +                continue;
> > > > > +
> > > > > +            if ( hlpi.vcpu_id != v->vcpu_id )
> > > > > +                continue;
> > > > > +
> > > > > +            hlpi_nr = chunk * HOST_LPIS_PER_PAGE + i;
> > > > > +
> > > > > +            if ( proptable[hlpi.virt_lpi] & LPI_PROP_ENABLED )
> > > > > +                lpi_data.lpi_property[hlpi_nr - 8192] |=
> > > > > LPI_PROP_ENABLED;
> > > > > +            else
> > > > > +                lpi_data.lpi_property[hlpi_nr - 8192] &=
> > > > > ~LPI_PROP_ENABLED;
> > > > > +        }
> > > > > +    }
> > > >         AFAIK, the initial design is to use tasklet to update property
> > > > table as it consumes
> > > > lot of time to update the table.
> > > 
> > > This is a possible, but premature optimization.
> > > Linux (at the moment, at least) only calls INVALL _once_, just after
> > > initialising the collections. And at this point no LPI is mapped, so the
> > > whole routine does basically nothing - and that quite fast.
> > > We can later have any kind of fancy algorithm if there is a need for.
> > 
> > I understand, but as-is it's so expensive that could be a DOS vector.
> > Also other OSes could issue INVALL much more often than Linux.
> > 
> > Considering that we might support device assigment with ITS soon, I
> > think it might be best to parse per-domain virtual tables rather than
> > the full list of physical LPIs, which theoretically could be much
> > larger. Or alternatively we need to think about adding another field to
> > lpi_data, to link together all lpis assigned to the same domain, but
> > that would cost even more memory. Or we could rate-limit the INVALL
> > calls to one every few seconds or something. Or all of the above :-)
> 
> It is not necessary for an ITS implementation to wait until an INVALL/INV
> command is issued to take into account the change of the LPI configuration
> tables (aka property table in this thread).
> 
> So how about trapping the property table? We would still have to go through
> the property table the first time (i.e when writing into the GICR_PROPBASER),
> but INVALL would be a nop.
> 
> The idea would be unmapping the region when GICR_PROPBASER is written. So any
> read/write access would be trapped. For a write access, Xen will update the
> LPIs internal data structures and write the value in the guest page unmapped.
> If we don't want to have an overhead for the read access, we could just
> write-protect the page in stage-2 page table. So only write access would be
> trapped.
> 
> Going further, for the ITS, Xen is using the guest memory to store the ITS
> information. This means Xen has to validate the information at every access.
> So how about restricting the access in stage-2 page table? That would remove
> the overhead of validating data.
> 
> Any thoughts?

It is a promising idea. Let me expand on this.

I agree that on INVALL if we need to do anything, we should go through
the virtual property table rather than the full list of host lpis.

Once we agree on that, the two options we have are:

1) We let the guest write anything to the table, then we do a full
validation of the table on INVALL. We also do a validation of the table
entries used as parameters for any other commands.

2) We map the table read-only, then do a validation of every guest
write. INVALL becomes a NOP and parameters validation for many commands
could be removed or at least reduced.

Conceptually the two options should both lead to exactly the same
result. Therefore I think the decision should be made purely on
performance: which one is faster?  If it is true that INVALL is only
typically called once I suspect that 1) is faster, but I would like to
see some simple benchmarks, such as the time that it takes to configure
the ITS from scratch with the two approaches.


That said, even if 1) turns out to be faster and the approach of choice,
the idea of making the tables read-only in stage-2 could still be useful
to simplify parameters validation and protect Xen from concurrent
changes of the table entries from another guest vcpu. If the tables are
RW, we need to be very careful in Xen and use barriers to avoid
re-reading any guest table entry twice, as the guest could be changing
it in parallel to exploit the hypervisor.


* Re: [RFC PATCH 02/24] ARM: GICv3: allocate LPI pending and property table
  2016-11-10 15:29     ` Andre Przywara
@ 2016-11-10 21:00       ` Stefano Stabellini
  0 siblings, 0 replies; 144+ messages in thread
From: Stefano Stabellini @ 2016-11-10 21:00 UTC (permalink / raw)
  To: Andre Przywara; +Cc: xen-devel, Julien Grall, Stefano Stabellini

On Thu, 10 Nov 2016, Andre Przywara wrote:
> Hi,
> 
> On 26/10/16 02:10, Stefano Stabellini wrote:
> > Hi Andre,
> > 
> > Sorry for the late reply, I'll try to be faster for the next rounds of
> > review. The patch looks good for a first iteration. Some comments below.
> 
> No worries and thanks for the thorough review, much appreciated.
> As you can see I took my time to respond as well ;-)
> 
> > 
> > On Wed, 28 Sep 2016, Andre Przywara wrote:
> >> The ARM GICv3 ITS provides a new kind of interrupt called LPIs.
> >> The pending bits and the configuration data (priority, enable bits) for
> >> those LPIs are stored in tables in normal memory, which software has to
> >> provide to the hardware.
> >> Allocate the required memory, initialize it and hand it over to each
> >> ITS. We limit the number of LPIs we use with a compile time constant to
> >> avoid wasting memory.
> >>
> >> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> >> ---
> >>  xen/arch/arm/Kconfig              |  6 ++++
> >>  xen/arch/arm/efi/efi-boot.h       |  1 -
> >>  xen/arch/arm/gic-its.c            | 76 +++++++++++++++++++++++++++++++++++++++
> >>  xen/arch/arm/gic-v3.c             | 27 ++++++++++++++
> >>  xen/include/asm-arm/cache.h       |  4 +++
> >>  xen/include/asm-arm/gic-its.h     | 22 +++++++++++-
> >>  xen/include/asm-arm/gic_v3_defs.h | 48 ++++++++++++++++++++++++-
> >>  7 files changed, 181 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
> >> index 9fe3b8e..66e2bb8 100644
> >> --- a/xen/arch/arm/Kconfig
> >> +++ b/xen/arch/arm/Kconfig
> >> @@ -50,6 +50,12 @@ config HAS_ITS
> >>          depends on ARM_64
> >>          depends on HAS_GICV3
> >>  
> >> +config HOST_LPI_BITS
> >> +        depends on HAS_ITS
> >> +        int "Maximum bits for GICv3 host LPIs (14-32)"
> >> +        range 14 32
> >> +        default "20"
> >> +
> >>  config ALTERNATIVE
> >>  	bool
> >>  
> >> diff --git a/xen/arch/arm/efi/efi-boot.h b/xen/arch/arm/efi/efi-boot.h
> >> index 045d6ce..dc64aec 100644
> >> --- a/xen/arch/arm/efi/efi-boot.h
> >> +++ b/xen/arch/arm/efi/efi-boot.h
> >> @@ -10,7 +10,6 @@
> >>  #include "efi-dom0.h"
> >>  
> >>  void noreturn efi_xen_start(void *fdt_ptr, uint32_t fdt_size);
> >> -void __flush_dcache_area(const void *vaddr, unsigned long size);
> >>  
> >>  #define DEVICE_TREE_GUID \
> >>  {0xb1b621d5, 0xf19c, 0x41a5, {0x83, 0x0b, 0xd9, 0x15, 0x2c, 0x69, 0xaa, 0xe0}}
> >> diff --git a/xen/arch/arm/gic-its.c b/xen/arch/arm/gic-its.c
> >> index 0f42a77..b52dff3 100644
> >> --- a/xen/arch/arm/gic-its.c
> >> +++ b/xen/arch/arm/gic-its.c
> >> @@ -20,10 +20,86 @@
> >>  #include <xen/lib.h>
> >>  #include <xen/device_tree.h>
> >>  #include <xen/libfdt/libfdt.h>
> >> +#include <asm/p2m.h>
> >>  #include <asm/gic.h>
> >>  #include <asm/gic_v3_defs.h>
> >>  #include <asm/gic-its.h>
> >>  
> >> +/* Global state */
> >> +static struct {
> >> +    uint8_t *lpi_property;
> >> +    int host_lpi_bits;
> >> +} lpi_data;
> >> +
> >> +/* Pending table for each redistributor */
> >> +static DEFINE_PER_CPU(void *, pending_table);
> >> +
> >> +#define MAX_HOST_LPI_BITS                                                \
> > 
> > To avoid confusion, I would call this MAX_PHYS_LPI_BITS
> > 
> > 
> >> +        min_t(unsigned int, lpi_data.host_lpi_bits, CONFIG_HOST_LPI_BITS)
> >> +#define MAX_HOST_LPIS   (BIT(MAX_HOST_LPI_BITS) - 8192)
> > 
> > And this MAX_PHYS_LPIS
> 
> Done.
> 
> >> +uint64_t gicv3_lpi_allocate_pendtable(void)
> >> +{
> >> +    uint64_t reg, attr;
> >> +    void *pendtable;
> > 
> > I would introduce a check to make sure that this_cpu(pending_table) == NULL.
> 
> Can do. So I return this value then, though this should never happen.
> 
> > 
> >> +    attr  = GIC_BASER_CACHE_RaWaWb << GICR_PENDBASER_INNER_CACHEABILITY_SHIFT;
> >> +    attr |= GIC_BASER_CACHE_SameAsInner << GICR_PENDBASER_OUTER_CACHEABILITY_SHIFT;
> >> +    attr |= GIC_BASER_InnerShareable << GICR_PENDBASER_SHAREABILITY_SHIFT;
> >> +
> >> +    /*
> >> +     * The pending table holds one bit per LPI, so we need three bits less
> >> +     * than the number of LPI_BITs.
> > 
> > Why 3 bit less? Please add more info on how you came up with 3.
> 
> 3 bits as in 1 << 3 = 8 = BITS_PER_BYTE. We need to divide by that,
> which is a shift by 3, hence ORDER - 3. Does that make sense?
> But this mayhem goes away anyway with _xmalloc.

Please add info to the in code comment (if it will still be there).
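For reference, the size arithmetic being discussed can be sketched as standalone C (hypothetical helper names; PAGE_SHIFT assumed to be 12, matching the alloc_xenheap_pages order used in the patch):

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SHIFT 12

/* Size of the pending table in bytes: one bit per LPI, so divide the
 * number of LPIs by BITS_PER_BYTE, i.e. shift right by 3. */
static size_t pendtable_bytes(unsigned int lpi_bits)
{
    return (size_t)1 << (lpi_bits - 3);
}

/* Allocation order: at least 64K (2^16 bytes) to satisfy the ITS
 * alignment requirement, expressed in pages (hence the -12). */
static unsigned int pendtable_order(unsigned int lpi_bits)
{
    unsigned int bytes_log2 = lpi_bits - 3;

    if ( bytes_log2 < 16 )
        bytes_log2 = 16;
    return bytes_log2 - PAGE_SHIFT;
}
```

With the default of 20 LPI bits this gives a 128KB pending table of order 5, matching `MAX(lpi_data.host_lpi_bits - 3, 16) - 12` in the patch.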


> > 
> >>         But the alignment requirement from the
> >> +     * ITS is 64K, so make order at least 16 (-12).
> > 
> > Does it need to be 64K aligned or does it need to be at least 64K in
> > size?
> 
> The first.
> 
> > That makes a big difference. If it just needs to be 64K aligned,
> > you can do that with xmalloc.
> 
> Well, not xmalloc (since I don't have a data structure of that size),
> but _xmalloc. I just saw that this is exported as well (I dismissed this
> before because of the leading underscore).
> Also "alloc pages" sounded more like what I had in mind, but I guess
> aligning it to 64K serves the same purpose.
> 
> >> +     */
> >> +    pendtable = alloc_xenheap_pages(MAX(lpi_data.host_lpi_bits - 3, 16) - 12, 0);
> > 
> > Shouldn't we be using MAX_HOST_LPI_BITS instead of
> > lpi_data.host_lpi_bits to make this calculation?
> 
> I was under the impression that the redistributors expect the pending
> table to cover every possible LPI as reported in GICD_TYPER (because in
> contrast to PROPBASER the PENDBASER register lacks a size field).
> But thinking about this again this seems to be insane, since 32 bit
> worth of LPIs would lead to a 0.5GB pending table. But as the LPI
> numbers are under the control of software, we can go with allocating
> less - up to our internal limit - which is also what Linux does.
> 
> > 
> >> +    if ( !pendtable )
> >> +        return 0;
> >> +
> >> +    memset(pendtable, 0, BIT(lpi_data.host_lpi_bits - 3));
> > 
> > flush_dcache?
> 
> Uhm, yes.
> 
> > 
> >> +    this_cpu(pending_table) = pendtable;
> >> +
> >> +    reg  = attr | GICR_PENDBASER_PTZ;
> >> +    reg |= virt_to_maddr(pendtable) & GENMASK(51, 16);
> >> +
> >> +    return reg;
> >> +}
> >> +
> >> +uint64_t gicv3_lpi_get_proptable()
> >> +{
> >> +    uint64_t attr;
> >> +    static uint64_t reg = 0;
> >> +
> >> +    /* The property table is shared across all redistributors. */
> >> +    if ( reg )
> >> +        return reg;
> > 
> > Can't you just use lpi_data.lpi_property != NULL instead of introducing
> > a new static local variable?
> 
> Seems like a good idea actually. We have to reconstruct the register
> content, but that seems doable.
> 
> >> +    attr  = GIC_BASER_CACHE_RaWaWb << GICR_PENDBASER_INNER_CACHEABILITY_SHIFT;
> >> +    attr |= GIC_BASER_CACHE_SameAsInner << GICR_PENDBASER_OUTER_CACHEABILITY_SHIFT;
> >> +    attr |= GIC_BASER_InnerShareable << GICR_PENDBASER_SHAREABILITY_SHIFT;
> >> +
> >> +    lpi_data.lpi_property = alloc_xenheap_pages(MAX_HOST_LPI_BITS - 12, 0);
> > 
> > Please add a comment on how the order is calculated.
> 
> Does " ... - PAGE_SHIFT" suffice?

Yes, that would work.
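A sketch of that order calculation (hypothetical helper name; PAGE_SHIFT assumed to be 12): the property table uses one configuration byte per LPI, so the table is 2^lpi_bits bytes.

```c
#include <assert.h>

#define PAGE_SHIFT 12

/* One byte of configuration per LPI, so 2^lpi_bits bytes total;
 * the page order is therefore lpi_bits - PAGE_SHIFT. */
static unsigned int proptable_order(unsigned int lpi_bits)
{
    return lpi_bits - PAGE_SHIFT;
}
```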


> > 
> > 
> >> +    if ( !lpi_data.lpi_property )
> >> +        return 0;
> >> +
> >> +    memset(lpi_data.lpi_property, GIC_PRI_IRQ | LPI_PROP_RES1, MAX_HOST_LPIS);
> >> +    __flush_dcache_area(lpi_data.lpi_property, MAX_HOST_LPIS);
> >> +
> >> +    reg  = attr | ((MAX_HOST_LPI_BITS - 1) << 0);
> >> +    reg |= virt_to_maddr(lpi_data.lpi_property) & GENMASK(51, 12);
> >> +
> >> +    return reg;
> >> +}
> >> +
> >> +int gicv3_lpi_init_host_lpis(int lpi_bits)
> >> +{
> >> +    lpi_data.host_lpi_bits = lpi_bits;
> >> +
> >> +    printk("GICv3: using at most %ld LPIs on the host.\n", MAX_HOST_LPIS);
> >> +
> >> +    return 0;
> >> +}
> >> +
> >>  void gicv3_its_dt_init(const struct dt_device_node *node)
> >>  {
> >>      const struct dt_device_node *its = NULL;
> >> diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
> >> index 238da84..2534aa5 100644
> >> --- a/xen/arch/arm/gic-v3.c
> >> +++ b/xen/arch/arm/gic-v3.c
> >> @@ -546,6 +546,9 @@ static void __init gicv3_dist_init(void)
> >>      type = readl_relaxed(GICD + GICD_TYPER);
> >>      nr_lines = 32 * ((type & GICD_TYPE_LINES) + 1);
> >>  
> >> +    if ( type & GICD_TYPE_LPIS )
> >> +        gicv3_lpi_init_host_lpis(((type >> GICD_TYPE_ID_BITS_SHIFT) & 0x1f) + 1);
> > 
> > Please #define a mask instead of using 0x1f
> > 
> > 
> >> +
> >>      printk("GICv3: %d lines, (IID %8.8x).\n",
> >>             nr_lines, readl_relaxed(GICD + GICD_IIDR));
> >>  
> >> @@ -615,6 +618,26 @@ static int gicv3_enable_redist(void)
> >>  
> >>      return 0;
> >>  }
> >> +static void gicv3_rdist_init_lpis(void __iomem * rdist_base)
> >> +{
> >> +    uint32_t reg;
> >> +    uint64_t table_reg;
> >> +
> >> +    if ( list_empty(&host_its_list) )
> >> +        return;
> >> +
> >> +    /* Make sure LPIs are disabled before setting up the BASERs. */
> >> +    reg = readl_relaxed(rdist_base + GICR_CTLR);
> >> +    writel_relaxed(reg & ~GICR_CTLR_ENABLE_LPIS, rdist_base + GICR_CTLR);
> >> +
> >> +    table_reg = gicv3_lpi_allocate_pendtable();
> >> +    if ( table_reg )
> >> +        writeq_relaxed(table_reg, rdist_base + GICR_PENDBASER);
> > 
> > Maybe we want to return in case table_reg == 0 ?
> 
> I guess so. I just wonder what we would do in this case? Panic?
> Theoretically we could just proceed without enabling LPIs on this
> redistributor, but that's probably not what a user would expect.
 
Print an error and/or panic are good options.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [RFC PATCH 03/24] ARM: GICv3 ITS: allocate device and collection table
  2016-11-10 15:32     ` Andre Przywara
@ 2016-11-10 21:06       ` Stefano Stabellini
  0 siblings, 0 replies; 144+ messages in thread
From: Stefano Stabellini @ 2016-11-10 21:06 UTC (permalink / raw)
  To: Andre Przywara; +Cc: xen-devel, Julien Grall, Stefano Stabellini

On Thu, 10 Nov 2016, Andre Przywara wrote:
> Hi,
> 
> On 26/10/16 23:57, Stefano Stabellini wrote:
> > On Wed, 28 Sep 2016, Andre Przywara wrote:
> >> Each ITS maps a pair of a DeviceID (usually the PCI b/d/f triplet) and
> >> an EventID (the MSI payload or interrupt ID) to a pair of LPI number
> >> and collection ID, which points to the target CPU.
> >> This mapping is stored in the device and collection tables, which software
> >> has to provide for the ITS to use.
> >> Allocate the required memory and hand it to the ITS.
> >> We limit the number of devices to cover 4 PCI busses for now.
> >>
> >> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> >> ---
> >>  xen/arch/arm/gic-its.c        | 114 ++++++++++++++++++++++++++++++++++++++++++
> >>  xen/arch/arm/gic-v3.c         |   5 ++
> >>  xen/include/asm-arm/gic-its.h |  49 +++++++++++++++++-
> >>  3 files changed, 167 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/xen/arch/arm/gic-its.c b/xen/arch/arm/gic-its.c
> >> index b52dff3..40238a2 100644
> >> --- a/xen/arch/arm/gic-its.c
> >> +++ b/xen/arch/arm/gic-its.c
> >> @@ -21,6 +21,7 @@
> >>  #include <xen/device_tree.h>
> >>  #include <xen/libfdt/libfdt.h>
> >>  #include <asm/p2m.h>
> >> +#include <asm/io.h>
> >>  #include <asm/gic.h>
> >>  #include <asm/gic_v3_defs.h>
> >>  #include <asm/gic-its.h>
> >> @@ -38,6 +39,119 @@ static DEFINE_PER_CPU(void *, pending_table);
> >>          min_t(unsigned int, lpi_data.host_lpi_bits, CONFIG_HOST_LPI_BITS)
> >>  #define MAX_HOST_LPIS   (BIT(MAX_HOST_LPI_BITS) - 8192)
> >>  
> >> +#define BASER_ATTR_MASK                                           \
> >> +        ((0x3UL << GITS_BASER_SHAREABILITY_SHIFT)               | \
> >> +         (0x7UL << GITS_BASER_OUTER_CACHEABILITY_SHIFT)         | \
> >> +         (0x7UL << GITS_BASER_INNER_CACHEABILITY_SHIFT))
> >> +#define BASER_RO_MASK   (GENMASK(52, 48) | GENMASK(58, 56))
> >> +
> >> +static uint64_t encode_phys_addr(paddr_t addr, int page_bits)
> >> +{
> >> +    uint64_t ret;
> >> +
> >> +    if ( page_bits < 16)
> >> +        return (uint64_t)addr & GENMASK(47, page_bits);
> >> +
> >> +    ret = addr & GENMASK(47, 16);
> >> +    return ret | (addr & GENMASK(51, 48)) >> (48 - 12);
> >> +}
> >> +
> >> +static int gicv3_map_baser(void __iomem *basereg, uint64_t regc, int nr_items)
> > 
> > Shouldn't this be called its_map_baser?
> 
> Yes, the BASER registers are an ITS property.
> 
> >> +{
> >> +    uint64_t attr;
> >> +    int entry_size = (regc >> GITS_BASER_ENTRY_SIZE_SHIFT) & 0x1f;
> > 
> > The spec says "This field is read-only and specifies the number of
> > bytes per entry, minus one." Do we need to increment it by 1?
> 
> Mmh, looks so. I guess it worked because the number gets dwarfed by the
> page size round up below.
> 
> >> +    int pagesz;
> >> +    int order;
> >> +    void *buffer = NULL;
> >> +
> >> +    attr  = GIC_BASER_InnerShareable << GITS_BASER_SHAREABILITY_SHIFT;
> >> +    attr |= GIC_BASER_CACHE_SameAsInner << GITS_BASER_OUTER_CACHEABILITY_SHIFT;
> >> +    attr |= GIC_BASER_CACHE_RaWaWb << GITS_BASER_INNER_CACHEABILITY_SHIFT;
> >> +
> >> +    /*
> >> +     * Loop over the page sizes (4K, 16K, 64K) to find out what the host
> >> +     * supports.
> >> +     */
> > 
> > Is this really the best way to do it? Can't we assume ITS supports 4K,
> > given that Xen requires 4K pages at the moment?
> 
> The ITS pages are totally independent from the core's MMU page size.
> So the spec says: "If the GIC implementation supports only a single,
> fixed page size, this field might be RO."
> I take it that this means that the only implemented page size could be
> 64K, for instance. And in fact the KVM ITS emulation advertises exactly
> this to a guest.
> 
> > Is it actually possible
> > to find hardware that supports 4K but with an ITS that only support 64K
> > or 16K pages? It seems insane to me. Otherwise can't we probe the page
> > size somehow?
> 
> We can probe by writing and seeing if it sticks - that's what the code
> does. Is it really so horrible? I agree it's nasty, but isn't it
> basically a loop around the code needed anyway?

It looks very strange that there isn't a better way to find that info.
It looks a bit like a hack. It is also bad from a software point of
view to be forced to cope with all three possible page granularities.

But oh well, sometimes we just have to deal with whatever the hardware
offers us.
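The size arithmetic in the probing loop can be sketched as standalone C (hypothetical helper names; the ENTRY_SIZE field position follows the GICv3 spec, and the +1 reflects the minus-one encoding discussed above):

```c
#include <assert.h>
#include <stdint.h>

/* GITS_BASER.Entry_Size lives in bits [52:48] per the GICv3 spec and
 * encodes bytes-per-entry minus one. */
#define GITS_BASER_ENTRY_SIZE_SHIFT 48

static unsigned int baser_entry_size(uint64_t regc)
{
    return ((regc >> GITS_BASER_ENTRY_SIZE_SHIFT) & 0x1f) + 1;
}

/* Bytes needed for nr_items entries, rounded up to the ITS page size:
 * pagesz 0/1/2 selects 4K/16K/64K, i.e. 1 << (pagesz * 2 + 12). */
static uint64_t baser_table_bytes(unsigned int nr_items,
                                  unsigned int entry_size,
                                  unsigned int pagesz)
{
    uint64_t page = (uint64_t)1 << (pagesz * 2 + 12);

    return ((uint64_t)nr_items * entry_size + page - 1) & ~(page - 1);
}
```

So 1024 device-table entries of 8 bytes need one 4K or 16K page, but a full 64K page if the ITS only implements 64K pages.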


> Yes to the rest of the comments.
> 
> 
> >> +    for (pagesz = 0; pagesz < 3; pagesz++)
> >> +    {
> >> +        uint64_t reg;
> >> +        int nr_bytes;
> >> +
> >> +        nr_bytes = ROUNDUP(nr_items * entry_size, BIT(pagesz * 2 + 12));
> >> +        order = get_order_from_bytes(nr_bytes);
> >> +
> >> +        if ( !buffer )
> >> +            buffer = alloc_xenheap_pages(order, 0);
> >> +        if ( !buffer )
> >> +            return -ENOMEM;
> >> +
> >> +        reg  = attr;
> >> +        reg |= (pagesz << GITS_BASER_PAGE_SIZE_SHIFT);
> >> +        reg |= nr_bytes >> (pagesz * 2 + 12);
> >> +        reg |= regc & BASER_RO_MASK;
> >> +        reg |= GITS_BASER_VALID;
> >> +        reg |= encode_phys_addr(virt_to_maddr(buffer), pagesz * 2 + 12);
> >> +
> >> +        writeq_relaxed(reg, basereg);
> >> +        regc = readl_relaxed(basereg);
> >> +
> >> +        /* The host didn't like our attributes, just use what it returned. */
> >> +        if ( (regc & BASER_ATTR_MASK) != attr )
> >> +            attr = regc & BASER_ATTR_MASK;
> >> +
> >> +        /* If the host accepted our page size, we are done. */
> >> +        if ( (reg & (3UL << GITS_BASER_PAGE_SIZE_SHIFT)) == pagesz )
> >> +            return 0;
> >> +
> >> +        /* Check whether our buffer is aligned to the next page size already. */
> >> +        if ( !(virt_to_maddr(buffer) & (BIT(pagesz * 2 + 12 + 2) - 1)) )
> >> +        {
> >> +            free_xenheap_pages(buffer, order);
> >> +            buffer = NULL;
> >> +        }
> >> +    }
> >> +
> >> +    if ( buffer )
> >> +        free_xenheap_pages(buffer, order);
> >> +
> >> +    return -EINVAL;
> >> +}
> >> +
> >> +int gicv3_its_init(struct host_its *hw_its)
> >> +{
> >> +    uint64_t reg;
> >> +    int i;
> >> +
> >> +    hw_its->its_base = ioremap_nocache(hw_its->addr, hw_its->size);
> >> +    if ( !hw_its->its_base )
> >> +        return -ENOMEM;
> >> +
> >> +    for (i = 0; i < 8; i++)
> > 
> > Code style. Unfortunately we don't have a script to check, but please
> > refer to CODING_STYLE. I'd prefer if every number was #define'ed,
> > including `8' (something like GITS_BASER_MAX).
> > 
> > 
> >> +    {
> >> +        void __iomem *basereg = hw_its->its_base + GITS_BASER0 + i * 8;
> >> +        int type;
> >> +
> >> +        reg = readq_relaxed(basereg);
> >> +        type = (reg >> 56) & 0x7;
> > 
> > Please #define 56 and 0x7
> > 
> > 
> >> +        switch ( type )
> >> +        {
> >> +        case GITS_BASER_TYPE_NONE:
> >> +            continue;
> >> +        case GITS_BASER_TYPE_DEVICE:
> >> +            /* TODO: find some better way of limiting the number of devices */
> >> +            gicv3_map_baser(basereg, reg, 1024);
> > 
> > An hardcoded max value might be OK, but please #define it.
> > 
> > 
> 


* Re: [RFC PATCH 06/24] ARM: GICv3 ITS: introduce host LPI array
  2016-11-10 17:22     ` Andre Przywara
@ 2016-11-10 21:48       ` Stefano Stabellini
  0 siblings, 0 replies; 144+ messages in thread
From: Stefano Stabellini @ 2016-11-10 21:48 UTC (permalink / raw)
  To: Andre Przywara; +Cc: xen-devel, Julien Grall, Stefano Stabellini

On Thu, 10 Nov 2016, Andre Przywara wrote:
> Hi,
> 
> On 27/10/16 23:59, Stefano Stabellini wrote:
> > On Wed, 28 Sep 2016, Andre Przywara wrote:
> >> The number of LPIs on a host can be potentially huge (millions),
> >> although in practice it will be mostly reasonable. So prematurely allocating
> >> an array of struct irq_desc's for each LPI is not an option.
> >> However Xen itself does not care about LPIs, as every LPI will be injected
> >> into a guest (Dom0 for now).
> >> Create a dense data structure (8 Bytes) for each LPI which holds just
> >> enough information to determine the virtual IRQ number and the VCPU into
> >> which the LPI needs to be injected.
> >> Also to not artificially limit the number of LPIs, we create a 2-level
> >> table for holding those structures.
> >> This patch introduces functions to initialize these tables and to
> >> create, lookup and destroy entries for a given LPI.
> >> We allocate and access LPI information in a way that does not require
> >> a lock.
> >>
> >> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> >> ---
> >>  xen/arch/arm/gic-its.c        | 154 ++++++++++++++++++++++++++++++++++++++++++
> >>  xen/include/asm-arm/gic-its.h |  18 +++++
> >>  2 files changed, 172 insertions(+)
> >>
> >> diff --git a/xen/arch/arm/gic-its.c b/xen/arch/arm/gic-its.c
> >> index 88397bc..2140e4a 100644
> >> --- a/xen/arch/arm/gic-its.c
> >> +++ b/xen/arch/arm/gic-its.c
> >> @@ -18,18 +18,31 @@
> >>  
> >>  #include <xen/config.h>
> >>  #include <xen/lib.h>
> >> +#include <xen/sched.h>
> >>  #include <xen/err.h>
> >>  #include <xen/device_tree.h>
> >>  #include <xen/libfdt/libfdt.h>
> >>  #include <asm/p2m.h>
> >> +#include <asm/domain.h>
> >>  #include <asm/io.h>
> >>  #include <asm/gic.h>
> >>  #include <asm/gic_v3_defs.h>
> >>  #include <asm/gic-its.h>
> >>  
> >> +/* LPIs on the host always go to a guest, so no struct irq_desc for them. */
> >> +union host_lpi {
> >> +    uint64_t data;
> >> +    struct {
> >> +        uint64_t virt_lpi:32;
> >> +        uint64_t dom_id:16;
> >> +        uint64_t vcpu_id:16;
> >> +    };
> >> +};
> > 
> > Why not the following?
> > 
> >   union host_lpi {
> >       uint64_t data;
> >       struct {
> >           uint32_t virt_lpi;
> >           uint16_t dom_id;
> >           uint16_t vcpu_id;
> >       };
> >   };
> 
> I am not sure that gives me a guarantee of stuffing everything into a
> u64 (as per the C standard). It probably will on arm64 with gcc, but I
> thought better safe than sorry.

I am pretty sure that it is covered by the standard, also see
IHI0055A_aapcs64. Additionally I don't think the union with "data" is
actually required either.
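For reference, the fixed-width variant suggested above still occupies exactly 8 bytes under AAPCS64 (a standalone sketch; whether to keep the `data` member is a separate question):

```c
#include <assert.h>
#include <stdint.h>

/* Fixed-width members instead of uint64_t bitfields: the layout is
 * unambiguous, and the union still fits in a single 8-byte word. */
union host_lpi {
    uint64_t data;
    struct {
        uint32_t virt_lpi;
        uint16_t dom_id;
        uint16_t vcpu_id;
    };
};
```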


> >>  /* Global state */
> >>  static struct {
> >>      uint8_t *lpi_property;
> >> +    union host_lpi **host_lpis;
> >>      int host_lpi_bits;
> >>  } lpi_data;
> >>  
> >> @@ -43,6 +56,26 @@ static DEFINE_PER_CPU(void *, pending_table);
> >>  #define MAX_HOST_LPI_BITS                                                \
> >>          min_t(unsigned int, lpi_data.host_lpi_bits, CONFIG_HOST_LPI_BITS)
> >>  #define MAX_HOST_LPIS   (BIT(MAX_HOST_LPI_BITS) - 8192)
> >> +#define HOST_LPIS_PER_PAGE      (PAGE_SIZE / sizeof(union host_lpi))
> >> +
> >> +static union host_lpi *gic_find_host_lpi(uint32_t lpi, struct domain *d)
> > 
> > I take "lpi" is the physical lpi here. Maybe we would rename it to "plpi"
> > for clarity.
> 
> Indeed.
> 
> > 
> >> +{
> >> +    union host_lpi *hlpi;
> >> +
> >> +    if ( lpi < 8192 || lpi >= MAX_HOST_LPIS + 8192 )
> >> +        return NULL;
> >> +
> >> +    lpi -= 8192;
> >> +    if ( !lpi_data.host_lpis[lpi / HOST_LPIS_PER_PAGE] )
> >> +        return NULL;
> >> +
> >> +    hlpi = &lpi_data.host_lpis[lpi / HOST_LPIS_PER_PAGE][lpi % HOST_LPIS_PER_PAGE];
> > 
> > I realize I am sometimes obsessive about this, but division operations
> > are expensive and this is on the hot path, so I would do:
> > 
> > #define HOST_LPIS_PER_PAGE      (PAGE_SIZE >> 3)
> 
> to replace
> #define HOST_LPIS_PER_PAGE      (PAGE_SIZE / sizeof(union host_lpi))?
> 
> This should be computed by the compiler, as it's constant.
> 
> > unsigned int table = lpi / HOST_LPIS_PER_PAGE;
> 
> So I'd rather replace this by ">> (PAGE_SHIFT - 3)".

This is actually what I meant, thanks.


> But again the compiler would do this for us, as replacing "constant
> divisions by a power of two" with "right shifts" is a textbook example
> of easy optimization, if I remember my compiler class at uni correctly ;-)

Yet, we found instances where this didn't happen in the common Xen
scheduler code on x86.


> > then use table throughout this function.
> 
> I see your point (though this is ARMv8, which always has udiv).
> But to prove your paranoia wrong: I don't see any divisions in the
> disassembly, but a lsr #3 and a lsr #9 and various other clever and
> cheap ARMv8 instructions ;-)
> Compilers have really come a long way in 2016 ...

Fair enough, thanks for checking. That is enough for me.
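A standalone sketch of the equivalence (hypothetical names; PAGE_SIZE assumed 4K): the divisor is a constant power of two, so the compiler can lower the division to a right shift by PAGE_SHIFT - 3.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE  4096UL
#define PAGE_SHIFT 12

/* 8 bytes per host-LPI entry, so 512 entries per 4K page. */
#define HOST_LPIS_PER_PAGE  (PAGE_SIZE / sizeof(uint64_t))

static unsigned int lpi_table_index(uint32_t plpi)
{
    /* Constant power-of-two division: emitted as lsr #(PAGE_SHIFT - 3). */
    return plpi / HOST_LPIS_PER_PAGE;
}
```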


> >> +    if ( d && hlpi->dom_id != d->domain_id )
> >> +        return NULL;
> > 
> > I think this function is very useful so I would avoid making any domain
> > checks here: one day we might want to retrieve hlpi even if hlpi->dom_id
> > != d->domain_id. I would move the domain check outside.
> 
> That's why I have "d && ..." in front. If you pass in NULL for the
> domain, it will skip this check. That saves us from coding the check in
> every caller.
> Is that not good enough?

There is a simple solution to this: write two functions, one without the
check and a wrapper to it with the check.
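The suggested split could look roughly like this (a standalone sketch with placeholder types; the real code would walk the two-level table rather than a flat array):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct domain { uint16_t domain_id; };

union host_lpi {
    uint64_t data;
    struct {
        uint32_t virt_lpi;
        uint16_t dom_id;
        uint16_t vcpu_id;
    };
};

/* Raw lookup: no ownership check (two-level table walk elided). */
static union host_lpi *plpi_to_host_lpi(union host_lpi *table, uint32_t plpi)
{
    if ( plpi < 8192 )
        return NULL;
    return &table[plpi - 8192];
}

/* Checking wrapper: only returns entries owned by domain d. */
static union host_lpi *plpi_to_host_lpi_checked(union host_lpi *table,
                                                uint32_t plpi,
                                                struct domain *d)
{
    union host_lpi *hlpi = plpi_to_host_lpi(table, plpi);

    if ( hlpi && hlpi->dom_id != d->domain_id )
        return NULL;
    return hlpi;
}
```

Callers that legitimately need the entry regardless of ownership use the raw function; everything else goes through the wrapper.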


> > 
> >> +    return hlpi;
> >> +}
> >>  
> >>  #define ITS_COMMAND_SIZE        32
> >>  
> >> @@ -96,6 +129,33 @@ static int its_send_cmd_sync(struct host_its *its, int cpu)
> >>      return its_send_command(its, cmd);
> >>  }
> >>  
> >> +static int its_send_cmd_discard(struct host_its *its,
> >> +                                uint32_t deviceid, uint32_t eventid)
> >> +{
> >> +    uint64_t cmd[4];
> >> +
> >> +    cmd[0] = GITS_CMD_DISCARD | ((uint64_t)deviceid << 32);
> >> +    cmd[1] = eventid;
> >> +    cmd[2] = 0x00;
> >> +    cmd[3] = 0x00;
> >> +
> >> +    return its_send_command(its, cmd);
> >> +}
> >> +
> >> +static int its_send_cmd_mapti(struct host_its *its,
> >> +                              uint32_t deviceid, uint32_t eventid,
> >> +                              uint32_t pintid, uint16_t icid)
> >> +{
> >> +    uint64_t cmd[4];
> >> +
> >> +    cmd[0] = GITS_CMD_MAPTI | ((uint64_t)deviceid << 32);
> >> +    cmd[1] = eventid | ((uint64_t)pintid << 32);
> >> +    cmd[2] = icid;
> >> +    cmd[3] = 0x00;
> >> +
> >> +    return its_send_command(its, cmd);
> >> +}
> >> +
> >>  static int its_send_cmd_mapc(struct host_its *its, int collection_id, int cpu)
> >>  {
> >>      uint64_t cmd[4];
> >> @@ -330,15 +390,109 @@ uint64_t gicv3_lpi_get_proptable()
> >>      return reg;
> >>  }
> >>  
> >> +/* Allocate the 2nd level array for host LPIs. This one holds pointers
> >> + * to the page with the actual "union host_lpi" entries. Our LPI limit
> >> + * avoids excessive memory usage.
> >> + */
> >>  int gicv3_lpi_init_host_lpis(int lpi_bits)
> >>  {
> >> +    int nr_lpi_ptrs;
> >> +
> >>      lpi_data.host_lpi_bits = lpi_bits;
> >>  
> >> +    nr_lpi_ptrs = MAX_HOST_LPIS / (PAGE_SIZE / sizeof(union host_lpi));
> >> +
> >> +    lpi_data.host_lpis = xzalloc_array(union host_lpi *, nr_lpi_ptrs);
> >> +    if ( !lpi_data.host_lpis )
> >> +        return -ENOMEM;
> > 
> > Why are we not allocating the 2nd level right away? To save memory? If
> > so, I would like some numbers in a real use case scenario written either
> > here on in the commit message.
> 
> LPIs can be allocated sparsely. Each LPI uses 8 Bytes, chances are we
> never use more than a few dozen on a real system, so we just use two
> pages with this scheme.
> 
> Allocating memory for all 1 << 20 LPIs (the default) takes 8 MB (probably for
> nothing), extending this to 24 bits uses 128 MB already.
> The problem is that Xen cannot know how many LPIs Dom0 will use, so I'd
> rather make this number generous here - hence the allocation scheme.
> 
> Not sure if this is actually overkill or paranoid and we would get away
> with a much smaller single level allocation, driven by a config option
> or runtime parameter, though.

All right. Please write an in-code comment explaining this reasoning
with a sample number of LPIs used by Dom0 on a real case scenario.
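The memory figures above can be checked with a small sketch (8 bytes per possible LPI if the array were allocated flat, hypothetical helper name):

```c
#include <assert.h>
#include <stdint.h>

/* Memory needed if the host-LPI array were allocated flat:
 * 8 bytes (one union host_lpi) per possible LPI. */
static uint64_t flat_lpi_array_bytes(unsigned int lpi_bits)
{
    return ((uint64_t)1 << lpi_bits) * 8;
}
```

20 bits give 8 MB, 24 bits already 128 MB, which is why the second level is only allocated on demand.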


> >>      printk("GICv3: using at most %ld LPIs on the host.\n", MAX_HOST_LPIS);
> >>  
> >>      return 0;
> >>  }
> >>  
> >> +/* Allocates a new host LPI to be injected as "virt_lpi" into the specified
> >> + * VCPU. Returns the host LPI ID or a negative error value.
> >> + */
> >> +int gicv3_lpi_allocate_host_lpi(struct host_its *its,
> >> +                                uint32_t devid, uint32_t eventid,
> >> +                                struct vcpu *v, int virt_lpi)
> >> +{
> >> +    int chunk, i;
> >> +    union host_lpi hlpi, *new_chunk;
> >> +
> >> +    /* TODO: handle some kind of preassigned LPI mapping for DomUs */
> >> +    if ( !its )
> >> +        return -EPERM;
> >> +
> >> +    /* TODO: This could be optimized by storing some "next available" hint and
> >> +     * only iterate if this one doesn't work. But this function should be
> >> +     * called rarely.
> >> +     */
> > 
> > Yes please. Even a trivial pointer to last would be far better than this.
> > It would be nice to run some numbers and prove that in realistic
> > scenarios finding an empty plpi doesn't take more than 5-10 ops, which
> > should be the case unless we have to loop over and the initial chucks
> > are still fully populated, causing Xen to scan for 512 units at a time.
> > We defenitely want to avoid that, if not in rare worse case scenarios.
> 
> I can try, though keep in mind that this code is really only called on
> allocating a host LPI, which would only happen when an LPI gets mapped.
> And this is done only upon a Dom0 driver initializing a device. Normally
> you wouldn't expect this during the actual guest runtime.

It would happen during actual guest runtime with device assignment,
wouldn't it?

A simple pointer to last would be a good start, and in-code comment about
how many times we loop to find an empty plpi on a normal case (no device
assignment).


* Re: [RFC PATCH 21/24] ARM: vITS: handle INVALL command
  2016-11-10 20:42           ` Stefano Stabellini
@ 2016-11-11 15:53             ` Julien Grall
  2016-11-11 20:31               ` Stefano Stabellini
  0 siblings, 1 reply; 144+ messages in thread
From: Julien Grall @ 2016-11-11 15:53 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: Andre Przywara, Steve Capper, Vijay Kilari, xen-devel

Hi Stefano,

On 10/11/16 20:42, Stefano Stabellini wrote:
> On Thu, 10 Nov 2016, Julien Grall wrote:
>> On 10/11/16 00:21, Stefano Stabellini wrote:
>>> On Fri, 4 Nov 2016, Andre Przywara wrote:
>>>> On 24/10/16 16:32, Vijay Kilari wrote:
>>>>> On Wed, Sep 28, 2016 at 11:54 PM, Andre Przywara
>>>>>         AFAIK, the initial design is to use a tasklet to update the
>>>>> property table, as it consumes
>>>>> a lot of time to update the table.
>>>>
>>>> This is a possible, but premature optimization.
>>>> Linux (at the moment, at least) only calls INVALL _once_, just after
>>>> initialising the collections. And at this point no LPI is mapped, so the
>>>> whole routine does basically nothing - and does that quite fast.
>>>> We can later have any kind of fancy algorithm if there is a need for.
>>>
>>> I understand, but as-is it's so expensive that it could be a DoS vector.
>>> Also other OSes could issue INVALL much more often than Linux.
>>>
>>> Considering that we might support device assignment with ITS soon, I
>>> think it might be best to parse per-domain virtual tables rather than
>>> the full list of physical LPIs, which theoretically could be much
>>> larger. Or alternatively we need to think about adding another field to
>>> lpi_data, to link together all lpis assigned to the same domain, but
>>> that would cost even more memory. Or we could rate-limit the INVALL
>>> calls to one every few seconds or something. Or all of the above :-)
>>
>> It is not necessary for an ITS implementation to wait until an INVALL/INV
>> command is issued to take into account the change of the LPI configuration
>> tables (aka property table in this thread).
>>
>> So how about trapping the property table? We would still have to go through
>> the property table the first time (i.e when writing into the GICR_PROPBASER),
>> but INVALL would be a nop.
>>
>> The idea would be unmapping the region when GICR_PROPBASER is written. So any
>> read/write access would be trapped. For a write access, Xen will update the
>> LPIs internal data structures and write the value in the guest page unmapped.
>> If we don't want to have an overhead for the read access, we could just
>> write-protect the page in stage-2 page table. So only write access would be
>> trapped.
>>
>> Going further, for the ITS, Xen is using the guest memory to store the ITS
>> information. This means Xen has to validate the information at every access.
>> So how about restricting the access in stage-2 page table? That would remove
>> the overhead of validating data.
>>
>> Any thoughts?
>
> It is a promising idea. Let me expand on this.
>
> I agree that on INVALL if we need to do anything, we should go through
> the virtual property table rather than the full list of host lpis.

I agree on that.

>
> Once we agree on that, the two options we have are:

I believe we had a similar discussion when Vijay worked on the vITS (see 
[1]). I would have hoped that this new proposal took into account the 
constraint mentioned back then.

>
> 1) We let the guest write anything to the table, then we do a full
> validation of the table on INVALL. We also do a validation of the table
> entries used as parameters for any other commands.
>
> 2) We map the table read-only, then do a validation of every guest
> write. INVALL becomes a NOP and parameter validation for many commands
> could be removed or at least reduced.
>
> Conceptually the two options should both lead to exactly the same
> result. Therefore I think the decision should be made purely on
> performance: which one is faster?  If it is true that INVALL is only
> typically called once I suspect that 1) is faster, but I would like to
> see some simple benchmarks, such as the time that it takes to configure
> the ITS from scratch with the two approaches.

The problem is not which one is faster but which one will not take down 
the hypervisor.

The guest is allowed to create a command queue as big as 1MB; a command 
uses 32 bytes, so the command queue can hold up to 32768 commands.

Now imagine a malicious guest filling up the command queue with INVALL 
and then notifying the ITS (via GITS_CWRITER). Based on patch #5, all 
those commands will be handled in one go. So you have to multiply the 
time of one command by 32768.

Given that the hypervisor is not preemptible, it likely means a DOS.
A similar problem would happen if a vITS command is translated to an ITS 
command (see the implementation of INVALL). Multiple malicious guests 
could slow down the other guests by filling up the host command queue. 
Worse, a command from a normal guest could be discarded because the host 
ITS command queue is full (see its_send_command in gic-its.c).
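For reference, the worst-case queue capacity follows directly from GITS_CBASER's maximum size (1MB of 32-byte commands):

```c
#include <assert.h>

/* GITS_CBASER allows a command queue of up to 1MB. */
#define ITS_CMD_QUEUE_MAX_BYTES (1UL << 20)
#define ITS_COMMAND_SIZE        32

/* Worst case: a guest fills the whole queue before writing GITS_CWRITER. */
#define ITS_CMD_QUEUE_MAX_CMDS  (ITS_CMD_QUEUE_MAX_BYTES / ITS_COMMAND_SIZE)
```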

That's why the approach we had in the previous series was "host ITS 
commands should be limited when emulating guest ITS commands". As I 
recall, in that series the host and guest LPIs were fully separated 
(enabling a guest LPI did not enable a host LPI).

That said, a design doc explaining all the constraints and code flow 
would have been really helpful. It took me a while to digest and 
understand the interaction between each part of the code. The design 
document would have also been a good place to discuss problems 
that span across multiple patches (like the command queue emulation).

>
>
> That said, even if 1) turns out to be faster and the approach of choice,
> the idea of making the tables read-only in stage-2 could still be useful
to simplify parameter validation and protect Xen from concurrent
changes of the table entries from another guest vcpu. If the tables are
> RW, we need to be very careful in Xen and use barriers to avoid
> re-reading any guest table entry twice, as the guest could be changing
> it in parallel to exploit the hypervisor.

Yes, and this is true for all the tables (PROPBASER, BASER,...) that 
reside on guest memory. Most of them should not be touched by the guest.

This is the same for the command queue (patch #12), accessing the 
command directly from the guest memory is not safe. A guest could modify 
the value behind our back.

Regards,

[1] https://xenbits.xen.org/people/ianc/vits/draftG.html

-- 
Julien Grall


* Re: [RFC PATCH 21/24] ARM: vITS: handle INVALL command
  2016-11-11 15:53             ` Julien Grall
@ 2016-11-11 20:31               ` Stefano Stabellini
  2016-11-18 18:39                 ` Stefano Stabellini
  0 siblings, 1 reply; 144+ messages in thread
From: Stefano Stabellini @ 2016-11-11 20:31 UTC (permalink / raw)
  To: Julien Grall
  Cc: Andre Przywara, Stefano Stabellini, Steve Capper, Vijay Kilari,
	xen-devel

On Fri, 11 Nov 2016, Julien Grall wrote:
> Hi Stefano,
> 
> On 10/11/16 20:42, Stefano Stabellini wrote:
> > On Thu, 10 Nov 2016, Julien Grall wrote:
> > > On 10/11/16 00:21, Stefano Stabellini wrote:
> > > > On Fri, 4 Nov 2016, Andre Przywara wrote:
> > > > > On 24/10/16 16:32, Vijay Kilari wrote:
> > > > > > On Wed, Sep 28, 2016 at 11:54 PM, Andre Przywara
> > > > > >         AFAIK, the initial design is to use a tasklet to update the
> > > > > > property table as it consumes a lot of time to update the table.
> > > > > 
> > > > > This is a possible, but premature optimization.
> > > > > Linux (at the moment, at least) only calls INVALL _once_, just after
> > > > > initialising the collections. And at this point no LPI is mapped, so
> > > > > the
> > > > > whole routine does basically nothing - and that quite fast.
> > > > > We can later have any kind of fancy algorithm if there is a need for.
> > > > 
> > > > I understand, but as-is it's so expensive that it could be a DOS vector.
> > > > Also other OSes could issue INVALL much more often than Linux.
> > > > 
> > > > Considering that we might support device assigment with ITS soon, I
> > > > think it might be best to parse per-domain virtual tables rather than
> > > > the full list of physical LPIs, which theoretically could be much
> > > > larger. Or alternatively we need to think about adding another field to
> > > > lpi_data, to link together all lpis assigned to the same domain, but
> > > > that would cost even more memory. Or we could rate-limit the INVALL
> > > > calls to one every few seconds or something. Or all of the above :-)
> > > 
> > > It is not necessary for an ITS implementation to wait until an INVALL/INV
> > > command is issued to take into account the change of the LPI configuration
> > > tables (aka property table in this thread).
> > > 
> > > So how about trapping the property table? We would still have to go
> > > through
> > > the property table the first time (i.e when writing into the
> > > GICR_PROPBASER),
> > > but INVALL would be a nop.
> > > 
> > > The idea would be unmapping the region when GICR_PROPBASER is written. So
> > > any
> > > read/write access would be trapped. For a write access, Xen will update
> > > the
> > > LPIs internal data structures and write the value in the guest page
> > > unmapped.
> > > If we don't want to have an overhead for the read access, we could just
> > > write-protect the page in stage-2 page table. So only write access would
> > > be
> > > trapped.
> > > 
> > > Going further, for the ITS, Xen is using the guest memory to store the ITS
> > > information. This means Xen has to validate the information at every
> > > access.
> > > So how about restricting the access in stage-2 page table? That would
> > > remove
> > > the overhead of validating data.
> > > 
> > > Any thoughts?
> > 
> > It is a promising idea. Let me expand on this.
> > 
> > I agree that on INVALL if we need to do anything, we should go through
> > the virtual property table rather than the full list of host lpis.
> 
> I agree on that.
> 
> > 
> > Once we agree on that, the two options we have are:
> 
> I believe we had a similar discussion when Vijay worked on the vITS (see [1]).
> I would have hoped that this new proposal took into account the constraint
> mentioned back then.
> 
> > 
> > 1) We let the guest write anything to the table, then we do a full
> > validation of the table on INVALL. We also do a validation of the table
> > entries used as parameters for any other commands.
> > 
> > 2) We map the table read-only, then do a validation of every guest
> > write. INVALL becomes a NOP and parameters validation for many commands
> > could be removed or at least reduced.
> > 
> > Conceptually the two options should both lead to exactly the same
> > result. Therefore I think the decision should be made purely on
> > performance: which one is faster?  If it is true that INVALL is only
> > typically called once I suspect that 1) is faster, but I would like to
> > see some simple benchmarks, such as the time that it takes to configure
> > the ITS from scratch with the two approaches.
> 
> The problem is not which one is faster but which one will not take down the
> hypervisor.
> 
> The guest is allowed to create a command queue as big as 1MB, and a command
> uses 32 bytes, so the command queue can fit up to 32640 commands.
> 
> Now imagine a malicious guest filling up the command queue with INVALL and
> then notifying the ITS (via GITS_CWRITER). Based on patch #5, all those
> commands will be handled in one go. So you have to multiply the time of one
> command by 32640.
> 
> Given that the hypervisor is not preemptible, it likely means a DOS.

I think it can be made to work safely using a rate-limiting technique.
Such as: Xen is only going to emulate an INVALL for a given domain only
once every one or two seconds and no more often than that. x86 has
something like that under xen/arch/x86/irq.c, see irq_ratelimit.
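A minimal sketch of that rate-limiting idea, with made-up names and interval (x86's irq_ratelimit differs in its details): emulate the full INVALL scan at most once per window and treat further INVALLs inside the window as no-ops.

```c
#include <stdint.h>
#include <stdbool.h>

#define INVALL_INTERVAL_MS 1000ULL   /* assumed: at most one full scan per second */

struct vits_ratelimit {
    uint64_t last_scan_ms;           /* time of the last emulated full scan */
};

/* Return true when the expensive scan should actually be performed. */
static bool invall_should_scan(struct vits_ratelimit *r, uint64_t now_ms)
{
    if ( r->last_scan_ms != 0 &&
         now_ms - r->last_scan_ms < INVALL_INTERVAL_MS )
        return false;                /* within the window: treat INVALL as a nop */

    r->last_scan_ms = now_ms;
    return true;
}
```

A queue full of INVALLs then costs one scan plus 32639 cheap no-ops, which bounds the damage per window.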

But to be clear, I am not saying that this is necessarily the best way
to do it. I would like to see some benchmarks first.


> A similar problem would happen if a vITS command is translated to an ITS
> command (see the implementation of INVALL). Multiple malicious guests could
> slow down the other guests by filling up the host command queue. Worse, a
> command from a well-behaved guest could be discarded because the host ITS
> command queue is full (see its_send_command in gic-its.c).
 
Looking at the patches, nothing checks for discarded physical ITS
commands. Not good.


> That's why the approach we had in the previous series was "host ITS commands
> should be limited when emulating guest ITS commands". From what I recall, in
> that series the host and guest LPIs were fully separated (enabling a guest
> LPI did not enable a host LPI).

I am interested in reading what Ian suggested to do when the physical
ITS queue is full, but I cannot find anything specific about it in the
doc.

Do you have a suggestion for this? 

The only things that come to mind right now are:

1) check if the ITS queue is full and busy loop until it is not (spin_lock style)
2) check if the ITS queue is full and sleep until it is not (mutex style)
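Option 1) could look roughly like the sketch below. The bounded loop and the simulated queue are illustrative only: a real implementation would poll GITS_CREADR and insert cpu_relax() in the loop, and the bound is there so a stuck ITS cannot hang the non-preemptible hypervisor.

```c
#include <stdbool.h>

#define QUEUE_DEPTH      32640U      /* commands the host queue can hold */
#define QUEUE_SPIN_LIMIT 100000UL    /* give up after this many polls */

/* Simulated host queue: 'pending' commands not yet consumed by the ITS. */
struct sim_queue {
    unsigned long pending;
};

static int wait_for_queue_room(struct sim_queue *q)
{
    unsigned long tries;

    for ( tries = 0; tries < QUEUE_SPIN_LIMIT; tries++ )
    {
        if ( q->pending < QUEUE_DEPTH )
            return 0;                /* a slot is free, command can be queued */
        q->pending--;                /* stand-in for hardware consuming one command */
    }

    return -1;                       /* still full: fall back to sleeping (option 2)
                                      * or report an error to the caller */
}
```

Either way, the caller can no longer silently discard a command, which addresses the "Not good" above.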


> That said, a design doc explaining all the constraints and the code flow
> would have been really helpful. It took me a while to digest and understand
> the interaction between the parts of the code. The design document would have
> also been a good place to discuss problems that span across multiple patches
> (like the command queue emulation).
> 
> > 
> > 
> > That said, even if 1) turns out to be faster and the approach of choice,
> > the idea of making the tables read-only in stage-2 could still be useful
> > to simplify parameters validation and protect Xen from concurrent
> > changes of the table entries from another guest vcpu. If the tables are
> > RW, we need to be very careful in Xen and use barriers to avoid
> > reading any guest table entry twice, as the guest could be changing
> > it in parallel to exploit the hypervisor.
> 
> Yes, and this is true for all the tables (PROPBASER, BASER,...) that reside on
> guest memory. Most of them should not be touched by the guest.
> 
> This is the same for the command queue (patch #12), accessing the command
> directly from the guest memory is not safe. A guest could modify the value
> behind our back.
> 
> Regards,
> 
> [1] https://xenbits.xen.org/people/ianc/vits/draftG.html


* Re: [RFC PATCH 01/24] ARM: GICv3 ITS: parse and store ITS subnodes from hardware DT
  2016-11-01 15:13   ` Julien Grall
@ 2016-11-14 17:35     ` Andre Przywara
  2016-11-23 15:39       ` Julien Grall
  0 siblings, 1 reply; 144+ messages in thread
From: Andre Przywara @ 2016-11-14 17:35 UTC (permalink / raw)
  To: Julien Grall, Stefano Stabellini; +Cc: xen-devel

Hi,

On 01/11/16 15:13, Julien Grall wrote:
> Hi Andre,
> 
> On 28/09/2016 19:24, Andre Przywara wrote:
>> Parse the DT GIC subnodes to find every ITS MSI controller the hardware
>> offers. Store that information in a list to both propagate all of them
>> later to Dom0, but also to be able to iterate over all ITSes.
>> This introduces an ITS Kconfig option.
>>
>> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
>> ---
>>  xen/arch/arm/Kconfig          |  5 ++++
>>  xen/arch/arm/Makefile         |  1 +
>>  xen/arch/arm/gic-its.c        | 67 +++++++++++++++++++++++++++++++++++++++
>>  xen/arch/arm/gic-v3.c         |  6 ++++
>>  xen/include/asm-arm/gic-its.h | 57 ++++++++++++++++++++++++++++++++++++
>>  5 files changed, 136 insertions(+)
>>  create mode 100644 xen/arch/arm/gic-its.c
>>  create mode 100644 xen/include/asm-arm/gic-its.h
>>
>> diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
>> index 797c91f..9fe3b8e 100644
>> --- a/xen/arch/arm/Kconfig
>> +++ b/xen/arch/arm/Kconfig
>> @@ -45,6 +45,11 @@ config ACPI
>>  config HAS_GICV3
>>      bool
>>
>> +config HAS_ITS
>> +        bool "GICv3 ITS MSI controller support"
>> +        depends on ARM_64
> 
> HAS_GICV3 will only be selected for 64-bit. It would need some rework to
> be supported on 32-bit. So I would drop this dependency.

OK, makes sense.

>> +        depends on HAS_GICV3
>> +
> 
> I am not convinced that we should (currently) let the user select the
> ITS support. It increases the test coverage (we have to test with and
> without). Do we expect people using GICv3 without ITS?

My concern was more that if it breaks something, people can just disable
it. But I have to go through the patches again to see if disabling it
really buys us anything (because thinking about it, I don't think it does).

So given the test coverage argument I think we should at least enable it
by default for ARM64. Is there some "expert options" group somewhere
where we could insert the option to turn it off?
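One possible shape for that option, as a sketch only (the exact spelling of the EXPERT condition has varied between Xen versions): keep the feature default-on for GICv3 builds and make the prompt visible only to expert configurations.

```kconfig
config HAS_ITS
        bool "GICv3 ITS MSI controller support" if EXPERT
        depends on HAS_GICV3
        default y
```

That keeps the normal test matrix to the ITS-enabled build while still leaving an escape hatch if something breaks.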


Cheers,
Andre.


* Re: [RFC PATCH 02/24] ARM: GICv3: allocate LPI pending and property table
  2016-11-01 17:22   ` Julien Grall
@ 2016-11-15 11:32     ` Andre Przywara
  2016-11-23 15:58       ` Julien Grall
  0 siblings, 1 reply; 144+ messages in thread
From: Andre Przywara @ 2016-11-15 11:32 UTC (permalink / raw)
  To: Julien Grall, Stefano Stabellini; +Cc: xen-devel

Hi Julien,

On 01/11/16 17:22, Julien Grall wrote:
> Hi Andre,
> 
> On 28/09/2016 19:24, Andre Przywara wrote:
>> The ARM GICv3 ITS provides a new kind of interrupt called LPIs.
>> The pending bits and the configuration data (priority, enable bits) for
>> those LPIs are stored in tables in normal memory, which software has to
>> provide to the hardware.
>> Allocate the required memory, initialize it and hand it over to each
>> ITS. We limit the number of LPIs we use with a compile time constant to
>> avoid wasting memory.
>>
>> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
>> ---
>>  xen/arch/arm/Kconfig              |  6 ++++
>>  xen/arch/arm/efi/efi-boot.h       |  1 -
>>  xen/arch/arm/gic-its.c            | 76 +++++++++++++++++++++++++++++++++
>>  xen/arch/arm/gic-v3.c             | 27 ++++++++++++++
>>  xen/include/asm-arm/cache.h       |  4 +++
>>  xen/include/asm-arm/gic-its.h     | 22 +++++++++++-
>>  xen/include/asm-arm/gic_v3_defs.h | 48 ++++++++++++++++++++++++-
>>  7 files changed, 181 insertions(+), 3 deletions(-)
>>
>> diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
>> index 9fe3b8e..66e2bb8 100644
>> --- a/xen/arch/arm/Kconfig
>> +++ b/xen/arch/arm/Kconfig
>> @@ -50,6 +50,12 @@ config HAS_ITS
>>          depends on ARM_64
>>          depends on HAS_GICV3
>>
>> +config HOST_LPI_BITS
>> +        depends on HAS_ITS
>> +        int "Maximum bits for GICv3 host LPIs (14-32)"
>> +        range 14 32
>> +        default "20"
>> +
> 
> This would be better defined as a command line parameter. So the
> user does not need to rebuild Xen in order to increase the number of
> bits supported. It would also be useful to get a rationale behind the
> default number in the commit message.

Yeah, a command line option sounds useful, though I have to check
whether this changes compile-time computation into an actual runtime one.

The number is made-up, based on some reasoning on the possible memory
consumption and the number of LPIs provided. 8 MB and 1 million LPIs
sounded like a sweet spot.
If I got this correctly, Linux atm doesn't use more than 65536 LPIs.
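For reference, a sketch of the arithmetic behind those figures (slightly approximate, since the first 8192 interrupt IDs are not LPIs): the property table holds one byte per LPI, the pending table one bit per LPI per redistributor, and the cover letter describes one uint64_t of Xen bookkeeping per host LPI.

```c
#include <stdint.h>

/* One byte of priority/enable configuration per possible LPI. */
static uint64_t prop_table_bytes(unsigned int lpi_bits)
{
    return 1ULL << lpi_bits;
}

/* One pending bit per possible LPI, per redistributor. */
static uint64_t pend_table_bytes(unsigned int lpi_bits)
{
    return 1ULL << (lpi_bits - 3);
}

/* One uint64_t of bookkeeping per host LPI (per the cover letter). */
static uint64_t host_lpi_map_bytes(unsigned int lpi_bits)
{
    return (uint64_t)sizeof(uint64_t) << lpi_bits;
}
```

With the default of 20 bits this gives a 1MB property table, a 128KB pending table per redistributor, and 8MB of per-LPI bookkeeping, matching the "8 MB and 1 million LPIs" figure.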

>>  config ALTERNATIVE
>>      bool
>>
>> diff --git a/xen/arch/arm/efi/efi-boot.h b/xen/arch/arm/efi/efi-boot.h
>> index 045d6ce..dc64aec 100644
>> --- a/xen/arch/arm/efi/efi-boot.h
>> +++ b/xen/arch/arm/efi/efi-boot.h
>> @@ -10,7 +10,6 @@
>>  #include "efi-dom0.h"
>>
>>  void noreturn efi_xen_start(void *fdt_ptr, uint32_t fdt_size);
>> -void __flush_dcache_area(const void *vaddr, unsigned long size);
>>
>>  #define DEVICE_TREE_GUID \
>>  {0xb1b621d5, 0xf19c, 0x41a5, {0x83, 0x0b, 0xd9, 0x15, 0x2c, 0x69,
>> 0xaa, 0xe0}}
>> diff --git a/xen/arch/arm/gic-its.c b/xen/arch/arm/gic-its.c
>> index 0f42a77..b52dff3 100644
>> --- a/xen/arch/arm/gic-its.c
>> +++ b/xen/arch/arm/gic-its.c
> 
> Please rename this file gic-v3-its to make clear ITS is only GICv3.

OK.

>> @@ -20,10 +20,86 @@
>>  #include <xen/lib.h>
>>  #include <xen/device_tree.h>
>>  #include <xen/libfdt/libfdt.h>
>> +#include <asm/p2m.h>
> 
> Why did you include p2m.h? This header contains stage-2 page table
> functions but I don't see any use of them within this patch.

This may be a rebase artifact introduced when shuffling patches around.

>>  #include <asm/gic.h>
>>  #include <asm/gic_v3_defs.h>
>>  #include <asm/gic-its.h>
>>
>> +/* Global state */
>> +static struct {
>> +    uint8_t *lpi_property;
>> +    int host_lpi_bits;
> 
> Please use unsigned int.
> 
>> +} lpi_data;
>> +
>> +/* Pending table for each redistributor */
>> +static DEFINE_PER_CPU(void *, pending_table);
>> +
>> +#define MAX_HOST_LPI_BITS                                                \
>> +        min_t(unsigned int, lpi_data.host_lpi_bits, CONFIG_HOST_LPI_BITS)
> 
> Why don't you directly initialize host_lpi_bits to the correct value?
> This would avoid to compute the min every time you use MAX_HOST_LPI_BITS
> and save few instructions.

Mmmh, probably because of forest and trees and stuff ;-)
Looks indeed like being pointless.

>> +#define MAX_HOST_LPIS   (BIT(MAX_HOST_LPI_BITS) - 8192)
> 
> I know that we don't support ITS for 32-bits, but I would rather avoid
> to use BIT as this macro is working on unsigned long. I would prefer if
> you introduce BIT_ULL or open-code.

Sure.

>> +
>> +uint64_t gicv3_lpi_allocate_pendtable(void)
>> +{
>> +    uint64_t reg, attr;
>> +    void *pendtable;
>> +
>> +    attr  = GIC_BASER_CACHE_RaWaWb << GICR_PENDBASER_INNER_CACHEABILITY_SHIFT;
>> +    attr |= GIC_BASER_CACHE_SameAsInner << GICR_PENDBASER_OUTER_CACHEABILITY_SHIFT;
>> +    attr |= GIC_BASER_InnerShareable << GICR_PENDBASER_SHAREABILITY_SHIFT;
> 
> From the spec (8.11.18 in ARM IHI 0069C) the cacheability and
> shareability could be fixed (though this is marked as deprecated). Should
> we check whether the values stick?

Yeah, I guess we have to. I was just not sure what to do when it
doesn't. By design we don't touch the pending table after handing it
over, so technically we wouldn't even need to map it. But we have
to make sure nobody steps on it and that it stays reserved.

> Also the variable attr sounds pointless as you will directly assign the
> value to reg with no more computation.

Yes, this is probably copy & pasted from a place where it needed a loop.
I just kept it because of readability. Can surely clean this up.

>> +
>> +    /*
>> +     * The pending table holds one bit per LPI, so we need three bits less
>> +     * than the number of LPI_BITs. But the alignment requirement from the
>> +     * ITS is 64K, so make order at least 16 (-12).
>> +     */
>> +    pendtable = alloc_xenheap_pages(MAX(lpi_data.host_lpi_bits - 3, 16) - 12, 0);
>> +    if ( !pendtable )
>> +        return 0;
>> +
>> +    memset(pendtable, 0, BIT(lpi_data.host_lpi_bits - 3));
> 
> Same remark for BIT here.
> 
>> +    this_cpu(pending_table) = pendtable;
>> +
>> +    reg  = attr | GICR_PENDBASER_PTZ;
>> +    reg |= virt_to_maddr(pendtable) & GENMASK(51, 16);
> 
> I don't think the mask is useful and would need to be changed if the
> physical address bits increased as it was done in ARMv8.2.

Mmmh, not so sure we can extend the mask to cover the region that is
RES0 at the moment. We need some mask anyway (to not clobber the upper
bits), so I figured we should just use what the spec says.

>> +
>> +    return reg;
>> +}
>> +
>> +uint64_t gicv3_lpi_get_proptable()
>> +{
>> +    uint64_t attr;
>> +    static uint64_t reg = 0;
>> +
>> +    /* The property table is shared across all redistributors. */
>> +    if ( reg )
>> +        return reg;
>> +
>> +    attr  = GIC_BASER_CACHE_RaWaWb << GICR_PENDBASER_INNER_CACHEABILITY_SHIFT;
>> +    attr |= GIC_BASER_CACHE_SameAsInner << GICR_PENDBASER_OUTER_CACHEABILITY_SHIFT;
>> +    attr |= GIC_BASER_InnerShareable << GICR_PENDBASER_SHAREABILITY_SHIFT;
> 
> Same question for the cacheability and shareability.
> 
> Also the variable attr sounds pointless as you will directly assign the
> value to reg with no more computation.
> 
>> +
>> +    lpi_data.lpi_property = alloc_xenheap_pages(MAX_HOST_LPI_BITS - 12, 0);
>> +    if ( !lpi_data.lpi_property )
>> +        return 0;
>> +
>> +    memset(lpi_data.lpi_property, GIC_PRI_IRQ | LPI_PROP_RES1, MAX_HOST_LPIS);
>> +    __flush_dcache_area(lpi_data.lpi_property, MAX_HOST_LPIS);
>> +
>> +    reg  = attr | ((MAX_HOST_LPI_BITS - 1) << 0);
>> +    reg |= virt_to_maddr(lpi_data.lpi_property) & GENMASK(51, 12);
> 
> Same remark for the mask here.
> 
>> +
>> +    return reg;
>> +}
>> +
>> +int gicv3_lpi_init_host_lpis(int lpi_bits)
> 
> Please use unsigned int for lpi_bits.
> 
> Also this function should probably be in the section __init.
> 
>> +{
>> +    lpi_data.host_lpi_bits = lpi_bits;
>> +
>> +    printk("GICv3: using at most %ld LPIs on the host.\n", MAX_HOST_LPIS);
> 
> %lu.
> 
>> +
>> +    return 0;
>> +}
>> +
>>  void gicv3_its_dt_init(const struct dt_device_node *node)
>>  {
>>      const struct dt_device_node *its = NULL;
>> diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
>> index 238da84..2534aa5 100644
>> --- a/xen/arch/arm/gic-v3.c
>> +++ b/xen/arch/arm/gic-v3.c
>> @@ -546,6 +546,9 @@ static void __init gicv3_dist_init(void)
>>      type = readl_relaxed(GICD + GICD_TYPER);
>>      nr_lines = 32 * ((type & GICD_TYPE_LINES) + 1);
>>
>> +    if ( type & GICD_TYPE_LPIS )
>> +        gicv3_lpi_init_host_lpis(((type >> GICD_TYPE_ID_BITS_SHIFT) & 0x1f) + 1);
>> +
>>      printk("GICv3: %d lines, (IID %8.8x).\n",
>>             nr_lines, readl_relaxed(GICD + GICD_IIDR));
>>
>> @@ -615,6 +618,26 @@ static int gicv3_enable_redist(void)
>>
>>      return 0;
>>  }
> 
> Missing blank line here.
> 
>> +static void gicv3_rdist_init_lpis(void __iomem * rdist_base)
>> +{
>> +    uint32_t reg;
>> +    uint64_t table_reg;
>> +
>> +    if ( list_empty(&host_its_list) )
>> +        return;
>> +
>> +    /* Make sure LPIs are disabled before setting up the BASERs. */
>> +    reg = readl_relaxed(rdist_base + GICR_CTLR);
>> +    writel_relaxed(reg & ~GICR_CTLR_ENABLE_LPIS, rdist_base + GICR_CTLR);
>> +
>> +    table_reg = gicv3_lpi_allocate_pendtable();
>> +    if ( table_reg )
>> +        writeq_relaxed(table_reg, rdist_base + GICR_PENDBASER);
> 
> Is it fine to continue silently if gicv3_lpi_allocate_pendtable has failed?

Probably not, I have fixed this already as Stefano pointed out the same.

>> +
>> +    table_reg = gicv3_lpi_get_proptable();
>> +    if ( table_reg )
>> +        writeq_relaxed(table_reg, rdist_base + GICR_PROPBASER);
> 
> Ditto.
> 
> 
>> +}
>>
>>  static int __init gicv3_populate_rdist(void)
>>  {
>> @@ -658,6 +681,10 @@ static int __init gicv3_populate_rdist(void)
>>              if ( (typer >> 32) == aff )
>>              {
>>                  this_cpu(rbase) = ptr;
>> +
>> +                if ( typer & GICR_TYPER_PLPIS )
>> +                    gicv3_rdist_init_lpis(ptr);
>> +
>> +                printk("GICv3: CPU%d: Found redistributor in region %d @%p\n",
>>                          smp_processor_id(), i, ptr);
>>                  return 0;
>> diff --git a/xen/include/asm-arm/cache.h b/xen/include/asm-arm/cache.h
>> index 2de6564..af96eee 100644
>> --- a/xen/include/asm-arm/cache.h
>> +++ b/xen/include/asm-arm/cache.h
>> @@ -7,6 +7,10 @@
>>  #define L1_CACHE_SHIFT  (CONFIG_ARM_L1_CACHE_SHIFT)
>>  #define L1_CACHE_BYTES  (1 << L1_CACHE_SHIFT)
>>
>> +#ifndef __ASSEMBLY__
>> +void __flush_dcache_area(const void *vaddr, unsigned long size);
>> +#endif
>> +
> 
> Please move this change in a separate patch.
> 
>>  #define __read_mostly __section(".data.read_mostly")
>>
>>  #endif
>> diff --git a/xen/include/asm-arm/gic-its.h
>> b/xen/include/asm-arm/gic-its.h
>> index 2f5c51c..48c6c78 100644
>> --- a/xen/include/asm-arm/gic-its.h
>> +++ b/xen/include/asm-arm/gic-its.h
>> @@ -36,12 +36,32 @@ extern struct list_head host_its_list;
>>  /* Parse the host DT and pick up all host ITSes. */
>>  void gicv3_its_dt_init(const struct dt_device_node *node);
>>
>> +/* Allocate and initialize tables for each host redistributor.
>> + * Returns the respective {PROP,PEND}BASER register value.
>> + */
>> +uint64_t gicv3_lpi_get_proptable(void);
>> +uint64_t gicv3_lpi_allocate_pendtable(void);
>> +
>> +/* Initialize the host structures for LPIs. */
>> +int gicv3_lpi_init_host_lpis(int nr_lpis);
>> +
>>  #else
>>
>>  static inline void gicv3_its_dt_init(const struct dt_device_node *node)
>>  {
>>  }
>> -
> 
> Please add a newline here.
> 
>> +static inline uint64_t gicv3_lpi_get_proptable(void)
>> +{
>> +    return 0;
>> +}
> 
> Ditto
> 
>> +static inline uint64_t gicv3_lpi_allocate_pendtable(void)
>> +{
>> +    return 0;
>> +}
> 
> Ditto
> 
>> +static inline int gicv3_lpi_init_host_lpis(int nr_lpis)
>> +{
>> +    return 0;
>> +}
> 
> Ditto
> 
>>  #endif /* CONFIG_HAS_ITS */
>>
>>  #endif /* __ASSEMBLY__ */
>> diff --git a/xen/include/asm-arm/gic_v3_defs.h
>> b/xen/include/asm-arm/gic_v3_defs.h
>> index 6bd25a5..da5fb77 100644
>> --- a/xen/include/asm-arm/gic_v3_defs.h
>> +++ b/xen/include/asm-arm/gic_v3_defs.h
>> @@ -44,7 +44,8 @@
>>  #define GICC_SRE_EL2_ENEL1           (1UL << 3)
>>
>>  /* Additional bits in GICD_TYPER defined by GICv3 */
>> -#define GICD_TYPE_ID_BITS_SHIFT 19
>> +#define GICD_TYPE_ID_BITS_SHIFT      19
>> +#define GICD_TYPE_LPIS               (1U << 17)
> 
> I was about to say that this should be named GICD_TYPER... but it looks
> like we already define and use GICD_TYPE_ID_BITS_SHIFT. So it is up to
> you if you rename it to get the correct register name.

Yeah, I was unsure about this as well. My hunch is we should avoid the
churn and keep existing names around. Experience shows that those simple
renames tend to introduce nasty rebase issues, especially with a
long-standing series like this.

>>
>>  #define GICD_CTLR_RWP                (1UL << 31)
>>  #define GICD_CTLR_ARE_NS             (1U << 4)
>> @@ -95,12 +96,57 @@
>>  #define GICR_IGRPMODR0               (0x0D00)
>>  #define GICR_NSACR                   (0x0E00)
>>
>> +#define GICR_CTLR_ENABLE_LPIS        (1U << 0)
> 
> Please add a new line here to separate definition for GICR_CTLR and
> GICR_TYPER.
> 
>>  #define GICR_TYPER_PLPIS             (1U << 0)
>>  #define GICR_TYPER_VLPIS             (1U << 1)
>>  #define GICR_TYPER_LAST              (1U << 4)
>>
>> +#define GIC_BASER_CACHE_nCnB         0ULL
>> +#define GIC_BASER_CACHE_SameAsInner  0ULL
> 
> I think this would require some description in the code as it is not
> clear whether nCnB applies to Outer or Inner. From my understanding it
> is only the latter.
> 
>> +#define GIC_BASER_CACHE_nC           1ULL
>> +#define GIC_BASER_CACHE_RaWt         2ULL
>> +#define GIC_BASER_CACHE_RaWb         3ULL
>> +#define GIC_BASER_CACHE_WaWt         4ULL
>> +#define GIC_BASER_CACHE_WaWb         5ULL
>> +#define GIC_BASER_CACHE_RaWaWt       6ULL
>> +#define GIC_BASER_CACHE_RaWaWb       7ULL
>> +#define GIC_BASER_CACHE_MASK         7ULL
> 
> New line here please and maybe a comment to say this is shareability
> definition.
> 
>> +#define GIC_BASER_NonShareable       0ULL
>> +#define GIC_BASER_InnerShareable     1ULL
>> +#define GIC_BASER_OuterShareable     2ULL
>> +
>> +#define GICR_PROPBASER_SHAREABILITY_SHIFT               10
>> +#define GICR_PROPBASER_INNER_CACHEABILITY_SHIFT         7
> 
>> +#define GICR_PROPBASER_OUTER_CACHEABILITY_SHIFT         56
>> +#define GICR_PROPBASER_SHAREABILITY_MASK                     \
>> +        (3UL << GICR_PROPBASER_SHAREABILITY_SHIFT)
> 
> It might be better to define GIC_BASER_SHAREABILITY_MASK rather than
> open-coding 3UL. Also technically 3UL should be 3ULL.

I really think it's hard to read already, and naming this also breaks the
80 character limit, I believe (I think I tried it).
So while I appreciate the habit of naming magic constants, I wonder if
this is really going overboard here. After all we define a mask here, so
_somewhere_ these actual numbers need to end up, don't they?

>> +#define GICR_PROPBASER_INNER_CACHEABILITY_MASK               \
>> +        (7UL << GICR_PROPBASER_INNER_CACHEABILITY_SHIFT)
> 
> Same remark here.
> 
>> +#define GICR_PROPBASER_OUTER_CACHEABILITY_MASK               \
>> +        (7UL << GICR_PROPBASER_OUTER_CACHEABILITY_SHIFT)
> 
> Ditto
> 
>> +#define PROPBASER_RES0_MASK                                  \
> 
> I would probably rename this field GICR_PROPBASER_RES0_MASK.
> 
>> +        (GENMASK(63, 59) | GENMASK(55, 52) | GENMASK(6, 5))
>> +
>> +#define GICR_PENDBASER_SHAREABILITY_SHIFT               10
>> +#define GICR_PENDBASER_INNER_CACHEABILITY_SHIFT         7
>> +#define GICR_PENDBASER_OUTER_CACHEABILITY_SHIFT         56
>> +#define GICR_PENDBASER_SHAREABILITY_MASK                     \
>> +    (3UL << GICR_PENDBASER_SHAREABILITY_SHIFT)
> 
> See my remark above.
> 
>> +#define GICR_PENDBASER_INNER_CACHEABILITY_MASK               \
>> +    (7UL << GICR_PENDBASER_INNER_CACHEABILITY_SHIFT)
> 
> Ditto
> 
>> +#define GICR_PENDBASER_OUTER_CACHEABILITY_MASK               \
>> +        (7UL << GICR_PENDBASER_OUTER_CACHEABILITY_SHIFT)
>> +#define GICR_PENDBASER_PTZ                              BIT(62)
> 
> Please don't use BIT but either 1ULL << 62 or introduce BIT_ULL.
> 
>> +#define PENDBASER_RES0_MASK                                  \
> 
> GICR_PENDBASER_RES0_MASK
> 
>> +        (BIT(63) | GENMASK(61, 59) | GENMASK(55, 52) |       \
>> +         GENMASK(15, 12) | GENMASK(6, 0))
>> +
>>  #define DEFAULT_PMR_VALUE            0xff
>>
>> +#define LPI_PROP_DEFAULT_PRIO        0xa0
> 
> You define LPI_PROP_DEFAULT_PRIO but never use it within this series.
> In any case, it would be better to keep using GIC_PRI_IRQ (as you did)
> and make LPI_PROP_DEFAULT_PRIO an alias of GIC_PRI_IRQ to avoid spreading
> the priority everywhere (for now they are all defined in gic.h).

Sure.
(And basically "yes, will fix" to anything I haven't replied to
explicitly above.)

Cheers,
Andre.
> 
>> +#define LPI_PROP_RES1                (1 << 1)
>> +#define LPI_PROP_ENABLED             (1 << 0)
>> +
>>  #define GICH_VMCR_EOI                (1 << 9)
>>  #define GICH_VMCR_VENG1              (1 << 1)
>>
>>
> 
> Regards,
> 


* Re: [RFC PATCH 21/24] ARM: vITS: handle INVALL command
  2016-11-11 20:31               ` Stefano Stabellini
@ 2016-11-18 18:39                 ` Stefano Stabellini
  2016-11-25 16:10                   ` Julien Grall
  0 siblings, 1 reply; 144+ messages in thread
From: Stefano Stabellini @ 2016-11-18 18:39 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Andre Przywara, Julien Grall, Steve Capper, Vijay Kilari, xen-devel

On Fri, 11 Nov 2016, Stefano Stabellini wrote:
> On Fri, 11 Nov 2016, Julien Grall wrote:
> > Hi Stefano,
> > 
> > On 10/11/16 20:42, Stefano Stabellini wrote:
> > > On Thu, 10 Nov 2016, Julien Grall wrote:
> > > > On 10/11/16 00:21, Stefano Stabellini wrote:
> > > > > On Fri, 4 Nov 2016, Andre Przywara wrote:
> > > > > > On 24/10/16 16:32, Vijay Kilari wrote:
> > > > > > > On Wed, Sep 28, 2016 at 11:54 PM, Andre Przywara
> > > > > > >         AFAIK, the initial design is to use a tasklet to update the
> > > > > > > property table as it consumes a lot of time to update the table.
> > > > > > 
> > > > > > This is a possible, but premature optimization.
> > > > > > Linux (at the moment, at least) only calls INVALL _once_, just after
> > > > > > initialising the collections. And at this point no LPI is mapped, so
> > > > > > the
> > > > > > whole routine does basically nothing - and that quite fast.
> > > > > > We can later have any kind of fancy algorithm if there is a need for.
> > > > > 
> > > > > I understand, but as-is it's so expensive that it could be a DOS vector.
> > > > > Also other OSes could issue INVALL much more often than Linux.
> > > > > 
> > > > > Considering that we might support device assigment with ITS soon, I
> > > > > think it might be best to parse per-domain virtual tables rather than
> > > > > the full list of physical LPIs, which theoretically could be much
> > > > > larger. Or alternatively we need to think about adding another field to
> > > > > lpi_data, to link together all lpis assigned to the same domain, but
> > > > > that would cost even more memory. Or we could rate-limit the INVALL
> > > > > calls to one every few seconds or something. Or all of the above :-)
> > > > 
> > > > It is not necessary for an ITS implementation to wait until an INVALL/INV
> > > > command is issued to take into account the change of the LPI configuration
> > > > tables (aka property table in this thread).
> > > > 
> > > > So how about trapping the property table? We would still have to go
> > > > through
> > > > the property table the first time (i.e when writing into the
> > > > GICR_PROPBASER),
> > > > but INVALL would be a nop.
> > > > 
> > > > The idea would be unmapping the region when GICR_PROPBASER is written. So
> > > > any
> > > > read/write access would be trapped. For a write access, Xen will update
> > > > the
> > > > LPIs internal data structures and write the value in the guest page
> > > > unmapped.
> > > > If we don't want to have an overhead for the read access, we could just
> > > > write-protect the page in stage-2 page table. So only write access would
> > > > be
> > > > trapped.
> > > > 
> > > > Going further, for the ITS, Xen is using the guest memory to store the ITS
> > > > information. This means Xen has to validate the information at every
> > > > access.
> > > > So how about restricting the access in stage-2 page table? That would
> > > > remove
> > > > the overhead of validating data.
> > > > 
> > > > Any thoughts?
> > > 
> > > It is a promising idea. Let me expand on this.
> > > 
> > > I agree that on INVALL if we need to do anything, we should go through
> > > the virtual property table rather than the full list of host lpis.
> > 
> > I agree on that.
> > 
> > > 
> > > Once we agree on that, the two options we have are:
> > 
> > I believe we had a similar discussion when Vijay worked on the vITS (see [1]).
> > I would have hoped that this new proposal took into account the constraint
> > mentioned back then.
> > 
> > > 
> > > 1) We let the guest write anything to the table, then we do a full
> > > validation of the table on INVALL. We also do a validation of the table
> > > entries used as parameters for any other commands.
> > > 
> > > 2) We map the table read-only, then do a validation of every guest
> > > write. INVALL becomes a NOP and parameters validation for many commands
> > > could be removed or at least reduced.
> > > 
> > > Conceptually the two options should both lead to exactly the same
> > > result. Therefore I think the decision should be made purely on
> > > performance: which one is faster?  If it is true that INVALL is only
> > > typically called once I suspect that 1) is faster, but I would like to
> > > see some simple benchmarks, such as the time that it takes to configure
> > > the ITS from scratch with the two approaches.
> > 
> > The problem is not which one is faster but which one will not take down the
> > hypervisor.
> > 
> > The guest is allowed to create a command queue as big as 1MB, and a command
> > uses 32 bytes, so the command queue can fit up to 32640 commands.
> > 
> > Now imagine a malicious guest filling up the command queue with INVALL and
> > then notifying the host (via GITS_CWRITER). Based on patch #5, all those
> > commands will be handled in one go. So you have to multiply the time of one
> > command by 32640.
> > 
> > Given that the hypervisor is not preemptible, it likely means a DOS.
> 
> I think it can be made to work safely using a rate-limiting technique.
> Such as: Xen would emulate an INVALL for a given domain only once every one
> or two seconds and no more often than that. x86 has something like that under
> xen/arch/x86/irq.c, see irq_ratelimit.
> 
> But to be clear, I am not saying that this is necessarily the best way
> to do it. I would like to see some benchmarks first.
> 
> 
> > A similar problem would happen if a vITS command is translated into an ITS
> > command (see the implementation of INVALL). Multiple malicious guests could
> > slow down the other guests by filling up the host command queue. Worse, a
> > command from a normal guest could be discarded because the host ITS command
> > queue is full (see its_send_command in gic-its.c).
>  
> Looking at the patches, nothing checks for discarded physical ITS
> commands. Not good.
> 
> 
> > That's why the approach we had in the previous series was "host ITS commands
> > should be limited when emulating guest ITS commands". From my recollection,
> > in that series the host and guest LPIs were fully separated (enabling a
> > guest LPI did not enable a host LPI).
> 
> I am interested in reading what Ian suggested to do when the physical
> ITS queue is full, but I cannot find anything specific about it in the
> doc.
> 
> Do you have a suggestion for this? 
> 
> The only things that come to mind right now are:
> 
> 1) check if the ITS queue is full and busy loop until it is not (spin_lock style)
> 2) check if the ITS queue is full and sleep until it is not (mutex style)

Another, probably better idea, is to map all pLPIs of a device when the
device is assigned to a guest (including Dom0). This is what was written
in Ian's design doc. The advantage of this approach is that Xen doesn't
need to take any actions on the physical ITS command queue when the
guest issues virtual ITS commands, therefore completely solving this
problem at the root. (Although I am not sure about enable/disable
commands: could we avoid issuing enable/disable on pLPIs?) It also helps
toward solving the INVALL potential DOS issue, because it significantly
reduces the computation needed when an INVALL is issued by the guest.

On the other hand, this approach has the potential of consuming much more
memory to map all the possible pLPIs that a device could use up to the
theoretical max. Of course that is not good either. But fortunately for
PCI devices we know how many events a device can generate. Also we
should be able to get that info on device tree for other devices. So I
suggest Xen only maps as many pLPIs as events the device can generate,
when the device is assigned to the guest. This way there would be no
wasted memory.

Does it make sense? Do you think it could work?
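For illustration only, the sizing idea could be sketched as below. None of this is actual Xen code: the helper names are invented, and `nr_events` is assumed to come from the PCI MSI-X capability (or a device-tree property). The point is that the number of events a device can generate bounds both the pLPIs to map and the ITT memory to allocate for the later MAPD command.

```c
#include <assert.h>
#include <stdint.h>

/*
 * MAPD takes the number of EventID bits, and the ITT must hold
 * 2^bits entries. Mapping only what the device needs keeps the
 * table (and the number of mapped pLPIs) small.
 */
static unsigned event_id_bits(uint32_t nr_events)
{
    unsigned bits = 0;

    while ( (1u << bits) < nr_events )
        bits++;

    return bits ? bits : 1;     /* the ITS requires at least one bit */
}

/*
 * Bytes of ITT memory for one device, assuming 'entry_size' bytes per
 * entry as reported by GITS_TYPER.ITT_entry_size.
 */
static uint32_t itt_bytes(uint32_t nr_events, uint32_t entry_size)
{
    return (1u << event_id_bits(nr_events)) * entry_size;
}
```

So a device advertising 32 MSI-X vectors would get an ITT sized for exactly 32 events, rather than for the theoretical LPI maximum.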

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [RFC PATCH 01/24] ARM: GICv3 ITS: parse and store ITS subnodes from hardware DT
  2016-11-14 17:35     ` Andre Przywara
@ 2016-11-23 15:39       ` Julien Grall
  0 siblings, 0 replies; 144+ messages in thread
From: Julien Grall @ 2016-11-23 15:39 UTC (permalink / raw)
  To: Andre Przywara, Stefano Stabellini; +Cc: xen-devel



On 14/11/16 17:35, Andre Przywara wrote:
> Hi,

Hi Andre,

> On 01/11/16 15:13, Julien Grall wrote:
>> On 28/09/2016 19:24, Andre Przywara wrote:
>>> Parse the DT GIC subnodes to find every ITS MSI controller the hardware
>>> offers. Store that information in a list to both propagate all of them
>>> later to Dom0, but also to be able to iterate over all ITSes.
>>> This introduces an ITS Kconfig option.
>>>
>>> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
>>> ---
>>>  xen/arch/arm/Kconfig          |  5 ++++
>>>  xen/arch/arm/Makefile         |  1 +
>>>  xen/arch/arm/gic-its.c        | 67
>>> +++++++++++++++++++++++++++++++++++++++++++
>>>  xen/arch/arm/gic-v3.c         |  6 ++++
>>>  xen/include/asm-arm/gic-its.h | 57 ++++++++++++++++++++++++++++++++++++
>>>  5 files changed, 136 insertions(+)
>>>  create mode 100644 xen/arch/arm/gic-its.c
>>>  create mode 100644 xen/include/asm-arm/gic-its.h
>>>
>>> diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
>>> index 797c91f..9fe3b8e 100644
>>> --- a/xen/arch/arm/Kconfig
>>> +++ b/xen/arch/arm/Kconfig
>>> @@ -45,6 +45,11 @@ config ACPI
>>>  config HAS_GICV3
>>>      bool
>>>
>>> +config HAS_ITS
>>> +        bool "GICv3 ITS MSI controller support"
>>> +        depends on ARM_64
>>
>> HAS_GICV3 will only be selected for 64-bit. It would need some rework to
>> be supported on 32-bit. So I would drop this dependency.
>
> OK, makes sense.
>
>>> +        depends on HAS_GICV3
>>> +
>>
>> I am not convinced that we should (currently) let the user selecting the
>> ITS support. It increases the test coverage (we have to test with and
>> without). Do we expect people using GICv3 without ITS?
>
> My concern was more that if it breaks something, people can just disable
> it. But I have to go through the patches again to see if disabling it
> really brings us something (because thinking about it I don't think so).
>
> So given the test coverage argument I think we should at least enable it
> by default for ARM64. Is there some "expert options" group somewhere
> where we could insert the option to turn it off?

You can use "if EXPERT=y", see how we handle ACPI for instance.

Thinking a bit more about this, I would like to see the ITS as a technical 
preview at the beginning. This would give us a bit of time to stabilize 
the code. Any opinions?
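For illustration, the EXPERT gating mentioned above could take roughly this shape (a sketch only; the exact prompt text and syntax would need checking against how Xen's Kconfig handles ACPI):

```kconfig
config HAS_ITS
        bool "GICv3 ITS MSI controller support" if EXPERT = "y"
        default y
        depends on HAS_GICV3
```

This keeps the option enabled by default on GICv3 systems while still letting expert users turn it off.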

Regards,

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [RFC PATCH 02/24] ARM: GICv3: allocate LPI pending and property table
  2016-11-15 11:32     ` Andre Przywara
@ 2016-11-23 15:58       ` Julien Grall
  0 siblings, 0 replies; 144+ messages in thread
From: Julien Grall @ 2016-11-23 15:58 UTC (permalink / raw)
  To: Andre Przywara, Stefano Stabellini; +Cc: xen-devel

Hi Andre,

On 15/11/16 11:32, Andre Przywara wrote:
> On 01/11/16 17:22, Julien Grall wrote:
>> On 28/09/2016 19:24, Andre Przywara wrote:
>>> The ARM GICv3 ITS provides a new kind of interrupt called LPIs.
>>> The pending bits and the configuration data (priority, enable bits) for
>>> those LPIs are stored in tables in normal memory, which software has to
>>> provide to the hardware.
>>> Allocate the required memory, initialize it and hand it over to each
>>> ITS. We limit the number of LPIs we use with a compile time constant to
>>> avoid wasting memory.
>>>
>>> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
>>> ---
>>>  xen/arch/arm/Kconfig              |  6 ++++
>>>  xen/arch/arm/efi/efi-boot.h       |  1 -
>>>  xen/arch/arm/gic-its.c            | 76
>>> +++++++++++++++++++++++++++++++++++++++
>>>  xen/arch/arm/gic-v3.c             | 27 ++++++++++++++
>>>  xen/include/asm-arm/cache.h       |  4 +++
>>>  xen/include/asm-arm/gic-its.h     | 22 +++++++++++-
>>>  xen/include/asm-arm/gic_v3_defs.h | 48 ++++++++++++++++++++++++-
>>>  7 files changed, 181 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
>>> index 9fe3b8e..66e2bb8 100644
>>> --- a/xen/arch/arm/Kconfig
>>> +++ b/xen/arch/arm/Kconfig
>>> @@ -50,6 +50,12 @@ config HAS_ITS
>>>          depends on ARM_64
>>>          depends on HAS_GICV3
>>>
>>> +config HOST_LPI_BITS
>>> +        depends on HAS_ITS
>>> +        int "Maximum bits for GICv3 host LPIs (14-32)"
>>> +        range 14 32
>>> +        default "20"
>>> +
>>
>> This would be better to be defined as a parameter command line. So the
>> user does not need to rebuild Xen in order to increase the number of
>> bits supported. It would also be useful to get a rational behind the
>> default number in the commit message.
>
> Yeah, a command line option sounds useful, though I have to check
> whether this changes compile-time computation into actual runtime one.
>
> The number is made-up, based on some reasoning on the possible memory
> consumption and the number of LPIs provided. 8 MB and 1 million LPIs
> sounded like a sweet spot.
> If I got this correctly, Linux atm doesn't use more than 65536 LPIs.

You know my answer here ;). Linux is not the only guest OS supported on 
Xen, so we should avoid making assumptions about the best behavior based 
on it.

[...]

>>> +
>>> +    /*
>>> +     * The pending table holds one bit per LPI, so we need three bits
>>> less
>>> +     * than the number of LPI_BITs. But the alignment requirement
>>> from the
>>> +     * ITS is 64K, so make order at least 16 (-12).
>>> +     */
>>> +    pendtable = alloc_xenheap_pages(MAX(lpi_data.host_lpi_bits - 3,
>>> 16) - 12, 0);
>>> +    if ( !pendtable )
>>> +        return 0;
>>> +
>>> +    memset(pendtable, 0, BIT(lpi_data.host_lpi_bits - 3));
>>
>> Same remark for BIT here.
>>
>>> +    this_cpu(pending_table) = pendtable;
>>> +
>>> +    reg  = attr | GICR_PENDBASER_PTZ;
>>> +    reg |= virt_to_maddr(pendtable) & GENMASK(51, 16);
>>
>> I don't think the mask is useful and would need to be changed if the
>> physical address bits increased as it was done in ARMv8.2.
>
> Mmmh, I'm not so sure we can extend the mask to cover the region that is
> RES0 at the moment. We need some mask anyway (to not clobber the upper
> bits), so I figured we'd just use what the spec says.

The masked PA should always be equal to the PA. Otherwise we would 
program the wrong address into the register and who knows what can 
happen. So this mask is pointless.

[..]

>>>  #endif /* CONFIG_HAS_ITS */
>>>
>>>  #endif /* __ASSEMBLY__ */
>>> diff --git a/xen/include/asm-arm/gic_v3_defs.h
>>> b/xen/include/asm-arm/gic_v3_defs.h
>>> index 6bd25a5..da5fb77 100644
>>> --- a/xen/include/asm-arm/gic_v3_defs.h
>>> +++ b/xen/include/asm-arm/gic_v3_defs.h
>>> @@ -44,7 +44,8 @@
>>>  #define GICC_SRE_EL2_ENEL1           (1UL << 3)
>>>
>>>  /* Additional bits in GICD_TYPER defined by GICv3 */
>>> -#define GICD_TYPE_ID_BITS_SHIFT 19
>>> +#define GICD_TYPE_ID_BITS_SHIFT      19
>>> +#define GICD_TYPE_LPIS               (1U << 17)
>>
>> I was about to say that this should be named GICD_TYPER... but it looks
>> like we already define and use GICD_TYPE_ID_BITS_SHIFT. So it is up to
>> you whether to rename it to match the register name.
>
> Yeah, I was unsure about this as well. My hunch is we should avoid the
> churn and keep existing names around. Experience shows that those simple
> renames tend to introduce nasty rebase issues, especially with a
> long-standing series like this.

Clean-ups are always welcome ;). I am fine if it comes after.

Regards,

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [RFC PATCH 21/24] ARM: vITS: handle INVALL command
  2016-11-18 18:39                 ` Stefano Stabellini
@ 2016-11-25 16:10                   ` Julien Grall
  2016-12-01  1:19                     ` Stefano Stabellini
  0 siblings, 1 reply; 144+ messages in thread
From: Julien Grall @ 2016-11-25 16:10 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: Andre Przywara, Steve Capper, Vijay Kilari, xen-devel

Hi,

On 18/11/16 18:39, Stefano Stabellini wrote:
> On Fri, 11 Nov 2016, Stefano Stabellini wrote:
>> On Fri, 11 Nov 2016, Julien Grall wrote:
>>> On 10/11/16 20:42, Stefano Stabellini wrote:
>>> That's why the approach we had in the previous series was "host ITS
>>> commands should be limited when emulating guest ITS commands". From my
>>> recollection, in that series the host and guest LPIs were fully separated
>>> (enabling a guest LPI did not enable a host LPI).
>>
>> I am interested in reading what Ian suggested to do when the physical
>> ITS queue is full, but I cannot find anything specific about it in the
>> doc.
>>
>> Do you have a suggestion for this?
>>
>> The only things that come to mind right now are:
>>
>> 1) check if the ITS queue is full and busy loop until it is not (spin_lock style)
>> 2) check if the ITS queue is full and sleep until it is not (mutex style)
>
> Another, probably better idea, is to map all pLPIs of a device when the
> device is assigned to a guest (including Dom0). This is what was written
> in Ian's design doc. The advantage of this approach is that Xen doesn't
> need to take any actions on the physical ITS command queue when the
> guest issues virtual ITS commands, therefore completely solving this
> problem at the root. (Although I am not sure about enable/disable
> commands: could we avoid issuing enable/disable on pLPIs?)

In the previous design document (see [1]), the pLPIs are enabled when 
the device is assigned to the guest. This means that it is not necessary 
to send a command there. This also means we may receive a pLPI before 
the associated vLPI has been configured.

That said, given that LPIs are edge-triggered, there is no deactivate 
state (see 4.1 in ARM IHI 0069C). So as soon as the priority drop is 
done, the same LPIs could potentially be raised again. This could 
generate a storm.

The priority drop is necessary if we don't want to block the reception 
of interrupt for the current physical CPU.

What I am more concerned about is that this problem can also happen during 
normal operation (i.e. the pLPI is bound to a vLPI and the vLPI is 
enabled). For edge-triggered interrupts, there is no way to prevent them 
from firing again. Maybe it is time to introduce interrupt rate-limiting 
for ARM. Any opinions?

> It also helps
> toward solving the INVALL potential DOS issue, because it significantly
> reduces the computation needed when an INVALL is issued by the guest.
>
> On the other hand, this approach has the potential of consuming much more
> memory to map all the possible pLPIs that a device could use up to the
> theoretical max. Of course that is not good either. But fortunately for
> PCI devices we know how many events a device can generate. Also we
> should be able to get that info on device tree for other devices. So I
> suggest Xen only maps as many pLPIs as events the device can generate,
> when the device is assigned to the guest. This way there would be no
> wasted memory.
>
> Does it make sense? Do you think it could work?

Aside the point I raised above, I think the approach looks sensible.

Regards,

[1] https://xenbits.xen.org/people/ianc/vits/draftG.html#device-assignment

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [RFC PATCH 21/24] ARM: vITS: handle INVALL command
  2016-11-25 16:10                   ` Julien Grall
@ 2016-12-01  1:19                     ` Stefano Stabellini
  2016-12-02 16:18                       ` Andre Przywara
  0 siblings, 1 reply; 144+ messages in thread
From: Stefano Stabellini @ 2016-12-01  1:19 UTC (permalink / raw)
  To: Julien Grall
  Cc: Andre Przywara, Stefano Stabellini, Steve Capper, Vijay Kilari,
	xen-devel

On Fri, 25 Nov 2016, Julien Grall wrote:
> Hi,
> 
> On 18/11/16 18:39, Stefano Stabellini wrote:
> > On Fri, 11 Nov 2016, Stefano Stabellini wrote:
> > > On Fri, 11 Nov 2016, Julien Grall wrote:
> > > > On 10/11/16 20:42, Stefano Stabellini wrote:
> > > > That's why the approach we had in the previous series was "host ITS
> > > > commands should be limited when emulating guest ITS commands". From my
> > > > recollection, in that series the host and guest LPIs were fully
> > > > separated (enabling a guest LPI did not enable a host LPI).
> > > 
> > > I am interested in reading what Ian suggested to do when the physical
> > > ITS queue is full, but I cannot find anything specific about it in the
> > > doc.
> > > 
> > > Do you have a suggestion for this?
> > > 
> > > The only things that come to mind right now are:
> > > 
> > > 1) check if the ITS queue is full and busy loop until it is not (spin_lock
> > > style)
> > > 2) check if the ITS queue is full and sleep until it is not (mutex style)
> > 
> > Another, probably better idea, is to map all pLPIs of a device when the
> > device is assigned to a guest (including Dom0). This is what was written
> > in Ian's design doc. The advantage of this approach is that Xen doesn't
> > need to take any actions on the physical ITS command queue when the
> > guest issues virtual ITS commands, therefore completely solving this
> > problem at the root. (Although I am not sure about enable/disable
> > commands: could we avoid issuing enable/disable on pLPIs?)
> 
> In the previous design document (see [1]), the pLPIs are enabled when the
> device is assigned to the guest. This means that it is not necessary to send
> a command there. This also means we may receive a pLPI before the associated
> vLPI has been configured.
> 
> That said, given that LPIs are edge-triggered, there is no deactivate state
> (see 4.1 in ARM IHI 0069C). So as soon as the priority drop is done, the same
> LPIs could potentially be raised again. This could generate a storm.

Thank you for raising this important point. You are correct.


> The priority drop is necessary if we don't want to block the reception of
> interrupt for the current physical CPU.
> 
> What I am more concerned about is that this problem can also happen during
> normal operation (i.e. the pLPI is bound to a vLPI and the vLPI is enabled).
> For edge-triggered interrupts, there is no way to prevent them from firing
> again. Maybe it is time to introduce interrupt rate-limiting for ARM. Any
> opinions?

Yes. It could be as simple as disabling the pLPI when Xen receives a
second pLPI before the guest EOIs the first corresponding vLPI, which
shouldn't happen in normal circumstances.

We need a simple per-LPI inflight counter, incremented when a pLPI is
received, decremented when the corresponding vLPI is EOIed (the LR is
cleared).

When the counter > 1, we disable the pLPI and request a maintenance
interrupt for the corresponding vLPI.

When we receive the maintenance interrupt and we clear the LR of the
vLPI, Xen should re-enable the pLPI.

Given that the state of the LRs is sync'ed before calling gic_interrupt,
we can be sure to know exactly in what state the vLPI is at any given
time. But for this to work correctly, it is important to configure the
pLPI to be delivered to the same pCPU running the vCPU which handles
the vLPI (as it is already the case today for SPIs).
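The storm-detection scheme above could be modeled roughly as follows. This is illustrative only: none of these structures or helpers exist in Xen, and a real implementation would need locking and would keep the state in the two-level LPI table.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-LPI state for the storm detector. */
struct lpi_state {
    uint8_t inflight;       /* pLPIs received but not yet EOIed as vLPIs */
    bool plpi_enabled;      /* current state of the physical LPI */
    bool maintenance_irq;   /* maintenance interrupt requested for the vLPI */
};

/*
 * Called when a physical LPI fires. Returns true if the vLPI should be
 * injected now; on a second fire before the guest EOIs, throttle.
 */
static bool on_plpi(struct lpi_state *s)
{
    s->inflight++;
    if ( s->inflight > 1 )
    {
        /* Second pLPI before the guest EOIed the first: treat as a storm. */
        s->plpi_enabled = false;
        s->maintenance_irq = true;
        return false;
    }
    return true;
}

/* Called when the LR for the corresponding vLPI is cleared (guest EOI). */
static void on_vlpi_eoi(struct lpi_state *s)
{
    if ( s->inflight )
        s->inflight--;
    if ( !s->plpi_enabled && s->inflight == 0 )
    {
        s->plpi_enabled = true;     /* storm over: re-enable the pLPI */
        s->maintenance_irq = false;
    }
}
```

In this model the first pLPI is injected normally, a second one before the guest's EOI disables the pLPI and requests a maintenance interrupt, and the pLPI is re-enabled once the guest has drained the in-flight vLPIs.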

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [RFC PATCH 21/24] ARM: vITS: handle INVALL command
  2016-12-01  1:19                     ` Stefano Stabellini
@ 2016-12-02 16:18                       ` Andre Przywara
  2016-12-03  0:46                         ` Stefano Stabellini
  0 siblings, 1 reply; 144+ messages in thread
From: Andre Przywara @ 2016-12-02 16:18 UTC (permalink / raw)
  To: Stefano Stabellini, Julien Grall; +Cc: xen-devel, Vijay Kilari, Steve Capper

Hi,

sorry for chiming in late ....

I've been spending some time thinking about this, and I think we can in
fact get away without ever propagating command from domains to the host.

I made a list of all commands that possibly require host ITS command
propagation. There are two groups:
1: enabling/disabling LPIs: INV, INVALL
2: mapping/unmapping devices/events: DISCARD, MAPD, MAPTI.

The second group can be handled by mapping all required devices up
front, I will elaborate on that in a different email.

For the first group, read below ...

On 01/12/16 01:19, Stefano Stabellini wrote:
> On Fri, 25 Nov 2016, Julien Grall wrote:
>> Hi,
>>
>> On 18/11/16 18:39, Stefano Stabellini wrote:
>>> On Fri, 11 Nov 2016, Stefano Stabellini wrote:
>>>> On Fri, 11 Nov 2016, Julien Grall wrote:
>>>>> On 10/11/16 20:42, Stefano Stabellini wrote:
>>>>> That's why the approach we had in the previous series was "host ITS
>>>>> commands should be limited when emulating guest ITS commands". From my
>>>>> recollection, in that series the host and guest LPIs were fully
>>>>> separated (enabling a guest LPI did not enable a host LPI).
>>>>
>>>> I am interested in reading what Ian suggested to do when the physical
>>>> ITS queue is full, but I cannot find anything specific about it in the
>>>> doc.
>>>>
>>>> Do you have a suggestion for this?
>>>>
>>>> The only things that come to mind right now are:
>>>>
>>>> 1) check if the ITS queue is full and busy loop until it is not (spin_lock
>>>> style)
>>>> 2) check if the ITS queue is full and sleep until it is not (mutex style)
>>>
>>> Another, probably better idea, is to map all pLPIs of a device when the
>>> device is assigned to a guest (including Dom0). This is what was written
>>> in Ian's design doc. The advantage of this approach is that Xen doesn't
>>> need to take any actions on the physical ITS command queue when the
>>> guest issues virtual ITS commands, therefore completely solving this
>>> problem at the root. (Although I am not sure about enable/disable
>>> commands: could we avoid issuing enable/disable on pLPIs?)
>>
>> In the previous design document (see [1]), the pLPIs are enabled when the
>> device is assigned to the guest. This means that it is not necessary to send
>> a command there. This also means we may receive a pLPI before the associated
>> vLPI has been configured.
>>
>> That said, given that LPIs are edge-triggered, there is no deactivate state
>> (see 4.1 in ARM IHI 0069C). So as soon as the priority drop is done, the same
>> LPIs could potentially be raised again. This could generate a storm.
> 
> Thank you for raising this important point. You are correct.
>
>> The priority drop is necessary if we don't want to block the reception of
>> interrupt for the current physical CPU.
>>
>> What I am more concerned about is that this problem can also happen during
>> normal operation (i.e. the pLPI is bound to a vLPI and the vLPI is enabled).
>> For edge-triggered interrupts, there is no way to prevent them from firing
>> again. Maybe it is time to introduce interrupt rate-limiting for ARM. Any
>> opinions?
> 
> Yes. It could be as simple as disabling the pLPI when Xen receives a
> second pLPI before the guest EOIs the first corresponding vLPI, which
> shouldn't happen in normal circumstances.
> 
> We need a simple per-LPI inflight counter, incremented when a pLPI is
> received, decremented when the corresponding vLPI is EOIed (the LR is
> cleared).
> 
> When the counter > 1, we disable the pLPI and request a maintenance
> interrupt for the corresponding vLPI.

So why do we need a _counter_? This is about edge-triggered interrupts;
I think we can just accumulate all of them into one.

So here is what I think:
- We use the guest provided pending table to hold a pending bit for each
VLPI. We can unmap the memory from the guest, since software is not
supposed to access this table as per the spec.
- We use the guest provided property table, without trapping it. There
is nothing to be "validated" in that table, since it's a really tight
encoding and every value written in there is legal. We only look at bit
0 for this exercise here anyway.
- Upon reception of a physical LPI, we look it up to find the VCPU and
virtual LPI number. This is what we need to do anyway and it's a quick
two-level table lookup at the moment.
- We use the VCPU's domain and the VLPI number to index the guest's
property table and read the enabled bit. Again a quick table lookup.
 - If the VLPI is enabled, we EOI it on the host and inject it.
 - If the VLPI is disabled, we set the pending bit in the VCPU's
   pending table and EOI on the host - to allow other IRQs.
- On a guest INV command, we check whether that vLPI is now enabled:
 - If it is disabled now, we don't need to do anything.
 - If it is enabled now, we check the pending bit for that VLPI:
  - If it is 0, we don't do anything.
  - If it is 1, we inject the VLPI and clear the pending bit.
- On a guest INVALL command, we just need to iterate over the virtual
LPIs. If you look at the conditions above, the only immediate action is
when a VLPI gets enabled _and_ its pending bit is set. So we can do
64-bit read accesses over the whole pending table to find non-zero words
and thus set bits, which should be rare in practice. We can store the
highest mapped VLPI to avoid iterating over the whole of the table.
Ideally the guest has no direct control over the pending bits, since
this is what the device generates. Also we limit the number of VLPIs in
total per guest anyway.

If that still sounds like a DOS vector, we could additionally rate-limit
INVALLs, and/or track additions to the pending table after the last
INVALL: if there haven't been any new pending bits since the last scan,
INVALL is a NOP.
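The reception and INVALL paths described above could be sketched like this. It is a toy model, not Xen code: the table sizes, names and the "inject" action are invented, and the real property and pending tables live in guest memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_VLPIS 256

static uint8_t prop[NR_VLPIS];              /* property table: bit 0 = enabled */
static uint64_t pending[NR_VLPIS / 64];     /* pending table: one bit per vLPI */

static bool vlpi_enabled(uint32_t vlpi)
{
    return prop[vlpi] & 1;
}

/*
 * Reception of the physical LPI mapped to 'vlpi'. Returns true if the
 * vLPI is injected immediately; in both cases the host LPI is EOIed.
 */
static bool handle_plpi(uint32_t vlpi)
{
    if ( vlpi_enabled(vlpi) )
        return true;                        /* inject right away */

    pending[vlpi / 64] |= 1ULL << (vlpi % 64);  /* record for a later INV(ALL) */
    return false;
}

/*
 * INVALL: scan the pending table 64 bits at a time; only vLPIs that are
 * now enabled *and* pending need any action. Returns the number injected.
 */
static unsigned invall(void)
{
    unsigned injected = 0;

    for ( unsigned w = 0; w < NR_VLPIS / 64; w++ )
    {
        uint64_t bits = pending[w];         /* zero words are skipped cheaply */

        while ( bits )
        {
            unsigned bit = __builtin_ctzll(bits);
            uint32_t vlpi = w * 64 + bit;

            bits &= bits - 1;               /* clear lowest set bit in copy */
            if ( vlpi_enabled(vlpi) )
            {
                pending[w] &= ~(1ULL << bit);
                injected++;                 /* inject the vLPI into the guest */
            }
        }
    }
    return injected;
}
```

A guest INV of a single vLPI would be the same check without the loop: if the vLPI is now enabled and its pending bit is set, clear the bit and inject.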

Does that make sense so far?

So that just leaves us with the IRQ storm issue, which I am thinking
about now. But I guess this is not a show stopper, given that we can
disable the physical LPI if we sense this situation.

> When we receive the maintenance interrupt and we clear the LR of the
> vLPI, Xen should re-enable the pLPI.
> Given that the state of the LRs is sync'ed before calling gic_interrupt,
> we can be sure to know exactly in what state the vLPI is at any given
> time. But for this to work correctly, it is important to configure the
> pLPI to be delivered to the same pCPU running the vCPU which handles
> the vLPI (as it is already the case today for SPIs).

Why would that be necessary?

Cheers,
Andre

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [RFC PATCH 21/24] ARM: vITS: handle INVALL command
  2016-12-02 16:18                       ` Andre Przywara
@ 2016-12-03  0:46                         ` Stefano Stabellini
  2016-12-05 13:36                           ` Julien Grall
  2016-12-09 19:00                           ` Andre Przywara
  0 siblings, 2 replies; 144+ messages in thread
From: Stefano Stabellini @ 2016-12-03  0:46 UTC (permalink / raw)
  To: Andre Przywara
  Cc: xen-devel, Julien Grall, Stefano Stabellini, Vijay Kilari, Steve Capper

On Fri, 2 Dec 2016, Andre Przywara wrote:
> Hi,
> 
> sorry for chiming in late ....
> 
> I've been spending some time thinking about this, and I think we can in
> fact get away without ever propagating command from domains to the host.
> 
> I made a list of all commands that possibly require host ITS command
> propagation. There are two groups:
> 1: enabling/disabling LPIs: INV, INVALL
> 2: mapping/unmapping devices/events: DISCARD, MAPD, MAPTI.
> 
> The second group can be handled by mapping all required devices up
> front, I will elaborate on that in a different email.
> 
> For the first group, read below ...
> 
> On 01/12/16 01:19, Stefano Stabellini wrote:
> > On Fri, 25 Nov 2016, Julien Grall wrote:
> >> Hi,
> >>
> >> On 18/11/16 18:39, Stefano Stabellini wrote:
> >>> On Fri, 11 Nov 2016, Stefano Stabellini wrote:
> >>>> On Fri, 11 Nov 2016, Julien Grall wrote:
> >>>>> On 10/11/16 20:42, Stefano Stabellini wrote:
> >>>>> That's why the approach we had in the previous series was "host ITS
> >>>>> commands should be limited when emulating guest ITS commands". From my
> >>>>> recollection, in that series the host and guest LPIs were fully
> >>>>> separated (enabling a guest LPI did not enable a host LPI).
> >>>>
> >>>> I am interested in reading what Ian suggested to do when the physical
> >>>> ITS queue is full, but I cannot find anything specific about it in the
> >>>> doc.
> >>>>
> >>>> Do you have a suggestion for this?
> >>>>
> >>>> The only things that come to mind right now are:
> >>>>
> >>>> 1) check if the ITS queue is full and busy loop until it is not (spin_lock
> >>>> style)
> >>>> 2) check if the ITS queue is full and sleep until it is not (mutex style)
> >>>
> >>> Another, probably better idea, is to map all pLPIs of a device when the
> >>> device is assigned to a guest (including Dom0). This is what was written
> >>> in Ian's design doc. The advantage of this approach is that Xen doesn't
> >>> need to take any actions on the physical ITS command queue when the
> >>> guest issues virtual ITS commands, therefore completely solving this
> >>> problem at the root. (Although I am not sure about enable/disable
> >>> commands: could we avoid issuing enable/disable on pLPIs?)
> >>
> >> In the previous design document (see [1]), the pLPIs are enabled when the
> >> device is assigned to the guest. This means that it is not necessary to
> >> send a command there. This also means we may receive a pLPI before the
> >> associated vLPI has been configured.
> >>
> >> That said, given that LPIs are edge-triggered, there is no deactivate state
> >> (see 4.1 in ARM IHI 0069C). So as soon as the priority drop is done, the same
> >> LPIs could potentially be raised again. This could generate a storm.
> > 
> > Thank you for raising this important point. You are correct.
> >
> >> The priority drop is necessary if we don't want to block the reception of
> >> interrupt for the current physical CPU.
> >>
> >> What I am more concerned about is that this problem can also happen
> >> during normal operation (i.e. the pLPI is bound to a vLPI and the vLPI is
> >> enabled). For edge-triggered interrupts, there is no way to prevent them
> >> from firing again. Maybe it is time to introduce interrupt rate-limiting
> >> for ARM. Any opinions?
> > 
> > Yes. It could be as simple as disabling the pLPI when Xen receives a
> > second pLPI before the guest EOIs the first corresponding vLPI, which
> > shouldn't happen in normal circumstances.
> > 
> > We need a simple per-LPI inflight counter, incremented when a pLPI is
> > received, decremented when the corresponding vLPI is EOIed (the LR is
> > cleared).
> > 
> > When the counter > 1, we disable the pLPI and request a maintenance
> > interrupt for the corresponding vLPI.
> 
> So why do we need a _counter_? This is about edge-triggered interrupts;
> I think we can just accumulate all of them into one.

The counter is not there to re-inject the same number of interrupts into
the guest, but to detect interrupt storms.
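For concreteness, here is a minimal, self-contained sketch of the
storm-detection idea in C. All names (lpi_state, lpi_receive, lpi_eoi)
are illustrative, not Xen code; the comments mark where the real
hardware accesses would go.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct lpi_state {
    uint8_t inflight;   /* pLPIs received since the last vLPI EOI */
    bool enabled;       /* whether the pLPI is enabled in hardware */
};

/* Called on reception of a physical LPI. Returns true if the pLPI
 * was disabled because a storm was detected (second pLPI before the
 * guest EOIed the first corresponding vLPI). */
static bool lpi_receive(struct lpi_state *s)
{
    if (++s->inflight > 1) {
        s->enabled = false;   /* would disable the pLPI in hardware
                               * and request a maintenance interrupt */
        return true;
    }
    return false;
}

/* Called when the corresponding vLPI is EOIed (the LR is cleared):
 * decrement the counter and re-enable the pLPI if it was disabled. */
static void lpi_eoi(struct lpi_state *s)
{
    if (s->inflight)
        s->inflight--;
    if (!s->enabled)
        s->enabled = true;    /* would re-enable the pLPI in hardware */
}
```

In normal operation the counter never exceeds 1, so the slow path is
only taken under a storm.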


> So here is what I think:
> - We use the guest provided pending table to hold a pending bit for each
> VLPI. We can unmap the memory from the guest, since software is not
> supposed to access this table as per the spec.
> - We use the guest provided property table, without trapping it. There
> is nothing to be "validated" in that table, since it's a really tight
> encoding and every value written in there is legal. We only look at bit
> 0 for this exercise here anyway.

I am following...


> - Upon reception of a physical LPI, we look it up to find the VCPU and
> virtual LPI number. This is what we need to do anyway and it's a quick
> two-level table lookup at the moment.
> - We use the VCPU's domain and the VLPI number to index the guest's
> property table and read the enabled bit. Again a quick table lookup.

Both should be constant-time lookups (two memory accesses each), correct?


>  - If the VLPI is enabled, we EOI it on the host and inject it.
>  - If the VLPI is disabled, we set the pending bit in the VCPU's
>    pending table and EOI on the host - to allow other IRQs.
> - On a guest INV command, we check whether that vLPI is now enabled:
>  - If it is disabled now, we don't need to do anything.
>  - If it is enabled now, we check the pending bit for that VLPI:
>   - If it is 0, we don't do anything.
>   - If it is 1, we inject the VLPI and clear the pending bit.
> - On a guest INVALL command, we just need to iterate over the virtual
> LPIs.

Right, much better.
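The flow described above can be sketched as follows (a toy model, not
Xen code: the flat arrays, the names handle_plpi/handle_inv, and the
injection counter are all invented for illustration):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_VLPIS 8192

/* Guest property table: 1 byte per vLPI, bit 0 = enabled. */
static uint8_t prop_table[NR_VLPIS];
/* Pending table: 1 bit per vLPI. */
static uint64_t pending[NR_VLPIS / 64];
/* Stand-in for injecting a vLPI into the vCPU. */
static int injected;

static bool vlpi_enabled(uint32_t vlpi)
{
    return prop_table[vlpi] & 1;
}

/* Reception of a physical LPI already translated to its vLPI number. */
static void handle_plpi(uint32_t vlpi)
{
    if (vlpi_enabled(vlpi))
        injected++;                               /* inject the vLPI */
    else
        pending[vlpi / 64] |= 1ULL << (vlpi % 64); /* record it */
    /* in both cases: EOI on the host to allow other IRQs */
}

/* Guest INV: only if the vLPI is now enabled _and_ pending do we
 * need to act, by injecting it and clearing the pending bit. */
static void handle_inv(uint32_t vlpi)
{
    uint64_t mask = 1ULL << (vlpi % 64);

    if (vlpi_enabled(vlpi) && (pending[vlpi / 64] & mask)) {
        pending[vlpi / 64] &= ~mask;
        injected++;
    }
}
```

Both lookups on the hot path are plain array indexing, matching the
"quick table lookup" point made above.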


> If you look at the conditions above, the only immediate action is
> when a VLPI gets enabled _and_ its pending bit is set. So we can do
> 64-bit read accesses over the whole pending table to find non-zero words
> and thus set bits, which should be rare in practice. We can store the
> highest mapped VLPI to avoid iterating over the whole of the table.
> Ideally the guest has no direct control over the pending bits, since
> this is what the device generates. Also we limit the number of VLPIs in
> total per guest anyway.

I wonder if we could even use a fully packed bitmask with only the
pending bits, so 1 bit per vLPI, rather than 1 byte per vLPI. That would
be a nice improvement.
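The 64-bit scan described above could look roughly like this (an
illustrative sketch using the packed 1-bit-per-vLPI layout suggested
here; process_vlpi stands in for the real per-vLPI work, and the
highest-mapped-vLPI bound is a simple global):

```c
#include <assert.h>
#include <stdint.h>

#define NR_VLPIS 8192

static uint64_t pending[NR_VLPIS / 64];  /* 1 bit per vLPI */
static uint32_t max_vlpi = NR_VLPIS - 1; /* highest mapped vLPI */
static int processed;

static void process_vlpi(uint32_t vlpi)
{
    (void)vlpi;
    processed++;   /* stand-in for checking/injecting the vLPI */
}

/* INVALL-style scan: read the pending table in 64-bit words and
 * skip zero words, which should be the overwhelmingly common case. */
static void scan_pending(void)
{
    for (uint32_t w = 0; w <= max_vlpi / 64; w++) {
        uint64_t bits = pending[w];

        if (!bits)
            continue;            /* 64 vLPIs skipped in one read */
        pending[w] = 0;
        while (bits) {
            process_vlpi(w * 64 + __builtin_ctzll(bits));
            bits &= bits - 1;    /* clear the lowest set bit */
        }
    }
}
```

`__builtin_ctzll` is a GCC/Clang builtin; a portable loop over the 64
bits would work equally well.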


> If that still sounds like a DOS vector, we could additionally rate-limit
> INVALLs, and/or track additions to the pending table after the last
> INVALL: if there haven't been any new pending bits since the last scan,
> INVALL is a NOP.
> 
> Does that make sense so far?

It makes sense. It should be OK.


> So that just leaves us with this IRQ storm issue, which I am thinking
> about now. But I guess this is not a show stopper given we can disable
> the physical LPI if we sense this situation.

That is true and it's exactly what we should do.


> > When we receive the maintenance interrupt and we clear the LR of the
> > vLPI, Xen should re-enable the pLPI.
> > Given that the state of the LRs is sync'ed before calling gic_interrupt,
> > we can be sure to know exactly in what state the vLPI is at any given
> > time. But for this to work correctly, it is important to configure the
> > pLPI to be delivered to the same pCPU running the vCPU which handles
> > the vLPI (as it is already the case today for SPIs).
> 
> Why would that be necessary?

Because the state of the LRs of other pCPUs won't be up to date: we
wouldn't know for sure whether the guest EOI'ed the vLPI or not.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [RFC PATCH 21/24] ARM: vITS: handle INVALL command
  2016-12-03  0:46                         ` Stefano Stabellini
@ 2016-12-05 13:36                           ` Julien Grall
  2016-12-05 19:51                             ` Stefano Stabellini
  2016-12-09 19:00                           ` Andre Przywara
  1 sibling, 1 reply; 144+ messages in thread
From: Julien Grall @ 2016-12-05 13:36 UTC (permalink / raw)
  To: Stefano Stabellini, Andre Przywara; +Cc: xen-devel, Vijay Kilari, Steve Capper

Hi Stefano,

On 03/12/16 00:46, Stefano Stabellini wrote:
> On Fri, 2 Dec 2016, Andre Przywara wrote:
>>> When we receive the maintenance interrupt and we clear the LR of the
>>> vLPI, Xen should re-enable the pLPI.
>>> Given that the state of the LRs is sync'ed before calling gic_interrupt,
>>> we can be sure to know exactly in what state the vLPI is at any given
>>> time. But for this to work correctly, it is important to configure the
>>> pLPI to be delivered to the same pCPU running the vCPU which handles
>>> the vLPI (as it is already the case today for SPIs).
>>
>> Why would that be necessary?
>
> Because the state of the LRs of other pCPUs won't be up to date: we
> wouldn't know for sure whether the guest EOI'ed the vLPI or not.

Well, there is still a small window in which the interrupt may be received
on the previous pCPU, so we have to take that case into account.

This window may be bigger with LPIs, because a single vCPU may have
thousands of interrupts routed to it. It would take a long time to move
all of them when the vCPU is migrating, so we may want to take a lazy
approach and move them only when they are received on the "wrong" pCPU.
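Such a lazy fix-up could be sketched like this (illustrative only: the
data structures are toy stand-ins, and irq_set_affinity here merely
models the hardware write Xen would perform):

```c
#include <assert.h>

#define NR_LPIS 64

static int lpi_target[NR_LPIS]; /* pCPU each pLPI is routed to */
static int vcpu_pcpu;           /* pCPU the owning vCPU now runs on */
static int retargets;           /* number of hardware accesses done */

static void irq_set_affinity(int lpi, int pcpu)
{
    lpi_target[lpi] = pcpu;     /* would touch the ITS/redistributor */
    retargets++;
}

/* Called when pLPI 'lpi' is received on pCPU 'here'. Instead of
 * retargeting every routed LPI at vCPU-migration time, fix up the
 * affinity only for LPIs that actually fire on the wrong pCPU. */
static void lpi_received(int lpi, int here)
{
    if (here != vcpu_pcpu)
        irq_set_affinity(lpi, vcpu_pcpu);   /* lazy fix-up */
    /* ...then forward the interrupt to the guest as usual... */
}
```

The cost of the hardware access is then paid once per misdelivered
interrupt rather than once per routed LPI at every vCPU migration.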

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [RFC PATCH 21/24] ARM: vITS: handle INVALL command
  2016-12-05 13:36                           ` Julien Grall
@ 2016-12-05 19:51                             ` Stefano Stabellini
  2016-12-06 15:56                               ` Julien Grall
  2016-12-06 21:36                               ` Dario Faggioli
  0 siblings, 2 replies; 144+ messages in thread
From: Stefano Stabellini @ 2016-12-05 19:51 UTC (permalink / raw)
  To: Julien Grall
  Cc: Stefano Stabellini, Vijay Kilari, Steve Capper, Andre Przywara,
	dario.faggioli, george.dunlap, xen-devel

On Mon, 5 Dec 2016, Julien Grall wrote:
> Hi Stefano,
> 
> On 03/12/16 00:46, Stefano Stabellini wrote:
> > On Fri, 2 Dec 2016, Andre Przywara wrote:
> > > > When we receive the maintenance interrupt and we clear the LR of the
> > > > vLPI, Xen should re-enable the pLPI.
> > > > Given that the state of the LRs is sync'ed before calling gic_interrupt,
> > > > we can be sure to know exactly in what state the vLPI is at any given
> > > > time. But for this to work correctly, it is important to configure the
> > > > pLPI to be delivered to the same pCPU running the vCPU which handles
> > > > the vLPI (as it is already the case today for SPIs).
> > > 
> > > Why would that be necessary?
> > 
> > Because the state of the LRs of other pCPUs won't be up to date: we
> > wouldn't know for sure whether the guest EOI'ed the vLPI or not.
> 
> Well, there is still a small window in which the interrupt may be received
> on the previous pCPU, so we have to take that case into account.

That's right. We already have a mechanism to deal with that, based on
the GIC_IRQ_GUEST_MIGRATING flag. It should work with LPIs too.
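As a rough illustration of that mechanism (a simplified model: the flag
and field names echo Xen's vGIC, but this is a sketch of the idea, not
the actual implementation): while the flag is set, the hardware
retarget is deferred until the in-flight vIRQ has been EOIed.

```c
#include <assert.h>
#include <stdbool.h>

struct virq_state {
    bool migrating;   /* GIC_IRQ_GUEST_MIGRATING-like flag */
    bool in_lr;       /* vIRQ currently in an LR (in flight) */
    int  hw_target;   /* pCPU the physical IRQ is routed to */
    int  new_target;  /* pCPU of the vCPU's new home */
};

/* The vCPU moves: if the vIRQ is in flight on the old pCPU, defer
 * the hardware retarget; otherwise do it immediately. */
static void vcpu_moved(struct virq_state *v, int new_pcpu)
{
    v->new_target = new_pcpu;
    if (v->in_lr)
        v->migrating = true;       /* finish later, on EOI */
    else
        v->hw_target = new_pcpu;   /* retarget right away */
}

/* The guest EOIs the vIRQ (LR cleared): complete a pending move. */
static void virq_eoi(struct virq_state *v)
{
    v->in_lr = false;
    if (v->migrating) {
        v->migrating = false;
        v->hw_target = v->new_target;
    }
}
```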


> This window may be bigger with LPIs, because a single vCPU may have
> thousands of interrupts routed to it. It would take a long time to move
> all of them when the vCPU is migrating, so we may want to take a lazy
> approach and move them only when they are received on the "wrong" pCPU.

That's possible. The only downside is that modifying the irq migration
workflow is difficult and we might want to avoid it if possible.

Another approach is to let the scheduler know that migration is slower.
In fact this is not a new problem: it can be slow to migrate interrupts,
even a few non-LPI interrupts, even on x86. I wonder if the Xen scheduler
has any knowledge of that (CC'ing George and Dario). I guess that's the
reason why most people run with dom0_vcpus_pin.


^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [RFC PATCH 21/24] ARM: vITS: handle INVALL command
  2016-12-05 19:51                             ` Stefano Stabellini
@ 2016-12-06 15:56                               ` Julien Grall
  2016-12-06 19:36                                 ` Stefano Stabellini
  2016-12-06 21:36                               ` Dario Faggioli
  1 sibling, 1 reply; 144+ messages in thread
From: Julien Grall @ 2016-12-06 15:56 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Vijay Kilari, Steve Capper, Andre Przywara, dario.faggioli,
	george.dunlap, xen-devel

Hi Stefano,

On 05/12/16 19:51, Stefano Stabellini wrote:
> On Mon, 5 Dec 2016, Julien Grall wrote:
>> Hi Stefano,
>>
>> On 03/12/16 00:46, Stefano Stabellini wrote:
>>> On Fri, 2 Dec 2016, Andre Przywara wrote:
>>>>> When we receive the maintenance interrupt and we clear the LR of the
>>>>> vLPI, Xen should re-enable the pLPI.
>>>>> Given that the state of the LRs is sync'ed before calling gic_interrupt,
>>>>> we can be sure to know exactly in what state the vLPI is at any given
>>>>> time. But for this to work correctly, it is important to configure the
>>>>> pLPI to be delivered to the same pCPU running the vCPU which handles
>>>>> the vLPI (as it is already the case today for SPIs).
>>>>
>>>> Why would that be necessary?
>>>
>>> Because the state of the LRs of other pCPUs won't be up to date: we
>>> wouldn't know for sure whether the guest EOI'ed the vLPI or not.
>>
>> Well, there is still a small window in which the interrupt may be
>> received on the previous pCPU, so we have to take that case into account.
>
> That's right. We already have a mechanism to deal with that, based on
> the GIC_IRQ_GUEST_MIGRATING flag. It should work with LPIs too.

Right.

>> This window may be bigger with LPIs, because a single vCPU may have
>> thousands of interrupts routed to it. It would take a long time to move
>> all of them when the vCPU is migrating, so we may want to take a lazy
>> approach and move them only when they are received on the "wrong" pCPU.
>
> That's possible. The only downside is that modifying the irq migration
> workflow is difficult and we might want to avoid it if possible.

I don't think this would modify the irq migration workflow. If you look
at the implementation of arch_move_irqs, it just goes over the vIRQs and
calls irq_set_affinity.

irq_set_affinity will directly modify the hardware, and that's all.

>
> Another approach is to let the scheduler know that migration is slower.
> In fact this is not a new problem: it can be slow to migrate interrupts,
> even few non-LPIs interrupts, even on x86. I wonder if the Xen scheduler
> has any knowledge of that (CC'ing George and Dario). I guess that's the
> reason why most people run with dom0_vcpus_pin.

I gave a quick look at x86: arch_move_irqs is not implemented. Only PIRQs
are migrated when a vCPU is moved to another pCPU.

The function pirq_set_affinity will change the affinity of a PIRQ, but
only in software (see irq_set_affinity); the configuration is not yet
replicated to the hardware.

In the case of ARM, we directly modify the configuration of the hardware.
This adds much more overhead, because you have to do a hardware access
for every single IRQ.

Regards,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [RFC PATCH 21/24] ARM: vITS: handle INVALL command
  2016-12-06 15:56                               ` Julien Grall
@ 2016-12-06 19:36                                 ` Stefano Stabellini
  2016-12-06 21:32                                   ` Dario Faggioli
  0 siblings, 1 reply; 144+ messages in thread
From: Stefano Stabellini @ 2016-12-06 19:36 UTC (permalink / raw)
  To: Julien Grall
  Cc: Stefano Stabellini, Vijay Kilari, Steve Capper, Andre Przywara,
	dario.faggioli, george.dunlap, xen-devel

On Tue, 6 Dec 2016, Julien Grall wrote:
> > > This window may be bigger with LPIs, because a single vCPU may have
> > > thousands of interrupts routed to it. It would take a long time to
> > > move all of them when the vCPU is migrating, so we may want to take a
> > > lazy approach and move them only when they are received on the "wrong"
> > > pCPU.
> > 
> > That's possible. The only downside is that modifying the irq migration
> > workflow is difficult and we might want to avoid it if possible.
> 
> I don't think this would modify the irq migration workflow. If you look at
> the implementation of arch_move_irqs, it just goes over the vIRQs and calls
> irq_set_affinity.
> 
> irq_set_affinity will directly modify the hardware and that's all.
> 
> > 
> > Another approach is to let the scheduler know that migration is slower.
> > In fact this is not a new problem: it can be slow to migrate interrupts,
> > even a few non-LPI interrupts, even on x86. I wonder if the Xen scheduler
> > has any knowledge of that (CC'ing George and Dario). I guess that's the
> > reason why most people run with dom0_vcpus_pin.
> 
> I gave a quick look at x86: arch_move_irqs is not implemented. Only PIRQs
> are migrated when a vCPU is moved to another pCPU.
> 
> The function pirq_set_affinity will change the affinity of a PIRQ, but only
> in software (see irq_set_affinity); the configuration is not yet replicated
> to the hardware.
> 
> In the case of ARM, we directly modify the configuration of the hardware.
> This adds much more overhead, because you have to do a hardware access for
> every single IRQ.

George, Dario, any comments on whether this would make sense and how to
do it?


^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [RFC PATCH 21/24] ARM: vITS: handle INVALL command
  2016-12-06 19:36                                 ` Stefano Stabellini
@ 2016-12-06 21:32                                   ` Dario Faggioli
  2016-12-06 21:53                                     ` Stefano Stabellini
  0 siblings, 1 reply; 144+ messages in thread
From: Dario Faggioli @ 2016-12-06 21:32 UTC (permalink / raw)
  To: Stefano Stabellini, Julien Grall
  Cc: Andre Przywara, Steve Capper, george.dunlap, Vijay Kilari, xen-devel


On Tue, 2016-12-06 at 11:36 -0800, Stefano Stabellini wrote:
> On Tue, 6 Dec 2016, Julien Grall wrote:
> > 
> > > Another approach is to let the scheduler know that migration is
> > > slower. In fact this is not a new problem: it can be slow to migrate
> > > interrupts, even a few non-LPI interrupts, even on x86. I wonder if
> > > the Xen scheduler has any knowledge of that (CC'ing George and
> > > Dario). I guess that's the reason why most people run with
> > > dom0_vcpus_pin.
> > 
> > I gave a quick look at x86: arch_move_irqs is not implemented. Only
> > PIRQs are migrated when a vCPU is moved to another pCPU.
> > 
> > In the case of ARM, we directly modify the configuration of the
> > hardware. This adds much more overhead, because you have to do a
> > hardware access for every single IRQ.
> 
> George, Dario, any comments on whether this would make sense and how to
> do it?
>
I was actually looking into this, but I don't think I know enough about
ARM in general, or about this issue in particular, to be useful.

That being said, perhaps you could clarify a bit what you mean by
"let the scheduler know that migration is slower". What would you expect
the scheduler to do?

Checking the code, as Julien says, on x86 all we do when we move vCPUs
around is call evtchn_move_pirqs(). In fact, it was exactly that
function that was called multiple times in schedule.c, and it was you
who (as Julien already pointed out):
1) in 5bd62a757b9 ("xen/arm: physical irq follow virtual irq"), 
   created arch_move_irqs() as something that does something on ARM,
   and as an empty stub in x86.
2) in 14f7e3b8a70 ("xen: introduce sched_move_irqs"), generalized 
   schedule.c code by implementing sched_move_irqs().

So, if I understood correctly what Julien said here ("I don't think this
would modify the irq migration work flow, etc."), it looks to me like
the suggested lazy approach could be a good solution (though I say that
lacking the knowledge of what it would actually mean to implement it).

If you want something inside the scheduler that sort of delays the
wakeup of a domain on the new pCPU until some condition in the IRQ
handling code is verified (but, please, confirm whether or not this is
what you were thinking of), my thoughts, off the top of my head, are:
- in general, I think it should be possible;
- it has to be arch-specific, I think;
- it is easy to avoid the vCPU being woken as a consequence of
  vcpu_wake() being called, e.g., at the end of vcpu_migrate();
- we must be careful not to forget/fail to (re)wake the vCPU when the
  condition becomes true.

Sorry if I can't be more useful than this for now. :-/

Regards,
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)


^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [RFC PATCH 21/24] ARM: vITS: handle INVALL command
  2016-12-05 19:51                             ` Stefano Stabellini
  2016-12-06 15:56                               ` Julien Grall
@ 2016-12-06 21:36                               ` Dario Faggioli
  1 sibling, 0 replies; 144+ messages in thread
From: Dario Faggioli @ 2016-12-06 21:36 UTC (permalink / raw)
  To: Stefano Stabellini, Julien Grall
  Cc: Andre Przywara, Steve Capper, george.dunlap, Vijay Kilari, xen-devel


On Mon, 2016-12-05 at 11:51 -0800, Stefano Stabellini wrote:
> Another approach is to let the scheduler know that migration is slower.
> In fact this is not a new problem: it can be slow to migrate interrupts,
> even a few non-LPI interrupts, even on x86. I wonder if the Xen scheduler
> has any knowledge of that (CC'ing George and Dario). I guess that's the
> reason why most people run with dom0_vcpus_pin.
>
Oh, and about this last sentence.

I may indeed be lacking knowledge/understanding, but if you think this
is a valid use case for dom0_vcpus_pin, I'd be interested in knowing
why.

In fact, that configuration has always looked rather awkward to me, and
I think we should start thinking about dropping the option altogether
(or changing/extending its behavior).

So, if you think you need it, please spell that out, and let's see if
there are better ways to achieve the same. :-)

Thanks and Regards,
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)


^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [RFC PATCH 21/24] ARM: vITS: handle INVALL command
  2016-12-06 21:32                                   ` Dario Faggioli
@ 2016-12-06 21:53                                     ` Stefano Stabellini
  2016-12-06 22:01                                       ` Stefano Stabellini
  2016-12-06 22:39                                       ` Dario Faggioli
  0 siblings, 2 replies; 144+ messages in thread
From: Stefano Stabellini @ 2016-12-06 21:53 UTC (permalink / raw)
  To: Dario Faggioli
  Cc: Stefano Stabellini, Vijay Kilari, Steve Capper, Andre Przywara,
	george.dunlap, Julien Grall, xen-devel

On Tue, 6 Dec 2016, Dario Faggioli wrote:
> On Tue, 2016-12-06 at 11:36 -0800, Stefano Stabellini wrote:
> > On Tue, 6 Dec 2016, Julien Grall wrote:
> > > 
> > > > Another approach is to let the scheduler know that migration is
> > > > slower.
> > > > In fact this is not a new problem: it can be slow to migrate
> > > > interrupts,
> > > > even few non-LPIs interrupts, even on x86. I wonder if the Xen
> > > > scheduler
> > > > has any knowledge of that (CC'ing George and Dario). I guess
> > > > that's the
> > > > reason why most people run with dom0_vcpus_pin.
> > > 
> > > I gave a quick look at x86, arch_move_irqs is not implemented. Only
> > > PIRQ are
> > > migrated when a vCPU is moving to another pCPU.
> > > 
> > > In the case of ARM, we directly modify the configuration of the
> > > hardware. This
> > > adds much more overhead because you have to do an hardware access
> > > for every
> > > single IRQ.
> > 
> > George, Dario, any comments on whether this would make sense and how
> > to
> > do it?
> >
> I was actually looking into this, but I think I don't know enough of
> ARM in general, and about this issue in particular to be useful.
> 
> That being said, perhaps you could clarify a bit what you mean with
> "let the scheduler know that migration is slower". What you'd expect
> the scheduler to do?
> 
> Checking the code, as Julien says, on x86 all we do when we move vCPUs
> around is calling evtchn_move_pirqs(). In fact, it was right that
> function that was called multiple times in schedule.c, and it was you
> that (as Julien pointed out already):
> 1) in 5bd62a757b9 ("xen/arm: physical irq follow virtual irq"), 
>    created arch_move_irqs() as something that does something on ARM,
>    and as an empty stub in x86.
> 2) in 14f7e3b8a70 ("xen: introduce sched_move_irqs"), generalized 
>    schedule.c code by implementing sched_move_irqs().
> 
> So, if I understood correctly what Julien said here "I don't think this
> would modify the irq migration work flow. etc.", it looks to me that
> the suggested lazy approach could be a good solution (but I'm saying
> that lacking the knowledge of what it would actually mean to implement
> that).
> 
> If you want something inside the scheduler that sort of delays the
> wakeup of a domain on the new pCPU until some condition in IRQ handling
> code is verified (but, please, confirm whether or not it was this that
> you were thinking of), my thoughts, out of the top of my head about
> this are:
> - in general, I think it should be possible;
> - it has to be arch-specific, I think?
> - It's easy to avoid the vCPU being woken as a consequence of
>   vcpu_wake() being called, e.g., at the end of vcpu_migrate();
> - we must be careful about not forgetting/failing to (re)wakeup the 
>   vCPU when the condition verifies
> 
> Sorry if I can't be more useful than this for now. :-/

We don't need scheduler support to implement interrupt migration. The
question was much simpler than that: moving a vCPU with interrupts
assigned to it is slower than moving a vCPU without interrupts assigned
to it. You could say that the slowness is directly proportional to the
number of interrupts assigned to the vCPU. Does the scheduler know that,
or does it blindly move vCPUs around? Also see below.



> On Mon, 2016-12-05 at 11:51 -0800, Stefano Stabellini wrote:
> > Another approach is to let the scheduler know that migration is slower.
> > In fact this is not a new problem: it can be slow to migrate interrupts,
> > even a few non-LPI interrupts, even on x86. I wonder if the Xen scheduler
> > has any knowledge of that (CC'ing George and Dario). I guess that's the
> > reason why most people run with dom0_vcpus_pin.
> >
> Oh, and about this last sentence.
> 
> I may indeed be lacking knowledge/understanding, but if you think this
> is a valid use case for dom0_vcpus_pin, I'd be interested in knowing
> why.
> 
> In fact, that configuration has always looked rather awkward to me, and
> I think we should start thinking about dropping the option altogether
> (or changing/extending its behavior).
> 
> So, if you think you need it, please spell that out, and let's see if
> there are better ways to achieve the same. :-)

That's right: I think dom0_vcpus_pin is a good work-around for the
scheduler's lack of knowledge about interrupts. If the scheduler knew
that moving vCPU0 from pCPU0 to pCPU1 is far more expensive than moving
vCPU3 from pCPU3 to pCPU1, then it would make better decisions and we
wouldn't need dom0_vcpus_pin.


^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [RFC PATCH 21/24] ARM: vITS: handle INVALL command
  2016-12-06 21:53                                     ` Stefano Stabellini
@ 2016-12-06 22:01                                       ` Stefano Stabellini
  2016-12-06 22:12                                         ` Dario Faggioli
  2016-12-06 23:13                                         ` Julien Grall
  2016-12-06 22:39                                       ` Dario Faggioli
  1 sibling, 2 replies; 144+ messages in thread
From: Stefano Stabellini @ 2016-12-06 22:01 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Vijay Kilari, Steve Capper, Andre Przywara, Dario Faggioli,
	george.dunlap, Julien Grall, xen-devel

On Tue, 6 Dec 2016, Stefano Stabellini wrote:
> moving a vCPU with interrupts assigned to it is slower than moving a
> vCPU without interrupts assigned to it. You could say that the
> slowness is directly proportional to the number of interrupts assigned
> to the vCPU.

To be pedantic, by "assigned" I mean that a physical interrupt is routed
to a given pCPU and is set to be forwarded to a guest vCPU running on it
by the _IRQ_GUEST flag. The guest could be dom0. Upon receiving one of
these physical interrupts, a corresponding virtual interrupt (which could
be a different irq) will be injected into the guest vCPU.

When the vCPU is migrated to a new pCPU, the physical interrupts that are
configured to be injected as virtual interrupts into the vCPU are
migrated with it. The physical interrupt migration has a cost. However,
receiving physical interrupts on the wrong pCPU has a higher cost.


^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [RFC PATCH 21/24] ARM: vITS: handle INVALL command
  2016-12-06 22:01                                       ` Stefano Stabellini
@ 2016-12-06 22:12                                         ` Dario Faggioli
  2016-12-06 23:13                                         ` Julien Grall
  1 sibling, 0 replies; 144+ messages in thread
From: Dario Faggioli @ 2016-12-06 22:12 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Vijay Kilari, Steve Capper, Andre Przywara, george.dunlap,
	Julien Grall, xen-devel


On Tue, 2016-12-06 at 14:01 -0800, Stefano Stabellini wrote:
> On Tue, 6 Dec 2016, Stefano Stabellini wrote:
> > 
> > moving a vCPU with interrupts assigned to it is slower than moving a
> > vCPU without interrupts assigned to it. You could say that the
> > slowness is directly proportional to the number of interrupts
> > assigned to the vCPU.
> 
> To be pedantic, by "assigned" I mean that a physical interrupt is
> routed to a given pCPU and is set to be forwarded to a guest vCPU
> running on it by the _IRQ_GUEST flag. The guest could be dom0. Upon
> receiving one of these physical interrupts, a corresponding virtual
> interrupt (which could be a different irq) will be injected into the
> guest vCPU.
> 
> When the vCPU is migrated to a new pCPU, the physical interrupts that
> are configured to be injected as virtual interrupts into the vCPU are
> migrated with it. The physical interrupt migration has a cost.
> However, receiving physical interrupts on the wrong pCPU has a higher
> cost.
>
Yeah, I got in what sense you said "assigned", but thanks anyway for
the clarification. It indeed makes the picture clearer (even if just
FTR). :-)

Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)


^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [RFC PATCH 21/24] ARM: vITS: handle INVALL command
  2016-12-06 21:53                                     ` Stefano Stabellini
  2016-12-06 22:01                                       ` Stefano Stabellini
@ 2016-12-06 22:39                                       ` Dario Faggioli
  2016-12-06 23:24                                         ` Julien Grall
  2016-12-07 20:21                                         ` Stefano Stabellini
  1 sibling, 2 replies; 144+ messages in thread
From: Dario Faggioli @ 2016-12-06 22:39 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Vijay Kilari, Steve Capper, Andre Przywara, george.dunlap,
	Julien Grall, xen-devel


On Tue, 2016-12-06 at 13:53 -0800, Stefano Stabellini wrote:
> On Tue, 6 Dec 2016, Dario Faggioli wrote:
> > Sorry if I can't be more useful than this for now. :-/
> 
> We don't need scheduler support to implement interrupt migration. The
> question was much simpler than that: moving a vCPU with interrupts
> assigned to it is slower than moving a vCPU without interrupts assigned
> to it. You could say that the slowness is directly proportional to the
> number of interrupts assigned to the vCPU. Does the scheduler know that,
> or does it blindly move vCPUs around? Also see below.
> 
Ah, ok, it is indeed a simpler question than I thought! :-)

Answer: no, the scheduler does not use the information of how many or
which interrupts are being routed to a vCPU in any way.

Just for the sake of correctness and precision, it does not "blindly
move vCPUs around": it follows some criteria for deciding whether or not
to move a vCPU and, if yes, where to; but among those criteria there is
no trace of anything related to routed interrupts.

Let me also add that the criteria are scheduler specific, so they're
different, e.g., between Credit and Credit2.

Starting to consider routed interrupts as a migration criterion in
Credit would be rather difficult. Credit uses a 'best effort' approach
for migrating vCPUs, which is hard to augment.

Starting to consider routed interrupts as a migration criterion in
Credit2 would be much easier. Credit2's load balancer is specifically
designed to be extensible with things like that. It would require some
thinking, though, in order to figure out how important this particular
aspect would be compared to the others that are considered.

E.g., if pCPU 0 is loaded at 75% and pCPU 1 at 25%, vCPU A has a lot of
routed interrupts, and moving it gives me perfect load balancing (i.e.,
the load will become 50% on pCPU 0 and 50% on pCPU 1), should I move it
or not?
Well, it depends on whether or not we think that the overhead we save by
not migrating outweighs the benefit of a perfectly balanced system.

Something like that...
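To make that tradeoff concrete, here is a rough sketch (not actual Xen
code; the structure, function and the cost constant are all invented
for illustration) of how a load balancer could weigh the balance gained
against a per-interrupt migration cost:

```c
/*
 * Hedged sketch, not actual Xen code: one way a Credit2-style load
 * balancer could weigh the benefit of moving a vCPU against the cost
 * of re-routing its interrupts.  All names, structures and the
 * IRQ_MIGRATION_WEIGHT constant are invented for illustration.
 */
#include <assert.h>
#include <stdbool.h>

struct vcpu_info {
    unsigned int load_pct;        /* this vCPU's share of pCPU load (%) */
    unsigned int nr_routed_irqs;  /* physical IRQs routed to this vCPU */
};

/* Assumed cost, in "load points", of re-routing one interrupt. */
#define IRQ_MIGRATION_WEIGHT 2U

/*
 * Decide whether moving v from a busier src pCPU to dst pays off.
 * Assumes the move actually improves the balance (benefit >= 0).
 */
static bool worth_migrating(const struct vcpu_info *v,
                            unsigned int src_load, unsigned int dst_load)
{
    unsigned int before = src_load - dst_load;       /* imbalance now */
    unsigned int src_after = src_load - v->load_pct;
    unsigned int dst_after = dst_load + v->load_pct;
    unsigned int after = (src_after > dst_after) ? src_after - dst_after
                                                 : dst_after - src_after;
    unsigned int benefit = before - after;

    /* Only migrate if the balance gained beats the per-IRQ move cost. */
    return benefit > IRQ_MIGRATION_WEIGHT * v->nr_routed_irqs;
}
```

With the 75%/25% example above and a 25%-load vCPU, a threshold like
this lets the move go ahead when only a handful of interrupts are
routed to the vCPU, and blocks it when there are hundreds.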

Regards,
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [RFC PATCH 21/24] ARM: vITS: handle INVALL command
  2016-12-06 22:01                                       ` Stefano Stabellini
  2016-12-06 22:12                                         ` Dario Faggioli
@ 2016-12-06 23:13                                         ` Julien Grall
  2016-12-07 20:20                                           ` Stefano Stabellini
  1 sibling, 1 reply; 144+ messages in thread
From: Julien Grall @ 2016-12-06 23:13 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Vijay Kilari, Steve Capper, Andre Przywara, Dario Faggioli,
	george.dunlap, xen-devel

Hi Stefano,

On 06/12/2016 22:01, Stefano Stabellini wrote:
> On Tue, 6 Dec 2016, Stefano Stabellini wrote:
>> moving a vCPU with interrupts assigned to it is slower than moving a
>> vCPU without interrupts assigned to it. You could say that the
>> slowness is directly proportional to the number of interrupts assigned
>> to the vCPU.
>
> To be pedantic, by "assigned" I mean that a physical interrupt is routed
> to a given pCPU and is set to be forwarded to a guest vCPU running on it
> by the _IRQ_GUEST flag. The guest could be dom0. Upon receiving one of
> these physical interrupts, a corresponding virtual interrupt (could be a
> different irq) will be injected into the guest vCPU.
>
> When the vCPU is migrated to a new pCPU, the physical interrupts that
> are configured to be injected as virtual interrupts into the vCPU, are
> migrated with it. The physical interrupt migration has a cost. However,
> receiving physical interrupts on the wrong pCPU has a higher cost.

I don't understand why it is a problem for you to receive the first
interrupt on the wrong pCPU and then move it if necessary.

While this may have a higher cost (I don't believe so) for the first
received interrupt, migrating thousands of interrupts at the same time
is very expensive and will likely keep Xen stuck for a while (think
about an ITS with a single command queue).

Furthermore, the current approach will move every single interrupt
routed to the vCPU, even those that are disabled. That's pointless and
a waste of resources. You may argue that we can skip the disabled
ones, but in that case what would be the benefit of migrating the IRQs
when we migrate the vCPUs?

So I would suggest spreading the work over time. This also means less
headache for the scheduler developers.
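A minimal sketch of this lazy scheme (hypothetical names, not actual
Xen code): the fix-up happens on delivery, one interrupt at a time,
instead of in a big loop at migration time.

```c
/*
 * Hedged sketch, not actual Xen code: lazy re-routing.  Rather than
 * retargeting every interrupt when a vCPU migrates, fix up a single
 * interrupt's affinity only when it actually fires on a pCPU the vCPU
 * no longer runs on.  All structure and function names are invented.
 */
#include <assert.h>

struct hw_irq {
    unsigned int pirq;         /* physical interrupt number */
    unsigned int routed_pcpu;  /* pCPU the hardware currently targets */
};

struct guest_vcpu {
    unsigned int cur_pcpu;     /* pCPU the scheduler placed the vCPU on */
};

/* Stub: a real version would issue one ITS command (for an LPI) or a
 * distributor routing-register write (for an SPI) -- the expensive
 * part that this scheme defers and amortizes. */
static void hw_set_irq_affinity(struct hw_irq *irq, unsigned int pcpu)
{
    irq->routed_pcpu = pcpu;
}

/* Called on interrupt delivery, before injecting the virtual IRQ. */
static void on_irq_received(struct hw_irq *irq, const struct guest_vcpu *v)
{
    if (irq->routed_pcpu != v->cur_pcpu)
        /* First delivery after the vCPU moved: pay the cost once, for
         * this interrupt only, not for every routed interrupt. */
        hw_set_irq_affinity(irq, v->cur_pcpu);
    /* ... inject the corresponding virtual IRQ into the vCPU ... */
}
```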

Regards,

-- 
Julien Grall


* Re: [RFC PATCH 21/24] ARM: vITS: handle INVALL command
  2016-12-06 22:39                                       ` Dario Faggioli
@ 2016-12-06 23:24                                         ` Julien Grall
  2016-12-07  0:17                                           ` Dario Faggioli
  2016-12-07 20:21                                         ` Stefano Stabellini
  1 sibling, 1 reply; 144+ messages in thread
From: Julien Grall @ 2016-12-06 23:24 UTC (permalink / raw)
  To: Dario Faggioli, Stefano Stabellini
  Cc: Andre Przywara, Steve Capper, george.dunlap, Vijay Kilari, xen-devel

Hi Dario,

On 06/12/2016 22:39, Dario Faggioli wrote:
> On Tue, 2016-12-06 at 13:53 -0800, Stefano Stabellini wrote:
>> On Tue, 6 Dec 2016, Dario Faggioli wrote:
>>> Sorry if I can't be more useful than this for now. :-/
>>
>> We don't need scheduler support to implement interrupt migration. The
>> question was much simpler than that: moving a vCPU with interrupts
>> assigned to it is slower than moving a vCPU without interrupts
>> assigned
>> to it. You could say that the slowness is directly proportional to
>> the
>> number of interrupts assigned to the vCPU. Does the scheduler know
>> that?
>> Or blindly moves vCPUs around? Also see below.
>>
> Ah, ok, it is indeed a simpler question than I thought! :-)
>
> Answer: no, the scheduler does not use the information of how many or
> what interrupts are being routed to a vCPU in any way.
>
> Just for the sake of correctness and precision, it does not "blindly
> moves vCPUs around", as in, it follows some criteria for deciding
> whether or not to move a vCPU, and if yes, where to, but among those
> criteria, there is no trace of anything related to routed interrupts.
>
> Let me also add that the criteria are scheduler specific, so they're
> different, e.g., between Credit and Credit2.
>
> Considering routed interrupts as a migration criterion in Credit
> would be rather difficult. Credit uses a 'best effort' approach for
> migrating vCPUs, which is hard to augment.
>
> Considering routed interrupts as a migration criterion in
> Credit2 would be much easier. Credit2's load balancer is specifically
> designed to be extensible with things like that. It would require
> some thinking, though, in order to figure out how important this
> particular aspect would be, compared to the others that are considered.
>
> E.g., if I have pCPU 0 loaded at 75% and pCPU 1 loaded at 25%, vCPU A
> has a lot of routed interrupts, and moving it gives me perfect load
> balancing (i.e., load will become 50% on pCPU 0 and 50% on pCPU 1),
> should I move it or not?
> Well, it depends on whether or not we think that the overhead we save
> by not migrating outweighs the benefit of a perfectly balanced system.
>
> Something like that...

The idea of migrating IRQs while migrating a vCPU already seems wrong
to me, as it is really expensive: I am talking in terms of
milliseconds, easily, since in some cases Xen will have to interact
with a shared command queue.

Spreading the load is much better, and it avoids directly migrating
IRQs that are disabled while the vCPU is being migrated.

I cannot see any reason not to do that, given that a change of
interrupt affinity may not take effect immediately anyway. By that, I
mean it might still be possible to receive another interrupt on the
wrong pCPU.

I really think we should make vCPU migration much simpler (e.g. avoid
this big loop over the interrupts). After all, if we really expect the
scheduler to migrate the vCPU to a different pCPU, we should also
expect that receiving an interrupt on the wrong pCPU will not happen
often.

Regards,

-- 
Julien Grall


* Re: [RFC PATCH 21/24] ARM: vITS: handle INVALL command
  2016-12-06 23:24                                         ` Julien Grall
@ 2016-12-07  0:17                                           ` Dario Faggioli
  0 siblings, 0 replies; 144+ messages in thread
From: Dario Faggioli @ 2016-12-07  0:17 UTC (permalink / raw)
  To: Julien Grall, Stefano Stabellini
  Cc: Andre Przywara, Steve Capper, george.dunlap, Vijay Kilari, xen-devel



On Tue, 2016-12-06 at 23:24 +0000, Julien Grall wrote:
> I really think we should make vCPU migration much simpler (e.g. avoid
> this big loop over the interrupts). After all, if we really expect the
> scheduler to migrate the vCPU to a different pCPU, we should also
> expect that receiving an interrupt on the wrong pCPU will not happen
> often.
> 
This makes sense to me, but as I said, I don't really know. I mean, I
understand what you're explaining, but I hadn't considered this before,
and I don't have any performance figures.

What I hope I managed to explain is that, if we want to take this into
account during migration, as Stefano was asking, there is a way to do
that in Credit2. That's it. I'll be happy to help deal with whatever
you decide among yourselves is best for ARM, if it has scheduling
implications. :-)

And while we're here: even if considering this specific aspect turns
out not to be a good idea, if you (anyone!) have in mind other things
that it could be interesting to take into account when evaluating
whether or not to migrate a vCPU, I'd be interested to hear about them.

After all, the advantage of having our own scheduler (e.g., wrt KVM
that has to use the Linux one), is exactly this, i.e., that we can
focus a lot more on virtualization specific aspects. So, really, I'm
all ears. :-D

Regards,
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)


* Re: [RFC PATCH 21/24] ARM: vITS: handle INVALL command
  2016-12-06 23:13                                         ` Julien Grall
@ 2016-12-07 20:20                                           ` Stefano Stabellini
  2016-12-09 18:01                                             ` Julien Grall
  2016-12-09 18:07                                             ` Andre Przywara
  0 siblings, 2 replies; 144+ messages in thread
From: Stefano Stabellini @ 2016-12-07 20:20 UTC (permalink / raw)
  To: Julien Grall
  Cc: Stefano Stabellini, Vijay Kilari, Steve Capper, Andre Przywara,
	Dario Faggioli, george.dunlap, xen-devel

On Tue, 6 Dec 2016, Julien Grall wrote:
> On 06/12/2016 22:01, Stefano Stabellini wrote:
> > On Tue, 6 Dec 2016, Stefano Stabellini wrote:
> > > moving a vCPU with interrupts assigned to it is slower than moving a
> > > vCPU without interrupts assigned to it. You could say that the
> > > slowness is directly proportional to the number of interrupts assigned
> > > to the vCPU.
> > 
> > To be pedantic, by "assigned" I mean that a physical interrupt is routed
> > to a given pCPU and is set to be forwarded to a guest vCPU running on it
> > by the _IRQ_GUEST flag. The guest could be dom0. Upon receiving one of
> > these physical interrupts, a corresponding virtual interrupt (could be a
> > different irq) will be injected into the guest vCPU.
> > 
> > When the vCPU is migrated to a new pCPU, the physical interrupts that
> > are configured to be injected as virtual interrupts into the vCPU, are
> > migrated with it. The physical interrupt migration has a cost. However,
> > receiving physical interrupts on the wrong pCPU has a higher cost.
> 
> I don't understand why it is a problem for you to receive the first
> interrupt on the wrong pCPU and then move it if necessary.
> 
> While this may have a higher cost (I don't believe so) for the first
> received interrupt, migrating thousands of interrupts at the same time
> is very expensive and will likely keep Xen stuck for a while (think
> about an ITS with a single command queue).
> 
> Furthermore, the current approach will move every single interrupt
> routed to the vCPU, even those that are disabled. That's pointless and
> a waste of resources. You may argue that we can skip the disabled
> ones, but in that case what would be the benefit of migrating the IRQs
> when we migrate the vCPUs?
>
> So I would suggest spreading the work over time. This also means less
> headache for the scheduler developers.

The most important aspect of interrupt handling in Xen is latency,
measured as the time between Xen receiving a physical interrupt and the
guest receiving it. This latency should be both small and deterministic.

We all agree so far, right?


The issue with spreading interrupt migrations over time is that it makes
interrupt latency less deterministic. It is OK, in the uncommon case of
vCPU migration with interrupts, to take a hit for a short time. This
"hit" can be measured. It can be known. If your workload cannot tolerate
it, vCPUs can be pinned. It should be a rare event anyway. On the other
hand, by spreading interrupt migrations out, we make it harder to
predict latency. Aside from determinism, another problem with this
approach is that it ensures that every interrupt assigned to a vCPU will
first hit the wrong pCPU, and only then be moved. It guarantees the
worst-case interrupt latency for the vCPU that has been moved. If we
migrated all interrupts as soon as possible, we would minimize the
number of interrupts delivered to the wrong pCPU. Most interrupts would
be delivered to the new pCPU right away, reducing interrupt latency.

Regardless of how we implement interrupt migrations on ARM, I think it
still makes sense for the scheduler to know about them. I realize that
this is a separate point. Even if we spread interrupt migrations over
time, they still have a cost, in terms of latency as I wrote above, but also
in terms of interactions with interrupt controllers and ITSes. A vCPU
with no interrupts assigned to it poses no such problems. The scheduler
should be aware of the difference. If the scheduler knew, I bet that
vCPU migration would be a rare event for vCPUs that have many interrupts
assigned to them. For example, Dom0 vCPU0 would never be moved, and
dom0_pin_vcpus would be superfluous.
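For contrast with the lazy scheme, the eager approach argued for here
can be sketched as follows (illustrative names and a fixed-size list,
not actual Xen code); the per-interrupt retargeting cost is paid once,
up front, at migration time:

```c
/*
 * Hedged sketch, not actual Xen code: eager migration.  At vCPU
 * migration time, walk every interrupt routed to the vCPU and
 * retarget it immediately, so the first post-migration delivery
 * already lands on the right pCPU.  All names are invented.
 */
#include <assert.h>
#include <stddef.h>

#define MAX_ROUTED_IRQS 32  /* arbitrary bound, for the sketch only */

struct hw_irq {
    unsigned int pirq;
    unsigned int routed_pcpu;
};

struct guest_vcpu {
    unsigned int cur_pcpu;
    size_t nr_irqs;                        /* IRQs routed to this vCPU */
    struct hw_irq *irqs[MAX_ROUTED_IRQS];
};

/* Stub: one ITS command / distributor write per call on real hardware,
 * which is why this loop is the costly part of the eager scheme. */
static void hw_set_irq_affinity(struct hw_irq *irq, unsigned int pcpu)
{
    irq->routed_pcpu = pcpu;
}

/* Retarget everything up front; cost is O(nr_irqs), paid once. */
static void migrate_vcpu(struct guest_vcpu *v, unsigned int new_pcpu)
{
    v->cur_pcpu = new_pcpu;
    for (size_t i = 0; i < v->nr_irqs; i++)
        hw_set_irq_affinity(v->irqs[i], new_pcpu);
}
```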


* Re: [RFC PATCH 21/24] ARM: vITS: handle INVALL command
  2016-12-06 22:39                                       ` Dario Faggioli
  2016-12-06 23:24                                         ` Julien Grall
@ 2016-12-07 20:21                                         ` Stefano Stabellini
  2016-12-09 10:14                                           ` Dario Faggioli
  1 sibling, 1 reply; 144+ messages in thread
From: Stefano Stabellini @ 2016-12-07 20:21 UTC (permalink / raw)
  To: Dario Faggioli
  Cc: Stefano Stabellini, Vijay Kilari, Steve Capper, Andre Przywara,
	george.dunlap, Julien Grall, xen-devel

On Tue, 6 Dec 2016, Dario Faggioli wrote:
> On Tue, 2016-12-06 at 13:53 -0800, Stefano Stabellini wrote:
> > On Tue, 6 Dec 2016, Dario Faggioli wrote:
> > > Sorry if I can't be more useful than this for now. :-/
> > 
> > We don't need scheduler support to implement interrupt migration. The
> > question was much simpler than that: moving a vCPU with interrupts
> > assigned to it is slower than moving a vCPU without interrupts
> > assigned
> > > to it. You could say that the slowness is directly proportional to
> > the
> > number of interrupts assigned to the vCPU. Does the scheduler know
> > that?
> > Or blindly moves vCPUs around? Also see below.
> > 
> Ah, ok, it is indeed a simpler question than I thought! :-)
> 
> Answer: no, the scheduler does not use the information of how many or
> what interrupts are being routed to a vCPU in any way.
> 
> Just for the sake of correctness and precision, it does not "blindly
> moves vCPUs around", as in, it follows some criteria for deciding
> whether or not to move a vCPU, and if yes, where to, but among those
> criteria, there is no trace of anything related to routed interrupts.
> 
> Let me also add that the criteria are scheduler specific, so they're
> different, e.g., between Credit and Credit2.
> 
> Considering routed interrupts as a migration criterion in Credit
> would be rather difficult. Credit uses a 'best effort' approach for
> migrating vCPUs, which is hard to augment.
> 
> Considering routed interrupts as a migration criterion in
> Credit2 would be much easier. Credit2's load balancer is specifically
> designed to be extensible with things like that. It would require
> some thinking, though, in order to figure out how important this
> particular aspect would be, compared to the others that are considered.
> 
> E.g., if I have pCPU 0 loaded at 75% and pCPU 1 loaded at 25%, vCPU A
> has a lot of routed interrupts, and moving it gives me perfect load
> balancing (i.e., load will become 50% on pCPU 0 and 50% on pCPU 1),
> should I move it or not?
> Well, it depends on whether or not we think that the overhead we save
> by not migrating outweighs the benefit of a perfectly balanced system.

Right. I don't know where to draw the line. I don't know how much
weight it should have, but certainly it shouldn't be considered the
same thing as moving any other vCPU.


* Re: [RFC PATCH 21/24] ARM: vITS: handle INVALL command
  2016-12-07 20:21                                         ` Stefano Stabellini
@ 2016-12-09 10:14                                           ` Dario Faggioli
  0 siblings, 0 replies; 144+ messages in thread
From: Dario Faggioli @ 2016-12-09 10:14 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Vijay Kilari, Steve Capper, Andre Przywara, george.dunlap,
	Julien Grall, xen-devel



On Wed, 2016-12-07 at 12:21 -0800, Stefano Stabellini wrote:
> On Tue, 6 Dec 2016, Dario Faggioli wrote:
> > E.g., if I have pCPU 0 loaded at 75% and pCPU 1 loaded at 25%, vCPU
> > A
> > has a lot of routed interrupts, and moving it gives me perfect load
> > balancing (i.e., load will become 50% on pCPU 0 and 50% on pCPU 1)
> > should I move it or not?
> > Well, it depends on whether or not we think that the overhead we
> > save
> > by not migrating outweighs the benefit of a perfectly balanced
> > system.
> 
> Right. I don't know where to draw the line. I don't how much weight
> it
> should have, but certainly it shouldn't be considered the same thing
> as
> moving any other vCPU.
>
Right. As I said, Credit2's load balancer is nice and easy to extend
already -- and it needs to become nicer and easier to extend in order
to deal with soft-affinity, so I'll work on that soon (there are
patches out for soft-affinity which do sort of something like that, but
I'm not entirely satisfied with them, so I'll probably rework that
part).

At that point, I'll be more than happy to consider this, and try to
reason about how much weight it should have. After all, the only thing
we need in order to take this information into account when making
load-balancing decisions is a mechanism for knowing how many of these
routed interrupts a vCPU has, and of course this needs to be:
 - easy to use,
 - super quick (load balancing is a hot path),
 - architecture independent.

Is this the case already? :-)
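One mechanism with those three properties (a sketch with invented
names, not an existing Xen interface) would be a per-vCPU counter
maintained on the route/unroute paths:

```c
/*
 * Hedged sketch, not actual Xen code: a per-vCPU count of routed
 * interrupts, updated where interrupts are routed/unrouted (slow
 * paths), so the load balancer can read it in O(1) on its hot path.
 * Nothing here is architecture specific; all names are invented.
 */
#include <assert.h>
#include <stdatomic.h>

struct vcpu_irq_stats {
    atomic_uint nr_routed;  /* physical IRQs currently routed here */
};

/* Called from the (slow) route/unroute paths. */
static void stats_irq_routed(struct vcpu_irq_stats *s)
{
    atomic_fetch_add_explicit(&s->nr_routed, 1, memory_order_relaxed);
}

static void stats_irq_unrouted(struct vcpu_irq_stats *s)
{
    atomic_fetch_sub_explicit(&s->nr_routed, 1, memory_order_relaxed);
}

/* Called from the (hot) load-balancing path: a single relaxed load,
 * no lock taken. */
static unsigned int stats_irq_count(struct vcpu_irq_stats *s)
{
    return atomic_load_explicit(&s->nr_routed, memory_order_relaxed);
}
```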

Thanks and Regards,
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)


* Re: [RFC PATCH 21/24] ARM: vITS: handle INVALL command
  2016-12-07 20:20                                           ` Stefano Stabellini
@ 2016-12-09 18:01                                             ` Julien Grall
  2016-12-09 20:13                                               ` Stefano Stabellini
  2016-12-09 18:07                                             ` Andre Przywara
  1 sibling, 1 reply; 144+ messages in thread
From: Julien Grall @ 2016-12-09 18:01 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Vijay Kilari, Steve Capper, Andre Przywara, Dario Faggioli,
	george.dunlap, xen-devel

Hi Stefano,

On 07/12/16 20:20, Stefano Stabellini wrote:
> On Tue, 6 Dec 2016, Julien Grall wrote:
>> On 06/12/2016 22:01, Stefano Stabellini wrote:
>>> On Tue, 6 Dec 2016, Stefano Stabellini wrote:
>>>> moving a vCPU with interrupts assigned to it is slower than moving a
>>>> vCPU without interrupts assigned to it. You could say that the
>>>> slowness is directly proportional to the number of interrupts assigned
>>>> to the vCPU.
>>>
>>> To be pedantic, by "assigned" I mean that a physical interrupt is routed
>>> to a given pCPU and is set to be forwarded to a guest vCPU running on it
>>> by the _IRQ_GUEST flag. The guest could be dom0. Upon receiving one of
>>> these physical interrupts, a corresponding virtual interrupt (could be a
>>> different irq) will be injected into the guest vCPU.
>>>
>>> When the vCPU is migrated to a new pCPU, the physical interrupts that
>>> are configured to be injected as virtual interrupts into the vCPU, are
>>> migrated with it. The physical interrupt migration has a cost. However,
>>> receiving physical interrupts on the wrong pCPU has a higher cost.
>>
>> I don't understand why it is a problem for you to receive the first
>> interrupt on the wrong pCPU and then move it if necessary.
>>
>> While this may have a higher cost (I don't believe so) for the first
>> received interrupt, migrating thousands of interrupts at the same time
>> is very expensive and will likely keep Xen stuck for a while (think
>> about an ITS with a single command queue).
>>
>> Furthermore, the current approach will move every single interrupt
>> routed to the vCPU, even those that are disabled. That's pointless and
>> a waste of resources. You may argue that we can skip the disabled
>> ones, but in that case what would be the benefit of migrating the IRQs
>> when we migrate the vCPUs?
>>
>> So I would suggest spreading the work over time. This also means less
>> headache for the scheduler developers.
>
> The most important aspect of interrupt handling in Xen is latency,
> measured as the time between Xen receiving a physical interrupt and the
> guest receiving it. This latency should be both small and deterministic.
>
> We all agree so far, right?
>
>
> The issue with spreading interrupt migrations over time is that it
> makes interrupt latency less deterministic. It is OK, in the uncommon
> case of vCPU migration with interrupts, to take a hit for a short
> time. This "hit" can be measured. It can be known. If your workload
> cannot tolerate it, vCPUs can be pinned. It should be a rare event
> anyway. On the other hand, by spreading interrupt migrations out, we
> make it harder to predict latency. Aside from determinism, another
> problem with this approach is that it ensures that every interrupt
> assigned to a vCPU will first hit the wrong pCPU, and only then be
> moved. It guarantees the worst-case interrupt latency for the vCPU
> that has been moved. If we migrated all interrupts as soon as
> possible, we would minimize the number of interrupts delivered to the
> wrong pCPU. Most interrupts would be delivered to the new pCPU right
> away, reducing interrupt latency.

Migrating all the interrupts can be really expensive because, in the
current state, we have to go through every single interrupt and check
whether it has been routed to this vCPU. We would also re-route
disabled interrupts, which seems really pointless; this may need some
optimization.

With an ITS, we may have thousands of interrupts routed to a vCPU.
This means that for every interrupt we have to issue a command in the
host ITS queue. You will likely fill up the command queue and add much
more latency.

Even if you consider vCPU migration to be a rare case, you could still
get the pCPU stuck for tens of milliseconds, the time it takes to
migrate everything. I don't think this is acceptable.

Anyway, I would like to see measurements in both situations before
deciding when LPIs will be migrated.

> Regardless of how we implement interrupt migrations on ARM, I think it
> still makes sense for the scheduler to know about them. I realize that
> this is a separate point. Even if we spread interrupt migrations over
> time, they still have a cost, in terms of latency as I wrote above, but also
> in terms of interactions with interrupt controllers and ITSes. A vCPU
> with no interrupts assigned to it poses no such problems. The scheduler
> should be aware of the difference. If the scheduler knew, I bet that
> vCPU migration would be a rare event for vCPUs that have many interrupts
> assigned to them. For example, Dom0 vCPU0 would never be moved, and
> dom0_pin_vcpus would be superfluous.

The number of interrupts routed to a vCPU will vary over time,
depending on what the guest decides to do, so you need the scheduler to
adapt. In the end, you would give the guest a chance to "control" the
scheduler depending on how the interrupts are spread between vCPUs.

If the number increases, you may end up with the scheduler deciding not
to migrate the vCPU because it would be too expensive. But you may have
a situation where migrating a vCPU with many interrupts is the only
possible choice, and then you will slow down the platform.

Cheers,

-- 
Julien Grall


* Re: [RFC PATCH 21/24] ARM: vITS: handle INVALL command
  2016-12-07 20:20                                           ` Stefano Stabellini
  2016-12-09 18:01                                             ` Julien Grall
@ 2016-12-09 18:07                                             ` Andre Przywara
  2016-12-09 20:18                                               ` Stefano Stabellini
  1 sibling, 1 reply; 144+ messages in thread
From: Andre Przywara @ 2016-12-09 18:07 UTC (permalink / raw)
  To: Stefano Stabellini, Julien Grall
  Cc: xen-devel, Dario Faggioli, george.dunlap, Vijay Kilari, Steve Capper

Hi,

On 07/12/16 20:20, Stefano Stabellini wrote:
> On Tue, 6 Dec 2016, Julien Grall wrote:
>> On 06/12/2016 22:01, Stefano Stabellini wrote:
>>> On Tue, 6 Dec 2016, Stefano Stabellini wrote:
>>>> moving a vCPU with interrupts assigned to it is slower than moving a
>>>> vCPU without interrupts assigned to it. You could say that the
>>>> slowness is directly proportional to the number of interrupts assigned
>>>> to the vCPU.
>>>
>>> To be pedantic, by "assigned" I mean that a physical interrupt is routed
>>> to a given pCPU and is set to be forwarded to a guest vCPU running on it
>>> by the _IRQ_GUEST flag. The guest could be dom0. Upon receiving one of
>>> these physical interrupts, a corresponding virtual interrupt (could be a
>>> different irq) will be injected into the guest vCPU.
>>>
>>> When the vCPU is migrated to a new pCPU, the physical interrupts that
>>> are configured to be injected as virtual interrupts into the vCPU, are
>>> migrated with it. The physical interrupt migration has a cost. However,
>>> receiving physical interrupts on the wrong pCPU has a higher cost.
>>
>> I don't understand why it is a problem for you to receive the first
>> interrupt on the wrong pCPU and then move it if necessary.
>>
>> While this may have a higher cost (I don't believe so) for the first
>> received interrupt, migrating thousands of interrupts at the same time
>> is very expensive and will likely keep Xen stuck for a while (think
>> about an ITS with a single command queue).
>>
>> Furthermore, the current approach will move every single interrupt
>> routed to the vCPU, even those that are disabled. That's pointless and
>> a waste of resources. You may argue that we can skip the disabled
>> ones, but in that case what would be the benefit of migrating the IRQs
>> when we migrate the vCPUs?
>>
>> So I would suggest spreading the work over time. This also means less
>> headache for the scheduler developers.
> 
> The most important aspect of interrupt handling in Xen is latency,
> measured as the time between Xen receiving a physical interrupt and the
> guest receiving it. This latency should be both small and deterministic.
> 
> We all agree so far, right?
> 
> 
> The issue with spreading interrupt migrations over time is that it
> makes interrupt latency less deterministic. It is OK, in the uncommon
> case of vCPU migration with interrupts, to take a hit for a short
> time. This "hit" can be measured. It can be known. If your workload
> cannot tolerate it, vCPUs can be pinned. It should be a rare event
> anyway. On the other hand, by spreading interrupt migrations out, we
> make it harder to predict latency. Aside from determinism, another
> problem with this approach is that it ensures that every interrupt
> assigned to a vCPU will first hit the wrong pCPU, and only then be
> moved. It guarantees the worst-case interrupt latency for the vCPU
> that has been moved. If we migrated all interrupts as soon as
> possible, we would minimize the number of interrupts delivered to the
> wrong pCPU. Most interrupts would be delivered to the new pCPU right
> away, reducing interrupt latency.

So if this is such a crucial issue, why don't we use the ITS for good
this time? The ITS hardware probably supports 16 bits worth of
collection IDs, so what about assigning each VCPU (in every guest) a
unique collection ID on the host and doing a MAPC & MOVALL on a VCPU
migration to let it point to the right physical redistributor?
I see that this does not cover all use cases (> 65536 VCPUs, for
instance), and it also depends a lot on implementation details:
- How costly is a MOVALL? It needs to scan the pending table and
transfer set bits to the other redistributor, which may take a while.
- Is there an impact if we exceed the number of hardware-backed
collections (GITS_TYPER.HCC)? If the ITS is forced to access system
memory for every table lookup, this may slow down everyday operations.
- How likely are those misdirected interrupts in the first place? How
often do we migrate VCPUs compared to the interrupt frequency?

There are more subtle parameters to consider, so I guess we just need
to try and measure.
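For illustration, the per-vCPU-collection idea boils down to two ITS
commands per migration, independent of how many LPIs target the vCPU.
The sketch below is not Xen code; the field placement follows my
reading of the GICv3 architecture spec (MAPC = 0x09, MOVALL = 0x0e),
so treat it as illustrative rather than authoritative. `rdbase` is
assumed to already be in the form the ITS expects (address bits or a
PE number, depending on GITS_TYPER.PTA).

```c
/*
 * Hedged sketch of the MAPC + MOVALL scheme (not actual Xen code).
 * ITS commands are 32-byte blocks written to the command queue.
 */
#include <assert.h>
#include <stdint.h>

struct its_cmd {
    uint64_t dw[4];  /* one 32-byte ITS command */
};

/* MAPC: (re)map collection `icid` to the redistributor `rdbase`. */
static void its_build_mapc(struct its_cmd *c, uint16_t icid, uint64_t rdbase)
{
    c->dw[0] = 0x09;                                  /* opcode */
    c->dw[1] = 0;
    c->dw[2] = (1ULL << 63) | (rdbase << 16) | icid;  /* V=1 */
    c->dw[3] = 0;
}

/* MOVALL: move all pending interrupts from rdbase1 to rdbase2. */
static void its_build_movall(struct its_cmd *c, uint64_t rdbase1,
                             uint64_t rdbase2)
{
    c->dw[0] = 0x0e;
    c->dw[1] = 0;
    c->dw[2] = rdbase1 << 16;
    c->dw[3] = rdbase2 << 16;
}

/*
 * On vCPU migration: remap the vCPU's private collection and move its
 * pending state -- two commands total, however many LPIs it has.
 */
static void vcpu_collection_migrate(struct its_cmd cmds[2], uint16_t icid,
                                    uint64_t old_rd, uint64_t new_rd)
{
    its_build_mapc(&cmds[0], icid, new_rd);
    its_build_movall(&cmds[1], old_rd, new_rd);
    /* A real implementation would follow with SYNC on both targets. */
}
```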

> Regardless of how we implement interrupt migrations on ARM, I think it
> still makes sense for the scheduler to know about them. I realize that
> this is a separate point. Even if we spread interrupt migrations over
> time, they still have a cost, in terms of latency as I wrote above, but also
> in terms of interactions with interrupt controllers and ITSes. A vCPU
> with no interrupts assigned to it poses no such problems. The scheduler
> should be aware of the difference. If the scheduler knew, I bet that
> vCPU migration would be a rare event for vCPUs that have many interrupts
> assigned to them. For example, Dom0 vCPU0 would never be moved, and
> dom0_pin_vcpus would be superfluous.

That's a good point, so indeed the "interrupt load" should be a
scheduler parameter. But as you said: that's a different story.

Cheers,
Andre.


* Re: [RFC PATCH 21/24] ARM: vITS: handle INVALL command
  2016-12-03  0:46                         ` Stefano Stabellini
  2016-12-05 13:36                           ` Julien Grall
@ 2016-12-09 19:00                           ` Andre Przywara
  2016-12-10  0:30                             ` Stefano Stabellini
  1 sibling, 1 reply; 144+ messages in thread
From: Andre Przywara @ 2016-12-09 19:00 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: xen-devel, Julien Grall, Vijay Kilari, Steve Capper

On 03/12/16 00:46, Stefano Stabellini wrote:
> On Fri, 2 Dec 2016, Andre Przywara wrote:
>> Hi,

Hi Stefano,

I started to answer this email some days ago, but then spent some time
on actually implementing what I suggested, hence the delay ...

>>
>> sorry for chiming in late ....
>>
>> I've been spending some time thinking about this, and I think we can in
>> fact get away without ever propagating command from domains to the host.
>>
>> I made a list of all commands that possibly require host ITS command
>> propagation. There are two groups:
>> 1: enabling/disabling LPIs: INV, INVALL
>> 2: mapping/unmapping devices/events: DISCARD, MAPD, MAPTI.
>>
>> The second group can be handled by mapping all required devices up
>> front, I will elaborate on that in a different email.
>>
>> For the first group, read below ...
>>
>> On 01/12/16 01:19, Stefano Stabellini wrote:
>>> On Fri, 25 Nov 2016, Julien Grall wrote:
>>>> Hi,
>>>>
>>>> On 18/11/16 18:39, Stefano Stabellini wrote:
>>>>> On Fri, 11 Nov 2016, Stefano Stabellini wrote:
>>>>>> On Fri, 11 Nov 2016, Julien Grall wrote:
>>>>>>> On 10/11/16 20:42, Stefano Stabellini wrote:
>>>>>>> That's why the approach we had in the previous series was "host ITS
>>>>>>> command
>>>>>>> should be limited when emulating guest ITS commands". From my recall, in
>>>>>>> that
>>>>>>> series the host and guest LPIs were fully separated (enabling a guest
>>>>>>> LPI was
>>>>>>> not enabling host LPIs).
>>>>>>
>>>>>> I am interested in reading what Ian suggested to do when the physical
>>>>>> ITS queue is full, but I cannot find anything specific about it in the
>>>>>> doc.
>>>>>>
>>>>>> Do you have a suggestion for this?
>>>>>>
>>>>>> The only things that come to mind right now are:
>>>>>>
>>>>>> 1) check if the ITS queue is full and busy loop until it is not (spin_lock
>>>>>> style)
>>>>>> 2) check if the ITS queue is full and sleep until it is not (mutex style)
>>>>>
>>>>> Another, probably better idea, is to map all pLPIs of a device when the
>>>>> device is assigned to a guest (including Dom0). This is what was written
>>>>> in Ian's design doc. The advantage of this approach is that Xen doesn't
>>>>> need to take any actions on the physical ITS command queue when the
>>>>> guest issues virtual ITS commands, therefore completely solving this
>>>>> problem at the root. (Although I am not sure about enable/disable
>>>>> commands: could we avoid issuing enable/disable on pLPIs?)
>>>>
>>>> In the previous design document (see [1]), the pLPIs are enabled when the
>>>> device is assigned to the guest. This means that it is not necessary to send
>>>> commands there. This also means we may receive a pLPI before the associated
>>>> vLPI has been configured.
>>>>
>>>> That said, given that LPIs are edge-triggered, there is no deactivate state
>>>> (see 4.1 in ARM IHI 0069C). So as soon as the priority drop is done, the same
>>>> LPIs could potentially be raised again. This could generate a storm.
>>>
>>> Thank you for raising this important point. You are correct.
>>>
>>>> The priority drop is necessary if we don't want to block the reception of
>>>> interrupt for the current physical CPU.
>>>>
>>>> What I am more concerned about is this problem can also happen in normal
>>>> running (i.e the pLPI is bound to an vLPI and the vLPI is enabled). For
>>>> edge-triggered interrupt, there is no way to prevent them to fire again. Maybe
>>>> it is time to introduce rate-limit interrupt for ARM. Any opinions?
>>>
>>> Yes. It could be as simple as disabling the pLPI when Xen receives a
>>> second pLPI before the guest EOIs the first corresponding vLPI, which
>>> shouldn't happen in normal circumstances.
>>>
>>> We need a simple per-LPI inflight counter, incremented when a pLPI is
>>> received, decremented when the corresponding vLPI is EOIed (the LR is
>>> cleared).
>>>
>>> When the counter > 1, we disable the pLPI and request a maintenance
>>> interrupt for the corresponding vLPI.
>>
>> So why do we need a _counter_? This is about edge triggered interrupts,
>> I think we can just accumulate all of them into one.
> 
> The counter is not to re-inject the same amount of interrupts into the
> guest, but to detect interrupt storms.

I was wondering if an interrupt "storm" could already be defined by
"receiving an LPI while there is already one pending (in the guest's
virtual pending table) and it being disabled by the guest". I admit that
declaring two interrupts as a storm is a bit of a stretch, but in fact
the guest probably had a reason for disabling it even though it
fires, so Xen should just follow suit.
The only difference is that we don't do it _immediately_ when the guest
tells us (via INV), but only if needed (LPI actually fires).
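
The condition described above can be sketched in a few lines of C. This is a hypothetical illustration, not actual Xen code: the struct fields and the handle_plpi() helper are made up, standing in for bits that would really live in the guest's property and virtual pending tables.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-vLPI state; in Xen these bits would live in the guest's
 * property table (enabled) and virtual pending table (pending). */
struct vlpi_state {
    bool enabled;        /* bit 0 of the guest's property table entry */
    bool pending;        /* bit in the guest's virtual pending table */
    bool plpi_disabled;  /* whether Xen has disabled the physical LPI */
};

/*
 * Called when a physical LPI fires; returns true if the vLPI should be
 * injected. "Storm" is the condition described above: the LPI fires while
 * it is already pending and still disabled by the guest.
 */
static bool handle_plpi(struct vlpi_state *s)
{
    if (s->enabled)
        return true;              /* normal case: inject right away */

    if (s->pending) {
        s->plpi_disabled = true;  /* second hit while masked: treat as storm */
        return false;
    }

    s->pending = true;            /* buffer it for a later INV/INVALL */
    return false;
}
```

The "storm" branch is only reached on the second firing of a guest-disabled LPI, which matches the "not immediately on INV, only if the LPI actually fires" behaviour described above.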

>> So here is what I think:
>> - We use the guest provided pending table to hold a pending bit for each
>> VLPI. We can unmap the memory from the guest, since software is not
>> supposed to access this table as per the spec.
>> - We use the guest provided property table, without trapping it. There
>> is nothing to be "validated" in that table, since it's a really tight
>> encoding and every value written in there is legal. We only look at bit
>> 0 for this exercise here anyway.
> 
> I am following...
> 
> 
>> - Upon reception of a physical LPI, we look it up to find the VCPU and
>> virtual LPI number. This is what we need to do anyway and it's a quick
>> two-level table lookup at the moment.
>> - We use the VCPU's domain and the VLPI number to index the guest's
>> property table and read the enabled bit. Again a quick table lookup.
> 
> They should be both O(2), correct?

The second is even O(1). And even the first one could be a single table,
if desperately needed.
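
The two-level pLPI lookup mentioned here, with one uint64_t per allocated host LPI holding the target vCPU and the vLPI number, could look roughly like this. All names, sizes and the bit encoding are assumptions for illustration, not the actual Xen implementation:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define HLPI_ENTRIES_PER_PAGE 512           /* 4K page / 8 bytes per entry */
#define HLPI_L1_SIZE          1024          /* covers 512K host LPIs */

/* First level: pointers to on-demand allocated pages of entries. */
static uint64_t *hlpi_l1[HLPI_L1_SIZE];

/* Hypothetical encoding: vLPI number in the low 32 bits, vCPU ID above. */
static uint64_t make_entry(uint32_t vcpu, uint32_t vlpi)
{
    return ((uint64_t)vcpu << 32) | vlpi;
}

/* O(1) lookup: one indirection plus one array index. */
static int hlpi_lookup(uint32_t plpi, uint32_t *vcpu, uint32_t *vlpi)
{
    uint64_t *page, entry;

    if (plpi / HLPI_ENTRIES_PER_PAGE >= HLPI_L1_SIZE)
        return -1;
    page = hlpi_l1[plpi / HLPI_ENTRIES_PER_PAGE];
    if (!page)
        return -1;                          /* no LPIs mapped in this range */
    entry = page[plpi % HLPI_ENTRIES_PER_PAGE];
    if (!entry)
        return -1;                          /* unmapped host LPI: discard */
    *vcpu = entry >> 32;
    *vlpi = (uint32_t)entry;
    return 0;
}
```

The second level is only allocated for ranges that actually contain mapped LPIs, which keeps the structure memory efficient even with a sparse LPI space.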

>>  - If the VLPI is enabled, we EOI it on the host and inject it.
>>  - If the VLPI is disabled, we set the pending bit in the VCPU's
>>    pending table and EOI on the host - to allow other IRQs.
>> - On a guest INV command, we check whether that vLPI is now enabled:
>>  - If it is disabled now, we don't need to do anything.
>>  - If it is enabled now, we check the pending bit for that VLPI:
>>   - If it is 0, we don't do anything.
>>   - If it is 1, we inject the VLPI and clear the pending bit.
>> - On a guest INVALL command, we just need to iterate over the virtual
>> LPIs.
> 
> Right, much better.
> 
> 
>> If you look at the conditions above, the only immediate action is
>> when a VLPI gets enabled _and_ its pending bit is set. So we can do
>> 64-bit read accesses over the whole pending table to find non-zero words
>> and thus set bits, which should be rare in practice. We can store the
>> highest mapped VLPI to avoid iterating over the whole of the table.
>> Ideally the guest has no direct control over the pending bits, since
>> this is what the device generates. Also we limit the number of VLPIs in
>> total per guest anyway.
> 
> I wonder if we could even use a fully packed bitmask with only the
> pending bits, so 1 bit per vLPI, rather than 1 byte per vLPI. That would
> be a nice improvement.

The _pending_ table is exactly that: one bit per VLPI. So by doing a
64-bit read we cover 64 VLPIs. And normally if an LPI fires it will
probably be enabled (otherwise the guest would have disabled it in the
device), so we inject it and don't need this table. It's
really just for storing the pending status should an LPI arrive while
the guest had _disabled_ it. I assume this is rather rare, so the table
will mostly be empty: that's why I expect most reads to be 0 and the
iteration of the table to be very quick. As an additional optimization
we could store the highest and lowest virtually pending LPI, to avoid
scanning the whole table.

We can't do so much about the property table, though, because its layout
is described in the spec - in contrast to the ITS tables, which are IMPDEF.
But as we only need to do something if the LPI is _both_ enabled _and_
pending, scanning the pending table gives us a quite good filter
already. For the few LPIs that hit here, we can just access the right
byte in the property table.
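
The INVALL scan described above (64-bit reads over the pending table, consulting the byte-per-LPI property table only for bits that are actually set) can be sketched as follows. The injection hook and table layout in memory are hypothetical; only the scan strategy mirrors the description:

```c
#include <assert.h>
#include <stdint.h>

#define LPI_ENABLE_BIT 0x1   /* bit 0 of a property table byte: LPI enabled */

/* Hypothetical injection hook; the real code would queue the vLPI. */
static uint32_t injected[64];
static int ninjected;
static void inject_vlpi(uint32_t vlpi) { injected[ninjected++] = vlpi; }

/*
 * INVALL sketch: scan the virtual pending table in 64-bit chunks and only
 * consult the property table for vLPIs that are actually pending.
 */
static void invall_scan(uint64_t *pending, const uint8_t *prop,
                        uint32_t max_vlpis)
{
    for (uint32_t w = 0; w < max_vlpis / 64; w++) {
        uint64_t word = pending[w];

        if (!word)          /* the common case: nothing pending here */
            continue;
        for (int b = 0; b < 64; b++) {
            uint32_t vlpi = w * 64 + b;

            if (!(word & (1ULL << b)))
                continue;
            if (prop[vlpi] & LPI_ENABLE_BIT) {
                pending[w] &= ~(1ULL << b);   /* clear pending, then inject */
                inject_vlpi(vlpi);
            }
        }
    }
}
```

Since the pending table is expected to be mostly zero, the scan is dominated by the cheap 64-bit reads; only enabled-and-pending vLPIs trigger any further work.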

>> If that still sounds like a DOS vector, we could additionally rate-limit
>> INVALLs, and/or track additions to the pending table after the last
>> INVALL: if there haven't been any new pending bits since the last scan,
>> INVALL is a NOP.
>>
>> Does that makes sense so far?
> 
> It makes sense. It should be OK.
> 
> 
>> So that just leaves us with this IRQ storm issue, which I am thinking
>> about now. But I guess this is not a show stopper given we can disable
>> the physical LPI if we sense this situation.
> 
> That is true and it's exactly what we should do.
> 
> 
>>> When we receive the maintenance interrupt and we clear the LR of the
>>> vLPI, Xen should re-enable the pLPI.

So I was wondering why you would need to wait for the guest to actually
EOI it?
Can't we end the interrupt storm condition at the moment the guest
enables the interrupt? LPIs are edge triggered and so a storm in the
past is easily merged into a single guest LPI once the guest enables it
again. From there on we inject every triggered LPI into the guest.

This special handling for the interrupt storm just stems from the fact
that we have to keep LPIs enabled at the h/w interrupt controller level,
despite the guest having disabled them on its own _virtual_ GIC. So once
the guest enables them again, we are in line with the current GICv2/GICv3,
aren't we? Do we have interrupt storm detection/prevention at the moment?
And please keep in mind that LPIs are _always_ edge triggered: So once
we EOI an LPI on the host, this "very same" LPI is gone from a h/w GIC
point of view, the next incoming interrupt must have been triggered by a
new interrupt condition in the device (new network packet, for
instance). In contrast to GICv2 this applies to _every_ LPI.

So I am not sure we should really care _too_ much about this (apart from
the "guest has disabled it" part): Once we assign a device to a guest,
we lose some control over the machine anyway and at least trust the
device to not completely block the system. I don't see how the ITS
differs in that respect from the GICv3/GICv2.


A quick update on my side: I implemented the scheme I described in my
earlier mail now and it boots to the Dom0 prompt on a fastmodel with an ITS:
- On receiving the PHYSDEVOP_manage_pci_add hypercall in
xen/arch/arm/physdev.c, we MAPD the device on the host, MAPTI a bunch of
interrupts and enable them. We keep them unassigned in our host
pLPI->VLPI table, so we discard them should they fire.
This hypercall is issued by Dom0 Linux before bringing up any PCI
devices, so it works even for Dom0 without any Linux changes. For DomUs
with PCI passthrough Dom0 is expected to issue this hypercall on behalf
of the to-be-created domain.
- When a guest (be it Dom0 or DomU) actually maps an LPI (MAPTI), we
just enter the virtual LPI number and the target VCPU in our pLPI-vLPI
table and are done. Should it fire now, we know where to inject it, but
refer to the enabled bit in the guest's property table before doing so.
- When a guest (be it Dom0 or DomU) enables or disables an interrupt, we
don't do much, as we refer to the enable bit every time we want to
inject already. The only thing I actually do is to inject an LPI if
there is a virtual LPI pending and the LPI is now enabled.
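
The INV handling in the last bullet is small enough to sketch directly. Again this is illustrative pseudo-Xen in C with hypothetical names; in reality the two booleans are bits in the guest's property and pending tables:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-vLPI view of the guest's tables. */
struct vlpi {
    bool enabled;   /* bit 0 of the property table byte */
    bool pending;   /* bit in the virtual pending table */
};

static int last_injected = -1;
static void inject_vlpi(int vlpi) { last_injected = vlpi; }

/*
 * Guest INV handling as described above: re-read the enable bit and only
 * act if an interrupt was buffered while the vLPI was disabled.
 */
static void handle_inv(struct vlpi *v, int vlpi)
{
    if (v->enabled && v->pending) {
        v->pending = false;
        inject_vlpi(vlpi);
    }
    /* disabled, or enabled with nothing pending: nothing to do */
}
```

Every other enable/disable transition is a no-op for Xen, because the enable bit is consulted afresh on each injection anyway.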

I will spend some time next week on updating the design document,
describing the new approach. I hope it becomes a bit clearer then.

Cheers,
Andre.

>>> Given that the state of the LRs is sync'ed before calling gic_interrupt,
>>> we can be sure to know exactly in what state the vLPI is at any given
>>> time. But for this to work correctly, it is important to configure the
>>> pLPI to be delivered to the same pCPU running the vCPU which handles
>>> the vLPI (as it is already the case today for SPIs).
>>
>> Why would that be necessary?
> 
> Because the state of the LRs of other pCPUs won't be up to date: we
> wouldn't know for sure whether the guest EOI'ed the vLPI or not.


* Re: [RFC PATCH 21/24] ARM: vITS: handle INVALL command
  2016-12-09 18:01                                             ` Julien Grall
@ 2016-12-09 20:13                                               ` Stefano Stabellini
  0 siblings, 0 replies; 144+ messages in thread
From: Stefano Stabellini @ 2016-12-09 20:13 UTC (permalink / raw)
  To: Julien Grall
  Cc: Stefano Stabellini, Vijay Kilari, Steve Capper, Andre Przywara,
	Dario Faggioli, george.dunlap, xen-devel

On Fri, 9 Dec 2016, Julien Grall wrote:
> Hi Stefano,
> 
> On 07/12/16 20:20, Stefano Stabellini wrote:
> > On Tue, 6 Dec 2016, Julien Grall wrote:
> > > On 06/12/2016 22:01, Stefano Stabellini wrote:
> > > > On Tue, 6 Dec 2016, Stefano Stabellini wrote:
> > > > > moving a vCPU with interrupts assigned to it is slower than moving a
> > > > > vCPU without interrupts assigned to it. You could say that the
> > > > > slowness is directly proportional to the number of interrupts assigned
> > > > > to the vCPU.
> > > > 
> > > > To be pedantic, by "assigned" I mean that a physical interrupt is routed
> > > > to a given pCPU and is set to be forwarded to a guest vCPU running on it
> > > > by the _IRQ_GUEST flag. The guest could be dom0. Upon receiving one of
> > > > these physical interrupts, a corresponding virtual interrupt (could be a
> > > > different irq) will be injected into the guest vCPU.
> > > > 
> > > > When the vCPU is migrated to a new pCPU, the physical interrupts that
> > > > are configured to be injected as virtual interrupts into the vCPU, are
> > > > migrated with it. The physical interrupt migration has a cost. However,
> > > > receiving physical interrupts on the wrong pCPU has a higher cost.
> > > 
> > > I don't understand why it is a problem for you to receive the first
> > > interrupt
> > > to the wrong pCPU and moving it if necessary.
> > > 
> > > While this may have a higher cost (I don't believe so) on the first
> > > received
> > > interrupt, migrating thousands of interrupts at the same time is very
> > > expensive and will likely get Xen stuck for a while (think about ITS with
> > > a
> > > single command queue).
> > > 
> > > Furthermore, the current approach will move every single interrupt routed
> > > to
> > > the vCPU, even those disabled. That's pointless and a waste of resources.
> > > You
> > > may argue that we can skip the ones disabled, but in that case what would
> > > be
> > > the benefit of migrating the IRQs while migrating the vCPUs?
> > > 
> > > So I would suggest to spread it over the time. This also means less
> > > headache
> > > for the scheduler developers.
> > 
> > The most important aspect of interrupts handling in Xen is latency,
> > measured as the time between Xen receiving a physical interrupt and the
> > guest receiving it. This latency should be both small and deterministic.
> > 
> > We all agree so far, right?
> > 
> > 
> > The issue with spreading interrupts migrations over time is that it makes
> > interrupt latency less deterministic. It is OK, in the uncommon case of
> > vCPU migration with interrupts, to take a hit for a short time.  This
> > "hit" can be measured. It can be known. If your workload cannot tolerate
> > it, vCPUs can be pinned. It should be a rare event anyway.  On the other
> > hand, by spreading interrupts migrations, we make it harder to predict
> > latency. Aside from determinism, another problem with this approach is
> > that it ensures that every interrupt assigned to a vCPU will first hit
> > the wrong pCPU, then it will be moved.  It guarantees the worst-case
> > scenario for interrupt latency for the vCPU that has been moved. If we
> > migrated all interrupts as soon as possible, we would minimize the
> > amount of interrupts delivered to the wrong pCPU. Most interrupts would
> > be delivered to the new pCPU right away, reducing interrupt latency.
> 
> Migrating all the interrupts can be really expensive because in the current
> state we have to go through every single interrupt and check whether the
> interrupt has been routed to this vCPU. We will also route disabled interrupts.
> And this seems really pointless. This may need some optimization here.

Indeed, that should be fixed.


> With ITS, we may have thousands of interrupts routed to a vCPU. This means that
> for every interrupt we have to issue a command in the host ITS queue. You will
> likely fill up the command queue and add much more latency.
> 
> Even if you consider the vCPU migration to be a rare case. You could still get
> the pCPU stuck for tens of milliseconds, the time to migrate everything. And I
> don't think this is acceptable.
[...]
> If the number increases, you may end up having the scheduler decide not to
> migrate the vCPU because it will be too expensive. But you may have a
> situation where migrating a vCPU with many interrupts is the only possible
> choice and you will slow down the platform.

A vCPU with thousands of interrupts routed to it is the case where I
would push back to the scheduler. It should know that moving the vCPU
would be very costly.

Regardless, we need to figure out a way to move the interrupts without
"blocking" the platform for long. In practice, we might find a
threshold: a number of active interrupts above which we cannot move them
all at once anymore. Something like: we move the first 500 active
interrupts immediately, we delay the rest. We can find this threshold
only with practical measurements.
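
The threshold idea above can be sketched as a simple bounded loop. The batch size, the struct and the helpers are all hypothetical placeholders; only the policy (migrate up to N enabled interrupts eagerly, defer the rest) reflects the suggestion:

```c
#include <assert.h>
#include <stdbool.h>

#define MIGRATE_BATCH 500   /* threshold suggested above; needs measurement */

struct irq {
    bool routed_to_vcpu;    /* _IRQ_GUEST-style routing to this vCPU */
    bool enabled;
    bool deferred;          /* to be moved lazily, e.g. when it next fires */
};

static int moved;
static void move_irq_now(struct irq *i) { (void)i; moved++; }

/*
 * Migrate up to MIGRATE_BATCH enabled interrupts immediately and mark
 * the remainder to be moved lazily.
 */
static void migrate_vcpu_irqs(struct irq *irqs, int n)
{
    int budget = MIGRATE_BATCH;

    for (int i = 0; i < n; i++) {
        if (!irqs[i].routed_to_vcpu || !irqs[i].enabled)
            continue;       /* skip disabled IRQs: no point moving them */
        if (budget > 0) {
            move_irq_now(&irqs[i]);
            budget--;
        } else {
            irqs[i].deferred = true;
        }
    }
}
```

The right value of MIGRATE_BATCH is exactly what the measurements discussed here would have to establish.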


> Anyway, I would like to see measurement in both situation before deciding when
> LPIs will be migrated.

Yes, let's be scientific about this.


* Re: [RFC PATCH 21/24] ARM: vITS: handle INVALL command
  2016-12-09 18:07                                             ` Andre Przywara
@ 2016-12-09 20:18                                               ` Stefano Stabellini
  2016-12-14  2:39                                                 ` George Dunlap
  0 siblings, 1 reply; 144+ messages in thread
From: Stefano Stabellini @ 2016-12-09 20:18 UTC (permalink / raw)
  To: Andre Przywara
  Cc: Stefano Stabellini, Vijay Kilari, Steve Capper, Dario Faggioli,
	george.dunlap, Julien Grall, xen-devel

On Fri, 9 Dec 2016, Andre Przywara wrote:
> On 07/12/16 20:20, Stefano Stabellini wrote:
> > On Tue, 6 Dec 2016, Julien Grall wrote:
> >> On 06/12/2016 22:01, Stefano Stabellini wrote:
> >>> On Tue, 6 Dec 2016, Stefano Stabellini wrote:
> >>>> moving a vCPU with interrupts assigned to it is slower than moving a
> >>>> vCPU without interrupts assigned to it. You could say that the
> >>>> slowness is directly proportional to the number of interrupts assigned
> >>>> to the vCPU.
> >>>
> >>> To be pedantic, by "assigned" I mean that a physical interrupt is routed
> >>> to a given pCPU and is set to be forwarded to a guest vCPU running on it
> >>> by the _IRQ_GUEST flag. The guest could be dom0. Upon receiving one of
> >>> these physical interrupts, a corresponding virtual interrupt (could be a
> >>> different irq) will be injected into the guest vCPU.
> >>>
> >>> When the vCPU is migrated to a new pCPU, the physical interrupts that
> >>> are configured to be injected as virtual interrupts into the vCPU, are
> >>> migrated with it. The physical interrupt migration has a cost. However,
> >>> receiving physical interrupts on the wrong pCPU has a higher cost.
> >>
> >> I don't understand why it is a problem for you to receive the first interrupt
> >> to the wrong pCPU and moving it if necessary.
> >>
> >> While this may have a higher cost (I don't believe so) on the first received
> >> interrupt, migrating thousands of interrupts at the same time is very
> >> expensive and will likely get Xen stuck for a while (think about ITS with a
> >> single command queue).
> >>
> >> Furthermore, the current approach will move every single interrupt routed to
> >> the vCPU, even those disabled. That's pointless and a waste of resources. You
> >> may argue that we can skip the ones disabled, but in that case what would be
> >> the benefit of migrating the IRQs while migrating the vCPUs?
> >>
> >> So I would suggest to spread it over the time. This also means less headache
> >> for the scheduler developers.
> > 
> > The most important aspect of interrupts handling in Xen is latency,
> > measured as the time between Xen receiving a physical interrupt and the
> > guest receiving it. This latency should be both small and deterministic.
> > 
> > We all agree so far, right?
> > 
> > 
> > The issue with spreading interrupts migrations over time is that it makes
> > interrupt latency less deterministic. It is OK, in the uncommon case of
> > vCPU migration with interrupts, to take a hit for a short time. This
> > "hit" can be measured. It can be known. If your workload cannot tolerate
> > it, vCPUs can be pinned. It should be a rare event anyway. On the other
> > hand, by spreading interrupts migrations, we make it harder to predict
> > latency. Aside from determinism, another problem with this approach is
> > that it ensures that every interrupt assigned to a vCPU will first hit
> > the wrong pCPU, then it will be moved. It guarantees the worst-case
> > scenario for interrupt latency for the vCPU that has been moved. If we
> > migrated all interrupts as soon as possible, we would minimize the
> > amount of interrupts delivered to the wrong pCPU. Most interrupts would
> > be delivered to the new pCPU right away, reducing interrupt latency.
> 
> So if this is such a crucial issue, why don't we use the ITS for good
> this time? The ITS hardware probably supports 16 bits worth of
> collection IDs, so what about we assign each VCPU (in every guest) a
> unique collection ID on the host and do a MAPC & MOVALL on a VCPU
> migration to let it point to the right physical redistributor.
> I see that this does not cover all use cases (> 65536 VCPUs, for
> instance), and it also depends heavily on many implementation details:

This is certainly an idea worth exploring. We don't need to assign a
collection ID to every vCPU, just the ones that have LPIs assigned to
them, which should be considerably fewer.
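
The MAPC + MOVALL idea can be made concrete with a tiny sketch. The command-queue helper below is a stand-in: real code would build the actual 32-byte command encodings defined by the GICv3 architecture specification and wait for completion (e.g. via SYNC), none of which is shown here:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical command log standing in for the host ITS command queue. */
enum its_cmd { ITS_MAPC, ITS_MOVALL };
static enum its_cmd cmds[8];
static int ncmds;
static void its_emit(enum its_cmd c) { cmds[ncmds++] = c; }

/*
 * Sketch of the scheme above: one host collection per vCPU that has LPIs,
 * so migrating the vCPU costs two ITS commands instead of one move per LPI.
 */
static void migrate_vcpu_collection(uint16_t icid,
                                    uint64_t old_rdbase, uint64_t new_rdbase)
{
    (void)icid; (void)old_rdbase; (void)new_rdbase;
    its_emit(ITS_MAPC);     /* re-map collection icid to new_rdbase */
    its_emit(ITS_MOVALL);   /* move pending LPIs old_rdbase -> new_rdbase */
    /* a SYNC / completion poll would follow in real code */
}
```

The appeal is that the command count per vCPU migration is constant, independent of how many LPIs are routed to that vCPU; the open questions about MOVALL cost and GITS_TYPER.HCC remain.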


> - How costly is a MOVALL? It needs to scan the pending table and
> transfer set bits to the other redistributor, which may take a while.

This is a hardware operation; even if it is not fast, I'd prefer to
rely on that, rather than implementing something complex in software.
Usually hardware gets better over time at this sort of thing.


> - Is there an impact if we exceed the number of hardware backed
> collections (GITS_TYPER.HCC)? If the ITS is forced to access system
> memory for every table lookup, this may slow down everyday operations.

We'll have to fall back to manually moving them one by one.


> - How likely are those misdirected interrupts in the first place? How
> often do we migrate VCPUs compared to the interrupt frequency?

This is where the scheduler work comes in.


> There are more subtle parameters to consider, so I guess we just need
> to try and measure.

That's right. This is why I have been saying that we need numbers. This
is difficult hardware, difficult code and difficult scenarios. Intuition
only gets us so far. We need to be scientific and measure the approach
we decide to take, and maybe even one that we decided not to take, to
figure out whether it is actually acceptable.


* Re: [RFC PATCH 21/24] ARM: vITS: handle INVALL command
  2016-12-09 19:00                           ` Andre Przywara
@ 2016-12-10  0:30                             ` Stefano Stabellini
  2016-12-12 10:38                               ` Andre Przywara
  0 siblings, 1 reply; 144+ messages in thread
From: Stefano Stabellini @ 2016-12-10  0:30 UTC (permalink / raw)
  To: Andre Przywara
  Cc: xen-devel, Julien Grall, Stefano Stabellini, Vijay Kilari, Steve Capper


On Fri, 9 Dec 2016, Andre Przywara wrote:
> >> I've been spending some time thinking about this, and I think we can in
> >> fact get away without ever propagating command from domains to the host.
> >>
> >> I made a list of all commands that possibly require host ITS command
> >> propagation. There are two groups:
> >> 1: enabling/disabling LPIs: INV, INVALL
> >> 2: mapping/unmapping devices/events: DISCARD, MAPD, MAPTI.
> >>
> >> The second group can be handled by mapping all required devices up
> >> front, I will elaborate on that in a different email.
> >>
> >> For the first group, read below ...
> >>
> >> On 01/12/16 01:19, Stefano Stabellini wrote:
> >>> On Fri, 25 Nov 2016, Julien Grall wrote:
> >>>> Hi,
> >>>>
> >>>> On 18/11/16 18:39, Stefano Stabellini wrote:
> >>>>> On Fri, 11 Nov 2016, Stefano Stabellini wrote:
> >>>>>> On Fri, 11 Nov 2016, Julien Grall wrote:
> >>>>>>> On 10/11/16 20:42, Stefano Stabellini wrote:
> >>>>>>> That's why in the approach we had on the previous series was "host ITS
> >>>>>>> command
> >>>>>>> should be limited when emulating guest ITS command". From my recall, in
> >>>>>>> that
> >>>>>>> series the host and guest LPIs was fully separated (enabling a guest
> >>>>>>> LPIs was
> >>>>>>> not enabling host LPIs).
> >>>>>>
> >>>>>> I am interested in reading what Ian suggested to do when the physical
> >>>>>> ITS queue is full, but I cannot find anything specific about it in the
> >>>>>> doc.
> >>>>>>
> >>>>>> Do you have a suggestion for this?
> >>>>>>
> >>>>>> The only things that come to mind right now are:
> >>>>>>
> >>>>>> 1) check if the ITS queue is full and busy loop until it is not (spin_lock
> >>>>>> style)
> >>>>>> 2) check if the ITS queue is full and sleep until it is not (mutex style)
> >>>>>
> >>>>> Another, probably better idea, is to map all pLPIs of a device when the
> >>>>> device is assigned to a guest (including Dom0). This is what was written
> >>>>> in Ian's design doc. The advantage of this approach is that Xen doesn't
> >>>>> need to take any actions on the physical ITS command queue when the
> >>>>> guest issues virtual ITS commands, therefore completely solving this
> >>>>> problem at the root. (Although I am not sure about enable/disable
> >>>>> commands: could we avoid issuing enable/disable on pLPIs?)
> >>>>
> >>>> In the previous design document (see [1]), the pLPIs are enabled when the
> >>>> device is assigned to the guest. This means that it is not necessary to send
> >>>> commands there. This also means we may receive a pLPI before the associated
> >>>> vLPI has been configured.
> >>>>
> >>>> That said, given that LPIs are edge-triggered, there is no deactivate state
> >>>> (see 4.1 in ARM IHI 0069C). So as soon as the priority drop is done, the same
> >>>> LPIs could potentially be raised again. This could generate a storm.
> >>>
> >>> Thank you for raising this important point. You are correct.
> >>>
> >>>> The priority drop is necessary if we don't want to block the reception of
> >>>> interrupt for the current physical CPU.
> >>>>
> >>>> What I am more concerned about is this problem can also happen in normal
> >>>> running (i.e. the pLPI is bound to a vLPI and the vLPI is enabled). For
> >>>> edge-triggered interrupts, there is no way to prevent them from firing again. Maybe
> >>>> it is time to introduce rate-limit interrupt for ARM. Any opinions?
> >>>
> >>> Yes. It could be as simple as disabling the pLPI when Xen receives a
> >>> second pLPI before the guest EOIs the first corresponding vLPI, which
> >>> shouldn't happen in normal circumstances.
> >>>
> >>> We need a simple per-LPI inflight counter, incremented when a pLPI is
> >>> received, decremented when the corresponding vLPI is EOIed (the LR is
> >>> cleared).
> >>>
> >>> When the counter > 1, we disable the pLPI and request a maintenance
> >>> interrupt for the corresponding vLPI.
> >>
> >> So why do we need a _counter_? This is about edge triggered interrupts,
> >> I think we can just accumulate all of them into one.
> > 
> > The counter is not to re-inject the same amount of interrupts into the
> > guest, but to detect interrupt storms.
> 
> I was wondering if an interrupt "storm" could already be defined by
> "receiving an LPI while there is already one pending (in the guest's
> virtual pending table) and it being disabled by the guest". I admit that
> declaring two interrupts as a storm is a bit of a stretch, but in fact
> the guest probably had a reason for disabling it even though it
> fires, so Xen should just follow suit.
> The only difference is that we don't do it _immediately_ when the guest
> tells us (via INV), but only if needed (LPI actually fires).

Either way should work OK, I think.


> >>  - If the VLPI is enabled, we EOI it on the host and inject it.
> >>  - If the VLPI is disabled, we set the pending bit in the VCPU's
> >>    pending table and EOI on the host - to allow other IRQs.
> >> - On a guest INV command, we check whether that vLPI is now enabled:
> >>  - If it is disabled now, we don't need to do anything.
> >>  - If it is enabled now, we check the pending bit for that VLPI:
> >>   - If it is 0, we don't do anything.
> >>   - If it is 1, we inject the VLPI and clear the pending bit.
> >> - On a guest INVALL command, we just need to iterate over the virtual
> >> LPIs.
> > 
> > Right, much better.
> > 
> > 
> >> If you look at the conditions above, the only immediate action is
> >> when a VLPI gets enabled _and_ its pending bit is set. So we can do
> >> 64-bit read accesses over the whole pending table to find non-zero words
> >> and thus set bits, which should be rare in practice. We can store the
> >> highest mapped VLPI to avoid iterating over the whole of the table.
> >> Ideally the guest has no direct control over the pending bits, since
> >> this is what the device generates. Also we limit the number of VLPIs in
> >> total per guest anyway.
> > 
> > I wonder if we could even use a fully packed bitmask with only the
> > pending bits, so 1 bit per vLPI, rather than 1 byte per vLPI. That would
> > be a nice improvement.
> 
> The _pending_ table is exactly that: one bit per VLPI.

Actually the spec says about the pending table, ch 6.1.2:

"Each Redistributor maintains entries in a separate LPI Pending table
that indicates the pending state of each LPI when GICR_CTLR.EnableLPIs
== 1 in the Redistributor:
  0 The LPI is not pending.
  1 The LPI is pending.

For a given LPI:
• The corresponding byte in the LPI Pending table is (base address + (N / 8)).
• The bit position in the byte is (N MOD 8)."

It seems to me that each LPI is supposed to have a byte, not a bit. Am I
looking at the wrong table?

In any case you suggested to trap the pending table, so we can actually
write anything we want to those guest provided pages.
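
For what it's worth, the addressing in the quoted spec text can be checked in two lines: the byte at (N / 8) covers a group of eight LPIs, and (N MOD 8) selects a single bit within that byte, i.e. each LPI still occupies one bit:

```c
#include <assert.h>
#include <stdint.h>

/* Pending-table addressing per the quoted text: byte N/8, bit N%8,
 * i.e. one bit per LPI, packed eight to a byte. */
static void lpi_pending_pos(uint32_t n, uint32_t *byte, uint32_t *bit)
{
    *byte = n / 8;
    *bit = n % 8;
}
```

So LPI 8192 (the first valid LPI number) maps to bit 0 of byte 1024, and LPI 8195 to bit 3 of the same byte.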


> So by doing a
> 64-bit read we cover 64 VLPIs. And normally if an LPI fires it will
> probably be enabled (otherwise the guest would have disabled it in the
> device), so we inject it and don't need this table. It's
> really just for storing the pending status should an LPI arrive while
> the guest had _disabled_ it. I assume this is rather rare, so the table
> will mostly be empty: that's why I expect most reads to be 0 and the
> iteration of the table to be very quick. As an additional optimization
> we could store the highest and lowest virtually pending LPI, to avoid
> scanning the whole table.
>
> We can't do so much about the property table, though, because its layout
> is described in the spec - in contrast to the ITS tables, which are IMPDEF.
> But as we only need to do something if the LPI is _both_ enabled _and_
> pending, scanning the pending table gives us a quite good filter
> already. For the few LPIs that hit here, we can just access the right
> byte in the property table.

OK
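For what it's worth, the scan you describe could look roughly like this (all names and helpers below are made up for illustration, not actual Xen code):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define LPI_BASE 8192            /* first LPI number per the GICv3 spec */

/* Assumed helper: is this vLPI enabled in the guest's property table?
 * Bit 0 of each configuration byte is the enable bit. */
static bool vlpi_is_enabled(const uint8_t *proptable, uint32_t vlpi)
{
    return proptable[vlpi - LPI_BASE] & 0x1;
}

/* Scan the pending bits up to the highest mapped vLPI in 64-bit chunks;
 * call inject() for every vLPI that is both pending and enabled, and
 * clear its pending bit. Words that are zero are skipped immediately. */
void invall_scan(uint64_t *pendtable, const uint8_t *proptable,
                 uint32_t highest_vlpi, void (*inject)(uint32_t vlpi))
{
    size_t nwords = (highest_vlpi - LPI_BASE + 64) / 64;

    for (size_t w = 0; w < nwords; w++) {
        uint64_t word = pendtable[w];

        while (word) {                      /* rare: mostly word == 0 */
            int bit = __builtin_ctzll(word);
            uint32_t vlpi = LPI_BASE + w * 64 + bit;

            if (vlpi_is_enabled(proptable, vlpi)) {
                inject(vlpi);
                pendtable[w] &= ~(1ULL << bit);
            }
            word &= ~(1ULL << bit);
        }
    }
}
```

Since most 64-bit words would be zero in practice, the inner loop would rarely run at all.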


> >> If that still sounds like a DOS vector, we could additionally rate-limit
> >> INVALLs, and/or track additions to the pending table after the last
> >> INVALL: if there haven't been any new pending bits since the last scan,
> >> INVALL is a NOP.
> >>
> >> Does that make sense so far?
> > 
> > It makes sense. It should be OK.
> > 
> > 
> >> So that just leaves us with this IRQ storm issue, which I am thinking
> >> about now. But I guess this is not a show stopper given we can disable
> >> the physical LPI if we sense this situation.
> > 
> > That is true and it's exactly what we should do.
> > 
> > 
> >>> When we receive the maintenance interrupt and we clear the LR of the
> >>> vLPI, Xen should re-enable the pLPI.
> 
> So I was thinking why you would need to wait for the guest to actually
> EOI it?
> Can't we end the interrupt storm condition at the moment the guest
> enables the interrupt? LPIs are edge triggered and so a storm in the
> past is easily merged into a single guest LPI once the guest enables it
> again. From there on we inject every triggered LPI into the guest.

What you describe works if the guest disables the interrupt. But what if
it doesn't? Xen should also be able to cope with non-cooperative guests
which might even have triggered the interrupt storm on purpose.

I think that you are asking this question because we are actually
talking about two different issues. See below.


> This special handling for the interrupt storm just stems from the fact
> that we have to keep LPIs enabled at the h/w interrupt controller level,
> despite the guest having disabled it on its own _virtual_ GIC. So once
> the guest enables it again, we are in line with the current GICv2/GICv3,
> aren't we? Do we have interrupt storm detection/prevention at the moment?

No, that's not the cause of the storm. Julien described well before:

  given that LPIs are edge-triggered, there is no deactivate state (see 4.1
  in ARM IHI 0069C). So as soon as the priority drop is done, the same LPIs could
  potentially be raised again. This could generate a storm.

The problem is that Xen has to do priority drop upon receiving a pLPI,
but, given that LPIs don't have a deactivate state, the priority drop is
enough to let the hardware inject a second LPI, even if the guest didn't
EOI the first one yet.

In the case of SPIs, the hardware cannot inject a second interrupt after
Xen does priority drop; it has to wait for the guest to EOI it.
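The per-LPI inflight counter scheme proposed earlier in the thread could be sketched as follows (the state layout and the disable/maintenance hooks are hypothetical placeholders, not Xen APIs):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct lpi_state {
    uint8_t inflight;        /* pLPIs received but not yet EOIed as vLPI */
    bool plpi_disabled;
};

/* Called on receiving the physical LPI (after Xen's priority drop). */
void on_plpi(struct lpi_state *s,
             void (*disable_plpi)(void), void (*request_maintenance)(void))
{
    if (++s->inflight > 1 && !s->plpi_disabled) {
        /* Second pLPI before the guest EOIed the first: treat as storm. */
        disable_plpi();
        request_maintenance();   /* ask to be told when the LR is cleared */
        s->plpi_disabled = true;
    }
}

/* Called when the guest EOIs the vLPI and the LR is cleared. */
void on_vlpi_eoi(struct lpi_state *s, void (*enable_plpi)(void))
{
    if (s->inflight)
        s->inflight--;
    if (s->plpi_disabled && s->inflight == 0) {
        enable_plpi();           /* storm over: re-enable the pLPI */
        s->plpi_disabled = false;
    }
}
```

That way the pLPI stays off until the guest has caught up with the interrupts already delivered.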



> And please keep in mind that LPIs are _always_ edge triggered: So once
> we EOI an LPI on the host, this "very same" LPI is gone from a h/w GIC
> point of view, the next incoming interrupt must have been triggered by a
> new interrupt condition in the device (new network packet, for
> instance).

This is actually the problem: what if the guest configured the device on
purpose to keep generating LPIs without pause? Nice and simple way to
take down the host.


> In contrast to GICv2 this applies to _every_ LPI.
> So I am not sure we should really care _too_ much about this (apart from
> the "guest has disabled it" part): Once we assign a device to a guest,
> we lose some control over the machine anyway and at least trust the
> device to not completely block the system.

No we don't! Hardware engineers make mistakes too! We have to protect
Xen from devices which purposely or mistakenly generate interrupt
storms. This is actually a pretty common problem.


> I don't see how the ITS differs in that respect from the GICv3/GICv2.

It differs because in the case of GICv2, every single interrupt has to
be EOI'd by the guest. Therefore the Xen scheduler can still decide to
schedule it out. In the case of the ITS, Xen could be stuck in an
interrupt handling loop.


> A quick update on my side: I implemented the scheme I described in my
> earlier mail now and it boots to the Dom0 prompt on a fastmodel with an ITS:

Nice!


> - On receiving the PHYSDEVOP_manage_pci_add hypercall in
> xen/arch/arm/physdev.c, we MAPD the device on the host, MAPTI a bunch of
> interrupts and enable them. We keep them unassigned in our host
> pLPI->VLPI table, so we discard them should they fire.
> This hypercall is issued by Dom0 Linux before bringing up any PCI
> devices, so it works even for Dom0 without any Linux changes. For DomUs
> with PCI passthrough Dom0 is expected to issue this hypercall on behalf
> of the to-be-created domain.
> - When a guest (be it Dom0 or DomU) actually maps an LPI (MAPTI), we
> just enter the virtual LPI number and the target VCPU in our pLPI-vLPI
> table and be done. Should it fire now, we know where to inject it, but
> refer to the enabled bit in the guest's property table before doing so.
> - When a guest (be it Dom0 or DomU) enables or disables an interrupt, we
> don't do much, as we refer to the enable bit every time we want to
> inject already. The only thing I actually do is to inject an LPI if
> there is a virtual LPI pending and the LPI is now enabled.

Sounds good, well done!
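Just to make sure we are on the same page, I imagine the pLPI->vLPI table roughly like this (the encoding, sizes and names are my assumptions for illustration, not your actual code):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LPI_BASE           8192   /* LPI numbers start at 8192 */
#define HOST_LPIS_PER_PAGE 512    /* 4K page / 8 bytes per entry */

/* One uint64_t per allocated host LPI, packing the virtual LPI number
 * and the target VCPU (vlpi == 0 would mean "unassigned"). */
typedef union {
    uint64_t raw;
    struct {
        uint32_t vlpi;    /* virtual LPI number */
        uint16_t vcpu;    /* target virtual CPU */
        uint16_t dom;     /* owning domain */
    };
} host_lpi_entry_t;

/* Two-level table: an array of page pointers, populated on demand as
 * devices get their LPIs mapped (MAPD/MAPTI on PHYSDEVOP_manage_pci_add). */
host_lpi_entry_t *host_lpi_pages[1024];

host_lpi_entry_t *host_lpi_lookup(uint32_t plpi)
{
    uint32_t idx = plpi - LPI_BASE;
    host_lpi_entry_t *page = host_lpi_pages[idx / HOST_LPIS_PER_PAGE];

    return page ? &page[idx % HOST_LPIS_PER_PAGE] : NULL;
}
```

So when a pLPI fires, one pointer chase tells us which vLPI to inject and where, or that the LPI is still unassigned and should be discarded.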


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [RFC PATCH 21/24] ARM: vITS: handle INVALL command
  2016-12-10  0:30                             ` Stefano Stabellini
@ 2016-12-12 10:38                               ` Andre Przywara
  2016-12-14  0:38                                 ` Stefano Stabellini
  0 siblings, 1 reply; 144+ messages in thread
From: Andre Przywara @ 2016-12-12 10:38 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: xen-devel, Julien Grall, Vijay Kilari, Steve Capper

Hi Stefano,

thanks for the prompt and helpful answer!

On 10/12/16 00:30, Stefano Stabellini wrote:
> On Fri, 9 Dec 2016, Andre Przywara wrote:
>>>> I've been spending some time thinking about this, and I think we can in
>>>> fact get away without ever propagating command from domains to the host.
>>>>
>>>> I made a list of all commands that possible require host ITS command
>>>> propagation. There are two groups:
>>>> 1: enabling/disabling LPIs: INV, INVALL
>>>> 2: mapping/unmapping devices/events: DISCARD, MAPD, MAPTI.
>>>>
>>>> The second group can be handled by mapping all required devices up
>>>> front, I will elaborate on that in a different email.
>>>>
>>>> For the first group, read below ...
>>>>
>>>> On 01/12/16 01:19, Stefano Stabellini wrote:
>>>>> On Fri, 25 Nov 2016, Julien Grall wrote:
>>>>>> Hi,
>>>>>>
>>>>>> On 18/11/16 18:39, Stefano Stabellini wrote:
>>>>>>> On Fri, 11 Nov 2016, Stefano Stabellini wrote:
>>>>>>>> On Fri, 11 Nov 2016, Julien Grall wrote:
>>>>>>>>> On 10/11/16 20:42, Stefano Stabellini wrote:
>>>>>>>>> That's why in the approach we had on the previous series was "host ITS
>>>>>>>>> command
>>>>>>>>> should be limited when emulating guest ITS command". From my recall, in
>>>>>>>>> that
>>>>>>>>> series the host and guest LPIs were fully separated (enabling a guest
>>>>>>>>> LPIs was
>>>>>>>>> not enabling host LPIs).
>>>>>>>>
>>>>>>>> I am interested in reading what Ian suggested to do when the physical
>>>>>>>> ITS queue is full, but I cannot find anything specific about it in the
>>>>>>>> doc.
>>>>>>>>
>>>>>>>> Do you have a suggestion for this?
>>>>>>>>
>>>>>>>> The only things that come to mind right now are:
>>>>>>>>
>>>>>>>> 1) check if the ITS queue is full and busy loop until it is not (spin_lock
>>>>>>>> style)
>>>>>>>> 2) check if the ITS queue is full and sleep until it is not (mutex style)
>>>>>>>
>>>>>>> Another, probably better idea, is to map all pLPIs of a device when the
>>>>>>> device is assigned to a guest (including Dom0). This is what was written
>>>>>>> in Ian's design doc. The advantage of this approach is that Xen doesn't
>>>>>>> need to take any actions on the physical ITS command queue when the
>>>>>>> guest issues virtual ITS commands, therefore completely solving this
>>>>>>> problem at the root. (Although I am not sure about enable/disable
>>>>>>> commands: could we avoid issuing enable/disable on pLPIs?)
>>>>>>
>>>>>> In the previous design document (see [1]), the pLPIs are enabled when the
>>>>>> device is assigned to the guest. This means that it is not necessary to send
>>>>>> commands there. This also means we may receive a pLPI before the associated
>>>>>> vLPI has been configured.
>>>>>>
>>>>>> That said, given that LPIs are edge-triggered, there is no deactivate state
>>>>>> (see 4.1 in ARM IHI 0069C). So as soon as the priority drop is done, the same
>>>>>> LPIs could potentially be raised again. This could generate a storm.
>>>>>
>>>>> Thank you for raising this important point. You are correct.
>>>>>
>>>>>> The priority drop is necessary if we don't want to block the reception of
>>>>>> interrupt for the current physical CPU.
>>>>>>
>>>>>> What I am more concerned about is this problem can also happen in normal
>>>>>> running (i.e the pLPI is bound to an vLPI and the vLPI is enabled). For
>>>>>> edge-triggered interrupts, there is no way to prevent them from firing again.
>>>>>> Maybe it is time to introduce rate-limited interrupts for ARM. Any opinions?
>>>>>
>>>>> Yes. It could be as simple as disabling the pLPI when Xen receives a
>>>>> second pLPI before the guest EOIs the first corresponding vLPI, which
>>>>> shouldn't happen in normal circumstances.
>>>>>
>>>>> We need a simple per-LPI inflight counter, incremented when a pLPI is
>>>>> received, decremented when the corresponding vLPI is EOIed (the LR is
>>>>> cleared).
>>>>>
>>>>> When the counter > 1, we disable the pLPI and request a maintenance
>>>>> interrupt for the corresponding vLPI.
>>>>
>>>> So why do we need a _counter_? This is about edge triggered interrupts,
>>>> I think we can just accumulate all of them into one.
>>>
>>> The counter is not to re-inject the same amount of interrupts into the
>>> guest, but to detect interrupt storms.
>>
>> I was wondering if an interrupt "storm" could already be defined by
>> "receiving an LPI while there is already one pending (in the guest's
>> virtual pending table) and it being disabled by the guest". I admit that
>> declaring two interrupts as a storm is a bit of a stretch, but in fact
>> the guest had probably a reason for disabling it even though it
>> fires, so Xen should just follow suit.
>> The only difference is that we don't do it _immediately_ when the guest
>> tells us (via INV), but only if needed (LPI actually fires).
> 
> Either way should work OK, I think.
> 
> 
>>>>  - If the VLPI is enabled, we EOI it on the host and inject it.
>>>>  - If the VLPI is disabled, we set the pending bit in the VCPU's
>>>>    pending table and EOI on the host - to allow other IRQs.
>>>> - On a guest INV command, we check whether that vLPI is now enabled:
>>>>  - If it is disabled now, we don't need to do anything.
>>>>  - If it is enabled now, we check the pending bit for that VLPI:
>>>>   - If it is 0, we don't do anything.
>>>>   - If it is 1, we inject the VLPI and clear the pending bit.
>>>> - On a guest INVALL command, we just need to iterate over the virtual
>>>> LPIs.
>>>
>>> Right, much better.
>>>
>>>
>>>> If you look at the conditions above, the only immediate action is
>>>> when a VLPI gets enabled _and_ its pending bit is set. So we can do
>>>> 64-bit read accesses over the whole pending table to find non-zero words
>>>> and thus set bits, which should be rare in practice. We can store the
>>>> highest mapped VLPI to avoid iterating over the whole of the table.
>>>> Ideally the guest has no direct control over the pending bits, since
>>>> this is what the device generates. Also we limit the number of VLPIs in
>>>> total per guest anyway.
>>>
>>> I wonder if we could even use a fully packed bitmask with only the
>>> pending bits, so 1 bit per vLPI, rather than 1 byte per vLPI. That would
>>> be a nice improvement.
>>
>> The _pending_ table is exactly that: one bit per VLPI.
> 
> Actually the spec says about the pending table, ch 6.1.2:
> 
> "Each Redistributor maintains entries in a separate LPI Pending table
> that indicates the pending state of each LPI when GICR_CTLR.EnableLPIs
> == 1 in the Redistributor:
>   0 The LPI is not pending.
>   1 The LPI is pending.
> 
> For a given LPI:
> • The corresponding byte in the LPI Pending table is (base address + (N / 8)).
> • The bit position in the byte is (N MOD 8)."

        ^^^

> It seems to me that each LPI is supposed to have a byte, not a bit. Am I
> looking at the wrong table?

Well, the explanation could indeed be a bit more explicit, but it's
really meant to be a bit:
1) The two lines above describe how to address a single bit in a
byte-addressed array.
2) The following paragraphs talk about "the first 1KB" when it comes to
non-LPI interrupts. This matches 8192 bits.
3) In section 6.1 the spec states: "Memory-backed storage for LPI
pending _bits_ in an LPI Pending table."
4) The actual instance, Marc's Linux driver, also speaks of a bit:
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/drivers/irqchip/irq-gic-v3-its.c#n785
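In code, the addressing the spec describes boils down to this (illustrative only, not an excerpt from any driver):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* For LPI N the spec says: the byte is at (base + N/8) and the bit
 * within that byte is (N MOD 8) - i.e. one *bit* per LPI. The first
 * 1KB (8192 bits) of the table covers the non-LPI interrupt range,
 * since LPI numbers start at 8192. */

bool lpi_pending_test(const uint8_t *pendtable_base, uint32_t n)
{
    return (pendtable_base[n / 8] >> (n % 8)) & 1;
}

void lpi_pending_set(uint8_t *pendtable_base, uint32_t n)
{
    pendtable_base[n / 8] |= 1u << (n % 8);
}
```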

> In any case you suggested to trap the pending table, so we can actually
> write anything we want to those guest provided pages.

Indeed:
"During normal operation, the LPI Pending table is maintained solely by
the Redistributor."

>> So by doing a
>> 64-bit read we cover 64 VLPIs. And normally if an LPI fires it will
>> probably be enabled (otherwise the guest would have disabled it in the
>> device), so we inject it and don't need this table. It's
>> really just for storing the pending status should an LPI arrive while
>> the guest had _disabled_ it. I assume this is rather rare, so the table
>> will mostly be empty: that's why I expect most reads to be 0 and the
>> iteration of the table to be very quick. As an additional optimization
>> we could store the highest and lowest virtually pending LPI, to avoid
>> scanning the whole table.
>>
>> We can't do so much about the property table, though, because its layout
>> is described in the spec - in contrast to the ITS tables, which are IMPDEF.
>> But as we only need to do something if the LPI is _both_ enabled _and_
>> pending, scanning the pending table gives us a quite good filter
>> already. For the few LPIs that hit here, we can just access the right
>> byte in the property table.
> 
> OK
> 
> 
>>>> If that still sounds like a DOS vector, we could additionally rate-limit
>>>> INVALLs, and/or track additions to the pending table after the last
>>>> INVALL: if there haven't been any new pending bits since the last scan,
>>>> INVALL is a NOP.
>>>>
>>>> Does that make sense so far?
>>>
>>> It makes sense. It should be OK.
>>>
>>>
>>>> So that just leaves us with this IRQ storm issue, which I am thinking
>>>> about now. But I guess this is not a show stopper given we can disable
>>>> the physical LPI if we sense this situation.
>>>
>>> That is true and it's exactly what we should do.
>>>
>>>
>>>>> When we receive the maintenance interrupt and we clear the LR of the
>>>>> vLPI, Xen should re-enable the pLPI.
>>
>> So I was thinking why you would need to wait for the guest to actually
>> EOI it?
>> Can't we end the interrupt storm condition at the moment the guest
>> enables the interrupt? LPIs are edge triggered and so a storm in the
>> past is easily merged into a single guest LPI once the guest enables it
>> again. From there on we inject every triggered LPI into the guest.
> 
> What you describe works if the guest disables the interrupt. But what if
> it doesn't? Xen should also be able to cope with non-cooperative guests
> which might even have triggered the interrupt storm on purpose.
> 
> I think that you are asking this question because we are actually
> talking about two different issues. See below.

Indeed I was wondering about that ...

> 
>> This special handling for the interrupt storm just stems from the fact
>> that we have to keep LPIs enabled at the h/w interrupt controller level,
>> despite the guest having disabled it on its own _virtual_ GIC. So once
>> the guest enables it again, we are in line with the current GICv2/GICv3,
>> aren't we? Do we have interrupt storm detection/prevention at the moment?
> 
> No, that's not the cause of the storm. Julien described well before:
> 
>   given that LPIs are edge-triggered, there is no deactivate state (see 4.1
>   in ARM IHI 0069C). So as soon as the priority drop is done, the same LPIs could
>   potentially be raised again. This could generate a storm.
> 
> The problem is that Xen has to do priority drop upon receiving a pLPI,
> but, given that LPIs don't have a deactivate state, the priority drop is
> enough to let the hardware inject a second LPI, even if the guest didn't
> EOI the first one yet.
> 
> In the case of SPIs, the hardware cannot inject a second interrupt after
> Xen does priority drop; it has to wait for the guest to EOI it.

I understand that ...

>> And please keep in mind that LPIs are _always_ edge triggered: So once
>> we EOI an LPI on the host, this "very same" LPI is gone from a h/w GIC
>> point of view, the next incoming interrupt must have been triggered by a
>> new interrupt condition in the device (new network packet, for
>> instance).
> 
> This is actually the problem: what if the guest configured the device on
> purpose to keep generating LPIs without pause? Nice and simple way to
> take down the host.

I see, I just wasn't sure we were talking about the same thing: actual
interrupt storm triggered by the device vs. "virtual" interrupt storm
due to an interrupt line not being lowered by the IRQ handler.
And I was hoping for the latter, but well ...
So thanks for the clarification.

>> In contrast to GICv2 this applies to _every_ LPI.
>> So I am not sure we should really care _too_ much about this (apart from
>> the "guest has disabled it" part): Once we assign a device to a guest,
>> we lose some control over the machine anyway and at least trust the
>> device to not completely block the system.
> 
> No we don't! Hardware engineers make mistakes too!

You tell me ... ;-)

> We have to protect
> Xen from devices which purposely or mistakenly generate interrupt
> storms. This is actually a pretty common problem.

I see, though this doesn't make this whole problem easier ;-)

>> I don't see how the ITS differs in that respect from the GICv3/GICv2.
> 
> It differs because in the case of GICv2, every single interrupt has to
> be EOI'd by the guest.

Sure, though I think technically it's "deactivated" here that matters
(we EOI LPIs as well). And since LPIs have no active state, this makes
the difference.

> Therefore the Xen scheduler can still decide to
> schedule it out. In the case of the ITS, Xen could be stuck in an
> interrupt handling loop.

So I was wondering whether we might need to relax our new strict "No
(unprivileged) guest ever causes a host ITS command to be queued" rule a
bit, because:
- Disabling an LPI is a separate issue, as we can trigger this in Xen
interrupt context once we decide that this is an interrupt storm.
- But enabling it again has to both happen in a timely manner (as the
guest expects interrupts to come in) and be triggered by a guest
action, which causes the INV command to be sent when handling a guest fault.

Now these INV commands (and possibly a follow-up SYNC) for enabling an LPI
would be the only critical ones, so I was wondering if we could ensure
that these commands can always be queued immediately, by making sure we
have at least two ITS command queue slots available all of the time.
Other ITS commands (triggered by device pass-throughs, for instance),
would then have to potentially wait if we foresee that they could fill
up the host command queue.
Something like QoS for ITS commands.
And I think we should map the maximum command queue size on the host
(1MB => 32768 commands) to make this scenario less likely.

I will need to think about this a bit further, maybe implement something
as a proof of concept.
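A very rough sketch of what I have in mind, with all structure and names invented for illustration:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define RESERVED_SLOTS 2   /* kept free for the critical INV + SYNC pair */

struct its_cmd_queue {
    uint32_t size;         /* total slots, e.g. 32768 for a 1MB queue */
    uint32_t used;         /* slots currently occupied */
};

static uint32_t slots_free(const struct its_cmd_queue *q)
{
    return q->size - q->used;
}

/* Ordinary commands (e.g. MAPD/MAPTI on device assignment) may not eat
 * into the reserve; the caller has to wait and retry instead. */
bool queue_ordinary_cmd(struct its_cmd_queue *q)
{
    if (slots_free(q) <= RESERVED_SLOTS)
        return false;
    q->used++;
    return true;
}

/* Critical commands (INV/SYNC to re-enable an LPI on a guest fault)
 * may use the reserved slots. */
bool queue_critical_cmd(struct its_cmd_queue *q)
{
    if (slots_free(q) == 0)
        return false;      /* should not happen given the reserve */
    q->used++;
    return true;
}
```

So ordinary, pass-through-triggered commands can stall, but the guest-visible enable path always finds room.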

Cheers,
Andre.

> 
>> A quick update on my side: I implemented the scheme I described in my
>> earlier mail now and it boots to the Dom0 prompt on a fastmodel with an ITS:
> 
> Nice!
> 
> 
>> - On receiving the PHYSDEVOP_manage_pci_add hypercall in
>> xen/arch/arm/physdev.c, we MAPD the device on the host, MAPTI a bunch of
>> interrupts and enable them. We keep them unassigned in our host
>> pLPI->VLPI table, so we discard them should they fire.
>> This hypercall is issued by Dom0 Linux before bringing up any PCI
>> devices, so it works even for Dom0 without any Linux changes. For DomUs
>> with PCI passthrough Dom0 is expected to issue this hypercall on behalf
>> of the to-be-created domain.
>> - When a guest (be it Dom0 or DomU) actually maps an LPI (MAPTI), we
>> just enter the virtual LPI number and the target VCPU in our pLPI-vLPI
>> table and be done. Should it fire now, we know where to inject it, but
>> refer to the enabled bit in the guest's property table before doing so.
>> - When a guest (be it Dom0 or DomU) enables or disables an interrupt, we
>> don't do much, as we refer to the enable bit every time we want to
>> inject already. The only thing I actually do is to inject an LPI if
>> there is a virtual LPI pending and the LPI is now enabled.
> 
> Sounds good, well done!
> 



* Re: [RFC PATCH 21/24] ARM: vITS: handle INVALL command
  2016-12-12 10:38                               ` Andre Przywara
@ 2016-12-14  0:38                                 ` Stefano Stabellini
  0 siblings, 0 replies; 144+ messages in thread
From: Stefano Stabellini @ 2016-12-14  0:38 UTC (permalink / raw)
  To: Andre Przywara
  Cc: xen-devel, Julien Grall, Stefano Stabellini, Vijay Kilari, Steve Capper


On Mon, 12 Dec 2016, Andre Przywara wrote:
> >> The _pending_ table is exactly that: one bit per VLPI.
> > 
> > Actually the spec says about the pending table, ch 6.1.2:
> > 
> > "Each Redistributor maintains entries in a separate LPI Pending table
> > that indicates the pending state of each LPI when GICR_CTLR.EnableLPIs
> > == 1 in the Redistributor:
> >   0 The LPI is not pending.
> >   1 The LPI is pending.
> > 
> > For a given LPI:
> > • The corresponding byte in the LPI Pending table is (base address + (N / 8)).
> > • The bit position in the byte is (N MOD 8)."
> 
>         ^^^
> 
> > It seems to me that each LPI is supposed to have a byte, not a bit. Am I
> > looking at the wrong table?
> 
> Well, the explanation could indeed be a bit more explicit, but it's
> really meant to be a bit:
> 1) The two lines above describe how to address a single bit in a
> byte-addressed array.
> 2) The following paragraphs talk about "the first 1KB" when it comes to
> non-LPI interrupts. This matches 8192 bits.
> 3) In section 6.1 the spec states: "Memory-backed storage for LPI
> pending _bits_ in an LPI Pending table."
> 4) The actual instance, Marc's Linux driver, also speaks of a bit:
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/drivers/irqchip/irq-gic-v3-its.c#n785

I was misled by the doc. Better this way.


> >> This special handling for the interrupt storm just stems from the fact
> >> that we have to keep LPIs enabled at the h/w interrupt controller level,
> >> despite the guest having disabled it on its own _virtual_ GIC. So once
> >> the guest enables it again, we are in line with the current GICv2/GICv3,
> >> aren't we? Do we have interrupt storm detection/prevention at the moment?
> > 
> > No, that's not the cause of the storm. Julien described well before:
> > 
> >   given that LPIs are edge-triggered, there is no deactivate state (see 4.1
> >   in ARM IHI 0069C). So as soon as the priority drop is done, the same LPIs could
> >   potentially be raised again. This could generate a storm.
> > 
> > The problem is that Xen has to do priority drop upon receiving a pLPI,
> > but, given that LPIs don't have a deactivate state, the priority drop is
> > enough to let the hardware inject a second LPI, even if the guest didn't
> > EOI the first one yet.
> > 
> > In the case of SPIs, the hardware cannot inject a second interrupt after
> > Xen does priority drop; it has to wait for the guest to EOI it.
> 
> I understand that ...
> 
> >> And please keep in mind that LPIs are _always_ edge triggered: So once
> >> we EOI an LPI on the host, this "very same" LPI is gone from a h/w GIC
> >> point of view, the next incoming interrupt must have been triggered by a
> >> new interrupt condition in the device (new network packet, for
> >> instance).
> > 
> > This is actually the problem: what if the guest configured the device on
> > purpose to keep generating LPIs without pause? Nice and simple way to
> > take down the host.
> 
> I see, I just wasn't sure we were talking about the same thing: actual
> interrupt storm triggered by the device vs. "virtual" interrupt storm
> due to an interrupt line not being lowered by the IRQ handler.
> And I was hoping for the latter, but well ...
> So thanks for the clarification.
> 
> >> In contrast to GICv2 this applies to _every_ LPI.
> >> So I am not sure we should really care _too_ much about this (apart from
> >> the "guest has disabled it" part): Once we assign a device to a guest,
> >> we lose some control over the machine anyway and at least trust the
> >> device to not completely block the system.
> > 
> > No we don't! Hardware engineers make mistakes too!
> 
> You tell me ... ;-)
> 
> > We have to protect
> > Xen from devices which purposely or mistakenly generate interrupt
> > storms. This is actually a pretty common problem.
> 
> I see, though this doesn't make this whole problem easier ;-)
> 
> >> I don't see how the ITS differs in that respect from the GICv3/GICv2.
> > 
> > It differs because in the case of GICv2, every single interrupt has to
> > be EOI'd by the guest.
> 
> Sure, though I think technically it's "deactivated" here that matters
> (we EOI LPIs as well). And since LPIs have no active state, this makes
> the difference.
> 
> > Therefore the Xen scheduler can still decide to
> > schedule it out. In the case of the ITS, Xen could be stuck in an
> > interrupt handling loop.
> 
> So I was wondering whether we might need to relax our new strict "No
> (unprivileged) guest ever causes a host ITS command to be queued" rule a
> bit, because:
> - Disabling an LPI is a separate issue, as we can trigger this in Xen
> interrupt context once we decide that this is an interrupt storm.
> - But enabling it again has to both happen in a timely manner (as the
> guest expects interrupts to come in) and be triggered by a guest
> action, which causes the INV command to be sent when handling a guest fault.
> 
> Now these INV commands (and possibly a follow-up SYNC) for enabling an LPI
> would be the only critical ones, so I was wondering if we could ensure
> that these commands can always be queued immediately, by making sure we
> have at least two ITS command queue slots available all of the time.
> Other ITS commands (triggered by device pass-throughs, for instance),
> would then have to potentially wait if we foresee that they could fill
> up the host command queue.
> Something like QoS for ITS commands.
> And I think we should map the maximum command queue size on the host
> (1MB => 32768 commands) to make this scenario less likely.
> 
> I will need to think about this a bit further, maybe implement something
> as a proof of concept.

On one hand, I'd say that it should be rare to re-enable an interrupt
after it was disabled due to a storm. Extremely rare. It should be OK
to issue a physical INV and SYNC in that event.

However, if that happens, it could very well be because the guest is
trying to take down the host, and issuing physical ITS commands in
response to a malicious guest action could be a good way to help the
attacker. We need to be extra careful.

Given that a storm is supposed to be an exceptional circumstance, it is
OK to enforce very strict limits on the amount of times we are willing
to issue physical ITS commands as a consequence of a guest action. For
example, we could decide to do it just once, or twice, then label the
guest as "untrustworthy" and destroy it. After all if a storm keeps
happening, it must be due to a malicious guest or faulty hardware - in
both cases it is best to terminate the VM.
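In pseudo-C, the policy could be as simple as this (constants and names invented for illustration):

```c
#include <assert.h>
#include <stdint.h>

/* Allow only a small, fixed number of physical INV/SYNC issues in
 * response to guest re-enables after a storm-disable; beyond that,
 * consider the guest untrustworthy and terminate it. */
#define MAX_STORM_REENABLES 2

struct storm_policy {
    uint8_t reenables;   /* re-enables after storm-disables so far */
};

enum storm_action { STORM_ISSUE_INV, STORM_DESTROY_GUEST };

enum storm_action on_guest_reenable_after_storm(struct storm_policy *p)
{
    if (++p->reenables > MAX_STORM_REENABLES)
        return STORM_DESTROY_GUEST;  /* malicious guest or faulty hardware */
    return STORM_ISSUE_INV;          /* issue the physical INV + SYNC */
}
```

The exact threshold matters less than having one at all: the guest gets a bounded number of chances to trigger physical ITS commands, and no more.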




* Re: [RFC PATCH 21/24] ARM: vITS: handle INVALL command
  2016-12-09 20:18                                               ` Stefano Stabellini
@ 2016-12-14  2:39                                                 ` George Dunlap
  2016-12-16  1:30                                                   ` Dario Faggioli
  0 siblings, 1 reply; 144+ messages in thread
From: George Dunlap @ 2016-12-14  2:39 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Vijay Kilari, Steve Capper, Andre Przywara, Dario Faggioli,
	Julien Grall, xen-devel


> On Dec 10, 2016, at 4:18 AM, Stefano Stabellini <sstabellini@kernel.org> wrote:
> 
> On Fri, 9 Dec 2016, Andre Przywara wrote:
>> On 07/12/16 20:20, Stefano Stabellini wrote:
>>> On Tue, 6 Dec 2016, Julien Grall wrote:
>>>> On 06/12/2016 22:01, Stefano Stabellini wrote:
>>>>> On Tue, 6 Dec 2016, Stefano Stabellini wrote:
>>>>>> moving a vCPU with interrupts assigned to it is slower than moving a
>>>>>> vCPU without interrupts assigned to it. You could say that the
>>>>>> slowness is directly proportional do the number of interrupts assigned
>>>>>> to the vCPU.
>>>>> 
>>>>> To be pedantic, by "assigned" I mean that a physical interrupt is routed
>>>>> to a given pCPU and is set to be forwarded to a guest vCPU running on it
>>>>> by the _IRQ_GUEST flag. The guest could be dom0. Upon receiving one of
>>>>> these physical interrupts, a corresponding virtual interrupt (could be a
>>>>> different irq) will be injected into the guest vCPU.
>>>>> 
>>>>> When the vCPU is migrated to a new pCPU, the physical interrupts that
>>>>> are configured to be injected as virtual interrupts into the vCPU, are
>>>>> migrated with it. The physical interrupt migration has a cost. However,
>>>>> receiving physical interrupts on the wrong pCPU has an higher cost.
>>>> 
>>>> I don't understand why it is a problem for you to receive the first interrupt
>>>> to the wrong pCPU and moving it if necessary.
>>>> 
>>>> While this may have an higher cost (I don't believe so) on the first received
>>>> interrupt, migrating thousands of interrupts at the same time is very
>>>> expensive and will likely get Xen stuck for a while (think about ITS with a
>>>> single command queue).
>>>> 
>>>> Furthermore, the current approach will move every single interrupt routed a
>>>> the vCPU, even those disabled. That's pointless and a waste of resource. You
>>>> may argue that we can skip the ones disabled, but in that case what would be
>>>> the benefits to migrate the IRQs while migrate the vCPUs?
>>>> 
>>>> So I would suggest to spread it over the time. This also means less headache
>>>> for the scheduler developers.
>>> 
>>> The most important aspect of interrupts handling in Xen is latency,
>>> measured as the time between Xen receiving a physical interrupt and the
>>> guest receiving it. This latency should be both small and deterministic.
>>> 
>>> We all agree so far, right?
>>> 
>>> 
>>> The issue with spreading interrupts migrations over time is that it makes
>>> interrupt latency less deterministic. It is OK, in the uncommon case of
>>> vCPU migration with interrupts, to take a hit for a short time. This
>>> "hit" can be measured. It can be known. If your workload cannot tolerate
>>> it, vCPUs can be pinned. It should be a rare event anyway. On the other
>>> hand, by spreading interrupts migrations, we make it harder to predict
>>> latency. Aside from determinism, another problem with this approach is
>>> that it ensures that every interrupt assigned to a vCPU will first hit
>>> the wrong pCPU, then it will be moved. It guarantees the worst-case
>>> scenario for interrupt latency for the vCPU that has been moved. If we
>>> migrated all interrupts as soon as possible, we would minimize the
>>> amount of interrupts delivered to the wrong pCPU. Most interrupts would
>>> be delivered to the new pCPU right away, reducing interrupt latency.

OK, so ultimately for each interrupt we can take an “eager” approach and move it as soon as the vcpu moves, or a “lazy” approach and move it after it fires.

The two options which have been discussed are:
1. Always take an eager approach, and try to tell the scheduler to limit the migration frequency for these vcpus more than others
2. Always take a lazy approach, and leave the scheduler the way it is.

Another approach which one might take:
3. Eagerly migrate a subset of the interrupts and lazily migrate the others.  For instance, we could eagerly migrate all the interrupts which have fired since the last vcpu migration.  In a system where migrations happen frequently, this should only be a handful; in a system that migrates infrequently, this will be more, but it won’t matter, because it will happen less often.

Workloads which need predictable IRQ latencies should probably be pinning their vcpus anyway.

So at the moment, the scheduler already tries to avoid migrating things *a little bit* if it can (see migrate_resist).  It’s not clear to me at the moment whether this is enough or not.  Or to put it a different way — how long should the scheduler try to wait before moving one of these vcpus?  At the moment I haven’t seen a good way of calculating this.

#3 to me has the feeling of being somewhat more satisfying, but also potentially fairly complicated.  Since the scheduler already does migration resistance somewhat, #1 would be simpler to implement in the short run.  If it turns out that #1 has other drawbacks, we can implement #3 as and when needed.

Thoughts?

 -George

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [RFC PATCH 21/24] ARM: vITS: handle INVALL command
  2016-12-14  2:39                                                 ` George Dunlap
@ 2016-12-16  1:30                                                   ` Dario Faggioli
  0 siblings, 0 replies; 144+ messages in thread
From: Dario Faggioli @ 2016-12-16  1:30 UTC (permalink / raw)
  To: George Dunlap, Stefano Stabellini
  Cc: Andre Przywara, Julien Grall, Steve Capper, Vijay Kilari, xen-devel


[-- Attachment #1.1: Type: text/plain, Size: 5013 bytes --]

On Wed, 2016-12-14 at 03:39 +0100, George Dunlap wrote:
> > On Dec 10, 2016, at 4:18 AM, Stefano Stabellini <sstabellini@kernel
> > .org> wrote:
> > > > The issue with spreading interrupts migrations over time is
> > > > that it makes
> > > > interrupt latency less deterministic. It is OK, in the uncommon
> > > > case of
> > > > vCPU migration with interrupts, to take a hit for a short time.
> > > > This
> > > > "hit" can be measured. It can be known. If your workload cannot
> > > > tolerate
> > > > it, vCPUs can be pinned. It should be a rare event anyway. On
> > > > the other
> > > > hand, by spreading interrupts migrations, we make it harder to
> > > > predict
> > > > latency. Aside from determinism, another problem with this
> > > > approach is
> > > > that it ensures that every interrupt assigned to a vCPU will
> > > > first hit
> > > > the wrong pCPU, then it will be moved. It guarantees the worst-
> > > > case
> > > > scenario for interrupt latency for the vCPU that has been
> > > > moved. If we
> > > > migrated all interrupts as soon as possible, we would minimize
> > > > the
> > > > amount of interrupts delivered to the wrong pCPU. Most
> > > > interrupts would
> > > > be delivered to the new pCPU right away, reducing interrupt
> > > > latency.
> 
> Another approach which one might take:
> 3. Eagerly migrate a subset of the interrupts and lazily migrate the
> others.  For instance, we could eagerly migrate all the interrupts
> which have fired since the last vcpu migration.  In a system where
> migrations happen frequently, this should only be a handful; in a
> system that migrates infrequently, this will be more, but it won’t
> matter, because it will happen less often.
> 
Yes, if doable (e.g., I don't know how easy and practical it is to know
and keep track of fired interrupts), this looks like a good solution to me
too.

> So at the moment, the scheduler already tries to avoid migrating
> things *a little bit* if it can (see migrate_resist).  It’s not clear
> to me at the moment whether this is enough or not.  
>
Well, true, but migration resistance, in Credit2, is just a fixed value
which:
 1. is set at boot time;
 2. is always the same for all vcpus;
 3. is always the same, no matter what a vcpu is doing.

And even if we make it tunable and changeable at runtime (which I
intend to do), it's still something pretty "static" because of 2 and 3.

And even if we make it tunable per-vcpu (which is doable), it would be
rather hard to decide to what value to set it, for each vcpu. And, of
course, 3 would still apply (i.e., it would not change according to the
vcpu workload or characteristics).

So, it's guessing. More or less fine grained, but always guessing.

On the other hand, using something proportional to the nr. of routed
interrupts as the migration resistance threshold would overcome all of 1, 2
and 3. It would give us a migrate_resist value which is adaptive, and
is determined according to the actual workload or properties of a specific
vcpu.
Feeding routed interrupt info to the load balancer comes from similar
reasoning (and we actually may want to do both).

FTR, Credit1 has a similar mechanism, i.e., it even makes *wild guesses*
about whether a vcpu could still have some of its data in cache, and tries
not to migrate it if that's likely (see __csched_vcpu_is_cache_hot()).
We can improve that too, although it is a lot more complex and less
predictable, as usual with Credit1.

> Or to put it a different way — how long should the scheduler try to
> wait before moving one of these vcpus?  
>
Yep, it's similar to the "anticipation" problem in I/O schedulers
(where "excessive seeks" ~= "too frequent migrations").

 https://en.wikipedia.org/wiki/Anticipatory_scheduling

> At the moment I haven’t seen a good way of calculating this.
> 
Exactly, and basing the calculation on the number of routed interrupts
--and, if possible, other metrics too-- could be that "good way" we're
looking for.

It would need experimenting, of course, but I like the idea.

> #3 to me has the feeling of being somewhat more satisfying, but also
> potentially fairly complicated.  Since the scheduler already does
> migration resistance somewhat, #1 would be simpler to implement in
> the short run.  If it turns out that #1 has other drawbacks, we can
> implement #3 as and when needed.
> 
> Thoughts?
> 
Yes, we can do things incrementally, which is always good. I like your
#1 proposal because it has the really positive side effect of bringing
us in the camp of adaptive migration resistance, which is something
pretty advanced and pretty cool, if we manage to do it right. :-)

Regards,
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)


* Re: [RFC PATCH 08/24] ARM: GICv3: introduce separate pending_irq structs for LPIs
  2016-10-28  1:04   ` Stefano Stabellini
@ 2017-01-12 19:14     ` Andre Przywara
  2017-01-13 19:37       ` Stefano Stabellini
  0 siblings, 1 reply; 144+ messages in thread
From: Andre Przywara @ 2017-01-12 19:14 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: xen-devel, Julien Grall

Hi Stefano,

as just mentioned in my last reply, I missed that email last time. Sorry
for that.

Replying to the comments that still apply to the new drop ...

On 28/10/16 02:04, Stefano Stabellini wrote:
> On Wed, 28 Sep 2016, Andre Przywara wrote:
>> For the same reason that allocating a struct irq_desc for each
>> possible LPI is not an option, having a struct pending_irq for each LPI
>> is also not feasible. However we actually only need those when an
>> interrupt is on a vCPU (or is about to be injected).
>> Maintain a list of those structs that we can use for the lifecycle of
>> a guest LPI. We allocate new entries if necessary, however reuse
>> pre-owned entries whenever possible.
>> Teach the existing VGIC functions to find the right pointer when being
>> given a virtual LPI number.
>>
>> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
>> ---
>>  xen/arch/arm/gic.c            |  3 +++
>>  xen/arch/arm/vgic-v3.c        |  2 ++
>>  xen/arch/arm/vgic.c           | 56 ++++++++++++++++++++++++++++++++++++++++---
>>  xen/include/asm-arm/domain.h  |  1 +
>>  xen/include/asm-arm/gic-its.h | 10 ++++++++
>>  xen/include/asm-arm/vgic.h    |  9 +++++++
>>  6 files changed, 78 insertions(+), 3 deletions(-)
>>
>> diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
>> index 63c744a..ebe4035 100644
>> --- a/xen/arch/arm/gic.c
>> +++ b/xen/arch/arm/gic.c
>> @@ -506,6 +506,9 @@ static void gic_update_one_lr(struct vcpu *v, int i)
>>                  struct vcpu *v_target = vgic_get_target_vcpu(v, irq);
>>                  irq_set_affinity(p->desc, cpumask_of(v_target->processor));
>>              }
>> +            /* If this was an LPI, mark this struct as available again. */
>> +            if ( p->irq >= 8192 )
>> +                p->irq = 0;
> 
> I believe that 0 is a valid irq number, we need to come up with a
> different invalid_irq value, and we should #define it. We could also
> consider checking if the irq is inflight (linked to the inflight list)
> instead of using irq == 0 to understand if it is reusable.

But those pending_irqs here are used by LPIs only, where everything
below 8192 is invalid. So that seemed like an easy and straightforward
value to use. The other, statically allocated pending_irqs would never
have an IRQ number of 8192 or above. When searching for an empty pending_irq
for a new LPI, we would never touch any of the statically allocated
structs, so this is safe, isn't it?

>>          }
>>      }
>>  }
>> diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c
>> index ec038a3..e9b6490 100644
>> --- a/xen/arch/arm/vgic-v3.c
>> +++ b/xen/arch/arm/vgic-v3.c
>> @@ -1388,6 +1388,8 @@ static int vgic_v3_vcpu_init(struct vcpu *v)
>>      if ( v->vcpu_id == last_cpu || (v->vcpu_id == (d->max_vcpus - 1)) )
>>          v->arch.vgic.flags |= VGIC_V3_RDIST_LAST;
>>  
>> +    INIT_LIST_HEAD(&v->arch.vgic.pending_lpi_list);
>> +
>>      return 0;
>>  }
>>  
>> diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c
>> index 0965119..b961551 100644
>> --- a/xen/arch/arm/vgic.c
>> +++ b/xen/arch/arm/vgic.c
>> @@ -31,6 +31,8 @@
>>  #include <asm/mmio.h>
>>  #include <asm/gic.h>
>>  #include <asm/vgic.h>
>> +#include <asm/gic_v3_defs.h>
>> +#include <asm/gic-its.h>
>>  
>>  static inline struct vgic_irq_rank *vgic_get_rank(struct vcpu *v, int rank)
>>  {
>> @@ -61,7 +63,7 @@ struct vgic_irq_rank *vgic_rank_irq(struct vcpu *v, unsigned int irq)
>>      return vgic_get_rank(v, rank);
>>  }
>>  
>> -static void vgic_init_pending_irq(struct pending_irq *p, unsigned int virq)
>> +void vgic_init_pending_irq(struct pending_irq *p, unsigned int virq)
>>  {
>>      INIT_LIST_HEAD(&p->inflight);
>>      INIT_LIST_HEAD(&p->lr_queue);
>> @@ -244,10 +246,14 @@ struct vcpu *vgic_get_target_vcpu(struct vcpu *v, unsigned int virq)
>>  
>>  static int vgic_get_virq_priority(struct vcpu *v, unsigned int virq)
>>  {
>> -    struct vgic_irq_rank *rank = vgic_rank_irq(v, virq);
>> +    struct vgic_irq_rank *rank;
>>      unsigned long flags;
>>      int priority;
>>  
>> +    if ( virq >= 8192 )
> 
> Please introduce a convenience static inline function such as:
> 
>   bool is_lpi(unsigned int irq)

Sure.

>> +        return gicv3_lpi_get_priority(v->domain, virq);
>> +
>> +    rank = vgic_rank_irq(v, virq);
>>      vgic_lock_rank(v, rank, flags);
>>      priority = rank->priority[virq & INTERRUPT_RANK_MASK];
>>      vgic_unlock_rank(v, rank, flags);
>> @@ -446,13 +452,55 @@ int vgic_to_sgi(struct vcpu *v, register_t sgir, enum gic_sgi_mode irqmode, int
>>      return 1;
>>  }
>>  
>> +/*
>> + * Holding struct pending_irq's for each possible virtual LPI in each domain
>> + * requires too much Xen memory, also a malicious guest could potentially
>> + * spam Xen with LPI map requests. We cannot cover those with (guest allocated)
>> + * ITS memory, so we use a dynamic scheme of allocating struct pending_irq's
>> + * on demand.
>> + */
>> +struct pending_irq *lpi_to_pending(struct vcpu *v, unsigned int lpi,
>> +                                   bool allocate)
>> +{
>> +    struct lpi_pending_irq *lpi_irq, *empty = NULL;
>> +
>> +    /* TODO: locking! */
> 
> Yeah, this needs to be fixed in v1 :-)

I fixed that in the RFC v2 post.

> 
>> +    list_for_each_entry(lpi_irq, &v->arch.vgic.pending_lpi_list, entry)
>> +    {
>> +        if ( lpi_irq->pirq.irq == lpi )
>> +            return &lpi_irq->pirq;
>> +
>> +        if ( lpi_irq->pirq.irq == 0 && !empty )
>> +            empty = lpi_irq;
>> +    }
> 
> This is another one of those cases where a list is too slow for the hot
> path. The idea of allocating pending_irq struct on demand is good, but
> storing them in a linked list would kill performance. Probably the best
> thing we could do is an hashtable and we should preallocate the initial
> array of elements. I don't know what the size of the initial array
> should be, but we can start around 50, and change it in the future once
> we do tests with real workloads. Of course the other key parameter is
> the hash function, not sure which one is the right one, but ideally we
> would never have to allocate new pending_irq struct for LPIs because the
> preallocated set would suffice.

As I mentioned in the last post, I expect this number to be really low
(less than 5). Let's face it: if you have multiple interrupts pending
for a significant amount of time, you won't make any actual progress in
the guest, because it's busy handling interrupts.
So my picture of LPI handling is:
1) A device triggers an MSI, so the host receives the LPI. Ideally this
will be handled by the pCPU where the right VCPU is running atm, so it
will exit to EL2. Xen will handle the LPI by assigning one struct
pending_irq to it and will inject it into the guest.
2) The VCPU gets to run again and calls the interrupt handler, because
the (virtual) LPI is pending.
3) The (Linux) IRQ handler reads the ICC_IAR register to learn the IRQ
number, and will get the virtual LPI number.
=> At this point the LPI is done when it comes to the VGIC. The LR state
will be set to 0 (neither pending nor active). This is independent of the
EOI the handler will execute soon (or later).
4) On the next exit the VGIC code will discover that the IRQ is done
(LR.state == 0) and will discard the struct pending_irq (set the LPI
number to 0 to make it available to the next LPI).

Even if there were multiple LPIs pending at the same time (because
the guest had interrupts disabled, for instance), I believe they can be
all handled without exiting. Upon EOIing (priority-dropping, really) the
first LPI, the next virtual LPI would fire, calling the interrupt
handler again, and so on. Unless the kernel decides to do something that
exits (even accessing the hardware normally wouldn't, I believe), we can
clear all pending LPIs in one go.

So I have a hard time imagining how we can really have many LPIs
pending and thus struct pending_irqs allocated.
Note that this may differ from SPIs, for instance, because the IRQ life
cycle is more complex there (extending till the EOI).

Does that make some sense? Or am I missing something here?

> I could be convinced that a list is sufficient if we do some real
> benchmarking and it turns out that lpi_to_pending always resolve in less
> than ~5 steps.

I can try to do this once I get it running on some silicon ...

>> +    if ( !allocate )
>> +        return NULL;
>> +
>> +    if ( !empty )
>> +    {
>> +        empty = xzalloc(struct lpi_pending_irq);
>> +        vgic_init_pending_irq(&empty->pirq, lpi);
>> +        list_add_tail(&empty->entry, &v->arch.vgic.pending_lpi_list);
>> +    } else
>> +    {
>> +        empty->pirq.status = 0;
>> +        empty->pirq.irq = lpi;
>> +    }
>> +
>> +    return &empty->pirq;
>> +}
>> +
>>  struct pending_irq *irq_to_pending(struct vcpu *v, unsigned int irq)
>>  {
>>      struct pending_irq *n;
>> +
> 
> spurious change
> 
> 
>>      /* Pending irqs allocation strategy: the first vgic.nr_spis irqs
>>       * are used for SPIs; the rests are used for per cpu irqs */
>>      if ( irq < 32 )
>>          n = &v->arch.vgic.pending_irqs[irq];
>> +    else if ( irq >= 8192 )
> 
> Use the new static inline
> 
> 
>> +        n = lpi_to_pending(v, irq, true);
>>      else
>>          n = &v->domain->arch.vgic.pending_irqs[irq - 32];
>>      return n;
>> @@ -480,7 +528,7 @@ void vgic_clear_pending_irqs(struct vcpu *v)
>>  void vgic_vcpu_inject_irq(struct vcpu *v, unsigned int virq)
>>  {
>>      uint8_t priority;
>> -    struct pending_irq *iter, *n = irq_to_pending(v, virq);
>> +    struct pending_irq *iter, *n;
>>      unsigned long flags;
>>      bool_t running;
>>  
>> @@ -488,6 +536,8 @@ void vgic_vcpu_inject_irq(struct vcpu *v, unsigned int virq)
>>  
>>      spin_lock_irqsave(&v->arch.vgic.lock, flags);
>>  
>> +    n = irq_to_pending(v, virq);
> 
> Why this change?

Because we now need to hold the lock before calling irq_to_pending(),
which now may call lpi_to_pending().

>>      /* vcpu offline */
>>      if ( test_bit(_VPF_down, &v->pause_flags) )
>>      {
>> diff --git a/xen/include/asm-arm/domain.h b/xen/include/asm-arm/domain.h
>> index 9452fcd..ae8a9de 100644
>> --- a/xen/include/asm-arm/domain.h
>> +++ b/xen/include/asm-arm/domain.h
>> @@ -249,6 +249,7 @@ struct arch_vcpu
>>          paddr_t rdist_base;
>>  #define VGIC_V3_RDIST_LAST  (1 << 0)        /* last vCPU of the rdist */
>>          uint8_t flags;
>> +        struct list_head pending_lpi_list;
>>      } vgic;
>>  
>>      /* Timer registers  */
>> diff --git a/xen/include/asm-arm/gic-its.h b/xen/include/asm-arm/gic-its.h
>> index 4e9841a..1f881c0 100644
>> --- a/xen/include/asm-arm/gic-its.h
>> +++ b/xen/include/asm-arm/gic-its.h
>> @@ -136,6 +136,12 @@ int gicv3_lpi_allocate_host_lpi(struct host_its *its,
>>  int gicv3_lpi_drop_host_lpi(struct host_its *its,
>>                              uint32_t devid, uint32_t eventid,
>>                              uint32_t host_lpi);
>> +
>> +static inline int gicv3_lpi_get_priority(struct domain *d, uint32_t lpi)
>> +{
>> +    return GIC_PRI_IRQ;
>> +}
> 
> Does it mean that we don't allow changes to LPI priorities?

This is placeholder code for now, until we learn about the virtual
property table in patch 11/24 (where this function gets amended).
The new code drop gets away without this function here entirely.

Cheers,
Andre.

>>  #else
>>  
>>  static inline void gicv3_its_dt_init(const struct dt_device_node *node)
>> @@ -175,6 +181,10 @@ static inline int gicv3_lpi_drop_host_lpi(struct host_its *its,
>>  {
>>      return 0;
>>  }
>> +static inline int gicv3_lpi_get_priority(struct domain *d, uint32_t lpi)
>> +{
>> +    return GIC_PRI_IRQ;
>> +}
>>  
>>  #endif /* CONFIG_HAS_ITS */
>>  
>> diff --git a/xen/include/asm-arm/vgic.h b/xen/include/asm-arm/vgic.h
>> index 300f461..4e29ba6 100644
>> --- a/xen/include/asm-arm/vgic.h
>> +++ b/xen/include/asm-arm/vgic.h
>> @@ -83,6 +83,12 @@ struct pending_irq
>>      struct list_head lr_queue;
>>  };
>>  
>> +struct lpi_pending_irq
>> +{
>> +    struct list_head entry;
>> +    struct pending_irq pirq;
>> +};
>> +
>>  #define NR_INTERRUPT_PER_RANK   32
>>  #define INTERRUPT_RANK_MASK (NR_INTERRUPT_PER_RANK - 1)
>>  
>> @@ -296,8 +302,11 @@ extern struct vcpu *vgic_get_target_vcpu(struct vcpu *v, unsigned int virq);
>>  extern void vgic_vcpu_inject_irq(struct vcpu *v, unsigned int virq);
>>  extern void vgic_vcpu_inject_spi(struct domain *d, unsigned int virq);
>>  extern void vgic_clear_pending_irqs(struct vcpu *v);
>> +extern void vgic_init_pending_irq(struct pending_irq *p, unsigned int virq);
>>  extern struct pending_irq *irq_to_pending(struct vcpu *v, unsigned int irq);
>>  extern struct pending_irq *spi_to_pending(struct domain *d, unsigned int irq);
>> +extern struct pending_irq *lpi_to_pending(struct vcpu *v, unsigned int irq,
>> +                                          bool allocate);
>>  extern struct vgic_irq_rank *vgic_rank_offset(struct vcpu *v, int b, int n, int s);
>>  extern struct vgic_irq_rank *vgic_rank_irq(struct vcpu *v, unsigned int irq);
>>  extern int vgic_emulate(struct cpu_user_regs *regs, union hsr hsr);
>> -- 
>> 2.9.0
>>
> 


* Re: [RFC PATCH 08/24] ARM: GICv3: introduce separate pending_irq structs for LPIs
  2017-01-12 19:14     ` Andre Przywara
@ 2017-01-13 19:37       ` Stefano Stabellini
  2017-01-16  9:44         ` André Przywara
  0 siblings, 1 reply; 144+ messages in thread
From: Stefano Stabellini @ 2017-01-13 19:37 UTC (permalink / raw)
  To: Andre Przywara; +Cc: xen-devel, Julien Grall, Stefano Stabellini

On Thu, 12 Jan 2017, Andre Przywara wrote:
> Hi Stefano,
> 
> as just mentioned in my last reply, I missed that email last time. Sorry
> for that.
> 
> Replying to the comments that still apply to the new drop ...
> 
> On 28/10/16 02:04, Stefano Stabellini wrote:
> > On Wed, 28 Sep 2016, Andre Przywara wrote:
> >> For the same reason that allocating a struct irq_desc for each
> >> possible LPI is not an option, having a struct pending_irq for each LPI
> >> is also not feasible. However we actually only need those when an
> >> interrupt is on a vCPU (or is about to be injected).
> >> Maintain a list of those structs that we can use for the lifecycle of
> >> a guest LPI. We allocate new entries if necessary, however reuse
> >> pre-owned entries whenever possible.
> >> Teach the existing VGIC functions to find the right pointer when being
> >> given a virtual LPI number.
> >>
> >> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> >> ---
> >>  xen/arch/arm/gic.c            |  3 +++
> >>  xen/arch/arm/vgic-v3.c        |  2 ++
> >>  xen/arch/arm/vgic.c           | 56 ++++++++++++++++++++++++++++++++++++++++---
> >>  xen/include/asm-arm/domain.h  |  1 +
> >>  xen/include/asm-arm/gic-its.h | 10 ++++++++
> >>  xen/include/asm-arm/vgic.h    |  9 +++++++
> >>  6 files changed, 78 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
> >> index 63c744a..ebe4035 100644
> >> --- a/xen/arch/arm/gic.c
> >> +++ b/xen/arch/arm/gic.c
> >> @@ -506,6 +506,9 @@ static void gic_update_one_lr(struct vcpu *v, int i)
> >>                  struct vcpu *v_target = vgic_get_target_vcpu(v, irq);
> >>                  irq_set_affinity(p->desc, cpumask_of(v_target->processor));
> >>              }
> >> +            /* If this was an LPI, mark this struct as available again. */
> >> +            if ( p->irq >= 8192 )
> >> +                p->irq = 0;
> > 
> > I believe that 0 is a valid irq number, we need to come up with a
> > different invalid_irq value, and we should #define it. We could also
> > consider checking if the irq is inflight (linked to the inflight list)
> > instead of using irq == 0 to understand if it is reusable.
> 
> But those pending_irqs here are used by LPIs only, where everything
> below 8192 is invalid. So that seemed like an easy and straightforward
> value to use. The other, statically allocated pending_irqs would never
> have an IRQ number of 8192 or above. When searching for an empty pending_irq
> for a new LPI, we would never touch any of the statically allocated
> structs, so this is safe, isn't it?

I think you are right. Still, please #define it.


> >>          }
> >>      }
> >>  }
> >> diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c
> >> index ec038a3..e9b6490 100644
> >> --- a/xen/arch/arm/vgic-v3.c
> >> +++ b/xen/arch/arm/vgic-v3.c
> >> @@ -1388,6 +1388,8 @@ static int vgic_v3_vcpu_init(struct vcpu *v)
> >>      if ( v->vcpu_id == last_cpu || (v->vcpu_id == (d->max_vcpus - 1)) )
> >>          v->arch.vgic.flags |= VGIC_V3_RDIST_LAST;
> >>  
> >> +    INIT_LIST_HEAD(&v->arch.vgic.pending_lpi_list);
> >> +
> >>      return 0;
> >>  }
> >>  
> >> diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c
> >> index 0965119..b961551 100644
> >> --- a/xen/arch/arm/vgic.c
> >> +++ b/xen/arch/arm/vgic.c
> >> @@ -31,6 +31,8 @@
> >>  #include <asm/mmio.h>
> >>  #include <asm/gic.h>
> >>  #include <asm/vgic.h>
> >> +#include <asm/gic_v3_defs.h>
> >> +#include <asm/gic-its.h>
> >>  
> >>  static inline struct vgic_irq_rank *vgic_get_rank(struct vcpu *v, int rank)
> >>  {
> >> @@ -61,7 +63,7 @@ struct vgic_irq_rank *vgic_rank_irq(struct vcpu *v, unsigned int irq)
> >>      return vgic_get_rank(v, rank);
> >>  }
> >>  
> >> -static void vgic_init_pending_irq(struct pending_irq *p, unsigned int virq)
> >> +void vgic_init_pending_irq(struct pending_irq *p, unsigned int virq)
> >>  {
> >>      INIT_LIST_HEAD(&p->inflight);
> >>      INIT_LIST_HEAD(&p->lr_queue);
> >> @@ -244,10 +246,14 @@ struct vcpu *vgic_get_target_vcpu(struct vcpu *v, unsigned int virq)
> >>  
> >>  static int vgic_get_virq_priority(struct vcpu *v, unsigned int virq)
> >>  {
> >> -    struct vgic_irq_rank *rank = vgic_rank_irq(v, virq);
> >> +    struct vgic_irq_rank *rank;
> >>      unsigned long flags;
> >>      int priority;
> >>  
> >> +    if ( virq >= 8192 )
> > 
> > Please introduce a convenience static inline function such as:
> > 
> >   bool is_lpi(unsigned int irq)
> 
> Sure.
> 
> >> +        return gicv3_lpi_get_priority(v->domain, virq);
> >> +
> >> +    rank = vgic_rank_irq(v, virq);
> >>      vgic_lock_rank(v, rank, flags);
> >>      priority = rank->priority[virq & INTERRUPT_RANK_MASK];
> >>      vgic_unlock_rank(v, rank, flags);
> >> @@ -446,13 +452,55 @@ int vgic_to_sgi(struct vcpu *v, register_t sgir, enum gic_sgi_mode irqmode, int
> >>      return 1;
> >>  }
> >>  
> >> +/*
> >> + * Holding struct pending_irq's for each possible virtual LPI in each domain
> >> + * requires too much Xen memory, also a malicious guest could potentially
> >> + * spam Xen with LPI map requests. We cannot cover those with (guest allocated)
> >> + * ITS memory, so we use a dynamic scheme of allocating struct pending_irq's
> >> + * on demand.
> >> + */
> >> +struct pending_irq *lpi_to_pending(struct vcpu *v, unsigned int lpi,
> >> +                                   bool allocate)
> >> +{
> >> +    struct lpi_pending_irq *lpi_irq, *empty = NULL;
> >> +
> >> +    /* TODO: locking! */
> > 
> > Yeah, this needs to be fixed in v1 :-)
> 
> I fixed that in the RFC v2 post.
> 
> > 
> >> +    list_for_each_entry(lpi_irq, &v->arch.vgic.pending_lpi_list, entry)
> >> +    {
> >> +        if ( lpi_irq->pirq.irq == lpi )
> >> +            return &lpi_irq->pirq;
> >> +
> >> +        if ( lpi_irq->pirq.irq == 0 && !empty )
> >> +            empty = lpi_irq;
> >> +    }
> > 
> > This is another one of those cases where a list is too slow for the hot
> > path. The idea of allocating pending_irq struct on demand is good, but
> > storing them in a linked list would kill performance. Probably the best
> > thing we could do is an hashtable and we should preallocate the initial
> > array of elements. I don't know what the size of the initial array
> > should be, but we can start around 50, and change it in the future once
> > we do tests with real workloads. Of course the other key parameter is
> > the hash function, not sure which one is the right one, but ideally we
> > would never have to allocate new pending_irq struct for LPIs because the
> > preallocated set would suffice.
> 
> As I mentioned in the last post, I expect this number to be really low
> (less than 5). 

Are you able to check this assumption in a real scenario? If not you,
somebody else?


> Let's face it: If you have multiple interrupts pending
> for a significant amount of time you won't make any actual progress in
> the guest, because it's busy with handling interrupts.
> So my picture of LPI handling is:
> 1) A device triggers an MSI, so the host receives the LPI. Ideally this
> will be handled by the pCPU where the right VCPU is running atm, so it
> will exit to EL2. Xen will handle the LPI by assigning one struct
> pending_irq to it and will inject it into the guest.
> 2) The VCPU gets to run again and calls the interrupt handler, because
> the (virtual) LPI is pending.
> 3) The (Linux) IRQ handler reads the ICC_IAR register to learn the IRQ
> number, and will get the virtual LPI number.
> => At this point the LPI is done when it comes to the VGIC. The LR state
> will be set to 0 (neither pending nor active). This is independent of the
> EOI the handler will execute soon (or later).
> 4) On the next exit the VGIC code will discover that the IRQ is done
> (LR.state == 0) and will discard the struct pending_irq (set the LPI
> number to 0 to make it available to the next LPI).

I am following


> Even if there would be multiple LPIs pending at the same time (because
> the guest had interrupts disabled, for instance), I believe they can be
> all handled without exiting. Upon EOIing (priority-dropping, really) the
> first LPI, the next virtual LPI would fire, calling the interrupt
> handler again, and so on. Unless the kernel decides to do something that
> exits (even accessing the hardware normally wouldn't, I believe), we can
> clear all pending LPIs in one go.
> 
> So I have a hard time imagining how we can really have many LPIs
> pending and thus struct pending_irqs allocated.
> Note that this may differ from SPIs, for instance, because the IRQ life
> cycle is more complex there (extending till the EOI).
> 
> Does that make some sense? Or am I missing something here?

In my tests with much smaller platforms than the ones existing today, I
could easily have 2-3 interrupts pending at the same time without much
load and without any SR-IOV NICs or any other fancy PCIE hardware. It
would be nice to test on Cavium ThunderX for example. It's also easy to
switch to rbtrees.


> > I could be convinced that a list is sufficient if we do some real
> > benchmarking and it turns out that lpi_to_pending always resolve in less
> > than ~5 steps.
> 
> I can try to do this once I get it running on some silicon ...
> 
> >> +    if ( !allocate )
> >> +        return NULL;
> >> +
> >> +    if ( !empty )
> >> +    {
> >> +        empty = xzalloc(struct lpi_pending_irq);
> >> +        vgic_init_pending_irq(&empty->pirq, lpi);
> >> +        list_add_tail(&empty->entry, &v->arch.vgic.pending_lpi_list);
> >> +    } else
> >> +    {
> >> +        empty->pirq.status = 0;
> >> +        empty->pirq.irq = lpi;
> >> +    }
> >> +
> >> +    return &empty->pirq;
> >> +}
> >> +
> >>  struct pending_irq *irq_to_pending(struct vcpu *v, unsigned int irq)
> >>  {
> >>      struct pending_irq *n;
> >> +
> > 
> > spurious change
> > 
> > 
> >>      /* Pending irqs allocation strategy: the first vgic.nr_spis irqs
> >>       * are used for SPIs; the rests are used for per cpu irqs */
> >>      if ( irq < 32 )
> >>          n = &v->arch.vgic.pending_irqs[irq];
> >> +    else if ( irq >= 8192 )
> > 
> > Use the new static inline
> > 
> > 
> >> +        n = lpi_to_pending(v, irq, true);
> >>      else
> >>          n = &v->domain->arch.vgic.pending_irqs[irq - 32];
> >>      return n;
> >> @@ -480,7 +528,7 @@ void vgic_clear_pending_irqs(struct vcpu *v)
> >>  void vgic_vcpu_inject_irq(struct vcpu *v, unsigned int virq)
> >>  {
> >>      uint8_t priority;
> >> -    struct pending_irq *iter, *n = irq_to_pending(v, virq);
> >> +    struct pending_irq *iter, *n;
> >>      unsigned long flags;
> >>      bool_t running;
> >>  
> >> @@ -488,6 +536,8 @@ void vgic_vcpu_inject_irq(struct vcpu *v, unsigned int virq)
> >>  
> >>      spin_lock_irqsave(&v->arch.vgic.lock, flags);
> >>  
> >> +    n = irq_to_pending(v, virq);
> > 
> > Why this change?
> 
> Because we now need to hold the lock before calling irq_to_pending(),
> which now may call lpi_to_pending().
> 
> >>      /* vcpu offline */
> >>      if ( test_bit(_VPF_down, &v->pause_flags) )
> >>      {
> >> diff --git a/xen/include/asm-arm/domain.h b/xen/include/asm-arm/domain.h
> >> index 9452fcd..ae8a9de 100644
> >> --- a/xen/include/asm-arm/domain.h
> >> +++ b/xen/include/asm-arm/domain.h
> >> @@ -249,6 +249,7 @@ struct arch_vcpu
> >>          paddr_t rdist_base;
> >>  #define VGIC_V3_RDIST_LAST  (1 << 0)        /* last vCPU of the rdist */
> >>          uint8_t flags;
> >> +        struct list_head pending_lpi_list;
> >>      } vgic;
> >>  
> >>      /* Timer registers  */
> >> diff --git a/xen/include/asm-arm/gic-its.h b/xen/include/asm-arm/gic-its.h
> >> index 4e9841a..1f881c0 100644
> >> --- a/xen/include/asm-arm/gic-its.h
> >> +++ b/xen/include/asm-arm/gic-its.h
> >> @@ -136,6 +136,12 @@ int gicv3_lpi_allocate_host_lpi(struct host_its *its,
> >>  int gicv3_lpi_drop_host_lpi(struct host_its *its,
> >>                              uint32_t devid, uint32_t eventid,
> >>                              uint32_t host_lpi);
> >> +
> >> +static inline int gicv3_lpi_get_priority(struct domain *d, uint32_t lpi)
> >> +{
> >> +    return GIC_PRI_IRQ;
> >> +}
> > 
> > Does it mean that we don't allow changes to LPI priorities?
> 
> This is placeholder code for now, until we learn about the virtual
> property table in patch 11/24 (where this function gets amended).
> The new code drop gets away without this function here entirely.
> 
> Cheers,
> Andre.
> 
> >>  #else
> >>  
> >>  static inline void gicv3_its_dt_init(const struct dt_device_node *node)
> >> @@ -175,6 +181,10 @@ static inline int gicv3_lpi_drop_host_lpi(struct host_its *its,
> >>  {
> >>      return 0;
> >>  }
> >> +static inline int gicv3_lpi_get_priority(struct domain *d, uint32_t lpi)
> >> +{
> >> +    return GIC_PRI_IRQ;
> >> +}
> >>  
> >>  #endif /* CONFIG_HAS_ITS */
> >>  
> >> diff --git a/xen/include/asm-arm/vgic.h b/xen/include/asm-arm/vgic.h
> >> index 300f461..4e29ba6 100644
> >> --- a/xen/include/asm-arm/vgic.h
> >> +++ b/xen/include/asm-arm/vgic.h
> >> @@ -83,6 +83,12 @@ struct pending_irq
> >>      struct list_head lr_queue;
> >>  };
> >>  
> >> +struct lpi_pending_irq
> >> +{
> >> +    struct list_head entry;
> >> +    struct pending_irq pirq;
> >> +};
> >> +
> >>  #define NR_INTERRUPT_PER_RANK   32
> >>  #define INTERRUPT_RANK_MASK (NR_INTERRUPT_PER_RANK - 1)
> >>  
> >> @@ -296,8 +302,11 @@ extern struct vcpu *vgic_get_target_vcpu(struct vcpu *v, unsigned int virq);
> >>  extern void vgic_vcpu_inject_irq(struct vcpu *v, unsigned int virq);
> >>  extern void vgic_vcpu_inject_spi(struct domain *d, unsigned int virq);
> >>  extern void vgic_clear_pending_irqs(struct vcpu *v);
> >> +extern void vgic_init_pending_irq(struct pending_irq *p, unsigned int virq);
> >>  extern struct pending_irq *irq_to_pending(struct vcpu *v, unsigned int irq);
> >>  extern struct pending_irq *spi_to_pending(struct domain *d, unsigned int irq);
> >> +extern struct pending_irq *lpi_to_pending(struct vcpu *v, unsigned int irq,
> >> +                                          bool allocate);
> >>  extern struct vgic_irq_rank *vgic_rank_offset(struct vcpu *v, int b, int n, int s);
> >>  extern struct vgic_irq_rank *vgic_rank_irq(struct vcpu *v, unsigned int irq);
> >>  extern int vgic_emulate(struct cpu_user_regs *regs, union hsr hsr);
> >> -- 
> >> 2.9.0
> >>
> > 
> 

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [RFC PATCH 08/24] ARM: GICv3: introduce separate pending_irq structs for LPIs
  2017-01-13 19:37       ` Stefano Stabellini
@ 2017-01-16  9:44         ` André Przywara
  2017-01-16 19:16           ` Stefano Stabellini
  0 siblings, 1 reply; 144+ messages in thread
From: André Przywara @ 2017-01-16  9:44 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: xen-devel, Julien Grall

On 13/01/17 19:37, Stefano Stabellini wrote:
> On Thu, 12 Jan 2017, Andre Przywara wrote:

Hi Stefano,

...

>>>> +    list_for_each_entry(lpi_irq, &v->arch.vgic.pending_lpi_list, entry)
>>>> +    {
>>>> +        if ( lpi_irq->pirq.irq == lpi )
>>>> +            return &lpi_irq->pirq;
>>>> +
>>>> +        if ( lpi_irq->pirq.irq == 0 && !empty )
>>>> +            empty = lpi_irq;
>>>> +    }
>>>
>>> This is another one of those cases where a list is too slow for the hot
>>> path. The idea of allocating pending_irq struct on demand is good, but
>>> storing them in a linked list would kill performance. Probably the best
>>> thing we could do is a hashtable and we should preallocate the initial
>>> array of elements. I don't know what the size of the initial array
>>> should be, but we can start around 50, and change it in the future once
>>> we do tests with real workloads. Of course the other key parameter is
>>> the hash function, not sure which one is the right one, but ideally we
>>> would never have to allocate new pending_irq struct for LPIs because the
>>> preallocated set would suffice.
>>
>> As I mentioned in the last post, I expect this number to be really low
>> (less than 5). 
> 
> Are you able to check this assumption in a real scenario? If not you,
> somebody else?
> 
> 
>> Let's face it: If you have multiple interrupts pending
>> for a significant amount of time you won't make any actual progress in
>> the guest, because it's busy with handling interrupts.
>> So my picture of LPI handling is:
>> 1) A device triggers an MSI, so the host receives the LPI. Ideally this
>> will be handled by the pCPU where the right VCPU is running atm, so it
>> will exit to EL2. Xen will handle the LPI by assigning one struct
>> pending_irq to it and will inject it into the guest.
>> 2) The VCPU gets to run again and calls the interrupt handler, because
>> the (virtual) LPI is pending.
>> 3) The (Linux) IRQ handler reads the ICC_IAR register to learn the IRQ
>> number, and will get the virtual LPI number.
>> => At this point the LPI is done when it comes to the VGIC. The LR state
>> will be set to 0 (neither pending or active). This is independent of the
>> EOI the handler will execute soon (or later).
>> 4) On the next exit the VGIC code will discover that the IRQ is done
>> (LR.state == 0) and will discard the struct pending_irq (set the LPI
>> number to 0 to make it available to the next LPI).
> 
> I am following
> 
> 
>> Even if there would be multiple LPIs pending at the same time (because
>> the guest had interrupts disabled, for instance), I believe they can be
>> all handled without exiting. Upon EOIing (priority-dropping, really) the
>> first LPI, the next virtual LPI would fire, calling the interrupt
>> handler again, and so on. Unless the kernel decides to do something that
>> exits (even accessing the hardware normally wouldn't, I believe), we can
>> clear all pending LPIs in one go.
>>
>> So I have a hard time imagining how we can really have many LPIs
>> pending and thus struct pending_irqs allocated.
>> Note that this may differ from SPIs, for instance, because the IRQ life
>> cycle is more complex there (extending till the EOI).
>>
>> Does that make some sense? Or am I missing something here?
> 
> In my tests with much smaller platforms than the ones existing today, I
> could easily have 2-3 interrupts pending at the same time without much
> load and without any SR-IOV NICs or any other fancy PCIe hardware.

The difference with LPIs is that SPIs can be level triggered (eventually
requiring a driver to remove the interrupt condition in the device) and
also require an explicit deactivation to finish off the IRQ state machine.
Both of these things cause an IRQ to stay much longer in the LRs than one
would expect for an always edge-triggered LPI, which lacks an active
state.
Also the timer IRQ is a PPI and thus a frequent visitor in the LRs.

> It would be nice to test on Cavium ThunderX for example.

Yes, I agree that there is quite some guessing involved, so proving this
sounds like a worthwhile task.

> It's also easy to switch to rbtrees.

On Friday I looked at rbtrees in Xen, which thankfully seem to be the
same as in Linux. So I converted the its_devices list over.

But in this case I don't believe that rbtrees are the best data
structure, since we frequently need to look up entries, but also need to
find new, empty ones (when an LPI has fired).
And for allocating new LPIs we don't need a particular slot; any free
one would do. We probably want to avoid actually malloc-ing pending_irq
structures for that.

So a hash table with open addressing sounds like a better fit here:
- With a clever hash function (taking, for instance, Linux's LPI
allocation scheme into account) we get very quick lookup times for
already assigned LPIs.
- Assigning an LPI would use the same hash function, probably finding an
unused pending_irq, which we could then easily allocate.

I will try to write something along those lines.

Cheers,
Andre.




* Re: [RFC PATCH 08/24] ARM: GICv3: introduce separate pending_irq structs for LPIs
  2017-01-16  9:44         ` André Przywara
@ 2017-01-16 19:16           ` Stefano Stabellini
  0 siblings, 0 replies; 144+ messages in thread
From: Stefano Stabellini @ 2017-01-16 19:16 UTC (permalink / raw)
  To: André Przywara; +Cc: xen-devel, Julien Grall, Stefano Stabellini


On Mon, 16 Jan 2017, André Przywara wrote:
> On 13/01/17 19:37, Stefano Stabellini wrote:
> > On Thu, 12 Jan 2017, Andre Przywara wrote:
> 
> Hi Stefano,
> 
> ...
> 
> >>>> +    list_for_each_entry(lpi_irq, &v->arch.vgic.pending_lpi_list, entry)
> >>>> +    {
> >>>> +        if ( lpi_irq->pirq.irq == lpi )
> >>>> +            return &lpi_irq->pirq;
> >>>> +
> >>>> +        if ( lpi_irq->pirq.irq == 0 && !empty )
> >>>> +            empty = lpi_irq;
> >>>> +    }
> >>>
> >>> This is another one of those cases where a list is too slow for the hot
> >>> path. The idea of allocating pending_irq struct on demand is good, but
> >>> storing them in a linked list would kill performance. Probably the best
> >>> thing we could do is a hashtable and we should preallocate the initial
> >>> array of elements. I don't know what the size of the initial array
> >>> should be, but we can start around 50, and change it in the future once
> >>> we do tests with real workloads. Of course the other key parameter is
> >>> the hash function, not sure which one is the right one, but ideally we
> >>> would never have to allocate new pending_irq struct for LPIs because the
> >>> preallocated set would suffice.
> >>
> >> As I mentioned in the last post, I expect this number to be really low
> >> (less than 5). 
> > 
> > Are you able to check this assumption in a real scenario? If not you,
> > somebody else?
> > 
> > 
> >> Let's face it: If you have multiple interrupts pending
> >> for a significant amount of time you won't make any actual progress in
> >> the guest, because it's busy with handling interrupts.
> >> So my picture of LPI handling is:
> >> 1) A device triggers an MSI, so the host receives the LPI. Ideally this
> >> will be handled by the pCPU where the right VCPU is running atm, so it
> >> will exit to EL2. Xen will handle the LPI by assigning one struct
> >> pending_irq to it and will inject it into the guest.
> >> 2) The VCPU gets to run again and calls the interrupt handler, because
> >> the (virtual) LPI is pending.
> >> 3) The (Linux) IRQ handler reads the ICC_IAR register to learn the IRQ
> >> number, and will get the virtual LPI number.
> >> => At this point the LPI is done when it comes to the VGIC. The LR state
> >> will be set to 0 (neither pending or active). This is independent of the
> >> EOI the handler will execute soon (or later).
> >> 4) On the next exit the VGIC code will discover that the IRQ is done
> >> (LR.state == 0) and will discard the struct pending_irq (set the LPI
> >> number to 0 to make it available to the next LPI).
> > 
> > I am following
> > 
> > 
> >> Even if there would be multiple LPIs pending at the same time (because
> >> the guest had interrupts disabled, for instance), I believe they can be
> >> all handled without exiting. Upon EOIing (priority-dropping, really) the
> >> first LPI, the next virtual LPI would fire, calling the interrupt
> >> handler again, and so on. Unless the kernel decides to do something that
> >> exits (even accessing the hardware normally wouldn't, I believe), we can
> >> clear all pending LPIs in one go.
> >>
> >> So I have a hard time imagining how we can really have many LPIs
> >> pending and thus struct pending_irqs allocated.
> >> Note that this may differ from SPIs, for instance, because the IRQ life
> >> cycle is more complex there (extending till the EOI).
> >>
> >> Does that make some sense? Or am I missing something here?
> > 
> > In my tests with much smaller platforms than the ones existing today, I
> > could easily have 2-3 interrupts pending at the same time without much
> > load and without any SR-IOV NICs or any other fancy PCIe hardware.
> 
> The difference with LPIs is that SPIs can be level triggered (eventually
> requiring a driver to remove the interrupt condition in the device) and
> also require an explicit deactivation to finish off the IRQ state machine.
> Both of these things cause an IRQ to stay much longer in the LRs than one
> would expect for an always edge-triggered LPI, which lacks an active
> state.
> Also the timer IRQ is a PPI and thus a frequent visitor in the LRs.
> 
> > It would be nice to test on Cavium ThunderX for example.
> 
> Yes, I agree that there is quite some guessing involved, so proving this
> sounds like a worthwhile task.
> 
> > It's also easy to switch to rbtrees.
> 
> On Friday I looked at rbtrees in Xen, which thankfully seem to be the
> same as in Linux. So I converted the its_devices list over.
> 
> But in this case I don't believe that rbtrees are the best data
> structure, since we frequently need to look up entries, but also need to
> find new, empty ones (when an LPI has fired).
> And for allocating new LPIs we don't need a particular slot; any free
> one would do. We probably want to avoid actually malloc-ing pending_irq
> structures for that.
> 
> So a hash table with open addressing sounds like a better fit here:
> - With a clever hash function (taking, for instance, Linux's LPI
> allocation scheme into account) we get very quick lookup times for
> already assigned LPIs.
> - Assigning an LPI would use the same hash function, probably finding an
> unused pending_irq, which we could then easily allocate.
> 
> I will try to write something along those lines.

Thanks Andre, indeed it sounds like a better option.




* Re: [RFC PATCH 05/24] ARM: GICv3 ITS: introduce ITS command handling
  2016-11-02 15:05   ` Julien Grall
@ 2017-01-31  9:10     ` Andre Przywara
  2017-01-31 10:23       ` Julien Grall
  0 siblings, 1 reply; 144+ messages in thread
From: Andre Przywara @ 2017-01-31  9:10 UTC (permalink / raw)
  To: Julien Grall, Stefano Stabellini; +Cc: xen-devel

Hi Julien,

(forgot to hit the Send button yesterday ...)

just going over the review comments again and found some leftovers.
I fixed/addressed all comments of yours that I don't explicitly refer to
here.

...

On 02/11/16 15:05, Julien Grall wrote:
> Hi Andre,
> 
> On 28/09/16 19:24, Andre Przywara wrote:
>> To be able to easily send commands to the ITS, create the respective
>> wrapper functions, which take care of the ring buffer.
>> The first two commands we implement provide methods to map a collection
>> to a redistributor (aka host core) and to flush the command queue (SYNC).
>> Start using these commands for mapping one collection to each host CPU.
>>
>> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
>> ---
>>  xen/arch/arm/gic-its.c        | 101
>> ++++++++++++++++++++++++++++++++++++++++++
>>  xen/arch/arm/gic-v3.c         |  17 +++++++
>>  xen/include/asm-arm/gic-its.h |  32 +++++++++++++
>>  3 files changed, 150 insertions(+)
>>
>> diff --git a/xen/arch/arm/gic-its.c b/xen/arch/arm/gic-its.c
>> index c8a7a7e..88397bc 100644
>> --- a/xen/arch/arm/gic-its.c
>> +++ b/xen/arch/arm/gic-its.c
>> @@ -33,6 +33,10 @@ static struct {
>>      int host_lpi_bits;
>>  } lpi_data;
>>
>> +/* Physical redistributor address */
>> +static DEFINE_PER_CPU(uint64_t, rdist_addr);
> 
> The type should be paddr_t.
> 
>> +/* Redistributor ID */
>> +static DEFINE_PER_CPU(uint64_t, rdist_id);
>>  /* Pending table for each redistributor */
>>  static DEFINE_PER_CPU(void *, pending_table);
>>
>> @@ -40,6 +44,86 @@ static DEFINE_PER_CPU(void *, pending_table);
>>          min_t(unsigned int, lpi_data.host_lpi_bits,
>> CONFIG_HOST_LPI_BITS)
>>  #define MAX_HOST_LPIS   (BIT(MAX_HOST_LPI_BITS) - 8192)
>>
>> +#define ITS_COMMAND_SIZE        32
>> +
>> +static int its_send_command(struct host_its *hw_its, void *its_cmd)
> 
> The its_cmd could be const as you don't modify it.
> 
>> +{
>> +    int readp, writep;
> 
> Please use uint32_t (or maybe uint64_t) here.
> 
>> +
>> +    spin_lock(&hw_its->cmd_lock);
>> +
>> +    readp = readl_relaxed(hw_its->its_base + GITS_CREADR) &
>> GENMASK(19, 5);
>> +    writep = readl_relaxed(hw_its->its_base + GITS_CWRITER) &
>> GENMASK(19, 5);
> 
> Please introduce a define for the GENMASK(19, 5) rather than hardcoding
> it in multiple place.
> 
>> +
>> +    if ( ((writep + ITS_COMMAND_SIZE) % PAGE_SIZE) == readp )
>> +    {
>> +        spin_unlock(&hw_its->cmd_lock);
>> +        return -EBUSY;
>> +    }
>> +
>> +    memcpy(hw_its->cmd_buf + writep, its_cmd, ITS_COMMAND_SIZE);
>> +    __flush_dcache_area(hw_its->cmd_buf + writep, ITS_COMMAND_SIZE);
> 
> Why the flush here? From patch #4, the GIC has been configured to be
> able to snoop the cache. So a dsb(ish) would be enough here.
> 
>> +    writep = (writep + ITS_COMMAND_SIZE) % PAGE_SIZE;
>> +
>> +    writeq_relaxed(writep & GENMASK(19, 5), hw_its->its_base +
>> GITS_CWRITER);
>> +
>> +    spin_unlock(&hw_its->cmd_lock);
>> +
>> +    return 0;
> 
> This function return either -EBUSY or 0. Would not it be better to
> return a bool instead?

I'd rather keep this in line with the general UNIX/Linux way of
returning an int, with 0 on success and a negative error number on
failure. This allows easier fixing when we introduce more error handling
in the future (SErrors?).
So callers just pass on the return value and it's up to the leaf
functions to come up with the proper error value.

>> +}
>> +
>> +static uint64_t encode_rdbase(struct host_its *hw_its, int cpu,
>> uint64_t reg)
>> +{
>> +    reg &= ~GENMASK(51, 16);
>> +
>> +    if ( hw_its->pta )
>> +        reg |= per_cpu(rdist_addr, cpu) & GENMASK(51, 16);
>> +    else
>> +        reg |= per_cpu(rdist_id, cpu) << 16;
> 
> I would prefer if we setup the target address at initialize per-cpu
> rather than doing it every time we send a sync command (or else).

I believe we can't do that easily, because the PTA bit is per ITS, not
per redistributor. I see that it's rather unlikely to have ITSes with
different PTA bit settings in one system, but architecturally it's possible.

>> +
>> +    return reg;
>> +}
>> +
>> +static int its_send_cmd_sync(struct host_its *its, int cpu)
>> +{
>> +    uint64_t cmd[4];
>> +
>> +    cmd[0] = GITS_CMD_SYNC;
>> +    cmd[1] = 0x00;
>> +    cmd[2] = encode_rdbase(its, cpu, 0x0);
>> +    cmd[3] = 0x00;
>> +
>> +    return its_send_command(its, cmd);
>> +}
>> +
>> +static int its_send_cmd_mapc(struct host_its *its, int collection_id,
>> int cpu)
>> +{
>> +    uint64_t cmd[4];
>> +
>> +    cmd[0] = GITS_CMD_MAPC;
>> +    cmd[1] = 0x00;
>> +    cmd[2] = encode_rdbase(its, cpu, (collection_id & GENMASK(15, 0))
>> | BIT(63));
>> +    cmd[3] = 0x00;
>> +
>> +    return its_send_command(its, cmd);
>> +}
>> +
>> +/* Set up the (1:1) collection mapping for the given host CPU. */
>> +void gicv3_its_setup_collection(int cpu)
>> +{
>> +    struct host_its *its;
>> +
>> +    list_for_each_entry(its, &host_its_list, entry)
>> +    {
>> +        /* Only send commands to ITS that have been initialized
>> already. */
>> +        if ( !its->cmd_buf )
>> +            continue;
>> +
>> +        its_send_cmd_mapc(its, cpu, cpu);
>> +        its_send_cmd_sync(its, cpu);
> 
> Looking at the implementation of its_send_cmd_*, the functions may
> return an error if the command queue is full. However you don't check
> the return, and continue as it was fine. We will get in trouble much later.
> 
> Furthermore, sending the SYNC command does not meaning the ITS has
> executed the command. You have to ensure that GITS_CREADR ==
> GITS_CWRITER and I didn't find this code within this series.

Originally I didn't care so much about handling ITS command queue
errors, because traditionally we couldn't do too much about them, since
there is no good way of communicating failure back to the guest. But
since we now only issue commands outside of a non-Dom0 guest context, I
added the error handling you suggested and pass any error up the call chain.
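For illustration, the drain check Julien describes could look roughly like the sketch below: after advancing GITS_CWRITER, poll GITS_CREADR until it catches up, with a bounded number of tries so a wedged ITS cannot hang Xen. Register access is abstracted as a callback so the snippet is self-contained; in Xen this would be readq_relaxed() on the ITS MMIO frame, and the try budget is a made-up placeholder.

```c
/*
 * Illustrative sketch only: poll until the ITS has consumed all queued
 * commands (GITS_CREADR == GITS_CWRITER), giving up after a bounded
 * number of tries. Not the actual Xen code.
 */
#include <assert.h>
#include <stdint.h>

#define GITS_CWRITER         0x0088
#define GITS_CREADR          0x0090
#define ITS_CMD_OFFSET_MASK  0xfffe0ULL   /* bits [19:5], as in the patch */
#define ITS_DRAIN_TRIES      1000000      /* arbitrary budget */

/* Stand-in for readq_relaxed() on the ITS MMIO frame. */
typedef uint64_t (*reg_read_fn)(void *ctx, unsigned int offset);

/* Returns 0 once the read pointer has caught up, -1 on "timeout". */
static int its_wait_commands(reg_read_fn read_reg, void *ctx)
{
    uint64_t writep = read_reg(ctx, GITS_CWRITER) & ITS_CMD_OFFSET_MASK;
    unsigned long tries = ITS_DRAIN_TRIES;

    do {
        uint64_t readp = read_reg(ctx, GITS_CREADR) & ITS_CMD_OFFSET_MASK;

        if ( readp == writep )
            return 0;               /* all queued commands executed */
    } while ( --tries );

    return -1;                      /* would be -ETIMEDOUT in Xen */
}
```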

> 
>> +    }
>> +}
>> +
>>  #define BASER_ATTR_MASK                                           \
>>          ((0x3UL << GITS_BASER_SHAREABILITY_SHIFT)               | \
>>           (0x7UL << GITS_BASER_OUTER_CACHEABILITY_SHIFT)         | \
>> @@ -147,6 +231,13 @@ int gicv3_its_init(struct host_its *hw_its)
>>      if ( !hw_its->its_base )
>>          return -ENOMEM;
>>
>> +    /* Make sure the ITS is disabled before programming the BASE
>> registers. */
>> +    reg = readl_relaxed(hw_its->its_base + GITS_CTLR);
>> +    writel_relaxed(reg & ~GITS_CTLR_ENABLE, hw_its->its_base +
>> GITS_CTLR);
> 
> The spec (6.2.1 in IHI 0069C) requires the ITS to be disabled and
> quiescent before programming the BASE registers. So I don't think this
> check is enough here.
> 
>> +
>> +    reg = readq_relaxed(hw_its->its_base + GITS_TYPER);
>> +    hw_its->pta = reg & GITS_TYPER_PTA;
>> +
>>      for (i = 0; i < 8; i++)
>>      {
>>          void __iomem *basereg = hw_its->its_base + GITS_BASER0 + i * 8;
>> @@ -174,9 +265,18 @@ int gicv3_its_init(struct host_its *hw_its)
>>      if ( IS_ERR(hw_its->cmd_buf) )
>>          return PTR_ERR(hw_its->cmd_buf);
>>
>> +    its_send_cmd_mapc(hw_its, smp_processor_id(), smp_processor_id());
>> +    its_send_cmd_sync(hw_its, smp_processor_id());
> 
> See my comments on the previous its_send_* functions
> 
>> +
>>      return 0;
>>  }
>>
>> +void gicv3_set_redist_addr(paddr_t address, int redist_id)
> 
> The second parameter should probably be unsigned, maybe uint64_t?

Well, the processor ID is a 16-bit value and nicely aligns with Xen's
VCPUIDs, which are declared as "int" in struct vcpu.
So I'd rather keep it as int here.

>> +{
>> +    this_cpu(rdist_addr) = address;
>> +    this_cpu(rdist_id) = redist_id;
>> +}
>> +
>>  uint64_t gicv3_lpi_allocate_pendtable(void)
>>  {
>>      uint64_t reg, attr;
>> @@ -265,6 +365,7 @@ void gicv3_its_dt_init(const struct dt_device_node
>> *node)
>>          its_data->addr = addr;
>>          its_data->size = size;
>>          its_data->dt_node = its;
>> +        spin_lock_init(&its_data->cmd_lock);
>>
>>          printk("GICv3: Found ITS @0x%lx\n", addr);
>>
>> diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
>> index 5cf4618..b9387a3 100644
>> --- a/xen/arch/arm/gic-v3.c
>> +++ b/xen/arch/arm/gic-v3.c
>> @@ -638,6 +638,8 @@ static void gicv3_rdist_init_lpis(void __iomem *
>> rdist_base)
>>      table_reg = gicv3_lpi_get_proptable();
>>      if ( table_reg )
>>          writeq_relaxed(table_reg, rdist_base + GICR_PROPBASER);
>> +
>> +    gicv3_its_setup_collection(smp_processor_id());
>>  }
>>
>>  static int __init gicv3_populate_rdist(void)
>> @@ -684,7 +686,22 @@ static int __init gicv3_populate_rdist(void)
>>                  this_cpu(rbase) = ptr;
>>
>>                  if ( typer & GICR_TYPER_PLPIS )
>> +                {
>> +                    paddr_t rdist_addr;
>> +
>> +                    rdist_addr = gicv3.rdist_regions[i].base;
>> +                    rdist_addr += ptr - gicv3.rdist_regions[i].map_base;
>> +
>> +                    /* The ITS refers to redistributors either by
>> their physical
> 
> Coding style:
> 
> /*
>  * Foo

Oh, right, I think I copied this commenting style wrongly from some
other Xen VGIC code. Fixed that now.

Cheers,
Andre.

>> +                     * address or by their ID. Determine those two
>> values and
>> +                     * let the ITS code store them in per host CPU
>> variables to
>> +                     * later be able to address those redistributors.
>> +                     */
>> +                    gicv3_set_redist_addr(rdist_addr,
>> +                                          (typer >> 8) & GENMASK(15,
>> 0));
> 
> Please avoid hardcoding mask and use a define.
> 
>> +
>>                      gicv3_rdist_init_lpis(ptr);
>> +                }
>>
>>                  printk("GICv3: CPU%d: Found redistributor in region
>> %d @%p\n",
>>                          smp_processor_id(), i, ptr);
>> diff --git a/xen/include/asm-arm/gic-its.h
>> b/xen/include/asm-arm/gic-its.h
>> index b2a003f..b49d274 100644
>> --- a/xen/include/asm-arm/gic-its.h
>> +++ b/xen/include/asm-arm/gic-its.h
>> @@ -37,6 +37,7 @@
>>
>>  /* Register bits */
>>  #define GITS_CTLR_ENABLE     0x1
>> +#define GITS_TYPER_PTA       BIT(19)
>>  #define GITS_IIDR_VALUE      0x34c
>>
>>  #define GITS_BASER_VALID                BIT(63)
>> @@ -59,6 +60,22 @@
>>                                          (31UL <<
>> GITS_BASER_ENTRY_SIZE_SHIFT) |\
>>                                          GITS_BASER_INDIRECT)
>>
>> +/* ITS command definitions */
>> +#define ITS_CMD_SIZE                    32
>> +
>> +#define GITS_CMD_MOVI                   0x01
>> +#define GITS_CMD_INT                    0x03
>> +#define GITS_CMD_CLEAR                  0x04
>> +#define GITS_CMD_SYNC                   0x05
>> +#define GITS_CMD_MAPD                   0x08
>> +#define GITS_CMD_MAPC                   0x09
>> +#define GITS_CMD_MAPTI                  0x0a
>> +#define GITS_CMD_MAPI                   0x0b
>> +#define GITS_CMD_INV                    0x0c
>> +#define GITS_CMD_INVALL                 0x0d
>> +#define GITS_CMD_MOVALL                 0x0e
>> +#define GITS_CMD_DISCARD                0x0f
>> +
>>  #ifndef __ASSEMBLY__
>>  #include <xen/device_tree.h>
>>
>> @@ -69,7 +86,9 @@ struct host_its {
>>      paddr_t addr;
>>      paddr_t size;
>>      void __iomem *its_base;
>> +    spinlock_t cmd_lock;
>>      void *cmd_buf;
>> +    bool pta;
>>  };
>>
>>  extern struct list_head host_its_list;
>> @@ -89,6 +108,12 @@ uint64_t gicv3_lpi_allocate_pendtable(void);
>>  int gicv3_lpi_init_host_lpis(int nr_lpis);
>>  int gicv3_its_init(struct host_its *hw_its);
>>
>> +/* Set the physical address and ID for each redistributor as read
>> from DT. */
>> +void gicv3_set_redist_addr(paddr_t address, int redist_id);
>> +
>> +/* Map a collection for this host CPU to each host ITS. */
>> +void gicv3_its_setup_collection(int cpu);
>> +
>>  #else
>>
>>  static inline void gicv3_its_dt_init(const struct dt_device_node *node)
>> @@ -110,6 +135,13 @@ static inline int gicv3_its_init(struct host_its
>> *hw_its)
>>  {
>>      return 0;
>>  }
> 
> Newline here
> 
>> +static inline void gicv3_set_redist_addr(paddr_t address, int redist_id)
>> +{
>> +}
> 
> Ditto
> 
>> +static inline void gicv3_its_setup_collection(int cpu)
>> +{
>> +}
>> +
>>  #endif /* CONFIG_HAS_ITS */
>>
>>  #endif /* __ASSEMBLY__ */
>>
> 
> Regards,
> 



* Re: [RFC PATCH 11/24] ARM: vGICv3: handle virtual LPI pending and property tables
  2016-11-02 17:18   ` Julien Grall
  2016-11-02 17:41     ` Stefano Stabellini
@ 2017-01-31  9:10     ` Andre Przywara
  2017-01-31 10:38       ` Julien Grall
  1 sibling, 1 reply; 144+ messages in thread
From: Andre Przywara @ 2017-01-31  9:10 UTC (permalink / raw)
  To: Julien Grall, Stefano Stabellini; +Cc: xen-devel, Steve Capper

Hi Julien,

(forgot to hit the Send button yesterday ...)

....

On 02/11/16 17:18, Julien Grall wrote:
> Hi Andre,
> 
> On 28/09/16 19:24, Andre Przywara wrote:
>> Allow a guest to provide the address and size for the memory regions
>> it has reserved for the GICv3 pending and property tables.
>> We sanitise the various fields of the respective redistributor
>> registers and map those pages into Xen's address space to have easy
>> access.
>>
>> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
>> ---
>>  xen/arch/arm/vgic-v3.c        | 189
>> ++++++++++++++++++++++++++++++++++++++----
>>  xen/arch/arm/vgic.c           |   4 +
>>  xen/include/asm-arm/domain.h  |   7 +-
>>  xen/include/asm-arm/gic-its.h |  10 ++-
>>  xen/include/asm-arm/vgic.h    |   3 +
>>  5 files changed, 197 insertions(+), 16 deletions(-)
>>
>> diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c
>> index e9b6490..8fe8386 100644
>> --- a/xen/arch/arm/vgic-v3.c
>> +++ b/xen/arch/arm/vgic-v3.c
>> @@ -20,12 +20,14 @@
>>
>>  #include <xen/bitops.h>
>>  #include <xen/config.h>
>> +#include <xen/domain_page.h>
>>  #include <xen/lib.h>
>>  #include <xen/init.h>
>>  #include <xen/softirq.h>
>>  #include <xen/irq.h>
>>  #include <xen/sched.h>
>>  #include <xen/sizes.h>
>> +#include <xen/vmap.h>
>>  #include <asm/current.h>
>>  #include <asm/mmio.h>
>>  #include <asm/gic_v3_defs.h>
>> @@ -228,12 +230,14 @@ static int __vgic_v3_rdistr_rd_mmio_read(struct
>> vcpu *v, mmio_info_t *info,
>>          goto read_reserved;
>>
>>      case VREG64(GICR_PROPBASER):
>> -        /* LPI's not implemented */
>> -        goto read_as_zero_64;
>> +        if ( !vgic_reg64_check_access(dabt) ) goto bad_width;
>> +        *r = vgic_reg64_extract(v->domain->arch.vgic.rdist_propbase,
>> info);
>> +        return 1;
>>
>>      case VREG64(GICR_PENDBASER):
>> -        /* LPI's not implemented */
>> -        goto read_as_zero_64;
>> +        if ( !vgic_reg64_check_access(dabt) ) goto bad_width;
>> +        *r = vgic_reg64_extract(v->arch.vgic.rdist_pendbase, info);
> 
> The field PTZ read as 0.
> 
>> +        return 1;
>>
>>      case 0x0080:
>>          goto read_reserved;
>> @@ -301,11 +305,6 @@ bad_width:
>>      domain_crash_synchronous();
>>      return 0;
>>
>> -read_as_zero_64:
>> -    if ( !vgic_reg64_check_access(dabt) ) goto bad_width;
>> -    *r = 0;
>> -    return 1;
>> -
>>  read_as_zero_32:
>>      if ( dabt.size != DABT_WORD ) goto bad_width;
>>      *r = 0;
>> @@ -330,11 +329,149 @@ read_unknown:
>>      return 1;
>>  }
>>
>> +static uint64_t vgic_sanitise_field(uint64_t reg, uint64_t field_mask,
>> +                                    int field_shift,
>> +                                    uint64_t (*sanitise_fn)(uint64_t))
>> +{
>> +    uint64_t field = (reg & field_mask) >> field_shift;
>> +
>> +    field = sanitise_fn(field) << field_shift;
> 
> Newline here please.
> 
>> +    return (reg & ~field_mask) | field;
>> +}
>> +
>> +/* We want to avoid outer shareable. */
>> +static uint64_t vgic_sanitise_shareability(uint64_t field)
>> +{
>> +    switch (field) {
>> +    case GIC_BASER_OuterShareable:
>> +        return GIC_BASER_InnerShareable;
>> +    default:
>> +        return field;
>> +    }
>> +}
> 
> I am not sure to understand why we need to sanitise the value here. From
> my understanding of the spec (see 8.11.18 in IHI 0069C) we should
> support any shareability/cacheability, correct?

No, actually an ITS is free to support only _one_ of those attributes,
up to the point where it is read-only:

"It is IMPLEMENTATION DEFINED whether this field has a fixed value or
can be programmed by software. Implementing this field with a fixed
value is deprecated."

So we support more than one value, but refuse the ones that are really
not useful. This goes in line with the KVM implementation.

For the rest of the comments regarding the memory tables setup:
I effectively rewrote this in the new series, so I think the majority of
the comments don't apply anymore, hopefully the rewrite actually fixed
the issues you mentioned. So I refrain from any comments now and look
forward to a review of the new approach ;-)

Cheers,
Andre.

>> +
>> +/* Avoid any inner non-cacheable mapping. */
>> +static uint64_t vgic_sanitise_inner_cacheability(uint64_t field)
>> +{
>> +    switch (field) {
>> +    case GIC_BASER_CACHE_nCnB:
>> +    case GIC_BASER_CACHE_nC:
>> +        return GIC_BASER_CACHE_RaWb;
>> +    default:
>> +        return field;
>> +    }
>> +}
>> +
>> +/* Non-cacheable or same-as-inner are OK. */
>> +static uint64_t vgic_sanitise_outer_cacheability(uint64_t field)
>> +{
>> +    switch (field) {
>> +    case GIC_BASER_CACHE_SameAsInner:
>> +    case GIC_BASER_CACHE_nC:
>> +        return field;
>> +    default:
>> +        return GIC_BASER_CACHE_nC;
>> +    }
>> +}
>> +
>> +static uint64_t sanitize_propbaser(uint64_t reg)
>> +{
>> +    reg = vgic_sanitise_field(reg, GICR_PROPBASER_SHAREABILITY_MASK,
>> +                              GICR_PROPBASER_SHAREABILITY_SHIFT,
>> +                              vgic_sanitise_shareability);
>> +    reg = vgic_sanitise_field(reg,
>> GICR_PROPBASER_INNER_CACHEABILITY_MASK,
>> +                              GICR_PROPBASER_INNER_CACHEABILITY_SHIFT,
>> +                              vgic_sanitise_inner_cacheability);
>> +    reg = vgic_sanitise_field(reg,
>> GICR_PROPBASER_OUTER_CACHEABILITY_MASK,
>> +                              GICR_PROPBASER_OUTER_CACHEABILITY_SHIFT,
>> +                              vgic_sanitise_outer_cacheability);
>> +
>> +    reg &= ~PROPBASER_RES0_MASK;
>> +    reg &= ~GENMASK(51, 48);
> 
> Why do you mask bits 51:48? There is no restriction in Xen about the
> size of the IPA (though 52-bit support is part of ARMv8.2), so we
> should avoid open-coding masks everywhere in the code. Otherwise it will
> be more painful to extend the number of bits supported.
> 
> FWIW, all the p2m code is checking whether the IPA is supported.
> 
>> +    return reg;
>> +}
>> +
>> +static uint64_t sanitize_pendbaser(uint64_t reg)
>> +{
>> +    reg = vgic_sanitise_field(reg, GICR_PENDBASER_SHAREABILITY_MASK,
>> +                              GICR_PENDBASER_SHAREABILITY_SHIFT,
>> +                              vgic_sanitise_shareability);
>> +    reg = vgic_sanitise_field(reg,
>> GICR_PENDBASER_INNER_CACHEABILITY_MASK,
>> +                              GICR_PENDBASER_INNER_CACHEABILITY_SHIFT,
>> +                              vgic_sanitise_inner_cacheability);
>> +    reg = vgic_sanitise_field(reg,
>> GICR_PENDBASER_OUTER_CACHEABILITY_MASK,
>> +                              GICR_PENDBASER_OUTER_CACHEABILITY_SHIFT,
>> +                              vgic_sanitise_outer_cacheability);
>> +
>> +    reg &= ~PENDBASER_RES0_MASK;
>> +    reg &= ~GENMASK(51, 48);
> 
> Ditto.
> 
>> +    return reg;
>> +}
>> +
>> +/*
>> + * Allow mapping some parts of guest memory into Xen's VA space to
>> have easy
>> + * access to it. This is to allow ITS configuration data to be held in
>> + * guest memory and avoid using Xen memory for that.
>> + */
>> +void *map_guest_pages(struct domain *d, paddr_t guest_addr, int
>> nr_pages)
> 
> Please pass a gfn_t rather than paddr_t.
> 
>> +{
>> +    mfn_t onepage;
>> +    mfn_t *pages;
> 
> s/pages/mfns/
> 
>> +    int i;
>> +    void *ptr;
>> +
>> +    /* TODO: free previous mapping, change prototype? use
>> get-put-put? */
>> +
>> +    guest_addr &= PAGE_MASK;
>> +
>> +    if ( nr_pages == 1 )
>> +    {
>> +        pages = &onepage;
>> +    } else
>> +    {
>> +        pages = xmalloc_array(mfn_t, nr_pages);
>> +        if ( !pages )
>> +            return NULL;
>> +    }
>> +
>> +    for (i = 0; i < nr_pages; i++)
>> +    {
>> +        get_page_from_gfn(d, (guest_addr >> PAGE_SHIFT) + i, NULL,
>> P2M_ALLOC);
> 
> get_page_from_gfn can fail if you try to get a page on memory that is
> not backed by a RAM region. Also get_page_from_gfn will work on foreign
> mappings; we don't want the guest using foreign memory (e.g. memory
> belonging to another domain) for the ITS internal memory.
> 
> Also, please try to pay attention to error paths whilst you write code.
> It is a pain to handle them after the code has been written. I will try
> to point them out when I spot them.
> 
>> +        pages[i] = _mfn((guest_addr + i * PAGE_SIZE) >> PAGE_SHIFT);
> 
> You cannot assume a 1:1 mapping between the IPA and the PA. Please use
> the struct page_info returned by get_page_from_gfn
> 
>> +    }
>> +
>> +    ptr = vmap(pages, nr_pages);
> 
> I am not a big fan of the vmap solution for various reasons:
>     - the VMAP area is small (only 1GB), so it will not scale (you seem to
> use it to map pretty much all memory provisioned for the ITS)
>     - writing to a register cannot fail, how do you cope with that?
> 
> I think the best approach here is to use a similar approach as
> copy_*_guests helpers but dealing with IPA rather than guest VA.
> 
>> +
>> +    if ( nr_pages > 1 )
>> +        xfree(pages);
>> +
>> +    return ptr;
>> +}
>> +
>> +void unmap_guest_pages(void *va, int nr_pages)
>> +{
>> +    paddr_t pa;
>> +    unsigned long i;
>> +
>> +    if ( !va )
>> +        return;
>> +
>> +    va = (void *)((uintptr_t)va & PAGE_MASK);
>> +    pa = virt_to_maddr(va);
>> +
>> +    vunmap(va);
>> +    for (i = 0; i < nr_pages; i++)
>> +        put_page(mfn_to_page((pa >> PAGE_SHIFT) + i));
>> +
>> +    return;
>> +}
>> +
>>  static int __vgic_v3_rdistr_rd_mmio_write(struct vcpu *v, mmio_info_t
>> *info,
>>                                            uint32_t gicr_reg,
>>                                            register_t r)
>>  {
>>      struct hsr_dabt dabt = info->dabt;
>> +    uint64_t reg;
>>
>>      switch ( gicr_reg )
>>      {
>> @@ -375,13 +512,37 @@ static int __vgic_v3_rdistr_rd_mmio_write(struct
>> vcpu *v, mmio_info_t *info,
>>      case 0x0050:
>>          goto write_reserved;
>>
>> -    case VREG64(GICR_PROPBASER):
>> -        /* LPI is not implemented */
>> -        goto write_ignore_64;
>> +    case VREG64(GICR_PROPBASER): {
> 
> Coding style: the { should be on its own line.
> 
>> +        int nr_pages;
> 
> unsigned int
> 
>> +
>> +        if ( info->dabt.size < DABT_WORD ) goto bad_width;
> 
> Newline here for clarity. Also please use vgic_reg64_check_access rather
> than open-coding it.
> 
>> +        if ( v->arch.vgic.flags & VGIC_V3_LPIS_ENABLED )
> 
> From my understanding VGIC_V3_LPIS_ENABLED is set when the guest enable
> LPIs on this re-distributor. However, this check is not safe as
> GICR_CTLR.Enable_LPIs may be set concurrently (the re-distributors are
> accessible from any vCPU).
> 
> Also, when the ITS is not available we should avoid handling the register
> (i.e. treating it as write-ignore). My rationale here is that we should
> limit the amount of emulation exposed to the guest whenever possible.
> 
>> +            return 1;
> 
> I think we should at least print a warning, as writing to GICR_PROPBASER
> when GICR_CTLR.Enable_LPIs is set is unpredictable. IMHO, I would even
> crash the guest.
> 
> The code below likely needs locking, as the property table is common to
> all re-distributors, hence could be modified concurrently. Also, I would
> like to see a comment on top of the emulation of GICR_TYPER to mention
> that all re-distributors share the same common property table
> (GICR_TYPER.CommonLPIAff = 0).
> 
>> +
>> +        reg = v->domain->arch.vgic.rdist_propbase;
>> +        vgic_reg64_update(&reg, r, info);
>> +        reg = sanitize_propbaser(reg);
>> +        v->domain->arch.vgic.rdist_propbase = reg;
>>
>> +        nr_pages = BIT((v->domain->arch.vgic.rdist_propbase & 0x1f) +
>> 1) - 8192;
> 
> The spec (see 8.11.19) says: "If the value of this field is larger than
> the value of GICD_TYPER.IDbits, the GICD_TYPER.IDbits value applies." We
> don't want to map more than necessary.
> 
>> +        nr_pages = DIV_ROUND_UP(nr_pages, PAGE_SIZE);
>> +        unmap_guest_pages(v->domain->arch.vgic.proptable, nr_pages);
> 
> This looks wrong to me. A guest could specify a size different from the
> previous write.
> 
>> +        v->domain->arch.vgic.proptable = map_guest_pages(v->domain,
>> +                                                         reg &
>> GENMASK(47, 12),
>> +                                                         nr_pages);
>> +        return 1;
>> +    }
>>      case VREG64(GICR_PENDBASER):
>> -        /* LPI is not implemented */
>> -        goto write_ignore_64;
>> +        if ( info->dabt.size < DABT_WORD ) goto bad_width;
> 
> Newline + vgic_reg64_check_access
> 
> Also, you don't check whether the LPIs have been enabled here.
> 
> All my comments above stand. Furthermore, the code is not correctly
> indented (you are using hard tabs).
> 
>> +    reg = v->arch.vgic.rdist_pendbase;
>> +    vgic_reg64_update(&reg, r, info);
>> +    reg = sanitize_pendbaser(reg);
>> +    v->arch.vgic.rdist_pendbase = reg;
>> +
>> +        unmap_guest_pages(v->arch.vgic.pendtable, 16);
>> +    v->arch.vgic.pendtable = map_guest_pages(v->domain,
>> +                                                 reg & GENMASK(47,
>> 12), 16);
> 
> The pending table is never touched by Xen, so I would avoid mapping it.
> 
>> +    return 1;
>>
>>      case 0x0080:
>>          goto write_reserved;
>> diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c
>> index b961551..4d9304f 100644
>> --- a/xen/arch/arm/vgic.c
>> +++ b/xen/arch/arm/vgic.c
>> @@ -488,6 +488,10 @@ struct pending_irq *lpi_to_pending(struct vcpu
>> *v, unsigned int lpi,
>>          empty->pirq.irq = lpi;
>>      }
>>
>> +    /* Update the enabled status */
>> +    if ( gicv3_lpi_is_enabled(v->domain, lpi) )
>> +        set_bit(GIC_IRQ_GUEST_ENABLED, &empty->pirq.status);
>> +
>>      return &empty->pirq;
>>  }
>>
>> diff --git a/xen/include/asm-arm/domain.h b/xen/include/asm-arm/domain.h
>> index ae8a9de..0cd3500 100644
>> --- a/xen/include/asm-arm/domain.h
>> +++ b/xen/include/asm-arm/domain.h
>> @@ -109,6 +109,8 @@ struct arch_domain
>>          } *rdist_regions;
>>          int nr_regions;                     /* Number of rdist
>> regions */
>>          uint32_t rdist_stride;              /* Re-Distributor stride */
>> +        uint64_t rdist_propbase;
>> +        uint8_t *proptable;
>>  #endif
>>      } vgic;
>>
>> @@ -247,7 +249,10 @@ struct arch_vcpu
>>
>>          /* GICv3: redistributor base and flags for this vCPU */
>>          paddr_t rdist_base;
>> -#define VGIC_V3_RDIST_LAST  (1 << 0)        /* last vCPU of the rdist */
>> +#define VGIC_V3_RDIST_LAST      (1 << 0)        /* last vCPU of the
>> rdist */
> 
> Please avoid spurious changes. We don't require in Xen that all the
> constants be aligned. This also makes it harder to go through the changes.
> 
>> +#define VGIC_V3_LPIS_ENABLED    (1 << 1)
> 
> Please document the purpose of this bit.
> 
>> +        uint64_t rdist_pendbase;
>> +        unsigned long *pendtable;
>>          uint8_t flags;
>>          struct list_head pending_lpi_list;
>>      } vgic;
>> diff --git a/xen/include/asm-arm/gic-its.h
>> b/xen/include/asm-arm/gic-its.h
>> index 1f881c0..3b2e5c0 100644
>> --- a/xen/include/asm-arm/gic-its.h
>> +++ b/xen/include/asm-arm/gic-its.h
>> @@ -139,7 +139,11 @@ int gicv3_lpi_drop_host_lpi(struct host_its *its,
>>
>>  static inline int gicv3_lpi_get_priority(struct domain *d, uint32_t lpi)
> 
> s/lpi/vlpi/ to make clear this is a function dealing with virtual LPIs.
> 
>>  {
>> -    return GIC_PRI_IRQ;
>> +    return d->arch.vgic.proptable[lpi - 8192] & 0xfc;
> 
> I think this is the best place to ask this question: I don't see any
> code within this series to check that the guest effectively initialized
> proptable and that the size is correct (you don't check that the guest
> provided enough memory compared to the vLPIs suggested).
> 
> FWIW, I had already made those comments back when Vijay sent his patch
> series. It might be worth for you to look at what he did regarding all
> the sanity checks.
> 
>> +}
> 
> Newline here for clarity.
> 
>> +static inline bool gicv3_lpi_is_enabled(struct domain *d, uint32_t lpi)
> 
> Ditto for the naming.
> 
>> +{
>> +    return d->arch.vgic.proptable[lpi - 8192] & LPI_PROP_ENABLED;
>>  }
>>
>>  #else
>> @@ -185,6 +189,10 @@ static inline int gicv3_lpi_get_priority(struct
>> domain *d, uint32_t lpi)
>>  {
>>      return GIC_PRI_IRQ;
>>  }
> 
> Newline here for clarity.
> 
>> +static inline bool gicv3_lpi_is_enabled(struct domain *d, uint32_t lpi)
>> +{
>> +    return false;
>> +}
>>
>>  #endif /* CONFIG_HAS_ITS */
>>
>> diff --git a/xen/include/asm-arm/vgic.h b/xen/include/asm-arm/vgic.h
>> index 4e29ba6..2b216cc 100644
>> --- a/xen/include/asm-arm/vgic.h
>> +++ b/xen/include/asm-arm/vgic.h
>> @@ -285,6 +285,9 @@ VGIC_REG_HELPERS(32, 0x3);
>>
>>  #undef VGIC_REG_HELPERS
>>
>> +void *map_guest_pages(struct domain *d, paddr_t guest_addr, int
>> nr_pages);
>> +void unmap_guest_pages(void *va, int nr_pages);
>> +
>>  enum gic_sgi_mode;
>>
>>  /*
>>
> 
> Regards,
> 

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 144+ messages in thread

* Re: [RFC PATCH 05/24] ARM: GICv3 ITS: introduce ITS command handling
  2017-01-31  9:10     ` Andre Przywara
@ 2017-01-31 10:23       ` Julien Grall
  0 siblings, 0 replies; 144+ messages in thread
From: Julien Grall @ 2017-01-31 10:23 UTC (permalink / raw)
  To: Andre Przywara, Stefano Stabellini; +Cc: xen-devel

Hi Andre,

On 31/01/2017 09:10, Andre Przywara wrote:
> On 02/11/16 15:05, Julien Grall wrote:
>> On 28/09/16 19:24, Andre Przywara wrote:
>>> +}
>>> +
>>> +static uint64_t encode_rdbase(struct host_its *hw_its, int cpu,
>>> uint64_t reg)
>>> +{
>>> +    reg &= ~GENMASK(51, 16);
>>> +
>>> +    if ( hw_its->pta )
>>> +        reg |= per_cpu(rdist_addr, cpu) & GENMASK(51, 16);
>>> +    else
>>> +        reg |= per_cpu(rdist_id, cpu) << 16;
>>
>> I would prefer if we setup the target address at initialize per-cpu
>> rather than doing it every time we send a sync command (or else).
>
> I believe we can't do easily, because the PTA bit is per ITS, not per
> redistributor. I see that it's rather unlikely that we have ITSes with
> different PTA bit settings in one system, but architecturally it's possible.

How about storing the value per ITS then?

[...]

>>> +void gicv3_set_redist_addr(paddr_t address, int redist_id)
>>
>> The second parameter should probably be unsigned, maybe uint64_t?
>
> Well, the processor ID is a 16-bit value and nicely aligns with Xen's
> VCPUIDs, which are declared as "int" in struct vcpu.
> So I'd rather keep it as int here.

I am not sure why you speak about the vCPU IDs when this code only deals 
with pCPU IDs. Anyway, the ID should never be signed, and even though it 
has been defined as int in the structure, on ARM we are trying to use 
unsigned numbers as they make much more sense.

Furthermore, the per-cpu value rdist_id has been defined as uint64_t, 
hence my request to use uint64_t.

The two types need to match for consistency. I am fine if you decide to 
use unsigned int for the per-cpu value.

Regards,

-- 
Julien Grall



* Re: [RFC PATCH 11/24] ARM: vGICv3: handle virtual LPI pending and property tables
  2017-01-31  9:10     ` Andre Przywara
@ 2017-01-31 10:38       ` Julien Grall
  2017-01-31 12:04         ` Andre Przywara
  0 siblings, 1 reply; 144+ messages in thread
From: Julien Grall @ 2017-01-31 10:38 UTC (permalink / raw)
  To: Andre Przywara, Stefano Stabellini; +Cc: xen-devel, Steve Capper



On 31/01/2017 09:10, Andre Przywara wrote:
> Hi Julien,

Hi Andre,

> On 02/11/16 17:18, Julien Grall wrote:
>> On 28/09/16 19:24, Andre Przywara wrote:
>>> +    return (reg & ~field_mask) | field;
>>> +}
>>> +
>>> +/* We want to avoid outer shareable. */
>>> +static uint64_t vgic_sanitise_shareability(uint64_t field)
>>> +{
>>> +    switch (field) {
>>> +    case GIC_BASER_OuterShareable:
>>> +        return GIC_BASER_InnerShareable;
>>> +    default:
>>> +        return field;
>>> +    }
>>> +}
>>
>> I am not sure to understand why we need to sanitise the value here. From
>> my understanding of the spec (see 8.11.18 in IHI 0069C) we should
>> support any shareability/cacheability, correct?
>
> No, actually an ITS is free to support only _one_ of those attributes,
> up to the point where it is read-only:
>
> "It is IMPLEMENTATION DEFINED whether this field has a fixed value or
> can be programmed by software. Implementing this field with a fixed
> value is deprecated."
>
> So we support more than one value, but refuse any really not useful
> ones. This goes in line with the KVM implementation.

Looking at your quote from the spec, this behavior is deprecated. Why do 
we want to implement a deprecated behavior?

>
> For the rest of the comments regarding the memory tables setup:
> I effectively rewrote this in the new series, so I think the majority of
> the comments don't apply anymore, hopefully the rewrite actually fixed
> the issues you mentioned. So I refrain from any comments now and look
> forward to a review of the new approach ;-)

I will give a look to the new implementation.

Cheers,

-- 
Julien Grall



* Re: [RFC PATCH 11/24] ARM: vGICv3: handle virtual LPI pending and property tables
  2017-01-31 10:38       ` Julien Grall
@ 2017-01-31 12:04         ` Andre Przywara
  0 siblings, 0 replies; 144+ messages in thread
From: Andre Przywara @ 2017-01-31 12:04 UTC (permalink / raw)
  To: Julien Grall, Stefano Stabellini; +Cc: xen-devel, Steve Capper

Hi,

On 31/01/17 10:38, Julien Grall wrote:
> 
> 
> On 31/01/2017 09:10, Andre Przywara wrote:
>> Hi Julien,
> 
> Hi Andre,
> 
>> On 02/11/16 17:18, Julien Grall wrote:
>>> On 28/09/16 19:24, Andre Przywara wrote:
>>>> +    return (reg & ~field_mask) | field;
>>>> +}
>>>> +
>>>> +/* We want to avoid outer shareable. */
>>>> +static uint64_t vgic_sanitise_shareability(uint64_t field)
>>>> +{
>>>> +    switch (field) {
>>>> +    case GIC_BASER_OuterShareable:
>>>> +        return GIC_BASER_InnerShareable;
>>>> +    default:
>>>> +        return field;
>>>> +    }
>>>> +}
>>>
>>> I am not sure to understand why we need to sanitise the value here. From
>>> my understanding of the spec (see 8.11.18 in IHI 0069C) we should
>>> support any shareability/cacheability, correct?
>>
>> No, actually an ITS is free to support only _one_ of those attributes,
>> up to the point where it is read-only:
>>
>> "It is IMPLEMENTATION DEFINED whether this field has a fixed value or
>> can be programmed by software. Implementing this field with a fixed
>> value is deprecated."
>>
>> So we support more than one value, but refuse any really not useful
>> ones. This goes in line with the KVM implementation.
> 
> Looking at your quote from the spec, this behavior is deprecated. Why do
> we want to implement a deprecated behavior?

We don't. Allowing only _one_ attribute and thus making those register
bits read-only is deprecated. We make sure to provide support for at
least two of them.
Supporting every possible attribute in a virtualization scenario is
pointless and not helpful. I believe the architecture requires software
to cope with only one attribute, even though this is for some reason
"deprecated" (which is a hint for an implementer, not for a driver author).

>> For the rest of the comments regarding the memory tables setup:
>> I effectively rewrote this in the new series, so I think the majority of
>> the comments don't apply anymore, hopefully the rewrite actually fixed
>> the issues you mentioned. So I refrain from any comments now and look
>> forward to a review of the new approach ;-)
> 
> I will give a look to the new implementation.

Thanks!

Cheers,
Andre.



* Re: [RFC PATCH 11/24] ARM: vGICv3: handle virtual LPI pending and property tables
  2016-10-29  0:39   ` Stefano Stabellini
@ 2017-03-29 15:47     ` Andre Przywara
  0 siblings, 0 replies; 144+ messages in thread
From: Andre Przywara @ 2017-03-29 15:47 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: xen-devel, Julien Grall

Hi,

On 29/10/16 01:39, Stefano Stabellini wrote:
> On Wed, 28 Sep 2016, Andre Przywara wrote:
>> Allow a guest to provide the address and size for the memory regions
>> it has reserved for the GICv3 pending and property tables.
>> We sanitise the various fields of the respective redistributor
>> registers and map those pages into Xen's address space to have easy
>> access.
>>
>> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
>> ---
>> diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c
>> index e9b6490..8fe8386 100644
>> --- a/xen/arch/arm/vgic-v3.c
>> +++ b/xen/arch/arm/vgic-v3.c
....

>> +        reg = v->domain->arch.vgic.rdist_propbase;
>> +        vgic_reg64_update(&reg, r, info);
>> +        reg = sanitize_propbaser(reg);
>> +        v->domain->arch.vgic.rdist_propbase = reg;
>>  
>> +        nr_pages = BIT((v->domain->arch.vgic.rdist_propbase & 0x1f) + 1) - 8192;
>> +        nr_pages = DIV_ROUND_UP(nr_pages, PAGE_SIZE);
> 
> Do we need to set an upper limit on nr_pages? We don't really want to
> allow (2^0x1f)/4096 pages, right?

Why not? This is the virtual property table, and the *guest* provides
the memory. We just comply here and map it. I don't see any issue.

[ .... ]

>> diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c
>> index b961551..4d9304f 100644
>> --- a/xen/arch/arm/vgic.c
>> +++ b/xen/arch/arm/vgic.c
>> @@ -488,6 +488,10 @@ struct pending_irq *lpi_to_pending(struct vcpu *v, unsigned int lpi,
>>          empty->pirq.irq = lpi;
>>      }
>>  
>> +    /* Update the enabled status */
>> +    if ( gicv3_lpi_is_enabled(v->domain, lpi) )
>> +        set_bit(GIC_IRQ_GUEST_ENABLED, &empty->pirq.status);
> 
> Where is the GIC_IRQ_GUEST_ENABLED unset?

In the patch where the INV command is emulated. This is how
enabling/disabling LPIs works: software (the guest here) sets the bit in
the property table and issues an ITS command to notify the ITS
(emulation) about it.

>>      return &empty->pirq;
>>  }
>>  
>> diff --git a/xen/include/asm-arm/domain.h b/xen/include/asm-arm/domain.h
>> index ae8a9de..0cd3500 100644
>> --- a/xen/include/asm-arm/domain.h
>> +++ b/xen/include/asm-arm/domain.h
>> @@ -109,6 +109,8 @@ struct arch_domain
>>          } *rdist_regions;
>>          int nr_regions;                     /* Number of rdist regions */
>>          uint32_t rdist_stride;              /* Re-Distributor stride */
>> +        uint64_t rdist_propbase;
>> +        uint8_t *proptable;
> 
> Do we need to keep both rdist_propbase and proptable? It is easy to go
> from proptable to rdist_propbase and I guess it is not an operation that
> is done often? If so, we could save some memory and remove it.

The code has changed meanwhile, so this does not apply directly anymore,
but just to make sure:
We need rdist_propbase separately, because a guest can happily set and
change it as often as it wants before enabling LPIs. We shouldn't (and
we don't) allocate memory now (and so set proptable) until the LPIs get
enabled.

>>  #endif
>>      } vgic;
>>  
>> @@ -247,7 +249,10 @@ struct arch_vcpu
>>  
>>          /* GICv3: redistributor base and flags for this vCPU */
>>          paddr_t rdist_base;
>> -#define VGIC_V3_RDIST_LAST  (1 << 0)        /* last vCPU of the rdist */
>> +#define VGIC_V3_RDIST_LAST      (1 << 0)        /* last vCPU of the rdist */
>> +#define VGIC_V3_LPIS_ENABLED    (1 << 1)
>> +        uint64_t rdist_pendbase;
>> +        unsigned long *pendtable;
> 
> Same here.

And the same rationale applies here.

Fixed / addressed the rest.

Cheers,
Andre.

>>          uint8_t flags;
>>          struct list_head pending_lpi_list;
>>      } vgic;
>> diff --git a/xen/include/asm-arm/gic-its.h b/xen/include/asm-arm/gic-its.h
>> index 1f881c0..3b2e5c0 100644
>> --- a/xen/include/asm-arm/gic-its.h
>> +++ b/xen/include/asm-arm/gic-its.h
>> @@ -139,7 +139,11 @@ int gicv3_lpi_drop_host_lpi(struct host_its *its,
>>  
>>  static inline int gicv3_lpi_get_priority(struct domain *d, uint32_t lpi)
>>  {
>> -    return GIC_PRI_IRQ;
>> +    return d->arch.vgic.proptable[lpi - 8192] & 0xfc;
> 
> Please #define 0xfc. Do we need to check for lpi overflows? As in lpi
> numbers larger than proptable size?
> 
> 
>> +}
>> +static inline bool gicv3_lpi_is_enabled(struct domain *d, uint32_t lpi)
>> +{
>> +    return d->arch.vgic.proptable[lpi - 8192] & LPI_PROP_ENABLED;
>>  }
>>  
>>  #else
>> @@ -185,6 +189,10 @@ static inline int gicv3_lpi_get_priority(struct domain *d, uint32_t lpi)
>>  {
>>      return GIC_PRI_IRQ;
>>  }
>> +static inline bool gicv3_lpi_is_enabled(struct domain *d, uint32_t lpi)
>> +{
>> +    return false;
>> +}
>>  
>>  #endif /* CONFIG_HAS_ITS */
>>  
>> diff --git a/xen/include/asm-arm/vgic.h b/xen/include/asm-arm/vgic.h
>> index 4e29ba6..2b216cc 100644
>> --- a/xen/include/asm-arm/vgic.h
>> +++ b/xen/include/asm-arm/vgic.h
>> @@ -285,6 +285,9 @@ VGIC_REG_HELPERS(32, 0x3);
>>  
>>  #undef VGIC_REG_HELPERS
>>  
>> +void *map_guest_pages(struct domain *d, paddr_t guest_addr, int nr_pages);
>> +void unmap_guest_pages(void *va, int nr_pages);
>> +
>>  enum gic_sgi_mode;
>>  
>>  /*



end of thread, other threads:[~2017-03-29 15:45 UTC | newest]

Thread overview: 144+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-09-28 18:24 [RFC PATCH 00/24] [FOR 4.9] arm64: Dom0 ITS emulation Andre Przywara
2016-09-28 18:24 ` [RFC PATCH 01/24] ARM: GICv3 ITS: parse and store ITS subnodes from hardware DT Andre Przywara
2016-10-26  1:11   ` Stefano Stabellini
2016-11-01 15:13   ` Julien Grall
2016-11-14 17:35     ` Andre Przywara
2016-11-23 15:39       ` Julien Grall
2016-09-28 18:24 ` [RFC PATCH 02/24] ARM: GICv3: allocate LPI pending and property table Andre Przywara
2016-10-24 14:28   ` Vijay Kilari
2016-11-02 16:22     ` Andre Przywara
2016-10-26  1:10   ` Stefano Stabellini
2016-11-10 15:29     ` Andre Przywara
2016-11-10 21:00       ` Stefano Stabellini
2016-11-01 17:22   ` Julien Grall
2016-11-15 11:32     ` Andre Przywara
2016-11-23 15:58       ` Julien Grall
2016-09-28 18:24 ` [RFC PATCH 03/24] ARM: GICv3 ITS: allocate device and collection table Andre Przywara
2016-10-09 13:55   ` Vijay Kilari
2016-10-10  9:05     ` Andre Przywara
2016-10-24 14:30   ` Vijay Kilari
2016-11-02 17:51     ` Andre Przywara
2016-10-26 22:57   ` Stefano Stabellini
2016-11-01 17:34     ` Julien Grall
2016-11-10 15:32     ` Andre Przywara
2016-11-10 21:06       ` Stefano Stabellini
2016-11-01 18:19   ` Julien Grall
2016-09-28 18:24 ` [RFC PATCH 04/24] ARM: GICv3 ITS: map ITS command buffer Andre Przywara
2016-10-24 14:31   ` Vijay Kilari
2016-10-26 23:03   ` Stefano Stabellini
2016-11-10 16:04     ` Andre Przywara
2016-11-02 13:38   ` Julien Grall
2016-09-28 18:24 ` [RFC PATCH 05/24] ARM: GICv3 ITS: introduce ITS command handling Andre Przywara
2016-10-26 23:55   ` Stefano Stabellini
2016-10-27 21:52     ` Stefano Stabellini
2016-11-10 15:57     ` Andre Przywara
2016-11-02 15:05   ` Julien Grall
2017-01-31  9:10     ` Andre Przywara
2017-01-31 10:23       ` Julien Grall
2016-09-28 18:24 ` [RFC PATCH 06/24] ARM: GICv3 ITS: introduce host LPI array Andre Przywara
2016-10-27 22:59   ` Stefano Stabellini
2016-11-02 15:14     ` Julien Grall
2016-11-10 17:22     ` Andre Przywara
2016-11-10 21:48       ` Stefano Stabellini
2016-09-28 18:24 ` [RFC PATCH 07/24] ARM: GICv3 ITS: introduce device mapping Andre Przywara
2016-10-24 15:31   ` Vijay Kilari
2016-11-03 19:33     ` Andre Przywara
2016-10-28  0:08   ` Stefano Stabellini
2016-09-28 18:24 ` [RFC PATCH 08/24] ARM: GICv3: introduce separate pending_irq structs for LPIs Andre Przywara
2016-10-24 15:31   ` Vijay Kilari
2016-11-03 19:47     ` Andre Przywara
2016-10-28  1:04   ` Stefano Stabellini
2017-01-12 19:14     ` Andre Przywara
2017-01-13 19:37       ` Stefano Stabellini
2017-01-16  9:44         ` André Przywara
2017-01-16 19:16           ` Stefano Stabellini
2016-11-04 15:46   ` Julien Grall
2016-09-28 18:24 ` [RFC PATCH 09/24] ARM: GICv3: forward pending LPIs to guests Andre Przywara
2016-10-28  1:51   ` Stefano Stabellini
2016-09-28 18:24 ` [RFC PATCH 10/24] ARM: GICv3: enable ITS and LPIs on the host Andre Przywara
2016-10-28 23:07   ` Stefano Stabellini
2016-09-28 18:24 ` [RFC PATCH 11/24] ARM: vGICv3: handle virtual LPI pending and property tables Andre Przywara
2016-10-24 15:32   ` Vijay Kilari
2016-11-03 20:21     ` Andre Przywara
2016-11-04 11:53       ` Julien Grall
2016-10-29  0:39   ` Stefano Stabellini
2017-03-29 15:47     ` Andre Przywara
2016-11-02 17:18   ` Julien Grall
2016-11-02 17:41     ` Stefano Stabellini
2016-11-02 18:03       ` Julien Grall
2016-11-02 18:09         ` Stefano Stabellini
2017-01-31  9:10     ` Andre Przywara
2017-01-31 10:38       ` Julien Grall
2017-01-31 12:04         ` Andre Przywara
2016-09-28 18:24 ` [RFC PATCH 12/24] ARM: vGICv3: introduce basic ITS emulation bits Andre Przywara
2016-10-09 14:20   ` Vijay Kilari
2016-10-10 10:38     ` Andre Przywara
2016-10-24 15:31   ` Vijay Kilari
2016-11-03 19:26     ` Andre Przywara
2016-11-04 12:07       ` Julien Grall
2016-11-03 17:50   ` Julien Grall
2016-11-08 23:54   ` Stefano Stabellini
2016-09-28 18:24 ` [RFC PATCH 13/24] ARM: vITS: handle CLEAR command Andre Przywara
2016-11-04 15:48   ` Julien Grall
2016-11-09  0:39   ` Stefano Stabellini
2016-11-09 13:32     ` Julien Grall
2016-09-28 18:24 ` [RFC PATCH 14/24] ARM: vITS: handle INT command Andre Przywara
2016-11-09  0:42   ` Stefano Stabellini
2016-09-28 18:24 ` [RFC PATCH 15/24] ARM: vITS: handle MAPC command Andre Przywara
2016-11-09  0:48   ` Stefano Stabellini
2016-09-28 18:24 ` [RFC PATCH 16/24] ARM: vITS: handle MAPD command Andre Przywara
2016-11-09  0:54   ` Stefano Stabellini
2016-09-28 18:24 ` [RFC PATCH 17/24] ARM: vITS: handle MAPTI command Andre Przywara
2016-11-09  1:07   ` Stefano Stabellini
2016-09-28 18:24 ` [RFC PATCH 18/24] ARM: vITS: handle MOVI command Andre Przywara
2016-11-09  1:13   ` Stefano Stabellini
2016-09-28 18:24 ` [RFC PATCH 19/24] ARM: vITS: handle DISCARD command Andre Przywara
2016-11-09  1:28   ` Stefano Stabellini
2016-09-28 18:24 ` [RFC PATCH 20/24] ARM: vITS: handle INV command Andre Przywara
2016-11-09  1:49   ` Stefano Stabellini
2016-09-28 18:24 ` [RFC PATCH 21/24] ARM: vITS: handle INVALL command Andre Przywara
2016-10-24 15:32   ` Vijay Kilari
2016-11-04  9:22     ` Andre Przywara
2016-11-10  0:21       ` Stefano Stabellini
2016-11-10 11:57         ` Julien Grall
2016-11-10 20:42           ` Stefano Stabellini
2016-11-11 15:53             ` Julien Grall
2016-11-11 20:31               ` Stefano Stabellini
2016-11-18 18:39                 ` Stefano Stabellini
2016-11-25 16:10                   ` Julien Grall
2016-12-01  1:19                     ` Stefano Stabellini
2016-12-02 16:18                       ` Andre Przywara
2016-12-03  0:46                         ` Stefano Stabellini
2016-12-05 13:36                           ` Julien Grall
2016-12-05 19:51                             ` Stefano Stabellini
2016-12-06 15:56                               ` Julien Grall
2016-12-06 19:36                                 ` Stefano Stabellini
2016-12-06 21:32                                   ` Dario Faggioli
2016-12-06 21:53                                     ` Stefano Stabellini
2016-12-06 22:01                                       ` Stefano Stabellini
2016-12-06 22:12                                         ` Dario Faggioli
2016-12-06 23:13                                         ` Julien Grall
2016-12-07 20:20                                           ` Stefano Stabellini
2016-12-09 18:01                                             ` Julien Grall
2016-12-09 20:13                                               ` Stefano Stabellini
2016-12-09 18:07                                             ` Andre Przywara
2016-12-09 20:18                                               ` Stefano Stabellini
2016-12-14  2:39                                                 ` George Dunlap
2016-12-16  1:30                                                   ` Dario Faggioli
2016-12-06 22:39                                       ` Dario Faggioli
2016-12-06 23:24                                         ` Julien Grall
2016-12-07  0:17                                           ` Dario Faggioli
2016-12-07 20:21                                         ` Stefano Stabellini
2016-12-09 10:14                                           ` Dario Faggioli
2016-12-06 21:36                               ` Dario Faggioli
2016-12-09 19:00                           ` Andre Przywara
2016-12-10  0:30                             ` Stefano Stabellini
2016-12-12 10:38                               ` Andre Przywara
2016-12-14  0:38                                 ` Stefano Stabellini
2016-09-28 18:24 ` [RFC PATCH 22/24] ARM: vITS: create and initialize virtual ITSes for Dom0 Andre Przywara
2016-11-10  0:38   ` Stefano Stabellini
2016-09-28 18:24 ` [RFC PATCH 23/24] ARM: vITS: create ITS subnodes for Dom0 DT Andre Przywara
2016-09-28 18:24 ` [RFC PATCH 24/24] ARM: vGIC: advertising LPI support Andre Przywara
2016-11-10  0:49   ` Stefano Stabellini
2016-11-10 11:22     ` Julien Grall
2016-11-02 13:56 ` [RFC PATCH 00/24] [FOR 4.9] arm64: Dom0 ITS emulation Julien Grall