* [PATCH 00/28] arm64: Dom0 ITS emulation
@ 2017-01-30 18:31 Andre Przywara
  2017-01-30 18:31 ` [PATCH 01/28] ARM: export __flush_dcache_area() Andre Przywara
                   ` (29 more replies)
  0 siblings, 30 replies; 106+ messages in thread
From: Andre Przywara @ 2017-01-30 18:31 UTC (permalink / raw)
  To: Stefano Stabellini, Julien Grall; +Cc: xen-devel, Vijay Kilari

Hi,

After the two RFC versions, this is the first "serious" attempt at
emulating an ARM GICv3 ITS interrupt controller, for Dom0 only at the
moment.
The ITS is an interrupt controller widget providing a sophisticated way
of dealing with MSIs in a scalable manner.
For hardware which relies on the ITS to provide interrupts for its
peripherals this code is needed to get a machine booted into Dom0 at all.
ITS emulation for DomUs is only really useful with PCI passthrough,
which is not yet available for ARM. It is expected that this feature
will be co-developed with the ITS DomU code. However, this code drop
already takes DomU emulation into account, to keep later architectural
changes to a minimum.

Some generic design principles:

* The current GIC code statically allocates structures for each supported
IRQ (both for the host and the guest), which due to the potentially
millions of LPI interrupts is not feasible to copy for the ITS.
So we refrain from introducing the ITS as a first class Xen interrupt
controller, and we don't hold struct irq_desc's or struct pending_irq's
for each possible LPI.
Fortunately LPIs are only interesting to guests, so we get away with
storing only the virtual IRQ number and the guest VCPU for each allocated
host LPI, which can be stashed into one uint64_t. This data is stored in
a two-level table, which is both memory efficient and quick to access.
We hook into the existing IRQ handling and VGIC code to avoid accessing
the normal structures, providing alternative methods for getting the
needed information (priority, is enabled?) for LPIs.
For interrupts which are queued to or are actually in a guest we
allocate struct pending_irq's on demand. As it is expected that only a
very small number of interrupts is ever on a VCPU at the same time, this
seems like the best approach. For now allocated structs are re-used and
held in a linked list. Should it emerge that traversing a linked list
is a performance issue, this can be changed to use a hash table.

* On the guest side we (later will) have to deal with malicious guests
trying to hog Xen with mapping requests for a lot of LPIs, for instance.
As the ITS actually uses system memory for storing status information,
we use this memory (which the guest has to provide) to naturally limit
a guest. For those tables which are page sized (devices, collections (CPUs),
LPI properties) we map those pages into Xen, so we can easily access
them from the virtual GIC code.
Unfortunately the actual interrupt mapping tables are not necessarily
page aligned and can be much smaller than a page, so mapping all of
them permanently is fiddly. As ITS commands which need to iterate those
tables are pretty rare after all, we for now map them on demand upon
emulating a virtual ITS command. This is acceptable because "mapping"
them is actually very cheap on arm64. Also as we can't properly protect
those areas due to their sub-page-size property, we validate the data
in there before actually using it. The vITS code basically just stores
the data in there which the guest has actually transferred via the
virtual ITS command queue before, so there is no secret revealed nor
does it create an attack vector for a malicious guest.

* An obvious approach to handling some guest ITS commands would be to
propagate them to the host, for instance to map devices and LPIs and
to enable or disable LPIs.
However this (later with DomU support) will create an attack vector, as
a malicious guest could try to fill the host command queue with
propagated commands.
So (in contrast to the first RFC post) we completely avoid this situation.
For mapping devices and LPIs we rely on this being done via a hypercall
prior to the actual guest run. For enabling and disabling LPIs we keep
this bit on the virtual side and let LPIs always be enabled on the host side,
dealing with the consequences this approach creates.
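
As an illustration of the first design principle above (one uint64_t per
host LPI, kept in a two-level table), here is a minimal, standalone sketch
in plain C. The field layout, the 4KB second-level pages, the 20-bit limit
and all function names are assumptions made for this example and are not
necessarily what the patches implement.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define LPI_OFFSET        8192u     /* first LPI interrupt ID */
#define HOST_LPI_BITS     20u       /* assumed limit for this sketch */
#define NR_HOST_LPIS      ((1u << HOST_LPI_BITS) - LPI_OFFSET)
#define ENTRIES_PER_PAGE  512u      /* 4096 bytes / 8 bytes per entry */
#define NR_L1_SLOTS       ((NR_HOST_LPIS + ENTRIES_PER_PAGE - 1) / ENTRIES_PER_PAGE)

/* First level: pointers to second-level pages, allocated on first use. */
static uint64_t *host_lpi_l1[NR_L1_SLOTS];

/* Assumed packing: VCPU ID in bits [47:32], virtual LPI in bits [31:0]. */
static uint64_t host_lpi_pack(uint32_t virt_lpi, uint16_t vcpu_id)
{
    return ((uint64_t)vcpu_id << 32) | virt_lpi;
}

/* Record which virtual LPI and VCPU a host LPI belongs to. */
static int host_lpi_set(uint32_t host_lpi, uint32_t virt_lpi, uint16_t vcpu_id)
{
    uint32_t idx;

    if ( host_lpi < LPI_OFFSET || host_lpi - LPI_OFFSET >= NR_HOST_LPIS )
        return -1;
    idx = host_lpi - LPI_OFFSET;

    if ( !host_lpi_l1[idx / ENTRIES_PER_PAGE] )
    {
        host_lpi_l1[idx / ENTRIES_PER_PAGE] =
            calloc(ENTRIES_PER_PAGE, sizeof(uint64_t));
        if ( !host_lpi_l1[idx / ENTRIES_PER_PAGE] )
            return -1;
    }

    host_lpi_l1[idx / ENTRIES_PER_PAGE][idx % ENTRIES_PER_PAGE] =
        host_lpi_pack(virt_lpi, vcpu_id);

    return 0;
}

/* Returns 0 and fills the outputs if the host LPI is mapped, -1 otherwise. */
static int host_lpi_get(uint32_t host_lpi, uint32_t *virt_lpi, uint16_t *vcpu_id)
{
    uint32_t idx;
    uint64_t entry;

    assert(virt_lpi && vcpu_id);

    if ( host_lpi < LPI_OFFSET || host_lpi - LPI_OFFSET >= NR_HOST_LPIS )
        return -1;
    idx = host_lpi - LPI_OFFSET;

    if ( !host_lpi_l1[idx / ENTRIES_PER_PAGE] )
        return -1;

    entry = host_lpi_l1[idx / ENTRIES_PER_PAGE][idx % ENTRIES_PER_PAGE];
    if ( !entry )   /* all-zero means "not mapped": virtual LPIs are >= 8192 */
        return -1;

    *virt_lpi = (uint32_t)entry;
    *vcpu_id = (uint16_t)(entry >> 32);

    return 0;
}
```

The sketch uses plain loads and stores. Because each entry is a single
naturally aligned 64-bit word, it could be read and written atomically,
which is presumably what the "atomic accessors" item in the changelog
below is about.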

As it is expected that the ITS support will become a tech preview in the
first release, there is a Kconfig option to enable it. Also it is
supported on arm64 only, which will most likely not change in the future.
This leads to some hideous constructs like an #ifdef'ed header file with
empty function stubs; I have some hope we can still clean this up.
Also some parameters are config options which can be overridden on the
Xen commandline. This is to support experimentation and adaptation to
various platforms. Ideally we either find one-size-fits-all values or
find another way of getting rid of this.

Compared to the previous post (RFC-v2) this has seen a lot of reworks
and cleanups in various areas.
I tried to address all of the review comments, though some are hard to
follow due to rewrites. So apologies if some points have slipped through.
Allocating and mapping of memory for both the physical and virtual ITS
and redistributor tables has been improved, though I didn't manage to
write protect the virtual tables from a guest without impacting access
from Xen at the same time. I will need to take a deeper look into this,
but ideally it's only a small change in get_guest_pages().

This code boots Dom0 on an ARM Fast Model with ITS support. I tried to
address the issues seen by people running the previous version on real
hardware, though couldn't verify this here for myself.
So any testing, bug reports (and possibly even fixes) are very welcome.

The code can also be found on the its/v1 branch here:
git://linux-arm.org/xen-ap.git
http://www.linux-arm.org/git?p=xen-ap.git;a=shortlog;h=refs/heads/its/v1

Cheers,
Andre

(Rough) changelog RFC-v2 .. v1:
- split host ITS driver into gic-v3-lpi.c and gic-v3-its.c part
- rename virtual ITS driver file to vgic-v3-its.c
- use macros and named constants for all magic numbers
- use atomic accessors for accessing the host LPI data
- remove leftovers from connecting virtual and host ITSes
- bail out if host ITS is disabled in the DT
- rework map/unmap_guest_pages():
    - split off p2m part as get/put_guest_pages (to be done on allocation)
    - get rid of vmap, using map_domain_page() instead
- delay allocation of virtual tables until actual LPI/ITS enablement
- properly size both virtual and physical tables upon allocation
- fix put_domain() locking issues in physdev_op and LPI handling code
- add and extend comments in various areas
- fix lotsa coding style and white space issues, including comment style
- add locking to data structures not yet covered
- fix various locking issues
- use an rbtree to deal with ITS devices (instead of a list)
- properly handle memory attributes for ITS tables
- handle cacheable/non-cacheable ITS table mappings
- sanitize guest provided ITS/LPI table attributes
- fix breakage on non-GICv2 compatible host GICv3 controllers
- add command line parameters on top of Kconfig options
- properly wait for an ITS to become quiescent before enabling it
- handle host ITS command queue errors
- actually wait for host ITS command completion (READR==WRITER)
- fix ARM32 compilation
- various patch splits and reorderings
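
To illustrate the "READR==WRITER" item: the ITS has consumed every queued
command once its read pointer (GITS_CREADR) has caught up with the write
pointer (GITS_CWRITER). A bounded polling loop could look like the sketch
below. The callback type, the fake ITS model and the poll limit are
hypothetical and only there to make the example self-contained; the real
driver reads MMIO registers and would more likely use a timeout.

```c
#include <assert.h>
#include <stdint.h>

#define ITS_CMD_SIZE  32u   /* every ITS command is 32 bytes long */

/* Hypothetical accessor returning the current read pointer offset. */
typedef uint64_t (*creadr_fn)(void *ctx);

/* Poll until the read pointer equals the write pointer or we give up. */
static int its_wait_commands(creadr_fn read_creadr, void *ctx,
                             uint64_t cwriter, unsigned int max_polls)
{
    unsigned int i;

    assert(read_creadr);

    for ( i = 0; i < max_polls; i++ )
        if ( read_creadr(ctx) == cwriter )
            return 0;

    return -1;      /* queue did not drain in time */
}

/* Fake ITS for experiments: consumes one command per register read. */
struct fake_its {
    uint64_t creadr;
    uint64_t cwriter;
};

static uint64_t fake_read_creadr(void *ctx)
{
    struct fake_its *its = ctx;
    uint64_t cur = its->creadr;

    if ( its->creadr != its->cwriter )
        its->creadr += ITS_CMD_SIZE;    /* wrap-around ignored in this sketch */

    return cur;
}
```

Returning an error instead of spinning forever lets the caller report a
stuck command queue rather than hang the host.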

Andre Przywara (28):
  ARM: export __flush_dcache_area()
  ARM: GICv3 ITS: parse and store ITS subnodes from hardware DT
  ARM: GICv3: allocate LPI pending and property table
  ARM: GICv3 ITS: allocate device and collection table
  ARM: GICv3 ITS: map ITS command buffer
  ARM: GICv3 ITS: introduce ITS command handling
  ARM: GICv3 ITS: introduce device mapping
  ARM: GICv3 ITS: introduce host LPI array
  ARM: GICv3 ITS: map device and LPIs to the ITS on physdev_op hypercall
  ARM: GICv3: introduce separate pending_irq structs for LPIs
  ARM: GICv3: forward pending LPIs to guests
  ARM: GICv3: enable ITS and LPIs on the host
  ARM: vGICv3: handle virtual LPI pending and property tables
  ARM: vGICv3: Handle disabled LPIs
  ARM: vGICv3: introduce basic ITS emulation bits
  ARM: vITS: introduce translation table walks
  ARM: vITS: handle CLEAR command
  ARM: vITS: handle INT command
  ARM: vITS: handle MAPC command
  ARM: vITS: handle MAPD command
  ARM: vITS: handle MAPTI command
  ARM: vITS: handle MOVI command
  ARM: vITS: handle DISCARD command
  ARM: vITS: handle INV command
  ARM: vITS: handle INVALL command
  ARM: vITS: create and initialize virtual ITSes for Dom0
  ARM: vITS: create ITS subnodes for Dom0 DT
  ARM: vGIC: advertising LPI support

 xen/arch/arm/Kconfig              |  33 ++
 xen/arch/arm/Makefile             |   3 +
 xen/arch/arm/efi/efi-boot.h       |   1 -
 xen/arch/arm/gic-v3-its.c         | 825 +++++++++++++++++++++++++++++++++
 xen/arch/arm/gic-v3-lpi.c         | 414 +++++++++++++++++
 xen/arch/arm/gic-v3.c             |  98 +++-
 xen/arch/arm/gic.c                |   9 +-
 xen/arch/arm/physdev.c            |  21 +
 xen/arch/arm/vgic-v3-its.c        | 929 ++++++++++++++++++++++++++++++++++++++
 xen/arch/arm/vgic-v3.c            | 347 ++++++++++++--
 xen/arch/arm/vgic.c               |  68 ++-
 xen/include/asm-arm/atomic.h      |   6 +-
 xen/include/asm-arm/bitops.h      |   1 +
 xen/include/asm-arm/cache.h       |   4 +
 xen/include/asm-arm/domain.h      |  14 +-
 xen/include/asm-arm/gic.h         |   7 +
 xen/include/asm-arm/gic_v3_defs.h |  73 ++-
 xen/include/asm-arm/gic_v3_its.h  | 241 ++++++++++
 xen/include/asm-arm/irq.h         |   8 +
 xen/include/asm-arm/vgic.h        |  34 ++
 20 files changed, 3089 insertions(+), 47 deletions(-)
 create mode 100644 xen/arch/arm/gic-v3-its.c
 create mode 100644 xen/arch/arm/gic-v3-lpi.c
 create mode 100644 xen/arch/arm/vgic-v3-its.c
 create mode 100644 xen/include/asm-arm/gic_v3_its.h

-- 
2.9.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


* [PATCH 01/28] ARM: export __flush_dcache_area()
  2017-01-30 18:31 [PATCH 00/28] arm64: Dom0 ITS emulation Andre Przywara
@ 2017-01-30 18:31 ` Andre Przywara
  2017-02-06 11:23   ` Julien Grall
  2017-01-30 18:31 ` [PATCH 02/28] ARM: GICv3 ITS: parse and store ITS subnodes from hardware DT Andre Przywara
                   ` (28 subsequent siblings)
  29 siblings, 1 reply; 106+ messages in thread
From: Andre Przywara @ 2017-01-30 18:31 UTC (permalink / raw)
  To: Stefano Stabellini, Julien Grall; +Cc: xen-devel, Vijay Kilari

The ability to clean a cache line is not only useful for EFI, but will
be needed later for the ITS support.
Export the function to be usable from the whole Xen/ARM code.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
 xen/arch/arm/efi/efi-boot.h | 1 -
 xen/include/asm-arm/cache.h | 4 ++++
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/xen/arch/arm/efi/efi-boot.h b/xen/arch/arm/efi/efi-boot.h
index 045d6ce..dc64aec 100644
--- a/xen/arch/arm/efi/efi-boot.h
+++ b/xen/arch/arm/efi/efi-boot.h
@@ -10,7 +10,6 @@
 #include "efi-dom0.h"
 
 void noreturn efi_xen_start(void *fdt_ptr, uint32_t fdt_size);
-void __flush_dcache_area(const void *vaddr, unsigned long size);
 
 #define DEVICE_TREE_GUID \
 {0xb1b621d5, 0xf19c, 0x41a5, {0x83, 0x0b, 0xd9, 0x15, 0x2c, 0x69, 0xaa, 0xe0}}
diff --git a/xen/include/asm-arm/cache.h b/xen/include/asm-arm/cache.h
index 2de6564..af96eee 100644
--- a/xen/include/asm-arm/cache.h
+++ b/xen/include/asm-arm/cache.h
@@ -7,6 +7,10 @@
 #define L1_CACHE_SHIFT  (CONFIG_ARM_L1_CACHE_SHIFT)
 #define L1_CACHE_BYTES  (1 << L1_CACHE_SHIFT)
 
+#ifndef __ASSEMBLY__
+void __flush_dcache_area(const void *vaddr, unsigned long size);
+#endif
+
 #define __read_mostly __section(".data.read_mostly")
 
 #endif
-- 
2.9.0



* [PATCH 02/28] ARM: GICv3 ITS: parse and store ITS subnodes from hardware DT
  2017-01-30 18:31 [PATCH 00/28] arm64: Dom0 ITS emulation Andre Przywara
  2017-01-30 18:31 ` [PATCH 01/28] ARM: export __flush_dcache_area() Andre Przywara
@ 2017-01-30 18:31 ` Andre Przywara
  2017-02-06 12:39   ` Julien Grall
  2017-02-06 12:58   ` Julien Grall
  2017-01-30 18:31 ` [PATCH 03/28] ARM: GICv3: allocate LPI pending and property table Andre Przywara
                   ` (27 subsequent siblings)
  29 siblings, 2 replies; 106+ messages in thread
From: Andre Przywara @ 2017-01-30 18:31 UTC (permalink / raw)
  To: Stefano Stabellini, Julien Grall; +Cc: xen-devel, Vijay Kilari

Parse the DT GIC subnodes to find every ITS MSI controller the hardware
offers. Store that information in a list, both to propagate all of them
to Dom0 later and to be able to iterate over all ITSes.
This introduces an ITS Kconfig option.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
 xen/arch/arm/Kconfig             |  4 +++
 xen/arch/arm/Makefile            |  1 +
 xen/arch/arm/gic-v3-its.c        | 71 ++++++++++++++++++++++++++++++++++++++++
 xen/arch/arm/gic-v3.c            | 12 ++++---
 xen/include/asm-arm/gic_v3_its.h | 57 ++++++++++++++++++++++++++++++++
 5 files changed, 141 insertions(+), 4 deletions(-)
 create mode 100644 xen/arch/arm/gic-v3-its.c
 create mode 100644 xen/include/asm-arm/gic_v3_its.h

diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
index 2e023d1..bf64c61 100644
--- a/xen/arch/arm/Kconfig
+++ b/xen/arch/arm/Kconfig
@@ -45,6 +45,10 @@ config ACPI
 config HAS_GICV3
 	bool
 
+config HAS_ITS
+        bool "GICv3 ITS MSI controller support"
+        depends on HAS_GICV3
+
 endmenu
 
 menu "ARM errata workaround via the alternative framework"
diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
index 59b3b53..5f4ff23 100644
--- a/xen/arch/arm/Makefile
+++ b/xen/arch/arm/Makefile
@@ -18,6 +18,7 @@ obj-$(EARLY_PRINTK) += early_printk.o
 obj-y += gic.o
 obj-y += gic-v2.o
 obj-$(CONFIG_HAS_GICV3) += gic-v3.o
+obj-$(CONFIG_HAS_ITS) += gic-v3-its.o
 obj-y += guestcopy.o
 obj-y += hvm.o
 obj-y += io.o
diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
new file mode 100644
index 0000000..ff0f571
--- /dev/null
+++ b/xen/arch/arm/gic-v3-its.c
@@ -0,0 +1,71 @@
+/*
+ * xen/arch/arm/gic-v3-its.c
+ *
+ * ARM GICv3 Interrupt Translation Service (ITS) support
+ *
+ * Copyright (C) 2016,2017 - ARM Ltd
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include <xen/config.h>
+#include <xen/lib.h>
+#include <xen/device_tree.h>
+#include <xen/libfdt/libfdt.h>
+#include <asm/gic.h>
+#include <asm/gic_v3_defs.h>
+#include <asm/gic_v3_its.h>
+
+/* Scan the DT for any ITS nodes and create a list of host ITSes out of it. */
+void gicv3_its_dt_init(const struct dt_device_node *node)
+{
+    const struct dt_device_node *its = NULL;
+    struct host_its *its_data;
+
+    /*
+     * Check for ITS MSI subnodes. If any, add the ITS register
+     * frames to the ITS list.
+     */
+    dt_for_each_child_node(node, its)
+    {
+        paddr_t addr, size;
+
+        if ( !dt_device_is_compatible(its, "arm,gic-v3-its") )
+            continue;
+
+        if ( !dt_device_is_available(its) )
+            continue;
+
+        if ( dt_device_get_address(its, 0, &addr, &size) )
+            panic("GICv3: Cannot find a valid ITS frame address");
+
+        its_data = xzalloc(struct host_its);
+        if ( !its_data )
+            panic("GICv3: Cannot allocate memory for ITS frame");
+
+        its_data->addr = addr;
+        its_data->size = size;
+        its_data->dt_node = its;
+
+        printk("GICv3: Found ITS @0x%lx\n", addr);
+
+        list_add_tail(&its_data->entry, &host_its_list);
+    }
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
index b8be395..838dd11 100644
--- a/xen/arch/arm/gic-v3.c
+++ b/xen/arch/arm/gic-v3.c
@@ -43,9 +43,12 @@
 #include <asm/device.h>
 #include <asm/gic.h>
 #include <asm/gic_v3_defs.h>
+#include <asm/gic_v3_its.h>
 #include <asm/cpufeature.h>
 #include <asm/acpi.h>
 
+LIST_HEAD(host_its_list);
+
 /* Global state */
 static struct {
     void __iomem *map_dbase;  /* Mapped address of distributor registers */
@@ -1224,11 +1227,12 @@ static void __init gicv3_dt_init(void)
      */
     res = dt_device_get_address(node, 1 + gicv3.rdist_count,
                                 &cbase, &csize);
-    if ( res )
-        return;
+    if ( !res )
+        dt_device_get_address(node, 1 + gicv3.rdist_count + 2,
+                              &vbase, &vsize);
 
-    dt_device_get_address(node, 1 + gicv3.rdist_count + 2,
-                          &vbase, &vsize);
+    /* Check for ITS child nodes and build the host ITS list accordingly. */
+    gicv3_its_dt_init(node);
 }
 
 static int gicv3_iomem_deny_access(const struct domain *d)
diff --git a/xen/include/asm-arm/gic_v3_its.h b/xen/include/asm-arm/gic_v3_its.h
new file mode 100644
index 0000000..2f5c51c
--- /dev/null
+++ b/xen/include/asm-arm/gic_v3_its.h
@@ -0,0 +1,57 @@
+/*
+ * ARM GICv3 ITS support
+ *
+ * Andre Przywara <andre.przywara@arm.com>
+ * Copyright (c) 2016 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#ifndef __ASM_ARM_ITS_H__
+#define __ASM_ARM_ITS_H__
+
+#ifndef __ASSEMBLY__
+#include <xen/device_tree.h>
+
+/* data structure for each hardware ITS */
+struct host_its {
+    struct list_head entry;
+    const struct dt_device_node *dt_node;
+    paddr_t addr;
+    paddr_t size;
+};
+
+extern struct list_head host_its_list;
+
+#ifdef CONFIG_HAS_ITS
+
+/* Parse the host DT and pick up all host ITSes. */
+void gicv3_its_dt_init(const struct dt_device_node *node);
+
+#else
+
+static inline void gicv3_its_dt_init(const struct dt_device_node *node)
+{
+}
+
+#endif /* CONFIG_HAS_ITS */
+
+#endif /* __ASSEMBLY__ */
+#endif
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
-- 
2.9.0



* [PATCH 03/28] ARM: GICv3: allocate LPI pending and property table
  2017-01-30 18:31 [PATCH 00/28] arm64: Dom0 ITS emulation Andre Przywara
  2017-01-30 18:31 ` [PATCH 01/28] ARM: export __flush_dcache_area() Andre Przywara
  2017-01-30 18:31 ` [PATCH 02/28] ARM: GICv3 ITS: parse and store ITS subnodes from hardware DT Andre Przywara
@ 2017-01-30 18:31 ` Andre Przywara
  2017-02-06 16:26   ` Julien Grall
  2017-02-14  0:47   ` Stefano Stabellini
  2017-01-30 18:31 ` [PATCH 04/28] ARM: GICv3 ITS: allocate device and collection table Andre Przywara
                   ` (26 subsequent siblings)
  29 siblings, 2 replies; 106+ messages in thread
From: Andre Przywara @ 2017-01-30 18:31 UTC (permalink / raw)
  To: Stefano Stabellini, Julien Grall; +Cc: xen-devel, Vijay Kilari

The ARM GICv3 provides a new kind of interrupt called LPIs.
The pending bits and the configuration data (priority, enable bits) for
those LPIs are stored in tables in normal memory, which software has to
provide to the hardware.
Allocate the required memory, initialize it and hand it over to each
redistributor. The maximum number of LPIs to be used can be adjusted with
the command line option "max_lpi_bits", which defaults to a compile time
constant exposed in Kconfig.
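
To make the sizes concrete, here is a small standalone sketch of the
arithmetic described above. The constant values are the ones this patch
adds to gic_v3_defs.h and gic.h; the helper names and the shortened
PROPBASER_* shift names are made up for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define LPI_OFFSET                    8192ULL
#define LPI_PROP_PRIO_MASK            0xfc
#define LPI_PROP_ENABLED              (1 << 0)
#define GIC_BASER_CACHE_RaWaWb        7ULL
#define GIC_BASER_InnerShareable      1ULL
#define PROPBASER_INNER_CACHE_SHIFT   7
#define PROPBASER_SHAREABILITY_SHIFT  10

/* The usable number of ID bits is limited by hardware and max_lpi_bits. */
static unsigned int lpi_bits(unsigned int hw_bits, unsigned int max_bits)
{
    return hw_bits < max_bits ? hw_bits : max_bits;
}

/* Pending table: one bit per interrupt ID, including IDs below 8192. */
static uint64_t pendtable_bytes(unsigned int bits)
{
    return (1ULL << bits) / 8;
}

/* Property table: one byte per LPI, the first entry describes LPI 8192. */
static uint64_t proptable_bytes(unsigned int bits)
{
    return (1ULL << bits) - LPI_OFFSET;
}

/* Compose a GICR_PROPBASER value for a 4KB aligned table address. */
static uint64_t propbaser_value(uint64_t table_maddr, unsigned int bits)
{
    uint64_t reg;

    assert(!(table_maddr & 0xfffULL));

    /* Outer cacheability "same as inner" is encoded as 0, so nothing to OR. */
    reg  = GIC_BASER_CACHE_RaWaWb << PROPBASER_INNER_CACHE_SHIFT;
    reg |= GIC_BASER_InnerShareable << PROPBASER_SHAREABILITY_SHIFT;
    reg |= bits - 1;            /* IDbits field holds "number of bits - 1" */
    reg |= table_maddr;

    return reg;
}

/* Decode one property table byte. */
static bool lpi_prop_enabled(uint8_t prop)
{
    return prop & LPI_PROP_ENABLED;
}

static uint8_t lpi_prop_priority(uint8_t prop)
{
    return prop & LPI_PROP_PRIO_MASK;
}
```

With the default of 20 bits this gives a 128KB pending table per
redistributor and 1040384 property bytes shared by all redistributors.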

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
 xen/arch/arm/Kconfig              |  15 +++++
 xen/arch/arm/Makefile             |   1 +
 xen/arch/arm/gic-v3-lpi.c         | 129 ++++++++++++++++++++++++++++++++++++++
 xen/arch/arm/gic-v3.c             |  44 +++++++++++++
 xen/include/asm-arm/bitops.h      |   1 +
 xen/include/asm-arm/gic.h         |   2 +
 xen/include/asm-arm/gic_v3_defs.h |  52 ++++++++++++++-
 xen/include/asm-arm/gic_v3_its.h  |  22 ++++++-
 8 files changed, 264 insertions(+), 2 deletions(-)
 create mode 100644 xen/arch/arm/gic-v3-lpi.c

diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
index bf64c61..71734a1 100644
--- a/xen/arch/arm/Kconfig
+++ b/xen/arch/arm/Kconfig
@@ -49,6 +49,21 @@ config HAS_ITS
         bool "GICv3 ITS MSI controller support"
         depends on HAS_GICV3
 
+config MAX_PHYS_LPI_BITS
+        depends on HAS_ITS
+        int "Maximum bits for GICv3 host LPIs (14-32)"
+        range 14 32
+        default "20"
+        help
+          Specifies the maximum number of LPIs (in bits) Xen should take
+          care of. The host ITS may provide support for a very large number
+          of supported LPIs, for all of which we may not want to allocate
+          memory, so this number here allows to limit this.
+          Xen itself does not know how many LPIs domains will ever need
+          beforehand.
+          This can be overridden on the command line with the max_lpi_bits
+          parameter.
+
 endmenu
 
 menu "ARM errata workaround via the alternative framework"
diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
index 5f4ff23..4ccf2eb 100644
--- a/xen/arch/arm/Makefile
+++ b/xen/arch/arm/Makefile
@@ -19,6 +19,7 @@ obj-y += gic.o
 obj-y += gic-v2.o
 obj-$(CONFIG_HAS_GICV3) += gic-v3.o
 obj-$(CONFIG_HAS_ITS) += gic-v3-its.o
+obj-$(CONFIG_HAS_ITS) += gic-v3-lpi.o
 obj-y += guestcopy.o
 obj-y += hvm.o
 obj-y += io.o
diff --git a/xen/arch/arm/gic-v3-lpi.c b/xen/arch/arm/gic-v3-lpi.c
new file mode 100644
index 0000000..e2fc901
--- /dev/null
+++ b/xen/arch/arm/gic-v3-lpi.c
@@ -0,0 +1,129 @@
+/*
+ * xen/arch/arm/gic-v3-lpi.c
+ *
+ * ARM GICv3 Locality-specific Peripheral Interrupts (LPI) support
+ *
+ * Copyright (C) 2016,2017 - ARM Ltd
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include <xen/config.h>
+#include <xen/lib.h>
+#include <xen/mm.h>
+#include <xen/sizes.h>
+#include <asm/gic.h>
+#include <asm/gic_v3_defs.h>
+#include <asm/gic_v3_its.h>
+
+/* Global state */
+static struct {
+    uint8_t *lpi_property;
+    unsigned int host_lpi_bits;
+} lpi_data;
+
+/* Pending table for each redistributor */
+static DEFINE_PER_CPU(void *, pending_table);
+
+#define MAX_PHYS_LPIS   (BIT_ULL(lpi_data.host_lpi_bits) - LPI_OFFSET)
+
+uint64_t gicv3_lpi_allocate_pendtable(void)
+{
+    uint64_t reg;
+    void *pendtable;
+
+    reg  = GIC_BASER_CACHE_RaWaWb << GICR_PENDBASER_INNER_CACHEABILITY_SHIFT;
+    reg |= GIC_BASER_CACHE_SameAsInner << GICR_PENDBASER_OUTER_CACHEABILITY_SHIFT;
+    reg |= GIC_BASER_InnerShareable << GICR_PENDBASER_SHAREABILITY_SHIFT;
+
+    if ( !this_cpu(pending_table) )
+    {
+        /*
+         * The pending table holds one bit per LPI and even covers bits for
+         * interrupt IDs below 8192, so we allocate the full range.
+         * The GICv3 imposes a 64KB alignment requirement.
+         */
+        pendtable = _xmalloc(BIT_ULL(lpi_data.host_lpi_bits) / 8, SZ_64K);
+        if ( !pendtable )
+            return 0;
+
+        memset(pendtable, 0, BIT_ULL(lpi_data.host_lpi_bits) / 8);
+        __flush_dcache_area(pendtable, BIT_ULL(lpi_data.host_lpi_bits) / 8);
+
+        this_cpu(pending_table) = pendtable;
+    }
+    else
+    {
+        pendtable = this_cpu(pending_table);
+    }
+
+    reg |= GICR_PENDBASER_PTZ;
+
+    ASSERT(!(virt_to_maddr(pendtable) & ~GENMASK(51, 16)));
+    reg |= virt_to_maddr(pendtable);
+
+    return reg;
+}
+
+uint64_t gicv3_lpi_get_proptable(void)
+{
+    uint64_t reg;
+
+    reg  = GIC_BASER_CACHE_RaWaWb << GICR_PROPBASER_INNER_CACHEABILITY_SHIFT;
+    reg |= GIC_BASER_CACHE_SameAsInner << GICR_PROPBASER_OUTER_CACHEABILITY_SHIFT;
+    reg |= GIC_BASER_InnerShareable << GICR_PROPBASER_SHAREABILITY_SHIFT;
+
+    /*
+     * The property table is shared across all redistributors, so allocate
+     * this only once, but return the same value on subsequent calls.
+     */
+    if ( !lpi_data.lpi_property )
+    {
+        /* The property table holds one byte per LPI. */
+        void *table = alloc_xenheap_pages(lpi_data.host_lpi_bits - PAGE_SHIFT,
+                                          0);
+
+        if ( !table )
+            return 0;
+
+        memset(table, GIC_PRI_IRQ | LPI_PROP_RES1, MAX_PHYS_LPIS);
+        __flush_dcache_area(table, MAX_PHYS_LPIS);
+        lpi_data.lpi_property = table;
+    }
+
+    reg |= ((lpi_data.host_lpi_bits - 1) << 0);
+
+    ASSERT(!(virt_to_maddr(lpi_data.lpi_property) & ~GENMASK(51, 12)));
+    reg |= virt_to_maddr(lpi_data.lpi_property);
+
+    return reg;
+}
+
+static unsigned int max_lpi_bits = CONFIG_MAX_PHYS_LPI_BITS;
+integer_param("max_lpi_bits", max_lpi_bits);
+
+int gicv3_lpi_init_host_lpis(unsigned int hw_lpi_bits)
+{
+    lpi_data.host_lpi_bits = min(hw_lpi_bits, max_lpi_bits);
+
+    printk("GICv3: using at most %llu LPIs on the host.\n", MAX_PHYS_LPIS);
+
+    return 0;
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
index 838dd11..fcb86c8 100644
--- a/xen/arch/arm/gic-v3.c
+++ b/xen/arch/arm/gic-v3.c
@@ -546,6 +546,9 @@ static void __init gicv3_dist_init(void)
     type = readl_relaxed(GICD + GICD_TYPER);
     nr_lines = 32 * ((type & GICD_TYPE_LINES) + 1);
 
+    if ( type & GICD_TYPE_LPIS )
+        gicv3_lpi_init_host_lpis(((type >> GICD_TYPE_ID_BITS_SHIFT) & 0x1f) + 1);
+
     printk("GICv3: %d lines, (IID %8.8x).\n",
            nr_lines, readl_relaxed(GICD + GICD_IIDR));
 
@@ -616,6 +619,33 @@ static int gicv3_enable_redist(void)
     return 0;
 }
 
+static int gicv3_rdist_init_lpis(void __iomem *rdist_base)
+{
+    uint32_t reg;
+    uint64_t table_reg;
+
+    /* We don't support LPIs without an ITS. */
+    if ( list_empty(&host_its_list) )
+        return -ENODEV;
+
+    /* Make sure LPIs are disabled before setting up the tables. */
+    reg = readl_relaxed(rdist_base + GICR_CTLR);
+    if ( reg & GICR_CTLR_ENABLE_LPIS )
+        return -EBUSY;
+
+    table_reg = gicv3_lpi_allocate_pendtable();
+    if ( !table_reg )
+        return -ENOMEM;
+    writeq_relaxed(table_reg, rdist_base + GICR_PENDBASER);
+
+    table_reg = gicv3_lpi_get_proptable();
+    if ( !table_reg )
+        return -ENOMEM;
+    writeq_relaxed(table_reg, rdist_base + GICR_PROPBASER);
+
+    return 0;
+}
+
 static int __init gicv3_populate_rdist(void)
 {
     int i;
@@ -658,6 +688,20 @@ static int __init gicv3_populate_rdist(void)
             if ( (typer >> 32) == aff )
             {
                 this_cpu(rbase) = ptr;
+
+                if ( typer & GICR_TYPER_PLPIS )
+                {
+                    int ret;
+
+                    ret = gicv3_rdist_init_lpis(ptr);
+                    if ( ret && ret != -ENODEV )
+                    {
+                        printk("GICv3: CPU%d: Cannot initialize LPIs: %d\n",
+                               smp_processor_id(), ret);
+                        break;
+                    }
+                }
+
                 printk("GICv3: CPU%d: Found redistributor in region %d @%p\n",
                         smp_processor_id(), i, ptr);
                 return 0;
diff --git a/xen/include/asm-arm/bitops.h b/xen/include/asm-arm/bitops.h
index bda8898..1cbfb9e 100644
--- a/xen/include/asm-arm/bitops.h
+++ b/xen/include/asm-arm/bitops.h
@@ -24,6 +24,7 @@
 #define BIT(nr)                 (1UL << (nr))
 #define BIT_MASK(nr)            (1UL << ((nr) % BITS_PER_WORD))
 #define BIT_WORD(nr)            ((nr) / BITS_PER_WORD)
+#define BIT_ULL(nr)             (1ULL << (nr))
 #define BITS_PER_BYTE           8
 
 #define ADDR (*(volatile int *) addr)
diff --git a/xen/include/asm-arm/gic.h b/xen/include/asm-arm/gic.h
index 836a103..12bd155 100644
--- a/xen/include/asm-arm/gic.h
+++ b/xen/include/asm-arm/gic.h
@@ -220,6 +220,8 @@ enum gic_version {
     GIC_V3,
 };
 
+#define LPI_OFFSET      8192
+
 extern enum gic_version gic_hw_version(void);
 
 /* Program the IRQ type into the GIC */
diff --git a/xen/include/asm-arm/gic_v3_defs.h b/xen/include/asm-arm/gic_v3_defs.h
index 6bd25a5..b307322 100644
--- a/xen/include/asm-arm/gic_v3_defs.h
+++ b/xen/include/asm-arm/gic_v3_defs.h
@@ -44,7 +44,8 @@
 #define GICC_SRE_EL2_ENEL1           (1UL << 3)
 
 /* Additional bits in GICD_TYPER defined by GICv3 */
-#define GICD_TYPE_ID_BITS_SHIFT 19
+#define GICD_TYPE_ID_BITS_SHIFT      19
+#define GICD_TYPE_LPIS               (1U << 17)
 
 #define GICD_CTLR_RWP                (1UL << 31)
 #define GICD_CTLR_ARE_NS             (1U << 4)
@@ -95,12 +96,61 @@
 #define GICR_IGRPMODR0               (0x0D00)
 #define GICR_NSACR                   (0x0E00)
 
+#define GICR_CTLR_ENABLE_LPIS        (1U << 0)
+
 #define GICR_TYPER_PLPIS             (1U << 0)
 #define GICR_TYPER_VLPIS             (1U << 1)
 #define GICR_TYPER_LAST              (1U << 4)
 
+/* For specifying the inner cacheability type only */
+#define GIC_BASER_CACHE_nCnB         0ULL
+/* For specifying the outer cacheability type only */
+#define GIC_BASER_CACHE_SameAsInner  0ULL
+#define GIC_BASER_CACHE_nC           1ULL
+#define GIC_BASER_CACHE_RaWt         2ULL
+#define GIC_BASER_CACHE_RaWb         3ULL
+#define GIC_BASER_CACHE_WaWt         4ULL
+#define GIC_BASER_CACHE_WaWb         5ULL
+#define GIC_BASER_CACHE_RaWaWt       6ULL
+#define GIC_BASER_CACHE_RaWaWb       7ULL
+#define GIC_BASER_CACHE_MASK         7ULL
+
+#define GIC_BASER_NonShareable       0ULL
+#define GIC_BASER_InnerShareable     1ULL
+#define GIC_BASER_OuterShareable     2ULL
+
+#define GICR_PROPBASER_SHAREABILITY_SHIFT               10
+#define GICR_PROPBASER_INNER_CACHEABILITY_SHIFT         7
+#define GICR_PROPBASER_OUTER_CACHEABILITY_SHIFT         56
+#define GICR_PROPBASER_SHAREABILITY_MASK                     \
+        (3UL << GICR_PROPBASER_SHAREABILITY_SHIFT)
+#define GICR_PROPBASER_INNER_CACHEABILITY_MASK               \
+        (7UL << GICR_PROPBASER_INNER_CACHEABILITY_SHIFT)
+#define GICR_PROPBASER_OUTER_CACHEABILITY_MASK               \
+        (7UL << GICR_PROPBASER_OUTER_CACHEABILITY_SHIFT)
+#define GICR_PROPBASER_RES0_MASK                             \
+        (GENMASK(63, 59) | GENMASK(55, 52) | GENMASK(6, 5))
+
+#define GICR_PENDBASER_SHAREABILITY_SHIFT               10
+#define GICR_PENDBASER_INNER_CACHEABILITY_SHIFT         7
+#define GICR_PENDBASER_OUTER_CACHEABILITY_SHIFT         56
+#define GICR_PENDBASER_SHAREABILITY_MASK                     \
+        (3UL << GICR_PENDBASER_SHAREABILITY_SHIFT)
+#define GICR_PENDBASER_INNER_CACHEABILITY_MASK               \
+        (7UL << GICR_PENDBASER_INNER_CACHEABILITY_SHIFT)
+#define GICR_PENDBASER_OUTER_CACHEABILITY_MASK               \
+        (7UL << GICR_PENDBASER_OUTER_CACHEABILITY_SHIFT)
+#define GICR_PENDBASER_PTZ                              BIT(62)
+#define GICR_PENDBASER_RES0_MASK                             \
+        (BIT(63) | GENMASK(61, 59) | GENMASK(55, 52) |       \
+         GENMASK(15, 12) | GENMASK(6, 0))
+
 #define DEFAULT_PMR_VALUE            0xff
 
+#define LPI_PROP_PRIO_MASK           0xfc
+#define LPI_PROP_RES1                (1 << 1)
+#define LPI_PROP_ENABLED             (1 << 0)
+
 #define GICH_VMCR_EOI                (1 << 9)
 #define GICH_VMCR_VENG1              (1 << 1)
 
diff --git a/xen/include/asm-arm/gic_v3_its.h b/xen/include/asm-arm/gic_v3_its.h
index 2f5c51c..a66b6be 100644
--- a/xen/include/asm-arm/gic_v3_its.h
+++ b/xen/include/asm-arm/gic_v3_its.h
@@ -36,12 +36,32 @@ extern struct list_head host_its_list;
 /* Parse the host DT and pick up all host ITSes. */
 void gicv3_its_dt_init(const struct dt_device_node *node);
 
+/* Allocate and initialize tables for each host redistributor.
+ * Returns the respective {PROP,PEND}BASER register value.
+ */
+uint64_t gicv3_lpi_get_proptable(void);
+uint64_t gicv3_lpi_allocate_pendtable(void);
+
+/* Initialize the host structures for LPIs. */
+int gicv3_lpi_init_host_lpis(unsigned int nr_lpis);
+
 #else
 
 static inline void gicv3_its_dt_init(const struct dt_device_node *node)
 {
 }
-
+static inline uint64_t gicv3_lpi_get_proptable(void)
+{
+    return 0;
+}
+static inline uint64_t gicv3_lpi_allocate_pendtable(void)
+{
+    return 0;
+}
+static inline int gicv3_lpi_init_host_lpis(unsigned int nr_lpis)
+{
+    return 0;
+}
 #endif /* CONFIG_HAS_ITS */
 
 #endif /* __ASSEMBLY__ */
-- 
2.9.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


* [PATCH 04/28] ARM: GICv3 ITS: allocate device and collection table
  2017-01-30 18:31 [PATCH 00/28] arm64: Dom0 ITS emulation Andre Przywara
                   ` (2 preceding siblings ...)
  2017-01-30 18:31 ` [PATCH 03/28] ARM: GICv3: allocate LPI pending and property table Andre Przywara
@ 2017-01-30 18:31 ` Andre Przywara
  2017-02-06 17:19   ` Julien Grall
                     ` (5 more replies)
  2017-01-30 18:31 ` [PATCH 05/28] ARM: GICv3 ITS: map ITS command buffer Andre Przywara
                   ` (25 subsequent siblings)
  29 siblings, 6 replies; 106+ messages in thread
From: Andre Przywara @ 2017-01-30 18:31 UTC (permalink / raw)
  To: Stefano Stabellini, Julien Grall; +Cc: xen-devel, Vijay Kilari

Each ITS maps a pair of a DeviceID (usually the PCI b/d/f triplet) and
an EventID (the MSI payload or interrupt ID) to a pair of LPI number
and collection ID, which points to the target CPU.
This mapping is stored in the device and collection tables, which software
has to provide for the ITS to use.
Allocate the required memory and hand it to the ITS.
The maximum number of devices is limited to a compile-time constant
exposed in Kconfig.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
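As a rough illustration of the table sizing described above, here is a
standalone sketch (not part of the patch) that computes the allocation
size and the GITS_BASER size field from an entry size and a page size
encoding. All sketch_* names and SKETCH_PAGE_SHIFT are made up for this
note. The "number of pages minus one" encoding of the size field is my
reading of the GICv3 specification, so please double-check it.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12    /* assumption: 4K base pages */

/* Page size encodings 0, 1 and 2 select 4K, 16K and 64K ITS pages. */
static unsigned int sketch_page_bits(unsigned int pagesz)
{
    assert(pagesz <= 2);
    return pagesz * 2 + SKETCH_PAGE_SHIFT;
}

/* Bytes needed for nr_items entries, rounded up to whole ITS pages. */
static uint64_t sketch_table_bytes(uint64_t nr_items, unsigned int entry_size,
                                   unsigned int pagesz)
{
    uint64_t page = UINT64_C(1) << sketch_page_bits(pagesz);

    assert(nr_items > 0 && entry_size > 0);
    return (nr_items * entry_size + page - 1) & ~(page - 1);
}

/* Size field of GITS_BASER: the number of ITS pages, minus one. */
static uint64_t sketch_baser_size_field(uint64_t nr_items,
                                        unsigned int entry_size,
                                        unsigned int pagesz)
{
    uint64_t bytes = sketch_table_bytes(nr_items, entry_size, pagesz);

    return (bytes >> sketch_page_bits(pagesz)) - 1;
}
```

With the default of 10 device bits and a hypothetical entry size of
8 bytes, the device table needs 1024 * 8 = 8192 bytes: two 4K pages, or
a single (mostly unused) 64K page if the ITS insists on that page size.
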
 xen/arch/arm/Kconfig             |  14 +++++
 xen/arch/arm/gic-v3-its.c        | 129 +++++++++++++++++++++++++++++++++++++++
 xen/arch/arm/gic-v3.c            |   5 ++
 xen/include/asm-arm/gic_v3_its.h |  55 ++++++++++++++++-
 4 files changed, 202 insertions(+), 1 deletion(-)

diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
index 71734a1..81bc233 100644
--- a/xen/arch/arm/Kconfig
+++ b/xen/arch/arm/Kconfig
@@ -64,6 +64,20 @@ config MAX_PHYS_LPI_BITS
           This can be overriden on the command line with the max_lpi_bits
           parameter.
 
+config MAX_PHYS_ITS_DEVICE_BITS
+        depends on HAS_ITS
+        int "Number of device bits the ITS supports"
+        range 1 32
+        default "10"
+        help
+          Specifies the number of DeviceID bits, which limits how many
+          devices can use the ITS. Xen needs to allocate memory for the
+          whole range very early. DeviceIDs may be sparse, so the range must
+          be much larger than the device count to cover devices with a high
+          bus number or those on separate bus segments.
+          This can be overridden on the command line with the
+          max_its_device_bits parameter.
+
 endmenu
 
 menu "ARM errata workaround via the alternative framework"
diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
index ff0f571..c31fef6 100644
--- a/xen/arch/arm/gic-v3-its.c
+++ b/xen/arch/arm/gic-v3-its.c
@@ -20,9 +20,138 @@
 #include <xen/lib.h>
 #include <xen/device_tree.h>
 #include <xen/libfdt/libfdt.h>
+#include <xen/mm.h>
+#include <xen/sizes.h>
 #include <asm/gic.h>
 #include <asm/gic_v3_defs.h>
 #include <asm/gic_v3_its.h>
+#include <asm/io.h>
+
+#define BASER_ATTR_MASK                                           \
+        ((0x3UL << GITS_BASER_SHAREABILITY_SHIFT)               | \
+         (0x7UL << GITS_BASER_OUTER_CACHEABILITY_SHIFT)         | \
+         (0x7UL << GITS_BASER_INNER_CACHEABILITY_SHIFT))
+#define BASER_RO_MASK   (GENMASK(58, 56) | GENMASK(52, 48))
+
+static uint64_t encode_phys_addr(paddr_t addr, int page_bits)
+{
+    uint64_t ret;
+
+    if ( page_bits < 16 )
+        return (uint64_t)addr & GENMASK(47, page_bits);
+
+    ret = addr & GENMASK(47, 16);
+    return ret | (addr & GENMASK(51, 48)) >> (48 - 12);
+}
+
+#define PAGE_BITS(sz) ((sz) * 2 + PAGE_SHIFT)
+
+static int its_map_baser(void __iomem *basereg, uint64_t regc, int nr_items)
+{
+    uint64_t attr, reg;
+    int entry_size = ((regc >> GITS_BASER_ENTRY_SIZE_SHIFT) & 0x1f) + 1;
+    int pagesz = 0, order, table_size;
+    void *buffer = NULL;
+
+    attr  = GIC_BASER_InnerShareable << GITS_BASER_SHAREABILITY_SHIFT;
+    attr |= GIC_BASER_CACHE_SameAsInner << GITS_BASER_OUTER_CACHEABILITY_SHIFT;
+    attr |= GIC_BASER_CACHE_RaWaWb << GITS_BASER_INNER_CACHEABILITY_SHIFT;
+
+    /*
+     * Setup the BASE register with the attributes that we like. Then read
+     * it back and see what sticks (page size, cacheability and shareability
+     * attributes), retrying if necessary.
+     */
+    while ( 1 )
+    {
+        table_size = ROUNDUP(nr_items * entry_size, BIT(PAGE_BITS(pagesz)));
+        order = get_order_from_bytes(table_size);
+
+        if ( !buffer )
+            buffer = alloc_xenheap_pages(order, 0);
+        if ( !buffer )
+            return -ENOMEM;
+
+        reg  = attr;
+        reg |= (pagesz << GITS_BASER_PAGE_SIZE_SHIFT);
+        reg |= (table_size >> PAGE_BITS(pagesz)) - 1;
+        reg |= regc & BASER_RO_MASK;
+        reg |= GITS_VALID_BIT;
+        reg |= encode_phys_addr(virt_to_maddr(buffer), PAGE_BITS(pagesz));
+
+        writeq_relaxed(reg, basereg);
+        regc = readq_relaxed(basereg);
+
+        /* The host didn't like our attributes, just use what it returned. */
+        if ( (regc & BASER_ATTR_MASK) != attr )
+        {
+            /* If we can't map it shareable, drop cacheability as well. */
+            if ( (regc & GITS_BASER_SHAREABILITY_MASK) == GIC_BASER_NonShareable )
+            {
+                regc &= ~GITS_BASER_INNER_CACHEABILITY_MASK;
+                attr = regc & BASER_ATTR_MASK;
+                continue;
+            }
+            attr = regc & BASER_ATTR_MASK;
+        }
+
+        /* If the host accepted our page size, we are done. */
+        if ( ((regc >> GITS_BASER_PAGE_SIZE_SHIFT) & 0x3) == pagesz )
+            return 0;
+
+        /* None of the page sizes was accepted, give up */
+        if ( pagesz >= 2 )
+            break;
+
+        free_xenheap_pages(buffer, order);
+        buffer = NULL;
+
+        pagesz++;
+    }
+
+    if ( buffer )
+        free_xenheap_pages(buffer, order);
+
+    return -EINVAL;
+}
+
+static unsigned int max_its_device_bits = CONFIG_MAX_PHYS_ITS_DEVICE_BITS;
+integer_param("max_its_device_bits", max_its_device_bits);
+
+int gicv3_its_init(struct host_its *hw_its)
+{
+    uint64_t reg;
+    int i;
+
+    hw_its->its_base = ioremap_nocache(hw_its->addr, hw_its->size);
+    if ( !hw_its->its_base )
+        return -ENOMEM;
+
+    for ( i = 0; i < GITS_BASER_NR_REGS; i++ )
+    {
+        void __iomem *basereg = hw_its->its_base + GITS_BASER0 + i * 8;
+        int type;
+
+        reg = readq_relaxed(basereg);
+        type = (reg & GITS_BASER_TYPE_MASK) >> GITS_BASER_TYPE_SHIFT;
+        switch ( type )
+        {
+        case GITS_BASER_TYPE_NONE:
+            continue;
+        case GITS_BASER_TYPE_DEVICE:
+            /* TODO: find some better way of limiting the number of devices */
+            its_map_baser(basereg, reg, BIT(max_its_device_bits));
+            break;
+        case GITS_BASER_TYPE_COLLECTION:
+            its_map_baser(basereg, reg, NR_CPUS);
+            break;
+        default:
+            continue;
+        }
+    }
+
+    return 0;
+}
 
 /* Scan the DT for any ITS nodes and create a list of host ITSes out of it. */
 void gicv3_its_dt_init(const struct dt_device_node *node)
diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
index fcb86c8..440c079 100644
--- a/xen/arch/arm/gic-v3.c
+++ b/xen/arch/arm/gic-v3.c
@@ -29,6 +29,7 @@
 #include <xen/irq.h>
 #include <xen/iocap.h>
 #include <xen/sched.h>
+#include <xen/err.h>
 #include <xen/errno.h>
 #include <xen/delay.h>
 #include <xen/device_tree.h>
@@ -1563,6 +1564,7 @@ static int __init gicv3_init(void)
 {
     int res, i;
     uint32_t reg;
+    struct host_its *hw_its;
 
     if ( !cpu_has_gicv3 )
     {
@@ -1618,6 +1620,9 @@ static int __init gicv3_init(void)
     res = gicv3_cpu_init();
     gicv3_hyp_init();
 
+    list_for_each_entry(hw_its, &host_its_list, entry)
+        gicv3_its_init(hw_its);
+
     spin_unlock(&gicv3.lock);
 
     return res;
diff --git a/xen/include/asm-arm/gic_v3_its.h b/xen/include/asm-arm/gic_v3_its.h
index a66b6be..ed44bdb 100644
--- a/xen/include/asm-arm/gic_v3_its.h
+++ b/xen/include/asm-arm/gic_v3_its.h
@@ -18,6 +18,53 @@
 #ifndef __ASM_ARM_ITS_H__
 #define __ASM_ARM_ITS_H__
 
+#define GITS_CTLR                       0x000
+#define GITS_IIDR                       0x004
+#define GITS_TYPER                      0x008
+#define GITS_CBASER                     0x080
+#define GITS_CWRITER                    0x088
+#define GITS_CREADR                     0x090
+#define GITS_BASER_NR_REGS              8
+#define GITS_BASER0                     0x100
+#define GITS_BASER1                     0x108
+#define GITS_BASER2                     0x110
+#define GITS_BASER3                     0x118
+#define GITS_BASER4                     0x120
+#define GITS_BASER5                     0x128
+#define GITS_BASER6                     0x130
+#define GITS_BASER7                     0x138
+
+/* Register bits */
+#define GITS_VALID_BIT                  BIT_ULL(63)
+
+#define GITS_CTLR_QUIESCENT             BIT(31)
+#define GITS_CTLR_ENABLE                BIT(0)
+
+#define GITS_IIDR_VALUE                 0x34c
+
+#define GITS_BASER_INDIRECT             BIT_ULL(62)
+#define GITS_BASER_INNER_CACHEABILITY_SHIFT        59
+#define GITS_BASER_TYPE_SHIFT           56
+#define GITS_BASER_TYPE_MASK            (7ULL << GITS_BASER_TYPE_SHIFT)
+#define GITS_BASER_OUTER_CACHEABILITY_SHIFT        53
+#define GITS_BASER_TYPE_NONE            0UL
+#define GITS_BASER_TYPE_DEVICE          1UL
+#define GITS_BASER_TYPE_VCPU            2UL
+#define GITS_BASER_TYPE_CPU             3UL
+#define GITS_BASER_TYPE_COLLECTION      4UL
+#define GITS_BASER_TYPE_RESERVED5       5UL
+#define GITS_BASER_TYPE_RESERVED6       6UL
+#define GITS_BASER_TYPE_RESERVED7       7UL
+#define GITS_BASER_ENTRY_SIZE_SHIFT     48
+#define GITS_BASER_SHAREABILITY_SHIFT   10
+#define GITS_BASER_PAGE_SIZE_SHIFT      8
+#define GITS_BASER_RO_MASK              (GITS_BASER_TYPE_MASK | \
+                                        (31UL << GITS_BASER_ENTRY_SIZE_SHIFT) |\
+                                        GITS_BASER_INDIRECT)
+#define GITS_BASER_SHAREABILITY_MASK   (0x3ULL << GITS_BASER_SHAREABILITY_SHIFT)
+#define GITS_BASER_OUTER_CACHEABILITY_MASK   (0x7ULL << GITS_BASER_OUTER_CACHEABILITY_SHIFT)
+#define GITS_BASER_INNER_CACHEABILITY_MASK   (0x7ULL << GITS_BASER_INNER_CACHEABILITY_SHIFT)
+
 #ifndef __ASSEMBLY__
 #include <xen/device_tree.h>
 
@@ -27,6 +74,7 @@ struct host_its {
     const struct dt_device_node *dt_node;
     paddr_t addr;
     paddr_t size;
+    void __iomem *its_base;
 };
 
 extern struct list_head host_its_list;
@@ -42,8 +90,9 @@ void gicv3_its_dt_init(const struct dt_device_node *node);
 uint64_t gicv3_lpi_get_proptable(void);
 uint64_t gicv3_lpi_allocate_pendtable(void);
 
-/* Initialize the host structures for LPIs. */
+/* Initialize the host structures for LPIs and the host ITSes. */
 int gicv3_lpi_init_host_lpis(unsigned int nr_lpis);
+int gicv3_its_init(struct host_its *hw_its);
 
 #else
 
@@ -62,6 +111,10 @@ static inline int gicv3_lpi_init_host_lpis(unsigned int nr_lpis)
 {
     return 0;
 }
+static inline int gicv3_its_init(struct host_its *hw_its)
+{
+    return 0;
+}
 #endif /* CONFIG_HAS_ITS */
 
 #endif /* __ASSEMBLY__ */
-- 
2.9.0



* [PATCH 05/28] ARM: GICv3 ITS: map ITS command buffer
  2017-01-30 18:31 [PATCH 00/28] arm64: Dom0 ITS emulation Andre Przywara
                   ` (3 preceding siblings ...)
  2017-01-30 18:31 ` [PATCH 04/28] ARM: GICv3 ITS: allocate device and collection table Andre Przywara
@ 2017-01-30 18:31 ` Andre Przywara
  2017-02-06 17:43   ` Julien Grall
  2017-02-14  0:59   ` Stefano Stabellini
  2017-01-30 18:31 ` [PATCH 06/28] ARM: GICv3 ITS: introduce ITS command handling Andre Przywara
                   ` (24 subsequent siblings)
  29 siblings, 2 replies; 106+ messages in thread
From: Andre Przywara @ 2017-01-30 18:31 UTC (permalink / raw)
  To: Stefano Stabellini, Julien Grall; +Cc: xen-devel, Vijay Kilari

Instead of directly manipulating the tables in memory, an ITS driver
sends commands via a ring buffer to the ITS h/w to create or alter the
LPI mappings.
Allocate memory for that buffer and tell the ITS about it to be able
to send ITS commands.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
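For illustration only (not part of the patch), this standalone sketch
shows how the GITS_CBASER value for the command buffer is put together
and how the read-back value decides whether the queue needs explicit
cache flushes. The sketch_* and SKETCH_* names are invented for this
note; the field positions mirror the GITS_BASER_* definitions used by
the patch.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE            4096ULL
#define SKETCH_VALID                (1ULL << 63)
#define SKETCH_INNER_CACHE_SHIFT    59
#define SKETCH_INNER_CACHE_MASK     (7ULL << SKETCH_INNER_CACHE_SHIFT)
#define SKETCH_SHAREABILITY_SHIFT   10
#define SKETCH_CACHE_RaWaWb         7ULL
#define SKETCH_INNER_SHAREABLE      1ULL
#define SKETCH_SIZE_MASK            0xffULL
/* Physical address field, bits [51:12]. */
#define SKETCH_ADDR_MASK            (((1ULL << 52) - 1) & ~(SKETCH_PAGE_SIZE - 1))

/* Compose the register value for a queue of queue_bytes at paddr. */
static uint64_t sketch_cbaser_value(uint64_t paddr, uint64_t queue_bytes)
{
    uint64_t reg;

    assert(!(paddr & ~SKETCH_ADDR_MASK));
    assert(queue_bytes >= SKETCH_PAGE_SIZE);
    assert(!(queue_bytes % SKETCH_PAGE_SIZE));

    reg  = SKETCH_VALID | paddr;
    reg |= SKETCH_INNER_SHAREABLE << SKETCH_SHAREABILITY_SHIFT;
    reg |= SKETCH_CACHE_RaWaWb << SKETCH_INNER_CACHE_SHIFT;
    /* The size field holds the number of 4K pages, minus one. */
    reg |= (queue_bytes / SKETCH_PAGE_SIZE - 1) & SKETCH_SIZE_MASK;

    return reg;
}

/* Non-zero if the ITS reports the queue as non-cacheable (read-back value). */
static int sketch_needs_flush(uint64_t regc)
{
    return !(regc & SKETCH_INNER_CACHE_MASK);
}
```

For the 64K queue used here the size field ends up as 15 (16 pages
minus one).
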
 xen/arch/arm/gic-v3-its.c        | 46 ++++++++++++++++++++++++++++++++++++++++
 xen/include/asm-arm/gic_v3_its.h |  6 ++++++
 2 files changed, 52 insertions(+)

diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
index c31fef6..ad7cd2a 100644
--- a/xen/arch/arm/gic-v3-its.c
+++ b/xen/arch/arm/gic-v3-its.c
@@ -27,6 +27,8 @@
 #include <asm/gic_v3_its.h>
 #include <asm/io.h>
 
+#define ITS_CMD_QUEUE_SZ                SZ_64K
+
 #define BASER_ATTR_MASK                                           \
         ((0x3UL << GITS_BASER_SHAREABILITY_SHIFT)               | \
          (0x7UL << GITS_BASER_OUTER_CACHEABILITY_SHIFT)         | \
@@ -44,6 +46,45 @@ static uint64_t encode_phys_addr(paddr_t addr, int page_bits)
     return ret | (addr & GENMASK(51, 48)) >> (48 - 12);
 }
 
+static void *its_map_cbaser(struct host_its *its)
+{
+    void __iomem *cbasereg = its->its_base + GITS_CBASER;
+    uint64_t reg, regc;
+    void *buffer;
+    paddr_t paddr;
+
+    reg  = GIC_BASER_InnerShareable << GITS_BASER_SHAREABILITY_SHIFT;
+    reg |= GIC_BASER_CACHE_SameAsInner << GITS_BASER_OUTER_CACHEABILITY_SHIFT;
+    reg |= GIC_BASER_CACHE_RaWaWb << GITS_BASER_INNER_CACHEABILITY_SHIFT;
+
+    buffer = _xzalloc(ITS_CMD_QUEUE_SZ, PAGE_SIZE);
+    if ( !buffer )
+        return NULL;
+    paddr = virt_to_maddr(buffer);
+    ASSERT(!(paddr & ~GENMASK(51, 12)));
+
+    reg |= GITS_VALID_BIT | paddr;
+    reg |= ((ITS_CMD_QUEUE_SZ / PAGE_SIZE) - 1) & GITS_CBASER_SIZE_MASK;
+    writeq_relaxed(reg, cbasereg);
+    regc = readq_relaxed(cbasereg);
+
+    /* If the ITS dropped shareability, drop cacheability as well. */
+    if ( (regc & GITS_BASER_SHAREABILITY_MASK) == 0 )
+    {
+        regc &= ~GITS_BASER_INNER_CACHEABILITY_MASK;
+        writeq_relaxed(regc, cbasereg);
+    }
+
+    /*
+     * If the command queue memory is mapped as uncached, we need to flush
+     * it on every access.
+     */
+    if ( !(regc & GITS_BASER_INNER_CACHEABILITY_MASK) )
+        its->flags |= HOST_ITS_FLUSH_CMD_QUEUE;
+
+    return buffer;
+}
+
 #define PAGE_BITS(sz) ((sz) * 2 + PAGE_SHIFT)
 
 static int its_map_baser(void __iomem *basereg, uint64_t regc, int nr_items)
@@ -150,6 +191,11 @@ int gicv3_its_init(struct host_its *hw_its)
         }
     }
 
+    hw_its->cmd_buf = its_map_cbaser(hw_its);
+    if ( !hw_its->cmd_buf )
+        return -ENOMEM;
+    writeq_relaxed(0, hw_its->its_base + GITS_CWRITER);
+
     return 0;
 }
 
diff --git a/xen/include/asm-arm/gic_v3_its.h b/xen/include/asm-arm/gic_v3_its.h
index ed44bdb..ff5572f 100644
--- a/xen/include/asm-arm/gic_v3_its.h
+++ b/xen/include/asm-arm/gic_v3_its.h
@@ -65,9 +65,13 @@
 #define GITS_BASER_OUTER_CACHEABILITY_MASK   (0x7ULL << GITS_BASER_OUTER_CACHEABILITY_SHIFT)
 #define GITS_BASER_INNER_CACHEABILITY_MASK   (0x7ULL << GITS_BASER_INNER_CACHEABILITY_SHIFT)
 
+#define GITS_CBASER_SIZE_MASK           0xff
+
 #ifndef __ASSEMBLY__
 #include <xen/device_tree.h>
 
+#define HOST_ITS_FLUSH_CMD_QUEUE        (1U << 0)
+
 /* data structure for each hardware ITS */
 struct host_its {
     struct list_head entry;
@@ -75,6 +79,8 @@ struct host_its {
     paddr_t addr;
     paddr_t size;
     void __iomem *its_base;
+    void *cmd_buf;
+    unsigned int flags;
 };
 
 extern struct list_head host_its_list;
-- 
2.9.0



* [PATCH 06/28] ARM: GICv3 ITS: introduce ITS command handling
  2017-01-30 18:31 [PATCH 00/28] arm64: Dom0 ITS emulation Andre Przywara
                   ` (4 preceding siblings ...)
  2017-01-30 18:31 ` [PATCH 05/28] ARM: GICv3 ITS: map ITS command buffer Andre Przywara
@ 2017-01-30 18:31 ` Andre Przywara
  2017-02-06 19:16   ` Julien Grall
  2017-02-07 11:59   ` Julien Grall
  2017-01-30 18:31 ` [PATCH 07/28] ARM: GICv3 ITS: introduce device mapping Andre Przywara
                   ` (23 subsequent siblings)
  29 siblings, 2 replies; 106+ messages in thread
From: Andre Przywara @ 2017-01-30 18:31 UTC (permalink / raw)
  To: Stefano Stabellini, Julien Grall; +Cc: xen-devel, Vijay Kilari

To be able to easily send commands to the ITS, create the respective
wrapper functions, which take care of the ring buffer.
The first two commands we implement provide methods to map a collection
to a redistributor (aka host core) and to flush the command queue (SYNC).
Start using these commands for mapping one collection to each host CPU.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
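To make the ring buffer handling easier to follow, here is a small
software-only model of it (not part of the patch, all sketch_* names
are invented). It shows the "full" check, which keeps one slot free so
that a full queue can be told apart from an empty one, and how the
third doubleword of a MAPC command is assembled. In the real code the
read pointer is advanced by the ITS hardware; the model never moves it.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_CMD_SIZE   32u
#define SKETCH_QUEUE_SZ   (64u * 1024u)

struct sketch_ring {
    uint8_t  buf[SKETCH_QUEUE_SZ];
    uint32_t readp;     /* consumer offset, owned by the (modelled) ITS */
    uint32_t writep;    /* producer offset, advanced by the driver */
};

/* Queue one command; returns 0 on success or -1 if the ring is full. */
static int sketch_ring_push(struct sketch_ring *r, const uint64_t cmd[4])
{
    uint32_t next;

    assert(r->writep % SKETCH_CMD_SIZE == 0);
    next = (r->writep + SKETCH_CMD_SIZE) % SKETCH_QUEUE_SZ;

    /* One slot stays unused, otherwise full and empty look the same. */
    if ( next == r->readp )
        return -1;

    memcpy(r->buf + r->writep, cmd, SKETCH_CMD_SIZE);
    r->writep = next;

    return 0;
}

/* Count how many commands an empty ring accepts before reporting full. */
static unsigned int sketch_ring_capacity(void)
{
    static struct sketch_ring r;    /* static: 64K is large for a stack */
    const uint64_t cmd[4] = { 0x05, 0, 0, 0 };  /* SYNC opcode */
    unsigned int n = 0;

    r.readp = r.writep = 0;
    while ( sketch_ring_push(&r, cmd) == 0 )
        n++;

    return n;
}

/* Third doubleword of MAPC: valid bit, target (bits 51:16), collection ID. */
static uint64_t sketch_mapc_dw2(uint16_t collection_id, uint64_t rdbase)
{
    const uint64_t target_mask = ((1ULL << 52) - 1) & ~0xffffULL;

    return (1ULL << 63) | (rdbase & target_mask) | collection_id;
}
```

A 64K queue has 2048 slots of 32 bytes, so the model accepts 2047
commands before sketch_ring_push() reports a full ring.
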
 xen/arch/arm/gic-v3-its.c         | 142 +++++++++++++++++++++++++++++++++++++-
 xen/arch/arm/gic-v3-lpi.c         |  20 ++++++
 xen/arch/arm/gic-v3.c             |  18 ++++-
 xen/include/asm-arm/gic_v3_defs.h |   2 +
 xen/include/asm-arm/gic_v3_its.h  |  36 ++++++++++
 5 files changed, 215 insertions(+), 3 deletions(-)

diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
index ad7cd2a..6578e8a 100644
--- a/xen/arch/arm/gic-v3-its.c
+++ b/xen/arch/arm/gic-v3-its.c
@@ -19,6 +19,7 @@
 #include <xen/config.h>
 #include <xen/lib.h>
 #include <xen/device_tree.h>
+#include <xen/delay.h>
 #include <xen/libfdt/libfdt.h>
 #include <xen/mm.h>
 #include <xen/sizes.h>
@@ -29,6 +30,98 @@
 
 #define ITS_CMD_QUEUE_SZ                SZ_64K
 
+#define BUFPTR_MASK                     GENMASK(19, 5)
+static int its_send_command(struct host_its *hw_its, const void *its_cmd)
+{
+    uint64_t readp, writep;
+
+    spin_lock(&hw_its->cmd_lock);
+
+    readp = readq_relaxed(hw_its->its_base + GITS_CREADR) & BUFPTR_MASK;
+    writep = readq_relaxed(hw_its->its_base + GITS_CWRITER) & BUFPTR_MASK;
+
+    if ( ((writep + ITS_CMD_SIZE) % ITS_CMD_QUEUE_SZ) == readp )
+    {
+        spin_unlock(&hw_its->cmd_lock);
+        return -EBUSY;
+    }
+
+    memcpy(hw_its->cmd_buf + writep, its_cmd, ITS_CMD_SIZE);
+    if ( hw_its->flags & HOST_ITS_FLUSH_CMD_QUEUE )
+        __flush_dcache_area(hw_its->cmd_buf + writep, ITS_CMD_SIZE);
+    else
+        dsb(ishst);
+
+    writep = (writep + ITS_CMD_SIZE) % ITS_CMD_QUEUE_SZ;
+    writeq_relaxed(writep & BUFPTR_MASK, hw_its->its_base + GITS_CWRITER);
+
+    spin_unlock(&hw_its->cmd_lock);
+
+    return 0;
+}
+
+static uint64_t encode_rdbase(struct host_its *hw_its, int cpu, uint64_t reg)
+{
+    reg &= ~GENMASK(51, 16);
+
+    reg |= gicv3_get_redist_address(cpu, hw_its->flags & HOST_ITS_USES_PTA);
+
+    return reg;
+}
+
+static int its_send_cmd_sync(struct host_its *its, int cpu)
+{
+    uint64_t cmd[4];
+
+    cmd[0] = GITS_CMD_SYNC;
+    cmd[1] = 0x00;
+    cmd[2] = encode_rdbase(its, cpu, 0x0);
+    cmd[3] = 0x00;
+
+    return its_send_command(its, cmd);
+}
+
+static int its_send_cmd_mapc(struct host_its *its, int collection_id, int cpu)
+{
+    uint64_t cmd[4];
+
+    cmd[0] = GITS_CMD_MAPC;
+    cmd[1] = 0x00;
+    cmd[2] = encode_rdbase(its, cpu, (collection_id & GENMASK(15, 0)));
+    cmd[2] |= GITS_VALID_BIT;
+    cmd[3] = 0x00;
+
+    return its_send_command(its, cmd);
+}
+
+/* Set up the (1:1) collection mapping for the given host CPU. */
+int gicv3_its_setup_collection(int cpu)
+{
+    struct host_its *its;
+    int ret;
+
+    list_for_each_entry(its, &host_its_list, entry)
+    {
+        /*
+         * This function is called on CPU0 before any ITSes have been
+         * properly initialized. Skip the collection setup in this case,
+         * it will be done explicitly for CPU0 upon initializing the ITS.
+         */
+        if ( !its->cmd_buf )
+            continue;
+
+        ret = its_send_cmd_mapc(its, cpu, cpu);
+        if ( ret )
+            return ret;
+
+        ret = its_send_cmd_sync(its, cpu);
+        if ( ret )
+            return ret;
+    }
+
+    return 0;
+}
+
 #define BASER_ATTR_MASK                                           \
         ((0x3UL << GITS_BASER_SHAREABILITY_SHIFT)               | \
          (0x7UL << GITS_BASER_OUTER_CACHEABILITY_SHIFT)         | \
@@ -156,18 +249,51 @@ static int its_map_baser(void __iomem *basereg, uint64_t regc, int nr_items)
     return -EINVAL;
 }
 
+/* Wait for an ITS to become quiescent (all ITS operations completed). */
+static int gicv3_its_wait_quiescient(struct host_its *hw_its)
+{
+    uint32_t reg;
+    s_time_t deadline = NOW() + MILLISECS(1000);
+
+    reg = readl_relaxed(hw_its->its_base + GITS_CTLR);
+    if ( (reg & (GITS_CTLR_QUIESCENT | GITS_CTLR_ENABLE)) == GITS_CTLR_QUIESCENT )
+        return 0;
+
+    writel_relaxed(reg & ~GITS_CTLR_ENABLE, hw_its->its_base + GITS_CTLR);
+
+    do {
+        reg = readl_relaxed(hw_its->its_base + GITS_CTLR);
+        if ( reg & GITS_CTLR_QUIESCENT )
+            return 0;
+
+        cpu_relax();
+        udelay(1);
+    } while ( NOW() <= deadline );
+
+    dprintk(XENLOG_ERR, "ITS not quiescent\n");
+    return -ETIMEDOUT;
+}
+
 static unsigned int max_its_device_bits = CONFIG_MAX_PHYS_ITS_DEVICE_BITS;
 integer_param("max_its_device_bits", max_its_device_bits);
 
 int gicv3_its_init(struct host_its *hw_its)
 {
     uint64_t reg;
-    int i;
+    int i, ret;
 
     hw_its->its_base = ioremap_nocache(hw_its->addr, hw_its->size);
     if ( !hw_its->its_base )
         return -ENOMEM;
 
+    ret = gicv3_its_wait_quiescient(hw_its);
+    if ( ret )
+        return ret;
+
+    reg = readq_relaxed(hw_its->its_base + GITS_TYPER);
+    if ( reg & GITS_TYPER_PTA )
+        hw_its->flags |= HOST_ITS_USES_PTA;
+
     for ( i = 0; i < GITS_BASER_NR_REGS; i++ )
     {
         void __iomem *basereg = hw_its->its_base + GITS_BASER0 + i * 8;
@@ -196,6 +322,20 @@ int gicv3_its_init(struct host_its *hw_its)
         return -ENOMEM;
     writeq_relaxed(0, hw_its->its_base + GITS_CWRITER);
 
+    /*
+     * We issue the collection mapping calls upon initialising the
+     * redistributors, which for CPU 0 happens before the ITS gets initialised
+     * here. So we skip this mapping for CPU 0 there (since the ITS is not
+     * ready), instead do it explicitly here for CPU 0.
+     */
+    ret = its_send_cmd_mapc(hw_its, smp_processor_id(), smp_processor_id());
+    if ( ret )
+        return ret;
+
+    ret = its_send_cmd_sync(hw_its, smp_processor_id());
+    if ( ret )
+        return ret;
+
     return 0;
 }
 
diff --git a/xen/arch/arm/gic-v3-lpi.c b/xen/arch/arm/gic-v3-lpi.c
index e2fc901..5911b91 100644
--- a/xen/arch/arm/gic-v3-lpi.c
+++ b/xen/arch/arm/gic-v3-lpi.c
@@ -30,11 +30,31 @@ static struct {
     unsigned int host_lpi_bits;
 } lpi_data;
 
+/* Physical redistributor address */
+static DEFINE_PER_CPU(paddr_t, redist_addr);
+/* Redistributor ID */
+static DEFINE_PER_CPU(int, redist_id);
 /* Pending table for each redistributor */
 static DEFINE_PER_CPU(void *, pending_table);
 
 #define MAX_PHYS_LPIS   (BIT_ULL(lpi_data.host_lpi_bits) - LPI_OFFSET)
 
+/* Stores this redistributor's physical address and ID in a per-CPU variable */
+void gicv3_set_redist_address(paddr_t address, int redist_id)
+{
+    this_cpu(redist_addr) = address;
+    this_cpu(redist_id) = redist_id;
+}
+
+/* Returns a redistributor's address or ID, encoded for an ITS command */
+uint64_t gicv3_get_redist_address(int cpu, bool use_pta)
+{
+    if ( use_pta )
+        return per_cpu(redist_addr, cpu) & GENMASK(51, 16);
+    else
+        return (uint64_t)per_cpu(redist_id, cpu) << 16;
+}
+
 uint64_t gicv3_lpi_allocate_pendtable(void)
 {
     uint64_t reg;
diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
index 440c079..5f825a6 100644
--- a/xen/arch/arm/gic-v3.c
+++ b/xen/arch/arm/gic-v3.c
@@ -644,7 +644,7 @@ static int gicv3_rdist_init_lpis(void __iomem * rdist_base)
         return -ENOMEM;
     writeq_relaxed(table_reg, rdist_base + GICR_PROPBASER);
 
-    return 0;
+    return gicv3_its_setup_collection(smp_processor_id());
 }
 
 static int __init gicv3_populate_rdist(void)
@@ -692,7 +692,21 @@ static int __init gicv3_populate_rdist(void)
 
                 if ( typer & GICR_TYPER_PLPIS )
                 {
-                    int ret;
+                    paddr_t rdist_addr;
+                    int procnum, ret;
+
+                    rdist_addr = gicv3.rdist_regions[i].base;
+                    rdist_addr += ptr - gicv3.rdist_regions[i].map_base;
+                    procnum = (typer & GICR_TYPER_PROC_NUM_MASK);
+                    procnum >>= GICR_TYPER_PROC_NUM_SHIFT;
+
+                    /*
+                     * The ITS refers to redistributors either by their physical
+                     * address or by their ID. Determine those two values and
+                     * let the ITS code store them in per host CPU variables to
+                     * later be able to address those redistributors.
+                     */
+                    gicv3_set_redist_address(rdist_addr, procnum);
 
                     ret = gicv3_rdist_init_lpis(ptr);
                     if ( ret && ret != -ENODEV )
diff --git a/xen/include/asm-arm/gic_v3_defs.h b/xen/include/asm-arm/gic_v3_defs.h
index b307322..878bae2 100644
--- a/xen/include/asm-arm/gic_v3_defs.h
+++ b/xen/include/asm-arm/gic_v3_defs.h
@@ -101,6 +101,8 @@
 #define GICR_TYPER_PLPIS             (1U << 0)
 #define GICR_TYPER_VLPIS             (1U << 1)
 #define GICR_TYPER_LAST              (1U << 4)
+#define GICR_TYPER_PROC_NUM_SHIFT    8
+#define GICR_TYPER_PROC_NUM_MASK     (0xffff << GICR_TYPER_PROC_NUM_SHIFT)
 
 /* For specifying the inner cacheability type only */
 #define GIC_BASER_CACHE_nCnB         0ULL
diff --git a/xen/include/asm-arm/gic_v3_its.h b/xen/include/asm-arm/gic_v3_its.h
index ff5572f..8288185 100644
--- a/xen/include/asm-arm/gic_v3_its.h
+++ b/xen/include/asm-arm/gic_v3_its.h
@@ -40,6 +40,9 @@
 #define GITS_CTLR_QUIESCENT             BIT(31)
 #define GITS_CTLR_ENABLE                BIT(0)
 
+#define GITS_TYPER_PTA                  BIT_ULL(19)
+#define GITS_TYPER_IDBITS_SHIFT         8
+
 #define GITS_IIDR_VALUE                 0x34c
 
 #define GITS_BASER_INDIRECT             BIT_ULL(62)
@@ -67,10 +70,27 @@
 
 #define GITS_CBASER_SIZE_MASK           0xff
 
+/* ITS command definitions */
+#define ITS_CMD_SIZE                    32
+
+#define GITS_CMD_MOVI                   0x01
+#define GITS_CMD_INT                    0x03
+#define GITS_CMD_CLEAR                  0x04
+#define GITS_CMD_SYNC                   0x05
+#define GITS_CMD_MAPD                   0x08
+#define GITS_CMD_MAPC                   0x09
+#define GITS_CMD_MAPTI                  0x0a
+#define GITS_CMD_MAPI                   0x0b
+#define GITS_CMD_INV                    0x0c
+#define GITS_CMD_INVALL                 0x0d
+#define GITS_CMD_MOVALL                 0x0e
+#define GITS_CMD_DISCARD                0x0f
+
 #ifndef __ASSEMBLY__
 #include <xen/device_tree.h>
 
 #define HOST_ITS_FLUSH_CMD_QUEUE        (1U << 0)
+#define HOST_ITS_USES_PTA               (1U << 1)
 
 /* data structure for each hardware ITS */
 struct host_its {
@@ -79,6 +99,7 @@ struct host_its {
     paddr_t addr;
     paddr_t size;
     void __iomem *its_base;
+    spinlock_t cmd_lock;
     void *cmd_buf;
     unsigned int flags;
 };
@@ -100,6 +121,13 @@ uint64_t gicv3_lpi_allocate_pendtable(void);
 int gicv3_lpi_init_host_lpis(unsigned int nr_lpis);
 int gicv3_its_init(struct host_its *hw_its);
 
+/* Store the physical address and ID for each redistributor as read from DT. */
+void gicv3_set_redist_address(paddr_t address, int redist_id);
+uint64_t gicv3_get_redist_address(int cpu, bool use_pta);
+
+/* Map a collection for this host CPU to each host ITS. */
+int gicv3_its_setup_collection(int cpu);
+
 #else
 
 static inline void gicv3_its_dt_init(const struct dt_device_node *node)
@@ -121,6 +149,14 @@ static inline int gicv3_its_init(struct host_its *hw_its)
 {
     return 0;
 }
+static inline void gicv3_set_redist_address(paddr_t address, int redist_id)
+{
+}
+static inline int gicv3_its_setup_collection(int cpu)
+{
+    return 0;
+}
+
 #endif /* CONFIG_HAS_ITS */
 
 #endif /* __ASSEMBLY__ */
-- 
2.9.0



* [PATCH 07/28] ARM: GICv3 ITS: introduce device mapping
  2017-01-30 18:31 [PATCH 00/28] arm64: Dom0 ITS emulation Andre Przywara
                   ` (5 preceding siblings ...)
  2017-01-30 18:31 ` [PATCH 06/28] ARM: GICv3 ITS: introduce ITS command handling Andre Przywara
@ 2017-01-30 18:31 ` Andre Przywara
  2017-02-07 14:05   ` Julien Grall
                     ` (3 more replies)
  2017-01-30 18:31 ` [PATCH 08/28] ARM: GICv3 ITS: introduce host LPI array Andre Przywara
                   ` (22 subsequent siblings)
  29 siblings, 4 replies; 106+ messages in thread
From: Andre Przywara @ 2017-01-30 18:31 UTC (permalink / raw)
  To: Stefano Stabellini, Julien Grall; +Cc: xen-devel, Vijay Kilari

The ITS uses device IDs to map LPIs to a device. Dom0 will later use
those IDs, which we directly pass on to the host.
For this we have to map each device that Dom0 may request to a host
ITS device with the same identifier.
Allocate the respective memory and enter each device into an rbtree to
later be able to iterate over it or to easily tear down guests.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
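As an illustration of the MAPD command that ties a device ID to its
interrupt translation table (ITT), here is a standalone sketch of the
encoding (not part of the patch). struct sketch_its_cmd and
sketch_encode_mapd() are invented names; the field layout follows
its_send_cmd_mapd() in this patch, and the size argument is passed
through unchanged, as it is there.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_CMD_MAPD   0x08ULL
#define SKETCH_VALID      (1ULL << 63)
/* ITT address field, bits [51:8]. */
#define SKETCH_ITT_MASK   (((1ULL << 52) - 1) & ~0xffULL)

/* One 32-byte ITS command, as four 64-bit doublewords. */
struct sketch_its_cmd {
    uint64_t dw0, dw1, dw2, dw3;
};

/*
 * Encode MAPD: map (valid == true) or unmap (valid == false) a device.
 * size is the 5-bit size field, itt_addr the physical ITT address.
 */
static struct sketch_its_cmd sketch_encode_mapd(uint32_t deviceid,
                                                unsigned int size,
                                                uint64_t itt_addr,
                                                bool valid)
{
    struct sketch_its_cmd cmd;

    assert(!(itt_addr & ~SKETCH_ITT_MASK));

    cmd.dw0 = SKETCH_CMD_MAPD | ((uint64_t)deviceid << 32);
    cmd.dw1 = size & 0x1f;
    cmd.dw2 = itt_addr & SKETCH_ITT_MASK;
    if ( valid )
        cmd.dw2 |= SKETCH_VALID;
    cmd.dw3 = 0;

    return cmd;
}
```

Unmapping a device, as remove_mapped_guest_device() does, amounts to
the same command with size and ITT address set to 0 and the valid bit
cleared.
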
 xen/arch/arm/gic-v3-its.c        | 188 ++++++++++++++++++++++++++++++++++++++-
 xen/arch/arm/vgic-v3.c           |   3 +
 xen/include/asm-arm/domain.h     |   3 +
 xen/include/asm-arm/gic_v3_its.h |  28 ++++++
 4 files changed, 221 insertions(+), 1 deletion(-)

diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
index 6578e8a..4a3a394 100644
--- a/xen/arch/arm/gic-v3-its.c
+++ b/xen/arch/arm/gic-v3-its.c
@@ -21,8 +21,10 @@
 #include <xen/device_tree.h>
 #include <xen/delay.h>
 #include <xen/libfdt/libfdt.h>
-#include <xen/mm.h>
+#include <xen/rbtree.h>
+#include <xen/sched.h>
 #include <xen/sizes.h>
+#include <xen/domain.h>
 #include <asm/gic.h>
 #include <asm/gic_v3_defs.h>
 #include <asm/gic_v3_its.h>
@@ -94,6 +96,21 @@ static int its_send_cmd_mapc(struct host_its *its, int collection_id, int cpu)
     return its_send_command(its, cmd);
 }
 
+static int its_send_cmd_mapd(struct host_its *its, uint32_t deviceid,
+                             int size, uint64_t itt_addr, bool valid)
+{
+    uint64_t cmd[4];
+
+    cmd[0] = GITS_CMD_MAPD | ((uint64_t)deviceid << 32);
+    cmd[1] = size & GENMASK(4, 0);
+    cmd[2] = itt_addr & GENMASK(51, 8);
+    if ( valid )
+        cmd[2] |= GITS_VALID_BIT;
+    cmd[3] = 0x00;
+
+    return its_send_command(its, cmd);
+}
+
 /* Set up the (1:1) collection mapping for the given host CPU. */
 int gicv3_its_setup_collection(int cpu)
 {
@@ -293,6 +310,7 @@ int gicv3_its_init(struct host_its *hw_its)
     reg = readq_relaxed(hw_its->its_base + GITS_TYPER);
     if ( reg & GITS_TYPER_PTA )
         hw_its->flags |= HOST_ITS_USES_PTA;
+    hw_its->itte_size = GITS_TYPER_ITT_SIZE(reg);
 
     for ( i = 0; i < GITS_BASER_NR_REGS; i++ )
     {
@@ -339,6 +357,173 @@ int gicv3_its_init(struct host_its *hw_its)
     return 0;
 }
 
+static void remove_mapped_guest_device(struct its_devices *dev)
+{
+    if ( dev->hw_its )
+        its_send_cmd_mapd(dev->hw_its, dev->host_devid, 0, 0, false);
+
+    xfree(dev->itt_addr);
+    xfree(dev);
+}
+
+int gicv3_its_map_guest_device(struct domain *d, int host_devid,
+                               int guest_devid, int bits, bool valid)
+{
+    void *itt_addr = NULL;
+    struct its_devices *dev, *temp;
+    struct rb_node **new = &d->arch.vgic.its_devices.rb_node, *parent = NULL;
+    struct host_its *hw_its;
+    int ret;
+
+    /* check for already existing mappings */
+    spin_lock(&d->arch.vgic.its_devices_lock);
+    while (*new)
+    {
+        temp = rb_entry(*new, struct its_devices, rbnode);
+
+        if ( temp->guest_devid == guest_devid )
+        {
+            if ( !valid )
+                rb_erase(&temp->rbnode, &d->arch.vgic.its_devices);
+
+            spin_unlock(&d->arch.vgic.its_devices_lock);
+
+            if ( valid )
+                return -EBUSY;
+
+            remove_mapped_guest_device(temp);
+
+            return 0;
+        }
+
+        /* Descend: smaller IDs to the left, matching the lookup path. */
+        parent = *new;
+        new = (guest_devid < temp->guest_devid) ? &parent->rb_left
+                                                : &parent->rb_right;
+    }
+
+    if ( !valid )
+    {
+        ret = -ENOENT;
+        goto out_unlock;
+    }
+
+    /*
+     * TODO: Work out the correct hardware ITS to use here.
+     * Maybe a per-platform function: devid -> ITS?
+     * Or parsing the DT to find the msi_parent?
+     * Or get Dom0 to give us this information?
+     * For now just use the first ITS.
+     */
+    hw_its = list_first_entry(&host_its_list, struct host_its, entry);
+
+    ret = -ENOMEM;
+
+    itt_addr = _xmalloc(BIT(bits) * hw_its->itte_size, 256);
+    if ( !itt_addr )
+        goto out_unlock;
+
+    dev = xmalloc(struct its_devices);
+    if ( !dev )
+    {
+        xfree(itt_addr);
+        goto out_unlock;
+    }
+
+    ret = its_send_cmd_mapd(hw_its, host_devid, bits - 1,
+                            virt_to_maddr(itt_addr), true);
+    if ( ret )
+    {
+        xfree(itt_addr);
+        xfree(dev);
+        goto out_unlock;
+    }
+
+    dev->itt_addr = itt_addr;
+    dev->hw_its = hw_its;
+    dev->guest_devid = guest_devid;
+    dev->host_devid = host_devid;
+    dev->eventids = BIT(bits);
+
+    rb_link_node(&dev->rbnode, parent, new);
+    rb_insert_color(&dev->rbnode, &d->arch.vgic.its_devices);
+
+    spin_unlock(&d->arch.vgic.its_devices_lock);
+
+    return 0;
+
+out_unlock:
+    spin_unlock(&d->arch.vgic.its_devices_lock);
+    return ret;
+}
+
+/* Removing any connections a domain had to any ITS in the system. */
+void gicv3_its_unmap_all_devices(struct domain *d)
+{
+    struct rb_node *victim;
+    struct its_devices *dev;
+
+    /*
+     * This is an easily readable, yet inefficient implementation.
+     * It uses the provided iteration wrapper and erases each node, which
+     * possibly triggers rebalancing.
+     * This seems overkill since we are going to abolish the whole tree, but
+     * avoids an open-coded re-implementation of the traversal functions with
+     * some recursive function calls.
+     * Performance does not matter here, since we are destroying a domain.
+     */
+restart:
+    spin_lock(&d->arch.vgic.its_devices_lock);
+    if ( (victim = rb_first(&d->arch.vgic.its_devices)) )
+    {
+        dev = rb_entry(victim, struct its_devices, rbnode);
+        rb_erase(victim, &d->arch.vgic.its_devices);
+
+        spin_unlock(&d->arch.vgic.its_devices_lock);
+
+        remove_mapped_guest_device(dev);
+
+        goto restart;
+    }
+
+    spin_unlock(&d->arch.vgic.its_devices_lock);
+}
+
+int gicv3_its_unmap_device(struct domain *d, int guest_devid)
+{
+    struct rb_node *node;
+
+    spin_lock(&d->arch.vgic.its_devices_lock);
+    node = d->arch.vgic.its_devices.rb_node;
+    while (node)
+    {
+        struct its_devices *dev = rb_entry(node, struct its_devices, rbnode);
+
+        if ( dev->guest_devid > guest_devid )
+        {
+            node = node->rb_left;
+            continue;
+        }
+        if ( dev->guest_devid < guest_devid )
+        {
+            node = node->rb_right;
+            continue;
+        }
+
+        rb_erase(node, &d->arch.vgic.its_devices);
+
+        spin_unlock(&d->arch.vgic.its_devices_lock);
+
+        remove_mapped_guest_device(dev);
+
+        return 0;
+
+    }
+    spin_unlock(&d->arch.vgic.its_devices_lock);
+
+    return -ENOENT;
+}
+
 /* Scan the DT for any ITS nodes and create a list of host ITSes out of it. */
 void gicv3_its_dt_init(const struct dt_device_node *node)
 {
@@ -369,6 +554,7 @@ void gicv3_its_dt_init(const struct dt_device_node *node)
         its_data->addr = addr;
         its_data->size = size;
         its_data->dt_node = its;
+        spin_lock_init(&its_data->cmd_lock);
 
         printk("GICv3: Found ITS @0x%lx\n", addr);
 
diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c
index d61479d..1fadb00 100644
--- a/xen/arch/arm/vgic-v3.c
+++ b/xen/arch/arm/vgic-v3.c
@@ -1450,6 +1450,9 @@ static int vgic_v3_domain_init(struct domain *d)
     d->arch.vgic.nr_regions = rdist_count;
     d->arch.vgic.rdist_regions = rdist_regions;
 
+    spin_lock_init(&d->arch.vgic.its_devices_lock);
+    d->arch.vgic.its_devices = RB_ROOT;
+
     /*
      * Domain 0 gets the hardware address.
      * Guests get the virtual platform layout.
diff --git a/xen/include/asm-arm/domain.h b/xen/include/asm-arm/domain.h
index 2d6fbb1..00b9c1a 100644
--- a/xen/include/asm-arm/domain.h
+++ b/xen/include/asm-arm/domain.h
@@ -11,6 +11,7 @@
 #include <asm/gic.h>
 #include <public/hvm/params.h>
 #include <xen/serial.h>
+#include <xen/rbtree.h>
 
 struct hvm_domain
 {
@@ -109,6 +110,8 @@ struct arch_domain
         } *rdist_regions;
         int nr_regions;                     /* Number of rdist regions */
         uint32_t rdist_stride;              /* Re-Distributor stride */
+        struct rb_root its_devices;         /* devices mapped to an ITS */
+        spinlock_t its_devices_lock;        /* protects the its_devices tree */
 #endif
     } vgic;
 
diff --git a/xen/include/asm-arm/gic_v3_its.h b/xen/include/asm-arm/gic_v3_its.h
index 8288185..9c5dcf3 100644
--- a/xen/include/asm-arm/gic_v3_its.h
+++ b/xen/include/asm-arm/gic_v3_its.h
@@ -42,6 +42,10 @@
 
 #define GITS_TYPER_PTA                  BIT_ULL(19)
 #define GITS_TYPER_IDBITS_SHIFT         8
+#define GITS_TYPER_ITT_SIZE_SHIFT       4
+#define GITS_TYPER_ITT_SIZE_MASK        (0xfUL << GITS_TYPER_ITT_SIZE_SHIFT)
+#define GITS_TYPER_ITT_SIZE(r)          (((r) & GITS_TYPER_ITT_SIZE_MASK) >> \
+                                                GITS_TYPER_ITT_SIZE_SHIFT)
 
 #define GITS_IIDR_VALUE                 0x34c
 
@@ -88,6 +92,7 @@
 
 #ifndef __ASSEMBLY__
 #include <xen/device_tree.h>
+#include <xen/rbtree.h>
 
 #define HOST_ITS_FLUSH_CMD_QUEUE        (1U << 0)
 #define HOST_ITS_USES_PTA               (1U << 1)
@@ -101,9 +106,19 @@ struct host_its {
     void __iomem *its_base;
     spinlock_t cmd_lock;
     void *cmd_buf;
+    int itte_size;
     unsigned int flags;
 };
 
+struct its_devices {
+    struct rb_node rbnode;
+    struct host_its *hw_its;
+    void *itt_addr;
+    uint32_t guest_devid;
+    uint32_t host_devid;
+    uint32_t eventids;
+};
+
 extern struct list_head host_its_list;
 
 #ifdef CONFIG_HAS_ITS
@@ -128,6 +143,13 @@ uint64_t gicv3_get_redist_address(int cpu, bool use_pta);
 /* Map a collection for this host CPU to each host ITS. */
 int gicv3_its_setup_collection(int cpu);
 
+/* Map a device on the host by allocating an ITT on the host (ITS).
+ * "bits" specifies how many events (interrupts) this device will need.
+ * Setting "valid" to false deallocates the device.
+ */
+int gicv3_its_map_guest_device(struct domain *d, int host_devid,
+                               int guest_devid, int bits, bool valid);
+
 #else
 
 static inline void gicv3_its_dt_init(const struct dt_device_node *node)
@@ -156,6 +178,12 @@ static inline int gicv3_its_setup_collection(int cpu)
 {
     return 0;
 }
+static inline int gicv3_its_map_guest_device(struct domain *d, int host_devid,
+                                             int guest_devid, int bits,
+                                             bool valid)
+{
+    return -ENODEV;
+}
 
 #endif /* CONFIG_HAS_ITS */
 
-- 
2.9.0



* [PATCH 08/28] ARM: GICv3 ITS: introduce host LPI array
  2017-01-30 18:31 [PATCH 00/28] arm64: Dom0 ITS emulation Andre Przywara
                   ` (6 preceding siblings ...)
  2017-01-30 18:31 ` [PATCH 07/28] ARM: GICv3 ITS: introduce device mapping Andre Przywara
@ 2017-01-30 18:31 ` Andre Przywara
  2017-02-07 18:01   ` Julien Grall
  2017-02-14 20:05   ` Stefano Stabellini
  2017-01-30 18:31 ` [PATCH 09/28] ARM: GICv3 ITS: map device and LPIs to the ITS on physdev_op hypercall Andre Przywara
                   ` (21 subsequent siblings)
  29 siblings, 2 replies; 106+ messages in thread
From: Andre Przywara @ 2017-01-30 18:31 UTC (permalink / raw)
  To: Stefano Stabellini, Julien Grall; +Cc: xen-devel, Vijay Kilari

The number of LPIs on a host can potentially be huge (millions),
although in practice it will mostly be reasonable. So preallocating
a struct irq_desc for every possible LPI is not an option.
However Xen itself does not care about LPIs, as every LPI will be injected
into a guest (Dom0 for now).
Create a dense data structure (8 bytes) for each LPI which holds just
enough information to determine the virtual IRQ number and the VCPU into
which the LPI needs to be injected.
To avoid artificially limiting the number of LPIs, we create a 2-level
table for holding those structures.
This patch introduces functions to initialize these tables and to
create, look up and destroy entries for a given LPI.
Allocation and deallocation are serialized by a lock, but looking up the
information for an LPI does not require one, since each entry is read and
written with a single atomic 64-bit access.
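
To make the layout concrete, here is a small sketch of the 8-byte entry
and of the index arithmetic for the 2-level table. It mirrors the idea of
union host_lpi, but the demo_* names are hypothetical and the 4 KiB page
size is an assumption of the example, not something the patch mandates.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_PAGE_SIZE   4096u   /* assumption: 4 KiB pages */
#define DEMO_LPI_OFFSET  8192u   /* LPI interrupt IDs start at 8192 */

/* Three fields packed into one 64-bit word, as in "union host_lpi". */
union demo_host_lpi {
    uint64_t data;
    struct {
        uint32_t virt_lpi;
        uint16_t dom_id;
        uint16_t vcpu_id;
    };
};

#define DEMO_LPIS_PER_PAGE \
    ((uint32_t)(DEMO_PAGE_SIZE / sizeof(union demo_host_lpi)))

/* Build the 64-bit value that would be stored with one atomic write. */
static uint64_t demo_pack_host_lpi(uint32_t virt_lpi, uint16_t dom_id,
                                   uint16_t vcpu_id)
{
    union demo_host_lpi h = { .data = 0 };

    h.virt_lpi = virt_lpi;
    h.dom_id = dom_id;
    h.vcpu_id = vcpu_id;

    return h.data;
}

/* Split a physical LPI number into first-level and second-level index. */
static void demo_lpi_to_index(uint32_t plpi, uint32_t *chunk, uint32_t *idx)
{
    assert(plpi >= DEMO_LPI_OFFSET);

    plpi -= DEMO_LPI_OFFSET;
    *chunk = plpi / DEMO_LPIS_PER_PAGE;
    *idx = plpi % DEMO_LPIS_PER_PAGE;
}
```

With these assumptions one page holds 512 entries, so the first-level
array only needs one pointer per 512 LPIs, and second-level pages can be
allocated on demand.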

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
 xen/arch/arm/gic-v3-its.c        |  80 ++++++++++++++++-
 xen/arch/arm/gic-v3-lpi.c        | 187 ++++++++++++++++++++++++++++++++++++++-
 xen/include/asm-arm/atomic.h     |   6 +-
 xen/include/asm-arm/gic.h        |   5 ++
 xen/include/asm-arm/gic_v3_its.h |   9 ++
 5 files changed, 282 insertions(+), 5 deletions(-)

diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
index 4a3a394..f073ab5 100644
--- a/xen/arch/arm/gic-v3-its.c
+++ b/xen/arch/arm/gic-v3-its.c
@@ -83,6 +83,20 @@ static int its_send_cmd_sync(struct host_its *its, int cpu)
     return its_send_command(its, cmd);
 }
 
+static int its_send_cmd_mapti(struct host_its *its,
+                              uint32_t deviceid, uint32_t eventid,
+                              uint32_t pintid, uint16_t icid)
+{
+    uint64_t cmd[4];
+
+    cmd[0] = GITS_CMD_MAPTI | ((uint64_t)deviceid << 32);
+    cmd[1] = eventid | ((uint64_t)pintid << 32);
+    cmd[2] = icid;
+    cmd[3] = 0x00;
+
+    return its_send_command(its, cmd);
+}
+
 static int its_send_cmd_mapc(struct host_its *its, int collection_id, int cpu)
 {
     uint64_t cmd[4];
@@ -111,6 +125,19 @@ static int its_send_cmd_mapd(struct host_its *its, uint32_t deviceid,
     return its_send_command(its, cmd);
 }
 
+static int its_send_cmd_inv(struct host_its *its,
+                            uint32_t deviceid, uint32_t eventid)
+{
+    uint64_t cmd[4];
+
+    cmd[0] = GITS_CMD_INV | ((uint64_t)deviceid << 32);
+    cmd[1] = eventid;
+    cmd[2] = 0x00;
+    cmd[3] = 0x00;
+
+    return its_send_command(its, cmd);
+}
+
 /* Set up the (1:1) collection mapping for the given host CPU. */
 int gicv3_its_setup_collection(int cpu)
 {
@@ -359,13 +386,47 @@ int gicv3_its_init(struct host_its *hw_its)
 
 static void remove_mapped_guest_device(struct its_devices *dev)
 {
+    int i;
+
     if ( dev->hw_its )
         its_send_cmd_mapd(dev->hw_its, dev->host_devid, 0, 0, false);
 
+    for ( i = 0; i < dev->eventids / 32; i++ )
+        gicv3_free_host_lpi_block(dev->hw_its, dev->host_lpis[i]);
+
     xfree(dev->itt_addr);
+    xfree(dev->host_lpis);
     xfree(dev);
 }
 
+/*
+ * On the host ITS @its, map @nr_events consecutive LPIs.
+ * The mapping connects a device @devid and event @eventid pair to LPI @lpi,
+ * increasing both @eventid and @lpi to cover the number of requested LPIs.
+ */
+int gicv3_its_map_host_events(struct host_its *its,
+                              int devid, int eventid, int lpi,
+                              int nr_events)
+{
+    int i, ret;
+
+    for ( i = 0; i < nr_events; i++ )
+    {
+        ret = its_send_cmd_mapti(its, devid, eventid + i, lpi + i, 0);
+        if ( ret )
+            return ret;
+        ret = its_send_cmd_inv(its, devid, eventid + i);
+        if ( ret )
+            return ret;
+    }
+
+    ret = its_send_cmd_sync(its, 0);
+    if ( ret )
+        return ret;
+
+    return 0;
+}
+
 int gicv3_its_map_guest_device(struct domain *d, int host_devid,
                                int guest_devid, int bits, bool valid)
 {
@@ -373,7 +434,7 @@ int gicv3_its_map_guest_device(struct domain *d, int host_devid,
     struct its_devices *dev, *temp;
     struct rb_node **new = &d->arch.vgic.its_devices.rb_node, *parent = NULL;
     struct host_its *hw_its;
-    int ret;
+    int ret, i;
 
     /* check for already existing mappings */
     spin_lock(&d->arch.vgic.its_devices_lock);
@@ -430,10 +491,19 @@ int gicv3_its_map_guest_device(struct domain *d, int host_devid,
         goto out_unlock;
     }
 
+    dev->host_lpis = xzalloc_array(uint32_t, BIT(bits) / 32);
+    if ( !dev->host_lpis )
+    {
+        xfree(dev);
+        xfree(itt_addr);
+        goto out_unlock;    /* ret is still -ENOMEM here */
+    }
+
     ret = its_send_cmd_mapd(hw_its, host_devid, bits - 1,
                             virt_to_maddr(itt_addr), true);
     if ( ret )
     {
+        xfree(dev->host_lpis);
         xfree(itt_addr);
         xfree(dev);
         goto out_unlock;
@@ -450,6 +520,14 @@ int gicv3_its_map_guest_device(struct domain *d, int host_devid,
 
     spin_unlock(&d->arch.vgic.its_devices_lock);
 
+    /*
+     * Map all host LPIs within this device already. We can't afford to queue
+     * any host ITS commands later on during the guest's runtime.
+     */
+    for ( i = 0; i < BIT(bits) / 32; i++ )
+        dev->host_lpis[i] = gicv3_allocate_host_lpi_block(hw_its, d, host_devid,
+                                                          i * 32);
+
     return 0;
 
 out_unlock:
diff --git a/xen/arch/arm/gic-v3-lpi.c b/xen/arch/arm/gic-v3-lpi.c
index 5911b91..8f6e7f3 100644
--- a/xen/arch/arm/gic-v3-lpi.c
+++ b/xen/arch/arm/gic-v3-lpi.c
@@ -18,16 +18,34 @@
 
 #include <xen/config.h>
 #include <xen/lib.h>
-#include <xen/mm.h>
+#include <xen/sched.h>
+#include <xen/err.h>
+#include <xen/sched.h>
 #include <xen/sizes.h>
+#include <asm/atomic.h>
+#include <asm/domain.h>
+#include <asm/io.h>
 #include <asm/gic.h>
 #include <asm/gic_v3_defs.h>
 #include <asm/gic_v3_its.h>
 
+/* LPIs on the host always go to a guest, so no struct irq_desc for them. */
+union host_lpi {
+    uint64_t data;
+    struct {
+        uint32_t virt_lpi;
+        uint16_t dom_id;
+        uint16_t vcpu_id;
+    };
+};
+
 /* Global state */
 static struct {
     uint8_t *lpi_property;
+    union host_lpi **host_lpis;
     unsigned int host_lpi_bits;
+    /* Protects allocation and deallocation of host LPIs, but not the access */
+    spinlock_t host_lpis_lock;
 } lpi_data;
 
 /* Physical redistributor address */
@@ -38,6 +56,19 @@ static DEFINE_PER_CPU(int, redist_id);
 static DEFINE_PER_CPU(void *, pending_table);
 
 #define MAX_PHYS_LPIS   (BIT_ULL(lpi_data.host_lpi_bits) - LPI_OFFSET)
+#define HOST_LPIS_PER_PAGE      (PAGE_SIZE / sizeof(union host_lpi))
+
+static union host_lpi *gic_get_host_lpi(uint32_t plpi)
+{
+    if ( !is_lpi(plpi) || plpi >= MAX_PHYS_LPIS + LPI_OFFSET )
+        return NULL;
+
+    plpi -= LPI_OFFSET;
+    if ( !lpi_data.host_lpis[plpi / HOST_LPIS_PER_PAGE] )
+        return NULL;
+
+    return &lpi_data.host_lpis[plpi / HOST_LPIS_PER_PAGE][plpi % HOST_LPIS_PER_PAGE];
+}
 
 /* Stores this redistributor's physical address and ID in a per-CPU variable */
 void gicv3_set_redist_address(paddr_t address, int redist_id)
@@ -130,15 +161,169 @@ uint64_t gicv3_lpi_get_proptable(void)
 static unsigned int max_lpi_bits = CONFIG_MAX_PHYS_LPI_BITS;
 integer_param("max_lpi_bits", max_lpi_bits);
 
+/*
+ * Allocate the 2nd level array for host LPIs. This one holds pointers
+ * to the page with the actual "union host_lpi" entries. Our LPI limit
+ * avoids excessive memory usage.
+ */
 int gicv3_lpi_init_host_lpis(unsigned int hw_lpi_bits)
 {
+    int nr_lpi_ptrs;
+
+    BUILD_BUG_ON(sizeof(union host_lpi) > sizeof(unsigned long));
+
     lpi_data.host_lpi_bits = min(hw_lpi_bits, max_lpi_bits);
 
+    spin_lock_init(&lpi_data.host_lpis_lock);
+
+    nr_lpi_ptrs = MAX_PHYS_LPIS / (PAGE_SIZE / sizeof(union host_lpi));
+    lpi_data.host_lpis = xzalloc_array(union host_lpi *, nr_lpi_ptrs);
+    if ( !lpi_data.host_lpis )
+        return -ENOMEM;
+
     printk("GICv3: using at most %lld LPIs on the host.\n", MAX_PHYS_LPIS);
 
     return 0;
 }
 
+#define LPI_BLOCK       32
+
+/* Must be called with host_lpis_lock held. */
+static int find_unused_host_lpi(int start, uint32_t *index)
+{
+    int chunk;
+    uint32_t i = *index;
+
+    for ( chunk = start; chunk < MAX_PHYS_LPIS / HOST_LPIS_PER_PAGE; chunk++ )
+    {
+        /* If we hit an unallocated chunk, use entry 0 in that one. */
+        if ( !lpi_data.host_lpis[chunk] )
+        {
+            *index = 0;
+            return chunk;
+        }
+
+        /* Find an unallocated entry in this chunk. */
+        for ( ; i < HOST_LPIS_PER_PAGE; i += LPI_BLOCK )
+        {
+            if ( lpi_data.host_lpis[chunk][i].dom_id == INVALID_DOMID )
+            {
+                *index = i;
+                return chunk;
+            }
+        }
+        i = 0;
+    }
+
+    return -1;
+}
+
+/*
+ * Allocate a block of 32 LPIs on the given host ITS for device "devid",
+ * starting with "eventid". Put them into the respective ITT by issuing a
+ * MAPTI command for each of them.
+ */
+int gicv3_allocate_host_lpi_block(struct host_its *its, struct domain *d,
+                                  uint32_t host_devid, uint32_t eventid)
+{
+    static uint32_t next_lpi = 0;
+    uint32_t lpi, lpi_idx = next_lpi % HOST_LPIS_PER_PAGE;
+    int chunk;
+    int i;
+
+    spin_lock(&lpi_data.host_lpis_lock);
+    chunk = find_unused_host_lpi(next_lpi / HOST_LPIS_PER_PAGE, &lpi_idx);
+
+    if ( chunk == - 1 )          /* rescan for a hole from the beginning */
+    {
+        lpi_idx = 0;
+        chunk = find_unused_host_lpi(0, &lpi_idx);
+        if ( chunk == -1 )
+        {
+            spin_unlock(&lpi_data.host_lpis_lock);
+            return -ENOSPC;
+        }
+    }
+
+    /* If we hit an unallocated chunk, we initialize it and use entry 0. */
+    if ( !lpi_data.host_lpis[chunk] )
+    {
+        union host_lpi *new_chunk;
+
+        new_chunk = alloc_xenheap_pages(0, 0);
+        if ( !new_chunk )
+        {
+            spin_unlock(&lpi_data.host_lpis_lock);
+            return -ENOMEM;
+        }
+
+        for ( i = 0; i < HOST_LPIS_PER_PAGE; i += LPI_BLOCK )
+            new_chunk[i].dom_id = INVALID_DOMID;
+
+        lpi_data.host_lpis[chunk] = new_chunk;
+        lpi_idx = 0;
+    }
+
+    lpi = chunk * HOST_LPIS_PER_PAGE + lpi_idx;
+
+    for ( i = 0; i < LPI_BLOCK; i++ )
+    {
+        union host_lpi hlpi;
+
+        /*
+         * Mark this host LPI as belonging to the domain, but don't assign
+         * any virtual LPI or a VCPU yet.
+         */
+        hlpi.virt_lpi = INVALID_LPI;
+        hlpi.dom_id = d->domain_id;
+        hlpi.vcpu_id = INVALID_DOMID;
+        write_u64_atomic(&lpi_data.host_lpis[chunk][lpi_idx + i].data,
+                         hlpi.data);
+
+        /*
+         * Enable this host LPI, so we don't have to do this during the
+         * guest's runtime.
+         */
+        lpi_data.lpi_property[lpi + i] |= LPI_PROP_ENABLED;
+    }
+
+    /*
+     * We have allocated and initialized the host LPI entries, so it's safe
+     * to drop the lock now. Access to the structures can be done concurrently
+     * as it involves only an atomic uint64_t access.
+     */
+    spin_unlock(&lpi_data.host_lpis_lock);
+
+    __flush_dcache_area(&lpi_data.lpi_property[lpi], LPI_BLOCK);
+
+    gicv3_its_map_host_events(its, host_devid, eventid, lpi + LPI_OFFSET,
+                              LPI_BLOCK);
+
+    next_lpi = lpi + LPI_BLOCK;
+    return lpi + LPI_OFFSET;
+}
+
+int gicv3_free_host_lpi_block(struct host_its *its, uint32_t lpi)
+{
+    union host_lpi *hlpi, empty_lpi = { .dom_id = INVALID_DOMID };
+    int i;
+
+    hlpi = gic_get_host_lpi(lpi);
+    if ( !hlpi )
+        return -ENOENT;
+
+    spin_lock(&lpi_data.host_lpis_lock);
+
+    for ( i = 0; i < LPI_BLOCK; i++ )
+        write_u64_atomic(&hlpi[i].data, empty_lpi.data);
+
+    /* TODO: Call a function in gic-v3-its.c to send DISCARDs */
+
+    spin_unlock(&lpi_data.host_lpis_lock);
+
+    return 0;
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/include/asm-arm/atomic.h b/xen/include/asm-arm/atomic.h
index 22a5036..df9de6a 100644
--- a/xen/include/asm-arm/atomic.h
+++ b/xen/include/asm-arm/atomic.h
@@ -53,9 +53,9 @@ build_atomic_write(write_u16_atomic, "h", WORD, uint16_t, "r")
 build_atomic_write(write_u32_atomic, "",  WORD, uint32_t, "r")
 build_atomic_write(write_int_atomic, "",  WORD, int, "r")
 
-#if 0 /* defined (CONFIG_ARM_64) */
-build_atomic_read(read_u64_atomic, "x", uint64_t, "=r")
-build_atomic_write(write_u64_atomic, "x", uint64_t, "r")
+#if defined (CONFIG_ARM_64)
+build_atomic_read(read_u64_atomic, "", "", uint64_t, "=r")
+build_atomic_write(write_u64_atomic, "", "", uint64_t, "r")
 #endif
 
 build_add_sized(add_u8_sized, "b", BYTE, uint8_t, "ri")
diff --git a/xen/include/asm-arm/gic.h b/xen/include/asm-arm/gic.h
index 12bd155..7825575 100644
--- a/xen/include/asm-arm/gic.h
+++ b/xen/include/asm-arm/gic.h
@@ -220,7 +220,12 @@ enum gic_version {
     GIC_V3,
 };
 
+#define INVALID_LPI     0
 #define LPI_OFFSET      8192
+static inline bool is_lpi(unsigned int irq)
+{
+    return irq >= LPI_OFFSET;
+}
 
 extern enum gic_version gic_hw_version(void);
 
diff --git a/xen/include/asm-arm/gic_v3_its.h b/xen/include/asm-arm/gic_v3_its.h
index 9c5dcf3..0e6b06a 100644
--- a/xen/include/asm-arm/gic_v3_its.h
+++ b/xen/include/asm-arm/gic_v3_its.h
@@ -97,6 +97,8 @@
 #define HOST_ITS_FLUSH_CMD_QUEUE        (1U << 0)
 #define HOST_ITS_USES_PTA               (1U << 1)
 
+#define INVALID_DOMID ((uint16_t)~0)
+
 /* data structure for each hardware ITS */
 struct host_its {
     struct list_head entry;
@@ -117,6 +119,7 @@ struct its_devices {
     uint32_t guest_devid;
     uint32_t host_devid;
     uint32_t eventids;
+    uint32_t *host_lpis;
 };
 
 extern struct list_head host_its_list;
@@ -149,6 +152,12 @@ int gicv3_its_setup_collection(int cpu);
  */
 int gicv3_its_map_guest_device(struct domain *d, int host_devid,
                                int guest_devid, int bits, bool valid);
+int gicv3_its_map_host_events(struct host_its *its,
+                              int host_devid, int eventid,
+                              int lpi, int nrevents);
+int gicv3_allocate_host_lpi_block(struct host_its *its, struct domain *d,
+                                  uint32_t host_devid, uint32_t eventid);
+int gicv3_free_host_lpi_block(struct host_its *its, uint32_t lpi);
 
 #else
 
-- 
2.9.0



* [PATCH 09/28] ARM: GICv3 ITS: map device and LPIs to the ITS on physdev_op hypercall
  2017-01-30 18:31 [PATCH 00/28] arm64: Dom0 ITS emulation Andre Przywara
                   ` (7 preceding siblings ...)
  2017-01-30 18:31 ` [PATCH 08/28] ARM: GICv3 ITS: introduce host LPI array Andre Przywara
@ 2017-01-30 18:31 ` Andre Przywara
  2017-01-31 10:29   ` Jaggi, Manish
  2017-02-14 20:11   ` Stefano Stabellini
  2017-01-30 18:31 ` [PATCH 10/28] ARM: GICv3: introduce separate pending_irq structs for LPIs Andre Przywara
                   ` (20 subsequent siblings)
  29 siblings, 2 replies; 106+ messages in thread
From: Andre Przywara @ 2017-01-30 18:31 UTC (permalink / raw)
  To: Stefano Stabellini, Julien Grall; +Cc: xen-devel, Vijay Kilari

To get MSIs from devices forwarded to a CPU, we need to name the device
and its MSIs by mapping them to an ITS.
Since this involves queueing commands to the ITS command queue, we can't
really afford to do this during the guest's runtime, as this would open
up a denial-of-service attack vector.
So we require every device with MSI interrupts to be mapped explicitly by
Dom0. For Dom0 itself we can just use the existing PCI physdev_op
hypercalls, which the existing Linux kernel issues already.
So upon receipt of this hypercall we map the device to the hardware ITS
and prepare it to be later mapped by the virtual ITS by using the very
same device ID (for Dom0 only).
We also request a mapping for 32 LPIs, to cover the 32 MSIs that the
device may use.
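
The two numbers involved can be sketched as follows: the device ID is
built from the PCI bus and devfn, and 32 events correspond to 5 event ID
bits. The helper names are hypothetical and only illustrate the
arithmetic; they are not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Device ID as used by the handler: bus in bits [15:8], devfn in [7:0]. */
static uint32_t demo_pci_devid(uint8_t bus, uint8_t devfn)
{
    return ((uint32_t)bus << 8) | devfn;
}

/* Smallest number of event ID bits that covers nr_events events. */
static unsigned int demo_event_bits(uint32_t nr_events)
{
    unsigned int bits = 0;

    /* Keep the shift below in range for a 32-bit value. */
    assert(nr_events > 0 && nr_events <= (UINT32_C(1) << 31));

    while ( (UINT32_C(1) << bits) < nr_events )
        bits++;

    return bits;
}
```

For example, bus 1 with devfn 0x10 gives device ID 0x110, and 32 MSIs
need 5 bits, which is the value passed as "bits" in this patch.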

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
 xen/arch/arm/physdev.c | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

diff --git a/xen/arch/arm/physdev.c b/xen/arch/arm/physdev.c
index 27bbbda..6e02de4 100644
--- a/xen/arch/arm/physdev.c
+++ b/xen/arch/arm/physdev.c
@@ -9,11 +9,32 @@
 #include <xen/lib.h>
 #include <xen/errno.h>
 #include <xen/sched.h>
+#include <xen/guest_access.h>
+#include <asm/gic_v3_its.h>
 #include <asm/hypercall.h>
 
 
 int do_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
 {
+    struct physdev_manage_pci manage;
+    u32 devid;
+    int ret;
+
+    switch (cmd)
+    {
+        case PHYSDEVOP_manage_pci_add:
+        case PHYSDEVOP_manage_pci_remove:
+            if ( copy_from_guest(&manage, arg, 1) != 0 )
+                return -EFAULT;
+
+            devid = manage.bus << 8 | manage.devfn;
+            /* Allocate an ITS device table with space for 32 MSIs */
+            ret = gicv3_its_map_guest_device(hardware_domain, devid, devid, 5,
+                                             cmd == PHYSDEVOP_manage_pci_add);
+
+            return ret;
+    }
+
     gdprintk(XENLOG_DEBUG, "PHYSDEVOP cmd=%d: not implemented\n", cmd);
     return -ENOSYS;
 }
-- 
2.9.0



* [PATCH 10/28] ARM: GICv3: introduce separate pending_irq structs for LPIs
  2017-01-30 18:31 [PATCH 00/28] arm64: Dom0 ITS emulation Andre Przywara
                   ` (8 preceding siblings ...)
  2017-01-30 18:31 ` [PATCH 09/28] ARM: GICv3 ITS: map device and LPIs to the ITS on physdev_op hypercall Andre Przywara
@ 2017-01-30 18:31 ` Andre Przywara
  2017-02-14 20:39   ` Stefano Stabellini
  2017-02-15 17:03   ` Julien Grall
  2017-01-30 18:31 ` [PATCH 11/28] ARM: GICv3: forward pending LPIs to guests Andre Przywara
                   ` (19 subsequent siblings)
  29 siblings, 2 replies; 106+ messages in thread
From: Andre Przywara @ 2017-01-30 18:31 UTC (permalink / raw)
  To: Stefano Stabellini, Julien Grall; +Cc: xen-devel, Vijay Kilari

For the same reason that allocating a struct irq_desc for each
possible LPI is not an option, having a struct pending_irq for each LPI
is also not feasible. However we actually only need those when an
interrupt is on a VCPU (or is about to be injected).
Maintain a list of those structs that we can use for the lifecycle of
a guest LPI. We allocate new entries if necessary, but reuse
previously used entries whenever possible.
I added some locking around this list here, however my gut feeling is
that we don't need it because this is a per-VCPU structure anyway.
If someone could confirm this, I'd be grateful.
Teach the existing VGIC functions to find the right pointer when being
given a virtual LPI number.
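
The reuse scheme can be sketched as below. This is a simplified,
single-threaded illustration with hypothetical demo_* names and a plain
singly linked list; the patch itself uses Xen's list helpers, takes the
per-VCPU lock and embeds a full struct pending_irq. Unlike the patch, the
sketch also checks the allocation result.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct lpi_pending_irq; irq == 0 means free. */
struct demo_lpi_pending {
    struct demo_lpi_pending *next;
    unsigned int irq;
};

/*
 * Return the entry tracking "lpi". If there is none and "allocate" is
 * set, reuse a free entry or allocate a new one. Returns NULL if no
 * entry exists and allocation was not requested or failed.
 */
static struct demo_lpi_pending *demo_lpi_to_pending(
    struct demo_lpi_pending **head, unsigned int lpi, int allocate)
{
    struct demo_lpi_pending *p, *empty = NULL;

    assert(head != NULL && lpi != 0);

    for ( p = *head; p; p = p->next )
    {
        if ( p->irq == lpi )
            return p;
        if ( p->irq == 0 && !empty )
            empty = p;          /* remember the first free entry */
    }

    if ( !allocate )
        return NULL;

    if ( !empty )
    {
        empty = calloc(1, sizeof(*empty));
        if ( !empty )
            return NULL;
        empty->next = *head;
        *head = empty;
    }

    empty->irq = lpi;

    return empty;
}
```

Once an LPI has been handled, setting its irq field back to 0 makes the
entry available for the next LPI, so the list only grows to the maximum
number of LPIs that were in flight at the same time.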

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
 xen/arch/arm/gic.c           |  3 +++
 xen/arch/arm/vgic-v3.c       |  3 +++
 xen/arch/arm/vgic.c          | 64 +++++++++++++++++++++++++++++++++++++++++---
 xen/include/asm-arm/domain.h |  2 ++
 xen/include/asm-arm/vgic.h   | 14 ++++++++++
 5 files changed, 83 insertions(+), 3 deletions(-)

diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
index a5348f2..bd3c032 100644
--- a/xen/arch/arm/gic.c
+++ b/xen/arch/arm/gic.c
@@ -509,6 +509,9 @@ static void gic_update_one_lr(struct vcpu *v, int i)
                 struct vcpu *v_target = vgic_get_target_vcpu(v, irq);
                 irq_set_affinity(p->desc, cpumask_of(v_target->processor));
             }
+            /* If this was an LPI, mark this struct as available again. */
+            if ( is_lpi(p->irq) )
+                p->irq = 0;
         }
     }
 }
diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c
index 1fadb00..b0653c2 100644
--- a/xen/arch/arm/vgic-v3.c
+++ b/xen/arch/arm/vgic-v3.c
@@ -1426,6 +1426,9 @@ static int vgic_v3_vcpu_init(struct vcpu *v)
     if ( v->vcpu_id == last_cpu || (v->vcpu_id == (d->max_vcpus - 1)) )
         v->arch.vgic.flags |= VGIC_V3_RDIST_LAST;
 
+    spin_lock_init(&v->arch.vgic.pending_lpi_list_lock);
+    INIT_LIST_HEAD(&v->arch.vgic.pending_lpi_list);
+
     return 0;
 }
 
diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c
index 364d5f0..7e3440f 100644
--- a/xen/arch/arm/vgic.c
+++ b/xen/arch/arm/vgic.c
@@ -31,6 +31,8 @@
 #include <asm/mmio.h>
 #include <asm/gic.h>
 #include <asm/vgic.h>
+#include <asm/gic_v3_defs.h>
+#include <asm/gic_v3_its.h>
 
 static inline struct vgic_irq_rank *vgic_get_rank(struct vcpu *v, int rank)
 {
@@ -61,7 +63,7 @@ struct vgic_irq_rank *vgic_rank_irq(struct vcpu *v, unsigned int irq)
     return vgic_get_rank(v, rank);
 }
 
-static void vgic_init_pending_irq(struct pending_irq *p, unsigned int virq)
+void vgic_init_pending_irq(struct pending_irq *p, unsigned int virq)
 {
     INIT_LIST_HEAD(&p->inflight);
     INIT_LIST_HEAD(&p->lr_queue);
@@ -244,10 +246,14 @@ struct vcpu *vgic_get_target_vcpu(struct vcpu *v, unsigned int virq)
 
 static int vgic_get_virq_priority(struct vcpu *v, unsigned int virq)
 {
-    struct vgic_irq_rank *rank = vgic_rank_irq(v, virq);
+    struct vgic_irq_rank *rank;
     unsigned long flags;
     int priority;
 
+    if ( is_lpi(virq) )
+        return vgic_lpi_get_priority(v->domain, virq);
+
+    rank = vgic_rank_irq(v, virq);
     vgic_lock_rank(v, rank, flags);
     priority = rank->priority[virq & INTERRUPT_RANK_MASK];
     vgic_unlock_rank(v, rank, flags);
@@ -446,13 +452,63 @@ bool vgic_to_sgi(struct vcpu *v, register_t sgir, enum gic_sgi_mode irqmode,
     return true;
 }
 
+/*
+ * Holding struct pending_irq's for each possible virtual LPI in each domain
+ * requires too much Xen memory; also a malicious guest could potentially
+ * spam Xen with LPI map requests. We cannot cover those with (guest allocated)
+ * ITS memory, so we use a dynamic scheme of allocating struct pending_irq's
+ * on demand.
+ */
+struct pending_irq *lpi_to_pending(struct vcpu *v, unsigned int lpi,
+                                   bool allocate)
+{
+    struct lpi_pending_irq *lpi_irq, *empty = NULL;
+
+    spin_lock(&v->arch.vgic.pending_lpi_list_lock);
+    list_for_each_entry(lpi_irq, &v->arch.vgic.pending_lpi_list, entry)
+    {
+        if ( lpi_irq->pirq.irq == lpi )
+        {
+            spin_unlock(&v->arch.vgic.pending_lpi_list_lock);
+            return &lpi_irq->pirq;
+        }
+
+        if ( lpi_irq->pirq.irq == 0 && !empty )
+            empty = lpi_irq;
+    }
+
+    if ( !allocate )
+    {
+        spin_unlock(&v->arch.vgic.pending_lpi_list_lock);
+        return NULL;
+    }
+
+    if ( !empty )
+    {
+        empty = xzalloc(struct lpi_pending_irq);
+        vgic_init_pending_irq(&empty->pirq, lpi);
+        list_add_tail(&empty->entry, &v->arch.vgic.pending_lpi_list);
+    } else
+    {
+        empty->pirq.status = 0;
+        empty->pirq.irq = lpi;
+    }
+
+    spin_unlock(&v->arch.vgic.pending_lpi_list_lock);
+
+    return &empty->pirq;
+}
+
 struct pending_irq *irq_to_pending(struct vcpu *v, unsigned int irq)
 {
     struct pending_irq *n;
+
     /* Pending irqs allocation strategy: the first vgic.nr_spis irqs
      * are used for SPIs; the rests are used for per cpu irqs */
     if ( irq < 32 )
         n = &v->arch.vgic.pending_irqs[irq];
+    else if ( is_lpi(irq) )
+        n = lpi_to_pending(v, irq, true);
     else
         n = &v->domain->arch.vgic.pending_irqs[irq - 32];
     return n;
@@ -480,7 +536,7 @@ void vgic_clear_pending_irqs(struct vcpu *v)
 void vgic_vcpu_inject_irq(struct vcpu *v, unsigned int virq)
 {
     uint8_t priority;
-    struct pending_irq *iter, *n = irq_to_pending(v, virq);
+    struct pending_irq *iter, *n;
     unsigned long flags;
     bool running;
 
@@ -488,6 +544,8 @@ void vgic_vcpu_inject_irq(struct vcpu *v, unsigned int virq)
 
     spin_lock_irqsave(&v->arch.vgic.lock, flags);
 
+    n = irq_to_pending(v, virq);
+
     /* vcpu offline */
     if ( test_bit(_VPF_down, &v->pause_flags) )
     {
diff --git a/xen/include/asm-arm/domain.h b/xen/include/asm-arm/domain.h
index 00b9c1a..f44a84b 100644
--- a/xen/include/asm-arm/domain.h
+++ b/xen/include/asm-arm/domain.h
@@ -257,6 +257,8 @@ struct arch_vcpu
         paddr_t rdist_base;
 #define VGIC_V3_RDIST_LAST  (1 << 0)        /* last vCPU of the rdist */
         uint8_t flags;
+        struct list_head pending_lpi_list;
+        spinlock_t pending_lpi_list_lock;   /* protects the pending_lpi_list */
     } vgic;
 
     /* Timer registers  */
diff --git a/xen/include/asm-arm/vgic.h b/xen/include/asm-arm/vgic.h
index 672f649..03d4d2e 100644
--- a/xen/include/asm-arm/vgic.h
+++ b/xen/include/asm-arm/vgic.h
@@ -83,6 +83,12 @@ struct pending_irq
     struct list_head lr_queue;
 };
 
+struct lpi_pending_irq
+{
+    struct list_head entry;
+    struct pending_irq pirq;
+};
+
 #define NR_INTERRUPT_PER_RANK   32
 #define INTERRUPT_RANK_MASK (NR_INTERRUPT_PER_RANK - 1)
 
@@ -296,13 +302,21 @@ extern struct vcpu *vgic_get_target_vcpu(struct vcpu *v, unsigned int virq);
 extern void vgic_vcpu_inject_irq(struct vcpu *v, unsigned int virq);
 extern void vgic_vcpu_inject_spi(struct domain *d, unsigned int virq);
 extern void vgic_clear_pending_irqs(struct vcpu *v);
+extern void vgic_init_pending_irq(struct pending_irq *p, unsigned int virq);
 extern struct pending_irq *irq_to_pending(struct vcpu *v, unsigned int irq);
 extern struct pending_irq *spi_to_pending(struct domain *d, unsigned int irq);
+extern struct pending_irq *lpi_to_pending(struct vcpu *v, unsigned int irq,
+                                          bool allocate);
 extern struct vgic_irq_rank *vgic_rank_offset(struct vcpu *v, int b, int n, int s);
 extern struct vgic_irq_rank *vgic_rank_irq(struct vcpu *v, unsigned int irq);
 extern bool vgic_emulate(struct cpu_user_regs *regs, union hsr hsr);
 extern void vgic_disable_irqs(struct vcpu *v, uint32_t r, int n);
 extern void vgic_enable_irqs(struct vcpu *v, uint32_t r, int n);
+/* placeholder function until the property table gets introduced */
+static inline int vgic_lpi_get_priority(struct domain *d, uint32_t vlpi)
+{
+    return 0xa;
+}
 extern void register_vgic_ops(struct domain *d, const struct vgic_ops *ops);
 int vgic_v2_init(struct domain *d, int *mmio_count);
 int vgic_v3_init(struct domain *d, int *mmio_count);
-- 
2.9.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


* [PATCH 11/28] ARM: GICv3: forward pending LPIs to guests
  2017-01-30 18:31 [PATCH 00/28] arm64: Dom0 ITS emulation Andre Przywara
                   ` (9 preceding siblings ...)
  2017-01-30 18:31 ` [PATCH 10/28] ARM: GICv3: introduce separate pending_irq structs for LPIs Andre Przywara
@ 2017-01-30 18:31 ` Andre Przywara
  2017-02-14 21:00   ` Stefano Stabellini
  2017-02-15 17:30   ` Julien Grall
  2017-01-30 18:31 ` [PATCH 12/28] ARM: GICv3: enable ITS and LPIs on the host Andre Przywara
                   ` (18 subsequent siblings)
  29 siblings, 2 replies; 106+ messages in thread
From: Andre Przywara @ 2017-01-30 18:31 UTC (permalink / raw)
  To: Stefano Stabellini, Julien Grall; +Cc: xen-devel, Vijay Kilari

Upon receiving an LPI, we need to find the right VCPU and virtual IRQ
number to get this IRQ injected.
Look this information up in our two-level LPI table when the host takes
an LPI, then call the existing injection function to let the GIC
emulation deal with this interrupt.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
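A rough standalone sketch of the lookup described above, for illustration
only. The field widths in union host_lpi, the table sizes and the helper
names map_host_lpi()/lookup_host_lpi() are assumptions made for this
example and are not taken from this patch; the second level is allocated
on demand with calloc() instead of Xen's allocators.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define LPI_OFFSET        8192u   /* LPI INTIDs start at 8192 on GICv3 */
#define ENTRIES_PER_PAGE  512u    /* assumed: 4 KiB page / 8-byte entry */
#define NR_FIRST_LEVEL    16u     /* hypothetical first-level size */

/* Assumed layout: all routing data for one host LPI fits in 64 bits. */
union host_lpi {
    uint64_t data;
    struct {
        uint32_t virt_lpi;        /* 0 means "not mapped to a guest" */
        uint16_t dom_id;
        uint16_t vcpu_id;
    } f;
};

/* First level: pointers to lazily allocated second-level pages. */
static union host_lpi *first_level[NR_FIRST_LEVEL];

/* Return the entry for a host LPI, or NULL if it is not covered. */
static union host_lpi *lookup_host_lpi(uint32_t lpi)
{
    uint32_t idx;

    if ( lpi < LPI_OFFSET )
        return NULL;

    idx = lpi - LPI_OFFSET;
    if ( idx / ENTRIES_PER_PAGE >= NR_FIRST_LEVEL ||
         !first_level[idx / ENTRIES_PER_PAGE] )
        return NULL;

    return &first_level[idx / ENTRIES_PER_PAGE][idx % ENTRIES_PER_PAGE];
}

/* Record which domain/vCPU/virtual LPI a host LPI is routed to. */
static int map_host_lpi(uint32_t lpi, uint16_t dom_id, uint16_t vcpu_id,
                        uint32_t virt_lpi)
{
    union host_lpi entry;
    uint32_t idx, l1;

    assert(sizeof(union host_lpi) == sizeof(uint64_t));

    if ( lpi < LPI_OFFSET )
        return -1;

    idx = lpi - LPI_OFFSET;
    l1 = idx / ENTRIES_PER_PAGE;
    if ( l1 >= NR_FIRST_LEVEL )
        return -1;

    if ( !first_level[l1] )
    {
        first_level[l1] = calloc(ENTRIES_PER_PAGE, sizeof(union host_lpi));
        if ( !first_level[l1] )
            return -1;
    }

    entry.data = 0;
    entry.f.virt_lpi = virt_lpi;
    entry.f.dom_id = dom_id;
    entry.f.vcpu_id = vcpu_id;

    /* One 64-bit store publishes the whole entry at once. */
    first_level[l1][idx % ENTRIES_PER_PAGE].data = entry.data;

    return 0;
}
```

Packing everything into one 64-bit word is what lets the interrupt path
read a consistent entry with a single load, as do_LPI() does with
read_u64_atomic() below.
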
 xen/arch/arm/gic-v3-lpi.c | 41 +++++++++++++++++++++++++++++++++++++++++
 xen/arch/arm/gic.c        |  6 ++++--
 xen/include/asm-arm/irq.h |  8 ++++++++
 3 files changed, 53 insertions(+), 2 deletions(-)

diff --git a/xen/arch/arm/gic-v3-lpi.c b/xen/arch/arm/gic-v3-lpi.c
index 8f6e7f3..d270053 100644
--- a/xen/arch/arm/gic-v3-lpi.c
+++ b/xen/arch/arm/gic-v3-lpi.c
@@ -86,6 +86,47 @@ uint64_t gicv3_get_redist_address(int cpu, bool use_pta)
         return per_cpu(redist_id, cpu) << 16;
 }
 
+/*
+ * Handle incoming LPIs, which are a bit special, because they are potentially
+ * numerous and also only get injected into guests. Treat them specially here,
+ * by just looking up their target vCPU and virtual LPI number and handing them
+ * over to the injection function.
+ */
+void do_LPI(unsigned int lpi)
+{
+    struct domain *d;
+    union host_lpi *hlpip, hlpi;
+    struct vcpu *vcpu;
+
+    WRITE_SYSREG32(lpi, ICC_EOIR1_EL1);
+
+    hlpip = gic_get_host_lpi(lpi);
+    if ( !hlpip )
+        return;
+
+    hlpi.data = read_u64_atomic(&hlpip->data);
+
+    /* We may have mapped more host LPIs than the guest actually asked for. */
+    if ( !hlpi.virt_lpi )
+        return;
+
+    d = get_domain_by_id(hlpi.dom_id);
+    if ( !d )
+        return;
+
+    if ( hlpi.vcpu_id >= d->max_vcpus )
+    {
+        put_domain(d);
+        return;
+    }
+
+    vcpu = d->vcpu[hlpi.vcpu_id];
+
+    put_domain(d);
+
+    vgic_vcpu_inject_irq(vcpu, hlpi.virt_lpi);
+}
+
 uint64_t gicv3_lpi_allocate_pendtable(void)
 {
     uint64_t reg;
diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
index bd3c032..7286e5d 100644
--- a/xen/arch/arm/gic.c
+++ b/xen/arch/arm/gic.c
@@ -700,8 +700,10 @@ void gic_interrupt(struct cpu_user_regs *regs, int is_fiq)
             local_irq_enable();
             do_IRQ(regs, irq, is_fiq);
             local_irq_disable();
-        }
-        else if (unlikely(irq < 16))
+        } else if ( is_lpi(irq) )
+        {
+            do_LPI(irq);
+        } else if ( unlikely(irq < 16) )
         {
             do_sgi(regs, irq);
         }
diff --git a/xen/include/asm-arm/irq.h b/xen/include/asm-arm/irq.h
index 8f7a167..ee47de8 100644
--- a/xen/include/asm-arm/irq.h
+++ b/xen/include/asm-arm/irq.h
@@ -34,6 +34,14 @@ struct irq_desc *__irq_to_desc(int irq);
 
 void do_IRQ(struct cpu_user_regs *regs, unsigned int irq, int is_fiq);
 
+#ifdef CONFIG_HAS_ITS
+void do_LPI(unsigned int irq);
+#else
+static inline void do_LPI(unsigned int irq)
+{
+}
+#endif
+
 #define domain_pirq_to_irq(d, pirq) (pirq)
 
 bool_t is_assignable_irq(unsigned int irq);
-- 
2.9.0



* [PATCH 12/28] ARM: GICv3: enable ITS and LPIs on the host
  2017-01-30 18:31 [PATCH 00/28] arm64: Dom0 ITS emulation Andre Przywara
                   ` (10 preceding siblings ...)
  2017-01-30 18:31 ` [PATCH 11/28] ARM: GICv3: forward pending LPIs to guests Andre Przywara
@ 2017-01-30 18:31 ` Andre Przywara
  2017-02-14 22:41   ` Stefano Stabellini
  2017-02-15 17:35   ` Julien Grall
  2017-01-30 18:31 ` [PATCH 13/28] ARM: vGICv3: handle virtual LPI pending and property tables Andre Przywara
                   ` (17 subsequent siblings)
  29 siblings, 2 replies; 106+ messages in thread
From: Andre Przywara @ 2017-01-30 18:31 UTC (permalink / raw)
  To: Stefano Stabellini, Julien Grall; +Cc: xen-devel, Vijay Kilari

Now that the host part of the ITS code is in place, we can enable the
ITS and also LPIs on each redistributor to get the show rolling.
At this point there would be no LPIs mapped, as guests don't know about
the ITS yet.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
 xen/arch/arm/gic-v3-its.c | 34 ++++++++++++++++++++++++++++++++--
 xen/arch/arm/gic-v3.c     | 19 +++++++++++++++++++
 2 files changed, 51 insertions(+), 2 deletions(-)

diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
index f073ab5..2a7093f 100644
--- a/xen/arch/arm/gic-v3-its.c
+++ b/xen/arch/arm/gic-v3-its.c
@@ -62,6 +62,28 @@ static int its_send_command(struct host_its *hw_its, const void *its_cmd)
     return 0;
 }
 
+/* Wait for an ITS to finish processing all commands. */
+static int gicv3_its_wait_commands(struct host_its *hw_its)
+{
+    s_time_t deadline = NOW() + MILLISECS(1000);
+    uint64_t readp, writep;
+
+    do {
+        spin_lock(&hw_its->cmd_lock);
+        readp = readq_relaxed(hw_its->its_base + GITS_CREADR) & BUFPTR_MASK;
+        writep = readq_relaxed(hw_its->its_base + GITS_CWRITER) & BUFPTR_MASK;
+        spin_unlock(&hw_its->cmd_lock);
+
+        if ( readp == writep )
+            return 0;
+
+        cpu_relax();
+        udelay(1);
+    } while ( NOW() <= deadline );
+
+    return -ETIMEDOUT;
+}
+
 static uint64_t encode_rdbase(struct host_its *hw_its, int cpu, uint64_t reg)
 {
     reg &= ~GENMASK(51, 16);
@@ -161,6 +183,10 @@ int gicv3_its_setup_collection(int cpu)
         ret = its_send_cmd_sync(its, cpu);
         if ( ret )
             return ret;
+
+        ret = gicv3_its_wait_commands(its);
+        if ( ret )
+            return ret;
     }
 
     return 0;
@@ -367,6 +393,10 @@ int gicv3_its_init(struct host_its *hw_its)
         return -ENOMEM;
     writeq_relaxed(0, hw_its->its_base + GITS_CWRITER);
 
+    /* Now enable interrupt translation and command processing on that ITS. */
+    reg = readl_relaxed(hw_its->its_base + GITS_CTLR);
+    writel_relaxed(reg | GITS_CTLR_ENABLE, hw_its->its_base + GITS_CTLR);
+
     /*
      * We issue the collection mapping calls upon initialising the
      * redistributors, which for CPU 0 happens before the ITS gets initialised
@@ -381,7 +411,7 @@ int gicv3_its_init(struct host_its *hw_its)
     if ( ret )
         return ret;
 
-    return 0;
+    return gicv3_its_wait_commands(hw_its);
 }
 
 static void remove_mapped_guest_device(struct its_devices *dev)
@@ -424,7 +454,7 @@ int gicv3_its_map_host_events(struct host_its *its,
     if ( ret )
         return ret;
 
-    return 0;
+    return gicv3_its_wait_commands(its);
 }
 
 int gicv3_its_map_guest_device(struct domain *d, int host_devid,
diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
index 5f825a6..23cf33d 100644
--- a/xen/arch/arm/gic-v3.c
+++ b/xen/arch/arm/gic-v3.c
@@ -647,6 +647,21 @@ static int gicv3_rdist_init_lpis(void __iomem * rdist_base)
     return gicv3_its_setup_collection(smp_processor_id());
 }
 
+/* Enable LPIs on this redistributor (only useful when the host has an ITS). */
+static bool gicv3_enable_lpis(void)
+{
+    uint32_t val;
+
+    val = readl_relaxed(GICD_RDIST_BASE + GICR_TYPER);
+    if ( !(val & GICR_TYPER_PLPIS) )
+        return false;
+
+    val = readl_relaxed(GICD_RDIST_BASE + GICR_CTLR);
+    writel_relaxed(val | GICR_CTLR_ENABLE_LPIS, GICD_RDIST_BASE + GICR_CTLR);
+
+    return true;
+}
+
 static int __init gicv3_populate_rdist(void)
 {
     int i;
@@ -755,6 +770,10 @@ static int gicv3_cpu_init(void)
     if ( gicv3_enable_redist() )
         return -ENODEV;
 
+    /* If the host has any ITSes, enable LPIs now. */
+    if ( !list_empty(&host_its_list) )
+        gicv3_enable_lpis();
+
     /* Set priority on PPI and SGI interrupts */
     priority = (GIC_PRI_IPI << 24 | GIC_PRI_IPI << 16 | GIC_PRI_IPI << 8 |
                 GIC_PRI_IPI);
-- 
2.9.0



* [PATCH 13/28] ARM: vGICv3: handle virtual LPI pending and property tables
  2017-01-30 18:31 [PATCH 00/28] arm64: Dom0 ITS emulation Andre Przywara
                   ` (11 preceding siblings ...)
  2017-01-30 18:31 ` [PATCH 12/28] ARM: GICv3: enable ITS and LPIs on the host Andre Przywara
@ 2017-01-30 18:31 ` Andre Przywara
  2017-02-14 23:56   ` Stefano Stabellini
  2017-02-15 18:44   ` Julien Grall
  2017-01-30 18:31 ` [PATCH 14/28] ARM: vGICv3: Handle disabled LPIs Andre Przywara
                   ` (16 subsequent siblings)
  29 siblings, 2 replies; 106+ messages in thread
From: Andre Przywara @ 2017-01-30 18:31 UTC (permalink / raw)
  To: Stefano Stabellini, Julien Grall; +Cc: xen-devel, Vijay Kilari

Allow a guest to provide the address and size for the memory regions
it has reserved for the GICv3 pending and property tables.
We sanitise the various fields of the respective redistributor
registers and map those pages into Xen's address space to have easy
access.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
 xen/arch/arm/vgic-v3.c       | 220 +++++++++++++++++++++++++++++++++++++++----
 xen/arch/arm/vgic.c          |   4 +
 xen/include/asm-arm/domain.h |   8 +-
 xen/include/asm-arm/vgic.h   |  24 ++++-
 4 files changed, 233 insertions(+), 23 deletions(-)

diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c
index b0653c2..c6db2d7 100644
--- a/xen/arch/arm/vgic-v3.c
+++ b/xen/arch/arm/vgic-v3.c
@@ -20,12 +20,14 @@
 
 #include <xen/bitops.h>
 #include <xen/config.h>
+#include <xen/domain_page.h>
 #include <xen/lib.h>
 #include <xen/init.h>
 #include <xen/softirq.h>
 #include <xen/irq.h>
 #include <xen/sched.h>
 #include <xen/sizes.h>
+#include <xen/vmap.h>
 #include <asm/current.h>
 #include <asm/mmio.h>
 #include <asm/gic_v3_defs.h>
@@ -229,12 +231,15 @@ static int __vgic_v3_rdistr_rd_mmio_read(struct vcpu *v, mmio_info_t *info,
         goto read_reserved;
 
     case VREG64(GICR_PROPBASER):
-        /* LPI's not implemented */
-        goto read_as_zero_64;
+        if ( !vgic_reg64_check_access(dabt) ) goto bad_width;
+        *r = vgic_reg64_extract(v->domain->arch.vgic.rdist_propbase, info);
+        return 1;
 
     case VREG64(GICR_PENDBASER):
-        /* LPI's not implemented */
-        goto read_as_zero_64;
+        if ( !vgic_reg64_check_access(dabt) ) goto bad_width;
+        *r = vgic_reg64_extract(v->arch.vgic.rdist_pendbase, info);
+        *r &= ~GICR_PENDBASER_PTZ;       /* WO, reads as 0 */
+        return 1;
 
     case 0x0080:
         goto read_reserved;
@@ -302,11 +307,6 @@ bad_width:
     domain_crash_synchronous();
     return 0;
 
-read_as_zero_64:
-    if ( !vgic_reg64_check_access(dabt) ) goto bad_width;
-    *r = 0;
-    return 1;
-
 read_as_zero_32:
     if ( dabt.size != DABT_WORD ) goto bad_width;
     *r = 0;
@@ -331,11 +331,179 @@ read_unknown:
     return 1;
 }
 
+static uint64_t vgic_sanitise_field(uint64_t reg, uint64_t field_mask,
+                                    int field_shift,
+                                    uint64_t (*sanitise_fn)(uint64_t))
+{
+    uint64_t field = (reg & field_mask) >> field_shift;
+
+    field = sanitise_fn(field) << field_shift;
+
+    return (reg & ~field_mask) | field;
+}
+
+/* We want to avoid outer shareable. */
+static uint64_t vgic_sanitise_shareability(uint64_t field)
+{
+    switch (field) {
+    case GIC_BASER_OuterShareable:
+        return GIC_BASER_InnerShareable;
+    default:
+        return field;
+    }
+}
+
+/* Avoid any inner non-cacheable mapping. */
+static uint64_t vgic_sanitise_inner_cacheability(uint64_t field)
+{
+    switch (field) {
+    case GIC_BASER_CACHE_nCnB:
+    case GIC_BASER_CACHE_nC:
+        return GIC_BASER_CACHE_RaWb;
+    default:
+        return field;
+    }
+}
+
+/* Non-cacheable or same-as-inner are OK. */
+static uint64_t vgic_sanitise_outer_cacheability(uint64_t field)
+{
+    switch (field) {
+    case GIC_BASER_CACHE_SameAsInner:
+    case GIC_BASER_CACHE_nC:
+        return field;
+    default:
+        return GIC_BASER_CACHE_nC;
+    }
+}
+
+static uint64_t sanitize_propbaser(uint64_t reg)
+{
+    reg = vgic_sanitise_field(reg, GICR_PROPBASER_SHAREABILITY_MASK,
+                              GICR_PROPBASER_SHAREABILITY_SHIFT,
+                              vgic_sanitise_shareability);
+    reg = vgic_sanitise_field(reg, GICR_PROPBASER_INNER_CACHEABILITY_MASK,
+                              GICR_PROPBASER_INNER_CACHEABILITY_SHIFT,
+                              vgic_sanitise_inner_cacheability);
+    reg = vgic_sanitise_field(reg, GICR_PROPBASER_OUTER_CACHEABILITY_MASK,
+                              GICR_PROPBASER_OUTER_CACHEABILITY_SHIFT,
+                              vgic_sanitise_outer_cacheability);
+
+    reg &= ~GICR_PROPBASER_RES0_MASK;
+    return reg;
+}
+
+static uint64_t sanitize_pendbaser(uint64_t reg)
+{
+    reg = vgic_sanitise_field(reg, GICR_PENDBASER_SHAREABILITY_MASK,
+                              GICR_PENDBASER_SHAREABILITY_SHIFT,
+                              vgic_sanitise_shareability);
+    reg = vgic_sanitise_field(reg, GICR_PENDBASER_INNER_CACHEABILITY_MASK,
+                              GICR_PENDBASER_INNER_CACHEABILITY_SHIFT,
+                              vgic_sanitise_inner_cacheability);
+    reg = vgic_sanitise_field(reg, GICR_PENDBASER_OUTER_CACHEABILITY_MASK,
+                              GICR_PENDBASER_OUTER_CACHEABILITY_SHIFT,
+                              vgic_sanitise_outer_cacheability);
+
+    reg &= ~GICR_PENDBASER_RES0_MASK;
+    return reg;
+}
+
+/*
+ * Mark a given number of guest pages as used (by increasing their refcount),
+ * starting with the given guest address. This needs to be called once before
+ * calling (possibly repeatedly) map_guest_pages().
+ * Before the domain gets destroyed, call put_guest_pages() to drop the
+ * reference.
+ */
+int get_guest_pages(struct domain *d, paddr_t gpa, int nr_pages)
+{
+    int i;
+    struct page_info *page;
+
+    for ( i = 0; i < nr_pages; i++ )
+    {
+        page = get_page_from_gfn(d, (gpa >> PAGE_SHIFT) + i, NULL, P2M_ALLOC);
+        if ( !page )
+            return -EINVAL;
+    }
+
+    return 0;
+}
+
+void put_guest_pages(struct domain *d, paddr_t gpa, int nr_pages)
+{
+    mfn_t mfn;
+    int i;
+
+    p2m_read_lock(&d->arch.p2m);
+    for ( i = 0; i < nr_pages; i++ )
+    {
+        mfn = p2m_get_entry(&d->arch.p2m, _gfn((gpa >> PAGE_SHIFT) + i),
+                            NULL, NULL, NULL);
+        if ( mfn_eq(mfn, INVALID_MFN) )
+            continue;
+        put_page(mfn_to_page(mfn_x(mfn)));
+    }
+    p2m_read_unlock(&d->arch.p2m);
+}
+
+/*
+ * Provides easy access to guest memory by "mapping" some parts of it into
+ * Xen's VA space. In fact it relies on the memory being already mapped
+ * and just provides a pointer to it.
+ * This allows the ITS configuration data to be held in guest memory and
+ * avoids using Xen's memory for that.
+ */
+void *map_guest_pages(struct domain *d, paddr_t guest_addr, int nr_pages)
+{
+    int i;
+    void *ptr, *follow;
+
+    ptr = map_domain_page(_mfn(guest_addr >> PAGE_SHIFT));
+
+    /* Make sure subsequent pages are mapped in a virtually contiguous way. */
+    for ( i = 1; i < nr_pages; i++ )
+    {
+        follow = map_domain_page(_mfn((guest_addr >> PAGE_SHIFT) + i));
+        if ( follow != ptr + ((long)i << PAGE_SHIFT) )
+            return NULL;
+    }
+
+    return ptr + (guest_addr & ~PAGE_MASK);
+}
+
+/* "Unmap" previously mapped guest pages. Should be optimized away on arm64. */
+void unmap_guest_pages(void *va, int nr_pages)
+{
+    long i;
+
+    for ( i = nr_pages - 1; i >= 0; i-- )
+        unmap_domain_page(((uintptr_t)va & PAGE_MASK) + (i << PAGE_SHIFT));
+}
+
+int vgic_lpi_get_priority(struct domain *d, uint32_t vlpi)
+{
+    if ( vlpi >= d->arch.vgic.nr_lpis )
+        return GIC_PRI_IRQ;
+
+    return d->arch.vgic.proptable[vlpi - LPI_OFFSET] & LPI_PROP_PRIO_MASK;
+}
+
+bool vgic_lpi_is_enabled(struct domain *d, uint32_t vlpi)
+{
+    if ( vlpi >= d->arch.vgic.nr_lpis )
+        return false;
+
+    return d->arch.vgic.proptable[vlpi - LPI_OFFSET] & LPI_PROP_ENABLED;
+}
+
 static int __vgic_v3_rdistr_rd_mmio_write(struct vcpu *v, mmio_info_t *info,
                                           uint32_t gicr_reg,
                                           register_t r)
 {
     struct hsr_dabt dabt = info->dabt;
+    uint64_t reg;
 
     switch ( gicr_reg )
     {
@@ -366,36 +534,54 @@ static int __vgic_v3_rdistr_rd_mmio_write(struct vcpu *v, mmio_info_t *info,
         goto write_impl_defined;
 
     case VREG64(GICR_SETLPIR):
-        /* LPI is not implemented */
+        /* LPIs without an ITS are not implemented */
         goto write_ignore_64;
 
     case VREG64(GICR_CLRLPIR):
-        /* LPI is not implemented */
+        /* LPIs without an ITS are not implemented */
         goto write_ignore_64;
 
     case 0x0050:
         goto write_reserved;
 
     case VREG64(GICR_PROPBASER):
-        /* LPI is not implemented */
-        goto write_ignore_64;
+        if ( !vgic_reg64_check_access(dabt) ) goto bad_width;
+
+        /* Writing PROPBASER with LPIs enabled is UNPREDICTABLE. */
+        if ( v->arch.vgic.flags & VGIC_V3_LPIS_ENABLED )
+            return 1;
+
+        reg = v->domain->arch.vgic.rdist_propbase;
+        vgic_reg64_update(&reg, r, info);
+        reg = sanitize_propbaser(reg);
+        v->domain->arch.vgic.rdist_propbase = reg;
+        return 1;
 
     case VREG64(GICR_PENDBASER):
-        /* LPI is not implemented */
-        goto write_ignore_64;
+        if ( !vgic_reg64_check_access(dabt) ) goto bad_width;
+
+        /* Writing PENDBASER with LPIs enabled is UNPREDICTABLE. */
+        if ( v->arch.vgic.flags & VGIC_V3_LPIS_ENABLED )
+            return 1;
+
+        reg = v->arch.vgic.rdist_pendbase;
+        vgic_reg64_update(&reg, r, info);
+        reg = sanitize_pendbaser(reg);
+        v->arch.vgic.rdist_pendbase = reg;
+        return 1;
 
     case 0x0080:
         goto write_reserved;
 
     case VREG64(GICR_INVLPIR):
-        /* LPI is not implemented */
+        /* LPIs without an ITS are not implemented */
         goto write_ignore_64;
 
     case 0x00A8:
         goto write_reserved;
 
     case VREG64(GICR_INVALLR):
-        /* LPI is not implemented */
+        /* LPIs without an ITS are not implemented */
         goto write_ignore_64;
 
     case 0x00B8:
diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c
index 7e3440f..cf444f3 100644
--- a/xen/arch/arm/vgic.c
+++ b/xen/arch/arm/vgic.c
@@ -494,6 +494,10 @@ struct pending_irq *lpi_to_pending(struct vcpu *v, unsigned int lpi,
         empty->pirq.irq = lpi;
     }
 
+    /* Update the enabled status */
+    if ( vgic_lpi_is_enabled(v->domain, lpi) )
+        set_bit(GIC_IRQ_GUEST_ENABLED, &empty->pirq.status);
+
     spin_unlock(&v->arch.vgic.pending_lpi_list_lock);
 
     return &empty->pirq;
diff --git a/xen/include/asm-arm/domain.h b/xen/include/asm-arm/domain.h
index f44a84b..33c1851 100644
--- a/xen/include/asm-arm/domain.h
+++ b/xen/include/asm-arm/domain.h
@@ -110,6 +110,9 @@ struct arch_domain
         } *rdist_regions;
         int nr_regions;                     /* Number of rdist regions */
         uint32_t rdist_stride;              /* Re-Distributor stride */
+        int nr_lpis;
+        uint64_t rdist_propbase;
+        uint8_t *proptable;
         struct rb_root its_devices;         /* devices mapped to an ITS */
         spinlock_t its_devices_lock;        /* protects the its_devices tree */
 #endif
@@ -255,7 +258,10 @@ struct arch_vcpu
 
         /* GICv3: redistributor base and flags for this vCPU */
         paddr_t rdist_base;
-#define VGIC_V3_RDIST_LAST  (1 << 0)        /* last vCPU of the rdist */
+#define VGIC_V3_RDIST_LAST      (1 << 0)        /* last vCPU of the rdist */
+#define VGIC_V3_LPIS_ENABLED    (1 << 1)
+        uint64_t rdist_pendbase;
+        unsigned long *pendtable;
         uint8_t flags;
         struct list_head pending_lpi_list;
         spinlock_t pending_lpi_list_lock;   /* protects the pending_lpi_list */
diff --git a/xen/include/asm-arm/vgic.h b/xen/include/asm-arm/vgic.h
index 03d4d2e..a882fe8 100644
--- a/xen/include/asm-arm/vgic.h
+++ b/xen/include/asm-arm/vgic.h
@@ -285,6 +285,11 @@ VGIC_REG_HELPERS(32, 0x3);
 
 #undef VGIC_REG_HELPERS
 
+int get_guest_pages(struct domain *d, paddr_t gpa, int nr_pages);
+void put_guest_pages(struct domain *d, paddr_t gpa, int nr_pages);
+void *map_guest_pages(struct domain *d, paddr_t guest_addr, int nr_pages);
+void unmap_guest_pages(void *va, int nr_pages);
+
 enum gic_sgi_mode;
 
 /*
@@ -312,14 +317,23 @@ extern struct vgic_irq_rank *vgic_rank_irq(struct vcpu *v, unsigned int irq);
 extern bool vgic_emulate(struct cpu_user_regs *regs, union hsr hsr);
 extern void vgic_disable_irqs(struct vcpu *v, uint32_t r, int n);
 extern void vgic_enable_irqs(struct vcpu *v, uint32_t r, int n);
-/* placeholder function until the property table gets introduced */
-static inline int vgic_lpi_get_priority(struct domain *d, uint32_t vlpi)
-{
-    return 0xa;
-}
 extern void register_vgic_ops(struct domain *d, const struct vgic_ops *ops);
 int vgic_v2_init(struct domain *d, int *mmio_count);
 int vgic_v3_init(struct domain *d, int *mmio_count);
+#ifdef CONFIG_HAS_GICV3
+extern int vgic_lpi_get_priority(struct domain *d, uint32_t vlpi);
+extern bool vgic_lpi_is_enabled(struct domain *d, uint32_t vlpi);
+#else
+static inline int vgic_lpi_get_priority(struct domain *d, uint32_t vlpi)
+{
+    return 0xa0;
+}
+
+static inline bool vgic_lpi_is_enabled(struct domain *d, uint32_t vlpi)
+{
+    return false;
+}
+#endif
 
 extern int domain_vgic_register(struct domain *d, int *mmio_count);
 extern int vcpu_vgic_free(struct vcpu *v);
-- 
2.9.0



* [PATCH 14/28] ARM: vGICv3: Handle disabled LPIs
  2017-01-30 18:31 [PATCH 00/28] arm64: Dom0 ITS emulation Andre Przywara
                   ` (12 preceding siblings ...)
  2017-01-30 18:31 ` [PATCH 13/28] ARM: vGICv3: handle virtual LPI pending and property tables Andre Przywara
@ 2017-01-30 18:31 ` Andre Przywara
  2017-02-14 23:58   ` Stefano Stabellini
  2017-01-30 18:31 ` [PATCH 15/28] ARM: vGICv3: introduce basic ITS emulation bits Andre Przywara
                   ` (15 subsequent siblings)
  29 siblings, 1 reply; 106+ messages in thread
From: Andre Przywara @ 2017-01-30 18:31 UTC (permalink / raw)
  To: Stefano Stabellini, Julien Grall; +Cc: xen-devel, Vijay Kilari

If a guest disables an LPI, we do not forward this to the associated
host LPI, to avoid queueing commands to the host ITS command queue.
So a disabled LPI may still fire on the host. In this case we can bail
out early, but have to save the pending state on the virtual side.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
 xen/arch/arm/gic-v3-lpi.c  |  8 ++++++++
 xen/arch/arm/vgic-v3.c     | 12 ++++++++++++
 xen/include/asm-arm/vgic.h |  6 ++++++
 3 files changed, 26 insertions(+)

diff --git a/xen/arch/arm/gic-v3-lpi.c b/xen/arch/arm/gic-v3-lpi.c
index d270053..ade8b69 100644
--- a/xen/arch/arm/gic-v3-lpi.c
+++ b/xen/arch/arm/gic-v3-lpi.c
@@ -124,6 +124,14 @@ void do_LPI(unsigned int lpi)
 
     put_domain(d);
 
+    /*
+     * We keep all host LPIs enabled, so check if it's disabled on the guest
+     * side and just record this LPI in the virtual pending table in this case.
+     * The guest picks it up once it gets enabled again.
+     */
+    if ( !vgic_can_inject_lpi(vcpu, hlpi.virt_lpi) )
+        return;
+
     vgic_vcpu_inject_irq(vcpu, hlpi.virt_lpi);
 }
 
diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c
index c6db2d7..de625bf 100644
--- a/xen/arch/arm/vgic-v3.c
+++ b/xen/arch/arm/vgic-v3.c
@@ -498,6 +498,18 @@ bool vgic_lpi_is_enabled(struct domain *d, uint32_t vlpi)
     return d->arch.vgic.proptable[vlpi - LPI_OFFSET] & LPI_PROP_ENABLED;
 }
 
+bool vgic_can_inject_lpi(struct vcpu *vcpu, uint32_t vlpi)
+{
+    if ( vlpi >= vcpu->domain->arch.vgic.nr_lpis )
+        return false;
+
+    if ( vgic_lpi_is_enabled(vcpu->domain, vlpi) )
+        return true;
+
+    set_bit(vlpi - LPI_OFFSET, vcpu->arch.vgic.pendtable);
+    return false;
+}
+
 static int __vgic_v3_rdistr_rd_mmio_write(struct vcpu *v, mmio_info_t *info,
                                           uint32_t gicr_reg,
                                           register_t r)
diff --git a/xen/include/asm-arm/vgic.h b/xen/include/asm-arm/vgic.h
index a882fe8..e71b18b 100644
--- a/xen/include/asm-arm/vgic.h
+++ b/xen/include/asm-arm/vgic.h
@@ -323,6 +323,7 @@ int vgic_v3_init(struct domain *d, int *mmio_count);
 #ifdef CONFIG_HAS_GICV3
 extern int vgic_lpi_get_priority(struct domain *d, uint32_t vlpi);
 extern bool vgic_lpi_is_enabled(struct domain *d, uint32_t vlpi);
+extern bool vgic_can_inject_lpi(struct vcpu *v, uint32_t vlpi);
 #else
 static inline int vgic_lpi_get_priority(struct domain *d, uint32_t vlpi)
 {
@@ -333,6 +334,11 @@ static inline bool vgic_lpi_is_enabled(struct domain *d, uint32_t vlpi)
 {
     return false;
 }
+
+static inline bool vgic_can_inject_lpi(struct vcpu *v, uint32_t vlpi)
+{
+    return false;
+}
 #endif
 
 extern int domain_vgic_register(struct domain *d, int *mmio_count);
-- 
2.9.0



* [PATCH 15/28] ARM: vGICv3: introduce basic ITS emulation bits
  2017-01-30 18:31 [PATCH 00/28] arm64: Dom0 ITS emulation Andre Przywara
                   ` (13 preceding siblings ...)
  2017-01-30 18:31 ` [PATCH 14/28] ARM: vGICv3: Handle disabled LPIs Andre Przywara
@ 2017-01-30 18:31 ` Andre Przywara
  2017-02-15 20:06   ` Shanker Donthineni
  2017-01-30 18:31 ` [PATCH 16/28] ARM: vITS: introduce translation table walks Andre Przywara
                   ` (14 subsequent siblings)
  29 siblings, 1 reply; 106+ messages in thread
From: Andre Przywara @ 2017-01-30 18:31 UTC (permalink / raw)
  To: Stefano Stabellini, Julien Grall; +Cc: xen-devel, Vijay Kilari

Create a new file to hold the emulation code for the ITS widget.
For now we emulate the memory mapped ITS registers and provide a stub
to introduce the ITS command handling framework (but without actually
emulating any commands at this time).

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
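The command handling framework described above walks a ring buffer
between the read and write offsets. A minimal standalone sketch of that
walk follows, assuming host-endian command words; the names cmd_field(),
cmd_queue_bytes() and consume_commands() are illustrative only and are
not part of this patch, and the alignment checks are an addition of the
sketch.

```c
#include <stdint.h>

#define ITS_CMD_SIZE 32u  /* one ITS command is four 64-bit words */

/*
 * Extract `size` bits starting at bit `shift` from word `word` of a command.
 * Assumes the words were already converted to host byte order.
 */
static uint64_t cmd_field(const uint64_t *cmd, int word, int shift, int size)
{
    uint64_t mask = (size >= 64) ? ~0ULL : ((1ULL << size) - 1);

    return (cmd[word] >> shift) & mask;
}

/* Queue size in bytes: CBASER[7:0] holds the number of 4K pages minus one. */
static uint32_t cmd_queue_bytes(uint64_t cbaser)
{
    return (uint32_t)(((cbaser & 0xff) + 1) << 12);
}

/*
 * Consume every command between *creadr and writer (both byte offsets),
 * wrapping at the end of the queue. Returns the number of commands seen,
 * or -1 if an offset is out of range or not command-aligned.
 */
static int consume_commands(const uint64_t *queue, uint64_t cbaser,
                            uint32_t *creadr, uint32_t writer)
{
    uint32_t bytes = cmd_queue_bytes(cbaser);
    int seen = 0;

    if ( writer >= bytes || *creadr >= bytes ||
         (writer % ITS_CMD_SIZE) || (*creadr % ITS_CMD_SIZE) )
        return -1;

    while ( *creadr != writer )
    {
        const uint64_t *cmd = queue + *creadr / sizeof(*queue);

        /* A real handler would dispatch on this opcode (bits [7:0]). */
        (void)cmd_field(cmd, 0, 0, 8);
        seen++;

        *creadr += ITS_CMD_SIZE;
        if ( *creadr == bytes )
            *creadr = 0;
    }

    return seen;
}
```

Rejecting unaligned offsets up front guarantees that the loop reaches the
writer offset, since the queue size is always a multiple of the command
size.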
 xen/arch/arm/Makefile             |   1 +
 xen/arch/arm/vgic-v3-its.c        | 485 ++++++++++++++++++++++++++++++++++++++
 xen/arch/arm/vgic-v3.c            |   9 -
 xen/include/asm-arm/gic_v3_defs.h |  19 ++
 4 files changed, 505 insertions(+), 9 deletions(-)
 create mode 100644 xen/arch/arm/vgic-v3-its.c

diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
index 4ccf2eb..a1cbc27 100644
--- a/xen/arch/arm/Makefile
+++ b/xen/arch/arm/Makefile
@@ -46,6 +46,7 @@ obj-y += traps.o
 obj-y += vgic.o
 obj-y += vgic-v2.o
 obj-$(CONFIG_HAS_GICV3) += vgic-v3.o
+obj-$(CONFIG_HAS_ITS) += vgic-v3-its.o
 obj-y += vm_event.o
 obj-y += vtimer.o
 obj-y += vpsci.o
diff --git a/xen/arch/arm/vgic-v3-its.c b/xen/arch/arm/vgic-v3-its.c
new file mode 100644
index 0000000..fc28376
--- /dev/null
+++ b/xen/arch/arm/vgic-v3-its.c
@@ -0,0 +1,485 @@
+/*
+ * xen/arch/arm/vgic-v3-its.c
+ *
+ * ARM Interrupt Translation Service (ITS) emulation
+ *
+ * Andre Przywara <andre.przywara@arm.com>
+ * Copyright (c) 2016 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include <xen/bitops.h>
+#include <xen/config.h>
+#include <xen/domain_page.h>
+#include <xen/lib.h>
+#include <xen/init.h>
+#include <xen/softirq.h>
+#include <xen/irq.h>
+#include <xen/sched.h>
+#include <xen/sizes.h>
+#include <asm/current.h>
+#include <asm/mmio.h>
+#include <asm/gic_v3_defs.h>
+#include <asm/gic_v3_its.h>
+#include <asm/vgic.h>
+#include <asm/vgic-emul.h>
+
+/* Data structure to describe a virtual ITS */
+struct virt_its {
+    struct domain *d;
+    spinlock_t vcmd_lock;       /* protects the virtual command buffer */
+    uint64_t cbaser;
+    uint64_t *cmdbuf;
+    int cwriter;
+    int creadr;
+    spinlock_t its_lock;        /* protects the collection and device tables */
+    uint64_t baser0, baser1;
+    uint16_t *coll_table;
+    int max_collections;
+    uint64_t *dev_table;
+    int max_devices;
+    bool enabled;
+};
+
+/*
+ * An Interrupt Translation Table Entry: this is indexed by a
+ * DeviceID/EventID pair and is located in guest memory.
+ */
+struct vits_itte
+{
+    uint32_t vlpi;
+    uint16_t collection;
+};
+
+/**************************************
+ * Functions that handle ITS commands *
+ **************************************/
+
+static uint64_t its_cmd_mask_field(uint64_t *its_cmd,
+                                   int word, int shift, int size)
+{
+    return (le64_to_cpu(its_cmd[word]) >> shift) & (BIT(size) - 1);
+}
+
+#define its_cmd_get_command(cmd)        its_cmd_mask_field(cmd, 0,  0,  8)
+#define its_cmd_get_deviceid(cmd)       its_cmd_mask_field(cmd, 0, 32, 32)
+#define its_cmd_get_size(cmd)           its_cmd_mask_field(cmd, 1,  0,  5)
+#define its_cmd_get_id(cmd)             its_cmd_mask_field(cmd, 1,  0, 32)
+#define its_cmd_get_physical_id(cmd)    its_cmd_mask_field(cmd, 1, 32, 32)
+#define its_cmd_get_collection(cmd)     its_cmd_mask_field(cmd, 2,  0, 16)
+#define its_cmd_get_target_addr(cmd)    its_cmd_mask_field(cmd, 2, 16, 32)
+#define its_cmd_get_validbit(cmd)       its_cmd_mask_field(cmd, 2, 63,  1)
+
+#define ITS_CMD_BUFFER_SIZE(baser)      ((((baser) & 0xff) + 1) << 12)
+
+static int vgic_its_handle_cmds(struct domain *d, struct virt_its *its,
+                                uint32_t writer)
+{
+    uint64_t *cmdptr;
+
+    if ( !its->cmdbuf )
+        return -1;
+
+    if ( writer >= ITS_CMD_BUFFER_SIZE(its->cbaser) )
+        return -1;
+
+    spin_lock(&its->vcmd_lock);
+
+    while ( its->creadr != writer )
+    {
+        cmdptr = its->cmdbuf + (its->creadr / sizeof(*its->cmdbuf));
+        switch (its_cmd_get_command(cmdptr))
+        {
+        case GITS_CMD_SYNC:
+            /* We handle ITS commands synchronously, so we ignore SYNC. */
+	    break;
+        default:
+            gdprintk(XENLOG_G_WARNING, "ITS: unhandled ITS command %ld\n",
+                   its_cmd_get_command(cmdptr));
+            break;
+        }
+
+        its->creadr += ITS_CMD_SIZE;
+        if ( its->creadr == ITS_CMD_BUFFER_SIZE(its->cbaser) )
+            its->creadr = 0;
+    }
+    its->cwriter = writer;
+
+    spin_unlock(&its->vcmd_lock);
+
+    return 0;
+}
+
+/*****************************
+ * ITS registers read access *
+ *****************************/
+
+/*
+ * The physical address is encoded slightly differently depending on
+ * the used page size: the highest four bits are stored in the lowest
+ * four bits of the field for 64K pages.
+ */
+static paddr_t get_baser_phys_addr(uint64_t reg)
+{
+    if ( reg & BIT(9) )
+        return (reg & GENMASK(47, 16)) | ((reg & GENMASK(15, 12)) << 36);
+    else
+        return reg & GENMASK(47, 12);
+}
+
+static int vgic_v3_its_mmio_read(struct vcpu *v, mmio_info_t *info,
+                                 register_t *r, void *priv)
+{
+    struct virt_its *its = priv;
+
+    switch ( info->gpa & 0xffff )
+    {
+    case VREG32(GITS_CTLR):
+        if ( info->dabt.size != DABT_WORD ) goto bad_width;
+        *r = vgic_reg32_extract(its->enabled | BIT(31), info);
+        break;
+    case VREG32(GITS_IIDR):
+        if ( info->dabt.size != DABT_WORD ) goto bad_width;
+        *r = vgic_reg32_extract(GITS_IIDR_VALUE, info);
+        break;
+    case VREG64(GITS_TYPER):
+        if ( info->dabt.size < DABT_WORD ) goto bad_width;
+        *r = vgic_reg64_extract(0x1eff1, info);
+        break;
+    case VREG64(GITS_CBASER):
+        if ( info->dabt.size < DABT_WORD ) goto bad_width;
+        *r = vgic_reg64_extract(its->cbaser, info);
+        break;
+    case VREG64(GITS_CWRITER):
+        if ( info->dabt.size < DABT_WORD ) goto bad_width;
+        *r = vgic_reg64_extract(its->cwriter, info);
+        break;
+    case VREG64(GITS_CREADR):
+        if ( info->dabt.size < DABT_WORD ) goto bad_width;
+        *r = vgic_reg64_extract(its->creadr, info);
+        break;
+    case VREG64(GITS_BASER0):
+        if ( info->dabt.size < DABT_WORD ) goto bad_width;
+        *r = vgic_reg64_extract(its->baser0, info);
+        break;
+    case VREG64(GITS_BASER1):
+        if ( info->dabt.size < DABT_WORD ) goto bad_width;
+        *r = vgic_reg64_extract(its->baser1, info);
+        break;
+    case VRANGE64(GITS_BASER2, GITS_BASER7):
+        if ( info->dabt.size < DABT_WORD ) goto bad_width;
+        *r = vgic_reg64_extract(0, info);
+        break;
+    case VREG32(GICD_PIDR2):
+        if ( info->dabt.size != DABT_WORD ) goto bad_width;
+        *r = vgic_reg32_extract(GICV3_GICD_PIDR2, info);
+        break;
+    }
+
+    return 1;
+
+bad_width:
+    domain_crash_synchronous();
+
+    return 0;
+}
+
+/******************************
+ * ITS registers write access *
+ ******************************/
+
+static int its_baser_table_size(uint64_t baser)
+{
+    int page_size = 0;
+
+    switch ( (baser >> 8) & 3 )
+    {
+    case 0: page_size = SZ_4K; break;
+    case 1: page_size = SZ_16K; break;
+    case 2:
+    case 3: page_size = SZ_64K; break;
+    }
+
+    return page_size * ((baser & GENMASK(7, 0)) + 1);
+}
+
+static int its_baser_nr_entries(uint64_t baser)
+{
+    int entry_size = ((baser & GENMASK(52, 48)) >> 48) + 1;
+
+    return its_baser_table_size(baser) / entry_size;
+}
+
+static void vgic_its_map_cmdbuf(struct virt_its *its)
+{
+    if ( its->cmdbuf || !(its->cbaser & GITS_VALID_BIT) )
+        return;
+
+    get_guest_pages(its->d, its->cbaser & GENMASK(51, 12),
+                    (its->cbaser & 0xff) + 1);
+    its->cmdbuf = map_guest_pages(its->d, its->cbaser & GENMASK(51, 12),
+                                  (its->cbaser & 0xff) + 1);
+}
+
+static void vgic_its_unmap_cmdbuf(struct virt_its *its)
+{
+    int nr_pages = (its->cbaser & 0xff) + 1;
+
+    if ( !its->cmdbuf )
+        return;
+
+    unmap_guest_pages(its->cmdbuf, nr_pages);
+    put_guest_pages(its->d, its->cbaser & GENMASK(51, 12), nr_pages);
+
+    its->cmdbuf = NULL;
+}
+
+static void* vgic_its_map_its_table(struct virt_its *its, uint64_t reg)
+{
+    void *ret;
+    int table_size = its_baser_table_size(reg);
+
+    if ( !(reg & GITS_VALID_BIT) )
+        return NULL;
+
+    get_guest_pages(its->d, get_baser_phys_addr(reg), table_size >> PAGE_SHIFT);
+    ret = map_guest_pages(its->d, get_baser_phys_addr(reg),
+                          table_size >> PAGE_SHIFT);
+    memset(ret, 0, table_size);
+
+    return ret;
+}
+
+static void vgic_its_unmap_its_table(struct domain *d, void *table,
+                                     uint64_t reg)
+{
+    if ( !table )
+        return;
+
+    unmap_guest_pages(table, its_baser_table_size(reg) >> PAGE_SHIFT);
+    put_guest_pages(d, get_baser_phys_addr(reg),
+                    its_baser_table_size(reg) >> PAGE_SHIFT);
+}
+
+static void vgic_v3_its_change_its_status(struct virt_its *its, bool status)
+{
+    if ( !status )
+    {
+        its->enabled = false;
+        return;
+    }
+
+    vgic_its_map_cmdbuf(its);
+
+    if ( !its->dev_table )
+        its->dev_table = vgic_its_map_its_table(its, its->baser0);
+
+    if ( !its->coll_table )
+        its->coll_table = vgic_its_map_its_table(its, its->baser1);
+
+    its->enabled = true;
+}
+
+static void sanitize_its_base_reg(uint64_t *reg)
+{
+    uint64_t r = *reg;
+
+    /* Avoid outer shareable. */
+    switch ( (r >> GITS_BASER_SHAREABILITY_SHIFT) & 0x03 )
+    {
+    case GIC_BASER_OuterShareable:
+        r = r & ~GITS_BASER_SHAREABILITY_MASK;
+        r |= GIC_BASER_InnerShareable << GITS_BASER_SHAREABILITY_SHIFT;
+        break;
+    default:
+        break;
+    }
+
+    /* Avoid any inner non-cacheable mapping. */
+    switch ( (r >> GITS_BASER_INNER_CACHEABILITY_SHIFT) & 0x07 )
+    {
+    case GIC_BASER_CACHE_nCnB:
+    case GIC_BASER_CACHE_nC:
+        r = r & ~GITS_BASER_INNER_CACHEABILITY_MASK;
+        r |= GIC_BASER_CACHE_RaWb << GITS_BASER_INNER_CACHEABILITY_SHIFT;
+        break;
+    default:
+        break;
+    }
+
+    /* Only allow non-cacheable or same-as-inner. */
+    switch ( (r >> GITS_BASER_OUTER_CACHEABILITY_SHIFT) & 0x07 )
+    {
+    case GIC_BASER_CACHE_SameAsInner:
+    case GIC_BASER_CACHE_nC:
+        break;
+    default:
+        r = r & ~GITS_BASER_OUTER_CACHEABILITY_MASK;
+        r |= GIC_BASER_CACHE_nC << GITS_BASER_OUTER_CACHEABILITY_SHIFT;
+        break;
+    }
+
+    *reg = r;
+}
+
+static int vgic_v3_its_mmio_write(struct vcpu *v, mmio_info_t *info,
+                                  register_t r, void *priv)
+{
+    struct domain *d = v->domain;
+    struct virt_its *its = priv;
+    uint64_t reg;
+    uint32_t reg32, ctlr;
+
+    switch ( info->gpa & 0xffff )
+    {
+    case VREG32(GITS_CTLR):
+        if ( info->dabt.size != DABT_WORD ) goto bad_width;
+
+        ctlr = its->enabled ? GITS_CTLR_ENABLE : 0;
+        reg32 = ctlr;
+        vgic_reg32_update(&reg32, r, info);
+        its->enabled = reg32 & GITS_CTLR_ENABLE;
+
+        if ( ctlr ^ reg32 )
+            vgic_v3_its_change_its_status(its, its->enabled);
+        return 1;
+
+    case VREG32(GITS_IIDR):
+        goto write_ignore_32;
+    case VREG32(GITS_TYPER):
+        goto write_ignore_32;
+    case VREG64(GITS_CBASER):
+        if ( info->dabt.size < DABT_WORD ) goto bad_width;
+
+        /* Changing base registers with the ITS enabled is UNPREDICTABLE. */
+        if ( its->enabled )
+            return 1;
+
+        reg = its->cbaser;
+        vgic_reg64_update(&reg, r, info);
+        sanitize_its_base_reg(&reg);
+
+        vgic_its_unmap_cmdbuf(its);
+        its->cbaser = reg;
+
+        return 1;
+
+    case VREG64(GITS_CWRITER):
+        if ( info->dabt.size < DABT_WORD ) goto bad_width;
+        reg = its->cwriter & 0xfffe0;
+        vgic_reg64_update(&reg, r, info);
+        its->cwriter = reg & 0xfffe0;
+
+        if ( its->enabled )
+            vgic_its_handle_cmds(d, its, reg);
+
+        return 1;
+
+    case VREG64(GITS_CREADR):
+        goto write_ignore_64;
+    case VREG64(GITS_BASER0):
+        if ( info->dabt.size < DABT_WORD ) goto bad_width;
+
+        /* Changing base registers with the ITS enabled is UNPREDICTABLE. */
+        if ( its->enabled )
+            return 1;
+
+        reg = its->baser0;
+        vgic_reg64_update(&reg, r, info);
+
+        reg &= ~GITS_BASER_RO_MASK;
+        reg |= (sizeof(uint64_t) - 1) << GITS_BASER_ENTRY_SIZE_SHIFT;
+        reg |= GITS_BASER_TYPE_DEVICE << GITS_BASER_TYPE_SHIFT;
+        sanitize_its_base_reg(&reg);
+
+        /* Has the table address been changed or invalidated? */
+        if ( !(reg & GITS_VALID_BIT) ||
+             get_baser_phys_addr(reg) != get_baser_phys_addr(its->baser0) )
+        {
+            vgic_its_unmap_its_table(its->d, its->dev_table, its->baser0);
+            its->dev_table = NULL;
+        }
+
+        if ( reg & GITS_VALID_BIT )
+            its->max_devices = its_baser_nr_entries(reg);
+        else
+            its->max_devices = 0;
+
+        its->baser0 = reg;
+        return 1;
+    case VREG64(GITS_BASER1):
+        if ( info->dabt.size < DABT_WORD ) goto bad_width;
+
+        /* Changing base registers with the ITS enabled is UNPREDICTABLE. */
+        if ( its->enabled )
+            return 1;
+
+        reg = its->baser1;
+        vgic_reg64_update(&reg, r, info);
+        reg &= ~GITS_BASER_RO_MASK;
+        reg |= (sizeof(uint16_t) - 1) << GITS_BASER_ENTRY_SIZE_SHIFT;
+        reg |= GITS_BASER_TYPE_COLLECTION << GITS_BASER_TYPE_SHIFT;
+        sanitize_its_base_reg(&reg);
+
+        if ( !(reg & GITS_VALID_BIT) ||
+             get_baser_phys_addr(reg) != get_baser_phys_addr(its->baser1) )
+        {
+            vgic_its_unmap_its_table(its->d, its->coll_table, its->baser1);
+            its->coll_table = NULL;
+        }
+
+        if ( reg & GITS_VALID_BIT )
+            its->max_collections = its_baser_nr_entries(reg);
+        else
+            its->max_collections = 0;
+        its->baser1 = reg;
+        return 1;
+    case VRANGE64(GITS_BASER2, GITS_BASER7):
+        goto write_ignore_64;
+    default:
+        gdprintk(XENLOG_G_WARNING, "ITS: unhandled ITS register 0x%lx\n",
+                 info->gpa & 0xffff);
+        return 0;
+    }
+
+    return 1;
+
+write_ignore_64:
+    if ( ! vgic_reg64_check_access(info->dabt) ) goto bad_width;
+    return 1;
+
+write_ignore_32:
+    if ( info->dabt.size != DABT_WORD ) goto bad_width;
+    return 1;
+
+bad_width:
+    printk(XENLOG_G_ERR "%pv vITS: bad write width %d r%d offset %#08lx\n",
+           v, info->dabt.size, info->dabt.reg, info->gpa & 0xffff);
+
+    domain_crash_synchronous();
+
+    return 0;
+}
+
+static const struct mmio_handler_ops vgic_its_mmio_handler = {
+    .read  = vgic_v3_its_mmio_read,
+    .write = vgic_v3_its_mmio_write,
+};
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c
index de625bf..d1382be 100644
--- a/xen/arch/arm/vgic-v3.c
+++ b/xen/arch/arm/vgic-v3.c
@@ -159,15 +159,6 @@ static void vgic_store_irouter(struct domain *d, struct vgic_irq_rank *rank,
     rank->vcpu[offset] = new_vcpu->vcpu_id;
 }
 
-static inline bool vgic_reg64_check_access(struct hsr_dabt dabt)
-{
-    /*
-     * 64 bits registers can be accessible using 32-bit and 64-bit unless
-     * stated otherwise (See 8.1.3 ARM IHI 0069A).
-     */
-    return ( dabt.size == DABT_DOUBLE_WORD || dabt.size == DABT_WORD );
-}
-
 static int __vgic_v3_rdistr_rd_mmio_read(struct vcpu *v, mmio_info_t *info,
                                          uint32_t gicr_reg,
                                          register_t *r)
diff --git a/xen/include/asm-arm/gic_v3_defs.h b/xen/include/asm-arm/gic_v3_defs.h
index 878bae2..1e88d6b 100644
--- a/xen/include/asm-arm/gic_v3_defs.h
+++ b/xen/include/asm-arm/gic_v3_defs.h
@@ -153,6 +153,16 @@
 #define LPI_PROP_RES1                (1 << 1)
 #define LPI_PROP_ENABLED             (1 << 0)
 
+/*
+ * PIDR2: Only bits[7:4] are not implementation defined. We are
+ * emulating a GICv3 ([7:4] = 0x3).
+ *
+ * We don't emulate a specific register scheme, so implement the other
+ * bits as RES0 as recommended by the spec (see 8.1.13 in ARM IHI 0069A).
+ */
+#define GICV3_GICD_PIDR2  0x30
+#define GICV3_GICR_PIDR2  GICV3_GICD_PIDR2
+
 #define GICH_VMCR_EOI                (1 << 9)
 #define GICH_VMCR_VENG1              (1 << 1)
 
@@ -196,6 +206,15 @@ struct rdist_region {
     bool single_rdist;
 };
 
+/*
+ * 64 bits registers can be accessible using 32-bit and 64-bit unless
+ * stated otherwise (See 8.1.3 ARM IHI 0069A).
+ */
+static inline bool vgic_reg64_check_access(struct hsr_dabt dabt)
+{
+    return ( dabt.size == DABT_DOUBLE_WORD || dabt.size == DABT_WORD );
+}
+
 #endif /* __ASM_ARM_GIC_V3_DEFS_H__ */
 
 /*
-- 
2.9.0



* [PATCH 16/28] ARM: vITS: introduce translation table walks
  2017-01-30 18:31 [PATCH 00/28] arm64: Dom0 ITS emulation Andre Przywara
                   ` (14 preceding siblings ...)
  2017-01-30 18:31 ` [PATCH 15/28] ARM: vGICv3: introduce basic ITS emulation bits Andre Przywara
@ 2017-01-30 18:31 ` Andre Przywara
  2017-01-30 18:31 ` [PATCH 17/28] ARM: vITS: handle CLEAR command Andre Przywara
                   ` (13 subsequent siblings)
  29 siblings, 0 replies; 106+ messages in thread
From: Andre Przywara @ 2017-01-30 18:31 UTC (permalink / raw)
  To: Stefano Stabellini, Julien Grall; +Cc: xen-devel, Vijay Kilari

The ITS stores the target (v)CPU and the (virtual) LPI number in tables.
Introduce functions to walk those tables and translate a DeviceID/EventID
pair into a pair of virtual LPI and vCPU.
Since the final interrupt translation tables can be smaller than a page,
we map them on demand (which is cheap on arm64). Also we take care of
the locking on the way, since we can't easily protect those ITTs from
being altered by the guest.

To allow compiling without warnings, we declare two functions as
non-static for the moment.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
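The table walk described above boils down to address arithmetic on a
device table entry. The following standalone sketch mirrors the
DEV_TABLE_* macros from this patch; GENMASK64(), struct itte_sketch and
the function names are hypothetical, and rejecting EventID widths above
32 bits is an assumption of the sketch, not something this patch does.

```c
#include <stdint.h>

#define GENMASK64(h, l) (((~0ULL) << (l)) & (~0ULL >> (63 - (h))))
#define INVALID_ADDR    (~0ULL)

/* Layout of one translation entry, mirroring struct vits_itte. */
struct itte_sketch {
    uint32_t vlpi;
    uint16_t collection;
};

/* Pack an ITT base address (256-byte aligned) and the EventID width. */
static uint64_t dev_entry_pack(uint64_t itt_addr, unsigned int bits)
{
    return (itt_addr & GENMASK64(51, 8)) | ((bits - 1) & GENMASK64(7, 0));
}

/* Number of events an entry covers; widths above 32 bits are rejected. */
static uint64_t dev_entry_events(uint64_t entry)
{
    unsigned int bits = (unsigned int)(entry & GENMASK64(7, 0)) + 1;

    return (bits > 32) ? 0 : (1ULL << bits);
}

/* Guest physical address of the ITTE for devid/evid, or INVALID_ADDR. */
static uint64_t itte_addr(const uint64_t *dev_table, uint32_t max_devices,
                          uint32_t devid, uint32_t evid)
{
    if ( devid >= max_devices )
        return INVALID_ADDR;

    if ( evid >= dev_entry_events(dev_table[devid]) )
        return INVALID_ADDR;

    return (dev_table[devid] & GENMASK64(51, 8)) +
           (uint64_t)evid * sizeof(struct itte_sketch);
}
```

The returned address is what would then be mapped on demand to read or
update the entry.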
 xen/arch/arm/vgic-v3-its.c | 135 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 135 insertions(+)

diff --git a/xen/arch/arm/vgic-v3-its.c b/xen/arch/arm/vgic-v3-its.c
index fc28376..982c51d 100644
--- a/xen/arch/arm/vgic-v3-its.c
+++ b/xen/arch/arm/vgic-v3-its.c
@@ -60,6 +60,141 @@ struct vits_itte
     uint16_t collection;
 };
 
+#define UNMAPPED_COLLECTION      ((uint16_t)~0)
+
+/* Must be called with the ITS lock held. */
+static struct vcpu *get_vcpu_from_collection(struct virt_its *its, int collid)
+{
+    uint16_t vcpu_id;
+
+    if ( collid >= its->max_collections )
+        return NULL;
+
+    vcpu_id = its->coll_table[collid];
+    if ( vcpu_id == UNMAPPED_COLLECTION || vcpu_id >= its->d->max_vcpus )
+        return NULL;
+
+    return its->d->vcpu[vcpu_id];
+}
+
+#define DEV_TABLE_ITT_ADDR(x) ((x) & GENMASK(51, 8))
+#define DEV_TABLE_ITT_SIZE(x) (BIT(((x) & GENMASK(7, 0)) + 1))
+#define DEV_TABLE_ENTRY(addr, bits)                     \
+        (((addr) & GENMASK(51, 8)) | (((bits) - 1) & GENMASK(7, 0)))
+
+static paddr_t get_itte_address(struct virt_its *its,
+                                uint32_t devid, uint32_t evid)
+{
+    paddr_t addr;
+
+    if ( devid >= its->max_devices )
+        return ~0;
+
+    if ( evid >= DEV_TABLE_ITT_SIZE(its->dev_table[devid]) )
+        return ~0;
+
+    addr = DEV_TABLE_ITT_ADDR(its->dev_table[devid]);
+
+    return addr + evid * sizeof(struct vits_itte);
+}
+
+/*
+ * Looks up a given deviceID/eventID pair on an ITS and returns a pointer to
+ * the corresponding ITTE. This maps the respective guest page into Xen.
+ * Once finished with handling the ITTE, call put_devid_evid() to unmap
+ * the page again.
+ * Must be called with the ITS lock held.
+ */
+static struct vits_itte *get_devid_evid(struct virt_its *its,
+                                        uint32_t devid, uint32_t evid)
+{
+    paddr_t addr = get_itte_address(its, devid, evid);
+
+    if ( addr == ~0 )
+        return NULL;
+
+    return map_guest_pages(its->d, addr, 1);
+}
+
+/* Must be called with the ITS lock held. */
+static void put_devid_evid(struct virt_its *its, struct vits_itte *itte)
+{
+    unmap_guest_pages(itte, 1);
+}
+
+/*
+ * Queries the collection and device tables to get the vCPU and virtual
+ * LPI number for a given guest event. This takes care of mapping the
+ * respective tables and validating the values, since we can't efficiently
+ * protect the ITTs with their less-than-page-size granularity.
+ * Takes and drops the its_lock.
+ */
+bool read_itte(struct virt_its *its, uint32_t devid, uint32_t evid,
+               struct vcpu **vcpu, uint32_t *vlpi)
+{
+    struct vits_itte *itte;
+    int collid;
+    uint32_t _vlpi;
+    struct vcpu *_vcpu;
+
+    spin_lock(&its->its_lock);
+    itte = get_devid_evid(its, devid, evid);
+    if ( !itte )
+    {
+        spin_unlock(&its->its_lock);
+        return false;
+    }
+    collid = itte->collection;
+    _vlpi = itte->vlpi;
+    put_devid_evid(its, itte);
+
+    _vcpu = get_vcpu_from_collection(its, collid);
+    spin_unlock(&its->its_lock);
+
+    if ( !_vcpu )
+        return false;
+
+    if ( collid >= its->max_collections )
+        return false;
+
+    *vcpu = _vcpu;
+    *vlpi = _vlpi;
+
+    return true;
+}
+
+#define SKIP_LPI_UPDATE 1
+bool write_itte(struct virt_its *its, uint32_t devid, uint32_t evid,
+                uint32_t collid, uint32_t vlpi, struct vcpu **vcpu)
+{
+    struct vits_itte *itte;
+
+    if ( collid >= its->max_collections )
+        return false;
+
+    /* TODO: validate vlpi */
+
+    spin_lock(&its->its_lock);
+    itte = get_devid_evid(its, devid, evid);
+    if ( !itte )
+    {
+        spin_unlock(&its->its_lock);
+        return false;
+    }
+
+    itte->collection = collid;
+    if ( vlpi != SKIP_LPI_UPDATE )
+        itte->vlpi = vlpi;
+
+    if ( vcpu )
+        *vcpu = get_vcpu_from_collection(its, collid);
+
+    put_devid_evid(its, itte);
+    spin_unlock(&its->its_lock);
+
+    return true;
+}
+
 /**************************************
  * Functions that handle ITS commands *
  **************************************/
-- 
2.9.0



* [PATCH 17/28] ARM: vITS: handle CLEAR command
  2017-01-30 18:31 [PATCH 00/28] arm64: Dom0 ITS emulation Andre Przywara
                   ` (15 preceding siblings ...)
  2017-01-30 18:31 ` [PATCH 16/28] ARM: vITS: introduce translation table walks Andre Przywara
@ 2017-01-30 18:31 ` Andre Przywara
  2017-02-15  0:07   ` Stefano Stabellini
  2017-01-30 18:31 ` [PATCH 18/28] ARM: vITS: handle INT command Andre Przywara
                   ` (12 subsequent siblings)
  29 siblings, 1 reply; 106+ messages in thread
From: Andre Przywara @ 2017-01-30 18:31 UTC (permalink / raw)
  To: Stefano Stabellini, Julien Grall; +Cc: xen-devel, Vijay Kilari

This introduces the ITS command handler for the CLEAR command, which
clears the pending state of an LPI.
This removes a not-yet injected, but already queued IRQ from a VCPU.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
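The effect of CLEAR described above can be modelled without any Xen
infrastructure. In this sketch struct pending_sketch is a hypothetical
stand-in for the few struct pending_irq fields involved, the pending
table is a plain array of 64-bit words, and the range check on the LPI
number is an addition of the sketch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define LPI_OFFSET 8192u  /* LPI numbers start at 8192 */

/* Hypothetical stand-in for the parts of struct pending_irq used here. */
struct pending_sketch {
    uint32_t irq;      /* 0 means the slot is free */
    bool queued;       /* waiting to be put into a list register */
    bool visible;      /* already presented to the guest */
};

static bool pend_test(const uint64_t *pendtable, uint32_t vlpi)
{
    uint32_t n = vlpi - LPI_OFFSET;

    return (pendtable[n / 64] >> (n % 64)) & 1;
}

static void pend_set(uint64_t *pendtable, uint32_t vlpi)
{
    uint32_t n = vlpi - LPI_OFFSET;

    pendtable[n / 64] |= 1ULL << (n % 64);
}

/*
 * Drop a queued-but-not-injected LPI and clear its pending bit.
 * nr_bits is the number of LPIs covered by pendtable.
 */
static bool clear_lpi(struct pending_sketch *p, uint64_t *pendtable,
                      uint32_t nr_bits, uint32_t vlpi)
{
    uint32_t n;

    if ( vlpi < LPI_OFFSET || vlpi - LPI_OFFSET >= nr_bits )
        return false;
    n = vlpi - LPI_OFFSET;

    if ( p )
    {
        p->queued = false;
        /* The slot can only be recycled if the guest has not seen it. */
        if ( !p->visible )
            p->irq = 0;
    }

    pendtable[n / 64] &= ~(1ULL << (n % 64));

    return true;
}
```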
 xen/arch/arm/vgic-v3-its.c | 35 +++++++++++++++++++++++++++++++++--
 1 file changed, 33 insertions(+), 2 deletions(-)

diff --git a/xen/arch/arm/vgic-v3-its.c b/xen/arch/arm/vgic-v3-its.c
index 982c51d..48eb924 100644
--- a/xen/arch/arm/vgic-v3-its.c
+++ b/xen/arch/arm/vgic-v3-its.c
@@ -129,8 +129,8 @@ static void put_devid_evid(struct virt_its *its, struct vits_itte *itte)
  * protect the ITTs with their less-than-page-size granularity.
  * Takes and drops the its_lock.
  */
-bool read_itte(struct virt_its *its, uint32_t devid, uint32_t evid,
-               struct vcpu **vcpu, uint32_t *vlpi)
+static bool read_itte(struct virt_its *its, uint32_t devid, uint32_t evid,
+                      struct vcpu **vcpu, uint32_t *vlpi)
 {
     struct vits_itte *itte;
     int collid;
@@ -214,6 +214,34 @@ static uint64_t its_cmd_mask_field(uint64_t *its_cmd,
 #define its_cmd_get_target_addr(cmd)    its_cmd_mask_field(cmd, 2, 16, 32)
 #define its_cmd_get_validbit(cmd)       its_cmd_mask_field(cmd, 2, 63,  1)
 
+static int its_handle_clear(struct virt_its *its, uint64_t *cmdptr)
+{
+    uint32_t devid = its_cmd_get_deviceid(cmdptr);
+    uint32_t eventid = its_cmd_get_id(cmdptr);
+    struct pending_irq *pirq;
+    struct vcpu *vcpu;
+    uint32_t vlpi;
+
+    if ( !read_itte(its, devid, eventid, &vcpu, &vlpi) )
+        return -1;
+
+    /* Remove a pending, but not yet injected guest IRQ. */
+    pirq = lpi_to_pending(vcpu, vlpi, false);
+    if ( pirq )
+    {
+        clear_bit(GIC_IRQ_GUEST_QUEUED, &pirq->status);
+        gic_remove_from_queues(vcpu, vlpi);
+
+        /* Mark this pending IRQ struct as available again. */
+        if ( !test_bit(GIC_IRQ_GUEST_VISIBLE, &pirq->status) )
+            pirq->irq = 0;
+    }
+
+    clear_bit(vlpi - LPI_OFFSET, vcpu->arch.vgic.pendtable);
+
+    return 0;
+}
+
 #define ITS_CMD_BUFFER_SIZE(baser)      ((((baser) & 0xff) + 1) << 12)
 
 static int vgic_its_handle_cmds(struct domain *d, struct virt_its *its,
@@ -234,6 +262,9 @@ static int vgic_its_handle_cmds(struct domain *d, struct virt_its *its,
         cmdptr = its->cmdbuf + (its->creadr / sizeof(*its->cmdbuf));
         switch (its_cmd_get_command(cmdptr))
         {
+        case GITS_CMD_CLEAR:
+            its_handle_clear(its, cmdptr);
+            break;
         case GITS_CMD_SYNC:
             /* We handle ITS commands synchronously, so we ignore SYNC. */
 	    break;
-- 
2.9.0



* [PATCH 18/28] ARM: vITS: handle INT command
  2017-01-30 18:31 [PATCH 00/28] arm64: Dom0 ITS emulation Andre Przywara
                   ` (16 preceding siblings ...)
  2017-01-30 18:31 ` [PATCH 17/28] ARM: vITS: handle CLEAR command Andre Przywara
@ 2017-01-30 18:31 ` Andre Przywara
  2017-01-30 18:31 ` [PATCH 19/28] ARM: vITS: handle MAPC command Andre Przywara
                   ` (11 subsequent siblings)
  29 siblings, 0 replies; 106+ messages in thread
From: Andre Przywara @ 2017-01-30 18:31 UTC (permalink / raw)
  To: Stefano Stabellini, Julien Grall; +Cc: xen-devel, Vijay Kilari

The INT command sets a given LPI identified by a DeviceID/EventID pair
as pending and thus triggers it to be injected.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
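The decision INT makes can be shown in isolation: an enabled LPI is
injected right away, a disabled one is only latched in the pending
table. The sketch below uses plain arrays for both tables; enum
int_result and handle_int() are illustrative names, and the range check
is an assumption of the sketch, not taken from this patch.

```c
#include <stdint.h>

#define LPI_OFFSET        8192u      /* LPI numbers start at 8192 */
#define LPI_PROP_ENABLED  (1u << 0)  /* enable bit in a property byte */

enum int_result { INT_INJECTED, INT_LATCHED, INT_INVALID };

/*
 * Decide what to do with a triggered LPI. nr_lpis is the number of LPIs
 * covered by both tables. On INT_INJECTED the caller would inject the
 * interrupt into the vCPU; on INT_LATCHED the pending bit has been set.
 */
static enum int_result handle_int(const uint8_t *proptable,
                                  uint64_t *pendtable,
                                  uint32_t nr_lpis, uint32_t vlpi)
{
    uint32_t n;

    if ( vlpi < LPI_OFFSET || vlpi - LPI_OFFSET >= nr_lpis )
        return INT_INVALID;
    n = vlpi - LPI_OFFSET;

    if ( proptable[n] & LPI_PROP_ENABLED )
        return INT_INJECTED;

    pendtable[n / 64] |= 1ULL << (n % 64);

    return INT_LATCHED;
}
```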
 xen/arch/arm/vgic-v3-its.c | 23 +++++++++++++++++++++++
 1 file changed, 23 insertions(+)

diff --git a/xen/arch/arm/vgic-v3-its.c b/xen/arch/arm/vgic-v3-its.c
index 48eb924..9307235 100644
--- a/xen/arch/arm/vgic-v3-its.c
+++ b/xen/arch/arm/vgic-v3-its.c
@@ -242,6 +242,26 @@ static int its_handle_clear(struct virt_its *its, uint64_t *cmdptr)
     return 0;
 }
 
+static int its_handle_int(struct virt_its *its, uint64_t *cmdptr)
+{
+    uint32_t devid = its_cmd_get_deviceid(cmdptr);
+    uint32_t eventid = its_cmd_get_id(cmdptr);
+    struct vcpu *vcpu;
+    uint32_t vlpi;
+    uint8_t prop;
+
+    if ( !read_itte(its, devid, eventid, &vcpu, &vlpi) )
+        return -1;
+
+    prop = vcpu->domain->arch.vgic.proptable[vlpi - LPI_OFFSET];
+    if ( prop & LPI_PROP_ENABLED )
+        vgic_vcpu_inject_irq(vcpu, vlpi);
+    else
+        set_bit(vlpi - LPI_OFFSET, vcpu->arch.vgic.pendtable);
+
+    return 0;
+}
+
 #define ITS_CMD_BUFFER_SIZE(baser)      ((((baser) & 0xff) + 1) << 12)
 
 static int vgic_its_handle_cmds(struct domain *d, struct virt_its *its,
@@ -265,6 +285,9 @@ static int vgic_its_handle_cmds(struct domain *d, struct virt_its *its,
         case GITS_CMD_CLEAR:
             its_handle_clear(its, cmdptr);
             break;
+        case GITS_CMD_INT:
+            its_handle_int(its, cmdptr);
+            break;
         case GITS_CMD_SYNC:
             /* We handle ITS commands synchronously, so we ignore SYNC. */
 	    break;
-- 
2.9.0



* [PATCH 19/28] ARM: vITS: handle MAPC command
  2017-01-30 18:31 [PATCH 00/28] arm64: Dom0 ITS emulation Andre Przywara
                   ` (17 preceding siblings ...)
  2017-01-30 18:31 ` [PATCH 18/28] ARM: vITS: handle INT command Andre Przywara
@ 2017-01-30 18:31 ` Andre Przywara
  2017-01-30 18:31 ` [PATCH 20/28] ARM: vITS: handle MAPD command Andre Przywara
                   ` (10 subsequent siblings)
  29 siblings, 0 replies; 106+ messages in thread
From: Andre Przywara @ 2017-01-30 18:31 UTC (permalink / raw)
  To: Stefano Stabellini, Julien Grall; +Cc: xen-devel, Vijay Kilari

The MAPC command associates a given collection ID with a given
redistributor, thus mapping collections to VCPUs.
We just store the vcpu_id in the collection table for that.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
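Storing the vcpu_id in the collection table, and reading it back later,
can be sketched as below. The locking is omitted and the function names
are illustrative only; the table is a plain array of uint16_t as in the
patch.

```c
#include <stdbool.h>
#include <stdint.h>

#define UNMAPPED_COLLECTION ((uint16_t)~0)

/* MAPC: bind a collection to a vCPU, or unbind it if valid is false. */
static int map_collection(uint16_t *coll_table, uint32_t max_collections,
                          uint32_t max_vcpus, uint32_t collid,
                          uint64_t rdbase, bool valid)
{
    if ( !coll_table || collid >= max_collections || rdbase >= max_vcpus )
        return -1;

    coll_table[collid] = valid ? (uint16_t)rdbase : UNMAPPED_COLLECTION;

    return 0;
}

/* Look up the vCPU ID for a collection; -1 if unmapped or out of range. */
static int collection_to_vcpu(const uint16_t *coll_table,
                              uint32_t max_collections, uint32_t max_vcpus,
                              uint32_t collid)
{
    uint16_t vcpu_id;

    if ( collid >= max_collections )
        return -1;

    vcpu_id = coll_table[collid];
    if ( vcpu_id == UNMAPPED_COLLECTION || vcpu_id >= max_vcpus )
        return -1;

    return vcpu_id;
}
```

The lookup validates the stored value again because the table lives in
guest memory and may have been modified behind our back.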
 xen/arch/arm/vgic-v3-its.c | 30 ++++++++++++++++++++++++++++++
 1 file changed, 30 insertions(+)

diff --git a/xen/arch/arm/vgic-v3-its.c b/xen/arch/arm/vgic-v3-its.c
index 9307235..e6523a3 100644
--- a/xen/arch/arm/vgic-v3-its.c
+++ b/xen/arch/arm/vgic-v3-its.c
@@ -262,6 +262,33 @@ static int its_handle_int(struct virt_its *its, uint64_t *cmdptr)
     return 0;
 }
 
+static int its_handle_mapc(struct virt_its *its, uint64_t *cmdptr)
+{
+    uint32_t collid = its_cmd_get_collection(cmdptr);
+    uint64_t rdbase = its_cmd_mask_field(cmdptr, 2, 16, 44);
+    int ret = -1;
+
+    if ( collid >= its->max_collections )
+        return ret;
+
+    if ( rdbase >= its->d->max_vcpus )
+        return ret;
+
+    spin_lock(&its->its_lock);
+    if ( its->coll_table )
+    {
+        if ( its_cmd_get_validbit(cmdptr) )
+            its->coll_table[collid] = rdbase;
+        else
+            its->coll_table[collid] = UNMAPPED_COLLECTION;
+
+        ret = 0;
+    }
+    spin_unlock(&its->its_lock);
+
+    return ret;
+}
+
 #define ITS_CMD_BUFFER_SIZE(baser)      ((((baser) & 0xff) + 1) << 12)
 
 static int vgic_its_handle_cmds(struct domain *d, struct virt_its *its,
@@ -288,6 +315,9 @@ static int vgic_its_handle_cmds(struct domain *d, struct virt_its *its,
         case GITS_CMD_INT:
             its_handle_int(its, cmdptr);
             break;
+        case GITS_CMD_MAPC:
+            its_handle_mapc(its, cmdptr);
+            break;
         case GITS_CMD_SYNC:
             /* We handle ITS commands synchronously, so we ignore SYNC. */
 	    break;
-- 
2.9.0



* [PATCH 20/28] ARM: vITS: handle MAPD command
  2017-01-30 18:31 [PATCH 00/28] arm64: Dom0 ITS emulation Andre Przywara
                   ` (18 preceding siblings ...)
  2017-01-30 18:31 ` [PATCH 19/28] ARM: vITS: handle MAPC command Andre Przywara
@ 2017-01-30 18:31 ` Andre Przywara
  2017-02-15  0:17   ` Stefano Stabellini
  2017-01-30 18:31 ` [PATCH 21/28] ARM: vITS: handle MAPTI command Andre Przywara
                   ` (9 subsequent siblings)
  29 siblings, 1 reply; 106+ messages in thread
From: Andre Przywara @ 2017-01-30 18:31 UTC (permalink / raw)
  To: Stefano Stabellini, Julien Grall; +Cc: xen-devel, Vijay Kilari

The MAPD command maps a device by associating a memory region for
storing ITTEs with a certain device ID.
We just store the given guest physical address in the device table.
We don't map the ITTs permanently, as their alignment requirement is
only 256 bytes, which would make mapping several tables complicated.
Instead we map an ITT on demand when we need it later.
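
As a rough illustration (not part of the patch): since the ITT address
is 256-byte aligned, its low eight bits are free, so one possible device
table entry packs the address and the number of event ID bits into a
single 64-bit word. The layout and all names below are assumptions made
for this sketch; the DEV_TABLE_ENTRY() macro used by the patch is defined
elsewhere in the series and may differ.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical layout: bits [51:8] hold the ITT address, bits [7:0] the
 * number of event ID bits. An all-zero entry means "device not mapped".
 */
#define SKETCH_ITT_ADDR_MASK  0x000fffffffffff00ULL
#define SKETCH_EVID_BITS_MASK 0xffULL

static uint64_t sketch_dev_entry(uint64_t itt_addr, unsigned int evid_bits)
{
    /* MAPD encodes "bits - 1" in five bits, so 1..32 bits are possible. */
    assert(evid_bits >= 1 && evid_bits <= 32);

    return (itt_addr & SKETCH_ITT_ADDR_MASK) | evid_bits;
}

static uint64_t sketch_dev_entry_addr(uint64_t entry)
{
    return entry & SKETCH_ITT_ADDR_MASK;
}

static unsigned int sketch_dev_entry_bits(uint64_t entry)
{
    return (unsigned int)(entry & SKETCH_EVID_BITS_MASK);
}
```

Because a valid entry always carries a non-zero bit count, the value 0
can double as the "unmapped" marker without a separate valid flag.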

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
 xen/arch/arm/vgic-v3-its.c | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

diff --git a/xen/arch/arm/vgic-v3-its.c b/xen/arch/arm/vgic-v3-its.c
index e6523a3..5be40d8 100644
--- a/xen/arch/arm/vgic-v3-its.c
+++ b/xen/arch/arm/vgic-v3-its.c
@@ -289,6 +289,27 @@ static int its_handle_mapc(struct virt_its *its, uint64_t *cmdptr)
     return ret;
 }
 
+static int its_handle_mapd(struct virt_its *its, uint64_t *cmdptr)
+{
+    uint32_t devid = its_cmd_get_deviceid(cmdptr);
+    int size = its_cmd_get_size(cmdptr);
+    bool valid = its_cmd_get_validbit(cmdptr);
+    paddr_t itt_addr = its_cmd_mask_field(cmdptr, 2, 0, 52) & GENMASK(51, 8);
+
+    if ( !its->dev_table )
+        return -1;
+
+    spin_lock(&its->its_lock);
+    if ( valid )
+        its->dev_table[devid] = DEV_TABLE_ENTRY(itt_addr, size + 1);
+    else
+        its->dev_table[devid] = 0;
+
+    spin_unlock(&its->its_lock);
+
+    return 0;
+}
+
 #define ITS_CMD_BUFFER_SIZE(baser)      ((((baser) & 0xff) + 1) << 12)
 
 static int vgic_its_handle_cmds(struct domain *d, struct virt_its *its,
@@ -318,6 +339,9 @@ static int vgic_its_handle_cmds(struct domain *d, struct virt_its *its,
         case GITS_CMD_MAPC:
             its_handle_mapc(its, cmdptr);
             break;
+        case GITS_CMD_MAPD:
+            its_handle_mapd(its, cmdptr);
+	    break;
         case GITS_CMD_SYNC:
             /* We handle ITS commands synchronously, so we ignore SYNC. */
 	    break;
-- 
2.9.0



^ permalink raw reply related	[flat|nested] 106+ messages in thread

* [PATCH 21/28] ARM: vITS: handle MAPTI command
  2017-01-30 18:31 [PATCH 00/28] arm64: Dom0 ITS emulation Andre Przywara
                   ` (19 preceding siblings ...)
  2017-01-30 18:31 ` [PATCH 20/28] ARM: vITS: handle MAPD command Andre Przywara
@ 2017-01-30 18:31 ` Andre Przywara
  2017-01-30 18:31 ` [PATCH 22/28] ARM: vITS: handle MOVI command Andre Przywara
                   ` (8 subsequent siblings)
  29 siblings, 0 replies; 106+ messages in thread
From: Andre Przywara @ 2017-01-30 18:31 UTC (permalink / raw)
  To: Stefano Stabellini, Julien Grall; +Cc: xen-devel, Vijay Kilari

The MAPTI command associates a DeviceID/EventID pair with an LPI/CPU
pair and actually instantiates LPIs.
We connect the already allocated host LPI to this virtual LPI, so that
any triggering IRQ on the host can be quickly forwarded to a guest.
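
To make the lookup concrete (illustration only, not part of the patch):
host LPIs are handed out in blocks of 32, so an event ID is translated
by picking the block it falls into and adding the offset within that
block. The structure and function names below are hypothetical; 8192 is
the first LPI INTID in GICv3.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_LPI_BASE   8192u  /* first LPI INTID in GICv3 */
#define SKETCH_LPI_BLOCK  32u    /* host LPIs are allocated in blocks of 32 */

/* Hypothetical, stripped-down view of a device assigned to a guest. */
struct sketch_device {
    uint32_t nr_events;
    const uint32_t *host_lpi_blocks; /* first host LPI of each block */
};

/* Returns the host LPI for an event, or 0 if there is none. */
static uint32_t sketch_event_to_host_lpi(const struct sketch_device *dev,
                                         uint32_t eventid)
{
    uint32_t lpi;

    assert(dev != NULL);

    if ( eventid >= dev->nr_events )
        return 0;

    lpi = dev->host_lpi_blocks[eventid / SKETCH_LPI_BLOCK] +
          eventid % SKETCH_LPI_BLOCK;

    /* Anything below the LPI range means the block was never allocated. */
    return lpi >= SKETCH_LPI_BASE ? lpi : 0;
}
```

Returning 0 for "no mapping" works because 0 is never a valid LPI
number, which is also how the patch signals a failed translation.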

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
 xen/arch/arm/gic-v3-its.c        | 58 ++++++++++++++++++++++++++++++++++++++++
 xen/arch/arm/gic-v3-lpi.c        | 18 +++++++++++++
 xen/arch/arm/vgic-v3-its.c       | 27 +++++++++++++++++--
 xen/include/asm-arm/gic_v3_its.h |  5 ++++
 4 files changed, 106 insertions(+), 2 deletions(-)

diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
index 2a7093f..1268d64 100644
--- a/xen/arch/arm/gic-v3-its.c
+++ b/xen/arch/arm/gic-v3-its.c
@@ -632,6 +632,64 @@ int gicv3_its_unmap_device(struct domain *d, int guest_devid)
     return -ENOENT;
 }
 
+/*
+ * Translates an event for a given guest device ID into the associated host
+ * LPI number. This can be used to look up the mapped guest LPI.
+ */
+static uint32_t translate_event(struct domain *d,
+                                uint32_t devid, uint32_t eventid)
+{
+    struct rb_node *node;
+    struct its_devices *dev;
+    uint32_t host_lpi = 0;
+
+    spin_lock(&d->arch.vgic.its_devices_lock);
+    node = d->arch.vgic.its_devices.rb_node;
+    while ( node )
+    {
+        dev = rb_entry(node, struct its_devices, rbnode);
+        if ( dev->guest_devid == devid )
+        {
+            if ( eventid >= dev->eventids )
+                goto out;
+
+            host_lpi = dev->host_lpis[eventid / 32] + (eventid % 32);
+            if ( !is_lpi(host_lpi) )
+                host_lpi = 0;
+            goto out;
+        }
+
+        if ( devid < dev->guest_devid )
+            node = node->rb_left;
+        else
+            node = node->rb_right;
+    }
+
+out:
+    spin_unlock(&d->arch.vgic.its_devices_lock);
+
+    return host_lpi;
+}
+
+/*
+ * Connects the event ID for an already assigned device to the given VCPU/vLPI
+ * pair. The corresponding physical LPI is already mapped on the host side
+ * (when assigning the physical device to the guest), so we just connect the
+ * target VCPU/vLPI pair to that interrupt to inject it properly if it fires.
+ */
+int gicv3_assign_guest_event(struct domain *d, uint32_t devid, uint32_t eventid,
+                             struct vcpu *v, uint32_t virt_lpi)
+{
+    uint32_t host_lpi = translate_event(d, devid, eventid);
+
+    if ( !host_lpi )
+        return -ENOENT;
+
+    gicv3_lpi_update_host_entry(host_lpi, d->domain_id, v->vcpu_id, virt_lpi);
+
+    return 0;
+}
+
 /* Scan the DT for any ITS nodes and create a list of host ITSes out of it. */
 void gicv3_its_dt_init(const struct dt_device_node *node)
 {
diff --git a/xen/arch/arm/gic-v3-lpi.c b/xen/arch/arm/gic-v3-lpi.c
index ade8b69..494ae22 100644
--- a/xen/arch/arm/gic-v3-lpi.c
+++ b/xen/arch/arm/gic-v3-lpi.c
@@ -135,6 +135,24 @@ void do_LPI(unsigned int lpi)
     vgic_vcpu_inject_irq(vcpu, hlpi.virt_lpi);
 }
 
+int gicv3_lpi_update_host_entry(uint32_t host_lpi, int domain_id, int vcpu_id,
+                                uint32_t virt_lpi)
+{
+    union host_lpi *hlpip, hlpi;
+
+    host_lpi -= LPI_OFFSET;
+
+    hlpip = &lpi_data.host_lpis[host_lpi / HOST_LPIS_PER_PAGE][host_lpi % HOST_LPIS_PER_PAGE];
+
+    hlpi.virt_lpi = virt_lpi;
+    hlpi.dom_id = domain_id;
+    hlpi.vcpu_id = vcpu_id;
+
+    write_u64_atomic(&hlpip->data, hlpi.data);
+
+    return 0;
+}
+
 uint64_t gicv3_lpi_allocate_pendtable(void)
 {
     uint64_t reg;
diff --git a/xen/arch/arm/vgic-v3-its.c b/xen/arch/arm/vgic-v3-its.c
index 5be40d8..3e74f46 100644
--- a/xen/arch/arm/vgic-v3-its.c
+++ b/xen/arch/arm/vgic-v3-its.c
@@ -164,8 +164,8 @@ static bool read_itte(struct virt_its *its, uint32_t devid, uint32_t evid,
 }
 
 #define SKIP_LPI_UPDATE 1
-bool write_itte(struct virt_its *its, uint32_t devid, uint32_t evid,
-                uint32_t collid, uint32_t vlpi, struct vcpu **vcpu)
+static bool write_itte(struct virt_its *its, uint32_t devid, uint32_t evid,
+                       uint32_t collid, uint32_t vlpi, struct vcpu **vcpu)
 {
     struct vits_itte *itte;
 
@@ -310,6 +310,25 @@ static int its_handle_mapd(struct virt_its *its, uint64_t *cmdptr)
     return 0;
 }
 
+static int its_handle_mapti(struct virt_its *its, uint64_t *cmdptr)
+{
+    uint32_t devid = its_cmd_get_deviceid(cmdptr);
+    uint32_t eventid = its_cmd_get_id(cmdptr);
+    uint32_t intid = its_cmd_get_physical_id(cmdptr);
+    int collid = its_cmd_get_collection(cmdptr);
+    struct vcpu *vcpu;
+
+    if ( its_cmd_get_command(cmdptr) == GITS_CMD_MAPI )
+        intid = eventid;
+
+    if ( !write_itte(its, devid, eventid, collid, intid, &vcpu) )
+        return -1;
+
+    gicv3_assign_guest_event(its->d, devid, eventid, vcpu, intid);
+
+    return 0;
+}
+
 #define ITS_CMD_BUFFER_SIZE(baser)      ((((baser) & 0xff) + 1) << 12)
 
 static int vgic_its_handle_cmds(struct domain *d, struct virt_its *its,
@@ -342,6 +361,10 @@ static int vgic_its_handle_cmds(struct domain *d, struct virt_its *its,
         case GITS_CMD_MAPD:
             its_handle_mapd(its, cmdptr);
 	    break;
+        case GITS_CMD_MAPI:
+        case GITS_CMD_MAPTI:
+            its_handle_mapti(its, cmdptr);
+            break;
         case GITS_CMD_SYNC:
             /* We handle ITS commands synchronously, so we ignore SYNC. */
 	    break;
diff --git a/xen/include/asm-arm/gic_v3_its.h b/xen/include/asm-arm/gic_v3_its.h
index 0e6b06a..160ece3 100644
--- a/xen/include/asm-arm/gic_v3_its.h
+++ b/xen/include/asm-arm/gic_v3_its.h
@@ -159,6 +159,11 @@ int gicv3_allocate_host_lpi_block(struct host_its *its, struct domain *d,
                                   uint32_t host_devid, uint32_t eventid);
 int gicv3_free_host_lpi_block(struct host_its *its, uint32_t lpi);
 
+int gicv3_assign_guest_event(struct domain *d,
+                             uint32_t devid, uint32_t eventid,
+                             struct vcpu *v, uint32_t virt_lpi);
+int gicv3_lpi_update_host_entry(uint32_t host_lpi, int domain_id, int vcpu_id,
+                                uint32_t virt_lpi);
 #else
 
 static inline void gicv3_its_dt_init(const struct dt_device_node *node)
-- 
2.9.0



^ permalink raw reply related	[flat|nested] 106+ messages in thread

* [PATCH 22/28] ARM: vITS: handle MOVI command
  2017-01-30 18:31 [PATCH 00/28] arm64: Dom0 ITS emulation Andre Przywara
                   ` (20 preceding siblings ...)
  2017-01-30 18:31 ` [PATCH 21/28] ARM: vITS: handle MAPTI command Andre Przywara
@ 2017-01-30 18:31 ` Andre Przywara
  2017-01-30 18:31 ` [PATCH 23/28] ARM: vITS: handle DISCARD command Andre Przywara
                   ` (7 subsequent siblings)
  29 siblings, 0 replies; 106+ messages in thread
From: Andre Przywara @ 2017-01-30 18:31 UTC (permalink / raw)
  To: Stefano Stabellini, Julien Grall; +Cc: xen-devel, Vijay Kilari

The MOVI command moves the interrupt affinity from one redistributor
(read: VCPU) to another.
Migration of "live" LPIs is not yet implemented, but we store the
changed affinity in the host LPI structure and in our virtual ITTE.
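
A small sketch of why changing the affinity is cheap (illustration only,
not part of the patch): if the host LPI entry is a 64-bit union, a full
assignment is one 64-bit store, while a move only rewrites the 16-bit
VCPU field. The field layout and names below are assumptions, and the
plain stores stand in for the atomic write helpers the patch uses.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout of one host LPI entry. */
union sketch_host_lpi {
    uint64_t data;
    struct {
        uint32_t virt_lpi;
        uint16_t dom_id;
        uint16_t vcpu_id;
    };
};

static_assert(sizeof(union sketch_host_lpi) == sizeof(uint64_t),
              "entry must fit into one 64-bit word");

/* Full (re)assignment: build the entry locally, then store it in one go. */
static void sketch_assign(union sketch_host_lpi *entry, uint16_t dom_id,
                          uint16_t vcpu_id, uint32_t virt_lpi)
{
    union sketch_host_lpi tmp;

    tmp.virt_lpi = virt_lpi;
    tmp.dom_id = dom_id;
    tmp.vcpu_id = vcpu_id;

    entry->data = tmp.data;     /* the patch uses an atomic 64-bit write */
}

/* Affinity change: only the VCPU field is touched. */
static void sketch_move(union sketch_host_lpi *entry, uint16_t vcpu_id)
{
    entry->vcpu_id = vcpu_id;   /* the patch uses an atomic 16-bit write */
}
```

A reader that loads the whole 64-bit word therefore sees either the old
or the new VCPU, but never a half-updated domain/LPI pair.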

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
 xen/arch/arm/gic-v3-its.c        | 14 ++++++++++++++
 xen/arch/arm/gic-v3-lpi.c        | 13 +++++++++++++
 xen/arch/arm/vgic-v3-its.c       | 23 +++++++++++++++++++++++
 xen/include/asm-arm/gic_v3_its.h |  4 ++++
 4 files changed, 54 insertions(+)

diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
index 1268d64..d37b7b6 100644
--- a/xen/arch/arm/gic-v3-its.c
+++ b/xen/arch/arm/gic-v3-its.c
@@ -690,6 +690,20 @@ int gicv3_assign_guest_event(struct domain *d, uint32_t devid, uint32_t eventid,
     return 0;
 }
 
+/* Changes the target VCPU for a given host LPI assigned to a domain. */
+int gicv3_lpi_change_vcpu(struct domain *d,
+                          uint32_t devid, uint32_t eventid, int vcpu_id)
+{
+    uint32_t host_lpi = translate_event(d, devid, eventid);
+
+    if ( !host_lpi )
+        return -ENOENT;
+
+    gicv3_lpi_update_host_vcpuid(host_lpi, vcpu_id);
+
+    return 0;
+}
+
 /* Scan the DT for any ITS nodes and create a list of host ITSes out of it. */
 void gicv3_its_dt_init(const struct dt_device_node *node)
 {
diff --git a/xen/arch/arm/gic-v3-lpi.c b/xen/arch/arm/gic-v3-lpi.c
index 494ae22..66d6eff 100644
--- a/xen/arch/arm/gic-v3-lpi.c
+++ b/xen/arch/arm/gic-v3-lpi.c
@@ -153,6 +153,19 @@ int gicv3_lpi_update_host_entry(uint32_t host_lpi, int domain_id, int vcpu_id,
     return 0;
 }
 
+int gicv3_lpi_update_host_vcpuid(uint32_t host_lpi, int vcpu_id)
+{
+    union host_lpi *hlpip;
+
+    host_lpi -= LPI_OFFSET;
+
+    hlpip = &lpi_data.host_lpis[host_lpi / HOST_LPIS_PER_PAGE][host_lpi % HOST_LPIS_PER_PAGE];
+
+    write_u16_atomic(&hlpip->vcpu_id, vcpu_id);
+
+    return 0;
+}
+
 uint64_t gicv3_lpi_allocate_pendtable(void)
 {
     uint64_t reg;
diff --git a/xen/arch/arm/vgic-v3-its.c b/xen/arch/arm/vgic-v3-its.c
index 3e74f46..fb9caf0 100644
--- a/xen/arch/arm/vgic-v3-its.c
+++ b/xen/arch/arm/vgic-v3-its.c
@@ -329,6 +329,23 @@ static int its_handle_mapti(struct virt_its *its, uint64_t *cmdptr)
     return 0;
 }
 
+static int its_handle_movi(struct virt_its *its, uint64_t *cmdptr)
+{
+    uint32_t devid = its_cmd_get_deviceid(cmdptr);
+    uint32_t eventid = its_cmd_get_id(cmdptr);
+    int collid = its_cmd_get_collection(cmdptr);
+    struct vcpu *vcpu;
+
+    if ( !write_itte(its, devid, eventid, collid, SKIP_LPI_UPDATE, &vcpu) )
+        return -1;
+
+    /* TODO: lookup currently-in-guest virtual IRQs and migrate them */
+
+    gicv3_lpi_change_vcpu(its->d, devid, eventid, vcpu->vcpu_id);
+
+    return 0;
+}
+
 #define ITS_CMD_BUFFER_SIZE(baser)      ((((baser) & 0xff) + 1) << 12)
 
 static int vgic_its_handle_cmds(struct domain *d, struct virt_its *its,
@@ -365,6 +382,12 @@ static int vgic_its_handle_cmds(struct domain *d, struct virt_its *its,
         case GITS_CMD_MAPTI:
             its_handle_mapti(its, cmdptr);
             break;
+        case GITS_CMD_MOVALL:
+            gdprintk(XENLOG_G_INFO, "ITS: ignoring MOVALL command\n");
+            break;
+        case GITS_CMD_MOVI:
+            its_handle_movi(its, cmdptr);
+            break;
         case GITS_CMD_SYNC:
             /* We handle ITS commands synchronously, so we ignore SYNC. */
 	    break;
diff --git a/xen/include/asm-arm/gic_v3_its.h b/xen/include/asm-arm/gic_v3_its.h
index 160ece3..bac0fe5 100644
--- a/xen/include/asm-arm/gic_v3_its.h
+++ b/xen/include/asm-arm/gic_v3_its.h
@@ -162,8 +162,12 @@ int gicv3_free_host_lpi_block(struct host_its *its, uint32_t lpi);
 int gicv3_assign_guest_event(struct domain *d,
                              uint32_t devid, uint32_t eventid,
                              struct vcpu *v, uint32_t virt_lpi);
+int gicv3_lpi_change_vcpu(struct domain *d,
+                          uint32_t devid, uint32_t eventid, int vcpu_id);
 int gicv3_lpi_update_host_entry(uint32_t host_lpi, int domain_id, int vcpu_id,
                                 uint32_t virt_lpi);
+int gicv3_lpi_update_host_vcpuid(uint32_t host_lpi, int vcpu_id);
+
 #else
 
 static inline void gicv3_its_dt_init(const struct dt_device_node *node)
-- 
2.9.0



^ permalink raw reply related	[flat|nested] 106+ messages in thread

* [PATCH 23/28] ARM: vITS: handle DISCARD command
  2017-01-30 18:31 [PATCH 00/28] arm64: Dom0 ITS emulation Andre Przywara
                   ` (21 preceding siblings ...)
  2017-01-30 18:31 ` [PATCH 22/28] ARM: vITS: handle MOVI command Andre Przywara
@ 2017-01-30 18:31 ` Andre Przywara
  2017-01-30 18:31 ` [PATCH 24/28] ARM: vITS: handle INV command Andre Przywara
                   ` (6 subsequent siblings)
  29 siblings, 0 replies; 106+ messages in thread
From: Andre Przywara @ 2017-01-30 18:31 UTC (permalink / raw)
  To: Stefano Stabellini, Julien Grall; +Cc: xen-devel, Vijay Kilari

The DISCARD command drops the connection between a DeviceID/EventID
and an LPI/collection pair.
We mark the respective structure entries as not allocated and make
sure that any queued IRQs are removed.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
 xen/arch/arm/vgic-v3-its.c | 33 +++++++++++++++++++++++++++++++++
 1 file changed, 33 insertions(+)

diff --git a/xen/arch/arm/vgic-v3-its.c b/xen/arch/arm/vgic-v3-its.c
index fb9caf0..8747890 100644
--- a/xen/arch/arm/vgic-v3-its.c
+++ b/xen/arch/arm/vgic-v3-its.c
@@ -346,6 +346,36 @@ static int its_handle_movi(struct virt_its *its, uint64_t *cmdptr)
     return 0;
 }
 
+static int its_handle_discard(struct virt_its *its, uint64_t *cmdptr)
+{
+    uint32_t devid = its_cmd_get_deviceid(cmdptr);
+    uint32_t eventid = its_cmd_get_id(cmdptr);
+    struct pending_irq *pirq;
+    struct vcpu *vcpu;
+    uint32_t vlpi;
+
+    if ( !read_itte(its, devid, eventid, &vcpu, &vlpi) )
+        return -1;
+
+    pirq = lpi_to_pending(vcpu, vlpi, false);
+    if ( pirq )
+    {
+        clear_bit(GIC_IRQ_GUEST_QUEUED, &pirq->status);
+        gic_remove_from_queues(vcpu, vlpi);
+
+        /* Mark this pending IRQ struct as available again. */
+        if ( !test_bit(GIC_IRQ_GUEST_VISIBLE, &pirq->status) )
+            pirq->irq = 0;
+    }
+
+    if ( !write_itte(its, devid, eventid, ~0, INVALID_LPI, NULL) )
+        return -1;
+
+    gicv3_assign_guest_event(its->d, devid, eventid, vcpu, 0);
+
+    return 0;
+}
+
 #define ITS_CMD_BUFFER_SIZE(baser)      ((((baser) & 0xff) + 1) << 12)
 
 static int vgic_its_handle_cmds(struct domain *d, struct virt_its *its,
@@ -369,6 +399,9 @@ static int vgic_its_handle_cmds(struct domain *d, struct virt_its *its,
         case GITS_CMD_CLEAR:
             its_handle_clear(its, cmdptr);
             break;
+        case GITS_CMD_DISCARD:
+            its_handle_discard(its, cmdptr);
+            break;
         case GITS_CMD_INT:
             its_handle_int(its, cmdptr);
             break;
-- 
2.9.0



^ permalink raw reply related	[flat|nested] 106+ messages in thread

* [PATCH 24/28] ARM: vITS: handle INV command
  2017-01-30 18:31 [PATCH 00/28] arm64: Dom0 ITS emulation Andre Przywara
                   ` (22 preceding siblings ...)
  2017-01-30 18:31 ` [PATCH 23/28] ARM: vITS: handle DISCARD command Andre Przywara
@ 2017-01-30 18:31 ` Andre Przywara
  2017-01-30 18:31 ` [PATCH 25/28] ARM: vITS: handle INVALL command Andre Przywara
                   ` (5 subsequent siblings)
  29 siblings, 0 replies; 106+ messages in thread
From: Andre Przywara @ 2017-01-30 18:31 UTC (permalink / raw)
  To: Stefano Stabellini, Julien Grall; +Cc: xen-devel, Vijay Kilari

The INV command instructs the ITS to update the configuration data for
a given LPI by re-reading its entry from the property table.
We don't need to care much about the priority value, but enabling or
disabling an LPI has an effect: we remove virtual LPIs from their VCPUs
or push them there, and we check the virtual pending bit when an LPI
gets enabled.
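
As an illustration of the decision the handler makes (not part of the
patch, all names are made up): each LPI has one property byte, with bit 0
as the enable bit and bits [7:2] as the priority. The sketch below
reduces the handling to three outcomes and models the virtual pending
bit as a plain bool.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* LPI property byte: bit 0 is the enable bit, bits [7:2] the priority. */
#define SKETCH_LPI_PROP_ENABLED    0x01u
#define SKETCH_LPI_PROP_PRIO_MASK  0xfcu

enum sketch_action { SKETCH_NOTHING, SKETCH_INJECT, SKETCH_REMOVE };

static uint8_t sketch_prio(uint8_t property)
{
    return property & SKETCH_LPI_PROP_PRIO_MASK;
}

/*
 * Decide what to do after the guest changed the property byte of an LPI.
 * *pending models the LPI's bit in the virtual pending table.
 */
static enum sketch_action sketch_inv(uint8_t property, bool *pending)
{
    assert(pending != NULL);

    if ( !(property & SKETCH_LPI_PROP_ENABLED) )
        return SKETCH_REMOVE;       /* disabled: take it off the VCPU */

    if ( *pending )
    {
        *pending = false;           /* fired while disabled: deliver now */
        return SKETCH_INJECT;
    }

    return SKETCH_NOTHING;
}
```

The real code also has to deal with an LPI that is already in flight on
the VCPU, which this sketch does not model.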

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
 xen/arch/arm/vgic-v3-its.c | 57 ++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 57 insertions(+)

diff --git a/xen/arch/arm/vgic-v3-its.c b/xen/arch/arm/vgic-v3-its.c
index 8747890..82f7bcc 100644
--- a/xen/arch/arm/vgic-v3-its.c
+++ b/xen/arch/arm/vgic-v3-its.c
@@ -262,6 +262,60 @@ static int its_handle_int(struct virt_its *its, uint64_t *cmdptr)
     return 0;
 }
 
+/*
+ * For a given virtual LPI read the enabled bit from the virtual property
+ * table and update the virtual IRQ's state.
+ * This takes care of removing or pushing of virtual LPIs to their VCPUs.
+ */
+static void update_lpi_enabled_status(struct virt_its *its,
+                                      struct vcpu *vcpu, uint32_t vlpi)
+{
+    struct pending_irq *pirq = lpi_to_pending(vcpu, vlpi, false);
+    uint8_t property = its->d->arch.vgic.proptable[vlpi - LPI_OFFSET];
+
+    if ( property & LPI_PROP_ENABLED )
+    {
+        if ( pirq )
+        {
+            unsigned long flags;
+
+            set_bit(GIC_IRQ_GUEST_ENABLED, &pirq->status);
+            spin_lock_irqsave(&vcpu->arch.vgic.lock, flags);
+            if ( !list_empty(&pirq->inflight) &&
+                 !test_bit(GIC_IRQ_GUEST_VISIBLE, &pirq->status) )
+                gic_raise_guest_irq(vcpu, vlpi, property & LPI_PROP_PRIO_MASK);
+            spin_unlock_irqrestore(&vcpu->arch.vgic.lock, flags);
+        }
+
+        /* Check whether the LPI has fired while the guest had it disabled. */
+        if ( test_and_clear_bit(vlpi - LPI_OFFSET, vcpu->arch.vgic.pendtable) )
+            vgic_vcpu_inject_irq(vcpu, vlpi);
+    }
+    else
+    {
+        if ( pirq )
+        {
+            clear_bit(GIC_IRQ_GUEST_ENABLED, &pirq->status);
+            gic_remove_from_queues(vcpu, vlpi);
+        }
+    }
+}
+
+static int its_handle_inv(struct virt_its *its, uint64_t *cmdptr)
+{
+    uint32_t devid = its_cmd_get_deviceid(cmdptr);
+    uint32_t eventid = its_cmd_get_id(cmdptr);
+    struct vcpu *vcpu;
+    uint32_t vlpi;
+
+    if ( !read_itte(its, devid, eventid, &vcpu, &vlpi) )
+        return -1;
+
+    update_lpi_enabled_status(its, vcpu, vlpi);
+
+    return 0;
+}
+
 static int its_handle_mapc(struct virt_its *its, uint64_t *cmdptr)
 {
     uint32_t collid = its_cmd_get_collection(cmdptr);
@@ -405,6 +459,9 @@ static int vgic_its_handle_cmds(struct domain *d, struct virt_its *its,
         case GITS_CMD_INT:
             its_handle_int(its, cmdptr);
             break;
+        case GITS_CMD_INV:
+            its_handle_inv(its, cmdptr);
+	    break;
         case GITS_CMD_MAPC:
             its_handle_mapc(its, cmdptr);
             break;
-- 
2.9.0



^ permalink raw reply related	[flat|nested] 106+ messages in thread

* [PATCH 25/28] ARM: vITS: handle INVALL command
  2017-01-30 18:31 [PATCH 00/28] arm64: Dom0 ITS emulation Andre Przywara
                   ` (23 preceding siblings ...)
  2017-01-30 18:31 ` [PATCH 24/28] ARM: vITS: handle INV command Andre Przywara
@ 2017-01-30 18:31 ` Andre Przywara
  2017-01-30 18:31 ` [PATCH 26/28] ARM: vITS: create and initialize virtual ITSes for Dom0 Andre Przywara
                   ` (4 subsequent siblings)
  29 siblings, 0 replies; 106+ messages in thread
From: Andre Przywara @ 2017-01-30 18:31 UTC (permalink / raw)
  To: Stefano Stabellini, Julien Grall; +Cc: xen-devel, Vijay Kilari

The INVALL command instructs an ITS to invalidate the configuration
data for all LPIs associated with a given redistributor (read: VCPU).
This is nasty to emulate exactly with our architecture, so we just scan
the pending table and inject _every_ LPI found there that got enabled.
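
A sketch of such a pending table scan (illustration only, not part of
the patch; the helper names are hypothetical and the bit search is a
naive stand-in for find_next_bit()). Two details matter: bit N in the
table stands for LPI number 8192 + N, and the search position has to
move past each bit that was handled, because a bit that stays set would
otherwise be found again forever.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_LPI_BASE       8192u
#define SKETCH_BITS_PER_WORD  (sizeof(unsigned long) * CHAR_BIT)

/* Naive search: index of the first set bit >= start, or size if none. */
static unsigned int sketch_find_next_bit(const unsigned long *map,
                                         unsigned int size,
                                         unsigned int start)
{
    for ( unsigned int i = start; i < size; i++ )
        if ( map[i / SKETCH_BITS_PER_WORD] &
             (1UL << (i % SKETCH_BITS_PER_WORD)) )
            return i;

    return size;
}

/* Collect the LPI numbers of all pending bits, returns how many we found. */
static unsigned int sketch_collect_pending(const unsigned long *pendtable,
                                           unsigned int nr_lpis,
                                           uint32_t *out,
                                           unsigned int max_out)
{
    unsigned int bit = 0, found = 0;

    assert(pendtable != NULL && out != NULL);

    while ( found < max_out )
    {
        bit = sketch_find_next_bit(pendtable, nr_lpis, bit);
        if ( bit >= nr_lpis )
            break;

        out[found++] = SKETCH_LPI_BASE + bit;   /* bit index -> LPI number */
        bit++;                                  /* continue after this bit */
    }

    return found;
}
```

In the handler, each LPI number found this way would be passed on to
the per-LPI enable check instead of being stored in an array.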

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
 xen/arch/arm/vgic-v3-its.c | 37 +++++++++++++++++++++++++++++++++++++
 1 file changed, 37 insertions(+)

diff --git a/xen/arch/arm/vgic-v3-its.c b/xen/arch/arm/vgic-v3-its.c
index 82f7bcc..75d1e12 100644
--- a/xen/arch/arm/vgic-v3-its.c
+++ b/xen/arch/arm/vgic-v3-its.c
@@ -316,6 +316,40 @@ static int its_handle_inv(struct virt_its *its, uint64_t *cmdptr)
     return 0;
 }
 
+/*
+ * INVALL updates the per-LPI configuration status for every LPI mapped to
+ * a particular redistributor. Since our property table is referenced when
+ * needed, we don't need to sync anything, really. But we have to take care
+ * of LPIs getting enabled if there is an interrupt pending.
+ * To catch every LPI without iterating through the device table we just
+ * look for set bits in our virtual pending table and check the status of
+ * the enabled bit in the respective property table entry.
+ * This actually covers every (pending) LPI from every redistributor,
+ * but update_lpi_enabled_status() is a NOP for LPIs not being mapped
+ * to the redistributor/VCPU we are interested in.
+ */
+static int its_handle_invall(struct virt_its *its, uint64_t *cmdptr)
+{
+    uint32_t collid = its_cmd_get_collection(cmdptr);
+    struct vcpu *vcpu;
+    int vlpi = 0;
+
+    spin_lock(&its->its_lock);
+    vcpu = get_vcpu_from_collection(its, collid);
+    spin_unlock(&its->its_lock);
+
+    do {
+        vlpi = find_next_bit(vcpu->arch.vgic.pendtable,
+                             its->d->arch.vgic.nr_lpis, vlpi);
+        if ( vlpi >= its->d->arch.vgic.nr_lpis )
+            break;
+
+        update_lpi_enabled_status(its, vcpu, LPI_OFFSET + vlpi++);
+    } while (1);
+
+    return 0;
+}
+
 static int its_handle_mapc(struct virt_its *its, uint64_t *cmdptr)
 {
     uint32_t collid = its_cmd_get_collection(cmdptr);
@@ -462,6 +496,9 @@ static int vgic_its_handle_cmds(struct domain *d, struct virt_its *its,
         case GITS_CMD_INV:
             its_handle_inv(its, cmdptr);
 	    break;
+        case GITS_CMD_INVALL:
+            its_handle_invall(its, cmdptr);
+	    break;
         case GITS_CMD_MAPC:
             its_handle_mapc(its, cmdptr);
             break;
-- 
2.9.0



^ permalink raw reply related	[flat|nested] 106+ messages in thread

* [PATCH 26/28] ARM: vITS: create and initialize virtual ITSes for Dom0
  2017-01-30 18:31 [PATCH 00/28] arm64: Dom0 ITS emulation Andre Przywara
                   ` (24 preceding siblings ...)
  2017-01-30 18:31 ` [PATCH 25/28] ARM: vITS: handle INVALL command Andre Przywara
@ 2017-01-30 18:31 ` Andre Przywara
  2017-01-30 18:31 ` [PATCH 27/28] ARM: vITS: create ITS subnodes for Dom0 DT Andre Przywara
                   ` (3 subsequent siblings)
  29 siblings, 0 replies; 106+ messages in thread
From: Andre Przywara @ 2017-01-30 18:31 UTC (permalink / raw)
  To: Stefano Stabellini, Julien Grall; +Cc: xen-devel, Vijay Kilari

For each hardware ITS create and initialize a virtual ITS for Dom0.
We use the same memory mapped address to keep the doorbell working.
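
For reference, a sketch of how the advertised table registers are put
together (illustration only, not part of the patch). The field positions
are my reading of the GITS_BASER<n> layout in the GICv3 specification
(table type in bits [58:56], entry size minus one in bits [52:48]); the
names are made up for the example. The patch advertises a device table
with 8-byte entries and a collection table with 2-byte entries.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed GITS_BASER<n> field positions, see the lead-in above. */
#define SKETCH_BASER_TYPE_SHIFT        56
#define SKETCH_BASER_ENTRY_SIZE_SHIFT  48
#define SKETCH_BASER_TYPE_DEVICE       1ULL
#define SKETCH_BASER_TYPE_COLLECTION   4ULL

/* Compose the read-only part of a BASER value for one table type. */
static uint64_t sketch_baser(uint64_t type, unsigned int entry_bytes)
{
    /* The field is five bits wide and encodes "bytes per entry - 1". */
    assert(entry_bytes >= 1 && entry_bytes <= 32);

    return (type << SKETCH_BASER_TYPE_SHIFT) |
           ((uint64_t)(entry_bytes - 1) << SKETCH_BASER_ENTRY_SIZE_SHIFT);
}

static unsigned int sketch_baser_type(uint64_t baser)
{
    return (unsigned int)((baser >> SKETCH_BASER_TYPE_SHIFT) & 0x7);
}

static unsigned int sketch_baser_entry_bytes(uint64_t baser)
{
    return (unsigned int)((baser >> SKETCH_BASER_ENTRY_SIZE_SHIFT) & 0x1f) + 1;
}
```

The cacheability and shareability attributes that the patch ORs in are
left out here, as they do not change how the guest sizes its tables.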

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
 xen/arch/arm/vgic-v3-its.c       | 28 ++++++++++++++++++++++++++++
 xen/arch/arm/vgic-v3.c           | 12 ++++++++++++
 xen/include/asm-arm/domain.h     |  1 +
 xen/include/asm-arm/gic_v3_its.h | 10 ++++++++++
 4 files changed, 51 insertions(+)

diff --git a/xen/arch/arm/vgic-v3-its.c b/xen/arch/arm/vgic-v3-its.c
index 75d1e12..6404ebd 100644
--- a/xen/arch/arm/vgic-v3-its.c
+++ b/xen/arch/arm/vgic-v3-its.c
@@ -891,6 +891,34 @@ static const struct mmio_handler_ops vgic_its_mmio_handler = {
     .write = vgic_v3_its_mmio_write,
 };
 
+int vgic_v3_its_init_virtual(struct domain *d, paddr_t guest_addr)
+{
+    struct virt_its *its;
+    uint64_t base_attr;
+
+    its = xzalloc(struct virt_its);
+    if ( !its )
+        return -ENOMEM;
+
+    base_attr  = GIC_BASER_InnerShareable << GITS_BASER_SHAREABILITY_SHIFT;
+    base_attr |= GIC_BASER_CACHE_SameAsInner << GITS_BASER_OUTER_CACHEABILITY_SHIFT;
+    base_attr |= GIC_BASER_CACHE_RaWaWb << GITS_BASER_INNER_CACHEABILITY_SHIFT;
+
+    its->cbaser  = base_attr;
+    base_attr |= 0ULL << GITS_BASER_PAGE_SIZE_SHIFT;
+    its->baser0  = GITS_BASER_TYPE_DEVICE << GITS_BASER_TYPE_SHIFT;
+    its->baser0 |= (7ULL << GITS_BASER_ENTRY_SIZE_SHIFT) | base_attr;
+    its->baser1  = GITS_BASER_TYPE_COLLECTION << GITS_BASER_TYPE_SHIFT;
+    its->baser1 |= (1ULL << GITS_BASER_ENTRY_SIZE_SHIFT) | base_attr;
+    its->d = d;
+    spin_lock_init(&its->vcmd_lock);
+    spin_lock_init(&its->its_lock);
+
+    register_mmio_handler(d, &vgic_its_mmio_handler, guest_addr, SZ_64K, its);
+
+    return 0;
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c
index d1382be..8faec95 100644
--- a/xen/arch/arm/vgic-v3.c
+++ b/xen/arch/arm/vgic-v3.c
@@ -31,6 +31,7 @@
 #include <asm/current.h>
 #include <asm/mmio.h>
 #include <asm/gic_v3_defs.h>
+#include <asm/gic_v3_its.h>
 #include <asm/vgic.h>
 #include <asm/vgic-emul.h>
 #include <asm/vreg.h>
@@ -1651,6 +1652,7 @@ static int vgic_v3_domain_init(struct domain *d)
      */
     if ( is_hardware_domain(d) )
     {
+        struct host_its *hw_its;
         unsigned int first_cpu = 0;
 
         d->arch.vgic.dbase = vgic_v3_hw.dbase;
@@ -1676,6 +1678,16 @@ static int vgic_v3_domain_init(struct domain *d)
 
             first_cpu += size / d->arch.vgic.rdist_stride;
         }
+        d->arch.vgic.nr_regions = vgic_v3_hw.nr_rdist_regions;
+
+        list_for_each_entry(hw_its, &host_its_list, entry)
+        {
+            /* Emulate the control registers frame (lower 64K). */
+            vgic_v3_its_init_virtual(d, hw_its->addr);
+
+            d->arch.vgic.has_its = true;
+        }
+
     }
     else
     {
diff --git a/xen/include/asm-arm/domain.h b/xen/include/asm-arm/domain.h
index 33c1851..27cc310 100644
--- a/xen/include/asm-arm/domain.h
+++ b/xen/include/asm-arm/domain.h
@@ -115,6 +115,7 @@ struct arch_domain
         uint8_t *proptable;
         struct rb_root its_devices;         /* devices mapped to an ITS */
         spinlock_t its_devices_lock;        /* protects the its_devices tree */
+        bool has_its;
 #endif
     } vgic;
 
diff --git a/xen/include/asm-arm/gic_v3_its.h b/xen/include/asm-arm/gic_v3_its.h
index bac0fe5..e75249b 100644
--- a/xen/include/asm-arm/gic_v3_its.h
+++ b/xen/include/asm-arm/gic_v3_its.h
@@ -146,6 +146,12 @@ uint64_t gicv3_get_redist_address(int cpu, bool use_pta);
 /* Map a collection for this host CPU to each host ITS. */
 int gicv3_its_setup_collection(int cpu);
 
+/* Create and register a virtual ITS at the given guest address.
+ * For the hardware domain this is the address of the host ITS, so that
+ * the doorbell address which Dom0 programs into devices stays valid.
+ */
+int vgic_v3_its_init_virtual(struct domain *d, paddr_t guest_addr);
+
 /* Map a device on the host by allocating an ITT on the host (ITS).
  * "bits" specifies how many events (interrupts) this device will need.
  * Setting "valid" to false deallocates the device.
@@ -202,6 +208,10 @@ static inline int gicv3_its_map_guest_device(struct domain *d, int host_devid,
 {
     return -ENODEV;
 }
+static inline int vgic_v3_its_init_virtual(struct domain *d, paddr_t guest_addr)
+{
+    return 0;
+}
 
 #endif /* CONFIG_HAS_ITS */
 
-- 
2.9.0



^ permalink raw reply related	[flat|nested] 106+ messages in thread

* [PATCH 27/28] ARM: vITS: create ITS subnodes for Dom0 DT
  2017-01-30 18:31 [PATCH 00/28] arm64: Dom0 ITS emulation Andre Przywara
                   ` (25 preceding siblings ...)
  2017-01-30 18:31 ` [PATCH 26/28] ARM: vITS: create and initialize virtual ITSes for Dom0 Andre Przywara
@ 2017-01-30 18:31 ` Andre Przywara
  2017-01-30 18:31 ` [PATCH 28/28] ARM: vGIC: advertising LPI support Andre Przywara
                   ` (2 subsequent siblings)
  29 siblings, 0 replies; 106+ messages in thread
From: Andre Przywara @ 2017-01-30 18:31 UTC (permalink / raw)
  To: Stefano Stabellini, Julien Grall; +Cc: xen-devel, Vijay Kilari

Dom0 needs to see every ITS in the system to be able to use MSIs.
Create Dom0 DT nodes for each hardware ITS, keeping the register frame
address the same, as the doorbell address that the Dom0 drivers program
into the BARs has to match the hardware.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
 xen/arch/arm/gic-v3-its.c        | 73 ++++++++++++++++++++++++++++++++++++++++
 xen/arch/arm/gic-v3.c            |  4 ++-
 xen/include/asm-arm/gic_v3_its.h | 13 +++++++
 3 files changed, 89 insertions(+), 1 deletion(-)

diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
index d37b7b6..36839c9 100644
--- a/xen/arch/arm/gic-v3-its.c
+++ b/xen/arch/arm/gic-v3-its.c
@@ -704,6 +704,79 @@ int gicv3_lpi_change_vcpu(struct domain *d,
     return 0;
 }
 
+/*
+ * Create the respective guest DT nodes for a list of host ITSes.
+ * This copies the reg property, so the guest sees the ITS at the same address
+ * as the host.
+ */
+int gicv3_its_make_dt_nodes(struct list_head *its_list,
+                            const struct domain *d,
+                            const struct dt_device_node *gic,
+                            void *fdt)
+{
+    uint32_t len;
+    int res;
+    const void *prop = NULL;
+    const struct dt_device_node *its = NULL;
+    const struct host_its *its_data;
+
+    if ( list_empty(its_list) )
+        return 0;
+
+    /* The sub-nodes require the ranges property */
+    prop = dt_get_property(gic, "ranges", &len);
+    if ( !prop )
+    {
+        printk(XENLOG_ERR "Can't find ranges property for the gic node\n");
+        return -FDT_ERR_XEN(ENOENT);
+    }
+
+    res = fdt_property(fdt, "ranges", prop, len);
+    if ( res )
+        return res;
+
+    list_for_each_entry(its_data, its_list, entry)
+    {
+        its = its_data->dt_node;
+
+        res = fdt_begin_node(fdt, its->name);
+        if ( res )
+            return res;
+
+        res = fdt_property_string(fdt, "compatible", "arm,gic-v3-its");
+        if ( res )
+            return res;
+
+        res = fdt_property(fdt, "msi-controller", NULL, 0);
+        if ( res )
+            return res;
+
+        if ( its->phandle )
+        {
+            res = fdt_property_cell(fdt, "phandle", its->phandle);
+            if ( res )
+                return res;
+        }
+
+        /* Use the same reg regions as the ITS node in host DTB. */
+        prop = dt_get_property(its, "reg", &len);
+        if ( !prop )
+        {
+            printk(XENLOG_ERR "GICv3: Can't find ITS reg property.\n");
+            res = -FDT_ERR_XEN(ENOENT);
+            return res;
+        }
+
+        res = fdt_property(fdt, "reg", prop, len);
+        if ( res )
+            return res;
+
+        fdt_end_node(fdt);
+    }
+
+    return res;
+}
+
 /* Scan the DT for any ITS nodes and create a list of host ITSes out of it. */
 void gicv3_its_dt_init(const struct dt_device_node *node)
 {
diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
index 23cf33d..7041d48 100644
--- a/xen/arch/arm/gic-v3.c
+++ b/xen/arch/arm/gic-v3.c
@@ -1190,8 +1190,10 @@ static int gicv3_make_hwdom_dt_node(const struct domain *d,
 
     res = fdt_property(fdt, "reg", new_cells, len);
     xfree(new_cells);
+    if ( res )
+        return res;
 
-    return res;
+    return gicv3_its_make_dt_nodes(&host_its_list, d, gic, fdt);
 }
 
 static const hw_irq_controller gicv3_host_irq_type = {
diff --git a/xen/include/asm-arm/gic_v3_its.h b/xen/include/asm-arm/gic_v3_its.h
index e75249b..4f5a453 100644
--- a/xen/include/asm-arm/gic_v3_its.h
+++ b/xen/include/asm-arm/gic_v3_its.h
@@ -152,6 +152,12 @@ int gicv3_its_setup_collection(int cpu);
  */
 int vgic_v3_its_init_virtual(struct domain *d, paddr_t guest_addr);
 
+/* Given a list of ITSes, create the appropriate DT nodes for a domain. */
+int gicv3_its_make_dt_nodes(struct list_head *its_list,
+                            const struct domain *d,
+                            const struct dt_device_node *gic,
+                            void *fdt);
+
 /* Map a device on the host by allocating an ITT on the host (ITS).
  * "bits" specifies how many events (interrupts) this device will need.
  * Setting "valid" to false deallocates the device.
@@ -212,6 +218,13 @@ static inline int vgic_v3_its_init_virtual(struct domain *d, paddr_t guest_addr)
 {
     return 0;
 }
+static inline int gicv3_its_make_dt_nodes(struct list_head *its_list,
+                                       const struct domain *d,
+                                       const struct dt_device_node *gic,
+                                       void *fdt)
+{
+    return 0;
+}
 
 #endif /* CONFIG_HAS_ITS */
 
-- 
2.9.0



^ permalink raw reply related	[flat|nested] 106+ messages in thread

* [PATCH 28/28] ARM: vGIC: advertising LPI support
  2017-01-30 18:31 [PATCH 00/28] arm64: Dom0 ITS emulation Andre Przywara
                   ` (26 preceding siblings ...)
  2017-01-30 18:31 ` [PATCH 27/28] ARM: vITS: create ITS subnodes for Dom0 DT Andre Przywara
@ 2017-01-30 18:31 ` Andre Przywara
  2017-02-13 13:53 ` [PATCH 00/28] arm64: Dom0 ITS emulation Vijay Kilari
  2017-02-15 17:55 ` Julien Grall
  29 siblings, 0 replies; 106+ messages in thread
From: Andre Przywara @ 2017-01-30 18:31 UTC (permalink / raw)
  To: Stefano Stabellini, Julien Grall; +Cc: xen-devel, Vijay Kilari

To let a guest know about the availability of virtual LPIs, set the
respective bits in the virtual GIC registers and let a guest control
the LPI enable bit.
Only report the LPI capability if the host has initialized at least
one ITS.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
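Not part of the patch: a minimal, standalone sketch of the table-size
arithmetic that vgic_vcpu_enable_lpis() performs, to make the page counts
easier to check. It assumes 4K pages and LPI_OFFSET == 8192 (the first LPI
INTID). The helper names are made up for illustration and do not exist in Xen.

```c
#include <stdint.h>

#define PAGE_SIZE   4096u   /* assumption: 4K pages */
#define LPI_OFFSET  8192u   /* assumption: LPI INTIDs start at 8192 */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Hypothetical helper: number of LPIs implied by PROPBASER.IDbits (bits [4:0]). */
static uint64_t nr_lpis_from_propbaser(uint64_t propbaser)
{
    unsigned int id_bits = (unsigned int)(propbaser & 0x1f) + 1;
    uint64_t nr_ids = UINT64_C(1) << id_bits;

    /* Fewer than 14 ID bits leaves no room for any LPI. */
    return nr_ids > LPI_OFFSET ? nr_ids - LPI_OFFSET : 0;
}

/* Property table: one byte per LPI. */
static uint64_t proptable_pages(uint64_t nr_lpis)
{
    return DIV_ROUND_UP(nr_lpis, PAGE_SIZE);
}

/* Pending table: one bit per INTID, including the first 8192 non-LPI IDs. */
static uint64_t pendtable_pages(uint64_t nr_lpis)
{
    return DIV_ROUND_UP((nr_lpis + LPI_OFFSET) / 8, PAGE_SIZE);
}
```

With an IDbits field of 15 (16 ID bits, matching the irq_bits = 16 advertised
in GICD_TYPER below), this works out to 57344 LPIs, 14 property table pages
and 2 pending table pages per VCPU.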
 xen/arch/arm/vgic-v3.c | 88 +++++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 83 insertions(+), 5 deletions(-)

diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c
index 8faec95..6d5b7f4 100644
--- a/xen/arch/arm/vgic-v3.c
+++ b/xen/arch/arm/vgic-v3.c
@@ -169,8 +169,10 @@ static int __vgic_v3_rdistr_rd_mmio_read(struct vcpu *v, mmio_info_t *info,
     switch ( gicr_reg )
     {
     case VREG32(GICR_CTLR):
-        /* We have not implemented LPI's, read zero */
-        goto read_as_zero_32;
+        if ( dabt.size != DABT_WORD ) goto bad_width;
+        *r = vgic_reg32_extract(!!(v->arch.vgic.flags & VGIC_V3_LPIS_ENABLED),
+                                info);
+        return 1;
 
     case VREG32(GICR_IIDR):
         if ( dabt.size != DABT_WORD ) goto bad_width;
@@ -182,16 +184,19 @@ static int __vgic_v3_rdistr_rd_mmio_read(struct vcpu *v, mmio_info_t *info,
         uint64_t typer, aff;
 
         if ( !vgic_reg64_check_access(dabt) ) goto bad_width;
-        /* TBD: Update processor id in [23:8] when ITS support is added */
         aff = (MPIDR_AFFINITY_LEVEL(v->arch.vmpidr, 3) << 56 |
                MPIDR_AFFINITY_LEVEL(v->arch.vmpidr, 2) << 48 |
                MPIDR_AFFINITY_LEVEL(v->arch.vmpidr, 1) << 40 |
                MPIDR_AFFINITY_LEVEL(v->arch.vmpidr, 0) << 32);
         typer = aff;
+        typer |= (v->vcpu_id & 0xffff) << 8;
 
         if ( v->arch.vgic.flags & VGIC_V3_RDIST_LAST )
             typer |= GICR_TYPER_LAST;
 
+        if ( v->domain->arch.vgic.has_its )
+            typer |= GICR_TYPER_PLPIS;
+
         *r = vgic_reg64_extract(typer, info);
 
         return 1;
@@ -502,6 +507,40 @@ bool vgic_can_inject_lpi(struct vcpu *vcpu, uint32_t vlpi)
     return false;
 }
 
+static void vgic_vcpu_enable_lpis(struct vcpu *v)
+{
+    uint64_t reg = v->domain->arch.vgic.rdist_propbase;
+    unsigned int nr_lpis = BIT((reg & 0x1f) + 1) - LPI_OFFSET;
+    int nr_pages;
+
+    /* The first VCPU to enable LPIs maps the property table. */
+    if ( !v->domain->arch.vgic.proptable )
+    {
+        v->domain->arch.vgic.nr_lpis = nr_lpis;
+        nr_pages = DIV_ROUND_UP(nr_lpis, PAGE_SIZE);
+
+        get_guest_pages(v->domain, reg & GENMASK(51, 12), nr_pages);
+        v->domain->arch.vgic.proptable = map_guest_pages(v->domain,
+                                                         reg & GENMASK(51, 12),
+                                                         nr_pages);
+        printk("VGIC-v3: VCPU%d mapped %d pages for property table\n",
+               v->vcpu_id, nr_pages);
+    }
+    nr_pages = DIV_ROUND_UP(((nr_lpis + LPI_OFFSET) / 8), PAGE_SIZE);
+    reg = v->arch.vgic.rdist_pendbase;
+
+    get_guest_pages(v->domain, reg & GENMASK(51, 12), nr_pages);
+    v->arch.vgic.pendtable = map_guest_pages(v->domain,
+                                             reg & GENMASK(51, 12), nr_pages);
+
+    printk("VGIC-v3: VCPU%d mapped %d pages for pending table\n",
+           v->vcpu_id, nr_pages);
+
+    v->arch.vgic.flags |= VGIC_V3_LPIS_ENABLED;
+
+    printk("VGICv3: enabled %d LPIs for VCPU%d\n", nr_lpis, v->vcpu_id);
+}
+
 static int __vgic_v3_rdistr_rd_mmio_write(struct vcpu *v, mmio_info_t *info,
                                           uint32_t gicr_reg,
                                           register_t r)
@@ -512,8 +551,18 @@ static int __vgic_v3_rdistr_rd_mmio_write(struct vcpu *v, mmio_info_t *info,
     switch ( gicr_reg )
     {
     case VREG32(GICR_CTLR):
-        /* LPI's not implemented */
-        goto write_ignore_32;
+        if ( dabt.size != DABT_WORD ) goto bad_width;
+        if ( !v->domain->arch.vgic.has_its )
+            return 1;
+
+        /* LPIs can only be enabled once, but never disabled again. */
+        if ( !(r & GICR_CTLR_ENABLE_LPIS) ||
+             (v->arch.vgic.flags & VGIC_V3_LPIS_ENABLED) )
+            return 1;
+
+        vgic_vcpu_enable_lpis(v);
+
+        return 1;
 
     case VREG32(GICR_IIDR):
         /* RO */
@@ -1113,6 +1162,11 @@ static int vgic_v3_distr_mmio_read(struct vcpu *v, mmio_info_t *info,
         typer = ((ncpus - 1) << GICD_TYPE_CPUS_SHIFT |
                  DIV_ROUND_UP(v->domain->arch.vgic.nr_spis, 32));
 
+        if ( v->domain->arch.vgic.has_its )
+        {
+            typer |= GICD_TYPE_LPIS;
+            irq_bits = 16;
+        }
         typer |= (irq_bits - 1) << GICD_TYPE_ID_BITS_SHIFT;
 
         *r = vgic_reg32_extract(typer, info);
@@ -1729,6 +1783,30 @@ static int vgic_v3_domain_init(struct domain *d)
 
 static void vgic_v3_domain_free(struct domain *d)
 {
+    int nr_pages;
+    struct vcpu *v;
+
+    if ( d->arch.vgic.proptable )
+    {
+        nr_pages = DIV_ROUND_UP(d->arch.vgic.nr_lpis, PAGE_SIZE);
+
+        unmap_guest_pages(d->arch.vgic.proptable, nr_pages);
+        put_guest_pages(d, d->arch.vgic.rdist_propbase & GENMASK(51, 12),
+                        nr_pages);
+        printk("VGICv3: freeing PROPBASE for Domain%d\n", d->domain_id);
+    }
+
+    nr_pages = DIV_ROUND_UP((d->arch.vgic.nr_lpis + LPI_OFFSET) / 8, PAGE_SIZE);
+    for_each_vcpu(d, v)
+    {
+        if ( !v->arch.vgic.pendtable )
+            continue;
+
+        unmap_guest_pages(v->arch.vgic.pendtable, nr_pages);
+        put_guest_pages(d, v->arch.vgic.rdist_pendbase & GENMASK(51, 12),
+                        nr_pages);
+        printk("VGICv3: freeing PENDBASE for VCPU%d\n", v->vcpu_id);
+    }
     xfree(d->arch.vgic.rdist_regions);
 }
 
-- 
2.9.0



^ permalink raw reply related	[flat|nested] 106+ messages in thread

* Re: [PATCH 09/28] ARM: GICv3 ITS: map device and LPIs to the ITS on physdev_op hypercall
  2017-01-30 18:31 ` [PATCH 09/28] ARM: GICv3 ITS: map device and LPIs to the ITS on physdev_op hypercall Andre Przywara
@ 2017-01-31 10:29   ` Jaggi, Manish
  2017-01-31 12:43     ` Julien Grall
  2017-02-14 20:11   ` Stefano Stabellini
  1 sibling, 1 reply; 106+ messages in thread
From: Jaggi, Manish @ 2017-01-31 10:29 UTC (permalink / raw)
  To: Andre Przywara, Stefano Stabellini, Julien Grall
  Cc: xen-devel, Nair, Jayachandran, Vijay Kilari, Kapoor, Prasun


Hi Andre,
>
>From: Xen-devel <xen-devel-bounces@lists.xen.org> on behalf of Andre Przywara <andre.przywara@arm.com>
>Sent: Tuesday, January 31, 2017 12:01 AM
>To: Stefano Stabellini; Julien Grall
>Cc: xen-devel@lists.xenproject.org; Vijay Kilari
>Subject: [Xen-devel] [PATCH 09/28] ARM: GICv3 ITS: map device and LPIs to the ITS on physdev_op hypercall
>    
[snip]
> 
> 
> int do_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
> {
>+    struct physdev_manage_pci manage;
>+    u32 devid;
>+    int ret;
>+
>+    switch (cmd)
>+    {

You might also need to handle the PHYSDEVOP_pci_device_add hypercall.

>+        case PHYSDEVOP_manage_pci_add:
>+        case PHYSDEVOP_manage_pci_remove:
>+            if ( copy_from_guest(&manage, arg, 1) != 0 )
>+                return -EFAULT;
>+
>+            devid = manage.bus << 8 | manage.devfn;
>+            /* Allocate an ITS device table with space for 32 MSIs */
>+            ret = gicv3_its_map_guest_device(hardware_domain, devid, devid, 5,
>+                                             cmd == PHYSDEVOP_manage_pci_add);

Based on the 4.9 kernel, is the device ID the plain sBDF, or is it returned from of_msi_map_rid / iort_msi_map_rid?
I believe this needs to be set as a requirement on the caller of the hypercall,
as the sBDF and the DeviceID returned from the msi_map calls might not be the same.

>+
>+            return ret;
>+    }
>+
>     gdprintk(XENLOG_DEBUG, "PHYSDEVOP cmd=%d: not implemented\n", cmd);
>     return -ENOSYS;
> }
>-- 
>2.9.0
>

-Manish



^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [PATCH 09/28] ARM: GICv3 ITS: map device and LPIs to the ITS on physdev_op hypercall
  2017-01-31 10:29   ` Jaggi, Manish
@ 2017-01-31 12:43     ` Julien Grall
  2017-01-31 13:19       ` Jaggi, Manish
  2017-01-31 13:28       ` Jaggi, Manish
  0 siblings, 2 replies; 106+ messages in thread
From: Julien Grall @ 2017-01-31 12:43 UTC (permalink / raw)
  To: Jaggi, Manish, Andre Przywara, Stefano Stabellini
  Cc: xen-devel, Nair, Jayachandran, Vijay Kilari, Kapoor, Prasun



On 31/01/17 10:29, Jaggi, Manish wrote:
>>
>> From: Xen-devel <xen-devel-bounces@lists.xen.org> on behalf of Andre Przywara <andre.przywara@arm.com>
>> Sent: Tuesday, January 31, 2017 12:01 AM
>> To: Stefano Stabellini; Julien Grall
>> Cc: xen-devel@lists.xenproject.org; Vijay Kilari
>> Subject: [Xen-devel] [PATCH 09/28] ARM: GICv3 ITS: map device and LPIs to the ITS on physdev_op hypercall
>>
> [snip]
>>
>>
>>  int do_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>>  {
>> +    struct physdev_manage_pci manage;
>> +    u32 devid;
>> +    int ret;
>> +
>> +    switch (cmd)
>> +    {
>
> You might also need to handle the PHYSDEVOP_pci_device_add hypercall.
>
>> +        case PHYSDEVOP_manage_pci_add:
>> +        case PHYSDEVOP_manage_pci_remove:
>> +            if ( copy_from_guest(&manage, arg, 1) != 0 )
>> +                return -EFAULT;
>> +
>> +            devid = manage.bus << 8 | manage.devfn;
>> +            /* Allocate an ITS device table with space for 32 MSIs */
>> +            ret = gicv3_its_map_guest_device(hardware_domain, devid, devid, 5,
>> +                                             cmd == PHYSDEVOP_manage_pci_add);
>
> Based on the 4.9 kernel, is the device ID the plain sBDF, or is it returned from of_msi_map_rid / iort_msi_map_rid?
> I believe this needs to be set as a requirement on the caller of the hypercall.

The requirement of the hypercall is already defined and cannot be 
changed. So if it does not provide the correct information, then we need 
to find another way to get the DeviceID.

In the case of ACPI, we should be able to get that information from the
IORT, as the segment number is defined in the firmware tables. But for
Device Tree, we would need DOM0 and Xen to agree on the segment number.

However, I am not sure whether we are going to need those hypercalls
once Xen gains support for PCI. There is some discussion about letting Xen
scan the PCI devices itself, in which case the hypercalls would not be used.

Today, the hypercall is called by Linux on ARM, but that might not be the
case in the future. If we decide to implement it today, it means that we
will not be able to remove it from Linux, for compatibility reasons.

So I would be more in favor of having a per-platform list of devices to
support for the time being, so that we can get the GICv3 ITS working with
Device Tree until Xen gains support for PCI. Stefano, Andre, any opinions?

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [PATCH 09/28] ARM: GICv3 ITS: map device and LPIs to the ITS on physdev_op hypercall
  2017-01-31 12:43     ` Julien Grall
@ 2017-01-31 13:19       ` Jaggi, Manish
  2017-01-31 13:46         ` Julien Grall
  2017-01-31 13:28       ` Jaggi, Manish
  1 sibling, 1 reply; 106+ messages in thread
From: Jaggi, Manish @ 2017-01-31 13:19 UTC (permalink / raw)
  To: Julien Grall, Jaggi, Manish, Andre Przywara, Stefano Stabellini
  Cc: xen-devel, Nair, Jayachandran, Vijay Kilari, Kapoor, Prasun

Hi Julien,


On 1/31/2017 6:13 PM, Julien Grall wrote:
>
>
> On 31/01/17 10:29, Jaggi, Manish wrote:
>>>
>>> From: Xen-devel <xen-devel-bounces@lists.xen.org> on behalf of Andre 
>>> Przywara <andre.przywara@arm.com>
>>> Sent: Tuesday, January 31, 2017 12:01 AM
>>> To: Stefano Stabellini; Julien Grall
>>> Cc: xen-devel@lists.xenproject.org; Vijay Kilari
>>> Subject: [Xen-devel] [PATCH 09/28] ARM: GICv3 ITS: map device and 
>>> LPIs to the ITS on physdev_op hypercall
>>>
>> [snip]
>>>
>>>
>>>  int do_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>>>  {
>>> +    struct physdev_manage_pci manage;
>>> +    u32 devid;
>>> +    int ret;
>>> +
>>> +    switch (cmd)
>>> +    {
>>
>> You might also need to handle the PHYSDEVOP_pci_device_add hypercall.
>>
>>> +        case PHYSDEVOP_manage_pci_add:
>>> +        case PHYSDEVOP_manage_pci_remove:
>>> +            if ( copy_from_guest(&manage, arg, 1) != 0 )
>>> +                return -EFAULT;
>>> +
>>> +            devid = manage.bus << 8 | manage.devfn;
>>> +            /* Allocate an ITS device table with space for 32 MSIs */
>>> +            ret = gicv3_its_map_guest_device(hardware_domain, 
>>> devid, devid, 5,
>>> +                                             cmd == 
>>> PHYSDEVOP_manage_pci_add);
>>
>> Based on the 4.9 kernel, is the device ID the plain sBDF, or is it
>> returned from of_msi_map_rid / iort_msi_map_rid?
>> I believe this needs to be set as a requirement on the caller of the
>> hypercall.
>
> The requirement of the hypercall is already defined and cannot be 
> changed. So if it does not provide the correct information, then we 
> need to find another way to get the DeviceID.
>
Do you think sbdf and device ID are same ? If you recollect your 
comments last year sbdf != DeviceID.
For this series it has to be passed correctly, otherwise the ITS would be programmed incorrectly.
I suggest that this series include another way as well.

> In the case of ACPI, we should be able to get that information from the
> IORT, as the segment number is defined in the firmware tables. But for
> Device Tree, we would need DOM0 and Xen to agree on the segment number.
>
Is there any agreement hypercall used with this series ?
> However, I am not sure whether we are going to need those hypercalls
> once Xen gains support for PCI. There is some discussion about letting
> Xen scan the PCI devices itself, in which case the hypercalls would not be used.
>
> Today, the hypercall is called by Linux on ARM, but that might not be
> the case in the future. If we decide to implement it today, it means
> that we will not be able to remove it from Linux, for compatibility
> reasons.
>
> So I would be more in favor of having a per-platform list of devices
> to support for the time being, so that we can get the GICv3 ITS working with
> Device Tree until Xen gains support for PCI. Stefano, Andre, any opinions?
> Cheers,
>



^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [PATCH 09/28] ARM: GICv3 ITS: map device and LPIs to the ITS on physdev_op hypercall
  2017-01-31 12:43     ` Julien Grall
  2017-01-31 13:19       ` Jaggi, Manish
@ 2017-01-31 13:28       ` Jaggi, Manish
  1 sibling, 0 replies; 106+ messages in thread
From: Jaggi, Manish @ 2017-01-31 13:28 UTC (permalink / raw)
  To: Julien Grall, Jaggi, Manish, Andre Przywara, Stefano Stabellini
  Cc: xen-devel, Nair, Jayachandran, Vijay Kilari, Kapoor, Prasun

Hi Julien,


On 1/31/2017 6:13 PM, Julien Grall wrote:
>
>
> On 31/01/17 10:29, Jaggi, Manish wrote:
>>>
>>> From: Xen-devel <xen-devel-bounces@lists.xen.org> on behalf of Andre 
>>> Przywara <andre.przywara@arm.com>
>>> Sent: Tuesday, January 31, 2017 12:01 AM
>>> To: Stefano Stabellini; Julien Grall
>>> Cc: xen-devel@lists.xenproject.org; Vijay Kilari
>>> Subject: [Xen-devel] [PATCH 09/28] ARM: GICv3 ITS: map device and 
>>> LPIs to the ITS on physdev_op hypercall
>>>
>> [snip]
>>>
>>>
>>>  int do_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>>>  {
>>> +    struct physdev_manage_pci manage;
>>> +    u32 devid;
>>> +    int ret;
>>> +
>>> +    switch (cmd)
>>> +    {
>>
>> You might also need to handle the PHYSDEVOP_pci_device_add hypercall.
>>
>>> +        case PHYSDEVOP_manage_pci_add:
>>> +        case PHYSDEVOP_manage_pci_remove:
>>> +            if ( copy_from_guest(&manage, arg, 1) != 0 )
>>> +                return -EFAULT;
>>> +
>>> +            devid = manage.bus << 8 | manage.devfn;
>>> +            /* Allocate an ITS device table with space for 32 MSIs */
>>> +            ret = gicv3_its_map_guest_device(hardware_domain, 
>>> devid, devid, 5,
>>> +                                             cmd == 
>>> PHYSDEVOP_manage_pci_add);
>>
>> Based on the 4.9 kernel, is the device ID the plain sBDF, or is it
>> returned from of_msi_map_rid / iort_msi_map_rid?
>> I believe this needs to be set as a requirement on the caller of the
>> hypercall.
>
> The requirement of the hypercall is already defined and cannot be 
> changed. So if it does not provide the correct information, then we 
> need to find another way to get the DeviceID.
>
Do you think sbdf and device ID are same ? If you recollect your 
comments last year sbdf != DeviceID.
For this series it has to be passed correctly, otherwise the ITS is programmed with the wrong DeviceID.
I suggest that this series include another way as well.
> In the case of ACPI, we should be able to get that information from the
> IORT, as the segment number is defined in the firmware tables. But for
> Device Tree, we would need DOM0 and Xen to agree on the segment number.
>
Is there any agreement hypercall used with this series ?
> However, I am not sure whether we are going to need those hypercalls
> once Xen gains support for PCI. There is some discussion about letting
> Xen scan the PCI devices itself, in which case the hypercalls would not be used.
>
> Today, the hypercall is called by Linux on ARM, but that might not be
> the case in the future. If we decide to implement it today, it means
> that we will not be able to remove it from Linux, for compatibility
> reasons.
>
> So I would be more in favor of having a per-platform list of devices
> to support for the time being, so that we can get the GICv3 ITS working with
> Device Tree until Xen gains support for PCI. Stefano, Andre, any opinions?
> Cheers,
>



^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [PATCH 09/28] ARM: GICv3 ITS: map device and LPIs to the ITS on physdev_op hypercall
  2017-01-31 13:19       ` Jaggi, Manish
@ 2017-01-31 13:46         ` Julien Grall
  2017-01-31 14:08           ` Jaggi, Manish
  0 siblings, 1 reply; 106+ messages in thread
From: Julien Grall @ 2017-01-31 13:46 UTC (permalink / raw)
  To: Jaggi, Manish, Andre Przywara, Stefano Stabellini
  Cc: xen-devel, Nair, Jayachandran, Vijay Kilari, Kapoor, Prasun

On 31/01/17 13:19, Jaggi, Manish wrote:
> On 1/31/2017 6:13 PM, Julien Grall wrote:
>> On 31/01/17 10:29, Jaggi, Manish wrote:
>>>>
>>>> From: Xen-devel <xen-devel-bounces@lists.xen.org> on behalf of Andre
>>>> Przywara <andre.przywara@arm.com>
>>>> Sent: Tuesday, January 31, 2017 12:01 AM
>>>> To: Stefano Stabellini; Julien Grall
>>>> Cc: xen-devel@lists.xenproject.org; Vijay Kilari
>>>> Subject: [Xen-devel] [PATCH 09/28] ARM: GICv3 ITS: map device and
>>>> LPIs to the ITS on physdev_op hypercall
>>>>
>>> [snip]
>>>>
>>>>
>>>>  int do_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>>>>  {
>>>> +    struct physdev_manage_pci manage;
>>>> +    u32 devid;
>>>> +    int ret;
>>>> +
>>>> +    switch (cmd)
>>>> +    {
>>>
>>> You might also need to handle the PHYSDEVOP_pci_device_add hypercall.
>>>
>>>> +        case PHYSDEVOP_manage_pci_add:
>>>> +        case PHYSDEVOP_manage_pci_remove:
>>>> +            if ( copy_from_guest(&manage, arg, 1) != 0 )
>>>> +                return -EFAULT;
>>>> +
>>>> +            devid = manage.bus << 8 | manage.devfn;
>>>> +            /* Allocate an ITS device table with space for 32 MSIs */
>>>> +            ret = gicv3_its_map_guest_device(hardware_domain,
>>>> devid, devid, 5,
>>>> +                                             cmd ==
>>>> PHYSDEVOP_manage_pci_add);
>>>
>>> Based on the 4.9 kernel, is the device ID the plain sBDF, or is it
>>> returned from of_msi_map_rid / iort_msi_map_rid?
>>> I believe this needs to be set as a requirement on the caller of the
>>> hypercall.
>>
>> The requirement of the hypercall is already defined and cannot be
>> changed. So if it does not provide the correct information, then we
>> need to find another way to get the DeviceID.
>>
> Do you think sbdf and device ID are same ? If you recollect your
> comments last year sbdf != DeviceID.
> For this series it has to be passed correctly, otherwise the ITS would be programmed incorrectly.
> I suggest that this series include another way as well.

Thank you, Sherlock. If you had read my e-mail in full, you would have
noticed that I never said sbdf == DeviceID; I actually provided insight
into the problem and suggested solutions.

I would recommend that you do the same in the future. It would help to get
the code into Xen much faster.

>
>> In the case of ACPI, we should be able to get that information from the
>> IORT, as the segment number is defined in the firmware tables. But for
>> Device Tree, we would need DOM0 and Xen to agree on the segment number.
>>
> Is there any agreement hypercall used with this series ?

 From xen/include/public/physdev.h

struct physdev_manage_pci {
     /* IN */
     uint8_t bus;
     uint8_t devfn;
};

struct physdev_manage_pci_ext {
     /* IN */
     uint8_t bus;
     uint8_t devfn;
     unsigned is_extfn;
     unsigned is_virtfn;
     struct {
         uint8_t bus;
         uint8_t devfn;
     } physfn;
};

Let me know how you could encode a DeviceID in those hypercalls.
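
To illustrate the gap (a sketch only, not code from this series): the two
fields above yield just a 16-bit requester ID with no PCI segment, and the
ITS DeviceID may additionally be an offset translation of that ID, as an
"msi-map" style entry in the Device Tree describes. The struct and helper
names below are hypothetical.

```c
#include <stdint.h>

/* 16-bit requester ID as the patch derives it from the hypercall arguments. */
static inline uint32_t rid_from_bus_devfn(uint8_t bus, uint8_t devfn)
{
    return ((uint32_t)bus << 8) | devfn;
}

/* Hypothetical linear mapping, modelled on one msi-map style entry. */
struct rid_map {
    uint32_t rid_base;  /* first requester ID covered by the entry */
    uint32_t dev_base;  /* DeviceID that rid_base translates to */
    uint32_t length;    /* number of consecutive IDs covered */
};

/* Translate a requester ID into a DeviceID; returns -1 if it is not covered. */
static int map_rid_to_devid(const struct rid_map *m, uint32_t rid,
                            uint32_t *devid)
{
    if ( rid < m->rid_base || rid - m->rid_base >= m->length )
        return -1;

    *devid = m->dev_base + (rid - m->rid_base);
    return 0;
}
```

With a mapping of { 0x0, 0x10000, 0x10000 }, bus 0x01 / devfn 0x10 gives
requester ID 0x0110 but DeviceID 0x10110, which is not something the
bus/devfn pair alone can express.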

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [PATCH 09/28] ARM: GICv3 ITS: map device and LPIs to the ITS on physdev_op hypercall
  2017-01-31 13:46         ` Julien Grall
@ 2017-01-31 14:08           ` Jaggi, Manish
  2017-01-31 15:17             ` Julien Grall
  0 siblings, 1 reply; 106+ messages in thread
From: Jaggi, Manish @ 2017-01-31 14:08 UTC (permalink / raw)
  To: Julien Grall, Jaggi, Manish, Andre Przywara, Stefano Stabellini
  Cc: xen-devel, Nair, Jayachandran, Vijay Kilari, Kapoor, Prasun

Hi Julien, 

On 1/31/2017 7:16 PM, Julien Grall wrote:
> On 31/01/17 13:19, Jaggi, Manish wrote:
>> On 1/31/2017 6:13 PM, Julien Grall wrote:
>>> On 31/01/17 10:29, Jaggi, Manish wrote:
>>>>>
>>>>> From: Xen-devel <xen-devel-bounces@lists.xen.org> on behalf of Andre
>>>>> Przywara <andre.przywara@arm.com>
>>>>> Sent: Tuesday, January 31, 2017 12:01 AM
>>>>> To: Stefano Stabellini; Julien Grall
>>>>> Cc: xen-devel@lists.xenproject.org; Vijay Kilari
>>>>> Subject: [Xen-devel] [PATCH 09/28] ARM: GICv3 ITS: map device and
>>>>> LPIs to the ITS on physdev_op hypercall
>>>>>
>>>> [snip]
>>>>>
>>>>>
>>>>>  int do_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>>>>>  {
>>>>> +    struct physdev_manage_pci manage;
>>>>> +    u32 devid;
>>>>> +    int ret;
>>>>> +
>>>>> +    switch (cmd)
>>>>> +    { 
>>>>
>>>> You might also need to handle the PHYSDEVOP_pci_device_add hypercall.
>>>>
>>>>> +        case PHYSDEVOP_manage_pci_add:
>>>>> +        case PHYSDEVOP_manage_pci_remove:
>>>>> +            if ( copy_from_guest(&manage, arg, 1) != 0 )
>>>>> +                return -EFAULT;
>>>>> +
>>>>> +            devid = manage.bus << 8 | manage.devfn;
>>>>> +            /* Allocate an ITS device table with space for 32 MSIs */
>>>>> +            ret = gicv3_its_map_guest_device(hardware_domain,
>>>>> devid, devid, 5,
>>>>> +                                             cmd ==
>>>>> PHYSDEVOP_manage_pci_add); 
>>>>
>>>> Based on the 4.9 kernel, is the device ID the plain sBDF, or is it
>>>> returned from of_msi_map_rid / iort_msi_map_rid?
>>>> I believe this needs to be set as a requirement on the caller of the
>>>> hypercall.
>>>
>>> The requirement of the hypercall is already defined and cannot be
>>> changed. So if it does not provide the correct information, then we
>>> need to find another way to get the DeviceID.
>>>
>> Do you think sbdf and device ID are same ? If you recollect your
>> comments last year sbdf != DeviceID.
>> For this series it has to be passed correctly, otherwise the ITS would be programmed incorrectly.
>> I suggest that this series include another way as well.
>
> Thank you, Sherlock. If you had read my e-mail in full, you would have noticed that I never said sbdf == DeviceID; I actually provided insight into the problem and suggested solutions.
>
If you read four lines above, I wrote sbdf != DeviceID.
> I would recommend that you do the same in the future. It would help to get the code into Xen much faster.
>
>>
>>> In the case of ACPI, we should be able to get that information from the
>>> IORT, as the segment number is defined in the firmware tables. But for
>>> Device Tree, we would need DOM0 and Xen to agree on the segment number.
>>>
>> Is there any agreement hypercall used with this series ? 
>
> From xen/include/public/physdev.h
>
> struct physdev_manage_pci {
>     /* IN */
>     uint8_t bus;
>     uint8_t devfn;
> };
>
> struct physdev_manage_pci_ext {
>     /* IN */
>     uint8_t bus;
>     uint8_t devfn;
>     unsigned is_extfn;
>     unsigned is_virtfn;
>     struct {
>         uint8_t bus;
>         uint8_t devfn;
>     } physfn;
> };
>
> Let me know how you could encode a DeviceID in those hypercalls.
>
If you go back to your comment where you wrote "we need to find another way to get the DeviceID": I meant that we should add that other way in this series, so that the correct DeviceID is programmed into the ITS.
> Cheers,
>



^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [PATCH 09/28] ARM: GICv3 ITS: map device and LPIs to the ITS on physdev_op hypercall
  2017-01-31 14:08           ` Jaggi, Manish
@ 2017-01-31 15:17             ` Julien Grall
  2017-01-31 16:02               ` Jaggi, Manish
  0 siblings, 1 reply; 106+ messages in thread
From: Julien Grall @ 2017-01-31 15:17 UTC (permalink / raw)
  To: Jaggi, Manish, Andre Przywara, Stefano Stabellini
  Cc: xen-devel, Nair, Jayachandran, Vijay Kilari, Kapoor, Prasun



On 31/01/17 14:08, Jaggi, Manish wrote:
> Hi Julien,
>
> On 1/31/2017 7:16 PM, Julien Grall wrote:
>> On 31/01/17 13:19, Jaggi, Manish wrote:
>>> On 1/31/2017 6:13 PM, Julien Grall wrote:
>>>> On 31/01/17 10:29, Jaggi, Manish wrote:
>>>>>>
>>>>>> From: Xen-devel <xen-devel-bounces@lists.xen.org> on behalf of Andre
>>>>>> Przywara <andre.przywara@arm.com>
>>>>>> Sent: Tuesday, January 31, 2017 12:01 AM
>>>>>> To: Stefano Stabellini; Julien Grall
>>>>>> Cc: xen-devel@lists.xenproject.org; Vijay Kilari
>>>>>> Subject: [Xen-devel] [PATCH 09/28] ARM: GICv3 ITS: map device and
>>>>>> LPIs to the ITS on physdev_op hypercall
>>>>>>
>>>>> [snip]
>>>>>>
>>>>>>
>>>>>>  int do_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>>>>>>  {
>>>>>> +    struct physdev_manage_pci manage;
>>>>>> +    u32 devid;
>>>>>> +    int ret;
>>>>>> +
>>>>>> +    switch (cmd)
>>>>>> +    {
>>>>>
>>>>> You might also need to handle the PHYSDEVOP_pci_device_add hypercall.
>>>>>
>>>>>> +        case PHYSDEVOP_manage_pci_add:
>>>>>> +        case PHYSDEVOP_manage_pci_remove:
>>>>>> +            if ( copy_from_guest(&manage, arg, 1) != 0 )
>>>>>> +                return -EFAULT;
>>>>>> +
>>>>>> +            devid = manage.bus << 8 | manage.devfn;
>>>>>> +            /* Allocate an ITS device table with space for 32 MSIs */
>>>>>> +            ret = gicv3_its_map_guest_device(hardware_domain,
>>>>>> devid, devid, 5,
>>>>>> +                                             cmd ==
>>>>>> PHYSDEVOP_manage_pci_add);
>>>>>
>>>>> Based on 4.9 kernel, is the deivce ID plain sBDF or it is
>>>>> returnedfrom of_msi_map_rid /  iort_msi_map_rid ?
>>>>> I believe there needs to be set this as requirement on the calle of
>>>>> hypercall.
>>>>
>>>> The requirement of the hypercall is already defined and cannot be
>>>> changed. So if it does not provide the correct information, then we
>>>> need to find another way to get the DeviceID.
>>>>
>>> Do you think sbdf and device ID are same ? If you recollect your
>>> comments last year sbdf != DeviceID.
>>> for this series it has to be passed correctly otherwise ITS would be programmed incorrectly.
>>> I suggest this series to include another way as well.
>>
>> Thank you sherlock, if you had read my e-mail entirely you would have noticed I never said sbdf == DeviceID and actually provided insight on the problem and suggest solutions.
>>
> If you please read 4 lines above I wrote sbdf != DeviceID.

I think there is a miscommunication problem here. By "my e-mail" I was 
referring to the e-mail on this thread 
(4a8e35dc-57e5-e493-9a9a-4a91bb8e1a2f@arm.com). In your e-mail you 
implied I was not aware that sbdf != DeviceID (see "Do you think sbdf and 
device ID are same").


>> I would recommend you to do the same in the future. It would help to get the code much faster in Xen.
>>
>>>
>>>> In case of ACPI, we should be able to get those informations from the
>>>> IORT as the segment number is defined in the firmware tables. But for
>>>> Device Tree, we would need DOM0 and Xen to agree on the segment number.
>>>>
>>> Is there any agreement hypercall used with this series ?
>>
>> From xen/include/public/physdev.h
>>
>> struct physdev_manage_pci {
>>     /* IN */
>>     uint8_t bus;
>>     uint8_t devfn;
>> };
>>
>> struct physdev_manage_pci_ext {
>>     /* IN */
>>     uint8_t bus;
>>     uint8_t devfn;
>>     unsigned is_extfn;
>>     unsigned is_virtfn;
>>     struct {
>>         uint8_t bus;
>>         uint8_t devfn;
>>     } physfn;
>> };
>>
>> Let me know how you could encode a DeviceID in those hypercalls.
>>
> If you please go back to your comment where you wrote "we need to find another way to get the DeviceID", I was referring that we should add that another way in this series so that correct DeviceID is programmed in ITS.

This is not the first time I am saying this: just saying "we should add 
that another way..." is not helpful. You should also provide some 
details on what you would do.

For now, you have given no feedback on my suggestions and I have no clue what 
you mean by "agreement hypercall".

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [PATCH 09/28] ARM: GICv3 ITS: map device and LPIs to the ITS on physdev_op hypercall
  2017-01-31 15:17             ` Julien Grall
@ 2017-01-31 16:02               ` Jaggi, Manish
  2017-01-31 16:18                 ` Julien Grall
  0 siblings, 1 reply; 106+ messages in thread
From: Jaggi, Manish @ 2017-01-31 16:02 UTC (permalink / raw)
  To: Julien Grall, Jaggi, Manish, Andre Przywara, Stefano Stabellini
  Cc: xen-devel, Nair, Jayachandran, Vijay Kilari, Kapoor, Prasun



On 1/31/2017 8:47 PM, Julien Grall wrote:
>
>
> On 31/01/17 14:08, Jaggi, Manish wrote:
>> Hi Julien,
>>
>> On 1/31/2017 7:16 PM, Julien Grall wrote:
>>> On 31/01/17 13:19, Jaggi, Manish wrote:
>>>> On 1/31/2017 6:13 PM, Julien Grall wrote:
>>>>> On 31/01/17 10:29, Jaggi, Manish wrote:
>>>>>>>
>>>>>>> From: Xen-devel <xen-devel-bounces@lists.xen.org> on behalf of Andre
>>>>>>> Przywara <andre.przywara@arm.com>
>>>>>>> Sent: Tuesday, January 31, 2017 12:01 AM
>>>>>>> To: Stefano Stabellini; Julien Grall
>>>>>>> Cc: xen-devel@lists.xenproject.org; Vijay Kilari
>>>>>>> Subject: [Xen-devel] [PATCH 09/28] ARM: GICv3 ITS: map device and
>>>>>>> LPIs to the ITS on physdev_op hypercall
>>>>>>>
>>>>>> [snip]
>>>>>>>
>>>>>>>
>>>>>>>  int do_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>>>>>>>  {
>>>>>>> +    struct physdev_manage_pci manage;
>>>>>>> +    u32 devid;
>>>>>>> +    int ret;
>>>>>>> +
>>>>>>> +    switch (cmd)
>>>>>>> +    { 
>>>>>>
>>>>>> You might alos need to  PHYSDEVOP_pci_device_add hypercall also.
>>>>>>
>>>>>>> +        case PHYSDEVOP_manage_pci_add:
>>>>>>> +        case PHYSDEVOP_manage_pci_remove:
>>>>>>> +            if ( copy_from_guest(&manage, arg, 1) != 0 )
>>>>>>> +                return -EFAULT;
>>>>>>> +
>>>>>>> +            devid = manage.bus << 8 | manage.devfn;
>>>>>>> +            /* Allocate an ITS device table with space for 32 MSIs */
>>>>>>> +            ret = gicv3_its_map_guest_device(hardware_domain,
>>>>>>> devid, devid, 5,
>>>>>>> +                                             cmd ==
>>>>>>> PHYSDEVOP_manage_pci_add); 
>>>>>>
>>>>>> Based on 4.9 kernel, is the deivce ID plain sBDF or it is
>>>>>> returnedfrom of_msi_map_rid /  iort_msi_map_rid ?
>>>>>> I believe there needs to be set this as requirement on the calle of
>>>>>> hypercall. 
>>>>>
>>>>> The requirement of the hypercall is already defined and cannot be
>>>>> changed. So if it does not provide the correct information, then we
>>>>> need to find another way to get the DeviceID.
>>>>>
>>>> Do you think sbdf and device ID are same ? If you recollect your
>>>> comments last year sbdf != DeviceID.
>>>> for this series it has to be passed correctly otherwise ITS would be programmed incorrectly.
>>>> I suggest this series to include another way as well. 
>>>
>>> Thank you sherlock, if you had read my e-mail entirely you would have noticed I never said sbdf == DeviceID and actually provided insight on the problem and suggest solutions.
>>>
>> If you please read 4 lines above I wrote sbdf != DeviceID. 
>
> I think there is a miscommunication problem here. By "my e-mail" I was referring to the e-mail on this thread (4a8e35dc-57e5-e493-9a9a-4a91bb8e1a2f@arm.com). On your e-mail you implied I was not aware of sbdf != DeviceID (see "Do you think sbdf and device ID are same").
>
>
>>> I would recommend you to do the same in the future. It would help to get the code much faster in Xen.
>>>
>>>>
>>>>> In case of ACPI, we should be able to get those informations from the
>>>>> IORT as the segment number is defined in the firmware tables. But for
>>>>> Device Tree, we would need DOM0 and Xen to agree on the segment number.
>>>>>
>>>> Is there any agreement hypercall used with this series ? 
>>>
>>> From xen/include/public/physdev.h
>>>
>>> struct physdev_manage_pci {
>>>     /* IN */
>>>     uint8_t bus;
>>>     uint8_t devfn;
>>> };
>>>
>>> struct physdev_manage_pci_ext {
>>>     /* IN */
>>>     uint8_t bus;
>>>     uint8_t devfn;
>>>     unsigned is_extfn;
>>>     unsigned is_virtfn;
>>>     struct {
>>>         uint8_t bus;
>>>         uint8_t devfn;
>>>     } physfn;
>>> };
>>>
>>> Let me know how you could encode a DeviceID in those hypercalls.
>>>
>> If you please go back to your comment where you wrote "we need to find another way to get the DeviceID", I was referring that we should add that another way in this series so that correct DeviceID is programmed in ITS. 
>
> This is not the first time I am saying this, just saying "we should add that another way..." is not helpful. You should also provide some details on what you would do.
>
Julien, as you suggested we need to find another way, I assumed you had something in mind.
Since we both agree that sbdf != deviceID, the current series of ITS patches will program an incorrect deviceID, so there is a need to
have a way to map sbdf to deviceID in Xen.

One option could be to add a new hypercall to supply sbdf and deviceID to xen.

 #define PHYSDEVOP_pci_dev_map_msi_specifier    33
 struct physdev_pci_dev_map_msi_specifier {
    /* IN */
    uint16_t seg;
    uint8_t bus;
    uint8_t devfn;
    uint32_t msi_specifier; //DeviceID
 };

(https://lists.xen.org/archives/html/xen-devel/2016-12/msg00224.html)
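
To make the difference concrete, here is a minimal sketch (not part of the series) of how the fields above pack into a 32-bit sBDF, next to what the patch under review computes today. Both helper names are hypothetical; neither exists in Xen. The patch only forms `bus << 8 | devfn`, so the segment never reaches the ITS code.

```c
#include <stdint.h>

/* Hypothetical helper: pack segment, bus and devfn into one 32-bit sBDF. */
static inline uint32_t pack_sbdf(uint16_t seg, uint8_t bus, uint8_t devfn)
{
    return ((uint32_t)seg << 16) | ((uint32_t)bus << 8) | devfn;
}

/* What the patch computes today: no segment, just bus and devfn. */
static inline uint32_t bdf_only(uint8_t bus, uint8_t devfn)
{
    return ((uint32_t)bus << 8) | devfn;
}
```

The two only agree for segment 0, and even then the result is a requester ID, not necessarily the DeviceID the ITS expects.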

> For now, you gave no feedbacks on my suggestions and I have no clue what you mean by "agreement hypercall".
>
> Cheers,
>


   

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [PATCH 09/28] ARM: GICv3 ITS: map device and LPIs to the ITS on physdev_op hypercall
  2017-01-31 16:02               ` Jaggi, Manish
@ 2017-01-31 16:18                 ` Julien Grall
  2017-02-24 19:57                   ` Shanker Donthineni
  0 siblings, 1 reply; 106+ messages in thread
From: Julien Grall @ 2017-01-31 16:18 UTC (permalink / raw)
  To: Jaggi, Manish, Andre Przywara, Stefano Stabellini
  Cc: xen-devel, Nair, Jayachandran, Vijay Kilari, Kapoor, Prasun



On 31/01/17 16:02, Jaggi, Manish wrote:
>
>
> On 1/31/2017 8:47 PM, Julien Grall wrote:
>>
>>
>> On 31/01/17 14:08, Jaggi, Manish wrote:
>>> Hi Julien,
>>>
>>> On 1/31/2017 7:16 PM, Julien Grall wrote:
>>>> On 31/01/17 13:19, Jaggi, Manish wrote:
>>>>> On 1/31/2017 6:13 PM, Julien Grall wrote:
>>>>>> On 31/01/17 10:29, Jaggi, Manish wrote:
>>> If you please go back to your comment where you wrote "we need to find another way to get the DeviceID", I was referring that we should add that another way in this series so that correct DeviceID is programmed in ITS.
>>
>> This is not the first time I am saying this, just saying "we should add that another way..." is not helpful. You should also provide some details on what you would do.
>>
> Julien, As you suggested we need to find another way, I assumed you had something in mind.

I gave suggestions in my e-mail but you may have missed them...

> Since we both agree that sbdf!=deviceID, the current series of ITS patches will program the incorrect deviceID so there is a need to
> have a way to map sbdf with deviceID in xen.
>
> One option could be to add a new hypercall to supply sbdf and deviceID to xen.

... as well as the part where I am saying that I am not in favor of 
implementing a hypercall temporarily, and against adding a new hypercall 
for only a couple of weeks. As you may know, PHYSDEV hypercalls are part 
of the stable ABI and once they are added they cannot be removed.

So we need to be sure the hypercall is necessary. In this case, the 
hypercall is not necessary as all the information can be found in the 
firmware tables. However, this is not implemented yet and is part of the 
discussion on PCI Passthrough (see [1]).

We need a temporary solution that does not involve any commitment on the 
ABI until Xen is able to discover PCI.
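
For the Device Tree case, a lookup in the firmware description could look roughly like the sketch below, which applies "msi-map" entries (rid-base, msi-base, length) to a requester ID. The struct and function names are hypothetical, and this is a simplification of what Linux does in of_msi_map_rid: no msi-map-mask, and the controller phandle is ignored.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One entry of a DT "msi-map" property (controller phandle omitted). */
struct msi_map_entry {
    uint32_t rid_base;   /* first requester ID covered by this entry */
    uint32_t msi_base;   /* DeviceID corresponding to rid_base */
    uint32_t length;     /* number of consecutive IDs covered */
};

/* Translate a requester ID; returns false if no entry covers it. */
static bool rid_to_deviceid(const struct msi_map_entry *map, size_t nr,
                            uint32_t rid, uint32_t *deviceid)
{
    for ( size_t i = 0; i < nr; i++ )
    {
        if ( rid < map[i].rid_base ||
             rid - map[i].rid_base >= map[i].length )
            continue;

        *deviceid = map[i].msi_base + (rid - map[i].rid_base);
        return true;
    }

    return false;
}
```

With an identity entry (rid_base == msi_base) the DeviceID equals the requester ID, which is the only case the current `bus << 8 | devfn` computation gets right.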

Regards,

[1] <5cf9128e-e845-2a89-f7c7-ac8616941ab9@linaro.org>
>
>
>
>

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [PATCH 01/28] ARM: export __flush_dcache_area()
  2017-01-30 18:31 ` [PATCH 01/28] ARM: export __flush_dcache_area() Andre Przywara
@ 2017-02-06 11:23   ` Julien Grall
  0 siblings, 0 replies; 106+ messages in thread
From: Julien Grall @ 2017-02-06 11:23 UTC (permalink / raw)
  To: Andre Przywara, Stefano Stabellini; +Cc: xen-devel, Vijay Kilari

Hi Andre,

On 30/01/17 18:31, Andre Przywara wrote:
> The ability to clean a cache line is not only useful for EFI, but will
> be needed later for the ITS support.
> Export the function to be usable from the whole Xen/ARM code.

There is already a function to clean & invalidate. See 
clean_and_invalidate_dcache_va_range in include/asm-arm/page.h

So please use this function and leave __flush_dcache_area only exported 
to EFI.

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [PATCH 02/28] ARM: GICv3 ITS: parse and store ITS subnodes from hardware DT
  2017-01-30 18:31 ` [PATCH 02/28] ARM: GICv3 ITS: parse and store ITS subnodes from hardware DT Andre Przywara
@ 2017-02-06 12:39   ` Julien Grall
  2017-02-16 17:44     ` Andre Przywara
  2017-02-06 12:58   ` Julien Grall
  1 sibling, 1 reply; 106+ messages in thread
From: Julien Grall @ 2017-02-06 12:39 UTC (permalink / raw)
  To: Andre Przywara, Stefano Stabellini; +Cc: xen-devel, Vijay Kilari

Hi Andre,

On 30/01/17 18:31, Andre Przywara wrote:
> diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
> new file mode 100644
> index 0000000..ff0f571
> --- /dev/null
> +++ b/xen/arch/arm/gic-v3-its.c
> @@ -0,0 +1,71 @@
> +/*
> + * xen/arch/arm/gic-v3-its.c
> + *
> + * ARM GICv3 Interrupt Translation Service (ITS) support
> + *
> + * Copyright (C) 2016,2017 - ARM Ltd
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + */
> +
> +#include <xen/config.h>

No need to include xen/config.h; it is included by default.

> +#include <xen/lib.h>
> +#include <xen/device_tree.h>


> +#include <xen/libfdt/libfdt.h>
> +#include <asm/gic.h>
> +#include <asm/gic_v3_defs.h>

The 3 headers above do not look necessary for now. Please try to 
include them when needed.

> +#include <asm/gic_v3_its.h>
> +
> +/* Scan the DT for any ITS nodes and create a list of host ITSes out of it. */
> +void gicv3_its_dt_init(const struct dt_device_node *node)
> +{
> +    const struct dt_device_node *its = NULL;
> +    struct host_its *its_data;
> +
> +    /*
> +     * Check for ITS MSI subnodes. If any, add the ITS register
> +     * frames to the ITS list.
> +     */
> +    dt_for_each_child_node(node, its)
> +    {
> +        paddr_t addr, size;

NIT: dt_device_get_address takes uint64_t variables as parameters. So 
I would prefer to use uint64_t here for consistency.

> +
> +        if ( !dt_device_is_compatible(its, "arm,gic-v3-its") )
> +            continue;
> +
> +        if ( !dt_device_is_available(its) )
> +            continue;

Can an ITS really be disabled? Or is it just for debugging?

> +
> +        if ( dt_device_get_address(its, 0, &addr, &size) )
> +            panic("GICv3: Cannot find a valid ITS frame address");
> +
> +        its_data = xzalloc(struct host_its);
> +        if ( !its_data )
> +            panic("GICv3: Cannot allocate memory for ITS frame");
> +
> +        its_data->addr = addr;
> +        its_data->size = size;
> +        its_data->dt_node = its;
> +
> +        printk("GICv3: Found ITS @0x%lx\n", addr);
> +
> +        list_add_tail(&its_data->entry, &host_its_list);
> +    }
> +}
> +
> +/*
> + * Local variables:
> + * mode: C
> + * c-file-style: "BSD"
> + * c-basic-offset: 4
> + * indent-tabs-mode: nil
> + * End:
> + */
> diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
> index b8be395..838dd11 100644
> --- a/xen/arch/arm/gic-v3.c
> +++ b/xen/arch/arm/gic-v3.c
> @@ -43,9 +43,12 @@
>  #include <asm/device.h>
>  #include <asm/gic.h>
>  #include <asm/gic_v3_defs.h>
> +#include <asm/gic_v3_its.h>
>  #include <asm/cpufeature.h>
>  #include <asm/acpi.h>
>
> +LIST_HEAD(host_its_list);

I would rather limit the number of variables exported. I've looked at how 
host_its_list is used across this series and I don't think it is 
necessary to export it.

The 2 users (not including gic-v3-its.c) are in gic-v3.c and vgic-v3.c. 
I will explain how to replace the one in vgic-v3.c on the corresponding 
patch.

For gic-v3.c, you use host_its_list to check if an ITS is available and 
to go through the list. For the former, you could have gicv3_its_dt_init 
return the number of ITSes available. For the latter, the loop is calling 
a function living in gic-v3-its.c where host_its_list is already available.
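
A rough sketch of that shape, in plain C so it stands alone: the list is file-local, the init function returns how many entries it stored, and iteration is offered through a callback. All names here are hypothetical stand-ins, the simplified struct replaces the real struct host_its with its struct list_head, and unlike the patch (which panics) a failed allocation is simply skipped.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Simplified stand-in for struct host_its. */
struct host_its {
    struct host_its *next;
    uint64_t addr;
    uint64_t size;
};

/* File-local list: nothing outside this file can touch it. */
static struct host_its *host_its_list;

/*
 * Hypothetical replacement for gicv3_its_dt_init(): record each
 * {address, size} frame and return how many were stored.
 */
unsigned int its_register_frames(const uint64_t (*frames)[2], unsigned int nr)
{
    unsigned int count = 0;

    for ( unsigned int i = 0; i < nr; i++ )
    {
        struct host_its *its = calloc(1, sizeof(*its));

        if ( !its )
            continue;

        its->addr = frames[i][0];
        its->size = frames[i][1];
        its->next = host_its_list;
        host_its_list = its;
        count++;
    }

    return count;
}

/* Iteration stays inside the file too: callers only pass a callback. */
void its_for_each(void (*fn)(const struct host_its *its))
{
    for ( const struct host_its *its = host_its_list; its; its = its->next )
        fn(its);
}
```

The caller in gic-v3.c would then only test the returned count against zero instead of calling list_empty() on an exported list.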

I will mention this again when reviewing the associated patches.

> +
>  /* Global state */
>  static struct {
>      void __iomem *map_dbase;  /* Mapped address of distributor registers */
> @@ -1224,11 +1227,12 @@ static void __init gicv3_dt_init(void)
>       */
>      res = dt_device_get_address(node, 1 + gicv3.rdist_count,
>                                  &cbase, &csize);
> -    if ( res )
> -        return;
> +    if ( !res )
> +        dt_device_get_address(node, 1 + gicv3.rdist_count + 2,
> +                              &vbase, &vsize);
>
> -    dt_device_get_address(node, 1 + gicv3.rdist_count + 2,
> -                          &vbase, &vsize);
> +    /* Check for ITS child nodes and build the host ITS list accordingly. */
> +    gicv3_its_dt_init(node);
>  }
>
>  static int gicv3_iomem_deny_access(const struct domain *d)
> diff --git a/xen/include/asm-arm/gic_v3_its.h b/xen/include/asm-arm/gic_v3_its.h
> new file mode 100644
> index 0000000..2f5c51c
> --- /dev/null
> +++ b/xen/include/asm-arm/gic_v3_its.h
> @@ -0,0 +1,57 @@
> +/*
> + * ARM GICv3 ITS support
> + *
> + * Andre Przywara <andre.przywara@arm.com>
> + * Copyright (c) 2016 ARM Ltd.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + */
> +
> +#ifndef __ASM_ARM_ITS_H__
> +#define __ASM_ARM_ITS_H__
> +
> +#ifndef __ASSEMBLY__

Do you expect the ITS header to be included in assembly code? If 
not, then I would drop the #ifndef __ASSEMBLY__.

> +#include <xen/device_tree.h>
> +
> +/* data structure for each hardware ITS */
> +struct host_its {
> +    struct list_head entry;
> +    const struct dt_device_node *dt_node;
> +    paddr_t addr;
> +    paddr_t size;
> +};
> +
> +extern struct list_head host_its_list;
> +
> +#ifdef CONFIG_HAS_ITS
> +
> +/* Parse the host DT and pick up all host ITSes. */
> +void gicv3_its_dt_init(const struct dt_device_node *node);
> +
> +#else
> +
> +static inline void gicv3_its_dt_init(const struct dt_device_node *node)
> +{
> +}
> +
> +#endif /* CONFIG_HAS_ITS */
> +
> +#endif /* __ASSEMBLY__ */
> +#endif
> +
> +/*
> + * Local variables:
> + * mode: C
> + * c-file-style: "BSD"
> + * c-basic-offset: 4
> + * indent-tabs-mode: nil
> + * End:
> + */
>

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [PATCH 02/28] ARM: GICv3 ITS: parse and store ITS subnodes from hardware DT
  2017-01-30 18:31 ` [PATCH 02/28] ARM: GICv3 ITS: parse and store ITS subnodes from hardware DT Andre Przywara
  2017-02-06 12:39   ` Julien Grall
@ 2017-02-06 12:58   ` Julien Grall
  2017-02-27 11:43     ` Andre Przywara
  1 sibling, 1 reply; 106+ messages in thread
From: Julien Grall @ 2017-02-06 12:58 UTC (permalink / raw)
  To: Andre Przywara, Stefano Stabellini; +Cc: xen-devel, Vijay Kilari

Hi Andre,

On 30/01/17 18:31, Andre Przywara wrote:
> Parse the DT GIC subnodes to find every ITS MSI controller the hardware
> offers. Store that information in a list to both propagate all of them
> later to Dom0, but also to be able to iterate over all ITSes.
> This introduces an ITS Kconfig option.
>
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> ---
>  xen/arch/arm/Kconfig             |  4 +++
>  xen/arch/arm/Makefile            |  1 +
>  xen/arch/arm/gic-v3-its.c        | 71 ++++++++++++++++++++++++++++++++++++++++
>  xen/arch/arm/gic-v3.c            | 12 ++++---
>  xen/include/asm-arm/gic_v3_its.h | 57 ++++++++++++++++++++++++++++++++
>  5 files changed, 141 insertions(+), 4 deletions(-)
>  create mode 100644 xen/arch/arm/gic-v3-its.c
>  create mode 100644 xen/include/asm-arm/gic_v3_its.h
>
> diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
> index 2e023d1..bf64c61 100644
> --- a/xen/arch/arm/Kconfig
> +++ b/xen/arch/arm/Kconfig
> @@ -45,6 +45,10 @@ config ACPI
>  config HAS_GICV3
>  	bool
>
> +config HAS_ITS
> +        bool "GICv3 ITS MSI controller support"
> +        depends on HAS_GICV3

Shouldn't this be disabled by default until the last patch of the 
series, in order to avoid potential issues when bisecting Xen?

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [PATCH 03/28] ARM: GICv3: allocate LPI pending and property table
  2017-01-30 18:31 ` [PATCH 03/28] ARM: GICv3: allocate LPI pending and property table Andre Przywara
@ 2017-02-06 16:26   ` Julien Grall
  2017-02-27 11:34     ` Andre Przywara
  2017-02-14  0:47   ` Stefano Stabellini
  1 sibling, 1 reply; 106+ messages in thread
From: Julien Grall @ 2017-02-06 16:26 UTC (permalink / raw)
  To: Andre Przywara, Stefano Stabellini; +Cc: xen-devel, Vijay Kilari

Hi Andre,

On 30/01/17 18:31, Andre Przywara wrote:
> The ARM GICv3 provides a new kind of interrupt called LPIs.
> The pending bits and the configuration data (priority, enable bits) for
> those LPIs are stored in tables in normal memory, which software has to
> provide to the hardware.
> Allocate the required memory, initialize it and hand it over to each
> redistributor. The maximum number of LPIs to be used can be adjusted with
> the command line option "max_lpi_bits", which defaults to a compile time
> constant exposed in Kconfig.
>
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> ---
>  xen/arch/arm/Kconfig              |  15 +++++
>  xen/arch/arm/Makefile             |   1 +
>  xen/arch/arm/gic-v3-lpi.c         | 129 ++++++++++++++++++++++++++++++++++++++
>  xen/arch/arm/gic-v3.c             |  44 +++++++++++++
>  xen/include/asm-arm/bitops.h      |   1 +
>  xen/include/asm-arm/gic.h         |   2 +
>  xen/include/asm-arm/gic_v3_defs.h |  52 ++++++++++++++-
>  xen/include/asm-arm/gic_v3_its.h  |  22 ++++++-
>  8 files changed, 264 insertions(+), 2 deletions(-)
>  create mode 100644 xen/arch/arm/gic-v3-lpi.c
>
> diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
> index bf64c61..71734a1 100644
> --- a/xen/arch/arm/Kconfig
> +++ b/xen/arch/arm/Kconfig
> @@ -49,6 +49,21 @@ config HAS_ITS
>          bool "GICv3 ITS MSI controller support"
>          depends on HAS_GICV3
>
> +config MAX_PHYS_LPI_BITS
> +        depends on HAS_ITS
> +        int "Maximum bits for GICv3 host LPIs (14-32)"
> +        range 14 32
> +        default "20"
> +        help
> +          Specifies the maximum number of LPIs (in bits) Xen should take
> +          care of. The host ITS may provide support for a very large number
> +          of supported LPIs, for all of which we may not want to allocate
> +          memory, so this number here allows to limit this.

I think the description is misleading, if a user wants 8K worth of LPIs 
by default, he would have to use 14 and not 13.
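
The arithmetic behind this, as a small sketch: LPI interrupt IDs start at 8192, so the usable count is 2^bits minus that offset (the same formula as MAX_PHYS_LPIS in this patch). The helper name is made up for illustration.

```c
#include <stdint.h>

#define LPI_OFFSET 8192U   /* LPI interrupt IDs start at 8192 */

/* Number of usable LPIs when interrupt IDs are id_bits wide. */
static inline uint64_t nr_lpis_for_bits(unsigned int id_bits)
{
    uint64_t nr_ids = UINT64_C(1) << id_bits;

    return nr_ids > LPI_OFFSET ? nr_ids - LPI_OFFSET : 0;
}
```

So 13 bits leaves no LPIs at all, 14 bits gives 8192 of them, and the default of 20 gives 1048576 - 8192.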

Furthermore, you provide both a runtime option (via the command line) and 
a build time option (via Kconfig). You don't explain what the 
difference is between the two and how they are supposed to co-exist.

Anyway, IMHO the command line option should be sufficient to allow an 
override if necessary. So I would drop the Kconfig version.

> +          Xen itself does not know how many LPIs domains will ever need
> +          beforehand.
> +          This can be overriden on the command line with the max_lpi_bits

s/overriden/overridden/

> +          parameter.
> +
>  endmenu
>
>  menu "ARM errata workaround via the alternative framework"
> diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
> index 5f4ff23..4ccf2eb 100644
> --- a/xen/arch/arm/Makefile
> +++ b/xen/arch/arm/Makefile
> @@ -19,6 +19,7 @@ obj-y += gic.o
>  obj-y += gic-v2.o
>  obj-$(CONFIG_HAS_GICV3) += gic-v3.o
>  obj-$(CONFIG_HAS_ITS) += gic-v3-its.o
> +obj-$(CONFIG_HAS_ITS) += gic-v3-lpi.o
>  obj-y += guestcopy.o
>  obj-y += hvm.o
>  obj-y += io.o
> diff --git a/xen/arch/arm/gic-v3-lpi.c b/xen/arch/arm/gic-v3-lpi.c
> new file mode 100644
> index 0000000..e2fc901
> --- /dev/null
> +++ b/xen/arch/arm/gic-v3-lpi.c
> @@ -0,0 +1,129 @@
> +/*
> + * xen/arch/arm/gic-v3-lpi.c
> + *
> + * ARM GICv3 Locality-specific Peripheral Interrupts (LPI) support
> + *
> + * Copyright (C) 2016,2017 - ARM Ltd
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + */
> +
> +#include <xen/config.h>

xen/config.h is not necessary.

> +#include <xen/lib.h>
> +#include <xen/mm.h>
> +#include <xen/sizes.h>
> +#include <asm/gic.h>
> +#include <asm/gic_v3_defs.h>
> +#include <asm/gic_v3_its.h>
> +
> +/* Global state */
> +static struct {
> +    uint8_t *lpi_property;
> +    unsigned int host_lpi_bits;

On the previous version, Stefano suggested renaming this to 
phys_lpi_bits and adding a comment, as you store the number of bits.

However, looking at the usage, the number of bits is only required during 
the initialization. Runtime code (such as gic_get_host_lpi) will use the 
number of LPIs and therefore requires extra instructions to compute the 
value.

So I would prefer if you store the number of LPIs here to optimize the 
common case.

Also, I find the naming "id_bits" confusing because you store the number 
of bits to encode the max LPI ID and not the number of bits to encode 
the number of LPI.
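
Something along these lines is what I have in mind; a sketch only, with hypothetical names, and assuming id_bits is at least 14 as the Kconfig range guarantees:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define LPI_OFFSET 8192U

/* Hypothetical layout: keep the LPI count, derived once at init time. */
struct lpi_state {
    uint8_t *lpi_property;
    uint64_t nr_host_lpis;   /* (1 << id_bits) - LPI_OFFSET */
};

/* Assumes id_bits >= 14, so the subtraction cannot underflow. */
static inline void lpi_state_init(struct lpi_state *s, unsigned int id_bits)
{
    s->lpi_property = NULL;
    s->nr_host_lpis = (UINT64_C(1) << id_bits) - LPI_OFFSET;
}

/* Runtime check: one subtraction and one compare, no shift. */
static inline bool lpi_in_range(const struct lpi_state *s, uint32_t lpi)
{
    return lpi >= LPI_OFFSET && (lpi - LPI_OFFSET) < s->nr_host_lpis;
}
```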

> +} lpi_data;
> +
> +/* Pending table for each redistributor */
> +static DEFINE_PER_CPU(void *, pending_table);
> +
> +#define MAX_PHYS_LPIS   (BIT_ULL(lpi_data.host_lpi_bits) - LPI_OFFSET)
> +
> +uint64_t gicv3_lpi_allocate_pendtable(void)
> +{
> +    uint64_t reg;
> +    void *pendtable;
> +
> +    reg  = GIC_BASER_CACHE_RaWaWb << GICR_PENDBASER_INNER_CACHEABILITY_SHIFT;
> +    reg |= GIC_BASER_CACHE_SameAsInner << GICR_PENDBASER_OUTER_CACHEABILITY_SHIFT;
> +    reg |= GIC_BASER_InnerShareable << GICR_PENDBASER_SHAREABILITY_SHIFT;
> +
> +    if ( !this_cpu(pending_table) )
> +    {
> +        /*
> +         * The pending table holds one bit per LPI and even covers bits for
> +         * interrupt IDs below 8192, so we allocate the full range.
> +         * The GICv3 imposes a 64KB alignment requirement.
> +         */
> +        pendtable = _xmalloc(BIT_ULL(lpi_data.host_lpi_bits) / 8, SZ_64K);
> +        if ( !pendtable )
> +            return 0;
> +
> +        memset(pendtable, 0, BIT_ULL(lpi_data.host_lpi_bits) / 8);

You can use _xzalloc to do the allocation and the zeroing in one call, instead of _xmalloc followed by memset.

> +        __flush_dcache_area(pendtable, BIT_ULL(lpi_data.host_lpi_bits) / 8);

Please use clean_and_invalidate_dcache_va_range.

> +
> +        this_cpu(pending_table) = pendtable;
> +    }
> +    else
> +    {
> +        pendtable = this_cpu(pending_table);
> +    }

The {} are not necessary. Also, on the previous version it was mentioned 
that this should be an error and then be replaced by a BUG_ON().

Please do the change.

> +
> +    reg |= GICR_PENDBASER_PTZ;
> +
> +    ASSERT(!(virt_to_maddr(pendtable) & ~GENMASK(51, 16)));

I don't understand the purpose of this ASSERT. The bits 15:0 should 
always be zero, otherwise this would be a bug in the memory allocator. 
For bits 63:52, the architecture so far only supports up to 52 bits.

By keeping this ASSERT, you will make our life more difficult when extending 
the number of physical address bits supported if ARM decides to bump it.
So please drop this ASSERT.
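
For reference, this is what the check boils down to, written as a standalone sketch. GENMASK64 is a local 64-bit re-implementation of the GENMASK idea and fits_pendbaser is a made-up name:

```c
#include <stdbool.h>
#include <stdint.h>

/* 64-bit mask with bits h..l set (same idea as GENMASK). */
#define GENMASK64(h, l) \
    ((~UINT64_C(0) << (l)) & (~UINT64_C(0) >> (63 - (h))))

/* True if addr only uses bits 51:16, i.e. 64KB aligned and below 2^52. */
static inline bool fits_pendbaser(uint64_t addr)
{
    return !(addr & ~GENMASK64(51, 16));
}
```

The low half of the test is already guaranteed by the 64KB alignment passed to the allocator, and the high half hardcodes the current physical address width.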

> +    reg |= virt_to_maddr(pendtable);
> +
> +    return reg;
> +}
> +
> +uint64_t gicv3_lpi_get_proptable(void)
> +{
> +    uint64_t reg;
> +
> +    reg  = GIC_BASER_CACHE_RaWaWb << GICR_PENDBASER_INNER_CACHEABILITY_SHIFT;
> +    reg |= GIC_BASER_CACHE_SameAsInner << GICR_PENDBASER_OUTER_CACHEABILITY_SHIFT;
> +    reg |= GIC_BASER_InnerShareable << GICR_PENDBASER_SHAREABILITY_SHIFT;

You are using the shift defines from PENDBASER and not PROPBASER.

> +
> +    /*
> +     * The property table is shared across all redistributors, so allocate
> +     * this only once, but return the same value on subsequent calls.
> +     */
> +    if ( !lpi_data.lpi_property )
> +    {
> +        /* The property table holds one byte per LPI. */
> +        void *table = alloc_xenheap_pages(lpi_data.host_lpi_bits - PAGE_SHIFT,
> +                                          0);

The property table address has to be 4KB aligned right? If so, I would 
much prefer if you use _xmalloc(BIT_ULL(lpi_data.host_lpi_bits), SZ_4K) 
to avoid relying on PAGE_SIZE == 4KB.

Also, you will allocate more memory than necessary because the property 
table only covers the LPIs.
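
To put numbers on it, a sketch of the three sizes involved (helper names are hypothetical): the pending table needs one bit per interrupt ID, the property table one byte per LPI, and a page allocation of order (bits - PAGE_SHIFT) hands out one byte per interrupt ID.

```c
#include <stdint.h>

#define LPI_OFFSET 8192U

/* Pending table: one bit per interrupt ID, including IDs below 8192. */
static inline uint64_t pendtable_bytes(unsigned int id_bits)
{
    return (UINT64_C(1) << id_bits) / 8;
}

/* Property table: one byte per LPI only. */
static inline uint64_t proptable_bytes(unsigned int id_bits)
{
    return (UINT64_C(1) << id_bits) - LPI_OFFSET;
}

/* Size of a page allocation of order (id_bits - page_shift). */
static inline uint64_t page_alloc_bytes(unsigned int id_bits,
                                        unsigned int page_shift)
{
    return UINT64_C(1) << page_shift << (id_bits - page_shift);
}
```

With 20 bits and 4KB pages the page allocation is 8192 bytes larger than what the property table needs.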

> +
> +        if ( !table )
> +            return 0;
> +
> +        memset(table, GIC_PRI_IRQ | LPI_PROP_RES1, MAX_PHYS_LPIS);

You could combine both the suggested _xmalloc and the memset to 0 in a single 
call to _xzalloc.

> +        __flush_dcache_area(table, MAX_PHYS_LPIS);

Please use clean_and_invalidate_dcache_va_range.

> +        lpi_data.lpi_property = table;
> +    }
> +
> +    reg |= ((lpi_data.host_lpi_bits - 1) << 0);

Please avoid hardcoded shift.
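
As an illustration of what named fields would look like, here is a standalone sketch that builds a GICR_PROPBASER value. The macro names are hypothetical, and the field positions and encodings are my reading of the GICv3 spec (IDbits in [4:0], inner cacheability at bit 7, shareability at bit 10, outer cacheability at bit 56), so please double check them against the document before reusing them.

```c
#include <stdint.h>

/* Field positions and encodings: assumptions, see the note above. */
#define PROPBASER_IDBITS_SHIFT               0
#define PROPBASER_IDBITS_MASK                UINT64_C(0x1f)
#define PROPBASER_INNER_CACHEABILITY_SHIFT   7
#define PROPBASER_SHAREABILITY_SHIFT         10
#define PROPBASER_OUTER_CACHEABILITY_SHIFT   56

#define BASER_CACHE_SAME_AS_INNER  UINT64_C(0)
#define BASER_CACHE_RAWAWB         UINT64_C(7)
#define BASER_INNER_SHAREABLE      UINT64_C(1)

/* table_maddr must already be suitably aligned; id_bits is 14..32. */
static inline uint64_t make_propbaser(uint64_t table_maddr,
                                      unsigned int id_bits)
{
    uint64_t reg;

    reg  = BASER_CACHE_RAWAWB << PROPBASER_INNER_CACHEABILITY_SHIFT;
    reg |= BASER_CACHE_SAME_AS_INNER << PROPBASER_OUTER_CACHEABILITY_SHIFT;
    reg |= BASER_INNER_SHAREABLE << PROPBASER_SHAREABILITY_SHIFT;
    /* The register field holds the number of ID bits minus one. */
    reg |= ((uint64_t)(id_bits - 1) & PROPBASER_IDBITS_MASK)
           << PROPBASER_IDBITS_SHIFT;
    reg |= table_maddr;

    return reg;
}
```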

> +
> +    ASSERT(!(virt_to_maddr(lpi_data.lpi_property) & ~GENMASK(51, 12)));
> +    reg |= virt_to_maddr(lpi_data.lpi_property);
> +
> +    return reg;
> +}
> +
> +static unsigned int max_lpi_bits = CONFIG_MAX_PHYS_LPI_BITS;
> +integer_param("max_lpi_bits", max_lpi_bits);

Please document this new option in docs/misc/xen-command-line.markdown.

> +
> +int gicv3_lpi_init_host_lpis(unsigned int hw_lpi_bits)

Stefano suggested to rename this function to gicv3_lpi_init_phys_lpis 
and I agree with him here.

> +{
> +    lpi_data.host_lpi_bits = min(hw_lpi_bits, max_lpi_bits);
> +
> +    printk("GICv3: using at most %lld LPIs on the host.\n", MAX_PHYS_LPIS);

s/lld/llu/.

> +
> +    return 0;
> +}
> +
> +/*
> + * Local variables:
> + * mode: C
> + * c-file-style: "BSD"
> + * c-basic-offset: 4
> + * indent-tabs-mode: nil
> + * End:
> + */
> diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
> index 838dd11..fcb86c8 100644
> --- a/xen/arch/arm/gic-v3.c
> +++ b/xen/arch/arm/gic-v3.c
> @@ -546,6 +546,9 @@ static void __init gicv3_dist_init(void)
>      type = readl_relaxed(GICD + GICD_TYPER);
>      nr_lines = 32 * ((type & GICD_TYPE_LINES) + 1);
>
> +    if ( type & GICD_TYPE_LPIS )
> +        gicv3_lpi_init_host_lpis(((type >> GICD_TYPE_ID_BITS_SHIFT) & 0x1f) + 1);

A macro has been suggested on the previous version here to avoid the 
hardcoded 0x1f.
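
Something like the sketch below would do. GICD_TYPE_ID_BITS_SHIFT already exists in the patch (its value of 19 here is my reading of the spec, where IDbits lives in bits [23:19] of GICD_TYPER); the mask and the accessor macro are hypothetical names.

```c
#include <stdint.h>

#define GICD_TYPE_ID_BITS_SHIFT  19      /* assumed: IDbits is [23:19] */
#define GICD_TYPE_ID_BITS_MASK   0x1fU   /* hypothetical name */

/* The field stores the number of interrupt ID bits minus one. */
#define GICD_TYPE_ID_BITS(typer) \
    ((((typer) >> GICD_TYPE_ID_BITS_SHIFT) & GICD_TYPE_ID_BITS_MASK) + 1)
```

The call site then reads gicv3_lpi_init_host_lpis(GICD_TYPE_ID_BITS(type)).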

> +
>      printk("GICv3: %d lines, (IID %8.8x).\n",
>             nr_lines, readl_relaxed(GICD + GICD_IIDR));
>
> @@ -616,6 +619,33 @@ static int gicv3_enable_redist(void)
>      return 0;
>  }
>
> +static int gicv3_rdist_init_lpis(void __iomem * rdist_base)

I think it would make sense to move this function to gic-v3-lpi.c. So 
only one function rather than 2 would be exported.

> +{
> +    uint32_t reg;
> +    uint64_t table_reg;
> +
> +    /* We don't support LPIs without an ITS. */
> +    if ( list_empty(&host_its_list) )

See my comment on patch #2 regarding host_its_list.

> +        return -ENODEV;
> +
> +    /* Make sure LPIs are disabled before setting up the tables. */
> +    reg = readl_relaxed(rdist_base + GICR_CTLR);
> +    if ( reg & GICR_CTLR_ENABLE_LPIS )
> +        return -EBUSY;

Why don't you just disable LPIs here? AFAIK, it should just be:
writel_relaxed(reg & ~GICR_CTLR_ENABLE_LPIS, rdist_base + GICR_CTLR);

> +
> +    table_reg = gicv3_lpi_allocate_pendtable();
> +    if ( !table_reg )

According to the spec, a GICR_PENDBASER value of all zeroes is valid.

> +        return -ENOMEM;
> +    writeq_relaxed(table_reg, rdist_base + GICR_PENDBASER);

On the first version of this series, I mentioned that based on the spec 
(8.11.18 in ARM IHI 0069C) cacheability and shareability may not stick.

Whilst this may not (?) be a concern for the pending table, Xen will 
write to the property table to enable/disable LPIs. So we need to know 
whether the cache has to be cleaned after each access or not.
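
A rough sketch of the read-back check I have in mind. The macro and function 
names are assumptions for this example only; the field positions follow the 
GICR_PROPBASER layout (InnerCache in bits [9:7], Shareability in bits 
[11:10]). The policy of cleaning whenever the table ends up non-shareable or 
non-cacheable is my suggestion, not something the spec mandates.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* GICR_PROPBASER attribute fields; the macro names are assumptions. */
#define PROPBASER_INNER_CACHE_SHIFT    7
#define PROPBASER_INNER_CACHE_MASK     (0x7ULL << PROPBASER_INNER_CACHE_SHIFT)
#define PROPBASER_SHAREABILITY_SHIFT   10
#define PROPBASER_SHAREABILITY_MASK    (0x3ULL << PROPBASER_SHAREABILITY_SHIFT)

#define CACHE_NON_CACHEABLE   0x1ULL   /* 0x0 is Device-nGnRnE */
#define CACHE_RaWaWb          0x7ULL
#define SHARE_INNER           0x1ULL   /* 0x0 is Non-shareable */

/*
 * Decide from the value read back from the register whether every
 * property table update has to be followed by a cache clean.
 */
static inline bool proptable_needs_clean(uint64_t readback)
{
    uint64_t inner = (readback & PROPBASER_INNER_CACHE_MASK) >>
                     PROPBASER_INNER_CACHE_SHIFT;
    uint64_t share = (readback & PROPBASER_SHAREABILITY_MASK) >>
                     PROPBASER_SHAREABILITY_SHIFT;

    return share == 0 || inner <= CACHE_NON_CACHEABLE;
}

void proptable_needs_clean_example(void)
{
    uint64_t wanted = (CACHE_RaWaWb << PROPBASER_INNER_CACHE_SHIFT) |
                      (SHARE_INNER << PROPBASER_SHAREABILITY_SHIFT);

    /* The attributes stuck: coherent accesses, no explicit clean needed. */
    assert(!proptable_needs_clean(wanted));
    /* The redistributor dropped shareability: clean after each update. */
    assert(proptable_needs_clean(wanted & ~PROPBASER_SHAREABILITY_MASK));
}
```

The result would be stored once at init time and consulted wherever Xen 
updates an LPI's property byte.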

> +
> +    table_reg = gicv3_lpi_get_proptable();
> +    if ( !table_reg )
> +        return -ENOMEM;
> +    writeq_relaxed(table_reg, rdist_base + GICR_PROPBASER);

See all my remarks above.

> +
> +    return 0;
> +}
> +
>  static int __init gicv3_populate_rdist(void)
>  {
>      int i;
> @@ -658,6 +688,20 @@ static int __init gicv3_populate_rdist(void)
>              if ( (typer >> 32) == aff )
>              {
>                  this_cpu(rbase) = ptr;
> +
> +                if ( typer & GICR_TYPER_PLPIS )
> +                {
> +                    int ret;
> +
> +                    ret = gicv3_rdist_init_lpis(ptr);
> +                    if ( ret && ret != -ENODEV )
> +                    {
> +                        printk("GICv3: CPU%d: Cannot initialize LPIs: %d\n",

CPU%u

> +                               smp_processor_id(), ret);
> +                        break;
> +                    }
> +                }
> +
>                  printk("GICv3: CPU%d: Found redistributor in region %d @%p\n",
>                          smp_processor_id(), i, ptr);
>                  return 0;
> diff --git a/xen/include/asm-arm/bitops.h b/xen/include/asm-arm/bitops.h
> index bda8898..1cbfb9e 100644
> --- a/xen/include/asm-arm/bitops.h
> +++ b/xen/include/asm-arm/bitops.h
> @@ -24,6 +24,7 @@
>  #define BIT(nr)                 (1UL << (nr))
>  #define BIT_MASK(nr)            (1UL << ((nr) % BITS_PER_WORD))
>  #define BIT_WORD(nr)            ((nr) / BITS_PER_WORD)
> +#define BIT_ULL(nr)             (1ULL << (nr))
>  #define BITS_PER_BYTE           8
>
>  #define ADDR (*(volatile int *) addr)
> diff --git a/xen/include/asm-arm/gic.h b/xen/include/asm-arm/gic.h
> index 836a103..12bd155 100644
> --- a/xen/include/asm-arm/gic.h
> +++ b/xen/include/asm-arm/gic.h
> @@ -220,6 +220,8 @@ enum gic_version {
>      GIC_V3,
>  };
>
> +#define LPI_OFFSET      8192
> +

It would make more sense to move this definition to irq.h, close to 
NR_IRQS.

Also, I am a bit surprised that NR_IRQS & co have not been modified. Is 
there any reason for that?

Regards,

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [PATCH 04/28] ARM: GICv3 ITS: allocate device and collection table
  2017-01-30 18:31 ` [PATCH 04/28] ARM: GICv3 ITS: allocate device and collection table Andre Przywara
@ 2017-02-06 17:19   ` Julien Grall
  2017-02-14  0:55     ` Stefano Stabellini
  2017-02-06 17:36   ` Julien Grall
                     ` (4 subsequent siblings)
  5 siblings, 1 reply; 106+ messages in thread
From: Julien Grall @ 2017-02-06 17:19 UTC (permalink / raw)
  To: Andre Przywara, Stefano Stabellini; +Cc: xen-devel, Vijay Kilari

Hi Andre,

On 30/01/17 18:31, Andre Przywara wrote:
> Each ITS maps a pair of a DeviceID (usually the PCI b/d/f triplet) and
> an EventID (the MSI payload or interrupt ID) to a pair of LPI number
> and collection ID, which points to the target CPU.
> This mapping is stored in the device and collection tables, which software
> has to provide for the ITS to use.
> Allocate the required memory and hand it the ITS.

s/hand it the/hand it to the/ ?

> The maximum number of devices is limited to a compile-time constant
> exposed in Kconfig.
>
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> ---
>  xen/arch/arm/Kconfig             |  14 +++++
>  xen/arch/arm/gic-v3-its.c        | 129 +++++++++++++++++++++++++++++++++++++++
>  xen/arch/arm/gic-v3.c            |   5 ++
>  xen/include/asm-arm/gic_v3_its.h |  55 ++++++++++++++++-
>  4 files changed, 202 insertions(+), 1 deletion(-)
>
> diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
> index 71734a1..81bc233 100644
> --- a/xen/arch/arm/Kconfig
> +++ b/xen/arch/arm/Kconfig
> @@ -64,6 +64,20 @@ config MAX_PHYS_LPI_BITS
>            This can be overriden on the command line with the max_lpi_bits
>            parameter.
>
> +config MAX_PHYS_ITS_DEVICE_BITS
> +        depends on HAS_ITS
> +        int "Number of device bits the ITS supports"
> +        range 1 32
> +        default "10"
> +        help
> +          Specifies the maximum number of devices which want to use the ITS.
> +          Xen needs to allocates memory for the whole range very early.
> +          The allocation scheme may be sparse, so a much larger number must
> +          be supported to cover devices with a high bus number or those on
> +          separate bus segments.

As with MAX_PHYS_LPI_BITS, this Kconfig option is questionable. The 
default value is arbitrary and may not fit everyone.

The way forward is to use the 2-level table, if available, to limit the 
memory usage. If only a flat table is supported, then the user can use 
the command line option to limit it.
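
To give an idea of the difference, a back-of-the-envelope sketch. The 
function names, the 8-byte device table entry and the 64KB ITS page are 
assumptions picked for the example; the 8-byte level 1 entry is what the 
GICv3 indirect table format uses. Rounding of the level 1 table up to a 
full ITS page is ignored here.

```c
#include <assert.h>
#include <stdint.h>

/* Flat table: one entry for every possible DeviceID, allocated up front. */
static inline uint64_t flat_table_bytes(unsigned int devid_bits,
                                        unsigned int entry_size)
{
    assert(devid_bits <= 32);
    return (1ULL << devid_bits) * entry_size;
}

/*
 * Two-level table: level 1 holds one 8-byte entry per level 2 page.
 * Level 2 pages are only allocated for DeviceIDs that are in use.
 */
static inline uint64_t l1_table_bytes(unsigned int devid_bits,
                                      unsigned int entry_size,
                                      uint64_t its_page_size)
{
    uint64_t ids_per_page = its_page_size / entry_size;
    uint64_t nr_ids = 1ULL << devid_bits;

    assert(devid_bits <= 32 && ids_per_page > 0);
    return ((nr_ids + ids_per_page - 1) / ids_per_page) * 8;
}

void device_table_size_example(void)
{
    /* 20 DeviceID bits, 8-byte entries: 8MB flat vs. a 1KB level 1 table. */
    assert(flat_table_bytes(20, 8) == 8ULL * 1024 * 1024);
    assert(l1_table_bytes(20, 8, 64 * 1024) == 1024);
}
```

With the indirect format, each populated 64KB level 2 page then covers 8192 
DeviceIDs, so a sparse DeviceID space costs only a handful of pages.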

> +          This can be overriden on the command line with the max_its_device_bits

s/overriden/overridden/

> +          parameter.
> +
>  endmenu
>
>  menu "ARM errata workaround via the alternative framework"
> diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
> index ff0f571..c31fef6 100644
> --- a/xen/arch/arm/gic-v3-its.c
> +++ b/xen/arch/arm/gic-v3-its.c
> @@ -20,9 +20,138 @@
>  #include <xen/lib.h>
>  #include <xen/device_tree.h>
>  #include <xen/libfdt/libfdt.h>
> +#include <xen/mm.h>
> +#include <xen/sizes.h>
>  #include <asm/gic.h>
>  #include <asm/gic_v3_defs.h>
>  #include <asm/gic_v3_its.h>
> +#include <asm/io.h>
> +
> +#define BASER_ATTR_MASK                                           \
> +        ((0x3UL << GITS_BASER_SHAREABILITY_SHIFT)               | \
> +         (0x7UL << GITS_BASER_OUTER_CACHEABILITY_SHIFT)         | \
> +         (0x7UL << GITS_BASER_INNER_CACHEABILITY_SHIFT))
> +#define BASER_RO_MASK   (GENMASK(58, 56) | GENMASK(52, 48))
> +
> +static uint64_t encode_phys_addr(paddr_t addr, int page_bits)
> +{
> +    uint64_t ret;
> +
> +    if ( page_bits < 16 )
> +        return (uint64_t)addr & GENMASK(47, page_bits);
> +
> +    ret = addr & GENMASK(47, 16);
> +    return ret | (addr & GENMASK(51, 48)) >> (48 - 12);
> +}
> +
> +#define PAGE_BITS(sz) ((sz) * 2 + PAGE_SHIFT)

I know that PAGE_SHIFT was suggested by Stefano on the previous 
version. However, I think this is wrong. PAGE_BITS is not based on the 
page granularity of Xen, so I would much prefer to keep a hardcoded 12 
with a comment.
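
Something along these lines (the macro name is only a suggestion):

```c
#include <assert.h>

/*
 * GITS_BASER.Page_Size encoding: 0 = 4KB, 1 = 16KB, 2 = 64KB.
 * The 12 is the shift of the smallest ITS page size (4KB). It is a
 * property of the ITS and unrelated to Xen's own PAGE_SHIFT.
 */
#define ITS_PAGE_BITS(sz)   ((sz) * 2 + 12)

void its_page_bits_example(void)
{
    assert(ITS_PAGE_BITS(0) == 12);   /* 4KB  */
    assert(ITS_PAGE_BITS(1) == 14);   /* 16KB */
    assert(ITS_PAGE_BITS(2) == 16);   /* 64KB */
}
```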

> +
> +static int its_map_baser(void __iomem *basereg, uint64_t regc, int nr_items)

s/int/unsigned int/

> +{
> +    uint64_t attr, reg;
> +    int entry_size = ((regc >> GITS_BASER_ENTRY_SIZE_SHIFT) & 0x1f) + 1;

s/int/unsigned int/

> +    int pagesz = 0, order, table_size;

s/int/unsigned int/

> +    void *buffer = NULL;
> +
> +    attr  = GIC_BASER_InnerShareable << GITS_BASER_SHAREABILITY_SHIFT;
> +    attr |= GIC_BASER_CACHE_SameAsInner << GITS_BASER_OUTER_CACHEABILITY_SHIFT;
> +    attr |= GIC_BASER_CACHE_RaWaWb << GITS_BASER_INNER_CACHEABILITY_SHIFT;
> +
> +    /*
> +     * Setup the BASE register with the attributes that we like. Then read
> +     * it back and see what sticks (page size, cacheability and shareability
> +     * attributes), retrying if necessary.
> +     */
> +    while ( 1 )

This loop is really confusing to read. A set of gotos would probably 
make it more readable thanks to the labels.

> +    {
> +        table_size = ROUNDUP(nr_items * entry_size, BIT(PAGE_BITS(pagesz)));
> +        order = get_order_from_bytes(table_size);
> +
> +        if ( !buffer )
> +            buffer = alloc_xenheap_pages(order, 0);
> +        if ( !buffer )
> +            return -ENOMEM;
> +
> +        reg  = attr;
> +        reg |= (pagesz << GITS_BASER_PAGE_SIZE_SHIFT);
> +        reg |= table_size >> PAGE_BITS(pagesz);
> +        reg |= regc & BASER_RO_MASK;
> +        reg |= GITS_VALID_BIT;
> +        reg |= encode_phys_addr(virt_to_maddr(buffer), PAGE_BITS(pagesz));
> +
> +        writeq_relaxed(reg, basereg);
> +        regc = readl_relaxed(basereg);
> +
> +        /* The host didn't like our attributes, just use what it returned. */
> +        if ( (regc & BASER_ATTR_MASK) != attr )
> +        {
> +            /* If we can't map it shareable, drop cacheability as well. */
> +            if ( (regc & GITS_BASER_SHAREABILITY_MASK) == GIC_BASER_NonShareable )
> +            {
> +                regc &= ~GITS_BASER_INNER_CACHEABILITY_MASK;

So you drop cacheability, but you never clean & invalidate the cache. 
Is that intended?

> +                attr = regc & BASER_ATTR_MASK;
> +                continue;
> +            }
> +            attr = regc & BASER_ATTR_MASK;
> +        }
> +
> +        /* If the host accepted our page size, we are done. */
> +        if ( (regc & (3UL << GITS_BASER_PAGE_SIZE_SHIFT)) == pagesz )

This check looks wrong to me. The page size is encoded in bits 9:8 but I 
don't see any shift here.

Also a mask for the field would be useful.
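
A sketch of the comparison with the field extracted first. 
GITS_BASER_PAGE_SIZE_MASK and the helper name are assumptions; the shift of 
8 matches the "bits 9:8" mentioned above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define GITS_BASER_PAGE_SIZE_SHIFT  8
#define GITS_BASER_PAGE_SIZE_MASK   (0x3ULL << GITS_BASER_PAGE_SIZE_SHIFT)

/* True if the page size read back from the register is the one we asked for. */
static inline bool baser_page_size_accepted(uint64_t regc, unsigned int pagesz)
{
    return ((regc & GITS_BASER_PAGE_SIZE_MASK) >>
            GITS_BASER_PAGE_SIZE_SHIFT) == pagesz;
}

void baser_page_size_example(void)
{
    uint64_t regc = 2ULL << GITS_BASER_PAGE_SIZE_SHIFT;   /* 64KB pages */

    assert(baser_page_size_accepted(regc, 2));
    assert(!baser_page_size_accepted(regc, 0));
    /* The unshifted comparison from the patch never matches for pagesz > 0. */
    assert((regc & GITS_BASER_PAGE_SIZE_MASK) != 2);
}
```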

> +            return 0;
> +
> +        /* None of the page sizes was accepted, give up */
> +        if ( pagesz >= 2 )
> +            break;
> +
> +        free_xenheap_pages(buffer, order);
> +        buffer = NULL;

If you move the check "if ( pagesz >= 2 )" here ...

> +
> +        pagesz++;
> +    }
> +
> +    if ( buffer )
> +        free_xenheap_pages(buffer, order);

... those 2 lines could be dropped.
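
To show the shape I am thinking of, here is a stripped-down sketch with a 
retry label and a single free on the failure path. Everything in it is a 
stand-in made up for the example: fake_baser_write_read() replaces the 
register write and read-back, malloc()/free() replace the xenheap calls, and 
the attribute handling is left out entirely.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define PAGE_SIZE_SHIFT   8
#define PAGE_SIZE_MASK    (0x3ULL << PAGE_SIZE_SHIFT)
#define ITS_PAGE_BITS(sz) ((sz) * 2 + 12)

/* Stand-in for the hardware: this fake ITS insists on one page size. */
static uint64_t fake_hw_page_size = 2;

static uint64_t fake_baser_write_read(uint64_t reg)
{
    return (reg & ~PAGE_SIZE_MASK) | (fake_hw_page_size << PAGE_SIZE_SHIFT);
}

/* Returns the accepted page size encoding, or -1 on failure. */
int its_probe_page_size(size_t nr_bytes)
{
    unsigned int pagesz = 0;
    size_t page, table_size;
    uint64_t regc;
    void *buffer;

retry:
    /* The table size depends on the page size, so allocate per attempt. */
    page = (size_t)1 << ITS_PAGE_BITS(pagesz);
    table_size = ((nr_bytes + page - 1) / page) * page;
    buffer = malloc(table_size);
    if ( !buffer )
        return -1;

    regc = fake_baser_write_read((uint64_t)pagesz << PAGE_SIZE_SHIFT);
    if ( ((regc & PAGE_SIZE_MASK) >> PAGE_SIZE_SHIFT) == pagesz )
    {
        free(buffer);   /* demo only: the real code keeps the table */
        return (int)pagesz;
    }

    /* Single cleanup point: nothing is left to free after giving up. */
    free(buffer);
    if ( pagesz >= 2 )
        return -1;

    pagesz++;
    goto retry;
}
```

With the fake hardware above accepting only 64KB pages, the function tries 
4KB and 16KB first and settles on encoding 2 on the third attempt.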

> +
> +    return -EINVAL;
> +}
> +
> +static unsigned int max_its_device_bits = CONFIG_MAX_PHYS_ITS_DEVICE_BITS;
> +integer_param("max_its_device_bits", max_its_device_bits);

This new command line option needs to be documented in 
docs/misc/xen-command-line.markdown.

> +
> +int gicv3_its_init(struct host_its *hw_its)
> +{
> +    uint64_t reg;
> +    int i;
> +
> +    hw_its->its_base = ioremap_nocache(hw_its->addr, hw_its->size);
> +    if ( !hw_its->its_base )
> +        return -ENOMEM;
> +
> +    for ( i = 0; i < GITS_BASER_NR_REGS; i++ )
> +    {
> +        void __iomem *basereg = hw_its->its_base + GITS_BASER0 + i * 8;
> +        int type;
> +
> +        reg = readq_relaxed(basereg);
> +        type = (reg & GITS_BASER_TYPE_MASK) >> GITS_BASER_TYPE_SHIFT;
> +        switch ( type )
> +        {
> +        case GITS_BASER_TYPE_NONE:
> +            continue;
> +        case GITS_BASER_TYPE_DEVICE:
> +            /* TODO: find some better way of limiting the number of devices */
> +            its_map_baser(basereg, reg, BIT(max_its_device_bits));

You will waste space if the platform supports fewer DevID bits than 
max_its_device_bits.
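
A sketch of the clamping. The macro and helper names are assumptions for 
this example; GITS_TYPER.Devbits sits in bits [17:13] and holds the number 
of DeviceID bits minus one.

```c
#include <assert.h>
#include <stdint.h>

/* GITS_TYPER.Devbits: bits [17:13], number of DeviceID bits minus one. */
#define GITS_TYPER_DEVBITS_SHIFT  13
#define GITS_TYPER_DEVBITS_MASK   (0x1fULL << GITS_TYPER_DEVBITS_SHIFT)

static inline unsigned int its_hw_devid_bits(uint64_t typer)
{
    return (unsigned int)((typer & GITS_TYPER_DEVBITS_MASK) >>
                          GITS_TYPER_DEVBITS_SHIFT) + 1;
}

/* Never size the device table for more bits than the hardware implements. */
static inline unsigned int its_devid_bits(uint64_t typer, unsigned int max_bits)
{
    unsigned int hw_bits = its_hw_devid_bits(typer);

    return hw_bits < max_bits ? hw_bits : max_bits;
}

void its_devid_bits_example(void)
{
    /* Hardware implements 8 bits, the limit is 10: use 8. */
    assert(its_devid_bits(7ULL << GITS_TYPER_DEVBITS_SHIFT, 10) == 8);
    /* Hardware implements 20 bits, the limit is 10: use 10. */
    assert(its_devid_bits(19ULL << GITS_TYPER_DEVBITS_SHIFT, 10) == 10);
}
```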

> +            break;
> +        case GITS_BASER_TYPE_COLLECTION:
> +            its_map_baser(basereg, reg, NR_CPUS);
> +            break;
> +        default:
> +            continue;
> +        }
> +    }
> +
> +    return 0;
> +}
>
>  /* Scan the DT for any ITS nodes and create a list of host ITSes out of it. */
>  void gicv3_its_dt_init(const struct dt_device_node *node)
> diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
> index fcb86c8..440c079 100644
> --- a/xen/arch/arm/gic-v3.c
> +++ b/xen/arch/arm/gic-v3.c
> @@ -29,6 +29,7 @@
>  #include <xen/irq.h>
>  #include <xen/iocap.h>
>  #include <xen/sched.h>
> +#include <xen/err.h>
>  #include <xen/errno.h>
>  #include <xen/delay.h>
>  #include <xen/device_tree.h>
> @@ -1563,6 +1564,7 @@ static int __init gicv3_init(void)
>  {
>      int res, i;
>      uint32_t reg;
> +    struct host_its *hw_its;
>
>      if ( !cpu_has_gicv3 )
>      {
> @@ -1618,6 +1620,9 @@ static int __init gicv3_init(void)
>      res = gicv3_cpu_init();
>      gicv3_hyp_init();
>
> +    list_for_each_entry(hw_its, &host_its_list, entry)
> +        gicv3_its_init(hw_its);
> +

This loop could be handled in gic-v3-its.c.

>      spin_unlock(&gicv3.lock);
>
>      return res;

[...]

>  #ifndef __ASSEMBLY__
>  #include <xen/device_tree.h>
>
> @@ -27,6 +74,7 @@ struct host_its {
>      const struct dt_device_node *dt_node;
>      paddr_t addr;
>      paddr_t size;
> +    void __iomem *its_base;
>  };
>
>  extern struct list_head host_its_list;
> @@ -42,8 +90,9 @@ void gicv3_its_dt_init(const struct dt_device_node *node);
>  uint64_t gicv3_lpi_get_proptable(void);
>  uint64_t gicv3_lpi_allocate_pendtable(void);
>
> -/* Initialize the host structures for LPIs. */
> +/* Initialize the host structures for LPIs and the host ITSes. */
>  int gicv3_lpi_init_host_lpis(unsigned int nr_lpis);
> +int gicv3_its_init(struct host_its *hw_its);
>
>  #else
>
> @@ -62,6 +111,10 @@ static inline int gicv3_lpi_init_host_lpis(unsigned int nr_lpis)
>  {
>      return 0;
>  }

Newline here.

> +static inline int gicv3_its_init(struct host_its *hw_its)
> +{
> +    return 0;
> +}

Newline here.

>  #endif /* CONFIG_HAS_ITS */
>
>  #endif /* __ASSEMBLY__ */
>

Regards,

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [PATCH 04/28] ARM: GICv3 ITS: allocate device and collection table
  2017-01-30 18:31 ` [PATCH 04/28] ARM: GICv3 ITS: allocate device and collection table Andre Przywara
  2017-02-06 17:19   ` Julien Grall
@ 2017-02-06 17:36   ` Julien Grall
  2017-02-06 17:43   ` Julien Grall
                     ` (3 subsequent siblings)
  5 siblings, 0 replies; 106+ messages in thread
From: Julien Grall @ 2017-02-06 17:36 UTC (permalink / raw)
  To: Andre Przywara, Stefano Stabellini; +Cc: xen-devel, Vijay Kilari



On 30/01/17 18:31, Andre Przywara wrote:
> diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
> index fcb86c8..440c079 100644
> --- a/xen/arch/arm/gic-v3.c
> +++ b/xen/arch/arm/gic-v3.c
> @@ -29,6 +29,7 @@
>  #include <xen/irq.h>
>  #include <xen/iocap.h>
>  #include <xen/sched.h>
> +#include <xen/err.h>
>  #include <xen/errno.h>
>  #include <xen/delay.h>
>  #include <xen/device_tree.h>
> @@ -1563,6 +1564,7 @@ static int __init gicv3_init(void)
>  {
>      int res, i;
>      uint32_t reg;
> +    struct host_its *hw_its;
>
>      if ( !cpu_has_gicv3 )
>      {
> @@ -1618,6 +1620,9 @@ static int __init gicv3_init(void)
>      res = gicv3_cpu_init();
>      gicv3_hyp_init();
>
> +    list_for_each_entry(hw_its, &host_its_list, entry)
> +        gicv3_its_init(hw_its);

Also, it is probably a really bad idea to ignore any error from the ITS 
without even printing a message. For the next version, I would add more 
error messages to quickly find out what's going on from the log.
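
Roughly like this. The sketch is self-contained and therefore uses 
stand-ins: struct fake_its, fake_its_init() and fprintf() are assumptions 
replacing struct host_its, gicv3_its_init() and printk().

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Stand-in for struct host_its; init_result simulates the init outcome. */
struct fake_its {
    unsigned long addr;
    int init_result;
};

static int fake_its_init(const struct fake_its *its)
{
    return its->init_result;
}

/* Initialize every ITS, reporting and propagating the first failure. */
int init_all_its(const struct fake_its *its, unsigned int nr)
{
    unsigned int i;

    for ( i = 0; i < nr; i++ )
    {
        int ret = fake_its_init(&its[i]);

        if ( ret )
        {
            fprintf(stderr, "GICv3: ITS@%#lx: initialization failed: %d\n",
                    its[i].addr, ret);
            return ret;
        }
    }

    return 0;
}

void init_all_its_example(void)
{
    const struct fake_its list[] = {
        { 0x1000, 0 },
        { 0x2000, -ENOMEM },
    };

    assert(init_all_its(list, 1) == 0);
    assert(init_all_its(list, 2) == -ENOMEM);
}
```

Whether a failing ITS should abort gicv3_init or merely be skipped is a 
separate question, but either way the log should say which ITS failed and 
why.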

Cheers,

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [PATCH 04/28] ARM: GICv3 ITS: allocate device and collection table
  2017-01-30 18:31 ` [PATCH 04/28] ARM: GICv3 ITS: allocate device and collection table Andre Przywara
  2017-02-06 17:19   ` Julien Grall
  2017-02-06 17:36   ` Julien Grall
@ 2017-02-06 17:43   ` Julien Grall
  2017-03-23 18:06     ` Andre Przywara
  2017-02-14  0:54   ` Stefano Stabellini
                     ` (2 subsequent siblings)
  5 siblings, 1 reply; 106+ messages in thread
From: Julien Grall @ 2017-02-06 17:43 UTC (permalink / raw)
  To: Andre Przywara, Stefano Stabellini; +Cc: xen-devel, Vijay Kilari

Hi,

On 30/01/17 18:31, Andre Przywara wrote:
> +int gicv3_its_init(struct host_its *hw_its)
> +{
> +    uint64_t reg;
> +    int i;
> +
> +    hw_its->its_base = ioremap_nocache(hw_its->addr, hw_its->size);
> +    if ( !hw_its->its_base )
> +        return -ENOMEM;
> +
> +    for ( i = 0; i < GITS_BASER_NR_REGS; i++ )
> +    {
> +        void __iomem *basereg = hw_its->its_base + GITS_BASER0 + i * 8;
> +        int type;
> +
> +        reg = readq_relaxed(basereg);
> +        type = (reg & GITS_BASER_TYPE_MASK) >> GITS_BASER_TYPE_SHIFT;
> +        switch ( type )
> +        {
> +        case GITS_BASER_TYPE_NONE:
> +            continue;
> +        case GITS_BASER_TYPE_DEVICE:
> +            /* TODO: find some better way of limiting the number of devices */
> +            its_map_baser(basereg, reg, BIT(max_its_device_bits));
> +            break;
> +        case GITS_BASER_TYPE_COLLECTION:
> +            its_map_baser(basereg, reg, NR_CPUS);

I forgot to mention the collection table. Same remark as for the device 
table: NR_CPUS is only the maximum size.

Cheers,

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [PATCH 05/28] ARM: GICv3 ITS: map ITS command buffer
  2017-01-30 18:31 ` [PATCH 05/28] ARM: GICv3 ITS: map ITS command buffer Andre Przywara
@ 2017-02-06 17:43   ` Julien Grall
  2017-02-14  0:59   ` Stefano Stabellini
  1 sibling, 0 replies; 106+ messages in thread
From: Julien Grall @ 2017-02-06 17:43 UTC (permalink / raw)
  To: Andre Przywara, Stefano Stabellini; +Cc: xen-devel, Vijay Kilari

Hi Andre,

On 30/01/17 18:31, Andre Przywara wrote:
> Instead of directly manipulating the tables in memory, an ITS driver
> sends commands via a ring buffer to the ITS h/w to create or alter the
> LPI mappings.
> Allocate memory for that buffer and tell the ITS about it to be able
> to send ITS commands.
>
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> ---
>  xen/arch/arm/gic-v3-its.c        | 46 ++++++++++++++++++++++++++++++++++++++++
>  xen/include/asm-arm/gic_v3_its.h |  6 ++++++
>  2 files changed, 52 insertions(+)
>
> diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
> index c31fef6..ad7cd2a 100644
> --- a/xen/arch/arm/gic-v3-its.c
> +++ b/xen/arch/arm/gic-v3-its.c
> @@ -27,6 +27,8 @@
>  #include <asm/gic_v3_its.h>
>  #include <asm/io.h>
>
> +#define ITS_CMD_QUEUE_SZ                SZ_64K
> +
>  #define BASER_ATTR_MASK                                           \
>          ((0x3UL << GITS_BASER_SHAREABILITY_SHIFT)               | \
>           (0x7UL << GITS_BASER_OUTER_CACHEABILITY_SHIFT)         | \
> @@ -44,6 +46,45 @@ static uint64_t encode_phys_addr(paddr_t addr, int page_bits)
>      return ret | (addr & GENMASK(51, 48)) >> (48 - 12);
>  }
>
> +static void *its_map_cbaser(struct host_its *its)
> +{
> +    void __iomem *cbasereg = its->its_base + GITS_CBASER;
> +    uint64_t reg, regc;
> +    void *buffer;
> +    paddr_t paddr;
> +
> +    reg  = GIC_BASER_InnerShareable << GITS_BASER_SHAREABILITY_SHIFT;
> +    reg |= GIC_BASER_CACHE_SameAsInner << GITS_BASER_OUTER_CACHEABILITY_SHIFT;
> +    reg |= GIC_BASER_CACHE_RaWaWb << GITS_BASER_INNER_CACHEABILITY_SHIFT;
> +
> +    buffer = _xzalloc(ITS_CMD_QUEUE_SZ, PAGE_SIZE);

s/PAGE_SIZE/SZ_4K/

> +    if ( !buffer )
> +        return NULL;
> +    paddr = virt_to_maddr(buffer);
> +    ASSERT(!(paddr & ~GENMASK(51, 12)));

Please remove the ASSERT (see why on patch #3).

> +
> +    reg |= GITS_VALID_BIT | paddr;
> +    reg |= ((ITS_CMD_QUEUE_SZ / PAGE_SIZE) - 1) & GITS_CBASER_SIZE_MASK;
> +    writeq_relaxed(reg, cbasereg);
> +    regc = readq_relaxed(cbasereg);

You could re-use reg rather than introduce a new variable regc. This 
would avoid some confusion on which one to use.

> +
> +    /* If the ITS dropped shareability, drop cacheability as well. */
> +    if ( (regc & GITS_BASER_SHAREABILITY_MASK) == 0 )
> +    {
> +        regc &= ~GITS_BASER_INNER_CACHEABILITY_MASK;
> +        writeq_relaxed(regc, cbasereg);
> +    }
> +
> +    /*
> +     * If the command queue memory is mapped as uncached, we need to flush
> +     * it on every access.
> +     */
> +    if ( !(regc & GITS_BASER_INNER_CACHEABILITY_MASK) )
> +        its->flags |= HOST_ITS_FLUSH_CMD_QUEUE;

Could you add a print to inform the user about the cache flush? It is 
something quite useful to know in the log.

> +
> +    return buffer;
> +}
> +
>  #define PAGE_BITS(sz) ((sz) * 2 + PAGE_SHIFT)
>
>  static int its_map_baser(void __iomem *basereg, uint64_t regc, int nr_items)
> @@ -150,6 +191,11 @@ int gicv3_its_init(struct host_its *hw_its)
>          }
>      }
>
> +    hw_its->cmd_buf = its_map_cbaser(hw_its);
> +    if ( !hw_its->cmd_buf )
> +        return -ENOMEM;
> +    writeq_relaxed(0, hw_its->its_base + GITS_CWRITER);
> +
>      return 0;
>  }
>
> diff --git a/xen/include/asm-arm/gic_v3_its.h b/xen/include/asm-arm/gic_v3_its.h
> index ed44bdb..ff5572f 100644
> --- a/xen/include/asm-arm/gic_v3_its.h
> +++ b/xen/include/asm-arm/gic_v3_its.h
> @@ -65,9 +65,13 @@
>  #define GITS_BASER_OUTER_CACHEABILITY_MASK   (0x7ULL << GITS_BASER_OUTER_CACHEABILITY_SHIFT)
>  #define GITS_BASER_INNER_CACHEABILITY_MASK   (0x7ULL << GITS_BASER_INNER_CACHEABILITY_SHIFT)
>
> +#define GITS_CBASER_SIZE_MASK           0xff
> +
>  #ifndef __ASSEMBLY__
>  #include <xen/device_tree.h>
>
> +#define HOST_ITS_FLUSH_CMD_QUEUE        (1U << 0)
> +
>  /* data structure for each hardware ITS */
>  struct host_its {
>      struct list_head entry;
> @@ -75,6 +79,8 @@ struct host_its {
>      paddr_t addr;
>      paddr_t size;
>      void __iomem *its_base;
> +    void *cmd_buf;
> +    unsigned int flags;
>  };
>
>  extern struct list_head host_its_list;
>

Regards,

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [PATCH 06/28] ARM: GICv3 ITS: introduce ITS command handling
  2017-01-30 18:31 ` [PATCH 06/28] ARM: GICv3 ITS: introduce ITS command handling Andre Przywara
@ 2017-02-06 19:16   ` Julien Grall
  2017-02-07 11:44     ` Julien Grall
  2017-03-07 18:08     ` Andre Przywara
  2017-02-07 11:59   ` Julien Grall
  1 sibling, 2 replies; 106+ messages in thread
From: Julien Grall @ 2017-02-06 19:16 UTC (permalink / raw)
  To: Andre Przywara, Stefano Stabellini; +Cc: xen-devel, Vijay Kilari

Hi Andre,

On 30/01/17 18:31, Andre Przywara wrote:
> To be able to easily send commands to the ITS, create the respective
> wrapper functions, which take care of the ring buffer.
> The first two commands we implement provide methods to map a collection
> to a redistributor (aka host core) and to flush the command queue (SYNC).
> Start using these commands for mapping one collection to each host CPU.
>
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> ---
>  xen/arch/arm/gic-v3-its.c         | 142 +++++++++++++++++++++++++++++++++++++-
>  xen/arch/arm/gic-v3-lpi.c         |  20 ++++++
>  xen/arch/arm/gic-v3.c             |  18 ++++-
>  xen/include/asm-arm/gic_v3_defs.h |   2 +
>  xen/include/asm-arm/gic_v3_its.h  |  36 ++++++++++
>  5 files changed, 215 insertions(+), 3 deletions(-)
>
> diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
> index ad7cd2a..6578e8a 100644
> --- a/xen/arch/arm/gic-v3-its.c
> +++ b/xen/arch/arm/gic-v3-its.c
> @@ -19,6 +19,7 @@
>  #include <xen/config.h>
>  #include <xen/lib.h>
>  #include <xen/device_tree.h>
> +#include <xen/delay.h>
>  #include <xen/libfdt/libfdt.h>
>  #include <xen/mm.h>
>  #include <xen/sizes.h>
> @@ -29,6 +30,98 @@
>
>  #define ITS_CMD_QUEUE_SZ                SZ_64K
>
> +#define BUFPTR_MASK                     GENMASK(19, 5)
> +static int its_send_command(struct host_its *hw_its, const void *its_cmd)
> +{
> +    uint64_t readp, writep;
> +
> +    spin_lock(&hw_its->cmd_lock);

Do you never expect a command to be sent in an interrupt path? I can 
see at least one case: we may decide to throttle the number of LPIs 
received by a guest, which would involve disabling the interrupt.

> +
> +    readp = readq_relaxed(hw_its->its_base + GITS_CREADR) & BUFPTR_MASK;
> +    writep = readq_relaxed(hw_its->its_base + GITS_CWRITER) & BUFPTR_MASK;
> +
> +    if ( ((writep + ITS_CMD_SIZE) % ITS_CMD_QUEUE_SZ) == readp )
> +    {

I looked at the whole series applied and there is no error message at 
all when the queue is full. This will make it difficult to see what's 
going on.

Furthermore, this limit could easily be reached if you decide to map a 
device with thousands of interrupts. For instance, the function 
gicv3_map_its_map_host_events will issue 2 commands per event (MAPTI 
and INV).

So how do you plan to address this?
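
To put a number on it, a small sketch of the ring arithmetic. The 32-byte 
command size is an assumption derived from the uint64_t cmd[4] arrays in 
this patch, and the helper names are made up for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define ITS_CMD_QUEUE_SZ  (64 * 1024)
#define ITS_CMD_SIZE      32

/* One slot always stays empty so that "full" and "empty" can be told apart. */
static inline bool its_queue_full(uint64_t readp, uint64_t writep)
{
    return ((writep + ITS_CMD_SIZE) % ITS_CMD_QUEUE_SZ) == readp;
}

/* Number of commands that can still be queued without overtaking readp. */
static inline unsigned int its_queue_free_slots(uint64_t readp, uint64_t writep)
{
    uint64_t used = (writep + ITS_CMD_QUEUE_SZ - readp) % ITS_CMD_QUEUE_SZ;

    return (unsigned int)((ITS_CMD_QUEUE_SZ - used) / ITS_CMD_SIZE) - 1;
}

void its_queue_example(void)
{
    /* An empty 64KB queue takes 2047 commands before it reports -EBUSY. */
    assert(its_queue_free_slots(0, 0) == 2047);
    /* 1024 events at 2 commands each already exceed that. */
    assert(1024 * 2 > its_queue_free_slots(0, 0));
    /* Write pointer one slot behind the read pointer: queue is full. */
    assert(its_queue_full(0, ITS_CMD_QUEUE_SZ - ITS_CMD_SIZE));
}
```

So unless the ITS drains the queue fast enough, mapping a device with a 
four-digit number of events in one go needs either waiting for free slots 
or batching on the caller's side.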

> +        spin_unlock(&hw_its->cmd_lock);
> +        return -EBUSY;
> +    }
> +
> +    memcpy(hw_its->cmd_buf + writep, its_cmd, ITS_CMD_SIZE);
> +    if ( hw_its->flags & HOST_ITS_FLUSH_CMD_QUEUE )
> +        __flush_dcache_area(hw_its->cmd_buf + writep, ITS_CMD_SIZE);

Please use dcache_.... helpers.

> +    else
> +        dsb(ishst);
> +
> +    writep = (writep + ITS_CMD_SIZE) % ITS_CMD_QUEUE_SZ;
> +    writeq_relaxed(writep & BUFPTR_MASK, hw_its->its_base + GITS_CWRITER);
> +
> +    spin_unlock(&hw_its->cmd_lock);
> +
> +    return 0;
> +}
> +
> +static uint64_t encode_rdbase(struct host_its *hw_its, int cpu, uint64_t reg)

s/int cpu/unsigned int cpu/

> +{
> +    reg &= ~GENMASK(51, 16);
> +
> +    reg |= gicv3_get_redist_address(cpu, hw_its->flags & HOST_ITS_USES_PTA);
> +
> +    return reg;
> +}
> +
> +static int its_send_cmd_sync(struct host_its *its, int cpu)

s/int cpu/unsigned int cpu/

> +{
> +    uint64_t cmd[4];
> +
> +    cmd[0] = GITS_CMD_SYNC;
> +    cmd[1] = 0x00;
> +    cmd[2] = encode_rdbase(its, cpu, 0x0);
> +    cmd[3] = 0x00;
> +
> +    return its_send_command(its, cmd);
> +}
> +
> +static int its_send_cmd_mapc(struct host_its *its, int collection_id, int cpu)

s/int/unsigned int/ for both collection_id and cpu.

> +{
> +    uint64_t cmd[4];
> +
> +    cmd[0] = GITS_CMD_MAPC;
> +    cmd[1] = 0x00;
> +    cmd[2] = encode_rdbase(its, cpu, (collection_id & GENMASK(15, 0)));

Please drop the mask here.

> +    cmd[2] |= GITS_VALID_BIT;
> +    cmd[3] = 0x00;
> +
> +    return its_send_command(its, cmd);
> +}
> +
> +/* Set up the (1:1) collection mapping for the given host CPU. */
> +int gicv3_its_setup_collection(int cpu)

So you are calling this function from gicv3_rdist_init_lpis, which makes 
little sense to me. This should probably be called from gicv3_cpu_init.

> +{
> +    struct host_its *its;
> +    int ret;
> +
> +    list_for_each_entry(its, &host_its_list, entry)
> +    {
> +        /*
> +         * This function is called on CPU0 before any ITSes have been
> +         * properly initialized. Skip the collection setup in this case,
> +         * it will be done explicitly for CPU0 upon initializing the ITS.
> +         */

Looking at the code, I don't understand why you need to do that. AFAIU 
there is no restriction on initializing the ITS (e.g. calling 
gicv3_its_init) before gicv3_cpu_init.

> +        if ( !its->cmd_buf )
> +            continue;
> +
> +        ret = its_send_cmd_mapc(its, cpu, cpu);
> +        if ( ret )
> +            return ret;
> +
> +        ret = its_send_cmd_sync(its, cpu);
> +        if ( ret )
> +            return ret;
> +    }
> +
> +    return 0;
> +}
> +
>  #define BASER_ATTR_MASK                                           \
>          ((0x3UL << GITS_BASER_SHAREABILITY_SHIFT)               | \
>           (0x7UL << GITS_BASER_OUTER_CACHEABILITY_SHIFT)         | \
> @@ -156,18 +249,51 @@ static int its_map_baser(void __iomem *basereg, uint64_t regc, int nr_items)
>      return -EINVAL;
>  }
>
> +/* Wait for an ITS to become quiescient (all ITS operations completed). */

s/quiescient/quiescent/

> +static int gicv3_its_wait_quiescient(struct host_its *hw_its)

s/quiescient/quiescent/

> +{
> +    uint32_t reg;
> +    s_time_t deadline = NOW() + MILLISECS(1000);

So that sounds fine for handling a couple of commands, but what about 
thousands at the same time?

> +
> +    reg = readl_relaxed(hw_its->its_base + GITS_CTLR);
> +    if ( (reg & (GITS_CTLR_QUIESCENT | GITS_CTLR_ENABLE)) == GITS_CTLR_QUIESCENT )
> +        return 0;
> +
> +    writel_relaxed(reg & ~GITS_CTLR_ENABLE, hw_its->its_base + GITS_CTLR);
> +
> +    do {
> +        reg = readl_relaxed(hw_its->its_base + GITS_CTLR);
> +        if ( reg & GITS_CTLR_QUIESCENT )
> +            return 0;
> +
> +        cpu_relax();
> +        udelay(1);
> +    } while ( NOW() <= deadline );
> +
> +    dprintk(XENLOG_ERR, "ITS not quiescient\n");

s/quiescient/quiescent/ + newline.

> +    return -ETIMEDOUT;
> +}
> +

Cheers,

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [PATCH 06/28] ARM: GICv3 ITS: introduce ITS command handling
  2017-02-06 19:16   ` Julien Grall
@ 2017-02-07 11:44     ` Julien Grall
  2017-03-07 18:08     ` Andre Przywara
  1 sibling, 0 replies; 106+ messages in thread
From: Julien Grall @ 2017-02-07 11:44 UTC (permalink / raw)
  To: Andre Przywara, Stefano Stabellini; +Cc: xen-devel, Vijay Kilari

Hi Andre,

On 06/02/2017 19:16, Julien Grall wrote:
> On 30/01/17 18:31, Andre Przywara wrote:
>> +/* Wait for an ITS to become quiescient (all ITS operations
>> completed). */
>
> s/quiescient/quiescent/
>
>> +static int gicv3_its_wait_quiescient(struct host_its *hw_its)
>
> s/quiescient/quiescent/
>
>> +{
>> +    uint32_t reg;
>> +    s_time_t deadline = NOW() + MILLISECS(1000);
>
> So that sounds fine for handling a couple of command, but what about
> thousands at the same time?

Please ignore this question. I just noticed that I commented on the 
wrong function. Sorry for that.

>> +
>> +    reg = readl_relaxed(hw_its->its_base + GITS_CTLR);
>> +    if ( (reg & (GITS_CTLR_QUIESCENT | GITS_CTLR_ENABLE)) ==
>> GITS_CTLR_QUIESCENT )
>> +        return 0;
>> +
>> +    writel_relaxed(reg & ~GITS_CTLR_ENABLE, hw_its->its_base +
>> GITS_CTLR);
>> +
>> +    do {
>> +        reg = readl_relaxed(hw_its->its_base + GITS_CTLR);
>> +        if ( reg & GITS_CTLR_QUIESCENT )
>> +            return 0;
>> +
>> +        cpu_relax();
>> +        udelay(1);
>> +    } while ( NOW() <= deadline );
>> +
>> +    dprintk(XENLOG_ERR, "ITS not quiescient\n");
>
> s/quiescient/quiescent/ + newline.
>
>> +    return -ETIMEDOUT;
>> +}
>> +

Cheers,

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [PATCH 06/28] ARM: GICv3 ITS: introduce ITS command handling
  2017-01-30 18:31 ` [PATCH 06/28] ARM: GICv3 ITS: introduce ITS command handling Andre Przywara
  2017-02-06 19:16   ` Julien Grall
@ 2017-02-07 11:59   ` Julien Grall
  1 sibling, 0 replies; 106+ messages in thread
From: Julien Grall @ 2017-02-07 11:59 UTC (permalink / raw)
  To: Andre Przywara, Stefano Stabellini; +Cc: xen-devel, Vijay Kilari

Hi Andre,

Continuing the review where I left it yesterday.

On 30/01/2017 18:31, Andre Przywara wrote:

[...]

> +/* Wait for an ITS to become quiescient (all ITS operations completed). */
> +static int gicv3_its_wait_quiescient(struct host_its *hw_its)
> +{
> +    uint32_t reg;
> +    s_time_t deadline = NOW() + MILLISECS(1000);
> +
> +    reg = readl_relaxed(hw_its->its_base + GITS_CTLR);
> +    if ( (reg & (GITS_CTLR_QUIESCENT | GITS_CTLR_ENABLE)) == GITS_CTLR_QUIESCENT )

It would be clearer if you rewrote this as:

  (reg & GITS_CTLR_QUIESCENT) && !(reg & GITS_CTLR_ENABLE).
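
To illustrate that the two forms are interchangeable, here is a
stand-alone sketch. The helper names are made up for this example and
the bit definitions are plain stand-ins for the BIT() macros in the
patch:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-ins for the definitions in gic_v3_its.h. */
#define GITS_CTLR_QUIESCENT (1U << 31)
#define GITS_CTLR_ENABLE    (1U << 0)

/* Form used in the patch: mask both bits, compare against one. */
static bool its_idle_masked(uint32_t reg)
{
    return (reg & (GITS_CTLR_QUIESCENT | GITS_CTLR_ENABLE)) ==
           GITS_CTLR_QUIESCENT;
}

/* Suggested form: quiescent and not enabled. */
static bool its_idle_explicit(uint32_t reg)
{
    return (reg & GITS_CTLR_QUIESCENT) && !(reg & GITS_CTLR_ENABLE);
}
```

Both helpers agree for all four combinations of the two bits; the
second one simply spells out the intent.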

> +        return 0;
> +
> +    writel_relaxed(reg & ~GITS_CTLR_ENABLE, hw_its->its_base + GITS_CTLR);

It feels a bit odd to disable the ITS in a function containing "wait" in 
the name. You may want to rename the function to reflect the behavior.

> +
> +    do {
> +        reg = readl_relaxed(hw_its->its_base + GITS_CTLR);
> +        if ( reg & GITS_CTLR_QUIESCENT )
> +            return 0;
> +
> +        cpu_relax();
> +        udelay(1);
> +    } while ( NOW() <= deadline );
> +
> +    dprintk(XENLOG_ERR, "ITS not quiescient\n");
> +    return -ETIMEDOUT;
> +}
> +
>  static unsigned int max_its_device_bits = CONFIG_MAX_PHYS_ITS_DEVICE_BITS;
>  integer_param("max_its_device_bits", max_its_device_bits);
>
>  int gicv3_its_init(struct host_its *hw_its)
>  {
>      uint64_t reg;
> -    int i;
> +    int i, ret;
>
>      hw_its->its_base = ioremap_nocache(hw_its->addr, hw_its->size);
>      if ( !hw_its->its_base )
>          return -ENOMEM;
>
> +    ret = gicv3_its_wait_quiescient(hw_its);
> +    if ( ret )
> +        return ret;
> +
> +    reg = readq_relaxed(hw_its->its_base + GITS_TYPER);
> +    if ( reg & GITS_TYPER_PTA )
> +        hw_its->flags |= HOST_ITS_USES_PTA;
> +
>      for ( i = 0; i < GITS_BASER_NR_REGS; i++ )
>      {
>          void __iomem *basereg = hw_its->its_base + GITS_BASER0 + i * 8;
> @@ -196,6 +322,20 @@ int gicv3_its_init(struct host_its *hw_its)
>          return -ENOMEM;
>      writeq_relaxed(0, hw_its->its_base + GITS_CWRITER);
>

[...]

> diff --git a/xen/arch/arm/gic-v3-lpi.c b/xen/arch/arm/gic-v3-lpi.c
> index e2fc901..5911b91 100644
> --- a/xen/arch/arm/gic-v3-lpi.c
> +++ b/xen/arch/arm/gic-v3-lpi.c
> @@ -30,11 +30,31 @@ static struct {
>      unsigned int host_lpi_bits;
>  } lpi_data;
>
> +/* Physical redistributor address */
> +static DEFINE_PER_CPU(paddr_t, redist_addr);
> +/* Redistributor ID */
> +static DEFINE_PER_CPU(int, redist_id);

s/int/unsigned int/

>  /* Pending table for each redistributor */
>  static DEFINE_PER_CPU(void *, pending_table);

Rather than defining 3 per-cpu variables, could we merge them all into a
single structure?
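
Roughly what I have in mind, as a stand-alone sketch. The structure and
helper names are hypothetical, and a plain array stands in for a single
DEFINE_PER_CPU() variable so the example compiles outside of Xen:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_CPUS 4 /* stand-in for the real CPU count */

typedef uint64_t paddr_t;

/* Everything the LPI code needs to know about one redistributor. */
struct redist_info {
    paddr_t addr;          /* physical address of the redistributor */
    unsigned int id;       /* processor number from GICR_TYPER */
    void *pending_table;   /* LPI pending table for this redistributor */
};

/* One entry per CPU instead of three separate per-CPU variables. */
static struct redist_info redist_info[NR_CPUS];

static void set_redist_info(unsigned int cpu, paddr_t addr, unsigned int id)
{
    assert(cpu < NR_CPUS);
    redist_info[cpu].addr = addr;
    redist_info[cpu].id = id;
}

static uint64_t get_redist_address(unsigned int cpu, bool use_pta)
{
    assert(cpu < NR_CPUS);
    if ( use_pta )
        return redist_info[cpu].addr & 0x000fffffffff0000ULL; /* bits 51:16 */

    /* Widen before shifting so the result is computed in 64 bits. */
    return (uint64_t)redist_info[cpu].id << 16;
}
```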

>
>  #define MAX_PHYS_LPIS   (BIT_ULL(lpi_data.host_lpi_bits) - LPI_OFFSET)
>
> +/* Stores this redistributor's physical address and ID in a per-CPU variable */
> +void gicv3_set_redist_address(paddr_t address, int redist_id)

s/int/unsigned int/

> +{
> +    this_cpu(redist_addr) = address;
> +    this_cpu(redist_id) = redist_id;
> +}
> +
> +/* Returns a redistributor's ID (either as an address or as an ID) */
> +uint64_t gicv3_get_redist_address(int cpu, bool use_pta)

s/int/unsigned int/

> +{
> +    if ( use_pta )
> +        return per_cpu(redist_addr, cpu) & GENMASK(51, 16);
> +    else
> +        return per_cpu(redist_id, cpu) << 16;

What if the function is called before the CPU has been set up? If it
cannot happen, please document it.

> +}
> +
>  uint64_t gicv3_lpi_allocate_pendtable(void)
>  {
>      uint64_t reg;
> diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
> index 440c079..5f825a6 100644
> --- a/xen/arch/arm/gic-v3.c
> +++ b/xen/arch/arm/gic-v3.c
> @@ -644,7 +644,7 @@ static int gicv3_rdist_init_lpis(void __iomem * rdist_base)
>          return -ENOMEM;
>      writeq_relaxed(table_reg, rdist_base + GICR_PROPBASER);
>
> -    return 0;
> +    return gicv3_its_setup_collection(smp_processor_id());
>  }
>
>  static int __init gicv3_populate_rdist(void)
> @@ -692,7 +692,21 @@ static int __init gicv3_populate_rdist(void)
>
>                  if ( typer & GICR_TYPER_PLPIS )
>                  {
> -                    int ret;
> +                    paddr_t rdist_addr;
> +                    int procnum, ret;

procnum should be unsigned.

> +
> +                    rdist_addr = gicv3.rdist_regions[i].base;
> +                    rdist_addr += ptr - gicv3.rdist_regions[i].map_base;
> +                    procnum = (typer & GICR_TYPER_PROC_NUM_MASK);
> +                    procnum >>= GICR_TYPER_PROC_NUM_SHIFT;
> +
> +                    /*
> +                     * The ITS refers to redistributors either by their physical
> +                     * address or by their ID. Determine those two values and
> +                     * let the ITS code store them in per host CPU variables to
> +                     * later be able to address those redistributors.
> +                     */

This comment does not look useful and is misleading, as the code to
get/set the redistributor information lives in gic-v3-lpi.c and not
gic-v3-its.c.

> +                    gicv3_set_redist_address(rdist_addr, procnum);
>
>                      ret = gicv3_rdist_init_lpis(ptr);
>                      if ( ret && ret != -ENODEV )
> diff --git a/xen/include/asm-arm/gic_v3_defs.h b/xen/include/asm-arm/gic_v3_defs.h
> index b307322..878bae2 100644
> --- a/xen/include/asm-arm/gic_v3_defs.h
> +++ b/xen/include/asm-arm/gic_v3_defs.h
> @@ -101,6 +101,8 @@
>  #define GICR_TYPER_PLPIS             (1U << 0)
>  #define GICR_TYPER_VLPIS             (1U << 1)
>  #define GICR_TYPER_LAST              (1U << 4)
> +#define GICR_TYPER_PROC_NUM_SHIFT    8
> +#define GICR_TYPER_PROC_NUM_MASK     (0xffff << GICR_TYPER_PROC_NUM_SHIFT)
>
>  /* For specifying the inner cacheability type only */
>  #define GIC_BASER_CACHE_nCnB         0ULL
> diff --git a/xen/include/asm-arm/gic_v3_its.h b/xen/include/asm-arm/gic_v3_its.h
> index ff5572f..8288185 100644
> --- a/xen/include/asm-arm/gic_v3_its.h
> +++ b/xen/include/asm-arm/gic_v3_its.h
> @@ -40,6 +40,9 @@
>  #define GITS_CTLR_QUIESCENT             BIT(31)
>  #define GITS_CTLR_ENABLE                BIT(0)
>
> +#define GITS_TYPER_PTA                  BIT_ULL(19)
> +#define GITS_TYPER_IDBITS_SHIFT         8
> +
>  #define GITS_IIDR_VALUE                 0x34c
>
>  #define GITS_BASER_INDIRECT             BIT_ULL(62)
> @@ -67,10 +70,27 @@
>
>  #define GITS_CBASER_SIZE_MASK           0xff
>
> +/* ITS command definitions */
> +#define ITS_CMD_SIZE                    32
> +
> +#define GITS_CMD_MOVI                   0x01
> +#define GITS_CMD_INT                    0x03
> +#define GITS_CMD_CLEAR                  0x04
> +#define GITS_CMD_SYNC                   0x05
> +#define GITS_CMD_MAPD                   0x08
> +#define GITS_CMD_MAPC                   0x09
> +#define GITS_CMD_MAPTI                  0x0a
> +#define GITS_CMD_MAPI                   0x0b
> +#define GITS_CMD_INV                    0x0c
> +#define GITS_CMD_INVALL                 0x0d
> +#define GITS_CMD_MOVALL                 0x0e
> +#define GITS_CMD_DISCARD                0x0f
> +
>  #ifndef __ASSEMBLY__
>  #include <xen/device_tree.h>
>
>  #define HOST_ITS_FLUSH_CMD_QUEUE        (1U << 0)
> +#define HOST_ITS_USES_PTA               (1U << 1)
>
>  /* data structure for each hardware ITS */
>  struct host_its {
> @@ -79,6 +99,7 @@ struct host_its {
>      paddr_t addr;
>      paddr_t size;
>      void __iomem *its_base;
> +    spinlock_t cmd_lock;

I was expecting to see code to initialize the spinlock. Please
initialize it.

>      void *cmd_buf;
>      unsigned int flags;
>  };

Cheers,


-- 
Julien Grall


* Re: [PATCH 07/28] ARM: GICv3 ITS: introduce device mapping
  2017-01-30 18:31 ` [PATCH 07/28] ARM: GICv3 ITS: introduce device mapping Andre Przywara
@ 2017-02-07 14:05   ` Julien Grall
  2017-02-15 16:30   ` Julien Grall
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 106+ messages in thread
From: Julien Grall @ 2017-02-07 14:05 UTC (permalink / raw)
  To: Andre Przywara, Stefano Stabellini; +Cc: xen-devel, Vijay Kilari

Hi Andre,

On 30/01/2017 18:31, Andre Przywara wrote:
> The ITS uses device IDs to map LPIs to a device. Dom0 will later use
> those IDs, which we directly pass on to the host.
> For this we have to map each device that Dom0 may request to a host
> ITS device with the same identifier.
> Allocate the respective memory and enter each device into an rbtree to
> later be able to iterate over it or to easily teardown guests.
>
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> ---
>  xen/arch/arm/gic-v3-its.c        | 188 ++++++++++++++++++++++++++++++++++++++-
>  xen/arch/arm/vgic-v3.c           |   3 +
>  xen/include/asm-arm/domain.h     |   3 +
>  xen/include/asm-arm/gic_v3_its.h |  28 ++++++
>  4 files changed, 221 insertions(+), 1 deletion(-)
>
> diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
> index 6578e8a..4a3a394 100644
> --- a/xen/arch/arm/gic-v3-its.c
> +++ b/xen/arch/arm/gic-v3-its.c
> @@ -21,8 +21,10 @@
>  #include <xen/device_tree.h>
>  #include <xen/delay.h>
>  #include <xen/libfdt/libfdt.h>
> -#include <xen/mm.h>

Why did you drop the include xen/mm.h?

> +#include <xen/rbtree.h>
> +#include <xen/sched.h>
>  #include <xen/sizes.h>
> +#include <xen/domain.h>

All the headers appear to be included in alphabetical order except this
one. Why?

>  #include <asm/gic.h>
>  #include <asm/gic_v3_defs.h>
>  #include <asm/gic_v3_its.h>
> @@ -94,6 +96,21 @@ static int its_send_cmd_mapc(struct host_its *its, int collection_id, int cpu)
>      return its_send_command(its, cmd);
>  }
>
> +static int its_send_cmd_mapd(struct host_its *its, uint32_t deviceid,
> +                             int size, uint64_t itt_addr, bool valid)

Please use unsigned for the size. Also it could be uint8_t.

Also, itt_addr should technically be a paddr_t.

> +{
> +    uint64_t cmd[4];
> +
> +    cmd[0] = GITS_CMD_MAPD | ((uint64_t)deviceid << 32);
> +    cmd[1] = size & GENMASK(4, 0);

The mask here and the one below are not necessary and could hide a 
programming error.

It would make much more sense to have an ASSERT(size < GITS_TYPER.IDbits)
here and to have someone check that the value is correct.

> +    cmd[2] = itt_addr & GENMASK(51, 8);

For this one, we only want to make sure that bits 7:0 are all zero, as
the address is provided by the caller. So I would replace the mask with
an ASSERT(!(itt_addr & GENMASK(7, 0))).
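
As a stand-alone sketch of the idea (encode_mapd() is a made-up helper,
plain assert() stands in for Xen's ASSERT(), GITS_VALID_BIT is assumed
to be bit 63, and the fixed bound of 32 is only a placeholder for the
real check against GITS_TYPER.IDbits):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define GITS_CMD_MAPD  0x08
#define GITS_VALID_BIT (1ULL << 63) /* assumed definition */

/* Fill in a MAPD command, checking the inputs instead of masking them. */
static void encode_mapd(uint64_t cmd[4], uint32_t deviceid, uint8_t size,
                        uint64_t itt_addr, bool valid)
{
    /* Placeholder bound: the size field is 5 bits wide. */
    assert(size < 32);
    /* The ITT address must have bits 7:0 clear. */
    assert(!(itt_addr & 0xffULL));

    cmd[0] = GITS_CMD_MAPD | ((uint64_t)deviceid << 32);
    cmd[1] = size;
    cmd[2] = itt_addr;
    if ( valid )
        cmd[2] |= GITS_VALID_BIT;
    cmd[3] = 0x00;
}
```

A bogus size or a misaligned ITT then trips the assertion in a debug
build rather than being silently truncated.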

> +    if ( valid )
> +        cmd[2] |= GITS_VALID_BIT;
> +    cmd[3] = 0x00;
> +
> +    return its_send_command(its, cmd);
> +}
> +
>  /* Set up the (1:1) collection mapping for the given host CPU. */
>  int gicv3_its_setup_collection(int cpu)
>  {
> @@ -293,6 +310,7 @@ int gicv3_its_init(struct host_its *hw_its)
>      reg = readq_relaxed(hw_its->its_base + GITS_TYPER);
>      if ( reg & GITS_TYPER_PTA )
>          hw_its->flags |= HOST_ITS_USES_PTA;
> +    hw_its->itte_size = GITS_TYPER_ITT_SIZE(reg);
>
>      for ( i = 0; i < GITS_BASER_NR_REGS; i++ )
>      {
> @@ -339,6 +357,173 @@ int gicv3_its_init(struct host_its *hw_its)
>      return 0;
>  }
>
> +static void remove_mapped_guest_device(struct its_devices *dev)
> +{
> +    if ( dev->hw_its )
> +        its_send_cmd_mapd(dev->hw_its, dev->host_devid, 0, 0, false);

I would propagate the return of the ITS command as it may have failed 
because, for instance, the command queue is full.

> +
> +    xfree(dev->itt_addr);
> +    xfree(dev);
> +}
> +
> +int gicv3_its_map_guest_device(struct domain *d, int host_devid,
> +                               int guest_devid, int bits, bool valid)


Please use uint32_t for the host_devid and guest_devid.

Also, what is bits? It looks to me like this is the number of bits
supported for the event. I think it would be clearer to pass the number
of events and compute the number of bits within the function.
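
Computing the number of bits from the number of events could look
something like this. events_to_bits() is a hypothetical helper written
as a stand-alone sketch; the real code may well prefer an existing
bit-scan helper:

```c
#include <assert.h>

/*
 * Return the smallest number of EventID bits able to cover nr_events,
 * i.e. the smallest "bits" with (1 << bits) >= nr_events.
 */
static unsigned int events_to_bits(unsigned int nr_events)
{
    unsigned int bits = 0;

    assert(nr_events > 0);

    while ( (1ULL << bits) < nr_events )
        bits++;

    return bits;
}
```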

Furthermore, looking at the code, I think it would be better to have two
separate functions: one to add a device, the other to remove it. Overall
the code looks quite different for both and some parameters are not
useful.

Lastly, I don't see any code here which prevents a device from being
assigned to another domain, or catches a wrong host/guest DeviceID or a
wrong number of event bits. Who will do that?

As for checking whether the device has been assigned to another domain,
I think this could be done outside.

In any case, as requested by Stefano on the previous version, a TODO in 
the code will be needed to avoid forgetting.

> +{
> +    void *itt_addr = NULL;
> +    struct its_devices *dev, *temp;
> +    struct rb_node **new = &d->arch.vgic.its_devices.rb_node, *parent = NULL;
> +    struct host_its *hw_its;
> +    int ret;
> +
> +    /* check for already existing mappings */
> +    spin_lock(&d->arch.vgic.its_devices_lock);
> +    while (*new)

Coding style: while ( *new ).

> +    {
> +        temp = rb_entry(*new, struct its_devices, rbnode);
> +
> +        if ( temp->guest_devid == guest_devid )
> +        {
> +            if ( !valid )
> +                rb_erase(&temp->rbnode, &d->arch.vgic.its_devices);
> +
> +            spin_unlock(&d->arch.vgic.its_devices_lock);
> +
> +            if ( valid )

A printk(XENLOG_GUEST...) here would be useful to know which host
DeviceID was associated with the guest DeviceID.

> +                return -EBUSY;
> +
> +            remove_mapped_guest_device(temp);

See my comment on the function about the error checking.

> +
> +            return 0;
> +        }
> +
> +        if ( guest_devid < temp->guest_devid )
> +            new = &((*new)->rb_right);
> +        else
> +            new = &((*new)->rb_left);
> +    }
> +
> +    if ( !valid )
> +    {
> +        ret = -ENOENT;
> +        goto out_unlock;
> +    }
> +
> +    /*
> +     * TODO: Work out the correct hardware ITS to use here.
> +     * Maybe a per-platform function: devid -> ITS?
> +     * Or parsing the DT to find the msi_parent?
> +     * Or get Dom0 to give us this information?
> +     * For now just use the first ITS.

That will likely come with the PCI work. The idea so far is to parse
the firmware tables in Xen and find out the root complex.

So I would in fact expect this function to take a struct device as a
parameter and deal with it.


> +     */
> +    hw_its = list_first_entry(&host_its_list, struct host_its, entry);
> +
> +    ret = -ENOMEM;
> +
> +    itt_addr = _xmalloc(BIT(bits) * hw_its->itte_size, 256);

Please document why 256.

Also, from the spec (see 6.3.9 in IHI0069C) the behavior is 
unpredictable when the ITT is not all zero. So please use _xzalloc here.

> +    if ( !itt_addr )
> +        goto out_unlock;
> +
> +    dev = xmalloc(struct its_devices);

I would use xzalloc just in case we forgot to initialize a field.

> +    if ( !dev )
> +    {
> +        xfree(itt_addr);
> +        goto out_unlock;
> +    }
> +
> +    ret = its_send_cmd_mapd(hw_its, host_devid, bits - 1,
> +                            virt_to_maddr(itt_addr), true);
> +    if ( ret )
> +    {
> +        xfree(itt_addr);
> +        xfree(dev);
> +        goto out_unlock;

Duplicating the clean-up path multiple times is error-prone. I would
prefer it if you moved the xfree calls into the out_unlock path. Note
that xfree is able to handle a NULL pointer.
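
The shape I am thinking of, as a stand-alone sketch: calloc()/free()
stand in for the Xen allocators, map_device() and struct its_dev are
made-up names, the locking is left out, and cmd_ret takes the place of
the value returned by its_send_cmd_mapd():

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

struct its_dev {
    void *itt_addr;
};

/* cmd_ret stands in for the return value of its_send_cmd_mapd(). */
static int map_device(struct its_dev **out, size_t itt_size, int cmd_ret)
{
    struct its_dev *dev = NULL;
    void *itt_addr;
    int ret = -ENOMEM;

    itt_addr = calloc(1, itt_size);
    if ( !itt_addr )
        goto out;

    dev = calloc(1, sizeof(*dev));
    if ( !dev )
        goto out;

    ret = cmd_ret;
    if ( ret )
        goto out;

    dev->itt_addr = itt_addr;
    *out = dev;

    return 0;

 out:
    /* Single clean-up path; free(NULL) is a no-op, like xfree(NULL). */
    free(dev);
    free(itt_addr);

    return ret;
}
```

Every failure then goes through the same label, so adding another
allocation later only needs one extra line in the clean-up path.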

> +    }
> +
> +    dev->itt_addr = itt_addr;
> +    dev->hw_its = hw_its;
> +    dev->guest_devid = guest_devid;
> +    dev->host_devid = host_devid;
> +    dev->eventids = BIT(bits);
> +
> +    rb_link_node(&dev->rbnode, parent, new);
> +    rb_insert_color(&dev->rbnode, &d->arch.vgic.its_devices);
> +
> +    spin_unlock(&d->arch.vgic.its_devices_lock);
> +
> +    return 0;
> +
> +out_unlock:
> +    spin_unlock(&d->arch.vgic.its_devices_lock);

Newline here please.

> +    return ret;
> +}
> +
> +/* Removing any connections a domain had to any ITS in the system. */
> +void gicv3_its_unmap_all_devices(struct domain *d)

I don't see this function called anywhere in this series. I appreciate
that we don't yet support guest ITS. But calling the function in the
right place would be helpful. This would keep us from missing some
clean-up and getting yet another XSA.

Also, this function should have a prototype defined in the header.

> +{
> +    struct rb_node *victim;
> +    struct its_devices *dev;
> +
> +    /*
> +     * This is an easily readable, yet inefficient implementation.
> +     * It uses the provided iteration wrapper and erases each node, which
> +     * possibly triggers rebalancing.
> +     * This seems overkill since we are going to abolish the whole tree, but
> +     * avoids an open-coded re-implementation of the traversal functions with
> +     * some recursive function calls.
> +     * Performance does not matter here, since we are destroying a domain.

So this is slightly untrue. Performance matters when destroying a domain
as Xen cannot be preempted. So if it takes too long, you will have an
impact on the overall system.

However, I think it would be fair to assume that all devices will be
deassigned before the ITS is destroyed.

So I would just drop this function. Note that we make the same
assumption in the SMMU driver.

> +     */
> +restart:
> +    spin_lock(&d->arch.vgic.its_devices_lock);
> +    if ( (victim = rb_first(&d->arch.vgic.its_devices)) )
> +    {
> +        dev = rb_entry(victim, struct its_devices, rbnode);
> +        rb_erase(victim, &d->arch.vgic.its_devices);
> +
> +        spin_unlock(&d->arch.vgic.its_devices_lock);
> +
> +        remove_mapped_guest_device(dev);
> +
> +        goto restart;
> +    }
> +
> +    spin_unlock(&d->arch.vgic.its_devices_lock);
> +}
> +
> +int gicv3_its_unmap_device(struct domain *d, int guest_devid)

Can you explain why you have this function and
gicv3_its_map_guest_device (with valid = 0) doing exactly the same thing?

We should avoid having two interfaces doing exactly the same thing. In
this case, I would avoid handling the unmapping in
gicv3_its_map_guest_device.

Also, this function should have a prototype defined in the header.

> +{
> +    struct rb_node *node;
> +
> +    spin_lock(&d->arch.vgic.its_devices_lock);
> +    node = d->arch.vgic.its_devices.rb_node;
> +    while (node)
> +    {
> +        struct its_devices *dev = rb_entry(node, struct its_devices, rbnode);
> +
> +        if ( dev->guest_devid > guest_devid )
> +        {
> +            node = node->rb_left;
> +            continue;
> +        }
> +        if ( dev->guest_devid < guest_devid )
> +        {
> +            node = node->rb_right;
> +            continue;
> +        }
> +
> +        rb_erase(node, &d->arch.vgic.its_devices);
> +
> +        spin_unlock(&d->arch.vgic.its_devices_lock);
> +
> +        remove_mapped_guest_device(dev);
> +
> +        return 0;
> +
> +    }
> +    spin_unlock(&d->arch.vgic.its_devices_lock);
> +
> +    return -ENOENT;
> +}
> +
>  /* Scan the DT for any ITS nodes and create a list of host ITSes out of it. */
>  void gicv3_its_dt_init(const struct dt_device_node *node)
>  {
> @@ -369,6 +554,7 @@ void gicv3_its_dt_init(const struct dt_device_node *node)
>          its_data->addr = addr;
>          its_data->size = size;
>          its_data->dt_node = its;
> +        spin_lock_init(&its_data->cmd_lock);

The cmd_lock was introduced in the previous patch. Please move the 
initialization there.

>
>          printk("GICv3: Found ITS @0x%lx\n", addr);
>
> diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c
> index d61479d..1fadb00 100644
> --- a/xen/arch/arm/vgic-v3.c
> +++ b/xen/arch/arm/vgic-v3.c
> @@ -1450,6 +1450,9 @@ static int vgic_v3_domain_init(struct domain *d)
>      d->arch.vgic.nr_regions = rdist_count;
>      d->arch.vgic.rdist_regions = rdist_regions;
>
> +    spin_lock_init(&d->arch.vgic.its_devices_lock);
> +    d->arch.vgic.its_devices = RB_ROOT;

The placement of those 2 lines is likely wrong. They should belong to
the vITS and not the vgic-v3.

I think it would make sense to get a patch that introduces a skeleton 
for the vITS before this patch and start plumbing through.

> +
>      /*
>       * Domain 0 gets the hardware address.
>       * Guests get the virtual platform layout.
> diff --git a/xen/include/asm-arm/domain.h b/xen/include/asm-arm/domain.h
> index 2d6fbb1..00b9c1a 100644
> --- a/xen/include/asm-arm/domain.h
> +++ b/xen/include/asm-arm/domain.h
> @@ -11,6 +11,7 @@
>  #include <asm/gic.h>
>  #include <public/hvm/params.h>
>  #include <xen/serial.h>
> +#include <xen/rbtree.h>
>
>  struct hvm_domain
>  {
> @@ -109,6 +110,8 @@ struct arch_domain
>          } *rdist_regions;
>          int nr_regions;                     /* Number of rdist regions */
>          uint32_t rdist_stride;              /* Re-Distributor stride */
> +        struct rb_root its_devices;         /* devices mapped to an ITS */
> +        spinlock_t its_devices_lock;        /* protects the its_devices tree */
>  #endif
>      } vgic;
>
> diff --git a/xen/include/asm-arm/gic_v3_its.h b/xen/include/asm-arm/gic_v3_its.h
> index 8288185..9c5dcf3 100644
> --- a/xen/include/asm-arm/gic_v3_its.h
> +++ b/xen/include/asm-arm/gic_v3_its.h
> @@ -42,6 +42,10 @@
>
>  #define GITS_TYPER_PTA                  BIT_ULL(19)
>  #define GITS_TYPER_IDBITS_SHIFT         8
> +#define GITS_TYPER_ITT_SIZE_SHIFT       4
> +#define GITS_TYPER_ITT_SIZE_MASK        (0xfUL << GITS_TYPER_ITT_SIZE_SHIFT)
> +#define GITS_TYPER_ITT_SIZE(r)          (((r) & GITS_TYPER_ITT_SIZE_MASK) >> \
> +                                                GITS_TYPER_ITT_SIZE_SHIFT)
>
>  #define GITS_IIDR_VALUE                 0x34c
>
> @@ -88,6 +92,7 @@
>
>  #ifndef __ASSEMBLY__
>  #include <xen/device_tree.h>
> +#include <xen/rbtree.h>
>
>  #define HOST_ITS_FLUSH_CMD_QUEUE        (1U << 0)
>  #define HOST_ITS_USES_PTA               (1U << 1)
> @@ -101,9 +106,19 @@ struct host_its {
>      void __iomem *its_base;
>      spinlock_t cmd_lock;
>      void *cmd_buf;
> +    int itte_size;

Please use unsigned.

>      unsigned int flags;
>  };
>
> +struct its_devices {
> +    struct rb_node rbnode;
> +    struct host_its *hw_its;
> +    void *itt_addr;
> +    uint32_t guest_devid;
> +    uint32_t host_devid;
> +    uint32_t eventids;
> +};
> +
>  extern struct list_head host_its_list;
>
>  #ifdef CONFIG_HAS_ITS
> @@ -128,6 +143,13 @@ uint64_t gicv3_get_redist_address(int cpu, bool use_pta);
>  /* Map a collection for this host CPU to each host ITS. */
>  int gicv3_its_setup_collection(int cpu);
>
> +/* Map a device on the host by allocating an ITT on the host (ITS).

Coding style:

/*
  * Map a ...

> + * "bits" specifies how many events (interrupts) this device will need.
> + * Setting "valid" to false deallocates the device.
> + */
> +int gicv3_its_map_guest_device(struct domain *d, int host_devid,
> +                               int guest_devid, int bits, bool valid);
> +
>  #else
>
>  static inline void gicv3_its_dt_init(const struct dt_device_node *node)
> @@ -156,6 +178,12 @@ static inline int gicv3_its_setup_collection(int cpu)
>  {
>      return 0;
>  }
> +static inline int gicv3_its_map_guest_device(struct domain *d, int host_devid,
> +                                             int guest_devid, int bits,
> +                                             bool valid)
> +{
> +    return -ENODEV;

I think it should be -ENOSYS rather than -ENODEV.

> +}
>
>  #endif /* CONFIG_HAS_ITS */
>
>

Cheers,

-- 
Julien Grall


* Re: [PATCH 08/28] ARM: GICv3 ITS: introduce host LPI array
  2017-01-30 18:31 ` [PATCH 08/28] ARM: GICv3 ITS: introduce host LPI array Andre Przywara
@ 2017-02-07 18:01   ` Julien Grall
  2017-02-14 20:05   ` Stefano Stabellini
  1 sibling, 0 replies; 106+ messages in thread
From: Julien Grall @ 2017-02-07 18:01 UTC (permalink / raw)
  To: Andre Przywara, Stefano Stabellini; +Cc: xen-devel, Vijay Kilari

Hi Andre,

On 30/01/2017 18:31, Andre Przywara wrote:
> The number of LPIs on a host can be potentially huge (millions),
> although in practise will be mostly reasonable. So prematurely allocating
> an array of struct irq_desc's for each LPI is not an option.
> However Xen itself does not care about LPIs, as every LPI will be injected
> into a guest (Dom0 for now).
> Create a dense data structure (8 Bytes) for each LPI which holds just
> enough information to determine the virtual IRQ number and the VCPU into
> which the LPI needs to be injected.
> Also to not artificially limit the number of LPIs, we create a 2-level
> table for holding those structures.
> This patch introduces functions to initialize these tables and to
> create, lookup and destroy entries for a given LPI.
> We allocate and access LPI information in a way that does not require
> a lock.
>
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> ---
>  xen/arch/arm/gic-v3-its.c        |  80 ++++++++++++++++-
>  xen/arch/arm/gic-v3-lpi.c        | 187 ++++++++++++++++++++++++++++++++++++++-
>  xen/include/asm-arm/atomic.h     |   6 +-
>  xen/include/asm-arm/gic.h        |   5 ++
>  xen/include/asm-arm/gic_v3_its.h |   9 ++
>  5 files changed, 282 insertions(+), 5 deletions(-)
>
> diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
> index 4a3a394..f073ab5 100644
> --- a/xen/arch/arm/gic-v3-its.c
> +++ b/xen/arch/arm/gic-v3-its.c
> @@ -83,6 +83,20 @@ static int its_send_cmd_sync(struct host_its *its, int cpu)
>      return its_send_command(its, cmd);
>  }
>
> +static int its_send_cmd_mapti(struct host_its *its,
> +                              uint32_t deviceid, uint32_t eventid,
> +                              uint32_t pintid, uint16_t icid)
> +{
> +    uint64_t cmd[4];
> +
> +    cmd[0] = GITS_CMD_MAPTI | ((uint64_t)deviceid << 32);
> +    cmd[1] = eventid | ((uint64_t)pintid << 32);
> +    cmd[2] = icid;
> +    cmd[3] = 0x00;
> +
> +    return its_send_command(its, cmd);
> +}
> +
>  static int its_send_cmd_mapc(struct host_its *its, int collection_id, int cpu)
>  {
>      uint64_t cmd[4];
> @@ -111,6 +125,19 @@ static int its_send_cmd_mapd(struct host_its *its, uint32_t deviceid,
>      return its_send_command(its, cmd);
>  }
>
> +static int its_send_cmd_inv(struct host_its *its,
> +                            uint32_t deviceid, uint32_t eventid)
> +{
> +    uint64_t cmd[4];
> +
> +    cmd[0] = GITS_CMD_INV | ((uint64_t)deviceid << 32);
> +    cmd[1] = eventid;
> +    cmd[2] = 0x00;
> +    cmd[3] = 0x00;
> +
> +    return its_send_command(its, cmd);
> +}
> +
>  /* Set up the (1:1) collection mapping for the given host CPU. */
>  int gicv3_its_setup_collection(int cpu)
>  {
> @@ -359,13 +386,47 @@ int gicv3_its_init(struct host_its *hw_its)
>
>  static void remove_mapped_guest_device(struct its_devices *dev)
>  {
> +    int i;
> +
>      if ( dev->hw_its )
>          its_send_cmd_mapd(dev->hw_its, dev->host_devid, 0, 0, false);
>
> +    for ( i = 0; i < dev->eventids / 32; i++ )

Please use LPI_BLOCK rather than 32.

> +        gicv3_free_host_lpi_block(dev->hw_its, dev->host_lpis[i]);

Without looking at the implementation of gicv3_free_host_lpi_block, I
think the usage of the function is very confusing. When I read
host_lpis, I expect to see one LPI per event. But instead it is the
first LPI of a block. The lack of documentation of the field in
its_devices does not help to understand what's going on.

So please add some documentation and probably rename some fields.

Also, the function can return an error but you don't check it. Please 
make sure to verify the return value.

Lastly, shouldn't we discard the LPIs before removing the device? Or
does MAPD take care of that for you?

> +
>      xfree(dev->itt_addr);
> +    xfree(dev->host_lpis);

I forgot to mention this in the previous patch. You free dev->itt_addr
and dev->host_lpis without even waiting for the ITS to handle the
command. This is a really bad idea, as Xen could re-allocate the memory
to someone else as soon as xfree has finished.

>      xfree(dev);
>  }
>
> +/*
> + * On the host ITS @its, map @nr_events consecutive LPIs.
> + * The mapping connects a device @devid and event @eventid pair to LPI @lpi,
> + * increasing both @eventid and @lpi to cover the number of requested LPIs.
> + */
> +int gicv3_its_map_host_events(struct host_its *its,
> +                              int devid, int eventid, int lpi,
> +                              int nr_events)

All those fields should at least be unsigned int. For devid, I would use
uint32_t.

In general anything that does not need to be signed should be
unsigned int. Similarly, if we deal with a hardware value the type
should be uintXX_t. This makes it easier to match the hardware and
avoids potential issues later.

> +{
> +    int i, ret;

i should be unsigned int.

> +
> +    for ( i = 0; i < nr_events; i++ )
> +    {
> +        ret = its_send_cmd_mapti(its, devid, eventid + i, lpi + i, 0);

A comment explaining what 0 stands for would be really helpful.
Something along those lines: "All interrupts are mapped to CPU0 (i.e.
collection 0) by default".

> +        if ( ret )
> +            return ret;
> +        ret = its_send_cmd_inv(its, devid, eventid + i);

So the spec allows up to 32KB event per device. As all the LPIs will be 
routed to CPU0 (e.g collection 0), it would be more efficient to do an 
INVALL.

Furthermore, what if the queue is full? AFAIU, you will return an error
but it is not propagated. So Xen will think the device has been mapped
even if that is not true.

I think we need to have a plan here, as this is likely to happen if a
device has many MSIs and/or the queue is nearly full.
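
To put rough numbers on the queue pressure, here is a back-of-the-envelope
sketch. Both helpers are made up for illustration; they only count the
commands the loop above would queue versus a variant issuing a single
INVALL for the collection:

```c
#include <assert.h>

/* Loop as posted: one MAPTI and one INV per event, then one SYNC. */
static unsigned int cmds_with_inv(unsigned int nr_events)
{
    return 2 * nr_events + 1;
}

/* Variant: one MAPTI per event, then a single INVALL and one SYNC. */
static unsigned int cmds_with_invall(unsigned int nr_events)
{
    return nr_events + 2;
}
```

For a device with 32 events that is 65 commands versus 34, so the
INVALL variant roughly halves the number of queue slots needed.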

> +        if ( ret )
> +            return ret;
> +    }
> +
> +    ret = its_send_cmd_sync(its, 0);
> +    if ( ret )
> +        return ret;
> +
> +    return 0;

The "if ( ret ) return ret; return 0;" could be simplified to "return ret;".

> +}
> +
>  int gicv3_its_map_guest_device(struct domain *d, int host_devid,
>                                 int guest_devid, int bits, bool valid)
>  {
> @@ -373,7 +434,7 @@ int gicv3_its_map_guest_device(struct domain *d, int host_devid,
>      struct its_devices *dev, *temp;
>      struct rb_node **new = &d->arch.vgic.its_devices.rb_node, *parent = NULL;
>      struct host_its *hw_its;
> -    int ret;
> +    int ret, i;

i should be unsigned.

>
>      /* check for already existing mappings */
>      spin_lock(&d->arch.vgic.its_devices_lock);
> @@ -430,10 +491,19 @@ int gicv3_its_map_guest_device(struct domain *d, int host_devid,
>          goto out_unlock;
>      }
>
> +    dev->host_lpis = xzalloc_array(uint32_t, BIT(bits) / 32);

I cannot find any code making sure the number of events is aligned to 32.

BTW, the 32 should really be LPI_BLOCK.
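
One way to avoid relying on the alignment is to round up when computing
the number of blocks. A stand-alone sketch, where nr_lpi_blocks() is a
hypothetical helper and LPI_BLOCK is assumed to be 32 as in the code
above:

```c
#include <assert.h>

#define LPI_BLOCK 32U /* assumed value, matching the literal 32 above */

/* Number of LPI blocks needed to cover nr_events, rounding up. */
static unsigned int nr_lpi_blocks(unsigned int nr_events)
{
    return (nr_events + LPI_BLOCK - 1) / LPI_BLOCK;
}
```

With the plain division, a device with fewer than 32 events ends up
with zero blocks; rounding up gives it one.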

> +    if ( !dev->host_lpis )
> +    {
> +        xfree(dev);
> +        xfree(itt_addr);
> +        return -ENOMEM;
> +    }
> +
>      ret = its_send_cmd_mapd(hw_its, host_devid, bits - 1,
>                              virt_to_maddr(itt_addr), true);
>      if ( ret )
>      {
> +        xfree(dev->host_lpis);
>          xfree(itt_addr);
>          xfree(dev);
>          goto out_unlock;
> @@ -450,6 +520,14 @@ int gicv3_its_map_guest_device(struct domain *d, int host_devid,
>
>      spin_unlock(&d->arch.vgic.its_devices_lock);
>
> +    /*
> +     * Map all host LPIs within this device already. We can't afford to queue
> +     * any host ITS commands later on during the guest's runtime.
> +     */
> +    for ( i = 0; i < BIT(bits) / 32; i++ )

Ditto.

> +        dev->host_lpis[i] = gicv3_allocate_host_lpi_block(hw_its, d, host_devid,
> +                                                          i * 32);

Ditto.

Also, gicv3_allocate_host_lpi_block could return an error if, for
instance, there are not enough LPIs. So please check the return value.

> +
>      return 0;
>
>  out_unlock:
> diff --git a/xen/arch/arm/gic-v3-lpi.c b/xen/arch/arm/gic-v3-lpi.c
> index 5911b91..8f6e7f3 100644
> --- a/xen/arch/arm/gic-v3-lpi.c
> +++ b/xen/arch/arm/gic-v3-lpi.c
> @@ -18,16 +18,34 @@
>
>  #include <xen/config.h>
>  #include <xen/lib.h>
> -#include <xen/mm.h>

Why has mm.h been dropped?

> +#include <xen/sched.h>
> +#include <xen/err.h>
> +#include <xen/sched.h>
>  #include <xen/sizes.h>
> +#include <asm/atomic.h>
> +#include <asm/domain.h>
> +#include <asm/io.h>
>  #include <asm/gic.h>
>  #include <asm/gic_v3_defs.h>
>  #include <asm/gic_v3_its.h>
>
> +/* LPIs on the host always go to a guest, so no struct irq_desc for them. */
> +union host_lpi {
> +    uint64_t data;
> +    struct {
> +        uint32_t virt_lpi;
> +        uint16_t dom_id;
> +        uint16_t vcpu_id;
> +    };
> +};

I think Stefano requested some documentation on this field.
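
As a starting point, something like the following would already help. The field descriptions are my reading of the cover letter and the rest of this patch, so please correct them where they are off. The _Static_assert is only there so the snippet stands on its own outside of Xen.

```c
#include <stdint.h>

/*
 * One entry per host LPI. Host LPIs always go to a guest, so all we need
 * to remember is where to forward them. The whole entry fits into one
 * 64-bit word so that it can be read and written in a single access.
 */
union host_lpi {
    uint64_t data;              /* the whole entry, for atomic accesses */
    struct {
        uint32_t virt_lpi;      /* LPI number as seen by the guest */
        uint16_t dom_id;        /* domain this host LPI belongs to */
        uint16_t vcpu_id;       /* VCPU of that domain to inject it on */
    };
};

_Static_assert(sizeof(union host_lpi) == sizeof(uint64_t),
               "union host_lpi must stay a single 64-bit word");
```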

> +
>  /* Global state */
>  static struct {
>      uint8_t *lpi_property;
> +    union host_lpi **host_lpis;

It would be really helpful to have documentation in the code 
explaining what host_lpis is supposed to contain.
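
For example, the comment could spell out how a host LPI number is split into a first-level index (which page) and a second-level index (which entry in that page), as gic_get_host_lpi() does below. A standalone sketch of that arithmetic, assuming 4K pages and 8-byte entries; struct host_lpi_slot and get_host_lpi_slot() are names invented for the illustration:

```c
#include <stdint.h>

#define PAGE_SIZE           4096U   /* assumption: 4K pages */
#define LPI_OFFSET          8192U
#define HOST_LPIS_PER_PAGE  (PAGE_SIZE / (unsigned int)sizeof(uint64_t))

/* Position of one host LPI in the two-level table. */
struct host_lpi_slot {
    unsigned int chunk;     /* index into the array of page pointers */
    unsigned int index;     /* entry within that page */
};

static struct host_lpi_slot get_host_lpi_slot(uint32_t host_lpi)
{
    uint32_t plpi = host_lpi - LPI_OFFSET;  /* LPI numbers start at 8192 */
    struct host_lpi_slot slot = {
        .chunk = plpi / HOST_LPIS_PER_PAGE,
        .index = plpi % HOST_LPIS_PER_PAGE,
    };

    return slot;
}
```

With these numbers each page holds 512 entries, so host LPI 9192 (8192 + 1000) lands in chunk 1 at index 488.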

>      unsigned int host_lpi_bits;
> +    /* Protects allocation and deallocation of host LPIs, but not the access */
> +    spinlock_t host_lpis_lock;
>  } lpi_data;
>
>  /* Physical redistributor address */
> @@ -38,6 +56,19 @@ static DEFINE_PER_CPU(int, redist_id);
>  static DEFINE_PER_CPU(void *, pending_table);
>
>  #define MAX_PHYS_LPIS   (BIT_ULL(lpi_data.host_lpi_bits) - LPI_OFFSET)
> +#define HOST_LPIS_PER_PAGE      (PAGE_SIZE / sizeof(union host_lpi))
> +
> +static union host_lpi *gic_get_host_lpi(uint32_t plpi)
> +{
> +    if ( !is_lpi(plpi) || plpi >= MAX_PHYS_LPIS + LPI_OFFSET )
> +        return NULL;
> +
> +    plpi -= LPI_OFFSET;
> +    if ( !lpi_data.host_lpis[plpi / HOST_LPIS_PER_PAGE] )
> +        return NULL;
> +
> +    return &lpi_data.host_lpis[plpi / HOST_LPIS_PER_PAGE][plpi % HOST_LPIS_PER_PAGE];
> +}
>
>  /* Stores this redistributor's physical address and ID in a per-CPU variable */
>  void gicv3_set_redist_address(paddr_t address, int redist_id)
> @@ -130,15 +161,169 @@ uint64_t gicv3_lpi_get_proptable(void)
>  static unsigned int max_lpi_bits = CONFIG_MAX_PHYS_LPI_BITS;
>  integer_param("max_lpi_bits", max_lpi_bits);
>
> +/*
> + * Allocate the 2nd level array for host LPIs. This one holds pointers
> + * to the page with the actual "union host_lpi" entries. Our LPI limit
> + * avoids excessive memory usage.
> + */
>  int gicv3_lpi_init_host_lpis(unsigned int hw_lpi_bits)
>  {
> +    int nr_lpi_ptrs;

unsigned int

> +
> +    BUILD_BUG_ON(sizeof(union host_lpi) > sizeof(unsigned long));

Why this BUILD_BUG_ON? Is it because you want to make sure reads are 
atomic? If so, please add a comment.

> +
>      lpi_data.host_lpi_bits = min(hw_lpi_bits, max_lpi_bits);
>
> +    spin_lock_init(&lpi_data.host_lpis_lock);
> +
> +    nr_lpi_ptrs = MAX_PHYS_LPIS / (PAGE_SIZE / sizeof(union host_lpi));
> +    lpi_data.host_lpis = xzalloc_array(union host_lpi *, nr_lpi_ptrs);
> +    if ( !lpi_data.host_lpis )
> +        return -ENOMEM;
> +
>      printk("GICv3: using at most %lld LPIs on the host.\n", MAX_PHYS_LPIS);
>
>      return 0;
>  }
>
> +#define LPI_BLOCK       32

I think a comment on the reason behind the number 32 was requested by 
Stefano here. And I agree with him.

> +
> +/* Must be called with host_lpis_lock held. */

This is a call for adding an ASSERT in the function.

> +static int find_unused_host_lpi(int start, uint32_t *index)

start should probably be unsigned.

> +{
> +    int chunk;

Ditto.

> +    uint32_t i = *index;
> +
> +    for ( chunk = start; chunk < MAX_PHYS_LPIS / HOST_LPIS_PER_PAGE; chunk++ )
> +    {
> +        /* If we hit an unallocated chunk, use entry 0 in that one. */
> +        if ( !lpi_data.host_lpis[chunk] )
> +        {
> +            *index = 0;
> +            return chunk;
> +        }
> +
> +        /* Find an unallocated entry in this chunk. */
> +        for ( ; i < HOST_LPIS_PER_PAGE; i += LPI_BLOCK )
> +        {
> +            if ( lpi_data.host_lpis[chunk][i].dom_id == INVALID_DOMID )
> +            {
> +                *index = i;
> +                return chunk;
> +            }
> +        }
> +        i = 0;
> +    }
> +
> +    return -1;
> +}
> +
> +/*
> + * Allocate a block of 32 LPIs on the given host ITS for device "devid",
> + * starting with "eventid". Put them into the respective ITT by issuing a
> + */
> +int gicv3_allocate_host_lpi_block(struct host_its *its, struct domain *d,
> +                                  uint32_t host_devid, uint32_t eventid)

Most of the parameters here are only needed to call back into the ITS 
code, and the caller already has them all in hand. So I would much prefer 
to avoid a call chain ITS -> LPI -> ITS and make the LPI code ITS agnostic.

Furthermore, the return value is used to return both an error and the 
LPI. Given that an LPI is encoded on 32 bits, sooner or later there will 
be a clash between the error and the LPI. So it would be wise to 
dissociate the error code and the LPI.

The prototype of this function would look like:

int gicv3_allocate_host_lpi_block(struct domain *domain, uint32_t 
*first_lpi);
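
To make the clash concrete: any LPI number with bit 31 set turns negative once it is squeezed through an int, so the caller cannot tell it apart from an error. The snippet below is only an illustration. looks_like_error() is a made-up name, and the conversion to int relies on the two's complement wrap-around that GCC and Clang implement.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Mimics a caller of the current prototype, which receives the LPI
 * number through an "int" and treats negative values as errors.
 */
static bool looks_like_error(uint32_t lpi)
{
    int ret = (int)lpi;     /* what returning the LPI as an int amounts to */

    return ret < 0;
}
```

Returning only 0 or a negative errno value, and passing the LPI back through first_lpi, avoids the ambiguity altogether.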


> +{
> +    static uint32_t next_lpi = 0;
> +    uint32_t lpi, lpi_idx = next_lpi % HOST_LPIS_PER_PAGE;

lpi_idx will be overridden by find_unused_host_lpi, so you don't need to 
initialize it here.

This would also allow you to store the chunk rather than the next LPI 
and drop the division in the call to find_unused_host_lpi.

> +    int chunk;
> +    int i;
> +
> +    spin_lock(&lpi_data.host_lpis_lock);
> +    chunk = find_unused_host_lpi(next_lpi / HOST_LPIS_PER_PAGE, &lpi_idx);
> +
> +    if ( chunk == - 1 )          /* rescan for a hole from the beginning */
> +    {
> +        lpi_idx = 0;
> +        chunk = find_unused_host_lpi(0, &lpi_idx);
> +        if ( chunk == -1 )
> +        {
> +            spin_unlock(&lpi_data.host_lpis_lock);
> +            return -ENOSPC;
> +        }
> +    }

My understanding of the algorithm to find a chunk is that you always try 
to allocate forward. So if the current chunk is full, you will allocate 
the next one rather than checking whether a previous chunk has space 
available. This will result in allocating more memory than necessary.

Similarly, unused chunks could be freed to save memory.
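
A rough sketch of the search order I have in mind: look for a free block in the chunks that are already allocated, and only pick an unallocated chunk when none of them has room. Everything here (the tiny NR_CHUNKS table, struct entry, FREE_DOMID, find_free_block()) is simplified and hypothetical, and locking is left out.

```c
#include <stddef.h>
#include <stdint.h>

#define LPI_BLOCK           32U
#define HOST_LPIS_PER_PAGE  512U
#define NR_CHUNKS           4U          /* tiny table, for the sketch only */
#define FREE_DOMID          0xFFFFU     /* placeholder "unused" marker */

struct entry {
    uint16_t dom_id;
};

/* First-level table: NULL means the chunk has not been allocated yet. */
static struct entry *chunks[NR_CHUNKS];

/*
 * Prefer a free block in an already allocated chunk. Fall back to the
 * lowest unallocated chunk only if all allocated chunks are full.
 * Returns 0 and fills chunk/index on success, -1 if the table is full.
 */
static int find_free_block(unsigned int *chunk, unsigned int *index)
{
    unsigned int c, i;
    int first_unallocated = -1;

    for ( c = 0; c < NR_CHUNKS; c++ )
    {
        if ( !chunks[c] )
        {
            if ( first_unallocated < 0 )
                first_unallocated = (int)c;
            continue;
        }

        /* The first entry of each block tells whether the block is free. */
        for ( i = 0; i < HOST_LPIS_PER_PAGE; i += LPI_BLOCK )
        {
            if ( chunks[c][i].dom_id == FREE_DOMID )
            {
                *chunk = c;
                *index = i;
                return 0;
            }
        }
    }

    if ( first_unallocated < 0 )
        return -1;

    *chunk = (unsigned int)first_unallocated;
    *index = 0;

    return 0;
}
```

This costs a scan over the allocated chunks on each allocation, which should be acceptable as allocation only happens when a device is mapped.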

> +
> +    /* If we hit an unallocated chunk, we initialize it and use entry 0. */
> +    if ( !lpi_data.host_lpis[chunk] )
> +    {
> +        union host_lpi *new_chunk;
> +
> +        new_chunk = alloc_xenheap_pages(0, 0);

Please use alloc_xenheap_page as you only allocate one page.

Also, when NUMA support is added we may want to take into account the 
node associated with the device, saving us some time when reading the 
memory. You don't need to handle that now, but a TODO would be quite 
helpful.

> +        if ( !new_chunk )
> +        {
> +            spin_unlock(&lpi_data.host_lpis_lock);
> +            return -ENOMEM;
> +        }
> +
> +        for ( i = 0; i < HOST_LPIS_PER_PAGE; i += LPI_BLOCK )
> +            new_chunk[i].dom_id = INVALID_DOMID;
> +
> +        lpi_data.host_lpis[chunk] = new_chunk;
> +        lpi_idx = 0;
> +    }
> +
> +    lpi = chunk * HOST_LPIS_PER_PAGE + lpi_idx;
> +
> +    for ( i = 0; i < LPI_BLOCK; i++ )
> +    {
> +        union host_lpi hlpi;
> +
> +        /*
> +         * Mark this host LPI as belonging to the domain, but don't assign
> +         * any virtual LPI or a VCPU yet.
> +         */
> +        hlpi.virt_lpi = INVALID_LPI;
> +        hlpi.dom_id = d->domain_id;
> +        hlpi.vcpu_id = INVALID_DOMID;

Please don't mix vcpu and domain. If INVALID_VCPUID does not exist then 
it might be worth adding one.

> +        write_u64_atomic(&lpi_data.host_lpis[chunk][lpi_idx + i].data,
> +                         hlpi.data);
> +
> +        /*
> +         * Enable this host LPI, so we don't have to do this during the
> +         * guest's runtime.
> +         */
> +        lpi_data.lpi_property[lpi + i] |= LPI_PROP_ENABLED;
> +    }
> +
> +    /*
> +     * We have allocated and initialized the host LPI entries, so it's safe
> +     * to drop the lock now. Access to the structures can be done concurrently
> +     * as it involves only an atomic uint64_t access.
> +     */
> +    spin_unlock(&lpi_data.host_lpis_lock);
> +
> +    __flush_dcache_area(&lpi_data.lpi_property[lpi], LPI_BLOCK);

Please use the dcache_* helpers. Also, the flush is only needed when the 
property table is not mapped cacheable and inner-shareable by the GIC.

> +
> +    gicv3_its_map_host_events(its, host_devid, eventid, lpi + LPI_OFFSET,
> +                              LPI_BLOCK);
> +
> +    next_lpi = lpi + LPI_BLOCK;
> +    return lpi + LPI_OFFSET;
> +}
> +
> +int gicv3_free_host_lpi_block(struct host_its *its, uint32_t lpi)
> +{
> +    union host_lpi *hlpi, empty_lpi = { .dom_id = INVALID_DOMID };
> +    int i;
> +
> +    hlpi = gic_get_host_lpi(lpi);
> +    if ( !hlpi )
> +        return -ENOENT;
> +
> +    spin_lock(&lpi_data.host_lpis_lock);
> +
> +    for ( i = 0; i < LPI_BLOCK; i++ )
> +        write_u64_atomic(&hlpi[i].data, empty_lpi.data);
> +
> +    /* TODO: Call a function in gic-v3-its.c to send DISCARDs */

I think this should be done by the caller and not here. You also need to 
disable the interrupts, which has not been done here.

> +
> +    spin_unlock(&lpi_data.host_lpis_lock);
> +
> +    return 0;
> +}
> +
>  /*
>   * Local variables:
>   * mode: C
> diff --git a/xen/include/asm-arm/atomic.h b/xen/include/asm-arm/atomic.h
> index 22a5036..df9de6a 100644
> --- a/xen/include/asm-arm/atomic.h
> +++ b/xen/include/asm-arm/atomic.h
> @@ -53,9 +53,9 @@ build_atomic_write(write_u16_atomic, "h", WORD, uint16_t, "r")
>  build_atomic_write(write_u32_atomic, "",  WORD, uint32_t, "r")
>  build_atomic_write(write_int_atomic, "",  WORD, int, "r")
>
> -#if 0 /* defined (CONFIG_ARM_64) */
> -build_atomic_read(read_u64_atomic, "x", uint64_t, "=r")
> -build_atomic_write(write_u64_atomic, "x", uint64_t, "r")
> +#if defined (CONFIG_ARM_64)
> +build_atomic_read(read_u64_atomic, "", "", uint64_t, "=r")
> +build_atomic_write(write_u64_atomic, "", "", uint64_t, "r")
>  #endif

This change should be in a separate patch that also explains why 
fixing the build_atomic* calls is needed.

>
>  build_add_sized(add_u8_sized, "b", BYTE, uint8_t, "ri")
> diff --git a/xen/include/asm-arm/gic.h b/xen/include/asm-arm/gic.h
> index 12bd155..7825575 100644
> --- a/xen/include/asm-arm/gic.h
> +++ b/xen/include/asm-arm/gic.h
> @@ -220,7 +220,12 @@ enum gic_version {
>      GIC_V3,
>  };
>
> +#define INVALID_LPI     0
>  #define LPI_OFFSET      8192

Newline here please.

> +static inline bool is_lpi(unsigned int irq)
> +{
> +    return irq >= LPI_OFFSET;
> +}

I think both INVALID_LPI and is_lpi should be moved to irq.h.

>
>  extern enum gic_version gic_hw_version(void);
>
> diff --git a/xen/include/asm-arm/gic_v3_its.h b/xen/include/asm-arm/gic_v3_its.h
> index 9c5dcf3..0e6b06a 100644
> --- a/xen/include/asm-arm/gic_v3_its.h
> +++ b/xen/include/asm-arm/gic_v3_its.h
> @@ -97,6 +97,8 @@
>  #define HOST_ITS_FLUSH_CMD_QUEUE        (1U << 0)
>  #define HOST_ITS_USES_PTA               (1U << 1)
>
> +#define INVALID_DOMID ((uint16_t)~0)
> +

Rather than defining your own invalid domid, it would be better to use 
the one defined in xen.h (see DOMID_INVALID).

>  /* data structure for each hardware ITS */
>  struct host_its {
>      struct list_head entry;
> @@ -117,6 +119,7 @@ struct its_devices {
>      uint32_t guest_devid;
>      uint32_t host_devid;
>      uint32_t eventids;
> +    uint32_t *host_lpis;

I forgot to mention it earlier, but there is a general lack of comments 
on all the structures. Please try to address that in the next version.

>  };
>
>  extern struct list_head host_its_list;
> @@ -149,6 +152,12 @@ int gicv3_its_setup_collection(int cpu);
>   */
>  int gicv3_its_map_guest_device(struct domain *d, int host_devid,
>                                 int guest_devid, int bits, bool valid);
> +int gicv3_its_map_host_events(struct host_its *its,
> +                              int host_devid, int eventid,
> +                              int lpi, int nrevents);
> +int gicv3_allocate_host_lpi_block(struct host_its *its, struct domain *d,
> +                                  uint32_t host_devid, uint32_t eventid);
> +int gicv3_free_host_lpi_block(struct host_its *its, uint32_t lpi);
>
>  #else
>
>

Cheers,

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


* Re: [PATCH 00/28] arm64: Dom0 ITS emulation
  2017-01-30 18:31 [PATCH 00/28] arm64: Dom0 ITS emulation Andre Przywara
                   ` (27 preceding siblings ...)
  2017-01-30 18:31 ` [PATCH 28/28] ARM: vGIC: advertising LPI support Andre Przywara
@ 2017-02-13 13:53 ` Vijay Kilari
  2017-02-14 22:00   ` Stefano Stabellini
  2017-02-15 15:59   ` Julien Grall
  2017-02-15 17:55 ` Julien Grall
  29 siblings, 2 replies; 106+ messages in thread
From: Vijay Kilari @ 2017-02-13 13:53 UTC (permalink / raw)
  To: Andre Przywara; +Cc: xen-devel, Julien Grall, Stefano Stabellini

Hi Andre,

I tried your patch series on HW. Dom0 boots, but no LPIs are reaching Dom0.
So I made the patch below to take the segment ID into account when
generating the devid. With it, I see the panic below from _xmalloc().

The complete log is here:
http://pastebin.com/btythn2V

diff --git a/xen/arch/arm/physdev.c b/xen/arch/arm/physdev.c
index 6e02de4..72ffe9f 100644
--- a/xen/arch/arm/physdev.c
+++ b/xen/arch/arm/physdev.c
@@ -17,6 +17,7 @@
 int do_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
 {
     struct physdev_manage_pci manage;
+   struct physdev_pci_device_add pci_add;
     u32 devid;
     int ret;

@@ -33,6 +34,19 @@ int do_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
                                              cmd == PHYSDEVOP_manage_pci_add);

             return ret;
+       case PHYSDEVOP_pci_device_add:
+            if ( copy_from_guest(&pci_add, arg, 1) != 0 )
+                return -EFAULT;
+            devid = pci_add.seg << 16 | pci_add.bus << 8 | pci_add.devfn;
+
+            printk("In %s calling gicv3_its_map_device for S: %d B: %d F:%d DEVID %u\n",
+                    __func__, pci_add.seg,pci_add.bus, pci_add.devfn, devid);
+            /* Allocate an ITS device table with space for 32 MSIs */
+            ret = gicv3_its_map_guest_device(hardware_domain, devid, devid, 5,
+                                       cmd == PHYSDEVOP_pci_device_add);
+
+            return ret;
     }


[    3.280979] iommu: Adding device 0000:01:07.2 to group 26
[    3.286373] pci 0000:01:07.3: [177d:a02f] type 00 class 0x058000
[    3.292463] pci 0000:01:07.3: BAR 0: [mem 0x87e05b000000-0x87e05b7fffff 64bit] (from Enhanced Allocation, properties 0x0)
[    3.303457] pci 0000:01:07.3: BAR 4: [mem 0x87e05bf00000-0x87e05bffffff 64bit] (from Enhanced Allocation, properties 0x0)
(XEN) In do_physdev_op calling gicv3_its_map_device for S: 0 B: 1 F:59 DEVID 315
(XEN) Hypervisor Trap. HSR=0x96000044 EC=0x25 IL=1 Syndrome=0x44
(XEN) CPU0: Unexpected Trap: Hypervisor
(XEN) ----[ Xen-4.9-unstable  arm64  debug=y   Not tainted ]----
(XEN) CPU:    0
(XEN) PC:     0000000000235e3c xmem_pool_alloc+0x33c/0x49c
(XEN) LR:     0000000000235c28
(XEN) SP:     0000801ffaba7c10
(XEN) CPSR:   20000349 MODE:64-bit EL2h (Hypervisor, handler)
(XEN)      X0: 000000000000001a  X1: 8022bf008022be00  X2: 0000801ffff165d0
(XEN)      X3: 000000000a0e0a0e  X4: 0000000000000001  X5: 0000000000000000
(XEN)      X6: 0000000000000004  X7: 0000000000000003  X8: 00000000fffffffb
(XEN)      X9: 000000000000000a X10: 0000801ffaba7ad8 X11: 0000000000000033
(XEN)     X12: 0000000000000003 X13: 0000000000268df8 X14: 0000000000000020
(XEN)     X15: 0000000000000000 X16: 0000000000000021 X17: 000000000000000b
(XEN)     X18: 000000000000000d X19: 0000801ffff16000 X20: 0000000000000005
(XEN)     X21: 0000000000000150 X22: 0000801ffff17868 X23: 000000000000003a
(XEN)     X24: 00000000ffffffff X25: 000000000000001f X26: 0000000000000039
(XEN)     X27: 0000000000000000 X28: 0000801fff8e9160  FP: 0000801ffaba7c10
(XEN)
(XEN)   VTCR_EL2: 80053590
(XEN)  VTTBR_EL2: 0001001ffabac000
(XEN)
(XEN)  SCTLR_EL2: 30cd183d
(XEN)    HCR_EL2: 000000008038663f
(XEN)  TTBR0_EL2: 0000001fffefe000
(XEN)
(XEN)    ESR_EL2: 96000044
(XEN)  HPFAR_EL2: 0000008010000000
(XEN)    FAR_EL2: 8022bf008022be10
(XEN)
(XEN) Xen stack trace from sp=0000801ffaba7c10:
(XEN)    0000801ffaba7c70 0000000000236394 0000000000000100 0000000000000060
(XEN)    0000000000000150 0000801ff5a62530 0000000000000000 0000801ff5a62528
(XEN)    0000801ffa32f0e0 0000000000000005 000000000000013b ffff8000ea2c6800
(XEN)    0000801ffaba7cc0 000000000024c154 0000801fff8ebec0 000000000000013b
(XEN)    0000801ff5a62000 0000801ff5a62530 0000000000000000 0000801ff5a62528
(XEN)    0000801ffa32f0e0 0000801ffaba7d70 0000801ffaba7d70 0000000000253590
(XEN)    000000000000013b 0000801ffaba7f30 ffff8000000ba1b8 ffff8000000ba1b8
(XEN)    0000000060000045 ffff800000e40000 ffff800000f50000 0000000000000000
(XEN)    ffff800000e30ec8 ffff8000ea2c6800 0000801ffaba7d30 00000000ffffffc8
(XEN)    0000000060000045 000000000026be48 0000000000000000 0000000000000001
(XEN)    000000000000003b 000000000000013b 0000801ffaba7da4 ffff8012f21bb450
(XEN)    0000801ffaba7db0 0000000000254de4 0000801ffaba7eb0 00000000002549f4
(XEN)    000000005a000ea1 000000013b010000 0000000000000000 ffff8000000ba1b8
(XEN)    0000801ffaba7e10 00000000002572a4 000000005a000ea1 0000801ffaba7eb0
(XEN)    000000005a000ea1 0000000000256550 0000801ffaba7e70 0000000000248e34
(XEN)    0000000000313c80 ffff800000d96000 ffffffffffffffff ffff800000525484
(XEN)    ffff8012f3aeb780 000000000025fb54 ffff8012f21cf098 ffff8012f21cf000
(XEN)    ffffffffffffffff ffff8000000ba1b8 0000000060000045 ffff800000e40000
(XEN)    ffff800000f50000 0000000000000000 ffff800000e30ec8 ffff8000ea2c6800
(XEN)    0000801ffaba7e90 000000000025827c 0000000000000002 0000000000258298
(XEN)    ffff8012f3aeafd0 000000000025fb58 0000000000000002 ffff800000d96000
(XEN)    0000000000000019 ffff8012f3aeb7c0 0000000000000007 0000000000000000
(XEN)    0000000000000001 ffff800000e34330 0000000080808080 ffff8012f21bb450
(XEN)    7f7f7f7f7f7f7f7f 5e646c68736d7471 7f7f7f7f7f7f7f7f 0101010101010101
(XEN)    0000000000000020 6962343620666666 6d6f726628205d74 0000000000000000
(XEN)    0000000000000021 000000000000000b 000000000000000d ffff8012f21cf098
(XEN)    ffff8012f21cf000 ffff800000d90000 0000000000000000 ffff8012f21cf098
(XEN)    ffff800000e40000 ffff800000f50000 0000000000000000 ffff800000e30ec8
(XEN)    ffff8000ea2c6800 ffff8012f3aeb780 ffff800000599718 ffffffffffffffff
(XEN)    ffff8000000ba1b8 0000000060000045 0000000060000045 0000000000000000
(XEN)    0000000000000000 0000000000000000 ffff8012f3aeb780 ffff80000010b068
(XEN)    0000000000000000 0000000000000000
(XEN) Xen call trace:
(XEN)    [<0000000000235e3c>] xmem_pool_alloc+0x33c/0x49c (PC)
(XEN)    [<0000000000235c28>] xmem_pool_alloc+0x128/0x49c (LR)
(XEN)    [<0000000000236394>] _xmalloc+0xfc/0x274
(XEN)    [<000000000024c154>] gicv3_its_map_guest_device+0xb0/0x2a0
(XEN)    [<0000000000253590>] do_physdev_op+0xc4/0x114
(XEN)    [<0000000000254de4>] traps.c#do_trap_hypercall+0x90/0x12c
(XEN)    [<00000000002572a4>] do_trap_hypervisor+0xd88/0x1c6c
(XEN)    [<000000000025fb54>] entry.o#guest_sync+0x90/0xc0
(XEN)

Note: I have added a print similar to the one below, which you can see in the log:
(XEN) In do_physdev_op calling gicv3_its_map_device for S: 0 B: 1 F:59 DEVID 315
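
For reference, the DEVID in that line is just the packed segment/bus/devfn value from my patch. A standalone sketch of the same arithmetic follows; pci_devid() is only a name I picked for illustration and does not exist in the tree.

```c
#include <stdint.h>

/* Pack segment, bus and devfn into one device ID, as in the patch above. */
static inline uint32_t pci_devid(uint16_t seg, uint8_t bus, uint8_t devfn)
{
    return ((uint32_t)seg << 16) | ((uint32_t)bus << 8) | devfn;
}
```

For segment 0, bus 1, devfn 59 this gives (1 << 8) | 59 = 315, which matches the DEVID printed above.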


On Tue, Jan 31, 2017 at 12:01 AM, Andre Przywara <andre.przywara@arm.com> wrote:
> Hi,
>
> after the two RFC versions now the first "serious" attempt for emulating
> an ARM GICv3 ITS interrupt controller, for Dom0 only at the moment.
> The ITS is an interrupt controller widget providing a sophisticated way
> of dealing with MSIs in a scalable manner.
> For hardware which relies on the ITS to provide interrupts for its
> peripherals this code is needed to get a machine booted into Dom0 at all.
> ITS emulation for DomUs is only really useful with PCI passthrough,
> which is not yet available for ARM. It is expected that this feature
> will be co-developed with the ITS DomU code. However this code drop here
> considered DomU emulation already, to keep later architectural changes
> to a minimum.
>
> Some generic design principles:
>
> * The current GIC code statically allocates structures for each supported
> IRQ (both for the host and the guest), which due to the potentially
> millions of LPI interrupts is not feasible to copy for the ITS.
> So we refrain from introducing the ITS as a first class Xen interrupt
> controller, also we don't hold struct irq_desc's or struct pending_irq's
> for each possible LPI.
> Fortunately LPIs are only interesting to guests, so we get away with
> storing only the virtual IRQ number and the guest VCPU for each allocated
> host LPI, which can be stashed into one uint64_t. This data is stored in
> a two-level table, which is both memory efficient and quick to access.
> We hook into the existing IRQ handling and VGIC code to avoid accessing
> the normal structures, providing alternative methods for getting the
> needed information (priority, is enabled?) for LPIs.
> For interrupts which are queued to or are actually in a guest we
> allocate struct pending_irq's on demand. As it is expected that only a
> very small number of interrupts is ever on a VCPU at the same time, this
> seems like the best approach. For now allocated structs are re-used and
> held in a linked list. Should it emerge that traversing a linked list
> is a performance issue, this can be changed to use a hash table.
>
> * On the guest side we (later will) have to deal with malicious guests
> trying to hog Xen with mapping requests for a lot of LPIs, for instance.
> As the ITS actually uses system memory for storing status information,
> we use this memory (which the guest has to provide) to naturally limit
> a guest. For those tables which are page sized (devices, collections (CPUs),
> LPI properties) we map those pages into Xen, so we can easily access
> them from the virtual GIC code.
> Unfortunately the actual interrupt mapping tables are not necessarily
> page aligned, also can be much smaller than a page, so mapping all of
> them permanently is fiddly. As ITS commands in need to iterate those
> tables are pretty rare after all, we for now map them on demand upon
> emulating a virtual ITS command. This is acceptable because "mapping"
> them is actually very cheap on arm64. Also as we can't properly protect
> those areas due to their sub-page-size property, we validate the data
> in there before actually using it. The vITS code basically just stores
> the data in there which the guest has actually transferred via the
> virtual ITS command queue before, so there is no secret revealed nor
> does it create an attack vector for a malicious guest.
>
> * An obvious approach to handling some guest ITS commands would be to
> propagate them to the host, for instance to map devices and LPIs and
> to enable or disable LPIs.
> However this (later with DomU support) will create an attack vector, as
> a malicious guest could try to fill the host command queue with
> propagated commands.
> So (in contrast to the first RFC post) we completely avoid this situation.
> For mapping devices and LPIs we rely on this being done via a hypercall
> prior to the actual guest run. For enabling and disabling LPIs we keep
> this bit on the virtual side and let LPIs always be enabled on the host side,
> dealing with the consequences this approach creates.
>
> As it is expected that the ITS support will become a tech preview in the
> first release, there is a Kconfig option to enable it. Also it is
> supported on arm64 only, which will most likely not change in the future.
> This leads to some hideous constructs like an #ifdef'ed header file with
> empty function stubs, I have some hope we can still clean this up.
> Also some parameters are config options which can be overridden on the
> Xen commandline. This is to support experimentation and adaption to
> various platforms, ideally we find either one-size-fits-all values or
> find another way of getting rid of this.
>
> Compared to the previous post (RFC-v2) this has seen a lot of reworks
> and cleanups in various areas.
> I tried to address all of the review comments, though some are hard to
> follow due to rewrites. So apologies if some points have slipped through.
> Allocating and mapping of memory for both the physical and virtual ITS
> and redistributor tables has been improved, though I didn't manage to
> write protect the virtual tables from a guest without impacting access
> from Xen at the same time. I will need to take a deeper look into this,
> but ideally it's only a small change in get_guest_pages().
>
> This code boots Dom0 on an ARM Fast Model with ITS support. I tried to
> address the issues seen by people running the previous version on real
> hardware, though couldn't verify this here for myself.
> So any testing, bug reports (and possibly even fixes) are very welcome.
>
> The code can also be found on the its/v1 branch here:
> git://linux-arm.org/xen-ap.git
> http://www.linux-arm.org/git?p=xen-ap.git;a=shortlog;h=refs/heads/its/v1
>
> Cheers,
> Andre
>
> (Rough) changelog RFC-v2 .. v1:
> - split host ITS driver into gic-v3-lpi.c and gic-v3-its.c part
> - rename virtual ITS driver file to vgic-v3-its.c
> - use macros and named constants for all magic numbers
> - use atomic accessors for accessing the host LPI data
> - remove leftovers from connecting virtual and host ITSes
> - bail out if host ITS is disabled in the DT
> - rework map/unmap_guest_pages():
>     - split off p2m part as get/put_guest_pages (to be done on allocation)
>     - get rid of vmap, using map_domain_page() instead
> - delay allocation of virtual tables until actual LPI/ITS enablement
> - properly size both virtual and physical tables upon allocation
> - fix put_domain() locking issues in physdev_op and LPI handling code
> - add and extend comments in various areas
> - fix lotsa coding style and white space issues, including comment style
> - add locking to data structures not yet covered
> - fix various locking issues
> - use an rbtree to deal with ITS devices (instead of a list)
> - properly handle memory attributes for ITS tables
> - handle cacheable/non-cacheable ITS table mappings
> - sanitize guest provided ITS/LPI table attributes
> - fix breakage on non-GICv2 compatible host GICv3 controllers
> - add command line parameters on top of Kconfig options
> - properly wait for an ITS to become quiescient before enabling it
> - handle host ITS command queue errors
> - actually wait for host ITS command completion (READR==WRITER)
> - fix ARM32 compilation
> - various patch splits and reorderings
>
> Andre Przywara (28):
>   ARM: export __flush_dcache_area()
>   ARM: GICv3 ITS: parse and store ITS subnodes from hardware DT
>   ARM: GICv3: allocate LPI pending and property table
>   ARM: GICv3 ITS: allocate device and collection table
>   ARM: GICv3 ITS: map ITS command buffer
>   ARM: GICv3 ITS: introduce ITS command handling
>   ARM: GICv3 ITS: introduce device mapping
>   ARM: GICv3 ITS: introduce host LPI array
>   ARM: GICv3 ITS: map device and LPIs to the ITS on physdev_op hypercall
>   ARM: GICv3: introduce separate pending_irq structs for LPIs
>   ARM: GICv3: forward pending LPIs to guests
>   ARM: GICv3: enable ITS and LPIs on the host
>   ARM: vGICv3: handle virtual LPI pending and property tables
>   ARM: vGICv3: Handle disabled LPIs
>   ARM: vGICv3: introduce basic ITS emulation bits
>   ARM: vITS: introduce translation table walks
>   ARM: vITS: handle CLEAR command
>   ARM: vITS: handle INT command
>   ARM: vITS: handle MAPC command
>   ARM: vITS: handle MAPD command
>   ARM: vITS: handle MAPTI command
>   ARM: vITS: handle MOVI command
>   ARM: vITS: handle DISCARD command
>   ARM: vITS: handle INV command
>   ARM: vITS: handle INVALL command
>   ARM: vITS: create and initialize virtual ITSes for Dom0
>   ARM: vITS: create ITS subnodes for Dom0 DT
>   ARM: vGIC: advertising LPI support
>
>  xen/arch/arm/Kconfig              |  33 ++
>  xen/arch/arm/Makefile             |   3 +
>  xen/arch/arm/efi/efi-boot.h       |   1 -
>  xen/arch/arm/gic-v3-its.c         | 825 +++++++++++++++++++++++++++++++++
>  xen/arch/arm/gic-v3-lpi.c         | 414 +++++++++++++++++
>  xen/arch/arm/gic-v3.c             |  98 +++-
>  xen/arch/arm/gic.c                |   9 +-
>  xen/arch/arm/physdev.c            |  21 +
>  xen/arch/arm/vgic-v3-its.c        | 929 ++++++++++++++++++++++++++++++++++++++
>  xen/arch/arm/vgic-v3.c            | 347 ++++++++++++--
>  xen/arch/arm/vgic.c               |  68 ++-
>  xen/include/asm-arm/atomic.h      |   6 +-
>  xen/include/asm-arm/bitops.h      |   1 +
>  xen/include/asm-arm/cache.h       |   4 +
>  xen/include/asm-arm/domain.h      |  14 +-
>  xen/include/asm-arm/gic.h         |   7 +
>  xen/include/asm-arm/gic_v3_defs.h |  73 ++-
>  xen/include/asm-arm/gic_v3_its.h  | 241 ++++++++++
>  xen/include/asm-arm/irq.h         |   8 +
>  xen/include/asm-arm/vgic.h        |  34 ++
>  20 files changed, 3089 insertions(+), 47 deletions(-)
>  create mode 100644 xen/arch/arm/gic-v3-its.c
>  create mode 100644 xen/arch/arm/gic-v3-lpi.c
>  create mode 100644 xen/arch/arm/vgic-v3-its.c
>  create mode 100644 xen/include/asm-arm/gic_v3_its.h
>
> --
> 2.9.0
>



* Re: [PATCH 03/28] ARM: GICv3: allocate LPI pending and property table
  2017-01-30 18:31 ` [PATCH 03/28] ARM: GICv3: allocate LPI pending and property table Andre Przywara
  2017-02-06 16:26   ` Julien Grall
@ 2017-02-14  0:47   ` Stefano Stabellini
  1 sibling, 0 replies; 106+ messages in thread
From: Stefano Stabellini @ 2017-02-14  0:47 UTC (permalink / raw)
  To: Andre Przywara; +Cc: xen-devel, Julien Grall, Stefano Stabellini, Vijay Kilari

On Mon, 30 Jan 2017, Andre Przywara wrote:
> The ARM GICv3 provides a new kind of interrupt called LPIs.
> The pending bits and the configuration data (priority, enable bits) for
> those LPIs are stored in tables in normal memory, which software has to
> provide to the hardware.
> Allocate the required memory, initialize it and hand it over to each
> redistributor. The maximum number of LPIs to be used can be adjusted with
> the command line option "max_lpi_bits", which defaults to a compile time
> constant exposed in Kconfig.
> 
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> ---
>  xen/arch/arm/Kconfig              |  15 +++++
>  xen/arch/arm/Makefile             |   1 +
>  xen/arch/arm/gic-v3-lpi.c         | 129 ++++++++++++++++++++++++++++++++++++++
>  xen/arch/arm/gic-v3.c             |  44 +++++++++++++
>  xen/include/asm-arm/bitops.h      |   1 +
>  xen/include/asm-arm/gic.h         |   2 +
>  xen/include/asm-arm/gic_v3_defs.h |  52 ++++++++++++++-
>  xen/include/asm-arm/gic_v3_its.h  |  22 ++++++-
>  8 files changed, 264 insertions(+), 2 deletions(-)
>  create mode 100644 xen/arch/arm/gic-v3-lpi.c
> 
> diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
> index bf64c61..71734a1 100644
> --- a/xen/arch/arm/Kconfig
> +++ b/xen/arch/arm/Kconfig
> @@ -49,6 +49,21 @@ config HAS_ITS
>          bool "GICv3 ITS MSI controller support"
>          depends on HAS_GICV3
>  
> +config MAX_PHYS_LPI_BITS
> +        depends on HAS_ITS
> +        int "Maximum bits for GICv3 host LPIs (14-32)"
> +        range 14 32
> +        default "20"
> +        help
> +          Specifies the maximum number of LPIs (in bits) Xen should take
> +          care of. The host ITS may provide support for a very large number
> +          of supported LPIs, for all of which we may not want to allocate
> +          memory, so this number here allows to limit this.
> +          Xen itself does not know how many LPIs domains will ever need
> +          beforehand.
> +          This can be overridden on the command line with the max_lpi_bits
> +          parameter.
> +
>  endmenu
>  
>  menu "ARM errata workaround via the alternative framework"
> diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
> index 5f4ff23..4ccf2eb 100644
> --- a/xen/arch/arm/Makefile
> +++ b/xen/arch/arm/Makefile
> @@ -19,6 +19,7 @@ obj-y += gic.o
>  obj-y += gic-v2.o
>  obj-$(CONFIG_HAS_GICV3) += gic-v3.o
>  obj-$(CONFIG_HAS_ITS) += gic-v3-its.o
> +obj-$(CONFIG_HAS_ITS) += gic-v3-lpi.o
>  obj-y += guestcopy.o
>  obj-y += hvm.o
>  obj-y += io.o
> diff --git a/xen/arch/arm/gic-v3-lpi.c b/xen/arch/arm/gic-v3-lpi.c
> new file mode 100644
> index 0000000..e2fc901
> --- /dev/null
> +++ b/xen/arch/arm/gic-v3-lpi.c
> @@ -0,0 +1,129 @@
> +/*
> + * xen/arch/arm/gic-v3-lpi.c
> + *
> + * ARM GICv3 Locality-specific Peripheral Interrupts (LPI) support
> + *
> + * Copyright (C) 2016,2017 - ARM Ltd
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + */
> +
> +#include <xen/config.h>
> +#include <xen/lib.h>
> +#include <xen/mm.h>
> +#include <xen/sizes.h>
> +#include <asm/gic.h>
> +#include <asm/gic_v3_defs.h>
> +#include <asm/gic_v3_its.h>
> +
> +/* Global state */
> +static struct {
> +    uint8_t *lpi_property;
> +    unsigned int host_lpi_bits;

It's still missing a comment.


> +} lpi_data;
> +
> +/* Pending table for each redistributor */
> +static DEFINE_PER_CPU(void *, pending_table);
> +
> +#define MAX_PHYS_LPIS   (BIT_ULL(lpi_data.host_lpi_bits) - LPI_OFFSET)
> +
> +uint64_t gicv3_lpi_allocate_pendtable(void)
> +{
> +    uint64_t reg;
> +    void *pendtable;
> +
> +    reg  = GIC_BASER_CACHE_RaWaWb << GICR_PENDBASER_INNER_CACHEABILITY_SHIFT;
> +    reg |= GIC_BASER_CACHE_SameAsInner << GICR_PENDBASER_OUTER_CACHEABILITY_SHIFT;
> +    reg |= GIC_BASER_InnerShareable << GICR_PENDBASER_SHAREABILITY_SHIFT;
> +
> +    if ( !this_cpu(pending_table) )
> +    {
> +        /*
> +         * The pending table holds one bit per LPI and even covers bits for
> +         * interrupt IDs below 8192, so we allocate the full range.
> +         * The GICv3 imposes a 64KB alignment requirement.
> +         */
> +        pendtable = _xmalloc(BIT_ULL(lpi_data.host_lpi_bits) / 8, SZ_64K);
> +        if ( !pendtable )
> +            return 0;
> +
> +        memset(pendtable, 0, BIT_ULL(lpi_data.host_lpi_bits) / 8);
> +        __flush_dcache_area(pendtable, BIT_ULL(lpi_data.host_lpi_bits) / 8);
> +
> +        this_cpu(pending_table) = pendtable;
> +    }
> +    else
> +    {
> +        pendtable = this_cpu(pending_table);

It's still missing a BUG_ON.


> +    }
> +
> +    reg |= GICR_PENDBASER_PTZ;
> +
> +    ASSERT(!(virt_to_maddr(pendtable) & ~GENMASK(51, 16)));
> +    reg |= virt_to_maddr(pendtable);
> +
> +    return reg;
> +}
> +
> +uint64_t gicv3_lpi_get_proptable(void)
> +{
> +    uint64_t reg;
> +
> +    reg  = GIC_BASER_CACHE_RaWaWb << GICR_PENDBASER_INNER_CACHEABILITY_SHIFT;
> +    reg |= GIC_BASER_CACHE_SameAsInner << GICR_PENDBASER_OUTER_CACHEABILITY_SHIFT;
> +    reg |= GIC_BASER_InnerShareable << GICR_PENDBASER_SHAREABILITY_SHIFT;
> +
> +    /*
> +     * The property table is shared across all redistributors, so allocate
> +     * this only once, but return the same value on subsequent calls.
> +     */
> +    if ( !lpi_data.lpi_property )
> +    {
> +        /* The property table holds one byte per LPI. */
> +        void *table = alloc_xenheap_pages(lpi_data.host_lpi_bits - PAGE_SHIFT,
> +                                          0);
> +
> +        if ( !table )
> +            return 0;
> +
> +        memset(table, GIC_PRI_IRQ | LPI_PROP_RES1, MAX_PHYS_LPIS);
> +        __flush_dcache_area(table, MAX_PHYS_LPIS);
> +        lpi_data.lpi_property = table;
> +    }
> +
> +    reg |= ((lpi_data.host_lpi_bits - 1) << 0);
> +
> +    ASSERT(!(virt_to_maddr(lpi_data.lpi_property) & ~GENMASK(51, 12)));
> +    reg |= virt_to_maddr(lpi_data.lpi_property);
> +
> +    return reg;
> +}
> +
> +static unsigned int max_lpi_bits = CONFIG_MAX_PHYS_LPI_BITS;
> +integer_param("max_lpi_bits", max_lpi_bits);
> +
> +int gicv3_lpi_init_host_lpis(unsigned int hw_lpi_bits)
            ^gicv3_lpi_init_phys_lpis


> +{
> +    lpi_data.host_lpi_bits = min(hw_lpi_bits, max_lpi_bits);
> +
> +    printk("GICv3: using at most %lld LPIs on the host.\n", MAX_PHYS_LPIS);
> +
> +    return 0;
> +}
> +
> +/*
> + * Local variables:
> + * mode: C
> + * c-file-style: "BSD"
> + * c-basic-offset: 4
> + * indent-tabs-mode: nil
> + * End:
> + */
> diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
> index 838dd11..fcb86c8 100644
> --- a/xen/arch/arm/gic-v3.c
> +++ b/xen/arch/arm/gic-v3.c
> @@ -546,6 +546,9 @@ static void __init gicv3_dist_init(void)
>      type = readl_relaxed(GICD + GICD_TYPER);
>      nr_lines = 32 * ((type & GICD_TYPE_LINES) + 1);
>  
> +    if ( type & GICD_TYPE_LPIS )
> +        gicv3_lpi_init_host_lpis(((type >> GICD_TYPE_ID_BITS_SHIFT) & 0x1f) + 1);
> +
>      printk("GICv3: %d lines, (IID %8.8x).\n",
>             nr_lines, readl_relaxed(GICD + GICD_IIDR));
>  
> @@ -616,6 +619,33 @@ static int gicv3_enable_redist(void)
>      return 0;
>  }
>  
> +static int gicv3_rdist_init_lpis(void __iomem * rdist_base)
> +{
> +    uint32_t reg;
> +    uint64_t table_reg;
> +
> +    /* We don't support LPIs without an ITS. */
> +    if ( list_empty(&host_its_list) )
> +        return -ENODEV;
> +
> +    /* Make sure LPIs are disabled before setting up the tables. */
> +    reg = readl_relaxed(rdist_base + GICR_CTLR);
> +    if ( reg & GICR_CTLR_ENABLE_LPIS )
> +        return -EBUSY;
> +
> +    table_reg = gicv3_lpi_allocate_pendtable();
> +    if ( !table_reg )
> +        return -ENOMEM;
> +    writeq_relaxed(table_reg, rdist_base + GICR_PENDBASER);
> +
> +    table_reg = gicv3_lpi_get_proptable();
> +    if ( !table_reg )
> +        return -ENOMEM;
> +    writeq_relaxed(table_reg, rdist_base + GICR_PROPBASER);
> +
> +    return 0;
> +}
> +
>  static int __init gicv3_populate_rdist(void)
>  {
>      int i;
> @@ -658,6 +688,20 @@ static int __init gicv3_populate_rdist(void)
>              if ( (typer >> 32) == aff )
>              {
>                  this_cpu(rbase) = ptr;
> +
> +                if ( typer & GICR_TYPER_PLPIS )
> +                {
> +                    int ret;
> +
> +                    ret = gicv3_rdist_init_lpis(ptr);
> +                    if ( ret && ret != -ENODEV )
> +                    {
> +                        printk("GICv3: CPU%d: Cannot initialize LPIs: %d\n",
> +                               smp_processor_id(), ret);
> +                        break;
> +                    }
> +                }
> +
>                  printk("GICv3: CPU%d: Found redistributor in region %d @%p\n",
>                          smp_processor_id(), i, ptr);
>                  return 0;
> diff --git a/xen/include/asm-arm/bitops.h b/xen/include/asm-arm/bitops.h
> index bda8898..1cbfb9e 100644
> --- a/xen/include/asm-arm/bitops.h
> +++ b/xen/include/asm-arm/bitops.h
> @@ -24,6 +24,7 @@
>  #define BIT(nr)                 (1UL << (nr))
>  #define BIT_MASK(nr)            (1UL << ((nr) % BITS_PER_WORD))
>  #define BIT_WORD(nr)            ((nr) / BITS_PER_WORD)
> +#define BIT_ULL(nr)             (1ULL << (nr))
>  #define BITS_PER_BYTE           8
>  
>  #define ADDR (*(volatile int *) addr)
> diff --git a/xen/include/asm-arm/gic.h b/xen/include/asm-arm/gic.h
> index 836a103..12bd155 100644
> --- a/xen/include/asm-arm/gic.h
> +++ b/xen/include/asm-arm/gic.h
> @@ -220,6 +220,8 @@ enum gic_version {
>      GIC_V3,
>  };
>  
> +#define LPI_OFFSET      8192
> +
>  extern enum gic_version gic_hw_version(void);
>  
>  /* Program the IRQ type into the GIC */
> diff --git a/xen/include/asm-arm/gic_v3_defs.h b/xen/include/asm-arm/gic_v3_defs.h
> index 6bd25a5..b307322 100644
> --- a/xen/include/asm-arm/gic_v3_defs.h
> +++ b/xen/include/asm-arm/gic_v3_defs.h
> @@ -44,7 +44,8 @@
>  #define GICC_SRE_EL2_ENEL1           (1UL << 3)
>  
>  /* Additional bits in GICD_TYPER defined by GICv3 */
> -#define GICD_TYPE_ID_BITS_SHIFT 19
> +#define GICD_TYPE_ID_BITS_SHIFT      19
> +#define GICD_TYPE_LPIS               (1U << 17)
>  
>  #define GICD_CTLR_RWP                (1UL << 31)
>  #define GICD_CTLR_ARE_NS             (1U << 4)
> @@ -95,12 +96,61 @@
>  #define GICR_IGRPMODR0               (0x0D00)
>  #define GICR_NSACR                   (0x0E00)
>  
> +#define GICR_CTLR_ENABLE_LPIS        (1U << 0)
> +
>  #define GICR_TYPER_PLPIS             (1U << 0)
>  #define GICR_TYPER_VLPIS             (1U << 1)
>  #define GICR_TYPER_LAST              (1U << 4)
>  
> +/* For specifying the inner cacheability type only */
> +#define GIC_BASER_CACHE_nCnB         0ULL
> +/* For specifying the outer cacheability type only */
> +#define GIC_BASER_CACHE_SameAsInner  0ULL
> +#define GIC_BASER_CACHE_nC           1ULL
> +#define GIC_BASER_CACHE_RaWt         2ULL
> +#define GIC_BASER_CACHE_RaWb         3ULL
> +#define GIC_BASER_CACHE_WaWt         4ULL
> +#define GIC_BASER_CACHE_WaWb         5ULL
> +#define GIC_BASER_CACHE_RaWaWt       6ULL
> +#define GIC_BASER_CACHE_RaWaWb       7ULL
> +#define GIC_BASER_CACHE_MASK         7ULL
> +
> +#define GIC_BASER_NonShareable       0ULL
> +#define GIC_BASER_InnerShareable     1ULL
> +#define GIC_BASER_OuterShareable     2ULL
> +
> +#define GICR_PROPBASER_SHAREABILITY_SHIFT               10
> +#define GICR_PROPBASER_INNER_CACHEABILITY_SHIFT         7
> +#define GICR_PROPBASER_OUTER_CACHEABILITY_SHIFT         56
> +#define GICR_PROPBASER_SHAREABILITY_MASK                     \
> +        (3UL << GICR_PROPBASER_SHAREABILITY_SHIFT)
> +#define GICR_PROPBASER_INNER_CACHEABILITY_MASK               \
> +        (7UL << GICR_PROPBASER_INNER_CACHEABILITY_SHIFT)
> +#define GICR_PROPBASER_OUTER_CACHEABILITY_MASK               \
> +        (7UL << GICR_PROPBASER_OUTER_CACHEABILITY_SHIFT)
> +#define GICR_PROPBASER_RES0_MASK                             \
> +        (GENMASK(63, 59) | GENMASK(55, 52) | GENMASK(6, 5))
> +
> +#define GICR_PENDBASER_SHAREABILITY_SHIFT               10
> +#define GICR_PENDBASER_INNER_CACHEABILITY_SHIFT         7
> +#define GICR_PENDBASER_OUTER_CACHEABILITY_SHIFT         56
> +#define GICR_PENDBASER_SHAREABILITY_MASK                     \
> +	(3UL << GICR_PENDBASER_SHAREABILITY_SHIFT)
> +#define GICR_PENDBASER_INNER_CACHEABILITY_MASK               \
> +	(7UL << GICR_PENDBASER_INNER_CACHEABILITY_SHIFT)
> +#define GICR_PENDBASER_OUTER_CACHEABILITY_MASK               \
> +        (7UL << GICR_PENDBASER_OUTER_CACHEABILITY_SHIFT)
> +#define GICR_PENDBASER_PTZ                              BIT(62)
> +#define GICR_PENDBASER_RES0_MASK                             \
> +        (BIT(63) | GENMASK(61, 59) | GENMASK(55, 52) |       \
> +         GENMASK(15, 12) | GENMASK(6, 0))
> +
>  #define DEFAULT_PMR_VALUE            0xff
>  
> +#define LPI_PROP_PRIO_MASK           0xfc
> +#define LPI_PROP_RES1                (1 << 1)
> +#define LPI_PROP_ENABLED             (1 << 0)
> +
>  #define GICH_VMCR_EOI                (1 << 9)
>  #define GICH_VMCR_VENG1              (1 << 1)
>  
> diff --git a/xen/include/asm-arm/gic_v3_its.h b/xen/include/asm-arm/gic_v3_its.h
> index 2f5c51c..a66b6be 100644
> --- a/xen/include/asm-arm/gic_v3_its.h
> +++ b/xen/include/asm-arm/gic_v3_its.h
> @@ -36,12 +36,32 @@ extern struct list_head host_its_list;
>  /* Parse the host DT and pick up all host ITSes. */
>  void gicv3_its_dt_init(const struct dt_device_node *node);
>  
> +/* Allocate and initialize tables for each host redistributor.
> + * Returns the respective {PROP,PEND}BASER register value.
> + */
> +uint64_t gicv3_lpi_get_proptable(void);
> +uint64_t gicv3_lpi_allocate_pendtable(void);
> +
> +/* Initialize the host structures for LPIs. */
> +int gicv3_lpi_init_host_lpis(unsigned int nr_lpis);
> +
>  #else
>  
>  static inline void gicv3_its_dt_init(const struct dt_device_node *node)
>  {
>  }
> -
> +static inline uint64_t gicv3_lpi_get_proptable(void)
> +{
> +    return 0;
> +}
> +static inline uint64_t gicv3_lpi_allocate_pendtable(void)
> +{
> +    return 0;
> +}
> +static inline int gicv3_lpi_init_host_lpis(unsigned int nr_lpis)
> +{
> +    return 0;
> +}
>  #endif /* CONFIG_HAS_ITS */
>  
>  #endif /* __ASSEMBLY__ */
> -- 
> 2.9.0
> 
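
For reference, the table sizes implied by the comments in the patch can be
sketched as below. This is only an illustration under the patch's own stated
rules (one pending bit per interrupt ID including IDs below 8192, one
property byte per LPI starting at ID 8192). The helper names
lpi_pendtable_bytes() and lpi_proptable_bytes() are hypothetical and do not
exist in Xen.

```c
#include <stdint.h>

#define LPI_OFFSET 8192ULL  /* first LPI interrupt ID, as defined in the patch */

/*
 * Pending table: one bit per interrupt ID. The table also covers the IDs
 * below 8192, so the full 2^lpi_bits range is counted.
 * (Hypothetical helper, for illustration only.)
 */
static inline uint64_t lpi_pendtable_bytes(unsigned int lpi_bits)
{
    return (1ULL << lpi_bits) / 8;
}

/*
 * Property table: one byte per LPI. Only IDs from 8192 upwards are LPIs,
 * which matches the MAX_PHYS_LPIS expression used in the patch.
 * (Hypothetical helper, for illustration only.)
 */
static inline uint64_t lpi_proptable_bytes(unsigned int lpi_bits)
{
    return (1ULL << lpi_bits) - LPI_OFFSET;
}
```

With the Kconfig default of 20 bits this gives a 128KB pending table per
redistributor and a bit less than 1MB of property bytes shared by all of them.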

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


* Re: [PATCH 04/28] ARM: GICv3 ITS: allocate device and collection table
  2017-01-30 18:31 ` [PATCH 04/28] ARM: GICv3 ITS: allocate device and collection table Andre Przywara
                     ` (2 preceding siblings ...)
  2017-02-06 17:43   ` Julien Grall
@ 2017-02-14  0:54   ` Stefano Stabellini
  2017-02-15 18:31   ` Shanker Donthineni
  2017-02-16 19:03   ` Shanker Donthineni
  5 siblings, 0 replies; 106+ messages in thread
From: Stefano Stabellini @ 2017-02-14  0:54 UTC (permalink / raw)
  To: Andre Przywara; +Cc: xen-devel, Julien Grall, Stefano Stabellini, Vijay Kilari

On Mon, 30 Jan 2017, Andre Przywara wrote:
> Each ITS maps a pair of a DeviceID (usually the PCI b/d/f triplet) and
> an EventID (the MSI payload or interrupt ID) to a pair of LPI number
> and collection ID, which points to the target CPU.
> This mapping is stored in the device and collection tables, which software
> has to provide for the ITS to use.
> Allocate the required memory and hand it to the ITS.
> The maximum number of devices is limited to a compile-time constant
> exposed in Kconfig.
> 
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> ---
>  xen/arch/arm/Kconfig             |  14 +++++
>  xen/arch/arm/gic-v3-its.c        | 129 +++++++++++++++++++++++++++++++++++++++
>  xen/arch/arm/gic-v3.c            |   5 ++
>  xen/include/asm-arm/gic_v3_its.h |  55 ++++++++++++++++-
>  4 files changed, 202 insertions(+), 1 deletion(-)
> 
> diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
> index 71734a1..81bc233 100644
> --- a/xen/arch/arm/Kconfig
> +++ b/xen/arch/arm/Kconfig
> @@ -64,6 +64,20 @@ config MAX_PHYS_LPI_BITS
>            This can be overridden on the command line with the max_lpi_bits
>            parameter.
>  
> +config MAX_PHYS_ITS_DEVICE_BITS
> +        depends on HAS_ITS
> +        int "Number of device bits the ITS supports"
> +        range 1 32
> +        default "10"
> +        help
> +          Specifies the maximum number of devices which want to use the ITS.
> +          Xen needs to allocate memory for the whole range very early.
> +          The allocation scheme may be sparse, so a much larger number must
> +          be supported to cover devices with a high bus number or those on
> +          separate bus segments.
> +          This can be overridden on the command line with the max_its_device_bits
> +          parameter.
> +
>  endmenu
>  
>  menu "ARM errata workaround via the alternative framework"
> diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
> index ff0f571..c31fef6 100644
> --- a/xen/arch/arm/gic-v3-its.c
> +++ b/xen/arch/arm/gic-v3-its.c
> @@ -20,9 +20,138 @@
>  #include <xen/lib.h>
>  #include <xen/device_tree.h>
>  #include <xen/libfdt/libfdt.h>
> +#include <xen/mm.h>
> +#include <xen/sizes.h>
>  #include <asm/gic.h>
>  #include <asm/gic_v3_defs.h>
>  #include <asm/gic_v3_its.h>
> +#include <asm/io.h>
> +
> +#define BASER_ATTR_MASK                                           \
> +        ((0x3UL << GITS_BASER_SHAREABILITY_SHIFT)               | \
> +         (0x7UL << GITS_BASER_OUTER_CACHEABILITY_SHIFT)         | \
> +         (0x7UL << GITS_BASER_INNER_CACHEABILITY_SHIFT))
> +#define BASER_RO_MASK   (GENMASK(58, 56) | GENMASK(52, 48))
> +
> +static uint64_t encode_phys_addr(paddr_t addr, int page_bits)
> +{
> +    uint64_t ret;
> +
> +    if ( page_bits < 16 )
> +        return (uint64_t)addr & GENMASK(47, page_bits);
> +
> +    ret = addr & GENMASK(47, 16);
> +    return ret | (addr & GENMASK(51, 48)) >> (48 - 12);
> +}
> +
> +#define PAGE_BITS(sz) ((sz) * 2 + PAGE_SHIFT)
> +
> +static int its_map_baser(void __iomem *basereg, uint64_t regc, int nr_items)
> +{
> +    uint64_t attr, reg;
> +    int entry_size = ((regc >> GITS_BASER_ENTRY_SIZE_SHIFT) & 0x1f) + 1;
> +    int pagesz = 0, order, table_size;
> +    void *buffer = NULL;
> +
> +    attr  = GIC_BASER_InnerShareable << GITS_BASER_SHAREABILITY_SHIFT;
> +    attr |= GIC_BASER_CACHE_SameAsInner << GITS_BASER_OUTER_CACHEABILITY_SHIFT;
> +    attr |= GIC_BASER_CACHE_RaWaWb << GITS_BASER_INNER_CACHEABILITY_SHIFT;
> +
> +    /*
> +     * Setup the BASE register with the attributes that we like. Then read
> +     * it back and see what sticks (page size, cacheability and shareability
> +     * attributes), retrying if necessary.
> +     */
> +    while ( 1 )
> +    {
> +        table_size = ROUNDUP(nr_items * entry_size, BIT(PAGE_BITS(pagesz)));
> +        order = get_order_from_bytes(table_size);
> +
> +        if ( !buffer )
> +            buffer = alloc_xenheap_pages(order, 0);
> +        if ( !buffer )
> +            return -ENOMEM;
> +
> +        reg  = attr;
> +        reg |= (pagesz << GITS_BASER_PAGE_SIZE_SHIFT);
> +        reg |= table_size >> PAGE_BITS(pagesz);
> +        reg |= regc & BASER_RO_MASK;
> +        reg |= GITS_VALID_BIT;
> +        reg |= encode_phys_addr(virt_to_maddr(buffer), PAGE_BITS(pagesz));
> +
> +        writeq_relaxed(reg, basereg);
> +        regc = readl_relaxed(basereg);
> +
> +        /* The host didn't like our attributes, just use what it returned. */
> +        if ( (regc & BASER_ATTR_MASK) != attr )
> +        {
> +            /* If we can't map it shareable, drop cacheability as well. */
> +            if ( (regc & GITS_BASER_SHAREABILITY_MASK) == GIC_BASER_NonShareable )
> +            {
> +                regc &= ~GITS_BASER_INNER_CACHEABILITY_MASK;
> +                attr = regc & BASER_ATTR_MASK;
> +                continue;

Why do we continue at this point? Shouldn't we go ahead to check if the
page size was accepted?


> +            }
> +            attr = regc & BASER_ATTR_MASK;
> +        }
> +
> +        /* If the host accepted our page size, we are done. */
> +        if ( (regc & (3UL << GITS_BASER_PAGE_SIZE_SHIFT)) == pagesz )
> +            return 0;
> +
> +        /* None of the page sizes was accepted, give up */
> +        if ( pagesz >= 2 )
> +            break;
> +
> +        free_xenheap_pages(buffer, order);
> +        buffer = NULL;
> +
> +        pagesz++;
> +    }
> +
> +    if ( buffer )
> +        free_xenheap_pages(buffer, order);
> +
> +    return -EINVAL;
> +}
> +
> +static unsigned int max_its_device_bits = CONFIG_MAX_PHYS_ITS_DEVICE_BITS;
> +integer_param("max_its_device_bits", max_its_device_bits);
> +
> +int gicv3_its_init(struct host_its *hw_its)
> +{
> +    uint64_t reg;
> +    int i;
> +
> +    hw_its->its_base = ioremap_nocache(hw_its->addr, hw_its->size);
> +    if ( !hw_its->its_base )
> +        return -ENOMEM;
> +
> +    for ( i = 0; i < GITS_BASER_NR_REGS; i++ )
> +    {
> +        void __iomem *basereg = hw_its->its_base + GITS_BASER0 + i * 8;
> +        int type;
> +
> +        reg = readq_relaxed(basereg);
> +        type = (reg & GITS_BASER_TYPE_MASK) >> GITS_BASER_TYPE_SHIFT;
> +        switch ( type )
> +        {
> +        case GITS_BASER_TYPE_NONE:
> +            continue;
> +        case GITS_BASER_TYPE_DEVICE:
> +            /* TODO: find some better way of limiting the number of devices */
> +            its_map_baser(basereg, reg, BIT(max_its_device_bits));
> +            break;
> +        case GITS_BASER_TYPE_COLLECTION:
> +            its_map_baser(basereg, reg, NR_CPUS);
> +            break;
> +        default:
> +            continue;
> +        }
> +    }
> +
> +    return 0;
> +}
>  
>  /* Scan the DT for any ITS nodes and create a list of host ITSes out of it. */
>  void gicv3_its_dt_init(const struct dt_device_node *node)
> diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
> index fcb86c8..440c079 100644
> --- a/xen/arch/arm/gic-v3.c
> +++ b/xen/arch/arm/gic-v3.c
> @@ -29,6 +29,7 @@
>  #include <xen/irq.h>
>  #include <xen/iocap.h>
>  #include <xen/sched.h>
> +#include <xen/err.h>
>  #include <xen/errno.h>
>  #include <xen/delay.h>
>  #include <xen/device_tree.h>
> @@ -1563,6 +1564,7 @@ static int __init gicv3_init(void)
>  {
>      int res, i;
>      uint32_t reg;
> +    struct host_its *hw_its;
>  
>      if ( !cpu_has_gicv3 )
>      {
> @@ -1618,6 +1620,9 @@ static int __init gicv3_init(void)
>      res = gicv3_cpu_init();
>      gicv3_hyp_init();
>  
> +    list_for_each_entry(hw_its, &host_its_list, entry)
> +        gicv3_its_init(hw_its);
> +
>      spin_unlock(&gicv3.lock);
>  
>      return res;
> diff --git a/xen/include/asm-arm/gic_v3_its.h b/xen/include/asm-arm/gic_v3_its.h
> index a66b6be..ed44bdb 100644
> --- a/xen/include/asm-arm/gic_v3_its.h
> +++ b/xen/include/asm-arm/gic_v3_its.h
> @@ -18,6 +18,53 @@
>  #ifndef __ASM_ARM_ITS_H__
>  #define __ASM_ARM_ITS_H__
>  
> +#define GITS_CTLR                       0x000
> +#define GITS_IIDR                       0x004
> +#define GITS_TYPER                      0x008
> +#define GITS_CBASER                     0x080
> +#define GITS_CWRITER                    0x088
> +#define GITS_CREADR                     0x090
> +#define GITS_BASER_NR_REGS              8
> +#define GITS_BASER0                     0x100
> +#define GITS_BASER1                     0x108
> +#define GITS_BASER2                     0x110
> +#define GITS_BASER3                     0x118
> +#define GITS_BASER4                     0x120
> +#define GITS_BASER5                     0x128
> +#define GITS_BASER6                     0x130
> +#define GITS_BASER7                     0x138
> +
> +/* Register bits */
> +#define GITS_VALID_BIT                  BIT_ULL(63)
> +
> +#define GITS_CTLR_QUIESCENT             BIT(31)
> +#define GITS_CTLR_ENABLE                BIT(0)
> +
> +#define GITS_IIDR_VALUE                 0x34c
> +
> +#define GITS_BASER_INDIRECT             BIT_ULL(62)
> +#define GITS_BASER_INNER_CACHEABILITY_SHIFT        59
> +#define GITS_BASER_TYPE_SHIFT           56
> +#define GITS_BASER_TYPE_MASK            (7ULL << GITS_BASER_TYPE_SHIFT)
> +#define GITS_BASER_OUTER_CACHEABILITY_SHIFT        53
> +#define GITS_BASER_TYPE_NONE            0UL
> +#define GITS_BASER_TYPE_DEVICE          1UL
> +#define GITS_BASER_TYPE_VCPU            2UL
> +#define GITS_BASER_TYPE_CPU             3UL
> +#define GITS_BASER_TYPE_COLLECTION      4UL
> +#define GITS_BASER_TYPE_RESERVED5       5UL
> +#define GITS_BASER_TYPE_RESERVED6       6UL
> +#define GITS_BASER_TYPE_RESERVED7       7UL
> +#define GITS_BASER_ENTRY_SIZE_SHIFT     48
> +#define GITS_BASER_SHAREABILITY_SHIFT   10
> +#define GITS_BASER_PAGE_SIZE_SHIFT      8
> +#define GITS_BASER_RO_MASK              (GITS_BASER_TYPE_MASK | \
> +                                        (31UL << GITS_BASER_ENTRY_SIZE_SHIFT) |\
> +                                        GITS_BASER_INDIRECT)
> +#define GITS_BASER_SHAREABILITY_MASK   (0x3ULL << GITS_BASER_SHAREABILITY_SHIFT)
> +#define GITS_BASER_OUTER_CACHEABILITY_MASK   (0x7ULL << GITS_BASER_OUTER_CACHEABILITY_SHIFT)
> +#define GITS_BASER_INNER_CACHEABILITY_MASK   (0x7ULL << GITS_BASER_INNER_CACHEABILITY_SHIFT)
> +
>  #ifndef __ASSEMBLY__
>  #include <xen/device_tree.h>
>  
> @@ -27,6 +74,7 @@ struct host_its {
>      const struct dt_device_node *dt_node;
>      paddr_t addr;
>      paddr_t size;
> +    void __iomem *its_base;
>  };
>  
>  extern struct list_head host_its_list;
> @@ -42,8 +90,9 @@ void gicv3_its_dt_init(const struct dt_device_node *node);
>  uint64_t gicv3_lpi_get_proptable(void);
>  uint64_t gicv3_lpi_allocate_pendtable(void);
>  
> -/* Initialize the host structures for LPIs. */
> +/* Initialize the host structures for LPIs and the host ITSes. */
>  int gicv3_lpi_init_host_lpis(unsigned int nr_lpis);
> +int gicv3_its_init(struct host_its *hw_its);
>  
>  #else
>  
> @@ -62,6 +111,10 @@ static inline int gicv3_lpi_init_host_lpis(unsigned int nr_lpis)
>  {
>      return 0;
>  }
> +static inline int gicv3_its_init(struct host_its *hw_its)
> +{
> +    return 0;
> +}
>  #endif /* CONFIG_HAS_ITS */
>  
>  #endif /* __ASSEMBLY__ */
> -- 
> 2.9.0
> 
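
To make the page size question concrete, here is a minimal sketch of how the
page size field could be pulled out of the value read back from the register
before comparing it with the requested index. It assumes the field sits at
bits [9:8], as GITS_BASER_PAGE_SIZE_SHIFT in the patch suggests. The names
BASER_PAGE_SIZE_MASK, baser_page_size() and baser_page_size_accepted() are
made up for this example and are not part of the patch.

```c
#include <stdbool.h>
#include <stdint.h>

#define BASER_PAGE_SIZE_SHIFT 8   /* same value as GITS_BASER_PAGE_SIZE_SHIFT */
#define BASER_PAGE_SIZE_MASK  (3ULL << BASER_PAGE_SIZE_SHIFT) /* hypothetical */

/* Extract the 2-bit page size index (0, 1 or 2) from a 64-bit register value. */
static inline unsigned int baser_page_size(uint64_t baser)
{
    return (unsigned int)((baser & BASER_PAGE_SIZE_MASK) >> BASER_PAGE_SIZE_SHIFT);
}

/*
 * True if the value read back carries the page size index we asked for.
 * The field is shifted down first, so both sides of the comparison are
 * plain indices.
 */
static inline bool baser_page_size_accepted(uint64_t readback, unsigned int pagesz)
{
    return baser_page_size(readback) == pagesz;
}
```

The sketch only covers the comparison itself. It says nothing about the order
in which attributes and page size should be retried.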


* Re: [PATCH 04/28] ARM: GICv3 ITS: allocate device and collection table
  2017-02-06 17:19   ` Julien Grall
@ 2017-02-14  0:55     ` Stefano Stabellini
  0 siblings, 0 replies; 106+ messages in thread
From: Stefano Stabellini @ 2017-02-14  0:55 UTC (permalink / raw)
  To: Julien Grall; +Cc: Andre Przywara, Stefano Stabellini, Vijay Kilari, xen-devel

On Mon, 6 Feb 2017, Julien Grall wrote:
> > +static uint64_t encode_phys_addr(paddr_t addr, int page_bits)
> > +{
> > +    uint64_t ret;
> > +
> > +    if ( page_bits < 16 )
> > +        return (uint64_t)addr & GENMASK(47, page_bits);
> > +
> > +    ret = addr & GENMASK(47, 16);
> > +    return ret | (addr & GENMASK(51, 48)) >> (48 - 12);
> > +}
> > +
> > +#define PAGE_BITS(sz) ((sz) * 2 + PAGE_SHIFT)
> 
> I know that PAGE_SHIFT was suggested by Stefano on the previous version.
> However, I think this is wrong. PAGE_BITS is not based on the page
> granularity of Xen, so I would much prefer to keep a hardcoded 12 with a
> comment.

OK
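
Something along these lines, as a rough sketch only. The ITS page size index
selects 4KB, 16KB or 64KB regardless of what page size the hypervisor uses,
so the base of 12 is spelled out explicitly. ITS_MIN_PAGE_BITS,
its_page_bits() and its_page_bytes() are names invented for this example.

```c
#include <stdint.h>

/*
 * The smallest ITS table page is 4KB (2^12 bytes). This is a property of
 * the ITS page size encoding, not of the hypervisor's own page granularity,
 * hence the hardcoded value. (Hypothetical name.)
 */
#define ITS_MIN_PAGE_BITS 12

/* Page size index 0, 1, 2 maps to 12, 14, 16 address bits. */
static inline unsigned int its_page_bits(unsigned int sz)
{
    return sz * 2 + ITS_MIN_PAGE_BITS;
}

/* Same mapping expressed in bytes: 4KB, 16KB, 64KB. */
static inline uint64_t its_page_bytes(unsigned int sz)
{
    return 1ULL << its_page_bits(sz);
}
```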


* Re: [PATCH 05/28] ARM: GICv3 ITS: map ITS command buffer
  2017-01-30 18:31 ` [PATCH 05/28] ARM: GICv3 ITS: map ITS command buffer Andre Przywara
  2017-02-06 17:43   ` Julien Grall
@ 2017-02-14  0:59   ` Stefano Stabellini
  2017-02-14 20:50     ` Julien Grall
  1 sibling, 1 reply; 106+ messages in thread
From: Stefano Stabellini @ 2017-02-14  0:59 UTC (permalink / raw)
  To: Andre Przywara; +Cc: xen-devel, Julien Grall, Stefano Stabellini, Vijay Kilari

On Mon, 30 Jan 2017, Andre Przywara wrote:
> Instead of directly manipulating the tables in memory, an ITS driver
> sends commands via a ring buffer to the ITS h/w to create or alter the
> LPI mappings.
> Allocate memory for that buffer and tell the ITS about it to be able
> to send ITS commands.
> 
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> ---
>  xen/arch/arm/gic-v3-its.c        | 46 ++++++++++++++++++++++++++++++++++++++++
>  xen/include/asm-arm/gic_v3_its.h |  6 ++++++
>  2 files changed, 52 insertions(+)
> 
> diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
> index c31fef6..ad7cd2a 100644
> --- a/xen/arch/arm/gic-v3-its.c
> +++ b/xen/arch/arm/gic-v3-its.c
> @@ -27,6 +27,8 @@
>  #include <asm/gic_v3_its.h>
>  #include <asm/io.h>
>  
> +#define ITS_CMD_QUEUE_SZ                SZ_64K
> +
>  #define BASER_ATTR_MASK                                           \
>          ((0x3UL << GITS_BASER_SHAREABILITY_SHIFT)               | \
>           (0x7UL << GITS_BASER_OUTER_CACHEABILITY_SHIFT)         | \
> @@ -44,6 +46,45 @@ static uint64_t encode_phys_addr(paddr_t addr, int page_bits)
>      return ret | (addr & GENMASK(51, 48)) >> (48 - 12);
>  }
>  
> +static void *its_map_cbaser(struct host_its *its)
> +{
> +    void __iomem *cbasereg = its->its_base + GITS_CBASER;
> +    uint64_t reg, regc;
> +    void *buffer;
> +    paddr_t paddr;
> +
> +    reg  = GIC_BASER_InnerShareable << GITS_BASER_SHAREABILITY_SHIFT;
> +    reg |= GIC_BASER_CACHE_SameAsInner << GITS_BASER_OUTER_CACHEABILITY_SHIFT;
> +    reg |= GIC_BASER_CACHE_RaWaWb << GITS_BASER_INNER_CACHEABILITY_SHIFT;
> +
> +    buffer = _xzalloc(ITS_CMD_QUEUE_SZ, PAGE_SIZE);
> +    if ( !buffer )
> +        return NULL;
> +    paddr = virt_to_maddr(buffer);
> +    ASSERT(!(paddr & ~GENMASK(51, 12)));
> +
> +    reg |= GITS_VALID_BIT | paddr;
> +    reg |= ((ITS_CMD_QUEUE_SZ / PAGE_SIZE) - 1) & GITS_CBASER_SIZE_MASK;
> +    writeq_relaxed(reg, cbasereg);
> +    regc = readq_relaxed(cbasereg);
> +
> +    /* If the ITS dropped shareability, drop cacheability as well. */
> +    if ( (regc & GITS_BASER_SHAREABILITY_MASK) == 0 )
> +    {
> +        regc &= ~GITS_BASER_INNER_CACHEABILITY_MASK;
> +        writeq_relaxed(regc, cbasereg);
> +    }
> +
> +    /*
> +     * If the command queue memory is mapped as uncached, we need to flush
> +     * it on every access.
> +     */
> +    if ( !(regc & GITS_BASER_INNER_CACHEABILITY_MASK) )
> +        its->flags |= HOST_ITS_FLUSH_CMD_QUEUE;
> +
> +    return buffer;
> +}
> +
>  #define PAGE_BITS(sz) ((sz) * 2 + PAGE_SHIFT)
>  
>  static int its_map_baser(void __iomem *basereg, uint64_t regc, int nr_items)
> @@ -150,6 +191,11 @@ int gicv3_its_init(struct host_its *hw_its)
>          }
>      }
>  
> +    hw_its->cmd_buf = its_map_cbaser(hw_its);
> +    if ( !hw_its->cmd_buf )
> +        return -ENOMEM;
> +    writeq_relaxed(0, hw_its->its_base + GITS_CWRITER);
 
Why this new write?


* Re: [PATCH 08/28] ARM: GICv3 ITS: introduce host LPI array
  2017-01-30 18:31 ` [PATCH 08/28] ARM: GICv3 ITS: introduce host LPI array Andre Przywara
  2017-02-07 18:01   ` Julien Grall
@ 2017-02-14 20:05   ` Stefano Stabellini
  1 sibling, 0 replies; 106+ messages in thread
From: Stefano Stabellini @ 2017-02-14 20:05 UTC (permalink / raw)
  To: Andre Przywara; +Cc: xen-devel, Julien Grall, Stefano Stabellini, Vijay Kilari

On Mon, 30 Jan 2017, Andre Przywara wrote:
> The number of LPIs on a host can potentially be huge (millions),
> although in practice it will mostly be reasonable. So prematurely allocating
> an array of struct irq_desc's for each LPI is not an option.
> However Xen itself does not care about LPIs, as every LPI will be injected
> into a guest (Dom0 for now).
> Create a dense data structure (8 Bytes) for each LPI which holds just
> enough information to determine the virtual IRQ number and the VCPU into
> which the LPI needs to be injected.
> Also to not artificially limit the number of LPIs, we create a 2-level
> table for holding those structures.
> This patch introduces functions to initialize these tables and to
> create, lookup and destroy entries for a given LPI.
> We allocate and access LPI information in a way that does not require
> a lock.
> 
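
The scheme described above could look roughly like the following sketch: an
8-byte entry per host LPI, reached through a first-level array of lazily
allocated pages. Everything here is an assumption made for illustration: the
field layout (including the dom_id field), the 512 entries per page, the use
of calloc() and all names. It is also single-threaded; the lock-free access
mentioned in the commit message would additionally need atomic 64-bit reads
and writes of each entry.

```c
#include <stdint.h>
#include <stdlib.h>

#define LPI_OFFSET        8192u  /* first LPI interrupt ID */
#define ENTRIES_PER_PAGE  512u   /* assumes 4KB pages of 8-byte entries */

/* First level: array of pointers to second-level pages (NULL if unused). */
struct lpi_table {
    uint64_t **pages;
    unsigned int nr_pages;
};

/* Pack the routing information into one 64-bit word (assumed layout). */
static inline uint64_t lpi_pack(uint32_t virt_lpi, uint16_t dom_id,
                                uint16_t vcpu_id)
{
    return (uint64_t)virt_lpi |
           ((uint64_t)dom_id << 32) |
           ((uint64_t)vcpu_id << 48);
}

static inline uint32_t lpi_virt(uint64_t entry) { return (uint32_t)entry; }
static inline uint16_t lpi_dom(uint64_t entry)  { return (uint16_t)(entry >> 32); }
static inline uint16_t lpi_vcpu(uint64_t entry) { return (uint16_t)(entry >> 48); }

/*
 * Return a pointer to the entry for host_lpi, or NULL if the ID is not an
 * LPI, lies beyond the table, or its page is missing and alloc is 0.
 */
static inline uint64_t *lpi_slot(struct lpi_table *t, uint32_t host_lpi,
                                 int alloc)
{
    uint32_t idx, page;

    if ( host_lpi < LPI_OFFSET )
        return NULL;

    idx = host_lpi - LPI_OFFSET;
    page = idx / ENTRIES_PER_PAGE;
    if ( page >= t->nr_pages )
        return NULL;

    if ( !t->pages[page] )
    {
        if ( !alloc )
            return NULL;
        /* Second-level pages are only allocated on first use. */
        t->pages[page] = calloc(ENTRIES_PER_PAGE, sizeof(uint64_t));
        if ( !t->pages[page] )
            return NULL;
    }

    return &t->pages[page][idx % ENTRIES_PER_PAGE];
}
```

Only the pages that actually hold allocated LPIs consume memory, while a
lookup stays at two array indexing steps.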
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> ---
>  xen/arch/arm/gic-v3-its.c        |  80 ++++++++++++++++-
>  xen/arch/arm/gic-v3-lpi.c        | 187 ++++++++++++++++++++++++++++++++++++++-
>  xen/include/asm-arm/atomic.h     |   6 +-
>  xen/include/asm-arm/gic.h        |   5 ++
>  xen/include/asm-arm/gic_v3_its.h |   9 ++
>  5 files changed, 282 insertions(+), 5 deletions(-)
> 
> diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
> index 4a3a394..f073ab5 100644
> --- a/xen/arch/arm/gic-v3-its.c
> +++ b/xen/arch/arm/gic-v3-its.c
> @@ -83,6 +83,20 @@ static int its_send_cmd_sync(struct host_its *its, int cpu)
>      return its_send_command(its, cmd);
>  }
>  
> +static int its_send_cmd_mapti(struct host_its *its,
> +                              uint32_t deviceid, uint32_t eventid,
> +                              uint32_t pintid, uint16_t icid)
> +{
> +    uint64_t cmd[4];
> +
> +    cmd[0] = GITS_CMD_MAPTI | ((uint64_t)deviceid << 32);
> +    cmd[1] = eventid | ((uint64_t)pintid << 32);
> +    cmd[2] = icid;
> +    cmd[3] = 0x00;
> +
> +    return its_send_command(its, cmd);
> +}
> +
>  static int its_send_cmd_mapc(struct host_its *its, int collection_id, int cpu)
>  {
>      uint64_t cmd[4];
> @@ -111,6 +125,19 @@ static int its_send_cmd_mapd(struct host_its *its, uint32_t deviceid,
>      return its_send_command(its, cmd);
>  }
>  
> +static int its_send_cmd_inv(struct host_its *its,
> +                            uint32_t deviceid, uint32_t eventid)
> +{
> +    uint64_t cmd[4];
> +
> +    cmd[0] = GITS_CMD_INV | ((uint64_t)deviceid << 32);
> +    cmd[1] = eventid;
> +    cmd[2] = 0x00;
> +    cmd[3] = 0x00;
> +
> +    return its_send_command(its, cmd);
> +}
> +
>  /* Set up the (1:1) collection mapping for the given host CPU. */
>  int gicv3_its_setup_collection(int cpu)
>  {
> @@ -359,13 +386,47 @@ int gicv3_its_init(struct host_its *hw_its)
>  
>  static void remove_mapped_guest_device(struct its_devices *dev)
>  {
> +    int i;
> +
>      if ( dev->hw_its )
>          its_send_cmd_mapd(dev->hw_its, dev->host_devid, 0, 0, false);
>  
> +    for ( i = 0; i < dev->eventids / 32; i++ )
> +        gicv3_free_host_lpi_block(dev->hw_its, dev->host_lpis[i]);
> +
>      xfree(dev->itt_addr);
> +    xfree(dev->host_lpis);
>      xfree(dev);
>  }
>  
> +/*
> + * On the host ITS @its, map @nr_events consecutive LPIs.
> + * The mapping connects a device @devid and event @eventid pair to LPI @lpi,
> + * increasing both @eventid and @lpi to cover the number of requested LPIs.
> + */
> +int gicv3_its_map_host_events(struct host_its *its,
> +                              int devid, int eventid, int lpi,
> +                              int nr_events)
> +{
> +    int i, ret;
> +
> +    for ( i = 0; i < nr_events; i++ )
> +    {
> +        ret = its_send_cmd_mapti(its, devid, eventid + i, lpi + i, 0);
> +        if ( ret )
> +            return ret;
> +        ret = its_send_cmd_inv(its, devid, eventid + i);
> +        if ( ret )
> +            return ret;
> +    }
> +
> +    ret = its_send_cmd_sync(its, 0);
> +    if ( ret )
> +        return ret;
> +
> +    return 0;
> +}
> +
>  int gicv3_its_map_guest_device(struct domain *d, int host_devid,
>                                 int guest_devid, int bits, bool valid)
>  {
> @@ -373,7 +434,7 @@ int gicv3_its_map_guest_device(struct domain *d, int host_devid,
>      struct its_devices *dev, *temp;
>      struct rb_node **new = &d->arch.vgic.its_devices.rb_node, *parent = NULL;
>      struct host_its *hw_its;
> -    int ret;
> +    int ret, i;
>  
>      /* check for already existing mappings */
>      spin_lock(&d->arch.vgic.its_devices_lock);
> @@ -430,10 +491,19 @@ int gicv3_its_map_guest_device(struct domain *d, int host_devid,
>          goto out_unlock;
>      }
>  
> +    dev->host_lpis = xzalloc_array(uint32_t, BIT(bits) / 32);
> +    if ( !dev->host_lpis )
> +    {
> +        xfree(dev);
> +        xfree(itt_addr);
> +        return -ENOMEM;
> +    }
> +
>      ret = its_send_cmd_mapd(hw_its, host_devid, bits - 1,
>                              virt_to_maddr(itt_addr), true);
>      if ( ret )
>      {
> +        xfree(dev->host_lpis);
>          xfree(itt_addr);
>          xfree(dev);
>          goto out_unlock;
> @@ -450,6 +520,14 @@ int gicv3_its_map_guest_device(struct domain *d, int host_devid,
>  
>      spin_unlock(&d->arch.vgic.its_devices_lock);
>  
> +    /*
> +     * Map all host LPIs within this device already. We can't afford to queue
> +     * any host ITS commands later on during the guest's runtime.
> +     */
> +    for ( i = 0; i < BIT(bits) / 32; i++ )
> +        dev->host_lpis[i] = gicv3_allocate_host_lpi_block(hw_its, d, host_devid,
> +                                                          i * 32);

The return value of gicv3_allocate_host_lpi_block() is not checked for
errors here.
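
A minimal sketch of the kind of check meant here, written against mock
helpers rather than the real Xen functions: mock_allocate_block(),
mock_free_block(), map_all_blocks() and the fail_at knob are all
hypothetical names used only for illustration. The point is that a
negative return value must not be stored as an LPI number, and that
blocks obtained before the failure should be released again.

```c
#include <assert.h>
#include <stdint.h>

#define NR_BLOCKS_MAX 8

/* Hypothetical stand-ins: record which blocks are currently allocated. */
static int block_in_use[NR_BLOCKS_MAX];
static int fail_at = -1;          /* inject a failure at this block index */

/* Returns the first LPI of the block (>= 8192) or a negative error code. */
static int mock_allocate_block(int idx)
{
    if ( idx == fail_at )
        return -12;               /* stands in for -ENOMEM */

    block_in_use[idx] = 1;
    return 8192 + idx * 32;
}

static void mock_free_block(int lpi)
{
    block_in_use[(lpi - 8192) / 32] = 0;
}

/* Allocate nr blocks; on failure, free the ones already obtained. */
static int map_all_blocks(uint32_t *host_lpis, int nr)
{
    int i, ret;

    for ( i = 0; i < nr; i++ )
    {
        ret = mock_allocate_block(i);
        if ( ret < 0 )
        {
            /* Roll back blocks 0 .. i-1 before reporting the error. */
            while ( i-- > 0 )
                mock_free_block(host_lpis[i]);
            return ret;
        }
        host_lpis[i] = ret;
    }

    return 0;
}
```

In the real function the device would presumably also have to be unmapped
again on this error path; the sketch only covers the LPI blocks.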


>      return 0;
>  
>  out_unlock:
> diff --git a/xen/arch/arm/gic-v3-lpi.c b/xen/arch/arm/gic-v3-lpi.c
> index 5911b91..8f6e7f3 100644
> --- a/xen/arch/arm/gic-v3-lpi.c
> +++ b/xen/arch/arm/gic-v3-lpi.c
> @@ -18,16 +18,34 @@
>  
>  #include <xen/config.h>
>  #include <xen/lib.h>
> -#include <xen/mm.h>
> +#include <xen/sched.h>
> +#include <xen/err.h>
> +#include <xen/sched.h>
>  #include <xen/sizes.h>
> +#include <asm/atomic.h>
> +#include <asm/domain.h>
> +#include <asm/io.h>
>  #include <asm/gic.h>
>  #include <asm/gic_v3_defs.h>
>  #include <asm/gic_v3_its.h>
>  
> +/* LPIs on the host always go to a guest, so no struct irq_desc for them. */
> +union host_lpi {
> +    uint64_t data;
> +    struct {
> +        uint32_t virt_lpi;
> +        uint16_t dom_id;
> +        uint16_t vcpu_id;
> +    };
> +};
> +
>  /* Global state */
>  static struct {
>      uint8_t *lpi_property;
> +    union host_lpi **host_lpis;
>      unsigned int host_lpi_bits;
> +    /* Protects allocation and deallocation of host LPIs, but not the access */
> +    spinlock_t host_lpis_lock;
>  } lpi_data;
>  
>  /* Physical redistributor address */
> @@ -38,6 +56,19 @@ static DEFINE_PER_CPU(int, redist_id);
>  static DEFINE_PER_CPU(void *, pending_table);
>  
>  #define MAX_PHYS_LPIS   (BIT_ULL(lpi_data.host_lpi_bits) - LPI_OFFSET)
> +#define HOST_LPIS_PER_PAGE      (PAGE_SIZE / sizeof(union host_lpi))
> +
> +static union host_lpi *gic_get_host_lpi(uint32_t plpi)
> +{
> +    if ( !is_lpi(plpi) || plpi >= MAX_PHYS_LPIS + LPI_OFFSET )
> +        return NULL;
> +
> +    plpi -= LPI_OFFSET;
> +    if ( !lpi_data.host_lpis[plpi / HOST_LPIS_PER_PAGE] )
> +        return NULL;
> +
> +    return &lpi_data.host_lpis[plpi / HOST_LPIS_PER_PAGE][plpi % HOST_LPIS_PER_PAGE];
> +}
>  
>  /* Stores this redistributor's physical address and ID in a per-CPU variable */
>  void gicv3_set_redist_address(paddr_t address, int redist_id)
> @@ -130,15 +161,169 @@ uint64_t gicv3_lpi_get_proptable(void)
>  static unsigned int max_lpi_bits = CONFIG_MAX_PHYS_LPI_BITS;
>  integer_param("max_lpi_bits", max_lpi_bits);
>  
> +/*
> + * Allocate the 2nd level array for host LPIs. This one holds pointers
> + * to the page with the actual "union host_lpi" entries. Our LPI limit
> + * avoids excessive memory usage.
> + */
>  int gicv3_lpi_init_host_lpis(unsigned int hw_lpi_bits)
>  {
> +    int nr_lpi_ptrs;
> +
> +    BUILD_BUG_ON(sizeof(union host_lpi) > sizeof(unsigned long));
> +
>      lpi_data.host_lpi_bits = min(hw_lpi_bits, max_lpi_bits);
>  
> +    spin_lock_init(&lpi_data.host_lpis_lock);
> +
> +    nr_lpi_ptrs = MAX_PHYS_LPIS / (PAGE_SIZE / sizeof(union host_lpi));
> +    lpi_data.host_lpis = xzalloc_array(union host_lpi *, nr_lpi_ptrs);
> +    if ( !lpi_data.host_lpis )
> +        return -ENOMEM;
> +
>      printk("GICv3: using at most %lld LPIs on the host.\n", MAX_PHYS_LPIS);
>  
>      return 0;
>  }
>  
> +#define LPI_BLOCK       32

A comment explaining why LPI_BLOCK is 32 is still missing (you already
provided most of the reasoning in past emails). But the patch series is
improving quite well :-)


> +/* Must be called with host_lpis_lock held. */
> +static int find_unused_host_lpi(int start, uint32_t *index)
> +{
> +    int chunk;
> +    uint32_t i = *index;
> +
> +    for ( chunk = start; chunk < MAX_PHYS_LPIS / HOST_LPIS_PER_PAGE; chunk++ )
> +    {
> +        /* If we hit an unallocated chunk, use entry 0 in that one. */
> +        if ( !lpi_data.host_lpis[chunk] )
> +        {
> +            *index = 0;
> +            return chunk;
> +        }
> +
> +        /* Find an unallocated entry in this chunk. */
> +        for ( ; i < HOST_LPIS_PER_PAGE; i += LPI_BLOCK )
> +        {
> +            if ( lpi_data.host_lpis[chunk][i].dom_id == INVALID_DOMID )
> +            {
> +                *index = i;
> +                return chunk;
> +            }
> +        }
> +        i = 0;
> +    }
> +
> +    return -1;
> +}
> +
> +/*
> + * Allocate a block of 32 LPIs on the given host ITS for device "devid",
> + * starting with "eventid". Put them into the respective ITT by issuing a
> + * MAPTI command for each of them.
> + */
> +int gicv3_allocate_host_lpi_block(struct host_its *its, struct domain *d,
> +                                  uint32_t host_devid, uint32_t eventid)
> +{
> +    static uint32_t next_lpi = 0;
> +    uint32_t lpi, lpi_idx = next_lpi % HOST_LPIS_PER_PAGE;
> +    int chunk;
> +    int i;
> +
> +    spin_lock(&lpi_data.host_lpis_lock);
> +    chunk = find_unused_host_lpi(next_lpi / HOST_LPIS_PER_PAGE, &lpi_idx);
> +
> +    if ( chunk == - 1 )          /* rescan for a hole from the beginning */
> +    {
> +        lpi_idx = 0;
> +        chunk = find_unused_host_lpi(0, &lpi_idx);
> +        if ( chunk == -1 )
> +        {
> +            spin_unlock(&lpi_data.host_lpis_lock);
> +            return -ENOSPC;
> +        }
> +    }
> +
> +    /* If we hit an unallocated chunk, we initialize it and use entry 0. */
> +    if ( !lpi_data.host_lpis[chunk] )
> +    {
> +        union host_lpi *new_chunk;
> +
> +        new_chunk = alloc_xenheap_pages(0, 0);
> +        if ( !new_chunk )
> +        {
> +            spin_unlock(&lpi_data.host_lpis_lock);
> +            return -ENOMEM;
> +        }
> +
> +        for ( i = 0; i < HOST_LPIS_PER_PAGE; i += LPI_BLOCK )
> +            new_chunk[i].dom_id = INVALID_DOMID;
> +
> +        lpi_data.host_lpis[chunk] = new_chunk;
> +        lpi_idx = 0;
> +    }
> +
> +    lpi = chunk * HOST_LPIS_PER_PAGE + lpi_idx;
> +
> +    for ( i = 0; i < LPI_BLOCK; i++ )
> +    {
> +        union host_lpi hlpi;
> +
> +        /*
> +         * Mark this host LPI as belonging to the domain, but don't assign
> +         * any virtual LPI or a VCPU yet.
> +         */
> +        hlpi.virt_lpi = INVALID_LPI;
> +        hlpi.dom_id = d->domain_id;
> +        hlpi.vcpu_id = INVALID_DOMID;
> +        write_u64_atomic(&lpi_data.host_lpis[chunk][lpi_idx + i].data,
> +                         hlpi.data);
> +
> +        /*
> +         * Enable this host LPI, so we don't have to do this during the
> +         * guest's runtime.
> +         */
> +        lpi_data.lpi_property[lpi + i] |= LPI_PROP_ENABLED;
> +    }
> +
> +    /*
> +     * We have allocated and initialized the host LPI entries, so it's safe
> +     * to drop the lock now. Access to the structures can be done concurrently
> +     * as it involves only an atomic uint64_t access.
> +     */
> +    spin_unlock(&lpi_data.host_lpis_lock);
> +
> +    __flush_dcache_area(&lpi_data.lpi_property[lpi], LPI_BLOCK);
> +
> +    gicv3_its_map_host_events(its, host_devid, eventid, lpi + LPI_OFFSET,
> +                              LPI_BLOCK);
> +
> +    next_lpi = lpi + LPI_BLOCK;
> +    return lpi + LPI_OFFSET;
> +}
> +
> +int gicv3_free_host_lpi_block(struct host_its *its, uint32_t lpi)
> +{
> +    union host_lpi *hlpi, empty_lpi = { .dom_id = INVALID_DOMID };
> +    int i;
> +
> +    hlpi = gic_get_host_lpi(lpi);
> +    if ( !hlpi )
> +        return -ENOENT;
> +
> +    spin_lock(&lpi_data.host_lpis_lock);
> +
> +    for ( i = 0; i < LPI_BLOCK; i++ )
> +        write_u64_atomic(&hlpi[i].data, empty_lpi.data);
> +
> +    /* TODO: Call a function in gic-v3-its.c to send DISCARDs */
> +
> +    spin_unlock(&lpi_data.host_lpis_lock);
> +
> +    return 0;
> +}
> +
>  /*
>   * Local variables:
>   * mode: C
> diff --git a/xen/include/asm-arm/atomic.h b/xen/include/asm-arm/atomic.h
> index 22a5036..df9de6a 100644
> --- a/xen/include/asm-arm/atomic.h
> +++ b/xen/include/asm-arm/atomic.h
> @@ -53,9 +53,9 @@ build_atomic_write(write_u16_atomic, "h", WORD, uint16_t, "r")
>  build_atomic_write(write_u32_atomic, "",  WORD, uint32_t, "r")
>  build_atomic_write(write_int_atomic, "",  WORD, int, "r")
>  
> -#if 0 /* defined (CONFIG_ARM_64) */
> -build_atomic_read(read_u64_atomic, "x", uint64_t, "=r")
> -build_atomic_write(write_u64_atomic, "x", uint64_t, "r")
> +#if defined (CONFIG_ARM_64)
> +build_atomic_read(read_u64_atomic, "", "", uint64_t, "=r")
> +build_atomic_write(write_u64_atomic, "", "", uint64_t, "r")
>  #endif
>  
>  build_add_sized(add_u8_sized, "b", BYTE, uint8_t, "ri")
> diff --git a/xen/include/asm-arm/gic.h b/xen/include/asm-arm/gic.h
> index 12bd155..7825575 100644
> --- a/xen/include/asm-arm/gic.h
> +++ b/xen/include/asm-arm/gic.h
> @@ -220,7 +220,12 @@ enum gic_version {
>      GIC_V3,
>  };
>  
> +#define INVALID_LPI     0
>  #define LPI_OFFSET      8192
> +static inline bool is_lpi(unsigned int irq)
> +{
> +    return irq >= LPI_OFFSET;
> +}
>  
>  extern enum gic_version gic_hw_version(void);
>  
> diff --git a/xen/include/asm-arm/gic_v3_its.h b/xen/include/asm-arm/gic_v3_its.h
> index 9c5dcf3..0e6b06a 100644
> --- a/xen/include/asm-arm/gic_v3_its.h
> +++ b/xen/include/asm-arm/gic_v3_its.h
> @@ -97,6 +97,8 @@
>  #define HOST_ITS_FLUSH_CMD_QUEUE        (1U << 0)
>  #define HOST_ITS_USES_PTA               (1U << 1)
>  
> +#define INVALID_DOMID ((uint16_t)~0)
> +
>  /* data structure for each hardware ITS */
>  struct host_its {
>      struct list_head entry;
> @@ -117,6 +119,7 @@ struct its_devices {
>      uint32_t guest_devid;
>      uint32_t host_devid;
>      uint32_t eventids;
> +    uint32_t *host_lpis;
>  };
>  
>  extern struct list_head host_its_list;
> @@ -149,6 +152,12 @@ int gicv3_its_setup_collection(int cpu);
>   */
>  int gicv3_its_map_guest_device(struct domain *d, int host_devid,
>                                 int guest_devid, int bits, bool valid);
> +int gicv3_its_map_host_events(struct host_its *its,
> +                              int host_devid, int eventid,
> +                              int lpi, int nrevents);
> +int gicv3_allocate_host_lpi_block(struct host_its *its, struct domain *d,
> +                                  uint32_t host_devid, uint32_t eventid);
> +int gicv3_free_host_lpi_block(struct host_its *its, uint32_t lpi);
>  
>  #else
>  
> -- 
> 2.9.0
> 
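
For readers following the thread, the two-level host LPI table that the
commit message above describes can be sketched in isolation as follows.
This is a simplified user-space model, not the patch itself: MAX_LPIS,
first_level, lookup() and lookup_alloc() are names assumed for the
example, calloc() stands in for the xenheap page allocation, and there
is no locking or atomic access.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define PAGE_SIZE        4096u
#define LPI_OFFSET       8192u
#define MAX_LPIS         (1u << 16)   /* illustrative limit only */

/* One 8-byte entry per host LPI, as in the patch. */
union host_lpi {
    uint64_t data;
    struct {
        uint32_t virt_lpi;
        uint16_t dom_id;
        uint16_t vcpu_id;
    };
};

#define ENTRIES_PER_PAGE (PAGE_SIZE / sizeof(union host_lpi))
#define NR_PAGES         (MAX_LPIS / ENTRIES_PER_PAGE)

/* Pointer array: one slot per page of entries, NULL until backed. */
static union host_lpi *first_level[NR_PAGES];

/* Return the entry for physical LPI plpi, or NULL if it is not backed. */
static union host_lpi *lookup(uint32_t plpi)
{
    uint32_t idx;

    if ( plpi < LPI_OFFSET || plpi - LPI_OFFSET >= MAX_LPIS )
        return NULL;

    idx = plpi - LPI_OFFSET;
    if ( !first_level[idx / ENTRIES_PER_PAGE] )
        return NULL;

    return &first_level[idx / ENTRIES_PER_PAGE][idx % ENTRIES_PER_PAGE];
}

/* Back the page covering plpi on demand, then return its entry. */
static union host_lpi *lookup_alloc(uint32_t plpi)
{
    uint32_t idx;

    if ( plpi < LPI_OFFSET || plpi - LPI_OFFSET >= MAX_LPIS )
        return NULL;

    idx = plpi - LPI_OFFSET;
    if ( !first_level[idx / ENTRIES_PER_PAGE] )
    {
        first_level[idx / ENTRIES_PER_PAGE] =
            calloc(ENTRIES_PER_PAGE, sizeof(union host_lpi));
        if ( !first_level[idx / ENTRIES_PER_PAGE] )
            return NULL;
    }

    return lookup(plpi);
}
```

Only the pages that actually hold allocated LPIs consume memory, while a
lookup stays at two array indexing steps.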

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


* Re: [PATCH 09/28] ARM: GICv3 ITS: map device and LPIs to the ITS on physdev_op hypercall
  2017-01-30 18:31 ` [PATCH 09/28] ARM: GICv3 ITS: map device and LPIs to the ITS on physdev_op hypercall Andre Przywara
  2017-01-31 10:29   ` Jaggi, Manish
@ 2017-02-14 20:11   ` Stefano Stabellini
  1 sibling, 0 replies; 106+ messages in thread
From: Stefano Stabellini @ 2017-02-14 20:11 UTC (permalink / raw)
  To: Andre Przywara; +Cc: xen-devel, Julien Grall, Stefano Stabellini, Vijay Kilari

On Mon, 30 Jan 2017, Andre Przywara wrote:
> To get MSIs from devices forwarded to a CPU, we need to name the device
> and its MSIs by mapping them to an ITS.
> Since this involves queueing commands to the ITS command queue, we can't
> really afford to do this during the guest's runtime, as this would open
> up a denial-of-service attack vector.
> So we require every device with MSI interrupts to be mapped explicitly by
> Dom0. For Dom0 itself we can just use the existing PCI physdev_op
> hypercalls, which the existing Linux kernel issues already.
> So upon receipt of this hypercall we map the device to the hardware ITS
> and prepare it to be later mapped by the virtual ITS by using the very
> same device ID (for Dom0 only).
> Also we ask for mapping 32 LPIs to cover 32 MSIs that the device may
> use.
> 
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> ---
>  xen/arch/arm/physdev.c | 21 +++++++++++++++++++++
>  1 file changed, 21 insertions(+)
> 
> diff --git a/xen/arch/arm/physdev.c b/xen/arch/arm/physdev.c
> index 27bbbda..6e02de4 100644
> --- a/xen/arch/arm/physdev.c
> +++ b/xen/arch/arm/physdev.c
> @@ -9,11 +9,32 @@
>  #include <xen/lib.h>
>  #include <xen/errno.h>
>  #include <xen/sched.h>
> +#include <xen/guest_access.h>
> +#include <asm/gic_v3_its.h>
>  #include <asm/hypercall.h>
>  
>  
>  int do_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>  {
> +    struct physdev_manage_pci manage;
> +    u32 devid;
> +    int ret;
> +
> +    switch (cmd)
> +    {
> +        case PHYSDEVOP_manage_pci_add:
> +        case PHYSDEVOP_manage_pci_remove:
> +            if ( copy_from_guest(&manage, arg, 1) != 0 )
> +                return -EFAULT;

You need to check that current is the hardware domain first.
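
A rough sketch of the ordering meant here, using mock types so that it
builds on its own. In Xen the check would presumably be something like
is_hardware_domain(current->domain); the _sketch names below are
hypothetical stand-ins and not the real API.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct domain_sketch { int domain_id; };

/* Stand-in for Xen's global hardware_domain pointer. */
static struct domain_sketch *hardware_domain_sketch;

static bool is_hardware_domain_sketch(const struct domain_sketch *d)
{
    return d == hardware_domain_sketch;
}

/* Reject the request before touching any guest-supplied arguments. */
static int manage_pci_sketch(const struct domain_sketch *caller, bool add)
{
    if ( !is_hardware_domain_sketch(caller) )
        return -EPERM;

    /* ... copy_from_guest() and the ITS device mapping would follow ... */
    (void)add;

    return 0;
}
```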


> +            devid = manage.bus << 8 | manage.devfn;
> +            /* Allocate an ITS device table with space for 32 MSIs */

Please explain why 32 was chosen.
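
For reference, the arithmetic behind the literal 5 passed below can be
written out as follows. This only shows where the number 32 comes from
and how it relates to the LPI_BLOCK size of the previous patch, not the
rationale for choosing it; the helper names are made up for the example.

```c
#include <assert.h>
#include <stdint.h>

#define LPI_BLOCK 32u   /* block size used by the host LPI allocator */

/* Number of events (MSIs) covered by an ITT with "bits" event ID bits. */
static uint32_t nr_events(unsigned int bits)
{
    return 1u << bits;
}

/* Number of host LPI blocks needed, as in BIT(bits) / 32 in patch 08. */
static uint32_t nr_lpi_blocks(unsigned int bits)
{
    return nr_events(bits) / LPI_BLOCK;
}

/* Device ID as built in the hunk above: bus in bits 15:8, devfn below. */
static uint32_t pci_devid(uint8_t bus, uint8_t devfn)
{
    return ((uint32_t)bus << 8) | devfn;
}
```

So bits = 5 yields exactly one block of 32 host LPIs per device.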


> +            ret = gicv3_its_map_guest_device(hardware_domain, devid, devid, 5,
> +                                             cmd == PHYSDEVOP_manage_pci_add);
> +
> +            return ret;
> +    }
> +
>      gdprintk(XENLOG_DEBUG, "PHYSDEVOP cmd=%d: not implemented\n", cmd);
>      return -ENOSYS;
>  }
> -- 
> 2.9.0
> 


* Re: [PATCH 10/28] ARM: GICv3: introduce separate pending_irq structs for LPIs
  2017-01-30 18:31 ` [PATCH 10/28] ARM: GICv3: introduce separate pending_irq structs for LPIs Andre Przywara
@ 2017-02-14 20:39   ` Stefano Stabellini
  2017-02-15 17:06     ` Julien Grall
  2017-02-15 17:03   ` Julien Grall
  1 sibling, 1 reply; 106+ messages in thread
From: Stefano Stabellini @ 2017-02-14 20:39 UTC (permalink / raw)
  To: Andre Przywara; +Cc: xen-devel, Julien Grall, Stefano Stabellini, Vijay Kilari

On Mon, 30 Jan 2017, Andre Przywara wrote:
> For the same reason that allocating a struct irq_desc for each
> possible LPI is not an option, having a struct pending_irq for each LPI
> is also not feasible. However we actually only need those when an
> interrupt is on a vCPU (or is about to be injected).
> Maintain a list of those structs that we can use for the lifecycle of
> a guest LPI. We allocate new entries if necessary, however reuse
> pre-owned entries whenever possible.
> I added some locking around this list here, however my gut feeling is
> that we don't need one because this is a per-VCPU structure anyway.
> If someone could confirm this, I'd be grateful.
> Teach the existing VGIC functions to find the right pointer when being
> given a virtual LPI number.
> 
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>

Please address past comments: specifically, use a data structure per
domain rather than per vCPU. Also please move to a more efficient data
structure, such as a hashtable or a tree.
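
To illustrate the direction, here is a minimal sketch of a per-domain
hash table keyed by the virtual LPI number. It is a user-space model
under stated assumptions: the struct and function names are invented for
the example, calloc() replaces xzalloc(), and locking is left out
entirely.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define LPI_HASH_BITS  6
#define LPI_HASH_SIZE  (1u << LPI_HASH_BITS)

/* Stand-in for struct lpi_pending_irq, chained per hash bucket. */
struct lpi_pending_sketch {
    uint32_t virq;
    struct lpi_pending_sketch *next;
};

/* One table per domain instead of one list per vCPU. */
struct domain_lpi_table {
    struct lpi_pending_sketch *bucket[LPI_HASH_SIZE];
};

static unsigned int lpi_hash(uint32_t virq)
{
    return virq & (LPI_HASH_SIZE - 1);
}

/* Find the entry for virq; optionally allocate it if it is missing. */
static struct lpi_pending_sketch *lpi_find(struct domain_lpi_table *t,
                                           uint32_t virq, int allocate)
{
    unsigned int h = lpi_hash(virq);
    struct lpi_pending_sketch *p;

    for ( p = t->bucket[h]; p; p = p->next )
        if ( p->virq == virq )
            return p;

    if ( !allocate )
        return NULL;

    p = calloc(1, sizeof(*p));
    if ( !p )
        return NULL;

    p->virq = virq;
    p->next = t->bucket[h];
    t->bucket[h] = p;

    return p;
}
```

A lookup then walks only one short chain instead of the whole list.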


>  xen/arch/arm/gic.c           |  3 +++
>  xen/arch/arm/vgic-v3.c       |  3 +++
>  xen/arch/arm/vgic.c          | 64 +++++++++++++++++++++++++++++++++++++++++---
>  xen/include/asm-arm/domain.h |  2 ++
>  xen/include/asm-arm/vgic.h   | 14 ++++++++++
>  5 files changed, 83 insertions(+), 3 deletions(-)
> 
> diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
> index a5348f2..bd3c032 100644
> --- a/xen/arch/arm/gic.c
> +++ b/xen/arch/arm/gic.c
> @@ -509,6 +509,9 @@ static void gic_update_one_lr(struct vcpu *v, int i)
>                  struct vcpu *v_target = vgic_get_target_vcpu(v, irq);
>                  irq_set_affinity(p->desc, cpumask_of(v_target->processor));
>              }
> +            /* If this was an LPI, mark this struct as available again. */
> +            if ( is_lpi(p->irq) )
> +                p->irq = 0;
>          }
>      }
>  }
> diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c
> index 1fadb00..b0653c2 100644
> --- a/xen/arch/arm/vgic-v3.c
> +++ b/xen/arch/arm/vgic-v3.c
> @@ -1426,6 +1426,9 @@ static int vgic_v3_vcpu_init(struct vcpu *v)
>      if ( v->vcpu_id == last_cpu || (v->vcpu_id == (d->max_vcpus - 1)) )
>          v->arch.vgic.flags |= VGIC_V3_RDIST_LAST;
>  
> +    spin_lock_init(&v->arch.vgic.pending_lpi_list_lock);
> +    INIT_LIST_HEAD(&v->arch.vgic.pending_lpi_list);
> +
>      return 0;
>  }
>  
> diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c
> index 364d5f0..7e3440f 100644
> --- a/xen/arch/arm/vgic.c
> +++ b/xen/arch/arm/vgic.c
> @@ -31,6 +31,8 @@
>  #include <asm/mmio.h>
>  #include <asm/gic.h>
>  #include <asm/vgic.h>
> +#include <asm/gic_v3_defs.h>
> +#include <asm/gic_v3_its.h>
>  
>  static inline struct vgic_irq_rank *vgic_get_rank(struct vcpu *v, int rank)
>  {
> @@ -61,7 +63,7 @@ struct vgic_irq_rank *vgic_rank_irq(struct vcpu *v, unsigned int irq)
>      return vgic_get_rank(v, rank);
>  }
>  
> -static void vgic_init_pending_irq(struct pending_irq *p, unsigned int virq)
> +void vgic_init_pending_irq(struct pending_irq *p, unsigned int virq)
>  {
>      INIT_LIST_HEAD(&p->inflight);
>      INIT_LIST_HEAD(&p->lr_queue);
> @@ -244,10 +246,14 @@ struct vcpu *vgic_get_target_vcpu(struct vcpu *v, unsigned int virq)
>  
>  static int vgic_get_virq_priority(struct vcpu *v, unsigned int virq)
>  {
> -    struct vgic_irq_rank *rank = vgic_rank_irq(v, virq);
> +    struct vgic_irq_rank *rank;
>      unsigned long flags;
>      int priority;
>  
> +    if ( is_lpi(virq) )
> +        return vgic_lpi_get_priority(v->domain, virq);
> +
> +    rank = vgic_rank_irq(v, virq);
>      vgic_lock_rank(v, rank, flags);
>      priority = rank->priority[virq & INTERRUPT_RANK_MASK];
>      vgic_unlock_rank(v, rank, flags);
> @@ -446,13 +452,63 @@ bool vgic_to_sgi(struct vcpu *v, register_t sgir, enum gic_sgi_mode irqmode,
>      return true;
>  }
>  
> +/*
> + * Holding struct pending_irq's for each possible virtual LPI in each domain
> + * requires too much Xen memory, also a malicious guest could potentially
> + * spam Xen with LPI map requests. We cannot cover those with (guest allocated)
> + * ITS memory, so we use a dynamic scheme of allocating struct pending_irq's
> + * on demand.
> + */
> +struct pending_irq *lpi_to_pending(struct vcpu *v, unsigned int lpi,
> +                                   bool allocate)
> +{
> +    struct lpi_pending_irq *lpi_irq, *empty = NULL;
> +
> +    spin_lock(&v->arch.vgic.pending_lpi_list_lock);
> +    list_for_each_entry(lpi_irq, &v->arch.vgic.pending_lpi_list, entry)
> +    {
> +        if ( lpi_irq->pirq.irq == lpi )
> +        {
> +            spin_unlock(&v->arch.vgic.pending_lpi_list_lock);
> +            return &lpi_irq->pirq;
> +        }
> +
> +        if ( lpi_irq->pirq.irq == 0 && !empty )
> +            empty = lpi_irq;
> +    }
> +
> +    if ( !allocate )
> +    {
> +        spin_unlock(&v->arch.vgic.pending_lpi_list_lock);
> +        return NULL;
> +    }
> +
> +    if ( !empty )
> +    {
> +        empty = xzalloc(struct lpi_pending_irq);
> +        vgic_init_pending_irq(&empty->pirq, lpi);
> +        list_add_tail(&empty->entry, &v->arch.vgic.pending_lpi_list);
> +    } else
> +    {
> +        empty->pirq.status = 0;
> +        empty->pirq.irq = lpi;
> +    }
> +
> +    spin_unlock(&v->arch.vgic.pending_lpi_list_lock);
> +
> +    return &empty->pirq;
> +}
> +
>  struct pending_irq *irq_to_pending(struct vcpu *v, unsigned int irq)
>  {
>      struct pending_irq *n;
> +
>      /* Pending irqs allocation strategy: the first vgic.nr_spis irqs
>       * are used for SPIs; the rests are used for per cpu irqs */
>      if ( irq < 32 )
>          n = &v->arch.vgic.pending_irqs[irq];
> +    else if ( is_lpi(irq) )
> +        n = lpi_to_pending(v, irq, true);
>      else
>          n = &v->domain->arch.vgic.pending_irqs[irq - 32];
>      return n;
> @@ -480,7 +536,7 @@ void vgic_clear_pending_irqs(struct vcpu *v)
>  void vgic_vcpu_inject_irq(struct vcpu *v, unsigned int virq)
>  {
>      uint8_t priority;
> -    struct pending_irq *iter, *n = irq_to_pending(v, virq);
> +    struct pending_irq *iter, *n;
>      unsigned long flags;
>      bool running;
>  
> @@ -488,6 +544,8 @@ void vgic_vcpu_inject_irq(struct vcpu *v, unsigned int virq)
>  
>      spin_lock_irqsave(&v->arch.vgic.lock, flags);
>  
> +    n = irq_to_pending(v, virq);
> +
>      /* vcpu offline */
>      if ( test_bit(_VPF_down, &v->pause_flags) )
>      {
> diff --git a/xen/include/asm-arm/domain.h b/xen/include/asm-arm/domain.h
> index 00b9c1a..f44a84b 100644
> --- a/xen/include/asm-arm/domain.h
> +++ b/xen/include/asm-arm/domain.h
> @@ -257,6 +257,8 @@ struct arch_vcpu
>          paddr_t rdist_base;
>  #define VGIC_V3_RDIST_LAST  (1 << 0)        /* last vCPU of the rdist */
>          uint8_t flags;
> +        struct list_head pending_lpi_list;
> +        spinlock_t pending_lpi_list_lock;   /* protects the pending_lpi_list */
>      } vgic;
>  
>      /* Timer registers  */
> diff --git a/xen/include/asm-arm/vgic.h b/xen/include/asm-arm/vgic.h
> index 672f649..03d4d2e 100644
> --- a/xen/include/asm-arm/vgic.h
> +++ b/xen/include/asm-arm/vgic.h
> @@ -83,6 +83,12 @@ struct pending_irq
>      struct list_head lr_queue;
>  };
>  
> +struct lpi_pending_irq
> +{
> +    struct list_head entry;
> +    struct pending_irq pirq;
> +};
> +
>  #define NR_INTERRUPT_PER_RANK   32
>  #define INTERRUPT_RANK_MASK (NR_INTERRUPT_PER_RANK - 1)
>  
> @@ -296,13 +302,21 @@ extern struct vcpu *vgic_get_target_vcpu(struct vcpu *v, unsigned int virq);
>  extern void vgic_vcpu_inject_irq(struct vcpu *v, unsigned int virq);
>  extern void vgic_vcpu_inject_spi(struct domain *d, unsigned int virq);
>  extern void vgic_clear_pending_irqs(struct vcpu *v);
> +extern void vgic_init_pending_irq(struct pending_irq *p, unsigned int virq);
>  extern struct pending_irq *irq_to_pending(struct vcpu *v, unsigned int irq);
>  extern struct pending_irq *spi_to_pending(struct domain *d, unsigned int irq);
> +extern struct pending_irq *lpi_to_pending(struct vcpu *v, unsigned int irq,
> +                                          bool allocate);
>  extern struct vgic_irq_rank *vgic_rank_offset(struct vcpu *v, int b, int n, int s);
>  extern struct vgic_irq_rank *vgic_rank_irq(struct vcpu *v, unsigned int irq);
>  extern bool vgic_emulate(struct cpu_user_regs *regs, union hsr hsr);
>  extern void vgic_disable_irqs(struct vcpu *v, uint32_t r, int n);
>  extern void vgic_enable_irqs(struct vcpu *v, uint32_t r, int n);
> +/* placeholder function until the property table gets introduced */
> +static inline int vgic_lpi_get_priority(struct domain *d, uint32_t vlpi)
> +{
> +    return 0xa;
> +}
>  extern void register_vgic_ops(struct domain *d, const struct vgic_ops *ops);
>  int vgic_v2_init(struct domain *d, int *mmio_count);
>  int vgic_v3_init(struct domain *d, int *mmio_count);
> -- 
> 2.9.0
> 


* Re: [PATCH 05/28] ARM: GICv3 ITS: map ITS command buffer
  2017-02-14  0:59   ` Stefano Stabellini
@ 2017-02-14 20:50     ` Julien Grall
  2017-02-14 21:00       ` Stefano Stabellini
  0 siblings, 1 reply; 106+ messages in thread
From: Julien Grall @ 2017-02-14 20:50 UTC (permalink / raw)
  To: Stefano Stabellini, Andre Przywara; +Cc: xen-devel, Vijay Kilari

Hi Stefano,

On 02/14/2017 12:59 AM, Stefano Stabellini wrote:
> On Mon, 30 Jan 2017, Andre Przywara wrote:
>>  static int its_map_baser(void __iomem *basereg, uint64_t regc, int nr_items)
>> @@ -150,6 +191,11 @@ int gicv3_its_init(struct host_its *hw_its)
>>          }
>>      }
>>
>> +    hw_its->cmd_buf = its_map_cbaser(hw_its);
>> +    if ( !hw_its->cmd_buf )
>> +        return -ENOMEM;
>> +    writeq_relaxed(0, hw_its->its_base + GITS_CWRITER);
>
> Why this new write?

This was requested by me. According to the spec (8.19.5 in ARM IHI 0069C),
the reset value of GITS_CWRITER is unknown. So we have to reset the register
to 0, otherwise the ITS may try to read invalid commands as soon as it has
been enabled.

FWIW, GITS_CREADR is reset to 0 by the ITS when GITS_CBASER has been
successfully written (see 8.19.2).
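
The effect can be modelled with a small sketch. The register struct and
helper names below are hypothetical and only mimic the behaviour
described above (CREADR cleared on a CBASER write, CWRITER left at
whatever value it had); this is not driver code.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model of the three command queue registers. */
struct its_regs_sketch {
    uint64_t cbaser;
    uint64_t cwriter;
    uint64_t creadr;
};

/* Model of the behaviour described above: writing CBASER clears CREADR. */
static void write_cbaser(struct its_regs_sketch *its, uint64_t val)
{
    its->cbaser = val;
    its->creadr = 0;
}

/* Program the queue base and explicitly zero the write pointer. */
static void init_cmd_queue(struct its_regs_sketch *its, uint64_t cbaser_val)
{
    write_cbaser(its, cbaser_val);
    its->cwriter = 0;   /* reset value is UNKNOWN, so set it explicitly */
}

/* The ITS has commands to process whenever the two pointers differ. */
static int has_pending_commands(const struct its_regs_sketch *its)
{
    return its->creadr != its->cwriter;
}
```

Without the explicit write, a stale CWRITER makes the queue look
non-empty right after GITS_CBASER has been programmed.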

Cheers,

--
Julien Grall

* Re: [PATCH 11/28] ARM: GICv3: forward pending LPIs to guests
  2017-01-30 18:31 ` [PATCH 11/28] ARM: GICv3: forward pending LPIs to guests Andre Przywara
@ 2017-02-14 21:00   ` Stefano Stabellini
  2017-02-15 17:18     ` Julien Grall
  2017-02-15 17:30   ` Julien Grall
  1 sibling, 1 reply; 106+ messages in thread
From: Stefano Stabellini @ 2017-02-14 21:00 UTC (permalink / raw)
  To: Andre Przywara; +Cc: xen-devel, Julien Grall, Stefano Stabellini, Vijay Kilari

On Mon, 30 Jan 2017, Andre Przywara wrote:
> Upon receiving an LPI, we need to find the right VCPU and virtual IRQ
> number to get this IRQ injected.
> Iterate our two-level LPI table to find this information quickly when
> the host takes an LPI. Call the existing injection function to let the
> GIC emulation deal with this interrupt.
> 
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> ---
>  xen/arch/arm/gic-v3-lpi.c | 41 +++++++++++++++++++++++++++++++++++++++++
>  xen/arch/arm/gic.c        |  6 ++++--
>  xen/include/asm-arm/irq.h |  8 ++++++++
>  3 files changed, 53 insertions(+), 2 deletions(-)
> 
> diff --git a/xen/arch/arm/gic-v3-lpi.c b/xen/arch/arm/gic-v3-lpi.c
> index 8f6e7f3..d270053 100644
> --- a/xen/arch/arm/gic-v3-lpi.c
> +++ b/xen/arch/arm/gic-v3-lpi.c
> @@ -86,6 +86,47 @@ uint64_t gicv3_get_redist_address(int cpu, bool use_pta)
>          return per_cpu(redist_id, cpu) << 16;
>  }
>  
> +/*
> + * Handle incoming LPIs, which are a bit special, because they are potentially
> + * numerous and also only get injected into guests. Treat them specially here,
> + * by just looking up their target vCPU and virtual LPI number and hand it
> + * over to the injection function.
> + */
> +void do_LPI(unsigned int lpi)
> +{
> +    struct domain *d;
> +    union host_lpi *hlpip, hlpi;
> +    struct vcpu *vcpu;
> +
> +    WRITE_SYSREG32(lpi, ICC_EOIR1_EL1);
> +
> +    hlpip = gic_get_host_lpi(lpi);
> +    if ( !hlpip )
> +        return;
> +
> +    hlpi.data = read_u64_atomic(&hlpip->data);
> +
> +    /* We may have mapped more host LPIs than the guest actually asked for. */
> +    if ( !hlpi.virt_lpi )
> +        return;
> +
> +    d = get_domain_by_id(hlpi.dom_id);
> +    if ( !d )
> +        return;
> +
> +    if ( hlpi.vcpu_id >= d->max_vcpus )
> +    {
> +        put_domain(d);
> +        return;
> +    }
> +
> +    vcpu = d->vcpu[hlpi.vcpu_id];
> +
> +    put_domain(d);
> +
> +    vgic_vcpu_inject_irq(vcpu, hlpi.virt_lpi);

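put_domain() should be called here, after the injection, so that the
domain reference is still held while the vCPU is in use.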
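
A small sketch of the intended ordering, with a plain counter standing
in for the domain reference count. All names ending in _sketch are
hypothetical; the only point is that the reference is dropped after the
last use of anything belonging to the domain.

```c
#include <assert.h>
#include <stddef.h>

struct vcpu_sketch { int injected; };

struct domain_sketch {
    int refcnt;
    struct vcpu_sketch vcpu0;
};

static struct domain_sketch *get_domain_sketch(struct domain_sketch *d)
{
    d->refcnt++;
    return d;
}

static void put_domain_sketch(struct domain_sketch *d)
{
    d->refcnt--;
}

/* Returns the reference count observed while the injection runs. */
static int inject_sketch(struct domain_sketch *d, struct vcpu_sketch *v)
{
    v->injected++;
    return d->refcnt;
}

static int handle_lpi_sketch(struct domain_sketch *dom)
{
    struct domain_sketch *d = get_domain_sketch(dom);
    int held;

    /* Use the vCPU while the reference is still held ... */
    held = inject_sketch(d, &d->vcpu0);

    /* ... and only then drop it. */
    put_domain_sketch(d);

    return held;
}
```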


> +}
> +
>  uint64_t gicv3_lpi_allocate_pendtable(void)
>  {
>      uint64_t reg;
> diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
> index bd3c032..7286e5d 100644
> --- a/xen/arch/arm/gic.c
> +++ b/xen/arch/arm/gic.c
> @@ -700,8 +700,10 @@ void gic_interrupt(struct cpu_user_regs *regs, int is_fiq)
>              local_irq_enable();
>              do_IRQ(regs, irq, is_fiq);
>              local_irq_disable();
> -        }
> -        else if (unlikely(irq < 16))
> +        } else if ( is_lpi(irq) )
> +        {
> +            do_LPI(irq);
> +        } else if ( unlikely(irq < 16) )
>          {
>              do_sgi(regs, irq);
>          }
> diff --git a/xen/include/asm-arm/irq.h b/xen/include/asm-arm/irq.h
> index 8f7a167..ee47de8 100644
> --- a/xen/include/asm-arm/irq.h
> +++ b/xen/include/asm-arm/irq.h
> @@ -34,6 +34,14 @@ struct irq_desc *__irq_to_desc(int irq);
>  
>  void do_IRQ(struct cpu_user_regs *regs, unsigned int irq, int is_fiq);
>  
> +#ifdef CONFIG_HAS_ITS
> +void do_LPI(unsigned int irq);
> +#else
> +static inline void do_LPI(unsigned int irq)
> +{
> +}
> +#endif
> +
>  #define domain_pirq_to_irq(d, pirq) (pirq)
>  
>  bool_t is_assignable_irq(unsigned int irq);
> -- 
> 2.9.0
> 


* Re: [PATCH 05/28] ARM: GICv3 ITS: map ITS command buffer
  2017-02-14 20:50     ` Julien Grall
@ 2017-02-14 21:00       ` Stefano Stabellini
  0 siblings, 0 replies; 106+ messages in thread
From: Stefano Stabellini @ 2017-02-14 21:00 UTC (permalink / raw)
  To: Julien Grall; +Cc: Andre Przywara, Stefano Stabellini, Vijay Kilari, xen-devel

On Tue, 14 Feb 2017, Julien Grall wrote:
> Hi Stefano,
> 
> On 02/14/2017 12:59 AM, Stefano Stabellini wrote:
> > On Mon, 30 Jan 2017, Andre Przywara wrote:
> > >  static int its_map_baser(void __iomem *basereg, uint64_t regc, int nr_items)
> > > @@ -150,6 +191,11 @@ int gicv3_its_init(struct host_its *hw_its)
> > >          }
> > >      }
> > > 
> > > +    hw_its->cmd_buf = its_map_cbaser(hw_its);
> > > +    if ( !hw_its->cmd_buf )
> > > +        return -ENOMEM;
> > > +    writeq_relaxed(0, hw_its->its_base + GITS_CWRITER);
> > 
> > Why this new write?
> 
> This was requested by me. From the spec (8.19.5 in ARM IHI 0069C), the
> reset value of GITS_CWRITER is unknown. So we have to reset the register
> to 0 otherwise the ITS may try to read invalid command as soon as it has
> been enabled.
> 
> FWIW, GITS_CREADR was reset to 0 by the ITS when GITS_CBASER has
> successfully been written (see 8.19.2).

All right, thanks
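
For the record, a toy model of why the write matters. This is only a
sketch under stated assumptions: the struct and helpers are hypothetical,
the 32-byte command size is the ITS command size, and the "garbage"
GITS_CWRITER value is made up to stand for an unknown reset value:

```c
#include <assert.h>
#include <stdint.h>

#define ITS_CMD_SIZE 32u    /* each ITS command occupies 32 bytes */

/* Hypothetical model of the two command queue pointers. */
struct toy_its {
    uint64_t creadr;    /* offset the ITS reads the next command from */
    uint64_t cwriter;   /* offset software writes the next command to */
    uint64_t buf_size;  /* size of the command buffer in bytes */
};

/* Writing CBASER resets CREADR to 0 but leaves CWRITER as it was. */
static void toy_write_cbaser(struct toy_its *its, uint64_t buf_size)
{
    its->buf_size = buf_size;
    its->creadr = 0;
}

/* Number of commands the ITS believes are waiting in the ring buffer. */
static uint64_t toy_pending_cmds(const struct toy_its *its)
{
    uint64_t bytes = (its->cwriter + its->buf_size - its->creadr) % its->buf_size;

    return bytes / ITS_CMD_SIZE;
}
```

With a stale CWRITER the model reports pending commands that nobody ever
queued; once CWRITER is written to 0 the queue is empty, which is the state
we want before setting the enable bit.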


* Re: [PATCH 00/28] arm64: Dom0 ITS emulation
  2017-02-13 13:53 ` [PATCH 00/28] arm64: Dom0 ITS emulation Vijay Kilari
@ 2017-02-14 22:00   ` Stefano Stabellini
  2017-02-15 15:59   ` Julien Grall
  1 sibling, 0 replies; 106+ messages in thread
From: Stefano Stabellini @ 2017-02-14 22:00 UTC (permalink / raw)
  To: Vijay Kilari; +Cc: Andre Przywara, Julien Grall, Stefano Stabellini, xen-devel

On Mon, 13 Feb 2017, Vijay Kilari wrote:
> Hi Andre,
> 
>   I tried your patch series on HW. Dom0 boots but no LPIs are coming to Dom0.
> So I made below patch to consider segment ID in generating devid,
>  I see below panic from _xmalloc().
> 
> Complete log is here
> http://pastebin.com/btythn2V
> 
> diff --git a/xen/arch/arm/physdev.c b/xen/arch/arm/physdev.c
> index 6e02de4..72ffe9f 100644
> --- a/xen/arch/arm/physdev.c
> +++ b/xen/arch/arm/physdev.c
> @@ -17,6 +17,7 @@
>  int do_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>  {
>      struct physdev_manage_pci manage;
> +   struct physdev_pci_device_add pci_add;
>      u32 devid;
>      int ret;
> 
> @@ -33,6 +34,19 @@ int do_physdev_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>                                               cmd == PHYSDEVOP_manage_pci_add);
> 
>              return ret;
> +       case PHYSDEVOP_pci_device_add:
> +            if ( copy_from_guest(&pci_add, arg, 1) != 0 )
> +                return -EFAULT;
> +            devid = pci_add.seg << 16 | pci_add.bus << 8 | pci_add.devfn;
> +
> +            printk("In %s calling gicv3_its_map_device for S: %d B: %d F:%d DEVID %u\n",
> +                    __func__, pci_add.seg,pci_add.bus, pci_add.devfn, devid);
> +            /* Allocate an ITS device table with space for 32 MSIs */
> +            ret = gicv3_its_map_guest_device(hardware_domain, devid, devid, 5,
> +                                       cmd == PHYSDEVOP_pci_device_add);
> +
> +            return ret;
>      }

Hi Vijay, thanks for testing the series. Instead of implementing
PHYSDEVOP_pci_device_add here, could you call gicv3_its_map_guest_device
for each device statically from a Cavium specific platform file under
xen/arch/arm/platforms?

Once we have a clearer idea of which hypercalls to implement and how,
we'll do this properly.
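
Something along these lines, as a rough standalone sketch. The device list,
the struct and the function names are all hypothetical, and the stub only
stands in for gicv3_its_map_guest_device() so the example compiles on its
own; the devid packing and the 5 bits (32 events) are taken from the hunk
above:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical device descriptor for a static, per-platform list. */
struct sbdf { uint16_t seg; uint8_t bus; uint8_t devfn; };

/* Pack segment/bus/devfn into a device ID the way the quoted hunk does. */
static uint32_t pci_devid(struct sbdf dev)
{
    return ((uint32_t)dev.seg << 16) | ((uint32_t)dev.bus << 8) | dev.devfn;
}

/* Stub standing in for gicv3_its_map_guest_device(); it only records calls. */
static uint32_t mapped_devid[8];
static size_t nr_mapped;

static int map_guest_device_stub(uint32_t host_devid, uint32_t guest_devid,
                                 int bits, int valid)
{
    (void)bits;
    (void)valid;

    if ( host_devid != guest_devid || nr_mapped >= 8 )
        return -1;

    mapped_devid[nr_mapped++] = host_devid;
    return 0;
}

/* Map every device in the static list 1:1, with room for 2^5 = 32 events. */
static int platform_map_its_devices(const struct sbdf *devs, size_t n)
{
    for ( size_t i = 0; i < n; i++ )
    {
        uint32_t devid = pci_devid(devs[i]);
        int ret = map_guest_device_stub(devid, devid, 5, 1);

        if ( ret )
            return ret;
    }

    return 0;
}
```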


* Re: [PATCH 12/28] ARM: GICv3: enable ITS and LPIs on the host
  2017-01-30 18:31 ` [PATCH 12/28] ARM: GICv3: enable ITS and LPIs on the host Andre Przywara
@ 2017-02-14 22:41   ` Stefano Stabellini
  2017-02-15 17:35   ` Julien Grall
  1 sibling, 0 replies; 106+ messages in thread
From: Stefano Stabellini @ 2017-02-14 22:41 UTC (permalink / raw)
  To: Andre Przywara; +Cc: xen-devel, Julien Grall, Stefano Stabellini, Vijay Kilari

On Mon, 30 Jan 2017, Andre Przywara wrote:
> Now that the host part of the ITS code is in place, we can enable the
> ITS and also LPIs on each redistributor to get the show rolling.
> At this point there would be no LPIs mapped, as guests don't know about
> the ITS yet.
> 
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> ---
>  xen/arch/arm/gic-v3-its.c | 34 ++++++++++++++++++++++++++++++++--
>  xen/arch/arm/gic-v3.c     | 19 +++++++++++++++++++
>  2 files changed, 51 insertions(+), 2 deletions(-)
> 
> diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
> index f073ab5..2a7093f 100644
> --- a/xen/arch/arm/gic-v3-its.c
> +++ b/xen/arch/arm/gic-v3-its.c
> @@ -62,6 +62,28 @@ static int its_send_command(struct host_its *hw_its, const void *its_cmd)
>      return 0;
>  }
>  
> +/* Wait for an ITS to finish processing all commands. */
> +static int gicv3_its_wait_commands(struct host_its *hw_its)
> +{
> +    s_time_t deadline = NOW() + MILLISECS(1000);
> +    uint64_t readp, writep;
> +
> +    do {
> +        spin_lock(&hw_its->cmd_lock);
> +        readp = readq_relaxed(hw_its->its_base + GITS_CREADR) & BUFPTR_MASK;
> +        writep = readq_relaxed(hw_its->its_base + GITS_CWRITER) & BUFPTR_MASK;
> +        spin_unlock(&hw_its->cmd_lock);
> +
> +        if ( readp == writep )
> +            return 0;
> +
> +        cpu_relax();
> +        udelay(1);
> +    } while ( NOW() <= deadline );
> +
> +    return -ETIMEDOUT;
> +}

I hope we won't have anything like this after the initialization is
completed. In fact, if this is called only at initialization, do we need
the spin lock?
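
To make the question concrete: if the only callers were on the init path,
nothing else would touch the queue pointers and the poll would reduce to a
plain bounded loop. The sketch below is a toy model only. The struct, the
tick helper and the poll budget are hypothetical, and it ignores ring
buffer wrap-around:

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical model of an ITS draining its command queue. */
struct toy_its {
    uint64_t creadr;            /* read pointer, advanced by the "hardware" */
    uint64_t cwriter;           /* write pointer, advanced by software */
    unsigned int cmds_per_poll; /* commands the model consumes per poll */
};

/* Let the modelled ITS consume up to cmds_per_poll 32-byte commands. */
static void toy_its_tick(struct toy_its *its)
{
    for ( unsigned int n = 0;
          n < its->cmds_per_poll && its->creadr != its->cwriter; n++ )
        its->creadr += 32;
}

/* Poll without a lock until the queue is empty or the budget runs out. */
static int toy_wait_commands(struct toy_its *its, unsigned int max_polls)
{
    for ( unsigned int i = 0; i < max_polls; i++ )
    {
        if ( its->creadr == its->cwriter )
            return 0;

        toy_its_tick(its);
    }

    return its->creadr == its->cwriter ? 0 : -ETIMEDOUT;
}
```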


>  static uint64_t encode_rdbase(struct host_its *hw_its, int cpu, uint64_t reg)
>  {
>      reg &= ~GENMASK(51, 16);
> @@ -161,6 +183,10 @@ int gicv3_its_setup_collection(int cpu)
>          ret = its_send_cmd_sync(its, cpu);
>          if ( ret )
>              return ret;
> +
> +        ret = gicv3_its_wait_commands(its);
> +        if ( ret )
> +            return ret;

Just do
    
    return gicv3_its_wait_commands(its);

as for the other cases


>      }
>  
>      return 0;
> @@ -367,6 +393,10 @@ int gicv3_its_init(struct host_its *hw_its)
>          return -ENOMEM;
>      writeq_relaxed(0, hw_its->its_base + GITS_CWRITER);
>  
> +    /* Now enable interrupt translation and command processing on that ITS. */
> +    reg = readl_relaxed(hw_its->its_base + GITS_CTLR);
> +    writel_relaxed(reg | GITS_CTLR_ENABLE, hw_its->its_base + GITS_CTLR);
> +
>      /*
>       * We issue the collection mapping calls upon initialising the
>       * redistributors, which for CPU 0 happens before the ITS gets initialised
> @@ -381,7 +411,7 @@ int gicv3_its_init(struct host_its *hw_its)
>      if ( ret )
>          return ret;
>  
> -    return 0;
> +    return gicv3_its_wait_commands(hw_its);
>  }
>  
>  static void remove_mapped_guest_device(struct its_devices *dev)
> @@ -424,7 +454,7 @@ int gicv3_its_map_host_events(struct host_its *its,
>      if ( ret )
>          return ret;
>  
> -    return 0;
> +    return gicv3_its_wait_commands(its);
>  }
>  
>  int gicv3_its_map_guest_device(struct domain *d, int host_devid,
> diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
> index 5f825a6..23cf33d 100644
> --- a/xen/arch/arm/gic-v3.c
> +++ b/xen/arch/arm/gic-v3.c
> @@ -647,6 +647,21 @@ static int gicv3_rdist_init_lpis(void __iomem * rdist_base)
>      return gicv3_its_setup_collection(smp_processor_id());
>  }
>  
> +/* Enable LPIs on this redistributor (only useful when the host has an ITS. */
> +static bool gicv3_enable_lpis(void)
> +{
> +    uint32_t val;
> +
> +    val = readl_relaxed(GICD_RDIST_BASE + GICR_TYPER);
> +    if ( !(val & GICR_TYPER_PLPIS) )
> +        return false;
> +
> +    val = readl_relaxed(GICD_RDIST_BASE + GICR_CTLR);
> +    writel_relaxed(val | GICR_CTLR_ENABLE_LPIS, GICD_RDIST_BASE + GICR_CTLR);
> +
> +    return true;
> +}
> +
>  static int __init gicv3_populate_rdist(void)
>  {
>      int i;
> @@ -755,6 +770,10 @@ static int gicv3_cpu_init(void)
>      if ( gicv3_enable_redist() )
>          return -ENODEV;
>  
> +    /* If the host has any ITSes, enable LPIs now. */
> +    if ( !list_empty(&host_its_list) )
> +        gicv3_enable_lpis();
> +
>      /* Set priority on PPI and SGI interrupts */
>      priority = (GIC_PRI_IPI << 24 | GIC_PRI_IPI << 16 | GIC_PRI_IPI << 8 |
>                  GIC_PRI_IPI);
> -- 
> 2.9.0
> 


* Re: [PATCH 13/28] ARM: vGICv3: handle virtual LPI pending and property tables
  2017-01-30 18:31 ` [PATCH 13/28] ARM: vGICv3: handle virtual LPI pending and property tables Andre Przywara
@ 2017-02-14 23:56   ` Stefano Stabellini
  2017-02-15 18:44   ` Julien Grall
  1 sibling, 0 replies; 106+ messages in thread
From: Stefano Stabellini @ 2017-02-14 23:56 UTC (permalink / raw)
  To: Andre Przywara; +Cc: xen-devel, Julien Grall, Stefano Stabellini, Vijay Kilari

On Mon, 30 Jan 2017, Andre Przywara wrote:
> Allow a guest to provide the address and size for the memory regions
> it has reserved for the GICv3 pending and property tables.
> We sanitise the various fields of the respective redistributor
> registers and map those pages into Xen's address space to have easy
> access.
> 
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>

Please take a look at
alpine.DEB.2.10.1610281619240.9978@sstabellini-ThinkPad-X260


>  xen/arch/arm/vgic-v3.c       | 220 +++++++++++++++++++++++++++++++++++++++----
>  xen/arch/arm/vgic.c          |   4 +
>  xen/include/asm-arm/domain.h |   8 +-
>  xen/include/asm-arm/vgic.h   |  24 ++++-
>  4 files changed, 233 insertions(+), 23 deletions(-)
> 
> diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c
> index b0653c2..c6db2d7 100644
> --- a/xen/arch/arm/vgic-v3.c
> +++ b/xen/arch/arm/vgic-v3.c
> @@ -20,12 +20,14 @@
>  
>  #include <xen/bitops.h>
>  #include <xen/config.h>
> +#include <xen/domain_page.h>
>  #include <xen/lib.h>
>  #include <xen/init.h>
>  #include <xen/softirq.h>
>  #include <xen/irq.h>
>  #include <xen/sched.h>
>  #include <xen/sizes.h>
> +#include <xen/vmap.h>
>  #include <asm/current.h>
>  #include <asm/mmio.h>
>  #include <asm/gic_v3_defs.h>
> @@ -229,12 +231,15 @@ static int __vgic_v3_rdistr_rd_mmio_read(struct vcpu *v, mmio_info_t *info,
>          goto read_reserved;
>  
>      case VREG64(GICR_PROPBASER):
> -        /* LPI's not implemented */
> -        goto read_as_zero_64;
> +        if ( !vgic_reg64_check_access(dabt) ) goto bad_width;
> +        *r = vgic_reg64_extract(v->domain->arch.vgic.rdist_propbase, info);
> +        return 1;
>  
>      case VREG64(GICR_PENDBASER):
> -        /* LPI's not implemented */
> -        goto read_as_zero_64;
> +        if ( !vgic_reg64_check_access(dabt) ) goto bad_width;
> +        *r = vgic_reg64_extract(v->arch.vgic.rdist_pendbase, info);
> +        *r &= ~GICR_PENDBASER_PTZ;       /* WO, reads as 0 */
> +        return 1;
>  
>      case 0x0080:
>          goto read_reserved;
> @@ -302,11 +307,6 @@ bad_width:
>      domain_crash_synchronous();
>      return 0;
>  
> -read_as_zero_64:
> -    if ( !vgic_reg64_check_access(dabt) ) goto bad_width;
> -    *r = 0;
> -    return 1;
> -
>  read_as_zero_32:
>      if ( dabt.size != DABT_WORD ) goto bad_width;
>      *r = 0;
> @@ -331,11 +331,179 @@ read_unknown:
>      return 1;
>  }
>  
> +static uint64_t vgic_sanitise_field(uint64_t reg, uint64_t field_mask,
> +                                    int field_shift,
> +                                    uint64_t (*sanitise_fn)(uint64_t))
> +{
> +    uint64_t field = (reg & field_mask) >> field_shift;
> +
> +    field = sanitise_fn(field) << field_shift;
> +
> +    return (reg & ~field_mask) | field;
> +}
> +
> +/* We want to avoid outer shareable. */
> +static uint64_t vgic_sanitise_shareability(uint64_t field)
> +{
> +    switch (field) {
> +    case GIC_BASER_OuterShareable:
> +        return GIC_BASER_InnerShareable;
> +    default:
> +        return field;
> +    }
> +}
> +
> +/* Avoid any inner non-cacheable mapping. */
> +static uint64_t vgic_sanitise_inner_cacheability(uint64_t field)
> +{
> +    switch (field) {
> +    case GIC_BASER_CACHE_nCnB:
> +    case GIC_BASER_CACHE_nC:
> +        return GIC_BASER_CACHE_RaWb;
> +    default:
> +        return field;
> +    }
> +}
> +
> +/* Non-cacheable or same-as-inner are OK. */
> +static uint64_t vgic_sanitise_outer_cacheability(uint64_t field)
> +{
> +    switch (field) {
> +    case GIC_BASER_CACHE_SameAsInner:
> +    case GIC_BASER_CACHE_nC:
> +        return field;
> +    default:
> +        return GIC_BASER_CACHE_nC;
> +    }
> +}
> +
> +static uint64_t sanitize_propbaser(uint64_t reg)
> +{
> +    reg = vgic_sanitise_field(reg, GICR_PROPBASER_SHAREABILITY_MASK,
> +                              GICR_PROPBASER_SHAREABILITY_SHIFT,
> +                              vgic_sanitise_shareability);
> +    reg = vgic_sanitise_field(reg, GICR_PROPBASER_INNER_CACHEABILITY_MASK,
> +                              GICR_PROPBASER_INNER_CACHEABILITY_SHIFT,
> +                              vgic_sanitise_inner_cacheability);
> +    reg = vgic_sanitise_field(reg, GICR_PROPBASER_OUTER_CACHEABILITY_MASK,
> +                              GICR_PROPBASER_OUTER_CACHEABILITY_SHIFT,
> +                              vgic_sanitise_outer_cacheability);
> +
> +    reg &= ~GICR_PROPBASER_RES0_MASK;
> +    return reg;
> +}
> +
> +static uint64_t sanitize_pendbaser(uint64_t reg)
> +{
> +    reg = vgic_sanitise_field(reg, GICR_PENDBASER_SHAREABILITY_MASK,
> +                              GICR_PENDBASER_SHAREABILITY_SHIFT,
> +                              vgic_sanitise_shareability);
> +    reg = vgic_sanitise_field(reg, GICR_PENDBASER_INNER_CACHEABILITY_MASK,
> +                              GICR_PENDBASER_INNER_CACHEABILITY_SHIFT,
> +                              vgic_sanitise_inner_cacheability);
> +    reg = vgic_sanitise_field(reg, GICR_PENDBASER_OUTER_CACHEABILITY_MASK,
> +                              GICR_PENDBASER_OUTER_CACHEABILITY_SHIFT,
> +                              vgic_sanitise_outer_cacheability);
> +
> +    reg &= ~GICR_PENDBASER_RES0_MASK;
> +    return reg;
> +}
> +
> +/*
> + * Mark a given number of guest pages as used (by increasing their refcount),
> + * starting with the given guest address. This needs to be called once before
> + * calling (possibly repeatedly) map_guest_pages().
> + * Before the domain gets destroyed, call put_guest_pages() to drop the
> + * reference.
> + */
> +int get_guest_pages(struct domain *d, paddr_t gpa, int nr_pages)
> +{
> +    int i;
> +    struct page_info *page;
> +
> +    for ( i = 0; i < nr_pages; i++ )
> +    {
> +        page = get_page_from_gfn(d, (gpa >> PAGE_SHIFT) + i, NULL, P2M_ALLOC);
> +        if ( ! page )
> +            return -EINVAL;
> +    }
> +
> +    return 0;
> +}
> +
> +void put_guest_pages(struct domain *d, paddr_t gpa, int nr_pages)
> +{
> +    mfn_t mfn;
> +    int i;
> +
> +    p2m_read_lock(&d->arch.p2m);
> +    for ( i = 0; i < nr_pages; i++ )
> +    {
> +        mfn = p2m_get_entry(&d->arch.p2m, _gfn((gpa >> PAGE_SHIFT) + i),
> +                            NULL, NULL, NULL);
> +        if ( mfn_eq(mfn, INVALID_MFN) )
> +            continue;
> +        put_page(mfn_to_page(mfn_x(mfn)));
> +    }
> +    p2m_read_unlock(&d->arch.p2m);
> +}
> +
> +/*
> + * Provides easy access to guest memory by "mapping" some parts of it into
> + * Xen's VA space. In fact it relies on the memory being already mapped
> + * and just provides a pointer to it.
> + * This allows the ITS configuration data to be held in guest memory and
> + * avoids using Xen's memory for that.
> + */
> +void *map_guest_pages(struct domain *d, paddr_t guest_addr, int nr_pages)
> +{
> +    int i;
> +    void *ptr, *follow;
> +
> +    ptr = map_domain_page(_mfn(guest_addr >> PAGE_SHIFT));
> +
> +    /* Make sure subsequent pages are mapped in a virtually contigious way. */
> +    for ( i = 1; i < nr_pages; i++ )
> +    {
> +        follow = map_domain_page(_mfn((guest_addr >> PAGE_SHIFT) + i));
> +        if ( follow != ptr + ((long)i << PAGE_SHIFT) )
> +            return NULL;
> +    }
> +
> +    return ptr + (guest_addr & ~PAGE_MASK);
> +}
> +
> +/* "Unmap" previously mapped guest pages. Should be optimized away on arm64. */
> +void unmap_guest_pages(void *va, int nr_pages)
> +{
> +    long i;
> +
> +    for ( i = nr_pages - 1; i >= 0; i-- )
> +        unmap_domain_page(((uintptr_t)va & PAGE_MASK) + (i << PAGE_SHIFT));
> +}
> +
> +int vgic_lpi_get_priority(struct domain *d, uint32_t vlpi)
> +{
> +    if ( vlpi >= d->arch.vgic.nr_lpis )
> +        return GIC_PRI_IRQ;
> +
> +    return d->arch.vgic.proptable[vlpi - LPI_OFFSET] & LPI_PROP_PRIO_MASK;
> +}
> +
> +bool vgic_lpi_is_enabled(struct domain *d, uint32_t vlpi)
> +{
> +    if ( vlpi >= d->arch.vgic.nr_lpis )
> +        return false;
> +
> +    return d->arch.vgic.proptable[vlpi - LPI_OFFSET] & LPI_PROP_ENABLED;
> +}
> +
>  static int __vgic_v3_rdistr_rd_mmio_write(struct vcpu *v, mmio_info_t *info,
>                                            uint32_t gicr_reg,
>                                            register_t r)
>  {
>      struct hsr_dabt dabt = info->dabt;
> +    uint64_t reg;
>  
>      switch ( gicr_reg )
>      {
> @@ -366,36 +534,54 @@ static int __vgic_v3_rdistr_rd_mmio_write(struct vcpu *v, mmio_info_t *info,
>          goto write_impl_defined;
>  
>      case VREG64(GICR_SETLPIR):
> -        /* LPI is not implemented */
> +        /* LPIs without an ITS are not implemented */
>          goto write_ignore_64;
>  
>      case VREG64(GICR_CLRLPIR):
> -        /* LPI is not implemented */
> +        /* LPIs without an ITS are not implemented */
>          goto write_ignore_64;
>  
>      case 0x0050:
>          goto write_reserved;
>  
>      case VREG64(GICR_PROPBASER):
> -        /* LPI is not implemented */
> -        goto write_ignore_64;
> +        if ( !vgic_reg64_check_access(dabt) ) goto bad_width;
> +
> +        /* Writing PROPBASER with LPIs enabled is UNPREDICTABLE. */
> +        if ( v->arch.vgic.flags & VGIC_V3_LPIS_ENABLED )
> +            return 1;
> +
> +        reg = v->domain->arch.vgic.rdist_propbase;
> +        vgic_reg64_update(&reg, r, info);
> +        reg = sanitize_propbaser(reg);
> +        v->domain->arch.vgic.rdist_propbase = reg;
> +        return 1;
>  
>      case VREG64(GICR_PENDBASER):
> -        /* LPI is not implemented */
> -        goto write_ignore_64;
> +        if ( !vgic_reg64_check_access(dabt) ) goto bad_width;
> +
> +        /* Writing PENDBASER with LPIs enabled is UNPREDICTABLE. */
> +        if ( v->arch.vgic.flags & VGIC_V3_LPIS_ENABLED )
> +            return 1;
> +
> +	reg = v->arch.vgic.rdist_pendbase;
> +	vgic_reg64_update(&reg, r, info);
> +	reg = sanitize_pendbaser(reg);
> +	v->arch.vgic.rdist_pendbase = reg;
> +	return 1;
>  
>      case 0x0080:
>          goto write_reserved;
>  
>      case VREG64(GICR_INVLPIR):
> -        /* LPI is not implemented */
> +        /* LPIs without an ITS are not implemented */
>          goto write_ignore_64;
>  
>      case 0x00A8:
>          goto write_reserved;
>  
>      case VREG64(GICR_INVALLR):
> -        /* LPI is not implemented */
> +        /* LPIs without an ITS are not implemented */
>          goto write_ignore_64;
>  
>      case 0x00B8:
> diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c
> index 7e3440f..cf444f3 100644
> --- a/xen/arch/arm/vgic.c
> +++ b/xen/arch/arm/vgic.c
> @@ -494,6 +494,10 @@ struct pending_irq *lpi_to_pending(struct vcpu *v, unsigned int lpi,
>          empty->pirq.irq = lpi;
>      }
>  
> +    /* Update the enabled status */
> +    if ( vgic_lpi_is_enabled(v->domain, lpi) )
> +        set_bit(GIC_IRQ_GUEST_ENABLED, &empty->pirq.status);
> +
>      spin_unlock(&v->arch.vgic.pending_lpi_list_lock);
>  
>      return &empty->pirq;
> diff --git a/xen/include/asm-arm/domain.h b/xen/include/asm-arm/domain.h
> index f44a84b..33c1851 100644
> --- a/xen/include/asm-arm/domain.h
> +++ b/xen/include/asm-arm/domain.h
> @@ -110,6 +110,9 @@ struct arch_domain
>          } *rdist_regions;
>          int nr_regions;                     /* Number of rdist regions */
>          uint32_t rdist_stride;              /* Re-Distributor stride */
> +        int nr_lpis;
> +        uint64_t rdist_propbase;
> +        uint8_t *proptable;
>          struct rb_root its_devices;         /* devices mapped to an ITS */
>          spinlock_t its_devices_lock;        /* protects the its_devices tree */
>  #endif
> @@ -255,7 +258,10 @@ struct arch_vcpu
>  
>          /* GICv3: redistributor base and flags for this vCPU */
>          paddr_t rdist_base;
> -#define VGIC_V3_RDIST_LAST  (1 << 0)        /* last vCPU of the rdist */
> +#define VGIC_V3_RDIST_LAST      (1 << 0)        /* last vCPU of the rdist */
> +#define VGIC_V3_LPIS_ENABLED    (1 << 1)
> +        uint64_t rdist_pendbase;
> +        unsigned long *pendtable;
>          uint8_t flags;
>          struct list_head pending_lpi_list;
>          spinlock_t pending_lpi_list_lock;   /* protects the pending_lpi_list */
> diff --git a/xen/include/asm-arm/vgic.h b/xen/include/asm-arm/vgic.h
> index 03d4d2e..a882fe8 100644
> --- a/xen/include/asm-arm/vgic.h
> +++ b/xen/include/asm-arm/vgic.h
> @@ -285,6 +285,11 @@ VGIC_REG_HELPERS(32, 0x3);
>  
>  #undef VGIC_REG_HELPERS
>  
> +int get_guest_pages(struct domain *d, paddr_t gpa, int nr_pages);
> +void put_guest_pages(struct domain *d, paddr_t gpa, int nr_pages);
> +void *map_guest_pages(struct domain *d, paddr_t guest_addr, int nr_pages);
> +void unmap_guest_pages(void *va, int nr_pages);
> +
>  enum gic_sgi_mode;
>  
>  /*
> @@ -312,14 +317,23 @@ extern struct vgic_irq_rank *vgic_rank_irq(struct vcpu *v, unsigned int irq);
>  extern bool vgic_emulate(struct cpu_user_regs *regs, union hsr hsr);
>  extern void vgic_disable_irqs(struct vcpu *v, uint32_t r, int n);
>  extern void vgic_enable_irqs(struct vcpu *v, uint32_t r, int n);
> -/* placeholder function until the property table gets introduced */
> -static inline int vgic_lpi_get_priority(struct domain *d, uint32_t vlpi)
> -{
> -    return 0xa;
> -}
>  extern void register_vgic_ops(struct domain *d, const struct vgic_ops *ops);
>  int vgic_v2_init(struct domain *d, int *mmio_count);
>  int vgic_v3_init(struct domain *d, int *mmio_count);
> +#ifdef CONFIG_HAS_GICV3
> +extern int vgic_lpi_get_priority(struct domain *d, uint32_t vlpi);
> +extern bool vgic_lpi_is_enabled(struct domain *d, uint32_t vlpi);
> +#else
> +static inline int vgic_lpi_get_priority(struct domain *d, uint32_t vlpi)
> +{
> +    return 0xa0;
> +}
> +
> +static inline bool vgic_lpi_is_enabled(struct domain *d, uint32_t vlpi)
> +{
> +    return false;
> +}
> +#endif
>  
>  extern int domain_vgic_register(struct domain *d, int *mmio_count);
>  extern int vcpu_vgic_free(struct vcpu *v);
> -- 
> 2.9.0
> 


* Re: [PATCH 14/28] ARM: vGICv3: Handle disabled LPIs
  2017-01-30 18:31 ` [PATCH 14/28] ARM: vGICv3: Handle disabled LPIs Andre Przywara
@ 2017-02-14 23:58   ` Stefano Stabellini
  0 siblings, 0 replies; 106+ messages in thread
From: Stefano Stabellini @ 2017-02-14 23:58 UTC (permalink / raw)
  To: Andre Przywara; +Cc: xen-devel, Julien Grall, Stefano Stabellini, Vijay Kilari

On Mon, 30 Jan 2017, Andre Przywara wrote:
> If a guest disables an LPI, we do not forward this to the associated
> host LPI to avoid queueing commands to the host ITS command queue.
> So it may happen that an LPI fires nevertheless on the host. In this
> case we can bail out early, but have to save the pending state on the
> virtual side.
> 
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>

Please see alpine.DEB.2.10.1701051422020.2866@sstabellini-ThinkPad-X260


>  xen/arch/arm/gic-v3-lpi.c  |  8 ++++++++
>  xen/arch/arm/vgic-v3.c     | 12 ++++++++++++
>  xen/include/asm-arm/vgic.h |  6 ++++++
>  3 files changed, 26 insertions(+)
> 
> diff --git a/xen/arch/arm/gic-v3-lpi.c b/xen/arch/arm/gic-v3-lpi.c
> index d270053..ade8b69 100644
> --- a/xen/arch/arm/gic-v3-lpi.c
> +++ b/xen/arch/arm/gic-v3-lpi.c
> @@ -124,6 +124,14 @@ void do_LPI(unsigned int lpi)
>  
>      put_domain(d);
>  
> +    /*
> +     * We keep all host LPIs enabled, so check if it's disabled on the guest
> +     * side and just record this LPI in the virtual pending table in this case.
> +     * The guest picks it up once it gets enabled again.
> +     */
> +    if ( !vgic_can_inject_lpi(vcpu, hlpi.virt_lpi) )
> +        return;
> +
>      vgic_vcpu_inject_irq(vcpu, hlpi.virt_lpi);
>  }
>  
> diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c
> index c6db2d7..de625bf 100644
> --- a/xen/arch/arm/vgic-v3.c
> +++ b/xen/arch/arm/vgic-v3.c
> @@ -498,6 +498,18 @@ bool vgic_lpi_is_enabled(struct domain *d, uint32_t vlpi)
>      return d->arch.vgic.proptable[vlpi - LPI_OFFSET] & LPI_PROP_ENABLED;
>  }
>  
> +bool vgic_can_inject_lpi(struct vcpu *vcpu, uint32_t vlpi)
> +{
> +    if ( vlpi >= vcpu->domain->arch.vgic.nr_lpis )
> +        return false;
> +
> +    if ( vgic_lpi_is_enabled(vcpu->domain, vlpi) )
> +        return true;
> +
> +    set_bit(vlpi - LPI_OFFSET, vcpu->arch.vgic.pendtable);
> +    return false;
> +}
> +
>  static int __vgic_v3_rdistr_rd_mmio_write(struct vcpu *v, mmio_info_t *info,
>                                            uint32_t gicr_reg,
>                                            register_t r)
> diff --git a/xen/include/asm-arm/vgic.h b/xen/include/asm-arm/vgic.h
> index a882fe8..e71b18b 100644
> --- a/xen/include/asm-arm/vgic.h
> +++ b/xen/include/asm-arm/vgic.h
> @@ -323,6 +323,7 @@ int vgic_v3_init(struct domain *d, int *mmio_count);
>  #ifdef CONFIG_HAS_GICV3
>  extern int vgic_lpi_get_priority(struct domain *d, uint32_t vlpi);
>  extern bool vgic_lpi_is_enabled(struct domain *d, uint32_t vlpi);
> +extern bool vgic_can_inject_lpi(struct vcpu *v, uint32_t vlpi);
>  #else
>  static inline int vgic_lpi_get_priority(struct domain *d, uint32_t vlpi)
>  {
> @@ -333,6 +334,11 @@ static inline bool vgic_lpi_is_enabled(struct domain *d, uint32_t vlpi)
>  {
>      return false;
>  }
> +
> +static inline bool vgic_can_inject_lpi(struct vcpu *v, uint32_t vlpi)
> +{
> +    return false;
> +}
>  #endif
>  
>  extern int domain_vgic_register(struct domain *d, int *mmio_count);
> -- 
> 2.9.0
> 


* Re: [PATCH 17/28] ARM: vITS: handle CLEAR command
  2017-01-30 18:31 ` [PATCH 17/28] ARM: vITS: handle CLEAR command Andre Przywara
@ 2017-02-15  0:07   ` Stefano Stabellini
  0 siblings, 0 replies; 106+ messages in thread
From: Stefano Stabellini @ 2017-02-15  0:07 UTC (permalink / raw)
  To: Andre Przywara; +Cc: xen-devel, Julien Grall, Stefano Stabellini, Vijay Kilari

On Mon, 30 Jan 2017, Andre Przywara wrote:
> This introduces the ITS command handler for the CLEAR command, which
> clears the pending state of an LPI.
> This removes a not-yet injected, but already queued IRQ from a VCPU.
> 
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>

Please see alpine.DEB.2.10.1611081608580.3491@sstabellini-ThinkPad-X260


> ---
>  xen/arch/arm/vgic-v3-its.c | 35 +++++++++++++++++++++++++++++++++--
>  1 file changed, 33 insertions(+), 2 deletions(-)
> 
> diff --git a/xen/arch/arm/vgic-v3-its.c b/xen/arch/arm/vgic-v3-its.c
> index 982c51d..48eb924 100644
> --- a/xen/arch/arm/vgic-v3-its.c
> +++ b/xen/arch/arm/vgic-v3-its.c
> @@ -129,8 +129,8 @@ static void put_devid_evid(struct virt_its *its, struct vits_itte *itte)
>   * protect the ITTs with their less-than-page-size granularity.
>   * Takes and drops the its_lock.
>   */
> -bool read_itte(struct virt_its *its, uint32_t devid, uint32_t evid,
> -               struct vcpu **vcpu, uint32_t *vlpi)
> +static bool read_itte(struct virt_its *its, uint32_t devid, uint32_t evid,
> +                      struct vcpu **vcpu, uint32_t *vlpi)
>  {
>      struct vits_itte *itte;
>      int collid;
> @@ -214,6 +214,34 @@ static uint64_t its_cmd_mask_field(uint64_t *its_cmd,
>  #define its_cmd_get_target_addr(cmd)    its_cmd_mask_field(cmd, 2, 16, 32)
>  #define its_cmd_get_validbit(cmd)       its_cmd_mask_field(cmd, 2, 63,  1)
>  
> +static int its_handle_clear(struct virt_its *its, uint64_t *cmdptr)
> +{
> +    uint32_t devid = its_cmd_get_deviceid(cmdptr);
> +    uint32_t eventid = its_cmd_get_id(cmdptr);
> +    struct pending_irq *pirq;
> +    struct vcpu *vcpu;
> +    uint32_t vlpi;
> +
> +    if ( !read_itte(its, devid, eventid, &vcpu, &vlpi) )
> +        return -1;
> +
> +    /* Remove a pending, but not yet injected guest IRQ. */
> +    pirq = lpi_to_pending(vcpu, vlpi, false);
> +    if ( pirq )
> +    {
> +        clear_bit(GIC_IRQ_GUEST_QUEUED, &pirq->status);
> +        gic_remove_from_queues(vcpu, vlpi);
> +
> +        /* Mark this pending IRQ struct as availabe again. */
> +        if ( !test_bit(GIC_IRQ_GUEST_VISIBLE, &pirq->status) )
> +            pirq->irq = 0;
> +    }
> +
> +    clear_bit(vlpi - LPI_OFFSET, vcpu->arch.vgic.pendtable);
> +
> +    return 0;
> +}
> +
>  #define ITS_CMD_BUFFER_SIZE(baser)      ((((baser) & 0xff) + 1) << 12)
>  
>  static int vgic_its_handle_cmds(struct domain *d, struct virt_its *its,
> @@ -234,6 +262,9 @@ static int vgic_its_handle_cmds(struct domain *d, struct virt_its *its,
>          cmdptr = its->cmdbuf + (its->creadr / sizeof(*its->cmdbuf));
>          switch (its_cmd_get_command(cmdptr))
>          {
> +        case GITS_CMD_CLEAR:
> +            its_handle_clear(its, cmdptr);
> +            break;
>          case GITS_CMD_SYNC:
>              /* We handle ITS commands synchronously, so we ignore SYNC. */
>  	    break;
> -- 
> 2.9.0
> 


* Re: [PATCH 20/28] ARM: vITS: handle MAPD command
  2017-01-30 18:31 ` [PATCH 20/28] ARM: vITS: handle MAPD command Andre Przywara
@ 2017-02-15  0:17   ` Stefano Stabellini
  0 siblings, 0 replies; 106+ messages in thread
From: Stefano Stabellini @ 2017-02-15  0:17 UTC (permalink / raw)
  To: Andre Przywara; +Cc: xen-devel, Julien Grall, Stefano Stabellini, Vijay Kilari

On Mon, 30 Jan 2017, Andre Przywara wrote:
> The MAPD command maps a device by associating a memory region for
> storing ITTEs with a certain device ID.
> We just store the given guest physical address in the device table.
> We don't map the device tables permanently, as their alignment
> requirement is only 256 Bytes, thus making mapping of several tables
> complicated. We map the device tables on demand when we need them later.
> 
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> ---
>  xen/arch/arm/vgic-v3-its.c | 24 ++++++++++++++++++++++++
>  1 file changed, 24 insertions(+)
> 
> diff --git a/xen/arch/arm/vgic-v3-its.c b/xen/arch/arm/vgic-v3-its.c
> index e6523a3..5be40d8 100644
> --- a/xen/arch/arm/vgic-v3-its.c
> +++ b/xen/arch/arm/vgic-v3-its.c
> @@ -289,6 +289,27 @@ static int its_handle_mapc(struct virt_its *its, uint64_t *cmdptr)
>      return ret;
>  }
>  
> +static int its_handle_mapd(struct virt_its *its, uint64_t *cmdptr)
> +{
> +    uint32_t devid = its_cmd_get_deviceid(cmdptr);
> +    int size = its_cmd_get_size(cmdptr);
> +    bool valid = its_cmd_get_validbit(cmdptr);
> +    paddr_t itt_addr = its_cmd_mask_field(cmdptr, 2, 0, 52) & GENMASK(51, 8);

please see alpine.DEB.2.10.1611081649530.3491@sstabellini-ThinkPad-X260 


> +    if ( !its->dev_table )
> +        return -1;
> +
> +    spin_lock(&its->its_lock);
> +    if ( valid )
> +        its->dev_table[devid] = DEV_TABLE_ENTRY(itt_addr, size + 1);
> +    else
> +        its->dev_table[devid] = 0;
> +
> +    spin_unlock(&its->its_lock);
> +
> +    return 0;
> +}
> +
>  #define ITS_CMD_BUFFER_SIZE(baser)      ((((baser) & 0xff) + 1) << 12)
>  
>  static int vgic_its_handle_cmds(struct domain *d, struct virt_its *its,
> @@ -318,6 +339,9 @@ static int vgic_its_handle_cmds(struct domain *d, struct virt_its *its,
>          case GITS_CMD_MAPC:
>              its_handle_mapc(its, cmdptr);
>              break;
> +        case GITS_CMD_MAPD:
> +            its_handle_mapd(its, cmdptr);
> +	    break;
>          case GITS_CMD_SYNC:
>              /* We handle ITS commands synchronously, so we ignore SYNC. */
>  	    break;
> -- 
> 2.9.0
> 


^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [PATCH 00/28] arm64: Dom0 ITS emulation
  2017-02-13 13:53 ` [PATCH 00/28] arm64: Dom0 ITS emulation Vijay Kilari
  2017-02-14 22:00   ` Stefano Stabellini
@ 2017-02-15 15:59   ` Julien Grall
  1 sibling, 0 replies; 106+ messages in thread
From: Julien Grall @ 2017-02-15 15:59 UTC (permalink / raw)
  To: Vijay Kilari, Andre Przywara; +Cc: xen-devel, nd, Stefano Stabellini

Hi Vijay,

On 13/02/17 13:53, Vijay Kilari wrote:
>   I tried your patch series on HW. Dom0 boots but no LPIs are coming to Dom0.
> So I made below patch to consider segment ID in generating devid,
>  I see below panic from _xmalloc().

I found the root cause of this bug. The size of the ITT entry
is not read correctly from GITS_TYPER. Can you try the below patch?

diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
index 36839c919d..46519648e8 100644
--- a/xen/arch/arm/gic-v3-its.c
+++ b/xen/arch/arm/gic-v3-its.c
@@ -363,7 +363,7 @@ int gicv3_its_init(struct host_its *hw_its)
     reg = readq_relaxed(hw_its->its_base + GITS_TYPER);
     if ( reg & GITS_TYPER_PTA )
         hw_its->flags |= HOST_ITS_USES_PTA;
-    hw_its->itte_size = GITS_TYPER_ITT_SIZE(reg);
+    hw_its->itte_size = GITS_TYPER_ITT_SIZE(reg) + 1;
 
     for ( i = 0; i < GITS_BASER_NR_REGS; i++ )
     {


Cheers,

-- 
Julien Grall


^ permalink raw reply related	[flat|nested] 106+ messages in thread

* Re: [PATCH 07/28] ARM: GICv3 ITS: introduce device mapping
  2017-01-30 18:31 ` [PATCH 07/28] ARM: GICv3 ITS: introduce device mapping Andre Przywara
  2017-02-07 14:05   ` Julien Grall
@ 2017-02-15 16:30   ` Julien Grall
  2017-02-22  7:06   ` Vijay Kilari
  2017-02-22 13:17   ` Julien Grall
  3 siblings, 0 replies; 106+ messages in thread
From: Julien Grall @ 2017-02-15 16:30 UTC (permalink / raw)
  To: Andre Przywara, Stefano Stabellini; +Cc: xen-devel, nd, Vijay Kilari

Hi Andre,

On 30/01/17 18:31, Andre Przywara wrote:
> +static int its_send_cmd_mapd(struct host_its *its, uint32_t deviceid,
> +                             int size, uint64_t itt_addr, bool valid)
> +{
> +    uint64_t cmd[4];
> +
> +    cmd[0] = GITS_CMD_MAPD | ((uint64_t)deviceid << 32);
> +    cmd[1] = size & GENMASK(4, 0);
> +    cmd[2] = itt_addr & GENMASK(51, 8);
> +    if ( valid )
> +        cmd[2] |= GITS_VALID_BIT;
> +    cmd[3] = 0x00;
> +
> +    return its_send_command(its, cmd);
> +}
> +
>  /* Set up the (1:1) collection mapping for the given host CPU. */
>  int gicv3_its_setup_collection(int cpu)
>  {
> @@ -293,6 +310,7 @@ int gicv3_its_init(struct host_its *hw_its)
>      reg = readq_relaxed(hw_its->its_base + GITS_TYPER);
>      if ( reg & GITS_TYPER_PTA )
>          hw_its->flags |= HOST_ITS_USES_PTA;
> +    hw_its->itte_size = GITS_TYPER_ITT_SIZE(reg);

The GITS_TYPER.ITT_entry_size indicates the number of bytes minus one. 
So you would have to add a + 1.

I would add it in the GITS_TYPER_ITT_SIZE macro to simplify it.
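
As an illustration of folding the "+ 1" into the macro, here is a minimal
sketch. The bit position [7:4] follows the GICv3 architecture description
of GITS_TYPER.ITT_entry_size; the shift/mask names and the itt_bytes()
helper are assumptions made for this example and are not taken from the
series.

```c
#include <assert.h>
#include <stdint.h>

/* GITS_TYPER.ITT_entry_size: bits [7:4], encoded as "bytes minus one". */
#define GITS_TYPER_ITT_SIZE_SHIFT   4
#define GITS_TYPER_ITT_SIZE_MASK    (0xfULL << GITS_TYPER_ITT_SIZE_SHIFT)

/* Fold the "+ 1" into the accessor so every caller gets a byte count. */
#define GITS_TYPER_ITT_SIZE(r)                                          \
    ((unsigned int)((((r) & GITS_TYPER_ITT_SIZE_MASK) >>                \
                     GITS_TYPER_ITT_SIZE_SHIFT) + 1))

/* Hypothetical helper: bytes needed for an ITT with 2^event_bits entries. */
static inline uint64_t itt_bytes(uint64_t typer, unsigned int event_bits)
{
    assert(event_bits < 32);
    return (uint64_t)GITS_TYPER_ITT_SIZE(typer) << event_bits;
}
```

With the adjustment inside the macro, a call site such as the assignment
to hw_its->itte_size would not need its own "+ 1".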

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [PATCH 10/28] ARM: GICv3: introduce separate pending_irq structs for LPIs
  2017-01-30 18:31 ` [PATCH 10/28] ARM: GICv3: introduce separate pending_irq structs for LPIs Andre Przywara
  2017-02-14 20:39   ` Stefano Stabellini
@ 2017-02-15 17:03   ` Julien Grall
  1 sibling, 0 replies; 106+ messages in thread
From: Julien Grall @ 2017-02-15 17:03 UTC (permalink / raw)
  To: Andre Przywara, Stefano Stabellini; +Cc: xen-devel, nd, Vijay Kilari

Hi Andre,

On 30/01/17 18:31, Andre Przywara wrote:
> For the same reason that allocating a struct irq_desc for each
> possible LPI is not an option, having a struct pending_irq for each LPI
> is also not feasible. However we actually only need those when an
> interrupt is on a vCPU (or is about to be injected).
> Maintain a list of those structs that we can use for the lifecycle of
> a guest LPI. We allocate new entries if necessary, however reuse
> pre-owned entries whenever possible.
> I added some locking around this list here, however my gut feeling is
> that we don't need one because this a per-VCPU structure anyway.
> If someone could confirm this, I'd be grateful.
> Teach the existing VGIC functions to find the right pointer when being
> given a virtual LPI number.
>
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> ---
>  xen/arch/arm/gic.c           |  3 +++
>  xen/arch/arm/vgic-v3.c       |  3 +++
>  xen/arch/arm/vgic.c          | 64 +++++++++++++++++++++++++++++++++++++++++---
>  xen/include/asm-arm/domain.h |  2 ++
>  xen/include/asm-arm/vgic.h   | 14 ++++++++++
>  5 files changed, 83 insertions(+), 3 deletions(-)
>
> diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
> index a5348f2..bd3c032 100644
> --- a/xen/arch/arm/gic.c
> +++ b/xen/arch/arm/gic.c
> @@ -509,6 +509,9 @@ static void gic_update_one_lr(struct vcpu *v, int i)
>                  struct vcpu *v_target = vgic_get_target_vcpu(v, irq);
>                  irq_set_affinity(p->desc, cpumask_of(v_target->processor));
>              }
> +            /* If this was an LPI, mark this struct as available again. */
> +            if ( is_lpi(p->irq) )
> +                p->irq = 0;
>          }
>      }
>  }
> diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c
> index 1fadb00..b0653c2 100644
> --- a/xen/arch/arm/vgic-v3.c
> +++ b/xen/arch/arm/vgic-v3.c
> @@ -1426,6 +1426,9 @@ static int vgic_v3_vcpu_init(struct vcpu *v)
>      if ( v->vcpu_id == last_cpu || (v->vcpu_id == (d->max_vcpus - 1)) )
>          v->arch.vgic.flags |= VGIC_V3_RDIST_LAST;
>
> +    spin_lock_init(&v->arch.vgic.pending_lpi_list_lock);
> +    INIT_LIST_HEAD(&v->arch.vgic.pending_lpi_list);
> +
>      return 0;
>  }
>
> diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c
> index 364d5f0..7e3440f 100644
> --- a/xen/arch/arm/vgic.c
> +++ b/xen/arch/arm/vgic.c
> @@ -31,6 +31,8 @@
>  #include <asm/mmio.h>
>  #include <asm/gic.h>
>  #include <asm/vgic.h>
> +#include <asm/gic_v3_defs.h>
> +#include <asm/gic_v3_its.h>

I really don't want to see gic_v3_* headers included in common code.

>
>  static inline struct vgic_irq_rank *vgic_get_rank(struct vcpu *v, int rank)
>  {
> @@ -61,7 +63,7 @@ struct vgic_irq_rank *vgic_rank_irq(struct vcpu *v, unsigned int irq)
>      return vgic_get_rank(v, rank);
>  }
>
> -static void vgic_init_pending_irq(struct pending_irq *p, unsigned int virq)
> +void vgic_init_pending_irq(struct pending_irq *p, unsigned int virq)
>  {
>      INIT_LIST_HEAD(&p->inflight);
>      INIT_LIST_HEAD(&p->lr_queue);
> @@ -244,10 +246,14 @@ struct vcpu *vgic_get_target_vcpu(struct vcpu *v, unsigned int virq)
>
>  static int vgic_get_virq_priority(struct vcpu *v, unsigned int virq)
>  {
> -    struct vgic_irq_rank *rank = vgic_rank_irq(v, virq);
> +    struct vgic_irq_rank *rank;
>      unsigned long flags;
>      int priority;
>
> +    if ( is_lpi(virq) )
> +        return vgic_lpi_get_priority(v->domain, virq);

This would benefit from a comment explaining why LPI handling takes a 
different path.

> +
> +    rank = vgic_rank_irq(v, virq);
>      vgic_lock_rank(v, rank, flags);
>      priority = rank->priority[virq & INTERRUPT_RANK_MASK];
>      vgic_unlock_rank(v, rank, flags);
> @@ -446,13 +452,63 @@ bool vgic_to_sgi(struct vcpu *v, register_t sgir, enum gic_sgi_mode irqmode,
>      return true;
>  }
>
> +/*
> + * Holding struct pending_irq's for each possible virtual LPI in each domain
> + * requires too much Xen memory, also a malicious guest could potentially
> + * spam Xen with LPI map requests. We cannot cover those with (guest allocated)
> + * ITS memory, so we use a dynamic scheme of allocating struct pending_irq's
> + * on demand.
> + */

I am afraid this will not prevent a guest from using too much Xen memory. 
Let's imagine the guest is mapping thousands of LPIs but decides to 
never handle them, or handles them slowly. You would allocate a 
pending_irq for each LPI, and never release the memory.

If we use dynamic allocation, we need a way to limit the number of LPIs 
received by a guest to avoid memory exhaustion. The only idea I have is 
an artificial limit, but I don't think it is good. Any other ideas?

> +struct pending_irq *lpi_to_pending(struct vcpu *v, unsigned int lpi,
> +                                   bool allocate)
> +{
> +    struct lpi_pending_irq *lpi_irq, *empty = NULL;
> +
> +    spin_lock(&v->arch.vgic.pending_lpi_list_lock);
> +    list_for_each_entry(lpi_irq, &v->arch.vgic.pending_lpi_list, entry)
> +    {
> +        if ( lpi_irq->pirq.irq == lpi )
> +        {
> +            spin_unlock(&v->arch.vgic.pending_lpi_list_lock);
> +            return &lpi_irq->pirq;
> +        }
> +
> +        if ( lpi_irq->pirq.irq == 0 && !empty )
> +            empty = lpi_irq;
> +    }
> +
> +    if ( !allocate )
> +    {
> +        spin_unlock(&v->arch.vgic.pending_lpi_list_lock);
> +        return NULL;
> +    }
> +
> +    if ( !empty )
> +    {
> +        empty = xzalloc(struct lpi_pending_irq);

xzalloc will return NULL if memory is exhausted. There is a general lack 
of error checking within this series. Any missing check could be a 
potential target for a guest, leading to a security issue. Stefano and I 
have already spotted some, but that does not mean we found them all. So 
before sending the next version, please go through the series and verify 
*every* return value.

Also, I can't find the code which releases LPIs, either in this patch or 
elsewhere in this series. A general rule is to have allocation and free 
within the same patch. It is much easier to spot a missing free that way.

> +        vgic_init_pending_irq(&empty->pirq, lpi);
> +        list_add_tail(&empty->entry, &v->arch.vgic.pending_lpi_list);
> +    } else
> +    {
> +        empty->pirq.status = 0;
> +        empty->pirq.irq = lpi;
> +    }
> +
> +    spin_unlock(&v->arch.vgic.pending_lpi_list_lock);
> +
> +    return &empty->pirq;
> +}
> +
>  struct pending_irq *irq_to_pending(struct vcpu *v, unsigned int irq)
>  {
>      struct pending_irq *n;
> +

Spurious change.

>      /* Pending irqs allocation strategy: the first vgic.nr_spis irqs
>       * are used for SPIs; the rests are used for per cpu irqs */
>      if ( irq < 32 )
>          n = &v->arch.vgic.pending_irqs[irq];
> +    else if ( is_lpi(irq) )
> +        n = lpi_to_pending(v, irq, true);
>      else
>          n = &v->domain->arch.vgic.pending_irqs[irq - 32];
>      return n;
> @@ -480,7 +536,7 @@ void vgic_clear_pending_irqs(struct vcpu *v)
>  void vgic_vcpu_inject_irq(struct vcpu *v, unsigned int virq)
>  {
>      uint8_t priority;
> -    struct pending_irq *iter, *n = irq_to_pending(v, virq);
> +    struct pending_irq *iter, *n;
>      unsigned long flags;
>      bool running;
>
> @@ -488,6 +544,8 @@ void vgic_vcpu_inject_irq(struct vcpu *v, unsigned int virq)
>
>      spin_lock_irqsave(&v->arch.vgic.lock, flags);
>
> +    n = irq_to_pending(v, virq);

Why did you move this code?

> +
>      /* vcpu offline */
>      if ( test_bit(_VPF_down, &v->pause_flags) )
>      {
> diff --git a/xen/include/asm-arm/domain.h b/xen/include/asm-arm/domain.h
> index 00b9c1a..f44a84b 100644
> --- a/xen/include/asm-arm/domain.h
> +++ b/xen/include/asm-arm/domain.h
> @@ -257,6 +257,8 @@ struct arch_vcpu
>          paddr_t rdist_base;
>  #define VGIC_V3_RDIST_LAST  (1 << 0)        /* last vCPU of the rdist */
>          uint8_t flags;
> +        struct list_head pending_lpi_list;
> +        spinlock_t pending_lpi_list_lock;   /* protects the pending_lpi_list */
>      } vgic;
>
>      /* Timer registers  */
> diff --git a/xen/include/asm-arm/vgic.h b/xen/include/asm-arm/vgic.h
> index 672f649..03d4d2e 100644
> --- a/xen/include/asm-arm/vgic.h
> +++ b/xen/include/asm-arm/vgic.h
> @@ -83,6 +83,12 @@ struct pending_irq
>      struct list_head lr_queue;
>  };
>
> +struct lpi_pending_irq
> +{
> +    struct list_head entry;
> +    struct pending_irq pirq;
> +};
> +
>  #define NR_INTERRUPT_PER_RANK   32
>  #define INTERRUPT_RANK_MASK (NR_INTERRUPT_PER_RANK - 1)
>
> @@ -296,13 +302,21 @@ extern struct vcpu *vgic_get_target_vcpu(struct vcpu *v, unsigned int virq);
>  extern void vgic_vcpu_inject_irq(struct vcpu *v, unsigned int virq);
>  extern void vgic_vcpu_inject_spi(struct domain *d, unsigned int virq);
>  extern void vgic_clear_pending_irqs(struct vcpu *v);
> +extern void vgic_init_pending_irq(struct pending_irq *p, unsigned int virq);
>  extern struct pending_irq *irq_to_pending(struct vcpu *v, unsigned int irq);
>  extern struct pending_irq *spi_to_pending(struct domain *d, unsigned int irq);
> +extern struct pending_irq *lpi_to_pending(struct vcpu *v, unsigned int irq,
> +                                          bool allocate);
>  extern struct vgic_irq_rank *vgic_rank_offset(struct vcpu *v, int b, int n, int s);
>  extern struct vgic_irq_rank *vgic_rank_irq(struct vcpu *v, unsigned int irq);
>  extern bool vgic_emulate(struct cpu_user_regs *regs, union hsr hsr);
>  extern void vgic_disable_irqs(struct vcpu *v, uint32_t r, int n);
>  extern void vgic_enable_irqs(struct vcpu *v, uint32_t r, int n);
> +/* placeholder function until the property table gets introduced */
> +static inline int vgic_lpi_get_priority(struct domain *d, uint32_t vlpi)
> +{
> +    return 0xa;
> +}

To be fair, you can avoid this function by re-ordering the patches. As 
suggested earlier, I would introduce some bits of the vITS before. This 
would also make the series easier to read.

>  extern void register_vgic_ops(struct domain *d, const struct vgic_ops *ops);
>  int vgic_v2_init(struct domain *d, int *mmio_count);
>  int vgic_v3_init(struct domain *d, int *mmio_count);
>

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [PATCH 10/28] ARM: GICv3: introduce separate pending_irq structs for LPIs
  2017-02-14 20:39   ` Stefano Stabellini
@ 2017-02-15 17:06     ` Julien Grall
  0 siblings, 0 replies; 106+ messages in thread
From: Julien Grall @ 2017-02-15 17:06 UTC (permalink / raw)
  To: Stefano Stabellini, Andre Przywara; +Cc: xen-devel, nd, Vijay Kilari

Hi Stefano,

On 14/02/17 20:39, Stefano Stabellini wrote:
> On Mon, 30 Jan 2017, Andre Przywara wrote:
>> For the same reason that allocating a struct irq_desc for each
>> possible LPI is not an option, having a struct pending_irq for each LPI
>> is also not feasible. However we actually only need those when an
>> interrupt is on a vCPU (or is about to be injected).
>> Maintain a list of those structs that we can use for the lifecycle of
>> a guest LPI. We allocate new entries if necessary, however reuse
>> pre-owned entries whenever possible.
>> I added some locking around this list here, however my gut feeling is
>> that we don't need one because this a per-VCPU structure anyway.
>> If someone could confirm this, I'd be grateful.
>> Teach the existing VGIC functions to find the right pointer when being
>> given a virtual LPI number.
>>
>> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
>
> Please address past comments, specifically use a data structure per
> domain, rather than per-vcpu. Also please move to a more efficient data
> structure, such as an hashtable or a tree.

+1 for both. We need to limit the memory used by a domain.
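
As a rough illustration of what a per-domain structure with bounded memory
could look like, here is a small fixed-size hash table keyed by virtual
LPI number. It is only a sketch under stated assumptions: all names
(struct domain_lpis, lpi_lookup, LPI_HASH_BITS, ...) are hypothetical, the
table size is arbitrary, there is no locking, and removal is left out
because open addressing would need tombstones for it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define LPI_HASH_BITS   4
#define LPI_HASH_SIZE   (1U << LPI_HASH_BITS)

struct lpi_entry {
    uint32_t vlpi;      /* 0 = unused slot (LPI numbers start at 8192) */
    uint8_t  priority;
};

/* One table per domain: its size is the upper bound on memory used. */
struct domain_lpis {
    struct lpi_entry slots[LPI_HASH_SIZE];
};

static void domain_lpis_init(struct domain_lpis *d)
{
    memset(d, 0, sizeof(*d));
}

static unsigned int lpi_hash(uint32_t vlpi)
{
    /* Multiplicative hash; keep the top LPI_HASH_BITS bits. */
    return (uint32_t)(vlpi * 2654435761U) >> (32 - LPI_HASH_BITS);
}

/* Find the entry for @vlpi; with @insert set, claim a free slot if absent. */
static struct lpi_entry *lpi_lookup(struct domain_lpis *d, uint32_t vlpi,
                                    int insert)
{
    unsigned int i, idx = lpi_hash(vlpi);

    assert(vlpi != 0);
    for ( i = 0; i < LPI_HASH_SIZE; i++ )
    {
        /* Linear probing with wrap-around. */
        struct lpi_entry *e = &d->slots[(idx + i) & (LPI_HASH_SIZE - 1)];

        if ( e->vlpi == vlpi )
            return e;
        if ( e->vlpi == 0 )
        {
            if ( !insert )
                return NULL;
            e->vlpi = vlpi;
            return e;
        }
    }

    return NULL;        /* table full: the guest cannot grow it further */
}
```

Lookups stay short as long as the table is not nearly full, and a guest
cannot make Xen allocate beyond the fixed table.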

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [PATCH 11/28] ARM: GICv3: forward pending LPIs to guests
  2017-02-14 21:00   ` Stefano Stabellini
@ 2017-02-15 17:18     ` Julien Grall
  2017-02-15 21:25       ` Stefano Stabellini
  0 siblings, 1 reply; 106+ messages in thread
From: Julien Grall @ 2017-02-15 17:18 UTC (permalink / raw)
  To: Stefano Stabellini, Andre Przywara; +Cc: xen-devel, nd, Vijay Kilari

Hi Stefano,

On 14/02/17 21:00, Stefano Stabellini wrote:
> On Mon, 30 Jan 2017, Andre Przywara wrote:
>> +/*
>> + * Handle incoming LPIs, which are a bit special, because they are potentially
>> + * numerous and also only get injected into guests. Treat them specially here,
>> + * by just looking up their target vCPU and virtual LPI number and hand it
>> + * over to the injection function.
>> + */
>> +void do_LPI(unsigned int lpi)
>> +{
>> +    struct domain *d;
>> +    union host_lpi *hlpip, hlpi;
>> +    struct vcpu *vcpu;
>> +
>> +    WRITE_SYSREG32(lpi, ICC_EOIR1_EL1);
>> +
>> +    hlpip = gic_get_host_lpi(lpi);
>> +    if ( !hlpip )
>> +        return;
>> +
>> +    hlpi.data = read_u64_atomic(&hlpip->data);
>> +
>> +    /* We may have mapped more host LPIs than the guest actually asked for. */
>> +    if ( !hlpi.virt_lpi )
>> +        return;
>> +
>> +    d = get_domain_by_id(hlpi.dom_id);
>> +    if ( !d )
>> +        return;
>> +
>> +    if ( hlpi.vcpu_id >= d->max_vcpus )
>> +    {
>> +        put_domain(d);
>> +        return;
>> +    }
>> +
>> +    vcpu = d->vcpu[hlpi.vcpu_id];
>> +
>> +    put_domain(d);
>> +
>> +    vgic_vcpu_inject_irq(vcpu, hlpi.virt_lpi);
>
> put_domain should be here

Why? I don't even understand why we would need to take a reference on 
the domain for LPIs. Would it not be enough to use rcu_lock_domain_by_id 
here?

Regards,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [PATCH 11/28] ARM: GICv3: forward pending LPIs to guests
  2017-01-30 18:31 ` [PATCH 11/28] ARM: GICv3: forward pending LPIs to guests Andre Przywara
  2017-02-14 21:00   ` Stefano Stabellini
@ 2017-02-15 17:30   ` Julien Grall
  1 sibling, 0 replies; 106+ messages in thread
From: Julien Grall @ 2017-02-15 17:30 UTC (permalink / raw)
  To: Andre Przywara, Stefano Stabellini; +Cc: xen-devel, nd, Vijay Kilari

Hi Andre,

On 30/01/17 18:31, Andre Przywara wrote:
> Upon receiving an LPI, we need to find the right VCPU and virtual IRQ
> number to get this IRQ injected.
> Iterate our two-level LPI table to find this information quickly when
> the host takes an LPI. Call the existing injection function to let the
> GIC emulation deal with this interrupt.
>
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> ---
>  xen/arch/arm/gic-v3-lpi.c | 41 +++++++++++++++++++++++++++++++++++++++++
>  xen/arch/arm/gic.c        |  6 ++++--
>  xen/include/asm-arm/irq.h |  8 ++++++++
>  3 files changed, 53 insertions(+), 2 deletions(-)
>
> diff --git a/xen/arch/arm/gic-v3-lpi.c b/xen/arch/arm/gic-v3-lpi.c
> index 8f6e7f3..d270053 100644
> --- a/xen/arch/arm/gic-v3-lpi.c
> +++ b/xen/arch/arm/gic-v3-lpi.c
> @@ -86,6 +86,47 @@ uint64_t gicv3_get_redist_address(int cpu, bool use_pta)
>          return per_cpu(redist_id, cpu) << 16;
>  }
>
> +/*
> + * Handle incoming LPIs, which are a bit special, because they are potentially
> + * numerous and also only get injected into guests. Treat them specially here,
> + * by just looking up their target vCPU and virtual LPI number and hand it
> + * over to the injection function.
> + */
> +void do_LPI(unsigned int lpi)
> +{
> +    struct domain *d;
> +    union host_lpi *hlpip, hlpi;
> +    struct vcpu *vcpu;
> +
> +    WRITE_SYSREG32(lpi, ICC_EOIR1_EL1);
> +
> +    hlpip = gic_get_host_lpi(lpi);
> +    if ( !hlpip )
> +        return;
> +
> +    hlpi.data = read_u64_atomic(&hlpip->data);
> +
> +    /* We may have mapped more host LPIs than the guest actually asked for. */

Another case is that the interrupt is received while the guest is still 
configuring it. What will happen if the interrupt is lost?

> +    if ( !hlpi.virt_lpi )
> +        return;
> +
> +    d = get_domain_by_id(hlpi.dom_id);
> +    if ( !d )
> +        return;
> +
> +    if ( hlpi.vcpu_id >= d->max_vcpus )

A comment explaining why this check is needed would certainly be useful here.

> +    {
> +        put_domain(d);
> +        return;
> +    }
> +
> +    vcpu = d->vcpu[hlpi.vcpu_id];
> +
> +    put_domain(d);
> +
> +    vgic_vcpu_inject_irq(vcpu, hlpi.virt_lpi);
> +}
> +
>  uint64_t gicv3_lpi_allocate_pendtable(void)
>  {
>      uint64_t reg;
> diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
> index bd3c032..7286e5d 100644
> --- a/xen/arch/arm/gic.c
> +++ b/xen/arch/arm/gic.c
> @@ -700,8 +700,10 @@ void gic_interrupt(struct cpu_user_regs *regs, int is_fiq)
>              local_irq_enable();
>              do_IRQ(regs, irq, is_fiq);
>              local_irq_disable();
> -        }
> -        else if (unlikely(irq < 16))
> +        } else if ( is_lpi(irq) )

Coding style:

}
else if (...)
{
}
else if (...)

> +        {
> +            do_LPI(irq);

I really don't want to see GICv3 specific code called in common code. 
Please introduce a specific callback in gic_hw_operations.

> +        } else if ( unlikely(irq < 16) )
>          {
>              do_sgi(regs, irq);
>          }
> diff --git a/xen/include/asm-arm/irq.h b/xen/include/asm-arm/irq.h
> index 8f7a167..ee47de8 100644
> --- a/xen/include/asm-arm/irq.h
> +++ b/xen/include/asm-arm/irq.h
> @@ -34,6 +34,14 @@ struct irq_desc *__irq_to_desc(int irq);
>
>  void do_IRQ(struct cpu_user_regs *regs, unsigned int irq, int is_fiq);
>
> +#ifdef CONFIG_HAS_ITS
> +void do_LPI(unsigned int irq);
> +#else
> +static inline void do_LPI(unsigned int irq)
> +{
> +}
> +#endif
> +

A callback would also avoid the ugly hack where do_LPI is defined in 
gic-v3-its.c but declared in irq.h.
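
A minimal sketch of the callback approach, assuming a cut-down stand-in
for struct gic_hw_operations. The struct name, the dispatch helper and the
GICv3-side function below are hypothetical and only show the shape of the
indirection; they are not the actual Xen definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Cut-down, hypothetical stand-in for Xen's struct gic_hw_operations. */
struct gic_hw_ops_sketch {
    /* Optional hook: left NULL by GIC drivers without LPI support. */
    void (*do_LPI)(unsigned int lpi);
};

/* Set once by the hardware-specific driver at init time. */
static const struct gic_hw_ops_sketch *gic_hw_ops;

static inline int is_lpi(unsigned int irq)
{
    return irq >= 8192;     /* LPI INTIDs start at 8192 */
}

/* Common-code dispatch: returns 1 if the IRQ went to the LPI hook. */
static int gic_dispatch_lpi(unsigned int irq)
{
    assert(gic_hw_ops != NULL);
    if ( !is_lpi(irq) || !gic_hw_ops->do_LPI )
        return 0;

    gic_hw_ops->do_LPI(irq);
    return 1;
}

/* GICv3-side implementation; it only records the LPI in this sketch. */
static unsigned int last_lpi_seen;

static void gicv3_do_LPI(unsigned int lpi)
{
    last_lpi_seen = lpi;    /* real code would look up the vCPU and inject */
}

static const struct gic_hw_ops_sketch gicv3_ops_sketch = {
    .do_LPI = gicv3_do_LPI,
};
```

With this shape the common code needs neither an #ifdef CONFIG_HAS_ITS
stub nor a GICv3 header: a driver without LPIs simply leaves the hook NULL.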

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [PATCH 12/28] ARM: GICv3: enable ITS and LPIs on the host
  2017-01-30 18:31 ` [PATCH 12/28] ARM: GICv3: enable ITS and LPIs on the host Andre Przywara
  2017-02-14 22:41   ` Stefano Stabellini
@ 2017-02-15 17:35   ` Julien Grall
  1 sibling, 0 replies; 106+ messages in thread
From: Julien Grall @ 2017-02-15 17:35 UTC (permalink / raw)
  To: Andre Przywara, Stefano Stabellini; +Cc: xen-devel, nd, Vijay Kilari

Hi Andre,

On 30/01/17 18:31, Andre Przywara wrote:
> Now that the host part of the ITS code is in place, we can enable the
> ITS and also LPIs on each redistributor to get the show rolling.
> At this point there would be no LPIs mapped, as guests don't know about
> the ITS yet.
>
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> ---
>  xen/arch/arm/gic-v3-its.c | 34 ++++++++++++++++++++++++++++++++--
>  xen/arch/arm/gic-v3.c     | 19 +++++++++++++++++++
>  2 files changed, 51 insertions(+), 2 deletions(-)
>
> diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
> index f073ab5..2a7093f 100644
> --- a/xen/arch/arm/gic-v3-its.c
> +++ b/xen/arch/arm/gic-v3-its.c
> @@ -62,6 +62,28 @@ static int its_send_command(struct host_its *hw_its, const void *its_cmd)
>      return 0;
>  }
>
> +/* Wait for an ITS to finish processing all commands. */
> +static int gicv3_its_wait_commands(struct host_its *hw_its)
> +{
> +    s_time_t deadline = NOW() + MILLISECS(1000);
> +    uint64_t readp, writep;
> +
> +    do {
> +        spin_lock(&hw_its->cmd_lock);
> +        readp = readq_relaxed(hw_its->its_base + GITS_CREADR) & BUFPTR_MASK;
> +        writep = readq_relaxed(hw_its->its_base + GITS_CWRITER) & BUFPTR_MASK;
> +        spin_unlock(&hw_its->cmd_lock);
> +
> +        if ( readp == writep )
> +            return 0;
> +
> +        cpu_relax();
> +        udelay(1);
> +    } while ( NOW() <= deadline );
> +
> +    return -ETIMEDOUT;
> +}
> +
>  static uint64_t encode_rdbase(struct host_its *hw_its, int cpu, uint64_t reg)
>  {
>      reg &= ~GENMASK(51, 16);
> @@ -161,6 +183,10 @@ int gicv3_its_setup_collection(int cpu)
>          ret = its_send_cmd_sync(its, cpu);
>          if ( ret )
>              return ret;
> +
> +        ret = gicv3_its_wait_commands(its);
> +        if ( ret )
> +            return ret;

Why are all the gicv3_its_wait_commands calls added now and not when the 
commands were added?

Maybe I am missing something, but based on the commit message, this patch 
should really just be turning on the enable bit.

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [PATCH 00/28] arm64: Dom0 ITS emulation
  2017-01-30 18:31 [PATCH 00/28] arm64: Dom0 ITS emulation Andre Przywara
                   ` (28 preceding siblings ...)
  2017-02-13 13:53 ` [PATCH 00/28] arm64: Dom0 ITS emulation Vijay Kilari
@ 2017-02-15 17:55 ` Julien Grall
  29 siblings, 0 replies; 106+ messages in thread
From: Julien Grall @ 2017-02-15 17:55 UTC (permalink / raw)
  To: Andre Przywara, Stefano Stabellini; +Cc: xen-devel, nd, Vijay Kilari

On 30/01/17 18:31, Andre Przywara wrote:
> Hi,

Hi Andre,

> Compared to the previous post (RFC-v2) this has seen a lot of reworks
> and cleanups in various areas.
> I tried to address all of the review comments, though some are hard to
> follow due to rewrites. So apologies if some points have slipped through.

A lot of the comments were not on code that was reworked. It is not that 
difficult to check whether a comment still applies before sending a 
series, and doing so can save a lot of review bandwidth.

For the next version please go through all the comments from the first 2 
versions and see whether they need to be addressed.

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [PATCH 04/28] ARM: GICv3 ITS: allocate device and collection table
  2017-01-30 18:31 ` [PATCH 04/28] ARM: GICv3 ITS: allocate device and collection table Andre Przywara
                     ` (3 preceding siblings ...)
  2017-02-14  0:54   ` Stefano Stabellini
@ 2017-02-15 18:31   ` Shanker Donthineni
  2017-02-16 19:03   ` Shanker Donthineni
  5 siblings, 0 replies; 106+ messages in thread
From: Shanker Donthineni @ 2017-02-15 18:31 UTC (permalink / raw)
  To: Andre Przywara, Stefano Stabellini, Julien Grall; +Cc: xen-devel, Vijay Kilari

Hi Andre,


On 01/30/2017 12:31 PM, Andre Przywara wrote:
> Each ITS maps a pair of a DeviceID (usually the PCI b/d/f triplet) and
> an EventID (the MSI payload or interrupt ID) to a pair of LPI number
> and collection ID, which points to the target CPU.
> This mapping is stored in the device and collection tables, which software
> has to provide for the ITS to use.
> Allocate the required memory and hand it the ITS.
> The maximum number of devices is limited to a compile-time constant
> exposed in Kconfig.
>
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> ---
>   xen/arch/arm/Kconfig             |  14 +++++
>   xen/arch/arm/gic-v3-its.c        | 129 +++++++++++++++++++++++++++++++++++++++
>   xen/arch/arm/gic-v3.c            |   5 ++
>   xen/include/asm-arm/gic_v3_its.h |  55 ++++++++++++++++-
>   4 files changed, 202 insertions(+), 1 deletion(-)
>
> diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
> index 71734a1..81bc233 100644
> --- a/xen/arch/arm/Kconfig
> +++ b/xen/arch/arm/Kconfig
> @@ -64,6 +64,20 @@ config MAX_PHYS_LPI_BITS
>             This can be overriden on the command line with the max_lpi_bits
>             parameter.
>
> +config MAX_PHYS_ITS_DEVICE_BITS
> +        depends on HAS_ITS
> +        int "Number of device bits the ITS supports"
> +        range 1 32
> +        default "10"
> +        help
> +          Specifies the maximum number of devices which want to use the ITS.
> +          Xen needs to allocates memory for the whole range very early.
> +          The allocation scheme may be sparse, so a much larger number must
> +          be supported to cover devices with a high bus number or those on
> +          separate bus segments.
> +          This can be overriden on the command line with the max_its_device_bits
> +          parameter.
> +
The number of DEVID bits that the hardware supports is discoverable 
through the register field GITS_TYPER.Devbits. The Xen driver must 
support all the devices that the hardware advertises, as the Linux kernel 
does. I'm seeing a Xen crash if I set DEVID to 32 bits.
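
For illustration, a minimal sketch of reading the field. The bit position
[17:13] and the "minus one" encoding follow the GICv3 architecture
description of GITS_TYPER.Devbits; the macro names and the
its_device_bits() helper, including its clamping policy, are assumptions
made for this example.

```c
#include <assert.h>
#include <stdint.h>

/* GITS_TYPER.Devbits: bits [17:13], number of DeviceID bits minus one. */
#define GITS_TYPER_DEVBITS_SHIFT    13
#define GITS_TYPER_DEVBITS_MASK     (0x1fULL << GITS_TYPER_DEVBITS_SHIFT)
#define GITS_TYPER_DEVBITS(r)                                           \
    ((unsigned int)((((r) & GITS_TYPER_DEVBITS_MASK) >>                 \
                     GITS_TYPER_DEVBITS_SHIFT) + 1))

/*
 * Hypothetical policy: take the hardware value as the upper bound and
 * never size the device table for more bits than the ITS implements.
 */
static unsigned int its_device_bits(uint64_t typer, unsigned int sw_limit)
{
    unsigned int hw_bits = GITS_TYPER_DEVBITS(typer);

    assert(sw_limit >= 1 && sw_limit <= 32);
    return hw_bits < sw_limit ? hw_bits : sw_limit;
}
```

Whether a software limit below the hardware value is acceptable at all is
the open question here; the sketch only shows where the hardware value
comes from.
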
>   endmenu
>
>   menu "ARM errata workaround via the alternative framework"
> diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
> index ff0f571..c31fef6 100644
> --- a/xen/arch/arm/gic-v3-its.c
> +++ b/xen/arch/arm/gic-v3-its.c
> @@ -20,9 +20,138 @@
>   #include <xen/lib.h>
>   #include <xen/device_tree.h>
>   #include <xen/libfdt/libfdt.h>
> +#include <xen/mm.h>
> +#include <xen/sizes.h>
>   #include <asm/gic.h>
>   #include <asm/gic_v3_defs.h>
>   #include <asm/gic_v3_its.h>
> +#include <asm/io.h>
> +
> +#define BASER_ATTR_MASK                                           \
> +        ((0x3UL << GITS_BASER_SHAREABILITY_SHIFT)               | \
> +         (0x7UL << GITS_BASER_OUTER_CACHEABILITY_SHIFT)         | \
> +         (0x7UL << GITS_BASER_INNER_CACHEABILITY_SHIFT))
> +#define BASER_RO_MASK   (GENMASK(58, 56) | GENMASK(52, 48))
> +
> +static uint64_t encode_phys_addr(paddr_t addr, int page_bits)
> +{
> +    uint64_t ret;
> +
> +    if ( page_bits < 16 )
> +        return (uint64_t)addr & GENMASK(47, page_bits);
> +
> +    ret = addr & GENMASK(47, 16);
> +    return ret | (addr & GENMASK(51, 48)) >> (48 - 12);
> +}
> +
> +#define PAGE_BITS(sz) ((sz) * 2 + PAGE_SHIFT)
> +
> +static int its_map_baser(void __iomem *basereg, uint64_t regc, int nr_items)
> +{
> +    uint64_t attr, reg;
> +    int entry_size = ((regc >> GITS_BASER_ENTRY_SIZE_SHIFT) & 0x1f) + 1;
> +    int pagesz = 0, order, table_size;
Better to try the ITS page sizes in the order 64K, 16K and 4K, to cover 
as many devices as possible with a single-level translation.
> +    void *buffer = NULL;
> +
> +    attr  = GIC_BASER_InnerShareable << GITS_BASER_SHAREABILITY_SHIFT;
> +    attr |= GIC_BASER_CACHE_SameAsInner << GITS_BASER_OUTER_CACHEABILITY_SHIFT;
> +    attr |= GIC_BASER_CACHE_RaWaWb << GITS_BASER_INNER_CACHEABILITY_SHIFT;
> +
> +    /*
> +     * Setup the BASE register with the attributes that we like. Then read
> +     * it back and see what sticks (page size, cacheability and shareability
> +     * attributes), retrying if necessary.
> +     */
> +    while ( 1 )
> +    {
> +        table_size = ROUNDUP(nr_items * entry_size, BIT(PAGE_BITS(pagesz)));
> +        order = get_order_from_bytes(table_size);
> +
You can map a maximum of 256 ITS pages, so the allocation must satisfy
'table_size >> PAGE_BITS(pagesz) <= 256'.
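
A rough sketch of what I mean, combining the page-size order with the
256-page limit. PG_BITS() mirrors PAGE_BITS() from the patch with a 4K base
page assumed. its_pick_pagesz(), ITS_MAX_PAGES and the "supported" bitmask
are hypothetical names for illustration only; real hardware reports the
accepted page size through the write/read-back loop, not through a mask:

```c
#include <stdint.h>

/* GITS_BASER page size encoding: 0 = 4K, 1 = 16K, 2 = 64K (4K base assumed). */
#define PG_BITS(sz)     ((sz) * 2 + 12)
/* The 8-bit size field cannot describe more than 256 ITS pages. */
#define ITS_MAX_PAGES   256U

/*
 * Pick the page size encoding for a flat table of table_bytes, trying 64K
 * first, then 16K, then 4K. Bit n of "supported" (hypothetical) says whether
 * the ITS accepts encoding n. Returns -1 if no supported page size can cover
 * the table with at most ITS_MAX_PAGES pages.
 */
static int its_pick_pagesz(uint64_t table_bytes, unsigned int supported)
{
    int pagesz;

    for ( pagesz = 2; pagesz >= 0; pagesz-- )
    {
        uint64_t page = 1ULL << PG_BITS(pagesz);
        /* Round up to whole ITS pages. */
        uint64_t nr_pages = (table_bytes + page - 1) >> PG_BITS(pagesz);

        if ( !(supported & (1U << pagesz)) )
            continue;

        if ( nr_pages <= ITS_MAX_PAGES )
            return pagesz;
    }

    return -1;
}
```

With 64K pages a flat table can span up to 16MB, with 4K pages only 1MB,
which is why starting from the largest size covers the most devices.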

> +        if ( !buffer )
> +            buffer = alloc_xenheap_pages(order, 0);
> +        if ( !buffer )
> +            return -ENOMEM;
> +
> +        reg  = attr;
> +        reg |= (pagesz << GITS_BASER_PAGE_SIZE_SHIFT);
> +        reg |= table_size >> PAGE_BITS(pagesz);
Change to 'reg |= (table_size >> PAGE_BITS(pagesz)) & 0xff'
> +        reg |= regc & BASER_RO_MASK;
> +        reg |= GITS_VALID_BIT;
> +        reg |= encode_phys_addr(virt_to_maddr(buffer), PAGE_BITS(pagesz));
> +
> +        writeq_relaxed(reg, basereg);
> +        regc = readl_relaxed(basereg);
> +
> +        /* The host didn't like our attributes, just use what it returned. */
> +        if ( (regc & BASER_ATTR_MASK) != attr )
> +        {
> +            /* If we can't map it shareable, drop cacheability as well. */
> +            if ( (regc & GITS_BASER_SHAREABILITY_MASK) == GIC_BASER_NonShareable )
> +            {
> +                regc &= ~GITS_BASER_INNER_CACHEABILITY_MASK;
> +                attr = regc & BASER_ATTR_MASK;
> +                continue;
> +            }
> +            attr = regc & BASER_ATTR_MASK;
> +        }
> +
> +        /* If the host accepted our page size, we are done. */
> +        if ( (regc & (3UL << GITS_BASER_PAGE_SIZE_SHIFT)) == pagesz )
> +            return 0;
> +
> +        /* None of the page sizes was accepted, give up */
> +        if ( pagesz >= 2 )
> +            break;
> +
> +        free_xenheap_pages(buffer, order);
> +        buffer = NULL;
> +
> +        pagesz++;
> +    }
> +
> +    if ( buffer )
> +        free_xenheap_pages(buffer, order);
> +
> +    return -EINVAL;
> +}
> +
> +static unsigned int max_its_device_bits = CONFIG_MAX_PHYS_ITS_DEVICE_BITS;
> +integer_param("max_its_device_bits", max_its_device_bits);
> +
> +int gicv3_its_init(struct host_its *hw_its)
> +{
> +    uint64_t reg;
> +    int i;
> +
> +    hw_its->its_base = ioremap_nocache(hw_its->addr, hw_its->size);
> +    if ( !hw_its->its_base )
> +        return -ENOMEM;
> +
> +    for ( i = 0; i < GITS_BASER_NR_REGS; i++ )
> +    {
> +        void __iomem *basereg = hw_its->its_base + GITS_BASER0 + i * 8;
> +        int type;
> +
> +        reg = readq_relaxed(basereg);
> +        type = (reg & GITS_BASER_TYPE_MASK) >> GITS_BASER_TYPE_SHIFT;
> +        switch ( type )
> +        {
> +        case GITS_BASER_TYPE_NONE:
> +            continue;
It's redundant since the default case already has a continue statement.

> +        case GITS_BASER_TYPE_DEVICE:
> +            /* TODO: find some better way of limiting the number of devices */
> +            its_map_baser(basereg, reg, BIT(max_its_device_bits));
> +            break;
> +        case GITS_BASER_TYPE_COLLECTION:
> +            its_map_baser(basereg, reg, NR_CPUS);
> +            break;
> +        default:
> +            continue;
> +        }
> +    }
> +
> +    return 0;
> +}
>
>   /* Scan the DT for any ITS nodes and create a list of host ITSes out of it. */
>   void gicv3_its_dt_init(const struct dt_device_node *node)
> diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
> index fcb86c8..440c079 100644
> --- a/xen/arch/arm/gic-v3.c
> +++ b/xen/arch/arm/gic-v3.c
> @@ -29,6 +29,7 @@
>   #include <xen/irq.h>
>   #include <xen/iocap.h>
>   #include <xen/sched.h>
> +#include <xen/err.h>
>   #include <xen/errno.h>
>   #include <xen/delay.h>
>   #include <xen/device_tree.h>
> @@ -1563,6 +1564,7 @@ static int __init gicv3_init(void)
>   {
>       int res, i;
>       uint32_t reg;
> +    struct host_its *hw_its;
>
>       if ( !cpu_has_gicv3 )
>       {
> @@ -1618,6 +1620,9 @@ static int __init gicv3_init(void)
>       res = gicv3_cpu_init();
>       gicv3_hyp_init();
>
> +    list_for_each_entry(hw_its, &host_its_list, entry)
> +        gicv3_its_init(hw_its);
> +
>       spin_unlock(&gicv3.lock);
>
>       return res;
> diff --git a/xen/include/asm-arm/gic_v3_its.h b/xen/include/asm-arm/gic_v3_its.h
> index a66b6be..ed44bdb 100644
> --- a/xen/include/asm-arm/gic_v3_its.h
> +++ b/xen/include/asm-arm/gic_v3_its.h
> @@ -18,6 +18,53 @@
>   #ifndef __ASM_ARM_ITS_H__
>   #define __ASM_ARM_ITS_H__
>
> +#define GITS_CTLR                       0x000
> +#define GITS_IIDR                       0x004
> +#define GITS_TYPER                      0x008
> +#define GITS_CBASER                     0x080
> +#define GITS_CWRITER                    0x088
> +#define GITS_CREADR                     0x090
> +#define GITS_BASER_NR_REGS              8
> +#define GITS_BASER0                     0x100
> +#define GITS_BASER1                     0x108
> +#define GITS_BASER2                     0x110
> +#define GITS_BASER3                     0x118
> +#define GITS_BASER4                     0x120
> +#define GITS_BASER5                     0x128
> +#define GITS_BASER6                     0x130
> +#define GITS_BASER7                     0x138
> +
> +/* Register bits */
> +#define GITS_VALID_BIT                  BIT_ULL(63)
> +
> +#define GITS_CTLR_QUIESCENT             BIT(31)
> +#define GITS_CTLR_ENABLE                BIT(0)
> +
> +#define GITS_IIDR_VALUE                 0x34c
> +
> +#define GITS_BASER_INDIRECT             BIT_ULL(62)
> +#define GITS_BASER_INNER_CACHEABILITY_SHIFT        59
> +#define GITS_BASER_TYPE_SHIFT           56
> +#define GITS_BASER_TYPE_MASK            (7ULL << GITS_BASER_TYPE_SHIFT)
> +#define GITS_BASER_OUTER_CACHEABILITY_SHIFT        53
> +#define GITS_BASER_TYPE_NONE            0UL
> +#define GITS_BASER_TYPE_DEVICE          1UL
> +#define GITS_BASER_TYPE_VCPU            2UL
> +#define GITS_BASER_TYPE_CPU             3UL
> +#define GITS_BASER_TYPE_COLLECTION      4UL
> +#define GITS_BASER_TYPE_RESERVED5       5UL
> +#define GITS_BASER_TYPE_RESERVED6       6UL
> +#define GITS_BASER_TYPE_RESERVED7       7UL
> +#define GITS_BASER_ENTRY_SIZE_SHIFT     48
> +#define GITS_BASER_SHAREABILITY_SHIFT   10
> +#define GITS_BASER_PAGE_SIZE_SHIFT      8
> +#define GITS_BASER_RO_MASK              (GITS_BASER_TYPE_MASK | \
> +                                        (31UL << GITS_BASER_ENTRY_SIZE_SHIFT) |\
> +                                        GITS_BASER_INDIRECT)
> +#define GITS_BASER_SHAREABILITY_MASK   (0x3ULL << GITS_BASER_SHAREABILITY_SHIFT)
> +#define GITS_BASER_OUTER_CACHEABILITY_MASK   (0x7ULL << GITS_BASER_OUTER_CACHEABILITY_SHIFT)
> +#define GITS_BASER_INNER_CACHEABILITY_MASK   (0x7ULL << GITS_BASER_INNER_CACHEABILITY_SHIFT)
> +
>   #ifndef __ASSEMBLY__
>   #include <xen/device_tree.h>
>
> @@ -27,6 +74,7 @@ struct host_its {
>       const struct dt_device_node *dt_node;
>       paddr_t addr;
>       paddr_t size;
> +    void __iomem *its_base;
>   };
>
>   extern struct list_head host_its_list;
> @@ -42,8 +90,9 @@ void gicv3_its_dt_init(const struct dt_device_node *node);
>   uint64_t gicv3_lpi_get_proptable(void);
>   uint64_t gicv3_lpi_allocate_pendtable(void);
>
> -/* Initialize the host structures for LPIs. */
> +/* Initialize the host structures for LPIs and the host ITSes. */
>   int gicv3_lpi_init_host_lpis(unsigned int nr_lpis);
> +int gicv3_its_init(struct host_its *hw_its);
>
>   #else
>
> @@ -62,6 +111,10 @@ static inline int gicv3_lpi_init_host_lpis(unsigned int nr_lpis)
>   {
>       return 0;
>   }
> +static inline int gicv3_its_init(struct host_its *hw_its)
> +{
> +    return 0;
> +}
>   #endif /* CONFIG_HAS_ITS */
>
>   #endif /* __ASSEMBLY__ */

-- 
Shanker Donthineni
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [PATCH 13/28] ARM: vGICv3: handle virtual LPI pending and property tables
  2017-01-30 18:31 ` [PATCH 13/28] ARM: vGICv3: handle virtual LPI pending and property tables Andre Przywara
  2017-02-14 23:56   ` Stefano Stabellini
@ 2017-02-15 18:44   ` Julien Grall
  1 sibling, 0 replies; 106+ messages in thread
From: Julien Grall @ 2017-02-15 18:44 UTC (permalink / raw)
  To: Andre Przywara, Stefano Stabellini; +Cc: xen-devel, nd, Vijay Kilari

Hi Andre,

On 30/01/17 18:31, Andre Przywara wrote:
> Allow a guest to provide the address and size for the memory regions
> it has reserved for the GICv3 pending and property tables.
> We sanitise the various fields of the respective redistributor
> registers and map those pages into Xen's address space to have easy
> access.
>
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> ---
>  xen/arch/arm/vgic-v3.c       | 220 +++++++++++++++++++++++++++++++++++++++----
>  xen/arch/arm/vgic.c          |   4 +
>  xen/include/asm-arm/domain.h |   8 +-
>  xen/include/asm-arm/vgic.h   |  24 ++++-
>  4 files changed, 233 insertions(+), 23 deletions(-)
>
> diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c
> index b0653c2..c6db2d7 100644
> --- a/xen/arch/arm/vgic-v3.c
> +++ b/xen/arch/arm/vgic-v3.c
> @@ -20,12 +20,14 @@
>
>  #include <xen/bitops.h>
>  #include <xen/config.h>
> +#include <xen/domain_page.h>
>  #include <xen/lib.h>
>  #include <xen/init.h>
>  #include <xen/softirq.h>
>  #include <xen/irq.h>
>  #include <xen/sched.h>
>  #include <xen/sizes.h>
> +#include <xen/vmap.h>
>  #include <asm/current.h>
>  #include <asm/mmio.h>
>  #include <asm/gic_v3_defs.h>
> @@ -229,12 +231,15 @@ static int __vgic_v3_rdistr_rd_mmio_read(struct vcpu *v, mmio_info_t *info,
>          goto read_reserved;
>
>      case VREG64(GICR_PROPBASER):
> -        /* LPI's not implemented */
> -        goto read_as_zero_64;
> +        if ( !vgic_reg64_check_access(dabt) ) goto bad_width;
> +        *r = vgic_reg64_extract(v->domain->arch.vgic.rdist_propbase, info);
> +        return 1;
>
>      case VREG64(GICR_PENDBASER):
> -        /* LPI's not implemented */
> -        goto read_as_zero_64;
> +        if ( !vgic_reg64_check_access(dabt) ) goto bad_width;
> +        *r = vgic_reg64_extract(v->arch.vgic.rdist_pendbase, info);
> +        *r &= ~GICR_PENDBASER_PTZ;       /* WO, reads as 0 */
> +        return 1;
>
>      case 0x0080:
>          goto read_reserved;
> @@ -302,11 +307,6 @@ bad_width:
>      domain_crash_synchronous();
>      return 0;
>
> -read_as_zero_64:
> -    if ( !vgic_reg64_check_access(dabt) ) goto bad_width;
> -    *r = 0;
> -    return 1;
> -
>  read_as_zero_32:
>      if ( dabt.size != DABT_WORD ) goto bad_width;
>      *r = 0;
> @@ -331,11 +331,179 @@ read_unknown:
>      return 1;
>  }
>
> +static uint64_t vgic_sanitise_field(uint64_t reg, uint64_t field_mask,
> +                                    int field_shift,
> +                                    uint64_t (*sanitise_fn)(uint64_t))
> +{
> +    uint64_t field = (reg & field_mask) >> field_shift;
> +
> +    field = sanitise_fn(field) << field_shift;
> +
> +    return (reg & ~field_mask) | field;
> +}
> +
> +/* We want to avoid outer shareable. */
> +static uint64_t vgic_sanitise_shareability(uint64_t field)
> +{
> +    switch (field) {

Coding style:

switch ( field )
{

> +    case GIC_BASER_OuterShareable:
> +        return GIC_BASER_InnerShareable;
> +    default:
> +        return field;
> +    }
> +}
> +
> +/* Avoid any inner non-cacheable mapping. */
> +static uint64_t vgic_sanitise_inner_cacheability(uint64_t field)
> +{
> +    switch (field) {

Ditto

> +    case GIC_BASER_CACHE_nCnB:
> +    case GIC_BASER_CACHE_nC:
> +        return GIC_BASER_CACHE_RaWb;
> +    default:
> +        return field;
> +    }
> +}
> +
> +/* Non-cacheable or same-as-inner are OK. */
> +static uint64_t vgic_sanitise_outer_cacheability(uint64_t field)
> +{
> +    switch (field) {

Ditto

> +    case GIC_BASER_CACHE_SameAsInner:
> +    case GIC_BASER_CACHE_nC:
> +        return field;
> +    default:
> +        return GIC_BASER_CACHE_nC;
> +    }
> +}
> +
> +static uint64_t sanitize_propbaser(uint64_t reg)
> +{
> +    reg = vgic_sanitise_field(reg, GICR_PROPBASER_SHAREABILITY_MASK,
> +                              GICR_PROPBASER_SHAREABILITY_SHIFT,
> +                              vgic_sanitise_shareability);
> +    reg = vgic_sanitise_field(reg, GICR_PROPBASER_INNER_CACHEABILITY_MASK,
> +                              GICR_PROPBASER_INNER_CACHEABILITY_SHIFT,
> +                              vgic_sanitise_inner_cacheability);
> +    reg = vgic_sanitise_field(reg, GICR_PROPBASER_OUTER_CACHEABILITY_MASK,
> +                              GICR_PROPBASER_OUTER_CACHEABILITY_SHIFT,
> +                              vgic_sanitise_outer_cacheability);
> +
> +    reg &= ~GICR_PROPBASER_RES0_MASK;

Newline here please.

> +    return reg;
> +}
> +
> +static uint64_t sanitize_pendbaser(uint64_t reg)
> +{
> +    reg = vgic_sanitise_field(reg, GICR_PENDBASER_SHAREABILITY_MASK,
> +                              GICR_PENDBASER_SHAREABILITY_SHIFT,
> +                              vgic_sanitise_shareability);
> +    reg = vgic_sanitise_field(reg, GICR_PENDBASER_INNER_CACHEABILITY_MASK,
> +                              GICR_PENDBASER_INNER_CACHEABILITY_SHIFT,
> +                              vgic_sanitise_inner_cacheability);
> +    reg = vgic_sanitise_field(reg, GICR_PENDBASER_OUTER_CACHEABILITY_MASK,
> +                              GICR_PENDBASER_OUTER_CACHEABILITY_SHIFT,
> +                              vgic_sanitise_outer_cacheability);
> +
> +    reg &= ~GICR_PENDBASER_RES0_MASK;

Newline here please.

> +    return reg;
> +}
> +
> +/*
> + * Mark a given number of guest pages as used (by increasing their refcount),
> + * starting with the given guest address. This needs to be called once before
> + * calling (possibly repeatedly) map_guest_pages().
> + * Before the domain gets destroyed, call put_guest_pages() to drop the
> + * reference.
> + */
> +int get_guest_pages(struct domain *d, paddr_t gpa, int nr_pages)

Many comments here:
    * please use the type gfn_t rather than paddr_t,
    * nr_pages should be unsigned int
    * this code does not belong to vgic-v3.c. It is a generic function.
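
To illustrate the first two points, a minimal standalone sketch of a typed
frame number. The demo_* names are hypothetical stand-ins written for this
mail; they only mimic the idea behind Xen's typesafe gfn_t and are not the
real helpers. A 4K page size is assumed:

```c
#include <stdint.h>

#define DEMO_PAGE_SHIFT 12

/*
 * Wrapping the frame number in a struct means the compiler rejects code
 * that passes a raw guest physical address where a frame number is wanted.
 */
typedef struct { uint64_t gfn; } demo_gfn_t;

static inline demo_gfn_t demo_gfn(uint64_t n)
{
    demo_gfn_t g = { n };

    return g;
}

/* Unwrap the raw frame number. */
static inline uint64_t demo_gfn_x(demo_gfn_t g)
{
    return g.gfn;
}

/* Convert a guest physical address into its frame number. */
static inline demo_gfn_t demo_gaddr_to_gfn(uint64_t gpa)
{
    return demo_gfn(gpa >> DEMO_PAGE_SHIFT);
}

/* Advance by i frames; the count is unsigned, as it can never be negative. */
static inline demo_gfn_t demo_gfn_add(demo_gfn_t g, unsigned int i)
{
    return demo_gfn(demo_gfn_x(g) + i);
}
```

The loop in get_guest_pages() would then convert the address once and step
with the add helper, instead of open-coding (gpa >> PAGE_SHIFT) + i.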

> +{
> +    int i;

unsigned int.

> +    struct page_info *page;
> +
> +    for ( i = 0; i < nr_pages; i++ )
> +    {
> +        page = get_page_from_gfn(d, (gpa >> PAGE_SHIFT) + i, NULL, P2M_ALLOC);

get_page_from_gfn() may return a foreign page (e.g. one belonging to another
domain), and we don't want to use this type of page for ITS memory.

> +        if ( ! page )
> +            return -EINVAL;
> +    }

Shouldn't you drop the references already taken on the previous pages when
you fail to get one page?
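
Something along these lines, as a standalone sketch of the rollback pattern.
struct demo_page and the demo_* helpers are hypothetical stand-ins for
struct page_info, get_page_from_gfn() and put_page(), so this shows the
control flow only:

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct page_info: a refcount and a flag. */
struct demo_page {
    int refcount;
    bool usable;    /* false models the lookup returning NULL */
};

static bool demo_get_page(struct demo_page *pg)
{
    if ( !pg->usable )
        return false;

    pg->refcount++;

    return true;
}

static void demo_put_page(struct demo_page *pg)
{
    pg->refcount--;
}

/* Take a reference on all nr pages, or on none of them if any page fails. */
static int demo_get_pages(struct demo_page *pages, unsigned int nr)
{
    unsigned int i;

    for ( i = 0; i < nr; i++ )
        if ( !demo_get_page(&pages[i]) )
            goto rollback;

    return 0;

 rollback:
    /* Drop the references taken on pages[0] .. pages[i - 1]. */
    while ( i-- > 0 )
        demo_put_page(&pages[i]);

    return -1;
}
```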

> +
> +    return 0;
> +}
> +
> +void put_guest_pages(struct domain *d, paddr_t gpa, int nr_pages)

Same comments as above.

> +{
> +    mfn_t mfn;
> +    int i;

unsigned int.

> +
> +    p2m_read_lock(&d->arch.p2m);
> +    for ( i = 0; i < nr_pages; i++ )
> +    {
> +        mfn = p2m_get_entry(&d->arch.p2m, _gfn((gpa >> PAGE_SHIFT) + i),
> +                            NULL, NULL, NULL);
> +        if ( mfn_eq(mfn, INVALID_MFN) )
> +            continue;
> +        put_page(mfn_to_page(mfn_x(mfn)));

This function is completely wrong in its current state. You assume that
the stage-2 page table has not been modified by the guest between
get_guest_pages and put_guest_pages. If it has been modified, you may
drop a reference on the wrong page.

Furthermore, it is likely an error to have the mfn not valid in this case.

As we discussed earlier, the way forward is to protect the pages. It is 
not mandatory for DOM0, but a comment in the code is necessary to 
explain what is missing.

While I am here, for the next version I would like a todo list of all
the items still missing to properly support a guest, so we can track what
is required before enabling the ITS for guests.

> +    }
> +    p2m_read_unlock(&d->arch.p2m);
> +}
> +
> +/*
> + * Provides easy access to guest memory by "mapping" some parts of it into
> + * Xen's VA space. In fact it relies on the memory being already mapped
> + * and just provides a pointer to it.
> + * This allows the ITS configuration data to be held in guest memory and
> + * avoids using Xen's memory for that.
> + */
> +void *map_guest_pages(struct domain *d, paddr_t guest_addr, int nr_pages)
> +{
> +    int i;
> +    void *ptr, *follow;
> +
> +    ptr = map_domain_page(_mfn(guest_addr >> PAGE_SHIFT));

This might be correct for DOM0 but will not work for guests. Even if you
don't support guests right now, we should really avoid such assumptions in
the code. Fixing it will likely mean quite a lot of rework, which I'd like
to see now.

> +
> +    /* Make sure subsequent pages are mapped in a virtually contigious way. */

s/contigious/contiguous/

> +    for ( i = 1; i < nr_pages; i++ )
> +    {
> +        follow = map_domain_page(_mfn((guest_addr >> PAGE_SHIFT) + i));

map_domain_page should only be used for temporary mappings, and this is not
the case here.

As mentioned in the first version of this series, the best solution is 
to map/unmap the page every time we need it.

> +        if ( follow != ptr + ((long)i << PAGE_SHIFT) )

In the case of the guest, the region may not be contiguous in the 
physical address space.
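
To make the point concrete, here is a standalone sketch of accessing a guest
region page by page, translating every frame separately instead of relying
on one contiguous mapping. struct demo_p2m, demo_map_gfn() and
demo_copy_from_guest() are hypothetical names; in Xen the lookup would go
through the p2m and a temporary map/unmap of each page:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_PAGE_SHIFT 12
#define DEMO_PAGE_SIZE  (1UL << DEMO_PAGE_SHIFT)
#define DEMO_NR_FRAMES  4

/* Hypothetical stand-in for the p2m: guest frame number -> backing page. */
struct demo_p2m {
    const uint8_t *frame[DEMO_NR_FRAMES];   /* NULL means "not mapped" */
};

static const uint8_t *demo_map_gfn(const struct demo_p2m *p2m, uint64_t gfn)
{
    return gfn < DEMO_NR_FRAMES ? p2m->frame[gfn] : NULL;
}

/* Copy len bytes starting at guest address gpa, one page at a time. */
static int demo_copy_from_guest(const struct demo_p2m *p2m, uint8_t *dst,
                                uint64_t gpa, size_t len)
{
    while ( len > 0 )
    {
        size_t offset = gpa & (DEMO_PAGE_SIZE - 1);
        size_t chunk = DEMO_PAGE_SIZE - offset;
        const uint8_t *page;

        if ( chunk > len )
            chunk = len;

        /* Translate (and temporarily "map") just this one frame. */
        page = demo_map_gfn(p2m, gpa >> DEMO_PAGE_SHIFT);
        if ( !page )
            return -1;

        memcpy(dst, page + offset, chunk);
        /* A real implementation would unmap the page again here. */

        dst += chunk;
        gpa += chunk;
        len -= chunk;
    }

    return 0;
}
```

Nothing here depends on where consecutive guest frames live in host memory,
so the same code works for DOM0 and for guests.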

> +            return NULL;
> +    }
> +
> +    return ptr + (guest_addr & ~PAGE_MASK);
> +}
> +
> +/* "Unmap" previously mapped guest pages. Should be optimized away on arm64. */
> +void unmap_guest_pages(void *va, int nr_pages)
> +{
> +    long i;
> +
> +    for ( i = nr_pages - 1; i >= 0; i-- )
> +        unmap_domain_page(((uintptr_t)va & PAGE_MASK) + (i << PAGE_SHIFT));
> +}

It does not make sense to implement the 4 functions above in this patch
as they are never used here.

> +
> +int vgic_lpi_get_priority(struct domain *d, uint32_t vlpi)
> +{
> +    if ( vlpi >= d->arch.vgic.nr_lpis )
> +        return GIC_PRI_IRQ;

What if we get the priority before the ITS has been enabled?

> +
> +    return d->arch.vgic.proptable[vlpi - LPI_OFFSET] & LPI_PROP_PRIO_MASK;

This is memory shared with the guest; you need to make sure the compiler
will not read it twice.
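
A minimal sketch of the idea: take one snapshot of the shared byte through a
volatile access and decode everything from that local copy. The demo_* names
are hypothetical, and the mask values assume the GICv3 LPI configuration
byte layout (priority in bits [7:2], enable in bit 0); in Xen you would use
the existing atomic read helpers rather than an open-coded cast:

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_LPI_PROP_PRIO_MASK 0xfc    /* assumed: priority in bits [7:2] */
#define DEMO_LPI_PROP_ENABLED   0x01    /* assumed: enable in bit 0 */

struct demo_lpi_cfg {
    uint8_t priority;
    bool enabled;
};

/* Decode one property table entry from a single read of guest memory. */
static struct demo_lpi_cfg demo_read_lpi_cfg(const uint8_t *proptable,
                                             uint32_t index)
{
    /* The volatile access makes the compiler emit exactly one load. */
    uint8_t entry = *(const volatile uint8_t *)&proptable[index];
    struct demo_lpi_cfg cfg;

    /* Both fields come from the same snapshot, so they are consistent. */
    cfg.priority = entry & DEMO_LPI_PROP_PRIO_MASK;
    cfg.enabled = (entry & DEMO_LPI_PROP_ENABLED) != 0;

    return cfg;
}
```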

> +}
> +
> +bool vgic_lpi_is_enabled(struct domain *d, uint32_t vlpi)
> +{
> +    if ( vlpi >= d->arch.vgic.nr_lpis )
> +        return false;
> +
> +    return d->arch.vgic.proptable[vlpi - LPI_OFFSET] & LPI_PROP_ENABLED;

Ditto.

> +}
> +
>  static int __vgic_v3_rdistr_rd_mmio_write(struct vcpu *v, mmio_info_t *info,
>                                            uint32_t gicr_reg,
>                                            register_t r)
>  {
>      struct hsr_dabt dabt = info->dabt;
> +    uint64_t reg;
>
>      switch ( gicr_reg )
>      {
> @@ -366,36 +534,54 @@ static int __vgic_v3_rdistr_rd_mmio_write(struct vcpu *v, mmio_info_t *info,
>          goto write_impl_defined;
>
>      case VREG64(GICR_SETLPIR):
> -        /* LPI is not implemented */
> +        /* LPIs without an ITS are not implemented */
>          goto write_ignore_64;
>
>      case VREG64(GICR_CLRLPIR):
> -        /* LPI is not implemented */
> +        /* LPIs without an ITS are not implemented */
>          goto write_ignore_64;
>
>      case 0x0050:
>          goto write_reserved;
>
>      case VREG64(GICR_PROPBASER):
> -        /* LPI is not implemented */
> -        goto write_ignore_64;
> +        if ( !vgic_reg64_check_access(dabt) ) goto bad_width;
> +
> +        /* Writing PROPBASER with LPIs enabled is UNPREDICTABLE. */
> +        if ( v->arch.vgic.flags & VGIC_V3_LPIS_ENABLED )
> +            return 1;
> +
> +        reg = v->domain->arch.vgic.rdist_propbase;
> +        vgic_reg64_update(&reg, r, info);
> +        reg = sanitize_propbaser(reg);
> +        v->domain->arch.vgic.rdist_propbase = reg;
> +        return 1;
>
>      case VREG64(GICR_PENDBASER):
> -        /* LPI is not implemented */
> -        goto write_ignore_64;
> +        if ( !vgic_reg64_check_access(dabt) ) goto bad_width;
> +
> +        /* Writing PENDBASER with LPIs enabled is UNPREDICTABLE. */
> +        if ( v->arch.vgic.flags & VGIC_V3_LPIS_ENABLED )
> +            return 1;
> +
> +	reg = v->arch.vgic.rdist_pendbase;
> +	vgic_reg64_update(&reg, r, info);
> +	reg = sanitize_pendbaser(reg);
> +	v->arch.vgic.rdist_pendbase = reg;
> +	return 1;

The indentation is wrong.

>
>      case 0x0080:
>          goto write_reserved;
>
>      case VREG64(GICR_INVLPIR):
> -        /* LPI is not implemented */
> +        /* LPIs without an ITS are not implemented */
>          goto write_ignore_64;
>
>      case 0x00A8:
>          goto write_reserved;
>
>      case VREG64(GICR_INVALLR):
> -        /* LPI is not implemented */
> +        /* LPIs without an ITS are not implemented */
>          goto write_ignore_64;
>
>      case 0x00B8:
> diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c
> index 7e3440f..cf444f3 100644
> --- a/xen/arch/arm/vgic.c
> +++ b/xen/arch/arm/vgic.c
> @@ -494,6 +494,10 @@ struct pending_irq *lpi_to_pending(struct vcpu *v, unsigned int lpi,
>          empty->pirq.irq = lpi;
>      }
>
> +    /* Update the enabled status */
> +    if ( vgic_lpi_is_enabled(v->domain, lpi) )
> +        set_bit(GIC_IRQ_GUEST_ENABLED, &empty->pirq.status);
> +
>      spin_unlock(&v->arch.vgic.pending_lpi_list_lock);
>
>      return &empty->pirq;
> diff --git a/xen/include/asm-arm/domain.h b/xen/include/asm-arm/domain.h
> index f44a84b..33c1851 100644
> --- a/xen/include/asm-arm/domain.h
> +++ b/xen/include/asm-arm/domain.h
> @@ -110,6 +110,9 @@ struct arch_domain
>          } *rdist_regions;
>          int nr_regions;                     /* Number of rdist regions */
>          uint32_t rdist_stride;              /* Re-Distributor stride */
> +        int nr_lpis;

unsigned int.

> +        uint64_t rdist_propbase;
> +        uint8_t *proptable;
>          struct rb_root its_devices;         /* devices mapped to an ITS */
>          spinlock_t its_devices_lock;        /* protects the its_devices tree */
>  #endif
> @@ -255,7 +258,10 @@ struct arch_vcpu
>
>          /* GICv3: redistributor base and flags for this vCPU */
>          paddr_t rdist_base;
> -#define VGIC_V3_RDIST_LAST  (1 << 0)        /* last vCPU of the rdist */
> +#define VGIC_V3_RDIST_LAST      (1 << 0)        /* last vCPU of the rdist */
> +#define VGIC_V3_LPIS_ENABLED    (1 << 1)
> +        uint64_t rdist_pendbase;
> +        unsigned long *pendtable;

unsigned long? Why?

>          uint8_t flags;
>          struct list_head pending_lpi_list;
>          spinlock_t pending_lpi_list_lock;   /* protects the pending_lpi_list */
> diff --git a/xen/include/asm-arm/vgic.h b/xen/include/asm-arm/vgic.h
> index 03d4d2e..a882fe8 100644
> --- a/xen/include/asm-arm/vgic.h
> +++ b/xen/include/asm-arm/vgic.h
> @@ -285,6 +285,11 @@ VGIC_REG_HELPERS(32, 0x3);
>
>  #undef VGIC_REG_HELPERS
>
> +int get_guest_pages(struct domain *d, paddr_t gpa, int nr_pages);
> +void put_guest_pages(struct domain *d, paddr_t gpa, int nr_pages);
> +void *map_guest_pages(struct domain *d, paddr_t guest_addr, int nr_pages);
> +void unmap_guest_pages(void *va, int nr_pages);
> +
>  enum gic_sgi_mode;
>
>  /*
> @@ -312,14 +317,23 @@ extern struct vgic_irq_rank *vgic_rank_irq(struct vcpu *v, unsigned int irq);
>  extern bool vgic_emulate(struct cpu_user_regs *regs, union hsr hsr);
>  extern void vgic_disable_irqs(struct vcpu *v, uint32_t r, int n);
>  extern void vgic_enable_irqs(struct vcpu *v, uint32_t r, int n);
> -/* placeholder function until the property table gets introduced */
> -static inline int vgic_lpi_get_priority(struct domain *d, uint32_t vlpi)
> -{
> -    return 0xa;
> -}
>  extern void register_vgic_ops(struct domain *d, const struct vgic_ops *ops);
>  int vgic_v2_init(struct domain *d, int *mmio_count);
>  int vgic_v3_init(struct domain *d, int *mmio_count);
> +#ifdef CONFIG_HAS_GICV3
> +extern int vgic_lpi_get_priority(struct domain *d, uint32_t vlpi);
> +extern bool vgic_lpi_is_enabled(struct domain *d, uint32_t vlpi);
> +#else
> +static inline int vgic_lpi_get_priority(struct domain *d, uint32_t vlpi)
> +{
> +    return 0xa0;

Why 0xa0? This function should never be called when GICv3 ITS is not 
supported.

> +}
> +
> +static inline bool vgic_lpi_is_enabled(struct domain *d, uint32_t vlpi)
> +{
> +    return false;
> +}
> +#endif
>
>  extern int domain_vgic_register(struct domain *d, int *mmio_count);
>  extern int vcpu_vgic_free(struct vcpu *v);
>

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [PATCH 15/28] ARM: vGICv3: introduce basic ITS emulation bits
  2017-01-30 18:31 ` [PATCH 15/28] ARM: vGICv3: introduce basic ITS emulation bits Andre Przywara
@ 2017-02-15 20:06   ` Shanker Donthineni
  0 siblings, 0 replies; 106+ messages in thread
From: Shanker Donthineni @ 2017-02-15 20:06 UTC (permalink / raw)
  To: Andre Przywara, Stefano Stabellini, Julien Grall; +Cc: xen-devel, Vijay Kilari

Hi Andre


On 01/30/2017 12:31 PM, Andre Przywara wrote:
> Create a new file to hold the emulation code for the ITS widget.
> For now we emulate the memory mapped ITS registers and provide a stub
> to introduce the ITS command handling framework (but without actually
> emulating any commands at this time).
>
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> ---
>   xen/arch/arm/Makefile             |   1 +
>   xen/arch/arm/vgic-v3-its.c        | 485 ++++++++++++++++++++++++++++++++++++++
>   xen/arch/arm/vgic-v3.c            |   9 -
>   xen/include/asm-arm/gic_v3_defs.h |  19 ++
>   4 files changed, 505 insertions(+), 9 deletions(-)
>   create mode 100644 xen/arch/arm/vgic-v3-its.c
>
> diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
> index 4ccf2eb..a1cbc27 100644
> --- a/xen/arch/arm/Makefile
> +++ b/xen/arch/arm/Makefile
> @@ -46,6 +46,7 @@ obj-y += traps.o
>   obj-y += vgic.o
>   obj-y += vgic-v2.o
>   obj-$(CONFIG_HAS_GICV3) += vgic-v3.o
> +obj-$(CONFIG_HAS_ITS) += vgic-v3-its.o
>   obj-y += vm_event.o
>   obj-y += vtimer.o
>   obj-y += vpsci.o
> diff --git a/xen/arch/arm/vgic-v3-its.c b/xen/arch/arm/vgic-v3-its.c
> new file mode 100644
> index 0000000..fc28376
> --- /dev/null
> +++ b/xen/arch/arm/vgic-v3-its.c
> @@ -0,0 +1,485 @@
> +/*
> + * xen/arch/arm/vgic-v3-its.c
> + *
> + * ARM Interrupt Translation Service (ITS) emulation
> + *
> + * Andre Przywara <andre.przywara@arm.com>
> + * Copyright (c) 2016 ARM Ltd.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + */
> +
> +#include <xen/bitops.h>
> +#include <xen/config.h>
> +#include <xen/domain_page.h>
> +#include <xen/lib.h>
> +#include <xen/init.h>
> +#include <xen/softirq.h>
> +#include <xen/irq.h>
> +#include <xen/sched.h>
> +#include <xen/sizes.h>
> +#include <asm/current.h>
> +#include <asm/mmio.h>
> +#include <asm/gic_v3_defs.h>
> +#include <asm/gic_v3_its.h>
> +#include <asm/vgic.h>
> +#include <asm/vgic-emul.h>
> +
> +/* Data structure to describe a virtual ITS */
> +struct virt_its {
> +    struct domain *d;
> +    spinlock_t vcmd_lock;       /* protects the virtual command buffer */
> +    uint64_t cbaser;
> +    uint64_t *cmdbuf;
> +    int cwriter;
> +    int creadr;
> +    spinlock_t its_lock;        /* protects the collection and device tables */
> +    uint64_t baser0, baser1;
> +    uint16_t *coll_table;
> +    int max_collections;
> +    uint64_t *dev_table;
> +    int max_devices;
> +    bool enabled;
> +};
> +
> +/*
> + * An Interrupt Translation Table Entry: this is indexed by a
> + * DeviceID/EventID pair and is located in guest memory.
> + */
> +struct vits_itte
> +{
> +    uint32_t vlpi;
> +    uint16_t collection;
> +};
> +
> +/**************************************
> + * Functions that handle ITS commands *
> + **************************************/
> +
> +static uint64_t its_cmd_mask_field(uint64_t *its_cmd,
> +                                   int word, int shift, int size)
> +{
> +    return (le64_to_cpu(its_cmd[word]) >> shift) & (BIT(size) - 1);
> +}
> +
> +#define its_cmd_get_command(cmd)        its_cmd_mask_field(cmd, 0,  0,  8)
> +#define its_cmd_get_deviceid(cmd)       its_cmd_mask_field(cmd, 0, 32, 32)
> +#define its_cmd_get_size(cmd)           its_cmd_mask_field(cmd, 1,  0,  5)
> +#define its_cmd_get_id(cmd)             its_cmd_mask_field(cmd, 1,  0, 32)
> +#define its_cmd_get_physical_id(cmd)    its_cmd_mask_field(cmd, 1, 32, 32)
> +#define its_cmd_get_collection(cmd)     its_cmd_mask_field(cmd, 2,  0, 16)
> +#define its_cmd_get_target_addr(cmd)    its_cmd_mask_field(cmd, 2, 16, 32)
> +#define its_cmd_get_validbit(cmd)       its_cmd_mask_field(cmd, 2, 63,  1)
> +
> +#define ITS_CMD_BUFFER_SIZE(baser)      ((((baser) & 0xff) + 1) << 12)
> +
> +static int vgic_its_handle_cmds(struct domain *d, struct virt_its *its,
> +                                uint32_t writer)
> +{
> +    uint64_t *cmdptr;
> +
> +    if ( !its->cmdbuf )
> +        return -1;
> +
> +    if ( writer >= ITS_CMD_BUFFER_SIZE(its->cbaser) )
> +        return -1;
> +
> +    spin_lock(&its->vcmd_lock);
> +
> +    while ( its->creadr != writer )
> +    {
> +        cmdptr = its->cmdbuf + (its->creadr / sizeof(*its->cmdbuf));
> +        switch (its_cmd_get_command(cmdptr))
> +        {
> +        case GITS_CMD_SYNC:
> +            /* We handle ITS commands synchronously, so we ignore SYNC. */
> +	    break;
> +        default:
> +            gdprintk(XENLOG_G_WARNING, "ITS: unhandled ITS command %ld\n",
> +                   its_cmd_get_command(cmdptr));
> +            break;
> +        }
> +
> +        its->creadr += ITS_CMD_SIZE;
> +        if ( its->creadr == ITS_CMD_BUFFER_SIZE(its->cbaser) )
> +            its->creadr = 0;
> +    }
> +    its->cwriter = writer;
> +
> +    spin_unlock(&its->vcmd_lock);
> +
> +    return 0;
> +}
> +
> +/*****************************
> + * ITS registers read access *
> + *****************************/
> +
> +/*
> + * The physical address is encoded slightly differently depending on
> + * the used page size: the highest four bits are stored in the lowest
> + * four bits of the field for 64K pages.
> + */
> +static paddr_t get_baser_phys_addr(uint64_t reg)
> +{
> +    if ( reg & BIT(9) )
> +        return (reg & GENMASK(47, 16)) | ((reg & GENMASK(15, 12)) << 36);
> +    else
> +        return reg & GENMASK(47, 12);
> +}
> +
> +static int vgic_v3_its_mmio_read(struct vcpu *v, mmio_info_t *info,
> +                                 register_t *r, void *priv)
> +{
> +    struct virt_its *its = priv;
> +
> +    switch ( info->gpa & 0xffff )
> +    {
> +    case VREG32(GITS_CTLR):
> +        if ( info->dabt.size != DABT_WORD ) goto bad_width;
> +        *r = vgic_reg32_extract(its->enabled | BIT(31), info);
> +	break;
> +    case VREG32(GITS_IIDR):
> +        if ( info->dabt.size != DABT_WORD ) goto bad_width;
> +        *r = vgic_reg32_extract(GITS_IIDR_VALUE, info);
> +        break;
> +    case VREG64(GITS_TYPER):
> +        if ( info->dabt.size < DABT_WORD ) goto bad_width;
> +        *r = vgic_reg64_extract(0x1eff1, info);
Why is GITS_TYPER hard-coded here? For DOM0, at least the MOVP, ID_bits and
Devbits fields should be the same as in hardware, otherwise the MSI(x)
feature fails for some devices.
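
A minimal sketch of how the ID_bits and Devbits fields could be inherited
from the host register while keeping the other emulated bits. The DEMO_*
macros and demo_vits_typer() are hypothetical names; the field positions
assume the GICv3 GITS_TYPER layout (ID_bits in [12:8], Devbits in [17:13]),
and MOVP is left out of the sketch:

```c
#include <stdint.h>

#define DEMO_TYPER_IDBITS_SHIFT     8
#define DEMO_TYPER_IDBITS_MASK      (0x1fULL << DEMO_TYPER_IDBITS_SHIFT)
#define DEMO_TYPER_DEVBITS_SHIFT    13
#define DEMO_TYPER_DEVBITS_MASK     (0x1fULL << DEMO_TYPER_DEVBITS_SHIFT)

/*
 * Build the value returned for a virtual GITS_TYPER read: start from the
 * emulated defaults and take ID_bits and Devbits from the host ITS.
 */
static uint64_t demo_vits_typer(uint64_t hw_typer, uint64_t emulated)
{
    const uint64_t inherit = DEMO_TYPER_IDBITS_MASK | DEMO_TYPER_DEVBITS_MASK;

    return (emulated & ~inherit) | (hw_typer & inherit);
}
```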

Shanker Donthineni
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.



^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [PATCH 11/28] ARM: GICv3: forward pending LPIs to guests
  2017-02-15 17:18     ` Julien Grall
@ 2017-02-15 21:25       ` Stefano Stabellini
  2017-03-02 20:56         ` Julien Grall
  0 siblings, 1 reply; 106+ messages in thread
From: Stefano Stabellini @ 2017-02-15 21:25 UTC (permalink / raw)
  To: Julien Grall
  Cc: Stefano Stabellini, Vijay Kilari, Andre Przywara, JBeulich,
	andrew.cooper3, xen-devel, nd

On Wed, 15 Feb 2017, Julien Grall wrote:
> Hi Stefano,
> 
> On 14/02/17 21:00, Stefano Stabellini wrote:
> > On Mon, 30 Jan 2017, Andre Przywara wrote:
> > > +/*
> > > + * Handle incoming LPIs, which are a bit special, because they are potentially
> > > + * numerous and also only get injected into guests. Treat them specially here,
> > > + * by just looking up their target vCPU and virtual LPI number and hand it
> > > + * over to the injection function.
> > > + */
> > > +void do_LPI(unsigned int lpi)
> > > +{
> > > +    struct domain *d;
> > > +    union host_lpi *hlpip, hlpi;
> > > +    struct vcpu *vcpu;
> > > +
> > > +    WRITE_SYSREG32(lpi, ICC_EOIR1_EL1);
> > > +
> > > +    hlpip = gic_get_host_lpi(lpi);
> > > +    if ( !hlpip )
> > > +        return;
> > > +
> > > +    hlpi.data = read_u64_atomic(&hlpip->data);
> > > +
> > > +    /* We may have mapped more host LPIs than the guest actually asked for. */
> > > +    if ( !hlpi.virt_lpi )
> > > +        return;
> > > +
> > > +    d = get_domain_by_id(hlpi.dom_id);
> > > +    if ( !d )
> > > +        return;
> > > +
> > > +    if ( hlpi.vcpu_id >= d->max_vcpus )
> > > +    {
> > > +        put_domain(d);
> > > +        return;
> > > +    }
> > > +
> > > +    vcpu = d->vcpu[hlpi.vcpu_id];
> > > +
> > > +    put_domain(d);
> > > +
> > > +    vgic_vcpu_inject_irq(vcpu, hlpi.virt_lpi);
> > 
> > put_domain should be here
> 
> Why? I don't even understand why we would need to take a reference on the
> domain for LPIs. Would not it be enough to use rcu_lock_domain_by_id here?

I think that rcu_lock_domain_by_id would also work, but similarly we
would need to call rcu_unlock here.
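
To illustrate the ordering concern, here is a toy model (not Xen code). The
toy_* names and types are invented stand-ins for get_domain(), put_domain()
and the injection call, and the model assumes the worst case where our
reference is the only one keeping the domain alive.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for struct domain: just a reference count and a result slot. */
struct toy_domain {
    int refcnt;
    int injected_irq;
};

static void toy_get_domain(struct toy_domain *d)
{
    d->refcnt++;
}

static void toy_put_domain(struct toy_domain *d)
{
    assert(d->refcnt > 0);
    d->refcnt--;
}

/* Injection is only safe while somebody holds a reference. */
static bool toy_inject(struct toy_domain *d, int virq)
{
    if ( d->refcnt == 0 )
        return false;   /* the domain could already be gone */
    d->injected_irq = virq;
    return true;
}

/* Ordering as in the patch: the reference is dropped before the injection. */
static bool forward_lpi_early_put(struct toy_domain *d, int virq)
{
    toy_get_domain(d);
    toy_put_domain(d);
    return toy_inject(d, virq);
}

/* Suggested ordering: hold the reference across the injection. */
static bool forward_lpi_late_put(struct toy_domain *d, int virq)
{
    bool ok;

    toy_get_domain(d);
    ok = toy_inject(d, virq);
    toy_put_domain(d);
    return ok;
}
```

In this model only forward_lpi_late_put() injects successfully. The same
ordering argument would apply to an RCU lock/unlock pair.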

To be honest, I don't know exactly in which cases get_domain should be
used instead of rcu_lock_domain_by_id.

CC'ing the x86 guys that might know the answer.

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [PATCH 02/28] ARM: GICv3 ITS: parse and store ITS subnodes from hardware DT
  2017-02-06 12:39   ` Julien Grall
@ 2017-02-16 17:44     ` Andre Przywara
  2017-02-16 18:15       ` Julien Grall
  0 siblings, 1 reply; 106+ messages in thread
From: Andre Przywara @ 2017-02-16 17:44 UTC (permalink / raw)
  To: Julien Grall, Stefano Stabellini; +Cc: xen-devel, Vijay Kilari

Hi Julien,

On 06/02/17 12:39, Julien Grall wrote:
> Hi Andre,
> 
> On 30/01/17 18:31, Andre Przywara wrote:
>> diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
>> new file mode 100644
>> index 0000000..ff0f571
>> --- /dev/null
>> +++ b/xen/arch/arm/gic-v3-its.c
>> @@ -0,0 +1,71 @@
>> +/*
>> + * xen/arch/arm/gic-v3-its.c
>> + *
>> + * ARM GICv3 Interrupt Translation Service (ITS) support
>> + *
>> + * Copyright (C) 2016,2017 - ARM Ltd
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License as published by
>> + * the Free Software Foundation; either version 2 of the License, or
>> + * (at your option) any later version.
>> + *
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> + * GNU General Public License for more details.
>> + */
>> +
>> +#include <xen/config.h>
> 
> No need to include xen/config.h; it is included by default.
> 
>> +#include <xen/lib.h>
>> +#include <xen/device_tree.h>
> 
> 
>> +#include <xen/libfdt/libfdt.h>
>> +#include <asm/gic.h>
>> +#include <asm/gic_v3_defs.h>
> 
> The 3 headers above do not look necessary for now. Please try to
> include them only when needed.
> 
>> +#include <asm/gic_v3_its.h>
>> +
>> +/* Scan the DT for any ITS nodes and create a list of host ITSes out of it. */
>> +void gicv3_its_dt_init(const struct dt_device_node *node)
>> +{
>> +    const struct dt_device_node *its = NULL;
>> +    struct host_its *its_data;
>> +
>> +    /*
>> +     * Check for ITS MSI subnodes. If any, add the ITS register
>> +     * frames to the ITS list.
>> +     */
>> +    dt_for_each_child_node(node, its)
>> +    {
>> +        paddr_t addr, size;
> 
> NIT: dt_device_get_address is taking uint64_t variables in parameter. So
> I would prefer to use uint64_t here for consistency.
> 
>> +
>> +        if ( !dt_device_is_compatible(its, "arm,gic-v3-its") )
>> +            continue;
>> +
>> +        if ( !dt_device_is_available(its) )
>> +            continue;
> 
> Can an ITS really be disabled? Or is it just for debugging?

This was indeed introduced for debugging, but it is also useful with multiple
ITSes. Firmware could ship with a DT covering the maximum hardware
configuration and then disable non-existent hardware at boot time.

And in general I consider it good style to support the status property.

>> +
>> +        if ( dt_device_get_address(its, 0, &addr, &size) )
>> +            panic("GICv3: Cannot find a valid ITS frame address");
>> +
>> +        its_data = xzalloc(struct host_its);
>> +        if ( !its_data )
>> +            panic("GICv3: Cannot allocate memory for ITS frame");
>> +
>> +        its_data->addr = addr;
>> +        its_data->size = size;
>> +        its_data->dt_node = its;
>> +
>> +        printk("GICv3: Found ITS @0x%lx\n", addr);
>> +
>> +        list_add_tail(&its_data->entry, &host_its_list);
>> +    }
>> +}
>> +
>> +/*
>> + * Local variables:
>> + * mode: C
>> + * c-file-style: "BSD"
>> + * c-basic-offset: 4
>> + * indent-tabs-mode: nil
>> + * End:
>> + */
>> diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
>> index b8be395..838dd11 100644
>> --- a/xen/arch/arm/gic-v3.c
>> +++ b/xen/arch/arm/gic-v3.c
>> @@ -43,9 +43,12 @@
>>  #include <asm/device.h>
>>  #include <asm/gic.h>
>>  #include <asm/gic_v3_defs.h>
>> +#include <asm/gic_v3_its.h>
>>  #include <asm/cpufeature.h>
>>  #include <asm/acpi.h>
>>
>> +LIST_HEAD(host_its_list);
> 
> I would rather limit the number of variables exported. I've looked at how
> host_its_list is used across this series and I don't think it is
> necessary to export it.

Yes, I wasn't too happy to introduce it in the first place, but I
originally needed it even beyond the current usage. Now that this is
gone, I am happy to try confining it to this file, which indeed
sounds architecturally cleaner.

Cheers,
Andre.

> The 2 users (not including gic-v3-its.c) are in gic-v3.c and vgic-v3.c.
> I will explain how to replace the one in vgic-v3.c on the corresponding
> patch.
> 
> For gic-v3.c, you use host_its_list to check whether an ITS is available and
> to go through the list. For the former, you could have gicv3_its_dt_init
> return the number of ITSes available. For the latter, the loop is calling
> a function living in gic-v3-its.c, where host_its_list is already available.
> 
> I will mention this again when reviewing the associated patches.
> 
>> +
>>  /* Global state */
>>  static struct {
>>      void __iomem *map_dbase;  /* Mapped address of distributor registers */
>> @@ -1224,11 +1227,12 @@ static void __init gicv3_dt_init(void)
>>       */
>>      res = dt_device_get_address(node, 1 + gicv3.rdist_count,
>>                                  &cbase, &csize);
>> -    if ( res )
>> -        return;
>> +    if ( !res )
>> +        dt_device_get_address(node, 1 + gicv3.rdist_count + 2,
>> +                              &vbase, &vsize);
>>
>> -    dt_device_get_address(node, 1 + gicv3.rdist_count + 2,
>> -                          &vbase, &vsize);
>> +    /* Check for ITS child nodes and build the host ITS list accordingly. */
>> +    gicv3_its_dt_init(node);
>>  }
>>
>>  static int gicv3_iomem_deny_access(const struct domain *d)
>> diff --git a/xen/include/asm-arm/gic_v3_its.h b/xen/include/asm-arm/gic_v3_its.h
>> new file mode 100644
>> index 0000000..2f5c51c
>> --- /dev/null
>> +++ b/xen/include/asm-arm/gic_v3_its.h
>> @@ -0,0 +1,57 @@
>> +/*
>> + * ARM GICv3 ITS support
>> + *
>> + * Andre Przywara <andre.przywara@arm.com>
>> + * Copyright (c) 2016 ARM Ltd.
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License as published by
>> + * the Free Software Foundation; either version 2 of the License, or
>> + * (at your option) any later version.
>> + *
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> + * GNU General Public License for more details.
>> + */
>> +
>> +#ifndef __ASM_ARM_ITS_H__
>> +#define __ASM_ARM_ITS_H__
>> +
>> +#ifndef __ASSEMBLY__
> 
> Do you expect the ITS header to be included in the assembly code? If
> not, then I would drop the #ifndef __ASSEMBLY.
> 
>> +#include <xen/device_tree.h>
>> +
>> +/* data structure for each hardware ITS */
>> +struct host_its {
>> +    struct list_head entry;
>> +    const struct dt_device_node *dt_node;
>> +    paddr_t addr;
>> +    paddr_t size;
>> +};
>> +
>> +extern struct list_head host_its_list;
>> +
>> +#ifdef CONFIG_HAS_ITS
>> +
>> +/* Parse the host DT and pick up all host ITSes. */
>> +void gicv3_its_dt_init(const struct dt_device_node *node);
>> +
>> +#else
>> +
>> +static inline void gicv3_its_dt_init(const struct dt_device_node *node)
>> +{
>> +}
>> +
>> +#endif /* CONFIG_HAS_ITS */
>> +
>> +#endif /* __ASSEMBLY__ */
>> +#endif
>> +
>> +/*
>> + * Local variables:
>> + * mode: C
>> + * c-file-style: "BSD"
>> + * c-basic-offset: 4
>> + * indent-tabs-mode: nil
>> + * End:
>> + */
>>
> 
> Cheers,
> 


^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [PATCH 02/28] ARM: GICv3 ITS: parse and store ITS subnodes from hardware DT
  2017-02-16 17:44     ` Andre Przywara
@ 2017-02-16 18:15       ` Julien Grall
  0 siblings, 0 replies; 106+ messages in thread
From: Julien Grall @ 2017-02-16 18:15 UTC (permalink / raw)
  To: Andre Przywara, Stefano Stabellini; +Cc: xen-devel, nd, Vijay Kilari

On 16/02/17 17:44, Andre Przywara wrote:
> Hi Julien,

Hi Andre,

> On 06/02/17 12:39, Julien Grall wrote:
>> On 30/01/17 18:31, Andre Przywara wrote:
>>> +
>>> +        if ( !dt_device_is_compatible(its, "arm,gic-v3-its") )
>>> +            continue;
>>> +
>>> +        if ( !dt_device_is_available(its) )
>>> +            continue;
>>
>> Can an ITS really be disabled? Or is it just for debugging?
>
> This was indeed introduced for debugging, but it is also useful with multiple
> ITSes. Firmware could ship with a DT covering the maximum hardware
> configuration and then disable non-existent hardware at boot time.
>
> And in general I consider it good style to support the status property.

I tend to agree here, however this will have a side-effect on the 
device-tree generated for DOM0. The ITS node will not be replicated and 
you will end up using a broken device-tree. While Linux will only 
complain, some other OS may just abort.

Cheers,

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [PATCH 04/28] ARM: GICv3 ITS: allocate device and collection table
  2017-01-30 18:31 ` [PATCH 04/28] ARM: GICv3 ITS: allocate device and collection table Andre Przywara
                     ` (4 preceding siblings ...)
  2017-02-15 18:31   ` Shanker Donthineni
@ 2017-02-16 19:03   ` Shanker Donthineni
  2017-02-24 19:29     ` Shanker Donthineni
  5 siblings, 1 reply; 106+ messages in thread
From: Shanker Donthineni @ 2017-02-16 19:03 UTC (permalink / raw)
  To: Andre Przywara, Stefano Stabellini, Julien Grall; +Cc: xen-devel, Vijay Kilari

Hi Andre,


On 01/30/2017 12:31 PM, Andre Przywara wrote:
> Each ITS maps a pair of a DeviceID (usually the PCI b/d/f triplet) and
> an EventID (the MSI payload or interrupt ID) to a pair of LPI number
> and collection ID, which points to the target CPU.
> This mapping is stored in the device and collection tables, which software
> has to provide for the ITS to use.
> Allocate the required memory and hand it the ITS.
> The maximum number of devices is limited to a compile-time constant
> exposed in Kconfig.
>
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> ---
>   xen/arch/arm/Kconfig             |  14 +++++
>   xen/arch/arm/gic-v3-its.c        | 129 +++++++++++++++++++++++++++++++++++++++
>   xen/arch/arm/gic-v3.c            |   5 ++
>   xen/include/asm-arm/gic_v3_its.h |  55 ++++++++++++++++-
>   4 files changed, 202 insertions(+), 1 deletion(-)
>
> diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
> index 71734a1..81bc233 100644
> --- a/xen/arch/arm/Kconfig
> +++ b/xen/arch/arm/Kconfig
> @@ -64,6 +64,20 @@ config MAX_PHYS_LPI_BITS
>             This can be overriden on the command line with the max_lpi_bits
>             parameter.
>
> +config MAX_PHYS_ITS_DEVICE_BITS
> +        depends on HAS_ITS
> +        int "Number of device bits the ITS supports"
> +        range 1 32
> +        default "10"
> +        help
> +          Specifies the maximum number of devices which want to use the ITS.
> +          Xen needs to allocates memory for the whole range very early.
> +          The allocation scheme may be sparse, so a much larger number must
> +          be supported to cover devices with a high bus number or those on
> +          separate bus segments.
> +          This can be overriden on the command line with the max_its_device_bits
> +          parameter.
> +
>   endmenu
>
>   menu "ARM errata workaround via the alternative framework"
> diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
> index ff0f571..c31fef6 100644
> --- a/xen/arch/arm/gic-v3-its.c
> +++ b/xen/arch/arm/gic-v3-its.c
> @@ -20,9 +20,138 @@
>   #include <xen/lib.h>
>   #include <xen/device_tree.h>
>   #include <xen/libfdt/libfdt.h>
> +#include <xen/mm.h>
> +#include <xen/sizes.h>
>   #include <asm/gic.h>
>   #include <asm/gic_v3_defs.h>
>   #include <asm/gic_v3_its.h>
> +#include <asm/io.h>
> +
> +#define BASER_ATTR_MASK                                           \
> +        ((0x3UL << GITS_BASER_SHAREABILITY_SHIFT)               | \
> +         (0x7UL << GITS_BASER_OUTER_CACHEABILITY_SHIFT)         | \
> +         (0x7UL << GITS_BASER_INNER_CACHEABILITY_SHIFT))
> +#define BASER_RO_MASK   (GENMASK(58, 56) | GENMASK(52, 48))
> +
> +static uint64_t encode_phys_addr(paddr_t addr, int page_bits)
> +{
> +    uint64_t ret;
> +
> +    if ( page_bits < 16 )
> +        return (uint64_t)addr & GENMASK(47, page_bits);
> +
> +    ret = addr & GENMASK(47, 16);
> +    return ret | (addr & GENMASK(51, 48)) >> (48 - 12);
> +}
> +
> +#define PAGE_BITS(sz) ((sz) * 2 + PAGE_SHIFT)
> +
> +static int its_map_baser(void __iomem *basereg, uint64_t regc, int nr_items)
> +{
> +    uint64_t attr, reg;
> +    int entry_size = ((regc >> GITS_BASER_ENTRY_SIZE_SHIFT) & 0x1f) + 1;
> +    int pagesz = 0, order, table_size;
> +    void *buffer = NULL;
> +
> +    attr  = GIC_BASER_InnerShareable << GITS_BASER_SHAREABILITY_SHIFT;
> +    attr |= GIC_BASER_CACHE_SameAsInner << GITS_BASER_OUTER_CACHEABILITY_SHIFT;
> +    attr |= GIC_BASER_CACHE_RaWaWb << GITS_BASER_INNER_CACHEABILITY_SHIFT;
> +
> +    /*
> +     * Setup the BASE register with the attributes that we like. Then read
> +     * it back and see what sticks (page size, cacheability and shareability
> +     * attributes), retrying if necessary.
> +     */
> +    while ( 1 )
> +    {
> +        table_size = ROUNDUP(nr_items * entry_size, BIT(PAGE_BITS(pagesz)));
> +        order = get_order_from_bytes(table_size);
> +
> +        if ( !buffer )
> +            buffer = alloc_xenheap_pages(order, 0);
> +        if ( !buffer )
> +            return -ENOMEM;
> +
> +        reg  = attr;
> +        reg |= (pagesz << GITS_BASER_PAGE_SIZE_SHIFT);
> +        reg |= table_size >> PAGE_BITS(pagesz);
> +        reg |= regc & BASER_RO_MASK;
> +        reg |= GITS_VALID_BIT;
> +        reg |= encode_phys_addr(virt_to_maddr(buffer), PAGE_BITS(pagesz));
> +
> +        writeq_relaxed(reg, basereg);
> +        regc = readl_relaxed(basereg);
Shouldn't this be regc = readq_relaxed(basereg)? GITS_BASER<n> is a 64-bit
register, and readl_relaxed() only returns the lower 32 bits.
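
A small sketch of what gets lost with a 32-bit read. The toy_* accessors are
invented stand-ins for readq_relaxed()/readl_relaxed() operating on a plain
variable instead of MMIO. The bit positions are the ones defined in the header
added by this patch.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions as defined in gic_v3_its.h by this patch. */
#define TOY_BASER_VALID              (UINT64_C(1) << 63)
#define TOY_BASER_INNER_CACHE_SHIFT  59
#define TOY_BASER_TYPE_SHIFT         56

/* Toy 64-bit read: returns the whole register. */
static uint64_t toy_readq(const uint64_t *reg)
{
    assert(reg != 0);
    return *reg;
}

/* Toy 32-bit read: only the low word survives. */
static uint64_t toy_readl(const uint64_t *reg)
{
    assert(reg != 0);
    return (uint32_t)*reg;
}

static unsigned int baser_type(uint64_t regc)
{
    return (unsigned int)((regc >> TOY_BASER_TYPE_SHIFT) & 0x7);
}

static unsigned int baser_inner_cache(uint64_t regc)
{
    return (unsigned int)((regc >> TOY_BASER_INNER_CACHE_SHIFT) & 0x7);
}
```

After the 32-bit read the valid bit, the type and the cacheability fields all
read back as zero, so the attribute comparison that follows can never see what
the hardware actually accepted in those upper bits.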

> +
> +        /* The host didn't like our attributes, just use what it returned. */
> +        if ( (regc & BASER_ATTR_MASK) != attr )
> +        {
> +            /* If we can't map it shareable, drop cacheability as well. */
> +            if ( (regc & GITS_BASER_SHAREABILITY_MASK) == GIC_BASER_NonShareable )
> +            {
> +                regc &= ~GITS_BASER_INNER_CACHEABILITY_MASK;
> +                attr = regc & BASER_ATTR_MASK;
> +                continue;
> +            }
> +            attr = regc & BASER_ATTR_MASK;
> +        }
> +
> +        /* If the host accepted our page size, we are done. */
> +        if ( (regc & (3UL << GITS_BASER_PAGE_SIZE_SHIFT)) == pagesz )
Invalid check: the masked field is never shifted down, so it can only match
when pagesz == 0. It should be:
    if ( ((regc >> GITS_BASER_PAGE_SIZE_SHIFT) & 0x3) == pagesz )
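
A quick way to see the difference between the two comparisons. The function
names are made up for this example. Only the shift value is taken from the
patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_PAGE_SIZE_SHIFT 8   /* same value as GITS_BASER_PAGE_SIZE_SHIFT */

/* Check as written in the patch: field masked in place, never shifted. */
static bool pagesz_matches_unshifted(uint64_t regc, unsigned int pagesz)
{
    assert(pagesz < 4);
    return (regc & (UINT64_C(3) << TOY_PAGE_SIZE_SHIFT)) == pagesz;
}

/* Suggested check: shift the field down before comparing. */
static bool pagesz_matches_shifted(uint64_t regc, unsigned int pagesz)
{
    assert(pagesz < 4);
    return ((regc >> TOY_PAGE_SIZE_SHIFT) & 0x3) == pagesz;
}
```

For a register that reports 16K pages (field value 1, i.e. 0x100) the
unshifted variant compares 0x100 against 1 and fails, so the loop would reject
a page size the host actually accepted.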
> +            return 0;
> +
> +        /* None of the page sizes was accepted, give up */
> +        if ( pagesz >= 2 )
> +            break;
> +
> +        free_xenheap_pages(buffer, order);
> +        buffer = NULL;
> +
> +        pagesz++;
> +    }
> +
> +    if ( buffer )
> +        free_xenheap_pages(buffer, order);
> +
> +    return -EINVAL;
> +}
> +
> +static unsigned int max_its_device_bits = CONFIG_MAX_PHYS_ITS_DEVICE_BITS;
> +integer_param("max_its_device_bits", max_its_device_bits);
> +
> +int gicv3_its_init(struct host_its *hw_its)
> +{
> +    uint64_t reg;
> +    int i;
> +
> +    hw_its->its_base = ioremap_nocache(hw_its->addr, hw_its->size);
> +    if ( !hw_its->its_base )
> +        return -ENOMEM;
> +
> +    for ( i = 0; i < GITS_BASER_NR_REGS; i++ )
> +    {
> +        void __iomem *basereg = hw_its->its_base + GITS_BASER0 + i * 8;
> +        int type;
> +
> +        reg = readq_relaxed(basereg);
> +        type = (reg & GITS_BASER_TYPE_MASK) >> GITS_BASER_TYPE_SHIFT;
> +        switch ( type )
> +        {
> +        case GITS_BASER_TYPE_NONE:
> +            continue;
> +        case GITS_BASER_TYPE_DEVICE:
> +            /* TODO: find some better way of limiting the number of devices */
> +            its_map_baser(basereg, reg, BIT(max_its_device_bits));
> +            break;
> +        case GITS_BASER_TYPE_COLLECTION:
> +            its_map_baser(basereg, reg, NR_CPUS);
> +            break;
> +        default:
> +            continue;
> +        }
> +    }
> +
> +    return 0;
> +}
>
>   /* Scan the DT for any ITS nodes and create a list of host ITSes out of it. */
>   void gicv3_its_dt_init(const struct dt_device_node *node)
> diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
> index fcb86c8..440c079 100644
> --- a/xen/arch/arm/gic-v3.c
> +++ b/xen/arch/arm/gic-v3.c
> @@ -29,6 +29,7 @@
>   #include <xen/irq.h>
>   #include <xen/iocap.h>
>   #include <xen/sched.h>
> +#include <xen/err.h>
>   #include <xen/errno.h>
>   #include <xen/delay.h>
>   #include <xen/device_tree.h>
> @@ -1563,6 +1564,7 @@ static int __init gicv3_init(void)
>   {
>       int res, i;
>       uint32_t reg;
> +    struct host_its *hw_its;
>
>       if ( !cpu_has_gicv3 )
>       {
> @@ -1618,6 +1620,9 @@ static int __init gicv3_init(void)
>       res = gicv3_cpu_init();
>       gicv3_hyp_init();
>
> +    list_for_each_entry(hw_its, &host_its_list, entry)
> +        gicv3_its_init(hw_its);
> +
>       spin_unlock(&gicv3.lock);
>
>       return res;
> diff --git a/xen/include/asm-arm/gic_v3_its.h b/xen/include/asm-arm/gic_v3_its.h
> index a66b6be..ed44bdb 100644
> --- a/xen/include/asm-arm/gic_v3_its.h
> +++ b/xen/include/asm-arm/gic_v3_its.h
> @@ -18,6 +18,53 @@
>   #ifndef __ASM_ARM_ITS_H__
>   #define __ASM_ARM_ITS_H__
>
> +#define GITS_CTLR                       0x000
> +#define GITS_IIDR                       0x004
> +#define GITS_TYPER                      0x008
> +#define GITS_CBASER                     0x080
> +#define GITS_CWRITER                    0x088
> +#define GITS_CREADR                     0x090
> +#define GITS_BASER_NR_REGS              8
> +#define GITS_BASER0                     0x100
> +#define GITS_BASER1                     0x108
> +#define GITS_BASER2                     0x110
> +#define GITS_BASER3                     0x118
> +#define GITS_BASER4                     0x120
> +#define GITS_BASER5                     0x128
> +#define GITS_BASER6                     0x130
> +#define GITS_BASER7                     0x138
> +
> +/* Register bits */
> +#define GITS_VALID_BIT                  BIT_ULL(63)
> +
> +#define GITS_CTLR_QUIESCENT             BIT(31)
> +#define GITS_CTLR_ENABLE                BIT(0)
> +
> +#define GITS_IIDR_VALUE                 0x34c
> +
> +#define GITS_BASER_INDIRECT             BIT_ULL(62)
> +#define GITS_BASER_INNER_CACHEABILITY_SHIFT        59
> +#define GITS_BASER_TYPE_SHIFT           56
> +#define GITS_BASER_TYPE_MASK            (7ULL << GITS_BASER_TYPE_SHIFT)
> +#define GITS_BASER_OUTER_CACHEABILITY_SHIFT        53
> +#define GITS_BASER_TYPE_NONE            0UL
> +#define GITS_BASER_TYPE_DEVICE          1UL
> +#define GITS_BASER_TYPE_VCPU            2UL
> +#define GITS_BASER_TYPE_CPU             3UL
> +#define GITS_BASER_TYPE_COLLECTION      4UL
> +#define GITS_BASER_TYPE_RESERVED5       5UL
> +#define GITS_BASER_TYPE_RESERVED6       6UL
> +#define GITS_BASER_TYPE_RESERVED7       7UL
> +#define GITS_BASER_ENTRY_SIZE_SHIFT     48
> +#define GITS_BASER_SHAREABILITY_SHIFT   10
> +#define GITS_BASER_PAGE_SIZE_SHIFT      8
> +#define GITS_BASER_RO_MASK              (GITS_BASER_TYPE_MASK | \
> +                                        (31UL << GITS_BASER_ENTRY_SIZE_SHIFT) |\
> +                                        GITS_BASER_INDIRECT)
> +#define GITS_BASER_SHAREABILITY_MASK   (0x3ULL << GITS_BASER_SHAREABILITY_SHIFT)
> +#define GITS_BASER_OUTER_CACHEABILITY_MASK   (0x7ULL << GITS_BASER_OUTER_CACHEABILITY_SHIFT)
> +#define GITS_BASER_INNER_CACHEABILITY_MASK   (0x7ULL << GITS_BASER_INNER_CACHEABILITY_SHIFT)
> +
>   #ifndef __ASSEMBLY__
>   #include <xen/device_tree.h>
>
> @@ -27,6 +74,7 @@ struct host_its {
>       const struct dt_device_node *dt_node;
>       paddr_t addr;
>       paddr_t size;
> +    void __iomem *its_base;
>   };
>
>   extern struct list_head host_its_list;
> @@ -42,8 +90,9 @@ void gicv3_its_dt_init(const struct dt_device_node *node);
>   uint64_t gicv3_lpi_get_proptable(void);
>   uint64_t gicv3_lpi_allocate_pendtable(void);
>
> -/* Initialize the host structures for LPIs. */
> +/* Initialize the host structures for LPIs and the host ITSes. */
>   int gicv3_lpi_init_host_lpis(unsigned int nr_lpis);
> +int gicv3_its_init(struct host_its *hw_its);
>
>   #else
>
> @@ -62,6 +111,10 @@ static inline int gicv3_lpi_init_host_lpis(unsigned int nr_lpis)
>   {
>       return 0;
>   }
> +static inline int gicv3_its_init(struct host_its *hw_its)
> +{
> +    return 0;
> +}
>   #endif /* CONFIG_HAS_ITS */
>
>   #endif /* __ASSEMBLY__ */

-- 
Shanker Donthineni
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.


^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [PATCH 07/28] ARM: GICv3 ITS: introduce device mapping
  2017-01-30 18:31 ` [PATCH 07/28] ARM: GICv3 ITS: introduce device mapping Andre Przywara
  2017-02-07 14:05   ` Julien Grall
  2017-02-15 16:30   ` Julien Grall
@ 2017-02-22  7:06   ` Vijay Kilari
  2017-02-24 19:37     ` Shanker Donthineni
  2017-02-22 13:17   ` Julien Grall
  3 siblings, 1 reply; 106+ messages in thread
From: Vijay Kilari @ 2017-02-22  7:06 UTC (permalink / raw)
  To: Andre Przywara; +Cc: xen-devel, Julien Grall, Stefano Stabellini

Hi Andre,

On Tue, Jan 31, 2017 at 12:01 AM, Andre Przywara <andre.przywara@arm.com> wrote:
> The ITS uses device IDs to map LPIs to a device. Dom0 will later use
> those IDs, which we directly pass on to the host.
> For this we have to map each device that Dom0 may request to a host
> ITS device with the same identifier.
> Allocate the respective memory and enter each device into an rbtree to
> later be able to iterate over it or to easily teardown guests.
>
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> ---
>  xen/arch/arm/gic-v3-its.c        | 188 ++++++++++++++++++++++++++++++++++++++-
>  xen/arch/arm/vgic-v3.c           |   3 +
>  xen/include/asm-arm/domain.h     |   3 +
>  xen/include/asm-arm/gic_v3_its.h |  28 ++++++
>  4 files changed, 221 insertions(+), 1 deletion(-)
>
> diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
> index 6578e8a..4a3a394 100644
> --- a/xen/arch/arm/gic-v3-its.c
> +++ b/xen/arch/arm/gic-v3-its.c
> @@ -21,8 +21,10 @@
>  #include <xen/device_tree.h>
>  #include <xen/delay.h>
>  #include <xen/libfdt/libfdt.h>
> -#include <xen/mm.h>
> +#include <xen/rbtree.h>
> +#include <xen/sched.h>
>  #include <xen/sizes.h>
> +#include <xen/domain.h>
>  #include <asm/gic.h>
>  #include <asm/gic_v3_defs.h>
>  #include <asm/gic_v3_its.h>
> @@ -94,6 +96,21 @@ static int its_send_cmd_mapc(struct host_its *its, int collection_id, int cpu)
>      return its_send_command(its, cmd);
>  }
>
> +static int its_send_cmd_mapd(struct host_its *its, uint32_t deviceid,
> +                             int size, uint64_t itt_addr, bool valid)
> +{
> +    uint64_t cmd[4];
> +
> +    cmd[0] = GITS_CMD_MAPD | ((uint64_t)deviceid << 32);
> +    cmd[1] = size & GENMASK(4, 0);
> +    cmd[2] = itt_addr & GENMASK(51, 8);
> +    if ( valid )
> +        cmd[2] |= GITS_VALID_BIT;
> +    cmd[3] = 0x00;
> +
> +    return its_send_command(its, cmd);
> +}
> +
>  /* Set up the (1:1) collection mapping for the given host CPU. */
>  int gicv3_its_setup_collection(int cpu)
>  {
> @@ -293,6 +310,7 @@ int gicv3_its_init(struct host_its *hw_its)
>      reg = readq_relaxed(hw_its->its_base + GITS_TYPER);
>      if ( reg & GITS_TYPER_PTA )
>          hw_its->flags |= HOST_ITS_USES_PTA;
> +    hw_its->itte_size = GITS_TYPER_ITT_SIZE(reg);
>
>      for ( i = 0; i < GITS_BASER_NR_REGS; i++ )
>      {
> @@ -339,6 +357,173 @@ int gicv3_its_init(struct host_its *hw_its)
>      return 0;
>  }
>
> +static void remove_mapped_guest_device(struct its_devices *dev)
> +{
> +    if ( dev->hw_its )
> +        its_send_cmd_mapd(dev->hw_its, dev->host_devid, 0, 0, false);
> +
> +    xfree(dev->itt_addr);
> +    xfree(dev);
> +}
> +
> +int gicv3_its_map_guest_device(struct domain *d, int host_devid,
> +                               int guest_devid, int bits, bool valid)
> +{
> +    void *itt_addr = NULL;
> +    struct its_devices *dev, *temp;
> +    struct rb_node **new = &d->arch.vgic.its_devices.rb_node, *parent = NULL;
> +    struct host_its *hw_its;
> +    int ret;
> +
> +    /* check for already existing mappings */
> +    spin_lock(&d->arch.vgic.its_devices_lock);
> +    while (*new)
> +    {
> +        temp = rb_entry(*new, struct its_devices, rbnode);
> +
> +        if ( temp->guest_devid == guest_devid )
> +        {
> +            if ( !valid )
> +                rb_erase(&temp->rbnode, &d->arch.vgic.its_devices);
> +
> +            spin_unlock(&d->arch.vgic.its_devices_lock);
> +
> +            if ( valid )
> +                return -EBUSY;
> +
> +            remove_mapped_guest_device(temp);
> +
> +            return 0;
> +        }
> +
> +        if ( guest_devid < temp->guest_devid )
> +            new = &((*new)->rb_right);

Shouldn't this be rb_left?
> +        else
> +            new = &((*new)->rb_left);
Shouldn't this be rb_right?

> +    }

I see duplicated code between finding a node and inserting a node into the
rb-tree.

> +
> +    if ( !valid )
> +    {
> +        ret = -ENOENT;
> +        goto out_unlock;
> +    }
> +
> +    /*
> +     * TODO: Work out the correct hardware ITS to use here.
> +     * Maybe a per-platform function: devid -> ITS?
> +     * Or parsing the DT to find the msi_parent?
> +     * Or get Dom0 to give us this information?
> +     * For now just use the first ITS.
> +     */
> +    hw_its = list_first_entry(&host_its_list, struct host_its, entry);
> +
> +    ret = -ENOMEM;
> +
> +    itt_addr = _xmalloc(BIT(bits) * hw_its->itte_size, 256);
> +    if ( !itt_addr )
> +        goto out_unlock;
> +
> +    dev = xmalloc(struct its_devices);
> +    if ( !dev )
> +    {
> +        xfree(itt_addr);
> +        goto out_unlock;
> +    }
> +
> +    ret = its_send_cmd_mapd(hw_its, host_devid, bits - 1,
> +                            virt_to_maddr(itt_addr), true);
> +    if ( ret )
> +    {
> +        xfree(itt_addr);
> +        xfree(dev);
> +        goto out_unlock;
> +    }
> +
> +    dev->itt_addr = itt_addr;
> +    dev->hw_its = hw_its;
> +    dev->guest_devid = guest_devid;
> +    dev->host_devid = host_devid;
> +    dev->eventids = BIT(bits);
> +
> +    rb_link_node(&dev->rbnode, parent, new);

parent is never updated before inserting, so the whole tree is not
managed properly.
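
For comparison, the usual descent pattern looks roughly like the sketch
below. It is a plain binary search tree with invented toy_* names and no
rebalancing, only meant to show the two points above: smaller keys go left
(matching the lookup in gicv3_its_unmap_device()), and parent is updated on
every step before the new node is linked in.

```c
#include <assert.h>
#include <stddef.h>

struct toy_node {
    int key;
    struct toy_node *parent, *left, *right;
};

/* Insert n into the tree; returns 0 on success, -1 if the key exists. */
static int toy_insert(struct toy_node **root, struct toy_node *n)
{
    struct toy_node **link = root, *parent = NULL;

    assert(root != NULL && n != NULL);

    while ( *link )
    {
        parent = *link;                 /* remember where we descend from */
        if ( n->key < parent->key )
            link = &parent->left;       /* smaller keys go left */
        else if ( n->key > parent->key )
            link = &parent->right;      /* larger keys go right */
        else
            return -1;                  /* already mapped */
    }

    /* This is the step rb_link_node() performs in the real code. */
    n->parent = parent;
    n->left = n->right = NULL;
    *link = n;

    return 0;
}

/* Lookup using the same ordering as the insertion. */
static struct toy_node *toy_find(struct toy_node *root, int key)
{
    while ( root && root->key != key )
        root = key < root->key ? root->left : root->right;

    return root;
}
```

If insertion and lookup disagree on the direction, or if parent stays NULL,
a lookup can miss nodes that were inserted earlier.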

> +    rb_insert_color(&dev->rbnode, &d->arch.vgic.its_devices);
> +
> +    spin_unlock(&d->arch.vgic.its_devices_lock);
> +
> +    return 0;
> +
> +out_unlock:
> +    spin_unlock(&d->arch.vgic.its_devices_lock);
> +    return ret;
> +}
> +
> +/* Removing any connections a domain had to any ITS in the system. */
> +void gicv3_its_unmap_all_devices(struct domain *d)
> +{
> +    struct rb_node *victim;
> +    struct its_devices *dev;
> +
> +    /*
> +     * This is an easily readable, yet inefficient implementation.
> +     * It uses the provided iteration wrapper and erases each node, which
> +     * possibly triggers rebalancing.
> +     * This seems overkill since we are going to abolish the whole tree, but
> +     * avoids an open-coded re-implementation of the traversal functions with
> +     * some recursive function calls.
> +     * Performance does not matter here, since we are destroying a domain.
> +     */
> +restart:
> +    spin_lock(&d->arch.vgic.its_devices_lock);
> +    if ( (victim = rb_first(&d->arch.vgic.its_devices)) )
> +    {
> +        dev = rb_entry(victim, struct its_devices, rbnode);
> +        rb_erase(victim, &d->arch.vgic.its_devices);
> +
> +        spin_unlock(&d->arch.vgic.its_devices_lock);
> +
> +        remove_mapped_guest_device(dev);
> +
> +        goto restart;
> +    }
> +
> +    spin_unlock(&d->arch.vgic.its_devices_lock);
> +}
> +
> +int gicv3_its_unmap_device(struct domain *d, int guest_devid)
> +{
> +    struct rb_node *node;
> +
> +    spin_lock(&d->arch.vgic.its_devices_lock);
> +    node = d->arch.vgic.its_devices.rb_node;
> +    while (node)
> +    {
> +        struct its_devices *dev = rb_entry(node, struct its_devices, rbnode);
> +
> +        if ( dev->guest_devid > guest_devid )
> +        {
> +            node = node->rb_left;
> +            continue;
> +        }
> +        if ( dev->guest_devid < guest_devid )
> +        {
> +            node = node->rb_right;
> +            continue;
> +        }
> +
> +        rb_erase(node, &d->arch.vgic.its_devices);
> +
> +        spin_unlock(&d->arch.vgic.its_devices_lock);
> +
> +        remove_mapped_guest_device(dev);
> +
> +        return 0;
> +
> +    }
> +    spin_unlock(&d->arch.vgic.its_devices_lock);
> +
> +    return -ENOENT;
> +}
> +
>  /* Scan the DT for any ITS nodes and create a list of host ITSes out of it. */
>  void gicv3_its_dt_init(const struct dt_device_node *node)
>  {
> @@ -369,6 +554,7 @@ void gicv3_its_dt_init(const struct dt_device_node *node)
>          its_data->addr = addr;
>          its_data->size = size;
>          its_data->dt_node = its;
> +        spin_lock_init(&its_data->cmd_lock);
>
>          printk("GICv3: Found ITS @0x%lx\n", addr);
>
> diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c
> index d61479d..1fadb00 100644
> --- a/xen/arch/arm/vgic-v3.c
> +++ b/xen/arch/arm/vgic-v3.c
> @@ -1450,6 +1450,9 @@ static int vgic_v3_domain_init(struct domain *d)
>      d->arch.vgic.nr_regions = rdist_count;
>      d->arch.vgic.rdist_regions = rdist_regions;
>
> +    spin_lock_init(&d->arch.vgic.its_devices_lock);
> +    d->arch.vgic.its_devices = RB_ROOT;
> +
>      /*
>       * Domain 0 gets the hardware address.
>       * Guests get the virtual platform layout.
> diff --git a/xen/include/asm-arm/domain.h b/xen/include/asm-arm/domain.h
> index 2d6fbb1..00b9c1a 100644
> --- a/xen/include/asm-arm/domain.h
> +++ b/xen/include/asm-arm/domain.h
> @@ -11,6 +11,7 @@
>  #include <asm/gic.h>
>  #include <public/hvm/params.h>
>  #include <xen/serial.h>
> +#include <xen/rbtree.h>
>
>  struct hvm_domain
>  {
> @@ -109,6 +110,8 @@ struct arch_domain
>          } *rdist_regions;
>          int nr_regions;                     /* Number of rdist regions */
>          uint32_t rdist_stride;              /* Re-Distributor stride */
> +        struct rb_root its_devices;         /* devices mapped to an ITS */
> +        spinlock_t its_devices_lock;        /* protects the its_devices tree */
>  #endif
>      } vgic;
>
> diff --git a/xen/include/asm-arm/gic_v3_its.h b/xen/include/asm-arm/gic_v3_its.h
> index 8288185..9c5dcf3 100644
> --- a/xen/include/asm-arm/gic_v3_its.h
> +++ b/xen/include/asm-arm/gic_v3_its.h
> @@ -42,6 +42,10 @@
>
>  #define GITS_TYPER_PTA                  BIT_ULL(19)
>  #define GITS_TYPER_IDBITS_SHIFT         8
> +#define GITS_TYPER_ITT_SIZE_SHIFT       4
> +#define GITS_TYPER_ITT_SIZE_MASK        (0xfUL << GITS_TYPER_ITT_SIZE_SHIFT)
> +#define GITS_TYPER_ITT_SIZE(r)          (((r) & GITS_TYPER_ITT_SIZE_MASK) >> \
> +                                                GITS_TYPER_ITT_SIZE_SHIFT)
>
>  #define GITS_IIDR_VALUE                 0x34c
>
> @@ -88,6 +92,7 @@
>
>  #ifndef __ASSEMBLY__
>  #include <xen/device_tree.h>
> +#include <xen/rbtree.h>
>
>  #define HOST_ITS_FLUSH_CMD_QUEUE        (1U << 0)
>  #define HOST_ITS_USES_PTA               (1U << 1)
> @@ -101,9 +106,19 @@ struct host_its {
>      void __iomem *its_base;
>      spinlock_t cmd_lock;
>      void *cmd_buf;
> +    int itte_size;
>      unsigned int flags;
>  };
>
> +struct its_devices {
> +    struct rb_node rbnode;
> +    struct host_its *hw_its;
> +    void *itt_addr;
> +    uint32_t guest_devid;
> +    uint32_t host_devid;
> +    uint32_t eventids;
> +};
> +
>  extern struct list_head host_its_list;
>
>  #ifdef CONFIG_HAS_ITS
> @@ -128,6 +143,13 @@ uint64_t gicv3_get_redist_address(int cpu, bool use_pta);
>  /* Map a collection for this host CPU to each host ITS. */
>  int gicv3_its_setup_collection(int cpu);
>
> +/* Map a device on the host by allocating an ITT on the host (ITS).
> + * "bits" specifies how many events (interrupts) this device will need.
> + * Setting "valid" to false deallocates the device.
> + */
> +int gicv3_its_map_guest_device(struct domain *d, int host_devid,
> +                               int guest_devid, int bits, bool valid);
> +
>  #else
>
>  static inline void gicv3_its_dt_init(const struct dt_device_node *node)
> @@ -156,6 +178,12 @@ static inline int gicv3_its_setup_collection(int cpu)
>  {
>      return 0;
>  }
> +static inline int gicv3_its_map_guest_device(struct domain *d, int host_devid,
> +                                             int guest_devid, int bits,
> +                                             bool valid)
> +{
> +    return -ENODEV;
> +}
>
>  #endif /* CONFIG_HAS_ITS */
>
> --
> 2.9.0
>

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


* Re: [PATCH 07/28] ARM: GICv3 ITS: introduce device mapping
  2017-01-30 18:31 ` [PATCH 07/28] ARM: GICv3 ITS: introduce device mapping Andre Przywara
                     ` (2 preceding siblings ...)
  2017-02-22  7:06   ` Vijay Kilari
@ 2017-02-22 13:17   ` Julien Grall
  3 siblings, 0 replies; 106+ messages in thread
From: Julien Grall @ 2017-02-22 13:17 UTC (permalink / raw)
  To: Andre Przywara, Stefano Stabellini; +Cc: xen-devel, nd, Vijay Kilari

Hi Andre,

On 30/01/17 18:31, Andre Przywara wrote:
> diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
> index 6578e8a..4a3a394 100644
> --- a/xen/arch/arm/gic-v3-its.c
> +++ b/xen/arch/arm/gic-v3-its.c

[...]


> +
> +int gicv3_its_map_guest_device(struct domain *d, int host_devid,
> +                               int guest_devid, int bits, bool valid)

Whilst looking at the IORT table it occurred to me that the DeviceID may
not be unique across all the ITSes on the platform.

This means two ITSes could use the same DeviceID for different devices.
However, this function assumes the DeviceID will always be unique.

So we would need to specify the pITS and vITS for all PCI devices.
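
To illustrate the concern: a lookup keyed on the DeviceID alone cannot tell
two such devices apart, whereas a key that also names the ITS can. Below is
a minimal sketch of such a composite key. The names (struct its_dev_key,
its_dev_key_cmp) and the use of the ITS base address as identifier are
assumptions for illustration only, they are not part of the patch.

```c
#include <stdint.h>

/* Hypothetical composite key: which ITS, and which DeviceID on that ITS. */
struct its_dev_key {
    uint64_t its_addr;   /* assumed identifier of the ITS, e.g. its base address */
    uint32_t devid;      /* DeviceID, only unique within one ITS */
};

/* Total order usable for tree lookups: compare the ITS first, then the DeviceID. */
static int its_dev_key_cmp(const struct its_dev_key *a,
                           const struct its_dev_key *b)
{
    if ( a->its_addr != b->its_addr )
        return a->its_addr < b->its_addr ? -1 : 1;
    if ( a->devid != b->devid )
        return a->devid < b->devid ? -1 : 1;
    return 0;
}
```

With such a comparison two devices sharing DeviceID 5 behind different
ITSes sort as distinct entries instead of colliding.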

Cheers,

-- 
Julien Grall


* Re: [PATCH 04/28] ARM: GICv3 ITS: allocate device and collection table
  2017-02-16 19:03   ` Shanker Donthineni
@ 2017-02-24 19:29     ` Shanker Donthineni
  2017-02-27 10:23       ` Andre Przywara
  0 siblings, 1 reply; 106+ messages in thread
From: Shanker Donthineni @ 2017-02-24 19:29 UTC (permalink / raw)
  To: Andre Przywara, Stefano Stabellini, Julien Grall; +Cc: xen-devel, Vijay Kilari

Hi Andre


On 02/16/2017 01:03 PM, Shanker Donthineni wrote:
> Hi Andre,
>
>
> On 01/30/2017 12:31 PM, Andre Przywara wrote:
>> Each ITS maps a pair of a DeviceID (usually the PCI b/d/f triplet) and
>> an EventID (the MSI payload or interrupt ID) to a pair of LPI number
>> and collection ID, which points to the target CPU.
>> This mapping is stored in the device and collection tables, which 
>> software
>> has to provide for the ITS to use.
>> Allocate the required memory and hand it the ITS.
>> The maximum number of devices is limited to a compile-time constant
>> exposed in Kconfig.
>>
>> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
>> ---
>>   xen/arch/arm/Kconfig             |  14 +++++
>>   xen/arch/arm/gic-v3-its.c        | 129
>> +++++++++++++++++++++++++++++++++++++++
>>   xen/arch/arm/gic-v3.c            |   5 ++
>>   xen/include/asm-arm/gic_v3_its.h |  55 ++++++++++++++++-
>>   4 files changed, 202 insertions(+), 1 deletion(-)
>>
>> diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
>> index 71734a1..81bc233 100644
>> --- a/xen/arch/arm/Kconfig
>> +++ b/xen/arch/arm/Kconfig
>> @@ -64,6 +64,20 @@ config MAX_PHYS_LPI_BITS
>>             This can be overridden on the command line with the max_lpi_bits
>>             parameter.
>>
>> +config MAX_PHYS_ITS_DEVICE_BITS
>> +        depends on HAS_ITS
>> +        int "Number of device bits the ITS supports"
>> +        range 1 32
>> +        default "10"
>> +        help
>> +          Specifies the maximum number of devices which want to use the ITS.
>> +          Xen needs to allocate memory for the whole range very early.
>> +          The allocation scheme may be sparse, so a much larger number must
>> +          be supported to cover devices with a high bus number or those on
>> +          separate bus segments.
>> +          This can be overridden on the command line with the
>> +          max_its_device_bits parameter.
>> +
>>   endmenu
>>
>>   menu "ARM errata workaround via the alternative framework"
>> diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
>> index ff0f571..c31fef6 100644
>> --- a/xen/arch/arm/gic-v3-its.c
>> +++ b/xen/arch/arm/gic-v3-its.c
>> @@ -20,9 +20,138 @@
>>   #include <xen/lib.h>
>>   #include <xen/device_tree.h>
>>   #include <xen/libfdt/libfdt.h>
>> +#include <xen/mm.h>
>> +#include <xen/sizes.h>
>>   #include <asm/gic.h>
>>   #include <asm/gic_v3_defs.h>
>>   #include <asm/gic_v3_its.h>
>> +#include <asm/io.h>
>> +
>> +#define BASER_ATTR_MASK                                           \
>> +        ((0x3UL << GITS_BASER_SHAREABILITY_SHIFT)               | \
>> +         (0x7UL << GITS_BASER_OUTER_CACHEABILITY_SHIFT)         | \
>> +         (0x7UL << GITS_BASER_INNER_CACHEABILITY_SHIFT))
>> +#define BASER_RO_MASK   (GENMASK(58, 56) | GENMASK(52, 48))
>> +
>> +static uint64_t encode_phys_addr(paddr_t addr, int page_bits)
>> +{
>> +    uint64_t ret;
>> +
>> +    if ( page_bits < 16 )
>> +        return (uint64_t)addr & GENMASK(47, page_bits);
>> +
>> +    ret = addr & GENMASK(47, 16);
>> +    return ret | (addr & GENMASK(51, 48)) >> (48 - 12);
>> +}
>> +
>> +#define PAGE_BITS(sz) ((sz) * 2 + PAGE_SHIFT)
>> +
>> +static int its_map_baser(void __iomem *basereg, uint64_t regc, int
>> nr_items)
>> +{
>> +    uint64_t attr, reg;
>> +    int entry_size = ((regc >> GITS_BASER_ENTRY_SIZE_SHIFT) & 0x1f) 
>> + 1;
>> +    int pagesz = 0, order, table_size;

Please try the ITS page sizes in the order 64K, 16K and 4K, as the Linux ITS
driver does, so that a flat table can cover more ITS devices.

>> +    void *buffer = NULL;
>> +
>> +    attr  = GIC_BASER_InnerShareable << GITS_BASER_SHAREABILITY_SHIFT;
>> +    attr |= GIC_BASER_CACHE_SameAsInner <<
>> GITS_BASER_OUTER_CACHEABILITY_SHIFT;
>> +    attr |= GIC_BASER_CACHE_RaWaWb << 
>> GITS_BASER_INNER_CACHEABILITY_SHIFT;
>> +
>> +    /*
>> +     * Setup the BASE register with the attributes that we like. 
>> Then read
>> +     * it back and see what sticks (page size, cacheability and
>> shareability
>> +     * attributes), retrying if necessary.
>> +     */
>> +    while ( 1 )
>> +    {
>> +        table_size = ROUNDUP(nr_items * entry_size,
>> BIT(PAGE_BITS(pagesz)));
>> +        order = get_order_from_bytes(table_size);
>> +

Please limit the table to 256 ITS pages; the ITS spec doesn't support more
than that.

        /* Maximum of 256 ITS pages are allowed */
        if ( (table_size >> PAGE_BITS(pagesz)) > GITS_BASER_PAGES_MAX )
                table_size = BIT(PAGE_BITS(pagesz)) * GITS_BASER_PAGES_MAX;

>> +        if ( !buffer )
>> +            buffer = alloc_xenheap_pages(order, 0);
>> +        if ( !buffer )
>> +            return -ENOMEM;
>> +

Please zero the memory: memset(buffer, 0x00, PAGE_SIZE << order)
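
Put together, the three comments above could look roughly like the sketch
below. It is plain user-space C, not Xen code: calloc() stands in for the
xenheap allocator, ITS_PAGES_MAX stands in for GITS_BASER_PAGES_MAX, and the
supported_mask argument is a hypothetical replacement for reading the BASER
register back to see which page size the hardware accepted.

```c
#include <stdint.h>
#include <stdlib.h>

#define ITS_PAGES_MAX 256   /* assumed stand-in for GITS_BASER_PAGES_MAX */

/* Page size candidates as log2, largest first: 64K, 16K, 4K. */
static const unsigned int page_bits_order[] = { 16, 14, 12 };

/*
 * Pick the first candidate page size present in supported_mask (bit n set
 * means pages of 2^n bytes are accepted), size the table in whole ITS pages
 * and clamp it to 256 pages. Returns a zeroed buffer, or NULL on failure.
 */
static void *alloc_its_table(uint64_t nr_items, unsigned int entry_size,
                             uint32_t supported_mask,
                             unsigned int *page_bits, uint64_t *table_size)
{
    size_t i;

    for ( i = 0; i < sizeof(page_bits_order) / sizeof(page_bits_order[0]); i++ )
    {
        unsigned int bits = page_bits_order[i];
        uint64_t page = UINT64_C(1) << bits;
        uint64_t size;

        if ( !(supported_mask & (UINT32_C(1) << bits)) )
            continue;

        /* Round up to a whole number of ITS pages. */
        size = (nr_items * entry_size + page - 1) & ~(page - 1);
        /* A maximum of 256 ITS pages is allowed. */
        if ( (size >> bits) > ITS_PAGES_MAX )
            size = page * ITS_PAGES_MAX;

        *page_bits = bits;
        *table_size = size;
        /* calloc() hands back zeroed memory, so no stale entries are visible. */
        return calloc(1, size);
    }

    return NULL;
}
```

For example, 1024 entries of 8 bytes on hardware accepting 4K and 64K pages
end up in a single 64K page, while a request for 2^20 entries with only 4K
pages available is clamped to 256 pages (1MB).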


-- 
Shanker Donthineni
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.



* Re: [PATCH 07/28] ARM: GICv3 ITS: introduce device mapping
  2017-02-22  7:06   ` Vijay Kilari
@ 2017-02-24 19:37     ` Shanker Donthineni
  0 siblings, 0 replies; 106+ messages in thread
From: Shanker Donthineni @ 2017-02-24 19:37 UTC (permalink / raw)
  To: Vijay Kilari, Andre Przywara; +Cc: xen-devel, Julien Grall, Stefano Stabellini

Hi Andre,


On 02/22/2017 01:06 AM, Vijay Kilari wrote:
> Hi Andre,
>
> On Tue, Jan 31, 2017 at 12:01 AM, Andre Przywara <andre.przywara@arm.com> wrote:
>> The ITS uses device IDs to map LPIs to a device. Dom0 will later use
>> those IDs, which we directly pass on to the host.
>> For this we have to map each device that Dom0 may request to a host
>> ITS device with the same identifier.
>> Allocate the respective memory and enter each device into an rbtree to
>> later be able to iterate over it or to easily teardown guests.
>>
>> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
>> ---
>>   xen/arch/arm/gic-v3-its.c        | 188 ++++++++++++++++++++++++++++++++++++++-
>>   xen/arch/arm/vgic-v3.c           |   3 +
>>   xen/include/asm-arm/domain.h     |   3 +
>>   xen/include/asm-arm/gic_v3_its.h |  28 ++++++
>>   4 files changed, 221 insertions(+), 1 deletion(-)
>>
>> diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
>> index 6578e8a..4a3a394 100644
>> --- a/xen/arch/arm/gic-v3-its.c
>> +++ b/xen/arch/arm/gic-v3-its.c
>> @@ -21,8 +21,10 @@
>>   #include <xen/device_tree.h>
>>   #include <xen/delay.h>
>>   #include <xen/libfdt/libfdt.h>
>> -#include <xen/mm.h>
>> +#include <xen/rbtree.h>
>> +#include <xen/sched.h>
>>   #include <xen/sizes.h>
>> +#include <xen/domain.h>
>>   #include <asm/gic.h>
>>   #include <asm/gic_v3_defs.h>
>>   #include <asm/gic_v3_its.h>
>> @@ -94,6 +96,21 @@ static int its_send_cmd_mapc(struct host_its *its, int collection_id, int cpu)
>>       return its_send_command(its, cmd);
>>   }
>>
>> +static int its_send_cmd_mapd(struct host_its *its, uint32_t deviceid,
>> +                             int size, uint64_t itt_addr, bool valid)
>> +{
>> +    uint64_t cmd[4];
>> +
>> +    cmd[0] = GITS_CMD_MAPD | ((uint64_t)deviceid << 32);
>> +    cmd[1] = size & GENMASK(4, 0);
>> +    cmd[2] = itt_addr & GENMASK(51, 8);
>> +    if ( valid )
>> +        cmd[2] |= GITS_VALID_BIT;
>> +    cmd[3] = 0x00;
>> +
>> +    return its_send_command(its, cmd);
>> +}
>> +
>>   /* Set up the (1:1) collection mapping for the given host CPU. */
>>   int gicv3_its_setup_collection(int cpu)
>>   {
>> @@ -293,6 +310,7 @@ int gicv3_its_init(struct host_its *hw_its)
>>       reg = readq_relaxed(hw_its->its_base + GITS_TYPER);
>>       if ( reg & GITS_TYPER_PTA )
>>           hw_its->flags |= HOST_ITS_USES_PTA;
>> +    hw_its->itte_size = GITS_TYPER_ITT_SIZE(reg);
>>
>>       for ( i = 0; i < GITS_BASER_NR_REGS; i++ )
>>       {
>> @@ -339,6 +357,173 @@ int gicv3_its_init(struct host_its *hw_its)
>>       return 0;
>>   }
>>
>> +static void remove_mapped_guest_device(struct its_devices *dev)
>> +{
>> +    if ( dev->hw_its )
>> +        its_send_cmd_mapd(dev->hw_its, dev->host_devid, 0, 0, false);
>> +
>> +    xfree(dev->itt_addr);
>> +    xfree(dev);
>> +}
>> +
>> +int gicv3_its_map_guest_device(struct domain *d, int host_devid,
>> +                               int guest_devid, int bits, bool valid)
>> +{
>> +    void *itt_addr = NULL;
>> +    struct its_devices *dev, *temp;
>> +    struct rb_node **new = &d->arch.vgic.its_devices.rb_node, *parent = NULL;
>> +    struct host_its *hw_its;
>> +    int ret;
>> +
>> +    /* check for already existing mappings */
>> +    spin_lock(&d->arch.vgic.its_devices_lock);
>> +    while (*new)
>> +    {
>> +        temp = rb_entry(*new, struct its_devices, rbnode);
>> +
>> +        if ( temp->guest_devid == guest_devid )
>> +        {
>> +            if ( !valid )
>> +                rb_erase(&temp->rbnode, &d->arch.vgic.its_devices);
>> +
>> +            spin_unlock(&d->arch.vgic.its_devices_lock);
>> +
>> +            if ( valid )
>> +                return -EBUSY;
>> +
>> +            remove_mapped_guest_device(temp);
>> +
>> +            return 0;
>> +        }
>> +
>> +        if ( guest_devid < temp->guest_devid )
>> +            new = &((*new)->rb_right);
> Shouldn't be rb_left here?
>> +        else
>> +            new = &((*new)->rb_left);
> Shouldn't be rb_right here?
>

I found the same thing when I tested the ITS feature on a Qualcomm platform.
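
For reference, the ordering the lookup in gicv3_its_unmap_device() expects
is: smaller guest_devid to the left, larger to the right. The sketch below
shows that ordering on a plain unbalanced binary search tree (not Xen's
rbtree), with one helper that serves both the "does it exist" check and the
insertion. struct node, find_link() and lookup() are made-up names for this
illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Plain (unbalanced) binary search tree node, standing in for the rbtree. */
struct node {
    uint32_t guest_devid;
    struct node *left, *right;
};

/*
 * Walk down to the link where guest_devid belongs: smaller keys descend
 * left, larger keys descend right. Returns NULL if the key already exists.
 */
static struct node **find_link(struct node **root, uint32_t guest_devid)
{
    struct node **link = root;

    while ( *link )
    {
        if ( guest_devid == (*link)->guest_devid )
            return NULL;
        if ( guest_devid < (*link)->guest_devid )
            link = &(*link)->left;
        else
            link = &(*link)->right;
    }

    return link;
}

/* Lookup using the same ordering as the insertion above. */
static struct node *lookup(struct node *n, uint32_t guest_devid)
{
    while ( n && n->guest_devid != guest_devid )
        n = guest_devid < n->guest_devid ? n->left : n->right;

    return n;
}
```

If insertion and lookup disagree on the direction, any node that is not the
root can become unreachable for the lookup, which would match the failures
seen in testing.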

>> +    }
> I see duplicate code in find and inserting node into rb_tree.
>
>> +
>> +    if ( !valid )
>> +    {
>> +        ret = -ENOENT;
>> +        goto out_unlock;
>> +    }
>> +
>> +    /*
>> +     * TODO: Work out the correct hardware ITS to use here.
>> +     * Maybe a per-platform function: devid -> ITS?
>> +     * Or parsing the DT to find the msi_parent?
>> +     * Or get Dom0 to give us this information?
>> +     * For now just use the first ITS.
>> +     */
>> +    hw_its = list_first_entry(&host_its_list, struct host_its, entry);
>> +
>> +    ret = -ENOMEM;
>> +
>> +    itt_addr = _xmalloc(BIT(bits) * hw_its->itte_size, 256);

Please use _xzalloc() to avoid potential issues with the ITS hardware
prefetching feature.
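
As a rough user-space illustration of what is being asked for (a table that
is both 256-byte aligned and zero-filled before the ITS ever sees it), see
the sketch below. It uses C11 aligned_alloc() plus memset() in place of
Xen's _xzalloc(); alloc_zeroed_itt() and ITT_ALIGN are hypothetical names.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define ITT_ALIGN 256   /* the MAPD ITT address field drops bits [7:0] */

/*
 * Allocate a zero-filled, 256-byte aligned table for 2^bits entries of
 * itte_size bytes each. Returns NULL on failure; release with free().
 */
static void *alloc_zeroed_itt(unsigned int bits, size_t itte_size)
{
    size_t size = ((size_t)1 << bits) * itte_size;
    void *itt;

    /* aligned_alloc() wants the size to be a multiple of the alignment. */
    size = (size + ITT_ALIGN - 1) & ~(size_t)(ITT_ALIGN - 1);
    itt = aligned_alloc(ITT_ALIGN, size);
    if ( itt )
        memset(itt, 0, size);

    return itt;
}
```

The point is only that the zeroing happens before the address is handed to
the hardware, so a prefetch never picks up leftover heap contents.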

-- 
Shanker Donthineni
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.



* Re: [PATCH 09/28] ARM: GICv3 ITS: map device and LPIs to the ITS on physdev_op hypercall
  2017-01-31 16:18                 ` Julien Grall
@ 2017-02-24 19:57                   ` Shanker Donthineni
  2017-02-24 20:28                     ` Julien Grall
  2017-02-27 17:20                     ` Andre Przywara
  0 siblings, 2 replies; 106+ messages in thread
From: Shanker Donthineni @ 2017-02-24 19:57 UTC (permalink / raw)
  To: Julien Grall, Jaggi, Manish, Andre Przywara, Stefano Stabellini
  Cc: xen-devel, Nair, Jayachandran, Vijay Kilari, Kapoor, Prasun

Hi Julien,


On 01/31/2017 10:18 AM, Julien Grall wrote:
>
>
> On 31/01/17 16:02, Jaggi, Manish wrote:
>>
>>
>> On 1/31/2017 8:47 PM, Julien Grall wrote:
>>>
>>>
>>> On 31/01/17 14:08, Jaggi, Manish wrote:
>>>> Hi Julien,
>>>>
>>>> On 1/31/2017 7:16 PM, Julien Grall wrote:
>>>>> On 31/01/17 13:19, Jaggi, Manish wrote:
>>>>>> On 1/31/2017 6:13 PM, Julien Grall wrote:
>>>>>>> On 31/01/17 10:29, Jaggi, Manish wrote:
>>>> If you please go back to your comment where you wrote "we need to 
>>>> find another way to get the DeviceID", I was referring that we 
>>>> should add that another way in this series so that correct DeviceID 
>>>> is programmed in ITS.
>>>
>>> This is not the first time I am saying this, just saying "we should 
>>> add that another way..." is not helpful. You should also provide 
>>> some details on what you would do.
>>>
>> Julien, As you suggested we need to find another way, I assumed you 
>> had something in mind.
>
> I gave suggestions on my e-mail but you may have missed it...
>
>> Since we both agree that sbdf!=deviceID, the current series of ITS 
>> patches will program the incorrect deviceID so there is a need to
>> have a way to map sbdf with deviceID in xen.
>>
>> One option could be to add a new hypercall to supply sbdf and 
>> deviceID to xen.
>
> ... as well as the part where I am saying that I am not in favor to
> implement an hypercall temporarily, and against adding a new hypercall
> for only a couple of weeks. As you may know PHYSDEV hypercall are part
> of the stable ABI and once they are added they cannot be removed.
>
> So we need to be sure the hypercall is necessary. In this case, the
> hypercall is not necessary as all the information can be found in the
> firmware tables. However this is not implemented yet and part of the
> discussion on PCI Passthrough (see [1]).
>
> We need a temporary solution that does not involve any commitment on the
> ABI until Xen is able to discover PCI.
>

Why can't we handle ITS device creation whenever the virtual ITS driver
receives a MAPD command from dom0/domU? In the case of dom0 it's
straightforward: dom0 always passes the real ITS device through the MAPD
command. This way we can support PCIe devices without the hard-coded MSI(x)
limit of 32, and platform devices transparently. I used the code below to
test platform and PCIe device MSI(x) functionality on the QDF2400 server
platform.

@@ -383,10 +384,17 @@ static int its_handle_mapd(struct virt_its *its, uint64_t *cmdptr)
     int size = its_cmd_get_size(cmdptr);
     bool valid = its_cmd_get_validbit(cmdptr);
     paddr_t itt_addr = its_cmd_mask_field(cmdptr, 2, 0, 52) & GENMASK(51, 8);
+    int ret;

     if ( !its->dev_table )
         return -1;

+    size = size < 4 ? 4 : size;
+    ret = gicv3_its_map_guest_device(hardware_domain, devid, devid, size + 1,
+                                     valid);
+    if ( ret < 0 )
+        return ret;
+
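
To make the fields involved explicit, here is a small stand-alone sketch of
the MAPD command layout as its_send_cmd_mapd() in patch 07 builds it, plus
the "at least 32 events" rule from the hunk above. The helper names are made
up, and the constants (command number 0x08, valid flag in bit 63 of the
third doubleword) are my reading of the GICv3 spec rather than something
copied from the Xen headers.

```c
#include <stdbool.h>
#include <stdint.h>

#define GENMASK64(h, l) (((~UINT64_C(0)) >> (63 - (h))) & ((~UINT64_C(0)) << (l)))
#define CMD_MAPD   UINT64_C(0x08)          /* assumed MAPD command number */
#define VALID_BIT  (UINT64_C(1) << 63)     /* assumed position of the V flag */

/* Encode a MAPD command with the layout used by its_send_cmd_mapd(). */
static void mapd_encode(uint64_t cmd[4], uint32_t devid, unsigned int size,
                        uint64_t itt_addr, bool valid)
{
    cmd[0] = CMD_MAPD | ((uint64_t)devid << 32);
    cmd[1] = size & GENMASK64(4, 0);
    cmd[2] = (itt_addr & GENMASK64(51, 8)) | (valid ? VALID_BIT : 0);
    cmd[3] = 0;
}

/* Fields a virtual ITS would pull out of a guest-written MAPD command. */
static uint32_t mapd_devid(const uint64_t cmd[4]) { return cmd[0] >> 32; }
static unsigned int mapd_size(const uint64_t cmd[4]) { return cmd[1] & 0x1f; }
static bool mapd_valid(const uint64_t cmd[4]) { return cmd[2] & VALID_BIT; }

/*
 * Number of EventID bits to allocate on the host: the size field holds
 * "bits minus one", and fewer than 5 bits (32 events) are never allocated.
 */
static unsigned int mapd_host_bits(const uint64_t cmd[4])
{
    unsigned int size = mapd_size(cmd);

    return (size < 4 ? 4 : size) + 1;
}
```

So a guest MAPD with a size field of 2 (8 events) would still get a host ITT
for 32 events, and anything larger is passed through as requested.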


-- 
Shanker Donthineni
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.



* Re: [PATCH 09/28] ARM: GICv3 ITS: map device and LPIs to the ITS on physdev_op hypercall
  2017-02-24 19:57                   ` Shanker Donthineni
@ 2017-02-24 20:28                     ` Julien Grall
  2017-02-27 17:20                     ` Andre Przywara
  1 sibling, 0 replies; 106+ messages in thread
From: Julien Grall @ 2017-02-24 20:28 UTC (permalink / raw)
  To: shankerd, Jaggi, Manish, Andre Przywara, Stefano Stabellini
  Cc: xen-devel, Nair, Jayachandran, nd, Vijay Kilari, Kapoor, Prasun

On 24/02/2017 19:57, Shanker Donthineni wrote:
> Hi Julien,

Hi Shanker,

>
> On 01/31/2017 10:18 AM, Julien Grall wrote:
>>
>>
>> On 31/01/17 16:02, Jaggi, Manish wrote:
>>>
>>>
>>> On 1/31/2017 8:47 PM, Julien Grall wrote:
>>>>
>>>>
>>>> On 31/01/17 14:08, Jaggi, Manish wrote:
>>>>> Hi Julien,
>>>>>
>>>>> On 1/31/2017 7:16 PM, Julien Grall wrote:
>>>>>> On 31/01/17 13:19, Jaggi, Manish wrote:
>>>>>>> On 1/31/2017 6:13 PM, Julien Grall wrote:
>>>>>>>> On 31/01/17 10:29, Jaggi, Manish wrote:
>>>>> If you please go back to your comment where you wrote "we need to
>>>>> find another way to get the DeviceID", I was referring that we
>>>>> should add that another way in this series so that correct DeviceID
>>>>> is programmed in ITS.
>>>>
>>>> This is not the first time I am saying this, just saying "we should
>>>> add that another way..." is not helpful. You should also provide
>>>> some details on what you would do.
>>>>
>>> Julien, As you suggested we need to find another way, I assumed you
>>> had something in mind.
>>
>> I gave suggestions on my e-mail but you may have missed it...
>>
>>> Since we both agree that sbdf!=deviceID, the current series of ITS
>>> patches will program the incorrect deviceID so there is a need to
>>> have a way to map sbdf with deviceID in xen.
>>>
>>> One option could be to add a new hypercall to supply sbdf and
>>> deviceID to xen.
>>
>> ... as well as the part where I am saying that I am not in favor to
>> implement an hypercall temporarily, and against adding a new hypercall
>> for only a couple of weeks. As you may know PHYSDEV hypercall are part
>> of the stable ABI and once they are added they cannot be removed.
>>
>> So we need to be sure the hypercall is necessary. In this case, the
>> hypercall is not necessary as all the information can be found in the
>> firmware tables. However this is not implemented yet and part of the
>> discussion on PCI Passthrough (see [1]).
>>
>> We need a temporary solution that does not involve any commitment on the
>> ABI until Xen is able to discover PCI.
>>
>
> Why can't we handle ITS device creation whenever the virtual ITS driver
> receives a MAPD command from dom0/domU? In the case of dom0 it's
> straightforward: dom0 always passes the real ITS device through the MAPD
> command.

I guess this can work. Note that, on a separate thread (see [1]), I
suggested decoupling the virtual DeviceID from the physical one for DOM0
to simplify the generation of the IORT.

So we would have to be a bit more clever here. But that's probably a
separate subject and can go into Xen in a separate series.
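
Just to sketch what "a bit more clever" could mean: once the virtual and
physical DeviceIDs differ, the MAPD handler needs some translation step
before touching the host ITS. Below is a minimal stand-alone example using a
table sorted by guest DeviceID and a binary search. struct devid_map and
translate_devid() are purely hypothetical, nothing like this exists in the
series.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-domain translation entry: guest-visible -> host DeviceID. */
struct devid_map {
    uint32_t guest_devid;
    uint32_t host_devid;
};

/*
 * Translate a guest DeviceID using a table sorted by guest_devid.
 * Returns 0 and fills *host_devid on success, -1 if there is no mapping.
 */
static int translate_devid(const struct devid_map *map, size_t n,
                           uint32_t guest_devid, uint32_t *host_devid)
{
    size_t lo = 0, hi = n;

    while ( lo < hi )
    {
        size_t mid = lo + (hi - lo) / 2;

        if ( map[mid].guest_devid == guest_devid )
        {
            *host_devid = map[mid].host_devid;
            return 0;
        }
        if ( map[mid].guest_devid < guest_devid )
            lo = mid + 1;
        else
            hi = mid;
    }

    return -1;
}
```

A MAPD for an unknown guest DeviceID would then simply fail instead of
reaching the hardware.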

> This way we can support PCIe devices without the hard-coded MSI(x) limit
> of 32, and platform devices transparently. I used the code below to test
> platform and PCIe device MSI(x) functionality on the QDF2400 server
> platform.
>
> @@ -383,10 +384,17 @@ static int its_handle_mapd(struct virt_its *its, uint64_t *cmdptr)
>      int size = its_cmd_get_size(cmdptr);
>      bool valid = its_cmd_get_validbit(cmdptr);
>      paddr_t itt_addr = its_cmd_mask_field(cmdptr, 2, 0, 52) & GENMASK(51, 8);
> +    int ret;
>
>      if ( !its->dev_table )
>          return -1;
>
> +    size = size < 4 ? 4 : size;
> +    ret = gicv3_its_map_guest_device(hardware_domain, devid, devid, size + 1,
> +                                     valid);
> +    if ( ret < 0 )
> +        return ret;
> +
>

Cheers,

[1] https://lists.xen.org/archives/html/xen-devel/2017-02/msg02782.html

-- 
Julien Grall


* Re: [PATCH 04/28] ARM: GICv3 ITS: allocate device and collection table
  2017-02-24 19:29     ` Shanker Donthineni
@ 2017-02-27 10:23       ` Andre Przywara
  0 siblings, 0 replies; 106+ messages in thread
From: Andre Przywara @ 2017-02-27 10:23 UTC (permalink / raw)
  To: shankerd, Stefano Stabellini, Julien Grall; +Cc: xen-devel, Vijay Kilari

Hi Shanker,

thanks for having a look.

On 24/02/17 19:29, Shanker Donthineni wrote:
> Hi Andre
>
>
> On 02/16/2017 01:03 PM, Shanker Donthineni wrote:
>> Hi Andre,
>>
>>
>> On 01/30/2017 12:31 PM, Andre Przywara wrote:
>>> Each ITS maps a pair of a DeviceID (usually the PCI b/d/f triplet) and
>>> an EventID (the MSI payload or interrupt ID) to a pair of LPI number
>>> and collection ID, which points to the target CPU.
>>> This mapping is stored in the device and collection tables, which
>>> software
>>> has to provide for the ITS to use.
>>> Allocate the required memory and hand it the ITS.
>>> The maximum number of devices is limited to a compile-time constant
>>> exposed in Kconfig.
>>>
>>> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
>>> ---
>>>   xen/arch/arm/Kconfig             |  14 +++++
>>>   xen/arch/arm/gic-v3-its.c        | 129
>>> +++++++++++++++++++++++++++++++++++++++
>>>   xen/arch/arm/gic-v3.c            |   5 ++
>>>   xen/include/asm-arm/gic_v3_its.h |  55 ++++++++++++++++-
>>>   4 files changed, 202 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
>>> index 71734a1..81bc233 100644
>>> --- a/xen/arch/arm/Kconfig
>>> +++ b/xen/arch/arm/Kconfig
>>> @@ -64,6 +64,20 @@ config MAX_PHYS_LPI_BITS
>>>             This can be overridden on the command line with the max_lpi_bits
>>>             parameter.
>>>
>>> +config MAX_PHYS_ITS_DEVICE_BITS
>>> +        depends on HAS_ITS
>>> +        int "Number of device bits the ITS supports"
>>> +        range 1 32
>>> +        default "10"
>>> +        help
>>> +          Specifies the maximum number of devices which want to use the ITS.
>>> +          Xen needs to allocate memory for the whole range very early.
>>> +          The allocation scheme may be sparse, so a much larger number must
>>> +          be supported to cover devices with a high bus number or those on
>>> +          separate bus segments.
>>> +          This can be overridden on the command line with the
>>> +          max_its_device_bits parameter.
>>> +
>>>   endmenu
>>>
>>>   menu "ARM errata workaround via the alternative framework"
>>> diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
>>> index ff0f571..c31fef6 100644
>>> --- a/xen/arch/arm/gic-v3-its.c
>>> +++ b/xen/arch/arm/gic-v3-its.c
>>> @@ -20,9 +20,138 @@
>>>   #include <xen/lib.h>
>>>   #include <xen/device_tree.h>
>>>   #include <xen/libfdt/libfdt.h>
>>> +#include <xen/mm.h>
>>> +#include <xen/sizes.h>
>>>   #include <asm/gic.h>
>>>   #include <asm/gic_v3_defs.h>
>>>   #include <asm/gic_v3_its.h>
>>> +#include <asm/io.h>
>>> +
>>> +#define BASER_ATTR_MASK                                           \
>>> +        ((0x3UL << GITS_BASER_SHAREABILITY_SHIFT)               | \
>>> +         (0x7UL << GITS_BASER_OUTER_CACHEABILITY_SHIFT)         | \
>>> +         (0x7UL << GITS_BASER_INNER_CACHEABILITY_SHIFT))
>>> +#define BASER_RO_MASK   (GENMASK(58, 56) | GENMASK(52, 48))
>>> +
>>> +static uint64_t encode_phys_addr(paddr_t addr, int page_bits)
>>> +{
>>> +    uint64_t ret;
>>> +
>>> +    if ( page_bits < 16 )
>>> +        return (uint64_t)addr & GENMASK(47, page_bits);
>>> +
>>> +    ret = addr & GENMASK(47, 16);
>>> +    return ret | (addr & GENMASK(51, 48)) >> (48 - 12);
>>> +}
>>> +
>>> +#define PAGE_BITS(sz) ((sz) * 2 + PAGE_SHIFT)
>>> +
>>> +static int its_map_baser(void __iomem *basereg, uint64_t regc, int
>>> nr_items)
>>> +{
>>> +    uint64_t attr, reg;
>>> +    int entry_size = ((regc >> GITS_BASER_ENTRY_SIZE_SHIFT) & 0x1f)
>>> + 1;
>>> +    int pagesz = 0, order, table_size;
>
> Please try the ITS page sizes in the order 64K, 16K and 4K, as the Linux ITS
> driver does, so that a flat table can cover more ITS devices.
>
>>> +    void *buffer = NULL;
>>> +
>>> +    attr  = GIC_BASER_InnerShareable << GITS_BASER_SHAREABILITY_SHIFT;
>>> +    attr |= GIC_BASER_CACHE_SameAsInner <<
>>> GITS_BASER_OUTER_CACHEABILITY_SHIFT;
>>> +    attr |= GIC_BASER_CACHE_RaWaWb <<
>>> GITS_BASER_INNER_CACHEABILITY_SHIFT;
>>> +
>>> +    /*
>>> +     * Setup the BASE register with the attributes that we like.
>>> Then read
>>> +     * it back and see what sticks (page size, cacheability and
>>> shareability
>>> +     * attributes), retrying if necessary.
>>> +     */
>>> +    while ( 1 )
>>> +    {
>>> +        table_size = ROUNDUP(nr_items * entry_size,
>>> BIT(PAGE_BITS(pagesz)));
>>> +        order = get_order_from_bytes(table_size);
>>> +
>
> Please limit the table to 256 ITS pages; the ITS spec doesn't support more
> than that.
>
>        /* Maximum of 256 ITS pages are allowed */
>        if ( (table_size >> PAGE_BITS(pagesz)) > GITS_BASER_PAGES_MAX )
>                table_size = BIT(PAGE_BITS(pagesz)) * GITS_BASER_PAGES_MAX;
>
>>> +        if ( !buffer )
>>> +            buffer = alloc_xenheap_pages(order, 0);
>>> +        if ( !buffer )
>>> +            return -ENOMEM;
>>> +
>
> Please zero the memory: memset(buffer, 0x00, PAGE_SIZE << order)

All three comments make sense, thanks for pointing these out.
I will incorporate those changes into the next post.

Cheers,
Andre.

* Re: [PATCH 03/28] ARM: GICv3: allocate LPI pending and property table
  2017-02-06 16:26   ` Julien Grall
@ 2017-02-27 11:34     ` Andre Przywara
  2017-02-27 12:48       ` Julien Grall
  0 siblings, 1 reply; 106+ messages in thread
From: Andre Przywara @ 2017-02-27 11:34 UTC (permalink / raw)
  To: Julien Grall, Stefano Stabellini; +Cc: xen-devel, Vijay Kilari

Hi,

"Yes, will fix" to everything not explicitly mentioned below.

On 06/02/17 16:26, Julien Grall wrote:
> Hi Andre,
> 
> On 30/01/17 18:31, Andre Przywara wrote:
>> The ARM GICv3 provides a new kind of interrupt called LPIs.
>> The pending bits and the configuration data (priority, enable bits) for
>> those LPIs are stored in tables in normal memory, which software has to
>> provide to the hardware.
>> Allocate the required memory, initialize it and hand it over to each
>> redistributor. The maximum number of LPIs to be used can be adjusted with
>> the command line option "max_lpi_bits", which defaults to a compile time
>> constant exposed in Kconfig.
>>
>> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
>> ---
>>  xen/arch/arm/Kconfig              |  15 +++++
>>  xen/arch/arm/Makefile             |   1 +
>>  xen/arch/arm/gic-v3-lpi.c         | 129
>> ++++++++++++++++++++++++++++++++++++++
>>  xen/arch/arm/gic-v3.c             |  44 +++++++++++++
>>  xen/include/asm-arm/bitops.h      |   1 +
>>  xen/include/asm-arm/gic.h         |   2 +
>>  xen/include/asm-arm/gic_v3_defs.h |  52 ++++++++++++++-
>>  xen/include/asm-arm/gic_v3_its.h  |  22 ++++++-
>>  8 files changed, 264 insertions(+), 2 deletions(-)
>>  create mode 100644 xen/arch/arm/gic-v3-lpi.c
>>
>> diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
>> index bf64c61..71734a1 100644
>> --- a/xen/arch/arm/Kconfig
>> +++ b/xen/arch/arm/Kconfig
>> @@ -49,6 +49,21 @@ config HAS_ITS
>>          bool "GICv3 ITS MSI controller support"
>>          depends on HAS_GICV3
>>
>> +config MAX_PHYS_LPI_BITS
>> +        depends on HAS_ITS
>> +        int "Maximum bits for GICv3 host LPIs (14-32)"
>> +        range 14 32
>> +        default "20"
>> +        help
>> +          Specifies the maximum number of LPIs (in bits) Xen should take
>> +          care of. The host ITS may provide support for a very large
>> number
>> +          of supported LPIs, for all of which we may not want to
>> allocate
>> +          memory, so this number here allows to limit this.
> 
> I think the description is misleading, if a user wants 8K worth of LPIs
> by default, he would have to use 14 and not 13.
> 
> Furthermore, you provide both a runtime option (via command line) and
> build time option (via Kconfig). You don't express what the
> difference between the two is and how they are supposed to co-exist.
> 
> Anyway, IMHO the command line option should be sufficient to allow
> override if necessary. So I would drop the Kconfig version.

The idea is simply to let the Kconfig version specify the default value
if there is no command line option. So giving a command line option will
always override Kconfig.
Should we know a sensible default value, we can indeed get rid of this
Kconfig snippet here.
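
To spell out the intended precedence, here is a minimal sketch. It assumes that 0 stands for "no command line option given" and uses the default of 20 from the Kconfig hunk above; effective_lpi_bits() is a hypothetical name used only for illustration.

```c
/* Build-time default, value taken from the Kconfig hunk in this patch. */
#define CONFIG_MAX_PHYS_LPI_BITS 20

/*
 * Pick the number of LPI ID bits to use.
 * hw_bits:      what the hardware reports
 * cmdline_bits: value given on the command line, 0 if absent (assumption)
 */
static unsigned int effective_lpi_bits(unsigned int hw_bits,
                                       unsigned int cmdline_bits)
{
    /* The command line, if present, always overrides the Kconfig default. */
    unsigned int limit = cmdline_bits ? cmdline_bits
                                      : CONFIG_MAX_PHYS_LPI_BITS;

    /* Never use more bits than the hardware supports. */
    return hw_bits < limit ? hw_bits : limit;
}
```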

>> +          Xen itself does not know how many LPIs domains will ever need
>> +          beforehand.
>> +          This can be overriden on the command line with the
>> max_lpi_bits
> 
> s/overriden/overridden/
> 
>> +          parameter.
>> +
>>  endmenu
>>
>>  menu "ARM errata workaround via the alternative framework"
>> diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
>> index 5f4ff23..4ccf2eb 100644
>> --- a/xen/arch/arm/Makefile
>> +++ b/xen/arch/arm/Makefile
>> @@ -19,6 +19,7 @@ obj-y += gic.o
>>  obj-y += gic-v2.o
>>  obj-$(CONFIG_HAS_GICV3) += gic-v3.o
>>  obj-$(CONFIG_HAS_ITS) += gic-v3-its.o
>> +obj-$(CONFIG_HAS_ITS) += gic-v3-lpi.o
>>  obj-y += guestcopy.o
>>  obj-y += hvm.o
>>  obj-y += io.o
>> diff --git a/xen/arch/arm/gic-v3-lpi.c b/xen/arch/arm/gic-v3-lpi.c
>> new file mode 100644
>> index 0000000..e2fc901
>> --- /dev/null
>> +++ b/xen/arch/arm/gic-v3-lpi.c
>> @@ -0,0 +1,129 @@
>> +/*
>> + * xen/arch/arm/gic-v3-lpi.c
>> + *
>> + * ARM GICv3 Locality-specific Peripheral Interrupts (LPI) support
>> + *
>> + * Copyright (C) 2016,2017 - ARM Ltd
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License as published by
>> + * the Free Software Foundation; either version 2 of the License, or
>> + * (at your option) any later version.
>> + *
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> + * GNU General Public License for more details.
>> + */
>> +
>> +#include <xen/config.h>
> 
> xen/config.h is not necessary.
> 
>> +#include <xen/lib.h>
>> +#include <xen/mm.h>
>> +#include <xen/sizes.h>
>> +#include <asm/gic.h>
>> +#include <asm/gic_v3_defs.h>
>> +#include <asm/gic_v3_its.h>
>> +
>> +/* Global state */
>> +static struct {
>> +    uint8_t *lpi_property;
>> +    unsigned int host_lpi_bits;
> 
> On the previous version, Stefano suggested to rename this to
> phys_lpi_bits + adding a comment as you store the number of bits.
> 
> However, looking at the usage, the number of bits is only required during
> initialization. Runtime code (such as gic_get_host_lpi) uses the
> number of LPIs and therefore requires extra instructions to compute
> the value.
> 
> So I would prefer if you store the number of LPIs here to optimize the
> common case.

Well, it's a shift, not the 5th root. This is in fact the only
difference between the two approaches:
000000000024d770 <gic_get_host_lpi>:
  ...
  24d788:       9ac22421        lsr     x1, x1, x2

And I was thinking about it before, my rationale for not doing it was:
- We need both the number and the shift, and it's much easier to get the
number from the bits than the other way around.
- The bits are the real source of the information (from TYPER).
- Having a number would always raise the question whether we need to
make sure that there is no more than one bit set in there and how to deal
with it.
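
To illustrate the first point, a rough sketch of both directions. The helper names are hypothetical; LPI_OFFSET is the 8192 from this patch. Going from bits to a count is one shift and a subtraction, while the reverse needs a loop (or a count-leading-zeros instruction) plus a decision on rounding.

```c
#include <stdint.h>

#define LPI_OFFSET 8192   /* LPI numbers start at 8192 */

/* Bits -> number of usable LPIs: a shift and a subtraction. */
static uint64_t nr_lpis_from_idbits(unsigned int idbits)
{
    return (UINT64_C(1) << idbits) - LPI_OFFSET;
}

/*
 * Number of interrupt IDs -> bits: smallest bit count whose range
 * covers nr_ids, i.e. this rounds up for non-powers of two.
 */
static unsigned int idbits_from_nr_ids(uint64_t nr_ids)
{
    unsigned int bits = 0;

    while ( bits < 64 && (UINT64_C(1) << bits) < nr_ids )
        bits++;

    return bits;
}
```

With 14 bits this yields 8192 LPIs, which matches the "14 and not 13" remark on the Kconfig help text.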

> Also, I find the naming "id_bits" confusing because you store the number
> of bits to encode the max LPI ID and not the number of bits to encode
> the number of LPI.

"IDbits" is the spec term used. It describes how many bits you need to
describe an LPI number. LPIs start at number 8192.
GICv3 spec, 8.9.24:
IDbits, bits [23:19]
       The number of interrupt identifier bits supported, minus one.

I can rename this to "phys_lpi_idbits", if that sounds reasonable.
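
For clarity, this is how I read the field. The sketch below assumes the bit position [23:19] quoted above; the macro names are made up for the example and are not the ones from gic_v3_defs.h.

```c
#include <stdint.h>

/* Assumed layout: IDbits lives in GICD_TYPER bits [23:19]. */
#define TYPER_IDBITS_SHIFT 19
#define TYPER_IDBITS_MASK  0x1fU

/*
 * The field holds "number of interrupt identifier bits, minus one",
 * so add one to get the real number of bits.
 */
static unsigned int gicd_typer_idbits(uint32_t typer)
{
    return ((typer >> TYPER_IDBITS_SHIFT) & TYPER_IDBITS_MASK) + 1;
}
```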

>> +} lpi_data;
>> +
>> +/* Pending table for each redistributor */
>> +static DEFINE_PER_CPU(void *, pending_table);
>> +
>> +#define MAX_PHYS_LPIS   (BIT_ULL(lpi_data.host_lpi_bits) - LPI_OFFSET)
>> +
>> +uint64_t gicv3_lpi_allocate_pendtable(void)
>> +{
>> +    uint64_t reg;
>> +    void *pendtable;
>> +
>> +    reg  = GIC_BASER_CACHE_RaWaWb <<
>> GICR_PENDBASER_INNER_CACHEABILITY_SHIFT;
>> +    reg |= GIC_BASER_CACHE_SameAsInner <<
>> GICR_PENDBASER_OUTER_CACHEABILITY_SHIFT;
>> +    reg |= GIC_BASER_InnerShareable <<
>> GICR_PENDBASER_SHAREABILITY_SHIFT;
>> +
>> +    if ( !this_cpu(pending_table) )
>> +    {
>> +        /*
>> +         * The pending table holds one bit per LPI and even covers
>> bits for
>> +         * interrupt IDs below 8192, so we allocate the full range.
>> +         * The GICv3 imposes a 64KB alignment requirement.
>> +         */
>> +        pendtable = _xmalloc(BIT_ULL(lpi_data.host_lpi_bits) / 8,
>> SZ_64K);
>> +        if ( !pendtable )
>> +            return 0;
>> +
>> +        memset(pendtable, 0, BIT_ULL(lpi_data.host_lpi_bits) / 8);
> 
> You can use _xzalloc to do the allocation and the zeroing in one call.
> 
>> +        __flush_dcache_area(pendtable,
>> BIT_ULL(lpi_data.host_lpi_bits) / 8);
> 
> Please use clean_and_invalidate_dcache_va_range.
> 
>> +
>> +        this_cpu(pending_table) = pendtable;
>> +    }
>> +    else
>> +    {
>> +        pendtable = this_cpu(pending_table);
>> +    }
> 
> The {} are not necessary.

Sure, this was coming from the Linux rule here: if the if-clause
requires braces, the else-clause has to have those too. To me it looks
weird otherwise. Xen's coding style document doesn't explicitly say.

> Also, on the previous version it was mentioned
> this should be an error and then replaced by a BUG_ON().

I don't see how this would be an actual bug. Yes, the code as it is
right now calls this only once, but it wouldn't hurt if we call this
multiple times. And I am always a bit wary of crashing the system if we
could just carry on instead, but ...

> Please do the change.

... however you like.

>> +
>> +    reg |= GICR_PENDBASER_PTZ;
>> +
>> +    ASSERT(!(virt_to_maddr(pendtable) & ~GENMASK(51, 16)));
> 
> I don't understand the purpose of this ASSERT. The bits 15:0 should
> always be zero otherwise this would be a bug in the memory allocator.
> For bits 63:52, the architecture so far only supports up to 52 bits.

You complained about using a mask on the address before, which I can
understand.
However we are writing to a register described in the spec here and
should observe that the address only goes into bits [51:16]. Any other
bits should not be touched or we are getting weird errors.
So somehow I have to make sure we comply with this.
This could either be a mask or an ASSERT.
If the assert never fires: great. Nothing to worry about here. But I
think this matches the ASSERT idea: we rely on this address being 4K
aligned and not exceeding 52 bits worth of address bits, so we should
check these assumptions.
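
As a sketch of the check I have in mind: GENMASK_ULL below is a local stand-in written for this example (not Xen's GENMASK), and pendbaser_addr_ok() is a hypothetical name. It only tests that the address fits into bits [51:16] of the register.

```c
#include <stdbool.h>
#include <stdint.h>

/* 64-bit mask with bits h..l (inclusive) set. Local helper for this sketch. */
#define GENMASK_ULL(h, l) \
    (((~UINT64_C(0)) << (l)) & ((~UINT64_C(0)) >> (63 - (h))))

/* True if the machine address only uses bits [51:16]. */
static bool pendbaser_addr_ok(uint64_t maddr)
{
    return (maddr & ~GENMASK_ULL(51, 16)) == 0;
}
```

The same predicate could back either an ASSERT or a runtime error path; which of the two is appropriate is exactly the open question here.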

> By keeping this ASSERT, you will make it more difficult for us to extend
> the number of physical address bits supported if ARM decides to bump it.

This GICR register is limited to 52 bits of physical address space
according to the spec.
If we ever upgrade the address size, the GIC spec would need to be
upgraded as well, so chances are we either need to touch that code here
anyways or we use a GICv5 or the like from the beginning.

> So please drop this ASSERT.
> 
>> +    reg |= virt_to_maddr(pendtable);
>> +
>> +    return reg;
>> +}
>> +
>> +uint64_t gicv3_lpi_get_proptable(void)
>> +{
>> +    uint64_t reg;
>> +
>> +    reg  = GIC_BASER_CACHE_RaWaWb <<
>> GICR_PENDBASER_INNER_CACHEABILITY_SHIFT;
>> +    reg |= GIC_BASER_CACHE_SameAsInner <<
>> GICR_PENDBASER_OUTER_CACHEABILITY_SHIFT;
>> +    reg |= GIC_BASER_InnerShareable <<
>> GICR_PENDBASER_SHAREABILITY_SHIFT;
> 
> You are using the shift defines from PENDBASER and not PROPBASER.

Doh. I thought I fixed this.

>> +
>> +    /*
>> +     * The property table is shared across all redistributors, so
>> allocate
>> +     * this only once, but return the same value on subsequent calls.
>> +     */
>> +    if ( !lpi_data.lpi_property )
>> +    {
>> +        /* The property table holds one byte per LPI. */
>> +        void *table = alloc_xenheap_pages(lpi_data.host_lpi_bits -
>> PAGE_SHIFT,
>> +                                          0);
> 
> The property table address has to be 4KB aligned right? If so, I would
> much prefer if you use _xmalloc(BIT_ULL(lpi_data.host_lpi_bits), SZ_4K)
> to avoid relying on PAGE_SIZE == 4KB.
> 
> Also, you will allocate more memory than necessary because the property
> table only covers the LPIs.
> 
>> +
>> +        if ( !table )
>> +            return 0;
>> +
>> +        memset(table, GIC_PRI_IRQ | LPI_PROP_RES1, MAX_PHYS_LPIS);
> 
> You could combine both suggested _xmalloc and memset to 0 in a single
> call to _xzalloc.
> 
>> +        __flush_dcache_area(table, MAX_PHYS_LPIS);
> 
> Please use clean_and_invalidate_dcache_va_range.
> 
>> +        lpi_data.lpi_property = table;
>> +    }
>> +
>> +    reg |= ((lpi_data.host_lpi_bits - 1) << 0);
> 
> Please avoid hardcoded shift.
> 
>> +
>> +    ASSERT(!(virt_to_maddr(lpi_data.lpi_property) & ~GENMASK(51, 12)));
>> +    reg |= virt_to_maddr(lpi_data.lpi_property);
>> +
>> +    return reg;
>> +}
>> +
>> +static unsigned int max_lpi_bits = CONFIG_MAX_PHYS_LPI_BITS;
>> +integer_param("max_lpi_bits", max_lpi_bits);
> 
> Please document this new option in docs/misc/xen-command-line.markdown.
> 
>> +
>> +int gicv3_lpi_init_host_lpis(unsigned int hw_lpi_bits)
> 
> Stefano suggested to rename this function to gicv3_lpi_init_phys_lpis
> and I agree with him here.
> 
>> +{
>> +    lpi_data.host_lpi_bits = min(hw_lpi_bits, max_lpi_bits);
>> +
>> +    printk("GICv3: using at most %lld LPIs on the host.\n",
>> MAX_PHYS_LPIS);
> 
> s/lld/llu/.
> 
>> +
>> +    return 0;
>> +}
>> +
>> +/*
>> + * Local variables:
>> + * mode: C
>> + * c-file-style: "BSD"
>> + * c-basic-offset: 4
>> + * indent-tabs-mode: nil
>> + * End:
>> + */
>> diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
>> index 838dd11..fcb86c8 100644
>> --- a/xen/arch/arm/gic-v3.c
>> +++ b/xen/arch/arm/gic-v3.c
>> @@ -546,6 +546,9 @@ static void __init gicv3_dist_init(void)
>>      type = readl_relaxed(GICD + GICD_TYPER);
>>      nr_lines = 32 * ((type & GICD_TYPE_LINES) + 1);
>>
>> +    if ( type & GICD_TYPE_LPIS )
>> +        gicv3_lpi_init_host_lpis(((type >> GICD_TYPE_ID_BITS_SHIFT) &
>> 0x1f) + 1);
> 
> A macro has been suggested on the previous version here to avoid the
> hardcoded 0x1f.
> 
>> +
>>      printk("GICv3: %d lines, (IID %8.8x).\n",
>>             nr_lines, readl_relaxed(GICD + GICD_IIDR));
>>
>> @@ -616,6 +619,33 @@ static int gicv3_enable_redist(void)
>>      return 0;
>>  }
>>
>> +static int gicv3_rdist_init_lpis(void __iomem * rdist_base)
> 
> I think it would make sense to move this function to gic-v3-lpi.c. So
> only one function rather than 2 would be exported.
> 
>> +{
>> +    uint32_t reg;
>> +    uint64_t table_reg;
>> +
>> +    /* We don't support LPIs without an ITS. */
>> +    if ( list_empty(&host_its_list) )
> 
> See my comment on patch #2 regarding host_its_list.
> 
>> +        return -ENODEV;
>> +
>> +    /* Make sure LPIs are disabled before setting up the tables. */
>> +    reg = readl_relaxed(rdist_base + GICR_CTLR);
>> +    if ( reg & GICR_CTLR_ENABLE_LPIS )
>> +        return -EBUSY;
> 
> Why don't you just disable LPIs here? AFAIK, it should just be
> writel_relaxed(reg & ~GICR_CTLR_ENABLE_LPIS, GICR_CTLR);

Please don't shoot the messenger, but:
GICv3 spec 8.11.2 "GICR_CTLR, Redistributor Control Register":
Enable_LPIs, bit [0]:
"... When a write changes this bit from 0 to 1, this bit becomes RES1
and the Redistributor must load the LPI Pending table from memory to
check for any pending interrupts."

Read: LPIs are so great that you can't disable them anymore once you
have enabled them.
Yes, this is a bit weird, even has nasty side effects.

>> +
>> +    table_reg = gicv3_lpi_allocate_pendtable();
>> +    if ( !table_reg )
> 
> From the spec, GICR_PENDBASER full of 0 is valid.

Meh.
Changed it to take a uint64_t * and return an error code.
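
Roughly along these lines. This is only a sketch of the calling convention, not the actual new code: the allocator is modelled as a function pointer so the example is self-contained, and all names here are hypothetical.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Marker for a failed allocation; address 0 is deliberately valid here. */
#define ALLOC_FAILED UINT64_MAX

/* Stand-in allocator type: returns a machine address or ALLOC_FAILED. */
typedef uint64_t (*alloc_fn)(size_t size);

/* Example allocators for this sketch only. */
static uint64_t alloc_at_zero(size_t size) { (void)size; return 0; }
static uint64_t alloc_fails(size_t size)   { (void)size; return ALLOC_FAILED; }

/*
 * Compute the register value into *reg and report success separately,
 * so that an all-zero register value is no longer mistaken for an error.
 */
static int lpi_pendbaser_value(alloc_fn alloc, size_t size,
                               uint64_t attrs, uint64_t *reg)
{
    uint64_t maddr = alloc(size);

    if ( maddr == ALLOC_FAILED )
        return -ENOMEM;

    *reg = attrs | maddr;

    return 0;
}
```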

>> +        return -ENOMEM;
>> +    writeq_relaxed(table_reg, rdist_base + GICR_PENDBASER);
> 
> On the first version of this series, I mentioned that based on the spec
> (8.11.18 in ARM IHI 0069C) cacheability and shareability may not stick.
> 
> Whilst this may not (?) be a concern for the pending table, Xen will
> write in the property table to enable/disable LPIs. So we would need to
> know whether the cache needs to be cleaned after each access or not.

Ah, yes, I implemented this for the command queue, then realized that we
need it for the PROPBASER as well, but it's a per-redistributor
property. At this point I decided to split the code between -lpi.c and
-its.c, which took me a day, making me lose my "mental return address",
so I forgot to go back and fix the actual issue ;-)
Sorry for that, this is now fixed in my code base.
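
The idea, sketched: after writing the register, read it back and see whether the shareability attribute stuck. This assumes the Shareability field sits at bits [11:10] of GICR_PROPBASER and that 0 means non-shareable, as I read the spec; the names below are made up for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout: Shareability in GICR_PROPBASER bits [11:10]. */
#define PROPBASER_SHAREABILITY_SHIFT 10
#define PROPBASER_SHAREABILITY_MASK \
    (UINT64_C(3) << PROPBASER_SHAREABILITY_SHIFT)

/*
 * readback: the value read from the register after writing it.
 * If the redistributor dropped the shareability attribute (field reads
 * as 0), software has to clean the cache after each property table update.
 */
static bool proptable_needs_cache_clean(uint64_t readback)
{
    return (readback & PROPBASER_SHAREABILITY_MASK) == 0;
}
```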

>> +
>> +    table_reg = gicv3_lpi_get_proptable();
>> +    if ( !table_reg )
>> +        return -ENOMEM;
>> +    writeq_relaxed(table_reg, rdist_base + GICR_PROPBASER);
> 
> See all my remarks above.
> 
>> +
>> +    return 0;
>> +}
>> +
>>  static int __init gicv3_populate_rdist(void)
>>  {
>>      int i;
>> @@ -658,6 +688,20 @@ static int __init gicv3_populate_rdist(void)
>>              if ( (typer >> 32) == aff )
>>              {
>>                  this_cpu(rbase) = ptr;
>> +
>> +                if ( typer & GICR_TYPER_PLPIS )
>> +                {
>> +                    int ret;
>> +
>> +                    ret = gicv3_rdist_init_lpis(ptr);
>> +                    if ( ret && ret != -ENODEV )
>> +                    {
>> +                        printk("GICv3: CPU%d: Cannot initialize LPIs:
>> %d\n",
> 
> CPU%u
> 
>> +                               smp_processor_id(), ret);
>> +                        break;
>> +                    }
>> +                }
>> +
>>                  printk("GICv3: CPU%d: Found redistributor in region
>> %d @%p\n",
>>                          smp_processor_id(), i, ptr);
>>                  return 0;
>> diff --git a/xen/include/asm-arm/bitops.h b/xen/include/asm-arm/bitops.h
>> index bda8898..1cbfb9e 100644
>> --- a/xen/include/asm-arm/bitops.h
>> +++ b/xen/include/asm-arm/bitops.h
>> @@ -24,6 +24,7 @@
>>  #define BIT(nr)                 (1UL << (nr))
>>  #define BIT_MASK(nr)            (1UL << ((nr) % BITS_PER_WORD))
>>  #define BIT_WORD(nr)            ((nr) / BITS_PER_WORD)
>> +#define BIT_ULL(nr)             (1ULL << (nr))
>>  #define BITS_PER_BYTE           8
>>
>>  #define ADDR (*(volatile int *) addr)
>> diff --git a/xen/include/asm-arm/gic.h b/xen/include/asm-arm/gic.h
>> index 836a103..12bd155 100644
>> --- a/xen/include/asm-arm/gic.h
>> +++ b/xen/include/asm-arm/gic.h
>> @@ -220,6 +220,8 @@ enum gic_version {
>>      GIC_V3,
>>  };
>>
>> +#define LPI_OFFSET      8192
>> +
> 
> It would make more sense to have this definition moved to irq.h, close to
> NR_IRQS.
> 
> Also, I am a bit surprised that NR_IRQS & co has not been modified. Is
> there any reason for that?

It wasn't needed and really shouldn't be.
LPIs are to some degree *not* first class citizen IRQs, and AFAICT
NR_IRQS relates to struct irq_desc's, which we don't have for LPIs (since
Xen itself doesn't really care about LPIs, at least as interrupts).

I am still chasing every (derived) use of NR_IRQS, but so far this looks
good to me. Let me know if you find any issues with that.

Cheers,
Andre.



^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [PATCH 02/28] ARM: GICv3 ITS: parse and store ITS subnodes from hardware DT
  2017-02-06 12:58   ` Julien Grall
@ 2017-02-27 11:43     ` Andre Przywara
  2017-02-27 12:51       ` Julien Grall
  0 siblings, 1 reply; 106+ messages in thread
From: Andre Przywara @ 2017-02-27 11:43 UTC (permalink / raw)
  To: Julien Grall, Stefano Stabellini; +Cc: xen-devel, Vijay Kilari

Hi,

On 06/02/17 12:58, Julien Grall wrote:
> Hi Andre,
> 
> On 30/01/17 18:31, Andre Przywara wrote:
>> Parse the DT GIC subnodes to find every ITS MSI controller the hardware
>> offers. Store that information in a list to both propagate all of them
>> later to Dom0, but also to be able to iterate over all ITSes.
>> This introduces an ITS Kconfig option.
>>
>> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
>> ---
>>  xen/arch/arm/Kconfig             |  4 +++
>>  xen/arch/arm/Makefile            |  1 +
>>  xen/arch/arm/gic-v3-its.c        | 71
>> ++++++++++++++++++++++++++++++++++++++++
>>  xen/arch/arm/gic-v3.c            | 12 ++++---
>>  xen/include/asm-arm/gic_v3_its.h | 57 ++++++++++++++++++++++++++++++++
>>  5 files changed, 141 insertions(+), 4 deletions(-)
>>  create mode 100644 xen/arch/arm/gic-v3-its.c
>>  create mode 100644 xen/include/asm-arm/gic_v3_its.h
>>
>> diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
>> index 2e023d1..bf64c61 100644
>> --- a/xen/arch/arm/Kconfig
>> +++ b/xen/arch/arm/Kconfig
>> @@ -45,6 +45,10 @@ config ACPI
>>  config HAS_GICV3
>>      bool
>>
>> +config HAS_ITS
>> +        bool "GICv3 ITS MSI controller support"
>> +        depends on HAS_GICV3
> 
> Should not this be disabled by default until the last patch of the
> series in order to avoid potential issue if bisecting Xen?

It should work without it, as we only ever map something driven by Dom0,
which does not learn about the virtual ITS until the very last patch.
Actually enabling it early on allows fine-grained build and runtime
testing and bisecting, so we can pinpoint breakages in the VGIC
subsystem to a particular patch.

So do you see any real issues with that? Or was that just a feeling?

Cheers,
Andre.


^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [PATCH 03/28] ARM: GICv3: allocate LPI pending and property table
  2017-02-27 11:34     ` Andre Przywara
@ 2017-02-27 12:48       ` Julien Grall
  0 siblings, 0 replies; 106+ messages in thread
From: Julien Grall @ 2017-02-27 12:48 UTC (permalink / raw)
  To: Andre Przywara, Stefano Stabellini; +Cc: xen-devel, nd, Vijay Kilari



On 27/02/17 11:34, Andre Przywara wrote:
> Hi,

Hi Andre,

>
> "Yes, will fix" to everything not explicitly mentioned below.
>
> On 06/02/17 16:26, Julien Grall wrote:
>> Hi Andre,
>>
>> On 30/01/17 18:31, Andre Przywara wrote:
>>> The ARM GICv3 provides a new kind of interrupt called LPIs.
>>> The pending bits and the configuration data (priority, enable bits) for
>>> those LPIs are stored in tables in normal memory, which software has to
>>> provide to the hardware.
>>> Allocate the required memory, initialize it and hand it over to each
>>> redistributor. The maximum number of LPIs to be used can be adjusted with
>>> the command line option "max_lpi_bits", which defaults to a compile time
>>> constant exposed in Kconfig.
>>>
>>> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
>>> ---
>>>  xen/arch/arm/Kconfig              |  15 +++++
>>>  xen/arch/arm/Makefile             |   1 +
>>>  xen/arch/arm/gic-v3-lpi.c         | 129
>>> ++++++++++++++++++++++++++++++++++++++
>>>  xen/arch/arm/gic-v3.c             |  44 +++++++++++++
>>>  xen/include/asm-arm/bitops.h      |   1 +
>>>  xen/include/asm-arm/gic.h         |   2 +
>>>  xen/include/asm-arm/gic_v3_defs.h |  52 ++++++++++++++-
>>>  xen/include/asm-arm/gic_v3_its.h  |  22 ++++++-
>>>  8 files changed, 264 insertions(+), 2 deletions(-)
>>>  create mode 100644 xen/arch/arm/gic-v3-lpi.c
>>>
>>> diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
>>> index bf64c61..71734a1 100644
>>> --- a/xen/arch/arm/Kconfig
>>> +++ b/xen/arch/arm/Kconfig
>>> @@ -49,6 +49,21 @@ config HAS_ITS
>>>          bool "GICv3 ITS MSI controller support"
>>>          depends on HAS_GICV3
>>>
>>> +config MAX_PHYS_LPI_BITS
>>> +        depends on HAS_ITS
>>> +        int "Maximum bits for GICv3 host LPIs (14-32)"
>>> +        range 14 32
>>> +        default "20"
>>> +        help
>>> +          Specifies the maximum number of LPIs (in bits) Xen should take
>>> +          care of. The host ITS may provide support for a very large
>>> number
>>> +          of supported LPIs, for all of which we may not want to
>>> allocate
>>> +          memory, so this number here allows to limit this.
>>
>> I think the description is misleading, if a user wants 8K worth of LPIs
>> by default, he would have to use 14 and not 13.
>>
>> Furthermore, you provide both a runtime option (via command line) and
>> build time option (via Kconfig). You don't express what the
>> difference between the two is and how they are supposed to co-exist.
>>
>> Anyway, IMHO the command line option should be sufficient to allow
>> override if necessary. So I would drop the Kconfig version.
>
> The idea is simply to let the Kconfig version specify the default value
> if there is no command line option. So giving a command line option will
> always override Kconfig.
> Should we know a sensible default value, we can indeed get rid of this
> Kconfig snippet here.

Please keep in mind that distributions will ship one version of Xen for
all platforms. There is no sensible default value and I don't see
how a distribution would be able to pick one.

So I still don't see the point of this default option.

>
>>> +          Xen itself does not know how many LPIs domains will ever need
>>> +          beforehand.
>>> +          This can be overriden on the command line with the
>>> max_lpi_bits
>>
>> s/overriden/overridden/
>>
>>> +          parameter.
>>> +
>>>  endmenu
>>>
>>>  menu "ARM errata workaround via the alternative framework"
>>> diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
>>> index 5f4ff23..4ccf2eb 100644
>>> --- a/xen/arch/arm/Makefile
>>> +++ b/xen/arch/arm/Makefile
>>> @@ -19,6 +19,7 @@ obj-y += gic.o
>>>  obj-y += gic-v2.o
>>>  obj-$(CONFIG_HAS_GICV3) += gic-v3.o
>>>  obj-$(CONFIG_HAS_ITS) += gic-v3-its.o
>>> +obj-$(CONFIG_HAS_ITS) += gic-v3-lpi.o
>>>  obj-y += guestcopy.o
>>>  obj-y += hvm.o
>>>  obj-y += io.o
>>> diff --git a/xen/arch/arm/gic-v3-lpi.c b/xen/arch/arm/gic-v3-lpi.c
>>> new file mode 100644
>>> index 0000000..e2fc901
>>> --- /dev/null
>>> +++ b/xen/arch/arm/gic-v3-lpi.c
>>> @@ -0,0 +1,129 @@
>>> +/*
>>> + * xen/arch/arm/gic-v3-lpi.c
>>> + *
>>> + * ARM GICv3 Locality-specific Peripheral Interrupts (LPI) support
>>> + *
>>> + * Copyright (C) 2016,2017 - ARM Ltd
>>> + *
>>> + * This program is free software; you can redistribute it and/or modify
>>> + * it under the terms of the GNU General Public License as published by
>>> + * the Free Software Foundation; either version 2 of the License, or
>>> + * (at your option) any later version.
>>> + *
>>> + * This program is distributed in the hope that it will be useful,
>>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>>> + * GNU General Public License for more details.
>>> + */
>>> +
>>> +#include <xen/config.h>
>>
>> xen/config.h is not necessary.
>>
>>> +#include <xen/lib.h>
>>> +#include <xen/mm.h>
>>> +#include <xen/sizes.h>
>>> +#include <asm/gic.h>
>>> +#include <asm/gic_v3_defs.h>
>>> +#include <asm/gic_v3_its.h>
>>> +
>>> +/* Global state */
>>> +static struct {
>>> +    uint8_t *lpi_property;
>>> +    unsigned int host_lpi_bits;
>>
>> On the previous version, Stefano suggested to rename this to
>> phys_lpi_bits + adding a comment as you store the number of bits.
>>
>> However, looking at the usage, the number of bits is only required during
>> initialization. Runtime code (such as gic_get_host_lpi) uses the
>> number of LPIs and therefore requires extra instructions to compute
>> the value.
>>
>> So I would prefer if you store the number of LPIs here to optimize the
>> common case.
>
> Well, it's a shift, not the 5th root. This is in fact the only
> difference between the two approaches:
> 000000000024d770 <gic_get_host_lpi>:
>   ...
>   24d788:       9ac22421        lsr     x1, x1, x2

One instruction can make a lot of difference when a function is
executed hundreds of times.

> And I was thinking about it before, my rationale for not doing it was:
> - We need both the number and the shift, and it's much easier to get the
> number from the bits than the other way around.

But you only need the number of bits during initialization. I don't mind
if the ITS initialization is a little slower as a result.

> - The bits are the real source of the information (from TYPER).

So? It is fine to store the value differently in Xen if that makes it
easier to use in most places. This will also be less confusing
for the users.

> - Having a number would always raise the question whether we need to
> make sure that there is no more than one bit set in there and how to deal
> with it.

It is not difficult to handle that.
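
For instance, a count of interrupt IDs could simply be rounded down to a power of two at initialization. This is a minimal sketch with a hypothetical helper name, not code from the series.

```c
#include <stdint.h>

/*
 * Round n down to the nearest power of two (0 stays 0).
 * Clearing the lowest set bit repeatedly leaves only the highest one.
 */
static uint64_t round_down_pow2(uint64_t n)
{
    while ( n & (n - 1) )
        n &= n - 1;

    return n;
}
```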

>
>> Also, I find the naming "id_bits" confusing because you store the number
>> of bits to encode the max LPI ID and not the number of bits to encode
>> the number of LPI.
>
> "IDbits" is the spec term used. It describes how many bits you need to
> describe an LPI number. LPIs start at number 8192.
> GICv3 spec, 8.9.24:
> IDbits, bits [23:19]
>        The number of interrupt identifier bits supported, minus one.
>
> I can rename this to "phys_lpi_idbits", if that sounds reasonable.
>
>>> +} lpi_data;
>>> +
>>> +/* Pending table for each redistributor */
>>> +static DEFINE_PER_CPU(void *, pending_table);
>>> +
>>> +#define MAX_PHYS_LPIS   (BIT_ULL(lpi_data.host_lpi_bits) - LPI_OFFSET)
>>> +
>>> +uint64_t gicv3_lpi_allocate_pendtable(void)
>>> +{
>>> +    uint64_t reg;
>>> +    void *pendtable;
>>> +
>>> +    reg  = GIC_BASER_CACHE_RaWaWb <<
>>> GICR_PENDBASER_INNER_CACHEABILITY_SHIFT;
>>> +    reg |= GIC_BASER_CACHE_SameAsInner <<
>>> GICR_PENDBASER_OUTER_CACHEABILITY_SHIFT;
>>> +    reg |= GIC_BASER_InnerShareable <<
>>> GICR_PENDBASER_SHAREABILITY_SHIFT;
>>> +
>>> +    if ( !this_cpu(pending_table) )
>>> +    {
>>> +        /*
>>> +         * The pending table holds one bit per LPI and even covers
>>> bits for
>>> +         * interrupt IDs below 8192, so we allocate the full range.
>>> +         * The GICv3 imposes a 64KB alignment requirement.
>>> +         */
>>> +        pendtable = _xmalloc(BIT_ULL(lpi_data.host_lpi_bits) / 8,
>>> SZ_64K);
>>> +        if ( !pendtable )
>>> +            return 0;
>>> +
>>> +        memset(pendtable, 0, BIT_ULL(lpi_data.host_lpi_bits) / 8);
>>
>> You can use _xzalloc to do the allocation and the zeroing in one call.
>>
>>> +        __flush_dcache_area(pendtable,
>>> BIT_ULL(lpi_data.host_lpi_bits) / 8);
>>
>> Please use clean_and_invalidate_dcache_va_range.
>>
>>> +
>>> +        this_cpu(pending_table) = pendtable;
>>> +    }
>>> +    else
>>> +    {
>>> +        pendtable = this_cpu(pending_table);
>>> +    }
>>
>> The {} are not necessary.
>
> Sure, this was coming from the Linux rule here: if the if-clause
> requires braces, the else-clause has to have those too. To me it looks
> weird otherwise. Xen's coding style document doesn't explicitly say.

We tend to avoid pointless {} in Xen.

>
>> Also, on the previous version it was mentioned
>> this should be an error and then replaced by a BUG_ON().
>
> I don't see how this would be an actual bug. Yes, the code as it is
> right now calls this only once, but it wouldn't hurt if we call this
> multiple times. And I am always a bit wary of crashing the system if we
> could just carry on instead, but ...
>
>> Please do the change.
>
> ... however you like.

The question is not whether it would hurt to call this code twice but
whether it makes sense. It does not make sense to try to allocate the
pendtable twice for the same CPU. So if it ever happens, it means there
is a programming error, hence a BUG_ON makes sense here.

Another solution is having an ASSERT(this_cpu(pending_table) == NULL); 
at the beginning of the function.
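
Something like the following sketch, written against plain C so it stands alone: a static pointer stands in for the per-CPU variable, assert() for Xen's ASSERT(), and calloc() for the real allocator. The function name is hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Stand-in for this_cpu(pending_table). */
static void *pending_table;

/* Allocate the pending table; calling this twice is a programming error. */
static int allocate_pendtable_once(size_t size)
{
    /* Catch a second call early instead of silently reusing the table. */
    assert(pending_table == NULL);

    pending_table = calloc(1, size);   /* zeroed allocation */
    if ( !pending_table )
        return -ENOMEM;

    return 0;
}
```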

>
>>> +
>>> +    reg |= GICR_PENDBASER_PTZ;
>>> +
>>> +    ASSERT(!(virt_to_maddr(pendtable) & ~GENMASK(51, 16)));
>>
>> I don't understand the purpose of this ASSERT. The bits 15:0 should
>> always be zero otherwise this would be a bug in the memory allocator.
>> For bits 63:52, the architecture so far only supports up to 52 bits.
>
> You complained about using a mask on the address before, which I can
> understand.
> However we are writing to a register described in the spec here and
> should observe that the address only goes into bits [51:16]. Any other
> bits should not be touched or we are getting weird errors.
> So somehow I have to make sure we comply with this.
> This could either be a mask or an ASSERT.
> If the assert never fires: great. Nothing to worry about here.
> But I think this matches the ASSERT idea: we rely on this address being 4K
> aligned and not exceeding 52 bits worth of address bits, so we should
> check these assumptions.

ASSERTs are only active in debug builds. However, nothing prevents the Xen
allocator from behaving differently between non-debug and debug builds. So
you may end up never allocating addresses that are invalid for the GIC in
a non-debug build.

>
>> By keeping this ASSERT, you will make it more difficult for us to extend
>> the number of physical address bits supported if ARM decides to bump it.
>
> This GICR register is limited to 52 bits of physical address space
> according to the spec.
> If we ever upgrade the address size, the GIC spec would need to be
> upgraded as well, so chances are we either need to touch that code here
> anyways or we use a GICv5 or the like from the beginning.

I am quite convinced it will make our life more difficult. But I am not
going to fight over an ASSERT, there is more important stuff to fix. :)


>> Why don't you just disable LPIs here? AFAIK, it should just be
>> writel_relaxed(reg & ~GICR_CTLR_ENABLE_LPIS, GICR_CTLR);
>
> Please don't shoot the messenger, but:
> GICv3 spec 8.11.2 "GICR_CTLR, Redistributor Control Register":
> Enable_LPIs, bit [0]:
> "... When a write changes this bit from 0 to 1, this bit becomes RES1
> and the Redistributor must load the LPI Pending table from memory to
> check for any pending interrupts."
>
> Read: LPIs are so great that you can't disable them anymore once you
> have enabled them.

How would this work with kexec?

> Yes, this is a bit weird, even has nasty side effects.

In that case, I would prefer to either panic or make the CPU unusable,
to avoid weird behavior afterwards.

[...]

>>> diff --git a/xen/include/asm-arm/bitops.h b/xen/include/asm-arm/bitops.h
>>> index bda8898..1cbfb9e 100644
>>> --- a/xen/include/asm-arm/bitops.h
>>> +++ b/xen/include/asm-arm/bitops.h
>>> @@ -24,6 +24,7 @@
>>>  #define BIT(nr)                 (1UL << (nr))
>>>  #define BIT_MASK(nr)            (1UL << ((nr) % BITS_PER_WORD))
>>>  #define BIT_WORD(nr)            ((nr) / BITS_PER_WORD)
>>> +#define BIT_ULL(nr)             (1ULL << (nr))
>>>  #define BITS_PER_BYTE           8
>>>
>>>  #define ADDR (*(volatile int *) addr)
>>> diff --git a/xen/include/asm-arm/gic.h b/xen/include/asm-arm/gic.h
>>> index 836a103..12bd155 100644
>>> --- a/xen/include/asm-arm/gic.h
>>> +++ b/xen/include/asm-arm/gic.h
>>> @@ -220,6 +220,8 @@ enum gic_version {
>>>      GIC_V3,
>>>  };
>>>
>>> +#define LPI_OFFSET      8192
>>> +
>>
>> It would make much sense to have this definition moved in irq.h close to
>> NR_IRQS.
>>
>> Also, I am a bit surprised that NR_IRQS & co has not been modified. Is
>> there any reason for that?
>
> It wasn't needed and really shouldn't be.
> LPI are to some degree *not* first class citizen IRQs, and AFAICT
> NR_IRQS relate to struct irq_desc's, which we don't have for LPIs (since
> Xen itself doesn't really care about LPIs, at least as interrupts).
>
> I am still chasing every (derived) use of NR_IRQS, but so far this looks
> good to me. Let me know if you find any issues with that.

Can we have a comment on top of NR_IRQS explaining why LPIs are not 
taken into account?

Cheers,

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [PATCH 02/28] ARM: GICv3 ITS: parse and store ITS subnodes from hardware DT
  2017-02-27 11:43     ` Andre Przywara
@ 2017-02-27 12:51       ` Julien Grall
  0 siblings, 0 replies; 106+ messages in thread
From: Julien Grall @ 2017-02-27 12:51 UTC (permalink / raw)
  To: Andre Przywara, Stefano Stabellini; +Cc: xen-devel, nd, Vijay Kilari

On 27/02/17 11:43, Andre Przywara wrote:
> Hi,

Hi Andre,

> On 06/02/17 12:58, Julien Grall wrote:
>> Hi Andre,
>>
>> On 30/01/17 18:31, Andre Przywara wrote:
>>> Parse the DT GIC subnodes to find every ITS MSI controller the hardware
>>> offers. Store that information in a list to both propagate all of them
>>> later to Dom0, but also to be able to iterate over all ITSes.
>>> This introduces an ITS Kconfig option.
>>>
>>> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
>>> ---
>>>  xen/arch/arm/Kconfig             |  4 +++
>>>  xen/arch/arm/Makefile            |  1 +
>>>  xen/arch/arm/gic-v3-its.c        | 71
>>> ++++++++++++++++++++++++++++++++++++++++
>>>  xen/arch/arm/gic-v3.c            | 12 ++++---
>>>  xen/include/asm-arm/gic_v3_its.h | 57 ++++++++++++++++++++++++++++++++
>>>  5 files changed, 141 insertions(+), 4 deletions(-)
>>>  create mode 100644 xen/arch/arm/gic-v3-its.c
>>>  create mode 100644 xen/include/asm-arm/gic_v3_its.h
>>>
>>> diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
>>> index 2e023d1..bf64c61 100644
>>> --- a/xen/arch/arm/Kconfig
>>> +++ b/xen/arch/arm/Kconfig
>>> @@ -45,6 +45,10 @@ config ACPI
>>>  config HAS_GICV3
>>>      bool
>>>
>>> +config HAS_ITS
>>> +        bool "GICv3 ITS MSI controller support"
>>> +        depends on HAS_GICV3
>>
>> Should not this be disabled by default until the last patch of the
>> series in order to avoid potential issue if bisecting Xen?
>
> It should work without it, as we only ever map something driven by Dom0,
> which does not learn about the virtual until the very last patch.
> Actually enabling it early on allows fine-grained build and runtime
> testing and bisecting, so we can pinpoint breakages in the VGIC
> subsystem to a particular patch.
>
> So do you see any real issues with that? Or was that just a feeling?

Just a feeling. If you say it should be ok, then I am happy with that.

Cheers,

-- 
Julien Grall


* Re: [PATCH 09/28] ARM: GICv3 ITS: map device and LPIs to the ITS on physdev_op hypercall
  2017-02-24 19:57                   ` Shanker Donthineni
  2017-02-24 20:28                     ` Julien Grall
@ 2017-02-27 17:20                     ` Andre Przywara
  2017-02-28 18:29                       ` Julien Grall
  1 sibling, 1 reply; 106+ messages in thread
From: Andre Przywara @ 2017-02-27 17:20 UTC (permalink / raw)
  To: shankerd, Julien Grall, Jaggi, Manish, Stefano Stabellini
  Cc: xen-devel, Nair, Jayachandran, Vijay Kilari, Kapoor, Prasun

Hi,

On 24/02/17 19:57, Shanker Donthineni wrote:
> Hi Julien,
> 
> 
> On 01/31/2017 10:18 AM, Julien Grall wrote:
>>
>>
>> On 31/01/17 16:02, Jaggi, Manish wrote:
>>>
>>>
>>> On 1/31/2017 8:47 PM, Julien Grall wrote:
>>>>
>>>>
>>>> On 31/01/17 14:08, Jaggi, Manish wrote:
>>>>> Hi Julien,
>>>>>
>>>>> On 1/31/2017 7:16 PM, Julien Grall wrote:
>>>>>> On 31/01/17 13:19, Jaggi, Manish wrote:
>>>>>>> On 1/31/2017 6:13 PM, Julien Grall wrote:
>>>>>>>> On 31/01/17 10:29, Jaggi, Manish wrote:
>>>>> If you please go back to your comment where you wrote "we need to
>>>>> find another way to get the DeviceID", I was referring that we
>>>>> should add that another way in this series so that correct DeviceID
>>>>> is programmed in ITS.
>>>>
>>>> This is not the first time I am saying this, just saying "we should
>>>> add that another way..." is not helpful. You should also provide
>>>> some details on what you would do.
>>>>
>>> Julien, As you suggested we need to find another way, I assumed you
>>> had something in mind.
>>
>> I gave suggestions on my e-mail but you may have missed it...
>>
>>> Since we both agree that sbdf!=deviceID, the current series of ITS
>>> patches will program the incorrect deviceID so there is a need to
>>> have a way to map sbdf with deviceID in xen.
>>>
>>> One option could be to add a new hypercall to supply sbdf and
>>> deviceID to xen.
>>
>> ... as well as the part where I am saying that I am not in favor to
>> implement an hypercall temporarily, and against adding a new hypercall
>> for only a couple of weeks. As you may know PHYSDEV hypercall are part
>> of the stable ABI and once they are added they cannot be removed.
>>
>> So we need to be sure the hypercall is necessary. In this case, the
>> hypercall is not necessary as all the information can be found in the
>> firmware tables. However this is not implemented yet and part of the
>> discussion on PCI Passthrough (see [1]).
>>
>> We need a temporary solution that does not involve any commitment on the
>> ABI until Xen is able to discover PCI.
>>
> 
> Why can't  we handle ITS device creation whenever a virtual ITS driver
> receives the MAPD command from dom0/domU. In case of dom0, it's straight
> forward dom0 always passes the real ITS device through MAPD command.
> This way we can support PCIe devices without hard-coded MSI(x) limit 32,
> and platform devices transparently. I used the below code to platform
> and PCIe device MSI(x) functionality on QDF2400 server platform.

But this breaks our assumption that no ITS commands can ever be
propagated at guest runtime, which is the cornerstone of this series.
I agree that this is unfortunate and allowing it would simplify things,
but after long discussions we came to the conclusion that it's not
feasible to do so:
A malicious guest could flood the virtual ITS with MAPD commands. Xen
would need to propagate those to the hardware, which relies on the host
command queue having free slots, which we can't guarantee. For
technical reasons we can't reschedule the guest (because this is an MMIO
trap). Also, the domain actually triggering the "final" MAPD might not
be the culprit, but an actual legitimate user.
So we agreed upon issuing all hardware ITS commands before a guest
actually starts (for DomUs) or on hypercalls (for Dom0).
I think we can make exceptions for Dom0, since it's not supposed to be
malicious.
So I'd suggest the following:
- To make Dom0 run in this version of the patches, especially with
platform devices, we allow MAPDs to propagate from Dom0.
  - We check whether this device has already been mapped. If yes, we
map the virtual side and return.
  - If not mapped already, we possibly sanitize the device ID in some way
(using some platform-specific function, for instance) and issue the MAPD
and all the possible MAPTIs to the hardware ITS. We might avoid this in
the future, when we have proper passthrough support in place.

So PCI devices would be mapped by the physdev_op hypercall as before,
but platform devices would be handled this way.
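
The flow suggested above could look roughly like the following toy
sketch. It is not taken from the series: host_mapped[], host_cmds_issued
and dom0_handle_mapd() are hypothetical names, the device ID sanitizing
step is left out, and the host commands are only counted, not sent.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_DEVS 8   /* toy limit for this sketch */

/* Hypothetical record of devices already mapped on the host ITS. */
static uint32_t host_mapped[MAX_DEVS];
static unsigned int nr_host_mapped;
static unsigned int host_cmds_issued;   /* counts host MAPD + MAPTI */

static bool host_dev_is_mapped(uint32_t devid)
{
    for ( unsigned int i = 0; i < nr_host_mapped; i++ )
        if ( host_mapped[i] == devid )
            return true;

    return false;
}

/*
 * Hypothetical handler for a MAPD trapped from Dom0. Returns 0 on
 * success, -1 if the host side could not be set up.
 */
static int dom0_handle_mapd(uint32_t devid, unsigned int nr_events)
{
    if ( !host_dev_is_mapped(devid) )
    {
        if ( nr_host_mapped == MAX_DEVS )
            return -1;

        /* One host MAPD plus one MAPTI per possible event. */
        host_cmds_issued += 1 + nr_events;
        host_mapped[nr_host_mapped++] = devid;
    }

    /* Mapping the virtual side is omitted; it issues no host commands. */
    return 0;
}
```

The point of the early check is that a repeated MAPD for a known device
costs no host commands at all.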

Does this make sense?

I need to work out the details and will keep you posted ...

Cheers,
Andre.

> 
> @@ -383,10 +384,17 @@ static int its_handle_mapd(struct virt_its *its,
> uint64_t *cmdptr)
>      int size = its_cmd_get_size(cmdptr);
>      bool valid = its_cmd_get_validbit(cmdptr);
>      paddr_t itt_addr = its_cmd_mask_field(cmdptr, 2, 0, 52) &
> GENMASK(51, 8);
> +    int ret;
> 
>      if ( !its->dev_table )
>          return -1;
> 
> +    size = size < 4 ? 4 : size;
> +    ret = gicv3_its_map_guest_device(hardware_domain, devid, devid,
> size + 1,
> +                                     valid);
> +    if (ret < 0)
> +        return ret;
> +
> 
> 


* Re: [PATCH 09/28] ARM: GICv3 ITS: map device and LPIs to the ITS on physdev_op hypercall
  2017-02-27 17:20                     ` Andre Przywara
@ 2017-02-28 18:29                       ` Julien Grall
  2017-03-01 19:42                         ` Shanker Donthineni
  0 siblings, 1 reply; 106+ messages in thread
From: Julien Grall @ 2017-02-28 18:29 UTC (permalink / raw)
  To: Andre Przywara, shankerd, Jaggi, Manish, Stefano Stabellini
  Cc: xen-devel, Nair, Jayachandran, nd, Vijay Kilari, Kapoor, Prasun



On 27/02/17 17:20, Andre Przywara wrote:
> Hi,

Hi Andre,

> On 24/02/17 19:57, Shanker Donthineni wrote:
>> Hi Julien,
>>
>>
>> On 01/31/2017 10:18 AM, Julien Grall wrote:
>>>
>>>
>>> On 31/01/17 16:02, Jaggi, Manish wrote:
>>>>
>>>>
>>>> On 1/31/2017 8:47 PM, Julien Grall wrote:
>>>>>
>>>>>
>>>>> On 31/01/17 14:08, Jaggi, Manish wrote:
>>>>>> Hi Julien,
>>>>>>
>>>>>> On 1/31/2017 7:16 PM, Julien Grall wrote:
>>>>>>> On 31/01/17 13:19, Jaggi, Manish wrote:
>>>>>>>> On 1/31/2017 6:13 PM, Julien Grall wrote:
>>>>>>>>> On 31/01/17 10:29, Jaggi, Manish wrote:
>>>>>> If you please go back to your comment where you wrote "we need to
>>>>>> find another way to get the DeviceID", I was referring that we
>>>>>> should add that another way in this series so that correct DeviceID
>>>>>> is programmed in ITS.
>>>>>
>>>>> This is not the first time I am saying this, just saying "we should
>>>>> add that another way..." is not helpful. You should also provide
>>>>> some details on what you would do.
>>>>>
>>>> Julien, As you suggested we need to find another way, I assumed you
>>>> had something in mind.
>>>
>>> I gave suggestions on my e-mail but you may have missed it...
>>>
>>>> Since we both agree that sbdf!=deviceID, the current series of ITS
>>>> patches will program the incorrect deviceID so there is a need to
>>>> have a way to map sbdf with deviceID in xen.
>>>>
>>>> One option could be to add a new hypercall to supply sbdf and
>>>> deviceID to xen.
>>>
>>> ... as well as the part where I am saying that I am not in favor to
>>> implement an hypercall temporarily, and against adding a new hypercall
>>> for only a couple of weeks. As you may know PHYSDEV hypercall are part
>>> of the stable ABI and once they are added they cannot be removed.
>>>
>>> So we need to be sure the hypercall is necessary. In this case, the
>>> hypercall is not necessary as all the information can be found in the
>>> firmware tables. However this is not implemented yet and part of the
>>> discussion on PCI Passthrough (see [1]).
>>>
>>> We need a temporary solution that does not involve any commitment on the
>>> ABI until Xen is able to discover PCI.
>>>
>>
>> Why can't  we handle ITS device creation whenever a virtual ITS driver
>> receives the MAPD command from dom0/domU. In case of dom0, it's straight
>> forward dom0 always passes the real ITS device through MAPD command.
>> This way we can support PCIe devices without hard-coded MSI(x) limit 32,
>> and platform devices transparently. I used the below code to platform
>> and PCIe device MSI(x) functionality on QDF2400 server platform.
>
> But this breaks our assumption that no ITS commands can ever be
> propagated at guest's runtime, which is the cornerstone of this series.
> I agree that this is unfortunate and allowing it would simplify things,
> but after long discussions we came to the conclusion that it's not
> feasible to do so:
> A malicious guest could flood the virtual ITS with MAPD commands. Xen
> would need to propagate those to the hardware, which relies on the host
> command queue to have free slots, which we can't guarantee. For
> technical reasons we can't reschedule the guest (because this is an MMIO
> trap), also the domain actually triggering the "final" MAPD might not be
> the culprit, but an actual legitimate user.
> So we agreed upon issuing all hardware ITS commands before a guest
> actually starts (DomUs), respectively on hypercalls for Dom0.
> I think we can do exceptions for Dom0, since it's not supposed to be
> malicious.

Thank you for summarizing the problem :).

> So I'd suggest the following:
> - To make Dom0 run in this version of the patches, especially with
> platform devices, we allow MAPDs to propagate from Dom0.
>   - We check whether this device has already  been mapped. If yes, we
> map the virtual side and return.
>   - If not mapped already, we possibly somehow sanitize the device ID
> (using some platform-specific function, for instance) and issue the MAPD
> and all the possible MAPTIs to the hardware ITS. We might avoid this in
> the future, when we have proper passthrough support in place.

I am not sure why you would need per-platform code to sanitize the
Device ID. I think a first approach is to trust all input from dom0. We
can refine this later on, either by reading the configuration space for
PCI or, for platform devices, by possibly coming up with a new
hypercall (this could be discussed in a separate thread).

>
> So PCI devices would be mapped by the PHYSOPS hypercall as before, but
> platform devices would be handled via this way.

I don't understand why you still want to implement physdevop hypercalls
knowing that they will likely get ditched for ARM and don't provide all
the information we need. It is not possible to know the DeviceID from
the RID without parsing the DT, and we don't have the number of
supported MSIs at hand.

So it makes no sense to implement those hypercalls.

>
> Does this make sense?

Looking at the implementation of gicv3_its_map_guest_device, for each
virtual MAPD issued, you will issue one host MAPD command, plus one host
MAPTI and one INV per event.

This will potentially fill up the host command queue and take time to
execute (imagine a SYNC at the end).

So what will you do if the queue is full? Xen is not preemptible and if
you busy loop, dom0 may have its watchdog triggered or see RCU stalls.
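
To put rough numbers on this, a back-of-the-envelope sketch. It assumes
32-byte ITS commands, the 64K queue (ITS_CMD_QUEUE_SZ) used in this
series, one slot kept free to tell a full ring from an empty one, and a
trailing SYNC per MAPD. The helper names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define ITS_CMD_SIZE        32u            /* bytes per ITS command */
#define ITS_CMD_QUEUE_SZ    (64u * 1024u)  /* SZ_64K, as in the series */

/* Usable slots, assuming one slot always stays empty. */
static unsigned int its_queue_slots(void)
{
    return ITS_CMD_QUEUE_SZ / ITS_CMD_SIZE - 1;
}

/* Host commands per virtual MAPD: MAPD + (MAPTI + INV) per event + SYNC. */
static unsigned int cmds_per_mapd(unsigned int nr_events)
{
    return 1 + 2 * nr_events + 1;
}

/* Number of such MAPDs that fit if the hardware drains nothing meanwhile. */
static unsigned int mapds_until_full(unsigned int nr_events)
{
    return its_queue_slots() / cmds_per_mapd(nr_events);
}
```

Under these assumptions a device with 32 events costs 66 host commands,
so 31 back-to-back MAPDs fill an undrained queue, and a single device
with 1024 events would not fit in one go at all.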

Cheers,

-- 
Julien Grall


* Re: [PATCH 09/28] ARM: GICv3 ITS: map device and LPIs to the ITS on physdev_op hypercall
  2017-02-28 18:29                       ` Julien Grall
@ 2017-03-01 19:42                         ` Shanker Donthineni
  2017-03-03 15:53                           ` Julien Grall
  0 siblings, 1 reply; 106+ messages in thread
From: Shanker Donthineni @ 2017-03-01 19:42 UTC (permalink / raw)
  To: Julien Grall, Andre Przywara, Jaggi, Manish, Stefano Stabellini
  Cc: xen-devel, Nair, Jayachandran, nd, Vijay Kilari, Kapoor, Prasun

Hi Julien,


On 02/28/2017 12:29 PM, Julien Grall wrote:
>
>
> On 27/02/17 17:20, Andre Przywara wrote:
>> Hi,
>
> Hi Andre,
>
>> On 24/02/17 19:57, Shanker Donthineni wrote:
>>> Hi Julien,
>>>
>>>
>>> On 01/31/2017 10:18 AM, Julien Grall wrote:
>>>>
>>>>
>>>> On 31/01/17 16:02, Jaggi, Manish wrote:
>>>>>
>>>>>
>>>>> On 1/31/2017 8:47 PM, Julien Grall wrote:
>>>>>>
>>>>>>
>>>>>> On 31/01/17 14:08, Jaggi, Manish wrote:
>>>>>>> Hi Julien,
>>>>>>>
>>>>>>> On 1/31/2017 7:16 PM, Julien Grall wrote:
>>>>>>>> On 31/01/17 13:19, Jaggi, Manish wrote:
>>>>>>>>> On 1/31/2017 6:13 PM, Julien Grall wrote:
>>>>>>>>>> On 31/01/17 10:29, Jaggi, Manish wrote:
>>>>>>> If you please go back to your comment where you wrote "we need to
>>>>>>> find another way to get the DeviceID", I was referring that we
>>>>>>> should add that another way in this series so that correct DeviceID
>>>>>>> is programmed in ITS.
>>>>>>
>>>>>> This is not the first time I am saying this, just saying "we should
>>>>>> add that another way..." is not helpful. You should also provide
>>>>>> some details on what you would do.
>>>>>>
>>>>> Julien, As you suggested we need to find another way, I assumed you
>>>>> had something in mind.
>>>>
>>>> I gave suggestions on my e-mail but you may have missed it...
>>>>
>>>>> Since we both agree that sbdf!=deviceID, the current series of ITS
>>>>> patches will program the incorrect deviceID so there is a need to
>>>>> have a way to map sbdf with deviceID in xen.
>>>>>
>>>>> One option could be to add a new hypercall to supply sbdf and
>>>>> deviceID to xen.
>>>>
>>>> ... as well as the part where I am saying that I am not in favor to
>>>> implement an hypercall temporarily, and against adding a new hypercall
>>>> for only a couple of weeks. As you may know PHYSDEV hypercall are part
>>>> of the stable ABI and once they are added they cannot be removed.
>>>>
>>>> So we need to be sure the hypercall is necessary. In this case, the
>>>> hypercall is not necessary as all the information can be found in the
>>>> firmware tables. However this is not implemented yet and part of the
>>>> discussion on PCI Passthrough (see [1]).
>>>>
>>>> We need a temporary solution that does not involve any commitment 
>>>> on the
>>>> ABI until Xen is able to discover PCI.
>>>>
>>>
>>> Why can't  we handle ITS device creation whenever a virtual ITS driver
>>> receives the MAPD command from dom0/domU. In case of dom0, it's 
>>> straight
>>> forward dom0 always passes the real ITS device through MAPD command.
>>> This way we can support PCIe devices without hard-coded MSI(x) limit 
>>> 32,
>>> and platform devices transparently. I used the below code to platform
>>> and PCIe device MSI(x) functionality on QDF2400 server platform.
>>
>> But this breaks our assumption that no ITS commands can ever be
>> propagated at guest's runtime, which is the cornerstone of this series.
>> I agree that this is unfortunate and allowing it would simplify things,
>> but after long discussions we came to the conclusion that it's not
>> feasible to do so:
>> A malicious guest could flood the virtual ITS with MAPD commands. Xen
>> would need to propagate those to the hardware, which relies on the host
>> command queue to have free slots, which we can't guarantee. For
>> technical reasons we can't reschedule the guest (because this is an MMIO
>> trap), also the domain actually triggering the "final" MAPD might not be
>> the culprit, but an actual legitimate user.
>> So we agreed upon issuing all hardware ITS commands before a guest
>> actually starts (DomUs), respectively on hypercalls for Dom0.
>> I think we can do exceptions for Dom0, since it's not supposed to be
>> malicious.
>
> Thank you for summarizing the problem :).
>

The direct VLPI injection feature is included in the GICv4 architecture.
A new set of VLPI commands is introduced to map the ITS vpend/vprop
tables, set up ITTEs, and perform maintenance operations for VLPIs. In
the case of direct VLPI injection, domU/dom0 LPI commands are mapped to
VLPI commands. Some of these commands must be applied to the real ITS
hardware whenever Xen receives the ITS commands at runtime.


Any thoughts on how we are going to support direct VLPI injection
without propagating dom0/domU ITS commands to the hardware at runtime?

-- 

Shanker Donthineni
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.



* Re: [PATCH 11/28] ARM: GICv3: forward pending LPIs to guests
  2017-02-15 21:25       ` Stefano Stabellini
@ 2017-03-02 20:56         ` Julien Grall
  2017-03-03  7:58           ` Jan Beulich
  0 siblings, 1 reply; 106+ messages in thread
From: Julien Grall @ 2017-03-02 20:56 UTC (permalink / raw)
  To: Stefano Stabellini, JBeulich, andrew.cooper3
  Cc: Andre Przywara, nd, Vijay Kilari, xen-devel

Hi,

Ping? I'd like the question to be sorted out before Andre sends a
new version.

On 02/15/2017 09:25 PM, Stefano Stabellini wrote:
> On Wed, 15 Feb 2017, Julien Grall wrote:
>> Hi Stefano,
>>
>> On 14/02/17 21:00, Stefano Stabellini wrote:
>>> On Mon, 30 Jan 2017, Andre Przywara wrote:
>>>> +/*
>>>> + * Handle incoming LPIs, which are a bit special, because they are
>>>> potentially
>>>> + * numerous and also only get injected into guests. Treat them specially
>>>> here,
>>>> + * by just looking up their target vCPU and virtual LPI number and hand
>>>> it
>>>> + * over to the injection function.
>>>> + */
>>>> +void do_LPI(unsigned int lpi)
>>>> +{
>>>> +    struct domain *d;
>>>> +    union host_lpi *hlpip, hlpi;
>>>> +    struct vcpu *vcpu;
>>>> +
>>>> +    WRITE_SYSREG32(lpi, ICC_EOIR1_EL1);
>>>> +
>>>> +    hlpip = gic_get_host_lpi(lpi);
>>>> +    if ( !hlpip )
>>>> +        return;
>>>> +
>>>> +    hlpi.data = read_u64_atomic(&hlpip->data);
>>>> +
>>>> +    /* We may have mapped more host LPIs than the guest actually asked
>>>> for. */
>>>> +    if ( !hlpi.virt_lpi )
>>>> +        return;
>>>> +
>>>> +    d = get_domain_by_id(hlpi.dom_id);
>>>> +    if ( !d )
>>>> +        return;
>>>> +
>>>> +    if ( hlpi.vcpu_id >= d->max_vcpus )
>>>> +    {
>>>> +        put_domain(d);
>>>> +        return;
>>>> +    }
>>>> +
>>>> +    vcpu = d->vcpu[hlpi.vcpu_id];
>>>> +
>>>> +    put_domain(d);
>>>> +
>>>> +    vgic_vcpu_inject_irq(vcpu, hlpi.virt_lpi);
>>>
>>> put_domain should be here
>>
>> Why? I don't even understand why we would need to take a reference on the
>> domain for LPIs. Would not it be enough to use rcu_lock_domain_by_id here?
>
> I think that rcu_lock_domain_by_id would also work, but similarly we
> would need to call rcu_unlock here.
>
> To be honest, I don't know exactly in which cases get_domain should be
> used instead of rcu_lock_domain_by_id.
>
> CC'ing the x86 guys that might know the answer.
>

-- 
Julien Grall


* Re: [PATCH 11/28] ARM: GICv3: forward pending LPIs to guests
  2017-03-02 20:56         ` Julien Grall
@ 2017-03-03  7:58           ` Jan Beulich
  2017-03-03 14:53             ` Julien Grall
  0 siblings, 1 reply; 106+ messages in thread
From: Jan Beulich @ 2017-03-03  7:58 UTC (permalink / raw)
  To: Julien Grall
  Cc: Stefano Stabellini, Vijay Kilari, Andre Przywara, andrew.cooper3,
	xen-devel, nd

>>> On 02.03.17 at 21:56, <julien.grall@arm.com> wrote:
> Ping? I'd like the question to be sorted out before Andre is sending a 
> new version.
> 
> On 02/15/2017 09:25 PM, Stefano Stabellini wrote:
>> On Wed, 15 Feb 2017, Julien Grall wrote:
>>> Hi Stefano,
>>>
>>> On 14/02/17 21:00, Stefano Stabellini wrote:
>>>> On Mon, 30 Jan 2017, Andre Przywara wrote:
>>>>> +/*
>>>>> + * Handle incoming LPIs, which are a bit special, because they are
>>>>> potentially
>>>>> + * numerous and also only get injected into guests. Treat them specially
>>>>> here,
>>>>> + * by just looking up their target vCPU and virtual LPI number and hand
>>>>> it
>>>>> + * over to the injection function.
>>>>> + */
>>>>> +void do_LPI(unsigned int lpi)
>>>>> +{
>>>>> +    struct domain *d;
>>>>> +    union host_lpi *hlpip, hlpi;
>>>>> +    struct vcpu *vcpu;
>>>>> +
>>>>> +    WRITE_SYSREG32(lpi, ICC_EOIR1_EL1);
>>>>> +
>>>>> +    hlpip = gic_get_host_lpi(lpi);
>>>>> +    if ( !hlpip )
>>>>> +        return;
>>>>> +
>>>>> +    hlpi.data = read_u64_atomic(&hlpip->data);
>>>>> +
>>>>> +    /* We may have mapped more host LPIs than the guest actually asked
>>>>> for. */
>>>>> +    if ( !hlpi.virt_lpi )
>>>>> +        return;
>>>>> +
>>>>> +    d = get_domain_by_id(hlpi.dom_id);
>>>>> +    if ( !d )
>>>>> +        return;
>>>>> +
>>>>> +    if ( hlpi.vcpu_id >= d->max_vcpus )
>>>>> +    {
>>>>> +        put_domain(d);
>>>>> +        return;
>>>>> +    }
>>>>> +
>>>>> +    vcpu = d->vcpu[hlpi.vcpu_id];
>>>>> +
>>>>> +    put_domain(d);
>>>>> +
>>>>> +    vgic_vcpu_inject_irq(vcpu, hlpi.virt_lpi);
>>>>
>>>> put_domain should be here
>>>
>>> Why? I don't even understand why we would need to take a reference on the
>>> domain for LPIs. Would not it be enough to use rcu_lock_domain_by_id here?
>>
>> I think that rcu_lock_domain_by_id would also work, but similarly we
>> would need to call rcu_unlock here.
>>
>> To be honest, I don't know exactly in which cases get_domain should be
>> used instead of rcu_lock_domain_by_id.

AIUI get_domain() is needed when you want to retain the reference
across an operation that may involve blocking/scheduling. The RCU
variant should be sufficient whenever you only need to make sure
the domain won't go away for the duration of (a portion of) a
function, since final domain destruction gets carried out from an
RCU callback.

Jan



* Re: [PATCH 11/28] ARM: GICv3: forward pending LPIs to guests
  2017-03-03  7:58           ` Jan Beulich
@ 2017-03-03 14:53             ` Julien Grall
  0 siblings, 0 replies; 106+ messages in thread
From: Julien Grall @ 2017-03-03 14:53 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Stefano Stabellini, Vijay Kilari, Andre Przywara, andrew.cooper3,
	xen-devel, nd

Hi Jan,

On 03/03/17 07:58, Jan Beulich wrote:
>>>> On 02.03.17 at 21:56, <julien.grall@arm.com> wrote:
>> Ping? I'd like the question to be sorted out before Andre is sending a
>> new version.
>>
>> On 02/15/2017 09:25 PM, Stefano Stabellini wrote:
>>> On Wed, 15 Feb 2017, Julien Grall wrote:
>>>> Hi Stefano,
>>>>
>>>> On 14/02/17 21:00, Stefano Stabellini wrote:
>>>>> On Mon, 30 Jan 2017, Andre Przywara wrote:
>>>>>> +/*
>>>>>> + * Handle incoming LPIs, which are a bit special, because they are
>>>>>> potentially
>>>>>> + * numerous and also only get injected into guests. Treat them specially
>>>>>> here,
>>>>>> + * by just looking up their target vCPU and virtual LPI number and hand
>>>>>> it
>>>>>> + * over to the injection function.
>>>>>> + */
>>>>>> +void do_LPI(unsigned int lpi)
>>>>>> +{
>>>>>> +    struct domain *d;
>>>>>> +    union host_lpi *hlpip, hlpi;
>>>>>> +    struct vcpu *vcpu;
>>>>>> +
>>>>>> +    WRITE_SYSREG32(lpi, ICC_EOIR1_EL1);
>>>>>> +
>>>>>> +    hlpip = gic_get_host_lpi(lpi);
>>>>>> +    if ( !hlpip )
>>>>>> +        return;
>>>>>> +
>>>>>> +    hlpi.data = read_u64_atomic(&hlpip->data);
>>>>>> +
>>>>>> +    /* We may have mapped more host LPIs than the guest actually asked
>>>>>> for. */
>>>>>> +    if ( !hlpi.virt_lpi )
>>>>>> +        return;
>>>>>> +
>>>>>> +    d = get_domain_by_id(hlpi.dom_id);
>>>>>> +    if ( !d )
>>>>>> +        return;
>>>>>> +
>>>>>> +    if ( hlpi.vcpu_id >= d->max_vcpus )
>>>>>> +    {
>>>>>> +        put_domain(d);
>>>>>> +        return;
>>>>>> +    }
>>>>>> +
>>>>>> +    vcpu = d->vcpu[hlpi.vcpu_id];
>>>>>> +
>>>>>> +    put_domain(d);
>>>>>> +
>>>>>> +    vgic_vcpu_inject_irq(vcpu, hlpi.virt_lpi);
>>>>>
>>>>> put_domain should be here
>>>>
>>>> Why? I don't even understand why we would need to take a reference on the
>>>> domain for LPIs. Would not it be enough to use rcu_lock_domain_by_id here?
>>>
>>> I think that rcu_lock_domain_by_id would also work, but similarly we
>>> would need to call rcu_unlock here.
>>>
>>> To be honest, I don't know exactly in which cases get_domain should be
>>> used instead of rcu_lock_domain_by_id.
>
> Aiui get_domain() is needed when you want to retain the reference
> across an operation that may involved blocking/scheduling. The RCU
> variant should be sufficient whenever you only need to make sure
> the domain won't go away for the duration of (a portion of) a
> function, since final domain destruction gets carried out from an
> RCU callback.

Thank you for the explanation. I think it makes sense. There will be no
scheduling or softirq_pending involved in do_LPI, so using
rcu_lock_domain_by_id seems more suitable here.
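
A minimal sketch of the resulting pattern, assuming nothing in the
function blocks or schedules. To keep it self-contained, the domain
lookup, unlock and injection helpers below are toy stand-ins named after
the Xen helpers, not the real implementations, and inject_lpi() is a
hypothetical name for the tail end of do_LPI.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* --- Toy stand-ins for Xen types and helpers, illustration only --- */
struct vcpu { unsigned int injected_lpi; };
struct domain { unsigned int max_vcpus; struct vcpu **vcpu; };

static struct domain *the_domain;  /* toy "domain list": one entry, ID 0 */
static int rcu_depth;              /* tracks lock/unlock balance */

static struct domain *rcu_lock_domain_by_id(uint16_t dom_id)
{
    if ( dom_id != 0 || !the_domain )
        return NULL;

    rcu_depth++;
    return the_domain;
}

static void rcu_unlock_domain(struct domain *d)
{
    (void)d;
    rcu_depth--;
}

static void vgic_vcpu_inject_irq(struct vcpu *v, unsigned int virq)
{
    v->injected_lpi = virq;
}

/* --- The pattern under discussion --- */
static void inject_lpi(uint16_t dom_id, unsigned int vcpu_id,
                       unsigned int virt_lpi)
{
    struct domain *d = rcu_lock_domain_by_id(dom_id);

    if ( !d )
        return;

    if ( vcpu_id < d->max_vcpus )
        vgic_vcpu_inject_irq(d->vcpu[vcpu_id], virt_lpi);

    /* Unlock only after the last use of d and its vCPUs. */
    rcu_unlock_domain(d);
}
```

Note that the unlock sits after the injection call, which also covers
the earlier remark about where the put_domain should have been.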

Cheers,

-- 
Julien Grall


* Re: [PATCH 09/28] ARM: GICv3 ITS: map device and LPIs to the ITS on physdev_op hypercall
  2017-03-01 19:42                         ` Shanker Donthineni
@ 2017-03-03 15:53                           ` Julien Grall
  0 siblings, 0 replies; 106+ messages in thread
From: Julien Grall @ 2017-03-03 15:53 UTC (permalink / raw)
  To: shankerd, Andre Przywara, Jaggi, Manish, Stefano Stabellini
  Cc: xen-devel, Nair, Jayachandran, nd, Vijay Kilari, Kapoor, Prasun

On 01/03/17 19:42, Shanker Donthineni wrote:
> Hi Julien,

Hi Shanker,

> On 02/28/2017 12:29 PM, Julien Grall wrote:
>> On 27/02/17 17:20, Andre Przywara wrote:
> Direct VLPI injection feature is included in GICv4 architecture. A new
> set of VLPI commands are introduced to map ITS vpend/vprop tables, ITTE
> setup, and maintenance operations for VLPIs. In case of direct VLPI
> injection, domU/dom0 LPI commands are mapped to VLPI commands. Some of
> these commands must be applied to a real ITS hardware whenever XEN
> receives the ITS commands during runtime.
>
>
> Any thought on this, how we are going to support a direct VLPI injection
> without prolongating dom0/domU ITS commands to hardware at runtime?

Direct vLPI injection will indeed require propagating commands. But as
the host command queue is shared among multiple guests, we have to
prevent a guest from overflowing the host command queue and affecting
other guests.

During the discussion of GICv3 ITS support in Xen, we looked at various
solutions (see the various design docs sent by Ian Campbell [1]) and the
only suitable one was to decouple the vITS and the ITS. This is what
Andre has implemented in this series.

I don't know yet how we can make things secure for direct vLPI
injection. For the time being, I think we should focus on getting GICv3
ITS supported, as it is a requirement for MSI support.

Once this is done, we can think about integrating direct vLPI injection
into the code. Feel free to start a new thread about this.

Cheers,

[1] https://xenbits.xen.org/people/ianc/vits/

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [PATCH 06/28] ARM: GICv3 ITS: introduce ITS command handling
  2017-02-06 19:16   ` Julien Grall
  2017-02-07 11:44     ` Julien Grall
@ 2017-03-07 18:08     ` Andre Przywara
  2017-03-08 15:28       ` Julien Grall
  1 sibling, 1 reply; 106+ messages in thread
From: Andre Przywara @ 2017-03-07 18:08 UTC (permalink / raw)
  To: Julien Grall, Stefano Stabellini; +Cc: xen-devel, Vijay Kilari

Hi Julien,

On 06/02/17 19:16, Julien Grall wrote:
> Hi Andre,
> 
> On 30/01/17 18:31, Andre Przywara wrote:
>> To be able to easily send commands to the ITS, create the respective
>> wrapper functions, which take care of the ring buffer.
>> The first two commands we implement provide methods to map a collection
>> to a redistributor (aka host core) and to flush the command queue (SYNC).
>> Start using these commands for mapping one collection to each host CPU.
>>
>> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
>> ---
>>  xen/arch/arm/gic-v3-its.c         | 142
>> +++++++++++++++++++++++++++++++++++++-
>>  xen/arch/arm/gic-v3-lpi.c         |  20 ++++++
>>  xen/arch/arm/gic-v3.c             |  18 ++++-
>>  xen/include/asm-arm/gic_v3_defs.h |   2 +
>>  xen/include/asm-arm/gic_v3_its.h  |  36 ++++++++++
>>  5 files changed, 215 insertions(+), 3 deletions(-)
>>
>> diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
>> index ad7cd2a..6578e8a 100644
>> --- a/xen/arch/arm/gic-v3-its.c
>> +++ b/xen/arch/arm/gic-v3-its.c
>> @@ -19,6 +19,7 @@
>>  #include <xen/config.h>
>>  #include <xen/lib.h>
>>  #include <xen/device_tree.h>
>> +#include <xen/delay.h>
>>  #include <xen/libfdt/libfdt.h>
>>  #include <xen/mm.h>
>>  #include <xen/sizes.h>
>> @@ -29,6 +30,98 @@
>>
>>  #define ITS_CMD_QUEUE_SZ                SZ_64K
>>
>> +#define BUFPTR_MASK                     GENMASK(19, 5)
>> +static int its_send_command(struct host_its *hw_its, const void
>> *its_cmd)
>> +{
>> +    uint64_t readp, writep;
>> +
>> +    spin_lock(&hw_its->cmd_lock);
> 
> Do you never expect a command to be sent in an interrupt path? I could
> see at least one, we may decide to throttle the number of LPIs received
> by a guest so this would involve disabling the interrupt.

I take it you are asking for spin_lock_irq[save]()?
I don't think queuing ITS commands in interrupt context is a good idea,
especially since I just introduced a grace period to wait for a draining
command queue.
I am happy to revisit this when needed.

>> +
>> +    readp = readq_relaxed(hw_its->its_base + GITS_CREADR) & BUFPTR_MASK;
>> +    writep = readq_relaxed(hw_its->its_base + GITS_CWRITER) &
>> BUFPTR_MASK;
>> +
>> +    if ( ((writep + ITS_CMD_SIZE) % ITS_CMD_QUEUE_SZ) == readp )
>> +    {
> 
> I look at all the series applied and there is no error message at all
> when the queue is full. This will make difficult to see what's going on.
> 
> Furthermore, this limit could be easily reached. Furthermore, this could
> happen easily if you decide to map a device with thousands of
> interrupts. For instance the function gicv3_map_its_map_host_events will
> issue 2 commands per event (MAPTI and INV).
> 
> So how do you plan to address this?

So I changed this now to wait for 1 ms (or whatever value you prefer) in
the hope that the command queue drains. In the end the ITS is hardware,
so processing commands is the only thing it does, and I don't expect it
to be seriously stalled, usually. So waiting a tiny bit to cover this odd
case of command queue contention seems useful to me, especially since we
only send commands from non-critical Dom0 code.
The command queue is now 1 MB in size, so we have 32,768 commands in
there. Should be enough for everybody ;-)
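
As an illustration only (this is not code from the patch), the ring
arithmetic behind those numbers can be modelled in standalone C. The
sketch assumes that one ITS command occupies 32 bytes, which is what
1 MB / 32,768 implies, and the helper names its_queue_slots(),
its_queue_full() and its_queue_pending() are made up for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define ITS_CMD_SIZE      32u               /* assumed size of one ITS command */
#define ITS_CMD_QUEUE_SZ  (1024u * 1024u)   /* 1 MB queue, as stated above */

/* Hypothetical helper: number of command slots in the ring. */
static inline uint32_t its_queue_slots(void)
{
    return ITS_CMD_QUEUE_SZ / ITS_CMD_SIZE;
}

/*
 * Same "full" test as in the quoted patch: the ring counts as full when
 * advancing writep by one command would make it equal to readp. One slot
 * therefore always stays unused.
 */
static inline bool its_queue_full(uint64_t writep, uint64_t readp)
{
    assert(writep % ITS_CMD_SIZE == 0 && readp % ITS_CMD_SIZE == 0);
    return ((writep + ITS_CMD_SIZE) % ITS_CMD_QUEUE_SZ) == readp;
}

/* Hypothetical helper: commands written but not yet consumed by the ITS. */
static inline uint32_t its_queue_pending(uint64_t writep, uint64_t readp)
{
    uint64_t bytes = (writep + ITS_CMD_QUEUE_SZ - readp) % ITS_CMD_QUEUE_SZ;

    return (uint32_t)(bytes / ITS_CMD_SIZE);
}
```

With this full test one slot is always left empty, so under these
assumptions at most 32,767 commands can be outstanding at any time.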

>> +        spin_unlock(&hw_its->cmd_lock);
>> +        return -EBUSY;
>> +    }
>> +
>> +    memcpy(hw_its->cmd_buf + writep, its_cmd, ITS_CMD_SIZE);
>> +    if ( hw_its->flags & HOST_ITS_FLUSH_CMD_QUEUE )
>> +        __flush_dcache_area(hw_its->cmd_buf + writep, ITS_CMD_SIZE);
> 
> Please use dcache_.... helpers.
> 
>> +    else
>> +        dsb(ishst);
>> +
>> +    writep = (writep + ITS_CMD_SIZE) % ITS_CMD_QUEUE_SZ;
>> +    writeq_relaxed(writep & BUFPTR_MASK, hw_its->its_base +
>> GITS_CWRITER);
>> +
>> +    spin_unlock(&hw_its->cmd_lock);
>> +
>> +    return 0;
>> +}
>> +
>> +static uint64_t encode_rdbase(struct host_its *hw_its, int cpu,
>> uint64_t reg)
> 
> s/int cpu/unsigned int cpu/

So it's easy to do so, but why is that actually?
I see that both "processor" and "vcpu_id" are "int" in struct vcpu, so I
was using int as the type for CPUs here as well.

>> +{
>> +    reg &= ~GENMASK(51, 16);
>> +
>> +    reg |= gicv3_get_redist_address(cpu, hw_its->flags &
>> HOST_ITS_USES_PTA);
>> +
>> +    return reg;
>> +}
>> +
>> +static int its_send_cmd_sync(struct host_its *its, int cpu)
> 
> s/int cpu/unsigned int cpu/
> 
>> +{
>> +    uint64_t cmd[4];
>> +
>> +    cmd[0] = GITS_CMD_SYNC;
>> +    cmd[1] = 0x00;
>> +    cmd[2] = encode_rdbase(its, cpu, 0x0);
>> +    cmd[3] = 0x00;
>> +
>> +    return its_send_command(its, cmd);
>> +}
>> +
>> +static int its_send_cmd_mapc(struct host_its *its, int collection_id,
>> int cpu)
> 
> s/int/unsigned int/ for both collection_id and cpu.
> 
>> +{
>> +    uint64_t cmd[4];
>> +
>> +    cmd[0] = GITS_CMD_MAPC;
>> +    cmd[1] = 0x00;
>> +    cmd[2] = encode_rdbase(its, cpu, (collection_id & GENMASK(15, 0)));
> 
> Please drop the mask here.
> 
>> +    cmd[2] |= GITS_VALID_BIT;
>> +    cmd[3] = 0x00;
>> +
>> +    return its_send_command(its, cmd);
>> +}
>> +
>> +/* Set up the (1:1) collection mapping for the given host CPU. */
>> +int gicv3_its_setup_collection(int cpu)
> 
> So you are calling this function from gicv3_rdist_init_lpis which make
> little sense to me. This should probably called from gicv3_cpu_init.

gicv3_cpu_init() calls gicv3_rdist_init_lpis(), but well, I changed it
because it does indeed look more reasonable to group this under the
gicv3_its_host_has_its() guard.

>> +{
>> +    struct host_its *its;
>> +    int ret;
>> +
>> +    list_for_each_entry(its, &host_its_list, entry)
>> +    {
>> +        /*
>> +         * This function is called on CPU0 before any ITSes have been
>> +         * properly initialized. Skip the collection setup in this case,
>> +         * it will be done explicitly for CPU0 upon initializing the
>> ITS.
>> +         */
> 
> Looking at the code, I don't understand why you need to do that. AFAIU
> there are no restriction to initialize the ITS (e.g call gicv3_its_init)
> before gicv3_cpu_init.

Well, it's a bit more subtle: For initialising the ITS (the collection
table entry, more specifically), we need to know the "rdbase", so either
the physical address or the logical ID. Those we determine only
somewhere deep in gicv3_cpu_init().
So just moving gicv3_its_init() before gicv3_cpu_init() does not work. I
will try and see if it's worth splitting gicv3_its_init() into a generic
and a per-CPU part, though I doubt that this is helpful.

Cheers,
Andre.

> +        if ( !its->cmd_buf )
>> +            continue;
>> +
>> +        ret = its_send_cmd_mapc(its, cpu, cpu);
>> +        if ( ret )
>> +            return ret;
>> +
>> +        ret = its_send_cmd_sync(its, cpu);
>> +        if ( ret )
>> +            return ret;
>> +    }
>> +
>> +    return 0;
>> +}
>> +
>>  #define BASER_ATTR_MASK                                           \
>>          ((0x3UL << GITS_BASER_SHAREABILITY_SHIFT)               | \
>>           (0x7UL << GITS_BASER_OUTER_CACHEABILITY_SHIFT)         | \
>> @@ -156,18 +249,51 @@ static int its_map_baser(void __iomem *basereg,
>> uint64_t regc, int nr_items)
>>      return -EINVAL;
>>  }
>>
>> +/* Wait for an ITS to become quiescient (all ITS operations
>> completed). */
> 
> s/quiescient/quiescent/
> 
>> +static int gicv3_its_wait_quiescient(struct host_its *hw_its)
> 
> s/quiescient/quiescent/
> 
>> +{
>> +    uint32_t reg;
>> +    s_time_t deadline = NOW() + MILLISECS(1000);
> 
> So that sounds fine for handling a couple of command, but what about
> thousands at the same time?
> 
>> +
>> +    reg = readl_relaxed(hw_its->its_base + GITS_CTLR);
>> +    if ( (reg & (GITS_CTLR_QUIESCENT | GITS_CTLR_ENABLE)) ==
>> GITS_CTLR_QUIESCENT )
>> +        return 0;
>> +
>> +    writel_relaxed(reg & ~GITS_CTLR_ENABLE, hw_its->its_base +
>> GITS_CTLR);
>> +
>> +    do {
>> +        reg = readl_relaxed(hw_its->its_base + GITS_CTLR);
>> +        if ( reg & GITS_CTLR_QUIESCENT )
>> +            return 0;
>> +
>> +        cpu_relax();
>> +        udelay(1);
>> +    } while ( NOW() <= deadline );
>> +
>> +    dprintk(XENLOG_ERR, "ITS not quiescient\n");
> 
> s/quiescient/quiescent/ + newline.
> 
>> +    return -ETIMEDOUT;
>> +}
>> +
> 
> Cheers,
> 

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [PATCH 06/28] ARM: GICv3 ITS: introduce ITS command handling
  2017-03-07 18:08     ` Andre Przywara
@ 2017-03-08 15:28       ` Julien Grall
  2017-03-08 16:16         ` Andre Przywara
  0 siblings, 1 reply; 106+ messages in thread
From: Julien Grall @ 2017-03-08 15:28 UTC (permalink / raw)
  To: Andre Przywara, Stefano Stabellini; +Cc: xen-devel, Vijay Kilari



On 07/03/17 18:08, Andre Przywara wrote:
> Hi Julien,

Hi Andre,

>
> On 06/02/17 19:16, Julien Grall wrote:
>> Hi Andre,
>>
>> On 30/01/17 18:31, Andre Przywara wrote:
>>> To be able to easily send commands to the ITS, create the respective
>>> wrapper functions, which take care of the ring buffer.
>>> The first two commands we implement provide methods to map a collection
>>> to a redistributor (aka host core) and to flush the command queue (SYNC).
>>> Start using these commands for mapping one collection to each host CPU.
>>>
>>> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
>>> ---
>>>  xen/arch/arm/gic-v3-its.c         | 142
>>> +++++++++++++++++++++++++++++++++++++-
>>>  xen/arch/arm/gic-v3-lpi.c         |  20 ++++++
>>>  xen/arch/arm/gic-v3.c             |  18 ++++-
>>>  xen/include/asm-arm/gic_v3_defs.h |   2 +
>>>  xen/include/asm-arm/gic_v3_its.h  |  36 ++++++++++
>>>  5 files changed, 215 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
>>> index ad7cd2a..6578e8a 100644
>>> --- a/xen/arch/arm/gic-v3-its.c
>>> +++ b/xen/arch/arm/gic-v3-its.c
>>> @@ -19,6 +19,7 @@
>>>  #include <xen/config.h>
>>>  #include <xen/lib.h>
>>>  #include <xen/device_tree.h>
>>> +#include <xen/delay.h>
>>>  #include <xen/libfdt/libfdt.h>
>>>  #include <xen/mm.h>
>>>  #include <xen/sizes.h>
>>> @@ -29,6 +30,98 @@
>>>
>>>  #define ITS_CMD_QUEUE_SZ                SZ_64K
>>>
>>> +#define BUFPTR_MASK                     GENMASK(19, 5)
>>> +static int its_send_command(struct host_its *hw_its, const void
>>> *its_cmd)
>>> +{
>>> +    uint64_t readp, writep;
>>> +
>>> +    spin_lock(&hw_its->cmd_lock);
>>
>> Do you never expect a command to be sent in an interrupt path? I could
>> see at least one, we may decide to throttle the number of LPIs received
>> by a guest so this would involve disabling the interrupt.
>
> I take it you are asking for spin_lock_irq[save]()?

Yes.

> I don't think queuing ITS commands in interrupt context is a good idea,
> especially since I just introduced a grace period to wait for a draining
> command queue.
> I am happy to revisit this when needed.

As mentioned in the previous mail, we might need to send a command
whilst in interrupt context if we need to disable an interrupt that
fires too often.

I would be fine with an ASSERT(!in_irq()) and a comment explaining
why, for the time being.

>
>>> +
>>> +    readp = readq_relaxed(hw_its->its_base + GITS_CREADR) & BUFPTR_MASK;
>>> +    writep = readq_relaxed(hw_its->its_base + GITS_CWRITER) &
>>> BUFPTR_MASK;
>>> +
>>> +    if ( ((writep + ITS_CMD_SIZE) % ITS_CMD_QUEUE_SZ) == readp )
>>> +    {
>>
>> I look at all the series applied and there is no error message at all
>> when the queue is full. This will make difficult to see what's going on.
>>
>> Furthermore, this limit could be easily reached. Furthermore, this could
>> happen easily if you decide to map a device with thousands of
>> interrupts. For instance the function gicv3_map_its_map_host_events will
>> issue 2 commands per event (MAPTI and INV).
>>
>> So how do you plan to address this?
>
> So I changed this now to wait for 1 ms (or whatever value you prefer) in
> hope the command queue drains. In the end the ITS is hardware, so
> processing commands it's the only thing it does and I don't expect it to
> be seriously stalled, usually. So waiting a tiny bit to cover this odd
> case of command queue contention seems useful to me, especially since we
> only send commands from non-critical Dom0 code.

I don't have any idea of a good value. My worry with such a value is
that you are only hoping it will never happen. If you fail here, what
will you do? You will likely have to revert changes, which means more
commands, and then? If it fails once, why would it not fail again? You
will end up in a spiral loop.

Regarding the value, is it something we could confirm with the hardware
guys?

> The command queue is now 1 MB in size, so we have 32,768 commands in
> there. Should be enough for everybody ;-)
>
>>> +        spin_unlock(&hw_its->cmd_lock);
>>> +        return -EBUSY;
>>> +    }
>>> +
>>> +    memcpy(hw_its->cmd_buf + writep, its_cmd, ITS_CMD_SIZE);
>>> +    if ( hw_its->flags & HOST_ITS_FLUSH_CMD_QUEUE )
>>> +        __flush_dcache_area(hw_its->cmd_buf + writep, ITS_CMD_SIZE);
>>
>> Please use dcache_.... helpers.
>>
>>> +    else
>>> +        dsb(ishst);
>>> +
>>> +    writep = (writep + ITS_CMD_SIZE) % ITS_CMD_QUEUE_SZ;
>>> +    writeq_relaxed(writep & BUFPTR_MASK, hw_its->its_base +
>>> GITS_CWRITER);
>>> +
>>> +    spin_unlock(&hw_its->cmd_lock);
>>> +
>>> +    return 0;
>>> +}
>>> +
>>> +static uint64_t encode_rdbase(struct host_its *hw_its, int cpu,
>>> uint64_t reg)
>>
>> s/int cpu/unsigned int cpu/
>
> So it's easy to do so, but why is that actually?

Because a CPU or vCPU ID can never be negative. So what's the point of
making them signed, except saving 9 characters?
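
One practical difference, shown here as a small standalone sketch rather
than as anything taken from the series: with a signed type a negative
value passes a plain upper-bound check, while with an unsigned type the
same comparison rejects it. The helper names below are hypothetical, and
NR_CPUS is set to 128 only because that is the value quoted elsewhere in
this thread.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 128u   /* value quoted elsewhere in this thread */

/* Hypothetical check on a signed CPU number: only the upper bound is tested. */
static bool cpu_valid_signed(int cpu)
{
    return cpu < (int)NR_CPUS;   /* -1 slips through; a lower bound is needed too */
}

/* Hypothetical check on an unsigned CPU number. */
static bool cpu_valid_unsigned(unsigned int cpu)
{
    return cpu < NR_CPUS;        /* (unsigned int)-1 is UINT_MAX, so it is rejected */
}

/* Hypothetical lookup, using the 1:1 CPU-to-collection mapping of the patch. */
static unsigned int collection_for_cpu(unsigned int cpu)
{
    assert(cpu_valid_unsigned(cpu));
    return cpu;
}
```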

> I see that both "processor" and "vcpu_id" are "int" in struct vcpu, so I
> was using int as the type for CPUs here as well.


[...]

>>> +{
>>> +    struct host_its *its;
>>> +    int ret;
>>> +
>>> +    list_for_each_entry(its, &host_its_list, entry)
>>> +    {
>>> +        /*
>>> +         * This function is called on CPU0 before any ITSes have been
>>> +         * properly initialized. Skip the collection setup in this case,
>>> +         * it will be done explicitly for CPU0 upon initializing the
>>> ITS.
>>> +         */
>>
>> Looking at the code, I don't understand why you need to do that. AFAIU
>> there are no restriction to initialize the ITS (e.g call gicv3_its_init)
>> before gicv3_cpu_init.
>
> Well, it's a bit more subtle: For initialising the ITS (the collection
> table entry, more specifically), we need to know the "rdbase", so either
> the physical address or the logical ID. Those we determine only
> somewhere deep in gicv3_cpu_init().
> So just moving gicv3_its_init() before gicv3_cpu_init() does not work. I
> will try and see if it's worth to split gicv3_its_init() into a generic
> and a per-CPU part, though I doubt that this is helpful.

Looking at the gicv3_its_init function, the only code requiring "rdbase"
is the mapping of the collection for CPU0. The rest is CPU agnostic and
should only populate the data structures.

So this does not answer why you need to wait until CPU0 is initialized
to populate those tables. The following path would be fine:
	gicv3_its_init();
		-> Populate BASER
	gicv3_cpu_init();
		-> gicv3_its_setup_collection()
			-> Initialize collection for CPU0

Cheers,

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [PATCH 06/28] ARM: GICv3 ITS: introduce ITS command handling
  2017-03-08 15:28       ` Julien Grall
@ 2017-03-08 16:16         ` Andre Przywara
  0 siblings, 0 replies; 106+ messages in thread
From: Andre Przywara @ 2017-03-08 16:16 UTC (permalink / raw)
  To: Julien Grall, Stefano Stabellini; +Cc: xen-devel, Vijay Kilari

Hi,

On 08/03/17 15:28, Julien Grall wrote:
> 
> 
> On 07/03/17 18:08, Andre Przywara wrote:
>> Hi Julien,
> 
> Hi Andre,
> 
>>
>> On 06/02/17 19:16, Julien Grall wrote:
>>> Hi Andre,
>>>
>>> On 30/01/17 18:31, Andre Przywara wrote:
>>>> To be able to easily send commands to the ITS, create the respective
>>>> wrapper functions, which take care of the ring buffer.
>>>> The first two commands we implement provide methods to map a collection
>>>> to a redistributor (aka host core) and to flush the command queue
>>>> (SYNC).
>>>> Start using these commands for mapping one collection to each host CPU.
>>>>
>>>> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
>>>> ---
>>>>  xen/arch/arm/gic-v3-its.c         | 142
>>>> +++++++++++++++++++++++++++++++++++++-
>>>>  xen/arch/arm/gic-v3-lpi.c         |  20 ++++++
>>>>  xen/arch/arm/gic-v3.c             |  18 ++++-
>>>>  xen/include/asm-arm/gic_v3_defs.h |   2 +
>>>>  xen/include/asm-arm/gic_v3_its.h  |  36 ++++++++++
>>>>  5 files changed, 215 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
>>>> index ad7cd2a..6578e8a 100644
>>>> --- a/xen/arch/arm/gic-v3-its.c
>>>> +++ b/xen/arch/arm/gic-v3-its.c
>>>> @@ -19,6 +19,7 @@
>>>>  #include <xen/config.h>
>>>>  #include <xen/lib.h>
>>>>  #include <xen/device_tree.h>
>>>> +#include <xen/delay.h>
>>>>  #include <xen/libfdt/libfdt.h>
>>>>  #include <xen/mm.h>
>>>>  #include <xen/sizes.h>
>>>> @@ -29,6 +30,98 @@
>>>>
>>>>  #define ITS_CMD_QUEUE_SZ                SZ_64K
>>>>
>>>> +#define BUFPTR_MASK                     GENMASK(19, 5)
>>>> +static int its_send_command(struct host_its *hw_its, const void
>>>> *its_cmd)
>>>> +{
>>>> +    uint64_t readp, writep;
>>>> +
>>>> +    spin_lock(&hw_its->cmd_lock);
>>>
>>> Do you never expect a command to be sent in an interrupt path? I could
>>> see at least one, we may decide to throttle the number of LPIs received
>>> by a guest so this would involve disabling the interrupt.
>>
>> I take it you are asking for spin_lock_irq[save]()?
> 
> Yes.
> 
>> I don't think queuing ITS commands in interrupt context is a good idea,
>> especially since I just introduced a grace period to wait for a draining
>> command queue.
>> I am happy to revisit this when needed.
> 
> As mentioned on the previous mail, we might need to send a command
> whilst in the interrupt context if we need to disable an interrupt that
> fire too often.
> 
> I would be fine to have an ASSERT(!in_irq()) and a comment explaining
> why for the time being.

Done.

>>
>>>> +
>>>> +    readp = readq_relaxed(hw_its->its_base + GITS_CREADR) &
>>>> BUFPTR_MASK;
>>>> +    writep = readq_relaxed(hw_its->its_base + GITS_CWRITER) &
>>>> BUFPTR_MASK;
>>>> +
>>>> +    if ( ((writep + ITS_CMD_SIZE) % ITS_CMD_QUEUE_SZ) == readp )
>>>> +    {
>>>
>>> I look at all the series applied and there is no error message at all
>>> when the queue is full. This will make difficult to see what's going on.
>>>
>>> Furthermore, this limit could be easily reached. Furthermore, this could
>>> happen easily if you decide to map a device with thousands of
>>> interrupts. For instance the function gicv3_map_its_map_host_events will
>>> issue 2 commands per event (MAPTI and INV).
>>>
>>> So how do you plan to address this?
>>
>> So I changed this now to wait for 1 ms (or whatever value you prefer) in
>> hope the command queue drains. In the end the ITS is hardware, so
>> processing commands it's the only thing it does and I don't expect it to
>> be seriously stalled, usually. So waiting a tiny bit to cover this odd
>> case of command queue contention seems useful to me, especially since we
>> only send commands from non-critical Dom0 code.
> 
> I don't have any idea of a good value. My worry with such value is you
> are only hoping it will never happen. If you fail here, what will you
> do? You will likely have to revert changes which mean more command and
> then? If it fails once, why would it not fail again? You will end up in
> a spiral loop.
> 
> Regarding the value, is it something we could confirm with the hardware
> guys?

Let's move this bikesh^Wfine-tuning to a later point in time ;-)

>> The command queue is now 1 MB in size, so we have 32,768 commands in
>> there. Should be enough for everybody ;-)
>>
>>>> +        spin_unlock(&hw_its->cmd_lock);
>>>> +        return -EBUSY;
>>>> +    }
>>>> +
>>>> +    memcpy(hw_its->cmd_buf + writep, its_cmd, ITS_CMD_SIZE);
>>>> +    if ( hw_its->flags & HOST_ITS_FLUSH_CMD_QUEUE )
>>>> +        __flush_dcache_area(hw_its->cmd_buf + writep, ITS_CMD_SIZE);
>>>
>>> Please use dcache_.... helpers.
>>>
>>>> +    else
>>>> +        dsb(ishst);
>>>> +
>>>> +    writep = (writep + ITS_CMD_SIZE) % ITS_CMD_QUEUE_SZ;
>>>> +    writeq_relaxed(writep & BUFPTR_MASK, hw_its->its_base +
>>>> GITS_CWRITER);
>>>> +
>>>> +    spin_unlock(&hw_its->cmd_lock);
>>>> +
>>>> +    return 0;
>>>> +}
>>>> +
>>>> +static uint64_t encode_rdbase(struct host_its *hw_its, int cpu,
>>>> uint64_t reg)
>>>
>>> s/int cpu/unsigned int cpu/
>>
>> So it's easy to do so, but why is that actually?
> 
> Because a CPU and a vCPU ID cannot be signed. So what's the point to
> make them signed except saving 9 characters?

That doesn't explain why the rest of Xen is using signed values, but
well, I will change it.

>> I see that both "processor" and "vcpu_id" are "int" in struct vcpu, so I
>> was using int as the type for CPUs here as well.
> 
> 
> [...]
> 
>>>> +{
>>>> +    struct host_its *its;
>>>> +    int ret;
>>>> +
>>>> +    list_for_each_entry(its, &host_its_list, entry)
>>>> +    {
>>>> +        /*
>>>> +         * This function is called on CPU0 before any ITSes have been
>>>> +         * properly initialized. Skip the collection setup in this
>>>> case,
>>>> +         * it will be done explicitly for CPU0 upon initializing the
>>>> ITS.
>>>> +         */
>>>
>>> Looking at the code, I don't understand why you need to do that. AFAIU
>>> there are no restriction to initialize the ITS (e.g call gicv3_its_init)
>>> before gicv3_cpu_init.
>>
>> Well, it's a bit more subtle: For initialising the ITS (the collection
>> table entry, more specifically), we need to know the "rdbase", so either
>> the physical address or the logical ID. Those we determine only
>> somewhere deep in gicv3_cpu_init().
>> So just moving gicv3_its_init() before gicv3_cpu_init() does not work. I
>> will try and see if it's worth to split gicv3_its_init() into a generic
>> and a per-CPU part, though I doubt that this is helpful.
> 
> Looking at the gicv3_its_init function, the only code requiring "rdbase"
> is mapping the collection for the CPU0. The rest is CPU agnostic and
> should only populate the data structure.
> 
> So this does not answer why you need to wait until CPU0 is initialized
> to populate those table. The following path would be fine
>     gicv3_its_init();
>         -> Populate BASER
>     gicv3_cpu_init();
>         -> gicv3_its_setup_collection()
>             -> Initialize collection for CPU0

Yeah, I changed it similarly yesterday:
Move the gicv3_its_setup_collection() call from the LPI setup path to be
called just before the gicv3_enable_lpis() call. And then move the call
to gicv3_its_init() to be done directly after gicv3_dist_init(), so
before gicv3_cpu_init().
That seems to work and allows us to get rid of the extra handling.
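
A rough standalone model of that ordering, for illustration only: the
functions below are empty stubs that merely record when they are called,
they share names with the real Xen functions but none of their bodies,
and boot_cpu_gic_init() is a made-up wrapper. That the collection setup
and gicv3_enable_lpis() are both reached from within gicv3_cpu_init() is
an assumption of this sketch.

```c
#include <assert.h>

enum step { DIST_INIT, ITS_INIT, CPU_INIT, SETUP_COLLECTION, ENABLE_LPIS };

#define MAX_STEPS 8
static enum step trace[MAX_STEPS];
static unsigned int nr_steps;

/* Append one step to the call trace. */
static void record(enum step s)
{
    assert(nr_steps < MAX_STEPS);
    trace[nr_steps++] = s;
}

/* Stubs standing in for the real functions; they only record the call. */
static void gicv3_dist_init(void)   { record(DIST_INIT); }
static void gicv3_its_init(void)    { record(ITS_INIT); }   /* CPU-agnostic table setup */
static void gicv3_enable_lpis(void) { record(ENABLE_LPIS); }

static void gicv3_its_setup_collection(unsigned int cpu)
{
    (void)cpu;                      /* the real code would issue MAPC + SYNC */
    record(SETUP_COLLECTION);
}

static void gicv3_cpu_init(unsigned int cpu)
{
    record(CPU_INIT);               /* rdbase for this CPU becomes known here */
    gicv3_its_setup_collection(cpu);
    gicv3_enable_lpis();
}

/* Hypothetical wrapper showing the boot CPU path described above. */
static void boot_cpu_gic_init(void)
{
    nr_steps = 0;
    gicv3_dist_init();
    gicv3_its_init();               /* directly after the distributor init */
    gicv3_cpu_init(0);
}
```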

Cheers,
Andre.

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [PATCH 04/28] ARM: GICv3 ITS: allocate device and collection table
  2017-02-06 17:43   ` Julien Grall
@ 2017-03-23 18:06     ` Andre Przywara
  2017-03-23 18:08       ` Julien Grall
  0 siblings, 1 reply; 106+ messages in thread
From: Andre Przywara @ 2017-03-23 18:06 UTC (permalink / raw)
  To: Julien Grall, Stefano Stabellini; +Cc: xen-devel, Vijay Kilari

Hi,

On 06/02/17 17:43, Julien Grall wrote:
> Hi,
> 
> On 30/01/17 18:31, Andre Przywara wrote:
>> +int gicv3_its_init(struct host_its *hw_its)
>> +{
>> +    uint64_t reg;
>> +    int i;
>> +
>> +    hw_its->its_base = ioremap_nocache(hw_its->addr, hw_its->size);
>> +    if ( !hw_its->its_base )
>> +        return -ENOMEM;
>> +
>> +    for ( i = 0; i < GITS_BASER_NR_REGS; i++ )
>> +    {
>> +        void __iomem *basereg = hw_its->its_base + GITS_BASER0 + i * 8;
>> +        int type;
>> +
>> +        reg = readq_relaxed(basereg);
>> +        type = (reg & GITS_BASER_TYPE_MASK) >> GITS_BASER_TYPE_SHIFT;
>> +        switch ( type )
>> +        {
>> +        case GITS_BASER_TYPE_NONE:
>> +            continue;
>> +        case GITS_BASER_TYPE_DEVICE:
>> +            /* TODO: find some better way of limiting the number of
>> devices */
>> +            its_map_baser(basereg, reg, BIT(max_its_device_bits));
>> +            break;
>> +        case GITS_BASER_TYPE_COLLECTION:
>> +            its_map_baser(basereg, reg, NR_CPUS);
> 
> And I forgot to mention about the collection. Same remark as for the
> device collection, NR_CPUS is the maximum size.

NR_CPUS is 128, entry size for each collection is probably around 8 or
16 bytes, if at all. This gives me half a page, worst case.
The granularity of the table memory handed to the ITS is (64K|16K|4K),
so as we only hand over whole pages to the ITS, I don't see how we can
save memory here.
Besides, we have other memory issues to worry about than this single 64K
allocated at boot time.

So if you don't mind, I'd just keep it as it is. I am happy to revisit
this once NR_CPUS gets significantly increased.
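
To make the arithmetic explicit, here is a small standalone sketch (not
part of the patch). It assumes the larger of the two guessed entry sizes,
16 bytes, and its_table_bytes() is a hypothetical helper that rounds a
table up to whole ITS pages.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: bytes handed to the ITS for a table of nr_entries
 * entries, rounded up to whole pages. page_size must be a power of two
 * (4K, 16K or 64K for the ITS).
 */
static uint64_t its_table_bytes(uint64_t nr_entries, uint64_t entry_size,
                                uint64_t page_size)
{
    uint64_t raw = nr_entries * entry_size;

    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    return (raw + page_size - 1) & ~(page_size - 1);
}
```

With 128 entries of 16 bytes the raw size is 2048 bytes, so the result is
one page for every granule size, and a machine with only 8 CPUs would end
up with the same one-page allocation.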

Cheers,
Andre.

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [PATCH 04/28] ARM: GICv3 ITS: allocate device and collection table
  2017-03-23 18:06     ` Andre Przywara
@ 2017-03-23 18:08       ` Julien Grall
  0 siblings, 0 replies; 106+ messages in thread
From: Julien Grall @ 2017-03-23 18:08 UTC (permalink / raw)
  To: Andre Przywara, Stefano Stabellini; +Cc: xen-devel, Vijay Kilari



On 23/03/17 18:06, Andre Przywara wrote:
> Hi,

Hi Andre,

> On 06/02/17 17:43, Julien Grall wrote:
>> Hi,
>>
>> On 30/01/17 18:31, Andre Przywara wrote:
>>> +int gicv3_its_init(struct host_its *hw_its)
>>> +{
>>> +    uint64_t reg;
>>> +    int i;
>>> +
>>> +    hw_its->its_base = ioremap_nocache(hw_its->addr, hw_its->size);
>>> +    if ( !hw_its->its_base )
>>> +        return -ENOMEM;
>>> +
>>> +    for ( i = 0; i < GITS_BASER_NR_REGS; i++ )
>>> +    {
>>> +        void __iomem *basereg = hw_its->its_base + GITS_BASER0 + i * 8;
>>> +        int type;
>>> +
>>> +        reg = readq_relaxed(basereg);
>>> +        type = (reg & GITS_BASER_TYPE_MASK) >> GITS_BASER_TYPE_SHIFT;
>>> +        switch ( type )
>>> +        {
>>> +        case GITS_BASER_TYPE_NONE:
>>> +            continue;
>>> +        case GITS_BASER_TYPE_DEVICE:
>>> +            /* TODO: find some better way of limiting the number of
>>> devices */
>>> +            its_map_baser(basereg, reg, BIT(max_its_device_bits));
>>> +            break;
>>> +        case GITS_BASER_TYPE_COLLECTION:
>>> +            its_map_baser(basereg, reg, NR_CPUS);
>>
>> And I forgot to mention about the collection. Same remark as for the
>> device collection, NR_CPUS is the maximum size.
>
> NR_CPUS is 128, entry size for each collection is probably around 8 or
> 16 bytes, if at all. This gives me half a page, worst case.
> The granularity of the table memory handed to the ITS is (64K|16K|4K),
> so as we only hand over whole pages to the ITS, I don't see how we can
> save memory here.
> Beside, we have other memory issues to worry about than this single 64K
> allocated at boot time.
>
> So if you don't mind, I'd just keep it as it is. I am happy to revisit
> this once NR_CPUS gets significantly increased.

Replacing NR_CPUS by nr_cpu_ids would have addressed my comment and
required fewer keystrokes than writing this e-mail.

Cheers,

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 106+ messages in thread

end of thread, other threads:[~2017-03-23 18:08 UTC | newest]

Thread overview: 106+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-01-30 18:31 [PATCH 00/28] arm64: Dom0 ITS emulation Andre Przywara
2017-01-30 18:31 ` [PATCH 01/28] ARM: export __flush_dcache_area() Andre Przywara
2017-02-06 11:23   ` Julien Grall
2017-01-30 18:31 ` [PATCH 02/28] ARM: GICv3 ITS: parse and store ITS subnodes from hardware DT Andre Przywara
2017-02-06 12:39   ` Julien Grall
2017-02-16 17:44     ` Andre Przywara
2017-02-16 18:15       ` Julien Grall
2017-02-06 12:58   ` Julien Grall
2017-02-27 11:43     ` Andre Przywara
2017-02-27 12:51       ` Julien Grall
2017-01-30 18:31 ` [PATCH 03/28] ARM: GICv3: allocate LPI pending and property table Andre Przywara
2017-02-06 16:26   ` Julien Grall
2017-02-27 11:34     ` Andre Przywara
2017-02-27 12:48       ` Julien Grall
2017-02-14  0:47   ` Stefano Stabellini
2017-01-30 18:31 ` [PATCH 04/28] ARM: GICv3 ITS: allocate device and collection table Andre Przywara
2017-02-06 17:19   ` Julien Grall
2017-02-14  0:55     ` Stefano Stabellini
2017-02-06 17:36   ` Julien Grall
2017-02-06 17:43   ` Julien Grall
2017-03-23 18:06     ` Andre Przywara
2017-03-23 18:08       ` Julien Grall
2017-02-14  0:54   ` Stefano Stabellini
2017-02-15 18:31   ` Shanker Donthineni
2017-02-16 19:03   ` Shanker Donthineni
2017-02-24 19:29     ` Shanker Donthineni
2017-02-27 10:23       ` Andre Przywara
2017-01-30 18:31 ` [PATCH 05/28] ARM: GICv3 ITS: map ITS command buffer Andre Przywara
2017-02-06 17:43   ` Julien Grall
2017-02-14  0:59   ` Stefano Stabellini
2017-02-14 20:50     ` Julien Grall
2017-02-14 21:00       ` Stefano Stabellini
2017-01-30 18:31 ` [PATCH 06/28] ARM: GICv3 ITS: introduce ITS command handling Andre Przywara
2017-02-06 19:16   ` Julien Grall
2017-02-07 11:44     ` Julien Grall
2017-03-07 18:08     ` Andre Przywara
2017-03-08 15:28       ` Julien Grall
2017-03-08 16:16         ` Andre Przywara
2017-02-07 11:59   ` Julien Grall
2017-01-30 18:31 ` [PATCH 07/28] ARM: GICv3 ITS: introduce device mapping Andre Przywara
2017-02-07 14:05   ` Julien Grall
2017-02-15 16:30   ` Julien Grall
2017-02-22  7:06   ` Vijay Kilari
2017-02-24 19:37     ` Shanker Donthineni
2017-02-22 13:17   ` Julien Grall
2017-01-30 18:31 ` [PATCH 08/28] ARM: GICv3 ITS: introduce host LPI array Andre Przywara
2017-02-07 18:01   ` Julien Grall
2017-02-14 20:05   ` Stefano Stabellini
2017-01-30 18:31 ` [PATCH 09/28] ARM: GICv3 ITS: map device and LPIs to the ITS on physdev_op hypercall Andre Przywara
2017-01-31 10:29   ` Jaggi, Manish
2017-01-31 12:43     ` Julien Grall
2017-01-31 13:19       ` Jaggi, Manish
2017-01-31 13:46         ` Julien Grall
2017-01-31 14:08           ` Jaggi, Manish
2017-01-31 15:17             ` Julien Grall
2017-01-31 16:02               ` Jaggi, Manish
2017-01-31 16:18                 ` Julien Grall
2017-02-24 19:57                   ` Shanker Donthineni
2017-02-24 20:28                     ` Julien Grall
2017-02-27 17:20                     ` Andre Przywara
2017-02-28 18:29                       ` Julien Grall
2017-03-01 19:42                         ` Shanker Donthineni
2017-03-03 15:53                           ` Julien Grall
2017-01-31 13:28       ` Jaggi, Manish
2017-02-14 20:11   ` Stefano Stabellini
2017-01-30 18:31 ` [PATCH 10/28] ARM: GICv3: introduce separate pending_irq structs for LPIs Andre Przywara
2017-02-14 20:39   ` Stefano Stabellini
2017-02-15 17:06     ` Julien Grall
2017-02-15 17:03   ` Julien Grall
2017-01-30 18:31 ` [PATCH 11/28] ARM: GICv3: forward pending LPIs to guests Andre Przywara
2017-02-14 21:00   ` Stefano Stabellini
2017-02-15 17:18     ` Julien Grall
2017-02-15 21:25       ` Stefano Stabellini
2017-03-02 20:56         ` Julien Grall
2017-03-03  7:58           ` Jan Beulich
2017-03-03 14:53             ` Julien Grall
2017-02-15 17:30   ` Julien Grall
2017-01-30 18:31 ` [PATCH 12/28] ARM: GICv3: enable ITS and LPIs on the host Andre Przywara
2017-02-14 22:41   ` Stefano Stabellini
2017-02-15 17:35   ` Julien Grall
2017-01-30 18:31 ` [PATCH 13/28] ARM: vGICv3: handle virtual LPI pending and property tables Andre Przywara
2017-02-14 23:56   ` Stefano Stabellini
2017-02-15 18:44   ` Julien Grall
2017-01-30 18:31 ` [PATCH 14/28] ARM: vGICv3: Handle disabled LPIs Andre Przywara
2017-02-14 23:58   ` Stefano Stabellini
2017-01-30 18:31 ` [PATCH 15/28] ARM: vGICv3: introduce basic ITS emulation bits Andre Przywara
2017-02-15 20:06   ` Shanker Donthineni
2017-01-30 18:31 ` [PATCH 16/28] ARM: vITS: introduce translation table walks Andre Przywara
2017-01-30 18:31 ` [PATCH 17/28] ARM: vITS: handle CLEAR command Andre Przywara
2017-02-15  0:07   ` Stefano Stabellini
2017-01-30 18:31 ` [PATCH 18/28] ARM: vITS: handle INT command Andre Przywara
2017-01-30 18:31 ` [PATCH 19/28] ARM: vITS: handle MAPC command Andre Przywara
2017-01-30 18:31 ` [PATCH 20/28] ARM: vITS: handle MAPD command Andre Przywara
2017-02-15  0:17   ` Stefano Stabellini
2017-01-30 18:31 ` [PATCH 21/28] ARM: vITS: handle MAPTI command Andre Przywara
2017-01-30 18:31 ` [PATCH 22/28] ARM: vITS: handle MOVI command Andre Przywara
2017-01-30 18:31 ` [PATCH 23/28] ARM: vITS: handle DISCARD command Andre Przywara
2017-01-30 18:31 ` [PATCH 24/28] ARM: vITS: handle INV command Andre Przywara
2017-01-30 18:31 ` [PATCH 25/28] ARM: vITS: handle INVALL command Andre Przywara
2017-01-30 18:31 ` [PATCH 26/28] ARM: vITS: create and initialize virtual ITSes for Dom0 Andre Przywara
2017-01-30 18:31 ` [PATCH 27/28] ARM: vITS: create ITS subnodes for Dom0 DT Andre Przywara
2017-01-30 18:31 ` [PATCH 28/28] ARM: vGIC: advertising LPI support Andre Przywara
2017-02-13 13:53 ` [PATCH 00/28] arm64: Dom0 ITS emulation Vijay Kilari
2017-02-14 22:00   ` Stefano Stabellini
2017-02-15 15:59   ` Julien Grall
2017-02-15 17:55 ` Julien Grall
