* [PATCH v3 00/26] arm64: Dom0 ITS emulation
@ 2017-03-31 18:04 Andre Przywara
  2017-03-31 18:05 ` [PATCH v3 01/26] ARM: GICv3 ITS: parse and store ITS subnodes from hardware DT Andre Przywara
                   ` (26 more replies)
  0 siblings, 27 replies; 52+ messages in thread
From: Andre Przywara @ 2017-03-31 18:04 UTC (permalink / raw)
  To: Julien Grall, Stefano Stabellini
  Cc: xen-devel, Shanker Donthineni, Vijay Kilari

Hi,

another desperate try to get the ITS Dom0 emulation series ready.
Major changes this time:
- Instead of allocating struct pending_irq's on the fly during LPI injection,
  we allocate them upon mapping a device and assign them upon the LPI
  mapping into a radix tree, so that we can quickly look them up.
- Instead of mapping the virtual collection, device and property tables
  upon the ITS/redistributor enablement, we just mark those pages now and
  map them only for the brief period we actually need them.
  This changes the map_guest_pages() functions to only map a single page,
  which avoids some issues that have been spotted. I believe the
  put_guest_pages() implementation is still fishy; I am happy to take any
  advice on how to fix this.
  Since on-demand mapping does not seem practical for the pending and
  property tables, we cache the information in those in the newly available
  pending_irq structs.
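
To illustrate the first item, here is a minimal sketch of the
"allocate on device mapping, assign on LPI mapping, look up on injection"
idea. All names are made up for this example, and a flat array indexed by
the virtual LPI number stands in for the radix tree, so this is not the
code from the patches:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define LPI_OFFSET 8192U   /* LPI interrupt IDs start at 8192 */
#define MAX_VLPIS  1024U   /* hypothetical limit for this sketch */

/* Hypothetical, stripped-down stand-in for struct pending_irq. */
struct pending_irq_sketch {
    uint32_t irq;
    uint8_t priority;
    uint8_t enabled;
};

/* Hypothetical per-device state. */
struct its_device_sketch {
    uint32_t nr_events;
    struct pending_irq_sketch *pend_irqs;
};

/* Stand-in for the radix tree: indexed by (vLPI - LPI_OFFSET). */
static struct pending_irq_sketch *vlpi_lookup[MAX_VLPIS];

/* Device mapping: allocate one struct per event up front. */
static int device_map(struct its_device_sketch *dev, uint32_t nr_events)
{
    assert(dev != NULL);

    dev->pend_irqs = calloc(nr_events, sizeof(*dev->pend_irqs));
    if ( !dev->pend_irqs )
        return -1;
    dev->nr_events = nr_events;

    return 0;
}

/* LPI mapping: assign a preallocated struct and publish it for lookup. */
static int lpi_map(struct its_device_sketch *dev, uint32_t event,
                   uint32_t vlpi)
{
    if ( event >= dev->nr_events )
        return -1;
    if ( vlpi < LPI_OFFSET || vlpi - LPI_OFFSET >= MAX_VLPIS )
        return -1;

    dev->pend_irqs[event].irq = vlpi;
    vlpi_lookup[vlpi - LPI_OFFSET] = &dev->pend_irqs[event];

    return 0;
}

/* Injection path: a pure lookup, no allocation needed. */
static struct pending_irq_sketch *lpi_to_pending(uint32_t vlpi)
{
    if ( vlpi < LPI_OFFSET || vlpi - LPI_OFFSET >= MAX_VLPIS )
        return NULL;

    return vlpi_lookup[vlpi - LPI_OFFSET];
}
```

The point of the split is that the only step which can fail with an
out-of-memory condition happens when the device is mapped, not in the
interrupt injection path.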

The time I planned for the indirect device table was spent on the above two
items, so I will write this now while the reviewers are on it.

I tried to check every error return and kick out every signed int.
Also the bug that Vijay reported has been fixed (I hope).
While the two command line parameters are still around, the Kconfig
options have been removed.
I tried to separate functions between the existing VGIC and the LPI
and ITS code parts. However there is always some connection which prevents
a clean separation (I tried several approaches).
I checked using the vgic_ops structure, but that feels like abuse for some
functions, also has issues since a GICv3 and a GICv3 with ITS are not
really separate (both could have LPIs), and the latter would always be a
superset of the former, which duplicates code and makes a separate
vgic_ops questionable.
Apart from that there were a lot of fixes and reworks here and there, too
numerous to list here.
If I can be of help with anything, please let me know.

Cheers,
Andre

----------------------------------
This series adds support for emulation of an ARM GICv3 ITS interrupt
controller. For hardware which relies on the ITS to provide interrupts for
its peripherals this code is needed to get a machine booted into Dom0 at
all. ITS emulation for DomUs is only really useful with PCI passthrough,
which is not yet available for ARM. It is expected that this feature
will be co-developed with the ITS DomU code. However this code drop here
considered DomU emulation already, to keep later architectural changes
to a minimum.

Some generic design principles:

* The current GIC code statically allocates structures for each supported
IRQ (both for the host and the guest), which due to the potentially
millions of LPI interrupts is not feasible to copy for the ITS.
So we refrain from introducing the ITS as a first class Xen interrupt
controller, and we don't hold struct irq_desc's or struct pending_irq's
for each possible LPI.
Fortunately LPIs are only interesting to guests, so we get away with
storing only the virtual IRQ number and the guest VCPU for each allocated
host LPI, which can be stashed into one uint64_t. This data is stored in
a two-level table, which is both memory efficient and quick to access.
We hook into the existing IRQ handling and VGIC code to avoid accessing
the normal structures, providing alternative methods for getting the
needed information (priority, is enabled?) for LPIs.
For interrupts which are queued to or are actually in a guest we
allocate struct pending_irq's on demand. As it is expected that only a
very small number of interrupts is ever on a VCPU at the same time, this
seems like the best approach. For now allocated structs are re-used and
held in a linked list. Should it emerge that traversing a linked list
is a performance issue, this can be changed to use a hash table.
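
As a rough illustration of the host LPI bookkeeping described in this
item, the sketch below packs a virtual LPI number and a VCPU ID into one
uint64_t and keeps those entries in a two-level table whose leaves are
allocated on first use. The field widths, the table geometry and all
names are assumptions made for this example. The actual series also uses
atomic accessors for this data, which the sketch leaves out:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define LPI_OFFSET       8192U          /* LPI interrupt IDs start at 8192 */

/* Hypothetical layout: bits [31:0] virtual LPI, bits [47:32] VCPU ID. */
#define HLPI_VLPI_MASK   0xffffffffULL
#define HLPI_VCPU_SHIFT  32
#define HLPI_VCPU_MASK   0xffffULL

/* Hypothetical geometry: 512 entries (one 4K page) per leaf. */
#define HLPIS_PER_LEAF   512U
#define NR_LEAVES        64U

/* First level: pointers to leaves, NULL until a leaf is needed. */
static uint64_t *host_lpi_leaves[NR_LEAVES];

static uint64_t hlpi_pack(uint32_t virt_lpi, uint16_t vcpu_id)
{
    assert(virt_lpi >= LPI_OFFSET);

    return ((uint64_t)vcpu_id << HLPI_VCPU_SHIFT) | virt_lpi;
}

static uint32_t hlpi_virt_lpi(uint64_t entry)
{
    return (uint32_t)(entry & HLPI_VLPI_MASK);
}

static uint16_t hlpi_vcpu(uint64_t entry)
{
    return (uint16_t)((entry >> HLPI_VCPU_SHIFT) & HLPI_VCPU_MASK);
}

/* Store an entry, allocating the leaf on first use. Returns 0 or -1. */
static int hlpi_set(uint32_t idx, uint64_t entry)
{
    uint32_t leaf = idx / HLPIS_PER_LEAF;

    if ( leaf >= NR_LEAVES )
        return -1;

    if ( !host_lpi_leaves[leaf] )
    {
        host_lpi_leaves[leaf] = calloc(HLPIS_PER_LEAF, sizeof(uint64_t));
        if ( !host_lpi_leaves[leaf] )
            return -1;
    }

    host_lpi_leaves[leaf][idx % HLPIS_PER_LEAF] = entry;

    return 0;
}

/* Look up an entry; 0 means "no virtual LPI assigned". */
static uint64_t hlpi_get(uint32_t idx)
{
    uint32_t leaf = idx / HLPIS_PER_LEAF;

    if ( leaf >= NR_LEAVES || !host_lpi_leaves[leaf] )
        return 0;

    return host_lpi_leaves[leaf][idx % HLPIS_PER_LEAF];
}
```

Memory is only spent on leaves that actually hold allocated host LPIs,
while a lookup stays at two array accesses.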

* On the guest side we (later will) have to deal with malicious guests
trying to hog Xen with mapping requests for a lot of LPIs, for instance.
As the ITS actually uses system memory for storing status information,
we use this memory (which the guest has to provide) to naturally limit
a guest. For those tables which are page sized (devices, collections (CPUs),
LPI properties) we map those pages into Xen, so we can easily access
them from the virtual GIC code.
Unfortunately the actual interrupt mapping tables are not necessarily
page aligned and can be much smaller than a page, so mapping all of
them permanently is fiddly. As ITS commands which need to iterate those
tables are pretty rare after all, we for now map them on demand upon
emulating a virtual ITS command. This is acceptable because "mapping"
them is actually very cheap on arm64. Also as we can't properly protect
those areas due to their sub-page-size property, we validate the data
in there before actually using it. The vITS code basically just stores
the data in there which the guest has actually transferred via the
virtual ITS command queue before, so there is no secret revealed nor
does it create an attack vector for a malicious guest.
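
The "validate before use" step for data living in guest memory could look
roughly like the sketch below. The entry format and all names are
hypothetical. Copying the entry out before checking it is a choice made
for this example (so the guest cannot change it between check and use)
and is not necessarily how the patches do it:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define LPI_OFFSET 8192U   /* LPI interrupt IDs start at 8192 */

/* Hypothetical format of one translation table entry in guest memory. */
struct vits_itt_entry {
    uint32_t vlpi;
    uint16_t collection;
    uint16_t valid;
};

/*
 * Read the entry for "event" from a (temporarily mapped) guest table.
 * The entry is copied first and only the private copy is validated and
 * handed to the caller. Returns false if the entry must not be used.
 */
static bool vits_read_itt_entry(const void *guest_itt, uint32_t nr_events,
                                uint32_t event, uint32_t nr_lpis,
                                uint16_t nr_vcpus,
                                struct vits_itt_entry *out)
{
    struct vits_itt_entry e;

    assert(guest_itt != NULL && out != NULL);

    if ( event >= nr_events )
        return false;

    memcpy(&e, (const uint8_t *)guest_itt + (size_t)event * sizeof(e),
           sizeof(e));

    if ( !e.valid )
        return false;
    if ( e.vlpi < LPI_OFFSET || e.vlpi >= nr_lpis )
        return false;
    if ( e.collection >= nr_vcpus )
        return false;

    *out = e;

    return true;
}
```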

* An obvious approach to handling some guest ITS commands would be to
propagate them to the host, for instance to map devices and LPIs and
to enable or disable LPIs.
However this (later with DomU support) will create an attack vector, as
a malicious guest could try to fill the host command queue with
propagated commands.
So we try to avoid this situation: Dom0 sending a device mapping (MAPD)
command is the only time we allow queuing commands to the host ITS command
queue, as this seems to be the only reliable way of getting the
required information at the moment. However at the same time we map all
events to LPIs already and also enable them. This avoids sending commands
later at runtime, as we can deal with mappings and LPI enabling/disabling
internally.

As it is expected that the ITS support will become a tech preview in the
first release, there is a Kconfig option to enable it. Also it is
supported on arm64 only, which will most likely not change in the future.
This leads to some hideous constructs like an #ifdef'ed header file with
empty function stubs; I have some hope we can still clean this up.
Also some parameters are config options which can be overridden on the
Xen command line. This is to support experimentation and adaptation to
various platforms; ideally we find either one-size-fits-all values or
find another way of getting rid of this.

This code boots Dom0 on an ARM Fast Model with ITS support. I tried to
address the issues seen by people running the previous version on real
hardware, though couldn't verify this here for myself.
So any testing, bug reports (and possibly even fixes) are very welcome.

The code can also be found on the its/v2 branch here:
git://linux-arm.org/xen-ap.git
http://www.linux-arm.org/git?p=xen-ap.git;a=shortlog;h=refs/heads/its/v2

Cheers,
Andre

Changelog v2 .. v3:
- preallocate struct pending_irq's
- map ITS and redistributor tables only on demand
- store property, enable and pending bit in struct pending_irq
- improve error checking and handling
- add comments

Changelog v1 .. v2:
- clean up header file inclusion
- rework host ITS table allocation: observe attributes, many fixes
- remove patch 1 to export __flush_dcache_area, use existing function instead
- use number of LPIs internally instead of number of bits
- keep host_its_list as private as possible
- keep struct its_devices private
- rework gicv3_its_map_guest_devices
- fix rbtree issues
- more error handling and propagation
- cope with GICv4 implementations (but no virtual LPI features!)
- abstract host and guest ITSes by using doorbell addresses
- join per-redistributor variables into one per-CPU structure
- fix data types (unsigned int)
- many minor bug fixes

(Rough) changelog RFC-v2 .. v1:
- split host ITS driver into gic-v3-lpi.c and gic-v3-its.c part
- rename virtual ITS driver file to vgic-v3-its.c
- use macros and named constants for all magic numbers
- use atomic accessors for accessing the host LPI data
- remove leftovers from connecting virtual and host ITSes
- bail out if host ITS is disabled in the DT
- rework map/unmap_guest_pages():
    - split off p2m part as get/put_guest_pages (to be done on allocation)
    - get rid of vmap, using map_domain_page() instead
- delay allocation of virtual tables until actual LPI/ITS enablement
- properly size both virtual and physical tables upon allocation
- fix put_domain() locking issues in physdev_op and LPI handling code
- add and extend comments in various areas
- fix lotsa coding style and white space issues, including comment style
- add locking to data structures not yet covered
- fix various locking issues
- use an rbtree to deal with ITS devices (instead of a list)
- properly handle memory attributes for ITS tables
- handle cacheable/non-cacheable ITS table mappings
- sanitize guest provided ITS/LPI table attributes
- fix breakage on non-GICv2 compatible host GICv3 controllers
- add command line parameters on top of Kconfig options
- properly wait for an ITS to become quiescent before enabling it
- handle host ITS command queue errors
- actually wait for host ITS command completion (READR==WRITER)
- fix ARM32 compilation
- various patch splits and reorderings

Andre Przywara (26):
  ARM: GICv3 ITS: parse and store ITS subnodes from hardware DT
  ARM: GICv3: allocate LPI pending and property table
  ARM: GICv3 ITS: allocate device and collection table
  ARM: GICv3 ITS: map ITS command buffer
  ARM: GICv3 ITS: introduce ITS command handling
  ARM: GICv3 ITS: introduce device mapping
  ARM: GICv3 ITS: introduce host LPI array
  ARM: GICv3: introduce separate pending_irq structs for LPIs
  ARM: GICv3: forward pending LPIs to guests
  ARM: GICv3: enable ITS and LPIs on the host
  ARM: vGICv3: handle virtual LPI pending and property tables
  ARM: vGICv3: Handle disabled LPIs
  ARM: vGICv3: introduce basic ITS emulation bits
  ARM: vITS: introduce translation table walks
  ARM: vITS: handle CLEAR command
  ARM: vITS: handle INT command
  ARM: vITS: handle MAPC command
  ARM: vITS: handle MAPD command
  ARM: vITS: handle MAPTI command
  ARM: vITS: handle MOVI command
  ARM: vITS: handle DISCARD command
  ARM: vITS: handle INV command
  ARM: vITS: handle INVALL command
  ARM: vITS: create and initialize virtual ITSes for Dom0
  ARM: vITS: create ITS subnodes for Dom0 DT
  ARM: vGIC: advertise LPI support

 docs/misc/xen-command-line.markdown |   18 +
 xen/arch/arm/Kconfig                |    4 +
 xen/arch/arm/Makefile               |    3 +
 xen/arch/arm/gic-v3-its.c           |  968 ++++++++++++++++++++++++++++++
 xen/arch/arm/gic-v3-lpi.c           |  519 ++++++++++++++++
 xen/arch/arm/gic-v3.c               |   75 ++-
 xen/arch/arm/gic.c                  |   20 +-
 xen/arch/arm/vgic-v3-its.c          | 1116 +++++++++++++++++++++++++++++++++++
 xen/arch/arm/vgic-v3.c              |  272 ++++++++-
 xen/arch/arm/vgic.c                 |   16 +-
 xen/common/memory.c                 |   61 ++
 xen/include/asm-arm/bitops.h        |    1 +
 xen/include/asm-arm/config.h        |    2 +
 xen/include/asm-arm/domain.h        |   12 +-
 xen/include/asm-arm/gic.h           |    2 +
 xen/include/asm-arm/gic_v3_defs.h   |   75 ++-
 xen/include/asm-arm/gic_v3_its.h    |  256 ++++++++
 xen/include/asm-arm/irq.h           |   15 +
 xen/include/asm-arm/vgic.h          |    6 +
 xen/include/xen/bitops.h            |    5 +-
 xen/include/xen/mm.h                |    8 +
 21 files changed, 3411 insertions(+), 43 deletions(-)
 create mode 100644 xen/arch/arm/gic-v3-its.c
 create mode 100644 xen/arch/arm/gic-v3-lpi.c
 create mode 100644 xen/arch/arm/vgic-v3-its.c
 create mode 100644 xen/include/asm-arm/gic_v3_its.h

-- 
2.9.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


* [PATCH v3 01/26] ARM: GICv3 ITS: parse and store ITS subnodes from hardware DT
  2017-03-31 18:04 [PATCH v3 00/26] arm64: Dom0 ITS emulation Andre Przywara
@ 2017-03-31 18:05 ` Andre Przywara
  2017-03-31 23:08   ` Stefano Stabellini
  2017-03-31 18:05 ` [PATCH v3 02/26] ARM: GICv3: allocate LPI pending and property table Andre Przywara
                   ` (25 subsequent siblings)
  26 siblings, 1 reply; 52+ messages in thread
From: Andre Przywara @ 2017-03-31 18:05 UTC (permalink / raw)
  To: Julien Grall, Stefano Stabellini
  Cc: xen-devel, Shanker Donthineni, Vijay Kilari

Parse the DT GIC subnodes to find every ITS MSI controller the hardware
offers. Store that information in a list, both to propagate all of them
later to Dom0 and to be able to iterate over all ITSes.
This introduces an ITS Kconfig option.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
 xen/arch/arm/Kconfig             |  4 +++
 xen/arch/arm/Makefile            |  1 +
 xen/arch/arm/gic-v3-its.c        | 73 ++++++++++++++++++++++++++++++++++++++++
 xen/arch/arm/gic-v3.c            | 10 +++---
 xen/include/asm-arm/gic_v3_its.h | 67 ++++++++++++++++++++++++++++++++++++
 5 files changed, 151 insertions(+), 4 deletions(-)
 create mode 100644 xen/arch/arm/gic-v3-its.c
 create mode 100644 xen/include/asm-arm/gic_v3_its.h

diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
index 2e023d1..bf64c61 100644
--- a/xen/arch/arm/Kconfig
+++ b/xen/arch/arm/Kconfig
@@ -45,6 +45,10 @@ config ACPI
 config HAS_GICV3
 	bool
 
+config HAS_ITS
+        bool "GICv3 ITS MSI controller support"
+        depends on HAS_GICV3
+
 endmenu
 
 menu "ARM errata workaround via the alternative framework"
diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
index 7afb8a3..54860e0 100644
--- a/xen/arch/arm/Makefile
+++ b/xen/arch/arm/Makefile
@@ -18,6 +18,7 @@ obj-$(EARLY_PRINTK) += early_printk.o
 obj-y += gic.o
 obj-y += gic-v2.o
 obj-$(CONFIG_HAS_GICV3) += gic-v3.o
+obj-$(CONFIG_HAS_ITS) += gic-v3-its.o
 obj-y += guestcopy.o
 obj-y += hvm.o
 obj-y += io.o
diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
new file mode 100644
index 0000000..4056e5b
--- /dev/null
+++ b/xen/arch/arm/gic-v3-its.c
@@ -0,0 +1,73 @@
+/*
+ * xen/arch/arm/gic-v3-its.c
+ *
+ * ARM GICv3 Interrupt Translation Service (ITS) support
+ *
+ * Copyright (C) 2016,2017 - ARM Ltd
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; under version 2 of the License.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <xen/lib.h>
+#include <asm/gic_v3_defs.h>
+#include <asm/gic_v3_its.h>
+
+LIST_HEAD(host_its_list);
+
+bool gicv3_its_host_has_its(void)
+{
+    return !list_empty(&host_its_list);
+}
+
+/* Scan the DT for any ITS nodes and create a list of host ITSes out of it. */
+void gicv3_its_dt_init(const struct dt_device_node *node)
+{
+    const struct dt_device_node *its = NULL;
+    struct host_its *its_data;
+
+    /*
+     * Check for ITS MSI subnodes. If any, add the ITS register
+     * frames to the ITS list.
+     */
+    dt_for_each_child_node(node, its)
+    {
+        uint64_t addr, size;
+
+        if ( !dt_device_is_compatible(its, "arm,gic-v3-its") )
+            continue;
+
+        if ( dt_device_get_address(its, 0, &addr, &size) )
+            panic("GICv3: Cannot find a valid ITS frame address");
+
+        its_data = xzalloc(struct host_its);
+        if ( !its_data )
+            panic("GICv3: Cannot allocate memory for ITS frame");
+
+        its_data->addr = addr;
+        its_data->size = size;
+        its_data->dt_node = its;
+
+        printk("GICv3: Found ITS @0x%lx\n", addr);
+
+        list_add_tail(&its_data->entry, &host_its_list);
+    }
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
index 955591b..1512521 100644
--- a/xen/arch/arm/gic-v3.c
+++ b/xen/arch/arm/gic-v3.c
@@ -43,6 +43,7 @@
 #include <asm/device.h>
 #include <asm/gic.h>
 #include <asm/gic_v3_defs.h>
+#include <asm/gic_v3_its.h>
 #include <asm/cpufeature.h>
 #include <asm/acpi.h>
 
@@ -1228,11 +1229,12 @@ static void __init gicv3_dt_init(void)
      */
     res = dt_device_get_address(node, 1 + gicv3.rdist_count,
                                 &cbase, &csize);
-    if ( res )
-        return;
+    if ( !res )
+        dt_device_get_address(node, 1 + gicv3.rdist_count + 2,
+                              &vbase, &vsize);
 
-    dt_device_get_address(node, 1 + gicv3.rdist_count + 2,
-                          &vbase, &vsize);
+    /* Check for ITS child nodes and build the host ITS list accordingly. */
+    gicv3_its_dt_init(node);
 }
 
 static int gicv3_iomem_deny_access(const struct domain *d)
diff --git a/xen/include/asm-arm/gic_v3_its.h b/xen/include/asm-arm/gic_v3_its.h
new file mode 100644
index 0000000..765a655
--- /dev/null
+++ b/xen/include/asm-arm/gic_v3_its.h
@@ -0,0 +1,67 @@
+/*
+ * ARM GICv3 ITS support
+ *
+ * Andre Przywara <andre.przywara@arm.com>
+ * Copyright (c) 2016,2017 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; under version 2 of the License.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef __ASM_ARM_ITS_H__
+#define __ASM_ARM_ITS_H__
+
+#include <xen/device_tree.h>
+
+/* data structure for each hardware ITS */
+struct host_its {
+    struct list_head entry;
+    const struct dt_device_node *dt_node;
+    paddr_t addr;
+    paddr_t size;
+};
+
+
+#ifdef CONFIG_HAS_ITS
+
+extern struct list_head host_its_list;
+
+/* Parse the host DT and pick up all host ITSes. */
+void gicv3_its_dt_init(const struct dt_device_node *node);
+
+bool gicv3_its_host_has_its(void);
+
+#else
+
+static LIST_HEAD(host_its_list);
+
+static inline void gicv3_its_dt_init(const struct dt_device_node *node)
+{
+}
+
+static inline bool gicv3_its_host_has_its(void)
+{
+    return false;
+}
+
+#endif /* CONFIG_HAS_ITS */
+
+#endif
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
-- 
2.9.0



* [PATCH v3 02/26] ARM: GICv3: allocate LPI pending and property table
  2017-03-31 18:04 [PATCH v3 00/26] arm64: Dom0 ITS emulation Andre Przywara
  2017-03-31 18:05 ` [PATCH v3 01/26] ARM: GICv3 ITS: parse and store ITS subnodes from hardware DT Andre Przywara
@ 2017-03-31 18:05 ` Andre Przywara
  2017-03-31 22:59   ` Stefano Stabellini
  2017-04-03 13:53   ` Julien Grall
  2017-03-31 18:05 ` [PATCH v3 03/26] ARM: GICv3 ITS: allocate device and collection table Andre Przywara
                   ` (24 subsequent siblings)
  26 siblings, 2 replies; 52+ messages in thread
From: Andre Przywara @ 2017-03-31 18:05 UTC (permalink / raw)
  To: Julien Grall, Stefano Stabellini
  Cc: xen-devel, Shanker Donthineni, Vijay Kilari

The ARM GICv3 provides a new kind of interrupt called LPIs.
The pending bits and the configuration data (priority, enable bits) for
those LPIs are stored in tables in normal memory, which software has to
provide to the hardware.
Allocate the required memory, initialize it and hand it over to each
redistributor. The maximum number of LPIs to be used can be adjusted with
the command line option "max_lpi_bits", which defaults to 20 bits,
covering about one million LPIs.
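
For reference, the memory cost of a given "max_lpi_bits" value can be
worked out as in this small sketch (the helper names are made up for
illustration; the 14..32 range is the one documented for the option):

```c
#include <assert.h>
#include <stdint.h>

#define LPI_OFFSET 8192U   /* LPI interrupt IDs start at 8192 */

/* Number of interrupt IDs covered by the given number of ID bits. */
static uint64_t lpi_nr_ids(unsigned int id_bits)
{
    assert(id_bits >= 14 && id_bits <= 32);

    return 1ULL << id_bits;
}

/* Property table as allocated here: one byte per interrupt ID. */
static uint64_t lpi_proptable_bytes(unsigned int id_bits)
{
    return lpi_nr_ids(id_bits);
}

/* Pending table: one bit per interrupt ID, including IDs below 8192. */
static uint64_t lpi_pendtable_bytes(unsigned int id_bits)
{
    return lpi_nr_ids(id_bits) / 8;
}

/* IDs that are actually usable as LPIs. */
static uint64_t lpi_usable(unsigned int id_bits)
{
    return lpi_nr_ids(id_bits) - LPI_OFFSET;
}
```

With the default of 20 bits this gives 1048576 interrupt IDs, a 1 MB
property table (shared by all redistributors), a 128 KB pending table
per redistributor and 1040384 usable LPIs.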

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
 docs/misc/xen-command-line.markdown |   9 ++
 xen/arch/arm/Makefile               |   1 +
 xen/arch/arm/gic-v3-lpi.c           | 209 ++++++++++++++++++++++++++++++++++++
 xen/arch/arm/gic-v3.c               |  17 +++
 xen/include/asm-arm/bitops.h        |   1 +
 xen/include/asm-arm/config.h        |   2 +
 xen/include/asm-arm/gic_v3_defs.h   |  54 +++++++++-
 xen/include/asm-arm/gic_v3_its.h    |  14 +++
 xen/include/asm-arm/irq.h           |   8 ++
 xen/include/xen/bitops.h            |   5 +-
 10 files changed, 318 insertions(+), 2 deletions(-)
 create mode 100644 xen/arch/arm/gic-v3-lpi.c

diff --git a/docs/misc/xen-command-line.markdown b/docs/misc/xen-command-line.markdown
index a11fdf9..619016d 100644
--- a/docs/misc/xen-command-line.markdown
+++ b/docs/misc/xen-command-line.markdown
@@ -1158,6 +1158,15 @@ based interrupts. Any higher IRQs will be available for use via PCI MSI.
 ### maxcpus
 > `= <integer>`
 
+### max\_lpi\_bits
+> `= <integer>`
+
+Specifies the number of ARM GICv3 LPI interrupts to allocate on the host,
+presented as the number of bits needed to encode it. This must be at least
+14 and not exceed 32, and each LPI requires one byte (configuration) and
+one pending bit to be allocated.
+Defaults to 20 bits (to cover at most 1048576 interrupts).
+
 ### mce
 > `= <integer>`
 
diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
index 54860e0..02a8737 100644
--- a/xen/arch/arm/Makefile
+++ b/xen/arch/arm/Makefile
@@ -19,6 +19,7 @@ obj-y += gic.o
 obj-y += gic-v2.o
 obj-$(CONFIG_HAS_GICV3) += gic-v3.o
 obj-$(CONFIG_HAS_ITS) += gic-v3-its.o
+obj-$(CONFIG_HAS_ITS) += gic-v3-lpi.o
 obj-y += guestcopy.o
 obj-y += hvm.o
 obj-y += io.o
diff --git a/xen/arch/arm/gic-v3-lpi.c b/xen/arch/arm/gic-v3-lpi.c
new file mode 100644
index 0000000..77f6009
--- /dev/null
+++ b/xen/arch/arm/gic-v3-lpi.c
@@ -0,0 +1,209 @@
+/*
+ * xen/arch/arm/gic-v3-lpi.c
+ *
+ * ARM GICv3 Locality-specific Peripheral Interrupts (LPI) support
+ *
+ * Copyright (C) 2016,2017 - ARM Ltd
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; under version 2 of the License.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <xen/lib.h>
+#include <xen/mm.h>
+#include <xen/sizes.h>
+#include <asm/gic.h>
+#include <asm/gic_v3_defs.h>
+#include <asm/gic_v3_its.h>
+#include <asm/io.h>
+#include <asm/page.h>
+
+#define LPI_PROPTABLE_NEEDS_FLUSHING    (1U << 0)
+/* Global state */
+static struct {
+    /* The global LPI property table, shared by all redistributors. */
+    uint8_t *lpi_property;
+    /*
+     * Number of physical LPIs the host supports. This is a property of
+     * the GIC hardware. We depart from the habit of naming these things
+     * "physical" in Xen, as the GICv3/4 spec uses the term "physical LPI"
+     * in a different context to differentiate them from "virtual LPIs".
+     */
+    unsigned long int nr_host_lpis;
+    unsigned int flags;
+} lpi_data;
+
+struct lpi_redist_data {
+    void                *pending_table;
+};
+
+static DEFINE_PER_CPU(struct lpi_redist_data, lpi_redist);
+
+#define MAX_PHYS_LPIS   (lpi_data.nr_host_lpis - LPI_OFFSET)
+
+static int gicv3_lpi_allocate_pendtable(uint64_t *reg)
+{
+    uint64_t val;
+    void *pendtable;
+
+    if ( this_cpu(lpi_redist).pending_table )
+        return -EBUSY;
+
+    val  = GIC_BASER_CACHE_RaWaWb << GICR_PENDBASER_INNER_CACHEABILITY_SHIFT;
+    val |= GIC_BASER_CACHE_SameAsInner << GICR_PENDBASER_OUTER_CACHEABILITY_SHIFT;
+    val |= GIC_BASER_InnerShareable << GICR_PENDBASER_SHAREABILITY_SHIFT;
+
+    /*
+     * The pending table holds one bit per LPI and even covers bits for
+     * interrupt IDs below 8192, so we allocate the full range.
+     * The GICv3 imposes a 64KB alignment requirement, also requires
+     * physically contiguous memory.
+     */
+    pendtable = _xzalloc(lpi_data.nr_host_lpis / 8, SZ_64K);
+    if ( !pendtable )
+        return -ENOMEM;
+
+    /* Make sure the physical address can be encoded in the register. */
+    if ( (virt_to_maddr(pendtable) & ~GENMASK_ULL(51, 16)) )
+    {
+        xfree(pendtable);
+        return -ERANGE;
+    }
+    clean_and_invalidate_dcache_va_range(pendtable,
+                                         lpi_data.nr_host_lpis / 8);
+
+    this_cpu(lpi_redist).pending_table = pendtable;
+
+    val |= GICR_PENDBASER_PTZ;
+
+    val |= virt_to_maddr(pendtable);
+
+    *reg = val;
+
+    return 0;
+}
+
+/*
+ * Tell a redistributor about the (shared) property table, allocating one
+ * if not already done.
+ */
+static int gicv3_lpi_set_proptable(void __iomem * rdist_base)
+{
+    uint64_t reg;
+
+    reg  = GIC_BASER_CACHE_RaWaWb << GICR_PROPBASER_INNER_CACHEABILITY_SHIFT;
+    reg |= GIC_BASER_CACHE_SameAsInner << GICR_PROPBASER_OUTER_CACHEABILITY_SHIFT;
+    reg |= GIC_BASER_InnerShareable << GICR_PROPBASER_SHAREABILITY_SHIFT;
+
+    /*
+     * The property table is shared across all redistributors, so allocate
+     * this only once, but return the same value on subsequent calls.
+     */
+    if ( !lpi_data.lpi_property )
+    {
+        /* The property table holds one byte per LPI. */
+        void *table = _xmalloc(lpi_data.nr_host_lpis, SZ_4K);
+
+        if ( !table )
+            return -ENOMEM;
+
+        /* Make sure the physical address can be encoded in the register. */
+        if ( (virt_to_maddr(table) & ~GENMASK_ULL(51, 12)) )
+        {
+            xfree(table);
+            return -ERANGE;
+        }
+        memset(table, GIC_PRI_IRQ | LPI_PROP_RES1, MAX_PHYS_LPIS);
+        clean_and_invalidate_dcache_va_range(table, MAX_PHYS_LPIS);
+        lpi_data.lpi_property = table;
+    }
+
+    /* Encode the number of bits needed, minus one */
+    reg |= (fls(lpi_data.nr_host_lpis - 1) - 1);
+
+    reg |= virt_to_maddr(lpi_data.lpi_property);
+
+    writeq_relaxed(reg, rdist_base + GICR_PROPBASER);
+    reg = readq_relaxed(rdist_base + GICR_PROPBASER);
+
+    /* If we can't do shareable, we have to drop cacheability as well. */
+    if ( !(reg & GICR_PROPBASER_SHAREABILITY_MASK) )
+    {
+        reg &= ~GICR_PROPBASER_INNER_CACHEABILITY_MASK;
+        reg |= GIC_BASER_CACHE_nC << GICR_PROPBASER_INNER_CACHEABILITY_SHIFT;
+    }
+
+    /* Remember that we have to flush the property table if non-cacheable. */
+    if ( (reg & GICR_PROPBASER_INNER_CACHEABILITY_MASK) <= GIC_BASER_CACHE_nC )
+    {
+        lpi_data.flags |= LPI_PROPTABLE_NEEDS_FLUSHING;
+        /* Update the redistributor's knowledge about the attributes. */
+        writeq_relaxed(reg, rdist_base + GICR_PROPBASER);
+    }
+
+    return 0;
+}
+
+int gicv3_lpi_init_rdist(void __iomem * rdist_base)
+{
+    uint32_t reg;
+    uint64_t table_reg;
+    int ret;
+
+    /* We don't support LPIs without an ITS. */
+    if ( !gicv3_its_host_has_its() )
+        return -ENODEV;
+
+    /* Make sure LPIs are disabled before setting up the tables. */
+    reg = readl_relaxed(rdist_base + GICR_CTLR);
+    if ( reg & GICR_CTLR_ENABLE_LPIS )
+        return -EBUSY;
+
+    ret = gicv3_lpi_allocate_pendtable(&table_reg);
+    if ( ret )
+        return ret;
+    writeq_relaxed(table_reg, rdist_base + GICR_PENDBASER);
+    table_reg = readq_relaxed(rdist_base + GICR_PENDBASER);
+
+    /* If the hardware reports non-shareable, drop cacheability as well. */
+    if ( !(table_reg & GICR_PENDBASER_SHAREABILITY_MASK) )
+    {
+        table_reg &= ~GICR_PENDBASER_SHAREABILITY_MASK;
+        table_reg &= ~GICR_PENDBASER_INNER_CACHEABILITY_MASK;
+        table_reg |= GIC_BASER_CACHE_nC << GICR_PENDBASER_INNER_CACHEABILITY_SHIFT;
+
+        writeq_relaxed(table_reg, rdist_base + GICR_PENDBASER);
+    }
+
+    return gicv3_lpi_set_proptable(rdist_base);
+}
+
+static unsigned int max_lpi_bits = 20;
+integer_param("max_lpi_bits", max_lpi_bits);
+
+int gicv3_lpi_init_host_lpis(unsigned int hw_lpi_bits)
+{
+    lpi_data.nr_host_lpis = BIT_ULL(min(hw_lpi_bits, max_lpi_bits));
+
+    printk("GICv3: using at most %lu LPIs on the host.\n", MAX_PHYS_LPIS);
+
+    return 0;
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
index 1512521..36cd269 100644
--- a/xen/arch/arm/gic-v3.c
+++ b/xen/arch/arm/gic-v3.c
@@ -548,6 +548,9 @@ static void __init gicv3_dist_init(void)
     type = readl_relaxed(GICD + GICD_TYPER);
     nr_lines = 32 * ((type & GICD_TYPE_LINES) + 1);
 
+    if ( type & GICD_TYPE_LPIS )
+        gicv3_lpi_init_host_lpis(GICD_TYPE_ID_BITS(type));
+
     printk("GICv3: %d lines, (IID %8.8x).\n",
            nr_lines, readl_relaxed(GICD + GICD_IIDR));
 
@@ -660,6 +663,20 @@ static int __init gicv3_populate_rdist(void)
             if ( (typer >> 32) == aff )
             {
                 this_cpu(rbase) = ptr;
+
+                if ( typer & GICR_TYPER_PLPIS )
+                {
+                    int ret;
+
+                    ret = gicv3_lpi_init_rdist(ptr);
+                    if ( ret && ret != -ENODEV )
+                    {
+                        printk("GICv3: CPU%d: Cannot initialize LPIs: %d\n",
+                               smp_processor_id(), ret);
+                        break;
+                    }
+                }
+
                 printk("GICv3: CPU%d: Found redistributor in region %d @%p\n",
                         smp_processor_id(), i, ptr);
                 return 0;
diff --git a/xen/include/asm-arm/bitops.h b/xen/include/asm-arm/bitops.h
index bda8898..1cbfb9e 100644
--- a/xen/include/asm-arm/bitops.h
+++ b/xen/include/asm-arm/bitops.h
@@ -24,6 +24,7 @@
 #define BIT(nr)                 (1UL << (nr))
 #define BIT_MASK(nr)            (1UL << ((nr) % BITS_PER_WORD))
 #define BIT_WORD(nr)            ((nr) / BITS_PER_WORD)
+#define BIT_ULL(nr)             (1ULL << (nr))
 #define BITS_PER_BYTE           8
 
 #define ADDR (*(volatile int *) addr)
diff --git a/xen/include/asm-arm/config.h b/xen/include/asm-arm/config.h
index ba61f65..f064e8a 100644
--- a/xen/include/asm-arm/config.h
+++ b/xen/include/asm-arm/config.h
@@ -19,6 +19,8 @@
 #define BITS_PER_LONG (BYTES_PER_LONG << 3)
 #define POINTER_ALIGN BYTES_PER_LONG
 
+#define BITS_PER_LONG_LONG (sizeof (long long) * BITS_PER_BYTE)
+
 /* xen_ulong_t is always 64 bits */
 #define BITS_PER_XEN_ULONG 64
 
diff --git a/xen/include/asm-arm/gic_v3_defs.h b/xen/include/asm-arm/gic_v3_defs.h
index 6bd25a5..7cdebc5 100644
--- a/xen/include/asm-arm/gic_v3_defs.h
+++ b/xen/include/asm-arm/gic_v3_defs.h
@@ -44,7 +44,10 @@
 #define GICC_SRE_EL2_ENEL1           (1UL << 3)
 
 /* Additional bits in GICD_TYPER defined by GICv3 */
-#define GICD_TYPE_ID_BITS_SHIFT 19
+#define GICD_TYPE_ID_BITS_SHIFT      19
+#define GICD_TYPE_ID_BITS(r)     ((((r) >> GICD_TYPE_ID_BITS_SHIFT) & 0x1f) + 1)
+
+#define GICD_TYPE_LPIS               (1U << 17)
 
 #define GICD_CTLR_RWP                (1UL << 31)
 #define GICD_CTLR_ARE_NS             (1U << 4)
@@ -95,12 +98,61 @@
 #define GICR_IGRPMODR0               (0x0D00)
 #define GICR_NSACR                   (0x0E00)
 
+#define GICR_CTLR_ENABLE_LPIS        (1U << 0)
+
 #define GICR_TYPER_PLPIS             (1U << 0)
 #define GICR_TYPER_VLPIS             (1U << 1)
 #define GICR_TYPER_LAST              (1U << 4)
 
+/* For specifying the inner cacheability type only */
+#define GIC_BASER_CACHE_nCnB         0ULL
+/* For specifying the outer cacheability type only */
+#define GIC_BASER_CACHE_SameAsInner  0ULL
+#define GIC_BASER_CACHE_nC           1ULL
+#define GIC_BASER_CACHE_RaWt         2ULL
+#define GIC_BASER_CACHE_RaWb         3ULL
+#define GIC_BASER_CACHE_WaWt         4ULL
+#define GIC_BASER_CACHE_WaWb         5ULL
+#define GIC_BASER_CACHE_RaWaWt       6ULL
+#define GIC_BASER_CACHE_RaWaWb       7ULL
+#define GIC_BASER_CACHE_MASK         7ULL
+
+#define GIC_BASER_NonShareable       0ULL
+#define GIC_BASER_InnerShareable     1ULL
+#define GIC_BASER_OuterShareable     2ULL
+
+#define GICR_PROPBASER_OUTER_CACHEABILITY_SHIFT         56
+#define GICR_PROPBASER_OUTER_CACHEABILITY_MASK               \
+        (7UL << GICR_PROPBASER_OUTER_CACHEABILITY_SHIFT)
+#define GICR_PROPBASER_SHAREABILITY_SHIFT               10
+#define GICR_PROPBASER_SHAREABILITY_MASK                     \
+        (3UL << GICR_PROPBASER_SHAREABILITY_SHIFT)
+#define GICR_PROPBASER_INNER_CACHEABILITY_SHIFT         7
+#define GICR_PROPBASER_INNER_CACHEABILITY_MASK               \
+        (7UL << GICR_PROPBASER_INNER_CACHEABILITY_SHIFT)
+#define GICR_PROPBASER_RES0_MASK                             \
+        (GENMASK_ULL(63, 59) | GENMASK_ULL(55, 52) | GENMASK_ULL(6, 5))
+
+#define GICR_PENDBASER_SHAREABILITY_SHIFT               10
+#define GICR_PENDBASER_INNER_CACHEABILITY_SHIFT         7
+#define GICR_PENDBASER_OUTER_CACHEABILITY_SHIFT         56
+#define GICR_PENDBASER_SHAREABILITY_MASK                     \
+        (3UL << GICR_PENDBASER_SHAREABILITY_SHIFT)
+#define GICR_PENDBASER_INNER_CACHEABILITY_MASK               \
+        (7UL << GICR_PENDBASER_INNER_CACHEABILITY_SHIFT)
+#define GICR_PENDBASER_OUTER_CACHEABILITY_MASK               \
+        (7UL << GICR_PENDBASER_OUTER_CACHEABILITY_SHIFT)
+#define GICR_PENDBASER_PTZ                              BIT(62)
+#define GICR_PENDBASER_RES0_MASK                             \
+        (BIT(63) | GENMASK_ULL(61, 59) | GENMASK_ULL(55, 52) |       \
+         GENMASK_ULL(15, 12) | GENMASK_ULL(6, 0))
+
 #define DEFAULT_PMR_VALUE            0xff
 
+#define LPI_PROP_PRIO_MASK           0xfc
+#define LPI_PROP_RES1                (1 << 1)
+#define LPI_PROP_ENABLED             (1 << 0)
+
 #define GICH_VMCR_EOI                (1 << 9)
 #define GICH_VMCR_VENG1              (1 << 1)
 
diff --git a/xen/include/asm-arm/gic_v3_its.h b/xen/include/asm-arm/gic_v3_its.h
index 765a655..219d109 100644
--- a/xen/include/asm-arm/gic_v3_its.h
+++ b/xen/include/asm-arm/gic_v3_its.h
@@ -40,6 +40,11 @@ void gicv3_its_dt_init(const struct dt_device_node *node);
 
 bool gicv3_its_host_has_its(void);
 
+int gicv3_lpi_init_rdist(void __iomem * rdist_base);
+
+/* Initialize the host structures for LPIs. */
+int gicv3_lpi_init_host_lpis(unsigned int nr_lpis);
+
 #else
 
 static LIST_HEAD(host_its_list);
@@ -53,6 +58,15 @@ static inline bool gicv3_its_host_has_its(void)
     return false;
 }
 
+static inline int gicv3_lpi_init_rdist(void __iomem * rdist_base)
+{
+    return -ENODEV;
+}
+
+static inline int gicv3_lpi_init_host_lpis(unsigned int nr_lpis)
+{
+    return 0;
+}
 #endif /* CONFIG_HAS_ITS */
 
 #endif
diff --git a/xen/include/asm-arm/irq.h b/xen/include/asm-arm/irq.h
index 8f7a167..13528c0 100644
--- a/xen/include/asm-arm/irq.h
+++ b/xen/include/asm-arm/irq.h
@@ -19,8 +19,16 @@ struct arch_irq_desc {
 };
 
 #define NR_LOCAL_IRQS	32
+
+/*
+ * This only covers the interrupts that Xen cares about, so SGIs, PPIs and
+ * SPIs. LPIs are too numerous, also only propagated to guests, so they are
+ * not included in this number.
+ */
 #define NR_IRQS		1024
 
+#define LPI_OFFSET      8192
+
 #define nr_irqs NR_IRQS
 #define nr_static_irqs NR_IRQS
 #define arch_hwdom_irqs(domid) NR_IRQS
diff --git a/xen/include/xen/bitops.h b/xen/include/xen/bitops.h
index bd0883a..9261e06 100644
--- a/xen/include/xen/bitops.h
+++ b/xen/include/xen/bitops.h
@@ -5,11 +5,14 @@
 /*
  * Create a contiguous bitmask starting at bit position @l and ending at
  * position @h. For example
- * GENMASK(30, 21) gives us the 32bit vector 0x01fe00000.
+ * GENMASK(30, 21) gives us the 32bit vector 0x7fe00000.
  */
 #define GENMASK(h, l) \
     (((~0UL) << (l)) & (~0UL >> (BITS_PER_LONG - 1 - (h))))
 
+#define GENMASK_ULL(h, l) \
+    (((~0ULL) << (l)) & (~0ULL >> (BITS_PER_LONG_LONG - 1 - (h))))
+
 /*
  * ffs: find first bit set. This is defined the same way as
  * the libc and compiler builtin ffs routines, therefore
-- 
2.9.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


* [PATCH v3 03/26] ARM: GICv3 ITS: allocate device and collection table
  2017-03-31 18:04 [PATCH v3 00/26] arm64: Dom0 ITS emulation Andre Przywara
  2017-03-31 18:05 ` [PATCH v3 01/26] ARM: GICv3 ITS: parse and store ITS subnodes from hardware DT Andre Przywara
  2017-03-31 18:05 ` [PATCH v3 02/26] ARM: GICv3: allocate LPI pending and property table Andre Przywara
@ 2017-03-31 18:05 ` Andre Przywara
  2017-03-31 23:06   ` Stefano Stabellini
  2017-04-03 15:38   ` Julien Grall
  2017-03-31 18:05 ` [PATCH v3 04/26] ARM: GICv3 ITS: map ITS command buffer Andre Przywara
                   ` (23 subsequent siblings)
  26 siblings, 2 replies; 52+ messages in thread
From: Andre Przywara @ 2017-03-31 18:05 UTC (permalink / raw)
  To: Julien Grall, Stefano Stabellini
  Cc: xen-devel, Shanker Donthineni, Vijay Kilari

Each ITS maps a pair of a DeviceID (for instance derived from a PCI
b/d/f triplet) and an EventID (the MSI payload or interrupt ID) to a
pair of LPI number and collection ID, which points to the target CPU.
This mapping is stored in the device and collection tables, which software
has to provide for the ITS to use.
Allocate the required memory and hand it to the ITS.
The number of device table entries follows the DeviceID width reported by
the ITS, which can be limited with the max_its_device_bits command line
parameter.
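
For illustration only (this is not part of the patch): a minimal standalone
sketch of the table sizing that its_map_baser() performs, assuming the
4K/16K/64K page size encodings and the 256-page limit used in the code
below. The helper names are made up for this example, and the sketch uses
64-bit arithmetic throughout, whereas the patch keeps the size in an
unsigned int.

```c
#include <assert.h>
#include <stdint.h>

/* Page size encodings 0, 1 and 2 select 4K, 16K and 64K pages. */
static unsigned int baser_page_bits(unsigned int pagesz)
{
    assert(pagesz <= 2);
    return pagesz * 2 + 12;
}

/* Bytes to allocate for nr_items entries: whole pages, at most 256 of them. */
static uint64_t baser_table_size(uint64_t nr_items, unsigned int entry_size,
                                 unsigned int pagesz)
{
    uint64_t page = UINT64_C(1) << baser_page_bits(pagesz);
    uint64_t size = nr_items * entry_size;
    uint64_t max = 256 * page;

    size = (size + page - 1) & ~(page - 1);    /* round up to whole pages */
    return size < max ? size : max;
}

/* Size field written to the register: number of pages minus one. */
static unsigned int baser_size_field(uint64_t table_size, unsigned int pagesz)
{
    return (unsigned int)(table_size >> baser_page_bits(pagesz)) - 1;
}
```

With 16 DeviceID bits, 8-byte entries and 64K pages this gives a 512K
table, i.e. a size field of 7. Larger DeviceID widths run into the
256-page cap.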

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
 docs/misc/xen-command-line.markdown |   9 ++
 xen/arch/arm/gic-v3-its.c           | 168 ++++++++++++++++++++++++++++++++++++
 xen/arch/arm/gic-v3.c               |   3 +
 xen/include/asm-arm/gic_v3_its.h    |  64 +++++++++++++-
 4 files changed, 243 insertions(+), 1 deletion(-)

diff --git a/docs/misc/xen-command-line.markdown b/docs/misc/xen-command-line.markdown
index 619016d..c67c925 100644
--- a/docs/misc/xen-command-line.markdown
+++ b/docs/misc/xen-command-line.markdown
@@ -1158,6 +1158,15 @@ based interrupts. Any higher IRQs will be available for use via PCI MSI.
 ### maxcpus
 > `= <integer>`
 
+### max\_its\_device\_bits
+> `= <integer>`
+
+Specifies the maximum number of devices using MSIs on the ARM GICv3 ITS
+controller to allocate table entries for. Each table entry uses a hardware
+specific size, typically 8 or 16 bytes. This value is given as the number
+of bits required to hold one device ID.
+Defaults to the machine provided value, which is at most 32 bits.
+
 ### max\_lpi\_bits
 > `= <integer>`
 
diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
index 4056e5b..bfdb7ac 100644
--- a/xen/arch/arm/gic-v3-its.c
+++ b/xen/arch/arm/gic-v3-its.c
@@ -19,8 +19,10 @@
  */
 
 #include <xen/lib.h>
+#include <xen/mm.h>
 #include <asm/gic_v3_defs.h>
 #include <asm/gic_v3_its.h>
+#include <asm/io.h>
 
 LIST_HEAD(host_its_list);
 
@@ -29,6 +31,172 @@ bool gicv3_its_host_has_its(void)
     return !list_empty(&host_its_list);
 }
 
+#define BASER_ATTR_MASK                                           \
+        ((0x3UL << GITS_BASER_SHAREABILITY_SHIFT)               | \
+         (0x7UL << GITS_BASER_OUTER_CACHEABILITY_SHIFT)         | \
+         (0x7UL << GITS_BASER_INNER_CACHEABILITY_SHIFT))
+#define BASER_RO_MASK   (GENMASK_ULL(58, 56) | GENMASK_ULL(52, 48))
+
+/* Check that the physical address can be encoded in the PROPBASER register. */
+static bool check_baser_phys_addr(void *vaddr, unsigned int page_bits)
+{
+    paddr_t paddr = virt_to_maddr(vaddr);
+
+    return (!(paddr & ~GENMASK_ULL(page_bits < 16 ? 47 : 51, page_bits)));
+}
+
+static uint64_t encode_propbaser_phys_addr(paddr_t addr, unsigned int page_bits)
+{
+    uint64_t ret = addr & GENMASK_ULL(47, page_bits);
+
+    if ( page_bits < 16 )
+        return ret;
+
+    /* For 64K pages address bits 51-48 are encoded in bits 15-12. */
+    return ret | ((addr & GENMASK_ULL(51, 48)) >> (48 - 12));
+}
+
+/* The ITS BASE registers work with page sizes of 4K, 16K or 64K. */
+#define BASER_PAGE_BITS(sz) ((sz) * 2 + 12)
+
+static int its_map_baser(void __iomem *basereg, uint64_t regc,
+                         unsigned int nr_items)
+{
+    uint64_t attr, reg;
+    unsigned int entry_size = GITS_BASER_ENTRY_SIZE(regc);
+    unsigned int pagesz = 2;    /* try 64K pages first, then go down. */
+    unsigned int table_size;
+    void *buffer;
+
+    attr  = GIC_BASER_InnerShareable << GITS_BASER_SHAREABILITY_SHIFT;
+    attr |= GIC_BASER_CACHE_SameAsInner << GITS_BASER_OUTER_CACHEABILITY_SHIFT;
+    attr |= GIC_BASER_CACHE_RaWaWb << GITS_BASER_INNER_CACHEABILITY_SHIFT;
+
+    /*
+     * Setup the BASE register with the attributes that we like. Then read
+     * it back and see what sticks (page size, cacheability and shareability
+     * attributes), retrying if necessary.
+     */
+retry:
+    table_size = ROUNDUP(nr_items * entry_size, BIT(BASER_PAGE_BITS(pagesz)));
+    /* The BASE registers support at most 256 pages. */
+    table_size = min(table_size, 256U << BASER_PAGE_BITS(pagesz));
+
+    buffer = _xzalloc(table_size, BIT(BASER_PAGE_BITS(pagesz)));
+    if ( !buffer )
+        return -ENOMEM;
+
+    if ( !check_baser_phys_addr(buffer, BASER_PAGE_BITS(pagesz)) )
+    {
+        xfree(buffer);
+        return -ERANGE;
+    }
+
+    reg  = attr;
+    reg |= (pagesz << GITS_BASER_PAGE_SIZE_SHIFT);
+    reg |= (table_size >> BASER_PAGE_BITS(pagesz)) - 1;
+    reg |= regc & BASER_RO_MASK;
+    reg |= GITS_VALID_BIT;
+    reg |= encode_propbaser_phys_addr(virt_to_maddr(buffer),
+                                      BASER_PAGE_BITS(pagesz));
+
+    writeq_relaxed(reg, basereg);
+    regc = readq_relaxed(basereg);
+
+    /* The host didn't like our attributes, just use what it returned. */
+    if ( (regc & BASER_ATTR_MASK) != attr )
+    {
+        /* If we can't map it shareable, drop cacheability as well. */
+        if ( (regc & GITS_BASER_SHAREABILITY_MASK) == GIC_BASER_NonShareable )
+        {
+            regc &= ~GITS_BASER_INNER_CACHEABILITY_MASK;
+            writeq_relaxed(regc, basereg);
+        }
+        attr = regc & BASER_ATTR_MASK;
+    }
+    if ( (regc & GITS_BASER_INNER_CACHEABILITY_MASK) <= GIC_BASER_CACHE_nC )
+        clean_and_invalidate_dcache_va_range(buffer, table_size);
+
+    /* If the host accepted our page size, we are done. */
+    if ( ((regc >> GITS_BASER_PAGE_SIZE_SHIFT) & 0x3UL) == pagesz )
+        return 0;
+
+    xfree(buffer);
+
+    if ( pagesz-- > 0 )
+        goto retry;
+
+    /* None of the page sizes was accepted, give up */
+    return -EINVAL;
+}
+
+/* Allow a user to limit the number of devices. */
+static unsigned int max_its_device_bits = 32;
+integer_param("max_its_device_bits", max_its_device_bits);
+
+static int gicv3_its_init_single_its(struct host_its *hw_its)
+{
+    uint64_t reg;
+    int i, ret;
+
+    hw_its->its_base = ioremap_nocache(hw_its->addr, hw_its->size);
+    if ( !hw_its->its_base )
+        return -ENOMEM;
+
+    reg = readq_relaxed(hw_its->its_base + GITS_TYPER);
+    hw_its->devid_bits = GITS_TYPER_DEVICE_ID_BITS(reg);
+    hw_its->devid_bits = min(hw_its->devid_bits, max_its_device_bits);
+
+    for ( i = 0; i < GITS_BASER_NR_REGS; i++ )
+    {
+        void __iomem *basereg = hw_its->its_base + GITS_BASER0 + i * 8;
+        unsigned int type;
+
+        reg = readq_relaxed(basereg);
+        type = (reg & GITS_BASER_TYPE_MASK) >> GITS_BASER_TYPE_SHIFT;
+        switch ( type )
+        {
+        case GITS_BASER_TYPE_NONE:
+            continue;
+        case GITS_BASER_TYPE_DEVICE:
+            ret = its_map_baser(basereg, reg, BIT(hw_its->devid_bits));
+            if ( ret )
+                return ret;
+            break;
+        case GITS_BASER_TYPE_COLLECTION:
+            ret = its_map_baser(basereg, reg, num_possible_cpus());
+            if ( ret )
+                return ret;
+            break;
+        /* In case this is a GICv4, provide a (dummy) vPE table as well. */
+        case GITS_BASER_TYPE_VCPU:
+            ret = its_map_baser(basereg, reg, 1);
+            if ( ret )
+                return ret;
+            break;
+        default:
+            continue;
+        }
+    }
+
+    return 0;
+}
+
+int gicv3_its_init(void)
+{
+    struct host_its *hw_its;
+    int ret;
+
+    list_for_each_entry(hw_its, &host_its_list, entry)
+    {
+        ret = gicv3_its_init_single_its(hw_its);
+        if ( ret )
+            return ret;
+    }
+
+    return 0;
+}
+
 /* Scan the DT for any ITS nodes and create a list of host ITSes out of it. */
 void gicv3_its_dt_init(const struct dt_device_node *node)
 {
diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
index 36cd269..b84bc40 100644
--- a/xen/arch/arm/gic-v3.c
+++ b/xen/arch/arm/gic-v3.c
@@ -1590,6 +1590,9 @@ static int __init gicv3_init(void)
     spin_lock(&gicv3.lock);
 
     gicv3_dist_init();
+    res = gicv3_its_init();
+    if ( res )
+        panic("GICv3: ITS: initialization failed: %d\n", res);
     res = gicv3_cpu_init();
     gicv3_hyp_init();
 
diff --git a/xen/include/asm-arm/gic_v3_its.h b/xen/include/asm-arm/gic_v3_its.h
index 219d109..badb644 100644
--- a/xen/include/asm-arm/gic_v3_its.h
+++ b/xen/include/asm-arm/gic_v3_its.h
@@ -20,6 +20,60 @@
 #ifndef __ASM_ARM_ITS_H__
 #define __ASM_ARM_ITS_H__
 
+#define GITS_CTLR                       0x000
+#define GITS_IIDR                       0x004
+#define GITS_TYPER                      0x008
+#define GITS_CBASER                     0x080
+#define GITS_CWRITER                    0x088
+#define GITS_CREADR                     0x090
+#define GITS_BASER_NR_REGS              8
+#define GITS_BASER0                     0x100
+#define GITS_BASER1                     0x108
+#define GITS_BASER2                     0x110
+#define GITS_BASER3                     0x118
+#define GITS_BASER4                     0x120
+#define GITS_BASER5                     0x128
+#define GITS_BASER6                     0x130
+#define GITS_BASER7                     0x138
+
+/* Register bits */
+#define GITS_VALID_BIT                  BIT_ULL(63)
+
+#define GITS_CTLR_QUIESCENT             BIT(31)
+#define GITS_CTLR_ENABLE                BIT(0)
+
+#define GITS_TYPER_DEVIDS_SHIFT         13
+#define GITS_TYPER_DEVIDS_MASK          (0x1fUL << GITS_TYPER_DEVIDS_SHIFT)
+#define GITS_TYPER_DEVICE_ID_BITS(r)    (((r & GITS_TYPER_DEVIDS_MASK) >> \
+                                               GITS_TYPER_DEVIDS_SHIFT) + 1)
+
+#define GITS_IIDR_VALUE                 0x34c
+
+#define GITS_BASER_INDIRECT             BIT_ULL(62)
+#define GITS_BASER_INNER_CACHEABILITY_SHIFT        59
+#define GITS_BASER_TYPE_SHIFT           56
+#define GITS_BASER_TYPE_MASK            (7ULL << GITS_BASER_TYPE_SHIFT)
+#define GITS_BASER_OUTER_CACHEABILITY_SHIFT        53
+#define GITS_BASER_TYPE_NONE            0UL
+#define GITS_BASER_TYPE_DEVICE          1UL
+#define GITS_BASER_TYPE_VCPU            2UL
+#define GITS_BASER_TYPE_CPU             3UL
+#define GITS_BASER_TYPE_COLLECTION      4UL
+#define GITS_BASER_TYPE_RESERVED5       5UL
+#define GITS_BASER_TYPE_RESERVED6       6UL
+#define GITS_BASER_TYPE_RESERVED7       7UL
+#define GITS_BASER_ENTRY_SIZE_SHIFT     48
+#define GITS_BASER_ENTRY_SIZE(reg)                                       \
+                        (((reg >> GITS_BASER_ENTRY_SIZE_SHIFT) & 0x1f) + 1)
+#define GITS_BASER_SHAREABILITY_SHIFT   10
+#define GITS_BASER_PAGE_SIZE_SHIFT      8
+#define GITS_BASER_RO_MASK              (GITS_BASER_TYPE_MASK | \
+                                        (31UL << GITS_BASER_ENTRY_SIZE_SHIFT) |\
+                                        GITS_BASER_INDIRECT)
+#define GITS_BASER_SHAREABILITY_MASK   (0x3ULL << GITS_BASER_SHAREABILITY_SHIFT)
+#define GITS_BASER_OUTER_CACHEABILITY_MASK   (0x7ULL << GITS_BASER_OUTER_CACHEABILITY_SHIFT)
+#define GITS_BASER_INNER_CACHEABILITY_MASK   (0x7ULL << GITS_BASER_INNER_CACHEABILITY_SHIFT)
+
 #include <xen/device_tree.h>
 
 /* data structure for each hardware ITS */
@@ -28,6 +82,8 @@ struct host_its {
     const struct dt_device_node *dt_node;
     paddr_t addr;
     paddr_t size;
+    void __iomem *its_base;
+    unsigned int devid_bits;
 };
 
 
@@ -42,8 +98,9 @@ bool gicv3_its_host_has_its(void);
 
 int gicv3_lpi_init_rdist(void __iomem * rdist_base);
 
-/* Initialize the host structures for LPIs. */
+/* Initialize the host structures for LPIs and the host ITSes. */
 int gicv3_lpi_init_host_lpis(unsigned int nr_lpis);
+int gicv3_its_init(void);
 
 #else
 
@@ -67,6 +124,11 @@ static inline int gicv3_lpi_init_host_lpis(unsigned int nr_lpis)
 {
     return 0;
 }
+
+static inline int gicv3_its_init(void)
+{
+    return 0;
+}
 #endif /* CONFIG_HAS_ITS */
 
 #endif
-- 
2.9.0



* [PATCH v3 04/26] ARM: GICv3 ITS: map ITS command buffer
  2017-03-31 18:04 [PATCH v3 00/26] arm64: Dom0 ITS emulation Andre Przywara
                   ` (2 preceding siblings ...)
  2017-03-31 18:05 ` [PATCH v3 03/26] ARM: GICv3 ITS: allocate device and collection table Andre Przywara
@ 2017-03-31 18:05 ` Andre Przywara
  2017-03-31 23:10   ` Stefano Stabellini
  2017-04-03 16:00   ` Julien Grall
  2017-03-31 18:05 ` [PATCH v3 05/26] ARM: GICv3 ITS: introduce ITS command handling Andre Przywara
                   ` (22 subsequent siblings)
  26 siblings, 2 replies; 52+ messages in thread
From: Andre Przywara @ 2017-03-31 18:05 UTC (permalink / raw)
  To: Julien Grall, Stefano Stabellini
  Cc: xen-devel, Shanker Donthineni, Vijay Kilari

Instead of directly manipulating the tables in memory, an ITS driver
sends commands via a ring buffer in normal system memory to the ITS h/w
to create or alter the LPI mappings.
Allocate memory for that buffer and tell the ITS about it to be able
to send ITS commands.
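
For illustration only (this is not part of the patch): a minimal standalone
sketch of how the GITS_CBASER value is put together, assuming the layout
the code below relies on (valid bit 63, physical address in bits 51:12,
number of 4K pages minus one in bits 7:0). The macro and function names
are hypothetical, and the attribute bits are passed in as an opaque value.

```c
#include <assert.h>
#include <stdint.h>

#define CMD_QUEUE_SZ       (UINT64_C(1) << 20)            /* 1 MiB queue */
#define CBASER_VALID       (UINT64_C(1) << 63)
#define CBASER_SIZE_MASK   UINT64_C(0xff)
/* Bits 51:12 hold the physical address of the queue. */
#define CBASER_ADDR_MASK   (((UINT64_C(1) << 52) - 1) & ~UINT64_C(0xfff))

/* The queue address must fit entirely into the address field. */
static int cbaser_addr_ok(uint64_t paddr)
{
    return (paddr & ~CBASER_ADDR_MASK) == 0;
}

/* Combine attributes, valid bit, address and size field. */
static uint64_t cbaser_encode(uint64_t paddr, uint64_t attr)
{
    uint64_t pages = CMD_QUEUE_SZ >> 12;       /* size in 4K pages */

    assert(cbaser_addr_ok(paddr));
    return attr | CBASER_VALID | paddr | ((pages - 1) & CBASER_SIZE_MASK);
}
```

A 1 MiB queue is 256 pages of 4K, so the size field ends up as 0xff, the
largest value the 8-bit field can hold.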

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
 xen/arch/arm/gic-v3-its.c        | 53 ++++++++++++++++++++++++++++++++++++++++
 xen/include/asm-arm/gic_v3_its.h |  6 +++++
 2 files changed, 59 insertions(+)

diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
index bfdb7ac..9a86769 100644
--- a/xen/arch/arm/gic-v3-its.c
+++ b/xen/arch/arm/gic-v3-its.c
@@ -20,10 +20,13 @@
 
 #include <xen/lib.h>
 #include <xen/mm.h>
+#include <xen/sizes.h>
 #include <asm/gic_v3_defs.h>
 #include <asm/gic_v3_its.h>
 #include <asm/io.h>
 
+#define ITS_CMD_QUEUE_SZ                SZ_1M
+
 LIST_HEAD(host_its_list);
 
 bool gicv3_its_host_has_its(void)
@@ -56,6 +59,51 @@ static uint64_t encode_propbaser_phys_addr(paddr_t addr, unsigned int page_bits)
     return ret | ((addr & GENMASK_ULL(51, 48)) >> (48 - 12));
 }
 
+static void *its_map_cbaser(struct host_its *its)
+{
+    void __iomem *cbasereg = its->its_base + GITS_CBASER;
+    uint64_t reg;
+    void *buffer;
+
+    reg  = GIC_BASER_InnerShareable << GITS_BASER_SHAREABILITY_SHIFT;
+    reg |= GIC_BASER_CACHE_SameAsInner << GITS_BASER_OUTER_CACHEABILITY_SHIFT;
+    reg |= GIC_BASER_CACHE_RaWaWb << GITS_BASER_INNER_CACHEABILITY_SHIFT;
+
+    buffer = _xzalloc(ITS_CMD_QUEUE_SZ, SZ_64K);
+    if ( !buffer )
+        return NULL;
+
+    if ( virt_to_maddr(buffer) & ~GENMASK_ULL(51, 12) )
+    {
+        xfree(buffer);
+        return NULL;
+    }
+
+    reg |= GITS_VALID_BIT | virt_to_maddr(buffer);
+    reg |= ((ITS_CMD_QUEUE_SZ / SZ_4K) - 1) & GITS_CBASER_SIZE_MASK;
+    writeq_relaxed(reg, cbasereg);
+    reg = readq_relaxed(cbasereg);
+
+    /* If the ITS dropped shareability, drop cacheability as well. */
+    if ( (reg & GITS_BASER_SHAREABILITY_MASK) == 0 )
+    {
+        reg &= ~GITS_BASER_INNER_CACHEABILITY_MASK;
+        writeq_relaxed(reg, cbasereg);
+    }
+
+    /*
+     * If the command queue memory is mapped as uncached, we need to flush
+     * it on every access.
+     */
+    if ( !(reg & GITS_BASER_INNER_CACHEABILITY_MASK) )
+    {
+        its->flags |= HOST_ITS_FLUSH_CMD_QUEUE;
+        printk(XENLOG_WARNING "using non-cacheable ITS command queue\n");
+    }
+
+    return buffer;
+}
+
 /* The ITS BASE registers work with page sizes of 4K, 16K or 64K. */
 #define BASER_PAGE_BITS(sz) ((sz) * 2 + 12)
 
@@ -179,6 +227,11 @@ static int gicv3_its_init_single_its(struct host_its *hw_its)
         }
     }
 
+    hw_its->cmd_buf = its_map_cbaser(hw_its);
+    if ( !hw_its->cmd_buf )
+        return -ENOMEM;
+    writeq_relaxed(0, hw_its->its_base + GITS_CWRITER);
+
     return 0;
 }
 
diff --git a/xen/include/asm-arm/gic_v3_its.h b/xen/include/asm-arm/gic_v3_its.h
index badb644..f21162a 100644
--- a/xen/include/asm-arm/gic_v3_its.h
+++ b/xen/include/asm-arm/gic_v3_its.h
@@ -74,8 +74,12 @@
 #define GITS_BASER_OUTER_CACHEABILITY_MASK   (0x7ULL << GITS_BASER_OUTER_CACHEABILITY_SHIFT)
 #define GITS_BASER_INNER_CACHEABILITY_MASK   (0x7ULL << GITS_BASER_INNER_CACHEABILITY_SHIFT)
 
+#define GITS_CBASER_SIZE_MASK           0xff
+
 #include <xen/device_tree.h>
 
+#define HOST_ITS_FLUSH_CMD_QUEUE        (1U << 0)
+
 /* data structure for each hardware ITS */
 struct host_its {
     struct list_head entry;
@@ -84,6 +88,8 @@ struct host_its {
     paddr_t size;
     void __iomem *its_base;
     unsigned int devid_bits;
+    void *cmd_buf;
+    unsigned int flags;
 };
 
 
-- 
2.9.0



* [PATCH v3 05/26] ARM: GICv3 ITS: introduce ITS command handling
  2017-03-31 18:04 [PATCH v3 00/26] arm64: Dom0 ITS emulation Andre Przywara
                   ` (3 preceding siblings ...)
  2017-03-31 18:05 ` [PATCH v3 04/26] ARM: GICv3 ITS: map ITS command buffer Andre Przywara
@ 2017-03-31 18:05 ` Andre Przywara
  2017-03-31 23:16   ` Stefano Stabellini
  2017-04-03 17:32   ` Julien Grall
  2017-03-31 18:05 ` [PATCH v3 06/26] ARM: GICv3 ITS: introduce device mapping Andre Przywara
                   ` (21 subsequent siblings)
  26 siblings, 2 replies; 52+ messages in thread
From: Andre Przywara @ 2017-03-31 18:05 UTC (permalink / raw)
  To: Julien Grall, Stefano Stabellini
  Cc: xen-devel, Shanker Donthineni, Vijay Kilari

To be able to easily send commands to the ITS, create the respective
wrapper functions, which take care of the ring buffer.
The first two commands we implement provide methods to map a collection
to a redistributor (aka host core) and to flush the command queue (SYNC).
Start using these commands for mapping one collection to each host CPU.
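
For illustration only (this is not part of the patch): a minimal standalone
sketch of the ring buffer bookkeeping behind its_send_command(), with the
read and write pointers kept in a plain struct instead of the GITS_CREADR
and GITS_CWRITER registers. The struct and function names are hypothetical;
locking, cache maintenance and the retry timeout are left out. It assumes
32-byte commands (four 64-bit words) in a 1 MiB queue.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define CMD_SIZE    32u           /* four 64-bit words per command */
#define QUEUE_SZ    (1u << 20)    /* 1 MiB command queue */

struct cmd_ring {
    uint8_t *buf;       /* QUEUE_SZ bytes of queue memory */
    uint32_t readp;     /* consumer offset (GITS_CREADR on hardware) */
    uint32_t writep;    /* producer offset (GITS_CWRITER on hardware) */
};

/* One slot always stays free, so "full" and "empty" remain distinguishable. */
static int ring_full(const struct cmd_ring *r)
{
    return ((r->writep + CMD_SIZE) % QUEUE_SZ) == r->readp;
}

/* All queued commands have been consumed. */
static int ring_empty(const struct cmd_ring *r)
{
    return r->readp == r->writep;
}

/* Copy one command into the queue and advance the write pointer. */
static int ring_push(struct cmd_ring *r, const uint64_t cmd[4])
{
    if ( ring_full(r) )
        return -1;

    memcpy(r->buf + r->writep, cmd, CMD_SIZE);
    r->writep = (r->writep + CMD_SIZE) % QUEUE_SZ;

    return 0;
}
```

Waiting for the ITS to finish then amounts to polling until ring_empty()
holds, which is the readp == writep comparison made in
gicv3_its_wait_commands().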

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
 xen/arch/arm/gic-v3-its.c         | 182 ++++++++++++++++++++++++++++++++++++++
 xen/arch/arm/gic-v3-lpi.c         |  22 +++++
 xen/arch/arm/gic-v3.c             |  25 +++++-
 xen/include/asm-arm/gic_v3_defs.h |   2 +
 xen/include/asm-arm/gic_v3_its.h  |  38 ++++++++
 5 files changed, 267 insertions(+), 2 deletions(-)

diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
index 9a86769..1ac598f 100644
--- a/xen/arch/arm/gic-v3-its.c
+++ b/xen/arch/arm/gic-v3-its.c
@@ -19,11 +19,14 @@
  */
 
 #include <xen/lib.h>
+#include <xen/delay.h>
 #include <xen/mm.h>
 #include <xen/sizes.h>
+#include <asm/gic.h>
 #include <asm/gic_v3_defs.h>
 #include <asm/gic_v3_its.h>
 #include <asm/io.h>
+#include <asm/page.h>
 
 #define ITS_CMD_QUEUE_SZ                SZ_1M
 
@@ -34,6 +37,147 @@ bool gicv3_its_host_has_its(void)
     return !list_empty(&host_its_list);
 }
 
+#define BUFPTR_MASK                     GENMASK_ULL(19, 5)
+static int its_send_command(struct host_its *hw_its, const void *its_cmd)
+{
+    /* Some small grace period in case the command queue is congested. */
+    s_time_t deadline = NOW() + MILLISECS(1);
+    uint64_t readp, writep;
+    int ret = -EBUSY;
+
+    /* No ITS commands from an interrupt handler (at the moment). */
+    ASSERT(!in_irq());
+
+    spin_lock(&hw_its->cmd_lock);
+
+    do {
+        readp = readq_relaxed(hw_its->its_base + GITS_CREADR) & BUFPTR_MASK;
+        writep = readq_relaxed(hw_its->its_base + GITS_CWRITER) & BUFPTR_MASK;
+
+        if ( ((writep + ITS_CMD_SIZE) % ITS_CMD_QUEUE_SZ) != readp )
+        {
+            ret = 0;
+            break;
+        }
+
+        /*
+         * If the command queue is full, wait for a bit in the hope it drains
+         * before giving up.
+         */
+        spin_unlock(&hw_its->cmd_lock);
+        cpu_relax();
+        udelay(1);
+        spin_lock(&hw_its->cmd_lock);
+    } while ( NOW() <= deadline );
+
+    if ( ret )
+    {
+        spin_unlock(&hw_its->cmd_lock);
+        printk(XENLOG_WARNING "ITS: command queue full.\n");
+        return ret;
+    }
+
+    memcpy(hw_its->cmd_buf + writep, its_cmd, ITS_CMD_SIZE);
+    if ( hw_its->flags & HOST_ITS_FLUSH_CMD_QUEUE )
+        clean_and_invalidate_dcache_va_range(hw_its->cmd_buf + writep,
+                                             ITS_CMD_SIZE);
+    else
+        dsb(ishst);
+
+    writep = (writep + ITS_CMD_SIZE) % ITS_CMD_QUEUE_SZ;
+    writeq_relaxed(writep & BUFPTR_MASK, hw_its->its_base + GITS_CWRITER);
+
+    spin_unlock(&hw_its->cmd_lock);
+
+    return 0;
+}
+
+/* Wait for an ITS to finish processing all commands. */
+static int gicv3_its_wait_commands(struct host_its *hw_its)
+{
+    /* Define an upper limit for our wait time. */
+    s_time_t deadline = NOW() + MILLISECS(100);
+    uint64_t readp, writep;
+
+    do {
+        spin_lock(&hw_its->cmd_lock);
+        readp = readq_relaxed(hw_its->its_base + GITS_CREADR) & BUFPTR_MASK;
+        writep = readq_relaxed(hw_its->its_base + GITS_CWRITER) & BUFPTR_MASK;
+        spin_unlock(&hw_its->cmd_lock);
+
+        if ( readp == writep )
+            return 0;
+
+        cpu_relax();
+        udelay(1);
+    } while ( NOW() <= deadline );
+
+    return -ETIMEDOUT;
+}
+
+static uint64_t encode_rdbase(struct host_its *hw_its, unsigned int cpu,
+                              uint64_t reg)
+{
+    reg &= ~GENMASK_ULL(51, 16);
+
+    reg |= gicv3_get_redist_address(cpu, hw_its->flags & HOST_ITS_USES_PTA);
+
+    return reg;
+}
+
+static int its_send_cmd_sync(struct host_its *its, unsigned int cpu)
+{
+    uint64_t cmd[4];
+
+    cmd[0] = GITS_CMD_SYNC;
+    cmd[1] = 0x00;
+    cmd[2] = encode_rdbase(its, cpu, 0x0);
+    cmd[3] = 0x00;
+
+    return its_send_command(its, cmd);
+}
+
+static int its_send_cmd_mapc(struct host_its *its, uint32_t collection_id,
+                             unsigned int cpu)
+{
+    uint64_t cmd[4];
+
+    cmd[0] = GITS_CMD_MAPC;
+    cmd[1] = 0x00;
+    cmd[2] = encode_rdbase(its, cpu, collection_id);
+    cmd[2] |= GITS_VALID_BIT;
+    cmd[3] = 0x00;
+
+    return its_send_command(its, cmd);
+}
+
+/* Set up the (1:1) collection mapping for the given host CPU. */
+int gicv3_its_setup_collection(unsigned int cpu)
+{
+    struct host_its *its;
+    int ret;
+
+    list_for_each_entry(its, &host_its_list, entry)
+    {
+        if ( !its->cmd_buf )
+            continue;
+
+        ret = its_send_cmd_mapc(its, cpu, cpu);
+        if ( ret )
+            return ret;
+
+        ret = its_send_cmd_sync(its, cpu);
+        if ( ret )
+            return ret;
+
+        ret = gicv3_its_wait_commands(its);
+        if ( ret )
+            return ret;
+    }
+
+    return 0;
+}
+
 #define BASER_ATTR_MASK                                           \
         ((0x3UL << GITS_BASER_SHAREABILITY_SHIFT)               | \
          (0x7UL << GITS_BASER_OUTER_CACHEABILITY_SHIFT)         | \
@@ -178,6 +322,38 @@ retry:
     return -EINVAL;
 }
 
+/*
+ * Before an ITS gets initialized, it should be in a quiescent state, where
+ * all outstanding commands and transactions have finished.
+ * So if the ITS is already enabled, turn it off and wait for all outstanding
+ * operations to get processed by polling the QUIESCENT bit.
+ */
+static int gicv3_disable_its(struct host_its *hw_its)
+{
+    uint32_t reg;
+    /* A similar generous wait limit as we use for the command queue wait. */
+    s_time_t deadline = NOW() + MILLISECS(100);
+
+    reg = readl_relaxed(hw_its->its_base + GITS_CTLR);
+    if ( !(reg & GITS_CTLR_ENABLE) && (reg & GITS_CTLR_QUIESCENT) )
+        return 0;
+
+    writel_relaxed(reg & ~GITS_CTLR_ENABLE, hw_its->its_base + GITS_CTLR);
+
+    do {
+        reg = readl_relaxed(hw_its->its_base + GITS_CTLR);
+        if ( reg & GITS_CTLR_QUIESCENT )
+            return 0;
+
+        cpu_relax();
+        udelay(1);
+    } while ( NOW() <= deadline );
+
+    dprintk(XENLOG_ERR, "ITS not quiescent.\n");
+
+    return -ETIMEDOUT;
+}
+
 /* Allow a user to limit the number of devices. */
 static unsigned int max_its_device_bits = 32;
 integer_param("max_its_device_bits", max_its_device_bits);
@@ -191,9 +367,15 @@ static int gicv3_its_init_single_its(struct host_its *hw_its)
     if ( !hw_its->its_base )
         return -ENOMEM;
 
+    ret = gicv3_disable_its(hw_its);
+    if ( ret )
+        return ret;
+
     reg = readq_relaxed(hw_its->its_base + GITS_TYPER);
     hw_its->devid_bits = GITS_TYPER_DEVICE_ID_BITS(reg);
     hw_its->devid_bits = min(hw_its->devid_bits, max_its_device_bits);
+    if ( reg & GITS_TYPER_PTA )
+        hw_its->flags |= HOST_ITS_USES_PTA;
 
     for ( i = 0; i < GITS_BASER_NR_REGS; i++ )
     {
diff --git a/xen/arch/arm/gic-v3-lpi.c b/xen/arch/arm/gic-v3-lpi.c
index 77f6009..d85d63d 100644
--- a/xen/arch/arm/gic-v3-lpi.c
+++ b/xen/arch/arm/gic-v3-lpi.c
@@ -43,6 +43,8 @@ static struct {
 } lpi_data;
 
 struct lpi_redist_data {
+    paddr_t             redist_addr;
+    unsigned int        redist_id;
     void                *pending_table;
 };
 
@@ -50,6 +52,26 @@ static DEFINE_PER_CPU(struct lpi_redist_data, lpi_redist);
 
 #define MAX_PHYS_LPIS   (lpi_data.nr_host_lpis - LPI_OFFSET)
 
+/* Stores this redistributor's physical address and ID in a per-CPU variable */
+void gicv3_set_redist_address(paddr_t address, unsigned int redist_id)
+{
+    this_cpu(lpi_redist).redist_addr = address;
+    this_cpu(lpi_redist).redist_id = redist_id;
+}
+
+/*
+ * Returns a redistributor's ID (either as an address or as an ID).
+ * This must be (and is) called only after it has been setup by the above
+ * function.
+ */
+uint64_t gicv3_get_redist_address(unsigned int cpu, bool use_pta)
+{
+    if ( use_pta )
+        return per_cpu(lpi_redist, cpu).redist_addr & GENMASK_ULL(51, 16);
+    else
+        return per_cpu(lpi_redist, cpu).redist_id << 16;
+}
+
 static int gicv3_lpi_allocate_pendtable(uint64_t *reg)
 {
     uint64_t val;
diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
index b84bc40..0e21cb2 100644
--- a/xen/arch/arm/gic-v3.c
+++ b/xen/arch/arm/gic-v3.c
@@ -666,7 +666,21 @@ static int __init gicv3_populate_rdist(void)
 
                 if ( typer & GICR_TYPER_PLPIS )
                 {
-                    int ret;
+                    paddr_t rdist_addr;
+                    int procnum, ret;
+
+                    /*
+                     * The ITS refers to redistributors either by their physical
+                     * address or by their ID. Determine those two values and
+                     * let the ITS code store them in per host CPU variables to
+                     * later be able to address those redistributors.
+                     */
+                    rdist_addr = gicv3.rdist_regions[i].base;
+                    rdist_addr += ptr - gicv3.rdist_regions[i].map_base;
+                    procnum = (typer & GICR_TYPER_PROC_NUM_MASK);
+                    procnum >>= GICR_TYPER_PROC_NUM_SHIFT;
+
+                    gicv3_set_redist_address(rdist_addr, procnum);
 
                     ret = gicv3_lpi_init_rdist(ptr);
                     if ( ret && ret != -ENODEV )
@@ -705,7 +719,7 @@ static int __init gicv3_populate_rdist(void)
 
 static int gicv3_cpu_init(void)
 {
-    int i;
+    int i, ret;
     uint32_t priority;
 
     /* Register ourselves with the rest of the world */
@@ -715,6 +729,13 @@ static int gicv3_cpu_init(void)
     if ( gicv3_enable_redist() )
         return -ENODEV;
 
+    if ( gicv3_its_host_has_its() )
+    {
+        ret = gicv3_its_setup_collection(smp_processor_id());
+        if ( ret )
+            return ret;
+    }
+
     /* Set priority on PPI and SGI interrupts */
     priority = (GIC_PRI_IPI << 24 | GIC_PRI_IPI << 16 | GIC_PRI_IPI << 8 |
                 GIC_PRI_IPI);
diff --git a/xen/include/asm-arm/gic_v3_defs.h b/xen/include/asm-arm/gic_v3_defs.h
index 7cdebc5..b01b6ed 100644
--- a/xen/include/asm-arm/gic_v3_defs.h
+++ b/xen/include/asm-arm/gic_v3_defs.h
@@ -103,6 +103,8 @@
 #define GICR_TYPER_PLPIS             (1U << 0)
 #define GICR_TYPER_VLPIS             (1U << 1)
 #define GICR_TYPER_LAST              (1U << 4)
+#define GICR_TYPER_PROC_NUM_SHIFT    8
+#define GICR_TYPER_PROC_NUM_MASK     (0xffff << GICR_TYPER_PROC_NUM_SHIFT)
 
 /* For specifying the inner cacheability type only */
 #define GIC_BASER_CACHE_nCnB         0ULL
diff --git a/xen/include/asm-arm/gic_v3_its.h b/xen/include/asm-arm/gic_v3_its.h
index f21162a..4c2ae1c 100644
--- a/xen/include/asm-arm/gic_v3_its.h
+++ b/xen/include/asm-arm/gic_v3_its.h
@@ -42,10 +42,12 @@
 #define GITS_CTLR_QUIESCENT             BIT(31)
 #define GITS_CTLR_ENABLE                BIT(0)
 
+#define GITS_TYPER_PTA                  BIT_ULL(19)
 #define GITS_TYPER_DEVIDS_SHIFT         13
 #define GITS_TYPER_DEVIDS_MASK          (0x1fUL << GITS_TYPER_DEVIDS_SHIFT)
 #define GITS_TYPER_DEVICE_ID_BITS(r)    (((r & GITS_TYPER_DEVIDS_MASK) >> \
                                                GITS_TYPER_DEVIDS_SHIFT) + 1)
+#define GITS_TYPER_IDBITS_SHIFT         8
 
 #define GITS_IIDR_VALUE                 0x34c
 
@@ -76,9 +78,26 @@
 
 #define GITS_CBASER_SIZE_MASK           0xff
 
+/* ITS command definitions */
+#define ITS_CMD_SIZE                    32
+
+#define GITS_CMD_MOVI                   0x01
+#define GITS_CMD_INT                    0x03
+#define GITS_CMD_CLEAR                  0x04
+#define GITS_CMD_SYNC                   0x05
+#define GITS_CMD_MAPD                   0x08
+#define GITS_CMD_MAPC                   0x09
+#define GITS_CMD_MAPTI                  0x0a
+#define GITS_CMD_MAPI                   0x0b
+#define GITS_CMD_INV                    0x0c
+#define GITS_CMD_INVALL                 0x0d
+#define GITS_CMD_MOVALL                 0x0e
+#define GITS_CMD_DISCARD                0x0f
+
 #include <xen/device_tree.h>
 
 #define HOST_ITS_FLUSH_CMD_QUEUE        (1U << 0)
+#define HOST_ITS_USES_PTA               (1U << 1)
 
 /* data structure for each hardware ITS */
 struct host_its {
@@ -88,6 +107,7 @@ struct host_its {
     paddr_t size;
     void __iomem *its_base;
     unsigned int devid_bits;
+    spinlock_t cmd_lock;
     void *cmd_buf;
     unsigned int flags;
 };
@@ -108,6 +128,13 @@ int gicv3_lpi_init_rdist(void __iomem * rdist_base);
 int gicv3_lpi_init_host_lpis(unsigned int nr_lpis);
 int gicv3_its_init(void);
 
+/* Store the physical address and ID for each redistributor as read from DT. */
+void gicv3_set_redist_address(paddr_t address, unsigned int redist_id);
+uint64_t gicv3_get_redist_address(unsigned int cpu, bool use_pta);
+
+/* Map a collection for this host CPU to each host ITS. */
+int gicv3_its_setup_collection(unsigned int cpu);
+
 #else
 
 static LIST_HEAD(host_its_list);
@@ -135,6 +162,17 @@ static inline int gicv3_its_init(void)
 {
     return 0;
 }
+
+static inline void gicv3_set_redist_address(paddr_t address,
+                                            unsigned int redist_id)
+{
+}
+
+static inline int gicv3_its_setup_collection(unsigned int cpu)
+{
+    return 0;
+}
+
 #endif /* CONFIG_HAS_ITS */
 
 #endif
-- 
2.9.0



* [PATCH v3 06/26] ARM: GICv3 ITS: introduce device mapping
  2017-03-31 18:04 [PATCH v3 00/26] arm64: Dom0 ITS emulation Andre Przywara
                   ` (4 preceding siblings ...)
  2017-03-31 18:05 ` [PATCH v3 05/26] ARM: GICv3 ITS: introduce ITS command handling Andre Przywara
@ 2017-03-31 18:05 ` Andre Przywara
  2017-03-31 23:20   ` Stefano Stabellini
                     ` (2 more replies)
  2017-03-31 18:05 ` [PATCH v3 07/26] ARM: GICv3 ITS: introduce host LPI array Andre Przywara
                   ` (20 subsequent siblings)
  26 siblings, 3 replies; 52+ messages in thread
From: Andre Przywara @ 2017-03-31 18:05 UTC (permalink / raw)
  To: Julien Grall, Stefano Stabellini
  Cc: xen-devel, Shanker Donthineni, Vijay Kilari

The ITS uses device IDs to map LPIs to a device. Dom0 will later use
those IDs, which we directly pass on to the host.
For this we have to map each device that Dom0 may request to a host
ITS device with the same identifier.
Allocate the respective memory and enter each device into an rbtree, so
that we can later iterate over the devices or easily tear down guests.
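
To illustrate the lookup scheme described above, here is a minimal,
self-contained sketch of ordering devices by the pair (guest doorbell
address, guest device ID). It is not the code from this patch: it uses a
plain unbalanced binary search tree as a stand-in for Xen's rbtree, and
all demo_* names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the patch's struct its_devices. */
struct demo_its_device {
    struct demo_its_device *left, *right;
    uint64_t guest_doorbell;    /* identifies the virtual ITS */
    uint32_t guest_devid;       /* device ID as seen by the guest */
};

/* Order by doorbell address first, then by device ID. */
static int demo_compare(const struct demo_its_device *dev,
                        uint64_t doorbell, uint32_t devid)
{
    if ( dev->guest_doorbell != doorbell )
        return dev->guest_doorbell < doorbell ? -1 : 1;
    if ( dev->guest_devid != devid )
        return dev->guest_devid < devid ? -1 : 1;
    return 0;
}

/* Insert a node; returns 0 on success, -1 if the key is already mapped. */
static int demo_insert(struct demo_its_device **root,
                       struct demo_its_device *node)
{
    struct demo_its_device **link = root;

    assert(node != NULL);
    while ( *link )
    {
        int cmp = demo_compare(*link, node->guest_doorbell,
                               node->guest_devid);

        if ( cmp == 0 )
            return -1;
        /* cmp > 0: the existing node sorts after the key, so go left. */
        link = cmp > 0 ? &(*link)->left : &(*link)->right;
    }
    node->left = node->right = NULL;
    *link = node;

    return 0;
}

/* Find the device for a (doorbell, devid) pair, or NULL if unmapped. */
static struct demo_its_device *demo_find(struct demo_its_device *root,
                                         uint64_t doorbell, uint32_t devid)
{
    while ( root )
    {
        int cmp = demo_compare(root, doorbell, devid);

        if ( cmp == 0 )
            return root;
        root = cmp > 0 ? root->left : root->right;
    }

    return NULL;
}
```

A duplicate key is rejected on insert, which corresponds to the -EBUSY
case when a guest tries to map an already mapped device.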

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
 xen/arch/arm/gic-v3-its.c        | 227 +++++++++++++++++++++++++++++++++++++++
 xen/arch/arm/vgic-v3.c           |   4 +
 xen/include/asm-arm/domain.h     |   3 +
 xen/include/asm-arm/gic_v3_its.h |  23 ++++
 4 files changed, 257 insertions(+)

diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
index 1ac598f..295f7dc 100644
--- a/xen/arch/arm/gic-v3-its.c
+++ b/xen/arch/arm/gic-v3-its.c
@@ -21,6 +21,8 @@
 #include <xen/lib.h>
 #include <xen/delay.h>
 #include <xen/mm.h>
+#include <xen/rbtree.h>
+#include <xen/sched.h>
 #include <xen/sizes.h>
 #include <asm/gic.h>
 #include <asm/gic_v3_defs.h>
@@ -32,6 +34,18 @@
 
 LIST_HEAD(host_its_list);
 
+struct its_devices {
+    struct rb_node rbnode;
+    struct host_its *hw_its;
+    void *itt_addr;
+    paddr_t guest_doorbell;             /* Identifies the virtual ITS */
+    uint32_t host_devid;
+    uint32_t guest_devid;
+    uint32_t eventids;                  /* Number of event IDs (MSIs) */
+    uint32_t *host_lpi_blocks;          /* Which LPIs are used on the host */
+    struct pending_irq *pend_irqs;      /* One struct per event */
+};
+
 bool gicv3_its_host_has_its(void)
 {
     return !list_empty(&host_its_list);
@@ -151,6 +165,26 @@ static int its_send_cmd_mapc(struct host_its *its, uint32_t collection_id,
     return its_send_command(its, cmd);
 }
 
+static int its_send_cmd_mapd(struct host_its *its, uint32_t deviceid,
+                             uint8_t size_bits, paddr_t itt_addr, bool valid)
+{
+    uint64_t cmd[4];
+
+    if ( valid )
+    {
+        ASSERT(size_bits < 32);
+        ASSERT(!(itt_addr & ~GENMASK_ULL(51, 8)));
+    }
+    cmd[0] = GITS_CMD_MAPD | ((uint64_t)deviceid << 32);
+    cmd[1] = size_bits;
+    cmd[2] = itt_addr;
+    if ( valid )
+        cmd[2] |= GITS_VALID_BIT;
+    cmd[3] = 0x00;
+
+    return its_send_command(its, cmd);
+}
+
 /* Set up the (1:1) collection mapping for the given host CPU. */
 int gicv3_its_setup_collection(unsigned int cpu)
 {
@@ -376,6 +410,7 @@ static int gicv3_its_init_single_its(struct host_its *hw_its)
     hw_its->devid_bits = min(hw_its->devid_bits, max_its_device_bits);
     if ( reg & GITS_TYPER_PTA )
         hw_its->flags |= HOST_ITS_USES_PTA;
+    hw_its->itte_size = GITS_TYPER_ITT_SIZE(reg);
 
     for ( i = 0; i < GITS_BASER_NR_REGS; i++ )
     {
@@ -432,6 +467,197 @@ int gicv3_its_init(void)
     return 0;
 }
 
+static int remove_mapped_guest_device(struct its_devices *dev)
+{
+    int ret;
+
+    if ( dev->hw_its )
+    {
+        /* MAPD also discards all events with this device ID. */
+        ret = its_send_cmd_mapd(dev->hw_its, dev->host_devid, 0, 0, false);
+        if ( ret )
+            return ret;
+    }
+
+    ret = gicv3_its_wait_commands(dev->hw_its);
+    if ( ret )
+        return ret;
+
+    xfree(dev->itt_addr);
+    xfree(dev->pend_irqs);
+    xfree(dev);
+
+    return 0;
+}
+
+static struct host_its *gicv3_its_find_by_doorbell(paddr_t doorbell_address)
+{
+    struct host_its *hw_its;
+
+    list_for_each_entry(hw_its, &host_its_list, entry)
+    {
+        if ( hw_its->addr + ITS_DOORBELL_OFFSET == doorbell_address )
+            return hw_its;
+    }
+
+    return NULL;
+}
+
+static int compare_its_guest_devices(struct its_devices *dev,
+                                     paddr_t doorbell, uint32_t devid)
+{
+    if ( dev->guest_doorbell < doorbell )
+        return -1;
+
+    if ( dev->guest_doorbell > doorbell )
+        return 1;
+
+    if ( dev->guest_devid < devid )
+        return -1;
+
+    if ( dev->guest_devid > devid )
+        return 1;
+
+    return 0;
+}
+
+/*
+ * Map a hardware device, identified by a certain host ITS and its device ID
+ * to domain d, a guest ITS (identified by its doorbell address) and device ID.
+ * Also provide the number of events (MSIs) needed for that device.
+ * This does not check if this particular hardware device is already mapped
+ * at another domain, it is expected that this would be done by the caller.
+ */
+int gicv3_its_map_guest_device(struct domain *d,
+                               paddr_t host_doorbell, uint32_t host_devid,
+                               paddr_t guest_doorbell, uint32_t guest_devid,
+                               uint32_t nr_events, bool valid)
+{
+    void *itt_addr = NULL;
+    struct host_its *hw_its;
+    struct its_devices *dev = NULL;
+    struct rb_node **new = &d->arch.vgic.its_devices.rb_node, *parent = NULL;
+    int ret = -ENOENT;
+
+    hw_its = gicv3_its_find_by_doorbell(host_doorbell);
+    if ( !hw_its )
+        return ret;
+
+    /* check for already existing mappings */
+    spin_lock(&d->arch.vgic.its_devices_lock);
+    while ( *new )
+    {
+        struct its_devices *temp;
+        int cmp;
+
+        temp = rb_entry(*new, struct its_devices, rbnode);
+
+        parent = *new;
+        cmp = compare_its_guest_devices(temp, guest_doorbell, guest_devid);
+        if ( !cmp )
+        {
+            if ( !valid )
+                rb_erase(&temp->rbnode, &d->arch.vgic.its_devices);
+
+            spin_unlock(&d->arch.vgic.its_devices_lock);
+
+            if ( valid )
+                return -EBUSY;
+
+            return remove_mapped_guest_device(temp);
+        }
+
+        if ( cmp > 0 )
+            new = &((*new)->rb_left);
+        else
+            new = &((*new)->rb_right);
+    }
+
+    if ( !valid )
+        goto out_unlock;
+
+    ret = -ENOMEM;
+
+    /* An Interrupt Translation Table needs to be 256-byte aligned. */
+    itt_addr = _xzalloc(nr_events * hw_its->itte_size, 256);
+    if ( !itt_addr )
+        goto out_unlock;
+
+    dev = xzalloc(struct its_devices);
+    if ( !dev )
+        goto out_unlock;
+
+    /*
+     * Allocate the pending_irqs for each virtual LPI. They will be put
+     * into the domain's radix tree upon the guest's MAPTI command.
+     */
+    dev->pend_irqs = xzalloc_array(struct pending_irq, nr_events);
+    if ( !dev->pend_irqs )
+        goto out_unlock;
+
+    ret = its_send_cmd_mapd(hw_its, host_devid,
+                            fls(ROUNDUP(nr_events, LPI_BLOCK) - 1) - 1,
+                            virt_to_maddr(itt_addr), true);
+    if ( ret )
+        goto out_unlock;
+
+    dev->itt_addr = itt_addr;
+    dev->hw_its = hw_its;
+    dev->guest_doorbell = guest_doorbell;
+    dev->guest_devid = guest_devid;
+    dev->host_devid = host_devid;
+    dev->eventids = nr_events;
+
+    rb_link_node(&dev->rbnode, parent, new);
+    rb_insert_color(&dev->rbnode, &d->arch.vgic.its_devices);
+
+    spin_unlock(&d->arch.vgic.its_devices_lock);
+
+    return 0;
+
+out_unlock:
+    spin_unlock(&d->arch.vgic.its_devices_lock);
+    if ( dev )
+    {
+        xfree(dev->pend_irqs);
+        xfree(dev->host_lpi_blocks);
+    }
+    xfree(itt_addr);
+    xfree(dev);
+    return ret;
+}
+
+/* Remove any connections a domain had to any ITS in the system. */
+void gicv3_its_unmap_all_devices(struct domain *d)
+{
+    struct rb_node *victim;
+    struct its_devices *dev;
+
+    /*
+     * This is an easily readable, but suboptimal implementation.
+     * It uses the provided iteration wrapper and erases each node, which
+     * possibly triggers rebalancing.
+     * This seems overkill since we are going to abolish the whole tree, but
+     * avoids an open-coded re-implementation of the traversal functions with
+     * some recursive function calls.
+     */
+restart:
+    spin_lock(&d->arch.vgic.its_devices_lock);
+    if ( (victim = rb_first(&d->arch.vgic.its_devices)) )
+    {
+        dev = rb_entry(victim, struct its_devices, rbnode);
+        rb_erase(victim, &d->arch.vgic.its_devices);
+
+        spin_unlock(&d->arch.vgic.its_devices_lock);
+
+        remove_mapped_guest_device(dev);
+
+        goto restart;
+    }
+
+    spin_unlock(&d->arch.vgic.its_devices_lock);
+}
+
 /* Scan the DT for any ITS nodes and create a list of host ITSes out of it. */
 void gicv3_its_dt_init(const struct dt_device_node *node)
 {
@@ -459,6 +685,7 @@ void gicv3_its_dt_init(const struct dt_device_node *node)
         its_data->addr = addr;
         its_data->size = size;
         its_data->dt_node = its;
+        spin_lock_init(&its_data->cmd_lock);
 
         printk("GICv3: Found ITS @0x%lx\n", addr);
 
diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c
index d61479d..6242252 100644
--- a/xen/arch/arm/vgic-v3.c
+++ b/xen/arch/arm/vgic-v3.c
@@ -1450,6 +1450,9 @@ static int vgic_v3_domain_init(struct domain *d)
     d->arch.vgic.nr_regions = rdist_count;
     d->arch.vgic.rdist_regions = rdist_regions;
 
+    spin_lock_init(&d->arch.vgic.its_devices_lock);
+    d->arch.vgic.its_devices = RB_ROOT;
+
     /*
      * Domain 0 gets the hardware address.
      * Guests get the virtual platform layout.
@@ -1522,6 +1525,7 @@ static int vgic_v3_domain_init(struct domain *d)
 
 static void vgic_v3_domain_free(struct domain *d)
 {
+    gicv3_its_unmap_all_devices(d);
     xfree(d->arch.vgic.rdist_regions);
 }
 
diff --git a/xen/include/asm-arm/domain.h b/xen/include/asm-arm/domain.h
index 2d6fbb1..e559027 100644
--- a/xen/include/asm-arm/domain.h
+++ b/xen/include/asm-arm/domain.h
@@ -11,6 +11,7 @@
 #include <asm/gic.h>
 #include <public/hvm/params.h>
 #include <xen/serial.h>
+#include <xen/rbtree.h>
 
 struct hvm_domain
 {
@@ -109,6 +110,8 @@ struct arch_domain
         } *rdist_regions;
         int nr_regions;                     /* Number of rdist regions */
         uint32_t rdist_stride;              /* Re-Distributor stride */
+        struct rb_root its_devices;         /* Devices mapped to an ITS */
+        spinlock_t its_devices_lock;        /* Protects the its_devices tree */
 #endif
     } vgic;
 
diff --git a/xen/include/asm-arm/gic_v3_its.h b/xen/include/asm-arm/gic_v3_its.h
index 4c2ae1c..4ade5f6 100644
--- a/xen/include/asm-arm/gic_v3_its.h
+++ b/xen/include/asm-arm/gic_v3_its.h
@@ -48,6 +48,10 @@
 #define GITS_TYPER_DEVICE_ID_BITS(r)    (((r & GITS_TYPER_DEVIDS_MASK) >> \
                                                GITS_TYPER_DEVIDS_SHIFT) + 1)
 #define GITS_TYPER_IDBITS_SHIFT         8
+#define GITS_TYPER_ITT_SIZE_SHIFT       4
+#define GITS_TYPER_ITT_SIZE_MASK        (0xfUL << GITS_TYPER_ITT_SIZE_SHIFT)
+#define GITS_TYPER_ITT_SIZE(r)          ((((r) & GITS_TYPER_ITT_SIZE_MASK) >> \
+                                                GITS_TYPER_ITT_SIZE_SHIFT) + 1)
 
 #define GITS_IIDR_VALUE                 0x34c
 
@@ -94,7 +98,10 @@
 #define GITS_CMD_MOVALL                 0x0e
 #define GITS_CMD_DISCARD                0x0f
 
+#define ITS_DOORBELL_OFFSET             0x10040
+
 #include <xen/device_tree.h>
+#include <xen/rbtree.h>
 
 #define HOST_ITS_FLUSH_CMD_QUEUE        (1U << 0)
 #define HOST_ITS_USES_PTA               (1U << 1)
@@ -109,6 +116,7 @@ struct host_its {
     unsigned int devid_bits;
     spinlock_t cmd_lock;
     void *cmd_buf;
+    unsigned int itte_size;
     unsigned int flags;
 };
 
@@ -135,6 +143,17 @@ uint64_t gicv3_get_redist_address(unsigned int cpu, bool use_pta);
 /* Map a collection for this host CPU to each host ITS. */
 int gicv3_its_setup_collection(unsigned int cpu);
 
+/*
+ * Map a device on the host by allocating an ITT on the host (ITS).
+ * "nr_events" specifies how many events (interrupts) this device will need.
+ * Setting "valid" to false deallocates the device.
+ */
+int gicv3_its_map_guest_device(struct domain *d,
+                               paddr_t host_doorbell, uint32_t host_devid,
+                               paddr_t guest_doorbell, uint32_t guest_devid,
+                               uint32_t nr_events, bool valid);
+void gicv3_its_unmap_all_devices(struct domain *d);
+
 #else
 
 static LIST_HEAD(host_its_list);
@@ -173,6 +192,10 @@ static inline int gicv3_its_setup_collection(unsigned int cpu)
     return 0;
 }
 
+static inline void gicv3_its_unmap_all_devices(struct domain *d)
+{
+}
+
 #endif /* CONFIG_HAS_ITS */
 
 #endif
-- 
2.9.0



* [PATCH v3 07/26] ARM: GICv3 ITS: introduce host LPI array
  2017-03-31 18:04 [PATCH v3 00/26] arm64: Dom0 ITS emulation Andre Przywara
                   ` (5 preceding siblings ...)
  2017-03-31 18:05 ` [PATCH v3 06/26] ARM: GICv3 ITS: introduce device mapping Andre Przywara
@ 2017-03-31 18:05 ` Andre Przywara
  2017-03-31 23:24   ` Stefano Stabellini
  2017-03-31 18:05 ` [PATCH v3 08/26] ARM: GICv3: introduce separate pending_irq structs for LPIs Andre Przywara
                   ` (19 subsequent siblings)
  26 siblings, 1 reply; 52+ messages in thread
From: Andre Przywara @ 2017-03-31 18:05 UTC (permalink / raw)
  To: Julien Grall, Stefano Stabellini
  Cc: xen-devel, Shanker Donthineni, Vijay Kilari

The number of LPIs on a host can potentially be huge (millions),
although in practice it will mostly be reasonable. So preallocating
an array of struct irq_desc's covering every LPI is not an option.
However Xen itself does not care about LPIs, as every LPI will be injected
into a guest (Dom0 for now).
Create a dense data structure (8 Bytes) for each LPI which holds just
enough information to determine the virtual IRQ number and the VCPU into
which the LPI needs to be injected.
Also to not artificially limit the number of LPIs, we create a 2-level
table for holding those structures.
This patch introduces functions to initialize these tables and to
create, look up and destroy entries for a given LPI.
By using the naturally atomic access guarantee the native uint64_t data
type gives us, we allocate and access LPI information in a way that does
not require a lock.
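
As a rough illustration of the two ideas above (a two-level table plus
lockless 64-bit entries), here is a self-contained sketch in portable
C11. It is not the code from this patch: it uses <stdatomic.h> instead
of Xen's read_u64_atomic()/write_u64_atomic(), a 4K page size is
assumed, and all demo_* names and the table size are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdlib.h>

#define DEMO_LPI_OFFSET     8192u   /* LPI numbers start at 8192 */
#define DEMO_PAGE_SIZE      4096u   /* assumed page size */
#define DEMO_LPIS_PER_PAGE  (DEMO_PAGE_SIZE / sizeof(uint64_t))
#define DEMO_NR_PAGES       4u      /* arbitrary small first-level size */

/* Same idea as the patch's union host_lpi: three fields in 64 bits. */
union demo_host_lpi {
    uint64_t data;
    struct {
        uint32_t virt_lpi;
        uint16_t dom_id;
        uint16_t vcpu_id;
    };
};

/* The entry must fit into one naturally atomic 64-bit access. */
static_assert(sizeof(union demo_host_lpi) == sizeof(uint64_t),
              "host LPI entry must be exactly 64 bits");

/* First level: pointers to pages of entries, allocated on demand. */
static _Atomic uint64_t *demo_table[DEMO_NR_PAGES];

/* Return the slot for a host LPI, or NULL if out of range/unallocated. */
static _Atomic uint64_t *demo_slot(uint32_t plpi)
{
    uint32_t idx;

    if ( plpi < DEMO_LPI_OFFSET )
        return NULL;

    idx = plpi - DEMO_LPI_OFFSET;
    if ( idx / DEMO_LPIS_PER_PAGE >= DEMO_NR_PAGES ||
         !demo_table[idx / DEMO_LPIS_PER_PAGE] )
        return NULL;

    return &demo_table[idx / DEMO_LPIS_PER_PAGE][idx % DEMO_LPIS_PER_PAGE];
}

/*
 * Publish an entry with a single 64-bit store. Page allocation is not
 * thread-safe here; callers would have to serialize it (the patch uses a
 * spinlock for allocation, but not for access).
 */
static int demo_set(uint32_t plpi, uint16_t dom_id, uint16_t vcpu_id,
                    uint32_t virt_lpi)
{
    union demo_host_lpi entry = { .data = 0 };
    uint32_t idx, page;

    if ( plpi < DEMO_LPI_OFFSET )
        return -1;

    idx = plpi - DEMO_LPI_OFFSET;
    page = idx / DEMO_LPIS_PER_PAGE;
    if ( page >= DEMO_NR_PAGES )
        return -1;

    if ( !demo_table[page] )
    {
        demo_table[page] = calloc(DEMO_LPIS_PER_PAGE,
                                  sizeof(**demo_table));
        if ( !demo_table[page] )
            return -1;
    }

    entry.virt_lpi = virt_lpi;
    entry.dom_id = dom_id;
    entry.vcpu_id = vcpu_id;
    atomic_store(&demo_table[page][idx % DEMO_LPIS_PER_PAGE], entry.data);

    return 0;
}

/* Read the whole entry atomically, then use the fields of the copy. */
static int demo_get(uint32_t plpi, union demo_host_lpi *out)
{
    _Atomic uint64_t *slot = demo_slot(plpi);

    if ( !slot )
        return -1;

    out->data = atomic_load(slot);

    return 0;
}
```

A reader only ever sees a complete old or a complete new entry, never a
mix of fields from both, which is why the lookup path can get away
without taking a lock.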

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
 xen/arch/arm/gic-v3-its.c        |  89 +++++++++++++++++-
 xen/arch/arm/gic-v3-lpi.c        | 196 +++++++++++++++++++++++++++++++++++++++
 xen/include/asm-arm/gic.h        |   2 +
 xen/include/asm-arm/gic_v3_its.h |   5 +
 xen/include/asm-arm/irq.h        |   5 +
 5 files changed, 295 insertions(+), 2 deletions(-)

diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
index 295f7dc..fa284e7 100644
--- a/xen/arch/arm/gic-v3-its.c
+++ b/xen/arch/arm/gic-v3-its.c
@@ -151,6 +151,20 @@ static int its_send_cmd_sync(struct host_its *its, unsigned int cpu)
     return its_send_command(its, cmd);
 }
 
+static int its_send_cmd_mapti(struct host_its *its,
+                              uint32_t deviceid, uint32_t eventid,
+                              uint32_t pintid, uint16_t icid)
+{
+    uint64_t cmd[4];
+
+    cmd[0] = GITS_CMD_MAPTI | ((uint64_t)deviceid << 32);
+    cmd[1] = eventid | ((uint64_t)pintid << 32);
+    cmd[2] = icid;
+    cmd[3] = 0x00;
+
+    return its_send_command(its, cmd);
+}
+
 static int its_send_cmd_mapc(struct host_its *its, uint32_t collection_id,
                              unsigned int cpu)
 {
@@ -185,6 +199,19 @@ static int its_send_cmd_mapd(struct host_its *its, uint32_t deviceid,
     return its_send_command(its, cmd);
 }
 
+static int its_send_cmd_inv(struct host_its *its,
+                            uint32_t deviceid, uint32_t eventid)
+{
+    uint64_t cmd[4];
+
+    cmd[0] = GITS_CMD_INV | ((uint64_t)deviceid << 32);
+    cmd[1] = eventid;
+    cmd[2] = 0x00;
+    cmd[3] = 0x00;
+
+    return its_send_command(its, cmd);
+}
+
 /* Set up the (1:1) collection mapping for the given host CPU. */
 int gicv3_its_setup_collection(unsigned int cpu)
 {
@@ -469,7 +496,7 @@ int gicv3_its_init(void)
 
 static int remove_mapped_guest_device(struct its_devices *dev)
 {
-    int ret;
+    int ret, i;
 
     if ( dev->hw_its )
     {
@@ -479,12 +506,16 @@ static int remove_mapped_guest_device(struct its_devices *dev)
             return ret;
     }
 
+    for ( i = 0; i < DIV_ROUND_UP(dev->eventids, LPI_BLOCK); i++ )
+        gicv3_free_host_lpi_block(dev->host_lpi_blocks[i]);
+
     ret = gicv3_its_wait_commands(dev->hw_its);
     if ( ret )
         return ret;
 
     xfree(dev->itt_addr);
     xfree(dev->pend_irqs);
+    xfree(dev->host_lpi_blocks);
     xfree(dev);
 
     return 0;
@@ -522,6 +553,37 @@ static int compare_its_guest_devices(struct its_devices *dev,
 }
 
 /*
+ * On the host ITS @its, map @nr_events consecutive LPIs.
+ * The mapping connects a device @devid and event @eventid pair to LPI @lpi,
+ * increasing both @eventid and @lpi to cover the number of requested LPIs.
+ */
+static int gicv3_its_map_host_events(struct host_its *its,
+                                     uint32_t devid, uint32_t eventid,
+                                     uint32_t lpi, uint32_t nr_events)
+{
+    uint32_t i;
+    int ret;
+
+    for ( i = 0; i < nr_events; i++ )
+    {
+        /* For now we map every host LPI to host CPU 0 */
+        ret = its_send_cmd_mapti(its, devid, eventid + i, lpi + i, 0);
+        if ( ret )
+            return ret;
+
+        ret = its_send_cmd_inv(its, devid, eventid + i);
+        if ( ret )
+            return ret;
+    }
+
+    ret = its_send_cmd_sync(its, 0);
+    if ( ret )
+        return ret;
+
+    return gicv3_its_wait_commands(its);
+}
+
+/*
  * Map a hardware device, identified by a certain host ITS and its device ID
  * to domain d, a guest ITS (identified by its doorbell address) and device ID.
  * Also provide the number of events (MSIs) needed for that device.
@@ -537,7 +599,7 @@ int gicv3_its_map_guest_device(struct domain *d,
     struct host_its *hw_its;
     struct its_devices *dev = NULL;
     struct rb_node **new = &d->arch.vgic.its_devices.rb_node, *parent = NULL;
-    int ret = -ENOENT;
+    int ret = -ENOENT, i;
 
     hw_its = gicv3_its_find_by_doorbell(host_doorbell);
     if ( !hw_its )
@@ -595,6 +657,11 @@ int gicv3_its_map_guest_device(struct domain *d,
     if ( !dev->pend_irqs )
         goto out_unlock;
 
+    dev->host_lpi_blocks = xzalloc_array(uint32_t,
+                                         DIV_ROUND_UP(nr_events, LPI_BLOCK));
+    if ( !dev->host_lpi_blocks )
+        goto out_unlock;
+
     ret = its_send_cmd_mapd(hw_its, host_devid,
                             fls(ROUNDUP(nr_events, LPI_BLOCK) - 1) - 1,
                             virt_to_maddr(itt_addr), true);
@@ -613,10 +680,28 @@ int gicv3_its_map_guest_device(struct domain *d,
 
     spin_unlock(&d->arch.vgic.its_devices_lock);
 
+    /*
+     * Map all host LPIs within this device already. We can't afford to queue
+     * any host ITS commands later on during the guest's runtime.
+     */
+    for ( i = 0; i < DIV_ROUND_UP(nr_events, LPI_BLOCK); i++ )
+    {
+        ret = gicv3_allocate_host_lpi_block(d, &dev->host_lpi_blocks[i]);
+        if ( ret < 0 )
+            goto out;
+
+        ret = gicv3_its_map_host_events(hw_its, host_devid, i * LPI_BLOCK,
+                                        dev->host_lpi_blocks[i], LPI_BLOCK);
+        if ( ret < 0 )
+            goto out;
+    }
+
     return 0;
 
 out_unlock:
     spin_unlock(&d->arch.vgic.its_devices_lock);
+
+out:
     if ( dev )
     {
         xfree(dev->pend_irqs);
diff --git a/xen/arch/arm/gic-v3-lpi.c b/xen/arch/arm/gic-v3-lpi.c
index d85d63d..d642cc5 100644
--- a/xen/arch/arm/gic-v3-lpi.c
+++ b/xen/arch/arm/gic-v3-lpi.c
@@ -20,25 +20,55 @@
 
 #include <xen/lib.h>
 #include <xen/mm.h>
+#include <xen/sched.h>
 #include <xen/sizes.h>
+#include <asm/atomic.h>
+#include <asm/domain.h>
 #include <asm/gic.h>
 #include <asm/gic_v3_defs.h>
 #include <asm/gic_v3_its.h>
 #include <asm/io.h>
 #include <asm/page.h>
 
+/*
+ * There could be a lot of LPIs on the host side, and they always go to
+ * a guest. So having a struct irq_desc for each of them would be wasteful
+ * and useless.
+ * Instead just store enough information to find the right VCPU to inject
+ * those LPIs into, which just requires the virtual LPI number.
+ * To avoid a global lock on this data structure, this is using a lockless
+ * approach relying on the architectural atomicity of native data types:
+ * We read or write the "data" view of this union atomically, then can
+ * access the broken-down fields in our local copy.
+ */
+union host_lpi {
+    uint64_t data;
+    struct {
+        uint32_t virt_lpi;
+        uint16_t dom_id;
+        uint16_t vcpu_id;
+    };
+};
+
 #define LPI_PROPTABLE_NEEDS_FLUSHING    (1U << 0)
 /* Global state */
 static struct {
     /* The global LPI property table, shared by all redistributors. */
     uint8_t *lpi_property;
     /*
+     * A two-level table to look up LPIs firing on the host and find the
+     * VCPU and virtual LPI number to inject into.
+     */
+    union host_lpi **host_lpis;
+    /*
      * Number of physical LPIs the host supports. This is a property of
      * the GIC hardware. We depart from the habit of naming these things
      * "physical" in Xen, as the GICv3/4 spec uses the term "physical LPI"
      * in a different context to differentiate them from "virtual LPIs".
      */
     unsigned long int nr_host_lpis;
+    /* Protects allocation and deallocation of host LPIs, but not the access */
+    spinlock_t host_lpis_lock;
     unsigned int flags;
 } lpi_data;
 
@@ -51,6 +81,19 @@ struct lpi_redist_data {
 static DEFINE_PER_CPU(struct lpi_redist_data, lpi_redist);
 
 #define MAX_PHYS_LPIS   (lpi_data.nr_host_lpis - LPI_OFFSET)
+#define HOST_LPIS_PER_PAGE      (PAGE_SIZE / sizeof(union host_lpi))
+
+static union host_lpi *gic_get_host_lpi(uint32_t plpi)
+{
+    if ( !is_lpi(plpi) || plpi >= MAX_PHYS_LPIS + LPI_OFFSET )
+        return NULL;
+
+    plpi -= LPI_OFFSET;
+    if ( !lpi_data.host_lpis[plpi / HOST_LPIS_PER_PAGE] )
+        return NULL;
+
+    return &lpi_data.host_lpis[plpi / HOST_LPIS_PER_PAGE][plpi % HOST_LPIS_PER_PAGE];
+}
 
 /* Stores this redistributor's physical address and ID in a per-CPU variable */
 void gicv3_set_redist_address(paddr_t address, unsigned int redist_id)
@@ -212,15 +255,168 @@ int gicv3_lpi_init_rdist(void __iomem * rdist_base)
 static unsigned int max_lpi_bits = 20;
 integer_param("max_lpi_bits", max_lpi_bits);
 
+/*
+ * Allocate the 2nd level array for host LPIs. This one holds pointers
+ * to the page with the actual "union host_lpi" entries. Our LPI limit
+ * avoids excessive memory usage.
+ */
 int gicv3_lpi_init_host_lpis(unsigned int hw_lpi_bits)
 {
+    int nr_lpi_ptrs;
+
+    /* We rely on the data structure being atomically accessible. */
+    BUILD_BUG_ON(sizeof(union host_lpi) > sizeof(unsigned long));
+
     lpi_data.nr_host_lpis = BIT_ULL(min(hw_lpi_bits, max_lpi_bits));
 
+    spin_lock_init(&lpi_data.host_lpis_lock);
+
+    nr_lpi_ptrs = MAX_PHYS_LPIS / (PAGE_SIZE / sizeof(union host_lpi));
+    lpi_data.host_lpis = xzalloc_array(union host_lpi *, nr_lpi_ptrs);
+    if ( !lpi_data.host_lpis )
+        return -ENOMEM;
+
     printk("GICv3: using at most %lu LPIs on the host.\n", MAX_PHYS_LPIS);
 
     return 0;
 }
 
+static int find_unused_host_lpi(uint32_t start, uint32_t *index)
+{
+    unsigned int chunk;
+    uint32_t i = *index;
+
+    ASSERT(spin_is_locked(&lpi_data.host_lpis_lock));
+
+    for ( chunk = start; chunk < MAX_PHYS_LPIS / HOST_LPIS_PER_PAGE; chunk++ )
+    {
+        /* If we hit an unallocated chunk, use entry 0 in that one. */
+        if ( !lpi_data.host_lpis[chunk] )
+        {
+            *index = 0;
+            return chunk;
+        }
+
+        /* Find an unallocated entry in this chunk. */
+        for ( ; i < HOST_LPIS_PER_PAGE; i += LPI_BLOCK )
+        {
+            if ( lpi_data.host_lpis[chunk][i].dom_id == DOMID_INVALID )
+            {
+                *index = i;
+                return chunk;
+            }
+        }
+        i = 0;
+    }
+
+    return -1;
+}
+
+/*
+ * Allocate a block of 32 host LPIs for domain "d" and enable them in the
+ * LPI property table. The first LPI of the block is returned in "first_lpi";
+ * issuing the MAPTI commands for them is left to the caller.
+ */
+int gicv3_allocate_host_lpi_block(struct domain *d, uint32_t *first_lpi)
+{
+    static uint32_t next_lpi = 0;
+    uint32_t lpi, lpi_idx = next_lpi % HOST_LPIS_PER_PAGE;
+    int chunk;
+    int i;
+
+    spin_lock(&lpi_data.host_lpis_lock);
+    chunk = find_unused_host_lpi(next_lpi / HOST_LPIS_PER_PAGE, &lpi_idx);
+
+    if ( chunk == - 1 )          /* rescan for a hole from the beginning */
+    {
+        lpi_idx = 0;
+        chunk = find_unused_host_lpi(0, &lpi_idx);
+        if ( chunk == -1 )
+        {
+            spin_unlock(&lpi_data.host_lpis_lock);
+            return -ENOSPC;
+        }
+    }
+
+    /* If we hit an unallocated chunk, we initialize it and use entry 0. */
+    if ( !lpi_data.host_lpis[chunk] )
+    {
+        union host_lpi *new_chunk;
+
+        /* TODO: NUMA locality for quicker IRQ path? */
+        new_chunk = xmalloc_bytes(PAGE_SIZE);
+        if ( !new_chunk )
+        {
+            spin_unlock(&lpi_data.host_lpis_lock);
+            return -ENOMEM;
+        }
+
+        for ( i = 0; i < HOST_LPIS_PER_PAGE; i += LPI_BLOCK )
+            new_chunk[i].dom_id = DOMID_INVALID;
+
+        lpi_data.host_lpis[chunk] = new_chunk;
+        lpi_idx = 0;
+    }
+
+    lpi = chunk * HOST_LPIS_PER_PAGE + lpi_idx;
+
+    for ( i = 0; i < LPI_BLOCK; i++ )
+    {
+        union host_lpi hlpi;
+
+        /*
+         * Mark this host LPI as belonging to the domain, but don't assign
+         * any virtual LPI or a VCPU yet.
+         */
+        hlpi.virt_lpi = INVALID_LPI;
+        hlpi.dom_id = d->domain_id;
+        hlpi.vcpu_id = ~0;
+        write_u64_atomic(&lpi_data.host_lpis[chunk][lpi_idx + i].data,
+                         hlpi.data);
+
+        /*
+         * Enable this host LPI, so we don't have to do this during the
+         * guest's runtime.
+         */
+        lpi_data.lpi_property[lpi + i] |= LPI_PROP_ENABLED;
+    }
+
+    /*
+     * We have allocated and initialized the host LPI entries, so it's safe
+     * to drop the lock now. Access to the structures can be done concurrently
+     * as it involves only an atomic uint64_t access.
+     */
+    spin_unlock(&lpi_data.host_lpis_lock);
+
+    if ( lpi_data.flags & LPI_PROPTABLE_NEEDS_FLUSHING )
+        clean_and_invalidate_dcache_va_range(&lpi_data.lpi_property[lpi],
+                                             LPI_BLOCK);
+
+    next_lpi = lpi + LPI_BLOCK;
+    *first_lpi = lpi + LPI_OFFSET;
+
+    return 0;
+}
+
+void gicv3_free_host_lpi_block(uint32_t first_lpi)
+{
+    union host_lpi *hlpi, empty_lpi = { .dom_id = DOMID_INVALID };
+    int i;
+
+    hlpi = gic_get_host_lpi(first_lpi);
+    if ( !hlpi )
+        return;         /* Nothing to free here. */
+
+    spin_lock(&lpi_data.host_lpis_lock);
+
+    for ( i = 0; i < LPI_BLOCK; i++ )
+        write_u64_atomic(&hlpi[i].data, empty_lpi.data);
+
+    spin_unlock(&lpi_data.host_lpis_lock);
+
+    return;
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/include/asm-arm/gic.h b/xen/include/asm-arm/gic.h
index 836a103..d04bd04 100644
--- a/xen/include/asm-arm/gic.h
+++ b/xen/include/asm-arm/gic.h
@@ -220,6 +220,8 @@ enum gic_version {
     GIC_V3,
 };
 
+#define INVALID_LPI     0
+
 extern enum gic_version gic_hw_version(void);
 
 /* Program the IRQ type into the GIC */
diff --git a/xen/include/asm-arm/gic_v3_its.h b/xen/include/asm-arm/gic_v3_its.h
index 4ade5f6..7b47596 100644
--- a/xen/include/asm-arm/gic_v3_its.h
+++ b/xen/include/asm-arm/gic_v3_its.h
@@ -106,6 +106,9 @@
 #define HOST_ITS_FLUSH_CMD_QUEUE        (1U << 0)
 #define HOST_ITS_USES_PTA               (1U << 1)
 
+/* We allocate LPIs on the host in chunks of 32 to reduce handling overhead. */
+#define LPI_BLOCK                       32
+
 /* data structure for each hardware ITS */
 struct host_its {
     struct list_head entry;
@@ -153,6 +156,8 @@ int gicv3_its_map_guest_device(struct domain *d,
                                paddr_t guest_doorbell, uint32_t guest_devid,
                                uint32_t nr_events, bool valid);
 void gicv3_its_unmap_all_devices(struct domain *d);
+int gicv3_allocate_host_lpi_block(struct domain *d, uint32_t *first_lpi);
+void gicv3_free_host_lpi_block(uint32_t first_lpi);
 
 #else
 
diff --git a/xen/include/asm-arm/irq.h b/xen/include/asm-arm/irq.h
index 13528c0..d16affc 100644
--- a/xen/include/asm-arm/irq.h
+++ b/xen/include/asm-arm/irq.h
@@ -42,6 +42,11 @@ struct irq_desc *__irq_to_desc(int irq);
 
 void do_IRQ(struct cpu_user_regs *regs, unsigned int irq, int is_fiq);
 
+static inline bool is_lpi(unsigned int irq)
+{
+    return irq >= LPI_OFFSET;
+}
+
 #define domain_pirq_to_irq(d, pirq) (pirq)
 
 bool_t is_assignable_irq(unsigned int irq);
-- 
2.9.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


* [PATCH v3 08/26] ARM: GICv3: introduce separate pending_irq structs for LPIs
  2017-03-31 18:04 [PATCH v3 00/26] arm64: Dom0 ITS emulation Andre Przywara
                   ` (6 preceding siblings ...)
  2017-03-31 18:05 ` [PATCH v3 07/26] ARM: GICv3 ITS: introduce host LPI array Andre Przywara
@ 2017-03-31 18:05 ` Andre Przywara
  2017-03-31 18:05 ` [PATCH v3 09/26] ARM: GICv3: forward pending LPIs to guests Andre Przywara
                   ` (18 subsequent siblings)
  26 siblings, 0 replies; 52+ messages in thread
From: Andre Przywara @ 2017-03-31 18:05 UTC (permalink / raw)
  To: Julien Grall, Stefano Stabellini
  Cc: xen-devel, Shanker Donthineni, Vijay Kilari

For the same reason that allocating a struct irq_desc for each
possible LPI is not an option, having a struct pending_irq for each LPI
is also not feasible. We only care about mapped LPIs, so we can get away
with having struct pending_irq's only for them.
Maintain a radix tree per domain in which we store the pointer to the
respective pending_irq, indexed by the virtual LPI number.
The memory for the actual structures has already been allocated per
device at device mapping time.
Teach the existing VGIC functions to find the right pointer when being
given a virtual LPI number.
We also take care of checking for a NULL pointer in the VCPU exit path,
should an LPI have been removed from the tree for any reason.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
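The lookup described in the commit message can be sketched as a small
standalone model. This is only an illustration, not the code from this patch:
the toy_* names and the fixed-size map are hypothetical stand-ins (the patch
itself uses Xen's radix tree under a rwlock), and the model takes no locks.

```c
#include <assert.h>
#include <stddef.h>

#define NR_LOCAL_IRQS 32u    /* SGIs and PPIs are private to each vCPU */
#define NR_TOY_SPIS   32u    /* hypothetical number of SPIs in this model */
#define LPI_OFFSET    8192u  /* LPI INTIDs start at 8192 on GICv3 */
#define MAX_MAPPED    8u     /* hypothetical capacity of the stand-in map */

struct pending_irq { unsigned int irq; };

/* Hypothetical stand-in for the per-domain radix tree, keyed by virtual LPI. */
struct toy_lpi_map {
    unsigned int key[MAX_MAPPED];
    struct pending_irq *val[MAX_MAPPED];
    unsigned int used;
};

struct toy_domain {
    struct pending_irq spis[NR_TOY_SPIS];
    struct toy_lpi_map lpis;
};

struct toy_vcpu {
    struct toy_domain *domain;
    struct pending_irq local[NR_LOCAL_IRQS];
};

/* Record the pending_irq for a virtual LPI; done when the LPI gets mapped. */
int toy_map_lpi(struct toy_lpi_map *m, unsigned int vlpi,
                struct pending_irq *p)
{
    assert(vlpi >= LPI_OFFSET);

    if ( m->used == MAX_MAPPED )
        return -1;

    m->key[m->used] = vlpi;
    m->val[m->used] = p;
    m->used++;

    return 0;
}

/* Returns NULL for an unmapped LPI, like lpi_to_pending() in the patch. */
struct pending_irq *toy_lpi_to_pending(const struct toy_lpi_map *m,
                                       unsigned int vlpi)
{
    unsigned int i;

    for ( i = 0; i < m->used; i++ )
        if ( m->key[i] == vlpi )
            return m->val[i];

    return NULL;
}

/* Dispatch on the interrupt number: local IRQ, LPI, or SPI. */
struct pending_irq *toy_irq_to_pending(struct toy_vcpu *v, unsigned int irq)
{
    if ( irq < NR_LOCAL_IRQS )
        return &v->local[irq];

    if ( irq >= LPI_OFFSET )
        return toy_lpi_to_pending(&v->domain->lpis, irq);

    if ( irq - NR_LOCAL_IRQS >= NR_TOY_SPIS )
        return NULL;

    return &v->domain->spis[irq - NR_LOCAL_IRQS];
}
```

The point of the model is that only the LPI branch can return NULL for a
valid interrupt number, which is why callers such as gic_update_one_lr()
need the extra check.
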
 xen/arch/arm/gic.c           | 12 ++++++++++++
 xen/arch/arm/vgic-v3.c       | 21 +++++++++++++++++++++
 xen/arch/arm/vgic.c          |  5 +++++
 xen/include/asm-arm/domain.h |  2 ++
 xen/include/asm-arm/vgic.h   |  1 +
 5 files changed, 41 insertions(+)

diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
index a5348f2..6a5c882 100644
--- a/xen/arch/arm/gic.c
+++ b/xen/arch/arm/gic.c
@@ -462,7 +462,19 @@ static void gic_update_one_lr(struct vcpu *v, int i)
 
     gic_hw_ops->read_lr(i, &lr_val);
     irq = lr_val.virq;
+
     p = irq_to_pending(v, irq);
+    /* An LPI might have been unmapped, in which case we just clean up here. */
+    if ( !p )
+    {
+        ASSERT(is_lpi(irq));
+
+        gic_hw_ops->clear_lr(i);
+        clear_bit(i, &this_cpu(lr_mask));
+
+        return;
+    }
+
     if ( lr_val.state & GICH_LR_ACTIVE )
     {
         set_bit(GIC_IRQ_GUEST_ACTIVE, &p->status);
diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c
index 6242252..29c97eb 100644
--- a/xen/arch/arm/vgic-v3.c
+++ b/xen/arch/arm/vgic-v3.c
@@ -331,6 +331,23 @@ read_unknown:
     return 1;
 }
 
+/*
+ * Looks up a virtual LPI number in our tree of mapped LPIs. This will return
+ * the corresponding struct pending_irq, which we also use to store the
+ * enabled and pending bit plus the priority.
+ * Returns NULL if an LPI cannot be found.
+ */
+struct pending_irq *lpi_to_pending(struct domain *d, unsigned int lpi)
+{
+    struct pending_irq *pirq;
+
+    read_lock(&d->arch.vgic.pend_lpi_tree_lock);
+    pirq = radix_tree_lookup(&d->arch.vgic.pend_lpi_tree, lpi);
+    read_unlock(&d->arch.vgic.pend_lpi_tree_lock);
+
+    return pirq;
+}
+
 static int __vgic_v3_rdistr_rd_mmio_write(struct vcpu *v, mmio_info_t *info,
                                           uint32_t gicr_reg,
                                           register_t r)
@@ -1453,6 +1470,9 @@ static int vgic_v3_domain_init(struct domain *d)
     spin_lock_init(&d->arch.vgic.its_devices_lock);
     d->arch.vgic.its_devices = RB_ROOT;
 
+    rwlock_init(&d->arch.vgic.pend_lpi_tree_lock);
+    radix_tree_init(&d->arch.vgic.pend_lpi_tree);
+
     /*
      * Domain 0 gets the hardware address.
      * Guests get the virtual platform layout.
@@ -1526,6 +1546,7 @@ static int vgic_v3_domain_init(struct domain *d)
 static void vgic_v3_domain_free(struct domain *d)
 {
     gicv3_its_unmap_all_devices(d);
+    radix_tree_destroy(&d->arch.vgic.pend_lpi_tree, NULL);
     xfree(d->arch.vgic.rdist_regions);
 }
 
diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c
index 364d5f0..15c8ef8 100644
--- a/xen/arch/arm/vgic.c
+++ b/xen/arch/arm/vgic.c
@@ -449,10 +449,15 @@ bool vgic_to_sgi(struct vcpu *v, register_t sgir, enum gic_sgi_mode irqmode,
 struct pending_irq *irq_to_pending(struct vcpu *v, unsigned int irq)
 {
     struct pending_irq *n;
+
     /* Pending irqs allocation strategy: the first vgic.nr_spis irqs
      * are used for SPIs; the rests are used for per cpu irqs */
     if ( irq < 32 )
         n = &v->arch.vgic.pending_irqs[irq];
+#ifdef CONFIG_HAS_ITS
+    else if ( is_lpi(irq) )
+        n = lpi_to_pending(v->domain, irq);
+#endif
     else
         n = &v->domain->arch.vgic.pending_irqs[irq - 32];
     return n;
diff --git a/xen/include/asm-arm/domain.h b/xen/include/asm-arm/domain.h
index e559027..a83904a 100644
--- a/xen/include/asm-arm/domain.h
+++ b/xen/include/asm-arm/domain.h
@@ -112,6 +112,8 @@ struct arch_domain
         uint32_t rdist_stride;              /* Re-Distributor stride */
         struct rb_root its_devices;         /* Devices mapped to an ITS */
         spinlock_t its_devices_lock;        /* Protects the its_devices tree */
+        struct radix_tree_root pend_lpi_tree; /* Stores struct pending_irq's */
+        rwlock_t pend_lpi_tree_lock;        /* Protects the pend_lpi_tree */
 #endif
     } vgic;
 
diff --git a/xen/include/asm-arm/vgic.h b/xen/include/asm-arm/vgic.h
index 467333c..e6dad38 100644
--- a/xen/include/asm-arm/vgic.h
+++ b/xen/include/asm-arm/vgic.h
@@ -298,6 +298,7 @@ extern void vgic_vcpu_inject_spi(struct domain *d, unsigned int virq);
 extern void vgic_clear_pending_irqs(struct vcpu *v);
 extern struct pending_irq *irq_to_pending(struct vcpu *v, unsigned int irq);
 extern struct pending_irq *spi_to_pending(struct domain *d, unsigned int irq);
+extern struct pending_irq *lpi_to_pending(struct domain *d, unsigned int lpi);
 extern struct vgic_irq_rank *vgic_rank_offset(struct vcpu *v, int b, int n, int s);
 extern struct vgic_irq_rank *vgic_rank_irq(struct vcpu *v, unsigned int irq);
 extern bool vgic_emulate(struct cpu_user_regs *regs, union hsr hsr);
-- 
2.9.0



* [PATCH v3 09/26] ARM: GICv3: forward pending LPIs to guests
  2017-03-31 18:04 [PATCH v3 00/26] arm64: Dom0 ITS emulation Andre Przywara
                   ` (7 preceding siblings ...)
  2017-03-31 18:05 ` [PATCH v3 08/26] ARM: GICv3: introduce separate pending_irq structs for LPIs Andre Przywara
@ 2017-03-31 18:05 ` Andre Przywara
  2017-03-31 18:05 ` [PATCH v3 10/26] ARM: GICv3: enable ITS and LPIs on the host Andre Przywara
                   ` (17 subsequent siblings)
  26 siblings, 0 replies; 52+ messages in thread
From: Andre Przywara @ 2017-03-31 18:05 UTC (permalink / raw)
  To: Julien Grall, Stefano Stabellini
  Cc: xen-devel, Shanker Donthineni, Vijay Kilari

Upon receiving an LPI, we need to find the right VCPU and virtual IRQ
number to get this IRQ injected.
Look this information up in our two-level host LPI table when the host
takes an LPI, then call the existing injection function to let the
GIC emulation deal with this interrupt.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
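The two-level lookup described in the commit message can be sketched
standalone as follows. This is a model under assumptions, not the code from
this series: the union layout is inferred from the fields used in the host
LPI patch, the 4K page size and the toy_* names are hypothetical, and the
plain 64-bit copy stands in for read_u64_atomic().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define LPI_OFFSET     8192u  /* LPI INTIDs start at 8192 on GICv3 */
#define INVALID_LPI    0u
#define TOY_PAGE_SIZE  4096u  /* assumption: 4K pages */
#define NR_CHUNKS      4u     /* hypothetical size of the first level */

/* Assumed layout: one 64-bit word per host LPI, so it can be read in one go. */
union host_lpi {
    uint64_t data;
    struct {
        uint32_t virt_lpi;
        uint16_t dom_id;
        uint16_t vcpu_id;
    };
};

#define HOST_LPIS_PER_PAGE (TOY_PAGE_SIZE / sizeof(union host_lpi))

/* First level: chunk pointer (may be NULL); second level: entry in the page. */
union host_lpi *toy_get_host_lpi(union host_lpi **table, uint32_t plpi)
{
    uint32_t idx;

    if ( plpi < LPI_OFFSET )
        return NULL;

    idx = plpi - LPI_OFFSET;
    if ( idx / HOST_LPIS_PER_PAGE >= NR_CHUNKS )
        return NULL;
    if ( !table[idx / HOST_LPIS_PER_PAGE] )
        return NULL;

    return &table[idx / HOST_LPIS_PER_PAGE][idx % HOST_LPIS_PER_PAGE];
}

/* Resolve a host LPI to (domain, vCPU, virtual LPI); false if unmapped. */
bool toy_route_lpi(union host_lpi **table, uint32_t plpi,
                   uint16_t *dom, uint16_t *vcpu, uint32_t *vlpi)
{
    union host_lpi *p = toy_get_host_lpi(table, plpi), snap;

    assert(dom && vcpu && vlpi);

    if ( !p )
        return false;

    /* Take one snapshot, so all three fields are mutually consistent. */
    snap.data = p->data;
    if ( snap.virt_lpi == INVALID_LPI )
        return false;

    *dom = snap.dom_id;
    *vcpu = snap.vcpu_id;
    *vlpi = snap.virt_lpi;

    return true;
}
```

Reading the entry as a single 64-bit word is what allows the interrupt path
to get away without taking host_lpis_lock.
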
 xen/arch/arm/gic-v3-lpi.c  | 42 ++++++++++++++++++++++++++++++++++++++++++
 xen/arch/arm/gic.c         |  8 +++++++-
 xen/arch/arm/vgic-v3.c     | 11 +++++++++++
 xen/arch/arm/vgic.c        |  9 ++++++++-
 xen/include/asm-arm/irq.h  |  2 ++
 xen/include/asm-arm/vgic.h |  2 ++
 6 files changed, 72 insertions(+), 2 deletions(-)

diff --git a/xen/arch/arm/gic-v3-lpi.c b/xen/arch/arm/gic-v3-lpi.c
index d642cc5..df75cf6 100644
--- a/xen/arch/arm/gic-v3-lpi.c
+++ b/xen/arch/arm/gic-v3-lpi.c
@@ -115,6 +115,48 @@ uint64_t gicv3_get_redist_address(unsigned int cpu, bool use_pta)
         return per_cpu(lpi_redist, cpu).redist_id << 16;
 }
 
+/*
+ * Handle incoming LPIs, which are a bit special, because they are potentially
+ * numerous and also only get injected into guests. Treat them specially here,
+ * by just looking up their target vCPU and virtual LPI number and handing
+ * them over to the injection function.
+ */
+void do_LPI(unsigned int lpi)
+{
+    struct domain *d;
+    union host_lpi *hlpip, hlpi;
+    struct vcpu *vcpu;
+
+    WRITE_SYSREG32(lpi, ICC_EOIR1_EL1);
+
+    hlpip = gic_get_host_lpi(lpi);
+    if ( !hlpip )
+        return;
+
+    hlpi.data = read_u64_atomic(&hlpip->data);
+
+    /* Unmapped events are marked with an invalid LPI ID. */
+    if ( hlpi.virt_lpi == INVALID_LPI )
+        return;
+
+    d = rcu_lock_domain_by_id(hlpi.dom_id);
+    if ( !d )
+        return;
+
+    /* Make sure we don't step beyond the vcpu array. */
+    if ( hlpi.vcpu_id >= d->max_vcpus )
+    {
+        rcu_unlock_domain(d);
+        return;
+    }
+
+    vcpu = d->vcpu[hlpi.vcpu_id];
+
+    vgic_vcpu_inject_irq(vcpu, hlpi.virt_lpi);
+
+    rcu_unlock_domain(d);
+}
+
 static int gicv3_lpi_allocate_pendtable(uint64_t *reg)
 {
     uint64_t val;
diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
index 6a5c882..41cd2d1 100644
--- a/xen/arch/arm/gic.c
+++ b/xen/arch/arm/gic.c
@@ -710,7 +710,13 @@ void gic_interrupt(struct cpu_user_regs *regs, int is_fiq)
             do_IRQ(regs, irq, is_fiq);
             local_irq_disable();
         }
-        else if (unlikely(irq < 16))
+#ifdef CONFIG_HAS_ITS
+        else if ( is_lpi(irq) )
+        {
+            do_LPI(irq);
+        }
+#endif
+        else if ( unlikely(irq < 16) )
         {
             do_sgi(regs, irq);
         }
diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c
index 29c97eb..69572e3 100644
--- a/xen/arch/arm/vgic-v3.c
+++ b/xen/arch/arm/vgic-v3.c
@@ -348,6 +348,17 @@ struct pending_irq *lpi_to_pending(struct domain *d, unsigned int lpi)
     return pirq;
 }
 
+/* Retrieve the priority of an LPI from its struct pending_irq. */
+int vgic_lpi_get_priority(struct domain *d, uint32_t vlpi)
+{
+    struct pending_irq *p = lpi_to_pending(d, vlpi);
+
+    if ( !p )
+        return GIC_PRI_IRQ;
+
+    return p->lpi_priority;
+}
+
 static int __vgic_v3_rdistr_rd_mmio_write(struct vcpu *v, mmio_info_t *info,
                                           uint32_t gicr_reg,
                                           register_t r)
diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c
index 15c8ef8..2aee20f 100644
--- a/xen/arch/arm/vgic.c
+++ b/xen/arch/arm/vgic.c
@@ -244,10 +244,17 @@ struct vcpu *vgic_get_target_vcpu(struct vcpu *v, unsigned int virq)
 
 static int vgic_get_virq_priority(struct vcpu *v, unsigned int virq)
 {
-    struct vgic_irq_rank *rank = vgic_rank_irq(v, virq);
+    struct vgic_irq_rank *rank;
     unsigned long flags;
     int priority;
 
+#ifdef CONFIG_HAS_ITS
+    /* LPIs don't have a rank; their priority is stored separately. */
+    if ( is_lpi(virq) )
+        return vgic_lpi_get_priority(v->domain, virq);
+#endif
+
+    rank = vgic_rank_irq(v, virq);
     vgic_lock_rank(v, rank, flags);
     priority = rank->priority[virq & INTERRUPT_RANK_MASK];
     vgic_unlock_rank(v, rank, flags);
diff --git a/xen/include/asm-arm/irq.h b/xen/include/asm-arm/irq.h
index d16affc..3fdf1e0 100644
--- a/xen/include/asm-arm/irq.h
+++ b/xen/include/asm-arm/irq.h
@@ -47,6 +47,8 @@ static inline bool is_lpi(unsigned int irq)
     return irq >= LPI_OFFSET;
 }
 
+void do_LPI(unsigned int irq);
+
 #define domain_pirq_to_irq(d, pirq) (pirq)
 
 bool_t is_assignable_irq(unsigned int irq);
diff --git a/xen/include/asm-arm/vgic.h b/xen/include/asm-arm/vgic.h
index e6dad38..eabdf91 100644
--- a/xen/include/asm-arm/vgic.h
+++ b/xen/include/asm-arm/vgic.h
@@ -66,12 +66,14 @@ struct pending_irq
 #define GIC_IRQ_GUEST_VISIBLE  2
 #define GIC_IRQ_GUEST_ENABLED  3
 #define GIC_IRQ_GUEST_MIGRATING   4
+#define GIC_IRQ_GUEST_LPI_PENDING 5
     unsigned long status;
     struct irq_desc *desc; /* only set it the irq corresponds to a physical irq */
     unsigned int irq;
 #define GIC_INVALID_LR         (uint8_t)~0
     uint8_t lr;
     uint8_t priority;
+    uint8_t lpi_priority;       /* Caches the priority if this is an LPI. */
     /* inflight is used to append instances of pending_irq to
      * vgic.inflight_irqs */
     struct list_head inflight;
-- 
2.9.0



* [PATCH v3 10/26] ARM: GICv3: enable ITS and LPIs on the host
  2017-03-31 18:04 [PATCH v3 00/26] arm64: Dom0 ITS emulation Andre Przywara
                   ` (8 preceding siblings ...)
  2017-03-31 18:05 ` [PATCH v3 09/26] ARM: GICv3: forward pending LPIs to guests Andre Przywara
@ 2017-03-31 18:05 ` Andre Przywara
  2017-03-31 18:05 ` [PATCH v3 11/26] ARM: vGICv3: handle virtual LPI pending and property tables Andre Przywara
                   ` (16 subsequent siblings)
  26 siblings, 0 replies; 52+ messages in thread
From: Andre Przywara @ 2017-03-31 18:05 UTC (permalink / raw)
  To: Julien Grall, Stefano Stabellini
  Cc: xen-devel, Shanker Donthineni, Vijay Kilari

Now that the host part of the ITS code is in place, we can enable the
ITS and also LPIs on each redistributor to get the show rolling.
At this point there would be no LPIs mapped, as guests don't know about
the ITS yet.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
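The per-redistributor enable step can be modelled standalone like this. It
is a sketch only: the register block is a plain array instead of MMIO, the
toy_* name is hypothetical, and the bit positions (PLPIS as bit 0 of
GICR_TYPER, EnableLPIs as bit 0 of GICR_CTLR) are restated here as
assumptions about the register layout rather than taken from this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define GICR_CTLR              0x0000  /* byte offset in the RD frame */
#define GICR_TYPER             0x0008  /* byte offset, low 32 bits */
#define GICR_TYPER_PLPIS       (1u << 0)
#define GICR_CTLR_ENABLE_LPIS  (1u << 0)

/*
 * Enable LPIs on one (modelled) redistributor. Fails without touching
 * GICR_CTLR if the redistributor does not support physical LPIs.
 */
bool toy_enable_lpis(volatile uint32_t *rdist)
{
    uint32_t val;

    assert(rdist != NULL);

    val = rdist[GICR_TYPER / 4];
    if ( !(val & GICR_TYPER_PLPIS) )
        return false;

    /* Read-modify-write, so the other control bits are preserved. */
    val = rdist[GICR_CTLR / 4];
    rdist[GICR_CTLR / 4] = val | GICR_CTLR_ENABLE_LPIS;

    return true;
}
```

Checking the capability bit first is what lets gicv3_cpu_init() turn a
redistributor without LPI support into an error instead of a silent no-op.
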
 xen/arch/arm/gic-v3-its.c |  4 ++++
 xen/arch/arm/gic-v3.c     | 18 ++++++++++++++++++
 2 files changed, 22 insertions(+)

diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
index fa284e7..8db2a09 100644
--- a/xen/arch/arm/gic-v3-its.c
+++ b/xen/arch/arm/gic-v3-its.c
@@ -476,6 +476,10 @@ static int gicv3_its_init_single_its(struct host_its *hw_its)
         return -ENOMEM;
     writeq_relaxed(0, hw_its->its_base + GITS_CWRITER);
 
+    /* Now enable interrupt translation and command processing on that ITS. */
+    reg = readl_relaxed(hw_its->its_base + GITS_CTLR);
+    writel_relaxed(reg | GITS_CTLR_ENABLE, hw_its->its_base + GITS_CTLR);
+
     return 0;
 }
 
diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
index 0e21cb2..d92d115 100644
--- a/xen/arch/arm/gic-v3.c
+++ b/xen/arch/arm/gic-v3.c
@@ -621,6 +621,21 @@ static int gicv3_enable_redist(void)
     return 0;
 }
 
+/* Enable LPIs on this redistributor (only useful when the host has an ITS). */
+static bool gicv3_enable_lpis(void)
+{
+    uint32_t val;
+
+    val = readl_relaxed(GICD_RDIST_BASE + GICR_TYPER);
+    if ( !(val & GICR_TYPER_PLPIS) )
+        return false;
+
+    val = readl_relaxed(GICD_RDIST_BASE + GICR_CTLR);
+    writel_relaxed(val | GICR_CTLR_ENABLE_LPIS, GICD_RDIST_BASE + GICR_CTLR);
+
+    return true;
+}
+
 static int __init gicv3_populate_rdist(void)
 {
     int i;
@@ -729,11 +744,14 @@ static int gicv3_cpu_init(void)
     if ( gicv3_enable_redist() )
         return -ENODEV;
 
+    /* If the host has any ITSes, enable LPIs now. */
     if ( gicv3_its_host_has_its() )
     {
         ret = gicv3_its_setup_collection(smp_processor_id());
         if ( ret )
             return ret;
+        if ( !gicv3_enable_lpis() )
+            return -EBUSY;
     }
 
     /* Set priority on PPI and SGI interrupts */
-- 
2.9.0



* [PATCH v3 11/26] ARM: vGICv3: handle virtual LPI pending and property tables
  2017-03-31 18:04 [PATCH v3 00/26] arm64: Dom0 ITS emulation Andre Przywara
                   ` (9 preceding siblings ...)
  2017-03-31 18:05 ` [PATCH v3 10/26] ARM: GICv3: enable ITS and LPIs on the host Andre Przywara
@ 2017-03-31 18:05 ` Andre Przywara
  2017-04-04 12:55   ` Julien Grall
  2017-03-31 18:05 ` [PATCH v3 12/26] ARM: vGICv3: Handle disabled LPIs Andre Przywara
                   ` (15 subsequent siblings)
  26 siblings, 1 reply; 52+ messages in thread
From: Andre Przywara @ 2017-03-31 18:05 UTC (permalink / raw)
  To: Julien Grall, Stefano Stabellini
  Cc: xen-devel, Shanker Donthineni, Vijay Kilari

Allow a guest to provide the address and size for the memory regions
it has reserved for the GICv3 pending and property tables.
We sanitise the various fields of the respective redistributor
registers and map those pages into Xen's address space to have easy
access.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
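To illustrate the field sanitising mentioned in the commit message, here is
a standalone sketch of the mask/shift helper applied to the shareability
field. The field position (bits [11:10]) and the encodings (1 = inner
shareable, 2 = outer shareable) are assumptions restated for this example;
the function names are shortened and do not exist in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout: shareability lives in bits [11:10] of the register. */
#define SHAREABILITY_SHIFT  10
#define SHAREABILITY_MASK   (UINT64_C(3) << SHAREABILITY_SHIFT)
#define INNER_SHAREABLE     1
#define OUTER_SHAREABLE     2

/* Extract one field, run it through fn, and put the result back in place. */
uint64_t sanitise_field(uint64_t reg, uint64_t mask, int shift,
                        uint64_t (*fn)(uint64_t))
{
    uint64_t field;

    assert(fn != NULL);

    field = (reg & mask) >> shift;
    field = fn(field) << shift;

    return (reg & ~mask) | field;
}

/* Downgrade outer shareable to inner shareable, keep everything else. */
uint64_t sanitise_shareability(uint64_t field)
{
    return field == OUTER_SHAREABLE ? INNER_SHAREABLE : field;
}
```

With these definitions, a register value of 0x8081f (outer shareable) comes
back as 0x8041f (inner shareable), while all bits outside the field are left
alone. The guest then reads back what we actually support.
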
 xen/arch/arm/vgic-v3.c       | 136 +++++++++++++++++++++++++++++++++++++------
 xen/common/memory.c          |  61 +++++++++++++++++++
 xen/include/asm-arm/domain.h |   6 +-
 xen/include/asm-arm/vgic.h   |   2 +
 xen/include/xen/mm.h         |   8 +++
 5 files changed, 195 insertions(+), 18 deletions(-)

diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c
index 69572e3..7f84fbf 100644
--- a/xen/arch/arm/vgic-v3.c
+++ b/xen/arch/arm/vgic-v3.c
@@ -20,12 +20,14 @@
 
 #include <xen/bitops.h>
 #include <xen/config.h>
+#include <xen/domain_page.h>
 #include <xen/lib.h>
 #include <xen/init.h>
 #include <xen/softirq.h>
 #include <xen/irq.h>
 #include <xen/sched.h>
 #include <xen/sizes.h>
+#include <xen/vmap.h>
 #include <asm/current.h>
 #include <asm/mmio.h>
 #include <asm/gic_v3_defs.h>
@@ -229,12 +231,15 @@ static int __vgic_v3_rdistr_rd_mmio_read(struct vcpu *v, mmio_info_t *info,
         goto read_reserved;
 
     case VREG64(GICR_PROPBASER):
-        /* LPI's not implemented */
-        goto read_as_zero_64;
+        if ( !vgic_reg64_check_access(dabt) ) goto bad_width;
+        *r = vgic_reg64_extract(v->domain->arch.vgic.rdist_propbase, info);
+        return 1;
 
     case VREG64(GICR_PENDBASER):
-        /* LPI's not implemented */
-        goto read_as_zero_64;
+        if ( !vgic_reg64_check_access(dabt) ) goto bad_width;
+        *r = vgic_reg64_extract(v->arch.vgic.rdist_pendbase, info);
+        *r &= ~GICR_PENDBASER_PTZ;       /* WO, reads as 0 */
+        return 1;
 
     case 0x0080:
         goto read_reserved;
@@ -302,11 +307,6 @@ bad_width:
     domain_crash_synchronous();
     return 0;
 
-read_as_zero_64:
-    if ( !vgic_reg64_check_access(dabt) ) goto bad_width;
-    *r = 0;
-    return 1;
-
 read_as_zero_32:
     if ( dabt.size != DABT_WORD ) goto bad_width;
     *r = 0;
@@ -359,11 +359,95 @@ int vgic_lpi_get_priority(struct domain *d, uint32_t vlpi)
     return p->lpi_priority;
 }
 
+static uint64_t vgic_sanitise_field(uint64_t reg, uint64_t field_mask,
+                                    int field_shift,
+                                    uint64_t (*sanitise_fn)(uint64_t))
+{
+    uint64_t field = (reg & field_mask) >> field_shift;
+
+    field = sanitise_fn(field) << field_shift;
+
+    return (reg & ~field_mask) | field;
+}
+
+/* We want to avoid outer shareable. */
+static uint64_t vgic_sanitise_shareability(uint64_t field)
+{
+    switch ( field )
+    {
+    case GIC_BASER_OuterShareable:
+        return GIC_BASER_InnerShareable;
+    default:
+        return field;
+    }
+}
+
+/* Avoid any inner non-cacheable mapping. */
+static uint64_t vgic_sanitise_inner_cacheability(uint64_t field)
+{
+    switch ( field )
+    {
+    case GIC_BASER_CACHE_nCnB:
+    case GIC_BASER_CACHE_nC:
+        return GIC_BASER_CACHE_RaWb;
+    default:
+        return field;
+    }
+}
+
+/* Non-cacheable or same-as-inner are OK. */
+static uint64_t vgic_sanitise_outer_cacheability(uint64_t field)
+{
+    switch ( field )
+    {
+    case GIC_BASER_CACHE_SameAsInner:
+    case GIC_BASER_CACHE_nC:
+        return field;
+    default:
+        return GIC_BASER_CACHE_nC;
+    }
+}
+
+static uint64_t sanitize_propbaser(uint64_t reg)
+{
+    reg = vgic_sanitise_field(reg, GICR_PROPBASER_SHAREABILITY_MASK,
+                              GICR_PROPBASER_SHAREABILITY_SHIFT,
+                              vgic_sanitise_shareability);
+    reg = vgic_sanitise_field(reg, GICR_PROPBASER_INNER_CACHEABILITY_MASK,
+                              GICR_PROPBASER_INNER_CACHEABILITY_SHIFT,
+                              vgic_sanitise_inner_cacheability);
+    reg = vgic_sanitise_field(reg, GICR_PROPBASER_OUTER_CACHEABILITY_MASK,
+                              GICR_PROPBASER_OUTER_CACHEABILITY_SHIFT,
+                              vgic_sanitise_outer_cacheability);
+
+    reg &= ~GICR_PROPBASER_RES0_MASK;
+
+    return reg;
+}
+
+static uint64_t sanitize_pendbaser(uint64_t reg)
+{
+    reg = vgic_sanitise_field(reg, GICR_PENDBASER_SHAREABILITY_MASK,
+                              GICR_PENDBASER_SHAREABILITY_SHIFT,
+                              vgic_sanitise_shareability);
+    reg = vgic_sanitise_field(reg, GICR_PENDBASER_INNER_CACHEABILITY_MASK,
+                              GICR_PENDBASER_INNER_CACHEABILITY_SHIFT,
+                              vgic_sanitise_inner_cacheability);
+    reg = vgic_sanitise_field(reg, GICR_PENDBASER_OUTER_CACHEABILITY_MASK,
+                              GICR_PENDBASER_OUTER_CACHEABILITY_SHIFT,
+                              vgic_sanitise_outer_cacheability);
+
+    reg &= ~GICR_PENDBASER_RES0_MASK;
+
+    return reg;
+}
+
 static int __vgic_v3_rdistr_rd_mmio_write(struct vcpu *v, mmio_info_t *info,
                                           uint32_t gicr_reg,
                                           register_t r)
 {
     struct hsr_dabt dabt = info->dabt;
+    uint64_t reg;
 
     switch ( gicr_reg )
     {
@@ -394,36 +478,54 @@ static int __vgic_v3_rdistr_rd_mmio_write(struct vcpu *v, mmio_info_t *info,
         goto write_impl_defined;
 
     case VREG64(GICR_SETLPIR):
-        /* LPI is not implemented */
+        /* LPIs without an ITS are not implemented */
         goto write_ignore_64;
 
     case VREG64(GICR_CLRLPIR):
-        /* LPI is not implemented */
+        /* LPIs without an ITS are not implemented */
         goto write_ignore_64;
 
     case 0x0050:
         goto write_reserved;
 
     case VREG64(GICR_PROPBASER):
-        /* LPI is not implemented */
-        goto write_ignore_64;
+        if ( !vgic_reg64_check_access(dabt) ) goto bad_width;
+
+        /* Writing PROPBASER with LPIs enabled is UNPREDICTABLE. */
+        if ( v->arch.vgic.flags & VGIC_V3_LPIS_ENABLED )
+            return 1;
+
+        reg = v->domain->arch.vgic.rdist_propbase;
+        vgic_reg64_update(&reg, r, info);
+        reg = sanitize_propbaser(reg);
+        v->domain->arch.vgic.rdist_propbase = reg;
+        return 1;
 
     case VREG64(GICR_PENDBASER):
-        /* LPI is not implemented */
-        goto write_ignore_64;
+        if ( !vgic_reg64_check_access(dabt) ) goto bad_width;
+
+        /* Writing PENDBASER with LPIs enabled is UNPREDICTABLE. */
+        if ( v->arch.vgic.flags & VGIC_V3_LPIS_ENABLED )
+            return 1;
+
+        reg = v->arch.vgic.rdist_pendbase;
+        vgic_reg64_update(&reg, r, info);
+        reg = sanitize_pendbaser(reg);
+        v->arch.vgic.rdist_pendbase = reg;
+        return 1;
 
     case 0x0080:
         goto write_reserved;
 
     case VREG64(GICR_INVLPIR):
-        /* LPI is not implemented */
+        /* LPIs without an ITS are not implemented */
         goto write_ignore_64;
 
     case 0x00A8:
         goto write_reserved;
 
     case VREG64(GICR_INVALLR):
-        /* LPI is not implemented */
+        /* LPIs without an ITS are not implemented */
         goto write_ignore_64;
 
     case 0x00B8:
diff --git a/xen/common/memory.c b/xen/common/memory.c
index 21797ca..29ef9bb 100644
--- a/xen/common/memory.c
+++ b/xen/common/memory.c
@@ -1419,6 +1419,67 @@ int prepare_ring_for_helper(
 }
 
 /*
+ * Mark a given number of guest pages as used (by increasing their refcount),
+ * starting with the given guest address. This needs to be called once before
+ * calling (possibly repeatedly) map_one_guest_page().
+ * Before the domain gets destroyed, call put_guest_pages() to drop the
+ * reference.
+ */
+int get_guest_pages(struct domain *d, paddr_t gpa, unsigned int nr_pages)
+{
+    unsigned int i;
+    struct page_info *page;
+
+    for ( i = 0; i < nr_pages; i++ )
+    {
+        page = get_page_from_gfn(d, (gpa >> PAGE_SHIFT) + i, NULL, P2M_ALLOC);
+        if ( !page )
+        {
+            /* Make sure we drop the references of pages we got so far. */
+            put_guest_pages(d, gpa, i);
+            return -EINVAL;
+        }
+    }
+
+    return 0;
+}
+
+void put_guest_pages(struct domain *d, paddr_t gpa, unsigned int nr_pages)
+{
+    mfn_t mfn;
+    unsigned int i;
+
+    p2m_read_lock(&d->arch.p2m);
+    for ( i = 0; i < nr_pages; i++ )
+    {
+        mfn = p2m_get_entry(&d->arch.p2m, _gfn((gpa >> PAGE_SHIFT) + i),
+                            NULL, NULL, NULL);
+        if ( mfn_eq(mfn, INVALID_MFN) )
+            continue;
+        put_page(mfn_to_page(mfn_x(mfn)));
+    }
+    p2m_read_unlock(&d->arch.p2m);
+}
+
+/*
+ * Provides easy access to guest memory by "mapping" one page of it into
+ * Xen's VA space. In fact it relies on the memory being already mapped
+ * and just provides a pointer to it.
+ */
+void *map_one_guest_page(struct domain *d, paddr_t guest_addr)
+{
+    void *ptr = map_domain_page(_mfn(guest_addr >> PAGE_SHIFT));
+
+    return ptr + (guest_addr & ~PAGE_MASK);
+}
+
+/* "Unmap" a previously mapped guest page. Could be optimized away. */
+void unmap_one_guest_page(void *va)
+{
+    unmap_domain_page((void *)((uintptr_t)va & PAGE_MASK));
+}
+
+/*
  * Local variables:
  * mode: C
  * c-file-style: "BSD"
diff --git a/xen/include/asm-arm/domain.h b/xen/include/asm-arm/domain.h
index a83904a..ad4dfdc 100644
--- a/xen/include/asm-arm/domain.h
+++ b/xen/include/asm-arm/domain.h
@@ -110,6 +110,8 @@ struct arch_domain
         } *rdist_regions;
         int nr_regions;                     /* Number of rdist regions */
         uint32_t rdist_stride;              /* Re-Distributor stride */
+        unsigned int nr_lpis;
+        uint64_t rdist_propbase;
         struct rb_root its_devices;         /* Devices mapped to an ITS */
         spinlock_t its_devices_lock;        /* Protects the its_devices tree */
         struct radix_tree_root pend_lpi_tree; /* Stores struct pending_irq's */
@@ -257,7 +259,9 @@ struct arch_vcpu
 
         /* GICv3: redistributor base and flags for this vCPU */
         paddr_t rdist_base;
-#define VGIC_V3_RDIST_LAST  (1 << 0)        /* last vCPU of the rdist */
+        uint64_t rdist_pendbase;
+#define VGIC_V3_RDIST_LAST      (1 << 0)        /* last vCPU of the rdist */
+#define VGIC_V3_LPIS_ENABLED    (1 << 1)
         uint8_t flags;
     } vgic;
 
diff --git a/xen/include/asm-arm/vgic.h b/xen/include/asm-arm/vgic.h
index eabdf91..9f48e9a 100644
--- a/xen/include/asm-arm/vgic.h
+++ b/xen/include/asm-arm/vgic.h
@@ -310,6 +310,8 @@ extern void register_vgic_ops(struct domain *d, const struct vgic_ops *ops);
 int vgic_v2_init(struct domain *d, int *mmio_count);
 int vgic_v3_init(struct domain *d, int *mmio_count);
 
+extern int vgic_lpi_get_priority(struct domain *d, uint32_t vlpi);
+
 extern int domain_vgic_register(struct domain *d, int *mmio_count);
 extern int vcpu_vgic_free(struct vcpu *v);
 extern bool vgic_to_sgi(struct vcpu *v, register_t sgir,
diff --git a/xen/include/xen/mm.h b/xen/include/xen/mm.h
index 88de3c1..c402856 100644
--- a/xen/include/xen/mm.h
+++ b/xen/include/xen/mm.h
@@ -570,6 +570,14 @@ int prepare_ring_for_helper(struct domain *d, unsigned long gmfn,
                             struct page_info **_page, void **_va);
 void destroy_ring_for_helper(void **_va, struct page_info *page);
 
+/* Mark guest pages as used (by the hypervisor) to avoid dropping them. */
+int get_guest_pages(struct domain *d, paddr_t gpa, unsigned int nr_pages);
+void put_guest_pages(struct domain *d, paddr_t gpa, unsigned int nr_pages);
+
+/* Map guest memory into Xen's VA space. */
+void *map_one_guest_page(struct domain *d, paddr_t guest_addr);
+void unmap_one_guest_page(void *va);
+
 #include <asm/flushtlb.h>
 
 static inline void accumulate_tlbflush(bool *need_tlbflush,
-- 
2.9.0



* [PATCH v3 12/26] ARM: vGICv3: Handle disabled LPIs
  2017-03-31 18:04 [PATCH v3 00/26] arm64: Dom0 ITS emulation Andre Przywara
                   ` (10 preceding siblings ...)
  2017-03-31 18:05 ` [PATCH v3 11/26] ARM: vGICv3: handle virtual LPI pending and property tables Andre Przywara
@ 2017-03-31 18:05 ` Andre Przywara
  2017-03-31 18:05 ` [PATCH v3 13/26] ARM: vGICv3: introduce basic ITS emulation bits Andre Przywara
                   ` (14 subsequent siblings)
  26 siblings, 0 replies; 52+ messages in thread
From: Andre Przywara @ 2017-03-31 18:05 UTC (permalink / raw)
  To: Julien Grall, Stefano Stabellini
  Cc: xen-devel, Shanker Donthineni, Vijay Kilari

If a guest disables an LPI, we do not forward this to the associated
host LPI, to avoid queueing commands to the host ITS command queue.
So an LPI may still fire on the host after the guest has disabled it.
In this case we can bail out early, but have to save the pending state
on the virtual side.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
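The gating described above, as a minimal standalone sketch. The struct
and the SKETCH_* bit names are hypothetical stand-ins for struct
pending_irq and the GIC_IRQ_GUEST_* status bits; this is not the Xen
code, only the decision it makes:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the GIC_IRQ_GUEST_* status bits. */
#define SKETCH_ENABLED      (1u << 0)
#define SKETCH_LPI_PENDING  (1u << 1)

struct sketch_pending_irq {
    uint32_t status;
};

/*
 * Returns true if the LPI may be injected right away. If the guest has
 * the LPI disabled, latch the pending bit instead so the event is not
 * lost and can be delivered once the guest enables the LPI again.
 */
static bool sketch_can_inject_lpi(struct sketch_pending_irq *p)
{
    if ( !p )
        return false;

    if ( p->status & SKETCH_ENABLED )
        return true;

    p->status |= SKETCH_LPI_PENDING;

    return false;
}
```

The caller injects only when this returns true; the latched bit is what
the enable path would have to look at later.
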
 xen/arch/arm/gic-v3-lpi.c | 23 ++++++++++++++++++++++-
 1 file changed, 22 insertions(+), 1 deletion(-)

diff --git a/xen/arch/arm/gic-v3-lpi.c b/xen/arch/arm/gic-v3-lpi.c
index df75cf6..2301d53 100644
--- a/xen/arch/arm/gic-v3-lpi.c
+++ b/xen/arch/arm/gic-v3-lpi.c
@@ -115,6 +115,21 @@ uint64_t gicv3_get_redist_address(unsigned int cpu, bool use_pta)
         return per_cpu(lpi_redist, cpu).redist_id << 16;
 }
 
+static bool vgic_can_inject_lpi(struct vcpu *vcpu, uint32_t vlpi)
+{
+    struct pending_irq *p = lpi_to_pending(vcpu->domain, vlpi);
+
+    if ( !p )
+        return false;
+
+    if ( test_bit(GIC_IRQ_GUEST_ENABLED, &p->status) )
+        return true;
+
+    set_bit(GIC_IRQ_GUEST_LPI_PENDING, &p->status);
+
+    return false;
+}
+
 /*
  * Handle incoming LPIs, which are a bit special, because they are potentially
  * numerous and also only get injected into guests. Treat them specially here,
@@ -152,7 +167,13 @@ void do_LPI(unsigned int lpi)
 
     vcpu = d->vcpu[hlpi.vcpu_id];
 
-    vgic_vcpu_inject_irq(vcpu, hlpi.virt_lpi);
+    /*
+     * We keep all host LPIs enabled, so check if it's disabled on the guest
+     * side and just record this LPI in the virtual pending table in this case.
+     * The guest picks it up once it gets enabled again.
+     */
+    if ( vgic_can_inject_lpi(vcpu, hlpi.virt_lpi) )
+        vgic_vcpu_inject_irq(vcpu, hlpi.virt_lpi);
 
     rcu_unlock_domain(d);
 }
-- 
2.9.0



* [PATCH v3 13/26] ARM: vGICv3: introduce basic ITS emulation bits
  2017-03-31 18:04 [PATCH v3 00/26] arm64: Dom0 ITS emulation Andre Przywara
                   ` (11 preceding siblings ...)
  2017-03-31 18:05 ` [PATCH v3 12/26] ARM: vGICv3: Handle disabled LPIs Andre Przywara
@ 2017-03-31 18:05 ` Andre Przywara
  2017-03-31 18:05 ` [PATCH v3 14/26] ARM: vITS: introduce translation table walks Andre Przywara
                   ` (13 subsequent siblings)
  26 siblings, 0 replies; 52+ messages in thread
From: Andre Przywara @ 2017-03-31 18:05 UTC (permalink / raw)
  To: Julien Grall, Stefano Stabellini
  Cc: xen-devel, Shanker Donthineni, Vijay Kilari

Create a new file to hold the emulation code for the ITS widget.
For now we emulate the memory mapped ITS registers and provide a stub
to introduce the ITS command handling framework (but without actually
emulating any commands at this time).

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
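For reviewers, a standalone sketch of the two mechanisms the command
framework relies on: pulling a bit field out of a 32-byte ITS command
and advancing the read offset around the ring. The sketch_* names are
made up for illustration, and the sketch assumes the command words are
already in host byte order (the patch itself uses le64_to_cpu()):

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_CMD_SIZE 32u   /* one ITS command: four 64-bit words */

/* Extract 'size' bits starting at 'shift' from one 64-bit command word. */
static uint64_t sketch_cmd_field(const uint64_t *cmd, unsigned int word,
                                 unsigned int shift, unsigned int size)
{
    uint64_t mask;

    assert(word < 4 && size >= 1 && shift + size <= 64);
    /* Avoid the undefined 1 << 64 when a field covers the whole word. */
    mask = (size == 64) ? ~UINT64_C(0) : ((UINT64_C(1) << size) - 1);

    return (cmd[word] >> shift) & mask;
}

/* Size in bytes of the ring: CBASER[7:0] holds (number of 4K pages - 1). */
static uint64_t sketch_cmdbuf_size(uint64_t cbaser)
{
    return ((cbaser & 0xff) + 1) << 12;
}

/* Advance the read offset by one command, wrapping at the end of the ring. */
static uint64_t sketch_advance(uint64_t creadr, uint64_t cbaser)
{
    creadr += SKETCH_CMD_SIZE;
    if ( creadr == sketch_cmdbuf_size(cbaser) )
        creadr = 0;

    return creadr;
}
```

With a one-page ring, an offset of 4096 - 32 wraps back to 0, which is
the point at which the handler in this patch remaps the command page.
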
 xen/arch/arm/Makefile             |   1 +
 xen/arch/arm/vgic-v3-its.c        | 547 ++++++++++++++++++++++++++++++++++++++
 xen/arch/arm/vgic-v3.c            |   9 -
 xen/include/asm-arm/gic_v3_defs.h |  19 ++
 xen/include/asm-arm/gic_v3_its.h  |   2 +
 5 files changed, 569 insertions(+), 9 deletions(-)
 create mode 100644 xen/arch/arm/vgic-v3-its.c

diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
index 02a8737..e7ce2c83 100644
--- a/xen/arch/arm/Makefile
+++ b/xen/arch/arm/Makefile
@@ -47,6 +47,7 @@ obj-y += traps.o
 obj-y += vgic.o
 obj-y += vgic-v2.o
 obj-$(CONFIG_HAS_GICV3) += vgic-v3.o
+obj-$(CONFIG_HAS_ITS) += vgic-v3-its.o
 obj-y += vm_event.o
 obj-y += vtimer.o
 obj-y += vpsci.o
diff --git a/xen/arch/arm/vgic-v3-its.c b/xen/arch/arm/vgic-v3-its.c
new file mode 100644
index 0000000..fd3b9a1
--- /dev/null
+++ b/xen/arch/arm/vgic-v3-its.c
@@ -0,0 +1,547 @@
+/*
+ * xen/arch/arm/vgic-v3-its.c
+ *
+ * ARM Interrupt Translation Service (ITS) emulation
+ *
+ * Andre Przywara <andre.przywara@arm.com>
+ * Copyright (c) 2016,2017 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; under version 2 of the License.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <xen/bitops.h>
+#include <xen/config.h>
+#include <xen/domain_page.h>
+#include <xen/lib.h>
+#include <xen/init.h>
+#include <xen/softirq.h>
+#include <xen/irq.h>
+#include <xen/sched.h>
+#include <xen/sizes.h>
+#include <asm/current.h>
+#include <asm/mmio.h>
+#include <asm/gic_v3_defs.h>
+#include <asm/gic_v3_its.h>
+#include <asm/vgic.h>
+#include <asm/vgic-emul.h>
+
+/* Data structure to describe a virtual ITS */
+#define VIRT_ITS_ENABLED        0
+#define VIRT_ITS_COLL_VALID     1
+#define VIRT_ITS_DEV_VALID      2
+#define VIRT_ITS_CMDBUF_VALID   3
+struct virt_its {
+    struct domain *d;
+    spinlock_t vcmd_lock;       /* Protects the virtual command buffer. */
+    uint64_t cbaser;
+    uint64_t cwriter;
+    uint64_t creadr;
+    spinlock_t its_lock;        /* Protects the collection and device tables. */
+    uint64_t baser_dev, baser_coll;
+    unsigned int max_collections;
+    unsigned int max_devices;
+    unsigned int devid_bits;
+    unsigned int intid_bits;
+    unsigned long flags;
+};
+
+/*
+ * An Interrupt Translation Table Entry: this is indexed by a
+ * DeviceID/EventID pair and is located in guest memory.
+ */
+struct vits_itte
+{
+    uint32_t vlpi;
+    uint16_t collection;
+    uint16_t pad;
+};
+
+static bool its_is_enabled(struct virt_its *its)
+{
+    return test_bit(VIRT_ITS_ENABLED, &its->flags);
+}
+
+/**************************************
+ * Functions that handle ITS commands *
+ **************************************/
+
+static uint64_t its_cmd_mask_field(uint64_t *its_cmd, unsigned int word,
+                                   unsigned int shift, unsigned int size)
+{
+    return (le64_to_cpu(its_cmd[word]) >> shift) & (BIT(size) - 1);
+}
+
+#define its_cmd_get_command(cmd)        its_cmd_mask_field(cmd, 0,  0,  8)
+#define its_cmd_get_deviceid(cmd)       its_cmd_mask_field(cmd, 0, 32, 32)
+#define its_cmd_get_size(cmd)           its_cmd_mask_field(cmd, 1,  0,  5)
+#define its_cmd_get_id(cmd)             its_cmd_mask_field(cmd, 1,  0, 32)
+#define its_cmd_get_physical_id(cmd)    its_cmd_mask_field(cmd, 1, 32, 32)
+#define its_cmd_get_collection(cmd)     its_cmd_mask_field(cmd, 2,  0, 16)
+#define its_cmd_get_target_addr(cmd)    its_cmd_mask_field(cmd, 2, 16, 32)
+#define its_cmd_get_validbit(cmd)       its_cmd_mask_field(cmd, 2, 63,  1)
+
+#define ITS_CMD_BUFFER_SIZE(baser)      ((((baser) & 0xff) + 1) << 12)
+
+static int vgic_its_handle_cmds(struct domain *d, struct virt_its *its,
+                                uint32_t writer)
+{
+    paddr_t cmdbuf_addr = its->cbaser & GENMASK_ULL(51, 12);
+    void *cmdbuf = NULL;
+    uint64_t *cmdptr;
+
+    if ( writer >= ITS_CMD_BUFFER_SIZE(its->cbaser) )
+        return -1;
+
+    spin_lock(&its->vcmd_lock);
+
+    while ( its->creadr != writer )
+    {
+        int ret;
+
+        ret = 0;
+
+        /*
+         * If this is the first command we handle or we cross a page boundary,
+         * we need to (re)map the command buffer.
+         */
+        if ( !cmdbuf || (its->creadr & ~PAGE_MASK) == 0 )
+        {
+            if ( cmdbuf )
+                unmap_one_guest_page(cmdbuf);
+            cmdbuf = map_one_guest_page(d,
+                                       (cmdbuf_addr + its->creadr) & PAGE_MASK);
+            if ( !cmdbuf )
+                return -EFAULT;
+        }
+        cmdptr = cmdbuf + (its->creadr & ~PAGE_MASK);
+
+        switch ( its_cmd_get_command(cmdptr) )
+        {
+        case GITS_CMD_SYNC:
+            /* We handle ITS commands synchronously, so we ignore SYNC. */
+	    break;
+        default:
+            gdprintk(XENLOG_WARNING, "ITS: unhandled ITS command %lu\n",
+                     its_cmd_get_command(cmdptr));
+            break;
+        }
+
+        its->creadr += ITS_CMD_SIZE;
+        if ( its->creadr == ITS_CMD_BUFFER_SIZE(its->cbaser) )
+            its->creadr = 0;
+
+        if ( ret )
+            gdprintk(XENLOG_WARNING,
+                     "ITS: ITS command error %d while handling command %lu\n",
+                     ret, its_cmd_get_command(cmdptr));
+    }
+    its->cwriter = writer;
+
+    spin_unlock(&its->vcmd_lock);
+
+    if ( cmdbuf )
+        unmap_one_guest_page(cmdbuf);
+
+    return 0;
+}
+
+/*****************************
+ * ITS registers read access *
+ *****************************/
+
+static int vgic_v3_its_mmio_read(struct vcpu *v, mmio_info_t *info,
+                                 register_t *r, void *priv)
+{
+    struct virt_its *its = priv;
+    uint64_t reg;
+
+    switch ( info->gpa & 0xffff )
+    {
+    case VREG32(GITS_CTLR):
+        if ( info->dabt.size != DABT_WORD ) goto bad_width;
+        if ( its_is_enabled(its) )
+            reg = GITS_CTLR_ENABLE | BIT(31);
+        else
+            reg = BIT(31);
+        *r = vgic_reg32_extract(reg, info);
+        break;
+    case VREG32(GITS_IIDR):
+        if ( info->dabt.size != DABT_WORD ) goto bad_width;
+        *r = vgic_reg32_extract(GITS_IIDR_VALUE, info);
+        break;
+    case VREG64(GITS_TYPER):
+        if ( !vgic_reg64_check_access(info->dabt) ) goto bad_width;
+
+        reg = GITS_TYPER_PHYSICAL;
+        reg |= (sizeof(struct vits_itte) - 1) << GITS_TYPER_ITT_SIZE_SHIFT;
+        reg |= (its->intid_bits - 1) << GITS_TYPER_IDBITS_SHIFT;
+        reg |= (its->devid_bits - 1) << GITS_TYPER_DEVIDS_SHIFT;
+        *r = vgic_reg64_extract(reg, info);
+        break;
+    case VREG64(GITS_CBASER):
+        if ( ! vgic_reg64_check_access(info->dabt) ) goto bad_width;
+        *r = vgic_reg64_extract(its->cbaser, info);
+        break;
+    case VREG64(GITS_CWRITER):
+        if ( ! vgic_reg64_check_access(info->dabt) ) goto bad_width;
+        *r = vgic_reg64_extract(its->cwriter, info);
+        break;
+    case VREG64(GITS_CREADR):
+        if ( ! vgic_reg64_check_access(info->dabt) ) goto bad_width;
+        *r = vgic_reg64_extract(its->creadr, info);
+        break;
+    case VREG64(GITS_BASER0):
+        if ( ! vgic_reg64_check_access(info->dabt) ) goto bad_width;
+        *r = vgic_reg64_extract(its->baser_dev, info);
+        break;
+    case VREG64(GITS_BASER1):
+        if ( ! vgic_reg64_check_access(info->dabt) ) goto bad_width;
+        *r = vgic_reg64_extract(its->baser_coll, info);
+        break;
+    case VRANGE64(GITS_BASER2, GITS_BASER7):
+        if ( ! vgic_reg64_check_access(info->dabt) ) goto bad_width;
+        *r = vgic_reg64_extract(0, info);
+        break;
+    case VREG32(GITS_PIDR2):
+        if ( info->dabt.size != DABT_WORD ) goto bad_width;
+        *r = vgic_reg32_extract(GICV3_GICD_PIDR2, info);
+        break;
+    }
+
+    return 1;
+
+bad_width:
+    domain_crash_synchronous();
+
+    return 0;
+}
+
+/******************************
+ * ITS registers write access *
+ ******************************/
+
+static int its_baser_table_size(uint64_t baser)
+{
+    int page_size = 0;
+
+    switch ( (baser >> 8) & 3 )
+    {
+    case 0: page_size = SZ_4K; break;
+    case 1: page_size = SZ_16K; break;
+    case 2:
+    case 3: page_size = SZ_64K; break;
+    }
+
+    return page_size * ((baser & GENMASK_ULL(7, 0)) + 1);
+}
+
+static int its_baser_nr_entries(uint64_t baser)
+{
+    int entry_size = ((baser & GENMASK_ULL(52, 48)) >> 48) + 1;
+
+    return its_baser_table_size(baser) / entry_size;
+}
+
+static int vgic_its_map_cmdbuf(struct virt_its *its)
+{
+    if ( !(its->cbaser & GITS_VALID_BIT) )
+        return -EBUSY;
+
+    return get_guest_pages(its->d, its->cbaser & GENMASK_ULL(51, 12),
+                           (its->cbaser & 0xff) + 1);
+}
+
+static void vgic_its_unmap_cmdbuf(struct virt_its *its)
+{
+    int nr_pages = (its->cbaser & 0xff) + 1;
+
+    put_guest_pages(its->d, its->cbaser & GENMASK_ULL(51, 12), nr_pages);
+}
+
+static int vgic_its_map_its_table(struct virt_its *its, uint64_t reg)
+{
+    unsigned int i, table_size = its_baser_table_size(reg);
+    paddr_t guest_addr = get_baser_phys_addr(reg);
+
+    if ( !(reg & GITS_VALID_BIT) )
+        return -EINVAL;
+
+    get_guest_pages(its->d, guest_addr, table_size >> PAGE_SHIFT);
+    /* Map each page one by one to check and clear it. */
+    for ( i = 0; i < table_size >> PAGE_SHIFT; i++ )
+    {
+        void *ptr = map_one_guest_page(its->d, guest_addr + (i << PAGE_SHIFT));
+
+        if ( !ptr )
+            return -EFAULT;
+
+        memset(ptr, 0, PAGE_SIZE);
+        unmap_one_guest_page(ptr);
+    }
+
+    return 0;
+}
+
+static void vgic_its_unmap_its_table(struct domain *d, uint64_t reg)
+{
+    put_guest_pages(d, get_baser_phys_addr(reg),
+                    its_baser_table_size(reg) >> PAGE_SHIFT);
+}
+
+static bool vgic_v3_its_change_its_status(struct virt_its *its, bool status)
+{
+    bool ret = true;
+
+    if ( !status )
+    {
+        clear_bit(VIRT_ITS_ENABLED, &its->flags);
+        return false;
+    }
+
+    if ( !vgic_its_map_cmdbuf(its) )
+        set_bit(VIRT_ITS_CMDBUF_VALID, &its->flags);
+    else
+    {
+        clear_bit(VIRT_ITS_CMDBUF_VALID, &its->flags);
+        ret = false;
+    }
+
+    if ( !vgic_its_map_its_table(its, its->baser_dev) )
+        set_bit(VIRT_ITS_DEV_VALID, &its->flags);
+    else
+    {
+        clear_bit(VIRT_ITS_DEV_VALID, &its->flags);
+        ret = false;
+    }
+
+    if ( !vgic_its_map_its_table(its, its->baser_coll) )
+        set_bit(VIRT_ITS_COLL_VALID, &its->flags);
+    else
+    {
+        clear_bit(VIRT_ITS_COLL_VALID, &its->flags);
+        ret = false;
+    }
+
+    if ( ret )
+        set_bit(VIRT_ITS_ENABLED, &its->flags);
+    else
+        clear_bit(VIRT_ITS_ENABLED, &its->flags);
+
+    return ret;
+}
+
+static void sanitize_its_base_reg(uint64_t *reg)
+{
+    uint64_t r = *reg;
+
+    /* Avoid outer shareable. */
+    switch ( (r >> GITS_BASER_SHAREABILITY_SHIFT) & 0x03 )
+    {
+    case GIC_BASER_OuterShareable:
+        r = r & ~GITS_BASER_SHAREABILITY_MASK;
+        r |= GIC_BASER_InnerShareable << GITS_BASER_SHAREABILITY_SHIFT;
+        break;
+    default:
+        break;
+    }
+
+    /* Avoid any inner non-cacheable mapping. */
+    switch ( (r >> GITS_BASER_INNER_CACHEABILITY_SHIFT) & 0x07 )
+    {
+    case GIC_BASER_CACHE_nCnB:
+    case GIC_BASER_CACHE_nC:
+        r = r & ~GITS_BASER_INNER_CACHEABILITY_MASK;
+        r |= GIC_BASER_CACHE_RaWb << GITS_BASER_INNER_CACHEABILITY_SHIFT;
+        break;
+    default:
+        break;
+    }
+
+    /* Only allow non-cacheable or same-as-inner. */
+    switch ( (r >> GITS_BASER_OUTER_CACHEABILITY_SHIFT) & 0x07 )
+    {
+    case GIC_BASER_CACHE_SameAsInner:
+    case GIC_BASER_CACHE_nC:
+        break;
+    default:
+        r = r & ~GITS_BASER_OUTER_CACHEABILITY_MASK;
+        r |= GIC_BASER_CACHE_nC << GITS_BASER_OUTER_CACHEABILITY_SHIFT;
+        break;
+    }
+
+    *reg = r;
+}
+
+static int vgic_v3_its_mmio_write(struct vcpu *v, mmio_info_t *info,
+                                  register_t r, void *priv)
+{
+    struct domain *d = v->domain;
+    struct virt_its *its = priv;
+    uint64_t reg;
+    uint32_t reg32, ctlr;
+
+    switch ( info->gpa & 0xffff )
+    {
+    case VREG32(GITS_CTLR):
+        if ( info->dabt.size != DABT_WORD ) goto bad_width;
+
+        ctlr = its_is_enabled(its) ? GITS_CTLR_ENABLE : 0;
+        reg32 = ctlr;
+        vgic_reg32_update(&reg32, r, info);
+
+        if ( ctlr ^ reg32 )
+            vgic_v3_its_change_its_status(its, reg32 & GITS_CTLR_ENABLE);
+        return 1;
+
+    case VREG32(GITS_IIDR):
+        goto write_ignore_32;
+    case VREG64(GITS_TYPER):
+        goto write_ignore_64;
+    case VREG64(GITS_CBASER):
+        if ( ! vgic_reg64_check_access(info->dabt) ) goto bad_width;
+
+        /* Changing base registers with the ITS enabled is UNPREDICTABLE. */
+        if ( its_is_enabled(its) )
+        {
+            gdprintk(XENLOG_WARNING, "ITS: Domain %d tried to change CBASER with the ITS enabled.\n", d->domain_id);
+            return 1;
+        }
+
+        reg = its->cbaser;
+        vgic_reg64_update(&reg, r, info);
+        sanitize_its_base_reg(&reg);
+
+        vgic_its_unmap_cmdbuf(its);
+        its->cbaser = reg;
+
+	return 1;
+
+    case VREG64(GITS_CWRITER):
+        if ( ! vgic_reg64_check_access(info->dabt) ) goto bad_width;
+        reg = its->cwriter & 0xfffe0;
+        vgic_reg64_update(&reg, r, info);
+        its->cwriter = reg & 0xfffe0;
+
+        if ( its_is_enabled(its) )
+            vgic_its_handle_cmds(d, its, reg);
+
+        return 1;
+
+    case VREG64(GITS_CREADR):
+        goto write_ignore_64;
+    case VREG64(GITS_BASER0):
+        if ( ! vgic_reg64_check_access(info->dabt) ) goto bad_width;
+
+        /* Changing base registers with the ITS enabled is UNPREDICTABLE. */
+        if ( its_is_enabled(its) )
+        {
+            gdprintk(XENLOG_WARNING, "ITS: Domain %d tried to change BASER with the ITS enabled.\n",
+                     d->domain_id);
+
+            return 1;
+        }
+
+        reg = its->baser_dev;
+        vgic_reg64_update(&reg, r, info);
+
+        reg &= ~GITS_BASER_RO_MASK;
+        reg |= (sizeof(uint64_t) - 1) << GITS_BASER_ENTRY_SIZE_SHIFT;
+        reg |= GITS_BASER_TYPE_DEVICE << GITS_BASER_TYPE_SHIFT;
+        sanitize_its_base_reg(&reg);
+
+        /* Has the table address been changed or invalidated? */
+        if ( !(reg & GITS_VALID_BIT) ||
+             get_baser_phys_addr(reg) != get_baser_phys_addr(its->baser_dev) )
+        {
+            vgic_its_unmap_its_table(its->d, its->baser_dev);
+            clear_bit(VIRT_ITS_DEV_VALID, &its->flags);
+        }
+
+        if ( reg & GITS_VALID_BIT )
+            its->max_devices = its_baser_nr_entries(reg);
+        else
+            its->max_devices = 0;
+
+        its->baser_dev = reg;
+        return 1;
+    case VREG64(GITS_BASER1):
+        if ( ! vgic_reg64_check_access(info->dabt) ) goto bad_width;
+
+        /* Changing base registers with the ITS enabled is UNPREDICTABLE. */
+        if ( its_is_enabled(its) )
+        {
+            gdprintk(XENLOG_INFO, "ITS: Domain %d tried to change BASER with the ITS enabled.\n",
+                     d->domain_id);
+            return 1;
+        }
+
+        reg = its->baser_coll;
+        vgic_reg64_update(&reg, r, info);
+        reg &= ~GITS_BASER_RO_MASK;
+        reg |= (sizeof(uint16_t) - 1) << GITS_BASER_ENTRY_SIZE_SHIFT;
+        reg |= GITS_BASER_TYPE_COLLECTION << GITS_BASER_TYPE_SHIFT;
+        sanitize_its_base_reg(&reg);
+
+        if ( !(reg & GITS_VALID_BIT) ||
+             get_baser_phys_addr(reg) != get_baser_phys_addr(its->baser_coll) )
+        {
+            vgic_its_unmap_its_table(its->d, its->baser_coll);
+            clear_bit(VIRT_ITS_COLL_VALID, &its->flags);
+        }
+
+        if ( reg & GITS_VALID_BIT )
+            its->max_collections = its_baser_nr_entries(reg);
+        else
+            its->max_collections = 0;
+        its->baser_coll = reg;
+        return 1;
+    case VRANGE64(GITS_BASER2, GITS_BASER7):
+        goto write_ignore_64;
+    default:
+        gdprintk(XENLOG_G_WARNING, "ITS: unhandled ITS register 0x%lx\n",
+                 info->gpa & 0xffff);
+        return 0;
+    }
+
+    return 1;
+
+write_ignore_64:
+    if ( ! vgic_reg64_check_access(info->dabt) ) goto bad_width;
+    return 1;
+
+write_ignore_32:
+    if ( info->dabt.size != DABT_WORD ) goto bad_width;
+    return 1;
+
+bad_width:
+    printk(XENLOG_G_ERR "%pv vGITS: bad write width %d r%d offset %#08lx\n",
+           v, info->dabt.size, info->dabt.reg, info->gpa & 0xffff);
+
+    domain_crash_synchronous();
+
+    return 0;
+}
+
+static const struct mmio_handler_ops vgic_its_mmio_handler = {
+    .read  = vgic_v3_its_mmio_read,
+    .write = vgic_v3_its_mmio_write,
+};
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c
index 7f84fbf..4159fb8 100644
--- a/xen/arch/arm/vgic-v3.c
+++ b/xen/arch/arm/vgic-v3.c
@@ -159,15 +159,6 @@ static void vgic_store_irouter(struct domain *d, struct vgic_irq_rank *rank,
     rank->vcpu[offset] = new_vcpu->vcpu_id;
 }
 
-static inline bool vgic_reg64_check_access(struct hsr_dabt dabt)
-{
-    /*
-     * 64 bits registers can be accessible using 32-bit and 64-bit unless
-     * stated otherwise (See 8.1.3 ARM IHI 0069A).
-     */
-    return ( dabt.size == DABT_DOUBLE_WORD || dabt.size == DABT_WORD );
-}
-
 static int __vgic_v3_rdistr_rd_mmio_read(struct vcpu *v, mmio_info_t *info,
                                          uint32_t gicr_reg,
                                          register_t *r)
diff --git a/xen/include/asm-arm/gic_v3_defs.h b/xen/include/asm-arm/gic_v3_defs.h
index b01b6ed..8999937 100644
--- a/xen/include/asm-arm/gic_v3_defs.h
+++ b/xen/include/asm-arm/gic_v3_defs.h
@@ -155,6 +155,16 @@
 #define LPI_PROP_RES1                (1 << 1)
 #define LPI_PROP_ENABLED             (1 << 0)
 
+/*
+ * PIDR2: Only bits[7:4] are not implementation defined. We are
+ * emulating a GICv3 ([7:4] = 0x3).
+ *
+ * We don't emulate a specific register scheme, so implement the other
+ * bits as RES0 as recommended by the spec (see 8.1.13 in ARM IHI 0069A).
+ */
+#define GICV3_GICD_PIDR2  0x30
+#define GICV3_GICR_PIDR2  GICV3_GICD_PIDR2
+
 #define GICH_VMCR_EOI                (1 << 9)
 #define GICH_VMCR_VENG1              (1 << 1)
 
@@ -198,6 +208,15 @@ struct rdist_region {
     bool single_rdist;
 };
 
+/*
+ * 64-bit registers can be accessed using 32-bit and 64-bit accesses unless
+ * stated otherwise (See 8.1.3 ARM IHI 0069A).
+ */
+static inline bool vgic_reg64_check_access(struct hsr_dabt dabt)
+{
+    return ( dabt.size == DABT_DOUBLE_WORD || dabt.size == DABT_WORD );
+}
+
 #endif /* __ASM_ARM_GIC_V3_DEFS_H__ */
 
 /*
diff --git a/xen/include/asm-arm/gic_v3_its.h b/xen/include/asm-arm/gic_v3_its.h
index 7b47596..bc9b42a 100644
--- a/xen/include/asm-arm/gic_v3_its.h
+++ b/xen/include/asm-arm/gic_v3_its.h
@@ -35,6 +35,7 @@
 #define GITS_BASER5                     0x128
 #define GITS_BASER6                     0x130
 #define GITS_BASER7                     0x138
+#define GITS_PIDR2                      GICR_PIDR2
 
 /* Register bits */
 #define GITS_VALID_BIT                  BIT_ULL(63)
@@ -52,6 +53,7 @@
 #define GITS_TYPER_ITT_SIZE_MASK        (0xfUL << GITS_TYPER_ITT_SIZE_SHIFT)
 #define GITS_TYPER_ITT_SIZE(r)          ((((r) & GITS_TYPER_ITT_SIZE_MASK) >> \
                                                 GITS_TYPER_ITT_SIZE_SHIFT) + 1)
+#define GITS_TYPER_PHYSICAL             (1U << 0)
 
 #define GITS_IIDR_VALUE                 0x34c
 
-- 
2.9.0



* [PATCH v3 14/26] ARM: vITS: introduce translation table walks
  2017-03-31 18:04 [PATCH v3 00/26] arm64: Dom0 ITS emulation Andre Przywara
                   ` (12 preceding siblings ...)
  2017-03-31 18:05 ` [PATCH v3 13/26] ARM: vGICv3: introduce basic ITS emulation bits Andre Przywara
@ 2017-03-31 18:05 ` Andre Przywara
  2017-03-31 18:05 ` [PATCH v3 15/26] ARM: vITS: handle CLEAR command Andre Przywara
                   ` (12 subsequent siblings)
  26 siblings, 0 replies; 52+ messages in thread
From: Andre Przywara @ 2017-03-31 18:05 UTC (permalink / raw)
  To: Julien Grall, Stefano Stabellini
  Cc: xen-devel, Shanker Donthineni, Vijay Kilari

The ITS stores the target (v)CPU and the (virtual) LPI number in tables.
Introduce functions to walk those tables and translate a device ID/
event ID pair into a pair of virtual LPI and vCPU.
We map those tables on demand - which is cheap on arm64. Also we take
care of the locking on the way, since we can't easily protect those ITTs
from being altered by the guest.

To allow compiling without warnings, we declare two functions as
non-static for the moment, which two later patches will fix.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
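To make the two encodings used by the table walks easier to check, here
is a standalone sketch of the GITS_BASER address decoding and of the
device table entry layout described in this patch. SKETCH_GENMASK and
the sketch_* helpers are illustrative names, not Xen functions:

```c
#include <stdint.h>

/* Bits [h:l] set, for 0 <= l <= h <= 63. */
#define SKETCH_GENMASK(h, l) \
    ((~UINT64_C(0) >> (63 - (h))) & (~UINT64_C(0) << (l)))

/* Decode the table address; 64K pages keep PA[51:48] in bits [15:12]. */
static uint64_t sketch_baser_phys_addr(uint64_t reg)
{
    if ( reg & (UINT64_C(1) << 9) )   /* upper page size bit set: 64K */
        return (reg & SKETCH_GENMASK(47, 16)) |
               ((reg & SKETCH_GENMASK(15, 12)) << 36);

    return reg & SKETCH_GENMASK(47, 12);
}

/* ITT address in bits [51:8], (number of event ID bits - 1) in bits [7:0]. */
static uint64_t sketch_dev_entry(uint64_t itt_addr, unsigned int bits)
{
    return (itt_addr & SKETCH_GENMASK(51, 8)) |
           ((uint64_t)(bits - 1) & SKETCH_GENMASK(7, 0));
}

static uint64_t sketch_dev_entry_addr(uint64_t entry)
{
    return entry & SKETCH_GENMASK(51, 8);
}

/* Number of events the ITT covers; only valid for size fields below 63. */
static uint64_t sketch_dev_entry_nr_events(uint64_t entry)
{
    return UINT64_C(1) << ((entry & SKETCH_GENMASK(7, 0)) + 1);
}
```

An entry built for 5 event ID bits therefore covers 32 events, and an
event ID is only accepted if it is below that count.
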
 xen/arch/arm/vgic-v3-its.c | 177 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 177 insertions(+)

diff --git a/xen/arch/arm/vgic-v3-its.c b/xen/arch/arm/vgic-v3-its.c
index fd3b9a1..d75f404 100644
--- a/xen/arch/arm/vgic-v3-its.c
+++ b/xen/arch/arm/vgic-v3-its.c
@@ -71,6 +71,183 @@ static bool its_is_enabled(struct virt_its *its)
     return test_bit(VIRT_ITS_ENABLED, &its->flags);
 }
 
+#define UNMAPPED_COLLECTION      ((uint16_t)~0)
+
+/*
+ * The physical address is encoded slightly differently depending on
+ * the used page size: the highest four bits are stored in the lowest
+ * four bits of the field for 64K pages.
+ */
+static paddr_t get_baser_phys_addr(uint64_t reg)
+{
+    if ( reg & BIT(9) )
+        return (reg & GENMASK_ULL(47, 16)) |
+                ((reg & GENMASK_ULL(15, 12)) << 36);
+    else
+        return reg & GENMASK_ULL(47, 12);
+}
+
+/* Must be called with the ITS lock held. */
+static struct vcpu *get_vcpu_from_collection(struct virt_its *its, int collid)
+{
+    paddr_t addr = get_baser_phys_addr(its->baser_coll);
+    uint16_t *coll_table;
+    uint16_t vcpu_id;
+
+    if ( collid >= its->max_collections )
+        return NULL;
+
+    coll_table = map_one_guest_page(its->d, addr + collid * sizeof(uint16_t));
+    if ( !coll_table )
+        return NULL;
+
+    vcpu_id = *coll_table;
+
+    unmap_one_guest_page(coll_table);
+
+    if ( vcpu_id == UNMAPPED_COLLECTION || vcpu_id >= its->d->max_vcpus )
+        return NULL;
+
+    return its->d->vcpu[vcpu_id];
+}
+
+/*
+ * Our device table encodings:
+ * Contains the guest physical address of the Interrupt Translation Table in
+ * bits [51:8], and the size of it encoded in the lowest 8 bits.
+ */
+#define DEV_TABLE_ITT_ADDR(x) ((x) & GENMASK_ULL(51, 8))
+#define DEV_TABLE_ITT_SIZE(x) (BIT(((x) & GENMASK_ULL(7, 0)) + 1))
+#define DEV_TABLE_ENTRY(addr, bits)                     \
+        (((addr) & GENMASK_ULL(51, 8)) | (((bits) - 1) & GENMASK_ULL(7, 0)))
+
+/*
+ * Lookup the address of the Interrupt Translation Table associated with
+ * a device ID and return the address of the ITTE belonging to the event ID
+ * (which is an index into that table).
+ */
+static paddr_t its_get_itte_address(struct virt_its *its,
+                                    uint32_t devid, uint32_t evid)
+{
+    paddr_t ret, addr = get_baser_phys_addr(its->baser_dev);
+    uint64_t *itt;
+
+    if ( devid >= its->max_devices )
+        return INVALID_PADDR;
+
+    itt = map_one_guest_page(its->d, addr + devid * sizeof(uint64_t));
+    if ( !itt )
+        return INVALID_PADDR;
+
+    if ( evid < DEV_TABLE_ITT_SIZE(*itt) &&
+         DEV_TABLE_ITT_ADDR(*itt) != INVALID_PADDR )
+        ret = DEV_TABLE_ITT_ADDR(*itt) + evid * sizeof(struct vits_itte);
+    else
+        ret = INVALID_PADDR;
+
+    unmap_one_guest_page(itt);
+
+    return ret;
+}
+
+/*
+ * Looks up a given deviceID/eventID pair on an ITS and returns a pointer to
+ * the corresponding ITTE. This maps the respective guest page into Xen.
+ * Once finished with handling the ITTE, call put_itte() to unmap
+ * the page again.
+ * Must be called with the ITS lock held.
+ */
+static struct vits_itte *get_itte(struct virt_its *its,
+                                  uint32_t devid, uint32_t evid)
+{
+    paddr_t addr = its_get_itte_address(its, devid, evid);
+
+    if ( addr == INVALID_PADDR )
+        return NULL;
+
+    return map_one_guest_page(its->d, addr);
+}
+
+/* Must be called with the ITS lock held. */
+static void put_itte(struct virt_its *its, struct vits_itte *itte)
+{
+    unmap_one_guest_page(itte);
+}
+
+/*
+ * Queries the collection and device tables to get the vCPU and virtual
+ * LPI number for a given guest event. This takes care of mapping the
+ * respective tables and validating the values, since we can't efficiently
+ * protect the ITTs with their less-than-page-size granularity.
+ * Takes and drops the its_lock.
+ */
+bool read_itte(struct virt_its *its, uint32_t devid, uint32_t evid,
+               struct vcpu **vcpu, uint32_t *vlpi)
+{
+    struct vits_itte *itte;
+    int collid;
+    uint32_t _vlpi;
+    struct vcpu *_vcpu;
+
+    spin_lock(&its->its_lock);
+    itte = get_itte(its, devid, evid);
+    if ( !itte )
+    {
+        spin_unlock(&its->its_lock);
+        return false;
+    }
+    collid = itte->collection;
+    _vlpi = itte->vlpi;
+    put_itte(its, itte);
+
+    _vcpu = get_vcpu_from_collection(its, collid);
+    spin_unlock(&its->its_lock);
+
+    if ( !_vcpu )
+        return false;
+
+    if ( collid >= its->max_collections )
+        return false;
+
+    *vcpu = _vcpu;
+    *vlpi = _vlpi;
+
+    return true;
+}
+
+#define SKIP_LPI_UPDATE 1
+bool write_itte(struct virt_its *its, uint32_t devid, uint32_t evid,
+                uint32_t collid, uint32_t vlpi, struct vcpu **vcpu)
+{
+    struct vits_itte *itte;
+
+    if ( collid >= its->max_collections )
+        return false;
+
+    if ( vlpi >= its->d->arch.vgic.nr_lpis )
+        return false;
+
+    spin_lock(&its->its_lock);
+    itte = get_itte(its, devid, evid);
+    if ( !itte )
+    {
+        spin_unlock(&its->its_lock);
+        return false;
+    }
+
+    itte->collection = collid;
+    if ( vlpi != SKIP_LPI_UPDATE )
+        itte->vlpi = vlpi;
+
+    if ( vcpu )
+        *vcpu = get_vcpu_from_collection(its, collid);
+
+    put_itte(its, itte);
+    spin_unlock(&its->its_lock);
+
+    return true;
+}
+
 /**************************************
  * Functions that handle ITS commands *
  **************************************/
-- 
2.9.0



* [PATCH v3 15/26] ARM: vITS: handle CLEAR command
  2017-03-31 18:04 [PATCH v3 00/26] arm64: Dom0 ITS emulation Andre Przywara
                   ` (13 preceding siblings ...)
  2017-03-31 18:05 ` [PATCH v3 14/26] ARM: vITS: introduce translation table walks Andre Przywara
@ 2017-03-31 18:05 ` Andre Przywara
  2017-03-31 18:05 ` [PATCH v3 16/26] ARM: vITS: handle INT command Andre Przywara
                   ` (11 subsequent siblings)
  26 siblings, 0 replies; 52+ messages in thread
From: Andre Przywara @ 2017-03-31 18:05 UTC (permalink / raw)
  To: Julien Grall, Stefano Stabellini
  Cc: xen-devel, Shanker Donthineni, Vijay Kilari

This introduces the ITS command handler for the CLEAR command, which
clears the pending state of an LPI.
It also removes an already queued, but not yet injected, IRQ from a VCPU.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
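The flow of the handler, reduced to a standalone sketch: translate the
event to a virtual LPI, validate it, then drop both the latched pending
bit and the queued bit. The flat arrays and all sketch_* / SKETCH_*
names are assumptions for illustration; the real code walks the guest
tables via read_itte() and looks the LPI up with lpi_to_pending():

```c
#include <stdint.h>

#define SKETCH_LPI_PENDING  (1u << 0)
#define SKETCH_QUEUED       (1u << 1)
#define SKETCH_NR_EVENTS    4u
#define SKETCH_NR_LPIS      4u
#define SKETCH_LPI_BASE     8192u   /* LPI INTIDs start at 8192 */

struct sketch_its {
    uint32_t vlpi[SKETCH_NR_EVENTS];   /* event ID -> virtual LPI, 0 = unmapped */
    uint32_t status[SKETCH_NR_LPIS];   /* per-LPI status bits */
};

/* Returns 0 on success, -1 if the event does not map to a known LPI. */
static int sketch_handle_clear(struct sketch_its *its, uint32_t eventid)
{
    uint32_t vlpi;

    if ( eventid >= SKETCH_NR_EVENTS )
        return -1;

    vlpi = its->vlpi[eventid];
    if ( vlpi < SKETCH_LPI_BASE || vlpi - SKETCH_LPI_BASE >= SKETCH_NR_LPIS )
        return -1;

    /* Forget both the latched pending state and a not yet injected IRQ. */
    its->status[vlpi - SKETCH_LPI_BASE] &=
        ~(SKETCH_LPI_PENDING | SKETCH_QUEUED);

    return 0;
}
```

An unmapped or out-of-range event fails without touching any state,
which mirrors the early returns in its_handle_clear().
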
 xen/arch/arm/vgic-v3-its.c | 31 +++++++++++++++++++++++++++++--
 1 file changed, 29 insertions(+), 2 deletions(-)

diff --git a/xen/arch/arm/vgic-v3-its.c b/xen/arch/arm/vgic-v3-its.c
index d75f404..e49df30 100644
--- a/xen/arch/arm/vgic-v3-its.c
+++ b/xen/arch/arm/vgic-v3-its.c
@@ -181,8 +181,8 @@ static void put_itte(struct virt_its *its, struct vits_itte *itte)
  * protect the ITTs with their less-than-page-size granularity.
  * Takes and drops the its_lock.
  */
-bool read_itte(struct virt_its *its, uint32_t devid, uint32_t evid,
-               struct vcpu **vcpu, uint32_t *vlpi)
+static bool read_itte(struct virt_its *its, uint32_t devid, uint32_t evid,
+                      struct vcpu **vcpu, uint32_t *vlpi)
 {
     struct vits_itte *itte;
     int collid;
@@ -267,6 +267,30 @@ static uint64_t its_cmd_mask_field(uint64_t *its_cmd, unsigned int word,
 #define its_cmd_get_target_addr(cmd)    its_cmd_mask_field(cmd, 2, 16, 32)
 #define its_cmd_get_validbit(cmd)       its_cmd_mask_field(cmd, 2, 63,  1)
 
+static int its_handle_clear(struct virt_its *its, uint64_t *cmdptr)
+{
+    uint32_t devid = its_cmd_get_deviceid(cmdptr);
+    uint32_t eventid = its_cmd_get_id(cmdptr);
+    struct pending_irq *p;
+    struct vcpu *vcpu;
+    uint32_t vlpi;
+
+    if ( !read_itte(its, devid, eventid, &vcpu, &vlpi) )
+        return -1;
+
+    p = lpi_to_pending(its->d, vlpi);
+    if ( !p )
+        return -1;
+
+    clear_bit(GIC_IRQ_GUEST_LPI_PENDING, &p->status);
+
+    /* Remove a pending, but not yet injected guest IRQ. */
+    clear_bit(GIC_IRQ_GUEST_QUEUED, &p->status);
+    gic_remove_from_queues(vcpu, vlpi);
+
+    return 0;
+}
+
 #define ITS_CMD_BUFFER_SIZE(baser)      ((((baser) & 0xff) + 1) << 12)
 
 static int vgic_its_handle_cmds(struct domain *d, struct virt_its *its,
@@ -304,6 +328,9 @@ static int vgic_its_handle_cmds(struct domain *d, struct virt_its *its,
 
         switch ( its_cmd_get_command(cmdptr) )
         {
+        case GITS_CMD_CLEAR:
+            ret = its_handle_clear(its, cmdptr);
+            break;
         case GITS_CMD_SYNC:
             /* We handle ITS commands synchronously, so we ignore SYNC. */
 	    break;
-- 
2.9.0
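
As a reading aid for the CLEAR handler above, here is a minimal user-space
sketch of the state change it performs. It is not Xen code: struct model_irq,
the ST_* bit numbers and the on_queue flag are hypothetical stand-ins for
struct pending_irq, the GIC_IRQ_GUEST_* bits and the VCPU queues.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for Xen's GIC_IRQ_GUEST_* status bit numbers. */
enum { ST_QUEUED = 0, ST_LPI_PENDING = 1 };

struct model_irq {
    uint32_t status;  /* one bit per ST_* flag */
    bool on_queue;    /* models membership in a VCPU's queue */
};

/* Model of CLEAR: drop the latched pending state and dequeue the IRQ. */
static void model_clear(struct model_irq *p)
{
    assert(p != NULL);
    p->status &= ~(1u << ST_LPI_PENDING);
    p->status &= ~(1u << ST_QUEUED);
    p->on_queue = false;
}
```

The model leaves out locking and the ITTE lookup that the real handler does
through read_itte() before touching the IRQ.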


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


* [PATCH v3 16/26] ARM: vITS: handle INT command
From: Andre Przywara @ 2017-03-31 18:05 UTC (permalink / raw)
  To: Julien Grall, Stefano Stabellini
  Cc: xen-devel, Shanker Donthineni, Vijay Kilari

The INT command sets a given LPI identified by a DeviceID/EventID pair
as pending and thus triggers it to be injected.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
 xen/arch/arm/vgic-v3-its.c | 30 ++++++++++++++++++++++++++++++
 1 file changed, 30 insertions(+)

diff --git a/xen/arch/arm/vgic-v3-its.c b/xen/arch/arm/vgic-v3-its.c
index e49df30..c6a2e28 100644
--- a/xen/arch/arm/vgic-v3-its.c
+++ b/xen/arch/arm/vgic-v3-its.c
@@ -291,6 +291,33 @@ static int its_handle_clear(struct virt_its *its, uint64_t *cmdptr)
     return 0;
 }
 
+static int its_handle_int(struct virt_its *its, uint64_t *cmdptr)
+{
+    uint32_t devid = its_cmd_get_deviceid(cmdptr);
+    uint32_t eventid = its_cmd_get_id(cmdptr);
+    struct pending_irq *p;
+    struct vcpu *vcpu;
+    uint32_t vlpi;
+
+    if ( !read_itte(its, devid, eventid, &vcpu, &vlpi) )
+        return -1;
+
+    p = lpi_to_pending(its->d, vlpi);
+    if ( !p )
+        return -1;
+
+    /*
+     * If the LPI is enabled, inject it.
+     * If not, store the pending state to inject it once it gets enabled later.
+     */
+    if ( test_bit(GIC_IRQ_GUEST_ENABLED, &p->status) )
+        vgic_vcpu_inject_irq(vcpu, vlpi);
+    else
+        set_bit(GIC_IRQ_GUEST_LPI_PENDING, &p->status);
+
+    return 0;
+}
+
 #define ITS_CMD_BUFFER_SIZE(baser)      ((((baser) & 0xff) + 1) << 12)
 
 static int vgic_its_handle_cmds(struct domain *d, struct virt_its *its,
@@ -331,6 +358,9 @@ static int vgic_its_handle_cmds(struct domain *d, struct virt_its *its,
         case GITS_CMD_CLEAR:
             ret = its_handle_clear(its, cmdptr);
             break;
+        case GITS_CMD_INT:
+            ret = its_handle_int(its, cmdptr);
+            break;
         case GITS_CMD_SYNC:
             /* We handle ITS commands synchronously, so we ignore SYNC. */
 	    break;
-- 
2.9.0
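
The enabled/disabled split in the INT handler above can be sketched in
isolation. This is a simplified model, not the Xen implementation: the
injected counter stands in for vgic_vcpu_inject_irq(), and the ST_* bit
numbers are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for Xen's GIC_IRQ_GUEST_* status bit numbers. */
enum { ST_ENABLED = 0, ST_LPI_PENDING = 1 };

struct model_irq {
    uint32_t status;
    unsigned int injected;  /* counts calls that would reach the VCPU */
};

/* Model of INT: inject if enabled, otherwise latch the pending state. */
static void model_int(struct model_irq *p)
{
    assert(p != NULL);
    if ( p->status & (1u << ST_ENABLED) )
        p->injected++;
    else
        p->status |= 1u << ST_LPI_PENDING;
}
```

A disabled LPI therefore only records that it fired; the latched bit is what
a later enable is expected to pick up.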



* [PATCH v3 17/26] ARM: vITS: handle MAPC command
From: Andre Przywara @ 2017-03-31 18:05 UTC (permalink / raw)
  To: Julien Grall, Stefano Stabellini
  Cc: xen-devel, Shanker Donthineni, Vijay Kilari

The MAPC command associates a given collection ID with a given
redistributor, thus mapping collections to VCPUs.
We just store the vcpu_id in the collection table for that.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
 xen/arch/arm/vgic-v3-its.c | 46 ++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 46 insertions(+)

diff --git a/xen/arch/arm/vgic-v3-its.c b/xen/arch/arm/vgic-v3-its.c
index c6a2e28..8c2eaaa 100644
--- a/xen/arch/arm/vgic-v3-its.c
+++ b/xen/arch/arm/vgic-v3-its.c
@@ -87,6 +87,26 @@ static paddr_t get_baser_phys_addr(uint64_t reg)
         return reg & GENMASK_ULL(47, 12);
 }
 
+static int its_set_collection(struct virt_its *its, uint16_t collid,
+                              uint16_t vcpu_id)
+{
+    paddr_t addr = get_baser_phys_addr(its->baser_coll);
+    uint16_t *coll_table;
+
+    if ( collid >= its->max_collections )
+        return -ENOENT;
+
+    coll_table = map_one_guest_page(its->d, addr + collid * sizeof(uint16_t));
+    if ( !coll_table )
+        return -EFAULT;
+
+    *coll_table = vcpu_id;
+
+    unmap_one_guest_page(coll_table);
+
+    return 0;
+}
+
 /* Must be called with the ITS lock held. */
 static struct vcpu *get_vcpu_from_collection(struct virt_its *its, int collid)
 {
@@ -318,6 +338,29 @@ static int its_handle_int(struct virt_its *its, uint64_t *cmdptr)
     return 0;
 }
 
+static int its_handle_mapc(struct virt_its *its, uint64_t *cmdptr)
+{
+    uint32_t collid = its_cmd_get_collection(cmdptr);
+    uint64_t rdbase = its_cmd_mask_field(cmdptr, 2, 16, 44);
+
+    if ( collid >= its->max_collections )
+        return -1;
+
+    if ( rdbase >= its->d->max_vcpus )
+        return -1;
+
+    spin_lock(&its->its_lock);
+
+    if ( its_cmd_get_validbit(cmdptr) )
+        its_set_collection(its, collid, rdbase);
+    else
+        its_set_collection(its, collid, UNMAPPED_COLLECTION);
+
+    spin_unlock(&its->its_lock);
+
+    return 0;
+}
+
 #define ITS_CMD_BUFFER_SIZE(baser)      ((((baser) & 0xff) + 1) << 12)
 
 static int vgic_its_handle_cmds(struct domain *d, struct virt_its *its,
@@ -361,6 +404,9 @@ static int vgic_its_handle_cmds(struct domain *d, struct virt_its *its,
         case GITS_CMD_INT:
             ret = its_handle_int(its, cmdptr);
             break;
+        case GITS_CMD_MAPC:
+            ret = its_handle_mapc(its, cmdptr);
+            break;
         case GITS_CMD_SYNC:
             /* We handle ITS commands synchronously, so we ignore SYNC. */
 	    break;
-- 
2.9.0
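
To illustrate the MAPC handling above, the following stand-alone sketch
models the collection table as a plain array of 16-bit VCPU IDs. The names
cmd_field(), model_mapc() and MODEL_UNMAPPED are hypothetical; the position
of the collection ID (word 2, bits 15:0) is an assumption of this example,
while the RDbase and valid-bit positions follow the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_UNMAPPED 0xffffu  /* assumed marker, not Xen's actual value */

/* Extract `size` bits starting at bit `shift` of 64-bit command word `word`. */
static uint64_t cmd_field(const uint64_t *cmd, unsigned int word,
                          unsigned int shift, unsigned int size)
{
    assert(cmd != NULL && word < 4 && size >= 1 && shift + size <= 64);
    uint64_t mask = (size == 64) ? ~0ULL : ((1ULL << size) - 1);
    return (cmd[word] >> shift) & mask;
}

/* Model of MAPC: store the target VCPU ID, or mark the collection unmapped. */
static int model_mapc(uint16_t *coll_table, size_t nr_coll,
                      unsigned int max_vcpus, const uint64_t *cmd)
{
    uint64_t collid = cmd_field(cmd, 2, 0, 16);   /* assumed ICID position */
    uint64_t rdbase = cmd_field(cmd, 2, 16, 44);
    bool valid = cmd_field(cmd, 2, 63, 1) != 0;

    if ( collid >= nr_coll || rdbase >= max_vcpus )
        return -1;

    coll_table[collid] = valid ? (uint16_t)rdbase : MODEL_UNMAPPED;
    return 0;
}
```

As in the patch, the RDbase range check applies even when the valid bit is
clear; the sketch does not model guest page mapping or the ITS lock.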



* [PATCH v3 18/26] ARM: vITS: handle MAPD command
From: Andre Przywara @ 2017-03-31 18:05 UTC (permalink / raw)
  To: Julien Grall, Stefano Stabellini
  Cc: xen-devel, Shanker Donthineni, Vijay Kilari

The MAPD command maps a device by associating a memory region for
storing ITEs with a certain device ID.
We store the given guest physical address in the device table, and, if
this command comes from Dom0, tell the host ITS driver about this new
mapping, so it can issue the corresponding host MAPD command and create
the required tables.
We don't map the device tables permanently, as their alignment
requirement is only 256 Bytes, thus making mapping of several tables
complicated. Instead we map the device tables on demand when we need
them later.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
 xen/arch/arm/vgic-v3-its.c | 63 ++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 63 insertions(+)

diff --git a/xen/arch/arm/vgic-v3-its.c b/xen/arch/arm/vgic-v3-its.c
index 8c2eaaa..36b44f2 100644
--- a/xen/arch/arm/vgic-v3-its.c
+++ b/xen/arch/arm/vgic-v3-its.c
@@ -42,6 +42,7 @@
 #define VIRT_ITS_CMDBUF_VALID   3
 struct virt_its {
     struct domain *d;
+    paddr_t doorbell_address;
     spinlock_t vcmd_lock;       /* Protects the virtual command buffer. */
     uint64_t cbaser;
     uint64_t cwriter;
@@ -141,6 +142,27 @@ static struct vcpu *get_vcpu_from_collection(struct virt_its *its, int collid)
 #define DEV_TABLE_ENTRY(addr, bits)                     \
         (((addr) & GENMASK_ULL(51, 8)) | (((bits) - 1) & GENMASK_ULL(7, 0)))
 
+/* Set the address of an ITT for a given device ID. */
+static int its_set_itt_address(struct virt_its *its, uint32_t devid,
+                               paddr_t itt_address, uint32_t nr_bits)
+{
+    paddr_t addr = get_baser_phys_addr(its->baser_dev);
+    uint64_t *itt;
+
+    if ( devid >= its->max_devices )
+        return -ENOENT;
+
+    itt = map_one_guest_page(its->d, addr + devid * sizeof(uint64_t));
+    if ( !itt )
+        return -EFAULT;
+
+    *itt = DEV_TABLE_ENTRY(itt_address, nr_bits);
+
+    unmap_one_guest_page(itt);
+
+    return 0;
+}
+
 /*
  * Lookup the address of the Interrupt Translation Table associated with
  * a device ID and return the address of the ITTE belonging to the event ID
@@ -361,6 +383,44 @@ static int its_handle_mapc(struct virt_its *its, uint64_t *cmdptr)
     return 0;
 }
 
+static int its_handle_mapd(struct virt_its *its, uint64_t *cmdptr)
+{
+    uint32_t devid = its_cmd_get_deviceid(cmdptr);
+    int size = its_cmd_get_size(cmdptr) + 1;
+    bool valid = its_cmd_get_validbit(cmdptr);
+    paddr_t itt_addr = its_cmd_mask_field(cmdptr, 2, 0, 52) &
+                           GENMASK_ULL(51, 8);
+    int ret;
+
+    /*
+     * There is no easy and clean way for Xen to know the ITS device ID of a
+     * particular (PCI) device, so we have to rely on the guest telling
+     * us about it. For *now* we are just using the device ID *Dom0* uses,
+     * because the driver there has the actual knowledge.
+     * Eventually this will be replaced with a dedicated hypercall to
+     * announce pass-through of devices.
+     */
+    if ( is_hardware_domain(its->d) )
+    {
+        /* Dom0's ITSes are mapped 1:1, so both addresses are the same. */
+        ret = gicv3_its_map_guest_device(its->d, its->doorbell_address, devid,
+                                         its->doorbell_address, devid,
+                                         BIT(size), valid);
+        if ( ret )
+            return ret;
+    }
+
+    spin_lock(&its->its_lock);
+    if ( valid )
+        ret = its_set_itt_address(its, devid, itt_addr, size);
+    else
+        ret = its_set_itt_address(its, devid, INVALID_PADDR, 1);
+
+    spin_unlock(&its->its_lock);
+
+    return ret;
+}
+
 #define ITS_CMD_BUFFER_SIZE(baser)      ((((baser) & 0xff) + 1) << 12)
 
 static int vgic_its_handle_cmds(struct domain *d, struct virt_its *its,
@@ -407,6 +467,9 @@ static int vgic_its_handle_cmds(struct domain *d, struct virt_its *its,
         case GITS_CMD_MAPC:
             ret = its_handle_mapc(its, cmdptr);
             break;
+        case GITS_CMD_MAPD:
+            ret = its_handle_mapd(its, cmdptr);
+	    break;
         case GITS_CMD_SYNC:
             /* We handle ITS commands synchronously, so we ignore SYNC. */
 	    break;
-- 
2.9.0
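
The device table entry written by the MAPD handler above packs the ITT
address and the number of event ID bits into one 64-bit word. A minimal
sketch of that encoding, following the DEV_TABLE_ENTRY() macro in the patch,
could look as follows; GENMASK64() and the dev_entry_*() helpers are names
made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Local equivalent of GENMASK_ULL(h, l): bits h..l set, inclusive. */
#define GENMASK64(h, l) (((~0ULL) >> (63 - (h))) & ((~0ULL) << (l)))

/* Pack the 256-byte-aligned ITT address with (nr_bits - 1) in bits 7:0. */
static uint64_t dev_entry_encode(uint64_t itt_addr, unsigned int nr_bits)
{
    assert(nr_bits >= 1 && nr_bits <= 256);
    return (itt_addr & GENMASK64(51, 8)) |
           ((uint64_t)(nr_bits - 1) & GENMASK64(7, 0));
}

/* Recover the ITT address from an encoded entry. */
static uint64_t dev_entry_addr(uint64_t entry)
{
    return entry & GENMASK64(51, 8);
}

/* Recover the number of event ID bits from an encoded entry. */
static unsigned int dev_entry_bits(uint64_t entry)
{
    return (unsigned int)(entry & GENMASK64(7, 0)) + 1;
}
```

Because only bits 51:8 of the address are kept, the low byte is free for the
size field, which matches the 256-byte alignment mentioned in the commit
message.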



* [PATCH v3 19/26] ARM: vITS: handle MAPTI command
From: Andre Przywara @ 2017-03-31 18:05 UTC (permalink / raw)
  To: Julien Grall, Stefano Stabellini
  Cc: xen-devel, Shanker Donthineni, Vijay Kilari

The MAPTI command associates a DeviceID/EventID pair with an LPI/CPU
pair and actually instantiates LPI interrupts.
We connect the already allocated host LPI to this virtual LPI, so that
any triggering IRQ on the host can be quickly forwarded to a guest.
Besides entering the VCPU and the virtual LPI number in the respective
host LPI entry, we also initialize and add the already allocated
struct pending_irq to our radix tree, so that we can now easily find it
by its virtual LPI number.
This exports the vgic_init_pending_irq() function for that purpose.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
 xen/arch/arm/gic-v3-its.c        | 74 ++++++++++++++++++++++++++++++++++++++++
 xen/arch/arm/gic-v3-lpi.c        | 16 +++++++++
 xen/arch/arm/vgic-v3-its.c       | 36 +++++++++++++++++--
 xen/arch/arm/vgic.c              |  2 +-
 xen/include/asm-arm/gic_v3_its.h |  6 ++++
 xen/include/asm-arm/vgic.h       |  1 +
 6 files changed, 132 insertions(+), 3 deletions(-)

diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
index 8db2a09..39f16b2 100644
--- a/xen/arch/arm/gic-v3-its.c
+++ b/xen/arch/arm/gic-v3-its.c
@@ -747,6 +747,80 @@ restart:
     spin_unlock(&d->arch.vgic.its_devices_lock);
 }
 
+/* Must be called with the its_devices_lock held. */
+static struct its_devices *get_its_device(struct domain *d, paddr_t doorbell,
+                                          uint32_t devid)
+{
+    struct rb_node *node = d->arch.vgic.its_devices.rb_node;
+    struct its_devices *dev;
+
+    while ( node )
+    {
+        int cmp;
+
+        dev = rb_entry(node, struct its_devices, rbnode);
+        cmp = compare_its_guest_devices(dev, doorbell, devid);
+
+        if ( !cmp )
+            return dev;
+
+        if ( cmp > 0 )
+            node = node->rb_left;
+        else
+            node = node->rb_right;
+    }
+
+    return NULL;
+}
+
+static uint32_t get_host_lpi(struct its_devices *dev, uint32_t eventid)
+{
+    uint32_t host_lpi = 0;
+
+    if ( dev && (eventid < dev->eventids) )
+    {
+        host_lpi = dev->host_lpi_blocks[eventid / LPI_BLOCK] +
+                                       (eventid % LPI_BLOCK);
+        if ( !is_lpi(host_lpi) )
+            host_lpi = 0;
+    }
+
+    return host_lpi;
+}
+
+/*
+ * Connects the event ID for an already assigned device to the given VCPU/vLPI
+ * pair. The corresponding physical LPI is already mapped on the host side
+ * (when assigning the physical device to the guest), so we just connect the
+ * target VCPU/vLPI pair to that interrupt to inject it properly if it fires.
+ */
+struct pending_irq *gicv3_assign_guest_event(struct domain *d,
+                                             paddr_t doorbell_address,
+                                             uint32_t devid, uint32_t eventid,
+                                             struct vcpu *v, uint32_t virt_lpi)
+{
+    struct its_devices *dev;
+    struct pending_irq *pirq = NULL;
+    uint32_t host_lpi = 0;
+
+    spin_lock(&d->arch.vgic.its_devices_lock);
+    dev = get_its_device(d, doorbell_address, devid);
+    if ( dev )
+    {
+        host_lpi = get_host_lpi(dev, eventid);
+        pirq = &dev->pend_irqs[eventid];
+    }
+    spin_unlock(&d->arch.vgic.its_devices_lock);
+
+    if ( !host_lpi || !pirq )
+        return NULL;
+
+    gicv3_lpi_update_host_entry(host_lpi, d->domain_id,
+                                v ? v->vcpu_id : -1, virt_lpi);
+
+    return pirq;
+}
+
 /* Scan the DT for any ITS nodes and create a list of host ITSes out of it. */
 void gicv3_its_dt_init(const struct dt_device_node *node)
 {
diff --git a/xen/arch/arm/gic-v3-lpi.c b/xen/arch/arm/gic-v3-lpi.c
index 2301d53..a6b728e 100644
--- a/xen/arch/arm/gic-v3-lpi.c
+++ b/xen/arch/arm/gic-v3-lpi.c
@@ -178,6 +178,22 @@ void do_LPI(unsigned int lpi)
     rcu_unlock_domain(d);
 }
 
+void gicv3_lpi_update_host_entry(uint32_t host_lpi, int domain_id,
+                                 unsigned int vcpu_id, uint32_t virt_lpi)
+{
+    union host_lpi *hlpip, hlpi;
+
+    host_lpi -= LPI_OFFSET;
+
+    hlpip = &lpi_data.host_lpis[host_lpi / HOST_LPIS_PER_PAGE][host_lpi % HOST_LPIS_PER_PAGE];
+
+    hlpi.virt_lpi = virt_lpi;
+    hlpi.dom_id = domain_id;
+    hlpi.vcpu_id = vcpu_id;
+
+    write_u64_atomic(&hlpip->data, hlpi.data);
+}
+
 static int gicv3_lpi_allocate_pendtable(uint64_t *reg)
 {
     uint64_t val;
diff --git a/xen/arch/arm/vgic-v3-its.c b/xen/arch/arm/vgic-v3-its.c
index 36b44f2..d9dce3f 100644
--- a/xen/arch/arm/vgic-v3-its.c
+++ b/xen/arch/arm/vgic-v3-its.c
@@ -258,8 +258,8 @@ static bool read_itte(struct virt_its *its, uint32_t devid, uint32_t evid,
 }
 
 #define SKIP_LPI_UPDATE 1
-bool write_itte(struct virt_its *its, uint32_t devid, uint32_t evid,
-                uint32_t collid, uint32_t vlpi, struct vcpu **vcpu)
+static bool write_itte(struct virt_its *its, uint32_t devid, uint32_t evid,
+                       uint32_t collid, uint32_t vlpi, struct vcpu **vcpu)
 {
     struct vits_itte *itte;
 
@@ -421,6 +421,34 @@ static int its_handle_mapd(struct virt_its *its, uint64_t *cmdptr)
     return ret;
 }
 
+static int its_handle_mapti(struct virt_its *its, uint64_t *cmdptr)
+{
+    uint32_t devid = its_cmd_get_deviceid(cmdptr);
+    uint32_t eventid = its_cmd_get_id(cmdptr);
+    uint32_t intid = its_cmd_get_physical_id(cmdptr);
+    uint16_t collid = its_cmd_get_collection(cmdptr);
+    struct pending_irq *pirq;
+    struct vcpu *vcpu;
+
+    if ( its_cmd_get_command(cmdptr) == GITS_CMD_MAPI )
+        intid = eventid;
+
+    pirq = gicv3_assign_guest_event(its->d, its->doorbell_address,
+                                    devid, eventid, vcpu, intid);
+    if ( !pirq )
+        return -1;
+
+    vgic_init_pending_irq(pirq, intid);
+    write_lock(&its->d->arch.vgic.pend_lpi_tree_lock);
+    radix_tree_insert(&its->d->arch.vgic.pend_lpi_tree, intid, pirq);
+    write_unlock(&its->d->arch.vgic.pend_lpi_tree_lock);
+
+    if ( !write_itte(its, devid, eventid, collid, intid, &vcpu) )
+        return -1;
+
+    return 0;
+}
+
 #define ITS_CMD_BUFFER_SIZE(baser)      ((((baser) & 0xff) + 1) << 12)
 
 static int vgic_its_handle_cmds(struct domain *d, struct virt_its *its,
@@ -470,6 +498,10 @@ static int vgic_its_handle_cmds(struct domain *d, struct virt_its *its,
         case GITS_CMD_MAPD:
             ret = its_handle_mapd(its, cmdptr);
 	    break;
+        case GITS_CMD_MAPI:
+        case GITS_CMD_MAPTI:
+            ret = its_handle_mapti(its, cmdptr);
+            break;
         case GITS_CMD_SYNC:
             /* We handle ITS commands synchronously, so we ignore SYNC. */
 	    break;
diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c
index 2aee20f..94eb9c5 100644
--- a/xen/arch/arm/vgic.c
+++ b/xen/arch/arm/vgic.c
@@ -61,7 +61,7 @@ struct vgic_irq_rank *vgic_rank_irq(struct vcpu *v, unsigned int irq)
     return vgic_get_rank(v, rank);
 }
 
-static void vgic_init_pending_irq(struct pending_irq *p, unsigned int virq)
+void vgic_init_pending_irq(struct pending_irq *p, unsigned int virq)
 {
     INIT_LIST_HEAD(&p->inflight);
     INIT_LIST_HEAD(&p->lr_queue);
diff --git a/xen/include/asm-arm/gic_v3_its.h b/xen/include/asm-arm/gic_v3_its.h
index bc9b42a..35a3e22 100644
--- a/xen/include/asm-arm/gic_v3_its.h
+++ b/xen/include/asm-arm/gic_v3_its.h
@@ -161,6 +161,12 @@ void gicv3_its_unmap_all_devices(struct domain *d);
 int gicv3_allocate_host_lpi_block(struct domain *d, uint32_t *first_lpi);
 void gicv3_free_host_lpi_block(uint32_t first_lpi);
 
+struct pending_irq *gicv3_assign_guest_event(struct domain *d, paddr_t doorbell,
+                                             uint32_t devid, uint32_t eventid,
+                                             struct vcpu *v, uint32_t virt_lpi);
+void gicv3_lpi_update_host_entry(uint32_t host_lpi, int domain_id,
+                                 unsigned int vcpu_id, uint32_t virt_lpi);
+
 #else
 
 static LIST_HEAD(host_its_list);
diff --git a/xen/include/asm-arm/vgic.h b/xen/include/asm-arm/vgic.h
index 9f48e9a..3fb7433 100644
--- a/xen/include/asm-arm/vgic.h
+++ b/xen/include/asm-arm/vgic.h
@@ -298,6 +298,7 @@ extern struct vcpu *vgic_get_target_vcpu(struct vcpu *v, unsigned int virq);
 extern void vgic_vcpu_inject_irq(struct vcpu *v, unsigned int virq);
 extern void vgic_vcpu_inject_spi(struct domain *d, unsigned int virq);
 extern void vgic_clear_pending_irqs(struct vcpu *v);
+extern void vgic_init_pending_irq(struct pending_irq *p, unsigned int virq);
 extern struct pending_irq *irq_to_pending(struct vcpu *v, unsigned int irq);
 extern struct pending_irq *spi_to_pending(struct domain *d, unsigned int irq);
 extern struct pending_irq *lpi_to_pending(struct domain *d, unsigned int irq);
-- 
2.9.0
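
The two index calculations used by the MAPTI patch above (event ID to host
LPI, and host LPI to a slot in the paged host LPI table) can be tried out
with the stand-alone sketch below. The constants are assumptions for
illustration: 8192 as the first LPI number, 32 LPIs per allocation block and
512 entries per page; the helper names are hypothetical as well.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_LPI_OFFSET    8192u  /* assumed first LPI number */
#define MODEL_LPI_BLOCK     32u    /* assumed LPIs per allocated block */
#define MODEL_LPIS_PER_PAGE 512u   /* assumed entries per table page */

/* Map an event ID to its host LPI, or 0 if it is out of range or invalid. */
static uint32_t model_host_lpi(const uint32_t *blocks, uint32_t nr_events,
                               uint32_t eventid)
{
    assert(blocks != NULL);
    if ( eventid >= nr_events )
        return 0;

    uint32_t lpi = blocks[eventid / MODEL_LPI_BLOCK] +
                   eventid % MODEL_LPI_BLOCK;
    return (lpi >= MODEL_LPI_OFFSET) ? lpi : 0;
}

struct lpi_slot {
    uint32_t page;   /* first-level index */
    uint32_t index;  /* entry within that page */
};

/* Split a host LPI number into its two-level table position. */
static struct lpi_slot model_lpi_slot(uint32_t host_lpi)
{
    assert(host_lpi >= MODEL_LPI_OFFSET);
    uint32_t n = host_lpi - MODEL_LPI_OFFSET;
    struct lpi_slot slot = { n / MODEL_LPIS_PER_PAGE, n % MODEL_LPIS_PER_PAGE };
    return slot;
}
```

Returning 0 for "no host LPI" works here because valid LPI numbers start
well above zero.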



* [PATCH v3 20/26] ARM: vITS: handle MOVI command
From: Andre Przywara @ 2017-03-31 18:05 UTC (permalink / raw)
  To: Julien Grall, Stefano Stabellini
  Cc: xen-devel, Shanker Donthineni, Vijay Kilari

The MOVI command moves the interrupt affinity from one redistributor
(read: VCPU) to another.
Migration of "live" LPIs is not yet implemented, but we store
the changed affinity in the host LPI structure and in our virtual ITTE.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
 xen/arch/arm/gic-v3-its.c        | 24 ++++++++++++++++++++++++
 xen/arch/arm/gic-v3-lpi.c        | 13 +++++++++++++
 xen/arch/arm/vgic-v3-its.c       | 24 ++++++++++++++++++++++++
 xen/include/asm-arm/gic_v3_its.h |  4 ++++
 4 files changed, 65 insertions(+)

diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
index 39f16b2..f29e70f 100644
--- a/xen/arch/arm/gic-v3-its.c
+++ b/xen/arch/arm/gic-v3-its.c
@@ -821,6 +821,30 @@ struct pending_irq *gicv3_assign_guest_event(struct domain *d,
     return pirq;
 }
 
+/* Changes the target VCPU for a given host LPI assigned to a domain. */
+int gicv3_lpi_change_vcpu(struct domain *d, paddr_t doorbell,
+                          uint32_t devid, uint32_t eventid,
+                          unsigned int vcpu_id)
+{
+    uint32_t host_lpi;
+    struct its_devices *dev;
+
+    spin_lock(&d->arch.vgic.its_devices_lock);
+    dev = get_its_device(d, doorbell, devid);
+    if ( dev )
+        host_lpi = get_host_lpi(dev, eventid);
+    else
+        host_lpi = 0;
+    spin_unlock(&d->arch.vgic.its_devices_lock);
+
+    if ( !host_lpi )
+        return -ENOENT;
+
+    gicv3_lpi_update_host_vcpuid(host_lpi, vcpu_id);
+
+    return 0;
+}
+
 /* Scan the DT for any ITS nodes and create a list of host ITSes out of it. */
 void gicv3_its_dt_init(const struct dt_device_node *node)
 {
diff --git a/xen/arch/arm/gic-v3-lpi.c b/xen/arch/arm/gic-v3-lpi.c
index a6b728e..e889592 100644
--- a/xen/arch/arm/gic-v3-lpi.c
+++ b/xen/arch/arm/gic-v3-lpi.c
@@ -194,6 +194,19 @@ void gicv3_lpi_update_host_entry(uint32_t host_lpi, int domain_id,
     write_u64_atomic(&hlpip->data, hlpi.data);
 }
 
+int gicv3_lpi_update_host_vcpuid(uint32_t host_lpi, unsigned int vcpu_id)
+{
+    union host_lpi *hlpip;
+
+    host_lpi -= LPI_OFFSET;
+
+    hlpip = &lpi_data.host_lpis[host_lpi / HOST_LPIS_PER_PAGE][host_lpi % HOST_LPIS_PER_PAGE];
+
+    write_u16_atomic(&hlpip->vcpu_id, vcpu_id);
+
+    return 0;
+}
+
 static int gicv3_lpi_allocate_pendtable(uint64_t *reg)
 {
     uint64_t val;
diff --git a/xen/arch/arm/vgic-v3-its.c b/xen/arch/arm/vgic-v3-its.c
index d9dce3f..0085719 100644
--- a/xen/arch/arm/vgic-v3-its.c
+++ b/xen/arch/arm/vgic-v3-its.c
@@ -449,6 +449,24 @@ static int its_handle_mapti(struct virt_its *its, uint64_t *cmdptr)
     return 0;
 }
 
+static int its_handle_movi(struct virt_its *its, uint64_t *cmdptr)
+{
+    uint32_t devid = its_cmd_get_deviceid(cmdptr);
+    uint32_t eventid = its_cmd_get_id(cmdptr);
+    int collid = its_cmd_get_collection(cmdptr);
+    struct vcpu *vcpu;
+
+    if ( !write_itte(its, devid, eventid, collid, SKIP_LPI_UPDATE, &vcpu) )
+        return -1;
+
+    /* TODO: lookup currently-in-guest virtual IRQs and migrate them */
+
+    gicv3_lpi_change_vcpu(its->d,
+                          its->doorbell_address, devid, eventid, vcpu->vcpu_id);
+
+    return 0;
+}
+
 #define ITS_CMD_BUFFER_SIZE(baser)      ((((baser) & 0xff) + 1) << 12)
 
 static int vgic_its_handle_cmds(struct domain *d, struct virt_its *its,
@@ -502,6 +520,12 @@ static int vgic_its_handle_cmds(struct domain *d, struct virt_its *its,
         case GITS_CMD_MAPTI:
             ret = its_handle_mapti(its, cmdptr);
             break;
+        case GITS_CMD_MOVALL:
+            gdprintk(XENLOG_G_INFO, "ITS: ignoring MOVALL command\n");
+            break;
+        case GITS_CMD_MOVI:
+            ret = its_handle_movi(its, cmdptr);
+            break;
         case GITS_CMD_SYNC:
             /* We handle ITS commands synchronously, so we ignore SYNC. */
 	    break;
diff --git a/xen/include/asm-arm/gic_v3_its.h b/xen/include/asm-arm/gic_v3_its.h
index 35a3e22..457400b 100644
--- a/xen/include/asm-arm/gic_v3_its.h
+++ b/xen/include/asm-arm/gic_v3_its.h
@@ -164,8 +164,12 @@ void gicv3_free_host_lpi_block(uint32_t first_lpi);
 struct pending_irq *gicv3_assign_guest_event(struct domain *d, paddr_t doorbell,
                                              uint32_t devid, uint32_t eventid,
                                              struct vcpu *v, uint32_t virt_lpi);
+int gicv3_lpi_change_vcpu(struct domain *d, paddr_t doorbell,
+                          uint32_t devid, uint32_t eventid,
+                          unsigned int vcpu_id);
 void gicv3_lpi_update_host_entry(uint32_t host_lpi, int domain_id,
                                  unsigned int vcpu_id, uint32_t virt_lpi);
+int gicv3_lpi_update_host_vcpuid(uint32_t host_lpi, unsigned int vcpu_id);
 
 #else
 
-- 
2.9.0
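
For the MOVI patch above, the host-side update amounts to rewriting only the
VCPU ID of an existing host LPI entry. The sketch below models such an entry
as a packed 64-bit word. The field layout (virt_lpi in bits 31:0, dom_id in
bits 47:32, vcpu_id in bits 63:48) is an assumption of this example, and the
atomic 16-bit store of the real code is not modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout: virt_lpi 31:0, dom_id 47:32, vcpu_id 63:48. */
static uint64_t entry_pack(uint32_t virt_lpi, uint16_t dom_id,
                           uint16_t vcpu_id)
{
    return (uint64_t)virt_lpi | ((uint64_t)dom_id << 32) |
           ((uint64_t)vcpu_id << 48);
}

/* Read back the VCPU ID field of a packed entry. */
static uint16_t entry_vcpu(uint64_t entry)
{
    return (uint16_t)(entry >> 48);
}

/* Model of MOVI on the host entry: replace the VCPU ID, keep the rest. */
static void model_movi(uint64_t *slot, uint16_t vcpu_id)
{
    assert(slot != NULL);
    *slot = (*slot & ~(0xffffULL << 48)) | ((uint64_t)vcpu_id << 48);
}
```

Keeping the virtual LPI and domain ID untouched is the point of having a
separate update path next to the full entry rewrite used for MAPTI.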



* [PATCH v3 21/26] ARM: vITS: handle DISCARD command
From: Andre Przywara @ 2017-03-31 18:05 UTC (permalink / raw)
  To: Julien Grall, Stefano Stabellini
  Cc: xen-devel, Shanker Donthineni, Vijay Kilari

The DISCARD command drops the connection between a DeviceID/EventID
and an LPI/collection pair.
We mark the respective structure entries as not allocated and make
sure that any queued IRQs are removed.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
 xen/arch/arm/vgic-v3-its.c | 30 ++++++++++++++++++++++++++++++
 1 file changed, 30 insertions(+)

diff --git a/xen/arch/arm/vgic-v3-its.c b/xen/arch/arm/vgic-v3-its.c
index 0085719..59553f8 100644
--- a/xen/arch/arm/vgic-v3-its.c
+++ b/xen/arch/arm/vgic-v3-its.c
@@ -467,6 +467,33 @@ static int its_handle_movi(struct virt_its *its, uint64_t *cmdptr)
     return 0;
 }
 
+static int its_handle_discard(struct virt_its *its, uint64_t *cmdptr)
+{
+    uint32_t devid = its_cmd_get_deviceid(cmdptr);
+    uint32_t eventid = its_cmd_get_id(cmdptr);
+    struct pending_irq *pirq;
+    struct vcpu *vcpu;
+    uint32_t vlpi;
+
+    if ( !read_itte(its, devid, eventid, &vcpu, &vlpi) )
+        return -1;
+
+    pirq = lpi_to_pending(its->d, vlpi);
+    if ( pirq )
+    {
+        clear_bit(GIC_IRQ_GUEST_QUEUED, &pirq->status);
+        gic_remove_from_queues(vcpu, vlpi);
+    }
+
+    if ( !write_itte(its, devid, eventid, UNMAPPED_COLLECTION, INVALID_LPI, NULL) )
+        return -1;
+
+    gicv3_assign_guest_event(its->d, its->doorbell_address,
+                             devid, eventid, NULL, 0);
+
+    return 0;
+}
+
 #define ITS_CMD_BUFFER_SIZE(baser)      ((((baser) & 0xff) + 1) << 12)
 
 static int vgic_its_handle_cmds(struct domain *d, struct virt_its *its,
@@ -507,6 +534,9 @@ static int vgic_its_handle_cmds(struct domain *d, struct virt_its *its,
         case GITS_CMD_CLEAR:
             ret = its_handle_clear(its, cmdptr);
             break;
+        case GITS_CMD_DISCARD:
+            ret = its_handle_discard(its, cmdptr);
+            break;
         case GITS_CMD_INT:
             ret = its_handle_int(its, cmdptr);
             break;
-- 
2.9.0
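
The effect of DISCARD on a translation table entry, as described above, can
be sketched with a small array-based model. Everything here is hypothetical:
struct model_itte, the MODEL_* marker values and the choice to reject a
DISCARD of an already unmapped event are assumptions of the example, not
statements about the Xen code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_UNMAPPED    0xffffu  /* assumed "no collection" marker */
#define MODEL_INVALID_LPI 0u       /* assumed "no LPI" marker */

struct model_itte {
    uint32_t vlpi;
    uint16_t collid;
};

/* Model of DISCARD: dequeue the IRQ and invalidate its translation entry. */
static int model_discard(struct model_itte *itt, size_t nr_events,
                         uint32_t eventid, bool *queued)
{
    assert(itt != NULL && queued != NULL);
    if ( eventid >= nr_events || itt[eventid].vlpi == MODEL_INVALID_LPI )
        return -1;

    *queued = false;
    itt[eventid].vlpi = MODEL_INVALID_LPI;
    itt[eventid].collid = MODEL_UNMAPPED;
    return 0;
}
```

The host-side disconnect done through gicv3_assign_guest_event() in the
patch is left out of the model.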



* [PATCH v3 22/26] ARM: vITS: handle INV command
From: Andre Przywara @ 2017-03-31 18:05 UTC (permalink / raw)
  To: Julien Grall, Stefano Stabellini
  Cc: xen-devel, Shanker Donthineni, Vijay Kilari

The INV command instructs the ITS to update the configuration data for
a given LPI by re-reading its entry from the property table.
We don't need to care so much about the priority value, but enabling
or disabling an LPI has some effect: we remove virtual LPIs from, or push
them to, their VCPUs. When an LPI gets enabled, we also check its virtual
pending bit.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
 xen/arch/arm/vgic-v3-its.c | 62 ++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 62 insertions(+)

diff --git a/xen/arch/arm/vgic-v3-its.c b/xen/arch/arm/vgic-v3-its.c
index 59553f8..0c479e0 100644
--- a/xen/arch/arm/vgic-v3-its.c
+++ b/xen/arch/arm/vgic-v3-its.c
@@ -360,6 +360,65 @@ static int its_handle_int(struct virt_its *its, uint64_t *cmdptr)
     return 0;
 }
 
+/*
+ * For a given virtual LPI read the enabled bit and priority from the virtual
+ * property table and update the virtual IRQ's state.
+ * This takes care of removing or pushing of virtual LPIs to their VCPUs.
+ */
+static void update_lpi_enabled_status(struct virt_its *its,
+                                      struct vcpu *vcpu, uint32_t vlpi)
+{
+    struct pending_irq *p = lpi_to_pending(its->d, vlpi);
+    paddr_t proptable_addr;
+    uint8_t *property;
+
+    if ( !p )
+        return;
+
+    proptable_addr = its->d->arch.vgic.rdist_propbase & GENMASK_ULL(51, 12);
+    property = map_one_guest_page(its->d, proptable_addr + vlpi - LPI_OFFSET);
+
+    p->lpi_priority = *property & LPI_PROP_PRIO_MASK;
+
+    if ( *property & LPI_PROP_ENABLED )
+    {
+        unsigned long flags;
+
+        set_bit(GIC_IRQ_GUEST_ENABLED, &p->status);
+        spin_lock_irqsave(&vcpu->arch.vgic.lock, flags);
+        if ( !list_empty(&p->inflight) &&
+             !test_bit(GIC_IRQ_GUEST_VISIBLE, &p->status) )
+            gic_raise_guest_irq(vcpu, vlpi, p->lpi_priority);
+        spin_unlock_irqrestore(&vcpu->arch.vgic.lock, flags);
+
+        /* Check whether the LPI has fired while the guest had it disabled. */
+        if ( test_and_clear_bit(GIC_IRQ_GUEST_LPI_PENDING, &p->status) )
+            vgic_vcpu_inject_irq(vcpu, vlpi);
+    }
+    else
+    {
+        clear_bit(GIC_IRQ_GUEST_ENABLED, &p->status);
+        gic_remove_from_queues(vcpu, vlpi);
+    }
+
+    unmap_one_guest_page(property);
+}
+
+static int its_handle_inv(struct virt_its *its, uint64_t *cmdptr)
+{
+    uint32_t devid = its_cmd_get_deviceid(cmdptr);
+    uint32_t eventid = its_cmd_get_id(cmdptr);
+    struct vcpu *vcpu;
+    uint32_t vlpi;
+
+    if ( !read_itte(its, devid, eventid, &vcpu, &vlpi) )
+        return -1;
+
+    update_lpi_enabled_status(its, vcpu, vlpi);
+
+    return 0;
+}
+
 static int its_handle_mapc(struct virt_its *its, uint64_t *cmdptr)
 {
     uint32_t collid = its_cmd_get_collection(cmdptr);
@@ -540,6 +599,9 @@ static int vgic_its_handle_cmds(struct domain *d, struct virt_its *its,
         case GITS_CMD_INT:
             ret = its_handle_int(its, cmdptr);
             break;
+        case GITS_CMD_INV:
+            ret = its_handle_inv(its, cmdptr);
+            break;
         case GITS_CMD_MAPC:
             ret = its_handle_mapc(its, cmdptr);
             break;
-- 
2.9.0



* [PATCH v3 23/26] ARM: vITS: handle INVALL command
  2017-03-31 18:04 [PATCH v3 00/26] arm64: Dom0 ITS emulation Andre Przywara
                   ` (21 preceding siblings ...)
  2017-03-31 18:05 ` [PATCH v3 22/26] ARM: vITS: handle INV command Andre Przywara
@ 2017-03-31 18:05 ` Andre Przywara
  2017-03-31 18:05 ` [PATCH v3 24/26] ARM: vITS: create and initialize virtual ITSes for Dom0 Andre Przywara
                   ` (3 subsequent siblings)
  26 siblings, 0 replies; 52+ messages in thread
From: Andre Przywara @ 2017-03-31 18:05 UTC (permalink / raw)
  To: Julien Grall, Stefano Stabellini
  Cc: xen-devel, Shanker Donthineni, Vijay Kilari

The INVALL command instructs an ITS to invalidate the configuration
data for all LPIs associated with a given redistributor (read: VCPU).
This is nasty to emulate exactly with our architecture, so we just
iterate over all mapped LPIs in our radix tree and update the
configuration of every one of them.
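
The batched walk used for this can be sketched in isolation as follows.
This is only an illustration under simplifying assumptions: a sorted
array stands in for the radix tree, ex_gang_lookup() is a hypothetical
substitute for radix_tree_gang_lookup(), and the batch size is
arbitrary. The point is the loop shape: look up a batch, remember the
last ID seen, and continue from the next ID without wrapping around at
0xffffffff.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EX_BATCH 4

/*
 * Hypothetical stand-in for a gang lookup: copy up to "max" IDs that are
 * greater than or equal to "first" out of a sorted array.
 */
static size_t ex_gang_lookup(const uint32_t *ids, size_t n, uint32_t first,
                             uint32_t *out, size_t max)
{
    size_t found = 0;

    assert(max > 0);

    for ( size_t i = 0; i < n && found < max; i++ )
        if ( ids[i] >= first )
            out[found++] = ids[i];

    return found;
}

/* Visit every ID in batches and return how many were visited. */
static size_t ex_visit_all(const uint32_t *ids, size_t n)
{
    uint32_t batch[EX_BATCH];
    uint32_t next = 0;
    size_t visited = 0, got;

    do {
        got = ex_gang_lookup(ids, n, next, batch, EX_BATCH);

        for ( size_t i = 0; i < got; i++ )
        {
            next = batch[i];      /* remember the last ID we handled */
            visited++;
        }

        /* Stop on an empty batch, or instead of wrapping past 0xffffffff. */
        if ( got == 0 || next == UINT32_MAX )
            break;

        next++;                   /* resume the search after the last ID */
    } while ( got == EX_BATCH );

    return visited;
}
```

A short batch means the end of the tree has been reached, so the loop
only continues while the lookup returns a full batch.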

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
 xen/arch/arm/vgic-v3-its.c | 46 ++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 46 insertions(+)

diff --git a/xen/arch/arm/vgic-v3-its.c b/xen/arch/arm/vgic-v3-its.c
index 0c479e0..24e7d17 100644
--- a/xen/arch/arm/vgic-v3-its.c
+++ b/xen/arch/arm/vgic-v3-its.c
@@ -419,6 +419,49 @@ static int its_handle_inv(struct virt_its *its, uint64_t *cmdptr)
     return 0;
 }
 
+/*
+ * INVALL updates the per-LPI configuration status for every LPI mapped to
+ * a particular redistributor.
+ * We iterate over all mapped LPIs in our radix tree and update those.
+ */
+static int its_handle_invall(struct virt_its *its, uint64_t *cmdptr)
+{
+    uint32_t collid = its_cmd_get_collection(cmdptr);
+    struct vcpu *vcpu;
+    struct pending_irq *pirqs[16];
+    uint32_t vlpi = 0;
+    int nr_lpis, i;
+
+    /* We may want to revisit this implementation for DomUs. */
+    ASSERT(is_hardware_domain(its->d));
+
+    spin_lock(&its->its_lock);
+    vcpu = get_vcpu_from_collection(its, collid);
+    spin_unlock(&its->its_lock);
+
+    read_lock(&its->d->arch.vgic.pend_lpi_tree_lock);
+
+    do {
+        nr_lpis = radix_tree_gang_lookup(&its->d->arch.vgic.pend_lpi_tree,
+                                         (void **)pirqs, vlpi,
+                                         ARRAY_SIZE(pirqs));
+
+        for ( i = 0; i < nr_lpis; i++ )
+        {
+            vlpi = pirqs[i]->irq;
+            update_lpi_enabled_status(its, vcpu, vlpi);
+        }
+
+        /* Protect from overflow when incrementing 0xffffffff */
+        if ( vlpi == ~0U || ++vlpi >= its->d->arch.vgic.nr_lpis )
+            break;
+    } while ( nr_lpis == ARRAY_SIZE(pirqs) );
+
+    read_unlock(&its->d->arch.vgic.pend_lpi_tree_lock);
+
+    return 0;
+}
+
 static int its_handle_mapc(struct virt_its *its, uint64_t *cmdptr)
 {
     uint32_t collid = its_cmd_get_collection(cmdptr);
@@ -602,6 +645,9 @@ static int vgic_its_handle_cmds(struct domain *d, struct virt_its *its,
         case GITS_CMD_INV:
             ret = its_handle_inv(its, cmdptr);
             break;
+        case GITS_CMD_INVALL:
+            ret = its_handle_invall(its, cmdptr);
+            break;
         case GITS_CMD_MAPC:
             ret = its_handle_mapc(its, cmdptr);
             break;
-- 
2.9.0



* [PATCH v3 24/26] ARM: vITS: create and initialize virtual ITSes for Dom0
  2017-03-31 18:04 [PATCH v3 00/26] arm64: Dom0 ITS emulation Andre Przywara
                   ` (22 preceding siblings ...)
  2017-03-31 18:05 ` [PATCH v3 23/26] ARM: vITS: handle INVALL command Andre Przywara
@ 2017-03-31 18:05 ` Andre Przywara
  2017-03-31 18:05 ` [PATCH v3 25/26] ARM: vITS: create ITS subnodes for Dom0 DT Andre Przywara
                   ` (2 subsequent siblings)
  26 siblings, 0 replies; 52+ messages in thread
From: Andre Przywara @ 2017-03-31 18:05 UTC (permalink / raw)
  To: Julien Grall, Stefano Stabellini
  Cc: xen-devel, Shanker Donthineni, Vijay Kilari

For each hardware ITS create and initialize a virtual ITS for Dom0.
We use the same memory mapped address to keep the doorbell working.
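
For readers unfamiliar with the GITS_BASER<n> encoding used when
initializing the virtual ITS, here is a standalone sketch (not part of
the patch). The EX_* names are invented, and the field positions (table
type in bits [58:56], entry size minus one in bits [52:48]) are my
reading of the GICv3 architecture, so treat them as assumptions.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed field layout of GITS_BASER<n>; names are illustrative only. */
#define EX_BASER_TYPE_SHIFT        56
#define EX_BASER_TYPE_MASK         UINT64_C(0x7)
#define EX_BASER_ENTRY_SIZE_SHIFT  48
#define EX_BASER_ENTRY_SIZE_MASK   UINT64_C(0x1f)

#define EX_BASER_TYPE_DEVICE       UINT64_C(1)
#define EX_BASER_TYPE_COLLECTION   UINT64_C(4)

/* Build the type and entry size fields; the size is stored minus one. */
static uint64_t ex_make_baser(uint64_t type, unsigned int entry_bytes)
{
    assert(entry_bytes >= 1 && entry_bytes <= 32);

    return (type << EX_BASER_TYPE_SHIFT) |
           ((uint64_t)(entry_bytes - 1) << EX_BASER_ENTRY_SIZE_SHIFT);
}

/* Recover the entry size in bytes from a register value. */
static unsigned int ex_baser_entry_bytes(uint64_t baser)
{
    return (unsigned int)((baser >> EX_BASER_ENTRY_SIZE_SHIFT) &
                          EX_BASER_ENTRY_SIZE_MASK) + 1;
}

/* Recover the table type from a register value. */
static unsigned int ex_baser_type(uint64_t baser)
{
    return (unsigned int)((baser >> EX_BASER_TYPE_SHIFT) &
                          EX_BASER_TYPE_MASK);
}
```

Under these assumptions, the values 7 and 1 written into the entry size
field by this patch correspond to 8-byte device table entries and
2-byte collection table entries.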

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
 xen/arch/arm/vgic-v3-its.c       | 32 ++++++++++++++++++++++++++++++++
 xen/arch/arm/vgic-v3.c           | 17 +++++++++++++++++
 xen/include/asm-arm/domain.h     |  1 +
 xen/include/asm-arm/gic_v3_its.h | 16 ++++++++++++++++
 4 files changed, 66 insertions(+)

diff --git a/xen/arch/arm/vgic-v3-its.c b/xen/arch/arm/vgic-v3-its.c
index 24e7d17..c5cf249 100644
--- a/xen/arch/arm/vgic-v3-its.c
+++ b/xen/arch/arm/vgic-v3-its.c
@@ -1074,6 +1074,38 @@ static const struct mmio_handler_ops vgic_its_mmio_handler = {
     .write = vgic_v3_its_mmio_write,
 };
 
+int vgic_v3_its_init_virtual(struct domain *d, paddr_t guest_addr,
+                             unsigned int devid_bits, unsigned int intid_bits)
+{
+    struct virt_its *its;
+    uint64_t base_attr;
+
+    its = xzalloc(struct virt_its);
+    if ( !its )
+        return -ENOMEM;
+
+    base_attr  = GIC_BASER_InnerShareable << GITS_BASER_SHAREABILITY_SHIFT;
+    base_attr |= GIC_BASER_CACHE_SameAsInner << GITS_BASER_OUTER_CACHEABILITY_SHIFT;
+    base_attr |= GIC_BASER_CACHE_RaWaWb << GITS_BASER_INNER_CACHEABILITY_SHIFT;
+
+    its->cbaser  = base_attr;
+    base_attr |= 0ULL << GITS_BASER_PAGE_SIZE_SHIFT;
+    its->baser_dev  = GITS_BASER_TYPE_DEVICE << GITS_BASER_TYPE_SHIFT;
+    its->baser_dev |= (7ULL << GITS_BASER_ENTRY_SIZE_SHIFT) | base_attr;
+    its->baser_coll  = GITS_BASER_TYPE_COLLECTION << GITS_BASER_TYPE_SHIFT;
+    its->baser_coll |= (1ULL << GITS_BASER_ENTRY_SIZE_SHIFT) | base_attr;
+    its->d = d;
+    its->doorbell_address = guest_addr + ITS_DOORBELL_OFFSET;
+    its->devid_bits = devid_bits;
+    its->intid_bits = intid_bits;
+    spin_lock_init(&its->vcmd_lock);
+    spin_lock_init(&its->its_lock);
+
+    register_mmio_handler(d, &vgic_its_mmio_handler, guest_addr, SZ_64K, its);
+
+    return 0;
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c
index 4159fb8..22a7b1b 100644
--- a/xen/arch/arm/vgic-v3.c
+++ b/xen/arch/arm/vgic-v3.c
@@ -31,6 +31,7 @@
 #include <asm/current.h>
 #include <asm/mmio.h>
 #include <asm/gic_v3_defs.h>
+#include <asm/gic_v3_its.h>
 #include <asm/vgic.h>
 #include <asm/vgic-emul.h>
 #include <asm/vreg.h>
@@ -1583,6 +1584,7 @@ static int vgic_v3_domain_init(struct domain *d)
      */
     if ( is_hardware_domain(d) )
     {
+        struct host_its *hw_its;
         unsigned int first_cpu = 0;
 
         d->arch.vgic.dbase = vgic_v3_hw.dbase;
@@ -1608,6 +1610,21 @@ static int vgic_v3_domain_init(struct domain *d)
 
             first_cpu += size / d->arch.vgic.rdist_stride;
         }
+        d->arch.vgic.nr_regions = vgic_v3_hw.nr_rdist_regions;
+
+        list_for_each_entry(hw_its, &host_its_list, entry)
+        {
+            /*
+             * For each host ITS create a virtual ITS using the same
+             * base and thus doorbell address.
+             * Use the same number of device ID bits as the host, and
+             * allow 20 bits for the interrupt ID.
+             */
+            vgic_v3_its_init_virtual(d, hw_its->addr, hw_its->devid_bits, 20);
+
+            d->arch.vgic.has_its = true;
+        }
+
     }
     else
     {
diff --git a/xen/include/asm-arm/domain.h b/xen/include/asm-arm/domain.h
index ad4dfdc..fa227ca 100644
--- a/xen/include/asm-arm/domain.h
+++ b/xen/include/asm-arm/domain.h
@@ -116,6 +116,7 @@ struct arch_domain
         spinlock_t its_devices_lock;        /* Protects the its_devices tree */
         struct radix_tree_root pend_lpi_tree; /* Stores struct pending_irq's */
         rwlock_t pend_lpi_tree_lock;        /* Protects the pend_lpi_tree */
+        bool has_its;
 #endif
     } vgic;
 
diff --git a/xen/include/asm-arm/gic_v3_its.h b/xen/include/asm-arm/gic_v3_its.h
index 457400b..e055735 100644
--- a/xen/include/asm-arm/gic_v3_its.h
+++ b/xen/include/asm-arm/gic_v3_its.h
@@ -149,6 +149,14 @@ uint64_t gicv3_get_redist_address(unsigned int cpu, bool use_pta);
 int gicv3_its_setup_collection(unsigned int cpu);
 
 /*
+ * Create and register a virtual ITS at the given guest address.
+ * If a host ITS is specified, a hardware domain can reach out to that host
+ * ITS to deal with devices and LPI mappings and can enable/disable LPIs.
+ */
+int vgic_v3_its_init_virtual(struct domain *d, paddr_t guest_addr,
+                             unsigned int devid_bits, unsigned int intid_bits);
+
+/*
  * Map a device on the host by allocating an ITT on the host (ITS).
  * "nr_event" specifies how many events (interrupts) this device will need.
  * Setting "valid" to false deallocates the device.
@@ -213,6 +221,14 @@ static inline void gicv3_its_unmap_all_devices(struct domain *d)
 {
 }
 
+static inline int vgic_v3_its_init_virtual(struct domain *d,
+                                           paddr_t guest_addr,
+                                           unsigned int devid_bits,
+                                           unsigned int intid_bits)
+{
+    return 0;
+}
+
 #endif /* CONFIG_HAS_ITS */
 
 #endif
-- 
2.9.0



* [PATCH v3 25/26] ARM: vITS: create ITS subnodes for Dom0 DT
  2017-03-31 18:04 [PATCH v3 00/26] arm64: Dom0 ITS emulation Andre Przywara
                   ` (23 preceding siblings ...)
  2017-03-31 18:05 ` [PATCH v3 24/26] ARM: vITS: create and initialize virtual ITSes for Dom0 Andre Przywara
@ 2017-03-31 18:05 ` Andre Przywara
  2017-03-31 18:05 ` [PATCH v3 26/26] ARM: vGIC: advertise LPI support Andre Przywara
  2017-04-01 20:37 ` [PATCH v3 00/26] arm64: Dom0 ITS emulation Julien Grall
  26 siblings, 0 replies; 52+ messages in thread
From: Andre Przywara @ 2017-03-31 18:05 UTC (permalink / raw)
  To: Julien Grall, Stefano Stabellini
  Cc: xen-devel, Shanker Donthineni, Vijay Kilari

Dom0 expects all ITSes in the system to be propagated to it, so that it
can use MSIs.
Create Dom0 DT nodes for each hardware ITS, keeping the register frame
address the same, as the doorbell address that the Dom0 drivers program
into the BARs has to match the hardware.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
 xen/arch/arm/gic-v3-its.c        | 78 ++++++++++++++++++++++++++++++++++++++++
 xen/arch/arm/gic-v3.c            |  4 ++-
 xen/include/asm-arm/gic_v3_its.h | 13 +++++++
 3 files changed, 94 insertions(+), 1 deletion(-)

diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
index f29e70f..4addef4 100644
--- a/xen/arch/arm/gic-v3-its.c
+++ b/xen/arch/arm/gic-v3-its.c
@@ -20,6 +20,7 @@
 
 #include <xen/lib.h>
 #include <xen/delay.h>
+#include <xen/libfdt/libfdt.h>
 #include <xen/mm.h>
 #include <xen/rbtree.h>
 #include <xen/sched.h>
@@ -845,6 +846,83 @@ int gicv3_lpi_change_vcpu(struct domain *d, paddr_t doorbell,
     return 0;
 }
 
+/*
+ * Create the respective guest DT nodes for a list of host ITSes.
+ * This copies the reg property, so the guest sees the ITS at the same address
+ * as the host.
+ * Giving NULL for the its_list will make it use the list of host ITSes.
+ */
+int gicv3_its_make_dt_nodes(struct list_head *its_list,
+                            const struct domain *d,
+                            const struct dt_device_node *gic,
+                            void *fdt)
+{
+    uint32_t len;
+    int res;
+    const void *prop = NULL;
+    const struct dt_device_node *its = NULL;
+    const struct host_its *its_data;
+
+    if ( !its_list )
+        its_list = &host_its_list;
+
+    if ( list_empty(its_list) )
+        return 0;
+
+    /* The sub-nodes require the ranges property */
+    prop = dt_get_property(gic, "ranges", &len);
+    if ( !prop )
+    {
+        printk(XENLOG_ERR "Can't find ranges property for the gic node\n");
+        return -FDT_ERR_XEN(ENOENT);
+    }
+
+    res = fdt_property(fdt, "ranges", prop, len);
+    if ( res )
+        return res;
+
+    list_for_each_entry(its_data, its_list, entry)
+    {
+        its = its_data->dt_node;
+
+        res = fdt_begin_node(fdt, its->name);
+        if ( res )
+            return res;
+
+        res = fdt_property_string(fdt, "compatible", "arm,gic-v3-its");
+        if ( res )
+            return res;
+
+        res = fdt_property(fdt, "msi-controller", NULL, 0);
+        if ( res )
+            return res;
+
+        if ( its->phandle )
+        {
+            res = fdt_property_cell(fdt, "phandle", its->phandle);
+            if ( res )
+                return res;
+        }
+
+        /* Use the same reg regions as the ITS node in host DTB. */
+        prop = dt_get_property(its, "reg", &len);
+        if ( !prop )
+        {
+            printk(XENLOG_ERR "GICv3: Can't find ITS reg property.\n");
+            res = -FDT_ERR_XEN(ENOENT);
+            return res;
+        }
+
+        res = fdt_property(fdt, "reg", prop, len);
+        if ( res )
+            return res;
+
+        fdt_end_node(fdt);
+    }
+
+    return res;
+}
+
 /* Scan the DT for any ITS nodes and create a list of host ITSes out of it. */
 void gicv3_its_dt_init(const struct dt_device_node *node)
 {
diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
index d92d115..f5b2c00 100644
--- a/xen/arch/arm/gic-v3.c
+++ b/xen/arch/arm/gic-v3.c
@@ -1170,8 +1170,10 @@ static int gicv3_make_hwdom_dt_node(const struct domain *d,
 
     res = fdt_property(fdt, "reg", new_cells, len);
     xfree(new_cells);
+    if ( res )
+        return res;
 
-    return res;
+    return gicv3_its_make_dt_nodes(NULL, d, gic, fdt);
 }
 
 static const hw_irq_controller gicv3_host_irq_type = {
diff --git a/xen/include/asm-arm/gic_v3_its.h b/xen/include/asm-arm/gic_v3_its.h
index e055735..a4300d2 100644
--- a/xen/include/asm-arm/gic_v3_its.h
+++ b/xen/include/asm-arm/gic_v3_its.h
@@ -156,6 +156,12 @@ int gicv3_its_setup_collection(unsigned int cpu);
 int vgic_v3_its_init_virtual(struct domain *d, paddr_t guest_addr,
                              unsigned int devid_bits, unsigned int intid_bits);
 
+/* Given a list of ITSes, create the appropriate DT nodes for a domain. */
+int gicv3_its_make_dt_nodes(struct list_head *its_list,
+                            const struct domain *d,
+                            const struct dt_device_node *gic,
+                            void *fdt);
+
 /*
  * Map a device on the host by allocating an ITT on the host (ITS).
  * "nr_event" specifies how many events (interrupts) this device will need.
@@ -228,6 +234,13 @@ static inline int vgic_v3_its_init_virtual(struct domain *d,
 {
     return 0;
 }
+static inline int gicv3_its_make_dt_nodes(struct list_head *its_list,
+                                       const struct domain *d,
+                                       const struct dt_device_node *gic,
+                                       void *fdt)
+{
+    return 0;
+}
 
 #endif /* CONFIG_HAS_ITS */
 
-- 
2.9.0



* [PATCH v3 26/26] ARM: vGIC: advertise LPI support
  2017-03-31 18:04 [PATCH v3 00/26] arm64: Dom0 ITS emulation Andre Przywara
                   ` (24 preceding siblings ...)
  2017-03-31 18:05 ` [PATCH v3 25/26] ARM: vITS: create ITS subnodes for Dom0 DT Andre Przywara
@ 2017-03-31 18:05 ` Andre Przywara
  2017-04-04 17:06   ` Julien Grall
  2017-04-01 20:37 ` [PATCH v3 00/26] arm64: Dom0 ITS emulation Julien Grall
  26 siblings, 1 reply; 52+ messages in thread
From: Andre Przywara @ 2017-03-31 18:05 UTC (permalink / raw)
  To: Julien Grall, Stefano Stabellini
  Cc: xen-devel, Shanker Donthineni, Vijay Kilari

To let a guest know about the availability of virtual LPIs, set the
respective bits in the virtual GIC registers and let a guest control
the LPI enable bit.
Only report the LPI capability if the host has initialized at least
one ITS.
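
The table sizes a guest ends up asking us to map follow from the IDbits
field it programs. A standalone sketch of that arithmetic (not part of
the patch): the EX_* names are invented, a 4K page size is assumed, and
IDbits is taken from bits [4:0] of the PROPBASER value as "number of ID
bits minus one".

```c
#include <assert.h>
#include <stdint.h>

#define EX_LPI_OFFSET 8192u   /* LPI numbers start at 8192 */
#define EX_PAGE_SIZE  4096u   /* assumed page size */

static uint64_t ex_div_round_up(uint64_t n, uint64_t d)
{
    assert(d != 0);

    return (n + d - 1) / d;
}

/* Number of usable LPIs for a given PROPBASER value. */
static uint64_t ex_nr_lpis(uint64_t propbaser)
{
    unsigned int id_bits = (unsigned int)(propbaser & 0x1f) + 1;
    uint64_t nr_ids = UINT64_C(1) << id_bits;

    return nr_ids > EX_LPI_OFFSET ? nr_ids - EX_LPI_OFFSET : 0;
}

/* Property table: one byte per LPI. */
static uint64_t ex_prop_table_pages(uint64_t nr_lpis)
{
    return ex_div_round_up(nr_lpis, EX_PAGE_SIZE);
}

/* Pending table: one bit per interrupt ID, including the first 8192. */
static uint64_t ex_pend_table_pages(uint64_t nr_lpis)
{
    return ex_div_round_up((nr_lpis + EX_LPI_OFFSET) / 8, EX_PAGE_SIZE);
}
```

With 16 ID bits (field value 15) this gives 57344 LPIs, 14 pages of
property table and 2 pages of pending table per VCPU.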

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
---
 xen/arch/arm/vgic-v3.c | 74 ++++++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 69 insertions(+), 5 deletions(-)

diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c
index 22a7b1b..47dad6a 100644
--- a/xen/arch/arm/vgic-v3.c
+++ b/xen/arch/arm/vgic-v3.c
@@ -169,8 +169,10 @@ static int __vgic_v3_rdistr_rd_mmio_read(struct vcpu *v, mmio_info_t *info,
     switch ( gicr_reg )
     {
     case VREG32(GICR_CTLR):
-        /* We have not implemented LPI's, read zero */
-        goto read_as_zero_32;
+        if ( dabt.size != DABT_WORD ) goto bad_width;
+        *r = vgic_reg32_extract(!!(v->arch.vgic.flags & VGIC_V3_LPIS_ENABLED),
+                                info);
+        return 1;
 
     case VREG32(GICR_IIDR):
         if ( dabt.size != DABT_WORD ) goto bad_width;
@@ -182,16 +184,19 @@ static int __vgic_v3_rdistr_rd_mmio_read(struct vcpu *v, mmio_info_t *info,
         uint64_t typer, aff;
 
         if ( !vgic_reg64_check_access(dabt) ) goto bad_width;
-        /* TBD: Update processor id in [23:8] when ITS support is added */
         aff = (MPIDR_AFFINITY_LEVEL(v->arch.vmpidr, 3) << 56 |
                MPIDR_AFFINITY_LEVEL(v->arch.vmpidr, 2) << 48 |
                MPIDR_AFFINITY_LEVEL(v->arch.vmpidr, 1) << 40 |
                MPIDR_AFFINITY_LEVEL(v->arch.vmpidr, 0) << 32);
         typer = aff;
+        typer |= (v->vcpu_id & 0xffff) << 8;
 
         if ( v->arch.vgic.flags & VGIC_V3_RDIST_LAST )
             typer |= GICR_TYPER_LAST;
 
+        if ( v->domain->arch.vgic.has_its )
+            typer |= GICR_TYPER_PLPIS;
+
         *r = vgic_reg64_extract(typer, info);
 
         return 1;
@@ -434,6 +439,35 @@ static uint64_t sanitize_pendbaser(uint64_t reg)
     return reg;
 }
 
+static void vgic_vcpu_enable_lpis(struct vcpu *v)
+{
+    uint64_t reg = v->domain->arch.vgic.rdist_propbase;
+    unsigned int nr_lpis = BIT((reg & 0x1f) + 1) - LPI_OFFSET;
+    int nr_pages;
+
+    /* The first VCPU to enable LPIs maps the property table. */
+    if ( !v->domain->arch.vgic.nr_lpis )
+    {
+        v->domain->arch.vgic.nr_lpis = nr_lpis;
+
+        nr_pages = DIV_ROUND_UP(nr_lpis, PAGE_SIZE);
+        get_guest_pages(v->domain, reg & GENMASK_ULL(51, 12), nr_pages);
+        gprintk(XENLOG_INFO, "VGIC-v3: VCPU%d mapped %d pages for property table\n",
+               v->vcpu_id, nr_pages);
+    }
+    nr_pages = DIV_ROUND_UP(((nr_lpis + LPI_OFFSET) / 8), PAGE_SIZE);
+    reg = v->arch.vgic.rdist_pendbase;
+
+    get_guest_pages(v->domain, reg & GENMASK_ULL(51, 12), nr_pages);
+
+    gprintk(XENLOG_INFO, "VGIC-v3: VCPU%d mapped %d pages for pending table\n",
+            v->vcpu_id, nr_pages);
+
+    v->arch.vgic.flags |= VGIC_V3_LPIS_ENABLED;
+
+    printk("VGICv3: enabled %u LPIs for VCPU%d\n", nr_lpis, v->vcpu_id);
+}
+
 static int __vgic_v3_rdistr_rd_mmio_write(struct vcpu *v, mmio_info_t *info,
                                           uint32_t gicr_reg,
                                           register_t r)
@@ -444,8 +478,18 @@ static int __vgic_v3_rdistr_rd_mmio_write(struct vcpu *v, mmio_info_t *info,
     switch ( gicr_reg )
     {
     case VREG32(GICR_CTLR):
-        /* LPI's not implemented */
-        goto write_ignore_32;
+        if ( dabt.size != DABT_WORD ) goto bad_width;
+        if ( !v->domain->arch.vgic.has_its )
+            return 1;
+
+        /* LPIs can only be enabled once, but never disabled again. */
+        if ( !(r & GICR_CTLR_ENABLE_LPIS) ||
+             (v->arch.vgic.flags & VGIC_V3_LPIS_ENABLED) )
+            return 1;
+
+        vgic_vcpu_enable_lpis(v);
+
+        return 1;
 
     case VREG32(GICR_IIDR):
         /* RO */
@@ -1045,6 +1089,11 @@ static int vgic_v3_distr_mmio_read(struct vcpu *v, mmio_info_t *info,
         typer = ((ncpus - 1) << GICD_TYPE_CPUS_SHIFT |
                  DIV_ROUND_UP(v->domain->arch.vgic.nr_spis, 32));
 
+        if ( v->domain->arch.vgic.has_its )
+        {
+            typer |= GICD_TYPE_LPIS;
+            irq_bits = 16;
+        }
         typer |= (irq_bits - 1) << GICD_TYPE_ID_BITS_SHIFT;
 
         *r = vgic_reg32_extract(typer, info);
@@ -1666,6 +1715,21 @@ static int vgic_v3_domain_init(struct domain *d)
 
 static void vgic_v3_domain_free(struct domain *d)
 {
+    int nr_pages;
+    struct vcpu *v;
+
+    if ( d->arch.vgic.nr_lpis )
+    {
+        nr_pages = DIV_ROUND_UP(d->arch.vgic.nr_lpis, PAGE_SIZE);
+        put_guest_pages(d, d->arch.vgic.rdist_propbase & GENMASK_ULL(51, 12),
+                        nr_pages);
+
+        nr_pages = DIV_ROUND_UP((d->arch.vgic.nr_lpis + LPI_OFFSET) / 8,
+                                PAGE_SIZE);
+        for_each_vcpu(d, v)
+            put_guest_pages(d, v->arch.vgic.rdist_pendbase & GENMASK_ULL(51, 12),
+                            nr_pages);
+    }
     gicv3_its_unmap_all_devices(d);
     radix_tree_destroy(&d->arch.vgic.pend_lpi_tree, NULL);
     xfree(d->arch.vgic.rdist_regions);
-- 
2.9.0



* Re: [PATCH v3 02/26] ARM: GICv3: allocate LPI pending and property table
  2017-03-31 18:05 ` [PATCH v3 02/26] ARM: GICv3: allocate LPI pending and property table Andre Przywara
@ 2017-03-31 22:59   ` Stefano Stabellini
  2017-04-03  9:05     ` Andre Przywara
  2017-04-03 13:53   ` Julien Grall
  1 sibling, 1 reply; 52+ messages in thread
From: Stefano Stabellini @ 2017-03-31 22:59 UTC (permalink / raw)
  To: Andre Przywara
  Cc: xen-devel, Julien Grall, Stefano Stabellini, Shanker Donthineni,
	Vijay Kilari

On Fri, 31 Mar 2017, Andre Przywara wrote:
> The ARM GICv3 provides a new kind of interrupt called LPIs.
> The pending bits and the configuration data (priority, enable bits) for
> those LPIs are stored in tables in normal memory, which software has to
> provide to the hardware.
> Allocate the required memory, initialize it and hand it over to each
> redistributor. The maximum number of LPIs to be used can be adjusted with
> the command line option "max_lpi_bits", which defaults to 20 bits,
> covering about one million LPIs.
> 
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> ---
>  docs/misc/xen-command-line.markdown |   9 ++
>  xen/arch/arm/Makefile               |   1 +
>  xen/arch/arm/gic-v3-lpi.c           | 209 ++++++++++++++++++++++++++++++++++++
>  xen/arch/arm/gic-v3.c               |  17 +++
>  xen/include/asm-arm/bitops.h        |   1 +
>  xen/include/asm-arm/config.h        |   2 +
>  xen/include/asm-arm/gic_v3_defs.h   |  54 +++++++++-
>  xen/include/asm-arm/gic_v3_its.h    |  14 +++
>  xen/include/asm-arm/irq.h           |   8 ++
>  xen/include/xen/bitops.h            |   5 +-
>  10 files changed, 318 insertions(+), 2 deletions(-)
>  create mode 100644 xen/arch/arm/gic-v3-lpi.c
> 
> diff --git a/docs/misc/xen-command-line.markdown b/docs/misc/xen-command-line.markdown
> index a11fdf9..619016d 100644
> --- a/docs/misc/xen-command-line.markdown
> +++ b/docs/misc/xen-command-line.markdown
> @@ -1158,6 +1158,15 @@ based interrupts. Any higher IRQs will be available for use via PCI MSI.
>  ### maxcpus
>  > `= <integer>`
>  
> +### max\_lpi\_bits
> +> `= <integer>`
> +
> +Specifies the number of ARM GICv3 LPI interrupts to allocate on the host,
> +presented as the number of bits needed to encode it. This must be at least
> +14 and not exceed 32, and each LPI requires one byte (configuration) and
> +one pending bit to be allocated.
> +Defaults to 20 bits (to cover at most 1048576 interrupts).
> +
>  ### mce
>  > `= <integer>`
>  
> diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
> index 54860e0..02a8737 100644
> --- a/xen/arch/arm/Makefile
> +++ b/xen/arch/arm/Makefile
> @@ -19,6 +19,7 @@ obj-y += gic.o
>  obj-y += gic-v2.o
>  obj-$(CONFIG_HAS_GICV3) += gic-v3.o
>  obj-$(CONFIG_HAS_ITS) += gic-v3-its.o
> +obj-$(CONFIG_HAS_ITS) += gic-v3-lpi.o
>  obj-y += guestcopy.o
>  obj-y += hvm.o
>  obj-y += io.o
> diff --git a/xen/arch/arm/gic-v3-lpi.c b/xen/arch/arm/gic-v3-lpi.c
> new file mode 100644
> index 0000000..77f6009
> --- /dev/null
> +++ b/xen/arch/arm/gic-v3-lpi.c
> @@ -0,0 +1,209 @@
> +/*
> + * xen/arch/arm/gic-v3-lpi.c
> + *
> + * ARM GICv3 Locality-specific Peripheral Interrupts (LPI) support
> + *
> + * Copyright (C) 2016,2017 - ARM Ltd
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; under version 2 of the License.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program; If not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#include <xen/lib.h>
> +#include <xen/mm.h>
> +#include <xen/sizes.h>
> +#include <asm/gic.h>
> +#include <asm/gic_v3_defs.h>
> +#include <asm/gic_v3_its.h>
> +#include <asm/io.h>
> +#include <asm/page.h>
> +
> +#define LPI_PROPTABLE_NEEDS_FLUSHING    (1U << 0)
> +/* Global state */
> +static struct {
> +    /* The global LPI property table, shared by all redistributors. */
> +    uint8_t *lpi_property;
> +    /*
> +     * Number of physical LPIs the host supports. This is a property of
> +     * the GIC hardware. We depart from the habit of naming these things
> +     * "physical" in Xen, as the GICv3/4 spec uses the term "physical LPI"
> +     * in a different context to differentiate them from "virtual LPIs".
> +     */
> +    unsigned long int nr_host_lpis;
> +    unsigned int flags;
> +} lpi_data;
> +
> +struct lpi_redist_data {
> +    void                *pending_table;
> +};
> +
> +static DEFINE_PER_CPU(struct lpi_redist_data, lpi_redist);
> +
> +#define MAX_PHYS_LPIS   (lpi_data.nr_host_lpis - LPI_OFFSET)
> +
> +static int gicv3_lpi_allocate_pendtable(uint64_t *reg)
> +{
> +    uint64_t val;
> +    void *pendtable;
> +
> +    if ( this_cpu(lpi_redist).pending_table )
> +        return -EBUSY;
> +
> +    val  = GIC_BASER_CACHE_RaWaWb << GICR_PENDBASER_INNER_CACHEABILITY_SHIFT;
> +    val |= GIC_BASER_CACHE_SameAsInner << GICR_PENDBASER_OUTER_CACHEABILITY_SHIFT;
> +    val |= GIC_BASER_InnerShareable << GICR_PENDBASER_SHAREABILITY_SHIFT;
> +
> +    /*
> +     * The pending table holds one bit per LPI and even covers bits for
> +     * interrupt IDs below 8192, so we allocate the full range.
> +     * The GICv3 imposes a 64KB alignment requirement, also requires
> +     * physically contiguous memory.
> +     */
> +    pendtable = _xzalloc(lpi_data.nr_host_lpis / 8, SZ_64K);
> +    if ( !pendtable )
> +        return -ENOMEM;
> +
> +    /* Make sure the physical address can be encoded in the register. */
> +    if ( (virt_to_maddr(pendtable) & ~GENMASK_ULL(51, 16)) )
> +    {
> +        xfree(pendtable);
> +        return -ERANGE;
> +    }
> +    clean_and_invalidate_dcache_va_range(pendtable,
> +                                         lpi_data.nr_host_lpis / 8);
> +
> +    this_cpu(lpi_redist).pending_table = pendtable;
> +
> +    val |= GICR_PENDBASER_PTZ;
> +
> +    val |= virt_to_maddr(pendtable);
> +
> +    *reg = val;
> +
> +    return 0;
> +}
> +
> +/*
> + * Tell a redistributor about the (shared) property table, allocating one
> + * if not already done.
> + */
> +static int gicv3_lpi_set_proptable(void __iomem * rdist_base)
> +{
> +    uint64_t reg;
> +
> +    reg  = GIC_BASER_CACHE_RaWaWb << GICR_PROPBASER_INNER_CACHEABILITY_SHIFT;
> +    reg |= GIC_BASER_CACHE_SameAsInner << GICR_PROPBASER_OUTER_CACHEABILITY_SHIFT;
> +    reg |= GIC_BASER_InnerShareable << GICR_PROPBASER_SHAREABILITY_SHIFT;
> +
> +    /*
> +     * The property table is shared across all redistributors, so allocate
> +     * this only once, but return the same value on subsequent calls.
> +     */
> +    if ( !lpi_data.lpi_property )
> +    {
> +        /* The property table holds one byte per LPI. */
> +        void *table = _xmalloc(lpi_data.nr_host_lpis, SZ_4K);
> +
> +        if ( !table )
> +            return -ENOMEM;
> +
> +        /* Make sure the physical address can be encoded in the register. */
> +        if ( (virt_to_maddr(table) & ~GENMASK_ULL(51, 12)) )
> +        {
> +            xfree(table);
> +            return -ERANGE;
> +        }
> +        memset(table, GIC_PRI_IRQ | LPI_PROP_RES1, MAX_PHYS_LPIS);
> +        clean_and_invalidate_dcache_va_range(table, MAX_PHYS_LPIS);
> +        lpi_data.lpi_property = table;
> +    }
> +
> +    /* Encode the number of bits needed, minus one */
> +    reg |= (fls(lpi_data.nr_host_lpis - 1) - 1);
> +
> +    reg |= virt_to_maddr(lpi_data.lpi_property);
> +
> +    writeq_relaxed(reg, rdist_base + GICR_PROPBASER);
> +    reg = readq_relaxed(rdist_base + GICR_PROPBASER);
> +
> +    /* If we can't do shareable, we have to drop cacheability as well. */
> +    if ( !(reg & GICR_PROPBASER_SHAREABILITY_MASK) )
> +    {
> +        reg &= ~GICR_PROPBASER_INNER_CACHEABILITY_MASK;
> +        reg |= GIC_BASER_CACHE_nC << GICR_PROPBASER_INNER_CACHEABILITY_SHIFT;
> +    }
> +
> +    /* Remember that we have to flush the property table if non-cacheable. */
> +    if ( (reg & GICR_PROPBASER_INNER_CACHEABILITY_MASK) <= GIC_BASER_CACHE_nC )
> +    {
> +        lpi_data.flags |= LPI_PROPTABLE_NEEDS_FLUSHING;
> +        /* Update the redistributor's knowledge about the attributes. */
> +        writeq_relaxed(reg, rdist_base + GICR_PROPBASER);
> +    }
> +
> +    return 0;
> +}
> +
> +int gicv3_lpi_init_rdist(void __iomem * rdist_base)
> +{
> +    uint32_t reg;
> +    uint64_t table_reg;
> +    int ret;
> +
> +    /* We don't support LPIs without an ITS. */
> +    if ( !gicv3_its_host_has_its() )
> +        return -ENODEV;
> +
> +    /* Make sure LPIs are disabled before setting up the tables. */
> +    reg = readl_relaxed(rdist_base + GICR_CTLR);
> +    if ( reg & GICR_CTLR_ENABLE_LPIS )
> +        return -EBUSY;
> +
> +    ret = gicv3_lpi_allocate_pendtable(&table_reg);
> +    if ( ret )
> +        return ret;
> +    writeq_relaxed(table_reg, rdist_base + GICR_PENDBASER);
> +    table_reg = readq_relaxed(rdist_base + GICR_PENDBASER);
> +
> +    /* If the hardware reports non-shareable, drop cacheability as well. */
> +    if ( !(table_reg & GICR_PENDBASER_SHAREABILITY_MASK) )
> +    {
> +        table_reg &= ~GICR_PENDBASER_SHAREABILITY_MASK;
> +        table_reg &= ~GICR_PENDBASER_INNER_CACHEABILITY_MASK;
> +        table_reg |= GIC_BASER_CACHE_nC << GICR_PENDBASER_INNER_CACHEABILITY_SHIFT;
> +
> +        writeq_relaxed(table_reg, rdist_base + GICR_PENDBASER);
> +    }
> +
> +    return gicv3_lpi_set_proptable(rdist_base);
> +}
> +
> +static unsigned int max_lpi_bits = 20;
> +integer_param("max_lpi_bits", max_lpi_bits);

The only thing missing is a check on whether the user actually passed
max_lpi_bits, with a warning if she did not (or if the resulting memory
usage is too high).

Look at the way dom0_mem is implemented for an example.
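
To illustrate the kind of check meant here, a minimal standalone sketch
follows. It is not Xen code and not what dom0_mem does: the helper names
sanitize_lpi_bits() and proptable_bytes(), the user_set flag and the
LPI_BITS_WARN threshold are all assumptions made up for this example. The
only facts taken from the patch are that the property table holds one byte
per LPI and that LPI numbers start at 8192 (LPI_OFFSET).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define LPI_OFFSET_BITS  13U   /* LPI numbers start at 8192 == 1 << 13 */
#define LPI_BITS_WARN    24U   /* hypothetical threshold: 16 MiB table */

/* The property table holds one byte per LPI, so 2^bits bytes in total. */
static uint64_t proptable_bytes(unsigned int bits)
{
    assert(bits < 64);
    return 1ULL << bits;
}

/*
 * Pick the number of LPI ID bits to use: the smaller of what the hardware
 * offers and what the command line asked for. Returns 0 if the result
 * leaves no room for any LPI above LPI_OFFSET.
 */
static unsigned int sanitize_lpi_bits(unsigned int hw_bits,
                                      unsigned int user_bits, bool user_set)
{
    unsigned int bits = hw_bits < user_bits ? hw_bits : user_bits;

    if ( !user_set )
        printf("max_lpi_bits not given, defaulting to %u\n", user_bits);

    if ( bits <= LPI_OFFSET_BITS )
    {
        printf("%u ID bits leave no room for LPIs, disabling them\n", bits);
        return 0;
    }

    if ( bits > LPI_BITS_WARN )
        printf("warning: LPI property table needs %llu bytes\n",
               (unsigned long long)proptable_bytes(bits));

    return bits;
}
```

With the default of 20 bits this gives a 1 MiB property table, so the
warning would only trigger for values the user picked deliberately.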



> +int gicv3_lpi_init_host_lpis(unsigned int hw_lpi_bits)
> +{
> +    lpi_data.nr_host_lpis = BIT_ULL(min(hw_lpi_bits, max_lpi_bits));
> +
> +    printk("GICv3: using at most %lu LPIs on the host.\n", MAX_PHYS_LPIS);
> +
> +    return 0;
> +}
> +
> +/*
> + * Local variables:
> + * mode: C
> + * c-file-style: "BSD"
> + * c-basic-offset: 4
> + * indent-tabs-mode: nil
> + * End:
> + */
> diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
> index 1512521..36cd269 100644
> --- a/xen/arch/arm/gic-v3.c
> +++ b/xen/arch/arm/gic-v3.c
> @@ -548,6 +548,9 @@ static void __init gicv3_dist_init(void)
>      type = readl_relaxed(GICD + GICD_TYPER);
>      nr_lines = 32 * ((type & GICD_TYPE_LINES) + 1);
>  
> +    if ( type & GICD_TYPE_LPIS )
> +        gicv3_lpi_init_host_lpis(GICD_TYPE_ID_BITS(type));
> +
>      printk("GICv3: %d lines, (IID %8.8x).\n",
>             nr_lines, readl_relaxed(GICD + GICD_IIDR));
>  
> @@ -660,6 +663,20 @@ static int __init gicv3_populate_rdist(void)
>              if ( (typer >> 32) == aff )
>              {
>                  this_cpu(rbase) = ptr;
> +
> +                if ( typer & GICR_TYPER_PLPIS )
> +                {
> +                    int ret;
> +
> +                    ret = gicv3_lpi_init_rdist(ptr);
> +                    if ( ret && ret != -ENODEV )
> +                    {
> +                        printk("GICv3: CPU%d: Cannot initialize LPIs: %d\n",
> +                               smp_processor_id(), ret);
> +                        break;
> +                    }
> +                }
> +
>                  printk("GICv3: CPU%d: Found redistributor in region %d @%p\n",
>                          smp_processor_id(), i, ptr);
>                  return 0;
> diff --git a/xen/include/asm-arm/bitops.h b/xen/include/asm-arm/bitops.h
> index bda8898..1cbfb9e 100644
> --- a/xen/include/asm-arm/bitops.h
> +++ b/xen/include/asm-arm/bitops.h
> @@ -24,6 +24,7 @@
>  #define BIT(nr)                 (1UL << (nr))
>  #define BIT_MASK(nr)            (1UL << ((nr) % BITS_PER_WORD))
>  #define BIT_WORD(nr)            ((nr) / BITS_PER_WORD)
> +#define BIT_ULL(nr)             (1ULL << (nr))
>  #define BITS_PER_BYTE           8
>  
>  #define ADDR (*(volatile int *) addr)
> diff --git a/xen/include/asm-arm/config.h b/xen/include/asm-arm/config.h
> index ba61f65..f064e8a 100644
> --- a/xen/include/asm-arm/config.h
> +++ b/xen/include/asm-arm/config.h
> @@ -19,6 +19,8 @@
>  #define BITS_PER_LONG (BYTES_PER_LONG << 3)
>  #define POINTER_ALIGN BYTES_PER_LONG
>  
> +#define BITS_PER_LONG_LONG (sizeof (long long) * BITS_PER_BYTE)
> +
>  /* xen_ulong_t is always 64 bits */
>  #define BITS_PER_XEN_ULONG 64
>  
> diff --git a/xen/include/asm-arm/gic_v3_defs.h b/xen/include/asm-arm/gic_v3_defs.h
> index 6bd25a5..7cdebc5 100644
> --- a/xen/include/asm-arm/gic_v3_defs.h
> +++ b/xen/include/asm-arm/gic_v3_defs.h
> @@ -44,7 +44,10 @@
>  #define GICC_SRE_EL2_ENEL1           (1UL << 3)
>  
>  /* Additional bits in GICD_TYPER defined by GICv3 */
> -#define GICD_TYPE_ID_BITS_SHIFT 19
> +#define GICD_TYPE_ID_BITS_SHIFT      19
> +#define GICD_TYPE_ID_BITS(r)     ((((r) >> GICD_TYPE_ID_BITS_SHIFT) & 0x1f) + 1)
> +
> +#define GICD_TYPE_LPIS               (1U << 17)
>  
>  #define GICD_CTLR_RWP                (1UL << 31)
>  #define GICD_CTLR_ARE_NS             (1U << 4)
> @@ -95,12 +98,61 @@
>  #define GICR_IGRPMODR0               (0x0D00)
>  #define GICR_NSACR                   (0x0E00)
>  
> +#define GICR_CTLR_ENABLE_LPIS        (1U << 0)
> +
>  #define GICR_TYPER_PLPIS             (1U << 0)
>  #define GICR_TYPER_VLPIS             (1U << 1)
>  #define GICR_TYPER_LAST              (1U << 4)
>  
> +/* For specifying the inner cacheability type only */
> +#define GIC_BASER_CACHE_nCnB         0ULL
> +/* For specifying the outer cacheability type only */
> +#define GIC_BASER_CACHE_SameAsInner  0ULL
> +#define GIC_BASER_CACHE_nC           1ULL
> +#define GIC_BASER_CACHE_RaWt         2ULL
> +#define GIC_BASER_CACHE_RaWb         3ULL
> +#define GIC_BASER_CACHE_WaWt         4ULL
> +#define GIC_BASER_CACHE_WaWb         5ULL
> +#define GIC_BASER_CACHE_RaWaWt       6ULL
> +#define GIC_BASER_CACHE_RaWaWb       7ULL
> +#define GIC_BASER_CACHE_MASK         7ULL
> +
> +#define GIC_BASER_NonShareable       0ULL
> +#define GIC_BASER_InnerShareable     1ULL
> +#define GIC_BASER_OuterShareable     2ULL
> +
> +#define GICR_PROPBASER_OUTER_CACHEABILITY_SHIFT         56
> +#define GICR_PROPBASER_OUTER_CACHEABILITY_MASK               \
> +        (7UL << GICR_PROPBASER_OUTER_CACHEABILITY_SHIFT)
> +#define GICR_PROPBASER_SHAREABILITY_SHIFT               10
> +#define GICR_PROPBASER_SHAREABILITY_MASK                     \
> +        (3UL << GICR_PROPBASER_SHAREABILITY_SHIFT)
> +#define GICR_PROPBASER_INNER_CACHEABILITY_SHIFT         7
> +#define GICR_PROPBASER_INNER_CACHEABILITY_MASK               \
> +        (7UL << GICR_PROPBASER_INNER_CACHEABILITY_SHIFT)
> +#define GICR_PROPBASER_RES0_MASK                             \
> +        (GENMASK_ULL(63, 59) | GENMASK_ULL(55, 52) | GENMASK_ULL(6, 5))
> +
> +#define GICR_PENDBASER_SHAREABILITY_SHIFT               10
> +#define GICR_PENDBASER_INNER_CACHEABILITY_SHIFT         7
> +#define GICR_PENDBASER_OUTER_CACHEABILITY_SHIFT         56
> +#define GICR_PENDBASER_SHAREABILITY_MASK                     \
> +        (3UL << GICR_PENDBASER_SHAREABILITY_SHIFT)
> +#define GICR_PENDBASER_INNER_CACHEABILITY_MASK               \
> +        (7UL << GICR_PENDBASER_INNER_CACHEABILITY_SHIFT)
> +#define GICR_PENDBASER_OUTER_CACHEABILITY_MASK               \
> +        (7UL << GICR_PENDBASER_OUTER_CACHEABILITY_SHIFT)
> +#define GICR_PENDBASER_PTZ                              BIT(62)
> +#define GICR_PENDBASER_RES0_MASK                             \
> +        (BIT(63) | GENMASK_ULL(61, 59) | GENMASK_ULL(55, 52) |       \
> +         GENMASK_ULL(15, 12) | GENMASK_ULL(6, 0))
> +
>  #define DEFAULT_PMR_VALUE            0xff
>  
> +#define LPI_PROP_PRIO_MASK           0xfc
> +#define LPI_PROP_RES1                (1 << 1)
> +#define LPI_PROP_ENABLED             (1 << 0)
> +
>  #define GICH_VMCR_EOI                (1 << 9)
>  #define GICH_VMCR_VENG1              (1 << 1)
>  
> diff --git a/xen/include/asm-arm/gic_v3_its.h b/xen/include/asm-arm/gic_v3_its.h
> index 765a655..219d109 100644
> --- a/xen/include/asm-arm/gic_v3_its.h
> +++ b/xen/include/asm-arm/gic_v3_its.h
> @@ -40,6 +40,11 @@ void gicv3_its_dt_init(const struct dt_device_node *node);
>  
>  bool gicv3_its_host_has_its(void);
>  
> +int gicv3_lpi_init_rdist(void __iomem * rdist_base);
> +
> +/* Initialize the host structures for LPIs. */
> +int gicv3_lpi_init_host_lpis(unsigned int nr_lpis);
> +
>  #else
>  
>  static LIST_HEAD(host_its_list);
> @@ -53,6 +58,15 @@ static inline bool gicv3_its_host_has_its(void)
>      return false;
>  }
>  
> +static inline int gicv3_lpi_init_rdist(void __iomem * rdist_base)
> +{
> +    return -ENODEV;
> +}
> +
> +static inline int gicv3_lpi_init_host_lpis(unsigned int nr_lpis)
> +{
> +    return 0;
> +}
>  #endif /* CONFIG_HAS_ITS */
>  
>  #endif
> diff --git a/xen/include/asm-arm/irq.h b/xen/include/asm-arm/irq.h
> index 8f7a167..13528c0 100644
> --- a/xen/include/asm-arm/irq.h
> +++ b/xen/include/asm-arm/irq.h
> @@ -19,8 +19,16 @@ struct arch_irq_desc {
>  };
>  
>  #define NR_LOCAL_IRQS	32
> +
> +/*
> + * This only covers the interrupts that Xen cares about, so SGIs, PPIs and
> + * SPIs. LPIs are too numerous and are only propagated to guests, so they
> + * are not included in this number.
> + */
>  #define NR_IRQS		1024
>  
> +#define LPI_OFFSET      8192
> +
>  #define nr_irqs NR_IRQS
>  #define nr_static_irqs NR_IRQS
>  #define arch_hwdom_irqs(domid) NR_IRQS
> diff --git a/xen/include/xen/bitops.h b/xen/include/xen/bitops.h
> index bd0883a..9261e06 100644
> --- a/xen/include/xen/bitops.h
> +++ b/xen/include/xen/bitops.h
> @@ -5,11 +5,14 @@
>  /*
>   * Create a contiguous bitmask starting at bit position @l and ending at
>   * position @h. For example
> - * GENMASK(30, 21) gives us the 32bit vector 0x01fe00000.
> + * GENMASK(30, 21) gives us the 32bit vector 0x7fe00000.
>   */
>  #define GENMASK(h, l) \
>      (((~0UL) << (l)) & (~0UL >> (BITS_PER_LONG - 1 - (h))))
>  
> +#define GENMASK_ULL(h, l) \
> +    (((~0ULL) << (l)) & (~0ULL >> (BITS_PER_LONG_LONG - 1 - (h))))
> +
>  /*
>   * ffs: find first bit set. This is defined the same way as
>   * the libc and compiler builtin ffs routines, therefore
> -- 
> 2.9.0
> 
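
For reference, the mask arithmetic used in this patch can be checked in
isolation. The sketch below is a standalone approximation, not the Xen
headers: GENMASK_ULL is copied from the hunk above, while fits_baser_addr()
and encode_idbits() are hypothetical helpers that mirror the address check
and the "number of bits, minus one" encoding in gicv3_lpi_set_proptable().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BITS_PER_LONG_LONG 64

/* Contiguous mask covering bit positions l..h, both inclusive. */
#define GENMASK_ULL(h, l) \
    (((~0ULL) << (l)) & (~0ULL >> (BITS_PER_LONG_LONG - 1 - (h))))

/* True if paddr is 4K aligned and fits into bits [51:12] of the register. */
static bool fits_baser_addr(uint64_t paddr)
{
    return !(paddr & ~GENMASK_ULL(51, 12));
}

/*
 * Value for the IDbits field: the number of bits needed to hold nr_lpis
 * distinct IDs, minus one. nr_lpis is expected to be a power of two.
 */
static unsigned int encode_idbits(uint64_t nr_lpis)
{
    unsigned int bits = 0;

    assert(nr_lpis >= 2 && (nr_lpis & (nr_lpis - 1)) == 0);

    while ( (1ULL << bits) < nr_lpis )
        bits++;

    return bits - 1;
}
```

GENMASK_ULL(30, 21) evaluates to 0x7fe00000 here, which agrees with the
corrected comment in xen/include/xen/bitops.h.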

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


* Re: [PATCH v3 03/26] ARM: GICv3 ITS: allocate device and collection table
  2017-03-31 18:05 ` [PATCH v3 03/26] ARM: GICv3 ITS: allocate device and collection table Andre Przywara
@ 2017-03-31 23:06   ` Stefano Stabellini
  2017-04-03 15:38   ` Julien Grall
  1 sibling, 0 replies; 52+ messages in thread
From: Stefano Stabellini @ 2017-03-31 23:06 UTC (permalink / raw)
  To: Andre Przywara
  Cc: xen-devel, Julien Grall, Stefano Stabellini, Shanker Donthineni,
	Vijay Kilari

On Fri, 31 Mar 2017, Andre Przywara wrote:
> Each ITS maps a pair of a DeviceID (for instance derived from a PCI
> b/d/f triplet) and an EventID (the MSI payload or interrupt ID) to a
> pair of LPI number and collection ID, which points to the target CPU.
> This mapping is stored in the device and collection tables, which software
> has to provide for the ITS to use.
> Allocate the required memory and hand it to the ITS.
> The maximum number of devices is limited to a compile-time constant
> exposed in Kconfig.
> 
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>

Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
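
As a side note on the sizing logic in its_map_baser(), here is a minimal
standalone sketch of the arithmetic. It is an approximation for
illustration, not the patch itself: the function names are hypothetical,
and it uses 64-bit arithmetic throughout, which is an assumption of this
sketch rather than what the patch does. The page size encoding, the 256
page limit and the 64K address layout are taken from the code below.

```c
#include <assert.h>
#include <stdint.h>

#define GENMASK_ULL(h, l) \
    (((~0ULL) << (l)) & (~0ULL >> (64 - 1 - (h))))

/* The BASER page size field encodes 4K, 16K and 64K as 0, 1 and 2. */
#define BASER_PAGE_BITS(sz) ((sz) * 2 + 12)
#define BASER_MAX_PAGES     256ULL

/* Table size in bytes: rounded up to whole pages, capped at 256 pages. */
static uint64_t baser_table_size(uint64_t nr_items, unsigned int entry_size,
                                 unsigned int pagesz)
{
    uint64_t page = 1ULL << BASER_PAGE_BITS(pagesz);
    uint64_t size;

    assert(pagesz <= 2 && nr_items > 0 && entry_size > 0);

    size = (nr_items * entry_size + page - 1) & ~(page - 1);
    if ( size > BASER_MAX_PAGES * page )
        size = BASER_MAX_PAGES * page;

    return size;
}

/* The size field holds the number of pages, minus one. */
static uint64_t baser_size_field(uint64_t table_size, unsigned int pagesz)
{
    return (table_size >> BASER_PAGE_BITS(pagesz)) - 1;
}

/* For 64K pages, address bits 51-48 are stored in register bits 15-12. */
static uint64_t baser_encode_addr(uint64_t addr, unsigned int page_bits)
{
    uint64_t ret = addr & GENMASK_ULL(47, page_bits);

    if ( page_bits < 16 )
        return ret;

    return ret | ((addr & GENMASK_ULL(51, 48)) >> (48 - 12));
}
```

With 8-byte entries and 64K pages the cap of 256 pages corresponds to 2^21
device table entries, so larger device ID spaces do not fit into a flat
table.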


> ---
>  docs/misc/xen-command-line.markdown |   9 ++
>  xen/arch/arm/gic-v3-its.c           | 168 ++++++++++++++++++++++++++++++++++++
>  xen/arch/arm/gic-v3.c               |   3 +
>  xen/include/asm-arm/gic_v3_its.h    |  64 +++++++++++++-
>  4 files changed, 243 insertions(+), 1 deletion(-)
> 
> diff --git a/docs/misc/xen-command-line.markdown b/docs/misc/xen-command-line.markdown
> index 619016d..c67c925 100644
> --- a/docs/misc/xen-command-line.markdown
> +++ b/docs/misc/xen-command-line.markdown
> @@ -1158,6 +1158,15 @@ based interrupts. Any higher IRQs will be available for use via PCI MSI.
>  ### maxcpus
>  > `= <integer>`
>  
> +### max\_its\_device\_bits
> +> `= <integer>`
> +
> +Specifies the maximum number of devices using MSIs on the ARM GICv3 ITS
> +controller to allocate table entries for. Each table entry uses a hardware
> +specific size, typically 8 or 16 bytes. This value is given as the number
> +of bits required to hold one device ID.
> +Defaults to the machine provided value, which is at most 32 bits.
> +
>  ### max\_lpi\_bits
>  > `= <integer>`
>  
> diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
> index 4056e5b..bfdb7ac 100644
> --- a/xen/arch/arm/gic-v3-its.c
> +++ b/xen/arch/arm/gic-v3-its.c
> @@ -19,8 +19,10 @@
>   */
>  
>  #include <xen/lib.h>
> +#include <xen/mm.h>
>  #include <asm/gic_v3_defs.h>
>  #include <asm/gic_v3_its.h>
> +#include <asm/io.h>
>  
>  LIST_HEAD(host_its_list);
>  
> @@ -29,6 +31,172 @@ bool gicv3_its_host_has_its(void)
>      return !list_empty(&host_its_list);
>  }
>  
> +#define BASER_ATTR_MASK                                           \
> +        ((0x3UL << GITS_BASER_SHAREABILITY_SHIFT)               | \
> +         (0x7UL << GITS_BASER_OUTER_CACHEABILITY_SHIFT)         | \
> +         (0x7UL << GITS_BASER_INNER_CACHEABILITY_SHIFT))
> +#define BASER_RO_MASK   (GENMASK_ULL(58, 56) | GENMASK_ULL(52, 48))
> +
> +/* Check that the physical address can be encoded in the PROPBASER register. */
> +static bool check_baser_phys_addr(void *vaddr, unsigned int page_bits)
> +{
> +    paddr_t paddr = virt_to_maddr(vaddr);
> +
> +    return (!(paddr & ~GENMASK_ULL(page_bits < 16 ? 47 : 51, page_bits)));
> +}
> +
> +static uint64_t encode_propbaser_phys_addr(paddr_t addr, unsigned int page_bits)
> +{
> +    uint64_t ret = addr & GENMASK_ULL(47, page_bits);
> +
> +    if ( page_bits < 16 )
> +        return ret;
> +
> +    /* For 64K pages address bits 51-48 are encoded in bits 15-12. */
> +    return ret | ((addr & GENMASK_ULL(51, 48)) >> (48 - 12));
> +}
> +
> +/* The ITS BASE registers work with page sizes of 4K, 16K or 64K. */
> +#define BASER_PAGE_BITS(sz) ((sz) * 2 + 12)
> +
> +static int its_map_baser(void __iomem *basereg, uint64_t regc,
> +                         unsigned int nr_items)
> +{
> +    uint64_t attr, reg;
> +    unsigned int entry_size = GITS_BASER_ENTRY_SIZE(regc);
> +    unsigned int pagesz = 2;    /* try 64K pages first, then go down. */
> +    unsigned int table_size;
> +    void *buffer;
> +
> +    attr  = GIC_BASER_InnerShareable << GITS_BASER_SHAREABILITY_SHIFT;
> +    attr |= GIC_BASER_CACHE_SameAsInner << GITS_BASER_OUTER_CACHEABILITY_SHIFT;
> +    attr |= GIC_BASER_CACHE_RaWaWb << GITS_BASER_INNER_CACHEABILITY_SHIFT;
> +
> +    /*
> +     * Setup the BASE register with the attributes that we like. Then read
> +     * it back and see what sticks (page size, cacheability and shareability
> +     * attributes), retrying if necessary.
> +     */
> +retry:
> +    table_size = ROUNDUP(nr_items * entry_size, BIT(BASER_PAGE_BITS(pagesz)));
> +    /* The BASE registers support at most 256 pages. */
> +    table_size = min(table_size, 256U << BASER_PAGE_BITS(pagesz));
> +
> +    buffer = _xzalloc(table_size, BIT(BASER_PAGE_BITS(pagesz)));
> +    if ( !buffer )
> +        return -ENOMEM;
> +
> +    if ( !check_baser_phys_addr(buffer, BASER_PAGE_BITS(pagesz)) )
> +    {
> +        xfree(buffer);
> +        return -ERANGE;
> +    }
> +
> +    reg  = attr;
> +    reg |= (pagesz << GITS_BASER_PAGE_SIZE_SHIFT);
> +    reg |= (table_size >> BASER_PAGE_BITS(pagesz)) - 1;
> +    reg |= regc & BASER_RO_MASK;
> +    reg |= GITS_VALID_BIT;
> +    reg |= encode_propbaser_phys_addr(virt_to_maddr(buffer),
> +                                      BASER_PAGE_BITS(pagesz));
> +
> +    writeq_relaxed(reg, basereg);
> +    regc = readq_relaxed(basereg);
> +
> +    /* The host didn't like our attributes, just use what it returned. */
> +    if ( (regc & BASER_ATTR_MASK) != attr )
> +    {
> +        /* If we can't map it shareable, drop cacheability as well. */
> +        if ( (regc & GITS_BASER_SHAREABILITY_MASK) == GIC_BASER_NonShareable )
> +        {
> +            regc &= ~GITS_BASER_INNER_CACHEABILITY_MASK;
> +            writeq_relaxed(regc, basereg);
> +        }
> +        attr = regc & BASER_ATTR_MASK;
> +    }
> +    if ( (regc & GITS_BASER_INNER_CACHEABILITY_MASK) <= GIC_BASER_CACHE_nC )
> +        clean_and_invalidate_dcache_va_range(buffer, table_size);
> +
> +    /* If the host accepted our page size, we are done. */
> +    if ( ((regc >> GITS_BASER_PAGE_SIZE_SHIFT) & 0x3UL) == pagesz )
> +        return 0;
> +
> +    xfree(buffer);
> +
> +    if ( pagesz-- > 0 )
> +        goto retry;
> +
> +    /* None of the page sizes was accepted, give up */
> +    return -EINVAL;
> +}
> +
> +/* Allow a user to limit the number of devices. */
> +static unsigned int max_its_device_bits = 32;
> +integer_param("max_its_device_bits", max_its_device_bits);
> +
> +static int gicv3_its_init_single_its(struct host_its *hw_its)
> +{
> +    uint64_t reg;
> +    int i, ret;
> +
> +    hw_its->its_base = ioremap_nocache(hw_its->addr, hw_its->size);
> +    if ( !hw_its->its_base )
> +        return -ENOMEM;
> +
> +    reg = readq_relaxed(hw_its->its_base + GITS_TYPER);
> +    hw_its->devid_bits = GITS_TYPER_DEVICE_ID_BITS(reg);
> +    hw_its->devid_bits = min(hw_its->devid_bits, max_its_device_bits);
> +
> +    for ( i = 0; i < GITS_BASER_NR_REGS; i++ )
> +    {
> +        void __iomem *basereg = hw_its->its_base + GITS_BASER0 + i * 8;
> +        unsigned int type;
> +
> +        reg = readq_relaxed(basereg);
> +        type = (reg & GITS_BASER_TYPE_MASK) >> GITS_BASER_TYPE_SHIFT;
> +        switch ( type )
> +        {
> +        case GITS_BASER_TYPE_NONE:
> +            continue;
> +        case GITS_BASER_TYPE_DEVICE:
> +            ret = its_map_baser(basereg, reg, BIT(hw_its->devid_bits));
> +            if ( ret )
> +                return ret;
> +            break;
> +        case GITS_BASER_TYPE_COLLECTION:
> +            ret = its_map_baser(basereg, reg, num_possible_cpus());
> +            if ( ret )
> +                return ret;
> +            break;
> +        /* In case this is a GICv4, provide a (dummy) vPE table as well. */
> +        case GITS_BASER_TYPE_VCPU:
> +            ret = its_map_baser(basereg, reg, 1);
> +            if ( ret )
> +                return ret;
> +            break;
> +        default:
> +            continue;
> +        }
> +    }
> +
> +    return 0;
> +}
> +
> +int gicv3_its_init(void)
> +{
> +    struct host_its *hw_its;
> +    int ret;
> +
> +    list_for_each_entry(hw_its, &host_its_list, entry)
> +    {
> +        ret = gicv3_its_init_single_its(hw_its);
> +        if ( ret )
> +            return ret;
> +    }
> +
> +    return 0;
> +}
> +
>  /* Scan the DT for any ITS nodes and create a list of host ITSes out of it. */
>  void gicv3_its_dt_init(const struct dt_device_node *node)
>  {
> diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
> index 36cd269..b84bc40 100644
> --- a/xen/arch/arm/gic-v3.c
> +++ b/xen/arch/arm/gic-v3.c
> @@ -1590,6 +1590,9 @@ static int __init gicv3_init(void)
>      spin_lock(&gicv3.lock);
>  
>      gicv3_dist_init();
> +    res = gicv3_its_init();
> +    if ( res )
> +        panic("GICv3: ITS: initialization failed: %d\n", res);
>      res = gicv3_cpu_init();
>      gicv3_hyp_init();
>  
> diff --git a/xen/include/asm-arm/gic_v3_its.h b/xen/include/asm-arm/gic_v3_its.h
> index 219d109..badb644 100644
> --- a/xen/include/asm-arm/gic_v3_its.h
> +++ b/xen/include/asm-arm/gic_v3_its.h
> @@ -20,6 +20,60 @@
>  #ifndef __ASM_ARM_ITS_H__
>  #define __ASM_ARM_ITS_H__
>  
> +#define GITS_CTLR                       0x000
> +#define GITS_IIDR                       0x004
> +#define GITS_TYPER                      0x008
> +#define GITS_CBASER                     0x080
> +#define GITS_CWRITER                    0x088
> +#define GITS_CREADR                     0x090
> +#define GITS_BASER_NR_REGS              8
> +#define GITS_BASER0                     0x100
> +#define GITS_BASER1                     0x108
> +#define GITS_BASER2                     0x110
> +#define GITS_BASER3                     0x118
> +#define GITS_BASER4                     0x120
> +#define GITS_BASER5                     0x128
> +#define GITS_BASER6                     0x130
> +#define GITS_BASER7                     0x138
> +
> +/* Register bits */
> +#define GITS_VALID_BIT                  BIT_ULL(63)
> +
> +#define GITS_CTLR_QUIESCENT             BIT(31)
> +#define GITS_CTLR_ENABLE                BIT(0)
> +
> +#define GITS_TYPER_DEVIDS_SHIFT         13
> +#define GITS_TYPER_DEVIDS_MASK          (0x1fUL << GITS_TYPER_DEVIDS_SHIFT)
> +#define GITS_TYPER_DEVICE_ID_BITS(r)    (((r & GITS_TYPER_DEVIDS_MASK) >> \
> +                                               GITS_TYPER_DEVIDS_SHIFT) + 1)
> +
> +#define GITS_IIDR_VALUE                 0x34c
> +
> +#define GITS_BASER_INDIRECT             BIT_ULL(62)
> +#define GITS_BASER_INNER_CACHEABILITY_SHIFT        59
> +#define GITS_BASER_TYPE_SHIFT           56
> +#define GITS_BASER_TYPE_MASK            (7ULL << GITS_BASER_TYPE_SHIFT)
> +#define GITS_BASER_OUTER_CACHEABILITY_SHIFT        53
> +#define GITS_BASER_TYPE_NONE            0UL
> +#define GITS_BASER_TYPE_DEVICE          1UL
> +#define GITS_BASER_TYPE_VCPU            2UL
> +#define GITS_BASER_TYPE_CPU             3UL
> +#define GITS_BASER_TYPE_COLLECTION      4UL
> +#define GITS_BASER_TYPE_RESERVED5       5UL
> +#define GITS_BASER_TYPE_RESERVED6       6UL
> +#define GITS_BASER_TYPE_RESERVED7       7UL
> +#define GITS_BASER_ENTRY_SIZE_SHIFT     48
> +#define GITS_BASER_ENTRY_SIZE(reg)                                       \
> +                        (((reg >> GITS_BASER_ENTRY_SIZE_SHIFT) & 0x1f) + 1)
> +#define GITS_BASER_SHAREABILITY_SHIFT   10
> +#define GITS_BASER_PAGE_SIZE_SHIFT      8
> +#define GITS_BASER_RO_MASK              (GITS_BASER_TYPE_MASK | \
> +                                        (31UL << GITS_BASER_ENTRY_SIZE_SHIFT) |\
> +                                        GITS_BASER_INDIRECT)
> +#define GITS_BASER_SHAREABILITY_MASK   (0x3ULL << GITS_BASER_SHAREABILITY_SHIFT)
> +#define GITS_BASER_OUTER_CACHEABILITY_MASK   (0x7ULL << GITS_BASER_OUTER_CACHEABILITY_SHIFT)
> +#define GITS_BASER_INNER_CACHEABILITY_MASK   (0x7ULL << GITS_BASER_INNER_CACHEABILITY_SHIFT)
> +
>  #include <xen/device_tree.h>
>  
>  /* data structure for each hardware ITS */
> @@ -28,6 +82,8 @@ struct host_its {
>      const struct dt_device_node *dt_node;
>      paddr_t addr;
>      paddr_t size;
> +    void __iomem *its_base;
> +    unsigned int devid_bits;
>  };
>  
>  
> @@ -42,8 +98,9 @@ bool gicv3_its_host_has_its(void);
>  
>  int gicv3_lpi_init_rdist(void __iomem * rdist_base);
>  
> -/* Initialize the host structures for LPIs. */
> +/* Initialize the host structures for LPIs and the host ITSes. */
>  int gicv3_lpi_init_host_lpis(unsigned int nr_lpis);
> +int gicv3_its_init(void);
>  
>  #else
>  
> @@ -67,6 +124,11 @@ static inline int gicv3_lpi_init_host_lpis(unsigned int nr_lpis)
>  {
>      return 0;
>  }
> +
> +static inline int gicv3_its_init(void)
> +{
> +    return 0;
> +}
>  #endif /* CONFIG_HAS_ITS */
>  
>  #endif
> -- 
> 2.9.0
> 


* Re: [PATCH v3 01/26] ARM: GICv3 ITS: parse and store ITS subnodes from hardware DT
  2017-03-31 18:05 ` [PATCH v3 01/26] ARM: GICv3 ITS: parse and store ITS subnodes from hardware DT Andre Przywara
@ 2017-03-31 23:08   ` Stefano Stabellini
  0 siblings, 0 replies; 52+ messages in thread
From: Stefano Stabellini @ 2017-03-31 23:08 UTC (permalink / raw)
  To: Andre Przywara
  Cc: xen-devel, Julien Grall, Stefano Stabellini, Shanker Donthineni,
	Vijay Kilari

On Fri, 31 Mar 2017, Andre Przywara wrote:
> Parse the DT GIC subnodes to find every ITS MSI controller the hardware
> offers. Store that information in a list to both propagate all of them
> later to Dom0, but also to be able to iterate over all ITSes.
> This introduces an ITS Kconfig option.
> 
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> ---
>  xen/arch/arm/Kconfig             |  4 +++
>  xen/arch/arm/Makefile            |  1 +
>  xen/arch/arm/gic-v3-its.c        | 73 ++++++++++++++++++++++++++++++++++++++++
>  xen/arch/arm/gic-v3.c            | 10 +++---
>  xen/include/asm-arm/gic_v3_its.h | 67 ++++++++++++++++++++++++++++++++++++
>  5 files changed, 151 insertions(+), 4 deletions(-)
>  create mode 100644 xen/arch/arm/gic-v3-its.c
>  create mode 100644 xen/include/asm-arm/gic_v3_its.h
> 
> diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
> index 2e023d1..bf64c61 100644
> --- a/xen/arch/arm/Kconfig
> +++ b/xen/arch/arm/Kconfig
> @@ -45,6 +45,10 @@ config ACPI
>  config HAS_GICV3
>  	bool
>  
> +config HAS_ITS
> +        bool "GICv3 ITS MSI controller support"
> +        depends on HAS_GICV3
> +
>  endmenu

This needs a dependency on EXPERT.


>  menu "ARM errata workaround via the alternative framework"
> diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
> index 7afb8a3..54860e0 100644
> --- a/xen/arch/arm/Makefile
> +++ b/xen/arch/arm/Makefile
> @@ -18,6 +18,7 @@ obj-$(EARLY_PRINTK) += early_printk.o
>  obj-y += gic.o
>  obj-y += gic-v2.o
>  obj-$(CONFIG_HAS_GICV3) += gic-v3.o
> +obj-$(CONFIG_HAS_ITS) += gic-v3-its.o
>  obj-y += guestcopy.o
>  obj-y += hvm.o
>  obj-y += io.o
> diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
> new file mode 100644
> index 0000000..4056e5b
> --- /dev/null
> +++ b/xen/arch/arm/gic-v3-its.c
> @@ -0,0 +1,73 @@
> +/*
> + * xen/arch/arm/gic-v3-its.c
> + *
> + * ARM GICv3 Interrupt Translation Service (ITS) support
> + *
> + * Copyright (C) 2016,2017 - ARM Ltd
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; under version 2 of the License.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program; If not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#include <xen/lib.h>
> +#include <asm/gic_v3_defs.h>
> +#include <asm/gic_v3_its.h>
> +
> +LIST_HEAD(host_its_list);
> +
> +bool gicv3_its_host_has_its(void)
> +{
> +    return !list_empty(&host_its_list);
> +}
> +
> +/* Scan the DT for any ITS nodes and create a list of host ITSes out of it. */
> +void gicv3_its_dt_init(const struct dt_device_node *node)
> +{
> +    const struct dt_device_node *its = NULL;
> +    struct host_its *its_data;
> +
> +    /*
> +     * Check for ITS MSI subnodes. If any, add the ITS register
> +     * frames to the ITS list.
> +     */
> +    dt_for_each_child_node(node, its)
> +    {
> +        uint64_t addr, size;
> +
> +        if ( !dt_device_is_compatible(its, "arm,gic-v3-its") )
> +            continue;
> +
> +        if ( dt_device_get_address(its, 0, &addr, &size) )
> +            panic("GICv3: Cannot find a valid ITS frame address");
> +
> +        its_data = xzalloc(struct host_its);
> +        if ( !its_data )
> +            panic("GICv3: Cannot allocate memory for ITS frame");
> +
> +        its_data->addr = addr;
> +        its_data->size = size;
> +        its_data->dt_node = its;
> +
> +        printk("GICv3: Found ITS @0x%lx\n", addr);
> +
> +        list_add_tail(&its_data->entry, &host_its_list);
> +    }
> +}
> +
> +/*
> + * Local variables:
> + * mode: C
> + * c-file-style: "BSD"
> + * c-basic-offset: 4
> + * indent-tabs-mode: nil
> + * End:
> + */
> diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
> index 955591b..1512521 100644
> --- a/xen/arch/arm/gic-v3.c
> +++ b/xen/arch/arm/gic-v3.c
> @@ -43,6 +43,7 @@
>  #include <asm/device.h>
>  #include <asm/gic.h>
>  #include <asm/gic_v3_defs.h>
> +#include <asm/gic_v3_its.h>
>  #include <asm/cpufeature.h>
>  #include <asm/acpi.h>
>  
> @@ -1228,11 +1229,12 @@ static void __init gicv3_dt_init(void)
>       */
>      res = dt_device_get_address(node, 1 + gicv3.rdist_count,
>                                  &cbase, &csize);
> -    if ( res )
> -        return;
> +    if ( !res )
> +        dt_device_get_address(node, 1 + gicv3.rdist_count + 2,
> +                              &vbase, &vsize);
>  
> -    dt_device_get_address(node, 1 + gicv3.rdist_count + 2,
> -                          &vbase, &vsize);
> +    /* Check for ITS child nodes and build the host ITS list accordingly. */
> +    gicv3_its_dt_init(node);
>  }
>  
>  static int gicv3_iomem_deny_access(const struct domain *d)
> diff --git a/xen/include/asm-arm/gic_v3_its.h b/xen/include/asm-arm/gic_v3_its.h
> new file mode 100644
> index 0000000..765a655
> --- /dev/null
> +++ b/xen/include/asm-arm/gic_v3_its.h
> @@ -0,0 +1,67 @@
> +/*
> + * ARM GICv3 ITS support
> + *
> + * Andre Przywara <andre.przywara@arm.com>
> + * Copyright (c) 2016,2017 ARM Ltd.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; under version 2 of the License.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program; If not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#ifndef __ASM_ARM_ITS_H__
> +#define __ASM_ARM_ITS_H__
> +
> +#include <xen/device_tree.h>
> +
> +/* data structure for each hardware ITS */
> +struct host_its {
> +    struct list_head entry;
> +    const struct dt_device_node *dt_node;
> +    paddr_t addr;
> +    paddr_t size;
> +};
> +
> +
> +#ifdef CONFIG_HAS_ITS
> +
> +extern struct list_head host_its_list;
> +
> +/* Parse the host DT and pick up all host ITSes. */
> +void gicv3_its_dt_init(const struct dt_device_node *node);
> +
> +bool gicv3_its_host_has_its(void);
> +
> +#else
> +
> +static LIST_HEAD(host_its_list);
> +
> +static inline void gicv3_its_dt_init(const struct dt_device_node *node)
> +{
> +}
> +
> +static inline bool gicv3_its_host_has_its(void)
> +{
> +    return false;
> +}
> +
> +#endif /* CONFIG_HAS_ITS */
> +
> +#endif
> +
> +/*
> + * Local variables:
> + * mode: C
> + * c-file-style: "BSD"
> + * c-basic-offset: 4
> + * indent-tabs-mode: nil
> + * End:
> + */
> -- 
> 2.9.0
> 


* Re: [PATCH v3 04/26] ARM: GICv3 ITS: map ITS command buffer
  2017-03-31 18:05 ` [PATCH v3 04/26] ARM: GICv3 ITS: map ITS command buffer Andre Przywara
@ 2017-03-31 23:10   ` Stefano Stabellini
  2017-04-03 16:00   ` Julien Grall
  1 sibling, 0 replies; 52+ messages in thread
From: Stefano Stabellini @ 2017-03-31 23:10 UTC (permalink / raw)
  To: Andre Przywara
  Cc: xen-devel, Julien Grall, Stefano Stabellini, Shanker Donthineni,
	Vijay Kilari

On Fri, 31 Mar 2017, Andre Przywara wrote:
> Instead of directly manipulating the tables in memory, an ITS driver
> sends commands via a ring buffer in normal system memory to the ITS h/w
> to create or alter the LPI mappings.
> Allocate memory for that buffer and tell the ITS about it to be able
> to send ITS commands.
> 
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>

Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
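
For readers less familiar with the command queue, the ring buffer handling
that this buffer enables can be sketched as below. This is a simplified
software model for illustration only, not the driver code: struct cmd_queue
and the helper names are hypothetical, the writep/readp fields merely stand
in for the GITS_CWRITER and GITS_CREADR offsets, and the 32-byte command
size is an assumption based on the GICv3 ITS command format. The 1 MiB
queue size and the 4K granule of the size field come from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define ITS_CMD_SIZE      32U               /* bytes per ITS command */
#define ITS_CMD_QUEUE_SZ  (1024U * 1024U)   /* SZ_1M, as in the patch */

/* Hypothetical software model of the command queue state. */
struct cmd_queue {
    uint8_t *buf;      /* queue memory, ITS_CMD_QUEUE_SZ bytes */
    uint32_t writep;   /* models GITS_CWRITER: next slot software fills */
    uint32_t readp;    /* models GITS_CREADR: next slot the ITS reads */
};

static uint8_t queue_mem[ITS_CMD_QUEUE_SZ];

/* One slot always stays free, so that "full" and "empty" can be told apart. */
static bool queue_full(const struct cmd_queue *q)
{
    return ((q->writep + ITS_CMD_SIZE) % ITS_CMD_QUEUE_SZ) == q->readp;
}

/* Copy one command into the ring and advance the write offset with wrap. */
static bool queue_send(struct cmd_queue *q, const uint64_t cmd[4])
{
    if ( queue_full(q) )
        return false;

    memcpy(q->buf + q->writep, cmd, ITS_CMD_SIZE);
    q->writep = (q->writep + ITS_CMD_SIZE) % ITS_CMD_QUEUE_SZ;

    return true;
}

/* The CBASER size field counts 4K pages, minus one. */
static uint64_t cbaser_size_field(void)
{
    return (ITS_CMD_QUEUE_SZ / 4096U) - 1;
}
```

In a real driver the write offset would be published to the hardware after
the copy, with a cache flush in between if the queue turned out to be
non-cacheable, which is what the HOST_ITS_FLUSH_CMD_QUEUE flag records.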


> ---
>  xen/arch/arm/gic-v3-its.c        | 53 ++++++++++++++++++++++++++++++++++++++++
>  xen/include/asm-arm/gic_v3_its.h |  6 +++++
>  2 files changed, 59 insertions(+)
> 
> diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
> index bfdb7ac..9a86769 100644
> --- a/xen/arch/arm/gic-v3-its.c
> +++ b/xen/arch/arm/gic-v3-its.c
> @@ -20,10 +20,13 @@
>  
>  #include <xen/lib.h>
>  #include <xen/mm.h>
> +#include <xen/sizes.h>
>  #include <asm/gic_v3_defs.h>
>  #include <asm/gic_v3_its.h>
>  #include <asm/io.h>
>  
> +#define ITS_CMD_QUEUE_SZ                SZ_1M
> +
>  LIST_HEAD(host_its_list);
>  
>  bool gicv3_its_host_has_its(void)
> @@ -56,6 +59,51 @@ static uint64_t encode_propbaser_phys_addr(paddr_t addr, unsigned int page_bits)
>      return ret | ((addr & GENMASK_ULL(51, 48)) >> (48 - 12));
>  }
>  
> +static void *its_map_cbaser(struct host_its *its)
> +{
> +    void __iomem *cbasereg = its->its_base + GITS_CBASER;
> +    uint64_t reg;
> +    void *buffer;
> +
> +    reg  = GIC_BASER_InnerShareable << GITS_BASER_SHAREABILITY_SHIFT;
> +    reg |= GIC_BASER_CACHE_SameAsInner << GITS_BASER_OUTER_CACHEABILITY_SHIFT;
> +    reg |= GIC_BASER_CACHE_RaWaWb << GITS_BASER_INNER_CACHEABILITY_SHIFT;
> +
> +    buffer = _xzalloc(ITS_CMD_QUEUE_SZ, SZ_64K);
> +    if ( !buffer )
> +        return NULL;
> +
> +    if ( virt_to_maddr(buffer) & ~GENMASK_ULL(51, 12) )
> +    {
> +        xfree(buffer);
> +        return NULL;
> +    }
> +
> +    reg |= GITS_VALID_BIT | virt_to_maddr(buffer);
> +    reg |= ((ITS_CMD_QUEUE_SZ / SZ_4K) - 1) & GITS_CBASER_SIZE_MASK;
> +    writeq_relaxed(reg, cbasereg);
> +    reg = readq_relaxed(cbasereg);
> +
> +    /* If the ITS dropped shareability, drop cacheability as well. */
> +    if ( (reg & GITS_BASER_SHAREABILITY_MASK) == 0 )
> +    {
> +        reg &= ~GITS_BASER_INNER_CACHEABILITY_MASK;
> +        writeq_relaxed(reg, cbasereg);
> +    }
> +
> +    /*
> +     * If the command queue memory is mapped as uncached, we need to flush
> +     * it on every access.
> +     */
> +    if ( !(reg & GITS_BASER_INNER_CACHEABILITY_MASK) )
> +    {
> +        its->flags |= HOST_ITS_FLUSH_CMD_QUEUE;
> +        printk(XENLOG_WARNING "using non-cacheable ITS command queue\n");
> +    }
> +
> +    return buffer;
> +}
> +
>  /* The ITS BASE registers work with page sizes of 4K, 16K or 64K. */
>  #define BASER_PAGE_BITS(sz) ((sz) * 2 + 12)
>  
> @@ -179,6 +227,11 @@ static int gicv3_its_init_single_its(struct host_its *hw_its)
>          }
>      }
>  
> +    hw_its->cmd_buf = its_map_cbaser(hw_its);
> +    if ( !hw_its->cmd_buf )
> +        return -ENOMEM;
> +    writeq_relaxed(0, hw_its->its_base + GITS_CWRITER);
> +
>      return 0;
>  }
>  
> diff --git a/xen/include/asm-arm/gic_v3_its.h b/xen/include/asm-arm/gic_v3_its.h
> index badb644..f21162a 100644
> --- a/xen/include/asm-arm/gic_v3_its.h
> +++ b/xen/include/asm-arm/gic_v3_its.h
> @@ -74,8 +74,12 @@
>  #define GITS_BASER_OUTER_CACHEABILITY_MASK   (0x7ULL << GITS_BASER_OUTER_CACHEABILITY_SHIFT)
>  #define GITS_BASER_INNER_CACHEABILITY_MASK   (0x7ULL << GITS_BASER_INNER_CACHEABILITY_SHIFT)
>  
> +#define GITS_CBASER_SIZE_MASK           0xff
> +
>  #include <xen/device_tree.h>
>  
> +#define HOST_ITS_FLUSH_CMD_QUEUE        (1U << 0)
> +
>  /* data structure for each hardware ITS */
>  struct host_its {
>      struct list_head entry;
> @@ -84,6 +88,8 @@ struct host_its {
>      paddr_t size;
>      void __iomem *its_base;
>      unsigned int devid_bits;
> +    void *cmd_buf;
> +    unsigned int flags;
>  };
>  
>  
> -- 
> 2.9.0
> 


* Re: [PATCH v3 05/26] ARM: GICv3 ITS: introduce ITS command handling
  2017-03-31 18:05 ` [PATCH v3 05/26] ARM: GICv3 ITS: introduce ITS command handling Andre Przywara
@ 2017-03-31 23:16   ` Stefano Stabellini
  2017-04-03 17:32   ` Julien Grall
  1 sibling, 0 replies; 52+ messages in thread
From: Stefano Stabellini @ 2017-03-31 23:16 UTC (permalink / raw)
  To: Andre Przywara
  Cc: xen-devel, Julien Grall, Stefano Stabellini, Shanker Donthineni,
	Vijay Kilari

On Fri, 31 Mar 2017, Andre Przywara wrote:
> To be able to easily send commands to the ITS, create the respective
> wrapper functions, which take care of the ring buffer.
> The first two commands we implement provide methods to map a collection
> to a redistributor (aka host core) and to flush the command queue (SYNC).
> Start using these commands for mapping one collection to each host CPU.
> 
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>

Please address Julien's comments. In particular, cmd_lock needs to be
initialized here.
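
As a reference for the ring buffer handling the commit message mentions,
the following is a small software-only model of the full check and the
write pointer wrap used in its_send_command(). The struct and function
names are hypothetical, the queue is shrunk to four slots for illustration
(the patch uses 1 MiB), and the two pointers stand in for the GITS_CREADR
and GITS_CWRITER registers. Locking, cache maintenance and the retry
deadline are deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define CMD_SIZE  32u               /* ITS_CMD_SIZE in the patch */
#define QUEUE_SZ  (4u * CMD_SIZE)   /* tiny queue for illustration only */

/* Hypothetical model of the command queue and its two pointers. */
struct cmd_ring {
    uint8_t  buf[QUEUE_SZ];
    uint32_t readp;    /* models GITS_CREADR, advanced by the consumer */
    uint32_t writep;   /* models GITS_CWRITER, advanced by the producer */
};

/* One slot always stays empty, so that full and empty can be told apart. */
static bool ring_full(const struct cmd_ring *r)
{
    return ((r->writep + CMD_SIZE) % QUEUE_SZ) == r->readp;
}

/* Copy one 32-byte command into the ring; returns -1 if the ring is full. */
static int ring_push(struct cmd_ring *r, const uint64_t cmd[4])
{
    assert(r->writep % CMD_SIZE == 0);

    if ( ring_full(r) )
        return -1;

    memcpy(r->buf + r->writep, cmd, CMD_SIZE);
    r->writep = (r->writep + CMD_SIZE) % QUEUE_SZ;

    return 0;
}

/* Stand-in for the ITS processing one command. */
static void ring_consume_one(struct cmd_ring *r)
{
    if ( r->readp != r->writep )
        r->readp = (r->readp + CMD_SIZE) % QUEUE_SZ;
}
```

Because one slot is kept free, a queue of N slots holds at most N - 1
outstanding commands; readp == writep then unambiguously means "empty",
which is the condition gicv3_its_wait_commands() polls for.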


> ---
>  xen/arch/arm/gic-v3-its.c         | 182 ++++++++++++++++++++++++++++++++++++++
>  xen/arch/arm/gic-v3-lpi.c         |  22 +++++
>  xen/arch/arm/gic-v3.c             |  25 +++++-
>  xen/include/asm-arm/gic_v3_defs.h |   2 +
>  xen/include/asm-arm/gic_v3_its.h  |  38 ++++++++
>  5 files changed, 267 insertions(+), 2 deletions(-)
> 
> diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
> index 9a86769..1ac598f 100644
> --- a/xen/arch/arm/gic-v3-its.c
> +++ b/xen/arch/arm/gic-v3-its.c
> @@ -19,11 +19,14 @@
>   */
>  
>  #include <xen/lib.h>
> +#include <xen/delay.h>
>  #include <xen/mm.h>
>  #include <xen/sizes.h>
> +#include <asm/gic.h>
>  #include <asm/gic_v3_defs.h>
>  #include <asm/gic_v3_its.h>
>  #include <asm/io.h>
> +#include <asm/page.h>
>  
>  #define ITS_CMD_QUEUE_SZ                SZ_1M
>  
> @@ -34,6 +37,147 @@ bool gicv3_its_host_has_its(void)
>      return !list_empty(&host_its_list);
>  }
>  
> +#define BUFPTR_MASK                     GENMASK_ULL(19, 5)
> +static int its_send_command(struct host_its *hw_its, const void *its_cmd)
> +{
> +    /* Some small grace period in case the command queue is congested. */
> +    s_time_t deadline = NOW() + MILLISECS(1);
> +    uint64_t readp, writep;
> +    int ret = -EBUSY;
> +
> +    /* No ITS commands from an interrupt handler (at the moment). */
> +    ASSERT(!in_irq());
> +
> +    spin_lock(&hw_its->cmd_lock);
> +
> +    do {
> +        readp = readq_relaxed(hw_its->its_base + GITS_CREADR) & BUFPTR_MASK;
> +        writep = readq_relaxed(hw_its->its_base + GITS_CWRITER) & BUFPTR_MASK;
> +
> +        if ( ((writep + ITS_CMD_SIZE) % ITS_CMD_QUEUE_SZ) != readp )
> +        {
> +            ret = 0;
> +            break;
> +        }
> +
> +        /*
> +         * If the command queue is full, wait for a bit in the hope it drains
> +         * before giving up.
> +         */
> +        spin_unlock(&hw_its->cmd_lock);
> +        cpu_relax();
> +        udelay(1);
> +        spin_lock(&hw_its->cmd_lock);
> +    } while ( NOW() <= deadline );
> +
> +    if ( ret )
> +    {
> +        spin_unlock(&hw_its->cmd_lock);
> +        printk(XENLOG_WARNING "ITS: command queue full.\n");
> +        return ret;
> +    }
> +
> +    memcpy(hw_its->cmd_buf + writep, its_cmd, ITS_CMD_SIZE);
> +    if ( hw_its->flags & HOST_ITS_FLUSH_CMD_QUEUE )
> +        clean_and_invalidate_dcache_va_range(hw_its->cmd_buf + writep,
> +                                             ITS_CMD_SIZE);
> +    else
> +        dsb(ishst);
> +
> +    writep = (writep + ITS_CMD_SIZE) % ITS_CMD_QUEUE_SZ;
> +    writeq_relaxed(writep & BUFPTR_MASK, hw_its->its_base + GITS_CWRITER);
> +
> +    spin_unlock(&hw_its->cmd_lock);
> +
> +    return 0;
> +}
> +
> +/* Wait for an ITS to finish processing all commands. */
> +static int gicv3_its_wait_commands(struct host_its *hw_its)
> +{
> +    /* Define an upper limit for our wait time. */
> +    s_time_t deadline = NOW() + MILLISECS(100);
> +    uint64_t readp, writep;
> +
> +    do {
> +        spin_lock(&hw_its->cmd_lock);
> +        readp = readq_relaxed(hw_its->its_base + GITS_CREADR) & BUFPTR_MASK;
> +        writep = readq_relaxed(hw_its->its_base + GITS_CWRITER) & BUFPTR_MASK;
> +        spin_unlock(&hw_its->cmd_lock);
> +
> +        if ( readp == writep )
> +            return 0;
> +
> +        cpu_relax();
> +        udelay(1);
> +    } while ( NOW() <= deadline );
> +
> +    return -ETIMEDOUT;
> +}
> +
> +static uint64_t encode_rdbase(struct host_its *hw_its, unsigned int cpu,
> +                              uint64_t reg)
> +{
> +    reg &= ~GENMASK_ULL(51, 16);
> +
> +    reg |= gicv3_get_redist_address(cpu, hw_its->flags & HOST_ITS_USES_PTA);
> +
> +    return reg;
> +}
> +
> +static int its_send_cmd_sync(struct host_its *its, unsigned int cpu)
> +{
> +    uint64_t cmd[4];
> +
> +    cmd[0] = GITS_CMD_SYNC;
> +    cmd[1] = 0x00;
> +    cmd[2] = encode_rdbase(its, cpu, 0x0);
> +    cmd[3] = 0x00;
> +
> +    return its_send_command(its, cmd);
> +}
> +
> +static int its_send_cmd_mapc(struct host_its *its, uint32_t collection_id,
> +                             unsigned int cpu)
> +{
> +    uint64_t cmd[4];
> +
> +    cmd[0] = GITS_CMD_MAPC;
> +    cmd[1] = 0x00;
> +    cmd[2] = encode_rdbase(its, cpu, collection_id);
> +    cmd[2] |= GITS_VALID_BIT;
> +    cmd[3] = 0x00;
> +
> +    return its_send_command(its, cmd);
> +}
> +
> +/* Set up the (1:1) collection mapping for the given host CPU. */
> +int gicv3_its_setup_collection(unsigned int cpu)
> +{
> +    struct host_its *its;
> +    int ret;
> +
> +    list_for_each_entry(its, &host_its_list, entry)
> +    {
> +        if ( !its->cmd_buf )
> +            continue;
> +
> +        ret = its_send_cmd_mapc(its, cpu, cpu);
> +        if ( ret )
> +            return ret;
> +
> +        ret = its_send_cmd_sync(its, cpu);
> +        if ( ret )
> +            return ret;
> +
> +        ret = gicv3_its_wait_commands(its);
> +        if ( ret )
> +            return ret;
> +    }
> +
> +    return 0;
> +}
> +
>  #define BASER_ATTR_MASK                                           \
>          ((0x3UL << GITS_BASER_SHAREABILITY_SHIFT)               | \
>           (0x7UL << GITS_BASER_OUTER_CACHEABILITY_SHIFT)         | \
> @@ -178,6 +322,38 @@ retry:
>      return -EINVAL;
>  }
>  
> +/*
> + * Before an ITS gets initialized, it should be in a quiescent state, where
> + * all outstanding commands and transactions have finished.
> + * So if the ITS is already enabled, turn it off and wait for all outstanding
> + * operations to get processed by polling the QUIESCENT bit.
> + */
> +static int gicv3_disable_its(struct host_its *hw_its)
> +{
> +    uint32_t reg;
> +    /* A similar generous wait limit as we use for the command queue wait. */
> +    s_time_t deadline = NOW() + MILLISECS(100);
> +
> +    reg = readl_relaxed(hw_its->its_base + GITS_CTLR);
> +    if ( !(reg & GITS_CTLR_ENABLE) && (reg & GITS_CTLR_QUIESCENT) )
> +        return 0;
> +
> +    writel_relaxed(reg & ~GITS_CTLR_ENABLE, hw_its->its_base + GITS_CTLR);
> +
> +    do {
> +        reg = readl_relaxed(hw_its->its_base + GITS_CTLR);
> +        if ( reg & GITS_CTLR_QUIESCENT )
> +            return 0;
> +
> +        cpu_relax();
> +        udelay(1);
> +    } while ( NOW() <= deadline );
> +
> +    dprintk(XENLOG_ERR, "ITS not quiescent.\n");
> +
> +    return -ETIMEDOUT;
> +}
> +
>  /* Allow a user to limit the number of devices. */
>  static unsigned int max_its_device_bits = 32;
>  integer_param("max_its_device_bits", max_its_device_bits);
> @@ -191,9 +367,15 @@ static int gicv3_its_init_single_its(struct host_its *hw_its)
>      if ( !hw_its->its_base )
>          return -ENOMEM;
>  
> +    ret = gicv3_disable_its(hw_its);
> +    if ( ret )
> +        return ret;
> +
>      reg = readq_relaxed(hw_its->its_base + GITS_TYPER);
>      hw_its->devid_bits = GITS_TYPER_DEVICE_ID_BITS(reg);
>      hw_its->devid_bits = min(hw_its->devid_bits, max_its_device_bits);
> +    if ( reg & GITS_TYPER_PTA )
> +        hw_its->flags |= HOST_ITS_USES_PTA;
>  
>      for ( i = 0; i < GITS_BASER_NR_REGS; i++ )
>      {
> diff --git a/xen/arch/arm/gic-v3-lpi.c b/xen/arch/arm/gic-v3-lpi.c
> index 77f6009..d85d63d 100644
> --- a/xen/arch/arm/gic-v3-lpi.c
> +++ b/xen/arch/arm/gic-v3-lpi.c
> @@ -43,6 +43,8 @@ static struct {
>  } lpi_data;
>  
>  struct lpi_redist_data {
> +    paddr_t             redist_addr;
> +    unsigned int        redist_id;
>      void                *pending_table;
>  };
>  
> @@ -50,6 +52,26 @@ static DEFINE_PER_CPU(struct lpi_redist_data, lpi_redist);
>  
>  #define MAX_PHYS_LPIS   (lpi_data.nr_host_lpis - LPI_OFFSET)
>  
> +/* Stores this redistributor's physical address and ID in a per-CPU variable */
> +void gicv3_set_redist_address(paddr_t address, unsigned int redist_id)
> +{
> +    this_cpu(lpi_redist).redist_addr = address;
> +    this_cpu(lpi_redist).redist_id = redist_id;
> +}
> +
> +/*
> + * Returns a redistributor's ID (either as an address or as an ID).
> + * This must be (and is) called only after it has been setup by the above
> + * function.
> + */
> +uint64_t gicv3_get_redist_address(unsigned int cpu, bool use_pta)
> +{
> +    if ( use_pta )
> +        return per_cpu(lpi_redist, cpu).redist_addr & GENMASK_ULL(51, 16);
> +    else
> +        return per_cpu(lpi_redist, cpu).redist_id << 16;
> +}
> +
>  static int gicv3_lpi_allocate_pendtable(uint64_t *reg)
>  {
>      uint64_t val;
> diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
> index b84bc40..0e21cb2 100644
> --- a/xen/arch/arm/gic-v3.c
> +++ b/xen/arch/arm/gic-v3.c
> @@ -666,7 +666,21 @@ static int __init gicv3_populate_rdist(void)
>  
>                  if ( typer & GICR_TYPER_PLPIS )
>                  {
> -                    int ret;
> +                    paddr_t rdist_addr;
> +                    int procnum, ret;
> +
> +                    /*
> +                     * The ITS refers to redistributors either by their physical
> +                     * address or by their ID. Determine those two values and
> +                     * let the ITS code store them in per host CPU variables to
> +                     * later be able to address those redistributors.
> +                     */
> +                    rdist_addr = gicv3.rdist_regions[i].base;
> +                    rdist_addr += ptr - gicv3.rdist_regions[i].map_base;
> +                    procnum = (typer & GICR_TYPER_PROC_NUM_MASK);
> +                    procnum >>= GICR_TYPER_PROC_NUM_SHIFT;
> +
> +                    gicv3_set_redist_address(rdist_addr, procnum);
>  
>                      ret = gicv3_lpi_init_rdist(ptr);
>                      if ( ret && ret != -ENODEV )
> @@ -705,7 +719,7 @@ static int __init gicv3_populate_rdist(void)
>  
>  static int gicv3_cpu_init(void)
>  {
> -    int i;
> +    int i, ret;
>      uint32_t priority;
>  
>      /* Register ourselves with the rest of the world */
> @@ -715,6 +729,13 @@ static int gicv3_cpu_init(void)
>      if ( gicv3_enable_redist() )
>          return -ENODEV;
>  
> +    if ( gicv3_its_host_has_its() )
> +    {
> +        ret = gicv3_its_setup_collection(smp_processor_id());
> +        if ( ret )
> +            return ret;
> +    }
> +
>      /* Set priority on PPI and SGI interrupts */
>      priority = (GIC_PRI_IPI << 24 | GIC_PRI_IPI << 16 | GIC_PRI_IPI << 8 |
>                  GIC_PRI_IPI);
> diff --git a/xen/include/asm-arm/gic_v3_defs.h b/xen/include/asm-arm/gic_v3_defs.h
> index 7cdebc5..b01b6ed 100644
> --- a/xen/include/asm-arm/gic_v3_defs.h
> +++ b/xen/include/asm-arm/gic_v3_defs.h
> @@ -103,6 +103,8 @@
>  #define GICR_TYPER_PLPIS             (1U << 0)
>  #define GICR_TYPER_VLPIS             (1U << 1)
>  #define GICR_TYPER_LAST              (1U << 4)
> +#define GICR_TYPER_PROC_NUM_SHIFT    8
> +#define GICR_TYPER_PROC_NUM_MASK     (0xffff << GICR_TYPER_PROC_NUM_SHIFT)
>  
>  /* For specifying the inner cacheability type only */
>  #define GIC_BASER_CACHE_nCnB         0ULL
> diff --git a/xen/include/asm-arm/gic_v3_its.h b/xen/include/asm-arm/gic_v3_its.h
> index f21162a..4c2ae1c 100644
> --- a/xen/include/asm-arm/gic_v3_its.h
> +++ b/xen/include/asm-arm/gic_v3_its.h
> @@ -42,10 +42,12 @@
>  #define GITS_CTLR_QUIESCENT             BIT(31)
>  #define GITS_CTLR_ENABLE                BIT(0)
>  
> +#define GITS_TYPER_PTA                  BIT_ULL(19)
>  #define GITS_TYPER_DEVIDS_SHIFT         13
>  #define GITS_TYPER_DEVIDS_MASK          (0x1fUL << GITS_TYPER_DEVIDS_SHIFT)
>  #define GITS_TYPER_DEVICE_ID_BITS(r)    (((r & GITS_TYPER_DEVIDS_MASK) >> \
>                                                 GITS_TYPER_DEVIDS_SHIFT) + 1)
> +#define GITS_TYPER_IDBITS_SHIFT         8
>  
>  #define GITS_IIDR_VALUE                 0x34c
>  
> @@ -76,9 +78,26 @@
>  
>  #define GITS_CBASER_SIZE_MASK           0xff
>  
> +/* ITS command definitions */
> +#define ITS_CMD_SIZE                    32
> +
> +#define GITS_CMD_MOVI                   0x01
> +#define GITS_CMD_INT                    0x03
> +#define GITS_CMD_CLEAR                  0x04
> +#define GITS_CMD_SYNC                   0x05
> +#define GITS_CMD_MAPD                   0x08
> +#define GITS_CMD_MAPC                   0x09
> +#define GITS_CMD_MAPTI                  0x0a
> +#define GITS_CMD_MAPI                   0x0b
> +#define GITS_CMD_INV                    0x0c
> +#define GITS_CMD_INVALL                 0x0d
> +#define GITS_CMD_MOVALL                 0x0e
> +#define GITS_CMD_DISCARD                0x0f
> +
>  #include <xen/device_tree.h>
>  
>  #define HOST_ITS_FLUSH_CMD_QUEUE        (1U << 0)
> +#define HOST_ITS_USES_PTA               (1U << 1)
>  
>  /* data structure for each hardware ITS */
>  struct host_its {
> @@ -88,6 +107,7 @@ struct host_its {
>      paddr_t size;
>      void __iomem *its_base;
>      unsigned int devid_bits;
> +    spinlock_t cmd_lock;
>      void *cmd_buf;
>      unsigned int flags;
>  };
> @@ -108,6 +128,13 @@ int gicv3_lpi_init_rdist(void __iomem * rdist_base);
>  int gicv3_lpi_init_host_lpis(unsigned int nr_lpis);
>  int gicv3_its_init(void);
>  
> +/* Store the physical address and ID for each redistributor as read from DT. */
> +void gicv3_set_redist_address(paddr_t address, unsigned int redist_id);
> +uint64_t gicv3_get_redist_address(unsigned int cpu, bool use_pta);
> +
> +/* Map a collection for this host CPU to each host ITS. */
> +int gicv3_its_setup_collection(unsigned int cpu);
> +
>  #else
>  
>  static LIST_HEAD(host_its_list);
> @@ -135,6 +162,17 @@ static inline int gicv3_its_init(void)
>  {
>      return 0;
>  }
> +
> +static inline void gicv3_set_redist_address(paddr_t address,
> +                                            unsigned int redist_id)
> +{
> +}
> +
> +static inline int gicv3_its_setup_collection(unsigned int cpu)
> +{
> +    return 0;
> +}
> +
>  #endif /* CONFIG_HAS_ITS */
>  
>  #endif
> -- 
> 2.9.0
> 


* Re: [PATCH v3 06/26] ARM: GICv3 ITS: introduce device mapping
  2017-03-31 18:05 ` [PATCH v3 06/26] ARM: GICv3 ITS: introduce device mapping Andre Przywara
@ 2017-03-31 23:20   ` Stefano Stabellini
  2017-04-01  8:01   ` Vijay Kilari
  2017-04-03 18:56   ` Julien Grall
  2 siblings, 0 replies; 52+ messages in thread
From: Stefano Stabellini @ 2017-03-31 23:20 UTC (permalink / raw)
  To: Andre Przywara
  Cc: xen-devel, Julien Grall, Stefano Stabellini, Shanker Donthineni,
	Vijay Kilari

On Fri, 31 Mar 2017, Andre Przywara wrote:
> The ITS uses device IDs to map LPIs to a device. Dom0 will later use
> those IDs, which we directly pass on to the host.
> For this we have to map each device that Dom0 may request to a host
> ITS device with the same identifier.
> Allocate the respective memory and enter each device into an rbtree to
> later be able to iterate over it or to easily tear down guests.
> 
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>

See 3a044833-4b61-d807-600c-cf88d6e1901a@arm.com and
alpine.DEB.2.10.1703211718190.11679@sstabellini-ThinkPad-X260 
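
To make the lookup key of the device tree easier to follow: devices are
ordered first by guest doorbell address and then by guest device ID, as in
compare_its_guest_devices(). The sketch below shows only that ordering. It
swaps the rbtree for a sorted array searched with bsearch(), purely to stay
self-contained; struct dev_key and dev_find() are hypothetical names and
not part of the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical key: the two fields the patch compares on. */
struct dev_key {
    uint64_t guest_doorbell;   /* identifies the virtual ITS */
    uint32_t guest_devid;      /* device ID as seen by the guest */
};

/* Order by doorbell address first, then by device ID. */
static int dev_key_cmp(const void *a, const void *b)
{
    const struct dev_key *x = a, *y = b;

    if ( x->guest_doorbell != y->guest_doorbell )
        return x->guest_doorbell < y->guest_doorbell ? -1 : 1;

    if ( x->guest_devid != y->guest_devid )
        return x->guest_devid < y->guest_devid ? -1 : 1;

    return 0;
}

/* Look up a device in an array sorted with dev_key_cmp(); NULL if absent. */
static const struct dev_key *dev_find(const struct dev_key *sorted, size_t n,
                                      uint64_t doorbell, uint32_t devid)
{
    struct dev_key key = { doorbell, devid };

    assert(sorted != NULL);

    return bsearch(&key, sorted, n, sizeof(*sorted), dev_key_cmp);
}
```

Using the doorbell as the major key keeps all devices behind one virtual
ITS adjacent in the ordering, so the same device ID can be reused behind
different virtual ITSes without clashing.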


> ---
>  xen/arch/arm/gic-v3-its.c        | 227 +++++++++++++++++++++++++++++++++++++++
>  xen/arch/arm/vgic-v3.c           |   4 +
>  xen/include/asm-arm/domain.h     |   3 +
>  xen/include/asm-arm/gic_v3_its.h |  23 ++++
>  4 files changed, 257 insertions(+)
> 
> diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
> index 1ac598f..295f7dc 100644
> --- a/xen/arch/arm/gic-v3-its.c
> +++ b/xen/arch/arm/gic-v3-its.c
> @@ -21,6 +21,8 @@
>  #include <xen/lib.h>
>  #include <xen/delay.h>
>  #include <xen/mm.h>
> +#include <xen/rbtree.h>
> +#include <xen/sched.h>
>  #include <xen/sizes.h>
>  #include <asm/gic.h>
>  #include <asm/gic_v3_defs.h>
> @@ -32,6 +34,18 @@
>  
>  LIST_HEAD(host_its_list);
>  
> +struct its_devices {
> +    struct rb_node rbnode;
> +    struct host_its *hw_its;
> +    void *itt_addr;
> +    paddr_t guest_doorbell;             /* Identifies the virtual ITS */
> +    uint32_t host_devid;
> +    uint32_t guest_devid;
> +    uint32_t eventids;                  /* Number of event IDs (MSIs) */
> +    uint32_t *host_lpi_blocks;          /* Which LPIs are used on the host */
> +    struct pending_irq *pend_irqs;      /* One struct per event */
> +};
> +
>  bool gicv3_its_host_has_its(void)
>  {
>      return !list_empty(&host_its_list);
> @@ -151,6 +165,26 @@ static int its_send_cmd_mapc(struct host_its *its, uint32_t collection_id,
>      return its_send_command(its, cmd);
>  }
>  
> +static int its_send_cmd_mapd(struct host_its *its, uint32_t deviceid,
> +                             uint8_t size_bits, paddr_t itt_addr, bool valid)
> +{
> +    uint64_t cmd[4];
> +
> +    if ( valid )
> +    {
> +        ASSERT(size_bits < 32);
> +        ASSERT(!(itt_addr & ~GENMASK_ULL(51, 8)));
> +    }
> +    cmd[0] = GITS_CMD_MAPD | ((uint64_t)deviceid << 32);
> +    cmd[1] = size_bits;
> +    cmd[2] = itt_addr;
> +    if ( valid )
> +        cmd[2] |= GITS_VALID_BIT;
> +    cmd[3] = 0x00;
> +
> +    return its_send_command(its, cmd);
> +}
> +
>  /* Set up the (1:1) collection mapping for the given host CPU. */
>  int gicv3_its_setup_collection(unsigned int cpu)
>  {
> @@ -376,6 +410,7 @@ static int gicv3_its_init_single_its(struct host_its *hw_its)
>      hw_its->devid_bits = min(hw_its->devid_bits, max_its_device_bits);
>      if ( reg & GITS_TYPER_PTA )
>          hw_its->flags |= HOST_ITS_USES_PTA;
> +    hw_its->itte_size = GITS_TYPER_ITT_SIZE(reg);
>  
>      for ( i = 0; i < GITS_BASER_NR_REGS; i++ )
>      {
> @@ -432,6 +467,197 @@ int gicv3_its_init(void)
>      return 0;
>  }
>  
> +static int remove_mapped_guest_device(struct its_devices *dev)
> +{
> +    int ret;
> +
> +    if ( dev->hw_its )
> +    {
> +        /* MAPD also discards all events with this device ID. */
> +        ret = its_send_cmd_mapd(dev->hw_its, dev->host_devid, 0, 0, false);
> +        if ( ret )
> +            return ret;
> +
> +        ret = gicv3_its_wait_commands(dev->hw_its);
> +        if ( ret )
> +            return ret;
> +    }
> +
> +    xfree(dev->itt_addr);
> +    xfree(dev->pend_irqs);
> +    xfree(dev);
> +
> +    return 0;
> +}
> +
> +static struct host_its *gicv3_its_find_by_doorbell(paddr_t doorbell_address)
> +{
> +    struct host_its *hw_its;
> +
> +    list_for_each_entry(hw_its, &host_its_list, entry)
> +    {
> +        if ( hw_its->addr + ITS_DOORBELL_OFFSET == doorbell_address )
> +            return hw_its;
> +    }
> +
> +    return NULL;
> +}
> +
> +static int compare_its_guest_devices(struct its_devices *dev,
> +                                     paddr_t doorbell, uint32_t devid)
> +{
> +    if ( dev->guest_doorbell < doorbell )
> +        return -1;
> +
> +    if ( dev->guest_doorbell > doorbell )
> +        return 1;
> +
> +    if ( dev->guest_devid < devid )
> +        return -1;
> +
> +    if ( dev->guest_devid > devid )
> +        return 1;
> +
> +    return 0;
> +}
> +
> +/*
> + * Map a hardware device, identified by a certain host ITS and its device ID
> + * to domain d, a guest ITS (identified by its doorbell address) and device ID.
> + * Also provide the number of events (MSIs) needed for that device.
> + * This does not check if this particular hardware device is already mapped
> + * at another domain, it is expected that this would be done by the caller.
> + */
> +int gicv3_its_map_guest_device(struct domain *d,
> +                               paddr_t host_doorbell, uint32_t host_devid,
> +                               paddr_t guest_doorbell, uint32_t guest_devid,
> +                               uint32_t nr_events, bool valid)
> +{
> +    void *itt_addr = NULL;
> +    struct host_its *hw_its;
> +    struct its_devices *dev = NULL;
> +    struct rb_node **new = &d->arch.vgic.its_devices.rb_node, *parent = NULL;
> +    int ret = -ENOENT;
> +
> +    hw_its = gicv3_its_find_by_doorbell(host_doorbell);
> +    if ( !hw_its )
> +        return ret;
> +
> +    /* check for already existing mappings */
> +    spin_lock(&d->arch.vgic.its_devices_lock);
> +    while ( *new )
> +    {
> +        struct its_devices *temp;
> +        int cmp;
> +
> +        temp = rb_entry(*new, struct its_devices, rbnode);
> +
> +        parent = *new;
> +        cmp = compare_its_guest_devices(temp, guest_doorbell, guest_devid);
> +        if ( !cmp )
> +        {
> +            if ( !valid )
> +                rb_erase(&temp->rbnode, &d->arch.vgic.its_devices);
> +
> +            spin_unlock(&d->arch.vgic.its_devices_lock);
> +
> +            if ( valid )
> +                return -EBUSY;
> +
> +            return remove_mapped_guest_device(temp);
> +        }
> +
> +        if ( cmp > 0 )
> +            new = &((*new)->rb_left);
> +        else
> +            new = &((*new)->rb_right);
> +    }
> +
> +    if ( !valid )
> +        goto out_unlock;
> +
> +    ret = -ENOMEM;
> +
> +    /* An Interrupt Translation Table needs to be 256-byte aligned. */
> +    itt_addr = _xzalloc(nr_events * hw_its->itte_size, 256);
> +    if ( !itt_addr )
> +        goto out_unlock;
> +
> +    dev = xzalloc(struct its_devices);
> +    if ( !dev )
> +        goto out_unlock;
> +
> +    /*
> +     * Allocate the pending_irqs for each virtual LPI. They will be put
> +     * into the domain's radix tree upon the guest's MAPTI command.
> +     */
> +    dev->pend_irqs = xzalloc_array(struct pending_irq, nr_events);
> +    if ( !dev->pend_irqs )
> +        goto out_unlock;
> +
> +    ret = its_send_cmd_mapd(hw_its, host_devid,
> +                            fls(ROUNDUP(nr_events, LPI_BLOCK) - 1) - 1,
> +                            virt_to_maddr(itt_addr), true);
> +    if ( ret )
> +        goto out_unlock;
> +
> +    dev->itt_addr = itt_addr;
> +    dev->hw_its = hw_its;
> +    dev->guest_doorbell = guest_doorbell;
> +    dev->guest_devid = guest_devid;
> +    dev->host_devid = host_devid;
> +    dev->eventids = nr_events;
> +
> +    rb_link_node(&dev->rbnode, parent, new);
> +    rb_insert_color(&dev->rbnode, &d->arch.vgic.its_devices);
> +
> +    spin_unlock(&d->arch.vgic.its_devices_lock);
> +
> +    return 0;
> +
> +out_unlock:
> +    spin_unlock(&d->arch.vgic.its_devices_lock);
> +    if ( dev )
> +    {
> +        xfree(dev->pend_irqs);
> +        xfree(dev->host_lpi_blocks);
> +    }
> +    xfree(itt_addr);
> +    xfree(dev);
> +    return ret;
> +}
> +
> +/* Remove any connections a domain had to any ITS in the system. */
> +void gicv3_its_unmap_all_devices(struct domain *d)
> +{
> +    struct rb_node *victim;
> +    struct its_devices *dev;
> +
> +    /*
> +     * This is an easily readable, but suboptimal implementation.
> +     * It uses the provided iteration wrapper and erases each node, which
> +     * possibly triggers rebalancing.
> +     * This seems overkill since we are going to abolish the whole tree, but
> +     * avoids an open-coded re-implementation of the traversal functions with
> +     * some recursive function calls.
> +     */
> +restart:
> +    spin_lock(&d->arch.vgic.its_devices_lock);
> +    if ( (victim = rb_first(&d->arch.vgic.its_devices)) )
> +    {
> +        dev = rb_entry(victim, struct its_devices, rbnode);
> +        rb_erase(victim, &d->arch.vgic.its_devices);
> +
> +        spin_unlock(&d->arch.vgic.its_devices_lock);
> +
> +        remove_mapped_guest_device(dev);
> +
> +        goto restart;
> +    }
> +
> +    spin_unlock(&d->arch.vgic.its_devices_lock);
> +}
> +
>  /* Scan the DT for any ITS nodes and create a list of host ITSes out of it. */
>  void gicv3_its_dt_init(const struct dt_device_node *node)
>  {
> @@ -459,6 +685,7 @@ void gicv3_its_dt_init(const struct dt_device_node *node)
>          its_data->addr = addr;
>          its_data->size = size;
>          its_data->dt_node = its;
> +        spin_lock_init(&its_data->cmd_lock);
>  
>          printk("GICv3: Found ITS @0x%lx\n", addr);
>  
> diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c
> index d61479d..6242252 100644
> --- a/xen/arch/arm/vgic-v3.c
> +++ b/xen/arch/arm/vgic-v3.c
> @@ -1450,6 +1450,9 @@ static int vgic_v3_domain_init(struct domain *d)
>      d->arch.vgic.nr_regions = rdist_count;
>      d->arch.vgic.rdist_regions = rdist_regions;
>  
> +    spin_lock_init(&d->arch.vgic.its_devices_lock);
> +    d->arch.vgic.its_devices = RB_ROOT;
> +
>      /*
>       * Domain 0 gets the hardware address.
>       * Guests get the virtual platform layout.
> @@ -1522,6 +1525,7 @@ static int vgic_v3_domain_init(struct domain *d)
>  
>  static void vgic_v3_domain_free(struct domain *d)
>  {
> +    gicv3_its_unmap_all_devices(d);
>      xfree(d->arch.vgic.rdist_regions);
>  }
>  
> diff --git a/xen/include/asm-arm/domain.h b/xen/include/asm-arm/domain.h
> index 2d6fbb1..e559027 100644
> --- a/xen/include/asm-arm/domain.h
> +++ b/xen/include/asm-arm/domain.h
> @@ -11,6 +11,7 @@
>  #include <asm/gic.h>
>  #include <public/hvm/params.h>
>  #include <xen/serial.h>
> +#include <xen/rbtree.h>
>  
>  struct hvm_domain
>  {
> @@ -109,6 +110,8 @@ struct arch_domain
>          } *rdist_regions;
>          int nr_regions;                     /* Number of rdist regions */
>          uint32_t rdist_stride;              /* Re-Distributor stride */
> +        struct rb_root its_devices;         /* Devices mapped to an ITS */
> +        spinlock_t its_devices_lock;        /* Protects the its_devices tree */
>  #endif
>      } vgic;
>  
> diff --git a/xen/include/asm-arm/gic_v3_its.h b/xen/include/asm-arm/gic_v3_its.h
> index 4c2ae1c..4ade5f6 100644
> --- a/xen/include/asm-arm/gic_v3_its.h
> +++ b/xen/include/asm-arm/gic_v3_its.h
> @@ -48,6 +48,10 @@
>  #define GITS_TYPER_DEVICE_ID_BITS(r)    (((r & GITS_TYPER_DEVIDS_MASK) >> \
>                                                 GITS_TYPER_DEVIDS_SHIFT) + 1)
>  #define GITS_TYPER_IDBITS_SHIFT         8
> +#define GITS_TYPER_ITT_SIZE_SHIFT       4
> +#define GITS_TYPER_ITT_SIZE_MASK        (0xfUL << GITS_TYPER_ITT_SIZE_SHIFT)
> +#define GITS_TYPER_ITT_SIZE(r)          ((((r) & GITS_TYPER_ITT_SIZE_MASK) >> \
> +                                                GITS_TYPER_ITT_SIZE_SHIFT) + 1)
>  
>  #define GITS_IIDR_VALUE                 0x34c
>  
> @@ -94,7 +98,10 @@
>  #define GITS_CMD_MOVALL                 0x0e
>  #define GITS_CMD_DISCARD                0x0f
>  
> +#define ITS_DOORBELL_OFFSET             0x10040
> +
>  #include <xen/device_tree.h>
> +#include <xen/rbtree.h>
>  
>  #define HOST_ITS_FLUSH_CMD_QUEUE        (1U << 0)
>  #define HOST_ITS_USES_PTA               (1U << 1)
> @@ -109,6 +116,7 @@ struct host_its {
>      unsigned int devid_bits;
>      spinlock_t cmd_lock;
>      void *cmd_buf;
> +    unsigned int itte_size;
>      unsigned int flags;
>  };
>  
> @@ -135,6 +143,17 @@ uint64_t gicv3_get_redist_address(unsigned int cpu, bool use_pta);
>  /* Map a collection for this host CPU to each host ITS. */
>  int gicv3_its_setup_collection(unsigned int cpu);
>  
> +/*
> + * Map a device on the host by allocating an ITT on the host (ITS).
> + * "nr_events" specifies how many events (interrupts) this device will need.
> + * Setting "valid" to false deallocates the device.
> + */
> +int gicv3_its_map_guest_device(struct domain *d,
> +                               paddr_t host_doorbell, uint32_t host_devid,
> +                               paddr_t guest_doorbell, uint32_t guest_devid,
> +                               uint32_t nr_events, bool valid);
> +void gicv3_its_unmap_all_devices(struct domain *d);
> +
>  #else
>  
>  static LIST_HEAD(host_its_list);
> @@ -173,6 +192,10 @@ static inline int gicv3_its_setup_collection(unsigned int cpu)
>      return 0;
>  }
>  
> +static inline void gicv3_its_unmap_all_devices(struct domain *d)
> +{
> +}
> +
>  #endif /* CONFIG_HAS_ITS */
>  
>  #endif
> -- 
> 2.9.0
> 

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 07/26] ARM: GICv3 ITS: introduce host LPI array
  2017-03-31 18:05 ` [PATCH v3 07/26] ARM: GICv3 ITS: introduce host LPI array Andre Przywara
@ 2017-03-31 23:24   ` Stefano Stabellini
  0 siblings, 0 replies; 52+ messages in thread
From: Stefano Stabellini @ 2017-03-31 23:24 UTC (permalink / raw)
  To: Andre Przywara
  Cc: xen-devel, Julien Grall, Stefano Stabellini, Shanker Donthineni,
	Vijay Kilari

On Fri, 31 Mar 2017, Andre Przywara wrote:
> The number of LPIs on a host can be potentially huge (millions),
> although in practice it will mostly be reasonable. So prematurely allocating
> an array of struct irq_desc's for each LPI is not an option.
> However Xen itself does not care about LPIs, as every LPI will be injected
> into a guest (Dom0 for now).
> Create a dense data structure (8 Bytes) for each LPI which holds just
> enough information to determine the virtual IRQ number and the VCPU into
> which the LPI needs to be injected.
> Also to not artificially limit the number of LPIs, we create a 2-level
> table for holding those structures.
> This patch introduces functions to initialize these tables and to
> create, lookup and destroy entries for a given LPI.
> By using the naturally atomic access guarantee the native uint64_t data
> type gives us, we allocate and access LPI information in a way that does
> not require a lock.
> 
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>

See alpine.DEB.2.10.1703221552490.8001@sstabellini-ThinkPad-X260.

I'll stop here for now; I think there are already enough comments for
another version.
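
For reference, here is a minimal sketch of the lockless access pattern the
commit message describes: read or write the whole 64-bit "data" view in one
atomic access, then work on the broken-down fields in a local copy. It is
written against C11 atomics rather than Xen's write_u64_atomic(), and the
helper names host_lpi_read() and host_lpi_write() are made up for
illustration; they are not part of the patch. The relaxed memory ordering is
also an assumption of this sketch.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Same layout as in the patch: three fields packed into one 64-bit word. */
union host_lpi {
    uint64_t data;
    struct {
        uint32_t virt_lpi;
        uint16_t dom_id;
        uint16_t vcpu_id;
    };
};

/*
 * Hypothetical helper: take an atomic snapshot of one table slot.
 * The caller then inspects the fields of the returned local copy, so it
 * never sees a half-updated entry.
 */
static inline union host_lpi host_lpi_read(_Atomic uint64_t *slot)
{
    union host_lpi copy;

    copy.data = atomic_load_explicit(slot, memory_order_relaxed);
    return copy;
}

/*
 * Hypothetical helper: build the entry locally, then publish all three
 * fields with a single 64-bit store.
 */
static inline void host_lpi_write(_Atomic uint64_t *slot, union host_lpi val)
{
    atomic_store_explicit(slot, val.data, memory_order_relaxed);
}
```

The point of the union is that a reader can never observe a dom_id from one
update combined with a vcpu_id from another, as long as every access goes
through the "data" member.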


> ---
>  xen/arch/arm/gic-v3-its.c        |  89 +++++++++++++++++-
>  xen/arch/arm/gic-v3-lpi.c        | 196 +++++++++++++++++++++++++++++++++++++++
>  xen/include/asm-arm/gic.h        |   2 +
>  xen/include/asm-arm/gic_v3_its.h |   5 +
>  xen/include/asm-arm/irq.h        |   5 +
>  5 files changed, 295 insertions(+), 2 deletions(-)
> 
> diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
> index 295f7dc..fa284e7 100644
> --- a/xen/arch/arm/gic-v3-its.c
> +++ b/xen/arch/arm/gic-v3-its.c
> @@ -151,6 +151,20 @@ static int its_send_cmd_sync(struct host_its *its, unsigned int cpu)
>      return its_send_command(its, cmd);
>  }
>  
> +static int its_send_cmd_mapti(struct host_its *its,
> +                              uint32_t deviceid, uint32_t eventid,
> +                              uint32_t pintid, uint16_t icid)
> +{
> +    uint64_t cmd[4];
> +
> +    cmd[0] = GITS_CMD_MAPTI | ((uint64_t)deviceid << 32);
> +    cmd[1] = eventid | ((uint64_t)pintid << 32);
> +    cmd[2] = icid;
> +    cmd[3] = 0x00;
> +
> +    return its_send_command(its, cmd);
> +}
> +
>  static int its_send_cmd_mapc(struct host_its *its, uint32_t collection_id,
>                               unsigned int cpu)
>  {
> @@ -185,6 +199,19 @@ static int its_send_cmd_mapd(struct host_its *its, uint32_t deviceid,
>      return its_send_command(its, cmd);
>  }
>  
> +static int its_send_cmd_inv(struct host_its *its,
> +                            uint32_t deviceid, uint32_t eventid)
> +{
> +    uint64_t cmd[4];
> +
> +    cmd[0] = GITS_CMD_INV | ((uint64_t)deviceid << 32);
> +    cmd[1] = eventid;
> +    cmd[2] = 0x00;
> +    cmd[3] = 0x00;
> +
> +    return its_send_command(its, cmd);
> +}
> +
>  /* Set up the (1:1) collection mapping for the given host CPU. */
>  int gicv3_its_setup_collection(unsigned int cpu)
>  {
> @@ -469,7 +496,7 @@ int gicv3_its_init(void)
>  
>  static int remove_mapped_guest_device(struct its_devices *dev)
>  {
> -    int ret;
> +    int ret, i;
>  
>      if ( dev->hw_its )
>      {
> @@ -479,12 +506,16 @@ static int remove_mapped_guest_device(struct its_devices *dev)
>              return ret;
>      }
>  
> +    for ( i = 0; i < DIV_ROUND_UP(dev->eventids, LPI_BLOCK); i++ )
> +        gicv3_free_host_lpi_block(dev->host_lpi_blocks[i]);
> +
>      ret = gicv3_its_wait_commands(dev->hw_its);
>      if ( ret )
>          return ret;
>  
>      xfree(dev->itt_addr);
>      xfree(dev->pend_irqs);
> +    xfree(dev->host_lpi_blocks);
>      xfree(dev);
>  
>      return 0;
> @@ -522,6 +553,37 @@ static int compare_its_guest_devices(struct its_devices *dev,
>  }
>  
>  /*
> + * On the host ITS @its, map @nr_events consecutive LPIs.
> + * The mapping connects a device @devid and event @eventid pair to LPI @lpi,
> + * increasing both @eventid and @lpi to cover the number of requested LPIs.
> + */
> +static int gicv3_its_map_host_events(struct host_its *its,
> +                                     uint32_t devid, uint32_t eventid,
> +                                     uint32_t lpi, uint32_t nr_events)
> +{
> +    uint32_t i;
> +    int ret;
> +
> +    for ( i = 0; i < nr_events; i++ )
> +    {
> +        /* For now we map every host LPI to host CPU 0 */
> +        ret = its_send_cmd_mapti(its, devid, eventid + i, lpi + i, 0);
> +        if ( ret )
> +            return ret;
> +
> +        ret = its_send_cmd_inv(its, devid, eventid + i);
> +        if ( ret )
> +            return ret;
> +    }
> +
> +    ret = its_send_cmd_sync(its, 0);
> +    if ( ret )
> +        return ret;
> +
> +    return gicv3_its_wait_commands(its);
> +}
> +
> +/*
>   * Map a hardware device, identified by a certain host ITS and its device ID
>   * to domain d, a guest ITS (identified by its doorbell address) and device ID.
>   * Also provide the number of events (MSIs) needed for that device.
> @@ -537,7 +599,7 @@ int gicv3_its_map_guest_device(struct domain *d,
>      struct host_its *hw_its;
>      struct its_devices *dev = NULL;
>      struct rb_node **new = &d->arch.vgic.its_devices.rb_node, *parent = NULL;
> -    int ret = -ENOENT;
> +    int ret = -ENOENT, i;
>  
>      hw_its = gicv3_its_find_by_doorbell(host_doorbell);
>      if ( !hw_its )
> @@ -595,6 +657,11 @@ int gicv3_its_map_guest_device(struct domain *d,
>      if ( !dev->pend_irqs )
>          goto out_unlock;
>  
> +    dev->host_lpi_blocks = xzalloc_array(uint32_t,
> +                                         DIV_ROUND_UP(nr_events, LPI_BLOCK));
> +    if ( !dev->host_lpi_blocks )
> +        goto out_unlock;
> +
>      ret = its_send_cmd_mapd(hw_its, host_devid,
>                              fls(ROUNDUP(nr_events, LPI_BLOCK) - 1) - 1,
>                              virt_to_maddr(itt_addr), true);
> @@ -613,10 +680,28 @@ int gicv3_its_map_guest_device(struct domain *d,
>  
>      spin_unlock(&d->arch.vgic.its_devices_lock);
>  
> +    /*
> +     * Map all host LPIs within this device already. We can't afford to queue
> +     * any host ITS commands later on during the guest's runtime.
> +     */
> +    for ( i = 0; i < DIV_ROUND_UP(nr_events, LPI_BLOCK); i++ )
> +    {
> +        ret = gicv3_allocate_host_lpi_block(d, &dev->host_lpi_blocks[i]);
> +        if ( ret < 0 )
> +            goto out;
> +
> +        ret = gicv3_its_map_host_events(hw_its, host_devid, i * LPI_BLOCK,
> +                                        dev->host_lpi_blocks[i], LPI_BLOCK);
> +        if ( ret < 0 )
> +            goto out;
> +    }
> +
>      return 0;
>  
>  out_unlock:
>      spin_unlock(&d->arch.vgic.its_devices_lock);
> +
> +out:
>      if ( dev )
>      {
>          xfree(dev->pend_irqs);
> diff --git a/xen/arch/arm/gic-v3-lpi.c b/xen/arch/arm/gic-v3-lpi.c
> index d85d63d..d642cc5 100644
> --- a/xen/arch/arm/gic-v3-lpi.c
> +++ b/xen/arch/arm/gic-v3-lpi.c
> @@ -20,25 +20,55 @@
>  
>  #include <xen/lib.h>
>  #include <xen/mm.h>
> +#include <xen/sched.h>
>  #include <xen/sizes.h>
> +#include <asm/atomic.h>
> +#include <asm/domain.h>
>  #include <asm/gic.h>
>  #include <asm/gic_v3_defs.h>
>  #include <asm/gic_v3_its.h>
>  #include <asm/io.h>
>  #include <asm/page.h>
>  
> +/*
> + * There could be a lot of LPIs on the host side, and they always go to
> + * a guest. So having a struct irq_desc for each of them would be wasteful
> + * and useless.
> + * Instead just store enough information to find the right VCPU to inject
> + * those LPIs into, which just requires the virtual LPI number.
> + * To avoid a global lock on this data structure, this is using a lockless
> +     * approach relying on the architectural atomicity of native data types:
> + * We read or write the "data" view of this union atomically, then can
> + * access the broken-down fields in our local copy.
> + */
> +union host_lpi {
> +    uint64_t data;
> +    struct {
> +        uint32_t virt_lpi;
> +        uint16_t dom_id;
> +        uint16_t vcpu_id;
> +    };
> +};
> +
>  #define LPI_PROPTABLE_NEEDS_FLUSHING    (1U << 0)
>  /* Global state */
>  static struct {
>      /* The global LPI property table, shared by all redistributors. */
>      uint8_t *lpi_property;
>      /*
> +     * A two-level table to look up LPIs firing on the host and look up the
> +     * VCPU and virtual LPI number to inject into.
> +     */
> +    union host_lpi **host_lpis;
> +    /*
>       * Number of physical LPIs the host supports. This is a property of
>       * the GIC hardware. We depart from the habit of naming these things
>       * "physical" in Xen, as the GICv3/4 spec uses the term "physical LPI"
>       * in a different context to differentiate them from "virtual LPIs".
>       */
>      unsigned long int nr_host_lpis;
> +    /* Protects allocation and deallocation of host LPIs, but not the access */
> +    spinlock_t host_lpis_lock;
>      unsigned int flags;
>  } lpi_data;
>  
> @@ -51,6 +81,19 @@ struct lpi_redist_data {
>  static DEFINE_PER_CPU(struct lpi_redist_data, lpi_redist);
>  
>  #define MAX_PHYS_LPIS   (lpi_data.nr_host_lpis - LPI_OFFSET)
> +#define HOST_LPIS_PER_PAGE      (PAGE_SIZE / sizeof(union host_lpi))
> +
> +static union host_lpi *gic_get_host_lpi(uint32_t plpi)
> +{
> +    if ( !is_lpi(plpi) || plpi >= MAX_PHYS_LPIS + LPI_OFFSET )
> +        return NULL;
> +
> +    plpi -= LPI_OFFSET;
> +    if ( !lpi_data.host_lpis[plpi / HOST_LPIS_PER_PAGE] )
> +        return NULL;
> +
> +    return &lpi_data.host_lpis[plpi / HOST_LPIS_PER_PAGE][plpi % HOST_LPIS_PER_PAGE];
> +}
>  
>  /* Stores this redistributor's physical address and ID in a per-CPU variable */
>  void gicv3_set_redist_address(paddr_t address, unsigned int redist_id)
> @@ -212,15 +255,168 @@ int gicv3_lpi_init_rdist(void __iomem * rdist_base)
>  static unsigned int max_lpi_bits = 20;
>  integer_param("max_lpi_bits", max_lpi_bits);
>  
> +/*
> + * Allocate the 2nd level array for host LPIs. This one holds pointers
> + * to the page with the actual "union host_lpi" entries. Our LPI limit
> + * avoids excessive memory usage.
> + */
>  int gicv3_lpi_init_host_lpis(unsigned int hw_lpi_bits)
>  {
> +    int nr_lpi_ptrs;
> +
> +    /* We rely on the data structure being atomically accessible. */
> +    BUILD_BUG_ON(sizeof(union host_lpi) > sizeof(unsigned long));
> +
>      lpi_data.nr_host_lpis = BIT_ULL(min(hw_lpi_bits, max_lpi_bits));
>  
> +    spin_lock_init(&lpi_data.host_lpis_lock);
> +
> +    nr_lpi_ptrs = MAX_PHYS_LPIS / (PAGE_SIZE / sizeof(union host_lpi));
> +    lpi_data.host_lpis = xzalloc_array(union host_lpi *, nr_lpi_ptrs);
> +    if ( !lpi_data.host_lpis )
> +        return -ENOMEM;
> +
>      printk("GICv3: using at most %lu LPIs on the host.\n", MAX_PHYS_LPIS);
>  
>      return 0;
>  }
>  
> +static int find_unused_host_lpi(uint32_t start, uint32_t *index)
> +{
> +    unsigned int chunk;
> +    uint32_t i = *index;
> +
> +    ASSERT(spin_is_locked(&lpi_data.host_lpis_lock));
> +
> +    for ( chunk = start; chunk < MAX_PHYS_LPIS / HOST_LPIS_PER_PAGE; chunk++ )
> +    {
> +        /* If we hit an unallocated chunk, use entry 0 in that one. */
> +        if ( !lpi_data.host_lpis[chunk] )
> +        {
> +            *index = 0;
> +            return chunk;
> +        }
> +
> +        /* Find an unallocated entry in this chunk. */
> +        for ( ; i < HOST_LPIS_PER_PAGE; i += LPI_BLOCK )
> +        {
> +            if ( lpi_data.host_lpis[chunk][i].dom_id == DOMID_INVALID )
> +            {
> +                *index = i;
> +                return chunk;
> +            }
> +        }
> +        i = 0;
> +    }
> +
> +    return -1;
> +}
> +
> +/*
> + * Allocate a block of 32 LPIs on the given host ITS for device "devid",
> + * starting with "eventid". Put them into the respective ITT by issuing a
> + * MAPTI command for each of them.
> + */
> +int gicv3_allocate_host_lpi_block(struct domain *d, uint32_t *first_lpi)
> +{
> +    static uint32_t next_lpi = 0;
> +    uint32_t lpi, lpi_idx = next_lpi % HOST_LPIS_PER_PAGE;
> +    int chunk;
> +    int i;
> +
> +    spin_lock(&lpi_data.host_lpis_lock);
> +    chunk = find_unused_host_lpi(next_lpi / HOST_LPIS_PER_PAGE, &lpi_idx);
> +
> +    if ( chunk == - 1 )          /* rescan for a hole from the beginning */
> +    {
> +        lpi_idx = 0;
> +        chunk = find_unused_host_lpi(0, &lpi_idx);
> +        if ( chunk == -1 )
> +        {
> +            spin_unlock(&lpi_data.host_lpis_lock);
> +            return -ENOSPC;
> +        }
> +    }
> +
> +    /* If we hit an unallocated chunk, we initialize it and use entry 0. */
> +    if ( !lpi_data.host_lpis[chunk] )
> +    {
> +        union host_lpi *new_chunk;
> +
> +        /* TODO: NUMA locality for quicker IRQ path? */
> +        new_chunk = xmalloc_bytes(PAGE_SIZE);
> +        if ( !new_chunk )
> +        {
> +            spin_unlock(&lpi_data.host_lpis_lock);
> +            return -ENOMEM;
> +        }
> +
> +        for ( i = 0; i < HOST_LPIS_PER_PAGE; i += LPI_BLOCK )
> +            new_chunk[i].dom_id = DOMID_INVALID;
> +
> +        lpi_data.host_lpis[chunk] = new_chunk;
> +        lpi_idx = 0;
> +    }
> +
> +    lpi = chunk * HOST_LPIS_PER_PAGE + lpi_idx;
> +
> +    for ( i = 0; i < LPI_BLOCK; i++ )
> +    {
> +        union host_lpi hlpi;
> +
> +        /*
> +         * Mark this host LPI as belonging to the domain, but don't assign
> +         * any virtual LPI or a VCPU yet.
> +         */
> +        hlpi.virt_lpi = INVALID_LPI;
> +        hlpi.dom_id = d->domain_id;
> +        hlpi.vcpu_id = ~0;
> +        write_u64_atomic(&lpi_data.host_lpis[chunk][lpi_idx + i].data,
> +                         hlpi.data);
> +
> +        /*
> +         * Enable this host LPI, so we don't have to do this during the
> +         * guest's runtime.
> +         */
> +        lpi_data.lpi_property[lpi + i] |= LPI_PROP_ENABLED;
> +    }
> +
> +    /*
> +     * We have allocated and initialized the host LPI entries, so it's safe
> +     * to drop the lock now. Access to the structures can be done concurrently
> +     * as it involves only an atomic uint64_t access.
> +     */
> +    spin_unlock(&lpi_data.host_lpis_lock);
> +
> +    if ( lpi_data.flags & LPI_PROPTABLE_NEEDS_FLUSHING )
> +        clean_and_invalidate_dcache_va_range(&lpi_data.lpi_property[lpi],
> +                                             LPI_BLOCK);
> +
> +    next_lpi = lpi + LPI_BLOCK;
> +    *first_lpi = lpi + LPI_OFFSET;
> +
> +    return 0;
> +}
> +
> +void gicv3_free_host_lpi_block(uint32_t first_lpi)
> +{
> +    union host_lpi *hlpi, empty_lpi = { .dom_id = DOMID_INVALID };
> +    int i;
> +
> +    hlpi = gic_get_host_lpi(first_lpi);
> +    if ( !hlpi )
> +        return;         /* Nothing to free here. */
> +
> +    spin_lock(&lpi_data.host_lpis_lock);
> +
> +    for ( i = 0; i < LPI_BLOCK; i++ )
> +        write_u64_atomic(&hlpi[i].data, empty_lpi.data);
> +
> +    spin_unlock(&lpi_data.host_lpis_lock);
> +
> +    return;
> +}
> +
>  /*
>   * Local variables:
>   * mode: C
> diff --git a/xen/include/asm-arm/gic.h b/xen/include/asm-arm/gic.h
> index 836a103..d04bd04 100644
> --- a/xen/include/asm-arm/gic.h
> +++ b/xen/include/asm-arm/gic.h
> @@ -220,6 +220,8 @@ enum gic_version {
>      GIC_V3,
>  };
>  
> +#define INVALID_LPI     0
> +
>  extern enum gic_version gic_hw_version(void);
>  
>  /* Program the IRQ type into the GIC */
> diff --git a/xen/include/asm-arm/gic_v3_its.h b/xen/include/asm-arm/gic_v3_its.h
> index 4ade5f6..7b47596 100644
> --- a/xen/include/asm-arm/gic_v3_its.h
> +++ b/xen/include/asm-arm/gic_v3_its.h
> @@ -106,6 +106,9 @@
>  #define HOST_ITS_FLUSH_CMD_QUEUE        (1U << 0)
>  #define HOST_ITS_USES_PTA               (1U << 1)
>  
> +/* We allocate LPIs on the hosts in chunks of 32 to reduce handling overhead. */
> +#define LPI_BLOCK                       32
> +
>  /* data structure for each hardware ITS */
>  struct host_its {
>      struct list_head entry;
> @@ -153,6 +156,8 @@ int gicv3_its_map_guest_device(struct domain *d,
>                                 paddr_t guest_doorbell, uint32_t guest_devid,
>                                 uint32_t nr_events, bool valid);
>  void gicv3_its_unmap_all_devices(struct domain *d);
> +int gicv3_allocate_host_lpi_block(struct domain *d, uint32_t *first_lpi);
> +void gicv3_free_host_lpi_block(uint32_t first_lpi);
>  
>  #else
>  
> diff --git a/xen/include/asm-arm/irq.h b/xen/include/asm-arm/irq.h
> index 13528c0..d16affc 100644
> --- a/xen/include/asm-arm/irq.h
> +++ b/xen/include/asm-arm/irq.h
> @@ -42,6 +42,11 @@ struct irq_desc *__irq_to_desc(int irq);
>  
>  void do_IRQ(struct cpu_user_regs *regs, unsigned int irq, int is_fiq);
>  
> +static inline bool is_lpi(unsigned int irq)
> +{
> +    return irq >= LPI_OFFSET;
> +}
> +
>  #define domain_pirq_to_irq(d, pirq) (pirq)
>  
>  bool_t is_assignable_irq(unsigned int irq);
> -- 
> 2.9.0
> 

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 06/26] ARM: GICv3 ITS: introduce device mapping
  2017-03-31 18:05 ` [PATCH v3 06/26] ARM: GICv3 ITS: introduce device mapping Andre Przywara
  2017-03-31 23:20   ` Stefano Stabellini
@ 2017-04-01  8:01   ` Vijay Kilari
  2017-04-03 18:33     ` Julien Grall
  2017-04-03 18:56   ` Julien Grall
  2 siblings, 1 reply; 52+ messages in thread
From: Vijay Kilari @ 2017-04-01  8:01 UTC (permalink / raw)
  To: Andre Przywara
  Cc: xen-devel, Julien Grall, Stefano Stabellini, Shanker Donthineni

Hi Andre,

On Fri, Mar 31, 2017 at 11:35 PM, Andre Przywara <andre.przywara@arm.com> wrote:
> The ITS uses device IDs to map LPIs to a device. Dom0 will later use
> those IDs, which we directly pass on to the host.
> For this we have to map each device that Dom0 may request to a host
> ITS device with the same identifier.
> Allocate the respective memory and enter each device into an rbtree to
> later be able to iterate over it or to easily teardown guests.
>
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> ---
>  xen/arch/arm/gic-v3-its.c        | 227 +++++++++++++++++++++++++++++++++++++++
>  xen/arch/arm/vgic-v3.c           |   4 +
>  xen/include/asm-arm/domain.h     |   3 +
>  xen/include/asm-arm/gic_v3_its.h |  23 ++++
>  4 files changed, 257 insertions(+)
>
> diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
> index 1ac598f..295f7dc 100644
> --- a/xen/arch/arm/gic-v3-its.c
> +++ b/xen/arch/arm/gic-v3-its.c
> @@ -21,6 +21,8 @@
>  #include <xen/lib.h>
>  #include <xen/delay.h>
>  #include <xen/mm.h>
> +#include <xen/rbtree.h>
> +#include <xen/sched.h>
>  #include <xen/sizes.h>
>  #include <asm/gic.h>
>  #include <asm/gic_v3_defs.h>
> @@ -32,6 +34,18 @@
>
>  LIST_HEAD(host_its_list);
>
> +struct its_devices {
> +    struct rb_node rbnode;
> +    struct host_its *hw_its;
> +    void *itt_addr;
> +    paddr_t guest_doorbell;             /* Identifies the virtual ITS */
> +    uint32_t host_devid;
> +    uint32_t guest_devid;
> +    uint32_t eventids;                  /* Number of event IDs (MSIs) */
> +    uint32_t *host_lpi_blocks;          /* Which LPIs are used on the host */
> +    struct pending_irq *pend_irqs;      /* One struct per event */
> +};
> +
>  bool gicv3_its_host_has_its(void)
>  {
>      return !list_empty(&host_its_list);
> @@ -151,6 +165,26 @@ static int its_send_cmd_mapc(struct host_its *its, uint32_t collection_id,
>      return its_send_command(its, cmd);
>  }
>
> +static int its_send_cmd_mapd(struct host_its *its, uint32_t deviceid,
> +                             uint8_t size_bits, paddr_t itt_addr, bool valid)
> +{
> +    uint64_t cmd[4];
> +
> +    if ( valid )
> +    {
> +        ASSERT(size_bits < 32);
> +        ASSERT(!(itt_addr & ~GENMASK_ULL(51, 8)));
> +    }
> +    cmd[0] = GITS_CMD_MAPD | ((uint64_t)deviceid << 32);
> +    cmd[1] = size_bits;
> +    cmd[2] = itt_addr;
> +    if ( valid )
> +        cmd[2] |= GITS_VALID_BIT;
> +    cmd[3] = 0x00;
> +
> +    return its_send_command(its, cmd);
> +}
> +
>  /* Set up the (1:1) collection mapping for the given host CPU. */
>  int gicv3_its_setup_collection(unsigned int cpu)
>  {
> @@ -376,6 +410,7 @@ static int gicv3_its_init_single_its(struct host_its *hw_its)
>      hw_its->devid_bits = min(hw_its->devid_bits, max_its_device_bits);
>      if ( reg & GITS_TYPER_PTA )
>          hw_its->flags |= HOST_ITS_USES_PTA;
> +    hw_its->itte_size = GITS_TYPER_ITT_SIZE(reg);
>
>      for ( i = 0; i < GITS_BASER_NR_REGS; i++ )
>      {
> @@ -432,6 +467,197 @@ int gicv3_its_init(void)
>      return 0;
>  }
>
> +static int remove_mapped_guest_device(struct its_devices *dev)
> +{
> +    int ret;
> +
> +    if ( dev->hw_its )
> +    {
> +        /* MAPD also discards all events with this device ID. */
> +        int ret = its_send_cmd_mapd(dev->hw_its, dev->host_devid, 0, 0, false);
> +        if ( ret )
> +            return ret;
> +    }
> +
> +    ret = gicv3_its_wait_commands(dev->hw_its);
> +    if ( ret )
> +        return ret;
> +
> +    xfree(dev->itt_addr);
> +    xfree(dev->pend_irqs);
> +    xfree(dev);
> +
> +    return 0;
> +}
> +
> +static struct host_its *gicv3_its_find_by_doorbell(paddr_t doorbell_address)
> +{
> +    struct host_its *hw_its;
> +
> +    list_for_each_entry(hw_its, &host_its_list, entry)
> +    {
> +        if ( hw_its->addr + ITS_DOORBELL_OFFSET == doorbell_address )
> +            return hw_its;
> +    }
> +
> +    return NULL;
> +}
> +
> +static int compare_its_guest_devices(struct its_devices *dev,
> +                                     paddr_t doorbell, uint32_t devid)
> +{
> +    if ( dev->guest_doorbell < doorbell )
> +        return -1;
> +
> +    if ( dev->guest_doorbell > doorbell )
> +        return 1;
> +
> +    if ( dev->guest_devid < devid )
> +        return -1;
> +
> +    if ( dev->guest_devid > devid )
> +        return 1;
> +
> +    return 0;
> +}
> +
> +/*
> + * Map a hardware device, identified by a certain host ITS and its device ID
> + * to domain d, a guest ITS (identified by its doorbell address) and device ID.
> + * Also provide the number of events (MSIs) needed for that device.
> + * This does not check if this particular hardware device is already mapped
> + * at another domain, it is expected that this would be done by the caller.
> + */
> +int gicv3_its_map_guest_device(struct domain *d,
> +                               paddr_t host_doorbell, uint32_t host_devid,
> +                               paddr_t guest_doorbell, uint32_t guest_devid,
> +                               uint32_t nr_events, bool valid)
> +{
> +    void *itt_addr = NULL;
> +    struct host_its *hw_its;
> +    struct its_devices *dev = NULL;
> +    struct rb_node **new = &d->arch.vgic.its_devices.rb_node, *parent = NULL;
> +    int ret = -ENOENT;
> +
> +    hw_its = gicv3_its_find_by_doorbell(host_doorbell);
> +    if ( !hw_its )
> +        return ret;
> +
> +    /* check for already existing mappings */
> +    spin_lock(&d->arch.vgic.its_devices_lock);
> +    while ( *new )
> +    {
> +        struct its_devices *temp;
> +        int cmp;
> +
> +        temp = rb_entry(*new, struct its_devices, rbnode);
> +
> +        parent = *new;
> +        cmp = compare_its_guest_devices(temp, guest_doorbell, guest_devid);
> +        if ( !cmp )
> +        {
> +            if ( !valid )
> +                rb_erase(&temp->rbnode, &d->arch.vgic.its_devices);
> +
> +            spin_unlock(&d->arch.vgic.its_devices_lock);
> +
> +            if ( valid )
> +                return -EBUSY;
> +
> +            return remove_mapped_guest_device(temp);
> +        }
> +
> +        if ( cmp > 0 )
> +            new = &((*new)->rb_left);
> +        else
> +            new = &((*new)->rb_right);
> +    }
> +
> +    if ( !valid )
> +        goto out_unlock;
> +
> +    ret = -ENOMEM;
> +
> +    /* An Interrupt Translation Table needs to be 256-byte aligned. */
> +    itt_addr = _xzalloc(nr_events * hw_its->itte_size, 256);

As I mentioned on the previous version, if the memory behind itt_addr is too
small, the ITS will write past the end of it and corrupt memory.
To match the size passed in the MAPD command, itt_addr should also be
allocated for ROUNDUP(nr_events, LPI_BLOCK) events.
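
To illustrate the sizing concern: the Size field of MAPD encodes the number
of EventID bits minus one, so the ITS may index up to 2^(size_bits + 1)
entries in the ITT. Below is a rough sketch, in plain C outside of Xen, of
how the table size follows from the value the patch passes to MAPD. fls32(),
mapd_size_bits() and itt_bytes() are hypothetical helpers, ROUNDUP is a local
stand-in for the Xen macro, and the entry sizes used in the note afterwards
are only example values.

```c
#include <stddef.h>
#include <stdint.h>

#define LPI_BLOCK       32U
/* Local stand-in: round x up to a multiple of a (a must be a power of two). */
#define ROUNDUP(x, a)   (((x) + (a) - 1) & ~((a) - 1))

/* Position of the most significant set bit, counting from 1; 0 for x == 0. */
static unsigned int fls32(uint32_t x)
{
    unsigned int bits = 0;

    while ( x )
    {
        bits++;
        x >>= 1;
    }

    return bits;
}

/* The value the patch puts into the MAPD Size field (nr_events >= 1). */
static unsigned int mapd_size_bits(uint32_t nr_events)
{
    return fls32(ROUNDUP(nr_events, LPI_BLOCK) - 1) - 1;
}

/* Bytes the ITS may touch: one ITT entry per encodable EventID. */
static size_t itt_bytes(uint32_t nr_events, unsigned int itte_size)
{
    return ((size_t)1 << (mapd_size_bits(nr_events) + 1)) * itte_size;
}
```

With nr_events = 65 this gives size_bits = 6, i.e. 128 encodable events,
whereas ROUNDUP(65, LPI_BLOCK) is only 96. So sizing the allocation from the
same power of two that goes into MAPD may be the safer bound, assuming the
ITS is free to use every EventID the Size field allows.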

> +    if ( !itt_addr )
> +        goto out_unlock;
> +
> +    dev = xzalloc(struct its_devices);
> +    if ( !dev )
> +        goto out_unlock;
> +
> +    /*
> +     * Allocate the pending_irqs for each virtual LPI. They will be put
> +     * into the domain's radix tree upon the guest's MAPTI command.
> +     */
> +    dev->pend_irqs = xzalloc_array(struct pending_irq, nr_events);
> +    if ( !dev->pend_irqs )
> +        goto out_unlock;
> +
> +    ret = its_send_cmd_mapd(hw_its, host_devid,
> +                            fls(ROUNDUP(nr_events, LPI_BLOCK) - 1) - 1,
> +                            virt_to_maddr(itt_addr), true);
> +    if ( ret )
> +        goto out_unlock;
> +
> +    dev->itt_addr = itt_addr;
> +    dev->hw_its = hw_its;
> +    dev->guest_doorbell = guest_doorbell;
> +    dev->guest_devid = guest_devid;
> +    dev->host_devid = host_devid;
> +    dev->eventids = nr_events;
> +
> +    rb_link_node(&dev->rbnode, parent, new);
> +    rb_insert_color(&dev->rbnode, &d->arch.vgic.its_devices);
> +
> +    spin_unlock(&d->arch.vgic.its_devices_lock);
> +
> +    return 0;
> +
> +out_unlock:
> +    spin_unlock(&d->arch.vgic.its_devices_lock);
> +    if ( dev )
> +    {
> +        xfree(dev->pend_irqs);
> +        xfree(dev->host_lpi_blocks);
> +    }
> +    xfree(itt_addr);
> +    xfree(dev);
> +    return ret;
> +}
> +
> +/* Removing any connections a domain had to any ITS in the system. */
> +void gicv3_its_unmap_all_devices(struct domain *d)
> +{
> +    struct rb_node *victim;
> +    struct its_devices *dev;
> +
> +    /*
> +     * This is an easily readable, but suboptimal implementation.
> +     * It uses the provided iteration wrapper and erases each node, which
> +     * possibly triggers rebalancing.
> +     * This seems overkill since we are going to abolish the whole tree, but
> +     * avoids an open-coded re-implementation of the traversal functions with
> +     * some recursive function calls.
> +     */
> +restart:
> +    spin_lock(&d->arch.vgic.its_devices_lock);
> +    if ( (victim = rb_first(&d->arch.vgic.its_devices)) )
> +    {
> +        dev = rb_entry(victim, struct its_devices, rbnode);
> +        rb_erase(victim, &d->arch.vgic.its_devices);
> +
> +        spin_unlock(&d->arch.vgic.its_devices_lock);
> +
> +        remove_mapped_guest_device(dev);
> +
> +        goto restart;
> +    }
> +
> +    spin_unlock(&d->arch.vgic.its_devices_lock);
> +}
> +
>  /* Scan the DT for any ITS nodes and create a list of host ITSes out of it. */
>  void gicv3_its_dt_init(const struct dt_device_node *node)
>  {
> @@ -459,6 +685,7 @@ void gicv3_its_dt_init(const struct dt_device_node *node)
>          its_data->addr = addr;
>          its_data->size = size;
>          its_data->dt_node = its;
> +        spin_lock_init(&its_data->cmd_lock);
>
>          printk("GICv3: Found ITS @0x%lx\n", addr);
>
> diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c
> index d61479d..6242252 100644
> --- a/xen/arch/arm/vgic-v3.c
> +++ b/xen/arch/arm/vgic-v3.c
> @@ -1450,6 +1450,9 @@ static int vgic_v3_domain_init(struct domain *d)
>      d->arch.vgic.nr_regions = rdist_count;
>      d->arch.vgic.rdist_regions = rdist_regions;
>
> +    spin_lock_init(&d->arch.vgic.its_devices_lock);
> +    d->arch.vgic.its_devices = RB_ROOT;
> +
>      /*
>       * Domain 0 gets the hardware address.
>       * Guests get the virtual platform layout.
> @@ -1522,6 +1525,7 @@ static int vgic_v3_domain_init(struct domain *d)
>
>  static void vgic_v3_domain_free(struct domain *d)
>  {
> +    gicv3_its_unmap_all_devices(d);
>      xfree(d->arch.vgic.rdist_regions);
>  }
>
> diff --git a/xen/include/asm-arm/domain.h b/xen/include/asm-arm/domain.h
> index 2d6fbb1..e559027 100644
> --- a/xen/include/asm-arm/domain.h
> +++ b/xen/include/asm-arm/domain.h
> @@ -11,6 +11,7 @@
>  #include <asm/gic.h>
>  #include <public/hvm/params.h>
>  #include <xen/serial.h>
> +#include <xen/rbtree.h>
>
>  struct hvm_domain
>  {
> @@ -109,6 +110,8 @@ struct arch_domain
>          } *rdist_regions;
>          int nr_regions;                     /* Number of rdist regions */
>          uint32_t rdist_stride;              /* Re-Distributor stride */
> +        struct rb_root its_devices;         /* Devices mapped to an ITS */
> +        spinlock_t its_devices_lock;        /* Protects the its_devices tree */
>  #endif
>      } vgic;
>
> diff --git a/xen/include/asm-arm/gic_v3_its.h b/xen/include/asm-arm/gic_v3_its.h
> index 4c2ae1c..4ade5f6 100644
> --- a/xen/include/asm-arm/gic_v3_its.h
> +++ b/xen/include/asm-arm/gic_v3_its.h
> @@ -48,6 +48,10 @@
>  #define GITS_TYPER_DEVICE_ID_BITS(r)    (((r & GITS_TYPER_DEVIDS_MASK) >> \
>                                                 GITS_TYPER_DEVIDS_SHIFT) + 1)
>  #define GITS_TYPER_IDBITS_SHIFT         8
> +#define GITS_TYPER_ITT_SIZE_SHIFT       4
> +#define GITS_TYPER_ITT_SIZE_MASK        (0xfUL << GITS_TYPER_ITT_SIZE_SHIFT)
> +#define GITS_TYPER_ITT_SIZE(r)          ((((r) & GITS_TYPER_ITT_SIZE_MASK) >> \
> +                                                GITS_TYPER_ITT_SIZE_SHIFT) + 1)
>
>  #define GITS_IIDR_VALUE                 0x34c
>
> @@ -94,7 +98,10 @@
>  #define GITS_CMD_MOVALL                 0x0e
>  #define GITS_CMD_DISCARD                0x0f
>
> +#define ITS_DOORBELL_OFFSET             0x10040
> +
>  #include <xen/device_tree.h>
> +#include <xen/rbtree.h>
>
>  #define HOST_ITS_FLUSH_CMD_QUEUE        (1U << 0)
>  #define HOST_ITS_USES_PTA               (1U << 1)
> @@ -109,6 +116,7 @@ struct host_its {
>      unsigned int devid_bits;
>      spinlock_t cmd_lock;
>      void *cmd_buf;
> +    unsigned int itte_size;
>      unsigned int flags;
>  };
>
> @@ -135,6 +143,17 @@ uint64_t gicv3_get_redist_address(unsigned int cpu, bool use_pta);
>  /* Map a collection for this host CPU to each host ITS. */
>  int gicv3_its_setup_collection(unsigned int cpu);
>
> +/*
> + * Map a device on the host by allocating an ITT on the host (ITS).
> + * "nr_event" specifies how many events (interrupts) this device will need.
> + * Setting "valid" to false deallocates the device.
> + */
> +int gicv3_its_map_guest_device(struct domain *d,
> +                               paddr_t host_doorbell, uint32_t host_devid,
> +                               paddr_t guest_doorbell, uint32_t guest_devid,
> +                               uint32_t nr_events, bool valid);
> +void gicv3_its_unmap_all_devices(struct domain *d);
> +
>  #else
>
>  static LIST_HEAD(host_its_list);
> @@ -173,6 +192,10 @@ static inline int gicv3_its_setup_collection(unsigned int cpu)
>      return 0;
>  }
>
> +static inline void gicv3_its_unmap_all_devices(struct domain *d)
> +{
> +}
> +
>  #endif /* CONFIG_HAS_ITS */
>
>  #endif
> --
> 2.9.0
>

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


* Re: [PATCH v3 19/26] ARM: vITS: handle MAPTI command
  2017-03-31 18:05 ` [PATCH v3 19/26] ARM: vITS: handle MAPTI command Andre Przywara
@ 2017-04-01  8:32   ` Vijay Kilari
  0 siblings, 0 replies; 52+ messages in thread
From: Vijay Kilari @ 2017-04-01  8:32 UTC (permalink / raw)
  To: Andre Przywara
  Cc: xen-devel, Julien Grall, Stefano Stabellini, Shanker Donthineni

On Fri, Mar 31, 2017 at 11:35 PM, Andre Przywara <andre.przywara@arm.com> wrote:
> The MAPTI commands associates a DeviceID/EventID pair with a LPI/CPU
> pair and actually instantiates LPI interrupts.
> We connect the already allocated host LPI to this virtual LPI, so that
> any triggering IRQ on the host can be quickly forwarded to a guest.
> Beside entering the VCPU and the virtual LPI number in the respective
> host LPI entry, we also initialize and add the already allocated
> struct pending_irq to our radix tree, so that we can now easily find it
> by its virtual LPI number.
> This exports the vgic_init_pending_irq() function for that purpose.
>
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> ---
>  xen/arch/arm/gic-v3-its.c        | 74 ++++++++++++++++++++++++++++++++++++++++
>  xen/arch/arm/gic-v3-lpi.c        | 16 +++++++++
>  xen/arch/arm/vgic-v3-its.c       | 36 +++++++++++++++++--
>  xen/arch/arm/vgic.c              |  2 +-
>  xen/include/asm-arm/gic_v3_its.h |  6 ++++
>  xen/include/asm-arm/vgic.h       |  1 +
>  6 files changed, 132 insertions(+), 3 deletions(-)
>
> diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
> index 8db2a09..39f16b2 100644
> --- a/xen/arch/arm/gic-v3-its.c
> +++ b/xen/arch/arm/gic-v3-its.c
> @@ -747,6 +747,80 @@ restart:
>      spin_unlock(&d->arch.vgic.its_devices_lock);
>  }
>
> +/* Must be called with the its_device_lock held. */
> +static struct its_devices *get_its_device(struct domain *d, paddr_t doorbell,
> +                                          uint32_t devid)
> +{
> +    struct rb_node *node = d->arch.vgic.its_devices.rb_node;
> +    struct its_devices *dev;
> +
> +    while (node)
> +    {
> +        int cmp;
> +
> +        dev = rb_entry(node, struct its_devices, rbnode);
> +        cmp = compare_its_guest_devices(dev, doorbell, devid);
> +
> +        if ( !cmp )
> +            return dev;
> +
> +        if ( cmp > 0 )
> +            node = node->rb_left;
> +        else
> +            node = node->rb_right;
> +    }
> +
> +    return NULL;
> +}
> +
> +static uint32_t get_host_lpi(struct its_devices *dev, uint32_t eventid)
> +{
> +    uint32_t host_lpi = 0;
> +
> +    if ( dev && (eventid < dev->eventids) )
> +    {
> +        host_lpi = dev->host_lpi_blocks[eventid / LPI_BLOCK] +
> +                                       (eventid % LPI_BLOCK);
> +        if ( !is_lpi(host_lpi) )
> +            host_lpi = 0;
> +    }
> +
> +    return host_lpi;
> +}
> +
> +/*
> + * Connects the event ID for an already assigned device to the given VCPU/vLPI
> + * pair. The corresponding physical LPI is already mapped on the host side
> + * (when assigning the physical device to the guest), so we just connect the
> + * target VCPU/vLPI pair to that interrupt to inject it properly if it fires.
> + */
> +struct pending_irq *gicv3_assign_guest_event(struct domain *d,
> +                                             paddr_t doorbell_address,
> +                                             uint32_t devid, uint32_t eventid,
> +                                             struct vcpu *v, uint32_t virt_lpi)
> +{
> +    struct its_devices *dev;
> +    struct pending_irq *pirq = NULL;
> +    uint32_t host_lpi = 0;
> +
> +    spin_lock(&d->arch.vgic.its_devices_lock);
> +    dev = get_its_device(d, doorbell_address, devid);
> +    if ( dev )
> +    {
> +        host_lpi = get_host_lpi(dev, eventid);
> +        pirq = &dev->pend_irqs[eventid];
> +    }
> +    spin_unlock(&d->arch.vgic.its_devices_lock);
> +
> +    if ( !host_lpi || !pirq )
> +        return NULL;
> +
> +    gicv3_lpi_update_host_entry(host_lpi, d->domain_id,
> +                                v ? v->vcpu_id : -1, virt_lpi);
> +
> +    return pirq;
> +}
> +
>  /* Scan the DT for any ITS nodes and create a list of host ITSes out of it. */
>  void gicv3_its_dt_init(const struct dt_device_node *node)
>  {
> diff --git a/xen/arch/arm/gic-v3-lpi.c b/xen/arch/arm/gic-v3-lpi.c
> index 2301d53..a6b728e 100644
> --- a/xen/arch/arm/gic-v3-lpi.c
> +++ b/xen/arch/arm/gic-v3-lpi.c
> @@ -178,6 +178,22 @@ void do_LPI(unsigned int lpi)
>      rcu_unlock_domain(d);
>  }
>
> +void gicv3_lpi_update_host_entry(uint32_t host_lpi, int domain_id,
> +                                 unsigned int vcpu_id, uint32_t virt_lpi)
> +{
> +    union host_lpi *hlpip, hlpi;
> +
> +    host_lpi -= LPI_OFFSET;
> +
> +    hlpip = &lpi_data.host_lpis[host_lpi / HOST_LPIS_PER_PAGE][host_lpi % HOST_LPIS_PER_PAGE];
> +
> +    hlpi.virt_lpi = virt_lpi;
> +    hlpi.dom_id = domain_id;
> +    hlpi.vcpu_id = vcpu_id;
> +
> +    write_u64_atomic(&hlpip->data, hlpi.data);
> +}
> +
>  static int gicv3_lpi_allocate_pendtable(uint64_t *reg)
>  {
>      uint64_t val;
> diff --git a/xen/arch/arm/vgic-v3-its.c b/xen/arch/arm/vgic-v3-its.c
> index 36b44f2..d9dce3f 100644
> --- a/xen/arch/arm/vgic-v3-its.c
> +++ b/xen/arch/arm/vgic-v3-its.c
> @@ -258,8 +258,8 @@ static bool read_itte(struct virt_its *its, uint32_t devid, uint32_t evid,
>  }
>
>  #define SKIP_LPI_UPDATE 1
> -bool write_itte(struct virt_its *its, uint32_t devid, uint32_t evid,
> -                uint32_t collid, uint32_t vlpi, struct vcpu **vcpu)
> +static bool write_itte(struct virt_its *its, uint32_t devid, uint32_t evid,
> +                       uint32_t collid, uint32_t vlpi, struct vcpu **vcpu)
>  {
>      struct vits_itte *itte;
>
> @@ -421,6 +421,34 @@ static int its_handle_mapd(struct virt_its *its, uint64_t *cmdptr)
>      return ret;
>  }
>
> +static int its_handle_mapti(struct virt_its *its, uint64_t *cmdptr)
> +{
> +    uint32_t devid = its_cmd_get_deviceid(cmdptr);
> +    uint32_t eventid = its_cmd_get_id(cmdptr);
> +    uint32_t intid = its_cmd_get_physical_id(cmdptr);
> +    uint16_t collid = its_cmd_get_collection(cmdptr);
> +    struct pending_irq *pirq;
> +    struct vcpu *vcpu;
> +
> +    if ( its_cmd_get_command(cmdptr) == GITS_CMD_MAPI )
> +        intid = eventid;
> +
> +    pirq = gicv3_assign_guest_event(its->d, its->doorbell_address,
> +                                    devid, eventid, vcpu, intid);

This series does not compile. vcpu is not initialized.
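
To illustrate the ordering problem: in the quoted hunk, vcpu is only filled
in by write_itte(), which runs after gicv3_assign_guest_event() has already
consumed it. Below is a minimal sketch of resolving the VCPU before its first
use. All names (lookup_vcpu(), assign_event(), handle_mapti_sketch()) are
hypothetical stand-ins, not functions from the patch:

```c
#include <stddef.h>
#include <stdint.h>

/* Reduced stand-in for Xen's struct vcpu; only the ID matters here. */
struct vcpu {
    unsigned int vcpu_id;
};

/* Hypothetical collection lookup: maps a collection ID to a VCPU. */
static struct vcpu *lookup_vcpu(struct vcpu *table, size_t nr, uint16_t collid)
{
    return collid < nr ? &table[collid] : NULL;
}

/* Hypothetical consumer: returns the VCPU ID it would record, or -1. */
static int assign_event(const struct vcpu *v)
{
    return v ? (int)v->vcpu_id : -1;
}

/* The VCPU pointer is resolved and checked before anything reads it. */
static int handle_mapti_sketch(struct vcpu *table, size_t nr, uint16_t collid)
{
    struct vcpu *vcpu = lookup_vcpu(table, nr, collid);

    if ( !vcpu )
        return -1;

    return assign_event(vcpu);
}
```

How the real handler obtains the VCPU from the collection ID is for the
author to decide; the sketch only shows the initialize-before-use order.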

> +    if ( !pirq )
> +        return -1;
> +
> +    vgic_init_pending_irq(pirq, intid);
> +    write_lock(&its->d->arch.vgic.pend_lpi_tree_lock);
> +    radix_tree_insert(&its->d->arch.vgic.pend_lpi_tree, intid, pirq);
> +    write_unlock(&its->d->arch.vgic.pend_lpi_tree_lock);
> +
> +    if ( !write_itte(its, devid, eventid, collid, intid, &vcpu) )
> +        return -1;
> +
> +    return 0;
> +}
> +
>  #define ITS_CMD_BUFFER_SIZE(baser)      ((((baser) & 0xff) + 1) << 12)
>
>  static int vgic_its_handle_cmds(struct domain *d, struct virt_its *its,
> @@ -470,6 +498,10 @@ static int vgic_its_handle_cmds(struct domain *d, struct virt_its *its,
>          case GITS_CMD_MAPD:
>              ret = its_handle_mapd(its, cmdptr);
>             break;
> +        case GITS_CMD_MAPI:
> +        case GITS_CMD_MAPTI:
> +            ret = its_handle_mapti(its, cmdptr);
> +            break;
>          case GITS_CMD_SYNC:
>              /* We handle ITS commands synchronously, so we ignore SYNC. */
>             break;
> diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c
> index 2aee20f..94eb9c5 100644
> --- a/xen/arch/arm/vgic.c
> +++ b/xen/arch/arm/vgic.c
> @@ -61,7 +61,7 @@ struct vgic_irq_rank *vgic_rank_irq(struct vcpu *v, unsigned int irq)
>      return vgic_get_rank(v, rank);
>  }
>
> -static void vgic_init_pending_irq(struct pending_irq *p, unsigned int virq)
> +void vgic_init_pending_irq(struct pending_irq *p, unsigned int virq)
>  {
>      INIT_LIST_HEAD(&p->inflight);
>      INIT_LIST_HEAD(&p->lr_queue);
> diff --git a/xen/include/asm-arm/gic_v3_its.h b/xen/include/asm-arm/gic_v3_its.h
> index bc9b42a..35a3e22 100644
> --- a/xen/include/asm-arm/gic_v3_its.h
> +++ b/xen/include/asm-arm/gic_v3_its.h
> @@ -161,6 +161,12 @@ void gicv3_its_unmap_all_devices(struct domain *d);
>  int gicv3_allocate_host_lpi_block(struct domain *d, uint32_t *first_lpi);
>  void gicv3_free_host_lpi_block(uint32_t first_lpi);
>
> +struct pending_irq *gicv3_assign_guest_event(struct domain *d, paddr_t doorbell,
> +                                             uint32_t devid, uint32_t eventid,
> +                                             struct vcpu *v, uint32_t virt_lpi);
> +void gicv3_lpi_update_host_entry(uint32_t host_lpi, int domain_id,
> +                                 unsigned int vcpu_id, uint32_t virt_lpi);
> +
>  #else
>
>  static LIST_HEAD(host_its_list);
> diff --git a/xen/include/asm-arm/vgic.h b/xen/include/asm-arm/vgic.h
> index 9f48e9a..3fb7433 100644
> --- a/xen/include/asm-arm/vgic.h
> +++ b/xen/include/asm-arm/vgic.h
> @@ -298,6 +298,7 @@ extern struct vcpu *vgic_get_target_vcpu(struct vcpu *v, unsigned int virq);
>  extern void vgic_vcpu_inject_irq(struct vcpu *v, unsigned int virq);
>  extern void vgic_vcpu_inject_spi(struct domain *d, unsigned int virq);
>  extern void vgic_clear_pending_irqs(struct vcpu *v);
> +extern void vgic_init_pending_irq(struct pending_irq *p, unsigned int virq);
>  extern struct pending_irq *irq_to_pending(struct vcpu *v, unsigned int irq);
>  extern struct pending_irq *spi_to_pending(struct domain *d, unsigned int irq);
>  extern struct pending_irq *lpi_to_pending(struct domain *d, unsigned int irq);
> --
> 2.9.0
>


* Re: [PATCH v3 00/26] arm64: Dom0 ITS emulation
  2017-03-31 18:04 [PATCH v3 00/26] arm64: Dom0 ITS emulation Andre Przywara
                   ` (25 preceding siblings ...)
  2017-03-31 18:05 ` [PATCH v3 26/26] ARM: vGIC: advertise LPI support Andre Przywara
@ 2017-04-01 20:37 ` Julien Grall
  26 siblings, 0 replies; 52+ messages in thread
From: Julien Grall @ 2017-04-01 20:37 UTC (permalink / raw)
  To: Andre Przywara, Stefano Stabellini
  Cc: xen-devel, nd, Shanker Donthineni, Vijay Kilari

On 31/03/2017 19:04, Andre Przywara wrote:
> Hi,

Hi Andre,

> The time I planned for the indirect device table was spent on the above two
> items, so I will write this now while the reviewers are on it.
>
> I tried to check every error return and kick out every signed int.
> Also the bug that Vijay reported has been fixed (I hope).
> While the two command line parameters are still around, the Kconfig
> options have been removed.
> I tried to separate functions between the existing VGIC and the LPI
> and ITS code parts. However there is always some connection which prevents
> a clean separation (I tried several approaches).
> I checked using the vgic_ops structure, but that feels like abuse for some
> functions, also has issues since a GICv3 and a GICv3 with ITS are not
> really separate (both could have LPIs), and the latter would always be a
> superset of the former, which duplicates code and makes a separate
> vgic_ops questionable.

There are reasons why I asked for no direct calls to {,v}GICv3-specific
functions in the common code. As you may know, the common code is very
generic and is there to abstract the implementation of a specific GIC
controller. It will quickly become unmaintainable if we start to do that.

Do you think the Linux irqchip maintainers would accept direct GICv3 calls
in the generic code? They would clearly say no, and probably not in a very
polite way. It is similar here.

Regarding the vgic_ops, I asked to have separate vgic ops for vGICv3 and
vGICv3 with ITS because some code paths differ. For instance, we don't
support LPIs without an ITS yet. So it makes sense to prevent LPI code from
being called by mistake without an ITS. Duplicating a vgic_ops (at most 10
lines) is not much compared to the potential issues.

You also seem to assume that Xen compiled with ITS support will only run
on GICv3 ITS platforms. This is not true at all. The same binary can run
on a wide range of platforms.

This is not the first time I have asked for that. So let me be clear: I
will not accept any calls to GICv3-specific functions nor
#ifdef CONFIG_HAS_ITS in common code. It is a direct NAck from me.
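
As a rough illustration of the separation being asked for, here is a sketch
of common code dispatching through an ops structure instead of calling
controller-specific functions or using #ifdef. Every name in it is
hypothetical; it does not reflect the actual layout of Xen's vgic_ops. The
8192 threshold is LPI_OFFSET from the series.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical ops table; one instance per emulated controller flavour. */
struct vgic_ops_sketch {
    const char *name;
    /* NULL when this flavour has no LPI support (e.g. no ITS). */
    bool (*lpi_valid)(unsigned int virq);
};

/* ITS flavour only: LPIs start at interrupt ID 8192 (LPI_OFFSET). */
static bool its_lpi_valid(unsigned int virq)
{
    return virq >= 8192;
}

static const struct vgic_ops_sketch vgic_v3_ops = {
    .name = "vgic-v3",
    .lpi_valid = NULL,
};

static const struct vgic_ops_sketch vgic_v3_its_ops = {
    .name = "vgic-v3-its",
    .lpi_valid = its_lpi_valid,
};

/* Common code: no #ifdef and no direct call into GICv3/ITS code. */
static bool common_is_valid_lpi(const struct vgic_ops_sketch *ops,
                                unsigned int virq)
{
    return ops->lpi_valid ? ops->lpi_valid(virq) : false;
}
```

With this shape, the same binary can pick vgic_v3_ops or vgic_v3_its_ops at
runtime, and the LPI path cannot be reached by mistake on a platform without
an ITS.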

Cheers,

-- 
Julien Grall


* Re: [PATCH v3 02/26] ARM: GICv3: allocate LPI pending and property table
  2017-03-31 22:59   ` Stefano Stabellini
@ 2017-04-03  9:05     ` Andre Przywara
  2017-04-03 18:16       ` Stefano Stabellini
  0 siblings, 1 reply; 52+ messages in thread
From: Andre Przywara @ 2017-04-03  9:05 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: xen-devel, Julien Grall, Shanker Donthineni, Vijay Kilari

Hi,

On 31/03/17 23:59, Stefano Stabellini wrote:
> On Fri, 31 Mar 2017, Andre Przywara wrote:
>> The ARM GICv3 provides a new kind of interrupt called LPIs.
>> The pending bits and the configuration data (priority, enable bits) for
>> those LPIs are stored in tables in normal memory, which software has to
>> provide to the hardware.
>> Allocate the required memory, initialize it and hand it over to each
>> redistributor. The maximum number of LPIs to be used can be adjusted with
>> the command line option "max_lpi_bits", which defaults to 20 bits,
>> covering about one million LPIs.
>>
>> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
>> ---

[...]

>> +static unsigned int max_lpi_bits = 20;
>> +integer_param("max_lpi_bits", max_lpi_bits);
> 
> The only thing missing is checking that the user has passed max_lpi_bits
> or warn if she has not (or if the memory usage is too high).

Right, I was missing that.
So I went with the "if memory usage is too high" version here, since the
default of 20 bits results in a 16KB first level table only. I would
then start warning if the bits exceed 24 (which is 256KB).
From what I could learn, the ARM GIC-500 provides 16 bits of LPIs, and
Cavium advertises 20 bits, AFAIK (hence the default).
I don't know about other platforms or future machines, though, so we
should keep this in.
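
The sizes quoted above can be reproduced with a small calculation. This is
only a sketch under stated assumptions: 4KB pages, one 8-byte host LPI entry
per LPI, and 8-byte pointers in the first-level table (arm64). The pending
and property table sizes follow the "one bit per LPI" and "one byte per LPI"
comments in the patch. The function names are made up for this example.

```c
#include <stdint.h>

#define SKETCH_PAGE_SIZE      4096u  /* assumption: 4KB pages */
#define SKETCH_HOST_LPI_SIZE  8u     /* assumption: one 64-bit entry per LPI */
#define SKETCH_POINTER_SIZE   8u     /* assumption: 64-bit pointers */

/* Size of the first-level table: one pointer per page of host LPI entries. */
static uint64_t first_level_table_bytes(unsigned int lpi_bits)
{
    uint64_t nr_lpis = UINT64_C(1) << lpi_bits;
    uint64_t lpis_per_page = SKETCH_PAGE_SIZE / SKETCH_HOST_LPI_SIZE;

    return (nr_lpis / lpis_per_page) * SKETCH_POINTER_SIZE;
}

/* Pending table: one bit per LPI, allocated once per redistributor. */
static uint64_t pending_table_bytes(unsigned int lpi_bits)
{
    return (UINT64_C(1) << lpi_bits) / 8;
}

/* Property table: one byte per LPI, shared by all redistributors. */
static uint64_t property_table_bytes(unsigned int lpi_bits)
{
    return UINT64_C(1) << lpi_bits;
}
```

Under these assumptions, 20 bits give a 16KB first-level table and 24 bits
give 256KB, matching the numbers above.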

> Look at the way dom0_mem is implemented.

Thanks!
Andre.


* Re: [PATCH v3 02/26] ARM: GICv3: allocate LPI pending and property table
  2017-03-31 18:05 ` [PATCH v3 02/26] ARM: GICv3: allocate LPI pending and property table Andre Przywara
  2017-03-31 22:59   ` Stefano Stabellini
@ 2017-04-03 13:53   ` Julien Grall
  2017-04-03 14:01     ` Julien Grall
  1 sibling, 1 reply; 52+ messages in thread
From: Julien Grall @ 2017-04-03 13:53 UTC (permalink / raw)
  To: Andre Przywara, Stefano Stabellini
  Cc: xen-devel, Shanker Donthineni, Vijay Kilari

Hi Andre,

On 31/03/17 19:05, Andre Przywara wrote:
> diff --git a/xen/arch/arm/gic-v3-lpi.c b/xen/arch/arm/gic-v3-lpi.c
> new file mode 100644
> index 0000000..77f6009
> --- /dev/null
> +++ b/xen/arch/arm/gic-v3-lpi.c
> @@ -0,0 +1,209 @@

[...]

> +#include <xen/lib.h>
> +#include <xen/mm.h>
> +#include <xen/sizes.h>
> +#include <asm/gic.h>
> +#include <asm/gic_v3_defs.h>
> +#include <asm/gic_v3_its.h>
> +#include <asm/io.h>
> +#include <asm/page.h>
> +
> +#define LPI_PROPTABLE_NEEDS_FLUSHING    (1U << 0)

NIT: newline here.

> +/* Global state */
> +static struct {
> +    /* The global LPI property table, shared by all redistributors. */
> +    uint8_t *lpi_property;
> +    /*
> +     * Number of physical LPIs the host supports. This is a property of
> +     * the GIC hardware. We depart from the habit of naming these things
> +     * "physical" in Xen, as the GICv3/4 spec uses the term "physical LPI"
> +     * in a different context to differentiate them from "virtual LPIs".
> +     */
> +    unsigned long int nr_host_lpis;

On v2, you said you will rename this variable to max_host_lpi_ids and ...

> +    unsigned int flags;
> +} lpi_data;
> +
> +struct lpi_redist_data {
> +    void                *pending_table;
> +};
> +
> +static DEFINE_PER_CPU(struct lpi_redist_data, lpi_redist);
> +
> +#define MAX_PHYS_LPIS   (lpi_data.nr_host_lpis - LPI_OFFSET)

... this one to MAX_NR_PHYS_LPIS or even MAX_NR_HOST_LPIS to stay 
consistent.

So please do it.

> +
> +static int gicv3_lpi_allocate_pendtable(uint64_t *reg)
> +{
> +    uint64_t val;
> +    void *pendtable;
> +
> +    if ( this_cpu(lpi_redist).pending_table )
> +        return -EBUSY;
> +
> +    val  = GIC_BASER_CACHE_RaWaWb << GICR_PENDBASER_INNER_CACHEABILITY_SHIFT;
> +    val |= GIC_BASER_CACHE_SameAsInner << GICR_PENDBASER_OUTER_CACHEABILITY_SHIFT;
> +    val |= GIC_BASER_InnerShareable << GICR_PENDBASER_SHAREABILITY_SHIFT;
> +
> +    /*
> +     * The pending table holds one bit per LPI and even covers bits for
> +     * interrupt IDs below 8192, so we allocate the full range.
> +     * The GICv3 imposes a 64KB alignment requirement, also requires
> +     * physically contiguous memory.
> +     */
> +    pendtable = _xzalloc(lpi_data.nr_host_lpis / 8, SZ_64K);
> +    if ( !pendtable )
> +        return -ENOMEM;
> +
> +    /* Make sure the physical address can be encoded in the register. */
> +    if ( (virt_to_maddr(pendtable) & ~GENMASK_ULL(51, 16)) )

NIT: the middle ( ... ) are not necessary.

[...]

> +static int gicv3_lpi_set_proptable(void __iomem * rdist_base)
> +{
> +    uint64_t reg;
> +
> +    reg  = GIC_BASER_CACHE_RaWaWb << GICR_PROPBASER_INNER_CACHEABILITY_SHIFT;
> +    reg |= GIC_BASER_CACHE_SameAsInner << GICR_PROPBASER_OUTER_CACHEABILITY_SHIFT;
> +    reg |= GIC_BASER_InnerShareable << GICR_PROPBASER_SHAREABILITY_SHIFT;
> +
> +    /*
> +     * The property table is shared across all redistributors, so allocate
> +     * this only once, but return the same value on subsequent calls.
> +     */
> +    if ( !lpi_data.lpi_property )
> +    {
> +        /* The property table holds one byte per LPI. */
> +        void *table = _xmalloc(lpi_data.nr_host_lpis, SZ_4K);
> +
> +        if ( !table )
> +            return -ENOMEM;
> +
> +        /* Make sure the physical address can be encoded in the register. */
> +        if ( (virt_to_maddr(table) & ~GENMASK_ULL(51, 12)) )
> +        {
> +            xfree(table);
> +            return -ERANGE;
> +        }
> +        memset(table, GIC_PRI_IRQ | LPI_PROP_RES1, MAX_PHYS_LPIS);
> +        clean_and_invalidate_dcache_va_range(table, MAX_PHYS_LPIS);
> +        lpi_data.lpi_property = table;
> +    }
> +
> +    /* Encode the number of bits needed, minus one */
> +    reg |= (fls(lpi_data.nr_host_lpis - 1) - 1);

NIT: The outer ( ... ) are not necessary.
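
For reference, the "number of bits needed, minus one" encoding in the quoted
line can be checked in isolation. This is a sketch only: fls_sketch() is a
portable stand-in for Xen's fls(), and the helper assumes nr_host_lpis is a
power of two and at least 2.

```c
/* Portable stand-in for fls(): 1-based index of the highest set bit, 0 if none. */
static unsigned int fls_sketch(unsigned long x)
{
    unsigned int r = 0;

    while ( x )
    {
        r++;
        x >>= 1;
    }

    return r;
}

/*
 * Value for the ID bits field as computed in the quoted patch.
 * Assumes nr_host_lpis is a power of two and >= 2.
 */
static unsigned int propbaser_idbits(unsigned long nr_host_lpis)
{
    return fls_sketch(nr_host_lpis - 1) - 1;
}
```

With the default of 20 bits (nr_host_lpis = 1 << 20), the helper yields 19.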

[...]

> +int gicv3_lpi_init_rdist(void __iomem * rdist_base)
> +{
> +    uint32_t reg;
> +    uint64_t table_reg;
> +    int ret;
> +
> +    /* We don't support LPIs without an ITS. */
> +    if ( !gicv3_its_host_has_its() )
> +        return -ENODEV;
> +
> +    /* Make sure LPIs are disabled before setting up the tables. */
> +    reg = readl_relaxed(rdist_base + GICR_CTLR);
> +    if ( reg & GICR_CTLR_ENABLE_LPIS )
> +        return -EBUSY;
> +
> +    ret = gicv3_lpi_allocate_pendtable(&table_reg);
> +    if (ret)

Coding style:

if ( ... )

[...]

> +static unsigned int max_lpi_bits = 20;
> +integer_param("max_lpi_bits", max_lpi_bits);
> +
> +int gicv3_lpi_init_host_lpis(unsigned int hw_lpi_bits)
> +{

Again, please add a comment to explain why you don't sanitize the value 
from the command line.

> +    lpi_data.nr_host_lpis = BIT_ULL(min(hw_lpi_bits, max_lpi_bits));

Again, nr_host_lpis is "unsigned long" so why are you using BIT_ULL? 
Looking at the introduction of GENMASK_ULL, it likely means nr_host_lpis 
should be unsigned long long.

> +
> +    printk("GICv3: using at most %lu LPIs on the host.\n", MAX_PHYS_LPIS);
> +
> +    return 0;
> +}

[...]

> diff --git a/xen/include/asm-arm/gic_v3_defs.h b/xen/include/asm-arm/gic_v3_defs.h
> index 6bd25a5..7cdebc5 100644
> --- a/xen/include/asm-arm/gic_v3_defs.h
> +++ b/xen/include/asm-arm/gic_v3_defs.h
> @@ -44,7 +44,10 @@
>  #define GICC_SRE_EL2_ENEL1           (1UL << 3)
>
>  /* Additional bits in GICD_TYPER defined by GICv3 */
> -#define GICD_TYPE_ID_BITS_SHIFT 19
> +#define GICD_TYPE_ID_BITS_SHIFT      19
> +#define GICD_TYPE_ID_BITS(r)     ((((r) >> GICD_TYPE_ID_BITS_SHIFT) & 0x1f) + 1)

Please align with the rest.

[...]

> diff --git a/xen/include/asm-arm/gic_v3_its.h b/xen/include/asm-arm/gic_v3_its.h
> index 765a655..219d109 100644
> --- a/xen/include/asm-arm/gic_v3_its.h
> +++ b/xen/include/asm-arm/gic_v3_its.h
> @@ -40,6 +40,11 @@ void gicv3_its_dt_init(const struct dt_device_node *node);
>
>  bool gicv3_its_host_has_its(void);
>
> +int gicv3_lpi_init_rdist(void __iomem * rdist_base);
> +
> +/* Initialize the host structures for LPIs. */
> +int gicv3_lpi_init_host_lpis(unsigned int nr_lpis);

Again, in the implementation, the parameter is called "hw_lpi_bits". 
Please stay consistent.

> +
>  #else
>
>  static LIST_HEAD(host_its_list);
> @@ -53,6 +58,15 @@ static inline bool gicv3_its_host_has_its(void)
>      return false;
>  }
>
> +static inline int gicv3_lpi_init_rdist(void __iomem * rdist_base)
> +{
> +    return -ENODEV;
> +}
> +
> +static inline int gicv3_lpi_init_host_lpis(unsigned int nr_lpis)

Ditto.

> +{
> +    return 0;
> +}
>  #endif /* CONFIG_HAS_ITS */
>
>  #endif
> diff --git a/xen/include/asm-arm/irq.h b/xen/include/asm-arm/irq.h
> index 8f7a167..13528c0 100644
> --- a/xen/include/asm-arm/irq.h
> +++ b/xen/include/asm-arm/irq.h
> @@ -19,8 +19,16 @@ struct arch_irq_desc {
>  };
>
>  #define NR_LOCAL_IRQS	32
> +
> +/*
> + * This only covers the interrupts that Xen cares about, so SGIs, PPIs and
> + * SPIs. LPIs are too numerous, also only propagated to guests, so they are
> + * not included in this number.
> + */
>  #define NR_IRQS		1024
>
> +#define LPI_OFFSET      8192
> +
>  #define nr_irqs NR_IRQS
>  #define nr_static_irqs NR_IRQS
>  #define arch_hwdom_irqs(domid) NR_IRQS
> diff --git a/xen/include/xen/bitops.h b/xen/include/xen/bitops.h
> index bd0883a..9261e06 100644
> --- a/xen/include/xen/bitops.h
> +++ b/xen/include/xen/bitops.h
> @@ -5,11 +5,14 @@
>  /*
>   * Create a contiguous bitmask starting at bit position @l and ending at
>   * position @h. For example
> - * GENMASK(30, 21) gives us the 32bit vector 0x01fe00000.
> + * GENMASK(30, 21) gives us the 32bit vector 0x7fe00000.

This is a new addition in this series and should really be a separate 
patch with the appropriate maintainers CCed.

>   */
>  #define GENMASK(h, l) \
>      (((~0UL) << (l)) & (~0UL >> (BITS_PER_LONG - 1 - (h))))
>
> +#define GENMASK_ULL(h, l) \
> +    (((~0ULL) << (l)) & (~0ULL >> (BITS_PER_LONG_LONG - 1 - (h))))

This should also be a separate patch with BITS_PER_LONG_LONG too.
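
As a quick sanity check of the corrected comment and the new macro, the two
definitions can be exercised standalone. In this sketch BITS_PER_LONG and
BITS_PER_LONG_LONG are derived from sizeof() rather than taken from Xen's
headers, and addr_encodable() is a hypothetical helper mirroring the
"can the address be encoded in the register" checks in the patch.

```c
#include <limits.h>
#include <stdint.h>

/* Assumption: derived locally; Xen defines these in its own headers. */
#define BITS_PER_LONG       (sizeof(unsigned long) * CHAR_BIT)
#define BITS_PER_LONG_LONG  (sizeof(unsigned long long) * CHAR_BIT)

/* Macros as quoted in the patch. */
#define GENMASK(h, l) \
    (((~0UL) << (l)) & (~0UL >> (BITS_PER_LONG - 1 - (h))))

#define GENMASK_ULL(h, l) \
    (((~0ULL) << (l)) & (~0ULL >> (BITS_PER_LONG_LONG - 1 - (h))))

/* Non-zero if maddr only uses bits [51:16], i.e. is 64KB aligned and < 2^52. */
static int addr_encodable(uint64_t maddr)
{
    return (maddr & ~GENMASK_ULL(51, 16)) == 0;
}
```

GENMASK(30, 21) evaluates to 0x7fe00000 here, which agrees with the fixed
comment, and GENMASK_ULL(19, 5) gives the 0xfffe0 used for BUFPTR_MASK later
in the series.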

> +
>  /*
>   * ffs: find first bit set. This is defined the same way as
>   * the libc and compiler builtin ffs routines, therefore
>

Cheers,

-- 
Julien Grall


* Re: [PATCH v3 02/26] ARM: GICv3: allocate LPI pending and property table
  2017-04-03 13:53   ` Julien Grall
@ 2017-04-03 14:01     ` Julien Grall
  0 siblings, 0 replies; 52+ messages in thread
From: Julien Grall @ 2017-04-03 14:01 UTC (permalink / raw)
  To: Andre Przywara, Stefano Stabellini
  Cc: xen-devel, Shanker Donthineni, Vijay Kilari



On 03/04/17 14:53, Julien Grall wrote:
>> diff --git a/xen/include/xen/bitops.h b/xen/include/xen/bitops.h
>> index bd0883a..9261e06 100644
>> --- a/xen/include/xen/bitops.h
>> +++ b/xen/include/xen/bitops.h
>> @@ -5,11 +5,14 @@
>>  /*
>>   * Create a contiguous bitmask starting at bit position @l and ending at
>>   * position @h. For example
>> - * GENMASK(30, 21) gives us the 32bit vector 0x01fe00000.
>> + * GENMASK(30, 21) gives us the 32bit vector 0x7fe00000.
>
> This is a new addition in this series and should really be a separate
> patch with the appropriate maintainers CCed.
>
>>   */
>>  #define GENMASK(h, l) \
>>      (((~0UL) << (l)) & (~0UL >> (BITS_PER_LONG - 1 - (h))))
>>
>> +#define GENMASK_ULL(h, l) \
>> +    (((~0ULL) << (l)) & (~0ULL >> (BITS_PER_LONG_LONG - 1 - (h))))
>
> This should also be a separate patch with BITS_PER_LONG_LONG too.

BTW,  please take a look at the discussion:

https://patchwork.kernel.org/patch/8824091/

Cheers,

-- 
Julien Grall


* Re: [PATCH v3 03/26] ARM: GICv3 ITS: allocate device and collection table
  2017-03-31 18:05 ` [PATCH v3 03/26] ARM: GICv3 ITS: allocate device and collection table Andre Przywara
  2017-03-31 23:06   ` Stefano Stabellini
@ 2017-04-03 15:38   ` Julien Grall
  2017-04-03 17:22     ` Julien Grall
  1 sibling, 1 reply; 52+ messages in thread
From: Julien Grall @ 2017-04-03 15:38 UTC (permalink / raw)
  To: Andre Przywara, Stefano Stabellini
  Cc: xen-devel, Shanker Donthineni, Vijay Kilari

Hi Andre,

On 31/03/17 19:05, Andre Przywara wrote:
> Each ITS maps a pair of a DeviceID (for instance derived from a PCI
> b/d/f triplet) and an EventID (the MSI payload or interrupt ID) to a
> pair of LPI number and collection ID, which points to the target CPU.
> This mapping is stored in the device and collection tables, which software
> has to provide for the ITS to use.
> Allocate the required memory and hand it to the ITS.
> The maximum number of devices is limited to a compile-time constant
> exposed in Kconfig.
>
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>

Reviewed-by: Julien Grall <julien.grall@arm.com>

Cheers,

-- 
Julien Grall


* Re: [PATCH v3 04/26] ARM: GICv3 ITS: map ITS command buffer
  2017-03-31 18:05 ` [PATCH v3 04/26] ARM: GICv3 ITS: map ITS command buffer Andre Przywara
  2017-03-31 23:10   ` Stefano Stabellini
@ 2017-04-03 16:00   ` Julien Grall
  1 sibling, 0 replies; 52+ messages in thread
From: Julien Grall @ 2017-04-03 16:00 UTC (permalink / raw)
  To: Andre Przywara, Stefano Stabellini
  Cc: xen-devel, Shanker Donthineni, Vijay Kilari

Hi Andre,

On 31/03/17 19:05, Andre Przywara wrote:
> Instead of directly manipulating the tables in memory, an ITS driver
> sends commands via a ring buffer in normal system memory to the ITS h/w
> to create or alter the LPI mappings.
> Allocate memory for that buffer and tell the ITS about it to be able
> to send ITS commands.
>
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>

Reviewed-by: Julien Grall <julien.grall@arm.com>

Regards,

-- 
Julien Grall


* Re: [PATCH v3 03/26] ARM: GICv3 ITS: allocate device and collection table
  2017-04-03 15:38   ` Julien Grall
@ 2017-04-03 17:22     ` Julien Grall
  2017-04-03 19:39       ` Andre Przywara
  0 siblings, 1 reply; 52+ messages in thread
From: Julien Grall @ 2017-04-03 17:22 UTC (permalink / raw)
  To: Andre Przywara, Stefano Stabellini
  Cc: xen-devel, Shanker Donthineni, Vijay Kilari

Hi Andre,

On 03/04/17 16:38, Julien Grall wrote:
> On 31/03/17 19:05, Andre Przywara wrote:
>> Each ITS maps a pair of a DeviceID (for instance derived from a PCI
>> b/d/f triplet) and an EventID (the MSI payload or interrupt ID) to a
>> pair of LPI number and collection ID, which points to the target CPU.
>> This mapping is stored in the device and collection tables, which
>> software
>> has to provide for the ITS to use.
>> Allocate the required memory and hand it to the ITS.
>> The maximum number of devices is limited to a compile-time constant
>> exposed in Kconfig.
>>
>> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
>
> Reviewed-by: Julien Grall <julien.grall@arm.com>

Actually, I will withdraw my Reviewed-by. I did not spot that you kept the
command line option around, to which I clearly said no and gave some
reasons why. Sorry for the mess.

To explain it again: no one can possibly know how the DeviceIDs will be
spread on the platform without having the platform data sheet in hand.
If the platform provides more DeviceIDs and is not able to cope with that,
then it is a platform-specific quirk.

When we spoke face to face you agreed on this, so please drop this command
line option.

Cheers,

-- 
Julien Grall


* Re: [PATCH v3 05/26] ARM: GICv3 ITS: introduce ITS command handling
  2017-03-31 18:05 ` [PATCH v3 05/26] ARM: GICv3 ITS: introduce ITS command handling Andre Przywara
  2017-03-31 23:16   ` Stefano Stabellini
@ 2017-04-03 17:32   ` Julien Grall
  1 sibling, 0 replies; 52+ messages in thread
From: Julien Grall @ 2017-04-03 17:32 UTC (permalink / raw)
  To: Andre Przywara, Stefano Stabellini
  Cc: xen-devel, Shanker Donthineni, Vijay Kilari

Hi Andre,

I will be nice and repeat my comments. I am hoping they will be fixed in
the next version.

On 31/03/17 19:05, Andre Przywara wrote:

[...]

>  #define ITS_CMD_QUEUE_SZ                SZ_1M
>
> @@ -34,6 +37,147 @@ bool gicv3_its_host_has_its(void)
>      return !list_empty(&host_its_list);
>  }
>
> +#define BUFPTR_MASK                     GENMASK_ULL(19, 5)
> +static int its_send_command(struct host_its *hw_its, const void *its_cmd)
> +{
> +    /* Some small grace period in case the command queue is congested. */

This comment is a nice improvement. But as mentioned on the previous
version, it should make clear that the value is a guess. People will likely
ask why you chose 1ms whilst Linux is using 1s.

> +    s_time_t deadline = NOW() + MILLISECS(1);
> +    uint64_t readp, writep;
> +    int ret = -EBUSY;
> +
> +    /* No ITS commands from an interrupt handler (at the moment). */
> +    ASSERT(!in_irq());
> +
> +    spin_lock(&hw_its->cmd_lock);
> +
> +    do {
> +        readp = readq_relaxed(hw_its->its_base + GITS_CREADR) & BUFPTR_MASK;
> +        writep = readq_relaxed(hw_its->its_base + GITS_CWRITER) & BUFPTR_MASK;
> +
> +        if ( ((writep + ITS_CMD_SIZE) % ITS_CMD_QUEUE_SZ) != readp )
> +        {
> +            ret = 0;
> +            break;
> +        }
> +
> +        /*
> +         * If the command queue is full, wait for a bit in the hope it drains
> +         * before giving up.
> +         */
> +        spin_unlock(&hw_its->cmd_lock);
> +        cpu_relax();
> +        udelay(1);
> +        spin_lock(&hw_its->cmd_lock);
> +    } while ( NOW() <= deadline );
> +
> +    if ( ret )
> +    {
> +        spin_unlock(&hw_its->cmd_lock);
> +        printk(XENLOG_WARNING "ITS: command queue full.\n");

This function could be called from a domain. So please ratelimit the 
message (see printk_ratelimit).

You replied this morning that you will fix it in v4. I am hoping this will
be the case.
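
For context on when this warning fires, the queue-full test in the quoted
loop can be looked at on its own. This is a sketch, not the driver code: the
queue size is SZ_1M as in the patch, the 32-byte command size is stated here
as an assumption because ITS_CMD_SIZE is not defined in the quoted hunk, and
the helper names are made up.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_CMD_SIZE      32u          /* assumption: 32-byte ITS commands */
#define SKETCH_CMD_QUEUE_SZ  (1u << 20)   /* SZ_1M, as in the quoted patch */

/* Full: writing one more command would make the write pointer hit readp. */
static bool queue_is_full(uint64_t readp, uint64_t writep)
{
    return ((writep + SKETCH_CMD_SIZE) % SKETCH_CMD_QUEUE_SZ) == readp;
}

/* Empty: the ITS has consumed everything that was written. */
static bool queue_is_empty(uint64_t readp, uint64_t writep)
{
    return readp == writep;
}

/* Advance the write pointer by one command, wrapping at the queue end. */
static uint64_t advance_writep(uint64_t writep)
{
    return (writep + SKETCH_CMD_SIZE) % SKETCH_CMD_QUEUE_SZ;
}
```

One slot is always left unused so that "full" and "empty" stay
distinguishable, which is why the full check compares against writep plus
one command rather than writep itself.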

> +        return ret;
> +    }
> +
> +    memcpy(hw_its->cmd_buf + writep, its_cmd, ITS_CMD_SIZE);
> +    if ( hw_its->flags & HOST_ITS_FLUSH_CMD_QUEUE )
> +        clean_and_invalidate_dcache_va_range(hw_its->cmd_buf + writep,
> +                                             ITS_CMD_SIZE);
> +    else
> +        dsb(ishst);
> +
> +    writep = (writep + ITS_CMD_SIZE) % ITS_CMD_QUEUE_SZ;
> +    writeq_relaxed(writep & BUFPTR_MASK, hw_its->its_base + GITS_CWRITER);
> +
> +    spin_unlock(&hw_its->cmd_lock);
> +
> +    return 0;
> +}
> +
> +/* Wait for an ITS to finish processing all commands. */
> +static int gicv3_its_wait_commands(struct host_its *hw_its)
> +{
> +    /* Define an upper limit for our wait time. */

See my remark on the previous timeout comment.

> +    s_time_t deadline = NOW() + MILLISECS(100);
> +    uint64_t readp, writep;
> +
> +    do {
> +        spin_lock(&hw_its->cmd_lock);
> +        readp = readq_relaxed(hw_its->its_base + GITS_CREADR) & BUFPTR_MASK;
> +        writep = readq_relaxed(hw_its->its_base + GITS_CWRITER) & BUFPTR_MASK;
> +        spin_unlock(&hw_its->cmd_lock);
> +
> +        if ( readp == writep )
> +            return 0;
> +
> +        cpu_relax();
> +        udelay(1);
> +    } while ( NOW() <= deadline );
> +
> +    return -ETIMEDOUT;
> +}
> +

[...]

> +/* Set up the (1:1) collection mapping for the given host CPU. */
> +int gicv3_its_setup_collection(unsigned int cpu)
> +{
> +    struct host_its *its;
> +    int ret;
> +
> +    list_for_each_entry(its, &host_its_list, entry)
> +    {
> +        if ( !its->cmd_buf )

This check should be dropped.

[...]

> +/*
> + * Before an ITS gets initialized, it should be in a quiescent state, where
> + * all outstanding commands and transactions have finished.
> + * So if the ITS is already enabled, turn it off and wait for all outstanding
> + * operations to get processed by polling the QUIESCENT bit.
> + */
> +static int gicv3_disable_its(struct host_its *hw_its)
> +{
> +    uint32_t reg;
> +    /* A similar generous wait limit as we use for the command queue wait. */

See my above comments about the timeout.

> +    s_time_t deadline = NOW() + MILLISECS(100);
> +
> +    reg = readl_relaxed(hw_its->its_base + GITS_CTLR);
> +    if ( !(reg & GITS_CTLR_ENABLE) && (reg & GITS_CTLR_QUIESCENT) )
> +        return 0;
> +
> +    writel_relaxed(reg & ~GITS_CTLR_ENABLE, hw_its->its_base + GITS_CTLR);
> +
> +    do {
> +        reg = readl_relaxed(hw_its->its_base + GITS_CTLR);
> +        if ( reg & GITS_CTLR_QUIESCENT )
> +            return 0;
> +
> +        cpu_relax();
> +        udelay(1);
> +    } while ( NOW() <= deadline );
> +
> +    dprintk(XENLOG_ERR, "ITS not quiescent.\n");

dprintk will disappear on non-debug builds. But this message looks quite 
useful, so I would use printk.

[...]

> +uint64_t gicv3_get_redist_address(unsigned int cpu, bool use_pta)
> +{
> +    if ( use_pta )
> +        return per_cpu(lpi_redist, cpu).redist_addr & GENMASK_ULL(51, 16);
> +    else
> +        return per_cpu(lpi_redist, cpu).redist_id << 16;
> +}
> +
>  static int gicv3_lpi_allocate_pendtable(uint64_t *reg)
>  {
>      uint64_t val;
> diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
> index b84bc40..0e21cb2 100644
> --- a/xen/arch/arm/gic-v3.c
> +++ b/xen/arch/arm/gic-v3.c
> @@ -666,7 +666,21 @@ static int __init gicv3_populate_rdist(void)
>
>                  if ( typer & GICR_TYPER_PLPIS )
>                  {
> -                    int ret;
> +                    paddr_t rdist_addr;
> +                    int procnum, ret;

procnum should be unsigned.

> +
> +                    /*
> +                     * The ITS refers to redistributors either by their physical
> +                     * address or by their ID. Determine those two values and
> +                     * let the ITS code store them in per host CPU variables to
> +                     * later be able to address those redistributors.
> +                     */

I said it on v2 this morning and will repeat it for the record. This 
comment is not useful in itself here because redist_address could be used 
by other code. It would be more useful on top of the call to initialize 
the ITS, as it would explain why it is done there and not before.

[...]

> diff --git a/xen/include/asm-arm/gic_v3_defs.h b/xen/include/asm-arm/gic_v3_defs.h
> index 7cdebc5..b01b6ed 100644
> --- a/xen/include/asm-arm/gic_v3_defs.h
> +++ b/xen/include/asm-arm/gic_v3_defs.h


[...]

>  /* data structure for each hardware ITS */
>  struct host_its {
> @@ -88,6 +107,7 @@ struct host_its {
>      paddr_t size;
>      void __iomem *its_base;
>      unsigned int devid_bits;
> +    spinlock_t cmd_lock;

Again, initialization and clean-up of a field should be done in the same 
patch that added the field. Otherwise, it is easy to miss a bit of the 
code and it makes review more difficult.

So please initialize cmd_lock in this patch rather than in patch #6.

>      void *cmd_buf;
>      unsigned int flags;
>  };
> @@ -108,6 +128,13 @@ int gicv3_lpi_init_rdist(void __iomem * rdist_base);
>  int gicv3_lpi_init_host_lpis(unsigned int nr_lpis);
>  int gicv3_its_init(void);
>
> +/* Store the physical address and ID for each redistributor as read from DT. */
> +void gicv3_set_redist_address(paddr_t address, unsigned int redist_id);
> +uint64_t gicv3_get_redist_address(unsigned int cpu, bool use_pta);
> +
> +/* Map a collection for this host CPU to each host ITS. */
> +int gicv3_its_setup_collection(unsigned int cpu);
> +
>  #else
>
>  static LIST_HEAD(host_its_list);
> @@ -135,6 +162,17 @@ static inline int gicv3_its_init(void)
>  {
>      return 0;
>  }
> +
> +static inline void gicv3_set_redist_address(paddr_t address,
> +                                            unsigned int redist_id)
> +{
> +}
> +
> +static inline int gicv3_its_setup_collection(unsigned int cpu)
> +{

This function should never be called as it is gated by the presence of 
ITS. I would add a BUG() with a comment to ensure this is the case.

> +    return 0;
> +}
> +
>  #endif /* CONFIG_HAS_ITS */
>
>  #endif
>

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 02/26] ARM: GICv3: allocate LPI pending and property table
  2017-04-03  9:05     ` Andre Przywara
@ 2017-04-03 18:16       ` Stefano Stabellini
  0 siblings, 0 replies; 52+ messages in thread
From: Stefano Stabellini @ 2017-04-03 18:16 UTC (permalink / raw)
  To: Andre Przywara
  Cc: xen-devel, Julien Grall, Stefano Stabellini, Shanker Donthineni,
	Vijay Kilari

On Mon, 3 Apr 2017, Andre Przywara wrote:
> Hi,
> 
> On 31/03/17 23:59, Stefano Stabellini wrote:
> > On Fri, 31 Mar 2017, Andre Przywara wrote:
> >> The ARM GICv3 provides a new kind of interrupt called LPIs.
> >> The pending bits and the configuration data (priority, enable bits) for
> >> those LPIs are stored in tables in normal memory, which software has to
> >> provide to the hardware.
> >> Allocate the required memory, initialize it and hand it over to each
> >> redistributor. The maximum number of LPIs to be used can be adjusted with
> >> the command line option "max_lpi_bits", which defaults to 20 bits,
> >> covering about one million LPIs.
> >>
> >> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> >> ---
> 
> [...]
> 
> >> +static unsigned int max_lpi_bits = 20;
> >> +integer_param("max_lpi_bits", max_lpi_bits);
> > 
> > The only thing missing is checking that the user has passed max_lpi_bits
> > or warn if she has not (or if the memory usage is too high).
> 
> Right, I was missing that.
> So I went with the "if memory usage is too high" version here, since the
> default of 20 bits results in a 16KB first level table only. I would
> then start warning if the bits exceed 24 (which is 256KB).

Yes, but where is the warning? I cannot find it on this patch.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 06/26] ARM: GICv3 ITS: introduce device mapping
  2017-04-01  8:01   ` Vijay Kilari
@ 2017-04-03 18:33     ` Julien Grall
  0 siblings, 0 replies; 52+ messages in thread
From: Julien Grall @ 2017-04-03 18:33 UTC (permalink / raw)
  To: Vijay Kilari, Andre Przywara
  Cc: xen-devel, Stefano Stabellini, Shanker Donthineni

On 01/04/17 09:01, Vijay Kilari wrote:
> Hi Andre,

Hi Vijay,

> On Fri, Mar 31, 2017 at 11:35 PM, Andre Przywara <andre.przywara@arm.com> wrote:
>> +    /* An Interrupt Translation Table needs to be 256-byte aligned. */
>> +    itt_addr = _xzalloc(nr_events * hw_its->itte_size, 256);
>
>       As I mentioned, in previous version, if itt_addr is not enough size,
> ITS would overwrite and corrupt memory.
> Similar to size passed in MAPD cmd, itt_addr should also be allocated of size
> ROUNDUP(nr_events, LPI_BLOCK).

ROUNDUP(nr_events, LPI_BLOCK) would still be wrong as the MAPD command 
works in terms of bits. You have to round up to the next bit.
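
A standalone sketch of that rounding, assuming the ITT has to cover the 
full 2^bits range of EventIDs and that at least one bit is always used 
(since the command encodes the width minus one). event_bits() and 
itt_bytes() are hypothetical helper names, not functions from the series.

```c
#include <stdint.h>

/* Smallest number of EventID bits (at least 1) able to hold nr_events. */
static unsigned int event_bits(uint32_t nr_events)
{
    unsigned int bits = 1;

    /* 64-bit shift, so that large event counts cannot overflow. */
    while ( (UINT64_C(1) << bits) < nr_events )
        bits++;

    return bits;
}

/* Bytes needed for an ITT that covers the whole 2^bits range. */
static uint64_t itt_bytes(uint32_t nr_events, unsigned int itte_size)
{
    return (UINT64_C(1) << event_bits(nr_events)) * itte_size;
}
```

For instance, 33 events need 6 bits, so the table is sized for 64 entries 
rather than 33.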

Cheers,

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 06/26] ARM: GICv3 ITS: introduce device mapping
  2017-03-31 18:05 ` [PATCH v3 06/26] ARM: GICv3 ITS: introduce device mapping Andre Przywara
  2017-03-31 23:20   ` Stefano Stabellini
  2017-04-01  8:01   ` Vijay Kilari
@ 2017-04-03 18:56   ` Julien Grall
  2 siblings, 0 replies; 52+ messages in thread
From: Julien Grall @ 2017-04-03 18:56 UTC (permalink / raw)
  To: Andre Przywara, Stefano Stabellini
  Cc: xen-devel, Shanker Donthineni, Vijay Kilari

Hi Andre,

Mostly repeating my comments from the previous version.

On 31/03/17 19:05, Andre Przywara wrote:

[...]

> +static int its_send_cmd_mapd(struct host_its *its, uint32_t deviceid,
> +                             uint8_t size_bits, paddr_t itt_addr, bool valid)
> +{
> +    uint64_t cmd[4];
> +
> +    if ( valid )
> +    {
> +        ASSERT(size_bits < 32);

Again, it would be better if you did the check against the real number in 
hardware (i.e. GITS_TYPER.ID_bits).

> +        ASSERT(!(itt_addr & ~GENMASK_ULL(51, 8)));
> +    }
> +    cmd[0] = GITS_CMD_MAPD | ((uint64_t)deviceid << 32);
> +    cmd[1] = size_bits;

I would have expected to see size_bits - 1 here, so that this helper 
handles the encoding rather than relying on the callers to do it.
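
A standalone sketch of an encoder that applies the "minus one" in a single 
place. The DeviceID and size positions follow the quoted hunk; the command 
number 0x08, the ITT address in word 2 and the valid bit at bit 63 are 
assumptions about the MAPD layout and are not taken from the patch. 
encode_mapd() is a hypothetical name.

```c
#include <stdbool.h>
#include <stdint.h>

#define CMD_MAPD        0x08ULL                 /* assumed command number */
#define ITT_ADDR_MASK   0x000FFFFFFFFFFF00ULL   /* bits [51:8] */

/*
 * Fill a 4-word MAPD command. nr_bits is the real number of EventID bits
 * (1..32); the "minus one" encoding is applied here, once.
 * Returns false if the arguments cannot be encoded.
 */
static bool encode_mapd(uint64_t cmd[4], uint32_t deviceid,
                        unsigned int nr_bits, uint64_t itt_addr, bool valid)
{
    if ( valid &&
         (nr_bits < 1 || nr_bits > 32 || (itt_addr & ~ITT_ADDR_MASK)) )
        return false;

    cmd[0] = CMD_MAPD | ((uint64_t)deviceid << 32);
    cmd[1] = valid ? (nr_bits - 1) : 0;
    cmd[2] = valid ? (itt_addr | (UINT64_C(1) << 63)) : 0;   /* assumed */
    cmd[3] = 0;

    return true;
}
```

Callers then always pass the real width and cannot get the off-by-one 
wrong individually.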

[...]

> +static int remove_mapped_guest_device(struct its_devices *dev)
> +{
> +    int ret;
> +
> +    if ( dev->hw_its )
> +    {
> +        /* MAPD also discards all events with this device ID. */
> +        int ret = its_send_cmd_mapd(dev->hw_its, dev->host_devid, 0, 0, false);

You are re-defining ret. Why?
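
The effect of the inner declaration can be reproduced in isolation. This 
is a standalone sketch with made-up names (send_cmd_stub, remove_shadowed, 
remove_fixed); the outer variable is initialised to 0 here only so that 
the example has defined behaviour.

```c
/* Stand-in for a command that fails; the name is hypothetical. */
static int send_cmd_stub(void)
{
    return -5;
}

/* Buggy shape: the inner 'ret' shadows the outer one. */
static int remove_shadowed(int have_its)
{
    int ret = 0;

    if ( have_its )
    {
        int ret = send_cmd_stub();   /* new variable, outer ret untouched */

        (void)ret;
    }

    return ret;                      /* the error is lost */
}

/* Fixed shape: assign to the single outer variable. */
static int remove_fixed(int have_its)
{
    int ret = 0;

    if ( have_its )
        ret = send_cmd_stub();

    return ret;
}
```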

[...]

> +static struct host_its *gicv3_its_find_by_doorbell(paddr_t doorbell_address)
> +{
> +    struct host_its *hw_its;
> +
> +    list_for_each_entry(hw_its, &host_its_list, entry)
> +    {
> +        if ( hw_its->addr + ITS_DOORBELL_OFFSET == doorbell_address )

Again, why not store the ITS address rather than the doorbell, to avoid 
the "+ ITS_DOORBELL_OFFSET"?

[...]

> +/*
> + * Map a hardware device, identified by a certain host ITS and its device ID
> + * to domain d, a guest ITS (identified by its doorbell address) and device ID.
> + * Also provide the number of events (MSIs) needed for that device.
> + * This does not check if this particular hardware device is already mapped
> + * at another domain, it is expected that this would be done by the caller.
> + */
> +int gicv3_its_map_guest_device(struct domain *d,
> +                               paddr_t host_doorbell, uint32_t host_devid,
> +                               paddr_t guest_doorbell, uint32_t guest_devid,
> +                               uint32_t nr_events, bool valid)

I am sure I said it somewhere in this series, nr_events likely needs to 
be sanitized against the hardware value. Same for host_devid.
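
A standalone sketch of such a check, assuming the host ITS exposes a 
DeviceID width and an EventID width. struct its_limits and 
its_args_valid() are hypothetical; rejecting a zero event count is an 
extra assumption of the sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical view of the limits a host ITS advertises. */
struct its_limits {
    unsigned int devid_bits;   /* DeviceID width in bits */
    unsigned int evid_bits;    /* EventID width in bits */
};

/* Check a DeviceID and an event count against the advertised widths. */
static bool its_args_valid(const struct its_limits *lim,
                           uint32_t devid, uint32_t nr_events)
{
    /* A width of 32 or more covers every uint32_t value. */
    if ( lim->devid_bits < 32 && devid >= (UINT32_C(1) << lim->devid_bits) )
        return false;

    if ( nr_events == 0 )
        return false;

    if ( lim->evid_bits < 32 &&
         nr_events > (UINT32_C(1) << lim->evid_bits) )
        return false;

    return true;
}
```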

[...]

> +        parent = *new;
> +        cmp = compare_its_guest_devices(temp, guest_doorbell, guest_devid);
> +        if ( !cmp )
> +        {
> +            if ( !valid )
> +                rb_erase(&temp->rbnode, &d->arch.vgic.its_devices);
> +
> +            spin_unlock(&d->arch.vgic.its_devices_lock);
> +
> +            if ( valid )

Again, a printk(XENLOG_GUEST...) here would be useful to know which host 
DeviceID was associated with the guest DeviceID.

> +                return -EBUSY;
> +
> +            return remove_mapped_guest_device(temp);

Again, just above you removed the device from the RB-tree but this 
function may fail and never free the memory. This means that memory will 
be leaked leading to a potential denial of service.
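
One possible ordering that avoids the leak, as a standalone sketch: only 
unlink and free the device once the hardware command has succeeded, so a 
failed removal leaves it reachable for a retry. A singly linked list 
stands in for the RB-tree, locking is left out, and all names are 
hypothetical.

```c
#include <stdlib.h>

struct dev {                  /* stand-in for struct its_devices */
    struct dev *next;
    int unmap_should_fail;    /* lets the sketch simulate a failing command */
};

/* Stand-in for the unmapping command sent to the hardware. */
static int hw_unmap_stub(const struct dev *d)
{
    return d->unmap_should_fail ? -16 : 0;
}

/* Unlink and free 'victim' only after the hardware command succeeded. */
static int remove_device(struct dev **head, struct dev *victim)
{
    struct dev **p;
    int ret = hw_unmap_stub(victim);

    if ( ret )
        return ret;           /* still linked: nothing is lost */

    for ( p = head; *p; p = &(*p)->next )
    {
        if ( *p == victim )
        {
            *p = victim->next;
            free(victim);
            return 0;
        }
    }

    return -2;                /* victim was not on the list */
}
```

The other option is to keep the current order and free the memory on 
every path; whether that is safe while the hardware may still reference 
the ITT would need to be checked separately.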

> +        }
> +
> +        if ( cmp > 0 )
> +            new = &((*new)->rb_left);
> +        else
> +            new = &((*new)->rb_right);
> +    }
> +
> +    if ( !valid )
> +        goto out_unlock;
> +
> +    ret = -ENOMEM;
> +
> +    /* An Interrupt Translation Table needs to be 256-byte aligned. */
> +    itt_addr = _xzalloc(nr_events * hw_its->itte_size, 256);

See Vijay's comment. But why don't you round up nr_events once and for 
all at the beginning rather than doing it in the middle?

[...]

> +out_unlock:
> +    spin_unlock(&d->arch.vgic.its_devices_lock);
> +    if ( dev )
> +    {
> +        xfree(dev->pend_irqs);
> +        xfree(dev->host_lpi_blocks);

Where is host_lpi_blocks allocated? Why is it freed here?

> +    }
> +    xfree(itt_addr);
> +    xfree(dev);
> +    return ret;
> +}
> +
> +/* Removing any connections a domain had to any ITS in the system. */
> +void gicv3_its_unmap_all_devices(struct domain *d)
> +{
> +    struct rb_node *victim;
> +    struct its_devices *dev;
> +
> +    /*
> +     * This is an easily readable, but suboptimal implementation.
> +     * It uses the provided iteration wrapper and erases each node, which
> +     * possibly triggers rebalancing.
> +     * This seems overkill since we are going to abolish the whole tree, but
> +     * avoids an open-coded re-implementation of the traversal functions with
> +     * some recursive function calls.
> +     */

Well, you updated the comment but it does not make the performance 
problem go away... Xen cannot be preempted, so if it takes too long, 
you will have an impact on the overall system.

As said previously, I think it would be fair to assume that all devices 
will be deassigned before the ITS is destroyed. So I would just drop 
this function. Note that we have the same assumption in the SMMU driver.

If you disagree please say why. But ignoring comments will not help here.

> +restart:
> +    spin_lock(&d->arch.vgic.its_devices_lock);
> +    if ( (victim = rb_first(&d->arch.vgic.its_devices)) )
> +    {
> +        dev = rb_entry(victim, struct its_devices, rbnode);
> +        rb_erase(victim, &d->arch.vgic.its_devices);
> +
> +        spin_unlock(&d->arch.vgic.its_devices_lock);
> +
> +        remove_mapped_guest_device(dev);
> +
> +        goto restart;
> +    }
> +
> +    spin_unlock(&d->arch.vgic.its_devices_lock);
> +}
> +
>  /* Scan the DT for any ITS nodes and create a list of host ITSes out of it. */
>  void gicv3_its_dt_init(const struct dt_device_node *node)
>  {
> @@ -459,6 +685,7 @@ void gicv3_its_dt_init(const struct dt_device_node *node)
>          its_data->addr = addr;
>          its_data->size = size;
>          its_data->dt_node = its;
> +        spin_lock_init(&its_data->cmd_lock);

This should be in patch #5.

>
>          printk("GICv3: Found ITS @0x%lx\n", addr);
>
> diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c
> index d61479d..6242252 100644
> --- a/xen/arch/arm/vgic-v3.c
> +++ b/xen/arch/arm/vgic-v3.c
> @@ -1450,6 +1450,9 @@ static int vgic_v3_domain_init(struct domain *d)
>      d->arch.vgic.nr_regions = rdist_count;
>      d->arch.vgic.rdist_regions = rdist_regions;
>
> +    spin_lock_init(&d->arch.vgic.its_devices_lock);
> +    d->arch.vgic.its_devices = RB_ROOT;
> +

Again, the placement of those 2 lines is likely wrong. They should 
belong to the vITS and not the vgic-v3.

I think it would make sense to get a patch that introduces a skeleton 
for the vITS before this patch and start plumbing through.

>      /*
>       * Domain 0 gets the hardware address.
>       * Guests get the virtual platform layout.
> @@ -1522,6 +1525,7 @@ static int vgic_v3_domain_init(struct domain *d)
>
>  static void vgic_v3_domain_free(struct domain *d)
>  {
> +    gicv3_its_unmap_all_devices(d);

See my comment above regarding this function.

>      xfree(d->arch.vgic.rdist_regions);
>  }


[...]

> diff --git a/xen/include/asm-arm/gic_v3_its.h b/xen/include/asm-arm/gic_v3_its.h
> index 4c2ae1c..4ade5f6 100644
> --- a/xen/include/asm-arm/gic_v3_its.h
> +++ b/xen/include/asm-arm/gic_v3_its.h
> @@ -48,6 +48,10 @@
>  #define GITS_TYPER_DEVICE_ID_BITS(r)    (((r & GITS_TYPER_DEVIDS_MASK) >> \
>                                                 GITS_TYPER_DEVIDS_SHIFT) + 1)
>  #define GITS_TYPER_IDBITS_SHIFT         8
> +#define GITS_TYPER_ITT_SIZE_SHIFT       4
> +#define GITS_TYPER_ITT_SIZE_MASK        (0xfUL << GITS_TYPER_ITT_SIZE_SHIFT)
> +#define GITS_TYPER_ITT_SIZE(r)          ((((r) & GITS_TYPER_ITT_SIZE_MASK) >> \
> +                                                GITS_TYPER_ITT_SIZE_SHIFT) + 1)
>
>  #define GITS_IIDR_VALUE                 0x34c
>
> @@ -94,7 +98,10 @@
>  #define GITS_CMD_MOVALL                 0x0e
>  #define GITS_CMD_DISCARD                0x0f
>
> +#define ITS_DOORBELL_OFFSET             0x10040
> +
>  #include <xen/device_tree.h>
> +#include <xen/rbtree.h>
>
>  #define HOST_ITS_FLUSH_CMD_QUEUE        (1U << 0)
>  #define HOST_ITS_USES_PTA               (1U << 1)
> @@ -109,6 +116,7 @@ struct host_its {
>      unsigned int devid_bits;
>      spinlock_t cmd_lock;
>      void *cmd_buf;
> +    unsigned int itte_size;

Stefano mentioned:
"I would move itte_size and its initialization to the patch that
introduced struct host_its."

>      unsigned int flags;
>  };

Cheers,

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 03/26] ARM: GICv3 ITS: allocate device and collection table
  2017-04-03 17:22     ` Julien Grall
@ 2017-04-03 19:39       ` Andre Przywara
  2017-04-03 20:46         ` Julien Grall
  0 siblings, 1 reply; 52+ messages in thread
From: Andre Przywara @ 2017-04-03 19:39 UTC (permalink / raw)
  To: Julien Grall, Stefano Stabellini
  Cc: xen-devel, Shanker Donthineni, Vijay Kilari

Hi,

On 03/04/17 18:22, Julien Grall wrote:
> Hi Andre,
> 
> On 03/04/17 16:38, Julien Grall wrote:
>> On 31/03/17 19:05, Andre Przywara wrote:
>>> Each ITS maps a pair of a DeviceID (for instance derived from a PCI
>>> b/d/f triplet) and an EventID (the MSI payload or interrupt ID) to a
>>> pair of LPI number and collection ID, which points to the target CPU.
>>> This mapping is stored in the device and collection tables, which
>>> software
>>> has to provide for the ITS to use.
>>> Allocate the required memory and hand it to the ITS.
>>> The maximum number of devices is limited to a compile-time constant
>>> exposed in Kconfig.
>>>
>>> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
>>
>> Reviewed-by: Julien Grall <julien.grall@arm.com>
> 
> Actually I will withdraw my reviewed-by. I didn't spot you keep the
> command line around which I clearly say no and gave some reasons why.
> Sorry for the mess.

I thought we were talking about the Kconfig option to drop here (which
the commit msg wrongly states as still being around)?

For implementations that don't support indirect tables, but still
advertise high numbers, I'd find it useful to have the possibility to
limit this to avoid memory waste.

> To explain it again, no-one can possible know how the DeviceID will be
> spread on the platform without having the platform data sheet in hand.
> If the platform provide more DeviceID and is not able to cope with that.
> Then it is a platform specific quirk.
> When we spoke f2f you agree on this. So please drop this command line.

Sigh ...

Cheers,
Andre.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 03/26] ARM: GICv3 ITS: allocate device and collection table
  2017-04-03 19:39       ` Andre Przywara
@ 2017-04-03 20:46         ` Julien Grall
  0 siblings, 0 replies; 52+ messages in thread
From: Julien Grall @ 2017-04-03 20:46 UTC (permalink / raw)
  To: Andre Przywara, Stefano Stabellini
  Cc: xen-devel, Shanker Donthineni, Vijay Kilari

On 04/03/2017 08:39 PM, Andre Przywara wrote:
> Hi,
>
> On 03/04/17 18:22, Julien Grall wrote:
>> Hi Andre,
>>
>> On 03/04/17 16:38, Julien Grall wrote:
>>> On 31/03/17 19:05, Andre Przywara wrote:
>>>> Each ITS maps a pair of a DeviceID (for instance derived from a PCI
>>>> b/d/f triplet) and an EventID (the MSI payload or interrupt ID) to a
>>>> pair of LPI number and collection ID, which points to the target CPU.
>>>> This mapping is stored in the device and collection tables, which
>>>> software
>>>> has to provide for the ITS to use.
>>>> Allocate the required memory and hand it to the ITS.
>>>> The maximum number of devices is limited to a compile-time constant
>>>> exposed in Kconfig.
>>>>
>>>> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
>>>
>>> Reviewed-by: Julien Grall <julien.grall@arm.com>
>>
>> Actually I will withdraw my reviewed-by. I didn't spot you keep the
>> command line around which I clearly say no and gave some reasons why.
>> Sorry for the mess.
>
> I thought we were talking about the Kconfig option to drop here (which
> the commit msg wrongly states as still being around)?
>
> For implementations that don't support indirect tables, but still
> advertise high numbers, I'd find it useful to have the possibility to
> limit this to avoid memory waste.

Again, how will the user know the magic numbers? If the platform 
advertises a high device number, then it is none of our business. If the 
number needs to be reduced, this should be platform-specific code.

Cheers,

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v3 11/26] ARM: vGICv3: handle virtual LPI pending and property tables
  2017-03-31 18:05 ` [PATCH v3 11/26] ARM: vGICv3: handle virtual LPI pending and property tables Andre Przywara
@ 2017-04-04 12:55   ` Julien Grall
  2017-04-04 12:56     ` Julien Grall
  0 siblings, 1 reply; 52+ messages in thread
From: Julien Grall @ 2017-04-04 12:55 UTC (permalink / raw)
  To: Andre Przywara, Stefano Stabellini
  Cc: xen-devel, Shanker Donthineni, Vijay Kilari

Hi Andre,

On 31/03/17 19:05, Andre Przywara wrote:
> Allow a guest to provide the address and size for the memory regions
> it has reserved for the GICv3 pending and property tables.
> We sanitise the various fields of the respective redistributor
> registers and map those pages into Xen's address space to have easy
> access.
>
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> ---
>  xen/arch/arm/vgic-v3.c       | 136 +++++++++++++++++++++++++++++++++++++------
>  xen/common/memory.c          |  61 +++++++++++++++++++
>  xen/include/asm-arm/domain.h |   6 +-
>  xen/include/asm-arm/vgic.h   |   2 +
>  xen/include/xen/mm.h         |   8 +++
>  5 files changed, 195 insertions(+), 18 deletions(-)
>
> diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c
> index 69572e3..7f84fbf 100644
> --- a/xen/arch/arm/vgic-v3.c
> +++ b/xen/arch/arm/vgic-v3.c
> @@ -20,12 +20,14 @@
>
>  #include <xen/bitops.h>
>  #include <xen/config.h>
> +#include <xen/domain_page.h>
>  #include <xen/lib.h>
>  #include <xen/init.h>
>  #include <xen/softirq.h>
>  #include <xen/irq.h>
>  #include <xen/sched.h>
>  #include <xen/sizes.h>
> +#include <xen/vmap.h>
>  #include <asm/current.h>
>  #include <asm/mmio.h>
>  #include <asm/gic_v3_defs.h>
> @@ -229,12 +231,15 @@ static int __vgic_v3_rdistr_rd_mmio_read(struct vcpu *v, mmio_info_t *info,
>          goto read_reserved;
>
>      case VREG64(GICR_PROPBASER):
> -        /* LPI's not implemented */
> -        goto read_as_zero_64;
> +        if ( !vgic_reg64_check_access(dabt) ) goto bad_width;
> +        *r = vgic_reg64_extract(v->domain->arch.vgic.rdist_propbase, info);

vgic_reg64_extract will likely turn into a series of non-atomic 
operations. So how do you prevent issues?

> +        return 1;
>
>      case VREG64(GICR_PENDBASER):
> -        /* LPI's not implemented */
> -        goto read_as_zero_64;
> +        if ( !vgic_reg64_check_access(dabt) ) goto bad_width;
> +        *r = vgic_reg64_extract(v->arch.vgic.rdist_pendbase, info);

Ditto.

> +        *r &= ~GICR_PENDBASER_PTZ;       /* WO, reads as 0 */
> +        return 1;

[...]

>  static int __vgic_v3_rdistr_rd_mmio_write(struct vcpu *v, mmio_info_t *info,
>                                            uint32_t gicr_reg,
>                                            register_t r)
>  {
>      struct hsr_dabt dabt = info->dabt;
> +    uint64_t reg;
>
>      switch ( gicr_reg )
>      {
> @@ -394,36 +478,54 @@ static int __vgic_v3_rdistr_rd_mmio_write(struct vcpu *v, mmio_info_t *info,
>          goto write_impl_defined;
>
>      case VREG64(GICR_SETLPIR):
> -        /* LPI is not implemented */
> +        /* LPIs without an ITS are not implemented */
>          goto write_ignore_64;
>
>      case VREG64(GICR_CLRLPIR):
> -        /* LPI is not implemented */
> +        /* LPIs without an ITS are not implemented */
>          goto write_ignore_64;
>
>      case 0x0050:
>          goto write_reserved;
>
>      case VREG64(GICR_PROPBASER):
> -        /* LPI is not implemented */
> -        goto write_ignore_64;
> +        if ( !vgic_reg64_check_access(dabt) ) goto bad_width;
> +
> +        /* Writing PROPBASER with LPIs enabled is UNPREDICTABLE. */
> +        if ( v->arch.vgic.flags & VGIC_V3_LPIS_ENABLED )

v will point to the vCPU associated with the re-distributor. However, this 
could be updated from any vCPU. So I think there is a tiny race where 
v->arch.vgic.flags may not have been seen and therefore you will

> +            return 1;
> +
> +        reg = v->domain->arch.vgic.rdist_propbase;
> +        vgic_reg64_update(&reg, r, info);
> +        reg = sanitize_propbaser(reg);
> +        v->domain->arch.vgic.rdist_propbase = reg;

This code could be called concurrently and I don't think this will 
behave well.
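
A standalone sketch of one way to serialise such a read-modify-write, 
assuming a per-register lock is acceptable. A C11 atomic_flag stands in 
for a Xen spinlock, and struct vreg64, vreg_update() and vreg_read() are 
hypothetical names.

```c
#include <stdatomic.h>
#include <stdint.h>

struct vreg64 {
    atomic_flag lock;   /* stand-in for a spinlock */
    uint64_t val;
};

static void vreg_lock(struct vreg64 *r)
{
    while ( atomic_flag_test_and_set_explicit(&r->lock,
                                              memory_order_acquire) )
        ;               /* spin until the flag was clear */
}

static void vreg_unlock(struct vreg64 *r)
{
    atomic_flag_clear_explicit(&r->lock, memory_order_release);
}

/* Replace the bits selected by 'mask' with those from 'data'. */
static void vreg_update(struct vreg64 *r, uint64_t data, uint64_t mask)
{
    vreg_lock(r);
    r->val = (r->val & ~mask) | (data & mask);
    vreg_unlock(r);
}

/* Read the whole register consistently with concurrent updates. */
static uint64_t vreg_read(struct vreg64 *r)
{
    uint64_t v;

    vreg_lock(r);
    v = r->val;
    vreg_unlock(r);

    return v;
}
```

Two vCPUs writing different 32-bit halves then cannot lose each other's 
update, and a reader never observes a half-written value.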

> +        return 1;
>
>      case VREG64(GICR_PENDBASER):
> -        /* LPI is not implemented */
> -        goto write_ignore_64;
> +        if ( !vgic_reg64_check_access(dabt) ) goto bad_width;
> +
> +        /* Writing PENDBASER with LPIs enabled is UNPREDICTABLE. */
> +        if ( v->arch.vgic.flags & VGIC_V3_LPIS_ENABLED )

Ditto

> +            return 1;
> +
> +        reg = v->arch.vgic.rdist_pendbase;
> +        vgic_reg64_update(&reg, r, info);
> +        reg = sanitize_pendbaser(reg);
> +        v->arch.vgic.rdist_pendbase = reg;

Ditto.


> +        return 1;

[...]

> diff --git a/xen/common/memory.c b/xen/common/memory.c
> index 21797ca..29ef9bb 100644
> --- a/xen/common/memory.c
> +++ b/xen/common/memory.c

Any modification in common code should be a separate patch and have the 
appropriate maintainers CCed.

> @@ -1419,6 +1419,67 @@ int prepare_ring_for_helper(
>  }
>
>  /*
> + * Mark a given number of guest pages as used (by increasing their refcount),
> + * starting with the given guest address. This needs to be called once before
> + * calling (possibly repeatedly) map_one_guest_pages().
> + * Before the domain gets destroyed, call put_guest_pages() to drop the
> + * reference.
> + */

I was hoping that you would have taken my comments into account when 
you wrote the new functions but it seems they were ignored :/. I feel 
like it is a waste of my time to repeat the same comments again and again.

Both get_guest_pages and put_guest_pages are wrong because you are 
assuming the p2m mapping will never change. This is wrong and I asked for 
a forward plan for that, which seems to have been skipped too...

> +int get_guest_pages(struct domain *d, paddr_t gpa, unsigned int nr_pages)

Many comments here:
    * please use the type gfn_t rather than paddr_t,

> +{
> +    unsigned int i;
> +    struct page_info *page;
> +
> +    for ( i = 0; i < nr_pages; i++ )
> +    {
> +        page = get_page_from_gfn(d, (gpa >> PAGE_SHIFT) + i, NULL, P2M_ALLOC);

Get page may return a foreign page (e.g. belonging to another domain) and 
we don't want to use this type of page for ITS memory.

> +        if ( !page )
> +        {
> +            /* Make sure we drop the references of pages we got so far. */
> +            put_guest_pages(d, gpa, i);
> +            return -EINVAL;
> +        }
> +    }
> +
> +    return 0;
> +}
> +
> +void put_guest_pages(struct domain *d, paddr_t gpa, unsigned int nr_pages)

Same comments as above.

> +{
> +    mfn_t mfn;
> +    int i;
> +
> +    p2m_read_lock(&d->arch.p2m);
> +    for ( i = 0; i < nr_pages; i++ )
> +    {
> +        mfn = p2m_get_entry(&d->arch.p2m, _gfn((gpa >> PAGE_SHIFT) + i),
> +                            NULL, NULL, NULL);
> +        if ( mfn_eq(mfn, INVALID_MFN) )
> +            continue;
> +        put_page(mfn_to_page(mfn_x(mfn)));

This function is completely wrong in its current state. You assume that 
the stage-2 page table has not been modified by the guest between 
get_guest_pages and put_guest_pages. If it has been modified, you may 
remove a reference on the wrong page.

Furthermore, it is likely an error to have the mfn not valid in this case.

As we discussed earlier, the way forward is to protect the pages. It is 
not mandatory for DOM0, but a comment in the code is necessary to 
explain what is missing.
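
A standalone sketch of one way to avoid re-translating at put time: record 
the pages that were actually referenced and drop exactly those. This only 
illustrates the bookkeeping and does not address protecting the pages. 
The tiny fake p2m and all names are hypothetical.

```c
#include <stddef.h>

struct page { int refcount; };   /* stand-in for struct page_info */

#define MAX_PAGES 16

struct guest_region {
    struct page *pages[MAX_PAGES];   /* pages we hold a reference on */
    unsigned int nr;
};

/* Fake translation table: gfn 3 is unmapped. */
static struct page fake_pages[4];
static struct page *fake_p2m[4] = {
    &fake_pages[0], &fake_pages[1], &fake_pages[2], NULL
};

static struct page *fake_lookup(unsigned long gfn)
{
    return gfn < 4 ? fake_p2m[gfn] : NULL;
}

/* Drop the references recorded earlier, whatever the p2m says now. */
static void region_put(struct guest_region *r)
{
    while ( r->nr > 0 )
        r->pages[--r->nr]->refcount--;
}

static int region_get(struct guest_region *r, unsigned long gfn,
                      unsigned int nr_pages)
{
    unsigned int i;

    r->nr = 0;
    if ( nr_pages > MAX_PAGES )
        return -1;

    for ( i = 0; i < nr_pages; i++ )
    {
        struct page *pg = fake_lookup(gfn + i);

        if ( !pg )
        {
            region_put(r);           /* undo only what we took */
            return -1;
        }

        pg->refcount++;
        r->pages[r->nr++] = pg;
    }

    return 0;
}
```

Even if the guest remaps a gfn between the two calls, the reference is 
dropped on the page that was originally taken.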

> +    }
> +    p2m_read_unlock(&d->arch.p2m);
> +}
> +
> +/*
> + * Provides easy access to guest memory by "mapping" one page of it into
> + * Xen's VA space. In fact it relies on the memory being already mapped
> + * and just provides a pointer to it.
> + */
> +void *map_one_guest_page(struct domain *d, paddr_t guest_addr)
> +{
> +    void *ptr = map_domain_page(_mfn(guest_addr >> PAGE_SHIFT));

This might be correct for DOM0 but will not work for guests. Even if you 
don't support guests right now, we should really avoid such assumptions in 
the code. It will likely mean quite a lot of rework, which I'd like to 
see now.

Looking at how you use this, it would make more sense to have a helper 
to copy from the guest memory to a buffer. I think this is not the first 
time I am suggesting that.

I think this would also avoid protecting the guest memory.

For an example of what I mean, have a look at the vITS series sent by 
Cavium a year ago:

https://patchwork.kernel.org/patch/8177251/
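
Independently of that series, the shape of such a helper can be sketched 
standalone: translate one page at a time and copy in chunks that never 
cross a page boundary. The toy guest memory, translate() and 
copy_from_guest_sketch() are hypothetical and not an existing Xen 
interface.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PG_SHIFT 12
#define PG_SIZE  (1UL << PG_SHIFT)

/* Toy guest: 4 pages of RAM; page 2 is deliberately unmapped. */
static uint8_t guest_ram[4][PG_SIZE];

/* Stand-in for the gfn to host-page translation; NULL if unmapped. */
static const uint8_t *translate(uint64_t gfn)
{
    if ( gfn >= 4 || gfn == 2 )
        return NULL;

    return guest_ram[gfn];
}

/*
 * Copy 'len' bytes starting at guest physical address 'gpa' into 'buf'.
 * Returns 0 on success, -1 if a page is not mapped (buf may then be
 * partially filled).
 */
static int copy_from_guest_sketch(void *buf, uint64_t gpa, size_t len)
{
    uint8_t *dst = buf;

    while ( len > 0 )
    {
        size_t off = gpa & (PG_SIZE - 1);
        size_t chunk = PG_SIZE - off;       /* bytes left in this page */
        const uint8_t *page = translate(gpa >> PG_SHIFT);

        if ( !page )
            return -1;

        if ( chunk > len )
            chunk = len;

        memcpy(dst, page + off, chunk);
        dst += chunk;
        gpa += chunk;
        len -= chunk;
    }

    return 0;
}
```

Because every access goes through the translation step, the caller never 
holds a long-lived pointer into guest memory.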

> +
> +    return ptr + (guest_addr & ~PAGE_MASK);
> +}
> +
> +/* "Unmap" a previously mapped guest page. Could be optimized away. */
> +void unmap_one_guest_page(void *va)
> +{
> +    unmap_domain_page(((uintptr_t)va & PAGE_MASK));
> +}
> +
> +/*
>   * Local variables:
>   * mode: C
>   * c-file-style: "BSD"
> diff --git a/xen/include/asm-arm/domain.h b/xen/include/asm-arm/domain.h
> index a83904a..ad4dfdc 100644
> --- a/xen/include/asm-arm/domain.h
> +++ b/xen/include/asm-arm/domain.h
> @@ -110,6 +110,8 @@ struct arch_domain
>          } *rdist_regions;
>          int nr_regions;                     /* Number of rdist regions */
>          uint32_t rdist_stride;              /* Re-Distributor stride */
> +        unsigned int nr_lpis;

You switched to "unsigned long" in the gic-v3-its code, but still keep 
"unsigned int" here.

Why is that?

[...]

> diff --git a/xen/include/asm-arm/vgic.h b/xen/include/asm-arm/vgic.h
> index eabdf91..9f48e9a 100644
> --- a/xen/include/asm-arm/vgic.h
> +++ b/xen/include/asm-arm/vgic.h
> @@ -310,6 +310,8 @@ extern void register_vgic_ops(struct domain *d, const struct vgic_ops *ops);
>  int vgic_v2_init(struct domain *d, int *mmio_count);
>  int vgic_v3_init(struct domain *d, int *mmio_count);
>
> +extern int vgic_lpi_get_priority(struct domain *d, uint32_t vlpi);

The prototype should be added in the same patch as the definition, so
please move it to patch #10.

Cheers,

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


* Re: [PATCH v3 11/26] ARM: vGICv3: handle virtual LPI pending and property tables
  2017-04-04 12:55   ` Julien Grall
@ 2017-04-04 12:56     ` Julien Grall
  0 siblings, 0 replies; 52+ messages in thread
From: Julien Grall @ 2017-04-04 12:56 UTC (permalink / raw)
  To: Andre Przywara, Stefano Stabellini
  Cc: xen-devel, Shanker Donthineni, Vijay Kilari

Hmmm, I posted the comments on v3 rather than v4 :/. I will duplicate
them on v4 for convenience.

Cheers,

On 04/04/17 13:55, Julien Grall wrote:
> Hi Andre,
>
> On 31/03/17 19:05, Andre Przywara wrote:
>> Allow a guest to provide the address and size for the memory regions
>> it has reserved for the GICv3 pending and property tables.
>> We sanitise the various fields of the respective redistributor
>> registers and map those pages into Xen's address space to have easy
>> access.
>>
>> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
>> ---
>>  xen/arch/arm/vgic-v3.c       | 136
>> +++++++++++++++++++++++++++++++++++++------
>>  xen/common/memory.c          |  61 +++++++++++++++++++
>>  xen/include/asm-arm/domain.h |   6 +-
>>  xen/include/asm-arm/vgic.h   |   2 +
>>  xen/include/xen/mm.h         |   8 +++
>>  5 files changed, 195 insertions(+), 18 deletions(-)
>>
>> diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c
>> index 69572e3..7f84fbf 100644
>> --- a/xen/arch/arm/vgic-v3.c
>> +++ b/xen/arch/arm/vgic-v3.c
>> @@ -20,12 +20,14 @@
>>
>>  #include <xen/bitops.h>
>>  #include <xen/config.h>
>> +#include <xen/domain_page.h>
>>  #include <xen/lib.h>
>>  #include <xen/init.h>
>>  #include <xen/softirq.h>
>>  #include <xen/irq.h>
>>  #include <xen/sched.h>
>>  #include <xen/sizes.h>
>> +#include <xen/vmap.h>
>>  #include <asm/current.h>
>>  #include <asm/mmio.h>
>>  #include <asm/gic_v3_defs.h>
>> @@ -229,12 +231,15 @@ static int __vgic_v3_rdistr_rd_mmio_read(struct
>> vcpu *v, mmio_info_t *info,
>>          goto read_reserved;
>>
>>      case VREG64(GICR_PROPBASER):
>> -        /* LPI's not implemented */
>> -        goto read_as_zero_64;
>> +        if ( !vgic_reg64_check_access(dabt) ) goto bad_width;
>> +        *r = vgic_reg64_extract(v->domain->arch.vgic.rdist_propbase,
>> info);
>
> vgic_reg64_extract will likely turn into a series of non-atomic
> operations. So how do you prevent issues?
>
>> +        return 1;
>>
>>      case VREG64(GICR_PENDBASER):
>> -        /* LPI's not implemented */
>> -        goto read_as_zero_64;
>> +        if ( !vgic_reg64_check_access(dabt) ) goto bad_width;
>> +        *r = vgic_reg64_extract(v->arch.vgic.rdist_pendbase, info);
>
> Ditto.
>
>> +        *r &= ~GICR_PENDBASER_PTZ;       /* WO, reads as 0 */
>> +        return 1;
>
> [...]
>
>>  static int __vgic_v3_rdistr_rd_mmio_write(struct vcpu *v, mmio_info_t
>> *info,
>>                                            uint32_t gicr_reg,
>>                                            register_t r)
>>  {
>>      struct hsr_dabt dabt = info->dabt;
>> +    uint64_t reg;
>>
>>      switch ( gicr_reg )
>>      {
>> @@ -394,36 +478,54 @@ static int __vgic_v3_rdistr_rd_mmio_write(struct
>> vcpu *v, mmio_info_t *info,
>>          goto write_impl_defined;
>>
>>      case VREG64(GICR_SETLPIR):
>> -        /* LPI is not implemented */
>> +        /* LPIs without an ITS are not implemented */
>>          goto write_ignore_64;
>>
>>      case VREG64(GICR_CLRLPIR):
>> -        /* LPI is not implemented */
>> +        /* LPIs without an ITS are not implemented */
>>          goto write_ignore_64;
>>
>>      case 0x0050:
>>          goto write_reserved;
>>
>>      case VREG64(GICR_PROPBASER):
>> -        /* LPI is not implemented */
>> -        goto write_ignore_64;
>> +        if ( !vgic_reg64_check_access(dabt) ) goto bad_width;
>> +
>> +        /* Writing PROPBASER with LPIs enabled is UNPREDICTABLE. */
>> +        if ( v->arch.vgic.flags & VGIC_V3_LPIS_ENABLED )
>
> v will point to the vCPU associated with the re-distributor. However, this
> could be updated from any vCPU. So I think there is a tiny race where
> v->arch.vgic.flags may not have been seen and therefore you will
>
>> +            return 1;
>> +
>> +        reg = v->domain->arch.vgic.rdist_propbase;
>> +        vgic_reg64_update(&reg, r, info);
>> +        reg = sanitize_propbaser(reg);
>> +        v->domain->arch.vgic.rdist_propbase = reg;
>
> This code could be called concurrently and I don't think this will
> behave well.
>
>> +        return 1;
>>
>>      case VREG64(GICR_PENDBASER):
>> -        /* LPI is not implemented */
>> -        goto write_ignore_64;
>> +        if ( !vgic_reg64_check_access(dabt) ) goto bad_width;
>> +
>> +        /* Writing PENDBASER with LPIs enabled is UNPREDICTABLE. */
>> +        if ( v->arch.vgic.flags & VGIC_V3_LPIS_ENABLED )
>
> Ditto
>
>> +            return 1;
>> +
>> +        reg = v->arch.vgic.rdist_pendbase;
>> +        vgic_reg64_update(&reg, r, info);
>> +        reg = sanitize_pendbaser(reg);
>> +        v->arch.vgic.rdist_pendbase = reg;
>
> Ditto.
>
>
>> +        return 1;
>
> [...]
>
>> diff --git a/xen/common/memory.c b/xen/common/memory.c
>> index 21797ca..29ef9bb 100644
>> --- a/xen/common/memory.c
>> +++ b/xen/common/memory.c
>
> Any modification in common code should be a separate patch and have
> appropriate maintainers CCed.
>
>> @@ -1419,6 +1419,67 @@ int prepare_ring_for_helper(
>>  }
>>
>>  /*
>> + * Mark a given number of guest pages as used (by increasing their
>> refcount),
>> + * starting with the given guest address. This needs to be called
>> once before
>> + * calling (possibly repeatedly) map_one_guest_pages().
>> + * Before the domain gets destroyed, call put_guest_pages() to drop the
>> + * reference.
>> + */
>
> I was hoping that you would have taken my comments into account when
> you wrote the new functions, but it seems they were ignored :/. I feel
> like it is a waste of my time to repeat the same comments again and again.
>
> Both get_guest_pages and put_guest_pages are wrong because you are
> assuming the p2m mapping will never change. This is wrong, and I asked for
> a forward plan for that, which seems to have been skipped too...
>
>> +int get_guest_pages(struct domain *d, paddr_t gpa, unsigned int
>> nr_pages)
>
> Many comments here:
>    * please use the type gfn_t rather than paddr_t,
>
>> +{
>> +    unsigned int i;
>> +    struct page_info *page;
>> +
>> +    for ( i = 0; i < nr_pages; i++ )
>> +    {
>> +        page = get_page_from_gfn(d, (gpa >> PAGE_SHIFT) + i, NULL,
>> P2M_ALLOC);
>
> get_page_from_gfn may return a foreign page (i.e. one belonging to another
> domain) and we don't want to use this type of page for ITS memory.
>
>> +        if ( !page )
>> +        {
>> +            /* Make sure we drop the references of pages we got so
>> far. */
>> +            put_guest_pages(d, gpa, i);
>> +            return -EINVAL;
>> +        }
>> +    }
>> +
>> +    return 0;
>> +}
>> +
>> +void put_guest_pages(struct domain *d, paddr_t gpa, unsigned int
>> nr_pages)
>
> Same comments as above.
>
>> +{
>> +    mfn_t mfn;
>> +    int i;
>> +
>> +    p2m_read_lock(&d->arch.p2m);
>> +    for ( i = 0; i < nr_pages; i++ )
>> +    {
>> +        mfn = p2m_get_entry(&d->arch.p2m, _gfn((gpa >> PAGE_SHIFT) + i),
>> +                            NULL, NULL, NULL);
>> +        if ( mfn_eq(mfn, INVALID_MFN) )
>> +            continue;
>> +        put_page(mfn_to_page(mfn_x(mfn)));
>
> This function is completely wrong in its current state. You assume that
> the stage-2 page table has not been modified by the guest between
> get_guest_pages and put_guest_pages. If it has been modified, you may
> remove a reference on the wrong page.
>
> Furthermore, it is likely an error for the mfn to be invalid in this case.
>
> As we discussed earlier, the way forward is to protect the pages. It is
> not mandatory for DOM0, but a comment in the code is necessary to
> explain what is missing.
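
The failure mode described above can be reproduced with a small toy model.
This is only an illustration under stated assumptions: struct toy_page,
struct toy_p2m and the toy_* functions are hypothetical and do not exist in
Xen. The sketch contrasts dropping a reference by translating the gfn a
second time (as put_guest_pages() does in the patch) with dropping it on
the page that was actually taken. It is not the fix requested in the
review, which is to protect the pages or to avoid long-lived references
altogether.

```c
#include <stddef.h>

#define TOY_NR_GFNS 2

struct toy_page { int refcount; };

/* Hypothetical stage-2 table: guest frame number -> backing page. */
struct toy_p2m { struct toy_page *map[TOY_NR_GFNS]; };

/* Take a reference on whatever page currently backs gfn. */
static struct toy_page *toy_get_page(struct toy_p2m *p2m, unsigned int gfn)
{
    struct toy_page *pg = (gfn < TOY_NR_GFNS) ? p2m->map[gfn] : NULL;

    if ( pg )
        pg->refcount++;
    return pg;
}

/* Fragile: translates the gfn again, so it trusts the mapping is unchanged. */
static void toy_put_by_gfn(struct toy_p2m *p2m, unsigned int gfn)
{
    struct toy_page *pg = (gfn < TOY_NR_GFNS) ? p2m->map[gfn] : NULL;

    if ( pg )
        pg->refcount--;
}

/* Drops the reference on exactly the page that was handed out earlier. */
static void toy_put_page(struct toy_page *pg)
{
    if ( pg )
        pg->refcount--;
}
```

If the guest remaps the frame between the get and the put, toy_put_by_gfn()
decrements the counter of the new page while the old page keeps its extra
reference.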
>
>> +    }
>> +    p2m_read_unlock(&d->arch.p2m);
>> +}
>> +
>> +/*
>> + * Provides easy access to guest memory by "mapping" one page of it into
>> + * Xen's VA space. In fact it relies on the memory being already mapped
>> + * and just provides a pointer to it.
>> + */
>> +void *map_one_guest_page(struct domain *d, paddr_t guest_addr)
>> +{
>> +    void *ptr = map_domain_page(_mfn(guest_addr >> PAGE_SHIFT));
>
> This might be correct for DOM0 but will not work for a guest. Even if you
> don't support guests right now, we should really avoid such assumptions in
> the code. Fixing this will likely mean quite a lot of rework, which I'd
> like to see now.
>
> Looking at how you use this, it would make more sense to have a helper
> to copy from guest memory to a buffer. I think this is not the first
> time I have suggested that.
>
> I think this would also avoid the need to protect the guest memory.
>
> For an example of what I mean, have a look at the vITS series sent by
> Cavium a year ago:
>
> https://patchwork.kernel.org/patch/8177251/
>
>> +
>> +    return ptr + (guest_addr & ~PAGE_MASK);
>> +}
>> +
>> +/* "Unmap" a previously mapped guest page. Could be optimized away. */
>> +void unmap_one_guest_page(void *va)
>> +{
>> +    unmap_domain_page(((uintptr_t)va & PAGE_MASK));
>> +}
>> +
>> +/*
>>   * Local variables:
>>   * mode: C
>>   * c-file-style: "BSD"
>> diff --git a/xen/include/asm-arm/domain.h b/xen/include/asm-arm/domain.h
>> index a83904a..ad4dfdc 100644
>> --- a/xen/include/asm-arm/domain.h
>> +++ b/xen/include/asm-arm/domain.h
>> @@ -110,6 +110,8 @@ struct arch_domain
>>          } *rdist_regions;
>>          int nr_regions;                     /* Number of rdist
>> regions */
>>          uint32_t rdist_stride;              /* Re-Distributor stride */
>> +        unsigned int nr_lpis;
>
> You switched to "unsigned long" in the gic-v3-its code, but still keep
> "unsigned int" here.
>
> Why is that?
>
> [...]
>
>> diff --git a/xen/include/asm-arm/vgic.h b/xen/include/asm-arm/vgic.h
>> index eabdf91..9f48e9a 100644
>> --- a/xen/include/asm-arm/vgic.h
>> +++ b/xen/include/asm-arm/vgic.h
>> @@ -310,6 +310,8 @@ extern void register_vgic_ops(struct domain *d,
>> const struct vgic_ops *ops);
>>  int vgic_v2_init(struct domain *d, int *mmio_count);
>>  int vgic_v3_init(struct domain *d, int *mmio_count);
>>
>> +extern int vgic_lpi_get_priority(struct domain *d, uint32_t vlpi);
>
> The prototype should be added in the same patch as the definition, so
> please move it to patch #10.
>
> Cheers,
>

-- 
Julien Grall



* Re: [PATCH v3 26/26] ARM: vGIC: advertise LPI support
  2017-03-31 18:05 ` [PATCH v3 26/26] ARM: vGIC: advertise LPI support Andre Przywara
@ 2017-04-04 17:06   ` Julien Grall
  0 siblings, 0 replies; 52+ messages in thread
From: Julien Grall @ 2017-04-04 17:06 UTC (permalink / raw)
  To: Andre Przywara, Stefano Stabellini
  Cc: xen-devel, Shanker Donthineni, Vijay Kilari

Hi Andre,

On 31/03/17 19:05, Andre Przywara wrote:
> To let a guest know about the availability of virtual LPIs, set the
> respective bits in the virtual GIC registers and let a guest control
> the LPI enable bit.
> Only report the LPI capability if the host has initialized at least
> one ITS.
>
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> ---
>  xen/arch/arm/vgic-v3.c | 74 ++++++++++++++++++++++++++++++++++++++++++++++----
>  1 file changed, 69 insertions(+), 5 deletions(-)
>
> diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c
> index 22a7b1b..47dad6a 100644
> --- a/xen/arch/arm/vgic-v3.c
> +++ b/xen/arch/arm/vgic-v3.c
> @@ -169,8 +169,10 @@ static int __vgic_v3_rdistr_rd_mmio_read(struct vcpu *v, mmio_info_t *info,
>      switch ( gicr_reg )
>      {
>      case VREG32(GICR_CTLR):
> -        /* We have not implemented LPI's, read zero */
> -        goto read_as_zero_32;
> +        if ( dabt.size != DABT_WORD ) goto bad_width;
> +        *r = vgic_reg32_extract(!!(v->arch.vgic.flags & VGIC_V3_LPIS_ENABLED),

It would be more readable to do:

uint32_t ctlr;

ctlr = !!(v->arch.vgic.flags & VGIC_V3_LPIS_ENABLED);
*r = vgic_reg32_extract(ctlr, info);

> +                                info);
> +        return 1;
>
>      case VREG32(GICR_IIDR):
>          if ( dabt.size != DABT_WORD ) goto bad_width;
> @@ -182,16 +184,19 @@ static int __vgic_v3_rdistr_rd_mmio_read(struct vcpu *v, mmio_info_t *info,
>          uint64_t typer, aff;
>
>          if ( !vgic_reg64_check_access(dabt) ) goto bad_width;
> -        /* TBD: Update processor id in [23:8] when ITS support is added */
>          aff = (MPIDR_AFFINITY_LEVEL(v->arch.vmpidr, 3) << 56 |
>                 MPIDR_AFFINITY_LEVEL(v->arch.vmpidr, 2) << 48 |
>                 MPIDR_AFFINITY_LEVEL(v->arch.vmpidr, 1) << 40 |
>                 MPIDR_AFFINITY_LEVEL(v->arch.vmpidr, 0) << 32);
>          typer = aff;
> +        typer |= (v->vcpu_id & 0xffff) << 8;

Please explain in the commit message why this change is needed. At first
glance, this should be a separate patch.
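
For reference, a minimal sketch of how the GICR_TYPER value in this hunk is
composed. The bit positions follow the GICv3 architecture (PLPIS at bit 0,
Last at bit 4, Processor_Number in bits [23:8], Affinity_Value in bits
[63:32]); the TOY_* macro names and toy_gicr_typer() are hypothetical and
exist only for this illustration. It shows that the new line fills the
Processor_Number field with the vCPU ID, truncated to 16 bits.

```c
#include <stdbool.h>
#include <stdint.h>

/* Field layout of GICR_TYPER; the names are local to this sketch. */
#define TOY_GICR_TYPER_PLPIS          (UINT64_C(1) << 0)
#define TOY_GICR_TYPER_LAST           (UINT64_C(1) << 4)
#define TOY_GICR_TYPER_PROCNUM_SHIFT  8
#define TOY_GICR_TYPER_PROCNUM_MASK   UINT64_C(0xffff)
#define TOY_GICR_TYPER_AFF_SHIFT      32

/* affinity is Aff3.Aff2.Aff1.Aff0 packed into 32 bits. */
static uint64_t toy_gicr_typer(uint32_t affinity, unsigned int vcpu_id,
                               bool last, bool has_its)
{
    uint64_t typer = (uint64_t)affinity << TOY_GICR_TYPER_AFF_SHIFT;

    /* Processor_Number: widen before shifting, keep only 16 bits. */
    typer |= ((uint64_t)vcpu_id & TOY_GICR_TYPER_PROCNUM_MASK)
             << TOY_GICR_TYPER_PROCNUM_SHIFT;

    if ( last )
        typer |= TOY_GICR_TYPER_LAST;
    if ( has_its )
        typer |= TOY_GICR_TYPER_PLPIS;

    return typer;
}
```

For example, a vCPU with affinity 0.0.1.1 and ID 3 that is the last
redistributor and has an ITS yields 0x0000010100000311.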

>
>          if ( v->arch.vgic.flags & VGIC_V3_RDIST_LAST )
>              typer |= GICR_TYPER_LAST;
>
> +        if ( v->domain->arch.vgic.has_its )
> +            typer |= GICR_TYPER_PLPIS;
> +
>          *r = vgic_reg64_extract(typer, info);
>
>          return 1;
> @@ -434,6 +439,35 @@ static uint64_t sanitize_pendbaser(uint64_t reg)
>      return reg;
>  }
>
> +static void vgic_vcpu_enable_lpis(struct vcpu *v)
> +{
> +    uint64_t reg = v->domain->arch.vgic.rdist_propbase;
> +    unsigned int nr_lpis = BIT((reg & 0x1f) + 1) - LPI_OFFSET;
> +    int nr_pages;
> +
> +    /* The first VCPU to enable LPIs maps the property table. */
> +    if ( !v->domain->arch.vgic.nr_lpis )
> +    {
> +        v->domain->arch.vgic.nr_lpis = nr_lpis;
> +
> +        nr_pages = DIV_ROUND_UP(nr_lpis, PAGE_SIZE);
> +        get_guest_pages(v->domain, reg & GENMASK_ULL(51, 12), nr_pages);
> +        gprintk(XENLOG_INFO, "VGIC-v3: VCPU%d mapped %d pages for property table\n",
> +               v->vcpu_id, nr_pages);
> +    }
> +    nr_pages = DIV_ROUND_UP(((nr_lpis + LPI_OFFSET) / 8), PAGE_SIZE);
> +    reg = v->arch.vgic.rdist_pendbase;
> +
> +    get_guest_pages(v->domain, reg & GENMASK_ULL(51, 12), nr_pages);
> +
> +    gprintk(XENLOG_INFO, "VGIC-v3: VCPU%d mapped %d pages for pending table\n",
> +            v->vcpu_id, nr_pages);
> +
> +    v->arch.vgic.flags |= VGIC_V3_LPIS_ENABLED;
> +
> +    printk("VGICv3: enabled %d LPIs for VCPU%d\n", nr_lpis, v->vcpu_id);
> +}
> +
>  static int __vgic_v3_rdistr_rd_mmio_write(struct vcpu *v, mmio_info_t *info,
>                                            uint32_t gicr_reg,
>                                            register_t r)
> @@ -444,8 +478,18 @@ static int __vgic_v3_rdistr_rd_mmio_write(struct vcpu *v, mmio_info_t *info,
>      switch ( gicr_reg )
>      {
>      case VREG32(GICR_CTLR):
> -        /* LPI's not implemented */
> -        goto write_ignore_32;
> +        if ( dabt.size != DABT_WORD ) goto bad_width;
> +        if ( !v->domain->arch.vgic.has_its )
> +            return 1;
> +
> +        /* LPIs can only be enabled once, but never disabled again. */
> +        if ( !(r & GICR_CTLR_ENABLE_LPIS) ||
> +             (v->arch.vgic.flags & VGIC_V3_LPIS_ENABLED) )

This is fragile. A re-distributor can be updated from any vCPU, so you
could end up calling vgic_vcpu_enable_lpis twice. You likely want to use
a lock protecting the GICR_CTLR emulation.
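
A minimal sketch of the locking pattern, assuming a per-redistributor lock.
Everything here is hypothetical: struct toy_rdist, the atomic_flag-based
spinlock and toy_gicr_ctlr_write() are stand-ins written for this
illustration and are not Xen's spinlock API or the vGIC code. The check of
the "already enabled" state and the enable step happen inside one critical
section, so two vCPUs writing concurrently cannot both run the enable path.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_GICR_CTLR_ENABLE_LPIS (1U << 0)

struct toy_rdist {
    atomic_flag lock;           /* stands in for a per-redistributor lock */
    bool lpis_enabled;
    unsigned int enable_calls;  /* counts runs of the enable path */
};

static void toy_rdist_init(struct toy_rdist *rd)
{
    atomic_flag_clear(&rd->lock);
    rd->lpis_enabled = false;
    rd->enable_calls = 0;
}

static void toy_lock(atomic_flag *l)
{
    while ( atomic_flag_test_and_set_explicit(l, memory_order_acquire) )
        ; /* spin until the holder releases the lock */
}

static void toy_unlock(atomic_flag *l)
{
    atomic_flag_clear_explicit(l, memory_order_release);
}

/* Emulated write: LPIs can be enabled once and never disabled again. */
static void toy_gicr_ctlr_write(struct toy_rdist *rd, uint32_t val)
{
    toy_lock(&rd->lock);

    if ( (val & TOY_GICR_CTLR_ENABLE_LPIS) && !rd->lpis_enabled )
    {
        rd->enable_calls++;      /* the real enable work would go here */
        rd->lpis_enabled = true;
    }

    toy_unlock(&rd->lock);
}
```

The same lock would also have to be taken by any other path that reads or
changes the enabled state, otherwise the check can still race.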

> +            return 1;
> +
> +        vgic_vcpu_enable_lpis(v);
> +
> +        return 1;
>
>      case VREG32(GICR_IIDR):
>          /* RO */
> @@ -1045,6 +1089,11 @@ static int vgic_v3_distr_mmio_read(struct vcpu *v, mmio_info_t *info,
>          typer = ((ncpus - 1) << GICD_TYPE_CPUS_SHIFT |
>                   DIV_ROUND_UP(v->domain->arch.vgic.nr_spis, 32));
>
> +        if ( v->domain->arch.vgic.has_its )
> +        {
> +            typer |= GICD_TYPE_LPIS;
> +            irq_bits = 16;

Why 16?
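
For context, a short worked calculation of what a given number of ID bits
implies. It does not answer why 16 was chosen; it only shows the
consequences. The sketch assumes 4K pages and uses hypothetical toy_*
helper names. It mirrors the arithmetic in vgic_vcpu_enable_lpis() above:
LPI INTIDs start at 8192, the property table holds one byte per LPI, and
the pending table holds one bit per INTID including the first 8192.

```c
#include <stdint.h>

#define TOY_LPI_OFFSET 8192U   /* first LPI INTID in GICv3 */
#define TOY_PAGE_SIZE  4096U   /* assumption: 4K pages */

static unsigned int toy_div_round_up(unsigned int n, unsigned int d)
{
    return (n + d - 1) / d;
}

/* Number of LPIs addressable with id_bits INTID bits (id_bits < 32). */
static unsigned int toy_nr_lpis(unsigned int id_bits)
{
    unsigned int nr_intids = 1U << id_bits;

    return nr_intids > TOY_LPI_OFFSET ? nr_intids - TOY_LPI_OFFSET : 0;
}

/* Property table: one byte per LPI. */
static unsigned int toy_prop_table_pages(unsigned int id_bits)
{
    return toy_div_round_up(toy_nr_lpis(id_bits), TOY_PAGE_SIZE);
}

/* Pending table: one bit per INTID, covering the non-LPI range as well. */
static unsigned int toy_pend_table_pages(unsigned int id_bits)
{
    return toy_div_round_up((1U << id_bits) / 8, TOY_PAGE_SIZE);
}
```

With 16 ID bits this gives 57344 LPIs, 14 pages of property table for the
domain and 2 pages of pending table per vCPU.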

> +        }
>          typer |= (irq_bits - 1) << GICD_TYPE_ID_BITS_SHIFT;
>
>          *r = vgic_reg32_extract(typer, info);
> @@ -1666,6 +1715,21 @@ static int vgic_v3_domain_init(struct domain *d)
>
>  static void vgic_v3_domain_free(struct domain *d)
>  {
> +    int nr_pages;
> +    struct vcpu *v;
> +
> +    if ( d->arch.vgic.nr_lpis )
> +    {
> +        nr_pages = DIV_ROUND_UP(d->arch.vgic.nr_lpis, PAGE_SIZE);
> +        put_guest_pages(d, d->arch.vgic.rdist_propbase & GENMASK_ULL(51, 12),
> +                        nr_pages);
> +
> +        nr_pages = DIV_ROUND_UP((d->arch.vgic.nr_lpis + LPI_OFFSET) / 8,
> +                                PAGE_SIZE);
> +        for_each_vcpu(d, v)
> +            put_guest_pages(d, v->arch.vgic.rdist_pendbase & GENMASK_ULL(51, 12),
> +                            nr_pages);
> +    }
>      gicv3_its_unmap_all_devices(d);
>      radix_tree_destroy(&d->arch.vgic.pend_lpi_tree, NULL);
>      xfree(d->arch.vgic.rdist_regions);
>

Cheers,

-- 
Julien Grall



end of thread, other threads:[~2017-04-04 17:06 UTC | newest]

Thread overview: 52+ messages
2017-03-31 18:04 [PATCH v3 00/26] arm64: Dom0 ITS emulation Andre Przywara
2017-03-31 18:05 ` [PATCH v3 01/26] ARM: GICv3 ITS: parse and store ITS subnodes from hardware DT Andre Przywara
2017-03-31 23:08   ` Stefano Stabellini
2017-03-31 18:05 ` [PATCH v3 02/26] ARM: GICv3: allocate LPI pending and property table Andre Przywara
2017-03-31 22:59   ` Stefano Stabellini
2017-04-03  9:05     ` Andre Przywara
2017-04-03 18:16       ` Stefano Stabellini
2017-04-03 13:53   ` Julien Grall
2017-04-03 14:01     ` Julien Grall
2017-03-31 18:05 ` [PATCH v3 03/26] ARM: GICv3 ITS: allocate device and collection table Andre Przywara
2017-03-31 23:06   ` Stefano Stabellini
2017-04-03 15:38   ` Julien Grall
2017-04-03 17:22     ` Julien Grall
2017-04-03 19:39       ` Andre Przywara
2017-04-03 20:46         ` Julien Grall
2017-03-31 18:05 ` [PATCH v3 04/26] ARM: GICv3 ITS: map ITS command buffer Andre Przywara
2017-03-31 23:10   ` Stefano Stabellini
2017-04-03 16:00   ` Julien Grall
2017-03-31 18:05 ` [PATCH v3 05/26] ARM: GICv3 ITS: introduce ITS command handling Andre Przywara
2017-03-31 23:16   ` Stefano Stabellini
2017-04-03 17:32   ` Julien Grall
2017-03-31 18:05 ` [PATCH v3 06/26] ARM: GICv3 ITS: introduce device mapping Andre Przywara
2017-03-31 23:20   ` Stefano Stabellini
2017-04-01  8:01   ` Vijay Kilari
2017-04-03 18:33     ` Julien Grall
2017-04-03 18:56   ` Julien Grall
2017-03-31 18:05 ` [PATCH v3 07/26] ARM: GICv3 ITS: introduce host LPI array Andre Przywara
2017-03-31 23:24   ` Stefano Stabellini
2017-03-31 18:05 ` [PATCH v3 08/26] ARM: GICv3: introduce separate pending_irq structs for LPIs Andre Przywara
2017-03-31 18:05 ` [PATCH v3 09/26] ARM: GICv3: forward pending LPIs to guests Andre Przywara
2017-03-31 18:05 ` [PATCH v3 10/26] ARM: GICv3: enable ITS and LPIs on the host Andre Przywara
2017-03-31 18:05 ` [PATCH v3 11/26] ARM: vGICv3: handle virtual LPI pending and property tables Andre Przywara
2017-04-04 12:55   ` Julien Grall
2017-04-04 12:56     ` Julien Grall
2017-03-31 18:05 ` [PATCH v3 12/26] ARM: vGICv3: Handle disabled LPIs Andre Przywara
2017-03-31 18:05 ` [PATCH v3 13/26] ARM: vGICv3: introduce basic ITS emulation bits Andre Przywara
2017-03-31 18:05 ` [PATCH v3 14/26] ARM: vITS: introduce translation table walks Andre Przywara
2017-03-31 18:05 ` [PATCH v3 15/26] ARM: vITS: handle CLEAR command Andre Przywara
2017-03-31 18:05 ` [PATCH v3 16/26] ARM: vITS: handle INT command Andre Przywara
2017-03-31 18:05 ` [PATCH v3 17/26] ARM: vITS: handle MAPC command Andre Przywara
2017-03-31 18:05 ` [PATCH v3 18/26] ARM: vITS: handle MAPD command Andre Przywara
2017-03-31 18:05 ` [PATCH v3 19/26] ARM: vITS: handle MAPTI command Andre Przywara
2017-04-01  8:32   ` Vijay Kilari
2017-03-31 18:05 ` [PATCH v3 20/26] ARM: vITS: handle MOVI command Andre Przywara
2017-03-31 18:05 ` [PATCH v3 21/26] ARM: vITS: handle DISCARD command Andre Przywara
2017-03-31 18:05 ` [PATCH v3 22/26] ARM: vITS: handle INV command Andre Przywara
2017-03-31 18:05 ` [PATCH v3 23/26] ARM: vITS: handle INVALL command Andre Przywara
2017-03-31 18:05 ` [PATCH v3 24/26] ARM: vITS: create and initialize virtual ITSes for Dom0 Andre Przywara
2017-03-31 18:05 ` [PATCH v3 25/26] ARM: vITS: create ITS subnodes for Dom0 DT Andre Przywara
2017-03-31 18:05 ` [PATCH v3 26/26] ARM: vGIC: advertise LPI support Andre Przywara
2017-04-04 17:06   ` Julien Grall
2017-04-01 20:37 ` [PATCH v3 00/26] arm64: Dom0 ITS emulation Julien Grall
