* [RFC PATCH v2 00/22] xen/arm: Add ITS support
@ 2015-03-19 14:37 vijay.kilari
  2015-03-19 14:37 ` [RFC PATCH v2 01/22] add linked list apis vijay.kilari
                   ` (23 more replies)
  0 siblings, 24 replies; 109+ messages in thread
From: vijay.kilari @ 2015-03-19 14:37 UTC (permalink / raw)
  To: Ian.Campbell, julien.grall, stefano.stabellini,
	stefano.stabellini, tim, xen-devel
  Cc: Prasun.Kapoor, Vijaya Kumar K, manish.jaggi, vijay.kilari

From: Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com>

Add ITS support for ARM. The following major features
are supported:
 - GICv3 ITS support for the arm64 platform
 - Multiple ITS nodes
 - LPI descriptors are allocated on demand
 - Only ITS Dom0 is supported

Tested with a single ITS node.

Major changes in v2:
 - Added multi-ITS support
 - GIC ITS physical driver is rebased to Linux 4.0-rc4
 - ITS DomU support
 - Reused GICv3 hw_irq_controller ops for LPIs
 - Generic interrupt handling is used for LPIs

Vijaya Kumar K (22):
  add linked list apis
  Use linked list accessors for page_list helper function
  xen/arm: Add bitmap_find_next_zero_area helper function
  xen/arm: its: Import GICv3 ITS driver from linux
  xen/arm: gicv3: Refactor redistributor information
  xen/arm: its: Port ITS driver to xen
  xen/arm: its: Move ITS command encode helper functions
  xen/arm: its: Remove unused code in ITS driver
  xen/arm: its: Add helper functions to decode ITS Command
  xen/arm: Add helper function to get domain page
  xen/arm: its: Move its_device structure to header file
  xen/arm: its: Update irq descriptor for LPIs support
  xen/arm: its: Add virtual ITS command support
  xen/arm: its: Add emulation of ITS control registers
  xen/arm: its: Add support to emulate GICR register for LPIs
  xen/arm: its: implement hw_irq_controller for LPIs
  xen/arm: its: Map ITS translation space
  xen/arm: its: Dynamic allocation of LPI descriptors
  xen/arm: its: Support ITS interrupt handling
  xen/arm: its: Generate ITS node for Dom0
  xen/arm: its: Initialize virtual and physical ITS driver
  xen/arm: its: Generate ITS dt node for DomU

 tools/libxl/libxl_arm.c                |   36 +
 xen/arch/arm/Makefile                  |    2 +
 xen/arch/arm/arm64/lib/find_next_bit.c |   39 +
 xen/arch/arm/domain_build.c            |   50 +-
 xen/arch/arm/gic-v3-its.c              | 1263 +++++++++++++++++++++++++++
 xen/arch/arm/gic-v3.c                  |   70 +-
 xen/arch/arm/gic.c                     |   47 +-
 xen/arch/arm/irq.c                     |  219 ++++-
 xen/arch/arm/p2m.c                     |   24 +
 xen/arch/arm/setup.c                   |    1 +
 xen/arch/arm/vgic-v3-its.c             | 1483 ++++++++++++++++++++++++++++++++
 xen/arch/arm/vgic-v3.c                 |   65 +-
 xen/arch/arm/vgic.c                    |   34 +-
 xen/include/asm-arm/arm64/bitops.h     |   15 +
 xen/include/asm-arm/domain.h           |   14 +
 xen/include/asm-arm/gic-its.h          |  253 ++++++
 xen/include/asm-arm/gic.h              |   16 +-
 xen/include/asm-arm/gic_v3_defs.h      |  134 ++-
 xen/include/asm-arm/irq.h              |   15 +
 xen/include/asm-arm/p2m.h              |    3 +
 xen/include/asm-arm/vgic.h             |    1 +
 xen/include/public/arch-arm.h          |    3 +
 xen/include/xen/list.h                 |   60 ++
 xen/include/xen/mm.h                   |   10 +-
 24 files changed, 3805 insertions(+), 52 deletions(-)
 create mode 100644 xen/arch/arm/gic-v3-its.c
 create mode 100644 xen/arch/arm/vgic-v3-its.c
 create mode 100644 xen/include/asm-arm/gic-its.h

-- 
1.7.9.5


* [RFC PATCH v2 01/22] add linked list apis
  2015-03-19 14:37 [RFC PATCH v2 00/22] xen/arm: Add ITS support vijay.kilari
@ 2015-03-19 14:37 ` vijay.kilari
  2015-03-19 14:37 ` [RFC PATCH v2 02/22] Use linked list accessors for page_list helper function vijay.kilari
                   ` (22 subsequent siblings)
  23 siblings, 0 replies; 109+ messages in thread
From: vijay.kilari @ 2015-03-19 14:37 UTC (permalink / raw)
  To: Ian.Campbell, julien.grall, stefano.stabellini,
	stefano.stabellini, tim, xen-devel
  Cc: Prasun.Kapoor, Vijaya Kumar K, Jan Beulich, manish.jaggi, vijay.kilari

From: Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com>

Add missing linked list APIs from the Linux kernel.

Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com>
CC: Jan Beulich <JBeulich@suse.com>
---
v2: Add additional linked list APIs: list_last_entry_or_null,
    list_next_entry and list_prev_entry
---
 xen/include/xen/list.h |   60 ++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 60 insertions(+)

diff --git a/xen/include/xen/list.h b/xen/include/xen/list.h
index 59cf571..fa07d72 100644
--- a/xen/include/xen/list.h
+++ b/xen/include/xen/list.h
@@ -385,6 +385,66 @@ static inline void list_splice_init(struct list_head *list,
     container_of(ptr, type, member)
 
 /**
+ * list_first_entry - get the first element from a list
+ * @ptr:        the list head to take the element from.
+ * @type:       the type of the struct this is embedded in.
+ * @member:     the name of the list_struct within the struct.
+ *
+ * Note, that list is expected to be not empty.
+ */
+#define list_first_entry(ptr, type, member) \
+        list_entry((ptr)->next, type, member)
+
+/**
+ * list_last_entry - get the last element from a list
+ * @ptr:        the list head to take the element from.
+ * @type:       the type of the struct this is embedded in.
+ * @member:     the name of the list_struct within the struct.
+ *
+ * Note, that list is expected to be not empty.
+ */
+#define list_last_entry(ptr, type, member) \
+        list_entry((ptr)->prev, type, member)
+
+/**
+ * list_first_entry_or_null - get the first element from a list
+ * @ptr:        the list head to take the element from.
+ * @type:       the type of the struct this is embedded in.
+ * @member:     the name of the list_struct within the struct.
+ *
+ * Note that if the list is empty, it returns NULL.
+ */
+#define list_first_entry_or_null(ptr, type, member) \
+        (!list_empty(ptr) ? list_first_entry(ptr, type, member) : NULL)
+
+/**
+ * list_last_entry_or_null - get the last element from a list
+ * @ptr:        the list head to take the element from.
+ * @type:       the type of the struct this is embedded in.
+ * @member:     the name of the list_struct within the struct.
+ *
+ * Note that if the list is empty, it returns NULL.
+ */
+#define list_last_entry_or_null(ptr, type, member) \
+        (!list_empty(ptr) ? list_last_entry(ptr, type, member) : NULL)
+
+/**
+ * list_next_entry - get the next element in list
+ * @pos:        the type * to cursor
+ * @member:     the name of the list_struct within the struct.
+ */
+#define list_next_entry(pos, member) \
+        list_entry((pos)->member.next, typeof(*(pos)), member)
+
+/**
+ * list_prev_entry - get the prev element in list
+ * @pos:        the type * to cursor
+ * @member:     the name of the list_struct within the struct.
+ */
+#define list_prev_entry(pos, member) \
+        list_entry((pos)->member.prev, typeof(*(pos)), member)
+
+/**
  * list_for_each    -    iterate over a list
  * @pos:    the &struct list_head to use as a loop cursor.
  * @head:    the head for your list.
-- 
1.7.9.5
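
For readers following the thread, here is a minimal, self-contained sketch of how the accessors added above might be used. The struct list_head primitives are simplified stand-ins for the existing list.h code, struct item and the helper functions are hypothetical, and __typeof__ is used in place of typeof so the sketch also builds in strict ISO C mode. It is an illustration under those assumptions, not code from the series.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the existing list.h primitives (illustration only). */
struct list_head { struct list_head *next, *prev; };

#define container_of(ptr, type, member) \
        ((type *)((char *)(ptr) - offsetof(type, member)))
#define list_entry(ptr, type, member) container_of(ptr, type, member)

static inline void INIT_LIST_HEAD(struct list_head *h) { h->next = h->prev = h; }
static inline int list_empty(const struct list_head *h) { return h->next == h; }
static inline void list_add_tail(struct list_head *n, struct list_head *h)
{
    n->prev = h->prev;
    n->next = h;
    h->prev->next = n;
    h->prev = n;
}

/* Accessors as added by the patch (__typeof__ instead of typeof). */
#define list_first_entry(ptr, type, member) \
        list_entry((ptr)->next, type, member)
#define list_last_entry(ptr, type, member) \
        list_entry((ptr)->prev, type, member)
#define list_first_entry_or_null(ptr, type, member) \
        (!list_empty(ptr) ? list_first_entry(ptr, type, member) : NULL)
#define list_next_entry(pos, member) \
        list_entry((pos)->member.next, __typeof__(*(pos)), member)
#define list_prev_entry(pos, member) \
        list_entry((pos)->member.prev, __typeof__(*(pos)), member)

/* Hypothetical payload type, not part of the patch. */
struct item {
    int id;
    struct list_head link;
};

/* Sum the ids by walking from the first entry to the last one. */
static int sum_ids(struct list_head *head)
{
    struct item *pos = list_first_entry_or_null(head, struct item, link);
    struct item *last;
    int sum = 0;

    if (pos == NULL)
        return 0;               /* empty list: nothing to walk */

    last = list_last_entry(head, struct item, link);
    for (;;) {
        sum += pos->id;
        if (pos == last)
            break;
        pos = list_next_entry(pos, link);
    }
    return sum;
}

/* Build a three-element list and return the sum of its ids. */
static int demo_sum(void)
{
    struct list_head head;
    struct item a = { .id = 1 }, b = { .id = 2 }, c = { .id = 3 };

    INIT_LIST_HEAD(&head);
    assert(sum_ids(&head) == 0);              /* _or_null handles the empty case */
    list_add_tail(&a.link, &head);
    list_add_tail(&b.link, &head);
    list_add_tail(&c.link, &head);
    assert(list_prev_entry(&c, link) == &b);  /* step backwards from c */
    return sum_ids(&head);
}
```

Note that list_first_entry and list_last_entry give a bogus pointer on an empty list, which is why the walk starts from the _or_null variant.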


* [RFC PATCH v2 02/22] Use linked list accessors for page_list helper function
  2015-03-19 14:37 [RFC PATCH v2 00/22] xen/arm: Add ITS support vijay.kilari
  2015-03-19 14:37 ` [RFC PATCH v2 01/22] add linked list apis vijay.kilari
@ 2015-03-19 14:37 ` vijay.kilari
  2015-03-19 14:37 ` [RFC PATCH v2 03/22] xen/arm: Add bitmap_find_next_zero_area " vijay.kilari
                   ` (21 subsequent siblings)
  23 siblings, 0 replies; 109+ messages in thread
From: vijay.kilari @ 2015-03-19 14:37 UTC (permalink / raw)
  To: Ian.Campbell, julien.grall, stefano.stabellini,
	stefano.stabellini, tim, xen-devel
  Cc: Prasun.Kapoor, Vijaya Kumar K, Jan Beulich, manish.jaggi, vijay.kilari

From: Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com>

Use the newly introduced linked list helper functions in
the page_list* macros.

Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com>
CC: Jan Beulich <JBeulich@suse.com>
---
 xen/include/xen/mm.h |   10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/xen/include/xen/mm.h b/xen/include/xen/mm.h
index 6ea8b8c..33da984 100644
--- a/xen/include/xen/mm.h
+++ b/xen/include/xen/mm.h
@@ -336,14 +336,12 @@ page_list_splice(struct page_list_head *list, struct page_list_head *head)
 # define INIT_PAGE_LIST_HEAD             INIT_LIST_HEAD
 # define INIT_PAGE_LIST_ENTRY            INIT_LIST_HEAD
 # define page_list_empty                 list_empty
-# define page_list_first(hd)             list_entry((hd)->next, \
+# define page_list_first(hd)             list_first_entry(hd, \
                                                     struct page_info, list)
-# define page_list_last(hd)              list_entry((hd)->prev, \
-                                                    struct page_info, list)
-# define page_list_next(pg, hd)          list_entry((pg)->list.next, \
-                                                    struct page_info, list)
-# define page_list_prev(pg, hd)          list_entry((pg)->list.prev, \
+# define page_list_last(hd)              list_last_entry(hd, \
                                                     struct page_info, list)
+# define page_list_next(pg, hd)          list_next_entry(pg, list)
+# define page_list_prev(pg, hd)          list_prev_entry(pg, list)
 # define page_list_add(pg, hd)           list_add(&(pg)->list, hd)
 # define page_list_add_tail(pg, hd)      list_add_tail(&(pg)->list, hd)
 # define page_list_del(pg, hd)           list_del(&(pg)->list)
-- 
1.7.9.5


* [RFC PATCH v2 03/22] xen/arm: Add bitmap_find_next_zero_area helper function
  2015-03-19 14:37 [RFC PATCH v2 00/22] xen/arm: Add ITS support vijay.kilari
  2015-03-19 14:37 ` [RFC PATCH v2 01/22] add linked list apis vijay.kilari
  2015-03-19 14:37 ` [RFC PATCH v2 02/22] Use linked list accessors for page_list helper function vijay.kilari
@ 2015-03-19 14:37 ` vijay.kilari
  2015-03-20 13:35   ` Julien Grall
  2015-03-19 14:37 ` [RFC PATCH v2 04/22] xen/arm: its: Import GICv3 ITS driver from linux vijay.kilari
                   ` (20 subsequent siblings)
  23 siblings, 1 reply; 109+ messages in thread
From: vijay.kilari @ 2015-03-19 14:37 UTC (permalink / raw)
  To: Ian.Campbell, julien.grall, stefano.stabellini,
	stefano.stabellini, tim, xen-devel
  Cc: Prasun.Kapoor, Vijaya Kumar K, manish.jaggi, vijay.kilari

From: Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com>

The bitmap_find_next_zero_area helper function will be used
by the physical ITS driver imported from Linux.

Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com>
---
 xen/arch/arm/arm64/lib/find_next_bit.c |   39 ++++++++++++++++++++++++++++++++
 xen/include/asm-arm/arm64/bitops.h     |   15 ++++++++++++
 2 files changed, 54 insertions(+)

diff --git a/xen/arch/arm/arm64/lib/find_next_bit.c b/xen/arch/arm/arm64/lib/find_next_bit.c
index aea69c2..7dc288b 100644
--- a/xen/arch/arm/arm64/lib/find_next_bit.c
+++ b/xen/arch/arm/arm64/lib/find_next_bit.c
@@ -162,6 +162,45 @@ found:
 EXPORT_SYMBOL(find_first_zero_bit);
 #endif
 
+#ifndef bitmap_find_next_zero_area
+/*
+ * bitmap_find_next_zero_area - find a contiguous aligned zero area
+ * @map: The address to base the search on
+ * @size: The bitmap size in bits
+ * @start: The bitnumber to start searching at
+ * @nr: The number of zeroed bits we're looking for
+ * @align_mask: Alignment mask for zero area
+ *
+ * The @align_mask should be one less than a power of 2; the effect is that
+ * the bit offset of all zero areas this function finds is multiples of that
+ * power of 2. A @align_mask of 0 means no alignment is required.
+ */
+#define ALIGN_MASK(x, mask) (((x) + (mask)) & ~(mask))
+
+unsigned long bitmap_find_next_zero_area(unsigned long *map,
+                                         unsigned long size,
+                                         unsigned long start,
+                                         unsigned int nr,
+                                         unsigned long align_mask)
+{
+        unsigned long index, end, i;
+again:
+        index = find_next_zero_bit(map, size, start);
+
+        /* Align allocation */
+        index = ALIGN_MASK(index, align_mask);
+
+        end = index + nr;
+        if (end > size)
+                return end;
+        i = find_next_bit(map, end, index);
+        if (i < end) {
+                start = i + 1;
+                goto again;
+        }
+        return index;
+}
+#endif
 #ifdef __BIG_ENDIAN
 
 /* include/linux/byteorder does not support "unsigned long" type */
diff --git a/xen/include/asm-arm/arm64/bitops.h b/xen/include/asm-arm/arm64/bitops.h
index 6bf1922..d4bc87a 100644
--- a/xen/include/asm-arm/arm64/bitops.h
+++ b/xen/include/asm-arm/arm64/bitops.h
@@ -67,6 +67,21 @@ extern unsigned long find_next_zero_bit(const unsigned long *addr, unsigned
 		long size, unsigned long offset);
 #endif
 
+#ifndef bitmap_find_next_zero_area
+/*
+ * bitmap_find_next_zero_area - find a contiguous aligned zero area
+ * @map: The address to base the search on
+ * @size: The bitmap size in bits
+ * @start: The bitnumber to start searching at
+ * @nr: The number of zeroed bits we're looking for
+ * @align_mask: Alignment mask for zero area
+ */
+extern unsigned long bitmap_find_next_zero_area(unsigned long *map,
+                                                unsigned long size,
+                                                unsigned long start,
+                                                unsigned int nr,
+                                                unsigned long align_mask);
+#endif
 #ifdef CONFIG_GENERIC_FIND_FIRST_BIT
 
 /**
-- 
1.7.9.5
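
To make the search and the align_mask semantics above concrete, here is a small stand-alone sketch of the same algorithm. The bit-scanning helper naive_find_next() is a deliberately slow stand-in for Xen's find_next_bit/find_next_zero_bit, and find_zero_area() and demo_search() are hypothetical names used only for illustration. The loop follows the logic of the patch, with the goto written as a for (;;) loop.

```c
#include <assert.h>
#include <limits.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define ALIGN_MASK(x, mask) (((x) + (mask)) & ~(mask))

/* Return 1 if bit n of the bitmap is set, 0 otherwise. */
static int bit_is_set(const unsigned long *map, unsigned long n)
{
    return (int)((map[n / BITS_PER_LONG] >> (n % BITS_PER_LONG)) & 1UL);
}

/*
 * Naive stand-in for find_next_bit (want == 1) and find_next_zero_bit
 * (want == 0): returns the first matching bit at or after offset, or
 * size if there is none.
 */
static unsigned long naive_find_next(const unsigned long *map,
                                     unsigned long size,
                                     unsigned long offset, int want)
{
    while (offset < size && bit_is_set(map, offset) != want)
        offset++;
    return offset < size ? offset : size;
}

/* Same search as in the patch: a value > size means "no area found". */
static unsigned long find_zero_area(const unsigned long *map,
                                    unsigned long size,
                                    unsigned long start,
                                    unsigned int nr,
                                    unsigned long align_mask)
{
    unsigned long index, end, i;

    /* align_mask must be one less than a power of two (0 = unaligned). */
    assert((align_mask & (align_mask + 1)) == 0);

    for (;;) {
        index = naive_find_next(map, size, start, 0);
        index = ALIGN_MASK(index, align_mask);   /* round up to alignment */
        end = index + nr;
        if (end > size)
            return end;                          /* ran off the bitmap */
        i = naive_find_next(map, end, index, 1); /* any set bit inside? */
        if (i >= end)
            return index;                        /* [index, end) is all zero */
        start = i + 1;                           /* retry past the set bit */
    }
}

/* Search a 16-bit bitmap that has bits 0, 1 and 5 set. */
static unsigned long demo_search(unsigned int nr, unsigned long align_mask)
{
    const unsigned long map[1] = { (1UL << 0) | (1UL << 1) | (1UL << 5) };

    return find_zero_area(map, 16, 0, nr, align_mask);
}
```

With this bitmap, an unaligned request for 2 bits lands at bit 2, while a request for 4 bits aligned to 4 (align_mask 3) skips the area broken by bit 5 and lands at bit 8. Callers detect failure by comparing the result against the bitmap size.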


* [RFC PATCH v2 04/22] xen/arm: its: Import GICv3 ITS driver from linux
  2015-03-19 14:37 [RFC PATCH v2 00/22] xen/arm: Add ITS support vijay.kilari
                   ` (2 preceding siblings ...)
  2015-03-19 14:37 ` [RFC PATCH v2 03/22] xen/arm: Add bitmap_find_next_zero_area " vijay.kilari
@ 2015-03-19 14:37 ` vijay.kilari
  2015-03-19 14:37 ` [RFC PATCH v2 05/22] xen/arm: gicv3: Refactor redistributor information vijay.kilari
                   ` (19 subsequent siblings)
  23 siblings, 0 replies; 109+ messages in thread
From: vijay.kilari @ 2015-03-19 14:37 UTC (permalink / raw)
  To: Ian.Campbell, julien.grall, stefano.stabellini,
	stefano.stabellini, tim, xen-devel
  Cc: Prasun.Kapoor, Vijaya Kumar K, manish.jaggi, vijay.kilari

From: Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com>

This is the actual GICv3 ITS driver from Linux (4.0-rc4)
with latest commit id: 4559fbb3a9b1bde46afc739fa6c300826acdc19c

No Xen-related changes are made and the file is not compiled.
Keeping an unmodified copy helps to import fixes for any issues found in Linux.

Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com>
---
v2: Import GIC ITS driver from linux 4.0_rc4
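
As a reading aid for the its_encode_* helpers in this file, the sketch below shows how a 32-byte command block is assembled from bit fields in its four 64-bit words. The names set_field(), get_field(), demo_build_mapvi() and DEMO_CMD_MAPVI are hypothetical, the opcode value 0x0a is an assumption (the GITS_CMD_* definitions are not part of this patch), and the cpu_to_le64 fixup is left out, so the sketch only matches the in-memory layout on a little-endian host.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Same shape as the driver's command block: four 64-bit words. */
struct its_cmd_block {
    uint64_t raw_cmd[4];
};

/* Assumed opcode for MAPVI; not defined anywhere in this patch. */
#define DEMO_CMD_MAPVI 0x0a

/* Replace the width-bit field starting at bit 'shift' of *word with val. */
static void set_field(uint64_t *word, unsigned int shift, unsigned int width,
                      uint64_t val)
{
    uint64_t mask;

    assert(width >= 1 && width <= 64 && shift + width <= 64);
    mask = (width == 64 ? ~0ULL : ((1ULL << width) - 1)) << shift;
    *word = (*word & ~mask) | ((val << shift) & mask);
}

/* Read back the width-bit field starting at bit 'shift' of word. */
static uint64_t get_field(uint64_t word, unsigned int shift, unsigned int width)
{
    uint64_t mask = width == 64 ? ~0ULL : ((1ULL << width) - 1);

    return (word >> shift) & mask;
}

/*
 * Build a MAPVI-style command using the field positions of the
 * its_encode_* helpers:
 *   word 0: [7:0] command number, [63:32] device ID
 *   word 1: [31:0] event ID,      [63:32] physical (LPI) ID
 *   word 2: [15:0] collection ID
 * (The imported its_encode_devid masks 16 bits before OR-ing the ID in;
 * this sketch clears the full 32-bit field.)
 */
static void demo_build_mapvi(struct its_cmd_block *cmd, uint32_t devid,
                             uint32_t event_id, uint32_t phys_id,
                             uint16_t col_id)
{
    memset(cmd, 0, sizeof(*cmd));
    set_field(&cmd->raw_cmd[0], 0, 8, DEMO_CMD_MAPVI);
    set_field(&cmd->raw_cmd[0], 32, 32, devid);
    set_field(&cmd->raw_cmd[1], 0, 32, event_id);
    set_field(&cmd->raw_cmd[1], 32, 32, phys_id);
    set_field(&cmd->raw_cmd[2], 0, 16, col_id);
}
```

For example, device ID 0x10, event 3, LPI 8195 and collection 1 give word 0 = 0x000000100000000a and word 2 = 0x1, and get_field() on bits [63:32] of word 1 returns 8195.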
---
 xen/arch/arm/gic-v3-its.c | 1524 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 1524 insertions(+)
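
The "How we allocate LPIs" comment in this file boils down to a little arithmetic: LPI numbers start at 8192 and are handed out in chunks of 32. The stand-alone sketch below works through that arithmetic. The function names are hypothetical and only mirror its_lpi_to_chunk(), its_chunk_to_lpi() and the DIV_ROUND_UP() call in its_lpi_alloc_chunks(); it is not a replacement for the driver code.

```c
#include <assert.h>

#define LPI_BASE             8192   /* IDs below this are SGIs/PPIs/SPIs */
#define IRQS_PER_CHUNK_SHIFT 5
#define IRQS_PER_CHUNK       (1 << IRQS_PER_CHUNK_SHIFT)

/* Chunk index that a given LPI number belongs to. */
static int lpi_to_chunk(int lpi)
{
    assert(lpi >= LPI_BASE);
    return (lpi - LPI_BASE) >> IRQS_PER_CHUNK_SHIFT;
}

/* First LPI number of a given chunk. */
static int chunk_to_lpi(int chunk)
{
    return (chunk << IRQS_PER_CHUNK_SHIFT) + LPI_BASE;
}

/* Number of chunks available when the GIC implements id_bits ID bits. */
static int total_chunks(unsigned int id_bits)
{
    assert(id_bits > 13 && id_bits < 32);
    return (int)(((1UL << id_bits) - LPI_BASE) >> IRQS_PER_CHUNK_SHIFT);
}

/* Chunks needed to cover nr_irqs interrupts (rounded up). */
static int chunks_needed(int nr_irqs)
{
    return (nr_irqs + IRQS_PER_CHUNK - 1) / IRQS_PER_CHUNK;
}
```

With 16 ID bits this gives (65536 - 8192) / 32 = 1792 chunks, chunk 1 starts at LPI 8224, and a device asking for 33 interrupts is given two chunks, i.e. 64 LPIs.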

diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
new file mode 100644
index 0000000..596b0a9
--- /dev/null
+++ b/xen/arch/arm/gic-v3-its.c
@@ -0,0 +1,1524 @@
+/*
+ * Copyright (C) 2013, 2014 ARM Limited, All Rights Reserved.
+ * Author: Marc Zyngier <marc.zyngier@arm.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <linux/bitmap.h>
+#include <linux/cpu.h>
+#include <linux/delay.h>
+#include <linux/interrupt.h>
+#include <linux/log2.h>
+#include <linux/mm.h>
+#include <linux/msi.h>
+#include <linux/of.h>
+#include <linux/of_address.h>
+#include <linux/of_irq.h>
+#include <linux/of_pci.h>
+#include <linux/of_platform.h>
+#include <linux/percpu.h>
+#include <linux/slab.h>
+
+#include <linux/irqchip/arm-gic-v3.h>
+
+#include <asm/cacheflush.h>
+#include <asm/cputype.h>
+#include <asm/exception.h>
+
+#include "irqchip.h"
+
+#define ITS_FLAGS_CMDQ_NEEDS_FLUSHING		(1 << 0)
+
+#define RDIST_FLAGS_PROPBASE_NEEDS_FLUSHING	(1 << 0)
+
+/*
+ * Collection structure - just an ID, and a redistributor address to
+ * ping. We use one per CPU as a bag of interrupts assigned to this
+ * CPU.
+ */
+struct its_collection {
+	u64			target_address;
+	u16			col_id;
+};
+
+/*
+ * The ITS structure - contains most of the infrastructure, with the
+ * msi_controller, the command queue, the collections, and the list of
+ * devices writing to it.
+ */
+struct its_node {
+	raw_spinlock_t		lock;
+	struct list_head	entry;
+	struct msi_controller	msi_chip;
+	struct irq_domain	*domain;
+	void __iomem		*base;
+	unsigned long		phys_base;
+	struct its_cmd_block	*cmd_base;
+	struct its_cmd_block	*cmd_write;
+	void			*tables[GITS_BASER_NR_REGS];
+	struct its_collection	*collections;
+	struct list_head	its_device_list;
+	u64			flags;
+	u32			ite_size;
+};
+
+#define ITS_ITT_ALIGN		SZ_256
+
+/*
+ * The ITS view of a device - belongs to an ITS, a collection, owns an
+ * interrupt translation table, and a list of interrupts.
+ */
+struct its_device {
+	struct list_head	entry;
+	struct its_node		*its;
+	struct its_collection	*collection;
+	void			*itt;
+	unsigned long		*lpi_map;
+	irq_hw_number_t		lpi_base;
+	int			nr_lpis;
+	u32			nr_ites;
+	u32			device_id;
+};
+
+static LIST_HEAD(its_nodes);
+static DEFINE_SPINLOCK(its_lock);
+static struct device_node *gic_root_node;
+static struct rdists *gic_rdists;
+
+#define gic_data_rdist()		(raw_cpu_ptr(gic_rdists->rdist))
+#define gic_data_rdist_rd_base()	(gic_data_rdist()->rd_base)
+
+/*
+ * ITS command descriptors - parameters to be encoded in a command
+ * block.
+ */
+struct its_cmd_desc {
+	union {
+		struct {
+			struct its_device *dev;
+			u32 event_id;
+		} its_inv_cmd;
+
+		struct {
+			struct its_device *dev;
+			u32 event_id;
+		} its_int_cmd;
+
+		struct {
+			struct its_device *dev;
+			int valid;
+		} its_mapd_cmd;
+
+		struct {
+			struct its_collection *col;
+			int valid;
+		} its_mapc_cmd;
+
+		struct {
+			struct its_device *dev;
+			u32 phys_id;
+			u32 event_id;
+		} its_mapvi_cmd;
+
+		struct {
+			struct its_device *dev;
+			struct its_collection *col;
+			u32 id;
+		} its_movi_cmd;
+
+		struct {
+			struct its_device *dev;
+			u32 event_id;
+		} its_discard_cmd;
+
+		struct {
+			struct its_collection *col;
+		} its_invall_cmd;
+	};
+};
+
+/*
+ * The ITS command block, which is what the ITS actually parses.
+ */
+struct its_cmd_block {
+	u64	raw_cmd[4];
+};
+
+#define ITS_CMD_QUEUE_SZ		SZ_64K
+#define ITS_CMD_QUEUE_NR_ENTRIES	(ITS_CMD_QUEUE_SZ / sizeof(struct its_cmd_block))
+
+typedef struct its_collection *(*its_cmd_builder_t)(struct its_cmd_block *,
+						    struct its_cmd_desc *);
+
+static void its_encode_cmd(struct its_cmd_block *cmd, u8 cmd_nr)
+{
+	cmd->raw_cmd[0] &= ~0xffUL;
+	cmd->raw_cmd[0] |= cmd_nr;
+}
+
+static void its_encode_devid(struct its_cmd_block *cmd, u32 devid)
+{
+	cmd->raw_cmd[0] &= ~(0xffffUL << 32);
+	cmd->raw_cmd[0] |= ((u64)devid) << 32;
+}
+
+static void its_encode_event_id(struct its_cmd_block *cmd, u32 id)
+{
+	cmd->raw_cmd[1] &= ~0xffffffffUL;
+	cmd->raw_cmd[1] |= id;
+}
+
+static void its_encode_phys_id(struct its_cmd_block *cmd, u32 phys_id)
+{
+	cmd->raw_cmd[1] &= 0xffffffffUL;
+	cmd->raw_cmd[1] |= ((u64)phys_id) << 32;
+}
+
+static void its_encode_size(struct its_cmd_block *cmd, u8 size)
+{
+	cmd->raw_cmd[1] &= ~0x1fUL;
+	cmd->raw_cmd[1] |= size & 0x1f;
+}
+
+static void its_encode_itt(struct its_cmd_block *cmd, u64 itt_addr)
+{
+	cmd->raw_cmd[2] &= ~0xffffffffffffUL;
+	cmd->raw_cmd[2] |= itt_addr & 0xffffffffff00UL;
+}
+
+static void its_encode_valid(struct its_cmd_block *cmd, int valid)
+{
+	cmd->raw_cmd[2] &= ~(1UL << 63);
+	cmd->raw_cmd[2] |= ((u64)!!valid) << 63;
+}
+
+static void its_encode_target(struct its_cmd_block *cmd, u64 target_addr)
+{
+	cmd->raw_cmd[2] &= ~(0xffffffffUL << 16);
+	cmd->raw_cmd[2] |= (target_addr & (0xffffffffUL << 16));
+}
+
+static void its_encode_collection(struct its_cmd_block *cmd, u16 col)
+{
+	cmd->raw_cmd[2] &= ~0xffffUL;
+	cmd->raw_cmd[2] |= col;
+}
+
+static inline void its_fixup_cmd(struct its_cmd_block *cmd)
+{
+	/* Let's fixup BE commands */
+	cmd->raw_cmd[0] = cpu_to_le64(cmd->raw_cmd[0]);
+	cmd->raw_cmd[1] = cpu_to_le64(cmd->raw_cmd[1]);
+	cmd->raw_cmd[2] = cpu_to_le64(cmd->raw_cmd[2]);
+	cmd->raw_cmd[3] = cpu_to_le64(cmd->raw_cmd[3]);
+}
+
+static struct its_collection *its_build_mapd_cmd(struct its_cmd_block *cmd,
+						 struct its_cmd_desc *desc)
+{
+	unsigned long itt_addr;
+	u8 size = ilog2(desc->its_mapd_cmd.dev->nr_ites);
+
+	itt_addr = virt_to_phys(desc->its_mapd_cmd.dev->itt);
+	itt_addr = ALIGN(itt_addr, ITS_ITT_ALIGN);
+
+	its_encode_cmd(cmd, GITS_CMD_MAPD);
+	its_encode_devid(cmd, desc->its_mapd_cmd.dev->device_id);
+	its_encode_size(cmd, size - 1);
+	its_encode_itt(cmd, itt_addr);
+	its_encode_valid(cmd, desc->its_mapd_cmd.valid);
+
+	its_fixup_cmd(cmd);
+
+	return desc->its_mapd_cmd.dev->collection;
+}
+
+static struct its_collection *its_build_mapc_cmd(struct its_cmd_block *cmd,
+						 struct its_cmd_desc *desc)
+{
+	its_encode_cmd(cmd, GITS_CMD_MAPC);
+	its_encode_collection(cmd, desc->its_mapc_cmd.col->col_id);
+	its_encode_target(cmd, desc->its_mapc_cmd.col->target_address);
+	its_encode_valid(cmd, desc->its_mapc_cmd.valid);
+
+	its_fixup_cmd(cmd);
+
+	return desc->its_mapc_cmd.col;
+}
+
+static struct its_collection *its_build_mapvi_cmd(struct its_cmd_block *cmd,
+						  struct its_cmd_desc *desc)
+{
+	its_encode_cmd(cmd, GITS_CMD_MAPVI);
+	its_encode_devid(cmd, desc->its_mapvi_cmd.dev->device_id);
+	its_encode_event_id(cmd, desc->its_mapvi_cmd.event_id);
+	its_encode_phys_id(cmd, desc->its_mapvi_cmd.phys_id);
+	its_encode_collection(cmd, desc->its_mapvi_cmd.dev->collection->col_id);
+
+	its_fixup_cmd(cmd);
+
+	return desc->its_mapvi_cmd.dev->collection;
+}
+
+static struct its_collection *its_build_movi_cmd(struct its_cmd_block *cmd,
+						 struct its_cmd_desc *desc)
+{
+	its_encode_cmd(cmd, GITS_CMD_MOVI);
+	its_encode_devid(cmd, desc->its_movi_cmd.dev->device_id);
+	its_encode_event_id(cmd, desc->its_movi_cmd.id);
+	its_encode_collection(cmd, desc->its_movi_cmd.col->col_id);
+
+	its_fixup_cmd(cmd);
+
+	return desc->its_movi_cmd.dev->collection;
+}
+
+static struct its_collection *its_build_discard_cmd(struct its_cmd_block *cmd,
+						    struct its_cmd_desc *desc)
+{
+	its_encode_cmd(cmd, GITS_CMD_DISCARD);
+	its_encode_devid(cmd, desc->its_discard_cmd.dev->device_id);
+	its_encode_event_id(cmd, desc->its_discard_cmd.event_id);
+
+	its_fixup_cmd(cmd);
+
+	return desc->its_discard_cmd.dev->collection;
+}
+
+static struct its_collection *its_build_inv_cmd(struct its_cmd_block *cmd,
+						struct its_cmd_desc *desc)
+{
+	its_encode_cmd(cmd, GITS_CMD_INV);
+	its_encode_devid(cmd, desc->its_inv_cmd.dev->device_id);
+	its_encode_event_id(cmd, desc->its_inv_cmd.event_id);
+
+	its_fixup_cmd(cmd);
+
+	return desc->its_inv_cmd.dev->collection;
+}
+
+static struct its_collection *its_build_invall_cmd(struct its_cmd_block *cmd,
+						   struct its_cmd_desc *desc)
+{
+	its_encode_cmd(cmd, GITS_CMD_INVALL);
+	its_encode_collection(cmd, desc->its_mapc_cmd.col->col_id);
+
+	its_fixup_cmd(cmd);
+
+	return NULL;
+}
+
+static u64 its_cmd_ptr_to_offset(struct its_node *its,
+				 struct its_cmd_block *ptr)
+{
+	return (ptr - its->cmd_base) * sizeof(*ptr);
+}
+
+static int its_queue_full(struct its_node *its)
+{
+	int widx;
+	int ridx;
+
+	widx = its->cmd_write - its->cmd_base;
+	ridx = readl_relaxed(its->base + GITS_CREADR) / sizeof(struct its_cmd_block);
+
+	/* This is incredibly unlikely to happen, unless the ITS locks up. */
+	if (((widx + 1) % ITS_CMD_QUEUE_NR_ENTRIES) == ridx)
+		return 1;
+
+	return 0;
+}
+
+static struct its_cmd_block *its_allocate_entry(struct its_node *its)
+{
+	struct its_cmd_block *cmd;
+	u32 count = 1000000;	/* 1s! */
+
+	while (its_queue_full(its)) {
+		count--;
+		if (!count) {
+			pr_err_ratelimited("ITS queue not draining\n");
+			return NULL;
+		}
+		cpu_relax();
+		udelay(1);
+	}
+
+	cmd = its->cmd_write++;
+
+	/* Handle queue wrapping */
+	if (its->cmd_write == (its->cmd_base + ITS_CMD_QUEUE_NR_ENTRIES))
+		its->cmd_write = its->cmd_base;
+
+	return cmd;
+}
+
+static struct its_cmd_block *its_post_commands(struct its_node *its)
+{
+	u64 wr = its_cmd_ptr_to_offset(its, its->cmd_write);
+
+	writel_relaxed(wr, its->base + GITS_CWRITER);
+
+	return its->cmd_write;
+}
+
+static void its_flush_cmd(struct its_node *its, struct its_cmd_block *cmd)
+{
+	/*
+	 * Make sure the commands written to memory are observable by
+	 * the ITS.
+	 */
+	if (its->flags & ITS_FLAGS_CMDQ_NEEDS_FLUSHING)
+		__flush_dcache_area(cmd, sizeof(*cmd));
+	else
+		dsb(ishst);
+}
+
+static void its_wait_for_range_completion(struct its_node *its,
+					  struct its_cmd_block *from,
+					  struct its_cmd_block *to)
+{
+	u64 rd_idx, from_idx, to_idx;
+	u32 count = 1000000;	/* 1s! */
+
+	from_idx = its_cmd_ptr_to_offset(its, from);
+	to_idx = its_cmd_ptr_to_offset(its, to);
+
+	while (1) {
+		rd_idx = readl_relaxed(its->base + GITS_CREADR);
+		if (rd_idx >= to_idx || rd_idx < from_idx)
+			break;
+
+		count--;
+		if (!count) {
+			pr_err_ratelimited("ITS queue timeout\n");
+			return;
+		}
+		cpu_relax();
+		udelay(1);
+	}
+}
+
+static void its_send_single_command(struct its_node *its,
+				    its_cmd_builder_t builder,
+				    struct its_cmd_desc *desc)
+{
+	struct its_cmd_block *cmd, *sync_cmd, *next_cmd;
+	struct its_collection *sync_col;
+	unsigned long flags;
+
+	raw_spin_lock_irqsave(&its->lock, flags);
+
+	cmd = its_allocate_entry(its);
+	if (!cmd) {		/* We're soooooo screewed... */
+		pr_err_ratelimited("ITS can't allocate, dropping command\n");
+		raw_spin_unlock_irqrestore(&its->lock, flags);
+		return;
+	}
+	sync_col = builder(cmd, desc);
+	its_flush_cmd(its, cmd);
+
+	if (sync_col) {
+		sync_cmd = its_allocate_entry(its);
+		if (!sync_cmd) {
+			pr_err_ratelimited("ITS can't SYNC, skipping\n");
+			goto post;
+		}
+		its_encode_cmd(sync_cmd, GITS_CMD_SYNC);
+		its_encode_target(sync_cmd, sync_col->target_address);
+		its_fixup_cmd(sync_cmd);
+		its_flush_cmd(its, sync_cmd);
+	}
+
+post:
+	next_cmd = its_post_commands(its);
+	raw_spin_unlock_irqrestore(&its->lock, flags);
+
+	its_wait_for_range_completion(its, cmd, next_cmd);
+}
+
+static void its_send_inv(struct its_device *dev, u32 event_id)
+{
+	struct its_cmd_desc desc;
+
+	desc.its_inv_cmd.dev = dev;
+	desc.its_inv_cmd.event_id = event_id;
+
+	its_send_single_command(dev->its, its_build_inv_cmd, &desc);
+}
+
+static void its_send_mapd(struct its_device *dev, int valid)
+{
+	struct its_cmd_desc desc;
+
+	desc.its_mapd_cmd.dev = dev;
+	desc.its_mapd_cmd.valid = !!valid;
+
+	its_send_single_command(dev->its, its_build_mapd_cmd, &desc);
+}
+
+static void its_send_mapc(struct its_node *its, struct its_collection *col,
+			  int valid)
+{
+	struct its_cmd_desc desc;
+
+	desc.its_mapc_cmd.col = col;
+	desc.its_mapc_cmd.valid = !!valid;
+
+	its_send_single_command(its, its_build_mapc_cmd, &desc);
+}
+
+static void its_send_mapvi(struct its_device *dev, u32 irq_id, u32 id)
+{
+	struct its_cmd_desc desc;
+
+	desc.its_mapvi_cmd.dev = dev;
+	desc.its_mapvi_cmd.phys_id = irq_id;
+	desc.its_mapvi_cmd.event_id = id;
+
+	its_send_single_command(dev->its, its_build_mapvi_cmd, &desc);
+}
+
+static void its_send_movi(struct its_device *dev,
+			  struct its_collection *col, u32 id)
+{
+	struct its_cmd_desc desc;
+
+	desc.its_movi_cmd.dev = dev;
+	desc.its_movi_cmd.col = col;
+	desc.its_movi_cmd.id = id;
+
+	its_send_single_command(dev->its, its_build_movi_cmd, &desc);
+}
+
+static void its_send_discard(struct its_device *dev, u32 id)
+{
+	struct its_cmd_desc desc;
+
+	desc.its_discard_cmd.dev = dev;
+	desc.its_discard_cmd.event_id = id;
+
+	its_send_single_command(dev->its, its_build_discard_cmd, &desc);
+}
+
+static void its_send_invall(struct its_node *its, struct its_collection *col)
+{
+	struct its_cmd_desc desc;
+
+	desc.its_invall_cmd.col = col;
+
+	its_send_single_command(its, its_build_invall_cmd, &desc);
+}
+
+/*
+ * irqchip functions - assumes MSI, mostly.
+ */
+
+static inline u32 its_get_event_id(struct irq_data *d)
+{
+	struct its_device *its_dev = irq_data_get_irq_chip_data(d);
+	return d->hwirq - its_dev->lpi_base;
+}
+
+static void lpi_set_config(struct irq_data *d, bool enable)
+{
+	struct its_device *its_dev = irq_data_get_irq_chip_data(d);
+	irq_hw_number_t hwirq = d->hwirq;
+	u32 id = its_get_event_id(d);
+	u8 *cfg = page_address(gic_rdists->prop_page) + hwirq - 8192;
+
+	if (enable)
+		*cfg |= LPI_PROP_ENABLED;
+	else
+		*cfg &= ~LPI_PROP_ENABLED;
+
+	/*
+	 * Make the above write visible to the redistributors.
+	 * And yes, we're flushing exactly: One. Single. Byte.
+	 * Humpf...
+	 */
+	if (gic_rdists->flags & RDIST_FLAGS_PROPBASE_NEEDS_FLUSHING)
+		__flush_dcache_area(cfg, sizeof(*cfg));
+	else
+		dsb(ishst);
+	its_send_inv(its_dev, id);
+}
+
+static void its_mask_irq(struct irq_data *d)
+{
+	lpi_set_config(d, false);
+}
+
+static void its_unmask_irq(struct irq_data *d)
+{
+	lpi_set_config(d, true);
+}
+
+static void its_eoi_irq(struct irq_data *d)
+{
+	gic_write_eoir(d->hwirq);
+}
+
+static int its_set_affinity(struct irq_data *d, const struct cpumask *mask_val,
+			    bool force)
+{
+	unsigned int cpu = cpumask_any_and(mask_val, cpu_online_mask);
+	struct its_device *its_dev = irq_data_get_irq_chip_data(d);
+	struct its_collection *target_col;
+	u32 id = its_get_event_id(d);
+
+	if (cpu >= nr_cpu_ids)
+		return -EINVAL;
+
+	target_col = &its_dev->its->collections[cpu];
+	its_send_movi(its_dev, target_col, id);
+	its_dev->collection = target_col;
+
+	return IRQ_SET_MASK_OK_DONE;
+}
+
+static void its_irq_compose_msi_msg(struct irq_data *d, struct msi_msg *msg)
+{
+	struct its_device *its_dev = irq_data_get_irq_chip_data(d);
+	struct its_node *its;
+	u64 addr;
+
+	its = its_dev->its;
+	addr = its->phys_base + GITS_TRANSLATER;
+
+	msg->address_lo		= addr & ((1UL << 32) - 1);
+	msg->address_hi		= addr >> 32;
+	msg->data		= its_get_event_id(d);
+}
+
+static struct irq_chip its_irq_chip = {
+	.name			= "ITS",
+	.irq_mask		= its_mask_irq,
+	.irq_unmask		= its_unmask_irq,
+	.irq_eoi		= its_eoi_irq,
+	.irq_set_affinity	= its_set_affinity,
+	.irq_compose_msi_msg	= its_irq_compose_msi_msg,
+};
+
+static void its_mask_msi_irq(struct irq_data *d)
+{
+	pci_msi_mask_irq(d);
+	irq_chip_mask_parent(d);
+}
+
+static void its_unmask_msi_irq(struct irq_data *d)
+{
+	pci_msi_unmask_irq(d);
+	irq_chip_unmask_parent(d);
+}
+
+static struct irq_chip its_msi_irq_chip = {
+	.name			= "ITS-MSI",
+	.irq_unmask		= its_unmask_msi_irq,
+	.irq_mask		= its_mask_msi_irq,
+	.irq_eoi		= irq_chip_eoi_parent,
+	.irq_write_msi_msg	= pci_msi_domain_write_msg,
+};
+
+/*
+ * How we allocate LPIs:
+ *
+ * The GIC has id_bits bits for interrupt identifiers. From there, we
+ * must subtract 8192 which are reserved for SGIs/PPIs/SPIs. Then, as
+ * we allocate LPIs by chunks of 32, we can shift the whole thing by 5
+ * bits to the right.
+ *
+ * This gives us (((1UL << id_bits) - 8192) >> 5) possible allocations.
+ */
+#define IRQS_PER_CHUNK_SHIFT	5
+#define IRQS_PER_CHUNK		(1 << IRQS_PER_CHUNK_SHIFT)
+
+static unsigned long *lpi_bitmap;
+static u32 lpi_chunks;
+static DEFINE_SPINLOCK(lpi_lock);
+
+static int its_lpi_to_chunk(int lpi)
+{
+	return (lpi - 8192) >> IRQS_PER_CHUNK_SHIFT;
+}
+
+static int its_chunk_to_lpi(int chunk)
+{
+	return (chunk << IRQS_PER_CHUNK_SHIFT) + 8192;
+}
+
+static int its_lpi_init(u32 id_bits)
+{
+	lpi_chunks = its_lpi_to_chunk(1UL << id_bits);
+
+	lpi_bitmap = kzalloc(BITS_TO_LONGS(lpi_chunks) * sizeof(long),
+			     GFP_KERNEL);
+	if (!lpi_bitmap) {
+		lpi_chunks = 0;
+		return -ENOMEM;
+	}
+
+	pr_info("ITS: Allocated %d chunks for LPIs\n", (int)lpi_chunks);
+	return 0;
+}
+
+static unsigned long *its_lpi_alloc_chunks(int nr_irqs, int *base, int *nr_ids)
+{
+	unsigned long *bitmap = NULL;
+	int chunk_id;
+	int nr_chunks;
+	int i;
+
+	nr_chunks = DIV_ROUND_UP(nr_irqs, IRQS_PER_CHUNK);
+
+	spin_lock(&lpi_lock);
+
+	do {
+		chunk_id = bitmap_find_next_zero_area(lpi_bitmap, lpi_chunks,
+						      0, nr_chunks, 0);
+		if (chunk_id < lpi_chunks)
+			break;
+
+		nr_chunks--;
+	} while (nr_chunks > 0);
+
+	if (!nr_chunks)
+		goto out;
+
+	bitmap = kzalloc(BITS_TO_LONGS(nr_chunks * IRQS_PER_CHUNK) * sizeof (long),
+			 GFP_ATOMIC);
+	if (!bitmap)
+		goto out;
+
+	for (i = 0; i < nr_chunks; i++)
+		set_bit(chunk_id + i, lpi_bitmap);
+
+	*base = its_chunk_to_lpi(chunk_id);
+	*nr_ids = nr_chunks * IRQS_PER_CHUNK;
+
+out:
+	spin_unlock(&lpi_lock);
+
+	return bitmap;
+}
+
+static void its_lpi_free(unsigned long *bitmap, int base, int nr_ids)
+{
+	int lpi;
+
+	spin_lock(&lpi_lock);
+
+	for (lpi = base; lpi < (base + nr_ids); lpi += IRQS_PER_CHUNK) {
+		int chunk = its_lpi_to_chunk(lpi);
+		BUG_ON(chunk > lpi_chunks);
+		if (test_bit(chunk, lpi_bitmap)) {
+			clear_bit(chunk, lpi_bitmap);
+		} else {
+			pr_err("Bad LPI chunk %d\n", chunk);
+		}
+	}
+
+	spin_unlock(&lpi_lock);
+
+	kfree(bitmap);
+}
+
+/*
+ * We allocate 64kB for PROPBASE. That gives us at most 64K LPIs to
+ * deal with (one configuration byte per interrupt). PENDBASE has to
+ * be 64kB aligned (one bit per LPI, plus 8192 bits for SPI/PPI/SGI).
+ */
+#define LPI_PROPBASE_SZ		SZ_64K
+#define LPI_PENDBASE_SZ		(LPI_PROPBASE_SZ / 8 + SZ_1K)
+
+/*
+ * This is how many bits of ID we need, including the useless ones.
+ */
+#define LPI_NRBITS		ilog2(LPI_PROPBASE_SZ + SZ_8K)
+
+#define LPI_PROP_DEFAULT_PRIO	0xa0
+
+static int __init its_alloc_lpi_tables(void)
+{
+	phys_addr_t paddr;
+
+	gic_rdists->prop_page = alloc_pages(GFP_NOWAIT,
+					   get_order(LPI_PROPBASE_SZ));
+	if (!gic_rdists->prop_page) {
+		pr_err("Failed to allocate PROPBASE\n");
+		return -ENOMEM;
+	}
+
+	paddr = page_to_phys(gic_rdists->prop_page);
+	pr_info("GIC: using LPI property table @%pa\n", &paddr);
+
+	/* Priority 0xa0, Group-1, disabled */
+	memset(page_address(gic_rdists->prop_page),
+	       LPI_PROP_DEFAULT_PRIO | LPI_PROP_GROUP1,
+	       LPI_PROPBASE_SZ);
+
+	/* Make sure the GIC will observe the written configuration */
+	__flush_dcache_area(page_address(gic_rdists->prop_page), LPI_PROPBASE_SZ);
+
+	return 0;
+}
+
+static const char *its_base_type_string[] = {
+	[GITS_BASER_TYPE_DEVICE]	= "Devices",
+	[GITS_BASER_TYPE_VCPU]		= "Virtual CPUs",
+	[GITS_BASER_TYPE_CPU]		= "Physical CPUs",
+	[GITS_BASER_TYPE_COLLECTION]	= "Interrupt Collections",
+	[GITS_BASER_TYPE_RESERVED5] 	= "Reserved (5)",
+	[GITS_BASER_TYPE_RESERVED6] 	= "Reserved (6)",
+	[GITS_BASER_TYPE_RESERVED7] 	= "Reserved (7)",
+};
+
+static void its_free_tables(struct its_node *its)
+{
+	int i;
+
+	for (i = 0; i < GITS_BASER_NR_REGS; i++) {
+		if (its->tables[i]) {
+			free_page((unsigned long)its->tables[i]);
+			its->tables[i] = NULL;
+		}
+	}
+}
+
+static int its_alloc_tables(struct its_node *its)
+{
+	int err;
+	int i;
+	int psz = SZ_64K;
+	u64 shr = GITS_BASER_InnerShareable;
+
+	for (i = 0; i < GITS_BASER_NR_REGS; i++) {
+		u64 val = readq_relaxed(its->base + GITS_BASER + i * 8);
+		u64 type = GITS_BASER_TYPE(val);
+		u64 entry_size = GITS_BASER_ENTRY_SIZE(val);
+		int order = get_order(psz);
+		int alloc_size;
+		u64 tmp;
+		void *base;
+
+		if (type == GITS_BASER_TYPE_NONE)
+			continue;
+
+		/*
+		 * Allocate as many entries as required to fit the
+		 * range of device IDs that the ITS can grok... The ID
+		 * space being incredibly sparse, this results in a
+		 * massive waste of memory.
+		 *
+		 * For other tables, only allocate a single page.
+		 */
+		if (type == GITS_BASER_TYPE_DEVICE) {
+			u64 typer = readq_relaxed(its->base + GITS_TYPER);
+			u32 ids = GITS_TYPER_DEVBITS(typer);
+
+			order = get_order((1UL << ids) * entry_size);
+			if (order >= MAX_ORDER) {
+				order = MAX_ORDER - 1;
+				pr_warn("%s: Device Table too large, reduce its page order to %u\n",
+					its->msi_chip.of_node->full_name, order);
+			}
+		}
+
+		alloc_size = (1 << order) * PAGE_SIZE;
+		base = (void *)__get_free_pages(GFP_KERNEL | __GFP_ZERO, order);
+		if (!base) {
+			err = -ENOMEM;
+			goto out_free;
+		}
+
+		its->tables[i] = base;
+
+retry_baser:
+		val = (virt_to_phys(base) 				 |
+		       (type << GITS_BASER_TYPE_SHIFT)			 |
+		       ((entry_size - 1) << GITS_BASER_ENTRY_SIZE_SHIFT) |
+		       GITS_BASER_WaWb					 |
+		       shr						 |
+		       GITS_BASER_VALID);
+
+		switch (psz) {
+		case SZ_4K:
+			val |= GITS_BASER_PAGE_SIZE_4K;
+			break;
+		case SZ_16K:
+			val |= GITS_BASER_PAGE_SIZE_16K;
+			break;
+		case SZ_64K:
+			val |= GITS_BASER_PAGE_SIZE_64K;
+			break;
+		}
+
+		val |= (alloc_size / psz) - 1;
+
+		writeq_relaxed(val, its->base + GITS_BASER + i * 8);
+		tmp = readq_relaxed(its->base + GITS_BASER + i * 8);
+
+		if ((val ^ tmp) & GITS_BASER_SHAREABILITY_MASK) {
+			/*
+			 * Shareability didn't stick. Just use
+			 * whatever the read reported, which is likely
+			 * to be the only thing this redistributor
+			 * supports.
+			 */
+			shr = tmp & GITS_BASER_SHAREABILITY_MASK;
+			goto retry_baser;
+		}
+
+		if ((val ^ tmp) & GITS_BASER_PAGE_SIZE_MASK) {
+			/*
+			 * Page size didn't stick. Let's try a smaller
+			 * size and retry. If we reach 4K, then
+			 * something is horribly wrong...
+			 */
+			switch (psz) {
+			case SZ_16K:
+				psz = SZ_4K;
+				goto retry_baser;
+			case SZ_64K:
+				psz = SZ_16K;
+				goto retry_baser;
+			}
+		}
+
+		if (val != tmp) {
+			pr_err("ITS: %s: GITS_BASER%d doesn't stick: %lx %lx\n",
+			       its->msi_chip.of_node->full_name, i,
+			       (unsigned long) val, (unsigned long) tmp);
+			err = -ENXIO;
+			goto out_free;
+		}
+
+		pr_info("ITS: allocated %d %s @%lx (psz %dK, shr %d)\n",
+			(int)(alloc_size / entry_size),
+			its_base_type_string[type],
+			(unsigned long)virt_to_phys(base),
+			psz / SZ_1K, (int)shr >> GITS_BASER_SHAREABILITY_SHIFT);
+	}
+
+	return 0;
+
+out_free:
+	its_free_tables(its);
+
+	return err;
+}
+
+static int its_alloc_collections(struct its_node *its)
+{
+	its->collections = kzalloc(nr_cpu_ids * sizeof(*its->collections),
+				   GFP_KERNEL);
+	if (!its->collections)
+		return -ENOMEM;
+
+	return 0;
+}
+
+static void its_cpu_init_lpis(void)
+{
+	void __iomem *rbase = gic_data_rdist_rd_base();
+	struct page *pend_page;
+	u64 val, tmp;
+
+	/* If we didn't allocate the pending table yet, do it now */
+	pend_page = gic_data_rdist()->pend_page;
+	if (!pend_page) {
+		phys_addr_t paddr;
+		/*
+		 * The pending pages have to be at least 64kB aligned,
+		 * hence the 'max(LPI_PENDBASE_SZ, SZ_64K)' below.
+		 */
+		pend_page = alloc_pages(GFP_NOWAIT | __GFP_ZERO,
+					get_order(max(LPI_PENDBASE_SZ, SZ_64K)));
+		if (!pend_page) {
+			pr_err("Failed to allocate PENDBASE for CPU%d\n",
+			       smp_processor_id());
+			return;
+		}
+
+		/* Make sure the GIC will observe the zero-ed page */
+		__flush_dcache_area(page_address(pend_page), LPI_PENDBASE_SZ);
+
+		paddr = page_to_phys(pend_page);
+		pr_info("CPU%d: using LPI pending table @%pa\n",
+			smp_processor_id(), &paddr);
+		gic_data_rdist()->pend_page = pend_page;
+	}
+
+	/* Disable LPIs */
+	val = readl_relaxed(rbase + GICR_CTLR);
+	val &= ~GICR_CTLR_ENABLE_LPIS;
+	writel_relaxed(val, rbase + GICR_CTLR);
+
+	/*
+	 * Make sure any change to the table is observable by the GIC.
+	 */
+	dsb(sy);
+
+	/* set PROPBASE */
+	val = (page_to_phys(gic_rdists->prop_page) |
+	       GICR_PROPBASER_InnerShareable |
+	       GICR_PROPBASER_WaWb |
+	       ((LPI_NRBITS - 1) & GICR_PROPBASER_IDBITS_MASK));
+
+	writeq_relaxed(val, rbase + GICR_PROPBASER);
+	tmp = readq_relaxed(rbase + GICR_PROPBASER);
+
+	if ((tmp ^ val) & GICR_PROPBASER_SHAREABILITY_MASK) {
+		pr_info_once("GIC: using cache flushing for LPI property table\n");
+		gic_rdists->flags |= RDIST_FLAGS_PROPBASE_NEEDS_FLUSHING;
+	}
+
+	/* set PENDBASE */
+	val = (page_to_phys(pend_page) |
+	       GICR_PROPBASER_InnerShareable |
+	       GICR_PROPBASER_WaWb);
+
+	writeq_relaxed(val, rbase + GICR_PENDBASER);
+
+	/* Enable LPIs */
+	val = readl_relaxed(rbase + GICR_CTLR);
+	val |= GICR_CTLR_ENABLE_LPIS;
+	writel_relaxed(val, rbase + GICR_CTLR);
+
+	/* Make sure the GIC has seen the above */
+	dsb(sy);
+}
+
+static void its_cpu_init_collection(void)
+{
+	struct its_node *its;
+	int cpu;
+
+	spin_lock(&its_lock);
+	cpu = smp_processor_id();
+
+	list_for_each_entry(its, &its_nodes, entry) {
+		u64 target;
+
+		/*
+		 * We now have to bind each collection to its target
+		 * redistributor.
+		 */
+		if (readq_relaxed(its->base + GITS_TYPER) & GITS_TYPER_PTA) {
+			/*
+			 * This ITS wants the physical address of the
+			 * redistributor.
+			 */
+			target = gic_data_rdist()->phys_base;
+		} else {
+			/*
+			 * This ITS wants a linear CPU number.
+			 */
+			target = readq_relaxed(gic_data_rdist_rd_base() + GICR_TYPER);
+			target = GICR_TYPER_CPU_NUMBER(target);
+		}
+
+		/* Perform collection mapping */
+		its->collections[cpu].target_address = target;
+		its->collections[cpu].col_id = cpu;
+
+		its_send_mapc(its, &its->collections[cpu], 1);
+		its_send_invall(its, &its->collections[cpu]);
+	}
+
+	spin_unlock(&its_lock);
+}
+
+static struct its_device *its_find_device(struct its_node *its, u32 dev_id)
+{
+	struct its_device *its_dev = NULL, *tmp;
+	unsigned long flags;
+
+	raw_spin_lock_irqsave(&its->lock, flags);
+
+	list_for_each_entry(tmp, &its->its_device_list, entry) {
+		if (tmp->device_id == dev_id) {
+			its_dev = tmp;
+			break;
+		}
+	}
+
+	raw_spin_unlock_irqrestore(&its->lock, flags);
+
+	return its_dev;
+}
+
+static struct its_device *its_create_device(struct its_node *its, u32 dev_id,
+					    int nvecs)
+{
+	struct its_device *dev;
+	unsigned long *lpi_map;
+	unsigned long flags;
+	void *itt;
+	int lpi_base;
+	int nr_lpis;
+	int nr_ites;
+	int cpu;
+	int sz;
+
+	dev = kzalloc(sizeof(*dev), GFP_KERNEL);
+	/*
+	 * At least one bit of EventID is being used, hence a minimum
+	 * of two entries. No, the architecture doesn't let you
+	 * express an ITT with a single entry.
+	 */
+	nr_ites = max(2UL, roundup_pow_of_two(nvecs));
+	sz = nr_ites * its->ite_size;
+	sz = max(sz, ITS_ITT_ALIGN) + ITS_ITT_ALIGN - 1;
+	itt = kzalloc(sz, GFP_KERNEL);
+	lpi_map = its_lpi_alloc_chunks(nvecs, &lpi_base, &nr_lpis);
+
+	if (!dev || !itt || !lpi_map) {
+		kfree(dev);
+		kfree(itt);
+		kfree(lpi_map);
+		return NULL;
+	}
+
+	dev->its = its;
+	dev->itt = itt;
+	dev->nr_ites = nr_ites;
+	dev->lpi_map = lpi_map;
+	dev->lpi_base = lpi_base;
+	dev->nr_lpis = nr_lpis;
+	dev->device_id = dev_id;
+	INIT_LIST_HEAD(&dev->entry);
+
+	raw_spin_lock_irqsave(&its->lock, flags);
+	list_add(&dev->entry, &its->its_device_list);
+	raw_spin_unlock_irqrestore(&its->lock, flags);
+
+	/* Bind the device to the first possible CPU */
+	cpu = cpumask_first(cpu_online_mask);
+	dev->collection = &its->collections[cpu];
+
+	/* Map device to its ITT */
+	its_send_mapd(dev, 1);
+
+	return dev;
+}
+
+static void its_free_device(struct its_device *its_dev)
+{
+	unsigned long flags;
+
+	raw_spin_lock_irqsave(&its_dev->its->lock, flags);
+	list_del(&its_dev->entry);
+	raw_spin_unlock_irqrestore(&its_dev->its->lock, flags);
+	kfree(its_dev->itt);
+	kfree(its_dev);
+}
+
+static int its_alloc_device_irq(struct its_device *dev, irq_hw_number_t *hwirq)
+{
+	int idx;
+
+	idx = find_first_zero_bit(dev->lpi_map, dev->nr_lpis);
+	if (idx == dev->nr_lpis)
+		return -ENOSPC;
+
+	*hwirq = dev->lpi_base + idx;
+	set_bit(idx, dev->lpi_map);
+
+	return 0;
+}
+
+struct its_pci_alias {
+	struct pci_dev	*pdev;
+	u32		dev_id;
+	u32		count;
+};
+
+static int its_pci_msi_vec_count(struct pci_dev *pdev)
+{
+	int msi, msix;
+
+	msi = max(pci_msi_vec_count(pdev), 0);
+	msix = max(pci_msix_vec_count(pdev), 0);
+
+	return max(msi, msix);
+}
+
+static int its_get_pci_alias(struct pci_dev *pdev, u16 alias, void *data)
+{
+	struct its_pci_alias *dev_alias = data;
+
+	dev_alias->dev_id = alias;
+	if (pdev != dev_alias->pdev)
+		dev_alias->count += its_pci_msi_vec_count(dev_alias->pdev);
+
+	return 0;
+}
+
+static int its_msi_prepare(struct irq_domain *domain, struct device *dev,
+			   int nvec, msi_alloc_info_t *info)
+{
+	struct pci_dev *pdev;
+	struct its_node *its;
+	struct its_device *its_dev;
+	struct its_pci_alias dev_alias;
+
+	if (!dev_is_pci(dev))
+		return -EINVAL;
+
+	pdev = to_pci_dev(dev);
+	dev_alias.pdev = pdev;
+	dev_alias.count = nvec;
+
+	pci_for_each_dma_alias(pdev, its_get_pci_alias, &dev_alias);
+	its = domain->parent->host_data;
+
+	its_dev = its_find_device(its, dev_alias.dev_id);
+	if (its_dev) {
+		/*
+		 * We already have seen this ID, probably through
+		 * another alias (PCI bridge of some sort). No need to
+		 * create the device.
+		 */
+		dev_dbg(dev, "Reusing ITT for devID %x\n", dev_alias.dev_id);
+		goto out;
+	}
+
+	its_dev = its_create_device(its, dev_alias.dev_id, dev_alias.count);
+	if (!its_dev)
+		return -ENOMEM;
+
+	dev_dbg(&pdev->dev, "ITT %d entries, %d bits\n",
+		dev_alias.count, ilog2(dev_alias.count));
+out:
+	info->scratchpad[0].ptr = its_dev;
+	info->scratchpad[1].ptr = dev;
+	return 0;
+}
+
+static struct msi_domain_ops its_pci_msi_ops = {
+	.msi_prepare	= its_msi_prepare,
+};
+
+static struct msi_domain_info its_pci_msi_domain_info = {
+	.flags	= (MSI_FLAG_USE_DEF_DOM_OPS | MSI_FLAG_USE_DEF_CHIP_OPS |
+		   MSI_FLAG_MULTI_PCI_MSI | MSI_FLAG_PCI_MSIX),
+	.ops	= &its_pci_msi_ops,
+	.chip	= &its_msi_irq_chip,
+};
+
+static int its_irq_gic_domain_alloc(struct irq_domain *domain,
+				    unsigned int virq,
+				    irq_hw_number_t hwirq)
+{
+	struct of_phandle_args args;
+
+	args.np = domain->parent->of_node;
+	args.args_count = 3;
+	args.args[0] = GIC_IRQ_TYPE_LPI;
+	args.args[1] = hwirq;
+	args.args[2] = IRQ_TYPE_EDGE_RISING;
+
+	return irq_domain_alloc_irqs_parent(domain, virq, 1, &args);
+}
+
+static int its_irq_domain_alloc(struct irq_domain *domain, unsigned int virq,
+				unsigned int nr_irqs, void *args)
+{
+	msi_alloc_info_t *info = args;
+	struct its_device *its_dev = info->scratchpad[0].ptr;
+	irq_hw_number_t hwirq;
+	int err;
+	int i;
+
+	for (i = 0; i < nr_irqs; i++) {
+		err = its_alloc_device_irq(its_dev, &hwirq);
+		if (err)
+			return err;
+
+		err = its_irq_gic_domain_alloc(domain, virq + i, hwirq);
+		if (err)
+			return err;
+
+		irq_domain_set_hwirq_and_chip(domain, virq + i,
+					      hwirq, &its_irq_chip, its_dev);
+		dev_dbg(info->scratchpad[1].ptr, "ID:%d pID:%d vID:%d\n",
+			(int)(hwirq - its_dev->lpi_base), (int)hwirq, virq + i);
+	}
+
+	return 0;
+}
+
+static void its_irq_domain_activate(struct irq_domain *domain,
+				    struct irq_data *d)
+{
+	struct its_device *its_dev = irq_data_get_irq_chip_data(d);
+	u32 event = its_get_event_id(d);
+
+	/* Map the GIC IRQ and event to the device */
+	its_send_mapvi(its_dev, d->hwirq, event);
+}
+
+static void its_irq_domain_deactivate(struct irq_domain *domain,
+				      struct irq_data *d)
+{
+	struct its_device *its_dev = irq_data_get_irq_chip_data(d);
+	u32 event = its_get_event_id(d);
+
+	/* Stop the delivery of interrupts */
+	its_send_discard(its_dev, event);
+}
+
+static void its_irq_domain_free(struct irq_domain *domain, unsigned int virq,
+				unsigned int nr_irqs)
+{
+	struct irq_data *d = irq_domain_get_irq_data(domain, virq);
+	struct its_device *its_dev = irq_data_get_irq_chip_data(d);
+	int i;
+
+	for (i = 0; i < nr_irqs; i++) {
+		struct irq_data *data = irq_domain_get_irq_data(domain,
+								virq + i);
+		u32 event = its_get_event_id(data);
+
+		/* Mark interrupt index as unused */
+		clear_bit(event, its_dev->lpi_map);
+
+		/* Nuke the entry in the domain */
+		irq_domain_reset_irq_data(data);
+	}
+
+	/* If all interrupts have been freed, start mopping the floor */
+	if (bitmap_empty(its_dev->lpi_map, its_dev->nr_lpis)) {
+		its_lpi_free(its_dev->lpi_map,
+			     its_dev->lpi_base,
+			     its_dev->nr_lpis);
+
+		/* Unmap device/itt */
+		its_send_mapd(its_dev, 0);
+		its_free_device(its_dev);
+	}
+
+	irq_domain_free_irqs_parent(domain, virq, nr_irqs);
+}
+
+static const struct irq_domain_ops its_domain_ops = {
+	.alloc			= its_irq_domain_alloc,
+	.free			= its_irq_domain_free,
+	.activate		= its_irq_domain_activate,
+	.deactivate		= its_irq_domain_deactivate,
+};
+
+static int its_force_quiescent(void __iomem *base)
+{
+	u32 count = 1000000;	/* 1s */
+	u32 val;
+
+	val = readl_relaxed(base + GITS_CTLR);
+	if (val & GITS_CTLR_QUIESCENT)
+		return 0;
+
+	/* Disable the generation of all interrupts to this ITS */
+	val &= ~GITS_CTLR_ENABLE;
+	writel_relaxed(val, base + GITS_CTLR);
+
+	/* Poll GITS_CTLR and wait until ITS becomes quiescent */
+	while (1) {
+		val = readl_relaxed(base + GITS_CTLR);
+		if (val & GITS_CTLR_QUIESCENT)
+			return 0;
+
+		count--;
+		if (!count)
+			return -EBUSY;
+
+		cpu_relax();
+		udelay(1);
+	}
+}
+
+static int its_probe(struct device_node *node, struct irq_domain *parent)
+{
+	struct resource res;
+	struct its_node *its;
+	void __iomem *its_base;
+	u32 val;
+	u64 baser, tmp;
+	int err;
+
+	err = of_address_to_resource(node, 0, &res);
+	if (err) {
+		pr_warn("%s: no regs?\n", node->full_name);
+		return -ENXIO;
+	}
+
+	its_base = ioremap(res.start, resource_size(&res));
+	if (!its_base) {
+		pr_warn("%s: unable to map registers\n", node->full_name);
+		return -ENOMEM;
+	}
+
+	val = readl_relaxed(its_base + GITS_PIDR2) & GIC_PIDR2_ARCH_MASK;
+	if (val != 0x30 && val != 0x40) {
+		pr_warn("%s: no ITS detected, giving up\n", node->full_name);
+		err = -ENODEV;
+		goto out_unmap;
+	}
+
+	err = its_force_quiescent(its_base);
+	if (err) {
+		pr_warn("%s: failed to quiesce, giving up\n",
+			node->full_name);
+		goto out_unmap;
+	}
+
+	pr_info("ITS: %s\n", node->full_name);
+
+	its = kzalloc(sizeof(*its), GFP_KERNEL);
+	if (!its) {
+		err = -ENOMEM;
+		goto out_unmap;
+	}
+
+	raw_spin_lock_init(&its->lock);
+	INIT_LIST_HEAD(&its->entry);
+	INIT_LIST_HEAD(&its->its_device_list);
+	its->base = its_base;
+	its->phys_base = res.start;
+	its->msi_chip.of_node = node;
+	its->ite_size = ((readl_relaxed(its_base + GITS_TYPER) >> 4) & 0xf) + 1;
+
+	its->cmd_base = kzalloc(ITS_CMD_QUEUE_SZ, GFP_KERNEL);
+	if (!its->cmd_base) {
+		err = -ENOMEM;
+		goto out_free_its;
+	}
+	its->cmd_write = its->cmd_base;
+
+	err = its_alloc_tables(its);
+	if (err)
+		goto out_free_cmd;
+
+	err = its_alloc_collections(its);
+	if (err)
+		goto out_free_tables;
+
+	baser = (virt_to_phys(its->cmd_base)	|
+		 GITS_CBASER_WaWb		|
+		 GITS_CBASER_InnerShareable	|
+		 (ITS_CMD_QUEUE_SZ / SZ_4K - 1)	|
+		 GITS_CBASER_VALID);
+
+	writeq_relaxed(baser, its->base + GITS_CBASER);
+	tmp = readq_relaxed(its->base + GITS_CBASER);
+	writeq_relaxed(0, its->base + GITS_CWRITER);
+	writel_relaxed(GITS_CTLR_ENABLE, its->base + GITS_CTLR);
+
+	if ((tmp ^ baser) & GITS_BASER_SHAREABILITY_MASK) {
+		pr_info("ITS: using cache flushing for cmd queue\n");
+		its->flags |= ITS_FLAGS_CMDQ_NEEDS_FLUSHING;
+	}
+
+	if (of_property_read_bool(its->msi_chip.of_node, "msi-controller")) {
+		its->domain = irq_domain_add_tree(NULL, &its_domain_ops, its);
+		if (!its->domain) {
+			err = -ENOMEM;
+			goto out_free_tables;
+		}
+
+		its->domain->parent = parent;
+
+		its->msi_chip.domain = pci_msi_create_irq_domain(node,
+								 &its_pci_msi_domain_info,
+								 its->domain);
+		if (!its->msi_chip.domain) {
+			err = -ENOMEM;
+			goto out_free_domains;
+		}
+
+		err = of_pci_msi_chip_add(&its->msi_chip);
+		if (err)
+			goto out_free_domains;
+	}
+
+	spin_lock(&its_lock);
+	list_add(&its->entry, &its_nodes);
+	spin_unlock(&its_lock);
+
+	return 0;
+
+out_free_domains:
+	if (its->msi_chip.domain)
+		irq_domain_remove(its->msi_chip.domain);
+	if (its->domain)
+		irq_domain_remove(its->domain);
+out_free_tables:
+	its_free_tables(its);
+out_free_cmd:
+	kfree(its->cmd_base);
+out_free_its:
+	kfree(its);
+out_unmap:
+	iounmap(its_base);
+	pr_err("ITS: failed probing %s (%d)\n", node->full_name, err);
+	return err;
+}
+
+static bool gic_rdists_supports_plpis(void)
+{
+	return !!(readl_relaxed(gic_data_rdist_rd_base() + GICR_TYPER) & GICR_TYPER_PLPIS);
+}
+
+int its_cpu_init(void)
+{
+	if (!list_empty(&its_nodes)) {
+		if (!gic_rdists_supports_plpis()) {
+			pr_info("CPU%d: LPIs not supported\n", smp_processor_id());
+			return -ENXIO;
+		}
+		its_cpu_init_lpis();
+		its_cpu_init_collection();
+	}
+
+	return 0;
+}
+
+static struct of_device_id its_device_id[] = {
+	{	.compatible	= "arm,gic-v3-its",	},
+	{},
+};
+
+int its_init(struct device_node *node, struct rdists *rdists,
+	     struct irq_domain *parent_domain)
+{
+	struct device_node *np;
+
+	for (np = of_find_matching_node(node, its_device_id); np;
+	     np = of_find_matching_node(np, its_device_id)) {
+		its_probe(np, parent_domain);
+	}
+
+	if (list_empty(&its_nodes)) {
+		pr_warn("ITS: No ITS available, not enabling LPIs\n");
+		return -ENXIO;
+	}
+
+	gic_rdists = rdists;
+	gic_root_node = node;
+
+	its_alloc_lpi_tables();
+	its_lpi_init(rdists->id_bits);
+
+	return 0;
+}
-- 
1.7.9.5
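
For readers following the "How we allocate LPIs" comment in the patch
above, here is a minimal standalone sketch of the chunk arithmetic. It is
not part of the patch: the helper names and the LPI_BASE constant are
made up for illustration, and only the numbers (8192 reserved IDs, chunks
of 32) come from the imported code.

```c
#include <assert.h>
#include <stdint.h>

#define LPI_BASE             8192 /* IDs below this are SGIs/PPIs/SPIs */
#define IRQS_PER_CHUNK_SHIFT 5
#define IRQS_PER_CHUNK       (1 << IRQS_PER_CHUNK_SHIFT)

/* Index of the 32-LPI chunk that contains a given LPI number. */
static int lpi_to_chunk(uint32_t lpi)
{
    assert(lpi >= LPI_BASE);
    return (int)((lpi - LPI_BASE) >> IRQS_PER_CHUNK_SHIFT);
}

/* First LPI number covered by a chunk. */
static uint32_t chunk_to_lpi(int chunk)
{
    return ((uint32_t)chunk << IRQS_PER_CHUNK_SHIFT) + LPI_BASE;
}

/* Number of chunks available when the GIC implements id_bits ID bits. */
static int lpi_nr_chunks(unsigned int id_bits)
{
    return lpi_to_chunk(UINT32_C(1) << id_bits);
}

/* Chunks needed to hold nr_irqs interrupts, rounded up. */
static int chunks_needed(int nr_irqs)
{
    return (nr_irqs + IRQS_PER_CHUNK - 1) / IRQS_PER_CHUNK;
}
```

With id_bits = 16 (the value LPI_NRBITS works out to for a 64kB property
table) this gives (65536 - 8192) >> 5 = 1792 chunks, and a request for 33
vectors consumes two chunks.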


* [RFC PATCH v2 05/22] xen/arm: gicv3: Refactor redistributor information
  2015-03-19 14:37 [RFC PATCH v2 00/22] xen/arm: Add ITS support vijay.kilari
                   ` (3 preceding siblings ...)
  2015-03-19 14:37 ` [RFC PATCH v2 04/22] xen/arm: its: Import GICv3 ITS driver from linux vijay.kilari
@ 2015-03-19 14:37 ` vijay.kilari
  2015-03-19 14:37 ` [RFC PATCH v2 06/22] xen/arm: its: Port ITS driver to xen vijay.kilari
                   ` (18 subsequent siblings)
  23 siblings, 0 replies; 109+ messages in thread
From: vijay.kilari @ 2015-03-19 14:37 UTC (permalink / raw)
  To: Ian.Campbell, julien.grall, stefano.stabellini,
	stefano.stabellini, tim, xen-devel
  Cc: Prasun.Kapoor, Vijaya Kumar K, manish.jaggi, vijay.kilari

From: Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com>

Separate the redistributor information into rdist and rdist_prop
structures.

rdist_prop holds the information common to all redistributors,
and rdist holds the per-CPU information.

The per-CPU rdist is defined as a global so that it can be shared
with the ITS driver.
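
As a rough illustration of the split (not the actual Xen code), the
sketch below models the per-CPU rdist with a plain array and shows how
the physical base follows from the offset of the mapped pointer within
its region. NR_CPUS, rdist_percpu, the trimmed-down rdist_region and
record_rdist() are assumptions made only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NR_CPUS 4               /* hypothetical CPU count for the sketch */

typedef uint64_t paddr_t;

/* Per-CPU redistributor state (mirrors struct rdist in this patch). */
struct rdist {
    void    *rbase;             /* mapped redistributor base */
    void    *pend_page;         /* LPI pending table */
    paddr_t  phys_base;         /* physical redistributor base */
};

/* State common to all redistributors (mirrors struct rdist_prop). */
struct rdist_prop {
    void     *prop_page;
    int       id_bits;
    uint64_t  flags;
};

/* Reduced stand-in for a redistributor region and its mapping. */
struct rdist_region {
    paddr_t  base;
    char    *map_base;
};

/* Stand-in for DEFINE_PER_CPU(struct rdist, rdist). */
static struct rdist rdist_percpu[NR_CPUS];

/* Record the redistributor found at ptr inside region for this CPU. */
static void record_rdist(unsigned int cpu, const struct rdist_region *region,
                         char *ptr)
{
    ptrdiff_t offset;

    assert(cpu < NR_CPUS && ptr >= region->map_base);
    offset = ptr - region->map_base;
    rdist_percpu[cpu].rbase = ptr;
    rdist_percpu[cpu].phys_base = region->base + (paddr_t)offset;
}
```

The point of keeping phys_base per CPU is that the ITS code needs the
physical redistributor address when GITS_TYPER.PTA is set.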

Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com>
---
 xen/arch/arm/gic-v3.c             |   15 ++++++++++-----
 xen/include/asm-arm/gic_v3_defs.h |   15 +++++++++++++++
 2 files changed, 25 insertions(+), 5 deletions(-)

diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
index ab80670..2b406e6 100644
--- a/xen/arch/arm/gic-v3.c
+++ b/xen/arch/arm/gic-v3.c
@@ -53,6 +53,7 @@ static struct {
     paddr_t dbase;            /* Address of distributor registers */
     paddr_t dbase_size;
     void __iomem *map_dbase;  /* Mapped address of distributor registers */
+    struct rdist_prop rdist_data;
     struct rdist_region *rdist_regions;
     uint32_t  rdist_stride;
     unsigned int rdist_count; /* Number of rdist regions count */
@@ -63,10 +64,10 @@ static struct {
 static struct gic_info gicv3_info;
 
 /* per-cpu re-distributor base */
-static DEFINE_PER_CPU(void __iomem*, rbase);
+DEFINE_PER_CPU(struct rdist, rdist);
 
 #define GICD                   (gicv3.map_dbase)
-#define GICD_RDIST_BASE        (this_cpu(rbase))
+#define GICD_RDIST_BASE        (per_cpu(rdist, smp_processor_id()).rbase)
 #define GICD_RDIST_SGI_BASE    (GICD_RDIST_BASE + SZ_64K)
 
 /*
@@ -609,6 +610,7 @@ static int __init gicv3_populate_rdist(void)
     uint32_t aff;
     uint32_t reg;
     uint64_t typer;
+    uint64_t offset;
     uint64_t mpidr = cpu_logical_map(smp_processor_id());
 
     /*
@@ -644,9 +646,12 @@ static int __init gicv3_populate_rdist(void)
 
             if ( (typer >> 32) == aff )
             {
-                this_cpu(rbase) = ptr;
-                printk("GICv3: CPU%d: Found redistributor in region %d @%p\n",
-                        smp_processor_id(), i, ptr);
+                offset = ptr - gicv3.rdist_regions[i].map_base;
+                per_cpu(rdist, smp_processor_id()).rbase = ptr;
+                per_cpu(rdist, smp_processor_id()).phys_base = gicv3.rdist_regions[i].base + offset;
+                printk("GICv3: CPU%d: Found redistributor in region %d @%"PRIpaddr"\n",
+                        smp_processor_id(), i,
+                        per_cpu(rdist, smp_processor_id()).phys_base);
                 return 0;
             }
             if ( gicv3.rdist_stride )
diff --git a/xen/include/asm-arm/gic_v3_defs.h b/xen/include/asm-arm/gic_v3_defs.h
index b8a1c2e..4e64b56 100644
--- a/xen/include/asm-arm/gic_v3_defs.h
+++ b/xen/include/asm-arm/gic_v3_defs.h
@@ -152,6 +152,21 @@
 #define ICH_SGI_IRQ_SHIFT            24
 #define ICH_SGI_IRQ_MASK             0xf
 #define ICH_SGI_TARGETLIST_MASK      0xffff
+
+struct rdist {
+    void __iomem *rbase;
+    void * pend_page;
+    paddr_t phys_base;
+};
+
+struct rdist_prop {
+    void * prop_page;
+    int    id_bits;
+    uint64_t flags;
+};
+
+DECLARE_PER_CPU(struct rdist, rdist);
+
 #endif /* __ASM_ARM_GIC_V3_DEFS_H__ */
 
 /*
-- 
1.7.9.5


* [RFC PATCH v2 06/22] xen/arm: its: Port ITS driver to xen
  2015-03-19 14:37 [RFC PATCH v2 00/22] xen/arm: Add ITS support vijay.kilari
                   ` (4 preceding siblings ...)
  2015-03-19 14:37 ` [RFC PATCH v2 05/22] xen/arm: gicv3: Refactor redistributor information vijay.kilari
@ 2015-03-19 14:37 ` vijay.kilari
  2015-03-20 15:06   ` Julien Grall
  2015-04-01 11:34   ` Ian Campbell
  2015-03-19 14:37 ` [RFC PATCH v2 07/22] xen/arm: its: Move ITS command encode helper functions vijay.kilari
                   ` (17 subsequent siblings)
  23 siblings, 2 replies; 109+ messages in thread
From: vijay.kilari @ 2015-03-19 14:37 UTC (permalink / raw)
  To: Ian.Campbell, julien.grall, stefano.stabellini,
	stefano.stabellini, tim, xen-devel
  Cc: Prasun.Kapoor, Vijaya Kumar K, manish.jaggi, vijay.kilari

From: Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com>

This patch only makes the ITS driver imported from Linux
compile in the Xen environment.

The following changes are made:
  - memory allocation APIs are changed
  - raw spin lock APIs are changed to normal spin lock APIs
  - debug prints are changed to Xen debug prints
  - the msi chip setup_irq and teardown_irq functions are removed
  - Linux irqchip functions are removed
  - gic_v3_defs.h is updated
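
To illustrate the debug print change, here is a small standalone sketch
of the wrapper pattern. It is only an approximation: printk() is replaced
by a hypothetical stand-in that formats into a buffer so the result can
be inspected, and the XENLOG_* values are placeholders rather than the
definitions from the Xen headers.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Placeholder log-level prefixes; the real ones live in the Xen headers. */
#define XENLOG_ERR     "<0>"
#define XENLOG_WARNING "<1>"

/* Last formatted message, kept so that callers can inspect it. */
static char last_line[128];

/* Hypothetical stand-in for printk(): formats into last_line. */
static int printk(const char *fmt, ...)
{
    va_list ap;
    int n;

    assert(fmt != NULL);
    va_start(ap, fmt);
    n = vsnprintf(last_line, sizeof(last_line), fmt, ap);
    va_end(ap);
    return n;
}

/* Prefix every message with its level and a driver tag. */
#define its_print(lvl, fmt, ...) \
    printk(lvl "GIC-ITS:" fmt, ## __VA_ARGS__)

#define its_err(fmt, ...)  its_print(XENLOG_ERR, fmt, ## __VA_ARGS__)
#define its_warn(fmt, ...) its_print(XENLOG_WARNING, fmt, ## __VA_ARGS__)
```

The "## __VA_ARGS__" form is a GNU extension that drops the trailing
comma when a wrapper is called with a format string only, which is why
calls such as its_warn("no ITS\n") still compile.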

Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com>
---
v2: - put unused code under #if 0/#endif
    - changes to the redistributor are moved to a separate patch
    - addressed comments from the RFC version
---
 xen/arch/arm/Makefile             |    1 +
 xen/arch/arm/gic-v3-its.c         |  337 +++++++++++++++++++++----------------
 xen/include/asm-arm/gic_v3_defs.h |  116 ++++++++++++-
 3 files changed, 304 insertions(+), 150 deletions(-)

diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
index 41aba2e..66ea264 100644
--- a/xen/arch/arm/Makefile
+++ b/xen/arch/arm/Makefile
@@ -31,6 +31,7 @@ obj-y += shutdown.o
 obj-y += traps.o
 obj-y += vgic.o vgic-v2.o
 obj-$(CONFIG_ARM_64) += vgic-v3.o
+obj-$(CONFIG_ARM_64) += gic-v3-its.o
 obj-y += vtimer.o
 obj-y += vuart.o
 obj-y += hvm.o
diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
index 596b0a9..ce7ced6 100644
--- a/xen/arch/arm/gic-v3-its.c
+++ b/xen/arch/arm/gic-v3-its.c
@@ -2,6 +2,10 @@
  * Copyright (C) 2013, 2014 ARM Limited, All Rights Reserved.
  * Author: Marc Zyngier <marc.zyngier@arm.com>
  *
+ * Xen changes:
+ * Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com>
+ * Copyright (C) 2014, 2015 Cavium Inc.
+ *
  * This program is free software; you can redistribute it and/or modify
  * it under the terms of the GNU General Public License version 2 as
  * published by the Free Software Foundation.
@@ -15,28 +19,41 @@
  * along with this program.  If not, see <http://www.gnu.org/licenses/>.
  */
 
-#include <linux/bitmap.h>
-#include <linux/cpu.h>
-#include <linux/delay.h>
-#include <linux/interrupt.h>
-#include <linux/log2.h>
-#include <linux/mm.h>
-#include <linux/msi.h>
-#include <linux/of.h>
-#include <linux/of_address.h>
-#include <linux/of_irq.h>
-#include <linux/of_pci.h>
-#include <linux/of_platform.h>
-#include <linux/percpu.h>
-#include <linux/slab.h>
-
-#include <linux/irqchip/arm-gic-v3.h>
-
-#include <asm/cacheflush.h>
-#include <asm/cputype.h>
-#include <asm/exception.h>
-
-#include "irqchip.h"
+#include <xen/config.h>
+#include <xen/bitops.h>
+#include <xen/lib.h>
+#include <xen/init.h>
+#include <xen/cpu.h>
+#include <xen/mm.h>
+#include <xen/irq.h>
+#include <xen/sched.h>
+#include <xen/errno.h>
+#include <xen/delay.h>
+#include <xen/device_tree.h>
+#include <xen/libfdt/libfdt.h>
+#include <xen/xmalloc.h>
+#include <xen/list.h>
+#include <xen/sizes.h>
+#include <xen/vmap.h>
+#include <asm/p2m.h>
+#include <asm/domain.h>
+#include <asm/io.h>
+#include <asm/device.h>
+#include <asm/gic.h>
+#include <asm/gic_v3_defs.h>
+
+#define its_print(lvl, fmt, ...)                                      \
+	printk(lvl "GIC-ITS:" fmt, ## __VA_ARGS__)
+
+#define its_err(fmt, ...) its_print(XENLOG_ERR, fmt, ## __VA_ARGS__)
+
+#define its_dbg(fmt, ...)                                             \
+	its_print(XENLOG_DEBUG, fmt, ## __VA_ARGS__)
+
+#define its_info(fmt, ...)                                            \
+	its_print(XENLOG_INFO, fmt, ## __VA_ARGS__)
+
+#define its_warn(fmt, ...) its_print(XENLOG_WARNING, fmt, ## __VA_ARGS__)
 
 #define ITS_FLAGS_CMDQ_NEEDS_FLUSHING		(1 << 0)
 
@@ -58,10 +75,8 @@ struct its_collection {
  * devices writing to it.
  */
 struct its_node {
-	raw_spinlock_t		lock;
+	spinlock_t		lock;
 	struct list_head	entry;
-	struct msi_controller	msi_chip;
-	struct irq_domain	*domain;
 	void __iomem		*base;
 	unsigned long		phys_base;
 	struct its_cmd_block	*cmd_base;
@@ -85,7 +100,7 @@ struct its_device {
 	struct its_collection	*collection;
 	void			*itt;
 	unsigned long		*lpi_map;
-	irq_hw_number_t		lpi_base;
+	u32			lpi_base;
 	int			nr_lpis;
 	u32			nr_ites;
 	u32			device_id;
@@ -93,11 +108,11 @@ struct its_device {
 
 static LIST_HEAD(its_nodes);
 static DEFINE_SPINLOCK(its_lock);
-static struct device_node *gic_root_node;
-static struct rdists *gic_rdists;
+static struct dt_device_node *gic_root_node;
+static struct rdist_prop  *gic_rdists;
 
-#define gic_data_rdist()		(raw_cpu_ptr(gic_rdists->rdist))
-#define gic_data_rdist_rd_base()	(gic_data_rdist()->rd_base)
+#define gic_data_rdist()		(per_cpu(rdist, smp_processor_id()))
+#define gic_data_rdist_rd_base()	(per_cpu(rdist, smp_processor_id()).rbase)
 
 /*
  * ITS command descriptors - parameters to be encoded in a command
@@ -228,10 +243,10 @@ static struct its_collection *its_build_mapd_cmd(struct its_cmd_block *cmd,
 						 struct its_cmd_desc *desc)
 {
 	unsigned long itt_addr;
-	u8 size = ilog2(desc->its_mapd_cmd.dev->nr_ites);
+	u8 size = max(fls(desc->its_mapd_cmd.dev->nr_ites) - 1, 1);
 
-	itt_addr = virt_to_phys(desc->its_mapd_cmd.dev->itt);
-	itt_addr = ALIGN(itt_addr, ITS_ITT_ALIGN);
+	itt_addr = __pa(desc->its_mapd_cmd.dev->itt);
+	itt_addr = ROUNDUP(itt_addr, ITS_ITT_ALIGN);
 
 	its_encode_cmd(cmd, GITS_CMD_MAPD);
 	its_encode_devid(cmd, desc->its_mapd_cmd.dev->device_id);
@@ -348,7 +363,7 @@ static struct its_cmd_block *its_allocate_entry(struct its_node *its)
 	while (its_queue_full(its)) {
 		count--;
 		if (!count) {
-			pr_err_ratelimited("ITS queue not draining\n");
+			its_err("ITS queue not draining\n");
 			return NULL;
 		}
 		cpu_relax();
@@ -380,7 +395,7 @@ static void its_flush_cmd(struct its_node *its, struct its_cmd_block *cmd)
 	 * the ITS.
 	 */
 	if (its->flags & ITS_FLAGS_CMDQ_NEEDS_FLUSHING)
-		__flush_dcache_area(cmd, sizeof(*cmd));
+		clean_and_invalidate_dcache_va_range(cmd, sizeof(*cmd));
 	else
 		dsb(ishst);
 }
@@ -402,7 +417,7 @@ static void its_wait_for_range_completion(struct its_node *its,
 
 		count--;
 		if (!count) {
-			pr_err_ratelimited("ITS queue timeout\n");
+			its_err("ITS queue timeout\n");
 			return;
 		}
 		cpu_relax();
@@ -418,12 +433,12 @@ static void its_send_single_command(struct its_node *its,
 	struct its_collection *sync_col;
 	unsigned long flags;
 
-	raw_spin_lock_irqsave(&its->lock, flags);
+	spin_lock_irqsave(&its->lock, flags);
 
 	cmd = its_allocate_entry(its);
 	if (!cmd) {		/* We're soooooo screewed... */
-		pr_err_ratelimited("ITS can't allocate, dropping command\n");
-		raw_spin_unlock_irqrestore(&its->lock, flags);
+		its_err("ITS can't allocate, dropping command\n");
+		spin_unlock_irqrestore(&its->lock, flags);
 		return;
 	}
 	sync_col = builder(cmd, desc);
@@ -432,7 +447,7 @@ static void its_send_single_command(struct its_node *its,
 	if (sync_col) {
 		sync_cmd = its_allocate_entry(its);
 		if (!sync_cmd) {
-			pr_err_ratelimited("ITS can't SYNC, skipping\n");
+			its_err("ITS can't SYNC, skipping\n");
 			goto post;
 		}
 		its_encode_cmd(sync_cmd, GITS_CMD_SYNC);
@@ -443,12 +458,13 @@ static void its_send_single_command(struct its_node *its,
 
 post:
 	next_cmd = its_post_commands(its);
-	raw_spin_unlock_irqrestore(&its->lock, flags);
+	spin_unlock_irqrestore(&its->lock, flags);
 
 	its_wait_for_range_completion(its, cmd, next_cmd);
 }
 
-static void its_send_inv(struct its_device *dev, u32 event_id)
+/* TODO: Remove static for the sake of compilation */
+void its_send_inv(struct its_device *dev, u32 event_id)
 {
 	struct its_cmd_desc desc;
 
@@ -479,7 +495,8 @@ static void its_send_mapc(struct its_node *its, struct its_collection *col,
 	its_send_single_command(its, its_build_mapc_cmd, &desc);
 }
 
-static void its_send_mapvi(struct its_device *dev, u32 irq_id, u32 id)
+/* TODO: Remove static for the sake of compilation */
+void its_send_mapvi(struct its_device *dev, u32 irq_id, u32 id)
 {
 	struct its_cmd_desc desc;
 
@@ -490,7 +507,8 @@ static void its_send_mapvi(struct its_device *dev, u32 irq_id, u32 id)
 	its_send_single_command(dev->its, its_build_mapvi_cmd, &desc);
 }
 
-static void its_send_movi(struct its_device *dev,
+/* TODO: Remove static for the sake of compilation */
+void its_send_movi(struct its_device *dev,
 			  struct its_collection *col, u32 id)
 {
 	struct its_cmd_desc desc;
@@ -502,7 +520,8 @@ static void its_send_movi(struct its_device *dev,
 	its_send_single_command(dev->its, its_build_movi_cmd, &desc);
 }
 
-static void its_send_discard(struct its_device *dev, u32 id)
+/* TODO: Remove static for the sake of compilation */
+void its_send_discard(struct its_device *dev, u32 id)
 {
 	struct its_cmd_desc desc;
 
@@ -522,6 +541,11 @@ static void its_send_invall(struct its_node *its, struct its_collection *col)
 }
 
 /*
+ * The irqchip functions below are no longer required.
+ * TODO: Will be implemented in a separate patch.
+ */
+#if 0
+/*
  * irqchip functions - assumes MSI, mostly.
  */
 
@@ -630,6 +654,7 @@ static struct irq_chip its_msi_irq_chip = {
 	.irq_eoi		= irq_chip_eoi_parent,
 	.irq_write_msi_msg	= pci_msi_domain_write_msg,
 };
+#endif
 
 /*
  * How we allocate LPIs:
@@ -662,25 +687,24 @@ static int its_lpi_init(u32 id_bits)
 {
 	lpi_chunks = its_lpi_to_chunk(1UL << id_bits);
 
-	lpi_bitmap = kzalloc(BITS_TO_LONGS(lpi_chunks) * sizeof(long),
-			     GFP_KERNEL);
+	lpi_bitmap = xzalloc_bytes(BITS_TO_LONGS(lpi_chunks) * sizeof(long));
 	if (!lpi_bitmap) {
 		lpi_chunks = 0;
 		return -ENOMEM;
 	}
 
-	pr_info("ITS: Allocated %d chunks for LPIs\n", (int)lpi_chunks);
+	its_info("ITS: Allocated %d chunks for LPIs\n", (int)lpi_chunks);
 	return 0;
 }
 
-static unsigned long *its_lpi_alloc_chunks(int nr_irqs, int *base, int *nr_ids)
+static unsigned long *its_lpi_alloc_chunks(int nirqs, int *base, int *nr_ids)
 {
 	unsigned long *bitmap = NULL;
 	int chunk_id;
 	int nr_chunks;
 	int i;
 
-	nr_chunks = DIV_ROUND_UP(nr_irqs, IRQS_PER_CHUNK);
+	nr_chunks = DIV_ROUND_UP(nirqs, IRQS_PER_CHUNK);
 
 	spin_lock(&lpi_lock);
 
@@ -696,8 +720,7 @@ static unsigned long *its_lpi_alloc_chunks(int nr_irqs, int *base, int *nr_ids)
 	if (!nr_chunks)
 		goto out;
 
-	bitmap = kzalloc(BITS_TO_LONGS(nr_chunks * IRQS_PER_CHUNK) * sizeof (long),
-			 GFP_ATOMIC);
+	bitmap = xzalloc_bytes(BITS_TO_LONGS(nr_chunks * IRQS_PER_CHUNK) * sizeof (long));
 	if (!bitmap)
 		goto out;
 
@@ -713,7 +736,8 @@ out:
 	return bitmap;
 }
 
-static void its_lpi_free(unsigned long *bitmap, int base, int nr_ids)
+/* TODO: Remove static for the sake of compilation */
+void its_lpi_free(unsigned long *bitmap, int base, int nr_ids)
 {
 	int lpi;
 
@@ -725,13 +749,13 @@ static void its_lpi_free(unsigned long *bitmap, int base, int nr_ids)
 		if (test_bit(chunk, lpi_bitmap)) {
 			clear_bit(chunk, lpi_bitmap);
 		} else {
-			pr_err("Bad LPI chunk %d\n", chunk);
+			its_err("Bad LPI chunk %d\n", chunk);
 		}
 	}
 
 	spin_unlock(&lpi_lock);
 
-	kfree(bitmap);
+	xfree(bitmap);
 }
 
 /*
@@ -745,31 +769,31 @@ static void its_lpi_free(unsigned long *bitmap, int base, int nr_ids)
 /*
  * This is how many bits of ID we need, including the useless ones.
  */
-#define LPI_NRBITS		ilog2(LPI_PROPBASE_SZ + SZ_8K)
+#define LPI_NRBITS		(fls(LPI_PROPBASE_SZ + SZ_8K) - 1)
 
 #define LPI_PROP_DEFAULT_PRIO	0xa0
 
 static int __init its_alloc_lpi_tables(void)
 {
-	phys_addr_t paddr;
+	paddr_t paddr;
 
-	gic_rdists->prop_page = alloc_pages(GFP_NOWAIT,
-					   get_order(LPI_PROPBASE_SZ));
+	gic_rdists->prop_page = alloc_xenheap_pages(get_order_from_bytes(LPI_PROPBASE_SZ), 0);
 	if (!gic_rdists->prop_page) {
-		pr_err("Failed to allocate PROPBASE\n");
+		its_err("Failed to allocate PROPBASE\n");
 		return -ENOMEM;
 	}
 
-	paddr = page_to_phys(gic_rdists->prop_page);
-	pr_info("GIC: using LPI property table @%pa\n", &paddr);
+	paddr = __pa(gic_rdists->prop_page);
+	its_info("GIC: using LPI property table @%pa\n", &paddr);
 
 	/* Priority 0xa0, Group-1, disabled */
-	memset(page_address(gic_rdists->prop_page),
+	memset(gic_rdists->prop_page,
 	       LPI_PROP_DEFAULT_PRIO | LPI_PROP_GROUP1,
 	       LPI_PROPBASE_SZ);
 
 	/* Make sure the GIC will observe the written configuration */
-	__flush_dcache_area(page_address(gic_rdists->prop_page), LPI_PROPBASE_SZ);
+	clean_and_invalidate_dcache_va_range(gic_rdists->prop_page,
+	                                     LPI_PROPBASE_SZ);
 
 	return 0;
 }
@@ -790,7 +814,7 @@ static void its_free_tables(struct its_node *its)
 
 	for (i = 0; i < GITS_BASER_NR_REGS; i++) {
 		if (its->tables[i]) {
-			free_page((unsigned long)its->tables[i]);
+			xfree(its->tables[i]);
 			its->tables[i] = NULL;
 		}
 	}
@@ -807,7 +831,7 @@ static int its_alloc_tables(struct its_node *its)
 		u64 val = readq_relaxed(its->base + GITS_BASER + i * 8);
 		u64 type = GITS_BASER_TYPE(val);
 		u64 entry_size = GITS_BASER_ENTRY_SIZE(val);
-		int order = get_order(psz);
+		int order = get_order_from_bytes(psz);
 		int alloc_size;
 		u64 tmp;
 		void *base;
@@ -827,25 +851,25 @@ static int its_alloc_tables(struct its_node *its)
 			u64 typer = readq_relaxed(its->base + GITS_TYPER);
 			u32 ids = GITS_TYPER_DEVBITS(typer);
 
-			order = get_order((1UL << ids) * entry_size);
+			order = get_order_from_bytes((1UL << ids) * entry_size);
 			if (order >= MAX_ORDER) {
 				order = MAX_ORDER - 1;
-				pr_warn("%s: Device Table too large, reduce its page order to %u\n",
-					its->msi_chip.of_node->full_name, order);
+				its_warn("Device Table too large, reduce its page order to %u\n",
+					 order);
 			}
 		}
 
 		alloc_size = (1 << order) * PAGE_SIZE;
-		base = (void *)__get_free_pages(GFP_KERNEL | __GFP_ZERO, order);
+		base = alloc_xenheap_pages(order, 0);
 		if (!base) {
 			err = -ENOMEM;
 			goto out_free;
 		}
-
+		memset(base, 0, alloc_size);
 		its->tables[i] = base;
 
 retry_baser:
-		val = (virt_to_phys(base) 				 |
+		val = (__pa(base) 					 |
 		       (type << GITS_BASER_TYPE_SHIFT)			 |
 		       ((entry_size - 1) << GITS_BASER_ENTRY_SIZE_SHIFT) |
 		       GITS_BASER_WaWb					 |
@@ -897,17 +921,17 @@ retry_baser:
 		}
 
 		if (val != tmp) {
-			pr_err("ITS: %s: GITS_BASER%d doesn't stick: %lx %lx\n",
-			       its->msi_chip.of_node->full_name, i,
+			its_err("ITS: GITS_BASER%d doesn't stick: %lx %lx\n",
+			       i,
 			       (unsigned long) val, (unsigned long) tmp);
 			err = -ENXIO;
 			goto out_free;
 		}
 
-		pr_info("ITS: allocated %d %s @%lx (psz %dK, shr %d)\n",
+		its_info("ITS: allocated %d %s @%lx (psz %dK, shr %d)\n",
 			(int)(alloc_size / entry_size),
 			its_base_type_string[type],
-			(unsigned long)virt_to_phys(base),
+			(unsigned long)__pa(base),
 			psz / SZ_1K, (int)shr >> GITS_BASER_SHAREABILITY_SHIFT);
 	}
 
@@ -921,8 +945,7 @@ out_free:
 
 static int its_alloc_collections(struct its_node *its)
 {
-	its->collections = kzalloc(nr_cpu_ids * sizeof(*its->collections),
-				   GFP_KERNEL);
+	its->collections = xzalloc_array(struct its_collection, nr_cpu_ids);
 	if (!its->collections)
 		return -ENOMEM;
 
@@ -932,32 +955,31 @@ static int its_alloc_collections(struct its_node *its)
 static void its_cpu_init_lpis(void)
 {
 	void __iomem *rbase = gic_data_rdist_rd_base();
-	struct page *pend_page;
+	void *pend_page;
 	u64 val, tmp;
 
 	/* If we didn't allocate the pending table yet, do it now */
-	pend_page = gic_data_rdist()->pend_page;
+	pend_page = gic_data_rdist().pend_page;
 	if (!pend_page) {
-		phys_addr_t paddr;
+		paddr_t paddr;
 		/*
 		 * The pending pages have to be at least 64kB aligned,
 		 * hence the 'max(LPI_PENDBASE_SZ, SZ_64K)' below.
 		 */
-		pend_page = alloc_pages(GFP_NOWAIT | __GFP_ZERO,
-					get_order(max(LPI_PENDBASE_SZ, SZ_64K)));
+		pend_page = alloc_xenheap_pages(get_order_from_bytes(max(LPI_PENDBASE_SZ, SZ_64K)), 0);
 		if (!pend_page) {
-			pr_err("Failed to allocate PENDBASE for CPU%d\n",
+			its_err("Failed to allocate PENDBASE for CPU%d\n",
 			       smp_processor_id());
 			return;
 		}
-
+		memset(pend_page, 0, max(LPI_PENDBASE_SZ, SZ_64K));
 		/* Make sure the GIC will observe the zero-ed page */
-		__flush_dcache_area(page_address(pend_page), LPI_PENDBASE_SZ);
+		clean_and_invalidate_dcache_va_range(pend_page, LPI_PENDBASE_SZ);
 
-		paddr = page_to_phys(pend_page);
-		pr_info("CPU%d: using LPI pending table @%pa\n",
+		paddr = __pa(pend_page);
+		its_info("CPU%d: using LPI pending table @%pa\n",
 			smp_processor_id(), &paddr);
-		gic_data_rdist()->pend_page = pend_page;
+		gic_data_rdist().pend_page = pend_page;
 	}
 
 	/* Disable LPIs */
@@ -971,7 +993,7 @@ static void its_cpu_init_lpis(void)
 	dsb(sy);
 
 	/* set PROPBASE */
-	val = (page_to_phys(gic_rdists->prop_page) |
+	val = (__pa(gic_rdists->prop_page)   |
 	       GICR_PROPBASER_InnerShareable |
 	       GICR_PROPBASER_WaWb |
 	       ((LPI_NRBITS - 1) & GICR_PROPBASER_IDBITS_MASK));
@@ -980,12 +1002,12 @@ static void its_cpu_init_lpis(void)
 	tmp = readq_relaxed(rbase + GICR_PROPBASER);
 
 	if ((tmp ^ val) & GICR_PROPBASER_SHAREABILITY_MASK) {
-		pr_info_once("GIC: using cache flushing for LPI property table\n");
+		its_info("GIC: using cache flushing for LPI property table\n");
 		gic_rdists->flags |= RDIST_FLAGS_PROPBASE_NEEDS_FLUSHING;
 	}
 
 	/* set PENDBASE */
-	val = (page_to_phys(pend_page) |
+	val = (__pa(pend_page)               |
 	       GICR_PROPBASER_InnerShareable |
 	       GICR_PROPBASER_WaWb);
 
@@ -1020,7 +1042,7 @@ static void its_cpu_init_collection(void)
 			 * This ITS wants the physical address of the
 			 * redistributor.
 			 */
-			target = gic_data_rdist()->phys_base;
+			target = gic_data_rdist().phys_base;
 		} else {
 			/*
 			 * This ITS wants a linear CPU number.
@@ -1040,12 +1062,13 @@ static void its_cpu_init_collection(void)
 	spin_unlock(&its_lock);
 }
 
-static struct its_device *its_find_device(struct its_node *its, u32 dev_id)
+/* TODO: Remove static for the sake of compilation */
+struct its_device *its_find_device(struct its_node *its, u32 dev_id)
 {
 	struct its_device *its_dev = NULL, *tmp;
 	unsigned long flags;
 
-	raw_spin_lock_irqsave(&its->lock, flags);
+	spin_lock_irqsave(&its->lock, flags);
 
 	list_for_each_entry(tmp, &its->its_device_list, entry) {
 		if (tmp->device_id == dev_id) {
@@ -1054,12 +1077,13 @@ static struct its_device *its_find_device(struct its_node *its, u32 dev_id)
 		}
 	}
 
-	raw_spin_unlock_irqrestore(&its->lock, flags);
+	spin_unlock_irqrestore(&its->lock, flags);
 
 	return its_dev;
 }
 
-static struct its_device *its_create_device(struct its_node *its, u32 dev_id,
+/* TODO: Remove static for the sake of compilation */
+struct its_device *its_create_device(struct its_node *its, u32 dev_id,
 					    int nvecs)
 {
 	struct its_device *dev;
@@ -1072,22 +1096,26 @@ static struct its_device *its_create_device(struct its_node *its, u32 dev_id,
 	int cpu;
 	int sz;
 
-	dev = kzalloc(sizeof(*dev), GFP_KERNEL);
+	dev = xzalloc(struct its_device);
 	/*
 	 * At least one bit of EventID is being used, hence a minimum
 	 * of two entries. No, the architecture doesn't let you
 	 * express an ITT with a single entry.
 	 */
-	nr_ites = max(2UL, roundup_pow_of_two(nvecs));
+	/*
+	 * TODO: roundup_pow_of_two() is replaced with a shift for now.
+	 * This code is not used later.
+	 */
+	nr_ites = max(2UL, (1UL << (nvecs)));
 	sz = nr_ites * its->ite_size;
 	sz = max(sz, ITS_ITT_ALIGN) + ITS_ITT_ALIGN - 1;
-	itt = kzalloc(sz, GFP_KERNEL);
+	itt = xzalloc_bytes(sz);
 	lpi_map = its_lpi_alloc_chunks(nvecs, &lpi_base, &nr_lpis);
 
 	if (!dev || !itt || !lpi_map) {
-		kfree(dev);
-		kfree(itt);
-		kfree(lpi_map);
+		xfree(dev);
+		xfree(itt);
+		xfree(lpi_map);
 		return NULL;
 	}
 
@@ -1100,12 +1128,12 @@ static struct its_device *its_create_device(struct its_node *its, u32 dev_id,
 	dev->device_id = dev_id;
 	INIT_LIST_HEAD(&dev->entry);
 
-	raw_spin_lock_irqsave(&its->lock, flags);
+	spin_lock_irqsave(&its->lock, flags);
 	list_add(&dev->entry, &its->its_device_list);
-	raw_spin_unlock_irqrestore(&its->lock, flags);
+	spin_unlock_irqrestore(&its->lock, flags);
 
 	/* Bind the device to the first possible CPU */
-	cpu = cpumask_first(cpu_online_mask);
+	cpu = cpumask_first(&cpu_online_map);
 	dev->collection = &its->collections[cpu];
 
 	/* Map device to its ITT */
@@ -1114,18 +1142,20 @@ static struct its_device *its_create_device(struct its_node *its, u32 dev_id,
 	return dev;
 }
 
-static void its_free_device(struct its_device *its_dev)
+/* TODO: Remove static for the sake of compilation */
+void its_free_device(struct its_device *its_dev)
 {
 	unsigned long flags;
 
-	raw_spin_lock_irqsave(&its_dev->its->lock, flags);
+	spin_lock_irqsave(&its_dev->its->lock, flags);
 	list_del(&its_dev->entry);
-	raw_spin_unlock_irqrestore(&its_dev->its->lock, flags);
-	kfree(its_dev->itt);
-	kfree(its_dev);
+	spin_unlock_irqrestore(&its_dev->its->lock, flags);
+	xfree(its_dev->itt);
+	xfree(its_dev);
 }
 
-static int its_alloc_device_irq(struct its_device *dev, irq_hw_number_t *hwirq)
+/* TODO: Remove static for the sake of compilation */
+int its_alloc_device_irq(struct its_device *dev, int *hwirq)
 {
 	int idx;
 
@@ -1139,6 +1169,8 @@ static int its_alloc_device_irq(struct its_device *dev, irq_hw_number_t *hwirq)
 	return 0;
 }
 
+/* PCI and MSI handling is no longer required here */
+#if 0
 struct its_pci_alias {
 	struct pci_dev	*pdev;
 	u32		dev_id;
@@ -1218,6 +1250,9 @@ static struct msi_domain_info its_pci_msi_domain_info = {
 	.chip	= &its_msi_irq_chip,
 };
 
+#endif
+/* IRQ domain management is not required */
+#if 0
 static int its_irq_gic_domain_alloc(struct irq_domain *domain,
 				    unsigned int virq,
 				    irq_hw_number_t hwirq)
@@ -1319,6 +1354,7 @@ static const struct irq_domain_ops its_domain_ops = {
 	.activate		= its_irq_domain_activate,
 	.deactivate		= its_irq_domain_deactivate,
 };
+#endif
 
 static int its_force_quiescent(void __iomem *base)
 {
@@ -1348,58 +1384,57 @@ static int its_force_quiescent(void __iomem *base)
 	}
 }
 
-static int its_probe(struct device_node *node, struct irq_domain *parent)
+static int its_probe(struct dt_device_node *node)
 {
-	struct resource res;
+	paddr_t its_addr, its_size;
 	struct its_node *its;
 	void __iomem *its_base;
 	u32 val;
 	u64 baser, tmp;
 	int err;
 
-	err = of_address_to_resource(node, 0, &res);
+	err = dt_device_get_address(node, 0, &its_addr, &its_size);
 	if (err) {
-		pr_warn("%s: no regs?\n", node->full_name);
+		its_warn("%s: no regs?\n", node->full_name);
 		return -ENXIO;
 	}
 
-	its_base = ioremap(res.start, resource_size(&res));
+	its_base = ioremap_nocache(its_addr, its_size);
 	if (!its_base) {
-		pr_warn("%s: unable to map registers\n", node->full_name);
+		its_warn("%s: unable to map registers\n", node->full_name);
 		return -ENOMEM;
 	}
 
-	val = readl_relaxed(its_base + GITS_PIDR2) & GIC_PIDR2_ARCH_MASK;
+	val = readl_relaxed(its_base + GITS_PIDR2) & GIC_PIDR2_ARCH_REV_MASK;
 	if (val != 0x30 && val != 0x40) {
-		pr_warn("%s: no ITS detected, giving up\n", node->full_name);
+		its_warn("%s: no ITS detected, giving up\n", node->full_name);
 		err = -ENODEV;
 		goto out_unmap;
 	}
 
 	err = its_force_quiescent(its_base);
 	if (err) {
-		pr_warn("%s: failed to quiesce, giving up\n",
+		its_warn("%s: failed to quiesce, giving up\n",
 			node->full_name);
 		goto out_unmap;
 	}
 
-	pr_info("ITS: %s\n", node->full_name);
+	its_info("ITS: %s\n", node->full_name);
 
-	its = kzalloc(sizeof(*its), GFP_KERNEL);
+	its = xzalloc(struct its_node);
 	if (!its) {
 		err = -ENOMEM;
 		goto out_unmap;
 	}
 
-	raw_spin_lock_init(&its->lock);
+	spin_lock_init(&its->lock);
 	INIT_LIST_HEAD(&its->entry);
 	INIT_LIST_HEAD(&its->its_device_list);
 	its->base = its_base;
-	its->phys_base = res.start;
-	its->msi_chip.of_node = node;
+	its->phys_base = its_addr;
 	its->ite_size = ((readl_relaxed(its_base + GITS_TYPER) >> 4) & 0xf) + 1;
 
-	its->cmd_base = kzalloc(ITS_CMD_QUEUE_SZ, GFP_KERNEL);
+	its->cmd_base = xzalloc_bytes(ITS_CMD_QUEUE_SZ);
 	if (!its->cmd_base) {
 		err = -ENOMEM;
 		goto out_free_its;
@@ -1414,7 +1449,7 @@ static int its_probe(struct device_node *node, struct irq_domain *parent)
 	if (err)
 		goto out_free_tables;
 
-	baser = (virt_to_phys(its->cmd_base)	|
+	baser = (__pa(its->cmd_base)		|
 		 GITS_CBASER_WaWb		|
 		 GITS_CBASER_InnerShareable	|
 		 (ITS_CMD_QUEUE_SZ / SZ_4K - 1)	|
@@ -1426,10 +1461,10 @@ static int its_probe(struct device_node *node, struct irq_domain *parent)
 	writel_relaxed(GITS_CTLR_ENABLE, its->base + GITS_CTLR);
 
 	if ((tmp ^ baser) & GITS_BASER_SHAREABILITY_MASK) {
-		pr_info("ITS: using cache flushing for cmd queue\n");
+		its_info("ITS: using cache flushing for cmd queue\n");
 		its->flags |= ITS_FLAGS_CMDQ_NEEDS_FLUSHING;
 	}
-
+#if 0
 	if (of_property_read_bool(its->msi_chip.of_node, "msi-controller")) {
 		its->domain = irq_domain_add_tree(NULL, &its_domain_ops, its);
 		if (!its->domain) {
@@ -1451,27 +1486,28 @@ static int its_probe(struct device_node *node, struct irq_domain *parent)
 		if (err)
 			goto out_free_domains;
 	}
-
+#endif
 	spin_lock(&its_lock);
 	list_add(&its->entry, &its_nodes);
 	spin_unlock(&its_lock);
 
 	return 0;
-
+#if 0
 out_free_domains:
 	if (its->msi_chip.domain)
 		irq_domain_remove(its->msi_chip.domain);
 	if (its->domain)
 		irq_domain_remove(its->domain);
+#endif
 out_free_tables:
 	its_free_tables(its);
 out_free_cmd:
-	kfree(its->cmd_base);
+	xfree(its->cmd_base);
 out_free_its:
-	kfree(its);
+	xfree(its);
 out_unmap:
 	iounmap(its_base);
-	pr_err("ITS: failed probing %s (%d)\n", node->full_name, err);
+	its_err("ITS: failed probing %s (%d)\n", node->full_name, err);
 	return err;
 }
 
@@ -1484,7 +1520,7 @@ int its_cpu_init(void)
 {
 	if (!list_empty(&its_nodes)) {
 		if (!gic_rdists_supports_plpis()) {
-			pr_info("CPU%d: LPIs not supported\n", smp_processor_id());
+			its_info("CPU%d: LPIs not supported\n", smp_processor_id());
 			return -ENXIO;
 		}
 		its_cpu_init_lpis();
@@ -1494,23 +1530,28 @@ int its_cpu_init(void)
 	return 0;
 }
 
-static struct of_device_id its_device_id[] = {
-	{	.compatible	= "arm,gic-v3-its",	},
-	{},
-};
-
-int its_init(struct device_node *node, struct rdists *rdists,
-	     struct irq_domain *parent_domain)
+int its_init(struct dt_device_node *node, struct rdist_prop *rdists)
 {
-	struct device_node *np;
+	struct dt_device_node *np = NULL;
+
+	static const struct dt_device_match its_device_ids[] __initconst =
+	{
+		DT_MATCH_COMPATIBLE("arm,gic-v3-its"),
+		{ /* sentinel */ },
+	};
+
+	while ((np = dt_find_matching_node(np, its_device_ids)))
+	{
+		if (!dt_find_property(np, "msi-controller", NULL))
+			continue;
+	}
 
-	for (np = of_find_matching_node(node, its_device_id); np;
-	     np = of_find_matching_node(np, its_device_id)) {
-		its_probe(np, parent_domain);
+	if (np) {
+		its_probe(np);
 	}
 
 	if (list_empty(&its_nodes)) {
-		pr_warn("ITS: No ITS available, not enabling LPIs\n");
+		its_warn("ITS: No ITS available, not enabling LPIs\n");
 		return -ENXIO;
 	}
 
diff --git a/xen/include/asm-arm/gic_v3_defs.h b/xen/include/asm-arm/gic_v3_defs.h
index 4e64b56..f8bac52 100644
--- a/xen/include/asm-arm/gic_v3_defs.h
+++ b/xen/include/asm-arm/gic_v3_defs.h
@@ -59,11 +59,12 @@
 #define GICR_WAKER_ProcessorSleep    (1U << 1)
 #define GICR_WAKER_ChildrenAsleep    (1U << 2)
 
-#define GICD_PIDR2_ARCH_REV_MASK     (0xf0)
+#define GIC_PIDR2_ARCH_REV_MASK      (0xf0)
+#define GICD_PIDR2_ARCH_REV_MASK     GIC_PIDR2_ARCH_REV_MASK
 #define GICD_PIDR2_ARCH_REV_SHIFT    (0x4)
 #define GICD_PIDR2_ARCH_GICV3        (0x3)
 
-#define GICR_PIDR2_ARCH_REV_MASK     GICD_PIDR2_ARCH_REV_MASK
+#define GICR_PIDR2_ARCH_REV_MASK     GIC_PIDR2_ARCH_REV_MASK
 #define GICR_PIDR2_ARCH_REV_SHIFT    GICD_PIDR2_ARCH_REV_SHIFT
 #define GICR_PIDR2_ARCH_GICV3        GICD_PIDR2_ARCH_GICV3
 
@@ -113,6 +114,23 @@
 #define GICR_ICFGR1                  (0x0C04)
 #define GICR_NSACR                   (0x0E00)
 
+#define GICR_CTLR_ENABLE_LPIS        (1UL << 0)
+#define GICR_TYPER_CPU_NUMBER(r)     (((r) >> 8) & 0xffff)
+
+#define GICR_PROPBASER_NonShareable      (0U << 10)
+#define GICR_PROPBASER_InnerShareable    (1U << 10)
+#define GICR_PROPBASER_OuterShareable    (2U << 10)
+#define GICR_PROPBASER_SHAREABILITY_MASK (3UL << 10)
+#define GICR_PROPBASER_nCnB              (0U << 7)
+#define GICR_PROPBASER_nC                (1U << 7)
+#define GICR_PROPBASER_RaWt              (2U << 7)
+#define GICR_PROPBASER_RaWb              (3U << 7)
+#define GICR_PROPBASER_WaWt              (4U << 7)
+#define GICR_PROPBASER_WaWb              (5U << 7)
+#define GICR_PROPBASER_RaWaWt            (6U << 7)
+#define GICR_PROPBASER_RaWaWb            (7U << 7)
+#define GICR_PROPBASER_IDBITS_MASK       (0x1f)
+
 #define GICR_TYPER_PLPIS             (1U << 0)
 #define GICR_TYPER_VLPIS             (1U << 1)
 #define GICR_TYPER_LAST              (1U << 4)
@@ -153,6 +171,100 @@
 #define ICH_SGI_IRQ_MASK             0xf
 #define ICH_SGI_TARGETLIST_MASK      0xffff
 
+#define LPI_PROP_GROUP1                 (1 << 1)
+#define LPI_PROP_ENABLED                (1 << 0)
+
+/*
+ * ITS registers, offsets from ITS_base
+ */
+#define GITS_CTLR                       0x0000
+#define GITS_IIDR                       0x0004
+#define GITS_TYPER                      0x0008
+#define GITS_CBASER                     0x0080
+#define GITS_CWRITER                    0x0088
+#define GITS_CREADR                     0x0090
+#define GITS_BASER                      0x0100
+#define GITS_BASERN                     0x013c
+#define GITS_PIDR0                      GICR_PIDR0
+#define GITS_PIDR1                      GICR_PIDR1
+#define GITS_PIDR2                      GICR_PIDR2
+#define GITS_PIDR3                      GICR_PIDR3
+#define GITS_PIDR4                      GICR_PIDR4
+#define GITS_PIDR5                      GICR_PIDR5
+#define GITS_PIDR7                      GICR_PIDR7
+
+#define GITS_TRANSLATER                 0x10040
+#define GITS_CTLR_QUIESCENT		(1U << 31)
+#define GITS_CTLR_ENABLE		(1U << 0)
+
+#define GITS_TYPER_DEVBITS_SHIFT	13
+#define GITS_TYPER_DEVBITS(r)		((((r) >> GITS_TYPER_DEVBITS_SHIFT) & 0x1f) + 1)
+#define GITS_TYPER_PTA                  (1UL << 19)
+
+#define GITS_CBASER_VALID               (1UL << 63)
+#define GITS_CBASER_nCnB                (0UL << 59)
+#define GITS_CBASER_nC                  (1UL << 59)
+#define GITS_CBASER_RaWt                (2UL << 59)
+#define GITS_CBASER_RaWb                (3UL << 59)
+#define GITS_CBASER_WaWt                (4UL << 59)
+#define GITS_CBASER_WaWb                (5UL << 59)
+#define GITS_CBASER_RaWaWt              (6UL << 59)
+#define GITS_CBASER_RaWaWb              (7UL << 59)
+#define GITS_CBASER_NonShareable        (0UL << 10)
+#define GITS_CBASER_InnerShareable      (1UL << 10)
+#define GITS_CBASER_OuterShareable      (2UL << 10)
+#define GITS_CBASER_SHAREABILITY_MASK   (3UL << 10)
+
+#define GITS_BASER_NR_REGS              8
+
+#define GITS_BASER_VALID                (1UL << 63)
+#define GITS_BASER_nCnB                 (0UL << 59)
+#define GITS_BASER_nC                   (1UL << 59)
+#define GITS_BASER_RaWt                 (2UL << 59)
+#define GITS_BASER_RaWb                 (3UL << 59)
+#define GITS_BASER_WaWt                 (4UL << 59)
+#define GITS_BASER_WaWb                 (5UL << 59)
+#define GITS_BASER_RaWaWt               (6UL << 59)
+#define GITS_BASER_RaWaWb               (7UL << 59)
+#define GITS_BASER_TYPE_SHIFT           (56)
+#define GITS_BASER_TYPE(r)              (((r) >> GITS_BASER_TYPE_SHIFT) & 7)
+#define GITS_BASER_ENTRY_SIZE_SHIFT     (48)
+#define GITS_BASER_ENTRY_SIZE(r)        ((((r) >> GITS_BASER_ENTRY_SIZE_SHIFT) & 0xff) + 1)
+#define GITS_BASER_NonShareable         (0UL << 10)
+#define GITS_BASER_InnerShareable       (1UL << 10)
+#define GITS_BASER_OuterShareable       (2UL << 10)
+#define GITS_BASER_SHAREABILITY_SHIFT   (10)
+#define GITS_BASER_SHAREABILITY_MASK    (3UL << GITS_BASER_SHAREABILITY_SHIFT)
+#define GITS_BASER_PAGE_SIZE_SHIFT      (8)
+#define GITS_BASER_PAGE_SIZE_4K         (0UL << GITS_BASER_PAGE_SIZE_SHIFT)
+#define GITS_BASER_PAGE_SIZE_16K        (1UL << GITS_BASER_PAGE_SIZE_SHIFT)
+#define GITS_BASER_PAGE_SIZE_64K        (2UL << GITS_BASER_PAGE_SIZE_SHIFT)
+#define GITS_BASER_PAGE_SIZE_MASK       (3UL << GITS_BASER_PAGE_SIZE_SHIFT)
+#define GITS_BASER_TYPE_NONE            0
+#define GITS_BASER_TYPE_DEVICE          1
+#define GITS_BASER_TYPE_VCPU            2
+#define GITS_BASER_TYPE_CPU             3
+#define GITS_BASER_TYPE_COLLECTION      4
+#define GITS_BASER_TYPE_RESERVED5       5
+#define GITS_BASER_TYPE_RESERVED6       6
+#define GITS_BASER_TYPE_RESERVED7       7
+
+/*
+ * ITS commands
+ */
+#define GITS_CMD_MAPD                   0x08
+#define GITS_CMD_MAPC                   0x09
+#define GITS_CMD_MAPVI                  0x0a
+#define GITS_CMD_MAPI                   0x0b
+#define GITS_CMD_MOVI                   0x01
+#define GITS_CMD_DISCARD                0x0f
+#define GITS_CMD_INV                    0x0c
+#define GITS_CMD_MOVALL                 0x0e
+#define GITS_CMD_INVALL                 0x0d
+#define GITS_CMD_INT                    0x03
+#define GITS_CMD_CLEAR                  0x04
+#define GITS_CMD_SYNC                   0x05
+
 struct rdist {
     void __iomem *rbase;
     void * pend_page;
-- 
1.7.9.5
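
Note on the ilog2() and roundup_pow_of_two() replacements in the patch
above: the port substitutes fls() - 1 for ilog2() and a plain shift for
roundup_pow_of_two(). A minimal sketch of how both Linux helpers could be
expressed on top of an fls()-style primitive is given below. The sketch_*
names are hypothetical and are not part of Xen or of this series;
sketch_fls() is a portable stand-in that assumes the usual fls() semantics
(1-based index of the highest set bit, 0 for an input of 0).

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for fls(): returns the 1-based position of the
 * highest set bit, or 0 when x is 0.
 */
static inline unsigned int sketch_fls(uint32_t x)
{
    unsigned int pos = 0;

    while (x) {
        pos++;
        x >>= 1;
    }
    return pos;
}

/* ilog2() equivalent: floor(log2(x)), only defined for x > 0. */
static inline unsigned int sketch_ilog2(uint32_t x)
{
    assert(x != 0);
    return sketch_fls(x) - 1;
}

/*
 * roundup_pow_of_two() equivalent: smallest power of two >= n.
 * Assumes 1 <= n <= 2^31 so that the shift stays in range.
 */
static inline unsigned long sketch_roundup_pow_of_two(uint32_t n)
{
    assert(n != 0 && n <= (UINT32_C(1) << 31));
    return 1UL << sketch_fls(n - 1);
}
```

With this formulation a request for 3 vectors rounds up to 4 entries,
whereas a plain 1UL << nvecs would yield 8.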


* [RFC PATCH v2 07/22] xen/arm: its: Move ITS command encode helper functions
  2015-03-19 14:37 [RFC PATCH v2 00/22] xen/arm: Add ITS support vijay.kilari
                   ` (5 preceding siblings ...)
  2015-03-19 14:37 ` [RFC PATCH v2 06/22] xen/arm: its: Port ITS driver to xen vijay.kilari
@ 2015-03-19 14:37 ` vijay.kilari
  2015-03-19 14:37 ` [RFC PATCH v2 08/22] xen/arm: its: Remove unused code in ITS driver vijay.kilari
                   ` (16 subsequent siblings)
  23 siblings, 0 replies; 109+ messages in thread
From: vijay.kilari @ 2015-03-19 14:37 UTC (permalink / raw)
  To: Ian.Campbell, julien.grall, stefano.stabellini,
	stefano.stabellini, tim, xen-devel
  Cc: Prasun.Kapoor, Vijaya Kumar K, manish.jaggi, vijay.kilari

From: Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com>

The ITS command encode functions are moved to the
header file gic-its.h and made inline functions.
This will be useful later in the virtual ITS driver.

Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com>
---
 xen/arch/arm/gic-v3-its.c     |   71 +---------------------------
 xen/include/asm-arm/gic-its.h |  103 +++++++++++++++++++++++++++++++++++++++++
 2 files changed, 104 insertions(+), 70 deletions(-)
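
As an illustration of how a second user, such as a virtual ITS driver,
might compose the encode helpers once they live in a shared header, the
following standalone sketch builds a MAPD command. It mirrors the bit
layout used by the helpers in this patch but is not the patch's code: the
sketch_* names, the SKETCH_CMD_MAPD constant (set to the value of
GITS_CMD_MAPD) and sketch_build_mapd() are hypothetical, and uint64_t
stands in for u64.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_CMD_MAPD 0x08 /* same value as GITS_CMD_MAPD */

/* One ITS command: four 64-bit words, as in struct its_cmd_block. */
struct sketch_cmd_block {
    uint64_t raw_cmd[4];
};

/* Command number lives in bits [7:0] of word 0. */
static inline void sketch_encode_cmd(struct sketch_cmd_block *cmd, uint8_t nr)
{
    cmd->raw_cmd[0] &= ~0xffULL;
    cmd->raw_cmd[0] |= nr;
}

/* Device ID is placed in the upper half of word 0. */
static inline void sketch_encode_devid(struct sketch_cmd_block *cmd,
                                       uint32_t devid)
{
    cmd->raw_cmd[0] &= ~(0xffffULL << 32);
    cmd->raw_cmd[0] |= (uint64_t)devid << 32;
}

/* Size field occupies bits [4:0] of word 1. */
static inline void sketch_encode_size(struct sketch_cmd_block *cmd,
                                      uint8_t size)
{
    cmd->raw_cmd[1] &= ~0x1fULL;
    cmd->raw_cmd[1] |= size & 0x1f;
}

/* ITT address: bits [47:8] of word 2, so the low 8 bits are dropped. */
static inline void sketch_encode_itt(struct sketch_cmd_block *cmd,
                                     uint64_t itt_addr)
{
    cmd->raw_cmd[2] &= ~0xffffffffffffULL;
    cmd->raw_cmd[2] |= itt_addr & 0xffffffffff00ULL;
}

/* Valid flag is bit 63 of word 2. */
static inline void sketch_encode_valid(struct sketch_cmd_block *cmd, int valid)
{
    cmd->raw_cmd[2] &= ~(1ULL << 63);
    cmd->raw_cmd[2] |= (uint64_t)!!valid << 63;
}

/* Hypothetical builder: start from a zeroed block and fill each field. */
static inline struct sketch_cmd_block sketch_build_mapd(uint32_t devid,
                                                        uint8_t size,
                                                        uint64_t itt_addr,
                                                        int valid)
{
    struct sketch_cmd_block cmd;

    assert(size <= 0x1f);
    memset(&cmd, 0, sizeof(cmd));
    sketch_encode_cmd(&cmd, SKETCH_CMD_MAPD);
    sketch_encode_devid(&cmd, devid);
    sketch_encode_size(&cmd, size);
    sketch_encode_itt(&cmd, itt_addr);
    sketch_encode_valid(&cmd, valid);
    return cmd;
}
```

Because every helper only masks and ORs its own field, the same functions
can serve both a builder like the one above and a caller that patches a
single field of an existing command.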

diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
index ce7ced6..d159630 100644
--- a/xen/arch/arm/gic-v3-its.c
+++ b/xen/arch/arm/gic-v3-its.c
@@ -41,6 +41,7 @@
 #include <asm/device.h>
 #include <asm/gic.h>
 #include <asm/gic_v3_defs.h>
+#include <asm/gic-its.h>
 
 #define its_print(lvl, fmt, ...)                                      \
 	printk(lvl "GIC-ITS:" fmt, ## __VA_ARGS__)
@@ -163,82 +164,12 @@ struct its_cmd_desc {
 	};
 };
 
-/*
- * The ITS command block, which is what the ITS actually parses.
- */
-struct its_cmd_block {
-	u64	raw_cmd[4];
-};
-
 #define ITS_CMD_QUEUE_SZ		SZ_64K
 #define ITS_CMD_QUEUE_NR_ENTRIES	(ITS_CMD_QUEUE_SZ / sizeof(struct its_cmd_block))
 
 typedef struct its_collection *(*its_cmd_builder_t)(struct its_cmd_block *,
 						    struct its_cmd_desc *);
 
-static void its_encode_cmd(struct its_cmd_block *cmd, u8 cmd_nr)
-{
-	cmd->raw_cmd[0] &= ~0xffUL;
-	cmd->raw_cmd[0] |= cmd_nr;
-}
-
-static void its_encode_devid(struct its_cmd_block *cmd, u32 devid)
-{
-	cmd->raw_cmd[0] &= ~(0xffffUL << 32);
-	cmd->raw_cmd[0] |= ((u64)devid) << 32;
-}
-
-static void its_encode_event_id(struct its_cmd_block *cmd, u32 id)
-{
-	cmd->raw_cmd[1] &= ~0xffffffffUL;
-	cmd->raw_cmd[1] |= id;
-}
-
-static void its_encode_phys_id(struct its_cmd_block *cmd, u32 phys_id)
-{
-	cmd->raw_cmd[1] &= 0xffffffffUL;
-	cmd->raw_cmd[1] |= ((u64)phys_id) << 32;
-}
-
-static void its_encode_size(struct its_cmd_block *cmd, u8 size)
-{
-	cmd->raw_cmd[1] &= ~0x1fUL;
-	cmd->raw_cmd[1] |= size & 0x1f;
-}
-
-static void its_encode_itt(struct its_cmd_block *cmd, u64 itt_addr)
-{
-	cmd->raw_cmd[2] &= ~0xffffffffffffUL;
-	cmd->raw_cmd[2] |= itt_addr & 0xffffffffff00UL;
-}
-
-static void its_encode_valid(struct its_cmd_block *cmd, int valid)
-{
-	cmd->raw_cmd[2] &= ~(1UL << 63);
-	cmd->raw_cmd[2] |= ((u64)!!valid) << 63;
-}
-
-static void its_encode_target(struct its_cmd_block *cmd, u64 target_addr)
-{
-	cmd->raw_cmd[2] &= ~(0xffffffffUL << 16);
-	cmd->raw_cmd[2] |= (target_addr & (0xffffffffUL << 16));
-}
-
-static void its_encode_collection(struct its_cmd_block *cmd, u16 col)
-{
-	cmd->raw_cmd[2] &= ~0xffffUL;
-	cmd->raw_cmd[2] |= col;
-}
-
-static inline void its_fixup_cmd(struct its_cmd_block *cmd)
-{
-	/* Let's fixup BE commands */
-	cmd->raw_cmd[0] = cpu_to_le64(cmd->raw_cmd[0]);
-	cmd->raw_cmd[1] = cpu_to_le64(cmd->raw_cmd[1]);
-	cmd->raw_cmd[2] = cpu_to_le64(cmd->raw_cmd[2]);
-	cmd->raw_cmd[3] = cpu_to_le64(cmd->raw_cmd[3]);
-}
-
 static struct its_collection *its_build_mapd_cmd(struct its_cmd_block *cmd,
 						 struct its_cmd_desc *desc)
 {
diff --git a/xen/include/asm-arm/gic-its.h b/xen/include/asm-arm/gic-its.h
new file mode 100644
index 0000000..74c4398
--- /dev/null
+++ b/xen/include/asm-arm/gic-its.h
@@ -0,0 +1,103 @@
+/*
+ * Copyright (C) 2013, 2014 ARM Limited, All Rights Reserved.
+ * Author: Marc Zyngier <marc.zyngier@arm.com>
+ *
+ * Xen changes:
+ * Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com>
+ * Copyright (C) 2014, 2015 Cavium Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef __ASM_ARM_GIC_ITS_H__
+#define __ASM_ARM_GIC_ITS_H__
+
+/*
+ * The ITS command block, which is what the ITS actually parses.
+ */
+struct its_cmd_block {
+    u64     raw_cmd[4];
+};
+
+static inline void its_encode_cmd(struct its_cmd_block *cmd, u8 cmd_nr)
+{
+    cmd->raw_cmd[0] &= ~0xffUL;
+    cmd->raw_cmd[0] |= cmd_nr;
+}
+
+static inline void its_encode_devid(struct its_cmd_block *cmd, u32 devid)
+{
+    cmd->raw_cmd[0] &= ~(0xffffUL << 32);
+    cmd->raw_cmd[0] |= ((u64)devid) << 32;
+}
+
+static inline void its_encode_event_id(struct its_cmd_block *cmd, u32 id)
+{
+    cmd->raw_cmd[1] &= ~0xffffffffUL;
+    cmd->raw_cmd[1] |= id;
+}
+
+static inline void its_encode_phys_id(struct its_cmd_block *cmd, u32 phys_id)
+{
+    cmd->raw_cmd[1] &= 0xffffffffUL;
+    cmd->raw_cmd[1] |= ((u64)phys_id) << 32;
+}
+
+static inline void its_encode_size(struct its_cmd_block *cmd, u8 size)
+{
+    cmd->raw_cmd[1] &= ~0x1fUL;
+    cmd->raw_cmd[1] |= size & 0x1f;
+}
+
+static inline void its_encode_itt(struct its_cmd_block *cmd, u64 itt_addr)
+{
+    cmd->raw_cmd[2] &= ~0xffffffffffffUL;
+    cmd->raw_cmd[2] |= itt_addr & 0xffffffffff00UL;
+}
+
+static inline void its_encode_valid(struct its_cmd_block *cmd, int valid)
+{
+    cmd->raw_cmd[2] &= ~(1UL << 63);
+    cmd->raw_cmd[2] |= ((u64)!!valid) << 63;
+}
+
+static inline void its_encode_target(struct its_cmd_block *cmd, u64 target_addr)
+{
+    cmd->raw_cmd[2] &= ~(0xffffffffUL << 16);
+    cmd->raw_cmd[2] |= (target_addr & (0xffffffffUL << 16));
+}
+
+static inline void its_encode_collection(struct its_cmd_block *cmd, u16 col)
+{
+    cmd->raw_cmd[2] &= ~0xffffUL;
+    cmd->raw_cmd[2] |= col;
+}
+
+static inline void its_fixup_cmd(struct its_cmd_block *cmd)
+{
+    /* Let's fixup BE commands */
+    cmd->raw_cmd[0] = cpu_to_le64(cmd->raw_cmd[0]);
+    cmd->raw_cmd[1] = cpu_to_le64(cmd->raw_cmd[1]);
+    cmd->raw_cmd[2] = cpu_to_le64(cmd->raw_cmd[2]);
+    cmd->raw_cmd[3] = cpu_to_le64(cmd->raw_cmd[3]);
+}
+#endif /* __ASM_ARM_GIC_ITS_H__ */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 109+ messages in thread
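
Note (illustrative sketch, not part of the posted patch): the encode
helpers moved into gic-its.h above each clear one bit field of the
four-word command block and then OR in the new value. The standalone
sketch below shows how they might compose into a MAPD command. The
fixed-width types replace Xen's u8/u32/u64, the GITS_CMD_MAPD value is
an assumption taken from the Linux GICv3 header, and build_mapd() is a
hypothetical helper that does not appear in the series.

```c
#include <stdint.h>
#include <string.h>

/* Stand-in for the command block in gic-its.h: four 64-bit words. */
struct its_cmd_block {
    uint64_t raw_cmd[4];
};

/* Assumed opcode value (from the Linux GICv3 header, not from this patch). */
#define GITS_CMD_MAPD 0x08

static inline void its_encode_cmd(struct its_cmd_block *cmd, uint8_t cmd_nr)
{
    cmd->raw_cmd[0] &= ~0xffULL;              /* command number: bits [7:0] */
    cmd->raw_cmd[0] |= cmd_nr;
}

static inline void its_encode_devid(struct its_cmd_block *cmd, uint32_t devid)
{
    cmd->raw_cmd[0] &= ~(0xffffULL << 32);    /* same mask as the patch */
    cmd->raw_cmd[0] |= ((uint64_t)devid) << 32;
}

static inline void its_encode_size(struct its_cmd_block *cmd, uint8_t size)
{
    cmd->raw_cmd[1] &= ~0x1fULL;              /* size: bits [4:0] */
    cmd->raw_cmd[1] |= size & 0x1f;
}

static inline void its_encode_itt(struct its_cmd_block *cmd, uint64_t itt_addr)
{
    cmd->raw_cmd[2] &= ~0xffffffffffffULL;
    cmd->raw_cmd[2] |= itt_addr & 0xffffffffff00ULL;  /* 256-byte aligned */
}

static inline void its_encode_valid(struct its_cmd_block *cmd, int valid)
{
    cmd->raw_cmd[2] &= ~(1ULL << 63);
    cmd->raw_cmd[2] |= ((uint64_t)!!valid) << 63;
}

/* Hypothetical helper: fill a zeroed block with a MAPD command. */
static void build_mapd(struct its_cmd_block *cmd, uint32_t devid,
                       uint8_t size, uint64_t itt_addr, int valid)
{
    memset(cmd, 0, sizeof(*cmd));
    its_encode_cmd(cmd, GITS_CMD_MAPD);
    its_encode_devid(cmd, devid);
    its_encode_size(cmd, size);
    its_encode_itt(cmd, itt_addr);
    its_encode_valid(cmd, valid);
}
```

Because every helper masks before it ORs, the calls can be made in any
order and repeated on a reused block without leaving stale bits behind.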

* [RFC PATCH v2 08/22] xen/arm: its: Remove unused code in ITS driver
  2015-03-19 14:37 [RFC PATCH v2 00/22] xen/arm: Add ITS support vijay.kilari
                   ` (6 preceding siblings ...)
  2015-03-19 14:37 ` [RFC PATCH v2 07/22] xen/arm: its: Move ITS command encode helper functions vijay.kilari
@ 2015-03-19 14:37 ` vijay.kilari
  2015-03-19 14:37 ` [RFC PATCH v2 09/22] xen/arm: its: Add helper functions to decode ITS Command vijay.kilari
                   ` (15 subsequent siblings)
  23 siblings, 0 replies; 109+ messages in thread
From: vijay.kilari @ 2015-03-19 14:37 UTC (permalink / raw)
  To: Ian.Campbell, julien.grall, stefano.stabellini,
	stefano.stabellini, tim, xen-devel
  Cc: Prasun.Kapoor, Vijaya Kumar K, manish.jaggi, vijay.kilari

From: Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com>

The ITS driver does not need to create or free
devices. This will be handled by the virtual
ITS driver.

The functionality of the ITS driver is limited to
initialization, physical LPI allocation,
sending ITS commands received from the virtual ITS driver,
and ITS interrupt handling.

The following functionality is removed:
 - Unused command structure definitions
 - Handling of unused ITS commands such as MAPD, MAPVI, INT and
   DISCARD
 - irq_domain_ops related code
 - msi_domain_info related code

Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com>
---
v2:
 - Retained its_device structure
---
 xen/arch/arm/gic-v3-its.c     |  446 ++---------------------------------------
 xen/include/asm-arm/gic-its.h |    8 -
 2 files changed, 13 insertions(+), 441 deletions(-)

diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
index d159630..0c55959 100644
--- a/xen/arch/arm/gic-v3-its.c
+++ b/xen/arch/arm/gic-v3-its.c
@@ -84,7 +84,6 @@ struct its_node {
 	struct its_cmd_block	*cmd_write;
 	void			*tables[GITS_BASER_NR_REGS];
 	struct its_collection	*collections;
-	struct list_head	its_device_list;
 	u64			flags;
 	u32			ite_size;
 };
@@ -122,43 +121,23 @@ static struct rdist_prop  *gic_rdists;
 struct its_cmd_desc {
 	union {
 		struct {
-			struct its_device *dev;
+			struct its_collection *col;
 			u32 event_id;
+			u32 dev_id;
 		} its_inv_cmd;
 
 		struct {
-			struct its_device *dev;
-			u32 event_id;
-		} its_int_cmd;
-
-		struct {
-			struct its_device *dev;
-			int valid;
-		} its_mapd_cmd;
-
-		struct {
 			struct its_collection *col;
 			int valid;
 		} its_mapc_cmd;
 
 		struct {
-			struct its_device *dev;
-			u32 phys_id;
-			u32 event_id;
-		} its_mapvi_cmd;
-
-		struct {
-			struct its_device *dev;
 			struct its_collection *col;
 			u32 id;
+			u32 dev_id;
 		} its_movi_cmd;
 
 		struct {
-			struct its_device *dev;
-			u32 event_id;
-		} its_discard_cmd;
-
-		struct {
 			struct its_collection *col;
 		} its_invall_cmd;
 	};
@@ -170,26 +149,6 @@ struct its_cmd_desc {
 typedef struct its_collection *(*its_cmd_builder_t)(struct its_cmd_block *,
 						    struct its_cmd_desc *);
 
-static struct its_collection *its_build_mapd_cmd(struct its_cmd_block *cmd,
-						 struct its_cmd_desc *desc)
-{
-	unsigned long itt_addr;
-	u8 size = max(fls(desc->its_mapd_cmd.dev->nr_ites) - 1, 1);
-
-	itt_addr = __pa(desc->its_mapd_cmd.dev->itt);
-        itt_addr = ROUNDUP(itt_addr, ITS_ITT_ALIGN);
-
-	its_encode_cmd(cmd, GITS_CMD_MAPD);
-	its_encode_devid(cmd, desc->its_mapd_cmd.dev->device_id);
-	its_encode_size(cmd, size - 1);
-	its_encode_itt(cmd, itt_addr);
-	its_encode_valid(cmd, desc->its_mapd_cmd.valid);
-
-	its_fixup_cmd(cmd);
-
-	return desc->its_mapd_cmd.dev->collection;
-}
-
 static struct its_collection *its_build_mapc_cmd(struct its_cmd_block *cmd,
 						 struct its_cmd_desc *desc)
 {
@@ -198,60 +157,28 @@ static struct its_collection *its_build_mapc_cmd(struct its_cmd_block *cmd,
 	its_encode_target(cmd, desc->its_mapc_cmd.col->target_address);
 	its_encode_valid(cmd, desc->its_mapc_cmd.valid);
 
-	its_fixup_cmd(cmd);
-
 	return desc->its_mapc_cmd.col;
 }
 
-static struct its_collection *its_build_mapvi_cmd(struct its_cmd_block *cmd,
-						  struct its_cmd_desc *desc)
-{
-	its_encode_cmd(cmd, GITS_CMD_MAPVI);
-	its_encode_devid(cmd, desc->its_mapvi_cmd.dev->device_id);
-	its_encode_event_id(cmd, desc->its_mapvi_cmd.event_id);
-	its_encode_phys_id(cmd, desc->its_mapvi_cmd.phys_id);
-	its_encode_collection(cmd, desc->its_mapvi_cmd.dev->collection->col_id);
-
-	its_fixup_cmd(cmd);
-
-	return desc->its_mapvi_cmd.dev->collection;
-}
-
 static struct its_collection *its_build_movi_cmd(struct its_cmd_block *cmd,
 						 struct its_cmd_desc *desc)
 {
 	its_encode_cmd(cmd, GITS_CMD_MOVI);
-	its_encode_devid(cmd, desc->its_movi_cmd.dev->device_id);
+	its_encode_devid(cmd, desc->its_movi_cmd.dev_id);
 	its_encode_event_id(cmd, desc->its_movi_cmd.id);
 	its_encode_collection(cmd, desc->its_movi_cmd.col->col_id);
 
-	its_fixup_cmd(cmd);
-
-	return desc->its_movi_cmd.dev->collection;
-}
-
-static struct its_collection *its_build_discard_cmd(struct its_cmd_block *cmd,
-						    struct its_cmd_desc *desc)
-{
-	its_encode_cmd(cmd, GITS_CMD_DISCARD);
-	its_encode_devid(cmd, desc->its_discard_cmd.dev->device_id);
-	its_encode_event_id(cmd, desc->its_discard_cmd.event_id);
-
-	its_fixup_cmd(cmd);
-
-	return desc->its_discard_cmd.dev->collection;
+	return desc->its_movi_cmd.col;
 }
 
 static struct its_collection *its_build_inv_cmd(struct its_cmd_block *cmd,
 						struct its_cmd_desc *desc)
 {
 	its_encode_cmd(cmd, GITS_CMD_INV);
-	its_encode_devid(cmd, desc->its_inv_cmd.dev->device_id);
+	its_encode_devid(cmd, desc->its_inv_cmd.dev_id);
 	its_encode_event_id(cmd, desc->its_inv_cmd.event_id);
 
-	its_fixup_cmd(cmd);
-
-	return desc->its_inv_cmd.dev->collection;
+	return desc->its_inv_cmd.col;
 }
 
 static struct its_collection *its_build_invall_cmd(struct its_cmd_block *cmd,
@@ -260,8 +187,6 @@ static struct its_collection *its_build_invall_cmd(struct its_cmd_block *cmd,
 	its_encode_cmd(cmd, GITS_CMD_INVALL);
 	its_encode_collection(cmd, desc->its_mapc_cmd.col->col_id);
 
-	its_fixup_cmd(cmd);
-
 	return NULL;
 }
 
@@ -383,7 +308,6 @@ static void its_send_single_command(struct its_node *its,
 		}
 		its_encode_cmd(sync_cmd, GITS_CMD_SYNC);
 		its_encode_target(sync_cmd, sync_col->target_address);
-		its_fixup_cmd(sync_cmd);
 		its_flush_cmd(its, sync_cmd);
 	}
 
@@ -394,27 +318,16 @@ post:
 	its_wait_for_range_completion(its, cmd, next_cmd);
 }
 
-/* TODO: Remove static for the sake of compilation */
 void its_send_inv(struct its_device *dev, u32 event_id)
 {
 	struct its_cmd_desc desc;
 
-	desc.its_inv_cmd.dev = dev;
+	desc.its_inv_cmd.dev_id = dev->device_id;
 	desc.its_inv_cmd.event_id = event_id;
 
 	its_send_single_command(dev->its, its_build_inv_cmd, &desc);
 }
 
-static void its_send_mapd(struct its_device *dev, int valid)
-{
-	struct its_cmd_desc desc;
-
-	desc.its_mapd_cmd.dev = dev;
-	desc.its_mapd_cmd.valid = !!valid;
-
-	its_send_single_command(dev->its, its_build_mapd_cmd, &desc);
-}
-
 static void its_send_mapc(struct its_node *its, struct its_collection *col,
 			  int valid)
 {
@@ -427,39 +340,16 @@ static void its_send_mapc(struct its_node *its, struct its_collection *col,
 }
 
 /* TODO: Remove static for the sake of compilation */
-void its_send_mapvi(struct its_device *dev, u32 irq_id, u32 id)
+void its_send_movi(struct its_node *its, struct its_collection *col,
+	           u32 dev_id, u32 id)
 {
 	struct its_cmd_desc desc;
 
-	desc.its_mapvi_cmd.dev = dev;
-	desc.its_mapvi_cmd.phys_id = irq_id;
-	desc.its_mapvi_cmd.event_id = id;
-
-	its_send_single_command(dev->its, its_build_mapvi_cmd, &desc);
-}
-
-/* TODO: Remove static for the sake of compilation */
-void its_send_movi(struct its_device *dev,
-			  struct its_collection *col, u32 id)
-{
-	struct its_cmd_desc desc;
-
-	desc.its_movi_cmd.dev = dev;
+	desc.its_movi_cmd.dev_id = dev_id;
 	desc.its_movi_cmd.col = col;
 	desc.its_movi_cmd.id = id;
 
-	its_send_single_command(dev->its, its_build_movi_cmd, &desc);
-}
-
-/* TODO: Remove static for the sake of compilation */
-void its_send_discard(struct its_device *dev, u32 id)
-{
-	struct its_cmd_desc desc;
-
-	desc.its_discard_cmd.dev = dev;
-	desc.its_discard_cmd.event_id = id;
-
-	its_send_single_command(dev->its, its_build_discard_cmd, &desc);
+	its_send_single_command(its, its_build_movi_cmd, &desc);
 }
 
 static void its_send_invall(struct its_node *its, struct its_collection *col)
@@ -628,7 +518,7 @@ static int its_lpi_init(u32 id_bits)
 	return 0;
 }
 
-static unsigned long *its_lpi_alloc_chunks(int nirqs, int *base, int *nr_ids)
+unsigned long *its_lpi_alloc_chunks(int nirqs, int *base, int *nr_ids)
 {
 	unsigned long *bitmap = NULL;
 	int chunk_id;
@@ -994,98 +884,6 @@ static void its_cpu_init_collection(void)
 }
 
 /* TODO: Remove static for the sake of compilation */
-struct its_device *its_find_device(struct its_node *its, u32 dev_id)
-{
-	struct its_device *its_dev = NULL, *tmp;
-	unsigned long flags;
-
-	spin_lock_irqsave(&its->lock, flags);
-
-	list_for_each_entry(tmp, &its->its_device_list, entry) {
-		if (tmp->device_id == dev_id) {
-			its_dev = tmp;
-			break;
-		}
-	}
-
-	spin_unlock_irqrestore(&its->lock, flags);
-
-	return its_dev;
-}
-
-/* TODO: Remove static for the sake of compilation */
-struct its_device *its_create_device(struct its_node *its, u32 dev_id,
-					    int nvecs)
-{
-	struct its_device *dev;
-	unsigned long *lpi_map;
-	unsigned long flags;
-	void *itt;
-	int lpi_base;
-	int nr_lpis;
-	int nr_ites;
-	int cpu;
-	int sz;
-
-	dev = xzalloc(struct its_device);
-	/*
-	 * At least one bit of EventID is being used, hence a minimum
-	 * of two entries. No, the architecture doesn't let you
-	 * express an ITT with a single entry.
-	 */
-        /*
-	 * TODO: replace roundup_pow_of_2 with shift for now.
-	 * This code is not used later
-	 */
-	nr_ites = max(2UL, (1UL << (nvecs)));
-	sz = nr_ites * its->ite_size;
-	sz = max(sz, ITS_ITT_ALIGN) + ITS_ITT_ALIGN - 1;
-	itt = xzalloc_bytes(sz);
-	lpi_map = its_lpi_alloc_chunks(nvecs, &lpi_base, &nr_lpis);
-
-	if (!dev || !itt || !lpi_map) {
-		xfree(dev);
-		xfree(itt);
-		xfree(lpi_map);
-		return NULL;
-	}
-
-	dev->its = its;
-	dev->itt = itt;
-	dev->nr_ites = nr_ites;
-	dev->lpi_map = lpi_map;
-	dev->lpi_base = lpi_base;
-	dev->nr_lpis = nr_lpis;
-	dev->device_id = dev_id;
-	INIT_LIST_HEAD(&dev->entry);
-
-	spin_lock_irqsave(&its->lock, flags);
-	list_add(&dev->entry, &its->its_device_list);
-	spin_unlock_irqrestore(&its->lock, flags);
-
-	/* Bind the device to the first possible CPU */
-	cpu = cpumask_first(&cpu_online_map);
-	dev->collection = &its->collections[cpu];
-
-	/* Map device to its ITT */
-	its_send_mapd(dev, 1);
-
-	return dev;
-}
-
-/* TODO: Remove static for the sake of compilation */
-void its_free_device(struct its_device *its_dev)
-{
-	unsigned long flags;
-
-	spin_lock_irqsave(&its_dev->its->lock, flags);
-	list_del(&its_dev->entry);
-	spin_unlock_irqrestore(&its_dev->its->lock, flags);
-	xfree(its_dev->itt);
-	xfree(its_dev);
-}
-
-/* TODO: Remove static for the sake of compilation */
 int its_alloc_device_irq(struct its_device *dev, int *hwirq)
 {
 	int idx;
@@ -1100,193 +898,6 @@ int its_alloc_device_irq(struct its_device *dev, int *hwirq)
 	return 0;
 }
 
-/* pci and msi handling no more required here */
-#if 0
-struct its_pci_alias {
-	struct pci_dev	*pdev;
-	u32		dev_id;
-	u32		count;
-};
-
-static int its_pci_msi_vec_count(struct pci_dev *pdev)
-{
-	int msi, msix;
-
-	msi = max(pci_msi_vec_count(pdev), 0);
-	msix = max(pci_msix_vec_count(pdev), 0);
-
-	return max(msi, msix);
-}
-
-static int its_get_pci_alias(struct pci_dev *pdev, u16 alias, void *data)
-{
-	struct its_pci_alias *dev_alias = data;
-
-	dev_alias->dev_id = alias;
-	if (pdev != dev_alias->pdev)
-		dev_alias->count += its_pci_msi_vec_count(dev_alias->pdev);
-
-	return 0;
-}
-
-static int its_msi_prepare(struct irq_domain *domain, struct device *dev,
-			   int nvec, msi_alloc_info_t *info)
-{
-	struct pci_dev *pdev;
-	struct its_node *its;
-	struct its_device *its_dev;
-	struct its_pci_alias dev_alias;
-
-	if (!dev_is_pci(dev))
-		return -EINVAL;
-
-	pdev = to_pci_dev(dev);
-	dev_alias.pdev = pdev;
-	dev_alias.count = nvec;
-
-	pci_for_each_dma_alias(pdev, its_get_pci_alias, &dev_alias);
-	its = domain->parent->host_data;
-
-	its_dev = its_find_device(its, dev_alias.dev_id);
-	if (its_dev) {
-		/*
-		 * We already have seen this ID, probably through
-		 * another alias (PCI bridge of some sort). No need to
-		 * create the device.
-		 */
-		dev_dbg(dev, "Reusing ITT for devID %x\n", dev_alias.dev_id);
-		goto out;
-	}
-
-	its_dev = its_create_device(its, dev_alias.dev_id, dev_alias.count);
-	if (!its_dev)
-		return -ENOMEM;
-
-	dev_dbg(&pdev->dev, "ITT %d entries, %d bits\n",
-		dev_alias.count, ilog2(dev_alias.count));
-out:
-	info->scratchpad[0].ptr = its_dev;
-	info->scratchpad[1].ptr = dev;
-	return 0;
-}
-
-static struct msi_domain_ops its_pci_msi_ops = {
-	.msi_prepare	= its_msi_prepare,
-};
-
-static struct msi_domain_info its_pci_msi_domain_info = {
-	.flags	= (MSI_FLAG_USE_DEF_DOM_OPS | MSI_FLAG_USE_DEF_CHIP_OPS |
-		   MSI_FLAG_MULTI_PCI_MSI | MSI_FLAG_PCI_MSIX),
-	.ops	= &its_pci_msi_ops,
-	.chip	= &its_msi_irq_chip,
-};
-
-#endif
-/* IRQ domain management is not required */
-#if 0
-static int its_irq_gic_domain_alloc(struct irq_domain *domain,
-				    unsigned int virq,
-				    irq_hw_number_t hwirq)
-{
-	struct of_phandle_args args;
-
-	args.np = domain->parent->of_node;
-	args.args_count = 3;
-	args.args[0] = GIC_IRQ_TYPE_LPI;
-	args.args[1] = hwirq;
-	args.args[2] = IRQ_TYPE_EDGE_RISING;
-
-	return irq_domain_alloc_irqs_parent(domain, virq, 1, &args);
-}
-
-static int its_irq_domain_alloc(struct irq_domain *domain, unsigned int virq,
-				unsigned int nr_irqs, void *args)
-{
-	msi_alloc_info_t *info = args;
-	struct its_device *its_dev = info->scratchpad[0].ptr;
-	irq_hw_number_t hwirq;
-	int err;
-	int i;
-
-	for (i = 0; i < nr_irqs; i++) {
-		err = its_alloc_device_irq(its_dev, &hwirq);
-		if (err)
-			return err;
-
-		err = its_irq_gic_domain_alloc(domain, virq + i, hwirq);
-		if (err)
-			return err;
-
-		irq_domain_set_hwirq_and_chip(domain, virq + i,
-					      hwirq, &its_irq_chip, its_dev);
-		dev_dbg(info->scratchpad[1].ptr, "ID:%d pID:%d vID:%d\n",
-			(int)(hwirq - its_dev->lpi_base), (int)hwirq, virq + i);
-	}
-
-	return 0;
-}
-
-static void its_irq_domain_activate(struct irq_domain *domain,
-				    struct irq_data *d)
-{
-	struct its_device *its_dev = irq_data_get_irq_chip_data(d);
-	u32 event = its_get_event_id(d);
-
-	/* Map the GIC IRQ and event to the device */
-	its_send_mapvi(its_dev, d->hwirq, event);
-}
-
-static void its_irq_domain_deactivate(struct irq_domain *domain,
-				      struct irq_data *d)
-{
-	struct its_device *its_dev = irq_data_get_irq_chip_data(d);
-	u32 event = its_get_event_id(d);
-
-	/* Stop the delivery of interrupts */
-	its_send_discard(its_dev, event);
-}
-
-static void its_irq_domain_free(struct irq_domain *domain, unsigned int virq,
-				unsigned int nr_irqs)
-{
-	struct irq_data *d = irq_domain_get_irq_data(domain, virq);
-	struct its_device *its_dev = irq_data_get_irq_chip_data(d);
-	int i;
-
-	for (i = 0; i < nr_irqs; i++) {
-		struct irq_data *data = irq_domain_get_irq_data(domain,
-								virq + i);
-		u32 event = its_get_event_id(data);
-
-		/* Mark interrupt index as unused */
-		clear_bit(event, its_dev->lpi_map);
-
-		/* Nuke the entry in the domain */
-		irq_domain_reset_irq_data(data);
-	}
-
-	/* If all interrupts have been freed, start mopping the floor */
-	if (bitmap_empty(its_dev->lpi_map, its_dev->nr_lpis)) {
-		its_lpi_free(its_dev->lpi_map,
-			     its_dev->lpi_base,
-			     its_dev->nr_lpis);
-
-		/* Unmap device/itt */
-		its_send_mapd(its_dev, 0);
-		its_free_device(its_dev);
-	}
-
-	irq_domain_free_irqs_parent(domain, virq, nr_irqs);
-}
-
-static const struct irq_domain_ops its_domain_ops = {
-	.alloc			= its_irq_domain_alloc,
-	.free			= its_irq_domain_free,
-	.activate		= its_irq_domain_activate,
-	.deactivate		= its_irq_domain_deactivate,
-};
-#endif
-
 static int its_force_quiescent(void __iomem *base)
 {
 	u32 count = 1000000;	/* 1s */
@@ -1360,7 +971,6 @@ static int its_probe(struct dt_device_node *node)
 
 	spin_lock_init(&its->lock);
 	INIT_LIST_HEAD(&its->entry);
-	INIT_LIST_HEAD(&its->its_device_list);
 	its->base = its_base;
 	its->phys_base = its_addr;
 	its->ite_size = ((readl_relaxed(its_base + GITS_TYPER) >> 4) & 0xf) + 1;
@@ -1395,41 +1005,11 @@ static int its_probe(struct dt_device_node *node)
 		its_info("ITS: using cache flushing for cmd queue\n");
 		its->flags |= ITS_FLAGS_CMDQ_NEEDS_FLUSHING;
 	}
-#if 0
-	if (of_property_read_bool(its->msi_chip.of_node, "msi-controller")) {
-		its->domain = irq_domain_add_tree(NULL, &its_domain_ops, its);
-		if (!its->domain) {
-			err = -ENOMEM;
-			goto out_free_tables;
-		}
-
-		its->domain->parent = parent;
-
-		its->msi_chip.domain = pci_msi_create_irq_domain(node,
-								 &its_pci_msi_domain_info,
-								 its->domain);
-		if (!its->msi_chip.domain) {
-			err = -ENOMEM;
-			goto out_free_domains;
-		}
-
-		err = of_pci_msi_chip_add(&its->msi_chip);
-		if (err)
-			goto out_free_domains;
-	}
-#endif
 	spin_lock(&its_lock);
 	list_add(&its->entry, &its_nodes);
 	spin_unlock(&its_lock);
 
 	return 0;
-#if 0
-out_free_domains:
-	if (its->msi_chip.domain)
-		irq_domain_remove(its->msi_chip.domain);
-	if (its->domain)
-		irq_domain_remove(its->domain);
-#endif
 out_free_tables:
 	its_free_tables(its);
 out_free_cmd:
diff --git a/xen/include/asm-arm/gic-its.h b/xen/include/asm-arm/gic-its.h
index 74c4398..b6734a3 100644
--- a/xen/include/asm-arm/gic-its.h
+++ b/xen/include/asm-arm/gic-its.h
@@ -83,14 +83,6 @@ static inline void its_encode_collection(struct its_cmd_block *cmd, u16 col)
     cmd->raw_cmd[2] |= col;
 }
 
-static inline void its_fixup_cmd(struct its_cmd_block *cmd)
-{
-    /* Let's fixup BE commands */
-    cmd->raw_cmd[0] = cpu_to_le64(cmd->raw_cmd[0]);
-    cmd->raw_cmd[1] = cpu_to_le64(cmd->raw_cmd[1]);
-    cmd->raw_cmd[2] = cpu_to_le64(cmd->raw_cmd[2]);
-    cmd->raw_cmd[3] = cpu_to_le64(cmd->raw_cmd[3]);
-}
 #endif /* __ASM_ARM_GIC_ITS_H__ */
 
 /*
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 109+ messages in thread
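
Note (illustrative sketch, not part of the posted patch): after this
cleanup the command descriptors carry a plain device ID and a
collection pointer instead of a struct its_device pointer, and each
builder returns the collection that a following SYNC should target.
The sketch below models that shape for MOVI only. The types are
simplified stand-ins, the GITS_CMD_MOVI value is an assumption taken
from the Linux GICv3 header, and queue_one() is a hypothetical helper.

```c
#include <stdint.h>
#include <string.h>

struct its_collection {
    uint64_t target_address;
    uint16_t col_id;
};

struct its_cmd_block {
    uint64_t raw_cmd[4];
};

/* Descriptor in the post-patch shape: IDs and collection, no its_device. */
struct its_cmd_desc {
    union {
        struct {
            struct its_collection *col;
            uint32_t id;
            uint32_t dev_id;
        } its_movi_cmd;
    };
};

typedef struct its_collection *(*its_cmd_builder_t)(struct its_cmd_block *,
                                                    struct its_cmd_desc *);

/* Assumed opcode value (from the Linux GICv3 header, not from this patch). */
#define GITS_CMD_MOVI 0x01

static struct its_collection *its_build_movi_cmd(struct its_cmd_block *cmd,
                                                 struct its_cmd_desc *desc)
{
    cmd->raw_cmd[0] = GITS_CMD_MOVI |
                      ((uint64_t)desc->its_movi_cmd.dev_id << 32);
    cmd->raw_cmd[1] = desc->its_movi_cmd.id;
    cmd->raw_cmd[2] = desc->its_movi_cmd.col->col_id;

    /* The caller uses the returned collection to address the SYNC. */
    return desc->its_movi_cmd.col;
}

/* Hypothetical helper: zero a queue slot and run one builder on it. */
static struct its_collection *queue_one(struct its_cmd_block *slot,
                                        its_cmd_builder_t builder,
                                        struct its_cmd_desc *desc)
{
    memset(slot, 0, sizeof(*slot));
    return builder(slot, desc);
}
```

Since the builder returns the descriptor's col field, a sender has to
fill in col before queuing a command; in this sketch the caller does
so explicitly.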

* [RFC PATCH v2 09/22] xen/arm: its: Add helper functions to decode ITS Command
  2015-03-19 14:37 [RFC PATCH v2 00/22] xen/arm: Add ITS support vijay.kilari
                   ` (7 preceding siblings ...)
  2015-03-19 14:37 ` [RFC PATCH v2 08/22] xen/arm: its: Remove unused code in ITS driver vijay.kilari
@ 2015-03-19 14:37 ` vijay.kilari
  2015-04-01 11:40   ` Ian Campbell
  2015-03-19 14:37 ` [RFC PATCH v2 10/22] xen/arm: Add helper function to get domain page vijay.kilari
                   ` (14 subsequent siblings)
  23 siblings, 1 reply; 109+ messages in thread
From: vijay.kilari @ 2015-03-19 14:37 UTC (permalink / raw)
  To: Ian.Campbell, julien.grall, stefano.stabellini,
	stefano.stabellini, tim, xen-devel
  Cc: Prasun.Kapoor, Vijaya Kumar K, manish.jaggi, vijay.kilari

From: Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com>

Add helper functions to decode ITS commands.
These will be used by the virtual ITS driver.

Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com>
---
 xen/include/asm-arm/gic-its.h |   45 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 45 insertions(+)

diff --git a/xen/include/asm-arm/gic-its.h b/xen/include/asm-arm/gic-its.h
index b6734a3..9d6fd3e 100644
--- a/xen/include/asm-arm/gic-its.h
+++ b/xen/include/asm-arm/gic-its.h
@@ -29,6 +29,51 @@ struct its_cmd_block {
     u64     raw_cmd[4];
 };
 
+static inline uint8_t its_decode_cmd(struct its_cmd_block *cmd)
+{
+    return cmd->raw_cmd[0] & 0xff;
+}
+
+static inline uint32_t its_decode_devid(struct its_cmd_block *cmd)
+{
+    return (cmd->raw_cmd[0] >> 32);
+}
+
+static inline uint32_t its_decode_event_id(struct its_cmd_block *cmd)
+{
+    return (uint32_t)cmd->raw_cmd[1];
+}
+
+static inline uint32_t its_decode_phys_id(struct its_cmd_block *cmd)
+{
+    return cmd->raw_cmd[1] >> 32;
+}
+
+static inline uint8_t its_decode_size(struct its_cmd_block *cmd)
+{
+    return (u8)(cmd->raw_cmd[1] & 0x1f);
+}
+
+static inline uint64_t its_decode_itt(struct its_cmd_block *cmd)
+{
+    return (cmd->raw_cmd[2] & 0xffffffffff00ULL);
+}
+
+static inline int its_decode_valid(struct its_cmd_block *cmd)
+{
+    return cmd->raw_cmd[2] >> 63;
+}
+
+static inline uint64_t its_decode_target(struct its_cmd_block *cmd)
+{
+    return (cmd->raw_cmd[2] & 0xffffffff0000ULL);
+}
+
+static inline u16 its_decode_collection(struct its_cmd_block *cmd)
+{
+    return (u16)(cmd->raw_cmd[2] & 0xffffULL);
+}
+
 static inline void its_encode_cmd(struct its_cmd_block *cmd, u8 cmd_nr)
 {
     cmd->raw_cmd[0] &= ~0xffUL;
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 109+ messages in thread
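
Note (illustrative sketch, not part of the posted patch): each decode
helper above is meant to be the inverse of the matching encode helper.
The standalone sketch below checks that property for the two fields
that share raw_cmd[1], the event ID in the low half and the physical
ID in the high half. The fixed-width types replace Xen's u32/u64 and
the struct is a stand-in for the one in gic-its.h.

```c
#include <stdint.h>

struct its_cmd_block {
    uint64_t raw_cmd[4];
};

static inline void its_encode_event_id(struct its_cmd_block *cmd, uint32_t id)
{
    cmd->raw_cmd[1] &= ~0xffffffffULL;        /* clear low 32 bits */
    cmd->raw_cmd[1] |= id;
}

static inline void its_encode_phys_id(struct its_cmd_block *cmd,
                                      uint32_t phys_id)
{
    cmd->raw_cmd[1] &= 0xffffffffULL;         /* keep low half, clear high */
    cmd->raw_cmd[1] |= ((uint64_t)phys_id) << 32;
}

static inline uint32_t its_decode_event_id(const struct its_cmd_block *cmd)
{
    return (uint32_t)cmd->raw_cmd[1];         /* low 32 bits */
}

static inline uint32_t its_decode_phys_id(const struct its_cmd_block *cmd)
{
    return (uint32_t)(cmd->raw_cmd[1] >> 32); /* high 32 bits */
}
```

Encoding one field leaves the other untouched, so a virtual ITS
driver could decode a guest command, substitute the physical ID, and
re-encode without disturbing the event ID.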

* [RFC PATCH v2 10/22] xen/arm: Add helper function to get domain page
  2015-03-19 14:37 [RFC PATCH v2 00/22] xen/arm: Add ITS support vijay.kilari
                   ` (8 preceding siblings ...)
  2015-03-19 14:37 ` [RFC PATCH v2 09/22] xen/arm: its: Add helper functions to decode ITS Command vijay.kilari
@ 2015-03-19 14:37 ` vijay.kilari
  2015-03-20 16:39   ` Julien Grall
  2015-03-19 14:37 ` [RFC PATCH v2 11/22] xen/arm: its: Move its_device structure to header file vijay.kilari
                   ` (13 subsequent siblings)
  23 siblings, 1 reply; 109+ messages in thread
From: vijay.kilari @ 2015-03-19 14:37 UTC (permalink / raw)
  To: Ian.Campbell, julien.grall, stefano.stabellini,
	stefano.stabellini, tim, xen-devel
  Cc: Prasun.Kapoor, Vijaya Kumar K, manish.jaggi, vijay.kilari

From: Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com>

Given the physical address of a page, look up the page and
take a reference on it so that Xen can map it to access the
domain's memory.

Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com>
---
 xen/arch/arm/p2m.c        |   24 ++++++++++++++++++++++++
 xen/include/asm-arm/p2m.h |    3 +++
 2 files changed, 27 insertions(+)

diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
index 8809f5a..b19c5e9 100644
--- a/xen/arch/arm/p2m.c
+++ b/xen/arch/arm/p2m.c
@@ -1152,6 +1152,30 @@ unsigned long gmfn_to_mfn(struct domain *d, unsigned long gpfn)
     return p >> PAGE_SHIFT;
 }
 
+struct page_info *get_page_from_paddr(struct domain *d, paddr_t paddr,
+                                      unsigned long flags)
+{
+    struct p2m_domain *p2m = &d->arch.p2m;
+    struct page_info *page = NULL;
+
+    ASSERT(d == current->domain);
+
+    spin_lock(&p2m->lock);
+
+    if ( !mfn_valid(paddr >> PAGE_SHIFT) )
+        goto err;
+
+    page = mfn_to_page(paddr >> PAGE_SHIFT);
+    ASSERT(page);
+
+    if ( unlikely(!get_page(page, d)) )
+        page = NULL;
+
+err:
+    spin_unlock(&p2m->lock);
+    return page;
+}
+
 struct page_info *get_page_from_gva(struct domain *d, vaddr_t va,
                                     unsigned long flags)
 {
diff --git a/xen/include/asm-arm/p2m.h b/xen/include/asm-arm/p2m.h
index da36504..7947e1b 100644
--- a/xen/include/asm-arm/p2m.h
+++ b/xen/include/asm-arm/p2m.h
@@ -147,6 +147,9 @@ void guest_physmap_remove_page(struct domain *d,
 
 unsigned long gmfn_to_mfn(struct domain *d, unsigned long gpfn);
 
+struct page_info *get_page_from_paddr(struct domain *d, paddr_t paddr,
+                                      unsigned long flags);
+
 /*
  * Populate-on-demand
  */
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 109+ messages in thread
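
Note (illustrative sketch, not part of the posted patch): the helper
above turns a physical address into a frame number with
paddr >> PAGE_SHIFT before validating it and looking up the page. The
sketch below shows only that arithmetic. PAGE_SHIFT of 12 (4 KiB
pages) is an assumption, and paddr_to_frame() and paddr_page_offset()
are hypothetical names.

```c
#include <stdint.h>

#define PAGE_SHIFT 12                        /* assumption: 4 KiB pages */
#define PAGE_SIZE  (1ULL << PAGE_SHIFT)

typedef uint64_t paddr_t;

/* Frame number: the address with the in-page offset shifted out. */
static inline uint64_t paddr_to_frame(paddr_t paddr)
{
    return paddr >> PAGE_SHIFT;
}

/* Offset of the address within its page. */
static inline uint64_t paddr_page_offset(paddr_t paddr)
{
    return paddr & (PAGE_SIZE - 1);
}
```

A caller that maps the page would typically add the in-page offset
back to the mapped base to reach the original byte.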

* [RFC PATCH v2 11/22] xen/arm: its: Move its_device structure to header file
  2015-03-19 14:37 [RFC PATCH v2 00/22] xen/arm: Add ITS support vijay.kilari
                   ` (9 preceding siblings ...)
  2015-03-19 14:37 ` [RFC PATCH v2 10/22] xen/arm: Add helper function to get domain page vijay.kilari
@ 2015-03-19 14:37 ` vijay.kilari
  2015-03-19 14:37 ` [RFC PATCH v2 12/22] xen/arm: its: Update irq descriptor for LPIs support vijay.kilari
                   ` (12 subsequent siblings)
  23 siblings, 0 replies; 109+ messages in thread
From: vijay.kilari @ 2015-03-19 14:37 UTC (permalink / raw)
  To: Ian.Campbell, julien.grall, stefano.stabellini,
	stefano.stabellini, tim, xen-devel
  Cc: Prasun.Kapoor, Vijaya Kumar K, manish.jaggi, vijay.kilari

From: Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com>

The its_device structure can be reused by the virtual ITS
driver, so move it to gic-its.h.

The virtual ITS driver also reuses the physical LPI allocation
code of the physical ITS driver.

Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com>
---
 xen/arch/arm/gic-v3-its.c     |   16 ----------------
 xen/include/asm-arm/gic-its.h |   16 ++++++++++++++++
 2 files changed, 16 insertions(+), 16 deletions(-)

diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
index 0c55959..242cf65 100644
--- a/xen/arch/arm/gic-v3-its.c
+++ b/xen/arch/arm/gic-v3-its.c
@@ -90,22 +90,6 @@ struct its_node {
 
 #define ITS_ITT_ALIGN		SZ_256
 
-/*
- * The ITS view of a device - belongs to an ITS, a collection, owns an
- * interrupt translation table, and a list of interrupts.
- */
-struct its_device {
-	struct list_head	entry;
-	struct its_node		*its;
-	struct its_collection	*collection;
-	void			*itt;
-	unsigned long		*lpi_map;
-	u32			lpi_base;
-	int			nr_lpis;
-	u32			nr_ites;
-	u32			device_id;
-};
-
 static LIST_HEAD(its_nodes);
 static DEFINE_SPINLOCK(its_lock);
 static struct dt_device_node *gic_root_node;
diff --git a/xen/include/asm-arm/gic-its.h b/xen/include/asm-arm/gic-its.h
index 9d6fd3e..fa1e305 100644
--- a/xen/include/asm-arm/gic-its.h
+++ b/xen/include/asm-arm/gic-its.h
@@ -29,6 +29,22 @@ struct its_cmd_block {
     u64     raw_cmd[4];
 };
 
+/*
+ * The ITS view of a device - belongs to an ITS, a collection, owns an
+ * interrupt translation table, and a list of interrupts.
+ */
+struct its_device {
+        struct list_head        entry;
+        struct its_node         *its;
+        struct its_collection   *collection;
+        void                    *itt;
+        unsigned long           *lpi_map;
+        u32                     lpi_base;
+        int                     nr_lpis;
+        u32                     nr_ites;
+        u32                     device_id;
+};
+
 static inline uint8_t its_decode_cmd(struct its_cmd_block *cmd)
 {
     return cmd->raw_cmd[0] & 0xff;
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 109+ messages in thread

* [RFC PATCH v2 12/22] xen/arm: its: Update irq descriptor for LPIs support
  2015-03-19 14:37 [RFC PATCH v2 00/22] xen/arm: Add ITS support vijay.kilari
                   ` (10 preceding siblings ...)
  2015-03-19 14:37 ` [RFC PATCH v2 11/22] xen/arm: its: Move its_device structure to header file vijay.kilari
@ 2015-03-19 14:37 ` vijay.kilari
  2015-03-20 16:44   ` Julien Grall
  2015-03-19 14:38 ` [RFC PATCH v2 13/22] xen/arm: its: Add virtual ITS command support vijay.kilari
                   ` (11 subsequent siblings)
  23 siblings, 1 reply; 109+ messages in thread
From: vijay.kilari @ 2015-03-19 14:37 UTC (permalink / raw)
  To: Ian.Campbell, julien.grall, stefano.stabellini,
	stefano.stabellini, tim, xen-devel
  Cc: Prasun.Kapoor, Vijaya Kumar K, manish.jaggi, vijay.kilari

From: Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com>

Introduce new fields in arch_irq_desc to hold the
virtual IRQ number and a pointer to the ITS device.
Also introduce helper functions to read and update the
device pointer in arch_irq_desc.

Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com>
---
 xen/arch/arm/irq.c        |   26 ++++++++++++++++++++++++++
 xen/include/asm-arm/irq.h |    4 ++++
 2 files changed, 30 insertions(+)

diff --git a/xen/arch/arm/irq.c b/xen/arch/arm/irq.c
index cb9c99b..d02f4cf 100644
--- a/xen/arch/arm/irq.c
+++ b/xen/arch/arm/irq.c
@@ -89,6 +89,8 @@ static int __cpuinit init_local_irq_data(void)
         init_one_irq_desc(desc);
         desc->irq = irq;
         desc->action  = NULL;
+        desc->arch.dev = NULL;
+        desc->arch.virq = 0;
 
         /* PPIs are included in local_irqs, we copy the IRQ type from
          * local_irqs_type when bringing up local IRQ for this CPU in
@@ -104,6 +106,30 @@ static int __cpuinit init_local_irq_data(void)
     return 0;
 }
 
+int irq_set_desc_data(unsigned int irq, struct its_device *d)
+{
+    unsigned long flags;
+    struct irq_desc *desc = irq_to_desc(irq);
+
+    spin_lock_irqsave(&desc->lock, flags);
+    desc->arch.dev = d;
+    spin_unlock_irqrestore(&desc->lock, flags);
+
+    return 0;
+}
+
+struct its_device *irq_get_desc_data(struct irq_desc *desc)
+{
+    unsigned long flags;
+    struct its_device *dev;
+
+    spin_lock_irqsave(&desc->lock, flags);
+    dev = desc->arch.dev;
+    spin_unlock_irqrestore(&desc->lock, flags);
+
+    return dev;
+}
+
 void __init init_IRQ(void)
 {
     int irq;
diff --git a/xen/include/asm-arm/irq.h b/xen/include/asm-arm/irq.h
index 435dfcd..f091739 100644
--- a/xen/include/asm-arm/irq.h
+++ b/xen/include/asm-arm/irq.h
@@ -17,6 +17,8 @@ struct arch_pirq
 struct arch_irq_desc {
     int eoi_cpu;
     unsigned int type;
+    unsigned int virq;
+    struct its_device *dev;
 };
 
 #define NR_LOCAL_IRQS	32
@@ -47,6 +49,8 @@ void arch_move_irqs(struct vcpu *v);
 /* Set IRQ type for an SPI */
 int irq_set_spi_type(unsigned int spi, unsigned int type);
 
+int irq_set_desc_data(unsigned int irq, struct its_device *d);
+struct its_device *irq_get_desc_data(struct irq_desc *d);
 int platform_get_irq(const struct dt_device_node *device, int index);
 
 void irq_set_affinity(struct irq_desc *desc, const cpumask_t *cpu_mask);
-- 
1.7.9.5


* [RFC PATCH v2 13/22] xen/arm: its: Add virtual ITS command support
  2015-03-19 14:37 [RFC PATCH v2 00/22] xen/arm: Add ITS support vijay.kilari
                   ` (11 preceding siblings ...)
  2015-03-19 14:37 ` [RFC PATCH v2 12/22] xen/arm: its: Update irq descriptor for LPIs support vijay.kilari
@ 2015-03-19 14:38 ` vijay.kilari
  2015-03-21  0:28   ` Julien Grall
                     ` (2 more replies)
  2015-03-19 14:38 ` [RFC PATCH v2 14/22] xen/arm: its: Add emulation of ITS control registers vijay.kilari
                   ` (10 subsequent siblings)
  23 siblings, 3 replies; 109+ messages in thread
From: vijay.kilari @ 2015-03-19 14:38 UTC (permalink / raw)
  To: Ian.Campbell, julien.grall, stefano.stabellini,
	stefano.stabellini, tim, xen-devel
  Cc: Prasun.Kapoor, Vijaya Kumar K, manish.jaggi, vijay.kilari

From: Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com>

Add virtual ITS command processing support to the
virtual ITS driver. Also add APIs to the physical
ITS driver to send commands from the virtual ITS driver.

In this patch, the following is done:
 - The physical ITS driver allocates a physical LPI for
   each virtual LPI request.
 - The Device ID is used to find the ITS to which the device is attached,
   and the ITS command is sent to that physical ITS.
 - Commands like SYNC and INVALL do not have a device ID, so these
   commands are sent to all physical ITS nodes.
 - The vTA (virtual target address) is treated as the unique key for
   mapping to the physical target address and collection IDs.

Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com>
---
v2: - Put unused code under #if 0/#endif
    - Changes to the redistributor are moved to a separate patch
    - Fixed comments from RFC version
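
Note (illustrative only, not part of the patch): the routing rule from
the description can be sketched as below. Commands that carry a Device
ID go to the one ITS the device is attached to, while SYNC and INVALL
are queued on every physical ITS. All demo_* names are hypothetical, and
the "dev_id modulo number of nodes" lookup is only a placeholder
assumption; the patch itself currently returns the first ITS node until
a PCI helper is available.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical subset of ITS commands, enough to show the routing. */
enum demo_cmd {
    DEMO_CMD_MAPD,
    DEMO_CMD_MAPVI,
    DEMO_CMD_SYNC,
    DEMO_CMD_INVALL,
};

#define DEMO_NR_ITS 2

/* Hypothetical physical ITS node; 'sent' counts queued commands. */
struct demo_its {
    unsigned int id;
    unsigned int sent;
};

static struct demo_its demo_its_nodes[DEMO_NR_ITS] = { { 0, 0 }, { 1, 0 } };

/* Placeholder device-to-ITS lookup (an assumption, see the note above). */
static struct demo_its *demo_its_for_device(uint32_t dev_id)
{
    return &demo_its_nodes[dev_id % DEMO_NR_ITS];
}

/* SYNC and INVALL carry no Device ID, so they cannot be routed by device. */
static bool demo_cmd_is_broadcast(enum demo_cmd cmd)
{
    return cmd == DEMO_CMD_SYNC || cmd == DEMO_CMD_INVALL;
}

/* Returns the number of physical ITS nodes the command was queued on. */
static unsigned int demo_route_cmd(enum demo_cmd cmd, uint32_t dev_id)
{
    unsigned int i;

    if (demo_cmd_is_broadcast(cmd)) {
        for (i = 0; i < DEMO_NR_ITS; i++)
            demo_its_nodes[i].sent++;
        return DEMO_NR_ITS;
    }

    demo_its_for_device(dev_id)->sent++;
    return 1;
}
```

Broadcasting is the conservative choice here: without a Device ID there
is no way to tell which physical ITS holds the state the guest wants
synchronised or invalidated.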
---
 xen/arch/arm/Makefile         |    1 +
 xen/arch/arm/gic-v3-its.c     |  185 ++++++++-
 xen/arch/arm/vgic-v3-its.c    |  879 +++++++++++++++++++++++++++++++++++++++++
 xen/include/asm-arm/domain.h  |    9 +
 xen/include/asm-arm/gic-its.h |   86 +++-
 5 files changed, 1156 insertions(+), 4 deletions(-)

diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
index 66ea264..81a3317 100644
--- a/xen/arch/arm/Makefile
+++ b/xen/arch/arm/Makefile
@@ -32,6 +32,7 @@ obj-y += traps.o
 obj-y += vgic.o vgic-v2.o
 obj-$(CONFIG_ARM_64) += vgic-v3.o
 obj-$(CONFIG_ARM_64) += gic-v3-its.o
+obj-$(CONFIG_ARM_64) += vgic-v3-its.o
 obj-y += vtimer.o
 obj-y += vuart.o
 obj-y += hvm.o
diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
index 242cf65..a9aab73 100644
--- a/xen/arch/arm/gic-v3-its.c
+++ b/xen/arch/arm/gic-v3-its.c
@@ -56,6 +56,14 @@
 
 #define its_warn(fmt, ...)                                            \
 
+//#define DEBUG_GIC_ITS
+
+#ifdef DEBUG_GIC_ITS
+# define DPRINTK(fmt, args...) printk(XENLOG_DEBUG fmt, ##args)
+#else
+# define DPRINTK(fmt, args...) do {} while ( 0 )
+#endif
+
 #define ITS_FLAGS_CMDQ_NEEDS_FLUSHING		(1 << 0)
 
 #define RDIST_FLAGS_PROPBASE_NEEDS_FLUSHING	(1 << 0)
@@ -68,6 +76,7 @@
 struct its_collection {
 	u64			target_address;
 	u16			col_id;
+	u16			valid;
 };
 
 /*
@@ -80,14 +89,19 @@ struct its_node {
 	struct list_head	entry;
 	void __iomem		*base;
 	unsigned long		phys_base;
+	unsigned long		phys_size;
 	struct its_cmd_block	*cmd_base;
 	struct its_cmd_block	*cmd_write;
 	void			*tables[GITS_BASER_NR_REGS];
 	struct its_collection	*collections;
 	u64			flags;
 	u32			ite_size;
+	u32			nr_collections;
+	struct dt_device_node	*dt_node;
 };
 
+uint32_t pta_type;
+
 #define ITS_ITT_ALIGN		SZ_256
 
 static LIST_HEAD(its_nodes);
@@ -127,6 +141,123 @@ struct its_cmd_desc {
 	};
 };
 
+uint32_t its_get_pta_type(void)
+{
+	return pta_type;
+}
+
+struct its_node * its_get_phys_node(uint32_t dev_id)
+{
+	struct its_node *its;
+
+	/* TODO: For now return the ITS0 node.
+	 * A PCI helper function is needed to query which
+	 * ITS node the device is attached to.
+	 */
+	list_for_each_entry(its, &its_nodes, entry) {
+		return its;
+	}
+
+	return NULL;
+}
+
+static int its_search_rdist_address(struct domain *d, uint64_t ta,
+				    uint32_t *col_id)
+{
+	int i, rg;
+	paddr_t start, end;
+
+	for (rg = 0; rg < d->arch.vgic.nr_regions; rg++) {
+		i = 0;
+		start = d->arch.vgic.rdist_regions[rg].base;
+		end = d->arch.vgic.rdist_regions[rg].base +
+			d->arch.vgic.rdist_regions[rg].size;
+		while ((( start + i * d->arch.vgic.rdist_stride) < end)) {
+			if ((start + i * d->arch.vgic.rdist_stride) == ta) {
+				DPRINTK("ITS: Found pta 0x%lx\n", ta);
+				*col_id = i;
+				return 0;
+			}
+			i++;
+		}
+	}
+	return 1;
+}
+
+int its_get_physical_cid(struct domain *d, uint32_t *col_id, uint64_t ta)
+{
+	int i;
+	struct its_collection *col;
+
+	/*
+	* For Dom0, the target address info is collected
+	* at boot time.
+	*/
+	if (is_hardware_domain(d)) {
+		struct its_node *its;
+
+		list_for_each_entry(its, &its_nodes, entry) {
+			for (i = 0; i < its->nr_collections; i++) {
+		                col = &its->collections[i];
+				if (col->valid && col->target_address == ta) {
+					DPRINTK("ITS:Match ta 0x%lx ta 0x%lx\n",
+						col->target_address, ta);
+					*col_id = col->col_id;
+					return 0;
+				}
+			}
+			/* All collections are mapped on every physical ITS */
+			break;
+		}
+	}
+	else
+	{
+		/* As per the spec, the target address is either the
+		 * re-distributor address or the cpu number.
+		 * We cannot rely on the collection id as it can be any number.
+		 * So here we rely only on the vta to map the
+		 * collection. For domU, vta != target address.
+		 * So, check which GICR region the vta corresponds to and
+		 * use that vcpu id as the collection id.
+		 */
+		if (its_get_pta_type()) {
+			return its_search_rdist_address(d, ta, col_id);
+		}
+		else
+		{
+			*col_id = ta;
+			return 0;
+		}
+	}
+
+	DPRINTK("ITS: Cannot find valid pta entry for ta 0x%lx\n", ta);
+	return 1;
+}
+
+int its_get_target(uint8_t pcid, uint64_t *pta)
+{
+	int i;
+	struct its_collection *col;
+	struct its_node *its;
+
+	list_for_each_entry(its, &its_nodes, entry) {
+		for (i = 0; i < its->nr_collections; i++) {
+			col = &its->collections[i];
+			if (col->valid && col->col_id == pcid) {
+				*pta = col->target_address;
+				DPRINTK("ITS:Match pta 0x%lx vta 0x%lx\n",
+					col->target_address, *pta);
+				return 0;
+			}
+		}
+		/* All collections are mapped on every physical ITS */
+		break;
+	}
+
+	DPRINTK("ITS: Cannot find valid pta entry for pcid %d\n", pcid);
+	return 1;
+}
+
 #define ITS_CMD_QUEUE_SZ		SZ_64K
 #define ITS_CMD_QUEUE_NR_ENTRIES	(ITS_CMD_QUEUE_SZ / sizeof(struct its_cmd_block))
 
@@ -541,7 +672,6 @@ out:
 	return bitmap;
 }
 
-/* TODO: Remove static for the sake of compilation */
 void its_lpi_free(unsigned long *bitmap, int base, int nr_ids)
 {
 	int lpi;
@@ -859,6 +989,8 @@ static void its_cpu_init_collection(void)
 		/* Perform collection mapping */
 		its->collections[cpu].target_address = target;
 		its->collections[cpu].col_id = cpu;
+		its->collections[cpu].valid = 1;
+		its->nr_collections++;
 
 		its_send_mapc(its, &its->collections[cpu], 1);
 		its_send_invall(its, &its->collections[cpu]);
@@ -867,8 +999,7 @@ static void its_cpu_init_collection(void)
 	spin_unlock(&its_lock);
 }
 
-/* TODO: Remove static for the sake of compilation */
-int its_alloc_device_irq(struct its_device *dev, int *hwirq)
+int its_alloc_device_irq(struct its_device *dev, uint32_t *hwirq)
 {
 	int idx;
 
@@ -882,6 +1013,47 @@ int its_alloc_device_irq(struct its_device *dev, int *hwirq)
 	return 0;
 }
 
+static int its_send_cmd(struct vcpu *v, struct its_node *its,
+			struct its_cmd_block *phys_cmd)
+{
+	struct its_cmd_block *cmd, *next_cmd;
+
+	spin_lock(&its->lock);
+
+	cmd = its_allocate_entry(its);
+	if (!cmd)
+		return 0;
+
+	cmd->raw_cmd[0] = phys_cmd->raw_cmd[0];
+	cmd->raw_cmd[1] = phys_cmd->raw_cmd[1];
+	cmd->raw_cmd[2] = phys_cmd->raw_cmd[2];
+	cmd->raw_cmd[3] = phys_cmd->raw_cmd[3];
+	its_flush_cmd(its, cmd);
+
+	next_cmd = its_post_commands(its);
+	spin_unlock(&its->lock);
+
+	its_wait_for_range_completion(its, cmd, next_cmd);
+
+	return 1;
+}
+
+int gic_its_send_cmd(struct vcpu *v, struct its_node *its,
+		     struct its_cmd_block *phys_cmd, int send_all)
+{
+	struct its_node *pits;
+	int ret = 0;
+
+	if (send_all) {
+		list_for_each_entry(pits, &its_nodes, entry)
+			ret = its_send_cmd(v, pits, phys_cmd);
+	}
+	else
+		return its_send_cmd(v, its, phys_cmd);
+
+	return ret;
+}
+
 static int its_force_quiescent(void __iomem *base)
 {
 	u32 count = 1000000;	/* 1s */
@@ -955,10 +1127,17 @@ static int its_probe(struct dt_device_node *node)
 
 	spin_lock_init(&its->lock);
 	INIT_LIST_HEAD(&its->entry);
+	its->dt_node = node;
 	its->base = its_base;
 	its->phys_base = its_addr;
+	its->phys_size = its_size;
 	its->ite_size = ((readl_relaxed(its_base + GITS_TYPER) >> 4) & 0xf) + 1;
 
+	if ( (readq_relaxed(its->base + GITS_TYPER) & GITS_TYPER_PTA) )
+		pta_type = 1;
+	else
+		pta_type = 0;
+
 	its->cmd_base = xzalloc_bytes(ITS_CMD_QUEUE_SZ);
 	if (!its->cmd_base) {
 		err = -ENOMEM;
diff --git a/xen/arch/arm/vgic-v3-its.c b/xen/arch/arm/vgic-v3-its.c
new file mode 100644
index 0000000..7530a88
--- /dev/null
+++ b/xen/arch/arm/vgic-v3-its.c
@@ -0,0 +1,879 @@
+/*
+ * Copyright (C) 2013, 2014 ARM Limited, All Rights Reserved.
+ * Author: Marc Zyngier <marc.zyngier@arm.com>
+ *
+ * Xen changes:
+ * Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com>
+ * Copyright (C) 2014 Cavium Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <xen/bitops.h>
+#include <xen/config.h>
+#include <xen/lib.h>
+#include <xen/init.h>
+#include <xen/softirq.h>
+#include <xen/irq.h>
+#include <xen/list.h>
+#include <xen/sched.h>
+#include <xen/sizes.h>
+#include <xen/xmalloc.h>
+#include <asm/current.h>
+#include <asm/device.h>
+#include <asm/mmio.h>
+#include <asm/io.h>
+#include <asm/gic_v3_defs.h>
+#include <asm/gic.h>
+#include <asm/vgic.h>
+#include <asm/gic-its.h>
+
+/* GITS register definitions */
+#define VITS_GITS_TYPER_HCC       (0xffU << 24)
+#define VITS_GITS_TYPER_PTA_SHIFT (19)
+#define VITS_GITS_DEV_BITS        (0x14U << 13)
+#define VITS_GITS_ID_BITS         (0x13U << 8)
+#define VITS_GITS_ITT_SIZE        (0x7U << 4)
+#define VITS_GITS_DISTRIBUTED     (0x1U << 3)
+#define VITS_GITS_PLPIS           (0x1U << 0)
+
+/* GITS_PIDRn register values for ARM implementations */
+#define GITS_PIDR0_VAL            (0x94)
+#define GITS_PIDR1_VAL            (0xb4)
+#define GITS_PIDR2_VAL            (0x3b)
+#define GITS_PIDR3_VAL            (0x00)
+#define GITS_PIDR4_VAL            (0x04)
+
+//#define DEBUG_ITS
+
+#ifdef DEBUG_ITS
+# define DPRINTK(fmt, args...) printk(XENLOG_DEBUG fmt, ##args)
+#else
+# define DPRINTK(fmt, args...) do {} while ( 0 )
+#endif
+
+#ifdef DEBUG_ITS
+static void dump_cmd(struct its_cmd_block *cmd)
+{
+    printk("CMD[0] = 0x%lx CMD[1] = 0x%lx CMD[2] = 0x%lx CMD[3] = 0x%lx\n",
+           cmd->raw_cmd[0], cmd->raw_cmd[1], cmd->raw_cmd[2], cmd->raw_cmd[3]);
+}
+#endif
+
+void vgic_its_disable_lpis(struct vcpu *v, uint32_t lpi)
+{
+    struct pending_irq *p;
+    unsigned long flags;
+
+    p = irq_to_pending(v, lpi);
+    clear_bit(GIC_IRQ_GUEST_ENABLED, &p->status);
+    gic_remove_from_queues(v, lpi);
+    if ( p->desc != NULL )
+    {
+        spin_lock_irqsave(&p->desc->lock, flags);
+        p->desc->handler->disable(p->desc);
+        spin_unlock_irqrestore(&p->desc->lock, flags);
+    }
+}
+
+void vgic_its_enable_lpis(struct vcpu *v, uint32_t lpi)
+{
+    struct pending_irq *p;
+    unsigned long flags;
+
+    p = irq_to_pending(v, lpi);
+    set_bit(GIC_IRQ_GUEST_ENABLED, &p->status);
+
+    spin_lock_irqsave(&v->arch.vgic.lock, flags);
+
+    if ( !list_empty(&p->inflight) &&
+         !test_bit(GIC_IRQ_GUEST_VISIBLE, &p->status) )
+        gic_raise_guest_irq(v, p->desc->arch.virq, p->priority);
+
+    spin_unlock_irqrestore(&v->arch.vgic.lock, flags);
+    if ( p->desc != NULL )
+    {
+        spin_lock_irqsave(&p->desc->lock, flags);
+        p->desc->handler->enable(p->desc);
+        spin_unlock_irqrestore(&p->desc->lock, flags);
+    }
+}
+
+static int vits_alloc_device_irq(struct its_device *dev, uint32_t id,
+                                uint32_t *plpi, uint32_t vlpi, uint32_t vcol_id)
+{
+
+    int idx, i = 0;
+
+    spin_lock(&dev->vlpi_lock);
+    while ((i = find_next_bit(dev->vlpi_map, dev->nr_lpis, i)) < dev->nr_lpis )
+    {
+        if ( dev->vlpi_entries[i].vlpi == vlpi )
+        {
+             *plpi = dev->vlpi_entries[i].plpi;
+             DPRINTK("Found plpi %d for device 0x%x with vlpi %d id %d\n",
+                      *plpi, dev->dev_id, vlpi, dev->vlpi_entries[i].id);
+             spin_unlock(&dev->vlpi_lock);
+             return 0;
+        }
+        i++;
+    }
+ 
+    if ( its_alloc_device_irq(dev, plpi) )
+        BUG_ON(1);
+
+    idx = find_first_zero_bit(dev->vlpi_map, dev->nr_lpis);
+    dev->vlpi_entries[idx].plpi = *plpi;
+    dev->vlpi_entries[idx].vlpi = vlpi;
+    dev->vlpi_entries[idx].id  = id;
+    set_bit(idx, dev->vlpi_map);
+
+    spin_unlock(&dev->vlpi_lock);
+
+    DPRINTK("Allocated plpi %d for device 0x%x with vlpi %d id %d @idx %d\n",
+            *plpi, dev->dev_id, vlpi, id, idx);
+
+    return 0;
+}
+
+/* Should be called with its lock held */
+static void vgic_its_unmap_id(struct vcpu *v, struct its_device *dev,
+                              uint32_t id, int trash)
+{
+    int i = 0;
+
+    DPRINTK("vITS: unmap id for device 0x%x id %d trash %d\n",
+             dev->dev_id, id, trash);
+
+    spin_lock(&dev->vlpi_lock);
+    while ((i = find_next_bit(dev->vlpi_map, dev->nr_lpis, i)) < dev->nr_lpis )
+    {
+        if ( dev->vlpi_entries[i].id == id )
+        {
+            DPRINTK("vITS: unmapped id for device 0x%x id %d lpi %d\n",
+                     dev->dev_id, dev->vlpi_entries[i].id,
+                     dev->vlpi_entries[i].plpi);
+            vgic_its_disable_lpis(v, dev->vlpi_entries[i].plpi);
+            release_irq(dev->vlpi_entries[i].plpi, v->domain);
+            /* XXX: Clear LPI base here? */
+            clear_bit(dev->vlpi_entries[i].plpi - dev->lpi_base, dev->lpi_map);
+            dev->vlpi_entries[i].plpi = 0;
+            dev->vlpi_entries[i].vlpi = 0;
+            dev->vlpi_entries[i].id = 0;
+            clear_bit(i, dev->vlpi_map);
+            goto out;
+        }
+        i++;
+    }
+
+    spin_unlock(&dev->vlpi_lock);
+    dprintk(XENLOG_ERR, "vITS: id %d not found for device 0x%x to unmap\n",
+           id, dev->device_id);
+
+    return;
+out:
+    if ( bitmap_empty(dev->lpi_map, dev->nr_lpis) )
+    {
+        its_lpi_free(dev->lpi_map, dev->lpi_base, dev->nr_lpis);
+        DPRINTK("vITS: Freeing lpi chunk\n");
+    }
+    /* XXX: Device entry is not removed on empty lpi list */
+    spin_unlock(&dev->vlpi_lock);
+}
+
+static int vgic_its_check_device_id(struct vcpu *v, struct its_device *dev,
+                                    uint32_t id)
+{
+    int i = 0;
+
+    spin_lock(&dev->vlpi_lock);
+    while ((i = find_next_bit(dev->vlpi_map, dev->nr_lpis, i)) < dev->nr_lpis )
+    {
+        if ( dev->vlpi_entries[i].id == id )
+        {
+            spin_unlock(&dev->vlpi_lock);
+            return 0;
+        }
+        i++;
+    }
+    spin_unlock(&dev->vlpi_lock);
+
+    return 1;
+}
+
+static struct its_device *vgic_its_check_device(struct vcpu *v, int dev_id)
+{
+    struct domain *d = v->domain;
+    struct its_device *dev = NULL, *tmp;
+
+    spin_lock(&d->arch.vits_devs.lock);
+    list_for_each_entry(tmp, &d->arch.vits_devs.dev_list, entry)
+    {
+        if ( tmp->device_id == dev_id )
+        {
+            DPRINTK("vITS: Found device 0x%x\n", dev_id);
+            dev = tmp;
+            break;
+        }
+    }
+    spin_unlock(&d->arch.vits_devs.lock);
+
+    return dev;
+}
+
+static int vgic_its_check_cid(struct vcpu *v,
+                              struct vgic_its *vits,
+                              uint8_t vcid, uint32_t *pcid)
+{
+    uint32_t nmap = vits->cid_map.nr_cid;
+    int i;
+
+    for ( i = 0; i < nmap; i++ )
+    {
+        if ( vcid == vits->cid_map.vcid[i] )
+        {
+            *pcid = vits->cid_map.pcid[i];
+            DPRINTK("vITS: Found pcid %d for vcid %d\n", *pcid,
+                     vits->cid_map.vcid[i]);
+            return 0;
+        }
+    }
+
+    return 1;
+}
+
+static uint64_t vgic_its_get_pta(struct vcpu *v, struct vgic_its *vits,
+                                 uint64_t vta)
+{
+    
+    uint32_t nmap = vits->cid_map.nr_cid;
+    int i;
+    uint8_t pcid;
+    uint64_t pta;
+
+    for ( i = 0; i < nmap; i++ )
+    {
+        if ( vta == vits->cid_map.vta[i] )
+        {
+            pcid = vits->cid_map.pcid[i];
+            DPRINTK("vITS: Found pcid %d for vta 0x%lx\n", pcid,
+                     vits->cid_map.vta[i]);
+            if ( its_get_target(pcid, &pta) )
+                BUG_ON(1);
+            return pta;
+        }
+    }
+
+    BUG_ON(1);
+    return 1;
+}
+
+static int vgic_its_build_mapd_cmd(struct vcpu *v,
+                                   struct its_cmd_block *virt_cmd,
+                                   struct its_cmd_block *phys_cmd)
+{
+    unsigned long itt_addr;
+
+    itt_addr = its_decode_itt(virt_cmd);
+    /* Get ITT PA from ITT IPA */
+    itt_addr = p2m_lookup(v->domain, itt_addr, NULL);
+    its_encode_cmd(phys_cmd, GITS_CMD_MAPD);
+    its_encode_devid(phys_cmd, its_decode_devid(virt_cmd));
+    its_encode_size(phys_cmd, its_decode_size(virt_cmd));
+    its_encode_itt(phys_cmd, itt_addr);
+    its_encode_valid(phys_cmd, its_decode_valid(virt_cmd));
+
+    DPRINTK("vITS: Build MAPD with itt_addr 0x%lx devId %d\n",itt_addr,
+            its_decode_devid(virt_cmd));
+
+    return 0;
+}
+
+static int vgic_its_build_sync_cmd(struct vcpu *v,
+                                   struct vgic_its *vits,
+                                   struct its_cmd_block *virt_cmd,
+                                   struct its_cmd_block *phys_cmd)
+{
+    uint64_t pta;
+
+    its_encode_cmd(phys_cmd, GITS_CMD_SYNC);
+    pta = vgic_its_get_pta(v, vits, its_decode_target(virt_cmd));
+
+    return 0;
+}
+
+static int vgic_its_build_mapvi_cmd(struct vcpu *v,
+                                    struct vgic_its *vits,
+                                    struct its_cmd_block *virt_cmd,
+                                    struct its_cmd_block *phys_cmd)
+{
+    struct domain *d = v->domain;
+    struct its_device *dev;
+    uint32_t pcol_id;
+    uint32_t pid;
+    struct irq_desc *desc;
+    uint32_t dev_id = its_decode_devid(virt_cmd);
+    uint32_t id = its_decode_event_id(virt_cmd);
+    uint8_t vcol_id = its_decode_collection(virt_cmd);
+    uint32_t vid = its_decode_phys_id(virt_cmd);
+    uint8_t cmd = its_decode_cmd(virt_cmd);
+
+    DPRINTK("vITS: MAPVI: dev_id 0x%x vcol_id %d vid %d \n",
+             dev_id, vcol_id, vid);
+
+    /* Search if device entry exists */
+    dev = vgic_its_check_device(v, dev_id);
+    if ( dev == NULL )
+    {
+        dprintk(XENLOG_ERR, "vITS: MAPVI: Fail to find device 0x%x\n", dev_id);
+        return 1;
+    }
+
+    /* Check if Collection id exists */
+    if ( vgic_its_check_cid(v, vits, vcol_id, &pcol_id) )
+    {
+        dprintk(XENLOG_ERR, "vITS: MAPVI: with wrong Collection %d\n", vcol_id);
+        return 1;
+    }
+    if ( vits_alloc_device_irq(dev, id, &pid, vid, vcol_id) )
+    {
+        dprintk(XENLOG_ERR, "vITS: MAPVI: Failed to alloc irq\n");
+        return 1;
+    }
+
+    /* Allocate irq desc for this pirq */
+    desc = irq_to_desc(pid);
+
+    route_irq_to_guest(d, pid, "LPI");
+
+     /* Assign device structure to desc data */
+    desc->arch.dev = dev;
+    desc->arch.virq = vid;
+
+    its_encode_cmd(phys_cmd, GITS_CMD_MAPVI);
+    its_encode_devid(phys_cmd, dev_id);
+
+    if ( cmd == GITS_CMD_MAPI )
+        its_encode_event_id(phys_cmd, vid);
+    else
+        its_encode_event_id(phys_cmd, its_decode_event_id(virt_cmd));
+
+    its_encode_phys_id(phys_cmd, pid);
+    its_encode_collection(phys_cmd, pcol_id);
+
+    return 0;
+}
+
+static int vgic_its_build_movi_cmd(struct vcpu *v,
+                                   struct vgic_its *vits,
+                                   struct its_cmd_block *virt_cmd,
+                                   struct its_cmd_block *phys_cmd)
+{
+    uint32_t pcol_id;
+    struct its_device *dev;
+    uint32_t dev_id = its_decode_devid(virt_cmd);
+    uint8_t vcol_id = its_decode_collection(virt_cmd);
+    uint32_t id = its_decode_event_id(virt_cmd);
+
+    DPRINTK("vITS: MOVI: dev_id 0x%x vcol_id %d\n", dev_id, vcol_id);
+    /* Search if device entry exists */
+    dev = vgic_its_check_device(v, dev_id);
+    if ( dev == NULL )
+    {
+        dprintk(XENLOG_ERR, "vITS: MOVI: Failed to find device 0x%x\n", dev_id);
+        return 1;
+    }
+
+    /* Check if Collection id exists */
+    if ( vgic_its_check_cid(v, vits, vcol_id, &pcol_id) )
+    {
+        dprintk(XENLOG_ERR, "vITS: MOVI: with wrong Collection %d\n", vcol_id);
+        return 1;
+    }
+
+    if ( vgic_its_check_device_id(v, dev, id) )
+    {
+        dprintk(XENLOG_ERR, "vITS: MOVI: Invalid ID %d\n", id);
+        return 1;
+    }
+
+    its_encode_cmd(phys_cmd, GITS_CMD_MOVI);
+    its_encode_devid(phys_cmd, dev_id);
+    its_encode_event_id(phys_cmd, id);
+    its_encode_collection(phys_cmd, pcol_id);
+
+    return 0;
+}
+   
+static int vgic_its_build_discard_cmd(struct vcpu *v,
+                                      struct vgic_its *vits,
+                                      struct its_cmd_block *virt_cmd,
+                                      struct its_cmd_block *phys_cmd)
+{
+    struct its_device *dev;
+    uint32_t id = its_decode_event_id(virt_cmd);
+    uint32_t dev_id = its_decode_devid(virt_cmd);
+
+    DPRINTK("vITS: DISCARD: dev_id 0x%x id %d\n", dev_id, id);
+    /* Search if device entry exists */
+    dev = vgic_its_check_device(v, dev_id);
+    if ( dev == NULL )
+    {
+        dprintk(XENLOG_ERR, "vITS: DISCARD: Failed to find device 0x%x\n",
+                dev_id);
+        return 1;
+    }
+
+    if ( vgic_its_check_device_id(v, dev, id) )
+    {
+        dprintk(XENLOG_ERR, "vITS: DISCARD: Invalid vID %d\n", id);
+        return 1;
+    }
+
+    /* Check if a PID exists for this VID on this device and unmap it */
+    vgic_its_unmap_id(v, dev, id, 0);
+
+    /* Fetch and encode cmd */
+    its_encode_cmd(phys_cmd, GITS_CMD_DISCARD);
+    its_encode_devid(phys_cmd, its_decode_devid(virt_cmd));
+    its_encode_event_id(phys_cmd, its_decode_event_id(virt_cmd));
+
+    return 0;
+}
+
+static int vgic_its_build_inv_cmd(struct vcpu *v,
+                                  struct vgic_its *vits,
+                                  struct its_cmd_block *virt_cmd,
+                                  struct its_cmd_block *phys_cmd)
+{
+    struct its_device *dev;
+    uint32_t dev_id = its_decode_devid(virt_cmd);
+    uint32_t id = its_decode_event_id(virt_cmd);
+
+    DPRINTK("vITS: INV: dev_id 0x%x id %d\n",dev_id, id);
+    /* Search if device entry exists */
+    dev = vgic_its_check_device(v, dev_id);
+    if ( dev == NULL )
+    {
+        dprintk(XENLOG_ERR, "vITS: INV: Failed to find device 0x%x\n", dev_id);
+        return 1;
+    }
+
+    if ( vgic_its_check_device_id(v, dev, id) )
+    {
+        dprintk(XENLOG_ERR, "vITS: INV: Invalid ID %d\n", id);
+        return 1;
+    }
+
+    its_encode_cmd(phys_cmd, GITS_CMD_INV);
+    its_encode_devid(phys_cmd, dev_id);
+    its_encode_event_id(phys_cmd, id);
+
+    return 0;
+}
+
+static int vgic_its_build_clear_cmd(struct vcpu *v,
+                                    struct vgic_its *vits,
+                                    struct its_cmd_block *virt_cmd,
+                                    struct its_cmd_block *phys_cmd)
+{
+    struct its_device *dev;
+    uint32_t dev_id = its_decode_devid(virt_cmd);
+    uint32_t id = its_decode_event_id(virt_cmd);
+
+    DPRINTK("vITS: CLEAR: dev_id 0x%x id %d\n", dev_id, id);
+    /* Search if device entry exists */
+    dev = vgic_its_check_device(v, dev_id);
+    if ( dev == NULL )
+    {
+        dprintk(XENLOG_ERR, "vITS: CLEAR: Fail to find device 0x%x\n", dev_id);
+        return 1;
+    }
+
+    if ( vgic_its_check_device_id(v, dev, id) )
+    {
+        dprintk(XENLOG_ERR, "vITS: CLEAR: Invalid ID %d\n", id);
+        return 1;
+    }
+
+    its_encode_cmd(phys_cmd, GITS_CMD_CLEAR);
+    its_encode_devid(phys_cmd, dev_id);
+    its_encode_event_id(phys_cmd, id);
+    return 0;
+}
+
+static int vgic_its_build_invall_cmd(struct vcpu *v,
+                                     struct vgic_its *vits,
+                                     struct its_cmd_block *virt_cmd,
+                                     struct its_cmd_block *phys_cmd)
+{
+    uint32_t pcol_id;
+    uint8_t vcol_id = its_decode_collection(virt_cmd);
+
+    DPRINTK("vITS: INVALL: vCID %d\n", vcol_id);
+    /* Check if Collection id exists */
+    if ( vgic_its_check_cid(v, vits, vcol_id, &pcol_id) )
+    {
+        dprintk(XENLOG_ERR, "vITS: INVALL: Wrong Collection %d\n", vcol_id);
+        return 1;
+    }
+
+    its_encode_cmd(phys_cmd, GITS_CMD_INVALL);
+    its_encode_collection(phys_cmd, pcol_id);
+
+    return 0;
+}
+
+static int vgic_its_build_int_cmd(struct vcpu *v,
+                                  struct vgic_its *vits,
+                                  struct its_cmd_block *virt_cmd,
+                                  struct its_cmd_block *phys_cmd)
+{
+    uint32_t dev_id = its_decode_devid(virt_cmd);
+    struct its_device *dev;
+    uint32_t id = its_decode_event_id(virt_cmd);
+
+    DPRINTK("vITS: INT: Device 0x%x id %d\n", its_decode_devid(virt_cmd), id);
+    /* Search if device entry exists */
+    dev = vgic_its_check_device(v, dev_id);
+    if ( dev == NULL )
+    {
+        dprintk(XENLOG_ERR, "vITS: INT: Failed to find device 0x%x\n", dev_id);
+        return 1;
+    }
+
+    if ( vgic_its_check_device_id(v, dev, id) )
+    {
+        dprintk(XENLOG_ERR, "vITS: INT: Invalid ID %d\n", id);
+        return 1;
+    }
+
+    its_encode_cmd(phys_cmd, GITS_CMD_INT);
+    its_encode_devid(phys_cmd, its_decode_devid(virt_cmd));
+    its_encode_event_id(phys_cmd, its_decode_event_id(virt_cmd));
+
+    return 0;
+}
+
+static void vgic_its_free_device(struct its_device *dev)
+{
+        xfree(dev);
+}
+
+static int vgic_its_add_device(struct vcpu *v, struct vgic_its *vits,
+                               struct its_cmd_block *virt_cmd)
+{
+    struct domain *d = v->domain;
+    struct its_device *dev;
+    int lpi_base, nr_lpis, nr_vecs;
+
+    /* Allocate device only if valid bit is set */
+    if ( its_decode_valid(virt_cmd) )
+    {
+        dev = xzalloc(struct its_device);
+        if ( dev == NULL )
+           return ENOMEM;
+
+        spin_lock(&d->arch.vits_devs.lock);
+        dev->device_id = its_decode_devid(virt_cmd);
+        dev->itt_size = its_decode_size(virt_cmd);
+        dev->itt_addr = its_decode_itt(virt_cmd);
+        INIT_LIST_HEAD(&dev->entry);
+        /* TODO: use pci_conf_read() to read MSI vectors count */
+        nr_vecs = 32;
+        dev->lpi_map = its_lpi_alloc_chunks(nr_vecs, &lpi_base, &nr_lpis);
+        dev->lpi_base = lpi_base;
+        dev->nr_lpis = nr_lpis;
+        spin_lock_init(&dev->vlpi_lock);
+        dev->vlpi_entries = xzalloc_array(struct vid_map, nr_lpis);
+        if ( dev->vlpi_entries == NULL )
+        {
+            spin_unlock(&d->arch.vits_devs.lock);
+            return ENOMEM;
+        }
+        dev->vlpi_map = xzalloc_bytes(nr_lpis/8);
+        if ( dev->vlpi_map == NULL )
+        {
+            spin_unlock(&d->arch.vits_devs.lock);
+            return ENOMEM;
+        }
+
+        /*
+         * TODO: Get ITS node of this pci device.
+         * Update with proper helper function after PCI-passthrough support
+         */
+        dev->its = its_get_phys_node(dev->device_id);
+        dev->vits = vits;
+        list_add(&dev->entry, &d->arch.vits_devs.dev_list);
+        spin_unlock(&d->arch.vits_devs.lock);
+        DPRINTK("vITS: Added device dev_id 0x%x\n", its_decode_devid(virt_cmd));
+    }
+    else
+    {
+        spin_lock(&d->arch.vits_devs.lock);
+        /* Search if device entry exists */
+        dev = vgic_its_check_device(v, its_decode_devid(virt_cmd));
+        if ( dev == NULL )
+        {
+            dprintk(XENLOG_ERR, "vITS: Failed to find device 0x%x\n",
+                    its_decode_devid(virt_cmd));
+            spin_unlock(&d->arch.vits_devs.lock);
+            return 1;
+        }
+
+        /* Clear all lpis of this device */
+        vgic_its_unmap_id(v, dev, 0, 1);
+
+        list_del(&dev->entry);
+        vgic_its_free_device(dev);
+        spin_unlock(&d->arch.vits_devs.lock);
+        DPRINTK("vITS: Removed device dev_id 0x%x\n", its_decode_devid(virt_cmd));
+    }
+
+    return 0;
+}
+
+static int vgic_its_process_mapc(struct vcpu *v, struct vgic_its *vits,
+                                 struct its_cmd_block *virt_cmd)
+{
+    uint32_t pcid = 0;
+    int idx;
+    uint32_t nmap;
+    uint8_t vcol_id;
+    uint64_t vta = 0;
+
+    nmap = vits->cid_map.nr_cid;
+    vcol_id = its_decode_collection(virt_cmd);
+    vta = its_decode_target(virt_cmd);
+
+    for ( idx = 0; idx < nmap; idx++ )
+    {
+        if ( vcol_id == vits->cid_map.vcid[idx] )
+            break;
+    }
+    if ( idx == nmap )
+        vits->cid_map.vcid[idx] = vcol_id;
+
+    if ( its_get_physical_cid(v->domain, &pcid, vta) )
+        BUG_ON(1);
+    vits->cid_map.pcid[idx] = pcid;
+    vits->cid_map.vta[idx] = vta;
+    vits->cid_map.nr_cid++;
+    DPRINTK("vITS: MAPC: vCID %d vTA 0x%lx added @idx 0x%x \n",
+             vcol_id, vta, idx);
+
+    return 0;
+}
+
+static void vgic_its_update_read_ptr(struct vcpu *v, struct vgic_its *vits)
+{
+    vits->cmd_read = vits->cmd_write;
+}
+
+#ifdef DEBUG_ITS
+char *cmd_str[] = {
+        [GITS_CMD_MOVI]    = "MOVI",
+        [GITS_CMD_INT]     = "INT",
+        [GITS_CMD_CLEAR]   = "CLEAR",
+        [GITS_CMD_SYNC]    = "SYNC",
+        [GITS_CMD_MAPD]    = "MAPD",
+        [GITS_CMD_MAPC]    = "MAPC",
+        [GITS_CMD_MAPVI]   = "MAPVI",
+        [GITS_CMD_MAPI]    = "MAPI",
+        [GITS_CMD_INV]     = "INV",
+        [GITS_CMD_INVALL]  = "INVALL",
+        [GITS_CMD_MOVALL]  = "MOVALL",
+        [GITS_CMD_DISCARD] = "DISCARD",
+    };
+#endif
+
+#define SEND_NONE 0x0
+#define SEND_CMD 0x1
+#define SEND_ALL 0x2
+
+static int vgic_its_parse_its_command(struct vcpu *v, struct vgic_its *vits,
+                                      struct its_cmd_block *virt_cmd)
+{
+    uint8_t cmd = its_decode_cmd(virt_cmd);
+    struct its_cmd_block phys_cmd;
+    int ret;
+    int send_flag = SEND_CMD;
+
+#ifdef DEBUG_ITS
+    DPRINTK("vITS: Received cmd %s (0x%x)\n", cmd_str[cmd], cmd);
+    DPRINTK("Dump Virt cmd: ");
+    dump_cmd(virt_cmd);
+#endif
+
+    memset(&phys_cmd, 0x0, sizeof(struct its_cmd_block));
+    switch ( cmd )
+    {
+    case GITS_CMD_MAPD:
+        /* create virtual device entry */
+        if ( vgic_its_add_device(v, vits, virt_cmd) )
+            return ENODEV;
+        ret = vgic_its_build_mapd_cmd(v, virt_cmd, &phys_cmd);
+        break;
+    case GITS_CMD_MAPC:
+        /* Physical ITS driver already mapped physical Collection */
+        send_flag = SEND_NONE;
+        ret =  vgic_its_process_mapc(v, vits, virt_cmd);
+        break;
+    case GITS_CMD_MAPI:
+        /* MAPI is same as MAPVI */
+    case GITS_CMD_MAPVI:
+        ret = vgic_its_build_mapvi_cmd(v, vits, virt_cmd, &phys_cmd);
+        break;
+    case GITS_CMD_MOVI:
+        ret = vgic_its_build_movi_cmd(v, vits, virt_cmd, &phys_cmd);
+        break;
+    case GITS_CMD_DISCARD:
+        ret = vgic_its_build_discard_cmd(v, vits, virt_cmd, &phys_cmd);
+        break;
+    case GITS_CMD_INV:
+        ret = vgic_its_build_inv_cmd(v, vits, virt_cmd, &phys_cmd);
+        break;
+    case GITS_CMD_INVALL:
+        /* XXX: INVALL is sent on all physical ITS */
+        send_flag = SEND_ALL;
+        ret = vgic_its_build_invall_cmd(v, vits, virt_cmd, &phys_cmd);
+        break;
+    case GITS_CMD_INT:
+        ret = vgic_its_build_int_cmd(v, vits, virt_cmd, &phys_cmd);
+        break;
+    case GITS_CMD_CLEAR:
+        ret = vgic_its_build_clear_cmd(v, vits, virt_cmd, &phys_cmd);
+        break;
+    case GITS_CMD_SYNC:
+        /* XXX: SYNC is sent on all physical ITS */
+        send_flag = SEND_ALL;
+        ret = vgic_its_build_sync_cmd(v, vits, virt_cmd, &phys_cmd);
+        break;
+        /*TODO:  GITS_CMD_MOVALL not implemented */
+    default:
+       dprintk(XENLOG_ERR, "vITS: Unhandled command cmd %d\n", cmd);
+       return 1;
+    }
+
+#ifdef DEBUG_ITS
+    DPRINTK("Dump Phys cmd: ");
+    dump_cmd(&phys_cmd);
+#endif
+
+    if ( ret )
+    {
+       dprintk(XENLOG_ERR, "vITS: Failed to handle cmd %d\n", cmd);
+       return 1;
+    }
+
+    if ( send_flag )
+    {
+       /* XXX: Always send on physical ITS on which device is assigned */
+       if ( !gic_its_send_cmd(v,
+             its_get_phys_node(its_decode_devid(&phys_cmd)),
+             &phys_cmd, (send_flag & SEND_ALL)) )
+       {
+           dprintk(XENLOG_ERR, "vITS: Failed to push cmd %d\n", cmd);
+           return 1;
+       }
+    }
+
+    return 0;
+}
+
+/* Called with its lock held */
+static int vgic_its_read_virt_cmd(struct vcpu *v,
+                                  struct vgic_its *vits,
+                                  struct its_cmd_block *virt_cmd)
+{
+    struct page_info * page;
+    void *p;
+    paddr_t paddr;
+    paddr_t maddr;
+    uint64_t offset;
+
+    /* CMD Q can be more than 1 page. Map only page that is required */
+    maddr = ((vits->cmd_base & 0xfffffffff000UL) +
+              vits->cmd_write_save ) & PAGE_MASK;
+
+    paddr = p2m_lookup(v->domain, maddr, NULL);
+
+    DPRINTK("vITS: Mapping CMD Q maddr 0x%lx paddr 0x%lx write_save 0x%lx \n",
+            maddr, paddr, vits->cmd_write_save);
+    page = get_page_from_paddr(v->domain, paddr, 0);
+    if ( page == NULL )
+    {
+        dprintk(XENLOG_ERR, "vITS: Failed to get command page\n");
+        return 1;
+    }
+
+    p = __map_domain_page(page);
+
+    /* Offset within the mapped 4K page to read */
+    offset = vits->cmd_write_save & 0xfff;
+
+    memcpy(virt_cmd, p + offset, sizeof(struct its_cmd_block));
+
+    /* No command queue is created by vits to check on Q full */
+    vits->cmd_write_save += 0x20;
+    if ( vits->cmd_write_save == vits->cmd_qsize )
+    {
+        DPRINTK("vITS: Reset write_save 0x%lx qsize 0x%lx\n",
+                vits->cmd_write_save,
+                vits->cmd_qsize);
+        vits->cmd_write_save = 0x0;
+    }
+
+    unmap_domain_page(p);
+    put_page(page);
+
+    return 0;
+}
+
+int vgic_its_process_cmd(struct vcpu *v, struct vgic_its *vits)
+{
+    struct its_cmd_block virt_cmd;
+
+    /* XXX: Currently we are processing one cmd at a time */
+    ASSERT(spin_is_locked(&vits->lock));
+
+    do {
+        if ( vgic_its_read_virt_cmd(v, vits, &virt_cmd) )
+            goto err;
+        if ( vgic_its_parse_its_command(v, vits, &virt_cmd) )
+            goto err;
+    } while ( vits->cmd_write != vits->cmd_write_save );
+
+    vits->cmd_write_save = vits->cmd_write;
+    DPRINTK("vITS: write_save 0x%lx write 0x%lx \n",
+            vits->cmd_write_save,
+            vits->cmd_write);
+    /* XXX: Currently we are processing one cmd at a time */
+    vgic_its_update_read_ptr(v, vits);
+
+    dsb(ishst);
+
+    return 1;
+err:
+    dprintk(XENLOG_ERR, "vITS: Failed to process guest cmd\n");
+    return 0;
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/include/asm-arm/domain.h b/xen/include/asm-arm/domain.h
index 9e0419e..bc7aee9 100644
--- a/xen/include/asm-arm/domain.h
+++ b/xen/include/asm-arm/domain.h
@@ -114,6 +114,15 @@ struct arch_domain
 #endif
     } vgic;
 
+    struct vgic_its *vits;
+    struct vgic_lpi_conf *lpi_conf;
+
+    struct vits_devs {
+        spinlock_t lock;
+        /* ITS Device list */
+        struct list_head dev_list;
+    } vits_devs;
+
     struct vuart {
 #define VUART_BUF_SIZE 128
         char                        *buf;
diff --git a/xen/include/asm-arm/gic-its.h b/xen/include/asm-arm/gic-its.h
index fa1e305..70ec913 100644
--- a/xen/include/asm-arm/gic-its.h
+++ b/xen/include/asm-arm/gic-its.h
@@ -22,6 +22,72 @@
 #ifndef __ASM_ARM_GIC_ITS_H__
 #define __ASM_ARM_GIC_ITS_H__
 
+#include <asm/gic_v3_defs.h>
+
+struct its_node;
+
+/* Collection ID mapping */
+struct cid_mapping
+{
+    uint8_t nr_cid;
+    /* XXX: assume one collection id per vcpu. can set to MAX_VCPUS? */
+    /* Virtual Collection id */
+    uint8_t vcid[32];
+    /* Physical Collection id */
+    uint8_t pcid[32];
+    /* Virtual target address of this collection id */
+    uint64_t vta[32];
+};
+
+/*
+ * Per domain virtual ITS structure.
+ * One per Physical ITS node available for the domain
+ */
+
+struct vgic_its
+{
+   spinlock_t lock;
+   /* Emulation of BASER */
+   paddr_t baser[8];
+   /* Command queue base */
+   paddr_t cmd_base;
+   /* Command queue write pointer */
+   paddr_t cmd_write;
+   /* Command queue write saved pointer */
+   paddr_t cmd_write_save;
+   /* Command queue read pointer */
+   paddr_t cmd_read;
+   /* Command queue size */
+   unsigned long cmd_qsize;
+   /* ITS mmio physical base */
+   paddr_t phys_base;
+   /* ITS mmio physical size */
+   unsigned long phys_size;
+   /* ITS physical node */
+   struct its_node *its;
+   /* GITS_CTLR register */
+   uint32_t ctrl;
+   /* Virtual to Physical Collection id mapping */
+   struct cid_mapping cid_map;
+};
+
+struct vgic_lpi_conf
+{
+   /* LPI propbase */
+   paddr_t propbase;
+   /* percpu pendbase */
+   paddr_t pendbase[MAX_VIRT_CPUS];
+   /* Virtual LPI property table */
+   void * prop_page;
+};
+
+struct vid_map
+{
+    uint32_t vlpi;
+    uint32_t plpi;
+    uint32_t id;
+};
+
 /*
  * The ITS command block, which is what the ITS actually parses.
  */
@@ -37,12 +103,21 @@ struct its_device {
         struct list_head        entry;
         struct its_node         *its;
         struct its_collection   *collection;
-        void                    *itt;
+        /* Virtual ITS node */
+        struct vgic_its         *vits;
+        paddr_t                 itt_addr;
+        unsigned long           itt_size;
         unsigned long           *lpi_map;
         u32                     lpi_base;
         int                     nr_lpis;
         u32                     nr_ites;
         u32                     device_id;
+        /* Spinlock for vlpi allocation */
+        spinlock_t              vlpi_lock;
+        /* vlpi bitmap */
+        unsigned long           *vlpi_map;
+        /* vlpi <=> plpi mapping */
+        struct vid_map          *vlpi_entries;
 };
 
 static inline uint8_t its_decode_cmd(struct its_cmd_block *cmd)
@@ -144,6 +219,15 @@ static inline void its_encode_collection(struct its_cmd_block *cmd, u16 col)
     cmd->raw_cmd[2] |= col;
 }
 
+int its_get_physical_cid(struct domain *d, uint32_t *col_id, uint64_t ta);
+int its_get_target(uint8_t pcid, uint64_t *pta);
+int its_alloc_device_irq(struct its_device *dev, uint32_t *plpi);
+int gic_its_send_cmd(struct vcpu *v, struct its_node *its,
+                     struct its_cmd_block *phys_cmd, int send_all);
+void its_lpi_free(unsigned long *bitmap, int base, int nr_ids);
+unsigned long *its_lpi_alloc_chunks(int nirqs, int *base, int *nr_ids);
+uint32_t its_get_pta_type(void);
+struct its_node * its_get_phys_node(uint32_t dev_id);
 #endif /* __ASM_ARM_GIC_ITS_H__ */
 
 /*
-- 
1.7.9.5


* [RFC PATCH v2 14/22] xen/arm: its: Add emulation of ITS control registers
  2015-03-19 14:37 [RFC PATCH v2 00/22] xen/arm: Add ITS support vijay.kilari
                   ` (12 preceding siblings ...)
  2015-03-19 14:38 ` [RFC PATCH v2 13/22] xen/arm: its: Add virtual ITS command support vijay.kilari
@ 2015-03-19 14:38 ` vijay.kilari
  2015-03-24 17:12   ` Julien Grall
  2015-03-19 14:38 ` [RFC PATCH v2 15/22] xen/arm: its: Add support to emulate GICR register for LPIs vijay.kilari
                   ` (9 subsequent siblings)
  23 siblings, 1 reply; 109+ messages in thread
From: vijay.kilari @ 2015-03-19 14:38 UTC (permalink / raw)
  To: Ian.Campbell, julien.grall, stefano.stabellini,
	stefano.stabellini, tim, xen-devel
  Cc: Prasun.Kapoor, Vijaya Kumar K, manish.jaggi, vijay.kilari

From: Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com>

Add support for emulating the GITS_* (ITS control) registers.

Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com>
---
v2: - Each Virtual ITS is attached to Physical ITS.
    - Introduce helper function to lock and unlock
      virtual ITS lock.
    - Introduced helper to get virtual ITS structure pointer
      based on emulation address.
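
Note on word-sized accesses: the handlers below accept both 32-bit and
64-bit accesses to GITS_CBASER, GITS_CWRITER and GITS_BASERn. The
following is a minimal standalone sketch, not code from this patch, of
how a 32-bit access can be folded into a 64-bit shadow register. The
helper names (reg64_write_lo and friends) are hypothetical. A
little-endian register layout is assumed, so offset +0 carries the low
word and offset +4 the high word. cbaser_queue_bytes() mirrors the
SZ_4K * ((val & 0xff) + 1) computation used for cmd_qsize.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: replace the low 32 bits of a 64-bit shadow register. */
static uint64_t reg64_write_lo(uint64_t reg, uint32_t val)
{
    return (reg & 0xffffffff00000000ULL) | val;
}

/* Hypothetical helper: replace the high 32 bits of a 64-bit shadow register. */
static uint64_t reg64_write_hi(uint64_t reg, uint32_t val)
{
    return (reg & 0x00000000ffffffffULL) | ((uint64_t)val << 32);
}

/* Word read at offset +0 returns the low half. */
static uint32_t reg64_read_lo(uint64_t reg)
{
    return (uint32_t)reg;
}

/* Word read at offset +4 returns the high half. */
static uint32_t reg64_read_hi(uint64_t reg)
{
    return (uint32_t)(reg >> 32);
}

/* Queue size as derived from GITS_CBASER[7:0]: (N + 1) pages of 4K. */
static uint64_t cbaser_queue_bytes(uint64_t cbaser)
{
    return 4096ULL * ((cbaser & 0xff) + 1);
}
```

Taking the value as a uint32_t discards whatever the upper half of the
guest's 64-bit register holds, so a word write cannot disturb the other
half of the shadow register.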
---
 xen/arch/arm/gic-v3-its.c     |    8 +
 xen/arch/arm/vgic-v3-its.c    |  412 +++++++++++++++++++++++++++++++++++++++++
 xen/include/asm-arm/gic-its.h |    1 +
 3 files changed, 421 insertions(+)
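
Note on the command queue: a guest write to GITS_CWRITER is what
triggers command processing in this series. As a rough illustration
only, the sketch below shows the ring arithmetic involved: commands are
32 bytes, the consumer offset wraps at the queue size, and only the 4K
page holding the next command needs to be mapped. All function names
here are hypothetical and 4K pages are assumed. It counts pending
commands up front, which is a different loop shape from the do/while in
vgic_its_process_cmd().

```c
#include <assert.h>
#include <stdint.h>

#define ITS_CMD_BYTES 32u     /* one ITS command occupies 32 bytes */
#define QPAGE_BYTES   4096u   /* assumption: 4K pages */

/* Offset of the command following 'off', wrapping at the end of the queue. */
static uint64_t cmdq_next_offset(uint64_t off, uint64_t qsize)
{
    off += ITS_CMD_BYTES;
    return (off >= qsize) ? 0 : off;
}

/* Number of commands between the read offset and the write offset. */
static uint64_t cmdq_pending(uint64_t rd, uint64_t wr, uint64_t qsize)
{
    uint64_t bytes = (wr >= rd) ? (wr - rd) : (qsize - rd + wr);

    return bytes / ITS_CMD_BYTES;
}

/* Address of the page that holds the command at queue offset 'off'. */
static uint64_t cmdq_page(uint64_t base, uint64_t off)
{
    return (base + off) & ~(uint64_t)(QPAGE_BYTES - 1);
}

/* Offset of that command within its page. */
static uint32_t cmdq_page_offset(uint64_t base, uint64_t off)
{
    return (uint32_t)((base + off) & (QPAGE_BYTES - 1));
}
```

With a two-page queue at 0x80000000, a read offset of 8160 and a write
offset of 32, two commands are pending: one at the end of the second
page and one at the start of the first.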

diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
index a9aab73..e382f8d 100644
--- a/xen/arch/arm/gic-v3-its.c
+++ b/xen/arch/arm/gic-v3-its.c
@@ -101,6 +101,8 @@ struct its_node {
 };
 
 uint32_t pta_type;
+/* Number of physical its nodes present */
+uint32_t nr_its = 0;
 
 #define ITS_ITT_ALIGN		SZ_256
 
@@ -146,6 +148,11 @@ uint32_t its_get_pta_type(void)
 	return pta_type;
 }
 
+uint32_t its_get_nr_its(void)
+{
+	return nr_its;
+}
+
 struct its_node * its_get_phys_node(uint32_t dev_id)
 {
 	struct its_node *its;
@@ -1170,6 +1177,7 @@ static int its_probe(struct dt_device_node *node)
 	}
 	spin_lock(&its_lock);
 	list_add(&its->entry, &its_nodes);
+	nr_its++;
 	spin_unlock(&its_lock);
 
 	return 0;
diff --git a/xen/arch/arm/vgic-v3-its.c b/xen/arch/arm/vgic-v3-its.c
index 7530a88..4d8945f 100644
--- a/xen/arch/arm/vgic-v3-its.c
+++ b/xen/arch/arm/vgic-v3-its.c
@@ -869,6 +869,418 @@ err:
     return 0;
 }
 
+struct vgic_its *its_to_vits(struct vcpu *v, paddr_t phys_base)
+{
+    struct vgic_its *vits = NULL;
+    int i;
+
+    /* Mask 64K offset */
+    phys_base = phys_base & ~(SZ_64K - 1);
+    if ( is_hardware_domain(v->domain) )
+    {
+        for ( i = 0; i < its_get_nr_its(); i++ )
+        {
+            if ( v->domain->arch.vits[i].phys_base == phys_base )
+            {
+                vits =  &v->domain->arch.vits[i];
+                break;
+            }
+        }
+    }
+    else
+        vits = &v->domain->arch.vits[0];
+
+    return vits;
+}
+
+static inline void vits_spin_lock(struct vgic_its *vits)
+{
+    spin_lock(&vits->lock);
+}
+
+static inline void vits_spin_unlock(struct vgic_its *vits)
+{
+    spin_unlock(&vits->lock);
+}
+
+static int vgic_v3_gits_mmio_read(struct vcpu *v, mmio_info_t *info)
+{
+    struct vgic_its *vits;
+    struct hsr_dabt dabt = info->dabt;
+    struct cpu_user_regs *regs = guest_cpu_user_regs();
+    register_t *r = select_user_reg(regs, dabt.reg);
+    uint64_t val = 0;
+    uint32_t index, gits_reg;
+
+    vits = its_to_vits(v, info->gpa);
+    BUG_ON(vits == NULL);
+
+    gits_reg = info->gpa - vits->phys_base;
+
+    if ( gits_reg >= SZ_64K )
+    {
+        gdprintk(XENLOG_G_WARNING, "vGITS: unknown gpa read address"
+                 " %"PRIpaddr"\n", info->gpa);
+        return 0;
+    }
+
+    switch ( gits_reg )
+    {
+    case GITS_CTLR:
+        if ( dabt.size != DABT_WORD ) goto bad_width;
+        *r = vits->ctrl;
+        return 1;
+    case GITS_IIDR:
+        goto read_as_zero; /* XXX: GITS_IIDR is not emulated yet */
+    case GITS_TYPER:
+        /* A word read at GITS_TYPER returns the lower 32 bits */
+        vits_spin_lock(vits);
+        val = ((its_get_pta_type() << VITS_GITS_TYPER_PTA_SHIFT) |
+               VITS_GITS_TYPER_HCC   | VITS_GITS_DEV_BITS |
+               VITS_GITS_ID_BITS     | VITS_GITS_ITT_SIZE |
+               VITS_GITS_DISTRIBUTED | VITS_GITS_PLPIS);
+        if ( dabt.size == DABT_DOUBLE_WORD )
+            *r = val;
+        else if ( dabt.size == DABT_WORD )
+            *r = (u32)val;
+        else
+        {
+            vits_spin_unlock(vits);
+            goto bad_width;
+        }
+        vits_spin_unlock(vits);
+        return 1;
+    case GITS_TYPER + 4:
+        if ( dabt.size != DABT_WORD ) goto bad_width;
+        vits_spin_lock(vits);
+        val = ((its_get_pta_type() << VITS_GITS_TYPER_PTA_SHIFT) |
+               VITS_GITS_TYPER_HCC   | VITS_GITS_DEV_BITS |
+               VITS_GITS_ID_BITS     | VITS_GITS_ITT_SIZE |
+               VITS_GITS_DISTRIBUTED | VITS_GITS_PLPIS);
+        *r = (u32)(val >> 32);
+        vits_spin_unlock(vits);
+        return 1;
+    case 0x0010 ... 0x007c:
+    case 0xc000 ... 0xffcc:
+        /* Implementation defined -- read ignored */
+        dprintk(XENLOG_ERR,
+                "vGITS: read of implementation defined reg r%d offset %#08x\n",
+                dabt.reg, gits_reg);
+        goto read_as_zero;
+    case GITS_CBASER:
+        vits_spin_lock(vits);
+        if ( dabt.size == DABT_DOUBLE_WORD )
+            *r = vits->cmd_base & 0xc7ffffffffffffffUL;
+        else if ( dabt.size == DABT_WORD )
+            *r = (u32)vits->cmd_base;
+        else
+        {
+            vits_spin_unlock(vits);
+            goto bad_width;
+        }
+        vits_spin_unlock(vits);
+        return 1;
+    case GITS_CBASER + 4:
+         /* CBASER support word read */
+        if (dabt.size != DABT_WORD ) goto bad_width;
+        vits_spin_lock(vits);
+        *r = (u32)(vits->cmd_base >> 32);
+        vits_spin_unlock(vits);
+        return 1;
+    case GITS_CWRITER:
+        vits_spin_lock(vits);
+        if ( dabt.size == DABT_DOUBLE_WORD )
+            *r = vits->cmd_write;
+        else if ( dabt.size == DABT_WORD )
+            *r = (u32)vits->cmd_write;
+        else
+        {
+            vits_spin_unlock(vits);
+            goto bad_width;
+        }
+        vits_spin_unlock(vits);
+        return 1;
+    case GITS_CWRITER + 4:
+         /* CWRITER support word read */
+        if ( dabt.size != DABT_WORD ) goto bad_width;
+        vits_spin_lock(vits);
+        *r = (u32)(vits->cmd_write >> 32);
+        vits_spin_unlock(vits);
+        return 1;
+    case GITS_CREADR:
+        vits_spin_lock(vits);
+        if ( dabt.size == DABT_DOUBLE_WORD )
+            *r = vits->cmd_read;
+        else if ( dabt.size == DABT_WORD )
+            *r = (u32)vits->cmd_read;
+        else
+        {
+            vits_spin_unlock(vits);
+            goto bad_width;
+        }
+        vits_spin_unlock(vits);
+        return 1;
+    case GITS_CREADR + 4:
+         /* CREADR support word read */
+        if ( dabt.size != DABT_WORD ) goto bad_width;
+        vits_spin_lock(vits);
+        *r = (u32)(vits->cmd_read >> 32);
+        vits_spin_unlock(vits);
+        return 1;
+    case 0x0098 ... 0x009c:
+    case 0x00a0 ... 0x00fc:
+    case 0x0140 ... 0xbffc:
+        /* Reserved -- read ignored */
+        dprintk(XENLOG_ERR,
+                "vGITS: read unknown 0x0098-9c or 0x00a0-fc r%d offset %#08x\n",
+                dabt.reg, gits_reg);
+        goto read_as_zero;
+    case GITS_BASER ... GITS_BASERN:
+        vits_spin_lock(vits);
+        index = (gits_reg - GITS_BASER) / 8;
+        if ( dabt.size == DABT_DOUBLE_WORD )
+            *r = vits->baser[index];
+        else if ( dabt.size == DABT_WORD )
+        {
+            if ( (gits_reg % 8) == 0 )
+                *r = (u32)vits->baser[index];
+            else
+                *r = (u32)(vits->baser[index] >> 32);
+        }
+        else
+        {
+            vits_spin_unlock(vits);
+            goto bad_width;
+        }
+        vits_spin_unlock(vits);
+        return 1;
+    case GITS_PIDR0:
+        if ( dabt.size != DABT_WORD ) goto bad_width;
+        *r = GITS_PIDR0_VAL;
+        return 1;
+    case GITS_PIDR1:
+        if ( dabt.size != DABT_WORD ) goto bad_width;
+        *r = GITS_PIDR1_VAL;
+        return 1;
+    case GITS_PIDR2:
+        if ( dabt.size != DABT_WORD ) goto bad_width;
+        *r = GITS_PIDR2_VAL;
+        return 1;
+    case GITS_PIDR3:
+        if ( dabt.size != DABT_WORD ) goto bad_width;
+        *r = GITS_PIDR3_VAL;
+        return 1;
+    case GITS_PIDR4:
+        if ( dabt.size != DABT_WORD ) goto bad_width;
+        *r = GITS_PIDR4_VAL;
+        return 1;
+    case GITS_PIDR5 ... GITS_PIDR7:
+        goto read_as_zero;
+    default:
+        dprintk(XENLOG_ERR, "vGITS: unhandled read r%d offset %#08x\n",
+               dabt.reg, gits_reg);
+        return 0;
+    }
+
+bad_width:
+    dprintk(XENLOG_ERR, "vGITS: bad read width %d r%d offset %#08x\n",
+           dabt.size, dabt.reg, gits_reg);
+    domain_crash_synchronous();
+    return 0;
+
+read_as_zero:
+    if ( dabt.size != DABT_WORD ) goto bad_width;
+    *r = 0;
+    return 1;
+}
+
+static int vgic_v3_gits_mmio_write(struct vcpu *v, mmio_info_t *info)
+{
+    struct vgic_its *vits;
+    struct hsr_dabt dabt = info->dabt;
+    struct cpu_user_regs *regs = guest_cpu_user_regs();
+    register_t *r = select_user_reg(regs, dabt.reg);
+    int ret;
+    uint32_t index, gits_reg;
+    uint64_t val;
+
+    vits = its_to_vits(v, info->gpa);
+    BUG_ON(vits == NULL);
+
+    gits_reg = info->gpa - vits->phys_base;
+
+    if ( gits_reg >= SZ_64K )
+    {
+        gdprintk(XENLOG_G_WARNING, "vGIC-ITS: unknown gpa write address"
+                 " %"PRIpaddr"\n", info->gpa);
+        return 0;
+    }
+
+    switch ( gits_reg )
+    {
+    case GITS_CTLR:
+        if ( dabt.size != DABT_WORD ) goto bad_width;
+        vits_spin_lock(vits);
+        vits->ctrl = *r;
+        vits_spin_unlock(vits);
+        return 1;
+    case GITS_IIDR:
+        /* RO -- write ignored */
+        goto write_ignore;
+    case GITS_TYPER:
+    case GITS_TYPER + 4:
+        /* RO -- write ignored */
+        goto write_ignore;
+    case 0x0010 ... 0x007c:
+    case 0xc000 ... 0xffcc:
+        /* Implementation defined -- write ignored */
+        dprintk(XENLOG_ERR,
+                "vGITS: write to implementation defined reg r%d offset %#08x\n",
+                dabt.reg, gits_reg);
+        goto write_ignore;
+    case GITS_CBASER:
+        if ( dabt.size == DABT_BYTE ) goto bad_width;
+        vits_spin_lock(vits);
+        if ( dabt.size == DABT_DOUBLE_WORD )
+            vits->cmd_base = *r;
+        else
+        {
+            val = vits->cmd_base & 0xffffffff00000000UL;
+            val = (*r & 0xffffffffUL) | val;
+            vits->cmd_base =  val;
+        }
+        vits->cmd_qsize  =  SZ_4K * ((*r & 0xff) + 1);
+        vits_spin_unlock(vits);
+        return 1;
+    case GITS_CBASER + 4:
+        /* CBASER supports word write */
+        if ( dabt.size != DABT_WORD ) goto bad_width;
+        vits_spin_lock(vits);
+        val = vits->cmd_base & 0xffffffffUL;
+        val = ((*r & 0xffffffffUL) << 32 ) | val;
+        vits->cmd_base =  val;
+        /* No Need to update cmd_qsize with higher word write */
+        vits_spin_unlock(vits);
+        return 1;
+    case GITS_CWRITER:
+        if ( dabt.size == DABT_BYTE ) goto bad_width;
+        vits_spin_lock(vits);
+        if ( dabt.size == DABT_DOUBLE_WORD )
+            vits->cmd_write = *r;
+        else
+        {
+            val = vits->cmd_write & 0xffffffff00000000UL;
+            val = (*r & 0xffffffffUL) | val;
+            vits->cmd_write =  val;
+        }
+        ret = vgic_its_process_cmd(v, vits);
+        vits_spin_unlock(vits);
+        return ret;
+    case GITS_CWRITER + 4:
+        if (dabt.size != DABT_WORD ) goto bad_width;
+        vits_spin_lock(vits);
+        val = vits->cmd_write & 0xffffffffUL;
+        val = ((*r & 0xffffffffUL) << 32) | val;
+        vits->cmd_write =  val;
+        ret = vgic_its_process_cmd(v, vits);
+        vits_spin_unlock(vits);
+        return ret;
+    case GITS_CREADR:
+        /* RO -- write ignored */
+        goto write_ignore;
+    case 0x0098 ... 0x009c:
+    case 0x00a0 ... 0x00fc:
+    case 0x0140 ... 0xbffc:
+        /* Reserved -- write ignored */
+        dprintk(XENLOG_ERR,
+                "vGITS: write to unknown 0x98-9c or 0xa0-fc r%d offset %#08x\n",
+                dabt.reg, gits_reg);
+        goto write_ignore;
+    case GITS_BASER ... GITS_BASERN:
+        /* Nothing to do with these values. Just store and emulate */
+        vits_spin_lock(vits);
+        index = (gits_reg - GITS_BASER) / 8;
+        if ( dabt.size == DABT_DOUBLE_WORD )
+            vits->baser[index] = *r;
+        else if ( dabt.size == DABT_WORD )
+        {
+            if ( (gits_reg % 8) == 0 )
+            {
+                val = vits->baser[index] & 0xffffffff00000000UL;
+                val = (*r & 0xffffffffUL) | val;
+                vits->baser[index] = val;
+            }
+            else
+            {
+                val = vits->baser[index] & 0xffffffffUL;
+                val = ((*r & 0xffffffffUL) << 32) | val;
+                vits->baser[index] = val;
+            }
+        }
+        else
+        {
+            vits_spin_unlock(vits);
+            goto bad_width;
+        }
+        vits_spin_unlock(vits);
+        return 1;
+    case GITS_PIDR7 ... GITS_PIDR0:
+        /* RO -- write ignored */
+        goto write_ignore;
+    default:
+        dprintk(XENLOG_ERR, "vGITS: unhandled write r%d offset %#08x\n",
+                dabt.reg, gits_reg);
+        return 0;
+    }
+
+bad_width:
+    dprintk(XENLOG_ERR, "vGITS: bad write width %d r%d offset %#08x\n",
+           dabt.size, dabt.reg, gits_reg);
+    domain_crash_synchronous();
+    return 0;
+
+write_ignore:
+    if ( dabt.size != DABT_WORD ) goto bad_width;
+    /* The write is dropped; the guest's source register is left untouched */
+    return 1;
+}
+
+static const struct mmio_handler_ops vgic_gits_mmio_handler = {
+    .read_handler  = vgic_v3_gits_mmio_read,
+    .write_handler = vgic_v3_gits_mmio_write,
+};
+
+int vgic_its_domain_init(struct domain *d)
+{
+    uint32_t num_its;
+    int i;
+
+    num_its =  its_get_nr_its();
+
+    d->arch.vits = xzalloc_array(struct vgic_its, num_its);
+    if ( d->arch.vits == NULL )
+        return -ENOMEM;
+
+    spin_lock_init(&d->arch.vits->lock);
+
+    spin_lock_init(&d->arch.vits_devs.lock);
+    INIT_LIST_HEAD(&d->arch.vits_devs.dev_list);
+
+    d->arch.lpi_conf = xzalloc(struct vgic_lpi_conf);
+    if ( d->arch.lpi_conf == NULL )
+         return -ENOMEM;
+
+    for ( i = 0; i < num_its; i++ )
+    {
+        spin_lock_init(&d->arch.vits[i].lock);
+        register_mmio_handler(d, &vgic_gits_mmio_handler,
+                              d->arch.vits[i].phys_base,
+                              SZ_64K);
+    }
+
+    return 0;
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/include/asm-arm/gic-its.h b/xen/include/asm-arm/gic-its.h
index 70ec913..82cfbdc 100644
--- a/xen/include/asm-arm/gic-its.h
+++ b/xen/include/asm-arm/gic-its.h
@@ -227,6 +227,7 @@ int gic_its_send_cmd(struct vcpu *v, struct its_node *its,
 void its_lpi_free(unsigned long *bitmap, int base, int nr_ids);
 unsigned long *its_lpi_alloc_chunks(int nirqs, int *base, int *nr_ids);
 uint32_t its_get_pta_type(void);
+uint32_t its_get_nr_its(void);
 struct its_node * its_get_phys_node(uint32_t dev_id);
 #endif /* __ASM_ARM_GIC_ITS_H__ */
 
-- 
1.7.9.5


* [RFC PATCH v2 15/22] xen/arm: its: Add support to emulate GICR register for LPIs
  2015-03-19 14:37 [RFC PATCH v2 00/22] xen/arm: Add ITS support vijay.kilari
                   ` (13 preceding siblings ...)
  2015-03-19 14:38 ` [RFC PATCH v2 14/22] xen/arm: its: Add emulation of ITS control registers vijay.kilari
@ 2015-03-19 14:38 ` vijay.kilari
  2015-03-27 15:46   ` Julien Grall
  2015-03-19 14:38 ` [RFC PATCH v2 16/22] xen/arm: its: implement hw_irq_controller " vijay.kilari
                   ` (8 subsequent siblings)
  23 siblings, 1 reply; 109+ messages in thread
From: vijay.kilari @ 2015-03-19 14:38 UTC (permalink / raw)
  To: Ian.Campbell, julien.grall, stefano.stabellini,
	stefano.stabellini, tim, xen-devel
  Cc: Prasun.Kapoor, Vijaya Kumar K, manish.jaggi, vijay.kilari

From: Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com>

Add emulation of the GICR registers used for LPIs, along with
emulation of the LPI property table.

The domain's LPI property table is unmapped when GICR_PROPBASER is
updated during domain initialization, so that subsequent reads and
writes of the table are trapped.

Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com>
---
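
Note on the property table arithmetic: the sketch below is a standalone
illustration, not code from this patch, of the three small computations
the LPI property table emulation relies on. It assumes that LPI INTIDs
start at 8192, as in the GICv3 architecture (the patch uses NR_GIC_LPI,
whose value is not shown in this mail), that bit 0 of a property byte is
the enable bit, and that the table spans 2^(IDbits + 1) bytes as
computed in vgic_its_unmap_lpi_prop(). All names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define LPI_INTID_BASE 8192u  /* assumption: first LPI INTID in GICv3 */
#define LPI_CFG_ENABLE 0x1u   /* assumption: bit 0 of a property byte */

/* Span derived from GICR_PROPBASER[4:0] (IDbits), as in the patch. */
static uint64_t propbaser_table_bytes(uint64_t propbaser)
{
    return 1ULL << ((propbaser & 0x1f) + 1);
}

/* One property byte per LPI: byte offset N describes LPI (base + N). */
static uint32_t lpi_offset_to_vlpi(uint32_t offset)
{
    return offset + LPI_INTID_BASE;
}

enum lpi_action { LPI_NO_CHANGE, LPI_ENABLE, LPI_DISABLE };

/* Compare old and new property bytes and report an enable-bit transition. */
static enum lpi_action lpi_cfg_action(uint8_t old_cfg, uint8_t new_cfg)
{
    bool was_enabled = old_cfg & LPI_CFG_ENABLE;
    bool now_enabled = new_cfg & LPI_CFG_ENABLE;

    if ( was_enabled == now_enabled )
        return LPI_NO_CHANGE;

    return now_enabled ? LPI_ENABLE : LPI_DISABLE;
}
```

The virtual table is initialised to 0xa2 in this patch, which has bit 0
clear, so under these assumptions every LPI starts out disabled and the
first guest write of 0xa3 is reported as an enable.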
 xen/arch/arm/vgic-v3-its.c        |  144 +++++++++++++++++++++++++++++++++++++
 xen/arch/arm/vgic-v3.c            |   64 +++++++++++++----
 xen/include/asm-arm/domain.h      |    1 +
 xen/include/asm-arm/gic-its.h     |    1 +
 xen/include/asm-arm/gic.h         |    2 +
 xen/include/asm-arm/gic_v3_defs.h |    2 +
 6 files changed, 200 insertions(+), 14 deletions(-)

diff --git a/xen/arch/arm/vgic-v3-its.c b/xen/arch/arm/vgic-v3-its.c
index 4d8945f..f1d68d9 100644
--- a/xen/arch/arm/vgic-v3-its.c
+++ b/xen/arch/arm/vgic-v3-its.c
@@ -869,6 +869,150 @@ err:
     return 0;
 }
 
+/* Search device structure and get corresponding plpi */
+int vgic_its_get_pid(struct vcpu *v, uint32_t vlpi, uint32_t *plpi)
+{
+    struct domain *d = v->domain;
+    struct its_device *dev;
+    int i = 0;
+
+    spin_lock(&d->arch.vits_devs.lock);
+    list_for_each_entry( dev, &d->arch.vits_devs.dev_list, entry )
+    {
+        i = 0;
+        while ((i = find_next_bit(dev->vlpi_map, dev->nr_lpis, i)) < dev->nr_lpis )
+        {
+            if ( dev->vlpi_entries[i].vlpi == vlpi )
+            {
+                *plpi = dev->vlpi_entries[i].plpi;
+                spin_unlock(&d->arch.vits_devs.lock);
+                return 0;
+            }
+            i++;
+        }
+    }
+    spin_unlock(&d->arch.vits_devs.lock);
+
+    return 1;
+}
+
+static int vgic_v3_gits_lpi_mmio_read(struct vcpu *v, mmio_info_t *info)
+{
+    uint32_t offset;
+    struct hsr_dabt dabt = info->dabt;
+    struct cpu_user_regs *regs = guest_cpu_user_regs();
+    register_t *r = select_user_reg(regs, dabt.reg);
+    uint8_t cfg;
+
+    offset = info->gpa -
+             (v->domain->arch.lpi_conf->propbase & 0xfffffffff000UL);
+
+    if ( offset < SZ_64K )
+    {
+        DPRINTK("vITS: LPI Table read offset 0x%x\n", offset );
+        cfg = readb_relaxed(v->domain->arch.lpi_conf->prop_page + offset);
+        *r = cfg;
+        return 1;
+    }
+    else
+        dprintk(XENLOG_ERR, "vITS: LPI Table read with wrong offset 0x%x\n",
+                offset);
+
+    return 0;
+}
+
+static int vgic_v3_gits_lpi_mmio_write(struct vcpu *v, mmio_info_t *info)
+{
+    uint32_t offset;
+    uint32_t pid, vid;
+    uint8_t cfg;
+    bool_t enable;
+    struct hsr_dabt dabt = info->dabt;
+    struct cpu_user_regs *regs = guest_cpu_user_regs();
+    register_t *r = select_user_reg(regs, dabt.reg);
+
+    offset = info->gpa -
+             (v->domain->arch.lpi_conf->propbase & 0xfffffffff000UL);
+
+    vid = offset + NR_GIC_LPI;
+    if ( offset < SZ_64K )
+    {
+        DPRINTK("vITS: LPI Table write offset 0x%x\n", offset );
+        if ( vgic_its_get_pid(v, vid, &pid) )
+        {
+            dprintk(XENLOG_ERR, "vITS: pID not found for vid %d\n", vid);
+            return 0;
+        }
+
+        cfg = readb_relaxed(v->domain->arch.lpi_conf->prop_page + offset);
+        enable = *r & 0x1;
+
+        if ( enable && !(cfg & 0x1) )
+            vgic_its_enable_lpis(v, pid);
+        else if ( !enable && (cfg & 0x1) )
+            vgic_its_disable_lpis(v, pid);
+
+        /* Update virtual prop page */
+        writeb_relaxed((*r & 0xff),
+                        v->domain->arch.lpi_conf->prop_page + offset);
+        
+        return 1;
+    }
+    else
+        dprintk(XENLOG_ERR, "vITS: LPI Table write with wrong offset 0x%x\n",
+                offset);
+
+    return 0; 
+}
+
+static const struct mmio_handler_ops vgic_gits_lpi_mmio_handler = {
+    .read_handler  = vgic_v3_gits_lpi_mmio_read,
+    .write_handler = vgic_v3_gits_lpi_mmio_write,
+};
+
+int vgic_its_unmap_lpi_prop(struct vcpu *v)
+{
+    paddr_t maddr;
+    uint32_t lpi_size;
+    int i;
+    
+    maddr = v->domain->arch.lpi_conf->propbase & 0xfffffffff000UL;
+    lpi_size = 1UL << ((v->domain->arch.lpi_conf->propbase & 0x1f) + 1);
+
+    DPRINTK("vITS: Unmap guest LPI conf table maddr 0x%lx lpi_size 0x%x\n", 
+             maddr, lpi_size);
+
+    if ( lpi_size < SZ_64K )
+    {
+        dprintk(XENLOG_ERR, "vITS: LPI Prop page < 64K\n");
+        return 0;
+    }
+
+    /* XXX: As per 4.8.9, all re-distributors share a common LPI configuration
+     * table, so one set of mmio handlers to manage the table is enough.
+     */
+    for ( i = 0; i < lpi_size / PAGE_SIZE; i++ )
+        guest_physmap_remove_page(v->domain, paddr_to_pfn(maddr) + i,
+                            gmfn_to_mfn(v->domain, paddr_to_pfn(maddr) + i), 0);
+
+    /* Register mmio handlers for this region */
+    register_mmio_handler(v->domain, &vgic_gits_lpi_mmio_handler,
+                          maddr, lpi_size);
+
+    /* Allocate Virtual LPI Property table */
+    v->domain->arch.lpi_conf->prop_page =
+        alloc_xenheap_pages(get_order_from_bytes(lpi_size), 0);
+    if ( !v->domain->arch.lpi_conf->prop_page )
+    {
+        dprintk(XENLOG_ERR, "vITS: Failed to allocate LPI Prop page\n");
+        return 0;
+    }
+
+    memset(v->domain->arch.lpi_conf->prop_page, 0xa2, lpi_size);
+
+    return 1;
+}
+
 struct vgic_its *its_to_vits(struct vcpu *v, paddr_t phys_base)
 {
     struct vgic_its *vits = NULL;
diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c
index ec79c2a..e9ec7fa 100644
--- a/xen/arch/arm/vgic-v3.c
+++ b/xen/arch/arm/vgic-v3.c
@@ -30,6 +30,7 @@
 #include <asm/mmio.h>
 #include <asm/gic_v3_defs.h>
 #include <asm/gic.h>
+#include <asm/gic-its.h>
 #include <asm/vgic.h>
 
 /* GICD_PIDRn register values for ARM implementations */
@@ -99,20 +100,30 @@ static int __vgic_v3_rdistr_rd_mmio_read(struct vcpu *v, mmio_info_t *info,
     switch ( gicr_reg )
     {
     case GICR_CTLR:
-        /* We have not implemented LPI's, read zero */
-        goto read_as_zero_32;
+        /*
+         * Enable LPIs for ITS. Direct injection of LPIs
+         * by writing to GICR_{SET,CLR}LPIR is not supported
+         */
+        if ( dabt.size != DABT_WORD ) goto bad_width;
+        vgic_lock(v);
+        *r = v->domain->arch.vgic.gicr_ctlr;
+        vgic_unlock(v);
+        return 1;
     case GICR_IIDR:
         if ( dabt.size != DABT_WORD ) goto bad_width;
         *r = GICV3_GICR_IIDR_VAL;
         return 1;
     case GICR_TYPER:
-        if ( dabt.size != DABT_DOUBLE_WORD ) goto bad_width;
-        /* TBD: Update processor id in [23:8] when ITS support is added */
+        if ( dabt.size != DABT_WORD && dabt.size != DABT_DOUBLE_WORD )
+            goto bad_width;
+        /* XXX: Update processor id in [23:8] if GITS_TYPER: PTA is not set */
         aff = (MPIDR_AFFINITY_LEVEL(v->arch.vmpidr, 3) << 56 |
                MPIDR_AFFINITY_LEVEL(v->arch.vmpidr, 2) << 48 |
                MPIDR_AFFINITY_LEVEL(v->arch.vmpidr, 1) << 40 |
                MPIDR_AFFINITY_LEVEL(v->arch.vmpidr, 0) << 32);
+        /* Set LPI support */
+        aff |= (GICR_TYPER_DISTRIBUTED_IMP | GICR_TYPER_PLPIS);
         *r = aff;
 
         if ( v->arch.vgic.flags & VGIC_V3_RDIST_LAST )
             *r |= GICR_TYPER_LAST;
@@ -131,10 +142,13 @@ static int __vgic_v3_rdistr_rd_mmio_read(struct vcpu *v, mmio_info_t *info,
         /* WO. Read as zero */
         goto read_as_zero_64;
     case GICR_PROPBASER:
-        /* LPI's not implemented */
-        goto read_as_zero_64;
+        if ( dabt.size != DABT_DOUBLE_WORD ) goto bad_width;
+        /* Remove shareability attribute we don't want dom to flush */
+        *r = v->domain->arch.lpi_conf->propbase;
+        return 1;
     case GICR_PENDBASER:
-        /* LPI's not implemented */
+        if ( dabt.size != DABT_DOUBLE_WORD ) goto bad_width;
+        *r = v->domain->arch.lpi_conf->pendbase[v->vcpu_id];
-        goto read_as_zero_64;
+        return 1;
     case GICR_INVLPIR:
         /* WO. Read as zero */
@@ -209,8 +223,15 @@ static int __vgic_v3_rdistr_rd_mmio_write(struct vcpu *v, mmio_info_t *info,
     switch ( gicr_reg )
     {
     case GICR_CTLR:
-        /* LPI's not implemented */
-        goto write_ignore_32;
+        /*
+         * Enable LPI's for ITS. Direct injection of LPI
+         * by writing to GICR_{SET,CLR}LPIR are not supported
+         */
+        if ( dabt.size != DABT_WORD ) goto bad_width;
+        vgic_lock(v);
+        v->domain->arch.vgic.gicr_ctlr = (*r) & GICR_CTL_ENABLE;
+        vgic_unlock(v);
+        return 1;
     case GICR_IIDR:
         /* RO */
         goto write_ignore_32;
@@ -230,11 +251,27 @@ static int __vgic_v3_rdistr_rd_mmio_write(struct vcpu *v, mmio_info_t *info,
         /* LPI is not implemented */
         goto write_ignore_64;
     case GICR_PROPBASER:
-        /* LPI is not implemented */
-        goto write_ignore_64;
+        if ( dabt.size != DABT_DOUBLE_WORD ) goto bad_width;
+        vgic_lock(v);
+        /* LPI configuration tables are shared across cpus. Should be same */
+        if ( (v->domain->arch.lpi_conf->propbase != 0) &&
+             ((v->domain->arch.lpi_conf->propbase & 0xfffffffff000UL) != (*r & 0xfffffffff000UL)) )
+        {
+            vgic_unlock(v);
+            dprintk(XENLOG_ERR,
+                "vGICv3: vITS: Wrong configuration of LPI_PROPBASER\n");
+            return 0;
+        }
+        v->domain->arch.lpi_conf->propbase = *r;
+        vgic_unlock(v);
+        return vgic_its_unmap_lpi_prop(v);
     case GICR_PENDBASER:
-        /* LPI is not implemented */
-        goto write_ignore_64;
+        /* Just hold pendbaser value for guest read */
+        if ( dabt.size != DABT_DOUBLE_WORD ) goto bad_width;
+        vgic_lock(v);
+        v->domain->arch.lpi_conf->pendbase[v->vcpu_id] = *r;
+        vgic_unlock(v);
+        return 1;
     case GICR_INVLPIR:
         /* LPI is not implemented */
         goto write_ignore_64;
@@ -703,7 +739,7 @@ static int vgic_v3_distr_mmio_read(struct vcpu *v, mmio_info_t *info)
               ((v->domain->arch.vgic.nr_spis / 32) & GICD_TYPE_LINES));
 
         *r |= (irq_bits - 1) << GICD_TYPE_ID_BITS_SHIFT;
-
+        *r |= GICD_TYPE_LPIS;
         return 1;
     }
     case GICD_STATUSR:
diff --git a/xen/include/asm-arm/domain.h b/xen/include/asm-arm/domain.h
index bc7aee9..7202f93 100644
--- a/xen/include/asm-arm/domain.h
+++ b/xen/include/asm-arm/domain.h
@@ -101,6 +101,7 @@ struct arch_domain
         paddr_t dbase; /* Distributor base address */
         paddr_t cbase; /* CPU base address */
 #ifdef CONFIG_ARM_64
+        int gicr_ctlr;
         /* GIC V3 addressing */
         paddr_t dbase_size; /* Distributor base size */
         /* List of contiguous occupied by the redistributors */
diff --git a/xen/include/asm-arm/gic-its.h b/xen/include/asm-arm/gic-its.h
index 82cfbdc..e1a5fa0 100644
--- a/xen/include/asm-arm/gic-its.h
+++ b/xen/include/asm-arm/gic-its.h
@@ -229,6 +229,7 @@ unsigned long *its_lpi_alloc_chunks(int nirqs, int *base, int *nr_ids);
 uint32_t its_get_pta_type(void);
 uint32_t its_get_nr_its(void);
 struct its_node * its_get_phys_node(uint32_t dev_id);
+int vgic_its_unmap_lpi_prop(struct vcpu *v);
 #endif /* __ASM_ARM_GIC_ITS_H__ */
 
 /*
diff --git a/xen/include/asm-arm/gic.h b/xen/include/asm-arm/gic.h
index 6f5767f..f15174b 100644
--- a/xen/include/asm-arm/gic.h
+++ b/xen/include/asm-arm/gic.h
@@ -20,6 +20,7 @@
 
 #define NR_GIC_LOCAL_IRQS  NR_LOCAL_IRQS
 #define NR_GIC_SGI         16
+#define NR_GIC_LPI         8192
 #define MAX_RDIST_COUNT    4
 
 #define GICD_CTLR       (0x000)
@@ -96,6 +97,7 @@
 #define GICD_TYPE_CPUS_SHIFT 5
 #define GICD_TYPE_CPUS  0x0e0
 #define GICD_TYPE_SEC   0x400
+#define GICD_TYPE_LPIS  (0x1UL << 17)
 
 #define GICC_CTL_ENABLE 0x1
 #define GICC_CTL_EOI    (0x1 << 9)
diff --git a/xen/include/asm-arm/gic_v3_defs.h b/xen/include/asm-arm/gic_v3_defs.h
index f8bac52..125fc28 100644
--- a/xen/include/asm-arm/gic_v3_defs.h
+++ b/xen/include/asm-arm/gic_v3_defs.h
@@ -45,6 +45,7 @@
 #define GICC_SRE_EL2_DIB             (1UL << 2)
 #define GICC_SRE_EL2_ENEL1           (1UL << 3)
 
+#define GICR_CTL_ENABLE              (1U << 0)
 /* Additional bits in GICD_TYPER defined by GICv3 */
 #define GICD_TYPE_ID_BITS_SHIFT 19
 
@@ -133,6 +134,7 @@
 
 #define GICR_TYPER_PLPIS             (1U << 0)
 #define GICR_TYPER_VLPIS             (1U << 1)
+#define GICR_TYPER_DISTRIBUTED_IMP   (1U << 3)
 #define GICR_TYPER_LAST              (1U << 4)
 
 #define DEFAULT_PMR_VALUE            0xff
-- 
1.7.9.5

* [RFC PATCH v2 16/22] xen/arm: its: implement hw_irq_controller for LPIs
  2015-03-19 14:37 [RFC PATCH v2 00/22] xen/arm: Add ITS support vijay.kilari
                   ` (14 preceding siblings ...)
  2015-03-19 14:38 ` [RFC PATCH v2 15/22] xen/arm: its: Add support to emulate GICR register for LPIs vijay.kilari
@ 2015-03-19 14:38 ` vijay.kilari
  2015-03-27 17:02   ` Julien Grall
  2015-03-19 14:38 ` [RFC PATCH v2 17/22] xen/arm: its: Map ITS translation space vijay.kilari
                   ` (7 subsequent siblings)
  23 siblings, 1 reply; 109+ messages in thread
From: vijay.kilari @ 2015-03-19 14:38 UTC (permalink / raw)
  To: Ian.Campbell, julien.grall, stefano.stabellini,
	stefano.stabellini, tim, xen-devel
  Cc: Prasun.Kapoor, Vijaya Kumar K, manish.jaggi, vijay.kilari

From: Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com>

This patch implements the hw_irq_controller APIs required
to handle LPIs.

Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com>
---
v2: - Reused hw_irq_controller ops of gicv3 for LPIs
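For reviewers, a minimal standalone sketch of the mask/unmask path that
this patch routes through lpi_set_config(). It only models the lookup of
the per-LPI configuration byte and the enable bit. The table size and all
sketch_* names are hypothetical; the real driver indexes
gic_rdists->prop_page, then cleans the cache or issues a barrier and sends
an INV command, none of which is modelled here.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_FIRST_LPI     8192u      /* same value as NR_GIC_LPI in this patch */
#define SKETCH_NR_LPIS       64u        /* hypothetical table size */
#define SKETCH_PROP_ENABLED  (1u << 0)  /* enable bit of one configuration byte */

/* Stand-in for the LPI property page (one byte per LPI). */
static uint8_t sketch_prop_table[SKETCH_NR_LPIS];

/* Same test as the is_lpi() macro: LPI IDs start at 8192. */
static int sketch_is_lpi(uint32_t irq)
{
    return irq >= SKETCH_FIRST_LPI;
}

/* Set or clear the enable bit of one LPI and return the updated byte. */
static uint8_t sketch_lpi_set_config(uint32_t irq, int enable)
{
    uint8_t *cfg;

    assert(sketch_is_lpi(irq) && irq - SKETCH_FIRST_LPI < SKETCH_NR_LPIS);

    cfg = &sketch_prop_table[irq - SKETCH_FIRST_LPI];
    if ( enable )
        *cfg |= SKETCH_PROP_ENABLED;
    else
        *cfg &= (uint8_t)~SKETCH_PROP_ENABLED;

    return *cfg;
}
```

gicv3_mask_irq()/gicv3_unmask_irq() make the same is_lpi() decision and
fall back to the GICD_I{S,C}ENABLER path for SPIs and PPIs.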
---
 xen/arch/arm/gic-v3-its.c     |  103 ++++++-----------------------------------
 xen/arch/arm/gic-v3.c         |   26 ++++++++---
 xen/include/asm-arm/gic-its.h |    2 +
 xen/include/asm-arm/gic.h     |    1 +
 4 files changed, 37 insertions(+), 95 deletions(-)

diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
index e382f8d..eacd244 100644
--- a/xen/arch/arm/gic-v3-its.c
+++ b/xen/arch/arm/gic-v3-its.c
@@ -440,7 +440,7 @@ post:
 	its_wait_for_range_completion(its, cmd, next_cmd);
 }
 
-void its_send_inv(struct its_device *dev, u32 event_id)
+static void its_send_inv(struct its_device *dev, u32 event_id)
 {
 	struct its_cmd_desc desc;
 
@@ -461,8 +461,7 @@ static void its_send_mapc(struct its_node *its, struct its_collection *col,
 	its_send_single_command(its, its_build_mapc_cmd, &desc);
 }
 
-/* TODO: Remove static for the sake of compilation */
-void its_send_movi(struct its_node *its, struct its_collection *col,
+static void its_send_movi(struct its_node *its, struct its_collection *col,
 	           u32 dev_id, u32 id)
 {
 	struct its_cmd_desc desc;
@@ -483,28 +482,19 @@ static void its_send_invall(struct its_node *its, struct its_collection *col)
 	its_send_single_command(its, its_build_invall_cmd, &desc);
 }
 
-/*
- * The below irqchip functions are no more required.
- * TODO: Will be implemented as separate patch
- */
-#if 0
-/*
- * irqchip functions - assumes MSI, mostly.
- */
-
-static inline u32 its_get_event_id(struct irq_data *d)
+static inline u32 its_get_event_id(struct irq_desc *d)
 {
-	struct its_device *its_dev = irq_data_get_irq_chip_data(d);
-	return d->hwirq - its_dev->lpi_base;
+	struct its_device *its_dev = irq_get_desc_data(d);
+	return d->irq - its_dev->lpi_base;
 }
 
-static void lpi_set_config(struct irq_data *d, bool enable)
+void lpi_set_config(struct irq_desc *d, int enable)
 {
-	struct its_device *its_dev = irq_data_get_irq_chip_data(d);
-	irq_hw_number_t hwirq = d->hwirq;
+	u8 *cfg;
 	u32 id = its_get_event_id(d);
-	u8 *cfg = page_address(gic_rdists->prop_page) + hwirq - 8192;
+	struct its_device *its_dev = irq_get_desc_data(d);
 
+	cfg = gic_rdists->prop_page + d->irq - NR_GIC_LPI;
 	if (enable)
 		*cfg |= LPI_PROP_ENABLED;
 	else
@@ -516,89 +506,26 @@ static void lpi_set_config(struct irq_data *d, bool enable)
 	 * Humpf...
 	 */
 	if (gic_rdists->flags & RDIST_FLAGS_PROPBASE_NEEDS_FLUSHING)
-		__flush_dcache_area(cfg, sizeof(*cfg));
+		clean_and_invalidate_dcache_va_range(cfg, sizeof(*cfg));
 	else
 		dsb(ishst);
-	its_send_inv(its_dev, id);
-}
 
-static void its_mask_irq(struct irq_data *d)
-{
-	lpi_set_config(d, false);
-}
-
-static void its_unmask_irq(struct irq_data *d)
-{
-	lpi_set_config(d, true);
-}
-
-static void its_eoi_irq(struct irq_data *d)
-{
-	gic_write_eoir(d->hwirq);
+	its_send_inv(its_dev, id);
 }
 
-static int its_set_affinity(struct irq_data *d, const struct cpumask *mask_val,
-			    bool force)
+void its_set_affinity(struct irq_desc *d, int cpu)
 {
-	unsigned int cpu = cpumask_any_and(mask_val, cpu_online_mask);
-	struct its_device *its_dev = irq_data_get_irq_chip_data(d);
+	struct its_device *its_dev = irq_get_desc_data(d);
 	struct its_collection *target_col;
 	u32 id = its_get_event_id(d);
 
-	if (cpu >= nr_cpu_ids)
-		return -EINVAL;
-
+	/* Physical collection id */
 	target_col = &its_dev->its->collections[cpu];
-	its_send_movi(its_dev, target_col, id);
 	its_dev->collection = target_col;
 
-	return IRQ_SET_MASK_OK_DONE;
+	its_send_movi(its_dev->its, target_col, its_dev->device_id, id);
 }
 
-static void its_irq_compose_msi_msg(struct irq_data *d, struct msi_msg *msg)
-{
-	struct its_device *its_dev = irq_data_get_irq_chip_data(d);
-	struct its_node *its;
-	u64 addr;
-
-	its = its_dev->its;
-	addr = its->phys_base + GITS_TRANSLATER;
-
-	msg->address_lo		= addr & ((1UL << 32) - 1);
-	msg->address_hi		= addr >> 32;
-	msg->data		= its_get_event_id(d);
-}
-
-static struct irq_chip its_irq_chip = {
-	.name			= "ITS",
-	.irq_mask		= its_mask_irq,
-	.irq_unmask		= its_unmask_irq,
-	.irq_eoi		= its_eoi_irq,
-	.irq_set_affinity	= its_set_affinity,
-	.irq_compose_msi_msg	= its_irq_compose_msi_msg,
-};
-
-static void its_mask_msi_irq(struct irq_data *d)
-{
-	pci_msi_mask_irq(d);
-	irq_chip_mask_parent(d);
-}
-
-static void its_unmask_msi_irq(struct irq_data *d)
-{
-	pci_msi_unmask_irq(d);
-	irq_chip_unmask_parent(d);
-}
-
-static struct irq_chip its_msi_irq_chip = {
-	.name			= "ITS-MSI",
-	.irq_unmask		= its_unmask_msi_irq,
-	.irq_mask		= its_mask_msi_irq,
-	.irq_eoi		= irq_chip_eoi_parent,
-	.irq_write_msi_msg	= pci_msi_domain_write_msg,
-};
-#endif
-
 /*
  * How we allocate LPIs:
  *
diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
index 2b406e6..1b3ecd7 100644
--- a/xen/arch/arm/gic-v3.c
+++ b/xen/arch/arm/gic-v3.c
@@ -40,6 +40,7 @@
 #include <asm/device.h>
 #include <asm/gic.h>
 #include <asm/gic_v3_defs.h>
+#include <asm/gic-its.h>
 #include <asm/cpufeature.h>
 
 struct rdist_region {
@@ -427,12 +428,18 @@ static void gicv3_poke_irq(struct irq_desc *irqd, u32 offset)
 
 static void gicv3_unmask_irq(struct irq_desc *irqd)
 {
-    gicv3_poke_irq(irqd, GICD_ISENABLER);
+    if ( is_lpi(irqd->irq) )
+        lpi_set_config(irqd, 1);
+    else
+        gicv3_poke_irq(irqd, GICD_ISENABLER);
 }
 
 static void gicv3_mask_irq(struct irq_desc *irqd)
 {
-    gicv3_poke_irq(irqd, GICD_ICENABLER);
+    if ( is_lpi(irqd->irq) )
+        lpi_set_config(irqd, 0);
+    else
+        gicv3_poke_irq(irqd, GICD_ICENABLER);
 }
 
 static void gicv3_eoi_irq(struct irq_desc *irqd)
@@ -1070,13 +1077,18 @@ static void gicv3_irq_set_affinity(struct irq_desc *desc, const cpumask_t *mask)
     spin_lock(&gicv3.lock);
 
     cpu = gicv3_get_cpu_from_mask(mask);
-    affinity = gicv3_mpidr_to_affinity(cpu);
-    /* Make sure we don't broadcast the interrupt */
-    affinity &= ~GICD_IROUTER_SPI_MODE_ANY;
 
-    if ( desc->irq >= NR_GIC_LOCAL_IRQS )
-        writeq_relaxed(affinity, (GICD + GICD_IROUTER + desc->irq * 8));
+    if ( is_lpi(desc->irq) )
+        its_set_affinity(desc, cpu);
+    else
+    {
+        affinity = gicv3_mpidr_to_affinity(cpu);
+        /* Make sure we don't broadcast the interrupt */
+        affinity &= ~GICD_IROUTER_SPI_MODE_ANY;
 
+        if ( desc->irq >= NR_GIC_LOCAL_IRQS )
+            writeq_relaxed(affinity, (GICD + GICD_IROUTER + desc->irq * 8));
+    }
     spin_unlock(&gicv3.lock);
 }
 
diff --git a/xen/include/asm-arm/gic-its.h b/xen/include/asm-arm/gic-its.h
index e1a5fa0..af28d66 100644
--- a/xen/include/asm-arm/gic-its.h
+++ b/xen/include/asm-arm/gic-its.h
@@ -225,6 +225,8 @@ int its_alloc_device_irq(struct its_device *dev, uint32_t *plpi);
 int gic_its_send_cmd(struct vcpu *v, struct its_node *its,
                      struct its_cmd_block *phys_cmd, int send_all);
 void its_lpi_free(unsigned long *bitmap, int base, int nr_ids);
+void its_set_affinity(struct irq_desc *d, int cpu);
+void lpi_set_config(struct irq_desc *d, int enable);
 unsigned long *its_lpi_alloc_chunks(int nirqs, int *base, int *nr_ids);
 uint32_t its_get_pta_type(void);
 uint32_t its_get_nr_its(void);
diff --git a/xen/include/asm-arm/gic.h b/xen/include/asm-arm/gic.h
index f15174b..e4555e8 100644
--- a/xen/include/asm-arm/gic.h
+++ b/xen/include/asm-arm/gic.h
@@ -162,6 +162,7 @@
 
 #define DT_MATCH_GIC_V3 DT_MATCH_COMPATIBLE("arm,gic-v3")
 
+#define is_lpi(lpi) (lpi >= NR_GIC_LPI)
 /*
  * GICv3 registers that needs to be saved/restored
  */
-- 
1.7.9.5

* [RFC PATCH v2 17/22] xen/arm: its: Map ITS translation space
  2015-03-19 14:37 [RFC PATCH v2 00/22] xen/arm: Add ITS support vijay.kilari
                   ` (15 preceding siblings ...)
  2015-03-19 14:38 ` [RFC PATCH v2 16/22] xen/arm: its: implement hw_irq_controller " vijay.kilari
@ 2015-03-19 14:38 ` vijay.kilari
  2015-03-27 17:07   ` Julien Grall
  2015-03-19 14:38 ` [RFC PATCH v2 18/22] xen/arm: its: Dynamic allocation of LPI descriptors vijay.kilari
                   ` (6 subsequent siblings)
  23 siblings, 1 reply; 109+ messages in thread
From: vijay.kilari @ 2015-03-19 14:38 UTC (permalink / raw)
  To: Ian.Campbell, julien.grall, stefano.stabellini,
	stefano.stabellini, tim, xen-devel
  Cc: Prasun.Kapoor, Vijaya Kumar K, manish.jaggi, vijay.kilari

From: Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com>

The ITS translation space contains the GITS_TRANSLATER
register, which a device writes to raise an LPI. This
space needs to be mapped into every domain's address
space for each physical ITS available, so that a device
can access the GITS_TRANSLATER register through the SMMU.

Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com>
---
v2: - Map for all physical ITS nodes
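A small standalone sketch of the address arithmetic behind the mapping,
for illustration only. It assumes 4K pages and an identity mapping (guest
frame equals machine frame), as the call to map_mmio_regions() in this
patch suggests. The sketch_* names and the example base address are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12                      /* assumed 4K pages */
#define SKETCH_PAGE_SIZE  (UINT64_C(1) << SKETCH_PAGE_SHIFT)
#define SKETCH_PAGE_MASK  (~(SKETCH_PAGE_SIZE - 1))
#define SKETCH_SZ_64K     UINT64_C(0x10000)

struct sketch_range {
    uint64_t start_pfn;  /* first frame of the translation space */
    uint64_t nr_pages;   /* number of frames to map */
};

/* The translation space is the 64K frame following the ITS control frame. */
static struct sketch_range sketch_translation_range(uint64_t its_phys_base)
{
    struct sketch_range r;
    uint64_t addr = its_phys_base + SKETCH_SZ_64K;

    r.start_pfn = (addr & SKETCH_PAGE_MASK) >> SKETCH_PAGE_SHIFT;
    r.nr_pages = (SKETCH_SZ_64K + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;

    return r;
}
```

With a hypothetical ITS base of 0x80100000 this yields start frame 0x80110
and 16 pages, which is the range each domain would see mapped per ITS.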
---
 xen/arch/arm/vgic-v3-its.c |   30 ++++++++++++++++++++++++++++++
 1 file changed, 30 insertions(+)

diff --git a/xen/arch/arm/vgic-v3-its.c b/xen/arch/arm/vgic-v3-its.c
index f1d68d9..1447e91 100644
--- a/xen/arch/arm/vgic-v3-its.c
+++ b/xen/arch/arm/vgic-v3-its.c
@@ -1394,6 +1394,34 @@ static const struct mmio_handler_ops vgic_gits_mmio_handler = {
     .write_handler = vgic_v3_gits_mmio_write,
 };
 
+/*
+ * Map the 64K ITS translation space in guest.
+ * This is required purely for device smmu writes.
+ */
+
+static int vgic_map_translation_space(uint32_t nr_its, struct domain *d)
+{
+    uint64_t addr, size;
+    int ret;
+
+    addr = d->arch.vits[nr_its].phys_base + SZ_64K;
+    size = SZ_64K;
+    ret = map_mmio_regions(d,
+                           paddr_to_pfn(addr & PAGE_MASK),
+                           DIV_ROUND_UP(size, PAGE_SIZE),
+                           paddr_to_pfn(addr & PAGE_MASK));
+
+    if ( ret )
+    {
+        printk(XENLOG_ERR "Unable to map to dom%d access to"
+               " 0x%"PRIx64" - 0x%"PRIx64"\n",
+               d->domain_id,
+               addr & PAGE_MASK, PAGE_ALIGN(addr + size) - 1);
+    }
+
+    return ret;
+}
+
 int vgic_its_domain_init(struct domain *d)
 {
     uint32_t num_its;
@@ -1420,6 +1448,8 @@ int vgic_its_domain_init(struct domain *d)
          register_mmio_handler(d, &vgic_gits_mmio_handler,
                                d->arch.vits[i].phys_base,
                                SZ_64K);
+        if ( vgic_map_translation_space(i, d) )
+            return -EINVAL;
     }
 
     return 0;
-- 
1.7.9.5

* [RFC PATCH v2 18/22] xen/arm: its: Dynamic allocation of LPI descriptors
  2015-03-19 14:37 [RFC PATCH v2 00/22] xen/arm: Add ITS support vijay.kilari
                   ` (16 preceding siblings ...)
  2015-03-19 14:38 ` [RFC PATCH v2 17/22] xen/arm: its: Map ITS translation space vijay.kilari
@ 2015-03-19 14:38 ` vijay.kilari
  2015-03-19 14:38 ` [RFC PATCH v2 19/22] xen/arm: its: Support ITS interrupt handling vijay.kilari
                   ` (5 subsequent siblings)
  23 siblings, 0 replies; 109+ messages in thread
From: vijay.kilari @ 2015-03-19 14:38 UTC (permalink / raw)
  To: Ian.Campbell, julien.grall, stefano.stabellini,
	stefano.stabellini, tim, xen-devel
  Cc: Prasun.Kapoor, Vijaya Kumar K, manish.jaggi, vijay.kilari

From: Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com>

The number of LPIs supported by GICv3 is huge. Boot-time
allocation of irq descriptors and pending_irq descriptors
is not viable.

With this patch, irq/pending_irq descriptors for LPIs are
allocated on demand and managed using a radix tree.

Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com>
---
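The find-or-allocate pattern used by insert_irq_desc() and
insert_pending_irq_desc() can be sketched standalone as below. This is not
the patch's implementation: a small chained hash table stands in for Xen's
radix tree, locking is omitted, and every sketch_* name is hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

#define SKETCH_BUCKETS 64u

struct sketch_desc {
    int irq;
    struct sketch_desc *next;  /* chain within one bucket */
};

static struct sketch_desc *sketch_table[SKETCH_BUCKETS];

/* Return the descriptor for irq, or NULL if none has been allocated yet. */
static struct sketch_desc *sketch_find(int irq)
{
    struct sketch_desc *d = sketch_table[(unsigned int)irq % SKETCH_BUCKETS];

    while ( d != NULL && d->irq != irq )
        d = d->next;

    return d;
}

/* Return the existing descriptor, or allocate and register a new one. */
static struct sketch_desc *sketch_insert(int irq)
{
    unsigned int b = (unsigned int)irq % SKETCH_BUCKETS;
    struct sketch_desc *d = sketch_find(irq);

    if ( d != NULL )
        return d;

    d = calloc(1, sizeof(*d));
    if ( d == NULL )
        return NULL;

    d->irq = irq;
    d->next = sketch_table[b];
    sketch_table[b] = d;

    return d;
}

/* Unlink the descriptor and hand it back so the caller can free it. */
static struct sketch_desc *sketch_delete(int irq)
{
    struct sketch_desc **pp = &sketch_table[(unsigned int)irq % SKETCH_BUCKETS];
    struct sketch_desc *d;

    while ( *pp != NULL && (*pp)->irq != irq )
        pp = &(*pp)->next;

    d = *pp;
    if ( d != NULL )
        *pp = d->next;

    return d;
}
```

As in the patch, a second insert for the same IRQ returns the descriptor
that already exists, and delete only unlinks, leaving the free to the
caller (compare release_irq()).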
 xen/arch/arm/irq.c           |  183 +++++++++++++++++++++++++++++++++++++++++-
 xen/arch/arm/vgic.c          |   20 ++++-
 xen/include/asm-arm/domain.h |    4 +
 xen/include/asm-arm/gic.h    |    1 +
 xen/include/asm-arm/irq.h    |   10 +++
 5 files changed, 213 insertions(+), 5 deletions(-)

diff --git a/xen/arch/arm/irq.c b/xen/arch/arm/irq.c
index d02f4cf..0d3bf9a 100644
--- a/xen/arch/arm/irq.c
+++ b/xen/arch/arm/irq.c
@@ -30,6 +30,8 @@
 
 static unsigned int local_irqs_type[NR_LOCAL_IRQS];
 static DEFINE_SPINLOCK(local_irqs_type_lock);
+static DEFINE_SPINLOCK(radix_tree_desc_lock);
+static struct radix_tree_root desc_root;
 
 static void ack_none(struct irq_desc *irq)
 {
@@ -51,18 +53,149 @@ hw_irq_controller no_irq_type = {
 static irq_desc_t irq_desc[NR_IRQS];
 static DEFINE_PER_CPU(irq_desc_t[NR_LOCAL_IRQS], local_irq_desc);
 
+static void init_one_irq_data(int irq, struct irq_desc *desc);
+
+struct irq_desc * find_irq_desc(struct radix_tree_root *root_node, int irq)
+{
+    unsigned long flags;
+    struct irq_desc *desc;
+
+    spin_lock_irqsave(&radix_tree_desc_lock, flags);
+    desc = radix_tree_lookup(root_node, irq);
+    spin_unlock_irqrestore(&radix_tree_desc_lock, flags);
+
+    return desc;
+}
+
+struct pending_irq *find_pending_irq_desc(struct domain *d, int irq)
+{
+    unsigned long flags;
+    struct pending_irq *p;
+
+    spin_lock_irqsave(&d->arch.vgic.pending_lpi_lock, flags);
+    p = radix_tree_lookup(&d->arch.vgic.pending_lpis, irq);
+    spin_unlock_irqrestore(&d->arch.vgic.pending_lpi_lock, flags);
+
+    return p;
+}
+
+struct irq_desc *insert_irq_desc(struct radix_tree_root *root_node, int irq)
+{
+    unsigned long flags;
+    struct irq_desc *desc;
+    int ret;
+
+    spin_lock_irqsave(&radix_tree_desc_lock, flags);
+    desc = radix_tree_lookup(root_node, irq);
+    if ( desc == NULL )
+    {
+
+        desc = xzalloc(struct irq_desc);
+        if ( desc == NULL )
+            goto err;
+        init_one_irq_data(irq, desc);
+        ret = radix_tree_insert(root_node, irq, desc);
+        if ( ret )
+        {
+            xfree(desc);
+            goto err;
+        }
+    }
+    spin_unlock_irqrestore(&radix_tree_desc_lock, flags);
+
+    return desc;
+err:
+    spin_unlock_irqrestore(&radix_tree_desc_lock, flags);
+
+    return NULL;
+}
+
+struct pending_irq *insert_pending_irq_desc(struct domain *d, int irq)
+{
+    unsigned long flags;
+    int ret;
+    struct pending_irq *p;
+
+    spin_lock_irqsave(&d->arch.vgic.pending_lpi_lock, flags);
+    p = radix_tree_lookup(&d->arch.vgic.pending_lpis, irq);
+    if ( p == NULL )
+    {
+        if ( (p = xzalloc(struct pending_irq)) == NULL )
+            goto err;
+        ret = radix_tree_insert(&d->arch.vgic.pending_lpis, irq, p);
+        if ( ret )
+        {
+            xfree(p);
+            goto err;
+        }
+        INIT_LIST_HEAD(&p->inflight);
+        INIT_LIST_HEAD(&p->lr_queue);
+    }
+    spin_unlock_irqrestore(&d->arch.vgic.pending_lpi_lock, flags);
+
+    return p;
+err:
+    spin_unlock_irqrestore(&d->arch.vgic.pending_lpi_lock, flags);
+
+    return NULL;
+}
+
+struct irq_desc *delete_irq_desc(struct radix_tree_root *root_node, int irq)
+{
+    unsigned long flags;
+    struct irq_desc *desc;
+
+    spin_lock_irqsave(&radix_tree_desc_lock, flags);
+    desc = radix_tree_delete(root_node, irq);
+    spin_unlock_irqrestore(&radix_tree_desc_lock, flags);
+
+    return desc;
+}
+
+struct pending_irq *delete_pending_irq_desc(struct domain *d, int irq)
+{
+    unsigned long flags;
+    struct pending_irq *p;
+
+    spin_lock_irqsave(&d->arch.vgic.pending_lpi_lock, flags);
+    p = radix_tree_delete(&d->arch.vgic.pending_lpis, irq);
+    spin_unlock_irqrestore(&d->arch.vgic.pending_lpi_lock, flags);
+
+    return p;
+}
+
 irq_desc_t *__irq_to_desc(int irq)
 {
+    struct irq_desc *desc = NULL;
+
     if (irq < NR_LOCAL_IRQS) return &this_cpu(local_irq_desc)[irq];
-    return &irq_desc[irq-NR_LOCAL_IRQS];
+    else if ( irq >= NR_LOCAL_IRQS && irq < NR_IRQS)
+        return &irq_desc[irq-NR_LOCAL_IRQS];
+    else
+    {
+        if ( is_lpi(irq) )
+            desc = find_irq_desc(&desc_root, irq);
+        else
+            BUG();
+    }
+
+    return desc;
 }
 
-int __init arch_init_one_irq_desc(struct irq_desc *desc)
+int arch_init_one_irq_desc(struct irq_desc *desc)
 {
     desc->arch.type = DT_IRQ_TYPE_INVALID;
     return 0;
 }
 
+static void init_one_irq_data(int irq, struct irq_desc *desc)
+{
+    init_one_irq_desc(desc);
+    desc->irq = irq;
+    desc->arch.virq = 0;
+    desc->action  = NULL;
+    desc->arch.dev = NULL;
+}
 
 static int __init init_irq_data(void)
 {
@@ -72,7 +205,9 @@ static int __init init_irq_data(void)
         struct irq_desc *desc = irq_to_desc(irq);
         init_one_irq_desc(desc);
         desc->irq = irq;
+        desc->arch.virq = 0;
         desc->action  = NULL;
+        desc->arch.dev = NULL;
     }
 
     return 0;
@@ -141,6 +276,7 @@ void __init init_IRQ(void)
 
     BUG_ON(init_local_irq_data() < 0);
     BUG_ON(init_irq_data() < 0);
+    radix_tree_init(&desc_root);
 }
 
 void __cpuinit init_secondary_IRQ(void)
@@ -286,11 +422,15 @@ out_no_end:
 void release_irq(unsigned int irq, const void *dev_id)
 {
     struct irq_desc *desc;
+    struct pending_irq *p;
     unsigned long flags;
     struct irqaction *action, **action_ptr;
+    struct vcpu *v = current;
 
     desc = irq_to_desc(irq);
 
+    if ( !desc ) return;
+
     spin_lock_irqsave(&desc->lock,flags);
 
     action_ptr = &desc->action;
@@ -327,6 +467,14 @@ void release_irq(unsigned int irq, const void *dev_id)
 
     if ( action->free_on_release )
         xfree(action);
+
+    if ( is_lpi(irq) )
+    {
+        desc = delete_irq_desc(&desc_root, irq);
+        p = delete_pending_irq_desc(v->domain, irq);
+        xfree(desc);
+        xfree(p);
+    }
 }
 
 static int __setup_irq(struct irq_desc *desc, unsigned int irqflags,
@@ -365,6 +513,8 @@ int setup_irq(unsigned int irq, unsigned int irqflags, struct irqaction *new)
 
     desc = irq_to_desc(irq);
 
+    ASSERT(desc != NULL);
+
     spin_lock_irqsave(&desc->lock, flags);
 
     if ( test_bit(_IRQ_GUEST, &desc->status) )
@@ -408,7 +558,8 @@ int route_irq_to_guest(struct domain *d, unsigned int irq,
                        const char * devname)
 {
     struct irqaction *action;
-    struct irq_desc *desc = irq_to_desc(irq);
+    struct irq_desc *desc;
+    struct pending_irq *p;
     unsigned long flags;
     int retval = 0;
 
@@ -420,6 +571,20 @@ int route_irq_to_guest(struct domain *d, unsigned int irq,
     action->name = devname;
     action->free_on_release = 1;
 
+    if ( is_lpi(irq) )
+    {
+        desc = insert_irq_desc(&desc_root, irq);
+        if ( !desc )
+            return -ENOMEM;
+        init_one_irq_data(irq, desc);
+
+        p = insert_pending_irq_desc(d, irq);
+        if ( !p )
+            return -ENOMEM;
+    }
+    else
+        desc = irq_to_desc(irq);
+
     spin_lock_irqsave(&desc->lock, flags);
 
     /* If the IRQ is already used by someone
@@ -558,6 +723,7 @@ int platform_get_irq(const struct dt_device_node *device, int index)
 {
     struct dt_irq dt_irq;
     unsigned int type, irq;
+    struct irq_desc *desc;
     int res;
 
     res = dt_device_get_irq(device, index, &dt_irq);
@@ -567,6 +733,17 @@ int platform_get_irq(const struct dt_device_node *device, int index)
     irq = dt_irq.irq;
     type = dt_irq.type;
 
+    if ( is_lpi(irq) )
+    {
+        desc = insert_irq_desc(&desc_root, irq);
+        if ( !desc )
+            return -ENOMEM;
+        init_one_irq_data(irq, desc);
+        /* XXX: Here we don't know which is the domain.
+         * So pending irq structure is allocate when required
+         */
+    }
+
     /* Setup the IRQ type */
     if ( irq < NR_LOCAL_IRQS )
         res = irq_local_set_type(irq, type);
diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c
index c14d79d..6fc8df1 100644
--- a/xen/arch/arm/vgic.c
+++ b/xen/arch/arm/vgic.c
@@ -30,6 +30,7 @@
 
 #include <asm/mmio.h>
 #include <asm/gic.h>
+#include <asm/gic-its.h>
 #include <asm/vgic.h>
 
 static inline struct vgic_irq_rank *vgic_get_rank(struct vcpu *v, int rank)
@@ -108,6 +109,9 @@ int domain_vgic_init(struct domain *d)
     for (i=0; i<DOMAIN_NR_RANKS(d); i++)
         spin_lock_init(&d->arch.vgic.shared_irqs[i].lock);
 
+    radix_tree_init(&d->arch.vgic.pending_lpis);
+    spin_lock_init(&d->arch.vgic.pending_lpi_lock);
+
     d->arch.vgic.handler->domain_init(d);
 
     d->arch.vgic.allocated_irqs =
@@ -127,11 +131,18 @@ void register_vgic_ops(struct domain *d, const struct vgic_ops *ops)
    d->arch.vgic.handler = ops;
 }
 
+void free_pending_lpis(void *ptr)
+{
+   struct pending_irq *pending_desc = ptr;
+   xfree(pending_desc);
+}
+
 void domain_vgic_free(struct domain *d)
 {
     xfree(d->arch.vgic.shared_irqs);
     xfree(d->arch.vgic.pending_irqs);
     xfree(d->arch.vgic.allocated_irqs);
+    radix_tree_destroy(&d->arch.vgic.pending_lpis, free_pending_lpis);
 }
 
 int vcpu_vgic_init(struct vcpu *v)
@@ -358,13 +369,18 @@ int vgic_to_sgi(struct vcpu *v, register_t sgir, enum gic_sgi_mode irqmode, int
 
 struct pending_irq *irq_to_pending(struct vcpu *v, unsigned int irq)
 {
-    struct pending_irq *n;
+    struct pending_irq *n = NULL;
     /* Pending irqs allocation strategy: the first vgic.nr_spis irqs
      * are used for SPIs; the rests are used for per cpu irqs */
     if ( irq < 32 )
         n = &v->arch.vgic.pending_irqs[irq];
-    else
+    else if ( irq < 1024 )
         n = &v->domain->arch.vgic.pending_irqs[irq - 32];
+    else
+    {
+        if ( is_lpi(irq) )
+            n = find_pending_irq_desc(v->domain, irq);
+    }
     return n;
 }
 
diff --git a/xen/include/asm-arm/domain.h b/xen/include/asm-arm/domain.h
index 7202f93..027ffd3 100644
--- a/xen/include/asm-arm/domain.h
+++ b/xen/include/asm-arm/domain.h
@@ -11,6 +11,7 @@
 #include <asm/gic.h>
 #include <public/hvm/params.h>
 #include <xen/serial.h>
+#include <xen/radix-tree.h>
 #include <xen/hvm/iommu.h>
 
 struct hvm_domain
@@ -97,6 +98,9 @@ struct arch_domain
          * struct arch_vcpu.
          */
         struct pending_irq *pending_irqs;
+        /* Lock for managing pending lpi in radix tree */
+        spinlock_t pending_lpi_lock;
+        struct radix_tree_root pending_lpis;
         /* Base address for guest GIC */
         paddr_t dbase; /* Distributor base address */
         paddr_t cbase; /* CPU base address */
diff --git a/xen/include/asm-arm/gic.h b/xen/include/asm-arm/gic.h
index e4555e8..b4f4904 100644
--- a/xen/include/asm-arm/gic.h
+++ b/xen/include/asm-arm/gic.h
@@ -163,6 +163,7 @@
 #define DT_MATCH_GIC_V3 DT_MATCH_COMPATIBLE("arm,gic-v3")
 
 #define is_lpi(lpi) (lpi >= NR_GIC_LPI)
+
 /*
  * GICv3 registers that needs to be saved/restored
  */
diff --git a/xen/include/asm-arm/irq.h b/xen/include/asm-arm/irq.h
index f091739..8568b96 100644
--- a/xen/include/asm-arm/irq.h
+++ b/xen/include/asm-arm/irq.h
@@ -2,6 +2,7 @@
 #define _ASM_HW_IRQ_H
 
 #include <xen/config.h>
+#include <xen/radix-tree.h>
 #include <xen/device_tree.h>
 
 #define NR_VECTORS 256 /* XXX */
@@ -29,6 +30,7 @@ struct arch_irq_desc {
 #define arch_hwdom_irqs(domid) NR_IRQS
 
 struct irq_desc;
+struct pending_irq;
 struct irqaction;
 
 struct irq_desc *__irq_to_desc(int irq);
@@ -55,6 +57,14 @@ int platform_get_irq(const struct dt_device_node *device, int index);
 
 void irq_set_affinity(struct irq_desc *desc, const cpumask_t *cpu_mask);
 
+struct irq_desc *find_irq_desc(struct radix_tree_root *root_node, int irq);
+struct irq_desc *insert_irq_desc(struct radix_tree_root *root_node, int irq);
+struct irq_desc *delete_irq_desc(struct radix_tree_root *root_node, int irq);
+
+struct pending_irq *insert_pending_irq_desc(struct domain *d, int irq);
+struct pending_irq *find_pending_irq_desc(struct domain *d, int irq);
+struct pending_irq *delete_pending_irq_desc(struct domain *d, int irq);
+
 #endif /* _ASM_HW_IRQ_H */
 /*
  * Local variables:
-- 
1.7.9.5

* [RFC PATCH v2 19/22] xen/arm: its: Support ITS interrupt handling
  2015-03-19 14:37 [RFC PATCH v2 00/22] xen/arm: Add ITS support vijay.kilari
                   ` (17 preceding siblings ...)
  2015-03-19 14:38 ` [RFC PATCH v2 18/22] xen/arm: its: Dynamic allocation of LPI descriptors vijay.kilari
@ 2015-03-19 14:38 ` vijay.kilari
  2015-03-19 14:38 ` [RFC PATCH v2 20/22] xen/arm: its: Generate ITS node for Dom0 vijay.kilari
                   ` (4 subsequent siblings)
  23 siblings, 0 replies; 109+ messages in thread
From: vijay.kilari @ 2015-03-19 14:38 UTC (permalink / raw)
  To: Ian.Campbell, julien.grall, stefano.stabellini,
	stefano.stabellini, tim, xen-devel
  Cc: Prasun.Kapoor, Vijaya Kumar K, manish.jaggi, vijay.kilari

From: Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com>

Add support for handling ITS (LPI) interrupts.
LPIs are handled by the physical ITS driver.

Nested LPI handling is neither tested nor enabled.

Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com>
---
v2: - Removed interrupt handler in ITS driver and
      reused existing interrupt handling for LPIs.
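To illustrate the gicv3_update_lr() change: for an LPI the list register
carries the virtual LPI number and the HW bit stays clear, while other
hardware interrupts keep the physical ID and the HW bit. The standalone
sketch below models only that decision. The field positions and masks are
assumptions chosen for illustration and should be checked against
gic_v3_defs.h; all sketch_* names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed list register layout, for illustration only. */
#define SKETCH_LR_VIRTUAL_MASK   UINT64_C(0xffff)
#define SKETCH_LR_PHYSICAL_MASK  UINT64_C(0x3ff)
#define SKETCH_LR_PHYSICAL_SHIFT 32
#define SKETCH_LR_PRIORITY_SHIFT 48
#define SKETCH_LR_HW             (UINT64_C(1) << 61)
#define SKETCH_LR_STATE_SHIFT    62
#define SKETCH_FIRST_LPI         8192u

/*
 * Build a list register value. The caller passes the ID the guest should
 * see in virq: the vLPI for an LPI, otherwise the IRQ number itself.
 */
static uint64_t sketch_build_lr(uint32_t pirq, uint32_t virq, int has_desc,
                                uint8_t priority, unsigned int state)
{
    uint64_t val = ((uint64_t)state & 0x3) << SKETCH_LR_STATE_SHIFT;

    val |= (uint64_t)priority << SKETCH_LR_PRIORITY_SHIFT;
    val |= (uint64_t)virq & SKETCH_LR_VIRTUAL_MASK;

    /* Only non-LPI hardware interrupts are linked to a physical ID. */
    if ( has_desc && pirq < SKETCH_FIRST_LPI )
        val |= SKETCH_LR_HW |
               (((uint64_t)pirq & SKETCH_LR_PHYSICAL_MASK)
                << SKETCH_LR_PHYSICAL_SHIFT);

    return val;
}
```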
---
 xen/arch/arm/gic-v3.c         |    8 ++++++--
 xen/arch/arm/gic.c            |   37 +++++++++++++++++++++++++++++++++++--
 xen/arch/arm/irq.c            |   10 +++++++---
 xen/arch/arm/vgic-v3-its.c    |   10 ++++++++++
 xen/arch/arm/vgic.c           |   14 ++++++++++----
 xen/include/asm-arm/gic-its.h |    2 ++
 xen/include/asm-arm/gic.h     |    3 ++-
 xen/include/asm-arm/irq.h     |    1 +
 8 files changed, 73 insertions(+), 12 deletions(-)

diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
index 1b3ecd7..ffdaecf 100644
--- a/xen/arch/arm/gic-v3.c
+++ b/xen/arch/arm/gic-v3.c
@@ -869,9 +869,13 @@ static void gicv3_update_lr(int lr, const struct pending_irq *p,
 
     val =  (((uint64_t)state & 0x3) << GICH_LR_STATE_SHIFT) | grp;
     val |= ((uint64_t)p->priority & 0xff) << GICH_LR_PRIORITY_SHIFT;
-    val |= ((uint64_t)p->irq & GICH_LR_VIRTUAL_MASK) << GICH_LR_VIRTUAL_SHIFT;
 
-   if ( p->desc != NULL )
+    if ( is_lpi(p->irq) )
+        val |= ((uint64_t)p->desc->arch.virq & GICH_LR_VIRTUAL_MASK) << GICH_LR_VIRTUAL_SHIFT;
+    else
+        val |= ((uint64_t)p->irq & GICH_LR_VIRTUAL_MASK) << GICH_LR_VIRTUAL_SHIFT;
+
+   if ( p->desc != NULL && !(is_lpi(p->irq)) )
        val |= GICH_LR_HW | (((uint64_t)p->desc->irq & GICH_LR_PHYSICAL_MASK)
                            << GICH_LR_PHYSICAL_SHIFT);
 
diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
index 390c8b0..6ac1f18 100644
--- a/xen/arch/arm/gic.c
+++ b/xen/arch/arm/gic.c
@@ -34,6 +34,7 @@
 #include <asm/io.h>
 #include <asm/gic.h>
 #include <asm/vgic.h>
+#include <asm/gic-its.h>
 
 static void gic_restore_pending_irqs(struct vcpu *v);
 
@@ -123,6 +124,20 @@ void gic_route_irq_to_xen(struct irq_desc *desc, const cpumask_t *cpu_mask,
     gic_set_irq_properties(desc, cpu_mask, priority);
 }
 
+void gic_route_lpi_to_guest(struct domain *d, struct irq_desc *desc,
+                            const cpumask_t *cpu_mask, unsigned int priority)
+{
+    struct pending_irq *p;
+    ASSERT(spin_is_locked(&desc->lock));
+
+    desc->handler = gic_hw_ops->gic_guest_irq_type;
+    set_bit(_IRQ_GUEST, &desc->status);
+
+    /* TODO: do not assume delivery to vcpu0 */
+    p = irq_to_pending(d->vcpu[0], desc->irq);
+    p->desc = desc;
+}
+
 /* Program the GIC to route an interrupt to a guest
  *   - desc.lock must be held
  */
@@ -330,20 +345,33 @@ static void gic_update_one_lr(struct vcpu *v, int i)
     struct pending_irq *p;
     int irq;
     struct gic_lr lr_val;
+    uint32_t pirq;
 
     ASSERT(spin_is_locked(&v->arch.vgic.lock));
     ASSERT(!local_irq_is_enabled());
 
     gic_hw_ops->read_lr(i, &lr_val);
     irq = lr_val.virq;
-    p = irq_to_pending(v, irq);
+
+    if ( is_lpi(irq) )
+    {
+        // Fetch corresponding plpi for vlpi
+        if ( vgic_its_get_pid(v, irq, &pirq) )
+            BUG();
+        p = irq_to_pending(v, pirq);
+        irq = pirq;
+    }
+    else
+    {
+        p = irq_to_pending(v, irq);
+    }
     if ( lr_val.state & GICH_LR_ACTIVE )
     {
         set_bit(GIC_IRQ_GUEST_ACTIVE, &p->status);
         if ( test_bit(GIC_IRQ_GUEST_ENABLED, &p->status) &&
              test_and_clear_bit(GIC_IRQ_GUEST_QUEUED, &p->status) )
         {
-            if ( p->desc == NULL )
+            if ( p->desc == NULL  || is_lpi(irq) )
             {
                  lr_val.state |= GICH_LR_PENDING;
                  gic_hw_ops->write_lr(i, &lr_val);
@@ -569,6 +597,11 @@ void gic_interrupt(struct cpu_user_regs *regs, int is_fiq)
     do  {
         /* Reading IRQ will ACK it */
         irq = gic_hw_ops->read_irq();
+        if ( is_lpi(irq) ) {
+            // TODO: Enable irqs?
+            do_IRQ(regs, irq, is_fiq);
+            continue;
+        }
 
         if ( likely(irq >= 16 && irq < 1021) )
         {
diff --git a/xen/arch/arm/irq.c b/xen/arch/arm/irq.c
index 0d3bf9a..9257a23 100644
--- a/xen/arch/arm/irq.c
+++ b/xen/arch/arm/irq.c
@@ -284,7 +284,7 @@ void __cpuinit init_secondary_IRQ(void)
     BUG_ON(init_local_irq_data() < 0);
 }
 
-static inline struct domain *irq_get_domain(struct irq_desc *desc)
+struct domain *irq_get_domain(struct irq_desc *desc)
 {
     ASSERT(spin_is_locked(&desc->lock));
 
@@ -611,9 +611,13 @@ int route_irq_to_guest(struct domain *d, unsigned int irq,
     retval = __setup_irq(desc, 0, action);
     if ( retval )
         goto out;
+    if ( irq >= NR_LOCAL_IRQS && irq < NR_IRQS)
+        gic_route_irq_to_guest(d, desc, cpumask_of(smp_processor_id()),
+                               GIC_PRI_IRQ);
+    else
+        gic_route_lpi_to_guest(d, desc, cpumask_of(smp_processor_id()),
+                               GIC_PRI_IRQ);
 
-    gic_route_irq_to_guest(d, desc, cpumask_of(smp_processor_id()),
-                           GIC_PRI_IRQ);
     spin_unlock_irqrestore(&desc->lock, flags);
     return 0;
 
diff --git a/xen/arch/arm/vgic-v3-its.c b/xen/arch/arm/vgic-v3-its.c
index 1447e91..360be0d 100644
--- a/xen/arch/arm/vgic-v3-its.c
+++ b/xen/arch/arm/vgic-v3-its.c
@@ -896,6 +896,16 @@ int vgic_its_get_pid(struct vcpu *v, uint32_t vlpi, uint32_t *plpi)
     return 1;
 }
 
+uint8_t vgic_its_get_priority(struct vcpu *v, uint32_t pid)
+{
+    uint8_t priority;
+  
+    priority =  readb_relaxed(v->domain->arch.lpi_conf->prop_page + pid);
+    priority &= 0xfc;
+
+    return priority;
+}
+
 static int vgic_v3_gits_lpi_mmio_read(struct vcpu *v, mmio_info_t *info)
 {
     uint32_t offset;
diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c
index 6fc8df1..580105f 100644
--- a/xen/arch/arm/vgic.c
+++ b/xen/arch/arm/vgic.c
@@ -399,14 +399,20 @@ void vgic_clear_pending_irqs(struct vcpu *v)
 void vgic_vcpu_inject_irq(struct vcpu *v, unsigned int irq)
 {
     uint8_t priority;
-    struct vgic_irq_rank *rank = vgic_rank_irq(v, irq);
+    struct vgic_irq_rank *rank;
     struct pending_irq *iter, *n = irq_to_pending(v, irq);
     unsigned long flags;
     bool_t running;
 
-    vgic_lock_rank(v, rank, flags);
-    priority = v->domain->arch.vgic.handler->get_irq_priority(v, irq);
-    vgic_unlock_rank(v, rank, flags);
+    if ( irq < NR_GIC_LPI )
+    {
+        rank = vgic_rank_irq(v, irq);
+        vgic_lock_rank(v, rank, flags);
+        priority = v->domain->arch.vgic.handler->get_irq_priority(v, irq);
+        vgic_unlock_rank(v, rank, flags);
+    }
+    else
+        priority = vgic_its_get_priority(v, irq);
 
     spin_lock_irqsave(&v->arch.vgic.lock, flags);
 
diff --git a/xen/include/asm-arm/gic-its.h b/xen/include/asm-arm/gic-its.h
index af28d66..dc1b98c 100644
--- a/xen/include/asm-arm/gic-its.h
+++ b/xen/include/asm-arm/gic-its.h
@@ -232,6 +232,8 @@ uint32_t its_get_pta_type(void);
 uint32_t its_get_nr_its(void);
 struct its_node * its_get_phys_node(uint32_t dev_id);
 int vgic_its_unmap_lpi_prop(struct vcpu *v);
+int vgic_its_get_pid(struct vcpu *v, uint32_t vlpi, uint32_t *plpi);
+uint8_t vgic_its_get_priority(struct vcpu *v, uint32_t pid);
 #endif /* __ASM_ARM_GIC_ITS_H__ */
 
 /*
diff --git a/xen/include/asm-arm/gic.h b/xen/include/asm-arm/gic.h
index b4f4904..f816664 100644
--- a/xen/include/asm-arm/gic.h
+++ b/xen/include/asm-arm/gic.h
@@ -214,7 +214,6 @@ enum gic_version {
 };
 
 extern enum gic_version gic_hw_version(void);
-
 /* Program the GIC to route an interrupt */
 extern void gic_route_irq_to_xen(struct irq_desc *desc, const cpumask_t *cpu_mask,
                                  unsigned int priority);
@@ -345,6 +344,8 @@ struct gic_hw_operations {
 void register_gic_ops(const struct gic_hw_operations *ops);
 int gic_make_node(const struct domain *d,const struct dt_device_node *node,
                   void *fdt);
+void gic_route_lpi_to_guest(struct domain *d, struct irq_desc *desc,
+                            const cpumask_t *cpu_mask, unsigned int priority);
 
 #endif /* __ASSEMBLY__ */
 #endif
diff --git a/xen/include/asm-arm/irq.h b/xen/include/asm-arm/irq.h
index 8568b96..35a74aa 100644
--- a/xen/include/asm-arm/irq.h
+++ b/xen/include/asm-arm/irq.h
@@ -54,6 +54,7 @@ int irq_set_spi_type(unsigned int spi, unsigned int type);
 int irq_set_desc_data(unsigned int irq, struct its_device *d);
 struct its_device *irq_get_desc_data(struct irq_desc *d);
 int platform_get_irq(const struct dt_device_node *device, int index);
+struct domain *irq_get_domain(struct irq_desc *desc);
 
 void irq_set_affinity(struct irq_desc *desc, const cpumask_t *cpu_mask);
 
-- 
1.7.9.5

* [RFC PATCH v2 20/22] xen/arm: its: Generate ITS node for Dom0
  2015-03-19 14:37 [RFC PATCH v2 00/22] xen/arm: Add ITS support vijay.kilari
                   ` (18 preceding siblings ...)
  2015-03-19 14:38 ` [RFC PATCH v2 19/22] xen/arm: its: Support ITS interrupt handling vijay.kilari
@ 2015-03-19 14:38 ` vijay.kilari
  2015-03-19 14:38 ` [RFC PATCH v2 21/22] xen/arm: its: Initialize virtual and physical ITS driver vijay.kilari
                   ` (3 subsequent siblings)
  23 siblings, 0 replies; 109+ messages in thread
From: vijay.kilari @ 2015-03-19 14:38 UTC (permalink / raw)
  To: Ian.Campbell, julien.grall, stefano.stabellini,
	stefano.stabellini, tim, xen-devel
  Cc: Prasun.Kapoor, Vijaya Kumar K, manish.jaggi, vijay.kilari

From: Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com>

Parse the host DT and generate an ITS node for Dom0.
The ITS node resides inside the GIC node, so when the GIC node
is encountered, look for an ITS node among its children.

Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com>
---
v2: - Generate all available ITS nodes in the host DT for Dom0
    - The its_node structure holds a pointer to the ITS DT node,
      which helps when searching
---
 xen/arch/arm/domain_build.c   |   50 +++++++++++++++++++++++++++++++-
 xen/arch/arm/gic-v3-its.c     |   63 +++++++++++++++++++++++++++++++++++++++++
 xen/include/asm-arm/gic-its.h |    2 ++
 xen/include/asm-arm/gic.h     |    1 +
 4 files changed, 115 insertions(+), 1 deletion(-)

diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
index 9f1f59f..800b40c 100644
--- a/xen/arch/arm/domain_build.c
+++ b/xen/arch/arm/domain_build.c
@@ -20,6 +20,7 @@
 #include <asm/cpufeature.h>
 
 #include <asm/gic.h>
+#include <asm/gic-its.h>
 #include <xen/irq.h>
 #include "kernel.h"
 
@@ -779,6 +780,34 @@ static int make_cpus_node(const struct domain *d, void *fdt,
     return res;
 }
 
+static int make_its_node(const struct domain *d, void *fdt,
+                         const struct dt_device_node *node)
+{
+    int res = 0;
+
+    DPRINT("Create GIC ITS node\n");
+
+    res = its_make_dt_node(d, node, fdt);
+    if ( res )
+        return res;
+
+    /*
+     * Emit the "phandle" property so that other nodes can reference this
+     * node and identify the controller an interrupt is wired to.
+     */
+    if ( node->phandle )
+    {
+        DPRINT("  Set phandle = 0x%x\n", node->phandle);
+        res = fdt_property_cell(fdt, "phandle", node->phandle);
+        if ( res )
+            return res;
+    }
+
+    res = fdt_end_node(fdt);
+
+    return res;
+}
+
 static int make_gic_node(const struct domain *d, void *fdt,
                          const struct dt_device_node *node)
 {
@@ -1041,12 +1070,18 @@ static int handle_node(struct domain *d, struct kernel_info *kinfo,
         DT_MATCH_GIC_V3,
         { /* sentinel */ },
     };
+    static const struct dt_device_match gits_matches[] __initconst =
+    {
+        DT_MATCH_GIC_ITS,
+        { /* sentinel */ },
+    };
     static const struct dt_device_match timer_matches[] __initconst =
     {
         DT_MATCH_TIMER,
         { /* sentinel */ },
     };
     struct dt_device_node *child;
+    struct dt_device_node *gic_child;
     int res;
     const char *name;
     const char *path;
@@ -1070,7 +1105,20 @@ static int handle_node(struct domain *d, struct kernel_info *kinfo,
     /* Replace these nodes with our own. Note that the original may be
      * used_by DOMID_XEN so this check comes first. */
     if ( dt_match_node(gic_matches, node) )
-        return make_gic_node(d, kinfo->fdt, node);
+    {
+        if ( !make_gic_node(d, kinfo->fdt, node) )
+        {
+            dt_for_each_child_node(node, gic_child)
+            {
+                if ( gic_child != NULL )
+                {
+                    if ( dt_match_node(gits_matches, gic_child) )
+                        return make_its_node(d, kinfo->fdt, gic_child);
+                }
+            }
+        }
+        return 0;
+    }
     if ( dt_match_node(timer_matches, node) )
         return make_timer_node(d, kinfo->fdt, node);
 
diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
index eacd244..a59bbb5 100644
--- a/xen/arch/arm/gic-v3-its.c
+++ b/xen/arch/arm/gic-v3-its.c
@@ -153,6 +153,18 @@ uint32_t its_get_nr_its(void)
 	return nr_its;
 }
 
+static struct its_node * find_its_node(const struct dt_device_node *node)
+{
+	struct its_node *its;
+
+	list_for_each_entry(its, &its_nodes, entry) {
+		if ( its->dt_node == node )
+			return its;
+	}
+
+	return NULL;    
+}
+
 struct its_node * its_get_phys_node(uint32_t dev_id)
 {
 	struct its_node *its;
@@ -988,6 +1000,57 @@ int gic_its_send_cmd(struct vcpu *v, struct its_node *its,
 	return ret;
 }
 
+int its_make_dt_node(const struct domain *d,
+		     const struct dt_device_node *node, void *fdt)
+{
+	struct its_node *its;
+	const struct dt_device_node *gic;
+	const void *compatible = NULL;
+	uint32_t len;
+	__be32 *new_cells, *tmp;
+	int res = 0;
+
+	its = find_its_node(node);
+	if (its == NULL) {
+		dprintk(XENLOG_ERR, "ITS node not found\n");
+		return -FDT_ERR_XEN(ENOENT);
+	}
+
+	gic = its->dt_node;
+
+	compatible = dt_get_property(gic, "compatible", &len);
+	if (!compatible) {
+		dprintk(XENLOG_ERR, "Can't find compatible property for the its node\n");
+		return -FDT_ERR_XEN(ENOENT);
+	}
+
+	res = fdt_begin_node(fdt, "gic-its");
+	if (res)
+		return res;
+
+	res = fdt_property(fdt, "compatible", compatible, len);
+	if (res)
+		return res;
+
+	res = fdt_property(fdt, "msi-controller", NULL, 0);
+	if (res)
+		return res;
+
+	len = dt_cells_to_size(dt_n_addr_cells(node) + dt_n_size_cells(node));
+
+	new_cells = xzalloc_bytes(len);
+	if (new_cells == NULL)
+		return -FDT_ERR_XEN(ENOMEM);
+	tmp = new_cells;
+
+	dt_set_range(&tmp, node, its->phys_base, its->phys_size);
+
+	res = fdt_property(fdt, "reg", new_cells, len);
+	xfree(new_cells);
+
+	return res;
+}
+
 static int its_force_quiescent(void __iomem *base)
 {
 	u32 count = 1000000;	/* 1s */
diff --git a/xen/include/asm-arm/gic-its.h b/xen/include/asm-arm/gic-its.h
index dc1b98c..ed9cbc9 100644
--- a/xen/include/asm-arm/gic-its.h
+++ b/xen/include/asm-arm/gic-its.h
@@ -224,6 +224,8 @@ int its_get_target(uint8_t pcid, uint64_t *pta);
 int its_alloc_device_irq(struct its_device *dev, uint32_t *plpi);
 int gic_its_send_cmd(struct vcpu *v, struct its_node *its,
                      struct its_cmd_block *phys_cmd, int send_all);
+int its_make_dt_node(const struct domain *d,
+                     const struct dt_device_node *node, void *fdt);
 void its_lpi_free(unsigned long *bitmap, int base, int nr_ids);
 void its_set_affinity(struct irq_desc *d, int cpu);
 void lpi_set_config(struct irq_desc *d, int enable);
diff --git a/xen/include/asm-arm/gic.h b/xen/include/asm-arm/gic.h
index f816664..d927f35 100644
--- a/xen/include/asm-arm/gic.h
+++ b/xen/include/asm-arm/gic.h
@@ -161,6 +161,7 @@
     DT_MATCH_COMPATIBLE("arm,gic-400")
 
 #define DT_MATCH_GIC_V3 DT_MATCH_COMPATIBLE("arm,gic-v3")
+#define DT_MATCH_GIC_ITS DT_MATCH_COMPATIBLE("arm,gic-v3-its")
 
 #define is_lpi(lpi) (lpi >= NR_GIC_LPI)
 
-- 
1.7.9.5

* [RFC PATCH v2 21/22] xen/arm: its: Initialize virtual and physical ITS driver
  2015-03-19 14:37 [RFC PATCH v2 00/22] xen/arm: Add ITS support vijay.kilari
                   ` (19 preceding siblings ...)
  2015-03-19 14:38 ` [RFC PATCH v2 20/22] xen/arm: its: Generate ITS node for Dom0 vijay.kilari
@ 2015-03-19 14:38 ` vijay.kilari
  2015-03-19 14:38 ` [RFC PATCH v2 22/22] xen/arm: its: Generate ITS dt node for DomU vijay.kilari
                   ` (2 subsequent siblings)
  23 siblings, 0 replies; 109+ messages in thread
From: vijay.kilari @ 2015-03-19 14:38 UTC (permalink / raw)
  To: Ian.Campbell, julien.grall, stefano.stabellini,
	stefano.stabellini, tim, xen-devel
  Cc: Prasun.Kapoor, Vijaya Kumar K, manish.jaggi, vijay.kilari

From: Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com>

Initialize the physical ITS driver and the virtual ITS driver
based on the HW support information available in the GICD_TYPER
register.

The ITS driver is initialized based on the outcome of the
gic_lpi_supported() and gic_nr_id_bits() functions.

Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com>
---
v2: Multi-ITS support
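
For reference, a minimal standalone sketch of the two GICD_TYPER fields this
patch reads. The bit positions (LPIS at bit 17, IDbits at [23:19], stored as
"bits minus one") are the ones used in the diff below; the struct and function
names are hypothetical and exist only for illustration, they are not part of
the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit positions as used by the patch; names are illustrative only. */
#define TYPER_LPIS_SUPPORTED  (1U << 17)
#define TYPER_ID_BITS_SHIFT   19
#define TYPER_ID_BITS_MASK    0x1fU

/* Hypothetical container for the two values the patch caches in gic_info. */
struct typer_info_sketch {
    bool lpi_supported;
    unsigned int nr_id_bits;
};

/* Decode a raw GICD_TYPER value into the two fields of interest. */
static struct typer_info_sketch decode_typer_sketch(uint32_t typer)
{
    struct typer_info_sketch info;

    info.lpi_supported = (typer & TYPER_LPIS_SUPPORTED) != 0;
    /* The field holds (number of ID bits - 1), hence the + 1. */
    info.nr_id_bits =
        ((typer >> TYPER_ID_BITS_SHIFT) & TYPER_ID_BITS_MASK) + 1;
    assert(info.nr_id_bits >= 1 && info.nr_id_bits <= 32);

    return info;
}
```

With a raw value of (15U << 19) | (1U << 17), this sketch reports LPI support
and 16 ID bits.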
---
 xen/arch/arm/gic-v3-its.c         |   25 +++++++++++++++++++++++--
 xen/arch/arm/gic-v3.c             |   21 +++++++++++++++++++++
 xen/arch/arm/gic.c                |   10 ++++++++++
 xen/arch/arm/setup.c              |    1 +
 xen/arch/arm/vgic-v3-its.c        |    8 ++++++++
 xen/arch/arm/vgic-v3.c            |    1 +
 xen/include/asm-arm/gic-its.h     |    5 +++++
 xen/include/asm-arm/gic.h         |    8 ++++++++
 xen/include/asm-arm/gic_v3_defs.h |    1 +
 xen/include/asm-arm/vgic.h        |    1 +
 10 files changed, 79 insertions(+), 2 deletions(-)

diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
index a59bbb5..c2c59fa 100644
--- a/xen/arch/arm/gic-v3-its.c
+++ b/xen/arch/arm/gic-v3-its.c
@@ -565,7 +565,7 @@ static int its_chunk_to_lpi(int chunk)
 	return (chunk << IRQS_PER_CHUNK_SHIFT) + 8192;
 }
 
-static int its_lpi_init(u32 id_bits)
+int its_lpi_init(u32 id_bits)
 {
 	lpi_chunks = its_lpi_to_chunk(1UL << id_bits);
 
@@ -1079,6 +1079,28 @@ static int its_force_quiescent(void __iomem *base)
 	}
 }
 
+void its_domain_init(uint32_t its_nr, struct domain *d)
+{
+	struct its_node *its;
+	u32 nr = 0;
+
+	ASSERT(its_nr < its_get_nr_its() );
+
+	if (is_hardware_domain(d)) {
+		list_for_each_entry(its, &its_nodes, entry) {
+			if ( nr == its_nr)
+				break;
+			nr++;
+		}
+
+		ASSERT(its != NULL);
+		d->arch.vits[its_nr].phys_base = its->phys_base;
+		d->arch.vits[its_nr].phys_size = its->phys_size;
+		d->arch.vits[its_nr].its  = its;
+	}
+	/* TODO: Update for DomU */
+}
+
 static int its_probe(struct dt_device_node *node)
 {
 	paddr_t its_addr, its_size;
@@ -1231,7 +1253,6 @@ int its_init(struct dt_device_node *node, struct rdist_prop *rdists)
 	gic_root_node = node;
 
 	its_alloc_lpi_tables();
-	its_lpi_init(rdists->id_bits);
 
 	return 0;
 }
diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
index ffdaecf..d3278f3 100644
--- a/xen/arch/arm/gic-v3.c
+++ b/xen/arch/arm/gic-v3.c
@@ -679,6 +679,11 @@ static int __init gicv3_populate_rdist(void)
     return -ENODEV;
 }
 
+static int gicv3_dist_supports_lpis(void)
+{
+    return readl_relaxed(GICD + GICD_TYPER) & GICD_TYPER_LPIS_SUPPORTED;
+}
+
 static int __cpuinit gicv3_cpu_init(void)
 {
     int i;
@@ -691,6 +696,15 @@ static int __cpuinit gicv3_cpu_init(void)
     if ( gicv3_enable_redist() )
         return -ENODEV;
 
+    if ( gicv3_dist_supports_lpis() )
+        gicv3_info.lpi_supported = 1;
+    else
+        gicv3_info.lpi_supported = 0;
+
+    /* Give LPIs a spin */
+    if ( gicv3_info.lpi_supported )
+        its_cpu_init();
+
     /* Set priority on PPI and SGI interrupts */
     priority = (GIC_PRI_IPI << 24 | GIC_PRI_IPI << 16 | GIC_PRI_IPI << 8 |
                 GIC_PRI_IPI);
@@ -1324,10 +1338,17 @@ static int __init gicv3_init(struct dt_device_node *node, const void *data)
            gicv3.rdist_regions[0].size, gicv3.rdist_regions[0].map_base,
            gicv3_info.maintenance_irq);
 
+    reg = readl_relaxed(GICD + GICD_TYPER);
+    gicv3.rdist_data.id_bits = ((reg >> 19) & 0x1f) + 1;
+    gicv3_info.nr_id_bits = gicv3.rdist_data.id_bits;
+
     spin_lock_init(&gicv3.lock);
 
     spin_lock(&gicv3.lock);
 
+    if ( gicv3_info.lpi_supported )
+        its_init(node, &gicv3.rdist_data);
+
     gicv3_dist_init();
     res = gicv3_cpu_init();
     gicv3_hyp_init();
diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
index 6ac1f18..7c9b75c 100644
--- a/xen/arch/arm/gic.c
+++ b/xen/arch/arm/gic.c
@@ -68,6 +68,16 @@ unsigned int gic_number_lines(void)
     return gic_hw_ops->info->nr_lines;
 }
 
+unsigned int gic_nr_id_bits(void)
+{
+    return gic_hw_ops->info->nr_id_bits;
+}
+
+bool_t gic_lpi_supported(void)
+{
+    return gic_hw_ops->info->lpi_supported;
+}
+
 void gic_save_state(struct vcpu *v)
 {
     ASSERT(!local_irq_is_enabled());
diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c
index 9a1c285..ebd4bb9 100644
--- a/xen/arch/arm/setup.c
+++ b/xen/arch/arm/setup.c
@@ -773,6 +773,7 @@ void __init start_xen(unsigned long boot_phys_offset,
     init_xen_time();
 
     gic_init();
+    vgic_its_init();
 
     p2m_vmid_allocator_init();
 
diff --git a/xen/arch/arm/vgic-v3-its.c b/xen/arch/arm/vgic-v3-its.c
index 360be0d..bd0b8ae 100644
--- a/xen/arch/arm/vgic-v3-its.c
+++ b/xen/arch/arm/vgic-v3-its.c
@@ -1455,6 +1455,8 @@ int vgic_its_domain_init(struct domain *d)
     for ( i = 0; i < num_its; i++)
     {
          spin_lock_init(&d->arch.vits[i].lock);
+
+         its_domain_init(i, d);
          register_mmio_handler(d, &vgic_gits_mmio_handler,
                                d->arch.vits[i].phys_base,
                                SZ_64K);
@@ -1465,6 +1467,12 @@ int vgic_its_domain_init(struct domain *d)
     return 0;
 }
 
+void vgic_its_init(void)
+{
+    if ( gic_lpi_supported() )
+        its_lpi_init(gic_nr_id_bits());
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c
index e9ec7fa..578537f 100644
--- a/xen/arch/arm/vgic-v3.c
+++ b/xen/arch/arm/vgic-v3.c
@@ -1204,6 +1204,7 @@ static int vgic_v3_domain_init(struct domain *d)
             d->arch.vgic.rdist_regions[i].size);
 
     d->arch.vgic.ctlr = VGICD_CTLR_DEFAULT;
+    vgic_its_domain_init(d);
 
     return 0;
 }
diff --git a/xen/include/asm-arm/gic-its.h b/xen/include/asm-arm/gic-its.h
index ed9cbc9..519321b 100644
--- a/xen/include/asm-arm/gic-its.h
+++ b/xen/include/asm-arm/gic-its.h
@@ -226,6 +226,10 @@ int gic_its_send_cmd(struct vcpu *v, struct its_node *its,
                      struct its_cmd_block *phys_cmd, int send_all);
 int its_make_dt_node(const struct domain *d,
                      const struct dt_device_node *node, void *fdt);
+int its_cpu_init(void);
+int its_init(struct dt_device_node *node, struct rdist_prop *rdist);
+int its_lpi_init(u32 id_bits);
+void its_domain_init(uint32_t its_nr, struct domain *d);
 void its_lpi_free(unsigned long *bitmap, int base, int nr_ids);
 void its_set_affinity(struct irq_desc *d, int cpu);
 void lpi_set_config(struct irq_desc *d, int enable);
@@ -235,6 +239,7 @@ uint32_t its_get_nr_its(void);
 struct its_node * its_get_phys_node(uint32_t dev_id);
 int vgic_its_unmap_lpi_prop(struct vcpu *v);
 int vgic_its_get_pid(struct vcpu *v, uint32_t vlpi, uint32_t *plpi);
+int vgic_its_domain_init(struct domain *d);
 uint8_t vgic_its_get_priority(struct vcpu *v, uint32_t pid);
 #endif /* __ASM_ARM_GIC_ITS_H__ */
 
diff --git a/xen/include/asm-arm/gic.h b/xen/include/asm-arm/gic.h
index d927f35..2467212 100644
--- a/xen/include/asm-arm/gic.h
+++ b/xen/include/asm-arm/gic.h
@@ -271,6 +271,10 @@ extern void gic_dump_info(struct vcpu *v);
 
 /* Number of interrupt lines */
 extern unsigned int gic_number_lines(void);
+/* Number of interrupt id bits supported */
+extern unsigned int gic_nr_id_bits(void);
+/* LPI support info */
+bool_t gic_lpi_supported(void);
 
 /* IRQ translation function for the device tree */
 int gic_irq_xlate(const u32 *intspec, unsigned int intsize,
@@ -286,6 +290,10 @@ struct gic_info {
     uint8_t nr_lrs;
     /* Maintenance irq number */
     unsigned int maintenance_irq;
+    /* Number of IRQ ID bits supported */
+    uint32_t nr_id_bits;
+    /* Whether LPIs are supported */
+    bool_t lpi_supported; 
 };
 
 struct gic_hw_operations {
diff --git a/xen/include/asm-arm/gic_v3_defs.h b/xen/include/asm-arm/gic_v3_defs.h
index 125fc28..214b492 100644
--- a/xen/include/asm-arm/gic_v3_defs.h
+++ b/xen/include/asm-arm/gic_v3_defs.h
@@ -48,6 +48,7 @@
 #define GICR_CTL_ENABLE              (1U << 0)
 /* Additional bits in GICD_TYPER defined by GICv3 */
 #define GICD_TYPE_ID_BITS_SHIFT 19
+#define GICD_TYPER_LPIS_SUPPORTED    (1U << 17)
 
 #define GICD_CTLR_RWP                (1UL << 31)
 #define GICD_CTLR_ARE_NS             (1U << 4)
diff --git a/xen/include/asm-arm/vgic.h b/xen/include/asm-arm/vgic.h
index dd93872..f2cb268 100644
--- a/xen/include/asm-arm/vgic.h
+++ b/xen/include/asm-arm/vgic.h
@@ -177,6 +177,7 @@ enum gic_sgi_mode;
 
 #define vgic_num_irqs(d)        ((d)->arch.vgic.nr_spis + 32)
 
+extern void vgic_its_init(void);
 extern int domain_vgic_init(struct domain *d);
 extern void domain_vgic_free(struct domain *d);
 extern int vcpu_vgic_init(struct vcpu *v);
-- 
1.7.9.5

* [RFC PATCH v2 22/22] xen/arm: its: Generate ITS dt node for DomU
  2015-03-19 14:37 [RFC PATCH v2 00/22] xen/arm: Add ITS support vijay.kilari
                   ` (20 preceding siblings ...)
  2015-03-19 14:38 ` [RFC PATCH v2 21/22] xen/arm: its: Initialize virtual and physical ITS driver vijay.kilari
@ 2015-03-19 14:38 ` vijay.kilari
  2015-03-20 13:37 ` [RFC PATCH v2 00/22] xen/arm: Add ITS support Julien Grall
  2015-03-20 16:23 ` Julien Grall
  23 siblings, 0 replies; 109+ messages in thread
From: vijay.kilari @ 2015-03-19 14:38 UTC (permalink / raw)
  To: Ian.Campbell, julien.grall, stefano.stabellini,
	stefano.stabellini, tim, xen-devel
  Cc: Prasun.Kapoor, Vijaya Kumar K, manish.jaggi, vijay.kilari

From: Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com>

Generate an ITS device tree node for DomU.
This patch generates the ITS node outside the GICv3 node
for DomU.

Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com>
---
 tools/libxl/libxl_arm.c       |   36 ++++++++++++++++++++++++++++++++++++
 xen/arch/arm/gic-v3-its.c     |    7 ++++++-
 xen/include/public/arch-arm.h |    3 +++
 3 files changed, 45 insertions(+), 1 deletion(-)

diff --git a/tools/libxl/libxl_arm.c b/tools/libxl/libxl_arm.c
index 65a762b..af6ab39 100644
--- a/tools/libxl/libxl_arm.c
+++ b/tools/libxl/libxl_arm.c
@@ -53,6 +53,7 @@ static struct arch_info {
 enum {
     PHANDLE_NONE = 0,
     PHANDLE_GIC,
+    PHANDLE_ITS,
 };
 
 typedef uint32_t be32;
@@ -362,6 +363,36 @@ static int make_gicv2_node(libxl__gc *gc, void *fdt,
     return 0;
 }
 
+static int make_its_node(libxl__gc *gc, void *fdt,
+                         uint64_t its_base, uint64_t its_size)
+{
+    int res;
+    const char *name = GCSPRINTF("gic-its@%"PRIx64, its_base);
+
+    res = fdt_begin_node(fdt, name);
+    if (res) return res;
+
+    res = fdt_property_compat(gc, fdt, 1,
+                              "arm,gic-v3-its");
+    if (res) return res;
+
+    res = fdt_property(fdt, "msi-controller", NULL, 0);
+    if (res) return res;
+
+    res = fdt_property_regs(gc, fdt, ROOT_ADDRESS_CELLS, ROOT_SIZE_CELLS,
+                            1,
+                            its_base, its_size);
+    if (res) return res;
+
+    res = fdt_property_cell(fdt, "phandle", PHANDLE_ITS);
+    if (res) return res;
+
+    res = fdt_end_node(fdt);
+    if (res) return res;
+
+    return 0;
+}
+
 static int make_gicv3_node(libxl__gc *gc, void *fdt)
 {
     int res;
@@ -600,6 +631,11 @@ next_resize:
             break;
         case XEN_DOMCTL_CONFIG_GIC_V3:
             FDT( make_gicv3_node(gc, fdt) );
+            /*
+             * TODO: Generate this based on the domain config; the ITS node
+             * should be generated inside the gicv3 node
+             */
+            FDT( make_its_node(gc, fdt, GUEST_GICV3_ITS_BASE, GUEST_GICV3_ITS_SIZE) );
             break;
         default:
             LOG(ERROR, "Unknown GIC version %d", config.gic_version);
diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
index c2c59fa..dda4c20 100644
--- a/xen/arch/arm/gic-v3-its.c
+++ b/xen/arch/arm/gic-v3-its.c
@@ -1098,7 +1098,12 @@ void its_domain_init(uint32_t its_nr, struct domain *d)
 		d->arch.vits[its_nr].phys_size = its->phys_size;
 		d->arch.vits[its_nr].its  = its;
 	}
-	/* TODO: Update for DomU */
+	else
+	{
+		/* Only one vITS is supported for DomU */
+		d->arch.vits[0].phys_base = GUEST_GICV3_ITS_BASE ;
+		d->arch.vits[0].phys_size = GUEST_GICV3_ITS_SIZE;
+	}
 }
 
 static int its_probe(struct dt_device_node *node)
diff --git a/xen/include/public/arch-arm.h b/xen/include/public/arch-arm.h
index c2dcb66..0f129a2 100644
--- a/xen/include/public/arch-arm.h
+++ b/xen/include/public/arch-arm.h
@@ -381,6 +381,9 @@ typedef uint64_t xen_callback_t;
 #define GUEST_GICV3_GICR0_BASE     0x03020000ULL    /* vCPU0 - vCPU7 */
 #define GUEST_GICV3_GICR0_SIZE     0x00100000ULL
 
+#define GUEST_GICV3_ITS_BASE       0x03200000ULL
+#define GUEST_GICV3_ITS_SIZE       0x00200000ULL
+
 /*
  * 16MB == 4096 pages reserved for guest to use as a region to map its
  * grant table in.
-- 
1.7.9.5

* Re: [RFC PATCH v2 03/22] xen/arm: Add bitmap_find_next_zero_area helper function
  2015-03-19 14:37 ` [RFC PATCH v2 03/22] xen/arm: Add bitmap_find_next_zero_area " vijay.kilari
@ 2015-03-20 13:35   ` Julien Grall
  0 siblings, 0 replies; 109+ messages in thread
From: Julien Grall @ 2015-03-20 13:35 UTC (permalink / raw)
  To: vijay.kilari, Ian.Campbell, stefano.stabellini,
	stefano.stabellini, tim, xen-devel
  Cc: Prasun.Kapoor, vijaya.kumar, manish.jaggi

Hi Vijay,

On 19/03/2015 14:37, vijay.kilari@gmail.com wrote:
> From: Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com>
>
> bitmap_find_next_zero_area helper function will be used
> by physical ITS driver imported from linux
>
> Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com>
> ---
>   xen/arch/arm/arm64/lib/find_next_bit.c |   39 ++++++++++++++++++++++++++++++++

The code of bitmap_find_next_zero_area is generic, why didn't you 
implement it in xen/common/bitmap.c?

FWIW, the Linux implementation is generic...

Regards,

-- 
Julien Grall

* Re: [RFC PATCH v2 00/22] xen/arm: Add ITS support
  2015-03-19 14:37 [RFC PATCH v2 00/22] xen/arm: Add ITS support vijay.kilari
                   ` (21 preceding siblings ...)
  2015-03-19 14:38 ` [RFC PATCH v2 22/22] xen/arm: its: Generate ITS dt node for DomU vijay.kilari
@ 2015-03-20 13:37 ` Julien Grall
  2015-03-20 16:23 ` Julien Grall
  23 siblings, 0 replies; 109+ messages in thread
From: Julien Grall @ 2015-03-20 13:37 UTC (permalink / raw)
  To: vijay.kilari, Ian.Campbell, stefano.stabellini,
	stefano.stabellini, tim, xen-devel
  Cc: Prasun.Kapoor, vijaya.kumar, manish.jaggi

Hi Vijay,

On 19/03/2015 14:37, vijay.kilari@gmail.com wrote:
> From: Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com>
>
> Add ITS support for arm. Following major features
> are supported
>   - GICv3 ITS support for arm64 platform
>   - Supports multi ITS node
>   - LPI descriptors are allocated on-demand
>   - Only ITS Dom0 is supported
>
> Tested with single ITS node.

It would have been nice to give a link to your github in this cover 
letter too.

Regards,

-- 
Julien Grall

* Re: [RFC PATCH v2 06/22] xen/arm: its: Port ITS driver to xen
  2015-03-19 14:37 ` [RFC PATCH v2 06/22] xen/arm: its: Port ITS driver to xen vijay.kilari
@ 2015-03-20 15:06   ` Julien Grall
  2015-03-23 12:24     ` Vijay Kilari
  2015-04-01 11:34   ` Ian Campbell
  1 sibling, 1 reply; 109+ messages in thread
From: Julien Grall @ 2015-03-20 15:06 UTC (permalink / raw)
  To: vijay.kilari, Ian.Campbell, stefano.stabellini,
	stefano.stabellini, tim, xen-devel
  Cc: Prasun.Kapoor, vijaya.kumar, manish.jaggi

Hello Vijay,

On 19/03/2015 14:37, vijay.kilari@gmail.com wrote:
>   static LIST_HEAD(its_nodes);
>   static DEFINE_SPINLOCK(its_lock);
> -static struct device_node *gic_root_node;
> -static struct rdists *gic_rdists;
> +static struct dt_device_node *gic_root_node;
> +static struct rdist_prop  *gic_rdists;
>
> -#define gic_data_rdist()		(raw_cpu_ptr(gic_rdists->rdist))
> -#define gic_data_rdist_rd_base()	(gic_data_rdist()->rd_base)
> +#define gic_data_rdist()		(per_cpu(rdist, smp_processor_id()))

Again, why didn't you return a pointer here? It would have avoided
some confusing changes (s/->/./) in the code.

#define gic_data_rdist()	(&per_cpu(rdist, smp_processor_id()))

> +#define gic_data_rdist_rd_base()	(per_cpu(rdist, smp_processor_id()).rbase)

That would avoid this change too.
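
To illustrate the point, here is a standalone sketch, not Xen code: the
per-CPU storage and smp_processor_id() are replaced by hypothetical stand-ins
(a plain array and a global index) so the snippet compiles on its own. With a
pointer-returning accessor, callers keep the `->` syntax of the Linux code.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS_SKETCH 4

/* Hypothetical stand-in for the per-CPU redistributor data. */
struct rdist_sketch {
    uintptr_t rbase;
};

static struct rdist_sketch rdist_sketch_percpu[NR_CPUS_SKETCH];
static unsigned int current_cpu_sketch;

/* Hypothetical stand-in for smp_processor_id(). */
static unsigned int smp_processor_id_sketch(void)
{
    assert(current_cpu_sketch < NR_CPUS_SKETCH);
    return current_cpu_sketch;
}

/* The accessor yields a pointer, so call sites can keep using "->". */
#define gic_data_rdist() \
    (&rdist_sketch_percpu[smp_processor_id_sketch()])
#define gic_data_rdist_rd_base() (gic_data_rdist()->rbase)
```

A call site such as gic_data_rdist()->rbase = base; then reads the same as in
the imported driver, apart from the field name.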

>
>   /*
>    * ITS command descriptors - parameters to be encoded in a command
> @@ -228,10 +243,10 @@ static struct its_collection *its_build_mapd_cmd(struct its_cmd_block *cmd,
>   						 struct its_cmd_desc *desc)
>   {
>   	unsigned long itt_addr;
> -	u8 size = ilog2(desc->its_mapd_cmd.dev->nr_ites);
> +	u8 size = max(fls(desc->its_mapd_cmd.dev->nr_ites) - 1, 1);

ilog2 on a uint32_t is defined as fls(val) - 1. Where does the max come
from?

IMHO, I would define ilog2 in Xen; that would be easier.
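
Something along these lines would do, as a rough sketch. sketch_fls() is a
helper written only for this example so that it builds standalone; in Xen the
macro would sit on top of the existing fls():

```c
/*
 * Stand-in for fls(): 1-based index of the most significant set bit,
 * 0 when x == 0. Written for this example only.
 */
static inline int sketch_fls(unsigned int x)
{
    int r = 0;

    while (x) {
        r++;
        x >>= 1;
    }
    return r;
}

/* Linux-style ilog2() for a 32-bit value; not meaningful for 0. */
#define ilog2(x)  (sketch_fls(x) - 1)
```

The ported code could then keep the original ilog2() calls untouched.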

>
> -	itt_addr = virt_to_phys(desc->its_mapd_cmd.dev->itt);
> -	itt_addr = ALIGN(itt_addr, ITS_ITT_ALIGN);
> +	itt_addr = __pa(desc->its_mapd_cmd.dev->itt);
> +        itt_addr = ROUNDUP(itt_addr, ITS_ITT_ALIGN);

This file uses the Linux coding style. Please use hard tabs.

>
>   	its_encode_cmd(cmd, GITS_CMD_MAPD);
>   	its_encode_devid(cmd, desc->its_mapd_cmd.dev->device_id);
> @@ -348,7 +363,7 @@ static struct its_cmd_block *its_allocate_entry(struct its_node *its)
>   	while (its_queue_full(its)) {
>   		count--;
>   		if (!count) {
> -			pr_err_ratelimited("ITS queue not draining\n");
> +			its_err("ITS queue not draining\n");

its_err and pr_err_ratelimited are not the same thing. The former is
not rate-limited.

AFAICT this function will be reachable in some way from the guest, so a
guest could DoS Xen by flooding the console when sending commands.
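
To illustrate what rate limiting buys here, a minimal standalone sketch of a
burst/interval limiter. All names below are hypothetical and written for this
example; this is not the Linux or Xen implementation:

```c
#include <stdbool.h>

/* Hypothetical limiter: at most 'burst' messages per 'interval' ticks. */
struct ratelimit {
    unsigned long interval;      /* window length, in ticks */
    unsigned int burst;          /* messages allowed per window */
    unsigned long window_start;  /* tick at which the window opened */
    unsigned int printed;        /* messages emitted in this window */
};

/* Return true if the caller may print at time 'now'. */
static bool ratelimit_ok(struct ratelimit *rl, unsigned long now)
{
    /* Open a new window once the previous one has expired. */
    if (now - rl->window_start >= rl->interval) {
        rl->window_start = now;
        rl->printed = 0;
    }

    if (rl->printed >= rl->burst)
        return false;  /* drop the message */

    rl->printed++;
    return true;
}
```

The error path would only print when ratelimit_ok() returns true, so a guest
hammering the command queue cannot flood the console.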

>   			return NULL;
>   		}
>   		cpu_relax();
> @@ -380,7 +395,7 @@ static void its_flush_cmd(struct its_node *its, struct its_cmd_block *cmd)
>   	 * the ITS.
>   	 */
>   	if (its->flags & ITS_FLAGS_CMDQ_NEEDS_FLUSHING)
> -		__flush_dcache_area(cmd, sizeof(*cmd));
> +		clean_and_invalidate_dcache_va_range(cmd, sizeof(*cmd));
>   	else
>   		dsb(ishst);
>   }
> @@ -402,7 +417,7 @@ static void its_wait_for_range_completion(struct its_node *its,
>
>   		count--;
>   		if (!count) {
> -			pr_err_ratelimited("ITS queue timeout\n");
> +			its_err("ITS queue timeout\n");

Ditto

[..]

> -static void its_send_inv(struct its_device *dev, u32 event_id)
> +/* TODO: Remove static for the sake of compilation */
> +void its_send_inv(struct its_device *dev, u32 event_id)

Rather than changing the prototype, would it be possible to #if 0 the
function? It would make the changes easier to track.

>   {
>   	struct its_cmd_desc desc;
>
> @@ -479,7 +495,8 @@ static void its_send_mapc(struct its_node *its, struct its_collection *col,
>   	its_send_single_command(its, its_build_mapc_cmd, &desc);
>   }
>
> -static void its_send_mapvi(struct its_device *dev, u32 irq_id, u32 id)
> +/* TODO: Remove static for the sake of compilation */
> +void its_send_mapvi(struct its_device *dev, u32 irq_id, u32 id)

Ditto, and the same goes for all changes of this kind.

[..]

> -static unsigned long *its_lpi_alloc_chunks(int nr_irqs, int *base, int *nr_ids)
> +static unsigned long *its_lpi_alloc_chunks(int nirqs, int *base, int *nr_ids)

This is because nr_irqs is a define on ARM, right? If so, I would prefer
to define nr_irqs as a variable.

[..]

>   /*
> @@ -745,31 +769,31 @@ static void its_lpi_free(unsigned long *bitmap, int base, int nr_ids)
>   /*
>    * This is how many bits of ID we need, including the useless ones.
>    */
> -#define LPI_NRBITS		ilog2(LPI_PROPBASE_SZ + SZ_8K)
> +#define LPI_NRBITS		fls(LPI_PROPBASE_SZ + SZ_8K) - 1

Missing parenthesis.
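
A small standalone sketch of why this matters. demo_fls() is a stand-in
written for this example, and LPI_PROPBASE_SZ == SZ_64K is an assumption made
for the arithmetic:

```c
/* Stand-in for fls(), written for this example only. */
static inline int demo_fls(unsigned int x)
{
    int r = 0;

    while (x) {
        r++;
        x >>= 1;
    }
    return r;
}

#define SZ_8K            0x2000
#define SZ_64K           0x10000
#define LPI_PROPBASE_SZ  SZ_64K  /* assumed value for this example */

/* As in the patch: the body is not parenthesised. */
#define LPI_NRBITS_UNSAFE  demo_fls(LPI_PROPBASE_SZ + SZ_8K) - 1

/* Parenthesised body: safe inside any expression. */
#define LPI_NRBITS         (demo_fls(LPI_PROPBASE_SZ + SZ_8K) - 1)
```

With these values 2 * LPI_NRBITS is 2 * 16, while 2 * LPI_NRBITS_UNSAFE
expands to 2 * 17 - 1. The current user, (LPI_NRBITS - 1), happens to work
either way, but the macro is a trap for the next one.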

>
>   #define LPI_PROP_DEFAULT_PRIO	0xa0

I would either move LPI_PROP_DEFAULT_PRIO to asm-arm/gic.h or define it
using GIC_PRI_IRQ.

This would allow us to change the priority later without causing issues
with LPIs.

>
>   static int __init its_alloc_lpi_tables(void)
>   {
> -	phys_addr_t paddr;
> +	paddr_t paddr;
>
> -	gic_rdists->prop_page = alloc_pages(GFP_NOWAIT,
> -					   get_order(LPI_PROPBASE_SZ));
> +	gic_rdists->prop_page = alloc_xenheap_pages(get_order_from_bytes(LPI_PROPBASE_SZ), 0);
>   	if (!gic_rdists->prop_page) {
> -		pr_err("Failed to allocate PROPBASE\n");
> +		its_err("Failed to allocate PROPBASE\n");
>   		return -ENOMEM;
>   	}
>
> -	paddr = page_to_phys(gic_rdists->prop_page);
> -	pr_info("GIC: using LPI property table @%pa\n", &paddr);
> +	paddr = __pa(gic_rdists->prop_page);
> +	its_info("GIC: using LPI property table @%pa\n", &paddr);

IIRC, %pa doesn't exist on Xen.

>
>   	/* Priority 0xa0, Group-1, disabled */
> -	memset(page_address(gic_rdists->prop_page),
> +	memset(gic_rdists->prop_page,
>   	       LPI_PROP_DEFAULT_PRIO | LPI_PROP_GROUP1,
>   	       LPI_PROPBASE_SZ);
>
>   	/* Make sure the GIC will observe the written configuration */
> -	__flush_dcache_area(page_address(gic_rdists->prop_page), LPI_PROPBASE_SZ);
> +	clean_and_invalidate_dcache_va_range(gic_rdists->prop_page,
> +	                                     LPI_PROPBASE_SZ);
>
>   	return 0;
>   }
> @@ -790,7 +814,7 @@ static void its_free_tables(struct its_node *its)
>
>   	for (i = 0; i < GITS_BASER_NR_REGS; i++) {
>   		if (its->tables[i]) {
> -			free_page((unsigned long)its->tables[i]);
> +			xfree(its->tables[i]);

The memory for the table is allocated via alloc_xenheap_pages. So 
freeing the memory should be done via free_xenheap_pages.

>   			its->tables[i] = NULL;
>   		}
>   	}
> @@ -807,7 +831,7 @@ static int its_alloc_tables(struct its_node *its)
>   		u64 val = readq_relaxed(its->base + GITS_BASER + i * 8);
>   		u64 type = GITS_BASER_TYPE(val);
>   		u64 entry_size = GITS_BASER_ENTRY_SIZE(val);
> -		int order = get_order(psz);
> +		int order = get_order_from_bytes(psz);

I saw multiple changes from get_order to get_order_from_bytes.

I think get_order is very handy and more compact. I would add a macro
for it.
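
For example, as a sketch. The get_order_from_bytes() below is a stand-in
written so that the example builds on its own, with PAGE_SHIFT assumed to be
12; in Xen only the last line would be needed:

```c
#define PAGE_SHIFT 12  /* assumed 4K pages for this example */

/* Stand-in: smallest order such that (PAGE_SIZE << order) >= size. */
static inline unsigned int get_order_from_bytes(unsigned long size)
{
    unsigned int order;

    size = (size - 1) >> PAGE_SHIFT;
    for (order = 0; size; order++)
        size >>= 1;

    return order;
}

/* Keep the Linux name so the imported code does not need to change. */
#define get_order(sz)  get_order_from_bytes(sz)
```

That keeps the diff against the Linux driver smaller.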

>   		int alloc_size;
>   		u64 tmp;
>   		void *base;
> @@ -827,25 +851,25 @@ static int its_alloc_tables(struct its_node *its)
>   			u64 typer = readq_relaxed(its->base + GITS_TYPER);
>   			u32 ids = GITS_TYPER_DEVBITS(typer);
>
> -			order = get_order((1UL << ids) * entry_size);
> +			order = get_order_from_bytes((1UL << ids) * entry_size);
>   			if (order >= MAX_ORDER) {
>   				order = MAX_ORDER - 1;
> -				pr_warn("%s: Device Table too large, reduce its page order to %u\n",
> -					its->msi_chip.of_node->full_name, order);
> +				its_warn("Device Table too large, reduce its page order to %u\n",
> +					 order);
>   			}
>   		}
>
>   		alloc_size = (1 << order) * PAGE_SIZE;
> -		base = (void *)__get_free_pages(GFP_KERNEL | __GFP_ZERO, order);
> +		base = alloc_xenheap_pages(order, 0);
>   		if (!base) {
>   			err = -ENOMEM;
>   			goto out_free;
>   		}
> -

This change is not necessary.

> +		memset(base, 0, alloc_size);
>   		its->tables[i] = base;

[..]

>   	/* If we didn't allocate the pending table yet, do it now */
> -	pend_page = gic_data_rdist()->pend_page;
> +	pend_page = gic_data_rdist().pend_page;
>   	if (!pend_page) {
> -		phys_addr_t paddr;
> +		paddr_t paddr;
>   		/*
>   		 * The pending pages have to be at least 64kB aligned,
>   		 * hence the 'max(LPI_PENDBASE_SZ, SZ_64K)' below.
>   		 */
> -		pend_page = alloc_pages(GFP_NOWAIT | __GFP_ZERO,
> -					get_order(max(LPI_PENDBASE_SZ, SZ_64K)));
> +		pend_page = alloc_xenheap_pages(get_order_from_bytes(max(LPI_PENDBASE_SZ, SZ_64K)), 0);

This line is too long.

>   		if (!pend_page) {
> -			pr_err("Failed to allocate PENDBASE for CPU%d\n",
> +			its_err("Failed to allocate PENDBASE for CPU%d\n",
>   			       smp_processor_id());

indentation

>   			return;
>   		}
> -

Spurious change

> +		memset(pend_page, 0, max(LPI_PENDBASE_SZ, SZ_64K));
>   		/* Make sure the GIC will observe the zero-ed page */
> -		__flush_dcache_area(page_address(pend_page), LPI_PENDBASE_SZ);
> +		clean_and_invalidate_dcache_va_range(pend_page, LPI_PENDBASE_SZ);
>
> -		paddr = page_to_phys(pend_page);
> -		pr_info("CPU%d: using LPI pending table @%pa\n",
> +		paddr = __pa(pend_page);
> +		its_info("CPU%d: using LPI pending table @%pa\n",
>   			smp_processor_id(), &paddr);
> -		gic_data_rdist()->pend_page = pend_page;
> +		gic_data_rdist().pend_page = pend_page;
>   	}
>
>   	/* Disable LPIs */
> @@ -971,7 +993,7 @@ static void its_cpu_init_lpis(void)
>   	dsb(sy);
>
>   	/* set PROPBASE */
> -	val = (page_to_phys(gic_rdists->prop_page) |
> +	val = (__pa(gic_rdists->prop_page)   |
>   	       GICR_PROPBASER_InnerShareable |
>   	       GICR_PROPBASER_WaWb |
>   	       ((LPI_NRBITS - 1) & GICR_PROPBASER_IDBITS_MASK));
> @@ -980,12 +1002,12 @@ static void its_cpu_init_lpis(void)
>   	tmp = readq_relaxed(rbase + GICR_PROPBASER);
>
>   	if ((tmp ^ val) & GICR_PROPBASER_SHAREABILITY_MASK) {
> -		pr_info_once("GIC: using cache flushing for LPI property table\n");
> +		its_info("GIC: using cache flushing for LPI property table\n");

Not really the same.

[..]

> -static int its_alloc_device_irq(struct its_device *dev, irq_hw_number_t *hwirq)
> +/* TODO: Remove static for the sake of compilation */
> +int its_alloc_device_irq(struct its_device *dev, int *hwirq)
>   {
>   	int idx;
>
> @@ -1139,6 +1169,8 @@ static int its_alloc_device_irq(struct its_device *dev, irq_hw_number_t *hwirq)
>   	return 0;
>   }
>
> +/* pci and msi handling no more required here */

Hmmm why?

> +#if 0
>   struct its_pci_alias {
>   	struct pci_dev	*pdev;
>   	u32		dev_id;
> @@ -1218,6 +1250,9 @@ static struct msi_domain_info its_pci_msi_domain_info = {
>   	.chip	= &its_msi_irq_chip,
>   };
>
> +#endif
> +/* IRQ domain management is not required */
> +#if 0
>   static int its_irq_gic_domain_alloc(struct irq_domain *domain,
>   				    unsigned int virq,
>   				    irq_hw_number_t hwirq)
> @@ -1319,6 +1354,7 @@ static const struct irq_domain_ops its_domain_ops = {
>   	.activate		= its_irq_domain_activate,
>   	.deactivate		= its_irq_domain_deactivate,
>   };
> +#endif
>
>   static int its_force_quiescent(void __iomem *base)
>   {
> @@ -1348,58 +1384,57 @@ static int its_force_quiescent(void __iomem *base)
>   	}
>   }
>
> -static int its_probe(struct device_node *node, struct irq_domain *parent)
> +static int its_probe(struct dt_device_node *node)

[..]

>   	err = its_force_quiescent(its_base);
>   	if (err) {
> -		pr_warn("%s: failed to quiesce, giving up\n",
> +		its_warn("%s: failed to quiesce, giving up\n",
>   			node->full_name);

Indentation.

[..]

>   	if ((tmp ^ baser) & GITS_BASER_SHAREABILITY_MASK) {
> -		pr_info("ITS: using cache flushing for cmd queue\n");
> +		its_info("ITS: using cache flushing for cmd queue\n");
>   		its->flags |= ITS_FLAGS_CMDQ_NEEDS_FLUSHING;
>   	}
> -

Spurious change

> +#if 0
>   	if (of_property_read_bool(its->msi_chip.of_node, "msi-controller")) {
>   		its->domain = irq_domain_add_tree(NULL, &its_domain_ops, its);
>   		if (!its->domain) {
> @@ -1451,27 +1486,28 @@ static int its_probe(struct device_node *node, struct irq_domain *parent)
>   		if (err)
>   			goto out_free_domains;
>   	}
> -

Ditto

> +#endif
>   	spin_lock(&its_lock);
>   	list_add(&its->entry, &its_nodes);
>   	spin_unlock(&its_lock);
>
>   	return 0;
> -

Ditto

> +#if 0
>   out_free_domains:

[..]

> -static struct of_device_id its_device_id[] = {
> -	{	.compatible	= "arm,gic-v3-its",	},
> -	{},
> -};
> -
> -int its_init(struct device_node *node, struct rdists *rdists,
> -	     struct irq_domain *parent_domain)
> +int its_init(struct dt_device_node *node, struct rdist_prop *rdists)
>   {
> -	struct device_node *np;
> +	struct dt_device_node *np = NULL;
> +
> +	static const struct dt_device_match its_device_ids[] __initconst =
> +	{
> +		DT_MATCH_COMPATIBLE("arm,gic-v3-its"),
> +		{ /* sentinel */ },
> +	};
> +

Already said on V1: of_device_id and dt_device_match are compatible. If 
you change the name it will work too...

> +	while ((np = dt_find_matching_node(np, its_device_ids)))
> +	{
> +		if (!dt_find_property(np, "msi-controller", NULL))
> +		continue;

In your cover letter, you said you support multiple ITS nodes, but this
piece of code shows that it's not the case...

> +	}
>
> -	for (np = of_find_matching_node(node, its_device_id); np;
> -	     np = of_find_matching_node(np, its_device_id)) {
> -		its_probe(np, parent_domain);

The for loop was perfect, why did you drop it?
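
To show the shape I have in mind, a toy standalone sketch. The node table and
toy_find_matching_node() are made up for this example and only mimic an
iterator that returns the next matching node after 'from':

```c
#include <stddef.h>
#include <string.h>

/* Toy device-tree node, for illustration only. */
struct toy_node {
    const char *compatible;
    int msi_controller;  /* non-zero if "msi-controller" is present */
};

static struct toy_node toy_nodes[] = {
    { "arm,gic-v3-its", 1 },
    { "arm,gic-v3",     0 },
    { "arm,gic-v3-its", 0 },
    { "arm,gic-v3-its", 1 },
};

#define TOY_NR_NODES (sizeof(toy_nodes) / sizeof(toy_nodes[0]))

/* Return the next node after 'from' matching 'compat', or NULL. */
static struct toy_node *toy_find_matching_node(struct toy_node *from,
                                               const char *compat)
{
    size_t i = from ? (size_t)(from - toy_nodes) + 1 : 0;

    for (; i < TOY_NR_NODES; i++)
        if (!strcmp(toy_nodes[i].compatible, compat))
            return &toy_nodes[i];

    return NULL;
}

/* Probe every matching node, not just one. Returns the number probed. */
static int toy_probe_all(void)
{
    struct toy_node *np;
    int probed = 0;

    for (np = toy_find_matching_node(NULL, "arm,gic-v3-its"); np;
         np = toy_find_matching_node(np, "arm,gic-v3-its")) {
        if (!np->msi_controller)
            continue;
        probed++;  /* its_probe(np) would be called here */
    }

    return probed;
}
```

The probe call sits inside the loop, so each ITS node carrying the
msi-controller property gets probed.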

> +	if (np) {
> +		its_probe(np);
>   	}
>
>   	if (list_empty(&its_nodes)) {
> -		pr_warn("ITS: No ITS available, not enabling LPIs\n");
> +		its_warn("ITS: No ITS available, not enabling LPIs\n");
>   		return -ENXIO;
>   	}
>
> diff --git a/xen/include/asm-arm/gic_v3_defs.h b/xen/include/asm-arm/gic_v3_defs.h
> index 4e64b56..f8bac52 100644
> --- a/xen/include/asm-arm/gic_v3_defs.h
> +++ b/xen/include/asm-arm/gic_v3_defs.h
> @@ -59,11 +59,12 @@
>   #define GICR_WAKER_ProcessorSleep    (1U << 1)
>   #define GICR_WAKER_ChildrenAsleep    (1U << 2)
>
> -#define GICD_PIDR2_ARCH_REV_MASK     (0xf0)
> +#define GIC_PIDR2_ARCH_REV_MASK      (0xf0)
> +#define GICD_PIDR2_ARCH_REV_MASK     GIC_PIDR2_ARCH_REV_MASK

Why do you define GIC_PIDR2_ARCH_REV_MASK? It's not consistent with the 
other part of the code.

>   #define GICD_PIDR2_ARCH_REV_SHIFT    (0x4)
>   #define GICD_PIDR2_ARCH_GICV3        (0x3)
>
> -#define GICR_PIDR2_ARCH_REV_MASK     GICD_PIDR2_ARCH_REV_MASK
> +#define GICR_PIDR2_ARCH_REV_MASK     GIC_PIDR2_ARCH_REV_MASK

Why this change? GICD_PIDR2_ARCH_REV_MASK still exists...

Regards,

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [RFC PATCH v2 00/22] xen/arm: Add ITS support
  2015-03-19 14:37 [RFC PATCH v2 00/22] xen/arm: Add ITS support vijay.kilari
                   ` (22 preceding siblings ...)
  2015-03-20 13:37 ` [RFC PATCH v2 00/22] xen/arm: Add ITS support Julien Grall
@ 2015-03-20 16:23 ` Julien Grall
  2015-03-23 12:37   ` Vijay Kilari
  23 siblings, 1 reply; 109+ messages in thread
From: Julien Grall @ 2015-03-20 16:23 UTC (permalink / raw)
  To: vijay.kilari, Ian.Campbell, stefano.stabellini,
	stefano.stabellini, tim, xen-devel
  Cc: Prasun.Kapoor, vijaya.kumar, manish.jaggi

Hi Vijay,

On 19/03/2015 14:37, vijay.kilari@gmail.com wrote:
> From: Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com>
>
> Add ITS support for arm. Following major features
> are supported
>   - GICv3 ITS support for arm64 platform
>   - Supports multi ITS node
>   - LPI descriptors are allocated on-demand
>   - Only ITS Dom0 is supported
>
> Tested with single ITS node.

Some thoughts about the whole design:

Your vGIC ITS driver does too many things. In general a virtual driver
should only emulate the hardware for the domain and forward the request
to the physical driver.

Your series adds device management (create/free) in the vITS, which is 
wrong.

How do you check if the domain can use the device?
Currently, you allow any domain to use any device. That would cause a
big mess with guests using passthrough.

Also, will the guest always pass the correct devid? If not, how do
you plan to handle it?

Regards,

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [RFC PATCH v2 10/22] xen/arm: Add helper function to get domain page
  2015-03-19 14:37 ` [RFC PATCH v2 10/22] xen/arm: Add helper function to get domain page vijay.kilari
@ 2015-03-20 16:39   ` Julien Grall
  0 siblings, 0 replies; 109+ messages in thread
From: Julien Grall @ 2015-03-20 16:39 UTC (permalink / raw)
  To: vijay.kilari, Ian.Campbell, stefano.stabellini,
	stefano.stabellini, tim, xen-devel
  Cc: Prasun.Kapoor, vijaya.kumar, manish.jaggi

Hello Vijay,

On 19/03/2015 14:37, vijay.kilari@gmail.com wrote:
> +struct page_info *get_page_from_paddr(struct domain *d, paddr_t paddr,
> +                                      unsigned long flags)
> +{
> +    struct p2m_domain *p2m = &d->arch.p2m;
> +    struct page_info *page = NULL;
> +
> +    ASSERT(d == current->domain);
> +
> +    spin_lock(&p2m->lock);
> +
> +    if ( !mfn_valid(paddr >> PAGE_SHIFT) )

If I understand correctly this function is to get a page from an IPA, right?

Firstly, this function is wrong because you assume IPA == MFN. This is
not valid for guests and may not be for DOM0.

Secondly, we already have a function which does this job (see 
get_page_from_gfn). Why can't you use it?
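
To illustrate the first point, a toy standalone model of a translation step.
The flat table and every name below are invented for this example; the real
p2m is a page-table walk and get_page_from_gfn takes care of it:

```c
#include <stdint.h>

#define PAGE_SHIFT       12      /* assumed 4K pages */
#define TOY_INVALID_MFN  (~0UL)

typedef uint64_t paddr_t;

/* Toy p2m: a flat gfn -> mfn table. */
struct toy_p2m {
    unsigned long nr_gfns;
    const unsigned long *gfn_to_mfn;
};

/* Translate a guest address (IPA) to a machine frame number. */
static unsigned long toy_p2m_lookup(const struct toy_p2m *p2m, paddr_t ipa)
{
    unsigned long gfn = (unsigned long)(ipa >> PAGE_SHIFT);

    if (gfn >= p2m->nr_gfns)
        return TOY_INVALID_MFN;

    return p2m->gfn_to_mfn[gfn];  /* generally != gfn */
}
```

Calling mfn_valid() on paddr >> PAGE_SHIFT skips this translation and treats
the guest frame number as if it were a machine frame number.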

Regards,

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [RFC PATCH v2 12/22] xen/arm: its: Update irq descriptor for LPIs support
  2015-03-19 14:37 ` [RFC PATCH v2 12/22] xen/arm: its: Update irq descriptor for LPIs support vijay.kilari
@ 2015-03-20 16:44   ` Julien Grall
  2015-03-30 14:32     ` Vijay Kilari
  0 siblings, 1 reply; 109+ messages in thread
From: Julien Grall @ 2015-03-20 16:44 UTC (permalink / raw)
  To: vijay.kilari, Ian.Campbell, stefano.stabellini,
	stefano.stabellini, tim, xen-devel
  Cc: Prasun.Kapoor, vijaya.kumar, manish.jaggi

Hello Vijay,

On 19/03/2015 14:37, vijay.kilari@gmail.com wrote:
> diff --git a/xen/include/asm-arm/irq.h b/xen/include/asm-arm/irq.h
> index 435dfcd..f091739 100644
> --- a/xen/include/asm-arm/irq.h
> +++ b/xen/include/asm-arm/irq.h
> @@ -17,6 +17,8 @@ struct arch_pirq
>   struct arch_irq_desc {
>       int eoi_cpu;
>       unsigned int type;
> +    unsigned int virq;
> +    struct its_device *dev;
>   };

It seems you missed my comment again... As said on v1, this is not the
solution. You add data for every IRQ (around 16K in Xen) just for
handling LPIs.

I provided a patch to handle virq != irq [1] and we should use it in
order to diverge handling between LPIs and SPIs.

If you are not happy with it, please explain why.

Regards,

[1] https://patches.linaro.org/43012/

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [RFC PATCH v2 13/22] xen/arm: its: Add virtual ITS command support
  2015-03-19 14:38 ` [RFC PATCH v2 13/22] xen/arm: its: Add virtual ITS command support vijay.kilari
@ 2015-03-21  0:28   ` Julien Grall
  2015-03-23 15:52   ` Julien Grall
  2015-03-24 11:48   ` Julien Grall
  2 siblings, 0 replies; 109+ messages in thread
From: Julien Grall @ 2015-03-21  0:28 UTC (permalink / raw)
  To: vijay.kilari, Ian.Campbell, stefano.stabellini,
	stefano.stabellini, tim, xen-devel
  Cc: Prasun.Kapoor, vijaya.kumar, manish.jaggi

Hello Vijay,

On 19/03/2015 14:38, vijay.kilari@gmail.com wrote:
> From: Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com>
>
> Add Virtual ITS command processing support to
> Virtual ITS driver. Also add API's to in physical
> ITS driver to send commands from Virtual ITS driver.
>
> In this patch, following are done
>   -Physical ITS driver will allocate physical LPI for
>    virtual LPI request.

Please indent the same way as the other item.

>   - The Device ID is used to find the ITS on which it is attached
>     and ITS command is sent on that physical ITS.
>   - Commands like SYNC and INVALL does not have device id. So these
>     commands are sent on all Physical ITS nodes.
>   - The vTA(virtual target address) is considered unique way to map
>     to Physical target address and collection ids.
>
> Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com>
> ---
> v2: - put unused code under #if0/endif
>      - changes to redistributor is moved to separate patch
>      - Fixed comments from RFC version
> ---
>   xen/arch/arm/Makefile         |    1 +
>   xen/arch/arm/gic-v3-its.c     |  185 ++++++++-
>   xen/arch/arm/vgic-v3-its.c    |  879 +++++++++++++++++++++++++++++++++++++++++
>   xen/include/asm-arm/domain.h  |    9 +
>   xen/include/asm-arm/gic-its.h |   86 +++-
>   5 files changed, 1156 insertions(+), 4 deletions(-)
>
> diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
> index 66ea264..81a3317 100644
> --- a/xen/arch/arm/Makefile
> +++ b/xen/arch/arm/Makefile
> @@ -32,6 +32,7 @@ obj-y += traps.o
>   obj-y += vgic.o vgic-v2.o
>   obj-$(CONFIG_ARM_64) += vgic-v3.o
>   obj-$(CONFIG_ARM_64) += gic-v3-its.o
> +obj-$(CONFIG_ARM_64) += vgic-v3-its.o
>   obj-y += vtimer.o
>   obj-y += vuart.o
>   obj-y += hvm.o
> diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
> index 242cf65..a9aab73 100644
> --- a/xen/arch/arm/gic-v3-its.c
> +++ b/xen/arch/arm/gic-v3-its.c
> @@ -56,6 +56,14 @@
>
>   #define its_warn(fmt, ...)                                            \
>
> +//#define DEBUG_GIC_ITS
> +
> +#ifdef DEBUG_GIC_ITS
> +# define DPRINTK(fmt, args...) printk(XENLOG_DEBUG fmt, ##args)
> +#else
> +# define DPRINTK(fmt, args...) do {} while ( 0 )
> +#endif
> +
>   #define ITS_FLAGS_CMDQ_NEEDS_FLUSHING		(1 << 0)
>
>   #define RDIST_FLAGS_PROPBASE_NEEDS_FLUSHING	(1 << 0)
> @@ -68,6 +76,7 @@
>   struct its_collection {
>   	u64			target_address;
>   	u16			col_id;
> +	u16			valid;
>   };
>
>   /*
> @@ -80,14 +89,19 @@ struct its_node {
>   	struct list_head	entry;
>   	void __iomem		*base;
>   	unsigned long		phys_base;
> +	unsigned long		phys_size;
>   	struct its_cmd_block	*cmd_base;
>   	struct its_cmd_block	*cmd_write;
>   	void			*tables[GITS_BASER_NR_REGS];
>   	struct its_collection	*collections;
>   	u64			flags;
>   	u32			ite_size;
> +	u32			nr_collections;
> +	struct dt_device_node	*dt_node;
>   };
>
> +uint32_t pta_type;
> +

The physical target address type is defined per ITS. Therefore it should
be moved into the its_node.

>   #define ITS_ITT_ALIGN		SZ_256
>
>   static LIST_HEAD(its_nodes);
> @@ -127,6 +141,123 @@ struct its_cmd_desc {
>   	};
>   };
>
> +uint32_t its_get_pta_type(void)
> +{

That would require its_get_pta_type to take an its_node as a parameter.

> +	return pta_type;
> +}
> +
> +struct its_node * its_get_phys_node(uint32_t dev_id)
> +{
> +	struct its_node *its;
> +
> +	/* TODO: For now return ITS0 node.
> +	 * Need Query PCI helper function to get on which
> +	 * ITS node the device is attached
> +	 */
> +	list_for_each_entry(its, &its_nodes, entry) {
> +		return its;
> +	}
> +
> +	return NULL;
> +}
> +
> +static int its_search_rdist_address(struct domain *d, uint64_t ta,
> +				    uint32_t *col_id)
> +{
> +	int i, rg;
> +	paddr_t start, end;
> +
> +	for (rg = 0; rg < d->arch.vgic.nr_regions; rg++) {
> +		i = 0;
> +		start = d->arch.vgic.rdist_regions[rg].base;
> +		end = d->arch.vgic.rdist_regions[rg].base +
> +			d->arch.vgic.rdist_regions[rg].size;
> +		while ((( start + i * d->arch.vgic.rdist_stride) < end)) {
> +			if ((start + i * d->arch.vgic.rdist_stride) == ta) {
> +				DPRINTK("ITS: Found pta 0x%lx\n", ta);
> +				*col_id = i;
> +				return 0;
> +			}
> +			i++;
> +		}
> +	}
> +	return 1;
> +}
> +

If you consider col_id == vcpu_id, you may want to have a look at
get_vcpu_from_rdist.

> +int its_get_physical_cid(struct domain *d, uint32_t *col_id, uint64_t ta)

The function returns either 1 or 0. I would use bool_t.

After looking at the code of this function, it looks more like a
vITS-specific function than an ITS one.

> +{
> +	int i;
> +	struct its_collection *col;
> +
> +	/*
> +	* For Dom0, the target address info is collected
> +	* at boot time.
> +	*/
> +	if (is_hardware_domain(d)) {
> +		struct its_node *its;
> +
> +		list_for_each_entry(its, &its_nodes, entry) {
> +			for (i = 0; i < its->nr_collections; i++) {
> +		                col = &its->collections[i];
> +				if (col->valid && col->target_address == ta) {
> +					DPRINTK("ITS:Match ta 0x%lx ta 0x%lx\n",
> +						col->target_address, ta);
> +					*col_id = col->col_id;
> +					return 0;
> +				}
> +			}
> +			/* All collections are mapped on every physical ITS */

If this is true, you don't need the list_for_each_entry. Dropping it
would make the code less confusing.

Also, you are assuming that every ITS has the same value in
GITS_TYPER.PTA, which may not be true on every platform.

> +			break;
> +		}
> +	}
> +	else
> +	{
> +		/* As per Spec, Target address is re-distributor
> +		 * address/cpu number.
> +		 * We cannot rely on collection id as it can any number.
> +		 * So here we should rely only on vta address to map the
> +		 * collection. For domU, vta != target address.
> +		 * So, check vta is corresponds to which GICR region and
> +		 * consider that vcpu id as collection id.

A collection ID based on the VCPU ID may not be a valid collection on 
the physical ITS.

> +		 */
> +		if (its_get_pta_type()) {
> +			its_search_rdist_address(d, ta, col_id);
> +		}
> +		else
> +		{
> +			*col_id = ta;
> +			return 0;
> +		}
> +	}
> +
> +	DPRINTK("ITS: Cannot find valid pta entry for ta 0x%lx\n", ta);
> +	return 1;
> +}
> +
> +int its_get_target(uint8_t pcid, uint64_t *pta)

Ditto for the return type.

> +{
> +	int i;
> +	struct its_collection *col;
> +	struct its_node *its;
> +
> +	list_for_each_entry(its, &its_nodes, entry) {
> +		for (i = 0; i < its->nr_collections; i++) {
> +			col = &its->collections[i];
> +			if (col->valid && col->col_id == pcid) {
> +				*pta = col->target_address;
> +				DPRINTK("ITS:Match pta 0x%lx vta 0x%lx\n",
> +					col->target_address, *pta);
> +				return 0;
> +			}
> +		}
> +		/* All collections are mapped on every physical ITS */
> +		break;
> +	}
> +
> +	DPRINTK("ITS: Cannot find valid pta entry for vta 0x%lx\n",*pta);
> +	return 1;
> +}
> +
>   #define ITS_CMD_QUEUE_SZ		SZ_64K
>   #define ITS_CMD_QUEUE_NR_ENTRIES	(ITS_CMD_QUEUE_SZ / sizeof(struct its_cmd_block))
>
> @@ -541,7 +672,6 @@ out:
>   	return bitmap;
>   }
>
> -/* TODO: Remove static for the sake of compilation */
>   void its_lpi_free(unsigned long *bitmap, int base, int nr_ids)
>   {
>   	int lpi;
> @@ -859,6 +989,8 @@ static void its_cpu_init_collection(void)
>   		/* Perform collection mapping */
>   		its->collections[cpu].target_address = target;
>   		its->collections[cpu].col_id = cpu;
> +		its->collections[cpu].valid = 1;
> +		its->nr_collections++;
>
>   		its_send_mapc(its, &its->collections[cpu], 1);
>   		its_send_invall(its, &its->collections[cpu]);
> @@ -867,8 +999,7 @@ static void its_cpu_init_collection(void)
>   	spin_unlock(&its_lock);
>   }
>
> -/* TODO: Remove static for the sake of compilation */
> -int its_alloc_device_irq(struct its_device *dev, int *hwirq)
> +int its_alloc_device_irq(struct its_device *dev, uint32_t *hwirq)

In patch #6 "Port ITS driver to Xen" you change irq_hw_number_t to int.
Now you modify the type again, to uint32_t. Why not move directly to
the correct type in #6?

>   {
>   	int idx;
>
> @@ -882,6 +1013,47 @@ int its_alloc_device_irq(struct its_device *dev, int *hwirq)
>   	return 0;
>   }
>
> +static int its_send_cmd(struct vcpu *v, struct its_node *its,
> +			struct its_cmd_block *phys_cmd)
> +{
> +	struct its_cmd_block *cmd, *next_cmd;
> +
> +	spin_lock(&its->lock);

spin_lock_irqsave?

> +
> +	cmd = its_allocate_entry(its);
> +	if (!cmd)
> +		return 0;
> +
> +	cmd->raw_cmd[0] = phys_cmd->raw_cmd[0];
> +	cmd->raw_cmd[1] = phys_cmd->raw_cmd[1];
> +	cmd->raw_cmd[2] = phys_cmd->raw_cmd[2];
> +	cmd->raw_cmd[3] = phys_cmd->raw_cmd[3];

memcpy?
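
I mean something like the sketch below. The struct layout (four 64-bit words)
is assumed from the raw_cmd[0..3] accesses above, and its_copy_cmd() is a
hypothetical helper name used only for this example:

```c
#include <stdint.h>
#include <string.h>

/* Assumed layout: one ITS command is four 64-bit words. */
struct its_cmd_block {
    uint64_t raw_cmd[4];
};

/* Copy a whole command in one go instead of word by word. */
static void its_copy_cmd(struct its_cmd_block *dst,
                         const struct its_cmd_block *src)
{
    memcpy(dst, src, sizeof(*dst));
}
```

Using sizeof(*dst) also keeps the copy correct if the command size ever
changes.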

> +	its_flush_cmd(its, cmd);
> +
> +	next_cmd = its_post_commands(its);
> +	spin_unlock(&its->lock);
> +
> +	its_wait_for_range_completion(its, cmd, next_cmd);
> +
> +	return 1;
> +}
> +
> +int gic_its_send_cmd(struct vcpu *v, struct its_node *its,
> +		     struct its_cmd_block *phys_cmd, int send_all)
> +{
> +	struct its_node *pits;
> +	int ret = 0;
> +
> +	if (send_all) {
> +		list_for_each_entry(pits, &its_nodes, entry)
> +		ret = its_send_cmd(v, pits, phys_cmd);
> +	}
> +	else
> +		return its_send_cmd(v, its, phys_cmd);
> +
> +	return ret;
> +}
> +
>   static int its_force_quiescent(void __iomem *base)
>   {
>   	u32 count = 1000000;	/* 1s */
> @@ -955,10 +1127,17 @@ static int its_probe(struct dt_device_node *node)
>
>   	spin_lock_init(&its->lock);
>   	INIT_LIST_HEAD(&its->entry);
> +	its->dt_node = node;
>   	its->base = its_base;
>   	its->phys_base = its_addr;
> +	its->phys_size = its_size;

These 2 changes don't belong to this patch.

>   	its->ite_size = ((readl_relaxed(its_base + GITS_TYPER) >> 4) & 0xf) + 1;
>
> +	if ( (readq_relaxed(its->base + GITS_TYPER) & GITS_TYPER_PTA) )
> +		pta_type = 1;
> +	else
> +		pta_type = 0;
> +

Please add a define for those pta_type. It would be easier to understand 
the code.
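
For instance, as a sketch. The enum and function names are suggestions made
up for this example; bit 19 matches the VITS_GITS_TYPER_PTA_SHIFT value
defined later in this patch:

```c
#include <stdint.h>

/* GITS_TYPER.PTA is bit 19. */
#define GITS_TYPER_PTA  (1ULL << 19)

/* Hypothetical names for the two target address formats. */
enum its_pta_type {
    ITS_PTA_CPU_NUMBER    = 0,  /* target is a linear CPU number */
    ITS_PTA_RDIST_ADDRESS = 1,  /* target is a redistributor address */
};

/* Decode the PTA type from a GITS_TYPER value read from one ITS. */
static enum its_pta_type its_pta_type_from_typer(uint64_t typer)
{
    return (typer & GITS_TYPER_PTA) ? ITS_PTA_RDIST_ADDRESS
                                    : ITS_PTA_CPU_NUMBER;
}
```

The decoded value could then be stored in each its_node at probe time rather
than in a global.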

>   	its->cmd_base = xzalloc_bytes(ITS_CMD_QUEUE_SZ);
>   	if (!its->cmd_base) {
>   		err = -ENOMEM;
> diff --git a/xen/arch/arm/vgic-v3-its.c b/xen/arch/arm/vgic-v3-its.c
> new file mode 100644
> index 0000000..7530a88
> --- /dev/null
> +++ b/xen/arch/arm/vgic-v3-its.c
> @@ -0,0 +1,879 @@
> +/*
> + * Copyright (C) 2013, 2014 ARM Limited, All Rights Reserved.
> + * Author: Marc Zyngier <marc.zyngier@arm.com>
> + *
> + * Xen changes:
> + * Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com>
> + * Copyright (C) 2014 Cavium Inc.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program.  If not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#include <xen/bitops.h>
> +#include <xen/config.h>
> +#include <xen/lib.h>
> +#include <xen/init.h>
> +#include <xen/softirq.h>
> +#include <xen/irq.h>
> +#include <xen/list.h>
> +#include <xen/sched.h>
> +#include <xen/sizes.h>
> +#include <xen/xmalloc.h>
> +#include <asm/current.h>
> +#include <asm/device.h>
> +#include <asm/mmio.h>
> +#include <asm/io.h>
> +#include <asm/gic_v3_defs.h>
> +#include <asm/gic.h>
> +#include <asm/vgic.h>
> +#include <asm/gic-its.h>
> +
> +/* GITS register definitions */
> +#define VITS_GITS_TYPER_HCC       (0xffU << 24)
> +#define VITS_GITS_TYPER_PTA_SHIFT (19)
> +#define VITS_GITS_DEV_BITS        (0x14U << 13)
> +#define VITS_GITS_ID_BITS         (0x13U << 8)
> +#define VITS_GITS_ITT_SIZE        (0x7U << 4)
> +#define VITS_GITS_DISTRIBUTED     (0x1U << 3)
> +#define VITS_GITS_PLPIS           (0x1U << 0)
> +
> +/* GITS_PIDRn register values for ARM implementations */
> +#define GITS_PIDR0_VAL            (0x94)
> +#define GITS_PIDR1_VAL            (0xb4)
> +#define GITS_PIDR2_VAL            (0x3b)
> +#define GITS_PIDR3_VAL            (0x00)
> +#define GITS_PIDR4_VAL            (0x04)
> +
> +//#define DEBUG_ITS
> +
> +#ifdef DEBUG_ITS
> +# define DPRINTK(fmt, args...) printk(XENLOG_DEBUG fmt, ##args)
> +#else
> +# define DPRINTK(fmt, args...) do {} while ( 0 )
> +#endif
> +
> +#ifdef DEBUG_ITS
> +static void dump_cmd(struct its_cmd_block *cmd)
> +{
> +    printk("CMD[0] = 0x%lx CMD[1] = 0x%lx CMD[2] = 0x%lx CMD[3] = 0x%lx\n",
> +           cmd->raw_cmd[0], cmd->raw_cmd[1], cmd->raw_cmd[2], cmd->raw_cmd[3]);
> +}
> +#endif
> +
> +void vgic_its_disable_lpis(struct vcpu *v, uint32_t lpi)

This function is only used within this file. Therefore it should be static.

Also, why the 's' in lpis? AFAICT, you are only disabling one LPI at a
time.

> +{
> +    struct pending_irq *p;
> +    unsigned long flags;
> +
> +    p = irq_to_pending(v, lpi);
> +    clear_bit(GIC_IRQ_GUEST_ENABLED, &p->status);
> +    gic_remove_from_queues(v, lpi);
> +    if ( p->desc != NULL )
> +    {
> +        spin_lock_irqsave(&p->desc->lock, flags);
> +        p->desc->handler->disable(p->desc);
> +        spin_unlock_irqrestore(&p->desc->lock, flags);
> +    }

This code seems very similar to the content of the loop in 
vgic_disable_irqs. Please make a common helper.

> +}
> +
> +void vgic_its_enable_lpis(struct vcpu *v, uint32_t lpi)

static

Ditto for the 's'

> +{
> +    struct pending_irq *p;
> +    unsigned long flags;
> +
> +    p = irq_to_pending(v, lpi);
> +    set_bit(GIC_IRQ_GUEST_ENABLED, &p->status);
> +
> +    spin_lock_irqsave(&v->arch.vgic.lock, flags);
> +
> +    if ( !list_empty(&p->inflight) &&
> +         !test_bit(GIC_IRQ_GUEST_VISIBLE, &p->status) )
> +        gic_raise_guest_irq(v, p->desc->arch.virq, p->priority);
> +
> +    spin_unlock_irqrestore(&v->arch.vgic.lock, flags);
> +    if ( p->desc != NULL )
> +    {
> +        spin_lock_irqsave(&p->desc->lock, flags);
> +        p->desc->handler->enable(p->desc);
> +        spin_unlock_irqrestore(&p->desc->lock, flags);
> +    }

Similar to the content of loop in vgic_enable_irqs.

> +}
> +
> +static int vits_alloc_device_irq(struct its_device *dev, uint32_t id,

It took me a while to understand what "id" is. I would rename it to 
something more meaningful such as eventID or irqID.

> +                                uint32_t *plpi, uint32_t vlpi, uint32_t vcol_id)

The name is confusing. This function can also retrieve an LPI...

> +{
> +
> +    int idx, i = 0;
> +
> +    spin_lock(&dev->vlpi_lock);
> +    while ((i = find_next_bit(dev->vlpi_map, dev->nr_lpis, i)) < dev->nr_lpis )
> +    {
> +        if ( dev->vlpi_entries[i].vlpi == vlpi )
> +        {
> +             *plpi = dev->vlpi_entries[i].plpi;
> +             DPRINTK("Found plpi %d for device 0x%x with vlpi %d id %d\n",
> +                      *plpi, dev->dev_id, vlpi, dev->vlpi_entries[i].id);
> +             spin_unlock(&dev->vlpi_lock);
> +             return 0;
> +        }
> +        i++;
> +    }
> +
> +    if ( its_alloc_device_irq(dev, plpi) )
> +        BUG_ON(1);

Why this BUG_ON()? It looks to me like we should return an error if 
we can't allocate the LPI for the device. Otherwise a malicious guest 
can trigger the BUG_ON and crash the whole platform.
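
For illustration, a minimal sketch of the error-returning shape, using a
toy allocator rather than the real its_alloc_device_irq(). All toy_* names
and NR_LPIS are hypothetical and only model the control flow:

```c
#include <errno.h>
#include <stdint.h>

#define NR_LPIS 4  /* hypothetical per-device LPI budget */

struct toy_device {
    uint32_t used;      /* bitmap of allocated slots */
    uint32_t lpi_base;  /* first physical LPI owned by the device */
};

/* Returns 0 and fills *plpi on success, -ENOSPC when no LPI is left. */
static int toy_alloc_lpi(struct toy_device *dev, uint32_t *plpi)
{
    for ( unsigned int i = 0; i < NR_LPIS; i++ )
    {
        if ( !(dev->used & (1u << i)) )
        {
            dev->used |= 1u << i;
            *plpi = dev->lpi_base + i;
            return 0;
        }
    }

    return -ENOSPC;
}

/* The caller propagates the failure instead of calling BUG_ON(). */
static int toy_map_vlpi(struct toy_device *dev, uint32_t *plpi)
{
    int rc = toy_alloc_lpi(dev, plpi);

    if ( rc )
        return rc;  /* only the guest command fails, not the host */

    return 0;
}
```

With this shape, exhausting the LPIs of a device makes the guest's
command fail and leaves the hypervisor running.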

> +
> +    idx = find_first_zero_bit(dev->vlpi_map, dev->nr_lpis);
> +    dev->vlpi_entries[idx].plpi = *plpi;
> +    dev->vlpi_entries[idx].vlpi = vlpi;
> +    dev->vlpi_entries[idx].id  = id;
> +    set_bit(idx, dev->vlpi_map);
> +
> +    spin_unlock(&dev->vlpi_lock);
> +
> +    DPRINTK("Allocated plpi %d for device 0x%x with vlpi %d id %d @idx %d\n",
> +            *plpi, dev->dev_id, vlpi, id, idx);
> +
> +    return 0;
> +}
> +
> +/* Should be called with its lock held */

Please add an ASSERT in the function to verify the assumption.
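
A minimal sketch of the idea, assuming a toy lock that only records
whether it is held (struct toy_lock, toy_acquire, toy_release and
toy_unmap_id are hypothetical). In Xen the check would be something like
ASSERT(spin_is_locked(...)) on the relevant lock:

```c
#include <assert.h>
#include <stdbool.h>

/* Toy lock: only tracks whether it is currently held. */
struct toy_lock { bool held; };

static void toy_acquire(struct toy_lock *l) { l->held = true; }
static void toy_release(struct toy_lock *l) { l->held = false; }

struct toy_its {
    struct toy_lock lock;
    unsigned int nr_mapped;
};

/* Must be called with its->lock held; the assertion checks the comment. */
static void toy_unmap_id(struct toy_its *its)
{
    assert(its->lock.held);

    if ( its->nr_mapped )
        its->nr_mapped--;
}
```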

> +static void vgic_its_unmap_id(struct vcpu *v, struct its_device *dev,
> +                              uint32_t id, int trash)
> +{
> +    int i = 0;
> +
> +    DPRINTK("vITS: unmap id for device 0x%x id %d trash %d\n",
> +             dev->dev_id, id, trash);
> +
> +    spin_lock(&dev->vlpi_lock);
> +    while ((i = find_next_bit(dev->vlpi_map, dev->nr_lpis, i)) < dev->nr_lpis )

Missing space after the first parenthesis.

while ( ... )

> +    {
> +        if ( dev->vlpi_entries[i].id == id )
> +        {
> +            DPRINTK("vITS: un mapped id for device 0x%x id %d lpi %d\n",
> +                     dev->dev_id, dev->vlpi_entries[i].id,
> +                     dev->vlpi_entries[i].plpi);
> +            vgic_its_disable_lpis(v, dev->vlpi_entries[i].plpi);
> +            release_irq(dev->vlpi_entries[i].plpi, v->domain);

That's definitely wrong. The vITS should not be able to unmap an LPI 
like that.

> +            dev->vlpi_entries[i].plpi = 0;
> +            dev->vlpi_entries[i].vlpi = 0;
> +            dev->vlpi_entries[i].id = 0;
> +            /* XXX: Clear LPI base here? */
> +            clear_bit(dev->vlpi_entries[i].plpi - dev->lpi_base, dev->lpi_map);
> +            clear_bit(i, dev->vlpi_map);
> +            goto out;
> +        }
> +        i++;
> +    }
> +
> +    spin_unlock(&dev->vlpi_lock);
> +    dprintk(XENLOG_ERR, "vITS: id %d not found for device 0x%x to unmap\n",
> +           id, dev->device_id);

XENLOG_ERR is not rate-limited, you have to use XENLOG_G_ERR.

Also please use printk rather than dprintk. The latter will be dropped on 
non-debug builds.

Lastly, I would print the domain, vCPU and the vITS ID. It would be 
easier for debugging.

All these comments are valid for every dprintk/DPRINTK message in this file.

> +
> +    return;
> +out:
> +    if ( bitmap_empty(dev->lpi_map, dev->nr_lpis) )
> +    {
> +        its_lpi_free(dev->lpi_map, dev->lpi_base, dev->nr_lpis);
> +        DPRINTK("vITS: Freeing lpi chunk\n");
> +    }
> +    /* XXX: Device entry is not removed on empty lpi list */
> +    spin_unlock(&dev->vlpi_lock);
> +}
> +
> +static int vgic_its_check_device_id(struct vcpu *v, struct its_device *dev,
> +                                    uint32_t id)
> +{
> +    int i = 0;
> +
> +    spin_lock(&dev->vlpi_lock);
> +    while ((i = find_next_bit(dev->vlpi_map, dev->nr_lpis, i)) < dev->nr_lpis )
> +    {
> +        if ( dev->vlpi_entries[i].id == id )
> +        {
> +            spin_unlock(&dev->vlpi_lock);
> +            return 0;
> +        }
> +        i++;
> +    }
> +    spin_unlock(&dev->vlpi_lock);
> +
> +    return 1;
> +}
> +
> +static struct its_device *vgic_its_check_device(struct vcpu *v, int dev_id)
> +{
> +    struct domain *d = v->domain;
> +    struct its_device *dev = NULL, *tmp;
> +
> +    spin_lock(&d->arch.vits_devs.lock);
> +    list_for_each_entry(tmp, &d->arch.vits_devs.dev_list, entry)
> +    {
> +        if ( tmp->device_id == dev_id )
> +        {
> +            DPRINTK("vITS: Found device 0x%x\n", device_id);
> +            dev = tmp;
> +            break;
> +        }
> +    }
> +    spin_unlock(&d->arch.vits_devs.lock);
> +
> +    return dev;
> +}
> +
> +static int vgic_its_check_cid(struct vcpu *v,
> +                              struct vgic_its *vits,
> +                              uint8_t vcid, uint32_t *pcid)
> +{
> +    uint32_t nmap = vits->cid_map.nr_cid;
> +    int i;

nmap is uint32_t so i should be too.

> +
> +    for ( i = 0; i < nmap; i++ )
> +    {
> +        if ( vcid == vits->cid_map.vcid[i] )
> +        {
> +            *pcid = vits->cid_map.pcid[i];
> +            DPRINTK("vITS: Found vcid %d for vcid %d\n", *pcid,
> +                     vits->cid_map.vcid[i]);
> +            return 0;
> +        }
> +    }
> +
> +    return 1;
> +}
> +
> +static uint64_t vgic_its_get_pta(struct vcpu *v, struct vgic_its *vits,
> +                                 uint64_t vta)
> +{
> +
> +    uint32_t nmap = vits->cid_map.nr_cid;
> +    int i;

ditto

> +    uint8_t pcid;
> +    uint64_t pta;
> +
> +    for ( i = 0; i < nmap; i++ )
> +    {
> +        if ( vta == vits->cid_map.vta[i] )
> +        {
> +            pcid = vits->cid_map.pcid[i];
> +            DPRINTK("vITS: Found vcid %d for vta 0x%lx\n", pcid,
> +                     vits->cid_map.vta[i]);
> +            if ( its_get_target(pcid, &pta) )
> +                BUG_ON(1);

No BUG_ON, please handle the error correctly.

> +            return pta;
> +        }
> +    }
> +
> +    BUG_ON(1);

Ditto
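
One possible shape, sketched with a toy collection map (struct toy_cid_map
and toy_get_pta are hypothetical, not the patch's structures): the lookup
returns an error code and hands the address back through a pointer, so
both failure paths can be reported to the caller. The loop index also
matches the uint32_t count:

```c
#include <errno.h>
#include <stdint.h>

#define NR_CID 4  /* hypothetical table size */

struct toy_cid_map {
    uint32_t nr_cid;
    uint64_t vta[NR_CID];  /* virtual target addresses */
    uint64_t pta[NR_CID];  /* physical target addresses */
};

/* Returns 0 and fills *pta on success, -ENOENT if vta is unknown. */
static int toy_get_pta(const struct toy_cid_map *map, uint64_t vta,
                       uint64_t *pta)
{
    for ( uint32_t i = 0; i < map->nr_cid; i++ )
    {
        if ( map->vta[i] == vta )
        {
            *pta = map->pta[i];
            return 0;
        }
    }

    return -ENOENT;
}
```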

> +    return 1;
> +}
> +
> +static int vgic_its_build_mapd_cmd(struct vcpu *v,

Why does this function take a vCPU as a parameter? Shouldn't it take a domain?

> +                                   struct its_cmd_block *virt_cmd,
> +                                   struct its_cmd_block *phys_cmd)
> +{
> +    unsigned long itt_addr;

its_decode_itt returns a uint64_t.

> +
> +    itt_addr = its_decode_itt(virt_cmd);
> +    /* Get ITT PA from ITT IPA */
> +    itt_addr = p2m_lookup(v->domain, itt_addr, NULL);

Multiple problems:

1) If the 'V' bit is not set, "ITT Address" will likely be 0.
But IPA 0 != PA 0, so you will end up using a wrong address. I would 
handle 'V' = 0 as a specific case.
2) p2m_lookup may return INVALID_PADDR if the page is not mapped.
3) You should validate that the guest is using a RAM page that belongs to 
it. Otherwise it may use an MMIO region or a page from another guest.
4) Depending on the size, the ITT may cross multiple pages, but the 
physical addresses may not be contiguous.
5) The guest or Xen may remove the pages which belong to the ITT from 
under our feet. That would result in the ITS using a wrong page. I think 
you have to take a reference on those pages.
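
To make points 1, 2 and 4 concrete, here is a rough sketch under stated
assumptions: toy_translate stands in for a lookup that returns a
page-aligned PA only for guest RAM pages, and every toy_*/TOY_* name is
hypothetical. Point 5 (taking a reference on the pages) is not modelled:

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT   12
#define TOY_PAGE_SIZE    (1ULL << TOY_PAGE_SHIFT)
#define TOY_INVALID_ADDR (~0ULL)

/* Example lookup: guest pages 0 and 1 are contiguous RAM, page 2 is
 * RAM located elsewhere, everything else is unmapped. */
static uint64_t toy_translate(uint64_t ipa_page)
{
    switch ( ipa_page >> TOY_PAGE_SHIFT )
    {
    case 0: return 0x80000000ULL;
    case 1: return 0x80001000ULL;
    case 2: return 0x90000000ULL;
    default: return TOY_INVALID_ADDR;
    }
}

static int toy_check_itt(bool valid, uint64_t itt_ipa, uint64_t itt_bytes,
                         uint64_t *itt_pa)
{
    uint64_t first, last, page, prev_pa = 0;

    if ( !valid )  /* V == 0: do not translate the address at all */
    {
        *itt_pa = 0;
        return 0;
    }

    if ( itt_bytes == 0 || itt_ipa + itt_bytes < itt_ipa )
        return -EINVAL;  /* empty table or address wrap-around */

    first = itt_ipa & ~(TOY_PAGE_SIZE - 1);
    last = (itt_ipa + itt_bytes - 1) & ~(TOY_PAGE_SIZE - 1);

    for ( page = first; ; page += TOY_PAGE_SIZE )
    {
        uint64_t pa = toy_translate(page);

        if ( pa == TOY_INVALID_ADDR )
            return -EFAULT;  /* page not mapped as guest RAM */
        if ( page != first && pa != prev_pa + TOY_PAGE_SIZE )
            return -EINVAL;  /* ITT is not physically contiguous */
        if ( page == first )
            *itt_pa = pa + (itt_ipa & (TOY_PAGE_SIZE - 1));

        prev_pa = pa;
        if ( page == last )
            break;
    }

    return 0;
}
```

Rejecting a non-contiguous ITT is only one option; the point is that the
decision is made before the address reaches the physical ITS.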

> +    its_encode_cmd(phys_cmd, GITS_CMD_MAPD);
> +    its_encode_devid(phys_cmd, its_decode_devid(virt_cmd));

If "Device ID" exceeds the maximum value supported by the ITS, a command 
error will be issued.

Also, are we sure to always have vdevid == pdevid? If so, this should be 
written somewhere.

> +    its_encode_size(phys_cmd, its_decode_size(virt_cmd));

If "Size" exceeds the value permitted by GITS_TYPER.IDbits
the physical ITS will issue a command error.

If the platform supports System Error Interrupts (GITS_TYPER.SEIS set to 
1), this will generate a system error and hang the hypervisor.

As the generation of the system error is platform specific, you have to 
validate all the data given by the guest in order to prevent such a thing 
from happening.

My remark here is valid everywhere a command error can be issued in this 
patch.
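
As a sketch of the kind of check meant here, assuming the GITS_TYPER
layout as I read it in the GICv3 specification (IDbits in bits [12:8],
Devbits in bits [17:13], both encoded as "number of bits minus one").
The TOY_* macros and toy_check_mapd are hypothetical names:

```c
#include <errno.h>
#include <stdint.h>

/* Field extraction, assuming the GITS_TYPER layout described above. */
#define TOY_TYPER_IDBITS(t)  (((t) >> 8) & 0x1f)
#define TOY_TYPER_DEVBITS(t) (((t) >> 13) & 0x1f)

/* Returns 0 if a guest MAPD is safe to forward, -EINVAL otherwise. */
static int toy_check_mapd(uint64_t typer, uint32_t devid, uint8_t size)
{
    unsigned int devbits = TOY_TYPER_DEVBITS(typer) + 1;

    /* With 32 DeviceID bits every uint32_t value is in range. */
    if ( devbits < 32 && devid >= (1u << devbits) )
        return -EINVAL;

    /* MAPD "Size" is also encoded as number of EventID bits minus one. */
    if ( size > TOY_TYPER_IDBITS(typer) )
        return -EINVAL;

    return 0;
}
```

Failing the virtual command here means the physical ITS never sees a
value that could raise a command error.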

> +    its_encode_itt(phys_cmd, itt_addr);
> +    its_encode_valid(phys_cmd, its_decode_valid(virt_cmd));
> +
> +    DPRINTK("vITS: Build MAPD with itt_addr 0x%lx devId %d\n",itt_addr,

Space after the comma, and devid is a uint32_t so please use %u.

> +            its_decode_devid(virt_cmd));
> +
> +    return 0;
> +}

It's late here, so I will finish reviewing this patch next week. For now 
I am sending my first comments.

Regards,


-- 
Julien Grall

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [RFC PATCH v2 06/22] xen/arm: its: Port ITS driver to xen
  2015-03-20 15:06   ` Julien Grall
@ 2015-03-23 12:24     ` Vijay Kilari
  2015-03-23 13:27       ` Julien Grall
  0 siblings, 1 reply; 109+ messages in thread
From: Vijay Kilari @ 2015-03-23 12:24 UTC (permalink / raw)
  To: Julien Grall
  Cc: Ian Campbell, Stefano Stabellini, Prasun Kapoor, Vijaya Kumar K,
	Tim Deegan, xen-devel, Stefano Stabellini, manish.jaggi

Hi Julien,

On Fri, Mar 20, 2015 at 8:36 PM, Julien Grall <julien.grall@linaro.org> wrote:
> Hello Vijay,
>
> On 19/03/2015 14:37, vijay.kilari@gmail.com wrote:
>>
>>   static LIST_HEAD(its_nodes);
>>   static DEFINE_SPINLOCK(its_lock);
>> -static struct device_node *gic_root_node;
>> -static struct rdists *gic_rdists;
>> +static struct dt_device_node *gic_root_node;
>> +static struct rdist_prop  *gic_rdists;
>>
>> -#define gic_data_rdist()               (raw_cpu_ptr(gic_rdists->rdist))
>> -#define gic_data_rdist_rd_base()       (gic_data_rdist()->rd_base)
>> +#define gic_data_rdist()               (per_cpu(rdist,
>> smp_processor_id()))
>
>
> Again why didn't you return a pointer here? It would have been avoid some
> confusing changes (s/->/./) in the code.
>
> #define gic_data_rdist(&per_cpu(rdist, smp_processor_id))
>
>> +#define gic_data_rdist_rd_base()       (per_cpu(rdist,
>> smp_processor_id()).rbase)
>
>
> That would avoid this change too.

OK. Let me try

>
>>
>>   /*
>>    * ITS command descriptors - parameters to be encoded in a command
>> @@ -228,10 +243,10 @@ static struct its_collection
>> *its_build_mapd_cmd(struct its_cmd_block *cmd,
>>                                                  struct its_cmd_desc
>> *desc)
>>   {
>>         unsigned long itt_addr;
>> -       u8 size = ilog2(desc->its_mapd_cmd.dev->nr_ites);
>> +       u8 size = max(fls(desc->its_mapd_cmd.dev->nr_ites) - 1, 1);
>
>
> ilog2 on an uint32_t is defined as fls(val) - 1. Where does the max come
> from?
>
> IHMO, I would define ilog2 in Xen, that would be easier.

Anyway, this code is not used later, so I am not too concerned about it.

>
>>
>> -       itt_addr = virt_to_phys(desc->its_mapd_cmd.dev->itt);
>> -       itt_addr = ALIGN(itt_addr, ITS_ITT_ALIGN);
>> +       itt_addr = __pa(desc->its_mapd_cmd.dev->itt);
>> +        itt_addr = ROUNDUP(itt_addr, ITS_ITT_ALIGN);
>
>
> This file use the Linux coding style. Please use hard tab.
>
>>
>>         its_encode_cmd(cmd, GITS_CMD_MAPD);
>>         its_encode_devid(cmd, desc->its_mapd_cmd.dev->device_id);
>> @@ -348,7 +363,7 @@ static struct its_cmd_block *its_allocate_entry(struct
>> its_node *its)
>>         while (its_queue_full(its)) {
>>                 count--;
>>                 if (!count) {
>> -                       pr_err_ratelimited("ITS queue not draining\n");
>> +                       its_err("ITS queue not draining\n");
>
>
> its_err and pr_err_ratelimited are not the same things. The former is not
> ratelimited.
>
> AFAICT this function will be accessible in someway from the guest. It would
> be possible to DOS Xen when sending a command.

Any equivalent ratelimited function in Xen?

>
>>                         return NULL;
>>                 }
>>                 cpu_relax();
>> @@ -380,7 +395,7 @@ static void its_flush_cmd(struct its_node *its, struct
>> its_cmd_block *cmd)
>>          * the ITS.
>>          */
>>         if (its->flags & ITS_FLAGS_CMDQ_NEEDS_FLUSHING)
>> -               __flush_dcache_area(cmd, sizeof(*cmd));
>> +               clean_and_invalidate_dcache_va_range(cmd, sizeof(*cmd));
>>         else
>>                 dsb(ishst);
>>   }
>> @@ -402,7 +417,7 @@ static void its_wait_for_range_completion(struct
>> its_node *its,
>>
>>                 count--;
>>                 if (!count) {
>> -                       pr_err_ratelimited("ITS queue timeout\n");
>> +                       its_err("ITS queue timeout\n");
>
>
> Ditto
>
> [..]
>
>> -static void its_send_inv(struct its_device *dev, u32 event_id)
>> +/* TODO: Remove static for the sake of compilation */
>> +void its_send_inv(struct its_device *dev, u32 event_id)
>
>
> Rather than changing the prototype. Would it be possible to #if 0 the
> function? It would be easier to keep track change.

Does not matter much. Anyway I can try as you wish

>
>> -static int its_alloc_device_irq(struct its_device *dev, irq_hw_number_t
>> *hwirq)
>> +/* TODO: Remove static for the sake of compilation */
>> +int its_alloc_device_irq(struct its_device *dev, int *hwirq)
>>   {
>>         int idx;
>>
>> @@ -1139,6 +1169,8 @@ static int its_alloc_device_irq(struct its_device
>> *dev, irq_hw_number_t *hwirq)
>>         return 0;
>>   }
>>
>> +/* pci and msi handling no more required here */
>
>
> Hmmm why?

This code is not required. We don't have msi_domain_ops.

>
>
> Already said on V1: of_device_id and dt_device_match are compatible. If you
> change the name it will work too...
>
>> +       while ((np = dt_find_matching_node(np, its_device_ids)))
>> +       {
>> +               if (!dt_find_property(np, "msi-controller", NULL))
>> +               continue;
>
>
> In your cover letter, you said you support multiple ITS node but this piece
> of code show that it's not the case...

If I remember correctly, this is updated in a later patch.
>
>> +       }
>>
>> -       for (np = of_find_matching_node(node, its_device_id); np;
>> -            np = of_find_matching_node(np, its_device_id)) {
>> -               its_probe(np, parent_domain);
>
>
> The for loop was perfect, why did you drop it?
>
>> +       if (np) {
>> +               its_probe(np);
>>         }
>>
>>         if (list_empty(&its_nodes)) {
>> -               pr_warn("ITS: No ITS available, not enabling LPIs\n");
>> +               its_warn("ITS: No ITS available, not enabling LPIs\n");
>>                 return -ENXIO;
>>         }
>>
>> diff --git a/xen/include/asm-arm/gic_v3_defs.h
>> b/xen/include/asm-arm/gic_v3_defs.h
>> index 4e64b56..f8bac52 100644
>> --- a/xen/include/asm-arm/gic_v3_defs.h
>> +++ b/xen/include/asm-arm/gic_v3_defs.h
>> @@ -59,11 +59,12 @@
>>   #define GICR_WAKER_ProcessorSleep    (1U << 1)
>>   #define GICR_WAKER_ChildrenAsleep    (1U << 2)
>>
>> -#define GICD_PIDR2_ARCH_REV_MASK     (0xf0)
>> +#define GIC_PIDR2_ARCH_REV_MASK      (0xf0)
>> +#define GICD_PIDR2_ARCH_REV_MASK     GIC_PIDR2_ARCH_REV_MASK
>
>
> Why do you define GIC_PIDR2_ARCH_REV_MASK? It's not consistent with the
> other part of the code.

Linux code uses GIC_PIDR2_ARCH_REV_MASK

>
>>   #define GICD_PIDR2_ARCH_REV_SHIFT    (0x4)
>>   #define GICD_PIDR2_ARCH_GICV3        (0x3)
>>
>> -#define GICR_PIDR2_ARCH_REV_MASK     GICD_PIDR2_ARCH_REV_MASK
>> +#define GICR_PIDR2_ARCH_REV_MASK     GIC_PIDR2_ARCH_REV_MASK
>
>
> Why this change? GICD_PIDR2_ARCH_REV_MASK still exists...

gic-v3.c still uses this.

Regards
Vijay

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [RFC PATCH v2 00/22] xen/arm: Add ITS support
  2015-03-20 16:23 ` Julien Grall
@ 2015-03-23 12:37   ` Vijay Kilari
  2015-03-23 13:11     ` Julien Grall
  0 siblings, 1 reply; 109+ messages in thread
From: Vijay Kilari @ 2015-03-23 12:37 UTC (permalink / raw)
  To: Julien Grall
  Cc: Ian Campbell, Stefano Stabellini, Prasun Kapoor, Vijaya Kumar K,
	Tim Deegan, xen-devel, Stefano Stabellini, manish.jaggi

On Fri, Mar 20, 2015 at 9:53 PM, Julien Grall <julien.grall@linaro.org> wrote:
> Hi Vijay,
>
> On 19/03/2015 14:37, vijay.kilari@gmail.com wrote:
>>
>> From: Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com>
>>
>> Add ITS support for arm. Following major features
>> are supported
>>   - GICv3 ITS support for arm64 platform
>>   - Supports multi ITS node
>>   - LPI descriptors are allocated on-demand
>>   - Only ITS Dom0 is supported
>>
>> Tested with single ITS node.
>
>
> Some though about the whole design:
>
> Your vGIC ITS driver does too much things. In general a virtual driver
> should only emulate the hardware for the domain and forward the request to
> the physical driver.
>
> Your series adds device management (create/free) in the vITS, which is
> wrong.

The device is added to ITS using MAPD command. All ITS commands are based
on this device added using MAPD command. So vITS driver needs to manage
this.

>
> How do you check if the domain can use the device?
> Currently, you allow any domain to use any device. That would bring a big
> mess with guest using passthrough.

The ITS driver does not know which PCI device is assigned to which domain.
I think that should be handled by the layers above, along with the PCI
drivers in Xen. The vITS assumes that the domain which sends the MAPD
command owns the device.

>
> Also, does the guess will always pass the correct devid? If not how do you
> plan to handle it?

The only place where the PCI devid is validated is when the vITS driver
calls the PCI helper function to get the ITS DT node for the BDF.
Alternatively, we can add a check in the MAPD command by calling a PCI
helper function to know whether the BDF is valid or not.

Regards
Vijay

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [RFC PATCH v2 00/22] xen/arm: Add ITS support
  2015-03-23 12:37   ` Vijay Kilari
@ 2015-03-23 13:11     ` Julien Grall
  2015-03-23 15:18       ` Vijay Kilari
  0 siblings, 1 reply; 109+ messages in thread
From: Julien Grall @ 2015-03-23 13:11 UTC (permalink / raw)
  To: Vijay Kilari
  Cc: Ian Campbell, Stefano Stabellini, Prasun Kapoor, Vijaya Kumar K,
	Tim Deegan, xen-devel, Stefano Stabellini, manish.jaggi

On 23/03/15 12:37, Vijay Kilari wrote:
> On Fri, Mar 20, 2015 at 9:53 PM, Julien Grall <julien.grall@linaro.org> wrote:
>> Hi Vijay,
>>
>> On 19/03/2015 14:37, vijay.kilari@gmail.com wrote:
>>>
>>> From: Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com>
>>>
>>> Add ITS support for arm. Following major features
>>> are supported
>>>   - GICv3 ITS support for arm64 platform
>>>   - Supports multi ITS node
>>>   - LPI descriptors are allocated on-demand
>>>   - Only ITS Dom0 is supported
>>>
>>> Tested with single ITS node.
>>
>>
>> Some though about the whole design:
>>
>> Your vGIC ITS driver does too much things. In general a virtual driver
>> should only emulate the hardware for the domain and forward the request to
>> the physical driver.
>>
>> Your series adds device management (create/free) in the vITS, which is
>> wrong.
> 
> The device is added to ITS using MAPD command. All ITS commands are based
> on this device added using MAPD command. So vITS driver needs to manage
> this.

The ITS still has to manage the device in some way. There is a lot of
information that doesn't need to be created at every MAPD (such as the
number of MSIs).

Handling device management in the ITS would help to check the validity
of the access, which you are currently ignoring...

>>
>> How do you check if the domain can use the device?
>> Currently, you allow any domain to use any device. That would bring a big
>> mess with guest using passthrough.
> 
> ITS driver does not know which PCI device is assigned for which domain.

Wrong, Xen knows which device is assigned to which domain, so the ITS does too.

> I think it should be done by above layers along with pci drivers in Xen.
> vITS assume that the domain that sends MAPD command owns the device

The vITS emulates hardware for a specific domain. A malicious guest
could send requests for a device it does not own.

You have to think about security in the vITS, otherwise we will end up
with many XSAs in this code...
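
To illustrate the kind of check meant here, a minimal sketch with a toy
assignment table. struct toy_assignment and toy_check_owner are
hypothetical; the real source of truth would be whatever Xen already
records when a device is assigned to a domain:

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* One entry per assigned device: which domain owns which DeviceID. */
struct toy_assignment {
    uint32_t devid;
    uint16_t domid;
};

/*
 * Returns 0 if devid is assigned to domid, -EPERM if it belongs to
 * another domain, -ENODEV if the device is unknown.
 */
static int toy_check_owner(const struct toy_assignment *tbl, size_t n,
                           uint16_t domid, uint32_t devid)
{
    for ( size_t i = 0; i < n; i++ )
    {
        if ( tbl[i].devid == devid )
            return tbl[i].domid == domid ? 0 : -EPERM;
    }

    return -ENODEV;
}
```

A guest MAPD for a DeviceID that fails this check would be rejected
before anything is sent to the physical ITS.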

Regards,

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [RFC PATCH v2 06/22] xen/arm: its: Port ITS driver to xen
  2015-03-23 12:24     ` Vijay Kilari
@ 2015-03-23 13:27       ` Julien Grall
  0 siblings, 0 replies; 109+ messages in thread
From: Julien Grall @ 2015-03-23 13:27 UTC (permalink / raw)
  To: Vijay Kilari
  Cc: Ian Campbell, Stefano Stabellini, Prasun Kapoor, Vijaya Kumar K,
	Tim Deegan, xen-devel, Stefano Stabellini, manish.jaggi

On 23/03/15 12:24, Vijay Kilari wrote:
>>>   /*
>>>    * ITS command descriptors - parameters to be encoded in a command
>>> @@ -228,10 +243,10 @@ static struct its_collection
>>> *its_build_mapd_cmd(struct its_cmd_block *cmd,
>>>                                                  struct its_cmd_desc
>>> *desc)
>>>   {
>>>         unsigned long itt_addr;
>>> -       u8 size = ilog2(desc->its_mapd_cmd.dev->nr_ites);
>>> +       u8 size = max(fls(desc->its_mapd_cmd.dev->nr_ites) - 1, 1);
>>
>>
>> ilog2 on an uint32_t is defined as fls(val) - 1. Where does the max come
>> from?
>>
>> IHMO, I would define ilog2 in Xen, that would be easier.
> 
> Anyway this code is not used later. So I don't bother much about this

Again, what's the purpose of fixing a compilation bug in code which will
be removed a patch later?

If you want to fix build errors, you have to fix them correctly, not
add wrong code in order to make the compiler happy...
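
For reference, a minimal sketch of ilog2 on top of fls for a 32-bit
value. toy_fls and toy_ilog2 are hypothetical names, and the body relies
on the GCC/Clang builtin __builtin_clz rather than on Xen's own fls():

```c
#include <stdint.h>

/* fls(): 1-based index of the most significant set bit, 0 if x == 0. */
static inline int toy_fls(uint32_t x)
{
    return x ? 32 - __builtin_clz(x) : 0;
}

/* ilog2() of a non-zero 32-bit value, i.e. fls(x) - 1. */
static inline int toy_ilog2(uint32_t x)
{
    return toy_fls(x) - 1;
}
```

Note that max(fls(x) - 1, 1) gives a different result from this for
x == 1 (1 instead of 0), which is why the max() needs a justification.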

>>
>>>
>>> -       itt_addr = virt_to_phys(desc->its_mapd_cmd.dev->itt);
>>> -       itt_addr = ALIGN(itt_addr, ITS_ITT_ALIGN);
>>> +       itt_addr = __pa(desc->its_mapd_cmd.dev->itt);
>>> +        itt_addr = ROUNDUP(itt_addr, ITS_ITT_ALIGN);
>>
>>
>> This file use the Linux coding style. Please use hard tab.
>>
>>>
>>>         its_encode_cmd(cmd, GITS_CMD_MAPD);
>>>         its_encode_devid(cmd, desc->its_mapd_cmd.dev->device_id);
>>> @@ -348,7 +363,7 @@ static struct its_cmd_block *its_allocate_entry(struct
>>> its_node *its)
>>>         while (its_queue_full(its)) {
>>>                 count--;
>>>                 if (!count) {
>>> -                       pr_err_ratelimited("ITS queue not draining\n");
>>> +                       its_err("ITS queue not draining\n");
>>
>>
>> its_err and pr_err_ratelimited are not the same things. The former is not
>> ratelimited.
>>
>> AFAICT this function will be accessible in someway from the guest. It would
>> be possible to DOS Xen when sending a command.
> 
> Any equivalent ratelimited function in Xen?

All GUEST_* and INFO/DEBUG messages are rate-limited. It might be worth
introducing a rate-limited variant for ERROR.

>>
>>>                         return NULL;
>>>                 }
>>>                 cpu_relax();
>>> @@ -380,7 +395,7 @@ static void its_flush_cmd(struct its_node *its, struct
>>> its_cmd_block *cmd)
>>>          * the ITS.
>>>          */
>>>         if (its->flags & ITS_FLAGS_CMDQ_NEEDS_FLUSHING)
>>> -               __flush_dcache_area(cmd, sizeof(*cmd));
>>> +               clean_and_invalidate_dcache_va_range(cmd, sizeof(*cmd));
>>>         else
>>>                 dsb(ishst);
>>>   }
>>> @@ -402,7 +417,7 @@ static void its_wait_for_range_completion(struct
>>> its_node *its,
>>>
>>>                 count--;
>>>                 if (!count) {
>>> -                       pr_err_ratelimited("ITS queue timeout\n");
>>> +                       its_err("ITS queue timeout\n");
>>
>>
>> Ditto
>>
>> [..]
>>
>>> -static void its_send_inv(struct its_device *dev, u32 event_id)
>>> +/* TODO: Remove static for the sake of compilation */
>>> +void its_send_inv(struct its_device *dev, u32 event_id)
>>
>>
>> Rather than changing the prototype. Would it be possible to #if 0 the
>> function? It would be easier to keep track change.
> 
> Does not matter much. Anyway I can try as you wish

Depends on whether you care about the reviewers' time or not...

> 
>>
>>> -static int its_alloc_device_irq(struct its_device *dev, irq_hw_number_t
>>> *hwirq)
>>> +/* TODO: Remove static for the sake of compilation */
>>> +int its_alloc_device_irq(struct its_device *dev, int *hwirq)
>>>   {
>>>         int idx;
>>>
>>> @@ -1139,6 +1169,8 @@ static int its_alloc_device_irq(struct its_device
>>> *dev, irq_hw_number_t *hwirq)
>>>         return 0;
>>>   }
>>>
>>> +/* pci and msi handling no more required here */
>>
>>
>> Hmmm why?
> 
> This code is not required. we don't have msi_domain_ops

I have the feeling that counting the number of MSIs for a device will be
useful later.

> 
>>
>>
>> Already said on V1: of_device_id and dt_device_match are compatible. If you
>> change the name it will work too...
>>
>>> +       while ((np = dt_find_matching_node(np, its_device_ids)))
>>> +       {
>>> +               if (!dt_find_property(np, "msi-controller", NULL))
>>> +               continue;
>>
>>
>> In your cover letter, you said you support multiple ITS node but this piece
>> of code show that it's not the case...
> 
>    If I remember  correctly, this is later updated

Unfortunately not... anyway the for loop is valid. So please drop your
while here.
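
A minimal sketch of the loop shape, with a toy node list standing in for
the device tree (struct toy_node, toy_find_matching and toy_probe_all are
hypothetical; toy_find_matching plays the role of
dt_find_matching_node()). Every matching node is probed inside the loop:

```c
#include <stdbool.h>
#include <stddef.h>

struct toy_node {
    bool is_its;          /* matches the ITS compatible string */
    bool msi_controller;  /* has the "msi-controller" property */
    struct toy_node *next;
};

/* Next ITS node after 'from', or the first one when 'from' is NULL. */
static struct toy_node *toy_find_matching(struct toy_node *head,
                                          struct toy_node *from)
{
    struct toy_node *n = from ? from->next : head;

    while ( n && !n->is_its )
        n = n->next;

    return n;
}

/* Probes every usable ITS node and returns how many were probed. */
static unsigned int toy_probe_all(struct toy_node *head)
{
    unsigned int probed = 0;
    struct toy_node *np;

    for ( np = toy_find_matching(head, NULL); np;
          np = toy_find_matching(head, np) )
    {
        if ( !np->msi_controller )
            continue;

        probed++;  /* its_probe(np) would be called here */
    }

    return probed;
}
```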


>>
>>> +       }
>>>
>>> -       for (np = of_find_matching_node(node, its_device_id); np;
>>> -            np = of_find_matching_node(np, its_device_id)) {
>>> -               its_probe(np, parent_domain);
>>
>>
>> The for loop was perfect, why did you drop it?
>>
>>> +       if (np) {
>>> +               its_probe(np);
>>>         }
>>>
>>>         if (list_empty(&its_nodes)) {
>>> -               pr_warn("ITS: No ITS available, not enabling LPIs\n");
>>> +               its_warn("ITS: No ITS available, not enabling LPIs\n");
>>>                 return -ENXIO;
>>>         }
>>>
>>> diff --git a/xen/include/asm-arm/gic_v3_defs.h
>>> b/xen/include/asm-arm/gic_v3_defs.h
>>> index 4e64b56..f8bac52 100644
>>> --- a/xen/include/asm-arm/gic_v3_defs.h
>>> +++ b/xen/include/asm-arm/gic_v3_defs.h
>>> @@ -59,11 +59,12 @@
>>>   #define GICR_WAKER_ProcessorSleep    (1U << 1)
>>>   #define GICR_WAKER_ChildrenAsleep    (1U << 2)
>>>
>>> -#define GICD_PIDR2_ARCH_REV_MASK     (0xf0)
>>> +#define GIC_PIDR2_ARCH_REV_MASK      (0xf0)
>>> +#define GICD_PIDR2_ARCH_REV_MASK     GIC_PIDR2_ARCH_REV_MASK
>>
>>
>> Why do you define GIC_PIDR2_ARCH_REV_MASK? It's not consistent with the
>> other part of the code.
> 
> Linux code uses GIC_PIDR2_ARCH_REV_MASK

You modify the Linux code so heavily (pr_* -> its_*) that modifying
a single additional line (yes, only one) wouldn't hurt...

>>
>>>   #define GICD_PIDR2_ARCH_REV_SHIFT    (0x4)
>>>   #define GICD_PIDR2_ARCH_GICV3        (0x3)
>>>
>>> -#define GICR_PIDR2_ARCH_REV_MASK     GICD_PIDR2_ARCH_REV_MASK
>>> +#define GICR_PIDR2_ARCH_REV_MASK     GIC_PIDR2_ARCH_REV_MASK
>>
>>
>> Why this change? GICD_PIDR2_ARCH_REV_MASK still exists...
> 
> gic-v3.c still uses this.

You define GICD_PIDR2_ARCH_REV_MASK as GIC_PIDR2_ARCH_REV_MASK.

So the pre-processor will replace it with the correct value.
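
A small sketch of the expansion, using the macro names from the patch.
toy_arch_rev is a hypothetical helper added only to exercise them. The
GICR macro can keep pointing at the GICD one and still expands to 0xf0:

```c
#define GIC_PIDR2_ARCH_REV_MASK   (0xf0)
#define GICD_PIDR2_ARCH_REV_MASK  GIC_PIDR2_ARCH_REV_MASK
/* Unchanged definition: still expands to (0xf0) through the chain. */
#define GICR_PIDR2_ARCH_REV_MASK  GICD_PIDR2_ARCH_REV_MASK

/* Extracts the architecture revision field from a PIDR2 value. */
static unsigned int toy_arch_rev(unsigned int pidr2)
{
    return (pidr2 & GICR_PIDR2_ARCH_REV_MASK) >> 4;
}
```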

Regards,

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [RFC PATCH v2 00/22] xen/arm: Add ITS support
  2015-03-23 13:11     ` Julien Grall
@ 2015-03-23 15:18       ` Vijay Kilari
  2015-03-23 15:30         ` Julien Grall
  0 siblings, 1 reply; 109+ messages in thread
From: Vijay Kilari @ 2015-03-23 15:18 UTC (permalink / raw)
  To: Julien Grall
  Cc: Ian Campbell, Stefano Stabellini, Prasun Kapoor, Vijaya Kumar K,
	Tim Deegan, xen-devel, Stefano Stabellini, manish.jaggi

On Mon, Mar 23, 2015 at 6:41 PM, Julien Grall <julien.grall@linaro.org> wrote:
> On 23/03/15 12:37, Vijay Kilari wrote:
>> On Fri, Mar 20, 2015 at 9:53 PM, Julien Grall <julien.grall@linaro.org> wrote:
>>> Hi Vijay,
>>>
>>> On 19/03/2015 14:37, vijay.kilari@gmail.com wrote:
>>>>
>>>> From: Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com>
>>>>
>>>> Add ITS support for arm. Following major features
>>>> are supported
>>>>   - GICv3 ITS support for arm64 platform
>>>>   - Supports multi ITS node
>>>>   - LPI descriptors are allocated on-demand
>>>>   - Only ITS Dom0 is supported
>>>>
>>>> Tested with single ITS node.
>>>
>>>
>>> Some though about the whole design:
>>>
>>> Your vGIC ITS driver does too much things. In general a virtual driver
>>> should only emulate the hardware for the domain and forward the request to
>>> the physical driver.
>>>
>>> Your series adds device management (create/free) in the vITS, which is
>>> wrong.
>>
>> The device is added to ITS using MAPD command. All ITS commands are based
>> on this device added using MAPD command. So vITS driver needs to manage
>> this.
>
> The ITS still have to manage in someway the device. There is lots of
> information that doesn't need to be created at every mapd (such as the
> number of MSI).

The first assumption is that the vITS driver owns the conversion of
virtual ITS commands to physical ITS commands. Based on this:

- arch_domain contains the list of all the devices attached to the domain.
- On a MAPD command, the device is created, physical and virtual LPIs are
  allocated, and the device is added to the domain's list. All other ITS
  commands except MAPC, INVALL and SYNC depend on this device information
  to convert virtual ITS commands to physical ITS commands.

>
> Handling device management in ITS would help to check the validity of
> the access. Which you are currently ignoring...
>
>>>
>>> How do you check if the domain can use the device?
>>> Currently, you allow any domain to use any device. That would bring a big
>>> mess with guest using passthrough.
>>
>> ITS driver does not know which PCI device is assigned for which domain.
>
> Wrong, Xen knows which device is assigned to which domain so ITS does.
>
>> I think it should be done by above layers along with pci drivers in Xen.
>> vITS assume that the domain that sends MAPD command owns the device
>
> The vITS emulates hardware for a specific domain. A malicious guest
> could send request to a not own device.

OK. On the MAPD command, when the ITS device is created, I can introduce
a PCI helper function to know whether a particular device is assigned to
the domain or not.

>
> You have to think about security in the vITS otherwise we will end up
> with many XSA in this code...
>

For every virtual ITS command, the parameters are validated before
issuing the physical command, except for the check on the device ID,
which I will take care of in the next version.

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [RFC PATCH v2 00/22] xen/arm: Add ITS support
  2015-03-23 15:18       ` Vijay Kilari
@ 2015-03-23 15:30         ` Julien Grall
  2015-03-23 16:09           ` Vijay Kilari
  0 siblings, 1 reply; 109+ messages in thread
From: Julien Grall @ 2015-03-23 15:30 UTC (permalink / raw)
  To: Vijay Kilari
  Cc: Ian Campbell, Stefano Stabellini, Prasun Kapoor, Vijaya Kumar K,
	Tim Deegan, xen-devel, Stefano Stabellini, manish.jaggi

On 23/03/15 15:18, Vijay Kilari wrote:
>> The ITS still have to manage in someway the device. There is lots of
>> information that doesn't need to be created at every mapd (such as the
>> number of MSI).
> 
> First assumption is VITS driver owns converting Virtual ITS commands
>  to Physical ITS commands. So based on this
> 
> - arch_domain contains list of all the devices attached for the domain.
> -  On MAPD command, device is created, physical LPI's and virtual LPIs
> are allocated
>    and added to domains list and further all other ITS commands except
> MAPC, INVALL and SYNC
>    depend on device information to convert virtual ITS commands to
> physical ITS commands.

I didn't understand what you said.

>>
>> Handling device management in ITS would help to check the validity of
>> the access. Which you are currently ignoring...
>>
>>>>
>>>> How do you check if the domain can use the device?
>>>> Currently, you allow any domain to use any device. That would bring a big
>>>> mess with guest using passthrough.
>>>
>>> ITS driver does not know which PCI device is assigned for which domain.
>>
>> Wrong, Xen knows which device is assigned to which domain so ITS does.
>>
>>> I think it should be done by above layers along with pci drivers in Xen.
>>> vITS assume that the domain that sends MAPD command owns the device
>>
>> The vITS emulates hardware for a specific domain. A malicious guest
>> could send request to a not own device.
> 
> OK.   On MAPD command when ITS device is created, I can introduce pci helper
> function to know if particular device is assigned to domain or not.
> 
>>
>> You have to think about security in the vITS otherwise we will end up
>> with many XSA in this code...
>>
> 
>  For every virtual ITS command parameters are validated
> before issuing physical command, except check on device id which I will
> take care in next version

The check on the device ID is not the only check missing... You also have to
validate ID, Size... against the number of bits supported by the ITS.
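
A minimal sketch of such a range check, assuming the number of supported
DeviceID and EventID bits has already been read from the ITS. The
vits_limits structure and both helpers are hypothetical names, not part of
the patch:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical limits; in practice these would be derived from the ITS. */
struct vits_limits {
    unsigned int devid_bits;   /* number of DeviceID bits supported */
    unsigned int eventid_bits; /* number of EventID bits supported */
};

/* Return true if 'val' fits in 'bits' bits. */
static bool vits_fits(uint64_t val, unsigned int bits)
{
    if ( bits >= 64 )
        return true;
    return (val >> bits) == 0;
}

/* Validate the identifiers of a guest command before translating it. */
static bool vits_ids_valid(const struct vits_limits *lim,
                           uint32_t dev_id, uint32_t event_id)
{
    return vits_fits(dev_id, lim->devid_bits) &&
           vits_fits(event_id, lim->eventid_bits);
}
```

A command failing this check would be rejected before any physical
command is built.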

Regards,

-- 
Julien Grall


* Re: [RFC PATCH v2 13/22] xen/arm: its: Add virtual ITS command support
  2015-03-19 14:38 ` [RFC PATCH v2 13/22] xen/arm: its: Add virtual ITS command support vijay.kilari
  2015-03-21  0:28   ` Julien Grall
@ 2015-03-23 15:52   ` Julien Grall
  2015-03-24 11:48   ` Julien Grall
  2 siblings, 0 replies; 109+ messages in thread
From: Julien Grall @ 2015-03-23 15:52 UTC (permalink / raw)
  To: vijay.kilari, Ian.Campbell, stefano.stabellini,
	stefano.stabellini, tim, xen-devel
  Cc: Prasun.Kapoor, vijaya.kumar, manish.jaggi

Hello,

Second part of the review.

On 19/03/15 14:38, vijay.kilari@gmail.com wrote:
> +static int vgic_its_build_sync_cmd(struct vcpu *v,
> +                                   struct vgic_its *vits,
> +                                   struct its_cmd_block *virt_cmd,
> +                                   struct its_cmd_block *phys_cmd)
> +{
> +    uint64_t pta;
> +
> +    its_encode_cmd(phys_cmd, GITS_CMD_SYNC);
> +    pta = vgic_its_get_pta(v, vits, its_decode_target(virt_cmd));

The SYNC command is sent to all the ITSes, but the target address may not
be the same for each of them, depending on GITS_TYPER.PTA.

> +
> +    return 0;
> +}
> +
> +static int vgic_its_build_mapvi_cmd(struct vcpu *v,
> +                                    struct vgic_its *vits,
> +                                    struct its_cmd_block *virt_cmd,
> +                                    struct its_cmd_block *phys_cmd)

I'm not sure I understand why we have to implement MAPVI. The spec says
"It is expected that in GICv3 systems this command will only be used by
hypervisor software".

So why is Linux using it?

Also, you handle both MAPI and MAPVI here, so I would rename the function.

> +{
> +    struct domain *d = v->domain;
> +    struct its_device *dev;
> +    uint32_t pcol_id;
> +    uint32_t pid;
> +    struct irq_desc *desc;
> +    uint32_t dev_id = its_decode_devid(virt_cmd);
> +    uint32_t id = its_decode_event_id(virt_cmd);
> +    uint8_t vcol_id = its_decode_collection(virt_cmd);
> +    uint32_t vid = its_decode_phys_id(virt_cmd);
> +    uint8_t cmd = its_decode_cmd(virt_cmd);

It looks to me like its_cmd_block would benefit from being a union of
commands. It would avoid defining so many variables for each command.
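
A rough sketch of what such a union could look like. The field positions
below are an assumption made for illustration and must be checked against
the GICv3 spec; its_cmd_block_u and the per-command structures are
hypothetical names. Bit-field ordering is implementation-defined, so this
relies on the compiler's layout:

```c
#include <stdint.h>

/* Assumed MAPVI layout: four 64-bit words, 32 bytes in total. */
struct its_cmd_mapvi {
    uint64_t cmd:8;
    uint64_t res0:24;
    uint64_t devid:32;
    uint64_t event_id:32;
    uint64_t phys_id:32;
    uint64_t col_id:16;
    uint64_t res1:48;
    uint64_t res2;
};

/* Assumed INVALL layout: only the command and the collection are used. */
struct its_cmd_invall {
    uint64_t cmd:8;
    uint64_t res0:56;
    uint64_t res1;
    uint64_t col_id:16;
    uint64_t res2:48;
    uint64_t res3;
};

/* One view per command, plus the raw words the ITS actually parses. */
union its_cmd_block_u {
    uint64_t raw_cmd[4];
    struct its_cmd_mapvi mapvi;
    struct its_cmd_invall invall;
};

/* Every view must keep the 32-byte command size. */
_Static_assert(sizeof(union its_cmd_block_u) == 32,
               "ITS command must be 32 bytes");
```

With this, a handler reads cmd->mapvi.devid directly instead of decoding
each field into a local variable.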

> +
> +    DPRINTK("vITS: MAPVI: dev_id 0x%x vcol_id %d vid %d \n",
> +             dev_id, vcol_id, vid);
> +
> +    /* Search if device entry exists */
> +    dev = vgic_its_check_device(v, dev_id);
> +    if ( dev == NULL )
> +    {
> +        dprintk(XENLOG_ERR, "vITS: MAPVI: Fail to find device 0x%x\n", dev_id);
> +        return 1;
> +    }
> +
> +    /* Check if Collection id exists */
> +    if ( vgic_its_check_cid(v, vits, vcol_id, &pcol_id) )
> +    {
> +        dprintk(XENLOG_ERR, "vITS: MAPVI: with wrong Collection %d\n", vcol_id);
> +        return 1;
> +    }
> +    if ( vits_alloc_device_irq(dev, id, &pid, vid, vcol_id) )
> +    {
> +        dprintk(XENLOG_ERR, "vITS: MAPVI: Failed to alloc irq\n");
> +        return 1;
> +    }
> +
> +    /* Allocate irq desc for this pirq */
> +    desc = irq_to_desc(pid);
> +
> +    route_irq_to_guest(d, pid, "LPI");

route_irq_to_guest may fail.

> +
> +     /* Assign device structure to desc data */
> +    desc->arch.dev = dev;
> +    desc->arch.virq = vid;

In the case of the MAPI command, the vid (i.e. the Physical ID field) is
marked as reserved.

From the spec, the LPI number will be the "ID" field.

> +
> +    its_encode_cmd(phys_cmd, GITS_CMD_MAPVI);
> +    its_encode_devid(phys_cmd, dev_id);
> +
> +    if ( cmd == GITS_CMD_MAPI )
> +        its_encode_event_id(phys_cmd, vid);

As said above, vid (i.e. the Physical ID field) is marked as reserved, so
you can't use it as an ID.
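
A small sketch of the distinction, assuming the command numbers match the
Linux ITS driver definitions. vits_cmd_vlpi is a hypothetical helper, not
something from the patch:

```c
#include <stdint.h>

/* Command numbers assumed to match the Linux ITS driver. */
#define GITS_CMD_MAPVI 0x0a
#define GITS_CMD_MAPI  0x0b

/*
 * Pick the virtual LPI for a mapping command: MAPVI carries it in the
 * Physical ID field, while for MAPI that field is reserved and the
 * event ID doubles as the LPI number.
 */
static uint32_t vits_cmd_vlpi(uint8_t cmd, uint32_t event_id,
                              uint32_t phys_id)
{
    return ( cmd == GITS_CMD_MAPI ) ? event_id : phys_id;
}
```

The event ID written to the physical command would stay the guest's event
ID in both cases.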

> +    else
> +        its_encode_event_id(phys_cmd, its_decode_event_id(virt_cmd));
> +
> +    its_encode_phys_id(phys_cmd, pid);
> +    its_encode_collection(phys_cmd, pcol_id);
> +
> +    return 0;
> +}
> +
> +static int vgic_its_build_movi_cmd(struct vcpu *v,
> +                                   struct vgic_its *vits,
> +                                   struct its_cmd_block *virt_cmd,
> +                                   struct its_cmd_block *phys_cmd)
> +{
> +    uint32_t pcol_id;
> +    struct its_device *dev;
> +    uint32_t dev_id = its_decode_devid(virt_cmd);
> +    uint8_t vcol_id = its_decode_collection(virt_cmd);
> +    uint32_t id = its_decode_event_id(virt_cmd);
> +
> +    DPRINTK("vITS: MOVI: dev_id 0x%x vcol_id %d\n", dev_id, vcol_id);
> +    /* Search if device entry exists */
> +    dev = vgic_its_check_device(v, dev_id);
> +    if ( dev == NULL )
> +    {
> +        dprintk(XENLOG_ERR, "vITS: MOVI: Failed to find device 0x%x\n", dev_id);
> +        return 1;
> +    }
> +
> +    /* Check if Collection id exists */
> +    if ( vgic_its_check_cid(v, vits, vcol_id, &pcol_id) )
> +    {
> +        dprintk(XENLOG_ERR, "vITS: MOVI: with wrong Collection %d\n", vcol_id);
> +        return 1;
> +    }
> +
> +    if ( vgic_its_check_device_id(v, dev, id) )
> +    {
> +        dprintk(XENLOG_ERR, "vITS: MOVI: Invalid ID %d\n", id);
> +        return 1;
> +    }
> +
> +    its_encode_cmd(phys_cmd, GITS_CMD_MOVI);
> +    its_encode_devid(phys_cmd, dev_id);
> +    its_encode_event_id(phys_cmd, id);
> +    its_encode_collection(phys_cmd, pcol_id);
> +
> +    return 0;
> +}
> +   
> +static int vgic_its_build_discard_cmd(struct vcpu *v,
> +                                      struct vgic_its *vits,
> +                                      struct its_cmd_block *virt_cmd,
> +                                      struct its_cmd_block *phys_cmd)
> +{
> +    struct its_device *dev;
> +    uint32_t id = its_decode_event_id(virt_cmd);
> +    uint32_t dev_id = its_decode_devid(virt_cmd);
> +
> +    DPRINTK("vITS: DISCARD: dev_id 0x%x id %d\n", dev_id, id);
> +    /* Search if device entry exists */
> +    dev = vgic_its_check_device(v, dev_id);
> +    if ( dev == NULL )
> +    {
> +        dprintk(XENLOG_ERR, "vITS: DISCARD: Failed to find device 0x%x\n",
> +                dev_id);
> +        return 1;
> +    }
> +
> +    if ( vgic_its_check_device_id(v, dev, id) )
> +    {
> +        dprintk(XENLOG_ERR, "vITS: DISCARD: Invalid vID %d\n", id);
> +        return 1;
> +    }
> +
> +    /* Check if PID is exists for this VID for this device and unmap it */
> +    vgic_its_unmap_id(v, dev, id, 0);
> +
> +    /* Fetch and encode cmd */
> +    its_encode_cmd(phys_cmd, GITS_CMD_DISCARD);
> +    its_encode_devid(phys_cmd, its_decode_devid(virt_cmd));

You can reuse dev_id here.

> +    its_encode_event_id(phys_cmd, its_decode_event_id(virt_cmd));

and reuse id here.

> +
> +    return 0;
> +}
> +
> +static int vgic_its_build_inv_cmd(struct vcpu *v,
> +                                  struct vgic_its *vits,
> +                                  struct its_cmd_block *virt_cmd,
> +                                  struct its_cmd_block *phys_cmd)
> +{
> +    struct its_device *dev;
> +    uint32_t dev_id = its_decode_devid(virt_cmd);
> +    uint32_t id = its_decode_event_id(virt_cmd);
> +
> +    DPRINTK("vITS: INV: dev_id 0x%x id %d\n",dev_id, id);
> +    /* Search if device entry exists */
> +    dev = vgic_its_check_device(v, dev_id);
> +    if ( dev == NULL )
> +    {
> +        dprintk(XENLOG_ERR, "vITS: INV: Failed to find device 0x%x\n", dev_id);
> +        return 1;
> +    }
> +
> +    if ( vgic_its_check_device_id(v, dev, id) )
> +    {
> +        dprintk(XENLOG_ERR, "vITS: INV: Invalid ID %d\n", id);
> +        return 1;
> +    }
> +
> +    its_encode_cmd(phys_cmd, GITS_CMD_INV);
> +    its_encode_devid(phys_cmd, dev_id);
> +    its_encode_event_id(phys_cmd, id);
> +
> +    return 0;
> +}
> +
> +static int vgic_its_build_clear_cmd(struct vcpu *v,
> +                                    struct vgic_its *vits,
> +                                    struct its_cmd_block *virt_cmd,
> +                                    struct its_cmd_block *phys_cmd)
> +{

I suspect that this function should also clear the pending status of the
internal representation of the LPI.

> +    struct its_device *dev;
> +    uint32_t dev_id = its_decode_devid(virt_cmd);
> +    uint32_t id = its_decode_event_id(virt_cmd);
> +
> +    DPRINTK("vITS: CLEAR: dev_id 0x%x id %d\n", dev_id, id);
> +    /* Search if device entry exists */
> +    dev = vgic_its_check_device(v, dev_id);
> +    if ( dev == NULL )
> +    {
> +        dprintk(XENLOG_ERR, "vITS: CLEAR: Fail to find device 0x%x\n", dev_id);
> +        return 1;
> +    }
> +
> +    if ( vgic_its_check_device_id(v, dev, id) )
> +    {
> +        dprintk(XENLOG_ERR, "vITS: CLEAR: Invalid ID %d\n", id);
> +        return 1;
> +    }
> +
> +    its_encode_cmd(phys_cmd, GITS_CMD_INV);
> +    its_encode_event_id(phys_cmd, id);
> +
> +    return 0;
> +}
> +
> +static int vgic_its_build_invall_cmd(struct vcpu *v,
> +                                     struct vgic_its *vits,
> +                                     struct its_cmd_block *virt_cmd,
> +                                     struct its_cmd_block *phys_cmd)
> +{

The collection may be shared between multiple domains. How will this
command impact another domain?

> +    uint32_t pcol_id;
> +    uint8_t vcol_id = its_decode_collection(virt_cmd);
> +
> +    DPRINTK("vITS: INVALL: vCID %d\n", vcol_id);
> +    /* Check if Collection id exists */
> +    if ( vgic_its_check_cid(v, vits, vcol_id, &pcol_id) )
> +    {
> +        dprintk(XENLOG_ERR, "vITS: INVALL: Wrong Collection %d\n", vcol_id);
> +        return 1;
> +    }
> +
> +    its_encode_cmd(phys_cmd, GITS_CMD_INVALL);
> +    its_encode_collection(phys_cmd, pcol_id);
> +
> +    return 0;
> +}
> +
> +static int vgic_its_build_int_cmd(struct vcpu *v,
> +                                  struct vgic_its *vits,
> +                                  struct its_cmd_block *virt_cmd,
> +                                  struct its_cmd_block *phys_cmd)
> +{
> +    uint32_t dev_id = its_decode_devid(virt_cmd);
> +    struct its_device *dev;
> +    uint32_t id = its_decode_event_id(virt_cmd);
> +
> +    DPRINTK("vITS: INT: Device 0x%x id %d\n", its_decode_devid(virt_cmd), id);
> +    /* Search if device entry exists */
> +    dev = vgic_its_check_device(v, dev_id);
> +    if ( dev == NULL )
> +    {
> +        dprintk(XENLOG_ERR, "vITS: INT: Failed to find device 0x%x\n", dev_id);
> +        return 1;
> +    }
> +
> +    if ( vgic_its_check_device_id(v, dev, id) )
> +    {
> +        dprintk(XENLOG_ERR, "vITS: INT: Invalid ID %d\n", id);
> +        return 1;
> +    }
> +
> +    its_encode_cmd(phys_cmd, GITS_CMD_INT);
> +    its_encode_devid(phys_cmd, its_decode_devid(virt_cmd));
> +    its_encode_event_id(phys_cmd, its_decode_event_id(virt_cmd));
> +
> +    return 0;
> +}
> +
> +static void vgic_its_free_device(struct its_device *dev)
> +{
> +        xfree(dev);
> +}
> +
> +static int vgic_its_add_device(struct vcpu *v, struct vgic_its *vits,
> +                               struct its_cmd_block *virt_cmd)

The function name is wrong: you not only add a device but also
remove it.

Given the size of the function, it would be good to split it into two parts.

[..]

> +
> +static int vgic_its_process_mapc(struct vcpu *v, struct vgic_its *vits,
> +                                 struct its_cmd_block *virt_cmd)
> +{
> +    uint32_t pcid = 0;
> +    int idx;

This should be uint32_t, like nmap.

> +    uint32_t nmap;
> +    uint8_t vcol_id;
> +    uint64_t vta = 0;
> +
> +    nmap = vits->cid_map.nr_cid;
> +    vcol_id = its_decode_collection(virt_cmd);
> +    vta = its_decode_target(virt_cmd);
> +
> +    for ( idx = 0; idx < nmap; idx++ )
> +    {
> +        if ( vcol_id == vits->cid_map.vcid[idx] )
> +            break;
> +    }
> +    if ( idx == nmap )
> +        vits->cid_map.vcid[idx] = vcol_id;

Your array has a specific size, so you need to check that idx is
smaller than this size.
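
A minimal sketch of a bounded lookup, using a cut-down copy of the
cid_mapping structure. VITS_MAX_CID and cid_map_find_or_add are
hypothetical names. Unlike the patch, nr_cid only grows when a new entry
is really added:

```c
#include <stdint.h>

#define VITS_MAX_CID 32 /* matches the array size used in the patch */

struct cid_mapping {
    uint8_t nr_cid;
    uint8_t vcid[VITS_MAX_CID];
};

/*
 * Return the slot for vcol_id, appending it if it is new.
 * Return -1 when the table is full instead of writing out of bounds.
 */
static int cid_map_find_or_add(struct cid_mapping *map, uint8_t vcol_id)
{
    uint32_t idx;

    for ( idx = 0; idx < map->nr_cid; idx++ )
        if ( map->vcid[idx] == vcol_id )
            return idx;

    if ( map->nr_cid >= VITS_MAX_CID )
        return -1;

    map->vcid[map->nr_cid] = vcol_id;
    return map->nr_cid++;
}
```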

> +
> +    if ( its_get_physical_cid(v->domain, &pcid, vta) )
> +        BUG_ON(1);

No BUG_ON().

> +    vits->cid_map.pcid[idx] = pcid;
> +    vits->cid_map.vta[idx] = vta;
> +    vits->cid_map.nr_cid++;
> +    DPRINTK("vITS: MAPC: vCID %d vTA 0x%lx added @idx 0x%x \n",
> +             vcol_id, vta, idx);

AFAIU, a collection = a pCPU. As a vCPU can move from one pCPU to
another, how do you plan to handle interrupt migration?

> +
> +    return 0;
> +}
> +
> +static void vgic_its_update_read_ptr(struct vcpu *v, struct vgic_its *vits)
> +{
> +    vits->cmd_read = vits->cmd_write;
> +}
> +
> +#ifdef DEBUG_ITS
> +char *cmd_str[] = {
> +        [GITS_CMD_MOVI]    = "MOVI",
> +        [GITS_CMD_INT]     = "INT",
> +        [GITS_CMD_CLEAR]   = "CLEAR",
> +        [GITS_CMD_SYNC]    = "SYNC",
> +        [GITS_CMD_MAPD]    = "MAPD",
> +        [GITS_CMD_MAPC]    = "MAPC",
> +        [GITS_CMD_MAPVI]   = "MAPVI",
> +        [GITS_CMD_MAPI]    = "MAPI",
> +        [GITS_CMD_INV]     = "INV",
> +        [GITS_CMD_INVALL]  = "INVALL",
> +        [GITS_CMD_MOVALL]  = "MOVALL",
> +        [GITS_CMD_DISCARD] = "DISCARD",
> +    };
> +#endif
> +
> +#define SEND_NONE 0x0
> +#define SEND_CMD 0x1
> +#define SEND_ALL 0x2
> +
> +static int vgic_its_parse_its_command(struct vcpu *v, struct vgic_its *vits,
> +                                      struct its_cmd_block *virt_cmd)
> +{
> +    uint8_t cmd = its_decode_cmd(virt_cmd);
> +    struct its_cmd_block phys_cmd;
> +    int ret;
> +    int send_flag = SEND_CMD;
> +
> +#ifdef DEBUG_ITS
> +    DPRINTK("vITS: Received cmd %s (0x%x)\n", cmd_str[cmd], cmd);
> +    DPRINTK("Dump Virt cmd: ");
> +    dump_cmd(virt_cmd);
> +#endif
> +
> +    memset(&phys_cmd, 0x0, sizeof(struct its_cmd_block));
> +    switch ( cmd )
> +    {
> +    case GITS_CMD_MAPD:
> +        /* create virtual device entry */
> +        if ( vgic_its_add_device(v, vits, virt_cmd) )
> +            return ENODEV;
> +        ret = vgic_its_build_mapd_cmd(v, virt_cmd, &phys_cmd);
> +        break;
> +    case GITS_CMD_MAPC:
> +        /* Physical ITS driver already mapped physical Collection */
> +        send_flag = SEND_NONE;
> +        ret =  vgic_its_process_mapc(v, vits, virt_cmd);
> +        break;
> +    case GITS_CMD_MAPI:
> +        /* MAPI is same as MAPVI */

This comment is not true. MAPI and MAPVI don't have the same layout...

> +    case GITS_CMD_MAPVI:
> +        ret = vgic_its_build_mapvi_cmd(v, vits, virt_cmd, &phys_cmd);
> +        break;
> +    case GITS_CMD_MOVI:
> +        ret = vgic_its_build_movi_cmd(v, vits, virt_cmd, &phys_cmd);
> +        break;
> +    case GITS_CMD_DISCARD:
> +        ret = vgic_its_build_discard_cmd(v, vits, virt_cmd, &phys_cmd);
> +        break;
> +    case GITS_CMD_INV:
> +        ret = vgic_its_build_inv_cmd(v, vits, virt_cmd, &phys_cmd);
> +        break;
> +    case GITS_CMD_INVALL:
> +        /* XXX: SYNC is sent on all physical ITS */

I guess you meant INVALL.

XXX means "FIXME", so what do you need to fix?

> +        send_flag = SEND_ALL;
> +        ret = vgic_its_build_invall_cmd(v, vits, virt_cmd, &phys_cmd);
> +        break;
> +    case GITS_CMD_INT:
> +        ret = vgic_its_build_int_cmd(v, vits, virt_cmd, &phys_cmd);
> +        break;
> +    case GITS_CMD_CLEAR:
> +        ret = vgic_its_build_clear_cmd(v, vits, virt_cmd, &phys_cmd);
> +        break;
> +    case GITS_CMD_SYNC:
> +        /* XXX: SYNC is sent on all physical ITS */

Ditto for the "XXX".

> +        send_flag = SEND_ALL;
> +        ret = vgic_its_build_sync_cmd(v, vits, virt_cmd, &phys_cmd);
> +        break;
> +        /*TODO:  GITS_CMD_MOVALL not implemented */
> +    default:
> +       dprintk(XENLOG_ERR, "vITS: Unhandled command cmd %d\n", cmd);
> +       return 1;
> +    }
> +
> +#ifdef DEBUG_ITS
> +    DPRINTK("Dump Phys cmd: ");
> +    dump_cmd(&phys_cmd);
> +#endif
> +
> +    if ( ret )
> +    {
> +       dprintk(XENLOG_ERR, "vITS: Failed to handle cmd %d\n", cmd);
> +       return 1;
> +    }
> +
> +    if ( send_flag )
> +    {
> +       /* XXX: Always send on physical ITS on which device is assingned */

Ditto for the XXX, and s/assingned/assigned/

> +       if ( !gic_its_send_cmd(v,
> +             its_get_phys_node(its_decode_devid(&phys_cmd)),
> +             &phys_cmd, (send_flag & SEND_ALL)) )
> +       {
> +           dprintk(XENLOG_ERR, "vITS: Failed to push cmd %d\n", cmd);
> +           return 1;
> +       }
> +    }
> +
> +    return 0;
> +}
> +
> +/* Called with its lock held */

Please add an ASSERT to validate this assumption.

> +static int vgic_its_read_virt_cmd(struct vcpu *v,
> +                                  struct vgic_its *vits,
> +                                  struct its_cmd_block *virt_cmd)
> +{
> +    struct page_info * page;

struct page_info *page;

> +    void *p;
> +    paddr_t paddr;
> +    paddr_t maddr = vits->cmd_base & 0xfffffffff000UL;

Defining a mask would make the code clearer.

Also, I don't see any check on GITS_CBASER.Valid in this code or in the
register emulation.

If GITS_CBASER.Valid = 0, commands should be ignored.
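
Something along these lines, as a sketch only. The encodings are
assumptions: Valid in bit 63, and the physical address in bits [47:12] to
match the 0xfffffffff000UL constant in the patch. The macro and helper
names are hypothetical:

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed encodings: Valid in bit 63, physical address in bits [47:12]. */
#define GITS_CBASER_VALID      (1ULL << 63)
#define GITS_CBASER_ADDR_MASK  0x0000fffffffff000ULL

/* Commands must be ignored while the guest has not set Valid. */
static bool vits_cmdq_enabled(uint64_t cbaser)
{
    return (cbaser & GITS_CBASER_VALID) != 0;
}

/* Extract the command queue base address from the register value. */
static uint64_t vits_cmdq_base(uint64_t cbaser)
{
    return cbaser & GITS_CBASER_ADDR_MASK;
}
```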

> +    uint64_t offset;
> +
> +    /* CMD Q can be more than 1 page. Map only page that is required */
> +    maddr = ((vits->cmd_base & 0xfffffffff000UL) +
> +              vits->cmd_write_save ) & PAGE_MASK;
> +
> +    paddr = p2m_lookup(v->domain, maddr, NULL);

p2m_lookup can fail and return INVALID_PADDR.

> +
> +    DPRINTK("vITS: Mapping CMD Q maddr 0x%lx paddr 0x%lx write_save 0x%lx \n",
> +            maddr, paddr, vits->cmd_write_save);
> +    page = get_page_from_paddr(v->domain, paddr, 0);
> +    if ( page == NULL )
> +    {
> +        dprintk(XENLOG_ERR, "vITS: Failed to get command page\n");
> +        return 1;
> +    }

Although you may want to use get_page_from_gfn rather than p2m_lookup +
get_page_from_paddr.

> +
> +    p = __map_domain_page(page);

__map_domain_page will map the page cacheable. What happens if the guest
decides to use a non-cacheable mapping? I think this will corrupt the
commands quickly.

I'm not against using __map_domain_page, but we should at least detect
and print an error message if the caching attributes are different.

> +
> +    /* Offset within the mapped 4K page to read */
> +    offset = vits->cmd_write_save & 0xfff;
> +
> +    memcpy(virt_cmd, p + offset, sizeof(struct its_cmd_block));
> +
> +    /* No command queue is created by vits to check on Q full */
> +    vits->cmd_write_save += 0x20;

0x20 is confusing. You should use sizeof(struct its_cmd_block) or add a
comment to explain the 0x20.

I would also add a BUILD_BUG_ON(sizeof(struct its_cmd_block) != 32);
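
For illustration, a sketch using C11 _Static_assert as a stand-in for
Xen's BUILD_BUG_ON, with a cut-down its_cmd_block. vits_cmdq_next is a
hypothetical helper:

```c
#include <stdint.h>

struct its_cmd_block {
    uint64_t raw_cmd[4];
};

/* Fail the build if the command size ever stops being 32 bytes. */
_Static_assert(sizeof(struct its_cmd_block) == 32,
               "unexpected ITS command size");

/* Advance a queue offset by one command, wrapping at qsize. */
static uint64_t vits_cmdq_next(uint64_t offset, uint64_t qsize)
{
    offset += sizeof(struct its_cmd_block);
    return ( offset >= qsize ) ? 0 : offset;
}
```

Comparing with >= rather than == also keeps the offset inside the queue
if it is ever not a multiple of the command size.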

> +    if ( vits->cmd_write_save == vits->cmd_qsize )
> +    {
> +         DPRINTK("vITS: Reset write_save 0x%lx qsize 0x%lx \n",
> +                 vits->cmd_write_save,
> +                 vits->cmd_qsize);
> +                 vits->cmd_write_save = 0x0;
> +    }
> +
> +    unmap_domain_page(p);
> +    put_page(page);
> +
> +    return 0;
> +}
> +
> +int vgic_its_process_cmd(struct vcpu *v, struct vgic_its *vits)

Missing a static

> +{
> +    struct its_cmd_block virt_cmd;
> +
> +    /* XXX: Currently we are processing one cmd at a time */

This comment seems wrong. You are handling multiple commands at the same
time.

> +    ASSERT(spin_is_locked(&vits->lock));
> +
> +    do {
> +        if ( vgic_its_read_virt_cmd(v, vits, &virt_cmd) )
> +            goto err;
> +        if ( vgic_its_parse_its_command(v, vits, &virt_cmd) )
> +            goto err;

From the spec: if a command error occurs, the command should be ignored.
If the command queue is not empty, we must continue processing commands.
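
A sketch of a loop with that behaviour. The handler type, vits_drain and
demo_handle are hypothetical. It assumes the write offset has already
been validated (a multiple of the command size and smaller than qsize),
otherwise the loop would not terminate:

```c
#include <stdint.h>

#define CMD_SIZE 32u

/* Hypothetical handler type: 0 on success, non-zero on a command error. */
typedef int (*vits_cmd_handler)(uint64_t offset, void *opaque);

/*
 * Drain the queue from *read up to write. A failing command is counted
 * and skipped so that the remaining commands are still processed.
 * Returns the number of commands that failed.
 */
static unsigned int vits_drain(uint64_t *read, uint64_t write,
                               uint64_t qsize, vits_cmd_handler handle,
                               void *opaque)
{
    unsigned int errors = 0;

    while ( *read != write )
    {
        if ( handle(*read, opaque) )
            errors++;
        *read = (*read + CMD_SIZE) % qsize;
    }

    return errors;
}

/* Example handler: rejects the command at offset 0x20, accepts the rest. */
static int demo_handle(uint64_t offset, void *opaque)
{
    unsigned int *seen = opaque;

    (*seen)++;
    return offset == 0x20;
}
```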

> +    } while ( vits->cmd_write != vits->cmd_write_save );
> +
> +    vits->cmd_write_save = vits->cmd_write;

Why this line? At the end of the loop cmd_write_save is equal to
cmd_write.

> +    DPRINTK("vITS: write_save 0x%lx write 0x%lx \n",
> +            vits->cmd_write_save,
> +            vits->cmd_write);
> +    /* XXX: Currently we are processing one cmd at a time */

Ditto for the comment.

> +    vgic_its_update_read_ptr(v, vits);
> +
> +    dsb(ishst);

Why the dsb?

> +
> +    return 1;
> +err:
> +    dprintk(XENLOG_ERR, "vITS: Failed to process guest cmd\n");
> +    return 0;
> +}
> +
> +/*
> + * Local variables:
> + * mode: C
> + * c-file-style: "BSD"
> + * c-basic-offset: 4
> + * indent-tabs-mode: nil
> + * End:
> + */
> diff --git a/xen/include/asm-arm/domain.h b/xen/include/asm-arm/domain.h
> index 9e0419e..bc7aee9 100644
> --- a/xen/include/asm-arm/domain.h
> +++ b/xen/include/asm-arm/domain.h
> @@ -114,6 +114,15 @@ struct arch_domain
>  #endif
>      } vgic;
>  
> +    struct vgic_its *vits;
> +    struct vgic_lpi_conf *lpi_conf;

This patch is handling commands, not lpi_conf.

It seems that you are mixing the skeleton of the vGIC with the
implementation of the command queue...

It would make more sense to have a separate patch introducing the vGIC
skeleton.

> +
> +    struct vits_devs {
> +        spinlock_t lock;
> +        /* ITS Device list */
> +        struct list_head dev_list;
> +    } vits_devs;
> +
>      struct vuart {
>  #define VUART_BUF_SIZE 128
>          char                        *buf;
> diff --git a/xen/include/asm-arm/gic-its.h b/xen/include/asm-arm/gic-its.h
> index fa1e305..70ec913 100644
> --- a/xen/include/asm-arm/gic-its.h
> +++ b/xen/include/asm-arm/gic-its.h
> @@ -22,6 +22,72 @@
>  #ifndef __ASM_ARM_GIC_ITS_H__
>  #define __ASM_ARM_GIC_ITS_H__
>  
> +#include <asm/gic_v3_defs.h>
> +
> +struct its_node;
> +
> +/* Collection ID mapping */
> +struct cid_mapping
> +{
> +    uint8_t nr_cid;
> +    /* XXX: assume one collection id per vcpu. can set to MAX_VCPUS? */

MAX_VCPUS can be very high, so this has to be allocated dynamically
per-domain.

> +    /* Virtual Collection id */
> +    uint8_t vcid[32];
> +    /* Physical Collection id */
> +    uint8_t pcid[32];
> +    /* Virtual target address of this collection id */
> +    uint64_t vta[32];

It would have been easier to understand the matching with a structure
containing vcid, pcid, vta.
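
For example, something like the sketch below, which also allocates the
table per domain. cid_entry and both helpers are hypothetical, and calloc
only stands in for the hypervisor allocator:

```c
#include <stdint.h>
#include <stdlib.h>

/* One virtual-to-physical collection mapping. */
struct cid_entry {
    uint8_t  vcid; /* virtual collection ID */
    uint8_t  pcid; /* physical collection ID */
    uint64_t vta;  /* virtual target address */
};

struct cid_mapping {
    unsigned int nr_cid;  /* entries in use */
    unsigned int max_cid; /* entries allocated */
    struct cid_entry *entries;
};

/* Allocate a zeroed table sized for the domain. Returns 0 on success. */
static int cid_mapping_init(struct cid_mapping *map, unsigned int max_cid)
{
    map->entries = calloc(max_cid, sizeof(*map->entries));
    if ( map->entries == NULL )
        return -1;
    map->nr_cid = 0;
    map->max_cid = max_cid;
    return 0;
}

/* Release the table and reset the bookkeeping. */
static void cid_mapping_free(struct cid_mapping *map)
{
    free(map->entries);
    map->entries = NULL;
    map->nr_cid = map->max_cid = 0;
}
```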

> +};
> +
> +/*
> + * Per domain virtual ITS structure.
> + * One per Physical ITS node available for the domain
> + */
> + 
> +struct vgic_its
> +{
> +   spinlock_t lock;
> +   /* Emulation of BASER */
> +   paddr_t baser[8];
> +   /* Command queue base */
> +   paddr_t cmd_base;
> +   /* Command queue write pointer */
> +   paddr_t cmd_write;
> +   /* Command queue write saved pointer */
> +   paddr_t cmd_write_save;
> +   /* Command queue read pointer */
> +   paddr_t cmd_read;
> +   /* Command queue size */
> +   unsigned long cmd_qsize;
> +   /* ITS mmio physical base */
> +   paddr_t phys_base;
> +   /* ITS mmio physical size */
> +   unsigned long phys_size;
> +   /* ITS physical node */
> +   struct its_node *its;
> +   /* GICR ctrl register */
> +   uint32_t ctrl;
> +   /* Virtual to Physical Collection id mapping */
> +   struct cid_mapping cid_map;
> +};

Most of this code doesn't belong to the command queue implementation...

> +
> +struct vgic_lpi_conf
> +{
> +   /* LPI propbase */
> +   paddr_t propbase;
> +   /* percpu pendbase */
> +   paddr_t pendbase[MAX_VIRT_CPUS];

Please allocate it dynamically.

> +   /* Virtual LPI property table */
> +   void * prop_page;
> +};
> +

Ditto for this structure.

> +struct vid_map
> +{

Please add comments to this structure.

> +    uint32_t vlpi;
> +    uint32_t plpi;
> +    uint32_t id;

"id" means lots of things: devID, eventID, pID... Please rename this
field to something more meaningful.

> +};
> +
>  /*
>   * The ITS command block, which is what the ITS actually parses.
>   */
> @@ -37,12 +103,21 @@ struct its_device {
>          struct list_head        entry;
>          struct its_node         *its;
>          struct its_collection   *collection;
> -        void                    *itt;

Why did you introduce itt if you dropped it here?

> +        /* Virtual ITS node */
> +        struct vgic_its         *vits;
> +        paddr_t                 itt_addr;
> +        unsigned long           itt_size;
>          unsigned long           *lpi_map;
>          u32                     lpi_base;
>          int                     nr_lpis;
>          u32                     nr_ites;
>          u32                     device_id;
> +        /* Spinlock for vlpi allocation */
> +        spinlock_t              vlpi_lock;
> +        /* vlpi bitmap */
> +        unsigned long           *vlpi_map;
> +        /* vlpi <=> plpi mapping */
> +        struct vid_map          *vlpi_entries;
>  };

Ditto about the patch splitting for this structure.

>  
>  static inline uint8_t its_decode_cmd(struct its_cmd_block *cmd)
> @@ -144,6 +219,15 @@ static inline void its_encode_collection(struct its_cmd_block *cmd, u16 col)
>      cmd->raw_cmd[2] |= col;
>  }
>  
> +int its_get_physical_cid(struct domain *d, uint32_t *col_id, uint64_t ta);
> +int its_get_target(uint8_t pcid, uint64_t *pta);
> +int its_alloc_device_irq(struct its_device *dev, uint32_t *plpi);
> +int gic_its_send_cmd(struct vcpu *v, struct its_node *its,
> +                     struct its_cmd_block *phys_cmd, int send_all);
> +void its_lpi_free(unsigned long *bitmap, int base, int nr_ids);
> +unsigned long *its_lpi_alloc_chunks(int nirqs, int *base, int *nr_ids);
> +uint32_t its_get_pta_type(void);
> +struct its_node * its_get_phys_node(uint32_t dev_id);
>  #endif /* __ASM_ARM_GIC_ITS_H__ */
>  
>  /*
> 

Regards,

-- 
Julien Grall


* Re: [RFC PATCH v2 00/22] xen/arm: Add ITS support
  2015-03-23 15:30         ` Julien Grall
@ 2015-03-23 16:09           ` Vijay Kilari
  2015-03-23 16:18             ` Julien Grall
  0 siblings, 1 reply; 109+ messages in thread
From: Vijay Kilari @ 2015-03-23 16:09 UTC (permalink / raw)
  To: Julien Grall
  Cc: Ian Campbell, Stefano Stabellini, Prasun Kapoor, Vijaya Kumar K,
	Tim Deegan, xen-devel, Stefano Stabellini, manish.jaggi

On Mon, Mar 23, 2015 at 9:00 PM, Julien Grall <julien.grall@linaro.org> wrote:
> On 23/03/15 15:18, Vijay Kilari wrote:
>>> The ITS still have to manage in someway the device. There is lots of
>>> information that doesn't need to be created at every mapd (such as the
>>> number of MSI).
>>
>> First assumption is VITS driver owns converting Virtual ITS commands
>>  to Physical ITS commands. So based on this
>>
>> - arch_domain contains list of all the devices attached for the domain.
>> -  On MAPD command, device is created, physical LPI's and virtual LPIs
>> are allocated
>>    and added to domains list and further all other ITS commands except
>> MAPC, INVALL and SYNC
>>    depend on device information to convert virtual ITS commands to
>> physical ITS commands.
>
> I didn't understand what you said.

I mean that all the virtual ITS commands have to be converted to
physical ITS commands. For most commands (except MAPC, SYNC & INVALL),
most of the information required for this conversion is keyed on the
device_id, so the vITS driver contains the device management.
However, all the devices are managed using a linked list attached to the
domain.

A few APIs are added to the physical ITS driver to map between Physical
Collection ID (pCID) and Virtual Collection ID (vCID), and from Virtual
Target Address (vTA) to Physical Target Address (pTA), because the pCID
and pTA info is available in the physical ITS driver.

Refer to : 4.9.22 Command Mapping for a Guest with a GICv3 ITS
of PRD03-GENC-010745 20.0

Overall the functionality is split as follows:

- The physical ITS driver manages physical LPI allocation, sending
  physical commands, and physical Collection IDs & target addresses.
- The virtual ITS driver manages virtual LPI allocation, device creation
  for the respective domain, conversion of virtual ITS commands to
  physical ITS commands, virtual Collection IDs, virtual target
  addresses, GITS register emulation, and LPI pending/property table
  emulation.

It would be better if you go through the patch set and comment under the
relevant code, and then we can discuss.

>
>>>
>>> Handling device management in ITS would help to check the validity of
>>> the access. Which you are currently ignoring...
>>>
>>>>>
>>>>> How do you check if the domain can use the device?
>>>>> Currently, you allow any domain to use any device. That would bring a big
>>>>> mess with guest using passthrough.
>>>>
>>>> ITS driver does not know which PCI device is assigned for which domain.
>>>
>>> Wrong, Xen knows which device is assigned to which domain so ITS does.
>>>
>>>> I think it should be done by above layers along with pci drivers in Xen.
>>>> vITS assume that the domain that sends MAPD command owns the device
>>>
>>> The vITS emulates hardware for a specific domain. A malicious guest
>>> could send request to a not own device.
>>
>> OK.   On MAPD command when ITS device is created, I can introduce pci helper
>> function to know if particular device is assigned to domain or not.
>>
>>>
>>> You have to think about security in the vITS otherwise we will end up
>>> with many XSA in this code...
>>>
>>
>>  For every virtual ITS command parameters are validated
>> before issuing physical command, except check on device id which I will
>> take care in next version
>
> The check on device id is not the only check missing... You also have to
> validate ID, Size... with the number of bits supported by the ITS.
>
> Regards,
>
> --
> Julien Grall


* Re: [RFC PATCH v2 00/22] xen/arm: Add ITS support
  2015-03-23 16:09           ` Vijay Kilari
@ 2015-03-23 16:18             ` Julien Grall
  0 siblings, 0 replies; 109+ messages in thread
From: Julien Grall @ 2015-03-23 16:18 UTC (permalink / raw)
  To: Vijay Kilari
  Cc: Ian Campbell, Stefano Stabellini, Prasun Kapoor, Vijaya Kumar K,
	Tim Deegan, xen-devel, Stefano Stabellini, manish.jaggi

On 23/03/15 16:09, Vijay Kilari wrote:
> On Mon, Mar 23, 2015 at 9:00 PM, Julien Grall <julien.grall@linaro.org> wrote:
>> On 23/03/15 15:18, Vijay Kilari wrote:
>>>> The ITS still have to manage in someway the device. There is lots of
>>>> information that doesn't need to be created at every mapd (such as the
>>>> number of MSI).
>>>
>>> First assumption is VITS driver owns converting Virtual ITS commands
>>>  to Physical ITS commands. So based on this
>>>
>>> - arch_domain contains list of all the devices attached for the domain.
>>> -  On MAPD command, device is created, physical LPI's and virtual LPIs
>>> are allocated
>>>    and added to domains list and further all other ITS commands except
>>> MAPC, INVALL and SYNC
>>>    depend on device information to convert virtual ITS commands to
>>> physical ITS commands.
>>
>> I didn't understand what you said.
> 
> I mean, all the virtual ITS commands has to be converted to
> physical ITS commands. The (most)information required to do this conversion
> is based on device_id for most of the commands (except MAPC, SYNC & INVALL)
> So vITS driver contains device management.
> However, all the devices managed using linked list attached to a domain.
> 
> Few API's are added in physical ITS driver to query
> Physical Collection ID (pCID) to Virtual Collection ID (vCID) and
> Virtual Target address(vTA)
>  to Physical Target Address(pTA) because pCID & pTA info is available
> in physical ITS driver.
> 
> Refer to : 4.9.22 Command Mapping for a Guest with a GICv3 ITS
> of PRD03-GENC-010745 20.0

A reference to this section in the commit message would have been very
useful...

> Overall the functionality was split as follows;
> 
> - Physical ITS driver manages Physical LPI allocation, Sending
> Physical commands,
>   Physical Collection ID & Target addresses
> - Virtual ITS driver manages Virtual LPI allocation, Device creation
> for respective domain,
>   Conversion of virtual ITS commands to Physical ITS commands, Virtual
> Collection ID's
>   Virtual Target address, GITS register emulation, LPI
> pending/Property table emulation.
> 
> It would be better if you go through the patch set and comment under
> relevant code
> and we can discuss.

It's nearly impossible to review this series patch by patch. I
have to look at everything together.

That's why I brought the discussion here.

Regards,

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [RFC PATCH v2 13/22] xen/arm: its: Add virtual ITS command support
  2015-03-19 14:38 ` [RFC PATCH v2 13/22] xen/arm: its: Add virtual ITS command support vijay.kilari
  2015-03-21  0:28   ` Julien Grall
  2015-03-23 15:52   ` Julien Grall
@ 2015-03-24 11:48   ` Julien Grall
  2015-03-30 15:02     ` Vijay Kilari
  2 siblings, 1 reply; 109+ messages in thread
From: Julien Grall @ 2015-03-24 11:48 UTC (permalink / raw)
  To: vijay.kilari, Ian.Campbell, stefano.stabellini,
	stefano.stabellini, tim, xen-devel
  Cc: Prasun.Kapoor, vijaya.kumar, manish.jaggi

Hello Vijay,

More questions/remarks about command processing.

On 19/03/2015 14:38, vijay.kilari@gmail.com wrote:
> +int vgic_its_process_cmd(struct vcpu *v, struct vgic_its *vits)
> +{
> +    struct its_cmd_block virt_cmd;
> +
> +    /* XXX: Currently we are processing one cmd at a time */
> +    ASSERT(spin_is_locked(&vits->lock));
> +
> +    do {
> +        if ( vgic_its_read_virt_cmd(v, vits, &virt_cmd) )
> +            goto err;
> +        if ( vgic_its_parse_its_command(v, vits, &virt_cmd) )
> +            goto err;
> +    } while ( vits->cmd_write != vits->cmd_write_save );
> +
> +    vits->cmd_write_save = vits->cmd_write;
> +    DPRINTK("vITS: write_save 0x%lx write 0x%lx \n",
> +            vits->cmd_write_save,
> +            vits->cmd_write);
> +    /* XXX: Currently we are processing one cmd at a time */
> +    vgic_its_update_read_ptr(v, vits);

According to the spec, GITS_CREADR should be updated after each command
is processed. That would make cmd_write_save pointless.
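
As a rough, untested sketch of what I mean (standalone C; the struct,
the helper names and the 32-byte command size constant are assumptions
made for illustration, they are not taken from your patch), the read
offset would advance after every command:

```c
#include <assert.h>
#include <stdint.h>

#define ITS_CMD_SIZE 32u /* assumed: one ITS command is 32 bytes */

/* Hypothetical, simplified view of the virtual command queue state. */
struct vits_queue {
    uint64_t cmd_qsize;  /* queue size in bytes, multiple of ITS_CMD_SIZE */
    uint64_t cmd_read;   /* offset of the next command to process */
    uint64_t cmd_write;  /* offset last written by the guest */
};

/* Stand-in for the real command emulation; always succeeds here. */
static int handle_one_cmd(struct vits_queue *q, uint64_t offset)
{
    (void)q;
    (void)offset;
    return 0;
}

/*
 * Process pending commands one by one. The read offset is advanced
 * (and wrapped) after each command, so it always reflects progress.
 * Returns the number of commands processed, or -1 on error.
 */
static int process_cmds(struct vits_queue *q)
{
    int n = 0;

    /* The write offset is assumed to have been validated beforehand. */
    assert(q->cmd_write < q->cmd_qsize);
    assert(q->cmd_write % ITS_CMD_SIZE == 0);

    while ( q->cmd_read != q->cmd_write )
    {
        if ( handle_one_cmd(q, q->cmd_read) )
            return -1;
        q->cmd_read = (q->cmd_read + ITS_CMD_SIZE) % q->cmd_qsize;
        n++;
    }

    return n;
}
```

With something like this there is no need for a saved copy of the write
offset, and the guest can see progress by polling GITS_CREADR.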

Also, you are taking the VITS lock for the whole process. This process 
can be very long. How will it affect the other vCPUs of the domain?

Finally, in an environment with multiple guests using the ITS, the ITS
commands sent to the physical ITS may be interleaved (i.e. DOM1 cmd,
DOM2 cmd, DOM1 cmd ...). Is there any possible side effect?

Regards,

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [RFC PATCH v2 14/22] xen/arm: its: Add emulation of ITS control registers
  2015-03-19 14:38 ` [RFC PATCH v2 14/22] xen/arm: its: Add emulation of ITS control registers vijay.kilari
@ 2015-03-24 17:12   ` Julien Grall
  0 siblings, 0 replies; 109+ messages in thread
From: Julien Grall @ 2015-03-24 17:12 UTC (permalink / raw)
  To: vijay.kilari, Ian.Campbell, stefano.stabellini,
	stefano.stabellini, tim, xen-devel
  Cc: Prasun.Kapoor, vijaya.kumar, manish.jaggi

Hello Vijay,

On 19/03/2015 14:38, vijay.kilari@gmail.com wrote:
> From: Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com>
>
> Add support for emulating GITS_* registers
>
> Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com>
> ---
> v2: - Each Virtual ITS is attached to Physical ITS.
>      - Introduce helper function to lock and unlock
>        virtual ITS lock.
>      - Introduced helper to get virtual ITS structure pointer
>        based on emulation address.
> ---
>   xen/arch/arm/gic-v3-its.c     |    8 +
>   xen/arch/arm/vgic-v3-its.c    |  412 +++++++++++++++++++++++++++++++++++++++++
>   xen/include/asm-arm/gic-its.h |    1 +
>   3 files changed, 421 insertions(+)
>
> diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
> index a9aab73..e382f8d 100644
> --- a/xen/arch/arm/gic-v3-its.c
> +++ b/xen/arch/arm/gic-v3-its.c
> @@ -101,6 +101,8 @@ struct its_node {
>   };
>
>   uint32_t pta_type;
> +/* Number of physical its nodes present */
> +uint32_t nr_its = 0;

This variable is not exported, so it should be static.

Although I'm not convinced this variable is useful. See my comments later.

>
>   #define ITS_ITT_ALIGN		SZ_256
>
> @@ -146,6 +148,11 @@ uint32_t its_get_pta_type(void)
>   	return pta_type;
>   }
>
> +uint32_t its_get_nr_its(void)
> +{
> +	return nr_its;
> +}
> +
>   struct its_node * its_get_phys_node(uint32_t dev_id)
>   {
>   	struct its_node *its;
> @@ -1170,6 +1177,7 @@ static int its_probe(struct dt_device_node *node)
>   	}
>   	spin_lock(&its_lock);
>   	list_add(&its->entry, &its_nodes);
> +	nr_its++;
>   	spin_unlock(&its_lock);
>
>   	return 0;
> diff --git a/xen/arch/arm/vgic-v3-its.c b/xen/arch/arm/vgic-v3-its.c
> index 7530a88..4d8945f 100644
> --- a/xen/arch/arm/vgic-v3-its.c
> +++ b/xen/arch/arm/vgic-v3-its.c
> @@ -869,6 +869,418 @@ err:
>       return 0;
>   }
>
> +struct vgic_its *its_to_vits(struct vcpu *v, paddr_t phys_base)

This function is not exported, so it should be static.

Also, it looks to me like the vcpu is not necessary. Please use
struct domain *d.

> +{
> +    struct vgic_its *vits = NULL;
> +    int i;
> +
> +    /* Mask 64K offset */
> +    phys_base = phys_base & ~(SZ_64K - 1);
> +    if ( is_hardware_domain(v->domain) )

Why do you need to have a specific case for the hardware domain? All the 
vITS code should be domain agnostic except the initialization function.

That would make the code a lot simpler.

> +    {
> +        for ( i = 0; i < its_get_nr_its(); i++ )

I would prefer if you introduce a new field in domain->arch to store the 
number of ITS for the domain.


> +        {
> +            if ( v->domain->arch.vits[i].phys_base == phys_base )
> +            {
> +                vits =  &v->domain->arch.vits[i];
> +                break;
> +            }
> +        }
> +    }
> +    else
> +        vits = &v->domain->arch.vits[0];

You should not assume that the guest has only one vITS.

> +
> +    return vits;
> +}
> +
> +static inline void vits_spin_lock(struct vgic_its *vits)
> +{
> +    spin_lock(&vits->lock);
> +}
> +
> +static inline void vits_spin_unlock(struct vgic_its *vits)
> +{
> +    spin_unlock(&vits->lock);
> +}
> +
> +static int vgic_v3_gits_mmio_read(struct vcpu *v, mmio_info_t *info)
> +{
> +    struct vgic_its *vits;
> +    struct hsr_dabt dabt = info->dabt;
> +    struct cpu_user_regs *regs = guest_cpu_user_regs();
> +    register_t *r = select_user_reg(regs, dabt.reg);
> +    uint64_t val = 0;
> +    uint32_t index, gits_reg;
> +
> +    vits = its_to_vits(v, info->gpa);
> +    if ( vits == NULL ) BUG_ON(1);

BUG_ON(vits == NULL);

Although I would document this BUG_ON to explain that its_to_vits should
never fail because the MMIO regions registered always point to an ITS.

Though an ASSERT may be better here.

> +
> +    gits_reg = info->gpa - vits->phys_base;
> +
> +    if ( gits_reg >= SZ_64K )
> +    {
> +        gdprintk(XENLOG_G_WARNING, "vGITS: unknown gpa read address \
> +                  %"PRIpaddr"\n", info->gpa);
> +        return 0;
> +    }

This can never happen: you always register a 64K range.

> +
> +    switch ( gits_reg )
> +    {
> +    case GITS_CTLR:
> +        if ( dabt.size != DABT_WORD ) goto bad_width;

Missing implementation for GITS_CTLR

> +        return 1;
> +    case GITS_IIDR:
> +        if ( dabt.size != DABT_WORD ) goto bad_width;

Missing implementation for GITS_IIDR

> +        return 1;
> +    case GITS_TYPER:
> +         /* GITS_TYPER support word read */
> +        vits_spin_lock(vits);
> +        val = ((its_get_pta_type() << VITS_GITS_TYPER_PTA_SHIFT) |

As said on a previous patch, each ITS may have a different value in PTA.
I think it would make the command emulation simpler if we used a
hardcoded PTA (PTA = 0, i.e. using linear processor numbers, seems the
simplest).

> +               VITS_GITS_TYPER_HCC   | VITS_GITS_DEV_BITS |

I will comment on the values here.

Where do the values of HCC and DEV_BITS come from? Both of them look
wrong to me.

> +               VITS_GITS_ID_BITS     | VITS_GITS_ITT_SIZE |

Ditto for ID_BITS and ITT_SIZE.

Although it looks like ITT_SIZE should be the same as on the hardware.
This is because you let the guest allocate the ITT.

I would also rename ITT_SIZE to ITT_ENTRY_SIZE.

> +               VITS_GITS_DISTRIBUTED | VITS_GITS_PLPIS);

Bit 3 is marked as implementation defined. So why did you name it
DISTRIBUTED?

> +        if ( dabt.size == DABT_DOUBLE_WORD )
> +            *r = val;
> +        else if ( dabt.size == DABT_WORD )
> +            *r = (u32)(val >> 32);
> +        else
> +        {
> +            vits_spin_unlock(vits);
> +            goto bad_width;
> +        }
> +        vits_spin_unlock(vits);

The vits_spin_unlock could be done before setting *r.

> +        return 1;
> +    case GITS_TYPER + 4:

I don't like the idea of duplicating the code for GITS_TYPER just for
reading the top word. Isn't it possible to merge the 2 switch cases?

Reading the spec again, it's not mandatory to support 32-bit accesses
on 64-bit registers.

Given that we don't support GICv3 for 32-bit guests, I would completely
drop the 32-bit access on 64-bit registers.

> +        if (dabt.size != DABT_WORD ) goto bad_width;

if ( ...

> +        vits_spin_lock(vits);
> +        val = ((its_get_pta_type() << VITS_GITS_TYPER_PTA_SHIFT) |
> +               VITS_GITS_TYPER_HCC   | VITS_GITS_DEV_BITS |
> +               VITS_GITS_ID_BITS     | VITS_GITS_ITT_SIZE |
> +               VITS_GITS_DISTRIBUTED | VITS_GITS_PLPIS);
> +        *r = (u32)val;
> +        vits_spin_unlock(vits);
> +        return 1;
> +    case 0x0010 ... 0x007c:
> +    case 0xc000 ... 0xffcc:
> +        /* Implementation defined -- read ignored */
> +        dprintk(XENLOG_ERR,
> +                "vGITS: read unknown 0x000c - 0x007c r%d offset %#08x\n",
> +                dabt.reg, gits_reg);

Please don't use XENLOG_ERR in guest code. Also, this printk is not
useful and has been dropped in the other emulations.

> +        goto read_as_zero;
> +    case GITS_CBASER:
> +        vits_spin_lock(vits);
> +        if ( dabt.size == DABT_DOUBLE_WORD )
> +            *r = vits->cmd_base && 0xc7ffffffffffffffUL;

Why the && and what does this constant mean?
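
To illustrate the problem with a standalone, untested sketch (the
function and macro names are made up, only the constant comes from the
patch): a logical AND collapses the register value to 0 or 1, whereas
a bitwise AND keeps the bits selected by the mask.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical name for the constant used in the patch. */
#define CBASER_READ_MASK 0xc7ffffffffffffffULL

/* Logical AND: any non-zero value becomes 1. */
static uint64_t read_logical(uint64_t cmd_base)
{
    return cmd_base && CBASER_READ_MASK;
}

/* Bitwise AND: keeps only the bits selected by the mask. */
static uint64_t read_bitwise(uint64_t cmd_base)
{
    return cmd_base & CBASER_READ_MASK;
}

/* Usage example: the two reads disagree for a realistic value. */
static void show_difference(void)
{
    uint64_t base = 0xb8000000deadb00fULL;

    assert(read_logical(base) == 1);
    assert(read_bitwise(base) == 0x80000000deadb00fULL);
}
```

So with the current code the guest would read back 1 instead of the
base address it programmed.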

> +        else if ( dabt.size == DABT_WORD )
> +            *r = (u32)vits->cmd_base;
> +        else
> +        {
> +            vits_spin_unlock(vits);
> +            goto bad_width;
> +        }
> +        vits_spin_unlock(vits);
> +        return 1;
> +    case GITS_CBASER + 4:
> +         /* CBASER support word read */
> +        if (dabt.size != DABT_WORD ) goto bad_width;

if ( ...

> +        vits_spin_lock(vits);
> +        *r = (u32)(vits->cmd_base >> 32);
> +        vits_spin_unlock(vits);
> +        return 1;

Same remark as GITS_TYPER for the word read support.

> +    case GITS_CWRITER:
> +        vits_spin_lock(vits);
> +        if ( dabt.size == DABT_DOUBLE_WORD )
> +            *r = vits->cmd_write;
> +        else if ( dabt.size == DABT_WORD )
> +            *r = (u32)vits->cmd_write;
> +        else
> +        {
> +            vits_spin_unlock(vits);
> +            goto bad_width;
> +        }
> +        vits_spin_unlock(vits);
> +        return 1;
> +    case GITS_CWRITER + 4:
> +         /* CWRITER support word read */
> +        if ( dabt.size != DABT_WORD ) goto bad_width;
> +        vits_spin_lock(vits);
> +        *r = (u32)(vits->cmd_write >> 32);
> +        vits_spin_unlock(vits);
> +        return 1;

Ditto for the word read.

> +    case GITS_CREADR:
> +        vits_spin_lock(vits);
> +        if ( dabt.size == DABT_DOUBLE_WORD )
> +            *r = vits->cmd_read;
> +        else if ( dabt.size == DABT_WORD )
> +            *r = (u32)vits->cmd_read;
> +        else
> +        {
> +            vits_spin_unlock(vits);
> +            goto bad_width;
> +        }
> +        vits_spin_unlock(vits);
> +        return 1;
> +    case GITS_CREADR + 4:
> +         /* CREADR support word read */
> +        if ( dabt.size != DABT_WORD ) goto bad_width;
> +        vits_spin_lock(vits);
> +        *r = (u32)(vits->cmd_read >> 32);
> +        vits_spin_unlock(vits);
> +        return 1;

Ditto

> +    case 0x0098 ... 0x009c:
> +    case 0x00a0 ... 0x00fc:
> +    case 0x0140 ... 0xbffc:
> +        /* Reserved -- read ignored */
> +        dprintk(XENLOG_ERR,
> +                "vGITS: read unknown 0x0098-9c or 0x00a0-fc r%d offset %#08x\n",
> +                dabt.reg, gits_reg);

No need for a printk here.

> +        goto read_as_zero;
> +    case GITS_BASER ... GITS_BASERN:

The spec says that registers are RES0 if not implemented.
As you don't use baser at all outside the register emulation, I would
implement them as RAZ/WI.

That would avoid a wrong write implementation.

> +        vits_spin_lock(vits);
> +        index = (gits_reg - GITS_BASER) / 8;
> +        if ( dabt.size == DABT_DOUBLE_WORD )
> +            *r = vits->baser[index];
> +        else if ( dabt.size == DABT_WORD )
> +        {
> +            if ( (gits_reg % 8) == 0 )
> +                *r = (u32)vits->baser[index];
> +            else
> +                *r = (u32)(vits->baser[index] >> 32);
> +        }
> +        else
> +        {
> +            vits_spin_unlock(vits);
> +            goto bad_width;
> +        }
> +        vits_spin_unlock(vits);
> +        return 1;
> +    case GITS_PIDR0:
> +        if ( dabt.size != DABT_WORD ) goto bad_width;
> +        *r = GITS_PIDR0_VAL;
> +        return 1;
> +    case GITS_PIDR1:
> +        if ( dabt.size != DABT_WORD ) goto bad_width;
> +        *r = GITS_PIDR1_VAL;
> +        return 1;
> +    case GITS_PIDR2:
> +        if ( dabt.size != DABT_WORD ) goto bad_width;
> +        *r = GITS_PIDR2_VAL;
> +        return 1;
> +    case GITS_PIDR3:
> +        if ( dabt.size != DABT_WORD ) goto bad_width;
> +        *r = GITS_PIDR3_VAL;
> +        return 1;
> +    case GITS_PIDR4:
> +        if ( dabt.size != DABT_WORD ) goto bad_width;
> +        *r = GITS_PIDR4_VAL;
> +        return 1;
> +    case GITS_PIDR5 ... GITS_PIDR7:
> +        goto read_as_zero;

Please check that the access is a word access by introducing a
new label read_as_zero_32 (for instance, see the vgic v2 emulation).

> +   default:
> +        dprintk(XENLOG_ERR, "vGITS: unhandled read r%d offset %#08x\n",
> +               dabt.reg, gits_reg);

printk(XENLOG_G_ERR "%pv: ....", v,...)

Also it may be useful to printk which vITS is in use.

> +        return 0;
> +    }
> +
> +bad_width:
> +    dprintk(XENLOG_ERR, "vGITS: bad read width %d r%d offset %#08x\n",
> +           dabt.size, dabt.reg, gits_reg);

printk(XENLOG_G_ERR "%pv: ...", v,...)

Same remark for printing the vITS.

> +    domain_crash_synchronous();
> +    return 0;
> +
> +read_as_zero:
> +    if ( dabt.size != DABT_WORD ) goto bad_width;

How do you know that all RAZ accesses are 32-bit accesses? See the
implementation defined registers for instance.

I would prefer to introduce multiple labels:

read_as_zero_32: /* RAZ 32-bit */
     if ( dabt.size != DABT_WORD ) goto bad_width;

read_as_zero: /* No check necessary */
     *r = 0;

And use the correct label for each goto in the emulation. So the code
would be self-documented too.
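
A slightly fuller, untested sketch of that layout (standalone C; the
enum, the register offsets and the function name are hypothetical
stand-ins, not the Xen definitions):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the access-size encodings. */
enum access_size { SIZE_BYTE, SIZE_HALF, SIZE_WORD, SIZE_DOUBLE };

/* Hypothetical register offsets, for illustration only. */
#define REG_WORD_ONLY 0x0004u  /* read-as-zero, 32-bit access only */
#define REG_ANY_SIZE  0x0010u  /* read-as-zero, any access size */

/* Returns 1 if the read was emulated, 0 if the access is rejected. */
static int emulate_read(uint32_t reg, enum access_size size, uint64_t *r)
{
    assert(r != NULL);

    switch ( reg )
    {
    case REG_WORD_ONLY:
        goto read_as_zero_32;
    case REG_ANY_SIZE:
        goto read_as_zero;
    default:
        return 0;
    }

bad_width:
    return 0;

read_as_zero_32: /* RAZ, 32-bit access only */
    if ( size != SIZE_WORD ) goto bad_width;
    /* Fall through */
read_as_zero:    /* RAZ, no size check necessary */
    *r = 0;
    return 1;
}
```

Each case then states which kind of RAZ register it is just by the
label it jumps to.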

> +    *r = 0;
> +    return 1;
> +}
> +
> +static int vgic_v3_gits_mmio_write(struct vcpu *v, mmio_info_t *info)
> +{
> +    struct vgic_its *vits;
> +    struct hsr_dabt dabt = info->dabt;
> +    struct cpu_user_regs *regs = guest_cpu_user_regs();
> +    register_t *r = select_user_reg(regs, dabt.reg);
> +    int ret;
> +    uint32_t index, gits_reg;
> +    uint64_t val;
> +
> +    vits = its_to_vits(v, info->gpa);
> +    if ( vits == NULL ) BUG_ON(1);

Same remark as the BUG_ON in vgic_v3_gits_mmio_read.

I'm wondering if it would be better to extend the read/write handlers to
take an opaque pointer as a parameter.

In this case, it would contain the vits and would avoid calling
its_to_vits every time.
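
Roughly what I have in mind, as an untested standalone sketch (all the
types and names below are hypothetical and much simpler than the real
Xen MMIO handler interface):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical handler type carrying an opaque pointer. */
typedef int (*mmio_read_t)(uint64_t gpa, uint64_t *val, void *priv);

/* Hypothetical registered MMIO region. */
struct mmio_region {
    uint64_t base;
    uint64_t size;
    mmio_read_t read;
    void *priv;          /* passed back to the handler untouched */
};

/* Hypothetical per-vITS state. */
struct vits {
    uint64_t phys_base;
    uint32_t ctlr;
};

/* The handler gets its vITS directly, no lookup by address needed. */
static int vits_read(uint64_t gpa, uint64_t *val, void *priv)
{
    struct vits *vits = priv;

    assert(vits != NULL);

    if ( gpa - vits->phys_base != 0 )  /* only offset 0 is emulated here */
        return 0;

    *val = vits->ctlr;
    return 1;
}

/* Find the region covering gpa and call its handler with its priv. */
static int dispatch_read(const struct mmio_region *r, size_t n,
                         uint64_t gpa, uint64_t *val)
{
    for ( size_t i = 0; i < n; i++ )
        if ( gpa >= r[i].base && gpa - r[i].base < r[i].size )
            return r[i].read(gpa, val, r[i].priv);

    return 0;
}
```

The pointer is set once at registration time, so the handlers never
have to search the array of vITS.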

> +    gits_reg = info->gpa - vits->phys_base;
> +
> +    if ( gits_reg >= SZ_64K )
> +    {
> +        gdprintk(XENLOG_G_WARNING, "vGIC-ITS: unknown gpa write address"
> +                 " %"PRIpaddr"\n", info->gpa);
> +        return 0;
> +    }

This check is not necessary.

> +    switch ( gits_reg )
> +    {
> +    case GITS_CTLR:
> +        if ( dabt.size != DABT_WORD ) goto bad_width;
> +        vits_spin_lock(vits);
> +        vits->ctrl = *r;

Only bit[0] (Enabled) is writable.
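
For instance (untested sketch; the macro and function names are
hypothetical, and it only assumes that Enabled is bit 0):

```c
#include <assert.h>
#include <stdint.h>

/* Assumed: Enabled is bit 0 and is the only guest-writable bit. */
#define VITS_CTLR_ENABLE (1u << 0)

/* Merge a guest write into the current value, keeping read-only bits. */
static uint32_t ctlr_write(uint32_t old, uint32_t val)
{
    return (old & ~VITS_CTLR_ENABLE) | (val & VITS_CTLR_ENABLE);
}

/* Usage example: read-only bits survive whatever the guest writes. */
static void ctlr_write_example(void)
{
    assert(ctlr_write(0x80000000u, 0xffffffffu) == 0x80000001u);
    assert(ctlr_write(0x80000001u, 0x0u) == 0x80000000u);
}
```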

> +        vits_spin_unlock(vits);
> +        return 1;
> +    case GITS_IIDR:
> +        /* R0 -- write ignored */
> +        goto write_ignore;

goto write_ignore_32;

> +    case GITS_TYPER:
> +    case GITS_TYPER + 4:
> +        /* R0 -- write ignored */
> +        goto write_ignore;

Please explicitly check the access size. That would avoid crashing the
guest when TYPER is written with a 64-bit access.

> +    case 0x0010 ... 0x007c:
> +    case 0xc000 ... 0xffcc:
> +        /* Implementation defined -- write ignored */
> +        dprintk(XENLOG_ERR,
> +                "vGITS: write to unknown 0x000c - 0x007c r%d offset %#08x\n",
> +                dabt.reg, gits_reg);

Please drop the dprintk.

> +        goto write_ignore;
> +    case GITS_CBASER:
> +        if ( dabt.size == DABT_BYTE ) goto bad_width;

Please invert the check, i.e.
(dabt.size != DABT_WORD) && (dabt.size != DABT_DOUBLE_WORD)

Also, GITS_CBASER is read-only when GITS_CTLR.Enabled is one or
GITS_CTLR.Quiescent is zero.

> +        vits_spin_lock(vits);
> +        if ( dabt.size == DABT_DOUBLE_WORD )
> +            vits->cmd_base = *r;
> +        else
> +        {
> +            val = vits->cmd_base & 0xffffffff00000000UL;

The mask is difficult to read. And not all the bits are writeable.

> +            val = (*r) | val;
> +            vits->cmd_base =  val;
> +        }
> +        vits->cmd_qsize  =  SZ_4K * ((*r & 0xff) + 1);

Please use a define for the mask.
Also, I would use cmd_qsize to know whether the valid bit is set or not,
i.e. cmd_qsize = 0 => command queue not valid.

You forgot to update GITS_CREADR (i.e. set it to 0) when GITS_CBASER is
successfully written.
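
Something like this untested standalone sketch (the struct, macros and
function name are hypothetical; it assumes Valid is bit 63 and that the
Size field in bits [7:0] holds the number of 4KB pages minus one):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed field layout of the register, see the note above. */
#define VITS_CBASER_VALID     (1ULL << 63)
#define VITS_CBASER_SIZE_MASK 0xffULL
#define VITS_SZ_4K            0x1000ULL

/* Hypothetical, simplified command queue state. */
struct vits_cmdq {
    uint64_t cmd_base;   /* raw value written by the guest */
    uint64_t cmd_qsize;  /* queue size in bytes, 0 => queue not valid */
    uint64_t cmd_read;   /* value returned when reading GITS_CREADR */
};

/* Emulate a 64-bit write to the command queue base register. */
static void cbaser_write(struct vits_cmdq *q, uint64_t val)
{
    assert(q != NULL);

    q->cmd_base = val;

    if ( val & VITS_CBASER_VALID )
        q->cmd_qsize = VITS_SZ_4K * ((val & VITS_CBASER_SIZE_MASK) + 1);
    else
        q->cmd_qsize = 0;

    /* A new base restarts the queue: the read offset goes back to 0. */
    q->cmd_read = 0;
}
```

The command processing code can then bail out early when cmd_qsize is 0.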

> +        vits_spin_unlock(vits);
> +        return 1;
> +    case GITS_CBASER + 4:

32-bit support is not necessary and makes the code more complex for nothing.

> +         /* CBASER support word read */
> +        if (dabt.size != DABT_WORD ) goto bad_width;
> +        vits_spin_lock(vits);
> +        val = vits->cmd_base & 0xffffffffUL;
> +        val = ((*r & 0xffffffffUL) << 32 ) | val;
> +        vits->cmd_base =  val;
> +        /* No Need to update cmd_qsize with higher word write */
> +        vits_spin_unlock(vits);
> +        return 1;
> +    case GITS_CWRITER:
> +        if ( dabt.size == DABT_BYTE ) goto bad_width;
> +        vits_spin_lock(vits);
> +        if ( dabt.size == DABT_DOUBLE_WORD )
> +            vits->cmd_write = *r;

Only Bits[19:5] are writable.

> +        else
> +        {
> +            val = vits->cmd_write & 0xffffffff00000000UL;
> +            val = (*r) | val;
> +            vits->cmd_write =  val;
> +        }

No validation of the value written by the guest? Given your
implementation of the command processing, any invalid value will end up
in an infinite loop in the hypervisor. Whoops :).
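
A minimal, untested sketch of the kind of validation I would expect
(standalone C; the macro and function names are hypothetical, and the
mask simply encodes bits [19:5] as mentioned above):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bits [19:5]: offset of the next command slot, 32-byte aligned. */
#define VITS_CWRITER_OFFSET_MASK 0xfffe0ULL

/*
 * Validate a guest write to the write offset register.
 * qsize is the command queue size in bytes (0 => queue not valid).
 * Returns 1 and updates *cmd_write if the value is acceptable,
 * returns 0 and leaves *cmd_write untouched otherwise.
 */
static int cwriter_write(uint64_t qsize, uint64_t val, uint64_t *cmd_write)
{
    uint64_t off = val & VITS_CWRITER_OFFSET_MASK;

    assert(cmd_write != NULL);

    if ( qsize == 0 || off >= qsize )
        return 0;

    *cmd_write = off;
    return 1;
}
```

With the offset guaranteed to be aligned and inside the queue, the
processing loop is guaranteed to reach it and terminate.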

> +        ret = vgic_its_process_cmd(v, vits);
> +        vits_spin_unlock(vits);
> +        return ret;
> +    case GITS_CWRITER + 4:

Same remark as GITS_CBASER for the 32-bit support.

> +        if (dabt.size != DABT_WORD ) goto bad_width;
> +        vits_spin_lock(vits);
> +        val = vits->cmd_write & 0xffffffffUL;
> +        val = ((*r & 0xffffffffUL) << 32) | val;
> +        vits->cmd_write =  val;
> +        ret = vgic_its_process_cmd(v, vits);
> +        vits_spin_unlock(vits);
> +        return ret;
> +    case GITS_CREADR:
> +        /* R0 -- write ignored */
> +        goto write_ignore;
> +    case 0x0098 ... 0x009c:
> +    case 0x00a0 ... 0x00fc:
> +    case 0x0140 ... 0xbffc:
> +        /* Reserved -- write ignored */
> +        dprintk(XENLOG_ERR,
> +                "vGITS: write to unknown 0x98-9c or 0xa0-fc r%d offset %#08x\n",
> +                dabt.reg, gits_reg);

Please drop the dprintk

> +        goto write_ignore;
> +    case GITS_BASER ... GITS_BASERN:
> +        /* Nothing to do with this values. Just store and emulate */

As you don't use those values at all, write ignore would be better.

> +        vits_spin_lock(vits);
> +        index = (gits_reg - GITS_BASER) / 8;
> +        if ( dabt.size == DABT_DOUBLE_WORD )
> +            vits->baser[index] = *r;
> +        else if ( dabt.size == DABT_WORD )
> +        {
> +            if ( (gits_reg % 8) == 0 )
> +            {
> +                val = vits->cmd_write & 0xffffffff00000000UL;

cmd_write seems to come out of nowhere...

> +                val = (*r) | val;
> +                vits->baser[index] = val;
> +            }
> +            else
> +            {
> +                val = vits->baser[index] & 0xffffffffUL;
> +                val = ((*r & 0xffffffffUL) << 32) | val;
> +                vits->baser[index] = val;
> +            }
> +        }
> +        else
> +        {
> +            goto bad_width;
> +            vits_spin_unlock(vits);
> +        }
> +        vits_spin_unlock(vits);
> +        return 1;
> +    case GITS_PIDR7 ... GITS_PIDR0:
> +        /* R0 -- write ignored */
> +        goto write_ignore;
> +   default:
> +        dprintk(XENLOG_ERR, "vGITS: unhandled write r%d offset %#08x\n",
> +                dabt.reg, gits_reg);

printk(XENLOG_G_ERR "%pv: .....", v, ....);

+ Print which ITS is in use.

> +        return 0;
> +    }
> +
> +bad_width:
> +    dprintk(XENLOG_ERR, "vGITS: bad write width %d r%d offset %#08x\n",
> +           dabt.size, dabt.reg, gits_reg);

Ditto

> +    domain_crash_synchronous();
> +    return 0;
> +
> +write_ignore:
> +    if ( dabt.size != DABT_WORD ) goto bad_width;
> +    *r = 0;
> +    return 1;

Same remark as read_as_zero.

> +}
> +
> +static const struct mmio_handler_ops vgic_gits_mmio_handler = {
> +    .read_handler  = vgic_v3_gits_mmio_read,
> +    .write_handler = vgic_v3_gits_mmio_write,
> +};
> +
> +int vgic_its_domain_init(struct domain *d)

You forgot to add the prototype of this function in the header...

> +{

This code is not really part of the ITS registers emulation...

The way your patch series is split is really confusing.

> +    uint32_t num_its;
> +    int i;
> +
> +    num_its =  its_get_nr_its();
> +
> +    d->arch.vits = xzalloc_array(struct vgic_its, num_its);

Hmm... why did you use the number of physical ITS rather than the number
of vITS used by the guest?

It would avoid wasting so much memory for every domain.

> +    if ( d->arch.vits == NULL )
> +        return -ENOMEM;
> +
> +    spin_lock_init(&d->arch.vits->lock);
> +
> +    spin_lock_init(&d->arch.vits_devs.lock);
> +    INIT_LIST_HEAD(&d->arch.vits_devs.dev_list);
> +
> +    d->arch.lpi_conf = xzalloc(struct vgic_lpi_conf);
> +    if ( d->arch.lpi_conf == NULL )
> +         return -ENOMEM;
> +
> +    for ( i = 0; i < num_its; i++)
> +    {
> +         spin_lock_init(&d->arch.vits[i].lock);
> +         register_mmio_handler(d, &vgic_gits_mmio_handler,
> +                               d->arch.vits[i].phys_base,
> +                               SZ_64K);
> +    }
> +
> +    return 0;
> +}
> +
>   /*
>    * Local variables:
>    * mode: C
> diff --git a/xen/include/asm-arm/gic-its.h b/xen/include/asm-arm/gic-its.h
> index 70ec913..82cfbdc 100644
> --- a/xen/include/asm-arm/gic-its.h
> +++ b/xen/include/asm-arm/gic-its.h
> @@ -227,6 +227,7 @@ int gic_its_send_cmd(struct vcpu *v, struct its_node *its,
>   void its_lpi_free(unsigned long *bitmap, int base, int nr_ids);
>   unsigned long *its_lpi_alloc_chunks(int nirqs, int *base, int *nr_ids);
>   uint32_t its_get_pta_type(void);
> +uint32_t its_get_nr_its(void);
>   struct its_node * its_get_phys_node(uint32_t dev_id);
>   #endif /* __ASM_ARM_GIC_ITS_H__ */

Regards,

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [RFC PATCH v2 15/22] xen/arm: its: Add support to emulate GICR register for LPIs
  2015-03-19 14:38 ` [RFC PATCH v2 15/22] xen/arm: its: Add support to emulate GICR register for LPIs vijay.kilari
@ 2015-03-27 15:46   ` Julien Grall
  0 siblings, 0 replies; 109+ messages in thread
From: Julien Grall @ 2015-03-27 15:46 UTC (permalink / raw)
  To: vijay.kilari, Ian.Campbell, stefano.stabellini,
	stefano.stabellini, tim, xen-devel
  Cc: Prasun.Kapoor, vijaya.kumar, manish.jaggi

Hello Vijay,

On 19/03/15 14:38, vijay.kilari@gmail.com wrote:
> From: Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com>
> 
> With this patch add emulation of GICR registers for LPIs.
> Also add LPI property table emulation.
> 
> Domain's LPI property table is unmapped during domain init
> on LPIPROPBASE update and trapped on LPI property
> table read and write
> 
> Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com>
> ---
>  xen/arch/arm/vgic-v3-its.c        |  144 +++++++++++++++++++++++++++++++++++++
>  xen/arch/arm/vgic-v3.c            |   64 +++++++++++++----
>  xen/include/asm-arm/domain.h      |    1 +
>  xen/include/asm-arm/gic-its.h     |    1 +
>  xen/include/asm-arm/gic.h         |    2 +
>  xen/include/asm-arm/gic_v3_defs.h |    2 +
>  6 files changed, 200 insertions(+), 14 deletions(-)
> 
> diff --git a/xen/arch/arm/vgic-v3-its.c b/xen/arch/arm/vgic-v3-its.c
> index 4d8945f..f1d68d9 100644
> --- a/xen/arch/arm/vgic-v3-its.c
> +++ b/xen/arch/arm/vgic-v3-its.c
> @@ -869,6 +869,150 @@ err:
>      return 0;
>  }
>  
> +/* Search device structure and get corresponding plpi */
> +int vgic_its_get_pid(struct vcpu *v, uint32_t vlpi, uint32_t *plpi)

static int ....

Also you either return 0 or 1 so the return type should be bool_t.

> +{
> +    struct domain *d = v->domain;
> +    struct its_device *dev;
> +    int i = 0;
> +
> +    spin_lock(&d->arch.vits_devs.lock);
> +    list_for_each_entry( dev, &d->arch.vits_devs.dev_list, entry )
> +    {
> +        i = 0;
> +        while ((i = find_next_bit(dev->vlpi_map, dev->nr_lpis, i)) < dev->nr_lpis )
> +        {
> +            if ( dev->vlpi_entries[i].vlpi == vlpi )
> +            {
> +                *plpi = dev->vlpi_entries[i].plpi;
> +                spin_unlock(&d->arch.vits_devs.lock);
> +                return 0;
> +            }
> +            i++;
> +        }
> +    }
> +    spin_unlock(&d->arch.vits_devs.lock);
> +

The cost of this function seems high (two nested loops). How often
will it be called?

> +    return 1;
> +}
> +
> +static int vgic_v3_gits_lpi_mmio_read(struct vcpu *v, mmio_info_t *info)
> +{
> +    uint32_t offset;
> +    struct hsr_dabt dabt = info->dabt;
> +    struct cpu_user_regs *regs = guest_cpu_user_regs();
> +    register_t *r = select_user_reg(regs, dabt.reg);
> +    uint8_t cfg;
> +
> +    offset = info->gpa -
> +             (v->domain->arch.lpi_conf->propbase & 0xfffffffff000UL);
> +
> +    if ( offset < SZ_64K )

This check is pointless: you have registered the handler on a valid
range.

> +    {
> +        DPRINTK("vITS: LPI Table read offset 0x%x\n", offset );
> +        cfg = readb_relaxed(v->domain->arch.lpi_conf->prop_page + offset);

Why do you use readb_relaxed? Those helpers have been created for
reading MMIO, not Xen memory...

Also, what about the other access sizes? 64/32/16-bit accesses are valid
and will return the wrong value.
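
As an untested standalone sketch of a read that copes with every access
size (the function name and parameters are hypothetical; it treats the
table as plain memory and builds the value in little-endian order):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Read 'size' bytes (1, 2, 4 or 8) at 'offset' from a property table
 * held in ordinary memory. Returns 1 on success, 0 if the access size
 * is not supported or the access goes past the end of the table.
 */
static int prop_table_read(const uint8_t *table, uint64_t table_size,
                           uint64_t offset, unsigned int size,
                           uint64_t *val)
{
    uint64_t v = 0;

    assert(table != NULL && val != NULL);

    if ( size != 1 && size != 2 && size != 4 && size != 8 )
        return 0;
    if ( offset > table_size || size > table_size - offset )
        return 0;

    /* One configuration byte per LPI, lowest address in the low bits. */
    for ( unsigned int i = 0; i < size; i++ )
        v |= (uint64_t)table[offset + i] << (8 * i);

    *val = v;
    return 1;
}
```

The write side would need the same treatment, handling each byte of the
access as a separate LPI configuration update.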

> +        *r = cfg;
> +        return 1;
> +    }
> +    else
> +        dprintk(XENLOG_ERR, "vITS: LPI Table read with wrong offset 0x%x\n",
> +                offset);
> +
> +    return 0;
> +}
> +
> +static int vgic_v3_gits_lpi_mmio_write(struct vcpu *v, mmio_info_t *info)
> +{
> +    uint32_t offset;
> +    uint32_t pid, vid;
> +    uint8_t cfg;
> +    bool_t enable;
> +    struct hsr_dabt dabt = info->dabt;
> +    struct cpu_user_regs *regs = guest_cpu_user_regs();
> +    register_t *r = select_user_reg(regs, dabt.reg);
> +
> +    offset = info->gpa -
> +             (v->domain->arch.lpi_conf->propbase & 0xfffffffff000UL);
> +
> +    vid = offset + NR_GIC_LPI;

I think NR_GIC_LPI is misnamed and should be renamed to GIC_LPI_OFFSET.

> +    if ( offset < SZ_64K )

Ditto for the check.

> +    {
> +        DPRINTK("vITS: LPI Table write offset 0x%x\n", offset );
> +        if ( vgic_its_get_pid(v, vid, &pid) )
> +        {
> +            dprintk(XENLOG_ERR, "vITS: pID not found for vid %d\n", vid);

Please don't use XENLOG_ERR, see my comments on a previous patch for why.

> +            return 0;
> +        }
> +      
> +        cfg = readb_relaxed(v->domain->arch.lpi_conf->prop_page + offset);

Same question as before for readb_relaxed.

> +        enable = (cfg & *r) & 0x1;
> +
> +        if ( !enable )
> +             vgic_its_enable_lpis(v, pid);
> +        else
> +             vgic_its_disable_lpis(v, pid);

If I'm not mistaken pid = physical LPI and vid = virtual LPI. So you
should use vid instead of pid for vgic_its_{enable,disable}_lpis.

> +        /* Update virtual prop page */
> +        writeb_relaxed((*r & 0xff),
> +                        v->domain->arch.lpi_conf->prop_page + offset);

Same question as readb_relaxed here.

Also what about the other access size? 64/32/16 bits accesses are valid.

> +        
> +        return 1;
> +    }
> +    else
> +        dprintk(XENLOG_ERR, "vITS: LPI Table write with wrong offset 0x%x\n",
> +                offset);
> +
> +    return 0; 
> +}
> +
> +static const struct mmio_handler_ops vgic_gits_lpi_mmio_handler = {
> +    .read_handler  = vgic_v3_gits_lpi_mmio_read,
> +    .write_handler = vgic_v3_gits_lpi_mmio_write,
> +};

It looks to me like the LPI emulation should be in the GICv3 code,
not the ITS code.

> +
> +int vgic_its_unmap_lpi_prop(struct vcpu *v)
> +{
> +    paddr_t maddr;
> +    uint32_t lpi_size;
> +    int i;
> +    
> +    maddr = v->domain->arch.lpi_conf->propbase & 0xfffffffff000UL;
> +    lpi_size = 1UL << ((v->domain->arch.lpi_conf->propbase & 0x1f) + 1);
> +
> +    DPRINTK("vITS: Unmap guest LPI conf table maddr 0x%lx lpi_size 0x%x\n", 
> +             maddr, lpi_size);
> +
> +    if ( lpi_size < SZ_64K )

Why this restriction? IDbits can encode interrupt identifiers of up to
32 bits.

You have to check this value against GICD_TYPER.IDbits.

> +    {
> +        dprintk(XENLOG_ERR, "vITS: LPI Prop page < 64K\n");

No XENLOG_ERR

> +        return 0;
> +    }
> +
> +    /* XXX: As per 4.8.9 each re-distributor shares a common LPI configuration table 

Coding style:
/*
 *

XXX means TODO for me. So what did you forget to add?

4.8.9 from which spec?

> +     * So one set of mmio handlers to manage configuration table is enough
> +     */
> +    for ( i = 0; i < lpi_size / PAGE_SIZE; i++ )
> +        guest_physmap_remove_page(v->domain, paddr_to_pfn(maddr),
> +                                gmfn_to_mfn(v->domain, paddr_to_pfn(maddr)), 0);

No validation at all of the address passed by the guest? gmfn_to_mfn can
return an invalid MFN and I'm not sure what would happen if the guest
tries to pass something other than RAM.

You may also need to free this unmapped page.

> +    /* Register mmio handlers for this region */
> +    register_mmio_handler(v->domain, &vgic_gits_lpi_mmio_handler,
> +                          maddr, lpi_size);
> +
> +    /* Allocate Virtual LPI Property table */
> +    v->domain->arch.lpi_conf->prop_page =
> +        alloc_xenheap_pages(get_order_from_bytes(lpi_size), 0);

I wasn't able to find a place where you free the pages allocated...

> +    if ( !v->domain->arch.lpi_conf->prop_page )
> +    {
> +        dprintk(XENLOG_ERR, "vITS: Failed to allocate LPI Prop page\n");

No XENLOG_ERR.

> +        return 0;
> +    }
> +
> +    memset(v->domain->arch.lpi_conf->prop_page, 0xa2, lpi_size);

Why?

What about if the guest decides to set another priority? Same if the
guest decides to provide an LPI page with some LPIs enabled.

> +
> +    return 1;
> +}
> +
>  struct vgic_its *its_to_vits(struct vcpu *v, paddr_t phys_base)
>  {
>      struct vgic_its *vits = NULL;
> diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c
> index ec79c2a..e9ec7fa 100644
> --- a/xen/arch/arm/vgic-v3.c
> +++ b/xen/arch/arm/vgic-v3.c
> @@ -30,6 +30,7 @@
>  #include <asm/mmio.h>
>  #include <asm/gic_v3_defs.h>
>  #include <asm/gic.h>
> +#include <asm/gic-its.h>
>  #include <asm/vgic.h>
>  
>  /* GICD_PIDRn register values for ARM implementations */
> @@ -99,20 +100,30 @@ static int __vgic_v3_rdistr_rd_mmio_read(struct vcpu *v, mmio_info_t *info,
>      switch ( gicr_reg )
>      {
>      case GICR_CTLR:
> -        /* We have not implemented LPI's, read zero */
> -        goto read_as_zero_32;
> +        /*
> +         * Enable LPI's for ITS. Direct injection of LPI
> +         * by writing to GICR_{SET,CLR}LPIR are not supported
> +         */

This comment would be more meaningful on the write emulation not read one.

> +        if ( dabt.size != DABT_WORD ) goto bad_width;
> +        vgic_lock(v);
> +        *r = v->domain->arch.vgic.gicr_ctlr;
> +        vgic_unlock(v);
> +        return 1;
>      case GICR_IIDR:
>          if ( dabt.size != DABT_WORD ) goto bad_width;
>          *r = GICV3_GICR_IIDR_VAL;
>          return 1;
>      case GICR_TYPER:
> -        if ( dabt.size != DABT_DOUBLE_WORD ) goto bad_width;
> -        /* TBD: Update processor id in [23:8] when ITS support is added */
> +        if ( dabt.size != DABT_WORD && dabt.size != DABT_DOUBLE_WORD )

Why do you change the access size check? You don't even support WORD
access in this code...

> +            goto bad_width;
> +        /* XXX: Update processor id in [23:8] if GITS_TYPER: PTA is not set */

As said on a previous patch, it would be better if GITS_TYPER were fixed
to a specific value in the emulation, so we don't have to worry about
whether GITS_TYPER.PTA is 0 or 1.

IMHO, GITS_TYPER.PTA = 0 would make the code a lot simpler.

>          aff = (MPIDR_AFFINITY_LEVEL(v->arch.vmpidr, 3) << 56 |
>                 MPIDR_AFFINITY_LEVEL(v->arch.vmpidr, 2) << 48 |
>                 MPIDR_AFFINITY_LEVEL(v->arch.vmpidr, 1) << 40 |
>                 MPIDR_AFFINITY_LEVEL(v->arch.vmpidr, 0) << 32);
>          *r = aff;
> +        /* Set LPI support */
> +        aff |= (GICR_TYPER_DISTRIBUTED_IMP | GICR_TYPER_PLPIS);

Funny, how can Linux work? You don't even expose those 2 bits because
you set aff, not *r...

What is GICR_TYPER_DISTRIBUTED_IMP? It points to bit 3.

However, the spec defines bit 3 as Direct LPI. As you don't implement
the registers GICR_SETLPIR/GICR_CLRLPIR,... this bit should be set to 0.

Finally, we still have to support GICv3 on platforms where the ITS is not
present. So, for instance, GICR_TYPER_PLPIS should not always be set.
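
To make the three points concrete, here is a minimal sketch of how the
value could be built. The function name and the its_present/last_rdist
parameters are assumptions for illustration; the bit definitions are the
ones from the quoted gic_v3_defs.h hunk. Bit 3 is deliberately never set.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions as defined in the quoted gic_v3_defs.h hunk. */
#define GICR_TYPER_PLPIS  (1U << 0)
#define GICR_TYPER_LAST   (1U << 4)

/*
 * aff is the affinity value already shifted into bits [63:32].
 * The flags are OR-ed into the value that is returned to the guest,
 * not into a temporary that is discarded afterwards.
 */
static uint64_t vgicr_typer_value(uint64_t aff, bool its_present,
                                  bool last_rdist)
{
    uint64_t typer = aff;

    if ( its_present )
        typer |= GICR_TYPER_PLPIS;   /* only advertise LPIs with an ITS */

    if ( last_rdist )
        typer |= GICR_TYPER_LAST;

    return typer;
}
```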

>  
>          if ( v->arch.vgic.flags & VGIC_V3_RDIST_LAST )
>              *r |= GICR_TYPER_LAST;
> @@ -131,10 +142,13 @@ static int __vgic_v3_rdistr_rd_mmio_read(struct vcpu *v, mmio_info_t *info,
>          /* WO. Read as zero */
>          goto read_as_zero_64;
>      case GICR_PROPBASER:
> -        /* LPI's not implemented */
> -        goto read_as_zero_64;
> +        if ( dabt.size != DABT_DOUBLE_WORD ) goto bad_width;
> +        /* Remove shareability attribute we don't want dom to flush */

The comment seems misplaced. I don't see such a thing implemented in
the read.

> +        *r = v->domain->arch.lpi_conf->propbase;

A lock is missing.

> +        return 1;
>      case GICR_PENDBASER:
> -        /* LPI's not implemented */
> +        if ( dabt.size != DABT_DOUBLE_WORD ) goto bad_width;
> +        *r = v->domain->arch.lpi_conf->pendbase[v->vcpu_id];

It sounds like pendbase should be stored per vcpu not in a domain array.

Also a lock is missing.

>          goto read_as_zero_64;
>      case GICR_INVLPIR:
>          /* WO. Read as zero */
> @@ -209,8 +223,15 @@ static int __vgic_v3_rdistr_rd_mmio_write(struct vcpu *v, mmio_info_t *info,
>      switch ( gicr_reg )
>      {
>      case GICR_CTLR:
> -        /* LPI's not implemented */
> -        goto write_ignore_32;
> +        /*
> +         * Enable LPI's for ITS. Direct injection of LPI
> +         * by writing to GICR_{SET,CLR}LPIR are not supported
> +         */

This comment should be placed in the read emulation of GICR_TYPER, not
the write/read of GICR_CTLR.

> +        if ( dabt.size != DABT_WORD ) goto bad_width;
> +        vgic_lock(v);
> +        v->domain->arch.vgic.gicr_ctlr = (*r) & GICR_CTL_ENABLE;

GICR_CTL_ENABLE should be named GICR_CTL_ENABLE_LPIS. Anyway, there is
already a define GICR_CTL_ENABLE_LPIS which has been added by you. So
please use it.

Furthermore, if the ITS is not present, this bit should be RES0.

> +        vgic_unlock(v);
> +        return 1;
>      case GICR_IIDR:
>          /* RO */
>          goto write_ignore_32;
> @@ -230,11 +251,26 @@ static int __vgic_v3_rdistr_rd_mmio_write(struct vcpu *v, mmio_info_t *info,
>          /* LPI is not implemented */

Odd, even after your series, there are lots of places with the comment /*
LPI is not implemented */. Did you intend to implement them? Or is it
because they deal with Direct LPI which we don't support? If it's the
latter, then the comment should be updated.

>          goto write_ignore_64;
>      case GICR_PROPBASER:
> -        /* LPI is not implemented */
> -        goto write_ignore_64;
> +        if ( dabt.size != DABT_DOUBLE_WORD ) goto bad_width;
> +        vgic_lock(v);

When GICR_CTLR.EnableLPIs == 1, changing this register is unpredictable.
That means we should deny such a case by crashing the domain.

> +        /* LPI configuration tables are shared across cpus. Should be same */
> +        if ( (v->domain->arch.lpi_conf->propbase != 0) && 
> +             ((v->domain->arch.lpi_conf->propbase & 0xfffffffff000UL) !=  (*r & 0xfffffffff000UL)) )

Multiple problems here:
  * r == 0 is perfectly valid
  * the guest can change propbase at any time when GICR_CTLR.EnableLPIs ==
0. The value only matters when this bit is set to 1

> +        {
> +            dprintk(XENLOG_ERR,

No XENLOG_ERR.

> +                "vGICv3: vITS: Wrong configuration of LPI_PROPBASER\n");

This is part of the vGICv3 not vITS. Also please follow the same pattern
as the other message within the redistributor emulation.

> +            return 0;
> +        }     
> +        v->domain->arch.lpi_conf->propbase = *r;
> +        vgic_unlock(v);
> +        return vgic_its_unmap_lpi_prop(v);
>      case GICR_PENDBASER:
> -        /* LPI is not implemented */
> -        goto write_ignore_64;
> +        /* Just hold pendbaser value for guest read */

Faking the emulation is not a good thing. A guest may try to use this
page and it won't work correctly.

It would take a long time for the developer to understand the problem.

If you think it's not important right now, we should at least notify the
guest in some way.

> +        if ( dabt.size != DABT_DOUBLE_WORD ) goto bad_width;
> +        vgic_lock(v);
> +        v->domain->arch.lpi_conf->pendbase[v->vcpu_id] = *r;
> +        vgic_unlock(v);

You know that taking the VCPU lock doesn't protect concurrent access on
the pendbase?

> +        return 1;
>      case GICR_INVLPIR:
>          /* LPI is not implemented */
>          goto write_ignore_64;
> @@ -703,7 +739,7 @@ static int vgic_v3_distr_mmio_read(struct vcpu *v, mmio_info_t *info)
>                ((v->domain->arch.vgic.nr_spis / 32) & GICD_TYPE_LINES));

Sounds like you forgot to update irq_bits.

>          *r |= (irq_bits - 1) << GICD_TYPE_ID_BITS_SHIFT;
> -

Please keep the blank line after *r |= GICD_TYPE_LPIS.

> +        *r |= GICD_TYPE_LPIS;


It's wrong on platforms without ITS support in Xen.


>          return 1;
>      }
>      case GICD_STATUSR:
> diff --git a/xen/include/asm-arm/domain.h b/xen/include/asm-arm/domain.h
> index bc7aee9..7202f93 100644
> --- a/xen/include/asm-arm/domain.h
> +++ b/xen/include/asm-arm/domain.h
> @@ -101,6 +101,7 @@ struct arch_domain
>          paddr_t dbase; /* Distributor base address */
>          paddr_t cbase; /* CPU base address */
>  #ifdef CONFIG_ARM_64
> +	int gicr_ctlr;

The indentation is wrong.

>          /* GIC V3 addressing */
>          paddr_t dbase_size; /* Distributor base size */
>          /* List of contiguous occupied by the redistributors */
> diff --git a/xen/include/asm-arm/gic-its.h b/xen/include/asm-arm/gic-its.h
> index 82cfbdc..e1a5fa0 100644
> --- a/xen/include/asm-arm/gic-its.h
> +++ b/xen/include/asm-arm/gic-its.h
> @@ -229,6 +229,7 @@ unsigned long *its_lpi_alloc_chunks(int nirqs, int *base, int *nr_ids);
>  uint32_t its_get_pta_type(void);
>  uint32_t its_get_nr_its(void);
>  struct its_node * its_get_phys_node(uint32_t dev_id);
> +int vgic_its_unmap_lpi_prop(struct vcpu *v);
>  #endif /* __ASM_ARM_GIC_ITS_H__ */
>  
>  /*
> diff --git a/xen/include/asm-arm/gic.h b/xen/include/asm-arm/gic.h
> index 6f5767f..f15174b 100644
> --- a/xen/include/asm-arm/gic.h
> +++ b/xen/include/asm-arm/gic.h
> @@ -20,6 +20,7 @@
>  
>  #define NR_GIC_LOCAL_IRQS  NR_LOCAL_IRQS
>  #define NR_GIC_SGI         16
> +#define NR_GIC_LPI         8192

The naming is wrong.

>  #define MAX_RDIST_COUNT    4
>  
>  #define GICD_CTLR       (0x000)
> @@ -96,6 +97,7 @@
>  #define GICD_TYPE_CPUS_SHIFT 5
>  #define GICD_TYPE_CPUS  0x0e0
>  #define GICD_TYPE_SEC   0x400
> +#define GICD_TYPE_LPIS  (0x1UL << 17)
>  
>  #define GICC_CTL_ENABLE 0x1
>  #define GICC_CTL_EOI    (0x1 << 9)
> diff --git a/xen/include/asm-arm/gic_v3_defs.h b/xen/include/asm-arm/gic_v3_defs.h
> index f8bac52..125fc28 100644
> --- a/xen/include/asm-arm/gic_v3_defs.h
> +++ b/xen/include/asm-arm/gic_v3_defs.h
> @@ -45,6 +45,7 @@
>  #define GICC_SRE_EL2_DIB             (1UL << 2)
>  #define GICC_SRE_EL2_ENEL1           (1UL << 3)
>  
> +#define GICR_CTL_ENABLE              (1U << 0)

This definition is misplaced...

>  /* Additional bits in GICD_TYPER defined by GICv3 */
>  #define GICD_TYPE_ID_BITS_SHIFT 19
>  
> @@ -133,6 +134,7 @@
>  
>  #define GICR_TYPER_PLPIS             (1U << 0)
>  #define GICR_TYPER_VLPIS             (1U << 1)
> +#define GICR_TYPER_DISTRIBUTED_IMP   (1U << 3)
>  #define GICR_TYPER_LAST              (1U << 4)
>  
>  #define DEFAULT_PMR_VALUE            0xff
> 

Regards,

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [RFC PATCH v2 16/22] xen/arm: its: implement hw_irq_controller for LPIs
  2015-03-19 14:38 ` [RFC PATCH v2 16/22] xen/arm: its: implement hw_irq_controller " vijay.kilari
@ 2015-03-27 17:02   ` Julien Grall
  0 siblings, 0 replies; 109+ messages in thread
From: Julien Grall @ 2015-03-27 17:02 UTC (permalink / raw)
  To: vijay.kilari, Ian.Campbell, stefano.stabellini,
	stefano.stabellini, tim, xen-devel
  Cc: Prasun.Kapoor, vijaya.kumar, manish.jaggi

Hello Vijay,

On 19/03/15 14:38, vijay.kilari@gmail.com wrote:
> diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
> index 2b406e6..1b3ecd7 100644
> --- a/xen/arch/arm/gic-v3.c
> +++ b/xen/arch/arm/gic-v3.c
> @@ -40,6 +40,7 @@
>  #include <asm/device.h>
>  #include <asm/gic.h>
>  #include <asm/gic_v3_defs.h>
> +#include <asm/gic-its.h>
>  #include <asm/cpufeature.h>
>  
>  struct rdist_region {
> @@ -427,12 +428,18 @@ static void gicv3_poke_irq(struct irq_desc *irqd, u32 offset)
>  
>  static void gicv3_unmask_irq(struct irq_desc *irqd)
>  {
> -    gicv3_poke_irq(irqd, GICD_ISENABLER);
> +    if ( is_lpi(irqd->irq) )
> +        lpi_set_config(irqd, 1);
> +    else
> +        gicv3_poke_irq(irqd, GICD_ISENABLER);
>  }

While Stefano was asking to move the hw_irq_controller in gic-v3.c, I
believe he didn't mean merging them.

The goal of the hw_irq_controller is to avoid unnecessary checks like the
"if ( is_lpi(...) ) /* LPI handling */ else /* GICv3 handling */"

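As a rough sketch of what I have in mind: the structures below are cut-down
stand-ins rather than the real Xen definitions, and lpi_irq_type,
irq_desc_setup and the last_backend field are invented for the example.
The LPI test is done once, when the descriptor is bound to a controller,
and each callback then handles only its own kind of interrupt.

```c
struct irq_desc;

/* Cut-down stand-in for Xen's hw_irq_controller; only one op is shown. */
struct hw_irq_controller {
    const char *typename;
    void (*enable)(struct irq_desc *desc);
};

enum backend { BACKEND_NONE, BACKEND_GICD, BACKEND_LPI_TABLE };

struct irq_desc {
    unsigned int irq;
    const struct hw_irq_controller *handler;
    enum backend last_backend;   /* example-only: records which path ran */
};

static void gicv3_unmask_irq(struct irq_desc *desc)
{
    desc->last_backend = BACKEND_GICD;       /* would poke GICD_ISENABLER */
}

static void lpi_unmask_irq(struct irq_desc *desc)
{
    desc->last_backend = BACKEND_LPI_TABLE;  /* would update the LPI config table */
}

static const struct hw_irq_controller gicv3_irq_type = {
    .typename = "gic-v3",
    .enable   = gicv3_unmask_irq,
};

static const struct hw_irq_controller lpi_irq_type = {
    .typename = "gic-its",
    .enable   = lpi_unmask_irq,
};

/* The LPI test happens once, when the descriptor is set up. */
static void irq_desc_setup(struct irq_desc *desc, unsigned int irq)
{
    desc->irq = irq;
    desc->handler = (irq >= 8192) ? &lpi_irq_type : &gicv3_irq_type;
    desc->last_backend = BACKEND_NONE;
}
```

Callers then simply use desc->handler->enable(desc) without knowing which
kind of interrupt they are dealing with.
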
Regards,

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [RFC PATCH v2 17/22] xen/arm: its: Map ITS translation space
  2015-03-19 14:38 ` [RFC PATCH v2 17/22] xen/arm: its: Map ITS translation space vijay.kilari
@ 2015-03-27 17:07   ` Julien Grall
  0 siblings, 0 replies; 109+ messages in thread
From: Julien Grall @ 2015-03-27 17:07 UTC (permalink / raw)
  To: vijay.kilari, Ian.Campbell, stefano.stabellini,
	stefano.stabellini, tim, xen-devel
  Cc: Prasun.Kapoor, vijaya.kumar, manish.jaggi

Hello Vijay,

On 19/03/15 14:38, vijay.kilari@gmail.com wrote:
> +/*
> + * Map the 64K ITS translation space in guest.
> + * This is required purely for device smmu writes.
> +*/

Could this be avoided if the SMMU is not present?

> +
> +static int vgic_map_translation_space(uint32_t nr_its, struct domain *d)
> +{
> +    uint64_t addr, size;
> +    int ret;
> +
> +    addr = d->arch.vits[nr_its].phys_base + SZ_64K;
> +    size = SZ_64K;
> +    ret = map_mmio_regions(d,
> +                            paddr_to_pfn(addr & PAGE_MASK),
> +                            DIV_ROUND_UP(size, PAGE_SIZE),
> +                            paddr_to_pfn(addr & PAGE_MASK));

The translation space may not be mapped 1:1 to the guest.

> +
> +     if ( ret )
> +     {
> +          printk(XENLOG_ERR "Unable to map to dom%d access to"
> +                   " 0x%"PRIx64" - 0x%"PRIx64"\n",
> +                   d->domain_id,
> +                   addr & PAGE_MASK, PAGE_ALIGN(addr + size) - 1);
> +     }
> +
> +    return ret;
> +}
> +
>  int vgic_its_domain_init(struct domain *d)
>  {
>      uint32_t num_its;
> @@ -1420,6 +1448,8 @@ int vgic_its_domain_init(struct domain *d)
>           register_mmio_handler(d, &vgic_gits_mmio_handler,
>                                 d->arch.vits[i].phys_base,
>                                 SZ_64K);
> +
> +        return vgic_map_translation_space(i, d);

With this you can't support multiple ITS as the loop will return just
after mapping the translation space.
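
A minimal sketch of the loop shape I would expect, returning early only on
error. map_translation_space() here is a stand-in for
vgic_map_translation_space(), and EXAMPLE_MAX_ITS and nr_mapped exist only
to make the example self-contained.

```c
#define EXAMPLE_MAX_ITS 4U       /* arbitrary limit for the example */

static unsigned int nr_mapped;   /* example-only bookkeeping */

/* Stand-in for vgic_map_translation_space(); fails for an unknown index. */
static int map_translation_space(unsigned int its_idx)
{
    if ( its_idx >= EXAMPLE_MAX_ITS )
        return -1;

    nr_mapped++;
    return 0;
}

static int vits_map_all(unsigned int nr_its)
{
    unsigned int i;
    int ret;

    for ( i = 0; i < nr_its; i++ )
    {
        ret = map_translation_space(i);
        if ( ret )
            return ret;   /* bail out only on error */
    }

    return 0;             /* every ITS has been mapped */
}
```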

Regards,

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [RFC PATCH v2 12/22] xen/arm: its: Update irq descriptor for LPIs support
  2015-03-20 16:44   ` Julien Grall
@ 2015-03-30 14:32     ` Vijay Kilari
  2015-03-30 15:29       ` Julien Grall
  0 siblings, 1 reply; 109+ messages in thread
From: Vijay Kilari @ 2015-03-30 14:32 UTC (permalink / raw)
  To: Julien Grall
  Cc: Ian Campbell, Stefano Stabellini, Prasun Kapoor, Vijaya Kumar K,
	Tim Deegan, xen-devel, Stefano Stabellini, manish.jaggi

Hi Julien,

On Fri, Mar 20, 2015 at 10:14 PM, Julien Grall <julien.grall@linaro.org> wrote:
> Hello Vijay,
>
> On 19/03/2015 14:37, vijay.kilari@gmail.com wrote:
>>
>> diff --git a/xen/include/asm-arm/irq.h b/xen/include/asm-arm/irq.h
>> index 435dfcd..f091739 100644
>> --- a/xen/include/asm-arm/irq.h
>> +++ b/xen/include/asm-arm/irq.h
>> @@ -17,6 +17,8 @@ struct arch_pirq
>>   struct arch_irq_desc {
>>       int eoi_cpu;
>>       unsigned int type;
>> +    unsigned int virq;
>> +    struct its_device *dev;
>>   };
>
>
> It seems you again miss my comment... As said on v1 this is not the
> solution. You add data for any IRQ (around 16K in Xen) just for handling
> LPIs.
>
> I provided a patch to handle virq != irq [1] and we should use it in order
> to diverge handling between LPIs and SPIs.
>
> If you are not happy with it, please see why.
>

   Stefano suggested using arch_irq_desc to hold the virq and its_device structure.
Another question: why is another structure, irq_guest, created? Can't we reuse
arch_irq_desc?

Is this patch merged?
> Regards,
>
> [1] https://patches.linaro.org/43012/
>
> --
> Julien Grall

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [RFC PATCH v2 13/22] xen/arm: its: Add virtual ITS command support
  2015-03-24 11:48   ` Julien Grall
@ 2015-03-30 15:02     ` Vijay Kilari
  2015-03-30 15:47       ` Julien Grall
  0 siblings, 1 reply; 109+ messages in thread
From: Vijay Kilari @ 2015-03-30 15:02 UTC (permalink / raw)
  To: Julien Grall
  Cc: Ian Campbell, Stefano Stabellini, Prasun Kapoor, Vijaya Kumar K,
	Tim Deegan, xen-devel, Stefano Stabellini, manish.jaggi

Hi Julien,

On Tue, Mar 24, 2015 at 5:18 PM, Julien Grall <julien.grall@linaro.org> wrote:
> Hello Vijay,
>
> More questions/remarks about command processing.
>
> On 19/03/2015 14:38, vijay.kilari@gmail.com wrote:
>>
>> +int vgic_its_process_cmd(struct vcpu *v, struct vgic_its *vits)
>> +{
>> +    struct its_cmd_block virt_cmd;
>> +
>> +    /* XXX: Currently we are processing one cmd at a time */
>> +    ASSERT(spin_is_locked(&vits->lock));
>> +
>> +    do {
>> +        if ( vgic_its_read_virt_cmd(v, vits, &virt_cmd) )
>> +            goto err;
>> +        if ( vgic_its_parse_its_command(v, vits, &virt_cmd) )
>> +            goto err;
>> +    } while ( vits->cmd_write != vits->cmd_write_save );
>> +
>> +    vits->cmd_write_save = vits->cmd_write;
>> +    DPRINTK("vITS: write_save 0x%lx write 0x%lx \n",
>> +            vits->cmd_write_save,
>> +            vits->cmd_write);
>> +    /* XXX: Currently we are processing one cmd at a time */
>> +    vgic_its_update_read_ptr(v, vits);
>
>
> From the spec the GITS_CREADR should be updated at every command processing.
> That would make cmd_write_save pointless.

See notes under section 4.9.9 Adding New Commands to the Queue
Multiple commands can be written to a queue at once.

>
> Also, you are taking the VITS lock for the whole process. This process can
> be very long. How will it affect the other vCPUs of the domain?
>

Yes, the lock is taken on the first command trap and held until all commands
are processed.
In any case ITS commands are processed synchronously, so any VCPU that
sends ITS commands is blocked.

Also ITS commands are sent while setting up device/irq and while
releasing device/irq.
So there should not be any overhead when device is under use.

> Finally, in environment with multiple guests using ITS, the ITS command send
> to the physical ITS may be interleaved (i.e DOM1 cmd, DOM2 cmd, DOM1 cmd
> ...). Is there any possible side-effect?

Each command is independent.  Generally SYNC/INV is followed after some
commands. But it should not be a problem if they are interleaved.

Regards,
Vijay

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [RFC PATCH v2 12/22] xen/arm: its: Update irq descriptor for LPIs support
  2015-03-30 14:32     ` Vijay Kilari
@ 2015-03-30 15:29       ` Julien Grall
  0 siblings, 0 replies; 109+ messages in thread
From: Julien Grall @ 2015-03-30 15:29 UTC (permalink / raw)
  To: Vijay Kilari
  Cc: Ian Campbell, Stefano Stabellini, Prasun Kapoor, Vijaya Kumar K,
	Tim Deegan, xen-devel, Stefano Stabellini, manish.jaggi

On 30/03/15 15:32, Vijay Kilari wrote:
>    Stefano suggested to use arch_irq_desc to hold virq and its_device structure.
> Another question is why another structure irq_guest is created?. Can't we reuse
> arch_irq_desc?

Here is the answer I gave to Stefano when I introduced this new structure:

"I though about it. If we add another field in arch_irq_desc, we will
likely use more memory than xmalloc. This is because most of the
platform doesn't use 1024 interrupts but about 256 interrupts.

As the new field will be a pointer (on ARM64, 8 bytes), that would make
Xen use statically about 8K more.

We could allocate irq_desc dynamically during Xen boot."

This assumption was with only 1 pointer added. As you add 2 pointers, it
will add 16K, which means an increase of 15% in the current size of the
final Xen binary.
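
The arithmetic can be sketched as below, assuming 1024 statically
allocated descriptors as in the numbers above. The struct and function
names are made up for the example. The exact per-descriptor growth depends
on the ABI: on a typical LP64 target the unsigned int plus the pointer come
to 16 bytes once padding is counted.

```c
#include <stddef.h>

struct its_device;   /* opaque for this example */

/* arch_irq_desc before the patch. */
struct arch_irq_desc_old {
    int eoi_cpu;
    unsigned int type;
};

/* arch_irq_desc with the two fields the patch adds. */
struct arch_irq_desc_new {
    int eoi_cpu;
    unsigned int type;
    unsigned int virq;
    struct its_device *dev;
};

/* Extra bytes each descriptor costs, padding included. */
static size_t per_desc_growth(void)
{
    return sizeof(struct arch_irq_desc_new) - sizeof(struct arch_irq_desc_old);
}

/* Extra static memory for nr_irqs descriptors. */
static size_t static_growth(size_t nr_irqs, size_t bytes_per_desc)
{
    return nr_irqs * bytes_per_desc;
}
```

With 8 bytes per descriptor this gives 8K for 1024 descriptors, and 16K
with 16 bytes.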

> Is this patch merged?.

Not yet. I sent a v4 2 weeks ago and I expect this patch to be merged as
soon as Ian has time to review it.

Regards,

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [RFC PATCH v2 13/22] xen/arm: its: Add virtual ITS command support
  2015-03-30 15:02     ` Vijay Kilari
@ 2015-03-30 15:47       ` Julien Grall
  2015-04-01 11:46         ` Ian Campbell
  0 siblings, 1 reply; 109+ messages in thread
From: Julien Grall @ 2015-03-30 15:47 UTC (permalink / raw)
  To: Vijay Kilari
  Cc: Ian Campbell, Stefano Stabellini, Prasun Kapoor, Vijaya Kumar K,
	Tim Deegan, xen-devel, Stefano Stabellini, manish.jaggi

On 30/03/15 16:02, Vijay Kilari wrote:
> Hi Julien,

Hello Vijay,

> On Tue, Mar 24, 2015 at 5:18 PM, Julien Grall <julien.grall@linaro.org> wrote:
>> On 19/03/2015 14:38, vijay.kilari@gmail.com wrote:
>>>
>>> +int vgic_its_process_cmd(struct vcpu *v, struct vgic_its *vits)
>>> +{
>>> +    struct its_cmd_block virt_cmd;
>>> +
>>> +    /* XXX: Currently we are processing one cmd at a time */
>>> +    ASSERT(spin_is_locked(&vits->lock));
>>> +
>>> +    do {
>>> +        if ( vgic_its_read_virt_cmd(v, vits, &virt_cmd) )
>>> +            goto err;
>>> +        if ( vgic_its_parse_its_command(v, vits, &virt_cmd) )
>>> +            goto err;
>>> +    } while ( vits->cmd_write != vits->cmd_write_save );
>>> +
>>> +    vits->cmd_write_save = vits->cmd_write;
>>> +    DPRINTK("vITS: write_save 0x%lx write 0x%lx \n",
>>> +            vits->cmd_write_save,
>>> +            vits->cmd_write);
>>> +    /* XXX: Currently we are processing one cmd at a time */
>>> +    vgic_its_update_read_ptr(v, vits);
>>
>>
>> From the spec the GITS_CREADR should be updated at every command processing.
>> That would make cmd_write_save pointless.
> 
> See notes under section 4.9.9 Adding New Commands to the Queue
> Multiple commands can be written to a queue at once.

You didn't understand my point.

The steps to process a command are:

1)   read command
2)   handle command
3)   increment CREADR
4)   loop to 1 if another command to process

Currently, you only do step 3 once all commands have been processed.
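
In code, the four steps would look roughly like the sketch below. The
structure, the handler type and count_cmd() are hypothetical and only serve
the example; the one thing taken from the architecture is that an ITS
command is 32 bytes long.

```c
#include <stdint.h>

#define ITS_CMD_SIZE 32U   /* an ITS command is 32 bytes */

/* Hypothetical view of the virtual command queue state. */
struct vits_queue {
    uint64_t creadr;    /* offset of the next command to process */
    uint64_t cwriter;   /* offset last written by the guest */
    uint64_t size;      /* queue size in bytes, multiple of ITS_CMD_SIZE */
};

typedef int (*cmd_handler_t)(uint64_t offset);

static unsigned int handled;   /* example-only bookkeeping */

/* Stand-in for "read and handle the command at this offset". */
static int count_cmd(uint64_t offset)
{
    (void)offset;
    handled++;
    return 0;
}

static int vits_process_queue(struct vits_queue *q, cmd_handler_t handle)
{
    while ( q->creadr != q->cwriter )               /* step 4 */
    {
        int ret = handle(q->creadr);                /* steps 1 and 2 */

        if ( ret )
            return ret;

        /* Step 3: the read pointer moves after every command. */
        q->creadr = (q->creadr + ITS_CMD_SIZE) % q->size;
    }

    return 0;
}
```

With the read pointer advanced per command, a separate cmd_write_save
field is no longer needed to track progress.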

>>
>> Also, you are taking the VITS lock for the whole process. This process can
>> be very long. How will it affect the other vCPUs of the domain?
>>
> 
> Yes, lock is taken on first command trap and holds until all commands
> are processed.
> In any case ITS commands are processed in synchronously. So any VCPU that
> send ITS commands is blocked.

This is wrong. The command processing is an asynchronous process and can
be long.

A VCPU may want to do other things (like handling interrupt) while the
ITS is processing.

With your implementation you rule out this possibility.

> Also ITS commands are sent while setting up device/irq and while
> releasing device/irq.
> So there should not be any overhead when device is under use.

Ok.

>> Finally, in environment with multiple guests using ITS, the ITS command send
>> to the physical ITS may be interleaved (i.e DOM1 cmd, DOM2 cmd, DOM1 cmd
>> ...). Is there any possible side-effect?
> 
> Each command is independent.  Generally SYNC/INV is followed after some
> commands. But it should not be a problem if they are interleaved.

What happens if the guest decides not to send the SYNC/INV? What is the
state of the ITS in this case? Would it be possible to receive a wrong LPI?

Regards,

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [RFC PATCH v2 06/22] xen/arm: its: Port ITS driver to xen
  2015-03-19 14:37 ` [RFC PATCH v2 06/22] xen/arm: its: Port ITS driver to xen vijay.kilari
  2015-03-20 15:06   ` Julien Grall
@ 2015-04-01 11:34   ` Ian Campbell
  2015-04-02  8:25     ` Vijay Kilari
  1 sibling, 1 reply; 109+ messages in thread
From: Ian Campbell @ 2015-04-01 11:34 UTC (permalink / raw)
  To: vijay.kilari
  Cc: stefano.stabellini, Prasun.Kapoor, vijaya.kumar, julien.grall,
	tim, xen-devel, stefano.stabellini, manish.jaggi

On Thu, 2015-03-19 at 20:07 +0530, vijay.kilari@gmail.com wrote:
> From: Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com>
> 
> This patch just makes ITS driver taken from linux
> compiles in xen environment.

What is your intention wrt future updates to this driver?

Are you intending to keep things in sync and import things from the
Linux side (similar to the smmu drivers) or are you taking the Linux
code as a starting point and intending that it then be maintained
independently as a Xen driver from then on?

Ian.

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [RFC PATCH v2 09/22] xen/arm: its: Add helper functions to decode ITS Command
  2015-03-19 14:37 ` [RFC PATCH v2 09/22] xen/arm: its: Add helper functions to decode ITS Command vijay.kilari
@ 2015-04-01 11:40   ` Ian Campbell
  2015-05-11 14:14     ` Vijay Kilari
  0 siblings, 1 reply; 109+ messages in thread
From: Ian Campbell @ 2015-04-01 11:40 UTC (permalink / raw)
  To: vijay.kilari
  Cc: stefano.stabellini, Prasun.Kapoor, vijaya.kumar, julien.grall,
	tim, xen-devel, stefano.stabellini, manish.jaggi

On Thu, 2015-03-19 at 20:07 +0530, vijay.kilari@gmail.com wrote:
> From: Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com>
> 
> Add helper functions to decode ITS command
> This will be useful for Virtual ITS driver

It depends slightly on the answer to the question I asked on patch #6,
but in general in Xen we have preferred to define a structure/union
overlaying the processor's view of such things and to use access to
those fields, see e.g. the hsr decode or the pte stuff.
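
For example, something roughly like the following. This is a sketch only:
the type and field names are invented for illustration and are not taken
from the patch, and only the first 64-bit word is broken out (command
number in bits [7:0], DeviceID in bits [63:32], matching what the Linux
driver's encode helpers write). Bit-field layout depends on the compiler
and ABI; this assumes GCC or clang on a little-endian target.

```c
#include <stdint.h>

typedef union {
    uint64_t raw[4];            /* a command is four 64-bit words */
    struct {
        uint64_t cmd:8;         /* DW0[7:0]   command number */
        uint64_t res0:24;       /* DW0[31:8]  not decoded here */
        uint64_t devid:32;      /* DW0[63:32] DeviceID, where applicable */
    } hdr;
} its_cmd_t;

static inline uint8_t its_cmd_number(const its_cmd_t *c)
{
    return c->hdr.cmd;
}

static inline uint32_t its_cmd_devid(const its_cmd_t *c)
{
    return c->hdr.devid;
}
```

The decode then becomes plain field access instead of a set of
mask-and-shift helper functions.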

Ian.

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [RFC PATCH v2 13/22] xen/arm: its: Add virtual ITS command support
  2015-03-30 15:47       ` Julien Grall
@ 2015-04-01 11:46         ` Ian Campbell
  2015-04-01 12:02           ` Julien Grall
  0 siblings, 1 reply; 109+ messages in thread
From: Ian Campbell @ 2015-04-01 11:46 UTC (permalink / raw)
  To: Julien Grall
  Cc: Vijay Kilari, Stefano Stabellini, Prasun Kapoor, Vijaya Kumar K,
	Tim Deegan, xen-devel, Stefano Stabellini, manish.jaggi

On Mon, 2015-03-30 at 16:47 +0100, Julien Grall wrote:
> > In any case ITS commands are processed in synchronously. So any VCPU that
> > send ITS commands is blocked.

What exactly is synchronous here? Is it just the "translate vits into
requests queued with the physical its driver" phase or does it also
include waiting for the physical its' response and translating that back
into a v response?

If it involves waiting for the h/w then I think we probably don't want
to be blocking things for that length of time.

If it just involves the translation and queuing into the physical its
driver then it might be tolerable, but I'd like to see an argument as to
why that is the case.

Ian.

> 
> This is wrong. The command processing is an asynchronous process and can
> be long.
> 
> A VCPU may want to do other things (like handling interrupt) while the
> ITS is processing.
> 
> With your implementation you rule out this possibility.
> 
> > Also ITS commands are sent while setting up device/irq and while
> > releasing device/irq.
> > So there should not be any overhead when device is under use.
> 
> Ok.
> 
> >> Finally, in environment with multiple guests using ITS, the ITS command send
> >> to the physical ITS may be interleaved (i.e DOM1 cmd, DOM2 cmd, DOM1 cmd
> >> ...). Is there any possible side-effect?
> > 
> > Each command is independent.  Generally SYNC/INV is followed after some
> > commands. But it should not be a problem if they are interleaved.
> 
> What happen if the guest decide to not send the SYNC/INV? What is the
> state of the ITS in this case? Would it be possible to receive a wrong LPIs?
> 
> Regards,
> 

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [RFC PATCH v2 13/22] xen/arm: its: Add virtual ITS command support
  2015-04-01 11:46         ` Ian Campbell
@ 2015-04-01 12:02           ` Julien Grall
  2015-04-02  9:13             ` Ian Campbell
  0 siblings, 1 reply; 109+ messages in thread
From: Julien Grall @ 2015-04-01 12:02 UTC (permalink / raw)
  To: Ian Campbell, Julien Grall
  Cc: Vijay Kilari, Stefano Stabellini, Prasun Kapoor, Vijaya Kumar K,
	Tim Deegan, xen-devel, Stefano Stabellini, manish.jaggi

On 01/04/15 12:46, Ian Campbell wrote:
> On Mon, 2015-03-30 at 16:47 +0100, Julien Grall wrote:
>>> In any case ITS commands are processed in synchronously. So any VCPU that
>>> send ITS commands is blocked.
> 
> What exactly is synchronous here? Is it just the "translate vits into
> requests queued with the physical its driver" phase or does it also
> include waiting for the physical its' response and translating that back
> into a v response?

From the spec, the processing of commands is asynchronous. The vCPU has
to poll a register in order to know if the ITS has finished executing
the command.

A vCPU may decide to execute other things while the ITS is processing
commands.

In the implementation suggested by Vijay, both the vCPU and the CPU are
blocked while the ITS commands are being processed.

Furthermore, if another vCPU is trying to access the vITS it will be
blocked too (and therefore the CPU). With this solution we may take down
Xen.

> If it involves waiting for the h/w then I think we probably don't want
> to be blocking things for that length of time.

It involves waiting for the h/w.

> If it just involves the translation and queuing into the physical its
> driver then it might be tolerable, but I'd like to see an argument as to
> why that is the case.

The translation/queuing can be very long if the queue is big (the
maximum size of the queue is 2^32). It would at least require some
preemption in Xen in order to avoid blocking the CPU/vCPU for a long time.

Regards,

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [RFC PATCH v2 06/22] xen/arm: its: Port ITS driver to xen
  2015-04-01 11:34   ` Ian Campbell
@ 2015-04-02  8:25     ` Vijay Kilari
  2015-04-02  9:25       ` Ian Campbell
  2015-04-02 13:57       ` Julien Grall
  0 siblings, 2 replies; 109+ messages in thread
From: Vijay Kilari @ 2015-04-02  8:25 UTC (permalink / raw)
  To: Ian Campbell
  Cc: Stefano Stabellini, Prasun Kapoor, Vijaya Kumar K, Julien Grall,
	Tim Deegan, xen-devel, Stefano Stabellini, manish.jaggi

On Wed, Apr 1, 2015 at 5:04 PM, Ian Campbell <ian.campbell@citrix.com> wrote:
> On Thu, 2015-03-19 at 20:07 +0530, vijay.kilari@gmail.com wrote:
>> From: Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com>
>>
>> This patch just makes ITS driver taken from linux
>> compiles in xen environment.
>
> What is your intention wrt future updates to this driver?
>
> Are you intending to keep things in sync and import things from the
> Linux side (similar to the smmu drviers) or are you taking the Linux
> code as a starting point and intending that it then be maintained
> independently as a Xen driver from then on?

Yes, I intend to keep things in sync with the Linux driver.
I have kept most of the code the same as on the Linux side, except for
removing unused code.

Regards
Vijay

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [RFC PATCH v2 13/22] xen/arm: its: Add virtual ITS command support
  2015-04-01 12:02           ` Julien Grall
@ 2015-04-02  9:13             ` Ian Campbell
  2015-04-02 11:06               ` Julien Grall
  0 siblings, 1 reply; 109+ messages in thread
From: Ian Campbell @ 2015-04-02  9:13 UTC (permalink / raw)
  To: Julien Grall
  Cc: Vijay Kilari, Stefano Stabellini, Prasun Kapoor, Vijaya Kumar K,
	Julien Grall, Tim Deegan, xen-devel, Stefano Stabellini,
	manish.jaggi

On Wed, 2015-04-01 at 13:02 +0100, Julien Grall wrote:
> On 01/04/15 12:46, Ian Campbell wrote:
> > On Mon, 2015-03-30 at 16:47 +0100, Julien Grall wrote:
> >>> In any case ITS commands are processed in synchronously. So any VCPU that
> >>> send ITS commands is blocked.
> > 
> > What exactly is synchronous here? Is it just the "translate vits into
> > requests queued with the physical its driver" phase or does it also
> > include waiting for the physical its' response and translating that back
> > into a v response?
> 
> From the spec, the processing of command is asynchronous.

I was asking about the implementation of our emulation of it, not about
the hardware itself. I understood that the underlying h/w is
asynchronous.

>  The vCPU has
> to poll a register in order to know if the ITS has finished to execute
> the command.
> 
> A vCPU may decide to execute other things while the ITS is processing
> commands.
> 
> The implementation suggested by Vijay, both the vCPU and CPU is blocked
> while the ITS command are processing.

Can we just enqueue with the hardware and use the guest vcpu polling
loop to trigger us to check for completion? What would happen if a guest
never polled, I suppose we would have to catch it some other way?

> Futhermore, if another vCPU is trying to access the vITS it will be
> blocked too (and therefore the CPU). With this solution we may take down
> Xen.

Yes, we can't have that I'm afraid.

> The translation/queuing can be very long if the queue is big (the
> maximum size of the queue is 2^32). It would at least require some
> preemption in Xen in order to avoid blocking the CPU/vCPU for a long time.

Yes, limiting the number of requests in flight at any one time seems
like a good idea anyway; we can always dribble things in as other ones
complete.

Ian.

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [RFC PATCH v2 06/22] xen/arm: its: Port ITS driver to xen
  2015-04-02  8:25     ` Vijay Kilari
@ 2015-04-02  9:25       ` Ian Campbell
  2015-04-02 10:05         ` Vijay Kilari
  2015-04-02 13:57       ` Julien Grall
  1 sibling, 1 reply; 109+ messages in thread
From: Ian Campbell @ 2015-04-02  9:25 UTC (permalink / raw)
  To: Vijay Kilari
  Cc: Stefano Stabellini, Prasun Kapoor, Vijaya Kumar K, Julien Grall,
	Tim Deegan, xen-devel, Stefano Stabellini, manish.jaggi

On Thu, 2015-04-02 at 13:55 +0530, Vijay Kilari wrote:
> On Wed, Apr 1, 2015 at 5:04 PM, Ian Campbell <ian.campbell@citrix.com> wrote:
> > On Thu, 2015-03-19 at 20:07 +0530, vijay.kilari@gmail.com wrote:
> >> From: Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com>
> >>
> >> This patch just makes ITS driver taken from linux
> >> compiles in xen environment.
> >
> > What is your intention wrt future updates to this driver?
> >
> > Are you intending to keep things in sync and import things from the
> > Linux side (similar to the smmu drviers) or are you taking the Linux
> > code as a starting point and intending that it then be maintained
> > independently as a Xen driver from then on?
> 
> Yes, I intend to keep things in sync with Linux driver.
> I have kept most the code same as Linux side except removing unused code.

There is a tonne of changes going on if that is your goal, in particular
in this patch but also in some of the following refactoring patches.
When this series is over it seems like the driver would bear very little
resemblance to the Linux one.

If you want to go this route then to aid in future synchronisation from
Linux patches the goal should be to make the changes to the Linux code
as minimal as possible, by defining shim functions and typedefs etc at
the top of the file, e.g. as Julien has tried to do with the smmu
driver.
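
To make the shim idea concrete, here is a minimal sketch, not taken from
any posted patch. It assumes libc as a stand-in for the hypervisor's
allocator and logger, and the structure and function names are
hypothetical. The point is only that the mapping lives in one block at
the top of the file, so the imported body can stay in its Linux form:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* ---- Shim section, kept together at the top of the imported file ---- */
/* Assumption: libc stands in for the hypervisor allocator and logger. */
#define GFP_KERNEL            0
#define kzalloc(size, flags)  calloc(1, (size))      /* flags ignored */
#define kfree(ptr)            free(ptr)
#define pr_err(...)           fprintf(stderr, "ITS: " __VA_ARGS__)
#define BUG_ON(cond)          assert(!(cond))

/* ---- Imported code below, left in its Linux form ---- */
struct its_node_example {            /* hypothetical, not the real its_node */
    unsigned long phys_base;
    unsigned int  nr_devices;
};

static struct its_node_example *its_node_example_alloc(unsigned long phys_base)
{
    struct its_node_example *its;

    BUG_ON(phys_base == 0);
    its = kzalloc(sizeof(*its), GFP_KERNEL);
    if (!its) {
        pr_err("failed to allocate ITS node\n");
        return NULL;
    }
    its->phys_base = phys_base;
    return its;
}
```

With this layout a later Linux patch that touches only the body below
the shim section should apply with little or no manual fix-up.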

Unlike the smmu stuff, which has a reasonably small and well-defined
interface to the kernel which can be easily shimmed between Xen and
Linux, it's not clear to me that this approach is workable for ITS. The
Xen and Linux interrupt handling systems are rather different and ITS
needs to be more tightly integrated with other bits of Xen, in
particular the GIC drivers.

However if you think maintaining something which can be synchronised
from Linux is viable and desirable then that's ok by me.

Ian.

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [RFC PATCH v2 06/22] xen/arm: its: Port ITS driver to xen
  2015-04-02  9:25       ` Ian Campbell
@ 2015-04-02 10:05         ` Vijay Kilari
  0 siblings, 0 replies; 109+ messages in thread
From: Vijay Kilari @ 2015-04-02 10:05 UTC (permalink / raw)
  To: Ian Campbell
  Cc: Stefano Stabellini, Prasun Kapoor, Vijaya Kumar K, Julien Grall,
	Tim Deegan, xen-devel, Stefano Stabellini, manish.jaggi

On Thu, Apr 2, 2015 at 2:55 PM, Ian Campbell <ian.campbell@citrix.com> wrote:
> On Thu, 2015-04-02 at 13:55 +0530, Vijay Kilari wrote:
>> On Wed, Apr 1, 2015 at 5:04 PM, Ian Campbell <ian.campbell@citrix.com> wrote:
>> > On Thu, 2015-03-19 at 20:07 +0530, vijay.kilari@gmail.com wrote:
>> >> From: Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com>
>> >>
>> >> This patch just makes ITS driver taken from linux
>> >> compiles in xen environment.
>> >
>> > What is your intention wrt future updates to this driver?
>> >
>> > Are you intending to keep things in sync and import things from the
>> > Linux side (similar to the smmu drviers) or are you taking the Linux
>> > code as a starting point and intending that it then be maintained
>> > independently as a Xen driver from then on?
>>
>> Yes, I intend to keep things in sync with Linux driver.
>> I have kept most the code same as Linux side except removing unused code.
>
> There is a tonne of changes going on if that is your goal, in particular
> in this patch but also in some of the following refactoring patches.
> When this series is over it seems like the driver would bear very little
> resemblance to the Linux one.
>
> If you want to go this route then to aid in future synchronisation from
> Linux patches the goal should be to make the changes to the Linux code
> as minimal as possible, by defining shim functions and typedefs etc at
> the top of the file, e.g. as Julien has tried to do with the smmu
> driver.
>
> Unlike the smmu stuff, which has a reasonably small and well-defined
> interface to the kernel which can be easily shimmed between Xen and
> Linux it's not clear to me that this approach is workable for ITS, the
> Xen and Linux interrupt handling systems are rather different and ITS
> needs to be more tightly integrated with other bits of Xen, in
> particular the GIC drivers.

Yes, there is a lot of code in the ITS driver that is unnecessary for Xen
and has been trimmed down. However, whatever functions are used are
retained, so that changes made to the Linux driver can be easily mapped
to the ITS driver.

IMO, we can create macros for debug prints, memory allocation & cache mgmt
APIs, which would reduce the changes we make.

>
> However if you think maintaining something which can be synchronised
> from Linux is viable and desirable then that's ok by me.

Regards
Vijay

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [RFC PATCH v2 13/22] xen/arm: its: Add virtual ITS command support
  2015-04-02  9:13             ` Ian Campbell
@ 2015-04-02 11:06               ` Julien Grall
  2015-04-02 11:18                 ` Ian Campbell
  0 siblings, 1 reply; 109+ messages in thread
From: Julien Grall @ 2015-04-02 11:06 UTC (permalink / raw)
  To: Ian Campbell, Julien Grall
  Cc: Vijay Kilari, Stefano Stabellini, Prasun Kapoor, Vijaya Kumar K,
	Julien Grall, Tim Deegan, xen-devel, Stefano Stabellini,
	manish.jaggi

Hi Ian,

On 02/04/2015 10:13, Ian Campbell wrote:
> On Wed, 2015-04-01 at 13:02 +0100, Julien Grall wrote:
>> On 01/04/15 12:46, Ian Campbell wrote:
>>> On Mon, 2015-03-30 at 16:47 +0100, Julien Grall wrote:
>>>>> In any case ITS commands are processed in synchronously. So any VCPU that
>>>>> send ITS commands is blocked.
>>>
>>> What exactly is synchronous here? Is it just the "translate vits into
>>> requests queued with the physical its driver" phase or does it also
>>> include waiting for the physical its' response and translating that back
>>> into a v response?
>>
>>  From the spec, the processing of command is asynchronous.
>
> I was asking about the implementation of our emulation of it, not about
> the hardware itself. I understood that the underlying h/w is
> asynchronous.

Sorry, I thought you were asking about the h/w. I think the end of my mail
answered the question on our implementation.

>
>>   The vCPU has
>> to poll a register in order to know if the ITS has finished to execute
>> the command.
>>
>> A vCPU may decide to execute other things while the ITS is processing
>> commands.
>>
>> The implementation suggested by Vijay, both the vCPU and CPU is blocked
>> while the ITS command are processing.
>
> Can we just enqueue with the hardware and use the guest vcpu polling
> loop to trigger us to check for completion?

Enqueuing may take a long time. I was thinking of suggesting a tasklet
for processing ITS commands.

But I don't know how much we can do in Xen with them.

 > What would happen if a guest
> never polled, I suppose we would have to catch it some other way?

The specification (see 4.9.9 PRD03-GENC-010745 24.0) defines 2 different
ways to be notified of completion:
	1) Polling: Reading GITS_CREADR in a loop
	2) Receiving an interrupt (see 4.9.10) by using the command MAPI.

A guest would be buggy if it doesn't implement one of these solutions, and
therefore may not run on real h/w.
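
As a rough illustration of the polling option, the sketch below spins
until the read pointer catches up with the write pointer, with a bound
on the number of polls. The register pair is simulated in memory and
its_sim_step() is a hypothetical stand-in for hardware progress; queue
wrap-around is ignored for brevity:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define ITS_CMD_SIZE 32u   /* bytes per ITS command */

/* Simulated register pair; on hardware these would be MMIO registers. */
struct its_regs_sim {
    uint64_t creadr;   /* advanced by the ITS as commands complete */
    uint64_t cwriter;  /* advanced by software when queueing commands */
};

/* Hypothetical stand-in for ITS progress: completes one command per call. */
static void its_sim_step(struct its_regs_sim *r)
{
    if (r->creadr != r->cwriter)
        r->creadr += ITS_CMD_SIZE;
}

/* Poll until all queued commands are consumed or the poll budget runs out. */
static bool its_poll_complete(struct its_regs_sim *r, unsigned int max_polls)
{
    assert(max_polls > 0);
    while (max_polls--) {
        if (r->creadr == r->cwriter)
            return true;
        its_sim_step(r);
    }
    return r->creadr == r->cwriter;
}
```

The poll budget matters for the emulation discussion: a guest may spin
for as long as it likes, but an unbounded loop of this kind inside the
hypervisor would hold the physical CPU.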

Regards,

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [RFC PATCH v2 13/22] xen/arm: its: Add virtual ITS command support
  2015-04-02 11:06               ` Julien Grall
@ 2015-04-02 11:18                 ` Ian Campbell
  2015-04-02 13:47                   ` Julien Grall
  0 siblings, 1 reply; 109+ messages in thread
From: Ian Campbell @ 2015-04-02 11:18 UTC (permalink / raw)
  To: Julien Grall
  Cc: Vijay Kilari, Stefano Stabellini, Prasun Kapoor, Vijaya Kumar K,
	Julien Grall, Tim Deegan, xen-devel, Stefano Stabellini,
	manish.jaggi

On Thu, 2015-04-02 at 12:06 +0100, Julien Grall wrote:

> > Can we just enqueue with the hardware and use the guest vcpu polling
> > loop to trigger us to check for completion?
> 
> Enqueue may be long. I was thinking about suggesting to use a tasklet 
> for processing ITS command.

We don't need to enqueue everything the guest gives us at once, we could
do only a subset and pick up the rest later as things complete at the
physical ITS.
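
A minimal sketch of that idea, assuming a fixed cap on commands in
flight and toy ring indices rather than real register offsets (all
names and sizes here are hypothetical):

```c
#include <assert.h>
#include <stddef.h>

#define VQ_SLOTS      16   /* hypothetical virtual queue size, in commands */
#define MAX_INFLIGHT   4   /* hypothetical cap on commands at the physical ITS */

struct vits_queue {
    size_t cwriter;    /* next slot the guest will write */
    size_t forwarded;  /* next slot to hand to the physical ITS */
    size_t creadr;     /* next slot awaiting completion (guest-visible) */
};

/* Number of slots from 'from' to 'to', walking forward around the ring. */
static size_t ring_distance(size_t from, size_t to)
{
    return (to + VQ_SLOTS - from) % VQ_SLOTS;
}

/* Forward as many pending commands as the cap allows; returns the count. */
static size_t vits_forward_some(struct vits_queue *q)
{
    size_t inflight = ring_distance(q->creadr, q->forwarded);
    size_t pending  = ring_distance(q->forwarded, q->cwriter);
    size_t room, n;

    assert(inflight <= MAX_INFLIGHT);
    room = MAX_INFLIGHT - inflight;
    n = pending < room ? pending : room;
    q->forwarded = (q->forwarded + n) % VQ_SLOTS;
    return n;
}

/* On completion of 'done' commands, publish progress and top up the queue. */
static size_t vits_complete(struct vits_queue *q, size_t done)
{
    assert(done <= ring_distance(q->creadr, q->forwarded));
    q->creadr = (q->creadr + done) % VQ_SLOTS;
    return vits_forward_some(q);
}
```

The extra 'forwarded' index is the additional tracking this scheme
needs: it separates what the guest has queued from what has actually
been handed to the hardware.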

> But I don't know how much we can do in Xen with them.
> 
>  > What would happen if a guest
> > never polled, I suppose we would have to catch it some other way?
> 
> The specification (see 4.9.9 PRD03-GENC-010745 24.0) defines 2 different 
> way to be notify for completion:
> 	1) Polling: Reading GITS_CREADR in a loop
> 	2) Receiving an interrupt (see 4.9.10) by using the command MAPI.

I would expect Xen itself to use the second option at the host level,
which would then drive the completion via the vGITS_CREADR or the
guest's virtualised interrupt.

That means the pCPU is free during the ITS processing, which is surely
what we want.

> A guest would be buggy if it doesn't implement one of this solution. And 
> therefore may not run on real h/w.

I was more concerned about it wedging the hypervisor somehow with a
large number of completed but not released operations.

Ian.

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [RFC PATCH v2 13/22] xen/arm: its: Add virtual ITS command support
  2015-04-02 11:18                 ` Ian Campbell
@ 2015-04-02 13:47                   ` Julien Grall
  2015-04-28  9:28                     ` Vijay Kilari
  0 siblings, 1 reply; 109+ messages in thread
From: Julien Grall @ 2015-04-02 13:47 UTC (permalink / raw)
  To: Ian Campbell, Julien Grall
  Cc: Vijay Kilari, Stefano Stabellini, Prasun Kapoor, Vijaya Kumar K,
	Julien Grall, Tim Deegan, xen-devel, Stefano Stabellini,
	manish.jaggi

Hi Ian,

On 02/04/2015 12:18, Ian Campbell wrote:
> On Thu, 2015-04-02 at 12:06 +0100, Julien Grall wrote:
>
>>> Can we just enqueue with the hardware and use the guest vcpu polling
>>> loop to trigger us to check for completion?
>>
>> Enqueue may be long. I was thinking about suggesting to use a tasklet
>> for processing ITS command.
>
> We don't need to enqueue everything the guest gives us at once, we could
> only do a subset and pickup the rest later as things complete at the
> physical ITS.

That would require more tracking. Anyway, I think that would work.

> I would expect Xen itself to use the second option at the host level,
> which would then drive the completion via the vGITS_CREADR or the
> guest's virtualised interrupt.
>
> That means the pCPU is free during the ITS processing, which is surely
> what we want.

Right, that would be the best solution for Xen.

Although, that would mean diverging from the Linux driver (see discussion
on patch #6). But I think it's inevitable: we can't keep the driver
close to Linux.

>> A guest would be buggy if it doesn't implement one of this solution. And
>> therefore may not run on real h/w.
>
> I was more concerned about it wedging the hypervisor somehow with a
> large number of completed but not released operations.

We don't have to worry about released operations. Once a command is
acknowledged (via the above completion notification), it can be discarded
in the ITS driver.

Regards,

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [RFC PATCH v2 06/22] xen/arm: its: Port ITS driver to xen
  2015-04-02  8:25     ` Vijay Kilari
  2015-04-02  9:25       ` Ian Campbell
@ 2015-04-02 13:57       ` Julien Grall
  1 sibling, 0 replies; 109+ messages in thread
From: Julien Grall @ 2015-04-02 13:57 UTC (permalink / raw)
  To: Vijay Kilari, Ian Campbell
  Cc: Stefano Stabellini, Prasun Kapoor, Vijaya Kumar K, Julien Grall,
	Tim Deegan, xen-devel, Stefano Stabellini, manish.jaggi

Hi Vijay,

On 02/04/2015 09:25, Vijay Kilari wrote:
> On Wed, Apr 1, 2015 at 5:04 PM, Ian Campbell <ian.campbell@citrix.com> wrote:
>> On Thu, 2015-03-19 at 20:07 +0530, vijay.kilari@gmail.com wrote:
>>> From: Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com>
>>>
>>> This patch just makes ITS driver taken from linux
>>> compiles in xen environment.
>>
>> What is your intention wrt future updates to this driver?
>>
>> Are you intending to keep things in sync and import things from the
>> Linux side (similar to the smmu drviers) or are you taking the Linux
>> code as a starting point and intending that it then be maintained
>> independently as a Xen driver from then on?
>
> Yes, I intend to keep things in sync with Linux driver.
> I have kept most the code same as Linux side except removing unused code.


The result of this series shows that we diverge a lot from the original
driver. We have lots of Xen-specific code added and some interfaces have
changed for our purpose.

For instance, removing unused code is not something we should do in a
synced driver because it makes patches harder to backport (the context of
the diff is unlikely to be the same).

Furthermore, we may also want to change the way that completion is
notified (see discussion on patch #13).

While it was a good thing to keep the SMMU driver in sync with Linux (not
much diff required), I think this would be a mistake for the ITS.

Regards,

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [RFC PATCH v2 13/22] xen/arm: its: Add virtual ITS command support
  2015-04-02 13:47                   ` Julien Grall
@ 2015-04-28  9:28                     ` Vijay Kilari
  2015-04-28  9:56                       ` Stefano Stabellini
  0 siblings, 1 reply; 109+ messages in thread
From: Vijay Kilari @ 2015-04-28  9:28 UTC (permalink / raw)
  To: Julien Grall
  Cc: Ian Campbell, Stefano Stabellini, Prasun Kapoor, Vijaya Kumar K,
	Julien Grall, Tim Deegan, xen-devel, Julien Grall,
	Stefano Stabellini, manish.jaggi

On Thu, Apr 2, 2015 at 7:17 PM, Julien Grall <julien.grall.oss@gmail.com> wrote:
> Hi Ian,
>
> On 02/04/2015 12:18, Ian Campbell wrote:
>>
>> On Thu, 2015-04-02 at 12:06 +0100, Julien Grall wrote:
>>
>>>> Can we just enqueue with the hardware and use the guest vcpu polling
>>>> loop to trigger us to check for completion?
>>>
>>>
>>> Enqueue may be long. I was thinking about suggesting to use a tasklet
>>> for processing ITS command.
>>
>>
>> We don't need to enqueue everything the guest gives us at once, we could
>> only do a subset and pickup the rest later as things complete at the
>> physical ITS.
>
>
> That would require more tracking. Anyway, I think that would work.

Sorry for the late reply. I could not get time to work on this.

Approach 1: (Using completion interrupt)
----------------
1) Create a dummy device for each virtual ITS when the virtual ITS is
   created for a domain,
   OR allocate one interrupt number for each virtual ITS from a
   single dummy device

2) Trap on CWRITER
3) Take vits lock
4)     Read a Virtual Command from Virtual Command Queue
5)     Translate Virtual command to Physical command
6) Release vits lock
7) Take physical ITS lock
8)     Post physical commands + Append INT command
9) Release physical ITS lock
10) Return from CWRITER trap and let the VCPU poll in EL1 (kernel) for
    the CREADR update, which indicates completion of the command.
11) On receiving the interrupt, update CREADR of this virtual ITS

Pros:
    - VCPU polls in EL1.

Cons:
    - Complexity in creating & managing dummy device.

Approach 2:
----------------
I thought of the approach below, wherein we reduce the locking time and
need no interrupt.

1) Trap on CWRITER
2) Take vits lock
3)     Read _all_ or 2 Virtual Commands from Virtual Command Queue
4)     Translate _all_ or 2 Virtual commands to Physical commands
5)  Release vits lock
6) Take physical ITS lock
7)      Post physical commands
8) Release physical ITS lock
9) Poll for completion of command and return from CWRITER trap.

Pros:
    - Simple approach
Cons:
    - VCPU polls in EL2
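
For comparison, the control flow of the interrupt-driven approach can be
sketched roughly as follows. This is a toy model, not the proposed
implementation: the structures and function names are hypothetical,
locking is reduced to comments, and only one command is handled per
trap. What it shows is that the trap handler returns right after
posting, and the guest-visible read pointer moves only in the
completion handler:

```c
#include <assert.h>
#include <stdint.h>

#define CMD_SIZE    32u   /* bytes per ITS command */
#define PQ_SLOTS    8u    /* toy physical queue depth (hypothetical) */

enum pcmd_kind { PCMD_TRANSLATED, PCMD_INT };

struct phys_its {
    enum pcmd_kind queue[PQ_SLOTS];
    unsigned int   count;
};

struct vits {
    uint64_t vcwriter;        /* write pointer as set by the guest */
    uint64_t vcreadr;         /* read pointer as seen by the guest */
    uint64_t pending_creadr;  /* value to publish once INT fires */
};

/* CWRITER trap: translate one command, post it with INT appended, return. */
static void vits_cwriter_trap(struct vits *v, struct phys_its *p,
                              uint64_t new_cwriter)
{
    assert(p->count + 2 <= PQ_SLOTS);
    v->vcwriter = new_cwriter;
    /* The vits lock would be held while reading and translating. */
    p->queue[p->count++] = PCMD_TRANSLATED;
    /* The physical ITS lock would be held while posting. */
    p->queue[p->count++] = PCMD_INT;
    v->pending_creadr = v->vcreadr + CMD_SIZE;
    /* No polling in the hypervisor: the vCPU resumes and polls vCREADR. */
}

/* Completion interrupt: make the progress visible to the guest. */
static void vits_completion_irq(struct vits *v)
{
    v->vcreadr = v->pending_creadr;
}
```

In the polling variant the last comment in the trap handler would
instead be a wait loop in the hypervisor, which is where the two
approaches differ in cost.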

>
>> I would expect Xen itself to use the second option at the host level,
>> which would then drive the completion via the vGITS_CREADR or the
>> guest's virtualised interrupt.
>>
>> That means the pCPU is free during the ITS processing, which is surely
>> what we want.
>
>
> Right, that would be the best solution for Xen.
>
> Although, the would mean diverging from Linux driver (see discussion on
> patch #6). But I think it's inevitable we can't have the same driver close
> to Linux.
>
>>> A guest would be buggy if it doesn't implement one of this solution. And
>>> therefore may not run on real h/w.
>>
>>
>> I was more concerned about it wedging the hypervisor somehow with a
>> large number of completed but not released operations.
>
>
> We don't have to worry about released operations. Once it acknowledge (via
> the above completion notification) the command can be discard in the ITS
> driver.
>
> Regards,
>
> --
> Julien Grall

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [RFC PATCH v2 13/22] xen/arm: its: Add virtual ITS command support
  2015-04-28  9:28                     ` Vijay Kilari
@ 2015-04-28  9:56                       ` Stefano Stabellini
  2015-04-28 10:35                         ` Julien Grall
  0 siblings, 1 reply; 109+ messages in thread
From: Stefano Stabellini @ 2015-04-28  9:56 UTC (permalink / raw)
  To: Vijay Kilari
  Cc: Ian Campbell, Stefano Stabellini, Prasun Kapoor, Vijaya Kumar K,
	Julien Grall, Tim Deegan, xen-devel, Julien Grall,
	Stefano Stabellini, manish.jaggi, Julien Grall

On Tue, 28 Apr 2015, Vijay Kilari wrote:
> On Thu, Apr 2, 2015 at 7:17 PM, Julien Grall <julien.grall.oss@gmail.com> wrote:
> > Hi Ian,
> >
> > On 02/04/2015 12:18, Ian Campbell wrote:
> >>
> >> On Thu, 2015-04-02 at 12:06 +0100, Julien Grall wrote:
> >>
> >>>> Can we just enqueue with the hardware and use the guest vcpu polling
> >>>> loop to trigger us to check for completion?
> >>>
> >>>
> >>> Enqueue may be long. I was thinking about suggesting to use a tasklet
> >>> for processing ITS command.
> >>
> >>
> >> We don't need to enqueue everything the guest gives us at once, we could
> >> only do a subset and pickup the rest later as things complete at the
> >> physical ITS.
> >
> >
> > That would require more tracking. Anyway, I think that would work.
> 
> Sorry for late reply. Could not get time to work on this.
> 
> Approach 1: (Using completion interrupt)
> ----------------
> 1) Create dummy device for each virtual ITS when virtual its is
> created for a domain
>     OR Allocate one interrupt number for each virtual ITS from the one
> single dummy device
> 
> 2) Trap on CWRITER
> 3) Take vits lock
> 4)     Read a Virtual Command from Virtual Command Queue
> 5)     Translate Virtual command to Physical command
> 5)  Release vits lock
> 6) Take physical ITS lock
> 7)      Post physical commands + Append INT command
> 8) Release physical ITS lock
> 9) Return from CWRITER trap and let VCPU poll in EL1 (kernel) for CREADER update
>     which indicates completion of command.
> 10) On receiving interrupt, Update CREADER of this virtual ITS
> 
> Pros:
>     - VCPU polls in EL1.
> 
> Cons:
>     - Complexity in creating & managing dummy device.
> 
> Approach 2:
> ----------------
> I thought of below approach where in we reduce locking time and no need
> of Interrupt.
> 
> 1) Trap on CWRITER
> 2) Take vits lock
> 3)     Read _all_ or 2 Virtual Commands from Virtual Command Queue
> 4)     Translate _all_ or 2 Virtual commands to Physical commands
> 5)  Release vits lock
> 6) Take physical ITS lock
> 7)      Post physical commands
> 8) Release physical ITS lock
> 9) Poll for completion of command and return from CWRITER trap.
> 
> Pros:
>     - Simple approach
> Cons:
>     - VCPU polls in EL2

I think we'll have to go with Approach 1.

The problem with Approach 2 is that we would need to watch out for
sequences of guest ITS commands that could cause Xen to poll for too
long and lock up.



> >
> >> I would expect Xen itself to use the second option at the host level,
> >> which would then drive the completion via the vGITS_CREADR or the
> >> guest's virtualised interrupt.
> >>
> >> That means the pCPU is free during the ITS processing, which is surely
> >> what we want.
> >
> >
> > Right, that would be the best solution for Xen.
> >
> > Although, the would mean diverging from Linux driver (see discussion on
> > patch #6). But I think it's inevitable we can't have the same driver close
> > to Linux.
> >
> >>> A guest would be buggy if it doesn't implement one of this solution. And
> >>> therefore may not run on real h/w.
> >>
> >>
> >> I was more concerned about it wedging the hypervisor somehow with a
> >> large number of completed but not released operations.
> >
> >
> > We don't have to worry about released operations. Once it acknowledge (via
> > the above completion notification) the command can be discard in the ITS
> > driver.
> >
> > Regards,
> >
> > --
> > Julien Grall
> 

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [RFC PATCH v2 13/22] xen/arm: its: Add virtual ITS command support
  2015-04-28  9:56                       ` Stefano Stabellini
@ 2015-04-28 10:35                         ` Julien Grall
  2015-04-28 11:36                           ` Vijay Kilari
  0 siblings, 1 reply; 109+ messages in thread
From: Julien Grall @ 2015-04-28 10:35 UTC (permalink / raw)
  To: Stefano Stabellini, Vijay Kilari
  Cc: Ian Campbell, Prasun Kapoor, Vijaya Kumar K, Julien Grall,
	Tim Deegan, xen-devel, Julien Grall, Stefano Stabellini,
	manish.jaggi

Hi,

On 28/04/15 10:56, Stefano Stabellini wrote:
> On Tue, 28 Apr 2015, Vijay Kilari wrote:
>> Approach 1: (Using completion interrupt)
>> ----------------
>> 1) Create dummy device for each virtual ITS when virtual its is
>> created for a domain
>>     OR Allocate one interrupt number for each virtual ITS from the one
>> single dummy device
>>
>> 2) Trap on CWRITER
>> 3) Take vits lock
>> 4)     Read a Virtual Command from Virtual Command Queue
>> 5)     Translate Virtual command to Physical command
>> 5)  Release vits lock
>> 6) Take physical ITS lock
>> 7)      Post physical commands + Append INT command
>> 8) Release physical ITS lock
>> 9) Return from CWRITER trap and let VCPU poll in EL1 (kernel) for CREADER update
>>     which indicates completion of command.
>> 10) On receiving interrupt, Update CREADER of this virtual ITS
>>
>> Pros:
>>     - VCPU polls in EL1.
>>
>> Cons:
>>     - Complexity in creating & managing dummy device.

If you properly manage the device with struct pci_dev or struct device
(which is, as discussed earlier, obviously required for security) you
should be able to avoid your so-called "dummy device". BTW, what do you
mean by "dummy device"?

>> Approach 2:
>> ----------------
>> I thought of below approach where in we reduce locking time and no need
>> of Interrupt.
>>
>> 1) Trap on CWRITER
>> 2) Take vits lock
>> 3)     Read _all_ or 2 Virtual Commands from Virtual Command Queue
>> 4)     Translate _all_ or 2 Virtual commands to Physical commands
>> 5)  Release vits lock
>> 6) Take physical ITS lock
>> 7)      Post physical commands
>> 8) Release physical ITS lock
>> 9) Poll for completion of command and return from CWRITER trap.
>>
>> Pros:
>>     - Simple approach
>> Cons:
>>     - VCPU polls in EL2
> 
> I think we'll have to go with Approach 1.
> 
> The problem with Approach 2 is that we would need to watch out for
> sequences of guest ITS commands that could cause Xen to poll for too
> long and lock up.

+1, approach 1 is the best.

Regards,

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [RFC PATCH v2 13/22] xen/arm: its: Add virtual ITS command support
  2015-04-28 10:35                         ` Julien Grall
@ 2015-04-28 11:36                           ` Vijay Kilari
  2015-04-28 16:15                             ` Julien Grall
  0 siblings, 1 reply; 109+ messages in thread
From: Vijay Kilari @ 2015-04-28 11:36 UTC (permalink / raw)
  To: Julien Grall
  Cc: Ian Campbell, Stefano Stabellini, Prasun Kapoor, Vijaya Kumar K,
	Julien Grall, Tim Deegan, xen-devel, Stefano Stabellini,
	manish.jaggi

On Tue, Apr 28, 2015 at 4:05 PM, Julien Grall <julien.grall@citrix.com> wrote:
> Hi,
>
> On 28/04/15 10:56, Stefano Stabellini wrote:
>> On Tue, 28 Apr 2015, Vijay Kilari wrote:
>>> Approach 1: (Using completion interrupt)
>>> ----------------
>>> 1) Create dummy device for each virtual ITS when virtual its is
>>> created for a domain
>>>     OR Allocate one interrupt number for each virtual ITS from the one
>>> single dummy device
>>>
>>> 2) Trap on CWRITER
>>> 3) Take vits lock
>>> 4)     Read a Virtual Command from Virtual Command Queue
>>> 5)     Translate Virtual command to Physical command
>>> 5)  Release vits lock
>>> 6) Take physical ITS lock
>>> 7)      Post physical commands + Append INT command
>>> 8) Release physical ITS lock
>>> 9) Return from CWRITER trap and let VCPU poll in EL1 (kernel) for CREADER update
>>>     which indicates completion of command.
>>> 10) On receiving interrupt, Update CREADER of this virtual ITS
>>>
>>> Pros:
>>>     - VCPU polls in EL1.
>>>
>>> Cons:
>>>     - Complexity in creating & managing dummy device.
>
> If you properly manage the device with struct pci_dev or struct device
> (which is, as talked earlier, obviously required for security) you
> should avoid your so-called "dummy device". BTW, what do you mean by
> "dummy device"?


(a) To implement the ITS command-completion interrupt, we need a unique
interrupt per vITS for each domain, to update the corresponding virtual
ITS CREADR.
(b) The INT command requires dev,ID, so there needs to be a device
associated with the ID.
(c) Since the command-completion interrupt does not come from a
valid device, we have to provide a dummy device,ID.
(d) I propose that the dummy device segment number is read from a
macro/helper function in the platform file.
For each domain we can add the bus number. For example, 0xff is the
segment number, defined as #define PLAT_DUMMY_SEG 0xff.
The device for dom0 would be PLAT_DUMMY_SEG:00:0.0
The device for domU would be PLAT_DUMMY_SEG:00:0.0 | domain_id

Let me know if there is a better way to generate a dummy/unused device ID.

So the creation of the dummy device and the setup for INT command
execution can be done in the physical ITS driver, with the its_device
structure managed in vgic_its.

Also, with this approach the vITS is not held by the VCPU until command
processing completes, so another VCPU of the same domain can add
another ITS command. In that case we have to keep track of the number of
ITS commands being processed per VCPU of the domain and increment the
vITS CREADR accordingly.
For this, we have to add one unique interrupt ID on the device per VCPU,
so that a unique interrupt is received from the dummy device for each
VCPU.

Regards
Vijay

>
>>> Approach 2:
>>> ----------------
>>> I thought of below approach where in we reduce locking time and no need
>>> of Interrupt.
>>>
>>> 1) Trap on CWRITER
>>> 2) Take vits lock
>>> 3)     Read _all_ or 2 Virtual Commands from Virtual Command Queue
>>> 4)     Translate _all_ or 2 Virtual commands to Physical commands
>>> 5)  Release vits lock
>>> 6) Take physical ITS lock
>>> 7)      Post physical commands
>>> 8) Release physical ITS lock
>>> 9) Poll for completion of command and return from CWRITER trap.
>>>
>>> Pros:
>>>     - Simple approach
>>> Cons:
>>>     - VCPU polls in EL2
>>
>> I think we'll have to go with Approach 1.
>>
>> The problem with Approach 2 is that we would need to watch out for
>> sequences of guest ITS commands that could cause Xen to poll for too
>> long and lock up.
>
> +1, the approach 1 is the best.
>
> Regards,
>
> --
> Julien Grall

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [RFC PATCH v2 13/22] xen/arm: its: Add virtual ITS command support
  2015-04-28 11:36                           ` Vijay Kilari
@ 2015-04-28 16:15                             ` Julien Grall
  2015-04-29  1:44                               ` Vijay Kilari
  0 siblings, 1 reply; 109+ messages in thread
From: Julien Grall @ 2015-04-28 16:15 UTC (permalink / raw)
  To: Vijay Kilari, Julien Grall
  Cc: Ian Campbell, Stefano Stabellini, Prasun Kapoor, Vijaya Kumar K,
	Julien Grall, Tim Deegan, xen-devel, Stefano Stabellini,
	manish.jaggi

Hi Vijay,

On 28/04/15 12:36, Vijay Kilari wrote:
> On Tue, Apr 28, 2015 at 4:05 PM, Julien Grall <julien.grall@citrix.com> wrote:
>> If you properly manage the device with struct pci_dev or struct device
>> (which is, as talked earlier, obviously required for security) you
>> should avoid your so-called "dummy device". BTW, what do you mean by
>> "dummy device"?
> 
> 
> (a) For implementing ITS command processing completion interrupt we need
> a unique interrupt for each domain per vITS to update corresponding virtual ITS
> CREADER
> (b) INT command requires dev,ID there needs to be a device
> associated with the ID
> (c) The command processing completion interrupt is not coming from a
> valid device, we have to provide a dummy device,ID
> (d) I propose that the dummy device segment number is read from a
> macro/helper function
> in the platform file.
> For each domain we can add the bus number so for eg: 0xff is the segment
> number which is #define PLAT_DUMMY_SEG 0xff.
> The device for dom0 would be PLAT_DUMMY_SEG:00:0.0
> The device for domU would be PLAT_DUMMY_SEG:00:0.0 | domain_id

There are multiple problems with this solution:
	- What prevents a platform from using this Device ID in the future?
	- What is the behavior of the ITS when the Device ID doesn't belong
to a real device?
	- The number of bits for the Device ID can be limited via
GITS_TYPER.Devbits, so it's not possible to use a hardware value
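
To illustrate the last point, here is a small sketch of a width check.
It assumes that Devbits sits in GITS_TYPER[17:13] and encodes the number
of implemented DeviceID bits minus one; the helper names are
hypothetical:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: Devbits is GITS_TYPER[17:13], holding (DeviceID bits - 1). */
#define GITS_TYPER_DEVBITS_SHIFT  13
#define GITS_TYPER_DEVBITS_MASK   0x1fu

/* Number of DeviceID bits the ITS reports as implemented (1..32). */
static unsigned int its_devid_bits(uint64_t typer)
{
    return (unsigned int)((typer >> GITS_TYPER_DEVBITS_SHIFT) &
                          GITS_TYPER_DEVBITS_MASK) + 1;
}

/* True if 'devid' is representable in the implemented DeviceID width. */
static bool its_devid_fits(uint64_t typer, uint32_t devid)
{
    unsigned int bits = its_devid_bits(typer);

    assert(bits >= 1 && bits <= 32);
    if (bits == 32)
        return true;                 /* avoid shifting by the full width */
    return devid < (UINT32_C(1) << bits);
}
```

For example, if a dummy ID were built by placing a 0xff segment above
bit 16 (one possible encoding, not something the proposal specifies),
it would not fit on an ITS that reports only 16 DeviceID bits.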

> Let me know if there is better way to generate dummy/unused device id?
> 
> So creation of dummy device and setup for INT command execution
> can be done in physical ITS driver with its_device structure managed
> in vgic_its
> 
> Also with this approach, vITS is not held by the VCPU till the completion
> of command processing, So another VCPU of the same domain can add
> another ITS command. If so we have to keep track number of ITS commands
> being processed per VCPU of the domain and increment vITS CREADER accordingly.
> For this, we have to add one unique interrupt ID of the device for per
> VCPU, So that
> unique interrupt is received from the dummy device for per VCPU.

The number of LPIs supported by the ITS could be very limited. We need
to use them with parsimony.

Even if you have a per-VCPU interrupt, it doesn't prevent the same vCPU
from writing further commands after a first batch and then waiting.

Regards,

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [RFC PATCH v2 13/22] xen/arm: its: Add virtual ITS command support
  2015-04-28 16:15                             ` Julien Grall
@ 2015-04-29  1:44                               ` Vijay Kilari
  2015-04-29 11:56                                 ` Julien Grall
  0 siblings, 1 reply; 109+ messages in thread
From: Vijay Kilari @ 2015-04-29  1:44 UTC (permalink / raw)
  To: Julien Grall
  Cc: Ian Campbell, Stefano Stabellini, Prasun Kapoor, Vijaya Kumar K,
	Julien Grall, Tim Deegan, xen-devel, Stefano Stabellini,
	manish.jaggi

On Tue, Apr 28, 2015 at 9:45 PM, Julien Grall <julien.grall@citrix.com> wrote:
> Hi Vijay,
>
> On 28/04/15 12:36, Vijay Kilari wrote:
>> On Tue, Apr 28, 2015 at 4:05 PM, Julien Grall <julien.grall@citrix.com> wrote:
>>> If you properly manage the device with struct pci_dev or struct device
>>> (which is, as talked earlier, obviously required for security) you
>>> should avoid your so-called "dummy device". BTW, what do you mean by
>>> "dummy device"?
>>
>>
>> (a) For implementing ITS command processing completion interrupt we need
>> a unique interrupt for each domain per vITS to update corresponding virtual ITS
>> CREADER
>> (b) INT command requires dev,ID there needs to be a device
>> associated with the ID
>> (c) The command processing completion interrupt is not coming from a
>> valid device, we have to provide a dummy device,ID
>> (d) I propose that the dummy device segment number is read from a
>> macro/helper function
>> in the platform file.
>> For each domain we can add the bus number so for eg: 0xff is the segment
>> number which is #define PLAT_DUMMY_SEG 0xff.
>> The device for dom0 would be PLAT_DUMMY_SEG:00:0.0
>> The device for domU would be PLAT_DUMMY_SEG:00:0.0 | domain_id
>
> There is multiple problem with this solution:
>         - What prevents a platform to use this Device ID in the future?

Nothing prevents it, but we can add a check in the ITS driver.

>         - What's is the behavior of the ITS when the Device ID doesn't belong
> to a real device?
The ITS behaves the same for any Device ID, provided it falls
within the Devbits range.

>         - The number of bits for the Device ID can be limited via
> GITS_TYPER.Devbits, so it's not possible to use an hardware value

For the ITS, any Device ID must fall within the Devbits range. Otherwise
the ITS cannot create/parse the ITT and generate the interrupt.

>
>> Let me know if there is better way to generate dummy/unused device id?
>>
>> So creation of dummy device and setup for INT command execution
>> can be done in physical ITS driver with its_device structure managed
>> in vgic_its
>>
>> Also with this approach, vITS is not held by the VCPU till the completion
>> of command processing, So another VCPU of the same domain can add
>> another ITS command. If so we have to keep track number of ITS commands
>> being processed per VCPU of the domain and increment vITS CREADER accordingly.
>> For this, we have to add one unique interrupt ID of the device for per
>> VCPU, So that
>> unique interrupt is received from the dummy device for per VCPU.
>
> The number of LPIs supported by the ITS could be very limited. We need
> to use them with parsimony.

Per device we can generate up to 2K MSI-X interrupts. With a dummy device
we can use them without restriction.

>
> Even if you have a per-VCPU  interrupt, it doesn't prevent a same vCPU
> writing other commands after a first batch then wait.

Yes, this case should be managed.

Given all this complexity, what is the problem with Approach 2?
We have time-bounded polling which loops for only a few ms, and the
ITS is not in the critical path anyway.
It is only used when configuring the interrupts of a device.

Regards
Vijay

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [RFC PATCH v2 13/22] xen/arm: its: Add virtual ITS command support
  2015-04-29  1:44                               ` Vijay Kilari
@ 2015-04-29 11:56                                 ` Julien Grall
  2015-04-29 12:12                                   ` Manish Jaggi
  2015-04-29 13:35                                   ` Julien Grall
  0 siblings, 2 replies; 109+ messages in thread
From: Julien Grall @ 2015-04-29 11:56 UTC (permalink / raw)
  To: Vijay Kilari, Julien Grall
  Cc: Ian Campbell, Stefano Stabellini, Prasun Kapoor, Vijaya Kumar K,
	Julien Grall, Tim Deegan, xen-devel, Stefano Stabellini,
	manish.jaggi

Hello,

On 29/04/15 02:44, Vijay Kilari wrote:
> On Tue, Apr 28, 2015 at 9:45 PM, Julien Grall <julien.grall@citrix.com> wrote:
>> On 28/04/15 12:36, Vijay Kilari wrote:
>>> On Tue, Apr 28, 2015 at 4:05 PM, Julien Grall <julien.grall@citrix.com> wrote:
>>>> If you properly manage the device with struct pci_dev or struct device
>>>> (which is, as talked earlier, obviously required for security) you
>>>> should avoid your so-called "dummy device". BTW, what do you mean by
>>>> "dummy device"?
>>>
>>>
>>> (a) For implementing ITS command processing completion interrupt we need
>>> a unique interrupt for each domain per vITS to update corresponding virtual ITS
>>> CREADER
>>> (b) INT command requires dev,ID there needs to be a device
>>> associated with the ID
>>> (c) The command processing completion interrupt is not coming from a
>>> valid device, we have to provide a dummy device,ID
>>> (d) I propose that the dummy device segment number is read from a
>>> macro/helper function
>>> in the platform file.
>>> For each domain we can add the bus number so for eg: 0xff is the segment
>>> number which is #define PLAT_DUMMY_SEG 0xff.
>>> The device for dom0 would be PLAT_DUMMY_SEG:00:0.0
>>> The device for domU would be PLAT_DUMMY_SEG:00:0.0 | domain_id
>>
>> There is multiple problem with this solution:
>>         - What prevents a platform to use this Device ID in the future?
> 
> Nothing prevents. But we can make a check in the ITS driver.

The number of Device ID bits is not fixed, so you can't hardcode the ID
(see GITS_TYPER.Devbits).

> 
>>         - What's is the behavior of the ITS when the Device ID doesn't belong
>> to a real device?
>             ITS behavior is same for any device ID provided it falls
> in Devbits range

Can you give a reference from the spec?

>>
>>> Let me know if there is better way to generate dummy/unused device id?
>>>
>>> So creation of dummy device and setup for INT command execution
>>> can be done in physical ITS driver with its_device structure managed
>>> in vgic_its
>>>
>>> Also with this approach, vITS is not held by the VCPU till the completion
>>> of command processing, So another VCPU of the same domain can add
>>> another ITS command. If so we have to keep track number of ITS commands
>>> being processed per VCPU of the domain and increment vITS CREADER accordingly.
>>> For this, we have to add one unique interrupt ID of the device for per
>>> VCPU, So that
>>> unique interrupt is received from the dummy device for per VCPU.
>>
>> The number of LPIs supported by the ITS could be very limited. We need
>> to use them with parsimony.
> 
> Per device we can generate upto 2K MSIx. With dummy device we can
> use without restriction.

The number of MSIs supported by the device is not the problem... The
problem is the number of LPIs supported by the ITS (see GITS_TYPER.IDbits).

> Managing all this complexity, What is the problem with Approach 2?

The problem is the polling in EL2, for several reasons:
  1) The VCPU is not preemptible when running in EL2, so the scheduler
can't schedule another VCPU on this physical CPU.
  2) The guest VCPU may want to execute other code while waiting for the
completion of the vITS commands, for instance because it chose to receive
an interrupt on completion. I discussed this at greater length in previous
mails.

> We have time bound polling which loops only for few ms

The few ms would turn into several seconds if the guest sends
lots of commands.

Furthermore, are you sure that a few ms is enough? Linux seems to wait up
to 1s for each command...

> and that too
> ITS is not in critical path.
> It is only used when configuring interrupts of the device?

You need to think about security... Even though the ITS should only be
used for configuring interrupts, a malicious guest could try to exploit
weaknesses in the emulation.

As neither of the 2 suggested approaches seems to fit our usage, we need
to find another approach.

Regards,

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [RFC PATCH v2 13/22] xen/arm: its: Add virtual ITS command support
  2015-04-29 11:56                                 ` Julien Grall
@ 2015-04-29 12:12                                   ` Manish Jaggi
  2015-04-29 12:21                                     ` Julien Grall
  2015-04-29 13:35                                   ` Julien Grall
  1 sibling, 1 reply; 109+ messages in thread
From: Manish Jaggi @ 2015-04-29 12:12 UTC (permalink / raw)
  To: Julien Grall, Vijay Kilari
  Cc: Ian Campbell, Stefano Stabellini, Prasun Kapoor, Vijaya Kumar K,
	Julien Grall, Tim Deegan, xen-devel, Stefano Stabellini,
	manish.jaggi



On Wednesday 29 April 2015 05:26 PM, Julien Grall wrote:
> Hello,
>
> On 29/04/15 02:44, Vijay Kilari wrote:
>> On Tue, Apr 28, 2015 at 9:45 PM, Julien Grall <julien.grall@citrix.com> wrote:
>>> On 28/04/15 12:36, Vijay Kilari wrote:
>>>> On Tue, Apr 28, 2015 at 4:05 PM, Julien Grall <julien.grall@citrix.com> wrote:
>>>>> If you properly manage the device with struct pci_dev or struct device
>>>>> (which is, as talked earlier, obviously required for security) you
>>>>> should avoid your so-called "dummy device". BTW, what do you mean by
>>>>> "dummy device"?
>>>>
>>>> (a) For implementing ITS command processing completion interrupt we need
>>>> a unique interrupt for each domain per vITS to update corresponding virtual ITS
>>>> CREADER
>>>> (b) INT command requires dev,ID there needs to be a device
>>>> associated with the ID
>>>> (c) The command processing completion interrupt is not coming from a
>>>> valid device, we have to provide a dummy device,ID
>>>> (d) I propose that the dummy device segment number is read from a
>>>> macro/helper function
>>>> in the platform file.
>>>> For each domain we can add the bus number so for eg: 0xff is the segment
>>>> number which is #define PLAT_DUMMY_SEG 0xff.
>>>> The device for dom0 would be PLAT_DUMMY_SEG:00:0.0
>>>> The device for domU would be PLAT_DUMMY_SEG:00:0.0 | domain_id
>>> There is multiple problem with this solution:
>>>          - What prevents a platform to use this Device ID in the future?
>> Nothing prevents. But we can make a check in the ITS driver.
> The number of Devbits is not fixed so you can't hardcode them (See
> GITS_TYPER.Devbits).
>
>>>          - What's is the behavior of the ITS when the Device ID doesn't belong
>>> to a real device?
>>              ITS behavior is same for any device ID provided it falls
>> in Devbits range
> Can you give a reference from the spec?
>
>>>> Let me know if there is better way to generate dummy/unused device id?
>>>>
>>>> So creation of dummy device and setup for INT command execution
>>>> can be done in physical ITS driver with its_device structure managed
>>>> in vgic_its
>>>>
>>>> Also with this approach, vITS is not held by the VCPU till the completion
>>>> of command processing, So another VCPU of the same domain can add
>>>> another ITS command. If so we have to keep track number of ITS commands
>>>> being processed per VCPU of the domain and increment vITS CREADER accordingly.
>>>> For this, we have to add one unique interrupt ID of the device for per
>>>> VCPU, So that
>>>> unique interrupt is received from the dummy device for per VCPU.
>>> The number of LPIs supported by the ITS could be very limited. We need
>>> to use them with parsimony.
>> Per device we can generate upto 2K MSIx. With dummy device we can
>> use without restriction.
> The number of MSI supported by the device is not the problem... The
> problem is number of LPIs supported by the ITS (see GITS.IDbits).
>
>> Managing all this complexity, What is the problem with Approach 2?
> The problem is the polling in EL2 for several reasons:
>    1) The VCPU is not preemptible when running in EL2. So the scheduler
> can't schedule another VCPU on this physical CPU.
>    2) The guest VCPU may want to execute other code while waiting the
> completion of vITS. For instance because he choose to use receive an
> interrupt for completion. I talked about it longer on previous mails.
>
>> We have time bound polling which loops only for few ms
> The few ms would be transformed to several seconds if the guest sends
> lots of commands.
>
> Furthermore, are you sure that few ms is enough? Linux seems to wait up
> to 1s for each command...
>
>> and that too
>> ITS is not in critical path.
>> It is only used when configuring interrupts of the device?
> You need to think about security... Even though the ITS should only be
> used for configuring interrupts, a malicious guest could try to exploit
> weakness in the emulation.
Can you describe the scenario?
>
> As the 2 suggested approach don't seem to fit our usage, we need to find
> another approach.
>
> Regards,
>

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [RFC PATCH v2 13/22] xen/arm: its: Add virtual ITS command support
  2015-04-29 12:12                                   ` Manish Jaggi
@ 2015-04-29 12:21                                     ` Julien Grall
  2015-04-29 12:33                                       ` Manish Jaggi
  0 siblings, 1 reply; 109+ messages in thread
From: Julien Grall @ 2015-04-29 12:21 UTC (permalink / raw)
  To: Manish Jaggi, Julien Grall, Vijay Kilari
  Cc: Ian Campbell, Stefano Stabellini, Prasun Kapoor, Vijaya Kumar K,
	Julien Grall, Tim Deegan, xen-devel, Stefano Stabellini,
	manish.jaggi

On 29/04/15 13:12, Manish Jaggi wrote:
>>> and that too
>>> ITS is not in critical path.
>>> It is only used when configuring interrupts of the device?
>> You need to think about security... Even though the ITS should only be
>> used for configuring interrupts, a malicious guest could try to exploit
>> weakness in the emulation.
> Can you describe the scenario ?

I have already described the possible security impact of the
polling solution several times... Please reread the previous mails.

Regards,

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [RFC PATCH v2 13/22] xen/arm: its: Add virtual ITS command support
  2015-04-29 12:21                                     ` Julien Grall
@ 2015-04-29 12:33                                       ` Manish Jaggi
  2015-04-29 13:01                                         ` Julien Grall
  0 siblings, 1 reply; 109+ messages in thread
From: Manish Jaggi @ 2015-04-29 12:33 UTC (permalink / raw)
  To: Julien Grall, Vijay Kilari
  Cc: Ian Campbell, Stefano Stabellini, Prasun Kapoor, Vijaya Kumar K,
	Julien Grall, Tim Deegan, xen-devel, Stefano Stabellini,
	manish.jaggi



On Wednesday 29 April 2015 05:51 PM, Julien Grall wrote:
> On 29/04/15 13:12, Manish Jaggi wrote:
>>>> and that too ITS is not in critical path. It is only used when 
>>>> configuring interrupts of the device? 
>>> You need to think about security... Even though the ITS should only 
>>> be used for configuring interrupts, a malicious guest could try to 
>>> exploit weakness in the emulation. 
>> Can you describe the scenario ? 
> I already wrote several times the possible security impacts of the 
> polling solution... Please read again the previous mails.
I see your comment: "The vITS emulates hardware for a specific domain. A
malicious guest could send request to a not own device".
This scenario cannot happen, as the guest sbdf is converted to a physical
sbdf based on the domain. So if the guest does not own a device, the
command would be treated as invalid.

Do you have any other security concerns?


> Regards,

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [RFC PATCH v2 13/22] xen/arm: its: Add virtual ITS command support
  2015-04-29 12:33                                       ` Manish Jaggi
@ 2015-04-29 13:01                                         ` Julien Grall
  2015-04-29 13:08                                           ` Manish Jaggi
  0 siblings, 1 reply; 109+ messages in thread
From: Julien Grall @ 2015-04-29 13:01 UTC (permalink / raw)
  To: Manish Jaggi, Julien Grall, Vijay Kilari
  Cc: Ian Campbell, Stefano Stabellini, Prasun Kapoor, Vijaya Kumar K,
	Julien Grall, Tim Deegan, xen-devel, Stefano Stabellini,
	manish.jaggi

On 29/04/15 13:33, Manish Jaggi wrote:
> On Wednesday 29 April 2015 05:51 PM, Julien Grall wrote:
>> On 29/04/15 13:12, Manish Jaggi wrote:
>>>>> and that too ITS is not in critical path. It is only used when
>>>>> configuring interrupts of the device? 
>>>> You need to think about security... Even though the ITS should only
>>>> be used for configuring interrupts, a malicious guest could try to
>>>> exploit weakness in the emulation. 
>>> Can you describe the scenario ? 
>> I already wrote several times the possible security impacts of the
>> polling solution... Please read again the previous mails.
> I see your comment "The vITS emulates hardware for a specific domain. A
> malicious guest could send request to a not own device"
> This scenario cannot happen as guest sbdf is converted to physical sbdf
> based on the domain. So if it does not own a device it would be treated
> as invalid command.

Can you point to the code in this patch series that implements what you
said? From what I read, you just forward the command to the physical ITS
as long as the guest has issued MAPD for the device.

> Do you have any other security concern ?

Yes. The one we have talked about in every mail since the beginning of
this thread: "polling in EL2". We got several XSAs because the hypervisor
code wasn't preemptible (see [1]).


[1] http://xenbits.xen.org/xsa/advisory-97.html

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [RFC PATCH v2 13/22] xen/arm: its: Add virtual ITS command support
  2015-04-29 13:01                                         ` Julien Grall
@ 2015-04-29 13:08                                           ` Manish Jaggi
  2015-04-29 13:16                                             ` Julien Grall
  0 siblings, 1 reply; 109+ messages in thread
From: Manish Jaggi @ 2015-04-29 13:08 UTC (permalink / raw)
  To: Julien Grall, Vijay Kilari
  Cc: Ian Campbell, Stefano Stabellini, Prasun Kapoor, Vijaya Kumar K,
	Julien Grall, Tim Deegan, xen-devel, Stefano Stabellini,
	manish.jaggi



On Wednesday 29 April 2015 06:31 PM, Julien Grall wrote:
> On 29/04/15 13:33, Manish Jaggi wrote:
>> On Wednesday 29 April 2015 05:51 PM, Julien Grall wrote:
>>> On 29/04/15 13:12, Manish Jaggi wrote:
>>>>>> and that too ITS is not in critical path. It is only used when
>>>>>> configuring interrupts of the device?
>>>>> You need to think about security... Even though the ITS should only
>>>>> be used for configuring interrupts, a malicious guest could try to
>>>>> exploit weakness in the emulation.
>>>> Can you describe the scenario ?
>>> I already wrote several times the possible security impacts of the
>>> polling solution... Please read again the previous mails.
>> I see your comment "The vITS emulates hardware for a specific domain. A
>> malicious guest could send request to a not own device"
>> This scenario cannot happen as guest sbdf is converted to physical sbdf
>> based on the domain. So if it does not own a device it would be treated
>> as invalid command.
> Can you point the code in this patch series that implement what you
> said? From what I read, you just forward the command to the physical ITS
> as long as the guest called MAPD to the device.
>
>> Do you have any other security concern ?
> Yes. The one we talked in every mail since the beginning of this thread
> "polling in EL2". We got several XSA because the hypervisor code wasn't
> preemptible (see [1])
>
We are removing polling by using command processing completion, which is
signalled using the INT interrupt.
> [1] http://xenbits.xen.org/xsa/advisory-97.html
>

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [RFC PATCH v2 13/22] xen/arm: its: Add virtual ITS command support
  2015-04-29 13:08                                           ` Manish Jaggi
@ 2015-04-29 13:16                                             ` Julien Grall
  0 siblings, 0 replies; 109+ messages in thread
From: Julien Grall @ 2015-04-29 13:16 UTC (permalink / raw)
  To: Manish Jaggi, Vijay Kilari
  Cc: Ian Campbell, Stefano Stabellini, Prasun Kapoor, Vijaya Kumar K,
	Julien Grall, Tim Deegan, xen-devel, Stefano Stabellini,
	manish.jaggi

On 29/04/15 14:08, Manish Jaggi wrote:
>>> Do you have any other security concern ?
>> Yes. The one we talked in every mail since the beginning of this thread
>> "polling in EL2". We got several XSA because the hypervisor code wasn't
>> preemptible (see [1])
>>
> We are removing polling using command processing completion which is
> signalled using INT interrupt.

Based on the discussion with Vijay, completion based on the INT
command doesn't seem to be the right solution (see in particular [1] and
[2]).

Could you wait for the end of this discussion before starting to implement
something that you may have to throw away later?

Regards,

[1] http://lists.xen.org/archives/html/xen-devel/2015-04/msg03038.html
[2] http://lists.xen.org/archives/html/xen-devel/2015-04/msg03056.html

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [RFC PATCH v2 13/22] xen/arm: its: Add virtual ITS command support
  2015-04-29 11:56                                 ` Julien Grall
  2015-04-29 12:12                                   ` Manish Jaggi
@ 2015-04-29 13:35                                   ` Julien Grall
  2015-04-29 16:26                                     ` Vijay Kilari
  1 sibling, 1 reply; 109+ messages in thread
From: Julien Grall @ 2015-04-29 13:35 UTC (permalink / raw)
  To: Julien Grall, Vijay Kilari
  Cc: Ian Campbell, Stefano Stabellini, Prasun Kapoor, Vijaya Kumar K,
	Julien Grall, Tim Deegan, xen-devel, Stefano Stabellini,
	manish.jaggi

On 29/04/15 12:56, Julien Grall wrote:
> As the 2 suggested approach don't seem to fit our usage, we need to find
> another approach.

I think I have another approach which requires neither an interrupt nor
polling in EL2.

1) Trap on CWRITER
   a) Read a command from the vITS CQ
   b) Transform the command
   c) Inject the command
        - If ITS CQ full => 2)
        - If vITS CQ empty => 2)
        - Else => 1)
2) Return to guest
3) Trap on CREADR
   a) Check completion of the current batch of commands
      - If complete => 3.b)
      - Else => 4)
   b) Update CREADR
   c) Check if there are more commands and inject them (see process in 1))
4) Return to guest

Some other restrictions to add:
   - If there is already a batch of commands in progress for the domain,
defer the injection of new commands
   - The number of commands sent in a batch should be limited.

I think those restrictions are okay because the ITS is per-domain, not
per-VCPU.

However, there is a possible problem if the guest is not reading
CREADR. Maybe having a timer to update CREADR and inject
new commands would be fine.
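
The steps and restrictions above can be sketched roughly as follows. This
is only a toy model, not code from the series: struct vits, struct pits,
VITS_BATCH_MAX and every function name are hypothetical, queue positions
are plain indices rather than register offsets, and command translation
is reduced to a comment.

```c
#include <stdbool.h>
#include <stddef.h>

#define VITS_BATCH_MAX 4        /* assumed per-batch limit */

struct vits {                   /* per-domain virtual ITS state */
    size_t cwriter;             /* guest write index */
    size_t creadr;              /* read index exposed to the guest */
    size_t injected;            /* next vITS CQ entry to forward */
    size_t queue_len;           /* number of slots in the vITS CQ */
    bool batch_in_flight;       /* one batch per domain at a time */
};

struct pits {                   /* toy model of the physical ITS CQ */
    size_t free_slots;          /* free entries in the physical CQ */
    size_t pending;             /* injected but not yet completed */
};

/* Step 1: forward commands until a stop condition is hit. */
static size_t vits_inject_batch(struct vits *v, struct pits *p)
{
    size_t sent = 0;

    if ( v->batch_in_flight )
        return 0;               /* defer: a batch is already in progress */

    while ( v->injected != v->cwriter &&   /* vITS CQ not empty */
            p->free_slots > 0 &&           /* physical CQ not full */
            sent < VITS_BATCH_MAX )        /* batch size is limited */
    {
        /* A real implementation would read and transform the command. */
        v->injected = (v->injected + 1) % v->queue_len;
        p->free_slots--;
        p->pending++;
        sent++;
    }

    v->batch_in_flight = (sent > 0);
    return sent;
}

/* Steps 1-2: trap on CWRITER, inject, return to guest without waiting. */
static void vits_write_cwriter(struct vits *v, struct pits *p, size_t val)
{
    v->cwriter = val % v->queue_len;
    vits_inject_batch(v, p);
}

/* Steps 3-4: trap on CREADR, update it only once the batch is complete. */
static size_t vits_read_creadr(struct vits *v, struct pits *p)
{
    if ( v->batch_in_flight && p->pending == 0 )
    {
        v->creadr = v->injected;        /* 3.b */
        v->batch_in_flight = false;
        vits_inject_batch(v, p);        /* 3.c */
    }

    return v->creadr;                   /* 4 */
}

/* Test helper: pretend the hardware finished everything it was given. */
static void pits_complete_all(struct pits *p)
{
    p->free_slots += p->pending;
    p->pending = 0;
}
```

Neither trap handler loops waiting for the hardware, which is the point of
the proposal. The timer mentioned above would simply call the same logic
as vits_read_creadr() periodically.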

Regards,

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [RFC PATCH v2 13/22] xen/arm: its: Add virtual ITS command support
  2015-04-29 13:35                                   ` Julien Grall
@ 2015-04-29 16:26                                     ` Vijay Kilari
  2015-04-29 16:30                                       ` Vijay Kilari
  0 siblings, 1 reply; 109+ messages in thread
From: Vijay Kilari @ 2015-04-29 16:26 UTC (permalink / raw)
  To: Julien Grall
  Cc: Ian Campbell, Stefano Stabellini, Prasun Kapoor, Vijaya Kumar K,
	Julien Grall, Tim Deegan, xen-devel, Stefano Stabellini,
	manish.jaggi

Hi Julien,

On Wed, Apr 29, 2015 at 7:05 PM, Julien Grall <julien.grall@citrix.com> wrote:
> On 29/04/15 12:56, Julien Grall wrote:
>> As the 2 suggested approach don't seem to fit our usage, we need to find
>> another approach.
>
> I think I have another approach which doesn't require interrupt neither
> polling in EL2.

I could resolve all the issues around approach 1;
the only concern is generating the dummy/fake device ID.

>
> 1) Trap on CWRITER
>    a) Read command for the vITS CQ
>    b) Transform command
>    c) Inject command
>         - If ITS CQ full => 2)
>         - If vITS CQ empty => 2)
>         - Else => 1)
> 2) Return to guest
> 3) Trap on CREADR
>    a) Check completion of the current batch of command
>       - If complete => 3.b)
>       - Else => 4)
>    b) Update CREADR
>    c) Check if more command and inject it (see process in 1))
> 4) Return to guest
>
> Some other restrictions to add:
>    - If there is already a batch of command in process for the domain,
> defer the injection of new command
>    - The number of command sent in a batch should be limited.
>
> I think thoses restrictions are okay because the ITS is per-domain not
> per-VCPU.
>
> Although, there is a possible problem if the guest is not reading
> CREADR. Maybe having a timer would be fine to update CREADR and inject
> new commands.

How do we know that the guest is not reading CREADR and is using INT mode
to check for completion?
One way: when an INT ITS command is emulated, we can assume that the guest
is using INT mode, and use a timer to update CREADR and inject new
commands.

However, with INT mode, if the guest driver checks the CREADR value on
receiving the completion interrupt requested through the INT ITS command,
then CREADR will show the old value.

Regards
Vijay

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [RFC PATCH v2 13/22] xen/arm: its: Add virtual ITS command support
  2015-04-29 16:26                                     ` Vijay Kilari
@ 2015-04-29 16:30                                       ` Vijay Kilari
  2015-04-29 18:04                                         ` Julien Grall
  0 siblings, 1 reply; 109+ messages in thread
From: Vijay Kilari @ 2015-04-29 16:30 UTC (permalink / raw)
  To: Julien Grall
  Cc: Ian Campbell, Stefano Stabellini, Prasun Kapoor, Vijaya Kumar K,
	Julien Grall, Tim Deegan, xen-devel, Stefano Stabellini,
	manish.jaggi

On Wed, Apr 29, 2015 at 9:56 PM, Vijay Kilari <vijay.kilari@gmail.com> wrote:
> Hi Julien,
>
> On Wed, Apr 29, 2015 at 7:05 PM, Julien Grall <julien.grall@citrix.com> wrote:
>> On 29/04/15 12:56, Julien Grall wrote:
>>> As the 2 suggested approach don't seem to fit our usage, we need to find
>>> another approach.
>>
>> I think I have another approach which doesn't require interrupt neither
>> polling in EL2.
>
> I could resolve all the issues around approach 1
> only concern is generating dummy/fake device id.
>
>>
>> 1) Trap on CWRITER
>>    a) Read command for the vITS CQ
>>    b) Transform command
>>    c) Inject command
>>         - If ITS CQ full => 2)
>>         - If vITS CQ empty => 2)
>>         - Else => 1)
>> 2) Return to guest
>> 3) Trap on CREADR
>>    a) Check completion of the current batch of command
>>       - If complete => 3.b)
>>       - Else => 4)
>>    b) Update CREADR
>>    c) Check if more command and inject it (see process in 1))
>> 4) Return to guest
>>
>> Some other restrictions to add:
>>    - If there is already a batch of command in process for the domain,
>> defer the injection of new command
>>    - The number of command sent in a batch should be limited.
>>
>> I think thoses restrictions are okay because the ITS is per-domain not
>> per-VCPU.
>>
>> Although, there is a possible problem if the guest is not reading
>> CREADR. Maybe having a timer would be fine to update CREADR and inject
>> new commands.
>
>  How to know that guest is not reading CREADR and using INT mode to
> check for completion?
> One way is when INT ITS command is emulated, then we can assume that guest
> is using INT mode and use timer to update CREADR and inject new
> commands.
>
> However with INT mode, if the guest driver is checking for CREADR
> value on receiving completion interrupt requested through INT ITS command
> then CREADR will show up old value.

On a trap of CREADR we have to update it, so this can be handled.

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [RFC PATCH v2 13/22] xen/arm: its: Add virtual ITS command support
  2015-04-29 16:30                                       ` Vijay Kilari
@ 2015-04-29 18:04                                         ` Julien Grall
  2015-04-30 10:02                                           ` Stefano Stabellini
  0 siblings, 1 reply; 109+ messages in thread
From: Julien Grall @ 2015-04-29 18:04 UTC (permalink / raw)
  To: Vijay Kilari, Julien Grall
  Cc: Ian Campbell, Stefano Stabellini, Prasun Kapoor, Vijaya Kumar K,
	Julien Grall, Tim Deegan, xen-devel, Stefano Stabellini,
	manish.jaggi

On 29/04/15 17:30, Vijay Kilari wrote:
> On Wed, Apr 29, 2015 at 9:56 PM, Vijay Kilari <vijay.kilari@gmail.com> wrote:
>> On Wed, Apr 29, 2015 at 7:05 PM, Julien Grall <julien.grall@citrix.com> wrote:
>>> On 29/04/15 12:56, Julien Grall wrote:
>>>> As the 2 suggested approach don't seem to fit our usage, we need to find
>>>> another approach.
>>>
>>> I think I have another approach which doesn't require interrupt neither
>>> polling in EL2.
>>
>> I could resolve all the issues around approach 1
>> only concern is generating dummy/fake device id.

This is a big concern. We can't hardcode the devID because a real device
could use it later. Generating an ID at boot time wouldn't be
better because it could be broken by device hotplug.

It's very unfortunate that the ITS doesn't have a reserved interrupt.

So, I think we need to rule out interrupt-based completion.

>>
>> However with INT mode, if the guest driver is checking for CREADR
>> value on receiving completion interrupt requested through INT ITS command
>> then CREADR will show up old value.
> 
>   On trap of CREADR we have to update it. So this can be handled

Unfortunately not. We need to work in batch mode in order to have a
simple and secure implementation. This would restrict the number of
commands sent together, so the INT command may be sent in another batch.

Furthermore, if the command queue is full we need to defer the rest of
the guest's commands (rather than discarding the command as you do now).

I'm open to other approaches as long as they are secure. By secure, I
mean no polling in EL2 and no possibility of taking down another guest
(for instance by flooding the command queue).

Regards,

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [RFC PATCH v2 13/22] xen/arm: its: Add virtual ITS command support
  2015-04-29 18:04                                         ` Julien Grall
@ 2015-04-30 10:02                                           ` Stefano Stabellini
  2015-04-30 10:09                                             ` Julien Grall
  0 siblings, 1 reply; 109+ messages in thread
From: Stefano Stabellini @ 2015-04-30 10:02 UTC (permalink / raw)
  To: Julien Grall
  Cc: Ian Campbell, Vijay Kilari, Stefano Stabellini, Prasun Kapoor,
	Vijaya Kumar K, Julien Grall, Tim Deegan, xen-devel,
	Stefano Stabellini, manish.jaggi

On Wed, 29 Apr 2015, Julien Grall wrote:
> On 29/04/15 17:30, Vijay Kilari wrote:
> > On Wed, Apr 29, 2015 at 9:56 PM, Vijay Kilari <vijay.kilari@gmail.com> wrote:
> >> On Wed, Apr 29, 2015 at 7:05 PM, Julien Grall <julien.grall@citrix.com> wrote:
> >>> On 29/04/15 12:56, Julien Grall wrote:
> >>>> As the 2 suggested approach don't seem to fit our usage, we need to find
> >>>> another approach.
> >>>
> >>> I think I have another approach which doesn't require interrupt neither
> >>> polling in EL2.
> >>
> >> I could resolve all the issues around approach 1
> >> only concern is generating dummy/fake device id.
> 
> This is a big concern. We can't hardcode the devID because a real device
> could use it later. Having an ID generating at the boot time wouldn't be
> better because it could be broken with device hotplug.
> 
> It's very unfortunate that the ITS doesn't have a reserved interrupt.

Indeed, but it is an issue that can be overcome. We should just use an
out-of-range devid. One that cannot be hot-plugged later.

The one proposed by Vijay is actually not a bad idea (segment number
0xff). Segment numbers correspond to MCFG config spaces. There is
usually one per root complex, but theoretically I think one root complex
could have more.

I don't think it is possible to hot-plug a root complex, so if one is
spare at boot time, we should be safe. Even if it were possible, we could
still return an error from PHYSDEVOP_pci_host_bridge_add if the segment
number overlaps (see http://marc.info/?l=xen-devel&m=142495489932714).

So I think we should just go ahead and use PLAT_DUMMY_SEG 0xff.

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [RFC PATCH v2 13/22] xen/arm: its: Add virtual ITS command support
  2015-04-30 10:02                                           ` Stefano Stabellini
@ 2015-04-30 10:09                                             ` Julien Grall
  2015-04-30 10:15                                               ` Stefano Stabellini
  0 siblings, 1 reply; 109+ messages in thread
From: Julien Grall @ 2015-04-30 10:09 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Ian Campbell, Vijay Kilari, Prasun Kapoor, Vijaya Kumar K,
	Julien Grall, Tim Deegan, xen-devel, Stefano Stabellini,
	manish.jaggi

Hi Stefano,

On 30/04/2015 11:02, Stefano Stabellini wrote:
> On Wed, 29 Apr 2015, Julien Grall wrote:
>> On 29/04/15 17:30, Vijay Kilari wrote:
>>> On Wed, Apr 29, 2015 at 9:56 PM, Vijay Kilari <vijay.kilari@gmail.com> wrote:
>>>> On Wed, Apr 29, 2015 at 7:05 PM, Julien Grall <julien.grall@citrix.com> wrote:
>>>>> On 29/04/15 12:56, Julien Grall wrote:
>>>>>> As the 2 suggested approach don't seem to fit our usage, we need to find
>>>>>> another approach.
>>>>>
>>>>> I think I have another approach which doesn't require interrupt neither
>>>>> polling in EL2.
>>>>
>>>> I could resolve all the issues around approach 1
>>>> only concern is generating dummy/fake device id.
>>
>> This is a big concern. We can't hardcode the devID because a real device
>> could use it later. Having an ID generating at the boot time wouldn't be
>> better because it could be broken with device hotplug.
>>
>> It's very unfortunate that the ITS doesn't have a reserved interrupt.
>
> Indeed, but it is an issue that can be overcome. We should just use an
> out-of-range devid. One that cannot be hot-plugged later.
>
> The one proposed by Vijay is actually not a bad idea (segment number
> 0xff). Segment numbers correspond to MCFG config spaces. There is
> usually one per root complex, but theoretically I think one root complex
> could have more.
>
> I don't think is possible to hot-plug a root complex, so if one is spare
> at boot time, we should be safe. Even if it was possible, we could still
> return error from PHYSDEVOP_pci_host_bridge_add if the segment number
> overlaps (see http://marc.info/?l=xen-devel&m=142495489932714).
>
> So I think we should just go ahead and use PLAT_DUMMY_SEG 0xff.

As said earlier, the number of DevBits implemented by the ITS can be 
limited (see GITS_TYPER.Devbits).

If the devid is not within this range, the ITS won't recognize the value 
and won't be able to send the interrupt.

So this is clearly not the right value.
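
To make the range constraint concrete, here is a minimal sketch of the
check. It relies on the GICv3 layout of GITS_TYPER, where Devbits sits in
bits [17:13] and holds the number of DeviceID bits minus one. The helper
names its_devid_bits() and its_devid_in_range() are hypothetical and are
not taken from the patch series.

```c
#include <stdbool.h>
#include <stdint.h>

/* GITS_TYPER.Devbits occupies bits [17:13] and encodes (DeviceID bits - 1). */
#define GITS_TYPER_DEVBITS_SHIFT 13
#define GITS_TYPER_DEVBITS_MASK  0x1fU

/* Number of DeviceID bits the ITS implements, derived from GITS_TYPER. */
static unsigned int its_devid_bits(uint64_t gits_typer)
{
    return (unsigned int)((gits_typer >> GITS_TYPER_DEVBITS_SHIFT) &
                          GITS_TYPER_DEVBITS_MASK) + 1;
}

/* Hypothetical helper: true if devid can be represented by this ITS. */
static bool its_devid_in_range(uint64_t gits_typer, uint32_t devid)
{
    unsigned int bits = its_devid_bits(gits_typer);

    /* With 32 bits implemented, every 32-bit devid fits. */
    if ( bits >= 32 )
        return true;

    return devid < (UINT32_C(1) << bits);
}
```

A dummy devid would have to pass this check before it is programmed into
the ITS. Passing the check says nothing about whether a real device
already uses that devid.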

Regards,

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [RFC PATCH v2 13/22] xen/arm: its: Add virtual ITS command support
  2015-04-30 10:09                                             ` Julien Grall
@ 2015-04-30 10:15                                               ` Stefano Stabellini
  2015-04-30 10:20                                                 ` Julien Grall
  2015-04-30 13:19                                                 ` Vijay Kilari
  0 siblings, 2 replies; 109+ messages in thread
From: Stefano Stabellini @ 2015-04-30 10:15 UTC (permalink / raw)
  To: Julien Grall
  Cc: Ian Campbell, Vijay Kilari, Stefano Stabellini, Prasun Kapoor,
	Vijaya Kumar K, Julien Grall, Tim Deegan, xen-devel,
	Stefano Stabellini, manish.jaggi

On Thu, 30 Apr 2015, Julien Grall wrote:
> Hi Stefano,
> 
> On 30/04/2015 11:02, Stefano Stabellini wrote:
> > On Wed, 29 Apr 2015, Julien Grall wrote:
> > > On 29/04/15 17:30, Vijay Kilari wrote:
> > > > On Wed, Apr 29, 2015 at 9:56 PM, Vijay Kilari <vijay.kilari@gmail.com>
> > > > wrote:
> > > > > On Wed, Apr 29, 2015 at 7:05 PM, Julien Grall
> > > > > <julien.grall@citrix.com> wrote:
> > > > > > On 29/04/15 12:56, Julien Grall wrote:
> > > > > > > As the 2 suggested approach don't seem to fit our usage, we need
> > > > > > > to find
> > > > > > > another approach.
> > > > > > 
> > > > > > I think I have another approach which doesn't require interrupt
> > > > > > neither
> > > > > > polling in EL2.
> > > > > 
> > > > > I could resolve all the issues around approach 1
> > > > > only concern is generating dummy/fake device id.
> > > 
> > > This is a big concern. We can't hardcode the devID because a real device
> > > could use it later. Having an ID generating at the boot time wouldn't be
> > > better because it could be broken with device hotplug.
> > > 
> > > It's very unfortunate that the ITS doesn't have a reserved interrupt.
> > 
> > Indeed, but it is an issue that can be overcome. We should just use an
> > out-of-range devid. One that cannot be hot-plugged later.
> > 
> > The one proposed by Vijay is actually not a bad idea (segment number
> > 0xff). Segment numbers correspond to MCFG config spaces. There is
> > usually one per root complex, but theoretically I think one root complex
> > could have more.
> > 
> > I don't think is possible to hot-plug a root complex, so if one is spare
> > at boot time, we should be safe. Even if it was possible, we could still
> > return error from PHYSDEVOP_pci_host_bridge_add if the segment number
> > overlaps (see http://marc.info/?l=xen-devel&m=142495489932714).
> > 
> > So I think we should just go ahead and use PLAT_DUMMY_SEG 0xff.
> 
> As said earlier, the number of DevBits implemented by the ITS can be limited
> (see GITS_TYPER.Devbits).
> 
> If the devid is not within this range, the ITS won't recognize the value and
> won't be able to send the interrupt.
> 
> So this is clearly not the right value.

Sure, in that case the maximum value allowed by GITS_TYPER.Devbits.
Vijay, what is the value of GITS_TYPER.Devbits on your platform?

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [RFC PATCH v2 13/22] xen/arm: its: Add virtual ITS command support
  2015-04-30 10:15                                               ` Stefano Stabellini
@ 2015-04-30 10:20                                                 ` Julien Grall
  2015-04-30 10:50                                                   ` Stefano Stabellini
  2015-04-30 13:19                                                 ` Vijay Kilari
  1 sibling, 1 reply; 109+ messages in thread
From: Julien Grall @ 2015-04-30 10:20 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Ian Campbell, Vijay Kilari, Prasun Kapoor, Vijaya Kumar K,
	Julien Grall, Tim Deegan, xen-devel, Stefano Stabellini,
	manish.jaggi



On 30/04/2015 11:15, Stefano Stabellini wrote:
>> As said earlier, the number of DevBits implemented by the ITS can be limited
>> (see GITS_TYPER.Devbits).
>>
>> If the devid is not within this range, the ITS won't recognize the value and
>> won't be able to send the interrupt.
>>
>> So this is clearly not the right value.
>
> Sure, in that case the maximum value allowed by GITS_TYPER.Devbits.
> Vijay, what is the value of GITS_TYPER.Devbits on your platform?

How can you be sure that this value won't be used for a device on this
platform?

Regards,

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [RFC PATCH v2 13/22] xen/arm: its: Add virtual ITS command support
  2015-04-30 10:20                                                 ` Julien Grall
@ 2015-04-30 10:50                                                   ` Stefano Stabellini
  0 siblings, 0 replies; 109+ messages in thread
From: Stefano Stabellini @ 2015-04-30 10:50 UTC (permalink / raw)
  To: Julien Grall
  Cc: Ian Campbell, Vijay Kilari, Stefano Stabellini, Prasun Kapoor,
	Vijaya Kumar K, Julien Grall, Tim Deegan, xen-devel,
	Stefano Stabellini, manish.jaggi

On Thu, 30 Apr 2015, Julien Grall wrote:
> On 30/04/2015 11:15, Stefano Stabellini wrote:
> > > As said earlier, the number of DevBits implemented by the ITS can be
> > > limited
> > > (see GITS_TYPER.Devbits).
> > > 
> > > If the devid is not within this range, the ITS won't recognize the value
> > > and
> > > won't be able to send the interrupt.
> > > 
> > > So this is clearly not the right value.
> > 
> > Sure, in that case the maximum value allowed by GITS_TYPER.Devbits.
> > Vijay, what is the value of GITS_TYPER.Devbits on your platform?
> 
> How can you be sure that this value won't be use for a device on this
> platform?

It really depends on the platform: if the devid space allowed by
GITS_TYPER.Devbits is much larger than the number of hotpluggable
devices, then we are fine. If the two are exactly equal, we need to
return an error on the last device hotplug.

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [RFC PATCH v2 13/22] xen/arm: its: Add virtual ITS command support
  2015-04-30 10:15                                               ` Stefano Stabellini
  2015-04-30 10:20                                                 ` Julien Grall
@ 2015-04-30 13:19                                                 ` Vijay Kilari
  2015-04-30 13:47                                                   ` Stefano Stabellini
  1 sibling, 1 reply; 109+ messages in thread
From: Vijay Kilari @ 2015-04-30 13:19 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Ian Campbell, Prasun Kapoor, Vijaya Kumar K, Julien Grall,
	Tim Deegan, xen-devel, Julien Grall, Stefano Stabellini,
	manish.jaggi

On Thu, Apr 30, 2015 at 3:45 PM, Stefano Stabellini
<stefano.stabellini@eu.citrix.com> wrote:
> On Thu, 30 Apr 2015, Julien Grall wrote:
>> Hi Stefano,
>>
>> On 30/04/2015 11:02, Stefano Stabellini wrote:
>> > On Wed, 29 Apr 2015, Julien Grall wrote:
>> > > On 29/04/15 17:30, Vijay Kilari wrote:
>> > > > On Wed, Apr 29, 2015 at 9:56 PM, Vijay Kilari <vijay.kilari@gmail.com>
>> > > > wrote:
>> > > > > On Wed, Apr 29, 2015 at 7:05 PM, Julien Grall
>> > > > > <julien.grall@citrix.com> wrote:
>> > > > > > On 29/04/15 12:56, Julien Grall wrote:
>> > > > > > > As the 2 suggested approach don't seem to fit our usage, we need
>> > > > > > > to find
>> > > > > > > another approach.
>> > > > > >
>> > > > > > I think I have another approach which doesn't require interrupt
>> > > > > > neither
>> > > > > > polling in EL2.
>> > > > >
>> > > > > I could resolve all the issues around approach 1
>> > > > > only concern is generating dummy/fake device id.
>> > >
>> > > This is a big concern. We can't hardcode the devID because a real device
>> > > could use it later. Having an ID generating at the boot time wouldn't be
>> > > better because it could be broken with device hotplug.
>> > >
>> > > It's very unfortunate that the ITS doesn't have a reserved interrupt.
>> >
>> > Indeed, but it is an issue that can be overcome. We should just use an
>> > out-of-range devid. One that cannot be hot-plugged later.
>> >
>> > The one proposed by Vijay is actually not a bad idea (segment number
>> > 0xff). Segment numbers correspond to MCFG config spaces. There is
>> > usually one per root complex, but theoretically I think one root complex
>> > could have more.
>> >
>> > I don't think is possible to hot-plug a root complex, so if one is spare
>> > at boot time, we should be safe. Even if it was possible, we could still
>> > return error from PHYSDEVOP_pci_host_bridge_add if the segment number
>> > overlaps (see http://marc.info/?l=xen-devel&m=142495489932714).
>> >
>> > So I think we should just go ahead and use PLAT_DUMMY_SEG 0xff.
>>
>> As said earlier, the number of DevBits implemented by the ITS can be limited
>> (see GITS_TYPER.Devbits).

The ITS spec does not define how many of these bits correspond to the
segment number. This is platform specific.

>>
>> If the devid is not within this range, the ITS won't recognize the value and
>> won't be able to send the interrupt.
>>
>> So this is clearly not the right value.
>
> Sure, in that case the maximum value allowed by GITS_TYPER.Devbits.
> Vijay, what is the value of GITS_TYPER.Devbits on your platform?

It is 21 bits

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [RFC PATCH v2 13/22] xen/arm: its: Add virtual ITS command support
  2015-04-30 13:19                                                 ` Vijay Kilari
@ 2015-04-30 13:47                                                   ` Stefano Stabellini
  2015-04-30 14:29                                                     ` Julien Grall
  0 siblings, 1 reply; 109+ messages in thread
From: Stefano Stabellini @ 2015-04-30 13:47 UTC (permalink / raw)
  To: Vijay Kilari
  Cc: Ian Campbell, Stefano Stabellini, Prasun Kapoor, Vijaya Kumar K,
	Julien Grall, Tim Deegan, xen-devel, Julien Grall,
	Stefano Stabellini, manish.jaggi

On Thu, 30 Apr 2015, Vijay Kilari wrote:
> On Thu, Apr 30, 2015 at 3:45 PM, Stefano Stabellini
> <stefano.stabellini@eu.citrix.com> wrote:
> > On Thu, 30 Apr 2015, Julien Grall wrote:
> >> Hi Stefano,
> >>
> >> On 30/04/2015 11:02, Stefano Stabellini wrote:
> >> > On Wed, 29 Apr 2015, Julien Grall wrote:
> >> > > On 29/04/15 17:30, Vijay Kilari wrote:
> >> > > > On Wed, Apr 29, 2015 at 9:56 PM, Vijay Kilari <vijay.kilari@gmail.com>
> >> > > > wrote:
> >> > > > > On Wed, Apr 29, 2015 at 7:05 PM, Julien Grall
> >> > > > > <julien.grall@citrix.com> wrote:
> >> > > > > > On 29/04/15 12:56, Julien Grall wrote:
> >> > > > > > > As the 2 suggested approach don't seem to fit our usage, we need
> >> > > > > > > to find
> >> > > > > > > another approach.
> >> > > > > >
> >> > > > > > I think I have another approach which doesn't require interrupt
> >> > > > > > neither
> >> > > > > > polling in EL2.
> >> > > > >
> >> > > > > I could resolve all the issues around approach 1
> >> > > > > only concern is generating dummy/fake device id.
> >> > >
> >> > > This is a big concern. We can't hardcode the devID because a real device
> >> > > could use it later. Having an ID generating at the boot time wouldn't be
> >> > > better because it could be broken with device hotplug.
> >> > >
> >> > > It's very unfortunate that the ITS doesn't have a reserved interrupt.
> >> >
> >> > Indeed, but it is an issue that can be overcome. We should just use an
> >> > out-of-range devid. One that cannot be hot-plugged later.
> >> >
> >> > The one proposed by Vijay is actually not a bad idea (segment number
> >> > 0xff). Segment numbers correspond to MCFG config spaces. There is
> >> > usually one per root complex, but theoretically I think one root complex
> >> > could have more.
> >> >
> >> > I don't think is possible to hot-plug a root complex, so if one is spare
> >> > at boot time, we should be safe. Even if it was possible, we could still
> >> > return error from PHYSDEVOP_pci_host_bridge_add if the segment number
> >> > overlaps (see http://marc.info/?l=xen-devel&m=142495489932714).
> >> >
> >> > So I think we should just go ahead and use PLAT_DUMMY_SEG 0xff.
> >>
> >> As said earlier, the number of DevBits implemented by the ITS can be limited
> >> (see GITS_TYPER.Devbits).
> 
> ITS specs does not define how may of these bits corresponds to segment number.
> This is platform specific.
> 
> >>
> >> If the devid is not within this range, the ITS won't recognize the value and
> >> won't be able to send the interrupt.
> >>
> >> So this is clearly not the right value.
> >
> > Sure, in that case the maximum value allowed by GITS_TYPER.Devbits.
> > Vijay, what is the value of GITS_TYPER.Devbits on your platform?
> 
> It is 21 bits

I would imagine that 21 bits would be plenty to find an unused devid.
Alternatively we could use an inexistent function of a real device, such
as 00:00.1 (function 1 of the host bridge).
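
As a rough sketch of what such a devid would look like: the bus, device
and function fields below follow the standard PCI requester ID layout.
The mapping from segment plus BDF to an ITS DeviceID is platform
specific, so its_devid_from_sbdf() is only an assumed, hypothetical
mapping for illustration.

```c
#include <stdint.h>

/* Standard PCI requester ID layout: bus[15:8], device[7:3], function[2:0]. */
static uint16_t pci_bdf(uint8_t bus, uint8_t dev, uint8_t fn)
{
    return (uint16_t)(((uint16_t)bus << 8) |
                      ((uint16_t)(dev & 0x1fU) << 3) |
                      (uint16_t)(fn & 0x7U));
}

/*
 * Hypothetical mapping: assumes the ITS DeviceID is (segment << 16) | BDF.
 * Real platforms may translate requester IDs differently.
 */
static uint32_t its_devid_from_sbdf(uint16_t seg, uint8_t bus,
                                    uint8_t dev, uint8_t fn)
{
    return ((uint32_t)seg << 16) | pci_bdf(bus, dev, fn);
}
```

Under that assumption, 00:00.1 in segment 0 gives devid 0x1, which fits
comfortably within 21 DeviceID bits.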

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [RFC PATCH v2 13/22] xen/arm: its: Add virtual ITS command support
  2015-04-30 13:47                                                   ` Stefano Stabellini
@ 2015-04-30 14:29                                                     ` Julien Grall
  2015-05-04 12:58                                                       ` Vijay Kilari
  0 siblings, 1 reply; 109+ messages in thread
From: Julien Grall @ 2015-04-30 14:29 UTC (permalink / raw)
  To: Stefano Stabellini, Vijay Kilari
  Cc: Ian Campbell, Prasun Kapoor, Vijaya Kumar K, Julien Grall,
	Tim Deegan, xen-devel, Julien Grall, Stefano Stabellini,
	manish.jaggi

Hi,

On 30/04/15 14:47, Stefano Stabellini wrote:
>>>>
>>>> If the devid is not within this range, the ITS won't recognize the value and
>>>> won't be able to send the interrupt.
>>>>
>>>> So this is clearly not the right value.
>>>
>>> Sure, in that case the maximum value allowed by GITS_TYPER.Devbits.
>>> Vijay, what is the value of GITS_TYPER.Devbits on your platform?
>>
>> It is 21 bits
> 
> I would imagine that 21 bits would be plenty to find an unused devid.
>
> Alternatively we could use an inexistent function of a real device, such
> as 00:00.1 (function 1 of the host bridge).

As discussed IRL, this idea sounds good to me.

Although I would be happy with any other way which ensures the devid is free.

Regards,

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [RFC PATCH v2 13/22] xen/arm: its: Add virtual ITS command support
  2015-04-30 14:29                                                     ` Julien Grall
@ 2015-05-04 12:58                                                       ` Vijay Kilari
  2015-05-04 13:04                                                         ` Julien Grall
  2015-05-05 10:39                                                         ` Stefano Stabellini
  0 siblings, 2 replies; 109+ messages in thread
From: Vijay Kilari @ 2015-05-04 12:58 UTC (permalink / raw)
  To: Julien Grall
  Cc: Ian Campbell, Stefano Stabellini, Prasun Kapoor, Vijaya Kumar K,
	Julien Grall, Tim Deegan, xen-devel, Stefano Stabellini,
	manish.jaggi

On Thu, Apr 30, 2015 at 7:59 PM, Julien Grall <julien.grall@citrix.com> wrote:
> Hi,
>
> On 30/04/15 14:47, Stefano Stabellini wrote:
>>>>>
>>>>> If the devid is not within this range, the ITS won't recognize the value and
>>>>> won't be able to send the interrupt.
>>>>>
>>>>> So this is clearly not the right value.
>>>>
>>>> Sure, in that case the maximum value allowed by GITS_TYPER.Devbits.
>>>> Vijay, what is the value of GITS_TYPER.Devbits on your platform?
>>>
>>> It is 21 bits
>>
>> I would imagine that 21 bits would be plenty to find an unused devid.
>>
>> Alternatively we could use an inexistent function of a real device, such
>> as 00:00.1 (function 1 of the host bridge).
>
> As discussed IRL, this idea sounds good to me.
>
> Although I would be happy with any other way which ensure the devid is free.

I have prototyped this with 00:00.1 as the device ID. But I see that Dom0
boot is slow compared to polling mode. This could be because Dom0 has to
keep trapping on CREADR to check whether CREADR has been updated or not.

Regards
Vijay

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [RFC PATCH v2 13/22] xen/arm: its: Add virtual ITS command support
  2015-05-04 12:58                                                       ` Vijay Kilari
@ 2015-05-04 13:04                                                         ` Julien Grall
  2015-05-04 13:27                                                           ` Vijay Kilari
  2015-05-05 10:39                                                         ` Stefano Stabellini
  1 sibling, 1 reply; 109+ messages in thread
From: Julien Grall @ 2015-05-04 13:04 UTC (permalink / raw)
  To: Vijay Kilari
  Cc: Ian Campbell, Stefano Stabellini, Prasun Kapoor, Vijaya Kumar K,
	Julien Grall, Tim Deegan, xen-devel, Stefano Stabellini,
	manish.jaggi



On 04/05/2015 13:58, Vijay Kilari wrote:
> On Thu, Apr 30, 2015 at 7:59 PM, Julien Grall <julien.grall@citrix.com> wrote:
>> Hi,
>>
>> On 30/04/15 14:47, Stefano Stabellini wrote:
>>>>>>
>>>>>> If the devid is not within this range, the ITS won't recognize the value and
>>>>>> won't be able to send the interrupt.
>>>>>>
>>>>>> So this is clearly not the right value.
>>>>>
>>>>> Sure, in that case the maximum value allowed by GITS_TYPER.Devbits.
>>>>> Vijay, what is the value of GITS_TYPER.Devbits on your platform?
>>>>
>>>> It is 21 bits
>>>
>>> I would imagine that 21 bits would be plenty to find an unused devid.
>>>
>>> Alternatively we could use an inexistent function of a real device, such
>>> as 00:00.1 (function 1 of the host bridge).
>>
>> As discussed IRL, this idea sounds good to me.
>>
>> Although I would be happy with any other way which ensure the devid is free.
>
> Has prototyped with 00.00.1 as device id. But I see that Dom0 boot is
> slow compared to polling mode. This could be because Dom0 has to keep
> trapping on creader to check if creader is updated or not.

How did you implement the interrupt mode? Could it be improved?

Regards,

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [RFC PATCH v2 13/22] xen/arm: its: Add virtual ITS command support
  2015-05-04 13:04                                                         ` Julien Grall
@ 2015-05-04 13:27                                                           ` Vijay Kilari
  2015-05-04 13:44                                                             ` Julien Grall
  0 siblings, 1 reply; 109+ messages in thread
From: Vijay Kilari @ 2015-05-04 13:27 UTC (permalink / raw)
  To: Julien Grall
  Cc: Ian Campbell, Stefano Stabellini, Prasun Kapoor, Vijaya Kumar K,
	Julien Grall, Tim Deegan, xen-devel, Stefano Stabellini,
	manish.jaggi

On Mon, May 4, 2015 at 6:34 PM, Julien Grall <julien.grall@citrix.com> wrote:
>
>
> On 04/05/2015 13:58, Vijay Kilari wrote:
>>
>> On Thu, Apr 30, 2015 at 7:59 PM, Julien Grall <julien.grall@citrix.com>
>> wrote:
>>>
>>> Hi,
>>>
>>> On 30/04/15 14:47, Stefano Stabellini wrote:
>>>>>>>
>>>>>>>
>>>>>>> If the devid is not within this range, the ITS won't recognize the
>>>>>>> value and
>>>>>>> won't be able to send the interrupt.
>>>>>>>
>>>>>>> So this is clearly not the right value.
>>>>>>
>>>>>>
>>>>>> Sure, in that case the maximum value allowed by GITS_TYPER.Devbits.
>>>>>> Vijay, what is the value of GITS_TYPER.Devbits on your platform?
>>>>>
>>>>>
>>>>> It is 21 bits
>>>>
>>>>
>>>> I would imagine that 21 bits would be plenty to find an unused devid.
>>>>
>>>> Alternatively we could use an inexistent function of a real device, such
>>>> as 00:00.1 (function 1 of the host bridge).
>>>
>>>
>>> As discussed IRL, this idea sounds good to me.
>>>
>>> Although I would be happy with any other way which ensure the devid is
>>> free.
>>
>>
>> Has prototyped with 00.00.1 as device id. But I see that Dom0 boot is
>> slow compared to polling mode. This could be because Dom0 has to keep
>> trapping on creader to check if creader is updated or not.
>
>
> How did you implement the interrupt mode? Could it be improve?

   1) In the physical ITS driver, an its_device is created with devID
      00:00.1 and 256 MSI-X reserved. It is named completion_dev and is
      global.

   2) In domain init:
         - One irq (called completion_irq) is allocated per domain for
           this device, and the irq desc is assigned to this domain. This
           way we can get the vITS context when the interrupt is received.
         - An array of 32 requests (its_requests) is allocated. It stores
           all the information about the ITS commands that are converted
           and written to the physical ITS queue. Each request entry
           contains:
           [CReader vITS] [CWriter vITS] [Physical Q Index]
           [Number of commands] [Completion irq] [status]

   3) When the vITS receives an ITS command, the command is converted to
      a physical command and written to the physical queue. An INT command
      is appended with completion_irq, and an entry is made in
      its_requests.

   4) On receiving completion_irq, the vITS structure and its_requests
      info are parsed, the CReader of the vITS is updated with the
      [CWriter vITS] stored for this request, and the request is removed
      from its_requests.

I am adding one INT per command. This can be improved by adding one INT
command for all the pending commands. The existing Linux driver sends 2
commands at a time.
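
The bookkeeping in steps 2) to 4) could be sketched roughly as below.
This is only an illustration of the description above, not the code from
the prototype: the struct layouts, field names and the functions
vits_queue_request() and vits_complete_request() are all hypothetical,
and the sketch identifies a finished request by its slot index, which is
an assumption.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define VITS_MAX_REQUESTS 32   /* array size given in the description */

/* Hypothetical layout of one its_requests entry. */
struct its_request {
    uint64_t vits_creadr;    /* guest CREADR when the request was queued */
    uint64_t vits_cwriter;   /* guest CWRITER to publish on completion */
    uint32_t phys_q_index;   /* slot in the physical command queue */
    uint32_t nr_cmds;        /* number of commands in this request */
    uint32_t completion_irq; /* interrupt raised by the trailing INT */
    bool     in_use;         /* status: still waiting for completion */
};

/* Hypothetical per-domain vITS state. */
struct vits {
    uint64_t creadr;         /* value the guest sees when reading CREADR */
    struct its_request requests[VITS_MAX_REQUESTS];
};

/* Step 3: record a request. Returns its slot, or -1 if all are busy. */
static int vits_queue_request(struct vits *v, uint64_t cwriter,
                              uint32_t phys_q_index, uint32_t nr_cmds,
                              uint32_t completion_irq)
{
    for ( size_t i = 0; i < VITS_MAX_REQUESTS; i++ )
    {
        struct its_request *r = &v->requests[i];

        if ( r->in_use )
            continue;

        r->vits_creadr = v->creadr;
        r->vits_cwriter = cwriter;
        r->phys_q_index = phys_q_index;
        r->nr_cmds = nr_cmds;
        r->completion_irq = completion_irq;
        r->in_use = true;
        return (int)i;
    }

    return -1;
}

/* Step 4: on completion, publish the stored CWRITER and free the slot. */
static bool vits_complete_request(struct vits *v, int slot)
{
    if ( slot < 0 || slot >= VITS_MAX_REQUESTS || !v->requests[slot].in_use )
        return false;

    v->creadr = v->requests[slot].vits_cwriter;
    v->requests[slot].in_use = false;
    return true;
}
```

The guest-visible CREADR only moves when the completion interrupt
arrives, so a guest that reads CREADR in a loop keeps trapping until
then.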

Regards
Vijay

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [RFC PATCH v2 13/22] xen/arm: its: Add virtual ITS command support
  2015-05-04 13:27                                                           ` Vijay Kilari
@ 2015-05-04 13:44                                                             ` Julien Grall
  2015-05-04 13:54                                                               ` Julien Grall
  0 siblings, 1 reply; 109+ messages in thread
From: Julien Grall @ 2015-05-04 13:44 UTC (permalink / raw)
  To: Vijay Kilari
  Cc: Ian Campbell, Stefano Stabellini, Prasun Kapoor, Vijaya Kumar K,
	Julien Grall, Tim Deegan, xen-devel, Stefano Stabellini,
	manish.jaggi

Hi Vijay,

On 04/05/2015 14:27, Vijay Kilari wrote:
> On Mon, May 4, 2015 at 6:34 PM, Julien Grall <julien.grall@citrix.com> wrote:
>>
>>
>> On 04/05/2015 13:58, Vijay Kilari wrote:
>>>
>>> On Thu, Apr 30, 2015 at 7:59 PM, Julien Grall <julien.grall@citrix.com>
>>> wrote:
>>>>
>>>> Hi,
>>>>
>>>> On 30/04/15 14:47, Stefano Stabellini wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> If the devid is not within this range, the ITS won't recognize the
>>>>>>>> value and
>>>>>>>> won't be able to send the interrupt.
>>>>>>>>
>>>>>>>> So this is clearly not the right value.
>>>>>>>
>>>>>>>
>>>>>>> Sure, in that case the maximum value allowed by GITS_TYPER.Devbits.
>>>>>>> Vijay, what is the value of GITS_TYPER.Devbits on your platform?
>>>>>>
>>>>>>
>>>>>> It is 21 bits
>>>>>
>>>>>
>>>>> I would imagine that 21 bits would be plenty to find an unused devid.
>>>>>
>>>>> Alternatively we could use an inexistent function of a real device, such
>>>>> as 00:00.1 (function 1 of the host bridge).
>>>>
>>>>
>>>> As discussed IRL, this idea sounds good to me.
>>>>
>>>> Although I would be happy with any other way which ensure the devid is
>>>> free.
>>>
>>>
>>> Has prototyped with 00.00.1 as device id. But I see that Dom0 boot is
>>> slow compared to polling mode. This could be because Dom0 has to keep
>>> trapping on creader to check if creader is updated or not.
>>
>>
>> How did you implement the interrupt mode? Could it be improve?
>
>     1) In physical ITS driver its_device is created with devID 00:00.1 with
> 256 MSI-x are reserved and is named as completion_dev, which is global.

That's a lot of MSI-x reserved... Can't you use only one per domain?

>     2) In Domain init,
>           - one irq (called completion_irq) is allocated per domain for
> this device

So only 256 domains can run on your platform at the same time?

>             and irq desc is allocated to this domain. This way we can get vITS
>             context when interrupt is received.
>           - An array of 32 requests(its_requests) is allocated which stores all
>             the information about the ITS commands that are converted and written
>             to physical ITS queue and each request info contains
>             [CReader vITS] [CWriter vITS] [Physical Q Index] [Number of commands]
>              [Completion irq] [ status ]

Why 32 requests?

Also, some of the fields don't make much sense to me, such as "Number of
Commands", "Completion IRQ" and "status". I guess I will find out when
you send the new series.

>     3) When vITS received ITS command. This command is converted to physical
>        command  and written to physical queue, INT command is appended with
>        completion_irq and entry is made in its_requests.
>     4) On receiving completion_irq, vITS structure and its_requests
> info is parsed
>         and Creader of vITS is upstated with [ Cwriter vITS ] stored
> for this request
>         and this request is removed from its_requests.

You complicate the code by allowing 32 batches of commands per domain
(looping is very slow). You should only allow one batch of commands per
domain. When the batch is finished you can send another one.

>   I am adding one INT per command. This can be improved to add one INT
> cmd for all
>   the pending commands. Existing Linux driver sends 2 commands at a time.

You should not assume that other OSes will send 2 commands at the same
time... It could be more or less.

Although, having an INT per command is rather slow. One INT command per
batch would improve the boot time.
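
A minimal sketch of the "one batch in flight per domain, one INT per
batch" idea follows. Everything here is hypothetical: the struct, the
function names and the batch cap are assumptions for illustration, and
the sketch treats CREADR/CWRITER as monotonically increasing byte
offsets, ignoring queue wrap-around. The only hardware fact used is that
an ITS command is 32 bytes.

```c
#include <stdbool.h>
#include <stdint.h>

#define ITS_CMD_SIZE         32                  /* bytes per ITS command */
#define VITS_MAX_BATCH_BYTES (8 * ITS_CMD_SIZE)  /* assumed cap per batch */

/* Hypothetical per-domain state: at most one batch is in flight. */
struct vits_batch_state {
    uint64_t creadr;        /* guest-visible CREADR (byte offset) */
    uint64_t cwriter;       /* latest guest CWRITER (byte offset) */
    uint64_t inflight_end;  /* offset at which the in-flight batch ends */
    bool     inflight;      /* a batch is waiting for its INT */
};

/*
 * Try to submit the next batch. Returns the number of bytes submitted,
 * or 0 if a batch is already in flight or nothing is pending.
 */
static uint64_t vits_submit_batch(struct vits_batch_state *s)
{
    uint64_t pending;

    if ( s->inflight || s->creadr == s->cwriter )
        return 0;

    pending = s->cwriter - s->creadr;
    if ( pending > VITS_MAX_BATCH_BYTES )
        pending = VITS_MAX_BATCH_BYTES;

    /*
     * Here the commands would be translated and written to the physical
     * queue, followed by a single INT command for the whole batch.
     */
    s->inflight_end = s->creadr + pending;
    s->inflight = true;
    return pending;
}

/* Completion interrupt: advance CREADR past the finished batch. */
static void vits_batch_done(struct vits_batch_state *s)
{
    if ( !s->inflight )
        return;

    s->creadr = s->inflight_end;
    s->inflight = false;
}
```

With this shape no request array has to be searched on completion, and
the remaining guest commands are simply deferred until the current batch
finishes.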

Regards,

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [RFC PATCH v2 13/22] xen/arm: its: Add virtual ITS command support
  2015-05-04 13:44                                                             ` Julien Grall
@ 2015-05-04 13:54                                                               ` Julien Grall
  2015-05-04 15:19                                                                 ` Vijay Kilari
  0 siblings, 1 reply; 109+ messages in thread
From: Julien Grall @ 2015-05-04 13:54 UTC (permalink / raw)
  To: Julien Grall, Vijay Kilari
  Cc: Ian Campbell, Stefano Stabellini, Prasun Kapoor, Vijaya Kumar K,
	Julien Grall, Tim Deegan, xen-devel, Stefano Stabellini,
	manish.jaggi



On 04/05/2015 14:44, Julien Grall wrote:
> Hi Vijay,
>
> On 04/05/2015 14:27, Vijay Kilari wrote:
>> On Mon, May 4, 2015 at 6:34 PM, Julien Grall <julien.grall@citrix.com>
>> wrote:
>>>
>>>
>>> On 04/05/2015 13:58, Vijay Kilari wrote:
>>>>
>>>> On Thu, Apr 30, 2015 at 7:59 PM, Julien Grall <julien.grall@citrix.com>
>>>> wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> On 30/04/15 14:47, Stefano Stabellini wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> If the devid is not within this range, the ITS won't recognize the
>>>>>>>>> value and
>>>>>>>>> won't be able to send the interrupt.
>>>>>>>>>
>>>>>>>>> So this is clearly not the right value.
>>>>>>>>
>>>>>>>>
>>>>>>>> Sure, in that case the maximum value allowed by GITS_TYPER.Devbits.
>>>>>>>> Vijay, what is the value of GITS_TYPER.Devbits on your platform?
>>>>>>>
>>>>>>>
>>>>>>> It is 21 bits
>>>>>>
>>>>>>
>>>>>> I would imagine that 21 bits would be plenty to find an unused devid.
>>>>>>
>>>>>> Alternatively we could use an inexistent function of a real
>>>>>> device, such
>>>>>> as 00:00.1 (function 1 of the host bridge).
>>>>>
>>>>>
>>>>> As discussed IRL, this idea sounds good to me.
>>>>>
>>>>> Although I would be happy with any other way which ensure the devid is
>>>>> free.
>>>>
>>>>
>>>> Has prototyped with 00.00.1 as device id. But I see that Dom0 boot is
>>>> slow compared to polling mode. This could be because Dom0 has to keep
>>>> trapping on creader to check if creader is updated or not.
>>>
>>>
>>> How did you implement the interrupt mode? Could it be improve?
>>
>>     1) In physical ITS driver its_device is created with devID 00:00.1
>> with
>> 256 MSI-x are reserved and is named as completion_dev, which is global.
>
> That's a lot of MSI-x reserved... Can't you use only one per domain?

Hmmm... I meant for all the domains, not "per domain".


>
>>     2) In Domain init,
>>           - one irq (called completion_irq) is allocated per domain for
>> this device
>
> So only 256 domain can run on your platform at the same time?
>
>>             and irq desc is allocated to this domain. This way we can
>> get vITS
>>             context when interrupt is received.
>>           - An array of 32 requests(its_requests) is allocated which
>> stores all
>>             the information about the ITS commands that are converted
>> and written
>>             to physical ITS queue and each request info contains
>>             [CReader vITS] [CWriter vITS] [Physical Q Index] [Number
>> of commands]
>>              [Completion irq] [ status ]
>
> Why 32 requests?
>
> Also some of the fields don't make much sense to me such as "Number of
> Commands" , "Completion IRQ"  and "status" I guess I will find out when
> you will send the new series.
>
>>     3) When vITS received ITS command. This command is converted to
>> physical
>>        command  and written to physical queue, INT command is appended
>> with
>>        completion_irq and entry is made in its_requests.
>>     4) On receiving completion_irq, vITS structure and its_requests
>> info is parsed
>>         and Creader of vITS is upstated with [ Cwriter vITS ] stored
>> for this request
>>         and this request is removed from its_requests.
>
> You complicate the code by allowing 32 batch of command per domain
> (looping is very slow). You should only allow one batch of command per
> domain. When the batch is finished you can send another one.
>
>>   I am adding one INT per command. This can be improved to add one INT
>> cmd for all
>>   the pending commands. Existing Linux driver sends 2 commands at a time.
>
> You should not assume that other OS will send 2 commands at the same
> time... It could be more or less.
>
> Although, having a INT per command is rather slow. One INT command per
> batch would improve the boot time.
>
> Regards,
>

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [RFC PATCH v2 13/22] xen/arm: its: Add virtual ITS command support
  2015-05-04 13:54                                                               ` Julien Grall
@ 2015-05-04 15:19                                                                 ` Vijay Kilari
  2015-05-04 17:00                                                                   ` Julien Grall
  0 siblings, 1 reply; 109+ messages in thread
From: Vijay Kilari @ 2015-05-04 15:19 UTC (permalink / raw)
  To: Julien Grall
  Cc: Ian Campbell, Stefano Stabellini, Prasun Kapoor, Vijaya Kumar K,
	Julien Grall, Tim Deegan, xen-devel, Stefano Stabellini,
	manish.jaggi

On Mon, May 4, 2015 at 7:24 PM, Julien Grall <julien.grall@citrix.com> wrote:
>
>
> On 04/05/2015 14:44, Julien Grall wrote:
>>
>> Hi Vijay,
>>
>> On 04/05/2015 14:27, Vijay Kilari wrote:
>>>
>>> On Mon, May 4, 2015 at 6:34 PM, Julien Grall <julien.grall@citrix.com>
>>> wrote:
>>>>
>>>>
>>>>
>>>> On 04/05/2015 13:58, Vijay Kilari wrote:
>>>>>
>>>>>
>>>>> On Thu, Apr 30, 2015 at 7:59 PM, Julien Grall <julien.grall@citrix.com>
>>>>> wrote:
>>>>>>
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> On 30/04/15 14:47, Stefano Stabellini wrote:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> If the devid is not within this range, the ITS won't recognize the
>>>>>>>>>> value and
>>>>>>>>>> won't be able to send the interrupt.
>>>>>>>>>>
>>>>>>>>>> So this is clearly not the right value.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Sure, in that case the maximum value allowed by GITS_TYPER.Devbits.
>>>>>>>>> Vijay, what is the value of GITS_TYPER.Devbits on your platform?
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> It is 21 bits
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> I would imagine that 21 bits would be plenty to find an unused devid.
>>>>>>>
>>>>>>> Alternatively we could use an inexistent function of a real
>>>>>>> device, such
>>>>>>> as 00:00.1 (function 1 of the host bridge).
>>>>>>
>>>>>>
>>>>>>
>>>>>> As discussed IRL, this idea sounds good to me.
>>>>>>
>>>>>> Although I would be happy with any other way which ensure the devid is
>>>>>> free.
>>>>>
>>>>>
>>>>>
>>>>> Has prototyped with 00.00.1 as device id. But I see that Dom0 boot is
>>>>> slow compared to polling mode. This could be because Dom0 has to keep
>>>>> trapping on creader to check if creader is updated or not.
>>>>
>>>>
>>>>
>>>> How did you implement the interrupt mode? Could it be improve?
>>>
>>>
>>>     1) In physical ITS driver its_device is created with devID 00:00.1
>>> with
>>> 256 MSI-x are reserved and is named as completion_dev, which is global.
>>
>>
>> That's a lot of MSI-x reserved... Can't you use only one per domain?
>
>
> Hmmm... I meant for all the domain, not "per domain".

   The complexity with one IRQ for all domains is that when a completion
interrupt arrives, it is difficult to find out which vITS/domain's ITS
command it was for.
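
A rough sketch of why one completion IRQ per domain makes the lookup
trivial. All names here (completion_owner, completion_bind,
completion_lookup) are made up for this mail and are not the prototype
code; the 256 is the number of MSI-X reserved on the completion device.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define COMPLETION_NR_EVENTS 256u  /* MSI-X reserved on the completion device */

struct vits;  /* opaque per-domain virtual ITS state (hypothetical) */

/* One slot per completion event; the owning vITS is found by index. */
static struct vits *completion_owner[COMPLETION_NR_EVENTS];

/* Bind a free completion event to a domain's vITS at domain init. */
static int completion_bind(uint32_t event, struct vits *v)
{
    assert(v != NULL);
    if (event >= COMPLETION_NR_EVENTS || completion_owner[event] != NULL)
        return -1;
    completion_owner[event] = v;
    return 0;
}

/* On a completion interrupt: event number -> vITS, no searching needed. */
static struct vits *completion_lookup(uint32_t event)
{
    return event < COMPLETION_NR_EVENTS ? completion_owner[event] : NULL;
}
```

With a single shared IRQ there is no such index, so the handler would
need some other way to tell which vITS the completed commands belong to.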

>>
>>>     2) In Domain init,
>>>           - one irq (called completion_irq) is allocated per domain for
>>> this device
>>
>>
>> So only 256 domain can run on your platform at the same time?
>>
>>>             and irq desc is allocated to this domain. This way we can
>>> get vITS
>>>             context when interrupt is received.
>>>           - An array of 32 requests(its_requests) is allocated which
>>> stores all
>>>             the information about the ITS commands that are converted
>>> and written
>>>             to physical ITS queue and each request info contains
>>>             [CReader vITS] [CWriter vITS] [Physical Q Index] [Number
>>> of commands]
>>>              [Completion irq] [ status ]
>>
>>
>> Why 32 requests?

   I thought that 32 requests would be sufficient. It can be increased.

>>
>> Also some of the fields don't make much sense to me such as "Number of
>> Commands" , "Completion IRQ"  and "status" I guess I will find out when
>> you will send the new series.
>>
>>>     3) When vITS received ITS command. This command is converted to
>>> physical
>>>        command  and written to physical queue, INT command is appended
>>> with
>>>        completion_irq and entry is made in its_requests.
>>>     4) On receiving completion_irq, vITS structure and its_requests
>>> info is parsed
>>>         and Creader of vITS is upstated with [ Cwriter vITS ] stored
>>> for this request
>>>         and this request is removed from its_requests.
>>
>>
>> You complicate the code by allowing 32 batch of command per domain
>> (looping is very slow). You should only allow one batch of command per
>> domain. When the batch is finished you can send another one.
>>
      There is an index to the first request and to the last request, so
we don't loop over all 32 requests. Since vITS and physical ITS commands
are always processed in the same order, the completion interrupt always
comes for the first pending request.
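
A minimal sketch of that first/last indexing, assuming in-order
completion. The structure and function names (its_request_ring,
ring_push, ring_complete) and the field names are illustrative only,
not the ones in the prototype.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define ITS_MAX_REQUESTS 32u  /* size of the its_requests array */

/* Hypothetical record of one batch written to the physical queue. */
struct its_request {
    uint64_t vits_creadr;   /* guest CREADR when the batch was queued */
    uint64_t vits_cwriter;  /* guest CWRITER to publish on completion */
    uint32_t phys_index;    /* slot in the physical command queue */
    uint32_t nr_cmds;       /* number of commands in the batch */
};

struct its_request_ring {
    struct its_request req[ITS_MAX_REQUESTS];
    uint32_t first;  /* index of the oldest pending request */
    uint32_t count;  /* number of pending requests */
};

/* Queue a request after the last pending one; fails when the ring is full. */
static bool ring_push(struct its_request_ring *r, const struct its_request *rq)
{
    assert(r->count <= ITS_MAX_REQUESTS);
    if (r->count == ITS_MAX_REQUESTS)
        return false;
    r->req[(r->first + r->count) % ITS_MAX_REQUESTS] = *rq;
    r->count++;
    return true;
}

/*
 * Completion interrupt: commands finish in order, so the interrupt always
 * belongs to the oldest request. Report the CREADR value to publish.
 */
static bool ring_complete(struct its_request_ring *r, uint64_t *new_creadr)
{
    if (r->count == 0)
        return false;
    *new_creadr = r->req[r->first].vits_cwriter;
    r->first = (r->first + 1) % ITS_MAX_REQUESTS;
    r->count--;
    return true;
}
```

Both operations are O(1); nothing walks the 32 entries.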

>>>   I am adding one INT per command. This can be improved to add one INT
>>> cmd for all
>>>   the pending commands. Existing Linux driver sends 2 commands at a time.
>>
>>
>> You should not assume that other OS will send 2 commands at the same
>> time... It could be more or less.
>>
>> Although, having a INT per command is rather slow. One INT command per
>> batch would improve the boot time.

   We cannot limit the number of commands sent at a time. We have to send
all the pending commands in the vITS queue at once when we trap on
CWRITER. Otherwise we have to check for pending interrupts on the
completion interrupt, then translate and send the pending commands in
interrupt context, which complicates things and adds more delay.

Regards
Vijay

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [RFC PATCH v2 13/22] xen/arm: its: Add virtual ITS command support
  2015-05-04 15:19                                                                 ` Vijay Kilari
@ 2015-05-04 17:00                                                                   ` Julien Grall
  2015-05-05 10:28                                                                     ` Stefano Stabellini
  0 siblings, 1 reply; 109+ messages in thread
From: Julien Grall @ 2015-05-04 17:00 UTC (permalink / raw)
  To: Vijay Kilari
  Cc: Ian Campbell, Stefano Stabellini, Prasun Kapoor, Vijaya Kumar K,
	Julien Grall, Tim Deegan, xen-devel, Stefano Stabellini,
	manish.jaggi

Hi Vijay,

On 04/05/2015 16:19, Vijay Kilari wrote:
>>>>> How did you implement the interrupt mode? Could it be improve?
>>>>
>>>>
>>>>      1) In physical ITS driver its_device is created with devID 00:00.1
>>>> with
>>>> 256 MSI-x are reserved and is named as completion_dev, which is global.
>>>
>>>
>>> That's a lot of MSI-x reserved... Can't you use only one per domain?
>>
>>
>> Hmmm... I meant for all the domain, not "per domain".
>
>     Complexity with one irq for all domains is that if completion interrupt
> comes it is difficult to find out  for which vITS/Domain ITS command
> it came for.

While reserving a single devID sounds feasible on all future platforms,
allocating 256 MSI-X sounds more difficult: you assume that any board
will have at least 256 MSI-X free.

Also, this is not scalable. How do you plan to handle more than 256
domains? By increasing the number of reserved MSI-X?

I'm not asking you to implement the latter now... but if increasing the
number of supported domains means rewriting all the completion code and
maybe the vITS, then you should ask yourself whether it's really worth
taking the current approach.

[..]

>>>>    I am adding one INT per command. This can be improved to add one INT
>>>> cmd for all
>>>>    the pending commands. Existing Linux driver sends 2 commands at a time.
>>>
>>>
>>> You should not assume that other OS will send 2 commands at the same
>>> time... It could be more or less.
>>>
>>> Although, having a INT per command is rather slow. One INT command per
>>> batch would improve the boot time.
>
>     We cannot limit on number of commands sent at a time. we have to send all the
> pending commands in vITS queue at a time when trapped on CWRITER, Otherwise
> we have to check for pending interrupts on completion interrupt and translate
> and send pending commands in interrupt context. Which complicates and adds more
> delays.

If we don't limit the number of commands sent, we would allow a domain
to flood the command queue. Other domains would then be unable to send
commands and would likely time out and crash. This is one possible
security issue among many others.

Nobody likes security issues; they impact both end users and the
project. Please keep this security concern in mind before performance.
Performance is usually easier to address later.

As the vITS is only used for interrupt management (mapping, unmapping),
it's not used in a hot path such as receiving interrupts. So we don't
care if it's "slow" from the guest's point of view, as long as we
emulate the behavior correctly without impacting the other domains.

Also, what happens if the physical queue is full? You need to have a way
to inject new commands later.
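
To make the point concrete, here is a rough sketch of capping what a
single CWRITER trap may forward, keeping one slot for a single trailing
INT per batch. VITS_MAX_BATCH and vits_batch_size() are hypothetical
names and the cap of 16 is arbitrary; this is only an illustration, not
a proposed implementation.

```c
#include <assert.h>
#include <stdint.h>

#define VITS_MAX_BATCH 16u  /* hypothetical per-domain cap per batch */

/*
 * How many guest commands to translate and forward now.
 *   pending   - commands between the guest's CREADR and CWRITER
 *   phys_free - free slots in the physical command queue
 * One physical slot is reserved for the INT that ends the batch.
 */
static uint32_t vits_batch_size(uint32_t pending, uint32_t phys_free)
{
    uint32_t n = pending;

    if (phys_free < 2)
        return 0;  /* no room for one command plus its INT */
    if (n > VITS_MAX_BATCH)
        n = VITS_MAX_BATCH;
    if (n > phys_free - 1)
        n = phys_free - 1;

    assert(n <= VITS_MAX_BATCH);
    return n;
}
```

Whatever is not forwarded stays in the guest queue and has to be
injected later, for example when the completion interrupt for the
current batch arrives.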

Overall, I'm aware that the command queue emulation is huge. There are
lots of things to take into account: security, performance, concurrency
problems...

As you did before in this thread, I would suggest that you write down
all the possible solutions and their impacts (security, theoretical
performance...). Then we can talk on the ML (or over an IRC/phone
meeting) and agree on a solution that satisfies everyone.

From experience, doing this may speed up the acceptance of the series,
because everyone will focus on the implementation when you send a new
patch.

One good example is x86 PML support. Intel sent a design doc a couple of
months ago [1]. Developers discussed the overall design, and once a
common agreement was reached they sent the patch series.

The same would have been very helpful for understanding this series.
TBH, I spent most of my time trying to understand what your design was
and how everything works together. On a 4000-line series split into 22
patches, that's a rather big task.

I'm not necessarily asking for exactly the same, though. It could be
part of the cover letter and/or commit messages.

Regards,

[1] http://www.gossamer-threads.com/lists/xen/devel/366537

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [RFC PATCH v2 13/22] xen/arm: its: Add virtual ITS command support
  2015-05-04 17:00                                                                   ` Julien Grall
@ 2015-05-05 10:28                                                                     ` Stefano Stabellini
  2015-05-05 11:06                                                                       ` Vijay Kilari
  2015-05-05 11:08                                                                       ` Julien Grall
  0 siblings, 2 replies; 109+ messages in thread
From: Stefano Stabellini @ 2015-05-05 10:28 UTC (permalink / raw)
  To: Julien Grall
  Cc: Ian Campbell, Vijay Kilari, Stefano Stabellini, Prasun Kapoor,
	Vijaya Kumar K, Julien Grall, Tim Deegan, xen-devel,
	Stefano Stabellini, manish.jaggi

On Mon, 4 May 2015, Julien Grall wrote:
> Hi Vijay,
> 
> On 04/05/2015 16:19, Vijay Kilari wrote:
> > > > > > How did you implement the interrupt mode? Could it be improve?
> > > > > 
> > > > > 
> > > > >      1) In physical ITS driver its_device is created with devID
> > > > > 00:00.1
> > > > > with
> > > > > 256 MSI-x are reserved and is named as completion_dev, which is
> > > > > global.
> > > > 
> > > > 
> > > > That's a lot of MSI-x reserved... Can't you use only one per domain?
> > > 
> > > 
> > > Hmmm... I meant for all the domain, not "per domain".
> > 
> >     Complexity with one irq for all domains is that if completion interrupt
> > comes it is difficult to find out  for which vITS/Domain ITS command
> > it came for.
> 
> While reserving a single devID sounds feasible on all the future platform.
> Allocating 256 MSI-x sounds more difficult, you assume that any board will
> have at least 256 MSI-x free.
> 
> Although, this is not scalable. How do you plan to handle more than 256
> domains? By increasing the number of reserved MSI-x?
> 
> I don't ask you to implement the later now... but if increasing the number of
> domain supported means rewriting all the completion code and maybe the vITS
> then you should ask yourself if it's really worth to take this current
> approach.

As far as I understand there are max 2048 MSI-X per devid and max 8
functions per device (we can continue to 00:00.2, etc). That gives us
16384 max domains with a PCI device assigned to them. We wouldn't use
any of these MSIs for domains without devices assigned to them. Overall
I think it is OK as a limit, as long as we can handle the allocation
efficiently (we cannot really allocate 16384 data structures at boot
time).
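
The arithmetic above, spelled out as a small sketch. The helper names
are made up, and it assumes the ITS devid is the PCI requester ID
(bus:dev.fn packed into 16 bits), which is platform-specific. The
devbits argument is the number of devid bits (21 on Vijay's platform).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MSIX_MAX_VECTORS  2048u  /* MSI-X table size limit per PCI function */
#define PCI_MAX_FUNCTIONS 8u     /* functions per PCI device */

/* bus:dev.fn packed as a requester ID (assumed here to be the ITS devid). */
static uint32_t bdf_to_devid(uint32_t bus, uint32_t dev, uint32_t fn)
{
    assert(bus < 256 && dev < 32 && fn < PCI_MAX_FUNCTIONS);
    return (bus << 8) | (dev << 3) | fn;
}

/* True if devid is representable in the given number of devid bits. */
static bool devid_fits(uint32_t devid, unsigned int devbits)
{
    assert(devbits > 0 && devbits < 32);
    return devid < (1u << devbits);
}

/* Upper bound on completion IRQs if every function of one device is used. */
static uint32_t max_completion_irqs(void)
{
    return MSIX_MAX_VECTORS * PCI_MAX_FUNCTIONS;
}
```

With these assumptions 00:00.1 is devid 1, well within 21 bits, and the
bound is 2048 * 8 = 16384. If function 0 is the real host bridge and
only functions 1..7 are spare, the bound would be 7 * 2048 = 14336
instead.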

Actually even 256 domains with devices assigned to them would be enough
for now, if we don't consume these MSIs with regular domains without PCI
passthrough.


> > > > >    I am adding one INT per command. This can be improved to add one
> > > > > INT
> > > > > cmd for all
> > > > >    the pending commands. Existing Linux driver sends 2 commands at a
> > > > > time.
> > > > 
> > > > 
> > > > You should not assume that other OS will send 2 commands at the same
> > > > time... It could be more or less.
> > > > 
> > > > Although, having a INT per command is rather slow. One INT command per
> > > > batch would improve the boot time.
> > 
> >     We cannot limit on number of commands sent at a time. we have to send
> > all the
> > pending commands in vITS queue at a time when trapped on CWRITER, Otherwise
> > we have to check for pending interrupts on completion interrupt and
> > translate
> > and send pending commands in interrupt context. Which complicates and adds
> > more
> > delays.
> 
> If we don't limit the number of commands sent, we would allow a domain to
> flood the command queue. Therefore, other domains wouldn't be able to send
> command and will likely timeout and crash. This is one possible security issue
> among many others.
> 
> Nobody like security issue, it impacts both end-user and the project. Please
> have this security concern in mind before performance. Performance is usually
> more easier to address later.
> 
> As the vITS is only used for interrupt managing (mapping, unmapping), it's not
> used in hot path such as receiving interrupt. So we don't care if it's "slow"
> from the guest point of view as long as we emulate the behavior correctly
> without impacting the other domain.

I think that rate limiting the guest vITS commands could be done in a
second stage. I wouldn't worry about it for now, not because it is not
important, but because we need to get the basic mechanics right first.
Rome wasn't built in a day.



> Also, what happen if the physical queue is full? You need to have a away to
> inject new commands later.
> 
> Overall I'm aware that the command queue emulation is huge. Lots of things to
> take into account : security, performance, concurrence problem...
> 
> As you did before in this thread, I would suggest you to write down all the
> possible solutions and see what are the impacts (security, theorical
> performances...). So we could talk on the ML (or over an IRC/phone meeting)
> and agree to a solution that satisfy everyone.
> 
> By experience, doing a such things may speed up the acceptance of the series
> because everyone will focus on the implementation when you send a new patch.
>
> One good example is x86 PML support. Intel has sent a design doc a couple of
> months ago [1]. Developers discussed about the overall design, when a common
> agreement has been made they send the patch series.

It is true that a design document could help to clarify complex design
decisions.



> The same would have been very helpful to understand this series. TBH, I spent
> most of my time trying to understand what was your design and how everything
> works together. On a 4000 lines series split in 22 patches it's a rather big
> task.
> 
> Although I don't necessarily ask for exactly the same. It could be part of the
> cover letter and/or commit messages.
> 
> Regards,
> 
> [1] http://www.gossamer-threads.com/lists/xen/devel/366537
> 
> -- 
> Julien Grall
> 

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [RFC PATCH v2 13/22] xen/arm: its: Add virtual ITS command support
  2015-05-04 12:58                                                       ` Vijay Kilari
  2015-05-04 13:04                                                         ` Julien Grall
@ 2015-05-05 10:39                                                         ` Stefano Stabellini
  2015-05-05 11:10                                                           ` Julien Grall
  1 sibling, 1 reply; 109+ messages in thread
From: Stefano Stabellini @ 2015-05-05 10:39 UTC (permalink / raw)
  To: Vijay Kilari
  Cc: Ian Campbell, Stefano Stabellini, Prasun Kapoor, Vijaya Kumar K,
	Julien Grall, Tim Deegan, xen-devel, Julien Grall,
	Stefano Stabellini, manish.jaggi

On Mon, 4 May 2015, Vijay Kilari wrote:
> On Thu, Apr 30, 2015 at 7:59 PM, Julien Grall <julien.grall@citrix.com> wrote:
> > Hi,
> >
> > On 30/04/15 14:47, Stefano Stabellini wrote:
> >>>>>
> >>>>> If the devid is not within this range, the ITS won't recognize the value and
> >>>>> won't be able to send the interrupt.
> >>>>>
> >>>>> So this is clearly not the right value.
> >>>>
> >>>> Sure, in that case the maximum value allowed by GITS_TYPER.Devbits.
> >>>> Vijay, what is the value of GITS_TYPER.Devbits on your platform?
> >>>
> >>> It is 21 bits
> >>
> >> I would imagine that 21 bits would be plenty to find an unused devid.
> >>
> >> Alternatively we could use an inexistent function of a real device, such
> >> as 00:00.1 (function 1 of the host bridge).
> >
> > As discussed IRL, this idea sounds good to me.
> >
> > Although I would be happy with any other way which ensure the devid is free.
> 
> Has prototyped with 00.00.1 as device id. But I see that Dom0 boot is
> slow compared to polling mode.

This is very interesting.


> This could be because Dom0 has to keep
> trapping on creader to check if creader is updated or not.

This sounds like a plausible explanation. That's because the guest is
polling, right? I think we should pause the guest vcpu until we receive
the interrupt.

You can do that by calling vcpu_block.

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [RFC PATCH v2 13/22] xen/arm: its: Add virtual ITS command support
  2015-05-05 10:28                                                                     ` Stefano Stabellini
@ 2015-05-05 11:06                                                                       ` Vijay Kilari
  2015-05-05 11:47                                                                         ` Julien Grall
  2015-05-05 11:08                                                                       ` Julien Grall
  1 sibling, 1 reply; 109+ messages in thread
From: Vijay Kilari @ 2015-05-05 11:06 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Ian Campbell, Prasun Kapoor, Vijaya Kumar K, Julien Grall,
	Tim Deegan, xen-devel, Julien Grall, Stefano Stabellini,
	manish.jaggi

On Tue, May 5, 2015 at 3:58 PM, Stefano Stabellini
<stefano.stabellini@eu.citrix.com> wrote:
> On Mon, 4 May 2015, Julien Grall wrote:
>> Hi Vijay,
>>
>> On 04/05/2015 16:19, Vijay Kilari wrote:
>> > > > > > How did you implement the interrupt mode? Could it be improve?
>> > > > >
>> > > > >
>> > > > >      1) In physical ITS driver its_device is created with devID
>> > > > > 00:00.1
>> > > > > with
>> > > > > 256 MSI-x are reserved and is named as completion_dev, which is
>> > > > > global.
>> > > >
>> > > >
>> > > > That's a lot of MSI-x reserved... Can't you use only one per domain?
>> > >
>> > >
>> > > Hmmm... I meant for all the domain, not "per domain".
>> >
>> >     Complexity with one irq for all domains is that if completion interrupt
>> > comes it is difficult to find out  for which vITS/Domain ITS command
>> > it came for.
>>
>> While reserving a single devID sounds feasible on all the future platform.
>> Allocating 256 MSI-x sounds more difficult, you assume that any board will
>> have at least 256 MSI-x free.
>>
>> Although, this is not scalable. How do you plan to handle more than 256
>> domains? By increasing the number of reserved MSI-x?
>>
>> I don't ask you to implement the later now... but if increasing the number of
>> domain supported means rewriting all the completion code and maybe the vITS
>> then you should ask yourself if it's really worth to take this current
>> approach.
>
> As far as I understand there are max 2048 MSI-X per devid and max 8
> functions per device (we can continue to 00:00.2, etc). That gives us
> 16384 max domains with a PCI device assigned to them. We wouldn't use
> any of these MSIs for domains without devices assigned to them. Overall
> I think is OK as a limit, as long as we can handle the allocation
> efficiently (we cannot really allocate 16384 data structures at boot
> time).
>
> Actually even 256 domains with devices assigned to them would be enough
> for now, if we don't consume these MSIs with regular domains without PCI
> passthrough.

   One MSI per domain is always consumed, because during ITS
initialization each domain creates a virtual ITS and sends the basic
initialization commands (MAPC).

However, unless PCI devices are attached, there will not be any further
ITS commands from the domain.

>
>
>> > > > >    I am adding one INT per command. This can be improved to add one
>> > > > > INT
>> > > > > cmd for all
>> > > > >    the pending commands. Existing Linux driver sends 2 commands at a
>> > > > > time.
>> > > >
>> > > >
>> > > > You should not assume that other OS will send 2 commands at the same
>> > > > time... It could be more or less.
>> > > >
>> > > > Although, having a INT per command is rather slow. One INT command per
>> > > > batch would improve the boot time.
>> >
>> >     We cannot limit on number of commands sent at a time. we have to send
>> > all the
>> > pending commands in vITS queue at a time when trapped on CWRITER, Otherwise
>> > we have to check for pending interrupts on completion interrupt and
>> > translate
>> > and send pending commands in interrupt context. Which complicates and adds
>> > more
>> > delays.
>>
>> If we don't limit the number of commands sent, we would allow a domain to
>> flood the command queue. Therefore, other domains wouldn't be able to send
>> command and will likely timeout and crash. This is one possible security issue
>> among many others.
>>
>> Nobody like security issue, it impacts both end-user and the project. Please
>> have this security concern in mind before performance. Performance is usually
>> more easier to address later.
>>
>> As the vITS is only used for interrupt managing (mapping, unmapping), it's not
>> used in hot path such as receiving interrupt. So we don't care if it's "slow"
>> from the guest point of view as long as we emulate the behavior correctly
>> without impacting the other domain.
>
> I think that rate limiting the guest vITS commands could be done in
> second stage. I wouldn't worry about it for now, not because is not
> important, but because we need to get the basic mechanics right first.
> Rome wasn't built in a day.

Agreed, I already have 4K lines of code.

>
>
>
>> Also, what happen if the physical queue is full? You need to have a away to
>> inject new commands later.
>>
>> Overall I'm aware that the command queue emulation is huge. Lots of things to
>> take into account : security, performance, concurrence problem...
>>
>> As you did before in this thread, I would suggest you to write down all the
>> possible solutions and see what are the impacts (security, theorical
>> performances...). So we could talk on the ML (or over an IRC/phone meeting)
>> and agree to a solution that satisfy everyone.
>>
>> By experience, doing a such things may speed up the acceptance of the series
>> because everyone will focus on the implementation when you send a new patch.
>>
>> One good example is x86 PML support. Intel has sent a design doc a couple of
>> months ago [1]. Developers discussed about the overall design, when a common
>> agreement has been made they send the patch series.
>
> It is true that a design document could help to clarify complex design
> decisions.
>
>
>
>> The same would have been very helpful to understand this series. TBH, I spent
>> most of my time trying to understand what was your design and how everything
>> works together. On a 4000 lines series split in 22 patches it's a rather big
>> task.
>>
>> Although I don't necessarily ask for exactly the same. It could be part of the
>> cover letter and/or commit messages.
>>
>> Regards,
>>
>> [1] http://www.gossamer-threads.com/lists/xen/devel/366537
>>
>> --
>> Julien Grall
>>

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [RFC PATCH v2 13/22] xen/arm: its: Add virtual ITS command support
  2015-05-05 10:28                                                                     ` Stefano Stabellini
  2015-05-05 11:06                                                                       ` Vijay Kilari
@ 2015-05-05 11:08                                                                       ` Julien Grall
  2015-05-05 11:45                                                                         ` Vijay Kilari
  2015-05-05 11:54                                                                         ` Stefano Stabellini
  1 sibling, 2 replies; 109+ messages in thread
From: Julien Grall @ 2015-05-05 11:08 UTC (permalink / raw)
  To: Stefano Stabellini, Julien Grall
  Cc: Ian Campbell, Vijay Kilari, Prasun Kapoor, Vijaya Kumar K,
	Julien Grall, Tim Deegan, xen-devel, Stefano Stabellini,
	manish.jaggi

On 05/05/15 11:28, Stefano Stabellini wrote:
> On Mon, 4 May 2015, Julien Grall wrote:
>> Hi Vijay,
>>
>> On 04/05/2015 16:19, Vijay Kilari wrote:
>>>>>>> How did you implement the interrupt mode? Could it be improve?
>>>>>>
>>>>>>
>>>>>>      1) In physical ITS driver its_device is created with devID
>>>>>> 00:00.1
>>>>>> with
>>>>>> 256 MSI-x are reserved and is named as completion_dev, which is
>>>>>> global.
>>>>>
>>>>>
>>>>> That's a lot of MSI-x reserved... Can't you use only one per domain?
>>>>
>>>>
>>>> Hmmm... I meant for all the domain, not "per domain".
>>>
>>>     Complexity with one irq for all domains is that if completion interrupt
>>> comes it is difficult to find out  for which vITS/Domain ITS command
>>> it came for.
>>
>> While reserving a single devID sounds feasible on all the future platform.
>> Allocating 256 MSI-x sounds more difficult, you assume that any board will
>> have at least 256 MSI-x free.
>>
>> Although, this is not scalable. How do you plan to handle more than 256
>> domains? By increasing the number of reserved MSI-x?
>>
>> I don't ask you to implement the later now... but if increasing the number of
>> domain supported means rewriting all the completion code and maybe the vITS
>> then you should ask yourself if it's really worth to take this current
>> approach.
> 
> As far as I understand there are max 2048 MSI-X per devid and max 8
> functions per device (we can continue to 00:00.2, etc). That gives us
> 16384 max domains with a PCI device assigned to them. We wouldn't use
> any of these MSIs for domains without devices assigned to them. Overall
> I think is OK as a limit, as long as we can handle the allocation
> efficiently (we cannot really allocate 16384 data structures at boot
> time).

You assume that there are enough unused LPIs. This may not be true on
every platform.

> Actually even 256 domains with devices assigned to them would be enough
> for now, if we don't consume these MSIs with regular domains without PCI
> passthrough.

It would need some plumbing in the toolstack to use vITS only when PCI
passthrough is used for the guest.

> 
>>>>>>    I am adding one INT per command. This can be improved to add one
>>>>>> INT
>>>>>> cmd for all
>>>>>>    the pending commands. Existing Linux driver sends 2 commands at a
>>>>>> time.
>>>>>
>>>>>
>>>>> You should not assume that other OS will send 2 commands at the same
>>>>> time... It could be more or less.
>>>>>
>>>>> Although, having a INT per command is rather slow. One INT command per
>>>>> batch would improve the boot time.
>>>
>>>     We cannot limit on number of commands sent at a time. we have to send
>>> all the
>>> pending commands in vITS queue at a time when trapped on CWRITER, Otherwise
>>> we have to check for pending interrupts on completion interrupt and
>>> translate
>>> and send pending commands in interrupt context. Which complicates and adds
>>> more
>>> delays.
>>
>> If we don't limit the number of commands sent, we would allow a domain to
>> flood the command queue. Therefore, other domains wouldn't be able to send
>> command and will likely timeout and crash. This is one possible security issue
>> among many others.
>>
>> Nobody like security issue, it impacts both end-user and the project. Please
>> have this security concern in mind before performance. Performance is usually
>> more easier to address later.
>>
>> As the vITS is only used for interrupt managing (mapping, unmapping), it's not
>> used in hot path such as receiving interrupt. So we don't care if it's "slow"
>> from the guest point of view as long as we emulate the behavior correctly
>> without impacting the other domain.
> 
> I think that rate limiting the guest vITS commands could be done in
> second stage. I wouldn't worry about it for now, not because is not
> important, but because we need to get the basic mechanics right first.
> Rome wasn't built in a day.

Even though Rome wasn't built in a day, the design was well thought out
beforehand...

The command queue is a big part of the vITS and is tightly coupled with
the physical ITS driver. If we don't think about rate limiting in the
design, we may need to heavily rework the ITS.

Regards,

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [RFC PATCH v2 13/22] xen/arm: its: Add virtual ITS command support
  2015-05-05 10:39                                                         ` Stefano Stabellini
@ 2015-05-05 11:10                                                           ` Julien Grall
  2015-05-05 11:57                                                             ` Stefano Stabellini
  0 siblings, 1 reply; 109+ messages in thread
From: Julien Grall @ 2015-05-05 11:10 UTC (permalink / raw)
  To: Stefano Stabellini, Vijay Kilari
  Cc: Ian Campbell, Prasun Kapoor, Vijaya Kumar K, Julien Grall,
	Tim Deegan, xen-devel, Julien Grall, Stefano Stabellini,
	manish.jaggi

On 05/05/15 11:39, Stefano Stabellini wrote:
> On Mon, 4 May 2015, Vijay Kilari wrote:
>> On Thu, Apr 30, 2015 at 7:59 PM, Julien Grall <julien.grall@citrix.com> wrote:
>>> Hi,
>>>
>>> On 30/04/15 14:47, Stefano Stabellini wrote:
>>>>>>>
>>>>>>> If the devid is not within this range, the ITS won't recognize the value and
>>>>>>> won't be able to send the interrupt.
>>>>>>>
>>>>>>> So this is clearly not the right value.
>>>>>>
>>>>>> Sure, in that case the maximum value allowed by GITS_TYPER.Devbits.
>>>>>> Vijay, what is the value of GITS_TYPER.Devbits on your platform?
>>>>>
>>>>> It is 21 bits
>>>>
>>>> I would imagine that 21 bits would be plenty to find an unused devid.
>>>>
>>>> Alternatively we could use an inexistent function of a real device, such
>>>> as 00:00.1 (function 1 of the host bridge).
>>>
>>> As discussed IRL, this idea sounds good to me.
>>>
>>> Although I would be happy with any other way which ensure the devid is free.
>>
>> Has prototyped with 00.00.1 as device id. But I see that Dom0 boot is
>> slow compared to polling mode.
> 
> This is very interesting.
> 
> 
>> This could be because Dom0 has to keep
>> trapping on creader to check if creader is updated or not.
> 
> This sounds like a plausible explanation. That's because the guest is
> polling, right?

He was injecting an INT command for every command sent...

> I think we should pause the guest vcpu until we receive
> the interrupt.
> You can do that by calling vcpu_block.

Blocking the vcpu is not the right thing to do. The processing of the
ITS command queue is asynchronous and the guest may decide to execute
other tasks while waiting.

Regards,

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [RFC PATCH v2 13/22] xen/arm: its: Add virtual ITS command support
  2015-05-05 11:08                                                                       ` Julien Grall
@ 2015-05-05 11:45                                                                         ` Vijay Kilari
  2015-05-05 11:54                                                                         ` Stefano Stabellini
  1 sibling, 0 replies; 109+ messages in thread
From: Vijay Kilari @ 2015-05-05 11:45 UTC (permalink / raw)
  To: Julien Grall
  Cc: Ian Campbell, Stefano Stabellini, Prasun Kapoor, Vijaya Kumar K,
	Julien Grall, Tim Deegan, xen-devel, Stefano Stabellini,
	manish.jaggi

On Tue, May 5, 2015 at 4:38 PM, Julien Grall <julien.grall@citrix.com> wrote:
> On 05/05/15 11:28, Stefano Stabellini wrote:
>> On Mon, 4 May 2015, Julien Grall wrote:
>>> Hi Vijay,
>>>
>>> On 04/05/2015 16:19, Vijay Kilari wrote:
>>>>>>>> How did you implement the interrupt mode? Could it be improve?
>>>>>>>
>>>>>>>
>>>>>>>      1) In physical ITS driver its_device is created with devID
>>>>>>> 00:00.1
>>>>>>> with
>>>>>>> 256 MSI-x are reserved and is named as completion_dev, which is
>>>>>>> global.
>>>>>>
>>>>>>
>>>>>> That's a lot of MSI-x reserved... Can't you use only one per domain?
>>>>>
>>>>>
>>>>> Hmmm... I meant for all the domain, not "per domain".
>>>>
>>>>     Complexity with one irq for all domains is that if completion interrupt
>>>> comes it is difficult to find out  for which vITS/Domain ITS command
>>>> it came for.
>>>
>>> While reserving a single devID sounds feasible on all the future platform.
>>> Allocating 256 MSI-x sounds more difficult, you assume that any board will
>>> have at least 256 MSI-x free.
>>>
>>> Although, this is not scalable. How do you plan to handle more than 256
>>> domains? By increasing the number of reserved MSI-x?
>>>
>>> I don't ask you to implement the later now... but if increasing the number of
>>> domain supported means rewriting all the completion code and maybe the vITS
>>> then you should ask yourself if it's really worth to take this current
>>> approach.
>>
>> As far as I understand there are max 2048 MSI-X per devid and max 8
>> functions per device (we can continue to 00:00.2, etc). That gives us
>> 16384 max domains with a PCI device assigned to them. We wouldn't use
>> any of these MSIs for domains without devices assigned to them. Overall
>> I think is OK as a limit, as long as we can handle the allocation
>> efficiently (we cannot really allocate 16384 data structures at boot
>> time).
>
> You assume that there is enough of LPIs unused. This may not be true on
> every platform.

Below is the note from the spec. The ID width should be at least 14 bits
to support LPIs, which means at least 8192 LPIs are supported on any platform.

Note: an ITS or Distributor implementation might choose to support any
size of LPI identifier field up to and including 32 bits. For example, an
implementation might choose to support 14 bits. Because IDs 0 to 8191 are
used for other classes of interrupt, a 14 bit identifier provides support
for 8192 LPIs. The number supported by software is configured writing a
value to the “IDbits” field in GICR_PROPBASER (see section 5.4.23), subject
to the maximum supported by the implementation (see section 5.11).
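
The arithmetic in the note can be written down as a small standalone
sketch. The function name and the LPI_BASE macro are made up for this
illustration; the only facts taken from the note are that IDs 0 to 8191
are not LPIs and that the identifier field is at most 32 bits wide.

```c
#include <assert.h>
#include <stdint.h>

/* IDs 0..8191 are used for other classes of interrupt. */
#define LPI_BASE 8192u

/* Number of LPIs available for an interrupt ID field of id_bits bits. */
static uint64_t lpi_count(unsigned int id_bits)
{
    uint64_t total;

    assert(id_bits <= 32);

    total = 1ULL << id_bits;

    return total > LPI_BASE ? total - LPI_BASE : 0;
}
```

For a 14 bit identifier this gives 16384 - 8192 = 8192 LPIs, matching
the figure quoted from the spec; a 13 bit identifier leaves none.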


>
>> Actually even 256 domains with devices assigned to them would be enough
>> for now, if we don't consume these MSIs with regular domains without PCI
>> passthrough.
>
> It would need some plumbing in the toolstack to use vITS only when PCI
> passthrough is used for the guest.
>
>>
>>>>>>>    I am adding one INT per command. This can be improved to add one
>>>>>>> INT
>>>>>>> cmd for all
>>>>>>>    the pending commands. Existing Linux driver sends 2 commands at a
>>>>>>> time.
>>>>>>
>>>>>>
>>>>>> You should not assume that other OS will send 2 commands at the same
>>>>>> time... It could be more or less.
>>>>>>
>>>>>> Although, having a INT per command is rather slow. One INT command per
>>>>>> batch would improve the boot time.
>>>>
>>>>     We cannot limit on number of commands sent at a time. we have to send
>>>> all the
>>>> pending commands in vITS queue at a time when trapped on CWRITER, Otherwise
>>>> we have to check for pending interrupts on completion interrupt and
>>>> translate
>>>> and send pending commands in interrupt context. Which complicates and adds
>>>> more
>>>> delays.
>>>
>>> If we don't limit the number of commands sent, we would allow a domain to
>>> flood the command queue. Therefore, other domains wouldn't be able to send
>>> command and will likely timeout and crash. This is one possible security issue
>>> among many others.
>>>
>>> Nobody like security issue, it impacts both end-user and the project. Please
>>> have this security concern in mind before performance. Performance is usually
>>> more easier to address later.
>>>
>>> As the vITS is only used for interrupt managing (mapping, unmapping), it's not
>>> used in hot path such as receiving interrupt. So we don't care if it's "slow"
>>> from the guest point of view as long as we emulate the behavior correctly
>>> without impacting the other domain.
>>
>> I think that rate limiting the guest vITS commands could be done in
>> second stage. I wouldn't worry about it for now, not because is not
>> important, but because we need to get the basic mechanics right first.
>> Rome wasn't built in a day.
>
> Even though Rome wasn't built in a day, the design has been well though
> before...
>
> The command queue is the big part of the vITS and tight with the
> physical ITS driver. If we don't think about rate limiting in the
> design, we may need to rework heavily the ITS.
>
> Regards,
>
> --
> Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [RFC PATCH v2 13/22] xen/arm: its: Add virtual ITS command support
  2015-05-05 11:06                                                                       ` Vijay Kilari
@ 2015-05-05 11:47                                                                         ` Julien Grall
  2015-05-05 12:00                                                                           ` Vijay Kilari
  0 siblings, 1 reply; 109+ messages in thread
From: Julien Grall @ 2015-05-05 11:47 UTC (permalink / raw)
  To: Vijay Kilari, Stefano Stabellini
  Cc: Ian Campbell, Prasun Kapoor, Vijaya Kumar K, Julien Grall,
	Tim Deegan, xen-devel, Julien Grall, Stefano Stabellini,
	manish.jaggi

On 05/05/15 12:06, Vijay Kilari wrote:
>    One MSI per domain is always consumed because each domain during
> ITS initialization creates Virtual ITS and sends basic initialization
> commands (MAPC)

AFAICT, MAPC won't be translated to a physical command. So no completion
interrupt is necessary.

Couldn't we defer allocating the completion interrupt until a PCI device
is passed through or a command requiring a physical command is sent?

Regards,

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [RFC PATCH v2 13/22] xen/arm: its: Add virtual ITS command support
  2015-05-05 11:08                                                                       ` Julien Grall
  2015-05-05 11:45                                                                         ` Vijay Kilari
@ 2015-05-05 11:54                                                                         ` Stefano Stabellini
  1 sibling, 0 replies; 109+ messages in thread
From: Stefano Stabellini @ 2015-05-05 11:54 UTC (permalink / raw)
  To: Julien Grall
  Cc: Ian Campbell, Vijay Kilari, Stefano Stabellini, Prasun Kapoor,
	Vijaya Kumar K, Julien Grall, Tim Deegan, xen-devel,
	Stefano Stabellini, manish.jaggi

On Tue, 5 May 2015, Julien Grall wrote:
> On 05/05/15 11:28, Stefano Stabellini wrote:
> > On Mon, 4 May 2015, Julien Grall wrote:
> >> Hi Vijay,
> >>
> >> On 04/05/2015 16:19, Vijay Kilari wrote:
> >>>>>>> How did you implement the interrupt mode? Could it be improve?
> >>>>>>
> >>>>>>
> >>>>>>      1) In physical ITS driver its_device is created with devID
> >>>>>> 00:00.1
> >>>>>> with
> >>>>>> 256 MSI-x are reserved and is named as completion_dev, which is
> >>>>>> global.
> >>>>>
> >>>>>
> >>>>> That's a lot of MSI-x reserved... Can't you use only one per domain?
> >>>>
> >>>>
> >>>> Hmmm... I meant for all the domain, not "per domain".
> >>>
> >>>     Complexity with one irq for all domains is that if completion interrupt
> >>> comes it is difficult to find out  for which vITS/Domain ITS command
> >>> it came for.
> >>
> >> While reserving a single devID sounds feasible on all the future platform.
> >> Allocating 256 MSI-x sounds more difficult, you assume that any board will
> >> have at least 256 MSI-x free.
> >>
> >> Although, this is not scalable. How do you plan to handle more than 256
> >> domains? By increasing the number of reserved MSI-x?
> >>
> >> I don't ask you to implement the later now... but if increasing the number of
> >> domain supported means rewriting all the completion code and maybe the vITS
> >> then you should ask yourself if it's really worth to take this current
> >> approach.
> > 
> > As far as I understand there are max 2048 MSI-X per devid and max 8
> > functions per device (we can continue to 00:00.2, etc). That gives us
> > 16384 max domains with a PCI device assigned to them. We wouldn't use
> > any of these MSIs for domains without devices assigned to them. Overall
> > I think is OK as a limit, as long as we can handle the allocation
> > efficiently (we cannot really allocate 16384 data structures at boot
> > time).
> 
> You assume that there is enough of LPIs unused. This may not be true on
> every platform.

In that case, we just fail PCI device assignment. It is OK to fail when
no hw resources are available.  A guest without PCI devices should boot
without issues though. It is important that guests without PCI devices
assigned continue operating as usual.


> > Actually even 256 domains with devices assigned to them would be enough
> > for now, if we don't consume these MSIs with regular domains without PCI
> > passthrough.
> 
> It would need some plumbing in the toolstack to use vITS only when PCI
> passthrough is used for the guest.

Given that in a regular DomU there are no PCI devices (emulated or
otherwise), the vITS should be completely unused, or useless anyway. So
it is OK to have it in place, as long as it doesn't do anything and it
doesn't allocate any resources. If a PCI device has been assigned since
boot time, or when a PCI device is hotplugged into the guest, then we
allocate resources and start doing something useful.
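
A minimal sketch of this "allocate only on first device assignment"
idea, outside of Xen. All names here are hypothetical; the pool size
comes from the 2048 MSI-X per devid times 8 functions figure quoted
above, and releasing an MSI on domain destruction is omitted to keep
the example short.

```c
#include <stddef.h>

#define MSIX_PER_DEVID       2048
#define FUNCS_PER_DEVICE     8
#define COMPLETION_POOL_MAX  (MSIX_PER_DEVID * FUNCS_PER_DEVICE)

/* Hypothetical pool of completion MSIs, tracked with a bitmap. */
struct completion_pool {
    unsigned int capacity;  /* usable entries, <= COMPLETION_POOL_MAX */
    unsigned char used[COMPLETION_POOL_MAX / 8];
};

/* Hypothetical per-domain state; completion_msi is -1 until needed. */
struct vdomain {
    int completion_msi;
    unsigned int nr_pci_devs;
};

/* Return a free MSI index, or -1 if the pool is exhausted. */
static int pool_alloc(struct completion_pool *p)
{
    for (unsigned int i = 0; i < p->capacity; i++) {
        if (!(p->used[i / 8] & (1u << (i % 8)))) {
            p->used[i / 8] |= 1u << (i % 8);
            return (int)i;
        }
    }
    return -1;
}

/*
 * Assign a PCI device to a domain. The completion MSI is allocated on
 * the first assignment only; if none is left, the assignment fails and
 * the domain is left untouched. Returns 0 on success, -1 on failure.
 */
static int domain_assign_pci(struct completion_pool *p, struct vdomain *d)
{
    if (d->completion_msi < 0) {
        int msi = pool_alloc(p);

        if (msi < 0)
            return -1;
        d->completion_msi = msi;
    }
    d->nr_pci_devs++;
    return 0;
}
```

A domain that never calls domain_assign_pci() keeps completion_msi at
-1 and consumes nothing from the pool, which is the property asked for
here.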

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [RFC PATCH v2 13/22] xen/arm: its: Add virtual ITS command support
  2015-05-05 11:10                                                           ` Julien Grall
@ 2015-05-05 11:57                                                             ` Stefano Stabellini
  2015-05-05 12:03                                                               ` Julien Grall
  0 siblings, 1 reply; 109+ messages in thread
From: Stefano Stabellini @ 2015-05-05 11:57 UTC (permalink / raw)
  To: Julien Grall
  Cc: Ian Campbell, Vijay Kilari, Stefano Stabellini, Prasun Kapoor,
	Vijaya Kumar K, Julien Grall, Tim Deegan, xen-devel,
	Stefano Stabellini, manish.jaggi

On Tue, 5 May 2015, Julien Grall wrote:
> On 05/05/15 11:39, Stefano Stabellini wrote:
> > On Mon, 4 May 2015, Vijay Kilari wrote:
> >> On Thu, Apr 30, 2015 at 7:59 PM, Julien Grall <julien.grall@citrix.com> wrote:
> >>> Hi,
> >>>
> >>> On 30/04/15 14:47, Stefano Stabellini wrote:
> >>>>>>>
> >>>>>>> If the devid is not within this range, the ITS won't recognize the value and
> >>>>>>> won't be able to send the interrupt.
> >>>>>>>
> >>>>>>> So this is clearly not the right value.
> >>>>>>
> >>>>>> Sure, in that case the maximum value allowed by GITS_TYPER.Devbits.
> >>>>>> Vijay, what is the value of GITS_TYPER.Devbits on your platform?
> >>>>>
> >>>>> It is 21 bits
> >>>>
> >>>> I would imagine that 21 bits would be plenty to find an unused devid.
> >>>>
> >>>> Alternatively we could use an inexistent function of a real device, such
> >>>> as 00:00.1 (function 1 of the host bridge).
> >>>
> >>> As discussed IRL, this idea sounds good to me.
> >>>
> >>> Although I would be happy with any other way which ensure the devid is free.
> >>
> >> Has prototyped with 00.00.1 as device id. But I see that Dom0 boot is
> >> slow compared to polling mode.
> > 
> > This is very interesting.
> > 
> > 
> >> This could be because Dom0 has to keep
> >> trapping on creader to check if creader is updated or not.
> > 
> > This sounds like a plausible explanation. That's because the guest is
> > polling, right?
> 
> He was inject an INT command for every command sent...
> 
> > I think we should pause the guest vcpu until we receive
> > the interrupt.
> > You can do that by calling vcpu_block.
> 
> Blocking the vcpu is not the right thing to do. The processing of the
> ITS command queue is asynchronous and the guest may decide to execute
> other task while waiting.

Sure, but the hypervisor might have other things to do and other vcpus
to schedule as well. It is a matter of priorities and scheduling the
most appropriate workload. Keep in mind that vcpu_block doesn't
completely pause the guest; it only blocks the vcpu until the next
guest event occurs.

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [RFC PATCH v2 13/22] xen/arm: its: Add virtual ITS command support
  2015-05-05 11:47                                                                         ` Julien Grall
@ 2015-05-05 12:00                                                                           ` Vijay Kilari
  2015-05-05 12:08                                                                             ` Julien Grall
  0 siblings, 1 reply; 109+ messages in thread
From: Vijay Kilari @ 2015-05-05 12:00 UTC (permalink / raw)
  To: Julien Grall
  Cc: Ian Campbell, Stefano Stabellini, Prasun Kapoor, Vijaya Kumar K,
	Julien Grall, Tim Deegan, xen-devel, Stefano Stabellini,
	manish.jaggi

On Tue, May 5, 2015 at 5:17 PM, Julien Grall <julien.grall@citrix.com> wrote:
> On 05/05/15 12:06, Vijay Kilari wrote:
>>    One MSI per domain is always consumed because each domain during
>> ITS initialization creates Virtual ITS and sends basic initialization
>> commands (MAPC)
>
> AFAICT, MAPC won't be translated to a physical command. So not interrupt
> completion is necessary.
>
> Couldn't we defer the interrupt completion allocation until a PCI device
> is passthrough or a command requiring physical command is sent?

Yes, but MAPC is generally followed by a SYNC command, and an INVALL
command is also received during guest driver init.

One option that I can think of is to avoid posting to the physical queue
and allocating the completion interrupt until the first MAPD command
arrives for that domain.

Regards
Vijay

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [RFC PATCH v2 13/22] xen/arm: its: Add virtual ITS command support
  2015-05-05 11:57                                                             ` Stefano Stabellini
@ 2015-05-05 12:03                                                               ` Julien Grall
  0 siblings, 0 replies; 109+ messages in thread
From: Julien Grall @ 2015-05-05 12:03 UTC (permalink / raw)
  To: Stefano Stabellini, Julien Grall
  Cc: Ian Campbell, Vijay Kilari, Prasun Kapoor, Vijaya Kumar K,
	Julien Grall, Tim Deegan, xen-devel, Stefano Stabellini,
	manish.jaggi

On 05/05/15 12:57, Stefano Stabellini wrote:
> On Tue, 5 May 2015, Julien Grall wrote:
>>> I think we should pause the guest vcpu until we receive
>>> the interrupt.
>>> You can do that by calling vcpu_block.
>>
>> Blocking the vcpu is not the right thing to do. The processing of the
>> ITS command queue is asynchronous and the guest may decide to execute
>> other task while waiting.
> 
> Sure, but the hypervisor might have other things to do and other vcpus
> to schedule as well. It is a matter of priorities and scheduling the
> most appropriate workload. Keep in mind that vcpu_block doesn't
> completely pause the guest, it just only blocks the vcpu until the next
> guest event occurs.

I don't think it will help the guest boot faster. Linux tries to read
CREADR (the completion register) every millisecond. I suspect we will
receive an IRQ faster than that.

Regards,

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [RFC PATCH v2 13/22] xen/arm: its: Add virtual ITS command support
  2015-05-05 12:00                                                                           ` Vijay Kilari
@ 2015-05-05 12:08                                                                             ` Julien Grall
  0 siblings, 0 replies; 109+ messages in thread
From: Julien Grall @ 2015-05-05 12:08 UTC (permalink / raw)
  To: Vijay Kilari, Julien Grall
  Cc: Ian Campbell, Stefano Stabellini, Prasun Kapoor, Vijaya Kumar K,
	Julien Grall, Tim Deegan, xen-devel, Stefano Stabellini,
	manish.jaggi

On 05/05/15 13:00, Vijay Kilari wrote:
> On Tue, May 5, 2015 at 5:17 PM, Julien Grall <julien.grall@citrix.com> wrote:
>> On 05/05/15 12:06, Vijay Kilari wrote:
>>>    One MSI per domain is always consumed because each domain during
>>> ITS initialization creates Virtual ITS and sends basic initialization
>>> commands (MAPC)
>>
>> AFAICT, MAPC won't be translated to a physical command. So not interrupt
>> completion is necessary.
>>
>> Couldn't we defer the interrupt completion allocation until a PCI device
>> is passthrough or a command requiring physical command is sent?
> 
> yes, but MAPC is generally followed by SYNC command and INVALL command
> is also received during guest driver init.

You could ignore SYNC and INVALL until a PCI device is passed through
to the guest.

> On option that I can think of is that avoid posting  to physical
> queue/completion
> interrupt allocation, untill first MAPD command arrives for that domain.

I prefer the solution where we ignore any command until a PCI device is
passed through. It's more logical.
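
As a rough sketch of that policy (not the actual vITS code; the enum and
function names are invented for this example, and treating MAPC as
emulated-only follows the "AFAICT" remark quoted above rather than a
settled design):

```c
#include <stdbool.h>
#include <stdint.h>

/* ITS command numbers for the commands mentioned in this thread. */
enum its_cmd_id {
    ITS_CMD_INT    = 0x03,
    ITS_CMD_SYNC   = 0x05,
    ITS_CMD_MAPD   = 0x08,
    ITS_CMD_MAPC   = 0x09,
    ITS_CMD_INVALL = 0x0d,
};

/* What the vITS does with a guest command (hypothetical). */
enum vits_action {
    VITS_EMULATE_ONLY,  /* update virtual state, nothing sent to HW */
    VITS_FORWARD,       /* translate and post to the physical queue */
};

static enum vits_action vits_cmd_action(uint8_t cmd, bool has_pci_device)
{
    /* No device passed through: no command reaches the hardware. */
    if (!has_pci_device)
        return VITS_EMULATE_ONLY;

    /* Assumption: MAPC never needs a physical command. */
    if (cmd == ITS_CMD_MAPC)
        return VITS_EMULATE_ONLY;

    return VITS_FORWARD;
}
```

Under this policy a guest without PCI passthrough never needs a
completion interrupt, since nothing is ever posted on its behalf.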

Regards,

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [RFC PATCH v2 09/22] xen/arm: its: Add helper functions to decode ITS Command
  2015-04-01 11:40   ` Ian Campbell
@ 2015-05-11 14:14     ` Vijay Kilari
  2015-05-11 14:25       ` Julien Grall
  0 siblings, 1 reply; 109+ messages in thread
From: Vijay Kilari @ 2015-05-11 14:14 UTC (permalink / raw)
  To: Ian Campbell
  Cc: Stefano Stabellini, Prasun Kapoor, Vijaya Kumar K, Julien Grall,
	Tim Deegan, xen-devel, Stefano Stabellini, manish.jaggi

On Wed, Apr 1, 2015 at 5:10 PM, Ian Campbell <ian.campbell@citrix.com> wrote:
> On Thu, 2015-03-19 at 20:07 +0530, vijay.kilari@gmail.com wrote:
>> From: Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com>
>>
>> Add helper functions to decode ITS command
>> This will be useful for Virtual ITS driver
>
> It depends slightly on the answer to the quesiton I asked on patch #6,
> but in general in Xen we have preferred to define a structure/union
> overlaying the processor's view of such things and to use access to
> those fields, see e.g. the hsr decode or the pte stuff.

   This makes heavy and simple changes. I prefer to make it a separate
patch series.
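
For reference, the structure/union overlay suggested in the quoted
review could look roughly like the sketch below. This is only an
illustration: the type and field names are invented, only the fields of
an INT command are shown, and the bitfield layout assumes a
little-endian target with GCC-style allocation from the least
significant bit (64-bit bitfields are themselves a compiler extension).

```c
#include <stdint.h>

/* A 32-byte ITS command viewed either as raw words or as fields. */
typedef union {
    uint64_t dw[4];            /* raw view: four 64-bit words */
    struct {
        uint64_t cmd:8;        /* DW0[7:0]:   command number */
        uint64_t res0:24;
        uint64_t devid:32;     /* DW0[63:32]: DeviceID */
        uint64_t event:32;     /* DW1[31:0]:  EventID */
        uint64_t res1:32;
        uint64_t dw2;
        uint64_t dw3;
    } int_cmd;                 /* fields used by the INT command */
} its_cmd_t;

_Static_assert(sizeof(its_cmd_t) == 32, "ITS commands are 32 bytes");

/* Field accessors replace open-coded shift and mask helpers. */
static inline uint8_t its_cmd_number(const its_cmd_t *c)
{
    return (uint8_t)c->int_cmd.cmd;
}

static inline uint32_t its_cmd_devid(const its_cmd_t *c)
{
    return (uint32_t)c->int_cmd.devid;
}
```

The decode helpers in the patch would then reduce to field accesses on
such a union, one struct per command layout.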

Regards
Vijay

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [RFC PATCH v2 09/22] xen/arm: its: Add helper functions to decode ITS Command
  2015-05-11 14:14     ` Vijay Kilari
@ 2015-05-11 14:25       ` Julien Grall
  2015-05-11 14:25         ` Julien Grall
  0 siblings, 1 reply; 109+ messages in thread
From: Julien Grall @ 2015-05-11 14:25 UTC (permalink / raw)
  To: Vijay Kilari, Ian Campbell
  Cc: Stefano Stabellini, Prasun Kapoor, Vijaya Kumar K, Julien Grall,
	Tim Deegan, xen-devel, Stefano Stabellini, manish.jaggi

Hi Vijay,

On 11/05/15 15:14, Vijay Kilari wrote:
> On Wed, Apr 1, 2015 at 5:10 PM, Ian Campbell <ian.campbell@citrix.com> wrote:
>> On Thu, 2015-03-19 at 20:07 +0530, vijay.kilari@gmail.com wrote:
>>> From: Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com>
>>>
>>> Add helper functions to decode ITS command
>>> This will be useful for Virtual ITS driver
>>
>> It depends slightly on the answer to the quesiton I asked on patch #6,
>> but in general in Xen we have preferred to define a structure/union
>> overlaying the processor's view of such things and to use access to
>> those fields, see e.g. the hsr decode or the pte stuff.
> 
>    This make heavy and simple changes. I prefer to make it a separate
> patch series.

We are at an early stage of the review process: it's the second version
and the design is not yet set in stone. So it's normal to have heavy
changes in the code between 2 versions.

Also, when you say heavy, compared to what? The GICv3 ITS driver?

Regards,

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [RFC PATCH v2 09/22] xen/arm: its: Add helper functions to decode ITS Command
  2015-05-11 14:25       ` Julien Grall
@ 2015-05-11 14:25         ` Julien Grall
  2015-05-11 14:36           ` Vijay Kilari
  0 siblings, 1 reply; 109+ messages in thread
From: Julien Grall @ 2015-05-11 14:25 UTC (permalink / raw)
  To: Julien Grall, Vijay Kilari, Ian Campbell
  Cc: Stefano Stabellini, Prasun Kapoor, Vijaya Kumar K, Julien Grall,
	Tim Deegan, xen-devel, Stefano Stabellini, manish.jaggi

On 11/05/15 15:25, Julien Grall wrote:
> Hi Vijay,
> 
> On 11/05/15 15:14, Vijay Kilari wrote:
>> On Wed, Apr 1, 2015 at 5:10 PM, Ian Campbell <ian.campbell@citrix.com> wrote:
>>> On Thu, 2015-03-19 at 20:07 +0530, vijay.kilari@gmail.com wrote:
>>>> From: Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com>
>>>>
>>>> Add helper functions to decode ITS command
>>>> This will be useful for Virtual ITS driver
>>>
>>> It depends slightly on the answer to the quesiton I asked on patch #6,
>>> but in general in Xen we have preferred to define a structure/union
>>> overlaying the processor's view of such things and to use access to
>>> those fields, see e.g. the hsr decode or the pte stuff.
>>
>>    This make heavy and simple changes. I prefer to make it a separate
>> patch series.
> 
> We are at early stage of the review process, it's the second version and
> the design is not yet set in stone. So it's normal to have heavy changes
> in the code between 2 versions.
> 
> Although, when you say heavy, it's compare to what? The GICv3 ITS driver?

* from Linux?


-- 
Julien Grall

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [RFC PATCH v2 09/22] xen/arm: its: Add helper functions to decode ITS Command
  2015-05-11 14:25         ` Julien Grall
@ 2015-05-11 14:36           ` Vijay Kilari
  2015-05-11 22:06             ` Julien Grall
  0 siblings, 1 reply; 109+ messages in thread
From: Vijay Kilari @ 2015-05-11 14:36 UTC (permalink / raw)
  To: Julien Grall
  Cc: Ian Campbell, Stefano Stabellini, Prasun Kapoor, Vijaya Kumar K,
	Julien Grall, Tim Deegan, xen-devel, Stefano Stabellini,
	manish.jaggi

On Mon, May 11, 2015 at 7:55 PM, Julien Grall <julien.grall@citrix.com> wrote:
> On 11/05/15 15:25, Julien Grall wrote:
>> Hi Vijay,
>>
>> On 11/05/15 15:14, Vijay Kilari wrote:
>>> On Wed, Apr 1, 2015 at 5:10 PM, Ian Campbell <ian.campbell@citrix.com> wrote:
>>>> On Thu, 2015-03-19 at 20:07 +0530, vijay.kilari@gmail.com wrote:
>>>>> From: Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com>
>>>>>
>>>>> Add helper functions to decode ITS command
>>>>> This will be useful for Virtual ITS driver
>>>>
>>>> It depends slightly on the answer to the quesiton I asked on patch #6,
>>>> but in general in Xen we have preferred to define a structure/union
>>>> overlaying the processor's view of such things and to use access to
>>>> those fields, see e.g. the hsr decode or the pte stuff.
>>>
>>>    This make heavy and simple changes. I prefer to make it a separate
>>> patch series.
>>
>> We are at early stage of the review process, it's the second version and
>> the design is not yet set in stone. So it's normal to have heavy changes
>> in the code between 2 versions.
>>
>> Although, when you say heavy, it's compare to what? The GICv3 ITS driver?
>
> * from Linux?

Yes.

^ permalink raw reply	[flat|nested] 109+ messages in thread

* Re: [RFC PATCH v2 09/22] xen/arm: its: Add helper functions to decode ITS Command
  2015-05-11 14:36           ` Vijay Kilari
@ 2015-05-11 22:06             ` Julien Grall
  0 siblings, 0 replies; 109+ messages in thread
From: Julien Grall @ 2015-05-11 22:06 UTC (permalink / raw)
  To: Vijay Kilari, Julien Grall
  Cc: Ian Campbell, Stefano Stabellini, Prasun Kapoor, Vijaya Kumar K,
	Julien Grall, Tim Deegan, xen-devel, Stefano Stabellini,
	manish.jaggi

Hi,

On 11/05/2015 15:36, Vijay Kilari wrote:
> On Mon, May 11, 2015 at 7:55 PM, Julien Grall <julien.grall@citrix.com> wrote:
>> On 11/05/15 15:25, Julien Grall wrote:
>>> Hi Vijay,
>>>
>>> On 11/05/15 15:14, Vijay Kilari wrote:
>>>> On Wed, Apr 1, 2015 at 5:10 PM, Ian Campbell <ian.campbell@citrix.com> wrote:
>>>>> On Thu, 2015-03-19 at 20:07 +0530, vijay.kilari@gmail.com wrote:
>>>>>> From: Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com>
>>>>>>
>>>>>> Add helper functions to decode ITS command
>>>>>> This will be useful for Virtual ITS driver
>>>>>
>>>>> It depends slightly on the answer to the quesiton I asked on patch #6,
>>>>> but in general in Xen we have preferred to define a structure/union
>>>>> overlaying the processor's view of such things and to use access to
>>>>> those fields, see e.g. the hsr decode or the pte stuff.
>>>>
>>>>     This make heavy and simple changes. I prefer to make it a separate
>>>> patch series.
>>>
>>> We are at early stage of the review process, it's the second version and
>>> the design is not yet set in stone. So it's normal to have heavy changes
>>> in the code between 2 versions.
>>>
>>> Although, when you say heavy, it's compare to what? The GICv3 ITS driver?
>>
>> * from Linux?
>
> Yes.

I share the concern Ian raised in a previous mail [1]. The diff between
the Linux driver and the Xen driver is huge. Removing unnecessary code,
moving code to another file... doesn't help when porting fixes from Linux.
How would you know whether a function was removed/moved for good reasons?
What would you do if you have to bring back a function?

Your previous mail saying "This makes heavy and simple changes. I prefer
to make it a separate patch series." has convinced me that you are not
using the GICv3 ITS driver from Linux in order to ease backporting fixes
later (even though you stated the opposite in multiple mails [2], [3]...).

To help backporting, the driver needs to have very small changes.
Even removing/moving code is too much: the resulting patch won't be
similar to the Linux one and it will be harder to know whether it's
valid or not.

When I worked on the SMMU drivers last year, I chose to diverge from
Linux because the page table wasn't shared. The base was from Linux but
heavily changed. The code went in, but a few months later we noticed it
was hard to maintain and that the API is fairly similar to Linux's. So we
decided to move to a driver very close to the Linux one.

Now for the GICv3 ITS driver... the final design of the ITS in Xen will
be very different from the Linux one:
	- Commands are generated by the vITS and not the ITS driver
	- The Linux ITS driver waits for the completion of each command
	  before continuing, which is not possible for Xen
	- and so on...

TBH, for these reasons, I don't see how it will be possible to keep the
Xen driver in sync...

If your driver is not synced to the latest version of Linux (which I
guess is the case), try taking the patches and backporting them one by
one on top of your series. You will see how difficult it is to
backport to a diverging driver.

Regards,


[1] 
http://lists.xenproject.org/archives/html/xen-devel/2015-04/msg00235.html

[2] 
http://lists.xenproject.org/archives/html/xen-devel/2015-04/msg00201.html

[3] 
http://lists.xenproject.org/archives/html/xen-devel/2015-04/msg00246.html




-- 
Julien Grall

^ permalink raw reply	[flat|nested] 109+ messages in thread

end of thread, other threads:[~2015-05-11 22:06 UTC | newest]

Thread overview: 109+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-03-19 14:37 [RFC PATCH v2 00/22] xen/arm: Add ITS support vijay.kilari
2015-03-19 14:37 ` [RFC PATCH v2 01/22] add linked list apis vijay.kilari
2015-03-19 14:37 ` [RFC PATCH v2 02/22] Use linked list accessors for page_list helper function vijay.kilari
2015-03-19 14:37 ` [RFC PATCH v2 03/22] xen/arm: Add bitmap_find_next_zero_area " vijay.kilari
2015-03-20 13:35   ` Julien Grall
2015-03-19 14:37 ` [RFC PATCH v2 04/22] xen/arm: its: Import GICv3 ITS driver from linux vijay.kilari
2015-03-19 14:37 ` [RFC PATCH v2 05/22] xen/arm: gicv3: Refactor redistributor information vijay.kilari
2015-03-19 14:37 ` [RFC PATCH v2 06/22] xen/arm: its: Port ITS driver to xen vijay.kilari
2015-03-20 15:06   ` Julien Grall
2015-03-23 12:24     ` Vijay Kilari
2015-03-23 13:27       ` Julien Grall
2015-04-01 11:34   ` Ian Campbell
2015-04-02  8:25     ` Vijay Kilari
2015-04-02  9:25       ` Ian Campbell
2015-04-02 10:05         ` Vijay Kilari
2015-04-02 13:57       ` Julien Grall
2015-03-19 14:37 ` [RFC PATCH v2 07/22] xen/arm: its: Move ITS command encode helper functions vijay.kilari
2015-03-19 14:37 ` [RFC PATCH v2 08/22] xen/arm: its: Remove unused code in ITS driver vijay.kilari
2015-03-19 14:37 ` [RFC PATCH v2 09/22] xen/arm: its: Add helper functions to decode ITS Command vijay.kilari
2015-04-01 11:40   ` Ian Campbell
2015-05-11 14:14     ` Vijay Kilari
2015-05-11 14:25       ` Julien Grall
2015-05-11 14:25         ` Julien Grall
2015-05-11 14:36           ` Vijay Kilari
2015-05-11 22:06             ` Julien Grall
2015-03-19 14:37 ` [RFC PATCH v2 10/22] xen/arm: Add helper function to get domain page vijay.kilari
2015-03-20 16:39   ` Julien Grall
2015-03-19 14:37 ` [RFC PATCH v2 11/22] xen/arm: its: Move its_device structure to header file vijay.kilari
2015-03-19 14:37 ` [RFC PATCH v2 12/22] xen/arm: its: Update irq descriptor for LPIs support vijay.kilari
2015-03-20 16:44   ` Julien Grall
2015-03-30 14:32     ` Vijay Kilari
2015-03-30 15:29       ` Julien Grall
2015-03-19 14:38 ` [RFC PATCH v2 13/22] xen/arm: its: Add virtual ITS command support vijay.kilari
2015-03-21  0:28   ` Julien Grall
2015-03-23 15:52   ` Julien Grall
2015-03-24 11:48   ` Julien Grall
2015-03-30 15:02     ` Vijay Kilari
2015-03-30 15:47       ` Julien Grall
2015-04-01 11:46         ` Ian Campbell
2015-04-01 12:02           ` Julien Grall
2015-04-02  9:13             ` Ian Campbell
2015-04-02 11:06               ` Julien Grall
2015-04-02 11:18                 ` Ian Campbell
2015-04-02 13:47                   ` Julien Grall
2015-04-28  9:28                     ` Vijay Kilari
2015-04-28  9:56                       ` Stefano Stabellini
2015-04-28 10:35                         ` Julien Grall
2015-04-28 11:36                           ` Vijay Kilari
2015-04-28 16:15                             ` Julien Grall
2015-04-29  1:44                               ` Vijay Kilari
2015-04-29 11:56                                 ` Julien Grall
2015-04-29 12:12                                   ` Manish Jaggi
2015-04-29 12:21                                     ` Julien Grall
2015-04-29 12:33                                       ` Manish Jaggi
2015-04-29 13:01                                         ` Julien Grall
2015-04-29 13:08                                           ` Manish Jaggi
2015-04-29 13:16                                             ` Julien Grall
2015-04-29 13:35                                   ` Julien Grall
2015-04-29 16:26                                     ` Vijay Kilari
2015-04-29 16:30                                       ` Vijay Kilari
2015-04-29 18:04                                         ` Julien Grall
2015-04-30 10:02                                           ` Stefano Stabellini
2015-04-30 10:09                                             ` Julien Grall
2015-04-30 10:15                                               ` Stefano Stabellini
2015-04-30 10:20                                                 ` Julien Grall
2015-04-30 10:50                                                   ` Stefano Stabellini
2015-04-30 13:19                                                 ` Vijay Kilari
2015-04-30 13:47                                                   ` Stefano Stabellini
2015-04-30 14:29                                                     ` Julien Grall
2015-05-04 12:58                                                       ` Vijay Kilari
2015-05-04 13:04                                                         ` Julien Grall
2015-05-04 13:27                                                           ` Vijay Kilari
2015-05-04 13:44                                                             ` Julien Grall
2015-05-04 13:54                                                               ` Julien Grall
2015-05-04 15:19                                                                 ` Vijay Kilari
2015-05-04 17:00                                                                   ` Julien Grall
2015-05-05 10:28                                                                     ` Stefano Stabellini
2015-05-05 11:06                                                                       ` Vijay Kilari
2015-05-05 11:47                                                                         ` Julien Grall
2015-05-05 12:00                                                                           ` Vijay Kilari
2015-05-05 12:08                                                                             ` Julien Grall
2015-05-05 11:08                                                                       ` Julien Grall
2015-05-05 11:45                                                                         ` Vijay Kilari
2015-05-05 11:54                                                                         ` Stefano Stabellini
2015-05-05 10:39                                                         ` Stefano Stabellini
2015-05-05 11:10                                                           ` Julien Grall
2015-05-05 11:57                                                             ` Stefano Stabellini
2015-05-05 12:03                                                               ` Julien Grall
2015-03-19 14:38 ` [RFC PATCH v2 14/22] xen/arm: its: Add emulation of ITS control registers vijay.kilari
2015-03-24 17:12   ` Julien Grall
2015-03-19 14:38 ` [RFC PATCH v2 15/22] xen/arm: its: Add support to emulate GICR register for LPIs vijay.kilari
2015-03-27 15:46   ` Julien Grall
2015-03-19 14:38 ` [RFC PATCH v2 16/22] xen/arm: its: implement hw_irq_controller " vijay.kilari
2015-03-27 17:02   ` Julien Grall
2015-03-19 14:38 ` [RFC PATCH v2 17/22] xen/arm: its: Map ITS translation space vijay.kilari
2015-03-27 17:07   ` Julien Grall
2015-03-19 14:38 ` [RFC PATCH v2 18/22] xen/arm: its: Dynamic allocation of LPI descriptors vijay.kilari
2015-03-19 14:38 ` [RFC PATCH v2 19/22] xen/arm: its: Support ITS interrupt handling vijay.kilari
2015-03-19 14:38 ` [RFC PATCH v2 20/22] xen/arm: its: Generate ITS node for Dom0 vijay.kilari
2015-03-19 14:38 ` [RFC PATCH v2 21/22] xen/arm: its: Initialize virtual and physical ITS driver vijay.kilari
2015-03-19 14:38 ` [RFC PATCH v2 22/22] xen/arm: its: Generate ITS dt node for DomU vijay.kilari
2015-03-20 13:37 ` [RFC PATCH v2 00/22] xen/arm: Add ITS support Julien Grall
2015-03-20 16:23 ` Julien Grall
2015-03-23 12:37   ` Vijay Kilari
2015-03-23 13:11     ` Julien Grall
2015-03-23 15:18       ` Vijay Kilari
2015-03-23 15:30         ` Julien Grall
2015-03-23 16:09           ` Vijay Kilari
2015-03-23 16:18             ` Julien Grall
